# INTRA- AND INTER-INDIVIDUAL VARIABILITY OF EXECUTIVE FUNCTIONS: DETERMINANT AND MODULATING FACTORS IN HEALTHY AND PATHOLOGICAL CONDITIONS

EDITED BY: Sarah E. MacPherson, Celine R. Gillebert, Gail A. Robinson and Antonino Vallesi

PUBLISHED IN: Frontiers in Psychology and Frontiers in Human Neuroscience

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714; ISBN 978-2-88945-837-0; DOI 10.3389/978-2-88945-837-0

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# INTRA- AND INTER-INDIVIDUAL VARIABILITY OF EXECUTIVE FUNCTIONS: DETERMINANT AND MODULATING FACTORS IN HEALTHY AND PATHOLOGICAL CONDITIONS

Topic Editors:

Sarah E. MacPherson, University of Edinburgh, United Kingdom

Celine R. Gillebert, KU Leuven, Belgium; University of Oxford, United Kingdom

Gail A. Robinson, The University of Queensland, Australia

Antonino Vallesi, University of Padova, Italy; IRCCS San Camillo Hospital, Italy

This eBook attempts to unify the contributions of different research groups investigating the sources of variability in executive functions, discussing the most recent developments and integrating the knowledge accumulated across different fields. It consists of a compilation of empirical, theoretical and review articles studying executive functions in both clinical and healthy human populations. The key influences on intra- and inter-individual variability in executive functions discussed include the developmental trajectory of executive functions, healthy and pathological aging, and the influence of environmental factors and intelligence.

Citation: MacPherson, S. E., Gillebert, C. R., Robinson, G. A., Vallesi, A., eds. (2019). Intra- and Inter-individual Variability of Executive Functions: Determinant and Modulating Factors in Healthy and Pathological Conditions. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-837-0

# Table of Contents

*Editorial: Intra- and Inter-individual Variability of Executive Functions: Determinant and Modulating Factors in Healthy and Pathological Conditions*

Sarah E. MacPherson, Celine R. Gillebert, Gail A. Robinson and Antonino Vallesi

#### SECTION 1

#### DEVELOPMENTAL TRAJECTORY OF EFs


Harry R. Smolker, Naomi P. Friedman, John K. Hewitt and Marie T. Banich


Chiara Malagoli and Maria Carmen Usai


Marisa G. Filipe, Sónia Frota and Selene G. Vicente

*Down Syndrome Compared to Typically Developing Children*

Laura Traverso, Martina Fontana, Maria Carmen Usai and Maria C. Passolunghi

#### SECTION 2

#### AGING AND CLINICAL POPULATIONS

*Intraindividual Variability in Executive Function Performance in Healthy Adults: Cross-Sectional Analysis of the NAB Executive Functions Module*

Dorota Buczylowska and Franz Petermann

*Intra-Individual Variability of Error Awareness and Post-error Slowing in Three Different Age-Groups*

Fabio Masina, Elisa Di Rosa and Daniela Mapelli

Sonia Di Tella, Francesca Baglio, Monia Cabinio, Raffaello Nemni, Daniela Traficante and Maria C. Silveri

*Role of the Cingulate Cortex in Dyskinesias-Reduced-Self-Awareness: An fMRI Study on Parkinson's Disease Patients*

Sara Palermo, Leonardo Lopiano, Rosalba Morese, Maurizio Zibetti, Alberto Romagnolo, Mario Stanziano, Mario Giorgio Rizzone, Giuliano Carlo Geminiani, Maria Consuelo Valentini and Martina Amanzio

*The Color of Noise and Weak Stationarity at the NREM to REM Sleep Transition in Mild Cognitive Impaired Subjects*

Alejandra Rosales-Lagarde, Erika E. Rodriguez-Torres, Benjamín A. Itzá-Ortiz, Pedro Miramontes, Génesis Vázquez-Tagle, Julio C. Enciso-Alva, Valeria García-Muñoz, Lourdes Cubero-Rego, José E. Pineda-Sánchez, Claudia I. Martínez-Alcalá and Jose S. Lopez-Noguerola

*Effects of Mild Cognitive Impairment on the Event-Related Brain Potential Components Elicited in Executive Control Tasks*

Montserrat Zurrón, Mónica Lindín, Jesús Cespón, Susana Cid-Fernández, Santiago Galdo-Álvarez, Marta Ramos-Goicoa and Fernando Díaz

*Working Memory Deficits After Lesions Involving the Supplementary Motor Area*

Alba Cañas, Montserrat Juncadella, Ruth Lau, Andreu Gabarrós and Mireia Hernández

*Executive Functions Rating Scale and Neurobiochemical Profile in HIV-Positive Individuals*

Vojislava Bugarski Ignjatovic, Jelena Mitrovic, Dusko Kozic, Jasmina Boban, Daniela Maric and Snezana Brkic

*Preserved but Less Efficient Control of Response Interference After Unilateral Lesions of the Striatum*

Claudia C. Schmidt, David C. Timpert, Isabel Arend, Simone Vossel, Anna Dovern, Jochen Saliger, Hans Karbe, Gereon R. Fink, Avishai Henik and Peter H. Weiss

*The Complex Interplay Between Depression/Anxiety and Executive Functioning: Insights From the ECAS in a Large ALS Population*

Laura Carelli, Federica Solca, Andrea Faini, Fabiana Madotto, Annalisa Lafronza, Alessia Monti, Stefano Zago, Alberto Doretti, Andrea Ciammola, Nicola Ticozzi, Vincenzo Silani and Barbara Poletti

#### SECTION 3

#### ENVIRONMENTAL FACTORS

*Intra-Individual Variability Across Fluid Cognition Can Reveal Qualitatively Different Cognitive Styles of the Aging Brain*

Sara De Felice and Carol A. Holland

*General Slowing and Education Mediate Task Switching Performance Across the Life-Span*

Luca Moretti, Carlo Semenza and Antonino Vallesi

Wi Hoon Jung, Tae Young Lee, Youngwoo B. Yoon, Chi-Hoon Choi and Jun Soo Kwon

*Monitoring Processes in Visual Search Enhanced by Professional Experience: The Case of Orange Quality-Control Workers*

Antonino Visalli and Antonino Vallesi

#### SECTION 4

#### INTELLIGENCE

*The (In)significance of Executive Functions for the Trait of Self-Control: A Psychometric Study*

Edward Nęcka, Aleksandra Gruszka, Jarosław Orzechowski, Michał Nowak and Natalia Wójcik

*The Influence of Fluid Intelligence, Executive Functions and Premorbid Intelligence on Memory in Frontal Patients*

Edgar Chan, Sarah E. MacPherson, Marco Bozzali, Tim Shallice and Lisa Cipolotti

#### SECTION 5

#### PERSONALITY


#### SECTION 6

#### REWARD, AROUSAL AND EMOTION

Andrzej Cudo, Piotr Francuz, Paweł Augustynowicz and Paweł Stróżak

*Post-error Brain Activity Correlates With Incidental Memory for Negative Words*

Magdalena Senderecka, Michał Ociepka, Magdalena Matyjek and Bartłomiej Kroczek

#### SECTION 7

#### DECISION-MAKING

*Executive Functions and Performance Variability Measured by Event-Related Potentials to Understand the Neural Bases of Perceptual Decision-Making*

Rinaldo L. Perri and Francesco Di Russo

#### SECTION 8

#### ATTENTION AND COGNITIVE CONTROL

Tianlu Wang and Celine R. Gillebert

*Within-Subject Correlation Analysis to Detect Functional Areas Associated With Response Inhibition*

Tomoko Yamasaki, Akitoshi Ogawa, Takahiro Osada, Koji Jimura and Seiki Konishi

# Editorial: Intra- and Inter-individual Variability of Executive Functions: Determinant and Modulating Factors in Healthy and Pathological Conditions

#### Sarah E. MacPherson<sup>1,2</sup>\*, Celine R. Gillebert<sup>3,4</sup>, Gail A. Robinson<sup>5,6</sup> and Antonino Vallesi<sup>7,8</sup>

*<sup>1</sup> Human Cognitive Neuroscience, Department of Psychology, University of Edinburgh, Edinburgh, United Kingdom, <sup>2</sup> Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, United Kingdom, <sup>3</sup> Department of Brain and Cognition, KU Leuven, Leuven, Belgium, <sup>4</sup> Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom, <sup>5</sup> Neuropsychology Research Unit, School of Psychology, The University of Queensland, Brisbane, QLD, Australia, <sup>6</sup> Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia, <sup>7</sup> Department of Neuroscience & Padova Neuroscience Center, University of Padua, Padua, Italy, <sup>8</sup> Brain Imaging and Neural Dynamics Research Group, IRCCS San Camillo Hospital, Venice, Italy*

Keywords: executive abilities/function, cognitive aging, intelligence, expertise, bilingualism, development, interindividual and intra-individual differences, cognitive reserve

#### Edited and reviewed by:

*Bernhard Hommel, Leiden University, Netherlands*

\*Correspondence: *Sarah E. MacPherson sarah.macpherson@ed.ac.uk*

#### Specialty section:

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

Received: *10 February 2019* Accepted: *13 February 2019* Published: *08 March 2019*

#### Citation:

*MacPherson SE, Gillebert CR, Robinson GA and Vallesi A (2019) Editorial: Intra- and Inter-individual Variability of Executive Functions: Determinant and Modulating Factors in Healthy and Pathological Conditions. Front. Psychol. 10:432. doi: 10.3389/fpsyg.2019.00432*

#### **Editorial on the Research Topic**

**Intra- and Inter-individual Variability of Executive Functions: Determinant and Modulating Factors in Healthy and Pathological Conditions**

Executive functioning generally refers to the ability to organize thought and action based on intentions and goals, especially in novel, complex or difficult situations. Executive functioning is a multifaceted psychological construct that may be depicted as a set of related but separable high-level cognitive abilities, possibly supported by the prefrontal cortex and implemented by larger brain networks (Shallice and Burgess, 1996; Miyake et al., 2000; but see Duncan et al., 1997). Many models exist that emphasize commonalities or differences among various executive functions (EF). While the number and type of EF that exist remain a topic of debate, most authors would agree that EF show high intra- and inter-individual variability in terms of their cognitive and behavioral manifestations.

But what are the determinant and modulating factors that might explain the variability across EF? Do neuro-anatomical or neuro-functional factors and/or the environment influence EF? The overall goal of our research topic was to provide a forum to explore the contributions of different research groups investigating intra- and inter-individual variability in EF. We welcomed empirical, theoretical and meta-analytical work involving both clinical and healthy human populations. We were impressed by the number of authors who rallied to our call; our research topic resulted in 39 published articles from 187 authors. At the time of writing, it has received 62,809 total views and 5,728 article downloads. We hope that, after reading these articles, you will be more sensitive to the various factors that contribute to intra- and inter-individual variability in EF and will be inspired to consider these when studying EF in both healthy and pathological conditions.

What follows is a brief overview of the contributions to our research topic. We aim to highlight some of the key influences on EF variability, and some of the interesting questions to emerge from these articles that we hope will encourage and influence future research. We appreciate that this editorial cannot do justice to the breadth and depth of the topics and questions covered, and so we encourage you to read the articles themselves for the contributions they make to the research area of EF.

# DEVELOPMENTAL TRAJECTORY OF EF

Although EF are thought to be multifaceted, the general consensus in the developmental literature is that there is a unitary EF factor in preschool children (Wiebe et al., 2008). This develops into a two-factor model (working memory and inhibition/shifting) in primary school-aged children (Brydges et al., 2014) and finally manifests as a three-factor model in adolescence (Latzman and Markon, 2010). By late adulthood, EF become more unidimensional again (sometimes referred to as the differentiation-dedifferentiation hypothesis; Wiebe et al., 2011; Brydges et al., 2012).

Developing this work further, several contributions in our research topic examine the tripartite model in children, adolescents and young adults. Messer et al. examined the relationship between 10 verbal and non-verbal EF tasks in 128 typically developing primary school-aged children. Their aim was to determine how performance on these distinct EF tasks relates to performance on the others. The exploratory factor analysis produced two factors: an inhibition factor containing the two inhibition tasks, and a general EF factor that included the remaining shifting, working memory/updating, fluency, and planning tasks. Here, the finding of a two-factor EF model in primary school-aged children was replicated, although the nature of the factors varied. It may be that different factor structures are the product of task impurity (Miyake et al., 2000), where distinct tasks tapping the same EF have different relationships with other EF tasks. The selection of the EF components considered is often task-based, but Messer et al. propose that future work should select EF tasks based on evidence from brain/behavior relationships.

Developmental changes in the factor structure of EF are thought to be related to maturation of the prefrontal cortex, a region which continues to undergo considerable changes in adolescence (Yakovlev and Lecours, 1967). Neuroimaging studies have shown a linear increase in prefrontal white matter volume due to increased myelination during adolescence (Barnea-Goraly et al., 2005). There is also a reduction in gray matter volume (Gogtay et al., 2004) due to a reduction in synaptic density but an increase in the efficiency of the remaining synapses (Blakemore and Choudhury, 2006). This brain development may continue into late adolescence and the early 20s (Gogtay et al., 2004) and not reach stability until around 30 years of age (Sowell et al., 2003). In our research topic, Smolker et al. examined whether individual differences in gray and white matter measures are associated with individual differences in EF in young adults in their 20s. They administered 6 tasks tapping the three constructs of the tripartite model to 251 adults. Smolker et al. reported a common factor influencing performance on all EF tasks, as well as updating-specific and shifting-specific factors. In terms of associations between the EF and neuroanatomical measures, they found that the common EF factor was related to several gray matter and fractional anisotropy characteristics. The updating-specific factor was associated with gray matter characteristics only, whereas the shifting-specific factor was associated with several white matter properties (see Smolker et al.). In another study involving the same cohort, Reineberg et al. examined the relationship between fMRI resting state network connectivity and individual differences in separable components of EF. The authors found that individuals with higher performance on the shifting-specific factor had more positive connectivity between the frontoparietal and visual networks, whereas individuals with higher performance on the common EF factor exhibited increased connectivity between sensory and default mode networks. These results uncover more specific relationships between connectivity and EF.

Contributors to our research topic have also examined the latent factor structure of EF in relation to neurodevelopmental conditions such as autism (Filipe et al.) and dyslexia (Doyle et al.). Filipe et al. highlighted an important bidirectional link between EF skills (divided attention, working memory, set-switching, inhibition) and prosodic abilities, although children with high-functioning autism and controls did not differ. Doyle et al. examined how different EF contribute to reading ability by studying children with dyslexia and age-matched controls. Proficient reading is thought to require EF to switch between multiple reading processes, inhibit irrelevant information, and hold and update speech. However, the exact profile of spared and impaired EF associated with dyslexia remains unclear, with some studies reporting EF impairments (Bexkens et al., 2014) and others not (Poljac et al., 2010). Doyle et al. found that the inhibition and updating composite scores significantly predicted reading ability and the likelihood of dyslexia, whereas switching did not. These findings encourage future work to explore EF training as an intervention for children with dyslexia, which, in turn, might transfer to improved reading ability.

# AGING AND EF

Moving to the other end of the spectrum and the influence of cognitive aging on EF, studies consistently report that healthy older adults perform more poorly than younger adults on EF tasks (see MacPherson and Della Sala, 2015). Frontal lobe theories of cognitive aging propose that the age-related decline on EF tasks is due either to overall frontal lobe decline (West, 1996) or to more specific dorsolateral prefrontal decline (MacPherson et al., 2002). In support of these theories, neuroimaging studies have demonstrated that, compared to other brain regions, the frontal lobes are especially vulnerable to age-related changes in overall cortical volume, cortical thickness, and white matter (Fjell et al., 2009).

While there seems little doubt that healthy and pathological aging result in structural and functional changes in the frontal lobes and poorer EF performance (Cabeza and Dennis, 2013), it remains less clear whether older adults experience similar patterns of deterioration across different EF. In the cognitive aging literature, most attention has been placed on examining intra-individual variability across task trials (Dykiert et al., 2012), and less attention has been placed on "dispersion"—the study of variability across cognitive tasks (Hilborn et al., 2009). Some cross-sectional and longitudinal aging studies have reported that dispersion reduces with age (Rabbitt et al., 2004) but others have found an increase in dispersion with age (Sosnoff and Newell, 2006). In our research topic, Buczylowska and Petermann examined a large group of 444 healthy adults aged from 18 to 99 years performing the NAB Executive Functions Module, which includes subtests of planning, mazes, letter fluency, judgment, categories, and word generation. The authors found that the variability across EF tasks decreased with age and there were increasing intercorrelations between tasks. These findings suggest EF in late adulthood become unidimensional in nature and provide support for the dedifferentiation hypothesis.
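To make the notion of dispersion concrete, the following is a minimal sketch of one common way to operationalize it: standardize each task across participants, then take each participant's standard deviation across their own z-scores. This is an illustrative assumption and not necessarily the index used in the studies cited above; the function names and the example scores are hypothetical.

```python
from statistics import mean, stdev


def zscore_by_task(scores):
    """Standardize each task (column) across participants (rows).

    `scores` is a list of per-participant lists, one value per task.
    Assumes every task has non-zero variance across participants.
    """
    n_tasks = len(scores[0])
    columns = [[row[j] for row in scores] for j in range(n_tasks)]
    task_stats = [(mean(col), stdev(col)) for col in columns]
    return [
        [(row[j] - task_stats[j][0]) / task_stats[j][1] for j in range(n_tasks)]
        for row in scores
    ]


def dispersion_index(scores):
    """Return one across-task SD per participant, computed on z-scores."""
    return [stdev(z_row) for z_row in zscore_by_task(scores)]


# Hypothetical raw scores: 3 participants (rows) x 3 EF tasks (columns).
example_scores = [
    [10, 20, 30],  # evenly low on every task
    [12, 24, 42],  # average on two tasks, high on the third
    [14, 28, 36],  # high on two tasks, average on the third
]
example_dispersion = dispersion_index(example_scores)
```

In this toy example, the first participant has a dispersion of zero because their relative standing is identical on every task, whereas the other two have uneven profiles and therefore larger values. Dispersion computed this way reflects the shape of a profile and says nothing about the overall level of performance.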

On a different note, our research topic also includes work further examining the relationship between EF performance and neurodegenerative changes in older adults. For example, Di Tella et al. explored the relationship between EF, specifically selection, and changes in cortical thickness in the inferior regions of the frontal lobes in patients with Parkinson's disease (PD) with predominantly left or right cortical involvement. Twenty-one PD patients and 19 controls performed a noun-verb generation task and a verb-noun derivation task. PD patients with left-sided, but not right-sided, atrophy were impaired compared to the controls on both linguistic tasks. Furthermore, in the left-sided PD patients, task performance (accuracy and RTs) correlated significantly with cortical thickness in the left inferior frontal gyrus (IFG). Di Tella et al. conclude that linguistic and EF processes interact in the left IFG during word production tasks involving selection and suggest that future work should consider these structural cortical asymmetries in PD further.

In another study, Palermo et al. examined PD patients' partial or complete unawareness of their involuntary movements (i.e., dyskinesias-reduced-self-awareness, DRSA) in relation to performance on response-inhibition tasks and hypofunctionality in the anterior cingulate cortex (ACC). Previously, Maier et al. (2016) demonstrated that impaired self-awareness in PD patients was related to reduced metabolism in the bilateral frontal regions including the medial frontal gyrus (particularly the ACC), which has been associated with impaired self-awareness in Alzheimer's disease (AD; Amanzio et al., 2011), acquired brain injury (Palermo et al., 2014), bipolar disorder (Palermo et al., 2015), and schizophrenia (Orfei et al., 2010). Palermo et al. extend their own work to 27 PD patients presenting with motor fluctuations and dyskinesias who underwent event-related functional MRI while performing a response-inhibition GO/No-GO task. They found that reduced involvement of the bilateral ACC, as well as of the bilateral anterior insular cortex and right dorsolateral prefrontal cortex, was related to the presence of DRSA. Furthermore, DRSA scores significantly correlated with percent errors on the No-GO condition. The authors conclude that the reduction in self-awareness of dyskinesias in PD may be due to a specific impairment in EF related to metacognitive awareness.

# ENVIRONMENTAL INFLUENCES ON EF

Certain lifetime experiences have been proposed to "protect" against the impact of brain damage, which may account for the variability in cognitive performance that can be found in patients with similar degrees of brain pathology. These protective influences have been referred to as cognitive reserve (CR; Stern, 2002). As CR cannot be assessed directly, a number of indicators have been proposed as CR proxies. Education level is a commonly adopted index of CR, as is literacy attainment, which is typically measured using single word reading tests such as the National Adult Reading Test (NART; Nelson and Willison, 1991). CR has predominantly been investigated in relation to neurodegenerative disorders such as AD, traumatic brain injury and healthy aging (Harrison et al., 2015), where individuals who have higher levels of education and/or NART IQ are found to have less cognitive impairment than individuals with lower levels of education and/or NART IQ (e.g., Singh-Manoux et al., 2011).

Readers of this research topic will be most keen to consider the influence, if any, of CR on EF. Indeed, there is some evidence to suggest that EF are susceptible to the mitigating effects of CR. Educational attainment has been found to predict EF performance both in healthy aging (Meguro et al., 2001) and AD (Scarmeas et al., 2006). Higher education in stroke patients has also been associated with better performance on EF tests (Ojala-Oksala et al., 2012). More recently, MacPherson et al. (2017) retrospectively examined patients with frontal lesions and found that NART IQ (and age) predicted performance on EF tests (i.e., Stroop Test and letter fluency). Therefore, there do appear to be protective effects of CR on EF and this may explain some of the inter-individual variability in performance on certain tasks across patients with similar levels of brain pathology.

In the current research topic, De Felice and Holland studied whether CR factors might have differential effects on individuals' performance on distinct EF tasks (i.e., fluency, Trail-Making Test, and digit span forwards and backwards) depending upon their age. They compared younger (22–31 years), middle-old (59–71 years), and old-old (76–91 years) groups. They reported a trend that old-old adults had the greatest dispersion index, and this was coupled with poorer task performance compared to the younger and middle-old groups. The authors conclude that middle-old adults with better cognition exclusively benefit from higher CR and demonstrate a dispersion index equivalent to younger adults.

Both education and NART IQ have been criticized as indices of CR (Jones et al., 2011). Education varies in quality, availability and the subjects taught across different countries and social groups, whereas dyslexia and other learning difficulties are detrimental to performance on tests of literacy attainment and can result in inaccurate estimates (Ikanga et al., 2016). Moreover, other real-life factors that may modify cognitive decline, such as occupational attainment (Garibotto et al., 2008) and leisure activities (Wilson et al., 2002), are considered less often by researchers. Given that different indices might contribute to CR, Nucci et al. (2012) devised the Cognitive Reserve Index questionnaire (CRIq), which provides a measure of overall cognitive reserve as well as of the distinct dimensions that contribute to the overall score (i.e., education, occupational attainment, and leisure time). In our research topic, Moretti et al. considered the potential role of distinct CR factors and general slowing in modulating cognitive flexibility in young, middle-aged and older adults. Using the CRIq, the authors report that education was the only index associated with reduced switch costs under time pressure and highlight the importance of using tools designed to distinguish between different CR dimensions to understand better which life-long experiences protect different cognitive functions (Puccioni and Vallesi, 2012a,b).

Another potential life course factor thought to play a protective role against cognitive decline is bilingualism. Bilingualism is a hot topic in the EF literature given that some work has shown that bilingualism results in improved cognitive function in healthy aging (Bak et al., 2014) and post-stroke (Alladi et al., 2016) and is associated with a delay in the onset of mild cognitive impairment (Ramakrishnan et al., 2017), dementia (Bialystok et al., 2007) and behavioral variant frontotemporal dementia (Alladi et al., 2017). While there is considerable debate around the presence, magnitude and mechanisms associated with the bilingualism effect (Freedman et al., 2014), some research has shown positive effects on EF associated with speaking more than one language (Bak, 2016).

When studying bilingualism, it is important to know whether such benefits are specific to language or are domain-general. While some propose that long-standing bilingualism affects non-linguistic executive control, as smaller switch costs are reported in bilinguals performing non-linguistic tasks compared to monolinguals (Prior and MacWhinney, 2010), others have not found a bilingual advantage (Paap et al., 2017). In this research topic, Timmer et al. argue that the currently used linguistic and non-linguistic control measures in bilinguals may not be reliable. Using linguistic and non-linguistic switch tasks administered to Catalan-Spanish-English trilinguals, they demonstrated that the cost of switching between languages/tasks compared to repeating the same language/task is a reliable measure of cross-talk between linguistic and non-linguistic executive control and that there are at least some shared processes across the tasks. Timmer et al.'s work makes us reconsider the reliability of the measures used to study bilingualism. Perhaps bilingualism can result in domain-general benefits but, for now, the jury is still out.
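As an illustration of the switch cost described above, the following minimal sketch computes it as the mean reaction time on switch trials minus the mean reaction time on repeat trials. Restricting the calculation to correct trials is a common convention that is assumed here and not taken from the studies cited; the trial format, field names and example values are hypothetical.

```python
from statistics import mean


def switch_cost(trials):
    """Mean RT on correct switch trials minus mean RT on correct repeat trials.

    Each trial is a dict with hypothetical keys:
    "trial_type" ("switch" or "repeat"), "rt_ms" and "correct".
    """
    correct = [t for t in trials if t["correct"]]
    switch_rts = [t["rt_ms"] for t in correct if t["trial_type"] == "switch"]
    repeat_rts = [t["rt_ms"] for t in correct if t["trial_type"] == "repeat"]
    if not switch_rts or not repeat_rts:
        raise ValueError("need at least one correct switch and one correct repeat trial")
    return mean(switch_rts) - mean(repeat_rts)


# Hypothetical trials from one participant; the error trial is excluded.
example_trials = [
    {"trial_type": "repeat", "rt_ms": 600, "correct": True},
    {"trial_type": "switch", "rt_ms": 700, "correct": True},
    {"trial_type": "repeat", "rt_ms": 620, "correct": True},
    {"trial_type": "switch", "rt_ms": 800, "correct": True},
    {"trial_type": "switch", "rt_ms": 1500, "correct": False},
]
example_cost = switch_cost(example_trials)
```

A positive value indicates slower responses after a switch than after a repetition. The same function could be applied separately to a language-switching and a task-switching block in order to compare linguistic and non-linguistic costs within the same individual.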

While the bilingualism debate will continue for some time, our research topic also includes studies examining whether expertise in other skills, such as playing strategy board games, goes beyond the specific skill itself and results in a more general advantage for cognitive skills. Training in board games such as chess may potentially enhance an individual's working memory (WM) abilities, as players need to hold in WM several potential offensive moves and their opponent's predicted responses to each of those future moves. However, experimental studies involving chess experts and novices performing WM tasks using chessboards and faces or scenes have consistently reported group differences between the experts and novices for chessboard stimuli but not for other stimulus types (Bartlett et al., 2013). The neuroimaging results are less consistent, with some studies reporting an increase in activation in the fusiform gyrus in experts compared to novices in response to chessboards (Bilalić et al., 2011), yet others reporting no differences (Krawczyk et al., 2011).

In our research topic, Jung et al. examined whether expertise for the Korean strategy board game, Baduk, goes beyond the game itself and how it maps onto networks associated with cognitive abilities that are not directly trained. The authors adopted a data-driven, whole-brain multivariate analytic approach as part of a connectome-wide association study (CWAS) to examine brain-behavior relationships in experts. Seventeen Baduk experts performed a visual n-back WM task including both face matching and spatial location matching conditions. They found that experts did not show an increase in WM ability compared to novices, suggesting that expertise does not transfer to other cognitive abilities. However, experts did have greater activation in the superior parietal cortex during the face WM task and greater connectivity between frontal and parietal regions and between frontal and temporal regions. These findings provide evidence that experts undergo reorganization of functional interactions between brain regions associated with WM, showing that experience-related brain changes may be more sensitive than behavioral ones.

In another study of expertise, Visalli and Vallesi examined the expertise of quality-control employees, focusing on whether visual search expertise extends to generalized search behaviors. In particular, they focused on monitoring processes, the goal of which is to "quality check" in order to enhance behavior (see Vallesi, 2012 for an overview). Twenty-four fruit quality controllers and 23 controls performed a computerized visual search task with one block containing oranges (expert knowledge) and one block containing the Smurfette doll (neutral knowledge). They found that quality-controllers were significantly faster than controls in the conditions thought to require monitoring processes (i.e., all target-present and target-absent conditions except the orange-present condition). These results suggest that top-down processes in visual search can be enhanced through immersive real-life experience beyond visual expertise advantages. Therefore, the findings of associations between expertise and improved EF are not consistent and may depend on the type of expertise and the tasks involved.

# INTELLIGENCE AND EF

Some theories suggest that the frontal lobes play a role in general control processes that are employed when performing diverse cognitive tasks, regardless of the type of information being processed (e.g., Duncan, 2001; Miller and Cohen, 2001). Neuroimaging studies have demonstrated activation in the lateral frontal cortex, dorsomedial frontal cortex, and anterior insula, as well as the intraparietal sulcus, when performing difficult tasks across different domains (Fedorenko et al., 2012). The activation of these regions when performing distinct tasks has been referred to as the multiple-demand (MD) network, and this network is thought to be central in the organization of several types of behavior (Duncan, 2005).

The activity in the MD network when performing different cognitive tests has been associated with fluid intelligence (e.g., Woolgar et al., 2010) and this has led researchers to investigate the relationship between fluid intelligence and EF. Research studies have found that fluid intelligence positively correlates with EF measures and frontal lobe lesions impair performance on tests of fluid intelligence (Duncan et al., 1995), particularly lesions involving MD regions (Woolgar et al., 2010). Furthermore, activation in the MD network is found when individuals perform fluid intelligence tests (Duncan et al., 2000). Interestingly, increasing complexity in nonverbal reasoning tasks has recently been associated with abnormal MD network activation in individuals with developmental corpus callosal dysgenesis (Hearne et al., 2018). These findings suggest that it may be a decline in fluid intelligence which underlies the EF impairments reported in frontal patients. Roca et al. (2010) demonstrated that impaired performance in frontal patients on EF tests such as the Wisconsin Card Sorting Test, verbal fluency and the Iowa Gambling Task can be explained by fluid intelligence impairments, although Robinson et al. (2012) showed the opposite for verbal fluency. However, for other EF tasks such as the Hayling Sentence Completion test and the Stroop test, frontal patients' impairments could not be accounted for by reduced fluid abilities (Roca et al., 2010; Cipolotti et al., 2016; for a similar finding in schizophrenia see Martin et al., 2015). Moreover, although Barbey et al. (2012) identified shared neural substrates in the frontal and parietal cortex for EF and general intelligence (g), there were additional brain regions specific to EF (e.g., the left anterior pole) and brain regions specific to g (e.g., the left inferior occipital gyrus and the right superior and inferior parietal lobe).

In our research topic, contributors have further examined the relationship between EF abilities and intelligence in healthy and patient populations. Nęcka et al. investigated whether self-control (SC) is subserved by EF in 296 healthy younger volunteers through the administration of five EF tasks, three self-report SC measures and two fluid intelligence tests. Using a structural equation modeling approach, three latent variables of executive control, behavioral control, and fluid intelligence (Gf) were extracted. Surprisingly, Nęcka et al. did not find any EF-SC or Gf-SC relationships. However, a reasonably strong EF-Gf relationship was found. The authors conclude that SC may not depend on the strength of executive control, at least in healthy adults.

Moving on to studies involving frontal patients, Chan et al. examined whether the memory impairments often reported in frontal patients are better explained by declines in fluid intelligence or EF. Thirty-nine patients with focal frontal lesions were assessed on tests of recall and recognition memory, fluid intelligence, and EF. As in their previous work (e.g., MacPherson et al., 2016), Chan et al. found that their frontal patients were impaired on both recall and recognition memory tests compared to healthy controls. Importantly, however, whereas fluid intelligence was the strongest predictor of recall deficits, recognition memory was not related to intelligence or EF performance. Overall, Chan et al. show that the nature of the frontal deficit on different memory tasks may be separable in relation to other clinical and cognitive factors influencing performance.

This has been a very brief overview of the contents of our research topic. While our editorial cannot encompass all author contributions, it highlights some of the interesting findings that can be found within the topic, with the hope of encouraging you to read further. And of course, not all determining and modulating factors that contribute to EF variability have been discussed. When we proposed this research topic to Frontiers, our aim was to provide a forum where the contributions of different research groups investigating intra- and inter-individual variability in EF could be discussed. We hope that you find this forum a valuable contribution to the EF literature and that it generates as many questions as it answers.

# AUTHOR CONTRIBUTIONS

SM wrote the first draft of the manuscript. All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# FUNDING

CG was funded by the Research Foundation Flanders (G072517N). GR is supported by an Australian National Health and Medical Research Council Boosting Dementia Research Leadership Fellowship (APP1135769). AV was funded by the FP7 European Research Council Starting Grant LEX-MEA (GA #313692).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 MacPherson, Gillebert, Robinson and Vallesi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# An Exploration of the Factor Structure of Executive Functioning in Children

David Messer<sup>1</sup>\*, Marialivia Bernardi<sup>2</sup>, Nicola Botting<sup>2</sup>, Elisabeth L. Hill<sup>3</sup>, Gilly Nash<sup>2</sup>, Hayley C. Leonard<sup>4</sup> and Lucy A. Henry<sup>2</sup>

<sup>1</sup> Centre for Research in Education and Educational Technology, The Open University, Milton Keynes, United Kingdom, <sup>2</sup> Language and Communication Science, City University of London, London, United Kingdom, <sup>3</sup> Department of Psychology, Goldsmiths, University of London, London, United Kingdom, <sup>4</sup> School of Psychology, University of Surrey, Guildford, United Kingdom

There has been considerable debate and interest in the factor structure of executive functioning (EF). For children and young people, there is evidence of a progression from a single factor to a more differentiated structure, although the precise nature of these factors differs between investigations. The purpose of the current study was to look at this issue again with another sample, and try to understand possible reasons for previous differences between investigations. In addition, we examined the relationship between less central EF tasks, such as fluency and planning, to the more common tasks of updating/executive working memory (EWM), inhibition, and switching/shifting. A final aim was to carry out analyses which are relevant to the debate about whether EF is influenced by language ability, or language ability is influenced by EF. We reasoned that if language ability affects EF, a factor analysis of verbal and non-verbal EF tasks might result in the identification of a factor which predominantly contains verbal tasks and a factor that predominately contains non-verbal tasks. Our investigation involved 128 typically developing participants (mean age 10:4) who were given EF assessments that included verbal and non-verbal versions of each task: EWM; switching; inhibition; fluency; and planning. Exploratory factor analyses on EWM, switching, and inhibition produced a structure consisting of inhibition in one factor and the remaining tasks in another. It was decided to exclude verbal planning from the next analyses of all the ten tasks because of statistical considerations. Analysis of the remaining nine EF tasks produced two factors, one factor containing the two inhibition tasks, and another factor that contained all the other tasks (switching, EWM, fluency, and non-verbal planning). There was little evidence that the verbal or non-verbal elements in these tasks affected the factor structure. 
Both these issues are considered in the discussion, where there is a general evaluation of findings about the factor structure of EF.

Keywords: executive functioning, children, factor structure, task impurity, unity and diversity

#### Edited by:

Celine R. Gillebert, KU Leuven, Belgium

#### Reviewed by:

Joe Bathelt, University of Cambridge, United Kingdom; Didier Le Gall, Laboratoire de Psychologie des Pays de la Loire (LPPL), France

> \*Correspondence: David Messer david.messer@open.ac.uk

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 10 February 2018 Accepted: 18 June 2018 Published: 13 July 2018

#### Citation:

Messer D, Bernardi M, Botting N, Hill EL, Nash G, Leonard HC and Henry LA (2018) An Exploration of the Factor Structure of Executive Functioning in Children. Front. Psychol. 9:1179. doi: 10.3389/fpsyg.2018.01179

# INTRODUCTION

Executive functioning (EF) continues to be an important topic of research in relation to children and young people (Diamond, 2013). There is a growing consensus about the cognitive processes and relevant assessment procedures for the investigation of EF. However, there has been a longstanding discussion about whether the different forms of EF should be considered as making up one single area of cognitive functioning or involve separable/distinct statistical factors, as well as discussion about the nature of, and relationships between, identifiable factors. Such investigations can help with the understanding of relationships between different tasks that are used to assess EF. These are important and challenging issues similar to those seen in research on the separability of intelligence into different factors (McGrew, 2005).

#### The Structure of EF and Its Development

Research with adults tends to identify three EF factors (inhibition, switching, and updating), which are related to each other, but nevertheless are separable, hence the suggestion that EF involves both unity and diversity (Miyake et al., 2000). In relation to children and young people, there is a widely held view that with increasing age the elements of EF become more separable from one another, although there are disagreements about which factors are separable and at which ages. We use the term "factor" to refer to EF tasks that have been identified on a statistical basis as being related to one another. "Component" is used to refer to the three commonly identified forms of EF, specifically updating/executive working memory (EWM; which involves the executive component of working memory), switching/shifting, and inhibition. For children between 3 and 6 years, several investigators (Wiebe et al., 2008, 2011; Hughes et al., 2010) have reported that EF is best described as a single factor. Thus, it appears that in the pre-school age, EF may be undifferentiated and does not involve statistically separable factors, so that individual differences (i.e., the differences between children) across different EF components appear to be influenced by a general cognitive capacity such as attention (Garon et al., 2008).

In the 6 to 12 year age range a number of different factor structures have been identified. For children aged 7–9 years and 10–11 years, Xu et al. (2013) compared five models of the structure of EF, reporting that a one-factor model was reasonably good at accounting for their data (inhibition, EWM, and switching). However, several groups of researchers have identified two-factor models of EF in the 6–12 years age range, although the models differ with regard to which EF tasks occur in the same factor. At 9–12 years, van der Sluis et al. (2007) reported that EWM and shifting were separate factors, but a separate inhibition factor was not supported by their data. In another study with 11–12-year-old children, St Clair-Thompson and Gathercole (2006) identified updating/EWM and inhibition as separate factors, but not switching. van der Ven et al. (2013) also reported a two-factor model (an updating factor and a combined inhibition and shifting factor), but noted that verbal ability and motor speed were additionally implicated. Finally, Huizinga et al. (2006) found good evidence for two factors (EWM, set shifting) in 7- and 11-year-olds (and also in 15- and 21-year-olds), although there was no evidence for an underlying inhibition construct as the three inhibition measures they used did not relate well to each other.

There are also findings providing support for a three-factor structure. Lehto et al. (2003) used both exploratory and confirmatory factor analyses with 8- to 13-year-old children, and identified three interrelated factors which had an approximate correspondence with EWM, inhibition and shifting. In addition, Wu et al. (2011) found that this three-factor structure of EF in individuals aged between 7 and 14 years also provided the best fit for their data.

Thus, in the primary school years, it is possible to identify separable factors involving EF abilities, but there is a lack of agreement about the composition of these factors. Most investigations have used confirmatory factor analysis to identify the factor structure that best fits the relevant data. Given the uncertainty about which model is supported by theory and previous research, we used exploratory factor analysis (EFA) rather than confirmatory factor analysis (CFA).

#### Further Measures of EF in Children: Planning and Fluency

Planning and fluency are often studied in patients with frontal lobe damage and reflect a range of processes that are relevant for everyday life (e.g., Pennington and Ozonoff, 1996). However, although these processes involve potentially important assessments of EF, there are uncertainties about how they relate to EWM, inhibition and shifting.

Our planning measure was the "sorting" task from the Delis–Kaplan Executive Function System (D-KEFS; Delis et al., 2001) and involved grouping cards into equal sized sets based on card features such as size, shape, and concept. According to the manual, this task assesses problem-solving, in particular concept-formation and rule generation. As with many EF tasks it may also assess inhibition of previous responses (Swanson, 2005), and more generally the task has been thought to assess planning ability (Henry et al., 2012). Furthermore, although planning is sometimes regarded as another component of EF, it has also been argued to be a higher order construct (Diamond, 2013). Research on the D-KEFS Sorting task has been limited, but performance on the task appears to differentiate between children with disabilities and children with typical development (Mattson et al., 1999).

The other additional EF assessment concerned fluency, the ability to generate as many different examples of a class of items as possible within a short time period. The usual tasks used to assess verbal fluency involve target categories such as animals or words beginning with a particular letter (semantic and phonemic fluency, respectively); a common example of a non-verbal fluency task involves drawing as many different shapes as possible on a template of the same pattern of dots (design fluency). There are limited findings that fluency relates to some of the three commonly identified components of EF. For example, Lehto et al. (2003) reported that performance on semantic and phonemic fluency tasks was related to performance on a shifting task (Trail Making), while Rosen and Engle (1997) found that verbal fluency was related to working memory ability. There has also been discussion of whether fluency is more closely related to EF or language abilities (Shao et al., 2014; Henry et al., 2015; Whiteside et al., 2016; Marshall et al., 2017). Consequently, there is a need to understand the way that verbal and non-verbal fluency relate to the more usual assessment of EF.

#### Relationships Between EF and Language Ability

Our interest in the structure of EF also concerned whether verbal and non-verbal assessments were grouped into separate factors. There has been discussion about whether EF is influenced by language ability or vice versa (Bishop et al., 2014). In two previous investigations findings indicated that the influence of language disorder on EF is not confined to verbal tasks, but also extends to non-verbal EF tasks, something that would not be expected if language disorders only had a direct and specific effect on tasks which involve verbal operations (Henry et al., 2012; Yang and Gray, 2017).

However, different findings have been reported about the relationships between language ability and EF in students who are deaf. These students often have delays in the progress of spoken and/or sign languages and this could affect verbal and non-verbal EF performance. In these investigations there is more evidence that language ability influences performance on EF tasks rather than vice versa (Figueras et al., 2008; Botting et al., 2016). Jones et al. (unpublished), using cross-lagged regressions, confirmed that language led EF developmentally and not just at the performance level, although this effect was stronger for deaf children than for hearing participants.

A further viewpoint is provided by Gooch et al. (2016) who failed to identify influences in either direction between EF and language in children at risk for dyslexia and typically developing children: the abilities appeared to develop together, but did not influence each other. This was interpreted as supporting the existence of a third influence, such as processing speed, on both EF and language, which causes relationships between the two domains.

Factor analyses provide an additional way to investigate these issues about language and EF by examining the relationships between non-verbal and verbal EF tasks. If language abilities only affect performance on verbal tasks and not non-verbal tasks, it might be expected that verbal EF tasks would be a notable feature of one factor, and that non-verbal EF tasks would be a notable feature of another factor. Such findings would provide additional indirect evidence about the relationship between language and EF.

#### The Current Study

Our investigation of the factor structure of EF in the primary school years was carried out on data already collected from typically developing children in two previous studies (Henry et al., 2012; Leonard et al., 2015). The same assessments of EF were used in both investigations, and to ensure comparability in the measures, separate z-scores were calculated for each sample, which should minimize the effect of any confounds. The research was designed to address three research questions concerning children in the 6–12 year age range:

(1) Does EFA using verbal and non-verbal EF tasks assessing EWM, inhibition and switching produce a factor structure that is similar to one of those reported in previous investigations?
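The per-sample standardization described above, in which z-scores are computed separately within each contributing study before the data are pooled, can be sketched as follows. This is a minimal illustration and not the authors' analysis script: the function name, the raw scores, and the use of the sample standard deviation (n − 1 denominator) are all assumptions.

```python
from statistics import mean, stdev


def zscores_within_sample(scores):
    """Standardize raw scores against the mean and SD of their own sample."""
    m = mean(scores)
    sd = stdev(scores)  # sample SD (n - 1 denominator); an assumption
    return [(x - m) / sd for x in scores]


# Hypothetical raw scores on one EF task in each study (illustrative values only).
sli_study_raw = [12, 15, 9, 18, 14]
dcd_study_raw = [22, 25, 19, 28, 24]

# Standardize each sample separately, then pool the standardized scores.
sli_z = zscores_within_sample(sli_study_raw)
dcd_z = zscores_within_sample(dcd_study_raw)
pooled = sli_z + dcd_z
```

Because each sample is centred on its own mean, a constant offset between the two studies (for example, from slightly different administration conditions) does not carry over into the pooled scores.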


# MATERIALS AND METHODS

# Participants

A total of 159 participants were recruited to be part of the typically developing comparison groups of two investigations concerned with EF: one study was concerned with specific language impairment (SLI) and the other with developmental coordination disorder (DCD). The former study recruited 88 children with typical development and the latter 71 children with typical development; 14 children recruited into the SLI study were excluded to give an age range in the remaining sample between 6 and 12 years 6 months [SLI study mean age 9:2 years (SD 23 months); DCD study mean age 9:5 years (SD 12 months)].

The selection criteria in the two investigations ensured that children considered as typically developing in each study were distinguishable from the target clinical groups. Thus, both groups of children with typical development met acceptable, but slightly different, criteria for inclusion. In the SLI study the criteria for inclusion were non-verbal abilities in the average range as assessed by BAS-II Matrices (T-scores of 40 or greater, mean = 50, SD = 10; British Ability Scales-II, Elliott et al., 1996) and scaled scores of eight or more on four CELF-4-UK subscales (Clinical Evaluation of Language Fundamentals-4-UK; Semel et al., 2006; see below). In the DCD study, the inclusion criteria were a General Cognitive Index of 70 or above (calculated from BAS3, Word Definitions, Verbal Similarities and Matrices subscales; Elliot and Smith, 2011), together with at least one standard score of four or above on two CELF-4-UK subtests (Formulated Sentences and Word Classes-Receptive). The children in the latter study also had to have percentile scores equal to or above 25 on the Movement Assessment Battery for Children (MABC-2; Henderson et al., 2007) and a standardized score of 70 or above on the Test of Word Reading Efficiency (TOWRE; Torgesen et al., 1999).

To help to ensure comparability between the two samples, children from the DCD study were excluded if their Matrices subscale T-score was below 40 and if either of the two CELF-4-UK subscales administered were below eight. This excluded 17 children, so the remaining total sample consisted of 128 participants (mean age 111.13 months, SD 19.59; there were 58 female participants). The standardized scores from the BAS-II (SLI study) and BAS3 (DCD study) for verbal ability were SLI, 111.56 (SD 10.39) and DCD, 108.70 (SD 10.77). The T-scores for the BAS matrices assessment were, respectively, 52.03 (SD 6.29) and 52.63 (SD 8.19). The mean scores for both groups of children were slightly above average and this probably reflects the selection criteria for both these samples.
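The harmonized exclusion rule and the sample-size bookkeeping reported above can be sketched as below. This is an illustration only: the `Child` record, its field names, and the example children are hypothetical, and treating a child as excluded when either condition fails (a low Matrices T-score or any CELF-4-UK scaled score below eight) is an assumed reading of the criteria.

```python
from dataclasses import dataclass


@dataclass
class Child:
    matrices_t: int     # BAS Matrices T-score
    celf_scaled: tuple  # scaled scores on the CELF-4-UK subscales administered


def meets_harmonized_criteria(child, min_t=40, min_scaled=8):
    """Retain a child only if the T-score and every CELF scaled score reach the cut-offs."""
    return child.matrices_t >= min_t and all(s >= min_scaled for s in child.celf_scaled)


# Hypothetical children (illustrative values only).
children = [
    Child(52, (10, 9)),   # retained
    Child(38, (11, 12)),  # excluded: Matrices T-score below 40
    Child(45, (7, 10)),   # excluded: one CELF scaled score below eight
    Child(40, (8, 8)),    # retained: exactly at both cut-offs
]
retained = [c for c in children if meets_harmonized_criteria(c)]

# Sample-size bookkeeping using the counts reported in the text.
recruited = 88 + 71            # SLI study + DCD study
final_n = recruited - 14 - 17  # age-range exclusions, then harmonization exclusions
```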

The children were recruited from schools within Greater London and, in the study involving children with SLI, very occasionally, via direct contact with parents/guardians.

The catchment areas of the schools were variable in nature, but predominately low to mid socio-economic status. All the children were regarded by their assessors as having typical levels of spoken English and no child appeared to have English as a second language. All the children in the sample had BAS verbal standardized scores above 89.

For the study that concerned children with SLI, testing took place across 3–8 sessions, making up 3½ h for the complete battery, usually at school but occasionally at the child's home. For the DCD study, 5–6 sessions of 45 min to 1 h each were conducted at school, making up 5 h for the complete battery. A range of non-EF assessments were also carried out in these investigations and further details about the general findings are described in our other publications (Henry et al., 2012; Leonard et al., 2015). Measures were administered in random orders to participants.

The projects were granted ethical approval from the appropriate University Research Ethics Committees, and were discussed in detail with relevant school staff before recruitment. Informed consent for participation was obtained in writing (telephone permission occasionally) from parents/guardians; children/students also gave their oral and written assent and were told they could opt out at any time.

# EF Tasks

Each executive ability was assessed using pairs of tests, one for the verbal domain and one for the non-verbal domain. We used various strategies to try to select comparable verbal and non-verbal tasks that assessed predominantly the construct in question. In some cases it was possible to use assessments which had the same task structure, but involved either verbal or non-verbal behavior (e.g., inhibition); in other cases we were guided by theoretical models which have resulted in different tasks to assess comparable verbal and non-verbal abilities (e.g., EWM), or we used similar tasks from the same assessment battery which involved either a verbal or non-verbal response (e.g., fluency and planning). Although the tasks were selected to provide a useful test of differences between verbal and non-verbal functioning, we are not claiming that task purity was achieved.

#### Executive Working Memory

Executive working memory requires concurrent processing and storage. The verbal task was Listening Recall (Working Memory Test Battery for Children, WMTB-C; Pickering and Gathercole, 2001). A series of short sentences were read to the children and they judged whether each was true/false (processing). The children were then asked to recall the final word from each sentence in correct serial order (storage). The first trials had a list length of one item, and the task progressed on to longer lists, with six trials per list length, until 4/6 trials were incorrect. Total trials correct were scored. Test–retest reliabilities of 0.38–0.83 are reported for the relevant ages (Pickering and Gathercole, 2001).

The odd-one-out test was the non-verbal EWM task (Henry, 2001). The Experimenter presented three cards showing simple nonsense shapes (horizontally orientated on 20 cm × 4 cm cards). The child pointed to the shape which was the "odd-one-out" (processing). Storage was assessed via response sheets (20 cm × 30 cm) which had three "empty" boxes that represented the cards, so the child could point to the location of each identified "odd-one-out." The first trial had one item, and the task progressed on to longer lists, with three trials per list length, until 2/3 trials were incorrect. Total trials correct were scored. The span version of this task has a reliability of 0.80 (Henry, 2001).
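The discontinuation and scoring rules for the two EWM tasks (six trials per list length with a stop at 4/6 incorrect for Listening Recall; three trials per list length with a stop at 2/3 incorrect for the odd-one-out task) can be sketched as follows. The function and the trial outcomes are hypothetical, and the sketch assumes that every trial at a list length is administered before the stopping rule is checked and that correct trials within the final block still count toward the total.

```python
def score_span_task(results_by_length, max_errors):
    """Return total trials correct under a block-wise discontinuation rule.

    results_by_length: one list of booleans per list length (True = trial correct),
    ordered from list length one upward.
    max_errors: number of incorrect trials within a block that ends testing.
    """
    total_correct = 0
    for trials in results_by_length:
        correct = sum(trials)
        total_correct += correct
        if len(trials) - correct >= max_errors:
            break  # discontinue: later list lengths are not administered
    return total_correct


# Hypothetical odd-one-out protocol: three trials per list length, stop at 2/3 incorrect.
odd_one_out = [
    [True, True, True],
    [True, True, False],
    [False, True, False],  # two errors, so testing stops here
    [True, True, True],    # never administered, so never scored
]
odd_one_out_score = score_span_task(odd_one_out, max_errors=2)

# Hypothetical Listening Recall protocol: six trials per list length, stop at 4/6 incorrect.
listening_recall = [
    [True] * 6,
    [True, True, True, False, True, True],
    [False, False, True, False, False, True],  # four errors, so testing stops here
]
listening_recall_score = score_span_task(listening_recall, max_errors=4)
```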

#### Inhibition

The "Verbal Inhibition, Motor Inhibition" test (VIMI; Henry et al., 2012) was used. This task had two types of response: to copy the Experimenter; or to inhibit copying and produce an alternative response. For part A of the verbal task, the Experimenter said either "doll" or "car" and the participant was asked to repeat the same word (block 1). Next, in block 2, the child was expected to inhibit repeating the response: "If I say doll, you say car; and if I say car, you say doll." Next there was a second "copy" block and a second "inhibit" block. Each of the four blocks had 20 trials. This entire sequence was repeated in part B, with new stimuli ("bus" and "drum"). In the non-verbal motor task the same format was followed, but words were replaced with hand actions. For part A, the action was a pointing finger versus a fist; for part B the action was a flat horizontal hand versus a flat vertical hand. The total number of errors made across parts A and B on each task was used as the measure of inhibition and was expressed as a negative score. Cronbach's alpha, based on total error scores from parts A and B, was 0.915 for the non-verbal task, and 0.727 for the verbal task.
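For readers unfamiliar with the statistic, Cronbach's alpha computed from the two part scores (A and B) can be sketched with the standard formula, alpha = k/(k − 1) × (1 − sum of part variances / variance of the total score). The error counts below are invented for illustration and do not reproduce the reliabilities reported above; the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(parts):
    """Cronbach's alpha for k parts; each part holds one score per participant."""
    k = len(parts)
    totals = [sum(scores) for scores in zip(*parts)]  # per-participant total score
    part_variance_sum = sum(variance(p) for p in parts)
    return (k / (k - 1)) * (1 - part_variance_sum / variance(totals))


# Hypothetical error totals for five children on parts A and B of one task.
part_a_errors = [2, 0, 5, 1, 3]
part_b_errors = [3, 1, 4, 0, 3]

alpha = cronbach_alpha([part_a_errors, part_b_errors])
```

With only two parts, alpha is high when children who make many errors on part A also make many errors on part B.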

#### Switching

It was difficult to obtain simple and comparable measures of switching that were in the verbal versus visuospatial domains. The two selected were the verbal trail making task (D-KEFS; Delis et al., 2001) and the non-verbal Intra/Extra Dimensional Set Shift test (Cambridge Neuropsychological Test Automated Battery; Cambridge Cognition, 2006). The Trail Making Task requires continual switching between two classes of item (easily nameable numbers and letters), whereas the Intra/Extra Dimensional Shift test required children to learn a rule to guide responding and then switch to another rule unpredictably, and this task concerned stimuli that were not easily nameable. These are not identical tasks, but they both required children to switch between response sets and also required them to be flexible when responding. These tasks (and other similar versions of them) have been commonly used in previous literature to assess switching in both children and adults, so they have considerable face validity for measuring this construct.

In the Trail Making Test children joined small circles containing letters and numbers alternately, in sequence (1-A-2-B-3-C through 16-P). Four control conditions assessed component skills. The most relevant were: number sequencing (connecting numbers 1–16); and letter sequencing (connecting letters A–P). "Switching cost" was the total time taken for combined letter/number switching, minus the sum of the time taken for the number and letter sequencing component skills. These scores were multiplied by −1 so that as the scores increased from negative to positive this represented increasing switching ability. The letter sequencing and the number sequencing tasks were terminated after 150 s; the number–letter switching task was terminated after 240 s. Test–retest reliabilities for measures contributing to "switching cost" are reported as: number sequencing (0.77), letter sequencing (0.57); letter/number switching (0.20; Delis et al., 2001). Reliability for switching measures can be low, given they are difference scores; consequently, somewhat lower reliabilities are likely in this area (Henry and Bettenay, 2010).
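The switching-cost score described above can be sketched as a short calculation. The function name and the completion times are hypothetical, and capping each time at its termination limit is an assumption about how timed-out conditions were recorded.

```python
SEQUENCING_LIMIT_S = 150  # number and letter sequencing were terminated after 150 s
SWITCHING_LIMIT_S = 240   # number-letter switching was terminated after 240 s


def switching_score(switch_time_s, number_time_s, letter_time_s):
    """Sign-reversed switching cost: higher values indicate better switching."""
    switch = min(switch_time_s, SWITCHING_LIMIT_S)
    number = min(number_time_s, SEQUENCING_LIMIT_S)
    letter = min(letter_time_s, SEQUENCING_LIMIT_S)
    cost = switch - (number + letter)  # switching time minus summed component times
    return -1 * cost


# Hypothetical completion times in seconds (illustrative values only).
slower_switcher = switching_score(120, 30, 35)
faster_switcher = switching_score(90, 30, 35)
```

With identical sequencing times, the child who completes the switching condition faster obtains the higher (less negative) score, which matches the direction of scoring described in the text.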

For the Intra/Extra Dimensional Set Shift task, two colored stimuli were initially presented on a screen and, by touching one, the child could learn a rule from feedback about which was "correct." Later, a second dimension, an irrelevant white line, was introduced. This introduced new stimuli, yet the child still needed to respond to the shape stimuli. The complex stimuli were later changed and the child had to switch attention to the previously irrelevant dimension to obtain "correct" responses ("extradimensional" shift). Total error scores were used (test–retest reliability reported as 0.40; Cambridge Cognition, 2006) and the scores were multiplied by −1 so that as the scores became less negative this represented increasing switching ability.

#### Fluency

Verbal fluency (D-KEFS; Delis et al., 2001) involved several versions of a similar task. In all tasks, the children were asked to say as many words as possible in 1 min according to a criterion. "Letter fluency" involved the letters F, A, and S; "category fluency" concerned the semantic categories of "animals" and "boys' names." Verbal fluency was the total raw score from all five tasks.

Non-verbal fluency (Design Fluency, D-KEFS) involved a response booklet containing patterns of dots in boxes. The children were asked to draw as many different designs as possible in 1 min, each in a different box, by connecting dots with four straight lines (with no line drawn in isolation). Condition 1 consisted of only filled dots; Condition 2 consisted of arrays of filled and empty dots and the child connected only empty dots. Design fluency was the total raw score from these two conditions. Test–retest reliabilities are reported as: letter (0.67); category (0.70); filled dots (0.66); empty dots (0.43) (Delis et al., 2001).

#### Planning

The Sorting Test (D-KEFS) assessed verbal and non-verbal planning. Children sorted sets of six cards into two groups of three, in as many different ways as they could. There were three possible "verbal" sorts (e.g., transport/animals; things that fly/things that move along the ground) and five possible "perceptual" sorts (e.g., small/large; straight/curved edges). Total numbers of correct verbal or perceptual sorts were used as the measures of verbal or non-verbal planning, respectively (test–retest reliability reported as 0.49; Delis et al., 2001).

# RESULTS

The mean scores on the ten EF assessments are shown in **Table 1**. Bivariate correlations between the assessments are given in **Table 2** and show moderate correlations between variables, with no correlations above 0.50.

To ensure comparability of data from the two samples, z-scores were calculated for each measure; this was done separately for each of the two samples and then the data were combined. This ensured that any differences due to sampling would be minimized. Skewness and kurtosis were examined using a critical value of 3.29 for medium-sized samples (Kim, 2013). The skewness and kurtosis of all the variables were acceptable except for the skewness of verbal working memory and verbal inhibition, and the kurtosis of verbal working memory. The relevant graphs were inspected and appeared acceptable, given that the univariate assumption of normality is not always considered critical to factor analysis. Checks were made for univariate outliers and there were no extreme scores according to SPSS box plots. Mahalanobis distance was also checked and there was only one multivariate outlier; removal of this case did not influence the analyses.
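The two preprocessing steps described above, standardizing within each sample before pooling and screening skewness against a critical z value, can be sketched as follows. This is an illustration only: the raw scores are invented, the function names are hypothetical, and the skewness statistic and its standard error use the common small-sample formulas, which we assume (but cannot confirm) correspond to what was computed in SPSS.

```python
import math
import statistics

CRITICAL_Z = 3.29  # critical value for medium-sized samples cited in the text


def zscore_within_sample(values):
    """Standardize scores using the mean and SD of their own sample."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [(v - mean) / sd for v in values]


def skewness_z(values):
    """Adjusted sample skewness divided by its standard error."""
    n = len(values)
    mean = statistics.mean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    g1 = m3 / m2 ** 1.5
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / se


# Hypothetical raw scores on one measure from two separate samples.
sample_a = [12, 15, 11, 18, 14, 16]
sample_b = [22, 30, 25, 27, 31, 24]

# Standardize each sample separately, then pool.
combined = zscore_within_sample(sample_a) + zscore_within_sample(sample_b)

# Flag the pooled variable if its skewness z exceeds the critical value.
flagged = abs(skewness_z(combined)) > CRITICAL_Z
print(len(combined), flagged)
```

Standardizing within each sample removes any difference in mean level or spread between the samples, so the pooled factor analysis reflects only the relationships among measures.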

Exploratory factor analysis (Principal Axis Factoring in SPSS) was used rather than CFA, as previous theory and research have produced different models of EF structures and we were limited to two variables for each construct. For the EFA analyses, oblique rotation (oblimin) was employed, as it was thought that EF factors could be related to one another, as suggested by the idea of unity and diversity (Miyake and Friedman, 2012). To check whether a different method of extraction and rotation resulted in different factors, principal components analyses (PCA) with orthogonal rotation (varimax) were also conducted. PCA is usually recommended for the derivation of scores rather than the investigation of factor structure, and varimax rotation is usually regarded as maximizing the spread of loadings within factors (Field, 2009). Consequently, the main interest was in the findings from the EFA, with the PCA analysis being used to check that a different form of analysis produced similar findings.

TABLE 1 | Means and standard deviations of the EF assessments.

<sup>∗</sup>Verbal Inhibition, Motor Inhibition Test.

TABLE 2 | Bivariate correlations between EF assessments.

For the first analysis on the six core EF variables (i.e., EWM, inhibition, and switching), Bartlett's Test of Sphericity was significant at p < 0.001 (95.67, df 15). The Kaiser-Meyer-Olkin statistic of sampling adequacy was 0.66, which is acceptable according to Tabachnick and Fidell (2013). Even so, caution should be exercised when interpreting the findings about the separation of variables into factors. The measures of sampling adequacy of the variables from the diagonals of the anti-image correlation matrix were all above 0.6 (for switching 0.74 and for the remaining variables between 0.61 and 0.68), and therefore were adequate (Field, 2009).
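As a reminder of what Bartlett's test evaluates, the sketch below computes the usual chi-square approximation from a correlation matrix. It is illustrative only: the function names are hypothetical, the example matrix is invented, and the standard textbook formula is assumed rather than taken from the SPSS output. The degrees of freedom, p(p − 1)/2, do reproduce the values reported for six variables (15) and nine variables (36).

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, df) for the hypothesis that corr is an identity matrix."""
    p = len(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_square, df


# Invented 3 x 3 correlation matrix with moderate correlations.
example_corr = [
    [1.0, 0.4, 0.3],
    [0.4, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]
chi_square, df = bartlett_sphericity(example_corr, n_obs=128)
print(round(chi_square, 2), df)
```

A significant result indicates that the correlation matrix differs from an identity matrix, which is a precondition for factor analysis being meaningful.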

Two factors were identified by the analysis. The eigenvalues for the first three factors were 2.1, 1.1, and 0.9, showing a reasonable separation between factors 2 and 3, which supports the choice of factors with eigenvalues above 1. The first two factors accounted for 54.86% of the variance. **Table 3** displays the pattern matrix (i.e., rotated), which provides information about the regression coefficients for each variable. Coefficients or loadings above 0.30 are displayed in this and the other table. The findings in the pattern matrix indicate that the first factor had the most important contribution from verbal EWM, and included non-verbal EWM as well as smaller contributions from the two switching variables. The second factor contained the two inhibition variables. This suggests the presence of two factors, one which primarily involved EWM and switching, and a second factor that involved inhibition. The organization of the variables into factors showed no evidence of a separation into verbal and non-verbal variables. The findings from the PCA analysis are also provided in **Table 3**. The major differences between the EFA and the PCA involve higher loadings from the PCA, which is often the case. Furthermore, in the PCA, non-verbal working memory was identified with a loading of above 0.30 on the second factor involving inhibition.

TABLE 3 | Pattern matrix for exploratory factor analyses (EFA; oblique rotation) and principal components analyses (PCA; varimax) on assessments of EWM, switching, and inhibition.

For the analyses on the 10 EF variables (i.e., including verbal and non-verbal fluency and planning in addition to the six core EF variables) different structures were produced for the initial EFA and PCA analyses. These differences were only present when the verbal planning variable was entered into the analyses of the ten variables. There were other problematic issues with this variable. Verbal planning had the most limited range of scores of any variable and had the lowest measure of sampling adequacy in the anti-image correlation table. In addition, non-verbal planning which involved a very similar task, but with a greater range of scores, did not have the same problems. Consequently, it was decided to remove verbal planning from the analyses.

In the analyses of the nine EF variables (**Table 4**), the Kaiser-Meyer-Olkin statistic was acceptable (0.74) as was Bartlett's test of sphericity (208.85, p < 0.001, df 36). The measures of sampling adequacy were also acceptable, as all were above 0.62 (verbal and non-verbal inhibition 0.62–0.69; verbal switching was 0.82, and the remaining variables were between 0.70 and 0.79). In the analysis using EFA with oblique rotation, two factors were identified and the eigenvalues for the first three factors were 2.9, 1.4, and 1.0, showing a reasonable separation between factors 2 and 3. The first two factors accounted for 47.56% of the variance in total, 32.45 and 15.11%, respectively.
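The link between the reported eigenvalues and the percentages of variance can be checked with simple arithmetic. For standardized variables the total variance equals the number of variables, so each eigenvalue divided by that number gives a proportion of variance. The sketch below assumes the reported percentages refer to initial eigenvalues of the correlation matrix; the function names are hypothetical.

```python
def implied_eigenvalue(percent_variance, n_variables):
    """Eigenvalue implied by a percentage of total variance."""
    return percent_variance * n_variables / 100.0


def retained_by_kaiser(eigenvalues):
    """Keep factors whose eigenvalue exceeds 1 (the rule used in the text)."""
    return [value for value in eigenvalues if value > 1.0]


# Nine-variable analysis: reported percentages for factors 1 and 2.
first = implied_eigenvalue(32.45, 9)    # about 2.92, reported as 2.9
second = implied_eigenvalue(15.11, 9)   # about 1.36, reported as 1.4
total_percent = 32.45 + 15.11           # 47.56, as reported

# Reported (rounded) eigenvalues for the first three factors.
six_variable = [2.1, 1.1, 0.9]
nine_variable = [2.9, 1.4, 1.0]
print(round(first, 2), round(second, 2), round(total_percent, 2))
print(len(retained_by_kaiser(six_variable)), len(retained_by_kaiser(nine_variable)))
```

The third eigenvalue in the nine-variable analysis is reported as 1.0 after rounding. The two-factor solution implies it did not exceed 1, although this cannot be verified from the rounded value alone.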



The pattern matrix reported in **Table 4** shows that the majority of the variables contributed to the first factor, with the most important contributions from verbal fluency and verbal EWM. The second factor was made up of verbal and non-verbal inhibition. The findings did not show an obvious separation of variables according to whether or not they involved verbal or non-verbal EF tasks.

A further analysis on the same variables conducted using PCA with varimax rotation is also reported in **Table 4**. The findings were similar to the EFA in that all the variables except for verbal and non-verbal inhibition loaded on the first factor. The most notable difference from the EFA analysis was that non-verbal working memory had a low loading on factor 2. Again, verbal working memory and verbal fluency made the largest contributions to component 1.

# DISCUSSION

# The Structure of EF in Primary School Aged Children

Exploratory factor analyses and PCAs were conducted on data concerning verbal and non-verbal assessments of EF obtained from 128 typically developing children aged between 6 and 12 years. The findings from an EFA involving the core EF tasks of EWM, inhibition, and switching identified two factors. The first factor had contributions from all the four EWM and switching variables, and the second factor consisted of verbal and nonverbal inhibition. A PCA produced similar findings, although in this case there was evidence from the component loadings of weak links between non-verbal EWM and inhibition.

Further analyses were conducted with the inclusion of verbal and non-verbal planning and fluency. The initial analyses indicated that the inclusion of verbal planning resulted in different structures in EFA and PCAs. Because these two sets of analyses are usually expected to produce similar findings, and verbal planning had poor psychometric properties, it was decided to remove the verbal planning variable from subsequent analyses. Further EFA on the nine remaining EF variables resulted in a two-factor solution. The first factor had contributions from verbal and non-verbal EWM, verbal and non-verbal switching, verbal and non-verbal fluency, and non-verbal planning. The second factor was made up of verbal and non-verbal inhibition. The PCA produced similar findings, and again there was a weak contribution from non-verbal EWM to the inhibition factor. Consequently, the additional fluency and planning variables loaded onto the first factor/component in both analyses, which appeared to involve a general EF ability. It was notable that both verbal EWM and verbal fluency had the highest loadings on this factor.

The analyses on the nine variables using different forms of data reduction produced very similar outcomes. However, it needs to be acknowledged that this only occurred after excluding verbal planning from the analyses. This variable had a low range of scores and a low measure of sampling adequacy, which provided a justification for its removal. In addition, non-verbal planning, which involved very similar activities but had a greater range of scores, did not have the same problems. Consequently, although there are advantages of the D-KEFS assessment of verbal planning, as it seems less affected by the task impurity problems associated with Tower tasks, it may have disadvantages when used with children between 6 and 12 years. Future research might consider alternative assessments of verbal planning with better psychometric properties and less restricted variance. More generally, it also would be desirable to have a greater number of assessments for each construct and a larger sample size than in this investigation.

Thus, the current analyses provided support for an inhibition factor and a general EF factor involving EWM, switching, fluency, and planning. The findings are consistent with previous research in children between 6 and 12 years as more than one EF factor was identified. However, previous research has largely considered only three EF components, namely EWM, switching, and inhibition. A novel contribution of the current study is that adding measures of planning (non-verbal) and fluency (verbal and non-verbal) resulted in the same two-factor structure, with the additional measures loading largely on a general EF factor. In relation to these findings, it is worth noting that factor analysis is less effective than structural equation modeling with a larger sample in identifying whether planning, as has been previously discussed (Diamond, 2013), is a higher order EF structure.

# Explanations for Different EF Structures

One general issue in relation to our findings concerns the reasons why two-factor solutions should be the most common description of the organization of EF between 6 and 12 years. Part of the answer is likely to be that the period between 6 and 12 years represents a progression from the one-factor solutions that are reported at younger ages (Wiebe et al., 2008, 2011; Hughes et al., 2010) before reaching the more complex three-factor solutions identified in adulthood (Miyake and Friedman, 2012). The one-factor solutions reported in pre-school children suggest that individual differences in EF abilities are similar across all aspects of EF. This may be the result of a set of general problem-solving abilities, such as core components relating to self-control or self-regulation (e.g., Miyake and Friedman, 2012), attentional abilities (Garon et al., 2008) or processing speed (Gooch et al., 2016) influencing performance across a wide range of EF tasks, with the result being consistent individual differences across the different EF tasks.

The commonly reported finding of a two-factor EF structure during the primary school years has been replicated here, and suggests that during this age range more specialist and differentiated mental capacities are available. In terms of individual differences, this implies that some children become good at one aspect of EF while other children become good at another. However, this development should not result in the variability we see in factor structures across different investigations. For example, in previous research, there is more evidence for a separation into abilities which are relevant to updating/EWM on the one hand, and inhibition-switching abilities on the other, as suggested by Lee et al. (2013). Nevertheless, there are also reports of a separation into abilities relevant to inhibition versus EWM-switching abilities, as suggested by St Clair-Thompson and Gathercole (2006), mirroring the findings of our analyses.

It is possible that the different factor structures in the 6–12 years age range are a product of task impurity (Miyake et al., 2000). It is generally agreed that task impurities result in performance on assessments being driven by several different EF abilities and potentially other non-EF abilities (Friedman et al., 2008). Across different investigations, task impurity could mean that even different tasks believed to assess the same EF component may have different relationships with other EF tasks. CFA analysis with the use of latent variables helps to avoid this type of problem (Miyake and Friedman, 2012), but even here the latent variable will be dependent on which tasks have been chosen to represent it. Consequently, if different investigations use different tasks to assess each of the three EF components, this is likely to result in different factor structures across the investigations. It is possible that a larger number of tasks to assess each component of EF ability and a larger number of children would result in greater consistency, but ethical and practical constraints on testing time and participant numbers make it extremely difficult to achieve this.

Not only is task impurity an issue, but a related problem is that there is variation between investigations about which tasks assess the most relevant characteristics of an EF component. For example, a range of tasks have been used to provide indicators of inhibition ability, and the use of very similar inhibition tasks is likely to result in a more coherent and stronger underlying factor or latent variable. In our study the two assessments of inhibition had very similar task demands and inhibition was identified as a separate factor. In contrast, Huizinga et al. (2006) could not identify a common factor from the three different assessments of inhibition that they used (specifically, stop signal, flanker, and Stroop). These issues about the choice of variables that are entered into a factor analysis may be as important as some of the statistical considerations in determining the factor structure, but it is much more difficult to specify what is best practice.

A further reason for different factor structures across investigations is that our conceptualization of the identity of the different forms of EF ability in the 6–12 age range needs to be further refined. Much of the thinking about the components of EF appears to be task-based and this is a sensible initial approach. However, we may need to consider potential neurocognitive processes that give rise to different EF abilities (Anderson, 2002), and so take a more brain-orientated and cognitive-based approach to the abilities underlying EF. This could involve investigating the brain structures which are activated during different EF tasks and using this as a basis to help identify those areas which are common to different EF processes.

#### Language and EF Abilities

If we had found that verbal and non-verbal EF tasks loaded on different factors, this would have provided strong support for the idea that verbal ability has an influence on verbal EF tasks. However, the factors that were identified contained a mix of verbal and non-verbal variables. Consequently, the findings from this study failed to provide support for the argument that language ability directly affects verbal EF abilities at the task performance level (Bishop et al., 2014).

Although these findings are consistent with the idea that language ability is not an important influence on EF performance, our evidence in support of this position is limited in nature, especially as there is a range of sources of evidence that should be used to address this complex question (Botting et al., 2016). In other words, our data are not able to provide clear support for the idea that language does not influence EF abilities. This is because the evidence is cross-sectional, correlational in nature and consists of the absence of a positive effect. Further, we acknowledge that the relationship between language and EF abilities is complicated by the fact that verbal abilities are relevant to non-verbal tasks in order to understand instructions, and for the operation of inner speech which could be utilized during EF tasks; it also might be that some non-verbal processes have an influence on verbal tasks (e.g., certain forms of inhibition). Thus, the current findings do not provide definitive evidence about the relationship between EF and language. Rather, they provide support for the idea that concurrent language ability does not differentially affect performance on tasks selected to assess verbal and non-verbal EF.

#### Summary

Our findings support previous research concerning two-factor structures of EF in the primary school years, and suggest that planning and fluency contribute to a general EF factor. However, the current findings and those from previous investigations about the composition of the factors suggest that future research should keep in mind important methodological considerations relating to EF measures, and that task influences may be as important as individual differences in determining factor structures. Our findings did not provide evidence of separable verbal and non-verbal factors, and consequently failed to provide support for an effect of language ability on EF. Finally, research and theorizing could benefit from a greater focus on basic neurocognitive operations that underlie performance on EF tasks, to more fully understand the developmental, clinical, and educational implications of differentiation in EF with age.

#### DATA AVAILABILITY


The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

#### AUTHOR CONTRIBUTIONS

DM led on the writing and analyses. MB contributed to writing, analyses, and data collection. LH was principal investigator on both studies; the remaining authors made equivalent contributions to writing and/or data collection.

#### FUNDING

This research was funded by the Economic and Social Research Council grant number RES-062-23-0535 and Waterloo Foundation, References 1121/1555 and 920/2318.

#### ACKNOWLEDGMENTS

We would like to thank the children, teachers, parents, and Speech and Language Therapists who kindly helped with this project.

#### REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Messer, Bernardi, Botting, Hill, Nash, Leonard and Henry. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neuroanatomical Correlates of the Unity and Diversity Model of Executive Function in Young Adults

Harry R. Smolker<sup>1,2\*</sup>, Naomi P. Friedman<sup>1,2</sup>, John K. Hewitt<sup>1,2</sup> and Marie T. Banich<sup>1</sup>

<sup>1</sup> Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, CO, United States, <sup>2</sup> Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO, United States

Understanding the neuroanatomical correlates of individual differences in executive function (EF) is integral to a complete characterization of the neural systems supporting cognition. While studies have investigated EF-neuroanatomy relationships in adults, these studies often include samples with wide variation in age, which may mask relationships between neuroanatomy and EF specific to certain neurodevelopmental time points, and such studies often use unreliable single task measures of EF. Here we address both issues. First, we focused on a specific age at which the majority of neurodevelopmental changes are complete but at which age-related atrophy is not likely (N = 251; mean age of 28.71 years, SD = 0.57). Second, we assessed EF through multiple tasks, deriving three factor scores guided by the unity/diversity model of EF, which posits a common EF factor that influences all EF tasks, as well as an updating-specific and a shifting-specific factor. We found that better common EF was associated with greater volume and surface area of regions in right middle frontal gyrus/frontal pole, right inferior temporal gyrus, as well as fractional anisotropy in portions of the right superior longitudinal fasciculus (rSLF) and the left anterior thalamic radiation. Better updating-specific ability was associated with greater cortical thickness of a cluster in left cuneus/precuneus, and reduced cortical thickness in regions of right superior frontal gyrus and right middle/superior temporal gyrus, but no aspects of white matter diffusion. In contrast, better shifting-specific ability was not associated with gray matter characteristics, but rather was associated with increased mean diffusivity and reduced radial diffusivity throughout much of the brain and reduced axial diffusivity in distinct clusters of the left superior longitudinal fasciculus, the corpus callosum, and the right optic radiation.
These results demonstrate that associations between individual differences in EF ability and regional neuroanatomical properties occur not only within classic brain networks thought to support EF, but also in a variety of other regions and white matter tracts. These relationships appear to differ from observations made in emerging adults (Smolker et al., 2015), which might indicate that the brain systems associated with EF continue to experience behaviorally relevant maturational processes beyond the early 20s.

Keywords: executive control, neuroanatomy, individual differences, structural MRI, diffusion tensor imaging

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Tim Shallice, University College London, United Kingdom; Corentin Gonthier, University of Rennes 2 – Upper Brittany, France; Frini Karayanidis, University of Newcastle School of Medicine and Public Health, Australia

#### \*Correspondence:

Harry R. Smolker harry.smolker@colorado.edu

Received: 08 January 2018 Accepted: 25 June 2018 Published: 20 July 2018

#### Citation:

Smolker HR, Friedman NP, Hewitt JK and Banich MT (2018) Neuroanatomical Correlates of the Unity and Diversity Model of Executive Function in Young Adults. Front. Hum. Neurosci. 12:283. doi: 10.3389/fnhum.2018.00283

# INTRODUCTION

fnhum-12-00283 July 18, 2018 Time: 16:14 # 2

Executive function (EF) is a set of domain-general cognitive control mechanisms supporting goal-directed behaviors (Banich, 2009). Lesion and functional neuroimaging studies have identified the prefrontal cortex (PFC) as being critically involved in supporting EF processes (for a review, see Alvarez and Emory, 2006), likely regulating behavior by biasing neuronal dynamics in more posterior brain regions supporting sensory processing, motor execution, and emotion, among other domains (Miller and Cohen, 2001; Banich, 2009; Depue et al., 2016). Specifically, functional magnetic resonance imaging (fMRI) studies evaluating the neural substrates associated with distinct EF dimensions have implicated regions of lateral PFC (Duncan and Owen, 2000; Wager and Smith, 2003; Collette et al., 2005; Petrides, 2005; Banich, 2009) and medial PFC (Wager et al., 2004; Derrfuss et al., 2005) in supporting discrete EF constructs analogous to those evaluated in the current study. Nonetheless, it remains unclear whether individual differences in EF abilities in healthy individuals who do not suffer from neurological insult are associated with neuroanatomical characteristics of these same PFC brain regions. An alternative possibility, which we explore in the current study, is that higher levels of EF are associated with regions outside of the PFC, and potentially even outside of the fronto-parietal network (FPN). Such findings would suggest that higher levels of EF are characterized by the potential for larger participation and/or distribution of processing across the brain.

In the current study, we focus on neuroanatomical correlates of EF derived from structural MRI (sMRI), including surface-based morphometry (SBM) and diffusion-tensor imaging (DTI). Unlike fMRI, which is fleeting and susceptible to confounding factors such as fatigue (Peltier et al., 2005; Cook et al., 2007), stimulus exposure (Grill-Spector et al., 2006), and mood (Posse et al., 2003), sMRI provides relatively stable metrics of brain organization that may change with development but do not fluctuate on a day-to-day or moment-to-moment basis, as fMRI measures may. Of particular interest is the degree to which individual differences in neuroanatomy (specifically gray matter volume, thickness, surface area, and local gyrification, as well as white matter diffusion properties, including fractional anisotropy, mean diffusivity, axial diffusivity, and radial diffusivity) are associated with individual differences in EF ability.
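For readers less familiar with the four white matter diffusion metrics named above, the sketch below shows how they are conventionally derived from the three eigenvalues of a diffusion tensor. These are the standard textbook definitions rather than a description of the processing pipeline used in this study; the function name and the example eigenvalues are hypothetical.

```python
import math


def dti_scalars(l1, l2, l3):
    """Return (FA, MD, AD, RD) from tensor eigenvalues, with l1 >= l2 >= l3."""
    md = (l1 + l2 + l3) / 3.0        # mean diffusivity
    ad = l1                          # axial diffusivity (principal direction)
    rd = (l2 + l3) / 2.0             # radial diffusivity (perpendicular)
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    # Fractional anisotropy: 0 for isotropic diffusion, 1 for diffusion
    # along a single axis.
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return fa, md, ad, rd


# Hypothetical eigenvalues (mm^2/s) for a strongly directional voxel.
fa, md, ad, rd = dti_scalars(1.7e-3, 0.3e-3, 0.2e-3)
print(round(fa, 3), md, ad, rd)
```

Under these definitions mean diffusivity equals (AD + 2·RD)/3, so the three diffusivity measures are not independent of one another.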

To date, research into the neuroanatomical correlates of EF abilities has largely implicated regions within the FPN, specifically the PFC, as being central to EF task performance (Gunning-Dixon and Raz, 2003; Van Petten et al., 2004; Zimmerman et al., 2006; Vasic et al., 2008; Depue et al., 2010; Salthouse, 2011; Yuan and Raz, 2014; Bettcher et al., 2016). However, most research on associations between neuroanatomy and individual differences in EF in healthy adults has had two characteristics that make drawing conclusions somewhat difficult. First, much of the existing research on the neuroanatomical correlates of EF fails to differentiate between EF and non-EF processes that contribute to task performance on any given EF task, and thus suffers from the "task impurity problem" (Miyake et al., 2000). Second, many studies have employed samples that span a wide range of ages (Zimmerman et al., 2006; Newman et al., 2007), including individuals for whom the brain is continuing to develop (i.e., individuals in their teens and early 20s) as well as individuals for whom atrophy may have already commenced, such as those middle-aged and beyond (Zimmerman et al., 2006; Newman et al., 2007). The current study addresses both of these issues. We characterize each individual's EF performance across a battery of tasks, allowing us to compute EF factor scores that provide a measure of an individual's EF abilities less contaminated by specific requirements for any given task. In particular, we derive the three EF factors posited by the unity/diversity model of EF, a well validated individual differences model of EF (Friedman et al., 2008; Miyake and Friedman, 2012; Friedman and Miyake, 2017).
The association of these factor scores with neuroanatomical characteristics of the brain is then investigated in a developmentally homogenous sample of adults (all within a year of 29 years of age), following up on our prior investigations with an emerging adult (i.e., college-aged) sample (Smolker et al., 2015). We discuss each of these issues in turn.

A major challenge to a clear understanding of the relationship between individual differences in EF and brain anatomy is that there is very little agreement across studies as to the best models and methodologies to characterize EF. One issue is that performance on any given EF task likely taps both EF and non-EF processes (Denckla, 1996; Rabbitt, 1997; Miyake et al., 2000), such as speed of processing and visual acuity, among others. To reduce this "task impurity problem," EF researchers have used latent variables or composites of performance across tasks that tap the same EF, each of which requires different non-EF processes (Miyake et al., 2000). While these factor analytic methodologies have been frequently employed in general research on EF, such methodologies have been scarcely employed in trying to understand the neuroanatomical correlates of discrete EF constructs. Although researchers have tested models of EF that posit multiple EF constructs (Robbins, 1996; Stuss and Alexander, 2007; Stuss, 2011), studies evaluating the neuroanatomical correlates of EF have rarely measured individual differences on EF constructs through factor analyses across multiple, reliable EF tasks. As such, different studies are investigating different "slices" of EF, making it difficult to interpret findings across studies (i.e., whether results are not replicating, or whether individual differences in distinct aspects of EF are associated with distinct neuroanatomical substrates). In past (Smolker et al., 2015), present, and future studies, our research team is attempting to conduct a series of studies examining the linkage between individual differences in EF and brain anatomy that employ consistent methodologies so as to allow for meaningful comparisons across samples, such as those in different age segments across the lifespan.

An abundance of evidence suggests that there are multiple separable constructs at the core of EF ability (Alvarez and Emory, 2006; Stuss and Alexander, 2007; Friedman and Miyake, 2017). Although EF has been operationalized under a number of frameworks (Baddeley, 1996; Miyake et al., 2000; Banich, 2009), the unity/diversity model of EF has emerged as a powerful model for interrogating the mechanistic structure of EF in the context of individual differences (Friedman and Miyake, 2017). With confirmatory factor analysis, the unity/diversity model partitions performance on multiple EF tasks into separable EF dimensions, and also reduces measurement error, providing a more accurate estimate of an individual's underlying EF abilities (Miyake et al., 2000). Specifically, the model captures the correlations among response inhibition, working memory updating, and mental set shifting tasks with three orthogonal factors: a common factor that is involved in all EF tasks, known as common EF, as well as two separable and more specific factors, known as shifting-specific EF and updating-specific EF, respectively. Common EF captures variance in performance that is shared between EF tasks, and has been conceptualized as an ability to maintain a task set or goals. Shifting-specific and updating-specific represent residual covariance among mental set shifting tasks and working memory updating tasks, respectively, after variance due to common EF has been removed. Shifting-specific is thought to reflect the speed with which no-longer-relevant goals can be cleared from working memory, while updating-specific is thought to reflect the accuracy of working memory gating and possibly retrieval processes (Miyake and Friedman, 2012; Friedman and Miyake, 2017). There is no inhibition-specific factor, because once the common EF factor is in the model, there are no remaining correlations among the inhibition tasks; that is, common EF captures all the individual differences in response inhibition (see Friedman and Miyake, 2017, for further discussion).

Studies with adult samples spanning wide age ranges or with clinical populations generally demonstrate that impairment in EF abilities is associated with reductions in neuroanatomical properties within regions of the FPN, including measures of gray matter morphometry and diffusion characteristics of white matter. Of note, however, not only do such studies generally fail to use specific models of individual differences in EF, like the unity/diversity model, but many of the studies that have investigated the relationship between level of EF ability and neuroanatomy have done so in the elderly (Gunning-Dixon and Raz, 2003; Van Petten et al., 2004; Kramer et al., 2007; Salthouse, 2011; Bettcher et al., 2016) or across a wide expanse of age, ranging from the teens to the 60s or 70s (Zimmerman et al., 2006; Newman et al., 2007). Recent research suggests that the late teens and early 20s are an especially active time in brain development (Gogtay et al., 2004), a developmental period that some have referred to as "emerging adulthood" (e.g., Arnett, 2000). It is becoming increasingly clear that during this period, aspects of brain morphology and white matter diffusion continue to develop (Sowell et al., 2001, 2003; Mukherjee et al., 2002; Asato et al., 2010), with levels of multiple neuroanatomical properties not reaching stability until around 30 years of age, if not older (Sowell et al., 2003; Westlye et al., 2009). Conversely, aspects of brain atrophy can start to be observed in the 60s and 70s (Rusinek et al., 2003; Salthouse, 2011; Bettcher et al., 2016). As such, employing samples that are heterogeneous for age and/or developmental status may obscure informative brain-behavior relationships that are only present during specific developmental periods. This presents a challenge to traditional practices used to investigate individual differences, in which maximum between-subject variability in dependent and independent variables is desirable. Whereas many studies of individual differences attempt to increase between-subject variance to have a better chance at detecting an effect, in the current study, we sought to minimize between-subject variance attributable to age.

Studies that have examined neuroanatomical structure in individuals suffering from psychopathology (Szeszko et al., 2000; Pantelis et al., 2002; Makris et al., 2006; Rüsch et al., 2007; Watari et al., 2008; Keller et al., 2009; Depue et al., 2010) also provide limited information regarding individual differences in EF and brain structure in the neurologically normal brain, as these are populations in whom brain structure is likely altered. Findings within neurologically normal individuals regarding the relationship of individual differences in EF and brain neuroanatomy have been highly inconsistent. Whereas some studies report positive correlations between level of EF and aspects of brain neuroanatomy (i.e., better EF associated with greater neuroanatomy; Ettinger et al., 2005; Newman et al., 2007; Elderkin-Thompson et al., 2008; Gautam et al., 2011), others report negative relationships across the brain (i.e., better EF associated with reduced neuroanatomy; Gautam et al., 2009, 2011; Tamnes et al., 2010; Smolker et al., 2015). As such, the degree to which the neuroanatomy of PFC regions predicts levels of EF in non-aging or non-clinical populations remains to be seen. To address this, we focus our investigation on individuals in a relatively limited but developmentally stable time frame that we refer to as young adulthood, around the age of 30.

A very limited number of studies have examined the neuroanatomical correlates of EF from the perspective of the unity/diversity model, employing factor analyses across multiple tasks, in neurologically normal populations. These investigations have been limited so far to a sample of children and adolescents (Tamnes et al., 2010), as well as emerging adults (Smolker et al., 2015). None have done so in a population in whom most major neurodevelopmental processes are complete. Tamnes et al. (2010) found that, across childhood and adolescence, improved performance on tasks tapping three correlated EF dimensions (response inhibition, working memory updating, and set shifting) was associated with reductions in cortical thickness across a number of brain regions. Specifically, better performance on the antisaccade task, a proxy for common EF, was associated with reductions in cortical thickness of bilateral occipital lobe. Better performance on the keep track task, a proxy for updating (which contains variance related to common EF and updating-specific), was associated with reductions in cortical thickness of a portion of left dlPFC, extending back into bilateral postcentral gyrus (Tamnes et al., 2010). Finally, better performance on the plus-minus task, a proxy for shifting (which contains variance related to common EF and shifting-specific), was associated with reductions in cortical thickness of left precentral gyrus. Of these regions, only the dlPFC region associated with updating is considered part of the FPN (for a characterization of the FPN, see Yeo et al., 2011), suggesting that individual differences in EFs are associated with neuroanatomy both within and outside of the FPN, at least in childhood and adolescence.

Similarly, our research group found that each of the different dimensions of the unity/diversity model was associated with different aspects of brain neuroanatomy in a sample of emerging adults tightly centered around college age and somewhat overlapping in age with Tamnes and colleagues' sample (Smolker et al., 2015). Overall, we found that less regional gray matter volume and local gyrification within the PFC, as well as increased fractional anisotropy (a measure of white matter integrity) of white matter tracts connecting the PFC to posterior brain regions, were associated with higher EF ability. Specifically, better common EF was associated with reduced volume and local gyrification of ventromedial PFC and greater fractional anisotropy of the right superior longitudinal fasciculus. Moreover, better updating-specific was associated with decreased volume and gyrification of left dorsolateral PFC (dlPFC), and better shifting-specific was associated with reduced volume and local gyrification of the right ventrolateral PFC and increased fractional anisotropy of the right inferior fronto-occipital fasciculus. Of these results, as was the case in Tamnes et al. (2010), the only association between EF and regional neuroanatomy within the FPN was between updating-specific and dlPFC neuroanatomy, in this case with regard to volume and gyrification. The common EF and shifting-specific associations, though observed within the PFC, were not in regions commonly considered as being part of the FPN.

Taken together, these two studies provide converging evidence that, across childhood, adolescence, and early adulthood, (1) the unity/diversity dimensions of EF (or related tasks/constructs) are associated with neuroanatomy both within and outside of the FPN, and (2) there might be heterogeneity across age groups in the neuroanatomical regions associated with EF. However, it is difficult to disentangle age effects from effects driven by methodological differences between studies. On the one hand, in both Tamnes et al.'s (2010) child/adolescent and Smolker et al.'s (2015) emerging adult samples, updating performance was negatively correlated with dlPFC neuroanatomy. On the other hand, the regions associated with common EF and set shifting appeared to differ between age groups. An additional commonality between the results of Tamnes et al. (2010) and our research group (Smolker et al., 2015) is that reduced gray matter morphometry was associated with better EF performance. While at first glance this finding may seem counterintuitive, the brain is undergoing significant pruning during these ages, through which superfluous neurons are culled and the brain undergoes regional gray matter shrinkage, resulting in increased neural efficiency (Blakemore and Choudhury, 2006). Taken together, these results paint a picture of childhood through young adulthood in which better EF might be associated with individual differences in pruning of specific brain regions, with those individuals who have experienced greater pruning (and thus reduced gray matter morphometry) having better EF ability. At least in young adults, these negative associations between gray matter morphometry and EFs coincide with associations between white matter properties and EF, which may partially mediate gray matter-EF relationships (Smolker et al., 2015).

Given that the brain is likely still undergoing major developmental changes during the age ranges examined in both of the aforementioned samples, the negative correlations between gray matter morphometry and the unity/diversity dimensions may not be indicative of the relationships that would be observed in young adults. Indeed, studies evaluating neuroanatomical changes across the lifespan find that individuals in their early 20s are still undergoing neuronal pruning and axonal myelination, which may persist throughout much of the 20s. These effects stabilize around 30 years of age, and begin to change again around age 60, as individuals start to experience age-related neurodegeneration (Sowell et al., 2003; Westlye et al., 2009). As such, negative correlations between EF performance and gray matter morphometry during adolescence/emerging adulthood may exist because individuals with greater pruning are likely further along in typical neurodevelopmental processes, resulting in (a) better EF and (b) reduced gray matter morphometry. It remains unclear, however, whether the same neuroanatomical properties that are associated with EF during ages at which pruning and myelination are ongoing will also be associated with EF in individuals for whom pruning and myelination have largely finished. This uncertainty applies not only to the specific neuroanatomical properties implicated between different age groups but also to the regions implicated. For instance, it may be that neuroanatomical properties of the PFC are particularly relevant to individual differences in EF during development, but as individuals complete development, the variability in properties of the PFC between subjects becomes minimal and properties of the PFC are no longer relevant to individual differences in EF. Instead, the specific neuroanatomical properties and brain regions associated with EF may change across the lifespan, with distinct neuroanatomical correlates of EF occurring at distinct points in the lifespan.

Hence, in the present study, we focused on an age range in which the vast majority of neurodevelopmental processes associated with adolescence and emerging adulthood are likely to be over, but one at which age-related cognitive decline (and potential brain atrophy) is not yet likely to manifest. In a sample whose age is tightly focused around 30, we test for associations between individual differences in EFs and regional brain neuroanatomy, including characteristics of gray matter morphometry (volume, thickness, surface area, and local gyrification index), as well as multiple measures of white matter diffusion (fractional anisotropy, mean diffusivity, radial diffusivity, axial diffusivity). In addition to testing gray matter morphometry and DTI measures on a whole-brain basis, we also employed ROI analyses based on gray matter regions and white matter tracts associated with EFs in emerging adults (Smolker et al., 2015). Consistent with Smolker et al. (2015), we expected that regions of gray matter and white matter tracts associated with individual differences in EF would not be restricted to the FPN, but would likely include other prefrontal and posterior brain regions outside of the FPN. Due to developmental differences between the current sample and the sample employed in Smolker et al. (2015), we expected that the direction of the relationship between measures of gray matter morphometry, DTI measures, and EF might differ and/or that the regions implicated as related to individual differences in EFs might also be distinct.

# MATERIALS AND METHODS

## Participants

Participants were 251 individuals drawn from the larger Colorado Longitudinal Twin Study (LTS), scanned at a mean age of 28.71 years (SD = 0.57). Of these 251 individuals, 108 were monozygotic (MZ) twins (72 female), 88 were dizygotic (DZ) same-gender twins (54 female), and 55 were singletons (28 female) whose co-twins had not participated at the time of the analyses. This study was carried out in accordance with the recommendations of the University of Colorado Boulder Institutional Review Board, which approved the protocol prior to data collection. All participants gave written informed consent, in accordance with the Declaration of Helsinki, prior to the experimental session. Structural images for neuroanatomical analyses were collected as part of a larger protocol including fMRI scans during tasks and at rest.

#### EF Measures

The three EF constructs posited by the unity/diversity model of EF were assessed with six tasks previously shown to load on these constructs (Friedman et al., 2016). These six tasks were selected based on their factor loadings from prior waves of assessment using nine EF tasks in this sample (Friedman et al., 2016). The antisaccade, category-switch, and keep track tasks were completed during an fMRI session immediately following the T1 structural scan. Participants practiced these tasks outside the scanner prior to the scanning session to ensure they understood the tasks. They were reminded of the instructions at the beginning of the scanner tasks. The Stroop, letter memory, and number–letter tasks were completed as part of a larger behavioral battery immediately after the scanning session.

#### Antisaccade [Adapted From Roberts et al. (1994)]

Antisaccade captures the ability to maintain and execute a task set in the face of distracting information; specifically, it requires inhibiting prepotent eye movements (Miyake et al., 2000). In the scanner version, participants completed 20 s blocks of prosaccade, antisaccade, and rest (fixation) trials (12 blocks of each across two runs; 5 trials per block for the prosaccade and antisaccade blocks), each preceded by a jittered instruction (TOWARD, AWAY, or FIXATION for 2, 4, or 6 s). On each trial, after a jittered fixation lasting 1–3 s, a small visual cue flashed on one side of the computer screen for 234 ms, followed by a target (a digit from 0 to 9) that appeared for 150 ms before being masked. The mask lasted 1650 ms, during which time the participant vocalized the target. The cue and target appeared on the same side of the screen during prosaccade trials and opposite sides during antisaccade trials. Thus, to see the target for long enough to identify the number in the antisaccade trials, participants had to avoid the automatic tendency to saccade to the cue and instead immediately look in the opposite direction. The dependent measure was the proportion of correctly identified targets on the 60 antisaccade trials.

#### Stroop [Adapted From Stroop (1935)]

Stroop captures the ability to maintain a task set in the face of the distracting information, specifically, inhibiting the prepotent tendency to read words. Participants verbally indicated the font color (red, blue, or green) of text presented on a black screen as quickly as possible, with reaction time (RT) measured via a ms-accurate voice key. Trials were divided up into three types: a block of 42 neutral trials consisting of asterisks (3–5 characters long) presented in one of three colors; a block of 42 congruent trials consisting of color words that matched the font color (e.g., the word "RED" displayed in red font); and two blocks of 42 trials each of incongruent trials consisting of color words that did not match the font color (e.g., the word "RED" displayed in blue ink). Each word disappeared as soon as the voice key detected the response, and the next word appeared after a 250 ms white fixation. The dependent measure was the mean RT difference between correct incongruent and neutral trials.

#### Keep Track [Adapted From Yntema (1963)]

Keep track captures the ability to maintain and update information in working memory. On each trial in the scanner version, participants were given 3 or 4 target categories (animals, colors, countries, distances, metals, or relatives) that remained on the screen throughout the trial. After viewing a serial list of 16 words drawn from 6 categories (one word every 2 s), they saw a "???" prompt on the screen for 10 s, during which they orally recalled the last exemplar of each target category. Because each list contained 1–3 exemplars of each category, they had to update which words to remember and ignore words from irrelevant categories. In addition to these "Remember" trials, the scanner version of the task included baseline conditions of "Read" trials, in which participants just silently read the words without trying to remember them, and 20 s rest (fixation) trials. Each trial type was preceded by a jittered instruction (REMEMBER, READ, or FIXATION for 2, 4, or 6 s). There were three runs, each with 3 recall trials (two with 4 words to recall and one with 3), 3 read trials, and 3 rest trials. The behavioral dependent measure was the proportion of the 33 words correctly recalled out of all remember trials.

#### Letter Memory [Adapted From Morris and Jones (1990)]

Letter memory captures the ability to maintain and update items in working memory. In each trial, participants saw a series of 9, 11, or 13 consonants, with each letter appearing for 3 s, and had to say aloud the last four letters, including the current letter. The dependent measure was the proportion of 132 sets correctly rehearsed (i.e., the last four letters reported in the correct order) across 12 trials.
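The letter memory scoring rule can be sketched in a few lines of Python. This is only an illustration, not the authors' scoring code: the function names and the response format (one reported string per presented letter) are hypothetical, and the even split of the 12 trials across list lengths 9, 11, and 13 is an assumption that happens to reproduce the 132 sets stated above.

```python
def count_rehearsal_sets(list_lengths):
    """One rehearsal set is scored after every presented letter."""
    return sum(list_lengths)


def score_trial(letters, responses, span=4):
    """Count correctly rehearsed sets in a single trial.

    letters:   the presented consonants, in order (a string or list).
    responses: one reported string per presented letter (hypothetical format).
    A set is correct when the report equals the last `span` letters shown so
    far (fewer at the start of the list), in the correct order.
    """
    correct = 0
    for i, said in enumerate(responses):
        expected = "".join(letters[max(0, i + 1 - span): i + 1])
        if said == expected:
            correct += 1
    return correct


# Assumption: 12 trials split evenly over list lengths 9, 11, and 13.
LIST_LENGTHS = [9, 11, 13] * 4
TOTAL_SETS = count_rehearsal_sets(LIST_LENGTHS)  # 4 * (9 + 11 + 13)

# Hypothetical responses for a short five-letter list.
example = score_trial("BKTRM", ["B", "BK", "BKT", "BKTR", "KTRM"])
proportion = example / 5
```

Under these assumptions, the dependent measure is the total number of correct sets across trials divided by `TOTAL_SETS`.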

#### Number–Letter [Adapted From Rogers and Monsell (1995)]

Number–letter captures the ability to shift between mental sets. In each trial of the scanner version, participants saw a box sectioned into 4 quadrants. The borders of one quadrant were darkened (i.e., cued) for 350 ms, then a number–letter or letter–number pair (e.g., 4K) appeared inside until it was categorized. The participant had to categorize the number (top 2 quadrants) or letter (bottom 2 quadrants) as odd/even or consonant/vowel, respectively, using two buttons on a button box. The stimuli disappeared from the screen when categorized, and there was a 350 ms response-to-cue interval. The trials were arranged in blocks, and rest blocks (20 s) were intermixed with the task blocks. Each block was preceded by a jittered instruction (TOP, BOTTOM, MIXED, or FIXATION for 2, 4, or 6 s) that indicated where the stimuli would appear for that block. In mixed blocks, half the trials were repeat trials in which the task stayed the same as the previous trial; the other trials required a switch in categorization task. Each block consisted of 13 trials, with the first trial not counted because it was neither switch nor repeat. There were 2 runs, each containing 8 mixed blocks, 8 single-task blocks (4 each number and letter blocks), and rest blocks. The behavioral dependent measure was the local switch cost, defined as the difference between average response times on correct switch and no-switch trials within mixed blocks (96 trials of each type).
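As a rough illustration of the local switch cost, the sketch below first reproduces the trial count given above and then computes the difference in mean correct RT between switch and repeat trials. The trial record format (dictionaries with `type`, `rt`, and `correct` keys) and the function name are hypothetical, and the sketch omits the RT trimming described under Data Analysis.

```python
from statistics import mean

# Trial bookkeeping taken from the task description.
RUNS = 2
MIXED_BLOCKS_PER_RUN = 8
TRIALS_PER_BLOCK = 13
SCORED_PER_BLOCK = TRIALS_PER_BLOCK - 1  # first trial is neither switch nor repeat
SCORED_MIXED_TRIALS = RUNS * MIXED_BLOCKS_PER_RUN * SCORED_PER_BLOCK
TRIALS_PER_TYPE = SCORED_MIXED_TRIALS // 2  # half switch, half repeat


def local_switch_cost(trials):
    """Mean correct switch RT minus mean correct repeat RT (mixed blocks only).

    trials: list of dicts with hypothetical keys
            'type' ('switch' or 'repeat'), 'rt' (ms), 'correct' (bool).
    """
    switch = [t["rt"] for t in trials if t["correct"] and t["type"] == "switch"]
    repeat = [t["rt"] for t in trials if t["correct"] and t["type"] == "repeat"]
    return mean(switch) - mean(repeat)


# Made-up example data: the incorrect switch trial is ignored.
example_trials = [
    {"type": "switch", "rt": 900, "correct": True},
    {"type": "switch", "rt": 1000, "correct": True},
    {"type": "switch", "rt": 5000, "correct": False},
    {"type": "repeat", "rt": 700, "correct": True},
    {"type": "repeat", "rt": 800, "correct": True},
]
cost = local_switch_cost(example_trials)
```

A larger cost indicates slower switching relative to repeating the same task.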

#### Category-Switch [Adapted From Mayr and Kliegl (2000)]

Category-switch captures the ability to shift between mental sets. In each trial, participants categorized a word according to animacy (i.e., living vs. non-living) or size (i.e., smaller or larger than a soccer ball), depending on a cue (heart or crossed arrows, respectively) that preceded the word by 350 ms and remained above the word until the participant responded with one of two buttons on a button box. The stimuli disappeared from the screen when categorized, and there was a 350 ms response-to-cue interval. A 200-ms buzz sounded for errors. The task began with two single-task blocks of 32 trials each, in which participants categorized words only by animacy then only by size. Then participants completed two mixed blocks of 64 trials each, in which half the trials required switching the categorization criterion. The dependent measure was the local switch cost, defined as the difference between average response times on correct switch and no-switch trials within mixed blocks (64 trials of each type).

## T1 Structural Scan and DTI Procedure

All structural MRI data were acquired using a Siemens 3-Tesla MAGNETOM Trio MRI scanner at the University of Colorado Boulder. A 32-channel head coil was used for radiofrequency transmission and reception. Data pertaining to gray matter structure were acquired via a T1-weighted Magnetization Prepared Gradient Echo sequence in 224 sagittal slices, with a repetition time (TR) = 2400 ms, echo time (TE) = 2.01 ms, flip angle = 8°, field of view (FoV) = 256 mm, and voxel size of 0.8 mm<sup>3</sup>. Diffusion-weighted data presented in this paper were acquired via a set of three scans, all with a multi-band acceleration factor of 3, capturing a total of 172 gradient directions. These scans each consisted of 72 slices, had a TR = 4000 ms, TE = 112 ms, flip angle = 84°, FoV = 224 mm, b = 3000 s/mm<sup>2</sup>, and voxel size of 2 mm<sup>3</sup>, with the first and third scans captured with a phase encoding direction of left to right, and the second with a phase encoding direction of right to left.

## Data Analysis

For a table describing the analysis types, steps, associated tables, and an example, see Supplementary Table S1.

#### EF Data

Scores on the six EF tasks were subjected to the same trimming and transformation used in prior studies to improve normality and reliability (Friedman et al., 2016). Specifically, RT tasks underwent within-subject trimming (Wilcox and Keselman, 2003). Though the exact number of trials that were trimmed differed between participants, on average, under 7% of trials on the Stroop and under 10% of trials on the category-switch task were trimmed. Additionally, within the number–letter and category-switch tasks, RTs following error trials were excluded, as determining switch versus repeat trials is dependent on the preceding trial. Following within-subject RT trimming, extreme high and low scores at the between-subjects level (more than 3 SDs from the group mean) were replaced with the cutoff value of 3 SDs above or below the mean, respectively, to improve normality and reduce the impact of extreme scores while maintaining these scores in the distribution. Fewer than 3% of EF scores were adjusted by this transformation for any given task. We have used this same criterion of 3 SDs in prior waves of data collection with this twin sample (Friedman et al., 2016); we selected this conservative criterion because, with a sample of this size, some cases beyond 3 SDs should be expected, and such cases have less impact on both the standard deviation of the distribution and on correlations than they would in a smaller sample.
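The between-subjects replacement step can be illustrated with a minimal Python sketch. It is not the authors' code: the function name is hypothetical, and the sketch assumes that the mean and the sample SD (n − 1 denominator) are computed once on the untransformed scores before any values are replaced.

```python
from statistics import mean, stdev


def clip_to_sd(scores, n_sd=3.0):
    """Replace scores beyond +/- n_sd SDs of the group mean with the cutoff.

    Assumption: mean and sample SD are taken from the original scores,
    so extreme cases stay in the distribution at the cutoff value.
    """
    m, sd = mean(scores), stdev(scores)
    low, high = m - n_sd * sd, m + n_sd * sd
    return [min(max(x, low), high) for x in scores]


# Made-up scores: 19 typical values and one extreme value.
raw = [0.0] * 19 + [100.0]
adjusted = clip_to_sd(raw)
```

In this toy example only the extreme score is moved, to exactly 3 SDs above the mean; all other scores are unchanged.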

Factor scores were extracted via a confirmatory factor analysis in Mplus (Muthén and Muthén, 1998), with all six EF tasks loading on common EF, the keep track and letter memory tasks loading on the orthogonal updating-specific factor, and the number–letter and category-switch tasks loading on the orthogonal shifting-specific factor (see **Figure 1**). The loadings were equated (after scaling the measures to have similar variances) within the updating-specific and shifting-specific factors to identify these two-indicator factors. These EF factor scores were then used as dependent measures for analyses of interest, including surface-based morphometry and tract-based spatial statistics of diffusion data.
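To make the loading pattern concrete, the following Python sketch encodes which tasks load on which factor and shows how, with orthogonal unit-variance factors, the model-implied covariance between two different tasks is the sum of the products of their loadings on shared factors. The loading values are hypothetical placeholders chosen for illustration (they are not estimates from this study), and the sketch does not reproduce the Mplus estimation itself.

```python
# Loading pattern described in the text (which factor each task loads on).
LOADS_ON = {
    "antisaccade":     ("common",),
    "stroop":          ("common",),
    "keep_track":      ("common", "updating_specific"),
    "letter_memory":   ("common", "updating_specific"),
    "number_letter":   ("common", "shifting_specific"),
    "category_switch": ("common", "shifting_specific"),
}

# Hypothetical loading values, for illustration only. The two loadings on
# each specific factor are set equal, mirroring the equality constraint.
EXAMPLE_LOADINGS = {
    "antisaccade":     {"common": 0.5},
    "stroop":          {"common": 0.4},
    "keep_track":      {"common": 0.4, "updating_specific": 0.5},
    "letter_memory":   {"common": 0.3, "updating_specific": 0.5},
    "number_letter":   {"common": 0.4, "shifting_specific": 0.6},
    "category_switch": {"common": 0.3, "shifting_specific": 0.6},
}


def implied_covariance(task_a, task_b, loadings):
    """Model-implied covariance of two different tasks.

    Assumes orthogonal factors with unit variance, so only factors that
    both tasks load on contribute.
    """
    shared = set(loadings[task_a]) & set(loadings[task_b])
    return sum(loadings[task_a][f] * loadings[task_b][f] for f in shared)


within_updating = implied_covariance("keep_track", "letter_memory", EXAMPLE_LOADINGS)
across_domains = implied_covariance("keep_track", "number_letter", EXAMPLE_LOADINGS)
```

Two updating tasks share variance through both common EF and updating-specific, whereas an updating task and a shifting task share variance through common EF only.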

#### Surface-Based Morphometry

Surface-based morphometry (SBM) was carried out using the Freesurfer analysis suite<sup>1</sup>. We chose SBM over voxel-based techniques because SBM allows for the examination of surface area and cortical thickness in addition to volume, whereas voxel-based methods only allow for the investigation of cortical volume. Surface area and thickness are under distinct genetic control (Winkler et al., 2010), suggesting they may capture different mechanisms of neural organization that are lost when looking at volume alone. Additionally, voxel-based techniques are particularly susceptible to confounding by partial volume effects, to which SBM is more robust. T1-weighted structural images were brain extracted using a hybrid watershed/surface deformation procedure (Segonne et al., 2004), followed by a transformation into Talairach space, intensity normalization (Sled et al., 1998), tessellation of the gray/white matter boundary (Fischl et al., 2001), and surface deformation along intensity gradients to optimally differentiate gray matter, white matter, and cerebrospinal fluid boundaries (Dale et al., 1999; Fischl and Dale, 2000). The resulting segmented surfaces were registered to a standard spherical inflated brain template (Fischl et al., 1999a,b), parcellated according to gyral and sulcal structure (Fischl et al., 2004; Desikan et al., 2006), and then used to compute a range of surface-based measurements, including cortical volume, surface area, thickness, and local gyrification. Whereas cortical volume captures the total amount of gray matter within a region, and can be decomposed into two constituent parts, namely thickness and surface area, local gyrification index captures the degree to which surface area is contained within the sulci of the brain.

<sup>1</sup>https://surfer.nmr.mgh.harvard.edu/

number–letter and category-switch tasks. All parameters were statistically significant (p < 0.05). EF, executive function.

#### Confirmatory SBM Analyses

In an attempt to replicate results found in Smolker et al. (2015), we carried out ROI analyses in which we extracted mean gray matter morphometry values for each subject from the ROIs identified in Smolker et al. (2015). We then fit multiple regression models to test for significant associations between neuroanatomy of these ROIs and EFs (see Multiple Regression below for details of these analyses).

#### Exploratory SBM Analyses

To investigate the degree to which regional variability in multiple measures of gray matter morphometry was associated with EF factor scores, we performed gray matter morphometry analyses of volume, cortical thickness, surface area, and local gyrification index via general linear models, which tested for vertex-wise associations between the aforementioned SBM measures and the EFs across the entire cortex. SBM analyses involving volume and surface area treated total intracranial volume (ICV) as a nuisance covariate, in line with recommendations from previous work (Buckner et al., 2004). Smoothing was set to a full-width-at-half-maximum (FWHM) of 10 mm, and all results that passed p < 0.05 were then corrected for multiple comparisons via Monte Carlo simulations (Hagler et al., 2006). These simulations generated data-driven cluster size limits for determining cluster extent significance. All reported clusters passed Monte Carlo simulations at p < 0.05.

#### Diffusion Tensor Imaging

Diffusion-weighted tensor images were processed with FSL (Smith et al., 2004), using the FDT toolbox (Behrens et al., 2003b, 2007) and tract-based spatial statistics (TBSS; Smith et al., 2006). All images were first motion and distortion corrected. Within each subject, diffusion tensor models were fit for each voxel, creating images of four common measures of white matter diffusion (fractional anisotropy, mean diffusivity, radial diffusivity, and axial diffusivity) across the whole brain. Fractional anisotropy, a measure of the degree to which the motion of water molecules is constrained within neural axons, is thought to reflect the overall integrity of myelin in the brain and can be decomposed into constituent parts, including mean diffusivity, an average of the eigenvalues associated with the three primary diffusion directions; axial diffusivity, the degree to which water molecules diffuse along the direction of the primary eigenvalue; and radial diffusivity, the average of the two non-primary eigenvalues (for a review of DTI measures, see Alexander et al., 2007). For the four diffusion measures separately, all resulting subject-specific diffusion images were then non-linearly aligned to a 1 mm<sup>3</sup> white matter template in standard space. The aligned images were then skeletonized and averaged, creating an average skeleton mask of prominent white matter tracts. For confirmatory analyses, we calculated the mean fractional anisotropy of the white matter tracts previously implicated in emerging adults (Smolker et al., 2015): the right superior longitudinal fasciculus (rSLF) and the bilateral inferior fronto-occipital fasciculus (iFOF). These tract-specific ROIs were defined based on the JHU white-matter tractography atlas, and values for mean FA of these white matter tracts were extracted for each subject individually.
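The four diffusion measures follow directly from the three eigenvalues of the fitted tensor. The Python sketch below uses the standard definitions (see Alexander et al., 2007); it is an illustration of the formulas rather than the FSL implementation, and the function name and example eigenvalues are hypothetical.

```python
from math import sqrt


def dti_scalars(l1, l2, l3):
    """Scalar diffusion measures from tensor eigenvalues, with l1 >= l2 >= l3.

    MD: mean of the three eigenvalues.
    AD: the primary (largest) eigenvalue.
    RD: mean of the two non-primary eigenvalues.
    FA: sqrt(3/2) * ||lambda - MD|| / ||lambda||, ranging from 0 to 1.
    """
    md = (l1 + l2 + l3) / 3
    ad = l1
    rd = (l2 + l3) / 2
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = sqrt(1.5 * num / den) if den > 0 else 0.0
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}


# Hypothetical eigenvalues (mm^2/s) for a strongly directional voxel.
voxel = dti_scalars(1.7e-3, 0.3e-3, 0.2e-3)
isotropic = dti_scalars(1.0e-3, 1.0e-3, 1.0e-3)
```

Fully isotropic diffusion gives FA = 0, while diffusion restricted to a single direction gives FA = 1.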

#### Confirmatory DTI Analyses

In an attempt to replicate results found in Smolker et al. (2015), we tested for associations between mean whole-tract fractional anisotropy of the rSLF and bilateral iFOF with common EF and shifting-specific, respectively. Because these white matter tracts may still be associated with EF, but on a more regional as opposed to whole-tract level, we carried out voxel-wise TBSS within masks of these two tracts.

#### Exploratory DTI Analyses

To investigate the degree to which multiple voxel-wise diffusion measures within major white matter tracts throughout the whole brain were associated with EF, we carried out TBSS within a skeletonized mask of prominent white matter tracts. Reported statistics for voxel-wise analyses were corrected for multiple comparisons using Threshold-Free Cluster Enhancement (TFCE) (Smith and Nichols, 2009), which provides a threshold-free method for determining significant clusters. All reported DTI clusters passed at a TFCE-corrected 1-p value of 0.95. For use in subsequent multiple regression analyses, we computed each subject's mean DTI values for all clusters individually. All reported clusters passed multiple regression testing at p < 0.05 after accounting for family structure and gender.
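To make the idea behind TFCE concrete, the sketch below applies it to a one-dimensional statistic profile: each point's score sums, over a range of thresholds h, the extent of the supra-threshold cluster containing that point raised to a power E, multiplied by h raised to a power H. This is a toy illustration only. The function name `tfce_1d`, the step size, and the exponents E = 0.5 and H = 2 are assumptions chosen for illustration; the text does not state the settings used, and the actual analyses ran in FSL on 3D skeletonized images with permutation-based inference.

```python
import numpy as np

def tfce_1d(stat, dh=0.1, E=0.5, H=2.0):
    """Toy TFCE for a 1D array of non-negative test statistics."""
    stat = np.asarray(stat, dtype=float)
    out = np.zeros_like(stat)
    if stat.size == 0 or stat.max() <= 0:
        return out
    # Step through thresholds from dh up to the maximum statistic.
    for h in np.arange(dh, stat.max() + dh / 2, dh):
        above = stat >= h
        # Locate contiguous supra-threshold runs (1D "clusters").
        padded = np.concatenate(([False], above, [False]))
        edges = np.flatnonzero(padded[1:] != padded[:-1])
        for start, end in zip(edges[::2], edges[1::2]):
            extent = end - start
            # Every point in the run accumulates extent^E * height^H * dh.
            out[start:end] += (extent ** E) * (h ** H) * dh
    return out

# A three-point cluster and an isolated point of the same height.
scores = tfce_1d([0.0, 2.0, 2.0, 2.0, 0.0, 2.0, 0.0])
print(scores)
```

In this toy profile, the points inside the wider cluster receive larger enhanced scores than the isolated point of equal height, which is how TFCE rewards spatial support without requiring a fixed cluster-forming threshold.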

#### Cluster Selection

For all clusters that passed correction for multiple comparisons, each subject's mean neuroanatomical value across a given cluster was extracted. Because the prevalence of twins in the current sample has the potential to inflate effect sizes by reducing sample variance, all clusters that passed correction for multiple comparisons were then corrected for family structure by regressing each participant's EF factor score on that participant's mean value for a given cluster. Family structure was coded as a unique identifying number for each family, with twins receiving the same value if they belonged to the same family. These family identification numbers were then used as the grouping variable for sandwich estimation, as implemented by MPlus's TYPE = COMPLEX option. This procedure was performed for all clusters individually. Prior to running regression models, EF factor scores and mean neuroanatomical estimates of the identified clusters were Winsorized between subjects, with any values above the 99th percentile or below the 1st percentile being moved to exactly the 99th and 1st percentiles, respectively. Because of the number of distinct analyses run, it was important to adjust the alpha level for determining significance. For each of the three EF factor scores, we carried out four whole brain SBM analyses (12 tests) and four whole brain DTI analyses (12 tests). While we report clusters in the current manuscript that reached a standard alpha threshold of 0.05, we note that the Bonferroni corrected alpha in the current study is p < 0.0021 (0.05/24). Of the 17 distinct neuroanatomical clusters identified in the current sample shown in **Table 2**, only one cluster did not pass Bonferroni level correction.
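The Winsorization and alpha adjustment described above can be sketched as follows. This is a minimal illustration, assuming percentiles are computed with NumPy's default interpolation (the exact percentile method used in the study is not stated); the function names and the simulated data are hypothetical.

```python
import numpy as np

def winsorize_percentile(values, lower=1.0, upper=99.0):
    """Move values beyond the given percentiles to exactly those percentiles."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [lower, upper])
    return np.clip(values, lo, hi)

def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha under Bonferroni correction."""
    return alpha / n_tests

# Three EF factor scores x (four SBM + four DTI whole-brain analyses) = 24 tests.
n_tests = 3 * (4 + 4)
print(round(bonferroni_alpha(0.05, n_tests), 4))  # 0.0021

# Simulated scores with one extreme value, for illustration only.
rng = np.random.default_rng(0)
scores = np.append(rng.normal(size=300), 12.0)
print(scores.max(), winsorize_percentile(scores).max())
```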

#### Multiple Regression

All multiple regression analyses were carried out using MPlus (Muthén and Muthén, 1998). We employed multiple regression with sandwich estimation to both account for family structure and interpolate effect sizes when multiple neuroanatomical predictors were used to predict a given EF factor score. To better understand the manner in which gray matter morphometry and white matter diffusion can be used in tandem to explain individual differences in EF, we included all neuroanatomical clusters found to be associated with a given EF in a single model predicting that EF (i.e., the full model). This procedure enabled two important insights. First, it allowed the examination of which neuroanatomical clusters remained significantly associated with EF after taking into account all other neuroanatomical clusters associated with that EF. Distinct measures both within and between gray matter and white matter modalities have been shown to explain overlapping variance in individual differences in behavior (Erus et al., 2014), suggesting a potential integrative relationship across aspects of both gray matter and white matter structure that coincides with network-oriented models of the brain and cognition. Second, the full model allowed us to determine the total variance in EF factor scores that can be explained by the identified neuroanatomical clusters.
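The logic of sandwich estimation with family as the grouping variable can be sketched with ordinary least squares and a cluster-robust covariance matrix. This is an illustrative simplification, not the estimator used in the study: the analyses here were run in MPlus, and the sketch below omits any small-sample correction. The function name `cluster_robust_ols` and the simulated twin data are hypothetical.

```python
import numpy as np

def cluster_robust_ols(X, y, groups):
    """OLS coefficients with cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # "Meat": sum over families of the outer product of within-family score sums.
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        idx = groups == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))

# Simulated twin pairs sharing a family-level effect (illustration only).
rng = np.random.default_rng(1)
n_families = 150
family_id = np.repeat(np.arange(n_families), 2)
cluster_mean = rng.normal(size=2 * n_families)
family_effect = np.repeat(rng.normal(size=n_families), 2)
ef_score = 0.3 * cluster_mean + family_effect + rng.normal(size=2 * n_families)

X = np.column_stack([np.ones_like(cluster_mean), cluster_mean])
beta, se = cluster_robust_ols(X, ef_score, family_id)
print(beta, se)
```

Because residuals are summed within each family before being squared, correlated errors between twins are reflected in the standard errors rather than being treated as independent observations.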

# RESULTS

# Behavioral Results

For descriptive statistics of the six behavioral tasks used in the confirmatory factor analysis to obtain the factor scores, see **Table 1**. This confirmatory factor analysis model (see **Figure 1**) fit reasonably well, χ<sup>2</sup>(7) = 24.08, p = 0.001, CFI = 0.935, RMSEA = 0.093. Although the fit indices fell slightly outside the cutoffs typically used to indicate good fit (i.e., CFI > 0.95 and RMSEA < 0.06; Hu and Bentler, 1995), we did not implement any model modifications so as to maintain consistency with prior versions of the model that have been shown to fit well (Friedman et al., 2016). Factor score determinacy estimates for the complete data pattern were 0.83, 0.60, and 0.75 for common EF, updating-specific, and shifting-specific, respectively.

#### TABLE 1 | Descriptive statistics of executive function tasks.


<sup>a</sup>Reliability is split-half (odd/even for Stroop and category-switch or run1/run2 for antisaccade and number–letter), adjusted with the Spearman–Brown prophecy formula. <sup>b</sup>Reliability is Cronbach's alpha across 3 runs for keep track and 4 sets of trials for letter memory. N, number of participants who had usable data on a given task.
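For readers unfamiliar with the Spearman–Brown adjustment mentioned in the table note, the sketch below shows the standard prophecy formula, which projects the correlation between two half-tests to the reliability of the full-length test. The function name and the example correlation are hypothetical and are not values from Table 1.

```python
def spearman_brown(r_half, k=2.0):
    """Predicted reliability when test length is multiplied by k (k=2 for split-half)."""
    return k * r_half / (1.0 + (k - 1.0) * r_half)

# A made-up split-half correlation of 0.80 projects to a full-test reliability of:
print(round(spearman_brown(0.80), 3))  # 0.889
```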



Clusters that passed correction for multiple comparisons and family structure. Reported statistics are from regressing the associated executive function dimension on the mean neuroanatomical value of a single cluster (controlling for nuisance covariates). "X", "Y", and "Z" represent MNI coordinates of a given cluster's peak. ! indicates that the cluster's p-value did not pass Bonferroni level correction of p < 0.0021. cEF, common executive function; UPD, updating-specific; SHI, shifting-specific; rMFG/FP, right middle frontal gyrus/frontal pole; rITG, right inferior temporal gyrus; rSFG, right superior frontal gyrus; rM/STG, right middle/superior temporal gyrus; lCUN/PC, left cuneus/precuneus cortex; lATR, left anterior thalamic radiation; lSLF, left superior longitudinal fasciculus; rSLF, right superior longitudinal fasciculus; rOR, right optic radiation; lCC, left corpus callosum; vent, ventral; dors, dorsal; post, posterior; vol, volume; thick, thickness; FA, fractional anisotropy; AD, axial diffusivity; MD, mean diffusivity; RD, radial diffusivity.

Demonstrating the reliability and stability of these factor scores, common EF factor scores from this age 29 assessment correlated 0.79 and 0.68 with common EF factor scores from the 9-task batteries completed by this sample at ages 23 and 17, respectively; updating-specific factor scores from this wave correlated 0.61 and 0.44 with updating-specific scores (based on 3 tasks each) at ages 23 and 17, respectively; and shifting-specific factor scores from this wave correlated 0.62 and 0.60 with shifting-specific scores (based on 3 tasks each) at ages 23 and 17, respectively (all ps < 0.001).

#### Surface-Based Morphometry

For a full list of SBM results that passed correction for multiple comparisons and family structure, see **Table 2**. In the confirmatory ROI analyses, none of the gray matter features identified in Smolker et al. (2015) were significantly associated with EFs in the current sample. In exploratory analyses, better common EF was associated with increased volume and surface area of clusters spanning right frontal pole (FP)/right middle frontal gyrus (MFG) (volume: x = 24, y = 43, z = 23, β = 0.294, SE = 0.062, p < 0.001; area: x = 21, y = 60, z = 6, β = 0.278, SE = 0.069, p < 0.001) (**Figure 2**), as well as increased surface area of a cluster in the right inferior temporal gyrus (rITG; x = 47, y = −21, z = −27; β = 0.276, SE = 0.085, p = 0.001) (**Figure 2**). Better updating-specific was associated with increased thickness of a region of left cuneus/precuneus (lCun/PC; x = −21, y = −61, z = 9, β = 0.249, SE = 0.058, p < 0.001) (**Figure 3**), and decreased thickness of clusters in the medial portion of right superior frontal gyrus (rSFG; x = 8, y = 19, z = 47, β = −0.240, SE = 0.073, p = 0.001) and right anterior superior/middle temporal gyrus (rS/MTG; x = 48, y = 5, z = −27; β = −0.282, SE = 0.080, p < 0.001) (**Figure 3**).

FIGURE 2 | Regional gray matter clusters associated with common executive function. Cortical clusters that passed correction for multiple comparisons. All clusters remained significant after taking into account family structure. Scatterplots show simple correlation between gray matter measure total of a given ROI and common EF factor score. <sup>∗</sup> Indicates clusters that remained significantly associated with common EF in the full model, in which all neuroanatomical clusters associated with cEF were included in a single model. Red, greater common EF associated with greater surface area; yellow, greater common EF associated with greater volume; orange, overlap between red and yellow clusters; EF, executive function; GM, gray matter; R, right; A, anterior; P, posterior; D, dorsal; V, ventral.

# Diffusion Tensor Imaging

For a full list of DTI results that passed correction for multiple comparisons and family structure, see **Table 2**. Confirmatory analyses of relationships from Smolker et al. (2015) revealed that, in the current sample, better common EF was marginally associated with increased average fractional anisotropy across the entire rSLF, while shifting-specific was not significantly associated with fractional anisotropy of the bilateral iFOF. We then tested for voxel-wise associations between fractional anisotropy and EF within the white matter tracts implicated in Smolker et al. (2015), finding a significant positive association between common EF and fractional anisotropy of a cluster in the anterior portion of the right SLF, specifically SLF-II (x = 39, y = −9, z = 29; β = 0.229, SE = 0.071, p = 0.001), such that greater fractional anisotropy was associated with better common EF (**Figure 4**).

FIGURE 3 | Regional gray matter clusters associated with updating-specific factor score. Cortical clusters that passed correction for multiple comparisons. All clusters remained significant after taking into account family structure. <sup>∗</sup> Indicates clusters that remained significantly associated with updating-specific in the full model, in which all neuroanatomical clusters associated with updating-specific were included in a single model. Hot colors (i.e., pink) indicate greater morphometry associated with greater updating-specific factor scores. Cold colors (e.g., teal) indicate less morphometry associated with greater updating-specific ability. Scatterplots show simple correlation between morphometry total of ROI and updating-specific factor score. Pink, greater updating associated with greater thickness; teal, greater updating associated with less surface area; GM, gray matter; R, right; L, left; A, anterior; P, posterior.

In exploratory analyses, we found a significant positive association between fractional anisotropy in a cluster of the left anterior thalamic radiation (lATR; x = −24, y = 15, z = 16; β = 0.277, SE = 0.058, p < 0.001) and common EF (**Figure 4**), with better common EF associated with greater fractional anisotropy. No significant DTI results were found for updating-specific, though we found a number of DTI clusters that were significantly associated with shifting-specific. Clusters in which DTI properties were associated with shifting-specific included axial diffusivity of a cluster in the right optic radiation (rOR; x = 48, y = −33, z = −11; β = −0.265, SE = 0.058, p < 0.001) (**Figure 5**), three clusters within the left SLF (**Figure 5**), including a more ventral cluster (lSLF-vent; x = −44, y = −13, z = 26; β = −0.257, SE = 0.068, p < 0.001), a more dorsal cluster (lSLF-dors; x = −36, y = −10, z = 32; β = −0.209, SE = 0.067, p = 0.002), and a more posterior cluster (lSLF-post; x = −33, y = −34, z = 36; β = −0.257, SE = 0.085, p = 0.002), and a cluster spanning almost the entirety of the left corpus callosum (lCC; x = −20, y = −44, z = 31; β = −0.260, SE = 0.073, p < 0.001) (**Figure 5**). Shifting-specific was also associated with radial diffusivity in two clusters in the rOR (**Figure 6**) – a more anterior cluster (rOR-ant; x = 39, y = −48, z = −15; β = −0.189, SE = 0.071, p = 0.008) and a more posterior cluster (rOR-post; x = 12, y = −77, z = 22; β = −0.209, SE = 0.059, p < 0.001) – as well as radial diffusivity of a cluster spanning much of the brain (whole brain; x = −34, y = −36, z = 24; β = −0.255, SE = 0.062, p < 0.001) (**Figure 6**) and mean diffusivity of a cluster spanning much of the brain (whole brain; x = −37, y = −25, z = 31; β = −0.248, SE = 0.067, p < 0.001) (**Figure 7**).

FIGURE 4 | Regional fractional anisotropy clusters associated with common executive function factor scores. Significant results from Tract-Based Spatial Statistics (TBSS) within a mask of the right superior longitudinal fasciculus as well as within a skeletonized mask of all major white matter tracts across the whole brain. A significant positive association was found between common EF and fractional anisotropy of a cluster in the right superior longitudinal fasciculus (shown in light red), specifically SLF-II. When conducting TBSS within a skeletonized mask of all major white matter tracts across the whole brain, a significant positive association was found between common EF and fractional anisotropy of a cluster in the left anterior thalamic radiation. Scatterplots show simple correlation between fractional anisotropy total of ROI and common EF factor score. <sup>∗</sup> Indicates clusters that remained significantly associated with common EF in the full model, in which all neuroanatomical clusters associated with common EF were included in a single model. R, right; A, anterior; P, posterior; X, Y, and Z are MNI coordinates of peak of cluster.

FIGURE 5 | Regional axial diffusivity clusters associated with shifting-specific factor scores. Significant results from TBSS within a skeletonized mask of all major white matter tracts across the whole brain. Significant negative associations were found between shifting-specific factor scores and five clusters of axial diffusivity, after accounting for family structure and gender. These clusters included the left corpus callosum (shown in green) and three clusters within the left superior longitudinal fasciculus (lSLF), including a more dorsal cluster (lSLF-dorsal; shown in red), a more ventral cluster (lSLF-ventral; shown in dark purple), and a more posterior cluster (lSLF-posterior; shown in light purple). Scatterplots show simple correlation between mean axial diffusivity of a given ROI and shifting-specific factor score. <sup>∗</sup> Indicates clusters that remained significantly associated with shifting-specific in the full model, in which all neuroanatomical clusters associated with shifting-specific were included in a single model. R, right; L, left; A, anterior; P, posterior; X, Y, and Z are MNI coordinates of peak of cluster.

# Cross- and Within-Modality Multiple Regression

For a full list of clusters that remained significantly associated with a given EF when included in a model with other clusters associated with that EF, see **Table 3**. The only EF dimension for which we observed both significant gray matter and DTI predictors was common EF, resulting in a model including volume of the rFP/MFG cluster (β = 0.222, SE = 0.112, p = 0.048), area of the rITG cluster (β = 0.292, SE = 0.076, p < 0.001), fractional anisotropy of the rSLF cluster (β = 0.164, SE = 0.068, p = 0.015), and fractional anisotropy of the lATR cluster (β = 0.225, SE = 0.058, p < 0.001) as predictors of common EF, all of which remained significantly positively associated with common EF (full model R<sup>2</sup> = 0.225, SE = 0.046, p < 0.001) (**Table 3**). When all significant gray matter clusters associated with updating-specific were included in a single model controlling for family structure, total ICV, and gender, the lCun/PC cluster (β = 0.214, SE = 0.059, p < 0.001), rSFG cluster (β = −0.152, SE = 0.070, p = 0.030), and rS/MTG cluster (β = −0.180, SE = 0.077, p = 0.020) all remained significantly associated with updating-specific (full model R<sup>2</sup> = 0.179, SE = 0.051, p < 0.001) (**Table 3**). When all significant DTI clusters associated with shifting-specific were included in a single model controlling for family structure and gender (**Table 3**), the axial diffusivity cluster in rOR (β = −0.200, SE = 0.054, p < 0.001), the axial diffusivity cluster lSLF-vent (β = −0.173, SE = 0.057, p = 0.002), the axial diffusivity cluster in lCC (β = −0.357, SE = 0.110, p = 0.001), the whole brain mean diffusivity cluster (β = 1.201, SE = 0.242, p < 0.001), and the whole brain radial diffusivity cluster (β = −0.933, SE = 0.205, p < 0.001) remained significantly associated with shifting-specific (full model R<sup>2</sup> = 0.237, SE = 0.051, p < 0.001).

white matter tracts across the whole brain. Significant negative associations were found between shifting-specific factor scores and three radial diffusivity clusters, including a cluster spanning nearly all major white matter tracts in the whole brain (top), and two clusters within the right optic radiation, a more anterior cluster and a more posterior cluster, after accounting for family structure and gender. Of these three clusters, only the whole brain cluster remained significant in the full model, which regressed shifting-specific factor scores on all associated neuroanatomical clusters. Scatterplots show simple correlation between mean radial diffusivity of a given ROI and shifting-specific factor score. <sup>∗</sup> Indicates clusters that remained significantly associated with shifting-specific in the full model, in which all neuroanatomical clusters associated with shifting-specific were included in a single model. R, right; A, anterior; P, posterior; D, dorsal; V, ventral.

FIGURE 7 | Regional mean diffusivity cluster associated with shifting-specific factor scores. Significant results from TBSS within skeletonized mask of all major white matter tracts across the whole brain. Significant positive association was found between shifting-specific factor scores and a cluster of mean diffusivity spanning nearly all major white matter tracts in the whole brain, after accounting for family structure and gender. Scatterplot shows simple correlation between average mean diffusivity across a given ROI and shifting-specific factor score. <sup>∗</sup> Indicates clusters that remained significantly associated with shifting-specific in the full model, in which all neuroanatomical clusters associated with shifting-specific were included in a single model. A, anterior; P, posterior; D, dorsal; V, ventral; Z, MNI Z coordinate.

# DISCUSSION

The current study tested for associations between neuroanatomical measures and the three distinct EF constructs of the unity/diversity model of EF in a non-clinical sample closely clustered around 29 years of age. We observed relationships between common EF and multiple gray matter and fractional anisotropy characteristics. Updating-specific was associated with gray matter properties only, while shifting-specific was associated with a range of properties of white matter, including regional variability in mean, radial, and axial diffusivity. It is important to note that, while the effect sizes of neuroanatomy-EF relationships observed in the current study may be considered weak, it is unlikely that large portions of variance in complex cognitive behaviors in healthy individuals will be explained by neuroanatomy alone. Instead, neuroanatomy represents one piece of what are likely highly complex, multimodal brain systems supporting complex behaviors. We center our discussion around two questions: (1) whether the areas of gray matter and white matter that show associations with EF are within or outside the FPN; and (2) whether the pattern of results observed is consistent with what we have observed previously in a sample of emerging adults.

Significant results from "final model," in which executive function factor scores were regressed on all gray matter morphometry and diffusion tensor imaging (DTI) clusters that remained significant after accounting for family structure, simultaneously, in a single model. The only executive function factor score for which both gray matter morphometry and DTI clusters were found was common executive function (cEF). For updating-specific (UPD) and shifting-specific (SHI), displayed results are from models including all associated gray matter (updating-specific) and DTI (shifting-specific) clusters, respectively. rFP/MFG, right frontal pole/middle frontal gyrus; rITG, right inferior temporal gyrus; rSFG, right superior frontal gyrus; rM/STG, right middle/superior temporal gyrus; lCUN/PC, left cuneus/precuneus cortex; lATR, left anterior thalamic radiation; rSLF, right superior longitudinal fasciculus; rOR, right optic radiation; lSLF-vent, left superior longitudinal fasciculus – ventral; lCC, left corpus callosum; vol, volume; thick, thickness; FA, fractional anisotropy.

# Neuroanatomical Correlates of EF: Within or Outside the FPN?

We observed that the gray matter morphometry of regions associated with common EF and updating-specific did not fall squarely within the FPN, but instead fell in brain regions commonly associated with the default mode network (DMN), as well as regions supporting non-EF processes vital to task performance. The majority of the observed associations were in regions outside of the FPN, with only two associations occurring with clusters that spanned the FPN. Moreover, we observed that decreased gray matter in regions commonly associated with the DMN was associated with better updating-specific. The DMN has been shown to have inverse associations with the efficacy of FPN engagement (Fox et al., 2005; Elton and Gao, 2015), as well as with regions supporting non-EF processes vital to task performance. In terms of white matter, results suggested that individuals with higher EF are characterized by the properties of white matter tracts connecting a range of brain regions, including prefrontal to more posterior brain regions, as inferred from DTI measures.

One of the two major results regarding gray matter properties and common EF was an association between increased volume and surface area of the rFP/MFG and higher common EF. Comparing the spatial location of these clusters to a popular seven-network parcellation of brain networks (Yeo et al., 2011), the more lateral aspects of the rFP/MFG clusters lie in cortex associated with the FPN, whereas the more medial portions of these clusters lie in cortex associated with the DMN. Hence, the region so identified does not fall squarely within the mid-dlPFC region that has been suggested to be at the top of a neuroanatomical hierarchy for EF (Nee and D'Esposito, 2016, 2017), but rather is located a bit more dorsal and anterior. The FP has been implicated by our group and others in high-level goal representations (Gilbert et al., 2006; Burgess et al., 2007, 2008; Tsujimoto et al., 2011; Orr and Banich, 2014; Orr et al., 2015). For example, Orr and Banich (2014) found greater FP activation when task goals must be voluntarily selected by an individual as compared to when they are given explicit instructions regarding the task goal. This finding is consistent with models of FP function as biasing behavior in accordance with internal goals in the absence of external goal cues (Burgess et al., 2007). Evidence from DTI and functional co-activation suggests that dorsal portions of the FP, with which the common EF cluster in the current study is contiguous, have short-range projections to other PFC regions. Such projections may allow for the updating of goal-related information in more mid-dorsolateral regions, which in turn can modulate activity of posterior regions in accordance with task goals (Orr et al., 2015). Thus, it may be that the structural characteristics of the FP associated with higher common EF influence the processing of higher-level goal representations.
In line with this idea, common EF has been theorized to capture the maintenance of goal information that is used to bias lower-level processing in pursuit of these goals (Friedman and Miyake, 2017).

The second major gray matter result for common EF was that higher cEF was associated with greater surface area along the ventral surface of anterior right ITG (Ishai et al., 1999; Visser et al., 2010, 2012; Peelen and Caramazza, 2012), which is not part of what is commonly considered the FPN. Rather, anterior ITG has been linked to conceptual information regarding a visual object, including semantic information, location, and associated action (Visser et al., 2010, 2012; Peelen and Caramazza, 2012). We may have found this association because the majority of EF tasks that load on the cEF factor require the interpretation of visual cues. For example, during the category-switch task, participants are presented with two cues that have distinct semantic judgments associated with them and must rapidly identify the visual cue and access the appropriate semantic category, a function which has been ascribed to anterior ITG (Visser et al., 2010, 2012). In the antisaccade and number–letter tasks, participants must identify the location of visual cues and use this location information to inform subsequent actions, once again a function ascribed to anterior ITG (Peelen and Caramazza, 2012). The involvement of access to higher-order conceptual information pertaining to visual objects, likely grounded in anterior ITG function, may be a prerequisite for good performance on nearly all visually based EF tasks. Given that common EF captures mechanisms involved across all EF tasks, it is not surprising that the brain regions supporting the conceptual information of visual objects show associations with common EF. It should be noted that such a finding does not necessarily suggest that this association is an "artifact" of using visual tasks to assess EF. Rather, it may suggest that individuals with higher common EF abilities may be better able to process information regarding lower-level processing in the context of current task goals.

With regards to white matter, higher common EF was associated with increased fractional anisotropy of white matter in clusters of the rSLF and the lATR. The SLF is often considered to be a key anatomical pathway connecting frontal and parietal regions of the FPN, and has been implicated in higher-level cognitive processes, including selective attention, working memory, and EF (Vestergaard et al., 2011; Smolker et al., 2015; Urger et al., 2015). Of the five primary subcomponents that make up the SLF (Kamali et al., 2014), the cluster associated with common EF lay in the SLF-II subcomponent, which has been shown to connect the angular gyrus to middle frontal and precentral gyri (Wang et al., 2016). Though the SLF-II likely plays a role in a wide range of functions, it has been suggested to be preferentially associated with the regulation of spatial attention, with some suggesting that it plays a critical role integrating the dorsal and ventral attention networks, mediating information flow related to goal-directed attention (originating from the dorsal attention network via SLF-I) and attention to salient events (originating from the ventral attention network via SLF-III) (De Schotten et al., 2011). In the context of common EF, this purported function of integrating goal-oriented attentional signals with automatic, salient spatial attention to objects is likely involved in most, if not all, EF tasks. That is, all of the EF task paradigms that went into the common EF factor score required participants to guide spatial attention in accordance with task goals, and the ability to successfully do this is likely contingent upon the properties of the neural systems supporting spatial attention, including the SLF and its subcomponents.
As such, the white matter findings are consistent with those regarding gray matter as both point to the possibility that individuals with higher common EF have associations with aspects of brain neuroanatomy that would be suggestive of expanded involvement of both top–down and bottom–up brain regions as well as their integration.

The lATR, the other white matter tract associated with common EF, connects the mediodorsal nucleus of the thalamus to the PFC (Behrens et al., 2003a; Jang and Yeo, 2014). Showing three distinct functional connectivity profiles, the mediodorsal nucleus has been shown to have dissociable connections with orbitofrontal cortex, ventrolateral PFC, and dlPFC, all of which pass through the ATR (Jang and Yeo, 2014). Indeed, in the current sample, post hoc analyses revealed a significant positive correlation between fractional anisotropy of the lATR cluster and volume of the right FP/MFG cluster (r = 0.238, p < 0.001), suggesting a potential neural circuit important to common EF ability, though each cluster appeared to predict unique portions of variance in common EF. Despite few if any studies implicating fractional anisotropy of the ATR in individual differences in EF amongst healthy young adults, fractional anisotropy of the ATR has been shown to be reduced in patient populations, with the degree of reduction associating with EF impairment (Mamah et al., 2010). Moreover, the mediodorsal nucleus of the thalamus has been proposed to play an important role in rapid learning of an associative nature as well as decision-making paradigms that involve multiple cognitive processes (Mitchell, 2015), exactly the type of processes tapped by the EF tasks in our behavioral battery. Hence, individuals with higher common EF have increased fractional anisotropy, often taken as an index of structural integrity (Alba-Ferrara and de Erausquin, 2013), of both a tract that connects cortical regions to prefrontal cortex (i.e., rSLF) as well as a tract that connects subcortical regions to prefrontal cortex (i.e., lATR).

With regards to updating-specific, four major associations with gray matter morphometry were observed. One of these was an association between better updating-specific and decreased surface area of the rSFG in a cluster spanning cortex in both the FPN and DMN. This rSFG cluster spans the dorsal portions of both the middle and anterior zones of the medial frontal cortex identified in a recent meta-analytic parcellation by de la Vega et al. (2016), although the majority of it falls within the anterior zone. The dorsal portion of the middle zone is associated with working memory and cognitive control (de la Vega et al., 2016) and shows high degrees of co-activation with key components of the FPN. While the posterior portion of the rSFG cluster falls within this middle zone attributed to the FPN, the majority of the cluster sits in a region of medial PFC commonly attributed to the DMN. This portion of the anterior zone has been strongly implicated in social processing, including social perception and self-referential thought (Mitchell et al., 2005; de la Vega et al., 2016). Though it is unclear how social perception and self-referential thought relate to updating-specific, or what the functional consequences of reductions in surface area are, one possibility is that better updating-specific is associated with reduced engagement of these inwardly directed modes of thought. In line with this interpretation, we also observed that better updating-specific is associated with reduced surface area of the anterior right M/STG, a region implicated in the DMN (Yeo et al., 2011), as well as in affective processing (Olson et al., 2007).

In contrast, updating-specific was associated with increased cortical thickness of a region that spanned from dorsal regions of the left cuneus/precuneus, commonly implicated in visual attention (Vanni et al., 2001), to more ventral regions reaching the posterior cingulate. Though not a classic EF region per se, the cuneus/precuneus is frequently implicated in EF tasks due to a reliance on rapid visual processing (Wager and Smith, 2003; Simões-Franklin et al., 2010). For proper updating to occur, the environment must be monitored for cues indicating an update is needed and, in the case of the EF tasks in this study, these cues can only be discerned through rapid visual processing, to which the cuneus is a key contributor. The posterior cingulate region is one of the core hubs of the DMN, and becomes active when individuals make self-relevant, affective decisions (Andrews-Hanna et al., 2010). Like the region in the rSFG, this region spanned areas typically considered to be both the FPN and the DMN. Whether this association is indicative of alterations in individuals with higher updating-specific ability in the interaction between these two systems, which commonly activate in an antagonistic manner (Fox et al., 2005), remains to be seen and will require examination of functional patterns of brain activation.

Though no significant associations between gray matter morphometry and shifting-specific were observed in the current sample, quite a number of associations were found between regional variability in the white matter diffusion measures and individual differences in shifting-specific factor scores. One potential interpretation of shifting-specific's exclusive associations with white matter properties may be that shifting is reliant on more transient neural processes, such as the ability to effectively reconfigure task sets and representations, and to quickly clear or replace no-longer relevant representations (Herd et al., 2014). Such reconfiguration may be dependent upon the efficiency of connectivity between multiple brain regions, with connectivity largely driven by diffusion properties of white matter (Skudlarski et al., 2008), not regional gray matter morphometry.

Specifically, we found shifting-specific ability to be associated with multiple white matter characteristics, including axial diffusivity of three clusters of the lSLF, axial diffusivity of the left corpus callosum, axial diffusivity of a portion of the rOR, mean diffusivity of a cluster spanning much of the brain, radial diffusivity of a similar cluster spanning much of the brain, as well as radial diffusivity of two clusters within the rOR. When viewed as a whole, these results suggest three general points regarding the diffusion correlates of shifting-specific. First, the whole-brain clusters identified in both mean diffusivity and radial diffusivity analyses suggest that shifting-specific is associated with diffusion properties across the entire brain. Though these results were not expected, they suggest that shifting-specific ability is at least partially dependent upon general white matter properties, not just those linked to specific portions of discrete tracts. Interestingly, despite the considerable spatial overlap between the regions identified as associated with mean diffusivity and radial diffusivity, respectively, and the fact that radial diffusivity is a mathematical component of mean diffusivity (Alexander et al., 2007), both clusters remained significantly associated with shifting-specific, even after taking into account the other cluster. This finding suggests that, while highly associated, mean diffusivity and radial diffusivity have distinct associations with behavior that are not captured by one measure alone, lending credence to methodologies that aim to investigate multiple measures of white matter diffusion in tandem.
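The statement that radial diffusivity is a mathematical component of mean diffusivity can be made concrete with the standard definitions of the diffusion tensor scalars. The following is a minimal sketch, not the pipeline used in this study: the function name and the example eigenvalues are hypothetical, and the formulas are the conventional ones for a tensor with eigenvalues λ1 ≥ λ2 ≥ λ3.

```python
import math

def diffusion_scalars(l1, l2, l3):
    """Conventional DTI scalars from tensor eigenvalues (l1 >= l2 >= l3)."""
    ad = l1                       # axial diffusivity: largest eigenvalue
    rd = (l2 + l3) / 2.0          # radial diffusivity: mean of the two smaller eigenvalues
    md = (l1 + l2 + l3) / 3.0     # mean diffusivity: mean of all three eigenvalues
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = math.sqrt(1.5 * num / den)  # fractional anisotropy, bounded by 0 and 1
    return {"AD": ad, "RD": rd, "MD": md, "FA": fa}

# Hypothetical eigenvalues (mm^2/s) for a single white matter voxel.
scalars = diffusion_scalars(1.7e-3, 0.4e-3, 0.3e-3)
```

Because MD = (AD + 2·RD) / 3, two thirds of mean diffusivity is determined by radial diffusivity, which is why overlapping MD and RD clusters would ordinarily be expected to carry largely redundant information.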

A second point from the DTI analyses of shifting-specific is that, whereas cEF has been shown to be associated with diffusion properties of the right SLF in emerging adults (Smolker et al., 2015) and in the current sample, shifting-specific ability appears to be related to regional axial diffusivity of clusters in the left SLF, specifically SLF-II. As discussed previously, the SLF, particularly SLF-II, allows for long-range connections between the prefrontal and parietal cortices, including regions implicated in the FPN, and has been implicated in the regulation of attention, along with other EF-associated behaviors (Vestergaard et al., 2011; Smolker et al., 2015; Urger et al., 2015). The observed left lateralization of this relationship between shifting-specific and axial diffusivity of the SLF may reflect the linguistic nature of the shifting tasks, as the lSLF (Maldonado et al., 2011; Urger et al., 2015) and the left hemisphere in general (Binder et al., 1995) have been heavily implicated in linguistic and semantic processing. We additionally found that axial diffusivity of a cluster spanning most of the left-hemisphere portions of the corpus callosum was negatively correlated with shifting-specific, such that better shifting-specific was associated with reduced axial diffusivity in these regions. Unlike the majority of white matter tracts, which generally run anteriorly to posteriorly, the corpus callosum is the main anatomical pathway connecting the two hemispheres (Roland et al., 2017), but it also has vertical projections that innervate the major lobes of the brain (Hofer and Frahm, 2006). The lSLF clusters found to be associated with shifting-specific lie directly adjacent to the more anterior section of the corpus callosum cluster, which connects to prefrontal regions. This finding raises the possibility that these clusters are capturing distinct portions of an integrated neural circuit important for determining individual differences in shifting-specific ability.
In fact, prior work has shown that lower switch costs in individuals are associated with greater coupling of right and left MFG activity and that such coupling is predicted by greater volume of anterior regions of the corpus callosum (Baniqued et al., 2018). Such findings are also consistent with the notion that engaging both hemispheres is particularly helpful to task performance under conditions of higher level demand (Banich, 1998), which well describes EF tasks.

The third notable result with shifting-specific was the considerable evidence implicating distinct portions of the rOR in shifting-specific ability. Specifically, shifting-specific was found to be associated with clusters of axial and radial diffusivity in two adjacent portions of the optic radiation, as well as a second radial diffusivity cluster where the optic radiation terminates in the medial occipital lobe. Though the axial diffusivity optic radiation cluster was the only cluster of these three to remain significant after taking into account all other DTI clusters associated with shifting-specific, the fact that we found associations between shifting-specific and multiple portions of the optic radiation, across multiple diffusion measures, provides converging evidence for this relationship. Likely serving a supportive role to EF mechanisms, the optic radiation, which spans from the lateral geniculate nucleus to the occipital cortex (Yamamoto et al., 2007) provides a pathway for visual information to travel from the retina to the primary visual cortex. It is not entirely surprising to find associations between behavioral measures grounded in visual tasks to be associated with properties of the optic radiation, properties which presumably may influence the rate and efficacy with which visual information enters the visual cortex. Why a relationship would be observed with shifting-specific and not the more general common EF factor remains unclear and a potential point of further inquiry.

Given the patterns observed, it is important to consider potential explanations for associations between individual differences in EF with neuroanatomical properties outside of the classic brain regions thought to support EF. First, as discussed above, the non-FPN regions associated with individual differences in EF might indicate that individuals with higher EF rely on a larger or more diverse set of brain regions than those with lower EF. In other words, individuals who have higher EF may employ additional brain systems while performing EF tasks that are not engaged by individuals who have lower EF, or vice versa. Second, individuals with higher EF may have distinct anatomical characteristics of regions that often are observed to work in opposition to the FPN. The results indicated that better updating-specific was associated with reduced area in two regions of the brain, the superior medial frontal cortex and anterior sections of the right superior/middle temporal gyrus, that are associated with the DMN. Research has shown that activity in the FPN and DMN is often anti-correlated (Fox et al., 2005). Of course, it is impossible to determine patterns of activation on the basis of neuroanatomy, so for now these ideas are mainly speculative and will need to be evaluated by investigations focused on individual differences in EF and brain activation.

# Comparing Current Results to Younger Individuals

Another lens through which to interpret the results of this study is in comparison to our previous examination (Smolker et al., 2015) in which we similarly investigated the relationship between neuroanatomy and these same aspects of EF (common EF, updating-specific, shifting-specific). That study differed from the present one in two ways. First, it was performed on a younger sample of college-aged individuals, who can be considered emerging adults. Second, we used factor scores based on a battery of three EF tasks (antisaccade, category-switch, and keep-track) rather than six, as in the present study. As in the current study, Smolker et al. (2015) found that common EF was associated with white matter tracts that connect to prefrontal and posterior regions, namely the rSLF. While the relationship with the entire rSLF was only marginal in the current sample, when we ran voxel-wise FA analyses within a mask of the rSLF, we found a significant positive association between a portion of SLF-II and common EF, suggesting that the rSLF is important to individual differences in common EF across both emerging and young adulthood. In addition, as in our prior study, most of the associations observed with EF in the current study occurred in brain regions outside of the FPN.

However, the specific gray matter regions implicated and their general directionality for the most part differed between the two studies. Whereas reduced volume in bilateral ventromedial PFC was associated with better cEF in emerging adults, the current study found increased rFP/MFG volume to be associated with better cEF performance in our young adult sample. In the emerging adult sample, better updating-specific was associated with reductions in gray matter volume in left dlPFC, while the current study found increased updating-specific associated with reductions in surface area of a medial cluster of the right SFG, reductions in surface area of a cluster in right anterior temporal lobe, and increases in thickness of a cluster spanning cuneus/precuneus. The current study did not find any significant associations between shifting-specific ability and regional gray matter morphometry, whereas in our prior study there was an association between better shifting-specific ability and reduced gray matter volume in left ventrolateral PFC (BA 10/47). Additionally, whereas Smolker et al. (2015) found associations between shifting-specific and mean fractional anisotropy of the inferior frontooccipital fasciculus, in the current sample shifting-specific was not associated with FA anywhere in the brain, and instead was associated with mean diffusivity, radial diffusivity, and axial diffusivity within a number of regional clusters.

Though no formal tests were carried out comparing the current sample with the younger sample in Smolker et al. (2015), we speculate that the discrepancies between these two studies may emerge from differences in the age of the participants. At a mean age near 29 years, the current study employed a sample that is almost a decade older on average than the sample used in Smolker et al. (2015). By age 30 or so, aspects of neurodevelopment, particularly of the PFC, have likely stabilized (Sowell et al., 2003), whereas neurodevelopment was likely still ongoing in the younger sample (Smolker et al., 2015). Supporting this conjecture, we found that reductions in volume were associated with EF in Smolker et al. (2015), suggesting that greater developmental pruning is associated with better EF. In that sample we also found that local gyrification index was a potent predictor of individual differences in EF, but observed no relationships with local gyrification index in the current study. Local gyrification index has been found to show reductions during the late teens/early 20s (Klein et al., 2014), likely driven by increases in underlying white matter characteristics (Ribeiro et al., 2013). This pattern also suggests that ongoing developmental processes may be influencing associations with EF in this younger sample. Such findings are consistent with prior studies indicating that neurodevelopment has profound effects on the brain regions utilized for specific cognitive functions (Rubia et al., 2000). Nonetheless, the current study coupled with Smolker et al. (2015) does not provide a clear trajectory of how the neuroanatomical characteristics associated with EF change during the 20s. Large-scale longitudinal studies will be needed to investigate the dynamic evolution of the neural systems associated with EF performance.

# Limitations and Future Directions

The current study is not without limitations. First, analyses were carried out in a univariate fashion, despite evidence that behaviorally relevant neuroanatomical properties segregate into multivariate components (Xu et al., 2009a,b; Brown et al., 2012). Second, the age range of our participants is rather restricted, which may reduce the variability in neuroanatomy between subjects. On the other hand, having such a large sample in this relatively narrow age range provided a clear picture of the associations between brain anatomy and EF during young adulthood. An additional limitation is that, without testing in a replication sample, it is unclear whether the current results reflect biologically real associations or chance variation that can influence such studies. Finally, despite having a sample of twins and more power than most neuroimaging studies, we are currently underpowered for twin models. Following the completion of data collection for the larger study of which this project is a part, we plan to investigate (1) the replicability of the current findings in a well-matched replication sample, and (2) the degree to which neuroanatomical correlates of individual differences in EF are driven by genetic or environmental factors.

#### CONCLUSION


Within a sample of developmentally mature young adults, common EF and updating-specific were associated with distinct properties of regional gray matter morphometry, and the location of these features fell both within and outside of the FPN. Additionally, common EF was associated with fractional anisotropy of clusters in the rSLF and lATR, while shifting-specific was associated with diffusion properties of multiple white matter tracts throughout the brain. These results suggest that individual differences in EF are associated with properties of neural systems of not only brain regions classically thought to support EF, but also brain systems associated with processes not traditionally conceptualized as supporting EF. These latter regions fall into one of two categories: those that are likely to support higher-order, amodal cognitive processes (e.g., goal maintenance, semantic processing) or those that allow for improved categorization of relevant perceptual information (e.g., visual processing and attentional control areas), both of which could aid performance during complex EF tasks. Coupled with the white matter findings, these results suggest that individuals with higher EF may have a more expanded, integrated and/or connected neural substrate associated with EF performance, a hypothesis that should be tested further by multimodal follow-up studies. The current findings show distinct patterns of neuroanatomy-EF associations from what we have observed in younger individuals (Smolker et al., 2015), suggesting that the significant development of cortical organization occurring well into the third decade of life influences the underlying neuroanatomical characteristics associated with EF.

#### AUTHOR CONTRIBUTIONS

JH and NF designed the study that collected the data. HS, MB, and NF conceived the current analysis plan. HS and NF carried out statistical analyses included in this manuscript. HS, NF, MB, and JH wrote, revised, and/or provided comments throughout the writing of the manuscript.

# FUNDING

This research was supported by NIH grant MH063207.

# ACKNOWLEDGMENTS

The authors would like to acknowledge Kathy Pearson, Dr. Tor Wager, the staff of the Intermountain Neuroimaging Consortium, and the staff of the Institute for Behavioral Genetics for their invaluable contributions to data collection, analyses, and interpretation.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00283/full#supplementary-material






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Smolker, Friedman, Hewitt and Banich. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Relationship Between Resting State Network Connectivity and Individual Differences in Executive Functions

#### Andrew E. Reineberg<sup>1,2</sup>\*, Daniel E. Gustavson<sup>3</sup>, Chelsie Benca<sup>4</sup>, Marie T. Banich<sup>1,5</sup> and Naomi P. Friedman<sup>1,2</sup>

<sup>1</sup> Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, CO, United States, <sup>2</sup> Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO, United States, <sup>3</sup> Department of Psychiatry, University of California, San Diego, San Diego, CA, United States, <sup>4</sup> Department of Psychology, Emory University, Atlanta, GA, United States, <sup>5</sup> Institute of Cognitive Science, University of Colorado Boulder, Boulder, CO, United States

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Dawei Li, Duke University, United States Micaela Mitolo, IRCCS Fondazione Ospedale San Camillo, Italy

\*Correspondence: Andrew E. Reineberg andrew.reineberg@colorado.edu

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 08 February 2018 Accepted: 13 August 2018 Published: 05 September 2018

#### Citation:

Reineberg AE, Gustavson DE, Benca C, Banich MT and Friedman NP (2018) The Relationship Between Resting State Network Connectivity and Individual Differences in Executive Functions. Front. Psychol. 9:1600. doi: 10.3389/fpsyg.2018.01600

The brain is organized into a number of large networks based on shared function, for example, high-level cognitive functions (frontoparietal network), attentional capabilities (dorsal and ventral attention networks), and internal mentation (default network). The correlations of these networks during resting-state fMRI scans vary across individuals and are an indicator of individual differences in ability. Prior work shows higher cognitive functioning (as measured by working memory and attention tasks) is associated with stronger negative correlations between frontoparietal/attention and default networks, suggesting that increased ability may depend upon the diverging activation of networks with contrasting function. However, these prior studies lack specificity with regard to the higher-level cognitive functions involved, particularly with regard to separable components of executive function (EF). Here we decompose EF into three factors from the unity/diversity model of EFs: Common EF, Shifting-specific EF, and Updating-specific EF, measuring each via factor scores derived from a battery of behavioral tasks completed by 250 adult participants (age 28) at the time of a resting-state scan. We found the hypothesized segregated pattern only for Shifting-specific EF. Specifically, after accounting for one's general EF ability (Common EF), individuals better able to fluidly switch between task sets have a stronger negative correlation between the ventral attention network and the default network. We also report novel, non-predicted findings: individuals with higher Shifting-specific abilities exhibited more positive connectivity between frontoparietal and visual networks, while those individuals with higher Common EF exhibited increased connectivity between sensory and default networks. Overall, these results reveal a new degree of specificity with regard to connectivity/EF relationships.

Keywords: executive function, functional connectivity, networks, resting-state, fMRI

# INTRODUCTION


Executive functions (EFs) are a set of higher-level cognitive abilities that contribute to the maintenance, implementation, and modification of goals (Banich, 2009; Friedman and Miyake, 2017). Classically, EFs have been linked to frontal lobe function based on both studies of individuals with localized lesions (Stuss and Alexander, 2000; Alvarez and Emory, 2006) and on task-based functional magnetic resonance imaging studies (fMRI; Wager and Smith, 2003; Wager et al., 2004). Recent work has begun to examine other potential neural correlates of EFs, in particular, connectivity between large scale brain systems. Brain systems can be studied in many contexts. For example, networks of brain regions that are involved in similar processes (functional networks) are observed in task-based fMRI studies when specific cognitive constructs are targeted with subtraction-based methods and in fMRI studies of resting-state functional connectivity. Resting-state functional connectivity refers to the observation that regions of related function have similar time courses of low frequency BOLD signal when individuals are asked to merely relax inside an fMRI scanner. The resting state is a particularly interesting context because it is mostly free of instruction-related demands on participants and provides a measure of coordination (time course correlation) between functional networks that is highly stable (Shehzad et al., 2009; Choe et al., 2015).
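The notion of network coordination as a time course correlation can be illustrated with a short sketch. This is not the analysis code used in the study: the two simulated network time courses, the helper names, and the Fisher z-transformation step (a common convention before group-level statistics) are all assumptions made for illustration. The number of volumes matches the resting-state acquisition described under Brain Imaging.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length time courses."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher r-to-z transform (assumed here; makes r values more normally distributed)."""
    return math.atanh(r)

random.seed(0)
n_volumes = 816  # number of resting-state volumes reported for this study

# Simulated average BOLD time courses for two networks that share a common signal.
shared = [random.gauss(0, 1) for _ in range(n_volumes)]
net_a = [s + random.gauss(0, 1) for s in shared]
net_b = [s + random.gauss(0, 1) for s in shared]

r = pearson_r(net_a, net_b)  # between-network connectivity estimate
z = fisher_z(r)              # value that would enter a group-level model
```

A positive r corresponds to networks whose activity rises and falls together, and a negative r to networks that fluctuate in opposition, which is the sense in which the term "negative connectivity" is used below.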

While an intact frontal system may be necessary for performing EF tasks, high functioning may depend upon the segregation of EF-related mechanisms from contrasting mechanisms such as those related to internal mentation. Prior work in the clinical domain has found that depression is associated with co-activation of resting-state brain systems responsible for cognitive control and internal mentation (Kaiser et al., 2016), which is one possible explanation for EF-related deficits that are frequently observed in individuals suffering from depression and/or other forms of psychopathology (Snyder et al., 2015). In addition, some preliminary work in neurologically normal individuals has shown that variation in connectivity between networks is linked to individual differences in cognitive ability. For example, altered connectivity between networks responsible for externally versus internally directed attention has been observed in individuals with high versus low working memory ability as measured by both sequencing and span tasks (Keller et al., 2015) and across individuals with variations in attentional control ability (Kelly et al., 2008). The tasks used in these studies index some cognitive abilities specific to working memory and attentional mechanisms, respectively, but also measure common mechanisms such as the ability to learn/maintain complex rules or insulate task goals from competing personal thoughts, abilities shared across many EF tasks. Thus, the nature of these previously reported brain/behavior relationships is imprecise due to task impurity, leading to the question of whether network connectivity is a neural correlate of processes common to many cognitive tasks or those specific to a particular task or operation.

The current study utilizes a multitask EF battery to more specifically investigate the link between network connectivity and EFs. We utilize the Unity/Diversity model of EF, an influential framework that re-parameterizes variance in three commonly studied EF processes (prepotent response inhibition, mental set shifting, and working memory updating) into three orthogonal latent factors (Miyake and Friedman, 2012; Friedman and Miyake, 2017). The first factor, Common EF, accounts for performance on all EF tasks, and is thought to reflect the ability to actively maintain and implement a task goal or attentional set. Two orthogonal diversity factors predict additional variance in the shifting and updating tasks. Shifting-specific EF is thought to reflect the speed with which one can clear goals that are no longer relevant, beyond those goal-management processes recruited in Common EF. Similarly, Updating-specific EF reflects working memory operations that are not captured by Common EF, such as gating and possibly episodic retrieval. There is no evidence for an inhibition-specific factor, suggesting that individual differences in response inhibition are captured by Common EF (Friedman and Miyake, 2017). Hence, this framework captures both unity (Common EF) and diversity (Shifting-specific and Updating-specific) of EFs.
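The structure of the Unity/Diversity model described above can be summarized as a loading pattern: every task loads on Common EF, and only updating and shifting tasks load on an additional specific factor. The sketch below is illustrative only. The six task names come from the Procedure section of this article, but their assignment to inhibition, updating, and shifting follows the usual convention in this literature and is an assumption here, as is the helper function.

```python
# Assumed task-to-construct mapping (conventional, not stated explicitly in this section).
TASK_CONSTRUCT = {
    "antisaccade": "inhibition",
    "Stroop": "inhibition",
    "keep track": "updating",
    "letter memory": "updating",
    "number-letter": "shifting",
    "category-switch": "shifting",
}

def loads_on(task):
    """Return the latent factors a task loads on under the Unity/Diversity model."""
    construct = TASK_CONSTRUCT[task]
    factors = ["Common EF"]  # unity: every EF task loads on Common EF
    if construct == "updating":
        factors.append("Updating-specific")
    elif construct == "shifting":
        factors.append("Shifting-specific")
    # Inhibition tasks get no specific factor: their variance is captured by Common EF.
    return factors

pattern = {task: loads_on(task) for task in TASK_CONSTRUCT}
```

In the fitted model the three factors are constrained to be orthogonal, so the specific factors capture only the variance in updating and shifting tasks that Common EF leaves unexplained.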

Although prior work has focused on the correlation of specific regions of interest within the functional networks implicated in externally and internally directed attention, we utilize a whole-cortex network approach so as to not limit ourselves to specific, subjectively chosen functional regions of interest. This approach also affords the potential to reveal novel network connectivity/EF relationships. There is overwhelming evidence of functional networks from parcellation studies of brain activity during resting-state scans (Power et al., 2011; Yeo et al., 2011). For the current analysis, we chose a popular low-dimensionality solution as determined by a clustering analysis of resting-state scans from over 1000 individuals; this solution describes seven networks: visual, sensory/somatomotor, dorsal attention, ventral attention, limbic, default, and frontoparietal networks (Yeo et al., 2011; the authors also provide a 17-network solution, which we utilize to provide more detail on EF-related connections).

Within this framework, visual and sensory-somatomotor networks are well-characterized and contain regions located in close proximity to V1 and the sensory/motor strips, respectively. The limbic network contains predominantly orbitofrontal cortex (Mega et al., 1997), which is involved in affect, valuation, and decision-making. The remaining networks of the seven-network parcellation can broadly be categorized as task positive or task negative based on whether their activation typically increases or decreases, respectively, during difficult, externally directed cognitive control tasks when compared to baseline. In the parcellation provided by Yeo and colleagues, there are three task-positive networks (the frontoparietal, dorsal attention, and ventral attention networks) that are implicated in the various levels of control needed to perform directed tasks. The dorsal attention network is involved in top-down biasing of attention during goal pursuit, whereas the ventral attention network is involved in bottom-up attentional processes such as reorienting or filtering of attention toward sensory information in the environment that may be goal-related (Vossel et al., 2014). The frontoparietal network is implicated in higher-level functions such as fine adjustment of current behavior in response to changes in task demands (Dosenbach et al., 2008). The task-negative network is the default network, which is a set of midline frontal, posterior cingulate, and middle temporal areas implicated in a family of self-related processes such as imagination and reminiscence (Andrews-Hanna, 2011). The typical decrease in BOLD signal of the default network during difficult externally directed tasks is explained as a decreased focus on the internal world and redirection of attention to the demanding task (Fox et al., 2005).

We examined the hypothesis that individual differences in EFs are associated with individual differences in correlation strength between task-positive and task-negative networks. Prior studies have found evidence of hypoconnectivity (decreased positive/increased negative connectivity) in individuals with higher working memory span and sequencing ability (e.g., Keller et al., 2015). However, due to ambiguity in the exact EF processes measured by these previous studies, we examine three EF factors to determine whether task-positive-to-task-negative network connectivity is linked to EF mechanisms that are common to many tasks (Common EF) or to a more specific EF ability such as Updating-specific or Shifting-specific abilities. Specifically, we test for connectivity-EF relationships in six models predicting each of six pairwise relationships between the frontoparietal, dorsal attention, ventral attention, and default networks from three EF factors, reporting only those relationships that withstand correction for the six models.
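The count of six models follows directly from the number of unordered pairs among the four networks of interest. The sketch below enumerates those pairs and shows one way a per-model threshold could be derived. The source states only that results had to withstand correction for the six models, so the Bonferroni rule and the nominal alpha of 0.05 used here are assumptions for illustration.

```python
from itertools import combinations

NETWORKS = ["frontoparietal", "dorsal attention", "ventral attention", "default"]

# Every unordered pair of networks corresponds to one connectivity outcome (one model).
pairs = list(combinations(NETWORKS, 2))

# Pairs linking a task-positive network to the task-negative (default) network.
task_pos_to_neg = [p for p in pairs if "default" in p]

# Assumed correction: Bonferroni across the six models at a nominal alpha of 0.05.
alpha = 0.05
bonferroni_alpha = alpha / len(pairs)
```

Three of the six pairs involve the default network and therefore bear on the primary hypothesis; the remaining three pairs connect task-positive networks to one another.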

One advantage of the current study is that it used a larger sample size (N = 250) than typically employed in prior studies of this nature. This approach afforded us the opportunity to investigate two supplemental research questions regarding connectivity-EF relationships that might not have emerged in previous studies of small samples and single EF measures. First, does high Common EF, Shifting-specific, or Updating-specific ability relate to hyperconnectivity (increased positive/decreased negative connectivity) between systems with complementary functions such as the frontoparietal and the dorsal/ventral attention networks or between the dorsal and ventral attention networks themselves? This question is motivated by a finding that stronger positive connectivity between dorsal attention and frontoparietal networks is associated with higher performance on the stop signal task, which is typically considered a measure of inhibitory control (Tian et al., 2013), as well as evidence that task-positive regions become and stay hyperconnected after a challenging EF task (Gordon et al., 2012). Second, is hypo- or hyperconnectivity between lower-level sensory and higher-level cognitive networks related to individual differences in EFs? This question is motivated by prior resting-state work from our group in a younger and smaller sample in which we found that Common EF and Shifting-specific ability were linked to the functional connectivity characteristics of lower-level sensory areas (Reineberg and Banich, 2016). Finally, we follow up our primary analyses using a finer-grained network parcellation (n = 17) to test spatial specificity (e.g., are brain-EF relationships isolated to particular subcomponents of task-positive networks versus the network at a coarse level of analysis?).

# MATERIALS AND METHODS

# Participants

Participants were 250 individuals from the ongoing Colorado Longitudinal Twin Study [LTS; M(age) = 28.7 years, SD(age) = 0.57 years; 97 males], who completed a resting state scan as part of a larger testing session. Data from an additional 15 participants were excluded, either because of excessive movement during the scanning session, defined as greater than 2 mm translation (motion in the X, Y, or Z plane) or 2 degrees rotation (roll, pitch, or yaw motion) (n = 14), or because the presentation computer failed to display a fixation cross during the resting scan (n = 1). Of the 250 individuals, there were 54 pairs of monozygotic (MZ; identical) twins, 45 pairs of same-sex dizygotic (DZ; fraternal) twins, 24 MZ twin singletons, and 28 DZ twin singletons. Singletons are members of twin pairs whose co-twins either did not participate or were excluded from analysis. All LTS participants were recruited from the Colorado Twin Registry based on birth records, and the sample is representative of the Colorado population at the time of recruitment (see Rhea et al., 2006, 2013 for additional details). Based on self-report, the LTS sample is 92.6% White, 5.0% "more than one race," <1% American Indian/Alaskan Native, <1% Pacific Islander; 1.2% did not report race. Hispanic individuals composed 9.1% of the sample. Participants were paid \$150 for participation in the 3-h study; those who did not finish the entire protocol were paid \$25 per half hour. All study procedures were approved by the Institutional Review Board of the University of Colorado Boulder.
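The sample composition and the motion exclusion rule can be checked with a few lines of arithmetic. The sketch below is a hypothetical illustration rather than the study's preprocessing code: the function name is invented, and applying the 2 mm and 2 degree limits to the maximum absolute displacement along any axis is an assumption, since the text does not state how the limits were evaluated.

```python
def exceeds_motion_limits(translations_mm, rotations_deg,
                          max_translation_mm=2.0, max_rotation_deg=2.0):
    """Flag a scan whose largest translation or rotation exceeds the stated limits.

    Assumes each argument holds the peak displacement per axis
    (X, Y, Z for translation; roll, pitch, yaw for rotation).
    """
    too_much_translation = max(abs(t) for t in translations_mm) > max_translation_mm
    too_much_rotation = max(abs(r) for r in rotations_deg) > max_rotation_deg
    return too_much_translation or too_much_rotation

# Sample composition as reported in the text.
mz_pairs, dz_pairs = 54, 45
mz_singletons, dz_singletons = 24, 28
n_included = 2 * mz_pairs + 2 * dz_pairs + mz_singletons + dz_singletons

# Participants scanned before exclusions (14 for motion, 1 for equipment failure).
n_scanned = n_included + 14 + 1
```

The twin pairs and singletons sum to the reported 250 participants, which implies that 265 individuals were scanned before exclusions.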

# Procedure

The study session involved the administration of behavioral tasks that measured EF ability as well as acquisition of anatomical and functional brain data via MRI. Testing took place in a single 3-h session. Following informed consent, participants were familiarized with the imaging procedures including practice versions of the behavioral tasks to ensure comprehension later in the scanner. They also completed some interviews and questionnaires, then completed a 1.5-h scanning session that began with a structural scan followed by a 6-min resting state, three EF tasks (antisaccade, keep track, and number–letter, in that order), and a diffusion tensor imaging sequence (not analyzed here). The current study only utilizes behavioral data (i.e., reaction time, accuracy) acquired during functional scanning of the antisaccade, keep track, and number-letter tasks. After the scan, participants returned to a behavioral testing room to complete three additional EF tasks (Stroop, category-switch, and letter memory, in that order). If both twins of a pair participated on the same day, the twins completed the protocol sequentially (twin order randomized) with the same ordering of behavioral testing and imaging acquisition.

# Brain Imaging

Participants were scanned in a Siemens Tim Trio 3T scanner. Neuroanatomical data were acquired with a T1-weighted MP-RAGE sequence [acquisition parameters: repetition time (TR) = 2400 ms, echo time (TE) = 2.07 ms, matrix size = 320 × 320 × 224, voxel size = 0.80 mm × 0.80 mm × 0.80 mm, flip angle (FA) = 8.00 deg., slice thickness = 0.80 mm]. Resting state data were acquired with a T2<sup>∗</sup>-weighted echo-planar functional scan [acquisition parameters: number of volumes = 816, TR = 460 ms, TE = 27.2 ms, matrix size = 82 × 82 × 56, voxel size = 3.02 mm × 3.02 mm × 3.00 mm, FA = 44.0 deg., slice thickness = 3.00 mm, field of view (FOV) = 248 mm]. During the resting-state scan, participants were instructed to relax and stare at a fixation cross while blinking as they normally would. We based this decision on suggestions in the literature indicating that eyes open and fixated is the optimal instruction for maximizing reliability (Zou et al., 2015). In addition, this approach is thought to minimize the variability that is observed in the visual processing stream when participants are instructed to keep their eyes closed versus open during the resting scan. Visual network variability seemingly comes from top-down imagination/visualization processes, although the exact mechanism is unknown (Patriat et al., 2013).

#### Measures

A strength of the LTS sample is a detailed characterization of EF ability. Specifically, rather than measuring EFs with only a single task, we calculated EF factor scores from the six tasks completed on the day of the scan.

#### Antisaccade Task

This task was adapted for fMRI from Roberts et al. (1994). Only behavioral performance was analyzed for the current manuscript. Antisaccade captures the ability to maintain and execute a task set in the face of distracting information; specifically, it requires inhibiting prepotent eye movements (Miyake et al., 2000). In the scanner version, participants completed 20 s blocks of prosaccade, antisaccade, and rest (fixation) trials (12 blocks of each across two runs; 5 trials per block for the prosaccade and antisaccade blocks). Each block was preceded by a jittered instruction (TOWARD, AWAY, or FIXATION for 2, 4, or 6 s). On each trial, after a jittered fixation lasting 1–3 s, a small visual cue flashed on one side of the computer screen for 234 ms, followed by a target (a digit from 0 to 9) that appeared for 150 ms before being masked. The mask lasted 1650 ms, during which time the participant vocalized the target. The cue and target appeared on the same side of the screen during prosaccade trials and on opposite sides during antisaccade trials. Hence, in order to identify the number on the antisaccade trials, participants had to avoid the automatic tendency to saccade to the cue and instead immediately look in the opposite direction. The dependent measure was the proportion of correctly identified targets on the 60 antisaccade trials.

#### Stroop Task

This task was adapted from Stroop (1935). Stroop captures the ability to maintain a task set in the face of prepotent distracting information, specifically, inhibiting the prepotent tendency to read words. Participants verbally indicated the font color (red, blue, or green) of text presented on a black screen as quickly as possible, with reaction time measured via a ms-accurate voice key. Trials were divided into three types: a block of 42 neutral trials consisting of asterisks (3–5 characters long) presented in one of three colors (red, blue, and green); a block of 42 congruent trials consisting of color words that matched the font color (e.g., the word "RED" displayed in red font); and two blocks of 42 incongruent trials each, consisting of color words that did not match the font color (e.g., the word "RED" displayed in blue font). Each word disappeared as soon as the voice key detected the response, and the next word appeared after a 250 ms white fixation. The dependent measure was the mean reaction time difference between correct incongruent and neutral trials.

#### Keep Track Task

This task was adapted for fMRI from Yntema (1963). Only behavioral performance was analyzed for the current manuscript. Keep track captures the ability to maintain and update information in working memory. On each trial in the scanner version, participants were given 3 or 4 target categories (animals, colors, countries, distances, metals, or relatives) that remained on the screen throughout the trial. After viewing a serial list of 16 words drawn from 6 categories (one word every 2 s), they saw a "???" prompt on the screen for 10 s, during which they orally recalled the last exemplar of each target category. Because each list contained 1–3 exemplars of each category, they had to update which words to remember and ignore words from irrelevant categories. In addition to these "Remember" trials, the scanner version of the task included baseline conditions of "Read" trials, in which participants just silently read the words without trying to remember them, and 20 s rest (fixation) trials. Each trial type was preceded by a jittered instruction (REMEMBER, READ, or FIXATION for 2, 4, or 6 s). There were three runs, each with 3 recall trials (two with 4 words to recall and one with 3), 3 read trials, and 3 rest trials. The behavioral dependent measure was the proportion of the 45 words correctly recalled out of all remember trials.

#### Letter Memory

This task was adapted from Morris and Jones (1990). Letter memory captures the ability to maintain and update items in working memory. In each trial, participants saw a series of 9, 11, or 13 consonants, with each letter appearing for 3 s. As each letter appeared, they had to say aloud the last four letters, including the current letter. The dependent measure was the proportion of 132 sets correctly rehearsed (i.e., the last 4 letters reported in the correct order) across 12 trials.

#### Number–Letter Task

This task was adapted for fMRI from Rogers and Monsell (1995). Only behavioral performance was analyzed for the current manuscript. Number–letter captures the ability to shift between mental sets. In each trial of the scanner version, participants saw a box sectioned into four quadrants. The borders of one quadrant were darkened (i.e., cued) for 350 ms, then a number–letter or letter–number pair (e.g., 4K) appeared inside until it was categorized. The participant had to categorize the number (top 2 quadrants) or letter (bottom 2 quadrants) as odd/even or consonant/vowel, respectively, using two buttons on a button box. The stimuli disappeared from the screen when categorized, and there was a 350 ms response-to-cue interval. The trials were arranged in blocks, and rest blocks (20 s) were intermixed with the task blocks. Each block was preceded by a jittered instruction (TOP, BOTTOM, MIXED, or FIXATION for 2, 4, or 6 s) that indicated where the stimuli would appear for that block. In mixed blocks, half the trials were repeat trials in which the task stayed the same as the previous trial; the other trials required a switch in categorization task. Each block consisted of 13 trials, with the first trial not counted because it was neither switch nor repeat. There were two runs, each containing eight mixed blocks, eight single-task blocks (four each of number and letter blocks), and rest blocks. The behavioral dependent measure was the local switch cost, that is, the difference between average response times on correct switch and no-switch trials within mixed blocks (96 trials of each type).

#### Category-Switch Task

This task was adapted from Mayr and Kliegl (2000). Category-switch captures the ability to shift between mental sets. In each trial, participants categorized a word according to animacy (i.e., living vs. non-living) or size (i.e., smaller or larger than a soccer ball), depending on a cue (heart or crossed arrows, respectively) that preceded the word by 350 ms and remained above the word until the participant responded with one of two buttons on a button box. The stimuli disappeared from the screen when categorized, and there was a 350 ms response-to-cue interval. A 200-ms buzz sounded for errors. The task began with two single-task blocks of 32 trials each, in which participants categorized words only by animacy then only by size. Then participants completed two mixed blocks of 64 trials each, in which half the trials required switching the categorization criterion. The dependent measure was the local switch cost, that is, the difference between average response times on correct switch and no-switch trials within mixed blocks.
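The local switch cost used as the dependent measure for the number–letter and category-switch tasks can be sketched as follows. This is a minimal illustration on made-up trials, not the authors' scoring code; the dictionary keys (`kind`, `rt_ms`, `correct`) are hypothetical, and the sketch leaves out any reaction time trimming.

```python
from statistics import mean


def local_switch_cost(trials):
    """Mean correct switch RT minus mean correct repeat RT (mixed blocks only).

    trials: list of dicts with hypothetical keys
      'kind' ('switch' or 'repeat'), 'rt_ms' (float), 'correct' (bool).
    """
    switch_rts = [t["rt_ms"] for t in trials
                  if t["correct"] and t["kind"] == "switch"]
    repeat_rts = [t["rt_ms"] for t in trials
                  if t["correct"] and t["kind"] == "repeat"]
    return mean(switch_rts) - mean(repeat_rts)


# Made-up trials for illustration.
example_trials = [
    {"kind": "switch", "rt_ms": 900.0, "correct": True},
    {"kind": "switch", "rt_ms": 1100.0, "correct": True},
    {"kind": "switch", "rt_ms": 2000.0, "correct": False},  # error trial, ignored
    {"kind": "repeat", "rt_ms": 700.0, "correct": True},
    {"kind": "repeat", "rt_ms": 800.0, "correct": True},
]
cost = local_switch_cost(example_trials)  # 1000.0 - 750.0
```

A larger positive value indicates a larger reaction time penalty for switching between categorization rules.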

# Data Analysis

#### EF Data

Scores on the six EF tasks were subjected to the same trimming and transformation used in prior studies to improve normality and reliability (Friedman et al., 2016). Specifically, correct reaction times were trimmed within-subject to obtain the best measures of central tendency within conditions (Wilcox and Keselman, 2003). Additionally, within the number–letter and category-switch tasks, trials following error trials were excluded, as determining switch versus repeat trials is dependent on the preceding trial. Following within-subject reaction time trimming, extreme high and low scores at the between-subjects level (greater than 3 SDs from the group mean) were Winsorized (replaced with the cutoff value of 3 SDs above or below the mean, respectively) to improve normality and reduce the impact of extreme scores while maintaining these scores in the distribution.
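The between-subjects step, in which scores more than 3 SDs from the group mean are replaced with the cutoff value, can be illustrated with a short sketch. The function name and data are made up, and two details are assumptions not stated in the text: the sample SD (n − 1 denominator) is used, and the cutoffs are computed once from the untrimmed scores.

```python
from statistics import mean, stdev


def winsorize_3sd(scores, n_sd=3.0):
    """Replace scores beyond mean +/- n_sd * SD with the cutoff value.

    Assumptions: sample SD (n - 1 denominator); cutoffs are computed once
    from the untrimmed distribution.
    """
    center = mean(scores)
    spread = stdev(scores)
    lower = center - n_sd * spread
    upper = center + n_sd * spread
    return [min(max(s, lower), upper) for s in scores]


# Made-up data with one extreme high score.
scores = [10.0] * 20 + [100.0]
trimmed = winsorize_3sd(scores)
```

The extreme score stays in the distribution as its highest value, but it is pulled in to the 3 SD cutoff rather than being dropped.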

Factor scores were extracted via a confirmatory factor analysis in Mplus 8.0 (Muthén and Muthén, 1998–2017), with all six EF tasks loading on Common EF, the keep track and letter memory tasks loading on the orthogonal Updating-specific factor, and the number–letter and category-switch tasks loading on the orthogonal Shifting-specific factor. The loadings were equated (after scaling the measures to have similar variances) within the Updating-specific and Shifting-specific factors to identify these two-indicator factors.

#### Preprocessing

All processing of brain data was performed in a standard install of FSL build 5.09 (Jenkinson et al., 2012). To account for signal stabilization, the first 10 volumes of each individual functional scan were removed, yielding 806 volumes per subject for additional analysis. The functional scans were corrected for head motion using MCFLIRT, FSL's motion correction tool. Brain extraction (BET) was used to remove signal associated with non-brain material (e.g., skull, sinuses, etc.). FSL's FLIRT utility was used to perform a boundary-based registration of each participant's functional scan to his or her anatomical volume and a six-degree-of-freedom affine registration to MNI152 standard space. LTS scans were subjected to AROMA, an automated independent components analysis-based, single-subject de-noising procedure (Pruim et al., 2015). Signal was extracted from masks of the lateral ventricles, white matter, and whole brain volume and regressed out along with a set of six motion regressors and associated first and second derivatives. The scans were band-pass filtered (0.001–0.08 Hz band). Finally, time courses for each of the functional networks of interest were extracted for each individual with FSL's "fslmeants" command (Jenkinson et al., 2012) using the network templates provided by Yeo and colleagues as a mask.
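As a quick consistency check on the acquisition and preprocessing numbers, the arithmetic below relates the number of volumes and the TR to the scan duration and to the sampling limit relevant for the band-pass filter. It uses only values reported in the text; the variable names are illustrative.

```python
TR_S = 0.460        # repetition time in seconds (460 ms)
N_ACQUIRED = 816    # volumes acquired
N_DISCARDED = 10    # initial volumes removed for signal stabilization

n_kept = N_ACQUIRED - N_DISCARDED
acquired_minutes = N_ACQUIRED * TR_S / 60
kept_minutes = n_kept * TR_S / 60
nyquist_hz = 1 / (2 * TR_S)  # highest frequency resolvable at this sampling rate

print(n_kept, round(acquired_minutes, 2), round(kept_minutes, 2),
      round(nyquist_hz, 3))
```

The acquired series lasts a little over 6 min, in line with the 6-min resting-state scan described in the Procedure, and the 0.08 Hz upper edge of the filter lies well below the Nyquist frequency implied by the 460 ms TR.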

#### Statistical Models

We used the time courses generated by the procedure outlined above to determine whether or not individual differences in network-to-network connectivity are associated with variation in EF ability. We calculated network-to-network connectivity as Fisher's z-transformed Pearson's r-values for all pairwise relationships between functional networks of interest. We then performed a multiple regression analysis regressing network-to-network connectivity on Common EF, Shifting-specific, and Updating-specific factor scores as well as gender and mean translation and rotation movement during the resting-state scan. To account for non-independence of twin pairs, we utilized the "type = complex" option in Mplus. This option uses a sandwich estimator to obtain standard errors corrected for familial clustering. The relevant measures were treated as approximately continuous variables using the robust maximum likelihood (MLR) estimator.

Because we had genetically informative data, we evaluated whether significant associations were present within-families and/or between-families, using a multilevel twin difference model (Vitaro et al., 2009). Specifically, we used a random intercepts model of the connection strength, with level 1 (within-family) predictors of each twin's deviation from his or her family mean of Common EF, Shifting-specific, and Updating-specific score (i.e., cluster-centered), as well as grand-mean-centered translation and rotation. The slopes for the within-family Common EF, Shifting-specific, and Updating-specific effects were allowed to vary by zygosity, but were not allowed to have residual variance (i.e., we specified these slopes as random, regressed them on zygosity at level 2, and fixed their residual variances to zero). At level 2 (between), we regressed the random intercept on the family means for Common EF, Shifting-specific, and Updating-specific scores, as well as sex (which did not vary within families). We standardized all continuous variables to obtain parameter estimates in standard deviation units. The Mplus syntax for this model is provided in the **Supplementary Material**.
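The cluster-centering step that separates within-family from between-family information can be illustrated as follows. This sketch only shows how each score is split into a family mean (the level 2 predictor) and a deviation from that mean (the level 1 predictor); it does not fit the multilevel model, and the names and values are made up.

```python
from collections import defaultdict
from statistics import mean


def cluster_center(scores_by_person):
    """Split each score into a family mean and a within-family deviation.

    scores_by_person: list of (family_id, score) tuples (illustrative names).
    Returns a list of (family_id, family_mean, deviation) in input order.
    """
    by_family = defaultdict(list)
    for family_id, score in scores_by_person:
        by_family[family_id].append(score)
    family_means = {fam: mean(vals) for fam, vals in by_family.items()}
    return [(fam, family_means[fam], score - family_means[fam])
            for fam, score in scores_by_person]


# Made-up standardized scores: one complete twin pair and one singleton.
rows = cluster_center([("fam1", 0.5), ("fam1", -0.25), ("fam2", 1.0)])
```

In this simple scheme a singleton has a deviation of zero and therefore carries only between-family information.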

# RESULTS

# Behavioral Data

Descriptive statistics for all behavioral tasks are provided in **Table 1A**, while the factor scores for Common EF, Shifting-specific EF, and Updating-specific EF are provided in **Table 1B**. In latent variable form, Common EF, Shifting-specific, and Updating-specific are orthogonal; however, their factor scores are moderately correlated because they are imperfect approximations of latent variables (factor score indeterminacy).

TABLE 1 | Descriptive statistics for executive function tasks, measures, and correlations among resting-state networks.

(A) Descriptive statistics for six EF tasks. (B) Descriptive statistics for Common EF, Shifting-specific EF, and Updating-specific EF. (C) Descriptive statistics for correlations amongst seven resting-state networks. V, visual network; SM, sensory/somatomotor network; DAN, dorsal attention network; VAN, ventral attention network; FP, frontoparietal network; DEF, default network. For example, the mean for V\_to\_SM is Fisher's z-transformation of Pearson's correlation between visual and sensory/somatomotor networks. <sup>∗</sup>Split-half reliability (odd/even for Stroop and Category-switch or run1/run2 for antisaccade and number–letter), adjusted with the Spearman-Brown prophecy formula. <sup>∧</sup>Cronbach's alpha across 3 runs for keep track and 4 sets of trials for letter memory.
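For readers unfamiliar with the Spearman-Brown adjustment mentioned in the table note, the standard formula can be sketched as below. The input value is a made-up split-half correlation, not a value from Table 1, and the function name is illustrative.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Predicted reliability when a test is lengthened by length_factor.

    With length_factor = 2 this is the usual split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


r_full = spearman_brown(0.60)  # 0.60 is a made-up split-half correlation
```

The correction estimates the reliability of the full-length measure from the correlation between its two halves, so for positive correlations the adjusted value exceeds the raw split-half correlation.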

Factor score determinacy estimates for the complete data pattern were 0.83, 0.75, and 0.60 for Common EF, Shifting-specific EF, and Updating-specific EF, respectively. Common EF was positively correlated with Updating-specific EF (r = 0.33, p < 0.001) and Shifting-specific EF (r = 0.20, p < 0.001), whereas Updating-specific EF and Shifting-specific EF were negatively correlated (r = −0.33, p < 0.001).

#### Mean Network Connectivity

Average connectivity among all seven functional networks provides some assurance the current sample is consistent with prior work and serves as a validity check. **Figure 1** shows all group average pairwise correlations between each of the seven functional networks, while descriptive statistics for all network-to-network connectivity measures are provided in **Table 1C**. As expected, there is an average positive connectivity between dorsal and ventral attention networks, and an average negative connectivity between the default network (i.e., implicated in internally directed attention) and attention networks (i.e., implicated in external attention). However, the relationship between the frontoparietal and default network was slightly positive, on average.

# Relationship Between Network Connectivity and EF

#### Analysis of Higher-Level Cognitive Networks – General

The primary analyses used to investigate network connectivity and individual differences in levels of EF were six multiple regression models in which each pairwise connection between the default, frontoparietal, dorsal attention, and ventral attention networks was regressed on the Common EF, Shifting-specific, and Updating-specific factor scores while controlling for a summary of motion during the resting-state scan and gender. After Bonferroni correcting for these six models (alpha = 0.05/6 = 0.0083), we found one EF parameter estimate was statistically significant: Individuals with higher Shifting-specific scores had reduced connectivity between the ventral attention and default networks (**Figure 2**; standardized beta = −0.181, p = 0.005). This particular connection was strongly negatively correlated across the group, so individuals with better Shifting-specific ability had stronger negative correlations between these two systems.

To further explore this finding, we investigated the spatial specificity of the connectivity between the ventral attention network and the default network as it relates to Shifting-specific EF, by using the multiple ventral attention and default network subcomponents from the Yeo et al. (2011) 17-network parcellation. This parcellation divides the ventral attention network into two subcomponents that are primarily differentiated by involving anterior as compared to posterior divisions of all the key cingulate, insular, and temporal/parietal areas. The default network is divided into four subcomponents: three are divisions of the midline hubs and lateral parietal aspects of the default network, and another is best described as the temporal lobe subsystem of the default network. Our supplemental analysis found higher Shifting-specific ability was associated with more negative connectivity of the posterior ventral attention subsystem (**Figure 3**, blue) and the hub subsystems of the default network (**Figure 4**, blue), but notably not the temporal lobe subsystem of the default network.

Additionally, we considered one result that reached nominal significance but did not survive correction for multiple comparisons: individuals with higher Shifting-specific scores had reduced connectivity between frontoparietal and ventral attention networks (standardized beta = −0.159, p = 0.032). See **Table 2A** for standardized beta weights for all models of a priori interest.

Moreover, due to multiple reports of frontoparietal-to-default hypoconnectivity being significantly associated with working memory span/sequencing, we specifically interrogated this relationship. Although the direction of the relationship for Common EF was consistent with these prior reports, such that higher Common EF scores were associated with hypoconnected frontoparietal and default networks, the effect was not significant (standardized beta = −0.110, p = 0.142).

#### Analysis of Higher-Level Cognitive Networks – Genetic Influences

Because we had genetically informative data, we evaluated whether the significant association between Shifting-specific ability and the connectivity of ventral attention and default networks was influenced by genetic factors. To do so, we used a multilevel twin difference model (Vitaro et al., 2009). If the effect is significant within MZ twin pairs (i.e., the twin with the higher Shifting-specific score has a more negative connection between ventral attention and default networks), it suggests the effect is due to non-shared environmental influences that affect both connection strength and Shifting-specific ability. Such a finding would be consistent with, but would not establish, a causal effect. A between-family effect suggests that differences between families (which can include genetic and shared environmental effects such as socioeconomic status) drive the association. We found a significant between-family effect (beta = −0.159, p = 0.022) of Shifting-specific EF on the connection strength between the ventral attention and default networks. The within-family effect was not significant averaging across zygosity (beta = −0.126, p = 0.303), but there was a marginally significant interaction of the within effect by zygosity (beta = 0.468, p = 0.053), such that there is a marginally significant within effect for MZ pairs (simple effect beta = −0.354, p = 0.055) but not DZ pairs (simple effect beta = 0.114, p = 0.469). Together, these effects suggest that genes and shared environments influence the relationship between Shifting-specific EF and connectivity, and they provide preliminary evidence of non-shared environmental influences.
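The reported interaction and simple effects can be checked against each other with simple arithmetic. The sketch below assumes zygosity was coded so that the interaction coefficient equals the DZ-minus-MZ difference in within-family slopes (for example, any coding with a one-unit difference between groups); the text does not state the coding, so this is an assumption.

```python
within_mz = -0.354       # reported simple effect for MZ pairs
within_dz = 0.114        # reported simple effect for DZ pairs
interaction = 0.468      # reported within-effect-by-zygosity interaction
within_average = -0.126  # reported within effect averaging across zygosity

# Under the assumed coding, the interaction is the difference in slopes.
slope_difference = round(within_dz - within_mz, 3)

# Unweighted mean of the two simple effects, for comparison with the
# reported average (which may reflect a different centering or weighting).
unweighted_mean = round((within_mz + within_dz) / 2, 3)
```

Under this assumption the slope difference reproduces the reported interaction, and the unweighted mean of the two simple effects falls close to, though not exactly on, the reported average within-family effect.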

#### Exploratory Analysis of Sensory Networks

Our final analyses explored associations of connectivity between higher-level and lower-level systems and Common EF, Shifting-specific EF, and Updating-specific EF. Specifically, we used the same multiple regression models described above to predict pairwise connections between the higher-level cognitive networks discussed above and the visual and somatomotor networks, respectively. We found higher Shifting-specific ability was associated with greater positive connectivity between the visual network and the frontoparietal network (standardized beta = 0.172, p = 0.017), the visual network and dorsal attention network (standardized beta = 0.173, p = 0.010), and the visual network and ventral attention network (standardized beta = 0.176, p = 0.007). Hence, higher Shifting-specific EF is associated with greater positive connectivity between the visual network and higher-order executive/attention networks. In addition, higher Common EF was associated with a more negative relationship between activity in the somatomotor and dorsal attention network (standardized beta = −0.171, p = 0.014) and with a more positive relationship between the somatomotor and default network activity (standardized beta = 0.203, p = 0.005). However, no exploratory results were significant after Bonferroni correction for the six original and eight additional models (alpha = 0.05/14 = 0.0036). See **Table 2B** for standardized beta weights for all models of exploratory interest.
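The two Bonferroni thresholds and their consequences for the reported p-values can be reproduced with a few lines. The p-values are those given in the text; the dictionary labels and function name are illustrative shorthand.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


primary_alpha = bonferroni_alpha(0.05, 6)       # six a priori models
exploratory_alpha = bonferroni_alpha(0.05, 14)  # six original + eight additional

# p-values reported in the text.
primary_p = {"VAN-DEF ~ Shifting-specific": 0.005}
exploratory_p = {
    "V-FP ~ Shifting-specific": 0.017,
    "V-DAN ~ Shifting-specific": 0.010,
    "V-VAN ~ Shifting-specific": 0.007,
    "SM-DAN ~ Common EF": 0.014,
    "SM-DEF ~ Common EF": 0.005,
}

primary_survivors = [k for k, p in primary_p.items() if p < primary_alpha]
exploratory_survivors = [k for k, p in exploratory_p.items()
                         if p < exploratory_alpha]
```

The ventral attention to default association passes the six-model threshold, whereas none of the exploratory associations passes the stricter fourteen-model threshold, consistent with the results as reported.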

# DISCUSSION

We investigated the associations between three EF components and coordination among large scale brain systems, with a particular focus on brain systems involved in higher-level cognition. We found that better abilities specific to quickly shifting between task sets, as measured by a Shifting-specific factor score, are related to more negative connectivity between a brain system involved in internal mentation (the default network) and the ventral attention network. We will first discuss this principal finding in more detail and then discuss this result in the context of prior reports of behavior-related hypoconnectivity of higher-level cognitive networks with default networks in both the clinical domain and in neurologically normal individuals. Finally, we discuss findings of exploratory analyses regarding network connectivity between higher-level cognitive networks and lower-level networks such as the visual network and the somatomotor network.

TABLE 2 | (A) Results from 6 models involving higher-level cognitive networks of a priori interest. (B) Results from 8 models involving connectivity between higher-level cognitive and visual/sensory-somatomotor networks. V, visual network; SM, sensory/somatomotor network; DAN, dorsal attention network; VAN, ventral attention network; FP, frontoparietal network; DEF, default network. <sup>∗</sup>p < 0.05, <sup>∗∗</sup>p < 0.01, <sup>∗∗∗</sup>p < Bonferroni corrected alpha [0.0083 for (A) and 0.0036 for (B)].

Our primary analysis revealed a novel relationship between Shifting-specific ability and connectivity between the default and ventral attention networks. A test of spatial specificity further revealed the effect may be primarily driven by connectivity between the midline regions of the default network hubs — ventromedial prefrontal (vmPFC) and posterior cingulate cortices (PCC) — and a posterior subsystem of the ventral attention network.

To put this finding in perspective, we consider the purported functions of these regions. A review of the functions of the default network suggests that the midline hubs of the default network are involved in many aspects of self-referential processing including self-reflection, mentalizing, autobiographical memory, and episodic future thinking among others (Andrews-Hanna, 2011).

The ventral attention network, in the context of the Yeo et al. (2011) parcellation, contains at least three main subsystems: higher-level visual/attention areas (temporo-parietal junction), right lateral prefrontal cortex, and the cingulo-opercular system (predominantly insula and dorsal anterior cingulate cortex). One popular theory of the function of the ventral attention system suggests this part of cortex specializes in detection of behaviorally relevant stimuli (Corbetta and Shulman, 2002) and reorientation of attention toward relevant environmental information (Vossel et al., 2014). However, our examination of spatial specificity of the effect we observed using a finer-grained network parcellation revealed that although Shifting-specific EF was related to connectivity of the ventral attention network as a whole, the effect may be driven more specifically by connectivity of the cingulo-opercular subsystem. Characterization of the functions of the cingulo-opercular system is a topic of considerable interest and current controversy. The cingulo-opercular system has unique cytoarchitectonic properties (Seeley et al., 2012) and is often implicated in very broad cognitive constructs such as alertness, maintenance, and awareness (Dosenbach et al., 2007; Craig, 2009; Craig, 2011; Sadaghiani and D'Esposito, 2015; Coste and Kleinschmidt, 2016), perhaps in part due to the insula's high base rate of activation in fMRI studies (Yarkoni et al., 2011). A detailed functional description of anterior and posterior subsystems of the ventral attention network does not currently exist. However, a meta-analysis of the insula using thousands of fMRI studies as ascertained from Neurosynth (Yarkoni et al., 2011) revealed the posterior portion of the insula identified in the current report may be functionally distinguished from anterior portions by processing related to switching, inhibition, error processing, conflict, feedback, somatosensory, and other terms (Chang et al., 2013). That is, although anterior and posterior insula are involved in very similar types of processing, the posterior region may be activated more than anterior portions in certain EF-related contexts (i.e., switching, inhibition, etc.). Regarding EFs more directly, it has been proposed that the insula may play a critical role in regulating the coordination of frontoparietal and default network functions (Sridharan et al., 2008; Goulden et al., 2014). Work in the clinical domain supports the notion that ventral attention network functioning is compromised in disorders that often have comorbid EF deficits, such as anxiety (Sylvester et al., 2012) and depression (Kaiser et al., 2016).

Considering the functions of the default and ventral attention networks, one must ask how coordination of the default and ventral attention networks translates to increased performance in a specific aspect of EF that involves the rapid/fluid shifting between task/mental sets and rules, over and above goal maintenance or other general EF abilities (Common EF). Intrinsic network connections in high shifting ability individuals could be a specific, optimized state that places that individual metabolically closer to the brain states required when performing difficult cognitive tasks. Prior work has shown that better performers in a variety of cognitive domains have smaller changes in functional connectivity when going from rest to a task-directed state, possibly reflecting more efficient neural configurations (Schultz and Cole, 2016). In the context of the current study, perhaps more negative default to ventral attention connectivity is a brain state uniquely beneficial for shifting functions. From the perspective that stable resting-state connectivity reflects a history of co-activation (Wig et al., 2011), better shifters may have a stronger history of suppressing default network activity during times when interference from internal mentation functions may be disadvantageous (for a review of default network deactivation and hypoconnectivity see Anticevic et al., 2012), for example, when mind wandering might be detrimental to performance on a demanding task (Weissman et al., 2006). However, the exact mechanism through which network connectivity translates to increased performance is still an open question.

Our study also provided an example of how genetically informative data can be used to provide insights about the causes of inter-individual variation in network connectivity. Prior work utilizing a large sample of twins revealed the cross-twin correlation of default-to-cingulo-opercular connectivity was moderate and significant for both MZ (r = 0.336) and DZ (r = 0.245) twins, stronger for MZ twins, and substantially lower than 1 (Yang et al., 2016). This pattern of results indicates mixed influences of genes, shared environments and non-shared environments. Although a classic twin model to estimate the genetic, shared environmental, and non-shared environmental influence on ventral attention-to-default connectivity could be applied to the data in the current study, because of the small sample size we opted to fit a multilevel twin difference model. This analysis revealed that the ventral attention-to-default network connectivity relationship with Shifting-specific EF is primarily driven by between-family differences, which include both genetic and shared environmental influences. We also observed a marginally significant within-family effect for MZ twins (but not DZ twins), which suggests the non-shared environmental influences that cause one MZ twin to have higher Shifting-specific ability than his or her co-twin may be the same non-shared environmental influences that cause that MZ twin to have more negative ventral attention-to-default connectivity. Future work using larger twin samples should continue this line of research to tease apart genetic and environmental influences on network and regional connectivity.

Regarding other associations between higher-level cognitive network connectivity and EFs, we did not find any other strong associations after correcting for multiple comparisons. Nonetheless, there were some results worth noting. First, we did find that individuals with higher Shifting-specific scores had increased negative connectivity between the frontoparietal and ventral attention systems, which reached a univariate level of significance (p < 0.05) but did not survive Bonferroni correction. At first glance, increased Shifting-specific EF ability and a reliance upon non-simultaneous activation of two closely related systems might seem counterintuitive, as one might have expected higher EF ability to be associated with greater co-activation of closely related, higher-level cognitive systems. However, prior work has established that EF requires a trade-off between cognitive stability and flexibility, with stability required to impose and maintain a task set, and flexibility required to switch between tasks and subgoals (Goschke, 2000), with flexibility-related measures (such as Shifting-specific EF) sometimes showing the opposite relationship with outcomes than measures of stability (see Herd et al., 2014; Friedman and Miyake, 2017). Examples of such findings are studies that found a relationship between increased Shifting-specific ability and increased substance use (Gustavson et al., 2017), decreased intelligence, and poorer self-restraint (Friedman et al., 2011). Although speculative, perhaps this brain-behavior relationship is a neural manifestation of the flexibility-stability tradeoff.

Based on prior findings in the clinical domain and limited work with neurologically normal individuals (e.g., Kelly et al., 2008), we expected to find that higher Common EF or Updating-specific EF would be associated with reduced connectivity between the frontoparietal and default networks. We did not find this result. However, we did find a trend for individuals with higher Common EF to have a more negative relationship between activation in the frontoparietal and default networks, consistent with expectations. Although our results suggest there is no reliable association between default-to-frontoparietal network connectivity and EFs, we did not test for association between EFs and connectivity at the level of small and specific regions-of-interest (as in Keller et al., 2015) or between larger conglomerate networks that might combine signal across many task-positive networks (e.g., frontoparietal, dorsal attention, and cingulo-opercular regions; as in Kelly et al., 2008). In summary, the results of the current study complement prior research in the area of EF-connectivity relationships by providing an alternative measurement of both EF behavior (i.e., in the context of the Unity/Diversity model) and connectivity at the level of seven functional networks.

In an exploratory analysis, we investigated associations between EFs and connectivity between higher-level cognitive and lower-level sensory networks. We found higher Shifting-specific EF was associated with increased positive connectivity between the visual network and each of the task-positive networks (frontoparietal, dorsal attention, and ventral attention). These results are novel but complement prior work from our group in a younger sample in which we showed individuals with higher Shifting-specific ability had more diffusely connected visual cortices as quantified by local clustering coefficient, a graph theoretic measure (Reineberg and Banich, 2016). We also found higher Common EF was associated with more positive connectivity between the somatomotor and default networks as well as more negative connectivity between the somatomotor and dorsal attention networks. These findings are both novel and should be replicated and explored in future work. Generally, the results of our exploratory analysis suggest EFs may rely on broad patterns of connectivity across many brain systems, including those that are not typically associated with inter-individual variation on EF tasks (e.g., visual network and sensory/somatomotor network).

It is important to consider some limitations of the current work. As alluded to earlier, the mechanism through which network connectivity influences behavior is unclear. The stability of resting-state measures suggests high-EF individuals may have intrinsic brain characteristics that foster or allow for their higher behavioral performance. In contrast, a substantial literature shows the malleability of connectivity in the face of specific cognitive challenges and state inductions (Spreng et al., 2010; Fornito et al., 2012; Cocchi et al., 2013). This literature suggests high-EF individuals could be in cognitive states during resting-state scans that differentiate them from low-ability individuals – for example, simulating, planning, or rehearsing rules for cognitive tasks that are part of the testing session. Future work could utilize experience sampling or experimental manipulations of task instructions/order to rule out these possible mechanisms. Another limitation of the current study is the quantification of resting-state connectivity in a static manner. Dynamic connectivity methods are an alternative that measures changes in network connectivity over the course of a resting-state scan rather than as a single summary of the entire scan (Allen et al., 2014; Dixon et al., 2017). Preliminary work in this area suggests dynamics may be related to individual differences in cognitive abilities (Liu et al., 2017; Nomi et al., 2017), so future work could build upon the current study by measuring these dynamics in relation to multiple EF factors. Finally, although we utilized a large sample for investigating the neural basis of individual differences in EFs, we will be able to more accurately estimate genetic and environmental influences on the behaviorally relevant brain signals described in the current work as we increase the sample size of this ongoing study.

In summary, using a large sample of twins with thoroughly measured EF abilities, we have provided a new perspective on cognitively relevant signals in the connectome, particularly connectivity among large-scale networks with broadly defined higher-level cognitive functions. Our work suggests "hypoconnectivity" or "anticorrelation" of functional networks may be an important indicator of skill/ability. Individuals who are better able to fluidly shift between mental task sets had more negatively correlated default and ventral attention networks than their less skilled peers.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Institutional Review Board of the University of Colorado Boulder. The protocol was approved by the Institutional Review Board of the University of Colorado Boulder. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

# AUTHOR CONTRIBUTIONS

MB, NF, and AR designed the research. AR and NF performed the analysis. All authors wrote the manuscript.

# FUNDING

This research was supported by NIH grant MH063207.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01600/full#supplementary-material

# REFERENCES

Anticevic, A., Cole, M. W., Murray, J. D., Corlett, P. R., Wang, X.-J., and Krystal, J. H. (2012). The role of default network deactivation in cognition and disease. Trends Cogn. Sci. 16, 584–592. doi: 10.1016/j.tics.2012.10.008

Banich, M. T. (2009). Executive function: the search for an integrated account. Curr. Dir. Psychol. Sci. 18, 89–94. doi: 10.1111/j.1467-8721.2009.01615.x

Chang, L. J., Yarkoni, T., Khaw, M. W., and Sanfey, A. G. (2013). Decoding the role of the insula in human cognition: functional parcellation and large-scale reverse inference. Cereb. Cortex 23, 739–749. doi: 10.1093/cercor/bhs065

Choe, A. S., Jones, C. K., Joel, S. E., Muschelli, J., Belegu, V., Caffo, B. S., et al. (2015). Reproducibility and temporal structure in weekly resting-state fMRI over a period of 3.5 years. PLoS One 10:e0140134. doi: 10.1371/journal.pone.0140134


of resting-state functional connectivity. JAMA Psychiatry 72, 603–611. doi: 10.1001/jamapsychiatry.2015.0071



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Reineberg, Gustavson, Benca, Banich and Friedman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# WM in Adolescence: What Is the Relationship With Emotional Regulation and Behavioral Outcomes?

#### Chiara Malagoli\* and Maria Carmen Usai

Department of Education, University of Genoa, Genoa, Italy

Adolescence is a fundamental transition phase, marked by physical, social, cognitive and emotional changes. At this stage in development two contrasting phenomena take place: brain changes cause a sensitivity to emotional aspects (Dahl, 2004), while control processes also register impressive improvements (e.g., Hooper et al., 2004; Best and Miller, 2010). This study aims to investigate the relationship between a core cognitive feature such as working memory (WM) (Diamond, 2013) and complex abilities such as emotion regulation (ER) and behavioral self-reported outcomes using a structural equation model approach. A sample of 227 typically developing adolescents between 14 and 19 years of age (148 females; mean age in months 202.8, SD 18.57) participated in this study. The following tasks and self-reports were administered in a 45-min test session at school: Symmetry Span task (Kane et al., 2004), Reading Span task (Daneman and Carpenter, 1980), Mr. Cucumber (Case, 1985); Youth Self-Report (YSR, 11–18 years, Achenbach and Rescorla, 2001); Difficulties in Emotion Regulation Scale (DERS, Gratz and Roemer, 2004; Italian version by Giromini et al., 2012). Results showed that difficulties in ER correlated with WM: high levels of ER difficulties were associated with low WM efficiency, while no significant contribution of these predictors was observed on externalizing or internalizing symptoms. This study thus shows a significant relationship between self-reported difficulties in ER and WM, but no significant contribution of the considered predictors to the outcomes, adding knowledge about how behavioral and emotional self-reported outcomes may relate to these processes.

Keywords: working memory, emotional regulation, behavioral outcomes, adolescence, individual differences

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Daniel Romer, University of Pennsylvania, United States; Anne Bonnefond, BP (France), France

\*Correspondence: Chiara Malagoli chiara.malagoli@edu.unige.it

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 30 January 2018 Accepted: 11 May 2018 Published: 29 May 2018

#### Citation:

Malagoli C and Usai MC (2018) WM in Adolescence: What Is the Relationship With Emotional Regulation and Behavioral Outcomes? Front. Psychol. 9:844. doi: 10.3389/fpsyg.2018.00844

# INTRODUCTION

Adolescence is a special time in development. Dahl (2004) defines adolescence as "The developmental interval that encompasses the body and brain changes of puberty." During this time, two apparently contrasting developmental phenomena occur. On one hand, brain changes cause a sensitivity to emotional aspects of experiences that influence the increase in emotional arousal and such phenomena as sensation-seeking, risk-taking, increased conflict with parents, increased mood volatility and a particular increase in negative emotions (Dahl, 2004). On the other hand, cognitive processes, particularly cognitive control functions, register impressive improvements exhibited in such abilities as abstract thought, organization, decision-making, planning, rule management, and flexible adaptation to different contexts (Johnson, 2000; Anderson et al., 2001; Hooper et al., 2004; Best and Miller, 2010). These developmental outcomes are apparently in opposition to each other because the ability to generate plans and reason in the abstract would not explain the risk-taking, sensation-seeking and recklessness, or simply the general impulsiveness, that is typically observed during adolescence. This situation is particularly interesting, since adolescents seem to have all of the tools necessary to evaluate situations correctly and plan actions. Many aspects of decision-making appear adult-like, but adolescent decisions may still not be coherent, and as a consequence, it is possible to observe maladaptive or even dangerous behaviors.

In the literature, most studies on adolescence focus on extreme clinical outcomes, such as conduct disorders (e.g., Kim et al., 2001), drug and alcohol abuse (for a review, see Peeters et al., 2015; Kim-Spoon et al., 2017), and aggressive behaviors, such as bullying (e.g., Hamama and Ronen-Shenhav, 2013; Pouwels et al., 2017). Fewer investigations have concerned typical adolescence and the impact of higher-order cognitive features, such as working memory (WM), on the complex relationships between difficulties in emotion regulation (ER) and behavioral outcomes, with both being considered not as clinical issues but as normally challenging features of adolescence. The present study aims to fill this gap in the literature by investigating these aspects in a typical population of adolescents, with particular attention to individual differences in difficulties in ER, WM efficiency, and behavioral outcomes.

# Emotion Regulation During Adolescence

During adolescence, everyday life situations characterized by strong affective stimuli, as mentioned, often result in enhanced emotional outcomes. Although adolescents generally show a more mature and refined awareness of emotions than children, the control functions they exhibit often prove to be unsatisfactory (Casey et al., 2008). This phenomenon has been addressed by examining a variety of possible developmental reasons, such as hormonal activation, different brain development in regions that underlie this imbalance (Giedd et al., 1996, 1999; Sowell et al., 1999; Gogtay et al., 2004; Barnea-Goraly et al., 2005) and, in particular, elevated activity in the ventral striatum observed during adolescence (Ernst et al., 2005; Galvan et al., 2006; Delgado, 2007; Van Leijenhorst et al., 2010; Sescousse et al., 2013), which seems to influence such processes as risk-taking and decision-making.

Moreover, adolescence is a period in which the peer group becomes more central and friendship also acquires importance (Rubin et al., 2008). From this perspective, the open question is whether failures in ER, dangerous behaviors and rule-breaking may be due to a specific difficulty in regulating emotions and behaviors, or whether being risky and impulsive may also be mediated by individual and specific features that may or may not also be elicited by the social environment. In terms of ER, defined as the process that elicits the onset, offset, magnitude, duration, or quality of one or more features of the emotional response (Gross, 1998; Gross and Thompson, 2007), more or less adaptive abilities may also be linked to individual differences in more cognitive forms of ER. In fact, less efficient forms of this specific type of regulation have been connected to poor psychological well-being (Balzarotti et al., 2016). Indeed, as reported by Ahmed et al. (2015), this specific period of life has been found to be related to an enhanced occurrence of internalizing and externalizing problems (Spear, 2000; Paus et al., 2008; Lee et al., 2014). This finding suggests that adolescents may be vulnerable to emotional dysregulation, which may not only cause maladaptive outcomes but also affect cognitive processes, including WM, that undergo development during adolescence (Somerville and Casey, 2010; Sebastian et al., 2011; Blakemore and Robbins, 2012; Dumontheil, 2014) and may be particularly challenged in more complex everyday life situations.

# Working Memory in Adolescence

Working memory refers to a system that can maintain and process information simultaneously (Engle et al., 1992; Oberauer et al., 2016). WM is also defined as the ability to maintain representations of recently experienced or recalled information over a short period of time (Curtis and D'Esposito, 2003). Therefore, WM is required for the optimal performance of goal-directed behaviors (Diamond, 2013; for a discussion, see Morra et al., 2018). Important changes occur in WM during development, and WM task performance and latent structure have been shown to evolve throughout middle and late adolescence (Malagoli and Usai, 2018), following a protracted course of development into young adulthood (Huizinga et al., 2006). Individual differences in WM capacity are correlated with a variety of cognitive and social outcomes, including school performance (Gathercole et al., 2004; Dumontheil and Klingberg, 2012; Finn et al., 2014). Research has shown that WM filtering ability – the ability to filter extraneous or distracting information from WM during encoding – is strongly associated with overall WM capacity and accuracy (Vogel et al., 2005; Peverill et al., 2016).

# Which Relationship Between WM and ER?

Considering WM specifically, many laboratory studies have reported that WM performance is affected by emotional stimuli in both positive and negative ways. Positive effects include a more vivid memory for emotional pictures (Canli et al., 2000), emotional word-lists (Jones et al., 1987; Dietrich et al., 2001), or humor (Schmidt and Williams, 2001), while there are other situations in which the most adaptive behavior is actually to ignore emotional information and attempt not to be affected by it.

Romer et al. (2009, 2011) performed a large cohort study investigating WM ability in a sample of early adolescents (n = 387, ages 10–12 at baseline). In three annual assessments, these researchers examined models in order to understand the trajectory of weak WM, early manifestations of externalizing problems, and heightened levels of trait impulsivity. Participants were tested with a computerized battery of tasks to assess WM, cognitive control, and reward processing, plus an audio-guided computerized self-interview for impulsivity and risk behaviors and a self-report questionnaire to evaluate externalizing and internalizing difficulties (YSR; Achenbach and Rescorla, 2001). The Romer et al. (2009) study found WM to be only indirectly related to externalizing behavior, through its relationship with acting without thinking. In a second study with the same sample (Romer et al., 2011), the authors found that WM prospectively predicted reduced externalizing behavior both directly and indirectly, with a mediation effect of acting without thinking. In addition, WM positively predicted sensation seeking, which was positively related to externalizing behavior even when controlling for acting without thinking. This research suggests that WM may have a positive relationship with externalizing behavior that is mediated by more complex aspects such as sensation seeking.

These studies show the importance of WM in particular as a cognitive factor that can relate not only to behavioral outcomes but that may also explain more emotional forms of regulation, such as sensation-seeking and risk-taking. However, these studies do not address another research question of how WM may be related to specific difficulties in ER in this sensitive time of development.

# Driven or Non-driven Behaviors, WM and ER: What Are the Implications in Everyday Life?

ER comprises both more automatic features and more effortful ones: the ability to voluntarily guide behavior and make an effort to direct the flow of thoughts and emotions in a goal-directed way is essential to mature decision-making, while more automatic forms of ER are important and useful for directing the flow of emotions and thoughts in more familiar situations. From this perspective, Gyurak et al. (2011) state that implicit processes may be evoked automatically by the stimulus itself and completed without active monitoring. Cognitive control processes, such as WM, allow us to voluntarily guide our behavior and support both forms of regulation, but explicit forms in particular need this kind of cognitive control. While adolescents can demonstrate refined voluntary behavior, the ability to maintain this attitude consistently continues to improve during adolescence; for this reason, cognitive control features are particularly interesting for investigating the vulnerabilities of this period. In this sense, and in particular at this age, ER may have a fundamental role in controlling impulses and behaviors. The ability to manage emotions, as mentioned, is a relatively voluntary, effortful and deliberate process that attempts to outbalance more spontaneous emotional responses. Finally, ER allows people to enhance, maintain, or reduce both negative and positive emotions. Consistent with these features, ER often implies some adjustments in emotional responding. Ironically, these emotional adjustments may not achieve the individual's goal of a particular emotional state (e.g., trying to switch from anxious to calm), and these "defeats" may also result in the strong emotional outcomes that people would usually like to disguise (Wegner et al., 1993), so that they exhibit the emotions they actually wanted to hide regardless of their efforts.
In these specific circumstances, the natural salience of emotional stimuli and the human tendency to process them transform these episodes into strong interferences in competition for cognitive resources with more relevant information (Ellis and Ashbrook, 1988), often resulting in decreasing performance on the task in action (Dolcos and McCarthy, 2006; Dolcos et al., 2008; Anticevic et al., 2010; Chuah et al., 2010; Denkova et al., 2010). Considering the enhanced emotional arousal documented in adolescence (Dahl, 2004), investigating how difficulties in emotional regulation may be related to WM may be particularly useful in order to better understand the vulnerabilities of this developmental stage.

# THE PRESENT STUDY

The data illustrated in this paper are part of a larger study investigating cognitive processes in adolescence (Malagoli and Usai, 2018). The aim of this study is to explore the relationship between specific self-reported difficulties in ER, WM and behavioral outcomes, adding knowledge regarding this topic and contributing to the analysis of individual differences in typical development. Due to the importance that WM has in predicting stronger regulation abilities (e.g., rule management, updating of useful information) (Diamond, 2013) and the prolonged development that both WM and the ability to regulate emotion show (Dahl, 2004; Huizinga et al., 2006), we expect difficulties in ER to be associated with WM performance (Dietrich et al., 2001; Jones et al., 1987) and to be related to behavioral outcomes (Anticevic et al., 2010). We also expect WM to be related to specific behavioral outcomes (Dahl, 2004; Dolcos and McCarthy, 2006; Dolcos et al., 2008; Anticevic et al., 2010; Chuah et al., 2010; Denkova et al., 2010).

# MATERIALS AND METHODS

# Participants

A sample of 240 high school students aged 14 to 19 years (158 females) participated in this study. Participants were excluded if they did not speak Italian as their first language or had been diagnosed with any disease (e.g., learning disabilities) or with a neurological (e.g., brain infection) or psychiatric disorder. Eight participants were excluded due to learning disabilities, four were excluded for not having Italian as their first language, and one was excluded due to neurological issues. The final sample included 227 participants (148 females; mean age in months 202.8, SD 18.57).

# Materials and Procedure

We administered one 45-min test session in a quiet room that was provided by the school. A symmetry span task, a reading span task and the Mr. Cucumber task were administered by a trained experimenter, in the order listed. Participants were asked to fill out the questionnaires during the session and to return them at the end of the administration day.

#### Self-Report Measures

Difficulties in Emotion Regulation Scale (DERS, Gratz and Roemer, 2004; Italian version by Giromini et al., 2012). The DERS is a self-report questionnaire composed of 36 items developed to assess severe difficulties in ER abilities. Scores are provided for six scales: Non-acceptance of Emotional Responses (Non-acceptance, 6 items), Difficulties Engaging in Goal-Directed Behavior (Goals, 5 items), Impulse Control Difficulties (Impulse, 6 items), Lack of Emotional Awareness (Awareness, 6 items), Limited Access to ER Strategies (Strategies, 8 items), and Lack of Emotional Clarity (Clarity, 5 items). Participants respond on a 5-point Likert scale ranging from 1 (almost never) to 5 (almost always). To determine the internal consistency of the questionnaire, Cronbach's alpha was calculated for the total DERS score and for each subscale. Cronbach's alphas were larger than 0.70 for all of the scales: Non-acceptance 0.73, Goals 0.85, Impulse 0.85, Awareness 0.72, Strategies 0.89, Clarity 0.84 and DERS Total 0.92.
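For readers less familiar with the internal-consistency index reported above, Cronbach's alpha for a scale of k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). The following is a minimal sketch of that computation; the responses are made-up 5-point Likert values and the function name is hypothetical, so it illustrates the formula rather than the software or data used in the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item-score lists.

    Every inner list must contain the same number of items (k >= 2).
    """
    k = len(responses[0])                       # number of items in the scale
    item_columns = list(zip(*responses))        # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical data: 5 participants x 4 items, scored 1 (almost never) to 5 (almost always)
toy_responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 5, 4, 4],
    [5, 4, 5, 5],
]
print(round(cronbach_alpha(toy_responses), 2))
```

Values close to 1 indicate that the items vary together across participants; the 0.70 threshold mentioned in the text is the conventional lower bound for acceptable consistency.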

Youth Self-Report (YSR, 11–18 years, Achenbach and Rescorla, 2001; Italian version as available on the http://www.aseba.org website). This questionnaire is a screening measure for behavioral and emotional difficulties in children and adolescents. The YSR is part of the Achenbach System of Empirically Based Assessments (ASEBA). The 2001 revised YSR comprises 112 items referring to the previous six months. Participants are asked to indicate how often a certain behavior applies to them on a three-point scale (0 = absent, 1 = occurs sometimes, 2 = occurs often). Scores are provided on eight subscales: Anxious/Depressed, Withdrawn/Depressed, Somatic Complaints, Social Problems, Thought Problems, Attention Problems, Rule-Breaking Behavior, and Aggressive Behavior. Subscales are clustered in order to identify an individual's externalizing or internalizing profile. The Internalizing profile results from the Anxious/Depressed, Withdrawn/Depressed, and Somatic Complaints scores, and the Externalizing profile results from the Rule-Breaking Behavior and Aggressive Behavior scores. Cronbach's alphas were 0.70 or larger for all the scales: Anxious/Depressed 0.74, Withdrawn/Depressed 0.75, Somatic Complaints 0.73, Social Problems 0.71, Thought Problems 0.72, Attention Problems 0.73, Rule-Breaking Behavior 0.73, and Aggressive Behavior 0.70.

#### Working Memory Tasks (for an Extensive Description See Malagoli and Usai, 2018)

Symmetry span task (SymmSpan; Kane et al., 2004). This task is a complex measure of WM capacity composed of two different tasks that are performed at the same time. The first task consisted of recalling a sequence of squares that turned red on a matrix that appeared on the screen, while the second consisted of judging figure symmetry. The tasks were clustered together in sets of two to five. Every square presentation was followed by a symmetry problem. For example, a set of size two comprised two squares and two symmetry problems presented in alternation, and all of the squares had to be recalled at the end of the set. A 4×4 matrix of squares appeared in the center of the screen, and one of the squares turned red. Then, the symmetry judgment task was presented in an 8×8 matrix, with some squares filled in black, and the participants decided whether the black-square design was symmetrical along its vertical axis. Set sizes ranged from two to five symmetry–memory matrices per trial (for 12 trials total). The presentation of sets was sequential for both the square and the symmetry problems. The participants were instructed to recall the whole sequence of squares in the correct order and to maintain at least 85% accuracy on the symmetry trials. Three controls appeared on the screen and were available during recall: "blank" to mark a square that could not be recalled, "clear" to delete the sequence and attempt it again, and "exit" to go to the next set. These controls were activated by the participants themselves using a touchpad. The participants had an unlimited amount of time to recall all of the squares. The computer calculated each participant's mean RT on the symmetry problems in the practice phase for use in the test phase. The mean RT was reported to the participant after the practice phase. Feedback was provided at the end of each set, informing the participants of their performance accuracy.

One practice block was presented for the square task (four square sets), one for the symmetry task (15 symmetry problems) and one for the combined task (three sets with three square tasks and three symmetry problems). The test phase was composed of three sets of three combinations, three sets of four combinations and six sets of five combinations. The dependent variable was the absolute span score, which was computed using the traditional absolute span scoring method. This score was the sum of all perfectly recalled sets. For example, if an individual correctly recalled two squares in a set of two, three squares in a set of three, and three squares in a set of four, their SPAN score would have been five (2 + 3 + 0). A split-half reliability procedure was performed for this task. The Spearman-Brown coefficient was 0.91.
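The absolute span scoring rule described above can be sketched in a few lines. This is an illustrative implementation under the stated rule only (a set contributes its size if, and only if, every item in it was recalled in the correct position); the function name and input format are assumptions, not the scoring software used in the study.

```python
def absolute_span_score(trials):
    """Sum the sizes of all perfectly recalled sets.

    trials: iterable of (set_size, n_items_recalled_in_correct_position) pairs.
    """
    return sum(size for size, n_correct in trials if n_correct == size)


# Worked example from the text: 2 of 2, 3 of 3, and 3 of 4 squares recalled
example_trials = [(2, 2), (3, 3), (4, 3)]
print(absolute_span_score(example_trials))  # 5, i.e., 2 + 3 + 0
```

Because partially recalled sets earn nothing, the absolute score is stricter than partial-credit scoring schemes, which would award points for the three squares recalled in the set of four.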

Reading span task (RSPAN; Daneman and Carpenter, 1980). This span task is structurally identical to the previous one but uses different stimuli. The task consisted of recalling a sequence of letters that appeared on the screen while the participants judged whether some phrases made logical sense. The two tasks were clustered in sets (ranging from two to seven). Every letter in each set was followed by a phrase problem. The participants were asked to recall the entire sequence of letters in the correct order at the end. The letters appeared in the center of the screen one by one. The sentences also appeared in the center of the screen. Each sentence consisted of 10–15 words. Letter practice used sequential selection. The letter-sentence practice and the letter-sentence test used random selection without replacement for each sequence. When presented with the recall cue, the participant recalled each letter from the preceding set, in the order in which the letters had appeared, by selecting them with the touchpad from a matrix of 12 alternatives. The set sizes ranged from two to five sentence–letter problems per trial (for 12 trials total). The three controls ("blank," "clear," and "exit") were kept the same. The participants were given all of the time that they needed to recall the letter sequence. The computer calculated the participants' mean RT during the sentence problem practice phase, and this personalized mean time was the maximum time that the participants had to solve sentence problems during the test phase. Feedback was also provided at the end of each phase. The phases were the same as in the previous task and consisted of letter practice (two sets of two and two sets of three), phrase practice (15 letters), and combined practice (two sets of two and three sets of three letter/phrase combinations). Finally, a combined test phase composed of three sets each of three, four, five, and seven letter and/or phrase combinations was administered. The participants received feedback as they did in the previous task. The participants were instructed to try to be precise in recalling the letter sequence and to attempt to maintain at least 85% accuracy in judging the sentences. The absolute span score (i.e., the sum of all of the perfectly recalled sets, RSPAN) was used. A split-half reliability statistic was calculated for this task. The Spearman-Brown coefficient was 0.91.
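The split-half reliabilities reported for the two span tasks rest on the Spearman-Brown correction, which steps up the correlation r between two half-test scores to the reliability of the full-length test: 2r / (1 + r). A minimal sketch follows; the function names are hypothetical, and because the study reports only the corrected coefficient, the half-test correlation is derived here purely for illustration.

```python
def spearman_brown(r_half: float) -> float:
    """Step up a half-test correlation to the full-test reliability."""
    return 2 * r_half / (1 + r_half)


def implied_half_correlation(r_full: float) -> float:
    """Inverse of the correction: half-test correlation implied by a corrected coefficient."""
    return r_full / (2 - r_full)


# Half-test correlation implied by the reported coefficient of 0.91
print(round(implied_half_correlation(0.91), 3))
```

In other words, a corrected coefficient of 0.91 corresponds to a correlation of roughly 0.83 between the two halves of the task.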

Mr. Cucumber task (Case, 1985). This task is a classic visual-spatial span measure. The outline of a complex figure (an extraterrestrial) was presented, to which colored stickers were applied. This non-computerized task comprised three practice items and a test phase with eight levels, each comprising three items. The subjects were able to watch the figure for 5 s until the fifth level, and the stickers appeared for the last three levels. The task required the participants to recall the position of all of the stickers by pointing to their locations on a figure without stickers. A thick sheet of paper depicting a grid was shown in the interval before the recall figure was presented, to avoid any contribution of iconic memory. A point was given for each level that was fully correctly recalled. One-third of a point (0.33) was given for each correct item beyond that level. The test was discontinued if the participants failed all three items in the same level. The dependent measure was the score that was obtained (expected range 0–8). The Cronbach's alpha was 0.88.
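The scoring rule for this task can be read in more than one way, so the sketch below fixes one interpretation as an explicit assumption: a level with all three items correct earns one point, a level with one or two correct items earns one-third of a point per correct item, and testing stops at the first level on which all three items are failed. The function name and input format are hypothetical; this is not the authors' scoring procedure.

```python
from fractions import Fraction


def mr_cucumber_score(levels):
    """Score a protocol given as a list of levels, each a list of three booleans.

    Assumed reading of the rule: 1 point per fully correct level, 1/3 point per
    correct item on partially correct levels, stop when a whole level is failed.
    """
    score = Fraction(0)
    for items in levels:
        n_correct = sum(items)
        if n_correct == 0:
            break                              # discontinue: all three items failed
        if n_correct == len(items):
            score += 1                         # level fully recalled
        else:
            score += Fraction(n_correct, 3)    # partial credit per correct item
    return score


# Hypothetical protocol: two perfect levels, one partial level, then a failed level
protocol = [
    [True, True, True],
    [True, True, True],
    [True, False, True],
    [False, False, False],
    [True, True, True],   # never administered under the discontinue rule
]
print(float(mr_cucumber_score(protocol)))
```

Under this reading, a participant who passes all eight levels obtains the maximum of 8, in line with the expected range given in the text.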

# Data Analysis

Descriptive statistics (i.e., means, standard deviations, possible score ranges, skewness and kurtosis) and zero-order and partial (Pearson) correlations controlling for age were calculated. Outlier values that deviated from the mean by more than three standard deviations were excluded from the analyses. In addition, nine values were excluded from the memory scores because the participants did not maintain at least 85% accuracy. The excluded values represented 1.8% of the full sample. A series of Structural Equation Models (SEM) were conducted based on raw data using Mplus 7.4 software (Muthén and Muthén, 1998–2010). Maximum likelihood with robust standard errors (MLR) was used as the estimator. The full information maximum likelihood approach was used to handle missing data (Collins et al., 2001). The fit of each model to the data was evaluated by examining multiple fit indices (Schermelleh-Engel et al., 2003), including the χ<sup>2</sup> statistic, the root mean square error of approximation (RMSEA), the standardized root mean squared residual (SRMR) and the Bentler comparative fit index (CFI). The χ<sup>2</sup> test was used to evaluate the appropriateness of the SEM model. The RMSEA measured the precision with which the covariances predicted by the model matched the actual covariances (approximate fit in the population). RMSEA values ≤0.05 represented a good fit, values between 0.05 and 0.08 an adequate fit, values between 0.08 and 0.10 a mediocre fit, and values greater than 0.10 were inadmissible (Browne and Cudeck, 1993). The SRMR was the square root of the averaged squared residuals (i.e., the differences between observed and predicted covariances). SRMR values <0.10 were acceptable, whereas values less than 0.05 were considered a good fit (Schermelleh-Engel et al., 2003). The CFI compared the covariance matrix that was predicted by the model with the observed covariance matrix and compared the null model with the observed covariance matrix.
A CFI value greater than 0.97 indicates a good fit, whereas values greater than 0.95 represent an acceptable fit (Schermelleh-Engel et al., 2003). An SEM model was tested considering difficulties in ER and ability in WM as latent predictor variables. The WM latent predictor was tested in a previous study (Malagoli and Usai, 2018), whereas the ER latent variables were based on an exploratory factor analysis (EFA) using principal axis factoring as the extraction method and varimax rotation of the factor structure. In the SEM model, the YSR subscales were grouped into two latent variables representing internalizing and externalizing problems. As suggested by Achenbach and Rescorla (2001), the Anxious/Depressed, the Withdrawn/Depressed, and the Somatic Complaints subscales load on the internalizing factor, whereas the Rule-Breaking Behavior and the Aggressive Behavior subscales load on the externalizing factor. The scores of the remaining three subscales were entered in both of the aforementioned latent factors.
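The screening rule and the fit criteria described in this section can be expressed compactly. The sketch below is illustrative only: it assumes that the three-standard-deviation rule uses the sample mean and sample standard deviation of the full set of scores, and it encodes the cut-offs exactly as they are stated above. The function names and the labels returned below the acceptable range are hypothetical.

```python
from statistics import mean, stdev


def exclude_outliers(values, n_sd=3.0):
    """Drop values lying more than `n_sd` sample SDs from the sample mean."""
    m, sd = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= n_sd * sd]


def classify_rmsea(value):
    """Cut-offs attributed above to Browne and Cudeck (1993)."""
    if value <= 0.05:
        return "good"
    if value <= 0.08:
        return "adequate"
    if value <= 0.10:
        return "mediocre"
    return "inadmissible"


def classify_srmr(value):
    """Cut-offs attributed above to Schermelleh-Engel et al. (2003)."""
    if value < 0.05:
        return "good"
    if value < 0.10:
        return "acceptable"
    return "not acceptable"


def classify_cfi(value):
    """Cut-offs attributed above to Schermelleh-Engel et al. (2003)."""
    if value > 0.97:
        return "good"
    if value > 0.95:
        return "acceptable"
    return "not acceptable"
```

Keeping one small function per index makes the thresholds easy to audit against the cited sources and to adjust if other conventions are preferred.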

# RESULTS

The descriptive statistics are summarized in **Table 1**. Correlations among the measures are summarized in **Table 2**. All WM tasks show a pattern of significant correlations that remain when controlling for age. DERS subscales are significantly and moderately associated with each other, as are the YSR subscales. The pattern of associations between the different groups of measures is restricted to a few significant correlations. Considering the associations between the WM measures and the questionnaire subscales, the symmetry span task is negatively correlated with the DERS total score and with three DERS subscales: Lack of Emotional Awareness, Limited Access to ER Strategies, and Lack of Emotional Clarity. Moreover, the symmetry span task significantly correlates with the YSR – Aggressive Behavior subscale. The correlations between the symmetry span task and both the DERS – Lack of Emotional Awareness and the YSR – Aggressive Behavior subscales are not significant when controlling for age. Considering the pattern of associations between the questionnaires, the DERS – Difficulties Engaging in Goal-Directed Behavior is significantly associated with YSR – Somatic Complaints, and the DERS – Impulse Control Difficulties positively correlates with YSR – Social Problems, and these correlations remain significant when controlling for age.
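The distinction between zero-order correlations and correlations controlling for age can be illustrated with the standard first-order partial correlation formula. The sketch below is not the authors' analysis code; the function names are hypothetical, and `x`, `y` and `age` stand for any two measures and the control variable.

```python
from statistics import mean


def pearson_r(x, y):
    """Zero-order Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def partial_r(x, y, age):
    """Correlation between x and y after partialling out age.

    Uses r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, age)
    r_yz = pearson_r(y, age)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5
```

When two measures are related only through their shared association with age, the zero-order correlation can be sizeable while the partial correlation approaches zero, which is the pattern reported above for the symmetry span task and the YSR – Aggressive Behavior subscale.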

The EFA extracted two factors that account for 51% of the total variance in the DERS. The DERS subscales load mainly on factor 1 (35% of the total variance): Non-acceptance (factor loading = 0.686), Goals (factor loading = 0.620), Impulse (factor loading = 0.686), and Strategies (factor loading = 0.801). Awareness and Clarity subscales load on factor 2 (16%; factor loadings 0.554 and 0.755, respectively). The two factors are labeled difficulties in emotion response (EM\_R) and difficulties in emotion knowledge (EM\_K), respectively. Indeed, factor 1 (EM\_R) consisted of subscales measuring the difficulties in managing a response to an emotional elicitation. Factor 2 (EM\_K) included two subscales measuring the difficulties in understanding an emotional state.

#### TABLE 1 | Descriptive statistics.

Symm\_Span = Symmetry complex span; RSPAN = Reading Span.

Considering the SEM analysis, a full factorial model controlled for age and gender, with three latent variables as predictors and two latent factors as outcomes representing internalizing and externalizing problems, was tested. The error-term squares were considered to be estimates of the unexplained variance for each measure. The fit indices were good or acceptable, except for the chi-square; however, this statistic is very sensitive to sample size, and with large samples (greater than 200) authors suggest relying upon other indices (Schermelleh-Engel et al., 2003): χ<sup>2</sup> = 165.447, df = 107, p < 0.001, RMSEA = 0.049 [90% CI = 0.034–0.063], SRMR = 0.053, and CFI = 0.947. The model complete with the standardized solution is illustrated in **Figure 1** and the parameters are shown in **Table 3**. Two out of six DERS subscales, i.e., Awareness and Clarity, load on a latent variable named difficulties in ER knowledge (EM\_K). The other four DERS subscales (Non-acceptance, Goals, Impulse, and Strategies) load on a latent variable named difficulties in ER response (EM\_R). The three WM measures load on the latent variable WM. The factor loadings for these latent variables are all significant (t-values >2, **Table 3**). The three latent predictors correlate with each other. The eight YSR measures load on the two Internalizing and Externalizing latent factors (t-values >2, **Table 3**). The EM\_R, the EM\_K, and the WM latent variables are considered as predictors of the two YSR latent factors, but none of these contribute significantly. The model shows that difficulties in ER are significantly associated with WM. The association is negative, indicating that high levels of ER difficulties on knowledge and response are associated with low WM ability.

Age and gender significantly influenced a few measures. Higher Awareness and Clarity difficulties were shown by the youngest individuals (and vice versa), who also tended to report fewer problems on the Rule-Breaking Behavior YSR subscale. With regard to gender, females reported more difficulties on the Clarity and Strategies DERS subscales and more Somatic Complaints in the YSR self-report.

# DISCUSSION

This study investigates the relationships between difficulties in ER, WM and self-reported aspects of behavioral outcomes in typically developing adolescents and young adults. In particular, it considers the cognitive features of ER and WM and examines whether these components of self-regulation are associated with non-adaptive outcomes in a typically developing population.

# Analysis of Correlations: Pattern in WM Tasks, DERS and YSR

A preliminary Pearson correlation analysis indicated that all WM tasks were correlated among themselves, as were the DERS and YSR subscales. Pearson correlations also showed a negative correlation between the symmetry span task and both the DERS total score and three DERS subscales: Lack of Emotional Awareness, Limited Access to ER Strategies and Lack of Emotional Clarity. These relationships between visual WM and specific difficulties in awareness, strategies and clarity are particularly interesting considering the power that emotion has either to fix information (Jones et al., 1987; Dietrich et al., 2001; Schmidt and Williams, 2001) or to interfere with its recall (Dolcos and McCarthy, 2006; Dolcos et al., 2008; Anticevic et al., 2010; Chuah et al., 2010; Denkova et al., 2010). The symmetry span task also correlates significantly with the YSR – Aggressive Behavior subscale, but this association, as well as the association with the DERS – Lack of Emotional Awareness subscale, was no longer significant when controlling for age.

#### TABLE 2

Considering now the pattern of association between the DERS scale and the YSR self-report, the DERS – Difficulties Engaging in Goal-Directed Behavior subscale is significantly associated with YSR – Somatic Complaints, and the DERS – Impulse Control Difficulties subscale positively correlates with YSR – Social Problems; these correlations remain significant when controlling for age. These associations seem to show an actual connection, specifically between more visual aspects of WM and specific difficulties in ER and behavioral outcomes, and most of them remain significant even when controlling for age, suggesting that this pattern may be typical of the age range. Given these relationships, the SEM analysis enabled us to investigate them further by considering latent variables and possibly different outcomes.

# Association Between WM and ER Components

The SEM analysis considers three latent variables: difficulties in ER knowledge (EM\_K) and in ER response (EM\_R), as perceived by the participants and indicated by the DERS subscales, and a unitary latent dimension for WM abilities. This dual organization of the ER variables is consistent with the process model of ER abilities illustrated by Gross and Thompson (2007): emotion-generative process vs. cognitive reappraisal, which in our model would be reflected by the EM\_R factor and the EM\_K factor, respectively. The unitary WM model has likewise been documented in the existing literature (e.g., Huizinga et al., 2006; McAuley and White, 2011), and this specific model was tested in a previous study that investigated the organization of WM and inhibition during adolescence (Malagoli and Usai, 2018).

Age and gender show a limited influence, in accordance with the literature. With increasing age, the perceived difficulties in understanding the mental states related to an emotion decrease (Zimmermann and Iwanski, 2014). Females, more than males, in addition to reporting difficulties in clearly perceiving emotions, tend to report more problems in ER strategies (Weinberg and Klonsky, 2009). Moreover, females report more frequent somatic problems (Rescorla et al., 2007). After controlling for the aforementioned age and gender influences, the model shows that WM is negatively associated with both dimensions tapping ER difficulties.

This negative relationship may be interpreted, on the one hand, as an indication that emotional difficulties interfere with WM abilities and, on the other hand, as an indication that better WM abilities may reduce the impact of ER difficulties. This result is consistent with the literature on adults and with theories and neuroscientific studies on adolescents that show an association between emotional components, such as difficulties in managing emotions, and more cognitive abilities such as WM, as they could interfere with or modulate each other (Dahl, 2004; Dolcos and McCarthy, 2006; Dolcos et al., 2008; Anticevic et al., 2010; Chuah et al., 2010; Denkova et al., 2010; Bridgett et al., 2013; Hendricks and Buchanan, 2016). Bridgett et al. (2013), in particular, investigated the effects of EF, and particularly of complex features of WM (updating and monitoring), on more specific aspects of self-regulation, such as effortful control, and showed how all these aspects may be differentiated versus integrated in explaining the role of self-regulatory systems in ER.

The literature documents a slower development of subcortical versus dorsal brain regions, and this slower development has often been considered one possible explanation for risk-taking and for difficulties in managing more emotional situations, as this difference in timing could interfere with the cognitive processing of information (Giedd et al., 1996, 1999; Sowell et al., 1999; Dahl, 2004; Gogtay et al., 2004; Hooper et al., 2004; Barnea-Goraly et al., 2005). An interesting perspective that could explain this condition is given by Lewis and Todd (2007). In their theorization, the authors reflect on how difficult it can be to speak separately about cognitive regulation and emotional regulation. On the one hand, it is clear that certain forms of regulation are carried out by executive processes that are top-down and more subject to voluntary control, while others seem to operate in a more "automatic" way and rely on more primitive processes. The key point is that these processes are in constant interaction, and this interaction increases activity that can be both cognitive and emotional. Using the concept of the neuroaxis, Lewis and Todd (2007) analyzed brain activity in a bidirectional way. They imagined these processes as the extremes of a vertical continuum that can be "walked through" both from top-down to bottom-up and vice versa. In this sense, the system that has the larger impact on decision-making processes may change according to the situation, and in particular emotional situations, and behavior may be driven primarily by processes that are more primitive and less influenced by the environment, similar to control processes.

#### TABLE 3 | Factor model parameters.

Uppercase words in the first column refer to the Mplus syntax: BY relates latent factors to their observed variables; ON means that a variable is predicted from one or more other latent or observed variables; WITH indicates a correlation between variables. Symm\_Span = Symmetry complex span; RSPAN = Reading Span.

# Behavioral Outcomes, ER and WM: Which Relationship?

The internalizing and externalizing factors of the YSR appear to be separate yet related, consistent with the latent organization suggested by Achenbach and Rescorla (2001), although no significant direct relation with the ER and WM factors appeared. Our results, in fact, show that difficulties in emotion regulation are associated with WM efficiency, with WM appearing to be affected by more complex aspects of regulation. This specific result is particularly interesting, especially considering the age range investigated and its potential to add knowledge about the connection between two dimensions often treated as separate, emotion regulation and WM, in a time of enhanced emotional arousal and growth of cognitive skills. On the other hand, the absence of a significant direct relation between WM and the YSR may represent a less consistent result with respect to the documented relation between WM and the two ER factors. One possible explanation calls into question the use of the YSR scale with the typically developing population. In fact, the presence of highly polarized items may not reflect a typical expression of internalizing or externalizing tendencies; thus, the YSR scale may be more useful for discriminating extreme behaviors than for capturing individual differences within the continuum of the norm.

Also, the fact that our sample is not a clinical one may have a role in not finding a strong association between these components. While presenting results, most participants indicated that they actually could recognize themselves in Dahl's paradox, meaning they sometimes could not help their feelings and emotions that they felt "all over the place." Experiencing emotions and actually mentalising about them may be two very different issues, especially during adolescence (Dahl, 2004), and the emotional impact of the event may highlight these episodes in their minds so they report them due to their vivid memory of it. With regard to the organization of ER, behavioral outcomes and WM, there is a general lack of studies that directly document similar results in terms of WM efficiency and its relation with specific self-reported aspects of regulation, both behavioral and emotional.

Another possible explanation for this specific result may be found in the hypothesis formulated by Romer et al. (2009), who found that a non-direct relation may exist between cognitive features and some more complex outcomes that are also related to more emotional issues. In fact, according to this hypothesis, in regression analysis impulsivity seems to be related to externalizing behavior, whereas WM shows only an indirect association with externalizing behavior, mediated by impulsivity and more complex aspects of human behavior such as sensation seeking. We are not able to demonstrate this in our sample, as we did not consider sensation seeking in our study, although this interpretation would make sense, as WM and inhibition work together in managing complex everyday situations (Diamond, 2013), and it would be an interesting perspective to consider in future research. In conclusion, the results of the present study confirm the importance of ER difficulties in WM and as a possible explanation for individual differences. In turn, WM is important in managing the impact of emotional elicitations but also in enhancing the awareness of difficulties, in reporting more aggressive behaviors. From this perspective, the awareness from having strong WM may be used as a tool for intervention in the TD population, which is in a delicate time of growth (Crone, 2009; Dahl, 2004), to help them to cope with these statistically normal difficulties and prevent future discomfort or disease.

#### Limitations to the Present Study

Several limitations of this study warrant mentioning. We used self-report questionnaires that were basically developed as screening instruments and that investigate social aspects of regulation indirectly, such as the tendency to "lose control" or "get into a fight," which are specific and social aspects at one extreme. Phenomena such as "going blank" and not being able to complete a task, even when the rules to perform it have already been learned, because emotional arousal or anxiety prevents the individual from accessing that knowledge, were not investigated. Additionally, a debriefing to better understand the three-point responses to the YSR would have been useful. Despite these limitations, the study showed some strengths: it took into account a developmental period that has been less investigated and compared it with adulthood and childhood using a larger and more uniform sample, in terms of age range, than those employed in previous studies (e.g., Romer et al., 2009, 2011). It also offers an investigation of less explored aspects of the relationship between WM, ER and behavioral outcomes.

#### CONCLUSION

In general, this study showed a significant relationship between self-reported difficulties in ER and WM, adding knowledge regarding how behavioral and emotional self-reported outcomes may relate to these processes.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Ethical Code of Italian Psychology Order and of the Ethical guidelines of the Italian Association of Psychology with written informed consent from all subjects. All parents of the subjects gave written informed consent in accordance with the Declaration of Helsinki. At the time we collected the data, no ethics committee was yet available to which we could refer.

# AUTHOR CONTRIBUTIONS

CM and MCU reviewed the literature and conceived and designed the study. CM collected the data. MCU ran the analysis. CM wrote the first draft of the manuscript, which was also revised by MCU. All authors read and approved the final manuscript.

# FUNDING

This study was supported by a doctoral grant awarded to CM by the University of Genoa.





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Malagoli and Usai. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Executive Function and Academic Achievement in Primary School Children: The Use of Task-Related Processing Speed

Rebecca Gordon<sup>1</sup>\*, James H. Smith-Spark<sup>2</sup>, Elizabeth J. Newton<sup>2</sup> and Lucy A. Henry<sup>3</sup>

*<sup>1</sup> Department of Psychology, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom, <sup>2</sup> Division of Psychology, London South Bank University, London, United Kingdom, <sup>3</sup> Division of Language & Communication Science, City, University of London, London, United Kingdom*

Keywords: executive function, working memory, academic achievement, processing speed, updating, attention, inhibition, task-switching

#### Edited by:

*Celine R. Gillebert, KU Leuven, Belgium*

#### Reviewed by:

*Loren Vandenbroucke, KU Leuven, Belgium; Verena Erica Pritchard, Australian Catholic University, Australia*

\*Correspondence: *Rebecca Gordon rebecca.gordon@kcl.ac.uk*

#### Specialty section:

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

Received: *07 February 2018* Accepted: *06 April 2018* Published: *23 April 2018*

#### Citation:

*Gordon R, Smith-Spark JH, Newton EJ and Henry LA (2018) Executive Function and Academic Achievement in Primary School Children: The Use of Task-Related Processing Speed. Front. Psychol. 9:582. doi: 10.3389/fpsyg.2018.00582*

# OVERVIEW

This article argues that individual differences in processing speed are important in the relationship between executive function (EF) and academic achievement in primary school children. It proposes that processing times within EF tasks can be used to predict academic attainment and aid in the development of intervention programmes.

# EXECUTIVE FUNCTION AND ACADEMIC ACHIEVEMENT

Executive function (EF) is an umbrella term for a set of cognitive constructs required when routine behavior is insufficient to achieve a known goal; in such cases, executive control of attention is required (Norman and Shallice, 1980/1986). There is much evidence that this effortful attentional resource is limited (e.g., Schmeichel, 2007), and is used in prioritizing behavior, inhibiting irrelevant or inappropriate actions, maintaining information in short-term memory, filtering out irrelevant stimuli, and switching attention between tasks or rules (Diamond, 2006). Research generally considers task-switching, inhibition and updating<sup>1</sup> to be the core EFs (e.g., Miyake et al., 2000). As such, studies investigating the structure of EFs in children have typically examined these three constructs (e.g., Huizinga et al., 2006; van der Ven et al., 2013).

Research over the past 20 years has shown that children exhibit developmental increases in EF from infancy to adulthood (Anderson, 2002) and that such increases are linked to academic achievement (e.g., Best et al., 2011). For example, studies have linked inhibition to mathematics (Bull and Scerif, 2001), task-switching to reading (e.g., van der Sluis et al., 2007) and mathematics (e.g., Bull and Scerif, 2001), and updating (Van der Ven et al., 2012) or working memory (WM) (Cragg et al., 2017) to mathematics. There is strong evidence to suggest that an understanding of how EF facilitates learning can enable early cognitive deficit identification and subsequent intervention programmes (e.g., Ribner et al., 2017).

<sup>1</sup>Updating has been defined as the cognitive ability to store, monitor and modify information in an accessible state (e.g., Miyake et al., 2000). St Clair-Thompson and Gathercole (2006) assessed children on four WM tasks and two updating measures. It was found that all the tasks loaded together on the same factor. They concluded that measures of WM and updating assess the same underlying construct. Updating in this article is thus considered synonymous with WM.

# VARIABILITY IN METHOD AND FINDINGS

Since the seminal work of Miyake et al. (2000), studies have increasingly looked to latent variable analysis to understand the structure and role of EFs. By calculating the variance shared between tasks purporting to measure a certain EF construct, a latent variable for that construct is created. However, there is variability in the findings from such studies. For example, studies have found a two-factor structure of WM and shifting, wherein inhibition was not identifiable in 7- to 21-year-olds (Huizinga et al., 2006) and 9- to 12-year-olds (van der Sluis et al., 2007). Conversely, other research found that WM and a combination of inhibition and shifting created a two-factor model in 6- to 8-year-olds (van der Ven et al., 2013) and 5- to 13-year-olds (Lee et al., 2013).

Further to this, there has also been concern regarding the reliability of EF measures (see Miyake and Friedman, 2012, for a review). Studies that have found EF to predict school achievement have varied in the methods employed. For example, studies linking inhibition to mathematics have used a single measure to represent this EF (e.g., Bull and Scerif, 2001) or have not used a specific measure of inhibition, but used inference from other assessments, such as the ability to reject irrelevant information in a WM task (e.g., Passolunghi et al., 1999).

To address these methodological issues, latent variable analysis has been used to examine the relationship between EF and academic abilities. When this has been done, a different story starts to emerge than that shown in earlier studies. In a study of 211 7- and 8-year-olds, Van der Ven et al. (2012) found that, after controlling for updating ability, latent constructs for inhibition and task-switching did not predict mathematical performance. Similarly, van der Sluis et al. (2007) examined the contributions of inhibition, task-switching and updating to reading, arithmetic and non-verbal reasoning in 9- to 12-year-olds. No latent inhibition factor was identified, and the task-switching factor predicted only non-verbal reasoning and reading performance. However, updating related to reading, arithmetic, and non-verbal reasoning. In fact, when updating was included as a predictor in such studies, the variance in academic ability explained by inhibition and task-switching was usually no longer significant (see also Toll et al., 2011).

These studies illustrate that there remain unanswered questions regarding the structure of EF, and its relationship with academic attainment. In this article, it is argued that considering the role of processing speed in EF task performance may assist in answering these questions. The basis for this argument lies in the findings of the following studies that investigated issues in EF measurement.

# ADDRESSING ISSUES IN EF MEASUREMENT

Processing speed has been shown to influence the structure of EF. For example, van der Ven et al. (2013) controlled for baseline speed in measures of EF, and used speed scores to indicate inhibition and shifting ability. On the basis of their findings, they argued that variations in the structural organization of EF might be the result of differences in the methodologies used (i.e., controlling or not controlling for speed). Further evidence supports this finding. Huizinga et al. (2006) could not identify an inhibition factor in 9- to 12-year-olds, when controlling for processing speed. In addition, McAuley and White (2011) found that processing speed accounted for significant variance in the developmental trajectory of WM and inhibition. Acknowledging some degree of speculation, they suggested that processing speed may enable faster interpretation of environmental cues which indicate the suitability of certain purposeful behaviors. These studies provide evidence that processing speed is important in EF and are consistent with Fry and Hale (1996, 2000) who argued that processing speed underpins all EF constructs.

Given the varied findings regarding the link between EF structure and academic ability, there is value in investigating how processing speed may influence this relationship. Although studies have investigated this, they have used speeded tasks that sit outside of EF tasks (e.g., Bayliss et al., 2005; Berg, 2008; Passolunghi and Lanfranchi, 2012). However, van der Sluis et al. (2007) looked at the role of processing speed within EF, and how EF then relates to academic ability. They did this to address an important issue in EF measurement, the task impurity problem. This problem arises due to the need for participants to engage other, non-executive, cognitive abilities when completing EF tasks (Burgess, 1997). van der Sluis et al. (2007) examined the structure of EF and its relationship to reading, arithmetic and non-verbal reasoning in 9- to 12-year-olds. Seven tasks were used and performance on each was separated into executive and non-executive components. For example, a non-executive component required rapid naming of a letter and the executive component required naming of the letter dependent on its location within a square. Performance on the simple processing component of the task was separated from performance when there was an executive load. The two performance indices (i.e., accuracy in the executive component and processing time in the non-executive component) were used to predict academic ability. A shifting and an updating factor were identified when controlling for the variance explained by the speeded non-executive task, and updating was linked to reading and mathematics. However, performance on the non-executive speeded components was more strongly related to arithmetic and reading ability than the executive-loaded components.
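The separation of a task into the two performance indices described above can be sketched as follows. This is a hypothetical illustration rather than the procedure used by van der Sluis et al. (2007): the trial format, the field names, the use of the median, and the restriction to correct non-executive trials are all assumptions made for the example.

```python
from statistics import median


def task_indices(trials):
    """Split one task into a non-executive speed index and an executive
    accuracy index.

    `trials` is a list of dicts with the hypothetical keys
    'condition' ('non_executive' or 'executive'), 'rt_ms' and 'correct'.
    """
    simple_rts = [
        t["rt_ms"]
        for t in trials
        if t["condition"] == "non_executive" and t["correct"]
    ]
    executive = [t for t in trials if t["condition"] == "executive"]
    return {
        # processing time on the simple (non-executive) component
        "non_executive_time_ms": median(simple_rts),
        # proportion correct when there is an executive load
        "executive_accuracy": sum(1 for t in executive if t["correct"])
        / len(executive),
    }
```

Each participant would then contribute one pair of indices per task, which could be entered as separate predictors of an academic outcome.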

So far we have discussed the evidence that processing speed is important in EF structure, and that it influences how EF constructs relate to academic achievement. This article now argues that identifying individual differences in processing speeds when there is an executive load can explain the link between EF and academic attainment.

# THE USE OF TASK-RELATED PROCESSING SPEED TO PREDICT ACADEMIC ACHIEVEMENT

There is considerable evidence for links between processing speed, EF and academic achievement. This is, in part, evident in the similar developmental trajectories of the three abilities. Information processing has been shown to develop rapidly from 3 to 5 years of age (Espy et al., 2006), with significant improvements observed in 9- and 10-year-olds (Kail, 1986). This trend is commensurate with the developmental increases in EF (Anderson, 2002; Demetriou et al., 2014) and academic achievement (Best et al., 2011; Demetriou et al., 2014) mentioned previously. Links between processing speed and EF are further supported by early research explaining capacity increases in WM. According to the task-switching (Towse and Hitch, 1995) and resource sharing (Daneman and Carpenter, 1980) hypotheses, a developmental increase in processing speed can explain an enhanced ability to refresh decaying memory items (Towse and Hitch, 1995) or free up storage space (Daneman and Carpenter, 1980). In addition, Bayliss et al. (2005) found that processing speed contributed, in part, to developmental improvements in complex span task performance due to decay prevention and faster reactivation of memory items. Furthermore, the time-based resource-sharing model of WM (Camos and Barrouillet, 2011) argues for the development of an attentional switching capability to explain increases in WM capacity at approximately 7 years of age; and this ability is demonstrated by a linear relationship between processing speed and storage capacity in complex span tasks.

Even when examining WM alone, placing stress on a participant's ability to process information more quickly has resulted in stronger relationships with measures of reading, mathematics and non-verbal reasoning (Lépine et al., 2005). Lépine and colleagues restricted the time available for participants to process stimuli in complex span tasks, before asking them to recall the memoranda related to the task. When comparing performance to that on tasks with no time restrictions, it was found that time-restricted tasks showed stronger links to performance on the measures of reading, mathematics and non-verbal reasoning.

The research discussed in this article provides evidence that individual differences in EF may be underpinned by the speed with which information can be processed when there are executive demands. Furthermore, the relationship between EF and academic abilities is strengthened when time restrictions are placed on the processing component of EF tasks (Lépine et al., 2005). This suggests that individual differences in processing speed during executive control of attention may explain differences in EF, and its relationship with academic achievement. However, the required evidence may only be identifiable if studies unpack the tasks used to measure EF in order to identify underlying mechanisms. Some earlier studies have been successful in adopting this approach, whereby the components of EF measures are extracted and analyzed as predictors of academic abilities. For example, the time taken to recall memoranda in complex span tasks designed to assess WM capacity has been shown to predict reading ability (Cowan et al., 2003; Towse et al., 2008).

# APPLICATION

The purpose of identifying which cognitive constructs influence academic achievement is to enable subsequent intervention programmes (e.g., Ribner et al., 2017). As evidence suggests that processing speed as early as 5 months of age influences long-term EF (see Cuevas and Bell, 2014), it is unlikely that intervention programmes aimed at improving processing speed in primary school would be beneficial. However, as lesson structures in UK primary schools are time-restricted, often to 20-min slots (Qualifications and Curriculum Authority, 2002), this may hinder children with slower processing speeds. If there is a greater awareness of the role of processing speed in tasks that rely on EF, then school intervention programmes can make reasonable time adjustments for children who struggle due to a deficit in this area; similar to interventions which exist for developmental disorders such as dyslexia.

#### SUMMARY

The evidence discussed here provides opportunities to develop a new approach to examining the relationship between EF and academic achievement. Future studies should clarify the role of executive-loaded processing speed in tasks by measuring individual differences in processing times. Using these as predictors of academic attainment may allow identification of children who, due to slower processing speeds, struggle with academic tasks when there is an executive load.

# AUTHOR CONTRIBUTIONS

RG: Conception of article and drafting of manuscript; RG and JS-S: Critical revision of the text; EN and LH: Review of text; RG, JS-S, EN, and LH: Approved the final version of the manuscript.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, LV, and handling Editor declared their shared affiliation.

Copyright © 2018 Gordon, Smith-Spark, Newton and Henry. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Executive Functions and Prosodic Abilities in Children With High-Functioning Autism

#### Marisa G. Filipe<sup>1,2</sup>\*, Sónia Frota<sup>1</sup>† and Selene G. Vicente<sup>2</sup>†

<sup>1</sup> Center of Linguistics, School of Arts and Humanities, University of Lisbon, Lisbon, Portugal, <sup>2</sup> Centre for Psychology, Faculty of Psychology and Education Sciences, University of Porto, Porto, Portugal

Little is known about the relationship between prosodic abilities and executive function skills. As deficits in executive functions (EFs) and prosodic impairments are characteristics of autism, we examined how EFs are related to prosodic performance in children with high-functioning autism (HFA). Fifteen children with HFA (M = 7.4 years; SD = 1.12), matched to 15 typically developing peers on age, gender, and non-verbal intelligence, participated in the study. The Profiling Elements of Prosody in Speech-Communication (PEPS-C) was used to assess prosodic performance. The Children's Color Trails Test (CCTT-1, CCTT-2, and CCTT Interference Index) was used as an indicator of executive control abilities. Our findings suggest no relation between prosodic abilities and visual search and processing speed (assessed by CCTT-1), but a significant link between prosodic skills and divided attention, working memory/sequencing, set-switching, and inhibition (assessed by CCTT-2 and CCTT Interference Index). These findings may be of clinical relevance since difficulties in EFs and prosodic deficits are characteristic of many neurodevelopmental disorders. Future studies are needed to further investigate the nature of the relationship between impaired prosody and executive (dys)function.

Keywords: executive functions, prosody, prosodic skills, high-functioning autism, autism spectrum disorders

# INTRODUCTION

There has been a recent interest in the study of the relationship between executive functions (EFs) and communication skills in typical and atypical development (e.g., Bishop and Norbury, 2005; Ellis Weismer et al., 2005; Im-Bolter et al., 2006; Henry et al., 2012; Vugs et al., 2014). In typical development, a link has been suggested between inhibition and lexical and syntactic disambiguation in children and young adults (Khanna and Boland, 2010). Working memory has been associated with auditory and written sentence comprehension in children and adults (e.g., Daneman and Carpenter, 1980; Roberts et al., 2007) and with sentence production in young adults (Slevc, 2011). In atypical development, difficulties in EFs have been observed in populations with communication impairments. For example, children with specific language impairment tend to have lower scores than typically developing (TD) peers on measures that assess EFs, including inhibition (Bishop and Norbury, 2005; Im-Bolter et al., 2006), task-shifting (Marton, 2008), and working memory (Ellis Weismer et al., 2005; Henry et al., 2012; Vugs et al., 2014). Deficits in EFs have also been observed in other disorders that include communication challenges, such as aphasia (Yeung and Law, 2010), traumatic brain injury (e.g., Sainson et al., 2014), and autism spectrum disorders (ASD) (e.g., Joseph et al., 2005). Crucially, EFs and language abilities seem to be related, both in comprehension and production, and prior findings suggest a link between communication impairments and deficits in EFs.

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Verena Erica Pritchard, Australian Catholic University, Australia; Bahia Guellai, University of Paris Ouest Nanterre, France

#### \*Correspondence:

Marisa G. Filipe marisafilipe@fpce.up.pt; labfon@letras.ulisboa.pt

†These authors are joint last authors.

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 07 December 2017 Accepted: 05 March 2018 Published: 21 March 2018

#### Citation:

Filipe MG, Frota S and Vicente SG (2018) Executive Functions and Prosodic Abilities in Children With High-Functioning Autism. Front. Psychol. 9:359. doi: 10.3389/fpsyg.2018.00359

In line with evidence suggesting that EFs are closely related to language abilities, several models of language impairment now propose that language performance includes cognitive factors such as processing speed, attention, and EFs in addition to linguistic ability (Montgomery, 2002; Gomes et al., 2007; Leonard et al., 2007; Montgomery and Windsor, 2007; Bishop et al., 2014). Bishop et al. (2014) proposed three models to explain the relationship between EFs and language skills: (a) EFs influence the development of language; (b) children use verbal facilitation to assist them in EFs tasks; and (c) there is no causal relationship between these skills, and it is possible that shared problems in the development of the nervous system could account for the correlations. Gooch et al. (2016) described a further alternative: EFs and language skills may develop in a reciprocal interaction, and the relationship could change over time. In this context, longitudinal studies provided a starting point for the understanding of this relationship. Kuhn et al. (2014) studied the link between children's early communicative gestures at 15 months, language abilities at 2/3 years, and EFs at 4 years of age, and they found that early language skills predicted later EFs. Exploring the relationship between language and EFs in children at-risk for language learning impairments in the transition from preschool to schooling, Gooch et al. (2016) found a strong concurrent relationship between language and EFs. Therefore, EFs and language performance are related, and theoretically this could also be true for prosodic performance.

Prosody plays an important role in communication disorders, as difficulties with prosodic skills can impact on language abilities in general and dramatically influence daily conversations, social interactions (Shriberg et al., 2001; Paul et al., 2005), and even typical language development (e.g., Cutler and Swinney, 1987; Frota et al., 2016). For example, prosody has been shown to play an important role in lexical and syntactic acquisition (Christophe et al., 2008; Hawthorne and Gerken, 2014; de Carvalho et al., 2016). Indeed, prosody is crucial for the production and comprehension of the organization of speech, manifested by patterns of intonation, rhythm, prominence, and chunking of the speech continuum (Wagner and Watson, 2010). Prosodic features impact not only on "how we say it" but also on "what we say."

However, very little is known about the relationship between cognitive processes and prosodic abilities, and an important theoretical question is whether prosody is independent of other cognitive aspects such as EFs. Given that deficits in EFs and prosodic impairments are both characteristics of autism, this study investigates how EFs are related to prosodic performance in children with high-functioning autism (HFA), thus contributing to our understanding about the cognitive mechanisms that underlie language development.

# AUTISM SPECTRUM DISORDERS

Autism spectrum disorders (ASD) are a complex group of neurodevelopmental disorders, evident by early childhood, in which the severity of symptoms ranges from minor to incapacitating impairments. Common manifestations are repetitive or stereotyped interests, mannerisms, and difficulties in social communication (American Psychiatric Association, 2013). Regarding intellectual abilities, 44% of children affected with ASD are reported to have an average intellectual ability, 24% have a borderline intelligence quotient (IQ), and 32% have an intellectual disability (Christensen et al., 2016). The children without intellectual disabilities are often referred to as having HFA. Although autism is a disorder characterized by multiple impairments, including deficits in EFs and prosody, research has failed to clearly document the relationship between ASD, language, cognition, and EFs.

# EXECUTIVE FUNCTIONS IN AUTISM SPECTRUM DISORDERS

Impairments in EFs have been considered a central deficit in autism (e.g., Rajendran and Mitchell, 2007), and investigating aspects of EFs in ASD has been an active area of research. Specifically, studies have indicated that children with ASD struggle with tasks requiring working memory, inhibition, and set-shifting abilities (e.g., Ozonoff et al., 1991, 2005; Hughes et al., 1994; Ozonoff and McEvoy, 1994; Ozonoff and Jensen, 1999; Adams and Jarrold, 2012). Additionally, a dysfunction frequently found is the perseveration of behavior, that is, the tendency to continue to perform actions that are no longer appropriate to the context (e.g., Rumsey and Hamburger, 1988; Prior and Hoffman, 1990). Children with ASD have also shown deficits in shifting attention, and in sustained or selective attention (Noterdaeme et al., 2001; Landry and Bryson, 2004). Furthermore, O'Hearn et al. (2008), in a review, reported that impairments in tasks requiring response inhibition, working memory, planning, and attention are also present in adulthood. In fact, impairments in EFs could be a potential explanation for many features of ASD, including difficulties with planning, inhibition, flexibility, and working memory.

# PROSODIC SKILLS IN AUTISM SPECTRUM DISORDERS

Prosodic impairments appeared amongst the first clinical descriptions of autism (Kanner, 1943; Asperger, 1944), and currently diagnostic tools of ASD include atypical expressive prosody as a feature of ASD (e.g., the Autism Diagnostic Interview-Revised, ADI-R, Lord et al., 1994; and the Autism Diagnostic Observation Schedule, ADOS, Lord et al., 1989).

Prosodic impairments in ASD have been extensively investigated from the viewpoint of perception and production. Deficits in the perception of prosodic features in individuals with ASD have been described, for example, in the comprehension of emphatic stress (Paul et al., 2005), as well as in the perception of pairs of the same auditory stimuli as prosodically different (Peppé et al., 2007). Impairments in expressive prosody in individuals with ASD have been described for rhythm, rate of speech, intonation patterns (e.g., Shriberg et al., 2001; McCann and Peppé, 2003; Paul et al., 2005), and the use of prosody to convey phrase-level prominence (McCann et al., 2007). However, some findings are controversial. For instance, monotone intonation has been reported but so has exaggerated intonation (Kanner, 1943; Baltaxe and Simmons, 1985; Sharda et al., 2010; Bonneh et al., 2011; DePape et al., 2012; Filipe et al., 2014), and slow syllabic speech has been described together with fast articulation rate (Baron-Cohen and Staunton, 1994; for a review, see McCann and Peppé, 2003).

In sum, results on prosody in ASD are mixed, with no agreement between studies. So far, no convincing explanation for these discrepant findings has been put forward. This atypical variation might be explained by methodological problems related to the assessment of prosody, poor diagnostic data, small sample sizes, and lack of appropriate comparison groups (e.g., McCann and Peppé, 2003; Diehl et al., 2009), but also by the multiplicity and heterogeneity of symptoms in ASD (Shriberg et al., 2001; Rice et al., 2005). Therefore, there is a current need for research in this field that takes into account the link between symptoms in ASD, such as cognitive abilities and prosodic skills.

# PRESENT STUDY

This study examines EFs and prosody in children with HFA and TD peers. To the best of our knowledge, no study has yet analyzed the relation between EFs and prosodic skills, although evidence suggests that EFs are closely related to language abilities and that these are crucial foundations for development and learning. Since deficits in EFs and prosodic impairments may be a common feature of many disorders, including neurodevelopmental disorders such as ASD, we examined EFs performance and prosodic performance in HFA to determine whether prosodic abilities are associated with EFs, and if so to what extent and with what particular functions. Furthermore, we wanted to investigate if prosodic abilities are mediating differences in EFs performance, or if the reverse pattern was found. This specific clinical population offers methodological advantages because it separates out the confounding cognitive issues seen in other atypical populations. Specifically, the analyses aim to address the following research questions: (a) Does atypical development (i.e., HFA) affect performance on tests that assess prosodic skills and EFs?; (b) Do prosodic skills correlate with EFs measures in the HFA group?; and (c) Do prosodic skills mediate the differences in EFs between the HFA group and the TD group, or does the reverse pattern hold?

# MATERIALS AND METHODS

#### Participants

Fifteen children (3 girls, 12 boys) with HFA (6–9 years; M = 7.40, SD = 1.12), who met the DSM-5 criteria for Autism (American Psychiatric Association, 2013), participated in the study. A team of child-psychiatrists and psychologists made the diagnosis of ASD. The materials used in the diagnostic procedure were the Autism Diagnostic Interview-Revised (ADI-R; Lord et al., 1994) and the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 1989). Participant characteristics are shown in **Table 1**. All HFA participants were required to have an IQ of 80 or higher (assessed with Wechsler Intelligence Scale for Children-III; Wechsler, 1991), to ensure that poor performance on the prosodic tasks was not a general consequence of cognitive impairment. Exclusion criteria were obsessive–compulsive disorders, attention deficit hyperactivity disorder, and learning disorders, according to DSM-5. The group with HFA was matched to a TD group on age (M = 7.53, SD = 0.99), gender, and non-verbal intelligence (HFA: M = 25.33, SD = 5.10; TD: M = 24, SD = 4.22; assessed with Raven's Colored Progressive Matrices, Raven, 1995; Portuguese version, Simões, 2000). The groups were significantly different in general language level (HFA: M = 83.46, SD = 17.22; TD: M = 96.89, SD = 4.97; assessed with Griffiths Mental Development Scales 2–8 years – Sub-scale Language, GMDS; Luiz et al., 2007), but the difference between groups for receptive vocabulary was non-significant (HFA: M = 120.07, SD = 34.42; TD: M = 142.07, SD = 31.51; assessed with Peabody Picture Vocabulary Test, PPVT; Dunn and Dunn, 2007; Vicente et al., 2011, unpublished). All participants were native speakers of European Portuguese, born and raised in monolingual homes in the North of Portugal, with no visual or hearing problems.

# Material

#### Prosody

Participants were evaluated with the European Portuguese version of the Profiling Elements of Prosody in Speech-Communication (PEPS-C; original version: Peppé and McCann, 2003; Portuguese version: Filipe et al., 2017). This test assesses prosodic skills through twelve subtests: six of the subtests address receptive abilities and the other six address expressive abilities. Each subtest comprises 2 example, 2 training, and 16 experimental items. The following is a description of each subtest.



TABLE 1 | Mean (M), standard deviation (SD) and range for age, non-verbal intelligence, language, and vocabulary of the participants in the high-functioning autism (HFA) and typically developing (TD) groups.

<sup>∗</sup>p ≤ 0.05 (one-way ANOVA). Maximum score for non-verbal Intelligence = 36. Score for language: M = 100; SD = 15. Maximum score for vocabulary = 228.

or disliking intonation is presented simultaneously with a picture of the stimulus. Then two images showing a happy and a sad face appear, and the participant selects the image corresponding to the intonation pattern heard (i.e., happy face for liking intonation or sad face for disliking intonation).


Several reasons led us to prefer the PEPS-C test to other ways of testing prosody: (1) it is a comprehensive prosodic test already used with children with autism; (2) to our knowledge, it is the only prosodic test available for European Portuguese that assesses both receptive and expressive prosodic abilities; (3) it does not require specialized transcription skills; and (4) responses are the same for all participants.

#### Executive Functions

Participants were evaluated with the Children's Color Trails Test (CCTT; Llorente et al., 2003). The CCTT consists of two parts: CCTT-1 and CCTT-2. The CCTT-1 measures visual tracking, processing speed, and graphomotor skills. The CCTT-2 is a more complex task that adds divided attention, set-switching, inhibition, and working memory/sequencing. In both parts, participants connect circled numbers (1–15) with a pencil in ascending order, but in CCTT-2 the numbers alternate in color (pink and yellow). Moreover, the CCTT allows the computation of an Interference Index that measures the added task requirements of CCTT-2 using Time raw scores (raw completion times in seconds) through the following formula: (CCTT-2 Time raw score − CCTT-1 Time raw score)/CCTT-1 Time raw score.
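As an illustration of the Interference Index formula given above, the following minimal sketch computes the index from two completion times. The function name and the example times are hypothetical and are not taken from the CCTT manual or from the study data.

```python
def interference_index(cctt1_seconds: float, cctt2_seconds: float) -> float:
    """Return (CCTT-2 time - CCTT-1 time) / CCTT-1 time.

    Both arguments are raw completion times in seconds.
    """
    if cctt1_seconds <= 0:
        raise ValueError("CCTT-1 completion time must be positive")
    return (cctt2_seconds - cctt1_seconds) / cctt1_seconds


# Hypothetical child: 40 s on CCTT-1 and 70 s on CCTT-2.
example_index = interference_index(40.0, 70.0)
print(example_index)
```

Under this formula, a larger index indicates that the added demands of CCTT-2 lengthened completion time more, relative to the child's own CCTT-1 baseline.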

The CCTT test was chosen to test EFs for the following reasons: (1) it reduces the impact of linguistic components including administration guidelines, because visual instructions allow administration without the linguistic component; and (2) it overcomes limitations of an older similar test, the Children's Trail Making Test (Reitan, 1971), which uses a combination of the English alphabet with colors, that might exclude children with language or learning disabilities.

# Procedure

Informed consent was obtained from participants' parents/caregivers, who had the opportunity to ask for further information about the study. Each child was assessed individually in a quiet room with adequate lighting conditions, at their school, in their home, or at the University of Porto.

The assessment was performed in two to three sessions completed within a month and lasting approximately 45 min each. Administration order was the same for all the participants: CCTT-1, CCTT-2, and PEPS-C (Short-Item, Long Item, Turn-End, Affect, Chunking, and Focus). In PEPS-C, half of the participants started with the receptive tasks and the other half with the expressive tasks.

# RESULTS

For the PEPS-C tasks, each participant's answer was scored as correct (1 point) or incorrect (0 points). PEPS-C allows the computation of a score for each subtest (maximum = 16), and the computation of a total score that corresponds to the sum of all subtests (maximum = 192). For the CCTT, scores were the completion times in seconds for CCTT-1 and CCTT-2. Additionally, the CCTT Interference Index was computed to measure the added task requirements of CCTT-2.

# HFA and Typically Developing Group Comparisons

To examine performance differences between the HFA and TD groups on the PEPS-C and the CCTT, a comparative analysis was conducted. The results were analyzed separately for each PEPS-C task (Short-Item Discrimination, Short-Item Imitation, Long-Item Discrimination, Long-Item Imitation, Turn-End Reception, Turn-End Expression, Affect Reception, Affect Expression, Chunking Reception, Chunking Expression, Focus Reception, and Focus Expression) and for each CCTT component (CCTT-1, CCTT-2, and CCTT Interference Index; see **Table 2** for details).

In the PEPS-C, the difference between groups on the overall mean score was significant [F(1,28) = 5.214, p = 0.030; η<sup>2</sup> = 0.157]. In all the PEPS-C tasks, HFA children showed lower scores; however, the differences were significant only for Short-Item Discrimination [F(1,28) = 4.244, p = 0.049; η<sup>2</sup> = 0.132], Short-Item Imitation [F(1,28) = 10.975, p = 0.003; η<sup>2</sup> = 0.282], Turn-End Reception [F(1,28) = 4.847, p = 0.036; η<sup>2</sup> = 0.148], Turn-End Expression [F(1,28) = 4.959, p = 0.034; η<sup>2</sup> = 0.150], and Affect Expression [F(1,28) = 6.322, p = 0.018; η<sup>2</sup> = 0.184].
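For readers who wish to check the reported effect sizes, the sketch below recovers η<sup>2</sup> from an F statistic and its degrees of freedom, using the standard identity for a one-way design, η<sup>2</sup> = (F × df<sub>effect</sub>)/(F × df<sub>effect</sub> + df<sub>error</sub>). The function name is hypothetical; the input values are the F statistics reported above.

```python
def eta_squared_from_f(f_value: float, df_effect: int, df_error: int) -> float:
    """Eta squared for a one-way ANOVA, recovered from F and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# F(1,28) values reported for the PEPS-C comparisons.
reported_f = {
    "PEPS-C total": 5.214,
    "Short-Item Discrimination": 4.244,
    "Short-Item Imitation": 10.975,
    "Turn-End Reception": 4.847,
    "Turn-End Expression": 4.959,
    "Affect Expression": 6.322,
}

for task, f_value in reported_f.items():
    print(task, round(eta_squared_from_f(f_value, 1, 28), 3))
```

With two groups of 15 children, the degrees of freedom are 1 and 28, which matches the F(1,28) notation used in the text.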

TABLE 2 | Scores in PEPS-C tasks and CCTT components in the high-functioning autism (HFA) and typically developing (TD) groups.


<sup>∗</sup>p < 0.05.

In the CCTT, no difference between groups was found for the time required to complete CCTT-1 or CCTT-2 [F(1,28) > 1 and F(1,28) = 2.503, p = 0.125, respectively]. However, a significant difference between groups for the CCTT Interference Index was found [F(1,28) = 6.710, p = 0.015; η<sup>2</sup> = 0.193; see **Table 2**].

# Correlations Between the EFs Test and the Prosodic Test

In order to analyze the relation between possible prosodic impairments and other basic deficits, we computed Pearson correlations between variables. We used the overall mean score of the PEPS-C and the scores in the different components of the CCTT. For both groups together (i.e., HFA and TD children), we found no correlation between PEPS-C and CCTT-1 (see **Figure 1**), but moderate correlations were found between the PEPS-C and CCTT-2 (Pearson's r = 0.50, p < 0.001; see **Figure 2**), and between the PEPS-C and the CCTT Interference Index (Pearson's r = 0.48, p < 0.001; see **Figure 3**). Additionally, correlations between PEPS-C individual tasks and CCTT components were also calculated, with receptive tasks being more correlated with CCTT components than expressive tasks (the exception is Affect Expression), and CCTT-2 and CCTT Interference Index generally showing stronger correlations with PEPS-C tasks (see **Table 3** for details). However, when the groups were considered separately, the correlations lost statistical significance.
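The pooled and per-group correlations described above can be sketched as follows. All scores and group labels in this example are hypothetical values chosen for illustration only; they are not the study data, and the helper name is an assumption.

```python
import numpy as np


def pearson_r(x, y) -> float:
    """Pearson product-moment correlation between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    y_dev = y - y.mean()
    return float((x_dev * y_dev).sum() / np.sqrt((x_dev ** 2).sum() * (y_dev ** 2).sum()))


# Hypothetical PEPS-C totals and CCTT-2 times (seconds) for ten children.
peps_total = np.array([150, 142, 160, 138, 155, 171, 165, 178, 169, 174], dtype=float)
cctt2_time = np.array([95, 110, 88, 120, 101, 72, 80, 66, 84, 70], dtype=float)
is_hfa = np.array([True] * 5 + [False] * 5)

r_pooled = pearson_r(peps_total, cctt2_time)                # both groups together
r_hfa = pearson_r(peps_total[is_hfa], cctt2_time[is_hfa])   # HFA group only
r_td = pearson_r(peps_total[~is_hfa], cctt2_time[~is_hfa])  # TD group only
print(r_pooled, r_hfa, r_td)
```

Computing the coefficient once for the pooled sample and once per group makes it easy to compare the two analyses; the per-group estimates rest on half the sample size each.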

# Mediation Analysis

To explore the possible link between EFs (assessed by the CCTT Interference Index) and prosodic impairments in HFA, and to further analyze the group effect, we used a mediation analysis following Baron and Kenny (1986), on the assumption that the effect of an independent variable on a dependent variable is mediated by a mediating variable. First, we examined the hypothesis that prosodic abilities mediate the differences in EFs between the HFA group and the TD group. This hypothesis would be supported if the effect of Prosody (i.e., the mediator) on EFs (i.e., the dependent variable) is greater than the effect of Group (i.e., the independent variable) on EFs, and the effect of Group on EFs is significantly reduced or absent after controlling for Prosody. A series of regression analyses were thus conducted to assess (see **Figure 4** for details): (a) the direct effect of Group on Prosody; (b) the direct effect of Prosody on EFs; and (c) the direct effect of Group on EFs.

The direct effect of Group on Prosody (path a in **Figure 4**) showed an adjusted R<sup>2</sup> of 0.127 (β = −0.39; t = −2.28, p = 0.03); the direct effect of Prosody on EFs (path b in **Figure 4**) showed an adjusted R<sup>2</sup> of 0.229 (β = 0.48; t = 2.87, p = 0.008); and the direct effect of Group on EFs (path c in **Figure 4**) showed an adjusted R<sup>2</sup> of 0.165 (β = 0.44; t = 2.59, p = 0.015). All models were significant. However, the effect of Prosody on EFs was larger than the effect of Group on EFs, and the effect of Group on EFs became non-significant after controlling for Prosody (path c' in **Figure 4**; β = 0.30; t = 1.70, p = 0.101).
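The regression steps of a Baron and Kenny (1986) style mediation analysis can be sketched as below. This is a minimal illustration under stated assumptions: the data are simulated (not the study data), the helper names are hypothetical, and standardized coefficients are obtained by ordinary least squares on z-scored variables using NumPy; it does not reproduce the software or the significance tests used in the study.

```python
import numpy as np


def standardize(values):
    """Z-score a variable using the sample standard deviation (ddof=1)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std(ddof=1)


def standardized_betas(outcome, *predictors):
    """Return one standardized OLS coefficient per predictor.

    All variables are z-scored first, so the intercept is zero and is omitted.
    """
    design = np.column_stack([standardize(p) for p in predictors])
    coefficients, *_ = np.linalg.lstsq(design, standardize(outcome), rcond=None)
    return coefficients


# Simulated data for illustration only (15 children per group).
rng = np.random.default_rng(0)
group = np.repeat([0.0, 1.0], 15)              # 0 = TD, 1 = HFA (hypothetical coding)
prosody = -1.0 * group + rng.normal(size=30)   # simulated mediator
ef = -0.5 * prosody + rng.normal(size=30)      # simulated interference index

path_a = standardized_betas(prosody, group)[0]  # Group -> Prosody
path_b = standardized_betas(ef, prosody)[0]     # Prosody -> EFs
path_c = standardized_betas(ef, group)[0]       # Group -> EFs
# Group -> EFs after controlling for Prosody (path c'), plus the adjusted Prosody effect.
path_c_prime, prosody_adjusted = standardized_betas(ef, group, prosody)
print(path_a, path_b, path_c, path_c_prime)
```

In this framework, mediation is suggested when path c' is clearly smaller than path c. The reverse analysis swaps the roles of Prosody and EFs in the final regression.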

FIGURE 2 | Scatter plot displaying the correlation between PEPS-C and CCTT-2.

FIGURE 3 | Scatter plot displaying the correlation between PEPS-C and CCTT Interference Index (CCTT-IF).

TABLE 3 | Correlations between PEPS-C tasks, CCTT-1, CCTT-2, and CCTT interference index.


<sup>∗</sup>p < 0.05. ∗∗p ≤ 0.001.

Secondly, to analyze the direction of this effect, the reverse regression was performed, exploring the effect of Group on Prosody after controlling for EFs. Results showed that Group also became non-significant after controlling for EFs (β = 0.23; t = −1.26, p = 0.218). This suggests that Prosody is mediating differences in EFs between groups, but the reverse pattern also holds.

# DISCUSSION

The relation between EFs and prosody is of interest to researchers and clinicians. The present study extends research on prosodic skills and EFs in autism by investigating these abilities in children with HFA compared to TD peers, and examining the relations between these abilities. Fifteen children with HFA were matched to 15 TD peers on chronological age, gender, and non-verbal intelligence. The PEPS-C was used to assess prosodic performance and the CCTT as an indicator of nonverbal executive control abilities. The results of the present study point to three main findings.

First, HFA children scored significantly lower on the PEPS-C than TD children, pointing to impaired prosodic skills. Lower performance on EFs was also found for HFA children: for CCTT-1 there was no difference between groups; for CCTT-2 the HFA group scored worse than the TD group, although this difference was non-significant; for the CCTT Interference Index the difference between groups was significant. These findings show that atypical development affects both prosody and EFs. These results are consistent with findings from other studies reporting that children with HFA performed significantly less well than controls in prosodic tasks (e.g., Rutherford et al., 2002; Peppé et al., 2007) and in EFs tests (e.g., Rajendran and Mitchell, 2007).

Second, the examination of the relation between cognitive processes and prosodic performance in the clinical group showed no correlation between PEPS-C and CCTT-1, but moderate correlations between PEPS-C and CCTT-2, and between PEPS-C and the CCTT Interference Index. Our findings thus suggest no relation between prosodic abilities and visual search and processing speed (assessed by CCTT-1), but a significant association between prosodic deficits and divided attention, working memory/sequencing, set-switching, and inhibition (assessed by CCTT-2 and CCTT Interference Index). Prior research involving children with atypical development also found that deficits in aspects of communication and EFs are associated (e.g., Bishop and Norbury, 2005; Ellis Weismer et al., 2005; Im-Bolter et al., 2006; Henry et al., 2012; Vugs et al., 2014). The current study extends these findings to prosodic abilities and EFs skills.

Third, the results from the mediation analysis showed that the effect of Prosody on EFs was greater than the effect of Group (HFA vs. TD) on EFs, and the effect of Group on EFs after controlling for Prosody became non-significant, thus confirming the hypothesis that prosody influences EFs. The reverse pattern was also found, however, showing that EFs also affect prosodic skills. These results highlight the important (bidirectional) link between EFs skills and prosodic abilities.

Although several studies have described expressive and receptive prosodic impairments in ASD, no consensus has emerged on the characterization of the prosodic profile of this clinical population. From earlier research it is also not clear whether impaired prosody is related to specific cognitive profiles. It is unknown whether deficits in EFs lead to poor communication or whether other cognitive aspects are also at play, influencing the development of EFs and communication (e.g., Bishop et al., 2014). The present study, by focusing on children with HFA, sheds some light on the link between prosodic abilities and EFs, while controlling for the confounding cognitive difficulties related to intellectual disabilities that usually characterize atypical populations. The finding of an association between prosodic impairments (and therefore communication deficits) and EFs in the current study thus presents an important contribution to this research field. Such an association is evident not only in the fact that prosodic abilities and EFs are related, but also in the mediating role of prosodic abilities in EFs performance, and vice-versa, with our results pointing to poorer prosodic abilities leading to poorer EFs, and poorer EFs influencing poorer prosodic abilities. Even though our findings did not provide a clear answer about the direction of the relation between EFs and prosodic abilities, as they suggest that the influence is bidirectional, this strong influence raises the important question of whether shared genetic mechanisms could be involved in the development of both abilities. Bishop et al. (2014) suggest that delayed development of frontal lobes may impact on brain regions that are important for EFs and language processing. Both EF and prosodic abilities emerge early in development, but continue to develop until later ages, with adult-level performance on many tests of EF and prosodic skills being reached at puberty, and performance on many measures continuing to change into adulthood (e.g., Anderson, 2002; Peppé and McCann, 2003; Wells et al., 2004; Best et al., 2009; Filipe et al., 2017). Therefore, the comorbidity between difficulties in EF and prosodic impairments could be a consequence of shared genetic mechanisms.

Childhood communication disorders are associated with different neuropsychological problems. The most commonly associated neuropsychological deficits are problems involving attention and EFs, which are usually a common denominator in the different clinical pictures of language disorders. Although the linguistic signs of these disorders are fairly well understood, the associated neuropsychological signs have not yet been studied. It is hoped that the present study is a first step in this direction. As EFs play an important role in the cognitive control of behavior, clinicians, such as speech-language pathologists, should be aware of the relation between cognitive behavior and communicative impairments. Our findings suggest that clinicians responsible for the evaluation of patients with a wide variety of cognitive disorders and language impairments should test both language and EFs in their assessments.

This study has a number of limitations that should be carefully considered. One limitation is the use of the PEPS-C as the only measure of prosodic abilities. The PEPS-C involves the explicit use of prosody, and this makes the tasks easier for children with autism because these individuals tend not to attend to socially relevant information, but might be able to process information when their attention is directed toward it (Senju, 2012). Future studies should include other kinds of measures of prosodic skills, such as acoustic and phonological analysis for expressive skills and online perception tasks for receptive skills, in order to provide a more comprehensive and accurate view of prosodic deficits in autism. In addition, although the test for EFs was carefully chosen for its non-verbal demands, using the CCTT as the only measure of EFs is certainly a limitation. The fact that the mediation analysis relies only on the CCTT Interference Index is yet another limitation. Future studies should measure other components of EFs to better characterize this multidimensional construct.

Future research should also explore the relation between EFs and prosodic abilities with larger sample sizes and more robust statistical analyses to verify whether the present pattern of findings can be replicated. An interesting approach for future research is the latent variables approach, which could capture the link between EFs skills and language domains. Furthermore, some studies have shown that methylphenidate, which increases alertness, combats fatigue, and improves attention, also improves language processing in children (Westby and Watson, 2004; McInnes et al., 2007). Thus, further evidence for the possible causal relation between EFs and language domains could be provided by studies addressing whether improving EFs also improves prosodic/language/communicative performance.

#### CONCLUSION

The field of communication impairments and EFs promises to continue as an important area of research concerning the challenging problems of autism. The present study provides important and exciting new directions in the research on prosodic and EFs skills in autism, and may be of considerable interest for clinical practice since EFs and prosodic impairments are characteristic of many neurodevelopmental disorders.

#### AUTHOR CONTRIBUTIONS

MF, SF, and SV contributed to the design of the work. MF prepared the first draft of the manuscript. SF and SV revised it critically for important intellectual content. The final version of the manuscript was approved by all authors.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the 'European Union Agency for Fundamental Rights' with written informed consent from all subjects, following Portuguese regulations. All caregivers gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by schools' boards because there was no review board at the University of Porto or Lisbon when this project was conducted. Participants or caregivers of participants were selected via notices in schools and were informed about: (a) The purpose of the research, expected duration, and procedures; (b) Participants' rights to decline to participate and to withdraw from the research once it had started, as well as the anticipated consequences of doing so; (c) Factors that may influence their willingness to participate, such as potential discomfort or adverse effects; (d) Any prospective research benefits; (e) Limits of confidentiality, such as data coding, disposal, sharing and archiving.

#### FUNDING

This research was supported by the Portuguese Foundation for Science and Technology (SFRH/BD/64166/2009, BPD/100696/2014, PEst-OE/LIN/UI0214/2013, and UID/PSI/000050/2013).





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Filipe, Frota and Vicente. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Inhibition and Updating, but Not Switching, Predict Developmental Dyslexia and Individual Variation in Reading Ability

#### Caoilainn Doyle<sup>1</sup> \*, Alan F. Smeaton<sup>2</sup> , Richard A. P. Roche<sup>3</sup> and Lorraine Boran<sup>1</sup>

*<sup>1</sup> School of Nursing and Human Sciences, Dublin City University, Dublin, Ireland, <sup>2</sup> Insight Centre for Data Analytics, Dublin City University, Dublin, Ireland, <sup>3</sup> Department of Psychology, Maynooth University, Kildare, Ireland*

To elucidate the core executive function profile (strengths and weaknesses in inhibition, updating, and switching) associated with dyslexia, this study explored executive function in 27 children with dyslexia and 29 age matched controls using sensitive z-mean measures of each ability and controlled for individual differences in processing speed. This study found that developmental dyslexia is associated with inhibition and updating, but not switching impairments, at the error z-mean composite level, whilst controlling for processing speed. Inhibition and updating (but not switching) error composites predicted both dyslexia likelihood and reading ability across the full range of variation from typical to atypical. The predictive relationships were such that those with poorer performance on inhibition and updating measures were significantly more likely to have a diagnosis of developmental dyslexia and also demonstrate poorer reading ability. These findings suggest that inhibition and updating abilities are associated with developmental dyslexia and predict reading ability. Future studies should explore executive function training as an intervention for children with dyslexia as core executive functions appear to be modifiable with training and may transfer to improved reading ability.

#### Edited by:

*Sarah E. MacPherson, University of Edinburgh, United Kingdom*

#### Reviewed by:

*Maurits W. Van Der Molen, University of Amsterdam, Netherlands*

*Cedric A. Bouquet, University of Poitiers, France*

#### \*Correspondence:

*Caoilainn Doyle caoilainn.doyle96@mail.dcu.ie*

#### Specialty section:

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

Received: *07 December 2017* Accepted: *03 May 2018* Published: *28 May 2018*

#### Citation:

*Doyle C, Smeaton AF, Roche RAP and Boran L (2018) Inhibition and Updating, but Not Switching, Predict Developmental Dyslexia and Individual Variation in Reading Ability. Front. Psychol. 9:795. doi: 10.3389/fpsyg.2018.00795*

Keywords: dyslexia, executive function, inhibition, updating, processing speed, reading

# INTRODUCTION

Although developmental dyslexia is a neurodevelopmental disorder characterised by reading (such as accuracy and speed problems) and phonological difficulties (awareness and implementation of the sound structure of language), despite adequate instruction and intellectual ability (World Health Organization, 1992; American Psychiatric Association, 1994, 2013), executive function impairments are frequently observed.

Executive function is an umbrella term for a range of high-level cognitive processes associated with frontal regions of the brain which subserve goal-directed behaviour. Executive function is what enables us to represent and manipulate goal-related information in a highly active state, focus our attention in the face of distraction, update goal-relevant information in working memory, rapidly adapt to changing demands within our environment and plan our actions accordingly. Although in agreement on the importance of executive function for directing behaviour, most theories define and segment the elusive concept of executive function differently (see Jurado and Rosselli, 2007 for a more comprehensive review of executive function and associated theories). Baddeley and Hitch (1974) proposed that working memory is comprised of two domain-specific storage components (visuo-spatial sketchpad and phonological loop) and one domain-general control component (central executive). Within their model, the central executive is defined as the component responsible for the manipulation of information, focusing attention on relevant and inhibiting irrelevant stimuli, regulating performance across multi-tasking conditions, and planning behavioural sequences (Baddeley and Hitch, 1974). The Supervisory Attentional System (SAS) conceptualised by Norman and Shallice (1986) is an attentional control mechanism necessary for the initiation of effortful goal-directed behaviours (requiring planning, error monitoring and resisting) as opposed to automatic effortless behaviours. The SAS selection and control of actions depends upon contention scheduling, a process involving antagonistic activation and inhibition of action schemas (Norman and Shallice, 1986). Within both models, executive function is labelled as a unitary component responsible for multiple sub-functions.
Executive function is also often measured with complex tasks such as Tower of London, Wisconsin Card Sort Task, and complex span tasks, which tap multiple sub-functions together and are sensitive for detecting profuse executive dysfunction in frontal lesion patients.

More recent work on the latent factor structure of executive function in typical samples indicates that executive function is comprised of a set of core related (through the common executive function: inhibition) and distinct (updating specific and switching specific) processes which contribute differentially to complex tasks and may be antagonistically related (tradeoffs between inhibition and switching specific; Miyake et al., 2000; Miyake and Friedman, 2012; Snyder et al., 2015). Complex executive function tasks therefore lack the specificity to detect the fine-grained core executive processes of inhibition, updating and switching (Miyake et al., 2000; Snyder et al., 2015), particularly in conditions which are associated with more subtle impairments rather than the severe executive impairments observed in lesion patients (Snyder et al., 2015). This is not to say that complex executive processes such as planning, decision making, problem solving, and verbal fluency are not "executive." Within Diamond's (2013, p.136) model of executive function, these complex processes are classified as higher-order executive processes which are "built" from the core executive processes of inhibition, working memory and switching. As such, there is a value in establishing the core executive profile associated with a condition before we begin to consider how higher-order executive processes are impacted. Miyake and Friedman (2012) provide a useful framework for exploring and measuring the core executive functions of inhibition, updating and switching. 
Inhibition is defined as the ability to override inappropriate responses, regulate appropriate behaviour and control attention by focusing on relevant information and filtering out distracting information; updating is the ability to hold and continuously update information in working memory from moment to moment; and switching is the ability to rapidly adapt to changing task demands (Miyake et al., 2000; Miyake and Friedman, 2012; Diamond, 2013).

The core executive functions may contribute to typical reading ability in many ways. Efficient reading requires the coordination of multiple processes such as focusing of attention on visual information, decoding visual information into speech sounds, maintaining and updating speech sounds in working memory, combining speech sounds, matching combinations of speech sounds with stored words, deriving semantic meaning for comprehension, and moving onto the next word to start this process again. Beyond efficient functioning of each stage separately, these processes need to be carried out rapidly, sometimes in parallel, and efficient switching between stages is required. Inhibition may contribute to reading ability by focusing attention on relevant visual information, ignoring irrelevant information, and keeping speech sounds active and protected from interference in working memory while other stages are completed. In addition, children are often faced with reading in somewhat noisy and distracting environments such as the classroom, where additional demands are placed on inhibition to filter out distracting information. Updating may contribute to reading ability by holding and updating speech sounds in working memory during ongoing decoding of text and combining old speech sounds with new speech sounds to enable full word reading and comprehension. Switching processes may also contribute to reading: given that multiple processes are involved in reading, switching abilities may support rapid alternation between different stages in the reading process, which may in turn support reading speed.

There is evidence for genetic linkages between executive function and reading development. Kegel and Bus (2013) found that genes important for the development of dopamine receptors in pre-frontal brain areas (DRD4) predict the acquisition of alphabetic skills important for reading from kindergarten to first grade, with executive function mediating this relationship. Some studies have also found that dyslexia is associated with underactivity of parietal and prefrontal areas important for executive function during an updating task (Beneventi et al., 2010), and abnormal neurophysiological markers of executive functioning during a range of executive tasks (Beneventi et al., 2010; Liotti et al., 2010; Van De Voorde et al., 2010; Horowitz-Kraus, 2014).

#### TABLE 1 | Summarising characteristics of previous EF profiling studies in dyslexia.

*D, dyslexia; C, control; CD, clinical diagnosis; RAST, researcher administered standardised test; SD, standard deviation; ST, standardised tool; NH, no history; UA, unified ability; MUP, multiple unrelated processes; SM, single measure; SP, single process; EF, executive function; Comp, composite score; TOH/L, Tower of Hanoi/London; MFF, Matching Familiar Figures; CPT, Continuous Performance Test; GNG, Go No-Go; WCST, Wisconsin Card Sort Test; SST, Stop Signal Task; Stroop, Stroop Task; TMT, Trail Making Task; GEFT, Group Embedded Figures Task; FT, Fluency Task; CU, Consonant Updating; SU, Spatial Updating; PM, Porteus Maze; OW, Opposite Worlds (TEACH); MT, Matching Switch Task; P2-back, Phoneme 2-back; AN, Alphanumeric; Pi, Pic; Num, Number; W2-back, Word 2-back Task; N2-back, Number 2-back Task; Sim. T, Simon Task; Cog, Cognitive; inhib, inhibition; Behav, Behavioural.*

Despite evidence of genetic linkages between executive function and reading and reduced activity in brain areas supporting executive function in dyslexia, thus far, the exact core executive function profile (strengths and impairments in inhibition, updating, and switching; Miyake and Friedman, 2012; Friedman and Miyake, 2016) associated with dyslexia is unclear. Although some studies report that dyslexia is not associated with executive function impairments (Bental and Tirosh, 2007; Smith-Spark and Fisk, 2007; Peng et al., 2013; Bexkens et al., 2014), the majority of the literature thus far points to impairments (Nydén et al., 1999; Helland and Asbjørnsen, 2000; Willcutt et al., 2001, 2005; Brosnan et al., 2002; van der Sluis et al., 2007; Beneventi et al., 2010; Menghini et al., 2010; Poljac et al., 2010; Moura et al., 2016; see **Table 1**). However, there are conflicting findings regarding exactly which executive functions are compromised in dyslexia. A number of studies report inhibition impairments in dyslexia (Willcutt et al., 2001, 2005; Brosnan et al., 2002; De Lima et al., 2012; Booth et al., 2014; Proulx and Elmasry, 2014), while others do not (Reiter et al., 2005; Bental and Tirosh, 2007; Marzocchi et al., 2008; Schmid et al., 2011; Bexkens et al., 2014). A number of studies report updating (working memory) impairments in dyslexia (Brosnan et al., 2002; Rucklidge and Tannock, 2002; McGee et al., 2004; Willcutt et al., 2005; Bental and Tirosh, 2007; Smith-Spark and Fisk, 2007), while others do not (Willcutt et al., 2005; Marzocchi et al., 2008; Peng et al., 2013). Likewise, a number of studies report switching impairments in dyslexia (Helland and Asbjørnsen, 2000; Poljac et al., 2010; De Lima et al., 2012), while others do not (Reiter et al., 2005; Bental and Tirosh, 2007; Marzocchi et al., 2008; Tiffin-Richards et al., 2008; Menghini et al., 2010). A meta-analysis conducted by Booth et al. (2010) indicates that dyslexia is associated with executive dysfunction (Hedges' g = 0.57); however, effect sizes vary across tasks due to underlying task demands, and these task impurity issues make it difficult to draw conclusions about the exact profile of core executive functions in dyslexia.
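For readers unfamiliar with the effect size reported above, Hedges' g is a standardised mean difference with a small-sample bias correction. The sketch below shows the standard two-group computation; it is a generic illustration, not the procedure of the cited meta-analysis, and the function name `hedges_g` and the example scores are hypothetical.

```python
import math
from statistics import mean, stdev


def hedges_g(group_a, group_b):
    """Standardised mean difference between two groups with small-sample correction."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance across the two groups (sample variances, n - 1 denominators).
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    cohens_d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    # Approximate bias correction, commonly written as J = 1 - 3 / (4 * df - 1)
    # with df = n_a + n_b - 2.
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return cohens_d * correction


# Hypothetical error scores (higher = more errors) for two small groups.
dyslexia_errors = [2, 4, 6]
control_errors = [1, 3, 5]
effect = hedges_g(dyslexia_errors, control_errors)
```

By convention, values around 0.5 are read as medium effects, which is the range in which the reported g = 0.57 falls.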

Conflicting findings also emerge across studies exploring the predictive ability of executive function for dyslexia likelihood and for explaining variance in reading abilities. For instance, Booth et al. (2014) found that inhibition and updating combined predict dyslexia likelihood, while Moura et al. (2015) found that switching alone predicts dyslexia likelihood. Yet others report that executive function does not predict dyslexia likelihood (Willcutt et al., 2010). In typical samples, executive function appears to explain variance in reading skills; however, it is unclear which core executive functions (inhibition, updating, and switching) are predictive of reading. Some studies find that working memory updating predicts word reading ability (Christopher et al., 2012), while others argue that switching predicts word reading ability (Cartwright, 2012). There is also evidence that a combination of executive functions predicts word reading ability, yet studies also differ regarding which combination of executive functions predicts reading. For instance, van der Sluis et al. (2007) found that updating and switching combined are predictive of word reading ability, while other studies suggest that inhibition and updating combined are predictive of word reading ability (Welsh et al., 2010; Arrington et al., 2014). Some authors have also found that a combination of updating and processing resources such as speed predicts reading ability in typically developing children (Christopher et al., 2012).

In atypical reading samples (dyslexia), it is unclear whether executive function predicts reading problems, as some studies find a predictive relationship, while others do not. Those reporting that executive function is implicated in word reading problems find that different combinations of executive functions explain variability. For instance, some studies find that impaired inhibition and updating combined predict reading problems in dyslexia (Wang and Yang, 2015), while others find that impaired inhibition and switching combined predict reading problems in dyslexia (Altemeier et al., 2008). However, some studies do not report a predictive relationship between executive function and reading problems. Instead, they find that a combination of working memory capacity, processing speed and phonological abilities predicts reading problems (McGrath et al., 2011), or that processing speed and phonological abilities predict reading problems (Peterson et al., 2016).

Inconsistencies in the type of executive functions impaired in dyslexia and of clinical relevance for predicting dyslexia likelihood and reading ability may be due to differences in sample characteristics/criteria, theoretically informed profiling approach, measurement tools and systematic control of confounding variables across studies (see **Tables 1**, **2**). These issues make it difficult to infer the exact core executive function profile associated with dyslexia and whether variability in core executive functions is of clinical relevance for predicting dyslexia likelihood and variance in reading ability.

Across executive function profiling and predictive studies there is a discrepancy in how dyslexia is classified within the sample (see **Tables 1**, **2**). Some studies include only participants with a clinical diagnosis of dyslexia given by a clinical/educational psychologist and based on DSM criteria (de Jong et al., 2009; Gooch et al., 2011; Varvara et al., 2014; Moura et al., 2015, 2016), while others use researcher-administered standardised tools to classify dyslexia, which vary in terms of cut-off points for classification (Altemeier et al., 2008; Peng et al., 2013; Bexkens et al., 2014; Booth et al., 2014). Studies also differ with regard to the method for screening co-occurring ADHD or potentially undiagnosed ADHD from the dyslexia sample. Although some studies implement a standardised tool to screen ADHD from the dyslexia sample (Pennington et al., 1993; Willcutt et al., 2001, 2005; Marzocchi et al., 2008; Tiffin-Richards et al., 2008; de Jong et al., 2009; Varvara et al., 2014), the majority simply require no history of a diagnosis, report no method of tracking/screening ADHD from the dyslexia-alone sample, or track ADHD but do not screen it from the sample. Not screening ADHD from the dyslexia sample is problematic as these conditions frequently co-occur (Willcutt and Pennington, 2000), and ADHD is associated with executive function impairments (Barkley, 1997). This makes it difficult to determine if executive function impairments are associated with dyslexia alone or manifest due to the presence of elevated ADHD within the sample.

Executive function profiling and predictive studies in dyslexia also differ in terms of approach to measuring executive function (see **Tables 1**, **2**). A number of studies view executive function as a unitary construct (employing complex measures such as Wisconsin Card Sort Task or unitary executive function composites; Pennington et al., 1993; Welsh et al., 2010; Beidas et al., 2013; Moura et al., 2015), whilst others view it as multiple but separate abilities (employing multiple complex measures such as planning, switching, inhibition, interference control, and verbal fluency) (Willcutt et al., 2001, 2005; Altemeier et al., 2008; Menghini et al., 2010; De Lima et al., 2012; Arrington et al., 2014; Booth et al., 2014; Moura et al., 2015, 2016), or look at separate processes in isolation with single tasks or composite scores (Beneventi et al., 2010; Poljac et al., 2010; Schmid et al., 2011; Wang and Yang, 2015). Extensive research carried out on the 3-factor model of executive function suggests that it is comprised of three core related (inhibition-common executive function) but separable abilities (updating and switching) which are most sensitively measured at the latent level with multiple tasks (Miyake and Friedman, 2012). The 3-factor structure of executive function has been found in childhood (Lehto et al., 2003) and adulthood (Miyake and Friedman, 2012). However, Huizinga et al. (2006) found evidence of a 2-factor rather than a 3-factor model across development (7–21 years), with latent factors of updating and switching, but not inhibition, emerging. This was not because the variance of inhibition was captured by updating and switching; rather, the inhibition tasks were not treated as a single factor due to low and opposing correlations, and they were therefore included as manifest task-level factors in the model which best fit the data (Huizinga et al., 2006). Most executive function profiling and predictive studies do not measure executive function in such a way that they can elucidate the core profile of executive functions associated with dyslexia.

#### TABLE 2 | Summarising characteristics of executive function predictive studies of reading and dyslexia.

*D, dyslexia; A, ADHD; C, control; CA, chronological age controls; RA, reading age controls; Y-C, young controls; O-C, old controls; 1G-4G C, 1st-4th grade controls; 3G-6G C, 3rd-6th grade controls; 5GC, 5th grade controls; S1, study 1; S2, study 2; NR, not reported; T1, time 1; T2, time 2; T3, time 3; CD, clinical diagnosis; RAST, researcher administered standardised test; SD, standard deviation; Perc, percentile; ST, Standardised Tool; NH, no history; UA, unified ability; MUP, multiple unrelated processes; SM, single measure; SP, single process; LA, latent analysis; EF, Executive function; Comp, composite score; Inhib, inhibition; EA-CAS, Expressive Attention Subtest of Cognitive Assessment System; ND-CAS, Number Detect of Cognitive Assessment System; BDS, Backward Digit Span; MD-SCPS, Mapping and Direction Subtest from Swanson Cognitive Processing Test; M1, model 1; M2, model 2; TMT, Trail Making Task; TOH/L, Tower of Hanoi/London; FT, Fluency Task; WM, working memory; LF, latent factor; DS, Digit Span; CS, Counting Span; SS, Sentence Span; CPT, Continuous Performance Test; SST, Stop Signal Task; CPST, Colorado Perceptual Speed Test; RAN, Rapid Automatized Naming; UP, updating; SW, Switching; QI, Quantity Inhibition; OI, Object Inhibition; Stroop, Stroop Task; NSI, Number Size Inhibition; KTT, Keep Track Task; LM, Letter Memory; DM, Digit Memory; OS, Objects Shifting; SS, symbol switching; PS, place shifting; NRS, Numbers Reversed Subtest; VPI, Verbal Proactive Interference; BWS, Backward Span; PTT, Peg Tapping Task; DMCST, Dimensional Card Sort Task; Cog, Cognitive; Behav, Behavioural; CWI, Colour Word Interference from D-KEFS; CWI-SS, Colour Word Interference Switch Score from D-KEFS; RAS, Rapid Automatic Switching.*

Previous approaches to profiling executive function in dyslexia and modelling its predictive ability for dyslexia likelihood and variance in reading ability are also problematic due to task impurity issues (see **Tables 1**, **2**). Complex tasks are poor profiling tools for detecting fine-grained impairments in core executive functions; they lack specificity in detecting key underlying impairments, as performance is driven by a range of core executive functions (inhibition, updating, and switching) and non-executive processes (e.g., learning from feedback in WCST; Miyake et al., 2000; Miyake and Friedman, 2012; Snyder et al., 2015). Viewing executive function as a number of separate unrelated abilities or looking at single processes in isolation does not address how these abilities are facilitated by a number of core underlying processes which are both related (through the common factor: inhibition) and unique (updating and switching) (Miyake and Friedman, 2012). In addition, use of complex or higher-order executive tasks cannot assess potential trade-offs between executive functions (Goschke, 2000; Gruber and Goschke, 2004; Miyake and Friedman, 2012; Blackwell et al., 2014; Snyder et al., 2015; Friedman and Miyake, 2016). For instance, trade-offs have been observed between inhibition and switching due to the incompatibility of each demand (Goschke, 2000; Gruber and Goschke, 2004; Blackwell et al., 2014). Inhibition facilitates increased focus by filtering irrelevant information/distractions in a top-down manner, whilst switching requires a degree of distraction to aid in considering alternative options in order to flexibly adapt to changing task demands (Gruber and Goschke, 2004). Although some authors are exploring more sensitive measurement of executive function by employing latent constructs (van der Sluis et al., 2007; Christopher et al., 2012), the majority of predictive and profiling studies in dyslexia do not employ "pure" measures of core executive functions, and this necessarily limits the fine-grained understanding of how the common and specific aspects of executive function are clinically relevant for predicting dyslexia likelihood and explaining variance in reading ability.

A confounding factor in determining whether executive function is impaired in dyslexia and of clinical relevance is processing speed. Processing speed is an index of the speed of cognitive processing and is considered a general mechanism underpinning performance on a wide range of cognitive tasks, as constrained speed of processing results in poor performance on time-limited cognitive tasks (Salthouse, 1996). Similar to executive function, processing speed efficiency increases from childhood into adulthood and reduced efficiency is observed in later adulthood (Kail, 1991; Salthouse, 1996). Processing speed has been found to mediate the age-related changes in inhibition, working memory, and switching from childhood to adulthood (Span et al., 2004), suggesting that it is responsible for developmental changes in executive function. In addition, processing speed has been shown to explain variance in inhibition and switching, but not updating, at the individual task level (Huizinga et al., 2006). Previous research suggests that dyslexia is associated with processing speed impairments compared to control participants and that processing speed is predictive of reading ability (Willcutt et al., 2005, 2010; Shanahan et al., 2006; McGrath et al., 2011; Peterson et al., 2016). Peng et al. (2013) found updating and inhibition impairments in dyslexia, yet when they controlled for general processing speed impairments, updating and inhibition impairments no longer reached significance. This is problematic because poor performance on executive function tasks in dyslexia could be a consequence of impaired processing speed. Not controlling for processing speed could therefore result in false-positive findings of executive impairments that reflect a general slowness as opposed to an executive impairment per se.
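One simple way to picture what controlling for processing speed means is to remove the part of an executive score that is linearly predictable from processing speed and to analyse what remains. The sketch below illustrates that idea only; it is not the procedure used in the studies cited, which typically enter processing speed as a covariate in the statistical model. The function name `residualise` and the toy values are hypothetical.

```python
from statistics import mean


def residualise(outcome, covariate):
    """Return the part of `outcome` not linearly explained by `covariate`.

    Fits a one-predictor least-squares line and returns the residuals.
    Assumes the covariate is not constant across participants.
    """
    mean_x, mean_y = mean(covariate), mean(outcome)
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, outcome))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, outcome)]


# Hypothetical data: executive task errors and a processing speed score per child.
executive_errors = [1.0, 3.0, 2.0]
processing_speed = [1.0, 2.0, 3.0]
adjusted_errors = residualise(executive_errors, processing_speed)
```

If a group difference in `executive_errors` disappears once `adjusted_errors` are compared instead, the apparent executive impairment may reflect general slowness, which is the concern raised by the Peng et al. (2013) result.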

Although we highlight the importance of controlling for processing speed when exploring whether executive function is impaired in dyslexia and of clinical relevance, this is not to say that processing speed is more important to account for than phonological processes in a predictive model of reading. Phonological impairments are consistently found in dyslexia (Swan and Goswami, 1997; Wimmer et al., 1998), and predict future reading ability (Mann, 1993). However, the focus of the present study is not to establish the predictive nature of executive function for reading ability beyond the contributions of phonological processing or processing speed. Rather, the focus of the present study is to establish the fine-grained profile of executive functions associated with dyslexia and which specific aspects of executive function support reading ability while controlling for the confounding influence of processing speed on executive function performance.

Most executive function profiling and predictive studies do not measure executive function in such a way that they can elucidate the core profile of executive functions associated with dyslexia while accounting for individual differences in processing speed. To address this, our study aims to profile and explore the predictive ability of core executive functions in dyslexia using Miyake and Friedman's (2012) 3-factor model and to employ sensitive measures of each construct whilst controlling for individual differences in processing speed. Tasks were deemed sensitive measures if they: (1) demonstrate significant loadings onto core executive function constructs within previous latent variable analysis studies; and (2) are underpinned by frontal brain activation. Within this study, multiple measures are employed for each executive construct (inhibition, updating, and switching) with different types of content (e.g., picture, phoneme, and alpha-numeric). Following from the work of Beneventi et al. (2010), which found phonemic updating impairments in dyslexia, these tasks were also carefully selected to allow for an exploration of phoneme-specific vs. general executive processing in dyslexia. However, a consideration of processing constraints imposed by phonemic content is beyond the scope of this paper. Although latent variable analysis is considered the most sensitive approach to measure core executive functions (Miyake and Friedman, 2012), it could not be conducted in this study due to sample size constraints. Instead, executive function z-mean composite scores were created for each construct, which provide cleaner measures by filtering out non-executive noise when sample size is constrained (Snyder et al., 2015). This study will include a homogeneous sample of participants with a clinical diagnosis of dyslexia and screen for elevated ADHD using the combined ADHD subscale of the Child Behaviour Checklist (Achenbach and Rescorla, 2001). Children scoring in the pre-clinical/clinical range on the Child Behaviour Checklist for their age and gender will be screened out of the dyslexia sample. By remedying some of the issues associated with executive function measurement in dyslexia, this study may shed light on the core executive function profile associated with dyslexia and whether this is clinically relevant for variability in reading.

Overall, there is difficulty in determining the core executive function profile of dyslexia and whether core executive functions are clinically relevant for predicting dyslexia likelihood and variance in reading ability. By using z-mean measures of each executive construct, this study aims to establish the core executive function profile (strengths/impairments in inhibition, updating, and switching) associated with dyslexia and determine which core executive functions are predictive of dyslexia likelihood and variance in reading ability while controlling for individual differences in processing speed. Exploring executive function in dyslexia using the 3-factor structure may also elucidate strengths and impairments, as well as potential trade-offs between executive functions which often manifest between inhibition and switching due to incompatibility of each demand (Goschke, 2000; Gruber and Goschke, 2004; Blackwell et al., 2014), thus allowing for the development of a more sensitive and specific executive function profile of dyslexia which cannot be captured by previous profiling approaches.

# METHODS AND MATERIALS

## Participants

Fifty-six participants aged 10–12 years were recruited to take part in this study: 27 participants (13 female, 14 male; mean age: 10.78 years) with developmental dyslexia, and 29 participants (12 female, 17 male; mean age: 10.93 years) with no clinical diagnosis, who served as a control group. Dyslexia diagnosis was confirmed with a copy of the psychological assessment report conducted by a clinical or educational psychologist. Two participants in the dyslexia group did not have a formal diagnosis of dyslexia but were enrolled on a dyslexia support workshop at their primary school. Initially, 31 participants with dyslexia were recruited; however, 4 were removed from the analysis due to scoring in the clinical range on the ADHD scale of the Child Behaviour Checklist (Achenbach and Rescorla, 2001). All participants were monolingual English speakers with normal or corrected vision and hearing. Participants had no additional diagnosis of a psychological or neurodevelopmental condition. Informed consent and assent were obtained from participating parents and children in written form. Ethical approval for this research project was granted by Dublin City University's Research Ethics Committee (DCUREC/2014/167) in accordance with the Declaration of Helsinki. Participants were recruited through the Dyslexia Association of Ireland and primary schools in Ireland.

## Procedure

The research study was carried out in the psychology laboratories in the School of Nursing and Human Sciences at Dublin City University. All participants were assessed individually in the presence of a parent or guardian. The testing session took ∼2 h to complete and a break was taken half way through. During the testing session children completed a battery of neuro-cognitive (executive function), reading and processing speed measures. The order of tasks was counterbalanced for each participant to control for fatigue effects. All neuro-cognitive measures were created with E-Prime Software and responses were recorded on a Cedrus RB-50 response pad.

## Measures

#### Processing Speed

Participants completed a computerized version of the coding task (Wechsler, 2003) as a measure of processing speed. On screen, participants viewed a row of letters with a row of numbers directly underneath while a letter was presented centrally. Participants were tasked with searching for the centrally presented letter on the letter row and pressing the number on the keypad which was directly underneath the letter. This task consisted of 30 trials and a practice block of 10 trials where feedback was given. The dependent measure in this task is the number of trials correctly completed after 30 s. Latent analyses of the coding task reveal that it loads highly onto a general processing speed factor (Keith et al., 2006; Watkins et al., 2006; Bodin et al., 2009). Although some authors find that this task is correlated with inhibition and predicts variance in working memory (Cepeda et al., 2013), confirmatory factor analytic studies suggest that it has higher loadings on a processing speed factor than on a working memory factor (Watkins et al., 2006; Bodin et al., 2009). Watkins et al. (2006) found that the loading of the coding task on a processing speed factor (0.70) far outweighed its loading on a working memory factor (−0.04).

#### Inhibition Measures

#### **Stroop task**

Participants completed the Stroop Task (Balota et al., 2010) as a measure of response inhibition. In this task participants were presented with four colour words (red, blue, green, yellow) and four non-colour words (poor, deep, legal, bad), which were presented on screen in varying ink colours (red, blue, green, yellow). In the first block (colour naming) participants had to press the button on the response pad corresponding to the ink colour of the word. In the second block (word naming) participants had to press the button on the response pad corresponding to the meaning of the word (e.g., press red for the word red only). A practice block of 16 trials was given before each experimental block. Experimental blocks consisted of 104 trials. Stimuli appeared on screen for 5,000 ms with an inter-stimulus fixation of 500 ms. Stroop interference effect scores for errors and reaction time were calculated by subtracting reaction time/errors on congruent trials from reaction time/errors on incongruent trials. The Stroop task loads significantly onto an inhibition latent variable (Miyake et al., 2000; Friedman and Miyake, 2004), and is underpinned by frontal brain activation (Bench et al., 1993; Collette et al., 2005).
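The interference score described above (incongruent minus congruent) can be sketched for a single participant as follows. This is a minimal illustration, not the authors' scoring script: the trial fields (`congruent`, `rt`, `correct`) are hypothetical, and taking mean reaction time over correct trials only is an assumption, since the text does not say how error trials were handled.

```python
from statistics import mean

def stroop_interference(trials):
    """Return (RT interference in ms, error-rate interference) for one participant.

    Each trial is a dict with hypothetical keys 'congruent' (bool), 'rt' (ms)
    and 'correct' (bool). Mean RT uses correct trials only (an assumption).
    """
    def summarise(is_congruent):
        subset = [t for t in trials if t["congruent"] == is_congruent]
        rts = [t["rt"] for t in subset if t["correct"]]
        error_rate = sum(not t["correct"] for t in subset) / len(subset)
        return mean(rts), error_rate

    con_rt, con_err = summarise(True)
    inc_rt, inc_err = summarise(False)
    # Interference = incongruent minus congruent, for RT and for errors
    return inc_rt - con_rt, inc_err - con_err

# Toy data, invented for illustration only
trials = [
    {"congruent": True, "rt": 600, "correct": True},
    {"congruent": True, "rt": 640, "correct": True},
    {"congruent": False, "rt": 720, "correct": True},
    {"congruent": False, "rt": 900, "correct": False},
]
rt_effect, err_effect = stroop_interference(trials)
```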

#### **Picture Go No-Go task**

Participants completed the picture Go No-Go task as a measure of inhibition. This task was a version of the Go No-Go task (Brocki and Bohlin, 2004; McAuley and White, 2011) adapted to include pictures of common objects from the Snodgrass and Vanderwart (1980) collection. Stimuli were chosen on the basis of having an age of acquisition below 8 years and a name agreement level of over 65% in children aged 5–6 years (Snodgrass and Vanderwart, 1980; Cycowicz et al., 1997). Participants viewed a sequence of object pictures which appeared centrally on screen and were required to press a button for all Go pictures (manmade objects) and to withhold their response for No-Go pictures (natural objects). The experimental block consisted of 100 trials (75 Go trials and 25 No-Go trials). A practice block of 20 trials with feedback was given prior to the experimental block. Stimuli appeared on screen for 2,000 ms with an inter-stimulus fixation for 1,000 ms. Stimuli were presented in the same pseudo-random order for each participant. The dependent measure on this task was the percentage of commission errors. The Go No-Go paradigm loads significantly onto an inhibitory control factor (Archibald and Kerns, 1999), and is underpinned by frontal brain activation (Casey et al., 1997; Booth et al., 2005).
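As an illustration of the dependent measure, the sketch below computes the percentage of commission errors for one participant. It assumes the percentage is taken relative to the number of No-Go trials (the text does not specify the denominator), and the trial fields `go` and `responded` are hypothetical names.

```python
def commission_error_pct(trials):
    """Percentage of No-Go trials on which a response was (wrongly) made."""
    no_go = [t for t in trials if not t["go"]]
    if not no_go:
        raise ValueError("no No-Go trials to score")
    commissions = sum(t["responded"] for t in no_go)
    return 100.0 * commissions / len(no_go)

# Toy block with the 75 Go / 25 No-Go split described above;
# the participant responds on 5 of the 25 No-Go trials (invented data).
trials = (
    [{"go": True, "responded": True}] * 75
    + [{"go": False, "responded": True}] * 5
    + [{"go": False, "responded": False}] * 20
)
pct = commission_error_pct(trials)
```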

#### **Phoneme Go No-Go task**

Participants completed the phoneme Go No-Go task as a measure of inhibition. This task was a version of the Go No-Go task (Brocki and Bohlin, 2004; McAuley and White, 2011) adapted to include phoneme-picture information. Stimuli were selected from the Snodgrass and Vanderwart (1980) collection on the basis of picture name being monosyllabic or bi-syllabic, having an age of acquisition below 8 years and a name agreement level of over 65% in children aged 5–6 years (Snodgrass and Vanderwart, 1980; Cycowicz et al., 1997). Participants viewed a sequence of pictures which appeared centrally on screen and were required to press a button for Go stimuli (pictures beginning with a consonant) and to withhold their response for No-Go stimuli (pictures beginning with a vowel). The experimental block consisted of 100 trials (75 Go trials and 25 No-Go trials). A practice block of 20 trials with feedback was given prior to the experimental block. Stimuli appeared on screen for 2,000 ms with an inter-stimulus fixation for 1,000 ms. Stimuli were presented in the same pseudo-random order for each participant. The dependent measure on this task was the percentage of commission errors. The Go No-Go paradigm loads significantly onto an inhibitory control factor (Archibald and Kerns, 1999), and is underpinned by frontal brain activation (Casey et al., 1997; Booth et al., 2005).

#### **Sustained attention to response task (SART)**

Participants completed the random SART task as a measure of inhibition (Robertson et al., 1997; Johnson et al., 2007). Participants viewed a random sequence of single digits (1–9) on screen and were instructed to respond to all digits (go trials) with a button press except 3 (no-go trial). The experimental block consisted of 225 trials. A practice block consisting of 18 trials with feedback was administered prior to the experimental block. Single digits (1–9) appeared on screen for 313 ms, followed by a response cue for 563 ms and a fixation cross for 563 ms. Participants were instructed to respond when the response cue was on screen. The dependent measure on this task was the percentage of commission errors. The random SART places demands on inhibition (Johnson et al., 2007), is similar in procedure to the Go No-Go task, which loads significantly onto an inhibitory control factor (Archibald and Kerns, 1999), and is underpinned by frontal brain activation (Fassbender et al., 2004).

#### **Inhibition composite**

Inhibition Z-mean composite scores were calculated to provide a cleaner measure of inhibition by filtering out non-executive noise and to increase power due to sample size constraints (Snyder et al., 2015). Z-mean composite scores were created rather than Z-scores to account for the influence of number of tasks contributing to the composite score. Z-scores for errors and reaction time from the Picture Go No-Go, Phoneme Go No-Go, SART, and Stroop task were combined to create inhibition composite scores as follows:

Error composite:

$$\left(\frac{\text{ZPicGNGComm} + \text{ZPhonGNGComm} + \text{ZSARTComm} + \text{ZStroopError}}{4}\right)$$

Reaction time composite:

$$\left(\frac{\text{ZPicGNGRT} + \text{ZPhonGNGRT} + \text{ZSARTRT} + \text{ZStroopRT}}{4}\right)$$
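The composite calculation can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' procedure: it standardises each task across the whole sample using the sample standard deviation (the text does not say which reference group or SD was used), and the task labels and scores are invented.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardise raw scores across the sample (sample SD, n - 1)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]

def z_mean_composite(task_scores):
    """Average each participant's z-scores across tasks.

    task_scores maps a task label to a list of raw scores, one per
    participant, in the same participant order for every task.
    """
    standardised = [z_scores(scores) for scores in task_scores.values()]
    # zip(*...) regroups the per-task lists into per-participant tuples
    return [mean(per_participant) for per_participant in zip(*standardised)]

# Toy error scores for three participants on four tasks (invented data)
errors = {
    "PicGNG": [10.0, 20.0, 30.0],
    "PhonGNG": [5.0, 15.0, 25.0],
    "SART": [40.0, 50.0, 60.0],
    "Stroop": [1.0, 2.0, 3.0],
}
composite = z_mean_composite(errors)
```

Dividing by the number of contributing tasks keeps composites on a comparable scale when constructs are measured with different numbers of tasks (four for inhibition, three for updating, two for switching).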

#### Updating Measures

#### **Letter 2-back task**

Participants completed the letter 2-back (Kane et al., 2007) task as a measure of updating working memory. Participants viewed a continuous stream of letters presented centrally on screen and were required to decide if the current letter on screen matched the letter presented 2 times ago. If the letters matched participants were instructed to press the green button on the response pad and if the letters did not match participants were instructed to press the red button on the response pad. The experimental block consisted of 96 trials. Stimuli were presented on screen for 1,000 ms with an inter-stimulus fixation for 100 ms. Participants completed a practice block of 7 trials with feedback given prior to the experimental block. The dependent measure in this task is the percentage errors. The 2-back task loads on to a working memory updating factor (Wilhelm et al., 2013), and is underpinned by frontal brain activation (Owen et al., 2005).
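To make the 2-back decision rule concrete, the sketch below scores one participant's responses. It is a minimal illustration with invented stimuli; leaving the first two trials unscored (they have no item two positions back) is an assumption about scoring, not something stated in the text.

```python
def two_back_error_pct(stimuli, responses):
    """Percentage of scorable trials answered incorrectly on a 2-back task.

    responses[i] is True for a 'match' press and False for a 'no match' press.
    The first two trials are not scored (assumption).
    """
    if len(stimuli) < 3:
        raise ValueError("need at least three trials to score a 2-back task")
    errors = 0
    scored = 0
    for i in range(2, len(stimuli)):
        is_match = stimuli[i] == stimuli[i - 2]  # compare with item two back
        errors += responses[i] != is_match
        scored += 1
    return 100.0 * errors / scored

# Toy sequence: the participant misses the matches at positions 5 and 8
stimuli = list("BKBKTKLKLM")
responses = [False, False, True, True, False, False, False, True, False, False]
pct = two_back_error_pct(stimuli, responses)
```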

#### **Picture 2-back task**

Participants completed the picture 2-back task as a measure of updating. This task was modified (Beneventi et al., 2010) to include basic visual information. Stimuli were selected from the Snodgrass and Vanderwart (1980) collection on the basis of having an age of acquisition below 8 years and a name agreement level of over 65% in children aged 5–6 years (Snodgrass and Vanderwart, 1980; Cycowicz et al., 1997). Participants were presented with a continuous stream of pictures appearing centrally on screen and were required to decide if the current picture on screen matched the picture that was on screen 2 times ago. If the pictures matched, participants were instructed to press the green button on the response pad and if pictures did not match participants were instructed to press the red button on the response pad. The experimental block consisted of 100 trials (33 of which were target matches). Participants completed a practice block of 20 trials with feedback prior to the experimental block. Stimuli appeared on screen for 1,000 ms with an inter-stimulus fixation for 1,500 ms. The dependent measure in this task is the percentage errors. The 2-back task loads on to a working memory updating factor (Wilhelm et al., 2013) and is underpinned by frontal brain activation (Owen et al., 2005; Beneventi et al., 2010).

#### **Phoneme 2-back task**

Participants completed the phoneme 2-back task as a measure of updating. This task was a modified version of the phoneme updating task used by Beneventi et al. (2010). This task was adapted for English speaking participants and only the first phoneme 2-back condition is used in the current study. Stimuli were selected from the Snodgrass and Vanderwart (1980) collection on the basis of picture name being monosyllabic or bi-syllabic, having an age of acquisition below 8 years and a name agreement level of over 65% in children aged 5–6 years (Snodgrass and Vanderwart, 1980; Cycowicz et al., 1997). Participants viewed a continuous sequence of pictures presented centrally on screen and were required to decide if the first phoneme of the current picture on screen matched the first phoneme of the picture presented on screen two times ago. If the phonemes matched participants were instructed to press the green button on the response pad and if phonemes did not match participants were instructed to press the red button on the response pad. The experimental block consisted of 100 trials (33 of which were target matches). Participants completed a practice block of 20 trials with feedback prior to the experimental block. Stimuli appeared on screen for 1,000 ms with an inter-stimulus fixation for 1,500 ms. The dependent measure in this task is the percentage errors. The 2-back task loads on to a working memory updating factor (Wilhelm et al., 2013) and is underpinned by frontal brain activation (Owen et al., 2005; Beneventi et al., 2010).

#### **Updating composite**

Updating Z-mean composite scores were calculated to provide a cleaner measure of updating by filtering out any non-executive noise and to increase power due to sample size constraints (Snyder et al., 2015). Z-mean composite scores were created rather than Z-scores to account for the influence of number of tasks contributing to the composite score. Z-scores for errors and reaction times for the Picture 2-back, Phoneme 2-back, and Letter 2-back tasks were combined to create updating error and reaction time composite scores, each being the mean of the three task Z-scores.

#### Switching Measures

#### **Number-letter switch task**

Participants completed the number-letter switch task as a measure of switching ability. An adapted version of the number-letter task (Rogers and Monsell, 1995; Miyake et al., 2000) was used in which the switch is cued by the colour of the stimuli instead of their location. Participants were presented with different number-letter pairs (e.g., 2A) centrally on screen and were required to respond to the number if the pair was green or to the letter if the pair was red. If the number-letter pair appeared in red, participants had to focus on the letter and decide if it was a consonant or a vowel. If the number-letter pair appeared in green, participants had to focus on the number and decide if it was even or odd. In the first block of 20 trials the number-letter pair only appeared in red. In the second block of 20 trials the number-letter pair only appeared in green. In the third block of 116 trials the number-letter pair changed between red and green and participants were required to switch between processing the number or the letter; a switch occurred on every 4th trial. Participants completed a practice block of 12 trials with feedback prior to each experimental block. Stimuli appeared on screen for 5,000 ms with an inter-stimulus fixation for 150 ms. The switch cost in errors and reaction time for this task is the difference between trials that required a switch and trials which required no switch. The number-letter switch task loads onto a switching construct (Miyake et al., 2000; Collette et al., 2005), and is underpinned by frontal brain activation (Collette et al., 2005).
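The switch cost can be illustrated with a short sketch. It assumes a trial counts as a switch when its task differs from the task on the preceding trial and that the cost is switch minus no-switch; the trial representation and the reaction times are invented for illustration.

```python
from statistics import mean

def switch_cost(trials):
    """Mean RT on switch trials minus mean RT on repeat (no-switch) trials.

    Each trial is a (task, rt_ms) tuple. A trial is a switch when its task
    differs from the previous trial's task; the first trial has no
    predecessor and is left out.
    """
    switch_rts, repeat_rts = [], []
    for (prev_task, _), (task, rt) in zip(trials, trials[1:]):
        (switch_rts if task != prev_task else repeat_rts).append(rt)
    return mean(switch_rts) - mean(repeat_rts)

# Toy mixed block with a task switch on every 4th trial (invented RTs)
trials = [
    ("letter", 800), ("letter", 700), ("letter", 720), ("letter", 710),
    ("number", 950), ("number", 740), ("number", 730), ("number", 720),
    ("letter", 930),
]
cost = switch_cost(trials)
```

The same function applies to error costs if each trial carries an error indicator (0 or 1) in place of a reaction time.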

#### **Phoneme switch task**

Participants completed the phoneme switch task as a measure of switching ability. The number-letter task procedure (Rogers and Monsell, 1995; Miyake et al., 2000) was adapted to contain phoneme information. Stimuli for this task were pictures of common objects selected from the Snodgrass and Vanderwart (1980) collection on the basis of picture name being monosyllabic or bi-syllabic, having an age of acquisition below 8 years and a name agreement level of over 65% in children aged 5–6 years (Snodgrass and Vanderwart, 1980; Cycowicz et al., 1997). Participants viewed a varying number of pictures (e.g., 2 apples, 1 star, 3 balloons) on screen in light (light red, green, or blue) or dark colours (dark red, green, or blue). Participants were required to do one of two things depending on the first phoneme (letter sound) of the pictures. If the first phoneme was a consonant sound, participants had to decide if the pictures were light or dark in colour. If the first phoneme was a vowel sound, participants had to decide if the number of pictures was even or odd. In the first block of 20 trials only first phoneme consonant pictures appeared on screen. In the second block of 20 trials only first phoneme vowel pictures appeared on screen. In the third block of 116 trials the pictures changed between first phoneme consonant and vowel, and participants were required to switch between processing number or colour; a switch occurred on every 4th trial. Participants completed a practice block of 12 trials with feedback prior to each experimental block. Stimuli appeared on screen for 5,000 ms with an inter-stimulus fixation for 150 ms. The switch cost in errors and reaction time for this task is the performance difference between trials that required a switch and trials which required no switch. A similar task, the number-letter switch task, loads onto a switching construct (Miyake et al., 2000; Collette et al., 2005), and is underpinned by frontal brain activation (Collette et al., 2005).

#### **Switching composite**

Switching Z-mean composite scores were calculated to provide a cleaner measure of switching by filtering out any non-executive noise and to increase power due to sample size constraints (Snyder et al., 2015). Z-mean composite scores were created rather than Z-scores to account for the influence of number of tasks contributing to the composite score. Z-scores for errors and reaction time were combined from the Number-Letter and Phoneme switch tasks to create switching composite scores expressed as:

Error composite:

$$\left(\frac{\text{ZNumLetSwitchErrorCost} + \text{ZPhonSwitchErrorCost}}{2}\right)$$

Reaction time composite:

$$\left(\frac{\text{ZNumLetSwitchRTCost} + \text{ZPhonSwitchRTCost}}{2}\right)$$

#### Reading

#### **Reading ability**

Participants completed the Green word reading list from the Wide Range Achievement Test 4 (WRAT-4) (Wilkinson and Robertson, 2006) as a measure of reading ability. The word reading subtest from the WRAT-4 requires participants to read from a list of 55 items increasing in difficulty. The assessment was discontinued if participants made 10 consecutive errors. The WRAT-4 word reading subtest demonstrates good test-retest reliability (subtest = 0.86) and consistency (subtest = 0.87) (Wilkinson and Robertson, 2006).

# RESULTS

#### Data Analysis

To explore the executive function profile associated with dyslexia, ANOVA and ANCOVA analyses were performed. To explore the predictive ability of executive function z-mean composite scores for dyslexia diagnosis, logistic regression and receiver operating characteristic (ROC) curve analyses were performed. To explore whether executive function z-mean composites are predictive of variance in reading, multiple linear regression analysis was performed. Preliminary analyses were conducted to ensure that variables did not violate the assumptions of normality, homogeneity of variance, homogeneity of regression slopes, independence of errors, multicollinearity, linearity, and linearity of logit. The Stroop interference effect in errors and reaction time on the Picture 2-back task violated the assumption of normality, so appropriate non-parametric analyses were employed for these variables. All assumptions were met for the executive function z-mean composite scores for linear and logistic regression analyses.

#### Descriptive Statistics

Descriptive statistics and between group comparisons for the dyslexia and control group are summarised in **Table 3**.

#### Executive Function Profile Associated With Dyslexia

#### **Inhibition**

Results from separate 2 (group: dyslexia, control) x 1 (inhibition measure: composite, task-level) ANOVAs indicate that dyslexia is associated with a significant inhibition impairment. At the composite level, dyslexia is characterized by a significantly higher inhibition z-mean error score than control participants [F(1, 53) = 13.85, p < 0.001, Cohen's d = 1.01; **Figure 1**]. At the individual task-level, dyslexia is associated with significantly more commission errors than control participants during the Picture Go No-Go [F(1, 53) = 12.75, p = 0.001, Cohen's d = 0.94], Phoneme Go No-Go [F(1, 54) = 5.19, p = 0.027, Cohen's d = 0.61], and SART tasks [F(1, 54) = 6.56, p = 0.013, Cohen's d = 0.68].

Dyslexia is also associated with a processing speed impairment [F(1, 54) = 4.84, p < 0.05, Cohen's d = 0.60; **Figure 2**]. After controlling for individual variation in processing speed, significant differences in the inhibition z-mean error score [F(1, 52) = 9.29, p = 0.004], commission errors during the Picture Go No-Go task [F(1, 52) = 8.50, p = 0.005] and commission errors during the SART task [F(1, 53) = 5.91, p = 0.019] remain. Group differences in commission errors on the Phoneme Go No-Go [F(1, 53) = 2.59, p = 0.114] task are no longer significant after controlling for individual variation in processing speed.

No significant group differences were observed for the inhibition z-mean reaction time composite score, Stroop task (Stroop effect in reaction time or error), or reaction time during the Picture Go No-Go, Phoneme Go No-Go and SART tasks, before or after controlling for individual variation in processing speed (see **Table 3**).
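The reported effect sizes can be sanity-checked against the F statistics. For two independent groups, F(1, df) equals t², and a pooled-SD Cohen's d equals t·√(1/n₁ + 1/n₂). The sketch below applies this to two of the task-level results, assuming group sizes of 27 and 29 (from the Participants section) and assuming a pooled-SD d was used; the text does not state how d was computed.

```python
import math

def cohens_d_from_f(f_value, n1, n2):
    """Magnitude of the pooled-SD Cohen's d implied by a two-group F(1, df)."""
    t = math.sqrt(f_value)  # F(1, df) = t**2 when there are two groups
    return t * math.sqrt(1 / n1 + 1 / n2)

# F values taken from the text; group sizes are an assumption (27 and 29)
d_phoneme_gng = cohens_d_from_f(5.19, 27, 29)
d_sart = cohens_d_from_f(6.56, 27, 29)
```

Under these assumptions the two values come out at roughly 0.61 and 0.68, in line with the reported effect sizes for the Phoneme Go No-Go and SART tasks.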

#### **Updating**

Results from separate 2 (group: dyslexia, control) x 1 (updating measure: composite, task-level) ANOVAs indicate that dyslexia is associated with a significant updating impairment. At the composite level, dyslexia is characterized by a significantly higher updating z-mean error score than control participants [F(1, 54) = 19.14, p = 0.004, Cohen's d = 0.86; **Figure 3**]. At the individual task-level, dyslexia is associated with significantly more errors during the Letter 2-back [F(1, 54) = 19.14, p < 0.001, Cohen's d = 1.20] and Picture 2-back [F(1, 54) = 7.72, p = 0.007, Cohen's d = 0.47] tasks.

After controlling for individual variation in processing speed, significant differences for the updating z-mean error composite score [F(1, 53) = 5.68, p = 0.021] and errors during the Letter 2-back task [F(1, 53) = 15.41, p < 0.001] remained. Group differences in errors on the Picture 2-back task are no longer significant [F(1, 53) = 3.88, p = 0.054] after controlling for individual variation in processing speed.

No significant differences were observed for the updating z-mean reaction time composite score, Phoneme 2-back task (error rate or reaction time), or reaction time during the Letter 2-back and Picture 2-back tasks, before or after controlling for individual variation in processing speed (see **Table 3**).

#### **Switching**

Results from separate 2 (group: dyslexia, control) x 1 (switching measure: composite, task-level) ANOVAs indicate that dyslexia is associated with a switching strength. At the composite level, dyslexia is characterized by a significantly lower switching z-mean reaction time cost score than control participants [F(1, 54) = 5.03, p = 0.029, Cohen's d = −0.60].

After controlling for individual variation in processing speed, group differences in the switching z-mean reaction time cost score are no longer significant [F(1, 53) = 2.55, p = 0.116].

No significant differences were observed for the switching z-mean error composite score (see **Figure 4**), Number-Letter Switch task (reaction time cost or error cost) or the Phoneme Switching task (reaction time cost or error cost), before or after controlling for individual variation in processing speed (see **Table 3**).

*Stroop RT, Stroop effect in reaction time; Stroop err, Stroop effect in error; GNG, GoNoGo; Comm, commission errors; Comp, composite score; RT, reaction time; Num-Let SW error, Number-Letter switch cost in errors; Num-Let SW RT, Number-Letter switch cost in reaction time; Phon SW err, Phoneme switch cost in errors; Phon SW RT, phoneme switch cost in reaction time; Proc. Speed, processing speed. \*P < 0.05, \*\*P < 0.01.*

#### Predicting Dyslexia Likelihood

Results from the binary logistic regression are summarised in **Table 4**. At step 1, processing speed only was entered into the model to control for its influence on executive function. At step 2, in addition to processing speed, inhibition, updating and switching z-mean error composite scores were entered into the model to reflect the pattern of impaired and unimpaired processes associated with dyslexia.

FIGURE 3 | Updating Z-mean error composite scores for dyslexia and control participants.

Step 1 (processing speed) demonstrated a trend for predicting dyslexia; the chi-square [χ<sup>2</sup>(1) = 5.29, p = 0.032] and −2 Log Likelihood (70.94) statistics demonstrate good model fit. Model 1 correctly classified 65.5% of participants according to presence/absence of dyslexia diagnosis: sensitivity 59.3% (true-positive) and specificity 71.4% (true-negative).
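As a consistency check on these fit statistics, the Cox & Snell and Nagelkerke pseudo-R² values can be recomputed from the model chi-square and the −2 Log Likelihood using their standard definitions. The sketch below does this for both steps. The sample size N = 55 is an assumption inferred from the reported classification percentages (59.3% ≈ 16/27 and 71.4% ≈ 20/28); it is not stated in the text.

```python
import math

def pseudo_r2(model_chi2, neg2ll_model, n):
    """Return (Cox & Snell, Nagelkerke) pseudo-R^2 for a logistic model.

    model_chi2   : drop in -2LL from the intercept-only model to this model
    neg2ll_model : -2 Log Likelihood of this model
    n            : number of cases (assumed, see the lead-in)
    """
    neg2ll_null = neg2ll_model + model_chi2
    cox_snell = 1 - math.exp(-model_chi2 / n)
    # Upper bound of Cox & Snell, used by Nagelkerke to rescale to 0..1
    max_cox_snell = 1 - math.exp(-neg2ll_null / n)
    return cox_snell, cox_snell / max_cox_snell

# Chi-square and -2LL values as reported for steps 1 and 2
step1 = pseudo_r2(model_chi2=5.29, neg2ll_model=70.94, n=55)
step2 = pseudo_r2(model_chi2=20.77, neg2ll_model=55.45, n=55)
```

Under this assumption the step 1 values come out near 0.092 and 0.122 and the step 2 values near 0.315 and 0.42, consistent with the values given in the note to Table 4.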

The addition of the inhibition, updating and switching composite scores at step 2 significantly improved model fit [step χ<sup>2</sup>(3) = 15.49, p = 0.001; −2 Log Likelihood: 55.45; R<sup>2</sup><sub>CS</sub> = 0.315; R<sup>2</sup><sub>N</sub> = 0.42]. This model correctly classified 78.2% of participants according to presence/absence of dyslexia diagnosis: sensitivity 81.5% (true-positive) and specificity 75% (true-negative). As outlined in **Table 4**, this model suggests that when accounting for low-level processing speed, only inhibition [Wald: χ<sup>2</sup>(1) = 7.06, p = 0.008] and updating composite scores [Wald: χ<sup>2</sup>(1) = 5.17, p = 0.023] predict dyslexia. The b-values reflect that for every one-unit change in inhibition score (errors) there is a corresponding 1.83-unit change in the logit of the outcome variable, while for every one-unit change in updating score (errors) there is a 1.28-unit change in the logit of the outcome variable. The proportionate odds values [Exp (B)] are >1 for both predictors, suggesting that as the error score on each predictor increases, the likelihood of the outcome occurring (dyslexia diagnosis) increases.

TABLE 4 | Binary logistic regression with executive function error composite scores.

Step 1: *R*<sup>2</sup> = 0.092 (Cox & Snell), 0.122 (Nagelkerke), Model χ<sup>2</sup>(1) = 5.29, *p* < 0.05. Step 2: *R*<sup>2</sup> = 0.315 (Cox & Snell), 0.42 (Nagelkerke), Model χ<sup>2</sup>(4) = 20.77, *p* < 0.001. Hosmer and Lemeshow: (Step 1) χ<sup>2</sup>(6) = 10.13, *p* = 0.119; (Step 2) χ<sup>2</sup>(7) = 8.19, *p* = 0.316, indicating good model fit. \**P* < 0.05, \*\**P* < 0.01.
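The link between a b-value and its odds ratio, Exp(B) = e^b, can be made explicit with a short calculation. The sketch below uses the two b-values quoted in the text for inhibition and updating; the helper name is hypothetical.

```python
import math

def odds_ratio(b):
    """Multiplicative change in the odds of the outcome per one-unit
    increase in the predictor, given its logistic regression coefficient."""
    return math.exp(b)

# b-values as quoted in the text for the step 2 model
or_inhibition = odds_ratio(1.83)
or_updating = odds_ratio(1.28)
```

On this reading, a one-unit increase in the inhibition error composite multiplies the odds of a dyslexia diagnosis by about 6.2, and a one-unit increase in the updating error composite by about 3.6, holding the other predictors constant.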

ROC curve analysis (see **Figure 5**) indicates that the executive function predictive model (inhibition and updating) is a good fit, with an area under the curve (AUC) of 0.835 (95% CI: 0.727–0.942, p < 0.001). A randomly selected participant with dyslexia will have a higher error rate on the inhibition and updating composites than a randomly selected control participant approximately 83.5% of the time. According to Swets' (1988) criteria for diagnostic accuracy (poor: 0.5–0.7, moderate: 0.7–0.9, high: 0.9–1.0), the inhibition and updating composites demonstrate moderate accuracy in predicting dyslexia diagnosis.
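The probabilistic reading of the AUC given above can be illustrated directly: the AUC equals the proportion of (dyslexia, control) pairs in which the participant with dyslexia has the higher score, with ties counted as half. The sketch below uses invented composite scores and is not a reconstruction of the study data.

```python
def auc(positive_scores, negative_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Toy error composites (invented): higher means more errors
dyslexia = [0.9, 0.4, 0.7, 0.2]
control = [0.1, 0.3, 0.5, -0.2]
area = auc(dyslexia, control)
```

In this toy example 13 of the 16 pairs are ordered correctly, giving an AUC of 0.8125.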

#### Predicting Reading Ability

Hierarchical multiple linear regression is explored here, with processing speed entered at step 1 and inhibition, updating, and switching error composite scores entered at step 2, to address whether core executive functions are predictive of reading ability (see **Table 5** for results). Hierarchical multiple linear regression was also explored within the dyslexia group alone and the control group alone (see **Tables 6**, **7** for a summary of results). It should be noted that overall each of these models was non-significant (dyslexia: R<sup>2</sup> = 0.275, p = 0.07; control: R<sup>2</sup> = 0.287, p = 0.09). However, exploring the predictive relationship between cognitive processes and behavioural outcomes separately in clinical and non-clinical groups has recently been criticized as it does not include the full dimension of variability from typical to atypical (Cuthbert, 2014). For this reason, we sought to explore the predictive relationship between core executive functions and reading ability across groups, to understand how it relates to the full dimension of reading ability.

TABLE 5 | Linear regression model with executive function error composites predicting reading ability across groups.

Step 1: *R*<sup>2</sup> = 0.114; Step 2: *R*<sup>2</sup> = 0.459. \**p* < 0.05, \*\**p* < 0.01.

Step 1 (processing speed): processing speed significantly predicted 11.4% of the variance in reading ability across groups. Step 2 (processing speed and executive function): adding executive function composite scores to the model significantly improved the predictive ability (45.9%) and explained an additional 34.5% of the variance in reading ability [R<sup>2</sup> change = 0.345, F(3, 54) = 25.98, p < 0.001]. As outlined in **Table 5**, the results suggest that, after controlling for processing speed abilities, inhibition and updating significantly predict reading ability. Beta values for inhibition and updating reflect a 0.527 and 0.307 decrease in reading ability score, respectively, for every 1 SD increase in executive function composite error. This suggests that inhibition and updating can predict variance in reading abilities across a trajectory from typical to atypical reading.
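The two-step logic described above (fit processing speed alone, then add the executive function composites and look at the change in R<sup>2</sup>) can be sketched as follows. This is a minimal illustration on synthetic data, not a reanalysis: the data-generating weights, sample size, variable names, and helper functions are all assumptions, and a statistics package would normally be used instead of the hand-written least-squares solver.

```python
import random

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic data for illustration only; none of these numbers come from the study.
rng = random.Random(0)
n = 60
speed = [rng.gauss(0, 1) for _ in range(n)]
inhibition = [rng.gauss(0, 1) for _ in range(n)]
updating = [rng.gauss(0, 1) for _ in range(n)]
switching = [rng.gauss(0, 1) for _ in range(n)]
reading = [0.3 * s - 0.5 * i - 0.3 * u + rng.gauss(0, 0.8)
           for s, i, u in zip(speed, inhibition, updating)]

r2_step1 = r_squared([speed], reading)                                   # step 1
r2_step2 = r_squared([speed, inhibition, updating, switching], reading)  # step 2
r2_change = r2_step2 - r2_step1
print(f"step 1 R^2 = {r2_step1:.3f}, step 2 R^2 = {r2_step2:.3f}, change = {r2_change:.3f}")
```

Because the step 1 model is nested in the step 2 model, R<sup>2</sup> cannot decrease at step 2; the question the hierarchical test answers is whether the increase is larger than would be expected by chance.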

TABLE 6 | Linear regression model with executive function error composites predicting reading ability in the dyslexia group alone.


Step 1: *R <sup>2</sup>* = *0.008;* Step 2: *R <sup>2</sup>* = *0.275.* \**p* < *0.05,* \*\**p* < *0.01.*

TABLE 7 | Linear regression model with executive function error composites predicting reading ability in the control group alone.


Step 1: *R <sup>2</sup>* = *0.099;* Step 2: *R <sup>2</sup>* = *0.287.* \**p* < *0.05,* \*\**p* < *0.01.*

#### Summary of Results

Dyslexia is associated with inhibition and updating impairments while controlling for individual variation in processing speed impairments. Inhibition and updating are clinically relevant for predicting dyslexia likelihood and reading ability while controlling for individual variation in processing speed.

# DISCUSSION

From previous research, the core executive function profile (strengths and impairments in inhibition, updating, and switching) associated with dyslexia alone is unclear. Inconsistent impairments are found across a range of executive measures in dyslexia. In addition, there are inconsistencies regarding which exact aspects of executive function are predictive of dyslexia likelihood and reading ability. Potential reasons for inconsistent findings across the literature include discrepancies with group classification, theoretical approach to profiling, task impurity issues, and a lack of control for the confounding influence of processing speed on executive function. These issues make it increasingly difficult to infer the core executive function profile associated with dyslexia and whether core executive functions are clinically relevant for predicting dyslexia diagnosis and variance in reading ability. This study contributed to existing literature on executive functions in dyslexia by employing sensitive measures of each core executive construct (z-mean composites) within the 3-factor model of executive function (Miyake et al., 2000; Miyake and Friedman, 2012; Snyder et al., 2015) while controlling for individual variation in processing speed, in a homogenous sample of children with dyslexia (clinical diagnosis, screened for elevated ADHD with a standardised measure). Findings suggest that dyslexia is associated with inhibition and updating, but not switching impairments at the z-mean composite level, whilst controlling for individual variation in processing speed. Inhibition and updating, but not switching, were also predictive of dyslexia likelihood and reading ability, whilst controlling for individual variation in processing speed.

The model for predicting dyslexia likelihood developed in this study demonstrated that inhibition and updating error composites significantly predict likelihood of dyslexia (sensitivity: 81.5%, specificity: 75%) with moderate diagnostic accuracy (AUC = 0.835) according to Swets' (1988) criteria (poor: 0.5–0.7; moderate: 0.7–0.9; high: 0.9–1.0). This AUC suggests that a randomly selected participant with dyslexia will have a higher error rate on inhibition and updating z-mean error composites than a randomly selected control participant approximately 83.5% of the time. These findings suggest that inhibition and updating abilities not only differ between the dyslexia and control groups but can also classify individual participants as dyslexia or control.
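As a reminder of how sensitivity, specificity, and overall classification accuracy relate to one another, the sketch below derives all three from a 2 × 2 classification table. The counts are hypothetical: they were chosen only because they happen to reproduce the percentages reported here, and the actual classification table is not given in this section.

```python
def classification_summary(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: dyslexia participants classified as dyslexia
    fn: dyslexia participants classified as control
    tn: control participants classified as control
    fp: control participants classified as dyslexia
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts (assumption, not taken from the study).
sens, spec, acc = classification_summary(tp=22, fn=5, tn=21, fp=7)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}, accuracy = {acc:.1%}")
```

Overall accuracy is a weighted average of sensitivity and specificity, with weights given by the group sizes, so it always falls between the two.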

The predictive ability of inhibition and updating for dyslexia likelihood found in this study is consistent with the work of Booth et al. (2014), which found that a model including inhibition and working memory abilities predicts dyslexia likelihood. Booth et al. (2014) found that a model including a non-verbal working memory task and an inhibition composite score (comprising Stroop task and Number-Detection task performance) correctly classified 78% of participants according to absence/presence of dyslexia (sensitivity: 86%; specificity: 65%). However, in their model only the inhibition composite score discriminated between dyslexia and control participants (Booth et al., 2014). Although our findings are similar to those of Booth et al. (2014), their study did not include measures of switching or control for processing speed, and it included 9 dyslexia participants with elevated ADHD. The findings from our dyslexia predictive model are inconsistent with the work of Moura et al. (2015), which found that switching abilities predict dyslexia likelihood. Moura et al. (2015) found that switching, as measured with the Trail Making Task, significantly predicts dyslexia likelihood with moderate diagnostic accuracy (0.73), and correctly classifies 71.7% of participants according to absence/presence of dyslexia (sensitivity: 69.4%; specificity: 74%). However, their study did not use a screening tool to remove potential undetected ADHD, did not include measures of the other core executive functions (inhibition and updating), and did not control for processing speed. The model developed in the present study demonstrates higher diagnostic accuracy (0.835) and correctly classifies a higher proportion of participants (78.2%) than both previous studies (Booth et al., 2014; Moura et al., 2015). To our knowledge, the present study is the first to explore the ability of all three core executive functions for predicting dyslexia likelihood while controlling for processing speed. 
Although our model found that processing speed predicts dyslexia likelihood, after the executive function error composites were included, the predictive relationship between processing speed and dyslexia likelihood was no longer significant. The only core executive functions predictive of dyslexia likelihood were inhibition and updating, suggesting that these abilities can discriminate dyslexia from control participants.

Inhibition and updating composites also significantly predicted reading ability. The model for predicting reading ability developed in this study explained 45.9% of the variance in reading ability. The initial model including only processing speed at step 1 demonstrated a trend for predicting variance in reading ability (11.4%); however, executive function z-mean error composites significantly improved the model's predictive ability, explaining an additional 34.5% of variance in reading ability. Processing speed was no longer significant after executive functions were entered, and the model suggested that inhibition and updating were the only significant core executive predictors of reading. The relationship was such that those with higher errors on inhibition and updating z-mean composites had significantly poorer reading ability.

The predictive relationship of inhibition and updating for reading ability is consistent with previous work finding that inhibition and working memory combined are predictive of reading in typical samples (Welsh et al., 2010; Arrington et al., 2014) and that working memory and inhibition are predictive of the severity of reading impairment expressed in dyslexia (Wang and Yang, 2015). Arrington et al. (2014) found that working memory (Digit Span Backward task) and response inhibition (Stop Signal task) predicted word reading ability. However, Wang and Yang (2015) found that working memory (sentence span) and a cognitive inhibition composite score (comprised of Stroop and Group Embedded Figures task performance), but not a behavioural inhibition composite score (comprised of Go No-Go and Stop signal task performance), predicted reading ability in dyslexia. The findings of the present study are more similar to those of Arrington et al. (2014), as our inhibition composite predicting reading ability was more heavily weighted on response inhibition (Picture Go No-Go, Phoneme Go No-Go, and SART tasks) than cognitive inhibition (Stroop task) which did not differentiate participants at the task level. However, both studies did not include measures of switching and updating, or control for the influence of processing speed on executive performance. Christopher et al. (2012) found that working memory (sentence span, digit span, counting span) and processing speed (perceptual speed, identical pictures), but not inhibition (continuous performance, stop signal tasks) latent factors predict reading ability. The predictive relationship between processing speed and reading is also found in other studies (McGrath et al., 2011; Peterson et al., 2016). Yet, our findings suggest that after including core executive functions in the reading model processing speed is no longer a significant predictor, while inhibition and updating are the only significant predictors of reading ability.

Poor performance on inhibition and updating composites was associated with poor reading ability. This suggests that the core executive functions of inhibition and updating support word reading ability and, when disrupted, as is the case in dyslexia, contribute to reading impairment. As previously discussed, efficient reading requires the coordination of multiple processes such as focusing attention on visual information, decoding visual information into speech sounds, maintaining and updating speech sounds in working memory, combining speech sounds, matching combinations of speech sounds with stored words, deriving semantic meaning for comprehension, and moving on to the next word to start this process again. Our findings suggest that inhibition can contribute to this process: children with dyslexia experience inhibition difficulties, which may result in difficulty suppressing irrelevant information and protecting the contents of working memory. As suggested by Arrington et al. (2014), response inhibition in particular may be an important gating function preventing activation of similar words with spelling-sound mappings in working memory. This constraint may also result in further difficulty gating external information, such as classroom noise, from working memory while reading.

Children with dyslexia also experience working memory updating impairments; this difficulty may contribute to reading difficulty through an inability or reduced capacity to hold and update speech sounds in working memory during ongoing decoding. Switching was unimpaired in dyslexia and did not predict reading; children with dyslexia therefore do not appear to struggle with rapid alternation between different demands.

The findings from the present study suggest that dyslexia is associated with impaired inhibition and updating while controlling for processing speed. These findings are consistent with previous research documenting impaired inhibition (Helland and Asbjørnsen, 2000; Willcutt et al., 2005, 2007; de Jong et al., 2009; De Lima et al., 2012; Booth et al., 2014; Wang and Yang, 2015), impaired updating/working memory (Beneventi et al., 2010; Booth et al., 2014; Wang and Yang, 2015), and unimpaired switching in dyslexia (Reiter et al., 2005; Willcutt et al., 2005; Bental and Tirosh, 2007; Menghini et al., 2010; Moura et al., 2015). However, all of these studies explored group differences at the individual task level and not at the composite level. To our knowledge this is the first study to explore all three core executive functions (inhibition, updating, and switching) within the same study in dyslexia with more sensitive z-mean measures while controlling for individual differences in processing speed.

Only one study thus far has controlled for the confounding influence of processing speed on the performance profile of executive functions associated with dyslexia (Peng et al., 2013). Peng et al. (2013) found updating and inhibition impairments in dyslexia, yet when they controlled for general processing speed impairments, updating and inhibition impairments no longer reached significance. The findings from this study are inconsistent with Peng et al. (2013), suggesting that inhibition and updating impairments remain in dyslexia even while controlling for the confounding influence of processing speed. For inhibition, impairments remained in dyslexia at the composite level and individual task level (Picture Go No-Go, SART task) while controlling for processing speed. However, impairments on the Phoneme Go No-Go task in dyslexia were no longer significant after accounting for individual differences in processing speed. For updating, impairments remained at the composite level and individual task level (Letter 2-back task) while controlling for processing speed. However, impairments on the Picture 2-back task were no longer significant after controlling for processing speed. For switching, a significant strength on the z-mean reaction time switch cost score was found, however, this was no longer significant after controlling for speed.

The pattern of findings suggests that processing speed may mediate some performance in the core executive functions of inhibition and updating at the task level, and of switching at the composite level. Consistent with previous work, this study found a processing speed impairment in dyslexia (Willcutt et al., 2005; McGrath et al., 2011). Although processing speed accounts for some variability in performance, inhibition and updating impairments remain in dyslexia while controlling for processing speed. These findings relate to the previous work conducted by Huizinga et al. (2006), who found that inhibition and switching tasks load onto a processing speed factor. This may explain why switching strengths were no longer significant in dyslexia after controlling for individual variation in processing speed. However, we also found that processing speed can account for impairments on some inhibition and updating tasks in dyslexia. These findings support previous work showing that processing speed mediates some executive function performance (Span et al., 2004). Although processing speed accounts for some performance on executive function tasks in dyslexia, we found that it does not account for inhibition and updating impairments in dyslexia, contrary to the suggestion of Peng et al. (2013).

However, issues flagged in prior work relating to measurement of core executive functions make it difficult to relate our specific findings to previous work (Miyake and Friedman, 2012; Goschke, 2014; Snyder et al., 2015; Friedman and Miyake, 2016). By employing more sensitive z-mean executive function composite scores to reduce non-executive noise and isolate core executive processes (Snyder et al., 2015), this study found for the first time that inhibition and updating impairments are associated with dyslexia while controlling for processing speed and that inhibition and updating abilities are predictive of dyslexia likelihood and reading ability across the spectrum of typical to atypical reading while controlling for processing speed. We would argue that executive function, particularly inhibition, may underlie the severity of reading impairments in dyslexia.

By exploring the executive function profile associated with dyslexia using Miyake and Friedman's (2012) three factor model, it is apparent that inhibition (common executive function) may be the central executive function impairment associated with dyslexia. Inhibition was the most severe impairment associated with dyslexia; and within predictive models of both dyslexia likelihood and reading ability, it was the most significant and heavily weighted predictor. This 'common executive' impairment may lead to impaired updating and unimpaired switching due to shared variance (Friedman et al., 2006, 2007, 2008; Miyake and Friedman, 2012) and antagonistic relationships such as performance trade-offs between inhibition and switching (Goschke, 2000; Gruber and Goschke, 2004; Blackwell et al., 2014). For instance, inhibition facilitates focus by shielding information from irrelevant distractors in a top down manner (provides stability), while switching requires interference from distractors to consider alternative options and to flexibly adapt to changing demands (mental flexibility; Gruber and Goschke, 2004). This may be the reason why the present study found impaired inhibition and updating, and spared switching abilities associated with dyslexia. Operationally defining and measuring executive function within the 3-factor latent model (Miyake and Friedman, 2012) allows us to see that executive functions may operate in a strengths and impairments manner (Snyder et al., 2015).

Overall results from this study suggest that dyslexia is associated with inhibition and updating impairments, which are predictive of disorder likelihood and variability in reading, even when controlling for processing speed. These findings suggest that inhibition impairments are implicated in dyslexia and predict individual differences in reading ability.

This study is not without limitations. Although our measure of processing speed loads highly on processing speed in factor analytic studies (Keith et al., 2006; Watkins et al., 2006; Bodin et al., 2009), some authors report that this task is correlated with inhibition and predicts variance in working memory (Cepeda et al., 2013). Cepeda et al. (2013) caution against using processing speed measures that are correlated with executive functions, as they may overestimate the role of processing speed in executive processes. As such, a possible limitation of our study is that, by using the coding task as a measure of processing speed, we removed important executive-associated variance from our measures. Therefore, our study may underestimate the degree to which executive function is impaired in dyslexia. Another limitation is that the limited number of switch trials in the switching tasks (e.g., a switch occurred on every 4th switch trial) may have resulted in unreliable data for this measure. In addition, although this study attempted to derive purer measures of each core executive construct by calculating z-mean composite scores from performance across multiple measures, non-executive similarities in the measures used may contribute variance to the composite score. For example, this study employed three 2-back tasks which included different stimuli (e.g., picture, phoneme, letter); all of these tasks required a button press and visual identification of stimuli. As such, this shared variance may be contributing to the composite score. In addition, the majority of tasks contributing to the inhibition composite scores were Go No-Go style tasks, meaning that this measure is more heavily weighted on response inhibition than on interference control as measured by the Stroop task.
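The z-mean composite approach referred to above can be illustrated with a short sketch: each task's scores are standardized, and a participant's composite is the mean of their z-scores across tasks. The error counts, the task labels, and the choice to standardize against the whole sample with the sample standard deviation are assumptions made for illustration; the study's exact standardization procedure is described in its Method section, not here.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize a list of scores using the sample mean and sample SD."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]

def z_mean_composite(task_scores):
    """Average each participant's z-scores across tasks.

    task_scores maps a task name to a list of scores, one per participant,
    with participants in the same order for every task.
    """
    standardized = [z_scores(scores) for scores in task_scores.values()]
    return [mean(per_participant) for per_participant in zip(*standardized)]

# Made-up error counts for four participants on three 2-back tasks.
updating_errors = {
    "picture_2back": [2, 4, 6, 8],
    "phoneme_2back": [1, 3, 5, 9],
    "letter_2back": [0, 2, 3, 7],
}

composite = z_mean_composite(updating_errors)
print([round(c, 2) for c in composite])
```

Averaging across tasks reduces task-specific noise, but, as the limitation above points out, any non-executive demand shared by all tasks (for example, a button press) is retained in the composite rather than averaged out.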

Future research should explore the core executive functions in dyslexia while controlling for processing speed with latent analysis techniques and structural equation modelling, and explore whether inhibition training can improve core executive functions and reading ability in children with dyslexia.

# AUTHOR CONTRIBUTIONS

CD and LB ideated the project and designed the protocol. CD created the EPrime measures, led the data acquisition, data analysis, and wrote the initial version of the paper. LB contributed to the interpretation of the data, revised the paper for intellectual content, reviewed, and refined the paper. AS and RR reviewed and refined the paper. All authors agree to be accountable for all aspects of the work.

#### FUNDING

The work reported here was funded by a Postdoctoral Research grant (P30424) awarded to LB by the School of Nursing and Human Sciences, Dublin City University. The Fujitsu laptop on which the data were collected and analyzed and the paper was written was donated by the Amazon Ireland Charity Committee as part of their corporate social responsibility.

#### REFERENCES


to reading comprehension and decoding. Sci. Stud. Read. 18, 325–346. doi: 10.1080/10888438.2014.902461


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Doyle, Smeaton, Roche and Boran. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Response Inhibition and Interference Suppression in Individuals With Down Syndrome Compared to Typically Developing Children

Laura Traverso<sup>1</sup> \*, Martina Fontana<sup>2</sup> , Maria Carmen Usai <sup>1</sup> and Maria C. Passolunghi <sup>2</sup> \*

*<sup>1</sup> Department of Education Sciences, University of Genoa, Genoa, Italy, <sup>2</sup> Department of Life Sciences, University of Trieste, Trieste, Italy*

#### *Edited by:*

*Sarah E. MacPherson, University of Edinburgh, United Kingdom*

#### *Reviewed by:*

*Miriam Gade, Medical School Berlin, Germany Robert Reeve, University of Melbourne, Australia*

#### *\*Correspondence:*

*Laura Traverso lauratraverso4@gmail.com Maria C. Passolunghi passolu@units.it*

#### *Specialty section:*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

*Received: 05 January 2018 Accepted: 16 April 2018 Published: 04 May 2018*

#### *Citation:*

*Traverso L, Fontana M, Usai MC and Passolunghi MC (2018) Response Inhibition and Interference Suppression in Individuals With Down Syndrome Compared to Typically Developing Children. Front. Psychol. 9:660. doi: 10.3389/fpsyg.2018.00660*

The present study aims to investigate inhibition in individuals with Down Syndrome compared to typically developing children with different inhibitory tasks tapping response inhibition and interference suppression. Previous studies that aimed to investigate inhibition in individuals with Down Syndrome reported contradictory results that are difficult to compare given the different types of inhibitory tasks used and the lack of reference to a theoretical model of inhibition that was tested in children (see Bunge et al., 2002; Gandolfi et al., 2014). Three groups took part in the study: 32 individuals with Down Syndrome (DS) with a mean age of 14 years and 4 months, 35 typically developing children 5 years of age (5TD), and 30 typically developing children 6 years of age (6TD). No difference emerged among the groups in fluid intelligence. Based on a confirmatory factor analysis, two different inhibition factors were identified (response inhibition and interference suppression), and two composite scores were calculated. An ANOVA was then executed with the composite inhibitory scores as dependent variables and group membership as the between-subject variable to explore the group differences in inhibition components. The 6TD group outperformed the 5TD group in both response inhibition and interference suppression component scores. No differences were found in either inhibition component between the DS group and the 5TD group. In contrast, the 6TD group outperformed the DS group in both the response inhibition and the interference suppression component scores. 
Summarizing, our findings show that both response inhibition and interference suppression significantly increased during school transition and that individuals with DS showed a delay in both response inhibition and interference suppression components compared to typically developing 6-year-olds, but their performance was similar to typically developing 5-year-olds.

Keywords: Down Syndrome, executive function, inhibition, interference suppression, response inhibition

# INTRODUCTION

Down Syndrome (DS) is the most common genetic syndrome associated with intellectual disability and affects ∼1 in 700 newborns (Sherman et al., 2007; Mégarbané et al., 2009). Individuals with DS seem to have higher psychopathological risk than individuals with other intellectual disabilities (Gath and Gumley, 1986; Collacott et al., 1992; Dykens, 2007; Tassé et al., 2016). Therefore, acquiring more information on the weaknesses and strengths of the neuropsychological profile of individuals with DS is necessary for planning interventions.

Individuals with DS are usually characterized by moderate to severe learning disabilities and relative language impairments, with greater expressive difficulties than receptive ones (Fowler et al., 1994; Abbeduto et al., 2001; Laws and Bishop, 2004; Fidler and Nadel, 2007; Næss et al., 2011). Research on other cognitive abilities has focused mainly on memory resources, particularly working memory (Jarrold et al., 2000; Lanfranchi et al., 2004, 2012; Baddeley and Jarrold, 2007). People with DS have poorer working memory performance than controls, especially on tasks that require verbal processing compared to tasks with visual and spatial stimuli (Jarrold and Baddeley, 1997; Jarrold et al., 1999). This difference seems to be independent of the acoustic deficits typical of DS (Jarrold et al., 2000).

There is widespread agreement about impairments in executive function (Costanzo et al., 2013; Lee et al., 2015), a set of general-purpose control processes that regulate one's thoughts and behaviors (Miyake and Friedman, 2012). However, in the literature examining the cognitive profile of individuals with DS, there is a lack of information about inhibition, one of the core components of executive function (Miyake et al., 2000; Diamond, 2013). Inhibition has been considered to play a central role in cognitive development. Klenberg et al. (2001) claim that the development of basic inhibitory functions may precede the development of more complex cognitive functions. Miyake and Friedman (2012) speculate that inhibition may be a general resource for other executive functions. Because inhibition plays an important role in several cognitive activities, it is reasonable that an investigation into this ability may contribute to explaining cognitive impairments. Nevertheless, to date, only a few studies have examined the diverse inhibition components in individuals with DS, and the results are not consistent.

# INHIBITION DEVELOPMENT

Inhibition processes generally refer to the ability to control one's mental processes and responses, to ignore an internal or external prompt and to perform an alternative action (Diamond, 2013). Studies that focus on inhibition have commonly described this ability as a multi-componential construct that includes different dimensions that are useful to perform different tasks (Dempster, 1993; Harnishfeger, 1995; Nigg, 2000; Diamond, 2013). For example, Diamond (2013) argues that inhibition comprises the ability to control irrelevant information at the level of thought and memories (cognitive inhibition), the ability to manage irrelevant data when acquiring information (inhibition at the level of attention), and the ability to control an action at the level of behavior (response inhibition). The concept of inhibition has been widely used and studied (e.g., Dempster and Brainerd, 1995). However, the psychometric construct of inhibition has been investigated only in recent decades (e.g., Friedman and Miyake, 2004). Using a latent variable approach, Rey-Mermet et al. (2017) demonstrated that the data observed in young and older adults were best explained by a two-factor model distinguishing two components: the inhibition of prepotent responses (the ability to suppress dominant responses) and the resistance to distracter interference (the ability to ignore distracting information or to suppress competing response tendencies) (see also Stahl et al., 2014). However, this evidence collected with adults may not apply to the early stages of development. As argued by Friedman and Miyake (2004) and observed by Bunge et al. (2002) in an fMRI study, children and adults may be characterized by different inhibition processes. Although a response inhibition component was not distinguishable in the study by Friedman and Miyake (2004), in the study by Bunge et al. (2002), different activation patterns for interference suppression and response inhibition were observed in children.

Recently, Gandolfi et al. (2014) proposed an empirical investigation of the latent organization of inhibitory processes in early childhood. They suggested that a unitary model was more useful for describing inhibitory processes in younger children (24- to 32-month-old children), whereas a two-factor model showed the best fit in children aged 36–48 months. Specifically, in 3- to 4-year-old children, Gandolfi et al. (2014) distinguished a response inhibition component from an interference suppression component (see also Bunge et al., 2002; Martin-Rhee and Bialystok, 2008; Cragg, 2016, in which interference at the level of response and interference at the level of the stimulus were considered, corresponding to what we define as the response inhibition and interference suppression components, respectively). The first component, "response inhibition," significantly predicted the children's performance in tasks such as Go-No/Go, in which the child is presented with a stimulus that activates an automatic response that must be suppressed to give the correct response. The second component, "interference suppression," explained performance in tasks such as the Flanker task, in which the child is presented with a stimulus that shows ambivalent data (the target and the flankers). In these tasks, the child must control the interference due to the stimulus characteristic and focus on the relevant information to give the correct response. This evidence may suggest that diverse inhibition components may emerge at different stages of development. For example, interference suppression may emerge after response inhibition, and it may be responsible for the differences between younger and older children in performing tasks in which interference must be controlled.

# INHIBITION IN DOWN SYNDROME

Reviewing the literature of the last 20 years, to the best of our knowledge, we were able to identify 10 studies in which at least one inhibition task was proposed to a sample of individuals with DS (**Table 1**).


TABLE 1 | Previous studies examining inhibition in individuals with Down Syndrome.



*d*<sub>Cohen</sub> *with pooled standard deviation.*


Although the study designs were comparable, contradictory findings emerged. In some studies, the DS group performed significantly worse on the inhibitory task administered compared to the control group (Lanfranchi et al., 2010; Schott and Holfelder, 2015; Amadó et al., 2016). In other studies, no difference emerged (Pennington et al., 2003; Cornish et al., 2007; Carney et al., 2013). Finally, in some studies, mixed results were reported (Rowe et al., 2006; Brunamonti et al., 2011; Borella et al., 2013; Costanzo et al., 2013). For example, Borella et al. (2013) found a significant difference in accuracy on all three tasks, although no difference emerged in response time in one of the tasks. In Costanzo et al. (2013), a difference was found for the Stroop task but not for the Go-No/Go task.
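The footnote to Table 1 indicates that effect sizes are expressed as Cohen's *d* with a pooled standard deviation. For readers comparing group differences across these studies, the statistic can be sketched as below; the function name and the example scores are illustrative assumptions and do not come from any of the studies cited.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d_pooled(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Made-up accuracy scores for two small groups (illustration only).
control_group = [2, 4, 6]
ds_group = [1, 3, 5]

d = cohens_d_pooled(control_group, ds_group)
print(f"d = {d:.2f}")
```

Expressing results on this common scale is what allows differences obtained with different tasks and indicators (accuracy versus response time) to be set side by side, although it does not resolve the question of which inhibition component each task measures.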

These inconsistencies seem to highlight the need to differentiate performance across inhibition components rather than considering a unitary inhibition dimension. Nevertheless, comparing these results to derive conclusions about the development of the inhibition component in DS is not easy. In most studies, only one task was used. Therefore, contradictory findings may be due to the differences in the tasks used. For example, in both Amadó et al. (2016) and Lanfranchi et al. (2010), accuracy in a Day-Night Stroop task was considered, and in both studies, a significant difference between the DS and the control group was reported. However, these consistent results may involve non-inhibition abilities necessary to perform the task or diverse inhibition components required by the Stroop task that are not assessed with other inhibition tasks. Conversely, in Costanzo et al. (2013) and Borella et al. (2013), a Stroop task was used, and these two studies reported different results using response time and accuracy as indicators. In Costanzo et al. (2013), the DS sample differed from the control group in response time but not in accuracy, whereas the opposite pattern was observed in Borella et al. (2013). As reported by Friedman and Miyake (2004), several problems arise when single and raw inhibition scores are considered. Moreover, although these studies provide useful information about diverse cognitive abilities in DS individuals, the fact that only one task was used to assess inhibition does not allow us to investigate the development of the diverse inhibition components. Only the study by Borella et al. (2013) used three inhibition tasks to assess the three inhibition components initially hypothesized for adults by Friedman and Miyake (2004). In the other studies, the proposed tasks are generally defined as inhibition tasks without providing clarification of the specific component that may be assessed with each task.
If we consider the model proposed and verified for children (see Bunge et al., 2002; Gandolfi et al., 2014) in which response inhibition and interference suppression were identified, previous studies on individuals with DS have mostly investigated response inhibition (see inhibition task column in **Table 1**, in which diverse response inhibition tasks were included, such as the Go-No/Go task, the Finger Tapping task, and the Stroop task) rather than the interference suppression component of inhibition. In summary, there is a need for a study that analyses the development of response inhibition and interference suppression components (following the two-factor model proposed and tested with children by Gandolfi et al., 2014) in a DS sample.

# THE PRESENT STUDY

The current study aims to investigate diverse inhibition components in typically developing children and individuals with DS. In agreement with several authors (Friedman and Miyake, 2004; Diamond, 2013), we consider inhibition as having a multicomponent nature, and we hypothesize that at least two components, response inhibition and interference suppression, will be identifiable at this stage of development in TD children (Gandolfi et al., 2014). Specifically, we aim to verify whether these two components can be found in typically developing children at 5 (5TD) and 6 years of age (6TD). In addition, considering that inhibition abilities undergo rapid changes in the typical population at the ages considered (Davidson et al., 2006), we investigate whether differences in response inhibition and interference suppression efficiency may be found between TD children aged 5 and 6 years. Moreover, response inhibition and interference suppression are examined in individuals with DS with the same mental age as the two TD groups. Our aims are to investigate whether the DS and the TD groups differ in inhibition performance and to acquire more information concerning inhibition development in DS by comparing this group with two TD groups that may differ in the level of inhibition development.

In contrast to previous studies in which only single task scores were considered, we aimed to at least partially overcome the problems due to task impurity (see Friedman and Miyake, 2004) by creating a composite score for each inhibition component. The difference between typical children of 5 and 6 years and individuals with DS matched for mental age is examined with consideration of these composite scores. Borella et al. (2013) reported general impairment in the diverse inhibition components investigated; thus, we may hypothesize that significant differences will emerge in both components. However, Borella et al. (2013) refer to an adult model of inhibition, whereas we aim to investigate for the first time two inhibition components that have been identified in typical children in a sample of youth with DS.

# METHODS

# Participants

A final sample of 97 individuals belonging to three groups took part in this study. Thirty-two individuals with Down Syndrome (DS), 22 girls and 10 boys with a mean age of 14 years and 4 months (Mage 173.75 in months, S.D. 65.17, range: 73–299 months), were included in the DS group. Thirty-five typically developing children, 18 girls and 17 boys with a mean age of 5 years and 6 months (Mage 67.37 in months, S.D. 2.85, range: 62–71 months), were included in the typically developing control group of 5-year-olds (5TD). Thirty typically developing children, 13 girls and 17 boys with a mean age of 6 years and 2 months (Mage 74.40 in months, S.D. 4.42, range: 72–84 months), were included in the typically developing control group of 6-year-olds (6TD). Individuals with DS had trisomy 21 without mosaicism and were recruited from two treatment centers in the north of Italy. Typically developing children were recruited from different educational services in the same area. None of the children had a history of neurological impairment or developmental disabilities.

# Procedure

A battery of inhibition tasks was administered to the three groups by trained psychologists. All participants were tested individually in a quiet room in two separate testing sessions, each lasting ∼20–30 min, at an interval of 3–4 days. The DS group was assessed in the treatment center, and the TD children were tested at educational services. The families were previously informed about the aims of the study and about the activities in which the participants were involved. A written informed consent form was completed by the parents before testing began.

All tasks consisted of well-known inhibition paradigms. These tasks have been widely used with children and did not show any floor or ceiling effect in the mental age range of interest (Davidson et al., 2006; Traverso et al., 2015). These tasks minimize the non-executive function abilities required. Basic knowledge (such as colors) and simple responses (such as pointing or pressing) are required to perform the tasks. Finally, all tasks (except for the Go/No-Go) included practice trials before the test began. The examiner gave the instructions and then conducted the practice trials to verify whether the child had comprehended the requirements of the task.

#### Measures

The Colored Progressive Matrices Test (Raven, 1947; Belacchi et al., 2008) was administered to measure fluid intelligence and was used as a screening measure to match fluid intelligence between the DS group and the two TD groups. It is a multiple-choice test of abstract reasoning in which the child is required to complete a geometrical figure by choosing the missing piece among six possible drawings. The test included 36 items of varying difficulty. The score was the number of correct responses (CPM, expected range 0–36).

#### Inhibition Battery

To assess inhibition, the following tasks were administered.

#### **Go/No-Go task (adapted from Berlin and Bohlin, 2002)**

The Go/No-Go task is a well-known paradigm that tests the abilities of both adults and children to inhibit prepotent responses (Durston et al., 2002; Verbruggen and Logan, 2008). The children were asked to restrain an automatic response. While in front of a computer screen, the child was instructed to press the space bar according to the instructions given by the examiner for the following condition: "Press the space bar when you see a blue figure; do not press when you see a red figure" (24 blue items and six red items). The percentage of go responses was 80%. The stimulus duration was 3,000 ms, and the blank page that appeared after each stimulus lasted 1,000 ms. The sum of the correct responses in the no-go condition was recorded (Go/No-Go Accuracy, expected range 0–6). Test-retest reliability (Pearson's r), calculated in a sample of 75 typically developing children (age range 62–76 months, Mage = 68.64; S.D. = 3.5), was 0.55, p < 0.0005 (unpublished results from the data set used in

#### **Preschool matching familiar figure task (PMFFT, adapted from Kagan, 1966; Traverso et al., 2016)**

This task measures the child's ability to restrain impulsive responses and to compare the target with all of the pictures by shifting attention from the target to each alternative. The children were asked to perform 14 trials, selecting among five different alternatives the figure that was identical to the target picture at the top of the page. The number of errors (PMFFT Errors, expected range 0–56) and the mean latency between the presentation of the item and the child's response (PMFFT Time, expected range 0-no limit) were recorded. Cronbach's alphas calculated in a sample of 174 children (Mage = 60.04) were 0.67 for PMFFT Errors and 0.95 for PMFFT Time (Traverso et al., 2016). Cronbach's alpha calculated in the present study for PMFFT Accuracy was 0.76 in the TD group and 0.85 in the DS group. Cronbach's alpha for PMFFT Time was 0.94 for both groups.
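Cronbach's alpha, the internal-consistency coefficient reported for the tasks, can be sketched as follows. This is a generic illustration of the standard formula, not the authors' computation; the function name and the item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of rows (one per participant, k items each)."""
    k = len(item_scores[0])
    # Variance of each item across participants.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the participants' total scores.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical item-level scores (1 = correct, 0 = error) for five children.
scores = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
    [0, 0, 0],
]
print(round(cronbach_alpha(scores), 2))
```

Alpha approaches 1 when items vary together across participants and drops when item scores are unrelated.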

#### **Fish flanker task (adapted from Ridderinkhof and van der Molen, 1995; Gandolfi et al., 2014; Traverso et al., 2015)**

The Flanker task is a well-known paradigm that is used to evaluate the ability to inhibit irrelevant interfering stimuli (Eriksen and Eriksen, 1974; Kramer et al., 1994). The children were required to respond to a left or right fish presented at the center of the computer screen by pressing a left or right response button. The fish was flanked by two fishes pointing in the same direction (congruent condition, 16 items) or in the opposite direction (incongruent condition, 16 items). After a brief training consisting of four items (two of each condition), 48 items were randomly presented (16 items per condition, half left and half right). A warning cross (500 ms in duration) preceded the stimulus. After the response, the screen turned blank for 500 ms. Accuracies (Flanker Accuracy, expected range 0–16) and response times (Flanker Time) in the incongruent condition were recorded. Test-retest reliability (Pearson's r) calculated in a sample of 43 typically developing children (age range 62–75 months, Mage = 68.60; S.D. = 3.5) was 0.42, p = 0.002 and 0.56, p < 0.001 for Flanker Accuracy and Flanker Time, respectively (Usai et al., 2017). Cronbach's alphas calculated in the present study for Flanker Accuracy were 0.96 in the TD group and 0.81 in the DS group. Cronbach's alphas for Flanker Time were 0.96 in the TD group and 0.93 in the DS group.
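The two Flanker measures described above (accuracy and response time in the incongruent condition) could be derived from trial-level data roughly as in the sketch below. The trial format, field names, and values are hypothetical, and the treatment of error trials in the response-time average is an assumption, since the text does not specify it.

```python
from statistics import mean


def score_flanker(trials):
    """Return (number correct, mean response time in ms) for incongruent trials."""
    incongruent = [t for t in trials if t["condition"] == "incongruent"]
    accuracy = sum(1 for t in incongruent if t["response"] == t["target"])
    # Assumption: response time is averaged over all incongruent trials;
    # the source does not say whether error trials were excluded.
    mean_rt = mean(t["rt_ms"] for t in incongruent) if incongruent else None
    return accuracy, mean_rt


# Hypothetical trials: "target" is the direction of the central fish.
trials = [
    {"condition": "congruent", "target": "left", "response": "left", "rt_ms": 600},
    {"condition": "incongruent", "target": "left", "response": "left", "rt_ms": 700},
    {"condition": "incongruent", "target": "right", "response": "left", "rt_ms": 900},
]
print(score_flanker(trials))
```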

#### **Dots task (adapted from Diamond et al., 2007; Traverso et al., 2015)**

This task is a high cognitive conflict task (see Diamond et al., 2007; Diamond and Lee, 2011). A heart or a flower appears on the right or left of a computer screen. The child is told to press on the same side as the heart but on the side opposite the flower, which requires inhibiting the tendency to respond on the side where the stimulus appeared and controlling the response based on which stimulus appears. After a brief training session with heart and flower items, the test began, and hearts and flowers were intermixed in the test. The sum of correct responses (Dots Accuracy, expected range 0–20) and the response time (Dots Time) were recorded for each child. Test-retest reliability (Pearson's r) calculated in a sample of 43 typically developing children (age range 62–75 months, Mage = 68.60; S.D. = 3.5) was 0.62 (p < 0.001) for Dots Accuracy and 0.72 (p < 0.001) for Dots Time (Usai et al., 2017). Cronbach's alpha calculated in the present study for Dots Accuracy was 0.97 in the TD group and 0.80 in the DS group. Cronbach's alpha for Dots Time was 0.89 in the TD group and 0.85 in the DS group.
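The response rule of the Dots task (same side for hearts, opposite side for flowers) can be made explicit with a short sketch. The function names and the trial tuples are hypothetical and serve only to illustrate how Dots Accuracy could be counted.

```python
OPPOSITE = {"left": "right", "right": "left"}


def correct_side(stimulus, side):
    """Return the correct response side for one Dots trial."""
    # Heart: press on the same side as the stimulus.
    # Flower: press on the opposite side.
    return side if stimulus == "heart" else OPPOSITE[side]


def dots_accuracy(trials):
    """Count correct responses; each trial is (stimulus, side, response)."""
    return sum(1 for stim, side, resp in trials if resp == correct_side(stim, side))


# Hypothetical responses from one child.
example_trials = [
    ("heart", "left", "left"),     # correct: same side
    ("flower", "right", "right"),  # error: should have pressed left
    ("flower", "left", "right"),   # correct: opposite side
]
print(dots_accuracy(example_trials))
```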

#### Statistical Analyses

Descriptive analyses and ANOVAs on CPM and inhibitory measures were conducted to compare the three groups' performance considering both accuracy and response time scores. The relation between accuracy and response time was investigated with bivariate correlations. A confirmatory factor analysis (CFA) was performed using the TD group's inhibitory task scores to verify the characteristics of the inhibition construct in early childhood. Multiple fit indices were considered to compare models (for an extensive description, see, e.g., Schermelleh-Engel et al., 2003): the χ<sup>2</sup> statistic, the Comparative Fit Index (CFI), the root mean square error of approximation (RMSEA), the standardized root mean squared residual (SRMR), the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC). The χ<sup>2</sup> test was used to evaluate the appropriateness of the CFA model. Non-significant χ<sup>2</sup> values indicated a minor difference between the covariance matrix generated by the model and the observed matrix and thus an acceptable fit. CFI values > 0.97 are indicative of a good fit, whereas values > 0.95 may be interpreted as an acceptable fit (Schermelleh-Engel et al., 2003). RMSEA values ≤ 0.05 represent a good fit, values between 0.05 and 0.08 represent an adequate fit, values between 0.08 and 0.10 represent a mediocre fit, and values > 0.10 are not acceptable (Browne and Cudeck, 1993). The SRMR is the square root of the averaged squared residuals (i.e., the differences between the observed and predicted covariances). SRMR values < 0.10 are acceptable; however, values lower than 0.05 represent a good fit (Schermelleh-Engel et al., 2003). Based on the CFA results, composite scores were calculated as the mean of the inhibitory z-scores to represent the latent inhibitory dimensions.
Finally, an ANOVA was conducted with the composite inhibitory scores as dependent variables and group membership as the between-subject variable to explore group differences in the inhibition components.
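The cut-offs quoted above for CFI, RMSEA, and SRMR can be collected into a small helper. This is a sketch for illustration, not part of the original analysis: the function name, the labels for values outside the quoted ranges, and the handling of values falling exactly on a boundary are assumptions.

```python
def classify_fit(cfi, rmsea, srmr):
    """Label each fit index using the cut-offs quoted in the text."""
    if cfi > 0.97:
        cfi_label = "good"
    elif cfi > 0.95:
        cfi_label = "acceptable"
    else:
        cfi_label = "below acceptable"  # label chosen here, not from the source

    # Assumption: boundary values (0.08, 0.10) fall in the better category.
    if rmsea <= 0.05:
        rmsea_label = "good"
    elif rmsea <= 0.08:
        rmsea_label = "adequate"
    elif rmsea <= 0.10:
        rmsea_label = "mediocre"
    else:
        rmsea_label = "not acceptable"

    if srmr < 0.05:
        srmr_label = "good"
    elif srmr < 0.10:
        srmr_label = "acceptable"
    else:
        srmr_label = "not acceptable"  # label chosen here, not from the source

    return {"CFI": cfi_label, "RMSEA": rmsea_label, "SRMR": srmr_label}


# Hypothetical index values for a well-fitting model.
print(classify_fit(cfi=0.98, rmsea=0.04, srmr=0.03))
```

Because the indices can disagree, the labels are reported separately rather than combined into a single verdict.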

# RESULTS

Descriptive statistics and ANOVA results for the three groups are shown in **Table 2**. A univariate analysis of variance showed no significant difference in the CPM score. In contrast, significant differences among the groups were found for all the inhibition tasks with the exception of the Dots Time score.

Post-hoc tests using Bonferroni correction revealed that 6-year-olds outperformed 5-year-olds in PMFFT Errors (6TD made fewer errors than 5TD), Flanker Accuracy and Dots Accuracy. The DS group showed high variability in all tasks. This group performed worse than the 6TD group but was similar to the 5TD group in PMFFT accuracy. The opposite was observed for PMFFT time: the DS group showed a response time similar to that of the 6TD group and higher than that of the 5TD group. A significant difference emerged in the Go/No-Go task between the 6TD and DS groups; however, this difference disappeared when a mathematical transformation (exponential function, Kline, 2005) was applied to the Go/No-Go raw score to obtain acceptable skewness and kurtosis parameters. In the Flanker task, the DS group showed accuracy scores similar to those of the two TD groups and a higher response time than both the 5TD and the 6TD groups. Finally, the DS group showed worse performance in Dots Accuracy than 5TD and 6TD, and no differences emerged in Dots Time.
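For readers unfamiliar with the Bonferroni correction used in the post-hoc tests, the sketch below shows one common way to apply it: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The p-values are hypothetical and are not the values obtained in this study; statistical packages may present the correction differently.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return {pair: min(p * m, 1.0) for pair, p in p_values.items()}


# Hypothetical unadjusted p-values for the three pairwise group comparisons.
raw = {("5TD", "6TD"): 0.004, ("5TD", "DS"): 0.30, ("6TD", "DS"): 0.010}
adjusted = bonferroni_adjust(raw)
for pair, p in adjusted.items():
    print(pair, round(p, 3))
```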

Zero-order correlations among tasks are reported for the two TD groups (**Table 3**) and the DS group (**Table 4**).

As expected, the inhibition task scores were not highly related (Willoughby et al., 2015). In the 5TD group, a significant association emerged between performance in the PMFFT (Errors) and the Go/No-Go tasks. In the 6TD group, the Dots Accuracy was positively correlated with the Flanker Accuracy, and the Dots Accuracy was related to the Go/No-Go performance. In the DS group, performance in the PMFFT (Errors) and the Go/No-Go tasks were associated, and the Flanker Accuracy was related to both the PMFFT (Errors) and the Go/No-Go Accuracy. Accuracy and response time correlated significantly in both the 5-year-old (r ranged from 0.347 to 0.592) and the 6-year-old (r ranged from 0.391 to 0.754) groups. However, in the DS group, only the Dots Accuracy and the Dots Time scores were related (r = 0.372). The CPM performance was associated with the PMFFT Time and the Flanker task (Time and Accuracy) in the 6TD group, no significant association emerged considering the 5TD group, and CPM was related to the PMFFT Time in the DS group. Finally, age was significantly related only to the PMFFT time in the 5TD group.

#### Identifying the Inhibitory Components

To verify whether the two-factor model, in which response inhibition and interference suppression were distinguished, would be more useful to explain the observed data than a one-factor model (**Figure 1**), a series of CFAs based on raw data were performed using Mplus software (version 7.4) (Muthén and Muthén, 2007).

The unitary model had mediocre or unacceptable fit indices: χ<sup>2</sup> = 5.014, p = 0.082, CFI = 0.872, SRMR = 0.060, RMSEA = 0.152, 90% CI = [0.000, 0.325]. The two-factor model (**Figure 1**) showed the best fit: χ<sup>2</sup> = 0.556, p = 0.456, CFI = 1.000, SRMR = 0.018, RMSEA = 0.000, 90% CI = [0.000, 0.295]. All the factor loadings were significant (t values > 2).
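As a rough check on how the RMSEA values relate to the χ<sup>2</sup> statistics, the sketch below applies one common formula, RMSEA = sqrt(max(χ<sup>2</sup> − df, 0) / (df · N)). The degrees of freedom (2 for the one-factor model, 1 for the two-factor model) are not stated in the text and are assumed here; N = 65 is taken as the two TD groups combined. Some software uses N − 1 in the denominator, which changes the third decimal.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Point estimate of RMSEA; returns 0 when chi2 does not exceed df."""
    return sqrt(max(chi2 - df, 0.0) / (df * n))


N_TD = 35 + 30  # 5TD + 6TD participants

print(round(rmsea(5.014, 2, N_TD), 3))  # one-factor model, df = 2 assumed
print(round(rmsea(0.556, 1, N_TD), 3))  # two-factor model, df = 1 assumed
```

Under these assumptions the sketch gives 0.152 and 0.0, in line with the reported values.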

# Investigating the Inhibitory Difference in DS and TD Groups

Two composite scores representing response inhibition and interference suppression were calculated as the mean of the z-scores as follows: the z-score average of PMFFT Errors and Go/No-Go task Accuracy for response inhibition and the z-score average of Flanker Accuracy and Dots Accuracy for interference suppression (**Table 5**). These composite measures can be considered formative indicators of the two inhibitory factors found with the previous CFA (Willoughby et al., 2015). The results of an ANOVA conducted with the two composite inhibitory measures as dependent variables and group membership as the between-subjects variable showed that the three groups differed in both response inhibition, F(2,96) = 8.363, p < 0.001, and interference suppression, F(2,96) = 10.530, p < 0.001. The 6TD group outperformed the 5TD group in both the response inhibition (p = 0.008, dCohen = 0.94) and interference suppression (p = 0.001, dCohen = 0.83) components. No differences were found in either inhibition component between the DS and 5TD groups. In contrast, the 6TD group outperformed the DS group on the response inhibition component score (p = 0.001, dCohen = 0.96) and the interference suppression component score (p < 0.001, dCohen = 1.15).

\**p* < *0.05;* \*\**p* < *0.001;* \*\*\**p* < *0.0001. Time is reported in seconds for the Preschool Matching Familiar Figure Task Time (PMFFT Time) and in milliseconds for the Flanker (Flanker Time) and Dots tasks (Dots Time).*

TABLE 3 | Zero-order correlations among inhibition tasks, CPM and age (in months) in the 5TD group (upper triangle) and in the 6TD group (lower triangle).

\**p* < *0.05;* \*\**p* < *0.001.*

TABLE 4 | Zero-order correlations among inhibitory tasks, CPM and age (in months) in the DS group.

\**p* < *0.05;* \*\**p* < *0.001.*

TABLE 5 | Descriptive statistics of inhibitory components in the three groups.
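The composite scores described in this section (the mean of the z-scores of two task measures) can be sketched as follows. The data are hypothetical. Two points are assumptions because the text does not specify them: the reference sample used for standardization (here, the values passed in) and the sign reversal of PMFFT Errors so that higher scores always indicate better inhibition.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values against their own mean and sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(*standardized_measures):
    """Average the z-scores of several measures, participant by participant."""
    return [mean(zs) for zs in zip(*standardized_measures)]


# Hypothetical raw scores for four participants.
pmfft_errors = [10, 20, 30, 40]
gonogo_accuracy = [6, 5, 4, 3]

# Assumption: error scores are sign-reversed so that higher always means better.
pmfft_z = [-z for z in z_scores(pmfft_errors)]
gonogo_z = z_scores(gonogo_accuracy)

response_inhibition = composite(pmfft_z, gonogo_z)
print([round(score, 2) for score in response_inhibition])
```

The same two functions would apply to Flanker Accuracy and Dots Accuracy for the interference suppression composite, where no sign reversal is needed.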

## DISCUSSION

The main goal of this study was to investigate diverse inhibition components in children and youth with DS compared to two groups of typically developing children aged 5 and 6 years matched for mental age. Specifically, we aimed to focus on response inhibition and on interference suppression components (see Bunge et al., 2002; Gandolfi et al., 2014). In contrast to previous studies in which only single task scores were examined, we considered both raw scores and composite scores as formative indicators of these two components, referring to a theoretical model of inhibition that was tested in children (Gandolfi et al., 2014).

# Inhibition in Children With Typical Development

First, the performance of the two typically developing groups was analyzed. Although inhibition development in childhood has been widely documented, and has been investigated more in the preschool years than in the transition to school (Carlson, 2005; Romine and Reynolds, 2005; Davidson et al., 2006; Garon et al., 2008), the developmental trajectories of this ability and its components are not yet clear. To acquire more information on atypical development, we argue that it is important to focus on inhibition changes in typically developing children.

Concerning single tasks, our results showed that 6-year-old children were more accurate than 5-year-olds in most of the tasks, although the two groups did not differ significantly in general cognitive functioning measured with the CPM. These findings are consistent with previous studies that documented a rapid improvement in accuracy on similar tasks in this age range (Davidson et al., 2006; Traverso et al., 2016). Moreover, the older children showed significantly longer response times in the Preschool Matching Familiar Figure Task. In all three tasks in which response time was recorded, it was significantly and positively related to accuracy (the longer the response time, the greater the accuracy) in both the 6-year-olds and the 5-year-olds. In middle childhood and adulthood, low response time is considered an index of a high level of inhibition. In contrast, Gerstadt et al. (1994) showed that in early childhood, children who took longer to respond were more likely to be correct. Diamond et al. (2002) demonstrated that it is possible to increase accuracy by encouraging children to wait before answering in a Stroop task, and some authors argue that the time is useful because it permits the dissipation of the prepotent response in children (Simpson et al., 2012; Ling et al., 2016). In an investigation of the performance of 3- to 6-year-old children on the Preschool Matching Familiar Figure Task, in which no instruction to wait before answering was given, Traverso et al. (2016) observed that response time and accuracy were not related until the age of four and a half years. These results suggest that the interpretation of response time may depend on age, accuracy, and task; consequently, it may not be a valid index of cognitive efficiency when these other parameters are not considered, at least in childhood (see Davidson et al., 2006; but see studies such as Tamm et al., 2012, in which the application of an ex-Gaussian distribution to response time allowed more fine-grained analyses of the distribution and consequently yielded much more information on the cognitive profile than raw response time, which was characterized by high variability and was not normally distributed).

As expected, the inhibition tasks were not highly correlated with each other (Willoughby et al., 2015; Rey-Mermet et al., 2017) in any of the three groups. Nevertheless, in line with previous studies (see Gandolfi et al., 2014), the CFA demonstrated that a two-factor model in which response inhibition (Go/No-Go task and Preschool Matching Familiar Figure Task indicators) and interference suppression (Flanker Accuracy and Dots Accuracy indicators) were distinguishable best explained the data observed. In the Go/No-Go task and the Preschool Matching Familiar Figure Task, the child is required to focus on one attribute of the stimulus. In the Go/No-Go task, the child must look at the color of the figure and be able to control the response to press the spacebar. In the Preschool Matching Familiar Figure Task, the child must be able to consider the target and then the figure before pointing with the finger. In both tasks, the child is required to press/point or not to press/point according to the stimulus presented. Given the large majority of go stimuli and the diverse figures that need to be compared in the Preschool Matching Familiar Figure Task, in these tasks, the child usually must stop an automatic response or an impulsive tendency. In contrast, in both the Flanker Task and the Dots Task, the child must always give a response (press a computer key). Nevertheless, the child must analyse the type of stimulus that is presented to evaluate what type of response is correct. The stimuli presented are particularly challenging. In the Flanker Task, the child must be able to focus on the central fish; in the Dots Task, the child must focus on the type and side of the stimulus. Whereas in the first type of task the child must decide whether or not to respond on the basis of the stimulus, in the latter tasks the child must choose between two different responses by managing the complexity of the stimulus.
In these tasks, the child must suppress distracting information as well as competing response tendencies. Following the CFA, two composite scores were calculated as a formative index of response inhibition and interference suppression components. As suggested by Willoughby et al. (2015), formative indices may be a useful method to investigate EF development. However, it must be noted that this conceptual framing is consistent with the characterization of EF as a latent variable that is defined by (rather than giving rise to) individual performance across a set of performance-based tasks. Our results show that older children obtained higher scores than younger children in both response inhibition and interference suppression. These results may suggest that from 5 to 6 years of age, children increase both their ability to control an automatic response and their ability to manage interference. Previous studies have shown that performance on response inhibition tasks such as the Go/No-Go task undergoes significant changes in middle childhood (Brocki and Bohlin, 2004; Cragg and Nation, 2008). Similarly, an increase in performance on tasks that are supposed to require interference suppression was previously observed in middle childhood studies (Hommel et al., 2004). Both components improve during school transition, although Gandolfi et al. (2014) suggested that interference suppression emerges after response inhibition in pre-schoolers, and Cragg (2016) claimed that the improvements in performance on inhibition tasks in middle childhood may be due to development in what we define as interference suppression rather than response inhibition.

# Inhibition in Individuals With Down Syndrome

With regard to task accuracy, the DS group showed worse performance than the 6-year-olds on the Preschool Matching Familiar Figure Task and worse performance than both groups in the Dots task. No differences were observed in the Go/No-Go task transformed variable (although a difference emerged in the raw score) and in the Flanker Task accuracy. Moreover, the DS group had a higher response time than the 5-year-olds on the Preschool Matching Familiar Figure Task and a higher response time than both control groups on the Flanker task. This inconsistent pattern is in line with the inhibition literature (Rey-Mermet et al., 2017) and with studies that have found high variability on cognitive tasks in the atypical development population (e.g., Tamm et al., 2012; van Belle et al., 2015). With reference to previous studies, as in Costanzo et al. (2013), no differences were observed in the Go/No-Go task, whereas a significant difference emerged in other tasks requiring response inhibition (although in tasks different from the tasks we used; see Lanfranchi et al., 2010; Schott and Holfelder, 2015; Amadó et al., 2016). For interference suppression tasks, to our knowledge, only a study by Merrill and O'dekirk (1994) used a Flanker paradigm, and individuals with DS showed more interference caused by the flankers (and higher response time) than controls. In contrast, no difference in Flanker accuracy emerged in our study.

One possible explanation for these mixed results may involve the non-executive abilities required by the task. In the Merrill and O'dekirk study, the flankers were letters; therefore, we cannot exclude the possibility that their results were due to the DS group's difficulties in verbal elaboration. Costanzo et al. (2013) explained their mixed results by arguing that the differences were due to the visual vs. verbal stimuli. However, in our study, the DS group performed worse on tasks in which visual stimuli must be processed (i.e., Preschool Matching Familiar Figure Task). In the Flanker Task, in contrast to the other tasks, the examiner used a brief story-telling paradigm to explain what the child was expected to do. Thus, it is possible that the children were more motivated to perform the Flanker task than the other tasks and that they were helped by a practical story rather than arbitrary and abstract rules for the task. Another possible explanation involves the difference in other executive demands of the task. For instance, the Dots task and the Preschool Matching Familiar Figure Task may require higher working memory than the other two tasks. Nevertheless, according to Munakata et al. (2011), the child needs to actively maintain the goal of the task in working memory in all types of inhibition tasks.

To discuss these mixed results, it is helpful to reflect on which variable was considered (accuracy vs. response time). Previous studies considered both accuracy and response time, and, as in our study, mixed results were reported. Nevertheless, it must be noted that in our study, accuracy was unrelated to response time in the DS group in both the Flanker task and the Preschool Matching Familiar Figure Task. This evidence may suggest that, like early pre-schoolers (Traverso et al., 2016), individuals with DS are not able to control response time to be more accurate; thus, response time may not be a useful index of executive control in this population.

We speculate that focusing on single task differences makes it difficult to investigate the efficacy of the inhibition components (see Miyake et al., 2000; Willoughby et al., 2015). Consequently, we prefer to focus on inhibition composite scores as indices of response inhibition and interference suppression. When composite scores were considered, the DS group performed similarly to the younger children on both components. In contrast, a significant difference emerged between the older children and the DS group in both components. These results suggest that individuals with DS show a deficit in both response inhibition and interference suppression components when compared with a TD population that shows more mature inhibition abilities than the younger group of TD children. In previous studies, most of the tasks used required response inhibition. Our results on the response inhibition component confirmed the evidence provided by Amadó et al. (2016), Lanfranchi et al. (2010), and Schott and Holfelder (2015). However, few studies have examined the interference suppression component. Moreover, to the best of our knowledge, this is the first study in which individuals with DS were compared with two typically developing groups at different stages of development.

In summary, our findings demonstrate that individuals with DS show a delay in inhibition development, but their performance is similar to that of typically developing 5-year-old children. This evidence is consistent with the study by Borella et al. (2013), in which individuals with DS showed difficulties in tasks assessing diverse inhibition components. Moreover, it should be noted that even though differences emerged between the groups, the three groups had the same level of general cognitive functioning. These results suggest that significant differences in inhibition abilities may characterize groups with similar levels of general cognitive functioning in typical development. Consequently, when differences between individuals with DS and typically developing children are investigated, it is possible that mixed results will emerge due to the age of typically developing children with similar cognitive functioning, which may be characterized by diverse levels of inhibition development.

# CONCLUSION

To the best of our knowledge, in the last 20 years, only ten studies have examined the inhibition abilities of individuals with DS. These studies reported contradictory results and generally used only response inhibition tasks without referring to a theoretical model of inhibition (see Borella et al., 2013 for the only exception in which an adult model was considered). This is the first study in which different inhibition tasks were used to investigate two inhibition components with reference to a model of inhibition tested in children (Gandolfi et al., 2014). Specifically, in the current study, we refer to response inhibition as the ability to control a predominant response and to interference suppression as the ability to respond to one task attribute and to inhibit the response to another attribute. Our results show that individuals with DS show a delay in both of the evaluated inhibition components. Given the importance of inhibition for other cognitive abilities (e.g., working memory, see Lustig et al., 2001; intelligence, see Lee et al., 2015), this evidence suggests that both the ability to control a response and the ability to manage interference must be supported in individuals with DS. More generally, we argue that when investigating inhibition in individuals with DS, it is preferable to use diverse inhibition tasks to obtain information on diverse inhibition components. As suggested by Morra et al. (2017), it is important to pay attention to the way that inhibition tasks are classified based on theoretical assumptions.

# LIMITATIONS AND FUTURE DIRECTIONS

There were some weaknesses in the current study that should be noted. First, although this study aimed to focus on inhibition, it would have been useful to control for other non-executive or executive abilities, such as working memory. Second, although the formative indices may represent a useful methodology to investigate executive functions, in this study, after testing the inhibition model on typically developing children with an EFA, we assumed that the inhibition construct was similar in both typical and atypical development. Increasing the sample size would be useful to examine findings observed using reflective and formative inhibition indices (Willoughby et al., 2015) in individuals with DS. Third, the DS group was matched for mental age to the typically developing children. Nevertheless, the DS group showed high variability in chronological age. Consequently, high variability in environmental factors that may have affected inhibition development must be considered. For example, when a large age range is considered, it could be useful to add information concerning the type of treatment and support received, as well as information on differences in treatment that may depend on the cohort to which the subject belongs. To minimize the effect of confounding factors, in future research, it would be useful to consider DS samples with reduced chronological and mental age ranges or to include chronological age-matched TD comparison groups (Godfrey and Lee, 2018).

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Ethical Code of Italian Psychology Order and of the Ethical guidelines of the Italian Association of Psychology with written informed consent from all subjects. All parents of the subjects gave written informed consent in accordance with the Declaration of Helsinki. At the time we collected the data, no ethical committee to which we could refer was yet present.

# AUTHOR CONTRIBUTIONS

LT and MU: revised the literature on inhibition development; MF and MP: revised the literature concerning inhibition in individuals with Down Syndrome; MU, MP, LT, and MF: conceived and designed the experiment; LT and MF: collected the data; LT and MU: performed the analysis; LT: wrote a first draft of the manuscript that was revised by MF, MU, and MP. All authors read and approved the final manuscript.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Traverso, Fontana, Usai and Passolunghi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Heterogeneity in Cognitive and Socio-Emotional Functioning in Adolescents With On-Track and Delayed School Progression

Loren Vandenbroucke<sup>1</sup>†, Wouter Weeda<sup>2</sup>†, Nikki Lee<sup>3</sup>, Dieter Baeyens<sup>1</sup>, Jon Westfall<sup>4</sup>, Bernd Figner<sup>5</sup> and Mariëtte Huizinga<sup>6</sup>\*

<sup>1</sup> Research Group of Parenting and Special Education, KU Leuven, Leuven, Belgium, <sup>2</sup> Department of Methodology and Statistics, Leiden University, Leiden, Netherlands, <sup>3</sup> Department of Clinical, Neuro- and Developmental Psychology, Vrije Universiteit Amsterdam, Amsterdam, Netherlands, <sup>4</sup> Department of Counselor Education and Psychology, Delta State University, Cleveland, MS, United States, <sup>5</sup> Behavioural Science Institute, Radboud University, Nijmegen, Netherlands, <sup>6</sup> Department of Educational and Family Studies, Vrije Universiteit Amsterdam, Amsterdam, Netherlands

#### Edited by:

Sarah E. MacPherson, The University of Edinburgh, United Kingdom

#### Reviewed by:

Sophie Taylor, Sheffield Hallam University, United Kingdom; Johanna M. Jarcho, Temple University, United States

> \*Correspondence: Mariëtte Huizinga m.huizinga@vu.nl

†These authors share first authorship

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 06 February 2018; Accepted: 07 August 2018; Published: 24 August 2018

#### Citation:

Vandenbroucke L, Weeda W, Lee N, Baeyens D, Westfall J, Figner B and Huizinga M (2018) Heterogeneity in Cognitive and Socio-Emotional Functioning in Adolescents With On-Track and Delayed School Progression. Front. Psychol. 9:1572. doi: 10.3389/fpsyg.2018.01572

Adolescence is characterized by considerable changes in cognitive and socio-emotional skills. There are considerable differences between adolescents with regards to the development of these skills. However, most studies examine adolescents' average functioning, without taking into account this heterogeneity. The current study applies network analysis in order to examine heterogeneity of cognitive and socio-emotional functioning in adolescents on-track or delayed in their school progression. Data was collected at two time-points for on-track (n = 320) and delayed (n = 69) adolescents (Mage = 13.30 years, SDage = 0.77). Repeated measures ANOVA showed no significant differences between the groups in cognitive and socio-emotional functioning (p's > 0.05). Network analysis revealed that executive functions play a key role in the network of cognitive, social, and emotional functioning. This is especially the case in the delayed group where executive functions are even more central, both at T1 (inhibition and shifting) and T2 (shifting). Subsequent community analysis revealed three profiles in both groups: a well-adapted and well-balanced group, a group with high levels of need for arousal and risk-taking, and a group with regulation problems. Compared to on-track adolescents, delayed adolescents showed even higher levels of risk-taking in the second profile and higher levels of executive function problems in the third profile at T1. These differences were leveled out at T2, indicating adolescents in the delayed group catch up with their peers. This study highlights the intricate balance between cognitive, social and emotional functioning in adolescents in relation to school performance and provides preliminary evidence of the importance of taking individual differences within groups into account.

Keywords: graph theory, network analysis, community analysis, executive functioning, adolescence, cognitive development, social development, emotional development

# INTRODUCTION


Adolescence is considered an important developmental stage (Konrad et al., 2013; Jaworska and MacQueen, 2015; Juraska and Willing, 2017). It is a period of transition that is not only characterized by obvious changes in biological and physical functioning, but also by rapid change in cognitive, social and emotional skills (Crosnoe and Johnson, 2011; Crone and Dahl, 2012). Adolescents' thinking becomes more complex, abstract and focused on metacognition; in the social domain they reorient from the family to peers, and in the emotional domain they experience heightened intensity and variability of emotions (Steinberg, 2005; Silvers et al., 2012; Blakemore and Mills, 2014; Dumontheil, 2014).

Due to the tremendous changes that occur during adolescence it is a period that offers both opportunities and challenges. For most adolescents, the increased cognitive, social and emotional abilities provide opportunities and result in increased competence, self-worth and positive social bonds (Briggs, 2009; Lewin-Bizan et al., 2010). Yet, some adolescents struggle with the many changes during this period and may become entangled in a negative spiral (Crone and Dahl, 2012). For example, adolescence is known to be accompanied by increases in anxiety, depression, risk-taking behavior, and academic problems (Prencipe et al., 2011; Eiland and Romeo, 2013; Blakemore and Mills, 2014; De Laet et al., 2016). Thus, there is great heterogeneity in the functioning of adolescents and the examination of these individual differences in adolescent developmental trajectories is essential to fully understand these cognitive, social and emotional changes. However, current studies often ignore the (neuropsychological) heterogeneity within samples. Additionally, most studies investigate cognitive, social and emotional skills in isolation, while in real life these skills are interrelated and influence each other. The current study attempts to fill these gaps by using network and community analyses to examine the interrelations between cognitive, social and emotional skills, while taking into account inter-individual differences between adolescents in their development.

# Cognitive, Social, and Emotional Changes in Adolescence

Rapid and large developmental changes occur in parallel during adolescence in three domains: cognitive, social and emotional functioning (Crone and Dahl, 2012). With regard to cognitive development, adolescents' cognition becomes more complex and efficient and is characterized by increases in abstract, multidimensional and metacognitive thinking (Steinberg, 2005; Dumontheil, 2014). Central in this area of development are the substantial improvements with regard to executive functioning (Weil et al., 2013). These improvements can be seen both in increased performance on neuropsychological tasks measuring executive functioning (e.g., Digit Span, Go/No-go task; Zelazo and Carlson, 2012) and in increases in adolescents' self-regulating behavior (e.g., reductions in hyperactivity; Coté et al., 2002). In the area of social development, adolescence marks a reorientation from family to peers (Blakemore and Mills, 2014). Adolescents spend more time with peers, interact with larger peer groups and interact more often with opposite-sex peers compared to children. As a consequence, peer problems are more likely to arise and peer influences on adolescents' behavior become more apparent (Albert et al., 2013; Huang et al., 2014). Finally, changes in emotional functioning are characterized by the experience of more intense emotions (both negative and positive) and more variability in emotions, increases in need for arousal, and increases in emotional regulation abilities (Steinberg, 2005; Silvers et al., 2012).

# Interrelations Among Developmental Domains and Heterogeneity in Adolescence

Most available studies examining adolescent development and functioning study cognitive, social and emotional skills in isolation, while in real life these skills are known to influence each other. For example, the increase in emotion intensity and variability in combination with the slow and prolonged development of executive functions, can lead to a discrepancy between adolescents' knowledge about negative consequences of behavior and their actual behavior in emotionally loaded situations, resulting in an increase in risk-taking behavior (Prencipe et al., 2011). Recent research even defines hot executive functions as the cognitive top-down control in emotionally important situations and distinguishes it from cool executive functions, which are used for top-down control in neutral (or purely cognitive) situations (Zelazo and Carlson, 2012). Similarly, due to the increased importance and influence of peers and the increased ability of metacognitive thinking, the way adolescents think about themselves and their emotional experiences (e.g., loneliness) changes (Blakemore and Mills, 2014; Vanhalst et al., 2014). These examples show that the general functioning of adolescents and their functioning at school is determined by the interrelation between cognitive, emotional and social skills, both at the neurocognitive level (e.g., executive functioning) and the behavioral level (e.g., risk-taking behavior). Yet, the interrelation between these skills is, so far, rarely systematically investigated. As a consequence, little is known about how these skills relate to each other, what the relative importance of each of these skills is, and what the role of a potential imbalance between these skills is for issues common during adolescence, such as academic difficulties.

Although the importance of individual variability in the functioning of adolescents is generally recognized, many studies do not explicitly examine this heterogeneity. Rather, most studies look at average development or functioning within samples or compare the average individual in one group with an average individual in another group. This is especially the case for community samples, which are often treated as a homogeneous group of individuals without symptoms or impairments (van der Meer et al., 2017). However, there is strong evidence that within a group of 'typically developing' individuals and adolescents there can also be large differences in cognitive, social and emotional skills (Fair et al., 2012; van der Meer et al., 2017). Not explicitly modeling or examining such heterogeneity can limit our understanding of the occurrence and causes of specific deficits (Fair et al., 2012).

# Grade Retention

One domain where great changes in cognition and socio-emotional functioning can result in maladaptive outcomes during adolescence is education (De Laet et al., 2016). While obvious changes still have to occur in cognitive, social and emotional functioning, the altered school environment places high demands on these skills. For example, increases in class size, decreases in adult support and increased individual responsibility place higher demands on executive functioning and social skills (Jacobson et al., 2011). Consequently, difficulties with these skills can result in negative scholastic outcomes, such as grade retention. Previous research has shown behavior problems, aggressive behavior, peer relationships, motivation and self-efficacy, and executive functioning to be predictive of school outcomes such as drop out, grades and grade retention (Jimerson et al., 2000; Robbins et al., 2004; Jimerson and Ferguson, 2007; Davoudzadeh et al., 2015; Fitzpatrick et al., 2015).

Yet, current literature on grade retention focuses on examining the involvement of isolated aspects of cognitive, social and emotional functioning. Additionally, these studies often compare on-track and delayed adolescents without taking the heterogeneity within each group into account. This approach ignores that the balance between different skills can play a role in grade retention, and that different skills may have played a role in the retention of different adolescents. The current study examines the role of the interrelation between cognitive and socio-emotional skills in grade retention and takes into account the within-group heterogeneity with the use of graph theory as an analytical approach.

#### Graph Theory as an Analytical Approach

Graph theory involves the analysis of relationships within a network. For example, scores on measures of cognitive functioning can be represented as nodes within a network, and the correlations between these variables as edges which form the connections between the nodes. This approach has been applied in multiple domains, such as neuroimaging research examining connectivity between brain regions within neural networks (Dosenbach et al., 2013; Kellermann et al., 2015), or within the field of psychopathology when comparing patient and control groups (McNally, 2016). In the latter field it has, for example, been used to examine which symptoms (Heeren and McNally, 2016; Russell et al., 2017) or which neuropsychological skills (Heeren and McNally, 2016; Hoorelbeke et al., 2016; Ibrahim et al., 2016) are central in clinical groups compared to non-clinical groups, or whether core symptoms of clinical disorders vary across developmental periods (Martel et al., 2016).
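To make the node and edge representation concrete, the following minimal sketch builds a small correlation network and ranks its nodes by strength (the sum of absolute edge weights). The variable names, the scores, the edge threshold of 0.3 and the choice of strength as the centrality index are illustrative assumptions, not the measures or settings used in the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def correlation_network(scores, threshold=0.3):
    """Nodes are measures; an edge is kept when |r| reaches the threshold."""
    edges = {}
    for a, b in combinations(sorted(scores), 2):
        r = pearson(scores[a], scores[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


def strength_centrality(nodes, edges):
    """Sum of absolute edge weights attached to each node."""
    strength = {node: 0.0 for node in nodes}
    for (a, b), r in edges.items():
        strength[a] += abs(r)
        strength[b] += abs(r)
    return strength


# Hypothetical scores of six participants on four measures (made-up data).
scores = {
    "inhibition": [1, 2, 3, 4, 5, 6],
    "shifting": [2, 1, 4, 3, 6, 5],
    "peer_problems": [1, 3, 2, 5, 4, 6],
    "need_for_arousal": [3, 3, 4, 2, 5, 1],
}

edges = correlation_network(scores)
strength = strength_centrality(scores, edges)
most_central = max(strength, key=strength.get)
print(most_central, round(strength[most_central], 2))
```

In this toy network the node with the largest strength is the one that correlates most strongly with the other measures, which is the sense in which a skill is described as "central".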

Understanding which impairments are central for specific (clinical) groups can aid the (early) recognition and treatment of these problems (Ibrahim et al., 2016). This can be examined in more detail through detection of communities within graphs where individuals are represented as nodes, and the similarities between individuals (e.g., in their cognitive profiles) are represented by the edges. Communities are densely connected groups of nodes, with sparse connections to other groups. Individuals within a community show a high degree of similarity in the determinants of their behavior, while the various communities may display similar overt behaviors, but the underlying determinants often differ. One of the first studies to apply this method was conducted by Fair and colleagues (2012), who used community detection to determine the existence of subgroups based on neuropsychological profiles in both a typically developing group and an ADHD group of children and adolescents. Their results suggested that there was significant heterogeneity within both the patient and control groups, and that comparison between the groups was greatly improved when patients and controls were matched based on their neuropsychological profiles. Thus, the advantage of this approach is that rather than averaging out inter-individual differences in behavior, it enables examination of relative strengths and weaknesses within groups, and therefore can provide new information about underlying difficulties for specific groups.
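The notion of a community as a densely connected group with sparse links to other groups can be quantified with the modularity index Q, which compares the share of edges falling inside each community with the share expected from the node degrees alone. The sketch below computes Q for two candidate partitions of a hypothetical six-person similarity graph. The graph, the participant labels and the use of unweighted edges are assumptions made for illustration, and the sketch only scores a given partition rather than searching for the best one.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    q = 0.0
    for members in communities:
        members = set(members)
        internal = sum(1 for a, b in edges if a in members and b in members)
        degree_sum = sum(degree.get(node, 0) for node in members)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Hypothetical graph: an edge links two participants with similar profiles.
similarity_edges = [
    ("p1", "p2"), ("p1", "p3"), ("p2", "p3"),  # first dense cluster
    ("p4", "p5"), ("p4", "p6"), ("p5", "p6"),  # second dense cluster
    ("p3", "p4"),                              # single bridge between them
]

q_clustered = modularity(similarity_edges, [["p1", "p2", "p3"], ["p4", "p5", "p6"]])
q_scattered = modularity(similarity_edges, [["p1", "p4"], ["p2", "p5"], ["p3", "p6"]])
print(round(q_clustered, 3), round(q_scattered, 3))
```

The partition that follows the two dense clusters yields a clearly positive Q, whereas the partition that cuts across them yields a negative Q. Community detection algorithms search for the partition that maximizes this kind of index.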

# The Current Study

Where previous studies using graph theory as an analytical approach examined the networks of neuropsychological skills in clinical samples (e.g., ADHD), the current study shows that this approach can also be used to investigate the networks of skills in different developmental domains in a community sample of adolescents. The current study examines cognitive, social and emotional skills in adolescents on-track and delayed in their scholastic progress, adopting a graph theory approach. This approach allows examination and comparison of the influence of multiple cognitive (cool executive functioning, conduct problems, and hyperactivity), social (social support, resistance to peer influence, peer problems, and prosocial behavior) and emotional skills (emotional problems, emotion regulation, hot executive functioning, and need for arousal) in an on-track and delayed (in terms of their educational progress) adolescent group. With regard to cool executive functioning, the current study uses both performance tasks of working memory, inhibition and cognitive flexibility, as well as questionnaires assessing these and more complex executive functions (e.g., planning). This way the current study thoroughly examines three important core executive functions that still show clear developmental changes in early adolescence, as well as a broader perspective on executive functioning (Zelazo and Carlson, 2012; Diamond, 2013). Additionally, it allows the examination of the heterogeneity within these groups by investigating the existence of subgroups with specific patterns of cognitive, social and emotional skills within both the on-track and the delayed group. It is expected that network analyses can reveal differences between the on-track and delayed group of adolescents that are not apparent when applying traditional analytical approaches. Using this innovative approach allows for more nuanced insights into which skills play a role in grade retention for which subgroups of adolescents.

# MATERIALS AND METHODS

#### Participants

The present study used data from two waves. The target sample involved pre-adolescents and adolescents living in urban and rural areas in the Netherlands, attending regular schools for primary and secondary education. At baseline (T1), data were collected from 524 participants (mean age 13.13 years, SD = 0.86, 247 girls). Approximately 1 year later (mean time difference = 0.89 years; SD = 0.08), all participants were invited for data collection for the second time point (T2). A total of 101 participants moved, or indicated that they could not, or did not want to participate again. Therefore, at T2, data was collected from 423 participants (mean age 14.18 years, SD = 0.77, 199 girls). The children that dropped out between T1 and T2 differed slightly from the group that participated in both waves with respect to their age at T1 [Mdrop-out = 12.51, Mboth = 13.28; t(142.56) = 8.08; p < 0.001]. This difference is strongly related to the younger children transferring to different schools for secondary education after middle school; we experienced great difficulty recruiting these children for participation in T2, as they failed to inform us about their new school. The children that dropped out did not differ from the children that participated in both waves with respect to their IQ at T1 [Mdrop-out = 25.86, Mboth = 25.39; t(171.30) = 1.28; p = 0.202], nor did they differ in terms of their gender distribution [χ²(1) = 0.00; p = 0.944] or the distribution of whether they repeated a class [χ²(1) = 1.08; p = 0.299].

A final sample of N = 389 (188 girls) was included in the analyses reported in the current study; these participants completed the entire task battery and all surveys in both waves (T1: mean age 13.30 years, SD = 0.77; T2: mean age 14.19 years, SD = 0.74).

Next, we identified the students who repeated a class during their school career (i.e., the delayed group, n = 69; 27 girls) as well as the students that never repeated a class (i.e., the on-track group, n = 320; 161 girls). The gender distributions in the delayed and the on-track group did not differ significantly, χ²(1) = 2.41, p = 0.120. At T1, the mean age in the delayed group slightly differed from the on-track group (13.81 years; SD = 0.79 vs. 13.19 years; SD = 0.72), t(93.01) = 6.00; p < 0.001. Consequently, at T2, the mean age in the delayed group slightly differed from the on-track group (14.72 years; SD = 0.76 vs. 14.08 years; SD = 0.69), t(92.85) = 6.36; p < 0.001. The delayed and the on-track group did not differ in terms of their IQ scores at T1 [Mdelayed = 25.13, Mon-track = 25.43; t(106.08) = 0.66; p = 0.513] or at T2 [Mdelayed = 25.46, Mon-track = 26.00; t(102.08) = 1.28; p = 0.203]. Note that effects of age and IQ were regressed out in the analyses (see Results section for a description of the regression approach).
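The fractional degrees of freedom reported for these group comparisons suggest an unequal-variance (Welch) t-test, although the text does not name the test. Assuming that this is the case, the sketch below recomputes the T1 age comparison from the reported means, standard deviations and group sizes. The function name is hypothetical, and small deviations from the reported values are to be expected because the inputs are rounded to two decimals.

```python
from math import sqrt


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite df from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first group mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# T1 age: delayed group (n = 69) versus on-track group (n = 320).
t_age_t1, df_age_t1 = welch_t_from_summary(13.81, 0.79, 69, 13.19, 0.72, 320)
print(f"t = {t_age_t1:.2f}, df = {df_age_t1:.1f}")
```

Under these assumptions the statistic comes out at about 6.00 with roughly 94 degrees of freedom, which is in line with the reported t(93.01) = 6.00.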

All participants provided written informed consent for the study (parental consent and participant assent for children and adolescents) at both time points. Instead of receiving individual credit, participants received a voucher for an excursion together with their classmates. The study was approved by the Ethical Committee of the Faculty of Behavioral and Social Sciences of the University of Amsterdam.

Estimated intelligence scores were obtained by using a nonverbal scale, the matrix reasoning subscale of the Stanford Binet V (Roid, 2003). At T1, the mean scores on this task were 25.13 (SD = 3.42) for the delayed group, and 25.43 (SD = 3.77) for the on-track group. At T2, the mean scores on this task were 25.46 (SD = 3.15) for the delayed group, and 25.98 (SD = 3.27) for the on-track group. Mean scores did not differ between waves [F(1, 387) = 2.04, p = 0.155], nor between groups [F(1, 387) = 1.41, p = 0.236].

#### Materials

Participants' cognitive, social and emotional functioning was indexed by surveys and behavioral measures. **Table 1** provides an overview of the measures and the variables of interest per domain of functioning.

# Surveys

#### **SDQ**

To assess child mental health problems, we used the Dutch self-report version of the Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997, 2001; van Widenfelt et al., 2003). The SDQ consists of 25 items covering five subscales relating to emotional problems, peer relationship problems, conduct problems, hyperactivity/inattention and prosocial behavior. Each subscale comprises five questions with 3-point response scales ("Not true" = 0, "Somewhat true" = 1, and "Certainly true" = 2). Example items include "I am constantly fidgeting or squirming" (hyperactivity), or "I am kind to younger children" (prosocial behavior). The variables of interest were the mean scores per subscale, with higher scores indicating more mental health problems (A. Goodman et al., 2010).

#### **RPI**

To assess resistance to peer influence, we used an adapted version of the Resistance to Peer Influence questionnaire (Steinberg and Monahan, 2007). This adaptation consisted of ten statements, such as "My friends easily make me change my mind," or "I say things I don't really mean, when I think that my friends will respect me more." Respondents indicated on 5-point response scales ("Not at all true" = 0, "Not true" = 1, "Somewhat true" = 2, "True" = 3, and "Certainly true" = 4) which of the answer options applied to them. The variable of interest was the total score of all items, with higher scores representing higher resistance to peer influence.

#### **BRIEF**

To assess daily life executive functions, we used the self-report version of the Behavior Rating Inventory of Executive Function (BRIEF; Guy et al., 2004; Huizinga and Smidts, 2012). The questionnaire was completed by the adolescents, who indicated how often a given behavior has occurred in the past 6 months by endorsing one of three responses, namely "Never," "Sometimes," or "Often." The BRIEF consists of 75 items, concerning specific behaviors relating to executive functioning in children. The questionnaire comprises eight clinical scales (Inhibit, Shift, Emotional Control, Task Completion, Working Memory, Plan/Organize, Organization of Materials, and Monitor), used as variables of interest. Example items include: "Gets out of seat at the wrong times" (Inhibit), "Is disturbed by change of teacher or class" (Shift), or "Makes careless errors" (Monitor). Higher scores indicated more problems with executive functions.

TABLE 1 | Overview of the measures and variables of interest per domain of functioning (i.e., cognitive, social, and emotional functioning).

#### **Social support scale**

To assess the perceived social support and regard which significant others manifest toward children and young adolescents, we used an adapted version of the Social Support Scale (Harter, 1985). This version consisted of 16 statements, concerning their parents (4 items), classmates (4 items), teachers (4 items), and close friends (4 items). Example items are: "I have parents who want to listen to my problems," "I have classmates I can become friendly with," "I have a teacher who cares about me," or "I have a close friend who really understands me." Respondents indicated on 5-point scales ("Not at all true" = 0, "Not true" = 1, "Somewhat true" = 2, "True" = 3, and "Certainly true" = 4) which of the answer options applied to them. The variables of interest were the total scores on the parents, classmates, teachers, and close friends scales. Higher scores indicated a higher perceived availability of social support.

#### **Need for arousal**

To assess situation-unspecific trait-like aspects of need for arousal, we used an eight-item questionnaire devised by Figner et al. (2009). Questions refer to broad preferences regarding the level of novelty in general and the propensity to expose oneself to risky situations in everyday life (e.g., "I like a lot of variety" and "I often position myself in an exciting/dangerous situation on purpose"). Responses on this scale were given using a visual analog scale (slider bar), with scale endpoints "Does not apply at all" (scored as 1) and "Applies very much" (scored as 100). The variable of interest was the mean score of all eight questions.

#### Behavioral Measures

#### **Dots-triangles**

In order to assess cognitive flexibility, we used the Dots and Triangles task, which is part of a task battery to assess executive functions in children, adolescents and young adults (Huizinga et al., 2006). In a 4 × 4 grid on the screen, varying numbers (i.e., three to eight per half of the grid; equally distributed) of dots or triangles were presented. During the "dots" task, participants had to decide whether there were more dots in the left or the right part of the screen (block 1; 30 practice trials, 50 experimental trials). During the "triangles" task, participants had to decide whether there were more triangles in the top or in the bottom part of the screen (block 2; 30 practice trials, 50 experimental trials). Blocks 1 and 2 were administered in randomized order. In the third block (90 practice trials, 150 experimental trials), a series of four "dots" trials and a series of four "triangles" trials were alternately presented to the participants. The focus of the analyses was on reactions in the third block, in which responses could be preceded by a trial requiring the same task (i.e., task-repeat trials) or could require a switch to the alternative task (i.e., task-switch trials). Participants had 3500 ms to respond; when a response was given, the stimulus disappeared from the screen. The time interval between the response and the next stimulus varied pseudo-randomly between 900 and 1100 ms in steps of 10 ms. The main dependent variable was the ratio of the median response latencies on task-repeat trials and task-switch trials.
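The dependent variable for the mixed block can be sketched as follows. Each trial is labeled as task-repeat or task-switch by comparing it with the preceding trial, and the median latencies of the two trial types are then divided. The response times are made up, the function name is hypothetical, and the direction of the ratio (repeat divided by switch) as well as the absence of any error or outlier exclusion are assumptions, since the text does not specify them.

```python
from statistics import median


def repeat_switch_ratio(tasks, latencies_ms):
    """Ratio of median latency on task-repeat trials to task-switch trials.

    The first trial has no predecessor and is therefore left unclassified.
    """
    repeat, switch = [], []
    for previous, current, latency in zip(tasks, tasks[1:], latencies_ms[1:]):
        if current == previous:
            repeat.append(latency)
        else:
            switch.append(latency)
    return median(repeat) / median(switch)


# Hypothetical excerpt of the mixed block: runs of four trials per task.
tasks = ["dots"] * 4 + ["triangles"] * 4 + ["dots"] * 4
latencies_ms = [700, 600, 610, 590, 900, 620, 600, 580, 880, 610, 590, 600]

ratio = repeat_switch_ratio(tasks, latencies_ms)
print(round(ratio, 3))
```

With this direction of the ratio, values below 1 indicate slower responding on switch trials than on repeat trials. The same computation would apply to the congruent and incongruent trials of a flanker task once the trial labels are given.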

#### **Eriksen flankers task**

In order to assess the ability to resist interference, we used the Eriksen Flankers Task, which was also part of the task battery to assess executive functions in children, adolescents and young adults (Ridderinkhof and Van der Molen, 1995; Huizinga et al., 2006). Participants had to respond to a left vs. right pointing arrow presented at the center of the screen, by pressing a left or right response button. The arrow was flanked by four arrows pointing in the same direction (i.e., →→→→→ or ←←←←←; congruent condition) or by four arrows pointing in the opposite direction (i.e., →→←→→ or ←←→←←; incongruent condition). There were 50 practice trials and 100 experimental trials (i.e., 50 congruent trials and 50 incongruent trials, varied pseudo-randomly). Each trial started with the presentation of a rectangle, which served as the warning stimulus. After a time interval of 400 to 600 ms (varied pseudo-randomly in steps of 10 ms), the arrow array was presented. Participants had 2500 ms to respond; the arrow array disappeared from the screen when a response was made. The inter-trial interval varied pseudo-randomly between 900 and 1100 ms in steps of 10 ms. The main dependent variable was the ratio of the median response latencies on congruent trials and incongruent trials.

#### **Digit span**

To assess working memory capacity, we used a computerized version of the Digit Span subtest from the WISC-III-NL (Kort et al., 2005). The Digit Span comprises two parts: the Digit Span Forward and the Digit Span Backward. The Digit Span Forward requires the participant to repeat increasingly longer strings of numbers, in the same order as presented on the computer screen. The Digit Span Backward requires the participant to repeat increasingly longer strings of numbers, in the reverse of the order presented on the computer screen. Numbers were presented at a rate of one number per second, and responses were given on a number pad. In both parts of the Digit Span, each test item consisted of two strings of digits administered at each list length, starting with a string of two digits, and increasing in length by one digit following successful repetition of at least one string of digits at a given length. Testing was discontinued when a participant incorrectly repeated two strings of the same length. Digit Span scores were computed using the longest string length correctly recalled. The variable of interest was the ratio of the Digit Span Forward and the Digit Span Backward.
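The administration and scoring rule described above can be sketched as follows. This is an illustrative reading of the procedure, not the software used in the study: the function names and the response data are hypothetical, and the ratio is assumed to be forward span divided by backward span.

```python
def digit_span_score(results, start_length=2):
    """Longest list length at which at least one of the two strings was correct.

    `results` maps a list length to a pair of booleans (first string correct,
    second string correct). Testing starts at `start_length`, moves up by one
    digit after at least one correct repetition, and is discontinued when both
    strings of the same length are repeated incorrectly.
    """
    longest = 0
    length = start_length
    while length in results:
        first_correct, second_correct = results[length]
        if not (first_correct or second_correct):
            break  # two failures at the same length: discontinue
        longest = length
        length += 1
    return longest


def span_ratio(forward_span, backward_span):
    """Forward span divided by backward span (assumed direction of the ratio)."""
    if backward_span == 0:
        raise ValueError("backward span of 0 leaves the ratio undefined")
    return forward_span / backward_span


# Hypothetical response records, for illustration only.
forward = {2: (True, True), 3: (True, False), 4: (False, True), 5: (False, False)}
backward = {2: (True, True), 3: (False, False)}
forward_span = digit_span_score(forward)
backward_span = digit_span_score(backward)
ratio = span_ratio(forward_span, backward_span)
```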

#### **Columbia card task**

In order to assess risk-taking behavior, we used a computerized version of the Columbia Card Task (CCT; Figner et al., 2009). The CCT comprises a hot and a cold version. The hot version of the CCT was designed to trigger the involvement of affective decision-making processes, whereas the cold version of the CCT was designed to assess risk-taking under predominantly deliberative conditions involving 'cold' cognitive processes. In the hot version, 32 cards are presented face down in an 8 × 4 grid on the screen (see Figner et al., 2009 for an example of the layout). Among these cards are loss cards and gain cards. At the top of the screen, information is provided about the number of loss cards (1 or 3) hidden among them, the gain amount for each turned over gain card (10 or 30 points), and the loss amount when a loss card is turned over (−250 or −750). There were 24 game rounds. Each new game round starts with a score of 0 points and all 32 cards shown face down. Within a game round, participants are required to make a series of binary decisions whether to turn over a card or to stop turning over cards. After each turn, they receive feedback, indicating whether the turned card was a gain card or a loss card. A running total of the accumulated amount of points is shown when a card is turned over. A game round continues until the participant turns over a loss card (leading to the subtraction of the loss from the running score), or when the participant decides to stop turning cards. The magnitude of gain, magnitude of loss, and the number of loss cards varied across game rounds. The variable of interest is the average number of cards chosen to turn, as an indicator of risk-taking. The cold version of the CCT is similar to the hot version, but without the inclusion of affective processes during decision making. Specifically, at the beginning of each game round the participant decides how many cards he or she wants to turn over. In addition, outcome feedback is only provided after all game rounds have been played. Again, the variable of interest is the average number of cards chosen to turn.
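The course of a single game round and the resulting risk-taking index can be sketched as below. This is a simplified illustration rather than the task software: the participant's stopping point is fixed in advance (`max_turns`), whereas in the hot version the decision is made card by card, and a turned loss card is assumed to count toward the number of cards turned. All names and values are hypothetical.

```python
def play_round(cards, max_turns, gain, loss):
    """Play one round and return (cards_turned, score).

    `cards` lists the cards in the order they are turned (True = loss card).
    `max_turns` is the point at which the participant stops voluntarily.
    `gain` and `loss` are positive magnitudes in points.
    """
    score = 0
    turned = 0
    for is_loss in cards[:max_turns]:
        turned += 1
        if is_loss:
            score -= loss  # the loss is subtracted and the round ends
            break
        score += gain
    return turned, score


def mean_cards_turned(rounds):
    """Average number of cards turned across rounds (the risk-taking index)."""
    return sum(turned for turned, _ in rounds) / len(rounds)


# Hypothetical rounds: the third card in the turning order is a loss card.
order = [False, False, True, False]
risky = play_round(order, max_turns=4, gain=10, loss=250)
cautious = play_round(order, max_turns=2, gain=10, loss=250)
index = mean_cards_turned([risky, cautious])
```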

#### Procedure

Testing took place in two sessions. In one session, the experimental tasks were administered. All participants were tested simultaneously in groups of two. The order of the tasks was counterbalanced. There were 3-min breaks between tasks. Each test was practiced first; when the participants understood the instructions, the actual testing took place. Each experimental test session lasted approximately 1 h. In the other test session, participants filled out the surveys. This was done in groups of approximately 15 participants, and took place in the common computer lab at school, which was reserved for testing. Each survey testing session lasted approximately 1 h.

# RESULTS

We performed three sets of analyses. The first set included analyses of variance to assess performance differences between the delayed and the on-track group, and between waves. Data are presented in three domains: cognitive functioning, social functioning and emotional functioning (see **Table 2**). Age and IQ were used as covariates. The second set included a network analysis to model pairwise relationships between the variables in the delayed and the on-track groups. The third set included a community profile analysis to assess neuropsychological heterogeneity in relation to educational performance (i.e., delayed or on-track). To control for differences in Age and IQ in the network analysis we fitted, for each variable in each wave, a linear model with Age and IQ as predictors and the variable of interest as dependent variable. The residuals of each regression were subsequently used to calculate the covariance matrix used as input in the network analysis. Analyses were performed using the lm function in R (R Core Team, 2016).
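The residualization step can be illustrated with a small sketch. The study used the lm function in R; the Python version below only mirrors the idea with ordinary least squares on an intercept, Age and IQ, and all function names and data values are hypothetical.

```python
def solve_linear(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residualize(y, age, iq):
    """Residuals of y after regressing it on an intercept, age and IQ."""
    design = [[1.0, a, q] for a, q in zip(age, iq)]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(3)]
    beta = solve_linear(xtx, xty)  # normal equations
    return [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(design, y)]


def covariance(u, v):
    """Sample covariance of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


# Hypothetical data for six participants and two variables of interest.
age = [13.0, 14.0, 15.0, 16.0, 14.0, 15.0]
iq = [100.0, 95.0, 110.0, 105.0, 120.0, 90.0]
var_a = [0.62, 0.70, 0.66, 0.75, 0.58, 0.73]
var_b = [12.0, 15.0, 11.0, 14.0, 9.0, 16.0]
res_a = residualize(var_a, age, iq)
res_b = residualize(var_b, age, iq)
cov_ab = covariance(res_a, res_b)  # one entry of the input covariance matrix
```

Repeating the last step for every pair of residualized variables would yield the full covariance matrix that serves as input to the network estimation.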

#### Analysis of Variance

The marginal means and standard errors of the variables of interest per domain of functioning and per time point (T1 and T2) are reported in **Table 2** for the delayed and on-track group.

We performed separate repeated-measures ANOVAs with Time Point (T1 and T2) as within-subject variable, and Group (delayed and on-track) as between-subjects variable. As can be seen in **Table 2**, none of the interactions were significant, indicating that the combined effect of Time Point and Group on the variables of interest was absent. In other words, on average both groups showed the same change in scores from T1 to T2 on all measures.

#### Network Analysis

Network analysis was performed on the covariance matrix of the 24 pre-processed variables. To estimate the optimal network in each group we used the GLASSO algorithm (Friedman et al., 2008) with an Extended BIC (EBIC, Foygel and Drton, 2010) criterion to select the optimal sparseness parameter. We estimated the optimal model on the covariance matrix of participants of both groups, but separately for each wave. Using this sparseness parameter, we then estimated the network separately for the on-track and delayed group in each wave. After network estimation, we calculated degree centrality for each node. To obtain confidence intervals we used a bootstrap approach with 1000 iterations. For each iteration, we took a random sample (with replacement) from each group and performed sparseness parameter selection (over both groups) and network estimation. All analyses were performed using the bootnet package (Epskamp et al., 2016) in R (R Core Team, 2016).
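The link between an estimated sparse network and the centrality measure can be sketched as follows. The sketch does not reproduce the GLASSO or bootnet implementation; it assumes that a sparse precision (inverse covariance) matrix is already available, converts it to partial correlations, and counts the non-zero edges per node as degree. The matrix values and function names are hypothetical.

```python
import math


def partial_correlations(precision):
    """Partial correlation matrix derived from a precision matrix.

    The partial correlation between nodes i and j equals
    -p_ij / sqrt(p_ii * p_jj); the diagonal is left at zero.
    """
    p = len(precision)
    pcor = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                pcor[i][j] = -precision[i][j] / scale
    return pcor


def degree_centrality(pcor, tol=1e-10):
    """Number of non-zero edges attached to each node."""
    return [
        sum(1 for j, w in enumerate(row) if j != i and abs(w) > tol)
        for i, row in enumerate(pcor)
    ]


# Hypothetical sparse precision matrix for three variables.
precision = [
    [2.0, -0.8, 0.0],
    [-0.8, 2.0, 0.5],
    [0.0, 0.5, 2.0],
]
pcor = partial_correlations(precision)
degree = degree_centrality(pcor)
```

Whether degree was computed as an edge count or as the sum of absolute edge weights is not specified in the text; the edge count is used here as the simpler assumption.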

Estimated networks for the on-track and delayed groups for both waves are shown in **Figure 1**. Colored circles indicate the nodes of the network (red = cognitive functioning, green = social functioning, and blue = emotional functioning). Lines between the nodes are the non-zero partial correlations (red = negative and green = positive); thicker lines indicate stronger correlations. In order to examine differences between the on-track and delayed group, we plotted so-called 'spring' plots (Fruchterman and Reingold, 1991), in which variables that are strongly related are plotted closer together and more central variables are placed nearer the center of the plot. In addition, **Figure 2** shows the centrality estimates for all variables for the two waves across the three domains.

TABLE 2 | The interaction effects of the variables of interest (per domain of functioning) at T1 and T2, with group (delayed vs. on-track).

NB. df = 1,387.

FIGURE 1 | Network model of all variables, for the two waves across the three domains (cognitive functioning in red; social functioning in green; and emotional functioning in blue) and groups (delayed vs. on-track). The variables are now plotted in a 'spring' plot (Fruchterman and Reingold, 1991), in which variables that are strongly related are closer together and the more central nodes are in the center of the plot. EFSW, Dots-Triangles; EFIN, Eriksen Flankers Task; EFWM, Digit Span; RCOL, Columbia Card Task (cold version); BRIN, BRIEF Inhibit; BRFX, BRIEF Shift; BRWM, BRIEF Working Memory; BRTC, BRIEF Task Completion; BRPO, BRIEF Plan/Organize; BROM, BRIEF Organization of Materials; BRMO, BRIEF Monitor; SDQC, SDQ Conduct problems; SDQH, SDQ Hyperactivity/inattention; SSPA, Social Support Scale – Parents; SSCM, Social Support Scale – Classmates; SSTE, Social Support Scale – Teachers; SSFR, Social Support Scale – Close friends; RPIF, Resistance to peer influence; SDQS, SDQ Prosocial behavior; SDQP, SDQ Peer relationship problems; NFAR, Need for arousal; RHOT, Columbia Card Task (hot version); SDQE, SDQ Emotional symptoms; BRER, BRIEF Emotional Control.

As can be seen in **Figure 1**, across all waves and groups there are strong connections within the cognitive domain, indicating that the variables comprising this domain are strongly related. For the other domains, interrelations are present but weaker. Across domains of functioning, there are strong connections between executive function behavior (as measured by questionnaires) and emotion regulation, and among conduct problems, prosocial behavior, and teacher social support. Overall, the networks for both groups are relatively stable across waves.

The most prominent difference between the delayed and the on-track group is the greater number of connections within and between social and emotional variables in the delayed group. Remarkably, in the delayed group, the variables in the emotional domain are more central than in the on-track group. Looking at the individual variables per domain (**Figure 2**), in the cognitive functioning domain, hyperactivity (SDQ), conduct problems (SDQ), inhibition, shifting, monitoring, and difficulties with organization of materials (BRIEF) play a more central role in the delayed group compared to the on-track group. In the social functioning domain, only teacher social support seems to play a more prominent role in the delayed group than in the on-track group. In the emotional functioning domain, risk-taking in the hot condition in particular plays a more central role in the delayed group compared to the on-track group. The observed differences in all three domains become smaller in the second wave.

#### Community Profile Analysis

Community profile analysis was performed using an approach similar to that of Fair and colleagues (2012). We first transformed all variables for both groups and waves to z-scores. To aid interpretability of the profiles we clustered related variables using Confirmatory Factor Analysis (CFA, using the lavaan package in R, R Core Team, 2016) separately for each wave. For an overview of the factors see **Table 3**. The resulting factor scores for each participant indicate their individual profile. We then correlated each participant's profile with the profiles of all other participants. On this 389 × 389 correlation matrix, we performed a community analysis using the Louvain algorithm (Rubinov and Sporns, 2011) in R. We ran the analysis 200 times and further examined the analysis with the highest modularity index [Q, higher index indicates better separable 'modules' or 'communities,' ranges from −0.5 to 1 with positive scores indicating above chance level separation in modules (Newman, 2004)]. In addition, we examined the uniqueness of the communities (i.e., a significant variation from random). We compared our Q index against an estimated distribution of Q values under the null hypothesis (Fair et al., 2012). For this approach, we randomized the factor scores of all participants (separately for each wave) 200 times and estimated the Q index at each instance. The distribution of Q's across the 200 iterations was taken as our null distribution to which we compared our observed Q values for each wave.
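The modularity index and its comparison against a null distribution can be sketched as follows. The sketch evaluates Q for a given partition using Newman's formula for non-negative edge weights; it is not the Louvain algorithm, and it does not handle the negative weights that occur in a correlation matrix (signed variants treat positive and negative weights separately). The example network, the null values, and the function names are hypothetical.

```python
from statistics import mean, stdev


def modularity(weights, communities):
    """Newman modularity Q of a partition of a symmetric, non-negative network.

    Q = (1 / 2m) * sum over pairs in the same community of
        (w_ij - k_i * k_j / 2m), where k_i is the summed weight of node i
    and 2m is the total summed weight.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]
    total = sum(strength)  # equals 2m
    if total == 0:
        raise ValueError("network has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += weights[i][j] - strength[i] * strength[j] / total
    return q / total


def z_against_null(observed_q, null_qs):
    """Standardized distance of the observed Q from the null distribution."""
    return (observed_q - mean(null_qs)) / stdev(null_qs)


# Hypothetical network: two pairs of nodes, each pair joined by one edge.
weights = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
q_split = modularity(weights, [0, 0, 1, 1])   # partition matching the pairs
q_single = modularity(weights, [0, 0, 0, 0])  # everything in one community

# Hypothetical Q values obtained from randomized data.
null_qs = [0.02, 0.04, 0.03, 0.05, 0.01]
z = z_against_null(q_split, null_qs)
```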

The community analysis thus estimates a number of profiles, with the number of profiles decided by the algorithm, that best separates participants into different 'profile communities' and assigns each participant to one community. We analyzed the resulting profiles in terms of mean factor scores for the twelve factors, differences in these scores across the on-track and delayed groups, and the proportion of participants across both groups assigned to a certain community.
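Summarizing the resulting communities amounts to averaging factor scores within each community and tabulating how participants are distributed over communities. The sketch below illustrates this with two factors and three participants; the function names and values are hypothetical and do not correspond to the study data.

```python
from collections import Counter, defaultdict


def community_means(scores, communities):
    """Mean factor scores per community.

    `scores` holds one list of factor scores per participant and
    `communities` the community label assigned to each participant.
    """
    grouped = defaultdict(list)
    for row, label in zip(scores, communities):
        grouped[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def community_proportions(communities):
    """Proportion of participants assigned to each community."""
    counts = Counter(communities)
    n = len(communities)
    return {label: count / n for label, count in counts.items()}


# Hypothetical factor scores (two factors) and community labels.
scores = [[1.0, -1.0], [0.5, -0.5], [-1.0, 1.0]]
labels = [1, 1, 2]
means = community_means(scores, labels)
proportions = community_proportions(labels)
```

Applying `community_proportions` separately to the labels of each group would give the per-group distribution over communities.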

The community analysis returned three profiles for both waves with a good modularity index (Q T1 = 0.42 and Q T2 = 0.45) (Newman, 2004, 2006). These Q values differed significantly from our estimated null distributions (z-values 10.05 and 16.10 for T1 and T2, respectively), indicating that all profiles are unique (i.e., they vary significantly from random). **Figure 3** shows the profiles, whereas **Table 4** shows the distribution of scores across the profiles. Profile #1 could be interpreted as a 'weak executive function and emotion regulation profile,' as shown by more problems with executive functions, greater susceptibility to peer influence, relatively normal levels of peer problems, low risk-taking, but more emotion regulation problems. This profile is somewhat more pronounced in the delayed group compared to the on-track group. Profile #2 seems to be characterized by high risk-taking behavior, in both the affective (hot) and non-affective (cold) setting, with a high need for arousal. Cold risk-taking in particular is at a higher level in the delayed group compared to the on-track group, while for hot risk-taking this effect is reversed. Profile #3 seems to be a relatively balanced profile, with the absence of problems with EF, influence by peers, and peer problems, high levels of prosocial behavior, a somewhat high need for arousal, and no problems with emotion regulation. At T2 all profiles look very similar, except that the high cold risk-taking of profile #2 is now less extreme.

# DISCUSSION

Although it is generally accepted that adolescence is characterized by high levels of neuropsychological heterogeneity, traditional statistical analyses fall short when taking into account the interrelation between different characteristics of interest and the relation between these profiles and (sub)group membership. Building on recent insights from research on neural networks (e.g., Kellermann et al., 2015) and social networks (e.g., Cappella et al., 2013), this study introduced network analysis as a statistical approach to better understand the interrelation between aspects of cognitive, social and emotional functioning in typically developing adolescents who are either on-track or delayed in their school career. In a next step, community analysis was used to assess neuropsychological heterogeneity in relation to educational performance. It was expected that this network approach could reveal differences between on-track and delayed students that would not be visible using traditional analytical approaches. Results indicate that traditional analysis of variance does not reveal significant differences between the on-track and the delayed group. In contrast, network analysis showed heightened centrality of executive functions in the delayed group and community analysis indicated increased levels of risk-taking and executive function problems to be important for adolescents delayed in their school career. This provides a first indication of the importance of examining heterogeneity and interrelations among different developmental domains in order to understand adolescent functioning and school retention.

First, the on-track and delayed group were compared using traditional analyses, namely repeated measures ANOVA. The on-track and delayed group were followed up for approximately 1 year to map their development in terms of the interrelation of cognitive, social and emotional functioning. Repeated measures ANOVAs showed that for each of the variables, namely executive functions, risk-taking (no affect), behavior regulation problems, social support, resistance to peer influence, prosocial behavior, peer problems, need for arousal, risk-taking (affective), emotional problems, and emotion regulation, the change from T1 to T2 did not differ significantly between the on-track and the delayed group.

Q, modularity strength. Higher index indicates better separable 'modules' or 'communities,' ranges from −0.5 to 1 with positive scores indicating above chance level separation in modules (Newman, 2004).

This is in contrast to previous studies that found a relationship between different cognitive, social and emotional skills and school outcomes such as grade retention (Jimmerson et al., 2000; Robbins et al., 2004; Jimmerson and Ferguson, 2007; Davoudzadeh et al., 2015; Fitzpatrick et al., 2015). For example, Jimmerson and Ferguson (2007) found grade retention to have a negative impact in the sense that children who were retained in the beginning of elementary school showed more aggressive behavior in adolescence. However, the latter study specifically focused on children experiencing delay early in their school career and the long-term consequences of this delay, while the current study did not specify the timing of the retention. It is possible that early retention has a more profound impact on children, that there are different causes for early and late grade retention, or that deficits will become more visible over a longer period of time in adolescents' development. Additionally, many studies examining cognitive, social and emotional functioning in relation to grade retention have been conducted in the United States, where school policies and policies on grade retention are likely different compared to the Netherlands. Whereas grade retention in the United States is often based on results on standardized tests (Jimmerson and Ferguson, 2007), the decision to retain a child or youth in the Netherlands is made by the teachers and based on a multitude of factors, such as grades and behavior. As the motives for grade retention are different, the population of delayed adolescents is also likely to differ, making the comparison of existing insights more difficult.

Nevertheless, subtle differences can exist between the two groups of adolescents which are not detected by these traditional analyses. Therefore, network analysis was used to examine the interrelation between the different variables of cognitive, social and emotional functioning for the on-track and the delayed group. Network analysis allows for the examination of the interrelation between different variables and shows which variables are very central (i.e., strongly related to many other variables) or peripheral (i.e., weakly related to the other variables). Networks of the on-track group suggest that executive functions, as measured by cognitive tasks, are peripheral in the network and thus play a limited role in the cognitive, social and emotional functioning of adolescents. Yet, difficulties with self-reported executive function behavior, as measured with a questionnaire, play a more central role in the network. In the delayed group, however, only self-reported executive function behavior stands out in the network, playing a central role, while all other variables, including executive function as measured with cognitive tasks, are equally important in the network. The comparison of both groups shows that at T1 the cognitive measures, teacher support, and risk-taking in affective situations are more central in the delayed adolescents compared to the on-track adolescents. Executive functions in particular (inhibition, shifting, and difficulties with organization of materials) play a more central role in the functioning of delayed adolescents compared to the on-track adolescents. At T2 these differences have diminished and only shifting is significantly more central in the network of delayed adolescents compared to on-track adolescents. In other words, while traditional analysis does not show any differences between the two groups, network analysis suggests that executive functions are more important in the functioning of delayed adolescents than they are for on-track adolescents.

The centrality of executive function behavior in the network indicates that this skill is strongly related to many other skills of adolescents' cognitive, social and emotional functioning. This is perhaps not surprising as executive functions are crucial for goal-directed behavior (Huizinga et al., 2006). Executive functioning has been related to a large number of outcomes in a variety of life domains including physical health, mental health, and school functioning (Best et al., 2011; Moffitt et al., 2011; Diamond, 2013). Adding to existing literature, the current study suggests that for adolescents delayed in their scholastic track, executive function behavior may be even more strongly related to a multitude of other domains and outcomes. Also, while the heightened importance of executive functions for delayed adolescents was clear at the first wave, this importance diminished to some extent by the second wave of data collection. It is possible that adolescents who are retained end up in a more appropriate age group, which can benefit the development of certain skills and lead to a more balanced network of cognitive, social and emotional skills.

Third, community analyses were conducted to assess neuropsychological heterogeneity within the two groups. Three unique profiles were found within both groups and at both waves of data collection. The first profile comprises adolescents with difficulties in emotion regulation, self-reported executive function behavior, and problems with resistance to peers, while showing low levels of risk-taking, and low levels of need for arousal. The second profile consists of adolescents with low levels of prosocial behavior, and high levels of risk-taking behavior and need for arousal. Finally, the third profile includes a well-adapted group of adolescents, where most variables center around the average. This group shows low levels of difficulties in executive function behavior, high levels of prosocial behavior, and average levels of resistance to peer influence. When comparing the on-track and delayed group, the well-adapted profile shows few differences. However, in the first profile delayed adolescents show a slightly higher level of difficulties in executive function behavior. In the second profile the delayed adolescents show even higher levels of risk-taking behavior in non-affective settings. These differences are visible at T1, but again have leveled off at T2. This suggests that while some delayed adolescents may show more unbalanced profiles, they eventually catch up with their on-track counterparts.

An important note is warranted on the distribution of the delayed and on-track adolescents across the different profiles. Results show that at T1 half of the delayed adolescents are classified in the balanced group (profile 3) and few are categorized in the high risk-taking group (profile 2). By the second wave of data collection these numbers reverse and more delayed adolescents are categorized in the risk-taking profile. For the on-track adolescents, the distribution across the three profiles is more equal. It would be expected that the delayed adolescents are more often categorized in the unbalanced profiles. One possible explanation is that the current study may not have included all skills relevant for grade retention, and therefore misses additional profiles of adolescents. Another explanation is that, as mentioned before, children and youth in the Netherlands are often retained to prevent them from moving to a lower academic track. Such a preventive approach may result in retaining some adolescents who have few deficits. The fact that more delayed adolescents move to the unbalanced profiles may indicate that this approach has a negative effect for these balanced adolescents.

The findings of these community analyses thus further nuance the results of the network analysis and indicate that while executive functioning is important in the development of the delayed group as a whole, there are individual differences within this group. Whereas for some delayed adolescents executive functioning (behavior) plays an important role in their development, risk-taking might be the central deficit for others. As the network analyses in this study point out the centrality of executive function behavior in adolescents' functioning, this may suggest that the group with difficulties in executive function behavior in particular is at risk for developing a multitude of problems. This has important implications for future research and clinical and educational practice. The results of the current study suggest that caution is warranted when average effects or group differences are examined. Such approaches may miss important associations and effects or even lead to wrong conclusions. An approach taking into account heterogeneity provides more detailed insights which can also be used to design more effective measures to reduce deficits. The current study for example indicates that while reducing deficits in executive functioning might be a useful intervention to reduce academic difficulties for some adolescents, this method will not be helpful for others.

#### Strengths and Limitations

A clear strength of the current study is the advanced analytical approach used to examine different domains of adolescents' functioning (e.g., executive functioning) in relation to grade retention. This provides us with new and more nuanced insights into these important topics in education. Additionally, community analyses can provide a powerful means of answering research questions, as they allow new types of research questions to be examined (e.g., about heterogeneity) and avoid problems generally linked to traditional analyses (e.g., multiple comparisons in ANOVA).

Naturally, the current study also has some limitations, which should be taken into account. The current study used a large sample of adolescents, but only 69 of them belonged to the delayed group. Non-parametric procedures with bootstrapping were used to allow community analyses to be performed. Nevertheless, sampling bias occurs more easily in smaller samples, which means that the results of the current study might be biased due to an overrepresentation of certain (unknown) characteristics. Future studies should attempt to examine similar processes either in larger samples of adolescents who experienced grade retention or with a careful recruitment procedure that ensures that the sample is representative of the population with regard to key characteristics. Larger groups might also make it possible to distinguish rarer profiles that could not be found in the current sample. Additionally, it would be informative to examine heterogeneity within delayed and on-track groups, and interrelations between domains of functioning, across a longer period of time. This could provide insights into developmental trajectories of different subgroups of adolescents.

# CONCLUSION

Traditional analyses often examine different aspects of adolescent cognitive, social and emotional functioning in isolation and rarely take into account the heterogeneity within groups of adolescents. The current study shows that such an approach may miss important insights. In the current study, no differences were found between on-track and delayed adolescents using traditional analysis of variance, while network analysis highlighted the importance of executive function behavior. Additionally, community analysis suggested executive function problems and risk-taking behavior to be important deficits for different subgroups of delayed adolescents. Network and community analyses can thus provide more nuanced insights into the underlying factors of specific difficulties, such as difficulties in educational progression. Such nuanced insights can guide more effective preventive and supportive measures for educational difficulties.

# AUTHOR CONTRIBUTIONS

LV wrote the Introduction and Discussion of the manuscript, contributed substantially to the interpretation of the analyses, and reviewed the manuscript. WW contributed substantially to the design of the work, conducted and interpreted the network and community analyses, contributed substantially to the writing (Results section) and reviewing of the manuscript. NL contributed substantially to the design of the work, the interpretation of the analyses and the writing (Introduction) and reviewing of the manuscript. DB contributed substantially to the interpretation of the analyses and the reviewing of the manuscript. JW programmed several of the instruments used and reviewed the manuscript. BF designed several of the instruments used and reviewed the manuscript. MH is guarantor and designer of the study, contributed substantially to the interpretation of the analyses and the writing (Method section) and reviewing of the manuscript.


# FUNDING

This work was supported by a grant from the National Initiative Brain and Cognition awarded to M. Huizinga (NIHC 056-34-016, MH).

# ACKNOWLEDGMENTS

The authors gratefully acknowledge the contribution of participants, their parents and their schools. The authors would like to thank Dr. Lourens Waldorp for his valuable input on the network analyses. The authors would also like to thank Lisa van der Heijden, Daan van Es, and Juri Peters for help with gathering the data.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Vandenbroucke, Weeda, Lee, Baeyens, Westfall, Figner and Huizinga. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Intraindividual Variability in Executive Function Performance in Healthy Adults: Cross-Sectional Analysis of the NAB Executive Functions Module

#### Dorota Buczylowska\* and Franz Petermann

Center of Clinical Psychology and Rehabilitation, University of Bremen, Bremen, Germany

The current study was aimed at investigating across-tasks intraindividual variability, also termed dispersion, in EF performance. The German adaptation of the Neuropsychological Assessment Battery (NAB) was used as a measure of EFs. Data of 444 participants aged 18–99 from six NAB Executive Functions Module subtests (i.e., Planning, Mazes, Letter Fluency, Judgment, Categories, and Word Generation) along with the NAB Total Index score as a measure of overall cognitive ability were analyzed. Maximum discrepancy (MD) was applied as a measure of dispersion. MD values ranged from 0.47 to 5.20, indicating substantial across-tasks dispersion in EF performance. Furthermore, dispersion moderately decreased with advancing age. Taking overall cognitive ability into account revealed that dispersion might be lower at older ages, especially when associated with low overall ability levels. The dedifferentiation hypothesis offers a plausible explanation for these findings. That is, the cognitive profiles of older people might be less heterogeneous than those of younger people, which may be due to age-related central nervous system constraints.

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Lucia Serenella De Federicis, Centro Clinico Caleidos, Italy

Marina Avila Villanueva, Fundación CIEN, Spain

\*Correspondence:

Dorota Buczylowska buczylowska@uni-bremen.de

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 08 January 2018 Accepted: 27 February 2018 Published: 14 March 2018

#### Citation:

Buczylowska D and Petermann F (2018) Intraindividual Variability in Executive Function Performance in Healthy Adults: Cross-Sectional Analysis of the NAB Executive Functions Module. Front. Psychol. 9:329. doi: 10.3389/fpsyg.2018.00329

Keywords: executive functions, intraindividual variability, dispersion, cognitive aging, NAB

# INTRODUCTION

There is a substantial degree of variability in the literature regarding the conceptualization and operationalization of executive functions (EFs). A recently conducted review of contemporary empirical studies, however, revealed some points of convergence among researchers in respect to the definition of EFs (Baggetta and Alexander, 2016). According to this review, the majority of researchers regard EFs as a set of cognitive processes responsible for guiding and monitoring action and behaviors crucial to learning and everyday human performance tasks. Furthermore, executive functioning is considered a multidimensional construct comprising several cognitive processes rather than a single ability. Nevertheless, no consensus among researchers has been reached in respect to which cognitive processes comprise this multidimensional construct (Maricle and Avirett, 2012; Flanagan et al., 2014). According to Baggetta and Alexander (2016), researchers identified 39 different components as EFs. Inhibitory control, working memory (WM), and shifting or cognitive flexibility were among the components or labels most frequently encountered in the literature. Planning and attention were other terms often used by researchers in relation to executive functioning. There appears to be even less consensus regarding the operationalization of the EF construct. Within the 106 studies reviewed by Baggetta and Alexander (2016), 109 different tasks and 11 different batteries were used as measures of EFs. Moreover, the same tasks were used in different studies to measure different components, or single tasks were used to measure multiple components. An important conclusion of the review is that researchers should work toward a unifying definition and develop new assessment tools to measure EFs.

When exploring the nature of EFs, developmental patterns are to be taken into account. In particular, evidence exists on age-related individual differences in EFs across the entire life span (De Luca et al., 2003; Romine and Reynolds, 2005; De Luca and Leventer, 2008; Reynolds and Horton, 2008; Buczylowska and Petermann, 2016). Particularly interesting are the findings supporting the differentiation-dedifferentiation hypothesis, which postulates that the extent to which cognitive abilities are interrelated changes throughout the life span. That is, EFs in children are unidimensional, but develop through adolescence into early adulthood into a multidimensional construct (Wiebe et al., 2011; Brydges et al., 2012); with advancing age and reaching late adulthood, however, EFs become more unidimensional again (Buczylowska et al., 2016; Buczylowska and Petermann, 2017).

The existing evidence on developmental patterns in EFs has been gained using both indices of central tendency and measures of interindividual variability. Studying age-related differences between individuals is particularly important as the magnitude of heterogeneity in EFs may help better understand why some individuals display changes in EFs, whereas others do not (Buczylowska and Petermann, 2016). Nevertheless, it has been acknowledged that exploring within-person differences in cognitive performance, also termed intraindividual variability (IIV), may also provide important information in respect to developmental changes associated with normal and pathological aging.

It should be noted that the term IIV refers either to differences within individuals across different tasks on a single occasion (dispersion) or to differences within individuals on one task across multiple occasions (inconsistency; Hultsch et al., 2002; Hilborn et al., 2009). Studies have shown that both dispersion and inconsistency in cognitive performance might increase with advancing age (Christensen et al., 1999; Hultsch et al., 2002; Schretlen et al., 2003; Williams et al., 2005; Hilborn et al., 2009; Vandermorris et al., 2013; Ferreira et al., 2017). Nevertheless, contrary results have also been reported (Lindenberger and Baltes, 1997; Rapp et al., 2005; Mella et al., 2016). It has been argued that increases in IIV might be observed in younger and healthy adults, whereas decreases in IIV might follow very late in life as a result of age-related cognitive decline (Hultsch et al., 2002). However, it is normal aging that seems to be associated with decreases in IIV in late adulthood. Cerebral dysfunctions, in their initial stages especially, may present specific deficits in one cognitive domain, resulting in an increase of cross-domain variability (Rapp et al., 2005). This is consistent with the findings demonstrating links between increased IIV and lower cognitive functioning (Christensen et al., 1999; Rapp et al., 2005; Hilborn et al., 2009) as well as higher risk for dementia (Hultsch et al., 2000; Holtzer et al., 2008).

Advances in the study of IIV might have practical implications for clinical neuropsychology (Hilborn et al., 2009; Vandermorris and Tan, 2015). In particular, based on prior research, the characteristics of psychometric assessment must be reconsidered. As an example, fluctuations in test performance across different occasions may call into question the validity and reliability of the measures used. IIV in performance across different tasks on a single occasion also presents a challenge in respect to clinical decisions; that is, neuropsychological evaluations are based on comparisons of an individual with a norm of a similar developmental cohort. Consequently, if there is substantial dispersion within individuals in respect to test scores, a better understanding of developmental profiles and their potential associations with normal and pathological aging is required (Vandermorris and Tan, 2015).

Additional research is necessary in respect to the impact of demographic characteristics such as sex and education as well as health and cognitive status. There appear to be no differences in IIV according to gender (Hultsch et al., 2002; Mella et al., 2016). Conflicting findings exist in respect to the relationship between IIV and educational attainment as well as overall cognitive ability. Hilborn et al. (2009) demonstrated that higher levels of education and overall cognitive ability may be associated with lower levels of IIV. Christensen et al. (1999), Hultsch et al. (2002), and Schretlen et al. (2003) also showed that IIV might decrease with higher ability levels, whereas in the study by Lindenberger and Baltes (1997) the opposite was evident. Such discrepancies in results might be due to methodological issues, as both educational attainment and overall cognitive functioning are operationalized differently across studies. As an example, to assess overall cognitive ability some researchers use intelligence tests (Lindenberger and Baltes, 1997; Schretlen et al., 2003), whereas others use composite scores of several cognitive measures (Hultsch et al., 2002). Education has also been used as a proxy to assess intelligence and overall cognitive ability (Christensen et al., 1999).

In the context of aging and age-related intraindividual differences, exploring EFs seems particularly important. The prefrontal cortex (PFC) is considered the brain region most disrupted by healthy aging (Bryan and Luszcz, 2000; De Luca et al., 2003; Cabeza and Dennis, 2013); therefore, EFs have been proposed as potential mediators of age-related cognitive decline (Levine et al., 1997; Parkin, 1997; Salthouse et al., 2003; Troyer et al., 2007). Indeed, findings support the notion that changes in cognition with advancing age reflect age-related decline in frontal lobe functioning. For example, evidence exists in respect to word-list-learning performance showing a greater incidence of and increased inconsistency in false memory in healthy older adults as compared to younger adults (Murphy et al., 2007). Furthermore, patients with frontal lobe lesions showed greater dispersion and inconsistency in respect to reaction time (RT) task performance relative to patients with non-frontal lesions (Stuss et al., 2003). Additionally, increased IIV in RT and episodic memory performance among healthy individuals has been associated with smaller prefrontal white matter volumes (Lövdén et al., 2013) and frontal white-matter hyperintensities (Bunce et al., 2007). On the other hand, studies investigating IIV in EFs that use standardized neuropsychological tests are scarce. In most cases, IIV has been examined using EF tasks in combination with other neuropsychological measures. In particular, verbal fluency and cognitive flexibility measures have been applied (Baltes and Lindenberger, 1997; Lindenberger and Baltes, 1997; Schretlen et al., 2003; Rapp et al., 2005). As findings derived from these studies refer more to cross-domain variability, additional research focusing on within-domain variability in EFs is required.

The current study was aimed at exploring IIV in EF performance in healthy adults across a wide age range. The focus was on dispersion as this is the aspect of IIV that is not well studied, either in respect to cognition in general or in respect to EFs. The main goal was to clarify whether there are age-related changes in across-tasks variability in EF performance by examining cross-sectional associations between dispersion and age. As most of the studies conducted comprised older adults, the current research was aimed at exploring dispersion in young participants as well. Furthermore, the goal was to compare different stages of adulthood with regard to the magnitude of dispersion. In addition to age, the impact of educational attainment and sex was investigated. In line with the previous research (Lindenberger and Baltes, 1997; Hilborn et al., 2009; Heyanka et al., 2013), the current study examined differences in dispersion according to overall cognitive performance. Based on previous findings regarding dispersion in healthy adults (Lindenberger and Baltes, 1997; Rapp et al., 2005) as well as findings showing age-related increases in the intercorrelations among EFs (Buczylowska and Petermann, 2017), it was hypothesized that across-tasks variability in EF performance would decrease in old age. In line with the previous research (Christensen et al., 1999; Mella et al., 2016), sex was not expected to exert an impact on the dispersion level. Due to conflicting prior findings, no hypotheses were formulated with regard to the impact of educational attainment and overall cognitive ability on dispersion.

# METHOD

## Sample Characteristics

Participants were 444 adults (205 males and 239 females) aged 18–99 (M = 59.74; SD = 20.35), recruited for the purpose of norming the German adaptation of the Neuropsychological Assessment Battery (NAB; Petermann et al., 2016b). Data were collected on four different sites in Germany, including the north, south, west, and east parts of the country, between February 2014 and February 2015.

To better understand the changes in IIV across the life span, the sample was subdivided into four age groups. The young age group (N = 98, 48 men, 50 women) ranged from 18 to 39 years (M = 28.69, SD = 6.29). The middle age group (N = 89, 39 men, 50 women) ranged from 40 to 59 years (M = 51.24, SD = 5.28). The middle-old age group (N = 134, 65 men, 69 women) ranged from 60 to 74 years (M = 67.42, SD = 4.26). The old age group (N = 123, 53 men, 70 women) ranged from 75 to 99 years (M = 82.27, SD = 5.35). Sample characteristics in respect to education are presented in **Table 1**.

TABLE 1 | Demographic characteristics of the sample.

<sup>a</sup>8–9 years of mandatory school.

<sup>b</sup>10 years of advanced school.

<sup>c</sup>A-level equivalent after regular 13 years of school.

Potential participants with known cardiovascular, neurological, or psychiatric conditions were excluded from the sample. Written informed consent was obtained from all participants. Only participants who completed Form 1 of the NAB on the first occasion were included in the study.

## Assessment Tools

#### NAB Executive Functions Module

The NAB is a battery of neuropsychological tests designed for the assessment of cognitive functions in adults with disorders of the central nervous system (White and Stern, 2003). The NAB is composed of one screening module and five domain-specific modules (i.e., Attention, Language, Memory, Spatial, and Executive Functions). Performance on the five domain-specific modules results in the NAB Total Index, which is the standard score for overall cognitive functioning.

Due to standardization procedures, in which all tests are normed on a single standardization sample, the NAB Executive Functions Module offers a set of conormed tasks, suitable for the assessment of various aspects of executive functioning. Additionally, the Executive Functions Index (EFI) is available as a measure of overall performance.

Within the German adaptation of the NAB, the four original subtests of the Executive Functions Module were translated into German and adapted to standard conditions in German-speaking countries (Buczylowska et al., 2013). In the Executive Functions Module of the German NAB adaptation two additional subtests are included: Planning (German Planen) and Letter Fluency (German Wortflüssigkeit); however, only Letter Fluency offers a standard score and contributes to the EFI and the NAB Total Index. Planning is an adopted task based on the "Bogenhausener Planungstest" (von Cramon, 1988; von Cramon et al., 1991), an experimental measure designed to assess complex planning skills in the context of daily living. Letter Fluency is a task designed by the authors of the German NAB adaptation; it is based on the concept of verbal fluency (Strauss et al., 2006; Lezak et al., 2012). A detailed description of the German NAB Executive Functions Module is presented in **Table 2**.

For the German NAB Executive Functions Module, the following reliability coefficients are reported: internal consistency reliability, α = 0.82; test-retest reliability for younger age ranges (18–69 years old), r = 0.86, and for older age ranges (70–>85), r = 0.85 (Petermann et al., 2016a).


# ANALYSES

Statistical analyses were performed using Microsoft Office Excel 2007 and IBM SPSS Statistics 24. First, raw data from the Executive Functions Module subtests (i.e., Planning, Mazes, Letter Fluency, Judgment, Categories, and Word Generation) were z-transformed. Maximum discrepancy (MD; Schretlen et al., 2003; Heyanka et al., 2013) was employed as a measure of dispersion; that is, the difference between the highest and lowest scores for each person across the six subtests was calculated. Greater MD scores imply a relatively uneven performance profile, whereas smaller MD scores indicate a flatter, more consistent cognitive profile. To determine the relationship between dispersion and age, a Pearson correlation coefficient between the MD values and age in years was calculated.
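The MD computation described above can be sketched in a few lines. This is only an illustrative sketch, not the Excel/SPSS procedure actually used: the function names, the toy scores, and the choice of the sample standard deviation (n − 1) for the z-transformation are assumptions made for the example.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize one subtest across participants (assumes sample SD, n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def maximum_discrepancy(raw_by_subtest):
    """Return one MD value per participant: highest minus lowest z-score."""
    z_columns = [z_scores(scores) for scores in raw_by_subtest.values()]
    # zip(*z_columns) regroups the subtest columns into one profile per participant.
    return [max(profile) - min(profile) for profile in zip(*z_columns)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical raw scores for four participants on three subtests (not study data).
toy_scores = {
    "Mazes": [1, 2, 3, 4],
    "Categories": [10, 20, 30, 40],
    "Judgment": [4, 3, 2, 1],
}
toy_ages = [25, 45, 65, 85]

md_values = maximum_discrepancy(toy_scores)
r_md_age = pearson_r(md_values, toy_ages)
```

In this toy data the first and last participants have the most uneven profiles (one very high and one very low z-score), so their MD values are the largest.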

As the MD values might have been affected by age, the six subtest scores were regressed on age. The resulting residuals were saved as z-transformed scores and used to recompute the MD values and their correlation with age. The z-transformed scores were also used for further analyses.
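The age adjustment can be illustrated as follows. The sketch assumes that each subtest was regressed on age separately with a simple linear (ordinary least squares) model, which the text does not state explicitly; the helper names and toy data are hypothetical.

```python
from statistics import mean, stdev


def residualize(scores, ages):
    """Residuals from a simple linear regression of scores on age (assumed OLS)."""
    mean_age, mean_score = mean(ages), mean(scores)
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (s - mean_score) for a, s in zip(ages, scores))
    slope = sxy / sxx
    intercept = mean_score - slope * mean_age
    return [s - (intercept + slope * a) for a, s in zip(ages, scores)]


def standardize(values):
    """z-transform using the sample SD (n - 1); an assumption of this sketch."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def age_adjusted_md(raw_by_subtest, ages):
    """MD per participant, computed on standardized age-residualized scores."""
    z_columns = [
        standardize(residualize(scores, ages))
        for scores in raw_by_subtest.values()
    ]
    return [max(profile) - min(profile) for profile in zip(*z_columns)]


# Hypothetical data for six participants (not study data).
toy_ages = [20, 35, 50, 65, 80, 95]
toy_scores = {
    "Mazes": [30, 27, 25, 18, 15, 9],
    "Letter Fluency": [40, 44, 39, 41, 36, 38],
    "Planning": [12, 14, 9, 11, 6, 7],
}

mazes_residuals = residualize(toy_scores["Mazes"], toy_ages)
adjusted_md = age_adjusted_md(toy_scores, toy_ages)
```

By construction, the residuals of each subtest are uncorrelated with age, so any remaining association between the recomputed MD values and age cannot be attributed to linear age trends in the individual subtest means.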

To estimate the relationships among the NAB subtests, intercorrelations were calculated for the whole sample and for the four age groups separately. In the next step, a univariate analysis of variance (ANOVA) was applied to investigate the difference in MD between the four age groups. At the same time, the factor effects of education and sex were analyzed.
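To illustrate the logic of the group comparison, the following sketch computes a one-way ANOVA F statistic and η² for a single factor. It is a deliberate simplification: the study used a three-factor design (age, sex, education) in SPSS, and the function name and toy values below are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared


# Hypothetical MD-like values for three groups (not study data).
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within, eta_squared = one_way_anova(toy_groups)
```

In a one-factor design, η² is the proportion of total variance attributable to group membership; in a multi-factor design such as the one reported here, the effect size for a single factor is computed differently.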

The focus of attention was also on the relationship between dispersion and overall cognitive ability. As a measure of cognitive ability, the NAB Total Index score was used. A Pearson correlation between the MD values and the NAB Total Index scores was calculated. Based on the performance on the NAB Total Index, the sample was subdivided into four groups (i.e., more than 1 SD above average, 0–1 SD above average, 0–1 SD below average, and more than 1 SD below average). Correlations between the MD values and age were computed for the four ability groups.
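The grouping step can be sketched as below. The scaling of the index (mean 100, SD 15), the handling of scores that fall exactly on a boundary, the group labels, and the toy data are all assumptions made for illustration; the text does not specify them.

```python
from statistics import mean


def ability_group(total_index, norm_mean=100.0, norm_sd=15.0):
    """Assign one of four ability bands; the norm values are assumed defaults."""
    z = (total_index - norm_mean) / norm_sd
    if z > 1:
        return "> +1 SD"
    if z >= 0:
        return "0 to +1 SD"
    if z >= -1:
        return "-1 to 0 SD"
    return "< -1 SD"


def pearson_r(x, y):
    """Pearson correlation; returns None if it cannot be computed."""
    if len(x) < 3:
        return None
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


def md_age_correlation_by_group(md_values, ages, total_index_scores):
    """Correlate MD with age separately within each ability band."""
    pairs_by_group = {}
    for md, age, index in zip(md_values, ages, total_index_scores):
        pairs_by_group.setdefault(ability_group(index), []).append((md, age))
    return {
        label: pearson_r([m for m, _ in pairs], [a for _, a in pairs])
        for label, pairs in pairs_by_group.items()
    }


# Hypothetical participants (not study data).
toy_md = [2.5, 2.0, 1.5, 2.2, 2.1, 2.3]
toy_ages = [30, 60, 90, 30, 60, 90]
toy_index = [80, 78, 75, 120, 125, 118]

r_by_group = md_age_correlation_by_group(toy_md, toy_ages, toy_index)
```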

# RESULTS

Descriptive statistics of the NAB Executive Functions Module for the four age groups and the whole sample are presented in **Table 3**. MD values based on z-transformed raw scores ranged from 0.31 to 4.97 SDs (M = 2.01, SD = 0.69). Of the participants, 8% produced MD values greater than 3 SDs, 40% produced MD values greater than 2 SDs, and 47% produced MD values greater than 1 SD. Only 6% of participants produced MD values up to 1 SD. There was a negative correlation between MD and age (r = −0.21, p < 0.01), implying that dispersion in EFs might moderately decrease with advancing age.
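The percentages above appear to describe mutually exclusive bands (they sum to roughly 100%). Under that reading, the tabulation could be sketched as follows; the band labels, function name, and toy MD values are hypothetical.

```python
def md_band_shares(md_values):
    """Percentage of participants in each mutually exclusive MD band."""
    counts = {"<= 1 SD": 0, "> 1 to 2 SD": 0, "> 2 to 3 SD": 0, "> 3 SD": 0}
    for md in md_values:
        if md <= 1:
            counts["<= 1 SD"] += 1
        elif md <= 2:
            counts["> 1 to 2 SD"] += 1
        elif md <= 3:
            counts["> 2 to 3 SD"] += 1
        else:
            counts["> 3 SD"] += 1
    total = len(md_values)
    return {band: 100 * n / total for band, n in counts.items()}


# Hypothetical MD values for five participants (not study data).
toy_md = [0.5, 1.2, 1.8, 2.5, 3.4]
shares = md_band_shares(toy_md)
```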

MD values based on z-transformed residuals ranged from 0.47 to 5.20. That is, age might have slightly inflated the MD values. This is also reflected in the distribution of the MD values in the sample: 15% of participants produced MD values greater than 3 SDs, 43% produced MD values greater than 2 SDs, and 39% produced MD values greater than 1 SD. Only 4% of participants produced MD values up to 1 SD. Furthermore, 53% of the participants scored more than 1 SD below average on one or more subtests, and 8% scored more than 2 SDs below average on one or more subtests. The test scores of participants whose MD values exceeded 3 SDs were reviewed to determine which subtests most frequently contributed to the highest dispersion levels. Among the subtests yielding the lowest or highest score in the 65 participants with MD values above 3 SDs, Mazes was involved 29 times, Letter Fluency 28 times, Planning 20 times, Judgment and Categories 19 times each, and Word Generation 10 times. However, Planning contributed most frequently to MD values as the lowest score (26%), followed by Mazes and Judgment (20%), Letter Fluency (15%), Categories (11%), and Word Generation (8%). The analysis showed that in 49 individuals (75%) the lowest score was more than 1 SD below average.

TABLE 3 | Descriptive statistics for the NAB Executive Functions Module based on raw scores.

Data represent means ± SD.

The intercorrelations among the six subtests (see **Table 4**) ranged in the sample from r = 0.13 to r = 0.42. The mean intercorrelation was r = 0.25. Based on the intercorrelations, the shared variance between the subtests ranged from 0.2% to 13% in the 18–39 age group, from 0.01% to 13% in the 40–59 age group, from 0.8% to 14% in the 60–74 age group and from 0.7% to 29% in the 75–99 age group. The correlation between age in years and MD values based on z-transformed residuals (r = −0.14, p < 0.01) was still significant. A 4 (age) × 2 (sex) × 3 (education level) ANOVA revealed a significant effect of age, [F(3, 442) = 3.43, p = 0.017, η² = 0.02] for MD values. The factor effects of gender and education level were not significant. Bonferroni-type post-hoc analyses revealed significant differences in MD values (p < 0.05) between the old age group (M = 1.80) and any other group. That is, the 75–99-year-olds showed less dispersion than the participants of younger age groups. There were no significant differences between the young age group (M = 2.19), the middle age group (M = 2.08), and the middle-old age group (M = 2.03).
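The shared-variance percentages reported in this paragraph follow from squaring the correlation coefficient. As a worked check using only values given in the text (the range and mean of the whole-sample intercorrelations, and the upper bounds of shared variance in the youngest and oldest groups), and assuming shared variance is simply the squared Pearson correlation:

$$
\text{shared variance} = r^{2} \times 100\%
$$

$$
\begin{aligned}
r = 0.13 &\;\Rightarrow\; r^{2} \approx 0.017 \quad (1.7\%) \\
r = 0.42 &\;\Rightarrow\; r^{2} \approx 0.176 \quad (17.6\%) \\
r = 0.25 &\;\Rightarrow\; r^{2} = 0.0625 \quad (6.25\%) \\
13\% &\;\Rightarrow\; |r| = \sqrt{0.13} \approx 0.36 \\
29\% &\;\Rightarrow\; |r| = \sqrt{0.29} \approx 0.54
\end{aligned}
$$

Read this way, the strongest pairwise association among subtests rises from roughly 0.36 in the 18–39 age group to roughly 0.54 in the 75–99 age group.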

A small but significant correlation between the MD values and Total NAB Index scores (r = 0.10, p < 0.05) indicates that overall ability level should also be taken into account. Subdividing the Total NAB Index into four different ability levels (i.e., more than 1 SD above average, 0–1 SD above average, 0–1 SD below average, and more than 1 SD below average) provided more insight into the relationship between IIV and age. That is, the correlation between MD values and age varied according to ability level, ranging from r = −0.06 for the highest ability group (> 1 SD) and r = −0.05 for the high average group (+1 SD) to r = −0.25 for the low average group (−1 SD) and r = −0.30 for the lowest ability group (< −1 SD). This implies that IIV might be lower at older ages, particularly when associated with low overall ability levels.

# DISCUSSION

The current study demonstrates considerable dispersion in respect to performance on six NAB EF tasks. This is based on the observation that 96% of the participants produced MD values exceeding at least 1 SD. These findings are in line with those derived from previous research on dispersion in respect to neuropsychological test performance (Schretlen et al., 2003; Heyanka et al., 2013). 52% of the participants produced MD values exceeding at least 2 SDs. Such discrepancies between test scores must be interpreted in the context of neuropsychological evaluations. Given that presumably healthy people were assessed within the current research, discrepancies between test scores in the normal population might be more common than expected. Thus, in the clinical context especially, more attention should be paid to the distribution of test scores in relation to the MD values found in the population. Furthermore, in the present study, 53% of the participants exhibited in one or more subtests a score below 1 SD and 8% of the participants exhibited in one or more subtests a score below 2 SDs below average. As scores exceeding 1 SD below average are considered an indication of cognitive impairment, lower cutoffs to identify cognitive problems might be more appropriate for clinical use (Brooks et al., 2009).

TABLE 4 | Intercorrelations between the NAB Executive Functions Module subtests.

p < 0.01; two-tailed probability.

The current results imply that dispersion in EFs might moderately decrease with advancing age. This is in contrast to previous research since the majority of studies showed an age-related increase in dispersion (Christensen et al., 1999; Hultsch et al., 2002; Schretlen et al., 2003; Stuss et al., 2003; Hilborn et al., 2009; Ferreira et al., 2017). As already suggested by other researchers, inconsistent findings might be due to the age range investigated (Hultsch et al., 2002; Hilborn et al., 2009). If a wide age range is examined, there might be increases in dispersion in young and healthy adults followed by decreases in dispersion in very old adults, as cognitive deterioration is more likely later in life (Hultsch et al., 2002). Additionally, age-related disturbances might often occur within more than one cognitive domain (de Frias et al., 2007); as a result, more consistent cognitive profiles and less task dispersion could frequently be found in very old adults. However, this pertains to deterioration patterns associated with normal aging. As suggested by Rapp et al. (2005), this might be in contrast to deterioration patterns induced by cerebral dysfunctions, which may be associated with specific deficits in only one domain of cognitive functioning. Consequently, increases in across-tasks variability are more likely in adults affected by cerebral dysfunctions than in those with age-appropriate cognitive profiles. This is important since the current study is based on a test norming sample drawn using strict exclusion criteria. That is, potential participants with known cardiovascular, neurological, or psychiatric conditions were excluded from participation in the study. Additionally, selective mortality is likely to restrict the range of interindividual variability at the lower end of the ability spectrum (Lindenberger and Baltes, 1997).
As a result, individuals reaching very old age might be less likely to have clinically relevant diagnoses, and thus, they might also be cognitively healthier. Moreover, in contrast to prior research often using convenience samples, the current study is based on a representative sample of healthy adults. Hence, the current findings might more accurately reflect dispersion trends in the normal population.

The current findings are consistent with three previous studies (Lindenberger and Baltes, 1997; Rapp et al., 2005; Mella et al., 2016). Rapp et al. (2005) reported higher levels of dispersion in nursing home residents than in community-dwelling older adults. In adults living in the community, dispersion decreased with advancing age. Rapp et al. (2005) argued that this may reflect the absence of cerebral dysfunction. Lindenberger and Baltes (1997) examined IIV in intellectual abilities at old and very old ages (i.e., 70–102 years) and found decreases in dispersion with age of similar magnitude (i.e., r = −0.19) as in the current research. In the current study, a significant difference in dispersion was observed between the oldest group (i.e., 75–99 years) and all three younger groups of the 18–74 age range. Interestingly, no significant differences in dispersion were evident between the younger age groups; thus, it is likely that the shift in the magnitude of dispersion happens relatively late in life. As suggested by other researchers (Lindenberger and Baltes, 1997; Rapp et al., 2005; Mella et al., 2016), decreases in dispersion might reflect dedifferentiation processes, which are primarily dominated by aging-induced changes in brain integrity. The dedifferentiation in cognition that occurs at old age is characterized by the linear decrement of cognitive performance and greater magnitude of intercorrelations between cognitive abilities than that observed at younger ages. The phenomenon of greater uniformity between cognitive abilities in old age has already been demonstrated using the example of increasing correlations between EFs and intelligence (Buczylowska and Petermann, 2017). Age-related increases in correlations among different cognitive abilities were reported elsewhere, too (Baltes and Lindenberger, 1997; Deary et al., 2004; de Frias et al., 2007).
In the present study, the shared variance between the subtests increased from 0.2–13% in the 18–39 age group to 0.7–29% in the 75–99 age group. Additionally, as already demonstrated in a previous study (Buczylowska and Petermann, 2016), performance on the NAB Executive Functions Module subtests is marked by decrement in all test scores across age. Planning, Mazes and Categories were the subtests with the highest decrease in mean scores and the highest increase in interindividual variability, whereas Letter Fluency, Judgment and Word Generation were the subtests with the lowest decrease in mean scores and the lowest increase in interindividual variability. If there are differences between tasks in deterioration patterns, there might also be differences in intraindividual variability. However, differences between individuals are not necessarily associated with differences within individuals, as individuals who differ from each other in performance might each show consistent performance across tasks. Nevertheless, it is meaningful to analyze how the individual NAB subtests contribute to across-tasks dispersion and whether this is associated with the deterioration pattern identified in the previous research. In the 65 participants whose MD values exceeded 3 SDs, Mazes and Letter Fluency were the subtests most frequently involved as the highest or lowest score, followed by Planning, Judgment and Categories, and Word Generation. However, Planning, Mazes, and Judgment contributed most frequently to MD values as the lowest score. As identified in the previous research, Planning and Mazes were also the subtests with the highest increase in interindividual variability. On the one hand, high dispersion levels might be partly due to tasks more frequently resulting in below-average scores. On the other hand, the various characteristics of the individual tasks might be responsible for uneven performance profiles, too.
Based on the various underlying skills, the NAB subtests can be classified in different ways. Planning and Mazes are paper-and-pencil, time-restricted tasks. In particular, Mazes involves visuo-spatial skills and speed. This is in contrast to the other NAB subtests, in particular the highly language-related Letter Fluency and Word Generation. Consequently, the different skills involved must be taken into account when considering across-tasks dispersion levels.

Inconsistent findings with regard to IIV in cognitive performance might also be due to the various domains investigated and assessment tools used. In the previous research, RT tasks and various neuropsychological measures were frequently used. In contrast, the current study focused solely on the assessment of EFs and used a set of conormed tasks. Mella et al. (2016) examined across-tasks dispersion in children, young adults, and older adults separately for RT and WM tasks. Dispersion in RT tasks was characterized by a U-shaped curve with heterogeneous performance both in children and older adults. With regard to WM tasks, on the other hand, young adults displayed greater dispersion than children and older adults. Thus, as suggested by Mella et al. (2016), dispersion across RT tasks and dispersion across WM tasks might be driven by different processes. Moreover, age-related cognitive impairments may be more likely to occur in more than one aspect of a cognitive domain. Consequently, there might be less task dispersion in adults with age-related executive dysfunctions.

In line with the expectations, there was no effect of sex on the level of dispersion. As demonstrated by previous studies (Lindenberger and Baltes, 1997; Christensen et al., 1999; Hultsch et al., 2002; Schretlen et al., 2003; Hilborn et al., 2009), educational attainment and the level of overall ability might be linked to IIV in cognitive performance. Contrary to the previous results, dispersion in the current study was independent of educational attainment. It must be noted that years of schooling may not be an appropriate measure of educational attainment. Due to cohort effects, there might be discrepancies between participants of different ages with the same number of years of schooling. Moreover, educational attainment is likely to be related to the level of overall cognitive ability. The current research implies that dispersion may be positively related to overall cognitive ability. Moreover, taking overall ability into account might also provide more insight into the relationship pattern between age and dispersion; that is, the negative relationship between age and dispersion appears to be stronger in adults with low overall ability levels. Lindenberger and Baltes (1997) obtained similar findings by demonstrating a negative relationship between across-tasks dispersion and age in low ability participants. They argued that very old, low ability participants have the most dedifferentiated pattern of cognitive performance as they are likely to perform uniformly low across all tasks. The flattening of the cognitive profile in low ability adults may be explained by central nervous system constraints associated with very old age. The dedifferentiation hypothesis appears to be a plausible explanation for the findings from the current study, too.
Decreases in dispersion with advancing age and lower overall cognitive functioning as well as increasing intercorrelations between tasks demonstrate that dedifferentiation processes may apply to healthy, community-dwelling older adults.

# LIMITATIONS

There are several potential limitations to the current study. First, because of the cross-sectional study design, the observed differences in dispersion do not reflect developmental changes but differences between age groups. Cohort effects may apply to educational attainment, health, lifestyle, and overall cognitive ability. Longitudinal studies are considered more informative with respect to cognitive changes over time. Longitudinal studies are especially preferable for studying changes in dispersion, as they are based on within-person comparisons between different occasions.

Second, the decision to divide the sample into four age groups might have influenced the study results. That is, when examining the adult life span, comparing dispersion levels between different age groups might be affected by the age range of the individual age groups. In particular, in larger age groups comprised of individuals with different levels of overall cognitive ability, only general conclusions regarding age-related differences in dispersion are allowed. Thus, when using a cross-sectional study design, studies with several small age groups are recommended.

Third, the NAB Executive Functions Module scores were used as a measure of EFs and also, as a part of the Total NAB Index, contributed to the measure of overall ability. Consequently, the Total NAB Index cannot be considered an independent measure of overall ability level. That is, when exploring the relationship between the Total NAB Index and any other NAB module, it must be taken into account that the module scores contribute twice to the analysis.

Fourth, due to the lack of research on IIV in EFs, the results of the present study were discussed mainly in relation to the available findings on IIV in different cognitive domains. As there might be differences in IIV according to cognitive domain, additional research targeting executive functioning in its various aspects and using different assessment tools will be required. Consequently, the nature of the tasks used should be taken into account. Although the NAB offers a comprehensive assessment of executive functioning, some executive aspects might be more pronounced, whereas others may be neglected. Furthermore, the NAB Executive Functions Module includes complex, multifaceted tasks that are appropriate for ecologically valid use in clinical practice. Given these practical implications, it is meaningful to investigate IIV in EFs measured by such assessment tools. Nevertheless, future studies should also focus on basic aspects of executive functioning, such as updating, shifting, and inhibition as proposed by Miyake et al. (2000), especially because these executive components are among those frequently examined with respect to other research questions.

# CONCLUSION

To conclude, the current study demonstrates considerable across-tasks dispersion with respect to EF performance. When age is taken into account, dispersion appears to decrease with advancing age and reach its lowest level late in life. Furthermore, lower levels of overall cognitive ability may be associated with decreases in dispersion, too. The current findings can be accounted for by the representativeness of the sample, the absence of cerebral dysfunctions in participants, and dedifferentiation processes associated with normal aging. These findings should be considered in the context of clinical evaluations, especially because lower cutoffs to identify cognitive problems might be more appropriate for clinical use. Additionally, assessment tools designed to detect dispersion as well as base rates of dispersion for different ages could be useful to identify cognitive impairments associated with pathological aging.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Ethical Principles of Psychologists and Code of Conduct of the American Psychological Association and the Ethical Principles of the Swiss Psychological Association, Ethical committee of the Philosophy Faculty of the University of Zürich, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethical committee of the Philosophy Faculty of the University of Zürich.

# AUTHOR CONTRIBUTIONS

DB conceived the study and was involved in data collection. DB also performed statistical analysis and drafted the manuscript. FP supervised the study and provided feedback to the draft of the manuscript.

# FUNDING

DB was supported by the doctoral and postdoctoral funding of the University of Bremen.

# REFERENCES


Functioning, eds S. Goldstein and J. A. Naglieri (New York, NY: Springer), 143–155.


**Conflict of Interest Statement:** FP is the editor of the German NAB adaptation.

The other author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Buczylowska and Petermann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Intra-Individual Variability of Error Awareness and Post-error Slowing in Three Different Age-Groups

Fabio Masina<sup>1</sup>\*, Elisa Di Rosa<sup>1,2</sup> and Daniela Mapelli<sup>1,3</sup>

<sup>1</sup> Department of General Psychology, University of Padova, Padova, Italy, <sup>2</sup> Department of Neuroscience, University of Padova, Padova, Italy, <sup>3</sup> Human Inspired Technologies Research Center, University of Padova, Padova, Italy

Background: Error awareness (EA) and post-error slowing (PES) are two crucial components of adequate performance monitoring because they allow, respectively, awareness of an error and the triggering of performance adjustments following unexpected events.

Objective: The purpose of the present study was to investigate the ontogenetic trajectories of EA and PES, as well as to examine how EA and PES interact with each other.

#### Edited by:

Sarah E. MacPherson, The University of Edinburgh, United Kingdom

#### Reviewed by:

Maria Herrojo Ruiz, Goldsmiths, University of London, United Kingdom Fei Luo, Institute of Psychology, Chinese Academy of Sciences, China

> \*Correspondence: Fabio Masina fabio.masina@phd.unipd.it

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 07 February 2018 Accepted: 17 May 2018 Published: 05 June 2018

#### Citation:

Masina F, Di Rosa E and Mapelli D (2018) Intra-Individual Variability of Error Awareness and Post-error Slowing in Three Different Age-Groups. Front. Psychol. 9:902. doi: 10.3389/fpsyg.2018.00902

Methods: The performance of three groups of participants (children, younger, and older adults) in a modified version of the Error Awareness task (EAT; Hester et al., 2005) was compared. In particular, in this study not only variations of the average performance were examined, but also intra-individual variability (IIV), considered in terms of variations of SD and ex-Gaussian parameters (mu, sigma, and tau).

Results: Two distinct ontogenetic trajectories of EA and PES were observed. Regarding EA, we observe a U-shaped curve that describes an increase of the process from childhood to early adulthood and a progressive reduction with advancing age in late adulthood. Furthermore, a greater IIV in older adults indicated a susceptibility of EA to the aging process. The ontogenetic trajectory of PES seems substantially different from the trajectory that describes EA, since in PES we do not observe age-related differences.

Conclusion: These results suggest that EA and PES are two independent processes. Furthermore, it appears that EA and PES are differently prone to short-term fluctuations in performance across the lifespan. While EA presents an increase in IIV in aging, PES seems to be immune to these changes.

Keywords: error awareness, post-error slowing, intra-individual variability, executive functions, performance monitoring, ex-Gaussian analysis

# INTRODUCTION

The ability to monitor our performance, and especially our errors, is essential in everyday life. In fact, an error not only represents a failure during performance but also a source of information about the necessity, direction, and magnitude of adjustments needed to prevent similar errors in the future (Ullsperger et al., 2014). Therefore, only correct performance monitoring allows compensatory actions to be triggered in support of efficient goal-directed behavior (Ullsperger and von Cramon, 2006).

Among the processes that constitute performance monitoring, the most representative and investigated phenomena are post-error slowing (PES) and error awareness (EA). Both phenomena have been widely studied in the literature, from several points of view.

Post-error slowing is the motor slowing that usually occurs after errors, and was described for the first time in 1966 by Rabbitt, who reported significantly slower reaction times (RTs) after erroneous responses than mean RTs of all correct responses (Rabbitt, 1966). Since this first evidence, other studies have reported PES in different kinds of tasks, for instance Stroop, Simon, Flanker or categorization tasks (for a review, Danielmeier and Ullsperger, 2011). However, despite this large body of knowledge, the functional role of PES is still debated. Specifically, two lines of research consider PES either an adaptive or a maladaptive phenomenon. On the one hand, the adaptive theories claim that PES contributes to improving ongoing behavior, involving both the perceptual and the motor system. For instance, Botvinick et al. (2001) suggest that PES might reflect the modification of the amount of sensory evidence required to initiate a motor response. On the other hand, the maladaptive theories propose that PES is a detrimental consequence of impaired processing after an error. In line with this idea, Notebaert et al. (2009) propose that PES is caused by the relative infrequency of errors during a task, which in turn may cause attentional lapses. In fact, in their study, when the error rate increased, approaching the frequency of the correct responses, PES was reduced or absent (Notebaert et al., 2009).

Despite these two different theories, PES can be considered as an index of performance adjustments following unexpected events (Wessel and Aron, 2017).

However, contrary to EA, PES does not inform about the level of consciousness in error detection.

Error awareness is a metacognitive process that allows being aware of an error. More recently than the first analyses of PES, Hester et al. (2005) introduced a method to investigate EA by designing a specific task, i.e., the Error Awareness Task (EAT; Hester et al., 2005). Since that first study, EA has been the object of several works (Shalgi et al., 2007; O'Connell et al., 2009; Harty et al., 2013, 2014), where different aspects have been explored: from its neural basis and electrophysiological markers, to its relation with other cognitive functions and environmental aspects. Specifically, it seems that EA strictly depends on both endogenous factors, such as attention or expertise, and exogenous factors, such as time pressure or ambiguity in task situations (Klein et al., 2013). Interestingly, other studies have shown that EA can be significantly affected by different neurological and psychiatric diseases (Klein et al., 2013), and that it declines in normal aging (Harty et al., 2013).

Nonetheless, although these two phenomena have been extensively studied, at least two questions appear to be still open in this literature: How do these two mechanisms develop across the lifespan? And how do they interact?

The changes of PES and EA across the lifespan have scarcely been explored in the literature and, to the best of our knowledge, no study so far has investigated intra-individual variability (IIV) of PES and EA across different age-groups.

Moreover, the evidence about a possible correlation between PES and EA is not convergent.

In fact, several authors showed that PES is significantly modulated by EA, since it is larger after an aware error than after an unaware error (Nieuwenhuis et al., 2001; Hester et al., 2005; Wessel and Ullsperger, 2011). At the same time, other evidence seems to show the opposite, reporting significant PES also after unnoticed errors (Cohen, 2009).

To date, studies that have jointly investigated PES and EA are still few and this would explain the fact that it is not yet possible to clearly accept or exclude an association between the two processes.

With the goal to identify the ontogenetic trajectories of EA and PES, the present study reports new evidence from healthy subjects of different age-groups, where EA and PES have been analyzed in terms of both intra and inter-individual variability.

In fact, although the ontogenetic cognitive development has been mainly investigated through the comparison of participants' average performance, only the study of IIV allows understanding another important aspect, such as the short-term changes in behavior. For the sake of clarity, with "intra-individual variability" we refer to what Hultsch et al. (2002) considered as "inconsistency within persons." Specifically, in our study, we refer to the variability of a single measure evaluated across different trials within the same task.

In general, changes of IIV in cognitive performance describe a U-shaped function across the lifespan (MacDonald et al., 2006), showing a quadratic relationship between IIV and age. At first, from childhood to adolescence, cognitive performance is characterized by a progressive reduction of IIV (Williams et al., 2005), followed by relative stability during early adulthood (Hultsch and MacDonald, 2012). Finally, with advancing age in later adulthood, IIV progressively increases again (Hultsch et al., 2002). In typically developing children, IIV decreases quickly after 6 years (Williams et al., 2005, 2007; Lewis et al., 2017), whereas compared to younger adults, older adults present more IIV (Hultsch et al., 2008; Garrett et al., 2012; Holtzer et al., 2014). Thus, IIV could be considered a stable and efficient marker of normal development and aging (MacDonald et al., 2006, 2009; Hultsch et al., 2008). Furthermore, a high fluctuation in response times, or response time variability, seems to reflect attentional lapses (Castellanos et al., 2005) and provides more information on short-term changes in behavior.

Altogether, these studies emphasize that IIV is a fruitful "tool" in the study of cognitive functioning, one that does not assume a static view of ontogenetic changes.

Intra-individual variability can be quantified in multiple ways; two measures typically used are the intra-individual standard deviation (ISD) and the coefficient of variation (CV). Alternatively, an interesting approach is the fitting of reaction time (RT) distributions with mathematical functions. These methods make it possible to quantify different parameters of RT distributions, not only measures of central tendency such as the mean or median RTs.
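As an illustration of the two summary measures just named, the following minimal sketch computes the ISD and the CV from one participant's trial-level RTs. The function names and the RT values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, not something specified in the text.

```python
from statistics import mean, stdev


def isd(rts):
    """Intra-individual standard deviation: SD of one person's trial RTs.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    return stdev(rts)


def cv(rts):
    """Coefficient of variation: the ISD scaled by the person's mean RT."""
    return stdev(rts) / mean(rts)


# Hypothetical RTs (in ms) for two participants with the same mean RT.
steady = [400, 420, 380, 450, 350]
erratic = [400, 600, 200, 550, 250]

for label, rts in (("steady", steady), ("erratic", erratic)):
    print(label, round(mean(rts)), round(isd(rts), 1), round(cv(rts), 3))
```

Both hypothetical participants have the same mean RT, so only the ISD and the CV separate them, which is the point of looking beyond central tendency.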

For example, Ratcliff (1979) was among the first to show how the ex-Gaussian distribution yields an optimal fit to RT distributions. The ex-Gaussian is the convolution of a Gaussian and an exponential distribution and it can well capture the positive skew frequently observed in RT distributions (Luce, 2008). Three parameters can be estimated from the ex-Gaussian fitting. The mu (µ) and the sigma (σ) parameters represent the normally distributed components of the distribution. The tau (τ) represents the exponentially distributed component, which reflects the positive skew of the RT distribution, or in other words, the slowest RTs in a distribution.
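To make the three parameters concrete, the sketch below simulates ex-Gaussian RTs as the sum of a Gaussian and an exponential variate and then recovers mu, sigma, and tau with a simple method-of-moments estimator. This is an illustrative stand-in, not the fitting procedure used in the study, and all function names and parameter values are hypothetical. It relies on three standard properties of the ex-Gaussian: the mean equals mu + tau, the variance equals sigma² + tau², and the third central moment equals 2·tau³.

```python
import math
import random


def simulate_exgauss(mu, sigma, tau, n, seed=0):
    """Draw n ex-Gaussian values: a Gaussian plus an independent exponential."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


def exgauss_moments(rts):
    """Method-of-moments estimates of (mu, sigma, tau).

    Assumption: moment matching is used for illustration only; it is less
    efficient than maximum likelihood and fails on samples without positive skew.
    """
    n = len(rts)
    m1 = sum(rts) / n
    m2 = sum((x - m1) ** 2 for x in rts) / n  # variance
    m3 = sum((x - m1) ** 3 for x in rts) / n  # third central moment
    if m3 <= 0:
        raise ValueError("no positive skew: tau cannot be estimated from moments")
    tau = (m3 / 2.0) ** (1.0 / 3.0)
    sigma_sq = m2 - tau ** 2
    if sigma_sq <= 0:
        raise ValueError("estimated tau exceeds the total spread")
    return m1 - tau, math.sqrt(sigma_sq), tau


if __name__ == "__main__":
    sample = simulate_exgauss(mu=400, sigma=50, tau=100, n=200_000)
    print(exgauss_moments(sample))
```

With a large simulated sample the estimates land close to the generating values; with the few dozen trials available per condition in a real task, moment estimates are noisy, which is one reason dedicated fitting routines are preferred in practice.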

Several authors provide cognitive interpretations of the ex-Gaussian parameters. Specifically, the tau parameter would reflect a central processing component (Spieler et al., 1996), a measure of attentional lapses (Epstein et al., 2006), and an index of higher cognitive functions such as working memory and reasoning (Schmiedek et al., 2007). In general, the ex-Gaussian parameters seem to be an efficient means of measuring IIV (West et al., 2002; Tarantino et al., 2013; Lewis et al., 2017).

Interestingly, IIV across the lifespan would be associated with changes in brain morphology. The reduction of gray matter during development, or synaptic pruning, might contribute to the decrease of neural noise in cognitive functions and of IIV (Gogtay et al., 2004). In contrast, in the elderly population, a reduction of white matter, especially in the frontal lobes, may be related to the increased level of IIV (Moy et al., 2011).

Among cognitive processes that could better explain IIV across lifespan, executive functions are probably the best candidate for several reasons. First of all, neuroimaging studies provide evidence on a connection between executive processes mediated by frontal lobes and IIV (Bellgrove et al., 2004; Bunce et al., 2007). Second, executive functions are characterized by a multifaceted nature and it is plausible that a couple of its many components can be involved in IIV, namely lapses of intention (West et al., 2002) or an inadequate level of sustained attention (Kaiser et al., 2008).

In summary, although IIV has been linked to executive functions, so far no study has taken into account IIV of performance monitoring across the lifespan. This lack of evidence lays the foundations for investigating how EA and PES develop over the lifespan. Interestingly, IIV, considered by several authors a better predictor of normal development and aging than average performance (MacDonald et al., 2006, 2009; Hultsch et al., 2008), can offer new insights to characterize two phenomena closely related to performance monitoring, namely EA and PES.

# MATERIALS AND METHODS

#### Participants

Ninety-six healthy participants took part in the study: 34 children (age range 8–13 years), 30 younger adults (age range 19–35 years), and 32 older adults (age range 61–83 years).

The inclusion criteria were normal or corrected-to-normal vision, no previous or present neurological and/or psychiatric disorders and no use of any neuro-psychopharmacological drugs. Older adults had an additional inclusion criterion of a Montreal Cognitive Assessment test score (Nasreddine et al., 2005) over the Italian cut-off (Conti et al., 2015; Santangelo et al., 2015). Moreover, participants with a poor or high level of accuracy on the EAT (Hester et al., 2005) were excluded from the analyses (<5% or >90% correctly withheld No-Go trials). As a result, the final sample consisted of 30 children (mean age 10.8 years, range 8–13), 30 younger adults (mean age 25.4 years, range 19–35), and 30 older adults (mean age 70.4 years, range 61–83). **Table 1** shows the main demographics of the sample and the number of participants in each group entered in the analyses. Written informed consent was obtained from all of the younger and older participants, and from the parents of the children. The study was carried out in accordance with the Declaration of Helsinki, and was approved by the Ethics Committee of the School of Psychology, University of Padua.

# Experimental Task

All participants performed a modified version of the EAT (Hester et al., 2005), a Go/No-Go response inhibition task specifically designed to evaluate EA. A serial stream of color words was presented at the center of a computer screen. Each stimulus was presented for 750 ms, followed by a black screen presented for 750 ms. Participants were asked to press a Go button ("3" on a keyboard) as soon as possible when the word and its font color were congruent (Go trial). In contrast, participants had to withhold the response in two conditions: (1) when the same colored word was repeated on two consecutive trials (Repeat No-Go), or (2) when the word and its font color were incongruent (Stroop No-Go).

After each stimulus presentation, a prompt was presented for 1000 ms with the following question: "Did you make a mistake?" During this time, participants were instructed to press an error button (space bar) to signal a supposed error. Through this prompt, the task was simplified compared to the original version (Hester et al., 2005), because participants were asked explicitly to monitor their own performance trial by trial. In total, 675 trials were presented (600 Go trials and 75 No-Go trials, of which 36 were Stroop No-Go and 39 Repeat No-Go). The task was divided into three blocks of 225 trials each, so that participants could take a brief rest after each block. It was ensured that all participants were well-trained and fully understood the instructions of the task before they began the experiment. The experiment was run with E-Prime software (version 2.0; Psychology Software Tools, Pittsburgh, PA, United States).

RTs, reaction times; SD, standard deviation; MoCA, Montreal Cognitive Assessment test.

#### Measures and Data Analysis

According to the aims of the present study, participants' performance was evaluated through different dependent variables, which were calculated and analyzed as follows.

#### Reaction Times and Accuracy

Average performance indices were assessed in terms of mean RTs, for both correct and error responses (RTs under 100 ms were removed from analyses), and mean accuracy, which was calculated as the proportion of correct withholds on No-Go trials.

To evaluate differences in mean RTs, a mixed 2 × 3 ANOVA was conducted with response type (correct vs. error) as a within-subjects variable and group (children, younger, and older adults) as a between-subjects variable. Differences in mean accuracy were analyzed by one-way ANOVA with group (children, younger, and older adults) as a between-subjects variable.

#### Error Awareness

Mean EA was calculated as the percentage of correctly signaled errors out of the total number of commission errors (O'Connell et al., 2009). Differences in mean EA were analyzed by one-way ANOVA with group (children, younger, and older adults) as a between-subjects variable. To evaluate EA across time (task duration), we divided the EAT into six blocks and computed EA for each of them. In this case, a mixed 6 × 3 ANOVA was conducted with block (1, 2, 3, 4, 5, and 6) as a within-subjects variable and group (children, younger, and older adults) as a between-subjects variable. For this analysis, the sample size was reduced to 83 participants (29 children, 28 younger, and 26 older adults) because seven participants did not commit any error during a particular block (see **Table 1**).

Finally, in order to evaluate IIV of EA across the task duration, a single value of SD was computed for each participant, again considering EA across the six blocks. Standard deviations were averaged for each group and compared by one-way ANOVA with group (children, younger, and older adults) as between-subjects variable. Again, for these analyses the sample size was reduced to 83 participants (29 children, 28 younger, and 26 older adults) (see **Table 1**).
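The block-wise EA scores and their within-person SD described above can be sketched as follows. The data layout (one pair of counts per block), the function names, and the example counts are hypothetical, and the use of the sample (n − 1) SD is an assumption. A participant with a block containing no commission errors yields no value, mirroring the exclusion rule described in the text.

```python
from statistics import stdev


def error_awareness(n_signaled, n_errors):
    """EA (%) = signaled commission errors / all commission errors * 100."""
    return 100.0 * n_signaled / n_errors


def ea_variability(blocks):
    """SD of block-wise EA for one participant.

    blocks: list of (n_signaled, n_errors) tuples, one per block.
    Returns None if any block has no commission errors, because EA is
    undefined there (such participants were excluded from this analysis).
    Assumption: the sample (n - 1) standard deviation is used.
    """
    if any(n_errors == 0 for _, n_errors in blocks):
        return None
    return stdev(error_awareness(s, e) for s, e in blocks)


# Hypothetical counts for six blocks: (signaled errors, commission errors).
participant = [(2, 4), (3, 4), (4, 4), (3, 4), (2, 4), (4, 4)]
print(ea_variability(participant))
```

Each participant contributes one SD value, and these values are then compared between groups.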

#### Post-error Slowing

Mean PES was computed according to Dutilh et al. (2012) as the difference between the RT that follows and the RT that precedes each error. This difference was compared with the difference between the RT that follows and the RT that precedes each correct withhold. RTs under 100 ms were removed from analyses. Unaware errors were excluded from the analyses of PES. Moreover, three participants were removed from the analyses since they did not commit at least three errors, the minimum needed to compute a reliable mean PES (Fitzgerald et al., 2010; Rodehacke et al., 2014). Consequently, the sample size was reduced to 87 participants (29 children, 29 younger, and 29 older adults) for all the analyses on PES (see **Table 1**). The outline of the analyses on PES is as follows. First, differences in terms of mean and SD were analyzed by two mixed 2 × 2 × 3 ANOVAs with response (post vs. pre No-Go target response) and target response (aware error vs. correct withhold) as within-subjects variables and group (children, younger, and older adults) as a between-subjects variable. Finally, as measures of IIV of PES, we fitted the ex-Gaussian distribution to our RT distributions. The ex-Gaussian parameters were calculated by the egfit MATLAB function (Lacouture and Cousineau, 2008), which allows estimating the measure of central tendency (mu), spread (sigma), and the degree of positive skew (tau) of the distribution. Three separate mixed 2 × 2 × 3 ANOVAs were conducted, one for each ex-Gaussian parameter, in which response (post vs. pre No-Go target response) and target response (aware error vs. correct withhold) were entered as within-subjects variables and group (children, younger, and older adults) as a between-subjects variable.
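The post-minus-pre difference described above can be sketched as below. The trial representation (a list of `(kind, rt)` tuples), the labels, and the function name are hypothetical. Requiring both neighbouring trials to be Go responses is an assumption made for the illustration. The 100 ms cut-off and the minimum of three events come from the text.

```python
from statistics import mean

MIN_RT_MS = 100  # RTs under 100 ms were removed (from the text)
MIN_EVENTS = 3   # at least three events for a reliable mean (from the text)


def post_minus_pre(trials, target_kind):
    """Mean of RT(after) - RT(before) around each target of the given kind.

    trials: list of (kind, rt) with kind in {"go", "aware_error",
    "unaware_error", "withhold"}; rt is None when no response was made.
    Assumption: both neighbours must be Go responses with a valid RT.
    """
    diffs = []
    for i in range(1, len(trials) - 1):
        if trials[i][0] != target_kind:
            continue
        (prev_kind, prev_rt), (next_kind, next_rt) = trials[i - 1], trials[i + 1]
        if prev_kind != "go" or next_kind != "go":
            continue
        if prev_rt is None or next_rt is None:
            continue
        if prev_rt < MIN_RT_MS or next_rt < MIN_RT_MS:
            continue
        diffs.append(next_rt - prev_rt)
    return mean(diffs) if len(diffs) >= MIN_EVENTS else None


# Hypothetical trial sequence for one participant (RTs in ms).
trials = [
    ("go", 500), ("aware_error", 300), ("go", 600),
    ("go", 520), ("aware_error", 310), ("go", 640),
    ("go", 510), ("aware_error", 290), ("go", 650),
    ("go", 560), ("withhold", None), ("go", 540),
    ("go", 570), ("withhold", None), ("go", 550),
    ("go", 580), ("withhold", None), ("go", 560),
]
print(post_minus_pre(trials, "aware_error"), post_minus_pre(trials, "withhold"))
```

Comparing the two values, one around aware errors and one around correct withholds, separates slowing that is specific to errors from any general change in speed after a No-Go target.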

We decided to avoid investigating PES across the time (task duration), because a sufficient number of trials was not present in all the conditions of a hypothetical 6 × 2 × 2 × 3 ANOVA with block (1, 2, 3, 4, 5, 6), response (post vs. pre No-Go target response) and target response (aware error vs. correct withhold) as within-subjects variables and group (children, younger, and older adults) as between-subjects variable.

In general, the Bonferroni correction was applied to every post hoc analysis and a corrected alpha-level of 0.05 was considered. Finally, effect sizes were estimated by partial eta squared ($\eta_p^2$).
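For readers who want to relate reported F statistics to effect sizes, the sketch below uses the standard identity that partial eta squared equals F·df_effect / (F·df_effect + df_error), together with a Bonferroni adjustment that multiplies each p-value by the number of comparisons. The function names and the example p-values are hypothetical. The F value and degrees of freedom are those reported for the main effect of response type on mean RTs.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# F(1,87) = 32.4 as reported for response type on mean RTs.
print(round(partial_eta_squared(32.4, 1, 87), 2))

# Hypothetical uncorrected p-values for the three pairwise group comparisons.
print(bonferroni([0.01, 0.2, 0.5]))
```

Comparing adjusted p-values with 0.05 is equivalent to comparing the uncorrected p-values with 0.05 divided by the number of comparisons.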

# RESULTS

#### Reaction Times and Accuracy

The means and SDs of RTs and accuracy are shown in **Table 2**. Data analyses on mean RTs revealed a main effect of response type [F(1,87) = 32.4, p < 0.001, $\eta_p^2$ = 0.3], and group [F(2,87) = 21.8, p < 0.001, $\eta_p^2$ = 0.3]. Post hoc comparisons showed that error RTs were faster than correct RTs. Moreover, younger adults were faster than children (p < 0.01) and older adults (p < 0.001), whereas children were faster than older adults (p < 0.05). No response type × group interaction was found.

As for mean accuracy, a main effect of group was found [F(2,87) = 20.8, p < 0.001, $\eta_p^2$ = 0.3]. Post hoc comparisons showed that children made more errors than younger adults (p < 0.001) and older adults (p < 0.001). No difference in accuracy was found between younger and older adults (p = 0.32).

#### Error Awareness


The mean and SD of EA scores are presented in **Table 2**. The analyses on mean EA revealed a main effect of group [F(2,87) = 33.03, p < 0.001, $\eta_p^2$ = 0.4]. The post hoc comparisons indicated that younger adults were more aware of their commission errors than both children (p < 0.001) and older adults (p < 0.001). No significant difference was found between mean EA in children and older adults (p = 0.14).

When EA across time was considered in our analyses (**Figure 1**), we found a main effect of block [F(4.1,329.5) = 4.6, p < 0.05, $\eta_p^2$ = 0.1]. Specifically, post hoc comparisons showed a general improvement in EA between blocks 1 and 5 (p < 0.01). However, this improvement disappeared at the end of the task, as the comparison between blocks 1 and 6 showed (p = 0.29). Furthermore, the analyses revealed a main effect of group [F(2,80) = 26.5, p < 0.001, $\eta_p^2$ = 0.4]. Similarly to the analyses on mean EA, post hoc comparisons showed that younger adults had a higher level of EA than children (p < 0.001) and older adults (p < 0.001). The comparison between children and older adults did not show a significant difference (p = 0.29). Finally, no block × group interaction was observed (p = 1.57).

The analyses of IIV of EA across time, in terms of SD, showed a main effect of group [F(2,80) = 15.1, p < 0.001, $\eta_p^2$ = 0.3]. Younger adults' EA was less variable than that of children (mean standard deviations: 12 vs. 19; p < 0.05) and older adults (mean standard deviations: 12 vs. 26; p < 0.001). Interestingly, unlike mean EA, which did not reveal a difference between children and older adults in terms of average performance, the comparison of standard deviations showed that children had a lower IIV of EA than older adults (mean standard deviations: 19 vs. 26; p < 0.05).

#### Post-error Slowing – Means and SDs

The means and SDs for RTs following and preceding an aware error or a correct withhold are shown in **Table 3**. The analyses on means showed a main effect of response [F(1,84) = 78.53, p < 0.001, $\eta_p^2$ = 0.5], target response [F(1,84) = 25.73, p < 0.001, $\eta_p^2$ = 0.2], and group [F(2,84) = 21.66, p < 0.001, $\eta_p^2$ = 0.3]. The post hoc comparisons indicated that RTs following a No-Go target were slower than RTs preceding a No-Go target (p < 0.001), and that RTs were faster (without a distinction between post and pre No-Go target response) when participants correctly withheld the response to a No-Go target (p < 0.001). As for group differences, younger adults were generally faster than children (p < 0.01) and older adults (p < 0.001), whereas children and older adults did not differ (p = 0.32). Moreover, the analyses revealed a response × group interaction [F(2,84) = 8.1, p < 0.01, $\eta_p^2$ = 0.2]. In all groups, post and pre No-Go target RTs differed from each other (all p < 0.01), indicating a slowing after a No-Go target. Finally, a response × target response interaction was found [F(2,84) = 8.1, p < 0.01, $\eta_p^2$ = 0.2], confirming a PES effect (**Figure 2**). In fact, data revealed a general slowing after an aware error (post-error RTs = 650 ms vs. pre-error RTs = 539 ms; p < 0.001) and an opposite trend after a correct withhold (post-withhold RTs = 550 ms vs. pre-withhold RTs = 573 ms; p < 0.001). No response × target response × group interaction was found (p = 0.24), revealing that the magnitude of PES was the same in all groups.

TABLE 2 | Mean and standard deviations (SD) of performance indices on the EAT for children, younger, and older adults.

The analyses on SDs revealed a main effect of response [F(1,84) = 4.94, p < 0.05, $\eta_p^2$ = 0.1], target response [F(1,84) = 12.76, p < 0.01, $\eta_p^2$ = 0.1], and group [F(2,84) = 24.1, p < 0.001, $\eta_p^2$ = 0.4]. The post hoc comparisons indicated that RTs following a No-Go target were more variable than RTs preceding a No-Go target (p < 0.05), and that variance was reduced (collapsing post and pre No-Go target response) when participants correctly withheld the response to a No-Go target (p < 0.05). The main effect of group indicated that children's RTs were generally more inconsistent than those of younger adults (p < 0.001) and older adults (p < 0.001). No difference was found between younger and older adults (p = 0.53). Moreover, the analyses on SDs revealed a response × target response interaction [F(1,84) = 27.9, p < 0.001, $\eta_p^2$ = 0.2]. The results showed a general increase in variance after an aware error (post-error SDs = 145 vs. pre-error SDs = 113; p < 0.001) and an opposite trend after a correct withhold (post-withhold SDs = 106 vs. pre-withhold SDs = 118; p < 0.05). No response × target response × group interaction was found (p = 0.76).

## Post-error Slowing – Ex-Gaussian Parameters

**Table 4** summarizes the results (means and SDs) on ex-Gaussian parameters. Three mixed ANOVAs were conducted to check differences on each parameter of the distribution: mu, sigma, and tau.

TABLE 3 | Means and standard deviations (SDs) for post- and pre-target responses computed as a function of target response (aware error, correct withhold) and group.


The analyses on the mu parameter showed a main effect of response [F(1,84) = 30.9, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.3] and group [F(2,84) = 20.96, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.3]. Similarly to the previous analyses on means, RTs following a No-Go target were slower than RTs prior to a No-Go target (p < 0.001). The main effect of group revealed that younger adults were generally faster than children (p < 0.01) and older adults (p < 0.001), and that children were faster than older adults (p < 0.01). In addition, the analyses revealed a response × group interaction [F(2,84) = 6.8, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.1]. Only younger adults and children presented a slowing after a No-Go target (p < 0.05). Interestingly, a response × target response interaction was found [F(1,84) = 46.1, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.4], confirming PES also when the mu parameter, rather than the more conventional mean, was entered in the analyses. The results showed higher mu scores after an aware error (post-error mu = 534 vs. pre-error mu = 460; p < 0.001), but no difference after a correct withhold (post-withhold mu = 482 vs. pre-withhold mu = 490; p = 0.32). Finally, in line with the previous analyses, no response × target response × group interaction was found (p = 0.30).

The analyses on the sigma parameter showed only a main effect of group [F(2,84) = 9, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.2]. Post hoc analyses revealed that children showed higher sigma scores than younger adults (p < 0.01) and older adults (p < 0.01). No difference in sigma scores was found between younger and older adults (p = 1).
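The partial eta squared values reported alongside each F ratio can be cross-checked from the F value and its degrees of freedom, using the standard identity η<sub>p</sub><sup>2</sup> = (F · df<sub>effect</sub>) / (F · df<sub>effect</sub> + df<sub>error</sub>). The sketch below is only a reader's check under that identity; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F ratios taken from the text above.
group_mu = partial_eta_squared(20.96, 2, 84)    # main effect of group on mu
group_sigma = partial_eta_squared(9.0, 2, 84)   # main effect of group on sigma
```

Rounded to one decimal place, these give 0.3 and 0.2, in agreement with the reported effect sizes.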

TABLE 4 | Ex-Gaussian parameters for post- and pre-target responses computed as a function of target response (aware error, correct withhold) and group.

Finally, the analyses on the tau parameter showed a main effect of response [F(1,84) = 4, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.05], target response [F(1,84) = 14.57, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.1], and group [F(2,84) = 11.35, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.2]. The post hoc comparisons indicated higher tau scores after a No-Go target than before a No-Go target (p < 0.05), and lower tau scores (collapsing across post- and pre-No-Go target responses) when participants correctly withheld the response to a No-Go target (p < 0.001). As for group differences, children had higher tau scores than younger adults (p < 0.001) and older adults (p < 0.01), with no difference between younger and older adults (p = 1). Finally, a response × target response interaction was found [F(1,84) = 17.21, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.2]. The results showed higher tau scores after an aware error (post-error tau = 117 vs. pre-error tau = 80; p < 0.001), but no difference after a correct withhold (post-withhold tau = 68 vs. pre-withhold tau = 82; p = 0.06). No response × target response × group interaction was found (p = 0.78).
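As a consistency check on the reported values, recall that the mean of an ex-Gaussian distribution equals mu + tau (and its variance equals sigma² + tau²). The worked arithmetic below uses only the grand means reported in this section; small discrepancies are expected because the parameters were fitted and averaged separately and the reported values are rounded.

```latex
\begin{aligned}
\mathbb{E}[\mathrm{RT}] &= \mu + \tau, \qquad \mathrm{Var}[\mathrm{RT}] = \sigma^{2} + \tau^{2},\\
\text{post-error:}\quad \mu + \tau &= 534 + 117 = 651 \approx 650~\text{ms},\\
\text{pre-error:}\quad \mu + \tau &= 460 + 80 = 540 \approx 539~\text{ms},\\
\text{post-withhold:}\quad \mu + \tau &= 482 + 68 = 550~\text{ms},\\
\text{pre-withhold:}\quad \mu + \tau &= 490 + 82 = 572 \approx 573~\text{ms},\\
\Delta\mu + \Delta\tau &= (534 - 460) + (117 - 80) = 74 + 37 = 111~\text{ms} = 650 - 539~\text{ms}.
\end{aligned}
```

Read this way, roughly two thirds of the 111 ms post-error slowing in mean RTs (74 ms) appears as a shift of the Gaussian component, and about one third (37 ms) as a lengthening of the exponential tail.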

# DISCUSSION

The purpose of the present study was to investigate the ontogenetic trajectories of EA and PES through the comparison of different age-groups and how EA and PES interact with each other. The performance of children (age range 8–13 years), younger adults (age range 19–35 years) and older adults (age range 61–83 years) in a modified version of the EAT (Hester et al., 2005) was compared. In particular, in this study not only variations of the average performance were examined, but also IIV, considered in terms of variations of SD and ex-Gaussian parameters (mu, sigma, and tau).

The results on average performance on the EAT showed that older adults were generally slower than younger adults and children. However, in terms of accuracy, older and younger adults did not differ. This result is in line with several studies in which generally slower performance in older adults, at the same level of accuracy as younger adults, seems partially to reflect a change in the strategy used to tackle a task: older adults seem to be more cautious than younger adults in their responses (Starns and Ratcliff, 2010; Dutilh et al., 2013). In contrast, accuracy in children was lower than in the other two groups: they committed more errors, showing difficulty in inhibiting an inappropriate response and, in general, a lower level of inhibitory control.

Particularly interesting is the result that EA was lower in children and older adults than in younger adults. Previous studies have already highlighted differences between older and younger adults in terms of EA, showing poorer levels of error detection in older adults (Harty et al., 2013, 2014, 2017). However, for the first time, this study also takes into account a group of children, in order to outline the ontogenetic trajectory of EA by comparing different age-groups. The results showed that EA in children was not yet a fully mature process when compared with a group of younger adults. At the same time, the results confirmed the previous evidence that EA is reduced in older adults. Taken together, these findings show a fairly clear relationship between age and EA. Specifically, these results suggest that the relationship between age and EA could be represented by a classical U-shaped function, as it is for most of the executive functions. Whereas EA increases through childhood and early adulthood, advancing age in late adulthood is related to a reduction of EA. However, the absence of a fourth group of adults aged between 40 and 60, which is a limitation of this study, means that future replication is needed to confirm this conclusion.

When we evaluated EA over time, dividing the task into six blocks, we found an improvement in EA between blocks 1 and 5 and, afterward, a reduction in EA at the end of the task, as the comparison between blocks 1 and 6 showed. This result could depend on a spontaneous fluctuation in sustained attention or arousal during the EAT, and it seems to be in line with previous studies that show a relationship between EA and arousal (Shalgi et al., 2007; Robertson, 2014).

Of interest, IIV of EA over time showed a different pattern, because it was more prominent in older adults than in younger adults and children. Although older adults and children had very similar levels of EA in terms of average performance, older adults were characterized by a more marked IIV of EA over time. A possible interpretation of these results could derive from previous studies in which increased IIV has often been associated with impairments in attention (Castellanos et al., 2005), especially in older adults (Hultsch et al., 2002). Thus, temporary lapses of attention in older adults may help to explain the greater inconsistency of EA.

The results of our study revealed PES, a phenomenon widely observed in the literature. However, the strength of the present study is that PES was examined using both a more conventional approach, based on the comparison of the mean and SD of RTs, and the ex-Gaussian function fitted to our RT distributions.

The first evidence that emerges from our findings is that PES was confirmed, as expected, when the mean RTs were considered. Moreover, interestingly, we also observed PES when the mu parameter of the ex-Gaussian was entered in the analyses.

As for the absence of a response × target response × group interaction, both when mean RTs and when the mu parameter were considered in the analyses, it revealed no difference between groups in terms of the magnitude of PES. This result contrasts with our expectations and also with previous findings that showed a more pronounced PES in older adults (Dutilh et al., 2013). In fact, we expected to find group differences in the magnitude of PES; in particular, we expected PES to be greater in children and older adults than in younger adults. These expectations were justified by at least two different interpretations. On the one hand, we expected an increased PES in children and older adults since they are generally known to present greater difficulty in inhibitory control (Christ et al., 2001). As claimed by Notebaert et al. (2009), PES would reflect a slowdown generated by an orienting response when an error occurs. The greater distractibility of children and older adults by exogenous and endogenous stimuli (e.g., an error that represents a failure in performance monitoring) could, therefore, explain the greater expected slowdown. On the other hand, especially to explain the expected slowing in older adults, they are known to be more cautious in order to avoid errors, as well described by the speed-accuracy trade-off phenomenon. Therefore, it was plausible to expect a greater slowdown following an error in older adults. In summary, both weak inhibitory control and changes in the strategy used to tackle a task could explain our expectations. However, the unexpected result in the present study opens up an alternative interpretation. In fact, considering PES as a compensatory and adaptive process aimed at improving performance following an error (Gehring and Fencsik, 2001), it is plausible to think that it is already mature in childhood and does not decline during normal aging. This interpretation would explain the absence of differences between the groups in terms of the magnitude of PES.

Another interesting result concerns the response × target response interaction found when the SD of the RT distributions was entered in the analyses on PES. This interaction showed that people not only slow down on trials following errors, but are also more inconsistent in the responses that follow.

The last group of analyses took the three parameters of the ex-Gaussian distribution into account. The results revealed differences between the three groups in terms of the mu parameter, which describes the normally distributed component of the RTs. In fact, older adults were the slowest group, while children were slower than younger adults. This result differed from the previous analyses of mean RTs of PES, since in those analyses no difference was present between children and older adults. Therefore, when the slowest RTs were excluded from the distributions (the slowest RTs of the distribution were captured by the tau parameter), the group of children appeared to be faster than older adults.

When we considered the sigma parameter, we found a main effect of group. The results showed that the normally distributed RTs of the ex-Gaussian were more variable in children than in younger and older adults. These results were fully consistent with those observed when the SDs were considered in the analyses of PES.

Finally, the analyses on the tau parameter revealed an important result. Children presented the highest tau scores compared to younger and older adults. In previous studies, tau values have been associated with a higher IIV (see Dixon et al., 2004) and this inconsistency has been related to lapses in cognitive processes, such as attention. In particular, high tau scores would be related to attentional lapses (Epstein et al., 2006). This evidence can partially explain the differences between groups observed in our study regarding tau scores. Children may be more prone to attentional lapses than the other two groups and consequently present a higher level of IIV.

# CONCLUSION

This study reveals two distinct ontogenetic trajectories of the investigated processes. Regarding EA, we suggest the existence of a U-shaped curve that describes an increase of the process from childhood to early adulthood and a progressive reduction with advancing age in late adulthood. Furthermore, a greater IIV in older adults indicated a susceptibility of EA to the aging process. The ontogenetic trajectory of PES seems substantially different from the trajectory that describes EA, since in PES we did not observe age-related differences. These results suggest that EA and PES are two independent processes, explaining why no association between them was revealed in a previous study (Cohen, 2009). Furthermore, it appears that EA and PES are differently prone to short-term fluctuations in performance across the lifespan. While EA presents an increase in IIV in aging, PES seems to be immune to these changes.

# AUTHOR CONTRIBUTIONS

FM, ED, and DM conceived and planned the experiments. FM and ED carried out the experiments. FM analyzed the data and wrote the manuscript with input from all authors. ED and DM contributed to the interpretation of the results.

# ACKNOWLEDGMENTS

The authors thank V. Tarantino for her suggestions with the ex-Gaussian analysis.

and equivalent scores. Neurol. Sci. 36, 209–214. doi: 10.1007/s10072-014-1921-3




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Masina, Di Rosa and Mapelli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Dual Task Effects on Visual Attention Capacity in Normal Aging

Erika C. S. Künstler<sup>1</sup>\*†, Melanie D. Penning<sup>2</sup>†, Natan Napiórkowski<sup>2</sup>, Carsten M. Klingner<sup>1</sup>, Otto W. Witte<sup>1</sup>, Hermann J. Müller<sup>2</sup>, Peter Bublak<sup>1</sup>‡ and Kathrin Finke<sup>1,2</sup>‡

<sup>1</sup> Hans Berger Department of Neurology, Jena University Hospital, Jena, Germany, <sup>2</sup> Department of Psychology, Ludwig-Maximilians-Universität München, Munich, Germany

#### Edited by:

Celine R. Gillebert, KU Leuven, Belgium

#### Reviewed by:

Mario Bonato, Università degli Studi di Padova, Italy; Thomas Alrik Sørensen, Aalborg University, Denmark

> \*Correspondence: Erika C. S. Künstler erika.kuenstler@med.uni-jena.de

†These authors share first authorship

‡These authors share senior authorship

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 28 February 2018 Accepted: 06 August 2018 Published: 03 September 2018

#### Citation:

Künstler ECS, Penning MD, Napiórkowski N, Klingner CM, Witte OW, Müller HJ, Bublak P and Finke K (2018) Dual Task Effects on Visual Attention Capacity in Normal Aging. Front. Psychol. 9:1564. doi: 10.3389/fpsyg.2018.01564

Older adults show higher dual task performance decrements than younger adults. While this is assumed to be related to attentional capacity reductions, the precise affected functions are not specified. Such specification is, however, possible based on the "theory of visual attention" (TVA) which allows for modeling of distinct attentional capacity parameters. Furthermore, it is unclear whether older adults show qualitatively different attentional effects or whether they show the same effects as younger adults experience under more challenging conditions. By varying the complexity of the secondary task, it is possible to address this question. In our study, participants performed a verbal whole report of briefly presented letter arrays. TVA-based fitting of report performance delivered parameters of visual threshold t0, processing speed C, and visual short-term memory (VSTM) storage capacity K. Furthermore, participants performed a concurrent motor task consisting of continuous tapping of a (simple or complex) sequence. Both TVA and tapping tasks were performed under single and dual task conditions. Two groups of 30 younger adults each performed either the simple or complex tapping, and a group of 30 older adults performed the simple tapping condition. In older participants, VSTM storage capacity declined under dual task conditions. While no such effect was found in younger subjects performing the simple tapping sequence under dual task conditions, the younger group performing the complex tapping task under dual task conditions also showed a significant VSTM capacity reduction. Generally, no significant effect on other TVA parameters or on tapping accuracy was found. Comparable goodness-of-fit measures were obtained for the TVA modeling data in single and dual tasks, indicating that tasks were executed in a qualitatively similar, continuous manner, although quantitatively less efficiently under dual- compared to single-task conditions. Taken together, our results show that the age-specific effects of motor-cognitive dual task interference are reflected by a stronger decline of VSTM storage capacity. They support an interpretation of VSTM as central attentional capacity, which is shared across visual uptake and concurrent motor performance. Capacity limits are reached earlier, and already under lower motor task complexity, in older compared to younger adults.

Keywords: visual attention, healthy aging, dual-tasking, theory of visual attention, multi-tasking

# INTRODUCTION

Aging is associated with a decline of sensory and motor functions, as well as distinct cognitive abilities (Lindenberger, 2014). Moreover, consistent evidence shows that dealing with cognitive demands in parallel to a motor task is more difficult for subjects of a higher age (McDowd and Craik, 1988; Kramer and Larish, 1996; Verhaeghen and Cerella, 2002; Woollacott and Shumway-Cook, 2002; Verhaeghen, 2011; Ruthruff and Lien, 2017). Thus, not only do cognitive and motor skills both decline over the life span (Ketcham and Stélmach, 2001; Park and Reuter-Lorenz, 2009; McAvinue et al., 2012; Habekost et al., 2013), but dual tasking seems to add an additional deteriorating factor (Verhaeghen et al., 2002, 2003) that renders even the execution of seemingly easy tasks vulnerable through the introduction of a secondary task (Boisgontier et al., 2013; Künstler et al., 2017). That is, dual tasking requirements seem to represent a specific challenge for elderly adults, which in turn leads to exacerbated performance deterioration. These particular difficulties of older adults in dual tasking situations are especially relevant because they have been linked to a higher risk of falls (Faulkner et al., 2007). However, the reasons for these stronger dual task effects in aging are still not entirely clear.

Dual task interference is observed when performance of one or both tasks within a dual task situation declines compared to the performance of each single task carried out separately (Kahneman, 1973). Two of the most influential attentional explanations for the dual task effect are the bottleneck account and the central capacity sharing model (see Tombu and Jolicoeur, 2004, for an overview). According to the bottleneck account, the dual task related decline in performance arises from the fact that two tasks cannot be executed simultaneously but have to be carried out in a sequential manner, at least at some stage of processing (Pashler, 1994). In contrast, the capacity sharing account assumes simultaneous task performance, but suggests that the overall amount of attentional resources available for performance is strictly limited (e.g., Navon and Miller, 2002). Due to this limitation, attentional capacity has to be shared between the two tasks, giving rise to a trade-off in task performance. As long as the individual's capacity limit is not reached, both tasks can be performed concurrently without a drop-off in either task. Only when the task demand exceeds said limit, one or both of the tasks will be affected. Capacity sharing models consider serial task processing at central stages (Pashler, 1994) as a special case of capacity sharing, whereby first Task 1 and then Task 2 gets all of the available capacity. However, Logan and Gordon (2001) offered a model combining aspects from both the resource sharing and the bottleneck account in their "executive control of the theory of visual attention" (ECTVA) framework.
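The contrast between the two accounts can be illustrated with a deliberately simple toy rule. This is not a model taken from the cited literature: the proportional performance measure, the `share_1` parameter, and the numbers are all assumptions made for illustration. The sketch shows that no cost arises while summed demand stays within capacity, and that giving one task absolute priority (`share_1=1.0`) behaves like a bottleneck in which Task 1 is served first.

```python
def shared_performance(demand_1, demand_2, capacity, share_1=0.5):
    """Toy capacity-sharing rule (illustrative only).

    Returns the fraction of each task's demand that is met (1.0 = no cost).
    Demands must be positive; share_1 is the capacity fraction offered to task 1.
    """
    if demand_1 + demand_2 <= capacity:
        return 1.0, 1.0  # capacity limit not reached: no dual-task cost
    alloc_1 = min(demand_1, share_1 * capacity)
    alloc_2 = min(demand_2, capacity - alloc_1)
    return alloc_1 / demand_1, alloc_2 / demand_2

within_limit = shared_performance(0.4, 0.5, 1.0)             # both tasks unaffected
equal_split = shared_performance(0.6, 0.6, 1.0)              # both tasks suffer
task1_first = shared_performance(0.6, 0.6, 1.0, share_1=1.0)  # only task 2 suffers
```

Lowering `capacity` while keeping the demands fixed makes the cost appear at lower task load, which is the pattern the present study asks about in older adults.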

The "theory of visual attention" (TVA; Bundesen, 1990; see Bundesen et al., 2015 for a current overview) can itself be applied as a framework to assess processing capacity under a dual task condition. TVA is a mathematically formalized theory which has strong relations to the biased competition account of attentional processing. With the Neural Theory of Visual Attention (NTVA), Bundesen et al. (2005) sought to describe single cell data based on TVA, thereby attempting to provide a deeper understanding of how TVA could possibly be explained from a neural standpoint. TVA disentangles processing capacity into a set of distinct parameters determining the efficacy of an individual's visual information uptake. These parameters can be estimated by modeling participants' performance on a simple psychophysical whole report task (e.g., Sperling, 1960). In this task, an array of letter stimuli is briefly presented; TVA proposes that these stimuli are encoded in two distinct processing waves. The first, unselective wave processes the visual information in parallel, allocating evidence values to objects based on the extent to which long-term memory representations match the objects in the display. The second, selective wave distributes limited-capacity attention across the objects, with attentional weighting being allocated based on the evidence values. The objects then race to be encoded in the fixed-capacity visual short-term memory (VSTM), which is typically limited to approximately three to four elements in younger, healthy participants. This VSTM storage capacity is intimately related to the concept of visual working memory capacity, as applied by Luck and Vogel (2013) and proposed to be a central index of overall cognitive ability (however, see Aben et al., 2012 for an opposing view). Only those objects which are encoded into the VSTM store are consciously represented, and are therefore available for further actions, such as verbal report.

Performance in the whole report task is modeled, according to the equations set out by TVA (see Kyllingsbaek, 2006; Habekost, 2015, for a comprehensive overview), by an exponential growth function that relates accuracy of letter report to the effective stimulus exposure duration. The origin, the slope, and the asymptote of this function are determined by three parameter estimates provided by TVA: the perceptual threshold, t0, reflects the time-point at which conscious visual stimulus processing starts; the processing rate C indexes the number of visual elements which can be processed per second; and parameter K estimates the size of the storage capacity of the visual short-term memory, given as the maximum number of elements which can be maintained in parallel. TVA has several advantages in the dual tasking context (see Habekost, 2015, for an overview of the methodological merits of TVA-based measurement). Importantly, to the best of our knowledge, TVA-based testing is the only methodology that permits a mathematically independent quantification of the parameters perceptual threshold, processing speed, and capacity of VSTM. Thus, firstly, it reveals cognitively specific information on which aspect(s) of visual attentional processing is or are affected by the concurrent second task. Secondly, it allows precise measurements of how strongly each parameter is affected. Furthermore, as the TVA whole report paradigm does not rely on motor speed or button presses, the effects of a concurrent manual motor task can be assessed simultaneously, without motor confounds. Finally, by analyzing goodness-of-fit parameters, qualitative comparisons between single- and dual-task performance can be made, giving insights into how the tasks are processed.
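The roles of origin, slope, and asymptote can be visualized with a simplified saturating exponential. Note that this single-curve form is an assumption made for illustration; the full TVA equations referenced above model a race among individual items and are more involved. The function name and the parameter values are hypothetical.

```python
import math

def expected_report(t_ms, t0_ms, C, K):
    """Simplified whole-report curve: expected number of letters reported.

    t_ms  : exposure duration in ms
    t0_ms : perceptual threshold in ms (origin of the curve)
    C     : processing rate in elements per second (initial slope)
    K     : VSTM storage capacity in elements (asymptote)
    """
    if t_ms <= t0_ms:
        return 0.0
    effective_s = (t_ms - t0_ms) / 1000.0
    return K * (1.0 - math.exp(-C * effective_s / K))

# Hypothetical parameter values, for illustration only.
t0, C, K = 20.0, 25.0, 3.5
below_threshold = expected_report(10, t0, C, K)
long_exposure = expected_report(10000, t0, C, K)
initial_slope = expected_report(21, t0, C, K) / 0.001  # elements per second just above t0
```

In this simplified form, a dual-task reduction of K lowers the plateau of the curve while leaving its starting point and initial rise untouched, which is how a selective effect on VSTM capacity would look.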

In a recent study, Künstler et al. (2017) assessed motor-cognitive dual task interference by combining the TVA-based whole report task with a simple motor task (alternating tapping with two fingers of the dominant hand) in middle- to higher-aged individuals. The results revealed a decline of visual attentional capacity under dual task conditions. Importantly, goodness-of-fit and reliability measures in both single and dual task conditions showed that participants performed the visual task in a qualitatively similar (i.e., continuous), although quantitatively less efficient way under dual task as compared to single task conditions. Taken together, the results supported a capacity sharing account of motor-cognitive dual-tasking and suggested that even performing a relatively simple motor task relies on central attentional capacity that is necessary for efficient visual information uptake.

In the present study, we apply this method to analyse the effects of aging on motor-cognitive dual-task performance. We investigate which attentional capacity aspects are disproportionately affected in older compared to younger adults when performing a concurrent motor task consisting of the continuous tapping of a simple sequence. In an additional group of younger participants, the complexity of the tapping sequence was increased. This was done due to the evidence that older subjects require more attention for the execution of simple motor tasks, which younger subjects can perform more or less effortlessly (Boisgontier et al., 2013). That is, we tested the hypothesis that more pronounced effects in the older group are attributable to the motor demand being more challenging for them. Taken together, by quantifying the dual-task decrement in older and younger adults, we firstly want to specify the exact attentional parameters that are more prone to dual-task decline in older compared to younger adults. Secondly, by comparing the dual-task decrements of older adults induced by a simple tapping sequence to the decline induced by a more complex sequence in younger adults, we want to assess whether older adults show the same dual-task effects as younger adults facing a more challenging dual-task scenario.

# METHODS

This study combined a TVA whole report paradigm with a simple or complex continuous tapping task as the secondary task. In order to establish the effect of task load, 30 younger participants completed a simple tapping task condition (referred to as the "younger simple group"), while 30 younger adults performed a more complex tapping sequence as the secondary task (the "younger complex group"). Then, to look at the effects of aging, the performance of the 30 younger adults who executed the simple tapping sequence was compared to the performance of 30 older adults who completed the same task (the "older adults group"). This allowed us to explore the decline in dual-task abilities as a function of age. Lastly, to test whether younger participants experience a qualitatively similar decline in attentional processing under more complex conditions, we compared the performance of the older adults to that of the younger adults who completed the complex tapping task.

#### Participants

We tested a total of 90 participants, split into three groups of 30 participants each, who were recruited at the Department of Psychology, Ludwig-Maximilians-Universität München, in Munich and the Department of Neurology, Jena University Hospital, in Jena, Germany: an older group aged 50–78 years, one younger group aged 19–35 years performing a simple tapping sequence, and another younger group aged 18–34 years performing a complex tapping sequence. All participants had normal or corrected-to-normal vision and no history of neurological or psychiatric disorders. The older participants were tested for signs of beginning dementia (MMSE, all values ≥ 27; and MOCA, all values ≥ 26; Folstein et al., 1975; Nasreddine et al., 2005). Handedness was assessed with the Edinburgh Handedness Inventory (Oldfield, 1971) and vocabulary, as an estimate of crystallized intelligence, with the "Mehrfachwahl-Wortschatz-Test" (MWT-B; Lehrl, 1977). Due to changes in educational and occupational standards over the years, we created a sociodemographic score based on vocabulary, number of school years, and occupation (please see the **Supplementary Material** for a full overview of how this score was constructed). This sociodemographic score indicated that there were no significant differences between the groups. The study was approved by the Ethics Committees of the Jena University Hospital and of the Ludwig-Maximilians-Universität München, and all participants gave written informed consent prior to participation, in accordance with the Declaration of Helsinki. Each participant received monetary remuneration. Relevant demographic data for each group are listed in **Table 1**.

#### Apparatus

In both locations, the data were collected in dimly lit and sound-attenuated rooms so as to minimize distractions. Stimuli were presented on ASUS VG248 17-inch monitors with a refresh rate of 100 Hz and a resolution of 1920 × 1080, at a viewing distance of 60 cm. The tapping task was conducted on external keyboards attached to the computer on which the experiments were run. The height of the screen was adjusted for each participant, such that the center of the screen was directly at eye level. Because of the setup of the apparatus, the keyboard was located below participants' visual periphery. Thus, to visually monitor their tapping performance, participants would have had to move their heads downwards so as to see their hands. Participants were instructed not to look down and to continuously maintain fixation at the center of the screen, and their compliance was monitored by the examiner.

TABLE 1 | Demographic data and sociodemographic score for younger participants who performed the simple or complex tapping sequence and for older participants who performed the simple tapping sequence.

Demographics include gender (number), handedness (number), age, and sociodemographic score.

M, male; f, female; r, right; a, ambidextrous; Mn, Mean; SD, standard deviation.

#### Procedure

All participants completed a single session which lasted around 60 min. Approximately 20 min were spent on questionnaires aimed at obtaining demographic information. The remaining 40 min were allocated to the tapping tasks and the TVA-based whole report, with breaks being taken as needed. The task order was counterbalanced between participants, such that half of all participants began with the two single tasks before proceeding to the dual-task condition, while the other half started with the dual-task condition before completing the two single tasks. In this case, the single tapping task was always performed first.

#### Tapping Task

This task was carried out using the dominant hand to continuously tap a given sequence. The simple sequence consisted of using the index and middle fingers to press the "1" and "2" keys, respectively, while the more complex sequence required the use of the index, middle, ring, and pinky fingers to press the "F4," "F3," "F2," and "F1" keys, respectively (with the keyboard turned upside down to reduce interference from other keys; see **Figure 1** for a diagrammatic representation of these two sequences). The more complex sequence was derived from an unpublished pilot study in which we tested the effects of varying sequence complexities in younger participants. The complex sequence used in the current study was found to be moderately challenging, but manageable for most participants.

The allocated sequence was then tapped at a subjectively preferred pace for a prespecified amount of time. As per the methodology used by Kane and Engle (2000), the single condition of the tapping task consisted of three blocks. The first block spanned 30 s, and was used to familiarize the participant with the sequence to be tapped. If performance on this block was unsatisfactory, the block could be repeated. However, if the performance on the first block was above 80% accuracy, the participant could go on to the second block, which lasted 60 s, during which time the average tapping speed was calculated. In this block, if the wrong key was pressed, auditory feedback in the form of a beep was given to the participant. If this block was performed below 80% accuracy, it could be repeated. However, if performance was satisfactory, the participant could proceed to the third block. Here, a buffer of 150 ms was added to the average tapping speed calculated in the second block, and the result was used as the cut-off speed for the third block. Thus, if a participant took longer than this cut-off to press a key, or if the wrong key was pressed, a beep was again used as auditory feedback. This final block lasted 3 min, as this time-frame is equivalent to the average duration of a block in the whole report task. According to experience from a previous study (Künstler et al., 2017), it was also a reasonable duration which should not lead to discomfort or hand cramps for the participants. A text file was created which recorded the time stamps and tapping speed for each key press, along with information about which key was pressed. This information allowed the post-hoc calculation of each participant's speed and accuracy, and also allowed the time-stamps to be compared between tasks in the dual-tasking condition. The average tapping accuracy and standard deviations for all groups and conditions can be found in **Table 2**<sup>1</sup>.
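The cut-off rule for the third block can be sketched as follows. The sketch assumes that "average tapping speed" refers to the mean interval between successive key presses in milliseconds, and that accuracy is scored against the cyclically repeating target sequence; both are interpretations rather than details confirmed by the text, and all function names are hypothetical.

```python
from statistics import fmean

BUFFER_MS = 150  # buffer added to the block-2 average (value from the text)

def inter_tap_intervals(timestamps_ms):
    """Intervals between successive key presses."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]

def cutoff_from_block2(timestamps_ms):
    """Block-3 cut-off: mean inter-tap interval in block 2 plus the buffer."""
    return fmean(inter_tap_intervals(timestamps_ms)) + BUFFER_MS

def needs_beep(interval_ms, pressed, expected, cutoff_ms):
    """Block-3 feedback: beep on a wrong key or a tap slower than the cut-off."""
    return pressed != expected or interval_ms > cutoff_ms

def accuracy(pressed_keys, sequence):
    """Share of key presses that match the cyclic target sequence (assumed scoring)."""
    hits = sum(k == sequence[i % len(sequence)] for i, k in enumerate(pressed_keys))
    return hits / len(pressed_keys)

# Toy block-2 time stamps (ms) and key presses for the simple sequence.
cutoff = cutoff_from_block2([0, 400, 820, 1200, 1600])
acc = accuracy(["1", "2", "1", "1", "1", "2"], ["1", "2"])
```

Because the cut-off is derived from each participant's own preferred pace, the feedback in the third block penalizes slowing relative to that individual baseline rather than relative to a fixed speed.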

#### Whole Report Task

This task was run in Matlab<sup>2</sup>, using Psychtoolbox (Brainard, 1997). The experiment consisted of a total of 140 trials. At the start of each trial, a fixation point was displayed in the center of the screen for 1,000 ms. Subsequently, six isoluminant letters appeared around the fixation point, displayed equidistantly in an invisible circle. These letters were drawn at random from a predefined set of letters (all letters of the alphabet, excluding I, Q, and Y), with the size being set to 1.5 by 1.5 cm. These letters were either all blue [color space: CIE L\*a\*b\*, blue = (17.95; 45.15; −67.08)] or all red [CIE L\*a\*b\*, red = (28.51; 46.06; 41.28)], with a luminance of 0.49 cd/m<sup>2</sup>. In 40 trials, the stimuli were masked. Once the screen went blank, participants were tasked with verbally reporting as many of the observed letters as possible. The task was unspeeded, allowing each participant as much time as necessary. The responses were then typed in by the researcher, who was seated behind the participant, before going on to the next trial. The timestamps of the responses, as well as the responses made, and the correct responses were exported to a text file. Following each block, participants received accuracy feedback on-screen, indicating what percentage of the letters actually reported was correct. Performance between 70 and 90% was seen as optimal. If the accuracy rate dropped below 70%, participants were asked to be more conservative in their answers. If their accuracy was above 90%, participants were asked to try reporting more letters. A diagrammatic representation of a trial sequence can be found in **Figure 2**. The mean accuracy for this criterion in the single and dual task conditions was 87.6 (SD = 4.7) and 86.4 (SD = 4.2) for the older group, 86.5 (SD = 6.6) and 85.8 (SD = 6.4) for the younger simple group, and 87.5 (SD = 5.8) and 85.1 (SD = 5.6) for the younger complex group.
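The block-wise accuracy feedback can be illustrated with a short sketch. The function name, the message strings, and the handling of a block in which no letters were reported are assumptions for illustration; only the 70% and 90% thresholds come from the description above.

```python
def block_feedback(reported_correct, reported_total):
    """Classify block accuracy as the share of reported letters that were correct."""
    if reported_total == 0:
        return "no letters reported"  # case not described in the text (assumption)
    accuracy = 100.0 * reported_correct / reported_total
    if accuracy < 70:
        return "be more conservative"
    if accuracy > 90:
        return "try reporting more letters"
    return "optimal"
```

Note that accuracy here is computed relative to the letters a participant actually reported, not relative to the six letters displayed, so a participant who reports few letters but gets them all right would be encouraged to report more.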

<sup>1</sup>For this study, we only analyzed tapping accuracy as a measure for effects of the dual task situation on the motor task. For the interested reader, average tapping speed and standard deviations as well as individual values and the distribution of tapping speed can be found in the **Supplementary Materials in Tables 1, 4 and Figures 5–7**.

<sup>2</sup>MATLAB and Statistics Toolbox Release. (2012). The MathWorks, Inc., Natick.

TABLE 2 | Tapping accuracy and TVA parameter values across all conditions and groups.

Mn, Mean; SD, standard deviation; N, sample size; WR, Whole Report; ED, exposure duration.

Initially, the task instructions were displayed on-screen, followed by two examples. Subsequently, a pretest, consisting of 12 triples of trials, was run over the course of four blocks. This served to familiarize the participants with the task, as well as to individually adjust the exposure duration to each participant through the use of a Bayesian adaptive staircase model. Two of the trials in each triple were not used for adjustment; one was unmasked with an exposure duration of 200 ms, while the other was masked and presented for 250 ms. This long exposure duration was only used to familiarize the participant with the task; in the experiment itself, shorter, individually adjusted exposure durations were used. Only one trial in each triple was critical for exposure adjustment; this was masked and initially displayed for 100 ms. If at least one letter in such a critical trial was reported correctly, the exposure duration was decreased by 10 ms in the following critical trial. This was repeated until a final exposure duration was identified at which the participant was unable to report any letter correctly. This was then taken to be the lowest exposure duration, and was used together with four other pre-set exposure durations, which were picked based on the lowest, individually adjusted exposure duration. Stimuli in five conditions, using the different exposure durations, were masked. These masks, which comprised a red/blue mesh of overlapping flecks, were 2 by 2 cm in size, and covered the stimuli for 500 ms. They were used to avoid visual persistence effects, as visual information in unmasked trials typically persists for several hundred milliseconds (Sperling, 1960; Dick, 1974). In addition to these five masked conditions, two unmasked conditions were used, using the second shortest and the longest exposure duration, giving rise to a total of seven effective exposure duration conditions. Such a broad spectrum of exposure durations is necessary to measure a wide range of performance, allowing for the estimation of different parameters. For example, t0, the perceptual threshold, is calculated based on performance changes at lower exposure durations close to the minimum individual effective exposure duration. Exact quantification of t0 is in turn needed to determine the rate of information uptake at t0, indexed by parameter C. However, the computation of the VSTM storage capacity, which is demarcated by the asymptote of performance or parameter K, requires higher exposure durations.
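The step-down rule for the critical pretest trials can be sketched as below. This follows only the fixed 10 ms decrement described in the text; it does not implement the Bayesian adaptive component, and the function name, the callable standing in for a participant, and the lower bound `floor_ms` are hypothetical.

```python
def find_lowest_exposure(n_correct_at, start_ms=100, step_ms=10, floor_ms=10):
    """Step the exposure duration down until no letter is reported correctly.

    n_correct_at: callable mapping an exposure duration (ms) to the number of
    letters reported correctly in a critical trial (stand-in for a participant).
    floor_ms: safety bound so the loop always terminates (assumption).
    """
    exposure = start_ms
    while exposure > floor_ms and n_correct_at(exposure) >= 1:
        exposure -= step_ms
    return exposure


# Hypothetical participant who can report a letter at 40 ms or longer
lowest = find_lowest_exposure(lambda ms: 1 if ms >= 40 else 0)
```

For this simulated participant the procedure stops at 30 ms, the first duration at which no letter is reported correctly.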
For each of the seven effective exposure conditions, 20 trials were included in the study, resulting in a total of 140 trials, divided into four experimental blocks. The obtained data could then be further analyzed through the LibTVA script (Dyrholm, 2012) in Matlab<sup>2</sup>, which calculated a maximum likelihood fit for the data, according to the principles of TVA. This was done for each participant and used the observed data to estimate probabilistic parameters, based on the fixed capacity independent race model (see Shibuya and Bundesen, 1988). Our model had eight degrees of freedom: five for parameter K and one each for parameters C, t0, and µ ("iconic memory buffer," of no particular interest to this study). The average minimum and maximum exposure durations for each group and condition can be found in **Table 2**.
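For orientation, the roles of the three parameters can be written in the exponential form commonly used in TVA-based whole report analyses. This is a simplified sketch and not the exact likelihood fitted by LibTVA; the symbols n (number of letters displayed, here six) and v (processing rate per letter), and the assumption of equal attentional weights across letters, are introduced here for illustration only.

```latex
% Probability that a single letter is encoded by effective exposure duration \tau
p(\tau) = 1 - \exp\!\bigl[-v\,(\tau - t_0)\bigr] \quad (\tau > t_0),
\qquad v = \frac{C}{n}
% The expected score n\,p(\tau) has initial slope n v = C at \tau = t_0,
% and the number of letters that can be retained is capped at K.
```

Under these assumptions, short exposure durations near t0 constrain t0 and C, while long exposure durations, at which the score levels off, constrain K.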

#### Dual-Task

In this condition, participants completed the whole report task while simultaneously and continuously tapping. Participants initially performed the familiarization and speed adjustment blocks of the tapping task, after which the whole report paradigm was started. Both tasks were then executed simultaneously, while participants' gaze remained fixated on the center of the screen. The timestamps of the data points of both tasks were compared. If the participant made a mistake in the tapping task, then the corresponding trial in the whole report task was excluded from the analysis. This was done in order to examine attentional parameters only in those trials where the tapping was successfully executed. On average, 5.7 (SD = 6.9) trials were excluded in the older simple group, 3.1 (SD = 4.3) trials were excluded in the younger simple group and 9.0 (SD = 7.2) trials were excluded in the younger complex group. **Supplementary Table 4** shows how the exclusion of trials affected Goodness-of-Fit values.
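The timestamp-based exclusion can be sketched as follows. The text does not specify how a tapping error is matched to its "corresponding" whole report trial, so the time-window definition, the data layout, and the function name below are assumptions.

```python
def trials_to_keep(trial_windows, tap_events):
    """Return indices of whole report trials that contain no tapping error.

    trial_windows: list of (start_s, end_s) tuples, one per whole report trial.
    tap_events: list of (time_s, is_error) tuples from the tapping log.
    """
    error_times = [t for t, is_error in tap_events if is_error]
    return [
        i
        for i, (start, end) in enumerate(trial_windows)
        if not any(start <= t <= end for t in error_times)
    ]


# Hypothetical logs: the second trial overlaps with a tapping error at 3.0 s
windows = [(0.0, 2.0), (2.5, 4.5), (5.0, 7.0)]
taps = [(0.5, False), (3.0, True), (6.0, False)]
kept = trials_to_keep(windows, taps)
```

Only the trials listed in `kept` would then enter the TVA model fit for the dual task condition.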

#### Goodness of Fit

As the whole report results were obtained through a mathematical model, we wanted to ensure that the observed data was closely mirrored by the estimated parameters. To this end, we did a Goodness of Fit analysis. These Goodness of Fit values give an indication of how much of the variance of the empirically observed data is explained by the model estimates provided by TVA. Thus, the higher the explained variance, the more closely the parameter estimates match the actual data obtained.

Furthermore, these Goodness of Fit results also provided an estimation of how robust these estimates were between the single and dual task conditions. More precisely, TVA posits that the processes indexed by the parameter estimates remain stable across comparable conditions. Violations of this assumption, e.g., due to the switching between tasks, would be expected to result in a lower Goodness of Fit in the dual task condition.
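One way to express such a goodness-of-fit value is the squared Pearson correlation between the observed mean scores and the scores predicted by the fitted parameters. The sketch below assumes this definition and uses hypothetical function names; it is not the routine used by LibTVA.

```python
from math import sqrt


def pearson_r(observed, predicted):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


def explained_variance(observed, predicted):
    """Share of variance in the observed scores accounted for by the predictions."""
    return pearson_r(observed, predicted) ** 2
```

With seven exposure conditions per participant, `observed` and `predicted` would each hold seven mean scores, and the function would be applied separately to the single and dual task conditions.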

# RESULTS

The accuracy of the letter whole report was modeled as a function of effective exposure duration for each participant and task condition (single whole report task condition, dual task condition), from which parameters K (VSTM storage capacity in number of objects), C (visual processing speed in objects/s) and t0<sup>3</sup> (visual threshold in ms) were derived. For the tapping task, overall accuracy was computed for each task condition (single tapping task condition, dual task condition). The means and standard deviations of these parameter estimates are given for each group in **Table 2**.

We computed separate repeated-measures ANOVAs for tapping accuracy and TVA parameters. For comparison of older participants performing the simple tapping sequence to either younger participants performing the simple tapping sequence or younger participants performing the complex tapping sequence we included the factors Age Group (older vs. younger) and Task Condition (single task vs. dual task). Three tapping accuracy values were missing (one from each group) due to technical errors. For the sake of interest, several further analyses can be found in the **Supplementary Materials**, including a comparison between the two younger groups. Furthermore, for individual values of TVA parameters and tapping accuracy see **Supplementary Table 4**, while the individual variability in TVA parameter K is provided in **Supplementary Figures 2–4**.

<sup>3</sup>Possibly due to subjects' inappropriate guessing during letter report, or to inefficient masking, TVA-based modeling provided negative t0 values in multiple cases. We handled this problem by calculating our analyses in two alternative ways: first, based on the model fit providing negative t0 values; second, based on a model fit constraining the minimum t0 value to zero. Both analyses generally revealed the same effects and group interactions. The data are provided in the **Supplementary Materials in Tables 2, 3 and 5**.

# Older Group Performing the Simple Tapping Sequence vs. Younger Group Performing the Simple Tapping Sequence

To look for age effects on tapping accuracy and TVA parameters in a dual task situation a comparison was run between older and younger participants who both performed the simple tapping sequence.

#### Tapping

For tapping accuracy (see **Table 2**), we found a significant main effect of Age Group [F(1, 56) = 7.06, p = 0.01; η<sub>p</sub><sup>2</sup> = 0.11]. The main effect of Task Condition [F(1, 56) = 1.56, p = 0.22; η<sub>p</sub><sup>2</sup> = 0.03] and the interaction [F(1, 56) = 2.06, p = 0.16; η<sub>p</sub><sup>2</sup> = 0.04] were not significant. Thus, younger and older participants differed in their general tapping accuracy, but neither group's tapping accuracy was affected by the concurrent visual task. Results are depicted in **Figure 3**.
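The partial eta squared values reported in this section can be recovered from the F statistics and their degrees of freedom using the standard identity. The sketch below is offered as a reader's check and is not part of the original analysis; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Age Group effect on tapping accuracy, F(1, 56) = 7.06
age_effect = round(partial_eta_squared(7.06, 1, 56), 2)
```

Applied to the three tapping accuracy effects above, this reproduces the reported values of 0.11, 0.03, and 0.04.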

#### Whole Report

For VSTM storage capacity K (see **Table 2**), we found significant main effects of Age Group [F(1, 58) = 19.91, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.26] and Task Condition [F(1, 58) = 17.05, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.23], and a significant interaction [F(1, 58) = 10.01, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.15; see **Figure 4**]. Post-hoc pairwise t-tests with Bonferroni correction demonstrated that there was a significant decline in VSTM storage capacity in the older group induced by the tapping [t(29) = 4.49, p < 0.001, d = 0.52], while, as described before, the younger group performing the same, simple tapping sequence did not show this effect [t(29) = 0.83, p = 0.41, d = 0.06].

For processing speed C (see **Table 2**), no significant main effect of Age Group was found [F(1, 58) = 0.76, p = 0.39; η<sub>p</sub><sup>2</sup> = 0.01]. There was a trend toward an effect of Task Condition, indicating lower performance in the dual-task compared to the single-task condition across groups [F(1, 58) = 3.37, p = 0.07; η<sub>p</sub><sup>2</sup> = 0.06]. The interaction was not significant [F(1, 58) = 0.002, p = 0.97; η<sub>p</sub><sup>2</sup> < 0.0014]. Thus, there was no indication for a general age effect or for an increased dual task effect with increased age.

Similar effects as for processing speed were also found for the perceptual threshold parameter t0 (see **Table 2**). There was only a significant effect for Age Group [F(1, 58) = 20.09, p < 0.001; η<sub>p</sub><sup>2</sup> = 0.26], while the main effect for Task Condition [F(1, 58) = 0.06, p = 0.81; η<sub>p</sub><sup>2</sup> = 0.001] and the interaction [F(1, 58) = 0.27, p = 0.60; η<sub>p</sub><sup>2</sup> = 0.005] were not significant. Thus, significantly higher thresholds for older compared to younger adults were found in both task conditions, while there was no evidence for an age-specific dual task decrement for visual threshold t0.

# Older Group Performing the Simple Tapping Sequence vs. Younger Group Performing the Complex Tapping Sequence

Older participants' performance was also compared to that of the younger participants who completed the complex tapping sequence to see whether younger participants would show comparable effects as older participants under a more challenging dual-task condition.

#### Tapping

No significant main effect of Age Group [F(1, 56) = 0.79, p = 0.38; η<sub>p</sub><sup>2</sup> = 0.01] or Task Condition [F(1, 56) = 0.99, p = 0.33; η<sub>p</sub><sup>2</sup> = 0.02] was found on tapping performance. The interaction [F(1, 56) = 1.05, p = 0.31; η<sub>p</sub><sup>2</sup> = 0.02] was also not significant. Thus, neither older participants nor younger adults performing a complex tapping sequence showed dual-task effects on motor performance induced by an additional visual attention task (see **Table 2**, **Figure 5**).

#### Whole Report

For VSTM storage capacity K (see **Table 2**), we found significant main effects of Age Group [F(1, 58) = 15.69, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.21] and Task Condition [F(1, 58) = 35.87, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.38], but no significant interaction [F(1, 58) = 0.17, p = 0.68, η<sub>p</sub><sup>2</sup> = 0.003]. Thus, the older group showed a general reduction compared to the younger one in VSTM storage capacity K, and, across groups, dual task effects occurred. However, no indication was found for an enhanced dual task effect in VSTM storage capacity in the older group when a younger group had to perform a more challenging motor task. **Figure 6** shows comparable reductions of VSTM storage capacity K for both age groups.

For parameter visual processing speed C (see **Table 2**), we did not find any significant effects [Age Group: F(1, 58) = 0.03, p = 0.88; η<sub>p</sub><sup>2</sup> < 0.001; Task Condition: F(1, 58) = 1.94, p = 0.17; η<sub>p</sub><sup>2</sup> = 0.03; Interaction: F(1, 58) = 0.48, p = 0.49; η<sub>p</sub><sup>2</sup> = 0.008]. Thus, older and younger participants did not differ in visual processing speed, and none of the groups were affected by the secondary task.

We found a significant main effect for Age Group for visual threshold t0 (see **Table 2**) [F(1, 58) = 17.42, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.23], but no other significant effects [Task Condition: F(1, 58) = 0.18, p = 0.68; η<sub>p</sub><sup>2</sup> = 0.003; Interaction: F(1, 58) = 0.49, p = 0.49; η<sub>p</sub><sup>2</sup> = 0.008]. The visual threshold was significantly higher in the older group compared to the younger group performing the complex tapping sequence, but there were no indications for a difference in t0 between the single and dual task conditions in the younger or older groups.

# Goodness of Fit

To test to what degree the empirical data obtained in the different experimental whole report conditions was explained by the TVA-based modeling, Goodness-of-fit measures were obtained. They showed that there was a close correspondence between the empirical data (mean accuracy scores) obtained in the different experimental conditions of the whole report and the values that would be predicted based on the TVA parameter estimates. The average Pearson product-moment correlation coefficients are listed in **Table 3**. They show for each participant group, and very similarly in single and dual task conditions, that at least 96% of the variance in the observed data is explained by the TVA model parameters. Across all participants, the model explained at least 89% of the variance. For individual Goodness-of-fit measures see **Table 4** in the Supplementary Materials.

# DISCUSSION

This study was aimed at specifying which aspects of visual attention capacity are disproportionately affected in elderly individuals in motor-cognitive dual task situations. To that end, we investigated the influence of a concurrent tapping task on the performance of a visual attention task (whole report) in older and younger participants, whilst additionally modulating the difficulty of the motor task performed by the younger adults. TVA model-based fitting of whole report performance provided estimates of separate visual attention capacity parameters.

When older participants performed a simple tapping task concurrently with the visual attention task, their VSTM storage capacity declined. However, when younger participants performed the same simple tapping sequence under dual task conditions, attention capacity did not show any significant decrement. In another group of younger participants performing a more challenging tapping task under dual task conditions, VSTM storage capacity declined significantly as well. Tapping accuracy, although generally lower in the older group than in the younger group performing the simple tapping task, remained unaffected by the load incurred by the dual task.

A comparison between the older participants performing the simple tapping, and the younger participants performing the complex tapping task, revealed that the effect of an additional tapping task on VSTM storage capacity was equally pronounced in both groups, although older adults, overall, had lower VSTM storage capacity than younger participants.

Similar to McAvinue et al. (2012) we found that older participants had a lower VSTM storage capacity, a higher visual threshold and—at least numerically—a lower perceptual processing speed than younger participants. These results are typical of older adults with normal or corrected-to-normal eyesight (see also Habekost et al., 2013; Espeseth et al., 2014). The fact that we did not see significant differences in perceptual processing speed seems to be driven by high standard deviations.


Taken together, these results shed considerable light on the nature of motor-cognitive dual task interference: Firstly, concurrent performance of a motor task seems to affect visual attention capacity quite selectively by way of reducing VSTM storage capacity. It was especially the number of items that could be maintained within VSTM that declined under dual task conditions. This was true both for older subjects performing the simple tapping, and for younger subjects performing the more complex tapping task. The remaining parameters obtained from TVA-based fitting were not significantly affected. That is, the perceptual threshold and the visual processing rate did not decline under dual-task compared to single-task conditions in any age group.

Secondly, the effect of the motor task on VSTM storage capacity appears to be more pronounced in older participants. Whilst the simple tapping sequence put only a minor demand on younger participants, this same task caused considerable dual task effects in the older adults. The VSTM decrement found in these older participants more or less equaled the decline revealed in younger adults performing the more complex tapping task. The aging effect thus seems to reflect the fact that a simple motor task is more challenging for older participants. In other words, even a simple motor program consisting of a sequence of concurrent finger tapping significantly decreased VSTM storage capacity in older adults, an effect which was only present to the same extent in younger adults when they performed a more complex motor task. Overall, the results of this study support capacity sharing accounts of dual tasking (e.g., Navon and Miller, 2002), implicating the VSTM storage capacity as being the limiting attentional capacity which is shared across the two tasks. Thus, as long as the capacity limits of the VSTM are not reached, the performance of both tasks remains unaffected. However, when the task demands exceed the limits of this capacity, such as when the task demands are increased, then the performance on the tasks is reduced.

TABLE 3 | Correlations between observed and modeled data: Goodness-of-Fit values (Pearson product-moment correlation r) for single and dual-task conditions for all three groups.

Mn: Mean; SD: standard deviation

In sum, our results show that the age-specific effects of motor-cognitive dual task interference are based on a stronger decline of VSTM storage capacity.

Our results are largely consistent with recent data presented by Künstler et al. (2017), who used the same method in a group of middle-aged to older subjects and combined the whole report task with the simple tapping task. In that study, a decrement of both VSTM storage capacity and processing rate was found under dual task conditions. The effect was more pronounced for VSTM, however, and a direct investigation of which parameter more strongly reflects the dual task related decline was not possible in that study. In line with these results, we found a clear decline of VSTM storage capacity in older subjects and in younger subjects performing a more complex tapping task, while the effects on processing rate were much weaker and non-significant. Moreover, we were able to show that the age-related decline of attention capacity under motor-cognitive dual-task conditions is selectively reflected by VSTM storage capacity (parameter K).

An important result of the Künstler et al. (2017) study was the demonstration that performance of the whole report task, which was used to assess visual attention capacity, was qualitatively comparable under single and dual task conditions. This was shown, for instance, by the fact that goodness-of-fit measures were comparable under both conditions. In this way, the valid applicability of the TVA model, which assumes that parameter estimates remain constant across the task, was supported under both single and dual task conditions. Consequently, a conjecture that the whole report task would be performed in a non-continuous manner under dual task conditions (for example, by switching attention between the two tasks) was not supported. Analogously, comparable goodness-of-fit measures across the single and dual task conditions were obtained in the present study. Together with the high correlations between the observed and the predicted data, this corroborates that participants performed both tasks simultaneously and continuously. Thus, in congruence with the previous study, we suggest that both tasks were executed simultaneously and in a qualitatively similar, although quantitatively less efficient, way under the dual task as compared to the single task condition.

The results of the present study are in line with earlier studies showing that motor-cognitive dual task interference is increased in aging (Kramer and Larish, 1996; Verhaeghen et al., 2002, 2003; Woollacott and Shumway-Cook, 2002; Boisgontier et al., 2013; Schaefer, 2014). They are also congruent with other studies which have indicated that increased task demands are linked with decreased spatial awareness during dual tasking (Lisi et al., 2015).

However, by referring to an explicit theoretical framework modeling attentional processing capacity, it was possible for the present study to specifically attribute the capacity limitation to the constraint in VSTM storage capacity.

To explain these findings, in the previous study (Künstler et al., 2017) we proposed that, when it comes to dual task situations, the VSTM represents a stage of response selection, at which verbal output is required in the whole report task, whilst simultaneously preparing the finger movement output for the tapping task. A similar view was proposed by Klapp (1976) who considered short-term memory as a stage of motor-response programming where response commands are temporarily stored. Under motor-cognitive dual-task conditions, when several response commands have to be maintained in parallel, the probability of interference at this stage is increased by cross-talk effects, resulting in a performance decline. Due to the fact that aging is associated with an overall decline of VSTM storage capacity, the reliability of maintained representations would be reduced in this group, giving rise to an even higher probability of interference (Jonides et al., 2008).

Of course, these assumptions are speculative and need to be investigated in future studies. However, they are in line with both a resource sharing perspective on short-term memory (Franconeri et al., 2013), as well as with the view that processing capacity limitations are mainly dependent on interference control and inhibition (Kane and Engle, 2002), which appears to be significantly reduced in older subjects (Mccabe et al., 2005).

It could be argued that our results might best be accounted for within Baddeley's multicomponent working memory model (see Baddeley, 2012, for a recent review). According to this view, motor-kinetic information from the finger tapping task and visual information from the whole report task would both be represented within the same slave system, namely the visuospatial sketch pad (VSSP). Doing both tasks in parallel would, therefore, increase the load on the VSSP compared to when each of the tasks is performed separately. A possible decrease in VSSP capacity during aging (e.g., Kessels et al., 2010) would then mean that older participants have a higher load on modality specific resources than younger participants, while a more complex tapping pattern would mean a higher load even for younger participants. We consider such an explanation less likely, for the following reasons. First, there is of course strong evidence that observed kinesthetic movement information is represented within the viewer's sketchpad [Baddeley (2012) mentions gestures and dance as examples]. However, whether this is also true for motor programs representing sequential finger movements that are not directly observed remains equivocal. Moreover, Logie's seminal work (Logie, 1995) has shown that the VSSP itself can be subdivided into a visual and a spatial subsystem, with movement related information only tapping into the latter. This would be inconsistent with the assumption of a modality specific interference within the VSSP. In line with this assumption, recent ERP data of Katus and Eimer (2018) imply that tactile and visual working memory representations are distinct, i.e., modality-specific, and are not transferable across different sensory modalities.

In conclusion, our results indicate that tasks are processed in parallel under conditions of motor-cognitive dual tasking, and that VSTM storage capacity is a core function involved in the dual task decrement, which is particularly exacerbated during aging. Whilst younger adults only show difficulties when the complexity of the secondary task is increased, older adults already show qualitatively similar decrements in VSTM capacity when performing a simple secondary motor task.

#### AUTHOR CONTRIBUTIONS

EK, MP, HM, KF, and PB contributed to the design of the study. NN contributed the necessary programming of the experiments used in this study. EK and MP collected the data, analyzed the results, and wrote the manuscript. KF and PB both supervised the data analysis and the writing of the manuscript. OW, CK, PB, and KF contributed to the data discussion. OW, PB, and CK contributed to the funding application. EK and MP contributed equally as first authors, whilst PB and KF both contributed equally as senior authors.

#### ACKNOWLEDGMENTS

This research was supported by a Grant within the Priority Program, SPP1772 from the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG), Grant number SPP 1772/1—BU 1327/4-1.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01564/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Künstler, Penning, Napiórkowski, Klingner, Witte, Müller, Bublak and Finke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Selection Processing in Noun and Verb Production in Left- and Right-Sided Parkinson's Disease Patients

Sonia Di Tella<sup>1</sup>\*, Francesca Baglio<sup>1</sup>, Monia Cabinio<sup>1</sup>, Raffaello Nemni<sup>1,2</sup>, Daniela Traficante<sup>3</sup> and Maria C. Silveri<sup>3</sup>

<sup>1</sup> IRCCS, Fondazione don Carlo Gnocchi ONLUS, Milan, Italy, <sup>2</sup> Department of Pathophysiology and Transplantation, Università degli Studi di Milano, Milan, Italy, <sup>3</sup> Department of Psychology, Università Cattolica del Sacro Cuore, Milan, Italy

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Roberta Biundo, IRCCS Fondazione Ospedale San Camillo, Italy Fabrizio Piras, Fondazione Santa Lucia (IRCCS), Italy Nikoletta Szabó, University of Szeged, Hungary

> \*Correspondence: Sonia Di Tella sditella@dongnocchi.it

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 07 February 2018 Accepted: 28 June 2018 Published: 20 July 2018

#### Citation:

Di Tella S, Baglio F, Cabinio M, Nemni R, Traficante D and Silveri MC (2018) Selection Processing in Noun and Verb Production in Left- and Right-Sided Parkinson's Disease Patients. Front. Psychol. 9:1241. doi: 10.3389/fpsyg.2018.01241

Verbs are more difficult to produce than nouns. Thus, if executive resources are reduced as in Parkinson's disease (PD), verbs are penalized compared to nouns. However, in an experimental condition in which it is the noun that must be selected from a larger number of alternatives compared to the verb, it is the noun production that becomes slower and more prone to errors. Indeed, patients are slower and less accurate than normal subjects when required to produce nouns from verbs (VN) in a morphology derivation task (e.g., "osservazione" from "osservare") ["observation" from "observe"] than verbs from nouns in a morphology generation task, in which only a verb can be generated from the noun (NV) (e.g., "fallire" from "fallimento") ["to fail" from "failure"]. In the Italian language morphology, in fact, generation and derivation tasks differ in the number of lexical entries among which the response must be selected. The left Inferior Frontal Gyrus (IFG) has been demonstrated to be involved in selection processes. In the present study, we explored if the ability to select words is related to the cortical thickness of the left IFG. Twelve right-sided PD patients with nigrostriatal hypofunctionality in the left hemisphere (RPD-LH), 9 left-sided PD patients with nigrostriatal hypofunctionality in the right hemisphere (LPD-RH) and 19 healthy controls (HC) took part in the study. NV and VN production tasks were administered; accuracy and reaction times (RTs) were collected. All 40 subjects received a structural MRI examination. Cortical thickness of the IFG and volumetric measurements for subcortical regions, thought to support selection processes, were computed using FreeSurfer.

In VN derivation tasks RPD-LH patients were less accurate than LPD-RH patients (accuracy: 66% vs. 77%). No difference emerged among the three groups in RTs. Task accuracy/RTs and IFG thickness showed a significant correlation only in RPD-LH. Not only nouns (as expected) but also verbs were correlated with cortical thickness. This suggests that the linguistic nature of the stimuli along with executive resources are both relevant during word selection processes. Our data confirm that executive resources and language interact in the left IFG in word production tasks.

Keywords: Parkinson's disease (PD), executive functions, word production, magnetic resonance imaging, Inferior Frontal Gyrus (IFG), Broca's area, brain thickness

# INTRODUCTION

The disproportionate impairment in the production of verbs compared to nouns, frequently documented in patients with Parkinson's disease (PD) (Silveri et al., 2018), can be considered an expression of the dysexecutive syndrome, the typical manifestation of PD patients' cognitive decline (Dubois and Pillon, 1996; Roussel et al., 2017). The deficit for verbs in a pathology typically dominated by disorders of movement has also been attributed to the decay of the conceptual representation of the motor components of the action (Boulenger et al., 2008; Rodríguez-Ferreiro et al., 2009; Cardona et al., 2014), according to an embodied view of cognition (Barsalou, 1999).

For a series of reasons, verbs are, among the grammatical classes, more difficult to produce than nouns (Mätzig et al., 2009). Thus, in the presence of reduced executive resources (such as in extrapyramidal pathologies), the verb class is more penalized than the noun class (Cotelli et al., 2006; Silveri et al., 2012). For example, during word production, verb forms must be selected from a larger set of word forms with the same verb root compared to nouns. Consequently, the selection process requires the recruitment of a larger amount of attentional resources. However, in an experimental condition in which it is the noun that must be selected from a larger number of alternatives compared to the verb, noun production becomes slower and more prone to errors, and thus more difficult. Indeed, in the study by Silveri et al. (2018), PD patients were slower and less accurate than healthy controls (HC) when required to produce nouns from verbs in a morphology derivation task (e.g., "osservazione" from "osservare") ["observation" from "to observe"] than verbs from nouns in a morphology generation task, in which the verb must be generated from the noun (e.g., "fallire" from "fallimento") ["to fail" from "failure"]. In Italian morphology, in fact, generation and derivation tasks differ in the number of lexical entries among which the response must be selected (see also Marangolo et al., 2006). For example, when the noun "cammino" [walk] must be produced (derived) from the verb "camminare" [to walk], the choice is from six alternative nouns ("cammino" [walk], "camminata" [walk], "camminamento" [route], "camminante" [walking], "camminatrice" [walker, female], "camminatore" [walker, male]); but, when the verb base "camminare" [to walk] must be produced (generated) from the derived noun "cammino" [walk], there is virtually only one choice.
In addition, in the above cited study, frequency of target competitors and number of competitors more frequent than the target were also related to response latency and accuracy in word production tasks. These findings suggest that not only the selection process but also controlled retrieval is impaired in PD, as proposed by Crescentini et al. (2008).

Specifically, in the previous study (Silveri et al., 2018) PD patients were impaired not only when they had to select the target from many competitors, but also in the presence of competitors more frequent than the target. An efficient control of the lexical retrieval should inhibit their production, but inhibitory processes are impaired in PD (Castner et al., 2008) and competitors more frequent than the target were produced in the place of the target.

Results such as those described above would be consistent not only with a lack of inhibition of the irrelevant information represented by the competitors, but also with a lack of top-down potentiation of the relevant information, a potentiation supported by the basal ganglia (Norman and Shallice, 1986; Egner and Hirsch, 2005). Evidence from PD suggests that the basal ganglia are mostly active in post-retrieval processes, such as the inhibition of task-irrelevant information (Crescentini et al., 2010), and also in the top-down potentiation of relevant information (Norman and Shallice, 1986; Egner and Hirsch, 2005).

Moreover, neuroimaging studies in normal subjects have suggested that the left Inferior Frontal Gyrus (IFG) (or Ventrolateral Prefrontal Cortex-VLPC) is crucial in the selection processes (Moss et al., 2005; Siri et al., 2008) and that the IFG of the left hemisphere (LH) is involved in both selection processes and controlled retrieval (Badre et al., 2005; Moss et al., 2005; Thompson-Schill and Botvinick, 2006). Selection process and controlled retrieval are dissociable processes and, as such, potentially supported by different neural substrates within the IFG (Badre et al., 2005). For example, the IFG pars triangularis (Brodmann area-BA 45) supports the former whereas the anterior portion-pars orbitalis (BA 47) the latter. Consequently, the experimental evidence from both normal subjects and PD converges in suggesting that selection processes are supported by neural substrates that include both the IFG and subcortical regions (Mostofsky and Simmonds, 2008). IFG and the dorsal striatum (caudate nucleus and putamen) are included within the corticostriatal pathway, a key circuit associated with executive deficits in nondemented PD and motor inhibition (Owen, 2004; Baglio et al., 2011). The lack of inhibition is a characterizing aspect of PD dysexecutive syndrome (Obeso et al., 2011) which reaches the most striking clinical evidence in impulsive behavior (Napier et al., 2015). According to literature data, not only the VLPC (BA 45 and 47) but also the mid-dorsolateral frontal cortex (BA 46 and 9) are linked with different aspects of executive processing, and are crucial to optimize performance in a variety of executive tasks (Alexander et al., 1986; Owen, 2004).

Idiopathic PD generally presents, in the initial phase, as an asymmetric clinical syndrome with right or left predominance, which correlates with neuronal loss in the deep gray matter of the contralateral hemisphere (Kempster et al., 1989; Lee et al., 2014; Tanner et al., 2017), although not systematically (Nemmi et al., 2015). Only a few studies have looked specifically for structural cortical asymmetries in PD, suggesting that atrophy could start on one side (Zarei et al., 2013) and then extend to the other (Pereira et al., 2014; Tanner et al., 2017). Although the evidence of structural asymmetries extending to the cortical regions is for now weak, right-sided (RPD) and left-sided PD (LPD) are thought to have different cognitive profiles, with the former presenting dysfunction in language and verbal memory and the latter in spatial attention (see Verreyt et al., 2011 for a review). Thus, in general, clinical and experimental evidence suggests that symptom side predominance may predict, at least to some extent, decline in specific cognitive domains subtended by neural substrates in the contralateral hemisphere, although the involvement of the cortical areas, in addition to the subcortical regions, has not been incontrovertibly demonstrated. However, correlations between the structural indexes of specific brain regions and neuropsychological performances have been demonstrated in PD: executive abilities are correlated with the cortical thickness of frontal or frontoparietal regions (Pereira et al., 2009; Biundo et al., 2011; Filoteo et al., 2014; Duncan et al., 2016) and memory performance with the volume of the hippocampus (Beyer et al., 2013).

In the present study we explored if disorders of the executive abilities, such as word selection among competing alternatives, could be subtended by changes in cortical thickness in the inferior regions of the frontal lobes; in particular, if changes in the LH could be detected when tasks of linguistic nature, such as word production tasks, are used. The role of the frontostriatal pathway of the LH in word selection processes is quite robust (Moss et al., 2005; Siri et al., 2008) whereas evidence of cortical changes in the left frontal cortex (BA 44/45/47/6/9) in PD is weaker. We expected that reduced cortical thickness of the left IFG could have a detrimental effect on tasks that combine executive control and verbal competence.

To verify this hypothesis, we tested idiopathic PD patients with asymmetric clinical expression (mild-moderate stage of disease).

On the basis of evidence supporting differential patterns of cognitive abilities in clinically asymmetric PD (Verreyt et al., 2011), we hypothesized preferential left hemispheric impairment in verbal tasks, such as word production. In particular, we expected that right PD patients (RPD) (with prevalent left hemisphere nigrostriatal hypofunctionality, henceforth RPD-LH) would be less accurate and slower than left PD patients (LPD) (with prevalent right hemisphere nigrostriatal hypofunctionality, henceforth LPD-RH).

In relation to the executive nature of the word selection tasks, we also expected to detect an association between the volume of specific cortical brain regions of the LH and accuracy and RTs in the production of nouns (which must be selected among a larger number of alternatives than the verbs). The left IFG in fact, as mentioned above, has been considered a crucial region for selection processing and post-lexical control during word production (Badre et al., 2005; Moss et al., 2005; Thompson-Schill and Botvinick, 2006; Siri et al., 2008).

# MATERIALS AND METHODS

#### Participants

Twenty-one PD participants (11 males and 10 females) and 19 healthy controls (HC) participated in the study. All subjects were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971). Persons with PD were consecutively recruited from the IRCCS Don Carlo Gnocchi Foundation, Neurological Unit (Milan, Italy). To be included in the study, the subject had to meet the following criteria: (1) diagnosis of probable PD according to the United Kingdom Parkinson's Disease Society Brain Bank criteria (Hughes et al., 1992); (2) positive DAT scan; (3) mild to moderate stage of the disease, with a score between stages 1 and 2.5 on the Modified Hoehn and Yahr (H&Y) Scale (Goetz et al., 2004); (4) at least 8 years of education; (5) Italian native language; (6) no decline in cognitive ability reported by either the patient or informant, or observed by the clinician; (7) Mini-Mental State Examination (MMSE) score greater than or equal to 24; (8) stable drug therapy with L-Dopa (alone or in association with dopamine agonists, catechol-O-methyltransferase inhibitors, monoamine oxidase inhibitors, and anticholinergic drugs).

Exclusion criteria were: clinical signs satisfying criteria of other neurological disorders, including possible atypical parkinsonisms; secondary or iatrogenic parkinsonism; major psychiatric illnesses excluding presence of mild-moderate depression; claustrophobia.

All PD patients were characterized using the H&Y scale and the Unified Parkinson's Disease Rating Scale (UPDRS), motor part III (Goetz et al., 2008). To classify left- or right-dominance of symptoms, scores from each item on the UPDRS Part III that contained both a right and a left side component (e.g., finger taps, hand movements, rigidity of extremities, etc.) were extracted, which provided left and right motor subscores for each individual. Then, the lateralization score was computed by subtracting the right-side total symptom score from the left-side total symptom score, and patients were clinically clustered into RPD-LH (N = 12) and LPD-RH (N = 9). The Levodopa Equivalent Daily Dose (LEDD) (Tomlinson et al., 2010) was also calculated.
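The lateralization score described above can be sketched in a few lines of Python. This is only an illustration: the item names and scores are invented, and the rule that the sign of the score alone decides the subgroup (with ties left unclassified) is an assumption, since the text does not state a cut-off.

```python
from typing import Dict


def lateralization_score(left_items: Dict[str, int], right_items: Dict[str, int]) -> int:
    """Left-side total minus right-side total over lateralized UPDRS-III items."""
    return sum(left_items.values()) - sum(right_items.values())


def classify_side(score: int) -> str:
    """Assumed rule: positive score = left-dominant symptoms (LPD-RH),
    negative score = right-dominant symptoms (RPD-LH), zero = unclassified."""
    if score > 0:
        return "LPD-RH"
    if score < 0:
        return "RPD-LH"
    return "unclassified"


# Hypothetical item scores (0-4 each); not data from the study.
left = {"finger_taps": 1, "hand_movements": 1, "rigidity_upper": 0}
right = {"finger_taps": 3, "hand_movements": 2, "rigidity_upper": 2}

score = lateralization_score(left, right)
group = classify_side(score)
print(score, group)
```

Here the right-side subscore exceeds the left-side one, so the hypothetical patient falls in the right-dominant subgroup.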

A group of 19 HC matched for age, sex, and years of education formed the control group. Demographical data of both groups are detailed in **Table 1**.

The present study was approved by the Scientific and Ethics Committee of the Don Gnocchi Foundation in accordance with the Helsinki Declaration, and all participants gave written informed consent to participate in the study.

#### Measures

#### Neuropsychological Assessment

PD patients underwent an extensive neuropsychological assessment of general cognitive efficiency (Mini Mental State Examination, MMSE; and Montreal Cognitive Assessment, MoCA), language (object and action oral naming; phonological fluency; semantic fluency), verbal and spatial memory (Immediate and Delayed Recall of 15 words; Free and Cued Selective Reminding Test-FCSRT; Rey-Osterrieth figure recall), verbal and spatial short-term memory (verbal span and Corsi's test), intelligence (Raven Colored Matrices), praxis (Rey-Osterrieth figure copy), and attention and executive functions (Trail making test, TMT part A, B and B-A; Attentional Matrices; Stroop test; Modified Wisconsin Card Sorting test, MCST). The assessment lasted about 2 h and was split into two sessions. Neuropsychological assessment of HC was limited to MMSE, MoCA, phonological and semantic fluency, and TMT.

All neuropsychological tests were corrected for age and education using normative Italian values. The three groups of subjects were matched for age and education. Bonferroni correction for multiple comparisons was applied to set significance.



RPD-LH, PD with prevalent left hemisphere damage; LPD-RH, PD with prevalent right hemisphere damage; H&Y, Modified Hoehn and Yahr (H&Y) Scale; UPDRS III, Unified Parkinson's Disease Rating Scale, motor part III; LEDD, levodopa equivalent daily dose; Sel, Selegiline; Ras, Rasagiline; DA, Dopamine-agonist; Rop, Ropinirole; Pram, Pramipexole; Rot, Rotigotine; (∧) ANOVA; (#) Chi Square statistic; and (\*) Independent samples t-test.

#### MRI Acquisition and Data Analysis

All 40 subjects received a structural MRI examination. MRI sessions were performed on a 1.5 T scanner (Siemens Magnetom Avanto, Erlangen, Germany) and included [a] a 3D T1 MPRAGE scan (TR/TE = 1,900/3.37 ms, FoV = 192 × 256 mm<sup>2</sup>, voxel size 1 mm isotropic, 176 axial slices), to perform cortical and subcortical measurements; [b] conventional anatomical sequences (PD-T2, FLAIR) in order to exclude patients showing macroscopic brain lesions or white matter hyperintensities outside the normal range (Vale et al., 2015). In particular, subjects who presented with one or more macroscopic hyperintensities on T2-weighted scans located in the deep white matter and/or more than five periventricular hyperintensities were excluded.

Anatomical high-resolution images (3D T1) were segmented and parcellated using FreeSurfer's recon-all standard pipeline (Dale et al., 1999; Fischl et al., 1999, 2002). Briefly, after brain extraction, high-resolution 3D T1 images were registered to standard space and the gray/white and gray/cerebrospinal fluid borders were computed (Dale et al., 1999; Fischl et al., 1999, 2002). Quality control checks were performed at all steps of the pipeline and manual corrections were made if necessary. Cortical parcellations were made according to the Desikan atlas (Desikan et al., 2006) and thickness measurements were collected from bilateral regions of interest (ROIs). ROIs were selected based on the literature (Alexander et al., 1986; Owen, 2004): the pars opercularis (BA 44), triangularis (BA 45), and orbitalis (BA 47) of the IFG of both hemispheres and the immediately adjacent regions such as the rostral and caudal middle frontal gyrus (BA 46, BA 9, and 10), see **Figure 1**. Volumetric measurements were also collected for three PD-related subcortical regions (Lewis et al., 2016): caudate, putamen, and globus pallidus, bilaterally, using FreeSurfer subcortical segmentation streamline. Mean Surface Area (MSA) and Total Intracranial Volume (TIV) were also computed for each subject and inserted as covariates in the subsequent statistical analyses to account for differences in brain size; age and sex were included as covariates as well.

#### Stimuli

Two lists of words were selected (see Marangolo et al., 2003): 144 verbs (e.g., "osservare" [to observe]), and 144 corresponding derived nouns (e.g., "osservazione" [observation]).

Experimental tasks consisted in the production of morphologically related words: in the derivation task, verbs represented the input stimuli and nouns were the targets (VN: noun from verb); while in the generation tasks, nouns were the inputs and verbs were the targets (NV: verb from noun). In the derivation task, over 90% of the participants produced the same derived noun confirming it as the expected target response; in this condition any other answer was considered as an error. In the generation task, the correspondent verb-base form was the only correct response.

Derived nouns did not differ from verbs in length (nouns: M = 8.34, SD = 2.56; verbs: M = 8.12, SD = 1.50), but the two lists of items differed in frequency of use (Istituto di Linguistica Computazionale, CNR, Unpublished Manuscript; see also Marangolo et al., 2003): in fact, the target nouns, which were expected to be associated with worse performance, were more frequent (M = 50.74, out of 1.5 million occurrences; SD = 106.05) than the corresponding target verbs (M = 28.85; SD = 68.99; t = 2.08, p < 0.04). Note, however, that the difference in frequency went against the hypothesis of worse performance for nouns derived from verbs. Furthermore, the two sets of stimuli differed in the number of alternatives among which the participant had to select his/her response to the stimulus. This variable, which was decisive to test our hypothesis, was estimated taking into account the number of word-types that share the root with the input word and are likely to be involved in processing the response. These word-types are listed in the corpus of written Italian by Bertinetto et al. (2005) (CoLFIS, http://www.istc.cnr.it/en/grouppage/colfis). In the case of verb input (derivation task), there were several alternatives among which the target was to be selected (range: 1–8; M = 3.1; SD = 1.6), whereas for noun input (generation task) the number of alternatives was 1, as from a noun target only one verb base can be retrieved. Finally, another variable whose influence on the derivation task was considered is the presence of words more frequent than the target in the list of alternatives. For instance, from the verb "abitare" [to reside], the noun produced by all participants (that is, the target) was "abitazione" [residence], but the most frequently derived noun from this verb, listed in the CoLFIS, is "abitante" [resident].
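As a rough illustration of the two competition variables just described, the following Python sketch counts, for one input verb, the number of alternatives and the number of alternatives more frequent than the target. The function name and the frequency values are hypothetical; only the two nouns and their relative ordering (with "abitante" more frequent than "abitazione") come from the example in the text, and counting the target itself among the alternatives is an assumption based on the "cammino" example in the Introduction.

```python
from typing import Dict, Tuple


def competition_measures(family: Dict[str, float], target: str) -> Tuple[int, int]:
    """Return (number of alternatives, number of alternatives more frequent
    than the target) for the nouns sharing a root with the input verb.

    `family` maps each candidate noun to its corpus frequency; the target
    itself counts as one of the alternatives.
    """
    target_freq = family[target]
    n_alternatives = len(family)
    n_more_frequent = sum(
        1 for word, freq in family.items() if word != target and freq > target_freq
    )
    return n_alternatives, n_more_frequent


# Nouns derived from "abitare"; the frequency values are made up for illustration.
abitare_family = {"abitazione": 40.0, "abitante": 95.0}

n_alt, n_more = competition_measures(abitare_family, target="abitazione")
print(n_alt, n_more)
```

For a generation (NV) item the family would contain a single verb base, so both measures reduce to 1 and 0, respectively.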

#### Procedure

Participants were first screened using a reading test consisting in a set of seven words, different from the stimuli of the experiment, to exclude visual perceptual disorders that could interfere with the cognitive performance.

The input stimuli were presented in two blocks for each task, i.e., two blocks of 72 verbs (144 items for the derivation task) and two blocks of 72 nouns (144 items for the generation task), in a random order both within and between blocks. Each block was preceded by a training session, where participants were carefully instructed to produce the expected target as quickly and accurately as possible. In the NV generation task, participants were instructed to turn the noun into the corresponding infinitive form of the verb. For example, if the noun "sentimento" [feeling] appeared on the display, they were asked to produce the verb "sentire" [to feel] as quickly as possible, considering that the time elapsed between the appearance of the stimulus and the onset of the response was recorded through the microphone. In the VN derivation task, participants were instructed to turn the verb into the corresponding noun. For example, if the verb "partire" [to depart] appeared on the display, they were asked to produce the noun "partenza" [departure] as quickly as possible. During the training session, the examiner could offer feedback on the accuracy of the participant's response. Once the actual task started, the examiner could not give any indication. The administration of the entire experimental session lasted about 60 min, with a pause between blocks.

Stimuli were presented by SuperLab pro Software (Cedrus, Phoenix, Arizona), at the center of the video display one at a time, using a size 60, bolded black font. The experiment was conducted in a quiet room and subjects were seated at a distance of 40 cm from the screen. Each block started with the appearance of the word "via" [start] on the screen and ended with the word "fine" [the end]. The trial sequence started with a blank background for a duration of 250 ms, after which a fixation point appeared for a duration of 750 ms and then the input word for a duration of 5,000 ms. SuperLab pro Software registered reaction times (RTs), i.e., the latency between the appearance of the word on the computer screen and the onset of the participant's response. These measures were reported in an Excel worksheet generated automatically by SuperLab software. The examiner manually scored response accuracy. Patients treated with L-DOPA were evaluated in the "off" state.

#### Statistical Analyses

#### Behavioral Data

Non-parametric analyses were carried out on accuracy and RTs, to assess differences among the three groups of participants (HC, RPD-LH, and LPD-RH) in the two morphology tasks (generation vs. derivation task). To estimate the interaction effect of group by task on accuracy, a Chi-squared test was carried out on correct responses. As for RTs, comparisons between the two tasks were assessed through Wilcoxon's test within each group, whereas differences among the three groups were assessed through Kruskal–Wallis' test within each task.
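To make the group-by-task test on correct responses concrete, here is a minimal pure-Python sketch of a Pearson Chi-squared statistic on a 3 × 2 contingency table, together with cell-wise residuals of the form (observed − expected) / √expected. The counts are hypothetical and the residual definition is an assumption; the text does not say which variant of standardized residual was used.

```python
from typing import List, Tuple


def chi_squared(table: List[List[float]]) -> Tuple[float, int, List[List[float]]]:
    """Pearson chi-squared statistic, degrees of freedom, and Pearson residuals
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    residuals: List[List[float]] = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
            res_row.append((observed - expected) / expected ** 0.5)
        residuals.append(res_row)

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, residuals


# Hypothetical counts of correct responses; rows = HC, LPD-RH, RPD-LH; columns = NV, VN.
correct = [
    [2600, 2150],
    [1200, 950],
    [1600, 1100],
]
stat, df, res = chi_squared(correct)
print(round(stat, 3), df)
```

With three groups and two tasks the test has (3 − 1) × (2 − 1) = 2 degrees of freedom, which matches the df reported in the Results.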

Effects of psycholinguistic features on accuracy, computed as the proportion of correct responses after removing technical failures and out-of-time responses, were analyzed through mixed-effects logistic models (Jaeger, 2008). Mixed-effects regression models (Baayen et al., 2008) were carried out to better understand the role of psycholinguistic features of input- and target-words in determining the effects tested in the previous analysis on (log-transformed) RTs: along with the variable group, input frequency, target frequency, target length, number of alternatives, and number of alternatives more frequent than the target were also tested as fixed effects. In these mixed-effects models, participants and items were tested as random intercepts, along with three other variables that can randomly vary within tasks: (a) phonetic transparency of derivation; (b) presence of homography between the derivative noun and a verbal inflected form; (c) input length in graphemes.

All fixed effects tested on latency were also considered in the logistic models on accuracy, but only participants and items were included as random intercepts, as computational limits for logistic models do not allow the other three random effects (transparency, homography, and length of the input) to be tested all together.
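For orientation, the general form of the two model families described above can be written as follows. This is a generic sketch, not the authors' fitted specification: the symbols are introduced here for illustration, with $i$ indexing participants, $j$ indexing items, $x_{k,ij}$ standing for the fixed-effect predictors (group, input frequency, target frequency, target length, number of alternatives, number of alternatives more frequent than the target), and the additional random terms (transparency, homography, input length) omitted for brevity.

```latex
% Linear mixed-effects model on log-transformed RTs
\log(\mathrm{RT}_{ij}) = \beta_0 + \sum_{k} \beta_k \, x_{k,ij} + u_i + w_j + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
w_j \sim \mathcal{N}(0, \sigma_w^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Mixed-effects logistic model on accuracy (same predictors, no residual term)
\operatorname{logit} \Pr(\mathrm{correct}_{ij} = 1) = \beta_0 + \sum_{k} \beta_k \, x_{k,ij} + u_i + w_j
```

Under this reading, a negative $\beta_k$ in the logistic model lowers the odds of a correct response, whereas a positive $\beta_k$ in the latency model lengthens RTs.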

#### MRI Data

#### **MRI between-groups whole-brain comparison**

In order to determine if our MRI brain measurements (cortical thickness, subcortical volumes) were modeled by a normal distribution, normality tests (skewness and kurtosis ranges and Q-Q plots) were performed for each variable using Statistical Package for Social Sciences (SPSS, IBM Corporation), version 24. Left and right sides of the same area have been considered separately.

In order to verify the presence of brain atrophy, cortical brain thickness was compared between HC (N = 19) and PD groups (N = 20; one subject was excluded due to motion artifacts) at a whole-brain level using QDEC (https://surfer.nmr.mgh.harvard.edu/fswiki/Qdec), running on FreeSurfer. A two-sample t-test was performed, inserting age and gender as covariates; results were considered statistically significant when surviving a threshold of p < 0.05, corrected for multiple comparisons (FDR). Then, a direct comparison was performed between RPD-LH (N = 11) and LPD-RH (N = 9) subgroups in order to assess anatomical brain differences between the two subgroups. Also in this case, age and gender were inserted as nuisance covariates and results were considered statistically significant at p < 0.05, FDR-corrected. Moreover, direct comparisons were performed for subcortical volumes (for each brain side) using ANCOVA, with group of participants as between-subjects factor (3 levels: HC, RPD-LH, LPD-RH) and TIV, age and sex as covariates.

#### **MRI correlation analysis**

Partial correlations were computed between NV [both accuracy and log-transformed RTs] and cortical thickness of the considered ROIs and between VN [both accuracy and log-transformed RTs] and cortical thickness. Mean surface areas, age and sex were inserted as covariates, to account for brain size differences. Partial correlations were also computed between NV [both accuracy and log-transformed RTs] and subcortical volumes. Correlations were computed separately in [a] HC group; [b] RPD-LH; [c] LPD-RH.
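A partial correlation of this kind can be sketched in plain Python with the standard recursive formula, which removes one covariate at a time. The function names and the toy data below are illustrative assumptions; the study itself used SPSS, and nothing here reproduces its numbers.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(
    x: Sequence[float], y: Sequence[float], covariates: List[Sequence[float]]
) -> float:
    """Correlation between x and y after controlling for every covariate,
    using the recursion r_xy.Z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)),
    where each r on the right is itself partialled on the remaining covariates."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y both follow z but deviate from it in opposite directions,
# so their raw correlation is positive while the partial correlation is -1.
z = [1, 2, 3, 4, 5, 6]
x = [2, 1, 4, 3, 6, 5]
y = [0, 3, 2, 5, 4, 7]

r_raw = pearson(x, y)
r_partial = partial_corr(x, y, [z])
print(round(r_raw, 3), round(r_partial, 3))
```

In the setting described above, `x` would be a behavioral score (accuracy or log RT), `y` the thickness of one ROI, and `covariates` would hold mean surface area, age and sex.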

Results were considered as statistically significant at p < 0.05 after Benjamini–Hochberg procedure for controlling the false discovery rate (FDR) in multiple testing (Benjamini and Hochberg, 1995).
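The Benjamini–Hochberg step-up procedure mentioned above is short enough to sketch directly. The function name and the example p-values are illustrative only; they are not the p-values obtained in the study.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], q: float = 0.05) -> List[bool]:
    """Return, for each p-value (in the original order), whether the
    corresponding hypothesis is rejected at false discovery rate q."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= k * q / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            max_rank = rank

    # Reject every hypothesis whose rank is at most that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Illustrative p-values from four hypothetical correlation tests.
decisions = benjamini_hochberg([0.5, 0.001, 0.04, 0.02], q=0.05)
print(decisions)
```

Note that 0.04 is below the nominal 0.05 level but is not rejected here, because its rank-specific threshold (3 × 0.05 / 4 = 0.0375) is stricter.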

# RESULTS

#### Demographical Data and Neuropsychological Tests

Participants were matched on sex, age, and years of education. Regarding the PD subgroups, there was no statistically significant difference in symptoms at onset, disease duration, H&Y score, and UPDRS motor subset score between LPD-RH and RPD-LH patients, which ensures that disease severity was uniform between subgroups (see **Table 1**). Moreover, no significant differences in levodopa equivalent daily dose (LEDD) and kind of PD medication between the two PD subgroups were found (see **Table 1** for details).

Results showed that the LPD-RH and RPD-LH groups did not score significantly lower than the HC group on any neuropsychological test, except for TMT part A, where LPD-RH patients were slower than HC [F(2,37) = 7.585, p = 0.002]. No other difference reached statistical significance, either between HC and PD groups or between the two PD groups (see **Table 2**).

#### Behavioral Tasks

Technical failures and responses given after the time limit of 5,000 ms (LPD-RH patients: 13%; RPD-LH: 13.8%; HC: 6.6%) were excluded from the analyses. Accuracy and mean RTs are shown in **Figure 2** and **Table 3**.
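The exclusion rule can be illustrated with a small Python sketch. The trial representation (a dictionary with an `rt_ms` field that is `None` for technical failures) is a hypothetical convention chosen for this example, not the format of the SuperLab output.

```python
from typing import Dict, List, Optional

TIME_LIMIT_MS = 5000  # response deadline stated in the Procedure


def valid_trials(trials: List[Dict[str, Optional[float]]]) -> List[Dict[str, Optional[float]]]:
    """Keep trials with a recorded RT that does not exceed the time limit."""
    return [
        t for t in trials
        if t["rt_ms"] is not None and t["rt_ms"] <= TIME_LIMIT_MS
    ]


def exclusion_rate(trials: List[Dict[str, Optional[float]]]) -> float:
    """Proportion of trials dropped as technical failures or late responses."""
    return 1 - len(valid_trials(trials)) / len(trials)


# Hypothetical trials: one technical failure (no RT) and one late response.
trials = [
    {"rt_ms": 1240.0},
    {"rt_ms": None},
    {"rt_ms": 5300.0},
    {"rt_ms": 1536.0},
]
kept = valid_trials(trials)
rate = exclusion_rate(trials)
print(len(kept), rate)
```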

#### Accuracy

The interaction between group and task proved to be significant (χ <sup>2</sup> = 21.287, df = 2, p < 0.001): all participants showed a lower percentage of accuracy in the derivation (74.5%) (e.g., osservare [to observe] –> osservazione [observation]) than in the generation task (96.5%). However, it is worth noting that RPD-LH patients had a level of accuracy in derivation task (65.9%; standardized residual = −2.9) lower not only than controls (80.6%), but also than LPD-RH patients (77.1%), who did not significantly differ from the HC group.

The number of alternatives and the number of alternatives more frequent than the target proved to exert significant effects (b = −0.4974, z = −7.380, p < 0.001; b = −1.0249, z = −6.033, p < 0.001, respectively): the higher their value, the lower the accuracy shown by all participants. The inhibitory effect of input frequency was significant as well (b = −0.1414, z = −2.08, p = 0.037). On the contrary, target frequency had a facilitatory effect: the higher the frequency, the higher the accuracy (b = 0.3526, z = 5.246, p < 0.001).

#### Latency

Analysis on RTs was carried out only for trials in which the word was correctly produced (accuracy for LPD-RH: 86.1%; for RPD-LH: 81.1%; for HC: 89.3%).

Wilcoxon's test showed that the derivation task (VN), which involves selection among several alternatives, was more difficult than the generation task (NV), which can be done by only retrieving the correct word-base, in each group (HC: z = −3.823, p < 0.001; LPD-RH: z = −2.429, p = 0.015; RPD-LH: z = −3.059, p = 0.002): all participants were much slower (M = 1,536 ms, SE = 65.6) when having to derive a noun from a verb (VN; e.g., osservare [to observe] –> osservazione [observation]) than when they had to generate the verb base from a derived noun (NV; e.g., fallimento [failure] –> fallire [to fail]) (M = 1,240 ms, SE = 45.5). No significant difference was observed among groups within each task (Kruskal–Wallis' test for NV: χ <sup>2</sup> = 0.725, df = 2, n.s.; Kruskal–Wallis' test for VN: χ <sup>2</sup> = 1.577, df = 2, n.s.).

TABLE 2 | Adjusted and raw scores (when adjusted scores are not available) obtained by HC (healthy controls) and PD (Parkinson's disease) groups on neuropsychological tests.

LPD-RH, (PD with prevalent right hemisphere nigrostriatal hypofunctionality); RPD-LH, (PD with prevalent left hemisphere nigrostriatal hypofunctionality); MMSE, Mini Mental State Examination; MoCA, Montreal Cognitive Assessment; TMT, Trail Making Test; FCSRT, Free and Cued Selective Reminding Test; IFR, Immediate Free Recall; ITR, Immediate Total Recall; DFR, Delayed Free Recall; DTR, Delayed Total Recall; CSI, Cueing Sensitivity Index; MCST, Modified Wisconsin Card Sorting test; (∧) ANOVA statistic; (#) Independent samples t-test; \*Raw scores. Significant p-values (p < 0.05) are highlighted in bold font.

The mixed-effects model that best fitted the whole data set (**Table 4**) confirmed the inhibitory effect of the number of alternatives (b = 0.019, t = 2.934, p = 0.0036), and of the number of alternatives more frequent than the target (b = 0.076, t = 3.861, p < 0.001). These results show that, in the case of high competition among several lexical entries, in particular in the presence of high-frequency competitors (including the input word itself), target selection is slowed down. The effect of the presence of high-frequency competitors was moderated by the group (LPD-RH∗number of alternatives more frequent than the target: b = 0.079, t = 4.491, p < 0.001), as, in the post-hoc analyses, the LPD-RH group's performance seemed to be minimally affected by this feature (b = −0.009, t = −0.390, p = 0.69), whereas both HC (b = 0.06, t = 3.09, p = 0.002) and RPD-LH (b = 0.08, t = 3.203, p = 0.001) showed a reliable inhibitory effect. Finally, results show that the higher the target frequency, the faster the participant's response (b = −0.078, t = −5.851, p < 0.001), but this effect influenced performance in interaction with input frequency (b = 0.009, t = 3.371, p < 0.001). **Figure 3** shows that target frequency had a stronger facilitatory effect when input frequency was low (left side of the figure), whereas as input frequency increased, the inhibition on target production was stronger, irrespective of the target frequency.

The other psycholinguistic variables considered as random variables, i.e., phonetic transparency of the derivation, homography between noun and verb forms, and input length, had negligible effects on latency.

FIGURE 2 | (A) Accuracy of PD patients and controls in the two tasks (VN, noun from verb; NV, verb from noun); (B) Reaction times (RTs) of PD patients (LPD-RH; RPD-LH) and controls (HC) in the two tasks (NV, verb from noun; VN, noun from verb).

TABLE 3 | Mean (M) and Standard Deviation (SD) RTs (ms) and accuracy in generation (NV) and derivation (VN) tasks by groups.

#### Brain Measurements and Correlation Analysis

#### MRI Between-Groups Whole-Brain Comparison

We found no signs of cortical atrophy in the PD group, compared with HC, at a whole-brain level (p FDR corr < 0.05). The direct comparison between RPD-LH and LPD-RH subgroups also revealed no differences in the cortical thickness associated with laterality of the pathology (p FDR corr < 0.05).

No statistically significant differences were found in subcortical volumes either (caudate, putamen, and globus pallidus, bilaterally) when including TIV, age, and sex as covariates (ANCOVA).

#### Correlation Analysis

#### **Accuracy**

Significant correlations of cortical thickness with accuracy and with RTs were found only in the PD group with left hemisphere damage (**Figure 4**). In more detail, RPD-LH showed significant partial correlations between accuracy in each task (NV and VN) and the thickness of the left pars triangularis (NV: r = 0.717, p = 0.045; VN: r = 0.873, p = 0.010).

No other correlation emerged in LPD-RH or HC.

#### **RTs**

Significant inverse partial correlations also emerged between NV and VN RTs and the left pars triangularis in RPD-LH (NV: r = −0.845, p = 0.011; VN: r = −0.872, p = 0.010; **Figure 4**). No other significant correlation emerged in any group.


Finally, no correlation was found in any group between noun and verb production and volume of subcortical regions (caudate, putamen and globus pallidus, bilaterally). For a detailed report of all correlations see Supplementary Materials.

# DISCUSSION

In the present study we aimed to replicate previous results (Silveri et al., 2018) in a new cohort of well-selected PD patients with a clear symptom side predominance (left or right), at a mild to moderate stage of the disease, with a positive DAT scan and with no decline in cognitive functioning. Indeed, our results confirm that noun and verb production may be differentially impaired when evaluated by a morphological paradigm that allows control over the number of alternatives among which word selection operates.

More specifically, in VN derivation tasks, which involve selection among many alternatives, we found two main results. First, VN accuracy was lower than NV accuracy in all groups, and RPD-LH patients were significantly less accurate than LPD-RH patients, who did not differ from HC. Furthermore, the greater the number of alternatives and the number of alternatives more frequent than the target, the lower the accuracy in all subjects, while the higher the frequency of the target, the higher the accuracy. Second, responses in VN derivation tasks were slower than in NV generation tasks, with no difference among the three groups of subjects. The presence of high-frequency competitors in derivation tasks did not produce any effect in LPD-RH, while both RPD-LH and HC showed increased response latencies, and the frequency of the target, in interaction with the input frequency, reduced response latency in all subjects.

These results are consistent with the hypothesis that word production requires resolving competition among alternatives (Usher and McClelland, 2001); when attentional resources decay, as in some neurological conditions such as PD, word production is penalized in relation to the number of alternatives among which the selection is made. It is worth noting that the decay in accuracy is principally attributable to the PD population, but only when the left hemisphere is involved (RPD-LH), as expected on the basis of the linguistic nature of the task. However, the nature of the task by itself does not seem sufficient for a word production deficit to emerge. In fact, when a verb rather than a noun is to be produced, both PD groups and HC perform similarly. This result suggests that the production deficit emerges as a function of the difficulty of the task (attentional demand for the selection from many alternatives) in the presence of a left hemisphere dysfunction. In brief, the nature of the task and attentional components converge in producing the results (reducing accuracy in noun production) only when the left hemisphere is damaged. In the mild to moderate phase of PD, executive dysfunction is one of the principal nonmotor symptoms; executive function controls cognitive behaviors, and its effect becomes apparent only when subjects are involved in specific cognitive tasks. The dysexecutive syndrome, therefore, in our case, takes the form of a language deficit.

Presence of high-frequency competitors to the target resulted in a detrimental effect on latency but only in HC and RPD-LH.

Thus, with respect to previous reports (Crescentini et al., 2008; Silveri et al., 2018), controlled retrieval was to some extent impaired in PD, but differentially in LPD-RH and RPD-LH; the former group seemed insensitive to the interference. Taken together, these results might suggest that the left and right hemispheres differ when bottom-up inhibition of irrelevant information is required. Such an "insensitivity" of the LPD-RH group to the presence of interference is difficult to explain at this time and requires further confirmation in a larger sample of subjects. However, it is consistent with experimental data indicating a different contribution of the inferior regions of the left and right prefrontal cortex to response inhibition (Baglio et al., 2011; Dambacher et al., 2014) and impulse control (Boes et al., 2008). Impulsive behavior is a relatively frequent symptom in PD and is mostly related to dopamine treatment (Weintraub and Claassen, 2017), but the effects of the hemispheric side on impulse control have not been explicitly explored in this pathology, and further studies should address this issue.

As regards neuroimaging, we investigated whether accuracy and RTs obtained in behavioral tasks were correlated with the morphometric status of the specific brain regions thought to support word selection processes, such as the left IFG. No difference was found between the three groups in cortical thickness for any of the selected ROIs. However, significant correlations between cortical thickness and production of both nouns and verbs emerged in RPD-LH patients in the IFG of the left hemisphere. By contrast, no correlation was found between word production and the thickness of the rostral and caudal middle frontal regions or the volume of the subcortical regions considered. The lack of correlation in this case could be explained by considering that these regions might be implicated in higher-level control functions (Owen, 2004) that are less influenced by linguistic components than the IFG.

Both noun and verb production generated significant correlations with a specific subregion of the left IFG, in particular the pars triangularis, although a tendency toward significance was also observed with the pars orbitalis. We would have expected, however, a greater influence of the executive disorder, such that noun production, more than verb production, would correlate with the thickness of these regions. Thus, the critical factor for significant correlations to emerge seems to be the language deficit rather than the dysexecutive disorder.

An element that should not be underestimated in accounting for these results is that the subregions of the IFG whose thickness correlated with word production belong to the so-called Broca's complex (Hagoort, 2005), including BA 44, 45, and 47. BA 44 is the core of Broca's complex, while more rostral areas such as BA 45 and 47 assume a more specific role in executive control (Ardila et al., 2016a). Thus, the IFG-Broca's complex is not only involved in selection among competing alternatives but underlies the complex of functions related to language production (see Ardila et al., 2016b for discussion). Word production "per se," and thus also verb production (which in our experimental condition does not require selection processes), is subtended by the IFG. Based on our analyses, the association of accuracy and RTs with cortical thickness seems to some extent more evident in the rostral regions (BA 45, BA 47), which are more involved in executive control (Ardila et al., 2016a), than in BA 44. However, our data do not allow us to disentangle whether the association is a result of demand on executive resources, or whether associations emerge because the IFG is also part of the language production system.

Interestingly, the emergence of significant correlations in RPD-LH between IFG thickness and verb production, in an experimental condition that makes verb production easier than noun production, leaves open another possible interpretation. It might be that the word class "verb" (or at least some classes of verbs) is represented in neural substrates devoted to movement and movement control (Fernandino et al., 2013), such as the corticostriatal network. Our data do not exclude that verb meaning might be "embodied" in cerebral regions related to action representation by the integration of sensorimotor features (Barsalou, 1999). Moreover, other reports (Bocanegra et al., 2015) downsize the role of the executive disorder in producing the verb deficit in PD and propose an interpretation that involves the decline of semantic memory or other components of language.

In conclusion, our data confirm that the verb production deficit in PD, often reported in the literature (Péran et al., 2003; Signorini and Volpato, 2006; Colman et al., 2009; Rodríguez-Ferreiro et al., 2009; Silveri et al., 2012; Bocanegra et al., 2015), may be traced back to the dysexecutive syndrome, but other hypotheses cannot be excluded. The linguistic nature of the tasks we adopted in the present study proved to be an influential factor. Decreased task performance was in fact primarily found in PD patients with left hemisphere damage. In agreement with previous reports (Verreyt et al., 2011), our data also indicate that the side of the clinical symptomatology should not be underestimated when assessing cognitive functions in PD patients, since it may be a strong predictor of the characteristics of cognitive decline in this pathology.

Finally, we acknowledge that this is an exploratory study considering the relatively small number of participants and that additional investigations are needed to confirm the data.

# AUTHOR CONTRIBUTIONS

SD, FB, DT, and MS: study concept and design; FB and MS: study supervision; RN: subjects' recruiting; SD and FB: acquisition of data; SD, FB, MC, DT, and MS: analysis and interpretation of data; SD, FB, MC, RN, DT, and MS: drafting/revising the manuscript. All the authors approved the final version of the work to be published and agreed to be accountable for all aspects of the work.

# FUNDING

This research was supported by the financial contribution of Italian Ministry of Health—Ricerca Corrente 2017–2018.

# ACKNOWLEDGMENTS

The authors wish to thank Dr. Niels Bergsland for reading and revising the manuscript.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01241/full#supplementary-material

# REFERENCES


patients with Parkinson's disease. Neuropsychologia 46, 434–447. doi: 10.1016/j.neuropsychologia.2007.08.021


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Di Tella, Baglio, Cabinio, Nemni, Traficante and Silveri. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Role of the Cingulate Cortex in Dyskinesias-Reduced-Self-Awareness: An fMRI Study on Parkinson's Disease Patients

Sara Palermo<sup>1</sup>, Leonardo Lopiano<sup>2,3</sup>, Rosalba Morese<sup>1,4</sup>\*, Maurizio Zibetti<sup>2</sup>, Alberto Romagnolo<sup>2</sup>, Mario Stanziano<sup>5</sup>, Mario Giorgio Rizzone<sup>2</sup>, Giuliano Carlo Geminiani<sup>1,3</sup>, Maria Consuelo Valentini<sup>5</sup> and Martina Amanzio<sup>1,3,6</sup>

<sup>1</sup> Department of Psychology, University of Turin, Turin, Italy, <sup>2</sup> Department of Neuroscience, University of Turin, Turin, Italy, <sup>3</sup> Neuroscience Institute of Turin, University of Turin, Turin, Italy, <sup>4</sup> Faculty of Communication Sciences, Università della Svizzera Italiana, Lugano, Switzerland, <sup>5</sup> Azienda Ospedaliera Universitaria "Città della Salute e della Scienza di Torino", Neuroradiology Unit, Turin, Italy, <sup>6</sup> European Innovation Partnership on Active and Healthy Ageing, Bruxelles, Belgium

#### Edited by:

Gail Robinson, The University of Queensland, Australia

#### Reviewed by:

Yang Jiang, University of Kentucky College of Medicine, United States Gianfranco Spalletta, Fondazione Santa Lucia (IRCCS), Italy

> \*Correspondence: Rosalba Morese moresr@usi.ch

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 06 February 2018 Accepted: 31 August 2018 Published: 20 September 2018

#### Citation:

Palermo S, Lopiano L, Morese R, Zibetti M, Romagnolo A, Stanziano M, Rizzone MG, Geminiani GC, Valentini MC and Amanzio M (2018) Role of the Cingulate Cortex in Dyskinesias-Reduced-Self-Awareness: An fMRI Study on Parkinson's Disease Patients. Front. Psychol. 9:1765. doi: 10.3389/fpsyg.2018.01765

Objectives: Dyskinesias-reduced-self-awareness (DRSA) in Parkinson's disease (PD) has previously been associated with executive and metacognitive deficits, mainly due to dopaminergic overstimulation of mesocorticolimbic circuits. Response-inhibition dysfunction is often observed in PD. Apart from being engaged in response-inhibition tasks, the anterior cingulate cortex (ACC) is part of a functional system based on self-awareness and engaged across cognitive, affective and behavioural contexts. The purpose of the study was to examine the relationship between response-inhibition disabilities and DRSA using whole-brain event-related functional magnetic resonance imaging (fMRI) over the course of a specific executive task.

Methods: Twenty-seven cognitively preserved idiopathic PD patients – presenting motor fluctuations and dyskinesias – were studied. They underwent a neurological and neuropsychological evaluation. The presence of DRSA was assessed using the Dyskinesias Subtracted-Index (DS-I). Cingulate functionality was evaluated with fMRI while patients performed an ACC-sensitive GO-NoGO task. The association between the whole-brain blood oxygenation level dependent response during the response-inhibition task and DS-I scores was investigated by regression analysis.

Results: The presence of DRSA was associated with reduced functional recruitment in the bilateral ACC, bilateral anterior insular cortex and right dorsolateral prefrontal cortex (pFWE<0.05). Moreover, DS-I scores significantly correlated with percent errors on the NoGO condition (r = 0.491, pFWE = 0.009).

Discussion: These preliminary findings add evidence to the relevant role of executive dysfunctions in DRSA pathogenesis beyond the effects of chronic dopaminergic treatment, with a key leading role played by ACC as part of a functionally impaired response-inhibition network. Imaging biomarkers for DRSA are important to be studied, especially when the neuropsychological assessment seems to be normal.

Keywords: Parkinson's Disease, dyskinesias, self-awareness, response-inhibition, fMRI

# INTRODUCTION

fpsyg-09-01765 September 18, 2018 Time: 16:50 # 2

Dyskinesias are disabling motor complications subsequent to prolonged use of dopaminergic agents in Parkinson's disease (PD). In particular, levodopa-induced dyskinesia (LID) commonly occurs after a long duration of treatment, primarily in the medication "on-state", negating to some extent its beneficial effects (Vitale et al., 2001; Jenkinson et al., 2009; Amanzio et al., 2010; Sitek et al., 2011).

PD patients may be partially or even completely unaware of the presence of involuntary movements [the so-called dyskinesias-reduced-self-awareness (DRSA)]. PD patients with DRSA may therefore not comply with their pharmacological treatment or may take part in potentially dangerous activities (Jenkinson et al., 2009). Importantly, DRSA is a clinical phenomenon that can give a misleading picture of the progression of the disease, modify treatment adherence, adversely affect patients' quality of life, and add to the burden on caregivers (Jenkinson et al., 2009).

DRSA has previously been associated with executive dysfunctions due to a dopaminergic overstimulation of mesocorticolimbic circuits (Vitale et al., 2001; Leritz et al., 2004; Amanzio et al., 2010). Indeed, an action monitoring dysfunction, related to the medial prefrontal – ventral striatal circuit including the anterior cingulate cortex (ACC), has been considered to be associated with DRSA, using neuropsychological paradigms (Amanzio et al., 2010, 2014). According to our hypothesis regarding the association between DRSA and executive dysfunction (Amanzio et al., 2010, 2014; Palermo et al., 2017), DRSA can arise when the comparator mechanism for "attentive-performance-monitoring" is damaged. In this case, PD patients may not be able to identify their motor symptoms, and dyskinesias do not achieve conscious awareness (Blakemore et al., 2001; Jenkinson et al., 2009). We have therefore demonstrated the harmful role of dopaminergic pharmacological replacement treatment on the prefrontal-subcortical loops producing DRSA, which is linked to specific executive-metacognitive disabilities in terms of global monitoring, monitoring resolution, control sensitivity (Leritz et al., 2004; Amanzio et al., 2010), and the affective component of Theory of Mind (Palermo et al., 2017).

To our knowledge, only the study by Maier et al. (2016) has analysed the neural correlates of impaired self-awareness of motor symptoms in PD. The authors examined this phenomenon in a cohort of twenty-two PD patients using data from 18F-fluorodeoxyglucose positron emission tomography (Maier et al., 2016), finding that impaired self-awareness of motor symptoms had its neural substrates in bilateral frontal regions such as the right precentral gyrus, the right superior frontal gyrus, the left inferior frontal gyrus and the medial frontal gyrus (Maier et al., 2016).

Regarding the role of medial prefrontal cortex in impaired self-awareness (Amodio and Frith, 2006), it has been previously demonstrated that a reduced functional recruitment of the MPFC – especially at the level of ACC – may be considered one of the relevant neurobiological substrates of unawareness in early Alzheimer's disease (Amanzio et al., 2011; Spalletta et al., 2014b), acquired brain injury (Palermo et al., 2014), bipolar disorder (Palermo et al., 2015), and schizophrenia (Orfei et al., 2010; Spalletta et al., 2014a). These results seem to suggest that unawareness of illness in pathologies with different aetiologies may exhibit overlapping symptoms in the context of common patterns of hypofunctionality (i.e., similar neural dysfunction) (Palermo et al., 2014, 2015). Moreover, we have also demonstrated how unawareness is related to the ability to shift and inhibit a response and to action-monitoring (Amanzio et al., 2011; Palermo et al., 2014, 2015), for which ACC functionality is central (Braver et al., 2001; Nee et al., 2007).

As far as we know, no previous studies have evaluated DRSA using an event-related functional MRI (fMRI) paradigm based on response-inhibition. Since the ACC network plays an important role in response-inhibition competence (Braver et al., 2001; Nee et al., 2007), and in our previous studies patients who were unaware of their deficits exhibited impaired performance in response-inhibition tasks (Amanzio et al., 2011; Palermo et al., 2014, 2015), we predicted a relationship between DRSA in PD and cingulate hypofunctionality. Considering the above, the two main purposes of the study were: (1) to evaluate a possible association between DRSA and response-inhibition performance and (2) to explore and describe the neural substrate of DRSA during a response-inhibition task in PD patients.

# MATERIALS AND METHODS

# Participants

We prospectively screened 30 patients between January 2016 and November 2017 from the Parkinson's and Movement Disorders Unit of the University of Turin, Italy. Selection was made among patients evaluated for possible access to advanced therapy in terms of deep brain stimulation intervention.

A good clinical response to levodopa with the presence of peak-of-dose dyskinesias and wearing off or on-off phenomena was the first required selection criterion (Amanzio et al., 2010, 2014; Palermo et al., 2017). Subjects took part in the study only if:


# Procedures

Neurological evaluation was performed both in the absence of drug therapy and over the course of the maximum-benefit-peak of the first daily dose (Amanzio et al., 2010, 2014; Palermo et al., 2017). As regards the off-state, patients were assessed at least 10 h after therapeutic withdrawal, while in the on-state they were assessed within 2 h of drug intake. DRSA assessment and the neuropsychological evaluation were performed in the on-state and took at least an hour and a half for each patient.

Importantly, all patients were in therapeutic washout during neuroimaging acquisition, to avoid possible confounding effects of dopamine treatment on response-inhibition execution and subsequent fMRI results (Robertson et al., 2015). Indeed, the last pharmacological administration was performed in all patients 8 h before the experimental session.

# Neurological Evaluation and DRSA Assessment

The neurological evaluation was performed using the Unified PD Rating Scale revised by the Movement Disorders Society (MDS-UPDRS) (Antonini et al., 2013), which was administered by neurologists blind to the aim of the study. In particular, the motor assessment was performed on the basis of Section III; dyskinesias were assessed using Section IV. Disease stage was rated using the Modified Hoehn and Yahr Scale (Hoehn and Yahr, 1967).

The Dyskinesia Rating Scale (DS) (Amanzio et al., 2010, 2014; Palermo et al., 2017) was used to measure awareness of movement disorders. It is a 4-point scale on which the severity of dyskinesias is evaluated separately by the patient and the examiner while the patient performs some selected tasks. Scores range from 0 (total absence of dyskinesias) to 3 (severe dyskinesias). A Dyskinesias Subtracted-Index (DS-I) was calculated by subtracting the patient's judgments from those of the examiner. Higher scores indicated worse error detection in performance monitoring and so more severe DRSA.
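The subtraction behind the DS-I can be illustrated with a minimal Python sketch. The text states only that the patient's judgments are subtracted from the examiner's; summing the differences across tasks, and the function name `ds_index`, are assumptions made here for illustration.

```python
def ds_index(examiner_ratings, patient_ratings):
    """Examiner minus patient severity ratings, summed over tasks.

    Each rating is on the 0-3 scale described in the text. Summing over
    tasks is an assumption; the source only describes a subtraction.
    """
    examiner = list(examiner_ratings)
    patient = list(patient_ratings)
    if len(examiner) != len(patient):
        raise ValueError("one examiner and one patient rating per task")
    for rating in examiner + patient:
        if rating not in (0, 1, 2, 3):
            raise ValueError("ratings must be 0, 1, 2 or 3")
    return sum(e - p for e, p in zip(examiner, patient))


# Hypothetical ratings on three tasks: the patient underrates severity.
score = ds_index([2, 3, 1], [1, 1, 1])
print(score)
```

A positive value means the examiner rated the dyskinesias as more severe than the patient did, which is the direction the text associates with more severe DRSA.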

# Neuropsychological and Neuropsychiatric Assessment

The neuropsychological assessment was based on the guidelines by the Task Force commissioned by the Movement Disorders Society to identify Mild Cognitive Decline (Litvan et al., 2012; Goldman et al., 2013). As in our previous studies (Amanzio et al., 2014; Palermo et al., 2017), the test battery included the MMSE to detect the presence of a general cognitive deterioration; attention, perceptual tracking of a sequence and speeded performance were analysed using the Trail Making Test part A (TMT-A); executive functions using the TMT-B and TMT B-A, and the Wisconsin Card Sorting test (WCST); memory abilities with subscales IV and VII of the Wechsler Memory Scale (WMS). Lastly, the ability to access the verbal lexicon was evaluated using the Phonemic Fluency Test – letters F, A, S (FAS) (Amanzio et al., 2014; Palermo et al., 2017).

Neuropsychiatric assessment consisted of the Beck Anxiety Inventory (BAI), the Beck Depression Inventory (BDI), the Apathy Scale (AS), the Young Mania Rating Scale (YMRS) and the Brief Psychiatric Rating Scale 4.0 (BPRS 4.0) (Amanzio et al., 2014; Palermo et al., 2017).

# Scanning Procedure, Activation Paradigm, fMRI Data Preprocessing and Analyses

Neuroimaging data acquisition was performed on a 3T Philips Ingenia scanner (Neuroscience Institute of Turin – Neuroimaging Centre). Images of the whole brain were acquired using a T1-weighted sequence (TR = 4.8 ms, TI = 1650 ms, TE = 331 ms, voxel-size = 1 mm × 1 mm × 1 mm).

During acquisition, the subject was asked to perform a response inhibition ACC-sensitive task (GO-NoGO paradigm), in which the subject had to respond to frequent "GO" stimuli, inhibiting the response to infrequent "NoGO" stimuli (the letter "X" with a frequency of 17%) (Braver et al., 2001; Amanzio et al., 2011; Palermo et al., 2014, 2015). Every stimulus was shown for 250 ms with a 1000 ms inter-stimulus interval. The two stimulus types (X and non-X) were presented in random order in a continuous series of 232 trials (Amanzio et al., 2011; Palermo et al., 2014, 2015). Subjects had to respond by pressing a button with their right thumb. The paradigm we used is a prototypical task to measure the ability to inhibit an overpowering response (Braver et al., 2001).

Functional data were acquired using T2<sup>∗</sup>-weighted EPI (TR = 2.20 s, TE = 35 ms, slice-matrix = 64 × 64, slice gap = 0.28 mm, FOV = 24 cm, flip angle = 90°, slices aligned on the AC-PC line).
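As an illustration of the trial structure just described, the following Python sketch builds a randomised GO-NoGO schedule with the stated parameters (232 trials, 17% NoGO, 250 ms stimulus, 1000 ms inter-stimulus interval). It is not the authors' presentation script. Rounding 17% of 232 to 39 NoGO trials, measuring the interval from stimulus offset (so onsets are 1250 ms apart), and the fixed random seed are all assumptions.

```python
import random

N_TRIALS = 232          # from the text
NOGO_PROPORTION = 0.17  # from the text
STIM_MS = 250           # stimulus duration, from the text
ISI_MS = 1000           # inter-stimulus interval, from the text


def build_schedule(seed=0):
    """Return a list of (onset_ms, trial_type) in random order."""
    n_nogo = round(N_TRIALS * NOGO_PROPORTION)  # assumed rounding
    trials = ["NoGO"] * n_nogo + ["GO"] * (N_TRIALS - n_nogo)
    random.Random(seed).shuffle(trials)
    schedule = []
    onset = 0
    for trial_type in trials:
        schedule.append((onset, trial_type))
        # Assumption: the interval runs from stimulus offset to next onset.
        onset += STIM_MS + ISI_MS
    return schedule


schedule = build_schedule()
n_nogo_trials = sum(1 for _, kind in schedule if kind == "NoGO")
print(len(schedule), n_nogo_trials)
```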

Image data preprocessing was performed using SPM8, while group-statistics results were visualised using MRIcron. All functional images were spatially realigned to the first volume and anatomical images were co-registered to the mean of them. The functional images were normalised to the Montreal Neurological Institute (MNI) space and smoothed with an 8 mm full-width half-maximum (FWHM) Gaussian kernel. In order to remove low-frequency drifts, high-pass temporal filtering with a cut-off of 128 s was applied.
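For readers less familiar with these preprocessing parameters, the short Python sketch below converts the 8 mm FWHM into the standard deviation of the equivalent Gaussian, using the standard relation FWHM = 2·sqrt(2·ln 2)·σ, and expresses the 128 s cut-off as a frequency. It is a worked arithmetic example only and does not reproduce any SPM8 call.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Standard deviation (mm) of a Gaussian with the given FWHM (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(8.0)   # about 3.40 mm
cutoff_hz = 1.0 / 128.0         # high-pass cut-off period of 128 s as a frequency

print(round(sigma_mm, 2), cutoff_hz)
```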

After preprocessing, we applied a General Linear Model (GLM) (Friston et al., 2007) to convolve the "GO" and "NoGO" stimuli with the canonical hemodynamic response function (HRF). The GLM consisted of two categorical regressors ("GO" and "NoGO" as paradigm conditions), and seven parametric regressors of no interest: six motion regressors in order to correct residual effects of head motion and one (the levodopa equivalent daily dose, LEDD) to exclude any potential influence of pharmacological therapy on fMRI results. At the second level, neural correlates of response-inhibition function were explored by performing a one-sample t-test of the contrast "NoGO" vs. "GO" across all the participants. Results were corrected for multiple comparisons by small volume correction (SVC), with a sphere of 10 mm radius centred on ACC (our primary region of interest), according to the coordinates reported in the meta-analysis by Nee, Wager and Jonides (Nee et al., 2007). Finally, in order to identify which brain regions were associated with DRSA and how task-related activation during response inhibition and DS-I scores were reciprocally correlated, we performed a linear regression of individual scores on DS-I onto whole-brain results for the contrast "NoGO" vs. "GO" (pFWE < 0.05, at cluster level).
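The idea of convolving stimulus onsets with an HRF to obtain a task regressor can be sketched in plain Python. This is not SPM8's implementation: the double-gamma shape parameters (peak 6, undershoot 16, ratio 6), the 0.1 s convolution grid, sampling at the start of each scan, and the example onsets are all assumptions chosen for illustration. Only the TR of 2.2 s comes from the text.

```python
import math

TR = 2.2   # repetition time in seconds (from the text)
DT = 0.1   # fine time grid for the convolution (assumption)


def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=6.0):
    """Double-gamma response at time t (s); shape parameters are assumed."""
    if t <= 0:
        return 0.0
    positive = t ** (peak - 1) * math.exp(-t) / math.gamma(peak)
    negative = t ** (undershoot - 1) * math.exp(-t) / math.gamma(undershoot)
    return positive - negative / ratio


def make_regressor(onsets_s, n_scans, hrf_length_s=32.0):
    """Convolve stick functions at the onsets with the HRF, one value per scan."""
    kernel = [double_gamma_hrf(i * DT) for i in range(int(hrf_length_s / DT))]
    n_fine = int(math.ceil(n_scans * TR / DT))
    fine = [0.0] * n_fine
    for onset in onsets_s:
        start = int(round(onset / DT))
        for k, value in enumerate(kernel):
            if start + k < n_fine:
                fine[start + k] += value
    # Sample the fine-grid signal at the start of each scan (assumption).
    return [fine[min(int(round(s * TR / DT)), n_fine - 1)] for s in range(n_scans)]


# Hypothetical onsets (s) for the two conditions over a 10-scan window.
go_regressor = make_regressor([0.0, 1.25, 3.75], n_scans=10)
nogo_regressor = make_regressor([2.5], n_scans=10)
single_event = make_regressor([0.0], n_scans=10)
peak_scan = single_event.index(max(single_event))
```

In a full first-level design matrix these two columns would sit next to the nuisance regressors named in the text; that step is omitted here because the text gives no detail on how those columns were constructed.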

Data from the neuropsychological evaluation are listed in **Table 2**. The neuropsychiatric evaluation showed normative values in both the evaluation phases, on and off. Furthermore, the neuropsychological assessment performed in the on-phase reported normal cognitive profiles.

#### Statistical Analysis

Statistical analyses were performed using SPSS version 21.0 (IBM Corp, 2013) for Windows. Data for clinical characteristics and neuropsychological assessment of the subjects are expressed as the mean ± standard deviation.

As far as the "GO-NoGO" paradigm is concerned, patients' behavioural performance was evaluated in terms of the percentage of correct answers (percentage of GO stimuli to which the subject responded), the percentage of wrong answers (percentage of NoGO stimuli to which the subject responded), and reaction times (milliseconds from the appearance of the stimulus to the pressing of the response button).
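The three behavioural measures can be computed from a trial log as in the Python sketch below. The log format (tuples of trial type, whether a response occurred, and reaction time) and the choice to average reaction times over responded GO trials only are assumptions; the text does not state which trials entered the reaction-time measure.

```python
def summarise_performance(trials):
    """Summarise a GO-NoGO log.

    trials: list of (trial_type, responded, rt_ms), where rt_ms is None
    when no button press occurred. This format is hypothetical.
    """
    go = [t for t in trials if t[0] == "GO"]
    nogo = [t for t in trials if t[0] == "NoGO"]
    pct_correct_go = 100.0 * sum(1 for t in go if t[1]) / len(go)
    pct_error_nogo = 100.0 * sum(1 for t in nogo if t[1]) / len(nogo)
    # Assumption: reaction time is averaged over responded GO trials only.
    rts = [t[2] for t in go if t[1] and t[2] is not None]
    mean_rt_ms = sum(rts) / len(rts) if rts else None
    return pct_correct_go, pct_error_nogo, mean_rt_ms


# Hypothetical log: four GO trials (one missed) and two NoGO trials (one error).
log = [
    ("GO", True, 400), ("GO", True, 500), ("GO", False, None),
    ("NoGO", True, 350), ("GO", True, 600), ("NoGO", False, None),
]
result = summarise_performance(log)
print(result)
```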

Correlations between DS-I scores and response-inhibition performance were examined using Spearman's rank-order correlations. A p-value of < 0.05 was considered statistically significant.
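A Spearman rank-order correlation is the Pearson correlation of the ranked values, with tied observations given their average rank. The Python sketch below shows that computation on hypothetical numbers. The analyses reported here were run in SPSS, so this is only an illustration of the statistic, and it does not compute a p-value.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with ties given the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical DS-I scores and NoGO error percentages for five patients.
rho = spearman_rho([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])
print(round(rho, 3))
```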

# RESULTS

Of thirty subjects screened, three patients withdrew from the study, while twenty-seven patients (8 women, 19 men) with idiopathic PD, receiving levodopa treatment and presenting motor fluctuations, were enrolled. Disease duration was 10.98 ± 0.94 (mean ± SD) years. The pharmacological treatment had been ongoing for about 8 years and consisted of levodopa associated with dopamine agonists (LEDD = 982.86 ± 92.41). Dyskinesias appeared about 3 years before the neuropsychological evaluation. Patients showed normal cognitive profiles at the first level of cognitive profile assessment. Data for key clinical variables are summarised in **Table 1**. More information regarding the experimental sample can be found in **Supplementary Tables I–III**.

PD patients successfully performed the attentive task (GO) in 84% of cases (correct target), while they properly inhibited the incorrect answer in 64% of cases.

The association between DS-I scores and performance on the GO condition was not statistically significant (r = −0.177; p = 0.377). Importantly, DS-I scores significantly correlated with percent errors on the NoGO condition (r = 0.491, p = 0.009). Indeed, the worse the response-inhibition performance, the worse the ability of a subject to notice and adequately assess the severity of his/her own dyskinesias.

In the "NoGO" vs. "GO" fMRI contrast, expected activation was found in a functional cluster including the bilateral ACC and part of the pre-Supplementary Motor Area (pre-SMA), as shown in **Figure 1**. Linear correlations between neural response during response-inhibition and DRSA scores (as expressed by DS-I) are summarised in **Table 3** and depicted in **Figures 2**, **3**. DS-I scores negatively correlated with the NoGO/GO response in the bilateral ACC, bilateral anterior insular cortex (AIC) and right dorsolateral prefrontal cortex (DLPFC) (pFWE < 0.05) (see **Table 4**).

# DISCUSSION

We studied twenty-seven PD patients to better elucidate the link between brain dysfunction and concomitant cognitive-behavioural disturbances (McGlynn and Schacter, 1989; Lezak et al., 2004; Amanzio et al., 2011; Palermo et al., 2014; Maier et al., 2016), such as DRSA. First, we confirmed the specific and central role of the ACC in the response-inhibition function. Then, we identified a novel negative correlation between DS-I scores and fMRI responses from the NoGO/GO contrast, suggesting that, in our sample, DRSA was specifically associated with reduced functional recruitment of cingulo-frontal (bilateral ACC – R DLPFC) and cingulo-opercular (bilateral ACC – bilateral AIC) regions during the employed response-inhibition task. These findings cannot be merely attributed to cognitive impairment, since all PD patients obtained scores above cut-off on the overall neuropsychological test battery.

Maximum scores for the neurological examination are shown in square brackets. mg, milligrammes; H&Y, Hoehn and Yahr scale; SD, Standard Deviation; Q1, first quartile; Q3, third quartile; N, frequency; MDS-UPDRS, Movement Disorder Society – Unified Parkinson Disease Rating Scale.

Awareness of levodopa-induced dyskinesias is also shown. Where possible, the maximum scores for each test are shown in square brackets. Wherever there is a normative value, the cut-off scores are given in the statistical normal direction; the values refer to the normative data for healthy controls matched according to age and education. Cells in grey indicate the absence of a normative cut-off. N, frequency; AS, Apathy Scale; BDI, Beck Depression Inventory; BAI, Beck Anxiety Inventory; YMRS, Young Mania Rating Scale; BPRS 4.0, Brief Psychiatric Rating Scale version 4.0; MMSE, Mini-Mental State Examination; TMT, Trail Making Test; FAS, Verbal Fluency; WCST, Wisconsin Card Sorting Test; GAM, Global Awareness of Movement Disorders; DS-I, Dyskinesias Subtracted-Index.

It has been reported that DRSA «is characterised by a failure to acknowledge a particular neuropsychological deficit relative to specific functions, i.e., in the case in question, "action"» (Amanzio et al., 2014; Palermo et al., 2017). The phenomenon has attracted growing interest in recent years (Jenkinson et al., 2009; Amanzio et al., 2010, 2014; Maier et al., 2012, 2016; Pietracupa et al., 2013; Palermo et al., 2017). In previous studies, approximately half of PD patients exhibited DRSA to some extent (Amanzio et al., 2010; Maier et al., 2012; Pietracupa et al., 2013). In particular, our own studies observed reduced awareness of dyskinesias in 44% (Amanzio et al., 2010) and 53% (Amanzio et al., 2014) of enrolled subjects. Moreover, in the study by Sitek et al., 43% of patients rated their dyskinesias as less severe than did their caregivers (Sitek et al., 2011). More recently, DRSA was reported in 61% and in 23% of PD patients (Maier et al., 2012, 2016; Pietracupa et al., 2013).



We have demonstrated elsewhere (Amanzio et al., 2010, 2014; Palermo et al., 2017) a significant association between DRSA and reduced functional recruitment of the cingulo-frontal and cingulo-opercular pathways due to prolonged iatrogenic overstimulation. We have also already discussed (Amanzio et al., 2010, 2014) possible grounds of inconsistency between our observations and those of others who did not find such an association using neuropsychological approaches (Jenkinson et al., 2009; Maier et al., 2012; Pietracupa et al., 2013; Amanzio et al., 2014). The limitations of these studies, which obtained negative results, have been previously reported (Amanzio et al., 2010, 2014).

This study provides possible evidence that chronic dopaminergic overstimulation of mesocorticolimbic circuitries, prolonged over the years, might be considered one of the mechanisms responsible for DRSA pathogenesis. Another possible intervening factor might also go "beyond" chronic dopaminergic overstimulation and affect the mesocorticolimbic circuitries on their own, as we should have ruled out potential confounding effects of replacement pharmacotherapy on our fMRI results: fMRI sequences were acquired in therapeutic washout and, moreover, LEDD was included as a covariate of no interest (i.e., nuisance regressor) in all individual first-level fMRI analyses, in order to control for any residual effect due to the "chronic" influence of therapy.

FIGURE 2 | Brain area negatively associated with DS-I scores in the "NoGO" vs. "GO" contrast.

In the case of fMRI contrasts, we focused only on the response-inhibition function as specifically elicited by the "NoGO" vs. "GO" condition and applied regional correction, considering significant results only within the ACC. This was done in order to evaluate, in our sample, the degree of significance in the expected cluster of activation. By contrast, the regression analysis was not restricted to the ACC alone, so as not to constrain our interpretations to one brain region highly likely to be involved in DRSA pathogenesis. We have here confirmed that the ACC is not only significantly active in the NoGO vs. GO contrast, but is also the area that, among all the areas that emerged from the linear regression analysis, expresses the main negative correlation peak with the DS-I scale. We can therefore suggest that the ACC could be considered the main hub to interpret DRSA, with the DLPFC and the insula holding the role of supporting actors.

Interestingly, the relationship we found between DRSA and reduced functional recruitment of the cingulo-frontal and cingulo-opercular pathways refers to regions engaged in loading executive-monitoring onto the processing of task-relevant information, so as to avoid interference by goal-irrelevant stimuli. In particular, the DLPFC is a principal region of the "cognitive-executive" network (CEN), while the ACC and AIC have been identified as major nodes of the "salience" network (SN). These macroscale networks are typically recognised as topographically and functionally distinct from the "default mode" network, which on the contrary subserves inwardly oriented (i.e., self-referential) processing during both wakeful rest and task-execution conditions (Fox et al., 2005; Dosenbach et al., 2007). These areas have to be considered as "hubs" of a wider cognitive control network, globally known as "task positive" (as opposed to the default mode, also called "task negative" network), that includes and connects different functional systems involved in response-inhibition, working memory, action monitoring, representation of the affective qualities of interoceptive signals and sensory events.

Peak activity coordinates are given in MNI space. Peak activities are significant at p < 0.05, FWE corrected for multiple comparisons at the voxel level. ACC, anterior cingulate cortex; AIC, anterior insular cortex; DLPFC, dorsolateral prefrontal cortex; r, right; l, left.

The ACC plays important roles in each of these functions (Dosenbach et al., 2006, 2007). Indeed, action monitoring is particularly important in situations that require higher processing capacity. In this case, the ACC is concerned with conflict monitoring in several contexts. This includes the online monitoring of responses allowing the identification of errors, as per earlier error-detection theories, and also the detection of conflict between different possible responses to a stimulus, event, or situation. Considering the above, the ACC is believed to be involved in attentional processes, particularly "attention for action" (Posner et al., 1988). Most recently, the conflict-monitoring model has been further revised in an effort to consider the findings related to a seeming role for the ACC in decision making (Botvinick, 2007). Interestingly, the joint action of the ACC and AIC can provide an eye-opening perspective on the functions of these regions, as they appear to constitute input (AIC) and output (ACC) hubs of a system based on "awareness of self." This system may be regarded as an "integrated awareness" of physical, affective and cognitive states, generated by the integrative functions of the AIC and then re-represented in the ACC, having the purpose of providing elements for the selection of, and preparation for, responses to inner or outer events. Our finding of a tight relationship between limited functional recruitment of the cingulo-frontal and cingulo-opercular regions and DRSA suggests that reduction in self-awareness of LID in PD could be interpreted as a specific impairment of an executive function related to metacognitive awareness (i.e., attention-for-action/target selection, motor response selection inhibition and error detection in performance monitoring), in line with our previous results obtained with a neuropsychological approach (Amanzio et al., 2010, 2014; Palermo et al., 2017).

# LIMITATIONS

The present study was carefully designed and reached its aims; however, some critical aspects have to be outlined. First, there is currently no consensus about standard tools for DRSA assessment, so opting for one scale in place of another could represent a confounding factor. However, DRSA is still detected using different instruments and none of them has been able to prevail as superior to the others. Second, our analyses were conducted on a relatively small sample, which might reduce statistical power to detect effects and limit generalisation of results. However, our study has to be considered as an exploratory attempt to investigate possible neural underpinnings of DRSA, using an effective and specific ACC-sensitive fMRI paradigm in a selected patient population. Indeed, our sample was clinically homogeneous in terms of disease duration, disease severity, and pharmacological treatment. Moreover, our patients did not present cognitive impairment or behavioural alterations that could compromise the DRSA assessment or the interpretation of the main results.

# CONCLUSION

This study of DRSA and its neural correlates has relevant clinical implications as this disorder is involved in diagnostic, nosological and prognostic factors that directly affect treatment adherence. Unawareness is often related to poor clinical outcomes and impaired psycho-social functioning. Unaware patients increase caregivers' burden as they are unable to track changes in their cognitive and behavioural status, thus requiring additional assistance. We believe that theoretical models of unawareness have greater clinical utility and are more effective if they integrate fMRI and neuropsychological data, given the relevance of detecting possible psycho-biological markers of this phenomenon in PD. Importantly, to the best of our knowledge, this is the first study to investigate the relationship between response-inhibition disabilities and DRSA in cognitively intact patients with PD, using a specific executive (ACC-sensitive) task during an event-related fMRI session.

# FUTURE PERSPECTIVES

It would be useful to consider the execution of a specific response-inhibition task along with the neurological evaluation and neuropsychological assessment in order to define "tailored" interventions in DRSA and adopt a personalised clinical approach avoiding increased doses of dopaminergic drugs, which would in turn enhance the risk of side effects. On the other hand, we have here shown that DRSA pathogenesis in PD may also be considered as "intrinsic" and not necessarily related to chronic dopaminergic overstimulation. Therefore, future studies will be helpful in order to further characterise DRSA features in PD, replicating our findings in a larger group of patients both in the on- and off-phase of daily replacement therapy.

#### ETHICS STATEMENT


The study was approved by the Ethics Committee "A.O.U. Città della Salute e della Scienza di Torino - A.O. Ordine Mauriziano - A.S.L. Città di Torino" as part of the core research criteria followed by the Neurological Units. All the implemented procedures ensured the safety, integrity, and privacy of patients. All subjects gave their informed written consent to participate in the study. No critical issues were noted with regard to either the fMRI acquisition or the neuropsychological assessment. Importantly, the study was conducted according to the principles set forth by the Declaration of Helsinki (59th WMA General Assembly, Seoul, October 2008) and in accordance with the Medical Research Involving Human Subjects Act (WMO).

# AUTHOR CONTRIBUTIONS

The study was based on a concept developed by MA who wrote the paper and took part in the review and critique processes as PI. SP organised the study, performed the neuropsychological assessment (organisation and execution), participated in the statistical analyses (execution and organisation, review and critique), and wrote the paper. RM and MS conducted the fMRI analyses (execution), and participated in interpretation of results and writing of the paper. MV organised and conducted the MRI acquisition, and participated in interpretation of results and writing of the paper. MZ, AR, and MR performed the neurological assessment (execution) and took part in the organisation of the study and in the diagnostic phase (organisation and diagnosis). GG and LL supervised the neurological evaluation and participated in writing of the paper (review and critique). All the contributors gave their approval of this version of the manuscript to be submitted.

# FUNDING

This work received a contribution from "Acamedia" (https://www.acamedia.unito.it/), in partnership with Collegio Carlo Alberto.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01765/full#supplementary-material

#### REFERENCES

association with motor asymmetry and motor phenotypes. Mov. Disord. 27, 1443–1447. doi: 10.1002/mds.25079


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Palermo, Lopiano, Morese, Zibetti, Romagnolo, Stanziano, Rizzone, Geminiani, Valentini and Amanzio. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Color of Noise and Weak Stationarity at the NREM to REM Sleep Transition in Mild Cognitive Impaired Subjects

Alejandra Rosales-Lagarde<sup>1,2</sup>\*, Erika E. Rodriguez-Torres<sup>3</sup>, Benjamín A. Itzá-Ortiz<sup>3</sup>, Pedro Miramontes<sup>4</sup>, Génesis Vázquez-Tagle<sup>2</sup>, Julio C. Enciso-Alva<sup>3</sup>, Valeria García-Muñoz<sup>3</sup>, Lourdes Cubero-Rego<sup>5</sup>, José E. Pineda-Sánchez<sup>6</sup>, Claudia I. Martínez-Alcalá<sup>1,2</sup> and Jose S. Lopez-Noguerola<sup>2,7</sup>

#### Edited by:

Celine R. Gillebert, KU Leuven, Belgium

#### Reviewed by:

Christian O'Reilly, École Polytechnique Fédérale de Lausanne, Switzerland

Dario Arnaldi, Università di Genova, Italy

Abdul Rauf Anwar, University of Engineering and Technology, Lahore, Pakistan

> \*Correspondence: Alejandra Rosales-Lagarde alexiaro@rocketmail.com

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 07 February 2018 Accepted: 22 June 2018 Published: 17 July 2018

#### Citation:

Rosales-Lagarde A, Rodriguez-Torres EE, Itzá-Ortiz BA, Miramontes P, Vázquez-Tagle G, Enciso-Alva JC, García-Muñoz V, Cubero-Rego L, Pineda-Sánchez JE, Martínez-Alcalá CI and Lopez-Noguerola JS (2018) The Color of Noise and Weak Stationarity at the NREM to REM Sleep Transition in Mild Cognitive Impaired Subjects. Front. Psychol. 9:1205. doi: 10.3389/fpsyg.2018.01205

<sup>1</sup> Consejo Nacional de Ciencia y Tecnología, Mexico City, Mexico, <sup>2</sup> Área Académica de Gerontología, San Agustín Tlaxiaca, Mexico, <sup>3</sup> Centro de Investigación en Matemáticas, Mineral de la Reforma, Mexico, <sup>4</sup> Facultad de Ciencias, Universidad Nacional Autónoma de México, Mexico City, Mexico, <sup>5</sup> Universidad Nacional Autónoma de México, Querétaro, Mexico, <sup>6</sup> Área Académica de Psicología, Universidad Autónoma del Estado de Hidalgo, San Agustín Tlaxiaca, Mexico, <sup>7</sup> Division of Molecular Psychiatry, Department of Psychiatry and Psychotherapy, University of Medicine, Goettingen, Germany

In Older Adults (OAs), Electroencephalogram (EEG) slowing in frontal lobes and a diminished muscle atonia during Rapid Eye Movement sleep (REM) have each been effective tracers of Mild Cognitive Impairment (MCI), but this relationship remains to be explored by non-linear analysis. Likewise, data provided by the EEG, EMG (Electromyogram) and EOG (Electrooculogram), the three required sleep indicators, during the transition from Non-REM (NREM) to REM sleep have not been related jointly to MCI. Therefore, the main aim of the study was to explore, using Detrended Fluctuation Analysis (DFA) and multichannel DFA (mDFA), the Color of Noise (CN) at the NREM to REM transition in OAs with MCI vs. subjects with good performances. The comparisons for the transition from NREM to REM were made for each group at each cerebral area, taking bilateral derivations to evaluate interhemispheric coupling and anteroposterior and posterior networks. In addition, stationarity analysis was carried out to explore if the three markers distinguished between the groups. Neuropsi and the Mini-Mental State Examination (MMSE) were administered, as well as other geriatric tests. One-night polysomnography was performed on 6 OAs with MCI (68.1 ± 3) and on 7 subjects without it (CTRL) (64.5 ± 9), and pre-REM and REM epochs were analyzed for each subject. Lower scores for attention, memory and executive functions and a greater index of arousals during sleep were found for the MCI group. Results confirmed that EOGs constituted significant markers of MCI, increasing the CN for the MCI group in REM sleep. The CN of the EEG from pre-REM to REM was higher for the MCI group, whereas the opposite was found for the CTRL group, at frontotemporal areas. Frontopolar interhemispheric scaling values also followed this trend, as did right anteroposterior networks. EMG Hurst values for both groups were lower than those for EEG and EOG. Stationarity analyses showed differences between stages in frontal areas and right and left EOGs for both groups. These results may demonstrate the breakdown of fractality of areas especially involved in executive functioning and the way weak stationarity analyses may help to distinguish between sleep stages in OAs.

Keywords: NREM to REM sleep, DFA, mDFA, stationarity, mild cognitive impairment

#### 1. INTRODUCTION

#### 1.1. Rapid Eye Movement Sleep and Mild Cognitive Impairment

Rapid Eye Movement (REM) sleep enhances the information flow among functional networks. Well-defined cortico-cortical and thalamo-cortical networks (Steriade and Amzica, 1996) operating at higher frequencies, in contrast with the large-scale synchronizations of Slow Wave Sleep (SWS), have been demonstrated (Steriade et al., 1996). During REM sleep, in comparison to the wakeful state and Non-Rapid Eye Movement (NREM) sleep in humans, a decoupling of frontal vs. posterior areas occurs (Pérez-Garci et al., 2001; Corsi-Cabrera et al., 2003), and temporal coupling increases among homologous regions of the two cerebral hemispheres and among posterior regions (Corsi-Cabrera et al., 1987, 1989), as represented in **Figure 1**.

Supporting evidence comes from the effects of selective REM sleep deprivation on the interhemispheric correlation of gamma frequencies in frontal lobes on the recovery night (Corsi-Cabrera et al., 2014) and from a higher prefrontal rebound of frontal gamma synchronization during subsequent wakefulness in young subjects performing executive tasks (Corsi-Cabrera et al., 2015).

Language, planning, purposive action, voluntary control of attention, working memory, evaluation, decision making, inhibition of stimuli and irrelevant responses, sequential organization of new and complex information (Fuster, 1999), as well as rule guided behavior have been together classified as executive functions (Fuster, 2005; Bunge and Wallis, 2008). In Mild Cognitive Impairment (MCI) it has been widely demonstrated that some of these functions are distorted. Patients diagnosed with MCI may have subjective mental complaints about their cognitive functioning, corroborated by a near relative or friend, or lower performances considering age and education standards on neuropsychological tests (Petersen, 2004).

Physiological indicators of cognitive impairment in the elderly during REM sleep comprise greater absolute and relative power in slower frequencies in frontal lateral regions vs. wakefulness (Brayet et al., 2015) and less muscle atonia (Chen et al., 2011). The above mentioned studies have included linear analyses of the Electroencephalogram (EEG) or the Electromyogram (EMG). In this paper, Detrended Fluctuation Analysis (DFA) and multichannel DFA (mDFA) are used to calculate the level of fractality in the NREM to REM transition from subjects with and without cognitive impairment.

REM sleep has long been associated with cognitive functions (Rasch and Born, 2013; Tononi and Cirelli, 2014). REM sleep plays a role in memory consolidation (Boyce et al., 2016, 2017; Peever and Fuller, 2016). After complex tasks, there are reactivations of neural circuits during REM sleep (Louie and Wilson, 2001). Long Term Potentiation (LTP) only happens during wakefulness and REM sleep, and, depending on the phase of the hippocampal theta rhythm, LTP can be enhanced or inhibited (Pavlides et al., 1988).

Generally, three REM sleep indicators appear in consecutive order. When entering this stage, spindles and high amplitude slow waves are absent, the EEG has abundant beta and gamma frequencies (Llinás and Ribary, 1993; Steriade and Amzica, 1996), there is an abrupt loss of voltage that occurs at an interval shorter than 2 seconds (Rosales-Lagarde et al., 2009), and, afterwards, the characteristic episodic REMs appear (Aserinsky and Kleitman, 1953; Rechtschaffen and Kales, 1968; AASM, 2007).

Evidence suggests the need to search for indicators of cognitive impairment in REMs. Before REMs, an internal attentive network comes into operation, for these are preceded by higher activations of the orbital region, amygdala and hippocampus (Ioannides et al., 2004) and a higher temporal coupling between the right frontal region and the midline (Corsi-Cabrera et al., 2008). Theta activity and Ponto-Geniculo-Occipital (PGO) waves come into phase and their generators receive a common activation (Karashima et al., 2002).

It is proposed that, given the important role of REM sleep for cognitive functions, scaling exponents must differ for people with MCI vs. controls in at least one of the sleep markers.

To examine the functional relationships mentioned above, the analyses included individual derivations and bilateral, anteroposterior and posterior networks of the EEG. The Electrooculogram (EOG) was subjected to DFA and mDFA as explained below, and the scaling exponents of the EMG were also obtained. In brief, according to the fractal analysis of signals, if the noise of the signal is pink, i.e., its scaling exponent is close to 1, the signal is taken to represent health, or a good correlation measure along a wide range of scales. On the contrary, over large scales, white noise (a scaling exponent close to 0.5) means the signal is random; over short scales, brown noise (a scaling exponent close to 1.5) is similar to Brownian motion.
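As an illustration of how a DFA scaling exponent separates the noise colors described above, the following is a minimal sketch in plain Python, not the custom MATLAB code used in the study. It assumes first-order (linear) detrending on non-overlapping windows; the function names, window sizes and the synthetic white and brown signals are illustrative assumptions only.

```python
import math
import random


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(signal, window_sizes):
    """Estimate the DFA scaling exponent of a 1-D signal (linear detrending)."""
    mean = sum(signal) / len(signal)
    # Step 1: integrate the mean-centred signal to obtain the profile.
    profile, total = [], 0.0
    for value in signal:
        total += value - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in window_sizes:
        n_windows = len(profile) // n
        xs = list(range(n))
        squared_residuals = 0.0
        # Step 2: remove a local linear trend in each non-overlapping window.
        for w in range(n_windows):
            segment = profile[w * n:(w + 1) * n]
            slope, intercept = linear_fit(xs, segment)
            squared_residuals += sum(
                (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment)
            )
        # Step 3: root-mean-square fluctuation F(n) at this window size.
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))

    # Step 4: the exponent is the slope of log F(n) against log n.
    exponent, _ = linear_fit(log_n, log_f)
    return exponent


# Synthetic signals (assumptions for illustration, not EEG data).
rng = random.Random(0)  # fixed seed so the sketch is reproducible
white = [rng.gauss(0.0, 1.0) for _ in range(4096)]
brown, level = [], 0.0
for step in white:
    level += step  # brown noise as the running sum of white noise
    brown.append(level)

sizes = [16, 32, 64, 128, 256]
alpha_white = dfa_exponent(white, sizes)
alpha_brown = dfa_exponent(brown, sizes)
print(f"white: {alpha_white:.2f}  brown: {alpha_brown:.2f}")
```

On these synthetic inputs the white-noise estimate should fall near 0.5 and the brown-noise estimate near 1.5; a pink (1/f) signal would be expected to fall between the two, near 1.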

According to Nikulin and Brismar (2005), drowsy states have higher Hurst values than those of wakefulness. In the studies of Linkenkaer-Hansen et al. (2001), the eyes-open condition, comprising beta frequencies, has smaller scaling values than the eyes-closed condition, where alpha frequencies at parietal and occipital regions appear. In Weiss et al. (2009) and Weiss et al. (2011), healthy subjects younger than ours were evaluated to obtain the Hurst values for the different sleep stages. To compute the Hurst Exponent, Weiss et al. (2009) and Weiss et al. (2011) employed the R/S statistics or rescaled adjusted range of Mandelbrot and Taqqu (Mandelbrot, 1977). They found higher values in Hurst measures in NREM stage 4 versus NREM stage 2 and REM. Hurst values at REM sleep in frontal regions were lower than in all other stages.
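For readers unfamiliar with the rescaled-range (R/S) approach mentioned above, the following sketch shows the classical form of the estimate in plain Python. It is a simplified illustration under stated assumptions (non-overlapping chunks, no small-sample bias correction, hypothetical function names and window sizes) and is not the procedure of Weiss et al.

```python
import math
import random


def rescaled_range(chunk):
    """R/S of one chunk: range of cumulative deviations over the standard deviation."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std


def hurst_rs(series, window_sizes):
    """Slope of log(mean R/S) against log(window size)."""
    log_n, log_rs = [], []
    for n in window_sizes:
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        mean_rs = sum(rescaled_range(c) for c in chunks) / len(chunks)
        log_n.append(math.log(n))
        log_rs.append(math.log(mean_rs))
    # Least-squares slope of the log-log relationship.
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    sxx = sum((x - mean_x) ** 2 for x in log_n)
    return sxy / sxx


# Synthetic uncorrelated noise (an assumption for illustration, not sleep data).
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
hurst_noise = hurst_rs(noise, [16, 32, 64, 128, 256, 512])
print(f"estimated Hurst exponent: {hurst_noise:.2f}")
```

For uncorrelated noise the estimate should lie in the vicinity of 0.5, although the uncorrected R/S statistic is known to be biased upward for short windows, which is one reason the DFA and R/S values reported in different studies are not always directly comparable.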

A difference between the signals of the two groups is expected if the structural complexity of dendritic arborizations, which possess a fractal anatomy, is distorted in the group with impairments. Also, the replay of activity in sleep, which may mimic that of wakefulness and could be associated with the functional relationships described above, may be altered in the MCI group. These relationships include the inverse functional relationship found by Corsi and collaborators, mirrored by a decrease in the cross-correlations of anteroposterior networks (Pérez-Garci et al., 2001; Corsi-Cabrera et al., 2003) and an increase in posterior and interhemispheric networks (Corsi-Cabrera et al., 1987, 1989). Although these results are based on linear analyses and the present paper employs DFA and mDFA, there is evidence of lost long-range anatomical connections in patients with MCI and dementia (Andrews-Hanna et al., 2007). In particular, heteromodal association networks, such as frontoparietal and hippocampal connectivity, become more vulnerable with age and dementia than short-range connections, such as those of the primary sensorimotor cortices (Li et al., 2017). Considering the results of Weiss et al. (2009), showing greater values in the SWS stages, and the study of Brayet et al. (2015), revealing slower activity in MCI, subjects with MCI may have greater Hurst values that would reveal their cognitive disadvantages.

Also, since in a previous report (Rosales-Lagarde et al., 2017) the percentage of stationarity of REM sleep was lower than that of NREM sleep and wakefulness, the degree of stationarity was obtained as an index to compare NREM vs. REM sleep in both groups.

# 2. METHODS

#### 2.1. Subjects

Most of the 115 Older Adults (OAs) evaluated with the cognitive and emotional tests mentioned below attended the Centro Gerontológico Integral (CGI) at Punta Azul in Pachuca, Hidalgo, Mexico. OAs were informed about the aims of the research and signed an informed consent form. The project was approved by the research Ethics committee. A clinical interview was first conducted to rule out epilepsy or psychiatric disorders. OAs were also asked about personal and family diseases. Cognitive assessment by the Neuropsi and the Mini-Mental State Examination (MMSE), as well as the emotional evaluation by the Geriatric Depression Scale (GDS) and a Scale for the detection of Anxiety for the Elderly (the Short Anxiety Screening Test: SAST), were administered. The Katz Index of Independence in Activities of Daily Living (Katz et al., 1970) was also administered to rule out dementia, and all tests were validated in Spanish (Ugalde, 2010). The tests were rated by trained experts (ARL and GVT). Later, according to the results of the tests, the OAs were divided into two main groups: a control group (CTRL), with normal functioning on the Neuropsi and all its scales and subscales and with undiminished daily activities, and an MCI group, with similar results in the daily living tests but with at least one subscale on the Neuropsi showing three standard deviations below the mean. **Table 1** shows the results for 13 subjects, 6 for the MCI group and 7 for the CTRL group. Another subject who was recorded and classified as having MCI had facial paralysis and, due to technical problems, did not complete the whole PSG study.

As presented in **Table 1**, participants did not differ as regards age or education. The MMSE showed marginal differences between groups and daily living activities were maintained. Hypertension, diabetes and a thyroid problem were being treated with medication. In answer to the question "Do you sleep well?" on the SAST scale of anxiety, 2 members of the MCI group and 2 of the CTRL group responded "never or rarely" or "occasionally."

Memory complaints, as measured on the GDS by the only question referring to altered memory, were scarce, because only one subject from the MCI group responded affirmatively. When the question was about having problems concentrating, the same subject from the MCI group answered positively. Nevertheless, when the question was about the mind being as clear as before, four CTRLs answered negatively and only two of the MCI group did, though objectively there were several scores below the mean in memory subscales in the MCI group.

CTRL, Control group; MCI, Mild Cognitive Impairment; MMSE, Mini-Mental State Examination; GDS, Geriatric Depression Scale; SAST, Short Anxiety Screening Test; p < 0.05.

#### 2.2. Neuropsychological Testing

The Neuropsi test measures neuropsychological functions. It was developed at the Universidad Nacional Autónoma de México and has been validated in Mexico with standards varying according to age and educational level (Ostrosky-Solís et al., 1999). The maximum score on the test is 130. The Neuropsi test has clearly distinguished between normal, cognitively impaired and demented subjects (Ostrosky-Solís et al., 1999; Abrisqueta-Gomez et al., 2008; Montes-Rojas et al., 2012) and comprises several scales and subscales:


However, we excluded one of the subscales from the analysis as a basis for diagnosing MCI because only one of the 13 subjects could correctly continue the written sequence. Significant differences in the total score of the Neuropsi and the scales of "Attention," "Memory," and "Executive functions" were found between the groups (**Table 2**).

#### 2.3. Procedure

The administration of the battery of tests was carried out either at the CGI or at the Polyclinic belonging to the Department of Gerontology of the School of Health Sciences of the Universidad Autónoma del Estado de Hidalgo. OAs who satisfied the criteria were recorded at the Laboratory of Sleep, Emotion and Cognition, under the direction of AR-L, located at the Polyclinic. OAs filled out a sleep questionnaire, received instructions to continue their normal activities prior to the study, and were told to avoid alcoholic drinks or energizers for 24 h before the study and not to take naps on the day they were to stay in the laboratory. OAs were scheduled to arrive in the afternoon, at least 4 h before their normal bedtime, and performed some cognitive tasks (not presented here) before going to sleep.

CTRL, Control; MCI, Mild Cognitive Impairment. Student t-tests, p < 0.05. Significant results for Student t-tests are indicated in bold.

#### 2.4. Polysomnography

The EEG was recorded with a MEDICID-5 with 26 amplifiers. Nineteen silver chloride electrodes were placed according to the International 10–20 System (FP1, FP2, F3, F4, F7, F8, C3, C4, T3, T4, T5, T6, P3, P4, O1, O2, FZ, CZ, and PZ) and linked ears were used as reference. The EMG was measured bipolarly with two electrodes located on the chin. The EOG was recorded monopolarly with 2 electrodes, one a centimeter above and one below the external edge of each eye, also referred to linked ears. A leg electrode was also placed to detect whether Restless Leg Syndrome (RLS) was present. Due to technical problems, no record of leg movements was retained for 6 of the 13 subjects (one from the MCI group and five from the CTRL group), so no statistical analysis of periodic leg movements could be performed to compare the groups. Filters were set between 0.1–100 Hz for the EEG, 10–70 Hz for the EMG, and 0.3–15 Hz for the EOG. A notch filter was centered at 60 Hz to avoid contamination. Impedance was kept below 10 kΩ. Data were digitized with a sampling frequency of 512 Hz and a 16-bit A/D converter and were stored on the same MEDICID-5 system. The recording began after calibrating the signals and bidding the participant good night. In the morning, OAs filled out a Likert scale about their quality of sleep. LC-R, a trained Neurophysiologist, analyzed the signals offline to score the recorded data. Sleep was classified by the AASM criteria (AASM, 2007) and the following sleep architecture variables were calculated: Total Time in Bed (TTB), Total Sleep Time (TST), sleep efficiency, latencies, number of minutes and percentages of sleep stages N1, N2, and N3, REM, Wakefulness After Sleep Onset (WASO), number of leg movements, index of periodic leg movements per hour of sleep, number of waking periods per night, index of waking periods per hour of sleep, number of arousals and index of arousals per hour of sleep.
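The sleep architecture variables listed above are simple ratios. The sketch below shows how they could be derived, assuming the conventional definitions (sleep efficiency as TST over TTB, indices as events per hour of sleep, stage percentages relative to the summed stage time). The function names and the numbers are hypothetical and are not taken from the study.

```python
def sleep_efficiency(total_sleep_time_min, total_time_in_bed_min):
    """Sleep efficiency in percent: TST as a fraction of TTB."""
    return 100.0 * total_sleep_time_min / total_time_in_bed_min


def events_per_hour(event_count, total_sleep_time_min):
    """Index of events (arousals, awakenings, leg movements) per hour of sleep."""
    return event_count / (total_sleep_time_min / 60.0)


def stage_percentages(stage_minutes):
    """Percentage of the summed stage time spent in each sleep stage."""
    total = sum(stage_minutes.values())
    return {stage: 100.0 * minutes / total for stage, minutes in stage_minutes.items()}


# Hypothetical night, for illustration only (not a participant's data).
stages = {"N1": 30, "N2": 180, "N3": 60, "REM": 90}
tst = sum(stages.values())  # total sleep time in minutes
print(sleep_efficiency(tst, 480))
print(events_per_hour(60, tst))
print(stage_percentages(stages))
```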

Signals were converted to text format, and DFA and mDFA were obtained with MATLAB version R2015 (see data in Supplementary Material for the details of the custom MATLAB code). As stated below, each 30 s epoch was tested for its degree of stationarity or non-stationarity with the free R statistical software.
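To make the epoch arithmetic concrete: at the reported sampling frequency of 512 Hz, a 30 s epoch contains 15,360 samples. The sketch below splits a recording into such epochs and summarizes per-epoch stationarity decisions as a percentage. The stationarity test itself was run in R and is not reproduced here; the boolean flags and the function names are placeholders assumed for illustration.

```python
SAMPLING_RATE_HZ = 512  # sampling frequency reported in the text
EPOCH_SECONDS = 30      # epoch length reported in the text


def split_into_epochs(samples, sampling_rate_hz=SAMPLING_RATE_HZ,
                      epoch_seconds=EPOCH_SECONDS):
    """Cut a 1-D recording into complete, non-overlapping epochs.

    Trailing samples that do not fill a whole epoch are discarded.
    """
    epoch_length = sampling_rate_hz * epoch_seconds
    n_epochs = len(samples) // epoch_length
    return [samples[i * epoch_length:(i + 1) * epoch_length] for i in range(n_epochs)]


def stationarity_percentage(is_stationary_flags):
    """Percentage of epochs judged stationary by some external test."""
    return 100.0 * sum(is_stationary_flags) / len(is_stationary_flags)


# Placeholder recording: two full epochs plus 100 leftover samples.
recording = [0.0] * (2 * SAMPLING_RATE_HZ * EPOCH_SECONDS + 100)
epochs = split_into_epochs(recording)
print(len(epochs), len(epochs[0]))

# Hypothetical outcomes of a stationarity test on four epochs.
print(stationarity_percentage([True, False, True, True]))
```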

#### 2.5. Statistical Analysis

#### 2.5.1. Cognitive Tasks

Age, educational levels and total scores on each subscale of the Neuropsi and MMSE were compared between groups using independent Student t-tests. As presented in **Table 1**, age and educational levels did not show significant differences between groups. To explore the relationship between age and educational level, a Pearson's correlation analysis was performed, revealing an almost perfect negative correlation. Also, a correlation analysis between the Neuropsi and the MMSE was carried out to verify the equivalence of the Neuropsi and the MMSE as tests to diagnose cognitive impairment. To rule out possible effects of age on the Neuropsi, another correlation test was performed between the total score of the Neuropsi and age.
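The two statistics used in this subsection can be sketched as follows. This is a minimal plain-Python illustration, assuming the pooled-variance form of the independent Student t-test; it returns the statistic and degrees of freedom only (the p-value lookup is omitted), and the scores are invented for illustration rather than taken from Table 1 or Table 2.

```python
import math
import statistics


def student_t(group_a, group_b):
    """Independent two-sample Student t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical total scores, for illustration only (not the study's data).
ctrl_scores = [105, 110, 98, 112, 107, 101, 109]
mci_scores = [88, 95, 91, 84, 97, 90]
t_value, dof = student_t(ctrl_scores, mci_scores)
print(f"t = {t_value:.2f}, df = {dof}")
```

With 7 and 6 subjects per group, as in the present sample, the test has 11 degrees of freedom.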

#### 2.5.2. Polysomnography

Ten 30 s epochs, mostly consecutive, before the first episode of REM sleep and ten during REM sleep were scored for each subject. The DFA and the mDFA were calculated for each epoch and each group. Some subjects did not have enough epochs in this episode, in which case another REM episode was chosen. Epochs with artifacts were rejected by visual inspection. Rejection applied to few subjects only, affecting 0 and 46 epochs for the CTRL group in NREM and REM, respectively, and 6 and 15 epochs for the MCI group in NREM and REM sleep, respectively. Pearson's chi-square tests were performed for each stage to rule out any unequal contribution of the N1, N2, or N3 stages. Likewise, chi-square tests were performed to compare the number of arousals and the index of arousals per hour of sleep. When the frequency was lower than 5, the Yates correction was used.
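As an illustration of the chi-square comparison with the Yates correction mentioned above, the following is a minimal Python sketch for a 2 × 2 contingency table (df = 1). It is not the authors' code: the function names `chi_square_2x2` and `min_expected_count` are hypothetical, and the rule of applying the correction when the smallest *expected* count falls below 5 is an assumption, since the text only says "frequency."

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic (df = 1) for the 2x2 table [[a, b], [c, d]].

    With yates=True, the continuity correction subtracts N/2 from
    |ad - bc| (never letting the difference go below zero).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


def min_expected_count(a, b, c, d):
    """Smallest expected cell count under independence (assumed trigger for Yates)."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return min(r * k / n for r in rows for k in cols)


# Hypothetical counts, for illustration only (not data from this study).
table = (1, 5, 6, 2)
use_yates = min_expected_count(*table) < 5
statistic = chi_square_2x2(*table, yates=use_yates)
```

The statistic would then be compared with a chi-square distribution with one degree of freedom; that step is omitted here to keep the sketch free of external libraries.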

Correlation analyses were performed between the Neuropsi scores and the Hurst exponents, considering the mean of the ten epochs for each stage at individual derivations and multichannel values; another correlation was computed between the Hurst exponents of these stages and age. Kolmogorov-Smirnov tests indicated that all DFA and mDFA data followed normal distributions. Mixed 2 × 2 ANOVAs were used to test for statistical differences, with group (CTRL and MCI) as the between-subject factor and condition (NREM and REM) as the within-subject factor, using DFA and mDFA for each channel and assessing the interhemispheric relationship (bilateral channels including the EOG), the anteroposterior networks (FP1-P3 and FP2-P4) and the posterior networks (O1-P3-T3 and O2-P4-T4). First, only the means of the NREM and REM epochs were considered for the analysis. Next, another mixed ANOVA was performed using not the means but the ten values for each set of channel and multichannel contrasts, with the Greenhouse-Geisser correction for repeated measures (see Figure S1 in Supplementary Material for detailed figures of the NREM to REM transition). By visual inspection, one subject from each group appeared to be an outlier, so the ANOVAs were repeated without these two subjects. All subjects were ultimately kept because the results did not differ qualitatively (see Table S2 in Supplementary Material). Wilcoxon tests were performed on the stationarity percentages between the ten epochs of each group and stage and throughout the whole recording (see Table S1 in Supplementary Material for the detailed percentages for the groups). Bonferroni corrections for multiple comparisons between channels were not performed because they would eliminate our results; the present findings should therefore be taken only as suggesting a tendency.

#### 3. THE COLOR OF NOISE

The power spectrum P of a time series is calculated using the Discrete Fourier transform. Many natural phenomena show a characteristic curve that can be fitted to the functional form:

$$P(f) \propto f^{\alpha} \tag{1}$$

where f is the frequency. In this case, the power spectrum is said to scale as a power law. Power laws are abundant in nature; a few examples include allometric laws in Biology (West et al., 1997), the inverse-square laws of Newton and Coulomb (Heering, 1992), and Kepler's third law (Gingerich, 1975). It is worth noting that Equation (1) implies scale invariance (Song et al., 2005): multiplying f by a constant factor a gives P(af) = (af)<sup>α</sup> = a<sup>α</sup>P(f) ∝ P(f). In other words, the curve P(f) is invariant to changes in the scale of the independent variable, since the resulting plot is the same as the original but scaled by the factor a<sup>α</sup> in the dependent variable.

Since the seminal work of George Kingsley Zipf (Zipf, 1949), the case when α is negative has been of special interest. Zipf examined several corpora in English and found that the frequency of words plotted against their rank follows a power law with exponent α = −1. **Figure 2** shows the log-log plot of power spectra following a power law for different values of α.

If α ≈ 0, all the frequencies are present with the same amplitude. If an analogy with the frequencies of electromagnetic radiation is established, then it is possible to state that the color of the time series is white, and the time series is often described as white noise. Following the analogy, the case α = −1 emerges from the presence of all the frequencies but more strongly dominated by low frequencies, corresponding to the red region of the color spectrum; as red plus white gives pink, the result is called pink noise. Pink noise seems to be ubiquitous in nature; for a number of examples, refer to Bak et al. (1987) and Bak (1996). If the power spectrum decays as f<sup>−2</sup>, the low frequencies are even more dominant; for historical reasons, this distribution is called brown noise, as it formally coincides with Brownian noise (Bak et al., 1987).
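To make the idea of a spectral exponent concrete, the following Python sketch estimates α in Equation (1) as the least-squares slope of the log-log power spectrum and attaches the color names used above. It is only an illustration under stated assumptions: the function names (`periodogram`, `loglog_slope`, `noise_color`) and the tolerance of 0.25 around the reference exponents are hypothetical choices, and the naive O(N²) Fourier sum is meant for short series only.

```python
import cmath
import math


def periodogram(x):
    """Naive DFT power spectrum at the positive frequencies k/N, k = 1..N//2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centered))
        freqs.append(k / n)               # cycles per sample
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def loglog_slope(xs, ys):
    """Least-squares slope of log(ys) against log(xs); all values must be > 0."""
    lx = [math.log(v) for v in xs]
    ly = [math.log(v) for v in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx


def noise_color(alpha, tol=0.25):
    """Label a spectral exponent; the tolerance is an arbitrary assumption."""
    if abs(alpha) < tol:
        return "white"
    if abs(alpha + 1.0) < tol:
        return "pink"
    if abs(alpha + 2.0) < tol:
        return "brown"
    return "other"


# An exact power law P(f) = 1/f is recovered as alpha = -1, i.e., pink noise.
example_freqs = [k / 100 for k in range(1, 51)]
example_power = [f ** -1.0 for f in example_freqs]
example_alpha = loglog_slope(example_freqs, example_power)
example_color = noise_color(example_alpha)
```

For a measured signal one would pass the output of `periodogram` to `loglog_slope`; in practice the raw periodogram is noisy, so averaging or binning the spectrum before fitting is usually advisable.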

# 4. DETRENDED FLUCTUATION ANALYSIS

DFA was introduced by Peng et al. (1995) to analyze non-stationary heartbeat time series. The purpose of the technique is to detect self-similar patterns even if they are embedded in a seemingly non-stationary frame. Furthermore, it has the added feature of avoiding the spurious detection of artificial self-similarity due to trends in the probability distribution function.

FIGURE 2 | Power spectra of a set of time series. The scale of both axes is logarithmic, meaning that the plots are power laws in linear scales; that is, a family of parabolas or hyperbolas depending on the sign of the exponent. A horizontal line in the power spectrum means that all the frequencies appear in the Fourier transform of the time series with the same power. In analogy with visible electromagnetic radiation, this case is known as "white noise." Of special interest are the cases when the slope of the line is −1 and −2. In the first scenario, the time series has all the frequencies but the low ones dominate; thus it has a mixture of white and red and the result is called "pink noise." When the slope is −2, the resulting noise (which has all the frequencies) is called "brown," not because the mixture of frequencies leads to that color but because it coincides with the power spectrum of Brownian motion. Extending the analogy, a rising line would contain all the frequencies but, as the high ones dominate, the resulting color would correspond to various bluish tones.

DFA starts with a discrete time series x(i), for i = 1, 2, . . . , N, which is then replaced by the integrated values y(k) so that a self-similar process is obtained. This integrated time series is defined by

$$y(k) = \sum_{i=1}^{k} \left( x(i) - \overline{x} \right), \tag{2}$$

where the average value $\overline{x}$ of the time series x(i) is given by $\overline{x} = \frac{1}{N}\sum_{i=1}^{N} x(i)$. The next step in DFA consists in measuring the vertical characteristic scale of the integrated time series. This is achieved by dividing y(k) into boxes of equal length n. For each of these boxes we perform a linear least-squares fit of the data, which is referred to as the local trend in that box. The ordinate of the fitted straight line is denoted by y<sub>n</sub>(k). We note that, more generally, y<sub>n</sub>(k) could be the ordinate of a polynomial fit of degree m; by choosing m > 1, we would remove not only constant or linear trends but also higher-order trends. To distinguish this particular approach we refer to the resulting method as DFAm, where m ≥ 0 is the degree of the polynomial fit. We now come to the detrending step of DFA: we subtract from y(k) the local linear trend y<sub>n</sub>(k) in each box. For a given box size n, the characteristic length scale of the fluctuations in the integrated and detrended series is:

$$\mathcal{F}(n) = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left( y(k) - y_n(k) \right)^2}. \tag{3}$$

Finally, we plot the values of log n against the values of log F(n) and observe a linear relationship which indicates the presence of a power law, in other words

$$\mathcal{F}(n) \sim n^{\alpha}.\tag{4}$$

The scaling exponent α is calculated as the slope of the line relating log F(n) and log n. If α is greater than 0.5, then there are persistent long-range correlations in x(i). If α is equal to 1, we obtain the so-called 1/f noise (Li and Holste, 2005), a case which has attracted a lot of interest from both physicists and biologists. If 0 < α < 0.5, anti-correlations are detected in x(i); that is, large values are expected to be followed by small values and vice versa (Peng et al., 1995). Hence, DFA can be regarded as a methodology for detecting scaling behavior in observational time series that may be affected by non-stationarities. This applies to our case with psychophysiological electric signals.
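The steps of Equations (2) to (4) can be sketched in a few lines of Python. This is a minimal illustration with linear detrending and is not the custom MATLAB code used in the study: the function names, the default box sizes, and the decision to use only the complete boxes (discarding the tail when N is not a multiple of n) are assumptions made for the example.

```python
import math


def integrate(x):
    """Equation (2): cumulative sum of the mean-centered series."""
    mean = sum(x) / len(x)
    profile, total = [], 0.0
    for value in x:
        total += value - mean
        profile.append(total)
    return profile


def local_trend(segment):
    """Ordinates of the least-squares straight line fitted to one box."""
    n = len(segment)
    mid = (n - 1) / 2.0
    mean = sum(segment) / n
    sxx = sum((i - mid) ** 2 for i in range(n))
    sxy = sum((i - mid) * (v - mean) for i, v in enumerate(segment))
    slope = sxy / sxx
    return [mean + slope * (i - mid) for i in range(n)]


def fluctuation(profile, n):
    """Equation (3), evaluated over the complete boxes of length n."""
    boxes = len(profile) // n
    total = 0.0
    for b in range(boxes):
        segment = profile[b * n:(b + 1) * n]
        trend = local_trend(segment)
        total += sum((s - t) ** 2 for s, t in zip(segment, trend))
    return math.sqrt(total / (boxes * n))


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / sxx


def dfa_exponent(x, box_sizes=(8, 16, 32, 64, 128)):
    """Equation (4): slope of log F(n) against log n (box sizes must be >= 3)."""
    profile = integrate(x)
    log_n = [math.log(n) for n in box_sizes]
    log_f = [math.log(fluctuation(profile, n)) for n in box_sizes]
    return fit_slope(log_n, log_f)
```

Under these assumptions, uncorrelated (white) noise should give an exponent near 0.5 and its running sum (a random walk) an exponent near 1.5. The range of box sizes strongly affects the estimate, so it should be chosen with the epoch length in mind.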

#### 5. MULTICHANNEL DETRENDED FLUCTUATION ANALYSIS

DFA allows the analysis of a single time series x(i) in order to assess the long-range correlation of the data involved, for example, in non-stationary heartbeat time series (Peng et al., 1995) or in the non-linear analysis of anesthesia dynamics (Zhang et al., 2001). A similar analysis is needed for time series consisting of several simultaneous inputs, so as to assess the long-range correlation of multichannel data. In our case, we record one-night polysomnography with several electrodes, because the information registered by a single electrode alone cannot give a complete picture of how the central nervous system is working.

Here we present a generalization of DFA, introduced by Rodriguez et al. (2011), which enables the analysis of multichannel data. Begin with a vector-valued time series $\vec{x}(i)$; that is, for each i = 1, 2, . . . , N, we have an m-dimensional vector $\vec{x}(i) = \left( x_1(i), x_2(i), \ldots, x_m(i) \right)$. We emphasize that m represents the number of inputs of each recording. The mDFA essentially implements the DFA methodology in each component of the m-dimensional time series while manipulating the vector-valued time series with the arithmetic of vectors in $\mathbb{R}^m$. We start the process by considering the integrated values of our vector-valued time series as in Equation (2), that is,

$$\vec{y}(k) = \sum_{i=1}^{k} \left( \vec{x}(i) - \widehat{\vec{x}} \right), \tag{5}$$

where $\widehat{\vec{x}} = \frac{1}{N}\sum_{i=1}^{N} \vec{x}(i)$ is the vector whose components are the average values of the corresponding components of the given vector-valued time series $\vec{x}(1), \vec{x}(2), \ldots, \vec{x}(N)$. This step results in a component-wise self-similar process. We now measure the vertical characteristic scale of the integrated time series in each component. For this purpose, we divide each component of the integrated time series into boxes of equal length n. For each of these boxes and for the data of each component, a linear least-squares fit (called the local trend of the component in that box) is performed. The vector of ordinates of the fitted straight lines is denoted by $\vec{y}_n(k)$. Just as in the DFA process, we detrend the integrated time series $\vec{y}(k)$ component by component by subtracting $\vec{y}_n(k)$. By modifying Equation (3) so that the individual contributions to the detrended fluctuations of every vector component i = 1, 2, . . . , m are taken into account, we define

$$\mathcal{F}(n) = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left\| \vec{y}(k) - \vec{y}_n(k) \right\|^2}. \tag{6}$$

When we plot the values of log n against the values of log F(n), a linear relationship is observed which indicates the presence of a power law, that is to say

$$\mathcal{F}(n) \sim n^{\alpha}. \tag{7}$$

Hence, the scaling exponent α represents the fluctuations and can be approximated as the slope of the line relating log F(n) and log n, as in the classic DFA situation.
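As a rough illustration of Equations (5) to (7), the sketch below extends single-channel DFA to several channels by summing the squared detrended residuals over all components before taking the square root. It is a hedged example rather than the implementation of Rodriguez et al. (2011): the function names, the default box sizes, and the use of complete boxes only are assumptions.

```python
import math


def integrate(x):
    """Cumulative sum of one mean-centered channel (one component of Eq. 5)."""
    mean = sum(x) / len(x)
    profile, total = [], 0.0
    for value in x:
        total += value - mean
        profile.append(total)
    return profile


def local_trend(segment):
    """Ordinates of the least-squares straight line fitted to one box."""
    n = len(segment)
    mid = (n - 1) / 2.0
    mean = sum(segment) / n
    sxx = sum((i - mid) ** 2 for i in range(n))
    sxy = sum((i - mid) * (v - mean) for i, v in enumerate(segment))
    slope = sxy / sxx
    return [mean + slope * (i - mid) for i in range(n)]


def mdfa_fluctuation(profiles, n):
    """Equation (6): the squared norm is the sum of squared residuals
    over all components, accumulated box by box."""
    boxes = len(profiles[0]) // n
    total = 0.0
    for profile in profiles:
        for b in range(boxes):
            segment = profile[b * n:(b + 1) * n]
            trend = local_trend(segment)
            total += sum((s - t) ** 2 for s, t in zip(segment, trend))
    return math.sqrt(total / (boxes * n))


def mdfa_exponent(channels, box_sizes=(8, 16, 32, 64)):
    """Equation (7): slope of log F(n) against log n.
    `channels` is a list of m equally long series, one per derivation."""
    profiles = [integrate(ch) for ch in channels]
    xs = [math.log(n) for n in box_sizes]
    ys = [math.log(mdfa_fluctuation(profiles, n)) for n in box_sizes]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / sxx
```

With a single channel this reduces to ordinary DFA. Because the components are summed without normalization, channels with larger amplitude contribute more to F(n); whether to standardize each channel beforehand is a design choice not addressed in this sketch.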

#### 6. STATIONARITY

Electrophysiological phenomena are typically regarded as complex signals, i.e., non-linear and non-stationary. We therefore assume that the time series x(i) comes from a non-stationary process X(t) with E(X(t)) = 0 and E(X<sup>2</sup>(t)) < ∞, which admits an evolutionary spectrum as defined in Priestley (1965). The test introduced by Priestley and Subba Rao (1969) makes use of the concept of the evolutionary (that is, possibly time-dependent) spectrum of a non-stationary process; the method consists essentially in testing the uniformity of that evolutionary spectrum evaluated at a set of different frequencies and instants in time.

For estimating h(t, ξ), the evolutionary spectral density function (evolutionary SDF) at frequency ξ and time t, the "double window" technique is used (Priestley, 1966). The algorithm of the double window is as follows: two functions w<sub>τ</sub> and g, referred to as windows, are chosen which satisfy the conditions


where $\Gamma(\xi) = \int_{-\infty}^{\infty} g(u) e^{iu\xi} \, du$ and $W_\tau(\lambda) = \int_{-\infty}^{\infty} w_\tau(t) e^{-i\lambda t} \, dt$. The window g is used to build an estimator U for the evolutionary SDF

$$U(t,\xi) = \sum_{u=t-T}^{t} g(u) x(t-u) e^{-i\xi(t-u)}. \tag{8}$$

The estimator U is shown in Priestley (1966) to be asymptotically unbiased, $E(U(t,\xi)) \approx h(t,\xi)$, but inconsistent, $\mathrm{Var}(U(t,\xi)) \approx h^2(t,\xi)$. Thus the second window $w_\tau(t)$ is used to build a second estimator, $\widehat{h}$, which is both asymptotically unbiased and asymptotically consistent

$$\widehat{h}(t,\xi) = \sum_{u=t-T}^{t} w_\tau(u) \left| U(t-u,\xi) \right|^2 \tag{9}$$

Furthermore, assuming the bandwidth of $|\Gamma(\theta)|^2$ to be small compared with the frequency-domain bandwidth of h(t, ξ), or the bandwidth of $W_\tau(u)$ to be small compared with the time-domain bandwidth of h(t, ξ), the following approximations can be obtained

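• $E\left(\widehat{h}(t,\xi)\right) \approx h(t,\xi)$
• $\mathrm{Var}\left(\widehat{h}(t,\xi)\right) \approx \frac{C}{\tau} h^2(t,\xi) \int_{-\infty}^{\infty} |\Gamma(\theta)|^4 \, d\theta$

Let $Y(t,\xi) = \log \widehat{h}(t,\xi)$; then

• $E\left(Y(t,\xi)\right) \approx \log h(t,\xi)$
• $\mathrm{Var}\left(Y(t,\xi)\right) \approx \frac{C}{\tau} \int_{-\infty}^{\infty} |\Gamma(\theta)|^4 \, d\theta$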


It is important to notice that the variance of Y is asymptotically independent of both ξ and t. Alternatively, we may write

$$Y(t,\xi) = \log\left(h(t,\xi)\right) + \varepsilon(t,\xi), \tag{10}$$

where approximately $E\left(\varepsilon(t,\xi)\right) = 0$ and $\mathrm{Var}\left(\varepsilon(t,\xi)\right) = \sigma^2 = \frac{C}{\tau} \int_{-\infty}^{\infty} |\Gamma(\theta)|^4 \, d\theta$.

Let us choose a set of times t<sub>1</sub>, t<sub>2</sub>, . . . , t<sub>I</sub> and a set of frequencies ξ<sub>1</sub>, ξ<sub>2</sub>, . . . , ξ<sub>J</sub> and write Y<sub>i,j</sub> = Y(t<sub>i</sub>, ξ<sub>j</sub>), f<sub>i,j</sub> = log h(t<sub>i</sub>, ξ<sub>j</sub>) and ε<sub>i,j</sub> = ε(t<sub>i</sub>, ξ<sub>j</sub>) for i = 1, 2, . . . , I and j = 1, 2, . . . , J. Then we obtain a model

$$Y_{i,j} = f_{i,j} + \varepsilon_{i,j} \tag{11}$$

where the {ε<sub>i,j</sub>} can be regarded as uncorrelated if the points (t<sub>i</sub>, ξ<sub>j</sub>) are sufficiently far apart. If, in addition, the number of points is sufficiently large, then the {ε<sub>i,j</sub>} follow a normal distribution, that is, ε<sub>i,j</sub> ∼ N(0, σ<sup>2</sup>). With this assumption, we may rewrite our model as the usual model of two-factor analysis of variance, that is, as

$$H_0: \quad Y_{i,j} = \mu + \alpha_i + \beta_j + \gamma_{i,j} + \varepsilon_{i,j} \tag{12}$$


TST, Total Sleep Time; N1, Stage 1; N2, Stage 2; N3, Stage 3+4; REM, Rapid Eye Movement sleep; WASO, Wakefulness After Sleep Onset; MCI, Mild Cognitive Impairment; CTRL, Control. Student t-tests, p < 0.05. Significant results for ANOVA tests are indicated in bold.

For a stationary process, it is fairly straightforward that $E\left(\widehat{h}(t,\xi)\right) \approx h(\xi)$ is independent of t. Therefore, the degree of stationarity may be tested by means of the model

$$H_1: \quad Y_{i,j} = \mu + \beta_j + \varepsilon_{i,j}. \tag{13}$$

against the model H<sub>0</sub> in (12). We now construct the table of the standard analysis of variance for a two-factor design, as shown in **Table 3**.

The first step of the test uses the statistic $S_{I+R} \sim \sigma^2 \chi^2_{(I-1)(J-1)}$, which follows a (scaled) chi-squared distribution, to test whether the interaction γ is 0. When the interaction is not significant, we proceed to the statistic $S_T \sim \sigma^2 \chi^2_{I-1}$, which tests whether the time effect α is 0. Stationarity is not rejected when neither the interaction nor the time effect is significant. For more information see Priestley (1981).
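The two-factor decomposition behind this test can be sketched as follows. The example assumes that the matrix of log-spectral estimates Y<sub>i,j</sub> (rows indexed by time, columns by frequency) and the variance σ<sup>2</sup> are already available; estimating them with the double-window technique is not shown. The function name `anova_stationarity` and the returned dictionary layout are hypothetical, and this is not the R routine used in the study.

```python
def anova_stationarity(Y, sigma2):
    """Two-factor sums of squares for Y[i][j] = log-spectrum estimate at
    time t_i (rows) and frequency xi_j (columns), scaled by a known sigma2.

    Returns, for each effect, the pair (S / sigma2, degrees of freedom);
    each scaled statistic is to be compared with a chi-square distribution.
    """
    n_times = len(Y)
    n_freqs = len(Y[0])
    grand = sum(sum(row) for row in Y) / (n_times * n_freqs)
    row_means = [sum(row) / n_freqs for row in Y]
    col_means = [sum(Y[i][j] for i in range(n_times)) / n_times
                 for j in range(n_freqs)]

    # Between-times sum of squares (time effect).
    s_time = n_freqs * sum((m - grand) ** 2 for m in row_means)
    # Between-frequencies sum of squares (frequency effect).
    s_freq = n_times * sum((m - grand) ** 2 for m in col_means)
    # Interaction plus residual sum of squares.
    s_inter = sum((Y[i][j] - row_means[i] - col_means[j] + grand) ** 2
                  for i in range(n_times) for j in range(n_freqs))

    return {
        "interaction": (s_inter / sigma2, (n_times - 1) * (n_freqs - 1)),
        "time": (s_time / sigma2, n_times - 1),
        "frequency": (s_freq / sigma2, n_freqs - 1),
    }


# Toy matrix whose values depend on frequency only (no time dependence).
toy = [[1.0 + j for j in range(4)] for _ in range(3)]
toy_result = anova_stationarity(toy, sigma2=1.0)
```

In the toy matrix the time and interaction statistics are zero, which is the pattern expected for a stationary epoch; for real data, each scaled statistic would be compared with the chi-square quantile for its degrees of freedom, first for the interaction and then for the time effect.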

# 7. RESULTS

#### 7.1. Neuropsychological Testing

As stated above, age and education showed a strong negative correlation [r(11) = −0.82, p < 0.001]. The MMSE and the Neuropsi scores were significantly correlated according to the Pearson correlation coefficient [r(11) = 0.65, p < 0.01]. Neuropsi scores and age were not correlated [r(11) = −0.28, p < 0.33].

#### 7.2. Polysomnography

As mentioned above, subjects reported few sleep complaints on the SAST. On their sleep questionnaires, however, more complaints appeared in the CTRL group: four members of the CTRL group declared their sleep was not very good and three of them declared it good, while all members of the MCI group stated their sleep was good. According to the sleep questionnaire, mean subjective general latency to sleep onset was 21.5 min for the CTRL group (range from 10 to 60 min) and 20.7 min for the MCI group (range from 10 to 60 min), with no significant differences between them [t(11) = 0.09, p < 0.93]. Nevertheless, according to the Likert scale, the night at the laboratory was evaluated as good (85 and 94% for the CTRL and MCI groups, respectively), without significant differences between groups [t(11) = 1.57, p < 0.14].

The neurophysiologist found several cases of suspected RLS and/or fragmented sleep among MCI subjects. The leg movement indexes per hour for the MCI group were abnormally high (range from 16.88 to 70.87 movements per hour). Only two CTRL subjects could be recorded for leg movements: one subject had 5.32 sequences of Periodic Leg Movements (PLM) and the other had an index of 46.05 PLM per hour of sleep. The number of awakenings per hour was not abnormal. For the CTRL group, awakenings per hour ranged from 1.8 to 4.5 for all subjects, and the MCI group had from 1.9 to 5.1 awakenings per hour. Nevertheless, when arousals were considered, four CTRL members had a range of arousals per hour of 3.5 to 6.5 and three of them from 13.23 to 24.04. In contrast, two subjects from the MCI group had a range of 2.6 to 7.6 arousals per hour, while four of them had 12.59 to 35.9 arousals per hour. The number of arousals reached statistical significance, the frequency of arousals for the MCI group being higher than for the CTRL group (chi = 6.26, df = 1, p < 0.01). In addition, another subject belonging to the MCI group had a diminished latency to REM sleep suggestive of narcolepsy. A subject of the CTRL group presented a diminished latency to REM sleep suggestive of depression, but the clinical assessment ruled out that possibility. Despite those results, there were no significant differences between the groups regarding TST, percentages of sleep stages, efficiency, or latencies to sleep onset or to REM sleep. **Table 4** shows these polysomnographic results, the epochs recorded as "wakefulness," the index of wakefulness per hour of sleep, and the overall number of arousals and arousals per hour.

TABLE 5 | Scaling results for each group in the transition from NREM to REM at individual derivations.

CTRL, Control group; MCI, Mild Cognitive Impairment group; REM, Rapid Eye Movement sleep; NREM, Non-REM sleep; LOG, Left Oculogram; ROG, Right Oculogram; EMG, Electromyogram. Significant results for ANOVA tests are indicated in bold. In each case, the Greenhouse-Geisser's correction for repeated measures was employed.

# 7.3. DFA and mDFA

The percentages of NREM for N1, N2 and N3 stages for the CTRL group were 10, 51.4, and 38.5%, respectively; likewise, for the MCI group, 1.6, 70, and 28.3%, respectively. None of the frequencies of the ten NREM epochs subjected to the quantitative analyses for the DFA and mDFA showed significant differences between groups (chi = 1.56, df = 1, p < 0.21 for N1; chi = 0.46, df = 1, p < 0.49 for N2; chi = 2.27, df = 1, p < 0.13 for N3).

Hurst mean values in NREM and age were negatively correlated at the LOG and ROG derivations [r(11) = −0.78, p < 0.001; r(11) = −0.70, p < 0.001, for LOG and ROG, respectively], as can be seen in **Figure 3**.

On the other hand, education was positively correlated with Hurst mean values in NREM sleep at the same derivations, LOG and ROG [r(11) = 0.68, p < 0.01; r(11) = 0.61, p < 0.02, respectively]. In addition, a significant positive correlation between the Neuropsi scores and the Hurst mean exponents of NREM sleep was found at the left posterior network [r(11) = 0.55, p < 0.04].

For REM sleep, Hurst mean values and education were negatively correlated at the EMG [r(11) = −0.67, p < 0.02].

ANOVA results considering only the mean of the Hurst values for the ten NREM and the mean of the ten REM epochs were not significant. Nevertheless, the comparison considering the ten Hurst values of the NREM and REM epochs rendered one difference in stages at O1. Both Hurst values decreased from NREM to REM. Interactions in the ANOVA tests for the monopolar channels were found at LOG and ROG channels and frontotemporal derivations (FP2, F8, F7, F4, T4, and T3) and in all cases, the means were higher in REM sleep stages for the MCI group (**Table 5** and **Figure 4**).

Another interaction at the interhemispheric relationship was significant at the F7-F8 pair of derivations, following the same tendency of being higher for the MCI group (**Table 6** and **Figure 5**).

Likewise, the scaling values for the right frontoparietal network showed an interaction and increased for the MCI group and decreased for the CTRL group (**Table 6** and **Figure 6**).

#### 7.4. Stationarity

Each of the NREM and REM epochs was classified as stationary or non-stationary using the methodology described above. A 30-second epoch was considered a better estimate of the degree of stationarity than a 10-second epoch, as can be seen in **Figure 7**.

Both types of classified epochs were represented visually for the whole night register in **Figure 8**.

The number of stationary epochs did not show significant correlations with age, MMSE scores, or Neuropsi scores. Also, no significant differences between groups or sleep stages were found considering the ten values from each stage; this is believed to be due to a small-sample effect. The methodology was repeated for the whole-night PSG recording and similar analyses were performed. There was great variability, as shown in Table S1 of the Supplementary Material. Nevertheless, Wilcoxon tests were significant for each group in frontal (FP2, FP1, and F7) and LOG and ROG channels. For the CTRL group, F3 was also significant, as can be seen in **Figure 9**.

These results suggest that (1) the differences between REM and NREM can be effectively traced using simple techniques such as weak stationarity detection (**Figure 8**), and (2) there may be differences between groups that are too small to be detected statistically by this method in the present sample.

# 8. DISCUSSION

REM sleep enhances memory and attention processes by cholinergic inputs (Braun et al., 1997) via pontine (Datta et al., 2004) and basal forebrain structures (Blake and Boccia, 2017). During normal aging and especially during pathological aging, attention and memory processes become more vulnerable, and cholinergic neurons are mostly affected (Schliebs and Arendt, 2011). Aging affects various anatomic structures resulting in a loss of dendritic arbor in cortical neurons that show degradation in their structural fractal complexity (Lipsitz and Goldberger, 1992).

Non-linear dynamics and complex systems appear to be well suited to explain these phenomena (Babloyantz and Destexhe, 1986). There are a number of reports specifically addressing EEG signals during sleep, from the theoretical quantification of the effects of non-linearity (Aeschbach and Borbély, 1993; Fell et al., 1993) to the characterization of some pathologies (Röschke et al., 1995). Recent advances in non-linear dynamics have pointed toward the relevance of using the framework of power laws and associated tools to extract information from EEGs (Lee et al., 2004). Nevertheless, the subjects under investigation have been few; only healthy subjects or subjects with sleep disorders such as apnea, insomnia or narcolepsy have been studied, and the non-linear methods have been heterogeneous.

TABLE 6 | Scaling results for each group in the transition from NREM to REM at anteroposterior and posterior networks.

CTRL, Control group; MCI, Mild Cognitive Impairment group; REM, Rapid Eye Movement sleep; NREM, Non-REM sleep; LOG, Left Oculogram; ROG, Right Oculogram; EMG, Electromyogram. Significant results for ANOVA tests are indicated in bold. In each case, the Greenhouse-Geisser's correction for repeated measures was employed.

Recent work showed a generalization of the DFA, named multivariate DFA (MVDFA) (Xiong and Shang, 2017). This analysis is very similar to the one developed by our group and referred to in this paper as mDFA (Rodriguez et al., 2011). Xiong and Shang (2017) showed the validity of the proposed MVDFA illustrated by numerical simulation on synthetic multivariate processes as well as on stock indices in Chinese and U.S. stock markets. Furthermore, these authors calculated the DFA of a single time series and showed that MVDFA is related to the average DFA of each time series. In this research we used mDFA to show interhemispheric, anteroposterior and posterior network behavior between EEG recordings and DFA of a single channel to validate differences between the CTRL and the MCI groups.

Lee et al. (2004) compared the Hurst exponents of recordings of normal sleep stages from six healthy subjects with the Hurst exponents of six apnea recordings from the MIT/BIH polysomnography database. The scaling exponents of apnea were found to be lower than those of healthy subjects.

Acharya et al. (2005) computed several parameters, including the Hurst exponent, but not through DFA. Their analysis was based on eight EEG recordings from the sleep-EDF database of the PhysioBank data resource.

Weiss et al. (2009) based their study on data from ten subjects, a number similar to that of the present research. Weiss et al. (2011) confirmed their previous results concerning Hurst values, this time with twenty-two participants. They added correlation analyses between Hurst exponents and the range of fractal spectra to strengthen their previous results, using fractal analysis to emphasize known phenomena of human sleep; in this case, they reported that fractal range was a better measure for classifying sleep stages.

Kumar et al. (2012) proposed a pharmacological and neurophysiological model of how the transition from wakefulness to NREM and REM sleep occurs. The transition has been explored in younger adults or in people with sleep disorders by means of mathematical analyses of one of the markers of REM sleep. In Kishi et al. (2011), a control night was compared with a night after a dose of risperidone, but only latencies and percentages of EEG stages were obtained. In another work in healthy young subjects, an abrupt change in the spectral analysis of the EMG was observed for two broad bands: one from 24 to 28 Hz and the other from 28 to 32 Hz (Rosales-Lagarde et al., 2009). Also, Bliwise et al. (1974) searched for EMG changes in 5 young female adults and found that the lowest tonic EMG levels occurred just before REM sleep, increasing for subsequent periods of NREM sleep and decreasing again before the subsequent REM period. Hadjiyannakis et al. (1997) followed the three REM sleep markers, performed spectral analysis of the EEG in a larger window than the above-mentioned study of Rosales-Lagarde et al. (2009), and concluded that abrupt modifications in the power density of the frequencies were observed neither in normal controls nor in narcoleptics.

Nevertheless, the NREM to REM transition has not been explored by non-linear analysis with the exception of Bizzotto et al. (2010) who used Markov-chain models in insomnia patients, but to our knowledge no paper has presented a non-linear analysis in the NREM to REM transition in MCI subjects.

In our research, the three indicators of sleep were tested. There were selective differences in the transition from NREM to REM sleep for each group.

On the EEG, while the CTRL group, with preserved memory, attention and executive functions, diminished its signal structure in frontotemporal areas from NREM to REM sleep, the MCI group had higher values suggestive of a tendency toward Brownian noise, strongly dominated by low frequencies, a result that remains to be confirmed using spectral analysis. In this respect, the CTRL results agree with the findings of Weiss et al. (2009), because lower Hurst values were found in frontal regions in REM sleep. This is in accordance with several studies that report an anteroposterior gradient using metabolic techniques in resting states, indicating a greater hypofrontality relative to age (Moeller et al., 1996), but also during REM sleep. Only the left occipital area distinguished between stages. Likewise, Weiss et al. (2009) found differences in the Hurst values from NREM4 to REM at that site. The mDFA rendered a frontopolar relationship: while the CTRL group diminished its scaling exponents, the MCI group increased them.

Inter-individual connectivity has been shown to distinguish heteromodal cortices from primary areas (Mueller et al., 2013), and in this work it has been associated with group differences in cognitive functioning in the NREM to REM transition.

A dissimilar pattern among interhemispheric coupling, anteroposterior and posterior areas (Pérez-Garci et al., 2001; Corsi-Cabrera et al., 2003) was not found. Instead, the individual scaling exponents and the networks followed a greater Hurst value for the MCI group.

stationary (blue) and adopts another non-stationary pattern for REM sleep (green). LOG, Left Oculogram; ROG, Right Oculogram; EMG, Electromyogram.

From NREM to REM, at both right and left EOG, the MCI group increased its signal structure toward Brownian noise. The correlations with Hurst values of the EOGs in NREM sleep rendered a negative association with age and a positive one with education. Also, in NREM sleep, the left posterior network was positively correlated with Neuropsi scores, suggesting that this network is sensitive to cognitive impairment. Muscle activity at REM sleep was negatively correlated with education. Van der Hiele et al. (2011) found more theta activity, less alpha reactivity, and more frontal EMG in Alzheimer's patients than in controls. Increased EMG activity indicated more cognitive impairment and more depressive complaints. Also, Chen et al. (2011) used root mean square and frequency peak analysis and found greater values for MCI patients. In the present study, contrary to the EEG and EOG results, EMG values tended to have scaling exponents of white noise, or, as stated above, anticorrelations: large values are expected to be followed by small values and vice versa.

Regarding the great number of participants with suspected RLS, the latter could not be diagnosed, because all MCI subjects stated their sleep was good and were practically without sleep complaints. Periodic RLS episodes are followed by increases in power, heart rate and arousals (Sieminski et al., 2017). Ferri et al. (2015) concluded that RLS is connected, but not in a simple causal relation, to arousals. Frauscher et al. (2014) reported a high rate of motor events even in normal subjects; these authors found PLM especially in the N1 stage, with a median of 5 per hour. The great number of arousals can help to explain group differences, because subjective complaints, both of memory and sleep, were minimal. Changes in beta-amyloid occurring in sleep disorders point toward the disruption of NREM sleep and, in particular, of SWS (Brown et al., 2016; Cellini, 2017). Potential underestimation of memory and sleep problems may affect these MCI subjects more than CTRLs. Gender effects could have influenced these results (Nikulin and Brismar, 2005), because almost all of the CTRL group were women and most of the MCI subjects were men. In future work this factor must be controlled for.

The arbitrary exception for the degree of detection of stationarity when using the whole-night register, instead of a few NREM and REM epochs, was motivated mainly to counter the small-sample effect, but it was also preferred for being one of the fastest algorithms to compute (Nason, 2013). Previous results showed that this technique can be used for detecting sleep stages for OAs (Rosales-Lagarde et al., 2017), so these new results indicate that such detection is still valid for some cognitively impaired OAs.

Finally, it must be highlighted that larger samples are needed to confirm the present findings.

#### 9. CONCLUSION

At the NREM to REM transition, executive functioning in MCI subjects was associated with Brownian noise in frontotemporal and LOG and ROG scaling exponents. On the EEG, for both DFA and mDFA, MCI OAs performing poorly on memory, attention and executive functions increased their Hurst values toward Brownian noise from NREM to REM stages, whereas the CTRL group moved in the opposite direction. On the EOG, both groups increased their Hurst values, and again the MCI group came nearer to its fractal breakdown. Muscle scaling values were lower than cerebral and eye movement Hurst values. Stationarity differences were found over the whole register for the distinction of stages within the groups. Given the small sample sizes, any conclusion should be considered preliminary, and larger studies are needed to confirm these data.
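For readers unfamiliar with the scaling exponents summarized above, the following is a minimal sketch of first-order detrended fluctuation analysis (DFA). It is not the authors' pipeline: the function name `dfa_exponent`, the window sizes, and the synthetic signals are assumptions made for illustration, and the mDFA variant is not covered. In this formulation, uncorrelated white noise is expected to give an exponent near 0.5 and Brownian noise (integrated white noise) an exponent near 1.5.

```python
import numpy as np

def dfa_exponent(signal, scales):
    """Estimate the DFA scaling exponent using linear detrending per window."""
    x = np.asarray(signal, dtype=float)
    profile = np.cumsum(x - x.mean())            # integrated, mean-centred series
    fluctuations = []
    for n in scales:
        n_windows = len(profile) // n
        # Non-overlapping windows of length n, one window per column
        windows = profile[: n_windows * n].reshape(n_windows, n).T
        t = np.arange(n)
        slope, intercept = np.polyfit(t, windows, 1)   # one linear fit per window
        trend = np.outer(t, slope) + intercept
        fluctuations.append(np.sqrt(np.mean((windows - trend) ** 2)))
    # The slope of log F(n) against log n is the scaling exponent
    exponent, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return exponent

# Synthetic demonstration signals (not study data)
rng = np.random.default_rng(0)
white = rng.standard_normal(4096)
brown = np.cumsum(white)
scales = [16, 32, 64, 128, 256]
print(dfa_exponent(white, scales), dfa_exponent(brown, scales))
```

A shift of the exponent toward the Brownian value between two sleep stages would then correspond to the kind of change described for the MCI group, subject to the caveat that the settings used in the study are not specified in this section.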

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of La Coordinación de Investigación of the Instituto de Ciencias de la Salud at the Universidad Autónoma del Estado de Hidalgo. The protocol was approved by El Comité de Ética de Investigación. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

#### AUTHOR CONTRIBUTIONS

AR-L, GV-T, and CM-A contributed the neuropsychological testing and selection of the OAs. AR-L and GV-T conducted the PSGs. AR-L and LC-R scored the PSGs. AR-L wrote most of the paper related to the clinical, neuropsychological and psychophysiological sections. ER-T organized VG-M and JE-A to do the Hurst and stationarity analysis, respectively. AR-L, JE-A, and ER-T did the statistical analysis, the tables and figures. ER-T, BI-O and PM wrote the mathematical section approaches of the analysis and helped with the format and the discussion. JL-N and JP-S provided administrative and technical support to carry out the sleep measurements at the UAEH and helped with the references.

#### FUNDING

GV-T received a fellowship from the Consejo Nacional de Ciencia y Tecnología (CONACyT) number (416939). This research was accepted by the authorities of the Instituto de Ciencias de la Salud (ICSa) of the Universidad Autónoma del Estado de Hidalgo (UAEH) and belongs to the project titled "Design of tests to pre-diagnose and diagnose Elderly Adults in Hidalgo at the bio-psycho-social areas." The CONACyT, through its Cátedras CONACyT program, financed AR-L and CM-A.

#### ACKNOWLEDGMENTS

This work forms part of a master's degree thesis submitted to the Master in Biomedical Sciences and Health Program at the Instituto de Ciencias de la Salud of the UAEH (GV-T). It is also associated with a Bachelor's program in Applied Mathematics at the Instituto de Ciencias Básicas e Ingeniería of the UAEH (VG-M and JE-A). We are grateful to Patricia Padilla-Muñoz, coordinator of the Centro Gerontológico Integral at Punta Azul, Pachuca de Soto, for her support. We also thank Geovanni de la Cruz, Minerva Granillo and Mauricio Islas for their technical aid in recording the polysomnographies, and Patricia Pliego-Pastrana for her help in gathering information on OAs. Christopher Follett reviewed the English language version.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01205/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Rosales-Lagarde, Rodriguez-Torres, Itzá-Ortiz, Miramontes, Vázquez-Tagle, Enciso-Alva, García-Muñoz, Cubero-Rego, Pineda-Sánchez, Martínez-Alcalá and Lopez-Noguerola. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Effects of Mild Cognitive Impairment on the Event-Related Brain Potential Components Elicited in Executive Control Tasks

Montserrat Zurrón<sup>1</sup>\*, Mónica Lindín<sup>1</sup>, Jesús Cespón<sup>2</sup>, Susana Cid-Fernández<sup>1,3</sup>, Santiago Galdo-Álvarez<sup>1</sup>, Marta Ramos-Goicoa<sup>1</sup> and Fernando Díaz<sup>1</sup>

<sup>1</sup> Laboratorio de Neurociencia Cognitiva Aplicada, Departamento de Psicoloxía Clínica e Psicobioloxía, Facultade de Psicoloxía, Universidade de Santiago de Compostela, Santiago de Compostela, Spain, <sup>2</sup> Basque Center on Cognition, Brain and Language, San Sebastián, Spain, <sup>3</sup> Sezione di Neuroscienze Cognitive – Laboratorio di Neurofisiologia, IRCCS Centro San Giovanni di Dio, Fatebenefratelli, Brescia, Italy

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Rolf Verleger, Universität zu Lübeck, Germany Iris Wiegand, Max-Planck-Institut für Bildungsforschung, Germany

> \*Correspondence: Montserrat Zurrón montserrat.zurron@usc.es

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 09 March 2018 Accepted: 11 May 2018 Published: 29 May 2018

#### Citation:

Zurrón M, Lindín M, Cespón J, Cid-Fernández S, Galdo-Álvarez S, Ramos-Goicoa M and Díaz F (2018) Effects of Mild Cognitive Impairment on the Event-Related Brain Potential Components Elicited in Executive Control Tasks. Front. Psychol. 9:842. doi: 10.3389/fpsyg.2018.00842

We summarize here the findings of several studies in which we analyzed the event-related brain potentials (ERPs) elicited in participants with mild cognitive impairment (MCI) and in healthy controls during performance of executive tasks. The objective of these studies was to investigate the neural functioning associated with executive processes in MCI. With this aim, we recorded the brain electrical activity generated in response to stimuli in three executive control tasks (Stroop, Simon, and Go/NoGo) adapted for use with the ERP technique. We found that the latencies of the ERP components associated with the evaluation and categorization of the stimuli were longer in participants with amnestic MCI than in the paired controls, particularly those with multiple-domain amnestic MCI, and that the allocation of neural resources for attending to the stimuli was weaker in participants with amnestic MCI. The MCI participants also showed deficient functioning of the response selection and preparation processes demanded by each task.

Keywords: mild cognitive impairment, event-related brain potentials, Stroop task, Simon task, Go/NoGo task

# INTRODUCTION

Here we report the findings of several studies carried out by our research group in which we recorded event-related brain potentials (ERPs) elicited in participants with mild cognitive impairment (MCI) during performance of executive tasks. The aim of this research was to search for neurofunctional indexes of executive processes that can be used in the diagnosis of MCI. This might contribute to the early detection of Alzheimer's disease (AD), which is of enormous social and public-health importance, and to halting progression of the disease by the timely application of appropriate treatments.

In section "Executive Functions, Tasks, Behavior and Brain Electrical Activity," we describe the three executive tasks, i.e., the Stroop, Simon, and Go/NoGo tasks, which we have adapted to elicit ERPs. We also describe the characteristic ERP components elicited in these tasks and the cognitive processes associated with each. In section "Aging, Mild Cognitive Impairment (MCI), Performance of Executive Tasks and ERPs," we define the different subtypes of MCI and summarize the results obtained in our studies on the effect of MCI on the ERP components in the three tasks.

# EXECUTIVE FUNCTIONS, TASKS, BEHAVIOR AND BRAIN ELECTRICAL ACTIVITY

The term executive function encompasses those brain functions that enable us to set goals and control the skills and behaviors required to achieve these goals. These functions comprise a series of cognitive processes, including attention, cognitive control, working memory, cognitive inhibition, and flexibility, which we use in our activities of daily living to monitor behaviors and implement goal-directed actions (Chan et al., 2008; Diamond, 2013).

The study of brain functions that support executive processes has traditionally been carried out by analyzing the behavioral responses to classic executive tasks in patients with some type of brain lesion or alteration. More recently, functional neuroimaging techniques have been used to record the haemodynamic activity in such patients and healthy controls while they perform executive tasks (see Gilbert and Burgess, 2008; Verdejo-García and Bechara, 2010; Lezak et al., 2012). These studies have highlighted the role of prefrontal areas in executive functioning, but do not provide any information about the temporal course of brain activity associated with executive processes. The ERP technique is a valuable tool in this respect, as it enables direct measurement of brain functioning, with a high temporal resolution (milliseconds), is non-invasive (ERPs are recorded with electrodes attached to the intact scalp), simple to use and relatively inexpensive. ERPs are positive and negative deflections of voltage that are considered components of brain activity and provide information about the temporal dynamics of the different stages of stimulus processing (perception, evaluation, categorization) and response processing (selection, preparation, and control) (Luck, 2005).

Event-related brain potential components can be characterized according to their amplitude (in microvolts, µV), latency (in milliseconds, ms) and topographical distribution. Various executive tasks have been adapted for use during the recording of brain electrical activity, in order to identify the ERP components associated with executive functioning. We have adapted three such tasks for this purpose: the Stroop, Simon, and Go/NoGo tasks.
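As an illustration of how amplitude and latency can be read from an averaged waveform, the sketch below picks the most extreme sample within a search window. It is a simplified example and not the measurement procedure used in the studies reviewed here: the function name `peak_in_window`, the search windows, and the synthetic waveform are all assumptions.

```python
import numpy as np

def peak_in_window(times_ms, waveform_uv, start_ms, end_ms, polarity):
    """Return (amplitude in µV, latency in ms) of the extreme sample in a window.

    polarity > 0 searches for the most positive sample (e.g., P3b);
    polarity < 0 searches for the most negative sample (e.g., N2).
    """
    times_ms = np.asarray(times_ms, dtype=float)
    waveform_uv = np.asarray(waveform_uv, dtype=float)
    mask = (times_ms >= start_ms) & (times_ms <= end_ms)
    segment = waveform_uv[mask]
    idx = np.argmax(segment) if polarity > 0 else np.argmin(segment)
    return segment[idx], times_ms[mask][idx]

# Synthetic averaged waveform (not study data): a negative deflection
# centred at 250 ms and a positive deflection centred at 400 ms.
times = np.arange(-100, 801, 2)
erp = (-3.0 * np.exp(-((times - 250) / 30.0) ** 2)
       + 6.0 * np.exp(-((times - 400) / 60.0) ** 2))

n2_amp, n2_lat = peak_in_window(times, erp, 200, 300, polarity=-1)
p3_amp, p3_lat = peak_in_window(times, erp, 300, 500, polarity=+1)
print(n2_amp, n2_lat, p3_amp, p3_lat)
```

Single-sample peak picking is sensitive to noise; mean amplitude over a window is a common alternative, and the choice between them is a design decision outside the scope of this sketch.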

All three tasks allow the study of attentional control, as subjects must direct their attention toward the characteristics of the stimulus that are relevant to the task demands and inhibit responses that are incompatible with the demands. Thus, the N2 and P300 components have been identified in relation to stimulus processing. These components are associated with the processes of directing and controlling attention to relevant stimuli: N2/N2b is associated with evaluation (Ritter et al., 1979; Folstein and Van Petten, 2008) and P300/P3b with categorization (Donchin, 1981; Polich, 2007) of the stimuli, although these components have specific characteristics in each task. Regarding response processing, the lateralized readiness potential (LRP) is obtained in all three tasks and is generated within the motor cortex. The LRP provides information about the timing of response selection (stimulus-locked LRP; sLRP) and of planning and executing the response (response-locked LRP; rLRP) (Smulders and Miller, 2012).
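The LRP is commonly derived by a double subtraction of the signals at electrodes C3 and C4, which cancels activity that is not related to the responding hand. The sketch below shows that general technique; it is not taken from the studies reviewed here, and the function name, argument names, and synthetic values are assumptions.

```python
import numpy as np

def lateralized_readiness_potential(c3_left, c4_left, c3_right, c4_right):
    """Double-subtraction LRP from averaged C3/C4 waveforms.

    *_left  : averages over trials answered with the left hand
    *_right : averages over trials answered with the right hand
    The result is contralateral minus ipsilateral activity, averaged over hands.
    """
    c3_left, c4_left = np.asarray(c3_left, float), np.asarray(c4_left, float)
    c3_right, c4_right = np.asarray(c3_right, float), np.asarray(c4_right, float)
    return 0.5 * ((c4_left - c3_left) + (c3_right - c4_right))

# Synthetic example (not study data): 5 µV of non-lateralized activity,
# a constant +1 µV offset at C3, and -2 µV over the hemisphere
# contralateral to the responding hand.
common = np.full(4, 5.0)
lrp = lateralized_readiness_potential(
    c3_left=common + 1.0,        # ipsilateral to the left hand
    c4_left=common - 2.0,        # contralateral to the left hand
    c3_right=common + 1.0 - 2.0, # contralateral to the right hand
    c4_right=common,             # ipsilateral to the right hand
)
print(lrp)
```

Time-locking the input averages to stimulus onset or to the response would give the sLRP or the rLRP, respectively.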

# Stroop Task

All of the different versions of the Stroop task available compare performance (reaction time, RT, and hits) under two conditions (**Figure 1**): one in which there is conflict between two types of information, one relevant and the other irrelevant to the task (e.g., the word "red" displayed in blue, an incongruent stimulus), and another that does not produce conflict (e.g., the word "red" displayed in red, a congruent stimulus, or colored X-strings). Participants are usually asked to respond to the color of the stimulus and to ignore the word, thus generating the so-called Stroop effect, whereby the RT is longer in the conflict condition than in the no-conflict condition. Considering that words induce an automatic reading response, the Stroop effect is attributed to interference generated by the meaning of the incongruent word in the task of responding to the color of the stimulus (see Stroop, 1935; MacLeod, 1991).
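The size of the Stroop effect can be expressed as the difference in mean RT between the conflict and no-conflict conditions. The sketch below computes that difference from a list of trials; the trial fields, the condition labels, the restriction to correct trials, and the values are assumptions for illustration rather than details taken from the studies.

```python
from statistics import mean

def interference_effect(trials, conflict_label, baseline_label):
    """Mean RT on correct conflict trials minus mean RT on correct baseline trials."""
    def mean_rt(label):
        return mean(t["rt_ms"] for t in trials
                    if t["condition"] == label and t["correct"])
    return mean_rt(conflict_label) - mean_rt(baseline_label)

# Hypothetical trials (not study data)
trials = [
    {"condition": "incongruent", "rt_ms": 700, "correct": True},
    {"condition": "incongruent", "rt_ms": 740, "correct": True},
    {"condition": "incongruent", "rt_ms": 900, "correct": False},  # excluded
    {"condition": "congruent", "rt_ms": 600, "correct": True},
    {"condition": "congruent", "rt_ms": 620, "correct": True},
]

stroop_effect = interference_effect(trials, "incongruent", "congruent")
print(stroop_effect)  # 110 (ms) for these hypothetical trials
```

The same function applies to any interference measure defined as a difference between two conditions, provided the trials are labeled accordingly.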

The ERP waveforms recorded in young and old healthy participants while they are processing color-word stimuli basically include (**Figure 2**) the frontal-central N2 and the parietal P3b (Zurrón et al., 2009, 2013, 2014).

The frontal-central N2 component is characteristic of tasks in which a conflict must be resolved before the response is given (N2 is larger for higher conflict) and it has therefore been associated with cognitive control processes (see review by Folstein and Van Petten, 2008).

The parietal P3b component (or P300) is observed in attentional tasks when an informative task-relevant stimulus is detected, and the P3b amplitude is thus larger for target than non-target stimuli (Donchin, 1981; Polich, 2007).

The cognitive processes associated with N2 and P3b involve respectively evaluation and categorization of the stimulus. Indeed, N2 and P3b latencies are considered indicators of the time required to evaluate and categorize the stimulus for resolution of a task (Kutas et al., 1977; Hillyard and Kutas, 1983; Folstein and Van Petten, 2008). The P3b amplitude is smaller for incongruent stimuli than for congruent stimuli in young participants, and this difference has been considered an indicator of the semantic conflict that occurs in the former relative to the absence of such conflict in the latter (Zurrón et al., 2009).

# Simon Task

In the Simon task (**Figure 1**), participants are required to respond to a non-spatial feature (i.e., color, shape) of a lateralized stimulus by pressing one of two response buttons that are lateralized in the same spatial arrangement. Although the stimulus position is irrelevant to the task, the RT is longer when the response side is spatially incompatible with the stimulus position than in trials that require a response ipsilateral to the stimulus position. This spatial interference, known as the Simon effect, is evoked by visual (Craft and Simon, 1970), auditory (Simon and Small, 1969), and somatosensory (Hasbroucq and Guiard, 1992) stimulation, regardless of whether the participants respond with hand, foot or eye movements (Leuthold and Schröter, 2006).

This paradigm enables us to study the spatial attention directed to the target and the inhibition of the non-target stimulus, the cognitive control used to suppress the spatial tendency of the response, and the motor processes involved in response preparation. ERPs can be used to study these cognitive processes (**Figure 2**). The contralateral posterior negativity (N2pc), a correlate of visuospatial attention to the lateralized target stimulus and suppression of the non-target stimulus (Luck and Hillyard, 1994; Eimer, 1996), arises from extrastriate visual areas (Luck et al., 1997; Hopf et al., 2000). The contralateral central negativity (N2cc) is generated in the dorsal premotor cortex during activity involved in preventing spatial responses (Praamstra and Oostenveld, 2003; Praamstra, 2006; Cespón et al., 2012). N2pc and N2cc appear between 200 and 300 ms post-stimulus in young adults. The rLRP component is associated with response preparation.

# Go/NoGo Task

In the Go/NoGo task (**Figure 1**), participants are usually required to respond by pressing a button in response to frequent stimuli (Go stimuli), while withholding the response to different infrequent stimuli (NoGo stimuli). This task is considered an executive function task (Rubia et al., 2001), and it can be used to study several cognitive processes such as stimulus evaluation, response inhibition and response control (Botvinick et al., 2001; Lucci et al., 2013).
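Behavioral performance in this task is typically summarized by the proportion of Go stimuli that receive a response (hits) and the proportion of NoGo stimuli that are responded to in error. The sketch below computes both; the trial fields and values are hypothetical, and the second measure (often called the false-alarm or commission-error rate) is an addition for illustration that the text above does not discuss.

```python
def go_nogo_rates(trials):
    """Return (hit rate on Go trials, false-alarm rate on NoGo trials)."""
    go = [t for t in trials if t["stimulus"] == "go"]
    nogo = [t for t in trials if t["stimulus"] == "nogo"]
    hit_rate = sum(t["responded"] for t in go) / len(go)
    false_alarm_rate = sum(t["responded"] for t in nogo) / len(nogo)
    return hit_rate, false_alarm_rate

# Hypothetical session (not study data): frequent Go, infrequent NoGo stimuli
trials = (
    [{"stimulus": "go", "responded": True}] * 7
    + [{"stimulus": "go", "responded": False}]
    + [{"stimulus": "nogo", "responded": False}]
    + [{"stimulus": "nogo", "responded": True}]
)

hit_rate, false_alarm_rate = go_nogo_rates(trials)
print(hit_rate, false_alarm_rate)  # 0.875 and 0.5 for these hypothetical trials
```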

The ERP correlates of the Go stimuli evaluation processes are N2b and P3b, although they are often called Go-N2 and Go-P3 when obtained with this task (**Figure 2**). In young adults, N2b (latency: 200–300 ms post-stimulus) shows maximum amplitude at central electrodes and P3b (300–500 ms post-stimulus) at parietal electrodes. In addition, the sLRP and rLRP components have been studied in relation to response processing in the Go condition.

With NoGo stimuli (**Figure 2**), the NoGo-N2 and NoGo-P3 ERP components are identified at frontocentral locations around 200–400 and 300–500 ms post-stimulus, respectively (Pfefferbaum and Ford, 1988; Jodo and Kayama, 1992; Falkenstein et al., 2002; Vallesi et al., 2009). Both components have traditionally been considered correlates of response inhibition (e.g., Jackson et al., 1999; Bokura et al., 2001; Nakata et al., 2009).

# AGING, MILD COGNITIVE IMPAIRMENT (MCI), PERFORMANCE OF EXECUTIVE TASKS AND ERPs

Executive functioning is known to decline in healthy cognitive aging (Park and Schwarz, 2002; Raz et al., 2005; Reuter-Lorenz and Park, 2010; Grady, 2012; Harada et al., 2013; Kirova et al., 2015). In Stroop tasks, the RT is slower and the Stroop effect is larger in older than in young people (see MacLeod, 1991; Van der Elst et al., 2006; Peña-Casanova et al., 2009); the Simon effect increases in normal aging (Proctor et al., 2005; Juncos-Rabadán et al., 2008), and an age-related decline in behavioral responses has been observed in Go/NoGo tasks (Falkenstein et al., 2002; Lucci et al., 2013; Correa-Jaraba et al., 2016; Hsieh et al., 2016).

The ERPs elicited in the three tasks are sensitive to cognitive decline in healthy aging, and, relative to younger participants, healthy older participants display longer N2 and P3b latencies and smaller parietal P3b amplitude in Stroop (Zurrón et al., 2014) and Go/NoGo tasks (Falkenstein et al., 2002; Vallesi, 2011), as found in oddball tasks (Polich, 1997); longer N2pc in the Simon task (Van der Lubbe and Verleger, 2002; Cespón et al., 2013a), as also observed in visual search or selection tasks (Lorenzo-López et al., 2008, 2011; Amenedo et al., 2012; Wiegand et al., 2013, 2015); and delayed rLRP onset and larger sLRP amplitude in the Simon task (Cespón et al., 2013a), in consonance with previous findings (Yordanova et al., 2004; Kolev et al., 2006; Roggeveen et al., 2007; Wild-Wall et al., 2008; Wiegand et al., 2013). In addition, the P3b distribution is frontal in older and parietal in younger participants in the Stroop task (Zurrón et al., 2014), and the central NoGo-P3 amplitude is larger in older than in younger participants (Vallesi, 2011).

Some middle-aged and old adults show greater cognitive decline than expected according to their age and educational level, although their independence in daily life activities is preserved and they do not meet the criteria for diagnosis of dementia. This intermediate stage between normal cognitive aging and dementia is termed MCI (Petersen et al., 1999; Knopman and Petersen, 2014). Interest in MCI has increased in recent years because the condition is associated with a higher risk of progressing to dementia (Winblad et al., 2004). MCI is frequently (although not always) associated with a decline in memory functioning, which may also be accompanied by deterioration in other cognitive functions, including executive functions. Four subtypes of MCI are currently distinguished (Petersen, 2016): single-domain amnestic MCI (sdaMCI, characterized by memory impairment only), multiple-domain amnestic MCI (mdaMCI, characterized by impairment in memory and in other cognitive domains), single-domain non-amnestic MCI (sdnaMCI, preserved memory but an overt decline in another cognitive domain), and multiple-domain non-amnestic MCI (mdnaMCI, preserved memory but with evidence of decline in several cognitive domains). Individuals with amnestic MCI (aMCI), especially those with mdaMCI, display a greater risk of developing AD than the healthy old population (Petersen et al., 1999, 2009).

In our ERP studies involving administration of Stroop, Simon and Go/NoGo tasks to participants older than 50 years (aMCI participants and healthy controls matched for age and years of schooling), we found that the effects of aMCI on the behavioral responses depended on the task used (Cespón et al., 2013b, 2015a; Cid-Fernández et al., 2014, 2017a,b; Ramos-Goicoa et al., 2016). However, aMCI affected the latencies and amplitudes of ERP components associated with executive functioning in all three tasks (**Figure 2**).

In the Stroop task, aMCI did not affect behavioral responses, probably because the task was simpler (only two types of response) than the standard versions of the Stroop test. In contrast, the latency of the P3b component generated by congruent and incongruent stimuli was longer in middle-aged aMCI participants than in the paired controls; however, no differences between groups were observed in response to colored X-strings, which were presented separately and did not mobilize executive processes. aMCI also affected the P3b amplitude: we observed a greater difference in P3b amplitudes between conditions (colored X-strings versus congruent or incongruent stimuli) in aMCI participants than in healthy controls. As only the color of the X-string stimuli must be evaluated, whereas both the color and the semantic meaning of the congruent and incongruent stimuli must be evaluated, this finding may be attributed to a greater effect of reading-related interference in aMCI participants than in the controls. Finally, the sLRP and rLRP amplitudes were smaller in aMCI participants than in healthy controls (Ramos-Goicoa et al., 2016).

In the Simon task, the error rate (but not the RT) was affected by mdaMCI, indicating greater interference due to the spatial position of the stimulus in participants with mdaMCI than in healthy controls. In relation to stimulus processing, we observed that the latencies of the N2 and N2cc components were longer in the mdaMCI participants than in the paired controls and attributed these findings to respectively a longer time devoted to stimulus evaluation processes and delayed allocation of neural activity associated with cognitive control of the spatial response (Cespón et al., 2013b, 2015a,b). The N2pc amplitude was smaller in mdaMCI participants than in the controls, and as N2pc is an ERP correlate of the direction of the spatial attention to lateralized stimuli (Luck and Hillyard, 1994; Woodman and Luck, 1999; Hickey et al., 2009), we conclude that the mdaMCI participants show a deficit in neural resources allocated to spatial attention.

In ERPs related to response processing in the Simon task (Cespón et al., 2013b, 2015b), the rLRP amplitude was smaller in sdaMCI and mdaMCI participants than in controls, and the amplitude of the frontocentral pre-response positivity component between 80 and 20 ms pre-response (Nessler et al., 2007) was smaller in mdaMCI and sdaMCI participants than in controls.

In the Go/NoGo task, MCI participants (mainly mdaMCI) showed slower responses and fewer hits in response to Go stimuli than the controls.

The latency of the Go-N2 component was longer in mdaMCI participants than in controls (Cid-Fernández et al., 2017a). This finding is consistent with those reported by Mudar et al. (2016) for a semantic-type Go/NoGo task, in which the latencies of the Go-N2 and NoGo-N2 components were longer in aMCI participants than in controls. We also observed smaller Go-N2 amplitude in sdaMCI participants than in the paired controls, as well as a smaller NoGo-N2 amplitude in sdaMCI and mdaMCI participants than in the controls. López Zunini et al. (2016) observed smaller Go-P3 and NoGo-P3 amplitudes in aMCI participants than in controls. Therefore, aMCI participants display a deficit in neural resources dedicated to the Go and NoGo stimuli. These deficits in neural resources may be unspecific, as they affect both stimuli, or they may be related respectively to evaluating and categorizing the target stimulus and to inhibiting predominant responses not relevant to the task demands.

A late positive slow wave (PSW) was identified in the 700–1000 ms interval after the Go stimulus in sdaMCI participants, and it was absent in healthy controls and subjects with mdaMCI (Cid-Fernández et al., 2017a). This ERP component overlaps with the Go-P3 and is thought to indicate additional operations after categorization of target stimuli. In the sdaMCI participants, this component may indicate compensatory operations in order to maintain acceptable levels of performance, as the behavioral response of this subgroup of participants was similar to that of the controls. Furthermore, this hypothesis was consistent with the longer sLRP latency in response to Go stimuli in sdaMCI participants than in the controls (Cid-Fernández et al., 2017b). In this study, we also observed that the amplitude of the sLRP component was smaller in mdaMCI participants than in healthy controls.

In summary, our findings on the components of the ERPs elicited in executive tasks indicate that, relative to age-matched controls, aMCI participants exhibit (a) longer latencies, indicating slower neural functioning in the stimulus and response processing, (b) changes in the amplitudes of the N2 and P3b components, reflecting fewer neural resources allocated for attending to the relevant dimension of the stimulus, and (c) smaller LRP amplitudes, indicating a lower capacity of the motor cortex to allocate neural resources for selection and preparation of the motor response. The decline in processing associated with aMCI is greater than in healthy aging and is more evident in mdaMCI than in sdaMCI participants.

In light of the above, we conclude that aMCI participants present neurofunctional alterations in executive functioning, independently of any behavioral responses, and we believe that the neurofunctional deficits appear prior to behavioral deficits in aMCI participants. Exploration of the sensitivity and specificity of the ERP components is therefore of interest regarding the potential use of these components as markers of MCI and possible predictors of AD.
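To make the notions of sensitivity and specificity concrete, the sketch below classifies participants as MCI when an ERP latency exceeds a cutoff and then scores that rule against the known group labels. The cutoff, the latencies, and the decision rule are entirely hypothetical; they are not results or recommendations from the studies reviewed here.

```python
def sensitivity_specificity(latencies_ms, is_mci, cutoff_ms):
    """Score the rule 'latency above cutoff means MCI' against known labels."""
    pairs = list(zip(latencies_ms, is_mci))
    true_pos = sum(1 for lat, mci in pairs if mci and lat > cutoff_ms)
    false_neg = sum(1 for lat, mci in pairs if mci and lat <= cutoff_ms)
    true_neg = sum(1 for lat, mci in pairs if not mci and lat <= cutoff_ms)
    false_pos = sum(1 for lat, mci in pairs if not mci and lat > cutoff_ms)
    sensitivity = true_pos / (true_pos + false_neg)   # MCI correctly flagged
    specificity = true_neg / (true_neg + false_pos)   # controls correctly cleared
    return sensitivity, specificity

# Hypothetical latencies (ms) and group labels, for illustration only
latencies = [390, 410, 460, 440, 470, 490]
labels = [False, False, False, True, True, True]

sens, spec = sensitivity_specificity(latencies, labels, cutoff_ms=450)
print(sens, spec)
```

Repeating the calculation over a range of cutoffs would trace out a receiver operating characteristic curve, which is one way such a marker could be evaluated.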

# AUTHOR CONTRIBUTIONS

MZ, ML, JC, SC-F, SG-Á, and FD wrote the manuscript. JC, MR-G, and SC-F recorded the brain electrical activity. MZ, ML, JC, SC-F, SG-Á, MR-G, and FD analyzed the ERP waveforms and identified the different components. MZ, JC, SC-F, SG-Á, and MR-G carried out statistical analysis of the data.

# FUNDING

This study was financially supported by funds from the Spanish Government: Ministerio de Economía y Competitividad (PSI2014-55316-C3-3-R); and by the Galician Government: Consellería de Cultura, Educación e Ordenación Universitaria; Axudas para a Consolidación e Estruturación de Unidades de Investigación Competitivas do Sistema Universitario de Galicia: GRC (GI-1807-USC); Ref: ED431-2017/27.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Zurrón, Lindín, Cespón, Cid-Fernández, Galdo-Álvarez, Ramos-Goicoa and Díaz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Working Memory Deficits After Lesions Involving the Supplementary Motor Area

Alba Cañas<sup>1</sup>, Montserrat Juncadella<sup>1</sup>, Ruth Lau<sup>2</sup>, Andreu Gabarrós<sup>2,3</sup> and Mireia Hernández<sup>3,4,5</sup>\*

<sup>1</sup> Department of Neurology, Hospital Universitari de Bellvitge, L'Hospitalet de Llobregat, Spain, <sup>2</sup> Department of Neurosurgery, Hospital Universitari de Bellvitge, L'Hospitalet de Llobregat, Spain, <sup>3</sup> Cognition and Brain Plasticity Group, Bellvitge Biomedical Research Institute (IDIBELL), L'Hospitalet de Llobregat, Spain, <sup>4</sup> Section of Cognitive Processes, Department of Cognition, Development and Educational Psychology, University of Barcelona, Barcelona, Spain, <sup>5</sup> Basque Center on Cognition, Brain and Language, Donostia, Spain

The Supplementary Motor Area (SMA)—located in the superior and medial aspects of the superior frontal gyrus—is a preferential site of certain brain tumors and arteriovenous malformations, which often provoke the so-called SMA syndrome. The bulk of the literature studying this syndrome has focused on two of its most apparent symptoms: contralateral motor and speech deficits. Surprisingly, little attention has been given to working memory (WM) even though neuroimaging studies have implicated the SMA in this cognitive process. Given its relevance for higher-order functions, our main goal was to examine whether WM is compromised in SMA lesions. We also asked whether WM deficits might be reducible to processing speed (PS) difficulties. Given the connectivity of the SMA with prefrontal regions related to executive control (EC), as a secondary goal we examined whether SMA lesions also hampered EC. To this end, we tested 12 patients with lesions involving the left (i.e., the dominant) SMA. We also tested 12 healthy controls matched with patients for socio-demographic variables. To ensure that the results of this study can be easily transferred and implemented in clinical practice, we used widely-known clinical neuropsychological tests: WM and PS were measured with their respective Wechsler Adult Intelligence Scale indexes, and EC was tested with phonemic and semantic verbal fluency tasks. Non-parametric statistical methods revealed that patients showed deficits in the executive component of WM: they were able to sustain information temporarily but not to mentally manipulate this information. Such WM deficits were not subject to patients' marginal PS impairment. Patients also showed reduced phonemic fluency, which disappeared after controlling for the influence of WM. This observation suggests that SMA damage does not seem to affect cognitive processes engaged by verbal fluency other than WM. In conclusion, WM impairment needs to be considered as part of the SMA syndrome. 
These findings represent the first evidence about the cognitive consequences (other than language) of damage to the SMA. Further research is needed to establish a more specific profile of WM impairment in SMA patients and determine the consequences of SMA damage for other cognitive functions.

Keywords: executive control, neuropsychology, neurosurgery, processing speed, SMA syndrome, supplementary motor area, verbal fluency, working memory

#### Edited by:

Gail Robinson, The University of Queensland, Australia

#### Reviewed by:

Nazbanou Nozari, Johns Hopkins University, United States Evy Visch-Brink, Erasmus University Rotterdam, Netherlands

> \*Correspondence: Mireia Hernández mireiahernandez@ub.edu

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 07 January 2018 Accepted: 30 April 2018 Published: 23 May 2018

#### Citation:

Cañas A, Juncadella M, Lau R, Gabarrós A and Hernández M (2018) Working Memory Deficits After Lesions Involving the Supplementary Motor Area. Front. Psychol. 9:765. doi: 10.3389/fpsyg.2018.00765

# INTRODUCTION

The supplementary motor area (SMA) is situated in the superior and medial aspects of the superior frontal gyrus (Penfield and Welch, 1951), in front of the primary motor cortex (M1) and bordering inferiorly with the portion of the cingulate gyrus just above the genu of the corpus callosum (Talairach and Bancaud, 1966) (**Figure 1**). At a functional level, the SMA is fundamental in the selection, preparation, initiation, and execution of complex sequences of voluntary movements (Weilke et al., 2001). Such an important motor role relies on white-matter connectivity between the SMA and different core motor structures of the nervous system such as M1 (Catani et al., 2012), the striate body (Alexander and Crutcher, 1990; Lehéricy et al., 2004), or the spinal cord (Rizzolatti et al., 1996). The dominant SMA also plays a critical role in the control of motor aspects of speech production (Penfield and Rasmussen, 1950; Chassagnon et al., 2008), thanks to its connectivity with Broca's area (Vergani et al., 2014; Sierpowska et al., 2015).

The SMA happens to be a preferential site of different neurological disorders and abnormalities, most frequently tumors (especially low-grade gliomas; Duffau and Capelle, 2004) and epileptic foci (Chassagnon et al., 2008), but cases with other etiologies such as arteriovenous malformations (AVMs; Sailor et al., 2003) and cerebrovascular accidents (CVA; Dick et al., 1986; Ziegler et al., 1997; Pai, 1999) have also been described. This particularity has driven a special interest in investigating the functional symptoms derived from damage to the SMA, the collection of which is known as SMA syndrome (Laplane et al., 1977). The most apparent and common symptoms of the SMA syndrome consist of movement disorders, which are subject to the SMA somatotopy: the representations of the face, the contralateral upper limb, and the contralateral lower limb are located from middle to posterior SMA portions (Fontaine et al., 2002). Extreme cases show contralateral hemiparesis and hemiplegia, but patients most frequently show difficulties with fine hand movements, rapid alternating sequences, and bimanual coordination (Laplane et al., 1977; Rostomily et al., 1991; Zentner et al., 1996; Pai, 1999; Bannur and Rajshekhar, 2000). As the anterior portion of the SMA in the dominant hemisphere represents language (Fontaine et al., 2002), patients may also show language disorders if the lesion is located in the dominant SMA. These language disorders consist of varying degrees of transcortical motor aphasia (Masdeu et al., 1978; Alexander and Schmitt, 1980; Berthier et al., 1991; Ziegler et al., 1997): difficulties in the initiation of speech, reduced fluency (even mutism in extreme cases), relatively preserved repetition and reading skills, and comprehension difficulties with complex language content and with speech delivered at a high rate.

Many SMA lesions have surgical treatment, which results in several patients recovering the altered functions. Even so, 19.46% of patients show permanent motor and speech sequelae after surgery, often interfering with daily life (Gabarrós et al., 2011). In consequence, the bulk of SMA research has mainly focused on identifying what variables influence the likelihood that SMA surgical patients show permanent motor and speech disorders (Laplane et al., 1977; Rostomily et al., 1991; Zentner et al., 1996; Ziegler et al., 1997; Bannur and Rajshekhar, 2000; Duffau et al., 2001; Fontaine et al., 2002; Nelson et al., 2002; Peraud et al., 2002; Krainik et al., 2003, 2004; Russell and Kelly, 2003; Sailor et al., 2003; Hashiguchi et al., 2004; Liu et al., 2004; Yamane et al., 2004; Ulu et al., 2008; Rosenberg et al., 2010; Gabarrós et al., 2011; Anbar, 2012; Kim et al., 2013; Ryu et al., 2013; Nakajima et al., 2014, 2015; Satoer et al., 2014; Abel et al., 2015; Acioly et al., 2015; Ibe et al., 2016; Satter et al., 2017; Vassal et al., 2017). One of the most important findings in this respect has been that the probability of permanent motor deficits increases with the severity of such deficits before surgery (Gabarrós et al., 2011). Another important contribution of those studies has been the observation that awake mapping surgery significantly reduces motor sequelae (Gabarrós et al., 2011): it allows the exact identification of the SMA by having the patient conduct a motor task (generally, either a finger-opposition motor task or a bimanual coordination one) while using electrical stimulation that, when applied to the SMA, disrupts task performance. This exact identification helps avoid unnecessary damage to SMA's healthy portions during lesion resection and, in turn, better preserve the motor function. 
The identification of language areas of the SMA during awake mapping surgery has also been very effective in preserving speech (Duffau et al., 2003; Sierpowska et al., 2015): electric stimulation in areas relevant for language produces speech impairment during word-generation tasks, which allows mapping of those SMA areas to protect their language function during lesion resection. These findings have represented precious information for the development of SMA protocols. The refinement of such protocols is still a priority in clinical research, as they are critical to achieving the highest effectiveness in guiding the treatment plans of SMA lesions.

In contrast to the great amount of attention devoted to motor and speech disorders, the assessment of cognitive functions (other than language) has been neglected in SMA patients. Especially surprising is the lack of attention that has been given to working memory (WM): a cognitive mechanism whose attentional component (att-WM) allows sustaining information temporarily so that its executive component (ex-WM) can manipulate such information and operate with it (Baddeley, 1992, 2003; Wingfield, 2016). In fact, many neuroimaging studies with both healthy and brain-damaged individuals have shown that the SMA is part of a widespread fronto-parietal network underlying WM [see the meta-analyses by Owen et al. (2005) and Rottschy et al. (2012)]. Most of those studies used classical WM tasks from cognitive research that especially challenge ex-WM, such as the n-back task: the participant needs to indicate whether the current stimulus matches the one from n steps earlier in a continuous sequence, with n increasing to make the task progressively more demanding. This is particularly relevant because ex-WM is essential for higher-order functions such as reasoning, problem solving, and learning (Engle et al., 1999; Shah and Miyake, 1999). This means that ex-WM deficits could compromise an individual's capacity to perform a wide range of complex cognitive tasks. Therefore, it is fundamental to determine whether SMA damage leads to WM impairment. A positive answer to this question would mean that WM deficits need to be considered as part of the SMA syndrome and, hence, taken into account in future refinements of clinical protocols.
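The matching rule of the n-back task can be made concrete with a minimal sketch. The letter stream and the function name below are hypothetical illustrations and are not taken from any of the cited studies.

```python
def n_back_targets(stimuli, n):
    """For each position, report whether the stimulus matches the one n steps earlier.

    The first n positions can never be targets because no earlier item exists.
    """
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


# Hypothetical letter stream, used only for illustration.
sequence = list("ABACABBA")
print(n_back_targets(sequence, 2))  # a 2-back target occurs at positions 2 and 4
print(n_back_targets(sequence, 1))  # a 1-back target occurs at position 6 only
```

Raising n lengthens the stretch of items that has to be held and continuously updated, which is why the task loads on the executive component of WM.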

Nakajima et al. (2014) conducted the only prior study investigating the effects of SMA damage on WM, describing two patients with brain tumors situated in the SMA. Using a 2-back task during awake mapping surgery, these authors obtained direct evidence that the SMA plays a role in WM, which is consistent with prior neuroimaging studies (Owen et al., 2005; Rottschy et al., 2012) that identified the SMA as one of the regions within the WM fronto-parietal network. However, precisely because WM relies on such a widespread network, it remains unclear whether permanent SMA damage would be relevant in patients' daily life. In other words, the rest of the network might compensate for the permanent SMA dysfunction. Whether or not SMA damage has a real hampering effect on patients' WM can only be tested with data obtained outside the operating room. In this respect, Nakajima et al.'s (2014) data are uninformative because they merely reported the WM scores obtained by the two cases in the absence of a control group.

The main goal of the current study was to determine whether lesions involving the dominant SMA hamper WM. We focused on the dominant side because the SMA shows a certain degree of hemispheric dominance and, thus, more severe deficits are expected from damage in the dominant SMA compared to the non-dominant one (Rogers et al., 2004). With this main purpose, we conducted a group study with SMA patients and healthy controls.

We approached this study from an applied perspective using tasks from clinical practice, as opposed to experimental tasks typically used in basic cognitive research. In particular, we measured WM with the WM Index of the WAIS-III (Wechsler, 1997), which combines the scores of 3 tasks: Digit span (Dspan), Letter-Number Sequencing (LNS), and Arithmetic. The Dspan tests the capacity of mentally maintaining a series of items (digits) in the same order they were memorized. Although the second part of this task also requires the mental manipulation of those items (i.e., retrieving them in a backward order), such manipulation is rather simple. Therefore, this task fundamentally measures att-WM. The LNS is much more demanding on ex-WM, as it requires a more complex mental manipulation of a series of items: numbers and letters are mixed and the participant is required to retrieve them in a specific order different from that in which they were memorized. Arithmetic is also particularly demanding on ex-WM, but the fact that the participant needs to mentally solve mathematical problems makes it much more complex than LNS.

We also included the Processing Speed (PS) Index of the WAIS-III in the testing to assess the possibility that WM deficits in SMA patients are reducible to PS difficulties (Fry and Hale, 1996): it has been observed that brain damage tends to reduce an individual's PS (see the WAIS-IV Technical and Interpretative Manual, Wechsler, 2008; see also Hawkins, 1998; Fisher et al., 2000). This is because WM holds information only briefly and, hence, the time to manipulate such information and perform mental operations with it is limited. In cases of remarkably slow PS, the first processed items may no longer be available when the last ones reach WM, making it impossible to perform any manipulation or operation with them.

As a secondary goal, we set out to examine whether cognitive deficits associated with SMA damage would go beyond WM and extend to another cognitive process that is also relevant for complex cognition: executive control (EC), understood as a set of cognitive processes that are needed for the flexible allocation of mental resources in the service of goal-directed behavior (Posner and Snyder, 1975; Miller, 2000; Badre, 2008; Kouneiher et al., 2009; Solomon et al., 2009). This additional question was based on the fact that the SMA is linked to the dorsolateral prefrontal cortex (DLPFC), a relevant brain structure for EC, through white matter connectivity (Nachev et al., 2008). We addressed this question by comparing SMA patients and controls in the performance of a task that has often been used as a broad, quick EC measure in clinical contexts: verbal fluency (Shao et al., 2014). In particular, we used a semantic (animals) and a phonemic (letter P) verbal fluency task: the participant is given a limited amount of time to retrieve as many animals, or as many words starting with the letter P, as she can. In addressing this question, however, it is necessary to take into consideration that WM and EC processes are tightly related. For example, some EC processes, such as the ability to suppress interference from distracting stimuli, depend on an individual's WM capacity (Redick and Engle, 2006). Such an association has also been observed with the verbal fluency task. Indeed, verbal fluency performance has been associated with WM capacity in different studies, with both healthy individuals (Rosen and Engle, 1997; Rende et al., 2002; Azuma, 2004; Hedden et al., 2005; Unsworth et al., 2010) and neuropsychological patients (Sands et al., 2000; Witt et al., 2004; Lam et al., 2006; Larsson et al., 2008; Zahodne et al., 2008). Therefore, we controlled for byproduct effects of WM when assessing the effects of SMA damage on verbal fluency.

# MATERIAL AND METHODS

# Participants

Twenty-four participants took part in this study, half of whom were patients with a lesion involving the dominant SMA and the other half were control participants. The patient and control groups were matched for gender (7 men and 5 women per group), age (patients: mean = 41.25 years, SD = 10.9; controls: mean = 41.33 years, SD = 10.54) and years of educational attainment (patients: mean = 11.5, SD = 2.7; controls: mean = 12.33, SD = 2.67). This matching was conducted by pairing each patient with a control on all three socio-demographic variables. All participants were right-handed. Patients were seen at the Neurosurgery Service of the Hospital Universitari de Bellvitge (HUB, Barcelona, Spain) between 2012 and 2015. The etiology of the lesion was a tumor between grade I and III in all cases except for a cerebral abscess and two AVMs. All lesions were located in the left cerebral hemisphere. All patients but one underwent lesion resection. As WM and EC were assessed with verbal tasks, patients diagnosed with language impairment by their clinical neuropsychologist were not included in the study. The absence of language impairment was determined by asking patients to describe the Cookie Theft Picture of the Boston Diagnostic Aphasia Examination (Goodglass and Kaplan, 1983), with which the neuropsychologist determined that the spontaneous speech of all patients was fluent, with information content, and free of anomia and phonological, semantic, morphological, or syntactic errors. **Table 1** summarizes patients' socio-demographic characteristics and their type of SMA lesion (see also **Figure 2** for an example of an SMA lesion before and after surgical treatment).

#### Procedure

In the case of patients, experimental testing was included in the same clinical neuropsychology sessions, which consisted of the assessment of language and fine hand movements. The testing session was conducted a few days before surgery. The experimental testing was repeated between 4 and 6 weeks after surgery. We addressed the main and the secondary questions in this study (the effects of SMA damage on WM and verbal fluency) using patients' pre-surgery data. We used patients' post-surgery data only to conduct complementary analyses examining post-surgery changes in WM, verbal fluency, and PS (see Data analyses and Results sections for further details).

Educational attainment = the maximum educational level that the patient completed in the Spanish educational system.

#### Tasks

#### WM Assessment—WM Index of the WAIS-III

The WM Index is derived from 3 tasks of the WAIS-III (Spanish Edition; Wechsler, 2001), which need to be administered in the same order as we describe them here: Digit span (Dspan), Letter-Number sequencing (LNS), and Arithmetic. The sum of the raw scores for each task gives the raw score for the WM Index.

#### **Dspan**

This task is composed of two parts: the Dspan forward and the Dspan backward. For the Dspan forward, the experimenter utters strings of digits at a rate of one number a second, approximately. The participant needs to repeat those digits in the same order the examiner uttered them. There are 8 blocks with 2 trials for each string length, which starts with 2 digits and increases by 1 in each successive block. The administration of the task is terminated if the participant fails both trials within the same block. The same procedure is used for the Dspan backward, with the exception that the participant needs to repeat the digits in reverse order and that only seven (instead of eight) blocks compose the task. The Dspan raw score is the sum of the total number of correct Dspan forward and Dspan backward trials.
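The scoring and discontinuation rule just described can be summarized in a short sketch. This is only an illustration under stated assumptions: the function name and the trial outcomes are hypothetical, and the official WAIS-III manual remains the reference for administration.

```python
def span_raw_score(blocks):
    """Count correct trials, stopping after the first block in which every trial fails.

    `blocks` is a list of tuples of booleans (one tuple per string length,
    one boolean per trial), given in administration order.
    """
    score = 0
    for trials in blocks:
        score += sum(trials)      # each correct trial adds one point
        if not any(trials):       # all trials of the block failed: discontinue
            break
    return score


# Hypothetical outcomes; blocks after a fully failed block are ignored.
forward = [(True, True), (True, True), (True, False), (False, False), (True, True)]
backward = [(True, True), (True, False), (False, False)]
dspan_raw = span_raw_score(forward) + span_raw_score(backward)
print(dspan_raw)  # 5 forward + 3 backward = 8
```

Because the stopping rule only checks whether every trial in a block failed, the same function would also accommodate tasks with three trials per block.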

#### **LNS**

The experimenter utters strings of items consisting of different numbers and letters presented in mixed order and at a rate of one item a second, approximately. The participant needs to repeat each string, uttering first the numbers in numerical order followed by the letters in alphabetical order. There are 8 blocks with 3 trials for each string length, which starts with 3 items and increases by 1 in each successive block. The administration of the task is terminated if the participant fails all three trials within the same block. The LNS raw score is the sum of correct trials.

FIGURE 2 | Example of an SMA lesion (in these images, left is right and right is left). (A) A FLAIR sequence MRI shows a tumoral lesion involving the left SMA in a 35-year-old patient. (B) A FLAIR sequence MRI 6 months after surgery shows a complete resection, achieved by performing awake brain mapping surgery. The pathology report was anaplastic oligodendroglioma.
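The reordering that the LNS task demands (numbers in numerical order, then letters in alphabetical order) can be illustrated with a minimal sketch. It assumes single digits and plain A to Z letters; the function names and the example string are hypothetical, and letters specific to the Spanish alphabet are not handled.

```python
def lns_expected_response(items):
    """Return the target response: digits in ascending order, then letters A to Z."""
    digits = sorted(ch for ch in items if ch.isdigit())
    letters = sorted(ch.upper() for ch in items if ch.isalpha())
    return digits + letters


def lns_trial_correct(items, response):
    """A trial counts as correct only if the whole reordered string is reproduced."""
    return [ch.upper() for ch in response] == lns_expected_response(items)


# Hypothetical 4-item trial, for illustration only.
print(lns_expected_response("T9A3"))      # ['3', '9', 'A', 'T']
print(lns_trial_correct("T9A3", "39AT"))  # True
print(lns_trial_correct("T9A3", "93AT"))  # False: digits not in numerical order
```

The contrast with Dspan is visible in the code: a forward span trial only requires returning the input unchanged, whereas here the held items must be sorted along two dimensions before responding.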

#### **Arithmetic**

The experimenter reads arithmetic problems out loud and asks the participant to give the correct answer to them. The participant is not allowed to take any written notes but the experimenter can repeat the entire problem once, if requested. Complexity in terms of the amount of information that needs to be held, the difficulty of the mathematical operations required (additions, percentages, etc.), and the time allotted for problem solving increases with every problem. There are 20 problems, and the administration of the task is terminated after the participant fails to solve 4 consecutive problems. The participant is given 1 point for each problem correctly solved, the total sum of which composes the Arithmetic raw score.

#### EF Assessment—Verbal Fluency Tasks

Verbal fluency tasks involve a limited amount of time (usually 1 min) for the participant to name as many words as possible according to a key, which is usually semantic or phonemic. We tested verbal fluency using both types of keys. On the semantic fluency, the participant had 1 min to generate as many words belonging to the semantic category "animals" as she could. On the phonemic fluency, the participant had the same amount of time to generate as many words as possible that began with the letter P, excluding proper names and repetitions of the same word with different endings. The raw score for each type of fluency (semantic, phonemic) is the total number of words generated within the allotted time limit.

#### PS Assessment—PS Index of the WAIS-III

The PS Index is derived from 2 tasks of the WAIS-III (Spanish Edition; Wechsler, 2001): Coding-Digit Symbol (CDsymbol), and Symbol Search (Ssearch). The sum of the raw scores for the 2 tasks gives the raw score for the PS Index.

#### **CDsymbol**

The participant is provided with a key matching each digit (from 1 to 7) with different meaningless symbols: e.g., digit 1 is associated with a horizontal double-headed arrow, digit 2 with a vertical double-headed arrow, digit 3 with 3 parallel horizontal lines, etc. Below this digit-symbol matching key—which remains always visible to the participant to avoid WM load effects—the participant is provided with a series of numbers, each placed above a blank box. She needs to draw the appropriate symbol in the box below each number according to the digit-association key. The participant is instructed to complete as many boxes as she can within the allotted time of 2 min. The CDsymbol raw score is the sum of the correctly completed boxes.

#### **Ssearch**

The participant is provided with 2 meaningless symbols in a left column. She needs to indicate if at least 1 of those 2 symbols is present in a group of 5 meaningless symbols in a right column. The participant gives the response by marking the "yes" or the "no" box. She is instructed to complete as many trials as possible in the allotted time of 2 min. The Ssearch raw score is the sum of correct answers minus the sum of errors.

# DATA ANALYSES

To reduce the skewness in the distribution of the data, we used transformed scores for data analyses. In the case of all WAIS-III tasks and verbal fluencies, we transformed raw scores into standard scores using Spanish normative data: WAIS-III Spanish edition (Wechsler, 2001) and NEURONORMA (Peña-Casanova et al., 2009), respectively. As for the WM and PS Indexes, we used intelligence quotient (IQ) scores: standard scores of the tasks composing each index were summed and transformed into IQs according to Spanish normative data (Wechsler, 2001). All data analyses were conducted with R (version 3.4.1; R Development Core Team, 2008).

Potential outliers were identified with box plots. In small samples it is difficult to determine whether an extreme observation is in fact outlying or merely reflects variability, especially in the case of patients. Therefore, we only considered as outlying the scores of control participants that fell below the cut-off for normality according to normative data. We based this action on the rationale that, by definition, controls cannot show impairment. This only occurred with two control participants: control c04 was not considered for any group analysis related to WM because he scored below the cut-off in most WM measures (**Supplementary Figures 1A–C**); control c06 was an outlier in phonemic fluency (**Supplementary Figure 1D**). In between-group analyses, we also excluded the data of the two patients matched socio-culturally with controls c04 and c06: p04's WM data, and p06's phonemic fluency data.
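The pairwise exclusion logic can be sketched as follows. This is a simplified, per-measure illustration: the cut-off value, the scores, and the function name are hypothetical placeholders, and the actual normative cut-offs are those of the cited normative data.

```python
def exclude_outlier_pairs(control_scores, patient_scores, cutoff):
    """Drop controls scoring below `cutoff` together with their matched patients.

    Both arguments are dicts keyed by participant code ('c01', 'p01', ...),
    where the numeric part of the code identifies the patient-control pair.
    """
    flagged = {code[1:] for code, score in control_scores.items() if score < cutoff}
    kept_controls = {c: s for c, s in control_scores.items() if c[1:] not in flagged}
    kept_patients = {p: s for p, s in patient_scores.items() if p[1:] not in flagged}
    return kept_controls, kept_patients


# Hypothetical standard scores and a hypothetical cut-off of 7.
controls = {"c01": 12, "c02": 11, "c04": 5}
patients = {"p01": 9, "p02": 8, "p04": 10}
kept_c, kept_p = exclude_outlier_pairs(controls, patients, cutoff=7)
print(sorted(kept_c), sorted(kept_p))  # pair 04 is removed from both groups
```

Removing the matched patient along with the outlying control keeps the two groups balanced on the socio-demographic variables used for pairing.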

We first performed non-parametric between-group analyses (using the Mann-Whitney U test) to examine whether the median (Mdn) of the patient group in the different tasks differed from that of the control group. Second, we assessed whether any potential impairment in WM observed in patients was actually driven by difficulties in PS. This assessment was done by partialling out the influence of PS on WM measures in regression analyses and using the residuals resulting from those regressions to compare patients and controls. A similar procedure was used to assess whether between-group differences in verbal fluency were merely a by-product of patients' WM impairment rather than due to differences in EC. Finally, we included complementary analyses to examine whether the surgical treatment of the lesion caused any changes in WM, verbal fluency, or PS. To this end, we performed non-parametric within-group analyses (using the Wilcoxon signed rank test) to compare patients' pre- and post-surgery medians in the different cognitive measures.
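The two-step residual procedure can be sketched in plain Python. This is an illustration under stated assumptions, not the analysis script of the study (which was run in R): all scores are hypothetical, the function names are invented for the example, and only the rank-sum statistic is computed, not its p-value. The statistic below counts the pairs in which a score from the first group exceeds a score from the second (ties count one half), which should correspond to the W returned by R's `wilcox.test` for two independent samples.

```python
from statistics import mean


def ols_residuals(x, y):
    """Residuals of the simple linear regression y ~ x fitted on the pooled sample."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def rank_sum_w(a, b):
    """Number of (a_i, b_j) pairs with a_i > b_j, counting ties as 0.5."""
    return sum(1.0 if ai > bj else 0.5 if ai == bj else 0.0 for ai in a for bj in b)


# Hypothetical standard scores (PS predictor, WM outcome) for each group.
ps_controls, wm_controls = [12, 11, 13, 10], [13, 12, 14, 12]
ps_patients, wm_patients = [8, 9, 7, 10], [9, 8, 9, 10]

# Step 1: partial out PS from WM in one regression that pools both groups.
resid = ols_residuals(ps_controls + ps_patients, wm_controls + wm_patients)
resid_controls = resid[:len(ps_controls)]
resid_patients = resid[len(ps_controls):]

# Step 2: compare the groups on what is left of WM once PS is removed.
print(rank_sum_w(resid_controls, resid_patients))
```

Fitting a single regression on the pooled sample means that both groups are adjusted with the same slope, so any remaining group difference in the residuals cannot be attributed to the linear effect of the predictor.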

# RESULTS

**Table 2** summarizes patients' and controls' scores in each WM, verbal fluency, and PS measure. It also summarizes the results of all between-group comparisons.

# Between-Group Differences in WM

The WM Index of patients (Mdn = 92) was significantly lower than that of controls (Mdn = 110; W = 92.5, p = 0.038), suggesting that SMA lesions led to WM impairment. To examine which WM task was more problematic for patients, we compared the performance of patients and controls in each WM task separately: patients clearly performed worse than controls on LNS (patients' Mdn = 9, controls' Mdn = 13, W = 105.5, p = 0.003); patients also showed poorer performance than controls in Arithmetic, although this between-group difference was only marginally significant (patients' Mdn = 8, controls' Mdn = 12, W = 89.5, p = 0.059); and there were no significant between-group differences in Dspan (patients' Mdn = 10, controls' Mdn = 12, W = 84.5, p = 0.118). These results suggest that, in general, patients performed worse than controls in WM tasks, with difficulties appearing in the ex-WM tasks (LNS and Arithmetic) rather than in the att-WM one (Dspan) (**Figures 3A–D**).

# Between-Group Differences in Phonemic and Semantic Fluency

Patients performed poorly compared to controls in phonemic fluency (patients' Mdn = 7, controls' Mdn = 11, W = 95.5, p = 0.022), but not in semantic fluency (patients' Mdn = 7, controls' Mdn = 9, W = 98.5, p = 0.13). That is, patients showed reduced verbal fluency if tested with the phonemic fluency task (**Figures 3E,F**).

#### Between-Group Differences in PS

**Figure 3G** shows a marginally reduced PS Index for patients compared to controls (patients' Mdn = 100.5, controls' Mdn = 107.5, W = 104.5, p = 0.062). Such reduction in PS was mainly driven by patients performing worse than controls in CDsymbol (**Figure 3H**; patients' Mdn = 8, controls' Mdn = 11.5, W = 105, p = 0.056) rather than in Ssearch (**Figure 3I**; patients' Mdn = 11, controls' Mdn = 12, W = 95, p = 0.186).

Numbers in participants' code indicate the patient-control pairing (e.g., c01 is the control participant for p01). WM and PS Indexes are measured in terms of intelligence quotient. The values for the rest of the measures are standard scores. W/p-value (residuals): the significance of the between-group comparison partialling out PS (WM measures) or WM (phonemic fluency). \*Statistical values reported without considering these participants as they were considered outliers (or the patient pair of an outlier control participant).

#### PS Influence on WM

Given the marginal impairment in PS shown by patients, we examined whether patients' difficulties with WM could be merely driven by slow PS. To this end, we first partialled out the contribution that PS had on participants' WM performance with linear regression analyses. We conducted a separate regression (including both controls and patients) for each WM measure (dependent variables). We used CDsymbol performance as the predictor, which was the PS task in which patients and controls differed the most. Then, we compared the residuals of controls and patients resulting from those regressions: significant between-group differences in these residuals would mean that patients and controls differed in WM regardless of PS. The results of these analyses showed a reduced WM Index for patients compared to controls (W = 30.5, p = 0.05), which was mainly driven by a poorer performance in LNS (W = 21.5, p = 0.011) rather than Dspan (W = 46, p = 0.36) or Arithmetic (W = 39, p = 0.17). Arithmetic seemed to be the WM task most influenced by PS, as the marginal between-group difference reported above vanished after controlling for PS. In spite of that, after removing the influence of PS on WM, patients showed a pattern of WM difficulties similar to the one reported above: they still showed a reduced WM Index due to poor performance on an ex-WM task, LNS. This indicates that patients' difficulties in ex-WM were genuine and not reducible to slow PS.

# WM Influence on Phonemic Fluency

As explained in the Introduction, besides EC processes, verbal fluency tasks also engage WM. This means that patients' reduced verbal fluency observed in the phonemic task version could have been driven by difficulties with its WM component rather than by EC deficits. To examine this possibility, we first partialled out the contribution of WM to participants' performance in phonemic fluency (dependent variable) with a linear regression analysis that included both controls and patients. We used LNS performance as the predictor in this regression, which was the WM task in which patients showed difficulties. Then, we compared controls and patients in the resulting residuals: significant between-group differences in these residuals would mean that patients and controls differed in the EC processes typically engaged by verbal fluency tasks regardless of their WM component. Contrasting with this prediction, however, the results showed no between-group differences in the residuals (W = 49.5, p = 0.97), indicating that patients had no difficulties in engaging the EC processes required by the task. In other words, the reduced phonemic fluency reported above seemed to have been driven by patients' WM impairment.

# Complementary Analyses on Post-surgery Changes

Prior studies focused on movement and language seemed to indicate that the surgical treatment of the lesion reduces the likelihood that patients remain with permanent sequelae (Gabarrós et al., 2011). In this study, however, resecting the SMA lesion did not result in any changes with respect to WM or PS function, whereas patients' performance in verbal fluency tended to worsen: WM Index (pre-surgery Mdn = 90, post-surgery Mdn = 86, V = 21, p = 0.759), Dspan (pre-surgery Mdn = 10, post-surgery Mdn = 8, V = 16, p = 0.733), LNS (pre-surgery Mdn = 9, post-surgery Mdn = 8, V = 21, p = 0.268), Arithmetic (pre-surgery Mdn = 7, post-surgery Mdn = 9, V = 21, p = 0.67), PS Index (pre-surgery Mdn = 103, post-surgery Mdn = 92, V = 45, p = 0.306), CDsymbol (pre-surgery Mdn = 8, post-surgery Mdn = 9, V = 26, p = 0.719), Ssearch (pre-surgery Mdn = 11, post-surgery Mdn = 10, V = 43, p = 0.125), phonemic fluency (pre-surgery Mdn = 7, post-surgery Mdn = 4, V = 45.5, p = 0.072), and semantic fluency (pre-surgery Mdn = 7, post-surgery Mdn = 5, V = 38, p = 0.074). Many different factors may play a role in the post-surgery outcome of cognitive processes, including potential collateral effects of surgery, premorbid cognitive deficits, and various clinical variables such as extension and exact location of the lesion. Since we did not control for many of those factors, we will not discuss the post-surgery outcome any further.
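For readers unfamiliar with the V statistic of the Wilcoxon signed rank test reported above, a minimal sketch follows. It assumes the usual definition (the sum of the ranks of the positive paired differences, with zero differences dropped and tied absolute differences given their average rank) and takes the difference as pre-surgery minus post-surgery; the scores and the function name are hypothetical, and no p-value is computed.

```python
def signed_rank_v(pre, post):
    """Sum of the ranks of positive (pre - post) differences, using midranks for ties."""
    diffs = [a - b for a, b in zip(pre, post) if a != b]  # zero differences are dropped
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next absolute difference is tied with the current one.
        while end + 1 < len(order) and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        midrank = (start + end) / 2 + 1  # average of the 1-based ranks in the tied run
        for k in range(start, end + 1):
            ranks[order[k]] = midrank
        start = end + 1
    return sum(rank for rank, d in zip(ranks, diffs) if d > 0)


# Hypothetical pre- and post-surgery standard scores for four patients.
print(signed_rank_v([10, 9, 8, 7], [8, 9, 9, 4]))  # differences 2, -1, 3 -> V = 2 + 3 = 5.0
```

Under this convention a large V indicates that most patients scored higher before than after surgery, while a small V indicates the opposite.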

# DISCUSSION

The main goal of the present study was to determine whether lesions affecting the SMA hamper WM by taking into consideration potential PS confounds. As a secondary goal we asked whether SMA damage would also hamper patients' performance on a broad, quick EC measure such as verbal fluency. We addressed these questions using tests commonly used in clinical practice to facilitate the transfer of the results to a health care context.

# On the Impact of SMA Lesions on WM

The results indicated that SMA lesions indeed compromised WM and that such a deficit could not be reduced to PS difficulties even though patients showed a trend toward slower PS compared to controls. The awake surgery data reported in Nakajima et al. (2014) had already shown a relation between SMA and WM. Neuroimaging data had also shown such a relation. Indeed, different meta-analyses have attributed to the SMA a role in WM, regardless of the specific WM task, modality (verbal vs. non-verbal), and difficulty (Owen et al., 2005; Rottschy et al., 2012). However, neuroimaging data had also revealed that WM relies on a widespread fronto-parietal network, with the SMA being one of the various regions comprising such a network along with portions of dorsolateral prefrontal, ventrolateral prefrontal, parietal, and subcortical regions (Owen et al., 2005; Rottschy et al., 2012). It is worth noting that such an extensive network could alleviate the effects of SMA damage to the extent that such damage would not have a significant impact on WM functionality. A neuropsychological approach was fundamental in shedding light onto this issue. However, to our knowledge, WM deficits associated with SMA damage have never been previously reported: as explained in the Introduction, Nakajima et al.'s (2014) was the only study exploring the effects of SMA damage on WM, but the data obtained outside the operating room for the two single cases they reported were inconclusive. Therefore, the results of the present study represent the first evidence that the role of the SMA is relevant enough for damage to it to significantly impair WM. In this regard, it is worth noting that, rather than playing a direct role in WM, the SMA may play an indirect role through its connectivity with the rest of the brain regions composing the fronto-parietal network. This is a question for further research.

As for the specific pattern of WM difficulties, patients showed a reduced WM Index, which is a global WM measure. Among the different WM measures we had, WM deficits were captured by the LNS task. Patients did not differ from controls in Dspan and Arithmetic, especially after removing the effects of PS. The fact that patients performed similarly to controls in Dspan indicated that they were relatively able to hold a series of items in WM. This means that their poor performance in manipulating such items (ex-WM), which is what LNS mainly measures, could not be explained by patients' not having those items available. One would expect that this difficulty in manipulating information held in WM would, in turn, lead patients to perform poorly in tasks that require operating on such information, as is the case with Arithmetic. It is relevant to note, however, that Arithmetic may not be the most valid tool to assess WM, as it depends on individuals' mathematical skills (Hill et al., 2010). In fact, in a study examining which tasks of the WAIS-III were the best predictors of WM capacity, Hill et al. (2010) found that Arithmetic was not among those predictors. These authors used common WM paradigms in experimental psychology as the gold standard in a study with 188 healthy individuals. The results of regression analyses indicated that, when considering only the 3 tasks originally thought to compose the WM Index (Dspan, LNS, and Arithmetic), Dspan and LNS accounted for most of the variance (33% and 28%, respectively). In contrast, Arithmetic was excluded from the model, evidencing its weak association with WM.

It is also relevant to point out that the present results should be taken only as a starting point for further research using more sophisticated WM tools. We acknowledge that the information one can gather with the WM Index of the WAIS-III regarding individuals' WM functioning is limited. Future research with experimental tasks is needed to fully understand the profile of WM difficulties associated with SMA damage. A related question for future research is whether such a profile is specific to SMA patients with respect to the WM profile of patients with damage in other regions within the fronto-parietal network.

# On the Effects of SMA Lesions on Verbal Fluency

The secondary question of this study was whether the cognitive impairment (other than language) associated with SMA damage goes beyond WM and also affects EC, given the SMA's connectivity with the DLPFC (Nachev et al., 2008). To this end, we compared patients' and controls' performance in verbal fluency, both semantic and phonemic. In the initial between-group analyses, patients performed worse than controls in the phonemic version of the task. Subsequent analyses, however, revealed that the WM processes engaged by the phonemic fluency task drove this disadvantage. That is, patients' reduced phonemic fluency was most likely due not to difficulties in the EC processes required by this task but to difficulties in the WM ones. Note, however, that it would be premature to conclude from these results that SMA damage does not lead to EC deficits, because verbal fluency is too broad an EC measure. In addition, it is not possible to disentangle the multiple EC processes this task may engage (e.g., inhibition, response suppression, and switching, among others; Abwender et al., 2001; Hirshorn and Thompson-Schill, 2006), which is a relevant limitation because patients may have selective difficulties in some of these processes. An additional limitation of verbal fluency is its strong dependence on the integrity of cognitive processes other than EC, such as lexical access or semantic processing (Shao et al., 2014). Therefore, our data only indicate that SMA damage does not seem to affect cognitive processes engaged by verbal fluency other than WM. Future research using more extensive and specific experimental protocols is needed to further investigate the potential association between EC dysfunction and SMA damage.

A secondary point related to the verbal fluency task is the fact that patients did not differ from controls in the semantic task version (before partialling out the effects of WM). This observation suggests that semantic fluency engaged WM processes to a lesser extent than phonemic fluency and, thus, was not affected by the collateral effect of patients' WM impairment. In fact, in some authors' view, semantic fluency is particularly associated with lexical-semantic skills rather than WM or EC (Shao et al., 2014). In line with this view, neuropsychological studies have found that patients with brain damage that compromises regions typically associated with lexical-semantic representations show more difficulties with the semantic than with the phonemic verbal fluency task (Jones et al., 2006; Laws et al., 2010; Magaud et al., 2010; Meijer et al., 2011). Similarly, Satoer et al. (2014) described a patient with the SMA syndrome whose reduced semantic (but not phonemic) verbal fluency seemed to stem from his language difficulties rather than his WM or EC dysfunctions.
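The logic of "partialling out" WM from a group difference in fluency can be made concrete with a small sketch. It assumes a simple linear residualization and a partial correlation between group membership and the fluency score; the analyses actually run in the study may have used a different procedure, and the function names and all numbers below are hypothetical illustrations, not study data.

```python
from statistics import mean


def residuals(y, x):
    """Residuals of y after a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def pearson(u, v):
    """Pearson correlation between two equally long sequences."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den


def partial_corr(x, y, covariate):
    """Correlation between x and y after removing the covariate from both."""
    return pearson(residuals(x, covariate), residuals(y, covariate))


# Hypothetical toy data: group (0 = control, 1 = patient), a WM score,
# and a fluency score that depends on WM but not directly on group.
group = [0, 0, 0, 1, 1, 1]
wm = [6, 5, 4, 3, 2, 1]
fluency = [13, 9, 8, 6, 3, 3]

raw = pearson(group, fluency)                # group difference before adjustment
adjusted = partial_corr(group, fluency, wm)  # group difference with WM partialled out
```

In this toy example the raw association between group and fluency is strong, whereas the adjusted one vanishes, which is the pattern one would expect if a group difference in fluency were carried entirely by WM.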

# On SMA Patients' Slow PS

As noted above, patients showed a marginal PS impairment, which did not drive their WM impairment. Beyond this fact, it is worth discussing the potential origin of such marginal PS difficulties. In this respect, it is relevant to point out that there is no specific brain region subserving PS; rather, PS depends on white-matter pathways that synchronize the transmission of information across distributed brain networks (Mesulam, 1998, 2000). This is why the neurological conditions most commonly affecting PS are those involving axonal damage, such as multiple sclerosis or traumatic brain injury (Rao, 1996; Levine et al., 2006). In a neuroimaging study using the CDsymbol task as a PS measure, Turken et al. (2008) revealed that, among the various white-matter pathways critical for PS, at least two involve the frontal cortex: the superior longitudinal fasciculus (a major fronto-parietal tract) and fronto-striatal projections. One of the most relevant frontal locations seemed to be the superior frontal cortex (SFC), to which parietal regions project through the superior longitudinal fasciculus. Another relevant frontal site was the DLPFC, connected with the basal ganglia through fronto-striatal projections. It is of note that the functioning of both frontal structures might be compromised, at least to a certain extent, by SMA lesions: the SMA is, in fact, harbored in the superior and medial aspects of the SFC (Penfield and Welch, 1951) and linked to the DLPFC through white matter connectivity (Nachev et al., 2008). Therefore, the marginal PS impairment may have been due to the (presumed) SFC and DLPFC dysfunction having an impact on their parietal and striatal white matter links, respectively.

# Limitations of the Current Study

As acknowledged above, one of the limitations of the present study has to do with the simplicity of the tasks used to index WM and EC. In addition, the modest sample size limited our statistical analyses. In this respect, it is worth underlining that the SMA's susceptibility to harbor tumors and other abnormalities calls for research on the SMA syndrome. However, the incidence of SMA damage is not high, making it difficult to assemble a patient group of a reasonable size: for example, the 12 patients in this study were recruited over a 3-year period in a reference hospital for the treatment of SMA lesions in Spain. In fact, large samples (n > 15) are exceptional (Zentner et al., 1996; Peraud et al., 2002; Duffau et al., 2003; Russell and Kelly, 2003; Rosenberg et al., 2010; Kim et al., 2013), whereas single-case descriptions have been very common (Laplane et al., 1977; Masdeu et al., 1978; Dick et al., 1986; Ziegler et al., 1997; Pai, 1999; Duffau et al., 2001; Chung et al., 2004; Hashiguchi et al., 2004; Iwasaki et al., 2009; Ryu et al., 2013; Nakajima et al., 2014; Acioly et al., 2015; Satter et al., 2017). Some studies have been able to describe series of 6 or 7 patients (Rostomily et al., 1991; Bannur and Rajshekhar, 2000; Sailor et al., 2003; Liu et al., 2004; Yamane et al., 2004; Tankus et al., 2009; Vassal et al., 2017). In several other studies (including the present one), the series of patients has been large enough (8–15 patients) to allow nonparametric group analyses (Lim et al., 1994; Fontaine et al., 2002; Nelson et al., 2002; Krainik et al., 2004; Ulu et al., 2008; Gabarrós et al., 2011; Anbar, 2012; Abel et al., 2015; Nakajima et al., 2015; Ibe et al., 2016).
In this regard, it is worth highlighting that this is the first group study investigating the effects of SMA lesions on cognitive functions other than language: the only prior study assessing non-linguistic cognitive functions only provided descriptive data of 2 single cases (Nakajima et al., 2014). In this respect, the results of the present study represent a remarkable new contribution to the research field. Even so, we interpret such results with caution.

It is also worth mentioning that a number of clinically related variables (which we did not control for) may influence the extent of functional deficits. Two of these variables are the size and exact location of the lesion. In this respect, distinguishing between the 2 anatomical portions of the SMA, the pre-SMA and the SMA-proper, would be of particular relevance. It seems that the former could be more related to cognitive processes (Chouinard and Paus, 2010). Therefore, it is possible that WM deficits arise only if the lesion affects the pre-SMA. In the case of a tumor, whether it invaded or displaced the SMA may also make a difference. In addition, it has been reported that the damage a tumor may cause is much greater if it is infiltrating (Gabarrós et al., 2011). Similarly, the degree to which (if any) the contralateral non-dominant SMA takes over the functions of the dominant one may vary across patients. It is worth noting, however, that a considerably large sample of patients would be needed to apply sound methodological approaches (e.g., neuroimaging studies based on lesion mapping techniques) and sound statistical methods (e.g., mixed-model regression analyses) in order to examine the implications of these variables.

#### CONCLUSION

The results of this study represent the first evidence that WM impairment is a symptom of the SMA syndrome, which seems to stem from difficulties in manipulating information held in WM. PS is also somewhat compromised in SMA patients. However, WM deficits are not reducible to PS difficulties. These findings highlight the need to include WM assessment in clinical SMA protocols. Further research is needed to establish the specific WM profile of SMA patients and determine the consequences of SMA damage for other cognitive functions such as EC.

#### ETHICS STATEMENT

This study was reviewed and approved by the ethics committee of the Hospital Universitari de Bellvitge (L'Hospitalet de Llobregat, Barcelona, Spain). The study was conducted in accordance with the Declaration of Helsinki. All participants provided written informed consent.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

MH was supported by the Ramón y Cajal (RYC-2016-19477) research program (Spanish Ministry of Economy, Industry, and Competitiveness). MH also acknowledges financial support from the Diputación Foral de Gipuzkoa/Gipuzkoako Foru Aldundia (Programa Fellows Gipuzkoa de atracción y retención de talento).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00765/full#supplementary-material

Supplementary Figure 1 | Boxplots showing outlier control subjects in different WM and EC measures. (A–C) WM measures in which control c04 was an outlier. (D) EC measure in which control c06 was an outlier.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Cañas, Juncadella, Lau, Gabarrós and Hernández. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Executive Functions Rating Scale and Neurobiochemical Profile in HIV-Positive Individuals

Vojislava Bugarski Ignjatovic<sup>1</sup> \*, Jelena Mitrovic<sup>2</sup> , Dusko Kozic<sup>3</sup> , Jasmina Boban<sup>3</sup> , Daniela Maric<sup>4</sup> and Snezana Brkic<sup>4</sup>

<sup>1</sup> Department for Psychology, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia, <sup>2</sup> Department for Psychology, Faculty of Philosophy, University of Novi Sad, Novi Sad, Serbia, <sup>3</sup> Department for Radiology, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia, <sup>4</sup> Department for Infectious Diseases, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia

The set of complex cognitive processes necessary for the cognitive control of behavior, known as executive functions (EF), is traditionally associated with the prefrontal cortex and commonly assessed with laboratory-based tests and conventional neuroimaging. In an effort to produce a more complete and ecologically valid understanding of executive functioning, rating scales have been developed to assess the behavioral aspects of EF within an everyday real-world context. The main objective of this study was to examine the relationship between behavioral aspects of EF measured by a rating scale and the neurometabolic profile, measured using multi-voxel magnetic resonance spectroscopy (mvMRS), in neurologically asymptomatic HIV-positive individuals under cART. The sample comprised 39 HIV-positive adult male participants, stable on cART, and 39 healthy HIV-negative volunteers. Both groups completed the Behavior Rating Inventory of Executive Function-Adult Version (BRIEF-A). The HIV-positive group additionally underwent long-echo three-dimensional mvMRS to determine the neurobiochemical profile in the anterior cingulate gyrus (ACG) of both hemispheres. Three dominant neurometabolites were detected: N-acetyl aspartate (NAA), the neuronal marker; choline (Cho), the marker of membrane metabolism and gliosis; and creatine (Cr), the reference marker. Ratios of NAA/Cr and Cho/Cr were analyzed. The initially detected significant correlations between age, current CD4, the BRIEF-A subscales Inhibit, Shift, Emotional Control, Plan/Organize and Self-Monitor, and the ratios of NAA/Cr and Cho/Cr in the dorsal and ventral parts of the ACG were lost after the introduction of Bonferroni corrections. Also, there were no significant differences between the HIV-positive and HIV-negative groups on any of the BRIEF-A subscales. Such results possibly imply that a stable cART regimen contributes to the preservation of behavioral aspects of EF in asymptomatic HIV-positive individuals. Even though a subtle deficit in some aspects of EF might exist, it would not be manifest when the behavioral aspect is assessed using an EF rating scale. A further explanation might be that the expected HIV-related changes in the neurometabolic profile of the ACG under cART are not reflected in those behavioral aspects that are measurable by an EF rating scale.

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Anna Rita Egbert, University of British Columbia, Canada Keenan Alexander Walker, Johns Hopkins Medicine, United States

#### \*Correspondence:

Vojislava Bugarski Ignjatovic vojislava.bugarski-ignjatovic@mf.uns.ac.rs

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 20 February 2018 Accepted: 27 June 2018 Published: 19 July 2018

#### Citation:

Bugarski Ignjatovic V, Mitrovic J, Kozic D, Boban J, Maric D and Brkic S (2018) Executive Functions Rating Scale and Neurobiochemical Profile in HIV-Positive Individuals. Front. Psychol. 9:1238. doi: 10.3389/fpsyg.2018.01238

Keywords: executive functions, anterior cingulate gyrus, behavior rating scale, HIV infection, MR spectroscopy

# INTRODUCTION

The executive functions (EF) are defined as the control or self-regulatory functions that organize and direct all cognitive activity, emotional response, and overt behavior (Miyake et al., 2000; Anderson, 2008; Duggan and Garcia-Barrera, 2015). "Executive functions" is an umbrella term for the interrelated functions that are responsible for purposeful, goal-directed, and problem-solving behavior in the everyday, "real world" environment (Goldstein et al., 2014). These functions are unique to our experience as human beings and critical for successful and adaptive everyday functioning (Duggan, 2014). Given the central importance of the executive functions for the direction and control of dynamic "real world" behavior, reliance on traditional performance measurements can lead to a limited, incomplete assessment (Gioia and Isquith, 2004; Isquith et al., 2013). In an effort to produce a more complete and ecologically valid understanding of executive functioning, a number of rating scales have been developed to assess the behavioral aspects of executive function within an everyday real-world context; these scales can potentially serve as an ecological validity index for clinical or laboratory findings (Isquith et al., 2013; Duggan, 2014; Silver, 2014). In this sense, executive function rating scales were originally intended to serve as complementary measures to traditional assessment methods; however, studies have shown that performance-based and rating-based measurements assess different aspects of executive functioning and provide important complementary information to clinicians and researchers (Isquith et al., 2013; Toplak et al., 2013; Duggan, 2014; Silver, 2014).

According to Duggan (2014), some of the primary strengths of executive function rating scales are the ability to assess the application of executive skills (rather than the functionality of their components), the capacity to capture executive characteristics of everyday functioning in a clinical setting, the contribution of distinct information to executive function assessment, and the potential correlations with biological markers (Gioia et al., 2010; Isquith et al., 2013; Toplak et al., 2013). The presumed neurological basis for deficits in EF makes the relationship between markers of neuronal function and ratings on these scales an expected finding, especially in clinical populations (Isquith et al., 2013). Several studies reported associations between neural substrates and everyday executive functioning measured by rating scales (Isquith et al., 2013). The Behavior Rating Inventory of Executive Function (BRIEF; BRIEF-A) is one of the most successful rating scales of EF in daily activities, in both pediatric and adult clinical populations (Gioia and Isquith, 2004; Anderson et al., 2005; Wozniak et al., 2007; Garlinghouse et al., 2010; Kesler et al., 2011; Ghassabian et al., 2013; Hagen et al., 2016).

The frontal lobes are traditionally said to be involved in the executive functions. However, due to the high diversity of cognitive processes encompassed by the term "executive functions," researchers have made an effort to define functions within the executive domain according to the cognitive processes in which they are involved. Functional neuroimaging studies have shed much light on this field by identifying the cerebral regions engaged in the processing of certain cognitive tasks (Gazzaniga, 2004; MacPherson et al., 2015). However, these findings alone were not sufficient; the results of lesion studies also pointed to the frontal lobe regions as necessary for performing specific cognitive tasks (MacPherson et al., 2015).

The anterior cingulate gyrus (ACG) is a unique and important region in the brain circuitry, extensively connected to both the emotional and the cognitive regions. The ACG lies in the medial part of both cerebral hemispheres, surrounding the corpus callosum. It is divided into two anatomically and functionally different regions, the ventral (or rostral) ACG (surrounding the genu of the corpus callosum) and the dorsal (or caudal) ACG (surrounding the body of the corpus callosum) (Palomero-Gallagher et al., 2009). An anatomical peculiarity of these regions (the whole dorsal part and half of the ventral part) is the presence of Von Economo spindle neurons, which are larger than pyramidal neurons and built for fast neurotransmission and high connectivity (Nimchinsky et al., 1999). The connections of these two parts differ greatly, as do their proposed functions. The ventral part of the ACG has connections with emotion (amygdala), autonomic (hypothalamus), memory (hippocampus) and reward-related (orbitofrontal cortex) regions of the brain. On the other hand, the dorsal part of the ACG has rich connections with cognitive (lateral prefrontal) and motor-related (premotor and primary motor) regions, as well as with thalamic nuclei involved in pain and motor processing (Allman et al., 2001).

The functional roles of these two parts of the ACG have been studied using functional neuroimaging and diffusion tensor imaging (DTI). DTI studies, which measured structural connectivity, largely confirmed the patterns described in anatomical analyses (Stevens et al., 2011). Resting-state functional MR imaging (fMRI) revealed connections between the ventral ACG and areas involved in affective processing, as well as between the dorsal ACG and areas involved in sensorimotor and cognitive processing (Margulies et al., 2007). Task-related fMRI showed that motor-related tasks activated the dorsal ACG, while simple emotions (sadness and happiness) activated the ventral ACG more strongly. However, the sense of sadness contributed to the sense of pain, processed by the dorsal ACG. Cognitive control, conflict-monitoring, response-selection, error-detection and emotion-related appraisal are most likely also processed by the "cognitive" dorsal part of the ACG. The ventral ACG is thus the "affective" part, involved in emotion assessment, emotion-related learning and autonomic regulation (Etkin et al., 2011). Prior neurocognitive and neuroimaging studies in the healthy population confirmed the association between the ACG and cognitive control functions, especially conflict detection, performance monitoring and response selection (Botvinick, 2007; Alexander and Brown, 2017), as well as consolidation of memories (Nieuwenhuis and Takashima, 2011) and attention and reward-based learning (Hayden et al., 2011). A functional division of the ACG into ventral "affective" and dorsal "cognitive" subdivisions was also confirmed (Mohanty et al., 2007).

Human Immunodeficiency Virus (HIV) infection has become a chronic condition after the introduction of combination antiretroviral therapy (cART), with the average lifespan of HIV-positive individuals reaching that of the HIV-negative population (Heaton et al., 2015). Currently, European AIDS Clinical Society guidelines strongly recommend the introduction of cART immediately after confirmation of HIV-seropositive status, regardless of immune status (Ryom et al., 2016). This early introduction of cART has significantly changed the overall picture of HIV-associated neurocognitive disorders, with a marked reduction in the prevalence of the most severe form, HIV-associated dementia (Chan and Brew, 2014), from up to 18% in the pre-cART era to 3–5% after the introduction of modern antiretroviral treatment (Heaton et al., 2011). The milder forms, asymptomatic neurocognitive impairment and mild neurocognitive impairment, remain highly prevalent in the cART era (Harezlak et al., 2011; Clifford and Ances, 2013). HIV-positive individuals have become an interesting clinical population for the assessment of cognitive status. The EF are especially interesting, since they play a very important role in adherence to antiretroviral therapy (Huerta et al., 2016). Recent studies showed impairments in multiple domains of executive functions in HIV-infected individuals (Heaton et al., 2011; Cattie et al., 2012; Giesbrecht et al., 2014), such as performance difficulties observed on cognitive shifting and complex sequencing tests (Marcotte et al., 1999; Mindt et al., 2003; Vazquez-Justo et al., 2003; Heaton et al., 2004); response inhibition (Hinkin et al., 1999; Yadavalli, 2009); decision making (Hardy et al., 2006); abstract reasoning (Marcotte et al., 1999; Heaton et al., 2004); planning (Cattie et al., 2012) and working memory (Malagurski et al., 2017). In spite of peripherally successful cART application and the absence of subjectively experienced difficulties, a certain degree of cognitive deficit can still be observed (Malagurski et al., 2017). In addition, in the cART era, an important contribution to neuropsychiatric complications can be attributed to the antiretroviral drugs, primarily nucleoside and non-nucleoside reverse transcriptase inhibitors (Treisman and Soudry, 2016).

One of the most commonly used advanced neuroimaging techniques for detecting the pathological processes underlying neurocognitive impairment in the HIV-positive population is magnetic resonance spectroscopy (MRS). MRS is a noninvasive diagnostic imaging technique used for the detection of neurometabolite concentrations in small volumes of brain tissue. The pattern of changes in the neurometabolic profile can characterize the type of ongoing pathological process in the brain parenchyma (Posse et al., 2013). The classical pattern of brain injury in chronic HIV infection consists of a decline in the neuronal marker (N-acetyl-aspartate, NAA) and an increase in the markers of membrane metabolism (choline-containing compounds, Cho) and glial proliferation (myoinositol, mI) (Ances and Hammoud, 2014). Recent MRS studies showed a reduction in the neuronal marker in the ACG in HIV-positive subjects compared to healthy controls (Boban et al., 2017). Additionally, some differences were observed in the level of NAA in HIV-positive subjects with stimulant (alcohol, drugs) dependence, compared to both healthy controls and HIV-positive subjects without this dependence (Taylor et al., 2000). However, correlations between neurometabolite ratios and executive functions in chronic HIV infection have not been widely studied to date.

Although many recent studies have examined traditional EF performance measures in the HIV-positive adult population, to date no studies have used EF rating scales. In addition, no studies have explored the relationship between neurometabolite ratios measured using advanced neuroimaging techniques such as multi-voxel MRS and behavior measured by EF rating scales in the HIV-positive adult population.

The first objective of this study was to assess the correlations between EF rating scale subtests and neurometabolite ratios obtained on the three-dimensional MRS in the ventral and dorsal ACG in chronically infected, virally suppressed HIV-positive individuals that are stable on long-standing cART.

The main hypothesis resulting from this objective was that there would be a significant correlation between the NAA/Cr and Cho/Cr ratios and the EF rating scale subtests: better scores on the Working Memory, Emotional Control, Inhibit and Shift subtests would be accompanied by higher NAA/Cr and lower Cho/Cr ratios, especially in the "cognitive" dorsal part of the ACG.

Previous studies on neurocognitive status in HIV-positive individuals showed the presence of asymptomatic neurocognitive impairment even in subjects under a stable cART regimen, supporting the assumption that HIV has a certain negative influence on the central nervous system.

In well-controlled HIV infection, cognitive impairment is often subtle, discrete and commonly asymptomatic. Based on those findings, it can be assumed that the probability of detecting the cognitive deficit among people on a stable cART regimen on behavioral tests is relatively low, but possible.

The second objective of this study was to find out whether there was a difference between the group of HIV-positive individuals on stable cART and the group of HIV-negative healthy controls, on the EF rating scale subtests. It was hypothesized that there was a difference only on certain subtests, such as Working memory, Shift, Inhibit and Emotional control.

# MATERIALS AND METHODS

# Participants

This cross-sectional study, approved by the institutional ethics board, was conducted from 2015 to 2016 and comprised a total of 39 HIV-positive adult male participants, average age 42.18 years (range 25–66). All participants were Caucasian. All participants were chronically infected (more than 1 year since diagnosis) and stable on cART (more than 1 year on the same cART regimen). HIV-positive subjects were diagnosed with HIV infection using polymerase chain reaction (PCR) testing. After the initiation of cART, viral loads and current CD4 T-cell counts were closely monitored in each patient. Plasma viral load was under the detection threshold and current CD4 counts were over 250 cells/mm<sup>3</sup> in all participants for at least 1 year. In addition, the cART regimen remained unchanged for the same period of time. Based on the results of the screening neurocognitive test, the International HIV Dementia Scale (Sacktor et al., 2005), all participants were neurologically asymptomatic and able to work.

The criteria for inclusion of HIV-positive individuals in the study were: (1) age over 18, (2) normal conventional MR scan, (3) the presence of HIV infection (PCR verified), (4) HIV-seronegative status on PCR testing during the previous 12 months (two negative PCR tests on regular 6-month follow-up), and (5) plasma viral load under the level of detection (<40 copies/mL). The exclusion criteria were the presence of focal or diffuse lesions in the brain (verified by conventional MR scan), active infiltrative or infective/opportunistic neurological disease, chronic neurological or psychiatric illness, comorbid disorders known to influence cognitive performance such as diabetes, cardiovascular diseases, hepatitis B or C, and active abuse of alcohol or narcotic drugs. The route of infection was recorded, and in all HIV-positive participants it was sexual transmission. No illicit intravenous drug use or vertical transmission was detected in our cohort.

TABLE 1 | Clinical and sociodemographic characteristics for the study sample.

The control group consisted of 39 adult male participants, average age 42.21 years (range 25–65), all Caucasian, who were over 18 years old and tested HIV-negative. The sample was selected randomly from the general population. Healthy controls underwent neuropsychological testing with the BRIEF-A. However, neuroimaging was not performed in these participants due to the lack of clinical indication.

All subjects gave fully informed written consent to participate in this study; every participant was therefore familiar with the research objectives. Demographic and clinical data of the participants in the study are summarized in **Table 1** and descriptive statistics for the variables included in the study are summarized in **Table 2**. The difference between HIV+ and HIV– participants was not significant with regard to either educational achievement [t(76) = −1.968, p > 0.05] or age [t(76) = −0.010, p > 0.05].
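The 76 degrees of freedom reported for these comparisons are what a pooled-variance (Student's) independent-samples t-test yields for two groups of 39 participants (39 + 39 − 2). The sketch below illustrates that computation under the assumption that this test was the one used, which the text does not state explicitly; the function name and the toy values are hypothetical and are not the study data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Student's independent-samples t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their own df
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / standard_error
    return t_stat, df


# Hypothetical toy values (e.g., years of education), not the study data
t_stat, df = pooled_t([1, 2, 3], [2, 3, 4])
```

With two groups of 39 values each, the second element returned by `pooled_t` is 76, matching the degrees of freedom in the reported statistics.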

#### Measures

TABLE 2 | Descriptive statistics for the variables included in the study.

The Behavior Rating Inventory of Executive Function-Adult Version (BRIEF-A; Roth et al., 2005) is a standardized rating scale developed to assess behavioral manifestations of executive functions in adults aged 18–90 years. The BRIEF-A consists of Self-Report and Informant Report Forms. The Self-Report Form, used in this study, provides an understanding of the individual's perspective with respect to their own difficulties in self-regulation. The questionnaire contains 75 items within nine non-overlapping clinical scales: Inhibit, Shift, Emotional Control, Initiate, Working Memory, Plan/Organize, Organization of Materials, Task Monitor and Self-Monitor. Responses were indicated on a 3-point scale labeled never, sometimes, often, where a lower score represents better performance. Reliability analyses, measured by Cronbach's alpha, revealed high internal consistency of all clinical scales for this study sample (Inhibit α = 0.808, Shift α = 0.701, Emotional Control α = 0.878, Self-Monitor α = 0.811, Initiate α = 0.725, Working Memory α = 0.778, Plan/Organize α = 0.670, Task Monitor α = 0.651, Organization of Materials α = 0.813).
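The scale scoring and the reliability coefficient mentioned above can be illustrated with a minimal sketch. It assumes that responses are coded 1 (never), 2 (sometimes) and 3 (often), which the text does not state explicitly, and the function names `scale_score` and `cronbach_alpha` are hypothetical; the study itself used SPSS, so this is not the authors' implementation.

```python
from statistics import variance


def scale_score(item_responses):
    """Sum score of one respondent on one scale (assumed coding: 1, 2, 3)."""
    return sum(item_responses)


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of respondents, each a list of item scores of equal length.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents (sample variance).
    item_var_sum = sum(variance(col) for col in zip(*responses))
    # Variance of the respondents' sum scores.
    total_var = variance([scale_score(row) for row in responses])
    if total_var == 0:
        raise ValueError("sum scores have zero variance")
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Hypothetical responses of four people to a three-item scale.
    demo = [[1, 1, 2], [2, 2, 2], [3, 2, 3], [1, 2, 1]]
    print(round(cronbach_alpha(demo), 3))
```

Alpha approaches 1 when the items vary together across respondents and drops when item variances are large relative to the variance of the sum score.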

#### Multi-Voxel Magnetic Resonance Spectroscopy Protocol

Magnetic resonance spectroscopy was performed immediately after neuropsychological testing on a 3T MR scanner (Siemens Trio Tim, Erlangen, Germany), using an 8-channel head array coil. The conventional MR scan consisted of multisequential and multiplanar tomograms, necessary for excluding focal or diffuse white and gray matter lesions and for localization of the multivoxel network. Conventional MR imaging consisted of T1-weighted sagittal, T2-weighted and fluid-attenuated inversion recovery (FLAIR) axial, and T2-weighted coronal tomograms, as well as thin-sliced 3D T1-weighted magnetization-prepared rapid acquisition gradient echo (MPRAGE) sagittal tomograms. Three-dimensional multi-voxel magnetic resonance spectroscopy was performed using point-resolved spectroscopy with a long echo time (repetition time 1,700 ms/echo time 135 ms). The volume of interest (VOI) was 80 × 80 × 80 mm, with a slice thickness of 10 mm, positioned parallel to the axial images. Total scan time was 7:17 min (a weighted phase-encoding scheme was applied). Saturation planes (for saturating signals of surrounding tissues) were manually positioned along the margin of the VOI. The automatic, volume-selective shimming method was used to optimize homogeneity of the magnetic field. The VOI was positioned in the same way in every subject in order to achieve reproducibility, while the analyzed voxel positions were chosen manually. The multivoxel network was placed in the supracallosal white matter and gray matter, comprising gray matter of the anterior and posterior cingulate gyrus and white matter of the frontal centrum semiovale (**Figure 1**). For the reasons stated in the Introduction, four localizations were chosen out of 64 possible voxels (8 × 8 network): two placed in the ventral part of the anterior cingulate gyrus (one in the left and one in the right hemisphere) and two placed in the dorsal part of the anterior cingulate gyrus (also one in the left and one in the right hemisphere). The spectra were imported to a digital workstation, where the manufacturer's dedicated spectroscopy software was applied for baseline corrections, peak identification and calculation of the ratios in the analyzed voxels. Spectra were analyzed by identifying peaks of N-acetyl aspartate (NAA) at 2.02 ppm, choline (Cho) at 3.2 ppm, and creatine (Cr) at 3.0 ppm. Ratios of NAA/Cr and Cho/Cr were calculated for each voxel. Spectra from the left and right sides were analyzed separately, due to possible lateralization of some executive functions (emotional control and working memory, for example).
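The per-voxel ratio calculation described above can be sketched as follows. The peak values, the voxel labels and the names `VoxelPeaks` and `metabolite_ratios` are hypothetical; in the study the ratios were produced by the scanner manufacturer's software, so this only illustrates the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoxelPeaks:
    """Peak measurements for one voxel (arbitrary units, hypothetical)."""
    naa: float  # N-acetyl aspartate, 2.02 ppm
    cho: float  # choline, 3.2 ppm
    cr: float   # creatine, 3.0 ppm


def metabolite_ratios(peaks):
    """Return NAA/Cr and Cho/Cr for a single voxel."""
    if peaks.cr <= 0:
        raise ValueError("creatine peak must be positive")
    return {"NAA/Cr": peaks.naa / peaks.cr, "Cho/Cr": peaks.cho / peaks.cr}


if __name__ == "__main__":
    # Four analyzed voxels: dorsal/ventral ACG, left/right (values invented).
    voxels = {
        "dorsal ACG left": VoxelPeaks(naa=18.0, cho=9.0, cr=10.0),
        "dorsal ACG right": VoxelPeaks(naa=17.5, cho=9.4, cr=10.2),
        "ventral ACG left": VoxelPeaks(naa=16.8, cho=8.9, cr=9.9),
        "ventral ACG right": VoxelPeaks(naa=17.1, cho=9.1, cr=10.1),
    }
    for name, peaks in voxels.items():
        print(name, metabolite_ratios(peaks))
```

Keeping each voxel as a separate entry mirrors the decision to analyze left and right hemispheres separately rather than averaging them.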

#### Analysis

#### Variables

Variables in this research were the scores on the nine BRIEF-A scales (each constructed by summing the items belonging to that scale) and the NAA/Cr and Cho/Cr ratios in the dorsal and ventral parts of the anterior cingulate gyrus (ACG) on both sides, in order to determine the relationship between executive functions and neurobiochemical profile. Due to the presumed lateralization of some executive functions (although chronic HIV-related changes affect the brain volume diffusely), we analyzed each voxel separately. Age, educational achievement, duration of HIV infection, duration of cART therapy, nadir CD4 count (the lowest CD4 count in the patient's history), and current CD4 count were examined in relationship with executive functions. These parameters were also examined as potential control variables in a regression model.

#### Data Analyses

All statistical analyses were performed using IBM SPSS software (Version 23.0, Chicago, IL, USA). Pearson's correlation analyses were conducted to estimate the relationships between the neurobiochemical profile in normal-appearing brain tissue and executive functions; between clinical parameters, sociodemographic variables and executive functions; and between clinical parameters, sociodemographic variables and the neurobiochemical profile. Four Pearson's correlation analyses were conducted separately, one for each specific area of interest. In order to investigate the difference on the nine BRIEF-A scales between HIV+ and HIV– individuals, nine t-tests were conducted. Moreover, five hierarchical regression analyses were conducted to estimate the unique contributions of neurobiochemical changes in normal-appearing brain tissue to the prediction of executive functions.
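As an illustration of the correlation analyses, the following minimal sketch computes Pearson's r and the t statistic (with n − 2 degrees of freedom) commonly used to test it. The analyses in the study were run in SPSS; the function names `pearson_r` and `r_to_t` and the example data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, n >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with zero variance has no correlation")
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        raise ValueError("t is undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Hypothetical NAA/Cr ratios and Shift scores for six participants.
    naa_cr = [1.9, 1.7, 1.8, 1.6, 1.5, 1.6]
    shift = [7, 9, 8, 10, 12, 9]
    r = pearson_r(naa_cr, shift)
    print(r, r_to_t(r, len(shift)))
```

A negative r in such an example would correspond to lower metabolite ratios going together with higher (worse) scale scores.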

Due to the small study sample and the many hypotheses tested, the Bonferroni correction was introduced in order to control the family-wise error rate (FWE) and avoid making a Type I error (Armstrong, 2014). The Bonferroni correction was used for Pearson's correlation tests, since multiple comparisons were tested, by adjusting the alpha value (the significance threshold for the p-value) for the number of comparisons being performed. To perform the Bonferroni correction, the alpha value (0.05) was divided by the number of comparisons for each test separately. For the first Pearson's analysis, where the goal was to determine the relationship between clinical parameters, sociodemographic variables, and executive functions in HIV+ individuals, the significance value was set at p = 0.0009. For the second Pearson's correlation analysis, where the goal was to investigate the relationship between EF and neurobiochemical profile in HIV-positive individuals, the significance value was set at p = 0.003. In order to investigate the relationship between clinical parameters, sociodemographic variables, and neurobiochemical profile in HIV-positive individuals, statistical significance was set at p = 0.001. Finally, in order to determine the relationship between executive functions and sociodemographic variables in HIV-negative individuals, the significance value was set at p = 0.003.

However, because this correction proved too strict and all statistically significant results were lost, we present and discuss the results as obtained without the Bonferroni correction.
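The Bonferroni adjustment amounts to dividing the nominal alpha by the number of comparisons in a family. The sketch below shows the arithmetic; the text does not report how many comparisons each family contained, so the counts used here are purely illustrative assumptions, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("need at least one comparison")
    return alpha / n_comparisons


def survives_correction(p_values, alpha=0.05):
    """Flag which p-values stay significant within one family of tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    # Illustrative family sizes only; the study does not state them.
    for m in (18, 48, 54):
        print(m, bonferroni_threshold(0.05, m))
    # Hypothetical p-values from one family of three tests.
    print(survives_correction([0.0005, 0.01, 0.04]))
```

With a few dozen tests per family the threshold falls to the order of 0.001, which is why uncorrected results at p < 0.05 or p < 0.01 can all be lost in a sample of this size.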

#### RESULTS

In order to determine the relationship between sociodemographic variables, clinical parameters and executive functions, Pearson's correlation test was performed (**Table 3**). Statistically significant positive correlations were obtained between the variable Age and BRIEF-A scale Shift as well as between current CD4 count and BRIEF-A scale Working Memory.

Pearson's correlation analysis was performed in order to determine the relationship between sociodemographic variables and EF in HIV-negative individuals (**Table 4**). A significant positive correlation was obtained between educational achievement and Working Memory.

Pearson's correlation analysis was performed in order to determine the relationship between executive functions and neurobiochemical profile (**Table 5**). Statistically significant negative correlations were obtained between NAA/Cr at the dorsal part of the ACG (right) and Inhibit, Shift, Emotional Control, Plan/Organize as well as between Cho/Cr at the dorsal part of the ACG (right) and Self Monitoring. Furthermore, statistically significant negative correlations were obtained between NAA/Cr at the dorsal part of the ACG (left) and Shift, Emotional Control, and Self Monitoring.

In order to determine the relationship between sociodemographic variables, clinical parameters and neurobiochemical profile in HIV-positive individuals, Pearson's correlation test was performed. The results are summarized in **Table 6**: statistically significant correlations were obtained between NAA/Cr at the ventral part of the ACG (left) and nadir CD4 count, between NAA/Cr at the dorsal part of the ACG (right) and both age and the duration of cART, and between NAA/Cr at the dorsal part of the ACG (left) and age.

Nine t-tests were conducted in order to investigate the difference between HIV-positive and HIV-negative individuals on the nine BRIEF-A scales. The results (**Table 7**) showed no significant difference between age-matched HIV-positive and HIV-negative individuals with regard to EF.
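A group comparison of this kind can be sketched with an independent-samples t-test. The sketch assumes the pooled-variance (Student) form, which is consistent with the 76 degrees of freedom reported for two groups of 39 but is not stated explicitly in the text; the function name `pooled_t` and the scores are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Student's t for two independent groups, assuming equal variances.

    Returns the t statistic and its degrees of freedom.
    """
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / df
    if pooled_var == 0:
        raise ValueError("both groups have zero variance")
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / std_error, df


if __name__ == "__main__":
    # Hypothetical Inhibit scores for a few HIV+ and HIV- participants.
    hiv_pos = [11, 12, 10, 13, 12, 11]
    hiv_neg = [10, 12, 11, 12, 13, 10]
    t, df = pooled_t(hiv_pos, hiv_neg)
    print(t, df)
```

Repeating the test for each of the nine scales gives the nine comparisons reported above; the p-value for each t would then be read from the t distribution with the returned degrees of freedom.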

TABLE 4 | Correlations between sociodemographic variables and BRIEF-A measures in HIV-negative individuals.

\*p < 0.05.

Hierarchical regression analyses were performed in order to determine the influence of the NAA/Cr and Cho/Cr ratios at the dorsal ACG (left and right) on the following BRIEF-A scales: Inhibit, Shift, Emotional Control, Self-monitoring, and Plan/Organize. Age was used as the control variable in the first step, the current CD4 count in the second step, and the NAA/Cr and Cho/Cr ratios at the dorsal ACG (left and right) in the third step. The results (Tables 8–17 in Supplementary Materials) showed that age had a significant influence on the clinical scale Shift. However, none of the metabolite ratios had a significant influence on achievement on the BRIEF-A scales.
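The stepwise entry of predictor blocks can be illustrated with a small pure-Python sketch that fits an ordinary least-squares model after each step and reports R² together with its change. It omits the significance test of the R² change, is not the SPSS procedure used in the study, and all names and data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(predictors, y):
    """R^2 of an OLS fit with intercept; predictors is a list of columns."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in col] for col in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * col[i] for b, col in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def hierarchical_r2(blocks, y):
    """Enter predictor blocks cumulatively; return (R^2, change in R^2) per step."""
    entered, steps, previous = [], [], 0.0
    for block in blocks:
        entered.extend(block)
        r2 = r_squared(entered, y)
        steps.append((r2, r2 - previous))
        previous = r2
    return steps


if __name__ == "__main__":
    # Hypothetical data: age (decades), CD4 (hundreds of cells), NAA/Cr, Shift.
    age = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
    cd4 = [5.0, 6.2, 4.8, 7.0, 5.6, 6.4]
    naa_cr = [1.9, 1.7, 1.8, 1.6, 1.5, 1.6]
    shift = [7, 9, 8, 10, 12, 9]
    for step in hierarchical_r2([[age], [cd4], [naa_cr]], shift):
        print(step)
```

The change in R² at the last step is the share of variance that the metabolite ratio adds beyond age and CD4 count, which is the quantity the hierarchical design is meant to isolate.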

#### DISCUSSION

Recent studies showed that HIV-positive individuals have a significantly extended lifespan due to the introduction of cART (Heaton et al., 2015). However, this has resulted in a more frequent occurrence of asymptomatic neurocognitive impairment, characterized by subtle deficits in multiple domains of EF (Heaton et al., 2011; Cattie et al., 2012; Giesbrecht et al., 2014). The majority of studies that reported deficits in HIV-positive subjects assessed EF using limited, laboratory-based tests, known as traditional performance measures. Over time, however, clinicians and researchers found that these tests could not always detect impairment in patients who had clear executive dysfunction in their everyday lives, while indicating impairment in patients with no evidence of executive problems outside the test setting (Pennington and Ozonoff, 1996). For these reasons, standardized rating scales of real-world behavior are increasingly used as a supplement to traditional laboratory tests (Silver, 2014).

To the best of our knowledge, no HIV-related studies to date have used rating scales in the assessment of EF. This was the main reason for using a behavior rating scale of EF in chronically infected, virally suppressed HIV-positive subjects under long-standing cART and in a corresponding group of HIV-negative individuals, in order to detect potential differences between these two groups on the nine subscales of the BRIEF-A. The results showed that there were no significant differences between HIV-positive and HIV-negative individuals on the subscales Inhibit, Shift, Emotional Control, Initiate, Working Memory, Plan/Organize, Organization of Materials, Task Monitor and Self-Monitor. This was a rather surprising result, since at least subtle differences were expected, primarily on the Working Memory, Shift, Inhibit and Emotional Control subscales, which represent basic/core cognitive processes of EF and play an important role in organizing behavior. Also, these specific aspects of EF have been most thoroughly studied in clinical HIV-positive populations.

TABLE 5 | Correlations between neurobiochemical profile and executive functions in HIV-positive individuals.

VP ACG, ventral part of the anterior cingulate gyrus; DP ACG, dorsal part of the anterior cingulate gyrus. \*\*\*p < 0.001, \*\*p < 0.01, \*p < 0.05.

Some subtle, though asymptomatic, neurocognitive deficits can be observed in HIV-positive individuals under cART using traditional performance measure tests (Malagurski et al., 2017). However, in our study, no significant deficit was recorded on any of the nine scales of EF. A potential explanation could be that, even if a certain degree of disturbance existed in the tested components of EF, this deficit was not pronounced enough to be observed in the behavioral domain. This could imply that in HIV-positive individuals under stable cART, who are functional in everyday activities and able to work, there are no prominent deficits in the behavioral domain, even though deficits can be detected on traditional performance measure tests. Furthermore, a behavioral deficit may have to be more prominent to cause disturbances in everyday, real-life situations in HIV-positive patients. Indirect support for this interpretation is that our sample included HIV-positive individuals who were highly functional in everyday activities and that the level of compliance was high (reflected in plasma viral load under the detection limit). An additional explanation could lie in potential functional neuronal reorganization and brain plasticity that prevent further deterioration of executive functions. Sanford et al. recently raised the question of whether functional remodeling of the pathways involved in cognition exists in HIV-positive individuals under cART. In a recently published 2-year follow-up study, better interval performance was observed on several neurocognitive tasks, explained as the result of a beneficial effect of the timely introduction of cART and long-standing stable aviremia (Sanford et al., 2018).

The second main objective of the study was to examine correlations between EF rating scale subtests and neurometabolite ratios in the dorsal and ventral parts of the ACG, obtained using multi-voxel MRS in chronically infected, virally suppressed HIV-positive individuals under stable and long-standing cART. The idea to explore these correlations came from everyday clinical practice, since we felt that the results could contribute to better clinical management of HIV-positive individuals. Additionally, significant correlations between neurometabolite ratios and executive functions would further confirm the key neuroanatomic role of the ACG in EF, especially given that these correlations have never been examined in HIV-positive subjects. Also, data obtained on this easy and fast assessment scale that suggest the presence of neurodegeneration in domains related to everyday activities could guide the clinical decision to introduce further neuroimaging and detailed neurocognitive studies. In that way, new protocols for the assessment of neurocognitive status in an individual patient could be established, based on speed, efficacy and cost-benefit analysis, thus contributing to better initial triage of patients eligible for cognitive assessment.

We decided to examine correlations between BRIEF-A subscales and neurometabolite ratios in the ventral and dorsal parts of the ACG using Pearson's correlation analysis. Statistically significant correlations were obtained between NAA/Cr at the dorsal part of the ACG (right) and Inhibit, Shift, Emotional Control, Plan/Organize, as well as between Cho/Cr at the dorsal part of the ACG (right) and Self Monitoring. Furthermore, statistically significant correlations were obtained between NAA/Cr at the dorsal part of the ACG (left) and Shift, Emotional Control, and Self-Monitoring.

With regard to the obtained correlations, the majority of significant correlations involved scales of the Behavioral Regulation domain: Inhibit, Shift, Emotional Control and Self-Monitor. Achievement on all these scales reflects the ability to maintain appropriate regulatory control of one's own behavior and emotional responses. This includes appropriate inhibition of thoughts and actions, flexibility in shifting problem-solving set, modulation of emotional response, and monitoring of one's actions. Appropriate behavioral regulation is likely to be a precursor of appropriate metacognitive problem solving that successfully guides active and systematic problem solving, and more generally supports appropriate self-regulation (Isquith et al., 2013). Lower values of the neuronal marker (NAA/Cr) in the dorsal ACG (left and right) were associated with poorer achievement on these scales. This result clearly confirmed that the dorsal ACG is the key structure in behavioral regulation, problem solving and monitoring of thoughts and actions. The decrease of NAA/Cr in the left and right dorsal ACG is related to the deterioration of inhibitory control and to impulsivity, leading to a decline in the ability to resist impulses and to stop one's own behavior at the appropriate time. The ability to move with ease from one situation, activity, or aspect of a problem to another as the circumstances demand is also impaired. Key aspects of shifting include the ability to make transitions, tolerate change, problem-solve flexibly, switch or alternate attention and change focus from one mindset or topic to another, implying a progressive decline in all these aspects of shifting (Isquith et al., 2013).

The decrease of NAA/Cr ratios in the dorsal ACG (left and right) was correlated with poor achievement on the Emotional Control subscale, which addresses the impact of EF problems on emotional expression and assesses an individual's ability to modulate or control his or her emotional responses. Interestingly, the Self-Monitor scale was the only scale that showed correlations with Cho/Cr ratios in the right ACG. The decrease in both Cho/Cr and NAA/Cr ratios in the dorsal ACG (left and right) is correlated with impairment in aspects of social or interpersonal awareness, since this scale captures the degree to which an individual perceives himself or herself as aware of the effect that his or her behavior has on others (Isquith et al., 2013). Of the other scales in the Metacognition domain, only the Plan/Organize scale showed a significant correlation with NAA/Cr in the right dorsal ACG. The Plan/Organize scale measures an individual's ability to manage current and future-oriented task demands. This scale consists of two components: plan and organize. The Plan component captures the ability to anticipate future events, to set goals, and to develop appropriate sequential steps ahead of time in order to carry out a task or activity. The Organize component refers to the ability to bring order to information and to appreciate main ideas or key concepts when learning or communicating information (Isquith et al., 2013). The obtained correlations imply a key role of the dorsal ACG in linking behavioral changes in everyday settings with the decrease in neuronal markers in chronic HIV-positive subjects.

The authors wanted to satisfy stricter statistical criteria and, therefore, Bonferroni corrections were performed. However, after the introduction of the Bonferroni corrections, none of the initially detected correlations remained significant. Based on prior studies using multi-voxel MRS and other advanced neuroimaging techniques, the authors expected correlations between these variables to exist. Therefore, we decided to discuss potential explanations for the absence of significant correlations after the introduction of this strict statistical criterion.

Given that a decrease in some neurometabolite concentrations in chronically infected, virally suppressed HIV-positive subjects was observed in a prior study (Boban et al., 2017), we expected that this decrease could be correlated with disturbances in some behavioral aspects that are coordinated by EF. The lack of the expected correlations after the introduction of Bonferroni corrections can be explained by several arguments. The first argument is that our sample consisted of neuroasymptomatic HIV-positive subjects under a stable cART regimen, functional in everyday activities and able to work. We speculated that, although the presence of HIV can affect the neurometabolite ratios, good clinical control of the disease prevented an overt profile in behavioral aspects (that could be correlated with expected deficits in executive functions). The lack of differences between HIV-positive and control subjects on BRIEF-A subscales indirectly confirmed that behavioral aspects of executive functions in asymptomatic and working HIV-positive individuals are not inferior to those in healthy controls. An additional explanation could be potential neuronal remodeling and brain plasticity that prevented manifest deterioration of executive functions. The second argument concerns the suitability of self-assessment scales (such as the BRIEF-A) for behavioral analyses of HIV-positive subjects, since the participants could have given socially desirable answers in order to present themselves as functional. This might justify the need for additional objective/independent assessment obtained from family members or partners in future studies. The third argument might be that this type of scale was not sensitive enough to detect subtle deficits in behavioral functions in HIV-positive subjects, thus explaining the lack of correlations between the rating scale and neurometabolite ratios.

Regarding the additional correlations between clinical parameters of HIV infection, sociodemographic variables and BRIEF-A subscales, a significant correlation was observed in the HIV-positive group between the Shift scale and age (the more advanced the age, the worse the achievement on this scale). In the HIV-negative group, there were no significant correlations between age and behavioral aspects of EF. The lack of correlations in the HIV-negative group could be explained by the age of the sample (mean 42.21 years), implying that such correlations may only emerge at a more advanced age. With the early introduction of cART, the pattern of brain aging in the HIV-positive population has also changed into a form of accelerated aging (Cole et al., 2017), meaning that HIV infection increases the burden of risk for age-related comorbidities. The presence of the correlation between age and the Shift scale in HIV-positive individuals, and the absence of any correlations with age in age-matched HIV-negative controls, might support this assumption.

Key aspects of shifting include the ability to make transitions, tolerate change, solve a problem flexibly, switch or alternate attention, and change focus from one mindset or topic to another. According to the relationship between age and Shift scale, this scale might particularly reflect those aspects of EF that are most vulnerable to the aging effect. This might be important in the context of compliance, working ability and global cognitive efficacy prediction in HIV-positive individuals.

Furthermore, significant correlations were observed between Working Memory and current CD4 count (the higher the current CD4 count, the worse the achievement on this scale) in the HIV-positive group, as well as between Working Memory and educational achievement (the higher the educational level, the worse the achievement on this scale) in the HIV-negative group. These correlations came as a surprise, given that better educational achievement should be associated with greater working memory capacity, while higher current CD4 counts should be associated with better achievement on the Working Memory scale. A potential explanation could be that the current CD4 count, which is a correlate of the immune state, does not adequately reflect a protective effect on neuronal function and on some aspects of EF, primarily working memory. This finding is consistent with the complex relationship between parameters of HIV infection and EF described in recent studies that failed to find correlations between positive indices of immune response (cART compliance, high current CD4 count, high nadir CD4) and improvement in different aspects of cognitive functioning. Indirectly, it can be concluded that there are very complex interactions and correlations between manifest indices of immune response and latent parameters that affect rapid cognitive decline, potentially involved in the acceleration of brain aging in HIV-positive individuals.

A potential explanation for the unexpected correlation between educational achievement and the Working Memory scale in the control group might be that people with higher academic achievement face an increased number of cognitive tasks in modern life and are involved in daily multitasking, which lowers working memory capacity. Working memory is one of our core cognitive functions: it allows us to keep information in mind for a short period of time and then use it when needed, representing the capacity to hold information in mind for the purpose of completing a task, encoding information, or generating goals, plans, and sequential steps to achieve goals. It is therefore plausible that exhausting daily cognitive tasks can create an imbalance in this function.

Also, we observed a significant negative correlation between the duration of cART and the NAA/Cr ratio in the dorsal part of the right ACG, pointing to a decrease in NAA/Cr with increased duration of cART. This raises the possibility of low-level neurotoxicity of the therapy, especially given that most of our patients were on a so-called "old-fashioned" cART. We also confirmed the strong correlation between age and NAA/Cr ratios in the dorsal ACG (left and right). This is a known finding, implying that age is a major contributing factor to the decrease of the neuronal marker, even greater than HIV-induced neuronal injury (Boban et al., 2017).

Finally, five hierarchical regression analyses were performed in order to determine the influence of the NAA/Cr and Cho/Cr ratios at the dorsal ACG (left and right) on the following BRIEF-A scales: Inhibit, Shift, Emotional Control, Self-monitoring, and Plan/Organize. Age was used as a control variable in the first step, the current CD4 count in the second step, and the NAA/Cr and Cho/Cr ratios at the dorsal ACG (left and right) in the third step. Age and current CD4 count were chosen as control variables since they showed statistically significant correlations with some BRIEF-A subscales. The dependent variables were the BRIEF-A subscales that showed statistically significant correlations with neurobiochemical changes. The results (Tables 8–17 in Supplementary Materials) showed that age had a statistically significant influence on the clinical scale Shift. However, none of the metabolite ratios had a significant influence on the BRIEF-A subscales. A potential explanation for the lack of the expected influence of neurometabolite ratios on BRIEF-A scales may lie in the significant correlations of the neurometabolite ratios among themselves, which lower the significance of the individual predictors' unique contributions (**Table 5**). Only age was a significant predictor of achievement on the Shift scale in HIV-positive subjects, in whom age determines the decline in the ability to move with ease from one situation, activity, or aspect of a problem to another as the circumstances demand. Even though there were no subjects of advanced age in our study sample, the presence of HIV infection in the middle-aged group has a certain negative effect on behavioral aspects of EF, despite good peripheral control of the disease. In other words, even though age itself is not expected to affect cognitive status in the middle-aged group, certain impairment might be detected in spite of good disease parameters, due to presumed accelerated brain aging in HIV infection.

It is important to emphasize that after the introduction of the Bonferroni corrections, none of the initially detected and explained correlations remained significant.

Finally, the authors feel that the lack of correlations after the introduction of the Bonferroni corrections could also be a consequence of overly strict statistical thresholds applied to a small sample. If the sample were larger, the initially detected correlations might still be observed after the introduction of Bonferroni corrections and stricter significance levels. Future studies should be conducted on larger samples that include the functional and working population of HIV-positive individuals. Additionally, the introduction of traditional measures of EF along with self-assessment scales could give a more objective and comprehensive insight into correlations between neurometabolite ratios and measurements of EF. Also, the introduction of third parties (family members or partners) as independent raters of EF could be extremely helpful. Regarding neuroimaging, multi-voxel MRS represents a very useful and sensitive tool for the analysis of the neurometabolic profile in an HIV-positive individual. Because HIV-related neuronal injury is diffuse rather than focal, the use of average values for the left and right hemispheres could be methodologically justified in some cases.

Alongside the limitations of this study, there are also some advantages. This is the first study in HIV-positive individuals to reveal potential correlations between neurobiochemical profile and assessment scales for EF in an adult population. The issue of the use of those correlations in the clinical management of these patients was also raised. Finally, this was the first study that used rating scales for the assessment of EF in HIV-positive individuals, thus contributing to the existing knowledge about the role of rating scales in EF in the clinical setting.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Declaration of Helsinki. The protocol was reviewed and approved by the institutional review board (Ethical Committee of the Faculty of Medicine, University of Novi Sad) governing each site. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

#### AUTHOR CONTRIBUTIONS

All the authors have seen and approved the final version and have contributed significantly to the work with regard to conception and design (VB, SB, and JB), acquisition of the data (VB, JM, JB, DK, and DM), data analysis and interpretation (VB, JM, and JB), as well as drafting the article (VB, JM, and JB) or revising it (SB).

#### FUNDING

The study was funded by the Provincial Secretariat for Science and Technological Development of the Autonomous Province of Vojvodina, grant number 114-451-2730/2016-02.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01238/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Bugarski Ignjatovic, Mitrovic, Kozic, Boban, Maric and Brkic. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Preserved but Less Efficient Control of Response Interference After Unilateral Lesions of the Striatum

Claudia C. Schmidt<sup>1\*†</sup>, David C. Timpert<sup>1,2†</sup>, Isabel Arend<sup>3</sup>, Simone Vossel<sup>1,4</sup>, Anna Dovern<sup>1</sup>, Jochen Saliger<sup>5</sup>, Hans Karbe<sup>5</sup>, Gereon R. Fink<sup>1,2</sup>, Avishai Henik<sup>3</sup> and Peter H. Weiss<sup>1,2</sup>

<sup>1</sup>Cognitive Neuroscience, Institute of Neuroscience and Medicine (INM-3), Research Centre Jülich, Jülich, Germany, <sup>2</sup>Department of Neurology, University Hospital Cologne, Cologne, Germany, <sup>3</sup>Department of Psychology and the Zlotowski Center for Neuroscience, Ben-Gurion University of the Negev, Beer-Sheva, Israel, <sup>4</sup>Department of Psychology, University of Cologne, Cologne, Germany, <sup>5</sup>Neurological Rehabilitation Centre Godeshöhe, Bonn, Germany

Previous research on the neural basis of cognitive control processes has mainly focused on cortical areas, while the role of subcortical structures in cognitive control is less clear. Models of basal ganglia function as well as clinical studies in neurodegenerative diseases suggest that the striatum (putamen and caudate nucleus) modulates the inhibition of interfering responses and thereby contributes to an important aspect of cognitive control, namely response interference control. To further investigate the putative role of the striatum in the control of response interference, 23 patients with stroke-induced lesions of the striatum and 32 age-matched neurologically healthy controls performed a unimanual version of the Simon task. In the Simon task, the correspondence between stimulus location and response location is manipulated so that control over response interference can be inferred from the reaction time costs in incongruent trials. Results showed that stroke patients responded more slowly overall and made more errors than controls. The difference in response times (RTs) between incongruent and congruent trials (known as the Simon effect) was smaller in the ipsilesional/-lateral hemifield, but did not differ significantly between groups. However, in contrast to controls, stroke patients exhibited an abnormally stable Simon effect across the reaction time distribution, indicating a reduced efficiency of the inhibition process. Thus, in stroke patients, unilateral lesions of the striatum did not significantly impair the general ability to control response interference, but led to less efficient selective inhibition of interfering responses.

Keywords: cognitive control, stroke, Simon task, putamen, caudate nucleus

#### Edited by:

Sarah E. MacPherson, University of Edinburgh, United Kingdom

#### Reviewed by:

Christiane Thiel, University of Oldenburg, Germany

Jeremy Hogeveen, University of New Mexico, United States

\*Correspondence: Claudia C. Schmidt, c.schmidt@fz-juelich.de

†These authors have contributed equally to this work and thus share the first authorship

Received: 16 January 2018; Accepted: 25 September 2018; Published: 16 October 2018

#### Citation:

Schmidt CC, Timpert DC, Arend I, Vossel S, Dovern A, Saliger J, Karbe H, Fink GR, Henik A and Weiss PH (2018) Preserved but Less Efficient Control of Response Interference After Unilateral Lesions of the Striatum. Front. Hum. Neurosci. 12:414. doi: 10.3389/fnhum.2018.00414

# INTRODUCTION

Cognitive control refers to a set of cognitive processes implicated in selecting an appropriate task-related action while minimizing interference from possible response alternatives. Therefore, one important aspect of cognitive control is the ability to inhibit prepotent yet inappropriate or interfering response tendencies (Banich, 2009; Goghari and MacDonald, 2009). Two conceptually different components of cognitive control with respect to inhibitory (control) demands are (global) response inhibition and interference control (Nigg, 2000). The former process, response inhibition, aims at completely withholding or canceling an inappropriate response, which is usually studied with go/no-go (Falkenstein et al., 1999) or stop-signal tasks (Verbruggen and Logan, 2008). In contrast, interference control requires conflict resolution and the selective inhibition of inappropriate response tendencies to enable the execution of the task-appropriate response (Friedman and Miyake, 2004; Diamond, 2013). The current study focusses on response interference control (i.e., the latter component of cognitive control).

A well-established and sensitive measure for the control of response interference is the Simon task (Simon, 1969; Hommel, 2011). In this stimulus-response interference task, participants are instructed to respond to a non-spatial stimulus feature (e.g., color) by giving manual left or right responses, irrespective of the location at which the stimulus appears. In each trial, the stimulus is presented either to the left or to the right of a fixation point. Participants typically respond faster (and more accurately) when the relative spatial location of the stimulus matches the side of the response (congruent condition) compared to when both positions do not correspond (incongruent condition), even though the spatial location of the stimulus is task-irrelevant (Lu and Proctor, 1995). The difference in response times (RTs) between incongruent and congruent conditions is called the Simon effect (Hedge and Marsh, 1975). The magnitude of the Simon effect reflects the extra demand (and thus time) that is required to overcome the stimulus-response interference. Controlling stimulus-response interference in the Simon task is thought to involve selectively inhibiting the prepotent tendency to respond to the (irrelevant) location of the stimulus and instead selecting a task-appropriate response for successful task performance (Burle et al., 2002). The activation-suppression hypothesis provides a theoretical framework for the temporal dynamics of the response interference control processes in the Simon task. It proposes that, in incongruent trials, the (incorrect) response tendency as activated by the (task-irrelevant) stimulus location is followed by selective inhibition, which needs some time to develop (Ridderinkhof, 2002). Given these proposed dynamics, the automatic activation of the (incorrect) response prevails in incongruent trials with short RTs, while in incongruent trials with long RTs selective inhibition can already exert its effects.
Therefore, the integrity (Forstmann et al., 2008) and efficiency (Wylie et al., 2010) of the selective response inhibition process can be revealed by distributional analyses of RTs in which the Simon effect is analyzed as a function of intraindividual response latencies (De Jong et al., 1994; Ridderinkhof et al., 2004b). Specifically, efficient selective inhibition is reflected in a reduced Simon effect as RT increases. Conversely, less efficient selective inhibition rather leads to a uniform or even increased Simon effect across the RT distribution.

It is generally accepted that cortical areas make a fundamental contribution to cognitive control processes, including response interference control (Miller and Cohen, 2001; Nee et al., 2007). In particular, medial and lateral frontal brain regions including the anterior cingulate cortex (ACC) and lateral prefrontal cortex (PFC) are supposed to support the detection and monitoring of (response) conflicts (Botvinick et al., 2004; Kerns, 2006) and the implementation of cognitive control to resolve the conflict (MacDonald et al., 2000; Ridderinkhof et al., 2004a), respectively.

In contrast, the putative contribution of subcortical structures to cognitive control processes, especially to the control of response interference, is less clear (O'Callaghan et al., 2014). Models of basal ganglia function suggest that the striatum (comprising the putamen and caudate nucleus) modulates the selection and inhibition of interfering responses via anatomical connections to (pre-) frontal and motor cortical areas (Mink, 1996; Utter and Basso, 2008), thereby potentially contributing to response interference control.

Indeed, clinical observations and studies in patients with neurodegenerative diseases affecting the basal ganglia, such as Parkinson's disease (PD) and Huntington's disease (HD), point to a relevant role of the striatum in the control of response interference (Seiss and Praamstra, 2004; Nelson and Kreitzer, 2014). However, previous studies on response interference control in PD and HD patients using the Simon task are far from conclusive: while some studies reported impairments in resolving response interference in PD or HD patients (Georgiou et al., 1995; Praamstra and Plat, 2001; Fielding et al., 2005; Wylie et al., 2010), others did not find significant differences between patient and control groups (Brown et al., 1993; Cope et al., 1996; Georgiou-Karistianis et al., 2007; Schmiedt-Fehr et al., 2007). These equivocal findings might be due to several experimental variables, including clinical characteristics of the examined PD and HD patients as well as differences in task design and procedures (for a detailed discussion of the potential impact of these diverse factors, please refer to Falkenstein et al., 2006; Wylie et al., 2009).

Given the abundant anatomical interconnections between the striatum and (pre-) frontal regions, control of response interference may depend on the integrity of fronto-striatal networks (Liston et al., 2006; Aron et al., 2007; Wiecki and Frank, 2013). Note that the functional (and later structural) alterations in neurodegenerative diseases are not limited to the striatum but extend into (pre-) frontal areas (Reading et al., 2004; Selemon et al., 2004). Therefore, impairments in response interference control in PD and HD patients may reflect abnormal function of the striatum, the (pre-) frontal cortex, or both (Caligiore et al., 2016).

The few available group studies on the impact of stroke-induced striatal lesions on cognitive control processes have revealed deficits in (global) response inhibition using a stop-signal task (Rieger et al., 2003) as well as in cognitive flexibility during task switching (Cools et al., 2006; Yehene et al., 2008). Other studies investigating cognitive control processes after striatal lesions have reported (non-specific) deficits in cognitive control using standard neuropsychological tests (Hochstenbach et al., 1998; Ward et al., 2013). Moreover, single case studies revealed (specific) impairments in response selection, interference control, or cognitive flexibility (Dubois et al., 1995; Swainson and Robbins, 2001; Benke et al., 2003; Rainville et al., 2003). Finally, neuroimaging studies in healthy subjects employing (variants of) the Simon task also revealed involvement of the striatum in controlling response interference (Peterson et al., 2002; Wittfoth et al., 2009).

Accordingly, the present study aimed at further investigating the putative role of the striatum in the control of response interference in 23 patients with unilateral stroke-induced striatal lesions by using the Simon task. This task was chosen because it was commonly adopted in previous studies investigating response interference control processes in neurological patients (e.g., Georgiou-Karistianis et al., 2007; Wylie et al., 2010) and has been shown to engage the striatum (e.g., Peterson et al., 2002; Stocco et al., 2017). The overall Simon effect was used to infer the general ability to control response interference, with larger RT differences (i.e., larger Simon effects) being associated with impaired control of response interference. Further distributional analyses of RTs assessed the efficiency of the selective inhibition process engaged in resolving response interference. Here, a decrease in the magnitude of the Simon effects across the RT distribution reflects efficient selective inhibition of interfering responses (Ridderinkhof, 2002). To the best of our knowledge, no stroke lesion study has yet investigated the specific role of the striatum with respect to the general ability and efficiency of response interference control as measured by the Simon task. If integrity of the striatum is required for the control of response interference, unilateral striatal lesions might impair performance in the Simon task. In particular, when compared to healthy control subjects, stroke patients with striatal involvement could exhibit: (i) an overall larger Simon effect (indicating impaired control of response interference) and/or (ii) less reduction in the Simon effect across the RT distribution (indicating less efficient selective inhibition).

# MATERIALS AND METHODS

#### Participants

A total of 43 patients with first-ever ischemic or hemorrhagic stroke affecting the left or the right hemisphere were consecutively recruited from the Department of Neurology, University Hospital Cologne (n = 21) and the Neurological Rehabilitation Centre Godeshöhe, Bonn (n = 22). Inclusion criteria were right-handedness, age between 18 and 90 years, no other neurological disorders, no current or previous psychiatric diseases, and no current or past substance abuse or dependence.

Stroke lesions were identified based on clinical imaging data (computed tomography (CT): n = 9, magnetic resonance imaging (MRI): n = 34). All lesions were mapped by drawing the lesions manually in steps of 5 mm on axial slices of a T1-weighted template brain (ch2.nii) provided by MRIcron<sup>1</sup>. Lesion mapping was performed by DCT and consecutively checked by AD. Both examiners had to jointly agree upon the exact lesion location and extent and were blind to the individual patient's task performance at the time of lesion mapping (for further methodological descriptions see also Timpert et al., 2015).

The aim of the current study (i.e., to investigate the role of the striatum in the control of response interference) required that the patients' unilateral stroke involved the striatum (putamen and/or caudate nucleus). Involvement of the striatum was verified using a mask of the striatum (putamen and caudate nucleus, see **Supplementary Figure S1**) derived from the Harvard-Oxford atlas of cortical and subcortical structures provided by the Harvard Center for Morphometric Analysis<sup>2</sup> and distributed with FSL<sup>3</sup>. Accordingly, 19 patients were excluded after enrolment because their clinical imaging did not reveal any striatal involvement. A 20th patient was excluded due to chance-level performance in the Simon task.

Thus, a total of 23 stroke patients (12 female), including nine patients with left hemispheric (LH) and 14 patients with right hemispheric (RH) stroke were included in the subsequent analyses. Twenty patients suffered from an ischemic, three from a hemorrhagic stroke. The mean age was 57.1 years (SD = 15.1 years, range 25–84 years). All patients were examined during the sub-acute (i.e., >24 h post-stroke; Hillis et al., 2002) or chronic stage of their disease. The mean time interval between stroke onset and experimental assessment was 76.3 days (SD = 30.5 days, range 2–514 days). The time interval did not differ significantly between LH and RH stroke patients (t(21) = −0.35, p = 0.731). Furthermore, there were no significant correlations between days since onset of stroke and any of the key measures in the Simon task (e.g., overall mean reaction time, mean Simon effects, total error rate; all p-values >0.278).

The lesion overlay plot of the 23 stroke patients with striatal involvement is shown in **Figure 1**. In all patients, the territory of the middle cerebral artery was involved. Consistent with the inclusion criteria, the highest lesion overlap was observed within the putamen and the head of the caudate nucleus. In some patients the lesions also extended to adjacent subcortical (central white matter tracts, globus pallidus, thalamus) and cortical regions (insula, frontal cortex). There was no significant difference between LH and RH stroke patients concerning lesion size (t(21) = −1.56, p = 0.134), and no significant correlation between number of lesioned voxels and any of the key Simon task measures (all p-values > 0.327).

Consistent with the predominantly subcortical lesion pattern, neuropsychological deficits (e.g., neglect, apraxia, aphasia, executive dysfunction) were mild in the current patient sample. The 23 stroke patients did not suffer from unspecific cognitive decline, since all patients performed above the cut-off of 24 out of 30 points of the Mini-Mental Status Examination (MMSE; Folstein et al., 1975). None of the patients showed relevant signs of neglect, apraxia, or aphasia according to the Neglect Test (NET; Fels and Geissner, 1996, i.e., the German version of the Behavioral Inattention Test (BIT), Wilson et al., 1987), the Cologne Apraxia Screening (KAS; Weiss et al., 2013), or the short version of the Aphasia Check List (ACL-K; Kalbe et al., 2002), respectively. There was mild (to moderate) impairment of executive functioning as assessed with the Trail Making Test (TMT; Reitan and Wolfson, 1993) and the Stroop Color-Word Interference Test (SCWT; Bäumler, 1984).

<sup>1</sup>http://people.cas.sc.edu/rorden/mricron/index.html

<sup>2</sup>http://www.nmr.mgh.harvard.edu/

<sup>3</sup>https://fsl.fmrib.ox.ac.uk/fsl/

FIGURE 1 | Lesion overlay plot of the 23 stroke patients with striatal involvement. Axial slices with MNI z-coordinates ranging from −17 to 28 are shown. The striatum (putamen and caudate nucleus) is visible in the axial slices with the MNI z-coordinates ranging from −12 to 23 (see Supplementary Figure S1 for a mask of the striatum). Axial slices with the MNI z-coordinates 8 and 13 indicating the highest lesion overlap within the putamen and the head of the caudate nucleus are highlighted. The figure was generated using the freely available MRIcron software package (Rorden and Brett, 2000).

With respect to clinical scores, all patients were rated as mildly to moderately disabled according to the modified Rankin Scale (Rankin, 1957). On average, stroke patients exhibited mild to moderate paresis of the contralesional hand and arm as assessed by the Medical Research Council (MRC) paresis scale (Medical Research Council, 1976). There were no significant differences between LH and RH stroke patients concerning any of the above-mentioned clinical characteristics (all p-values > 0.091). The neuropsychological and clinical data of the 23 stroke patients are summarized in **Table 1**.

Thirty-two neurologically healthy subjects served as controls and were matched to the patient group with respect to age (M = 54.6 years, SD = 7.7 years, range 42–70 years; t(53) = −0.79, p = 0.432), right-handedness (i.e., laterality quotient (LQ) of the Edinburgh Handedness Inventory, Oldfield, 1971; M = 91.0, SD = 14.9; t(50) = −1.63, p = 0.110), and gender (16 female; χ<sup>2</sup>(1) = 0.03, p = 0.874). None of the control subjects had a history of neurological or psychiatric disease or of substance abuse or dependence. Moreover, both the stroke patients and the healthy controls constituted a representative sample in terms of education level and occupational profile. All participants had normal or corrected-to-normal visual acuity.

The study had been approved by the local Ethics Committee of the Medical Faculty of the University of Cologne and was performed in accordance with the ethical principles of the World Medical Association (Declaration of Helsinki; revised version, October, 2013). All participants provided written informed consent to participate in the study. The stroke patients additionally gave consent for using their clinical imaging data for lesion mapping.

#### Apparatus and Stimuli

The experiment was programmed in Presentation<sup>®</sup> (Neurobehavioral Systems, Inc.) and presented on a 15-inch computer screen.

As in the unimanual Simon task used by Arend et al. (2016), the peripheral stimuli consisted of a flow field of green moving dots against a black background. A white fixation cross was centrally displayed. Two patches of flow fields of upward and downward motion served as the task-relevant stimulus feature. The flow field density was set at 0.0075 dots/pixel<sup>2</sup>. The dots were randomly distributed within a square subtending 4° × 4° of visual angle. The boundaries of the square were not visible to the participants. The dots moved at a speed of 45 pixels per second. The patches of moving dots were displayed at one of two locations positioned such that their boundaries were 4° left or right of the fixation cross. The design and the timing of the experimental task are illustrated in **Figure 2**.


Means, standard deviations (SD) and score ranges are provided. MMSE = Mini-Mental Status Examination (maximum score 30 points; cut-off ≤24 points). BIT = Behavioral Inattention Test (neglect-specific cut-off criteria as defined in Eschenbeck et al. (2010): BIT line bisection test: line bisection score ≤7 points; BIT star cancellation test: laterality index (LI) ≤ −0.2, with LI = (hits left − hits right)/(hits left + hits right); BIT text reading test: neglect of either one left paragraph or of the left words in at least two different lines of a newspaper article arranged in three columns each consisting of two paragraphs (total words: 140); none of the patients showed signs of neglect). KAS = Cologne Apraxia Screening (maximum score 80 points; cut-off ≤76 points). ACL-K = Aphasia Check List-short version (maximum score 40 points; cut-off <33 points; mild (26–32 points), moderate (15–25 points), severe (0–14 points) language impairment). TMT = Trail Making Test (the ratio score Part B/Part A was corrected for age and transformed into z-values; the mean ratio score of 2.8 for the stroke patients corresponds to a z-value of −0.6). SCWT = Stroop Color-Word Interference Test (the time in seconds in the interference condition was corrected for age and transformed into T-values; the mean time in the interference condition of 116.5 s corresponds to a T-value of 47). Rankin scale (grades: 0 = No symptoms at all; 1 = No significant disability despite symptoms; 2 = Slight disability; 3 = Moderate disability; 4 = Moderately severe disability; 5 = Severe disability). MRC = Medical Research Council rating scale for assessing paresis (grades: 0 = No contraction; 1 = Flicker or trace of contraction; 2 = Active movement with gravity eliminated; 3 = Active movement against gravity; 4 = Active movement against resistance; 5 = Normal strength). <sup>a</sup>n = 22; <sup>b</sup>n = 21; <sup>c</sup>n = 20.

Participants were required to give left or right responses based on the motion direction of the moving dots (see below). Consequently, two trial types were defined by the correspondence between the spatial location at which the stimulus appeared and the side of response signaled by the motion direction of the stimulus. For congruent trials, the spatial location of the stimulus matched the side of response (e.g., an upward motion calling for a left response was presented on the left side of the fixation cross). For incongruent trials, there was a mismatch between the spatial location of the stimulus and the side of response (e.g., an upward motion calling for a left response was presented on the right side of the fixation cross).
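The trial classification described above can be sketched in a few lines of Python. This is an illustration only, not the authors' Presentation script; the function names and the `up_means_left` flag (standing for one of the two counterbalanced motion-to-response mappings described in the Procedure) are hypothetical.

```python
def required_response(motion: str, up_means_left: bool) -> str:
    """Return the response side ('left' or 'right') signaled by the motion direction."""
    if motion not in ("up", "down"):
        raise ValueError("motion must be 'up' or 'down'")
    if up_means_left:
        return "left" if motion == "up" else "right"
    return "right" if motion == "up" else "left"


def congruency(stimulus_side: str, motion: str, up_means_left: bool) -> str:
    """Label a trial congruent when stimulus side and required response side match."""
    if stimulus_side not in ("left", "right"):
        raise ValueError("stimulus_side must be 'left' or 'right'")
    response_side = required_response(motion, up_means_left)
    return "congruent" if stimulus_side == response_side else "incongruent"
```

With upward motion mapped to a left response, `congruency("left", "up", True)` corresponds to the congruent example given in the text, and `congruency("right", "up", True)` to the incongruent one.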

#### Procedure

Each participant was tested in a single session. The stroke patients were tested at the Department of Neurology of the University Hospital Cologne or at the Neurological Rehabilitation Centre Godeshöhe, Bonn. The healthy control subjects were tested at the Institute of Neuroscience and Medicine, Research Centre Jülich. All participants gave informed written consent and were comfortably seated at a table with the testing material and the computer screen for task presentation in front of them. Before the experimental task, stroke patients performed the above-described set of neuropsychological and clinical tests.

To avoid any confounding effects of contralesional paresis, stroke patients were instructed to respond with their ipsilesional hand. Accordingly, the responding hand is equivalent to the damaged hemisphere (left-hand response: n = 9, right-hand response: n = 14). Healthy control subjects were randomly assigned to respond either with their left (n = 17) or right hand (n = 15) to match stroke patients regarding the responding hand set-up (χ<sup>2</sup>(1) = 1.05, p = 0.305). The responding hand was always positioned to the right side (for the right-hand response group) or to the left side (for the left-hand response group) of the computer screen and thus the body midline.
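For readers who wish to retrace the group-matching statistics, the following sketch computes a Pearson chi-square test for a 2 × 2 table from the counts given in the text. It assumes that no continuity correction was applied, which is an assumption on our part (the analyses themselves were run in SPSS); under that assumption the values agree with those reported for responding hand and for gender. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Responding hand: patients 9 left / 14 right, controls 17 left / 15 right.
hand_chi2, hand_p = chi_square_2x2(9, 14, 17, 15)
# Gender: patients 12 female / 11 male, controls 16 female / 16 male.
gender_chi2, gender_p = chi_square_2x2(12, 11, 16, 16)
```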

For the Simon task, participants were instructed to give left or right responses corresponding to the motion direction of the stimulus on the screen, irrespective of the location (i.e., side of the fixation cross) at which the stimulus appeared. The stroke patients used a standard computer mouse to respond, whereas the healthy controls used a LUMItouch response keypad. For stimulus-response compatibility in the current unimanual version of the Simon task, left and right responses were mapped to the index and middle fingers of the same hand, either the left or the right (Heister et al., 1987). In other words, for the participants responding with their right hand, left responses were given with the index finger and right responses were given with the middle finger. Accordingly, participants responding with their left hand gave left responses with the middle finger and right responses with the index finger.

The mapping between motion direction of the stimulus (downward and upward) and side of response (left and right) was counterbalanced across participants. For half of the participants, upward motion was mapped to a left response (i.e., middle finger of the left hand or index finger of the right hand) and downward motion was mapped to a right response (i.e., index finger of the left hand or middle finger of the right hand). For the other half of the participants, upward motion was mapped to a right response and downward motion was mapped to a left response.
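The finger assignment follows a simple rule that can be stated compactly: with the hand resting palm-down, the index finger lies on the side opposite to the hand's own side. A minimal sketch, with a hypothetical function name, is:

```python
def responding_finger(hand: str, response_side: str) -> str:
    """Return the finger ('index' or 'middle') that gives a left or right response."""
    valid = ("left", "right")
    if hand not in valid or response_side not in valid:
        raise ValueError("hand and response_side must be 'left' or 'right'")
    # Right hand: the index finger is the left key; left hand: it is the right key.
    return "index" if response_side != hand else "middle"
```

For example, `responding_finger("left", "left")` yields the middle finger, matching the mapping described for participants who responded with their left hand.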

Participants were required to respond as quickly and as accurately as possible. Response time (RT) in milliseconds was measured by the computer from stimulus presentation until the participant's response. Errors, i.e., false responses, were also automatically recorded.

Each trial started with a central fixation cross that remained present throughout the task. The fixation cross subtended about 0.07° × 0.07° of visual angle and was presented for a variable period of 800, 1,200, 1,600, or 2,000 ms, after which a change in size (about 0.05° × 0.05° of visual angle) for 500 ms signaled the start of the trial. One of the two patches of moving dots (upward or downward) was then presented randomly either at the left or at the right side of the fixation cross and remained visible until participants made a response. If no response was given, the trial ended 2,000 ms after stimulus onset. After a variable inter-trial interval (ITI) of 600, 800, or 1,000 ms, the next trial started (see **Figure 2**).

Following a practice block of 20 trials for which participants received feedback ("Correct," "Wrong," or "No response"), each participant completed two experimental blocks, separated by a short break. Each experimental block contained 80 trials (i.e., 40 congruent trials and 40 incongruent trials) mixed within the block, resulting in a total of 160 experimental trials. The experiment took approximately 10–12 min to complete.
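The block structure and trial timing described above could be generated along the following lines. This is a sketch under stated assumptions rather than the original experiment code: the text specifies 40 congruent and 40 incongruent trials per block and the sets of possible fixation and ITI durations, whereas the equal split across stimulus sides and the uniform sampling of durations are assumptions made here. All names are hypothetical.

```python
import random

FIXATION_MS = (800, 1200, 1600, 2000)  # variable fixation period before the cue
CUE_MS = 500                            # change in fixation-cross size signaling trial start
MAX_RESPONSE_MS = 2000                  # trial ends if no response is given by then
ITI_MS = (600, 800, 1000)               # variable inter-trial interval


def make_block(n_trials=80, seed=None):
    """Return one shuffled block with equal numbers of congruent and incongruent trials."""
    if n_trials % 4 != 0:
        raise ValueError("n_trials must be divisible by 4 to balance side and congruency")
    rng = random.Random(seed)
    trials = []
    for stimulus_side in ("left", "right"):      # assumption: sides balanced within block
        for congruent in (True, False):
            for _ in range(n_trials // 4):
                trials.append({
                    "stimulus_side": stimulus_side,
                    "congruent": congruent,
                    "fixation_ms": rng.choice(FIXATION_MS),  # assumption: uniform sampling
                    "cue_ms": CUE_MS,
                    "max_response_ms": MAX_RESPONSE_MS,
                    "iti_ms": rng.choice(ITI_MS),            # assumption: uniform sampling
                })
    rng.shuffle(trials)
    return trials


# Two experimental blocks of 80 trials each, 160 trials in total.
experiment = [make_block(seed=block_index) for block_index in range(2)]
```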

#### Statistical Analysis

Statistical analyses were conducted using the software IBM SPSS Statistics (Statistical Package for the Social Sciences, Version 22, SPSS Inc., Chicago, IL, USA).

Independent-samples t-tests and chi-square analyses (for nominal variables) were used to compare means of demographic and clinical data between LH and RH stroke patients and between stroke patients and healthy controls. Repeated measures analyses of variance (ANOVAs) were used to analyze error rate and mean response times (RTs), separately. The error rate was calculated as the proportion of erroneous responses to the total number of trials, as a function of stimulus location and congruency. Error trials as well as RTs exceeding two standard deviations above or below the individual's mean were discarded prior to the analysis to reduce skewness and prevent extreme RTs from influencing the mean of each participant (Ratcliff, 1993). The trimmed mean RTs for correct responses were then calculated as a function of stimulus location and congruency, irrespective of the preceding trial congruency.
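The RT preprocessing described above can be illustrated with a short sketch. It assumes that the two-standard-deviation criterion is applied to all correct RTs of a participant taken together and that the sample standard deviation is used; the text does not specify either detail, so both are assumptions. The function name and the tuple layout of `trials` are hypothetical.

```python
from statistics import mean, stdev


def simon_effect(trials, n_sd=2.0):
    """Trimmed mean RT difference (incongruent minus congruent) for one participant.

    trials: iterable of (rt_ms, congruent, correct) tuples.
    """
    # Discard error trials first.
    correct_trials = [(rt, congruent) for rt, congruent, ok in trials if ok]
    rts = [rt for rt, _ in correct_trials]
    # Discard RTs more than n_sd standard deviations from the individual mean.
    centre, spread = mean(rts), stdev(rts)
    kept = [(rt, congruent) for rt, congruent in correct_trials
            if abs(rt - centre) <= n_sd * spread]
    congruent_rts = [rt for rt, congruent in kept if congruent]
    incongruent_rts = [rt for rt, congruent in kept if not congruent]
    return mean(incongruent_rts) - mean(congruent_rts)
```

In the study, the trimmed means were additionally computed separately for each stimulus location; the sketch collapses across location for brevity.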

Kolmogorov-Smirnov tests (used to test for normality) on mean RTs of the stroke patients (D(23) = 0.15, p = 0.191) and the healthy controls (D(32) = 0.12, p = 0.200) were not significant, indicating that the RT data did not deviate significantly from a normal distribution in either group. However, Levene's test (used to test for homogeneity of variances between groups) on mean RTs was significant (F(1,53) = 16.75, p < 0.001), indicating that the variances of RTs were significantly different for the stroke patients and the healthy controls. Earlier work suggests that the F-test is robust even when its assumptions are violated (Glass et al., 1972; Games, 1984). Accordingly, parametric tests were used. Nonetheless, to confirm the reliability of the results with distribution-free statistics, non-parametric (post hoc) tests of the main results were additionally conducted.

The initial ANOVAs included responding hand/lesioned hemisphere as a between-subjects factor. However, there was no significant main effect of responding hand/lesioned hemisphere on error rate (F(1,51) = 0.00, p = 0.988) or mean RT (F(1,51) = 0.13, p = 0.717), and no interaction effect between responding hand/lesioned hemisphere and congruency or group approached statistical significance (all p-values > 0.180). Therefore, error rate and mean RT data were collapsed across responding hand/lesioned hemisphere for all further analyses. Thus, the final ANOVAs evaluated the effect of group (stroke patients vs. control subjects) as between-subject factor and stimulus location (ipsilesional/-lateral hemifield, contralesional/-lateral hemifield) and stimulus-response congruency (congruent, incongruent) as within-subject factors.

To further investigate the possible difference in the magnitude of the (overall) Simon effect between stroke patients and healthy controls, an additional Bayesian independent samples t-test (nondirectional, Cauchy prior = 0.707) was computed in JASP (version 0.8.1.1), a freely available statistical software (Rouder et al., 2009). The Bayes factor comparing the null hypothesis (H0) against the alternative hypothesis (H1; B<sub>01</sub>) is reported. The Bayes factor B<sub>01</sub> reflects the evidence for H0 (i.e., the Simon effect is not different/is similar in the two groups) compared to H1 (i.e., the Simon effect is different in the two groups).

To assess post-error behavioral adjustments, the difference in mean RTs between (correct) post-error trials and all correct trials (both post-error and post-correct) was compared between the stroke patients and the healthy controls using an independent-samples t-test.
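The post-error measure defined above can be sketched as follows. The sketch assumes that trials are supplied in presentation order and ignores block boundaries and the handling of missed responses, which the text does not specify. The function name is hypothetical.

```python
from statistics import mean


def post_error_adjustment(rts, correct):
    """Mean RT of correct trials following an error minus mean RT of all correct trials.

    rts and correct are parallel sequences in presentation order.
    Returns None when a participant has no correct post-error trial.
    """
    all_correct = [rt for rt, ok in zip(rts, correct) if ok]
    post_error = [rts[i] for i in range(1, len(rts))
                  if correct[i] and not correct[i - 1]]
    if not all_correct or not post_error:
        return None
    return mean(post_error) - mean(all_correct)
```

Positive values indicate slower responding after an error (post-error slowing).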

An analysis of the RT distributions was performed to determine the efficiency of selective response inhibition, based on the activation-suppression hypothesis (Ridderinkhof, 2002). For this purpose, correct RTs for each participant were rank-ordered separately for congruent and incongruent trials. Unfortunately, raw RT data were lost for one stroke patient. Each RT distribution was partitioned into four quantile bins of roughly equal size<sup>4</sup> in each participant, and mean RTs were computed for each of the quartiles in each condition (i.e., congruent and incongruent). The Simon effect for each quartile was then obtained as the difference in mean RT for the incongruent and congruent conditions, and, averaged across subjects, plotted against the mean quartile RT (Vincentizing procedure; Ratcliff, 1979). A repeated measures ANOVA with stimulus-response congruency (congruent, incongruent) and quartile (Q1, Q2, Q3, Q4) as within-subject factors and group (stroke patients vs. control subjects) as between-subject factor was used to analyze the mean RTs in incongruent and congruent conditions as a function of response latency. To further examine a significant interaction effect with the factor group, separate ANOVAs were conducted for each group. Polynomial contrasts were then used to test for the trend in the RT difference between incongruent and congruent conditions across the RT distribution.
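The quartile-based computation of the Simon effect can be sketched for a single participant as below. This is an illustrative sketch under stated assumptions: the function names are hypothetical, bins are formed by integer division of the rank-ordered RTs, and the x-value of each point is taken as the average of the congruent and incongruent bin means. Each condition needs at least as many correct trials as bins.

```python
from statistics import mean

def quantile_bins(rts, n_bins=4):
    """Rank-order RTs and split them into n_bins bins of roughly equal size."""
    ordered = sorted(rts)
    n = len(ordered)
    # Integer edges keep bin sizes within one trial of each other.
    edges = [i * n // n_bins for i in range(n_bins + 1)]
    return [ordered[edges[i]:edges[i + 1]] for i in range(n_bins)]

def delta_plot(congruent_rts, incongruent_rts, n_bins=4):
    """Return one (mean bin RT, Simon effect) pair per quantile bin."""
    points = []
    for con, inc in zip(quantile_bins(congruent_rts, n_bins),
                        quantile_bins(incongruent_rts, n_bins)):
        con_mean, inc_mean = mean(con), mean(inc)
        # Simon effect in this bin: incongruent minus congruent mean RT.
        points.append(((con_mean + inc_mean) / 2, inc_mean - con_mean))
    return points

# Toy correct-trial RTs (ms) for one hypothetical participant.
congruent = [400, 420, 440, 460, 480, 500, 520, 540]
incongruent = [470, 480, 490, 500, 510, 520, 530, 540]
points = delta_plot(congruent, incongruent)
```

In this toy example the Simon effect shrinks from the fastest to the slowest bin, which is the pattern the activation-suppression hypothesis associates with efficient selective inhibition. Averaging the per-bin values across participants gives the group-level plot.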

For all statistical analyses, a level of p < 0.05 was considered significant (with Bonferroni or Greenhouse-Geisser corrections, if applicable).

# RESULTS

#### Error Rate

Analysis of error rates revealed a significant main effect of congruency (F(1,53) = 42.38, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.44), reflecting that incongruent trials evoked overall more errors than congruent trials (6.8% vs. 2.9%). Moreover, the groups differed in overall error rate (F(1,53) = 6.73, p = 0.012, η<sub>p</sub><sup>2</sup> = 0.11), with stroke patients making more errors compared to healthy controls (6.5% vs. 3.5%).

Importantly, the difference in error rate between incongruent and congruent conditions did not differ significantly between the groups (interaction group × congruency: F(1,53) = 0.19, p = 0.665, η<sub>p</sub><sup>2</sup> = 0.004).
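The reported effect sizes can be cross-checked against the F statistics using the standard identity for partial eta squared, F·df_effect / (F·df_effect + df_error). The helper below is a small sketch with a hypothetical function name. It is not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F values reported for the error-rate ANOVA, each with df = (1, 53).
reported = {"congruency": 42.38, "group": 6.73, "group x congruency": 0.19}
effect_sizes = {
    label: partial_eta_squared(f, 1, 53) for label, f in reported.items()
}
```

Rounded, these values agree with the effect sizes given in the text (0.44, 0.11, and 0.004).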

#### Response Times

Mean RTs for the stroke patients and healthy control subjects are presented in **Figure 3**.

There was a significant main effect of congruency (F(1,53) = 79.70, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.60): RTs for incongruent trials were longer than for congruent trials (772 ms vs. 712 ms), revealing the typical Simon effect. The main effect of group also reached significance (F(1,53) = 28.93, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.35), reflecting that stroke patients responded overall more slowly than healthy controls (910 ms vs. 622 ms). Considering the higher error rate in the stroke patients, their overall response slowing probably did not reflect a speed-accuracy trade-off.

Importantly, the RT difference between incongruent and congruent conditions did not differ significantly between the stroke patients and the healthy controls (interaction group × congruency: F(1,53) = 0.16, p = 0.691, η<sub>p</sub><sup>2</sup> = 0.003). Indeed, this difference (i.e., the magnitude of the Simon effect) was similar in the two groups (63 ms for stroke patients; 58 ms for healthy controls). The similar magnitude of the Simon effect in the patient and control groups was also evident when mean RTs for congruent and incongruent conditions were adjusted for overall response latencies for each participant by means of proportion transformation (Faust et al., 1999; i.e., after taking into account overall group response speed differences; interaction group × congruency: F(1,53) = 0.58, p = 0.450, η<sub>p</sub><sup>2</sup> = 0.01). Note that there was no significant difference in the magnitude of the Simon effect between the LH and RH stroke patients (64 ms for LH stroke patients, 62 ms for RH stroke patients; t(21) = 0.08, p = 0.937, d = 0.03).

The Bayesian independent samples t-test on the magnitude of the (overall) Simon effect resulted in an estimated Bayes factor BF<sub>01</sub> of 3.4, indicating substantial evidence for H0 (i.e., the hypothesis that the Simon effect is not different/is similar in the two groups).
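One way to read a Bayes factor of this size is to convert it into a posterior probability for H0. The sketch below assumes equal prior odds for H0 and H1, which the source does not state. The function name is hypothetical.

```python
def posterior_prob_h0(bf01, prior_prob_h0=0.5):
    """Convert a Bayes factor BF01 into a posterior probability for H0."""
    prior_odds = prior_prob_h0 / (1.0 - prior_prob_h0)
    posterior_odds = bf01 * prior_odds  # Bayes' rule in odds form
    return posterior_odds / (1.0 + posterior_odds)

# Reported Bayes factor, with the equal-prior-odds assumption noted above.
p_h0 = posterior_prob_h0(3.4)
bf10 = 1.0 / 3.4  # the same evidence expressed in favour of H1
```

Under that assumption, the data shift the probability of H0 from 0.50 to about 0.77. Equivalently, the data are 3.4 times more likely under H0 than under H1.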

Furthermore, there was a significant interaction effect of stimulus location and congruency (F(1,53) = 7.42, p = 0.009, η<sub>p</sub><sup>2</sup> = 0.12). Planned comparisons revealed that RTs for incongruent trials were significantly longer in the contralesional/-lateral hemifield than in the ipsilesional/-lateral hemifield (791 ms vs. 754 ms; t(54) = 2.91, p = 0.005, d = 0.79); whereas RTs for congruent trials did not show a significant difference between hemifields (704 ms vs. 721 ms; t(54) = −1.45, p = 0.153, d = 0.40). Results indicated an asymmetry of the Simon effect that was smaller in the ipsilesional/-lateral hemifield (i.e., on the side of the responding hand; see **Figure 3**). Note that there was no significant difference of this asymmetry of the Simon effect between the stroke patients and the healthy controls (interaction group × stimulus location × congruency: F(1,53) = 1.96, p = 0.167, η<sub>p</sub><sup>2</sup> = 0.04).

<sup>4</sup>The number of bins was chosen to provide a reasonable estimate of bin values of about 20 trials for congruent and incongruent conditions per quartile (Ratcliff, 1979).

patients (triangles, solid lines) and the healthy controls (squares, dashed lines). Furthermore, there was an asymmetry of the Simon effect in both groups with a more pronounced Simon effect in the contralesional/-lateral hemifield (compared to the ipsilesional/-lateral hemifield). Error bars indicate standard error of the mean (SEM).

There was a significant post-error slowing in both stroke patients and healthy controls, which did not differ significantly between groups (77 ms for stroke patients, 100 ms for healthy controls; t(48) = 0.77, p = 0.445, d = 0.22). Thus, the stroke patients and the healthy controls similarly adjusted their behavior after an error had occurred.

The distributional analysis of RTs revealed a significant congruency × quartile × group interaction effect (F(1.404,73.024) = 4.34, p = 0.028, η<sub>p</sub><sup>2</sup> = 0.08; see **Figure 4**). Polynomial contrasts for each group separately showed that for the healthy controls the difference in RTs between incongruent and congruent conditions significantly decreased as RTs increased (interaction congruency × quartile: F(1,31) = 5.82, p = 0.022, η<sub>p</sub><sup>2</sup> = 0.16 for the linear trend). In contrast, the RT difference between incongruent and congruent conditions did not significantly differ across the RT distribution for the stroke patients (interaction congruency × quartile: F(1,21) = 1.47, p = 0.238, η<sub>p</sub><sup>2</sup> = 0.07 for the linear trend). Note that there was no significant difference between the LH and RH stroke patients concerning the course of the Simon effect as a function of response latency (t(20) = 0.11, p = 0.913, d = 0.05). According to the activation-suppression hypothesis (De Jong et al., 1994), the stable difference in RTs between incongruent and congruent conditions (i.e., the stable Simon effect) across the RT distribution indicated less efficient selective inhibition in the patients with stroke-induced lesions of the striatum (in contrast to healthy controls).

Notably, the pattern of results could be replicated with non-parametric tests.

# DISCUSSION

The aim of the present study was to investigate the putative contribution of the striatum (putamen and caudate nucleus) to the control of response interference. For that purpose, patients with unilateral striatal lesions (caused by stroke) and age-matched healthy controls performed a unimanual version of the Simon task. The magnitude of the Simon effect (reaction time difference between incongruent and congruent conditions) reflected the general ability to control response interference and, in combination with an analysis of the RT distributions, the efficiency of selective inhibition of interfering responses.

Consistent with previous studies that successfully used unimanual Simon tasks (Heister et al., 1987; Wiegand and Wascher, 2007; Arend et al., 2016), both stroke patients and healthy controls exhibited a significant Simon effect. Most importantly, the magnitude of the Simon effect did not differ significantly between the stroke patients and the healthy controls (63 ms for stroke patients; 58 ms for healthy controls), even after taking into account the differences in overall response latencies between the two groups. Thus, stroke patients, despite their unilateral lesions of the striatum, showed a similar ability to control response interference as healthy subjects, independent of the lesioned hemisphere. However, by considering the temporal dynamics of the processes underlying response interference control, stroke patients showed less efficient selective inhibition of interfering responses compared to healthy controls, independent of the lesioned hemisphere.

At first glance, the preserved Simon effect in the stroke patients with unilateral lesions of the striatum may contrast with previous clinical studies revealing reduced control of response interference in a Simon task in patients suffering from neurodegenerative diseases involving the striatum (i.e., PD, or HD; Georgiou et al., 1995; Praamstra and Plat, 2001; Fielding et al., 2005). On the other hand, the current results are in line with other clinical studies that used the Simon task in patients with PD and HD and showed that the control of response interference was preserved in these patients despite (fronto-) striatal neurodegeneration (Brown et al., 1993; Cope et al., 1996; Georgiou-Karistianis et al., 2007; Schmiedt-Fehr et al., 2007). Moreover, a previous clinical study in PD patients likewise reported preserved response interference control but reduced efficiency of selective inhibition in a Simon task by applying distributional analyses (Wylie et al., 2010).

The current results of the distributional analysis in the stroke patients with unilateral striatal lesions are consistent with the assumption that less efficient selective inhibition would manifest in a stable or rather increasing Simon effect across the RT distribution. This assumption is based upon both theoretical frameworks of the selective inhibitory process (and its interpretation) in the Simon task (Ridderinkhof, 2002) and many (Ridderinkhof et al., 2005; Castel et al., 2007; Juncos-Rabadán et al., 2008) albeit not all previous patient studies (Wylie et al., 2010).

The most parsimonious explanation for the preserved (general) ability to control response interference in the current sample of stroke patients with unilateral striatal lesions is that the functions of the lesioned striatum were compensated for by the contralesional striatum and/or by (frontal) cortical regions. This notion is corroborated by imaging studies in healthy participants showing that the Simon task activated the striatum bilaterally in addition to frontal areas, including anterior cingulate and lateral prefrontal cortices (Nee et al., 2007; Zhang et al., 2017).

With respect to previous studies in stroke patients with unilateral striatal lesions, there are currently only three studies available that investigated cognitive control processes, namely cognitive flexibility (Cools et al., 2006; Yehene et al., 2008) and response inhibition (Rieger et al., 2003). Using task switching and the stop-signal task, these studies revealed impaired flexible and inhibitory control functions in patients suffering from unilateral strokes involving the striatum, while the current study revealed no relevant deficit in another cognitive control function, namely the general ability to control response interference. These apparently divergent results may depend on specific task demands. While sharing a common need to control prepotent response tendencies (Aron, 2011), the cognitive control process assessed by the Simon task (i.e., response interference control) is subtly different from that required for (global) response inhibition (assessed by stop-signal or go/no-go tasks; Egner et al., 2007) or for cognitive flexibility (assessed by task-switching or set-shifting tasks; Diamond, 2013).

One could argue that the sample size of the current study may have been too small to reliably detect deficits in the (general) ability to control response interference in stroke patients with striatal lesions. However, note that the Bayesian statistics revealed a Bayes factor BF<sub>01</sub> of 3.4, indicating substantial evidence for the hypothesis H0 that the (overall) Simon effect is not different between the patient and control groups. Moreover, the sample sizes in the studies that reported impaired response inhibition (Rieger et al., 2003) and cognitive flexibility (Cools et al., 2006; Yehene et al., 2008) in stroke patients with striatal lesions were clearly smaller (6–8 patients) than the current patient sample size (n = 23).

Taken together, these previous findings and our current results suggest a specific role of the striatum in cognitive control processes, namely in (global) response inhibition (and cognitive flexibility) as well as in the efficiency of selective inhibition of interfering responses (engaged in response interference control). The above-mentioned studies and the current study also indicate that it is important to precisely characterize the cognitive control process under investigation when trying to elucidate the neural basis of cognitive control.

The differential contribution of the striatum to some (response inhibition and cognitive flexibility), but not other (general ability of response interference control) cognitive control processes may be grounded in the involvement of different subparts of the striatum in the diverse fronto-striatal networks related to cognitive control (Middleton and Strick, 2000; Utter and Basso, 2008). In this vein, the striatum could be involved in cognitive control processes in the context of eye movements (e.g., by connections with the frontal eye fields, FEFs) rather than in spatial coding per se (i.e., how response and spatial codes are represented in the context of the Simon task; Henik et al., 1994; Van der Stigchel et al., 2010). Note that we asked our subjects to centrally fixate during the Simon task, while other studies allowed eye movements or even used saccades to measure response latencies (Fielding et al., 2005). Therefore, it is conceivable that stroke patients with striatal involvement may exhibit impaired control of response interference in tasks requiring eye movement responses, but performed relatively unimpaired in the current task requiring unimanual finger responses. This hypothesis of an effector-dependent involvement of the striatum in the control of response interference warrants further investigation.

In addition, the current results of the unimanual version of the Simon task have implications for theoretical accounts of spatial coding in the Simon task. For both groups (stroke patients and healthy controls), the Simon effect was smaller in the ipsilesional/-lateral hemifield. This pattern of results was also observed in (right-handed) young healthy controls (Arend et al., 2016; see also **Supplementary Figure S2**), and is in line with the grouping model (Adam et al., 2003) that accounts for the effects of the location of the stimulus and that of the responding hand on the Simon effect in unimanual experimental setups. The grouping model assumes that pre-attentive grouping processes may pose an advantage when the stimulus activates two associated responses. In the unimanual version of the Simon task used here, left and right responses were given with the index and middle fingers of the same hand and therefore the two fingers were part of the same response unit. Following the grouping model, if a participant responded with the right hand, the presentation of the stimulus in the right visual field (i.e., ipsilateral hemifield) probably activated both the index and middle fingers of the right hand because they belong to the same response unit (i.e., the right hand). Consequently, the mismatch between the spatial location of the stimulus and the side of response in incongruent conditions should be reduced on the side of the responding hand, which in turn should reduce the difference between incongruent and congruent conditions (i.e., the Simon effect; Arend et al., 2016), as could be observed in the current study.

# CONCLUSION

When adopting a unimanual Simon task, stroke patients with unilateral lesions of the striatum showed preserved yet less efficient control of response interference. Moreover, the finding of a reduced Simon effect in the ipsilesional/-lateral hemifield, in both stroke patients and healthy controls, supports the grouping model (Adam et al., 2003; Arend et al., 2016).

# AUTHOR CONTRIBUTIONS

CS: analysis and interpretation of the data; conception, writing and revision of the manuscript. DT: study design; acquisition and analysis of the data; lesion mapping; critical revision of the manuscript. IA and SV: study concept and design; analysis of the data; critical revision of the manuscript. AD: lesion mapping; critical revision of the manuscript. JS and HK: acquisition of the data; critical revision of the manuscript. GF and AH: study concept and supervision; critical revision of the manuscript. PW: study concept, design and supervision; critical revision of the manuscript. All the authors have approved the final version of the manuscript and agree to be accountable for all aspects of the work.

# FUNDING

This research was funded by the German Israeli Foundation (GIF) for Scientific Research and Development awarded to AH and PW (GIF Grant No: 1110-93/2010). SV is supported by funding from the Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF, 01GQ1401). GF gratefully acknowledges additional support from the Marga and Walter Boll Foundation.

# ACKNOWLEDGMENTS

We are grateful to Dr. Elisabeth Achilles for providing the MRIcron compatible mask of the striatum.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00414/full#supplementary-material

# REFERENCES


of cognitive control. Cereb. Cortex 16, 553–560. doi: 10.1093/cercor/bhj003


deficits in AD/HD that are eliminated by methylphenidate treatment. J. Abnorm. Psychol. 114, 197–215. doi: 10.1037/0021-843x.114.2.197


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Schmidt, Timpert, Arend, Vossel, Dovern, Saliger, Karbe, Fink, Henik and Weiss. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Complex Interplay Between Depression/Anxiety and Executive Functioning: Insights From the ECAS in a Large ALS Population

Laura Carelli<sup>1†</sup>, Federica Solca<sup>1,2†</sup>, Andrea Faini<sup>3</sup>, Fabiana Madotto<sup>4</sup>, Annalisa Lafronza<sup>1</sup>, Alessia Monti<sup>5</sup>, Stefano Zago<sup>6</sup>, Alberto Doretti<sup>1</sup>, Andrea Ciammola<sup>1</sup>, Nicola Ticozzi<sup>1,2</sup>, Vincenzo Silani<sup>1,2†</sup> and Barbara Poletti<sup>1\*†</sup>

<sup>1</sup> Laboratory of Neuroscience, Department of Neurology, Istituto Auxologico Italiano – Istituto di Ricovero e Cura a Carattere Scientifico, Milan, Italy, <sup>2</sup> Department of Pathophysiology and Transplantation, "Dino Ferrari" Center, Università degli Studi di Milano, Milan, Italy, <sup>3</sup> Department of Cardiovascular, Neural and Metabolic Sciences, Istituto Auxologico Italiano – Istituto di Ricovero e Cura a Carattere Scientifico, Milan, Italy, <sup>4</sup> Research Centre on Public Health, Department of Medicine and Surgery, University of Milano-Bicocca, Milan, Italy, <sup>5</sup> Department of Neurorehabilitation Sciences, Casa di Cura Privata del Policlinico, Milan, Italy, <sup>6</sup> Department of Neuroscience and Mental Health, IRCCS Fondazione Ca' Granda Ospedale Maggiore Policlinico, Università degli Studi di Milano, Milan, Italy

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Marco Pitteri, University of Verona, Italy

Arianna Palmieri, Università degli Studi di Padova, Italy

#### \*Correspondence:

Barbara Poletti, b.poletti@auxologico.it

†These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 08 January 2018 Accepted: 19 March 2018 Published: 05 April 2018

#### Citation:

Carelli L, Solca F, Faini A, Madotto F, Lafronza A, Monti A, Zago S, Doretti A, Ciammola A, Ticozzi N, Silani V and Poletti B (2018) The Complex Interplay Between Depression/Anxiety and Executive Functioning: Insights From the ECAS in a Large ALS Population. Front. Psychol. 9:450. doi: 10.3389/fpsyg.2018.00450

Introduction: The observed association between depressive symptoms and cognitive performances has not been previously clarified in patients with amyotrophic lateral sclerosis (pALS). In fact, the use of cognitive measures that often do not accommodate motor disability has led to heterogeneous and inconclusive findings on this issue. The aim of the present study was to evaluate the relationship between cognitive and depressive/anxiety symptoms by means of the recently developed Edinburgh Cognitive and Behavioral ALS Screen (ECAS), a brief assessment specifically designed for pALS.

Methods: Sample included 168 pALS (114 males, 54 females); they were administered two standard cognitive screening tools (FAB; MoCA) and the ECAS, assessing different cognitive domains, including ALS-specific (executive functions, verbal fluency, and language tests) and ALS non-specific subtests (memory and visuospatial tests). Two psychological questionnaires for depression and anxiety (BDI; STAI/Y) were also administered to patients. Pearson's correlation coefficient was used to assess the degree of association between cognitive and psychological measures.

Results: Depression assessment negatively correlated with the ECAS, more significantly with regard to the executive functions subdomain. In particular, Sentence Completion and Social Cognition subscores were negatively associated with depression levels measured by BDI total score and Somatic-Performance symptoms subscore. Conversely, no significant correlations were observed between depression level and cognitive functions as measured by traditional screening tools for frontal abilities (FAB) and global cognition (MoCA) assessment. Finally, no significant correlations were observed between state/trait anxiety and the ECAS.

Discussion and conclusion: This represents the first study focusing on the relationship between cognitive and psychological components in pALS by means of the ECAS, the current gold standard for ALS cognitive-behavioral assessment.

If confirmed by further investigations, the observed association between depression and executive functions suggests the need for a careful screening and treatment of depression, to avoid overestimation of cognitive involvement and possibly improve cognitive performances in ALS.

Keywords: depression, anxiety, executive functions, social cognition, ECAS, amyotrophic lateral sclerosis

# INTRODUCTION

A consistent body of literature on cognitive-behavioral alterations in amyotrophic lateral sclerosis (ALS) over the last 20 years has led to its recognition as a multisystem disorder rather than a purely motor neuron disease. In particular, specific cognitive alterations have been described, together with psychological and behavioral changes, whose underlying neuroradiological, neurobiological, and genetic profiles remain incompletely determined (DeJesus-Hernandez et al., 2011; Phukan et al., 2012; Goldstein and Abrahams, 2013; Agosta et al., 2016). A prevalence of 30–50% of cognitive impairment, predominantly in the form of executive dysfunction, has been described (Phukan et al., 2012; Montuschi et al., 2015); in about 10–15% of ALS patients these changes fulfill the criteria for frontotemporal dementia – FTD (Phukan et al., 2012; Goldstein and Abrahams, 2013; Montuschi et al., 2015). Prevalence rates for depression and anxiety in ALS range from 0 to 44% and from 0 to 30%, respectively (Kurt et al., 2007). Both psychological aspects and cognitive changes exert a well-known effect on quality of life, functional abilities, prognosis and survival of patients with ALS (pALS) (Elamin et al., 2013; van Groenestijn et al., 2016; Xu et al., 2017). An association between cognitive deficits and depression has been consistently observed in both clinical and medically healthy populations (Lim et al., 2013; Rock et al., 2014; Pu et al., 2017; Yoon et al., 2017). In particular, depressive symptoms seem to primarily affect executive functioning, with neuroimaging findings supporting the involvement of brain networks entailing frontal-subcortical circuits in depressed patients (Rock et al., 2014; Ahern and Semkovska, 2016; Brakowski et al., 2017).

Despite the presence of consistent evidence in other neurological diseases (e.g., Terroni et al., 2012; Nunnari et al., 2015; Snowden et al., 2015), the association between cognition and psychological aspects has been poorly investigated in ALS, with conflicting results. A recent study revealed the absence of a relation between cognitive impairment and psychiatric/psychosocial measures, as well as between cognitive impairment and wish to die, employing a neuropsychological assessment focused on executive functions (Rabkin et al., 2016). In contrast, other studies adopting different measures of cognitive functions showed an association between psychological/psychiatric symptoms and global cognition (Wei et al., 2016) and between depression and specific neuropsychological aspects, including verbal and visual learning, processing speed and language (Jelsone-Swain et al., 2012).

When considering more innovative cognitive tools, two recent studies on the feasibility of Eye-Tracking (ET) and Brain Computer Interface (BCI) technologies for cognitive testing in ALS highlighted a negative correlation between anxiety levels and reasoning time on some neuropsychological tests for executive functions. Such results were interpreted as reflecting an increase in impulsivity, possibly depending both on the reduced amount of cognitive and inhibitory resources in pALS and on the supplementary influence of anxiety on this cognitive profile (Poletti et al., 2016a,b). In both studies, no significant correlations were observed between ET tests and depression level, as measured by the Beck Depression Inventory (Beck et al., 1961).

A relevant issue in the neuropsychological assessment of pALS concerns the absence of validated gold standard tools that accommodate progressive physical disabilities and provide reliable, comprehensive and comparable results across studies. This remained an unsolved issue until the recent development (Abrahams et al., 2014) and validation in several languages (Lulé et al., 2015; Niven et al., 2015; Poletti et al., 2016c) of a rapid multi-domain cognitive-behavioral screening tool specifically designed for ALS, i.e., the Edinburgh Cognitive and Behavioral ALS Screen – ECAS. Moreover, the ECAS has been specifically designed to accommodate verbal/motor disability, since subtests can be administered both in spoken and written form and in moderate/advanced stages of the disease. Despite the rapid and wide adoption of this screening tool for the clinical assessment of pALS, it has never been applied to specifically investigate the association between cognitive and psychological aspects. Only two studies report the use of the ECAS in association with psychological and behavioral scales, without specifically addressing this topic. In particular, Radakovic et al. (2017) described the absence of correlation between depression evaluated by means of the Geriatric Depression Scale and 'cognitive functioning task performances' including the Verbal Fluency Total Score of the ECAS. Another study (Niven et al., 2015) described the absence of correlations between anxiety and ECAS scores, but no indications were provided about symptoms of depression.

In the context of the validation of the Italian ECAS, we preliminarily explored the relationship between this instrument and psychological variables such as anxiety and depression (Poletti et al., 2016c). In the recruited sample, a mild negative association was observed between depression assessment and both global and ALS-specific functions scores on the ECAS; moreover, state anxiety assessment negatively correlated with the ECAS total and the ALS non-specific functions scores.

As described above, previous literature data on other medically ill and non-medically ill populations (Terroni et al., 2012; Lim et al., 2013; Rock et al., 2014; Nunnari et al., 2015; Snowden et al., 2015; Pu et al., 2017; Yoon et al., 2017) support the presence of a relationship between a psychological aspect, i.e., presence/severity of depressive symptoms, and cognitive profiles. Based on such evidence, as well as on conflicting results regarding pALS and on our preliminary findings, we aimed to investigate the association between psychological features and cognitive abilities with the ECAS in a larger ALS population, particularly focusing on executive functions. This association was also investigated with traditional cognitive measures of frontal (FAB) and global cognitive functioning (MoCA) not specifically designed for ALS, in order to identify possible differences between the gold standard ALS instrument and traditional cognitive tools for such purposes.

The presented research is part of a larger and ongoing clinical study using the ECAS for longitudinal assessment, evaluating its feasibility and sensitivity across the course of ALS disease.

## MATERIALS AND METHODS

#### Participants

Sample included 168 ALS patients (Males: 114; Females: 54; age: 62.3 ± 12.1 years; education: 11.1 ± 4.4 years; disease duration: 33.6 ± 42.7 months) recruited at the Department of Neurology, IRCCS Istituto Auxologico Italiano. The diagnosis of ALS was made by neurologists experienced in the field of neuromuscular diseases, with patients fulfilling the revised El Escorial criteria for clinically possible, probable, probable – laboratory-supported or definite ALS (Brooks et al., 2000). Patients in terminal stages of disease or with major comorbid medical, neurological or cardio-vascular diseases were excluded from the study. Disease status was evaluated using the ALS Functional Rating Scale-Revised – ALSFRS-R (Cedarbaum et al., 1999). The study protocol was reviewed and approved by the Ethics Committee of our Institution and all eligible subjects received verbal and written information about the study. All participants signed an informed consent, according to the Declaration of Helsinki. Patients performed the designed cognitive and psychological protocol, as described below; it was administered by trained neuropsychologists and required approximately 40 min.

#### Cognitive and Psychological Assessment

#### Cognitive Assessment

A neuropsychological and psychological protocol employed in a previous study (Poletti et al., 2016c) was adopted, including two standard cognitive measures and a recently validated rapid cognitive-behavioral screening tool specifically developed for ALS, the Edinburgh Cognitive and Behavioral ALS Screen (ECAS).

The Italian version of the ECAS was administered, assessing different cognitive domains, including ALS-specific (executive functions, fluency and language tests) and ALS non-specific domains (memory and visuospatial functions tests). Each domain includes specific tests (see **Table 1**). Moreover, it involves a separate Carer Interview, investigating behavior and psychotic alterations.

Standard cognitive measures included two brief batteries. For frontal executive functions, the Frontal Assessment Battery (FAB; Dubois et al., 2000) was employed, evaluating the subdomains of conceptualization, mental flexibility, motor programming, sensitivity to interference, inhibitory control, and environmental autonomy. For global cognitive functioning, the Montreal Cognitive Assessment (MoCA; Italian translation by Pirani et al., 2007; Italian normative data by Conti et al., 2014) was administered.

TABLE 1 | pALS' scores at the standard cognitive screening and the ECAS. Data are expressed as Means ± SD or absolute numbers.

#### Psychological Assessment

Depressive and anxiety symptoms were evaluated by means of two self-rated measures widely used in ALS: the Beck Depression Inventory (BDI) (Beck et al., 1961) and the State-Trait Anxiety Inventory-Y (STAI-Y) (Spielberger et al., 1970), which assesses both state (STAI-Y1) and trait (STAI-Y2) anxiety. The BDI consists of 21 items covering both cognitive-affective (BDI CA, items 1–13) and somatic-performance (BDI SP, items 14–21) symptoms of depression. The total score ranges from 0 to 63, with higher scores indicating more severe depressive symptoms. The standard cut-off ranges are as follows: 0–9 indicates minimal depression, 10–18 mild depression, 19–29 moderate depression, and 30–63 severe depression. The STAI-Y comprises 40 items (20 per subscale), and each subscale score ranges from 20 to 80. Questions refer to how anxious people feel at the time of the assessment (state) and in general (trait). Scores higher than 65 indicate clinically relevant anxiety.
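The cut-offs quoted above can be written as a small lookup. The following is a minimal illustrative sketch in Python, not part of the study protocol; the function names `bdi_severity` and `stai_clinically_relevant` are hypothetical and only encode the thresholds stated in the text.

```python
def bdi_severity(total_score: int) -> str:
    """Map a BDI total score (0-63) to the severity band quoted in the text."""
    if not 0 <= total_score <= 63:
        raise ValueError("BDI total score must lie between 0 and 63")
    if total_score <= 9:
        return "minimal"
    if total_score <= 18:
        return "mild"
    if total_score <= 29:
        return "moderate"
    return "severe"


def stai_clinically_relevant(subscale_score: int) -> bool:
    """Return True when a STAI-Y subscale score (20-80) exceeds the cut-off of 65."""
    if not 20 <= subscale_score <= 80:
        raise ValueError("STAI-Y subscale score must lie between 20 and 80")
    return subscale_score > 65
```

For instance, a BDI total of 18 falls in the mild band and a total of 19 in the moderate band, while a STAI-Y score of exactly 65 is not flagged because the text specifies scores higher than 65.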

#### Statistical Analysis


Descriptive statistics (means ± standard deviations for continuous variables, and absolute numbers and frequencies for discrete variables) were used to describe the main characteristics of our sample and the performances obtained at the ECAS and at the standard cognitive and psychological assessments. Pearson's correlation coefficient was used to assess the degree of association between measures. An α level of 0.05 was used for all hypothesis tests. P-values were adjusted for multiple comparisons with a false discovery rate approach (Benjamini and Hochberg, 1995). All data analyses were performed using SAS 9.2 software (SAS Institute, Cary, NC, United States).
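To make the two statistical steps concrete, the sketch below computes a Pearson correlation coefficient and applies the Benjamini-Hochberg step-up adjustment to a list of p-values. It is written in plain Python for illustration only: the analyses reported here were run in SAS, and the function names `pearson_r` and `benjamini_hochberg` are assumptions introduced for this example.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(p_values):
    """Return FDR-adjusted p-values, in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

An adjusted p-value below the chosen α (0.05 here) is then treated as significant. Because every test in a family is adjusted together, the adjusted values depend on how many correlations are included in that family.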

## RESULTS

#### Patients' Demographic and Clinical Characteristics

Clinical neurological examination showed a mean ALSFRS-R score of 37.2 ± 7.5 (ALSFRS-R Bulbar: 10.2 ± 2.3). At onset, 51 patients presented with upper limb involvement, 70 with lower limb involvement, 9 with upper and lower limb involvement, 35 with bulbar involvement, 2 with bulbar and lower limb involvement, and one with respiratory symptoms. Ten patients had non-invasive ventilation (NIV) and none had percutaneous endoscopic gastrostomy (PEG). For a description of patients' scores at the ECAS and standard cognitive screening measures, see **Table 1**.

With regard to psychological aspects, of the 154 patients who completed the BDI, 100 (65%) showed clinically significant depression, classified as mild-to-moderate (43%), moderate-to-severe (17%), or severe (5%). Of the 155 patients who completed the STAI-Y, 13 (8%) showed clinically significant state anxiety and 15 (10%) showed relevant trait anxiety levels. See **Table 2** for a description of patients' scores at the psychological assessment.

#### Relationship of Psychological Symptoms to ECAS Scores and Subdomains

Analysis of the association between cognitive and depression scores revealed significant, albeit modest, correlations. In particular, depression scores correlated negatively with the ECAS Total and the ALS-Specific functions scores, mainly with regard to Somatic-Performance symptoms and BDI Total score (BDI SP-ECAS Total score: r = −0.21, p = 0.023; ALS-Specific functions score: r = −0.22, p = 0.019; BDI Total score-ECAS Total score: r = −0.22, p = 0.019; ALS-Specific functions score: r = −0.23, p = 0.019; BDI CA-ECAS Total score: r = −0.19, p = 0.028; ALS-Specific functions score: r = −0.19, p = 0.027). No significant correlations were observed between anxiety and the ECAS, for either the Total score or the ALS-Specific and ALS-Non-Specific functions scores.

Analysis of the ECAS subdomains revealed weak but significant negative correlations between the executive functions subdomain and both Cognitive-Affective and Somatic-Performance symptoms of depression (BDI CA-ECAS executive functions: r = −0.21, p = 0.023; BDI SP-ECAS executive functions: r = −0.24, p = 0.017). Moreover, BDI Total score correlated negatively with both the Executive Functions and Language subdomains (BDI Total score-ECAS Language: r = −0.20, p = 0.023; ECAS executive functions: r = −0.25, p = 0.016). The Language subdomain also correlated with the Cognitive-Affective BDI subscore (r = −0.19, p = 0.027).

#### Relationship of Psychological Symptoms to Specific Cognitive Tests of the ECAS Subdomains

Further analysis of the cognitive tests included in the subdomains revealed that, within the executive functions subdomain, Sentence Completion and, particularly, Social Cognition were associated with depression levels as measured by the BDI Total score and the Somatic-Performance symptoms score (BDI SP-Sentence Completion: r = −0.22, p = 0.019; Social Cognition: r = −0.29, p = 0.007; BDI Total score-Sentence Completion: r = −0.20, p = 0.023; Social Cognition: r = −0.26, p = 0.016). Weak correlations were also observed between the Cognitive-Affective BDI subscore and Social Cognition (r = −0.19, p = 0.028). Moreover, BDI Total score correlated negatively with performance on the Spelling test (BDI Total score-Spelling: r = −0.20, p = 0.028). Finally, significant but weak correlations were also observed between depression scores and the Immediate Recall subtest within the Memory subdomain (BDI SP-Immediate Recall: r = −0.16, p = 0.044; BDI Total score-Immediate Recall: r = −0.18, p = 0.029).

#### Relationship of Psychological Symptoms to Performance at Standard Cognitive Screening Tools

No significant correlations were observed between depression scores (BDI total, cognitive-affective, and somatic-performance) and traditional screening of frontal abilities (FAB) or global cognition (MoCA). Similarly, anxiety (STAI-Y) did not correlate with either FAB or MoCA, for either state or trait anxiety scores.

## DISCUSSION

This study revealed weak but significant correlations between depression scores and the ECAS, mainly concerning the executive functions subdomain. These findings are in accordance with previous literature on patients with depression without neurological illness. In particular, some recent reviews and meta-analyses of cognitive impairment in depressed patients showed that depression is related to a reduction in a wide range of cognitive abilities, including attention, processing speed, executive function, and memory (Lim et al., 2013; Bortolato et al., 2014; Ahern and Semkovska, 2016); larger impairments were observed in executive abilities concerning processing speed and shifting, while memory scores were affected by small to moderate impairments (Ahern and Semkovska, 2016). A recent review of studies employing neuropsychological tests of executive functions in adults with depression showed that the majority of the included studies (25 of 28) found alterations in some aspects of executive functioning (Alves et al., 2014). Moreover, deficits in executive functions and attention seem to persist after remission of depressive symptoms, in particular with regard to inhibition, shifting and verbal fluency (Rock et al., 2014; Ahern and Semkovska, 2016).

In our sample, more detailed analysis within the executive functions subdomain revealed a stronger involvement of social cognition abilities in association with depression scores. This finding agrees with literature underlining a positive relationship between the severity of depressive symptoms and the degree of Theory of Mind (ToM) impairment in depressed patients (Cusi et al., 2013; Bora and Berk, 2016). In a recent meta-analysis, ToM deficits were not influenced by the severity of executive dysfunction, suggesting that social cognition impairment could represent a separate domain affected in depressive disorders, alongside executive functions (Bora and Berk, 2016). Debate is still open about the source of social cognition deficits in ALS, as studies provide heterogeneous results about their independence from executive dysfunction (Consonni et al., 2016; Strong et al., 2017). Another possible explanation is that the inhibitory component is specifically associated with depressive symptoms in ALS, resembling what has been described in depressed patients without neurological illness (Alves et al., 2014). In our sample, this interpretation is supported by the association between depression scores and tasks markedly involving inhibitory abilities, i.e., Sentence Completion and Social Cognition within the Executive Functions subdomain.

Weak correlations were observed in our sample between the memory subdomain, in particular the Immediate Recall test, and depression scores. The involvement of memory functions, in particular immediate recall, cannot be clearly separated from the influence of attention and executive alterations, since available cognitive tests, including those in the ECAS, usually tap both cognitive domains. In the ALS population, the lack of consensus about the characterization of memory alterations and the poor specificity of such components led to their exclusion from the current criteria for ALS-FTD (Strong et al., 2017). Therefore, also given the weak correlations recorded, the collected data do not support an association between depression and memory function in our sample.

Weak correlations were also observed between the Spelling test within the language subdomain and depression scores. These data, not confirmed by the available literature, need further investigation and stronger evidence before they can be critically evaluated.

Only a few studies have investigated the association between depressive symptoms and cognition in pALS. With regard to the available literature in this field, our data are broadly in accordance with two previous studies (Jelsone-Swain et al., 2012; Wei et al., 2016), which highlighted an effect of depressive symptoms on cognitive performance at both global and specific neuropsychological measures in ALS. Conversely, contrasting results have been observed in other studies employing both traditional and motor-verbal-free cognitive testing, which showed no relationship between cognitive impairment and depression levels (Poletti et al., 2016a,b; Rabkin et al., 2016). The novelty of this topic in ALS, together with the variety of measures employed for depression and cognition and the different modes of administration of neuropsychological tests (both 'paper and pencil' and motor-verbal-free measures), makes a direct comparison with previous findings unfeasible. Possibly, the increasing adoption of a recently validated gold-standard measure of cognition and behavior in ALS, i.e., the ECAS, will make it possible to obtain comparable data across studies and to carry out longitudinal investigations, since it partially compensates for motor disability. Moreover, the absence of correlations between depression scores and performance at standard cognitive screening tools (FAB and MoCA) further supports the use of the ECAS for investigating this topic, given both the sensitivity and the feasibility of this instrument in ALS.

In our sample, the association with cognition mainly concerned the BDI Total score and the Somatic-Performance BDI component, while the Cognitive-Affective component was only weakly associated with cognitive scores. These results could be controversial, given concerns that the BDI may overestimate the presence of depression in ALS, since it contains a number of somatic/vegetative symptoms that can overlap with physical illness. However, the use of the BDI in our protocol is consistent with previous studies that confirmed the reliability of this measure in pALS. A previous study comparing the estimated prevalence of depression using the BDI and a questionnaire specifically designed for pALS revealed little discrepancy between findings (Kubler et al., 2005). In addition, a modified BDI scale, without items that could be confounded by physical symptoms, showed good agreement with the standard BDI scale (Wicks et al., 2007). Therefore, the BDI score can be considered appropriate for clinical use in the ALS population (Kubler et al., 2005; Wicks et al., 2007; Taylor et al., 2010).

In depressed patients, both somatic/vegetative symptoms and negative affective states/dysfunctional cognitive patterns may interfere with optimal performance. According to our results, we could hypothesize that in pALS the influence of depression on cognition, in particular on social cognition abilities, may be more specifically mediated by somatic/vegetative symptoms. If confirmed by further studies, this result could be discussed within the framework of the psychophysiological changes involving the sympathetic nervous system that have been observed in ALS in association with social and emotional impairments (Lulé et al., 2005).

With regard to the other psychological component investigated, i.e., state and trait anxiety, we found no significant associations with cognitive performance at the ECAS. According to recent literature, experimentally induced anxiety seems to impair performance only under low load, i.e., in simple cognitive tasks, while its effect is reduced when subjects engage in more difficult tasks that involve high attentional and executive resources (Vytal et al., 2012, 2013). Previous results obtained in ALS underlined an effect of anxiety on reasoning times, with higher levels of anxiety corresponding to lower execution times at some cognitive tests (Poletti et al., 2016a,b). In our sample, the limited proportion of patients presenting with clinically relevant state and trait anxiety levels, with global mean scores considerably lower than the cut-off, could explain the absence of clear effects of this psychological component on the observed performances. Moreover, previously reported associations between anxiety and execution times could not have been observed with the employed protocol, which does not include time-related measures. Therefore, further investigations are needed in ALS in order to clarify this issue.

Limitations of the present study mainly concern the psychological evaluation of pALS. In particular, information about previous depressive episodes, which could have helped to distinguish between recurrent depressive episodes and depressive symptoms reactive to the ALS diagnosis, was not systematically collected and described. Moreover, excluding patients taking psychotropic drugs, or analyzing this subgroup separately, would have been useful to prevent or highlight the potential effect of medications on cognitive performance.

Even though it was assessed in a large sample of pALS, the relationship between psychological factors and executive functions in this population cannot be considered fully clarified. The investigation of the complex set of factors modulating executive functions will probably benefit from full control of psychological variables, of the verbal-motor components of cognitive tests, and of other well-known aspects influencing cognitive performance (i.e., socio-demographic variables).

Despite the above-mentioned limitations, if confirmed by further investigations and by stronger statistical results, the observed association between depression and executive functions could help to better plan both cognitive assessment/training and psychological interventions in ALS. With regard to the former, emphasizing the importance of careful screening for depression could help to avoid overestimating cognitive involvement. Furthermore, the observed correlation could mask the progression of cognitive involvement as measured in longitudinal assessments. This point should be addressed in future follow-up studies with the ECAS. With regard to psychological interventions, clarifying how comorbid cognitive impairment, as well as other biopsychosocial factors, moderates both pharmacotherapy and psychotherapy outcomes in ALS could lead to more tailored and effective interventions. At present, because the reciprocal influence of cognitive and psychological aspects in ALS is poorly defined, cognitive alterations are mostly not considered in psychological interventions in ALS (Gould et al., 2015). Additionally, the influence of psychotherapy on cognitive aspects of pALS should be investigated more deeply, in light of recent literature showing an association between changes in the central nervous system, i.e., an attenuation in disease progression, and a concomitant psychological treatment (Kleinbub et al., 2015).

The presented work could represent an important step toward the definition of an integrated approach to intervention for pALS, in accordance with the biopsychosocial model, with the aim of managing this complex disease carefully and more efficiently.

# AUTHOR CONTRIBUTIONS

BP, LC, and FS conceived the study and wrote the manuscript. AL, SZ, and AM administered the cognitive and psychological protocols. AF and FM performed the statistical analysis. AC, AD, and NT performed the clinical examination of ALS patients. BP and VS critically revised the manuscript.

# ACKNOWLEDGMENTS

The authors thank patients and their relatives, together with the other volunteers who participated in this research.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer AP and handling Editor declared their shared affiliation.

Copyright © 2018 Carelli, Solca, Faini, Madotto, Lafronza, Monti, Zago, Doretti, Ciammola, Ticozzi, Silani and Poletti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Intra-Individual Variability Across Fluid Cognition Can Reveal Qualitatively Different Cognitive Styles of the Aging Brain

#### Sara De Felice<sup>1</sup> \* † and Carol A. Holland<sup>2</sup>†

<sup>1</sup> Institute of Neurology, Faculty of Brain Sciences, University College London, London, United Kingdom, <sup>2</sup> Division of Health Research, Centre for Ageing Research (C4AR), Faculty of Health and Medicine, Lancaster University, Bailrigg, United Kingdom

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Veronica Mazza, University of Trento, Italy Brandon P. Vasquez, Baycrest Hospital, Canada

#### \*Correspondence:

Sara De Felice sara.felice.16@ucl.ac.uk

†These authors were affiliated to Aston University at the time of the research

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 27 April 2018 Accepted: 26 September 2018 Published: 16 October 2018

#### Citation:

De Felice S and Holland CA (2018) Intra-Individual Variability Across Fluid Cognition Can Reveal Qualitatively Different Cognitive Styles of the Aging Brain. Front. Psychol. 9:1973. doi: 10.3389/fpsyg.2018.01973

Dispersion is a measure of intra-individual variability reflecting how much performance across distinct cognitive functions varies within an individual. In cognitive aging studies, results are inconsistent: some studies report an increase in dispersion with increasing age and decline in performance, while others report an increasingly homogenous cognitive profile in older adults. We propose that inconsistencies may reflect qualitative differences in the cognitive functioning of the aging brain: age-groups may differ in how efficiently they engage resources, depending on both executive processing and resources available. This in turn would result in either greater or less dispersion. 21 young (mean 25.14 years, SD ± 2.85), 21 middle-old (65.05 ± 4.19), and 20 old-old (80.65 ± 4.38) healthy adults completed a series of neuropsychological tasks engaging executive processing, including switching, planning, updating, working memory and short-term memory. Individual dispersion profiles were obtained using a regression method which computes individual standard deviation across tasks from standardized test scores. Results revealed associations between performance, dispersion and cognitive reserve (measured as education level). Although differences across groups did not approach significance, there was a general pattern consistent with existing literature showing greater dispersion in the old-old group, and this was negatively associated with performance. In contrast, the middle-old group showed young-equivalent dispersion index, while performance was similar to the young group on some tasks and to the old-old group on others, possibly reflecting differences in cognitive demand. Educational level positively correlated with performance in the middle-old group only. Overall, a distinct pattern emerged for the middle-old adults: they showed young-equivalent performance on a number of measures and similar dispersion index, while uniquely benefitting from cognitive reserve. This may possibly reflect engagement in compensatory mechanisms. This study contributes to clarifying inconsistencies in previous studies and calls for more thoughtful selection of sample cohorts in aging research. The study of dispersion may provide a behavioral index of age-related changes in how cognition functions and recruits resources. Future work could examine whether this also reflects age-related changes in neural recruitment and aim at identifying factors contributing to cognitive reserve, in order to prolong good performance and improve cognition in aging.

Keywords: intra-individual variability, dispersion, fluid cognition, executive functions, aging, compensation, cognitive reserve, education

# INTRODUCTION

The term intra-individual variability has been adopted to refer to variability in performance within individuals across either different trials (within the same task) or across tasks. In recent years, advanced aging has been studied with reference to intra-individual variability and pattern of performance within and across cognitive domains. While many studies have focused on intra-individual variability across trials within tasks (for a review see Dykiert et al., 2012), relatively less attention has been devoted to the study of intra-individual variability across tasks/cognitive functions, referred to in the literature as either dispersion (Hilborn et al., 2009) or differentiation (Juan-Espinosa et al., 2002; Li et al., 2004; Blair, 2006). This latter form of intra-individual variability is the focus of this study. While the terms intra-individual variability, differentiation and dispersion have been used interchangeably in the literature, the term "dedifferentiation" specifically refers to reduced intra-individual variability within a person across tasks (Balinsky, 1941; Juan-Espinosa et al., 2002). For clarity, here we will adopt the term "dispersion" (or dispersion index) to refer to the variability in individuals' performance across tasks, as defined in previous studies (Sosnoff and Newell, 2006; Hilborn et al., 2009; Halliday et al., 2018).

Reduced dispersion (cognitive dedifferentiation) has been reported as a function of age across measures of speed of processing, working memory, verbal fluency and lexical decision in both cross-sectional and longitudinal studies (Li et al., 2004; Rabbitt et al., 2004). However, other studies have reported the opposite pattern, namely an increase in dispersion with increasing age (Hultsch et al., 2000; Sosnoff and Newell, 2006; Hilborn et al., 2009; Halliday et al., 2018). These inconsistent results may derive from (i) differences in neuropsychological batteries, (ii) the analysis adopted to compute the dispersion index, and/or (iii) demographic differences in age groups across studies. Therefore, the present study will (i) use tests that cover a variety of cognitive measures, (ii) compute the dispersion index via a regression technique which accounts for external and internal confounds (see methods), and (iii) compare two older-adult groups of different ages with young adults. An alternative explanation may arise when considering the dispersion index in relation to cognitive performance across different older age groups, as this may carry information about the cognitive profile of a given age group and about how cognition functions and recruits resources at different developmental stages. The aim of this study is therefore to investigate variability in performance across cognitive measures within single individuals (dispersion index), and in particular its relation to cognitive performance in healthy aging.
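As an illustration of what a dispersion index measures, the sketch below standardizes each task across participants and then takes each participant's standard deviation across tasks. This is a simplified assumption: it omits the regression step that adjusts for confounds, which is described in the methods rather than here, and the function name `dispersion_index` and the input layout are hypothetical.

```python
from statistics import mean, stdev


def dispersion_index(scores):
    """Intra-individual SD across tasks, computed from z-standardized scores.

    scores: list of rows, one per participant, each row holding one raw
    score per task (same task order for everyone, at least two tasks).
    Returns one dispersion value per participant.
    """
    columns = list(zip(*scores))  # one tuple of raw scores per task
    z_columns = []
    for column in columns:
        mu, sd = mean(column), stdev(column)
        if sd == 0:
            raise ValueError("a task with no between-person variance cannot be standardized")
        # Put every task on a common scale (mean 0, SD 1 across participants).
        z_columns.append([(value - mu) / sd for value in column])
    z_rows = zip(*z_columns)  # back to one row of z-scores per participant
    return [stdev(row) for row in z_rows]
```

A participant whose standardized scores are identical across tasks obtains a dispersion of zero, whereas a participant who is well above the sample mean on one task and well below it on another obtains a high value, regardless of their average level of performance.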

Theoretical and empirical evidence points to the frontal lobe as the hub for neuronal and cognitive mechanisms that may be responsible for driving this cross-domain variability between individuals and age groups (e.g., Robbins et al., 1998; Rabbitt and Lowe, 2000; Gruber and Goschke, 2004; MacDonald et al., 2006). "Executive functions" is an umbrella term that refers to a series of cognitive processes which cooperate purposefully, including selecting the target information and inhibiting irrelevant information likely to interfere with current mental processes and/or response execution, keeping and manipulating information online, shifting and sustaining attention, and planning, organizing and executing tasks. We adopted the definition proposed by Blair (2006), which goes beyond the classical definition of executive functioning (Diamond, 2013) and overlaps with the more comprehensive construct of fluid cognition. This includes speed of processing and working memory, as these cognitive processes cooperate purposefully to organize and execute sequential steps or actions (Alvarez and Emory, 2006). These may be particularly associated with age-related cognitive decline (e.g., Johnson et al., 2007) as well as age-related compensatory mechanisms (Buckner, 2004; Ouwehand et al., 2007). To incorporate these cognitive components in our definition and avoid confusion with terminology, we will use the term "fluid cognition" over "executive function." Although the unitary entity of fluid cognition is conceptually useful, it is important to note that different components are distinguishable (for a meta-analytic review see Alvarez and Emory, 2006), and dissociate both in healthy aging and pathological conditions (e.g., Robbins et al., 1998; Piguet et al., 2002; Godefroy, 2003; Brandt et al., 2009; Turner and Spreng, 2012; White, 2013; De Felice et al., 2018). The focus of the present study is to investigate how the behavioral relationships across these different cognitive components vary with age (dispersion index), and whether this reveals specific associations with performance in different groups of healthy older adults.

In contrast to individual cognitive performance, the dispersion index is thought to provide an indicator of fairly stable endogenous factors, such as central nervous system (CNS) integrity (MacDonald et al., 2006, 2009), and to be less influenced by situation-dependent factors including fluctuations in stress or sleep (Hultsch et al., 2000). Several studies have reported that a measure of dispersion can be a meaningful indicator of individual differences in behavioral cognitive integrity and neurological mechanisms over age, education and socio-economic status (Hultsch et al., 2000; Rapp et al., 2005; Hilborn et al., 2009; Wojtowicz et al., 2012; Halliday et al., 2018). Behavioral studies have shown that greater dispersion across neuropsychological measures is associated with greater cognitive decline and poorer performance in healthy older adults (Rapp et al., 2005; Hilborn et al., 2009; Wojtowicz et al., 2012; Halliday et al., 2018). However, little attention has been given to the cognitive behavioral profile of older adults who do not show such dispersion, and fewer studies have specifically focused on variability in dispersion between old-age groups (e.g., see Hilborn et al., 2009).

The variance in the degree of dispersion associated with older adults is consistent with models of successful aging such as the Selection, Optimization, and Compensation (SOC) model (Baltes and Baltes, 1990). The SOC model defines successful aging as a heterogenous process of losses and gains, whereby older adults select tasks that are meaningful to them, optimize the resources available to complete those tasks, and engage in compensatory mechanisms to adapt to age-related losses. Variability across individuals in engaging such strategies depends on resources available (Lang et al., 2002; Amieva et al., 2014) and may result in different degrees of "successful" aging. This definition has the advantage of allowing for non-normative, individual cognitive profiles, not necessarily defined by chronological age (e.g., Lang et al., 2002). In fact, the model refers to compensatory strategies including behavioral adjustments to cope with a large range of age-related losses (biological, cognitive, psychological, sensorimotor, socio-economic, etc.). With regards to cognitive losses, compensatory mechanisms may take place unconsciously and manifest behaviorally only when performance is specifically tested (e.g., Li et al., 2001; Silagi et al., 2015).

We argue that mechanisms of cognitive compensation preceding and/or counteracting cognitive decline may have been overlooked in behavioral studies, in favor of an analysis of the variability in individual performance across cognitive domains (dispersion). This may have left unexplored stages in aging when cognitive changes produce distinct patterns of behavioral outcomes, in cases where dispersion is reduced or absent. Dispersion may occur later in aging, is usually associated with a decrement in performance, and tends to be more easily detectable by neuropsychological tasks. Additionally, behavioral indices of cognitive compensation may tend to remain hidden when older groups are compared to young adults, and when no direct comparisons are carried out within older groups. Although studies on cognitive aging have examined age-related cognitive decline and compensatory mechanisms (Dixon and Bäckman, 1995; Glisky et al., 2001; Dixon and de Frias, 2007; Raz, 2009), and have tried to identify specific factors contributing to cognitive compensation (e.g., working memory, Borella et al., 2010; Karbach and Verhaeghen, 2018), little attention has been devoted to the study of dispersion and its relation to (good) performance, in the context of "successful aging." In order to investigate these issues, we specifically compared two older-age groups (middle-old adults in their 60s and old-old adults over 75 years old) to young controls on a series of tasks engaging fluid cognition.

In line with a progressive neurodegeneration with aging (Raz et al., 2005), we expect middle-old adults to show overall better performance than old-old adults. However, when looking at the dispersion index and performance, we predict that middle-old adults would show a distinct pattern of results compared to both young and old-old adults. Specifically, middle-old adults may show similar dispersion to young adults and less than old-old adults, while showing performance levels in the middle between young and old-old groups. This could possibly reflect greater cognitive stability, resulting in limited dispersion in this age group and better performance. In contrast, the old-old group may show greater dispersion and poor performance compared to the other age-groups, possibly reflecting advanced aging (Li et al., 2004; Sosnoff and Newell, 2006), and/or a stage when compensatory mechanisms may become increasingly difficult to implement as a result of age-related losses in resources (Freund and Baltes, 2002; for a review see Ouwehand et al., 2007).

If the dispersion index within older-age groups is modulated by the degree of engagement in cognitive compensatory strategies (to counteract age-related loss), which in turn depends on the resources available (Lang et al., 2002; Amieva et al., 2014), it follows that the older adults with lower dispersion would also be relying on cognitive reserve to a greater extent. Cognitive reserve has been defined as a set of variables including education, intelligence and novelty (Stern, 2002, 2009, 2012), which helps the brain to flexibly adapt to and compensate for pathological loss (Robertson, 2013). It would follow that performance should be positively correlated with factors such as education, specifically in those age-groups that engage in compensatory strategies, but not in those age-groups that either do not need to compensate (e.g., young adults) or whose neurological aging processes have reached an advanced stage where cognitive resources are more limited (Glisky et al., 2001; Sosnoff and Newell, 2006; Ouwehand et al., 2007; Raz, 2009). According to this argument, old-old adults would show the greatest dispersion and this would be coupled with poor performance, while factors contributing to cognitive reserve (e.g., education level) would have little influence on general performance, in accordance with studies showing that factors considered to be protective against mental aging may no longer benefit the oldest elders (Paulo et al., 2011).

The counter-argument would be that middle-old age is not qualitatively different from either young adulthood or old-old age, and differences across groups should simply reflect a linear increase in dispersion and decrease in performance. According to this second hypothesis, the middle-old group should resemble either young adults or old-old adults, depending on where in the developmental trajectory they fall, thus showing either little dispersion and good performance or significant dispersion and poor performance, respectively. Taking into account evidence that variability between individuals exists, particularly in older groups (Morse, 1993), it may indeed be possible that our middle-old age group would include both young-like participants and old-old-like participants: this would predict a negative correlation between dispersion and performance, consistent with a linear increase in dispersion and decrease in performance with aging (e.g., Hilborn et al., 2009). This hypothesis would also predict that an index of cognitive reserve such as educational level should not have a specific effect on any one age-group more than another, benefitting either all or none.

Furthermore, if cognitive outcome in aging reflects both the cognitive resources available and the ability to recruit those resources, it follows that performance should be modulated by task demand. To test this hypothesis, we designed a working memory task based on the findings from an earlier study (Cappell et al., 2010). Using functional magnetic resonance imaging (fMRI), Cappell et al. found that different patterns of neural recruitment were associated with different levels of performance in healthy older adults, and reported variations as a function of working-memory load: while lower loads were associated with greater neural recruitment in older adults and age-equivalent performance, higher loads were associated with lower recruitment as well as poor performance. Although this study was intended to investigate age-related differences in neural recruitment, and we are instead interested in age-related variance in cognition, we still believe that the logic of comparing performance for different working-memory loads is relevant to test our hypothesis. In this task, working-memory span does not increase based on individual performance (as in classic digit span tasks); instead, participants are given different fixed span lengths (working-memory loads) and performance is compared across conditions. If the middle-old group presents a distinct cognitive profile, with behavioral performance reflecting age-related changes in cognitive recruitment, we would expect different patterns of results for different working-memory loads: for lower loads, the middle-old group would show young-equivalent performance, while performing better than old-old adults. However, higher loads would be so cognitively demanding that middle-old and old-old adults would perform almost equally poorly. 
In contrast, if middle-old adults are at an intermediate stage between young and old-old adults, one should expect the middle-old group to perform worse than young adults and better than old-old adults across conditions, although differences in age-related change in each of the measures may result in some variance. See **Table 1** for a schematic summary of these hypotheses.

# MATERIALS AND METHODS

## Participants

Twenty-one young (22–31 years old), 20 middle-old (59–71 years old), and 20 old-old (76–91 years old) participants volunteered for the present study (for sample demographic information see **Table 2**). Groups were matched as far as possible on gender and educational level, with no significant difference between age groups on the education variable (p = 0.68). All participants were native Italian speakers.

Participants were excluded only on the basis of factors which could significantly affect their ability to do the tasks: e.g., history of stroke; neurological impairments; serious cognitive, visual, hearing or motor impairment. The cognitive status of older adults was determined based on self-report. Older participants included in the final sample were community-dwelling adults presumed cognitively normal; they were all non-institutionalized and independent-living individuals. None had a diagnosis of Mild Cognitive Impairment or Dementia. None had any difficulty with performing the tasks.

#### Tasks

The aim is to investigate variability in performance across measures of fluid cognition (see Blair, 2006). Within this definition, we selected neuropsychological tests that are considered measures of psychomotor speed [e.g., Trail Making Test part A (TMT-A), Arbuthnott and Frank, 2000; Salthouse, 2011], short-term memory and executive-attention (e.g., Digit

The two hypotheses are presented with reference to their specific predictions of age-related changes in dispersion index, performance and effect of indices of cognitive reserve such as educational level. (A) Cognitive aging reflects qualitatively different developmental stages: this hypothesis interprets the dispersion index as reflecting age-related differences in neural recruitment and predicts that it would increase with aging in a non-linear manner, depending on the efficiency of compensatory mechanisms/severity of neural noise; it would follow that performance may vary depending on task demand (compensatory strategies may overcome aging losses only up to a certain threshold) and that indices of cognitive reserve may specifically improve performance when compensatory strategies are employed but not when these are neither necessary (young age) nor feasible (advanced aging). (B) Cognitive aging reflects a linear decline in performance: this hypothesis predicts a linear increase in dispersion with aging, reflecting an increase in neural noise, which in turn would result in a linear decrease in performance, with factors contributing to cognitive reserve failing to show beneficial effects that are specific to any age group or cognitive style.

#### TABLE 2 | Demographic data.

fpsyg-09-01973 October 16, 2018 Time: 13:6 # 5


M: male; F: female.

Span Forward, Kane and Engle, 2002) and more classic executive function tasks engaging working-memory (e.g., Digit Span Backward, Kane and Engle, 2002), switching and inhibition (TMT part B/A, Salthouse, 2011; verbal fluency, Troyer et al., 1998; Shao et al., 2014). The common denominator here is cognitive control (Morton et al., 2011): we selected tasks which are considered to engage a series of cognitive processes that co-operate purposefully, by organizing and executing sequential steps or actions (Alvarez and Emory, 2006; Blair, 2006), and that may be particularly associated with age-related cognitive decline (e.g., Johnson et al., 2007) as well as age-related compensatory mechanisms (Buckner, 2004; Ouwehand et al., 2007).

#### Digit Span (DS): Forward and Backward

Participants are required to recall an increasingly long sequence of digits in the same order (DS forward) or in reverse order (DS backward) (Wechsler, 2008). The two versions of this task provide measures of different components of working memory (Baddeley, 1983), including maintenance of information online (DS forward) and manipulation of such information (DS backward). Although other models consider DS forward as a measure of verbal short-term memory separated from working-memory (see Warrington and Shallice, 1969), what is important here is its conceptualization within fluid cognition (Schofield and Ashman, 1986) and executive-attention (Kane and Engle, 2002).

#### Trail Making Test (TMT)

This test is widely used as a measure of both speed of processing (Part A) and executive functions (Part B) (Reitan and Wolfson, 2009). This paper-and-pencil task requires completion of sequences as fast as possible by joining different circles. In part A (TMT-A), participants are asked to link numbers 1–25 in ascending order; in part B (TMT-B), participants are asked to link numbers and letters in alternating order (e.g., 1-A; 2-B; 3-C...). Time (s) to complete TMT-A is taken as a measure of psychomotor speed, while the TMT-B over TMT-A ratio (B/A) is computed as an index of executive processing. This has been shown to be a better measure of executive function compared to the TMT-B minus TMT-A difference (Salthouse, 2011), and to specifically reflect switching and inhibition (Arbuthnott and Frank, 2000). Each participant was given one practice trial before each component.

#### Verbal Fluency (VF): Phonemic and Semantic

This test provides measures of executive processing including lexical retrieval, inhibition, mental set-shifting, internal response generation, updating and self-monitoring (Benton and Hamsher, 1978; Shao et al., 2014). In the phonemic task, participants are given a letter (P, L, or F) and asked to produce as many words starting with that letter as they can in a minute. Participants are told that they cannot produce proper names (e.g., cities, people, etc.), and that they will not receive a score for alterations of the same word [e.g., casa (house) and casetta (little house)]. In the semantic task, participants are given the semantic category "animals" and are asked to produce as many words within that category as they can in a minute. These two versions of verbal fluency have been shown to reflect clustering (generating words within subcategories) and switching (alternating between subcategories) in slightly different degrees, with dissociations being reported in clinical populations (Juhasz et al., 2012) and different neural correlates being associated with these two conditions (Troyer et al., 1998).

#### Word Span

This task was designed based on a previous study (Cappell et al., 2010) specifically to have a measure of working memory as a function of different working-memory loads. To control for any possible practice effect with the digit span task, as well as to account for differences between digit and word span tasks found in early studies (Brener, 1940; Crannell and Parrish, 1957), we changed the span stimuli from digits to words. This is believed to improve sensitivity of the measure (e.g., Kochhann et al., 2009), which may be particularly important in the case of fixed working-memory span (rather than adjusting online task difficulty based on individual performance). The researcher reads aloud three lists of words, one with four words, another with five words, and another with seven words, once for each condition. Word stimuli were selected from the Burani et al. (2011) database for Italian words and matched for: frequency; mean bigram frequency; letter length; syllable length; and mean reading-time. Participants are asked to repeat the list in the same serial order. A score of 2 is given for each item recalled in the correct ordinal position, 1 for each item recalled in an incorrect ordinal position and 0 for items not recalled (this gives a maximum score of 8 for word span 4, 10 for word span 5, and 14 for word span 7).
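The scoring rule above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the example words are hypothetical, and the handling of intrusions or repeated words is not specified in the text, so such responses simply earn no points here.

```python
def score_word_span(presented, recalled):
    """Score one word-span trial.

    2 points: word recalled in its correct serial position.
    1 point:  presented word recalled, but in a different position.
    0 points: word not recalled.
    Intrusions (recalled words that were never presented) earn nothing;
    this is an assumption, since the text does not address them.
    """
    score = 0
    for position, word in enumerate(presented):
        if position < len(recalled) and recalled[position] == word:
            score += 2
        elif word in recalled:
            score += 1
    return score


def percent_of_max(score, list_length):
    """Express a score as a percentage of the maximum (2 points per word)."""
    return 100.0 * score / (2 * list_length)


# Hypothetical four-word trial with the two middle words swapped.
presented = ["casa", "mare", "sole", "pane"]
recalled = ["casa", "sole", "mare", "pane"]
trial_score = score_word_span(presented, recalled)
print(trial_score, percent_of_max(trial_score, len(presented)))
```

Converting to a percentage of the maximum puts the three loads on a common scale, since their raw maxima differ (8, 10, and 14).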

# Intra-Individual Variability Measure: Dispersion Index

To investigate whether intra-individual variability across tasks can represent a meaningful index of cognitive styles in healthy aging, a dispersion index was calculated for each individual and then used in further analysis (see below). This measure reflects performance variability across cognitive measures within an individual.

There are multiple indices that can be computed as a measure of intra-individual variability (Slifkin and Newell, 1998). The simplest of these is the intra-individual standard deviation (iSD), which is calculated as the standard deviation across standardized scores of different tasks for a single individual. This measure can be problematic when there are significant systematic group differences in average level of performance, as greater means tend to be associated with bigger variance (Hale et al., 1988; Stawski et al., 2017). This can represent a serious confound in the case of comparison of performance between age-groups. To control for differences in variability that may derive from individual mean-level performance, the coefficient of variation (COV) can be calculated by dividing the iSD by the mean of performance for each individual. However, it has been shown that this measure is less sensitive to the pure endogenous factors defining individual cognitive structure, as it does not control for systematic confounds across individuals and/or groups, such as gender or boredom (Wojtowicz et al., 2012). Instead, the dispersion index, although requiring a slightly longer computation, is thought to provide a more reliable measure of CNS integrity and index of individual cognitive structure (Hultsch et al., 2002; Wojtowicz et al., 2012).
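As a rough illustration of the two simpler indices, the sketch below computes an iSD and a COV for two hypothetical participants. The scores are invented T-scores (a standardized scale with mean 50 and SD 10, chosen here so that the individual mean is not near zero), and the use of the sample standard deviation is an assumption; neither detail comes from the study.

```python
from statistics import mean, stdev


def isd(scores):
    """Intra-individual SD: spread of one person's scores across tasks."""
    return stdev(scores)  # sample SD (n - 1 denominator); an assumption


def cov(scores):
    """Coefficient of variation: iSD divided by the person's own mean."""
    return isd(scores) / mean(scores)


# Hypothetical T-scores for two people on five tasks (same mean of 50).
even_profile = [52, 50, 51, 49, 48]
uneven_profile = [65, 40, 55, 35, 55]

for label, profile in [("even", even_profile), ("uneven", uneven_profile)]:
    print(label, round(isd(profile), 2), round(cov(profile), 3))
```

Both participants have the same average, so the iSD and the COV rank them identically; the two indices diverge only when individual means differ.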

Individual dispersion profiles are obtained by using a regression technique, which computes iSD scores from standardized test scores (Christensen et al., 1999; Hultsch et al., 2002; Cole et al., 2011; Halliday et al., 2018). Test scores of interest (TMT-A, TMT-B/TMT-A, Digit Span Forward, Digit Span Backward, Verbal Fluency Phonemic, Verbal Fluency Semantic, Word Span average across trials 4, 5, and 7) were initially regressed on linear age trends across all participants, then the resulting residuals from these models were standardized as z-scores (M = 0, SD = 1), with individual iSDs subsequently computed across these z-scores. Higher values in dispersion index reflect greater intra-individual variability in cognitive functions, across tasks.
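The three steps described above (regress each test on age, standardize the residuals, take each person's SD across tests) can be sketched as follows. This is a minimal sketch under stated assumptions, not the analysis code used in the study: the function names are hypothetical, ordinary least squares and the sample SD are assumed, and any alignment of score direction (e.g., for time-based measures) is left to the caller.

```python
from statistics import mean, stdev


def age_residuals(ages, scores):
    """Residuals of a simple linear (OLS) regression of scores on age."""
    a_bar, s_bar = mean(ages), mean(scores)
    sxx = sum((a - a_bar) ** 2 for a in ages)
    sxy = sum((a - a_bar) * (s - s_bar) for a, s in zip(ages, scores))
    slope = sxy / sxx
    intercept = s_bar - slope * a_bar
    return [s - (intercept + slope * a) for a, s in zip(ages, scores)]


def zscores(values):
    """Standardize to M = 0, SD = 1 (fails if all values are identical)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def dispersion_index(ages, scores_by_test):
    """Return one iSD per participant.

    scores_by_test maps a test name to a list of scores, one per
    participant, in the same order as ages.
    """
    z_by_test = [zscores(age_residuals(ages, scores))
                 for scores in scores_by_test.values()]
    # zip(*...) regroups the z-scores by participant instead of by test.
    return [stdev(person) for person in zip(*z_by_test)]
```

A participant whose age-adjusted z-scores are identical on every test receives an iSD of zero, and the value grows as the profile becomes more uneven.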

#### Procedure

Participants were recruited through advertising flyers in the local community. Each session took approximately 30 min. The protocol was approved by the Aston University ethics committee. All participants gave written informed consent in accordance with the Declaration of Helsinki. All written information was in a font clear and big enough for all age-groups. To facilitate comprehension, instructions were presented both in writing and orally. Tasks were presented in the following order: Trail Making Test (TMT), Digit Span (DS; Forward and Backward), Verbal Fluency (VF; Phonemic and Semantic), and Word Span (WS). The same order was followed with all participants. This order was designed to have a balanced cognitive demand, without over-stressing the same ability consecutively (e.g., working memory in Digit Span and Word Span).

#### Statistical Analysis

#### Performance Across Groups: Is the Middle-Old Age Simply Reflecting an In-Between Stage in Cognitive Aging?

To examine the age-related differences in different cognitive measures, a series of one-way between-subject ANOVAs will be conducted to compare performance in each task between age groups. When an overall significant effect of age is found, separate independent t-tests will be computed to examine differences between groups. No correction for multiple comparisons will be made, because the aim is to identify cognitive measures for which there is a difference between groups, rather than testing the null hypothesis of no overall difference between the groups (Armstrong, 2014). Also, the conservative nature of post hoc tests would increase the risk of Type II error (Wuensch, 2017). Moreover, a 3 (age-group) × 3 (word span loads) ANOVA will be conducted to examine whether the differences in performance between the different levels of load in the word span task were different for the different age groups. This will be done using percentages because of the different range of possible scores for each load.
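For readers who want to see the arithmetic behind this two-step procedure, the sketch below computes the one-way ANOVA F statistic and a pooled-variance (Student's) independent t statistic from first principles. The function names are hypothetical and the example data are invented; p-values are omitted because they require the F and t distributions, which a statistics package would normally supply.

```python
from statistics import mean, variance


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way between-subject ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def student_t(group_a, group_b):
    """Return (t, df) for an independent t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t_stat = (mean(group_a) - mean(group_b)) / (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return t_stat, df


# Invented scores for three small groups, for illustration only.
young, middle_old, old_old = [5, 6, 7], [2, 3, 4], [1, 2, 3]
print(one_way_anova_f([young, middle_old, old_old]))
print(student_t(young, old_old))
```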

#### Comparison of the Dispersion Index Across Groups and Relation With Performance and Educational Level

First, independent Student's t-tests will be computed to examine age-group differences in dispersion index. Second, to test the hypothesis that the link between performance and dispersion presents some qualitative differences between the middle-old and the old-old group and to examine the role of reserve, we will perform a series of correlation analyses between performance and (i) dispersion index (iSD) and (ii) education (an index of cognitive reserve, see Introduction) for each age-group separately as well as for the whole sample.
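The correlation step can be illustrated with a short sketch that computes a correlation within each age-group and for the whole sample. The Pearson coefficient is assumed here, since the text does not name the coefficient; the function names, group labels, and data are likewise hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlations_by_group(records, x_key, y_key):
    """Correlate two variables within each group and across the whole sample.

    records is a list of dicts, each with a "group" entry plus the two
    variables named by x_key and y_key.
    """
    results = {}
    for group in sorted({r["group"] for r in records}):
        subset = [r for r in records if r["group"] == group]
        results[group] = pearson_r([r[x_key] for r in subset],
                                   [r[y_key] for r in subset])
    results["all"] = pearson_r([r[x_key] for r in records],
                               [r[y_key] for r in records])
    return results


# Invented records: dispersion (iSD) and an overall performance z-score.
records = [
    {"group": "middle-old", "isd": 0.6, "performance": 0.2},
    {"group": "middle-old", "isd": 0.8, "performance": 0.1},
    {"group": "middle-old", "isd": 0.7, "performance": 0.4},
    {"group": "old-old", "isd": 0.7, "performance": 0.0},
    {"group": "old-old", "isd": 0.9, "performance": -0.4},
    {"group": "old-old", "isd": 1.2, "performance": -0.9},
]
print(correlations_by_group(records, "isd", "performance"))
```

Running the correlation separately per group matters for the hypotheses at stake, because a relationship present in only one age-group can be diluted or masked in the pooled sample.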

# RESULTS

# Performance Across Groups: Is the Middle-Old Age Simply Reflecting an In-Between Stage in Cognitive Aging?

Results are summarized in **Table 3** and **Figure 1**.

#### Digit Span

There was a significant group effect on performance on the Digit Span Forward component [F(2, 59) = 2.97, p = 0.05], with a significant difference between the young and the old-old group [young M = 5.38; old-old M = 4.65; t(39,41) = 2.38, p = 0.02, 95% CI (0.11 and 10.35)]. The middle-old group (M = 4.86) did not significantly differ from the other two groups (vs. young p = 0.12; vs. old-old p = 0.48).

There was also a significant group effect on performance on the Digit Span Backward component [F(2, 59) = 4.32, p = 0.01]. The young group significantly differed from the old-old group [young M = 3.76; old-old M = 3; t(39,41) = 2.61, p = 0.01, 95% CI (0.17 and 1.35)]. The middle-old group significantly differed from the young group [M = 3.14; t(39,41) = 2.10, p = 0.04, 95% CI (0.02 and 2.21)], but not from the old-old group (p = 0.55).

Significant differences obtained between the Middle-Old group and the other two groups are reported under the 'vs.' columns. t-test <sup>∗</sup>p ≤ 0.05; ∗∗p ≤ 0.01; ∗∗∗p ≤ 0.001. TMT, trail making test; DS, digit span; VF, verbal fluency; WS, word span.

FIGURE 1 | Performance of the three age-groups on different tasks. y-axis depicts the outcome of each given test, either in terms of time (A) or score (B–D). The middle-old group performs similarly to the young-group on certain tasks, while similarly to the old-old group on others. Note that for TMT B/A, the young and the middle-old group's performances overlap almost completely. <sup>∗</sup>p ≤ 0.05; ∗∗p ≤ 0.01; for visualization purposes, this figure only shows significant differences between the middle-old group and the other two groups, while differences between the young and the old-old group are omitted.

#### Trail Making Test (TMT)


There was a significant group effect on performance on the TMT-A [F(2, 59) = 17.62, p = 0.0001]. The mean scores for the three age-groups (young M = 27.67; middle-old M = 49.57; old-old M = 70.20) significantly differed from each other [young vs. middle-old t(39,41) = −3.51, p = 0.001, 95% CI (−34.49 and −9.31); young vs. old-old t(39,41) = −6.78, p = 0.0001, 95% CI (−55.20 and −29.86); middle-old vs. old-old t(39,41) = −2.37, p = 0.02, 95% CI (−38.19 and −3.06)].

There was a significant group effect on performance on the TMT-B/TMT-A ratio [F(2, 59) = 3.45, p = 0.03]. We found a significant difference between the young and the old-old group [young M = 2.13; old-old M = 3.05; t(39,41) = −2.03, p = 0.05, 95% CI (−1.8 and −0.003)]. The middle-old group (M = 2.18) did not significantly differ from the other two groups (vs. young p = 0.82; vs. old-old p = 0.07).

#### Verbal Fluency

There was no significant group effect on performance on the Phonemic Verbal Fluency task [F(2, 59) = 0.79, p = 0.45].

There was a significant group effect on performance on the Semantic Verbal Fluency task [F(2, 59) = 7.73, p = 0.001]. The young group (M = 20.76) differed from the old-old group [M = 13.80; t(39,41) = 3.94, p = 0.0001, 95% CI (3.38 and 10.53)]. The middle-old group (M = 19.10) differed from the old-old group [t(39,41) = 3.01, p = 0.005, 95% CI (1.73 and 8.85)], but not from the young group (p = 0.40).

#### Word Span

This task was designed to test whether differences in performance across groups remained stable across conditions or instead varied as a function of working-memory load. We found ceiling effects for Word Span 4 (max score = 8, young M = 7.90 and 95% scoring the maximum; middle-old M = 7.52 and 80% scoring the maximum; old-old M = 7.10 and 75% scoring the maximum). Therefore, we did not include this condition in the further analysis. There was also a ceiling effect for the young group for Word Span 5 (90% scored the maximum of 10) and so this group was omitted from the within-subjects ANOVA. The 2 (age-group) × 2 (word span loads 5 and 7) ANOVA found an overall effect of level of demand [F(1,39) = 51.37, p < 0.001] and an effect of age group [F(1,39) = 5.21, p < 0.05], but there was no statistically significant interaction between the effects of age and working-memory load condition on performance [F(1,39) = 1.76, p = 0.193].

We then wanted to test whether there were any group differences for each working memory load separately. There was no significant group effect on performance on the Word Span 4 [F(2, 59) = 2.42, p = 0.09]. This may be due to ceiling effects.

There was a significant group effect on performance on the Word Span 5 [F(2, 59) = 11.63, p = 0.0001]. The young group (M = 9.57) significantly differed from the old-old group [M = 6.67; t(39,41) = 4.89, p = 0.0001, 95% CI (1.68 and 4.05)]. The middle-old group (M = 8.52) significantly differed from the old-old group [t(39,41) = 2.80, p = 0.008, 95% CI (0.50 and 3.14)], but not from the young group (p = 0.07), although bearing in mind possible ceiling effects for the young group (max score = 10; percentages for participants scoring the maximum: young 90%, middle-old 57%, old-old 20%).

There was a significant group effect on performance on the Word Span 7 [F(2, 59) = 5.93, p = 0.004]. The young-group (M = 9.43, SD = 2.75) significantly differed from the old-old group [M = 6.15, SD = 3.10; t(39,41) = 3.58, p = 0.001, 95% CI (1.43 and 5.12)]. The middle-old group (M = 7.24, SD = 3.45) showed a significant difference from the young group [t(39,41) = 2.27, p = 0.02, 95% CI (−0.24 and −4.13)], but not from the old-old group (p = 0.30). These results do not seem to be due to ceiling effects (max score = 14; percentages for participants scoring the maximum: young 0.09%, middle-old 0.09%, old-old 0%).

Taken together, these results indicate that the middle-old group showed a mixed pattern of performance, resembling the young-group on certain tasks and the old-old group on others (see **Figure 1**).

#### Intra-Individual Variability Across Tasks: The Dispersion Index

Average dispersion index was calculated for each age group separately: the young group had a dispersion index score of 0.70 (SD = 0.30), the middle-old group of 0.71 (SD = 0.23), and the old-old group of 0.87 (SD = 0.35). Although the group differences did not reach significance (independent t-tests: young vs. middle-old p = 0.93; middle-old vs. old-old p = 0.08; young vs. old-old p = 0.10), the oldest group showed a numerically larger dispersion index. **Figure 2** shows the dispersion index plotted for the three age groups, which follows a non-linear increase with age.

We questioned whether impairment may have been associated with a specific sub-group of tests, so that a specific pattern of performance may have introduced a confound in the way variability across tasks has been computed. **Table 4** shows individual standardized scores across tests for the whole sample. Performance does not systematically drop for some measures: rather, impairment is more generally seen across the neuropsychological test battery. A factorial ANOVA was also carried out to test for the main effect of cognitive test on score. Results show no significant main effect of cognitive test [F(8, 531) = 0.01, p = 1], nor a significant age × test interaction effect [F(2, 531) = 0.42, p = 0.97].

#### TABLE 4

The table shows standardized scores for all participants in different age groups for each neuropsychological test. There is no specific pattern of one test or sub-group of tests being specifically represented with low scores in any age group. Rather there is a general variability across the neuropsychological battery. Red scores are < −1. Y, young group; MO, middle-old group; OO, old-old group; WS, word span; TMT, trail making test; DS, digit span; VF, verbal fluency.

# Comparison of the Dispersion Index Across Groups and Relation With Performance and Educational Level

Some significant correlations emerged when considering dispersion index and performance on different tasks and educational level across groups. Results are summarized in **Table 5** and significant correlations are also reported below for each group.

#### The Young Group

Significant positive correlations were found only between iSD and two cognitive measures: score on the Digit Span Backward (r = 0.66, p = 0.001) and score on the Word Span 7 (r = 0.58, p = 0.006). In other words, young adults who showed higher dispersion also showed better performance in these two tasks. Educational level did not correlate with any cognitive measures.

#### The Middle-Old Group

iSD did not correlate with any cognitive measures. Educational level negatively correlated with time on the TMT A (r = −0.54, p = 0.01) and positively correlated with performance on VF Phonemic (r = 0.65, p = 0.001), Word Span 5 (r = 0.46, p = 0.03), Word Span 7 (r = 0.48, p = 0.02), as well as overall performance, calculated as the average across z-scores of the measures of interest (r = 0.66, p = 0.001). Therefore, while the dispersion index was not associated with performance in any of the cognitive measures included in this study, higher levels of education were associated with faster and better performance in this age-group.

#### The Old-Old Group

iSD was positively correlated with time on the TMT A (r = 0.75, p = 0.0001) and negatively correlated with overall performance, calculated as the average across z-scores of the measures of interest (r = −0.50, p = 0.02). Therefore, higher dispersion was associated with a decrease in speed of processing and poorer performance in this group. Educational level was negatively correlated with time on TMT A (r = −0.60, p = 0.005) only, meaning that more years of education were associated with faster completion of the task. No other significant correlations were found between educational level and any of the cognitive measures considered in this study.

# DISCUSSION

The aim of this study was to investigate age-related differences in intra-individual variability, also known as dispersion, across fluid cognition, and to examine the potential of this index to reveal changing patterns of cognitive functioning in later life. We compared performance of young adults in their 20s, middle-old adults in their 60s and old-old adults over 75 years old on a number of tasks engaging fluid cognition. Previous studies have shown some inconsistencies in the degree of dispersion found in old age (Hultsch et al., 2000; Li et al., 2004; Rabbitt et al., 2004; Hilborn et al., 2009; Halliday et al., 2018). Dispersion has generally been associated with poor performance and advanced age-related cognitive decline (Rapp et al., 2005; Hilborn et al., 2009). Likewise, and importantly for cognitive theories of compensation (Baltes and Baltes, 1990; Ouwehand et al., 2007), measures of cognitive control have been found to be particularly involved in deployment of resources in the aging brain (Dixon and Bäckman, 1995; Glisky et al., 2001; Stuss et al., 2003; Dixon and de Frias, 2007; Davis et al., 2008; Hilborn et al., 2009; Wojtowicz et al., 2012). We asked whether age-related differences in dispersion across measures of fluid cognition reflect differences in the efficiency of recruitment of resources (cognitive control) in healthy older adults.

When considering between-group differences in dispersion index, our results failed to reach statistical significance, possibly due to small sample size or a limited number of cognitive measures used to compute the dispersion index. However, we observed a general non-linear trend showing that a higher level of dispersion was associated with the oldest group, while the young and the middle-old group exhibited very similar dispersion indices (see **Figure 2** and results). Moreover, dispersion index in the old-old group was significantly negatively correlated with overall performance, in line with previous studies (Rapp et al., 2005; Hilborn et al., 2009). In contrast, performance in the middle-old group showed a mixed pattern: middle-old adults performed better than the old-old adults and similarly to the young adults on a number of tasks, but also worse than the young adults and similarly to the old-old adults on other tasks, possibly depending on differences in cognitive demand across tasks. We argue that this reflects differences in cognitive deployment of resources in aging: specifically, compensatory mechanisms in the middle-old group may have resulted in young-equivalent dispersion across measures of fluid cognition and overall better performance compared to old-old adults.

TABLE 5 | Correlations between dispersion index and education and performance in different tasks.

Significant results are in bold. <sup>∗</sup>p ≤ 0.05; ∗∗p ≤ 0.01, ∗∗∗p ≤ 0.001. †The sign has been reversed here for consistency with the other tests, so that all correlations are presented in the same direction. iSD (intra-individual SD, an index of dispersion across all the tasks: the higher the iSD, the higher the dispersion); DS, digit span; TMT, trail making test; VF, verbal fluency; WS, word span.

Although we did not specifically test for compensatory strategies, this interpretation is consistent with the fact that higher educational level – thought to increase cognitive reserve, which in turn supports cognitive compensation (Stern, 2009, 2012) – was related to better performance in the middle-old group only. Moreover, the correlation analysis reveals that the dispersion index may be specifically associated with performance in the old-old group, but not in the other groups. The current pattern of data therefore reveals that the dispersion index can be a useful indicator of cognitive aging, in accordance with previous studies (Hultsch et al., 2002; Halliday et al., 2018), and goes beyond previous work in suggesting that it can be used to study variability in cognition in different cohorts. Importantly, the study of the relationship between dispersion index, performance and cognitive reserve has allowed the emergence of a distinct pattern in which older adults with overall better cognition uniquely benefit from greater cognitive reserve and show young-equivalent dispersion index. We will now discuss specific points which support these conclusions.

# The Middle-Old Group Showed a Distinct Pattern of Performance Compared to the Other Age-Groups

The comparison of three age groups revealed cognitive profiles specific to each age group. These profiles did not follow a linear decrease in cognitive abilities with increasing age. Although overall results are in line with a progressive neurodegeneration with aging (Raz et al., 2005) – the young and the old-old group being at the two ends of the performance distribution – a closer look at the middle-old group revealed a distinct cognitive profile associated with this cohort. Performance of the middle-old group did not simply fit a stage in-between the good performance of younger adults and the poor performance of older adults, but rather resembled the performance of each of these two cohorts in different tasks (see **Figure 1**). Specifically, while the old-old group performed significantly worse than younger adults in almost all measures, the middle-old group did not differ from young adults in a series of measures including verbal fluency and short-term memory, and performed better than the older adults in a lexical selection task (verbal fluency semantic) and in a visual search task engaging speed of processing (TMT-A). However, on a working-memory task (Digit Span backward) the middle-old group performed similarly to the old-old group and significantly worse than young adults.

The most distinct pattern of performance in the middle-old group was observed in those tasks requiring a higher level of cognitive control (TMT B/A and verbal fluency), while for tasks that are considered more "automatic," the cognitive decline across groups was more even (e.g., TMT-A, see Miyake et al., 2000 for a discussion on degree of cognitive control within executive functioning). We interpret the discrepancy in performance in the middle-old group as reflecting variations in the employment of cognitive resources in different tasks, based on task difficulty and/or resources available to compute the cognitive goal. This interpretation is consistent with previous studies showing that older adults vary in a non-linear manner, either performing well (young-equivalent) or significantly worse than their younger counterparts (e.g., Cabeza et al., 2002; Hultsch et al., 2002).

One could argue that differences across tasks are likely to reflect dissociations in cognitive modules, so that separate executive processes may undergo slightly distinct degeneration progress (Buckner, 2004; Stuss, 2011). However, when taking into account the Word Span task, this interpretation seems unlikely to explain our results: here, cognitive demand was manipulated within the same task. Results revealed some differences across conditions. While little can be said for word span 4, where ceiling effects may have prevented any age-group differences from emerging, in word span 5 the middle-old group performed as well as young adults (although again the performance range of the young adults may have been truncated due to ceiling effects) and significantly better than the old-old adults. However, at higher load (word span 7) they performed significantly worse than the young group and as poorly as the old-old group. Our interpretation – although speculative, as we did not test compensatory strategies directly – is that the middle-old group still had enough resources to compensate in some of the tasks (or conditions, in the case of word span), thus exhibiting young-equivalent performance, until a stage at which cognitive demand was too high and performance dropped. In contrast, participants in the old-old group might have reached resource ceiling at an earlier stage, resulting in worse performance than both young and middle-old adults in the word span 5 condition.

Although this is not a neuroimaging study, these findings are what may be expected behaviorally from the CRUNCH (Compensation-Related Utilization of Neural Circuits Hypothesis, Reuter-Lorenz and Cappell, 2008). This hypothesis recognizes a trade-off between compensatory potential and task demand. Consistent with the interpretation that variability in performance reflects differences in recruitment of resources and employment of compensatory strategies, Cappell et al. (2010), using a task similar to the one we designed here in their fMRI study, found that seniors exhibited dorsolateral prefrontal cortex (DLPFC) over-activation with lower memory loads despite equivalent performance accuracy across age groups. In contrast, with the highest memory load, older adults were significantly less accurate and showed less DLPFC activation compared to their younger counterparts. Likewise, at the behavioral level, we found that the middle-old group performed as well as younger adults, until a point when cognitive demand was high, and performance dropped to the level of older adults.

The counterargument would be that aging affects cognition gradually, with a linear and steady decline in cognitive abilities as people get older. A longitudinal design and/or the inclusion of more cohort groups, especially one between our young (aged 20–31) and middle-old (aged 59–71) groups, would provide further points to the function and therefore a more comprehensive study of the developmental trajectory of aging cognition. However, although the inclusion of a limited number of age groups remains a limitation of the current study, our results still show a degree of "non-linearity" – as measured cross-sectionally – in the effect of aging on cognitive performance and dispersion. The middle-old group performed better than the old-old group on a number of tasks and showed a young-equivalent dispersion index. The fact that the age gap between the middle-old and the young group is much larger than the age gap between the middle-old and the old-old group argues in favor of this non-linearity, and is consistent with other studies (Robbins et al., 1998). Moreover, our cohorts were selected based on previous longitudinal and cohort studies showing that age-related changes are almost non-existent before age 60 (Schaie, 1996; Zelinski and Burnight, 1997; Hultsch et al., 1998).

# Dispersion Is Associated With Poor Performance

The interpretation discussed in the previous section, that the distinct cognitive profile exhibited by the middle-old group reflects differences in recruitment of cognitive resources across age groups, receives further support when considering the variability within a person across tasks, namely dispersion. Our results show a general trend which is consistent with previous studies: although this only approached statistical significance, the old-old group exhibited a higher dispersion index compared to both the young and the middle-old group (see **Figure 2** and results), while also performing significantly worse than the younger counterparts (Hultsch et al., 2002; Stuss et al., 2003; Rapp et al., 2005; Hilborn et al., 2009; Halliday et al., 2018). Although the lack of a statistically significant difference between groups in terms of dispersion index limits the strength of our conclusions, we believe the current pattern of data is still informative. For example, we found that the higher the variability across executive measures, the worse the overall performance in the old-old group. In contrast, level of dispersion was not related to performance in the middle-old group, which showed a young-equivalent dispersion index. In this group, the index of cognitive reserve was a better predictor of overall performance.
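As a rough illustration of what a dispersion index captures, the sketch below assumes one common operationalization: each task is z-standardized across participants, and dispersion is the standard deviation of a participant's z-scores across tasks. This section does not restate the exact computation used in the study, so the function name `dispersion_index`, the standardization choice, and the example scores are all assumptions made for illustration.

```python
import numpy as np

def dispersion_index(scores):
    """Return one intraindividual SD per participant (assumed operationalization).

    scores: 2-D array, rows = participants, columns = tasks,
    with every task coded so that higher values mean better performance.
    """
    scores = np.asarray(scores, dtype=float)
    # Standardize each task across participants so tasks share a common scale.
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    # Dispersion = SD of a participant's z-scores across tasks.
    return z.std(axis=1, ddof=1)

# Hypothetical scores for three participants on three tasks (illustrative only).
example = np.array([
    [10.0, 20.0, 30.0],
    [12.0, 22.0, 32.0],
    [14.0, 18.0, 34.0],
])
print(dispersion_index(example))
```

In this toy example the third participant is at the top of the group on two tasks and at the bottom on the other, so their index is larger than that of the other two, whose scores are more even across tasks.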

The association between dispersion and cognitive decline is in line with theories of cognitive aging that suggest a reduction in cognitive control and inhibition of irrelevant cognitive processing (Hasher and Zacks, 1988). However, and most importantly, we showed that dispersion, an index of intraindividual variability across tasks, is not a defining feature of cognitive aging, and is, rather, specifically associated with worse performance. Compensatory mechanisms in the middle-old group may have resulted in young-equivalent dispersion as well as better cognitive outcomes compared to old-old adults (Hypothesis A, **Table 1**).

The fact that the middle-old group did not show age-related dispersion while at the same time showing young-equivalent performance on some tasks and old-old-equivalent performance on others (possibly depending on cognitive demand) suggests that the interaction between dispersion and performance can reveal age-related changes even before deficits are observed at the performance level. In other words, despite good performance in the middle-old group, underlying age-related changes are detectable by examining the relationships between reserve and performance. Although conclusions about the developmental trajectories of cognitive aging need to be drawn with caution in cross-sectional studies such as this one, results seem to suggest that healthy aging is not characterized by a linear decrease in cognitive function (Hypothesis A, **Table 1**). Differences between age groups in intra-individual variability across tasks and related cognitive performance are likely to reflect differences in cognitive resource recruitment, as we found a specific effect of cognitive reserve factors in older adults who perform better and show less dispersion (see section below).

# Cognitive Reserve Is Beneficial, but Only When We Need/Can Use It

Together with intelligence and curiosity, education is thought to be a proxy of cognitive reserve, which refers to the ability to flexibly and efficiently use available brain resources (Stern, 2002). Noticeably, the middle-old group was the only group benefitting significantly from extra cognitive reserve provided by education, while concurrently showing less dispersion and better performance (in line with Hypothesis A, **Table 1**). Notably, in those tasks where education had an impact on performance (there was a positive correlation between educational level and cognitive outcome), the (negative) relationship between dispersion index and performance was negligible. This was the case for a number of measures in the middle-old group, and also for the old-old group for one measure (TMT-A). It may be that the relatively low cognitive demand associated with TMT-A (often considered an index of simple speed of processing, Salthouse, 2011) prolonged the beneficial effect of education on performance in the old-old group, specifically for this task but not others.

Our results are in line with a cognitive model of successful aging such as the SOC model (Baltes and Baltes, 1990). According to this model, variability across individuals in terms of how 'successfully' they age may depend on resources available as well as engagement of compensatory strategies (Li et al., 2001; Lang et al., 2002; Amieva et al., 2014; Silagi et al., 2015). Accordingly, it has been shown that increasing cognitive reserve through education resulted in greater compensatory potential (Scarmeas et al., 2003).

These results are in line with a recent longitudinal analysis reporting education as a key factor determining cognitive decline in healthy aging, even more so than chronological age itself (Passos et al., 2015), until a point of advanced aging when the beneficial effect vanishes (Paulo et al., 2011). Likewise, cognitive reserve (and related compensatory effects) seems to play a role only when cognitive demand is "sufficiently" high. For Word Span 4, there is no significant relationship between performance and either education or dispersion index for any of the age groups, although ceiling effects here should prevent us from drawing any conclusions. However, when task demand increases (Word Span 5 and 7), there is a positive effect of education on performance and no association with dispersion in the middle-old group. In contrast, in the old-old group, a higher dispersion index is associated with worse performance. Notably, although not statistically significant, there is a trend showing an impact of reserve for the oldest group in overall performance, which is not present for young adults. It may be that, despite a general beneficial effect of reserve in cognitive aging, the dispersion index plays a more important role than reserve in characterizing cognition in more advanced aging (e.g., Paulo et al., 2011). Another possibility is that variability between individuals in our age groups may have altered age-related effects when comparing different cohorts, as the old-old group may include "middle-old-group-like" adults. Longitudinal studies could specifically address this question.

# Conclusion, Limitations, and Further Directions

We showed evidence that the study of the dispersion index in cognitive aging can provide a useful and powerful behavioral measure of age-related differences in cognitive deployment of resources. We found that the association between dispersion, aging and performance does not fit all age groups indiscriminately and cannot be predicted solely based on the developmental stage (Hypothesis A, **Table 1**). Specifically, greater dispersion in advanced healthy aging is associated with poor performance, possibly reflecting reduction in cognitive control (Hasher and Zacks, 1988). Non-significant differences in dispersion index between age-groups limit the strength of our conclusions. However, we showed a non-linear trend which has implications for future studies, especially cross-sectional studies, which should be aware of the differences among older sub-group populations.

We acknowledge that there may be a risk of overrepresentation of working-memory measures in the present study (Digit Span backward and Word Span task). However, working memory is one of the major components involved in age-related decline (e.g., Hasher and Zacks, 1988; Hedden and Gabrieli, 2004) and previous evidence has demonstrated that it contributes heavily to compensatory mechanisms in counteracting age-related losses (e.g., Borella et al., 2010; Karbach and Verhaeghen, 2018), more so than other executive processes. These studies would argue for a special role of working memory in the study of cognitive aging. Additionally, it seems unlikely that the inclusion of more working-memory measures affected our results. For example, our results show a significant relationship between dispersion index – as computed – and TMT B/A for the old-old group, where TMT B/A is very much a switching and updating task rather than a working memory task, thus suggesting a significant dispersion-performance link despite the weight of WM measures in the computed dispersion index.

Although some inferences and analogies can be drawn with regard to cognitive re-organization and age-related changes in resource recruitment, further studies will need to combine behavioral analyses with neuroimaging techniques to investigate cognitive aging and concurrent changes in neuromodulation. Moreover, as a cross-sectional study, these results are open to the usual threats to validity: potentially, cohort effects may have led to overestimation of the impact of age on group differences (Cozby, 2009). By limiting the age range to a maximum of 15 years and controlling for factors such as gender and educational level, an attempt was made to minimize any cohort effect. Further studies should aim to investigate dispersion in relation to performance longitudinally, as this may reveal patterns of cognitive aging which can be identified from early adolescence (Gow et al., 2012). Additionally, a larger sample size may reveal stronger and more reliable associations across performance, dispersion index and factors contributing to cognitive reserve.

#### CONCLUSION

Cognition might undergo some changes either to cope with normal aging or as an effect of it. Future studies should aim to clarify whether and how cognitive re-organization in senescence can inform our understanding of age-related changes in neuromodulation. If a link exists between cognitive dispersion and age-related changes in neural modulation, then the next challenge would be to design paradigms which are sensitive to small differences across individuals to predict and counteract severe cognitive aging. Factors such as education may contribute to prolonged performance through the employment of extra cognitive resources. Future work should investigate this relation further through neuroimaging and aim to identify additional enhancement factors to increase cognitive reserve. This would also include development of new actions and training to facilitate processes of re-organization and optimization, in order to improve the quality of healthy aging for future generations.

#### AUTHOR CONTRIBUTIONS

The study represents the undergraduate research project of SDF, supervised by CH. SDF collected the data, organized the database, performed the statistical analysis, and wrote the first draft of the manuscript. All authors contributed to the conception and design of the study, manuscript revision, and read and approved the submitted version.

#### FUNDING

The study represents the undergraduate research project of SDF and no funding was available. This work was entirely carried out as a volunteering commitment from all parties.

#### ACKNOWLEDGMENTS

We thank all the participants who volunteered to take part in this study. Thanks to staff and friends at Aston University for the stimulating discussions contributing to the final version of this work, Dr. Raffaele Nappo for helping with the access to the language databases, and SDF's grandparents for inspiring this work.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 De Felice and Holland. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# General Slowing and Education Mediate Task Switching Performance Across the Life-Span

Luca Moretti<sup>1</sup>, Carlo Semenza<sup>1,2</sup> and Antonino Vallesi<sup>1,2</sup>\*

<sup>1</sup> Padova Neuroscience Center, Department of Neuroscience, University of Padova, Padova, Italy, <sup>2</sup> IRCCS San Camillo Hospital Foundation, Venice, Italy

Objective: This study considered the potential role of both protective factors (cognitive reserve, CR) and adverse ones (general slowing) in modulating cognitive flexibility in the adult life-span.

Method: Ninety-eight individuals performed a task-switching (TS) paradigm in which we adopted a manipulation concerning the timing between the cue and the target. Working memory demands were minimized by using transparent cues. Additionally, indices of cognitive integrity, depression, processing speed and different CR dimensions were collected and used in linear models accounting for TS performance under the different time constraints.

#### Edited by:

Roberta Sellaro, Leiden University, Netherlands

#### Reviewed by:

Tilo Strobach, Medical School Hamburg, Germany Francisco Barceló, Universitat de les Illes Balears, Spain

> \*Correspondence: Antonino Vallesi antonino.vallesi@unipd.it

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 29 January 2018 Accepted: 16 April 2018 Published: 04 May 2018

#### Citation:

Moretti L, Semenza C and Vallesi A (2018) General Slowing and Education Mediate Task Switching Performance Across the Life-Span. Front. Psychol. 9:630. doi: 10.3389/fpsyg.2018.00630

Results: The main results showed similar mixing costs and higher switching costs in older adults, with an overall age-dependent effect of general slowing on these costs. The link between processing speed and TS performance was attenuated when participants had more time to prepare. Among the different CR indices, only formal education was associated with reduced switch costs under time pressure.

Discussion: Even though CR is often operationalized as a unitary construct, the present research confirms the benefits of using tools designed to distinguish between different CR dimensions. Furthermore, our results provide empirical support for the assumption that the influence of processing speed on executive performance depends on time constraints. Finally, it is suggested that whether age differences appear in terms of switch or mixing costs depends on working memory demands (which were low in our tasks with transparent cues).

Keywords: task-switching, speed of processing, transparent cue, cognitive aging, cognitive reserve

#### INTRODUCTION

Many everyday life situations require cognitive flexibility, namely the capacity to adaptively select between multiple task-sets. Task-switching (TS) paradigms are a useful tool for representing such situations within an experimental context. A growing body of research has investigated the underlying cognitive mechanisms of TS, mostly referring to two types of performance costs: the switch and the mixing costs. Switch cost is measured as the difference between a repeat and a switch trial, both in terms of Response Times (RT) and errors, whereas the mixing cost arises as the difference between a repeat and a single-task trial.
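The two costs defined above can be made concrete with a minimal sketch. It assumes mean response times as the summary statistic (the same subtraction applies to error rates); the function name `ts_costs` and the response times are hypothetical and serve only to illustrate the definitions.

```python
from statistics import mean

def ts_costs(single_rts, repeat_rts, switch_rts):
    """Return (switch_cost, mixing_cost) in the unit of the inputs (e.g., ms).

    single_rts: RTs from single-task (pure) blocks
    repeat_rts: RTs from repeat trials in mixed blocks
    switch_rts: RTs from switch trials in mixed blocks
    """
    # Switch cost: switch trials compared with repeat trials.
    switch_cost = mean(switch_rts) - mean(repeat_rts)
    # Mixing cost: repeat trials compared with single-task trials.
    mixing_cost = mean(repeat_rts) - mean(single_rts)
    return switch_cost, mixing_cost

# Hypothetical response times in ms (illustrative values only).
single = [500, 520, 510]
repeat = [600, 620, 610]
switch = [700, 690, 710]
switch_cost, mixing_cost = ts_costs(single, repeat, switch)
print(switch_cost, mixing_cost)
```

With these made-up values the switch cost is 90 ms and the mixing cost is 100 ms.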

The aging literature points to a deterioration in TS performance in the elderly, especially for the mixing cost (Kray and Lindenberger, 2000; Meiran et al., 2001; Adrover-Roig and Barceló, 2010; Lawo et al., 2012). Many accounts of the phenomenon have been put forward, some referring to mechanisms specific to TS (Kray and Lindenberger, 2000; Mayr, 2001) and others trying to integrate age-related differences across different functions, proposing a unitary account of such aging effects (e.g., Lindenberger and Baltes, 1994; Salthouse, 1996; Li and Lindenberger, 1999; Wasylyshyn et al., 2011): in these models, cognitive decline is mainly attributed to deterioration in an underlying domain, rather than considering single domains as separate entities. For this reason, they are referred to as "common cause models".

One example of these models is the processing speed theory put forward by Salthouse (1996). This model proposes that age-related differences stem from a decline in processing speed, the pace at which simple operations are performed by the cognitive system. Support for this theory mainly comes from studies using hierarchical linear models, or similar ones, to investigate the amount of variance explained by measures of speed of processing, most often consisting of scores obtained in substitution tests; through this type of analysis, similar findings have been replicated for measures of working memory (WM; Salthouse, 1992), Trail Making Test (Salthouse et al., 2000), fluid intelligence (Salthouse et al., 1998), cognitive inhibition (Salthouse and Meinz, 1995; Verhaeghen and De Meersman, 1998; also see Puccioni and Vallesi, 2012a,b), recall, reasoning and spatial abilities (Salthouse, 1993): when the variance explained by speed of processing measures is removed from the models, the relationship of more specific cognitive abilities with age is weakened. Importantly for the present study, the relationship between TS and age was also strongly attenuated when controlling for processing speed (Salthouse et al., 1998).

Even though an involvement of speed of processing in determining age-related differences in EFs therefore seems likely, it should be noted that it might not be the only factor at stake (Keys and White, 2000; Verhaeghen et al., 2006; Bugg et al., 2007). Some studies on the Stroop effect, which is consistently found to be increased in older adults, still showed a significant effect of age after controlling for general slowing (Salthouse and Meinz, 1995; Bugg et al., 2007); similar results were also obtained with a card sorting test (Bugg et al., 2007), another task presumed to rely on inhibition and set shifting. Turning to TS, a meta-analysis by Verhaeghen and Cerella (2002) confirmed an effect of age beyond general slowing, at least for TS global costs, that is, the performance difference between blocks in which the participant has to switch between tasks and blocks with only single tasks. There are thus counterexamples in which correlations with processing speed, even though significant, are of moderate magnitude and cannot fully account for age-related effects (cf., Salthouse et al., 1998).

The speed of processing theory proposes two mechanisms to be responsible for age-related slowing as a consequence of a lower speed of processing (Salthouse, 1996): (i) the limited time mechanism, according to which early operations may take too long and therefore leave no time for later, possibly more complex operations; (ii) the simultaneity mechanism, that is, the assumption that products of lower level computations should be concurrently available for later processing. Although these assumptions appear crucial on theoretical grounds, empirical testing is still scarce. To better characterize the relationship of speed of processing with TS performance, we used a cuing paradigm in which the cue-to-target interval (CTI) was varied across blocks, giving the participants more or less time to reconfigure or inhibit a previously active task-set. If limited time and simultaneity mechanisms were to operate, we should expect looser correlations with a longer CTI: by having more time to prepare for the task at hand, the impact of speed of processing on performance should indeed be reduced. On the other hand, if mixing and switch costs do not depend on limited time mechanisms, we should observe no differences in correlations with processing speed across CTI conditions. We therefore built linear models for the two CTI conditions, always including a measure of processing speed as one of the predictors, to test whether the impact of this and other variables is actually higher when temporal constraints are stricter.
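The logic of this prediction can be sketched with one ordinary least squares fit per CTI condition. This is only an illustration under simplifying assumptions: a single predictor (processing speed), noise-free hypothetical data, and a helper named `fit_ols` introduced here; the models described in the text include further predictors, and nothing below reproduces the study's data or results.

```python
import numpy as np

def fit_ols(predictors, outcome):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return coef

# Hypothetical data: processing-speed scores and switch costs (ms) at two CTIs.
speed = np.array([30.0, 40.0, 50.0, 60.0, 70.0])
cost_short_cti = 400.0 - 4.0 * speed   # strong dependence on speed
cost_long_cti = 150.0 - 1.0 * speed    # weaker dependence on speed

# One model per CTI condition, each with processing speed as a predictor.
slope_short = fit_ols(speed, cost_short_cti)[1]
slope_long = fit_ols(speed, cost_long_cti)[1]
print(slope_short, slope_long)
```

Under the limited time account, the speed coefficient would be expected to be larger in absolute value at the short CTI than at the long CTI, as in this constructed example.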

Other variables included in the models were chosen to investigate the contribution of other factors that are likely to play a major role in age-related cognitive decline. In particular, the construct of cognitive reserve (CR) has gained popularity during the last two decades (Stern, 2002, 2009; Valenzuela and Sachdev, 2006), with a number of studies demonstrating that the capacity of the cognitive system to cope with aging is also a function of previous life experiences. It is fair to say that early works mainly linked cognitive reserve to educational level, as the construct of cognitive reserve itself stems from those studies demonstrating a negative relationship between education and the severity of Alzheimer's Disease symptoms (Katzman et al., 1988; Stern et al., 1992; Katzman, 1993). Nonetheless, life experiences other than education are also often considered to mediate age-related decline in older adults, mainly concerning occupational attainment (Satz et al., 1993; Stern et al., 1994; Garibotto et al., 2008) and leisure time activities (e.g., Wilson et al., 2002).

Even though it is plausible that education, being a relatively early event in life, plays a major role in the neural and psychological development of an individual, testing different dimensions of the CR construct becomes crucial for a better development of clinical tools and theoretical understanding of the aging process. For this reason, we chose to use the Cognitive Reserve Index questionnaire (CRIq) as a measure of cognitive reserve (Nucci et al., 2012): this tool provides both an overall measure of cognitive reserve and subscales related to each of the three dimensions cited above (education, occupational attainment, leisure time). In this way, it was possible to test the role of each of the relevant activities in a standardized manner. Even though CR and speed of processing are most likely to be involved in age-related differences when it comes to TS, there are a number of other variables that have been described as important in the EF literature. As depression is fairly frequently found among older adults (e.g., Beekman et al., 1999; Noël et al., 2004), and has been associated with reduced cognitive control (e.g., Meiran et al., 2011; Vallesi et al., 2015), Beck Depression Inventory (BDI) (Beck et al., 1961) scores were used as one of the regressors in the analyses. Finally, mild cognitive impairment and dementia have been found to affect TS performance on both mixing and switch costs (e.g., Belleville et al., 2008; Schmitter-Edgecombe and Sanders, 2009). We therefore chose to administer the Montréal Cognitive Assessment (MoCA, Nasreddine et al., 2005) as a measure of general cognitive integrity, and used its scores as another regressor in our analyses. In summary, building such large models allowed us to test and control for a variety of factors that might selectively affect TS performance in aging, taking into consideration possible confounds with speed of processing effects. Moreover, it was also possible to assess an important aspect of the processing speed theory, which is rarely tested in the TS literature, namely the assumption that speed of processing is particularly relevant under high temporal constraints (e.g., Wingfield et al., 1985): in the present case, this was done by varying CTI across blocks.

#### MATERIALS AND METHODS

#### Participants

A total of 98 volunteers (45 female), ranging from 21 to 79 years of age, were recruited through the Internet and flyers. The whole experiment took place in a single session of 1.5 h at the Department of Neuroscience in Padova. Before starting the experiment, participants read and signed a consent form specifying the aim of the study and possible risks; another form was then presented in which they declared whether they had a previous or current history of neurological or psychiatric problems, whether they had taken drugs or alcohol in the last 24 h and whether they had normal or corrected-to-normal vision.

#### Test Description

All volunteers were administered 5 paper-and-pencil tests and two computerized tasks: a TS paradigm and a Sustained Attention to Response Task (SART) paradigm (not reported here)<sup>1</sup>. The sequence was fixed except that the computerized tasks were counterbalanced depending on the participant's number (see Computerized Tasks). The order was: Edinburgh Handedness Inventory, Montreal Cognitive Assessment (MoCA), computerized tasks (counterbalanced), Symbol Digit Modalities Test (SDMT) and the BDI. A final questionnaire, the CRIq, was administered only to those participants above 30 years of age, since the test is based on life-long experiences and younger individuals tend to systematically show lower scores just due to their relatively young age. Descriptive statistics of the sample are reported in **Table 1**, which shows participants divided into 3 age groups (21–30, 31–60, >60 years old), as commonly done in the aging literature (e.g., Charles et al., 2003; Bialystok et al., 2005).

TABLE 1 | Average demographic data, scores and standard deviations (in parentheses) for each experimental group.

The performed tests were Montreal Cognitive Assessment (MoCA), Symbol Digit Modalities Test (SDMT), Beck Depression Inventory (BDI), Cognitive Reserve Index for Education (CRI-S), Working Activity (CRI-L), Leisure Time (TL), and total score (CRI-tot).

#### Computerized Tasks

Two computerized tasks were implemented on E-prime and then administered to each participant: a TS paradigm and a SART paradigm. The order of administration was counterbalanced according to demographic variables such as age range (in decades), gender and years of education (in 3 ranges): those sharing these features were pseudo-randomly assigned to one of four possible sequences to obtain a roughly similar number of participants per counterbalancing order in each population layer. The TS paradigm was indeed divided into two separate blocks with short or long CTIs. Task switching blocks were always performed consecutively, and the order of presentation was counterbalanced; TS blocks could be either preceded or followed by SART thus leading to the four possible aforementioned sequences. For data collection, we used a Dell laptop computer (Intel core i5-3320M CPU; 4 GB of RAM) with Windows 7 OS. Stimuli were presented on a 15-inch color monitor with a white background. Participants performed every task with a distance of about 50 cm from the screen.
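The four administration sequences described above follow from two constraints: the two TS blocks are always consecutive, and the SART either precedes or follows them. A minimal sketch that enumerates them is given below; the labels `TS_short_CTI`, `TS_long_CTI` and `SART` and the function name are placeholders chosen for illustration, and the stratified pseudo-random assignment of participants to sequences is not modeled.

```python
from itertools import permutations

def counterbalancing_sequences():
    """Enumerate task orders with the two TS blocks kept consecutive."""
    sequences = []
    for ts_order in permutations(["TS_short_CTI", "TS_long_CTI"]):
        sequences.append(("SART",) + ts_order)   # SART before the TS blocks
        sequences.append(ts_order + ("SART",))   # SART after the TS blocks
    return sequences

for sequence in counterbalancing_sequences():
    print(sequence)
```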

# Task Switching Test

Each run of the TS paradigm was divided into five blocks: participants first performed two pure blocks consecutively, then the mixed block, and finally the two pure blocks again, presented in reverse order with respect to the first two in order to control for practice and fatigue effects. The two runs differed in CTI, which was either 100 or 1200 ms (short and long CTI, respectively). The stimuli were the letters A and E presented above or below a fixation cross.

In the pure blocks participants were asked to indicate either the identity of the letter at hand (verbal task) or its spatial location (spatial task): this was accomplished by pressing F or K on the keyboard, which were labeled as 1 or 2. In this way, the same keys were used for both the verbal and the spatial task, and the stimulus-response mapping was fixed across participants: F (labeled as 1) was to be pressed for A in the letter task, and when the letter appeared above the fixation cross in the spatial task; K (labeled as 2) had to be pressed for E in the letter task, or when the letter was below the fixation cross in the spatial task. Pure blocks began with instructions and 4 practice trials: at the end of practice, participants were free to decide whether to start the test phase or to continue with 4 extra practice trials. During practice, the experimenter monitored the responses and made sure that the participant understood the task to be performed. Moreover, during practice participants were also provided with feedback indicating accuracy and whether the response took too long: responses over 2000 ms were considered as non-responses. The practice phase with the pure block was repeated by 22 participants (on average, 4.5 extra trials each). The test phase consisted of 16 trials for each pure block: therefore, at each CTI, 32 trials were collected for each task, yielding a total of 64 pure trials per CTI condition. In the mixed block a similar procedure was adopted: this time 16 practice trials had to be performed, always with feedback; once again, participants were left free to decide whether to begin the test phase or to have another practice run: 24 participants needed at least one extra practice run (on average, 23.3 extra trials each).

<sup>1</sup>This test was a modified version of that developed by Robertson et al. (1997). In the SART, participants were asked to respond to frequent 'go' stimuli but maintain a readiness to withhold a response to rare and unpredictable no-go trials. In each trial, a single number appeared randomly in the center of the computer screen and remained for 250 ms, before being replaced by a fixation cross that lasted for 900 ms. Participants had to press the "B" key on the keyboard if the stimulus presented was any number except the number "3". The items were subdivided into a practice session of 18 trials (including 2 no-go trials) and a test session of 225 trials (including 25 no-go trials). The results of this task will be reported in future work, as they fall outside the scope of the current study.

The test consisted of 64 trials. **Figure 1** provides a graphical example of the procedure. At the beginning of each trial a cue indicating the task at hand was presented on the screen for 100 or 1200 ms depending on the run: the cues were transparent rather than arbitrary with respect to the task to be performed, as they were the words SPAZIO (Italian for "space") or LETTERA (Italian for "letter"). After this interval, the target stimulus appeared on the screen with the cue still present in its position (to minimize memory demands) and the participant had 2 s to provide a response. Once this deadline had passed, both the target and the cue disappeared for a variable inter-trial interval ranging between 500 and 1000 ms, and a new trial began. The procedures described in this study were approved by the Comitato Etico per la Sperimentazione "Azienda Ospedaliera di Padova".

# DATA ANALYSIS

The first trial of each mixed block was not considered in the analysis to avoid confounds with starting costs (Altmann, 2007). Five participants were excluded for excessive inaccuracy: to determine a threshold, a binomial analysis was carried out to identify individuals responding at chance level (<60.8% correct) in the task switching conditions. Error and post-error trials were then removed (7.8% of the total) (Rabbitt, 1966; Falkenstein et al., 2000). Next, the distribution of RT data was examined to assess normality: the distributions for each condition showed a skewness above 1, and for this reason we transformed the data with a natural logarithm (Ratcliff, 1993; Whelan, 2008). After the transformation, all distributions passed the Shapiro–Wilk test for normality. Finally, outlier trials were determined for each participant in each condition (i.e., pure block, repeat trials, switch trials for each CTI) using a cutoff of two standard deviations above or below the mean (3.8% of the total trials were excluded). Analyses were conducted using R, version 3.4.0. Mixing cost and switch cost were calculated for both RTs and accuracy as in the literature (Rogers and Monsell, 1995): Mixing cost = performance difference between repeat trials of the switching block and pure block trials; Switch cost = performance difference between switch and repeat trials of the switching block.

TABLE 2 | Multiple linear regressions with mix cost at 100 and 1200 ms CTI as DVs.

Predictor variables include age, years of education, Montreal Cognitive Assessment, Beck Depression Inventory and Symbol Digit Modalities Test scores. Both dependent and independent variables have been standardized.
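The binomial exclusion criterion mentioned in the Data Analysis section can be illustrated with a short Python sketch. The number of trials and the alpha level entering the authors' 60.8% threshold are not stated in the text, so both are free parameters here, and the function name is hypothetical.

```python
from math import comb


def chance_threshold(n_trials, alpha=0.05, p_chance=0.5):
    """Smallest proportion correct that is unlikely under pure guessing.

    Returns k / n_trials for the smallest k such that P(X >= k) < alpha,
    with X ~ Binomial(n_trials, p_chance). A value above 1 means that no
    score can be distinguished from chance at this alpha.
    """
    tail = 0.0
    threshold_k = n_trials + 1
    # Accumulate the upper tail probability from k = n_trials downwards.
    for k in range(n_trials, -1, -1):
        tail += comb(n_trials, k) * p_chance**k * (1 - p_chance) ** (n_trials - k)
        if tail >= alpha:
            break
        threshold_k = k
    return threshold_k / n_trials
```

A participant whose accuracy falls below the returned proportion would be treated as responding at chance; for instance, `chance_threshold(10)` returns 0.9, because nine or more correct responses out of ten occur with probability of about 0.011 under guessing, whereas eight or more occur with probability of about 0.055.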

A 2 × 3 mixed ANOVA was performed for each cost, with CTI as the within-subjects factor and group (21–30, 31–60, >60 years old) as the between-subjects factor. Separate ANOVAs were run for errors and RTs. For post hoc analyses, Tukey's HSD test was used. Further, multiple regressions were performed on raw data for mixing and switch cost at the different CTIs. Predictors were: age, MoCA score, years of education, BDI, and SDMT score. The latter was intended as a proxy of speed of processing. The same procedure was then used to build linear models excluding the youngest group (30 years of age or below): in this case years of education were not included in the models and were replaced with the sub-scales and total score of the CRIq.
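The trial-level preprocessing and the two cost definitions given in the Data Analysis section can be sketched as follows. This is a simplified illustration, not the authors' R code: the dictionary keys and function names are hypothetical, trials are assumed to be listed in presentation order for one participant and one CTI, and costs are assumed to be computed on log-transformed RTs.

```python
import math
from statistics import mean, stdev


def drop_error_and_post_error(trials):
    """Remove error trials and trials that immediately follow an error."""
    kept = []
    prev_correct = True
    for t in trials:
        if t["correct"] and prev_correct:
            kept.append(t)
        prev_correct = t["correct"]
    return kept


def trim_log_rts(rts_ms, n_sd=2.0):
    """Log-transform RTs and drop values beyond n_sd SDs from the cell mean."""
    logs = [math.log(rt) for rt in rts_ms]
    m, s = mean(logs), stdev(logs)
    return [x for x in logs if abs(x - m) <= n_sd * s]


def costs(trials):
    """Mixing and switch cost (log-RT units) for one participant and one CTI.

    Each trial is a dict with keys 'type' ('pure', 'repeat' or 'switch'),
    'rt' (in ms) and 'correct' (bool).
    """
    clean = drop_error_and_post_error(trials)
    cell_mean = {}
    for cond in ("pure", "repeat", "switch"):
        rts = [t["rt"] for t in clean if t["type"] == cond]
        cell_mean[cond] = mean(trim_log_rts(rts))
    return {
        "mixing": cell_mean["repeat"] - cell_mean["pure"],
        "switch": cell_mean["switch"] - cell_mean["repeat"],
    }
```

Because the costs are differences of log RTs, they correspond to log ratios: a mixing cost of log(1.25), for example, means that repeat trials in the mixed block were 25% slower than pure-block trials.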

#### RESULTS

#### ANOVAs

#### Mixing Cost

Mean RTs and accuracy for each group and CTI are reported in **Table 2** and **Figure 2**. Mixing cost in RTs decreased when more time to prepare was given to the participants, as evident from the main effect of CTI [F(1,90) = 261.57, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.744]. This effect was constant across groups, as no interaction emerged between Group and CTI [F(2,90) = 0.27, p = 0.76, η<sub>p</sub><sup>2</sup> = 0.008]. The absence of a main effect of Group [F(2,90) = 0.4, p = 0.67, η<sub>p</sub><sup>2</sup> = 0.006] indicated that the mixing cost did not significantly change with age. As far as the error analysis was concerned, results were substantially similar to those reported for RTs: once again, there was a main effect of CTI [F(1,92) = 6.94, p = 0.01, η<sub>p</sub><sup>2</sup> = 0.072] such that a longer time between cue and target was useful to correctly complete the task; no differences emerged between groups [F(2,92) = 0.785, p = 0.459, η<sub>p</sub><sup>2</sup> = 0.017] and no interaction was found between Group and CTI [F(2,92) = 1.831, p = 0.166, η<sub>p</sub><sup>2</sup> = 0.039].
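As a reading aid, the partial eta squared that accompanies each F test can be recovered from the F value and its degrees of freedom through the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The short Python sketch below applies it; the function name is illustrative, and small discrepancies with printed values are to be expected because the reported F statistics are rounded.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of CTI on the RT mixing cost, F(1,90) = 261.57
print(round(partial_eta_squared(261.57, 1, 90), 3))  # 0.744
```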

#### Switch Cost

Mean RTs and accuracy for each group and CTI are reported in **Table 3** and **Figure 3**. As for the mixing cost, a main effect of CTI [F(1,90) = 5.32, p = 0.02, η<sub>p</sub><sup>2</sup> = 0.055] was found on the RT switch cost, consistent with the literature on task switching. Critically, however, a main effect of group also reached significance [F(2,90) = 7.24, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.139]; Tukey's pairwise comparisons indicated that the oldest group showed higher switch costs than both the youngest (p < 0.001) and the middle-aged group (p = 0.036). No differences emerged between the latter two groups (p = 0.19). Finally, the CTI × Group interaction was far from significance [F(2,90) = 0.1, p = 0.9, η<sub>p</sub><sup>2</sup> = 0.002], meaning that no group gained a particular advantage from a longer preparation time in terms of switch costs. As far as errors were concerned, no significant effect was found for either CTI [F(1,92) = 0.355, p = 0.553, η<sub>p</sub><sup>2</sup> = 0.004] or Group [F(2,92) = 2.866, p = 0.062, η<sub>p</sub><sup>2</sup> = 0.06].

Predictor variables include age, years of education, Montreal Cognitive Assessment, Beck Depression Inventory and Symbol Digit Modalities Test scores. Both dependent and independent variables have been standardized.

#### Linear Models

In order to test for a predictive role of speed of processing, cognitive reserve, mood, general cognitive integrity and age, eight linear models were built: four including the whole sample, and four on a sub-population of older and middle-aged individuals only. Besides the sample, the first set of linear models differed from the second in the measures of CR: as the CRIq was not administered to the youngest group, education was the only CR index in the whole-sample models. Results are summarized in **Tables 2**–**5**. In general, nearly all models significantly predicted TS performance, but the amount of explained variance was consistently higher for the switch cost. This is in line with the ANOVA results, which indicated a main effect of group for switch cost only. As SDMT scores were negatively related to TS costs according to the models, it is conceivable that the older group, with lower SDMT scores, showed higher switch costs. Also in line with the ANOVAs is the drop in predictive power for both costs when participants had more time to prepare: since speed of processing is the most powerful predictor, the models' significance is expected to drop when time constraints are looser. Even though SDMT performance did not correlate significantly with TS costs in every condition, it did when considered alone. This may be due to shared variance between age and SDMT scores: the correlation between these two variables was fairly high (r(91) = −0.53, p < 0.001, 95% CI [−0.66, −0.362]).
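For readers who wish to check the interval around the age–SDMT correlation, a common way to obtain a confidence interval for a Pearson correlation is the Fisher z transformation. The Python sketch below assumes this method and a sample size of n = df + 2 = 93; whether the authors computed their interval in this way is not stated, and the function name is illustrative.

```python
import math


def pearson_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson r via the Fisher z transformation."""
    z = math.atanh(r)              # Fisher z of the observed correlation
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


lower, upper = pearson_ci(-0.53, 93)
print(round(lower, 2), round(upper, 2))
```

Under these assumptions the bounds come out at roughly −0.66 and −0.37, close to the reported interval; the small difference in the upper bound is compatible with rounding of r.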

Finally, education was found to correlate negatively with switch cost in the short CTI condition, whereas the mixing cost did not seem to be affected at all. In all the respects mentioned above, models built excluding the youngest group were similar: the predictive power dropped at long CTIs, and predictors accounted for more variance in the case of the switch cost, even though SDMT scores correlated significantly with both switch and mixing costs when considered alone. Furthermore, the CRIq subscale for education was still a valuable predictor of switch cost under high temporal constraints (i.e., short CTI).

However, no other cognitive reserve variable had an impact on TS performance.

# DISCUSSION

The aim of this study was twofold. On the one hand, the influence on TS performance of a number of variables related to cognitive aging was tested. To the best of our knowledge, despite a vast literature on TS and aging, no previous study had considered at the same time the role of both potentially protective and adverse factors in this frame. On the other hand, some aspects of processing speed theory have often been underestimated. Despite the theorization of a processing stream in which products of low-level operations should remain available for later processing in order to successfully perform cognitive tasks (Salthouse, 1996), few studies have tested the role of speed of processing under different temporal constraints. There is consistent evidence that cognitive aging may be partially attributable to a decline in speed of processing (Salthouse, 1996). However, high variability in the elderly population suggests that multiple factors, including life experience, may play an important role in shaping the decline of cognitive functions in aging (Stern, 2002, 2009; Vallesi, 2016).

Young adults excluded. Predictor variables include age, years of education, Montreal Cognitive Assessment, Beck Depression Inventory and Symbol Digit Modalities Test scores. Both dependent and independent variables have been standardized.

TABLE 5 | Multiple linear regressions with switch cost at 100 and 1200 ms CTI as DV.

Young adults excluded. Predictor variables include age, years of education, Montreal Cognitive Assessment, Beck Depression Inventory and Symbol Digit Modalities Test scores. Both dependent and independent variables have been standardized.

In order to better characterize these aspects in the executive domain, we implemented linear models using classical TS indices as dependent variables. Speed of processing, as operationalized with SDMT performance, was found to be consistently related to both mixing and switch costs, with a tendency to account for more variance in the latter case. In general, significance of the models was higher when the participants had little time to prepare (i.e., shorter CTI of 100 ms). This effect was driven by a drop in the predictive power of SDMT scores and age: when temporal constraints were lower (i.e., longer CTI), it is likely that the capacity to quickly carry out simple operations becomes less relevant. This is in line with Salthouse's (1996) processing speed theory. His model posits that age differences arise because the ability to carry out simple operations deteriorates over time. Performance in cognitive tasks is consequently disrupted, since the products of low-level operations are not available when subsequent higher-order operations have to be executed. These results replicate other studies in which temporal demands have been manipulated to test this possibility in other domains (Salthouse and Coon, 1993; Kliegl et al., 1994).

Furthermore, variations of SDMT weight at different CTIs imply that the weight of the speed of processing factor varies as a function of cognitive demands (Salthouse, 1992). It is to be noted, however, that correlations between processing speed and TS measures were moderate and do not fully account for between-subject variability, replicating previous results (Salthouse et al., 1998; Cepeda et al., 2001). On the other hand, age was a significant predictor of the switch cost, suggesting that age-related effects on the latter are partly independent of general slowing, consistent with the literature on EFs (Verhaeghen and Cerella, 2002; Verhaeghen et al., 2006; Bugg et al., 2007).

Summing up, it is important to recognize that, even though speed of processing measures account for a significant part of the variance in executive performance, age-related differences persist beyond this construct. On a more fine-grained analysis, the relevance of general slowing may interact with experimental manipulations such as time constraints, an often underestimated aspect in aging studies.

The role of cognitive reserve (CR) was also tested in the linear models. Contrary to our expectations, most indices of CR did not mediate TS performance. The only exception was education, which was negatively related to switch cost in the shorter CTI condition. For the interpretation of this result it might be important to underline that education was also the only CR index to correlate significantly with SDMT performance [r(91) = 0.24, p = 0.02]. It is possible to speculate that a high education level protects against speed of processing decline, in line with the literature (Cabral et al., 2016; Ihle et al., 2016). This would also be consistent with biological accounts of processing speed linking this construct to white matter integrity (Kerchner et al., 2012; Haász et al., 2013; Kuznetsova et al., 2016); moreover, the original conceptualization of brain reserve, and subsequently CR, began with studies observing the beneficial effects of high education on brain pathologies affecting white matter connectivity (Katzman et al., 1988; Katzman, 1993).

However, other CR measures did not show the same effect. It should be noted that results concerning the role of CR in age-related EF decline are ambiguous. Contrary to other domains, a considerable number of studies failed to establish a link between life experiences and better performance in EF tasks. For instance, Hindle et al. (2017) recently reported no difference between high- and low-CR Parkinson's disease (PD) patients in tasks tapping EFs. However, a similar study by the same group, also on a PD sample, revealed that education promoted various cognitive functions including EFs (Hindle et al., 2014), a pattern compatible with our data. Research on the non-pathological aging population is also inconsistent on this point. Cabral et al. (2016) found a relation between CR and EFs; however, CR was operationalized only with measures of education (years of education, school failure, school attendance) or measures strictly related to it (reading books and newspapers, number of languages spoken). When a more complex CR questionnaire is used, correlations with EFs or related functions sometimes fail to emerge (Puccioni and Vallesi, 2012b; Léon et al., 2014; Arcara et al., 2017).

Turning to the results of the ANOVAs, the finding of comparable mixing cost across groups is at odds with previous literature, as most studies find an increase in older adults (Kray and Lindenberger, 2000; Meiran et al., 2001; Adrover-Roig and Barceló, 2010; Lawo et al., 2012). However, a number of authors also report results similar to ours (DiGirolamo et al., 2001; Kray et al., 2002; West and Travers, 2008). Classically, the mixing cost has been proposed to reflect a differential load on WM between the pure and the mixed block (Los, 1996; Kray and Lindenberger, 2000): as two task representations have to be held in the second case, this might slow down performance. This view is consistent with the finding of increased mixing cost in non-cued paradigms, where the participant has to keep track of the sequence. Within this frame, an increased mixing cost in older adults is a direct consequence of WM decline in aging (Kray and Lindenberger, 2000); coherently, older adults' performance suffers the most in non-cued paradigms (Kray et al., 2002).

However, several studies cast doubts on the role of WM in mixing cost. Parametrically varying the number of task-sets does not produce any larger cost (Kramer et al., 1999; Rubin and Meiran, 2005), nor increases age-related differences (Buchler et al., 2008). As for switch cost, mixing costs are present only with bivalent stimuli (Mayr, 2001; Rubin and Meiran, 2005), namely stimuli on which both tasks may be potentially executed (as it is the case in the present study); also, introducing bivalent stimuli in one task elicits cautious responding on the other tasks performed in the same block employing univalent stimuli (Woodward et al., 2003; Metzak et al., 2013), a phenomenon which is known as the bivalency effect and is especially present in younger adults (Rey-Mermet and Meier, 2015). Finally, Mayr (2001) demonstrated that without overlap between S-R representations, mixing costs are absent: when responses were mapped differently for each task, or stimuli could not cue the competing representation, a mixing cost failed to emerge. Taken together, these results suggest that the mixing cost arises because of task conflict (Rubin and Meiran, 2005), resulting from a bottom-up activation of the competing task set: if there is no bivalency, there is no competition; on the other hand, the mixing cost would arise only in situations in which the two task-set representations overlap to some degree. The proposal of a stimulus-driven nature of the mixing cost is in line with findings of reduced mixing costs after extensive practice (Kramer et al., 1999; Buchler et al., 2008), since practice would help to reduce ambiguity.

What follows from these data is that older adults would particularly benefit from experimental manipulations aimed at reducing task ambiguity. For instance, Hirsch et al. (2016) report no age differences in mixing cost when presenting univalent stimuli. It is also possible to lower task ambiguity through the employment of strategies aiding correct retrieval of the task representation. Inner speech disruption through articulatory suppression has been consistently found to increase mixing costs in adults (Emerson and Miyake, 2003; Miyake et al., 2004; Saeki and Saito, 2009). Moreover, several studies have found an interaction between age and articulatory suppression costs, or explicit verbalization benefits (Kray et al., 2008, 2010; Chevalier and Blaye, 2009): children's and older adults' performance benefited more from verbalization, and was more disrupted by articulatory suppression, compared to young adults' performance. Supposedly, inner speech should help to form a WM representation of the task to be performed (Baddeley, 1996), or to solve task-set competition; consistent with this, cue transparency has been found to reduce mixing and switching costs (Mayr and Kliegl, 2000; Arbuthnott and Woodward, 2002; Miyake et al., 2004; Logan and Schneider, 2006; Grange and Houghton, 2009): in this case too, ambiguity about the task to be performed next is lowered. In the present experiment, transparent cues were chosen for the implementation of the TS paradigm. Indeed, the use of the Italian words for "space" and "letter" to cue the upcoming task was likely to considerably reduce the competition in task selection. In other kinds of cueing paradigms, not only does the cue fail to solve ambiguity between tasks, but it may also be particularly detrimental, since it embeds an extra processing demand, as demonstrated by the literature on restart costs (Allport and Wylie, 2000; Poljac et al., 2009), namely the cost associated with cued repetition trials vs. uncued trials. It is therefore likely that older adults, who may have more difficulties in solving the ambiguity or forming correct WM representations, may benefit when an informative cue is used.

Turning to the switch cost, it is still debated whether it arises from an ongoing inhibition of a previously active task-set (Allport et al., 1994; Wylie and Allport, 2000), from a process of task-set reconfiguration (Rogers and Monsell, 1995; Mayr and Kliegl, 2000; Monsell, 2003), or from a mixture of the two (e.g., Tarantino et al., 2016; see Vandierendonck et al., 2010 for a review). However, its increase in older individuals had already been found in a number of studies using the cueing paradigm (DiGirolamo et al., 2001; Mayr, 2001; Meiran et al., 2001; Kray et al., 2002; Reimers and Maylor, 2005; West and Travers, 2008; Adrover-Roig and Barceló, 2010). There is a vast literature on differential cognitive strategies across the life-span indicating that older adults rely more on a bottom–up, stimulus-triggered (reactive) strategy than on top–down (proactive) cognitive control, consistent with the idea of an executive deficit in aging (Braver et al., 2005; Paxton et al., 2006). The use of cognitive control itself, as classically conceptualized by Norman and Shallice (1986), is required when routines do not suffice for the correct execution of the task: the use of a transparent cue may not aid performance much on switch trials, since extra information is needed (e.g., retrieval of different S-R rules); on repeat trials, however, a clear indication of the task to be performed may solve interference with the other task, thus facilitating task selection as proposed above. In other words, while using a transparent cue may be useful for repeat trials, such a beneficial effect is usually not found when a reconfiguration is needed (i.e., in switch trials). In individuals with lower speed of processing, this might prove particularly difficult; as a consequence, a bottom–up strategy may be implemented more often in older compared to younger adults. Since the endogenous component of the switch cost, as conceptualized by Monsell (2003), would be larger in older adults, a higher switch cost is expected. In the present case, not only was the switch cost higher across CTIs for the older adults, but differences also emerged more clearly at the short CTI, where speed of processing has been found to be a more robust predictor of TS performance.

# CONCLUSION

Even though the construct of cognitive reserve has gained popularity in the last decades, it is still controversial which specific life experiences may actually protect from age-related decline in different cognitive functions (e.g., Puccioni and Vallesi, 2012b; Léon et al., 2014). The results of the present study showed that, among various proxies of cognitive reserve, only education is associated, under highly demanding time constraints, with a lower switch cost. On the other hand, common cause theories have been put forward to link cognitive decline in different functions to a general underlying cognitive domain. Among these, the speed of processing theory predicts poorer performance due to an inability to carry out simple operations at a high pace: if this were the case, we would expect speed of processing to play a more prominent role under high temporal constraints. Our results suggest that this is indeed the case: with short CTIs, our speed of processing measure predicted a significant portion of variance, while it did not with long CTIs. Finally, a pattern of increased switch cost and comparable mixing cost across the different age groups emerged, in contrast with many previous works. We suggest that the kind of cue employed (i.e., its degree of transparency) is the critical manipulation explaining the discrepancy between our results and previous studies. Although it is conceivable that the choice of cue transparency may have consequences, these have not been systematically investigated so far, leaving a methodological issue in the TS literature that our study has started to address.

# ETHICS STATEMENT

This study was carried out in accordance with the Declaration of Helsinki. All volunteers gave their written informed consent prior to participation. The protocol was approved by the Bioethical Committee of the Azienda Ospedaliera di Padova.

# AUTHOR CONTRIBUTIONS

AV designed the research project and designed the tasks. LM collected and analyzed the data. LM, CS, and AV wrote the manuscript.

# FUNDING

This work was partially funded by the European Research Council under the European Union's 7th Framework Program (FP7/2007-2013), Grant Agreement n. 313692 awarded to AV.

# ACKNOWLEDGMENTS

We thank Giacomo Caiani for help in data collection. We also thank Fondazione Istituto di Ricerca Pediatrica Città della Speranza, Padova, for hosting some of our lab facilities.

#### REFERENCES

Adrover-Roig, D., and Barceló, F. (2010). Individual differences in aging and cognitive control modulate the neural indexes of context updating and maintenance during task switching. Cortex 46, 434–450. doi: 10.1016/j.cortex.2009.09.012

Allport, A., and Wylie, G. (2000). "Task switching, stimulus-response bindings, and negative priming," in Proceedings of the 18th International Symposium on Attention and Performance: Control of Cognitive Processes XVIII, Windsor, ON, 35–70.

Allport, D. A., Styles, E. A., and Hsieh, S. (1994). "Shifting attentional set: exploring the dynamic control of tasks," in Attention and Performance XV: Conscious and Nonconscious Information Processing, eds C. Umiltà and M. Moscovitch (Cambridge, MA: MIT Press), 421–452.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Moretti, Semenza and Vallesi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# On the Reliability of Switching Costs Across Time and Domains

Kalinka Timmer<sup>1</sup>\*, Marco Calabria<sup>1</sup>, Francesca M. Branzi<sup>2</sup>, Cristina Baus<sup>1</sup> and Albert Costa<sup>1,3</sup>

<sup>1</sup> Center for Brain and Cognition, Pompeu Fabra University, Barcelona, Spain, <sup>2</sup> Neuroscience and Aphasia Research Unit, School of Biological Sciences, University of Manchester, Manchester, United Kingdom, <sup>3</sup> Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain

Bilingual speakers are suggested to use control processes to avoid linguistic interference from the unintended language. It is debated whether these bilingual language control (BLC) processes are an instantiation of the more domain-general executive control (EC) processes. Previous studies inconsistently report correlations between measures of linguistic and non-linguistic control in bilinguals. In the present study, we investigate the extent to which there is cross-talk between these two domains of control for two switch costs, namely the n-1 shift cost and the n-2 repetition cost. Also, we address an important problem, namely the reliability of the measures used to investigate cross-talk. If the reliability of a measure is low, then these measures are ill-suited to test cross-talk between domains through correlations. We asked participants to perform both a linguistic and a non-linguistic switching task at two sessions about a week apart. The results show a dissociation between the two types of switch costs. Regarding test–retest reliability, we found a stronger reliability for the n-1 shift cost compared to the n-2 repetition cost within both domains, as measured by correlations across sessions. This suggests the n-1 shift cost is more suitable to explore cross-talk of BLC and EC. Next, we do find cross-talk for the n-1 shift cost, as demonstrated by a significant cross-domain correlation. This suggests that there are at least some shared processes in the linguistic and non-linguistic task.

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Kenneth R. Paap, San Francisco State University, United States Yanping Dong, Guangdong University of Foreign Studies, China

#### \*Correspondence:

Kalinka Timmer kalinkatimmer@gmail.com

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 07 February 2018 Accepted: 31 May 2018 Published: 22 June 2018

#### Citation:

Timmer K, Calabria M, Branzi FM, Baus C and Costa A (2018) On the Reliability of Switching Costs Across Time and Domains. Front. Psychol. 9:1032. doi: 10.3389/fpsyg.2018.01032

Keywords: bilingual language control, executive control, test–retest reliability, cross-talk, switching costs

# INTRODUCTION

How do bilingual speakers control their two languages to avoid linguistic confusion? Researchers argue that this is achieved by a set of control processes labeled bilingual language control (BLC) (Green, 1998). But, what is the nature of these control processes? It is debated whether or not BLC is an instantiation of the more domain-general executive control (EC) processes (Green, 1998; Dijkstra and van Heuven, 2002; Abutalebi and Green, 2007). One of the most frequently used tasks to measure BLC and EC abilities is the switching paradigm and its switch cost measures (n-1 shift cost and n-2 repetition cost). The extent to which this cross-talk between domains is present for switch costs is inconsistent and controversial (Garbin et al., 2010; Branzi et al., 2016a; Timmer et al., 2017). Here, we report a study that explores: (a) the cross-talk between the two systems by looking at switch costs in the two domains, and (b) the reliability of two switching measures often used to explore the cross-talk of control mechanisms between BLC and EC. We argue that the reliability of these measures varies considerably and that when they are reliable, cross-talk between domains is present.

The current evidence regarding cross-talk between BLC and EC comes from four sources. First, and especially relevant for our purposes, are studies that compared the performance of bilinguals in linguistic and non-linguistic switching tasks. Most studies did not reveal a correlation of switch costs across domains (Calabria et al., 2011; Prior and Gollan, 2013; Cattaneo et al., 2015; Branzi et al., 2016a; Declerck et al., 2017; but see Declerck et al., 2015). Second, studies that compared brain activity for the switch cost in the two domains for the same bilinguals showed some degree of overlapping activation but also a contribution of different regions (De Baene et al., 2015; Weissberger et al., 2015; Branzi et al., 2016b; Timmer et al., 2017). Third, somewhat indirect evidence comes from studies comparing monolinguals and bilinguals. The hypothesis is that bilinguals have long-term practice in BLC, which can affect the performance of EC. The results are mixed: while some studies show smaller behavioral switch costs on a non-linguistic task for bilinguals than for monolinguals (Prior and Macwhinney, 2010; Prior and Gollan, 2011; Houtzager et al., 2017), many other studies did not (Hernández et al., 2013; Paap and Greenberg, 2013; Paap et al., 2017; Timmer et al., 2017; Branzi et al., 2018). Fourth, some studies show a relation between how much people switch between their languages on a daily basis and the switch cost in a non-linguistic task (Hartanto and Yang, 2016). Thus, the results regarding cross-talk for the switch cost are inconsistent and controversial.

However, one important problem when considering the above studies is the reliability of the measures used to investigate cross-talk. If the reliability of a measure is low, then the absence of cross-talk between domains is uninformative. For example, if no correlation is observed for switch costs across domains, one might be tempted to conclude that there is no cross-talk for switching abilities in the two domains. However, before drawing such a conclusion we need to know whether the switch cost measures are themselves reliable within each domain. If test–retest reliability is poor, the measures do not consistently distinguish the performance of individuals within a population. This inability to distinguish between individuals makes these measures ill-suited to detect relationships with other constructs in cross-domain correlational studies (Hedge et al., 2017). Thus, with poor test–retest reliability the result is silent about the cross-talk between domains. In this study, we test the reliability of switch costs, one of the most frequently used measures regarding cross-talk of control processing, in a linguistic and a non-linguistic switching task at two points in time with about a week in between.

This strategy has already been used in the context of EC measures (Miyake et al., 2000), where some measures of EC reached an acceptable level of reliability whereas others did not (Williams et al., 2005; Beck et al., 2011; Willoughby and Blair, 2011; Paap and Sawi, 2016; Soveri et al., 2016; Donath et al., 2017; Fernández-Marcos et al., 2017). Of interest to the present study is that the switch cost measure (i.e., n-1 shift cost) during a non-linguistic task, as an index of EC efficiency, showed a reliable effect over time (r = 0.62) (Paap and Sawi, 2016). In contrast, another type of switch cost, the n-2 repetition cost, which reflects a different process of inhibitory control (described below), did not show high reliability in non-linguistic switching tasks (between r = 0.23 and 0.44) (Pettigrew and Martin, 2016; Kowalczyk and Grange, 2017; Rey-Mermet et al., 2017). To the best of our knowledge, reliability for the language switch costs has not been tested. The present study extends the test of reliability to the language domain.

#### Description of the Tasks and Measures

We asked participants to perform both a linguistic and a non-linguistic switching task. In the linguistic switching task, participants named pictures in Catalan, Spanish, or English depending on the flag presented around the picture. In the non-linguistic switching task, participants made a decision about the color, size, or type (i.e., number/letter) of a visual stimulus depending on a visual cue presented around the stimulus.

These tasks were designed such that two types of switch costs could be measured: the n-1 shift cost and the n-2 repetition cost. The n-1 shift cost refers to the cost of switching between languages/tasks (switch trial; BA) compared to repeating the same language/task (repeat trial; AA). This cost is considered to measure people's efficiency in applying transient control (Meiran, 2010; Bobb and Wodniecka, 2013). More specifically, it reflects cue encoding, the activation of a new set of S-R rules in working memory, and the inhibition of the previous task set (Periáñez and Barceló, 2009; Timmer et al., 2017, 2018). This measure has played an important role in informing theories of language control (Green, 1998; Roelofs, 1998; Costa et al., 1999; La Heij, 2005; Finkbeiner et al., 2006) and domain-general EC (Jurado and Rosselli, 2007; Munakata et al., 2011; Diamond, 2013).

The n-2 repetition cost refers to the cost of switching into a recently performed task (performed at trial n-2) as compared to switching into a not-recently performed task. Consider a language switching task with three languages (Catalan, Spanish, and English) and the following language sequences: CBA and ABA. When participants name pictures in three different languages (CBA), each trial corresponds to a different language and hence no language repetition is present. However, in the sequence ABA the last instance (A) corresponds to the same language used two trials before. The n-2 repetition cost is calculated by comparing the performance in these two sequences, and this cost is often interpreted as reflecting cognitive processes that resolve proactive interference (Mayr and Keele, 2000; Philipp et al., 2007; Branzi et al., 2016a). Thus, previous inhibition needs to be overcome to perform the current task, suggesting this is a pure measure of inhibitory control (Mayr and Keele, 2000). While some suggest the n-2 repetition cost is a pure measure of inhibitory control (Philipp and Koch, 2009), others suggest that the n-2 repetition cost also measures factors other than inhibition, for example episodic retrieval, and is therefore not a pure measure of one process (Grange et al., 2017; Kowalczyk and Grange, 2017).

Previous evidence regarding the cross-talk for these two costs is somewhat inconsistent. For example, correlations across domains often did not reveal cross-talk for either the n-1 shift cost (Calabria et al., 2011; Cattaneo et al., 2015; Branzi et al., 2016a; but see Declerck et al., 2015, 2017) or the n-2 repetition cost (Branzi et al., 2016a). But neural evidence suggests that there are some overlapping processes and regions underlying the switch costs (De Baene et al., 2015; Weissberger et al., 2015; Branzi et al., 2016b; Timmer et al., 2017). We investigate whether the absence of cross-domain correlations is caused by the lack of test–retest reliability of the switching measures in both the linguistic and the non-linguistic task. We expect to replicate the reliability of the n-1 shift cost during non-linguistic task switching (Paap and Sawi, 2016), while we expect to find lower reliability for the n-2 repetition cost (Pettigrew and Martin, 2016; Rey-Mermet et al., 2017). We extend these findings to the linguistic domain, expecting to find similar results in the linguistic as in the non-linguistic domain.

TABLE 1 | Mean answers (and standard deviations) to self-rating proficiency and social economic background questionnaire.

<sup>a</sup>A 7-point scale, with 1 point being the lowest proficiency and 7 being the highest self-rated proficiency.

# MATERIALS AND METHODS

#### Participants

Fifty-eight Catalan-Spanish-English trilinguals from Universitat Pompeu Fabra were paid for their participation (35 females; average age: 23.3 years; SD = 4.12). They all had normal or corrected-to-normal vision, and no history of neurological impairments or language disorders. Four participants were excluded due to low accuracy on the linguistic switching task: for these participants less than 65% of trials were left because of a high error rate and many voice-key errors. One participant was excluded due to technical failure of the voice key. The final sample consisted of 53 participants (32 females; average age: 23.5 years; SD = 4.22).

All participants completed a self-rating questionnaire to assess their language proficiency and social-economic background. The language proficiency is reported in **Table 1**. The mothers' [4.1 (SE = 1.08)] and fathers' [4.0 (SE = 1.41)] education levels were measured on a 6-point scale: (1) primary school, (2) middle school, (3) high-school diploma, (4) professional training, (5) Bachelor's degree, and (6) Master's degree or Ph.D. They also performed the Superior Scale I of the Raven's Advanced Progressive Matrices to measure non-verbal intelligence. Participants had to indicate which of eight possible pieces was missing from a picture. Twelve picture items were tested (Raven et al., 1998). On average participants had a score of 9.5 (SD = 1.54) out of a maximum of 12.

#### Materials and Procedure

The experiment consisted of two sessions (test and retest) with approximately a week (5–9 days) between sessions. On both days the participants performed a linguistic and a non-linguistic switching task. The order in which the tasks were performed was counterbalanced across participants and kept the same across sessions for each participant. Each session took approximately 1.5 h, during which participants were seated individually in a quiet room with dimmed lights, approximately 1 m from the computer screen. At the first session, before starting the switching tasks, participants signed an informed consent form, filled out the language proficiency questionnaire (**Table 1**), and completed the Raven's non-verbal intelligence test. Instructions for the switching tasks were given in oral and written format. Participants were instructed to respond as quickly and accurately as possible. They received practice trials before each task.

#### Linguistic Switching Task

In the linguistic switching task, participants named eight black-and-white drawings representing nouns with non-cognate names between Catalan (L1), Spanish (L2), and English (L3) (see Branzi et al., 2016a for the stimuli). The pictures were presented one at a time. Each picture was surrounded by four cue-signs (flags in the corners of the picture) indicating the language in which the picture was to be named (four Catalan, Spanish, or English flags). All target stimuli were centered and presented in black on a white background. The speech onset latencies were measured with a voice-key. Before the experiment, participants were familiarized with the pictures and their corresponding names in the three languages to make sure they produced the correct names for the pictures.

In total, participants named 648 randomized pictures divided over six blocks. After each block participants could take a short break and start the next block when ready. Each trial started with the presentation of the cue-signs together with a tone. After a cue–stimulus interval (CSI) of 100 ms, the picture was presented in the middle of the screen while the cues also remained on the screen. Picture and cue-signs remained on the screen until a response was given. After each response a blank screen appeared before the next cue was presented for the following trial.

#### Non-linguistic Switching Task

In the non-linguistic switching task, participants made three perceptual classifications about visual stimuli. The three classifications were 'color' (red vs. blue), 'size' (small vs. big), and 'type' (letter vs. number) as used in previous studies (Philipp and Koch, 2006; Branzi et al., 2016a). Just as the flags during the linguistic task, cues were presented around the target stimulus to notify the classification to be made. For the 'color' decision the cue was a yellow square, for the 'size' decision the cue was an arrow pointing up and down, and for the 'type' decision the cue was a paragraph sign. Responses were given manually with key presses to three response keys for each hand. Note also that responses were labeled on the keyboard.

The procedure of the non-linguistic task was identical to that of the linguistic one. The only difference was that for the non-linguistic task participants received feedback on their performance (accuracy in % was shown at the end of each block), while no feedback was given on the linguistic task.

#### Data Analysis

The experimental design was the same as in previous studies investigating the n-1 shift cost and n-2 repetition cost (Philipp and Koch, 2006; Branzi et al., 2016a, 2018). Depending on the two preceding trials, the current trial n was allocated to one of three conditions (CAA, CBA, and ABA). Each language (Catalan, Spanish, and English) or task (color, size, and type) was assigned a letter (A, B, or C). Within each of the three conditions the last letter refers to the current trial n. For example, for the CAA condition trial n is A and is preceded at trial n-1 by another A. Given that the preceding trial is identical to the current one, this is an n-1 repetition condition. For the CBA condition, trial n (A) is preceded at trial n-1 by a B and at trial n-2 by a C. Both preceding trials are different from A and the condition is therefore considered an n-2 switch condition. For the ABA condition, trial n (A) is preceded at n-1 by B, a different trial, but at n-2 by A, a repetition of trial n. Here, at n-2 there is a repetition of trial n and therefore the condition is called n-2 repetition.

The two effects are calculated by comparing two conditions. The n-1 shift cost is the difference between the RTs from CAA (n-1 repetition) and CBA (n-2 switch) conditions. The n-2 repetition cost is the difference between the RTs of the CBA (n-2 switch) and the ABA (n-2 repetition) conditions.
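The condition assignment and the two difference scores described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names `classify_trials` and `switch_costs`, the example sequence, and the latencies are hypothetical, and the rule that any trial not matching CAA, CBA, or ABA is left unlabeled is an assumption made for illustration.

```python
from statistics import mean

def classify_trials(tasks):
    """Label each trial as CAA, CBA, ABA, or None (hypothetical helper)."""
    labels = [None, None]  # the first two trials have no n-2 history
    for i in range(2, len(tasks)):
        n2, n1, n = tasks[i - 2], tasks[i - 1], tasks[i]
        if n1 == n and n2 != n:
            labels.append("CAA")   # n-1 repetition
        elif n1 != n and n2 == n:
            labels.append("ABA")   # n-2 repetition
        elif n1 != n and n2 != n and n1 != n2:
            labels.append("CBA")   # n-2 switch
        else:
            labels.append(None)    # e.g., the trial following a repetition
    return labels

def switch_costs(rts, labels):
    """Return both costs in ms; assumes every condition has at least one trial."""
    by_condition = {"CAA": [], "CBA": [], "ABA": []}
    for rt, label in zip(rts, labels):
        if label is not None:
            by_condition[label].append(rt)
    m = {cond: mean(values) for cond, values in by_condition.items()}
    return {
        "n1_shift": m["CBA"] - m["CAA"],       # switch minus repeat
        "n2_repetition": m["ABA"] - m["CBA"],  # n-2 repetition minus n-2 switch
    }

# Made-up sequence of tasks/languages and made-up latencies (ms)
tasks = list("CBACAABCBC")
rts = [900, 950, 1100, 1120, 1150, 1060, 1000, 1140, 1130, 1140]
labels = classify_trials(tasks)
costs = switch_costs(rts, labels)
print(labels)
print(costs)  # {'n1_shift': 60, 'n2_repetition': 20}
```

Leaving unmatched trials unlabeled mirrors the removal of the first two trials of a block and of the trial following a repetition, as described in the text.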

Importantly, it has been suggested that the n-2 repetition cost can be eliminated when the number of trials in all three conditions is equal. However, when the number of n-1 repetition trials is greatly reduced in comparison to the other two conditions, the n-2 repetition cost is present (see Philipp and Koch, 2006). As we are interested in both costs, we reduced the number of n-1 repetition trials. Both the n-2 repetition trials (ABA) and the n-2 switch trials (CBA) each occurred on approximately 39% of the trials, while the n-1 repetition trials (CAA) occurred on approximately 11% of the trials. The sum does not add up to 100% because the first two trials of each block were removed, as well as the trial after a repetition trial (CAA). See Branzi et al. (2016a, 2018) for the same procedure and further details.

Non-linguistic and linguistic switching data were analyzed separately with a repeated-measures ANOVA that included the within-subject factors Session (test vs. retest) and Trial type (CAA vs. CBA vs. ABA). This was followed by correlational analyses to examine, among other things, the test–retest reliability between sessions. Bonferroni corrections for multiple comparisons were applied when necessary.

# RESULTS

We first present the response latency analyses separately for the two tasks (**Table 2**) and then the correlations within and between tasks (**Figures 1**–**4**).

#### Linguistic Switching Task

Outliers were discarded from the analysis [naming latencies longer than 5,000 ms (0.2% of the data) and latencies that deviated 2.5 SD from the average per participant per condition (4.1% of the data)]. In addition, voice-key errors (1.5% of the data) and incorrect responses (4.0% of the data) were also discarded. The first two trials after an error were also removed (7.2% of the data), as the Trial type (CAA, CBA, or ABA) could not be determined until two trials after an error. In the analysis a total of 83.5% of the trials was included at test and 82.7% at retest. No differences in accuracy were observed between Sessions or Trial types.
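The latency-based exclusion can be sketched as follows for the latencies of one participant in one condition. The function name `trim_latencies` is hypothetical, and two details are assumptions rather than facts from the text: that the absolute 5,000 ms cutoff is applied before the 2.5 SD criterion, and that the sample standard deviation is used.

```python
from statistics import mean, stdev

def trim_latencies(rts, max_rt=5000.0, sd_cutoff=2.5):
    """Keep latencies for one participant in one condition (hypothetical helper).

    Assumed order of steps: absolute cutoff first, then the SD criterion.
    """
    kept = [rt for rt in rts if rt <= max_rt]
    if len(kept) < 2:
        return kept  # a standard deviation needs at least two values
    m, s = mean(kept), stdev(kept)  # sample SD (assumption)
    return [rt for rt in kept if abs(rt - m) <= sd_cutoff * s]

# Made-up latencies (ms): 6000 exceeds the absolute cutoff and
# 2000 lies more than 2.5 SD above the mean of the remaining values.
rts = [1000, 1020, 980, 1010, 990, 1005, 995, 1015, 985, 2000, 6000]
print(trim_latencies(rts))
```

Applying the criteria in a different order, or with a population rather than sample SD, would remove a slightly different set of trials, so the reported percentages cannot be reproduced from this sketch alone.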

Naming latencies at retest were 61 ms faster than at test [F(1,52) = 15.75, MSe = 18666.82, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.233]. The main effect of Trial type was also significant [F(1,104) = 37.24, MSe = 5949.21, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.417], revealing the presence of both an n-1 shift cost and an n-2 repetition cost. The n-1 shift cost was reflected by faster latencies for CAA than CBA trials [respectively, 1,068 ms and 1,117 ms; F(1,52) = 27.61, MSe = 9371.87, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.347]. The n-2 repetition cost was reflected by faster latencies for CBA than ABA trials [respectively, 1,117 ms and 1,138 ms; F(1,52) = 26.91, MSe = 1669.11, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.341]. There was no interaction between Session and Trial type (F < 1).

#### Non-linguistic Switching Task

The same criteria to remove outliers and errors were used as in the linguistic task (latencies longer than 5,000 ms: 0.5% of the data; 2.5 SD outliers: 3.6%; errors: 2.6%; two trials after an error: 4.6%). In the analysis a total of 88% of the trials was included at test and 89.4% at retest. No differences in accuracy were observed between Sessions or Trial types.

Response latencies at retest were 257 ms faster than at test [F(1,52) = 182.49, MSe = 28886.72, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.778]. The main effect of Trial type was also significant [F(1,104) = 33.45, MSe = 6387.06, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.391], revealing the presence of both an n-1 shift cost and an n-2 repetition cost. The n-1 shift cost was reflected by faster latencies for CAA than CBA trials [respectively, 959 ms and 993 ms; F(1,52) = 8.74, MSe = 13517.09, p < 0.005, η<sub>p</sub><sup>2</sup> = 0.144]. The n-2 repetition cost was reflected by faster latencies for CBA than ABA trials [respectively, 993 ms and 1,037 ms; F(1,52) = 48.69, MSe = 4230.65, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.484]. However, Session and Trial type interacted [F(1,104) = 8.59, MSe = 2926.78, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.142]. This showed that the size of the n-1 shift cost decreased significantly from test to retest [respectively, 54 ms and 13 ms; t(52) = 2.87, SE = 14.29, p < 0.01]. In contrast, the n-2 repetition cost did not decrease significantly over testing sessions [50 ms and 38 ms; t(52) = 1.24, SE = 9.96, ns].

TABLE 2 | Mean response latencies in ms (and standard error) for the linguistic and non-linguistic switching tasks for each Trial type by Session, as well as the magnitude of the n-1 shift cost and the n-2 repetition cost in ms.

FIGURE 1 | Correlations for the n-1 shift cost, based on the proportional costs, between test and retest for (A) linguistic and (B) non-linguistic switching task.

#### Correlations

For the correlations we calculated a proportional cost for each of the switch costs to avoid problems of differences between tasks in speed of responding. The switch cost was divided by the average RT of the involved trials and multiplied by a hundred.<sup>1</sup>
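The proportional cost can be written out as a short Python sketch. The function name `proportional_cost` is hypothetical, and taking "the average RT of the involved trials" to be the mean of the two condition means is an assumption. The inputs below are the group means reported for the two tasks, used only for illustration; in the study the cost would be computed per participant.

```python
def proportional_cost(rt_baseline, rt_costly):
    """Cost as a percentage of the mean RT of the two conditions (assumed definition)."""
    cost = rt_costly - rt_baseline
    average_rt = (rt_baseline + rt_costly) / 2
    return cost / average_rt * 100

# n-1 shift cost: CAA (baseline) vs. CBA, group means in ms
print(round(proportional_cost(1068, 1117), 2))  # linguistic task: 4.49
print(round(proportional_cost(959, 993), 2))    # non-linguistic task: 3.48

# n-2 repetition cost: CBA (baseline) vs. ABA, group means in ms
print(round(proportional_cost(1117, 1138), 2))  # linguistic task: 1.86
```

Expressing the cost relative to overall speed puts the slower naming task and the faster manual task on a comparable scale before they are correlated.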

To investigate whether the n-1 shift and n-2 repetition costs are consistent over time we correlated [intraclass correlation (ICC); also named Cronbach's alpha] each of these costs between test and retest. The n-1 shift cost revealed a positive correlation between test and retest for both the linguistic (r = 0.739, p < 0.001; see **Figure 1A**) and non-linguistic switching tasks (r = 0.573, p < 0.001; see **Figure 1B**). The n-2 repetition cost also revealed a test–retest correlation for both the linguistic (r = 0.384, p < 0.05; see **Figure 2A**) and non-linguistic switching tasks (r = 0.399, p < 0.05; see **Figure 2B**).

<sup>1</sup>Note that the correlational analyses based on the original mixing and switch costs show a similar pattern of results as the analyses based on the proportional costs reported here, and the same conclusions would be drawn as described in the text.

To investigate whether the n-1 shift and n-2 repetition costs are consistent across domains we correlated (Pearson's coefficient) each of these costs between the linguistic and non-linguistic switching tasks. The n-1 shift cost revealed a positive correlation across domains at both test (r = 0.347, p < 0.05; see **Figure 3A**) and retest (r = 0.272, p < 0.05; see **Figure 3B**). In contrast, the n-2 repetition cost did not reveal correlations across domains at either test (r = 0.116, ns; see **Figure 4A**) or retest (r = 0.015, ns; see **Figure 4B**).
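For readers who want to see the cross-domain computation spelled out, the following is a minimal sketch of Pearson's coefficient in plain Python. The function name `pearson_r` and the per-participant values are hypothetical and do not come from the study; the sketch also does not compute the ICC used for the test–retest analysis.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariance_sum = sum(a * b for a, b in zip(dx, dy))
    return covariance_sum / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Made-up proportional n-1 shift costs (%) for five participants
linguistic = [1.0, 2.0, 3.0, 4.0, 5.0]
non_linguistic = [2.0, 1.0, 4.0, 3.0, 5.0]
print(round(pearson_r(linguistic, non_linguistic), 3))  # 0.8
```

In practice each list would hold one proportional cost per participant for one task and session, so that the correlation indexes whether individuals with larger costs in one domain also show larger costs in the other.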

# DISCUSSION

We explored the test–retest reliability of linguistic and non-linguistic switch costs (n-1 shift and the n-2 repetition cost), as well as the presence/absence of cross-talk between the two cognitive control domains for both switch costs. Participants performed switching tasks in both domains at two sessions, approximately a week apart. The current study revealed a dissociation between the two types of switch costs (n-1 shift cost and n-2 repetition cost) regarding their test–retest reliability and the cross-talk between domains. The test–retest reliability for the n-1 shift cost was quite high as the correlation between sessions indicates, both in the linguistic and non-linguistic tasks. However, this reliability was much lower for the n-2 repetition cost. This pattern indicates that the n-1 shift cost is more stable across time than the n-2 repetition cost, and consequently the former is more suitable to explore whether there are correlations across domains that would suggest shared constructs. Cross-talk between the two domains was present for the n-1 shift cost as demonstrated by a cross-domain correlation. This suggests that there are at least some shared processes in the linguistic and non-linguistic task.

We looked at the correlations of the proportional switch costs instead of the mean RTs as the latter will often show high reliability due to overlapping processes (e.g., perceiving the visual stimulus and the motor processing of pressing a button or making a vocal response) in the RTs of switch and repeat trials (Declerck et al., 2015). In addition, the difference scores reflect a specific process within switching paradigms (Miller and Ulrich, 2013). The present study revealed weaker test–retest correlation for the n-2 repetition cost than the n-1 shift cost. Thus, in the present study the n-2 repetition cost does not rank individuals consistently, either due to high error variance or due to low between-subjects variance. The inability to distinguish between individuals makes this measure ill-suited to investigate shared constructs across domains (Hedge et al., 2017). To conclude, when investigating questions of cross-talk with correlational paradigms it is advisable to use the n-1 shift cost and be careful with the use of the n-2 repetition cost.

In addition to increased error variance and low between-subjects variance, practice effects can also diminish the test–retest reliability of a measure (Paap and Sawi, 2016). Performance on a simple choice RT task improves over time in speed and accuracy. Some part of this practice effect is removed by a short practice at the beginning of the experiment; however, there is still a practice effect across testing days. While both switch costs were present at first testing in both domains, the n-1 shift cost decreased from test to retest for the non-linguistic task but not for the linguistic task. Thus, these differential practice effects, which depend on the domain in which the switching paradigm was conducted, can diminish the reliability over time.

While the n-1 shift cost showed good test–retest reliability, the n-2 repetition cost only showed a weak reliability for both domains. This is in line with previous studies investigating the n-2 repetition cost in the non-linguistic domain (Pettigrew and Martin, 2016; Kowalczyk and Grange, 2017; Rey-Mermet et al., 2017). Therefore, no conclusion can be drawn about the convergent validity across domains for the n-2 repetition cost. In light of low reliability, we did not find a cross-domain correlation either for the n-2 repetition cost, in line with Branzi et al. (2016a). The absence of reliability could be due to the fact that some have suggested that the n-2 repetition cost is not a pure measure of one process, inhibition, but is also influenced by factors like episodic memory (Grange et al., 2017; Kowalczyk and Grange, 2017). If this cost arises due to a mixture of underlying measures it is not strange that the reliability is low. In addition, it has been suggested that this measure might have different underlying processes in each domain and that these processes do not vary in the same way across the two tasks. For example, the linguistic task showed variations in the mechanisms of the n-2 repetition cost depending on which of the three different languages was used (Babcock and Vallesi, 2015). This shows that this measure is more complex than assumed within the linguistic domain and does not have a direct relation to the non-linguistic domain. Thus, the present study shows that the reliability of the n-2 repetition cost is weak over time and therefore no conclusions can be drawn on whether there is cross-talk across domains for this measure.

For the n-1 shift cost, there was strong test–retest reliability and we also found a cross-domain correlation, suggesting that the mechanisms underlying the n-1 shift cost share at least some processes in the linguistic and non-linguistic task. Note that a correlation of 0.6 is often considered to reflect a good reliability within the literature (Landis and Koch, 1977; Cicchetti and Sparrow, 1981); however, there are no definitive guidelines on how to interpret correlational values (Crocker and Algina, 1986). Not all studies showed a relation between the linguistic and non-linguistic task for this measure (Calabria et al., 2011; Prior and Gollan, 2013; Calabria et al., 2015; Cattaneo et al., 2015; Branzi et al., 2016a; Declerck et al., 2017). This could potentially be due to a couple of reasons. First, the test–retest reliability observed for this switch cost in each domain limits the correlation that can be observed between them. While we have strong reliability for the n-1 shift cost, there is always some measurement error, and the cross-domain correlation that can be observed is bounded by the test–retest reliability of both measures. The magnitude of the cross-domain correlation is attenuated by measurement error of both measures. This can have an impact on theoretical conclusions, where non-significant correlational results are interpreted as an absence of shared constructs across domains, though there might be shared constructs that are not picked up due to high measurement errors (Hedge et al., 2017).
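The attenuation argument can be made concrete with the classical correction for attenuation from test theory (Spearman's formula), under which the observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. This is a standard psychometric formula applied here for illustration and not an analysis reported in the study; the function names are hypothetical, and treating the reported test–retest correlations as reliability estimates is an assumption.

```python
from math import sqrt

def max_observable_r(reliability_x, reliability_y):
    """Upper bound on the observed correlation given two reliabilities."""
    return sqrt(reliability_x * reliability_y)

def disattenuated_r(observed_r, reliability_x, reliability_y):
    """Estimated correlation after correcting for measurement error."""
    return observed_r / max_observable_r(reliability_x, reliability_y)

# n-1 shift cost: reported test-retest correlations used as reliabilities (assumption)
rel_linguistic, rel_non_linguistic = 0.739, 0.573

bound = max_observable_r(rel_linguistic, rel_non_linguistic)
corrected = disattenuated_r(0.347, rel_linguistic, rel_non_linguistic)
print(round(bound, 2))      # about 0.65
print(round(corrected, 2))  # about 0.53
```

Under these assumptions, even two measures sharing all of their true variance could not correlate above roughly 0.65, so the observed cross-domain correlation of 0.347 at test is compatible with a larger underlying relation.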

Second, we investigated the n-1 shift cost together with the n-2 repetition cost. To show effects on the latter cost, the number of repeat trials was greatly reduced compared to the other trial types (Philipp and Koch, 2006), while previous studies had an equal number of switch and repeat trials (but see Branzi et al., 2016a). This could have changed the mechanism measured by the n-1 shift cost, as participants might use a different strategy within such a set-up of trials.

Third, other paradigms have suggested there is some but not full overlap across domains for the switch cost (Green, 1998; Dijkstra and van Heuven, 2002; Abutalebi and Green, 2007; Grainger et al., 2010; Declerck et al., 2015). For example, studies that directly compared brain activity in linguistic versus non-linguistic switching tasks found only some overlapping areas to be activated (De Baene et al., 2015; Weissberger et al., 2015; Branzi et al., 2016b), and electrophysiological comparisons only showed the P3 but not the N2 component to overlap for the switch cost (Timmer et al., 2017). This suggests that the switch cost reflects a multitude of underlying processes that may differ to a certain extent depending on the domain (Declerck et al., 2015). For example, the stimuli used in each task are often different (e.g., pictures versus alphabetic and numerical representations). Also, the modality of response is different (oral naming vs. categorization). The difference in the response set is important on two points. First, manual responses are more diverse than speech responses. For speech production there is only one output through the vocal tract, while manual responses involve completely different responses (e.g., left vs. right hand response). In the present study the oral response is in one of three languages, but the manual response is one of six buttons. Second, the underlying processes that accumulate to an oral or manual response develop differently over time. Competing representations and responses start diverging at a later point in time for speech production than for manual responses, as has been observed with ERPs, and impact the behavioral responses differentially (Tillman and Wiens, 2011; Acheson et al., 2012; Timmer and Chen, 2017). The final performance (size of the switch cost) of an individual is affected by all sub-processes: those shared between tasks and those that have differential contributions (Declerck et al., 2017). Therefore, it is possible that most studies do not reveal a correlation due to the sub-processes that differ, making it difficult to detect the contribution of possible shared processes. But due to the common sub-processes the correlation might sometimes be present regardless of the variation in other sub-processes. However, a conclusion of some shared sub-processes needs to be taken with caution.

# CONCLUSION

To conclude, test–retest reliability for the n-1 shift cost is strong in both the non-linguistic and the linguistic domain; therefore, the n-1 shift cost is stable and can be used to test convergent validity across domains. In contrast, the reliability for the n-2 repetition cost, which measures a different process, was weaker; therefore, the n-2 repetition cost should be used with caution when investigating correlations regarding cross-talk. While the n-1 shift cost seems to have at least some shared processes in the linguistic and non-linguistic domain, no conclusions can be drawn regarding the n-2 repetition cost.

# ETHICS STATEMENT

The study was approved by the ethical committee board at Universitat Pompeu Fabra. All participants were adults aged 18 or more. At the beginning of the experimental session participants signed an informed consent form that stated a description of the experiment and stressed that the participant is free to leave the experiment at any time without providing any explanation to the experimenter. If the participant wants to proceed, they sign the consent form and the experiment commences.

# AUTHOR CONTRIBUTIONS

All authors substantially contributed to the conception or design of the manuscript, interpretation of the data for the manuscript, and revising the manuscript critically. KT and FB contributed to the data acquisition. KT contributed to the analysis of the data and critical drafting and revising of the manuscript. All authors are in agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

# FUNDING

The research leading to these results has received funding from the Dutch Organization for Scientific Research (NWO) with grant 446-14-006 and from the Ministerio de Economía, Industria y Competitividad (MINECO) with the Juan de la Cierva grant (IJCI-2016-28564) awarded to KT. This work was also supported by grants from the Spanish government (PSI2014-52181-P and PSI2017-87784-R), the Catalan government (SGR 2009-1521 and 2017 SGR 268), the La Marató de TV3 Foundation (373/C/2014), and the European Research Council under the European Community's Seventh Framework (FP7/2007–2013 Cooperation Grant Agreement 613465-AThEME). Finally, MC was supported by the postdoctoral Ramón y Cajal fellowship (RYC-2013-14013) and FB was supported by the postdoctoral Marie Sklodowska-Curie fellowship (658341).

#### REFERENCES




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Timmer, Calabria, Branzi, Baus and Costa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Beyond Domain-Specific Expertise: Neural Signatures of Face and Spatial Working Memory in Baduk (Go Game) Experts

Wi Hoon Jung<sup>1</sup>, Tae Young Lee<sup>2</sup>, Youngwoo B. Yoon<sup>3,4</sup>, Chi-Hoon Choi<sup>5</sup> and Jun Soo Kwon<sup>2,3,6</sup>\*

<sup>1</sup> Department of Psychology, Korea University, Seoul, South Korea, <sup>2</sup> Institute of Human Behavioral Medicine, Medical Research Center, Seoul National University, Seoul, South Korea, <sup>3</sup> Brain and Cognitive Sciences, College of Natural Sciences, Seoul National University, Seoul, South Korea, <sup>4</sup> Department of Psychiatry, Washington University in St. Louis, St. Louis, MO, United States, <sup>5</sup> Department of Radiology, Chungbuk National University Hospital, Cheongju, South Korea, <sup>6</sup> Department of Psychiatry, College of Medicine, Seoul National University, Seoul, South Korea

Recent advances in neuroimaging methodology and artificial intelligence have renewed interest in board games like chess and Baduk (called the game of Go in the West) and have provided clues as to the mechanisms behind the games. However, an interesting question that remains to be answered is whether board game expertise, as a cognitive skill, goes beyond just being good at the trained game and how it maps onto networks associated with cognitive abilities that are not directly trained. To address this issue, we examined functional activity and connectivity in Baduk experts, compared to novices, while they performed a visual n-back working memory (WM) task. We found that experts, compared to novices, had greater activation in superior parietal cortex during face WM, though there were no group differences in behavioral performance. Using a data-driven, whole-brain multivariate approach, we also found significant group differences in the multivariate pattern of connectivity in frontal pole and inferior parietal cortex, further showing greater connectivity between frontal and parietal regions and between frontal and temporal regions in experts. Our findings suggest that long-term training in Baduk is associated with a reorganization of functional interactions between brain regions, even for untrained cognitive abilities.

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Quanying Liu, California Institute of Technology, United States Minghao Dong, Xidian University, China Vincenza Tarantino, Università degli Studi di Padova, Italy

#### \*Correspondence:

Jun Soo Kwon kwonjs@snu.ac.kr

Received: 27 February 2018 Accepted: 23 July 2018 Published: 07 August 2018

#### Citation:

Jung WH, Lee TY, Yoon YB, Choi C-H and Kwon JS (2018) Beyond Domain-Specific Expertise: Neural Signatures of Face and Spatial Working Memory in Baduk (Go Game) Experts. Front. Hum. Neurosci. 12:319. doi: 10.3389/fnhum.2018.00319

Keywords: board game, connectome, frontoparietal network, functional connectivity, holistic processing

# INTRODUCTION

People have very different levels of cognitive ability, from profound impairments to superior skills. However, while our understanding of neural mechanisms of cognitive impairments has been greatly enhanced via neuroimaging studies for general and cognitive-impaired individuals, those of superior skills still remain poorly understood.

Baduk, as it is called in Korea (known as the game of Go in the West<sup>1</sup>), is an abstract strategy board game like chess; it is played on a square board with a 19 by 19 grid of lines. Its rules are simple: two players, one playing with white stones and the other with black ones, take turns placing a stone, each trying to capture more territory on the board than the opponent by surrounding the opponent's stones (**Figure 1A**). Despite such simple rules, Baduk is considered more complex than chess owing to its enormous branching factor (i.e., the enormous number of move choices available) (Keene and Levy, 1991).

<sup>1</sup>http://english.Baduk.or.kr


Board games like Baduk and chess have been commonly used to investigate the mechanisms underlying cognitive expertise, as playing involves diverse high-level cognitive functions such as visuospatial processing, decision making, and attention (Chase and Simon, 1973b; Gobet and Charness, 2006). For example, decades of neuroimaging studies have identified multiple brain regions engaged during board game play, including dorsolateral prefrontal cortex, premotor cortex, and occipitotemporal and parietal cortices (Chen et al., 2003; Atherton et al., 2003; Itoh et al., 2008). Additionally, researchers have explicitly investigated neural correlates of specific cognitive components closely related to board games by using a variety of domain-specific tasks employing board game-related stimuli; these studies have revealed brain regions associated with object and pattern recognition on the board, such as the occipitotemporal junction, fusiform face area (FFA), and collateral sulcus (Bilalić et al., 2010, 2011), with intuitive best next-move generation on the board, such as the caudate nucleus (Wan et al., 2011, 2012), and with intuitive strategy decision making on the board, such as different parts of the cingulate cortex (Wan et al., 2015). Some of these regions, particularly the occipitotemporal junction, FFA, and caudate nucleus, also showed significant group differences in structural morphology as well as structural and functional connectivity (FC) at rest between board game experts (BEs) and novices (Lee et al., 2010; Duan et al., 2012a,b, 2014; Jung et al., 2013; Hänggi et al., 2014). Interestingly, a recent meta-analysis of functional neuroimaging studies of long-term expertise has suggested common mechanisms across different cognitive domains, showing enhanced or additional activity in the brains of experts compared to novices (Neumann et al., 2016). In particular, the meta-analysis of studies with visual stimulation showed that experts had enhanced activation in inferior parietal cortex and lingual gyrus.

Despite the efforts mentioned above, an interesting question that remains to be answered is whether board game expertise goes beyond just being good at the trained game. In other words, do behavioral and neural differences between BEs and novices exist in untrained cognitive abilities? Although some studies have investigated the effects of chess instruction on untrained tasks (Sala and Gobet, 2016), it is still unclear whether board game expertise transfers to other cognitive skills. Moreover, most functional neuroimaging studies comparing BEs and novices so far have focused on regions associated with visual expertise, such as the FFA and occipitotemporal and occipitoparietal areas, using both trained and untrained stimuli (e.g., chess-related objects and positions, faces, rooms, and tools) (Bilalić et al., 2010, 2011; Bilalić, 2016), rather than on functional networks associated with other specific cognitive abilities.

Working memory (WM) engages the frontoparietal network (Owen et al., 2005). In particular, object and spatial WM may be among the cognitive abilities that can potentially be enhanced through board game training, given their important role in the game; in the case of Baduk, to win the game, players need to remember the positions of stones on the board and to hold in WM several future offensive moves and the opponent's expected responses to each of them. Several studies have tested brain activity during a 1-back WM task in chess experts compared to novices using chessboards and face or scene stimuli (Bilalić et al., 2011; Krawczyk et al., 2011; Bartlett et al., 2013). These studies reported consistent behavioral results, showing no group differences in behavioral performance except for board-specific stimuli, but inconsistent neural results, showing either an increase in BEs (Bilalić et al., 2011) or no group difference (Krawczyk et al., 2011) in FFA activation in response to chessboards. These previous studies using the 1-back task investigated neural activity in response to the recognition of trained and untrained stimuli rather than WM in BEs. Additionally, no study has tested functional coupling between brain regions during WM in BEs.

Here we examined the functional activity and connectivity of Baduk experts, compared to novices, while they performed a visual n-back WM task with both object and spatial WM conditions. In particular, we chose to use neutral face stimuli for the task because, like board games, face discrimination and recognition rely at least partly on holistic processing and reliably activate the FFA (Kanwisher et al., 1997), a region proposed to be a general visual expertise module rather than face-specific (Gauthier et al., 1999, 2000). That is, using face stimuli allows us to test whether there are group differences in FFA function (Bilalić et al., 2011; Krawczyk et al., 2011; Bartlett et al., 2013). We also used a recently introduced FC technique, multivariate distance-based matrix regression (MDMR), as part of a connectome-wide association study (CWAS) to identify brain regions showing group differences in the connectome, i.e., whole-brain FC patterns (Shehzad et al., 2014). It is a fully data-driven, whole-brain multivariate analytic approach, which provides a more comprehensive characterization of brain-behavior relationships than the mass-univariate approach. In the present study, the MDMR-based CWAS allowed us to evaluate the overall multivariate patterns of FC associated with a phenotype (group: BEs vs. novices) at each voxel while controlling for confounding variables (age, sex, and in-scanner motion). Based on previous findings, we hypothesized that BEs, compared to novices, would show increased functional activity and connectivity, particularly in the frontoparietal network, during the n-back WM task.

# MATERIALS AND METHODS

# Participants

Seventeen BEs who had participated in Baduk training since childhood were recruited from the Korea Baduk Association<sup>2</sup>. Seventeen novices, matched with the BEs on age, sex, and IQ, were also recruited through online advertisements for comparison. Based on a simple screening questionnaire, none of the participants was an expert in any game, except Baduk for the BEs. All participants were also right-handed and had no history of psychiatric or neurological disease. This dataset included resting-state fMRI, n-back WM task-based fMRI, T1-weighted anatomical MRI, and diffusion tensor imaging. Previous reports from this dataset have concerned differences in anatomical connectivity (Lee et al., 2010) and in gray matter volume (GMV) and resting-state FC (Jung et al., 2013) between BEs and novices. The procedures of this study were approved by the Institutional Review Board of Seoul National University Hospital and written informed consent was obtained from all participants, including parental consent for those younger than 18 years of age. All methods were performed in accordance with the approved guidelines and regulations.

Here we analyzed fMRI data obtained during the n-back WM task. Four participants (2 BEs and 2 novices) out of 34 were excluded from analyses due to excessive missing trials during the task. The final sample consisted of 15 BEs (mean age 17.3 years, range 16–20 years; mean training duration 12.6 years) and 15 novices (mean age 17.0 years, range 15–19 years). Our sample size corresponded to samples used in previous behavioral (Bilalić et al., 2008, 2009) and neuroimaging studies with BEs (Bilalić et al., 2011; Bilalić, 2016). Despite the relatively small sample size, the direct comparison between experts and novices can provide power to capture the effects of interest.

#### Task

Participants performed a block-designed n-back WM task including both face matching and spatial location matching conditions (**Figure 1B**). The task had four WM loading conditions: 0-back, 1-back, 2-back, and 3-back. The 0-back served as a control condition, in which participants responded with a button press to a predetermined target stimulus. For the 1-, 2-, and 3-back face matching WM conditions, participants responded if the current face was identical to the face presented one, two, or three trials earlier, respectively. For the n-back spatial location matching WM conditions, participants responded if the current face was in the same location as the face presented n trials earlier, regardless of face identity. During the task, participants responded by pressing a button with their right index finger. The face stimuli consisted of 20 gray-scale pictures of neutral faces (10 Korean men and 10 Korean women), selected from stimuli used in our previous studies (Shin et al., 2008, 2015). The stimuli appeared in 35 different spatial positions on the screen during both the face identity and spatial tasks.

The task consisted of four runs with 48 blocks in total (12 blocks per run), resulting in six blocks for each of the four loading conditions in each of the face and spatial WM tasks. Blocks were presented pseudo-randomly in order of increasing (or decreasing) memory load. Each block included 10 trials (23.4 s), and blocks were separated by resting blocks of 4 TRs (9.36 s). Visual instructions (2.34 s) preceded each block to indicate the upcoming condition. On each trial, a face stimulus was presented for 1500 ms, followed by 840 ms of fixation. Before scanning, each participant practiced the task to learn its rules.
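The timing figures above follow directly from the trial structure and the repetition time. As a minimal sketch (the constant and function names are illustrative and not part of the original study materials), the arithmetic can be checked as follows:

```python
# Illustrative check of the block-design timing reported in the text.
# All names here are hypothetical; only the numbers come from the article.
TR_S = 2.34              # repetition time in seconds
STIMULUS_MS = 1500       # face presentation per trial
FIXATION_MS = 840        # fixation following each face
TRIALS_PER_BLOCK = 10
REST_TRS = 4             # resting block length in TRs
RUNS = 4
BLOCKS_PER_RUN = 12
TASK_TYPES = 2           # face and spatial WM
LOAD_LEVELS = 4          # 0-, 1-, 2-, and 3-back


def trial_duration_s():
    """Duration of one trial (stimulus plus fixation) in seconds."""
    return (STIMULUS_MS + FIXATION_MS) / 1000.0


def block_duration_s():
    """Duration of one task block in seconds."""
    return TRIALS_PER_BLOCK * trial_duration_s()


def rest_duration_s():
    """Duration of one resting block in seconds."""
    return REST_TRS * TR_S


def blocks_per_condition():
    """Number of blocks for each load level within each task type."""
    return (RUNS * BLOCKS_PER_RUN) // (TASK_TYPES * LOAD_LEVELS)


if __name__ == "__main__":
    print(trial_duration_s(), block_duration_s(),
          rest_duration_s(), blocks_per_condition())
```

One trial lasts exactly one TR, which is why block and rest durations are whole multiples of 2.34 s.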

#### Image Acquisition

All image data were acquired on a 1.5T Siemens Avanto MRI scanner. While participants performed the task, fMRI data were obtained via a gradient echo-planar image pulse sequence (repetition time (TR)/echo time (TE) = 2340/52 ms, flip angle (FA) = 90°, field of view (FOV) = 220 mm, voxel size = 3.44 mm × 3.44 mm × 5 mm). High-resolution anatomical images were acquired with a T1-weighted 3-D MPRAGE sequence (TR/TE = 1160/4.76 ms, FA = 15°, FOV = 230 mm, matrix size = 256 × 256). Other image parameters that are not related to the present study are not described herein.

# Conventional fMRI Analysis

Image analysis was performed with SPM12<sup>3</sup>. For each participant, after discarding the first three images in each run, images were corrected for slice timing, realigned to the first volume, and co-registered with each participant's structural image. There were no significant group differences in the head motion parameters. All images were then normalized to the MNI space using the normalization parameters estimated from the structural MRI. The normalized images were smoothed with a 6 mm FWHM Gaussian kernel. First-level analyses were performed using the general linear model (GLM) with regressors for the cue, 0-back, 1-back, 2-back, and 3-back conditions for each type of face and spatial WM in addition to six head motion parameters and a constant term. The regressors were modeled as boxcar functions convolved with a canonical hemodynamic response function for the length of each condition. To delineate brain regions activated during WM for each group and determine regions showing group differences, one-sample t-tests and two-sample t-tests were, respectively, performed for the contrast images of the 1-, 2-, 3-back versus 0-back conditions for each type of face and spatial WM. To further explore a group by WM load interaction for each type of face and spatial WM, we also performed a two (group) by three (WM load) analysis of variance (ANOVA) using the following contrast images: 1-back > 0-back, 2-back > 0-back, and 3-back > 0-back. The results were corrected for multiple comparisons to a significance level of p < 0.05 (height threshold of p < 0.001, uncorrected, combined with extent threshold of p < 0.05, family-wise error [FWE]-corrected).

<sup>2</sup>http://english.Baduk.or.kr/

<sup>3</sup>www.fil.ion.ucl.ac.uk/spm

# MDMR-Based CWAS Analysis

The MDMR-based CWAS method has been described in detail elsewhere (Shehzad et al., 2014; Satterthwaite et al., 2015). For MDMR-based CWAS analysis, the preprocessed image data were down-sampled to 4-mm isotropic voxels for the purpose of computational feasibility. Next, the first two data-points (4.68 s) from every task block were excluded to account for the hemodynamic delay, and all volumes of the same task type (i.e., face or spatial n-back WM) were then concatenated across load levels, except for 0-back, for each participant. We then performed the MDMR-based CWAS analysis according to the following three steps using connectir, an R package for CWAS<sup>4</sup>. First, we computed FC (Pearson correlation coefficient) between the time series of a given voxel and those of every other voxel within the gray matter mask including cortical and subcortical areas. Second, we evaluated the overall multivariate pattern of FC for a given voxel by calculating a distance metric between every pair of participants' FC maps for that voxel. To calculate the distance metric, we used $\sqrt{2(1 - \gamma)}$, where γ is the Pearson correlation between the two FC maps, resulting in a non-negative value that indicates how similar/different each pair is (0 = perfectly correlated, 2 = perfectly negatively correlated). Third, MDMR was used to test how well a phenotypic variable explains the distances between participants calculated in the second step. To examine group differences in FC between BEs and novices while controlling for confounding variables, modeled variables included group (BEs versus novices), age, sex, and head motion indexed by mean framewise displacement (Power et al., 2012).
For each voxel's FC pattern, MDMR yielded a pseudo-F statistic, analogous to the F statistic from a standard ANOVA model, by comparing the sum of squared distances between BEs and novices to the residual sum of squared distances (the error term); its significance (i.e., p-value) was assessed using 5,000 iterations of a permutation test. All these steps

# Follow-Up Seed-Based Connectivity Analysis

Although MDMR-based CWAS identifies clusters where group differences are present based on multivariate patterns of FC, it does not provide specific connections and direction of observed clusters (Shehzad et al., 2014). Accordingly, we performed post hoc seed-based FC analyses for each cluster identified by MDMR-based CWAS. Seed-based FC maps were generated by Pearson correlation between time series of each seed cluster and those of every other voxel and then Fisher r-to-z transformed. Two-sample t-tests were conducted to examine group differences in z-transformed seed-based FC maps. Statistical significance was set at a voxel-level FWE-corrected p < 0.05.
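The seed-based step described above (Pearson correlation followed by the Fisher r-to-z transform) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: time series are plain lists, the function names are hypothetical, correlations are clipped slightly inside (−1, 1) so the transform stays finite, and the group-level t-tests and FWE correction performed in the study are not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length time series.

    Assumes both series have non-zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r, eps=1e-7):
    """Fisher r-to-z transform, z = atanh(r).

    r is clipped to (-1 + eps, 1 - eps); the clipping is an assumption
    made here to avoid infinite values, not a step stated in the article.
    """
    r = max(-1.0 + eps, min(1.0 - eps, r))
    return math.atanh(r)


def seed_fc_map(seed_ts, voxel_ts_list):
    """z-transformed FC between a seed time series and each voxel series."""
    return [fisher_z(pearson(seed_ts, ts)) for ts in voxel_ts_list]
```

Each participant's z-map would then enter a two-sample comparison between groups, as described in the text.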

# Region-of-Interest (ROI) Analysis

Once regions showing significant effects from the aforementioned analyses were detected, we further conducted partial correlation analyses (controlling for age and sex) between brain measures (i.e., neural activity measured as the percent signal change extracted by MarsBar toolbox<sup>5</sup> or the strength of FC) in the identified regions as functional ROI and behavioral performances during the task and training duration in BEs.

# RESULTS

# Demographic and Behavioral Data

**Table 1** summarizes demographic information and behavioral performances during the n-back WM task in both BEs and novices. There were no significant group differences in age, sex, education, and IQ (all ps > 0.05). The two groups did not differ significantly in the accuracy and the reaction time (RT) during the face and spatial n-back WM (all ps > 0.05).

# Group Differences in Neural Activity

Both BEs and novices showed similar activation patterns in the frontal and parietal regions during both face and spatial n-back WM (**Figure 2A** and **Table 2**). However, between-group comparisons revealed greater activation in the left superior parietal cortex (SPC) in BEs than novices during face WM at cluster-level family-wise error (FWE)-corrected p < 0.05 (**Figure 2B** and **Table 2**). The post hoc ROI analysis to further characterize the group difference in the region showed less SPC activation in BEs for the face 0-back control condition (t-/p-value = −1.904/0.067; Cohen's d = −0.694), albeit only marginally significant, than novices but not for the face WM loading condition (t-/p-value = 0.292/0.773; Cohen's d = 0.107).
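As a rough consistency check on the effect sizes quoted above, Cohen's d for two independent groups can be recovered from the t statistic and the group sizes. The sketch below assumes an equal-variance independent-samples t-test with 15 participants per group; the function name is hypothetical, and the results are only expected to approximate the reported values because the published statistics are rounded.

```python
import math


def cohens_d_from_t(t_value, n1, n2):
    """Convert an independent-samples t statistic to Cohen's d.

    Uses d = t * sqrt(1/n1 + 1/n2), which holds for the pooled-variance
    (equal-variance) two-sample t-test.
    """
    return t_value * math.sqrt(1.0 / n1 + 1.0 / n2)


if __name__ == "__main__":
    # t-values reported for left SPC percent signal change (15 per group).
    d_control = cohens_d_from_t(-1.904, 15, 15)  # face 0-back control
    d_load = cohens_d_from_t(0.292, 15, 15)      # face WM load conditions
    print(round(d_control, 3), round(d_load, 3))
```

With these inputs the conversion lands within a few thousandths of the reported d values of −0.694 and 0.107.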

were repeated for every gray matter voxel to produce a whole-brain p-value map. The p-value map was converted to a z-value map for multiple-comparison correction. As in Satterthwaite et al. (2016), the z-map was thresholded at a voxel height of z > 2.326 and a corrected probability of p < 0.05 using 10,000 Monte Carlo simulations.
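The MDMR-based CWAS steps described in the Methods (distance metric, pseudo-F statistic, permutation test, and p-to-z conversion) can be illustrated with a minimal Python sketch. It is not the connectir implementation: all function names are hypothetical, only a single group factor is modeled (the covariates age, sex, and head motion are omitted), and the pseudo-F uses the standard one-factor distance-based formulation, which is assumed here to match the group term of the full model.

```python
import math
import random
from itertools import combinations
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fc_distance(fc_a, fc_b):
    """Distance sqrt(2 * (1 - r)) between two participants' FC maps."""
    r = pearson(fc_a, fc_b)
    return math.sqrt(max(0.0, 2.0 * (1.0 - r)))


def pseudo_f(dist, labels):
    """One-factor pseudo-F from a participant-by-participant distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_between = ss_total - ss_within
    a = len(groups)
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permutation_p(dist, labels, n_perm=5000, seed=0):
    """Observed pseudo-F and its permutation p-value (group labels shuffled)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


def p_to_z(p):
    """Convert a one-tailed p-value to a z-value (p clipped away from 0 and 1)."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return NormalDist().inv_cdf(1.0 - p)
```

In a full analysis these functions would be applied once per gray matter voxel, and the resulting z-map thresholded at the cluster level; under this conversion, the voxel height of z > 2.326 corresponds to a one-tailed p of about 0.01.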

<sup>4</sup>http://czarrar.github.io/connectir/

<sup>5</sup>http://marsbar.sourceforge.net/



Values in this table are presented as mean (standard deviation). Independent t-tests were used for statistical analyses of all variables except sex (male or female). <sup>a</sup>A chi-square test was used. <sup>b</sup>Overall accuracies and reaction times were estimated across 1-, 2-, and 3-back conditions for each type of face and spatial n-back. IQ, intelligence quotient. Participants' IQs were estimated by Korean–Wechsler Adult Intelligence Scale-Revised (K-WAIS-R).

No significant correlations were found between neural activities in the SPC and behavioral performances (accuracy and RT) during the task and training durations in BEs (all ps > 0.05). There were no significant group differences in any other regions and other task conditions and no group by WM load interactions (all ps > 0.05).

#### Brain Regions Identified by CWAS

Multivariate distance-based matrix regression-based CWAS analyses revealed two regions where the multivariate patterns of FC differ between BEs and novices at cluster-level corrected p < 0.05; one is the left frontal pole (FP; peak MNI x, y, z coordinates = −16, 60, 0) for face WM condition and the other is the left inferior parietal cortex (IPC; x, y, z = −52, −52, 44) for spatial WM condition (**Figure 3A**).

# Group Differences in Seed-to-Voxel Connectivity

To further characterize the FC of the regions identified by MDMR-based CWAS, we performed post hoc seed-based FC analysis for each of these identified clusters (**Figure 3B**). For the face WM condition, BEs compared to novices had greater FC between the left FP seed and the right FFA (x, y, z = 44, −48, −20; t-/z-values = 5.68/4.59), right supramarginal cortex (SMC; x, y, z = 60, −48, 28; t-/z-values = 5.68/4.60), and left middle temporal cortex (MTC; x, y, z = −60, −36, 0; t-/z-values = 6.69/5.13) adjacent to the superior temporal sulcus (STS). For the spatial WM condition, BEs compared to novices had greater FC between the left IPC seed and left lateral frontal cortex (LFC; x, y, z = −56, 28, 16; t-/z-values = 7.10/5.33). However, the strengths of FC between these regions had no significant correlations with behavioral performance during the task or with training duration in BEs.

# DISCUSSION

To address the question of whether BEs, individuals with cognitive expertise that includes the highest level of domain-specific pattern recognition, differ from novices in untrained cognitive functions in terms of behavioral performance and brain function, we explored the brain function of Baduk (the game of Go) experts while they performed n-back WM tasks. Despite no behavioral differences in task performance, BEs compared to novices showed greater SPC activation during the face n-back task. Significant differences between BEs and novices were also found in the multivariate patterns of FC in the left FP and IPC for the face and spatial WM conditions, respectively, further showing greater functional coupling between frontal regions and parietal and temporal regions in BEs compared to novices.

The present study demonstrates that BEs with long-term training do not show an increase in WM ability but have disparate functional neural patterns. Consistent with our results, the same pattern of the absence of far transfer occurs in different types of training, including chess, music, and video game training (for a brief review, see Sala and Gobet, 2017a). For example, previous behavioral investigations have examined the correlates of expert performance (Chase and Simon, 1973a,b; Sala and Gobet, 2017b) and the effects of chess instruction on untrained tasks (Sala and Gobet, 2016). Some recent studies have reported the skill effect in the recall of meaningless domain-specific material (e.g., shuffled chess positions) (Gobet and Simon, 1996a,b; Sala and Gobet, 2017b) that contradicts the earlier claim for the lack of that skill effect (Chase and Simon, 1973b). However, the skill effect with meaningless material observed in experts is accounted for by meaningful chunks that occur in the position by chance, rather than superior cognitive function (Sala and Gobet, 2017b). A large number of the studies showing the effects of chess instruction on

activation maps were visualized with the BrainNet Viewer toolbox (Xia et al., 2013). (B) When comparing activation maps between experts and novices, experts showed greater activation in left superior parietal cortex (SPC) for the contrast of the face working memory load (1-, 2-, and 3-back) conditions versus the 0-back control condition. Percent signal change (PSC) was extracted for each participant and condition using MarsBar toolbox (http://marsbar.sourceforge.net/). The effect size was calculated by Cohen's d to provide the standardized mean difference between the two groups, independent of sample size. The plot of mean PSC shows that group difference in the SPC region is caused by less activation in experts compared to novices for the 0-back condition (Cohen's d = –0.694), whereas the PSCs of the working memory load condition for two groups are similar (Cohen's d = 0.107).

TABLE 2 | Brain regions significantly activated in experts and novices during working memory tasks and the cluster with a significant difference between groups.


All results presented at height threshold p < 0.001, uncorrected, and cluster-extent threshold p < 0.05, family-wise error corrected. For each region of activation, MNI (x, y, z) coordinates and t-/z-values are given in reference to the maximally activated voxel within each cluster. SPC, superior parietal cortex; IPC, inferior parietal cortex; PFC, prefrontal cortex.

compared to novices had greater connectivity between the left frontal pole seed and right supramarginal cortex (RSMC) (Cohen's d = 1.444), right fusiform cortex (RFFA) (Cohen's d = 2.084), and left middle temporal cortex (LMTC) (Cohen's d = 2.052) during the face n-back and greater connectivity between the left inferior parietal cortex seed and left lateral frontal cortex (LLFC) (Cohen's d = 2.887) during the spatial n-back. Note that these bar graphs are presented for visualization purposes only.

academic achievement (e.g., mathematics and literacy) suffered from the problem of confounding due to overall poor design (e.g., the lack of control groups and no random assignment to groups; Sala and Gobet, 2016). It is thus suggested that engaging in intellectually demanding activities modifies the brain but that the benefits are domain-specific rather than extending to untrained cognitive abilities. Our findings from previous and present studies fit well with this pattern, showing differences in brain structure and function between BEs and novices, including structural connectivity (Lee et al., 2010), structural morphology and resting-state FC (Jung et al., 2013).

All BEs included in the present study have trained for 12.60 ± 1.55 years since their childhood. Achieving superior ability in one domain requires long periods of deliberate practice, which is known as the "10-year (or 10,000-h) rule" (Ericsson et al., 1993). However, deliberate practice is necessary but not sufficient to account for individual differences between experts and novices in music, sports, education, and board games (Campitelli and Gobet, 2011; Macnamara et al., 2014; Hambrick et al., 2018). Genetic predisposition and general intelligence may have more impact on ability than practice (Mosing et al., 2014, 2016; Sala et al., 2017). For example, more intelligent people tend to engage and excel in intellectually demanding activities such as chess (Burgoyne et al., 2016; Sala et al., 2017). In this regard, if general intelligence is controlled for (in our case, by the IQ-matching of the two groups), differences between experts and novices in an untrained cognitive ability such as WM may disappear. Our behavioral results, which show no group differences, support that idea. According to the theories that explain the domain-specificity of the effects of training, BEs use their knowledge structures of Baduk positions in long-term memory, called chunks and templates, as encoding and retrieval strategies, with less WM resources (Chase and Simon, 1973a; Gobet et al., 2016). However, given that the chunks (domain-specific information) are the building blocks of one's expertise, such information is not transferable across domains. Therefore, our behavioral results may reflect that the chunks of experts are invalid for untrained tasks.

In the present study, both groups showed similar activation patterns in the frontoparietal areas and matched task performance for both face and spatial WM conditions. However, when comparing neural activity between groups, BEs compared to novices had greater SPC activation during face WM, specifically for the contrast of the face 1-, 2-, and 3-back versus 0-back control conditions. The post hoc analysis revealed that this activation difference was caused by less SPC activation in BEs, compared to novices, for the 0-back condition. The SPC is known to be involved in the top-down allocation of visual spatial attention (Giesbrecht et al., 2003; Shomstein, 2012). Given that the 0-back condition requires sustained attention/vigilance or recognition of a pre-specified target rather than WM (Owen et al., 2005) and that in this condition BEs had less SPC activation, it is conceivable that visual or attention processing, rather than WM, might be different between the two groups. For example, through long-term training, BEs may use fewer attentional resources for perception and recognition of visual stimuli, including faces, that require at least partly holistic processing, as board games do (Guida et al., 2012, 2013). One speculated mechanism underlying the decreases of neural activity and GMV in experts is the usage-dependent selective elimination of synapses (Huttenlocher and Dabholkar, 1997; Takeuchi et al., 2011). Another possible explanation for our neural findings is that the neural activation patterns observed do not necessarily represent training-induced changes in untrained tasks because their effects are very likely to be domain-specific. A recent longitudinal study has found no effects of commercial web-based cognitive training on brain activity and behavioral performance during decision-making as untrained tasks (Kable et al., 2017). As mentioned above, the IQ-matching of the two groups may eliminate differences in untrained cognitive abilities.
This may be why various studies have found training-related neural patterns even in the absence of transfer effects on cognitive ability.

Expertise in board game playing may be associated with changes in FC between brain regions, rather than in regional neural activity, that allow domain-specific problems to be solved efficiently (Duan et al., 2012b; Jung et al., 2013), and this may further affect the functional brain networks associated with specific cognitive functions (e.g., the fronto-parietal network associated with WM). Consistent with this expectation, we found significant group differences in the multivariate patterns of FC in the left FP and IPC during face and spatial WM, respectively, using a new data-driven multivariate approach called MDMR-based CWAS (Shehzad et al., 2014). The post hoc seed-based FC analyses of these clusters further revealed group differences in FC between specific brain regions: compared to novices, BEs had greater FC between the left FP seed and several temporal and parietal areas, including the left MTC and right SMC and FFA, during face WM, and greater FC between the left IPC seed and the left LFC during spatial WM. Neuroimaging studies have consistently reported the co-activation of multiple frontal and parietal regions during spatial attention (LaBar et al., 1999; Ptak, 2012), WM (LaBar et al., 1999; Owen et al., 2005), and fluid reasoning tasks (Lee et al., 2006; Hampshire et al., 2011), suggesting the involvement of the frontoparietal network in these cognitive functions. A meta-analysis of functional neuroimaging studies using n-back tasks reported that the FP is one of the regions consistently activated across all n-back studies and suggested that the FP plays an important role in coordinating information processing and information transfer between multiple operations when a problem requires two or more separate cognitive operations rather than one discrete cognitive process (Owen et al., 2005). The IPC, FFA, and MTG are higher-order visual areas.
The IPC is involved in spatial perception as part of the dorsal visual stream, whereas the FFA and MTG are involved in object and motion perception as part of the ventral visual stream (Ungerleider and Haxby, 1994). In particular, the FFA is considered a general visual expertise module that mediates automatic holistic processing of any highly familiar visual stimuli, not only faces (Gauthier et al., 1999, 2000), while the posterior part of the MTC adjacent to the STS mediates object and face recognition (Hein and Knight, 2008; Bilalić et al., 2010). Recent studies using chess-related tasks with chess experts have demonstrated that the MTC and FFA are related to object and pattern recognition for chess pieces and positions, respectively (Bilalić et al., 2010, 2011; Bilalić, 2016), and that the IPC is involved in an active search for patterns or chunks when experts process distorted structures in their trained domain, such as random chessboards (Bartlett et al., 2013). Based on these findings and on the roles of these regions in the cognitive components involved in playing board games, altered FC between these regions in BEs may be associated with visual expertise acquired through long-term training. Our results may reflect a functional reorganization of BEs' brains that increases the strength of FC between frontal and parietal regions for spatial WM, or that adds new functional interactions between regions in the network and other regions, including the FFA and MTC, for face WM, which requires holistic processing.

The present study had some limitations to be addressed in future research. First, the relatively small sample size and scanning on a 1.5 T magnet may have resulted in low statistical power and limited spatial resolution, respectively. However, our sample size was comparable to those used in previous studies examining differences in brain function between BEs and novices (Bilalić et al., 2011; Bilalić, 2016). Given that neural data are not always consistent and that interest in replication in psychology is increasing, future research with larger samples is needed to confirm the reliability of the present findings. Second, because of the cross-sectional nature of this study, it is unclear whether the differences in brain function we found were directly caused by Baduk training or were pre-existing group differences that predict whether or not a person takes up Baduk. Third, considering previous studies showing an interaction between resting-state activity and stimulus-induced activity (Northoff et al., 2010; Fransson et al., 2018) and significant differences in resting-state activity between experts and novices (Jung et al., 2013; Dong et al., 2014, 2015), it is possible that different resting-state activity patterns between the two groups underlie the activity differences during the task state. Further research with both resting-state and task-state fMRI will help to clarify this issue. Finally, we used WM tasks and found neural differences between experts and novices. Our findings therefore raise questions to be explored in future research. Are the aspects of brain function in which we identified differences associated only with WM tasks, or are they also associated with domain-specific cognitive skills (e.g., recall of Baduk positions)? Future longitudinal studies with measures of both trained and untrained tasks are needed to address this issue.

To our knowledge, this is the first study to examine whether functional activity and connectivity differ between BEs and novices while they perform a standard n-back WM task with both face and spatial WM conditions, a task associated with the frontoparietal network, unlike previous studies that tested domain-specific pattern recognition. Despite no behavioral differences, greater SPC activation in BEs compared to novices was observed during face WM. We also found altered connectivity in the FP and IPC in BEs in terms of multivariate patterns of FC using a new data-driven multivariate FC approach, and further observed greater FC between frontal and parietal and temporal regions in BEs during WM. Our results provide novel insights into the mechanism behind Baduk expertise beyond domain-specific cognitive ability and provide evidence for differences in brain circuits associated with WM ability between experts and novices.

# AUTHOR CONTRIBUTIONS

WJ and JK designed and supervised the research. WJ, YY, and C-HC performed the experiments and analyzed data. WJ, YY, and TL wrote the manuscript. All authors reviewed the manuscript.

# FUNDING

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (Grant No. 2016R1E1A1A02921618). This work was also supported by Korea University Grant to WJ.

# ACKNOWLEDGMENTS

We thank Baduk experts for participating, the Korea Baduk Association for supporting this study, Dr. Zarrar Shehzad for development of the Connector toolbox, and researchers for helpful comments on the earlier draft of the manuscript.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer VT and handling Editor declared their shared affiliation.

Copyright © 2018 Jung, Lee, Yoon, Choi and Kwon. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Monitoring Processes in Visual Search Enhanced by Professional Experience: The Case of Orange Quality-Control Workers

Antonino Visalli<sup>1</sup>\* and Antonino Vallesi<sup>1,2</sup>

<sup>1</sup> Department of Neuroscience, University of Padova, Padua, Italy, <sup>2</sup> San Camillo Hospital IRCCS, Venice, Italy

Visual search tasks have often been used to investigate how cognitive processes change with expertise. Several studies have shown visual experts' advantages in detecting objects related to their expertise. Here, we tried to extend these findings by investigating whether professional search experience could boost top-down monitoring processes involved in visual search, independently of advantages specific to objects of expertise. To this aim, we recruited a group of quality-control workers employed in citrus farms. Given the specific features of this type of job, we expected that the extensive employment of monitoring mechanisms during orange selection could enhance these mechanisms even in search situations in which orange-related expertise does not apply. To test this hypothesis, we compared the performance of our experimental group and of a well-matched control group on a computerized visual search task. In one block the target was an orange (expertise target), while in the other block the target was a Smurfette doll (neutral target). The a priori hypothesis was that quality-controllers would show an advantage in those situations in which monitoring was especially involved, that is, when deciding the presence/absence of the target required a more extensive inspection of the search array. Results were consistent with our hypothesis. Quality-controllers were faster in those conditions that extensively required monitoring processes, specifically, the Smurfette-present and both target-absent conditions. No differences emerged in the orange-present condition, which appeared to rely mainly on bottom-up processes. These results suggest that top-down processes in visual search can be enhanced through immersive real-life experience beyond visual expertise advantages.

Keywords: real-world cognitive enhancement, visual search, expertise, cognitive control, professional training

#### Edited by:

Roberta Sellaro, Leiden University, Netherlands

#### Reviewed by:

Reshanne R. Reeder, Otto-von-Guericke Universität Magdeburg, Germany

Matthew S. Cain, Natick Soldier Research, Development, and Engineering Center (NSRDEC), United States

#### \*Correspondence:

Antonino Visalli antonino.visalli@phd.unipd.it

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 24 October 2017 Accepted: 29 January 2018 Published: 14 February 2018

#### Citation:

Visalli A and Vallesi A (2018) Monitoring Processes in Visual Search Enhanced by Professional Experience: The Case of Orange Quality-Control Workers. Front. Psychol. 9:145. doi: 10.3389/fpsyg.2018.00145

# INTRODUCTION

Many daily activities require us to search around in order to locate particular items, such as an icon on a messy computer desktop or a friend in a crowded bar. Search performance depends on several factors, some involving the perceptual properties of stimuli and search contexts (Treisman and Gelade, 1980; Duncan and Humphreys, 1989), and others the observer and her/his previous experience (e.g., past knowledge about or affective attachment to the stimuli, Biggs et al., 2012). As far as observer-related factors are concerned, previous studies have investigated how expertise affects object recognition and detection. Several studies report that experts in particular topics (e.g., birds, cars or fingerprints) are faster in discriminating objects pertaining to their area of expertise (Gauthier et al., 2000; Busey and Vanderkolk, 2005; Curby and Gauthier, 2009; Bukach et al., 2010). Moreover, other studies suggest that this facilitation can potentially extend to other visual processes, such as categorization (e.g., recognizing images of expertise from fragments; Harel et al., 2011) and detection (e.g., localizing targets of expertise among distractors from non-expertise categories or in natural scenes; Golan et al., 2014; Reeder et al., 2016).

An interesting case of expertise involves professional searchers. Radiologists, proofreaders, and airport security screeners take advantage not only of their domain-specific knowledge (e.g., knowledge of tumors, spelling errors, and weapons, respectively), but also of their task-specific training. In a study with Transportation Security Administration Officers, Biggs et al. (2013) investigated whether professional searchers' expertise could influence visual search performance beyond the task they had been trained on. Comparing professional and non-professional searchers, the authors found differences in visual search strategies. More specifically, while search speed explained most of the accuracy variance in non-professional and early-career professional searchers, search consistency (i.e., trial-to-trial RT variability) was the best predictor of accuracy in experienced professional searchers. The authors concluded that the effects of professional training and experience likely extended to generalized search behaviors. Interestingly, since consistent search behaviors may allow a more efficient use of cognitive resources, the authors highlighted the importance of top-down control in visual search performance. In this regard, Harel (2016) describes visual expertise as an interactive process that emerges from enhanced interactions between the visual system and multiple top-down mechanisms including attentional control, domain-specific knowledge, and task-specific strategies.

In summary, previous research suggests that visual expertise most likely reflects an enhanced engagement of multiple interactive processes that experts manifest with their objects of specialization (Harel, 2016). However, in this study we wanted to go beyond domain-specific aspects by addressing the following question: does intense visual search experience enhance top-down control mechanisms independently of the nature of the target? Indeed, professional experience may influence search behaviors in several ways independently of visual expertise. As mentioned above, professional experience can improve search behaviors through the acquisition of more efficient search strategies, such as search consistency (Biggs et al., 2013; Biggs and Mitroff, 2014). Here, we further explored a possible influence of expertise by testing the hypothesis that intense professional search experience could boost top-down control processes involved in generalized visual search behaviors.

Specifically, we focused on monitoring mechanisms, a series of "quality check" processes that aim to optimize behavior (cf. Stuss and Alexander, 2007; see Vallesi, 2012 for an overview). Monitoring skills are required in many cognitive domains and task contexts. For example, it has been shown that participants monitor the elapse of time during a variable foreperiod task (Vallesi et al., 2014; Capizzi et al., 2015), their performance to successfully detect errors (Ullsperger et al., 2014), the occurrence of critical events (Capizzi et al., 2016; Tarantino et al., 2017), or their progress toward a desired goal (Benn et al., 2014). Most germane to our study, monitoring is also involved in visual search paradigms that require checking and evaluating the presence/absence of a target embedded among distractors. When the target is absent, monitoring should intervene more strongly than when the target is present and clearly detectable (Vallesi, 2014). This is because while the detection of a present salient target is mainly driven by bottom-up processes that automatically attract participants' attention (Treisman and Gelade, 1980), determining the absence of a target needs a more extensive and wider inspection of the search array.

In order to test the role of expertise in the monitoring mechanisms during visual search, we compared a group of quality-control employees working on the orange production line of some citrus farms and a well-matched control group. The daily job of these quality-control workers consists of many hours spent inspecting and evaluating oranges rolling down a conveyor belt, and discarding the oranges perceived as not suitable on the basis of visual features such as size, color, or skin imperfections that worsen their organoleptic properties. Quality-controllers were selected as the experimental group since they routinely perform a job that extensively engages monitoring processes, which as a result should improve such processes in visual search independently of the nature of the target. All participants performed two blocks of a visual search task with images of several objects. In one block the target was an orange, while in the other block it was a Smurfette doll. If expertise advantages emerged only with objects of expertise, we would expect quality-controllers to be faster than control participants only with the orange target. Instead, we predicted better performance for quality-controllers not only with the orange target, but in all the situations in which monitoring is especially involved (Weidner et al., 2009), mainly when the target is absent (Vallesi, 2014).

# MATERIALS AND METHODS

## Participants

Twenty-four quality-control employees on the production line of orange fruits (12 women; mean age: 51.2 years, SD = 9.4, range: 25.5–65.9 years; mean education: 8.5 years, SD = 2; hereafter referred to as quality-controllers) and 23 control participants (13 women; mean age: 51.3 years, SD = 9.9, range: 24.7–64.7 years; mean education: 8.6 years, SD = 1.7), all recruited in Sicily, Italy, voluntarily took part in the study. Quality-controllers reported working about 6–8 h a day, for about 6–9 months per year. Mean working experience was 13.9 years (SD = 8.2, range: 3–35). All participants reported having normal or corrected-to-normal visual acuity and normal color vision. The two groups were equivalent in age [t(45) = 0.057, p = 0.955], education [t(45) = 0.199, p = 0.843], and sex [Yates' corrected χ<sup>2</sup>(1, N = 47) = 0.024, p = 0.876]. One further control participant was excluded from analysis due to poor task performance. The procedures involved in this study were approved by the Comitato Etico per la Sperimentazione dell'Azienda Ospedaliera di Padova. Participants gave their written informed consent, in accordance with the Declaration of Helsinki, and they were reimbursed 25 euros for their time.
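The sex comparison can be checked from the counts given in this paragraph. The following is a minimal Python sketch (not the authors' code, which was written in R) of a Yates-corrected chi-square test on the 2 × 2 table of group by sex; the function name `yates_chi_square_2x2` is a hypothetical helper introduced only for illustration.

```python
import math

def yates_chi_square_2x2(a, b, c, d):
    """Yates-corrected chi-square for the 2x2 table [[a, b], [c, d]] (1 df)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-square variable with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts from the text: 12 women among 24 quality-controllers,
# 13 women among 23 controls.
chi2, p = yates_chi_square_2x2(12, 12, 13, 10)
print(f"chi2(1, N = 47) = {chi2:.3f}, p = {p:.3f}")
```

With these counts the sketch gives values that round to the reported χ<sup>2</sup> = 0.024 and p = 0.876.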

## Stimuli and Design

The visual search task was implemented in Matlab using the Psychophysics Toolbox (Brainard and Vision, 1997; Kleiner et al., 2007) and presented on a Dell Intel Core laptop computer. Participants sat facing the screen at a viewing distance of approximately 60 cm. Stimuli were 100 objects selected from the 1,000 images of the ALOI (Amsterdam Library of Object Images; Geusebroek et al., 2005) database. The image of an orange was selected as the target for one of the two experimental blocks. The other target was selected based on matched luminance and surface area characteristics. Namely, for each ALOI image the mean luminance across all pixels (defined as the L<sup>∗</sup> dimension in the CIE L<sup>∗</sup>a<sup>∗</sup>b<sup>∗</sup> color space) and the object surface area (i.e., number of pixels) were computed. A two-dimensional Euclidean space was constructed, with Min-Max scaled luminance and surface (i.e., both measures brought into the range [0, 1]) as dimensions. The image with the smallest distance from the orange in the luminance-surface Euclidean space was selected as the second target, that is, the Smurfette doll (**Figure 1A**). Ninety-seven images with low luminance-surface Euclidean distance from the targets were pseudorandomly selected as distractors. Specifically, they were chosen if located at a distance < 0.221 from the midpoint between the two targets (0.221 being the median distance of ALOI images from that midpoint). The final set of 97 distractors (**Figure 1B**) had a median distance of 0.080 (IQR = 0.220). One additional image (a spice box) with low distance from the targets (distance = 0.04) was pseudorandomly selected as the target for the practice block.
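The matching procedure described above (Min-Max scaling followed by a nearest-neighbor search in the luminance-surface space) can be sketched as follows. This is an illustrative Python sketch, not the authors' Matlab code; the function names and the feature values are hypothetical and are not ALOI measurements.

```python
import math

def min_max_scale(values):
    """Rescale a list of numbers into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def nearest_image(features, reference):
    """Return the name of the image closest to `reference` in scaled space.

    `features` maps image name -> (mean luminance, surface area in pixels).
    """
    names = list(features)
    lum = min_max_scale([features[n][0] for n in names])
    area = min_max_scale([features[n][1] for n in names])
    scaled = {n: (l, a) for n, l, a in zip(names, lum, area)}
    ref = scaled[reference]
    candidates = [n for n in names if n != reference]
    # Euclidean distance in the two-dimensional [luminance, surface] space.
    return min(candidates, key=lambda n: math.dist(scaled[n], ref))

# Made-up feature values for illustration only.
toy_features = {
    "orange": (62.0, 41000),
    "doll": (60.5, 40200),
    "mug": (35.0, 52000),
    "shoe": (80.0, 18000),
}
print(nearest_image(toy_features, "orange"))
```

Scaling both dimensions to [0, 1] before computing distances keeps the pixel-count dimension, which has a much larger numeric range, from dominating the luminance dimension.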

The task had a 2 × 3 × 2 factorial design with target type, array size, and target presence as factors. The target type (orange vs. Smurfette) was manipulated between blocks, and the order of blocks was counterbalanced across participants. The search array consisted of 12, 24, or 48 object images (2.66° × 2.00° of visual angle) with transparent backgrounds displayed against a middle gray background. The array was arranged in a grid of 6 × 8 available locations, each subtending a visual angle of 3.66° × 2.25°. To perturb this grid-like arrangement and prevent a line-by-line search, on each trial object locations were randomly jittered by a maximum of 0.25° horizontally and 0.5° vertically (Hershler and Hochstein, 2009).
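A jittered grid of this kind might be generated as in the Python sketch below. It is only an illustration under stated assumptions: the grid is taken to have 6 rows and 8 columns, occupied cells are drawn at random, and the jitter is drawn from a uniform distribution; none of these details is specified in the text, and the original task was programmed in Matlab.

```python
import random

N_ROWS, N_COLS = 6, 8            # assumed orientation of the 6 x 8 grid
CELL_W, CELL_H = 3.66, 2.25      # cell size in degrees of visual angle
MAX_JITTER_X, MAX_JITTER_Y = 0.25, 0.5

def sample_positions(array_size, rng):
    """Pick `array_size` distinct cells and jitter each cell centre.

    Returns a list of (cell_index, x, y) with x, y in degrees relative to
    the grid centre.
    """
    cells = rng.sample(range(N_ROWS * N_COLS), array_size)
    positions = []
    for cell in cells:
        row, col = divmod(cell, N_COLS)
        x = (col - (N_COLS - 1) / 2) * CELL_W
        y = (row - (N_ROWS - 1) / 2) * CELL_H
        # Assumed uniform jitter within the stated maxima.
        x += rng.uniform(-MAX_JITTER_X, MAX_JITTER_X)
        y += rng.uniform(-MAX_JITTER_Y, MAX_JITTER_Y)
        positions.append((cell, x, y))
    return positions

rng = random.Random(0)  # fixed seed so the sketch is reproducible
for size in (12, 24, 48):
    print(size, len(sample_positions(size, rng)))
```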

Each experimental block consisted of 288 trials. At the beginning of each block, the target was presented for as long as participants needed to memorize it. Next, on each trial a fixation cross was displayed at the center of the screen for an interval randomly jittered between 0.75 and 1.5 s to make the onset of stimuli (equally) unpredictable. The cross was then replaced by an array of object images (**Figure 2**) displayed until the participant's response. Within blocks, each combination of array size × target presence was presented 48 times in pseudorandom order. For every array size, each of the 48 available positions was occupied in turn by the target, while distractors were pseudorandomly assigned to the remaining locations. Participants were required to press one of two response keys ("Z" or "M") to indicate whether the target was present or not. The assignment of the two response keys to either target presence or absence was counterbalanced across participants. Participants were instructed to be as fast as possible, but also accurate. A low tone was provided after errors. A practice block of 12 trials preceded the two experimental blocks. After each practice trial, feedback on accuracy and speed was provided: either a green tick for correct responses or a red cross for wrong responses given in less than 5 s, or "Try to be faster..." for response times (RTs) longer than 5 s (threshold based on previous findings from similar visual search tasks: Hershler and Hochstein, 2005, 2009; VanRullen, 2006; Golan et al., 2014).
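The block structure described above (3 array sizes × 2 target-presence levels × 48 repetitions = 288 trials) can be sketched in Python as follows. The field names and the plain shuffle are assumptions made for illustration; the authors' actual pseudorandomization constraints are not described in the text.

```python
import itertools
import random

ARRAY_SIZES = (12, 24, 48)
REPETITIONS = 48       # trials per array size x target presence combination

def build_block(rng):
    """Return one shuffled block of trials as a list of dicts."""
    trials = []
    for size, present in itertools.product(ARRAY_SIZES, (True, False)):
        for rep in range(REPETITIONS):
            trials.append({
                "array_size": size,
                "target_present": present,
                # On target-present trials the target visits each of the
                # 48 grid positions exactly once per array size.
                "target_position": rep if present else None,
            })
    rng.shuffle(trials)  # simple shuffle standing in for the pseudorandom order
    return trials

block = build_block(random.Random(0))
print(len(block))
```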

## Statistical Analyses

All statistical analyses were performed using R (R Core Team, 2016). Trial-level measures (i.e., single-trial log-transformed RT and dichotomous accuracy) were analyzed with mixed-effects models using the lme4 library (Bates et al., 2014). Mixed-effects modeling has several advantages over traditional general linear model analyses (such as repeated-measures ANOVA) that make it suitable for trial-level measures (Baayen et al., 2008; Quené and van den Bergh, 2008). First, since mixed-effects model analyses are conducted on trial-level data (i.e., they do not require prior averaging of each participant's data to a single value per experimental condition), they preserve and take into account variability across individuals, thus increasing the accuracy and generalizability of the parameter estimates. Moreover, for the same reasons, they account for the intrinsic unreliability of participants' average scores due to differences in intra-individual performance variability (Kliegl et al., 2011). Another advantage of mixed-effects modeling over repeated-measures ANOVA is that it is not restricted to predictors with categorical levels, but easily allows testing the effects of discrete/continuous variables and their interactions with categorical variables, usually with a gain in statistical power (Kliegl et al., 2011). Especially concerning RTs, a further advantage is the possibility to control for many longitudinal effects during the task. First, there are the effects of learning and fatigue (Baayen et al., 2008). Second, the response in a trial is usually heavily influenced by what happens in the preceding trial (for example, the RT in the preceding trial is often a good predictor of the current RT; Baayen et al., 2008). Using mixed-effects models, all these sources of experimental noise are easily brought under statistical control.
Additionally, since mixed-effects models have been extended to generalized linear models, they can be used to efficiently model dichotomous data, such as accuracy in our task (Quené and van den Bergh, 2008).

Summary variables, such as Signal Detection Theory measures or item image properties, were analyzed using standard general linear model analyses (such as repeated-measures ANOVA or t-test).

# RESULTS

## Accuracy

FIGURE 1 | Luminance and surface characteristics of targets and distractors. (A) Representation of the positions of selected targets and distractors in the [luminance, surface] space. (B) The two boxplots show the Euclidean-distance distributions of all ALOI images (left) and of the selected distractors (right) from the midpoint between the two targets in the [luminance, surface] space.

Response accuracy at each trial, given its dichotomous nature, was analyzed by conducting a Generalized Linear Mixed Model (GLMM) with logit link function using the glmer function from the lme4 library (Bates et al., 2014). Log-transformed RTs, target presence, target type, array size, and group (with their interaction terms) were entered into the model as fixed effects. A random intercept varying among participants and among response bias (C) within participants, as well as uncorrelated random intercept and slope for trial order, were entered into the model as random effects (an R-notation formula of the model is presented in Equation 1). In order to facilitate the convergence of the models, continuous variables (i.e., array size, log RT, and trial order) were scaled and centered within each participant using the R function scale.

$$\begin{aligned} \text{accuracy} & \sim \text{trial} + \log RT + \text{presence} \ast \text{target} \ast \text{array} \ast \text{group} \\ &+ \left(1 \middle| id/c \right) + \left(0 + trial \middle| id \right) \end{aligned}$$
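The within-participant centering and scaling of continuous predictors mentioned above can be illustrated with a short sketch. The authors used the R function scale; the Python version below is only an illustrative stand-in, and the function name `scale_within` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def scale_within(ids, values):
    """Centre and scale `values` separately for each participant id."""
    by_id = defaultdict(list)
    for pid, v in zip(ids, values):
        by_id[pid].append(v)
    # Like R's scale(), use the sample standard deviation (n - 1).
    stats = {pid: (mean(vs), stdev(vs)) for pid, vs in by_id.items()}
    return [(v - stats[pid][0]) / stats[pid][1] for pid, v in zip(ids, values)]

# Toy data: two participants, three trials each.
ids = ["p1", "p1", "p1", "p2", "p2", "p2"]
array_size = [12, 24, 48, 12, 24, 48]
print(scale_within(ids, array_size))
```

After this transformation each participant's predictor has mean 0 and standard deviation 1, which puts predictors on comparable scales and tends to help the optimizer converge.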

Response bias, from Signal Detection Theory, was computed for each combination of target presence, target type, and array size, and defined as C = −0.5 × (ZHit + ZFA), where ZHit and ZFA are the standardized hit rates and false alarm rates, corrected as indicated by Snodgrass and Corwin (1988). C was introduced since it influences the probability of responding present/absent in visual search tasks (Palmer et al., 2000). Uncorrelated random intercept and slope for trial order were introduced to control for possible effects of learning and fatigue. The log-transformed RT for each trial was included to control for possible speed-accuracy trade-off effects. To explore the influence of group on accuracy, we first compared the model without fixed effects (i.e., null model; Macc0) with the model containing the predictor group. The likelihood ratio test showed that group did not significantly improve the model fit [χ<sup>2</sup>(1) = 0.01, p = 0.971], suggesting that accuracy did not change across groups. We explored the influence of the other predictors by incrementally adding each of them (with their interaction terms) to the null model. **Table 1** shows the results of the likelihood ratio test. The model Macc5, which included all the fixed effects (and their interaction terms) with the exception of group, was the best model to explain the accuracy data distribution. In contrast, the inclusion of group and its interaction terms did not significantly increase the goodness-of-fit of the model. No other variable (e.g., pre-accuracy) significantly improved the fit of the model. Marginal R<sup>2</sup> (Johnson, 2014) of Macc5, which represents the variance explained by the fixed effects, was 0.15; conditional R<sup>2</sup>, which is the variance explained by both fixed and random effects, was 0.21.

The Wald test (Wald, 1945) on the final model, Macc5, revealed a number of significant effects. For these effects, we report the estimated coefficient (b), the associated standard error (SE), and the z-statistic (z). A significant interaction was found between the predictors target presence, target type, and array size (b = −0.59, SE = 0.17, z = −3.45, p < 0.001). To interpret this three-way interaction, two GLMMs were fitted on the two task blocks separately. In the Smurfette block, the Wald test revealed a significant main effect of target presence (b = −2.44, SE = 0.13, z = −18.86, p < 0.001), with lower accuracy in the Smurfette-present than in the Smurfette-absent condition (**Figure 3A**). This effect was modulated by the array size (interaction: b = −0.87, SE = 0.12, z = −6.95, p < 0.001). In particular, the difference between present and absent conditions increased with increasing array size (**Figure 3A**). In the orange block, the Wald test revealed a significant main effect of target presence (b = 0.50, SE = 0.14, z = 3.62, p < 0.001), with accuracy slightly higher in the orange-present condition, and a main effect of array size (b = −0.24, SE = 0.10, z = −2.44, p = 0.015), with a slight decrease of accuracy with increasing array size (**Figure 3B**).

Additionally, we analyzed sensitivity (d') and response bias (C) measures from Signal Detection Theory, in order to further characterize visual search performance in terms of hits and false alarms. Specifically, d' provides a measure of the ability to discriminate the target from the distractors (Verghese, 2001) while controlling for possible biases (C) in using one response more than the other (Palmer et al., 2000). For this analysis, standardized hit (ZHit) and false alarm (ZFA) rates were computed as described above. The sensitivity index was defined as d' = ZHit − ZFA, while response bias was defined, as above, as C = −0.5 × (ZHit + ZFA). On each measure we separately conducted an ANOVA with target type and array size as within-subject factors and group as between-subject factor.
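The two Signal Detection Theory measures defined above can be computed as in the following Python sketch. It assumes the usual form of the Snodgrass and Corwin (1988) correction, in which 0.5 is added to each count and 1 to each trial total; the function name and the example counts are hypothetical and are not taken from the data.

```python
from statistics import NormalDist

def sdt_measures(hits, n_present, false_alarms, n_absent):
    """Return (d_prime, C) with the Snodgrass and Corwin (1988) correction."""
    # Adding 0.5 to the counts and 1 to the trial totals keeps the rates
    # strictly between 0 and 1, so their z-scores stay finite.
    hit_rate = (hits + 0.5) / (n_present + 1)
    fa_rate = (false_alarms + 0.5) / (n_absent + 1)
    z_hit = NormalDist().inv_cdf(hit_rate)
    z_fa = NormalDist().inv_cdf(fa_rate)
    d_prime = z_hit - z_fa
    c = -0.5 * (z_hit + z_fa)
    return d_prime, c

# Hypothetical counts for one participant in one condition (48 target-present
# and 48 target-absent trials, as in the design).
d_prime, c = sdt_measures(hits=44, n_present=48, false_alarms=1, n_absent=48)
print(f"d' = {d_prime:.2f}, C = {c:.2f}")
```

A positive C indicates a conservative bias (a tendency to respond "absent"), and a negative C indicates a liberal bias.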

The ANOVA on d' revealed significant effects of target type [F(1, 45) = 28.48, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.39], with lower discriminability for Smurfette (d' = 3.61, SE = 0.06) than for orange (d' = 3.92, SE = 0.04), and of array size [F(2, 90) = 19.70, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.30]. A Newman-Keuls' post-hoc test on the latter result revealed that discriminability was higher for the 12-item condition (d' = 3.92, SE = 0.05) than for the 24-item one (d' = 3.90, SE = 0.06; p = 0.009), and for the latter condition than for the 48-item one (d' = 3.75, SE = 0.06; p < 0.001). The analysis also revealed that this effect was modulated by target type [interaction: F(2, 90) = 7.79, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.15]. Post-hoc analyses showed that the above-described effect of array size on discriminability was significant for Smurfette (12 > 24 > 48; ps < 0.015 and 0.001, respectively) but not for orange (12 = 24 = 48; ps > 0.339 and 0.748, respectively) (see **Figure 4A**). No other effect was significant.

Fixed effects and degrees of freedom (df) are reported for each model. Models were fitted using maximum likelihood. Chi-squared and degrees of freedom, Chisq (df), and probability value, p, are based on the likelihood ratio test between successive models. ∆AIC indicates the difference in Akaike Information Criterion (Burnham and Anderson, 2002) between the model Macc(n−1) and the model Macc(n) [e.g., for the model Macc3, ∆AIC = AIC(Macc2) − AIC(Macc3)]. To compare the relative evidence of each model, we computed the relative likelihood, RL = exp(∆AIC/2).
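The relative likelihood used in the note to Table 1, RL = exp(∆AIC/2), is straightforward to compute. The Python sketch below is only illustrative: the function name is hypothetical and the AIC values are invented, not taken from Table 1.

```python
import math

def relative_likelihood(aic_previous, aic_current):
    """RL = exp(dAIC / 2), with dAIC = AIC(previous model) - AIC(current model)."""
    delta_aic = aic_previous - aic_current
    return math.exp(delta_aic / 2)

# Hypothetical AIC values for two successive models.
print(relative_likelihood(1010.0, 1000.0))
```

With these illustrative values, ∆AIC = 10 and RL = exp(5), that is, the current model would be roughly 148 times more likely than the previous one to be the better model in the AIC sense. An RL of 1 means the two models are equally supported.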

The ANOVA on C yielded a similar pattern of results. Indeed, it revealed significant effects of target type [F(1, 45) = 172.23, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.79], with a more conservative response bias for Smurfette (C = 0.29, SE = 0.02) than for orange (C = −0.04, SE = 0.02), and of array size [F(2, 90) = 25.72, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.36]. Post-hoc analyses on the latter result revealed that the mean C value was lower for the 12-item condition (C = −0.06, SE = 0.02) than for the 24-item one (C = 0.05, SE = 0.03; p < 0.001), and for the latter condition than for the 48-item one (C = 0.11, SE = 0.02; p = 0.001). Again, the analysis revealed that this effect was modulated by target type [interaction: F(2, 90) = 21.21, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.32]. Post-hoc analyses showed that the above-described effect of array size on C values was significant for Smurfette (12 < 24 < 48; both ps < 0.001) but not for orange (12 = 24 = 48; ps > 0.143 and 0.405, respectively) (see **Figure 4B**). No other effect was significant.

# Response Times (RTs)

RTs were log-transformed to mitigate the influence of their skewed, non-normal distribution. Log-transformed RTs were analyzed with a linear mixed model (LMM) fitted using the lmer function from the lme4 library (Bates et al., 2014). Error trials and post-error trials were excluded from the analysis. The full model (Equation 2) included all the fixed and random effects of the accuracy GLMM (with the exception of log-RTs). Moreover, to control for the temporal dependency of RTs between successive trials, we included the log-RT of the preceding trial as a fixed effect (Baayen and Milin, 2010).

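$$
\text{logRT} \sim \text{trial} + \text{pre\_logRT} + \text{presence} * \text{target} * \text{array} * \text{group} + (1 \mid \text{id}/\text{c}) + (0 + \text{trial} \mid \text{id}) \tag{2}
$$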
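The trial-level preparation described above (log transformation, the lagged log-RT predictor, and the exclusion of error and post-error trials) can be sketched as follows. This is an illustration only: the dictionary keys, the use of the natural logarithm, and the choice to take the lag from the raw trial sequence before any exclusion are assumptions, not details given in the text.

```python
import math


def prepare_rt_trials(trials):
    """Build model rows for one participant.

    `trials` is a list of dicts with the hypothetical keys 'rt' (in ms) and
    'correct' (bool), ordered as presented.
    """
    prepared = []
    for index in range(1, len(trials)):  # the first trial has no predecessor
        current, previous = trials[index], trials[index - 1]
        # Drop error trials and trials that immediately follow an error.
        if not current["correct"] or not previous["correct"]:
            continue
        prepared.append({
            "trial": index + 1,  # 1-based position in the session
            "logRT": math.log(current["rt"]),
            "pre_logRT": math.log(previous["rt"]),  # RT at the preceding trial
        })
    return prepared


example = [
    {"rt": 500, "correct": True},
    {"rt": 600, "correct": True},
    {"rt": 700, "correct": False},
    {"rt": 650, "correct": True},
    {"rt": 550, "correct": True},
]
rows = prepare_rt_trials(example)  # keeps trials 2 and 5 only
```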

As in the accuracy model, C was included because an individual's response bias can affect RTs, for example by causing faster responding in one condition than in another. Initially, all the models were fitted using the maximum likelihood criterion to allow model comparisons (Bates et al., 2014). The full model was the best-fitting model [χ<sup>2</sup>(8) = 72.52, p < 0.001]. Visual inspection of the residuals showed that the model was somewhat stressed. As suggested by Baayen and Milin (2010), trials with absolute standardized residuals higher than 2.5 standard deviations were considered outliers and removed (1.55% of the trials). After removing outlier trials, all the models were refitted and compared using a likelihood ratio test, and again the full model was the best-fitting model (**Table 2**). This time, visual inspection of the residual plots of the full model did not show any evident violation of homoscedasticity or normality.
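The residual-based trimming step can be illustrated with a short sketch. It assumes that the residuals of the fitted model are already available as a list aligned with the trials, and that they are standardized by their own mean and standard deviation; both the function name and this standardization are assumptions. The model would then be refitted on the retained trials.

```python
from statistics import mean, pstdev


def trim_by_residual(rows, residuals, cutoff=2.5):
    """Keep the rows whose absolute standardized residual does not exceed `cutoff`."""
    center = mean(residuals)
    spread = pstdev(residuals)
    if spread == 0:
        return list(rows)  # all residuals identical: nothing to trim
    return [
        row
        for row, residual in zip(rows, residuals)
        if abs((residual - center) / spread) <= cutoff
    ]


# Hypothetical residuals: 20 small values and one extreme value (last trial).
trial_ids = list(range(21))
model_residuals = [0.1, -0.1] * 10 + [3.0]
kept = trim_by_residual(trial_ids, model_residuals)  # drops trial 20
```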

At this point, the full model was refitted by minimizing the REML (Restricted Maximum Likelihood) criterion, as suggested by Bates (2014; see also Luke, 2017). The marginal R<sup>2</sup> of the full model was 0.54 and the conditional R<sup>2</sup> was 0.69. **Table 3** shows the statistical results of the type II ANOVA (as suggested by Langsrud, 2003), with additional F statistics based on Kenward-Roger's approximation of the denominator degrees of freedom (Kenward and Roger, 1997). Overall, fitting mixed-effects models with REML and deriving p-values using Kenward-Roger's approximation seems to ensure optimal Type 1 error rate control (Luke, 2017). **Figure 5** shows that RTs were longer in the Smurfette block than in the orange one and, for both targets, longer in the target-absent condition. The effect of array size (i.e., an increase of RTs with increasing array size) was slightly smaller for orange than for Smurfette, especially in the target-present condition (**Figure 5**). Concerning group differences (**Figure 6**), the increase in RTs in the Smurfette block (compared to the orange one) was much greater for controls than for quality-controllers. The effect of target presence (i.e., longer RTs in the target-absent condition compared to the target-present one) was greater for controls than for quality-controllers, and this between-group difference was larger in the Smurfette block.

To further investigate the three-way interaction between target presence, target type, and group, two LMMs were fitted on the two task blocks separately. In the orange block, ANOVA results did not show a significant main effect of group [F(1, 64.5) = 1.8, p = 0.181, β = −0.11], whereas there was a significant interaction between target presence and group [F(1, 12620) = 11.8, p < 0.001, β = 0.03]. Indeed, as shown in **Figure 6**, quality-controllers were faster than controls only in the orange-absent condition. In the Smurfette block, there was a significant main effect of group [F(1, 74.1) = 6.3, p = 0.015, β = −0.16], as well as a significant interaction between target presence and group [F(1, 11952) = 52.3, p < 0.001, β = 0.06]. In this block, quality-controllers were faster than controls, and this difference in RTs was more pronounced in the target-absent condition (**Figure 6**). No significant group difference involved array size, with the exception of a three-way interaction between target type, array size, and group that was barely significant. To further explore this interaction, two LMMs were fitted separately for the target-present and target-absent conditions. In both conditions, no significant interaction between array size and group was found [target present: F(1, 360) = 0.22, p = 0.638, β = 0.02; target absent: F(1, 1358.8) = 2.61, p = 0.106, β = −0.02].

The table shows model comparison statistics. Fixed effects and degrees of freedom (df) are reported for each model. Models were fitted using maximum likelihood. Chi-squared and degrees of freedom, Chisq (df), and probability value, p, are based on the likelihood ratio test between successive models. ∆AIC indicates the difference in Akaike Information Criterion between the model M(n−1) and the model M(n) [e.g., for the model M3, ∆AIC = AIC(M2) – AIC(M3)]. To compare the relative evidence of each model, we computed the relative likelihood, RL = exp(∆AIC/2).

TABLE 3 | Analysis of variance of log RTs data.

The table shows type III sums of squares. F-statistics and associated p-values were calculated using Kenward-Roger's approximation of degrees of freedom. Additionally, standardized regression coefficients (β) are shown.
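The relative likelihood used in the model-comparison tables, RL = exp(∆AIC/2) with ∆AIC = AIC(M(n−1)) − AIC(M(n)), can be computed as in the sketch below. The AIC values are hypothetical placeholders, not values from the study.

```python
import math


def relative_likelihood(aic_previous, aic_current):
    """Return (delta_aic, rl) for a model compared with its predecessor."""
    delta_aic = aic_previous - aic_current  # positive when the current model is better
    return delta_aic, math.exp(delta_aic / 2.0)


# Hypothetical AIC values: a drop of 10 points makes the current model
# exp(5), roughly 148 times, more likely than the previous one.
delta, rl = relative_likelihood(aic_previous=1010.0, aic_current=1000.0)
```

An RL of 1 means the two models are equally supported; larger values favor the more complex model M(n).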

#### Image Analysis

Differences in RTs and search slopes between the orange-present and Smurfette-present conditions were not expected a priori. Since these findings could likely be explained by low-level visual properties, we compared the distinctiveness of the two targets among their distractors. The Adaptive Whitening Salience (AWS) model (Garcia-Diaz et al., 2012) was used to estimate the perceptual salience of each target. AWS is a bottom-up saliency model that provides maps of the predicted probability that each location in an image will be fixated, on the basis of its low-level visual features. Notably, this model has been shown to outperform other prominent saliency models in predicting human eye fixations and in reproducing several psychophysical phenomena (Borji et al., 2013). For each target, we generated 144 images representing search displays, one for each combination of array size and target position. Saliency maps of these images were then computed using the authors' Matlab implementation of the AWS model. From each map, a target salience score was obtained by averaging the values of the points corresponding to the target location. Target scores were finally compared using a paired-sample t-test. Results did not show any significant difference between the two targets [t(143) = 0.05, p = 0.963].

Since the saliency analysis did not explain the orange-present advantage, we compared the perceptual similarity between each target and its distractors. Indeed, previous findings have shown that visual search difficulty increases with increasing target-distractor perceptual similarity (Duncan and Humphreys, 1989). Usually, real-world objects are not easy to distinguish from one another on the basis of low-level visual features. However, this is not the case for the orange, which is highly characterized by its color. As with AWS, saliency maps for the 144 search displays were computed, this time on the basis of a frequency-tuned approach (Achanta et al., 2009) that estimates saliency maps using color and luminance features. This method has been shown to outperform several state-of-the-art saliency models in object segmentation. Following the authors' algorithm, images were Gaussian filtered and converted to the Lab color space. The L\*a\*b\* space is characterized by a luminance channel (L), a green-red opponent channel (a), and a blue-yellow opponent channel (b). This color space is preferable for its biological plausibility (Engel et al., 1997). Since our stimuli were selected controlling for luminance, for each image the saliency map was computed as the Euclidean distance between the a\*b\* pixel vector and the average a\*b\* vector. Saliency maps were min-max-scaled, and target saliency scores were computed and compared as for the AWS model. Results showed that orange saliency was significantly higher than Smurfette saliency [t(143) = 112.06, p < 0.001]. Similar results were obtained by computing saliency maps as the Euclidean distance between the L\*a\*b\* pixel vector and the average L\*a\*b\* vector [t(143) = 57.34, p < 0.001].
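The color-based saliency score described above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes the image has already been Gaussian filtered and converted to Lab, represents it as a nested list of (L, a, b) tuples, and uses hypothetical function names.

```python
import math


def ab_saliency_map(lab_image):
    """Min-max-scaled distance of each pixel's (a, b) vector from the image mean.

    `lab_image` is a 2-D list of (L, a, b) tuples, assumed to be already
    Gaussian filtered and converted to the Lab color space.
    """
    pixels = [pixel for row in lab_image for pixel in row]
    mean_a = sum(pixel[1] for pixel in pixels) / len(pixels)
    mean_b = sum(pixel[2] for pixel in pixels) / len(pixels)
    raw = [
        [math.hypot(pixel[1] - mean_a, pixel[2] - mean_b) for pixel in row]
        for row in lab_image
    ]
    lowest = min(min(row) for row in raw)
    highest = max(max(row) for row in raw)
    span = (highest - lowest) or 1.0  # avoid dividing by zero on uniform images
    return [[(value - lowest) / span for value in row] for row in raw]


def target_score(saliency_map, target_pixels):
    """Mean saliency over the (row, col) coordinates covered by the target."""
    return sum(saliency_map[r][c] for r, c in target_pixels) / len(target_pixels)


# Toy 3 x 3 display: a neutral background with one strongly colored center pixel.
background, colored = (50.0, 0.0, 0.0), (50.0, 60.0, 60.0)
toy_image = [[background] * 3, [background, colored, background], [background] * 3]
toy_map = ab_saliency_map(toy_image)
center_score = target_score(toy_map, [(1, 1)])
```

In the toy display the colored pixel receives the maximum scaled saliency, which mirrors the reported finding that an item with a distinctive color stands out from its distractors.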

# DISCUSSION

The goal of the present study was to investigate whether intense professional visual search experience could enhance the monitoring processes involved in visual search. To achieve this aim, we compared the performance of a group of professional searchers (i.e., quality-controllers) with that of a well-matched control group on a computerized visual search task. We hypothesized a priori that quality-controllers would show an advantage in those situations in which monitoring is especially involved, that is, when checking for the presence/absence of the target requires a more extensive evaluation of the search array (Vallesi, 2014).

To facilitate the discussion of the main results, it is worthwhile to first consider the overall pattern of performance on the task. All participants were slower in the target-absent condition, that is, the condition that was expected to rely much more on monitoring processes. Moreover, checking for the presence/absence of the target was more difficult in the Smurfette block, as revealed by longer RTs for both the Smurfette-present and Smurfette-absent conditions (compared to the orange ones), and by the lower Smurfette discriminability. This difference in search efficiency for the two targets was not predicted a priori. Looking at the two target-present conditions, slopes of RTs as a function of array size (a measure often associated with perceptual search efficiency, see Rauschenberger and Yantis, 2006) suggested that differences in performance between searching for the two targets could be accounted for by low-level visual features. In order to verify whether low-level visual properties could explain target differences in search efficiency, we analyzed the perceptual salience of each target among its distractors. Results showed that color was a salient low-level visual feature that distinguished the orange from distractors more easily than it did Smurfette. Therefore, searching for an orange led to more efficient search, likely because its color was a distinctive feature (Wolfe, 1998; Liesefeld et al., 2016). Overall, these results suggest that bottom-up selection processes likely favored orange detection due to its perceptual properties, thus reducing the need to evaluate each item of the array. Conversely, searching for the Smurfette target led to a less efficient search accompanied by more extensive monitoring needed to exhaustively evaluate the search array.
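The search slope mentioned above is conventionally the least-squares slope of mean RT regressed on array size, expressed in ms per item. The sketch below uses made-up RT values purely for illustration; they are not the study's data, and the function name is hypothetical.

```python
def search_slope(array_sizes, mean_rts):
    """Least-squares slope of mean RT (ms) on array size (items), in ms per item."""
    n = len(array_sizes)
    mean_x = sum(array_sizes) / n
    mean_y = sum(mean_rts) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(array_sizes, mean_rts))
    variance_x = sum((x - mean_x) ** 2 for x in array_sizes)
    return covariance / variance_x


# Hypothetical mean RTs for the three array sizes used in the task.
slope = search_slope([12, 24, 48], [600.0, 720.0, 960.0])  # 10 ms per item
```

Shallower slopes indicate that adding items to the display costs little extra time, which is the sense in which a search is called more efficient.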

Concerning between-group differences, the results were consistent with our a priori hypothesis. Indeed, quality-controllers were faster than controls in the target-absent condition, the condition that, as discussed above, relied much more on monitoring. Moreover, this quality-controllers' advantage in the target-absent condition was more pronounced in the Smurfette block. This result is consistent with the fact that determining the absence of the Smurfette target required a more exhaustive evaluation (i.e., monitoring) of the search array compared to the orange target, as reflected by the general difference in RTs between the two target-absent conditions.

The target-present condition revealed two unexpected findings. First, on the basis of studies on visual expertise (Hershler and Hochstein, 2009; Golan et al., 2014; Reeder et al., 2016), we expected to find a quality-controllers' advantage in detecting an object of expertise. However, no significant between-group difference emerged in the orange-present condition. One possible explanation for this negative finding could be a floor effect. However, the increase in RTs as a function of array size makes this explanation unlikely. Indeed, even if a floor effect could plausibly explain the lack of a significant between-group difference at the 12-object array size, where RTs were at their minimum, there was room for observing a quality-controllers' advantage at the larger array sizes, since RTs were longer in those conditions. A second explanation for this unexpected result could be that searching for the orange in our task mainly involved low-level visual mechanisms. Indeed, a similar result was found in a previous study with car experts performing a visual search task similar to ours (Golan et al., 2014). In that study, the authors found higher search efficiency for airplane targets than for cars and butterflies across all groups involved. Remarkably, car experts exhibited no difference in search efficiency between their objects of expertise (i.e., cars) and airplanes. In that case too, the authors explained the efficient search for airplanes in terms of discriminative perceptual features used by low-level visual mechanisms and largely independent of expertise with the target. A third, non-mutually exclusive explanation could be that, since the orange is a more familiar target than Smurfette, its familiarity led to a more efficient search in both groups (Mruczek and Sheinberg, 2005; but see: Wang et al., 1994; Shen and Reingold, 2001).

The other unexpected result was the quality-controllers' advantage in detecting the Smurfette target, an object not related to their expertise. However, since searching for the Smurfette target required monitoring to a greater extent, this quality-controllers' advantage was congruent with our hypothesis of a professional search-experience boost of top-down control processes, even in the absence of objects of expertise. Overall, between-group differences emerged in those situations that required a more extensive employment of monitoring processes.

An alternative interpretation of between-group differences in search efficiency could be in terms of an enhancement of quality-controllers' general response speed (Castel et al., 2005). However, the lack of differences in the orange-present condition makes this interpretation implausible. Indeed, the experts' advantages emerged only in the hard situations (i.e., those with lower search efficiency), that is, when their trained ability (i.e., monitoring) was likely required. In this regard, our results are consistent with recent studies showing that cognitive control can be shaped by immersive real-life training (e.g., Yildiz et al., 2014; Babcock and Vallesi, 2015; Arbula et al., 2016).

In summary, quality-controllers were faster in those conditions that extensively required monitoring processes. Moreover, differences between quality-controllers and controls were independent of visual expertise with the targets (e.g., expertise for oranges). These findings extend previous research on visual search and expertise, highlighting the importance of control processes in search performance. The present results provide evidence that top-down processes in visual search can be enhanced through extensive professional search experience, beyond the advantages specific to visual expertise.

#### AUTHOR CONTRIBUTIONS

AVi drafted the manuscript. He implemented the task, was involved in data collection and performed statistical analysis. AVa was involved in the conception of the work and provided ongoing contributions and feedback throughout the experimental process. He also provided additional revisions to the manuscript. All the authors have approved the final version of the manuscript and agree to be accountable for all aspects of the work.

#### FUNDING

This work was funded by the European Research Council Starting Grant LEX-MEA no. 313692 (FP7/2007-2013) to AVa.

#### ACKNOWLEDGMENTS

The authors wish to thank the following organizations for their precious help in recruiting participants and for the logistic support: Ardor (Scordia), Coa (Scordia), Oranfrizer (Scordia), Project Form (Ramacca), and Speedy 97 (Lentini). The authors thank Città della Speranza foundation (Padova) for its logistic support. We also thank Mariagrazia Capizzi and Ettore Ambrosini for their useful comments.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Visalli and Vallesi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The (In)significance of Executive Functions for the Trait of Self-Control: A Psychometric Study

Edward Nęcka\*, Aleksandra Gruszka, Jarosław Orzechowski, Michał Nowak and Natalia Wójcik

Institute of Psychology, Jagiellonian University, Kraków, Poland

Self-control (SC) is an individual trait defined as the ability to pursue long-distance goals in spite of the obstacles generated by current desires, innate or learned automatisms, and physiological needs of an organism. This trait is relatively stable across the life span and it predicts such important features as level of income, quality of social relationships, and proneness to addictions. It is widely believed that the cognitive substrate of SC involves the executive functions (EFs), such as inhibitory control, shifting of attention, and working memory updating. However, the empirical evidence concerning the relationships between trait SC and EFs is not convincing. The present study aims to address two questions: (1) what is the strength of the relationships between trait SC and EFs, and (2) which aspects of SC are predicted by particular EFs, if at all. In order to answer these questions, we carried out a psychometric study with 296 participants (133 men and 163 women, mean age 23.31, SD 3.64), whom we investigated with three types of tools: (1) a battery of SC scales and inventories, (2) a battery of EF tasks, and (3) two general intelligence tests. A structural equation modeling approach was used to analyze the data. We found that the latent variables representing SC and the latent variable representing EFs did not show any relationship. The standardized path coefficient between EFs and general intelligence turned out to be rather strong. We conclude that the trait of SC, measured with questionnaires, does not depend on the strength of cognitive control, measured with EF tasks.

#### Edited by:

Gail Robinson, The University of Queensland, Australia

#### Reviewed by:

Katie Moraes de Almondes, Federal University of Rio Grande do Norte, Brazil Alexander Strobel, Technische Universität Dresden, Germany

> \*Correspondence: Edward N ˛ecka edward.necka@uj.edu.pl; edward.necka@gmail.com

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 01 March 2018 Accepted: 14 June 2018 Published: 09 July 2018

#### Citation:

Nęcka E, Gruszka A, Orzechowski J, Nowak M and Wójcik N (2018) The (In)significance of Executive Functions for the Trait of Self-Control: A Psychometric Study. Front. Psychol. 9:1139. doi: 10.3389/fpsyg.2018.01139

Keywords: self-control, self-regulation, executive functions, cognitive control, intelligence

# INTRODUCTION

Self-control (SC) is the human capability to pursue distant valuable goals in spite of obstacles produced by situational influences, innate or learned automatisms, and inner impulses caused by current physiological needs. Traditionally, this phenomenon has been explored within two research paradigms. Firstly, there are studies published by Roy F. Baumeister and his colleagues, who showed that doing a task that requires effortful control results in a transient reduction of one's capability to exert further SC (Muraven and Baumeister, 2000). For instance, watching a movie with an instruction to ignore subtitles appearing at the bottom of the screen makes a person less able to do higher-order cognitive tasks, such as cognitive tests (Baumeister et al., 1998). Such studies provided the empirical background for the so-called strength theory of SC, also known as the 'ego depletion' theory (Baumeister et al., 2007; Hagger et al., 2010), according to which self-regulation is a kind of resource that can be 'spent' on tasks requiring effortful control. The more is 'spent' on a preceding task, the less can be 'spent' on the following tasks, unless the resources are renewed during a recreational break. The strength and generality of the 'ego depletion' effect, which we suggest labeling with the neutral term 'the Baumeister effect,' are now debated (Lurquin et al., 2016). Secondly, there are studies published by Mischel (1974) and his collaborators (Mischel et al., 1989), showing the high predictive value of one's ability to refuse immediate gratification for the sake of a much larger but delayed reward. In the so-called marshmallow paradigm, preschool children were rewarded with one cake, which they could eat immediately, unless they decided to wait for the second cake, which, unbeknownst to them, would be delivered 15 min later. The median waiting time for the second cake was about 7 min, although some children could not wait longer than 1 min whereas others could withstand the whole waiting period. These huge individual differences in the ability to delay gratification, measured in the preschool period, proved highly predictive of important aspects of adult life, such as higher income, better and more stable relationships, and reduced vulnerability to addictions (Mischel et al., 1988; Casey et al., 2011).

Recent approaches to SC underline its involvement in the process of value-based decision-making (Inzlicht and Berkman, 2015; Berkman et al., 2017). The decision to exert SC, or not, is described as a function of choice, determined by different values ascribed to potential personal goals. According to this account, sometimes people are able to delay gratification because the value of the delayed goal is much higher than the value of the immediately accessible goal, although the latter looks rather tempting and may be reached without any effort. In other cases, people with enough resources to control themselves may decide that immediate pleasure is more valuable than a long-distance goal, whose attainment needs time and effort. The 'Baumeister effect' can therefore be accounted for in terms of value weighting and failures of motivation rather than in terms of depletion of 'ego resources.'

In this paper, we discuss the problem of the cognitive underpinnings of SC, understood as a relatively stable individual trait. We assume that such a trait can be assessed with reliable psychometric tools, and that the scores a person obtains with such tools can be related to other individual traits, such as personality and intelligence. Next, we assume that the trait of SC is subserved by specialized cognitive functions, similarly to other individual traits. For instance, it has been convincingly demonstrated that general fluid intelligence depends on individual differences in working memory capacity (e.g., Colom et al., 2004; Chuderski and Necka, 2012), and the trait of creativity is related to divergent thinking skills (McCrae, 1987; Baer, 1993). Since stable traits are hardly susceptible to experimental manipulations, the studies on cognitive underpinnings of individual differences are mostly correlational in nature, so causal explanations are quite risky. It may be claimed, for example, that the capacity of working memory determines the level of general fluid intelligence or that the level of general intelligence determines accuracy in dealing with working memory tasks. The former account is sometimes referred to as the bottom-up approach (cognitive functions determine the general trait), whereas the latter one is called the top-down approach (the general trait determines cognitive functions). The training studies (e.g., Jaeggi et al., 2008) showed that enhancement of intelligence may result from systematic improvement of working memory capacity (a far transfer effect), which favors the bottom-up stance. The bottom-up explanations, according to which specific cognitive functions determine the level of intelligence, rather than the opposite, are also supported by theoretical considerations (e.g., Sternberg, 2008). However, there are serious doubts as to whether intelligence really can be improved through training (e.g., Shipstead et al., 2012), so the bottom-up explanations of intelligence still need stronger empirical evidence.

As regards the trait of SC, there is a widespread conviction that it is cognitively subserved by executive functions (EFs). According to a definition proposed by Akira Miyake and his coworkers, EFs are '. . .general-purpose control mechanisms that modulate the operation of various cognitive subprocesses and thereby regulate the dynamics of human cognition' (Miyake et al., 2000, p. 50). Various cognitive processes, involved in the reception and storage of information (perception, memory), but also implicated in the manipulation of mental representations (thinking), need some kind of integration and supervision. Without such management, human cognition would disintegrate and be unable to fulfill its fundamental function, namely, the control of behavior. In other words, cognitive processes must be effectively controlled in order to be able to command our behavior (Diamond, 2013). Cognitive control seems particularly important in situations that require overriding automatic behavioral tendencies, since such situations are very complex, unexpected, or novel. In such situations, a dominant behavioral tendency must be suppressed (inhibition), a new pattern of behavior or a new mental set must be initiated (shifting), and the awareness concerning the ongoing task must be refreshed (working memory updating). No wonder, then, that Miyake and co-workers consider Inhibition, Shifting, and Updating to be the most important EFs.

The definition proposed by Miyake et al. (2000) declares that EFs are general-purpose mechanisms, meaning that they should be implicated in all kinds of cognition. However, they appear particularly important for higher-order cognitive processes, such as thinking and problem solving. Consequently, EFs must be considered important determinants of individual differences in cognition. Indeed, the results of research on intelligence (e.g., Nęcka, 1998; Chuderski and Nęcka, 2010; Cole et al., 2012) and creativity (e.g., Groborz and Nęcka, 2003; Benedek et al., 2014) support this conclusion. Regarding SC, many authors seem to be convinced that EFs demonstrate huge individual differences that underlie the individual level of SC (e.g., Hofmann et al., 2009, 2012). For instance, Kotabe and Hofmann (2015, p. 625) maintain that 'the importance of EFs to SC is clear.' In the theoretical model outlined by these authors, individual differences in SC depend, among other cognitive and motivational factors, on the capacity to exert control. This capacity is supposed to be measured by tasks that engage executive control.

On the one hand, the importance of EFs for SC has been demonstrated in many studies. For instance, the longitudinal study carried out by Friedman et al. (2011) showed that preschool children who were able to restrain themselves from immediate gratification demonstrated, as adolescents, a higher level of the common EF factor (closely related to Inhibition) and Shifting, but not Updating. Young et al. (2009) determined that behavioral misconduct among adolescents (e.g., substance abuse) was correlated with lower scores in three EF tasks: Stroop, anti-saccade, and stop-signal. People who were able to delay gratification in the marshmallow experiment at the age of four showed better performance in the 'go-no go' and prepotent response inhibition tasks at the age of 16–18 (Eigsti et al., 2006). There are also findings suggesting that criminal and violent behavior may be related to deficient executive control (Meijers et al., 2015).

On the other hand, there are studies showing very weak relationships between EF task performance and self-report measures of behavioral control (Nęcka et al., 2012). Duckworth and Kern (2011) carried out a meta-analytic study (282 samples, 34,564 participants), trying to establish the strength of relationships between various measures of SC (self-report, informant report, delay of gratification) and executive control. The authors found rather weak inter-correlations between various types of tasks, but also within each type of task. For instance, EF tasks appeared inter-correlated among themselves at the level of r = 0.14; the average correlations between EF tasks and other groups of measures appeared weak as well: r = 0.11 for delay tasks, r = 0.10 for self-reports, and r = 0.14 for informant reports. Interestingly, average convergent validity measures appeared much higher for self-report (r = 0.48) and informant-report (r = 0.54) measures. These results suggest that both SC and executive control are highly heterogeneous constructs that need to be assessed with heterogeneous batteries of tests, scales, or questionnaires. They also suggest that the category of EF tasks is much more diversified than the category of self-report and informant-report measures. The low level of inter-correlations between various EF tasks may result from their 'impurity,' meaning that such tasks measure not only one specific EF but also other functions, not to mention a number of other factors, such as general speed of responding, attentional alertness, susceptibility to boredom during long experimental sessions, lack of computer phobia, etc.

In this paper, we attempt to investigate the relationship between executive control, measured with standard EF tasks, and SC, measured with both self-report and informant-report questionnaires. In order to overcome the problem of diversity and 'impurity' of EF tasks, we adopted the structural equation modeling (SEM) approach with a relatively large sample of participants. The SEM approach allows the extraction of latent variables that ignore the specificity of various tasks, thus expressing the common factor that these tasks refer to. Additionally, we included two general intelligence tests, in order to establish whether possible relationships between SC and EFs would be moderated by general mental ability, which is implicated in executive control as well.

# MATERIALS AND METHODS

# Participants

We investigated 296 participants recruited via two social media networks. There were 133 men and 163 women in the sample. Their mean age was 23.31 years (SD = 3.64). All participants were from outside of the Psychology Department. Participants obtained 60 PLN (ca. 15 €) for 4 h of testing, including a 15-min refreshment break.

#### Ethics Statement

The committee for ethics in studies involving human participants, assigned by the Institute of Psychology, Jagiellonian University in Krakow, approved this study on the basis of an extended description of methods, materials, and procedure. In accordance with the Declaration of Helsinki, participants signed written informed consent forms.

# Self-Control Measures

#### NAS-50

This is a self-report questionnaire of SC developed by us (Nęcka et al., 2016). It consists of 50 items divided into five subscales: Initiative and Persistence (IP), Proactive Control (PC), Switching and Flexibility (SF), Inhibition and Adjournment (IA), and Goal Maintenance (GM). This tool was validated in a study with 934 participants (see: Nęcka et al., 2016). Its reliability was assessed with internal consistency measures (Cronbach's α = 0.86) and the test/retest approach (intraclass correlation coefficient ICC = 0.94). The validation study revealed that the five subscales correlated with the NAS-50 general score at a moderate or high level (+0.47 < r < +0.70, depending on the subscale). The general score turned out to be strongly associated with Baumeister's SC Scale (see: Tangney et al., 2004) (r = 0.77). Also, the Conscientiousness dimension of the Big Five model predicted the NAS-50 general score (r = 0.54). Thus, the convergent validity of NAS-50 seems satisfactory. As to divergent validity, this measure appeared completely independent of general mental ability scores (see: Nęcka et al., 2016, for details).

#### NAS-40

This is a variant of NAS-50 prepared for informant-report studies. We removed 10 items from the original version (NAS-50) because their overly introspective content would make them difficult for informants to answer. The remaining 40 items were converted into the third-person grammatical form (e.g., 'He/she is usually not late for meetings' instead of 'I'm usually not late for meetings'). In this way, NAS-40 can be filled in by somebody who knows the participant (a colleague, a teacher, etc.). The reliability measures of NAS-40 turned out to be satisfactory (α = 0.84, ICC = 0.92).

#### Self-Control Scale (SCS)

The Self-Control Scale (SCS), developed by Tangney et al. (2004), is a self-report questionnaire consisting of 36 items. The authors report good reliability characteristics (Cronbach's α = 0.89, test–retest reliability = 0.89).

#### Conscientiousness (C)

fpsyg-09-01139 July 5, 2018 Time: 19:55 # 4

We administered the NEO-FFI questionnaire (Costa and McCrae, 1992) in the Polish adaptation (Zawadzki et al., 1998). This tool was important for its Conscientiousness scale, since description of this personality dimension pertains to some aspects of SC, understood as an individual trait.

#### Executive Control Tasks

We administered a battery of five computerized EF tasks that were intended to engage three major EFs: Inhibition, Shifting, and Updating (Miyake et al., 2000). For Inhibition, we selected the Stop signal task. For Shifting, we chose the CATT procedure, already used in some of our studies (Nęcka et al., 2012). For Updating, we decided on the n-back procedure, which requires constant refreshing of the content of working memory. Additionally, this specific version of n-back requires that false signals (a.k.a. 'lures') be ignored, so this task allows assessment of the Inhibition function as well. The second task engaging the function of Updating is called COUNT, since it requires mental counting of sequentially presented stimuli up to their third appearance and then again from the beginning. Furthermore, we administered the Stroop task, although it is hard to decide which specific EF this task refers to. In spite of its 'impurity', it is widely used in cognitive control research as an example of the category of interference resolution tasks (Chuderski et al., 2012). It is supposed to capture the Inhibition function as well (Miyake et al., 2000). All these tasks have already been used in many studies carried out in our lab (e.g., Chuderski and Nęcka, 2010; Chuderski and Nęcka, 2012; Chuderski et al., 2012; Nęcka et al., 2012). Ideally, it would be advisable to have at least two tasks engaging each EF, and this was our initial plan. However, we could not find an acceptable version of a second task that would involve the function of Shifting, so we decided to use only the CATT procedure. The function of Updating is represented by two tasks: COUNT and n-back, the latter being relevant with respect to signal detection only. The function of Inhibition is represented by Stop-Signal, Stroop, and n-back again, the latter being relevant as far as inhibition of distracting lures is concerned.

#### Stop Signal

Participants performed the stop signal task (SST; Logan, 1994) as modified by Verbruggen et al. (2008). Pictures of an arrow pointing left or right served as the visual stimuli in this version of the SST. Participants were asked to press the left or right arrow key according to the direction of the arrow on the screen. These go stimuli were presented randomly one at a time, each with 50% probability. Participants were supposed to respond as quickly and accurately as possible unless an auditory stop signal was presented over the headphones, in which case they were instructed to withhold the response. The stop-signal delay (SSD), i.e., the interval between the go and stop stimuli, was set to 250 ms at the start of each experimental block. After successful inhibition the SSD became 50 ms longer, and after unsuccessful inhibition it became 50 ms shorter (minimum 50 ms, maximum 1150 ms). This task allows calculation of the stop signal reaction time (SSRT), according to the following rationale. SSRT is the time elapsing between the stop signal and the internal (i.e., mental) reaction to this signal. If there is a 0.50 probability of responding in spite of the stop signal, the unobserved internal response to the stop signal must, on average, be completed at the same moment as the go response, that is, at the mean reaction time for go responses. Since SSD is adjusted on the basis of the accuracy observed in the most recent stop trial, the probability of responding in spite of the stop signal converges on 0.50. Therefore, SSRT is calculated as the difference between mean RT and adjusted SSD (see: Verbruggen et al., 2008). The shorter (faster) the SSRT, the better is one's ability to inhibit an unnecessary response.
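The staircase and the SSRT rationale described above can be illustrated with a minimal Python sketch. It is not the software used in the study: the function names and the example data are hypothetical, and "adjusted SSD" is taken here to mean the mean SSD across stop trials, which is an assumption on our part.

```python
from statistics import mean

SSD_START_MS = 250                   # starting delay, as stated in the text
SSD_STEP_MS = 50                     # adjustment after every stop trial
SSD_MIN_MS, SSD_MAX_MS = 50, 1150    # limits stated in the text


def next_ssd(ssd_ms, inhibited):
    """Lengthen the delay after a successful stop, shorten it after a failed one."""
    step = SSD_STEP_MS if inhibited else -SSD_STEP_MS
    return min(SSD_MAX_MS, max(SSD_MIN_MS, ssd_ms + step))


def track_ssd(stop_outcomes, start_ms=SSD_START_MS):
    """Return the SSD used on each stop trial (True = response was inhibited)."""
    ssds, ssd = [], start_ms
    for inhibited in stop_outcomes:
        ssds.append(ssd)
        ssd = next_ssd(ssd, inhibited)
    return ssds


def ssrt_mean_method(go_rts_ms, stop_ssds_ms):
    """SSRT estimated as mean go RT minus mean SSD (assumed reading of 'adjusted SSD')."""
    return mean(go_rts_ms) - mean(stop_ssds_ms)


# Hypothetical data, for illustration only
go_rts = [480, 520, 500, 510, 490]
outcomes = [True, False, True, True, False, False]
ssds = track_ssd(outcomes)
ssrt = ssrt_mean_method(go_rts, ssds)
print(ssds, round(ssrt, 1))
```

Because the staircase moves the delay up after each success and down after each failure, the SSD hovers around the value at which stopping succeeds on about half of the stop trials, which is what the subtraction relies on.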

#### N-Back

We used the figural version of the n-back task, the same as in Experiment 5 reported by Chuderski and Nęcka (2012). The task consisted of the serial presentation of simple figural symbols, such as a star, a triangle, an arrow, etc., each approximately 2.5 cm × 2.5 cm in size. Stimuli remained on the screen for 1500 ms and were masked for 300 ms. The task consisted of four series. In every series we presented 88 stimuli, so altogether 352 stimuli were shown to each participant, plus some training stimuli before each series. Sixteen out of 88 stimuli in every series were presented twice. The participants were supposed to decide whether the second presentation took place n elements after the first one. The predefined n equaled two. Hence, participants were instructed to press the space bar if and only if the currently presented symbol had already appeared two items back. For instance, if a symbol reappeared in the stream of stimuli separated by just one other symbol (e.g., star, triangle, star again), this repeated symbol became a target that required detection and a speedy response with the space bar. If an item reappeared too early, i.e., immediately after its first presentation, or too late, i.e., separated by two symbols instead of just one, it was to be ignored. Stimuli reappearing too early (n = 1) or too late (n = 3) were classified as 'lures,' since their function was to 'tempt' participants to respond inaccurately. There were eight targets, four n = 1 lures, and four n = 3 lures in every series. The majority of stimuli (72 in every series) did not reappear shortly after their first presentation. These stimuli may be termed "noise," since they were to be ignored. If a participant responded to such a stimulus, he/she committed a false alarm error. Also, if a participant pressed the space bar in response to a stimulus that reappeared at a "wrong" position, i.e., n = 1 or n = 3, he/she committed a lure detection error. We registered accuracy scores for each participant, defined as the proportion of correct signal detections and the proportion of erroneous lure detections. We also registered the reaction time of every response.
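The classification of stimuli into targets, lures, and noise, together with the two accuracy scores, can be sketched in Python as follows. This is only an illustration of the scoring logic under our reading of the description; the function names, the toy stream, and the response vector are hypothetical.

```python
def classify_stream(stream, n=2, lure_lags=(1, 3)):
    """Label each item as 'target' (matches the item n back),
    'lure' (matches the item 1 or 3 back), or 'noise' (anything else)."""
    labels = []
    for i, symbol in enumerate(stream):
        if i >= n and stream[i - n] == symbol:
            labels.append("target")
        elif any(i >= k and stream[i - k] == symbol for k in lure_lags):
            labels.append("lure")
        else:
            labels.append("noise")
    return labels


def score(labels, responded):
    """Proportion of detected targets, proportion of lures responded to,
    and the count of false alarms to noise items."""
    pairs = list(zip(labels, responded))
    hits = sum(1 for lab, r in pairs if lab == "target" and r)
    lure_errors = sum(1 for lab, r in pairs if lab == "lure" and r)
    false_alarms = sum(1 for lab, r in pairs if lab == "noise" and r)
    n_targets = labels.count("target")
    n_lures = labels.count("lure")
    return {
        "hit_rate": hits / n_targets if n_targets else 0.0,
        "lure_error_rate": lure_errors / n_lures if n_lures else 0.0,
        "false_alarms": false_alarms,
    }


# Toy stream, for illustration only
stream = ["star", "triangle", "star", "arrow", "arrow",
          "circle", "square", "moon", "circle"]
labels = classify_stream(stream)
responded = [False, False, True, False, True, False, True, False, False]
result = score(labels, responded)
print(labels, result)
```

In this toy stream the second "star" is a target (two back), the second "arrow" is an n = 1 lure, and the second "circle" is an n = 3 lure.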

#### COUNT

This task was based on the mental counters procedure (Larson, 1986). Participants were presented with a sequence of randomly repeated figures: triangle, circle, or square. They were supposed to count how many times each figure had already appeared. If the currently displayed figure appeared for the third time in the sequence, the participants had to press the space key. Additionally, the third appearance of any figure meant that counting of this particular type of stimulus should start again from the beginning. In this way, the participants had to keep in mind and update three 'stacks' of elements (i.e., figures). Auditory feedback was given after each erroneous reaction or failure to react to the third presentation of any figure. There were 45 instances of the third repetition (15 for each figure). The program registered the number of misses (failures to signal the third repetition), the number of other errors, and the mean response time.
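The counting rule can be made concrete with a short Python sketch that finds the positions at which a response is required. The function name and the example sequence are hypothetical and serve only to illustrate the logic of three independent counters that reset after the third appearance.

```python
def count_task_targets(sequence, reset_at=3):
    """Return the indices at which a figure appears for the third time
    since its counter was last reset (i.e., where a key press is required)."""
    counters = {}
    targets = []
    for i, figure in enumerate(sequence):
        counters[figure] = counters.get(figure, 0) + 1
        if counters[figure] == reset_at:
            targets.append(i)
            counters[figure] = 0   # counting of this figure starts again
    return targets


# Toy sequence, for illustration only
sequence = ["triangle", "circle", "triangle", "square",
            "triangle", "circle", "triangle", "circle"]
targets = count_task_targets(sequence)
print(targets)
```

Here the third triangle (index 4) and the third circle (index 7) require a response, whereas the fourth triangle (index 6) merely restarts the triangle counter at one.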

#### CATT

This task allows the analysis of controlled switching of attention, and its logic was borrowed from Meiran (1996). Participants were presented with single digits, which appeared on the screen for 3 s or until a response was made. They were instructed either to categorize the digits as odd (left key) or even (right key), or to categorize them as smaller than five (left key) or bigger than five (right key). Of course, the digit "5" had to be removed from the set of stimuli. Given that the task required double categorization, the participants were provided with cues that indicated which task they should perform in the upcoming trial. The cues were single words followed by a question mark, i.e., "EVEN?" or "SMALLER?", and they appeared 500 ms before the stimulus proper. Participants were first trained in the correct use of instructions, response keys, and cues (20 trials, 1000 ms for a cue). Then, they were asked to perform a series of 148 trials, in which repeat and switch conditions were arranged in random order. Participants had 3000 ms to respond (4000 ms in the training phase). Each digit that served as a stimulus was masked for 500 ms (1000 ms in the training phase). We registered the reaction time of correct responses as well as misses and false alarms. Participants were asked to be accurate rather than quick.
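The response mapping and the distinction between repeat and switch trials can be sketched in Python as below. The mapping of the cue "EVEN?" to the parity task and of "SMALLER?" to the magnitude task is our reading of the description, and all function names and example values are hypothetical.

```python
def correct_key(digit, cue):
    """Return the correct response key for a digit under a given cue."""
    if digit == 5:
        raise ValueError("the digit 5 is not part of the stimulus set")
    if cue == "EVEN?":                       # parity task: odd = left, even = right
        return "right" if digit % 2 == 0 else "left"
    if cue == "SMALLER?":                    # magnitude task: below five = left
        return "left" if digit < 5 else "right"
    raise ValueError(f"unknown cue: {cue}")


def label_trials(cues):
    """Label every trial as 'first', 'repeat' (same cue as before), or 'switch'."""
    labels = []
    for i, cue in enumerate(cues):
        if i == 0:
            labels.append("first")
        else:
            labels.append("repeat" if cue == cues[i - 1] else "switch")
    return labels


# Toy trial list, for illustration only
cues = ["EVEN?", "EVEN?", "SMALLER?", "SMALLER?", "EVEN?"]
digits = [2, 7, 3, 8, 9]
trial_types = label_trials(cues)
keys = [correct_key(d, c) for d, c in zip(digits, cues)]
print(trial_types, keys)
```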

#### Stroop

We used the numerical version of the Stroop task, which required counting digits and ignoring their meaning (Fox et al., 1971; Chuderski et al., 2012). The screen showed three, four, five, or six exemplars of a digit drawn from the set [3, 4, 5, 6]. Each digit was 0.6 cm × 0.8 cm in size. In congruent trials, the number of stimuli matched the digit being displayed (e.g., four exemplars of the digit '4'). In incongruent trials, the two differed (e.g., five exemplars of the digit '4'). Trials lasted 3 s or until a response was given. There was also a neutral condition, in which participants were supposed to respond to the number of stimuli that were not digits. The instruction was to avoid reading the digit and to press the response key assigned to the presented number of stimuli. There were 60 congruent, 60 incongruent, and 60 neutral stimuli altogether. The accuracy and latency of each response were registered. Dependent variables (DVs) were as follows: the number of correct responses in each condition and the average response time in each condition.
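The three conditions and the two kinds of DVs can be illustrated with the following Python sketch. It assumes that the correct response is always the number of items on the screen and that mean response time is taken over all responses in a condition; the text does not specify the latter, and the function names and trial data are hypothetical.

```python
from statistics import mean


def condition(n_items, symbol):
    """Classify a trial from the number of items shown and the symbol displayed."""
    if not symbol.isdigit():
        return "neutral"
    return "congruent" if int(symbol) == n_items else "incongruent"


def summarize(trials):
    """trials: iterable of (n_items, symbol, response, rt_ms).
    Returns the number of correct responses and the mean RT per condition."""
    collected = {}
    for n_items, symbol, response, rt_ms in trials:
        entry = collected.setdefault(condition(n_items, symbol),
                                     {"correct": 0, "rts": []})
        entry["correct"] += int(response == n_items)   # correct answer = item count
        entry["rts"].append(rt_ms)
    return {cond: {"correct": e["correct"], "mean_rt": mean(e["rts"])}
            for cond, e in collected.items()}


# Toy trials, for illustration only
trials = [
    (4, "4", 4, 600),   # congruent, correct
    (5, "4", 5, 720),   # incongruent, correct
    (5, "4", 4, 650),   # incongruent, error (the digit was read)
    (3, "#", 3, 580),   # neutral, correct
]
summary = summarize(trials)
print(summary)
```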

# Intelligence Tests

#### Raven

In order to assess their level of fluid intelligence, participants were given Raven's Advanced Progressive Matrices (RAPM; Raven et al., 1983) in the paper-and-pencil form. This test consists of 36 items, each comprising a three-by-three matrix of figural patterns. The bottom-right pattern is always missing. A testee is supposed to fill in this blank space with the correct pattern, which he/she can choose from eight response options provided at the bottom of the sheet. This test is regarded as a good estimate of general fluid intelligence, since it requires grasping the abstract rules that govern the composition of the matrix and applying these rules when choosing the correct response. The pilot study showed that the whole procedure was too long and tiresome for participants; therefore, only the even items from RAPM were administered, which did not worsen the reliability of assessment.

#### Analogies (TAO)

We also administered another paper-and-pencil test of fluid reasoning, which requires understanding and using relations of analogy. This analogical reasoning test was designed in our lab by Jarosław Orzechowski and Adam Chuderski. It has been used in several published studies (e.g., Chuderski and Nęcka, 2012; Chuderski et al., 2012). The test includes 36 figural analogies in the form 'A is to B as C is to X,' where A, B, and C are relatively simple patterns of figures, A is related to B according to two, three, four, or five latent rules (e.g., symmetry, rotation, change in size, color, thickness, number of objects, etc.), and X is an empty space. The task is to choose the one figure out of four alternatives that relates to figure C as B relates to A. Again, only the even items of TAO were administered.

#### Procedure

Participants were invited to the lab in pairs. In the ads for this study, we set the precondition that two people were welcome to come together if they had known each other for at least 6 months. This requirement was important, since every participant was supposed to fill in all self-report questionnaires plus one informant-report tool (i.e., NAS-40) pertaining to the colleague he/she appeared with. After reporting to the lab, participants filled in the informed consent form, and then they completed the tasks in the following sequence: NAS-50, NAS-40, SCS, CATT, COUNT, n-back, Stroop, Stop signal, NEO-FFI, Raven, TAO. In the middle of the procedure, which took ca. 4 h altogether, participants had a 15-min break during which they could have snacks and soft drinks.

#### RESULTS

In **Table 1** we report basic descriptive statistics. Since every computerized EF task yielded several indices, such as latencies and (in)accuracy measures for different series or levels of difficulty, we do not report all possible measurement outcomes. Only the DVs that entered into further structural modeling are displayed in **Table 1**. The ranges and standard deviations suggest that there were substantial inter-individual differences among participants, which justifies the further correlational and structural analyses.

#### TABLE 1 | Descriptive statistics.

NAS-50, Self-control questionnaire, self-report version (Nęcka et al., 2016). NAS-40, Self-control questionnaire, informant version (Nęcka et al., 2016). SCS, Self-Control Scale (Tangney et al., 2004). C (NEO-FFI), Conscientiousness from the NEO-FFI questionnaire. TAO, Analogical Reasoning Test. RAPM, Raven's Advanced Progressive Matrices. CATT (errors), Task switching, DV – the overall number of errors. COUNT (errors), Mental counters procedure, DV – the overall number of errors. SST (ssrt), Stop signal task, DV – stop signal reaction time. N-BACK (correct), Figural n-back task, DV – number of correct responses (n = 2). N-BACK (lures), Figural n-back task, DV – number of erroneous responses to lures (n = 1 or n = 3).

TABLE 2 | Zero-order correlation coefficients between the indices of self-control (NAS-50, NAS-40, SCS, C), general fluid intelligence (TAO, RAPM), and executive functions (CATT, COUNT, SST, N-BACK correct, N-BACK lures).

NAS-50, Self-control questionnaire, self-report version (Nęcka et al., 2016). NAS-40, Self-control questionnaire, informant version (Nęcka et al., 2016). SCS, Self-Control Scale (Tangney et al., 2004). C (NEO-FFI), Conscientiousness from the NEO-FFI questionnaire. TAO, Analogical Reasoning Test. RAPM, Raven's Advanced Progressive Matrices. CATT (errors), Task switching, DV – the overall number of errors. COUNT (errors), Mental counters procedure, DV – the overall number of errors. SST (ssrt), Stop signal task, DV – stop signal reaction time. N-BACK (correct), Figural n-back task, DV – number of correct responses (n = 2). N-BACK (lures), Figural n-back task, DV – number of erroneous responses to lures (n = 1 or n = 3). ∗p < 0.05 (two-tailed). ∗∗p < 0.01 (two-tailed).

**Table 2** shows the zero-order correlation coefficients pertaining to the variables described above. Since some DVs were not normally distributed, they were log-transformed before entering the correlational analyses. We can see that there are strong inter-correlations between various indices of psychometric SC. In particular, the NAS-50 total score turned out to be highly correlated with the SC Scale (r = 0.759) and with the Conscientiousness scale from the NEO-FFI questionnaire (r = 0.726). The informant version of our scale (NAS-40) shows much weaker, albeit positive and significant, correlations with the self-report tools (0.290 < r < 0.303, depending on the scale). It seems that assessment provided by peers reveals somewhat different aspects of SC than assessment based on one's own judgment. **Table 2** also shows that the two tests of general fluid intelligence are correlated at the r = 0.551 level, which is comparable to what has been observed in other studies using these tools (e.g., Chuderski and Nęcka, 2012). In contrast to the above-mentioned relationships, the correlation coefficients pertaining to different measures of EFs turned out to be rather weak, although statistically significant. Some of these correlation coefficients are positive and some are negative because of the nature of the dependent variables (the number of errors versus the number of correct responses). Therefore, the absolute value of these coefficients should be taken as the strength of the relationships. The absolute values range between r = 0.069 (n.s.) and r = 0.354 (p < 0.01). Once again, the mutual relationships between various aspects of executive control proved to be rather weak (compare: Duckworth and Kern, 2011; Nęcka et al., 2012). Notably, the absolute values of the correlation coefficients between psychometric SC and EFs range between r = 0.009 (n.s.) and r = 0.128 (p < 0.05), and only two of them, out of twenty, reached significance at the 0.05 level.
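To make the significance levels attached to these coefficients easier to follow, the sketch below approximates the two-tailed p-value of a correlation with the Fisher z transformation. It is an illustration only: it assumes that all 296 participants contributed to every coefficient, and it is not necessarily the test used to produce Table 2 (statistical packages typically rely on the t distribution). The function name is hypothetical.

```python
import math


def fisher_z_p(r, n):
    """Approximate two-tailed p-value for H0: rho = 0 via the Fisher z transformation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))


N = 296  # sample size reported in the Participants section
for r in (0.069, 0.128, 0.354):
    print(r, round(fisher_z_p(r, N), 4))
```

With this sample size, a coefficient of 0.128 falls just below the 0.05 threshold, whereas 0.069 does not, which agrees with the labels given in the text.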

In the next step of data analysis, we tested structural models that were supposed to capture the relationships between latent variables. The relationships between SC, fluid intelligence, and EFs were tested by means of latent variable modeling with IBM SPSS Statistics Amos v. 24, using the maximum likelihood (ML) estimation method. The latent variable SC was defined by the following measures: NAS-50, NAS-40, SC Scale, and C (NEO-FFI). The latent variable executive functioning was defined by five measures stemming from four tasks: the total number of errors for the CATT task, the total number of errors for the COUNT task, the stop-signal reaction time (SSRT) obtained from the Stop-Signal task, and two measures from the n-back task: the number of correct responses for the N2 condition and the number of lures for the N1 and N3 conditions. Note that all indicators except the number of correct responses for the N2 condition were reversed: they were either errors or response latencies. Therefore, their higher values indicate lower performance, thus justifying the negative signs of the relationships reported in **Figures 1, 2**. Finally, the latent variable Fluid intelligence was defined by two tasks: TAO and RAPM. It must be underscored that the interference effect from the Stroop task, computed as the proportion of RT in the incongruent and congruent conditions, did not contribute to the executive functioning latent variable, since the loading was as low as 0.11. Other DVs obtained from the Stroop task (e.g., latencies, error rates) did not contribute anything, either. For these reasons, we excluded the Stroop task indices from further analyses, although some models that included Stroop showed acceptable fit.

In the first model, SC was regressed on executive functioning. No error correlations were specified. The analysis revealed evidence for moderate non-normality (skew < 2.7, kurtosis < 7.1) for some measures. This model showed an acceptable model fit: χ²(26) = 34.301, p = 0.128; CFI = 0.987, RMSEA = 0.033 (90% CI: 0.000, 0.060). **Figure 1** displays the standardized path coefficients of this model. Note that the fit of the measurement model of SC was satisfactory: χ²(2) = 0.766, p = 0.682; CFI = 1.000, RMSEA = 0.00 (90% CI: 0.00, 0.087). Similarly, the fit of the measurement model of executive functioning was very good: χ²(2) = 1.067, p = 0.586; CFI = 1.000, RMSEA = 0.00 (90% CI: 0.00, 0.096).
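The RMSEA point estimates above can be recovered from the reported χ² values with the commonly used formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))). The Python sketch below is offered as a plausibility check rather than a description of Amos internals; it assumes N = 296 for every model, and some programs use N instead of N − 1 in the denominator.

```python
import math


def rmsea(chi2, df, n):
    """Point estimate of RMSEA from a model chi-square, its df, and the sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


N = 296  # assumed to apply to every model
structural = rmsea(34.301, 26, N)    # first structural model
sc_measurement = rmsea(0.766, 2, N)  # measurement model of SC
ef_measurement = rmsea(1.067, 2, N)  # measurement model of executive functioning
print(round(structural, 3), sc_measurement, ef_measurement)
```

Whenever χ² is smaller than its degrees of freedom, as in both measurement models, the estimate is truncated at zero, which is why an RMSEA of 0.00 is reported for them.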

In the next step, both SC and fluid intelligence were regressed on executive functioning (see **Figure 2**). Again, no error correlations were specified. This model confirmed that executive functioning did not predict SC (β = 0.003, n.s.). By contrast, it strongly predicted fluid intelligence (β = −0.74, p < 0.001). The correlation between SC and fluid intelligence was set free because zero-order correlation coefficients between SC and intelligence measures turned out to be very weak (see **Table 2**). The fit of this model was also very good: χ²(42) = 50.20, p = 0.180; CFI = 0.990, RMSEA = 0.026 (90% CI: 0.00, 0.049). **Figure 2** shows the standardized path coefficients of this model.

We report only the models that obtained acceptable fit indices. Alternative models, built with other DVs, did not fit properly with the data. In particular, models in which the Stroop task was taken into account turned out unacceptable.

#### DISCUSSION

In order to examine the significance of EFs for the trait of SC in healthy adult volunteers, we investigated 296 people with a battery of five EF tasks and three psychometric measures of SC. We also added two general fluid intelligence (Gf) tests with the intention of checking whether potential relationships between SC and EF would be affected in some way (i.e., strengthened, weakened, mediated) by Gf. In the structural equation modeling approach, we extracted three latent variables, representing executive control, behavioral control, and general fluid intelligence. We found that the EF–SC relationship was non-existent, whereas the EF–Gf relationship turned out to be quite strong. No relationship between SC and intelligence became evident.

The lack of a relationship between the latent variables representing executive control, measured with EF tasks, and psychometric SC, measured with questionnaires, is probably the most important finding of this study. On the one hand, it might be regarded as unexpected, taking into account the widespread conviction about the importance of EFs for effective control of behavior (e.g., Hofmann et al., 2009, 2012; Diamond and Lee, 2011; Kotabe and Hofmann, 2015). According to this stance, EFs play a crucial role in determining the efficacy of behavioral SC, being its cognitive substructure. On the other hand, our findings should not be surprising in the context of other studies reporting rather weak relationships of executive control tasks with self-reported measures of behavioral control (e.g., Reynolds et al., 2006; Nęcka et al., 2012). The meta-analysis performed by Duckworth and Kern (2011) seems particularly interesting from this point of view because the authors found that the average correlation coefficient between these two types of measures, obtained after examining 282 studies, was as low as r = 0.10. What is a tenable explanation of these discrepancies, then?

To begin with, there is a possibility that SC is a personality trait rather than a cognitive ability. Personality traits are believed to be independent of both general intelligence and the particular lower-order abilities constituting the g factor, although there are arguments that a change of research paradigms might reveal still unknown relationships (Ackerman, 2018). According to mainstream personality research, major personality dimensions should be regarded as orthogonal to mental abilities. Apart from the existing body of evidence (e.g., McCrae and Costa, 1987; Ackerman and Heggestad, 1997), this conviction may be supported by theoretical arguments. For instance, personality traits are usually bipolar in nature (e.g., extraversion versus introversion) and neither of their poles is regarded as 'better' or 'worse' as such. Rather, being close to one of the extremes may help in specific tasks, situations, or job requirements: extraverts usually do better as salespersons, whereas introverts may prevail in laboratory jobs (Barrick and Mount, 1991). Intellectual traits work in a different manner, since it is usually beneficial for a person to demonstrate a high rather than a low level of cognitive abilities. General intelligence seems particularly helpful because it contributes to performance in all cognitive tasks. Another argument pertains to the distinction between typical and maximal performance (Goff and Ackerman, 1992). Personality traits shape human behavior in typical, repetitive, everyday situations, whereas intellectual traits determine human performance in very specific situations, such as exams or test-taking sessions, in which a person attempts to obtain the best possible result. IQ scores predict real-life achievements with limited precision because of this gap between typical and maximal performance. Standard personality assessment tools (i.e., questionnaires) include items referring to typical everyday situations, whereas standard intellectual tests consist of cognitive tasks that require the highest possible engagement.

If this line of reasoning is sound, we should treat the trait of SC as a dimension belonging to the realm of personality rather than to the category of cognitive abilities. Specifically, this trait probably does not work according to 'the more the better' principle, which is characteristic of intellectual abilities. It would be fascinating to reveal possible dark sides of a high level of behavioral SC, since over-control may cause a number of problems in social adjustment and personal satisfaction, such as inflexibility or obsessive-compulsive behavior. In any case, the trait of SC may not need any cognitive functions underlying its mode of functioning. Consequently, it would not be expected to enter into any relationship with executive control, measured with EF tasks.

So, it is possible that the trait of SC does not need any underlying cognitive functions, but it is also possible that it needs functions that were not investigated in our study. We based this investigation on the Miyake et al. (2000) model of EFs because of its widespread acceptance and popularity. However, this model lacks some EFs that might be important for SC, mostly for its proactive aspects. Careful planning of behavior, including the creation of a schedule of goals and actions, is undoubtedly an important facet of SC. But planning is rarely taken into account in EF studies, except for some clinical studies in which Shallice's (1982) Tower of London (ToL) task is adopted (Mihalec et al., 2017). Although ToL engages short-term planning of actions, which tends to be impaired in frontal lobe patients, as well as in PD and AD patients, it does not engage the processes involved in the long-term planning performed by healthy people during their goal-oriented activities. Another EF that is absent from the Miyake et al. (2000) model pertains to goal maintenance. Inability to remember the goal of one's currently performed action results in chaotic behavior and excessive dependence on environmental influences, at the expense of behavior triggered by endogenous decisions. By contrast, the ability to maintain goals allows efficient control of actions. Had we included the goal maintenance function in the battery of EF tasks, we might have been able to obtain somewhat stronger relationships between executive control and behavioral control. Inclusion of updating tasks did not help to resolve this problem because such tasks pertain to short-term updating of the content of working memory rather than to long-term keeping in mind of personal goals, particularly their hierarchy of importance and time scheduling. Tasks that would be able to engage long-term processes of planning and goal maintenance are still lacking in the standard list of EF procedures, although they seem to be very much needed.

It is also possible that the trait of SC needs EFs, including the ones that were investigated in our study, but we were unable to unveil such relationships for psychometric reasons. The EF tasks were designed not for psychometric purposes but for the investigation of general aspects of cognition. Therefore, their psychometric properties are quite poor, particularly with respect to stability of measurement. These tasks are also very narrow in scope, meaning that each of them engages a very specific process or function, such as disengagement of attention (the flanker task) or conflict resolution (the Stroop task). For psychometric purposes, the EF tasks should be much broader in scope. Moreover, the existing EF tasks are characterized by a large amount of specific variance resulting from the specificity of stimuli, procedure, implementation, equipment, instructions, etc. There is no standard rule for constructing EF tasks and implementing them in a particular study. Being aware of this problem, we deliberately designed the study in a manner that allowed construction of latent variables, which were supposed to go beyond the specificity of individual tasks and capture the variance common to all tasks. To some extent, we succeeded, because the latent variable representing executive control demonstrated quite a strong relationship with the latent variable representing general fluid intelligence. From this point of view, the lack of a relationship between EF and the trait of SC turns out to be meaningful. If the standardized path coefficient between EF and Gf is rather strong, and the analogous coefficient between EF and SC is non-existent, then this 'negative' result probably supports the stance according to which SC in healthy adults does not depend on the strength of executive control. Still, this conclusion must be supplemented with the caveat that a different set of EF tasks might have resulted in quite a different pattern of relationships between the latent variables.

Another explanation of the lack of any EF–SC relationship pertains to the characteristics of the sample. We investigated healthy adult volunteers who demonstrated a wide range of the trait of SC, whereas studies demonstrating the existence of the EF–SC relation were typically run with special populations, such as incarcerated violent offenders (Seruca and Silva, 2016; Meijers et al., 2017). Still, the relationships reported are rather weak. For instance, Meijers et al. (2017) investigated 130 prisoners with a neuropsychological battery suitable for assessing such functions as response inhibition, planning, attention, shifting, working memory, and impulsivity. They found only one significant difference between violent and non-violent offenders, which referred to response inhibition (partial correlation r = 0.205). They also report a weak relationship between recidivism and planning (r = 0.209). As we can see, some EFs may demonstrate predictive value for SC when the latter is really weak. If the whole range of SC variance is taken into account, such relationships, being generally scarce and weak, disappear. Interestingly, the evidence demonstrating the predictive value of SC for various aspects of life success pertains mostly to a low level of this trait, that is, to a lack of SC. For instance, the results reported by Moffitt et al. (2011) show that it is the low level of SC that predicts such teenage problems as smoking, school absenteeism, or unplanned parenthood. Their participants were divided into quintiles according to the informant-based ratings of SC. The first quintile differed substantially from the rest of the participants, while the fifth quintile, representing the highest level of SC, did not contribute much to the prediction of behavioral conduct or misconduct. It is possible, then, that SC is important for life success in the sense that a lack of it predicts many problems, but a high level of it no longer has predictive value. In other words, there may be a threshold principle involved in this relationship: the trait of SC might be important up to some specific value (threshold), above which it loses significance as a predictor of life success.

Finally, there is a possibility that self-report measures do not provide an exact estimation of the individual capacity to control one's behavior. Consequently, they should be replaced with some more objective measures, such as informant reports (e.g., Moffitt et al., 2011) or specially devised experimental tasks (e.g., Steimke et al., 2016). SC is a highly valued personal trait, therefore the social desirability factor is likely to influence the way in which people approach particular items in self-report questionnaires. A deliberate decision to present oneself in a positive way is probably not very likely in procedures that assure full anonymity, as was the case in the present study. Still, at least some participants could choose to present themselves as more 'organized' and 'reliable' than they know is the case in reality. Moreover, the results could be biased not only due to conscious decisions to boost the questionnaire results but also because of reduced awareness of one's own personal traits. We simply may not know how much control we have over our own cognitive control (Nęcka et al., 2012); therefore, our questionnaire responses may not reveal the real state of affairs. However, this kind of bias seems unlikely to act in just one direction, namely, toward an unrealistically high level of assessment. If people are not aware how much control they have, they may either overestimate or underestimate their capability of behavioral control. In consequence, the overall results should not be systematically heightened, although reliability of assessment is likely to suffer. To prevent this threat, items of our questionnaires did not require general knowledge about one's trait but only some level of awareness concerning specific situations. For instance, we did not ask 'Do you think you are a self-controlling person?'. Rather, we attempted to ask, for instance, about being late for meetings or meeting deadlines. Additionally, we supplemented the battery of SC tools with the informant-based questionnaire NAS-40. Still, there is a possibility that the battery of tools supposed to assess the trait of SC suffered from subjectivity and bias toward social desirability.

This study suffers from some limitations that make the final conclusions questionable. Firstly, the number and variety of EF tasks should be increased. EFs responsible for planning and goal maintenance seem particularly important for SC but they are mostly missing in experimental studies, including ours. Working memory updating tasks appear to involve short-term goal maintenance, but not planning. Secondly, assessment of SC should be made more objective, for instance through application of observational scales referring to participants' behavioral characteristics (e.g., Moffitt et al., 2011). We used the informant version of the SC questionnaire, which undoubtedly helped to improve objectivity of assessment, but this solution is far from perfect, mostly because of the limited knowledge the informants may have concerning the 'real' level of SC represented by the participants proper. The objective measures of SC are rather difficult to employ because of the very nature of SC, which seems to be a multi-dimensional and multi-faceted phenomenon. Self-report questionnaires, in spite of all their limitations, have a fundamental advantage: they allow holistic and generalized assessment that goes beyond specific situations and specific impairments. Still, a balanced combination of self-reported and objective sources of knowledge should be adopted in further studies. Finally, our sample of participants, albeit quite large, was probably not diversified enough concerning age, socio-economic status, and the general level of the trait of SC. In particular, we lacked participants who would suffer from mild, sub-clinical impairments of SC. The relationship we were not able to find may hold only for such people.

# CONCLUSION

While planning this study, we assumed that at least weak relationships between the trait of SC and efficiency of executive control would turn out significant. Former studies were conducted with smaller samples and usually without latent variable modeling. Since latent variables go beyond specific variance produced by particular measurement tasks and procedures, thus capturing 'the essence' of the constructs of interest, such modeling seemed much more promising than a regular correlational approach. So far, the hypothesis that EFs constitute the cognitive substrate of the trait of SC must be rejected. In its strong version, our take-home message would be the following: EFs are not significant for SC, probably because they belong to the realm of abilities whereas the latter is a part of the personality domain. In a weak and humble version, the message is that we were not able to prove such a relationship.

# AUTHOR CONTRIBUTIONS

EN designed the study and drafted the manuscript. AG helped to prepare the materials, performed all major statistical analyses, and helped to prepare the final version of the manuscript. JO helped to prepare the materials and improved the final version of the manuscript. MN and NW helped to prepare the materials, participated in the data gathering, and helped to prepare the final version of the manuscript.

# FUNDING

This paper has been prepared thanks to the support from the Polish National Centre of Science (NCN), Grant No. DEC-2013/08/A/HS6/00045.

# ACKNOWLEDGMENTS

The authors wish to thank Olivia Kłodzińska for her cooperation in the process of data gathering.






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Nęcka, Gruszka, Orzechowski, Nowak and Wójcik. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Influence of Fluid Intelligence, Executive Functions and Premorbid Intelligence on Memory in Frontal Patients

Edgar Chan<sup>1,2</sup>\*, Sarah E. MacPherson<sup>3,4</sup>, Marco Bozzali<sup>5</sup>, Tim Shallice<sup>6,7</sup> and Lisa Cipolotti<sup>1,2</sup>

<sup>1</sup> Department of Neuropsychology, National Hospital for Neurology and Neurosurgery, London, United Kingdom, <sup>2</sup> Institute of Neurology, University College London, London, United Kingdom, <sup>3</sup> Centre for Cognitive Ageing and Cognitive Epidemiology, The University of Edinburgh, Edinburgh, United Kingdom, <sup>4</sup> Human Cognitive Neuroscience, Department of Psychology, The University of Edinburgh, Edinburgh, United Kingdom, <sup>5</sup> Neuroimaging Laboratory, Santa Lucia Foundation, Rome, Italy, <sup>6</sup> Institute of Cognitive Neuroscience, University College London, London, United Kingdom, <sup>7</sup> International School for Advanced Studies (SISSA-ISAS), Trieste, Italy

Objective: It is commonly thought that memory deficits in frontal patients are a result of impairments in executive functions which impact upon storage and retrieval processes. Yet, few studies have specifically examined the relationship between memory performance and executive functions in frontal patients. Furthermore, the contribution of more general cognitive processes such as fluid intelligence and demographic factors such as age, education, and premorbid intelligence has not been considered.

#### Edited by:

Kathrin Finke, Friedrich-Schiller-Universität Jena, Germany

#### Reviewed by:

Paul Dockree, Trinity College, Dublin, Ireland Giulio Pergola, Università degli Studi di Bari Aldo Moro, Italy

> \*Correspondence: Edgar Chan edgar.chan1@nhs.net

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 04 February 2018 Accepted: 22 May 2018 Published: 08 June 2018

#### Citation:

Chan E, MacPherson SE, Bozzali M, Shallice T and Cipolotti L (2018) The Influence of Fluid Intelligence, Executive Functions and Premorbid Intelligence on Memory in Frontal Patients. Front. Psychol. 9:926. doi: 10.3389/fpsyg.2018.00926

Method: Our study examined the relationship between recall and recognition memory and performance on measures of fluid intelligence, executive functions and premorbid intelligence in 39 frontal patients and 46 healthy controls.

Results: Recall memory impairments in frontal patients were strongly correlated with fluid intelligence, executive functions and premorbid intelligence. These factors were all found to be independent predictors of recall performance, with fluid intelligence being the strongest predictor. In contrast, recognition memory impairments were not related to any of these factors. Furthermore, age and education were not significantly correlated with either recall or recognition memory measures.

Conclusion: Our findings show that recall memory in frontal patients was related to fluid intelligence, executive functions and premorbid intelligence. In contrast, recognition memory was not. These findings suggest that recall and recognition memory deficits following frontal injury arise from separable cognitive factors. Recognition memory tests may be more useful when assessing memory functions in frontal patients.

Keywords: frontal lobes, recall, recognition, memory, intelligence, executive functions

# INTRODUCTION

It is well-documented that frontal lobe lesions can result in memory difficulties (Wheeler et al., 1995; Kopelman, 2002). Memory impairments that result from frontal lobe lesions are thought to be distinct from pure amnesia, which arises from dysfunction of the diencephalon or temporal brain regions (Buckner et al., 1999). However, the exact nature of frontal lobe memory impairment is still somewhat unclear. For example, it is still debated whether frontal memory impairment manifests as a deficit in recall, recognition or both recall and recognition. Some argue that only recall memory is impaired while recognition memory remains relatively preserved (e.g., Janowsky et al., 1989; Milner et al., 1991). Others have reported impairments in both recall and recognition (e.g., Baldo et al., 2002; Alexander et al., 2003). Recently, memory performance in a large cohort of frontal patients was assessed using the Doors and People battery (Baddeley et al., 1994) which consists of verbal and visual recall and recognition tasks thought to be comparable in terms of difficulty (MacPherson et al., 2016). Frontal patients were found to be significantly impaired on both recall and recognition memory tasks compared to healthy controls. However, in line with the pattern of deficits found in an earlier meta-analysis (Wheeler et al., 1995), the effect sizes were greater for recall compared with recognition memory impairment, suggesting that recall memory is more affected following frontal lobe damage.

Although it is commonly thought that frontal memory impairments are secondary to impairment in executive processes, surprisingly few studies have directly examined the relationship between executive dysfunction and memory impairment. In list learning tasks, it is suggested that executive deficits in frontal patients cause a breakdown in top-down supervisory processes. This breakdown leads to the poor use of organizational strategies such as spontaneous categorization and semantic linkages during memory encoding, and poor search strategies and self-monitoring during memory retrieval (Baldo and Shimamura, 2002). An assumption then is that individuals with greater executive dysfunction will likely have greater memory deficits. Indeed, in the aging literature, it has been argued that memory difficulties in older adults are related to increased vulnerability to executive deficits due to age-related frontal–striatal changes (see Buckner, 2004 for a review). Executive functions have been shown to mediate the relationship between the effects of age and recall memory performance (Troyer et al., 1994; Crawford et al., 2000). Similarly, in early mild Alzheimer's disease, recall memory performance has been shown to be correlated with performance on executive tasks (Baudic et al., 2006).

In patients with frontal lobe lesions, there has generally only been indirect support for the notion that memory impairments are related to executive deficits. A common finding is that word list-learning performance can be improved in frontal patients by explicitly grouping to-be-remembered words into semantic categories during encoding and by providing category cues during recall, thereby presumably reducing the 'executive load' of the task (e.g., della Rocchetta and Milner, 1993; Gershberg and Shimamura, 1995, but see Turner et al., 2007). Only a very few studies have explicitly examined the relationship between memory performance and performance on executive tasks in frontal patients. In one study, a correlation was found between recall memory performance (total number of words recalled) and phonemic fluency performance (FAS) in left dorsolateral frontal patients (Alexander et al., 2003). Interestingly, no similar correlation between fluency and recognition memory performance was found. However, no other executive measures were included in this study, limiting the conclusions that can be drawn.

Besides executive processes, more general cognitive processes may also contribute to memory performance in frontal patients. One prime candidate is fluid intelligence. Deficits in executive tasks in frontal patients have been argued to be underpinned by impairments in fluid intelligence (Duncan et al., 2000). In support of this, it has been shown that differences in performance on some executive tasks between frontal patients and healthy controls can be largely or entirely accounted for by performance on tests of fluid intelligence (Roca et al., 2010; Woolgar et al., 2010; Barbey et al., 2012; Keifer and Tranel, 2013). As such, memory difficulties in frontal patients might be better explained by impairment in fluid intelligence than by impairment in executive functions. Indeed, fluid intelligence has been found to be the strongest predictor of episodic memory performance in healthy individuals (Aizpurua and Koutstaal, 2010).

We have also previously found that demographic factors such as age, years of education and premorbid intelligence, as measured by literacy attainment assessed using the National Adult Reading Test (NART IQ; Nelson and Willison, 1991), can significantly impact on executive impairments and fluid intelligence following frontal lobe injury (Cipolotti et al., 2015a; MacPherson et al., 2017). In a large cohort of frontal patients, we have shown that age and NART IQ are strongly correlated and predictive of performance on two executive tasks, verbal fluency and the Stroop Color Word test, over and above other factors such as lesion severity and chronicity. In addition, age, years of education and NART IQ are also related to fluid intelligence, though age seems to account for most of the unique variance. Indeed, age has been shown to exacerbate impairments in executive functions and fluid intelligence following frontal lesions (Cipolotti et al., 2015b). Whether these variables might also be related to, or mediate, memory performance following frontal lobe injury has yet to be investigated.

The aim of the current study was to increase our understanding of how executive processes relate to memory performance in patients with frontal lesions. Specifically, we wanted to examine the relationship between recall and recognition memory performance and age, education, premorbid intelligence, fluid intelligence and executive functions.

# MATERIALS AND METHODS

# Participants

Thirty-nine patients (24 males, 15 females) with focal frontal lesions were prospectively recruited from the National Hospital for Neurology and Neurosurgery, Queen Square, London as part of two larger studies examining cognitive functions of the frontal lobe. Patients had no psychiatric disorders, no history of alcohol or substance abuse and no previous neurological disorders. Frontal lesions were traced and classified from MRI scans (or CT scans if MRI was unavailable) by a neurologist who was blind to the study results. The aetiologies of the lesions were: glioma = 20; meningioma = 14; subarachnoid hemorrhage = 1; anterior communicating aneurysm = 3; and traumatic brain injury = 1. Importantly, we have previously shown that the grouping together of frontal patients with different aetiologies for the purposes of examining cognitive variables is methodologically justifiable (Cipolotti et al., 2015a). Sixteen patients had lesions confined to the left hemisphere, 18 patients to the right hemisphere and 5 patients had bilateral lesions. The majority of patients had lesions confined to the frontal lobes (n = 30; see Supplementary Table S1). The mean time since injury to assessment was 3.34 months (SD = 8.12 months). In addition, 46 healthy controls (HCs; 21 males, 25 females) with no history of neurological or psychiatric disorders were included for comparison. The study was approved by the National Hospital for Neurology and Neurosurgery and Institute of Neurology Joint Research Ethics Committee and written informed consent was gained according to the Declaration of Helsinki.

#### Material and Procedure

#### Baseline Neuropsychological Assessment

All patients and HCs were assessed on a series of baseline neuropsychological measures. Premorbid level of optimal functioning ('Premorbid intelligence') was estimated using the National Adult Reading Test (NART; Nelson and Willison, 1991). Naming ability was assessed using the Graded Naming Test (GNT; McKenna and Warrington, 1983) and perceptual ability was assessed using the Incomplete Letters subtest from the Visual Object and Space Perception Battery (VOSP; Warrington and James, 1991).

#### Fluid Intelligence

Fluid intelligence was assessed using Raven's Advanced Progressive Matrices (RAPM; Raven, 1976), an untimed, relatively culture-free, non-verbal test of abstract reasoning. The test requires the selection of the missing piece of a visual pattern from eight possible choices. The total number of correct responses in Set 1 (/12) was recorded and converted into age-adjusted scaled scores based on published norms.

#### Executive Functions – Verbal Fluency, Stroop Color Word Test

Two widely used neuropsychological tasks were administered to assess different aspects of executive functioning. These two tasks were chosen because they have been shown to require executive processes that are distinct from that which can be accounted for by fluid intelligence (Cipolotti et al., 2016; Cipolotti, unpublished). Verbal generation was assessed using the standard phonemic fluency test ('FAS'; Benton and Hamsher, 1976). The total number of words recalled for all three letters, excluding errors (i.e., proper nouns or repetitions), was recorded. Verbal response inhibition was assessed using the Trenerry et al. (1989) version of the Stroop Color Word test which requires participants to name the ink color of 112 color words (e.g., say 'Blue' when the word Red is written in blue) printed on one A4 sheet. The time taken to read all 112 words was recorded in seconds.

#### Recall and Recognition Memory

All patients and HCs were assessed on a verbal list-learning recall memory test ('Trieste Test'; Turner et al., 2007). Participants were asked to recall six 16-word lists that were each composed of four words from four different semantic categories (for further details on the construction of the word lists and semantic categories, see Turner et al., 2007). For each word list, words were either grouped according to their category ('Blocked') or they were mixed ('Unblocked'). These two types of lists (Blocked or Unblocked) were presented in an alternating fashion across the task (i.e., blocked, unblocked, blocked, etc.). For each 16-word list, each word was presented on a computer screen for 2 s with a 1 s interval between words. Following the list presentation, participants immediately completed a distractor task for 30 s (add 1 to a series of random numbers ranging from 1 to 99). Then, participants were asked to recall as many words as they could from the prior list ('Uncued recall'). Once this was exhausted, the four semantic category labels were provided as prompts (e.g., jewels, occupations) for further recall ('Cued recall'). The total number of words correctly recalled from each list before and after cueing was recorded, as well as separately for blocked and unblocked word lists. We also recorded the total number of errors made during recall (i.e., intrusions of words that were not presented).
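The list structure and scoring described above can be sketched in a few lines. This is only an illustration under stated assumptions: apart from the category labels 'jewels' and 'occupations', all categories and words are invented, the same 16 words are reused for every list, 'mixed' is modelled as a random shuffle, and the function names `build_list` and `score_list` are hypothetical. The actual materials are those of Turner et al. (2007).

```python
import random

# Hypothetical materials for illustration only; see Turner et al. (2007)
# for the real word lists and semantic categories.
CATEGORIES = {
    "jewels": ["ruby", "pearl", "opal", "jade"],
    "occupations": ["baker", "pilot", "nurse", "judge"],
    "fruits": ["apple", "mango", "cherry", "lemon"],
    "tools": ["hammer", "chisel", "wrench", "pliers"],
}

def build_list(categories, blocked, rng):
    """Return the 16 words grouped by category (blocked) or shuffled (unblocked)."""
    words = [w for group in categories.values() for w in group]
    if not blocked:
        rng.shuffle(words)  # assumption: 'mixed' is modelled as a random shuffle
    return words

def score_list(presented, uncued_responses, cued_responses):
    """Count correct words before cueing, correct words after cueing, and intrusions."""
    presented_set = set(presented)
    uncued_correct = set(uncued_responses) & presented_set
    total_correct = uncued_correct | (set(cued_responses) & presented_set)
    intrusions = [w for w in uncued_responses + cued_responses
                  if w not in presented_set]
    return len(uncued_correct), len(total_correct), len(intrusions)

rng = random.Random(1)
# Six lists presented in alternating order: blocked, unblocked, blocked, ...
lists = [build_list(CATEGORIES, blocked=(i % 2 == 0), rng=rng) for i in range(6)]

uncued, total, errors = score_list(
    lists[0],
    uncued_responses=["ruby", "baker", "table"],  # 'table' was never presented
    cued_responses=["pearl", "pilot"],
)
print(uncued, total, errors)
```

Keeping the uncued and post-cue totals separate mirrors the way the scores are later entered as the Cue factor in the analysis.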

A subset of frontal patients (n = 22) and HCs (n = 29) also completed the Doors and People Test battery ('D&P'; Baddeley et al., 1994) which contained two recall tasks and two recognition tasks. Administration was conducted in accordance with procedures outlined in the manual. In brief, the verbal recall task required participants to learn and recall the names of four characters and their associated occupation, while in the visual recall task, participants had to copy and recall four simple line drawings. In both the verbal and visual recall tasks, participants were given three learning and recall trials. Points were awarded for recalled information across all three learning trials and the scores for the two recall tasks were combined to create an age-adjusted recall memory scaled score ('D&P Recall'). For the recognition tasks, participants were asked to remember two sets of 12 stimuli presented for 3 s each; the targets were either male/female names in the verbal condition or photographs of different types of doors in the visual condition. Participants were then asked to recognize the target among three distractors. Points were awarded for each correctly identified target and combined to create an age-adjusted recognition memory scaled score ('D&P Recognition').

A second smaller subset of frontal patients (n = 15) also completed a 30-item three forced choice version (RMT-30) of the classic 50-item two forced choice Recognition Memory Test (Warrington, 1984). In the learning phase, participants were asked to remember 30 photographs of faces presented for 3 s each. Photographs were of unfamiliar Caucasian male faces with non-distinctive facial types. Participants were explicitly told to remember the faces and to decide whether the faces were 'pleasant' or 'unpleasant' to encourage encoding. In the recognition phase that immediately followed, target faces were presented again with two distractors each. The total number of targets correctly identified was recorded. Raw scores were converted to z-scores based on available normative data from a separate healthy control sample (see Supplementary Table S2).
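The conversion of raw RMT-30 scores to z-scores can be illustrated with a minimal sketch. The normative mean and SD used here are placeholders chosen for the example, not the values in Supplementary Table S2, and the function name `to_z_score` is hypothetical.

```python
def to_z_score(raw_score, norm_mean, norm_sd):
    """Express a raw score in SD units relative to a normative sample."""
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    return (raw_score - norm_mean) / norm_sd

# Placeholder norms for illustration only (NOT the values in Supplementary Table S2).
NORM_MEAN = 25.0
NORM_SD = 2.5

raw_scores = [27, 25, 20]  # hypothetical raw RMT-30 scores (maximum 30)
z_scores = [to_z_score(r, NORM_MEAN, NORM_SD) for r in raw_scores]
print(z_scores)
```

A z-score of 0 then corresponds to the normative mean, which is why patient performance can be tested against zero.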

#### Statistical Analyses


Statistical analyses were carried out using IBM SPSS Statistics 22<sup>1</sup>. Firstly, we investigated differences between frontal patients and HCs, and between left and right frontal patients, on demographic variables and performance on baseline neuropsychological tests, measures of fluid intelligence and executive functions using independent samples t-tests for continuous variables and chi-square tests for categorical variables. Performance differences on memory tasks between groups were examined using mixed-design repeated measures Analysis of Variance (ANOVA), except for RMT-30, where patient performance was evaluated using a one-sample t-test with a mean z-score of 0, as healthy control data were not available. An independent samples t-test was again used to compare differences between left and right frontal patients.

Secondly, we examined the relationship between recall and recognition memory performance and the different clinical and cognitive variables using two-tailed bivariate Pearson correlation analyses, for the frontal patients only.
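As an illustration of the correlation step, the sketch below computes a Pearson coefficient and the t statistic (with n − 2 degrees of freedom) on which a two-tailed test is based. The data are invented and the function names are hypothetical; the analyses reported here were run in SPSS, and the p-value lookup from the t distribution is left out.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 pairs")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical scores for five participants (illustration only).
fluid_intelligence = [1, 2, 3, 4, 5]
recall_total = [2, 4, 5, 4, 5]

r = pearson_r(fluid_intelligence, recall_total)
t = t_from_r(r, len(recall_total))
print(f"r = {r:.3f}, t({len(recall_total) - 2}) = {t:.3f}")
```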

Finally, for measures that were found to be significantly correlated with memory performance in our frontal patients, we ran a 3-stage hierarchical multiple regression to examine the independent predictive value of each variable. We chose a hierarchical approach because we were particularly interested in how executive functions predicted performance over and above any influences of general intelligence. Our previous work has shown that premorbid intelligence as measured by the NART is the best predictor of cognitive performance in frontal patients (e.g., MacPherson et al., 2017) and so this was entered in stage 1. Fluid intelligence was entered at stage 2 given that it has been argued to account for variance in executive deficits in frontal patients (e.g., Duncan et al., 2000). In stage 3, the two executive measures (Stroop Color Word test and verbal fluency) were entered together using a forced entry approach as we did not have an a priori hypothesis about the way in which each executive test might contribute to memory performance.
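The three-stage entry order can be sketched as follows. This is an illustration under explicit assumptions rather than the SPSS procedure used in the study: the data are synthetic, the variable names (`nart`, `rapm`, `fluency`, `stroop`, `recall`) are hypothetical stand-ins for the measures named above, and ordinary least squares is solved in plain Python through the normal equations. The quantity of interest at each stage is the change in R<sup>2</sup> relative to the previous stage.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def r_squared(predictors, y):
    """R^2 of an OLS fit of y on the predictor columns plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    xtx = [[sum(ci[k] * cj[k] for k in range(n)) for cj in cols] for ci in cols]
    xty = [sum(ci[k] * y[k] for k in range(n)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[k] for b, c in zip(beta, cols)) for k in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yk - fk) ** 2 for yk, fk in zip(y, fitted))
    ss_tot = sum((yk - mean_y) ** 2 for yk in y)
    return 1.0 - ss_res / ss_tot

# Synthetic data for 39 hypothetical patients (illustration only).
rng = random.Random(0)
n = 39
nart = [rng.gauss(0, 1) for _ in range(n)]
rapm = [0.5 * a + rng.gauss(0, 1) for a in nart]
fluency = [0.4 * b + rng.gauss(0, 1) for b in rapm]
stroop = [rng.gauss(0, 1) for _ in range(n)]
recall = [0.4 * a + 0.5 * b + 0.3 * c + rng.gauss(0, 1)
          for a, b, c in zip(nart, rapm, fluency)]

stages = [
    ("Stage 1: premorbid intelligence", [nart]),
    ("Stage 2: + fluid intelligence", [nart, rapm]),
    ("Stage 3: + verbal fluency and Stroop", [nart, rapm, fluency, stroop]),
]
r2_by_stage = []
previous = 0.0
for label, predictors in stages:
    r2 = r_squared(predictors, recall)
    print(f"{label}: R2 = {r2:.3f}, delta R2 = {r2 - previous:.3f}")
    r2_by_stage.append(r2)
    previous = r2
```

Because each stage nests the previous one, R<sup>2</sup> cannot decrease; whether the increment is statistically reliable is a separate F-change test that this sketch does not perform.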

For results where p-values were less than 0.05, effect size and r-squared values were reported. For results where p-values were equal to or greater than 0.05, additional Bayesian analyses were conducted where appropriate to determine the extent to which the odds were in favor of supporting the null hypothesis (Gallistel, 2009). According to Jeffreys (1961), odds less than 3 are "weak," odds between 3 and 10 are "substantial," and odds between 10 and 100 are "strong."
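The verbal labels attached to the odds can be expressed as a small lookup. The cut-offs are those quoted above; how exact boundary values (3, 10, 100) are assigned is an assumption of this sketch, as is the function name `jeffreys_label`.

```python
def jeffreys_label(odds):
    """Label odds in favor of the null hypothesis using the cut-offs quoted in the text."""
    if odds <= 0:
        raise ValueError("odds must be positive")
    if odds < 3:
        return "weak"
    if odds <= 10:        # assumption: a value of exactly 10 counts as "substantial"
        return "substantial"
    if odds <= 100:
        return "strong"
    return "beyond the quoted range"

# Odds reported for the group comparisons of age, premorbid intelligence and education.
for odds in (3.58, 1.81, 8.33):
    print(odds, jeffreys_label(odds))
```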

#### RESULTS

#### Demographic and Baseline Neuropsychological Measures

Independent samples t-tests revealed that the frontal patient and HC groups did not significantly differ in terms of age (p > 0.1, Odds = 3.58), premorbid intelligence (p = 0.077, Odds = 1.81) and years of education (p > 0.1, Odds = 8.33; see **Table 1A**). Chi-squared analysis showed no significant difference in gender (p > 0.1). Patients were significantly poorer at naming than HCs [t(83) = 3.04, p < 0.01, d = 0.65] but there was no difference in performance on the test of visuo-perception (VOSP: p > 0.1, Odds = 8.23). Left and right frontal patients were well-matched on the demographic measures (p > 0.1; see **Table 1B**). There was also no difference in performance between left and right frontal patients on naming or visual perception (p > 0.1, Odds = 4.22 and Odds = 4.12, respectively).
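For readers who wish to relate the reported t statistics to the effect sizes, Cohen's d for two independent groups can be approximated from t and the group sizes as d = t·√(1/n₁ + 1/n₂). The sketch below applies this to the naming comparison, assuming the full samples of 39 patients and 46 HCs (consistent with 83 degrees of freedom). It is not necessarily the formula used by the authors, so small discrepancies from the reported value are to be expected.

```python
import math

def cohens_d_from_t(t, n1, n2):
    """Approximate Cohen's d (pooled-SD standardizer) from an independent-samples t."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

# Naming comparison: t(83) = 3.04 with 39 patients and 46 healthy controls.
d_naming = cohens_d_from_t(3.04, 39, 46)
print(round(d_naming, 2))
```

The result is within rounding distance of the reported d = 0.65.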

#### Fluid Intelligence and Executive Functions

Compared to HCs, the frontal patients had significantly lower scores on the test of fluid intelligence [t(81) = 2.11, p = 0.038, d = 0.46]. Not unexpectedly, the frontal group also performed significantly more poorly compared to HCs on the two measures of executive function – verbal fluency [t(82) = 5.97, p < 0.001, d = 1.30] and Stroop Color Word test [t(54) = 2.68, p = 0.01, d = 0.69]. **Table 1A** shows the mean scores for each of the tests for the two groups. The difference between patients and HCs remained significant when we co-varied for fluid intelligence (Verbal fluency: p < 0.001; Stroop Color Word test: p = 0.02).

TABLE 1A | Clinical and cognitive neuropsychological data for patients and healthy controls.


TABLE 1B | Clinical and cognitive neuropsychological data for left and right hemisphere patients.


Difference between groups – <sup>∗</sup>p < 0.05, <sup>∗∗</sup>p < 0.01.

<sup>1</sup>https://www.ibm.com/products/spss-statistics


Within the frontal group, no significant difference was found between left and right frontal patients on the test of fluid intelligence (p > 0.1, Odds = 4.01). In contrast, patients with left frontal lesions were found to generate significantly fewer words on verbal fluency [t(31) = −2.18, p = 0.037, d = 0.76] and were slower on the Stroop Color Word test compared with patients with right frontal lesions [t(18) = 3.69, p = 0.002, d = 1.65]. The difference between left and right frontal patients remained significant when we co-varied for fluid intelligence (Verbal fluency: p = 0.021; Stroop Color Word test: p = 0.002). **Table 1B** shows the mean scores for each of the tests for the two groups.

#### Recall Memory

Performance on the Trieste test of verbal list-learning was examined using a mixed-design repeated measures Analysis of Variance (ANOVA) with 2 within-subjects factors of Block (Blocked, Unblocked) and Cue (Cued, Uncued) and 1 between-subjects factor of Group (Patients, HCs). There was a significant main effect of Group in which patients recalled fewer words than HCs [F(1,83) = 6.90, p = 0.01, η<sub>p</sub><sup>2</sup> = 0.08]. There were significant main effects of Block [F(1,83) = 10.63, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.67] and Cue [F(1,83) = 170.52, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.11], showing that semantically blocking the word lists during presentation and providing cues both improved recall performance. Crucially, however, neither factor interacted significantly with Group (Patients or HCs; p > 0.1). That is, frontal patients did not significantly benefit from blocking or cueing more than HCs (see **Table 2A**). There was no significant difference in the number of recall errors made between the frontal patients [M (SD) = 5.69 (3.95)] and HCs [M (SD) = 4.70 (4.25)].
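Partial eta squared can be recovered from an F ratio and its degrees of freedom as η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies this standard identity to the main effect of Group; the function name is hypothetical and the calculation is offered only as a worked check, not as the output of the original SPSS analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Main effect of Group on the Trieste test: F(1,83) = 6.90.
eta_group = partial_eta_squared(6.90, 1, 83)
print(round(eta_group, 2))
```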

TABLE 2A | Recall and Recognition memory performance for patients and healthy controls.


Difference between groups – <sup>∗</sup>p < 0.05, <sup>∗∗</sup>p < 0.01.

Recall performance on the Doors and People test was examined using a mixed-design repeated measures ANOVA with 1 within-subjects factor of domain (verbal, visual) and 1 between-subjects factor of Group (Patients, HC). Frontal patients scored significantly more poorly compared to healthy controls overall [F(1,50) = 7.71, p = 0.008, η<sub>p</sub><sup>2</sup> = 0.13]. There was no significant effect of domain (p = 0.1) and no interaction between domain and group (p > 0.1), suggesting that performance on the two recall subtests was relatively comparable.

Within the frontal group, there was no significant difference in the total words recalled on the Trieste test between patients with left and right sided lesions (p > 0.1, Odds = 1.69) and on D&P Recall (p > 0.1, Odds = 3.14; see **Table 2B**).

#### Recognition Memory

Recognition performance on the Doors and People test was examined using a mixed-design repeated measures ANOVA with 1 within-subjects factor of domain (verbal, visual) and 1 between-subjects factor of Group (Patients, HC). Frontal patients scored significantly more poorly compared to healthy controls overall [F(1,50) = 6.85, p = 0.012, η<sub>p</sub><sup>2</sup> = 0.12]. There was a significant effect of domain [F(1,50) = 17.75, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.26] which showed that the visual recognition test was significantly harder overall [M (SD) = 9.47 (0.43)] compared with the verbal recognition memory test [M (SD) = 11.71 (0.46)]. However, there was no significant interaction between domain and group (p > 0.1).

On the RMT-30, z-score performance of frontal patients was assessed using a one-sample t-test (Mean z-score = 0). Mean z-score performance of the frontal patients was statistically different from zero [t(14) = −2.32, p = 0.036, d = 0.60].
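A one-sample t-test against zero, together with the matching effect size, can be sketched as below. The z-scores are invented for illustration and the function name is hypothetical. For a one-sample design, d = |t| / √n, so the reported t(14) = −2.32 with n = 15 implies d ≈ 0.60, in agreement with the value given above.

```python
import math
import statistics

def one_sample_t(values, mu0=0.0):
    """Return (t, df, d) for a one-sample t-test of the mean against mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)            # sample SD (n - 1 in the denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    d = (mean - mu0) / sd                    # Cohen's d for a one-sample design
    return t, n - 1, d

# Hypothetical z-scores for 15 patients (illustration only, not the study data).
z_scores = [-1.5, 0.2, -0.9, -2.1, 0.4, -0.3, -1.1, 0.8,
            -0.6, -1.8, 0.1, -0.7, -1.3, 0.5, -0.4]
t, df, d = one_sample_t(z_scores)
print(f"t({df}) = {t:.2f}, d = {abs(d):.2f}")

# Effect size implied by the reported statistic t(14) = -2.32.
d_reported = abs(-2.32) / math.sqrt(15)
print(round(d_reported, 2))
```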

Within the frontal group, as with recall performance, there was no significant difference on D&P Recognition between patients with left and right sided lesions (p > 0.1, Odds = 3.20) and on RMT-30 (p > 0.1, Odds = 1.30).

## Relationship Between Memory Performance and the Clinical and Cognitive Variables

We conducted two-tailed bivariate Pearson correlation analyses to examine the relationship between recall and recognition memory performance in frontal patients and their clinical and cognitive variables. Given the lack of significant difference in performance between the left and right frontal patients on all memory measures, the two groups were combined in all correlation and regression analyses to increase power. To reduce the likelihood of false positives, only the main memory measures that were found to be meaningfully impaired compared with healthy controls were included in the analysis: the Trieste test Total Score, D&P Recall, D&P Recognition, and the RMT-30. Clinical variables included were age, years of education and premorbid intelligence as assessed by the NART. Cognitive variables included were fluid intelligence as measured by Raven's Progressive Matrices and the two executive measures of verbal fluency and the Stroop Color Word test.

TABLE 2B | Recall and recognition memory performance for left and right hemisphere patients.

Both recall memory measures were significantly correlated with premorbid intelligence (Trieste test, p = 0.001; D&P Recall, p = 0.007), fluid intelligence (Trieste test, p = 0.002; D&P Recall, p < 0.001), and verbal fluency (Trieste test, p < 0.001; D&P Recall, p = 0.026). Performance on the Trieste test (p = 0.035), but not D&P Recall (p > 0.1), was related to verbal response inhibition as assessed using the Stroop test. Performance on the two recall measures was significantly correlated (p < 0.001). Neither recall memory measure was correlated with age or years of education. The absolute Pearson's correlation coefficients between the two recall memory measures and the clinical and cognitive variables are shown in **Figure 1**.

In contrast, neither recognition memory measure (D&P Recognition or RMT-30) was correlated with premorbid intelligence (p > 0.1), fluid intelligence (p > 0.1) or either executive measure (see **Figure 2**). Neither recognition memory measure was correlated with age or years of education.

## Predictors of Recall Memory Performance

Given that the recall memory measures were significantly correlated with premorbid intelligence, fluid intelligence, and performance on the executive tasks, we examined the relative predictive value of these three variables using a 3-stage hierarchical multiple regression. Premorbid intelligence was entered at stage 1, fluid intelligence was entered at stage 2 and the two executive tasks (Stroop Color Word test and verbal fluency) were entered at stage 3.

Using performance on Trieste Total Recall as the dependent variable, the hierarchical multiple regression revealed that at stage 1, premorbid intelligence contributed significantly to the regression model [F(1,22) = 11.76, p < 0.01] and accounted for 36% of the variance in recall memory performance. Introducing fluid intelligence at stage 2 explained an additional 26% of the variance, bringing the total to 62% of the variance in recall memory performance, and this change in R<sup>2</sup> was significant [F(1,20) = 13.57, p < 0.01]. Adding the two executive tasks explained an additional 19% of the variance and this change was significant [F(2,18) = 8.82, p < 0.01]. The final model accounted for 81% of the variance in Trieste Total Recall [F(4,12) = 18.83, p < 0.01]. Premorbid intelligence (β = −0.45, p = 0.041), fluid intelligence (β = 1.24, p < 0.01), and Stroop Color Word test (β = −0.59, p < 0.01) were all significant predictors whereas verbal fluency was not (p > 0.1).

The same analysis was repeated with D&P Recall as the dependent variable. The hierarchical multiple regression revealed that at stage 1, premorbid intelligence contributed significantly to the regression model [F(1,19) = 7.73, p = 0.012] and accounted for 29% of the variance in recall memory performance. Introducing fluid intelligence at stage 2 explained an additional 20% of the variance, bringing the total to 49% of the variance in recall memory performance, and this change in R<sup>2</sup> was significant [F(1,18) = 7.09, p = 0.016]. Unlike Trieste Total Recall, adding the two executive tasks did not significantly add to the variance explained by the model for D&P Recall performance (p > 0.1). At stage 2, only fluid intelligence (β = 0.85, p = 0.016) was a significant predictor of recall performance, whereas premorbid intelligence was not (p > 0.1).
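The significance of each stage in a hierarchical regression rests on the F test for the change in R<sup>2</sup>: F<sub>change</sub> = (ΔR<sup>2</sup> / k) / ((1 − R<sup>2</sup><sub>full</sub>) / df<sub>residual</sub>), where k is the number of predictors added at that stage. The Python sketch below applies this textbook formula to the rounded R<sup>2</sup> values and residual degrees of freedom reported above. The function name `f_change` is illustrative. Because the inputs are percentages rounded to whole numbers, the recovered F values are expected to be close to, but not identical with, the reported ones.

```python
def f_change(r2_reduced: float, r2_full: float, n_added: int, df_residual: int) -> float:
    """F statistic for the R-squared increase when n_added predictors join the model."""
    delta_r2 = r2_full - r2_reduced
    return (delta_r2 / n_added) / ((1.0 - r2_full) / df_residual)


# (label, R2 before, R2 after, predictors added, residual df, reported F)
# R2 values are the rounded percentages given in the text.
stages = [
    ("Trieste, stage 2 (fluid intelligence)", 0.36, 0.62, 1, 20, 13.57),
    ("Trieste, stage 3 (executive tasks)", 0.62, 0.81, 2, 18, 8.82),
    ("D&P Recall, stage 2 (fluid intelligence)", 0.29, 0.49, 1, 18, 7.09),
]

for label, r2_before, r2_after, k, df_res, reported_f in stages:
    recovered = f_change(r2_before, r2_after, k, df_res)
    print(f"{label}: recovered F = {recovered:.2f}, reported F = {reported_f:.2f}")
```

The recovered values (approximately 13.7, 9.0, and 7.1) lie close to the reported statistics, which is what one would expect if the small discrepancies arise from rounding of R<sup>2</sup>.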

As different variables were found to be significant predictors of recall memory performance in frontal patients, we examined whether the difference in performance originally found between frontal patients and healthy controls could be accounted for by these predictors by entering them as covariates in analyses of covariance. The difference in performance between frontal patients and healthy controls on Trieste Total Recall was no longer significant once premorbid intelligence, fluid intelligence and Stroop Color Word test were entered as covariates [F(1,51) = 0.053, p = 0.819]. In contrast, the difference in performance between frontal patients and healthy controls on D&P Recall remained significant when fluid intelligence was entered as a covariate [F(1,49) = 6.38, p = 0.015].

Given that recognition memory performance was not correlated with any of the clinical or cognitive variables, multiple regression was not performed.

#### DISCUSSION

For the first time, we investigated how demographic factors of age and education, premorbid intelligence, fluid intelligence, and executive functions relate to and account for recall and recognition memory performance in frontal patients. Our frontal patients were found to be impaired on two different measures of recall memory and two different measures of recognition memory compared with healthy controls. This finding supports previous suggestions that frontal injury can result in both recall and recognition memory deficits (e.g., MacPherson et al., 2016). Crucially, however, we show that the nature of these deficits may be separable in how they relate to other clinical and cognitive factors.

For recall memory, performance in frontal patients on both recall memory measures was correlated with premorbid intelligence, fluid intelligence, and verbal fluency. Performance on the list learning task was also related to the Stroop Color Word test. Investigation into the individual contributions of premorbid intelligence, fluid intelligence and executive functions to predicting recall memory performance revealed slightly different but converging results for our two measures. For the Trieste list-learning task, all three variables were significant independent predictors of recall performance. Of the executive tasks, although both verbal fluency and Stroop Color Word were correlated with performance, only the Stroop Color Word test was a significant predictor of performance when all variables were taken into account. Of all the significant predictors, fluid intelligence was the strongest predictor of performance. For D&P Recall, fluid intelligence was the only significant predictor of recall performance. Despite D&P Recall performance being correlated with both premorbid intelligence and verbal fluency, neither variable contributed significantly over and above the variance accounted for by fluid intelligence. Overall, our findings suggest that recall memory deficits in frontal patients are best accounted for by fluid intelligence. The difference in findings between our two recall measures might reflect inherent differences in the two measures. The Trieste list learning task has 16 items per word list and one learning trial per list, whereas the D&P Recall tasks contain only 4 items and have three repeated learning trials. Thus, it may be that the Trieste test places greater demands on supervisory processes such as strategy and inhibition to encode the multiple word lists efficiently and avoid interference across lists (Baldo and Shimamura, 2002). However, the differences between the demands of the tasks warrant further study.

The finding that recall memory in frontal patients is related to fluid intelligence processes is in keeping with a specific theoretical proposal regarding the neurocognitive architecture of the frontal lobe. Fluid intelligence is taken as a measure of some general or g factor that can broadly account for performance across a range of different tasks (Duncan et al., 2000). It captures the mental processes required for breaking tasks down into subcomponents that are thought to be necessary to perform most cognitive tasks, particularly novel or complex ones. It has been argued that fluid intelligence can be mapped onto a multiple-demand (MD) network in the brain that involves predominantly frontal–parietal regions (Woolgar et al., 2010). As such, damage to frontal brain regions often results in impairment in fluid intelligence (Duncan et al., 2000). It has been shown that fluid intelligence can account for some executive deficits that result from frontal lobe injury (Roca et al., 2010). Furthering this, our data suggest that impairment in fluid intelligence following frontal lesions may also account for performance in recall memory tasks.

Recall performance in frontal patients was also correlated with premorbid intelligence as assessed by the NART but not years of education. NART was also a significant independent predictor of Trieste performance. Both NART and years of education are often thought of as comparable indicators of premorbid intelligence. However, we have recently shown that these two variables do not represent the same proxy measure, at least following frontal injury, with NART being a better predictor of executive functions (MacPherson et al., 2017). Our findings further extend the important role of premorbid intelligence as measured by the NART in protecting against the impact of frontal brain injury on memory functions.

Recall memory impairment in our frontal patients was correlated with impairment in executive processes. Consistent with Alexander et al. (2003), we found that recall, but not recognition memory was related to performance on verbal fluency. In addition, Trieste recall was also related to response inhibition as measured by the Stroop Color Word Test. As far as we know, this is the first time in which the contribution of different executive measures to recall memory in frontal patients has been examined independently. Previous work has generally combined different executive measures into a composite, thereby limiting the potential for differences between tests to be explored (e.g., Troyer et al., 1994; Crawford et al., 2000). In our study, although both verbal fluency and Stroop performance were correlated with recall, only performance on the Stroop, but not verbal fluency, was a significant predictor independent of premorbid intelligence and fluid intelligence. Our findings show that different executive functions may contribute to recall performance differently. Furthermore, our findings support the notion that some executive abilities are dissociable from fluid intelligence following frontal injury (Cipolotti et al., 2016; Cipolotti, unpublished). In future, it would be important to consider this in further detail with a wider variety of tasks tapping different known executive functions.

In contrast to recall memory, performance on recognition memory measures in our frontal patients was not significantly related to premorbid intelligence, fluid intelligence or either executive measure. Importantly, however, frontal patients were significantly impaired on the recognition memory measures, which is consistent with previous findings (Wheeler et al., 1995; MacPherson et al., 2016). The lack of relationship between recognition memory impairment and performance on other cognitive tests suggests that recognition memory performance is dissociated from premorbid intelligence, fluid intelligence and executive functions. It may be that poor performance on the recognition memory task reflects some genuine deficit in memory processes (Cipolotti et al., 2001). Alternatively, it has been shown that poor recognition performance in frontal patients may be related to a specific impairment in familiarity judgments, namely a difficulty in inhibiting 'yes' responses to similar distractors (Alexander et al., 2003; MacPherson et al., 2008).

We did not find any significant relationship between performance on any of our memory measures and patients' age. In our previous work, we have demonstrated that age predicts performance on executive tasks in frontal patients (MacPherson et al., 2017) and modulates the magnitude of their impairment, whereby middle-aged and older frontal patients had exacerbated executive impairment compared to younger adults (Cipolotti et al., 2015b). This latter effect was not found for performance on non-executive tasks that do not rely on frontal functions. The lack of relationship between age and memory performance in our current study also appears inconsistent with what is shown in the healthy and pathological aging literature (Buckner, 2004). It may be that the impact of frontal lesions decompensates for any premorbid relationship between age and memory performance (but see Cipolotti et al., 2015b).

Our study represents a first step into exploring the relationship between memory performance and fluid intelligence, executive functions and premorbid intelligence in frontal patients. Given our findings, it would be important to examine these underlying mechanisms further in a larger sample of frontal patients to allow for grouping of patients into different subregions and more detailed examination of neuropathological factors such as proportionate gray matter loss or white matter tract involvement. It has been shown that the pattern of memory impairment may vary depending on the frontal subregion injured consistent with the known specialization of function in different frontal areas (Stuss and Alexander, 2005; Turner et al., 2007). It may be that factors such as premorbid intelligence and fluid intelligence impact upon recall performance across frontal subregions whereas different executive functions have a more location-specific effect. Furthermore, our slightly different pattern of findings across our two recall memory tasks suggests a more systematic exploration of frontal memory processes is necessary to further examine the different influences of fluid intelligence and executive tasks on recall task demands.

Overall, we have shown that recall memory performance in frontal patients can largely be accounted for by fluid intelligence, executive functions and premorbid intelligence. Future studies examining memory performance in frontal patients should consider how these factors might mediate any deficits observed. Although all three variables were related to recall memory performance, general fluid intelligence appears to be the strongest predictor. This was not replicated in recognition memory performance. Our findings suggest that it may be more meaningful to assess memory functions in frontal patients using recognition memory, as recall performance is likely to be affected by non-memory-related processes.

# AUTHOR CONTRIBUTIONS

All authors were involved in the conception of the study. EC and SM were involved in the collection of the data. EC, SM, and MB were involved in the analyses of the data. EC, SM, and LC were involved in the writing and editing of the manuscript. MB and TS reviewed the manuscript.

# FUNDING

This work was supported by funding from the Wellcome Trust (Grant Nos. 066763 and 089231/A/09/Z).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00926/full#supplementary-material

#### REFERENCES

Woolgar, A., Parr, A., Cusack, R., Thompson, R., Nimmo-Smith, I., Torralva, T., et al. (2010). Fluid intelligence loss linked to restricted regions of damage within frontal and parietal cortex. Proc. Natl. Acad. Sci. U.S.A. 107, 14899–14902. doi: 10.1073/pnas.1007928107

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Chan, MacPherson, Bozzali, Shallice and Cipolotti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Commentary: Novelty seeking and reward dependence-related large-scale brain networks functional connectivity variation during salience expectancy

#### Cristiano Crescentini\*

Department of Languages and Literatures, Communication, Education and Society, University of Udine, Udine, Italy

Keywords: personality, temperament, character, executive functions, salience, brain networks

#### **A commentary on**

#### **Novelty seeking and reward dependence-related large-scale brain networks functional connectivity variation during salience expectancy**

by Li, S., Demenescu, L. R., Sweeney-Reed, C. M., Krause, A. L., Metzger, C. D., and Walter, M. (2017). Hum. Brain Mapp. 38, 4064–4077. doi: 10.1002/hbm.23648

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Viviana Betti, Dipartimento di Psicologia, Università di Sapienza di Roma, Italy

> \*Correspondence: Cristiano Crescentini cristiano.crescentini@uniud.it

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 21 December 2017 Accepted: 13 February 2018 Published: 27 February 2018

#### Citation:

Crescentini C (2018) Commentary: Novelty seeking and reward dependence-related large-scale brain networks functional connectivity variation during salience expectancy. Front. Psychol. 9:242. doi: 10.3389/fpsyg.2018.00242

In recent years, several neuropsychological and psychiatric studies have employed the psychobiological model of temperament and character (TCI; Cloninger et al., 1993) to investigate the relationship between personality and neuropsychological function in patients with Parkinson's disease (Koerts et al., 2013), Friedreich Ataxia (Sayah et al., 2017), attention-deficit/hyperactivity disorder (ADHD; Drechsler et al., 2015), eating disorders (Pignatti and Bernasconi, 2013), and antisocial behavior (Bergvall et al., 2003). These studies indicated that alterations in personality and cognition are not independent of each other, in that poor development of specific personality traits appears to be associated with deficits in neuropsychological performance, in particular in advanced cognition such as executive functions (EF: attention, working memory, planning, set-shifting, inhibition).

Recently, considerable attention has been paid to salience as a fundamental component and modulating factor of attention regulation (Uddin, 2015). It has been argued that the dorsal anterior cingulate cortex (dACC) and the anterior insula (AI) constitute a salience network (SN). The SN would be involved in the attentional control function of detecting subjectively salient events and would provide control signals to a central-executive network (CEN, including the dorsolateral prefrontal cortex and the posterior parietal cortex) to act upon in agreement with current goals (Menon, 2015; Uddin, 2015). Tellingly, a few past studies have associated salience-related internal expectancy behaviors (which employ top-down goal-directed attention regulation) with TCI novelty-seeking (NS) and harm-avoidance (HA) temperaments (Most et al., 2006; Bermpohl et al., 2008; Zhang et al., 2017).

In a recent study, Li et al. (2017) have provided crucial neuroimaging (fMRI) evidence about how functional connectivity (FC) and activation within and beyond the SN can be modulated by internal salience expectancy and temperament. Li et al. explored salience-related connectivity changes (using psychophysiological interaction analysis, PPI) during the anticipation periods involved in a salience expectancy task, in which a group of healthy adults (n = 68) had to rely on visual cues (arrows pointing up or down) in order to actively expect the high or low salience of the following pictures (positive and neutral pictures).

Furthermore, correlations between PPI maps and temperamental traits were run, focusing specifically on NS (exploratory/impulsive vs. indifferent/reflective), HA (worrying/anxious vs. relaxed/confident), and RD (reward dependence: sentimental/dependent vs. practical/independent) TCI temperaments.

Critically, concerning FC, Li et al. found that both the right dACC and the right AI (the two areas used as PPI seed regions) showed positive FC with parts of the CEN and negative FC with posterior visual areas and parts of the default mode network (DMN) as a function of high versus low salience expectancy. Moreover, with regard to the correlation between FC and personality, Li et al. (2017) found reduced FC between right AI and right middle cingulate cortex (MCC) with increasing NS, and reduced FC between right dACC and caudate with increasing RD. These correlations occurred when participants were expecting high-salience pictures as compared to low-salience pictures, suggesting that participants with high or low NS and high or low RD relied on different neurocognitive mechanisms for processing salience expectancy.

As argued by Li et al., the findings confirmed that the SN is important for internal salience evaluation and for the control of goal-directed behavior; more critically, they showed that SN activity, and its integration with other brain networks during salience processing, can be modulated by specific temperamental traits.

The study by Li et al. (2017) thus sheds further light on the relation between personality and neuropsychological function and has important clinical implications. Considering past evidence showing associations between temperament and risk of addiction (see Crescentini et al., 2015 for a brief review), the authors argued that addiction behavior may result in part from structural or functional impairments of the SN and associated affective/reward systems (e.g., MCC, caudate), which may lead to dysfunctions during salience expectancy (e.g., perceiving low-salience stimuli with higher significance) depending on individuals' personality predispositions. This possibility is in line with the conclusion of Uddin (2015) suggesting that atypical engagement of the SN by subjectively salient stimuli, together with atypical patterns of FC with other brain networks (CEN, DMN), could lead to dysfunctions of salience and attentional processing that are characteristic of many neuropsychiatric disorders, among which are schizophrenia and autism.

In their study, Li et al. used only positive and neutral pictures in the salience expectancy task, and only temperament traits were put into relation with salience processing and SN function. Hence, it would be interesting to see in future studies whether using more stressful stimuli in similar expectancy tasks (e.g., negative pictures) discloses the mediating role of other temperaments (e.g., HA). For example, past research has shown that HA affects behavior and brain functions when negative emotional stimuli are used in visual attention tasks, especially under conditions of uncertain expectations (Most et al., 2006). More critically, it would be important to extend the current findings to the other major component of personality, namely character. First of all, this will foster our knowledge of the association between character and neuropsychological functions related to salience and attention. In this regard, a recent investigation of healthy adults has shown that high Self-Directedness (SD: purposeful/responsible/reliable vs. purposeless/blaming/unreliable), a crucial aspect of character maturity in the TCI, is protective against distraction by highly salient picture stimuli during an audiovisual attentional conflict task, a finding suggestive of better goal-directed behavior in individuals with higher, vs. lower, SD (Dinica et al., 2015). Furthermore, future research on the relation between personality and neuropsychology could inform clinical applications on how best to assess and improve goal-directed behavior and high-level cognitive control functions. Indeed, this could require focusing directly on these functions but also attempting to enhance individuals' character by acting on aspects such as maturity, autonomy, perseverance, purposefulness, self-regulation, and self-acceptance (Diamond and Lee, 2011). Crucially, several clinical studies have shown that character maturity may indeed have a protective role against persistent manifestation of negative outcomes in individuals with antisocial behavior, conduct disorder, ADHD, and personality disorders (Svrakic et al., 1993; Bergvall et al., 2003; Drechsler et al., 2015; Gomez et al., 2017; Kerekes et al., 2017).

# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

#### REFERENCES


of mindfulness-oriented meditation. J. Addict. Dis. 34, 75–87. doi: 10.1080/10550887.2014.991657


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Crescentini. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Dissociable Effects of Psychopathic Traits on Executive Functioning: Insights From the Triarchic Model

Rita Pasion\*, Ana R. Cruz and Fernando Barbosa

Laboratory of Neuropsychophysiology, Faculty of Psychology and Educational Sciences, University of Porto, Porto, Portugal

The relationship between executive functioning and psychopathy lacks consistent findings. The heterogeneity of the psychopathic personality structure may contribute to the mixed data that emerged from clinical-categorical approaches. Considering the link between antisocial behavior and executive dysfunction from the perspective of the Triarchic Model of Psychopathy, it is suggested that executive impairments in psychopathy are specifically explained by meanness and disinhibition traits, reflecting externalizing vulnerability. In turn, boldness is conceptualized as an adaptive trait. The current study assessed updating (N-back), inhibition (Stroop), and shifting (Trail Making Test) in a forensic (n = 56) and non-forensic sample (n = 48) that completed the Triarchic Psychopathy Measure. A positive association between boldness and inhibition was found, while meanness accounted for the lack of inhibitory control. In addition, disinhibition explained updating dysfunction. These findings provide empirical evidence for dissociable effects of psychopathic traits on executive functioning, in light of the Triarchic Model of Psychopathy.

#### Edited by:

Bernhard Hommel, Leiden University, Netherlands

#### Reviewed by:

Boris Forthmann, Universität Münster, Germany Sarah E. MacPherson, University of Edinburgh, United Kingdom

> \*Correspondence: Rita Pasion ritapasion@gmail.com

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 14 December 2017 Accepted: 24 August 2018 Published: 12 September 2018

#### Citation:

Pasion R, Cruz AR and Barbosa F (2018) Dissociable Effects of Psychopathic Traits on Executive Functioning: Insights From the Triarchic Model. Front. Psychol. 9:1713. doi: 10.3389/fpsyg.2018.01713

Keywords: antisocial behavior, psychopathy, impulsivity, cognition, executive functioning, personality

# INTRODUCTION

The evolutionary development of the prefrontal cortex and the expression of executive functioning (EF) are distinctive aspects of the human species that gave rise to unprecedented capabilities for adaptation (Miyake and Friedman, 2012). EF is defined as a set of cognitive abilities that promote the successful engagement in independent, goal-oriented, and self-serving behavior (Lezak et al., 2004). However, rather than a unitary construct, EF is an umbrella term for several cognitive processes and subprocesses (Elliott, 2003).

The Model of Unity and Diversity (for a review see Miyake et al., 2000) was formulated to systematize the main components of EF, based on the assumption that selective deficits should not determine a general executive dysfunction. Three executive processes are proposed: shifting (the ability to shift back and forth between multiple mental sets, requiring the performance of a new operation in face of proactive interference, such as task rules constantly changing); updating (active encoding and manipulation of relevant information in the working memory); and inhibition (inhibition of automatic and dominant responses).

Basic and complex adaptive behaviors, such as inhibition and decision-making, are based on interactions between the above executive components (Miyake et al., 2000; Miller and Cohen, 2001). In turn, deficits in EF are consistently implicated in antisocial behavior (Morgan and Lilienfeld, 2000; Ogilvie et al., 2011). A robust association (d = 0.62–1.09) between antisocial behavior and executive dysfunction is well documented in the literature (Morgan and Lilienfeld, 2000; Ogilvie et al., 2011). However, the link between psychopathy and EF is weak (d = 0.29–0.42). This result seems to reflect inconsistent findings in the literature, as some studies found worse performance on EF tasks in psychopaths (e.g., Dolan and Anderson, 2002), while others did not find evidence for executive deficits (e.g., Dvorak-Bertsch et al., 2007).

Conceptual and methodological shortcomings may help to explain conflicting findings. First, poorer performance may be confined to specific executive components rather than reflecting a general EF deficit (Ogilvie et al., 2011; Bagshaw et al., 2014; Baskin-Sommers et al., 2015). Second, the association between psychopathy and antisocial behavior is not linear and is still under debate. Some authors conceptualize psychopathy as a criminogenic personality structure (Wilson and Herrnstein, 1985), explaining recidivism and violent offenses (Hemphill et al., 1998), but others argue that antisocial behavior in psychopathy may be a secondary outcome of the core personality features of psychopathy that are moderated by protective and risk factors (e.g., Cooke and Michie, 2001; Gao and Raine, 2010). In this sense, antisocial behavior may or may not co-occur with the core features of the psychopathic personality (Cooke and Michie, 2001; Gao and Raine, 2010), such as shallow affect, superficial charm, manipulativeness, and lack of empathy (interpersonal-affective features; Hare, 1991). Executive dysfunction may constitute one of the risk factors for getting apprehended (Ishikawa et al., 2001), while intact EF may be a protective factor associated with the positive adjustment traits of psychopathy (Patrick et al., 2009). In fact, Cleckley (1976) observed the occurrence of this personality structure in subclinical groups. The so-called successful psychopaths are capable of following social rules and have no apparent criminal record (Glenn et al., 2011). Finally, the use of cut-off scores to analyze psychopathy as a homogeneous and taxonomic group may be masking the differential associations between specific psychopathic traits and executive deficits (Ogilvie et al., 2011).

Given the limitations outlined above, dimensional models of psychopathy may be an informative avenue for unveiling the main differential associations, as proposed here and recently evidenced in a systematic review conducted by Maes and Brazil (2013). Regarding the inhibition component of EF, three studies found better executive performance associated with adaptive psychopathic traits indexing fearlessness features, such as social efficacy and stress immunity (Sadeh and Verona, 2008; Carlson and Thái, 2010; Feilhauer et al., 2012). In turn, two studies reported impaired inhibition in impulsive-antisocial dimensions that are associated with disruptive and maladaptive behavior (Sellbom and Verona, 2007; Feilhauer et al., 2012). For shifting, the disposition toward low fear resulted in improved performance (Sellbom and Verona, 2007), although the affective dimension of psychopathy (i.e., coldheartedness, meanness, callous-unemotional traits) was related to worse shifting performance (Mahmut et al., 2008). Similarly, a positive association was observed between better performance on updating tasks and higher scores on fearlessness dimensions of psychopathy (Hansen et al., 2007; Sellbom and Verona, 2007). Altogether, previous findings provide some support for a link between adaptive psychopathic personality traits indexing fearlessness features and better EF, whereas antisocial-impulsive dimensions of psychopathy seem to be associated with impaired inhibitory control.

Despite the above-mentioned findings, Maes and Brazil (2013) called for more empirical research to strengthen the evidence on the connections between specific psychopathy phenotypes and EF components. Specifically, Baskin-Sommers et al. (2015) acknowledged that measures designed from normal personality models may capture the adaptive features of psychopathy to a greater extent than those designed from clinical observations. In this sense, the Triarchic Model of Psychopathy (Patrick et al., 2009) may add value to the existing body of literature. This model was designed to integrate the distinct conceptualizations of psychopathy (Patrick et al., 2009), while capturing the positive core features that were described by Cleckley (1976) but excluded from the criminal conceptualizations. Three psychopathic phenotypes are presented in the model: (a) meanness, which comprises lack of empathy, callousness, emotional detachment, active exploitativeness, excitement seeking, rebelliousness, instrumental or predatory aggression, abuse of others, and empowerment through cruelty; (b) disinhibition, characterized by a propensity toward problems of impulse control, dysregulated negative affect, deficits in foresight, impatient urgency, non-planfulness, low frustration tolerance, reactive aggression, irresponsibility, and vulnerability to substance abuse; and (c) boldness, characterized by optimism, resilience to stress, courage, social dominance, persuasiveness, tolerance for uncertainty, self-confidence, social assurance, and intrepidness (Patrick et al., 2009; Venables et al., 2014; Patrick, 2010, Unpublished). The Triarchic Model asserts that the main psychopathic components are distributed along a continuum, which allows the distinct expressions of psychopathy to be measured in community samples while accounting for both affective-interpersonal and behavioral-impulsive expressions of psychopathy (Patrick et al., 2009). These heterogeneous manifestations of psychopathy are explained by distinct etiological pathways: externalizing vulnerability underlies the meanness and disinhibition phenotypes, while low fear leads to meanness and boldness (Patrick and Bernat, 2009).

The dissociation of etiological paths highlights the importance of refining the psychobiological and behavioral correlates of each phenotype and has implications for assessing executive deficits in psychopathy. Importantly, extreme expressions of meanness and disinhibition are systematically included in the prototypical conceptualizations of criminal psychopathy (Patrick and Drislane, 2015). Disinhibition reflects an externalizing component of psychopathy, related to deviant behaviors in child and adult populations (Patrick et al., 2009). Empirical data support the link between externalizing vulnerability and antisocial behavior (Patrick et al., 2005; Gao and Raine, 2010; Kennealy et al., 2010), and it was recently argued that executive dysfunction is the main path to explain antisocial behavior (Morgan and Lilienfeld, 2000; Ogilvie et al., 2011; De Brito et al., 2013). Findings suggest that antisocial individuals show executive impairments regardless of psychopathic features (De Brito et al., 2013). In light of this, disinhibition is the main predictor of impulsive-reactive forms of aggression, and may reflect abnormal prefrontal functioning, which is associated with impaired EF (Patrick et al., 2009). Krueger et al. (2002) pointed out that the high comorbidity between different manifestations of externalizing disorders (e.g., Conduct Disorder, Antisocial Personality Disorder, and Substance Dependence) implies shared causal processes. Executive dysfunction may constitute precisely one point of intersection across the Externalizing Spectrum. In contrast, boldness may explain adaptive processes (Patrick et al., 2009) and may be dissociated from executive deficits. In particular, boldness is considered a critical trait to differentiate the positive adjustment features of psychopathy from antisocial behavior (Wall et al., 2015). Boldness seems to constitute an adaptive feature in daily life, given the ability of individuals who score high on boldness to remain calm and concentrated in stressful situations, and to quickly recover from such events (Patrick et al., 2005, 2009; Kennealy et al., 2010; Stanley et al., 2013; Venables et al., 2014; Patrick and Drislane, 2015; Pasion et al., 2016, 2017). Moreover, boldness entails high self-assurance and social efficacy, tolerance for unfamiliarity and danger, social dominance, thrill seeking without anticipatory fear, and emotional resiliency.

Despite the accumulated data, to our knowledge the Triarchic Model of Psychopathy remains untested in studies assessing EF. Recently, Patrick and Drislane (2015) argued that a comprehensive validation of the Triarchic Model requires linkages between psychopathic phenotypes and measures of brain and behavior. The current study aims to provide direct evidence for the dissociable effects of the psychopathy phenotypes proposed by the Triarchic Model in predicting the inhibition, shifting, and updating components of EF. Dissociable effects on EF among the phenotypic expressions of psychopathy would help to validate the Triarchic Model as a promising avenue for exploring and refining the neurobehavioral correlates of specific psychopathic manifestations. The use of an assessment model based on normal-range continua of personality traits will further make it possible to test whether executive deficits can be detected at low and moderate levels of the spectra, namely in subclinical samples (Maes and Brazil, 2013).

Considering the externalizing-antisocial behavior link, it is hypothesized that externalizing vulnerability (meanness and disinhibition phenotypes) accounts for an executive dysfunction characterized by low inhibition. Boldness, as a positive adjustment phenotype, is expected to be associated with intact or even improved EF in all its components.

# MATERIALS AND METHODS

#### Sample

This study included 104 male participants, aged between 18 and 60, recruited from forensic (n = 56, four left-handed) and community settings (n = 48, three left-handed). **Table 1** summarizes sociodemographic data.

The forensic group comprised individuals currently convicted for one or more crimes. The recidivism rate was 28.6% and the criminal charges represented the expected criminal versatility (kidnapping, homicide, domestic violence, fraud, theft, and drug dealing) (Benson and Moore, 1992; Piquero et al., 2007; Piquero, 2008; Gavin and Hockey, 2010). The community group (nonforensic) did not report current or previous criminal activities.

Participants were excluded based on the following criteria: foreign nationality, illiteracy, age above 60, diagnosis of psychopathology, neuropathology, cognitive impairment, sensory or motor deficits. Criminal records were reviewed for this purpose and a semi-structured interview was designed to collect more information on exclusion criteria. Cognitive impairment was further screened using the Montreal Cognitive Assessment (MoCA; Nasreddine et al., 2005, adapted by Simões et al., 2008).

#### Materials and Stimuli

#### Psychopathy Measures

The TriPM (Patrick, 2010, Unpublished; adapted by Vieira et al., 2014) is a self-report scale with 58 items measuring three psychopathy phenotypes: Boldness, Meanness, and Disinhibition. Responses are provided on a 4-point Likert scale (0 – false; 1 – somewhat false; 2 – somewhat true; 3 – true). The boldness subscale (α = 0.853) indexes adaptive features of psychopathy, such as low anxiousness, venturesomeness, and social dominance. The disinhibition subscale (α = 0.844) entails the purest externalizing factors, namely hostility, irresponsibility, and impulsiveness. The meanness subscale (α = 0.868) comprises lack of empathy and close attachment, rebelliousness, excitement seeking, callousness, and cruelty. Meanness correlated moderately with disinhibition (r = 0.619, p < 0.001) and weakly with boldness (r = 0.291, p = 0.003). A non-significant correlation was found between boldness and disinhibition (r = 0.148, p = 0.135), providing support for the etiological differentiation underlying the distinct phenotypic expressions of psychopathy (**Supplementary Tables S1**, **S2** present the associations between phenotypic components for each group). The total psychopathy score was obtained from the sum of the three subscales.

TABLE 1 | Means (standard deviations) of socio-demographic variables, TriPM scores, and executive performance for the non-forensic and forensic groups.


All the fields were required to complete and there were no missing values.

#### Executive Functioning Measures

fpsyg-09-01713 September 10, 2018 Time: 18:49 # 4

#### **Inhibition**

Variants of the original Stroop Color-Word Test suggest that 55% of the variance in task performance is explained by a factor related to the suppression of automatic responses (Miyake et al., 2000; Friedman and Miyake, 2004; Hull et al., 2008). The Stroop (paper-and-pencil version; Stroop, 1935; Portuguese version by Fernandes, 2013) presents a word condition (W; where the participant is asked to read words – red, green and blue – printed in black), a color condition (C; where the participant is asked to name the colors of 'Xs' printed in blue, red, and green), and a color/word condition (CW; where the participant is asked to name colors that do not match the written word). Each condition contains 100 stimuli, horizontally distributed in five columns, to be read within 45 s. The interference ratio (Fernandes, 2013), a widely used measure that removes the confounding factor of processing speed when assessing inhibition, was calculated by the following formula:

$$\text{CW}' = \frac{\text{W} \times \text{C}}{\text{W} + \text{C}}$$
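As a minimal sketch of the formula above, the snippet below computes the predicted color-word score CW′ from the word (W) and color (C) scores. The final step, subtracting CW′ from the observed CW score, follows a common scoring convention and is an assumption here, because the text gives only the CW′ term and does not state the sign convention. The function names and the sample scores are hypothetical.

```python
def predicted_cw(w: float, c: float) -> float:
    """Predicted color-word score CW' = (W * C) / (W + C), as in the formula above."""
    if w + c == 0:
        raise ValueError("W + C must be non-zero")
    return (w * c) / (w + c)


def interference(w: float, c: float, cw: float) -> float:
    """Assumed convention (not stated in the text): observed CW minus predicted CW'."""
    return cw - predicted_cw(w, c)


# Hypothetical scores: number of items completed in 45 s per condition.
w, c, cw = 100, 60, 40
print(predicted_cw(w, c))      # 37.5
print(interference(w, c, cw))  # 2.5
```

Because CW′ is derived from the participant's own W and C scores, the comparison between observed and predicted CW adjusts for individual differences in reading and naming speed.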

#### **Updating**

Updating explains 46% of the performance variance in the N-Back (Friedman et al., 2008). For this reason, a computer-based spatial 2-Back task (adapted from Kirchner, 1958) was used to assess updating. In the 2-Back task, participants were asked to signal whenever a white square (1000 ms, 2.5 s inter-stimulus interval) in a 3 × 3 matrix of squares was displayed in the same position as two trials before (25 targets and 71 non-targets). Four measures were obtained from this task: hits (the signal is present and the participant signals it), misses (the signal is present, but the participant does not signal it), false alarms (the participant indicates that the signal is present when it is not), and correct omissions (the participant correctly indicates that there is no signal, by not responding). Sensitivity to the signal was calculated from this task as:

$$d' = \Phi^{-1}\left(\frac{\text{hits}}{\text{hits} + \text{misses}}\right) - \Phi^{-1}\left(\frac{\text{false alarms}}{\text{false alarms} + \text{correct omissions}}\right)$$

The Φ<sup>−1</sup> function returns the inverse of the standard normal cumulative distribution, that is, of a distribution with mean 0 and standard deviation 1 (NORMSINV in Excel); it transforms the Hit Rate (signal is present) and the False Alarm Rate (signal is absent) into z-scores. Perfect scores were adjusted by the following formulas (Macmillan and Creelman, 1991), where "n" refers to the number of hits or false alarms:

$$\text{Hits} = \frac{1}{2n_{\text{hits}}}$$

$$\text{False Alarms} = \frac{1}{2n_{\text{false alarms}}}$$
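The sensitivity computation can be sketched as follows, using `statistics.NormalDist().inv_cdf` from the Python standard library in place of Excel's NORMSINV. The handling of extreme rates is an assumption: a rate of 0 is replaced by 1/(2N) and a rate of 1 by 1 − 1/(2N), where N is the number of target or non-target trials, which is one common reading of the Macmillan and Creelman adjustment and may differ in detail from what the authors applied. The function names are hypothetical.

```python
from statistics import NormalDist


def _adjust(rate: float, n_trials: int) -> float:
    """Assumed 1/(2N) adjustment: keep the rate strictly between 0 and 1."""
    lower = 1 / (2 * n_trials)
    return min(max(rate, lower), 1 - lower)


def d_prime(hits: int, misses: int, false_alarms: int, correct_omissions: int) -> float:
    """Sensitivity d' = z(hit rate) - z(false-alarm rate)."""
    n_signal = hits + misses                    # target trials
    n_noise = false_alarms + correct_omissions  # non-target trials
    if n_signal == 0 or n_noise == 0:
        raise ValueError("need at least one target and one non-target trial")
    hit_rate = _adjust(hits / n_signal, n_signal)
    fa_rate = _adjust(false_alarms / n_noise, n_noise)
    z = NormalDist().inv_cdf  # inverse standard normal CDF (mean 0, sd 1)
    return z(hit_rate) - z(fa_rate)


# Hypothetical participant: 20 of 25 targets detected, 5 false alarms on 71 non-targets.
print(round(d_prime(20, 5, 5, 66), 2))
```

Without the adjustment, a participant with no misses or no false alarms would yield an infinite z-score, so d′ could not be computed for perfect performers.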

Two participants from the non-forensic group and three from the forensic group abandoned the task, so their results were not considered.

#### **Shifting**

The paper-and-pencil version of the Trail Making Test (TMT; Reitan and Wolfson, 1985) was administered, considering that 87% of the performance in this task is explained by the latent factor of shifting (Rose et al., 2011). The TMT requires participants to connect, in ascending order, a sequence of numbers in circles (Part A: from 1 to 24), and an alternating sequence of numbers and letters (Part B: from A to L and 1 to 13). Errors were immediately corrected in both conditions. The task was discontinued after 200 s in Part A and 400 s in Part B, or after four accumulated errors in each part, except when three or fewer circles remained to be connected. This rule led to the exclusion of 11 participants from the forensic group and three non-forensic participants. The shifting measure was calculated by subtracting the execution time of Part A from that of Part B.

#### Procedure

The forensic group was recruited from three maximum security prisons. Full access to criminal records was given to assess exclusion criteria. Offenders were selected based on the available information from the files and in collaboration with case workers.

The non-forensic group was recruited by e-mail advertisement using the mailing list of the university campus. In this advertisement, participants were asked to complete an online version of the TriPM (n = 1072). Individuals for the non-forensic group were then selected according to their TriPM scores in order to match the psychopathy scores of the forensic group. A total of 86 participants were invited for the neuropsychological assessment (acceptance rate = 55.8%).

Individual interviews were conducted to complete demographic data and further screen for exclusion criteria. The neuropsychological tasks were individually administered by two trained psychologists and the order of these tasks was randomized across participants.

Informed consent was obtained from all participants. Ethical principles and code of conduct were strictly followed.

# RESULTS


#### Preliminary Analysis

There were significant differences between groups regarding age, t(102) = 3.32, p = 0.001, and years of formal education, t(102) = 8.25, p < 0.001 (**Table 1**). None of the participants reported substance use at the time of the study. However, 27 inmates reported past substance abuse.

The groups were matched for the total psychopathy score (t < 1), to ensure that significant differences in phenotypic components were not explained by variation in the total psychopathy score. The total scores ranged from 26 to 102 in the forensic group, and from 30 to 116 in the non-forensic group.

Despite similar values in total psychopathy score, group differences (independent variable) emerged when analyzing phenotypic components in a multivariate model (MANOVA) with Bonferroni correction, T<sup>2</sup> = 0.324, F(3,100) = 10.8, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.245. Boldness scores were higher in the non-forensic group, F(1,102) = 5.74, p = 0.018, η<sub>p</sub><sup>2</sup> = 0.053, while disinhibition scores were higher in the forensic group, F(1,102) = 11.4, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.100 (**Table 1**). The differences between groups regarding recidivism (independent variable), T<sup>2</sup> = 0.324, F(3,39) = 4.44, p = 0.009, η<sub>p</sub><sup>2</sup> = 0.254, showed that meanness, F(1,41) = 13.94, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.254, and disinhibition, F(1,41) = 5.07, p = 0.030, η<sub>p</sub><sup>2</sup> = 0.110, were higher in the recidivist group than in the non-recidivist group. The group reporting past substance use (independent variable), T<sup>2</sup> = 0.324, F(3,93) = 8.74, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.220, showed a higher level of disinhibition, F(1,95) = 22.7, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.193.
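The partial eta squared values reported above can be related to their F statistics through the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The short sketch below applies this identity to two of the reported tests; the function name is hypothetical, and the identity is offered as a consistency check rather than as the procedure the authors used.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Multivariate group effect: F(3,100) = 10.8
print(round(partial_eta_squared(10.8, 3, 100), 3))  # 0.245
# Boldness, univariate group effect: F(1,102) = 5.74
print(round(partial_eta_squared(5.74, 1, 102), 3))  # 0.053
```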

Regarding executive functions, updating, t(98) = 2.71, p = 0.007, d = 0.56, and shifting components, t(88) = 2.60, p = 0.011, d = 0.55, were lower in the forensic group, compared to the non-forensic group (cf. **Table 1**). This effect was not confirmed for inhibition, t(102) = 1.72, p = 0.089, d = 0.33.

#### EF and Psychopathic Traits

Hierarchical Linear Regression models were used to examine the variance of the executive components explained by psychopathy phenotypes. The models were run independently, following the rationale of Miyake et al. (2000) and given the absence of significant zero-order correlations across tasks (**Supplementary Table S3**).

The dissociable effects of the phenotypic expressions of psychopathy (boldness, meanness, and disinhibition) on EF components (inhibition, updating, and shifting) were analyzed. Group differences and non-matched variables systematically identified in the literature as predictors of EF were introduced in the model to control for moderation effects if they displayed significant correlations with executive performance (**Supplementary Table S1**). Accordingly, age and years of education were entered in the updating and shifting models, respectively. No evidence of multicollinearity was found in the regression models.

#### Inhibition

Meanness, β = 0.318, t = 2.51, p = 0.014, was the main predictor of inhibition. An acceptable power was achieved (85%), despite the non-significance of the regression model, F(3,103) = 2.51, Adj R<sup>2</sup> = 0.042, p = 0.063, and a small effect size (R<sup>2</sup> = 0.070). The inclusion of the group moderation effects in the model led to a non-significant increase in R<sup>2</sup> of 0.023 (p = 0.118), but the moderation model reached significance, F(4,103) = 2.53, Adj R<sup>2</sup> = 0.056, p = 0.045. The power increased to 94%. Nevertheless, group was a non-significant predictor of inhibition, β = 0.174, t = 1.58, p = 0.118. Interestingly, boldness emerged as a significant predictor, β = −0.210, t = 2.04, p = 0.044, in the opposite direction to meanness, β = 0.260, t = 1.98, p = 0.051 (**Table 2**).
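The fit statistics of the first inhibition model can be cross-checked from R<sup>2</sup> alone using the textbook expressions for the adjusted R<sup>2</sup> and the overall F test of a linear regression. The sketch below assumes n = 104 (the full sample, since no exclusions are reported for the Stroop) and k = 3 predictors; the function names are hypothetical.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2: float, n: int, k: int) -> float:
    """Overall F statistic of the regression, with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Inhibition model with the three phenotypes: R^2 = 0.070, assumed n = 104, k = 3.
print(round(adjusted_r2(0.070, 104, 3), 3))  # 0.042
print(round(overall_f(0.070, 104, 3), 2))    # 2.51
```

Both values agree with the reported Adj R<sup>2</sup> = 0.042 and F = 2.51, which supports the assumed sample size for this model.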

#### Updating

Disinhibition was the main predictor of updating, F(3,98) = 3.66, Adj R<sup>2</sup> = 0.075, p = 0.015, β = −0.321, t = 2.59, p = 0.011. Boldness showed a trend toward the opposite pattern, although the effect did not achieve statistical significance, β = 0.198, t = 1.95, p = 0.055. Adding the moderation effects of age, β = −0.271, t = 2.42, p = 0.017, and group, β = 0.069, t = 0.606, p = 0.546, F(5,98) = 3.90, Adj R<sup>2</sup> = 0.129, p = 0.003, significantly increased the R<sup>2</sup>, p = 0.023, to a medium effect size (R<sup>2</sup> = 0.173). Disinhibition remained a significant predictor of updating, β = −0.272, t = 2.01, p = 0.047, but boldness did not achieve significance in the moderation model, β = 0.129, t = 1.26, p = 0.211. The observed power was high in both models (94 and 99%, respectively) (**Table 3**).

#### Shifting

The model including the phenotypic expressions of psychopathy did not reach significance and did not yield significant predictions of shifting (**Table 4**). The post hoc power was low (31%). The moderation analysis led to a significant increase in R<sup>2</sup> of 0.120, p = 0.004, and in power (97%). Years of education was the main moderator accounting for the significance and medium effect size (R<sup>2</sup> = 0.136) of the model, F(5,89) = 2.64, Adj R<sup>2</sup> = 0.084, p = 0.029, and explained improved shifting, β = −0.334, t = 2.46, p = 0.016 (**Table 4**).

# DISCUSSION

The link between antisocial behavior and executive dysfunction is well-established (Morgan and Lilienfeld, 2000; Ogilvie et al., 2011), but when analyses are redirected to psychopathy, empirical findings become less robust (Ogilvie et al., 2011). This result may seem paradoxical, considering the close association between psychopathy and antisocial behavior systematically described in the literature. The current study aimed to clarify the inconsistent findings by dissociating psychopathy phenotypes and executive components. The analysis of specific profiles of psychopathy and separate components of executive functioning may allow a more precise evaluation of its complex relationships.

In a global analysis, our results are consistent with previous findings. Poor performance in updating and shifting was observed in the forensic group, with medium effect sizes. For inhibition, the effect approached significance and was small in magnitude, probably due to the small sample size. In addition, differences were found between phenotypic components, confirming that the psychopathic personality structure has heterogeneous features and demands a dimensional approach. Higher traits of disinhibition were observed in the forensic group, while the non-forensic group exhibited higher traits of boldness. Furthermore, meanness and disinhibition explained recidivism and disinhibition accounted for past substance abuse. Such findings reinforce the thesis that the etiological pathway of externalizing vulnerability is linked to disruptive and antisocial behavior, while boldness remains unrelated to these phenomena (Patrick et al., 2009). This is consistent with the assumption that psychopathy per se may not be a risk factor for criminal behavior (Ishikawa et al., 2001). This risk may be moderated by adaptive psychopathy traits, and mediated by other risk factors, such as impaired EF (Cooke and Michie, 2001; Ishikawa et al., 2001; Patrick et al., 2009).

TABLE 2 | Regression model for inhibition (Stroop task).

TABLE 3 | Regression model for updating (N-back).

TABLE 4 | Regression model for shifting (TMT).

The maladaptive features of psychopathy, mainly related to disinhibition and meanness (externalizing vulnerability), were expected to be associated with the lack of inhibitory control. In turn, we hypothesized that boldness, as an adaptive phenotype, would be associated with intact or even improved EF in all its components.

Our results revealed that the psychopathic traits were the main predictors of the inhibition component of EF. Group was a non-significant predictor of inhibition, and the increase in R<sup>2</sup> was non-significant and small in magnitude, but including group in the model revealed a significant dissociation between boldness and meanness traits. Providing support for our hypothesis, boldness was associated with an enhanced ability to inhibit automatic responses, which is in line with previous studies (Sadeh and Verona, 2008; Carlson and Thái, 2010; Feilhauer et al., 2012) and the adaptive role of boldness (Patrick et al., 2009). Conversely, meanness was related to high interference scores, suggesting that this phenotype predicts a tendency toward immediate and impulsive behavior. This result is aligned with previous findings on impulsive-antisocial dimensions of psychopathy (Sellbom and Verona, 2007; Feilhauer et al., 2012). The inhibitory deficit in meanness may also help to explain the previously reported association between the affective facet of psychopathy and aggressive-violent behavior (Baskin-Sommers et al., 2015). Meanness might be conceived as a maladaptive phenotypic expression of a fearless temperament associated with life-course antisocial trajectories (Polaschek and Daly, 2013), characterized by greater impulsiveness and aggressiveness. Unexpectedly, higher traits of disinhibition did not predict lower inhibitory control. The lack of a significant association between disinhibition and inhibition does not seem to be explained by inadequate statistical power, which was high in both models (85–93%). Considering that differences in Stroop interference between forensic and non-forensic participants were not significant, it would be necessary to use other tasks to better detect differences in this executive function and to test whether disinhibition is the best phenotypic candidate to capture poor inhibitory control.

Interestingly, disinhibition predicted reduced sensitivity to the signal during the N-Back. This suggests that updating abilities may play an important role in the executive deficits of individuals characterized by high traits of disinhibition. Updating requires the capacity to dynamically monitor, control, replace, and manipulate information in working memory. If the updating component is affected in disinhibition, the ability to acquire new and relevant information may be compromised, limiting learning from past experiences. To date, studies had only evidenced a positive association between fearlessness-related traits of psychopathy and updating (Hansen et al., 2007; Sellbom and Verona, 2007). Our study adds evidence of the opposite pattern for the disinhibition phenotype. Previous studies on P3, a neurophysiological correlate of updating, reported blunted P3 amplitude in individuals scoring high on impulsive and antisocial traits of psychopathy (Carlson et al., 2009; Carlson and Thái, 2010; Pasion et al., 2017). Our results seem to capture this effect at a behavioral level. In line with the above-mentioned studies (Hansen et al., 2007; Sellbom and Verona, 2007), boldness showed a trend toward predicting better updating. Likewise, P3 amplitude has been found to be increased in fearlessness-related traits, such as boldness (Pasion et al., 2017). The trend for a dissociable effect with disinhibition strengthens the assumption that boldness is an adaptive phenotype, explaining an improved ability to encode and manipulate relevant information in working memory. Nevertheless, we did not confirm a significant effect of boldness when the moderators were entered in the model, and this was not explained by insufficient observed power (99%). A significant negative association was evidenced between age and updating, and an increase to a medium effect size was found in the model in which disinhibition, but not boldness, remained a significant predictor of impaired updating. Group was a non-significant moderator.

The performance on shifting was not explained by the distinct phenotypic expressions of psychopathy. This model did not yield significant predictions, the effects were negligible, and the achieved power was poor (31%). In turn, the model accounting for moderators produced a significant increase in R<sup>2</sup>, to a medium effect size, and raised power to 97%. Group was a non-significant moderator, and years of education emerged as the only predictor of enhanced shifting abilities. The lack of consistent and significant associations between psychopathic traits and shifting was reported in previous studies (Sellbom and Verona, 2007; Mahmut et al., 2008; Racer et al., 2011; Dolan, 2012), indicating that shifting is not a core deficit when explaining risk factors for antisocial behavior.

Taken together, the current study highlights that psychopathic personality traits explain EF to a greater extent than incarceration does. The documented group differences in EF were suppressed when psychopathic traits were introduced in the regression models. Moreover, our study found evidence for the etiological dissociation proposed by Patrick et al. (2009). Meanness and disinhibition were associated with maladaptive behavior (recidivism and past substance abuse), and traits of the disinhibition phenotype were higher in the forensic group. Disinhibition and meanness were found to be moderately correlated and are systematically referred to in the prototypical conceptualization of criminal psychopaths (Patrick et al., 2009). The externalizing manifestation of psychopathy (high disinhibition in combination with high meanness) was also associated with poor EF. Meanness and disinhibition negatively predicted performance on inhibition and updating tasks, respectively. Deficits in EF may explain the higher risk for persistent rule breaking frequently observed in antisocial behavior (Baskin-Sommers et al., 2015). Executive impairments (Ishikawa et al., 2001), as well as reductions of prefrontal gray matter (Yang et al., 2005), were previously observed in criminals showing high scores on the impulsive-antisocial factor of psychopathy.

In turn, a disposition toward low fear may constitute a key component of adaptive social functioning. The boldness-fearlessness expression of psychopathy (i.e., high boldness scores, in combination with low to moderate meanness) seems to be demarcated from criminal correlates and executive impairment. Accordingly, we found higher boldness in the community sample, and these traits accounted for enhanced inhibitory control. Better inhibitory control may prevent disruptive, aggressive, and violent behavior from occurring in individuals with high boldness traits. The ability to remain calm in stressful and unfamiliar contexts may put these individuals in a favorable position to reach high performance. Consistent with this picture, improved EF (Ishikawa et al., 2001), as well as intact gray matter (Yang et al., 2005), were observed in successful (non-criminal) psychopaths who scored lower on the impulsive-antisocial factor (Ishikawa et al., 2001) and higher on the interpersonal-affective factor (DeMatteo et al., 2006).

It is important to acknowledge that meanness shares the etiological pathway of low fear with boldness. Nevertheless, the operationalization of meanness in the triarchic psychopathy measure addresses secondary features of externalization designed from the same self-report that operationalizes disinhibition (Patrick, 2010, Unpublished). The behavioral manifestations of meanness comprise arrogance, defiance of authority, a lack of emotional attachment, aggressive competitiveness, physical cruelty and exploitation toward others, premeditated aggression, and excitement seeking through destructiveness (Patrick et al., 2009). Therefore, the link between meanness and externalizing vulnerability seems to be stronger than the one with low fear. Supporting that meanness and disinhibition are close phenotypes, a weak correlation between boldness and meanness was found in our study, in contrast with the moderate association between disinhibition and meanness.

The dissociation of psychopathy traits highlights that dimensional models are promising for clarifying conflicting results from studies assessing EF as a single construct and psychopathy as a homogeneous construct. Studies examining EF deficits in psychopathy have remained focused on the taxonomic differences between psychopaths and non-psychopaths. Of 191 studies investigating the EF components of the Miyake et al. (2000) model in psychopathic personality, only 11 analyzed the distinctive traits of psychopathy (Maes and Brazil, 2013). Moving toward dimensional models will make it possible to analyze psychopathy in terms of its distinct phenotypes. This may help to shed light on differential relations with brain and behavior that are not evidenced when using total scores.

Although common to most studies in this field, the main methodological limitations of this study should be outlined. First, psychopathy scores were based on self-report measures, which raises questions of social desirability bias, particularly in individuals who may be highly manipulative. Second, as we did not cross-check information with judicial authorities, possible criminal acts in the community sample may have gone unnoticed. Third, the samples were matched according to the total psychopathy scores. While this procedure allows one to explore the distinct prevalence of phenotypic expressions of psychopathy among samples that do not differ in total scores, it may also force the selection of non-forensic participants who have high psychopathy scores, and who score highly on certain phenotypes that are not typical of non-forensic participants more generally. Finally, the cross-sectional nature of this study limits the inference of causal relationships between psychopathy phenotypes and EF. Regardless of the above limitations, our results suggest that inconsistent findings regarding EF in psychopathy may be explained, at least partly, by a variable representation of disinhibition, meanness, and boldness phenotypes across samples. Therefore, disentangling the psychopathic personality structure into its distinctive phenotypes may favor an accurate analysis of their specific correlates.

Future research should extend the main findings and accumulate knowledge to establish a robust association between triarchic dimensions of psychopathy and EF, both in forensic and community samples, controlling not only for the criminal trajectory but also related variables such as socioeconomic status and incarceration effects.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Declaration of Helsinki, World Medical Association (WMA), and the European Code of Conduct for Research Integrity, All European Academies (ALLEA). The protocol was approved by the Scientific Board of Psychology Doctoral Program. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

#### AUTHOR CONTRIBUTIONS

RP, AC, and FB conceptualized the study. RP and AC collected the data. RP analyzed the data and prepared the paper. AC and FB reviewed the paper. FB supervised the study.

# FUNDING

This research was supported by grant SFRH/BD/76062/2011 from Science and Technology Foundation awarded to AC.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01713/full#supplementary-material

#### REFERENCES

contributions to complex "frontal lobe" tasks: a latent variable analysis. Cognit. Psychol. 41, 49–100. doi: 10.1006/cogp.1999.0734



Yang, Y., Raine, A., Lencz, T., Bihrle, S., LaCasse, L., and Colletti, P. (2005). Volume reduction in prefrontal gray matter in unsuccessful criminal psychopaths. Biol. Psychiatry 57, 1103–1108. doi: 10.1016/j.biopsych.2005.01.021

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Pasion, Cruz and Barbosa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Individual Differences in Reward Sensitivity Modulate the Distinctive Effects of Conscious and Unconscious Rewards on Executive Performance

Rémi L. Capa<sup>1,2</sup>\* and Cédric A. Bouquet<sup>3</sup>

<sup>1</sup> French National Institute for Health and Medical Research, Department of Psychiatry, University of Strasbourg, Strasbourg, France, <sup>2</sup> L'institut National Universitaire Jean-François Champollion, Université de Toulouse, Albi, France, <sup>3</sup> Centre National de la Recherche Scientifique, Centre de Recherches sur la Cognition et L'apprentissage, Université de Poitiers, Poitiers, France

#### Edited by:

Sarah E. MacPherson, University of Edinburgh, United Kingdom

#### Reviewed by:

Erik Bijleveld, Radboud University Nijmegen, Netherlands

Luke Clark, University of Cambridge, United Kingdom

> \*Correspondence: Rémi L. Capa remi.capa@univ-jfc.fr

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 27 September 2017 Accepted: 29 January 2018 Published: 16 February 2018

#### Citation:

Capa RL and Bouquet CA (2018) Individual Differences in Reward Sensitivity Modulate the Distinctive Effects of Conscious and Unconscious Rewards on Executive Performance. Front. Psychol. 9:148. doi: 10.3389/fpsyg.2018.00148

Executive control can be driven by conscious and unconscious monetary cues. This has raised the exciting question of the role of conscious and unconscious reward in the regulation of executive control. Similarities and differences have been uncovered between unconscious and conscious processing of monetary rewards. In the present study, we explored whether individual differences associated with reward sensitivity foster these variations on memory-updating, a core component process of executive control. Participants (N = 60) with low, medium, and high reward sensitivity were selected and performed a numerical memory-updating task. At the beginning of each trial, a high (1 euro) or a low (5 cents) reward was presented subliminally (24 ms) or supraliminally (300 ms). Participants earned the reward by responding correctly. Participants with low reward sensitivity performed better for the high reward only in the subliminal condition. For participants with medium reward sensitivity, performance improved with high reward in both subliminal and supraliminal conditions. When participants had high reward sensitivity scores, the effect of reward was stronger in the supraliminal condition than the subliminal condition. These results show that the distinctive effects of conscious and unconscious rewards on executive performance are modulated by individual differences in reward sensitivity. We discuss this finding with reference to models of conscious/unconscious processing of reward stimuli.

Keywords: reward, conscious and unconscious processes, behavioral activation, individual differences, executive control, memory-updating

# INTRODUCTION

Executive control has been defined as "the ability to flexibly and dynamically adjust one's performance to changing environmental demands and internal goal states" (Barch et al., 2009). This ability has been strongly linked to consciousness. However, several studies have reported effects of subliminal stimuli on high-order executive control processes (van Gaal et al., 2012). For instance, studies have suggested that conscious but also unconscious processing of monetary reward can increase performance in tasks requiring executive control (Capa et al., 2011, 2013; Bustin et al., 2012). This has raised the exciting question of the role of conscious and unconscious reward processing in executive control. However, differences have been uncovered between conscious and unconscious reward. In the present study, we explored whether the behavioral activation system – a motivational system responsible for organizing and regulating behavior to attain rewards (Gray, 1989) – may be a crucial factor to explain differences in executive performance (memory-updating) between conscious and unconscious reward processing.

Initial empirical evidence of the influence of unconscious reward processing was provided by Pessiglione et al. (2007), who invited participants to perform a task in which they could earn money by squeezing a handgrip. Participants put in more effort for larger sums of money displayed subliminally and supraliminally. Moreover, the same basal forebrain region was involved in both subliminal and supraliminal reward presentation, which suggests that the cerebral structures involved in both conditions were qualitatively similar. This influential study was replicated and extended to various cognitive tasks (Bijleveld et al., 2009, 2010, 2011, 2012a; Capa et al., 2011, 2013; Zedelius et al., 2011, 2012a,b, 2013, 2014; Bustin et al., 2012). Among these studies, several showed that even performance in complex tasks involving high-order processes, traditionally thought to require consciousness, can be driven by both conscious and unconscious rewards. For instance, Bijleveld et al. (2010) invited participants to perform a task in which they could earn money by quickly and accurately solving a mathematical equation. The amount of money that participants received was contingent on their speed and accuracy. The possibility of a speed-accuracy trade-off thus allowed participants to make strategic choices. In other words, participants could choose between using a rapid strategy or a cautious one. Subliminal high rewards made participants more eager, with faster but equally accurate responses. Supraliminal high rewards, on the other hand, caused participants to be more cautious, with slower but more accurate responses. Interestingly, other studies also reported differences between conscious and unconscious rewards for tasks requiring executive control such as memory-updating (Capa et al., 2011; Bustin et al., 2012) and task-switching (Capa et al., 2013).

In a previous study (Capa et al., 2011), we sought to investigate the influence of conscious and unconscious rewards on memory-updating. Participants had to memorize five numbers and update those numbers independently according to a series of six successive arithmetic operations. At the beginning of each trial, a reward (1 euro or 5 cents) was presented either subliminally (27 ms) or supraliminally (300 ms). If participants successfully reported the final correct series of numbers, then they earned the reward at stake. Results showed better performance when a high monetary reward (either consciously or unconsciously processed) was at stake. However, the participants showed a better percentage of correct responses when subliminal reward cues were presented compared to supraliminal cues.

In another study, we tested the influence of conscious and unconscious rewards during cued task-switching performance (Capa et al., 2013). In this study, participants performed runs of task-switching. During each run, participants switched among three tasks and earned the reward contingent upon their accuracy in the run. The percentage of correct runs was larger for the higher than for the lower reward, in both subliminal and supraliminal conditions. With respect to reaction times, participants were overall faster for the supraliminal reward. Moreover, for the subliminal reward, no reaction time difference was observed between high and low reward conditions, suggesting that participants were more cautious.

Why consciously and unconsciously processed rewards can differentially affect executive control is open to argument and the modulating factors of the reward-related effects remain to be fully understood. In this context, one important – but rather unexplored – question is whether individuals' personality traits or tendencies can modulate these effects on executive performance. The impact of individual differences on the distinctive effects of conscious and unconscious rewards is a key issue to investigate to further our understanding of the regulating role of motivation in executive control (Braver et al., 2010).

In a first attempt (Bustin et al., 2012), we explored whether individual differences associated with novelty seeking could foster differences in the effect of reward processing on executive function. Novelty seeking is defined as a trait involving activation or initiation of behaviors such as exploratory activity and approach to monetary rewards (Cloninger et al., 1993). Within this frame, participants performed a memory-updating task, similar to Capa et al. (2011), to earn rewards presented consciously and unconsciously. On the basis of participants' scores on the novelty seeking scale from the Temperament and Character Inventory-Revised (TCI-R; Cloninger, 1999), two groups of participants (low, below the median, vs. high, above the median) were created. We found that low novelty seeking participants performed better when rewards were presented subliminally, whereas high novelty seeking participants' performance did not differ regardless of whether reward cues were processed consciously or unconsciously. These previous findings highlight the necessity of taking individual differences into account to better understand the effects of conscious and unconscious processing of reward on executive control.

To examine this issue further, the present study focused on reward sensitivity, which is known to be a moderator of reward processing (Kim et al., 2015). More specifically, we investigated whether individual differences in reward sensitivity can foster the distinctive effects of conscious (supraliminal) and unconscious (subliminal) processing of reward on executive performance. Cloninger's model of personality, which incorporates novelty seeking tendency (see above), is theoretically related to Gray's (1989) model, which distinguishes between two motivational systems: the behavioral approach system (BAS) and the behavioral inhibition system (BIS) (Mardaga and Hansenne, 2007). Hence, the present work was guided by Gray's model. According to this model, the BIS guides behavior in response to punishment signals via the septohippocampal system. The BAS, on the other hand, may organize and regulate behavior in response to reward signals via the dopamine system.

Reward sensitivity is measured using the BIS/BAS scale developed by Carver and White (1994). The BAS scale consists of three subdimensions, with two subscales related to reward processing (BAS Drive and BAS Reward Responsiveness) and one to novelty-seeking (BAS Fun Seeking). Previous research has shown that the BAS Drive and BAS Reward Responsiveness subscales can be used as a reliable index of individual differences in reward sensitivity (Carver and White, 1994; Hickey et al., 2010). The BAS Reward Responsiveness subscale captures positive responses to the occurrence or anticipation of reward, whereas the BAS Drive subscale indexes the persistent pursuit of desired rewards. Participants with high, as opposed to low, BAS Drive scores show correspondingly higher task engagement to earn a conscious reward during a cognitive task (Boksem et al., 2006, 2008; Hickey et al., 2010). Similarly, BAS Reward Responsiveness has been found to correlate positively with reward-related effects on cognitive processing (Braem et al., 2012). Differences in BAS scores can thus lead to differential effects of conscious rewards. To the best of our knowledge, no studies to date have looked at the impact of unconscious reward processing as a function of the BAS system.

In the present study, executive control was probed through a memory-updating task. Memory-updating is a key component of executive control. It refers to a process that is required to modify the content of working memory by replacing current, no longer relevant information with more relevant information (Morris and Jones, 1990). Participants were selected with low, medium, or high scores on the BAS Drive subscale.<sup>1</sup> They could earn money displayed subliminally or supraliminally by performing well in a numerical memory-updating task. As observed previously in a similar task (Capa et al., 2011; Bustin et al., 2012), we anticipated a reward effect, with better performance associated with the possibility of earning a high reward as opposed to a low reward. Moreover, we explored whether conscious and unconscious reward processing differed or not as a function of the BAS system. We expected an increase in the reward effect on executive performance concomitant with the increase in BAS scores.

# MATERIALS AND METHODS

#### Participants

The French version of the BIS/BAS scales (Carver and White, 1994) was administered to 215 university students (129 female, 86 male) enrolled in an introductory-level psychology course. Their mean age was 21.18 years (SD = 3.13). Participants were classified as low BAS if their score (i.e., the sum of the four items) on the BAS Drive subscale was 6 or less (below the 15th percentile), and high BAS if their score was 10 or more (above the 85th percentile).<sup>2</sup> Participants who scored 7 or 8 (between the 40th and 60th percentiles) were classified as medium BAS (participants with a score of 9 were excluded). Furthermore, participants in the low, medium, and high BAS groups were selected only if their BIS score (i.e., sum of the 7 BIS items) was between 14 and 22 (between the 15th and 85th percentiles). This was done to avoid differences in BIS scores across groups. Those members of this first sample who volunteered to take part in the study had to fill out the BIS/BAS questionnaire a second time (Carver and White, 1994). Only participants whose scores stayed within the limits set by the initial distribution were retained, as an experimental precaution to ensure their characteristics were stable. A total of 60 participants (20 per group) constituted our final sample.
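The selection rule described above can be summarized as a short sketch. The function name `classify_bas` and its argument names are hypothetical and are not taken from the study materials; the cut-offs are the ones stated in the text. The retest criterion (scores staying within the same limits at the second administration) is not modeled here.

```python
from typing import Optional


def classify_bas(bas_drive: int, bis: int) -> Optional[str]:
    """Return 'low', 'medium', or 'high', or None if the respondent is not selected.

    Illustrative sketch only: cut-offs follow the text (BAS Drive <= 6, 7-8, >= 10;
    BIS between 14 and 22). Names are hypothetical.
    """
    # Respondents outside the BIS band are not selected for any group.
    if not 14 <= bis <= 22:
        return None
    if bas_drive <= 6:
        return "low"
    if bas_drive in (7, 8):
        return "medium"
    if bas_drive >= 10:
        return "high"
    # A BAS Drive score of 9 falls between the medium and high bands and is excluded.
    return None
```

For example, a respondent with a BAS Drive score of 9 is excluded whatever the BIS score, and a respondent with a BIS score of 23 is excluded whatever the BAS Drive score.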

The experiment was conducted in accordance with the Helsinki Declaration. Each participant read and signed an informed consent form prior to taking part in the experiment. They were allowed to keep the money they earned. The study used a 2 (reward presentation duration: supraliminal vs. subliminal) × 2 (reward value: low vs. high) × 3 (BAS group: low vs. medium vs. high) mixed-factorial design. Reward presentation duration and reward value were within-participants factors, and BAS group was a between-participants factor.

# Experimental Task

The updating task – based on the memory-updating paradigm devised by Salthouse et al. (1991) – was presented on an 85-Hz CRT screen. Participants took part in a training session consisting of eight trials (two trials per condition), followed by 80 experimental trials (20 repetitions per condition). At the beginning of each trial, a fixation cross appeared, followed immediately by a pre-mask, the reward stimulus (presented subliminally for 24 ms<sup>3</sup> or supraliminally for 300 ms), and a post-mask (**Figure 1A**). Stimuli for the updating task were then presented (**Figure 1B**). Participants were instructed that if they responded correctly, they would receive the reward presented at the beginning of the trial. They had to memorize five numbers and update each number independently according to a series of six arithmetical operations (i.e., additions and subtractions of ±1 or ±2). Intermediate and end results for each number were always in the range of 0 to 9, to ensure a constant degree of difficulty.
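The link between the 85-Hz refresh rate and the 24 ms subliminal duration can be checked with simple frame arithmetic. The sketch below is illustrative: the helper names are hypothetical, and the text does not state how many frames were used, so the two-frame figure is an inference from the refresh rate rather than a reported parameter.

```python
def frames_to_ms(n_frames: int, refresh_hz: float) -> float:
    """Duration in milliseconds of n_frames at the given refresh rate."""
    return 1000.0 * n_frames / refresh_hz


def nearest_frames(target_ms: float, refresh_hz: float) -> int:
    """Whole number of frames (at least one) closest to a target duration."""
    return max(1, round(target_ms * refresh_hz / 1000.0))


# Assumption: the 27 ms target from the earlier study is approximated on an 85-Hz screen.
n = nearest_frames(27, 85)          # number of whole frames closest to 27 ms
duration = frames_to_ms(n, 85)      # duration actually displayed, in ms
print(n, round(duration, 2))
```

Under these assumptions, one frame lasts about 11.76 ms, so a display can only last a multiple of that value, and two frames give roughly 23.5 ms, which rounds to the reported 24 ms.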

At the end of the trial, participants were asked to enter the final value of each number on a keyboard. They were told they would only win the pecuniary reward if all five numbers were correct. The five-number sequences they had to memorize, the six successive updating operations, and the required responses were all different across trials. This ensured there was no implicit learning or association possible between the reward stimuli and the response. Cumulative earnings were displayed at the end of each trial (**Figure 1C**). Participants were told that the reward stimuli were either 1 euro or 5 cents and would sometimes be difficult to see. This was an experimental precaution which ensured that participants paid attention to the rewards. It was used because the cognitive processes at work in masked priming experiments are dependent on attention (Naccache et al., 2002).

<sup>1</sup>Each of the three BAS subscales can potentially predict reward-related effects (Hickey et al., 2010; Braem et al., 2012). Nevertheless, we chose to create the groups of participants on the basis of the BAS Drive subscale, because it has been found to show the highest construct validity, and it has been suggested to be the best predictor of reward-induced behavior (Carver and White, 1994; Hickey et al., 2010; see also Ross et al., 2002).

<sup>2</sup>These distribution parameters were obtained from our data instead of the original Carver and White (1994) distribution. This was done because we used the French version of the BIS/BAS scales and little is currently known about the distribution of scores on this French version.

<sup>3</sup>This prime duration was selected to approximate the 27 ms prime duration in Capa et al. (2011) given our monitor refresh rate (85 Hz).
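To make the structure of a memory-updating trial concrete, the following sketch generates one trial and applies the all-or-nothing payout rule. It is an illustration under stated assumptions, not the software used in the study: the names `make_trial` and `payout` are hypothetical, each of the six operations is assumed to target a single randomly chosen position, and rewards are expressed in cents.

```python
import random


def make_trial(rng: random.Random, n_numbers: int = 5, n_ops: int = 6):
    """Return (start digits, list of (position, delta) updates, final digits).

    Assumption: each operation updates one position by -2, -1, +1, or +2,
    and only updates that keep the digit within 0-9 are allowed.
    """
    values = [rng.randint(0, 9) for _ in range(n_numbers)]
    start = list(values)
    ops = []
    for _ in range(n_ops):
        pos = rng.randrange(n_numbers)
        # Keep only deltas whose intermediate result stays in the 0-9 range.
        legal = [d for d in (-2, -1, 1, 2) if 0 <= values[pos] + d <= 9]
        delta = rng.choice(legal)
        values[pos] += delta
        ops.append((pos, delta))
    return start, ops, values


def payout(response, answer, reward_cents: int) -> int:
    """Reward is earned only if every reported number matches the final value."""
    return reward_cents if list(response) == list(answer) else 0


rng = random.Random(0)  # fixed seed so the example is reproducible
start, ops, answer = make_trial(rng)
print(start, ops, answer, payout(answer, answer, 100))
```

A response with a single wrong digit earns nothing under this rule, which is why performance is reported as the percentage of fully correct trials.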

# Perceptual Discrimination Task

To ensure that supraliminal reward stimuli were consciously perceived and subliminal reward stimuli were not, after the experimental task participants were asked to perform a forced-choice test. The test consisted of four training trials followed by 80 experimental trials. Each trial consisted of masks and reward cues (**Figure 1A**), after which several choices were displayed simultaneously instead of the experimental task. Participants were asked to choose one of four responses: "I saw 1 euro," "I saw 5 cents," "I guess it was 1 euro," "I guess it was 5 cents." There was no limit to the response time, and the possible responses remained on the screen until a choice had been made.

# Self-Report Data

After the perceptual discrimination task, participants filled out the French version of the BIS/BAS scales (Carver and White, 1994). The BIS scale consisted of seven items (e.g., "I feel worried when I think I have done poorly"). The BAS scale, for its part, was made up of three subscales: Drive (four items, "I go out of my way to get things I want"), Reward Responsiveness (five items, "When I get something I want, I feel excited and energized"), and Fun Seeking (four items, "I crave excitement and new sensations"). Participants rated their responses on four-point scales ranging from 1 (totally true) to 4 (totally wrong). In the current study, Cronbach's alphas were 0.70 for the BIS scale, 0.71 for Drive, 0.73 for Reward Responsiveness, and 0.69 for Fun Seeking. Correlations between the BAS subscales ranged from 0.42 to 0.62. The BIS scale was unrelated to the BAS Reward Responsiveness and BAS Drive subscales but had a small negative association with the BAS Fun Seeking subscale (r = −0.21).

# RESULTS

#### Participant Characteristics


As expected, Student's t-tests revealed that participants in the medium BAS group had higher BAS Drive scores than participants in the low BAS group and lower scores than participants in the high BAS group [t(38) = 12.52, p < 0.0001, d = 3.96] and [t(38) = 21.39, p < 0.0001, d = 6.79], respectively. Student's t-tests revealed that participants with medium BAS had higher BAS Reward Responsiveness [t(38) = 3.47, p < 0.001, d = 1.09], BAS Fun Seeking [t(38) = 2.25, p < 0.03, d = 0.71], and BAS Total [t(38) = 3.29, p < 0.002, d = 1.04] scores than participants in the low BAS group and also lower BAS Reward Responsiveness [t(38) = 2.80, p < 0.007, d = 0.89], BAS Fun Seeking [t(38) = 2.61, p < 0.01, d = 0.82], and BAS Total [t(38) = 3.20, p < 0.003, d = 1.01] scores than participants in the high BAS group. Analyses of BIS scores revealed no significant difference between groups (all ps > 0.77). These results reflect a successful selection of participants. Their characteristics are presented in **Table 1**.

# Percentage of Correct Responses in the Updating Task

A three-way ANOVA with reward presentation duration (subliminal vs. supraliminal) and reward value (1 euro vs. 5 cents) as within-participants factors and BAS group (low vs. medium vs. high) as between-participants factor was used to analyze the data. Reward value had a main effect, with better performance for the high reward (M = 44.00, SD = 24.79) than the low reward (M = 34.00, SD = 20.98), F(1,57) = 64.11, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.53, reflecting a generally successful manipulation of reward. Most interestingly, we found a significant interaction between reward presentation duration, reward value, and group, F(2,57) = 8.64, p < 0.005, η<sub>p</sub><sup>2</sup> = 0.23 (**Figure 2**).
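As a reading aid, the partial eta squared values reported with each F test can be recovered from the F value and its degrees of freedom using the standard identity: partial eta squared equals F × df(effect) divided by F × df(effect) + df(error). The helper below is a minimal sketch with a hypothetical function name; it is not part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of reward value: F(1,57) = 64.11
print(round(partial_eta_squared(64.11, 1, 57), 2))
# Three-way interaction: F(2,57) = 8.64
print(round(partial_eta_squared(8.64, 2, 57), 2))
```

Both results agree, to two decimals, with the effect sizes reported for the main effect of reward value and for the three-way interaction.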

We broke this interaction down by performing three separate ANOVAs (2 reward presentation duration × 2 reward value), one for each BAS group. In the low BAS group (**Figure 2A**), there was a main effect of reward presentation duration, F(1,19) = 9.42, p < 0.007, η<sub>p</sub><sup>2</sup> = 0.33, with better performance in the subliminal condition (M = 44.00, SD = 26.57) than the supraliminal condition (M = 29.00, SD = 13.80). No main effect of reward value was found (p > 0.12), but there was a significant interaction between reward value and reward presentation duration, F(1,19) = 6.41, p < 0.02, η<sub>p</sub><sup>2</sup> = 0.25. The low BAS participants only performed better for the high reward than the low reward in the subliminal condition. Complementary Student's t-tests confirmed that the reward effect was present only in the subliminal condition, t(19) = 3.21, p < 0.005, d = 0.70, and not in the supraliminal condition (p = 0.83).

BAS, behavioral activation system; BIS, behavioral inhibition system.

In the medium BAS group (**Figure 2B**), there was a main effect of reward with better performance when a high reward was at stake (M = 44.67, SD = 22.86) as opposed to a low reward (M = 34.17, SD = 18.81), F(1,19) = 58.41, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.75. This effect was not affected by reward presentation duration (p = 0.57). To ascertain whether there was a reward effect in both conditions, complementary Student's t-tests were conducted. Participants in the medium BAS group performed better for the high reward than the low reward, both in the subliminal condition, t(19) = 3.62, p < 0.002, d = 0.80, and the supraliminal condition, t(19) = 5.16, p < 0.0001, d = 1.11. The main effect of reward presentation duration was not significant (p = 0.21).

In the high BAS group (**Figure 2C**), the main effect of reward was significant (1 euro: M = 49.00, SD = 26.65, and 5 cents: M = 33.17, SD = 23.80), F(1,19) = 35.46, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.65. Furthermore, the reward effect was higher in the supraliminal condition than the subliminal condition, as suggested by a significant interaction, F(1,19) = 10.70, p < 0.004, η<sub>p</sub><sup>2</sup> = 0.36. Complementary Student's t-tests showed an effect of reward in both conditions of presentation duration, t(19) = 3.32, p < 0.004, d = 0.75 for the subliminal condition and t(19) = 5.44, p < 0.0001, d = 1.13 for the supraliminal condition. Performance for the 5 cents reward did not differ according to whether it was displayed subliminally or supraliminally (p = 0.40). However, high BAS participants performed better in the supraliminal condition in order to win 1 euro than in the subliminal condition, t(19) = 2.21, p < 0.04, d = 0.47.

#### Complementary Analysis

To gain a better understanding of the effect of individual differences associated with reward sensitivity for conscious and unconscious reward processes, we conducted two separate ANOVAs (2 reward value × 3 group), one for the subliminal and the other for the supraliminal condition. In the subliminal condition, no main effect of group or interaction was found (all ps > 0.81). In the supraliminal condition, the effect of reward increased with the increase in the BAS scores, as suggested by a significant interaction between reward value and groups, F(2,57) = 12.98, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.31. Complementary Student's t-tests showed that the reward effect (difference between 1 euro and 5 cents) in the medium BAS group (M = 11.67, SD = 10.12) was greater than that in the low BAS group (M = −0.67, SD = 14.00), t(38) = 3.19, p < 0.003, d = 0.62, but less than that in the high BAS group (M = 23.33, SD = 19.16), t(38) = 2.41, p < 0.02, d = 0.71.

FIGURE 2 | Percentage of correct responses in the memory-updating task as a function of reward value displayed subliminally and supraliminally and BAS groups. Low BAS participants (A) performed better for the high reward than the low reward only in the subliminal condition. In the medium BAS group (B), there was a main effect of reward, and it was similar in the subliminal and supraliminal conditions. In the high BAS group (C), the reward effect was higher in the supraliminal condition than the subliminal condition. Error bars denote standard errors of the mean. ∗∗p < 0.005, and ns for not significant.

# Prime Visibility Test

It was apparent from debriefing participants before the prime visibility test that none of them was able to report whether 1 euro or 5 cents coins were presented subliminally. We analyzed the prime visibility test results on the basis of correct responses, defined as responses indicating the participant had seen or guessed the right coin. Results of the prime visibility test showed that the participants had seen the coins in the supraliminal condition (M = 97.85). However, for the subliminal coins, the mean percentage of correct responses (M = 50.94, SD = 6.29) was not significantly different from chance, as suggested by Student's t-tests (ps > 0.24). In addition, Student's t-tests revealed no difference between groups (ps > 0.72).
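The scoring rule and the comparison with chance described above can be sketched as follows. This is an illustration only: the response encoding, the function names, and the pooling of all 60 participants into a single test are assumptions (the text reports several t-tests without detailing them), and 50% is taken as chance because a response counts as correct whenever it names the right one of two coins.

```python
import math


def is_correct(shown_coin: str, response: tuple) -> bool:
    """A response is (mode, coin), with mode 'saw' or 'guess' (hypothetical encoding).

    Seen and guessed answers both count as correct if they name the coin shown.
    """
    _mode, coin = response
    return coin == shown_coin


def one_sample_t(mean: float, sd: float, n: int, mu0: float) -> float:
    """One-sample t statistic for a sample mean against a reference value mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


# Reported subliminal accuracy (M = 50.94, SD = 6.29); n = 60 is an assumption.
t_value = one_sample_t(50.94, 6.29, 60, 50.0)
print(round(t_value, 2))
```

Under these assumptions the statistic stays well below the two-tailed critical value of about 2.00 for 59 degrees of freedom, in line with the reported absence of a difference from chance.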

# DISCUSSION

The present study shows that executive performance associated with the possibility of earning a high reward improved with the increase in the BAS scores when the reward was consciously processed, but not when the reward was subliminally displayed. In the same way that personality has been shown not only to attenuate, but even sometimes to eradicate or reverse classic psychological effects (Matthews, 2009), these results highlight the need to take into account individual differences in reward sensitivity when investigating the effects of conscious and unconscious rewards on executive control.

The current findings echo other studies showing that executive performance fluctuates and can be modified, for example, by affective stimuli (Dreisbach, 2006; Chiew and Braver, 2014). More specifically, our work is in line with previous research on the influence of motivation showing that rewards can improve executive functioning (Bijleveld et al., 2010; Jimura et al., 2010; Capa et al., 2011, 2013; Bustin et al., 2012; Chiew and Braver, 2014). Here we found that, overall, high rewards (compared to low rewards) improved memory-updating. A key finding is that individuals' reward sensitivity moderated these reward-related effects on memory-updating (at least when reward cues were processed consciously). A straightforward explanation is that, in contrast with the high BAS group, individuals with low sensitivity to rewards should experience a weaker incentive state when processing the high rewards, leading to less improvement in memory-updating relative to the low reward condition. However, why both conscious and unconscious rewards improved memory-updating, but only conscious rewards' effects depended on individual differences in reward sensitivity, is open to discussion.

A relevant framework for understanding conscious and unconscious reward processing and its similar or distinctive effects on performance has been suggested by Bijleveld et al. (2012b). This model may be useful to understand the variability of the effects of conscious and unconscious rewards on memory-updating. Accordingly, people first process rewards in subcortical brain structures, such as the striatum (Pessiglione et al., 2008). This initial processing requires little perceptual input, is not consciously experienced, and can directly facilitate performance by prompting task engagement in the service of reward attainment. However, when supraliminal reward cues are consciously perceived, they may undergo full processing, in which case, the brain structures mobilized (e.g., anterior cingulate cortex, dorsolateral prefrontal cortex, and medial prefrontal cortex), in addition to the structures already engaged by initial reward processing, may involve higher-level cognitive functions, such as strategy (Haber and Knutson, 2009). Thus, full reward processing may lead individuals to consciously choose a strategy.

Furthermore, when strategy differences emerge, the effects of conscious and unconscious reward cues may also differ, with conscious reward cues either helping or hindering performance.

A good illustration of hindering performance was observed with the low BAS participants (**Figure 2A**). They performed better for the high reward than the low reward, but only in the subliminal condition. No improvement in performance was observed for a high reward in the supraliminal condition. The amount of money at stake in the present study was probably not challenging enough for the low BAS participants, and conscious reflection on reward probably led them to disengage from the pursuit and attainment of reward. This is consistent with the study of Zedelius et al. (2013) which showed that conscious reflection on a money cue can cause people to disengage from attainment of reward. Participants were invited to perform a working memory task in which money cues (coins) serving as rewards or not were displayed supraliminally or subliminally. High money cues led to improved performance even when the coins did not serve as rewards in the subliminal condition, but not in the supraliminal condition. This suggests that consciousness is crucial in regulating effort mobilization toward money cues. In the medium BAS group (**Figure 2B**), the effect of reward was similar in the subliminal and supraliminal conditions, suggesting that participants in the medium BAS group adopted a similar level of task engagement toward the pursuit and attainment of conscious and unconscious rewards. This participant sample is probably the one most frequently studied, and this result is in keeping with our previous study (Capa et al., 2011), which found a reward-related effect in both subliminal and supraliminal conditions. The high BAS group may be a good illustration of conscious reward helping performance. For participants in the high BAS group (**Figure 2C**), the reward effect was greater in the supraliminal condition than the subliminal condition. Consciously reflecting on reward led the high BAS participants to engage more toward attaining the reward and to perform better. 
This fits well with previous studies showing that an increase of BAS score induced higher task engagement to earn a conscious reward (Boksem et al., 2006, 2008; Hickey et al., 2010).

The existence of unconscious perception is no longer denied. Rather, the controversy has shifted to the depth with which subliminal stimuli can be processed and the limits to unconscious cognition (van Gaal et al., 2012). These limits are a source of variability between conscious and unconscious reward effects on executive functions. The present results highlight the limits to the depth with which unconscious stimuli can be processed. In our study, there was no difference in performance across BAS groups when the possibility of winning a large reward was displayed subliminally, suggesting there was no difference in the strategies adopted by the groups. Differences in performance across BAS groups emerged only when the reward was processed consciously, suggesting that consciousness of reward is crucial to triggering specific strategies. This result lends support to the theoretical framework developed by Bijleveld et al. (2012b), which suggests that full reward processing may cause individuals to choose a strategy consciously.

Individual differences in reward sensitivity, as indexed by the BAS scores used in the present study, have been strongly linked to dopaminergic neurotransmission (Gray, 1989; Tomer et al., 2014). Interestingly, two recent studies have investigated whether individual differences in indirect markers of dopaminergic activity modulate the reward-related effects on performance in a tapping task (Veling and Bijleveld, 2015) and a force task (Pas et al., 2014). In both studies, performance correlated with dopaminergic activity when reward cues were presented subliminally, but not when reward cues were presented supraliminally. These results contrast with our finding that performance was moderated by reward sensitivity only when reward information was processed consciously. One possible explanation of these contrasting findings is that the prior studies and the current work differed in the method employed to define inter-individual differences. The prior studies used neurophysiological or behavioral markers of dopamine system activity, such as eye blink rate (Pas et al., 2014, Study 1), error-related negativity (Pas et al., 2014, Study 2) and performance in the balloon analog risk task (Veling and Bijleveld, 2015). In contrast, we estimated inter-individual differences using self-report measures. It is possible that the explicit self-report of sensitivity to reward led participants to act accordingly when exposed to supraliminal reward cues, hence putting more effort into the implementation of conscious strategies. However, this explanation cannot fully account for the present results, because one can still expect individuals' trait sensitivity to reward to modulate the effects of subliminal rewards. It is also noteworthy that the studies by Pas et al. (2014) and Veling and Bijleveld (2015) involved a tapping task and a force task, respectively, i.e., tasks that place minimal demand on executive control resources. 
In contrast, we specifically investigated executive performance in a memory-updating task. Such a focus on executive performance may explain the specific pattern of results we obtained. One might speculate that the processing of conscious reward and/or the cost-benefit decisions associated with these rewards (Bijleveld et al., 2010, 2012b) are less resource demanding in individuals with high (vs. low) reward sensitivity. Hence, processing conscious reward would be more beneficial for high-BAS individuals during the performance of a task that requires resource-demanding cognitive control processes. In contrast, the more rudimentary unconscious processing of reward would boost motivation (Bijleveld et al., 2010), hence improving executive performance, irrespective of individuals' reward sensitivity. This interpretation remains, however, speculative and calls for further research.

Interestingly, it has been suggested recently that working memory operations are subjectively costly and therefore cost-benefit decision making would bias the functioning of working memory (Westbrook and Braver, 2016). Individual differences in reward/punishment responsiveness may imply differences in subjective costliness of executive operations and cost-benefit decisions (Franken and Muris, 2005; Suhr and Tsanadis, 2007). Hence, a promising line of future research is to examine the potential relationship between reward responsiveness and cost-benefit decisions underlying executive functioning, for example by evaluating the combined effects of task difficulty and reward as a function of individuals' BAS/BIS profiles.

It is worth considering our finding that rewards improved memory-updating in light of recent neurocognitive models. On one hand, research demonstrates that reward processing and motivation are intimately linked to dopamine-related circuitry (Kim et al., 2015; Westbrook and Braver, 2016). On the other hand, neurocognitive approaches to executive control suggest that dopamine activity plays a crucial role in the maintenance/updating of working memory content (Braver, 2012). Interestingly, a recent framework proposed by Westbrook and Braver (2016) integrates these two lines of research and suggests that rewards improve the stability of the content in working memory through modulation of frontal dopamine release. Importantly, within this framework, striatal dopamine release mediates the incentive-induced improvement of the updating of working memory. On this basis, one might speculate that the improvement in memory-updating we observed in the high reward condition was related to reward-induced modifications of striatal dopaminergic activity.

The present results thus highlight the intimate link between motivation/reward and executive functioning (Pessoa, 2009; Braver et al., 2010). They further show that this link is shaped by the BAS/BIS profile of individuals. However, these results should be considered in light of some limitations. A potentially important limitation of our study is the small number of participants in each group, which limits the generalizability of the findings. Another limitation is that the groups differed on all three BAS subscales (Drive, Reward Responsiveness, and Fun Seeking). Although the groups were defined according to the Drive subscale, this prevents us from determining which specific dimension of the BAS was responsible for the observed effects. However, it is worth noting that the participants were matched on the BIS component and that the stability of the BAS scores was verified with two administrations, which are strong points of the present research.

#### CONCLUSION

We identified different behavioral responses to consciously vs. unconsciously processed reward stimuli and have shown that individual differences in reward sensitivity are a key factor for explaining variations in executive performance aimed at reward attainment. We found that the modulatory role of reward strengthened with the increase in the BAS scores, but only when the reward was processed consciously. This result suggests conscious processing of reward is crucial for the existence of specific strategies pertaining to reward sensitivity in tasks requiring executive control.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the centre of human investigations of Strasbourg with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the centre of human investigations of Strasbourg. All procedures performed were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### FUNDING

This work was supported by the French National Institute for Health and Medical Research (INSERM).

#### ACKNOWLEDGMENTS

The authors wish to thank the reviewers for their helpful comments on earlier drafts of the manuscript. They also thank Sean Duffy for his suggestions.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Capa and Bouquet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Effects of Arousal and Approach Motivated Positive Affect on Cognitive Control. An ERP Study

Andrzej Cudo\*, Piotr Francuz, Paweł Augustynowicz and Paweł Stróżak

The Department of Experimental Psychology, The John Paul II Catholic University of Lublin, Lublin, Poland

A growing body of research has demonstrated that affect modulates cognitive control modes such as proactive and reactive control. Several studies have suggested that positive affect decreases proactive control compared to neutral affect. However, these studies only focused on the valence of affect and often omitted two of its components: arousal and approach motivation. Therefore, we designed the present study to test the hypothesis that cognitive control modes would differ as a function of arousal and approach motivated positive affect. In our study, we used an AX-continuous performance task (AX-CPT), commonly used to examine shifts in proactive and reactive control. We also measured P3b, contingent negative variation (CNV), N2 and P3a components of event-related brain potentials (ERPs) as indicators of the use of cognitive control modes. The findings of the present study demonstrated that approach motivated positive affect modified only the P3b and the CNV without effects on the N2 and P3a components. However, arousal induced by pictures modified P3b, CNV and N2 amplitudes. Specifically, the P3b amplitude was larger, and CNV amplitude was less negative in the high than in the low-approach motivated affect. In contrast, the P3b amplitude was larger and both the CNV and N2 amplitudes more negative in low- compared with high-arousal conditions. These ERP results suggest that approach motivated positive affect enhanced proactive control with no effect on reactive control. However, arousal influenced both proactive and reactive control. High arousal decreased proactive control and increased reactive control compared to low arousal. The present study provides novel insights into the relationship between affect, specifically, arousal and approach motivated positive affect and cognitive control modes. In addition, our results help to explain discrepancies found in previous research.

Keywords: proactive control, reactive control, arousal, approach motivation, P3b, CNV, N2, P3a

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Susan Gillingham, University of Toronto, Canada; Juliana Yordanova, Institute of Neurobiology (BAS), Bulgaria

\*Correspondence: Andrzej Cudo, andrew.cudo@gmail.com

Received: 22 January 2018 Accepted: 23 July 2018 Published: 31 August 2018

#### Citation:

Cudo A, Francuz P, Augustynowicz P and Stróżak P (2018) The Effects of Arousal and Approach Motivated Positive Affect on Cognitive Control. An ERP Study. Front. Hum. Neurosci. 12:320. doi: 10.3389/fnhum.2018.00320

# INTRODUCTION

Cognitive control is defined as a system of processes that maintain the ability to interact with the environment in a goal-driven manner, with flexibility and constantly adapting behavior to the changing environment (Botvinick et al., 2001). Cognitive control is also defined as an emergent process resulting from the dynamic interaction between specialized brain processing systems. Also, this control is possible due to the information of the context in which the task is performed. The context is defined by information about the goals, instructions and requirements relating to the task, as well as information from a previously performed task (Braver et al., 2007). The Dual Mechanism of Control (DMC) framework indicates that cognitive control functions via two distinct operating modes: proactive control and reactive control (Braver et al., 2007; Braver, 2012). Proactive control relates to the active maintenance of contextual information to optimally bias attention, perception and action systems in a goal-driven manner. Reactive control is associated with the retrieval of context information mobilized only as needed, especially after detection of a high interference event (Braver et al., 2007; Braver, 2012). Proactive control is associated with a large number of resources which must be engaged to achieve continuous goal maintenance. As a result, it contributes to limiting the number of goal representations that are the focus of attention and reducing the maintenance of other information (Braver, 2012). It is also connected with the activity of the orbitofrontal–dorsolateral cortex (Braver et al., 2007; Braver, 2012). On the other hand, reactive control is mobilized "in a just-in-time manner" and is, therefore, less resource consuming. The engagement of this control mode is linked to increased anterior cingulate cortex (ACC) activity in response to the detection of interference (Braver, 2012). These types of control can be modulated by several factors, including positive affect (Dreisbach and Goschke, 2004; Dreisbach, 2006; Goschke and Bolte, 2014).

A large number of studies examining the influence of affect on cognitive control within the DMC framework have used the AX-Continuous Performance Task (AX-CPT; Rosvold et al., 1956; Braver and Cohen, 2001; Braver, 2012). This task also requires the ability to update information held in working memory (Goschke and Bolte, 2014). During the task, sequences of letters are shown to the subjects; in each sequence, the first letter is a cue and the second is a probe. There are four possible sequences: (1) AX: the cue is A, and this is followed by the letter X as the probe; (2) AY: the cue is A, and it is followed by any probe other than X; (3) BX: the cue is any letter other than A, and it is followed by the X probe; and (4) BY: the cue is any letter other than A, and it is followed by any probe other than X. The subject's task is to respond in a specific way (e.g., by pressing the right mouse button) to the probe when it appears as part of an AX sequence. When exposed to the other sequences, the subject is expected to respond differently (e.g., by pressing the left mouse button). This task is an experimental paradigm that establishes context in the form of a specific cue, after which the subject must react to the probe. The sequences are displayed with the following frequencies: AX, 70%; AY, 10%; BX, 10%; BY, 10% (Braver, 2012). Therefore, subjects are biased to respond as they would to AX sequences when presented with AY or BX sequences. Two different patterns of error rates and reaction times (calculated for the correct responses) can be observed in AY and BX sequences, depending on whether proactive or reactive control is engaged. Proactive control should create an expectancy for an X probe response following an A cue, which leads to larger error rates and longer reaction times in the AY sequences. In this context, the longer reaction time may reflect greater interference between the preparatory process triggered by the A cue and the response process triggered by the Y probe. In the BX sequences, the cue-driven reaction to the probe should lead to lower error rates and shorter reaction times. This reaction time pattern in BX sequences may occur because the actively maintained contextual information provided by the B cue serves to reduce interference between the preparatory process triggered by the B cue and the response process triggered by the X probe. By contrast, the engagement of reactive control is associated with probe-driven reactions and may lead to lower error rates and shorter reaction times in AY sequences because the subject does not follow the A cue information when the Y probe is presented. Hence, the subject does not actively maintain contextual information about the A cue and responds on the basis of information about the Y probe, which leads to a shortened response time for AY sequences in reactive compared to proactive control. Also, the probe-driven reaction should contribute to higher error rates and longer reaction times in the BX sequences. This is related to the fact that a person using reactive control is not able, on seeing the X probe, to inhibit the learned reaction and change to the less frequent response in the BX sequence. This occurs even though the B cue appears before the X probe. Also, the slower reaction time in the BX sequence in reactive compared to proactive control mode reflects the time taken to engage contextual information about the B cue following X probe presentation (Braver and Cohen, 2001; Braver et al., 2007; Chiew and Braver, 2017).
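The trial logic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in any of the cited studies: the function names (`classify`, `correct_response`, `make_block`) and the response labels `"target"` and `"nontarget"` are hypothetical, and only the four trial types and their 70/10/10/10 proportions come from the text.

```python
import random

# Trial-type percentages reported in the text (Braver, 2012).
PERCENTAGES = {"AX": 70, "AY": 10, "BX": 10, "BY": 10}


def classify(cue, probe):
    """Label a cue-probe pair as AX, AY, BX or BY.

    Any cue other than 'A' counts as 'B'; any probe other than 'X' counts as 'Y'.
    """
    cue_part = "A" if cue == "A" else "B"
    probe_part = "X" if probe == "X" else "Y"
    return cue_part + probe_part


def correct_response(cue, probe):
    """Return the required response: 'target' only for an X probe after an A cue."""
    return "target" if classify(cue, probe) == "AX" else "nontarget"


def make_block(n_trials, seed=0):
    """Build a shuffled list of trial types with the reported proportions.

    Assumes n_trials is a multiple of 10 so that the counts are whole numbers.
    """
    rng = random.Random(seed)
    block = []
    for kind, pct in PERCENTAGES.items():
        block += [kind] * (n_trials * pct // 100)
    rng.shuffle(block)
    return block


block = make_block(100)
```

Because 90% of the trials that start with an A cue end with an X probe, preparing the target response after an A cue pays off on most trials. That same preparation is what makes AY trials error-prone under proactive control.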

In addition to behavioral measurements, the AX-CPT method provides reliable indicators of proactive and reactive control using event-related brain potentials (ERPs; see van Wouwe et al., 2011; Morales et al., 2015; Chaillou et al., 2017; Li et al., 2018). Based on previous studies of AX-CPT, the proactive mode of control is assumed to be reflected by P3b analyzed for the cue and contingent negative variation (CNV) analyzed before the probe (see **Figure 1**). By contrast, reactive control is reflected by N2 and P3a analyzed for the probe (see van Wouwe et al., 2011; Lamm et al., 2013; Morales et al., 2015; Kamijo and Masaki, 2016; Chaillou et al., 2017). P3b is a positive component that reaches its maximum 300–600 ms after stimulus presentation at the Pz electrode (Polich, 2003). This component has multiple functional correlates including context updating, the memory of task-relevant information and target categorization (Polich, 2007). Moreover, a larger P3b is associated with greater context updating and utilization of cue information (Donchin and Coles, 1988; Polich, 2007; Lenartowicz et al., 2010). Therefore, P3b amplitude may reflect enhanced proactive control (van Wouwe et al., 2011). CNV is a slow, surface-negative electrical brain wave occurring in the interval between the presentation of a warning stimulus (e.g., cue) and an imperative stimulus (e.g., probe) to which a motor response is usually required (Tecce, 1972). The CNV component is recorded from the frontal and central electrodes and it is assumed to represent multiple functional correlates including preparing the motor response (Loveless and Sanford, 1975), activation of the attention network (Fan et al., 2007), temporal processing (Mento, 2013), working memory load and response interference (Tecce, 1972; Roth et al., 1975; Gevins et al., 1996; McEvoy et al., 1998). 
Moreover, a more negative CNV is related to a greater preparatory process for the motor response, particularly where that preparation is preceded by a prior cue that a response is to be prepared (Ruchkin et al., 1995). This may indicate that greater involvement of proactive control is related to more effective task preparation and a larger CNV amplitude. Regarding the reactive control components, the N2 component is a negative component that reaches its maximum 200–400 ms after a conflict situation (Folstein and Van Petten, 2008). Its generator lies in the medial frontal cortex, most likely in the ACC (Nieuwenhuis et al., 2003; Folstein and Van Petten, 2008). According to the DMC framework, the ACC is associated with reactive control (Braver et al., 2007; Braver, 2012). The N2 component is usually associated with the monitoring of conflicts relating to the inhibition of incorrect response tendencies caused by either the processing of irrelevant stimuli or choice in the face of competing alternatives (Van Veen and Carter, 2002; Nieuwenhuis et al., 2003). Therefore, it is expected to occur with AY sequences. Larger amplitudes of N2 may reflect stronger conflict detection and may thus be associated with efficient reactive control. Conversely, P3a is a positive frontoparietal scalp potential with its maximum occurring 300–600 ms after probe presentation. This component reaches a maximum at the FCz electrode (Beste et al., 2011; van Wouwe et al., 2011). The P3a component may be associated with conflict resolution and response inhibition (Bekker et al., 2004; Jonkman, 2006; Polich, 2007; Smith et al., 2008). It is connected with the activity of the ACC (Volpe et al., 2007) that partly supports reactive control (Braver et al., 2007; Braver, 2012). Therefore, larger amplitudes of P3a may reflect enhanced reactive control. It is also expected that its amplitude will be largest for AY sequences, as in the case of the N2 component. In conclusion, the greater significance of the cue in proactive control, as opposed to reactive control, would be expected to elicit a larger cue-related P3b component. Also, the greater expectation of the probe after the cue in the proactive mode would be expected to elicit a larger CNV than in reactive control. The greater significance of the probe in reactive control would be expected to elicit larger probe-related N2 and P3a amplitudes than in proactive control for AY sequences.
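One common way to quantify components such as these is the mean amplitude within a latency window. The sketch below illustrates that idea on synthetic data only. The windows are copied from the latency ranges given in the component descriptions above and should not be read as the analysis windows of this study. The function name `mean_amplitude`, the 4 ms sampling step and the synthetic waveform are all assumptions made for illustration.

```python
# Latency ranges (ms after stimulus onset) taken from the component
# descriptions in the text; illustrative only, not the study's analysis windows.
WINDOWS_MS = {"P3b": (300, 600), "N2": (200, 400), "P3a": (300, 600)}


def mean_amplitude(epoch_uv, times_ms, window_ms):
    """Mean voltage (µV) of one averaged, single-channel epoch inside a window.

    epoch_uv and times_ms are equally long sequences; the window is inclusive.
    """
    lo, hi = window_ms
    samples = [v for v, t in zip(epoch_uv, times_ms) if lo <= t <= hi]
    if not samples:
        raise ValueError("window lies outside the epoch")
    return sum(samples) / len(samples)


# Synthetic waveform (assumption): flat at 0 µV with a 5 µV plateau
# between 300 and 600 ms, sampled every 4 ms from -200 ms.
times = list(range(-200, 1000, 4))
erp = [5.0 if 300 <= t <= 600 else 0.0 for t in times]

p3b_like = mean_amplitude(erp, times, WINDOWS_MS["P3b"])
n2_like = mean_amplitude(erp, times, WINDOWS_MS["N2"])
```

With real recordings the same function would be applied to the averaged waveform at the relevant electrode (for example Pz for the P3b or FCz for the P3a, as described above). More negative values indicate a larger N2 or CNV.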

The results of research conducted on the DMC framework (Braver et al., 2007) indicate that positive affect modulates the proactive mode of cognitive control (Dreisbach and Goschke, 2004; Dreisbach, 2006; Fröber and Dreisbach, 2012, 2014). Some researchers have suggested that positive affect is associated with a decrease in proactive control (Dreisbach and Goschke, 2004; Dreisbach, 2006; Fröber and Dreisbach, 2014). For example, Dreisbach (2006) showed that compared with pictures eliciting neutral and negative affect, those eliciting positive affect reduced error rates in AY trials and increased error rates and reaction times in the BX condition in an AX-CPT. Fröber and Dreisbach (2014) demonstrated that positive affect pictures reduced error rates and reaction times in AY but not in BX sequences. Similarly, van Wouwe et al. (2011) demonstrated that positive rather than neutral affect reduced errors in AY trials but had no effect on BX sequences. Decreases in error rates in the AY condition may be linked with reduced maintenance of the A cue; strong maintenance of this cue would otherwise lead to incorrect preparation when the Y probe is displayed. Reduced maintenance may therefore result in lower response conflict when the Y probe appears, which may suggest a decrease in proactive control (Braver, 2012). van Wouwe et al. (2011) also showed a more negative probe-related N2 amplitude with neutral affect than with positive affect in AY sequences. They suggested that their results indicated an increase in reactive control and a decrease in proactive control. By contrast, Chiew and Braver (2014) showed increased error rates in AY and decreased error rates in all other sequences in a positive affect block compared with a neutral one. This may indicate the reinforcement of proactive control (Braver, 2012).

In addition to exploring valence, studies have also examined two other dimensions of affect: arousal and approach motivation (see Gable and Harmon-Jones, 2010; Demanet et al., 2011). Arousal is an independent dimension of affect, defined as mental activity in response to a stimulus that can be described along a single dimension ranging from sleep to excitement (Mehrabian and Russell, 1974; Russell and Barrett, 1999). On the other hand, approach motivation is defined as the impulse to go toward stimuli (Lang and Bradley, 2008; Gable and Harmon-Jones, 2010). Arousal is a state of physiological alertness and readiness for action in response to the emergence of an affective stimulus, whereas approach motivation is associated with the action of a person toward an affective stimulus (Gable and Harmon-Jones, 2008, 2010). For example, if a person sees a beautiful landscape, such an affective stimulus could generate a low level of arousal but a high motivation to approach it. Furthermore, arousal and approach motivation are connected with different neural systems. A great deal of recent research suggests that the locus coeruleus-norepinephrine system (LC-NE) is associated with general arousal (Aston-Jones and Cohen, 2005). However, in the case of the approach motivation dimension of positive affect, recent research hypothesizes that the dopamine (DA) system may have a key role in the relationship between motivation and cognitive control (Aarts et al., 2011, 2014; Yee and Braver, 2018).

As regards arousal, it has been shown that low-arousal positive affect reduces cue usage and proactive control, but that high-arousal positive affect increases this type of control. However, no study has found effects for negative affect conditions (Fröber and Dreisbach, 2012). Regarding approach motivation, it has been observed that low-approach motivated positive affect is associated with decreased proactive control and high-approach motivated positive affect enhances proactive control. Specifically, Liu and Xu (2016) showed that error rates were higher in AY sequences and lower in BX sequences in a high-approach motivated positive picture group than in a neutral one. They also demonstrated the opposite effect in a low-approach motivated positive picture group compared with the neutral one. More recently, Li et al. (2018) showed that the CNV amplitude analyzed before the probe presentation was larger in a high-approach than in a low-approach condition. They also demonstrated that probe-related P3a was more positive for low than for high-approach motivated positive affect in an AY sequence. However, no effect of approach motivation was found for the probe-related N2 component (Li et al., 2018). This may indicate increasing proactive control in high-approach compared with low-approach motivated positive affect (Gómez et al., 2007; Morales et al., 2015).

To sum up, previous research indicates that the modification of cognitive control is associated with positive affect (Dreisbach and Goschke, 2004; Dreisbach, 2006; Fröber and Dreisbach, 2012; Goschke and Bolte, 2014; Lamm et al., 2013). Positive affect enhances cognitive flexibility in cognitive control and, consequently, impairs the maintenance of task-relevant context information and reduces proactive control (Goschke and Bolte, 2014). However, behavioral and electrophysiological findings have suggested that high-approach as opposed to low-approach motivated positive affect enhances proactive control. Furthermore, Fröber and Dreisbach (2012) showed that low-arousal positive affect reduced proactive control whereas high-arousal positive affect increased this type of control. Therefore, previous findings do not provide a coherent explanation of observed differences in the different dimensions of affect. It should be noted that arousal was fully controlled for only in the study by Li et al. (2018), while Fröber and Dreisbach (2012) did not investigate the approach motivation of positive affect. Moreover, the positive affect components arousal and approach motivation have not been compared in any study using the AX-CPT paradigm. Such a comparison could help to understand the discrepancy in studies of the impact of positive affect on cognitive control.

Previous studies have demonstrated different neuronal mechanisms relating to arousal and the approach motivation of affect (Aston-Jones and Cohen, 2005; Demanet et al., 2011; Miller et al., 2013; Braver et al., 2014; Unsworth and Robison, 2017). Therefore, it can be assumed that arousal and the approach motivation of positive affect can independently influence cognitive control. The aim of our study was thus to identify the specific influence of positive affect on proactive and reactive control, considering not only valence but also arousal and the approach motivation of positive affective stimuli simultaneously. On the basis of theoretical discussions, and in accordance with previous studies, we postulated that high- compared with low-approach motivated positive affect would be associated with enhanced proactive control (see Liu and Xu, 2016). This would be reflected in the modified amplitudes of the P3b and CNV components. Specifically, we postulated that the P3b amplitude, which is thought to be associated with context updating, would be larger for high- than low-approach motivation. We also hypothesized that CNV amplitude, as a functional correlate of preparation for an incoming stimulus, would be more negative with high- than low-approach motivation. Considering the Pessoa (2009) model, in which high-arousal stimuli are related to reduced task performance because there is competition between affective stimuli and executive control for attention resources, we expected that high compared to low arousal would be associated with impaired proactive control. We also hypothesized that both proactive and reactive control would be modified by arousal. The above would be reflected in the P3b, CNV, N2 and P3a component amplitudes. Specifically, we postulated smaller P3b amplitudes and less negative CNV amplitudes with high compared with low arousal. We also hypothesized that the N2 amplitude, which is thought to be associated with conflict monitoring, would be more negative in the high- than in the low-arousal condition. Moreover, we postulated that the P3a amplitude, as a functional correlate of conflict resolution, would be larger in the high- than in the low-arousal condition. To investigate the connection between proactive and reactive control, we used an electrophysiological method, which offers high temporal precision, together with the AX-CPT paradigm. Considering that individual differences play an important role in modulating affective impact on cognitive control, we used an intra-subject design to control for differences between individuals.

# MATERIALS AND METHODS

# Participants

The study comprised 25 participants (five men; mean age = 21.32 years, SD = 1.44) who were selected from 748 university students from Lublin. The selection was based on the level of working memory capacity. For this purpose, the students performed an Operation Span Task and a Symmetry Span Task (Unsworth et al., 2005). The level of working memory capacity was calculated similarly to previous studies (Redick et al., 2012; Redick, 2014). Students with mid-range scores were selected for the study (M = 0.05, SD = 0.13) because previous research has shown differences in proactive control between people with high and low working memory capacity (Redick, 2014; Wiemers and Redick, 2018). The mood of the participants was measured by the Positive and Negative Affect Schedule (PANAS; Watson et al., 1988; Brzozowski, 2010). The participants obtained a mean of 49.96 ± 9.50 on the positive affect scale and a mean of 20.40 ± 5.82 on the negative affect scale. All participants had normal or corrected-to-normal vision and had no known neurological problems. They volunteered for the study and received a monetary reward of 70 PLN (approximately 20 USD). They were informed about the anonymity of the research, and participants gave written consent before the experiment. This study was carried out in accordance with the recommendations of the Ethical Committee of the Institute of Psychology with written informed consent from all participants. All participants gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethical Committee of the Institute of Psychology of The John Paul II Catholic University of Lublin.

# Procedure

The study applied the paradigm of AX-CPT (Rosvold et al., 1956), using the version proposed by Braver and Cohen (2001) and applied previously in research focusing on the functioning of cognitive control (Braver, 2012). The AX-CPT is a context processing task particularly applied to examine changes in the use of two types of cognitive control: proactive and reactive control. During AX-CPT trials, participants are shown pairs of letters, the first one being a cue, and the second being a probe. There are four possible sequences: (1) AX: the cue is A, and this is followed by the letter X as the probe; (2) AY: the cue is A, and it is followed by any probe other than X; (3) BX: the cue is any letter other than A, and it is followed by the X probe; and (4) BY: the cue is any letter other than A, and it is followed by any probe other than X. The participant's task is to respond in a specific way (e.g., by pressing the right mouse button) to the probe when it appears as part of an AX sequence. When exposed to the other sequences, the participant is expected to respond differently (e.g., by pressing the left mouse button). The sequences were displayed with the following frequency: AX—70%, AY—10%, BX—10%, BY—10% (Braver, 2012). This frequency is implemented to induce a strong association between the A cue and the X probe in the AX sequence.
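The sequence definitions and frequencies described above can be sketched in code. This is a minimal illustration, not the E-Prime script used in the study: the function names, the pool of non-target letters and the fixed random seed are assumptions made for the example, and the reversal of the response mapping during the session is not modeled.

```python
import random

# Hypothetical pool of non-A, non-X letters; the actual letter set is not listed in the text.
NON_TARGET_LETTERS = ["C", "D", "E", "F", "G", "J", "L", "M", "N", "P", "Q", "S", "T", "U", "V", "Z"]


def classify_sequence(cue, probe):
    """Return 'AX', 'AY', 'BX' or 'BY' for a cue-probe pair."""
    cue_type = "A" if cue == "A" else "B"        # any cue other than A counts as B
    probe_type = "X" if probe == "X" else "Y"    # any probe other than X counts as Y
    return cue_type + probe_type


def expected_button(cue, probe):
    """Target response (right button) only for AX; non-target response (left) otherwise."""
    return "right" if classify_sequence(cue, probe) == "AX" else "left"


def make_block(n_trials=100, seed=0):
    """Build one shuffled block with 70% AX and 10% each of AY, BX and BY."""
    rng = random.Random(seed)
    percentages = {"AX": 70, "AY": 10, "BX": 10, "BY": 10}
    trials = []
    for sequence, pct in percentages.items():
        for _ in range(n_trials * pct // 100):
            cue = "A" if sequence[0] == "A" else rng.choice(NON_TARGET_LETTERS)
            probe = "X" if sequence[1] == "X" else rng.choice(NON_TARGET_LETTERS)
            trials.append((cue, probe))
    rng.shuffle(trials)
    return trials
```

Because AX trials dominate, an A cue predicts the target response on 70 of every 80 A-cue trials in such a block, which is what makes the AY sequence the critical test of cue-based (proactive) preparation.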

The experimental procedure was preceded by a training session, during which the participants practiced the task. At this stage, the participants received feedback on the accuracy of responses. No such information was provided during the experimental trials. Each trial started with the presentation of a picture from the affective picture pool for 1,000 ms, followed by a blank screen shown for 100 ms. Subsequently, the cue letter was displayed for 250 ms. The interval between the contextual cue onset and the probe onset in each trial was 1,750 ms. After this period, the probe was displayed on the screen for 250 ms (see **Figure 1**). Participants had to press a button each time the probe was presented. In the AX sequence, if the X probe appeared after the A cue, they had to respond with the right button. In other sequences, they had to press the left button. Participants had to press the right button using the right index finger and the left button using the left index finger. To counterbalance the response mapping, the assignment of the response-pad buttons was reversed halfway through the procedure. All letters were displayed in black, in 28-point Arial font. Letters similar in appearance to A or X (for example, K, Y, B, H, R) were not used. Affective picture types were organized in separate blocks that were presented in random order to each participant. The experiment began with 40 practice trials. Next, participants performed 600 trials in each affective condition. Each affective condition was divided into six identical blocks of 100 trials, separated by short breaks. Stimuli were presented on a 24-inch LCD computer monitor with a display resolution of 1920 × 1080 pixels and a refresh rate of 60 Hz. Participants were seated at a viewing distance of 70 cm from the monitor. The procedure was prepared in the E-Prime software 2.0 (Psychology Software Tools Inc., Sharpsburg, PA, USA).
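The trial timing reported above can be laid out as a simple timeline. The sketch below only restates the durations given in the text; the function names are hypothetical, and the response window and inter-trial interval are omitted because they are not specified here.

```python
# Durations in milliseconds, taken from the procedure description.
PICTURE_MS = 1000
BLANK_MS = 100
CUE_MS = 250
CUE_TO_PROBE_SOA_MS = 1750  # cue onset to probe onset
PROBE_MS = 250
REFRESH_HZ = 60             # monitor refresh rate


def trial_timeline():
    """Return event times (ms from trial start) for one AX-CPT trial."""
    cue_onset = PICTURE_MS + BLANK_MS
    probe_onset = cue_onset + CUE_TO_PROBE_SOA_MS
    return {
        "picture_onset": 0,
        "blank_onset": PICTURE_MS,
        "cue_onset": cue_onset,
        "cue_offset": cue_onset + CUE_MS,
        "probe_onset": probe_onset,
        "probe_offset": probe_onset + PROBE_MS,
    }


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Number of screen refreshes for a duration, rounded to the nearest frame."""
    return round(duration_ms * refresh_hz / 1000)
```

Under these values the cue appears 1,100 ms and the probe 2,850 ms after picture onset, and every stimulus duration is a whole number of frames at 60 Hz (for example, 250 ms corresponds to 15 frames).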

In order to verify the influence of affect on cognitive control, and in line with previous studies, pictures from a standardized set of affective pictures were used. However, we used the Nencki Affective Picture System (NAPS; Marchewka et al., 2014) instead of the International Affective Picture System (IAPS; Lang et al., 1997). Our choice was motivated by the fact that the pictures in the NAPS are rated on three dimensions of affect: valence, arousal and approach-avoidance motivation. In addition, the standardization of the pictures was performed on a Polish sample (Marchewka et al., 2014). In the present study, all pictures had positive valence and they were divided into four types: (1) low level of arousal and low level of approach motivation (valence: M = 6.52, SD = 0.41; approach-avoidance motivation: M = 6.18, SD = 0.32; arousal: M = 3.63, SD = 0.29); (2) high level of arousal and high level of approach motivation (valence: M = 7.32, SD = 0.22; approach-avoidance motivation: M = 7.25, SD = 0.23; arousal: M = 5.66, SD = 0.34); (3) high level of arousal and low level of approach motivation (valence: M = 6.33, SD = 0.29; approach-avoidance motivation: M = 5.88, SD = 0.39; arousal: M = 5.42, SD = 0.48); and (4) low level of arousal and high level of approach motivation (valence: M = 7.36, SD = 0.39; approach-avoidance motivation: M = 7.35, SD = 0.27; arousal: M = 2.84, SD = 0.48). The images were displayed at a resolution of 800 × 600 pixels. The list of selected pictures can be found in the **Supplementary Material (Data Sheet 1)**.

# EEG Recording

Electroencephalograms (EEG) were continuously recorded at a sampling rate of 250 Hz with a high-input impedance amplifier (200 MOhms, EGI Inc., Model: GES 300), using an active electrode system (Brain Products 64-channel actiCAP). The EGI Net Station Version 4.4 was used in the EEG registration. Electrode impedance was maintained below 5 kOhm throughout the experiment. E-Prime 2.0 Professional was used for stimuli presentation.

# ERP Preprocessing

Preprocessing was performed in MATLAB (Mathworks, Natick, MA, USA) using EEGLAB (Delorme and Makeig, 2004). EEG data were re-referenced offline to linked mastoids. As in previous studies (van Wouwe et al., 2011; Chaillou et al., 2017), we applied a 0.01–30 Hz offline bandpass filter for the P3a, P3b and CNV components. For the N2 component, we applied a 2–12 Hz offline bandpass filter to remove the P3a component (Donkers et al., 2005). Eye movements and other non-EEG artifacts were corrected by independent component analysis (Delorme et al., 2007). Only epochs with correct responses were kept for averaging. The number of trials used for ERP averaging was controlled across conditions; this information is included in the **Supplementary Material (Data Sheet 2)**. Our ERP segmentation and analyses were based on previous ERP studies in which the AX-CPT was used (Beste et al., 2011; van Wouwe et al., 2011; Morales et al., 2015; Chaillou et al., 2017). Epochs were extracted from −200 ms to 800 ms relative to cue or probe onset, with a 200 ms pre-cue or pre-probe baseline, respectively. However, the trials for CNV analyses were segmented into 2,200 ms epochs, which were extracted from −1,950 ms to 250 ms relative to probe onset, with a 200 ms pre-cue baseline (see Beste et al., 2011).
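The segmentation and baseline correction described above can be illustrated for a single channel. This is a plain-Python sketch under stated assumptions, not the EEGLAB pipeline that was actually used: the function names are hypothetical, the signal is a simple list of voltages, and filtering, re-referencing and ICA are left out.

```python
SFREQ_HZ = 250  # sampling rate reported in the EEG Recording section


def ms_to_samples(ms, sfreq=SFREQ_HZ):
    """Convert a latency in milliseconds to a number of samples."""
    return int(round(ms * sfreq / 1000))


def extract_epoch(signal, event_sample, tmin_ms=-200, tmax_ms=800, baseline_ms=(-200, 0)):
    """Cut one epoch around an event and subtract the mean of the baseline window.

    signal: list of voltages for one channel.
    event_sample: index of the cue or probe onset in the signal.
    The epoch covers [tmin_ms, tmax_ms) relative to the event.
    """
    start = event_sample + ms_to_samples(tmin_ms)
    stop = event_sample + ms_to_samples(tmax_ms)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    # Baseline indices are expressed relative to the start of the epoch.
    b0 = ms_to_samples(baseline_ms[0] - tmin_ms)
    b1 = ms_to_samples(baseline_ms[1] - tmin_ms)
    baseline = sum(epoch[b0:b1]) / (b1 - b0)
    return [v - baseline for v in epoch]
```

For the CNV epochs, the same function could be called with `tmin_ms=-1950`, `tmax_ms=250` and `baseline_ms=(-1950, -1750)`, since the cue onset precedes the probe onset by 1,750 ms and the baseline is the 200 ms before the cue.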

The P3b component was analyzed for cue-related potentials. Analyses were conducted over the Pz electrode site because previous studies (Polich, 2007; van Wouwe et al., 2011; Morales et al., 2015) showed that P3b reaches its maximum amplitude at this electrode. The mean amplitude of P3b was calculated in the 450–700 ms time window after cue onset.

On the basis of previous studies (van Wouwe et al., 2011; Morales et al., 2015; Chaillou et al., 2017), the mean amplitude of the CNV was calculated in the time range of 200 to 0 ms before the probe presentation over the Cz electrode. This electrode was chosen because previous studies have indicated that the amplitude is the greatest here (Ruchkin et al., 1995).

The N2 component was analyzed after probe presentation for the probe-related potentials. The analyses were carried out over the FCz electrode because this site is considered to be where the amplitude is greatest (Van Veen and Carter, 2002; van Wouwe et al., 2011; Morales et al., 2015). The mean amplitude of N2 was calculated in the 250–350 ms time window after probe onset.

Also, the P3a component was analyzed for probe-related potentials over the FCz electrode. The mean amplitude of the P3a was calculated in a time range of 350–500 ms after probe onset.
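The four component measures above share one operation: averaging the voltage of an epoch within a fixed latency window at a fixed electrode. The sketch below collects the reported windows in one table and shows that averaging step; the dictionary layout and the function name are assumptions for illustration, and the epoch is assumed to be a baseline-corrected list of voltages sampled at 250 Hz.

```python
# Electrodes and windows as reported in the text (ms relative to cue or probe onset).
COMPONENT_WINDOWS = {
    "P3b": {"electrode": "Pz", "locked_to": "cue", "window_ms": (450, 700)},
    "CNV": {"electrode": "Cz", "locked_to": "probe", "window_ms": (-200, 0)},
    "N2": {"electrode": "FCz", "locked_to": "probe", "window_ms": (250, 350)},
    "P3a": {"electrode": "FCz", "locked_to": "probe", "window_ms": (350, 500)},
}


def mean_amplitude(epoch, window_ms, epoch_start_ms, sfreq=250):
    """Mean voltage of an epoch within window_ms = (start, end).

    epoch_start_ms is the latency of the first sample relative to the event,
    e.g. -200 for the standard epochs or -1950 for the CNV epochs.
    """
    i0 = int((window_ms[0] - epoch_start_ms) * sfreq // 1000)
    i1 = int((window_ms[1] - epoch_start_ms) * sfreq // 1000)
    segment = epoch[i0:i1]
    if not segment:
        raise ValueError("window lies outside the epoch")
    return sum(segment) / len(segment)
```

A call such as `mean_amplitude(epoch, COMPONENT_WINDOWS["N2"]["window_ms"], epoch_start_ms=-200)` would then give the N2 measure for one probe-locked epoch at FCz; in practice the average is usually taken over the condition-wise ERP rather than over single trials.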

# Data Analysis

For the behavioral data, statistical analyses (three-way mixed ANOVA) were conducted separately for errors and medians of response times (calculated for correct responses) with within-subject factors of APPROACH MOTIVATION (low, high), AROUSAL (lower, higher) and SEQUENCES (AX, AY, BX, BY). Moreover, Proactive Indexes were calculated separately for the medians of response times and error rates according to the formula (AY − BX)/(AY + BX) (see Braver et al., 2009; Chiew and Braver, 2014). The index ranges from −1 to +1, and values approaching +1 reflect greater involvement of proactive control. The statistical analyses (two-way mixed ANOVA) were similarly conducted for the Proactive Indexes. Simple effects were verified with the Bonferroni post hoc test.
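The Proactive Index can be written out directly from the formula (AY − BX)/(AY + BX). The following is a minimal sketch; the function names are hypothetical, and the handling of the case where both values are zero (raising an error) is an assumption, since the text does not say how such cases were treated.

```python
from statistics import median


def proactive_index(ay, bx):
    """Compute (AY - BX) / (AY + BX); lies between -1 and +1 for non-negative inputs."""
    if ay + bx == 0:
        raise ValueError("index is undefined when AY + BX is zero")
    return (ay - bx) / (ay + bx)


def proactive_index_rt(ay_rts_ms, bx_rts_ms):
    """Index based on median response times of correct AY and BX trials."""
    return proactive_index(median(ay_rts_ms), median(bx_rts_ms))
```

For example, with made-up error rates of 0.30 on AY trials and 0.10 on BX trials the index is about 0.5; a positive value means relatively more interference on AY than on BX trials, the pattern expected when the cue is used proactively.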

For the electrophysiological data, statistical analyses (three-way repeated measures ANOVA) were conducted separately for amplitudes of the CNV and P3b components with within-subject factors of APPROACH MOTIVATION (lower, higher), AROUSAL (lower, higher) and CUES (A, B). For amplitudes of the N2 and P3a components, we performed a three-way repeated measures ANOVA with within-subject factors of APPROACH MOTIVATION (lower, higher), AROUSAL (lower, higher) and SEQUENCES (AX, AY, BX, BY). The Bonferroni correction was applied to multiple comparisons. Statistical analysis of the data was performed using the SPSS 21.0 software.
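As an illustration of the Bonferroni correction for pairwise comparisons, the sketch below enumerates the pairs of a factor's levels and adjusts a list of p-values. It assumes the common convention of multiplying each p-value by the number of comparisons and capping the result at 1; whether the statistics software applied exactly this form is not stated in the text, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_labels(levels):
    """All unordered pairs of factor levels, e.g. the six pairs of four sequences."""
    return [f"{a}-{b}" for a, b in combinations(levels, 2)]


def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With the four SEQUENCES levels there are six pairwise comparisons, so an unadjusted p-value would need to fall below 0.05/6 for the adjusted value to stay below 0.05.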

# RESULTS

#### Task Performance

For error rates, there was a significant main effect of SEQUENCES (F(3,22) = 7.89, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.63). The post hoc test showed differences between the following pairs of sequences: AX-AY (p < 0.001), AY-BX (p < 0.001), AY-BY (p < 0.001) and BX-BY (p = 0.014). These results are shown in **Figure 2A**. No other main or interaction effects were significant (F < 1.08, p > 0.309).

For reaction times, there was a significant main effect of SEQUENCES (F(3,22) = 7.89, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.96). The post hoc test showed differences between the following pairs of sequences: AX-AY (p < 0.001), AX-BX (p < 0.001), AX-BY (p < 0.001), AY-BX (p < 0.001), AY-BY (p < 0.001) and BX-BY (p = 0.045). These results are shown in **Figure 2B**. No other main or interaction effects were significant (F < 1.28, p > 0.305).

For the Proactive Index (error rates), the main effect of APPROACH MOTIVATION (F(1,24) = 0.06, p = 0.808) and the main effect of AROUSAL (F(1,24) = 2.93, p = 0.100) were not significant. Also, there was no significant first-order interaction effect of APPROACH MOTIVATION × AROUSAL (F(1,24) = 0.19, p = 0.287). For Proactive Index (reaction time), there were no significant main effects of APPROACH MOTIVATION (F(1,24) = 0.70, p = 0.411) or AROUSAL (F(1,24) = 0.80, p = 0.381). Also, the first-order interaction effect of APPROACH MOTIVATION × AROUSAL was not significant (F(1,24) = 1.66, p = 0.209).

#### ERPs

#### P3b

There was a significant main effect of the factor CUES (F(1,24) = 71.38, p < 0.001, η<sup>2</sup> = 0.75). The P3b amplitude was more positive in the B cue condition (M = 3.64 µV, SE = 0.62 µV) than in the A cue condition (M = −0.25 µV, SE = 0.47 µV). Also, the main effect of the factor APPROACH MOTIVATION was significant (F(1,24) = 71.38, p < 0.001, η<sup>2</sup> = 0.74). The P3b amplitude was larger in the high-approach motivation condition (M = 2.38 µV, SE = 0.50 µV) than in the low-approach motivation condition (M = 1.02 µV, SE = 0.51 µV). There was a significant first-order interaction effect for APPROACH MOTIVATION × AROUSAL (F(1,24) = 19.50, p < 0.001, η<sup>2</sup> = 0.45). The effect showed different patterns of P3b amplitude in the high- and low-approach motivation conditions. Based on the post hoc test, the difference between low and high arousal was significant in the high-approach motivation condition (p < 0.001). Specifically, the P3b amplitude was smaller in high arousal (M = 1.82 µV, SE = 0.51 µV) than in low arousal (M = 2.94 µV, SE = 0.52 µV). The analogous difference was not observed in the low-approach motivation condition (p = 0.170). Furthermore, the post hoc test showed that the P3b amplitude was larger in high-approach motivation (M = 1.82 µV, SE = 0.51 µV) than in low-approach motivation (M = 1.27 µV, SE = 0.54 µV) in the high-arousal condition (p = 0.029). The analogous difference was observed in the low-arousal condition (p < 0.001). The first-order interaction effect for APPROACH MOTIVATION × CUES was significant (F(1,24) = 6.44, p = 0.018, η<sup>2</sup> = 0.21). The post hoc test showed that the difference between the A cue and the B cue was significant in the high- (p < 0.001) and low-approach motivation conditions (p < 0.001). We also observed differences between low- and high-approach motivation in the A cue (p < 0.001) and the B cue (p < 0.001) conditions. The results are shown in **Figure 3**. There was a significant second-order interaction effect for AROUSAL × APPROACH MOTIVATION × CUES (F(1,24) = 6.98, p = 0.014, η<sup>2</sup> = 0.23). The results are shown in **Figure 4**. Neither the main effect of the factor AROUSAL (F(1,24) = 1.56, p = 0.223) nor the first-order interaction effect AROUSAL × CUES (F(1,24) = 2.32, p = 0.141) was significant. **Figure 5** illustrates the average ERP waveforms after the A and B cues for each condition.

#### CNV

There was a significant main effect of the factor APPROACH MOTIVATION (F(1,24) = 19.41, p < 0.001, η<sup>2</sup> = 0.45). The CNV amplitude was more negative in the low-approach motivation condition (M = −1.64 µV, SE = 0.61 µV) than in the high-approach motivation condition (M = −0.21 µV, SE = 0.68 µV). Also, there was a significant first-order interaction effect for APPROACH MOTIVATION × AROUSAL (F(1,24) = 8.93, p = 0.006, η<sup>2</sup> = 0.27). The effect showed different patterns of CNV amplitude in the high- and low-approach motivation conditions. Based on the post hoc test, the difference between low and high arousal was significant in the low-approach motivation condition (p = 0.004). Specifically, the CNV amplitude was more negative in the low-arousal (M = −2.36 µV, SE = 0.69 µV) than in the high-arousal condition (M = −0.91 µV, SE = 0.60 µV). The analogous difference was not observed in the high-approach motivation condition (p = 0.451). In addition, different patterns of CNV amplitude were shown in the high- and low-arousal conditions. Based on the post hoc test, the difference between high- and low-approach motivation was significant in the low-arousal condition (p < 0.001). Specifically, the CNV amplitude was more negative in the low- (M = −2.36 µV, SE = 0.69 µV) than in the high-approach motivation condition (M = −0.01 µV, SE = 0.71 µV). No analogous difference was observed in the high-arousal condition (p = 0.271). There was a significant first-order interaction effect for AROUSAL × CUES (F(1,24) = 6.29, p = 0.019, η<sup>2</sup> = 0.21). However, no simple effects were significant. No other main or interaction effects were significant (F < 1.89, p > 0.182). **Figure 6** illustrates grand average ERP waveforms after the A and B cues for each condition.

#### N2

The main effect of SEQUENCES was statistically significant (F(3,22) = 18.98, p = 0.442, η<sup>2</sup> = 0.72). Analysis using the post hoc test showed that the N2 amplitude in the AY sequence (M = −2.78 µV, SE = 0.51 µV) was larger than in the AX (M = 0.73 µV, SE = 0.25 µV, p < 0.001), BX (M = 0.74 µV, SE = 0.26 µV, p < 0.001) and BY (M = 0.27 µV, SE = 0.32 µV, p < 0.001) sequences. Also, there was a significant main effect of the factor AROUSAL (F(1,24) = 6.81, p = 0.015, η<sup>2</sup> = 0.22). The N2 amplitude was less negative in the high-arousal (M = −0.14 µV, SE = 0.25 µV) than in the low-arousal condition (M = −0.37 µV, SE = 0.29 µV). It should be noted that the simple effects analysis showed that the difference between these conditions occurred only in the AY sequence. No other main or interaction effects were significant (F < 1.25, p > 0.315). **Figure 7** illustrates the grand average ERP waveforms after probe presentation in the AX, AY, BX and BY sequences for low and high arousal.

#### P3a

There was a significant main effect of the factor SEQUENCES (F(3,22) = 9.75, p < 0.001, η<sup>2</sup> = 0.57). The post hoc test showed that the P3a amplitude in the AY sequence (M = 3.74 µV, SE = 0.97 µV) was larger than in the AX (M = 1.09 µV, SE = 1.03 µV, p = 0.003), BX (M = −0.32 µV, SE = 0.91 µV, p < 0.001) and BY (M = 0.30 µV, SE = 0.95 µV, p < 0.001) sequences. No other main or interaction effects were significant (F < 3.22, p > 0.086). **Figure 8** illustrates the grand average ERP waveforms after probe presentation in the AX, AY, BX and BY sequences.

#### DISCUSSION

The current study investigated how approach motivation, positive affect and arousal induced by pictures affect cognitive control, particularly proactive and reactive control. Similar to previous studies (Liu and Xu, 2016; Li et al., 2018), we hypothesized that high- compared to low-approach motivated positive affect would be associated with enhanced proactive control. Also, based on the Pessoa (2009) model and LC-NE functioning (Aston-Jones and Cohen, 2005), we postulated that high compared to low arousal would be associated with reduced proactive control and enhanced reactive control. We examined these hypotheses using the AX-CPT paradigm and the electrophysiological method to measure ERP components associated with both cognitive control modes. The results mostly confirmed our hypotheses and demonstrated that high-approach motivated positive affect enhanced proactive control without any effect on reactive control. Also, they showed that high arousal induced by pictures reduced proactive control and reinforced reactive control. We first discuss the behavioral results and the effect of the analyzed factors on proactive control. We then present the influence of arousal and approach motivation on reactive control.

Our results showed no effect of any affect dimension on any AX-CPT sequences at the behavioral level. We only showed the standard effect of AY sequences. Specifically, error rates for this sequence were the highest and reaction times the longest of all sequences (see Braver et al., 2007; Braver, 2012; Cooper et al., 2017). However, our findings are similar to the behavioral results found in previous ERP studies examining the impact of affect on cognitive control in the AX-CPT paradigm (Chaillou et al., 2017; Li et al., 2018). A possible explanation for this situation could be the greater number of task trials in the ERP research than in the behavioral study. This can lead to a practice effect on behavioral performance (see Braver et al., 2009).

Regarding approach motivation, we observed that the P3b amplitude was larger in the high- than in the low-approach motivated positive affect condition. This effect did not depend on the level of arousal induced by the pictures. Also, this difference in P3b amplitude was evident for both the A and B cues, with the P3b amplitude being more positive for the B cue than for the A cue. Thus, this pattern may indicate that high-approach motivated positive affect is associated with enhanced context updating and greater utilization of cue information for both the A and B cues (see Donchin and Coles, 1988; Polich, 2007; Lenartowicz et al., 2010). This result suggests that high-approach motivated positive affect reinforces proactive control, while low-approach motivated positive affect may lead to a decrease in this mode of cognitive control. This supposition is in line with earlier studies that have shown a difference between high- and low-approach motivated positive affect in relation to cognitive flexibility and stability. Low-approach motivated positive affect enhanced cognitive flexibility and distractibility, whereas high-approach motivated positive affect increased perseverance and reduced distractibility (Liu and Wang, 2014; Liu and Xu, 2016). Greater flexibility may be associated with decreased proactive control (Dreisbach and Goschke, 2004; Dreisbach, 2006). Hence, compared with high approach motivation, low approach motivation contributes to decreasing proactive control. This is in line with our findings for the P3b component.

Concerning arousal induced by the pictures, we observed that the P3b amplitude was smaller for high than for low arousal in the high-approach motivation condition. This difference in P3b amplitude was evident only for the B cue. Hence, this pattern may indicate that low arousal is associated with enhanced context updating and greater utilization of B cue information (see Donchin and Coles, 1988; Polich, 2007; Lenartowicz et al., 2010). According to the DMC framework, proactive control engages a large number of resources to maintain contextual information. This reduces the number of goal representations in the focus of attention (Braver, 2012). Pessoa (2009, 2017) postulated that high-arousal stimuli are related to greater competition between affective stimuli and executive control for attention resources. Previous studies have shown that emotional arousal reduces activity in the cortical regions involved in cognitive control processes and enhances activity in the cortical regions involved in emotion processes (Hart et al., 2010). Also, Pessoa et al. (2012) showed that high-arousal emotional stimuli used as stop signals lead to worsened response inhibition, whereas low-arousal stop signals enhance inhibition. Similarly, Kuhbandner and Zehetleitner (2011) showed that the attentional selection of cues in a high-arousal situation is driven by stimulus salience rather than by goal-driven task relevance. Taken together, these findings suggest that proactive control requires attention resources to maintain goal-relevant information. However, these resources may be taken away by high emotional arousal. High compared to low arousal induced by pictures may lead to a reduction in proactive control, which would be reflected in the P3b amplitude.

Contrary to our hypothesis, our results showed that the CNV amplitude was more negative in the low- than in the high-approach motivated positive affect condition. The CNV component is thought to be associated with the preparatory process for the motor response, in particular where the preparation of a motor response is preceded by a cue signaling that the response is to be prepared (Ruchkin et al., 1995). This may indicate that greater engagement of proactive control is related to more effective task preparation and a larger CNV amplitude. Thus, this pattern may indicate that low-approach motivated positive affect is associated with stronger response preparation processes than high-approach motivated positive affect. Hence, the greater CNV amplitude may reflect increases in proactive control (Li et al., 2018). In line with this account, low-approach motivated positive affect may lead to an increase in this cognitive control mode. This supposition contradicts our hypothesis and the P3b results. However, other research has shown a smaller CNV component in conditions requiring active maintenance of a task goal in working memory than in conditions requiring only an anticipated simple motor reaction (Vanderhasselt et al., 2014). This is in line with previous studies indicating that the CNV amplitude decreases with increasing working memory load or increasing response interference (Tecce, 1972; Roth et al., 1975; Gevins et al., 1996; McEvoy et al., 1998). Considering that the CNV component may reflect the working memory load related to maintaining information about the cue (see Onoda et al., 2004), our results may have another interpretation. In this regard, greater active maintenance of goal-relevant information should lead to greater working memory load. This load should be reflected in the CNV amplitude. Specifically, a less negative CNV amplitude may reflect a greater working memory load.
According to the DMC framework, proactive control engages substantial resources in the active maintenance of goal-relevant information (Braver, 2012; Chiew and Braver, 2017), so it may lead to greater working memory load. In this context, decreased CNV amplitude may be an indicator of increased proactive control. In line with this explanation, low-approach motivated positive affect may decrease proactive control whereas high-approach motivated positive affect may increase this cognitive control mode. However, this interpretation requires further research and should be considered with caution. In addition, it should be taken into account that the slow positive shift may overlap with CNV activity and may influence the results obtained (Curry, 1984). This is all the more likely because the CNV confounds attention to the upcoming stimulus with preparation for the response, which take place simultaneously (Brunia et al., 2012). This could be one of several possible explanations for the different results obtained in previous studies using the AX-CPT paradigm (see van Wouwe et al., 2011; Kamijo and Masaki, 2016; Chaillou et al., 2017; Li et al., 2018).

Concerning arousal induced by the pictures, we demonstrated that the CNV amplitude was more negative in the low- than in the high-arousal condition within the low-approach motivated positive affect condition. One possible explanation for this relates to the fact that a larger CNV amplitude may reflect more effective preparation for the motor response and consequently greater engagement of proactive control. In this regard, low compared with high arousal may lead to an increase in proactive control, which is in line with our hypothesis. Another possible explanation is that our result may be associated with a greater working memory load in the high-arousal condition. However, in this situation, the working memory load may be related to the allocation of resources to emotion processing. This would be in line with the model proposed by Pessoa (2009, 2017). Thus, this pattern may indicate that active maintenance of the task goal is easier in a low- than in a high-arousal condition. Additionally, proactive control is related to the active maintenance of context representations and goal-driven behavior (Braver, 2012; Chiew and Braver, 2017). Taken together, the findings suggest that proactive control may be better supported by low than by high arousal, which is also in line with our hypothesis. However, this explanation requires further research and should be considered carefully.

We found no approach motivation effect on the N2 component. Our findings seem to be in line with previous findings by Li et al. (2018), who did not show a difference between high- and low-approach motivated positive affect with N2 amplitude. In addition, Chaillou et al. (2017) found no difference between positive and neutral affect in the N2 component. However, our results demonstrated that the N2 amplitude was more negative in low than in high levels of arousal. The results of previous studies have shown that the N2 component is a reflection of conflict monitoring related to either the inhibition of incorrect response tendencies caused by irrelevant stimuli or the choice of reaction in the face of competing alternatives (Van Veen and Carter, 2002; Nieuwenhuis et al., 2003). Hence, this pattern may indicate greater cognitive conflict and enhanced reactive control in the low- than in the high-arousal condition. However, this conflict occurs only in the AY sequence. Considering that the N2 component may reflect the actual control process (see Folstein and Van Petten, 2008), our results may have another interpretation. In this situation, greater activation of goal information (the A cue) should lead to greater interference between goal representation and the probe in the AY sequence. This interference should be reflected in the N2 amplitude. Specifically, a more negative N2 amplitude may indicate greater interference in the AY sequence. According to the DMC framework, greater interference in the AY sequence is related to enhanced proactive control (Braver, 2012; Chiew and Braver, 2017). Hence, the more negative N2 amplitude in the AY sequence may reflect increased proactive control or decreased reactive control. Considering this interpretation, our results may indicate a reduction in reactive control or enhanced proactive control in low- than high-arousal conditions. 
This explanation is in line with previous studies showing a more negative N2 amplitude in neutral affect compared with positive affect in AY sequences (van Wouwe et al., 2011).

We found neither an approach motivation effect nor an arousal effect on the P3a component. We only observed the typical effects according to sequence type: the P3a amplitude was more positive in the AY sequences compared to the AX, BX and BY sequences (Morales et al., 2015; Chaillou et al., 2017; Li et al., 2018). This result may indicate that approach motivation and arousal do not affect the control processes indexed by the P3a. However, previous research has postulated that P3a is related to focal attention and working memory mediated by DA activity (see Polich, 2007). On the other hand, other study results have suggested a relationship between P3a and the LC-NE system (see Howells et al., 2012). Therefore, the manipulation of approach motivation and arousal that we introduced could have increased the variance associated with the activation of different neuronal systems.

Our results showed that approach motivation only modified the P3b and CNV components, without effects on the N2 or P3a components. This may indicate that approach motivated positive affect only modulates proactive control. Specifically, high-approach motivated positive affect enhanced proactive control, whereas low-approach motivated positive affect reduced proactive control. Our findings seem to be in line with those of other studies (Liu and Wang, 2014; Liu and Xu, 2016). Also, we showed that arousal induced by pictures modified P3b, CNV and N2 amplitudes. Considering that P3b and CNV reflect a change in proactive control and N2 reflects variation in reactive control, it can be assumed that arousal influences both types of control. Specifically, low arousal induced by pictures promoted increased proactive control and reduced reactive control. On the other hand, high arousal induced by pictures was related to reduced proactive control and enhanced reactive control. However, the results of Fröber and Dreisbach (2012) are contrary to our findings. It should be noted that Fröber and Dreisbach (2012) did not control for approach motivated positive affect. In their study, the high-arousal pictures showed group sport and adventure scenes, and the low-arousal pictures showed babies and families. In our study, high-arousal and high-approach motivated pictures showed group and individual sport, whereas low-arousal and low-approach motivated pictures presented the faces of children and other people. Also, Fröber and Dreisbach (2012) demonstrated that affect modulates only proactive control. However, our findings indicate that approach motivation influences proactive control whereas arousal influences both proactive and reactive control. Therefore, we may very cautiously suppose that the effect observed by Fröber and Dreisbach (2012) may have been driven by approach motivated positive affect, not by arousal.
However, further research is required to explain the observed differences.

In conclusion, our study is one of the first to explore simultaneously approach motivated positive affect and arousal influence on cognitive control in an AX-CPT paradigm. Our results showed that approach motivated positive affect modulated proactive control with no effect on reactive control.

#### REFERENCES


However, arousal influenced both proactive and reactive control. These results may indicate that approach motivated positive affect may be conducive to more precise preparation of one's actions through available information. However, arousal may modify the control mechanism as a result of a cognitive conflict that may contribute to changing the goals of the action (see Cohen et al., 2004). Our findings may contribute to a better understanding of the relationship between affect and cognitive control. However, further research is needed to explain the observed results better. In particular, taking into account the DMC framework, it is important to consider how arousal impacts on the maintaining and changing of information about the goal of the action in the situation of cognitive conflict and the change of goal representation. Also to be tested, taking into account the relationship between arousal and working memory capacity (Unsworth and Robison, 2017), is whether our results would be different for people with high vs. low working memory capacity.

#### AUTHOR CONTRIBUTIONS

AC and PF: substantial contributions to the conception of the work and substantial contributions to the design of the work. AC, PA and PS: the acquisition, analysis, or interpretation of data for the work. AC and PF: drafting the work. AC, PF, PA and PS: revising the work critically for important intellectual content, final approval of the version to be published and agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

#### FUNDING

This work was supported by the National Science Centre (Poland) (Grant No. 2015/17/N/HS6/02770).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00320/full#supplementary-material




**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Cudo, Francuz, Augustynowicz and Stróżak. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Post-error Brain Activity Correlates With Incidental Memory for Negative Words

Magdalena Senderecka<sup>1</sup>\*, Michał Ociepka<sup>2</sup>, Magdalena Matyjek<sup>3</sup> and Bartłomiej Kroczek<sup>2</sup>

<sup>1</sup> Institute of Philosophy, Jagiellonian University, Kraków, Poland, <sup>2</sup> Institute of Computer Science and Computational Mathematics, Jagiellonian University, Kraków, Poland, <sup>3</sup> Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany

The present study had three main objectives. First, we aimed to evaluate whether short-duration affective states induced by negative and positive words can lead to increased error-monitoring activity relative to a neutral task condition. Second, we intended to determine whether such an enhancement is limited to words of specific valence or is a general response to arousing material. Third, we wanted to assess whether post-error brain activity is associated with incidental memory for negative and/or positive words. Participants performed an emotional stop-signal task that required response inhibition to negative, positive or neutral nouns while EEG was recorded. Immediately after the completion of the task, they were instructed to recall as many of the presented words as they could in an unexpected free recall test. We observed significantly greater brain activity in the error-positivity (Pe) time window in both negative and positive trials. The error-related negativity amplitudes were comparable in both the neutral and emotionally arousing trials, regardless of their valence. Regarding behavior, increased processing of emotional words was reflected in better incidental recall. Importantly, the memory performance for negative words was positively correlated with the Pe amplitude, particularly in the negative condition. The source localization analysis revealed that the subsequent memory recall for negative words was associated with widespread bilateral brain activity in the dorsal anterior cingulate cortex and in the medial frontal gyrus, which was registered in the Pe time window during negative trials. The present study has several important conclusions. First, it indicates that the emotional enhancement of error monitoring, as reflected by the Pe amplitude, may be induced by stimuli with symbolic, ontogenetically learned emotional significance. Second, it indicates that the emotion-related enhancement of the Pe occurs across both negative and positive conditions, thus it is preferentially driven by the arousal content of affective stimuli. Third, our findings suggest that enhanced error monitoring and facilitated recall of negative words may both reflect responsivity to negative events. More speculatively, they can also indicate that post-error activity of the medial prefrontal cortex may selectively support encoding for negative stimuli and contribute to their privileged access to memory.

Keywords: emotion, error monitoring, error-related negativity (ERN), event-related potentials (ERPs), incidental memory and learning, incidental recall, post-error positivity (Pe), stop-signal task

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Maria Montefinese, University College London, United Kingdom

Laurence Questienne, Aix-Marseille Université, France

#### \*Correspondence:

Magdalena Senderecka magdalena.senderecka@uj.edu.pl; magdalena.senderecka@gmail.com

Received: 01 February 2018; Accepted: 13 April 2018; Published: 08 May 2018

#### Citation:

Senderecka M, Ociepka M, Matyjek M and Kroczek B (2018) Post-error Brain Activity Correlates With Incidental Memory for Negative Words. Front. Hum. Neurosci. 12:178. doi: 10.3389/fnhum.2018.00178

# INTRODUCTION

fnhum-12-00178 May 4, 2018 Time: 16:14 # 2

Recent years have produced many studies investigating the interaction between emotion and error monitoring. Most of these reports focused on the long-lasting negative affect associated with psychiatric diseases or character traits (for reviews, see Vaidyanathan et al., 2012; Endrass and Ullsperger, 2014). A relatively small number of works have examined the influence on performance monitoring of short-duration affective states induced by emotional stimuli, such as pictures, film clips or sounds (e.g., Larson et al., 2006; van Wouwe et al., 2010; Senderecka, 2018). However, no study has tested whether processing of emotional words can lead to increased error detection. In addition, although it seems reasonable to assume that emotional modulation of error monitoring may be associated with more efficient encoding of affective material and its subsequent recall from memory, the link between these effects has not yet been explored. The aim of the present study was to fill these gaps by investigating the links between short-duration affective states induced by emotional words, error monitoring and incidental memory.

Thus, our study had three specific goals. First, we aimed to evaluate whether short-duration affective states induced by emotional words can enhance error monitoring, as reflected by electrophysiological indices. Second, we intended to assess whether such an enhancement (if present) is specific to unpleasant or pleasant linguistic stimuli, or is a general response to arousing material, irrespective of valence. Third, we decided to examine whether post-error brain activity correlates with incidental memory for negative and/or positive words. To reach these goals, we used behavioral measures, as well as event-related potential (ERP) components.

Error monitoring is defined as the ability to evaluate ongoing actions, detect an error and dynamically adjust performance, which is critical to the adaptive control of behavior in a frequently changing environment. It constitutes part of a larger cognitive control system and is primarily related to activity in the medial frontal cortex (Ridderinkhof et al., 2004). The electrophysiological signature of error monitoring is reflected in two components of scalp-recorded ERP: error-related negativity (ERN; Gehring et al., 1993), also called error negativity (Ne; Falkenstein et al., 1990), and post-error positivity (Pe; Falkenstein et al., 1991).

Error-related negativity is a sharp, negative deflection that occurs over the fronto-central regions and peaks at around 0–100 ms after the commission of an error (Falkenstein et al., 1990; Gehring et al., 1993). Various theories about the functional significance of ERN point to different possibilities, ranging from a mechanism that monitors the difference between an intended and an actually performed action (Falkenstein et al., 1991; Coles et al., 2001), through to the result of a conflict between simultaneously active correct and incorrect response tendencies (Botvinick et al., 2001; Yeung et al., 2004), and a signal of reinforcement learning (Holroyd and Coles, 2002). More recent findings suggest that ERN reflects an increase in attentional control, supported by enhanced activation of the medial frontal cortex, typically observed in situations demanding monitoring of ongoing actions (van Noordt et al., 2015, 2016, 2017). Meanwhile, other studies strongly indicate that the ERN amplitude reflects the subjective significance of an error (Gehring et al., 1993; Hajcak et al., 2005) or the accompanying negative affect and emotional distress (Hajcak and Foti, 2008; Inzlicht and Al-Khindi, 2012).

A second ERP component related to error monitoring, namely the Pe, is a positive wave that is more sustained than ERN and occurs over the centro-parietal regions, approximately between 100 and 400 ms after error commission (Falkenstein et al., 1991). As in the case of ERN, several accounts have been proposed regarding its functional significance. A large body of research suggests that the Pe is associated with conscious recognition of an error and increased awareness of performance abilities (Nieuwenhuis et al., 2001; Endrass et al., 2007; Larson and Perlstein, 2009; Hughes and Yeung, 2011). Other studies indicate that the Pe displays topographical similarities to the stimulus-related P3 and may thus reflect the increased motivational significance of erroneous responses, which are rare, distinctive and salient events (Leuthold and Sommer, 1999; Ridderinkhof et al., 2009; Endrass et al., 2012). This is in line with the assumption that the Pe may be a manifestation of the emotional appraisal of an error or its consequences (Falkenstein et al., 2000). Additionally, the Pe is also considered an index of the accumulation of evidence that an error has occurred (Steinhauser and Yeung, 2010; see also Ullsperger et al., 2010; Wessel et al., 2011).

Research in the last two decades has yielded a substantial body of evidence showing that long-lasting negative affect and emotional distress are usually accompanied by increased error monitoring. For example, an enhanced ERN amplitude has been demonstrated in patients with anxiety disorders (Hajcak et al., 2003; Aarts and Pourtois, 2010) and obsessive–compulsive disorder (Gehring et al., 2000; Endrass et al., 2008, 2010; for a review, see Endrass and Ullsperger, 2014). A reliable increase of ERN has also been observed among non-clinical individuals with high levels of negative affect (Luu et al., 2000; Hajcak et al., 2004). Some studies have also reported enhanced performance monitoring, as indexed by ERN amplitude, among patients suffering from major depression (Chiu and Deldin, 2007; Holmes and Pizzagalli, 2008, 2010). However, other studies have indicated that severe depression may also result in reduced ERN (Olvet et al., 2010; for a review, see Vaidyanathan et al., 2012).

While much attention has been paid to the relationship between error monitoring and long-lasting affective states, relatively less has been paid to the impact of short-term changes in emotion on error monitoring. Two previous studies have measured ERN in the context of viewing emotional pictures used to induce short-duration affective states (Larson et al., 2006; Wiswede et al., 2009a). Larson et al. (2006) observed enhanced ERN amplitude on trials in which flanker stimuli were superimposed on positive pictures, whereas Wiswede et al. (2009a) found increased ERN on flanker trials that followed the presentation of negative pictures. In turn, van Wouwe et al. (2010) observed enhanced ERN in individuals who viewed a positive film clip before being asked to complete a continuous performance task. In addition, increased ERN was reported in studies that used more abstract emotional manipulation to investigate whether error monitoring is influenced by derogatory verbal feedback, motivational impact of punishment, or induction of feelings of helplessness (Wiswede et al., 2009b; Riesel et al., 2012; Pfabigan et al., 2013). However, it is worth noting that some studies failed to find ERN amplitude modulation in response to fear or sad and happy mood induction (Moser et al., 2005; Paul et al., 2017). Importantly, using a spatial Stroop task, Ogawa et al. (2011) found reduced ERN in the condition in which erroneous responses were followed by verbal admonishment. In summary, the studies reviewed above are extremely difficult to integrate due to the substantial variability of methodology, leading to contrary results and conclusions. Furthermore, although there is sparse evidence that points to the influence of short-duration affective states on the Pe amplitude (Moser et al., 2005; Paul et al., 2017), in the majority of these studies only the first component of the ERN-Pe error-related complex was taken into consideration.

Recently Senderecka (2016, 2018) investigated the influence of emotional visual and auditory stimuli on both error-related components simultaneously in a stop-signal paradigm. Participants performed an emotional stop-signal task (SST) that required response inhibition to briefly presented aversive and neutral pictures or sounds. The analyses revealed that negative stimuli from both sensory modalities improved error monitoring by increasing the Pe amplitude. However, the ERN amplitude was comparable in the emotional and neutral conditions, which agreed with some earlier studies (Moser et al., 2005; Paul et al., 2017), but contrasted with others (Larson et al., 2006; Wiswede et al., 2009a,b; van Wouwe et al., 2010; Ogawa et al., 2011; Riesel et al., 2012; Pfabigan et al., 2013). Given the inconsistency among methodological approaches and the discrepancy in the findings reviewed above, it can be stated that there is a clear need for systematic examination of both error-related components in a series of related and similarly designed tasks. Thus, the present study aimed to expand on Senderecka (2016, 2018) by further exploring the mechanism of the emotional enhancement effect on error monitoring in the SST paradigm.

The first goal of the present study was to test whether the previous pattern of results, which points to the emotional enhancement of error detection in the SST, could be obtained with linguistic stimuli. To address this question, we used an SST requiring response inhibition to negative, positive and neutral nouns. Most of the words are entirely symbolic signs whose meaning is acquired by learning. Thus, responses to such stimuli are not based on biological predisposition and have not been shaped by evolutionary pressures, unlike responses evoked by emotional pictures and sounds, especially aversive ones (Öhman and Mineka, 2001). In line with these considerations, there is broad agreement that emotional linguistic stimuli are less arousing than other types of visual affective material such as emotional scenes or facial expressions (Vanderploeg et al., 1987; Keil, 2006; Kissler et al., 2006; Hinojosa et al., 2009). Indeed, some studies revealed that, contrary to what occurs with affective pictures (Vuilleumier et al., 2001; Schimmack and Derryberry, 2005; Verbruggen and De Houwer, 2007), emotional words are, in general, not capable of interfering with performance in ongoing cognitive tasks in healthy participants, probably because of the limited arousing power of linguistic material (for reviews, see Williams et al., 1996; Siegle et al., 2002). On the other hand, a growing body of studies indicates that arousing words are able to attract enhanced attention compared to neutral words and to influence cognitive processing across a number of experimental tasks (e.g., Carretié et al., 2008; Chiu et al., 2008; Estes and Verges, 2008; Kanske and Kotz, 2010; Herbert and Sütterlin, 2011). These latter findings clearly suggest that the emotional intensity of linguistic stimuli is associated with the degree of interference caused by them in the ongoing cognitive task. 
Thus, the present study was intended to further examine this association through the analysis of ERP correlates of error monitoring, using SST with linguistic stimuli.

The second goal of the study was to test whether emotional enhancement of error monitoring (if present) is limited to words of specific valence. Current emotional state is modulated by both valence and arousal, two affective dimensions that are widely considered to explain the variance in emotional salience (Lang et al., 1990, 1993; Lang, 1995). Valence reflects how the motivational system responds to a stimulus (either appetitively or aversively), whereas arousal reflects the intensity of its reaction. In our previous studies (Senderecka, 2016, 2018) the emotional salience of stop signals was manipulated using either threatening and neutral pictures, or aversive and neutral sounds. Thus, the positively valenced stimuli were not included in the task. For this reason, it remains unclear whether the observed emotional modulation of error monitoring in the SST was specifically related to negative valence or rather to high arousal of unpleasant stop-signals. Given that two previous flanker studies (Larson et al., 2006; Wiswede et al., 2009a) which examined error detection in the context of viewing negative and positive pictures produced divergent results (selectively increased ERN either in the negative or positive condition), it is still unknown which affective dimension is a determinant for the strength of the emotional influence on performance monitoring. Thus, if there are emotion-based changes in error detection in the SST, it would be beneficial to explore whether they can be evoked by stimuli from both affective valence categories.

The third goal of the present study was to determine whether post-error brain activity correlates with incidental memory for emotional words. Incidental memory refers to the ability to encode and maintain information without prior intention to remember (Rugg et al., 1997). Emotions exert powerful influences on learning and memory that involve different brain systems engaged at multiple stages of information processing (LaBar and Cabeza, 2006). The findings from previous studies suggest that memory for emotional words is better than for neutral words (e.g., Herbert et al., 2008a,b; Ferré et al., 2015). For instance, negative words such as death are more likely to be recalled than neutral words such as bottle (Rubin and Friendly, 1986). The memory advantage of emotional over neutral information is called the emotion-enhanced memory effect (EEM, Hamann et al., 1999; Talmi et al., 2007). Emotional valence/arousal effects of linguistic stimuli can be seen in memory performance, even when the meaning of the experimental stimuli is processed incidentally (Kissler et al., 2007, 2009; Herbert et al., 2008a,b; but see also Ramponi et al., 2010). Additionally, results indicate that one valence might affect memory performance differently than another (for a review, see Bowen et al., 2017), suggesting that memory improvement might be valence-specific.

The EEM occurs in tasks involving a long delay between an initial study phase and a later memory test, as well as in immediate free-recall memory tests, i.e., those using retention intervals of several minutes (for a review, see Murty et al., 2010). It has been suggested that in studies with long retention intervals, the EEM is primarily due to a better consolidation of emotional memory traces than that of neutral stimuli (McGaugh, 2004; Phelps, 2004). In turn, in studies with short retention intervals the EEM probably relies on a different mechanism (Talmi et al., 2007), if only because the delay between encoding and retrieval is too small to allow consolidation to occur (McGaugh, 2004). The memory improvement for affective material observed on immediate recall or after short delays may be due to multiple factors that play a significant role during encoding, such as enhanced perceptual sensitivity (Zeelenberg et al., 2006) or increased physiological arousal (LeDoux, 2000). A growing body of studies indicates that the EEM may also be a result of increased involvement of attention during encoding (Hamann, 2001; Calvo and Lang, 2004; Talmi and McGarry, 2012). Importantly, the increase in attentional control is also typically observed in situations demanding ongoing monitoring of performance (van Noordt et al., 2015, 2016, 2017). This raises the question of whether the increased involvement of attention during error detection is associated with more efficient memory performance for emotional stimuli. We can tentatively assume that error monitoring may provide an additional source of modulation for the processing of affective stimuli that may ultimately contribute to their privileged access to awareness and memory. Thus, it seems reasonable to ask whether post-error brain activity correlates with the strength of immediate recall for affective material presented within the task. 
Such a correlation analysis can provide important knowledge about memory performance in an error-monitoring context. To our knowledge, the link between these two mechanisms has not been explored yet.

The present study's hypothesis is that emotional words induce transient affective states which dynamically modulate error monitoring. We predicted that the response-locked Pe component would show increased amplitude in both negative and positive conditions. Based on our previous results (Senderecka, 2016, 2018), emotional enhancement of ERN amplitude was not expected. Finally, we assumed that post-error brain activity would be associated with incidental memory performance for emotional words.

# MATERIALS AND METHODS

#### Participants

Sixty-five volunteers (39 females and 26 males) aged 18–34 years (M = 23.7 years, SD = 4.2) were recruited via Internet advertisements and were paid the equivalent of about 5 US dollars in Polish zloty (PLN). All participants were in good health, free of medications, and had normal or corrected-to-normal vision. None reported a history of psychiatric or neurological diseases. Of the initial sample recruited for the study, three participants were excluded from the analyses because they turned out not to be native speakers of Polish; two participants were excluded because of technical problems with the EEG recording or excessive EEG artifacts; one participant was excluded due to a probable misunderstanding of the instructions which led to an extremely small number of correct responses; and another one was excluded because his mean RT deviated substantially from the mean of the sample (more than +3.0 standard deviations). The remaining 58 participants (33 females and 25 males), 18–34 years old, had a mean age of 23.4 years (SD = 3.9). The sample size was determined based on the literature (Steele et al., 2016), our previous studies (Senderecka, 2016, 2018) and power analysis. The results indicated that our sample size would allow detection of a moderate effect size (f = 0.15) with a power >80%, at an alpha level of 0.05 (Cohen, 1988).

#### Stimuli

Stimuli consisted of 81 words selected from the Nencki Affective Word List (NAWL; Riegel et al., 2015; Wierzba et al., 2015), which has recently been introduced as a standardized database of Polish words suitable for studying various aspects of language and emotions. The stimulus set contained 27 negative (e.g., anger, death, punishment), 27 positive (e.g., love, miracle, promotion), and 27 neutral (e.g., feature, product, document) nouns. Normative ratings indicated that negative words were less pleasant [t(26) = 32.84, p < 0.001, d = 1.83] than neutral words, which were less pleasant [t(26) = 32.32, p < 0.001, d = 2.00] than positive words. Both negative [t(26) = 54.62, p < 0.001, d = 2.00] and positive [t(26) = 31.47, p < 0.001, d = 2.10] words were more emotionally arousing than neutral words. However, normative ratings of arousal for negative and positive words were not significantly different from one another [t(26) = 0.50, p = 0.62, d = 0.50]. Specific words used in the study appear in Supplementary Table S1. Stimulus categories were controlled regarding word frequency, word length (numbers of letters and syllables) and imageability ratings, all Fs(2, 52) < 1; for the words' characteristics, see **Table 1**.

#### Procedure and Task

The experimental procedure was in accordance with the ethical principles of the 1964 Declaration of Helsinki (World Medical Organization, 1996) and conformed to the ethical guidelines of the National Science Centre of Poland (2016). The protocol was approved by the Research Ethics Committee at the Philosophical Faculty of the Jagiellonian University in Kraków, Poland. Participants were seated in a dimly lit, sound-attenuated, air-conditioned testing room. After providing written informed consent to participate in the study, they completed the SST with emotional and neutral words. They were asked to restrict body movements and blinking as much as possible during the recording of the EEG. Immediately after the SST, they were instructed to write on a blank sheet of paper all the words they could remember from those presented during the task.

#### TABLE 1 | Words characteristics.

<sup>a</sup>Frequency measured as the number of occurrences per million words.

The SST required participants to perform a primary binary-choice response task. Each trial began with the presentation of a black central fixation cross for 800 ms, immediately followed by the presentation of the go stimuli. Two black arrows pointing left or right served as these stimuli. They were presented randomly one at a time, for 100 ms, each with 50% probability, on a gray background in the center of a 23-inch computer monitor. Participants were instructed to respond by pressing the left or right "ctrl" key on a computer keyboard according to the direction of the arrow that was presented to them. If the arrow pointed to the left, they were to respond by pressing the left "ctrl" key using their left index finger; if the arrow pointed to the right, they were to respond by pressing the right "ctrl" key using their right index finger. In addition, they were asked to react to the go stimuli as quickly and accurately as possible.

In a random sample of 25% of the trials, an emotionally negative, positive or neutral noun followed the go stimuli for 1300 ms (in successfully inhibited trials) or until the participant's response (in unsuccessfully inhibited trials), serving as the stop signal. The words subtended between 2.6° and 7.9° of visual angle horizontally when presented onscreen at a comfortable viewing distance of approximately 65 cm, in front of the participant, at eye level. Participants were instructed to inhibit their response while viewing a word that followed the initial go stimulus, regardless of which arrow was presented. They were also told that sometimes it might not be possible to successfully inhibit their response and that in such cases they should simply continue performing the task. Overall, the importance of going and stopping was stressed equally.

Each word occurred two times during the study. A tracking method was used to vary the interval between the presentation of the go stimulus and the stop-signal (i.e., the stop-signal delay, SSD): the interval increased or decreased by 50 ms (from 100 to 400 ms) for the next stop-signal trial, depending on whether the participants successfully inhibited or failed to inhibit their response to the go stimulus in the previous stop-signal trial. Thus, there were seven possible SSDs: 100, 150, 200, 250, 300, 350, and 400 ms. After a successful inhibition, the inter-stimulus interval became longer (thereby making inhibition more difficult on a subsequent stop-signal trial); after an unsuccessful inhibition, it became shorter (making inhibition easier on a subsequent stop-signal trial). The initial value of the SSD was set to 150 ms. The staircasing was done separately for three stop-signal conditions to ensure successful inhibition in approximately 50% of the stop trials in each condition. **Figure 1** presents an outline of the SST design.
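The tracking rule described above can be illustrated with a minimal Python sketch. It is not the task code used in the study: the function and variable names are hypothetical, and holding the SSD at 100 or 400 ms when a step would leave that range is an assumption, since the text states only the range and the step size.

```python
# Values taken from the text (all in ms).
SSD_MIN, SSD_MAX, SSD_STEP, SSD_START = 100, 400, 50, 150


def next_ssd(current_ssd: int, inhibited: bool) -> int:
    """Return the SSD for the next stop-signal trial of the same condition.

    A successful inhibition lengthens the delay (harder to stop);
    a failed inhibition shortens it (easier to stop).
    Clamping at the range limits is an assumption of this sketch.
    """
    step = SSD_STEP if inhibited else -SSD_STEP
    return max(SSD_MIN, min(SSD_MAX, current_ssd + step))


def run_staircases(outcomes):
    """Track one independent staircase per stop-signal condition.

    `outcomes` is an iterable of (condition, inhibited) pairs in trial order.
    Returns a list of (condition, ssd) pairs giving the SSD used on each trial.
    """
    current = {}  # condition -> SSD to use on its next stop-signal trial
    used = []
    for condition, inhibited in outcomes:
        ssd = current.get(condition, SSD_START)
        used.append((condition, ssd))
        current[condition] = next_ssd(ssd, inhibited)
    return used
```

Because each condition keeps its own entry in `current`, a failed inhibition on a negative word leaves the delay for positive and neutral words unchanged, which mirrors the separate staircasing described in the text.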

Participants received one practice block of 40 trials before data collection to familiarize themselves with the task. In this training run we used a separate set of neutral words as stop signals. After the practice run, participants completed eight experimental blocks, each consisting of 81 trials, with short breaks between blocks. The trial order was randomized with the restriction that any given two stop trials had to have at least one go trial between them. The task was implemented using PsychoPy software (Peirce, 2007).
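One way to satisfy the ordering restriction (at least one go trial between any two stop trials) is sketched below in Python. This is an illustration under stated assumptions, not the PsychoPy implementation used in the study: the function name is hypothetical, the text does not report the number of stop trials per block, and the sketch allows a block to begin or end with a stop trial.

```python
import random


def constrained_order(n_go: int, n_stop: int, rng: random.Random) -> list:
    """Return a random sequence of 'go'/'stop' labels with no adjacent stops.

    Each stop trial is assigned its own gap around the go trials
    (before the first go, between two gos, or after the last go),
    so two stop trials can never follow each other directly.
    """
    if n_stop > n_go + 1:
        raise ValueError("too many stop trials to keep them separated")
    stop_gaps = set(rng.sample(range(n_go + 1), n_stop))
    order = []
    for gap in range(n_go + 1):
        if gap in stop_gaps:
            order.append("stop")
        if gap < n_go:
            order.append("go")
    return order


# Illustrative call: the 61/20 split of an 81-trial block is an assumption.
block = constrained_order(61, 20, random.Random(2018))
```

Choosing distinct gaps rather than shuffling and rejecting invalid orders avoids repeated resampling and gives every valid go/stop pattern the same probability.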

#### EEG Recording

The continuous scalp electroencephalogram (EEG) was recorded from 32 silver/silver-chloride (Ag/AgCl) active electrodes (with preamplifiers) using the BioSemi Active-Two system: Fp1/Fp2, AF3/AF4, F3/F4, F7/F8, FC1/FC2, FC5/FC6, T7/T8, C3/C4, CP1/CP2, CP5/CP6, P3/P4, P7/P8, PO3/PO4, O1/O2, Fz, Cz, Pz, Oz. The electrodes were secured in an elastic cap (Electro-Cap), according to the extended 10–20 international electrode placement system. The signal was continuously recorded at 256 Hz and referenced online to the CMS-DRL ground, which drives the average potential across all electrodes as close as possible to amplifier zero. Electrode offsets were kept within a range of ±20 µV. The horizontal and vertical electro-oculograms (EOGs) were monitored using four additional electrodes placed above and below the right eye and in the external canthi of both eyes. The electrical signal was not filtered during EEG acquisition. All channels were re-referenced off-line to the average of the two mastoid electrodes. The recordings were filtered off-line with a high-pass filter of 0.05 Hz (slope 24 dB/oct) and a low-pass filter of 25 Hz (slope 12 dB/oct). Ocular and other stationary artifacts were removed with the independent component analysis (ICA) algorithm using the Brain Vision Analyzer 2 (Brain Products, Munich, Germany).

#### Data Quantification

Response-locked (−100 to 600 ms relative to the key press) segments were subsequently checked and averaged. Contaminated trials exceeding maximum/minimum amplitudes of ±65 µV were rejected by a semi-automatic procedure. The mean number of rejected trials was low (1.9% on average).

Motor reaction ERPs were calculated separately for correct (Hit) and unsuccessfully inhibited (Error) responses. In addition, grand averages for incorrect responses were calculated separately for erroneous responses following negative (NEG Error), positive (POS Error), and neutral (NEU Error) stop-signal presentations. The mean number of correct, artifact-free epochs included in the ERP analysis across all participants for each of the response trial categories were as follows: Hit M = 477.1 (SD = 12.8); Error M = 77.7 (SD = 9.8); NEG Error M = 25.9 (SD = 3.1); POS Error M = 25.6 (SD = 3.7); NEU Error M = 26.2 (SD = 3.8). The minimum number of epochs was 397 for Hit, 38 for Error, 16 for NEG Error, 13 for POS Error, 9 for NEU Error. Thus, error-related components were based on no fewer than nine artifact-free error trials, a number that is sufficient to achieve stable estimates of the ERN and Pe (Olvet and Hajcak, 2009; Steele et al., 2016). Consistent with previous research on the error-related ERP components in the SST paradigm (Beyer et al., 2012), we focused on electrode Cz, where these components were found to be highest (see topographic maps in **Figure 2A**). In line with the literature (van Veen and Carter, 2002; Fiehler et al., 2005; Ullsperger and von Cramon, 2006), the mean voltage amplitudes in the post-response time-windows of 0–120 ms (ERN) and 180–300 ms (Pe) were selected. ERPs were baseline-corrected relative to the pre-response interval from −100 to 0 ms.
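The quantification of the two components can be sketched as follows in Python, assuming a single averaged response-locked waveform at Cz sampled at 256 Hz and starting at −100 ms. The function names are hypothetical, and treating each window as half-open (start included, end excluded) is an assumption, since the text does not state how window edges were handled; the analysis itself was carried out in Brain Vision Analyzer.

```python
SFREQ = 256.0          # sampling rate in Hz (from the text)
EPOCH_START_MS = -100  # latency of the first sample in each segment (from the text)


def sample_times(n_samples):
    """Latency in ms of every sample, relative to the key press."""
    return [EPOCH_START_MS + i * 1000.0 / SFREQ for i in range(n_samples)]


def mean_amplitude(erp, t_start_ms, t_end_ms):
    """Mean voltage of the samples with t_start_ms <= latency < t_end_ms."""
    times = sample_times(len(erp))
    values = [v for t, v in zip(times, erp) if t_start_ms <= t < t_end_ms]
    return sum(values) / len(values)


def baseline_correct(erp):
    """Subtract the mean of the -100 to 0 ms pre-response interval."""
    baseline = mean_amplitude(erp, -100, 0)
    return [v - baseline for v in erp]


def ern_and_pe(erp):
    """Return (ERN, Pe) mean amplitudes for the windows given in the text."""
    corrected = baseline_correct(erp)
    ern = mean_amplitude(corrected, 0, 120)    # ERN window: 0-120 ms
    pe = mean_amplitude(corrected, 180, 300)   # Pe window: 180-300 ms
    return ern, pe
```

With this convention a 700 ms segment contains about 180 samples, of which the first 26 (latencies below 0 ms) form the baseline.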

#### Statistical Analyses

To compare inhibitory performance across the three stop-signal conditions (negative, positive and neutral), two one-way repeated-measures ANOVAs were conducted on the behavioral variables: stop-signal reaction time (SSRT) and inhibition rate. The SSRT, which provides an estimate of the latency of the inhibitory process, was calculated following the procedure of Logan (1994). Reaction times from go stimuli responses in which no stop signal occurred were collapsed into a single distribution and rank ordered. The nth reaction time was selected, where n was obtained by multiplying the number of no-signal reaction times in the distribution (486) by the probability of responding (e.g., 0.5 if the global inhibition rate was equal to 50%) for each participant separately. The global SSRT was calculated by subtracting the average SSD from the nth reaction time (RT), following the horse race model (see Logan and Cowan, 1984; Verbruggen and Logan, 2008 for more detail). In turn, the SSRTs for each stop-signal condition were calculated by subtracting the negative/positive/neutral SSD from the nth reaction time, chosen based on the condition-wise probability of responding.
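The SSRT estimation described above can be sketched for a single participant as follows. The function name and the decision to round the rank n upward are assumptions of this sketch; the text does not specify how non-integer ranks were handled.

```python
import math


def ssrt_integration(go_rts_ms, p_respond, mean_ssd_ms):
    """Estimate SSRT for one participant from no-signal go RTs.

    go_rts_ms   : reaction times on go trials without a stop signal
    p_respond   : probability of responding on stop trials
                  (1 minus the inhibition rate)
    mean_ssd_ms : average stop-signal delay
    """
    ordered = sorted(go_rts_ms)                   # rank-order the RT distribution
    n = math.ceil(len(ordered) * p_respond)       # rounding up is an assumption
    n = min(max(n, 1), len(ordered))              # keep the rank inside the list
    nth_rt = ordered[n - 1]                       # nth reaction time (1-based rank)
    return nth_rt - mean_ssd_ms                   # horse race model: SSRT = nth RT - SSD
```

For a condition-wise estimate, the same function would be called with the condition-specific probability of responding and the condition-specific mean SSD.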

To analyze the amplitudes of the ERN and the Pe, two one-way repeated-measures ANOVAs were conducted (separately for each component): the first with the Response Type (Hit, Error), and the second with the Error Condition (NEG Error, POS Error, NEU Error) as factors. All continuous variables were examined with the Kolmogorov–Smirnov test; this showed that the distributions of the variables were not statistically different from the normal distribution, except for percentages of correctly recalled negative, positive and neutral words, which were thus log-transformed for Pearson correlation analysis. The critical p value was set at 0.05 for all the analyses. To interpret significant findings, global analyses were followed by restricted post hoc t-tests.
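As an illustration of the one-way repeated-measures design described above, the following sketch computes the F ratio by partitioning the total sum of squares into condition, subject and error terms. It is a textbook computation without sphericity correction, not the statistical software used in the study; the function name and data layout are hypothetical.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: list of participants, each a list with one value per condition.
    Returns (F, df_effect, df_error). No sphericity correction is applied.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj   # condition-by-subject residual

    df_effect, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error
```

With three conditions and 58 participants, the degrees of freedom are (2, 114), which matches the values reported for the Error Condition analyses.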

# RESULTS

# Behavioral Data

Behavioral results are summarized in **Table 2**. Only correct trials (>99%) were taken into consideration in the mean RT analyses of the go trials. In order to control for outliers, trials on which RT was more than 3.0 standard deviations above or below the participant's mean RT were excluded from the behavioral analysis (1.2% of trials). The mean RT of the correct go trials was 431.0 ms (SD = 58.1). The global SSRT was 206.3 ms (SD = 29.7), whereas the global SSD was 213.8 ms (SD = 54.4).
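The outlier criterion described above can be sketched as follows. The function name is hypothetical, and the use of the sample standard deviation, computed once per participant over all correct go trials, is an assumption.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, n_sd=3.0):
    """Keep RTs within n_sd standard deviations of the participant's mean RT."""
    m, s = mean(rts_ms), stdev(rts_ms)   # sample SD; an assumption of this sketch
    lower, upper = m - n_sd * s, m + n_sd * s
    return [rt for rt in rts_ms if lower <= rt <= upper]
```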

As expected because of the staircasing procedure, stop performance was approximately 50% correct in all three conditions (negative: 50.6%, positive: 50.5%, neutral: 50.1%) and no main effect of emotion was observed in the repeated-measures ANOVA analysis [F(2,114) = 0.79, p = ns]. The SSRT did not differ significantly between the three stop-signal conditions [F(2,114) = 0.58, p = ns], indicating that stop performance was comparable in the emotional and in the neutral stop-signal trials. The SSD was also comparable in all conditions [F(2,114) = 0.33, p = ns].

Incidental recall was superior for emotional relative to neutral words [F(2,114) = 41.70, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.42]. This was true both for negative words [t(57) = 6.70, p < 0.001, d = 0.98] and positive words [t(57) = 7.86, p < 0.001, d = 1.07]. Correct recall did not differ between negative and positive words [t(57) = 0.35, p = ns].
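The partial eta squared values reported with the F tests can be recovered from the F ratio and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The short sketch below applies it; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared derived from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Recall effect reported above: F(2,114) = 41.70
recall_effect = round(partial_eta_squared(41.70, 2, 114), 2)
```

Applied to the recall effect, the identity reproduces the reported value of 0.42.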

# ERP Findings

The results of the global analysis conducted on both components are presented in the upper part of **Table 3**; the mean amplitudes and standard deviations for the two components in all experimental conditions are shown in the lower part of **Table 3**. **Figure 2** presents the grand-average ERPs to the motor reaction at Cz with scalp distribution maps for difference waves.

#### ERN Component (0–120 ms)

The global analysis revealed that the main effect of Response Type was significant [F(1,57) = 38.46, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.40]. The ERPs to Error (unsuccessfully inhibited response, time-locked to the button press) showed a sharp negative peak which was attenuated in the ERPs to Hit (ΔM = 2.8 µV). The ERN amplitudes were statistically comparable [F(2,114) = 0.78, p = 0.46, η<sub>p</sub><sup>2</sup> = 0.01] in the NEG, POS and NEU Error trials (all ΔM ≤ 0.5 µV).

NEG/POS/NEU Error waves and two difference waves: NEG-minus-NEU Error and POS-minus-NEU Error. The component-specific windows examined in this study are highlighted. Error, unsuccessfully inhibited responses; Hit, correct responses to go stimuli; NEG, negative; NEU, neutral; POS, positive.

#### Pe Component (180–300 ms)

The ERPs to Error displayed sustained positive activity (following the ERN) which was absent in the ERPs to Hit (ΔM = 12.0 µV). Thus, the ANOVA showed a main effect of Response Type [F(1,57) = 209.41, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.79]. Statistical analysis revealed that the main effect of Error Condition was also significant [F(2,114) = 6.34, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.10]. The Pe amplitudes time-locked to the motor reaction in the emotional Error trials were greater than in the NEU Error trials. This was true for both NEG Error trials [t(57) = 2.75, p = 0.008, d = 0.23, ΔM = 1.3 µV] and POS Error trials [t(57) = 3.36, p = 0.001, d = 0.30, ΔM = 1.6 µV]. The Pe amplitudes did not differ between NEG and POS Error trials [t(57) = 0.54, p = ns, ΔM = 0.3 µV].

#### Correlation Analyses

Correlation analyses were performed to explore associations between memory processing and the Pe component, for which emotional enhancement effects were observed. We intended to check whether the increased Pe amplitude in the two emotional error conditions is associated with facilitated incidental recall for emotional words. The Pearson correlation analyses revealed that memory performance for negative words was significantly correlated with the Pe amplitude in the NEG Error trials (r = 0.35, p = 0.007)<sup>1</sup>. However, there was no significant correlation either between memory performance for positive words and the Pe amplitude in the POS Error trials (r = −0.04, p = ns), or between memory performance for neutral words and the Pe amplitude in the NEU Error trials (r = 0.04, p = ns). **Figure 3** shows scatterplots revealing how Pe amplitudes in negative, positive and neutral error conditions were associated with incidental recall for words from the corresponding category.

<sup>1</sup>The correlation remained significant even when one participant with an unusually high percentage (40.7%) of correctly recalled negative words was excluded from the analysis (r = 0.33, p = 0.011).

TABLE 3 | Results of the global analysis of the ERP components.

Error, unsuccessfully inhibited responses; Hit, correct responses to go stimuli; NEG, negative; NEU, neutral; POS, positive; <sup>a</sup>df = 1,57; <sup>b</sup>df = 2,114.

To further check whether the association between incidental recall and post-error brain activity is specific to the negative condition, we conducted additional correlation analyses of memory performance for negative nouns with the Pe amplitude in general (averaged across three Error conditions), as well as with the Pe amplitude in the POS/NEU Error trials. The analyses revealed significant correlations between incidental recall for negative words and the Pe amplitude both in all erroneous response trials (r = 0.33, p = 0.012) and in the NEU Error trials (r = 0.30, p = 0.023). No significant correlation was found between memory performance for negative nouns and the Pe amplitude in the POS Error trials (r = 0.26, p = ns). **Figure 4** presents scatterplots illustrating how the Pe amplitude in erroneous response trials, as well as in the positive and neutral error conditions, was associated with incidental recall for negative words. Pearson's correlations were also computed to test for possible associations between ERN amplitude and memory processing. These analyses did not reveal any significant correlation between the ERN and incidental recall, either within or across emotion categories (all ps = ns). **Table 4** presents the correlation matrix for both response-related components and memory performance for emotional and neutral stimuli<sup>2</sup>.
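The correlation analyses above pair a log-transformed recall percentage with a Pe amplitude for each participant. A minimal sketch of that computation follows. The function names are hypothetical, the use of `log1p` (which tolerates a recall of 0%) is an assumption because the text does not state how zeros were handled, and the sample size of 58 is inferred from the reported degrees of freedom, t(57).

```python
import math


def log_transform(percentages):
    """Log-transform recall percentages; log(1 + x) is an assumption."""
    return [math.log1p(p) for p in percentages]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Correlation reported for negative words and the NEG Error Pe: r = 0.35
t_neg = t_from_r(0.35, 58)
```

The resulting t statistic would then be referred to a t distribution with 56 degrees of freedom to obtain the two-tailed p value.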

## Exploratory Analyses

#### Correlation Analyses Between Recall Performance and Source Activation of the Pe

Numerous studies using dipole modeling or low resolution electromagnetic tomography (LORETA; Pascual-Marqui et al., 1994) have revealed that the Pe may be generated by multiple neuronal sources, encompassing the anterior cingulate, the midcingulate and posterior cingulate cortex, and additional sources in the insula, orbitofrontal and superior parietal cortex (van Veen and Carter, 2002; Herrmann et al., 2004; van Boxtel et al., 2005; Mathewson et al., 2005; O'Connell et al., 2007; Vocat et al., 2008; Dhar et al., 2011; Paul et al., 2017). This raises the question of which regions of the brain that contribute to the Pe generation are potentially involved in memory enhancement for negative words. To answer this question, we evaluated voxel-based Pearson's correlations between the source activation of the Pe component in the NEG Error trials, measured using standardized LORETA (sLORETA; Pascual-Marqui, 2002), and incidental recall for negative words. To obtain a more detailed picture of possible associations, Pearson's correlations were also calculated between memory performance for positive/neutral words and sLORETA source activation for the Pe in the POS/NEU Error trials respectively. Further correlation analyses were performed between incidental recall for negative words and sLORETA source activation for the Pe in erroneous response trials, as well as in the POS/NEU Error trials.

<sup>2</sup>The analogous Spearman rank correlation coefficients for both response-related components and percentages of correctly recalled negative, positive and neutral words (without applying the log-transformation) are presented in Supplementary Table S2. The results are mainly in line with those obtained using Pearson's correlations on the log-transformed data.

FIGURE 3 | Scatterplots and regression lines within emotion categories. Panel (A) presents the relationships between the Pe amplitude in NEG erroneous response trials and incidental recall performance for negative words. Panel (B) shows the relationships between the Pe amplitude in POS erroneous response trials and incidental recall performance for positive words. Panel (C) illustrates the analogous association between the Pe amplitude in NEU erroneous response trials and incidental recall performance for neutral words. NEG, negative; NEU, neutral; POS, positive.

recall performance for negative words. NEG, negative; NEU, neutral; POS, positive.

In sLORETA, computations are made in a realistic head model (Fuchs et al., 2002), using the MNI 152 template (Brain Imaging Centre, Montreal Neurological Institute; Mazziotta et al., 2001), with the three-dimensional solution space restricted to cortical gray matter and hippocampi. The intracerebral volume is partitioned into 6,239 voxels at 5 mm spatial resolution. Neuronal activity is computed as current density (µA/mm<sup>2</sup>) without assuming a predefined number of active sources. The localization accuracy of sLORETA has received considerable validation from studies combining LORETA with other methods, such as structural magnetic resonance imaging (MRI; Worrell et al., 2000), functional MRI (Vitacco et al., 2002; Mulert et al., 2004; Olbrich et al., 2009) and positron emission tomography (Dierks et al., 2000; Pizzagalli et al., 2004; Zumsteg et al., 2005). It is worth noting that even deep structures such as the anterior cingulate cortex (ACC; Pizzagalli et al., 2001) can be correctly localized with this method.

In order to identify neural correlates of memory performance, the log-transformed power of the estimated electric current density over the Pe component's time window (180–300 ms post-response onset) was correlated with the log-transformed percent of correctly recalled words, within and across emotion categories. The analyses corresponded to statistical non-parametric mapping (Holmes et al., 1996) and relied on a bootstrap method with 5,000 randomized samples. This procedure gave the exact significance thresholds regardless of non-normality and corrected for multiple comparisons. The level of significance for all of the analyses was set to p < 0.05, which corresponded to r-values above 0.46.
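The logic of a randomization test that corrects for multiple comparisons across voxels can be sketched with a maximum-statistic permutation scheme. This is a generic illustration of the idea behind statistical non-parametric mapping, not the sLORETA software's own implementation; the function names, the two-sided use of |r|, and the data layout are assumptions.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_r_threshold(voxel_power, recall, n_perm=5000, alpha=0.05, seed=0):
    """Critical |r| controlling the family-wise error rate across voxels.

    voxel_power : list of voxels, each a list with one (log-transformed)
                  current density value per participant
    recall      : one (log-transformed) recall score per participant
    """
    rng = random.Random(seed)
    shuffled = list(recall)
    max_abs_r = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between brain and behavior
        max_abs_r.append(max(abs(pearson_r(v, shuffled)) for v in voxel_power))
    max_abs_r.sort()
    # (1 - alpha) quantile of the permutation distribution of the maximum
    index = min(int(math.ceil((1 - alpha) * n_perm)), n_perm) - 1
    return max_abs_r[index]
```

Observed voxel-wise correlations whose absolute value exceeds the returned threshold would be declared significant at the chosen alpha; the text reports a critical value of r = 0.46 for the actual data.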

The analysis revealed that enhanced memory performance for negative words was associated in NEG Error trials with significantly stronger activation in the bilateral network of medial frontal brain areas, encompassing the dorsal ACC and the medial frontal gyrus; see **Figure 5**. The coordinates of local maxima are provided in **Table 5**. No cortical regions displayed a significant correlation either with the percent of correctly recalled positive words (in POS Error trials) or with the percent of correctly recalled neutral words (in NEU Error trials). Moreover, no cortical regions showed a significant correlation with the percent of correctly recalled negative words either in erroneous response trials or in the POS/NEU Error trials.

TABLE 4 | Pearson correlation matrix for ERP components' amplitude and memory performance.

Error, unsuccessfully inhibited responses; Hit, correct responses to go stimuli; NEG, negative; NEU, neutral; POS, positive. Significant effects are indicated in bold: <sup>∗∗</sup>p < 0.01, <sup>∗</sup>p < 0.05.

#### Amplitude of the P3 and the Late Positive Potential (LPP) Time-Locked to the Stop-Signal Presentation in Successfully Inhibited Trials

The ERPs time-locked to the button press in erroneous response trials and to the stop signal in unsuccessfully inhibited trials partly overlap in time due to the relatively short interval between these two kinds of events. This raises the question to what extent the erroneous-response Pe might be considered as an index of brain activation independent of that related to the stop-signal-locked P3 and LPP. This question becomes even more important as the results of previous studies suggest that emotional visual stimuli may evoke a larger P3 (e.g., Delplanque et al., 2005) and an increased LPP (e.g., Schupp et al., 2004) compared to neutral stimuli (for a review, see Olofsson et al., 2008). From this, it can be hypothesized that these two stop-signal-related positive components could be more pronounced in our study after the presentation of the emotional stop signals and could then contaminate the Pe amplitude. If the P3/LPP indeed had larger amplitudes in response to negative and positive words, the greater Pe in the emotional Error trials would not be necessarily due to the increased error monitoring, but instead to the enhanced processing of the stop signal. To rule out this possibility, we examined the amplitude of the P3 and the LPP time-locked to the stop-signal presentation in successfully inhibited trials, which are not contaminated by response-related activity<sup>3</sup>.

TABLE 5 | Brain regions showing significant Pearson correlations between memory performance for negative words and the Pe source imaging in the negative erroneous-response condition.

Maximum correlation values (r), coordinates of local maxima in MNI space, their respective Brodmann areas and number of significant voxels are listed for each region. BA, Brodmann area; X, Y, Z, coordinates in MNI space; X corresponds to the left–right; Y to the posterior–anterior; Z to the inferior–superior dimension.

Stop-signal-related data were quantified in the same way as described previously for response-related data. Stimulus-locked segments (−100 ms to 700 ms around the stop-signal onset) were aligned to the pre-stimulus baseline from −100 ms to 0 ms and averaged separately for each Stop-Signal Condition: NEG Successful Stop, POS Successful Stop and NEU Successful Stop. The mean number of correct, artifact-free epochs included in the ERP analysis across all participants for each of the stop-signal conditions was as follows: NEG Successful Stop M = 27.2 (SD = 2.7); POS Successful Stop M = 27.0 (SD = 2.7); NEU Successful Stop M = 26.9 (SD = 3.1). The minimum number of epochs was 21 for NEG Successful Stop, 21 for POS Successful Stop, and 20 for NEU Successful Stop. Thus, stop-signal-related components were based on no fewer than 20 artifact-free trials, a number that is sufficient to achieve stable estimates of the P3 (Cohen and Polich, 1997). Consistent with previous research (Zheng et al., 2011; Fritsch and Kuchinke, 2013), time windows were selected around the P3 (270–440 ms) and the LPP (440–540 ms). Mean voltage amplitudes in the component-specific windows were used for statistical analysis. In line with previously described analyses, we focused on electrode Cz. To analyze the amplitudes of the P3 and LPP, a one-way repeated-measures ANOVA was conducted separately for each ERP component with the Stop-Signal Condition (NEG, POS and NEU Successful Stop) as the factor. The distributions of the variables were not statistically different from the normal distribution. In addition, correlation analyses were performed to examine potential associations between memory processing and both stop-signal-related components. The critical p-value was set at 0.05 for all the analyses.

The grand-average ERPs to the stop signal at Cz with scalp distribution maps in successfully inhibited trials are presented in **Figure 6**. The first ANOVA revealed that the main effect of Stop-Signal Condition was not significant in the P3 time window [F(2,114) = 1.25, p = ns], contrary to the results obtained for Pe and Error Condition. The P3 amplitudes were statistically comparable in the NEG (M = 18.3 µV, SD = 7.2), POS (M = 18.1 µV, SD = 6.7) and NEU (M = 17.6 µV, SD = 8.1) Successful Stop trials. The second ANOVA showed only a weak trend toward an effect of Stop-Signal Condition for the LPP [F(2,114) = 2.69, p = 0.07, η<sub>p</sub><sup>2</sup> = 0.05]. The mean voltage amplitudes observed in the LPP-specific window were as follows: NEG Successful Stop M = 10.6 (SD = 5.4); POS Successful Stop M = 10.0 (SD = 4.8); NEU Successful Stop M = 9.4 (SD = 6.2). Moreover, correlation analyses did not reveal any significant association between stop-signal-related components and memory processing, either within or across emotion categories (all ps = ns).

<sup>3</sup>We thank the Reviewers for drawing these analyses to our attention.

Thus, the correct-stop P3 and LPP observed at electrode Cz were not substantially larger in the emotional than in the neutral stop-signal trials. They were also not associated with incidental recall for words. This pattern of results suggests that the within-condition difference in the Pe amplitude was indeed generated by error-related processes. Therefore, it seems safe to conclude that in the SST the erroneous-response Pe reflects a functionally distinct aspect of cognitive control from that associated with stop-signal-locked positivities.

# DISCUSSION

The present study had three main objectives. First, we intended to test whether short-duration affective states induced by unpleasant and pleasant nouns can lead to increased error-monitoring activity relative to a condition involving neutral nouns. Second, we aimed to check whether such an enhancement is limited to words of specific valence or is a general response to arousing material. Third, we wanted to assess whether post-error brain activity can support incidental memory for negative and/or positive words. Our initial hypothesis that error monitoring would be enhanced in the emotional conditions was confirmed. In particular, we found significantly larger error-related brain activity in the Pe time window in both negative and positive trials. Regarding behavior, enhanced processing of negative and positive words was reflected in better incidental memory. Moreover, we observed that memory performance for negative words was positively correlated with the Pe amplitude, especially in the negative condition. Following up on this correlation, we performed source localization analysis in order to estimate the neural correlates of this effect. The results of sLORETA analysis revealed that the memory recall for negative words was associated with widespread bilateral activations in the anterior cingulate gyrus and in the medial frontal gyrus.

# Emotional Enhancement of Error Monitoring

The analyses revealed that both error-related ERP components, namely the ERN and the Pe, were more pronounced in erroneous than in correct response trials; this corresponds with previous research (Falkenstein et al., 1991; Nieuwenhuis et al., 2001). We observed comparable ERN amplitudes in the neutral and in the emotionally arousing trials, regardless of their affective valence. This pattern of results is in line with our findings from two previous SST studies with threatening visual and aversive auditory stop signals (Senderecka, 2016, 2018). The lack of emotional modulation of the ERN points to the possibility that the post-response conflict (Botvinick et al., 2001; Yeung et al., 2004) or mismatch between the actual response and the desired state (Falkenstein et al., 1991; Coles et al., 2001) had a similar degree in all conditions of the task. This may also suggest that the increase in attentional control (van Noordt et al., 2015, 2016, 2017) or the decrease in dopaminergic activity (Holroyd and Coles, 2002) evoked by unexpected negative outcomes of an action were comparable across the three stop-signal categories. Finally, this result may also indicate that at the early stage of performance monitoring, the subjective significance of an error (Gehring et al., 1993; Hajcak et al., 2005) or the accompanying emotional distress (Hajcak and Foti, 2008; Inzlicht and Al-Khindi, 2012) did not differ between the negative, positive and neutral conditions.

Our results stand in contrast to previous works reporting ERN amplitude modulation in response to affective state induction (Larson et al., 2006; Wiswede et al., 2009a,b; van Wouwe et al., 2010; Ogawa et al., 2011; Riesel et al., 2012; Pfabigan et al., 2013). However, they align with less numerous yet informative studies that did not find such an influence (Moser et al., 2005; Paul et al., 2017). It should be underlined that the aforementioned studies are difficult to compare due to the substantial variability in methodology, including the nature of the task (flanker task, Stroop task, continuous performance task, go/no-go task or SST), the type of affect-induction procedure (based on bottom–up or top–down emotional manipulation) and finally the nature of the errors (errors in choice-reaction tasks or inhibition errors). Thus, these contradictory or at least equivocal findings may be attributed to specific procedure demands, and certainly call for further investigations to elucidate the influence of short-duration affective states on early stages of error monitoring.

As for the second stage of error processing represented by the Pe, a clear pattern of emotional modulation was apparent. The Pe amplitude was larger in the negative and positive trials than in the neutral ones. In the literature, the Pe has generally been associated with conscious appraisal of erroneous responses (Nieuwenhuis et al., 2001; Endrass et al., 2007; Larson and Perlstein, 2009; Hughes and Yeung, 2011) and the motivational significance of an error (Leuthold and Sommer, 1999; Ridderinkhof et al., 2009; Endrass et al., 2012). Thus, a possible interpretation of our findings may be that participants were more aware of the errors committed after the presentation of the emotional stop signals. Therefore, these errors might have been more motivationally salient and attentionally engaging for them. Alternatively, the larger Pe in the emotional trials might also have reflected an enhanced affective appraisal of errors (Falkenstein et al., 2000). One additional possibility of considerable interest is that the larger Pe reflected an enhanced accumulation of evidence that an error had occurred (Steinhauser and Yeung, 2010).

It is worth noting that these different accounts of the ERN and Pe are not mutually exclusive. Rather, they emphasize different aspects of the cognitive-emotional system responsible for goal-directed behavior. Thus, we do not purport to adjudicate between these models with our present data. Instead, we accept that there are several plausible interpretations to explain enhanced Pe amplitudes in emotional conditions of our task.

The present results, in conjunction with our previous studies (Senderecka, 2016, 2018), demonstrate that the emotional amplification of the Pe amplitude occurs across a variety of affective stimulus types. Thus, the enhancement of error monitoring evoked by task-relevant, affective material is not restricted to stimuli with evolutionary significance (e.g., threatening pictures or aversive sounds), but instead extends to material with symbolic, ontogenetically learned emotional significance. Moreover, our analyses revealed that highly arousing words from two emotional valence categories modulated the Pe amplitude in a similar way. Hence, this pattern of results indicates that the affective enhancement of error monitoring that occurs across both negative and positive conditions is preferentially driven by the arousal content of an emotional stimulus.

The present data also suggest that various components of the error processing system are differentially sensitive to diverse emotional manipulation. Given that ERN and Pe are thought to reflect independent aspects of post-error processing, with the former primarily linked to conflict monitoring (Botvinick et al., 2001; Yeung et al., 2004) and the latter preferentially associated with conscious error recognition and remedial action (Nieuwenhuis et al., 2001; Endrass et al., 2007; Larson and Perlstein, 2009; Hughes and Yeung, 2011), such results are not surprising. Considering results from previous studies, it seems reasonable to tentatively assume that ERN may be primarily modulated by trait-related affective dispositions (Vaidyanathan et al., 2012; Endrass and Ullsperger, 2014) and is relatively less sensitive to short-duration affective states induced by emotional stimuli. Simultaneously, a growing body of evidence indicates that the Pe amplitude is state-dependent and may be reliably modulated by affective stimuli presentation (Moser et al., 2005; Senderecka, 2016, 2018; Paul et al., 2017). Further research is surely needed to attain a thorough understanding of the associations between emotional states and the variability of these two error-related components.

# Links Between Post-error Brain Activity and Incidental Memory

The behavioral outcomes of the SST revealed that emotional words did not influence the stop-signal reaction time and inhibition rate, as compared to neutral words. This may suggest that the arousing power of the linguistic material was not sufficient to interfere with inhibitory performance, similarly to what has been observed in other tasks that target various cognitive functions (e.g., Williams et al., 1996; Siegle et al., 2002). However, although participants did not have to explicitly process the meaning of the words during SST, the emotional arousal effect of linguistic stimuli was seen in incidental recall. Both negative and positive nouns produced a benefit in memory performance as compared to neutral nouns; this is in line with the EEM effect observed in previous studies (for a review, see Murty et al., 2010). Participants were not forewarned of the subsequent free recall test; thus, any effect observed on the recall should be attributable to incidental learning during encoding.

The general memory improvement for affective material could be due to multiple factors that play a significant role during encoding, such as greater attentional engagement (Hamann, 2001; Calvo and Lang, 2004), enhanced perceptual sensitivity (Zeelenberg et al., 2006), or increased physiological arousal (LeDoux, 2000) in response to emotional stimuli. They can be associated with greater activation of the amygdala, hippocampus, frontal and temporal cortices, as well as the ventral visual stream during encoding (Murty et al., 2010). Altogether, these different mechanisms provide multiple, additive or interactive sources of modulation for the processing of emotional stimuli that ultimately determine their privileged access to awareness and memory systems. Thus, all these factors possibly contributed to the memory enhancement for negative and positive words in the present study.

Additionally, our analyses revealed an interesting correlation of memory performance for negative words with the Pe amplitude across all erroneous trials, as well as in the negative and neutral conditions in particular. Since our study was correlational in nature, no strong conclusions can be drawn about the exact mechanism of these effects. However, at least two interpretations can be offered to explain our findings. First, our results provide evidence that enhanced error monitoring is associated with facilitated recall of emotionally negative words that have been encoded during the experimental session. However, this correlation does not necessarily reveal a cause-and-effect relationship. Enhanced error monitoring and facilitated recall of negative words may co-occur because they both reflect responsivity to negative information<sup>4</sup>. Errors are maladaptive reactions that may put an individual in danger, whereas negative words are symbolic representations of concepts, places, or objects that are likely to threaten his or her safety. Since both errors and unpleasant linguistic stimuli are negative events, individuals who are especially sensitive to their own errors might also be particularly inclined to allocate more attention to negative words, improving their encoding and subsequent recall from memory (Hamann, 2001; Calvo and Lang, 2004; Talmi and McGarry, 2012). Consequently, in our study, those participants who showed larger Pe amplitudes could recall more negative nouns. No such association was observed between the Pe amplitude and incidental recall for positive and neutral nouns, because errors and words from the two other categories were not emotionally congruent events.
The non-causal relation between enhanced error monitoring and facilitated recall of negative words is additionally supported by the fact that memory performance for negative words was significantly correlated not only with the Pe amplitude in the negative condition but also with the Pe amplitude across all erroneous trials and in the neutral condition. Thus, it can be assumed that memory improvement for negative material and an increased amplitude of the Pe were associated with each other because they are both related to individual differences in emotionality. That is, individuals who are characterized by high sensitivity to negative events exhibit enhanced recall of unpleasant words and increased error monitoring, as indexed by the Pe amplitude.

<sup>4</sup>We thank the Reviewers for directing our attention to this interpretation.

Alternatively and more speculatively, our findings might also be interpreted as indicating a causal link between post-error brain activity and enhanced recall of negative words. This second interpretation relies on the results of source localization analysis. It revealed that enhanced recall of negative words correlated positively with the brain activity in the dorsal ACC and in the dorsomedial prefrontal cortex (dmPFC) that was registered in the Pe time window during negative trials. No such correlation was observed between memory performance for negative words and medial prefrontal brain activity during positive, neutral or globally erroneous trials. This specific pattern of results suggests that error-related brain activity in the negative condition may selectively support memory encoding for negative material. The ACC and the dmPFC are known to be engaged in both cognitive and affective processing (Phan et al., 2004). The dorsal ACC contributes to error monitoring (for a review, see Ridderinkhof et al., 2004), but can also act as a salience detector when faced with emotional stimuli (Davis et al., 2005). Moreover, both the dorsal ACC and the dmPFC are strongly activated during fear conditioning (Etkin et al., 2011). These activations probably reflect threat appraisal, accompanied by learning processes. Although the exact brain activity elicited during error monitoring and threat appraisal within learning processes may differ significantly, they nonetheless involve at least partially overlapping neural networks. Therefore, it seems reasonable to tentatively assume that the neural processes involved in error detection in the negative condition may have a facilitative effect on the neural processes underlying unpleasant stimuli encoding due to an overlap of the neural networks behind these two functions.
Within this interpretation, the EEM effect observed for positive words could be based on a different neural mechanism, probably related to enhanced perceptual sensitivity (Zeelenberg et al., 2006) or increased physiological arousal (LeDoux, 2000), but operating independently of the error monitoring process.

# CONCLUSION AND FUTURE DIRECTIONS

Using error-related ERP components and behavioral measures, this study examined the links between short-duration affective states induced by emotional nouns, error monitoring, and incidental memory. In particular, we investigated, first, whether emotional words can lead to increased error monitoring, as reflected by the ERN and Pe amplitudes, relative to a neutral task condition, and second, whether this enhancement can be differentially modulated by affective valence. Our third goal was to assess whether post-error brain activity is associated with incidental memory for negative and/or positive words.

In line with our hypothesis, we found significantly larger error-related brain activity in the Pe time window in both negative and positive conditions. In contrast, the ERN amplitudes were comparable in all types of trials, regardless of their affective valence. These findings suggest that the emotional enhancement of error monitoring, as reflected by the Pe amplitude, may be induced by stimuli with symbolic, ontogenetically learned emotional significance. They also indicate that the emotion-related enhancement of the Pe is not limited to words of a specific valence and is therefore preferentially driven by the arousal content of an affective stimulus. Moreover, they provide additional evidence that the ERN and Pe reflect independent aspects of post-error processing and are differentially sensitive to emotional manipulation.

Importantly, to our knowledge, this is the first study that has examined memory performance in an error-monitoring context. In correspondence with the EEM effect described in previous studies, we observed that both negative and positive nouns produced a benefit in incidental recall as compared to neutral nouns. Interestingly, the memory performance for negative words turned out to be positively correlated with the Pe amplitude, particularly in the negative condition. The sLORETA analysis revealed that the subsequent memory recall for negative words was associated with widespread bilateral activations in the dorsal anterior cingulate cortex and in the medial frontal gyrus, registered in the Pe time window during negative trials. These results suggest that enhanced error monitoring and facilitated recall of negative words may both reflect responsivity to negative events. More speculatively, they can also indicate that post-error activity of the medial prefrontal cortex may selectively support encoding for negative stimuli and contribute to their privileged access to memory.

Some limitations of the present work and future directions for research should be mentioned here. First, although LORETA is a widely used and empirically well-supported source localization method (Pascual-Marqui et al., 1994; Pascual-Marqui et al., 2002), the inverse solution results should always be interpreted with caution because of the imprecise nature of the mathematical reconstruction on which they are based. In addition, the present results were obtained using a small number (32) of scalp electrodes and must therefore be considered with reservation. With low spatial resolution, there is a decreased chance that LORETA will be able to effectively identify closely spaced sources (Greenblatt et al., 2005). Further research is needed to validate our findings using enhanced spatial resolution.

Second, in studies with long retention intervals the EEM probably relies on a different mechanism than in tasks involving a short delay between an initial encoding and subsequent recall (Talmi et al., 2007). In the former case, the EEM is primarily due to a better consolidation of emotional memory traces. Taking into account this divergence, further research is necessary to determine whether post-error brain activity is associated with memory performance when tested after a long delay.

Third, in the present study a large set of emotional and neutral nouns selected from a standardized database was used to induce short-duration affective states. Therefore, the choice of the linguistic stimuli precluded the examination of whether the observed association between error monitoring and memory performance can be observed for other kinds of aversive stimuli. Thus, it would surely be worthwhile to replicate the present results using emotional pictures, which from an evolutionary perspective are more biologically salient and motivationally relevant than words.

#### AUTHOR CONTRIBUTIONS

MS developed the rationale for the study, designed the experiment, analyzed the data, and wrote the manuscript. MO prepared the experimental task. MM collected the data. BK contributed analysis tools for behavioral data. All authors reviewed the manuscript.

# FUNDING

This work was supported by an Opus 10 grant (2015/19/B/HS6/00341) from the National Science Centre of Poland awarded to MS.

# ACKNOWLEDGMENTS

The authors gratefully acknowledge the help of Adrian Borowicz, Jagoda Byszko, Katarzyna Hat, and Jakub Pawlak with data recording. The authors are also grateful to the participants who volunteered to take part in the study and to Michael Timberlake who proofread the manuscript. The EEG recording was carried out in the Neurocognitive Processing Laboratory, Institute of Philosophy, Jagiellonian University, Kraków, funded under the Polish Ministry of Science and Higher Education Investment Grant (6380/IA/158/2013) led by MS.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00178/full#supplementary-material





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Senderecka, Ociepka, Matyjek and Kroczek. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Executive Functions and Performance Variability Measured by Event-Related Potentials to Understand the Neural Bases of Perceptual Decision-Making

Rinaldo L. Perri<sup>1,2</sup>\* and Francesco Di Russo<sup>2,3</sup>

<sup>1</sup> Department Unicusano, University Niccolò Cusano, Rome, Italy, <sup>2</sup> Department of Movement, Human and Health Sciences, Foro Italico University of Rome, Rome, Italy, <sup>3</sup> IRCCS Santa Lucia Foundation, Rome, Italy

Keywords: event-related potentials (ERP), decision-making, executive functions, insula, response variability, electroencephalography, frontal lobe

#### DECIDING BETWEEN DIFFERENT CHOICES: NEUROCOGNITIVE FACTORS OF DECISION-MAKING AND RESPONSE VARIABILITY

Perceptual decision-making tasks usually require subjects to recognize stimulus categories and select between different response alternatives. For example, in Go/No-go tasks, one has to respond to target stimuli and withhold responding to non-target stimuli. Accomplishing even a single trial of such a task requires a complex sequence of functions (most of them executive), including motor readiness, sustained attention, sensory processing, inhibitory control, conflict monitoring, stimulus-response mapping, context updating and, where an error occurs, error detection and awareness. In this context, the motor response reflects the behavioral outcome of the fast and proper interaction of the above-mentioned processes, and response consistency (or variability) is often adopted as an index of executive functioning.

One current challenge for cognitive neuroscience is to understand how executive functions enable decision-making. Understanding decisional processes, and the reasons for decision failure, would help to clarify the executive dysfunctions of clinical conditions such as obsessive-compulsive disorders, impulsivity, and addictions (typically understood as a failure of inhibition; Chamberlain et al., 2005; Crews and Boettiger, 2009; Álvarez-Moya et al., 2011), as well as success in real-life tasks (e.g., car driving; Bunce et al., 2012) and goal-directed behaviors (e.g., complying with diet schedules; Jahanshahi et al., 2015). In this context, response variability provides a behavioral index of the efficiency of frontal cognitive control (Bellgrove et al., 2004). This association has been suggested since the first half of the twentieth century, when Head (1926, p. 145) reported that "an inconsistent response is one of the most striking consequences of lesions to the cerebral cortex." More recently, a consistent body of literature has indicated response variability as an indirect index of top-down control (Tamm et al., 2012), executive functioning (Swick et al., 2013), neurological (Segalowitz et al., 1997; Hultsch et al., 2000) and psychiatric conditions (Barkley et al., 1992; Vinogradov et al., 1998; Leth-Steensen et al., 2000), and frontal lobe integrity (Bunce et al., 2007; Walhovd and Fjell, 2007; Lövdén et al., 2013). The frontal cortex is in fact considered the main region supporting executive functions and behavioral variability (Stuss et al., 2003), as revealed by the poor response consistency and accuracy of frontal patients performing a decision-making task (Arnot, 1952; Stuss et al., 1999, 2003; Picton et al., 2007).

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Claude Alain, Rotman Research Institute (RRI), Canada

#### \*Correspondence:

Rinaldo L. Perri rinaldo.perri@uniroma1.it

Received: 27 September 2017 Accepted: 01 November 2017 Published: 15 November 2017

#### Citation:

Perri RL and Di Russo F (2017) Executive Functions and Performance Variability Measured by Event-Related Potentials to Understand the Neural Bases of Perceptual Decision-Making. Front. Hum. Neurosci. 11:556. doi: 10.3389/fnhum.2017.00556

Even though the relationship between executive functioning and performance variability is evident at the group level (e.g., in the comparison between high- and low-level athletes in sport; Vestberg et al., 2012), the mediating role of response variability at the intra-individual level is less well known. The mediating role of PFC activity in intra-individual variability is also still unclear because neuroimaging studies have yielded contrasting results: two studies reported greater dorsolateral PFC (DLPFC) activation associated with high intra-individual variability (Bellgrove et al., 2004; Simmonds et al., 2007), while Weissman et al. (2006) reported reduced pre-stimulus activity of the right DLPFC in the less consistent trials. In other words, depending on the main findings, the neuroimaging literature has interpreted high individual variability either as the consequence of a greater need for top-down executive control (enhanced PFC activation) or in terms of lapses in attention (reduced PFC activation).
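To make the notion of intra-individual response variability concrete, the following minimal sketch computes two indices that are commonly used for this purpose: the intra-individual standard deviation (ISD) of reaction times and the coefficient of variation (ICV, the ISD divided by the mean). The text above does not state which index the cited studies adopted, so the choice of indices, the function name `rt_variability`, and the reaction times are illustrative assumptions only.

```python
from statistics import mean, stdev


def rt_variability(rts_ms):
    """Return (mean RT, ISD, ICV) for one participant's reaction times in ms."""
    if len(rts_ms) < 2:
        raise ValueError("need at least two reaction times")
    m = mean(rts_ms)
    isd = stdev(rts_ms)   # sample standard deviation across trials
    icv = isd / m         # unitless: variability relative to mean speed
    return m, isd, icv


# Hypothetical data: two participants with the same mean speed
# but very different trial-to-trial consistency.
consistent = [400, 410, 395, 405, 390]
variable = [300, 520, 380, 610, 190]

mean_c, isd_c, icv_c = rt_variability(consistent)
mean_v, isd_v, icv_v = rt_variability(variable)
```

Both hypothetical participants have the same mean reaction time, so only the variability indices separate them, which is the sense in which consistency can carry information beyond average speed.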

# ERPS AND EXECUTIVE FUNCTIONS: STATE OF THE ART AND MAIN LIMITATIONS

Identifying the neurophysiological correlates of executive functioning requires investigating different cognitive abilities, which in part depend on the experimental paradigm: for example, in Stroop or sustained attention tasks (Demeter and Woldorff, 2016), voluntary selective attention would be taxed more than in Go/No-go tasks, in which accumulation of sensory evidence would be decisive, or oddball tasks, where decision-making is affected by expectancy, or stop-signal tasks, in which the so-called "reactive inhibition" is often required (for a review see Jahanshahi et al., 2015). It is also relevant to note that decisional processes work in a narrow temporal window, such as the time needed to perform a single trial in a speeded decision-making task. This constraint requires researchers to adopt a technique with adequate temporal resolution, which means that neuroimaging studies may not be the most suitable for investigating the fast temporal succession of decisional processes. Moreover, as also suggested by Bogacz et al. (2010), the duration of the decision processes can affect the amplitude of the BOLD signal; functional magnetic resonance imaging (fMRI) findings should therefore be interpreted with caution when studying decision-making. In other words, the long time needed to make a decision (leading to large response variability) could explain the greater PFC activation reported by some studies (Bellgrove et al., 2004; Simmonds et al., 2007). Conversely, even if less informative about the anatomical source of the observed activities, electroencephalography (EEG), and especially the event-related potentials (ERPs) technique, is particularly appropriate for capturing the fast succession of the brain's decisional events.
However, most of the ERP literature in this field focused on post-response activities like the central-parietal P3 (Segalowitz et al., 1997; Saville et al., 2011), and the error-negativity (Ne) and error positivity (Pe) in case of error commission (Falkenstein et al., 1991, 1995, 1996). Similarly, when focused on the pre-movement activities, electrophysiological studies mainly observed the frontal-medial modulation of the N2 component (Bokura et al., 2001; van Boxtel et al., 2001; Nieuwenhuis et al., 2003), whose functional role is still debated (Perri et al., 2015b; Di Russo et al., 2017). In other terms, the main limitation of ERP literature was the lack of a solid background on the contribution of the executive functions of the frontal cortex in the decisional processes. In fact, except for the motor preparation activities of the frontal areas, as reflected by the Bereitschaftspotential (BP; e.g., Shibasaki and Hallett, 2006) and the lateralized readiness potential (LRP; e.g., Rinkenauer et al., 2004), only the very recent literature started to report the ERP correlates of the PFC in the executive functioning and variability (for a review see Di Russo et al., 2017).

# LOOKING INTO THE FRONTAL LOBES: EMERGING EVIDENCE ON THE ERP CORRELATES OF EXECUTIVE FUNCTIONS

When performing a decision-making task, the contribution of the PFC executive functions is manifested mainly through cognitive processes like top-down attentional control, maintenance of information in the short-term memory, ability to ignore distractors and focus on relevant features, and inhibition of the wrong schema and selection of the appropriate one. Since even one of these processes would be able to affect the inter- and intra-individual variability of the performance, they should be dissociated and investigated separately.

Recent ERP studies described different pre-movement activities within the frontal cortex emerging both before and after the stimulus appearance in decisional tasks: as reviewed by Di Russo et al. (2017), there is a growing body of evidence defining the mediating role of these components in the variability of executive functions. It was shown that before the appearance of a stimulus, that is, in the preparation stage, it is possible to detect at frontopolar sites the so-called prefrontal negativity (pN) component, as shown in **Figure 1A** and in the left side of **Figure 1C**, together with the more posterior BP (e.g., Di Russo et al., 2013). The bilaterally distributed pN was described as the electrophysiological correlate of inferior frontal gyrus (iFg) activity (Di Russo et al., 2016; Sulpizio et al., 2017), especially involved in proactive inhibitory control (Perri et al., 2016; Bianco et al., 2017a,b,c) and top-down attentional control (Perri et al., 2014a, 2015a, 2017). Another ERP component is the dorsolateral pN (DLpN), which over the right hemisphere was associated with modulation of baseline levels in the accuracy system (i.e., the larger the right DLpN, the poorer the accuracy performance; Perri et al., 2014b; Lucci et al., 2016). The mediating role of the pN component in intra-individual variability emerged through studies showing that enhancement of this activity predisposes subjects to effective inhibitory control (Perri et al., 2016) and, conversely, that its reduction predicts poor attentional control leading to response omission (Perri et al., 2017). A prefrontal activity compatible with the pN was described by West and Alain (2000) using a Stroop task: the findings are consistent in revealing that momentary lapses of attention are associated with a pre-stimulus change in ERP activity over the frontal regions. Moreover, inter-individual variability of performance was also shown to be associated with the individual level of pN activity: in other words, the more consistent performers are marked by larger pN activity than the less consistent ones (Perri et al., 2015a). If the pN component can be described as a sort of readiness activity (a cognitive disposition to perform the task), there are also executive functions that work in the information processing stage, that is, after stimulus onset. In this regard, the ERP literature identified a complex of three components that were labeled as prefrontal N1 (pN1), prefrontal P1 (pP1), and prefrontal P2 (pP2): they were respectively associated with sensory awareness, sensory-motor integration, and the stimulus-response mapping process (for a review see Di Russo et al., 2017). The main generator of the prefrontal ERPs was localized in the bilateral anterior insula (Di Russo et al., 2016; Sulpizio et al., 2017), and these components are typically detected 80–400 ms after the stimulus appearance; however, while the pN1 and pP1 components reflect top-down perceptual processing of any stimulus to be processed (even in a passive-perception task), the later pP2 can be detected only in the presence of decisional requests, that is, the need to classify the information by matching it with the relative response (stimulus-response mapping). **Figure 1B** shows the ERP waveforms of the pN1, pP1 and pP2 together with the well-known N2 and P3. **Figure 1C** (right) shows the pP2 scalp topography concomitant to the N2.
It is noteworthy that the pP2 component was also labeled as Go-P2 (Gajewski and Falkenstein, 2013), anterior P2 (P2a; Potts et al., 1996), frontal selection positivity (FSP; Kenemans et al., 1993), and frontal P3 (P3f; Makeig et al., 1999), and in all cases larger amplitudes were reported for target than non-target trials, regardless of task and response modality (finger movement, speech, silent count). Further, modulations of this component were repeatedly associated with group differences in decisional speed (Perri et al., 2014b; Bianco et al., 2017c), as well as with accuracy variability at both the inter-individual (Di Russo et al., 2013; Perri et al., 2014b) and intra-individual level (Perri et al., 2015b, 2017).

In conclusion, a growing literature reveals the utility of ERPs in the study of the executive functions of the prefrontal cortex. Even though ERPs offer lower spatial resolution than neuroimaging scans, the recent ERP literature has proven able to overcome concerns, such as source localization and the presence of artifacts at anterior sites, that in the past may have limited EEG investigation of frontal executive functioning. Indeed, the background reviewed here on the functional roles and generators of prefrontal ERPs suggests that they are a promising tool to foster a new approach in the neurocognitive study of executive functions. The extensive review of Di Russo et al. (2017) revealed the potential role of the prefrontal ERPs in identifying the cognitive factors mediating the variability of performance within subjects and between groups. Also, since it was shown that prefrontal ERPs are affected by neurological disease and can be modified by rehabilitation (Di Russo et al., 2013) and sport training (Bianco et al., 2017b,c), future studies may clarify which cognitive factors could operate on them. Similarly, investigation of the prefrontal ERPs in clinical populations would be useful to shed new light on the strength of the relationship between prefrontal lesions and executive functions (for a review see Alvarez and Emory, 2006).

#### AUTHOR CONTRIBUTIONS

RP: conception and writing of the work; FD: contribution to writing and conception of the work, critical revision of the text; RP and FD: approved the final version of the manuscript.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Perri and Di Russo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Individual Differences in Verbal and Spatial Stroop Tasks: Interactive Role of Handedness and Domain

Mariagrazia Capizzi<sup>1</sup>\*, Ettore Ambrosini<sup>1</sup> and Antonino Vallesi<sup>1,2</sup>

<sup>1</sup>Department of Neuroscience, University of Padova, Padova, Italy, <sup>2</sup>San Camillo Hospital IRCCS, Venice, Italy

A longstanding debate in psychology concerns the relation between handedness and cognitive functioning. The present study aimed to contribute to this debate by comparing performance of right- and non-right-handers on verbal and spatial Stroop tasks. Previous studies have shown that non-right-handers have better inter-hemispheric interaction and greater access to right hemisphere processes. On this ground, we expected performance of right- and non-right-handers to differ on verbal and spatial Stroop tasks. Specifically, relative to right-handers, non-right-handers should show a greater Stroop effect in the color-word Stroop task, for which inter-hemispheric interaction does not seem to be advantageous to performance. By contrast, non-right-handers should be better able to overcome interference in the spatial Stroop task, owing to their preferential access to the right hemisphere, which deals with spatial material, and their greater inter-hemispheric interaction with the left hemisphere, which hosts Stroop task processes. Our results confirmed these predictions, showing that handedness and the underlying brain asymmetries may be a useful variable to partly explain individual differences in executive functions.

#### Edited by:

Francesco Di Russo, Foro Italico University of Rome, Italy

#### Reviewed by:

Keith Brandon Lyle, University of Louisville, United States Bruno Kopp, Hannover Medical School, Germany

#### \*Correspondence:

Mariagrazia Capizzi mariagrazia.capizzi@unipd.it; mgcapizzi@hotmail.com

Received: 25 July 2017 Accepted: 25 October 2017 Published: 10 November 2017

#### Citation:

Capizzi M, Ambrosini E and Vallesi A (2017) Individual Differences in Verbal and Spatial Stroop Tasks: Interactive Role of Handedness and Domain. Front. Hum. Neurosci. 11:545. doi: 10.3389/fnhum.2017.00545

Keywords: hemispheric lateralization, brain asymmetries, spatial processing, verbal processing, hand preference

# INTRODUCTION

"I may be left-handed, but I'm always right!" is just one of the many quotes circulating on the web that ironically attest to the negative values and connotations traditionally associated with left-handedness. As an example, consider that in some societies left-handed children were often forced to use the right hand for tasks they would naturally perform with their left hand, such as writing (e.g., Klöppel et al., 2007). Left-handedness is now more widely accepted, though right-handed people still make up the majority of the population (∼90%; Corballis, 2003; see also Peters et al., 2006).

Contrary to popular belief, hand preference represents a valuable opportunity that nature provides us with to explore the hemispheric organization of the human brain. Summing up the key findings from previous neuroanatomical studies (e.g., Witelson, 1985, 1989; Habib et al., 1991; Witelson and Goldsmith, 1991; Tuncer et al., 2005), non-right-handers (i.e., left- and mixed-handers) have, on average, a larger corpus callosum than right-handers. This implies better inter-hemispheric interaction, which means better coordination across both hemispheres, for left-handers compared to right-handers (e.g., Cherbuin and Brinkman, 2006). Mixed-handedness has also been associated with increased right hemispheric activity at rest (e.g., Propper et al., 2012).

What do these anatomical and functional differences between non-right- and right-handers tell us about cognitive functioning? Specifically, can handedness give enhanced insights into individual differences in behavioral performance and, if so, to what extent? This intriguing question has stimulated a great deal of work with mixed results so far. In a recent review, Prichard et al. (2013) concluded that, to overcome the current impasse on the topic, it is necessary to move away from the use of direction of hand preference, resting on the comparison of left- vs. right-handers, and focus instead on consistency of handedness, comparing inconsistent/mixed-handers (ICH) vs. consistent/strong-handers (CH). Relative to ICH, CH use "the dominant hand for virtually all common manual activities" (Prichard et al., 2013, p. 1). In line with Prichard et al. (2013), consistency of handedness has been shown to be a good predictor of performance in many cognitive domains. Specifically, ICH exhibit superior performance on tasks that require access to right-hemisphere processes and that implicate inter-hemispheric interaction, such as memory retrieval and belief updating/cognitive flexibility tasks (e.g., Jasper and Christman, 2005; Propper et al., 2005; Lyle et al., 2012). Overall, these findings have been taken as evidence for the argument that "consistent vs. inconsistent handedness is associated with decreased vs. increased interhemispheric interaction and with decreased vs. increased right hemisphere access, respectively" (Prichard et al., 2013, p. 1). However, the debate about whether hand preference, and in particular consistency of handedness alone, represents a useful variable to explain performance is far from over (e.g., Hardie and Wright, 2014). The main reasons are briefly outlined below.

The distinction between CH and ICH is usually based on a median split of scores on one of the most widely used questionnaires to measure handedness, namely, the Edinburgh Handedness Inventory (EHI; Oldfield, 1971; see also Edlin et al., 2015). When the median split is performed on the raw EHI scores, direction of handedness and consistency of lateralization may be conflated. In such a case, the consistent group is composed of consistent right-handers only, whereas the inconsistent group includes inconsistent right-handers, inconsistent left-handers and consistent left-handers. The same problem still holds for those studies that exclude consistent left-handers from the analyses (e.g., Propper et al., 2005). To avoid this, a common procedure is to perform the median split on the absolute EHI scores, instead of the raw ones, to group CH (whether left or right) into one category and ICH (whether left or right) into another category (e.g., Lyle et al., 2012). This approach too, however, might be criticized to the extent that dichotomization of a continuous variable (like the EHI scores) into a categorical measure can lead to biased results (e.g., DeCoster et al., 2009). Taking into account all these issues related to handedness as a categorical variable, here we performed polynomial regression analyses on the continuous EHI scores to explore both direction and consistency of handedness. We will elaborate further on this issue in the "Data Analysis" section.
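The conflation described above can be illustrated with a minimal sketch that applies a median split to raw and to absolute scores. The scores below are hypothetical laterality quotients on an assumed −100 (fully left) to +100 (fully right) scale, and the function name `median_split` and the tie-handling rule (scores at or below the median count as ICH) are illustrative choices rather than the procedure of any cited study.

```python
from statistics import median


def median_split(scores, use_absolute):
    """Label each score 'CH' (consistent) or 'ICH' (inconsistent).

    With use_absolute=False the split is made on the signed scores;
    with use_absolute=True it is made on their absolute values.
    """
    values = [abs(s) for s in scores] if use_absolute else list(scores)
    cut = median(values)
    return ["CH" if v > cut else "ICH" for v in values]


# Hypothetical EHI scores; the first two are strong left-handers.
ehi = [-100, -90, -20, 10, 40, 60, 80, 90, 100, 100]

raw_labels = median_split(ehi, use_absolute=False)
abs_labels = median_split(ehi, use_absolute=True)
```

In this toy sample the two strong left-handers fall into the inconsistent group under the raw split but into the consistent group under the absolute split, which is the grouping problem the absolute-score procedure is meant to avoid. Either split still discards information by dichotomizing, which is the concern that motivates analyzing the continuous scores instead.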

To investigate the relationship between handedness and cognitive functions, the present study focused on the Stroop task (Stroop, 1935; MacLeod, 1991). In a typical color-word Stroop paradigm, participants are presented with words denoting different colors. The association between the ink color in which the word is displayed and the meaning of the word can be either congruent (e.g., RED printed in red) or incongruent (e.g., RED printed in blue). The participant's task is to identify the ink color of the word and ignore its meaning. A robust finding that emerges in the Stroop task is the so-called "Stroop effect", which refers to a drop in performance in incongruent compared to congruent color-word matching.
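As a minimal sketch of how such an effect can be quantified, the example below takes the difference in mean reaction time between incongruent and congruent trials. The trial records, field names, and the choice of reaction time (rather than accuracy) as the performance measure are assumptions made for illustration; the scoring actually used in the present study is described in its analysis section, not here.

```python
from statistics import mean


def stroop_effect(trials):
    """Mean RT on incongruent trials minus mean RT on congruent trials (ms)."""
    congruent = [t["rt"] for t in trials if t["word"] == t["ink"]]
    incongruent = [t["rt"] for t in trials if t["word"] != t["ink"]]
    return mean(incongruent) - mean(congruent)


# Hypothetical trials: the word shown, the ink color, and the RT in ms.
trials = [
    {"word": "RED", "ink": "RED", "rt": 520},    # congruent
    {"word": "RED", "ink": "BLUE", "rt": 640},   # incongruent
    {"word": "BLUE", "ink": "BLUE", "rt": 500},  # congruent
    {"word": "BLUE", "ink": "RED", "rt": 620},   # incongruent
]

effect_ms = stroop_effect(trials)
```

A positive value indicates slower responses when word meaning and ink color conflict; a larger value corresponds to greater interference.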

A general agreement exists that successful performance in the Stroop task requires the ability to overcome a prepotent and automatic tendency (i.e., reading the word) in order to implement, in its place, a less spontaneous process (i.e., identifying the ink color), a series of operations that collectively tap into the construct of cognitive control (e.g., MacDonald et al., 2000; Koechlin et al., 2003; Braver, 2012). Converging evidence from neuropsychological (e.g., Perret, 1974; Gläscher et al., 2012; Tsuchida and Fellows, 2013; Geddes et al., 2014; Cipolotti et al., 2016), structural (e.g., Vallesi et al., 2017) and functional magnetic resonance imaging (fMRI) data (e.g., Floden et al., 2011; see Derrfuss et al., 2005; Laird et al., 2005; Cieslik et al., 2015, for meta-analyses) points to the selective involvement of left brain areas in the color-word Stroop task, corroborating the claim that some executive functions may be fractionated along the left-right axis of the human brain (see Stuss, 2011; Vallesi, 2012).

Based on previous findings on handedness and Stroop (Christman, 2001), we predicted worse performance for non-right-handers compared to right-handers in a typical color-word Stroop paradigm, a task mainly lateralized to the left side of the brain. According to Christman (2001), left-handers would be impaired at keeping word and color dimensions of Stroop stimuli separate because of their greater degree of inter-hemispheric interaction. Importantly, the same study also showed that left-handers outperformed right-handers on a version of the local-global task requiring integration of left and right hemispheric processes. These findings open up the possibility that non-right-handers could outperform right-handers should the Stroop task involve spatial rather than verbal material. Indeed, given that processing of spatial information recruits the right hemisphere more (e.g., Deutsch et al., 1988; Shulman et al., 2010), and that the cognitive processes underlying the Stroop task tend to be lateralized to the left one (e.g., Floden et al., 2011; Gläscher et al., 2012; Tsuchida and Fellows, 2013; Geddes et al., 2014; Cipolotti et al., 2016), it is reasonable to expect that greater collaboration between the two hemispheres should result in better performance in this context. Supporting our rationale, there is evidence that left-handers are facilitated in tasks that engage the right hemisphere for visuo-spatial activities (e.g., Beratis et al., 2013). For example, these authors showed that left-handers performed better than right-handers on the Trail-Making Test-B (TMT-B), a task that has been related to the functioning of the right hemisphere (e.g., Jacobson et al., 2011; Kopp et al., 2015).

In sum, to our knowledge, no study has so far directly compared verbal and spatial Stroop tasks as a function of handedness within a single experimental session. This manipulation allows exploring whether hand preference may explain individual differences in tasks that target the same cognitive operation (i.e., resistance to interference) but that diverge in the degree of inter-hemispheric interaction they require. Moreover, our protocol may partly add to the understanding of the hemispheric organization of executive functions in the brain, an issue that still remains debated and poorly understood (e.g., Badre and D'Esposito, 2007; Jacobson et al., 2011; Kim et al., 2012; Geddes et al., 2014; Babcock and Vallesi, 2015; Capizzi et al., 2016; Cipolotti et al., 2016; for a review, see Vallesi, 2012).

#### MATERIALS AND METHODS

#### Participants

An initial sample of 246 university students took part in the study as part of a larger research project, for which they were asked to fill in the EHI among other questionnaires. This allowed us to categorize them according to their hand preference. Forty-three extra participants (41 with an EHI score below 0 and 2 with an EHI score equal to 0) were then recruited through social media advertisements targeting non-right-handers in order to obtain an adequate sample of this population. However, these participants were debriefed on the precise nature of the study only once the experimental session was concluded<sup>1</sup>.

Data from two participants were discarded due to poor performance (<50% accuracy in either of the two Stroop tasks). The remaining 287 participants (mean age: 23.3 years, age range: 19–34 years, 171 females) were included in the analyses. All participants had normal or corrected-to-normal visual acuity and reported normal color vision. Participants were compensated for their time and gave written informed consent prior to participation. The procedure of the study was approved by the Bioethical Committee of the Azienda Ospedaliera di Padova and the study was conducted according to the guidelines of the Declaration of Helsinki.

Handedness was determined with the original version of the EHI (Oldfield, 1971), which provides a score ranging from −100 (extreme left-handedness) to +100 (extreme right-handedness). In our sample, 232 participants (mean age: 23.28 years, age range: 19–34 years, 140 females) had EHI scores above 0 (mean score = 82.31, SD = 17.23, range = 10–100), 52 (mean age: 23.29 years, age range: 20–34 years, 30 females) below 0 (mean score = −61.15, SD = 31.32, range: −10 to −100) and 3 (mean age: 24.33 years, age range: 23–25 years, 1 female) had no overall preference (EHI score = 0). The EHI score mean of the whole sample was 55.45 (SD = 59.19) with an EHI score median of 80.
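For readers unfamiliar with the EHI, the score is conventionally expressed as a laterality quotient computed from right- and left-hand preference marks. The following is a minimal sketch, assuming the standard Oldfield-style scoring; the function name and the input format are illustrative and are not taken from the study's materials.

```python
def laterality_quotient(right_marks, left_marks):
    """Return a laterality quotient, 100 * (R - L) / (R + L).

    right_marks / left_marks: per-item counts of preference marks
    (hypothetical input format chosen for illustration).
    """
    r = sum(right_marks)
    l = sum(left_marks)
    if r + l == 0:
        raise ValueError("no preference marks recorded")
    return 100.0 * (r - l) / (r + l)


# Hypothetical respondent: every item marked for the right hand only.
print(laterality_quotient([2] * 10, [0] * 10))
```

A respondent with equal marks for both hands obtains a score of 0, which corresponds to the "no overall preference" group described above.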

#### Tasks and Procedure

Participants were tested individually in a quiet and normally illuminated room. The color-word and the spatial Stroop tasks were shortened versions of the ones used in Puccioni and Vallesi (2012a,b). The two Stroop tasks were presented in a counterbalanced order across participants along with other behavioral tasks not reported here.

In the color-word Stroop task, stimuli consisted of four Italian color words: ROSSO (red in Italian), BLU (blue), VERDE (green) and GIALLO (yellow). Each word was presented individually in one of four ink colors (red, blue, green and yellow) so as to yield congruent and incongruent color-word pairings. Participants were required to identify the ink color and ignore the meaning of the word through a key press on the computer keyboard.

In the spatial Stroop task, stimuli consisted of four arrows pointing to one of the four corners of the screen (i.e., upper-left, upper-right, lower-left and lower-right). Each arrow was presented individually in one of the four quadrants of the screen resulting in congruent (e.g., upper-left pointing arrow positioned in the upper-left quadrant) and incongruent conditions (e.g., upper-left pointing arrow positioned in the lower-right quadrant). Participants had to respond according to the pointing direction of the arrow and ignore the corresponding position through a key press.

For both color-word and spatial Stroop tasks, stimuli (words or arrows, respectively) were presented for 500 ms and then replaced by a 2000 ms blank response screen. The next trial appeared after an inter-trial interval whose duration varied randomly and continuously between 250 ms and 700 ms. Each Stroop task consisted of two blocks of 64 trials each with a short rest break between the blocks. Congruent and incongruent trials were randomly and equally distributed. Only complete alternation sequences were employed to minimize both positive and negative priming confounds. That is, neither the ink nor the word color for the color-word Stroop task, and neither the direction nor the arrow position for the spatial Stroop task, used on the current trial were repeated in either way (ink or word, and direction or position) on the subsequent trial (see Puccioni and Vallesi, 2012a,b, for details).
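The complete alternation constraint can be expressed as a simple check on consecutive trials. The sketch below is illustrative only: it assumes each trial is coded as a pair of labels drawn from one shared label set (ink and word, or direction and position), and the function name is hypothetical.

```python
def is_complete_alternation(trials):
    """True if no feature value of a trial recurs on the next trial.

    Each trial is a (relevant, irrelevant) pair, e.g. (ink, word) or
    (direction, position); both dimensions share one label set.
    """
    for current, following in zip(trials, trials[1:]):
        if set(current) & set(following):
            return False
    return True


ok = [("red", "blue"), ("green", "green"), ("yellow", "red")]
bad = [("red", "blue"), ("green", "red")]  # "red" recurs across dimensions
print(is_complete_alternation(ok), is_complete_alternation(bad))
```

Under this constraint, an incongruent trial uses up two of the four labels, so the following trial can only draw on the remaining two.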

Prior to the experimental blocks, participants completed 16 training trials. They had to perform correctly on at least 10 out of the 16 trials to proceed to the subsequent experimental blocks. Otherwise, the practice had to be repeated until such a criterion was reached.

#### Data Analysis

Since the distributions of the response times (RTs) and accuracy scores were skewed and/or kurtotic, we respectively applied logarithmic and arcsine square root transformations to improve normality and reduce skewness. The use of log-transformed RTs also enabled controlling for possible unspecific effects of generalized slowing (e.g., Ben-David et al., 2014).

<sup>1</sup>One might argue that, although these extra participants were not aware of the purpose of our study, their recruitment procedure could have introduced some biases such as a stronger motivation of non-right-handers compared to right-handers in performing the task. However, we believe this was not the case here because a specifically higher motivation in non-right-handers, if present, should have equally influenced verbal and spatial Stroop tasks, and therefore cannot explain the differences we found between the two domains as a function of handedness.

For the RT analysis, the first trial in each block (1.56% of all the trials) was discarded, as well as errors (5.70% of the remaining trials) and anticipations (RTs < 150 ms, <0.01% of the remaining trials). Additionally, trials following an error (5.26% of the remaining trials) were excluded to avoid post-error slowing confounds (Burns, 1965). Finally, for each participant, trials with an RT more than 2 SD above or below the participant's mean for that task condition were treated as outliers and discarded from the RT analysis (4.61% of the remaining trials). For the accuracy analysis, the first trial in each block was removed.
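The exclusion steps and transformations described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the trial layout (a list of dictionaries), the function names, the base-10 logarithm, and the choice to trim before transforming are all assumptions, since the text does not specify them.

```python
import math
import statistics


def clean_rts(trials, sd_cutoff=2.0, min_rt=150.0):
    """Return {condition: [kept RTs]} for one participant and one task.

    trials: list of dicts in presentation order with keys 'block',
    'condition', 'rt' (ms) and 'correct' (bool); hypothetical layout.
    """
    kept = {}
    prev_block, prev_correct = None, True
    for t in trials:
        first_in_block = t["block"] != prev_block
        post_error = not first_in_block and not prev_correct
        usable = (not first_in_block and t["correct"]
                  and t["rt"] >= min_rt and not post_error)
        if usable:
            kept.setdefault(t["condition"], []).append(t["rt"])
        prev_block, prev_correct = t["block"], t["correct"]

    # Trim RTs lying more than sd_cutoff SDs from each condition mean.
    trimmed = {}
    for cond, rts in kept.items():
        if len(rts) < 2:
            trimmed[cond] = list(rts)
            continue
        m, sd = statistics.mean(rts), statistics.stdev(rts)
        trimmed[cond] = [rt for rt in rts if abs(rt - m) <= sd_cutoff * sd]
    return trimmed


def log_rt(rt_ms):
    """Logarithmic transform of an RT (base 10 is an assumption)."""
    return math.log10(rt_ms)


def arcsine_sqrt(proportion_correct):
    """Arcsine square root transform of a proportion in [0, 1]."""
    return math.asin(math.sqrt(proportion_correct))
```

Applying the exclusions in a single pass keeps the post-error rule tied to presentation order, which would be lost if errors were filtered out first.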

For both correct RTs and accuracy scores, we computed verbal and spatial Stroop effects by calculating the difference between congruent and incongruent trials and then assessed their statistical significance by means of one-sample t tests against zero. Cohen's d was used as a measure of the effect size (Cohen, 1977). We also tested the reliability of our verbal and spatial (RT and accuracy) Stroop effects by computing split-half correlations corrected with the Spearman–Brown formula. This procedure is critical when performing correlation/regression analyses, since low observed correlations might result from poor reliability of the measures used and not from the lack of a true relationship between variables. To this aim, for both verbal and spatial tasks, we randomly divided congruent and incongruent trials of each participant into two subgroups of equal size; we then computed the spatial and verbal Stroop effects for each half as described above and calculated the corresponding Spearman–Brown-corrected reliability indexes. All reliability indexes were obtained from 5000 different randomizations of the trials.
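The split-half procedure can be sketched as below. This is an illustrative sketch rather than the authors' code: it assumes a Pearson correlation between the two halves, an effect defined as incongruent minus congruent, and a hypothetical data layout with one pair of trial-level score lists per participant.

```python
import random
import statistics


def pearson(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step the half-test correlation up to full test length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(data, n_splits=1000, seed=0):
    """Median corrected reliability over random splits.

    data: list of (congruent_scores, incongruent_scores), one pair of
    lists per participant (hypothetical layout).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_splits):
        half_a, half_b = [], []
        for congruent, incongruent in data:
            con, inc = congruent[:], incongruent[:]
            rng.shuffle(con)
            rng.shuffle(inc)
            kc, ki = len(con) // 2, len(inc) // 2
            half_a.append(statistics.mean(inc[:ki]) - statistics.mean(con[:kc]))
            half_b.append(statistics.mean(inc[ki:]) - statistics.mean(con[kc:]))
        estimates.append(spearman_brown(pearson(half_a, half_b)))
    return statistics.median(estimates)
```

Reporting the median over many random splits, rather than a single odd/even split, avoids making the reliability estimate depend on one arbitrary partition of the trials.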

Next, the following analyses were performed. First, to investigate the linear relationship between direction of handedness and Stroop performance, we ran non-parametric analyses (i.e., based on rank-transformed data) since the distribution of the EHI scores was negatively skewed and not normally distributed (skewness = −1.55; kurtosis = 1.02; Shapiro-Wilk's W = 0.717, p < 0.001). Specifically, we performed a non-parametric linear regression analysis between the participants' rank-transformed EHI scores and their rank-transformed verbal and spatial Stroop effects. Note that the resulting regression parameter is equivalent to the non-parametric Spearman's correlation coefficient ρ. Moreover, to assess whether the relationship between direction of handedness and Stroop performance was modulated by the cognitive domain, we carried out non-parametric general linear model (GLM) analyses with the participants' rank-transformed verbal and spatial Stroop effects as dependent variables, Domain as a within-participants factor, and the rank-transformed EHI scores as a continuous predictor.
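The equivalence between the rank-based regression parameter and Spearman's ρ can be illustrated with a short sketch. It assumes that both rank vectors are standardized before the regression, in which case the ordinary least squares slope equals the correlation between the ranks; the function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def standardize(x):
    m = sum(x) / len(x)
    sd = (sum((v - m) ** 2 for v in x) / (len(x) - 1)) ** 0.5
    return [(v - m) / sd for v in x]


def rank_regression_slope(x, y):
    """OLS slope of standardized rank(y) on standardized rank(x).

    With both rank vectors standardized, this slope equals Spearman's rho.
    """
    zx = standardize(average_ranks(x))
    zy = standardize(average_ranks(y))
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)
```

Average ranks are used for ties because EHI scores take a limited set of values, so many participants share the same score.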

Second, to compare the effects of both consistency and direction of handedness in the same analysis, we carried out additional non-parametric regression analyses by including two regressors for the linear and the quadratic effect of the participants' EHI scores in explaining their variability in the spatial and verbal Stroop tasks. For each Stroop effect, the fit of this quadratic model to the data was compared to that of the simpler linear model by means of an F test (∆F) for R<sup>2</sup> change. In this way, we were able to assess whether the inclusion of the quadratic term was justified and, hence, to test the specific contribution of consistency of handedness over and above that of direction. Indeed, if consistency matters, the quadratic model should show a better fit to the data as compared to the linear one, with a U-shaped relationship between EHI scores and Stroop effects. Specifically, a U-shaped finding would indicate a difference between Stroop performance of inconsistent handers with respect to that of both consistent left- and consistent right-handers.
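The comparison between the linear and the quadratic model can be sketched as follows. This is a minimal illustration, assuming ordinary least squares fits on the (rank-transformed) predictor and the usual F statistic for an R<sup>2</sup> increase; the helper names and the hand-rolled solver are illustrative choices, not the authors' implementation.

```python
def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def f_change(r2_reduced, r2_full, n, k_full, k_added=1):
    """F for the R^2 increase, with (k_added, n - k_full - 1) df."""
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / df2)
    return f, df2
```

With 287 participants and two predictors in the full model, the denominator degrees of freedom are 287 − 2 − 1 = 284. Centering the predictor before squaring it is advisable in practice, since it reduces the collinearity between the linear and quadratic terms.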

#### RESULTS

#### Response Times

The RT analysis confirmed the traditional Stroop task results with reliable verbal (M = 0.057, SD = 0.039, corresponding to a mean untransformed raw effect of 92.89 ms, SD = 70.64 ms; t(286) = 24.75, p < 10<sup>−72</sup>, d = 1.46) and spatial Stroop effects (M = 0.084, SD = 0.028, corresponding to a mean untransformed raw effect of 109.40 ms, SD = 53.50 ms; t(286) = 50.11, p < 10<sup>−142</sup>, d = 2.96). The correlation between the two Stroop effects was not significant (ρ = −0.008, t(285) = −0.14, p = 0.888), a result that suggests some degree of independence in the cognitive processes underlying verbal and spatial Stroop tasks. The spatial and verbal RT Stroop effects had a good split-half reliability (respectively, median = 0.767 and 0.738, two-sided 95% confidence interval = [0.709, 0.810] and [0.673, 0.785]), making their use in the subsequent regression analysis appropriate. It is important to note that the global EHI score has also been shown to have good test-retest reliability (e.g., Ransil and Schachter, 1994).
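As a quick consistency check, the reported t and d values can be approximately recovered from the summary statistics. The sketch below assumes the one-sample case, in which d = M/SD and t = d·√N; the function name is illustrative, and small discrepancies are expected because the inputs are rounded.

```python
import math


def one_sample_t_and_d(mean, sd, n):
    """Cohen's d and t for a one-sample test against zero."""
    d = mean / sd
    t = d * math.sqrt(n)
    return t, d


# Rounded summary values for the verbal RT Stroop effect (N = 287).
t, d = one_sample_t_and_d(0.057, 0.039, 287)
print(round(t, 2), round(d, 2))
```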

The non-parametric regression analyses between the verbal and spatial Stroop effects and the EHI scores showed an opposite pattern of results: there was a negative correlation for the verbal Stroop task (ρ = −0.174, p = 0.003) and a positive correlation for the spatial Stroop one (ρ = 0.121, p = 0.041). These results indicate that in the verbal domain the Stroop effect was reduced for participants with more positive EHI scores, whereas in the spatial domain the Stroop effect was reduced for participants with less positive (or negative) scores. Moreover, the GLM analysis revealed a significant interaction between cognitive domain and direction of handedness (F(1,285) = 12.78, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.04), confirming that the verbal and spatial Stroop effects were predicted by the EHI scores in an opposite way. To further verify that the correlations between the Stroop effects and the EHI scores in the verbal and spatial domains were significantly different, we also used the statistical test for comparing overlapping correlations from dependent groups described in Diedenhofen and Musch (2015). By overlapping correlations, it is meant that the same variable (in our case the EHI scores) is common to both correlations. Such a test confirmed that the two correlations observed here were statistically different (Meng et al., 1992; Z = −3.434, p < 0.001).

We then fitted the participants' verbal and spatial Stroop effects with two non-parametric models including two regressors for the linear and the quadratic effect of the participants' EHI scores. This was done in order to evaluate whether these quadratic models explained participants' Stroop performance better than the simpler linear ones. The improvement of the model fit due to the inclusion of the quadratic term was not significant for the verbal Stroop effect (∆F(1,284) = 1.95, corrected p = 0.327) and only marginally significant for the spatial one (∆F(1,284) = 4.57, corrected p = 0.066), but the increase in R<sup>2</sup> was negligible in both cases (respectively, <0.01 and 0.02). This analysis, thus, showed that the quadratic model accounting for the effects of both consistency and direction of handedness together did not provide a more adequate explanation of the participants' verbal Stroop effect as compared to the simpler linear model describing the effect of direction of handedness alone. There was, instead, a marginally significant improvement of the model with the inclusion of the quadratic term in the case of the spatial Stroop task.

Finally, to control for a possible rightward bias that might have influenced the correlation results, we performed an additional analysis in which the proportion of positive and negative EHI scores was exactly matched. That is, for each participant with a negative EHI score, one participant with the opposite EHI score was randomly selected. Since negative and positive values that could not be paired up were discarded, this procedure resulted in a total of 41 matched participant pairs. We then computed a Spearman correlation on this new dataset and repeated the same procedure on 10,000 random subsets. The two-sided 95% confidence interval (CI95%), which corresponds to an alpha level of 0.05, was finally computed. The results of this series of correlations remained the same as those described above for the color-word Stroop task (median ρ = −0.242, CI95% = [−0.374, −0.107]), but not for the spatial Stroop task (median ρ = −0.021, CI95% = [−0.159, 0.114]). For illustrative purposes, **Figure 1** shows the bivariate distributions of the participants' verbal and spatial Stroop effects as a function of their EHI scores resulting from the 10,000 matched random subsets derived as described above.
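The matching and resampling procedure can be sketched as follows. This is an illustrative sketch under stated assumptions: the pairing rule (exact opposite score, random choice of the positive partner), the nearest-rank percentile interval, and all function names are assumptions; the correlation function is passed in by the caller (a Spearman correlation in the analysis described above).

```python
import random


def matched_subset(ehi, rng):
    """Indices of negative scorers, each paired with a random participant
    whose score is the exact opposite; unpaired scores are dropped."""
    by_score = {}
    for idx, score in enumerate(ehi):
        if score > 0:
            by_score.setdefault(score, []).append(idx)
    for pool in by_score.values():
        rng.shuffle(pool)
    chosen = []
    for idx, score in enumerate(ehi):
        if score < 0 and by_score.get(-score):
            chosen.extend([idx, by_score[-score].pop()])
    return chosen


def resampled_statistics(ehi, effect, statistic, n_subsets=10000, seed=0):
    """Apply statistic(x, y) to n_subsets randomly matched subsets."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_subsets):
        idx = matched_subset(ehi, rng)
        out.append(statistic([ehi[i] for i in idx], [effect[i] for i in idx]))
    return out


def percentile_interval(values, lower=2.5, upper=97.5):
    """Nearest-rank percentile bounds of the resampled statistics."""
    s = sorted(values)

    def pick(q):
        return s[round(q / 100 * (len(s) - 1))]

    return pick(lower), pick(upper)
```

Because only the positive partner of each pair is drawn at random, every subset has the same size and a symmetric distribution of EHI scores around zero, while the set of right-handers varies from subset to subset.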

#### Accuracy

The analysis conducted on the accuracy scores paralleled the RT findings in that there were significant verbal (M = 0.081, SD = 0.115, corresponding to a mean untransformed raw effect of 3.88%, SD = 6.21%; t(286) = 11.86, p < 10<sup>−25</sup>, d = 0.70) and spatial (M = 0.202, SD = 0.125, corresponding to a mean untransformed raw effect of 6.99%, SD = 7.16%; t(286) = 27.33, p < 10<sup>−80</sup>, d = 1.61) Stroop effects also in the accuracy data. Unlike what was observed for the RT data, however, the correlation between the two accuracy Stroop effects was significant (ρ = 0.182, t(285) = 3.13, p = 0.002). The spatial and verbal accuracy Stroop effects had good reliability indexes (respectively, median = 0.759 and 0.529, CI95% = [0.704, 0.804] and [0.417, 0.620]), although the verbal one was lower than that found for the corresponding RT Stroop effect.

The non-parametric regression analyses between the verbal and spatial Stroop effects and the EHI scores showed a pattern of results that differed from that observed for the RT Stroop effects in the following respect. While there was a significant negative correlation for the verbal Stroop task (ρ = −0.224, p < 0.001), the correlation for the spatial Stroop one failed to reach significance (ρ = −0.035, p = 0.560). However, the GLM analysis confirmed a significant interaction between cognitive domain and direction of handedness (F(1,285) = 6.38, p = 0.012, η<sub>p</sub><sup>2</sup> = 0.02), showing that the verbal and spatial accuracy Stroop effects were predicted by the EHI scores in a significantly different way. Moreover, the test for comparing overlapping correlations confirmed that the correlation for the verbal task was different from that for the spatial one (Meng et al., 1992; Z = −2.522, p = 0.012).
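The test for overlapping dependent correlations can be written out explicitly. The sketch below implements the Z statistic of Meng et al. (1992) as it is commonly stated, treating the rank correlations as ordinary correlation coefficients; the function names are illustrative, and this is not the authors' code.

```python
import math


def meng_z(r1, r2, r12, n):
    """Z for the difference between two dependent, overlapping correlations.

    r1, r2: correlations of the shared variable with each of the two
    other variables; r12: correlation between those two variables;
    n: sample size.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher z transforms
    r_sq_mean = (r1 ** 2 + r2 ** 2) / 2.0
    f = min(1.0, (1.0 - r12) / (2.0 * (1.0 - r_sq_mean)))
    h = (1.0 - f * r_sq_mean) / (1.0 - r_sq_mean)
    return (z1 - z2) * math.sqrt((n - 3) / (2.0 * (1.0 - r12) * h))


def two_sided_p(z):
    """Two-sided p value of a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Rounded accuracy coefficients reported in the text, N = 287.
z = meng_z(-0.224, -0.035, 0.182, 287)
print(round(z, 2), round(two_sided_p(z), 3))
```

Plugging in the rounded accuracy coefficients gives a Z of about −2.52, in line with the reported value. Results obtained from rounded coefficients can deviate slightly from those computed on the unrounded data.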

Paralleling the results on the RT Stroop effects, the improvement of the model fit due to the inclusion of the quadratic term was not significant for the verbal and spatial Stroop effects (both ∆Fs(1,284) < 2.82, ps > 0.583), with negligible R<sup>2</sup> increase in both cases (both < 0.01).

As for the RT Stroop effects, we performed an additional analysis in which the proportion of participants with positive and negative EHI scores was exactly matched in 10,000 random subsets of the data. The results of this series of correlations confirmed those described above. Indeed, the correlation for the color-word Stroop task was significant (median ρ = −0.281, CI95% = [−0.406, −0.143]), whereas that for the spatial Stroop task was not (median ρ = −0.083, CI95% = [−0.235, 0.069]). For illustrative purposes, **Figure 2** shows the bivariate distributions of the participants' verbal and spatial accuracy Stroop effects as a function of their EHI scores resulting from 10,000 matched random subsets.

#### DISCUSSION

The goal of the present study was to explore the relationship between handedness and cognitive functioning. Our working hypothesis derives from recent work showing that non-right-handers should perform comparatively better on those tasks requiring greater inter-hemispheric collaboration and access to the right hemisphere, such as the spatial version of the Stroop task (Prichard et al., 2013). They should instead perform worse on tasks for which inter-hemispheric interaction is not advantageous to performance and that require preferential access to the left hemisphere, such as the verbal version of the Stroop task (Christman, 2001). If this were true, handedness research could also shed some light on current accounts of executive functions, according to which these functions may be differently lateralized in the brain (e.g., Stuss, 2011; Vallesi, 2012), with mechanisms underlying the Stroop task assumed to mostly engage the left hemisphere (e.g., Floden et al., 2011; Gläscher et al., 2012; Geddes et al., 2014; Cipolotti et al., 2016).

The main findings of this study can be summarized as follows. In line with our predictions, we found that handedness modulated performance on verbal and spatial Stroop tasks in opposite ways. Indeed, the regression analyses showed that right-handers were better able to reduce Stroop interference in the verbal task compared to non-right-handers, in terms of both RT and accuracy, whereas non-right-handers exhibited an advantage in the spatial Stroop task, albeit for RT only. In other words, the relationship between handedness and verbal-spatial Stroop performance was accounted for by linear relationships with opposite signs, thus showing that direction of handedness played a critical, but differential, role in the two Stroop tasks. This finding was further corroborated by follow-up non-parametric GLM analyses, which confirmed that the verbal and spatial Stroop effects were predicted by the participants' EHI scores in a significantly reversed manner, for both RT and accuracy.

In order to disentangle direction and consistency of handedness, we also fitted the participants' verbal and spatial Stroop effects with both linear and quadratic models. This allowed clarifying that, especially for the verbal domain, consistency of handedness did not account for the data better than direction alone. Otherwise, we should have found a different pattern for inconsistent/mixed-handers compared to either consistent left- or consistent right-handers. Moreover, we tested whether the relatively small number of left-handed participants (N = 52) in our sample influenced the results obtained in the two Stroop tasks. To control for this possible bias, we equated the number of participants with negative and positive EHI scores and performed correlations on these new datasets, repeating such a procedure 10,000 times. This analysis clarified the following points. For the color-word Stroop task, and hence in the context of the verbal domain only, the more strongly one is right-lateralized, the greater the ability to resist interference from competing word reading information. Such an advantage was present for both RT and accuracy data. It can then be concluded that the significant relation between handedness and Stroop performance observed in the verbal domain was reliable and robust in spite of the common rightward bias in the original distribution of the EHI scores. By contrast, in the spatial domain, the significant correlation result we found for the RT data was not confirmed by the accuracy analysis. Also, the RT advantage related to handedness disappeared in the correlation analysis of the spatial Stroop task when equating the number of participants with negative and positive EHI scores.

Two non-mutually exclusive explanations can account for the latter finding. The first one considers that the correlation we found in the spatial Stroop task could have been driven not only by consistent left-handers but, at least partly, also by inconsistent/mixed-handers (i.e., those participants with EHI scores in the middle of the distribution), who were underrepresented in the control analysis matching the number of participants with positive and negative EHI scores (see **Figures 1**, **2**). Lending support to this hypothesis, the inclusion of the quadratic term in the regression analysis was not significant for the verbal domain, while it was marginally significant for the spatial one, suggesting that consistency of handedness could contribute to explaining Stroop performance in the spatial domain. The second consideration is that, relative to the verbal domain, the relation between handedness and Stroop performance in the spatial one was more prone to be biased by the rightward asymmetry in the distribution of the EHI scores and the relatively low number of left-handers. One might therefore speculate that what we observed in the spatial Stroop task might simply reflect a statistical artifact and not a real advantage related to direction of handedness. Enrolling a higher number of left-handers to control for the rightward bias in the distribution of the EHI scores in future work is, thus, necessary to assess the impact of both direction and consistency of handedness on spatial Stroop performance. In any case, it is important to underscore here that our main finding was that handedness significantly modulated performance on verbal and spatial Stroop tasks in relatively opposite ways, as shown by the GLM analysis and Meng's test, a pattern of results that supports the idea that hand preference exerted a differential influence on the two types of tasks.

An alternative explanation for the differences observed between spatial and verbal Stroop performance as a function of handedness is related to task difficulty. That is, it could be argued that since the spatial Stroop task was more difficult than the verbal Stroop task (in terms of higher Stroop effect), non-right-handers outperformed right-handers when overall task difficulty was relatively high, while the opposite was true when overall task difficulty was low. This would fit well with previous studies showing that hemispheric interactions are beneficial for relatively difficult tasks, while within-hemisphere processing is advantageous for relatively simple tasks (e.g., Banich and Belger, 1990; Weissman and Banich, 1999, 2000). Despite its apparent plausibility, however, this explanation cannot apply to our data to the extent that, in terms of overall task difficulty, the verbal Stroop task was indeed relatively more difficult than the spatial one<sup>2</sup>. Accordingly, general task difficulty does not offer a valid framework to explain our findings. It should also be noted that our findings cannot be simply attributed to low-level verbal or spatial abilities, as these abilities were not correlated with the EHI scores<sup>3</sup>. The results reported here were specific to the Stroop effect, as also suggested by the fact that when congruent and incongruent conditions were taken separately, no significant correlations with the EHI scores were observed for RT data, while only one significant correlation emerged for accuracy<sup>4</sup>. In particular, these control analyses showed that non-right-handers had lower accuracy in the incongruent condition of the color-word Stroop task only, a result that is still in line with our hypothesis of worse performance for non-right-handers on the verbal condition, for which inter-hemispheric interaction was assumed not to be useful for performance. Thus, this result does not affect our main conclusions.

Although our hypothesis on the common involvement of the left hemisphere for both Stroop tasks could seem somewhat speculative, there is evidence that bolsters it. In a recent resting-state electroencephalographic (EEG) study, Ambrosini and Vallesi (2017) used the same color-word and spatial Stroop tasks as the ones reported here. They found that participants with stronger resting-state-related activity in left-lateralized prefrontal regions were more able to resolve Stroop interference in both verbal and spatial tasks, which were administered at a later time with respect to the EEG session. Left-lateralized activations in the spatial Stroop task were also reported by Zoccatelli et al. (2010) in their fMRI study.

<sup>2</sup>The RTs for the congruent and, importantly, the incongruent conditions of the spatial Stroop task were both significantly lower than those in the congruent condition of the verbal Stroop task (respectively, t(286) = 24.46 and 6.89, p < 10<sup>−71</sup> and 10<sup>−10</sup>, d > 1.44 and 0.40), suggesting that the spatial Stroop task was actually easier than the verbal one.

<sup>3</sup>Additional control non-parametric regressions failed to find significant correlations between EHI scores and either RT or accuracy measures of both low-level verbal (respectively, ρ = −0.054 and 0.101, p = 0.361 and 0.087) and spatial (respectively, ρ = 0.096 and 0.035, p = 0.103 and 0.553) abilities, which were computed, respectively, as the mean RT or accuracy collapsed over congruent and incongruent conditions of the verbal and spatial Stroop tasks.

<sup>4</sup>The ρ correlations between EHI scores and RT data were as follows: spatial congruent = 0.077, p = 0.193; spatial incongruent = 0.106, p = 0.071; verbal congruent = −0.025, p = 0.678; verbal incongruent = −0.088, p = 0.136. The ρ correlations between EHI scores and accuracy were as follows: spatial congruent = −0.023, p = 0.690; spatial incongruent = 0.051, p = 0.388; verbal congruent = −0.038, p = 0.520; verbal incongruent = 0.187, p = 0.001.

Along the same line, in a previous fMRI study, we investigated another executive function relying on control processes, namely, task-switching ability and, like here, spatial and verbal tasks were administered to the same participants (Vallesi et al., 2015). Our results showed a left-lateralized involvement of fronto-parietal regions for the verbal task and a more bilateral pattern for the spatial task. Importantly, a conjunction analysis revealed that, together with the bilateral supplementary motor area, task-switching in both spatial and verbal tasks activated left fronto-parietal regions. It thus seems likely that the left hemisphere is specialized for those cognitive control processes underlying resistance to interference and cognitive flexibility (e.g., Derrfuss et al., 2005; Ambrosini and Vallesi, 2016), but that it may interact with the right hemisphere as a function of the (spatial) nature of the task to be performed (see also Babcock and Vallesi, 2015). Future studies should employ other types of conflicting stimuli differently lateralized to the two hemispheres to further test our predictions and check their generalizability. Moreover, it is highly recommended to complement these behavioral observations with neural data to gain more direct insights into the brain asymmetries underlying handedness and cognitive control task-performance within the same individuals.

In sum, the behavioral dissociations reported here confirmed our starting hypotheses. Indeed, replicating previous findings (Christman, 2001; but see Beratis et al., 2010), we found that non-right-handers showed significantly greater interference when faced with the verbal Stroop task, for which inter-hemispheric interaction was not useful and interference had to be resolved mainly by the left side of the brain. Conversely, they performed relatively better, at least in terms of RT, when confronted with the spatial Stroop task, for which access to right hemisphere processes was needed and greater collaboration between the two hemispheres was beneficial to performance.

To conclude, the current study suggests that handedness may also be a useful tool to test predictions derived from neural models that fractionate high-level cognitive processes along the left-right axis of the human brain. More importantly, it provides evidence in favor of a growing literature arguing that handedness may help explain individual differences in cognitive performance.

#### AUTHOR CONTRIBUTIONS

MC drafted the manuscript and was involved in all subsequent revisions. She was also involved in data collection and data analysis. EA performed statistical analysis, drafted the manuscript and provided additional revisions to the manuscript. AV was involved in the conception of the work and provided ongoing contributions and feedback throughout the experimental process. He also provided additional revisions to the manuscript. All the authors have approved the final version of the manuscript and agree to be accountable for all aspects of the work.

#### FUNDING

The authors are supported by the European Research Council under the European Union's 7th Framework Programme (FP7/2007-2013) grant agreement n° 313692 awarded to AV.

#### ACKNOWLEDGMENTS

We wish to thank Chiara Rossato, Sandra Arbula, Laura Babcock, Valentina Pacella and Vincenza Tarantino for their assistance with data collection and Città della Speranza, Padova, for its logistic support.

#### REFERENCES

adults integrate visual targets differently than younger adults? PLoS One 9:e113551. doi: 10.1371/journal.pone.0113551

resonance imaging. [Electronic version]. Surg. Radiol. Anat. 27, 254–259. doi: 10.1007/s00276-004-0308-1


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Capizzi, Ambrosini and Vallesi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Asymmetric Spatial Processing Under Cognitive Load

Lien Naert<sup>1</sup> \*, Mario Bonato1,2 and Wim Fias<sup>1</sup>

<sup>1</sup> Department of Experimental Psychology, Ghent University, Ghent, Belgium, <sup>2</sup> Department of General Psychology, University of Padova, Padua, Italy

Spatial attention allows us to selectively process information within a certain location in space. Despite the vast literature on spatial attention, the effect of cognitive load on spatial processing is still not fully understood. In this study we added cognitive load to a spatial processing task, so as to see whether it would differentially impact upon the processing of visual information in the left versus the right hemispace. The main paradigm consisted of a detection task that was performed during the maintenance interval of a verbal working memory task. We found that increasing cognitive working memory load had a more negative impact on detecting targets presented on the left side compared to those on the right side. The strength of the load effect correlated with the strength of the interaction on an individual level. The implications of an asymmetric attentional bias with a relative disadvantage for the left (vs the right) hemispace under high verbal working memory (WM) load are discussed.

#### Edited by:

Celine R. Gillebert, KU Leuven, Belgium

#### Reviewed by:

Edward Golob, University of Texas at San Antonio, United States

Bruce Mehler, Massachusetts Institute of Technology, United States

> \*Correspondence: Lien Naert lhnaert.naert@ugent.be

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 08 January 2018 Accepted: 06 April 2018 Published: 23 April 2018

#### Citation:

Naert L, Bonato M and Fias W (2018) Asymmetric Spatial Processing Under Cognitive Load. Front. Psychol. 9:583. doi: 10.3389/fpsyg.2018.00583

Keywords: spatial attention, verbal working memory, cognitive load, detection task, visuo-spatial processing

# INTRODUCTION

We are constantly confronted with an amount of information that dramatically exceeds our ability to process it. Our capacity to attentively process information coming from the outside is limited and thus attentional selection is essential. Different types of selective attention can be distinguished (Carrasco, 2011). One type is feature-based attention, where attention is allocated to a specific aspect of objects (e.g., color). Depending on the task, a particular feature is made relevant and often the aim is to focus only on the relevant information and ignore the irrelevant. This is typically investigated in interference paradigms by testing the effect induced by a distractor or a task-irrelevant stimulus. In a Stroop task for instance, where the aim is to name the color of a word, the meaning of the word itself, even though irrelevant, interferes when performing the task (Stroop, 1935). Another type of selective attention is spatial attention, in which the available attentional resources are distributed across space as a function of task demands. Depending on the specific task to be performed, the most fruitful behavior could be to focus on a specific region or to distribute attentional resources across larger areas. A prototypical task used to investigate spatial attention is the Posner cueing paradigm (Posner, 1980). In this paradigm, a lateralized target is preceded by a cue which directs attention toward a specific location in space. In a valid trial, the cue correctly predicts target position while in an invalid trial, the cue predicts a position different from where the target will appear.

An important question is how the mechanisms of selective visual attention operate in dual task situations; that is, in situations in which central processing resources cannot be dedicated to one task only. For instance, consider what happens in traffic. While driving a car you have to select visual information that is relevant (like pedestrians, other cars, traffic signs, etc.) while ignoring irrelevant information (like scenery, houses, etc.). If you receive a phone call while driving, processing resources will have to be divided between driving and the verbal interaction. The question then is whether selectively attending to relevant visual information and ignoring irrelevant information remains as efficient as it is when no phone call is taking place.

This has mainly been addressed by interference paradigms. The load theory by Lavie (1995) looked at how attentional selection is influenced by the level and type of load. Lavie made the important distinction between the effects of perceptual load and cognitive load for determining whether or not peripheral distractors were interfering with performance. Perceptual load is defined as the number and complexity of perceptual operations that the task involves. Cognitive load is described as the total amount of demand imposed on working memory (WM). Lavie and colleagues found that during tasks with a low perceptual load, distractors are still processed despite our attempts to ignore them (Lavie, 1995; de Fockert et al., 2001; Lavie, 2010), whereas when the perceptual load is high, interference from irrelevant information disappears. In another study, the effects of perceptual load were contrasted with cognitive load conditions (Lavie and De Fockert, 2005). Interestingly, the two different types of load had an opposite effect on processing distractors. Whereas high perceptual load improved the ability to focus attention on the relevant and ignore the irrelevant, this process deteriorated under conditions of high cognitive load. Irrelevant distractors were difficult to ignore under high cognitive load and therefore had a strong impact on processing the relevant target information compared to when the target and distractor were presented under conditions of low cognitive load. The different effect observed for perceptual versus cognitive load shows that increasing the load does not always impair selective attention by disrupting general cognitive control. Kim et al. (2005) propose that the effects of load could depend on whether there is overlap in content-specific processing, thus leading to interference between WM and the selective attention task (but see Gil-Gómez de Liaño et al., 2016).

In this respect it should be made explicit that WM is not a unitary mechanism, but is suggested to have two domain-specific slave systems: the visuospatial sketchpad and the phonological loop (Baddeley and Hitch, 1974; Baddeley, 2003). Traditional models of dual task interference state that the visuospatial and phonological WM resources are to a great extent independent of each other. Consequently, the maintenance of information in verbal WM is not supposed to conflict much with another concurrent task unless it requires verbal processing too. A study by Marciano and Yeshurun (2017) revealed a recurrent difficulty in replicating the load theory's predictions concerning perceptual load and distractibility. They found substantial inter-individual differences in the impact of load. Frequently, the results showed a pattern opposite to what was expected based on the load theory. Under low levels of perceptual load, participants with a low WM capacity show greater distractor processing than those with a high WM capacity (Shipstead et al., 2012), showing that WM can influence visual distractibility. Marciano and Yeshurun (2017) investigated whether their results could be explained by inter-individual differences, by testing the same participants several times. The results showed large between-session variations, indicating that not only inter-, but also intra-individual differences should be taken into account to have a better understanding of what is actually being measured.

As reviewed above, the load theory of Lavie (1995) focuses on how load affects the processing of distractors in interference paradigms. However, it is not yet clear how load affects the distribution of attention across space, in particular with respect to the processing of task-relevant (as opposed to task-irrelevant, in Lavie's studies) features. Clarifying the effect of cognitive load on spatial processing, and more specifically looking into a possible difference between left and right hemispace, is the focus of the present study. Despite the absence of systematic investigations, a few studies are revealing. For example, a neuropsychological study by Bonato et al. (2010) looked at spatial awareness under different types of load. By asking four right-hemisphere damaged patients to detect the appearance of lateralized visual targets it was shown that the number of omissions for left, contralesional targets dramatically increased when patients had to concurrently perform a second task, either visual or auditory in nature. The testing of spatial processing under such dual task conditions turned out to be a much more sensitive method to detect neglect than the classical diagnostic tests, as a striking asymmetry in visual awareness could be detected even in the absence of neglect on the classical tests (Bonato, 2015). Notably, left hemisphere damaged patients show load-induced omissions for the right hemispace (Blini et al., 2016), confirming that, after brain damage, the load effect is specific for contralesional hemispace processing.

Bonato et al. (2010) did not find any effect of increased cognitive load on spatial awareness in neurologically intact matched control participants. The clinical task may have been too easy to induce attentional asymmetries in the absence of brain damage. However, a recent review based on studies in healthy participants (Chen and Spence, 2016) suggests that, during multisensory integration, spatial processing can become asymmetric and show a rightward attentional bias under high load, in particular of perceptual origin. Peers et al. (2006) tested the effect of a secondary sound discrimination task that could either be spatial or non-spatial by nature upon a primary visual task which consisted in reporting as many as possible of six letters briefly presented in a circular arrangement. A rightward bias emerged only when adding the secondary task. Another example of spatial asymmetry under high load comes from a study looking at the crossmodal ventriloquism aftereffect (a shift in the perceived spatial location; Eramudugolla et al., 2011). Besides the spatially discrepant presentation of visual and auditory stimuli which is necessary to induce the ventriloquism effect, an additional visual pattern detection task was presented. The load was manipulated by either presenting simple or complex patterns. A larger ventriloquism aftereffect was found in the high load condition, but only toward the right hemispace. A third example of spatial asymmetry under load does not involve multisensory integration, but solely focuses on the auditory domain (Golob et al., 2017). Participants had to respond to the amplitude modulation rate without paying attention to the sound location, while short term memory load was manipulated. Load presence led to clear slowing for the left compared to the right side. However, this was only true for spatial load and not for verbal load. Load can influence auditory spatial attention, but the specific pattern depends on the type of load. In a recent study (Lisi et al., 2015) a multitasking manipulation similar to the one by Bonato et al. (2010) was applied to young, healthy participants, while they were asked to process lateralized visual targets which were masked. Under load, a trend emerged (stable across tasks yet non-significant) toward a larger impact of the concurrent tasks upon left-sided targets compared to right-sided ones. We did not find any studies looking into the effect of cognitive load on spatial attention asymmetries only within the visual modality.

Also, the load theory suggests that the nature of the imposed load is a critical determinant of how selective attention is deployed. Given that perceptual and cognitive load can have opposite effects on selective attention, it is important to use paradigms that unequivocally and exclusively manipulate one type of load. The fact that some of the tasks described above may have comprised two types of load (perceptual and cognitive) might account for some of the inconsistencies in the literature. For instance, the dual-task condition in the study of Bonato et al. (2010), in which identifying a visual or auditory stimulus was part of the secondary task, sums the effects related to dividing attention to process two different sources of information with those due to the maintenance of multiple response options, thereby increasing cognitive load. In the present study we are particularly interested in isolating the effects of cognitive load on spatial monitoring in a context of divided attention. Yet, given the lack of paradigms designed to specifically target cognitive load, the results that are reported so far are inconclusive. One paradigm that allows cognitive load to be manipulated unequivocally is to preload working memory and to evaluate its impact on spatial attention during the maintenance interval, in which no perceptual stimuli related to the WM task are presented.

Majerus et al. (2012) used such a paradigm in the context of an fMRI experiment. They used a letter recall task to load verbal WM and investigated how the brain responded to visual items presented during the retention interval as a function of the number of letters that had to be maintained in WM. The results showed a clear impact of WM load on the neural networks associated with spatial attention. During conditions of high WM load, the response of the temporo-parietal junction (TPJ) to the visual stimuli was suppressed. The TPJ is part of the ventral attention network and has been associated with reorienting attention toward salient or unexpected visual stimuli (Corbetta et al., 2000; Marois et al., 2000), both task-relevant and task-irrelevant (Downar et al., 2000; Corbetta and Shulman, 2002). Although this study did not systematically manipulate the location of the visual stimuli and although no behavioral measurements were obtained, these results are suggestive of a potentially influential role of cognitive load on spatial attention. Based on the fact that damage to the right TPJ is considered to be the crucial reason for the rightward attentional bias in neglect, it can a priori be predicted that the right-lateralized reduction of TPJ activity induced by high cognitive load would lead to a worse detection of stimuli in the left hemispace compared to those in the right hemispace. The current study was designed to test this prediction. Specifically, we investigated spatial processing differences for left-sided versus right-sided stimuli when verbal WM is loaded with more (high load) or fewer (low load) items. Although spatial selective attention has been widely investigated while manipulating the load, to our knowledge, no study has looked into the effect of WM load on spatial attention concerning left–right asymmetries. To this end, we used an approach in which a detection task was used to measure spatial attention in the context of a WM letter recall task that was used to preload verbal WM. Moreover, we aimed at correlating the size of the space-load interaction with the impact of load at the individual level.

# EXPERIMENT 1

# Materials and Methods

#### Participants and Apparatus

Twenty participants (all university students, six males, M = 23.45 years, SD = 3.59) gave informed consent and were paid €10 to participate. None of the participants were aware of the purpose of the experiment. Two participants were left-handed (based on self-report) and all had normal or corrected-to-normal vision. We conducted the experiment on a Dell laptop running E-prime 2.0.8.90 software (Psychology Software Tools, Inc.<sup>1</sup>). The stimuli were presented on an external monitor (19-inch wide-screen LCD, Dell) and responses were given on an external keyboard. The distance from the participant to the screen was approximately 60 cm. All stimuli were presented in black on a white background.

#### Procedure

The experiment consisted of a detection task that was presented during the maintenance interval of a parallel WM task (see **Figure 1**).

#### **Working memory task**

Before the start of the detection task, a sequence of two (low WM load) or six (high WM load) letters had to be memorized. The letters were presented one at a time, in the middle of the screen. None of the letters were vowels. No letters were repeated within a sequence. Participants were instructed to remember the letters in exactly the same order. They could go through the letters at a self-determined pace. Once all the letters had been seen, the detection task (15 trials) started. After finishing the detection task, participants had to recall the sequence of letters and type it using a standard keyboard. Their response appeared on the screen and if necessary, corrections could be made. There were 40 WM sets, half of which were low WM load trials and half of which were high WM load trials. It was a block design, alternating the load condition, with five sets per block and eight blocks in total. Participants were always informed in advance about the difficulty level of the block (easy or hard). Half of the participants started with an easy block and half of them with a hard block. Between each block participants could take a short break if necessary.

<sup>1</sup>www.pstnet.com
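As an illustration of the block design described for the WM task, the following minimal Python sketch builds one participant's schedule of 40 letter sets. It is not the authors' E-prime script: the function name `build_wm_schedule`, the fixed random seed, and the exact consonant pool (here without Y) are assumptions made only for this example.

```python
import random

# Assumption: the paper only states that no vowels were used; Y is left out here.
CONSONANTS = list("BCDFGHJKLMNPQRSTVWXZ")
SET_SIZE = {"low": 2, "high": 6}  # letters per sequence, as described in the text


def build_wm_schedule(first_load, n_blocks=8, sets_per_block=5, seed=0):
    """Return (block, load, letters) tuples for blocks that alternate in load."""
    rng = random.Random(seed)
    other = {"low": "high", "high": "low"}
    schedule = []
    load = first_load
    for block in range(1, n_blocks + 1):
        for _ in range(sets_per_block):
            # sample() draws without replacement, so no letter repeats in a set
            letters = "".join(rng.sample(CONSONANTS, SET_SIZE[load]))
            schedule.append((block, load, letters))
        load = other[load]  # switch load condition for the next block
    return schedule


schedule = build_wm_schedule("low")
print(len(schedule), schedule[0], schedule[5])
```

Calling `build_wm_schedule("high")` instead would give the order used for the other half of the participants, who started with a hard block.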

#### **Detection task**

The detection task was performed during the maintenance interval of the WM task and always comprised 15 trials. Thus, after the participants went through the letters they had to remember for the WM task, the detection task began. During the whole detection task, a fixation cross (height 14 mm, 1.33° of visual angle) was present in the center of the screen. Every trial started with the text "Klaar!" ("Ready!") displayed for 1500 ms on top of the fixation cross. A target stimulus appeared 400 ms after the text disappeared. The target stimulus (black dot) had a diameter of 9 mm (0.86°), and was presented for 16 ms (synchronized with the 60 Hz refresh rate of the screen) either on the left or the right side of the screen. In one third of trials, no target was presented. Participants were instructed to press the spacebar as soon as they saw the target, irrespective of its position. They had a maximum of 1000 ms to give a response and as soon as a response was given, the next detection trial started. They were explicitly asked to perform this detection task with the index finger of their dominant hand. For every detection task during a WM trial, the position of the target was balanced and randomized across the 15 detection trials. The experiment encompassed 600 target detection trials: 200 with a left-sided target, 200 right-sided, and 200 without any target. Half the trials were performed under low WM load and half under high WM load.
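The stimulus geometry and trial counts given above can be checked with a short Python sketch. It assumes a viewing distance of exactly 600 mm (the text says approximately 60 cm) and assumes that each 15-trial sequence contained five left, five right and five no-target trials, which is one reading of "balanced" together with the one-third no-target rate. The function names are hypothetical.

```python
import math
import random


def visual_angle_deg(size_mm, distance_mm):
    """Visual angle (degrees) subtended by a stimulus of size_mm at distance_mm."""
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))


def detection_trials(rng, n_per_type=5):
    """One detection sequence: left, right and no-target trials in random order."""
    trials = ["left"] * n_per_type + ["right"] * n_per_type + ["none"] * n_per_type
    rng.shuffle(trials)
    return trials


DISTANCE_MM = 600  # assumption: "approximately 60 cm" taken as exactly 600 mm
print(round(visual_angle_deg(14, DISTANCE_MM), 2))  # fixation cross
print(round(visual_angle_deg(9, DISTANCE_MM), 2))   # target dot

rng = random.Random(1)
all_trials = [t for _ in range(40) for t in detection_trials(rng)]  # 40 WM sets
print(len(all_trials), all_trials.count("left"),
      all_trials.count("right"), all_trials.count("none"))
```

Under these assumptions the computed angles fall close to the reported 1.33° and 0.86°, and the 40 sequences add up to the 600 detection trials described in the text.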

#### Data Analysis

We used R (R Core Team, 2016) and lme4 (Bates et al., 2015) to perform generalized linear mixed effects (GLME) analyses. When the dependent variable was dichotomous (accuracy), we used logistic regression analyses. Both for the fixed and random effects, the chi-square statistics and the corresponding p-values were obtained with the likelihood ratio test, in which the full model was compared with the model without the effect under test. All the results were controlled for age, gender, and handedness.
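To make the last step of a likelihood ratio test concrete, here is a minimal Python sketch. It does not refit the authors' lme4 models; it only shows how two model log-likelihoods yield a chi-square statistic and, for a single dropped parameter (1 degree of freedom), a p-value. The log-likelihood values are hypothetical and chosen only for illustration.

```python
import math


def lr_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), so no external library is needed.
    """
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical log-likelihoods of a full model and a model without one effect
stat = lr_statistic(-1000.00, -1001.93)
print(round(stat, 2), round(chi2_sf_1df(stat), 3))
```

The closed form used here only holds for 1 degree of freedom, which matches the single-effect comparisons reported in this study; tests that drop several parameters at once would need the general chi-square distribution.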

# Results

First, to investigate the performance on the WM task, we entered the accuracies into a GLME model with a random intercept across participants and WM load as a fixed effects predictor. There was a significant main effect of Load, χ<sup>2</sup>(1, N = 20) = 15.87, p < 0.001. Accuracy in recalling the WM sequence was higher in the low WM load condition (M = 96.5%, SD = 4.9%) compared to the high WM load condition (M = 84.3%, SD = 12.8%), indicating that the load manipulation was successfully implemented. Secondly, participants performed well on the detection task with an average of 97% correctly detected targets. The error rate (omissions and false alarms) was similar for the low load (2.9%) and for the high load (3.1%) condition.

To investigate the influence of the WM load manipulation on the detection task, the RTs on the detection task were entered in a GLME model, with a random intercept per participant, a random slope for Load and for Position (left versus right) and, as fixed effect predictors, Block and the interaction between Load and Position. Error trials (3.06% of the data) and trials with RTs below 100 ms (0.6% of the data) were excluded from further analysis. Additionally, the trials in which no target appeared were left out from further analysis. Finally, the detection trials during an incorrect WM trial were also omitted from further analysis, because it is impossible to determine whether a WM load was present during those detection trials. There was a significant main effect of Load, χ<sup>2</sup>(1, N = 20) = 4.47, p = 0.034. RTs were slower under high WM load (M = 307.7 ms, SD = 88.3) compared to low WM load (M = 297.3 ms, SD = 79.1), which again confirms that the WM load manipulation was successful and that its effect was present during the detection task. We found no main effect of Position, χ<sup>2</sup>(1, N = 20) = 0.69, p = 0.41. Next, there was a significant main effect of Block, χ<sup>2</sup>(1, N = 20) = 14.73, p < 0.001. Participants became faster toward the end of the experiment, indicating a learning effect. When looking at the data in more detail, we found a significant interaction between Load and Block, χ<sup>2</sup>(1, N = 20) = 6.28, p = 0.012, with the effect of load disappearing in the last blocks. We reasoned that this might explain why the interaction between Load and Position was not significant, χ<sup>2</sup>(1, N = 20) = 3.14, p = 0.07. As illustrated in **Figure 2**, the main effect of Load disappeared during the last two blocks, in which participants reacted equally fast to targets during low WM load trials as during high WM load trials. As we cannot be sure that the load manipulation was still effective throughout those last two blocks, we repeated the same analyses without the last two blocks. All analyses revealed very similar results, and the interaction between Load and Position turned out to be significant, χ<sup>2</sup>(1, N = 20) = 3.86, p = 0.049. The slope for the main effect of Load was steeper for targets appearing on the left than on the right (**Figure 3**). This shows that the difference in processing stimuli under low versus high load is larger for stimuli in the left hemifield compared to stimuli in the right hemifield. If this interaction is indeed a consequence of a high WM load, we expect to find a correlation across subjects between the size of the load effect and the size of the interaction between load and position. Therefore, we correlated the beta values (from the GLME model) of load with those of the interaction for each participant. The results show that participants who experienced a bigger impact of load also show a stronger interaction, r(18) = −0.64, p = 0.002 (**Figure 4**).
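The individual-level correlation reported above relates two per-participant coefficients. The Python sketch below shows how a Pearson correlation is computed and how a correlation with a given number of degrees of freedom converts to a t statistic. The coefficient values in `load_beta` and `interaction_beta` are hypothetical placeholders, not the study's data, and the function names are assumptions for this example.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, df):
    """t statistic for testing a correlation r against zero, with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


# Hypothetical per-participant coefficients, for illustration only
load_beta = [4.0, 9.5, 12.1, 15.3, 20.2, 7.7]
interaction_beta = [-1.0, -2.2, -4.1, -3.9, -6.0, -2.5]
print(round(pearson_r(load_beta, interaction_beta), 2))

# t statistic implied by the reported r(18) = -0.64
print(round(r_to_t(-0.64, 18), 2))
```

The sign of such a correlation depends on how the interaction term is coded in the model, so a negative r is compatible with the verbal description that a larger load effect goes with a stronger interaction.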

#### Discussion

Experiment 1 investigated whether cognitive load influences spatial processing differently for stimuli in the left or the right hemispace. We found that adding WM load had a greater effect on detection speed for a target presented on the left compared to the right side of a monitor. The interaction between load and position was significant as long as an effect of load was present. The disappearance of the load effect in the last two blocks might be due to learning, although we would have expected a gradual difference across blocks instead of a sudden drop in reaction times.

This pattern of results is suggestive of an asymmetrical attentional bias with a relative disadvantage for the left vs the right hemispace as a result of a high cognitive load. These results could in principle be explained by hand compatibility, as left hemisphere lateralized motor activity could imply an increased sensitivity for the right hemispace. Participants were instructed to respond with their dominant hand, which was mostly the right hand (except for two left-handed participants). Thus, a facilitation effect could be present for right-sided targets due to responses being performed with the right hand rather than to effector-independent visuospatial asymmetries. To explore this alternative explanation, we conducted a second, almost identical, experiment. The only difference consisted in the use of the left (non-dominant) hand for responding. Finding a larger impact of load for targets on the right would be in favor of the response compatibility hypothesis. On the other hand, if we again found a larger impact of load on left-sided targets, it would support our original hypothesis about asymmetrical spatial processing under cognitive load.

# EXPERIMENT 2

#### Materials and Methods

Twenty participants (all university students, three males, M = 24.25 years, SD = 4.10) gave informed consent and were paid €10 to participate. None of the participants were aware of the purpose of the experiment. All participants were right-handed (based on self-report) and all had normal or corrected-to-normal vision. The method was identical to that of Experiment 1, with the exception that participants had to respond with their nondominant hand during the detection task. The data analyses and model building were also identical.

FIGURE 3 | Average RTs as a function of target position and load (Experiment 1). The left panel represents all blocks, the one on the right only those where we could ensure that the WM manipulation was effective. The interaction between WM load and target position only becomes significant (from p = 0.07 to p = 0.049) when excluding the last two blocks from analysis. The effect of load is larger for stimuli presented in the left vs the right hemispace.

#### Results

To investigate the performance on the WM task, we entered the accuracies into a GLME model with a random intercept across participants and WM load as a fixed effects predictor. There was a significant main effect of Load, χ<sup>2</sup>(1, N = 20) = 21.4, p < 0.001. Accuracy in recalling the WM sequence was higher in the low WM load condition (M = 99%, SD = 3.5%) compared to the high WM load condition (M = 82.75%, SD = 20.4%), indicating that the load manipulation was successfully implemented.

Participants correctly detected 98.6% of targets. The error rate was similar for the low load (1.3%) and for the high load (1.6%) condition. Together with error trials, trials with RTs below 100 ms (1.7% of the data) were excluded from further analysis. As for Experiment 1, the no-target trials and the detection trials during an incorrect WM trial were omitted from further analysis. To investigate the influence of the WM load manipulation on the detection task, the RTs on the detection task were entered in a GLME model, with a random intercept per participant, a random slope for Load and for Position and, as fixed effect predictors, Block and the interaction between Load and Position. A significant main effect of Load emerged, χ<sup>2</sup>(1, N = 20) = 7.96, p = 0.005. RTs were slower under high WM load (M = 301.7 ms, SD = 87.4) compared to low WM load (M = 291.4 ms, SD = 71.4), which again confirms that the WM load manipulation was successful and that its effect was present during the detection task. Position showed a trend toward a response compatibility-like effect with faster responses to left-sided targets, χ<sup>2</sup>(1, N = 20) = 3.26, p = 0.07. Next, there was a significant main effect of Block, χ<sup>2</sup>(1, N = 20) = 71.37, p < 0.001 (**Figure 5**). Participants became faster toward the end of the experiment, indicating a learning effect. There was no significant interaction between Load and Block, χ<sup>2</sup>(1, N = 20) = 1.67, p = 0.196. There was a significant interaction between Load and Position, χ<sup>2</sup>(1, N = 20) = 10.25, p = 0.001. As illustrated in **Figure 6**, Load has a bigger impact on the reaction times to targets appearing on the left than on the right. We also looked at the correlation between the size of the load effect and the size of the interaction at the individual level. We correlated the individual beta values (from the GLME model) of load with those of the interaction between load and position. The correlation was significant, r(18) = −0.68, p < 0.001. We then checked for the presence of outliers. Only one participant could be considered an outlier (>2.5 SD). Excluding this participant did not substantially change the results, r(17) = −0.58, p = 0.005 (**Figure 7**).

As an extra analysis, we combined the data of both experiments. Our aim was to investigate the effect of the between-subject variable Hand Response (Experiment 1 and Experiment 2) on the interaction between Load and Position. To do so, we entered the RTs of the two detection tasks into a GLME model with a random intercept across participants and a three-way interaction between Load, Position, and Hand Response as fixed effects predictors. While the interaction between Load and Position remained significant, χ<sup>2</sup>(1, N = 40) = 14.01, p < 0.001, there was no hint of a significant three-way interaction, χ<sup>2</sup>(1, N = 40) = 0.10, p > 0.05.
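The outlier check mentioned above (a participant more than 2.5 SD from the mean) can be sketched as follows. The text does not state which measure the criterion was applied to or whether the sample or population standard deviation was used; this Python example assumes the sample standard deviation and uses made-up values, and the function name `flag_outliers` is hypothetical.

```python
import statistics


def flag_outliers(values, threshold=2.5):
    """Return indices of values lying more than `threshold` SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [i for i, v in enumerate(values) if abs(v - mean) / sd > threshold]


# Made-up scores for 20 participants, one of them clearly extreme
scores = [0.0] * 19 + [10.0]
print(flag_outliers(scores))
```

With a criterion of this kind, the correlation would then be recomputed on the remaining participants, which reduces the degrees of freedom by one per excluded participant.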

#### Discussion

In Experiment 2, participants responded using their nondominant hand (left). We again found that cognitive load had a greater effect on the stimuli presented on the left versus the right side. When comparing **Figures 3**, **5**, one may observe that in the first experiment, participants were faster at the ipsilateral location only in the high load condition while in the second experiment this finding was present for the low load condition. To evaluate the effect of hand response on these effects, we analyzed the data of the two experiments combined. The interaction we found between the position of the target and the WM load was not influenced by the hand performing the response. It is therefore possible to conclude that the difference present across experiments is purely an additive effect. A tentative explanation for this additive effect could be based on hemispheric activation. The biggest advantage emerges in the low WM load condition where targets appear on the left and participants have to respond with their left hand. Every aspect of this condition is primarily processed by the right hemisphere, which is dominant for some aspects of spatial attention (Kinsbourne, 1970a). This hemisphere-driven effect could lead to the observed faster reaction times. The conceptual replication of the findings of Experiment 1 in Experiment 2 provides additional support for the hypothesis of an attentional origin for asymmetric spatial processing under cognitive load.

#### GENERAL DISCUSSION

The aim of this study was to investigate how spatial attention would be affected by different levels of cognitive load. We hypothesized that the processing of information presented within the left hemispace would suffer more from high load compared to information within the right hemispace. We tested this hypothesis by manipulating the verbal WM load via a letter recall task to then measure spatial attention effectiveness via a detection task. Participants first had to memorize a sequence of letters and while their WM was loaded, they performed a detection task in which they had to detect briefly presented targets appearing either on the left or on the right side. In the first experiment, we found that high WM load slowed down the detection of left-sided targets more than the detection of right-sided ones, which corresponded with our hypothesis. To exclude an alternative explanation of hand response, we conducted a second experiment with a different response mapping. While in the first experiment participants had to respond with their dominant hand, we asked the participants of the second experiment to respond with their non-dominant hand. Also in the second experiment, results showed a significant interaction between the WM load condition and the target position with a more evident effect for processing information in the left hemispace. We analyzed the data of the two experiments together and found no influence of hand response on the described interaction between the position of the target and the amount of WM load present. In the two experiments, our hypothesis is further corroborated by the fact that the individual strength of the load effect correlated with the individual degree of asymmetry in spatial processing.

We point out that there are individual differences we cannot control for and which might affect performance. For example, someone's anxiety level can influence the effect of perceptual load (Moriya and Tanno, 2010; Sadeh and Bredemeier, 2011), and this effect can also be shaped by many other inter- and intra-individual differences (Shipstead et al., 2012; Marciano and Yeshurun, 2017). An advantage of this study compared to some other load studies (e.g., Lavie, 1995) is that we have an independent measurement of the load manipulation and can correlate it at the individual level with the spatial processing effect we are interested in. In the current study we were interested in the effect of cognitive load on the symmetry of spatial processing. We manipulated the load level by a verbal WM task in which either two or six letters had to be remembered. While memorizing six letters can generally be regarded as more difficult and thus as inducing a high load, there is considerable variation between individuals in how difficult the high load condition is. This inter-individual difference is reflected both in the accuracy score on the WM task itself and in the reaction times during the detection task. To be sure that the asymmetrical spatial processing we observed could be attributed to the presence of high cognitive load, we expected that, at the individual level, the size of the load effect would be related to the size of the asymmetry. That was exactly what we found when we correlated the effect of load on reaction time in the detection task with the interaction between load and position.
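The individual-level correlation described above can be illustrated with a minimal sketch. It assumes that each participant is summarized by four mean detection reaction times (low/high load by left/right target) and that the asymmetry is expressed as the difference between the load cost for left and for right targets. The function names and the three example participants are hypothetical and are not data from the study.

```python
from math import sqrt

def load_effect(rt):
    """Main effect of load on detection RT: mean(high load) - mean(low load)."""
    high = (rt[("high", "left")] + rt[("high", "right")]) / 2
    low = (rt[("low", "left")] + rt[("low", "right")]) / 2
    return high - low

def asymmetry_interaction(rt):
    """Load x position contrast: load cost for left targets minus load cost for right targets."""
    left_cost = rt[("high", "left")] - rt[("low", "left")]
    right_cost = rt[("high", "right")] - rt[("low", "right")]
    return left_cost - right_cost

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical mean RTs (ms) for three illustrative participants (not study data).
participants = [
    {("low", "left"): 400, ("low", "right"): 400, ("high", "left"): 420, ("high", "right"): 410},
    {("low", "left"): 380, ("low", "right"): 385, ("high", "left"): 430, ("high", "right"): 405},
    {("low", "left"): 410, ("low", "right"): 405, ("high", "left"): 500, ("high", "right"): 445},
]

effects = [load_effect(p) for p in participants]
contrasts = [asymmetry_interaction(p) for p in participants]
r = pearson_r(effects, contrasts)
```

In this toy example, participants with a larger overall load cost also show a larger left-minus-right load cost, so `r` is strongly positive.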

Majerus et al. (2012) showed TPJ deactivation under conditions of high cognitive load. The asymmetrical spatial processing we described is compatible with the possibility of right TPJ suppression under load (Shulman et al., 2010). We can thus state that our behavioral results are in accordance with existing fMRI evidence of an interaction between spatial attention and WM plausibly occurring in the TPJ (Anticevic et al., 2010).

From a model perspective, to explain the mismatch between neural and functional impairment in neglect patients, Corbetta and Shulman (2011) suggest a modulation of the ventral attention network (VAN) on the dorsal attention network (DAN). We propose that this modulation of the VAN on the DAN could be mediated by cognitive load. Keeping in mind that the TPJ, part of the VAN, is suggested to be dominant in the right hemisphere, load-dependent de-activation in the (right-lateralized) VAN should translate – at the behavioral level – to worse spatial processing in the left hemispace compared to the right hemispace. This is compatible with the prominent disturbances in contralesional spatial processing right hemisphere damaged patients show. This also reflects the results found in this study with healthy participants where the load manipulation was clearly WM oriented. To gain more insight into the underlying neural mechanisms responsible for the current findings, it would be interesting to perform an fMRI study with a similar behavioral design.

Our results are also in line with studies looking at the effect of perceptual load on multisensory spatial processing (Chen and Spence, 2016), although in those studies it was not possible to disentangle the perceptual and cognitive load from each other. This makes it difficult to attribute the asymmetric effect to one specific type of load. In our experiment we deliberately chose to use a verbal WM load to avoid any perceptual load influence. After having ensured the cognitive nature of the load manipulation, the question becomes whether the current findings are specifically related to the verbal nature of the WM load. Memorizing a sequence of letters has a verbal nature and will consequently activate our left hemisphere more because of its dominance for language. Spreading of activation within the left hemisphere could offer an alternative explanation for why left-sided targets are processed slower compared to right-sided ones (Tokimura et al., 1996; Seyal et al., 1999; Meister et al., 2003). This reasoning expands Kinsbourne's (1970a,b) findings concerning significant asymmetries in the performance on a visual task in favor of the right side in the presence of verbal load as opposed to no load. However, one could argue that both load conditions are of a verbal nature, and the corresponding verbal components should not differentially interact with the spatial component and thus it should not confound the results. Of course one could still argue that a high WM condition is more demanding and might activate language more than the low load condition. This question remains open for future investigation.

Given our underlying neural hypothesis, we might prefer to frame our findings as reflecting a disadvantage for the left hemispace rather than a rightward facilitation. One way to empirically differentiate between an advantage for the right and a disadvantage for the left hemispace would be to add central targets and use them as a reference point against which the reaction times for left and right targets can be compared.

# CONCLUSION

We investigated the effect of cognitive load on spatial processing in the left versus the right hemispace. We observed that verbal WM load had a different impact on the two hemispaces: load affected the left hemispace more than the right. A correlation between the load effect and the interaction at the individual level further supported our hypothesis that there is an attentional origin for the asymmetric spatial processing we observed. Further research could test whether the effect is specific to the verbal nature of the WM load.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Ethical Committee of the Faculty of Psychology and Educational Sciences of UGhent with written informed consent from all subjects in accordance with the Declaration of Helsinki. The protocol was approved by the Ethical Committee of the Faculty of Psychology and Educational Sciences of UGhent.

# AUTHOR CONTRIBUTIONS

LN and WF developed the study concept. LN programmed the experiment, collected the data, and performed the data analysis. LN, WF, and MB drafted the manuscript. WF and MB provided critical feedback on different versions of the manuscript. All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# FUNDING

This work is supported by grant BOF.DOC.2016.0043.01 by BOF and GOA 01G01108 of Ghent University. MB was funded by an FP7 Marie Curie individual Fellowship "SPACE Load," Project ID: 625378.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Naert, Bonato and Fias. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Physiological Measures of Dopaminergic and Noradrenergic Activity During Attentional Set Shifting and Reversal

Péter Pajkossy<sup>1,2</sup>\*, Ágnes Szőllősi<sup>1,2</sup>, Gyula Demeter<sup>1,2,3</sup> and Mihály Racsmány<sup>1,2</sup>

<sup>1</sup> Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary, <sup>2</sup> Department of Cognitive Science, Budapest University of Technology and Economics, Budapest, Hungary, <sup>3</sup> Rehabilitation Department of Brain Injuries, National Institute of Medical Rehabilitation, Budapest, Hungary

Dopamine (DA) and noradrenaline (NA) are important neurotransmitters, which are suggested to play a vital role in modulating the neural circuitry involved in the executive control of cognition. One way to investigate the functions of these neurotransmitter systems is to assess physiological indices of DA and NA transmission. Here we examined how variations of spontaneous eye-blink rate and pupil size, as indirect measures of DA and NA activity, respectively, are related to performance in a hallmark aspect of executive control: attentional set shifting. We used the Intra/Extradimensional Set Shifting Task, where participants have to choose between different compound stimuli while the stimulus-reward contingencies change periodically. During such rule shifts, participants have to refresh their attentional set while they reassess which stimulus-features are relevant. We found that both eye-blink rate (EBR) and pupil size increased after rule shifts, when explorative processes are required to establish stimulus–reward contingencies. Furthermore, baseline pupil size was related to performance during the most difficult, extradimensional set shifting stage, whereas baseline EBR was associated with task performance prior to this stage. Our results support a range of neurobiological models suggesting that the activity of DA and NA neurotransmitter systems determines individual differences in executive functions (EF), possibly by regulating neurotransmission in prefrontal circuits. We also suggest that assessing specific, easily accessible indirect physiological markers, such as pupil size and blink rate, contributes to the comprehension of the relationship between neurotransmitter systems and EF.

Keywords: pupil size, eye-blink rate, attentional set shifting, executive functions, dopamine, noradrenaline

# INTRODUCTION

The adaptive control of behavior and information processing encompasses the flexible shift of attentional focus, updating of relevant information and the inhibition of irrelevant information (Miyake et al., 2000). This broad set of control functions, often referred to as executive functions (EF), evolves from the interaction of complex brain networks (Robbins and Rogers, 2000). A wealth

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Stephen B. R. E. Brown, Leiden University, Netherlands Anita Cybulska-Klosowicz, Nencki Institute of Experimental Biology (PAS), Poland

> \*Correspondence: Péter Pajkossy ppajkossy@cogsci.bme.hu

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 07 February 2018 Accepted: 26 March 2018 Published: 11 April 2018

#### Citation:

Pajkossy P, Szőllősi Á, Demeter G and Racsmány M (2018) Physiological Measures of Dopaminergic and Noradrenergic Activity During Attentional Set Shifting and Reversal. Front. Psychol. 9:506. doi: 10.3389/fpsyg.2018.00506


of research suggests that areas in the prefrontal cortex and their interactions with subcortical networks play the most vital role in implementing executive control functions (Alexander et al., 1986; Smith and Jonides, 1999; Frank et al., 2001; Miller and Cohen, 2001; Chudasama and Robbins, 2006). Importantly, the modulatory effect of subcortical areas on the prefrontal cortex is thought to be implemented through neurotransmitter systems which influence neural processing in several areas of the prefrontal cortex (Doya, 2008; Robbins and Arnsten, 2009). In this paper, we aimed to focus on two neurotransmitters, dopamine (DA) and noradrenaline (NA), to show how within- and between-subject variability of DA and NA levels, measured by physiological markers, is related to attentional set shifting, a specific component of EF.

Attentional set can be defined as a class or dimension of environmental features, which is considered as being task-relevant and is attended by the individual (Owen et al., 1993; Heisler et al., 2015). The ability to switch between attentional sets, as a response to current task demands, is an important aspect of cognitive flexibility. This executive function is often measured by the Wisconsin Card Sorting Task (henceforth WCST; Berg, 1948; Heaton et al., 1993), and by the Intra/Extradimensional Set Shifting Task (henceforth IEDT; Downes et al., 1989; Owen et al., 1992). It is widely shown that successful performance in these tasks is associated with activations in the prefrontal cortex (see e.g., Milner, 1963; Janowsky et al., 1989; Grafman et al., 1990; Dias et al., 1996; Hampshire and Owen, 2006; Nyhus and Barceló, 2009). Furthermore, a substantial body of neuropsychological evidence revealed that attentional set shifting is impaired in several psychiatric and neurological conditions (schizophrenia: Heinrichs and Zakzanis, 1998; Jazbec et al., 2007; Reichenberg and Harvey, 2007; Pantelis et al., 2009; obsessive–compulsive disorder: Roh et al., 2005; Chamberlain et al., 2006; Demeter et al., 2013; Parkinson's disease: Owen et al., 1992; Kudlicka et al., 2011).

In both the WCST and the IEDT, participants have to choose between complex stimuli characterized by distinct stimulus dimensions (e.g., in the IEDT, figures with different shapes overlaid by lines with different curvature). Only one feature of one stimulus-dimension is rewarded (e.g., rectangle shape), and the task is to find out the rewarded feature through trial and error learning. In doing so, participants have to specify which stimulus dimension they attend to, thereby creating an attentional set. Importantly, after participants have managed to figure out the stimulus–reward contingency, as indicated by consecutive correct choices, the reward rule changes: the rewarded stimulus becomes either another feature from the same dimension (e.g., a different figure), or a feature from another dimension (e.g., one of the lines). In the former case, reversal learning is required: participants have to ignore the previously rewarded stimulus and turn to a previously non-rewarded one. In the latter case, participants have to reassess which dimension they attend to – this process is termed attentional set shifting.

Two conceptually distinctive phases can be identified in both the IEDT and the WCST. First, participants have to figure out the rewarded feature through trial and error learning based on feedback received for previous choices. After the rule changes, they have to explore what the new rewarded feature is. This can be termed the explorative phase of the task. Second, after the stimulus–reward contingencies are identified, participants have to continuously choose the correct response option based on the established attentional set. That is, participants have to exploit the acquired knowledge to choose the correct option. This can be labeled as the exploitative phase of the task.

Importantly, both NA and DA transmission are linked to exploration and exploitation. NA is released throughout the cortex, and this NA transmission originates almost exclusively from the brain stem nucleus locus coeruleus – often termed the LC/NA system (Aston-Jones et al., 1999; Aston-Jones and Cohen, 2005). The adaptive gain theory, proposed by Aston-Jones and Cohen (2005), differentiates between a phasic and a tonic mode of LC function. The phasic mode is associated with a moderate tonic firing level and with strong task-related phasic bursts of LC activity, which serve to coordinate cortical networks in order to facilitate task-relevant responses. In contrast, the tonic mode, associated with exploration and task-disengagement, is characterized by a high tonic firing level and by the absence of clear phasic bursts. Partly similar theories also suggest that phasic NA transmission is related to the coordination of task-relevant networks (Bouret and Sara, 2005), whereas tonic NA firing represents the level of unexpected uncertainty (Yu and Dayan, 2005), which in turn leads to explorative behavioral tendencies.

The functions attributed to DA transmission are also relevant for the regulation of explorative and exploitative behavioral tendencies. Specifically, reinforcement learning required for establishing the correct stimulus–reward contingency is suggested to be dependent on midbrain DA neurons. These neurons are thought to code reward prediction error, that is, the difference between the expected and the experienced reward (Schultz et al., 1997; Glimcher, 2011). In current theoretical models, these low-level features of individual DA neurons underlie the regulatory function of the DA neurotransmitter system. Through different paths involving different DA receptors, DA might regulate the balance between stability and flexibility of cortical representations (Frank and O'Reilly, 2006; Cools and D'Esposito, 2011; Maia and Frank, 2011). Furthermore, it is also suggested that tonic changes in striatal DA outflow might contribute to the regulation of the trade-off between exploration and exploitation (Frank et al., 2009; Beeler et al., 2010; Humphries et al., 2012). Interestingly, in some computational models, it was proposed that the striatal DA system, which is involved in reinforcement learning, might interact with the LC/NA system to determine the shift between exploitation and exploration (McClure et al., 2006; Frank et al., 2007).
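To make the notion of reward prediction error concrete, the following minimal sketch uses a textbook delta-rule update, in which the prediction error is the difference between the experienced and the expected reward. This is a generic simplification chosen for illustration and is not the specific model of any of the studies cited above. The function names, the learning rate, and the reward sequence are hypothetical.

```python
def delta_rule_update(value, reward, learning_rate=0.2):
    """One delta-rule step: return (prediction_error, updated_value)."""
    prediction_error = reward - value          # experienced minus expected reward
    updated_value = value + learning_rate * prediction_error
    return prediction_error, updated_value

def simulate(rewards, learning_rate=0.2, initial_value=0.0):
    """Apply the delta rule over a reward sequence; return (errors, final_value)."""
    value = initial_value
    errors = []
    for reward in rewards:
        error, value = delta_rule_update(value, reward, learning_rate)
        errors.append(error)
    return errors, value

# Hypothetical sequence: a feature is rewarded for 10 trials, then the rule
# shifts and the same choice is no longer rewarded.
rewards = [1.0] * 10 + [0.0] * 5
errors, final_value = simulate(rewards)
```

In this toy sequence the prediction error shrinks while the contingency is stable and turns negative on the first trial after the rule shift, which is the kind of signal that could prompt renewed exploration.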

Based on the above, both NA and DA transmission can contribute to attentional set shifting through the above-described explorative or exploitative processes, respectively. Accordingly, supporting evidence for a link between attentional set shifting and NA/DA transmission has already been reported. In rodent studies, the manipulation of both NA and DA transmission affects attentional set shifting (Lapiz and Morilak, 2006; Tait et al., 2007; McGaughy et al., 2008; Cybulska-Klosowicz et al., 2017a,b). Moreover, in human studies using the WCST, individual differences in task performance are associated with neurobiological substrates linked to DA transmission (e.g., Joober et al., 2002; Hsieh et al., 2010). These studies either manipulate the level of DA and NA before the task or observe the consequences of individual differences. In our study, we aimed to enrich the above evidence by using a method which enables us to track online changes in neurotransmitter level. Therefore, we assessed easily accessible physiological indices which indirectly measure the activity of both the NA and the DA system (for a similar approach, see Van Slooten et al., 2017). Investigating the link between such measures and task performance might shed light on factors which determine individual differences in attentional set shifting.

NA transmission was investigated by assessing pupil diameter, as recent results suggest that pupil size reflects the activity of the LC (Aston-Jones and Cohen, 2005; Murphy et al., 2014; Joshi et al., 2016). Sudden, task-evoked increase in the size of the pupil has been the subject of scientific inquiry for decades, in particular as a measure of mental effort or cognitive load (Kahneman and Beatty, 1966; Beatty, 1982). Such pupil dilation accompanies various types of cognitive processing and is associated with phasic bursts of the LC/NA system (Aston-Jones and Cohen, 2005; Gilzenrat et al., 2010). Recently, more enduring, slower pupil size changes during cognitive processing have also attracted attention (Jepma and Nieuwenhuis, 2011; Hayes and Petrov, 2015). Interestingly, pupil size might signal a third aspect of NA function: as revealed by a recent study (Tsukahara et al., 2016), individual variations in baseline pupil size, measured before the task begins, were associated with working memory performance. The authors found that participants with high working memory capacity had larger pupils than participants with low working memory capacity. They suggest that such a task-unrelated baseline measure of pupil size might be an index of large-scale brain network activity orchestrated by the LC/NA system.

As an indirect index of DA transmission, we assessed eye-blink frequency. Spontaneous eye-blink rate (EBR) is affected by DA agonists and antagonists (e.g., Blin et al., 1990; Cavanagh et al., 2014), and disorders characterized by atypical DA levels are associated with differences in EBR (decrease in Parkinson's disease: e.g., Karson et al., 1984; Bologna et al., 2012; increase in schizophrenia: e.g., Helms and Godwin, 1985; Swarztrauber and Fujikawa, 1998). Although the underlying mechanisms and characteristics of this link are still unclear, and the status of EBR as a biological marker of DA is disputed (e.g., van der Post et al., 2004; Tharp and Pickering, 2011), several lines of evidence point out that EBR might be a useful indirect index of striatal DA transmission (for reviews, see Jongkees and Colzato, 2016; Eckstein et al., 2017). It has been suggested that higher EBR might indicate a lower updating threshold for cortical representations, which then leads to flexibility in processing, but at the cost of distractibility (Jongkees and Colzato, 2016).

This baseline EBR<sup>1</sup> is usually measured under free viewing condition with no specific task instruction. When EBR is measured under a specific task, both increase and decrease of EBR can be observed, as compared to rest periods. Before and after eye-blinks, visual processing is suppressed (Manning et al., 1983; Stevenson et al., 1986), and tasks involving visual attention typically decrease EBR (Drew, 1951; Stern and Skelly, 1984). Furthermore, several tasks involving mental effort are associated with within-task EBR changes (Holland and Tarlow, 1975; Bentivoglio et al., 1997; De Jong and Merckelbach, 1990; Siegle et al., 2008; Oh et al., 2012). Although most of the above studies did not link EBR changes to DA, some recent studies showed that task-related changes in EBR might specifically signal changes in DA transmission (van Bochove et al., 2013; Peckham and Johnson, 2016; Rac-Lubashevsky et al., 2017). This points to the possibility that EBR is not only a baseline measure of DA transmission, but it is suitable for tracking within-task changes of DA level.

Altogether several lines of evidence indicate that pupillometry and the measurement of EBR are possible indirect indices of NA and DA levels, and are suitable measures for revealing how these neurotransmitters are involved in explorative and exploitative aspects of attentional set shifting performance. In a recent study (Pajkossy et al., 2017), we have demonstrated that there is a relationship between tonic pupil size and attentional set shifting. Participants performed eye-tracker adapted versions of both the IEDT and the WCST. We found that pretrial pupil size increased in the explorative phase of the tasks, whereas in the exploitative phase of the tasks, a steady decrease in pretrial pupil size was observed. In the present study, we aimed to replicate these findings. Furthermore, we predicted that baseline pupil size, measured in a similar way as in Tsukahara et al. (2016), would be related to task performance. In line with their findings, we predicted a positive correlation between baseline pupil size and task performance.

Regarding EBR and DA transmission, we aimed to test whether we could use EBR as an indirect measure of DA transmission during information processing (and not under passive viewing conditions). First, we tested whether average EBR during the task (i.e., baseline EBR) could be linked to task variables. Second, we examined whether EBR changes accompanied rule shifts. As EBR is suggested to index the balance between the stability and the flexibility of cortical representations, our predictions were similar to those for pupil size changes: a steady decrease during the exploitative phase, when positive feedback acts to maintain current representations, and a sudden increase during the explorative phase, when the flexible updating of cortical representations is required.

<sup>1</sup>We define baseline measure as an index which has a relative stability over time, whereas tonic changes are conceptualized as changes evolving over a relatively slow time scale, over seconds (in contrast to phasic changes, which evolve in the seconds-milliseconds time range). We follow hereby the terminology of the adaptive gain theory which differentiates between phasic and tonic change in LC firing patterns. Note, however, that different terminology can also be used: the measure termed baseline EBR by us is referred to as tonic EBR in Jongkees and Colzato (2016), whereas changes in EBR during a task is termed tonic change by us, but phasic change in Jongkees and Colzato.

# MATERIALS AND METHODS


# Participants

Participants were Hungarian undergraduate students, who received a monetary reward for their participation. We asked participants to refrain from consuming alcohol, caffeine, and nicotine one day prior to the experiment, because these substances might affect physiological variables, like EBR and pupil size (Holmqvist et al., 2011). Participants who did not comply with this instruction were not included in the study. The initial sample size was 60 participants. Two participants were excluded due to diagnosed neurological conditions, whereas a further five participants were excluded due to recording errors resulting in substantial loss of eye-tracking data. Furthermore, to ensure reliable blink detection, we took a rather conservative approach and excluded five participants due to low eye-tracker data quality. Thus, the final sample consisted of 48 participants (29 females; age range: 18–31 years, Mage = 22.0, SD = 2.4). All experiments were run between 10 a.m. and 5 p.m., as EBR is affected by the circadian rhythm with increased EBR in the evening (Barbato et al., 2000; Jongkees and Colzato, 2016).

# The IEDT Task

#### Structure of the Task

The most frequently used IEDT version is part of the Cambridge Neuropsychological Test Automated Battery (henceforth CANTAB), a widely used neuropsychological test battery (Fray et al., 1996). In this task, stimulus dimensions are spatially overlapping. We adapted this task to eye-tracking by spatially segregating the two stimulus dimensions (holes inside figures, see **Figure 1**), which enabled us to independently track attention regarding the two stimulus dimensions. In all other aspects, the task was identical to the IEDT used in the CANTAB. Importantly, to equate net luminance of the screen, all figures and holes had the same surface size.

In each trial, participants had to choose the correct stimulus from two compound stimuli with two stimulus dimensions – two large rectangular figures with holes inside (in the following, we will refer to the rectangular figures as large figures, whereas the holes inside will be labeled as small figures). Participants were instructed to use the feedback received for previous choices to figure out the reward– stimulus contingency. We also told them that after consecutive correct responses, the rule would be changed. This rule shift was not signaled to them, but could be figured out based on the feedback received (i.e., using the previously correct stimulus–reward contingency led to negative feedback after rule shift).

**FIGURE 1** | Abbreviations: CD1, Compound Discrimination 1; CD2, Compound Discrimination 2; CDR, Compound Discrimination Reversal; ID, Intradimensional Set Shifting; IDR, Intradimensional Set Shifting Reversal; ED, Extradimensional Set Shifting; EDR, Extradimensional Set Shifting Reversal.

The task consisted of nine stages. In some of the stages, only the stimulus–reward contingency changed, whereas in other stages, new stimuli were also introduced. The same stimulus-exemplars were shown in all trials of a specific stage (i.e., the same two large and small figures), but their pairing varied randomly (i.e., both small figures could be presented on the surface of both large figures). The only constraint was that the same pairing could be presented at most five times consecutively. In each stage, only one stimulus-exemplar was rewarded (e.g., one large figure). Participants advanced through the stages by figuring out the correct stimulus–reward contingency, and by always choosing the correct stimuli (i.e., the compound stimuli which included the rewarded stimulus-exemplar). Six consecutive correct responses triggered a rule shift and the start of the next stage. If this criterion was not achieved after 50 trials, then the task was terminated.
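The stage-advancement rule can be sketched as follows. This is a minimal illustration assuming that responses within a stage arrive as a sequence of correct/incorrect outcomes. The function name, the return labels, and the handling of a sequence that ends before either outcome is reached are hypothetical choices made for the example, not details of the task software.

```python
CRITERION = 6     # consecutive correct responses needed to pass a stage
MAX_TRIALS = 50   # trials allowed per stage before the task is terminated

def run_stage(responses):
    """Consume an iterable of booleans (True = correct response).

    Return ("passed", n_trials) once the criterion is met,
    ("terminated", n_trials) if it is not met within MAX_TRIALS, or
    ("incomplete", n_trials) if the sequence ends before either happens.
    """
    streak = 0
    n_trials = 0
    for n_trials, correct in enumerate(responses, start=1):
        streak = streak + 1 if correct else 0   # an error resets the streak
        if streak == CRITERION:
            return "passed", n_trials
        if n_trials == MAX_TRIALS:
            return "terminated", n_trials
    return "incomplete", n_trials
```

For example, five correct responses followed by an error reset the count, so a participant needs six further correct responses to reach the criterion.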

During the nine stages, the stimuli and the rule shifts were varied systematically to test different aspects of cognitive processing (see **Figure 1**). In the first two stages, only the large shapes were presented. One of the large figures was randomly selected to be rewarded in the first stage (simple discrimination, SD), and the other large figure became rewarded in the second stage (simple discrimination reversal, SDR). This large figure remained the rewarded stimulus-exemplar during the next two stages (compound discrimination 1–2, CD1–CD2), where the two small figures were introduced gradually: in the third stage, the two stimulus-dimensions were presented in distinct areas of the screen, whereas in the fourth stage, they formed a compound stimulus – this arrangement was used in the later stages of the task. In the fifth stage, the large figure rewarded previously in the SD stage became the rewarded stimulus-exemplar again (compound discrimination reversal, CDR). Thus, the SDR and CDR stages constituted an example of reversal learning: a previously non-rewarded exemplar of a stimulus dimension became rewarded.

The more complex part of the task started in the sixth and seventh stage, where new large and small figures were introduced. In the sixth stage, one of the new large figures was randomly chosen to be rewarded (intradimensional set shifting, ID), whereas in stage 7, the other large figure was rewarded (ID reversal, IDR). Then, in the final two stages, a third set of large and small figures was introduced. Importantly, however, this time the small figures became the rewarded features – one of the new small figures was chosen to be rewarded in the eighth stage (extradimensional set shifting, ED), whereas the other small figure was rewarded in the ninth stage (ED reversal, EDR). This part of the task tested three different cognitive functions. First, in the IDR and EDR stages, reversal learning was again required. Second, in the ID stage, intradimensional set shifting was required: attention had to be directed to new exemplars of the previously attended stimulus-dimension. Third, the ED stage required extradimensional set shifting: attention had to be transferred to the exemplars of a previously unattended stimulus dimension.
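The nine-stage structure described above can be summarized in a small lookup table. The sketch below is only an organizational aid: the `Stage` record, its field names, and the `kind` labels are hypothetical, while the stage order, abbreviations, and rewarded dimension follow the description in the text.

```python
from collections import namedtuple

# Hypothetical record type; field names are chosen for this illustration.
Stage = namedtuple("Stage", ["number", "label", "rewarded_dimension", "kind"])

STAGES = [
    Stage(1, "SD",  "large", "simple discrimination"),
    Stage(2, "SDR", "large", "reversal"),
    Stage(3, "CD1", "large", "compound discrimination"),
    Stage(4, "CD2", "large", "compound discrimination"),
    Stage(5, "CDR", "large", "reversal"),
    Stage(6, "ID",  "large", "intradimensional shift"),
    Stage(7, "IDR", "large", "reversal"),
    Stage(8, "ED",  "small", "extradimensional shift"),
    Stage(9, "EDR", "small", "reversal"),
]

def stages_of_kind(kind):
    """Return the labels of all stages of a given kind, in task order."""
    return [s.label for s in STAGES if s.kind == kind]

reversal_stages = stages_of_kind("reversal")
small_figure_stages = [s.label for s in STAGES if s.rewarded_dimension == "small"]
```

Laid out this way, it is easy to see that the rewarded dimension changes only once, at the ED stage, whereas reversals occur four times across the task.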

#### Structure of a Trial

Each trial started with a 2.5-s fixation cross period, during which participants had to fixate a yellow fixation cross on a blue background. This was followed by the stimulus presentation period, when the two compound stimuli were presented on the left and the right side of the screen. The participants indicated their choice by clicking either the left or the right mouse button. After the response, a 0.5-s blank screen with blue background followed, and then feedback was given for 1 s. During the feedback period, the screen layout of the presentation period was shown again, but this time, a green or a red frame appeared around the compound stimuli, indicating correct or incorrect choice, respectively. Different sound signals were also associated with both the correct and the incorrect choices.
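The trial timeline can be written down as a short sketch. It assumes that the stimulus display stays on screen until the participant responds (the response is self-paced), which the description implies but does not state explicitly. The constant and function names are hypothetical.

```python
FIXATION_S = 2.5   # fixation cross period
BLANK_S = 0.5      # blank screen after the response
FEEDBACK_S = 1.0   # feedback period

def trial_events(response_time_s):
    """Return (event_name, onset_s) pairs relative to trial start.

    Assumes the stimuli remain visible until the response is given.
    """
    stimulus_onset = FIXATION_S
    response = stimulus_onset + response_time_s
    feedback_onset = response + BLANK_S
    trial_end = feedback_onset + FEEDBACK_S
    return [
        ("fixation", 0.0),
        ("stimulus", stimulus_onset),
        ("response", response),
        ("feedback", feedback_onset),
        ("trial_end", trial_end),
    ]

def trial_duration(response_time_s):
    """Total trial length in seconds for a given response time."""
    return trial_events(response_time_s)[-1][1]
```

Under these assumptions, the fixed part of every trial lasts 4 s, and the response time is the only variable component.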

#### Eye-Tracking

We used an SMI RED500 remote eye-tracker, with a sampling rate of 250 Hz. No chin rest was used, and data from both eyes were recorded.

#### Eye-Blink Data Preprocessing

Blink data were also derived from the eye-tracker data. We used the algorithm of the SMI BeGaze data processing software (SensoMotoric Instruments, Teltow, Germany) to detect eye-blinks, and the identified eye-blinks were then further processed using MATLAB (MathWorks, Inc., Natick, MA, United States). At the initiation of an eye-blink, the eye-lid occludes an increasing area of the pupil. Due to specifics of saccade and fixation detection, the algorithm detects a downward saccade at this time point. At the end of an eye-blink, the eye-lid is gradually lifted, and the pupil can be detected again, resulting in the detection of an upward saccade. Eye-blinks are then detected as periods without detectable pupil surrounded by a downward and an upward saccade.
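The core idea of treating periods without a detectable pupil as blink candidates can be illustrated with a simplified sketch. It is not the vendor algorithm: it ignores the flanking downward and upward saccades and only finds runs of missing pupil samples in a hypothetical per-sample validity signal, converting run length to milliseconds from the sampling rate.

```python
def missing_pupil_runs(pupil_detected, sampling_rate_hz=250):
    """Find runs of consecutive samples without a detectable pupil.

    pupil_detected: sequence of booleans, one per sample (True = pupil found).
    Returns a list of (start_index, end_index_exclusive, duration_ms) tuples.
    Simplification: the saccades flanking a blink are not checked here.
    """
    sample_ms = 1000 / sampling_rate_hz
    runs = []
    start = None
    for i, detected in enumerate(pupil_detected):
        if not detected and start is None:
            start = i                                   # a run of missing data begins
        elif detected and start is not None:
            runs.append((start, i, (i - start) * sample_ms))
            start = None
    if start is not None:                               # run lasts until the recording ends
        end = len(pupil_detected)
        runs.append((start, end, (end - start) * sample_ms))
    return runs

# Hypothetical signal: 25 missing samples at 250 Hz correspond to 100 ms.
example = [True] * 5 + [False] * 25 + [True] * 5
candidates = missing_pupil_runs(example)
```

At 250 Hz each sample spans 4 ms, so the temporal resolution of blink durations derived in this way is limited to multiples of 4 ms.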

This detection method is a more indirect way to detect eye-blinks than electrooculography or video-based methods, and is sensitive to eye-tracking data quality. Thus, data were carefully preprocessed and we took a conservative approach to avoid false detection of eye-blinks. First, we examined noise levels in the gaze data points used by the detection algorithm. During fixations, the gaze direction remains relatively stable, and thus variations in the reported gaze point can be attributed to a large extent to measurement noise. We therefore calculated the root mean square error (RMS error), a measure of variation, for each fixation and for each participant (Holmqvist et al., 2011). The median of the RMS error values was then computed for each participant. The sample mean of these median values was M = 0.1, SD = 0.1. This falls within the range of noise levels reported for remote eye-trackers (Holmqvist et al., 2011; Orquin and Holmqvist, 2017). Nevertheless, we excluded five participants whose median RMS error value exceeded the sample mean by more than two SDs. We used this rather strict criterion to ensure correct detection of eye-blinks. As a second measure ensuring reliable detection, we accepted data points as eye-blinks only when data from both eyes indicated the presence of an eye-blink.
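
The noise-based exclusion rule can be illustrated with a minimal Python sketch. This is not the original analysis code (which relied on BeGaze and MATLAB). It assumes that the RMS error is computed from sample-to-sample distances within a fixation and that the SD is the sample standard deviation; the function names `rms_s2s` and `noisy_participants` are hypothetical.

```python
import math
from statistics import mean, stdev

def rms_s2s(xs, ys):
    """RMS of sample-to-sample gaze distances within one fixation
    (assumed definition of the RMS error)."""
    pts = list(zip(xs, ys))
    d2 = [(x1 - x0) ** 2 + (y1 - y0) ** 2
          for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return math.sqrt(sum(d2) / len(d2))

def noisy_participants(median_rms, n_sd=2.0):
    """Return IDs whose median RMS error exceeds the sample mean
    by more than n_sd sample standard deviations."""
    values = list(median_rms.values())
    cutoff = mean(values) + n_sd * stdev(values)
    return sorted(p for p, v in median_rms.items() if v > cutoff)

# Hypothetical per-participant medians, for illustration only.
medians = {f"p{i}": 0.1 for i in range(9)}
medians["p9"] = 1.0
excluded = noisy_participants(medians)
```

In this sketch, each participant's median would be obtained by applying `rms_s2s` to every fixation and taking the median of the resulting values.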

Examination of individual eye-blink duration distributions revealed that only a few eye-blinks lasted less than 60 ms (0.5% of all blinks). These eye-blinks were labeled as measurement artifacts, and were discarded from further analysis. Following previous research (von Mühlenen et al., 2013), eye-blinks above 500 ms were also discarded (2.8% of all blinks).
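
The two plausibility checks (agreement between both eyes, and a duration between 60 and 500 ms) can be sketched as follows. The sketch assumes that blinks are available as (onset, offset) intervals in milliseconds for each eye and that a binocular blink is the overlap of a left-eye and a right-eye interval; both the data layout and the function names are illustrative assumptions rather than a description of the BeGaze output.

```python
def binocular_blinks(left, right):
    """Keep only the time spans in which both eyes report a blink."""
    out = []
    for l_on, l_off in left:
        for r_on, r_off in right:
            on, off = max(l_on, r_on), min(l_off, r_off)
            if on < off:  # the two intervals overlap
                out.append((on, off))
    return sorted(out)

def plausible_blinks(blinks, min_ms=60, max_ms=500):
    """Discard blinks shorter than min_ms or longer than max_ms."""
    return [(on, off) for on, off in blinks if min_ms <= off - on <= max_ms]

# Hypothetical detections (ms), for illustration only.
left_eye = [(0, 100), (1000, 1040), (2000, 2700)]
right_eye = [(10, 110), (1005, 1050), (2000, 2650)]
blinks = plausible_blinks(binocular_blinks(left_eye, right_eye))
```

In this example only the first blink survives: the second overlap is shorter than 60 ms and the third is longer than 500 ms.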

We computed descriptive statistics of the eye-blink data generated using the above preprocessing steps for the final sample. The sample mean for median eye-blink duration was 114.6 ms (SD = 21.0), with an interquartile range of 96 to 150 ms (computed for the final sample, N = 47). The mean eye-blink frequency during the task was 12.3 blinks per minute (SD = 9.3). The validity of our measurement is supported by the fact that these values are similar to previous findings for both eye-blink duration (Jandziol et al., 2001; von Mühlenen et al., 2013) and EBR (for a review, see Jongkees and Colzato, 2016).

#### Pupil Data Preprocessing


Noise in pupil data was also filtered out. During eye-blinks, the eye-lid occludes parts of the pupil and alters the measured pupil size. Thus, we identified eye-blinks in our data, as described in the previous section, and removed pupil data during eye-blinks. High-magnitude changes in pupil size before the start and after the end of eye-blinks were also removed from the data. Finally, segments of missing data points were also removed. These steps resulted in the removal of an average of 6.2% (SD = 4.9) of our data. Missing data points were replaced using linear interpolation. Thereafter, to filter high-frequency noise, we deleted those pupil size values in each data set that deviated from the mean of the data set by more than three SDs. Ten data points before and after such segments were also removed. On average, less than 1% of the data were removed this way. These missing data points were also replaced by linear interpolation. Finally, data were smoothed using a Savitzky–Golay filter (parameters: polynomial order: 2, frame size: 21).
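
The outlier-removal and interpolation steps can be sketched in Python as follows. This is a simplified illustration, not the original MATLAB pipeline: it assumes that missing samples are coded as `None`, that the SD is the sample standard deviation, and that each outlying sample is padded individually; the function names are hypothetical.

```python
from statistics import mean, stdev

def mask_outliers(samples, n_sd=3.0, pad=10):
    """Set to None every sample deviating from the mean by more than
    n_sd SDs, together with `pad` samples before and after it."""
    valid = [v for v in samples if v is not None]
    m, sd = mean(valid), stdev(valid)
    out = list(samples)
    for i, v in enumerate(samples):
        if v is not None and abs(v - m) > n_sd * sd:
            for j in range(max(0, i - pad), min(len(out), i + pad + 1)):
                out[j] = None
    return out

def interpolate_gaps(samples):
    """Fill runs of None by linear interpolation between the nearest
    valid neighbours; leading and trailing gaps are left untouched."""
    out = list(samples)
    known = [i for i, v in enumerate(out) if v is not None]
    for a, b in zip(known, known[1:]):
        for j in range(a + 1, b):
            out[j] = out[a] + (out[b] - out[a]) * (j - a) / (b - a)
    return out

# Hypothetical pupil trace with one spike, for illustration only.
trace = [5.0] * 40 + [50.0] + [5.0] * 40
cleaned = interpolate_gaps(mask_outliers(trace, pad=2))
```

The final smoothing step is not reproduced here; in Python it could be approximated with `scipy.signal.savgol_filter(cleaned, 21, 2)`, assuming SciPy is available.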

#### Statistical Analysis

The link between task performance and physiological measures was assessed from two different aspects. First, we investigated how individual differences in baseline values of pupil size and EBR are associated with task performance. Second, we also examined how EBR and pupil size change before and after rule shifts (i.e., during the exploitative and the explorative phase of the task). In both cases, an important aspect of the analyses concerned the ED stage; thus, six participants who failed to reach the ED stage were excluded from analysis. Furthermore, the first stage was considered a warm-up phase; thus, data from this stage were not included in any of these analyses.

#### Analysis of Individual Differences

We analyzed two important aspects of task performance. First, we computed the number of errors made between the SDR and the ID stages. This measure indexes the ability to learn from feedback in a task involving a set of changing complex stimuli, but involves no ED. In contrast, we also computed the errors made during the eighth stage, where ED is required. This measure specifically indexes the ability to disregard the attentional set that was relevant for the previous seven stages and to flexibly adopt a new attentional set.

To assess between-subjects variation in pupil size, before the start of the IEDT, we asked participants to fixate a fixation cross at the center of the screen for four seconds, and then we computed average pupil size during this period (a similar assessment was used in Tsukahara et al., 2016). To measure individual differences in EBR, we calculated blinking rate during the task: we divided the number of eye-blinks recorded from the SDR stage until the last stage by the elapsed time (in minutes).

The behavioral measures of task performance were not normally distributed due to the ceiling effect. Thus, we computed Spearman rank correlation to investigate the relationship between behavioral performance and physiological indices.
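
For illustration, a Spearman rank correlation can be computed as the Pearson correlation of the ranks, with tied values receiving their average rank. The following self-contained Python sketch uses hypothetical function names and makes no claim about the software used for the reported analyses.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation with average ranks for ties."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because error counts at ceiling produce many tied values, the average-rank treatment of ties matters for data of this kind.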

#### Analysis of Rule Shifts

To investigate changes during the task, we first computed EBR and pupil size values for each trial period separately (stimulus presentation, feedback, and fixation cross). EBRs were computed for each trial period by dividing the number of eye-blinks during the period by the trial period time in seconds (trial time was fixed for the feedback and fixation cross period, but varied during the stimulus presentation period). Note that this computation differs from the computation of baseline EBR, where the scale of time data was in minutes. Pupil size values for each trial period were computed by calculating the mean pupil size of that period.
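
The difference between the two time scales can be made explicit with a small Python sketch. It assumes that blink onsets and period boundaries are expressed in seconds and that a blink is assigned to the period in which it starts; the function name `blink_rate` and its `per` parameter are hypothetical.

```python
def blink_rate(blink_onsets, start, end, per=1.0):
    """Number of blinks starting in [start, end), expressed per `per`
    seconds (per=1 gives blinks/s, per=60 gives blinks/min)."""
    n = sum(1 for t in blink_onsets if start <= t < end)
    return n * per / (end - start)

# Hypothetical blink onsets (s), for illustration only.
onsets = [0.5, 1.0, 2.6, 30.0]
fixation_ebr = blink_rate(onsets, 0.0, 2.5)           # per second
baseline_ebr = blink_rate(onsets, 0.0, 60.0, per=60)  # per minute
```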

To analyze the transition between the exploitative and explorative phases, we focus predominantly on the fixation cross period. During stimulus presentation and feedback, EBR and pupil size might be influenced both by neurotransmitter levels and by visual features of the presented stimuli. During the fixation cross period, however, visual changes are absent; thus changes in these measures might be attributable to changes in neurotransmitter levels. Nevertheless, as fixation cross periods are embedded into a stimulus presentation–feedback–fixation cross cycle, the other two periods are also investigated. For all three periods, measurement points are categorized according to their position relative to a rule shift. We use a corresponding labeling throughout the article. For example, the first trial after a rule shift will be labeled using the denotation RS[+1] (RS standing for rule shift), whereas we will refer to the third trial preceding a rule shift using the denotation RS[–3]. The assignment of these labels around the rule shift is depicted in **Figure 2**. Note that the fixation cross period directly preceding the first stimulus presentation of a new stage counts as RS[–1], as participants have no information at that time point that the rule has changed.
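
The labeling scheme can be sketched in Python as follows. The sketch labels stimulus presentation–feedback trials from a list giving the stage of each trial; it assumes that a rule shift coincides with a change of stage and uses ASCII signs in the labels. The function name and arguments are hypothetical. For the fixation cross period, the same labels would be applied with an offset of one trial, because the fixation cross preceding the first stimulus presentation of a new stage counts as RS[–1].

```python
def label_rule_shifts(stages, n_before=6, n_after=3):
    """Map trial index -> list of labels relative to each rule shift.
    RS[-k]: k-th trial before the shift; RS[+k]: k-th trial after it."""
    labels = {}
    for s in range(1, len(stages)):
        if stages[s] == stages[s - 1]:
            continue  # no rule shift between trial s-1 and trial s
        for k in range(1, n_before + 1):
            i = s - k
            if i >= 0 and stages[i] == stages[s - 1]:
                labels.setdefault(i, []).append(f"RS[-{k}]")
        for k in range(1, n_after + 1):
            i = s + k - 1
            if i < len(stages) and stages[i] == stages[s]:
                labels.setdefault(i, []).append(f"RS[+{k}]")
    return labels

# Hypothetical stage sequence, for illustration only.
stages = ["SD"] * 3 + ["SDR"] * 2
labels = label_rule_shifts(stages)
```

A trial in a short stage can carry two labels (one relative to the preceding and one relative to the following rule shift), which is why each trial maps to a list.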

Values of pupil size and EBR are averaged across all rule shifts for different trial-types (e.g., RS[–1] or RS[+2]). For example, we averaged the values of pupil size for all RS[–1] trials across the different rule shifts to obtain an average RS[–1] value. Because the first stage was considered a warm-up phase, data from the first rule shift (between the SD and SDR stages) were not included in this calculation. Reversal and attentional set shifting constitute the two fundamental shift types; thus these average values were computed separately for the reversal stages (transition to the CDR, the IDR, and the EDR stage), and for the stages where attentional set shifting is required (transition to the ID and the ED stage). Note that during rule shifts requiring reversal, rule shifts can be detected based on negative feedback for responses based on the outdated stimulus–response contingency. In contrast, during rule shifts requiring attentional set shifting, the rule shift is signaled by the change in stimulus layout, and no feedback processing is required.

For the exploitative phase (i.e., before rule shift), we predicted a steady decrease for both EBR and pupil size. In the case of pupil size, this was analyzed using a repeated measures ANOVA with trial as an independent factor (RS[–6] to RS[–1]). The skewed distribution of EBR did not allow the use of ANOVA; thus we used its nonparametric variant, the Friedman ANOVA. At the beginning of the explorative phase, we predicted a sudden increase in both pupil size and EBR. In the case of pupil size, we compared RS[–1] and RS[+1] using a paired sample t-test, whereas for EBR, due to violation of the normality assumption, the Wilcoxon signed rank test was used.
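
As an illustration of the paired comparison, the following Python sketch computes the paired-sample t statistic together with a standardized effect size. It assumes that the effect size is the mean difference divided by the SD of the differences, so that d = t/√n; this convention is an assumption, although it appears to be consistent with the values reported in the Results (e.g., 10.99/√42 ≈ 1.70). The function name is hypothetical, and p-values are not computed.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df, d) for paired observations, where d is the mean
    difference divided by the sample SD of the differences."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    d = mean(diffs) / stdev(diffs)
    t = d * math.sqrt(n)
    return t, n - 1, d

# Hypothetical pupil sizes at RS[-1] and RS[+1], for illustration only.
rs_minus_1 = [1.0, 2.0, 3.0, 4.0]
rs_plus_1 = [2.0, 4.0, 5.0, 8.0]
t, df, d = paired_t(rs_minus_1, rs_plus_1)
```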

#### Confounding Factors

Both pupil size and EBR are physiological measures, which might be influenced by several factors (e.g., arousal level, health conditions, motivational factors). During debriefing, we assessed these factors: we asked participants about sleep hours and sleep quality. They also completed the Karolinska Sleepiness Scale (Åkerstedt and Gillberg, 1990) to assess their drowsiness and fatigue. Moreover, participants rated on a nine-point scale how much effort they exerted during the IEDT. We examined whether these factors are related to EBR or pupil size to reveal whether they should be considered as confounders. Five participants reported taking some form of medication (e.g., contraceptives or antibiotics), whereas 10 participants reported suffering from a minor cold. These factors were not considered as exclusion criteria; thus these participants were not excluded from our sample. Nevertheless, to rule out that our results are confounded by these factors, all the analyses described above were also rerun without these participants, to check whether the pattern of results changes when they are excluded.

# RESULTS

#### Behavioral Results

**Table 1** presents the mean number of trials required to pass each stage and the number of participants passing the stages. As expected, the most difficult part of the task was the ED stage. Here, on average, 21.3 trials (SD = 15.3) were required to pass the stage. In contrast, in the other stages, the mean trials to criterion were around 10.

#### Analysis of Individual Differences

Baseline pupil size was significantly and negatively correlated with the number of errors in the ED stage, r<sub>s</sub>(42) = −0.35, p = 0.02, whereas it was not related to errors preceding the ED stage, r<sub>s</sub>(42) = −0.02, p = 0.86 (see **Figures 3A,B**).

In contrast to pupil size, the EBR was not associated with errors during the ED stage, r<sub>s</sub>(42) = −0.02, p = 0.92, but was significantly and positively correlated with errors preceding the ED stage, r<sub>s</sub>(42) = 0.46, p = 0.002 (see also **Figures 3C,D**). This correlation might have been confounded by the fact that EBR was steadily increasing throughout the task, and thus EBR was also computed using only data from the beginning of the task, between the SDR and the CD2 stage. This alternative computation of EBR did not influence our results (correlation with errors before the ED stage: r<sub>s</sub>(42) = 0.37, p = 0.01; correlation with errors during the ED stage: r<sub>s</sub>(42) = −0.07, p = 0.66).

#### Analysis of Rule Shifts – Data During the Fixation Cross Period

**Figure 4** shows pupil size and EBR values before and during rule shifts. These values were measured during the fixation cross period, and thus are not confounded by effects related to visual features of the presented stimuli.

We found a significant decrease in pupil size during the exploitative phase (see **Figure 4A**). The main effect of trial was significant, F(2.91, 119.17) = 39.97, p < 0.001 (after Greenhouse–Geisser correction, epsilon = 0.58), η<sub>p</sub><sup>2</sup> = 0.49, as was the linear trend, F(1, 41) = 75.04, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.65. There was a significant increase in pupil size between RS[–1] and RS[+1], t(41) = 10.99, p < 0.001, d = 1.69 (see **Figure 4A**). This difference was also present when focusing on rule shifts with attentional set shifting, t(41) = 8.59, p < 0.001, d = 1.32, and also when examining rule shifts with reversals, t(41) = 9.49, p < 0.001, d = 1.47 (see **Figures 4B,C**).
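
For readers less familiar with the Greenhouse–Geisser correction, the non-integer degrees of freedom follow from multiplying both degrees of freedom of the repeated measures ANOVA by epsilon. The worked values below assume k = 6 trial levels (RS[–6] to RS[–1]) and n = 42 participants (inferred from the reported t(41)); the symbols k and n are introduced here for illustration only.

$$
df_1 = \varepsilon\,(k-1) \approx 0.58 \times 5 = 2.90, \qquad
df_2 = \varepsilon\,(k-1)(n-1) \approx 0.58 \times 5 \times 41 = 118.9
$$

These values are close to the reported F(2.91, 119.17); the small difference presumably reflects rounding of epsilon to two decimals.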

Regarding EBR, we did not find any difference during the trials of the exploitative phase, χ<sup>2</sup>(5) = 7.98, p = 0.16, and there was also no significant change between RS[–1] and RS[+1], Z = 1.52, p = 0.13, r = 0.16 (see **Figure 4D**). When restricting our data to rule shifts involving attentional set shifting, there was a significant increase in EBR after rule shift, Z = 2.75, p = 0.006, r = 0.30 (see **Figure 4E**). This was, however, not the case for the trials involving reversal, Z = 0.61, p = 0.55, r = 0.06 (see **Figure 4F**).

#### Analysis of Rule Shifts – Data During the Stimulus Presentation and Feedback Period

**Figures 5** and **6** show pupil size and EBR values computed for the stimulus presentation and the feedback period. Importantly, during these periods, both neurotransmitter levels and visual features might have determined pupil size and EBR.

TABLE 1 | Performance in the intra/extradimensional set shifting task


Trials to criterion: number of trials/choices required to complete a stage (values represent the mean values, standard deviations are shown in parentheses); Pass-Nr, number of participants successfully completing a stage; SD, Simple Discrimination; SDR, Simple Discrimination Reversal; CD1, Compound Discrimination 1; CD2, Compound Discrimination 2; CDR, Compound Discrimination Reversal; ID, Intradimensional Set Shifting; IDR, Intradimensional Set Shifting Reversal; ED, Extradimensional Set Shifting; EDR, Extradimensional Set Shifting Reversal; ∗∗p < 0.01; ∗∗∗p < 0.001, <sup>+</sup>p < 0.10.

FIGURE 3 | (A) No significant correlation between baseline pupil size and errors before the ED stage. (B) Significant negative correlation between baseline pupil size and errors during the ED stage. (C) Significant positive correlation between baseline eye-blink rate and errors before the ED stage. (D) No significant correlation between baseline eye-blink rate and errors during the ED stage.

#### Stimulus Presentation Period

A repeated measures ANOVA with trial as within-subject factor (RS[–6] to RS[–1]) showed a significant main effect of trial, F(3.26, 133.58) = 20.35, p < 0.001 (after Greenhouse–Geisser correction, epsilon = 0.65), η<sub>p</sub><sup>2</sup> = 0.33. The linear contrast was also significant, F(1, 41) = 39.03, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.49 (see **Figure 5A**). Contrary to our hypothesis, there was a significant decrease in pupil size between RS[–1] and RS[+1], t(30) = 2.31, p = 0.03, d = 0.35. Interestingly, however, the predicted increase in pupil size occurred during the next trial: pupil size during the RS[+2] trial was significantly larger than pupil size during the RS[+1] trial, t(41) = 7.97, p < 0.001, d = 1.23 (see **Figure 5A**). This lag might be explained by the fact that RS[+1] is the first time point where the participant can notice a rule shift, and thus changes in pupil size induced by this rule shift can only be observed at the next trial. The increase between RS[+1] and RS[+2] was significant both for trials with attentional set shifting, t(41) = 7.16, p < 0.001, d = 1.21, and for trials with reversals, t(41) = 3.81, p < 0.001, d = 0.59 (see **Figures 5B,C**).

The Friedman ANOVA suggested that EBR did not significantly change during the exploitative phase, χ<sup>2</sup>(5) = 9.01, p = 0.11, but there was a significant increase in EBR between RS[–1] and RS[+1], as indicated by the Wilcoxon signed rank test, Z = 2.29, p = 0.02, r = 0.24 (see **Figure 5D**). This increase was no longer significant when restricting our analysis either to rule shifts with attentional set shifting, Z = 1.78, p = 0.07, r = 0.19 (see **Figure 5E**), or to rule shifts involving reversals, Z = 0.17, p = 0.87, r = 0.01 (see **Figure 5F**).

#### Feedback Period

There was a significant decrease in pupil size during the exploitative phase, as evidenced by the significant main effect of trial, F(2.25, 92.27) = 49.08, p < 0.001 (after Greenhouse–Geisser correction, epsilon = 0.45), η<sub>p</sub><sup>2</sup> = 0.55, and by a significant linear trend, F(1, 41) = 70.88, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.634 (see **Figure 6A**). At the beginning of the explorative phase, the size of the pupil increased, as evidenced by a significant difference between RS[–1] and RS[+1], t(41) = 7.72, p < 0.001, d = 1.19 (see **Figure 6A**). This increase was evident for rule shifts with attentional set shifting, t(41) = 6.06, p < 0.001, d = 0.92 (see **Figure 6B**), and also for rule shifts with reversals, t(41) = 3.89, p < 0.001, d = 0.60 (see **Figure 6C**).

There was no significant difference in EBR during the exploitative phase, χ<sup>2</sup>(5) = 5.12, p = 0.39 (see **Figure 6D**). There was a non-significant tendency for a decrease in EBR between RS[–1] and RS[+1], Z = 1.69, p = 0.09, r = 0.18 (see **Figure 6D**). This decrease was not significant either for rule shifts with attentional set shifting, Z = 1.47, p = 0.14, r = 0.16 (see **Figure 6E**), or for rule shifts with reversals, Z = 0.97, p = 0.33, r = 0.10 (see **Figure 6F**).

#### Analysis of the Effect of Potential Confounders

Measures of last night's sleep, task effort, and drowsiness were not correlated with baseline measures of pupil size and EBR. Excluding participants with a minor cold or concurrent medication altered the outcome of significance testing only for some of the results reported for EBR changes during rule shifts. Our most relevant result here was that EBR during the fixation cross period increased significantly for rule shifts requiring attentional set shifting. This difference remained significant, suggesting that the exclusion of participants merely decreased statistical power, and that neither medication nor health condition confounded the results.

# DISCUSSION

In this study, we used easily accessible physiological measures to investigate how indirect measures of NA and DA transmission are related to performance in a task assessing one specific aspect of EF, attentional set shifting. We found that individual differences in baseline levels of pupil size and EBR were associated with different aspects of task performance. Additionally, we also showed that within-task changes in pupil size and EBR reflected the transition between the exploitative and the explorative phase of the task. Below we outline these results and their potential implications in more detail.

rule shifts involving reversal. RS[–1], RS[–2], (. . .), RS[–6]: the first, second, (. . .), and the sixth trial, respectively, preceding the rule shift; RS[+1], RS[+2], RS[+3]: the first, second, or third trial, respectively, following the rule shift. <sup>+</sup>p < 0.10; <sup>∗</sup>p < 0.05; ∗∗∗p < 0.001.

First, we showed that baseline pupil size was related to performance in the ED stage. This result is a replication of the interesting finding of Tsukahara et al. (2016), who showed that individual differences in baseline pupil size were associated with performance in complex working memory tasks. Importantly, we found a specific association: baseline pupil size was only related to errors during the ED stage, but not to errors prior to this stage. This pattern of results suggests that baseline pupil size is specifically related to cognitive flexibility, and not to task effort or reinforcement learning. As proposed by Tsukahara et al. (2016), this correlation might be explained by the fact that baseline pupil size reflects the activity of the LC/NA system, which regulates the dynamics of different brain networks (Yellin et al., 2015; Shine et al., 2016), which in turn determine EF (Keller et al., 2015).

Second, we found a correlation between baseline EBR and task performance. We showed that individual differences in participants' EBR were positively correlated with the number of erroneous choices preceding the ED stage. This correlation is in line with models suggesting that baseline EBR reflects dopaminergic regulation of the trade-off between maintaining versus updating working memory representations (Frank and O'Reilly, 2006; Cools and D'Esposito, 2011; Maia and Frank, 2011; Jongkees and Colzato, 2016). The IEDT requires the maintenance of the same attentional set in working memory during a series of successive choices. A low threshold for updating these representations, signaled by high EBR, leads to incorrect choices, and this is reflected in the correlation between EBR and erroneous choices.

Third, we replicated our previous findings presented in Pajkossy et al. (2017): tonic pupil size, as measured during the fixation cross period, decreased steadily during the exploitative phase and increased when the explorative phase began. Moreover, we extended this finding by showing that this pattern was also present when we assessed pupil size during the stimulus presentation and the feedback period. Thus, change in tonic pupil size during rule shifts was a robust indicator of exploration and exploitation, and this change was not affected by other factors (e.g., changes in luminance during stimulus presentation or feedback processing). Furthermore, tonic increase in pupil size was present during both reversal and attentional set shifting. This pattern of results is in line with the suggestion of Yu and Dayan (2005), who proposed that tonic NA signals unexpected uncertainty. The increase in tonic pupil size following rule shifts signals uncertainty because the established stimulus–response contingencies are no longer valid. Once this uncertainty vanishes, during the exploitative phase, tonic NA (and so pupil size) starts to decrease.

Fourth, we observed task-dependent tonic changes in blinking behavior. By measuring EBR during the fixation cross period, we detected an increase in EBR after rule shifts. Interestingly, however, this change was only present after rule shifts involving attentional set shifting, but not after rule shifts involving reversal. An important difference between reversal and attentional set shifting is the differential requirement of explorative processes. On the one hand, reversal stages do not require exploration, because after the first reversal, participants learn the logic of the task: negative feedback without change in stimulus display requires the reversal of response tendencies. On the other hand, in attentional set shifting stages, stimulus–response mappings have to be newly established, and this requires exploring which stimulus feature is associated with reward. Therefore, the selective association between change in EBR and attentional set shifting is in line with suggestions that link tonic DA level to explorative processes (Frank et al., 2009; Beeler et al., 2010; Humphries et al., 2012).

To sum up, we have demonstrated that pupil size and EBR are related to task performance in a way that is predicted by theories of the NA and DA neurotransmitter systems. Interestingly, baseline and tonic measures of these physiological variables were related to different aspects of task performance. Baseline pupil size was specifically related to the ED stage, whereas changes in tonic pupil size were observed after each rule shift. Similarly, baseline EBR was associated with performance preceding the ED stage, whereas tonic change of EBR was associated with performance during the ED stage. These discrepancies suggest that these physiological measures assess different aspects of NA and DA neurotransmission. Both neurotransmitter systems exert their effect on multiple time scales and in multiple brain sites involving different receptor types (see e.g., Aston-Jones and Cohen, 2005; Björklund and Dunnett, 2007). This complexity might be reflected in different aspects of these indirect physiological measures. The investigation of this issue poses an interesting challenge for future studies.

#### Methodological Considerations

Our results also highlight that pupil size and eye-blinks are influenced by several factors; thus care should be taken to control for these confounders. For example, visual/structural features of the task can influence the pattern of results, as demonstrated by the EBR data from the stimulus presentation and the feedback period. The first feedback trial of a stage requires increased visual processing of the stimuli, and this can explain the decrease in EBR (see Drew, 1951; Stern and Skelly, 1984). Similarly, the increase of EBR during stimulus presentation after rule shifts was confounded by the fact that the time of the stimulus presentation was not fixed (as it was terminated by the participant's response). Such visual/structural confounds cannot explain changes in EBR (and pupil size) during the fixation cross period; thus they can be interpreted as reflecting changes in DA (and NA) transmission.

The effect of stimulus luminance and room illumination on pupil size is also a potential confounding factor, which has to be carefully investigated before interpreting pupil size results. Although room illumination was held constant, stimulus luminance changed during the task, as the luminance of the figures and the background was different. To take this potential confounder into account, we only compared pupil size measures during the same trial period, where stimulus luminance conditions were comparable. During the fixation cross period, the screen display was always the same; thus luminance could not cause the pattern of our results. During the stimulus presentation and feedback periods, the stimulus display varied as different figures were presented. Note, however, that the surface area of each figure was the same, and so the net luminance of the screen remained constant. Because of this, it is unlikely that luminance differences influenced our results.

Another type of confound is related to participant characteristics and behavior. Pupil size and EBR are physiological variables, which are influenced by health condition, medication, arousal levels, nicotine/caffeine consumption, sleep quality, and sleep hours (Holmqvist et al., 2011). As described in detail in the Results section, we verified that our results are not confounded by these variables.

Before interpreting pupil size and EBR as indirect measures of NA and DA transmission, we also carefully considered two specific measurement issues. First, it is important to note that eye-blink detection was performed indirectly, by analyzing gaze parameters measured by an eye-tracker. Therefore, measurement noise and other eye-tracking artifacts might have distorted eye-blink detection. To prevent this, we used a strict criterion to exclude participants with inappropriate data quality. Furthermore, distorted eye-blink detection would have caused a general over- or underreporting of EBR (due to false or missed detection of eye-blinks). Such bias is not likely to cause the specific pattern of our results (e.g., selective increase in EBR only after attentional set shifting, detected only during the fixation cross period). The second issue is related to the fact that pupil size and EBR are not completely independent measures. On the one hand, it has been shown that eye-blinks are followed by a sequence of short dilation and constriction of the pupil (Knapen et al., 2016). On the other hand, pupil size influences the quality of the gaze data (Choe et al., 2016) that are used to detect eye-blinks. These interdependencies raise the possibility that changes in pupil size and EBR do not reflect the distinct influence of the NA and the DA neurotransmitter systems; instead, the change in one measure might be caused by the change in the other. The specific pattern of our results, however, contradicts this interpretation. Pupil size and EBR do change in a similar way during the fixation cross period, but not during the stimulus presentation and the feedback periods. If pupil size and EBR values had been influenced by some common measurement artifacts, this would have affected them similarly in all periods.

Finally, it is important to highlight a limitation of our results, with respect to the baseline EBR measure we used. We measured baseline EBR not during a passive viewing condition, but during the task. Admittedly, several factors might influence EBR values during task execution (Irwin and Thomas, 2010; Jongkees and Colzato, 2016), and these factors might have confounded our EBR measure. Nevertheless, our results are in line with previous studies using EBR under free viewing conditions, which found that EBR is linked to the trade-off between maintaining and updating cortical representations (e.g., Dreisbach et al., 2005; Zhang et al., 2015; for a review see Jongkees and Colzato, 2016). This raises the possibility that EBR during task execution can also be used as an indirect measure of DA transmission – this assumption should be tested in future studies, for example by assessing EBR during both a free viewing and a task execution condition.

# CONCLUSION

In summary, our results show that easily accessible physiological indexes can be used to assess how NA and DA transmission is associated with attentional set shifting. We have demonstrated that individual differences in pupil size and EBR are correlated with individual differences in task performance. Moreover, we also showed that within-task changes of pupil size and EBR might reveal how changes in NA and DA transmission accompany exploitative and explorative aspects of attentional set shifting. Importantly, the pattern of our results showed a specific relationship between different task features (e.g., stages involving reversal vs. attentional set shifting) and different aspects of the physiological variables (e.g., baseline level vs. within-task changes). These results suggest that the NA and DA neurotransmitter systems are involved in attentional set shifting by regulating the balance between different and antagonistic aspects of information processing required during the task (exploration vs. exploitation, stability vs. flexibility). Our results suggest that by measuring pupil size and EBR, we can shed light on how variations in EF are related to variations in neurotransmitter levels, and thus this method might be a promising tool in exploring the sources of variability in EF.

# ETHICS STATEMENT


The protocol was approved and the study was carried out in accordance with the recommendations of the United Ethical Review Committee for Research in Psychology, Hungary. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

# AUTHOR CONTRIBUTIONS

PP, ÁS, GD, and MR participated in planning and designing the experiment. PP collected and analyzed the data and wrote the first version of the manuscript, which was then discussed with ÁS, MR, and GD before preparing the final version.

# FUNDING

This work was supported by the 2017-1.2.1-NKP-2017-00002 Research Grant (National Brain Research Program, Hungary). PP was supported by the Bolyai János Research Scholarship of the Hungarian Academy of Sciences.

# ACKNOWLEDGMENTS

We thank Dorottya Bencze for assistance with the preparation of figures.

# REFERENCES

schizophrenia: impact of distractors. Schizophr. Res. 89, 339–349. doi: 10.1016/j.schres.2006.08.014



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Pajkossy, Szőllősi, Demeter and Racsmány. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Context Modulates Congruency Effects in Selective Attention to Social Cues

Andrea Ravagli<sup>1</sup>, Francesco Marini<sup>1,2,3</sup>, Barbara F. M. Marino<sup>1</sup> and Paola Ricciardelli<sup>1,4</sup>\*

<sup>1</sup> Department of Psychology, University of Milano-Bicocca, Milan, Italy, <sup>2</sup> Department of Psychology, University of Nevada, Reno, Reno, NV, United States, <sup>3</sup> Swartz Center for Computational Neuroscience, Institute for Neural Computation, University of California, San Diego, San Diego, CA, United States, <sup>4</sup> Milan Center for Neuroscience, Milan, Italy

Head and gaze directions are used during social interactions as essential cues to infer where someone attends. When head and gaze are oriented toward opposite directions, we need to extract socially meaningful information despite stimulus conflict. Recently, a cognitive and neural mechanism for filtering-out conflicting stimuli has been identified during non-social attention tasks. This mechanism is engaged proactively when conflict is anticipated in a high proportion of trials and reactively when conflict occurs infrequently. Here, we investigated whether a similar mechanism is at play for limiting distraction from conflicting social cues during gaze or head direction discrimination tasks in contexts with different probabilities of conflict. Results showed that, for the gaze direction task only (Experiment 1), inverse efficiency (IE) scores for distractor-absent trials (i.e., faces with averted gaze and centrally oriented head) were larger (indicating worse performance) when these trials were intermixed with congruent/incongruent distractor-present trials (i.e., faces with averted gaze and tilted head in the same/opposite direction) relative to when the same distractor-absent trials were shown in isolation. Moreover, on distractor-present trials, IE scores for congruent (vs. incongruent) head-gaze pairs in blocks with rare conflict were larger than in blocks with frequent conflict, suggesting that adaptation to conflict was more efficient than adaptation to infrequent events. However, when the task required discrimination of head orientation while ignoring gaze direction, performance was not impacted by either block-level or current-trial congruency (Experiment 2), unless the cognitive load of the task was increased by adding a concurrent task (Experiment 3). 
Overall, our study demonstrates that during attention to social cues proactive cognitive control mechanisms are modulated by the expectation of conflicting stimulus information at both the block- and trial-sequence level, and by the type of task and cognitive load. This helps to clarify the inherent differences in the distracting potential of head and gaze cues during speeded social attention tasks.

Keywords: gaze discrimination, head orientation, social cues, social attention, distraction context manipulation paradigm, proactive control, conflict adaptation, proportion congruency effect

#### Edited by:

Antonino Vallesi, Università degli Studi di Padova, Italy

#### Reviewed by:

Mario Dalmaso, Università degli Studi di Padova, Italy Christopher James Wilson, Teesside University, United Kingdom

> \*Correspondence: Paola Ricciardelli paola.ricciardelli@unimib.it

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 01 March 2018 Accepted: 22 May 2018 Published: 12 June 2018

#### Citation:

Ravagli A, Marini F, Marino BFM and Ricciardelli P (2018) Context Modulates Congruency Effects in Selective Attention to Social Cues. Front. Psychol. 9:940. doi: 10.3389/fpsyg.2018.00940

# INTRODUCTION


Head and gaze directions are the most important pieces of information used by human perceptual-cognitive systems during social interactions to determine where another person is attending (Argyle and Cook, 1976; Emery, 2000). Therefore, characterizing how head and eye cues are combined during perceptual and cognitive processing – particularly when these cues are conflicting (e.g., Balsdon and Clifford, 2017) – is fundamental in understanding human behavior in the context of social interactions. When we observe other people's faces, it is common that head and gaze directions are not aligned (e.g., right-oriented head with left-averted gaze). When one needs to determine where another person is attending based on conflicting directional information delivered by head and gaze, this conflict must be resolved by perceptual-cognitive systems (e.g., Perrett et al., 1992) – however, the mechanisms underlying head-gaze conflict resolution are not fully understood yet (e.g., Langton, 2000; Moors et al., 2016; Otsuka et al., 2016a,b).

It is well-known that head-gaze conflict might lead to biases in the perceived gaze direction during tasks requiring the integration of eye and head orientation (e.g., Gibson and Pick, 1963; Cline, 1967; Anstis et al., 1969; Otsuka et al., 2015, 2016a,b; Moors et al., 2016; Balsdon and Clifford, 2017). In a frequently investigated bias known as "repulsive effect" or "overshoot effect," the perceived gaze direction is slightly biased toward the opposite direction relative to the direction in which the head is oriented (e.g., Gibson and Pick, 1963; Anstis et al., 1969; Masame, 1990; Gamer and Hecht, 2007; but for a different bias<sup>1</sup> , see also Cline, 1967; Maruyama and Endo, 1983; Langton et al., 2004). The overshoot effect and similar biases occurring in the presence of conflictual head-gaze cues help to characterize the mechanisms of integration of head and gaze information. One possibility is that these biases derive from an imbalance in the weights attributed to directional information from the head and the eyes, respectively, during the integration of multiple and conflicting directional cues. For example, if the head of an observed face has a rightward tilt and the eyes are centered, an excessive negative weight attributed to the directional cues from the head might result in the gaze being perceived as directed slightly to the left – thus resulting in the overshoot effect. However, multiple accounts exist regarding the integration of head and eye information during face perception (e.g., Langton, 2000; Ricciardelli and Driver, 2008; Nummenmaa and Calder, 2009; Otsuka et al., 2014).

A frequently used taxonomy distinguishes between global and local information conveyed by face stimuli. Global information corresponds to the overall form (thus including head orientation) while local information corresponds to finer-grain details (thus including gaze direction). The overshoot effect and other perceptual modulations on perceived gaze direction driven by head orientation may depend on how local and global perceptual cues are combined to form a coherent percept (e.g., Tanaka and Farah, 2007; McKone, 2008; Tanaka and Simonyi, 2016). Importantly, the distinction between local and global cues reflects perceptual processes with different temporal dynamics: for example, monkey electrophysiology research has demonstrated that global and local information are processed with peculiar spatio-temporal dynamics (De Souza et al., 2005; Rolls, 2007). Although this finding seems to suggest a relative reciprocal independence of head and eye cues during perceptual face processing, whether global and local cues are processed in parallel or are integrated is still a matter of debate. Perrett et al. (1992) in an electrophysiological study found that cells in the superior temporal sulcus (STS) have an extensive sensitivity to head views, gaze direction and body postures. Interestingly, the authors argue that the primary function of this sensitivity is to signal the direction of attention of other individuals and that gaze direction is the best cue to indicate the focus of attention. Accordingly, it was reported that the sensitivity to gaze direction, when not in accordance with head orientation or body posture, could override the sensitivity of head view, which in turn could override body posture. The authors, thus, postulate the existence of a direction-of-attention detector (DAD) that combines in a hierarchical manner the information from separate detectors that analyze the direction of the eyes, head and body. 
This hierarchy in combining eye, head, and body cues is achieved thanks to a network of inhibitory connections. That is, information from the eyes can directly inhibit cells coding an inappropriate head direction, but not vice-versa, and information about a particular head angle can inhibit cells coding a conflicting body position, but not vice versa. In contrast, other studies have suggested that information from head orientation is not completely suppressed when in conflict with gaze direction (Langton, 2000; see Emery, 2000; Langton et al., 2000 for reviews). This implies that head orientation contributes somehow to the computation of attention direction even when the head angle conflicts with the direction of gaze. Ricciardelli and Driver (2008) provided behavioral evidence showing that the mechanisms responsible for processing head and gaze direction show some hierarchical organization and may not operate completely independently (see also Hietanen, 1999, 2002; Bayliss et al., 2004). Accordingly, the overshoot effect also suggests the existence of some degree of integration between local and global cues. However, the reciprocal weights attributed to head and eye cues during this integration are not completely understood yet (e.g., Perrett et al., 1992; Langton, 2000; Ricciardelli and Driver, 2008; Otsuka et al., 2014). Ricciardelli and Driver (2008) proposed that the visual system might attribute different weights to particular head and eye cues according to their visibility and the required speed of the gaze discrimination judgement. However, it is unclear what mechanisms intervene to resolve conflicts between head orientation and gaze direction.

A possibility is that top-down cognitive control processes intervene to resolve this conflict. Is the perceptual decision about where another person is attending susceptible to top-down control? If so, perceptual-cognitive systems might intervene to filter-out one source of information (e.g., eye cues or head cues) from the cue integration process when the observer has prior knowledge that a source of information is irrelevant within a given face-processing context. In the present study, we modeled this scenario using a recently introduced experimental paradigm for the characterization of proactive and reactive cognitive-attentional control mechanisms in the presence of conflicting and distracting information – namely, the Distraction Context Manipulation (DCM) paradigm (Marini et al., 2013, 2015, 2016). In this paradigm, different levels of expectation for conflicting information are created at the block level by using multiple types of experimental blocks with different probabilities of conflict, in addition to no-conflict trials that are both intermixed with conflict trials (in "Mixed" blocks) and presented in isolation in a separate block ("Pure" block). One type of block creates an expectation of distraction because distractor-absent and distractor-present trials are intermixed (the Mixed block), while the other type of block includes only distractor-absent trials (the Pure block) and therefore engenders no expectation for distractors. The comparison of speeded performance (e.g., reaction times or inverse-efficiency scores) in distractor-absent trials of the Mixed blocks vs. distractor-absent trials of the Pure blocks might reveal a behavioral cost in Mixed blocks, which has been related to the recruitment of mechanisms for conflict resolution and distraction-filtering (Marini et al., 2013, 2016) and hence termed "distraction-filtering cost." Because this distraction-filtering cost was inversely correlated to the behavioral cost caused by conflict on conflict-present trials (i.e., incongruent vs. congruent trials), it is considered the behavioral signature of the proactive engagement of a distraction-filtering mechanism that is invoked in potentially distracting contexts in order to limit the negative impact of conflicting distraction (Marini et al., 2013, 2016). This distraction-filtering mechanism was shown to be sensitive to contextual factors because it was modulated both proactively (for example, in relation to the probability of occurrence of conflicting distractors within a given experimental block) as well as reactively (for example, after the occurrence of conflicting distractors in the immediately preceding trial) (Marini et al., 2013, 2016). This distraction-filtering mechanism has been described in several different paradigms of visual and cross-modal attention (Marini et al., 2013, 2016), and appears to be a general mechanism of cognitive-attentional control. Therefore, it is plausible to hypothesize that a similar mechanism would intervene also in other types of attention tasks – such as social attention tasks. Social attention tasks, such as gaze-direction or head-orientation discrimination tasks, may require selecting task-relevant information while filtering-out irrelevant cues, particularly in the presence of conflicting cue information. However, whether or not a distraction-filtering mechanism intervenes during attention tasks with social cues remains to be established.

<sup>1</sup>The other bias is the towing or attraction effect, which indicates that perceived gaze direction falls in between head and eye orientation (e.g., Moors et al., 2016).

Here, our general working hypothesis was that a similar cognitive control mechanism for filtering-out conflicting and/or distracting information might be recruited in the context of attention to social cues in order to resolve the potential conflict between head and gaze cues (e.g., as in the overshoot effect). The rationale rests on the fact that in Ricciardelli and Driver's (2008) study speeded gaze direction judgments were faster when head and gaze were oriented to the same direction (congruent trials) and slower when they were oriented to opposite directions (incongruent trials). However, the head-gaze congruency effect [reaction time (RT) in incongruent minus congruent trials] reversed in the absence of task-imposed speed constraints. Under time pressure, the global head orientation appeared to be weighted more heavily, so that gaze toward the same side as the head deviation became easier to judge rapidly; without time pressure, however, gaze toward the opposite side became easier to judge, consistent with the overshoot effect. Intriguingly, this suggests that the reciprocal weights of relevant and irrelevant information can be adjusted depending on the speeded context of the task.

Here, we conducted three experiments in which we adapted the DCM paradigm to both gaze-direction and head-orientation discrimination tasks in contexts with different proportions of trials with congruent/incongruent head orientation and gaze direction, respectively. In Experiment 1, we wanted to investigate if head orientation can be filtered-out during gaze direction discriminations, as indicated by the occurrence of a distraction-filtering cost. If head orientation is filtered-out during gaze-direction discrimination tasks, then a distraction-filtering cost should be found on trials with potential distraction compared to trials with no distraction. Moreover, the congruency effect (incongruent minus congruent trials) should be modulated both proactively and reactively by conflict probability. Therefore, we expected to observe larger distraction-filtering costs and smaller congruency effects in contexts with high probability of conflicting distraction relative to contexts with low probability of conflicting distraction. Moreover, we expected that the overall proactive distraction-filtering mechanism, whose recruitment corresponds to the magnitude of the distraction-filtering cost, would be enhanced reactively after trials with conflicting distractors. In Experiment 2, we investigated the reverse type of task, in which participants performed head orientation discriminations while gaze direction needed to be filtered. When the task-irrelevant information was gaze direction, we expected a different pattern of results. Because gaze direction is processed more locally than head orientation (Watanabe et al., 1999) and given the well-known presence of a global advantage in information processing (e.g., Navon, 1977; Mills and Dodd, 2014), it is plausible that the negative impact of conflicting gaze on the head orientation task would be smaller or even absent. If so, neither proactive nor reactive modulations of conflict should emerge in Experiment 2. 
In Experiment 3, we tested the effects of increasing cognitive load in filtering-out gaze direction during head orientation discriminations. Because cognitive load is thought to modulate the efficiency of distraction-filtering (Marini et al., 2013), we expected that conflict-related effects would emerge with a similar pattern to the one predicted for Experiment 1 when the cognitive load of the head orientation discrimination task was increased.

# EXPERIMENT 1

# Materials and Methods

#### Participants

Twenty-two participants took part in Experiment 1 (mean age 22.4, range 18–26, 17 females, 19 right-handed). Two participants were excluded from the analysis due to the high number (more than 48 trials) of omitted responses. All had normal or corrected-to-normal vision and no known neurological or psychiatric condition. All participants were recruited among Psychology students, gave their written informed consent to take part in the study, and received course credit for their participation. To ensure no waste of time and resources, the sample size of all experiments was determined on the basis of previous studies (Marini et al., 2013, 2015, 2016) or of an a priori power analysis.

#### Ethics Statement

All the experiments were approved by the ethical committee of the University of Milano-Bicocca and were conducted in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki (World Medical Association, 2001) and fulfilled the ethical standard procedure recommended by the Italian Association of Psychology (AIP). All the experimental protocols were also approved by the ethics committee of the University of Milano-Bicocca.

#### Apparatus and Materials

Participants sat in a dimly illuminated room at a distance of 57 cm from the central fixation point of a 21-inch computer screen (Samsung SyncMaster 1100p plus, 1280 × 1024 pixels, refresh rate 85 Hz). The experimental paradigm was programmed in Matlab (MathWorks, Inc.) with Psychtoolbox 3.0 (Kleiner et al., 2007). Responses were collected through button presses on a USB keypad.

Stimuli consisted of Caucasian faces with different gaze orientation and different head orientation. Photographs from the Radboud Faces Database (Langner et al., 2010) were modified using Java Psychomorph 6 (Tiddeman et al., 2005) in order to generate an average face for each gender (male/female), gaze orientation (left/right), and head orientation (centered or tilted 45° left/right). This procedure generated a total of 12 unique face stimuli. Additionally, we used a phase-spectrum perturbation technique in Matlab for generating another 12 scrambled faces, which we used as masking stimuli.

#### Procedure

Each trial started with the presentation of a face stimulus (subtending a visual angle of 14.5° vertically by 10° horizontally) on a uniform gray background. Participants were instructed to indicate gaze direction ("target" dimension; left or right) as fast and as accurately as possible, while ignoring head orientation ("distractor" dimension; left, centered, or right). The face stimulus stayed on-screen until the participant responded or for 1000 ms (whichever occurred first), and was immediately followed by a visual mask (100 ms). The inter-trial interval was jittered between 300 and 600 ms. Three types of trial were used: (1) distractor-absent trials, with no lateral tilt of the head orientation (i.e., the head was straight); (2) congruent distractor trials, with the head orientation tilted in the same direction as the gaze (either both left or both right); and (3) incongruent distractor trials, with the head orientation tilted in the opposite direction to the gaze (either gaze left and head right, or vice-versa).

In order to investigate the functioning of proactive mechanisms for controlling the conflict emerging when head orientation was task-irrelevant, we used the same distraction context manipulation that has been used in previous work for identifying and characterizing mechanisms of distraction filtering (Marini et al., 2013, 2015, 2016). This paradigm typically involves two types of blocks (see **Figure 1**): Pure blocks, in which distractor-absent stimuli are presented on 100% of trials, and Mixed blocks, in which distractor-absent stimuli are presented on 20% of trials and distractor-present stimuli are presented on 80% of trials. Here, we used two different types of Mixed blocks: (i) the 60% Congruent block (60% Cong), consisting of 60% congruent distractor trials, 20% incongruent distractor trials, and 20% distractor-absent trials; (ii) the 60% Incongruent block (60% Inc), consisting of 60% incongruent distractor trials, 20% congruent distractor trials, and 20% distractor-absent trials. Every block was preceded by an on-screen cue that informed participants about the type of upcoming block (Pure, 60% Cong, 60% Inc). Prior to the beginning of the experiment, written instructions and examples of stimuli were shown on the screen and participants performed 30 practice trials. Each experiment consisted of 960 trials divided into 15 blocks (5 blocks of each type, presented in a counterbalanced sequence) and had an average duration of 25 min.
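The block composition described above can be illustrated with a minimal Python sketch (the experiment itself was programmed in Matlab with Psychtoolbox; this is not the authors' code). The function name `make_block` and the block length of 60 trials are hypothetical, chosen only so that each percentage yields a whole number of trials; the text does not state the number of trials per block.

```python
import random

# Percentage of each trial type per block type, as described in the text.
BLOCK_PERCENTAGES = {
    "Pure":     {"absent": 100, "congruent": 0,  "incongruent": 0},
    "60% Cong": {"absent": 20,  "congruent": 60, "incongruent": 20},
    "60% Inc":  {"absent": 20,  "congruent": 20, "incongruent": 60},
}

def make_block(block_type, n_trials=60, seed=None):
    """Return a shuffled list of trial-type labels for one block.

    n_trials=60 is a hypothetical block length (an assumption, not a
    value taken from the study).
    """
    rng = random.Random(seed)
    trials = []
    for trial_type, percent in BLOCK_PERCENTAGES[block_type].items():
        trials += [trial_type] * (n_trials * percent // 100)
    rng.shuffle(trials)
    return trials
```

Calling `make_block("60% Inc")` under these assumptions returns 36 incongruent, 12 congruent, and 12 distractor-absent labels in random order.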

# Results

#### Data Handling

Reaction time and response accuracy were measured. RTs were filtered to eliminate outliers, defined as those trials on which RT was either below 200 ms or above the mean plus three standard deviations computed in log values (Ratcliff, 1993). In order to control for the speed-accuracy tradeoff, inverse efficiency (IE) scores were calculated by dividing RTs by the proportion of correct responses (Townsend and Ashby, 1983). We conducted full statistical analyses on IE scores. We report analyses on IE scores with ms<sup>a</sup> as the unit of measurement where "a" indicates that the ms value is adjusted (Marini et al., 2015).
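As an illustration of the two data-handling steps, the following Python sketch applies the log-based outlier criterion and computes an IE score. It rests on assumptions not stated in the text: the cut-off is computed over whatever set of RTs is passed in (the grouping by participant or condition is left open), and the function names are hypothetical.

```python
import math
from statistics import mean, stdev

def filter_rts(rts, lower_ms=200.0, n_sd=3.0):
    """Drop RTs below lower_ms or above mean + n_sd * SD computed on log(RT).

    Assumption: the mean and SD are computed over all RTs passed in.
    """
    logs = [math.log(rt) for rt in rts]
    upper_ms = math.exp(mean(logs) + n_sd * stdev(logs))
    return [rt for rt in rts if lower_ms <= rt <= upper_ms]

def inverse_efficiency(rts, n_correct, n_trials):
    """Mean RT (ms) divided by the proportion of correct responses."""
    return mean(rts) / (n_correct / n_trials)
```

For example, a mean RT of 500 ms with 90% correct responses gives an IE of about 556 ms<sup>a</sup>, so lower accuracy inflates the score.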

We carried out four main analyses, each with a different purpose. In the first one, we compared the distractor-absent trials in the Pure blocks (baseline) with the distractor-absent trials in Mixed blocks to estimate the cost of engaging the strategic filtering mechanism in the blocks where distraction was present. By doing this, we expected to measure a behavioral cost (slower responses) in distractor-absent trials of the Mixed blocks (distracting context), compared to the same distractor-absent trials within the Pure blocks (distractor-free context).

The second analysis focuses on the sequential effects that could be present in the distractor-absent trials. In Mixed blocks only, we compared the distractor-absent trials that followed a distractor-absent trial (previous distractor-free context) to the distractor-absent trials that followed a distractor-present trial (previous distraction context). If the first analysis reveals a distraction cost and this comparison shows no significant difference between the previous distractor-free and previous distraction contexts, then only a proactive/strategic control is at play in the Mixed blocks. By contrast, if we found a significant difference, then a reactive response strategy has been adopted because of having experienced distraction in the preceding trial. However, in the latter case the problem is to establish whether the behavioral cost found in the first analysis is accounted for entirely or only in part by the reactive control. To this end, we need to compare the distractor-absent trials that followed a distractor-absent trial in the Mixed blocks to the distractor-absent trials that followed a distractor-absent trial in the Pure blocks. If IEs for distractor-absent trials following a distractor-absent trial did not differ between Pure and Mixed blocks, then the cost we measured in Mixed blocks (compared to Pure blocks) in the first analysis would only be due to reactive control. In contrast, if the IEs in Pure and Mixed blocks were different, even though the experience of the previous trial was identical (a distractor-absent trial), then we could safely conclude that a proactive/strategic control is also at play in the Mixed blocks and a distractor-expectation cost is present. Therefore, performing the analysis on sequential effects allows us to choose the correct interpretation of the results of our first analysis.

The third analysis regards the congruency effects in Mixed blocks. From Ricciardelli and Driver's (2008) study, it appears that the congruency effect can be modulated by contextual factors such as, for example, temporal constraints implemented in the experimental paradigm. In our third analysis, we tested a contextual hypothesis. Specifically, we tested whether the frequency of conflict (i.e., high in 60% Incongruent blocks or low in 60% Congruent blocks) could modulate the head-gaze congruency effect because of the different type of information (i.e., head or gaze) that needs to be processed in the two different conditions of conflict (high vs. low).

The fourth and last analysis focuses on the Gratton effect. In previous work with non-social stimuli, it has been reported that the congruency effect is lower after incongruent trials than after congruent ones (Gratton effect). This is taken as an index of a reactive adaptation to conflict and it may well be that, in Mixed blocks, it is modulated by the distraction probability. With this analysis we test whether or not the mechanisms that control conflicting information are the same for social and non-social stimuli (for a direct comparison, see also Actis-Grosso and Ricciardelli, 2017; Ciardo et al., 2018).
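The sorting step behind the sequential analysis can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the authors' pipeline: the function name, the trial-type labels, and the use of per-trial RTs as input are hypothetical, and the first trial of a block is skipped because it has no predecessor.

```python
from collections import defaultdict

def sort_absent_by_preceding(trial_types, rts):
    """Group RTs of distractor-absent trials by the preceding trial's type.

    trial_types: list of "absent", "congruent", or "incongruent" labels in
    presentation order within one block; rts: matching list of RTs (ms).
    """
    groups = defaultdict(list)
    for prev, cur, rt in zip(trial_types, trial_types[1:], rts[1:]):
        if cur == "absent":
            groups[prev].append(rt)
    return dict(groups)
```

The resulting groups (after distractor-absent, after congruent, after incongruent) correspond to the levels of the Type of Preceding Trial factor; comparing the "after distractor-absent" group of Mixed blocks with Pure blocks isolates the cost that cannot be attributed to the preceding trial.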

Statistical analyses were conducted via repeated-measure analysis of variance (ANOVA) or via Friedman ANOVA when the data did not meet the assumption of normality (tested with the Shapiro-Wilk test). When significant ANOVA effects emerged, we further explored the results with paired samples t-tests and the family-wise error rate was controlled with the Holm–Bonferroni method (Holm, 1979). When data violated the assumption of normality, Wilcoxon signed-rank tests were used instead of paired t-tests. Effect sizes were computed by calculating the appropriate index out of the following: the partial eta squared index (η<sub>p</sub><sup>2</sup>) for standard ANOVA, or the Kendall's W (Sheskin, 2003) for Friedman ANOVA, or the r<sup>2</sup> in Wilcoxon signed-rank tests for pairwise comparisons. Statistical analyses were implemented in IBM SPSS Statistics version 22.
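For readers unfamiliar with the Holm–Bonferroni step-down procedure, a minimal Python sketch follows. The analyses in the study were run in SPSS; this function is only an illustration of the general method, and its name and the default alpha of 0.05 are assumptions.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    P-values are tested from smallest to largest against alpha / (m - rank);
    testing stops at the first non-significant comparison.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: all larger p-values are retained as well
    return reject
```

With three comparisons, the smallest p-value is tested against 0.05/3, the next against 0.05/2, and the largest against 0.05.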

#### Analysis of Distractor-Absent Trials

Mean IEs on distractor-absent trials of Pure and Mixed blocks were entered into a one-way Friedman ANOVA factoring Block (Pure, 60% Cong, 60% Inc) and a significant effect was observed [χ<sup>2</sup>(2) = 33.6, p < 0.001, W = 0.84]. Responses on distractor-absent trials were faster in Pure blocks (mean IE ± SD = 462 ± 39 ms<sup>a</sup>) compared both to 60% Cong blocks (mean IE ± SD = 561 ± 59 ms<sup>a</sup>) and to 60% Inc blocks (mean IE ± SD = 601 ± 87 ms<sup>a</sup>) (z = 3.92, p < 0.001, r<sup>2</sup> = 0.38 for each comparison). Distractor-absent trials were also faster in 60% Cong blocks than in 60% Inc blocks (z = 3.01, p = 0.003, r<sup>2</sup> = 0.23; **Figure 2**). These results indicate that distractor-absent trials were overall slower in Mixed than in Pure blocks and that the magnitude of the Mixed blocks cost was larger in 60% Inc (vs. 60% Cong) blocks, possibly reflecting the likelihood of conflict.
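The effect sizes reported above agree with two standard conversions, W = χ<sup>2</sup>/[n(k − 1)] and r<sup>2</sup> = z<sup>2</sup>/N. The short Python check below is a sketch under one assumption that the text does not state: N is taken as the total number of observations in a pairwise comparison (2 × 20 = 40), a reading inferred from the reported values. The function names are hypothetical.

```python
def kendalls_w(chi_square, n_subjects, k_conditions):
    """Kendall's W from a Friedman chi-square statistic."""
    return chi_square / (n_subjects * (k_conditions - 1))

def r_squared_from_z(z, n_observations):
    """Squared effect size r for a Wilcoxon test, r = z / sqrt(N)."""
    return z ** 2 / n_observations

# Values taken from the text: 20 participants, 3 block types.
w_block = round(kendalls_w(33.6, 20, 3), 2)            # W for the Block effect
r2_pure_vs_mixed = round(r_squared_from_z(3.92, 40), 2)
r2_cong_vs_inc = round(r_squared_from_z(3.01, 40), 2)
```

Under these assumptions the three values round to 0.84, 0.38, and 0.23, matching the reported statistics.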

#### Sequential Effects on Distractor-Absent Trials

Inverse efficiencies for distractor-absent trials were sorted according to the Type of Preceding Trial and analyzed with a two-way ANOVA factoring Block (60% Cong block vs. 60% Inc block) and Type of Preceding Trial (distractor-absent vs. congruent distractor vs. incongruent distractor). The main effect of Block and the main effect of Type of Preceding Trial were both significant [F(1,19) = 6.82, p = 0.017, η<sub>p</sub><sup>2</sup> = 0.26, and F(2,38) = 44.0, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.70, respectively], while their interaction was not [F(2,38) = 1.61, p = 0.40]. Data for each type of trial were then collapsed across types of Mixed blocks. Distractor-absent trials following another distractor-absent trial (mean IE ± SD = 519 ± 62 ms<sup>a</sup>) were faster than those following an incongruent distractor trial (mean IE ± SD = 614 ± 84 ms<sup>a</sup>) [t(19) = 8.84, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.80]. Moreover, distractor-absent trials of Pure blocks (mean IE ± SD = 462 ± 39 ms<sup>a</sup>) were faster than distractor-absent trials of Mixed blocks following a distractor-absent trial [t(19) = 7.45, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.75]. This pattern of results (**Figure 3**) indicates that the distractor-expectation cost was observed even net of any reactive conflict adaptation effect carrying over from the previous trial (**Table 1**).
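Partial eta squared can be recovered from an F statistic and its degrees of freedom with the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>)/(F × df<sub>effect</sub> + df<sub>error</sub>); for a paired t-test, F = t<sup>2</sup> with one effect degree of freedom. The Python sketch below applies this identity to values reported above. It is an illustration, not the SPSS computation, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Values taken from the sequential-effects analysis in the text.
eta_block = round(partial_eta_squared(6.82, 1, 19), 2)
eta_preceding = round(partial_eta_squared(44.0, 2, 38), 2)
eta_t_test = round(partial_eta_squared(8.84 ** 2, 1, 19), 2)  # F = t squared
```

These conversions give 0.26, 0.70, and 0.80, in line with the reported effect sizes.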

… trials. Data of Mixed blocks come from collapsing 60% Cong block and 60% Inc block data. Error bars represent the standard error. Asterisks mark significantly different means.

TABLE 1 | Congruency effect as a function of previous trial congruency and type of Mixed block.


Values in the first and second rows represent the mean congruency effect ± standard deviation. Values in the third row correspond to the mean Gratton effect ± standard deviation.


#### Congruency Effects in Mixed Blocks

As before, IE scores were entered into a two-way ANOVA with factors Block (60% Cong vs. 60% Inc) and Type of Trial (distractor-absent vs. congruent distractor vs. incongruent distractor). We found a marginally significant main effect of Block [F(1,19) = 4.4, p = 0.050, η<sub>p</sub><sup>2</sup> = 0.19], a significant main effect of Type of Trial [F(1.2,22.5) = 44.43, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.70] and a significant interaction [F(1.1,21.4) = 54.6, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.74]. These effects were then explored separately for the 60% Cong and 60% Inc blocks with a one-way ANOVA (each) factoring Type of Trial (distractor-absent vs. congruent distractor vs. incongruent distractor).

In the 60% Cong block, this analysis revealed a significant effect [χ<sup>2</sup>(2) = 30.4, p < 0.001, W = 0.76]. Incongruent distractor trials (mean IE ± SD = 1013 ± 239 ms<sup>a</sup>) were slower compared both to distractor-absent trials (mean IE ± SD = 562 ± 59 ms<sup>a</sup>) and to congruent distractor trials (mean IE ± SD = 568 ± 50 ms<sup>a</sup>) (z = 3.92, p < 0.001, r<sup>2</sup> = 0.38 in both comparisons). These results attest to a positive congruency effect in 60% Cong blocks (**Figure 4**).

In 60% Inc blocks, the effect of Type of Trial was significant [χ<sup>2</sup>(2) = 24.4, p < 0.001, W = 0.61]. Distractor-absent trials (mean IE ± SD = 601 ± 87 ms<sup>a</sup>) were faster compared both to congruent (mean IE ± SD = 725 ± 105 ms<sup>a</sup>) and to incongruent (mean IE ± SD = 734 ± 152 ms<sup>a</sup>) distractor trials (z = 3.73, p < 0.001, r<sup>2</sup> = 0.35, and z = 3.88, p < 0.001, r<sup>2</sup> = 0.38, respectively). Mean IEs on congruent and on incongruent distractor trials were not significantly different (z = 0.37, p = 0.71), indicating the absence of any congruency effect in 60% Inc blocks. Furthermore, IEs within the same type of distractor-present trials were compared between Mixed blocks. Congruent and incongruent distractor trials were faster in the 60% Cong and in the 60% Inc blocks, respectively (i.e., in those blocks where the specific type of trial, congruent or incongruent, was more frequent) [t(19) = 7.32, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.74, and t(19) = 6.65, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.70, respectively (**Figure 4**)].

#### Gratton Effect

In order to investigate the Gratton effect (Gratton et al., 1992), a 2 × 2 ANOVA was run on congruency effect values (i.e., the IE-differences on incongruent minus congruent trials), with Block (60% Cong vs. 60% Inc blocks) and Previous Trial Congruency (congruent vs. incongruent) as factors. The main effect of Block was significant [F(1,19) = 25.6, p < 0.001, η 2 <sup>p</sup> = 0.57], the main effect of Type of Preceding Trial was significant [F(1,19) = 55.7, p < 0.001, η 2 <sup>p</sup> = 0.75], and their interaction was not significant [F(1,19) = 1.47, p = 0.24]. Interestingly, conflict in the preceding trial reduced the magnitude of the congruency effect (Gratton effect), likely due to adaptation to conflict, and did so independently of the probability of conflict at the block level (**Table 1**).
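The dependent variable of this analysis can be illustrated with a short sketch: the congruency effect (IE on incongruent minus IE on congruent trials) is computed separately for trials preceded by a congruent and by an incongruent trial. The trial dictionary layout, the function names, and the toy data are hypothetical, and the IE score is assumed to be the mean correct RT divided by the proportion of correct responses.

```python
from collections import defaultdict

def inverse_efficiency(trials):
    """Mean correct RT divided by proportion correct (assumed IE definition)."""
    correct_rts = [t["rt"] for t in trials if t["correct"]]
    accuracy = len(correct_rts) / len(trials)
    return (sum(correct_rts) / len(correct_rts)) / accuracy

def congruency_effect_by_previous(trials):
    """Return {previous trial type: IE(incongruent) - IE(congruent)}."""
    kinds = ("congruent", "incongruent")
    cells = defaultdict(list)
    # Pair every trial with its predecessor and sort it into a 2 x 2 cell.
    for prev, cur in zip(trials, trials[1:]):
        if prev["type"] in kinds and cur["type"] in kinds:
            cells[(prev["type"], cur["type"])].append(cur)
    return {
        prev: inverse_efficiency(cells[(prev, "incongruent")])
        - inverse_efficiency(cells[(prev, "congruent")])
        for prev in kinds
    }

# Toy sequence (hypothetical data, all responses correct)
sequence = [("congruent", 500), ("congruent", 500), ("incongruent", 700),
            ("incongruent", 600), ("congruent", 550), ("incongruent", 700)]
toy = [{"type": kind, "rt": rt, "correct": True} for kind, rt in sequence]

effects = congruency_effect_by_previous(toy)
# A positive difference means a smaller congruency effect after conflict.
gratton = effects["congruent"] - effects["incongruent"]
print(effects, gratton)
```

In this toy sequence the congruency effect is 200 after a congruent trial and 50 after an incongruent trial, the pattern described as a Gratton effect.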

# EXPERIMENT 2

# Materials and Methods

#### Participants

Fifteen new participants took part in Experiment 2 (mean age 23.1, range 21–25, 11 females, 12 right-handed). All participants were recruited as before, had normal or corrected-to-normal vision, were unaware of the purpose of the research and the experimental procedure, and gave their written informed consent before testing.

#### Apparatus, Materials, and Procedure

The design was similar to that of Experiment 1. The stimuli were the same Caucasian faces (male/female) as those used in Experiment 1 with different gaze orientations (straight/left/right) and different head orientations (tilted 45° left/right). The combination of these gaze and head orientations yielded a total of 12 unique face stimuli (**Figure 5**).

Each trial started with the presentation of a face stimulus (subtending a visual angle of 14.5° vertically by 10° horizontally) on a uniform gray background. Participants were instructed to indicate the head orientation ("target" dimension; left or right) as fast and as accurately as possible, while ignoring gaze orientation ("distractor" dimension; left, straight, or right). The face stimulus stayed on-screen until the participant responded or 1000 ms had elapsed (whichever occurred first), and was immediately followed by a visual mask (100 ms). The inter-trial interval was jittered between 300 and 600 ms. Three types of trial were used: (1) distractor-absent trials, with non-averted gaze (i.e., the gaze was always straight); (2) congruent distractor trials, with the gaze averted in the same direction as the head orientation (either both left or both right); and (3) incongruent distractor trials, with the gaze averted in the direction opposite to the head orientation (either gaze left and head right, or vice versa).


In order to investigate the functioning of proactive mechanisms for controlling the conflict emerging when gaze direction was task-irrelevant, we used the same distraction context manipulation as that used in Experiment 1. Specifically, two types of blocks were employed: Pure blocks, in which distractor-absent stimuli were presented on 100% of trials, and two Mixed blocks, a 60% Congruent block (i.e., consisting of 60% congruent distractor trials, 20% incongruent distractor trials, and 20% distractor-absent trials) and a 60% Incongruent block (i.e., consisting of 60% incongruent distractor trials, 20% congruent distractor trials, and 20% distractor-absent trials). Every block was preceded by a cue on the screen informing participants about the type of upcoming block (Pure, 60% Cong, 60% Inc). Prior to the beginning of the experiment, written instructions and examples of stimuli were shown on the screen and participants performed 30 practice trials. There were 960 trials in total, divided into 15 blocks (5 blocks of each type, presented in a counterbalanced sequence); the experiment lasted about 25 minutes on average.
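The composition of a Mixed block can be sketched as follows. This is only an illustration of the 60/20/20 proportions described above: the number of trials per block is not stated in the text, so `n_trials` is a free parameter, and the function name, labels, and randomization scheme are assumptions rather than the procedure actually used.

```python
import random

def make_mixed_block(n_trials, frequent, seed=None):
    """Build a shuffled list of trial types for one Mixed block.

    `frequent` is "congruent" (60% Cong block) or "incongruent"
    (60% Inc block); the other distractor type and distractor-absent
    trials each make up roughly 20% of the block.
    """
    rare = "incongruent" if frequent == "congruent" else "congruent"
    n_frequent = round(0.6 * n_trials)
    n_rare = round(0.2 * n_trials)
    n_absent = n_trials - n_frequent - n_rare  # remainder, about 20%
    trials = [frequent] * n_frequent + [rare] * n_rare + ["absent"] * n_absent
    random.Random(seed).shuffle(trials)
    return trials

block = make_mixed_block(50, "incongruent", seed=1)
print(block.count("incongruent"), block.count("congruent"), block.count("absent"))
```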

#### Results

Reaction time and response accuracy were measured. RTs were filtered to eliminate outliers, defined as those trials on which RT was either below 200 ms or above the mean plus three standard deviations computed in log values (Ratcliff, 1993). IE scores were calculated as in Experiment 1 and all statistical analyses were performed as before.
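The outlier rule and the IE score can be sketched as follows. Several details are assumptions: the text does not say whether the 200 ms floor is applied before the log-based cutoff is computed (here it is), whether a sample or population standard deviation was used (here the sample SD), or how the IE score was defined in Experiment 1 (here the common definition, mean correct RT divided by proportion correct). Function names are hypothetical.

```python
import math
import statistics

def filter_rts(rts_ms, floor_ms=200.0, n_sd=3.0):
    """Drop RTs below the floor or above mean + n_sd * SD of log(RT)."""
    above_floor = [rt for rt in rts_ms if rt >= floor_ms]
    logs = [math.log(rt) for rt in above_floor]
    # The cutoff is computed in log units, then mapped back to ms.
    upper_ms = math.exp(statistics.mean(logs) + n_sd * statistics.stdev(logs))
    return [rt for rt in above_floor if rt <= upper_ms]

def inverse_efficiency(mean_correct_rt_ms, proportion_correct):
    """IE score: larger values indicate slower and/or less accurate responses."""
    return mean_correct_rt_ms / proportion_correct

# Hypothetical RTs: 30 typical responses, one anticipation, one very slow trial
rts = [400.0 + i for i in range(30)] + [150.0, 5000.0]
kept = filter_rts(rts)
print(len(kept), min(kept), max(kept))
```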

#### Analysis of Distractor-Absent Trials

Inverse efficiency scores measured in the distractor-absent trials were submitted to a one-way ANOVA with Block as a three-level factor (Pure, 60% Cong, 60% Inc). We did not find a significant effect of Block [χ 2 (2) = 2.5, p = 0.282, W = 0.08], suggesting the absence of a filtering cost (**Figure 6A**).
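The χ<sup>2</sup> statistics and W values reported here suggest a rank-based (Friedman-type) test with Kendall's W as effect size. Under that assumption, W follows from χ<sup>2</sup> as W = χ<sup>2</sup>/[N(k − 1)], with N participants and k conditions. The sketch below uses N = 15 (the sample size of Experiment 2); the function name is hypothetical.

```python
def kendalls_w(chi2: float, n_participants: int, k_conditions: int) -> float:
    """Kendall's W from a Friedman chi-square: W = chi2 / (N * (k - 1))."""
    return chi2 / (n_participants * (k_conditions - 1))

# Effect of Block on distractor-absent trials: chi2(2) = 2.5, N = 15, k = 3
print(round(kendalls_w(2.5, 15, 3), 2))  # 0.08
```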

#### Congruency Effects in Mixed Blocks

IE scores measured in Mixed blocks were entered into a two-way ANOVA with Type of Block (60% Cong vs. 60% Inc) and Type of Trial (distractor-absent vs. congruent distractor vs. incongruent distractor) as within-subjects factors. Neither the main effects [Type of Block: χ 2 (1) = 1.7, p = 0.20, W = 0.11; Type of Trial: χ 2 (2) = 0.40, p = 0.82, W = 0.01] nor their interaction [χ 2 (5) = 2.54, p = 0.77, W = 0.03] were significant, indicating that gaze direction information did not act as a distractor and could be easily filtered out (**Figure 6B**).

#### EXPERIMENT 3

The present experiment aimed to test whether increasing the task load would increase the distracting power of gaze direction when it is task-irrelevant, as it was in Experiment 2.

#### Materials and Methods

#### Participants

Twenty-two new participants took part in Experiment 3 (mean age 22.2, range 19–25, 12 females, 21 right-handed). They were all recruited as before, had normal or corrected-to-normal vision, were unaware of the purpose of the research and the experimental procedure, and gave their written informed consent before testing.

#### Apparatus, Materials, and Procedure

These were the same as in Experiment 2, except that, in order to increase the task load, a question about the gender of the face stimulus was presented on screen after the mask on a random subset of trials (24%) (**Figure 5**). The gender discrimination task was chosen on the basis that gender, unlike gaze direction and head orientation, is an invariant facial property that interacts with changeable facial aspects (e.g., Karnadewi and Lipp, 2011). Therefore, we reasoned that both the changeable and invariant properties of the face (i.e., head orientation and gender) had to be processed in the present experiment (although only in 24% of trials), thus increasing the task load compared to Experiments 1 and 2, where only changeable aspects of the face needed to be taken into account.

FIGURE 7 | Experiment 3: Mean IE scores of distractor-absent trials in Pure and Mixed blocks (60% Cong block and 60% Inc block). Results of Experiment 2 are also plotted in the background for comparison. Error bars represent the standard error of the means across participants. Asterisks mark significantly different means.

#### Results

All statistical analyses were performed as before. In addition, for a comparison of IE scores between Experiments 2 and 3, we conducted a Mann–Whitney test in distractor-absent trials and Mixed blocks, separately. Eta-squared for the Mann–Whitney test was defined as z<sup>2</sup>/(N − 1) (Hatch and Lazaraton, 1991).
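As a worked illustration of this effect-size definition, the sketch below converts a Mann–Whitney U into z with the usual normal approximation and then into eta-squared. The absence of tie and continuity corrections is an assumption, as are the function names; the group sizes are those of Experiments 2 and 3 (15 and 22 participants).

```python
import math

def mann_whitney_z(u: float, n1: int, n2: int) -> float:
    """Normal approximation of U (no tie or continuity correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u

def eta_squared(z: float, n_total: int) -> float:
    """Eta-squared defined as z^2 / (N - 1)."""
    return z * z / (n_total - 1)

z = mann_whitney_z(104.0, 15, 22)
print(round(z, 2), round(eta_squared(z, 15 + 22), 2))
```

With U = 104 this gives z ≈ −1.89 and an eta-squared of about 0.10.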

#### Analysis of Distractor-Absent Trials

As in the previous experiment, IE scores measured in the distractor-absent trials were submitted to a one-way ANOVA with Type of Block as a three-level factor (Pure, 60% Cong, 60% Inc). The analysis revealed that the effect of Type of Block was significant [χ 2 (2) = 6.8, p < 0.04, W = 0.15; **Figure 7**]. Responses to distractor-absent trials were slower in 60% Cong block (mean IE ± SD: 412 ± 74 msa) compared both to Pure block (mean IE ± SD: 404 ± 77 msa; z = −2.22, p < 0.03, r <sup>2</sup> = 0.94) and 60% Inc block (mean IE ± SD: 403 ± 76 msa; z = −1.96, p = 0.05, r <sup>2</sup> = 0.92), which did not differ from each other (z = −0.21, p = 0.83). The comparison between Experiments 2 and 3 approached significance for the 60% Inc block (U = 104.00, p = 0.05, η 2 <sup>p</sup> = 0.10), suggesting the presence of a top-down/strategic cost only. This is likely because participants, expecting to perform a second task, paid more attention to the gender of the face, thus improving their performance compared to Experiment 2, especially in distractor-absent trials.

#### Sequential Effects on Distractor-Absent Trials

To test the possible effect of conflict in the preceding trial on responses, distractor-absent trials of Mixed blocks were sorted according to the Type of Preceding Trial and analyzed with a two-way ANOVA factoring Type of Block (60% Cong block vs. 60% Inc block) and Type of Preceding Trial (distractor-absent vs. congruent distractor vs. incongruent distractor). Neither the main effects [Type of Block: χ 2 (1) = 2.9, p = 0.09, W = 0.13; Type of Preceding Trial: χ 2 (2) = 1.1, p = 0.58, W = 0.02] nor their interaction [χ 2 (5) = 4.9, p = 0.43, W = 0.04] were significant, indicating no reactive conflict adaptation effect carrying over from the previous trial, but only a proactive adaptation (**Figure 8**).

#### Congruency Effects in Mixed Blocks

Again, IE scores measured in Mixed blocks were submitted to a two-way ANOVA with Type of Block (60% Cong vs. 60% Inc) and Type of Trial (distractor-absent vs. congruent distractor vs. incongruent distractor) as within-subjects factors. The main effect of Type of Block [χ 2 (1) = 6.5, p < 0.02, W = 0.30] and the interaction between Type of Block and Type of Trial [χ 2 (5) = 12.0, p < 0.04, W = 0.11] were significant. Moreover, the comparison between Experiments 2 and 3 approached significance for distractor-absent trials (U = 104.00, p = 0.05, η 2 <sup>p</sup> = 0.10), suggesting an overall increase of attention (top-down/strategic control) when a second task (the discrimination of face gender) needs to be performed. Each type of Mixed block was then explored with a one-way ANOVA factoring Type of Trial (distractor-absent vs. congruent distractor vs. incongruent distractor).

Within 60% Cong blocks, an effect of Type of Trial was found to approach significance [χ 2 (2) = 5.8, p = 0.06, W = 0.13], indicating slower reaction times in distractor-absent trials (mean IE ± SD: 418 ± 77 msa) relative to both congruent (mean IE ± SD: 413 ± 76 msa; z = −2.00, p = 0.05, r <sup>2</sup> = 0.97) and incongruent distractor trials (mean IE ± SD: 409 ± 73 msa; z = −2.16, p < 0.04, r <sup>2</sup> = 0.94), which did not differ from each other (z = −1.18, p = 0.24; **Figure 9**).

Type of Trial (distractor-absent, congruent distractor, and incongruent distractor), separately for 60% Cong block and 60% Inc block data. Results of Experiment 2 are also plotted on background for a comparison. Error bars represent the standard error of the means across participants. Asterisks mark significantly different means.

Within 60% Inc blocks, an effect of Type of Trial was found to approach significance [χ 2 (2) = 5.8, p = 0.05, W = 0.13], indicating faster reaction times in distractor-absent trials (mean IE ± SD: 403 ± 76 msa) relative to incongruent distractor trials (mean IE ± SD: 412 ± 74 msa; z = −1.96, p = 0.05, r <sup>2</sup> = 0.92; all the other ps were >0.12, **Figure 9**).

#### Gratton Effect

In order to investigate the Gratton effect, a 2 × 2 ANOVA was run on congruency effect values (i.e., the IE-differences on incongruent minus congruent trials), with Type of Block (60% Cong vs. 60% Inc blocks) and Previous Trial Congruency (congruent vs. incongruent) as factors. Neither the main effects [Type of Block: F(1,21) = 0.26, p = 0.61, η 2 <sup>p</sup> = 0.01; Previous Trial Congruency: F(1,21) = 0.89, p = 0.36, η 2 <sup>p</sup> = 0.04] nor their interaction [F(1,21) = 0.82, p = 0.38, η 2 <sup>p</sup> = 0.04] were significant, indicating that conflict in a preceding trial had no effect on the following trial.

#### DISCUSSION

In the present study, we tested whether proactive control processes for filtering-out irrelevant stimulus information (e.g., Marini et al., 2013) were recruited in a social attention task where participants made a speeded directional judgment based on either head orientation or gaze direction cues in contexts with varying probability of conflict between the two cues. In particular, we were interested in studying whether the weights attributed to directional information coming from the head and the eyes during the integration of multiple and conflicting cues could be adjusted based on the prior knowledge of different probabilities of conflict, thus affecting the head-gaze congruency effect (RT in incongruent minus congruent trials). Proportion of conflict (incongruent trials) within Mixed blocks (Experiments 1–3) and cognitive load (Experiment 3) were taken into account as variables that could potentially modulate the congruency effect. If head orientation and gaze direction could be filtered out when task irrelevant, then a distraction-filtering cost (i.e., slower responses and larger IE score on distractor-absent trials of Mixed blocks relative to the same trials of Pure blocks) should be found both for head and gaze cues. In addition, the congruency effect should be modulated both proactively and reactively by conflict probability. Finding these results when both the head and the gaze are task irrelevant would suggest similar weights for these directional cues in the integration process. By contrast, the absence of a distraction-filtering cost and thus the lack of proactive and reactive modulation when gaze direction or head orientation is irrelevant, would indicate a difference in the weights of head and gaze cues.

In Experiment 1, where head orientation was irrelevant, we found that the Mixed blocks cost was modulated by the frequency of conflict and the presence or absence of conflict in the previous trial. Specifically, distractor-absent trials in Mixed blocks following an incongruent distractor trial were slower than those following a distractor-absent trial. This is evidence of a reactive trial-to-trial adjustment triggered by the occurrence of conflict. However, distractor-absent trials following a distractor-absent trial in Mixed blocks were slower than distractor-absent trials following a distractor-absent trial in Pure blocks, suggesting that the observed slowing-down is not fully accounted for by trial-to-trial adjustments, but also involves a proactive component. This adaptation may be guided by previous knowledge of the probability of conflict, and/or by the contingent trial history of proportion congruency.

The congruency effect was present in 60% Cong blocks whereas it was absent in 60% Inc blocks. Interestingly then, in those Mixed blocks where a specific type of distractor-present trial (congruent or incongruent) was relatively frequent, responses were also faster than for the relatively infrequent trial type (see **Figure 4**). This indicates that participants have benefitted from prior knowledge about the probability of conflict at the block level. This suggests that the control of proactive distraction-filtering at the block-level and the reactive trial-to-trial adjustments to conflicting distraction may be controlled by different dynamics. One possibility is that a central monitoring system, whenever conflict occurs, triggers reactively a top-down control mechanism that enhances distraction filtering in the next trial, which results in the observed reduction of interference. Some authors (Nieuwenhuis et al., 2006; Lamers and Roelofs, 2011) did not consider the Gratton effect explainable by the conflict-monitoring hypothesis (Botvinick et al., 2001) and suggested that simple stimulus or response repetition may account for this effect. In Experiment 1, when a given type of trial was repeated, the Gratton effect was different based on the block-level probability of conflict, which does not seem to be compatible with the stimulus or response repetition account of the Gratton effect. Instead, because the conflict-probability context at the block-level modulated the magnitude of the Gratton effect, our findings concur with the idea that a general conflict-monitoring system may control trial-to-trial the magnitude of the congruency effect in the current paradigm.

On the contrary, as expected, in Experiment 2 where gaze direction was the distracting information and head orientation was the target dimension, no filtering cost was found and no congruency effect emerged either. However, in Experiment 3 when the cognitive load of the head orientation discrimination task was increased by asking participants to perform a gender discrimination task as well as a head discrimination one (in 24% of the trials), the Mixed block cost emerged again as in Experiment 1. Specifically, in Experiment 3 we found slower responses to distractor-absent trials in 60% Cong blocks than in Pure blocks and in 60% Inc blocks, whereas responses to distractor-absent trials between the latter two block types were not different from each other. Moreover, unlike in Experiment 1, no reactive conflict adaptation effects carried over from the previous trial emerged. Finally, no significant congruency effects were found in 60% Cong blocks or in 60% Inc blocks, and no Gratton effect emerged either.

Our findings in Experiments 1 and 3 speak in favor of the recruitment of a proactive control mechanism that is at play also when we process conflicting directional social cues coming from the face, such as head orientation and gaze direction. In particular, similar to what has been already reported for non-social stimuli (Marini et al., 2013), this distraction-filtering mechanism is sensitive to contextual factors and intervenes to filter-out one source of information (e.g., eye cues or head cues) from the cue integration process when the observer has prior knowledge that a source of information is irrelevant within a given face-processing context. Specifically, knowing in advance the conflict probability modulates a proactive control mechanism whose function is to maintain active the current task goals (Braver, 2012), thus allowing the observer to perform a directional discrimination judgment task of either the gaze or the head. Interestingly, however, our results clearly demonstrate a qualitatively different pattern of results between the two tasks, suggesting that head and gaze cues are processed together and that, when one of the two is task irrelevant, the weight of the interference is not the same. In particular, in a speeded judgment task as used in the present study, the interference of head orientation on gaze direction judgments was stronger, as indicated both by the presence of the distraction-filtering cost in Experiment 1 and its absence in Experiment 2. In parallel, a proactive distraction-filtering cost was also absent in Experiment 2, suggesting an inherently larger weight attributed to head orientation and a reduced distracting power of the gaze direction in this task. This is in line with the finding reported by Ricciardelli and Driver (2008) who found effects of the congruency between head orientation and gaze direction in a left/right gaze judgment task. As in the present study (Experiment 1), when a speeded judgment of gaze direction was required, faster RTs were found when the head and gaze directions deviated toward the same side (congruency effect; see also Langton, 2000), but only when the full face was visible (global processing). Therefore, under time pressure, the head orientation of the whole face becomes weighted more heavily, so that gaze deviations toward the same direction as the tilted head then become easier to judge rapidly. Indeed, our findings from Experiment 1 show that the distraction-filtering cost in blocks with rare conflict (60% Cong blocks) was comparatively smaller than the one in blocks with frequent conflict (60% Inc blocks) where the inhibition of task-irrelevant information required was stronger (e.g., Ridderinkhof, 2002; Braver, 2012).

Moreover, the finding that the distraction-filtering cost, when gaze direction is the distracting information, emerges only with higher cognitive load (Experiment 3) is in line with the idea that the cognitive load can reduce the efficiency of distraction-filtering by increasing both the distraction-filtering cost and the congruency costs (cf. Experiments 5 and 7 in Marini et al., 2013), particularly in the blocks with frequent distraction. However, the difference between Experiments 2 and 3 is only marginal, and increasing the cognitive load within subjects would have been a better manipulation. Nevertheless, the findings of the present study suggest that the effect of cognitive load is likely because, in order to perceive the gender of the face, the whole face (head and the eyes) needs to be processed and attended. In a speeded task, filtering-out distraction proactively was more effortful when this distracting information came from a local feature such as the gaze direction (vs. head orientation). This explanation would account for the fact that in Experiment 2 (low cognitive load) we did not find any distraction-filtering cost and in Experiment 3 we found it only for distractor-absent trials in Mixed blocks with rare conflict. This is a new finding that extends previous evidence (e.g., Perrett et al., 1992; Sugase et al., 1999; Watanabe et al., 1999; Ricciardelli and Driver, 2008) of a hierarchical organization and processing of head and gaze cues. Taken together, the present and previous results suggest that the visual system may give different weights to head and gaze cues according to different contexts (i.e., the required speed of the judgment, the expectation of conflict and cognitive load). The hierarchy of cues may weight head direction cues more strongly when a speeded visual judgment is required (perhaps because head orientation is a global visual property of the face and is extracted more rapidly than local features; e.g., Sugase et al., 1999). This would explain the pattern of engagement of a distraction-filtering mechanism and the presence of congruency effects observed in the current study.

Another aspect emerging from the present findings concerns the integration of different directional cues (head orientation and gaze direction) of the face that gives rise to the two biases (a repulsive effect and an attraction effect) in perceived gaze direction discussed in the literature when eye and head orientation are conflicting (Moors et al., 2016). Since the present study and a previous study by Ricciardelli and Driver (2008) showed that the weight of these cues in the integration process could be modulated both by the frequency of conflict and by time constraints, these two biases may change as a function of both distraction expectation and speeded task requirement. Further research is needed to test this hypothesis. Moreover, a possible limitation of our study is that we only investigated the top-down modulation triggered by the instruction.

Overall, our study demonstrates that, during attention to social cues, proactive cognitive control mechanisms are modulated by the expectation of conflicting stimulus information, and that conflict adaptation mechanisms intervene flexibly in order to facilitate decisions that are most frequently required depending on the specific task context. Unlike previous results with non-social stimuli (Marini et al., 2013), here the reactive adaptation to conflict as a function of previous trial congruency (i.e., the Gratton effect) was not modulated by the distraction probability at the block level. This result may be specifically related to the processing of social direction cues, and suggests a weaker strategic top-down control when processing social and biological stimuli (see also Marino et al., 2015). In a similar vein, a recent study investigating the conflict between the spatial information conveyed by non-informative gaze and arrow direction and the target spatial position in a cueing task reported that, unlike other conflict tasks (the Simon task), in the gaze and arrow cueing task the previous-trial congruence modulated only responses to the following congruent trials (Ciardo et al., 2018). The fact that the congruency of the preceding trial did not affect performance in the subsequent incongruent trials is in line with the idea of a failure of an inhibitory mechanism in suppressing the automatic orienting of attention triggered by gaze and overlearnt directional cues (such as arrows), even following a conflicting event.

In conclusion, the present study shows that filtering-out potentially conflicting head information is a more resource-demanding process than filtering out conflicting gaze direction and entails a cost likely related to proactive control mechanisms. Accordingly, this cost is larger when conflict occurs frequently (vs. rarely), but the opposite was true when gaze direction was irrelevant and the task cognitive load was increased.

Perceiving where someone else is attending relies on the integration of multiple social cues. This integration process can be modulated by proactive control mechanisms that are sensitive to the context (i.e., the probability of encountering conflicting social cues within a given experimental block). These cognitive control mechanisms consist of proactive slowing-down due to the expectation of distraction and conflict adaptation. Together, they help focus attention on relevant cues (e.g., gaze) and away from irrelevant ones (e.g., head orientation).

# AUTHOR CONTRIBUTIONS

AR, FM, and PR conceived the experiments. AR implemented the study and collected part of the data. AR and BM analyzed the data and drew the figures. All authors drafted the manuscript, provided critical revisions, and approved the final version of the manuscript.

#### FUNDING

AR was supported by Fondo di Ateneo grant (FA 2015) awarded to PR from University of Milano-Bicocca. PR was supported by Fondo di Ateneo Quota Competitiva (FAQC 2015) from University of Milano-Bicocca. FM and BM were supported by University of Milano-Bicocca through a doctoral and postdoctoral fellowship, respectively.

#### ACKNOWLEDGMENTS

The authors thank Carmen Campana for her help in data collection and Angelo Maravita for his valuable advice at the initial stage of the study. The authors also thank Tim Vaughan for helping by revising the English.

#### REFERENCES

task: conflict adaptation or associative priming? Mem. Cogn. 34, 1260–1272. doi: 10.3758/BF03193270


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer MD and the handling Editor declared their shared affiliation.

Copyright © 2018 Ravagli, Marini, Marino and Ricciardelli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# TVA-Based Assessment of Visual Attention Using Line-Drawings of Fruits and Vegetables

#### Tianlu Wang<sup>1</sup> and Celine R. Gillebert<sup>1,2</sup>\*

<sup>1</sup> Department of Brain and Cognition, KU Leuven, Leuven, Belgium, <sup>2</sup> Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom

Visuospatial attention and short-term memory allow us to prioritize, select, and briefly maintain part of the visual information that reaches our senses. These cognitive abilities are quantitatively accounted for by Bundesen's theory of visual attention (TVA; Bundesen, 1990). Previous studies have suggested that TVA-based assessments are sensitive to inter-individual differences in spatial bias, visual short-term memory capacity, top-down control, and processing speed in healthy volunteers as well as in patients with various neurological and psychiatric conditions. However, most neuropsychological assessments of attention and executive functions, including TVA-based assessment, make use of alphanumeric stimuli and/or are performed verbally, which can pose difficulties for individuals who have trouble processing letters or numbers. Here we examined the reliability of TVA-based assessments when stimuli are used that are not alphanumeric, but instead based on line-drawings of fruits and vegetables. We compared five TVA parameters quantifying the aforementioned cognitive abilities, obtained by modeling accuracy data on a whole/partial report paradigm using conventional alphabet stimuli versus the food stimuli. Significant correlations were found for all TVA parameters, indicating a high parallel-form reliability. Split-half correlations assessing internal reliability, and correlations between predicted and observed data assessing goodness-of-fit were both significant. Our results provide an indication that line-drawings of fruits and vegetables can be used for a reliable assessment of attention and short-term memory.

#### Edited by:

Kathrin Finke, Friedrich-Schiller-Universität Jena, Germany

#### Reviewed by:

Randi Starrfelt, University of Copenhagen, Denmark Christian H. Poth, Bielefeld University, Germany

> \*Correspondence: Celine R. Gillebert celine.gillebert@kuleuven.be

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 27 November 2017 Accepted: 07 February 2018 Published: 27 February 2018

#### Citation:

Wang T and Gillebert CR (2018) TVA-Based Assessment of Visual Attention Using Line-Drawings of Fruits and Vegetables. Front. Psychol. 9:207. doi: 10.3389/fpsyg.2018.00207

Keywords: theory of visual attention, executive functions, visuospatial attention, short-term memory, assessment, neuropsychology

# INTRODUCTION

Visuospatial attention, executive control, and short-term memory are essential in the daily human interaction with the environment, and deficits in these domains have devastating effects on the quality of life (Van Zandvoort et al., 1998). The process of perceiving and processing changes in the visual environment has been extensively studied and quantified through the theory of visual attention (TVA), a quantitative account of attention and short-term memory in healthy adults (Bundesen, 1990; Finke et al., 2005; Habekost, 2015). In TVA-based assessments, performance on whole and partial report tasks is typically assessed with self-reports (reporting as many as possible of the previously presented target stimuli, e.g., Bublak et al., 2005; Chechlacz et al., 2015) or probed change detection reports (choosing whether a probe stimulus is the same as the previously presented target stimulus, e.g., Kyllingsbæk and Bundesen, 2009; Gillebert et al., 2012).

Five basic parameters can be estimated from this performance: the storage capacity of visual short-term memory (VSTM) K, the visual processing speed C, the minimum effective exposure duration t0, the efficiency of top-down selectivity α, and the distribution of attentional weights across the visual field ω.
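How some of these parameters interact can be illustrated with a deliberately simplified sketch of the exponential encoding race that underlies TVA (Bundesen, 1990). It assumes that every item is processed at a rate proportional to its share of the attentional weights, that distractors carry weight α relative to a target weight of 1, and that the VSTM limit K and the spatial bias ω can be ignored. It is not the fitting procedure used in this study; the function name and the parameter values are hypothetical.

```python
import math

def encoding_probability(exposure_ms, C, t0, weights, index):
    """Probability that item `index` is encoded by the end of the exposure.

    C       : total processing speed (items per second)
    t0      : minimum effective exposure duration (ms)
    weights : attentional weights of all items in the display
    Simplification: the VSTM capacity limit K is ignored.
    """
    effective_s = max(0.0, exposure_ms - t0) / 1000.0
    rate = C * weights[index] / sum(weights)  # this item's share of C
    return 1.0 - math.exp(-rate * effective_s)

# One target (weight 1) and one distractor (weight alpha), 120 ms exposure
unselective = encoding_probability(120, C=50, t0=20, weights=[1.0, 1.0], index=0)
selective = encoding_probability(120, C=50, t0=20, weights=[1.0, 0.3], index=0)
print(round(unselective, 3), round(selective, 3))
```

A lower α (more efficient top-down selectivity) leaves a larger share of the processing speed to the target, so the target is more likely to be encoded within the same exposure duration.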

Each of these parameters represents a distinct facet of visual attention and disruptions in any may have a significant impact on the quality of life (Mitchell et al., 2010). For instance, an imbalance in the distribution of attentional weights across the visual field is a main symptom of hemispatial neglect, one of the most common and disabling attentional disorders after stroke (Hyndman et al., 2008; Corbetta and Shulman, 2011). However, other facets of attention including visual processing speed, short-term memory capacity and top-down selectivity are also affected in stroke patients with hemispatial neglect (e.g., Duncan et al., 1999; Habekost and Bundesen, 2003). Besides hemispatial neglect, several clinical studies have shown that TVA-based assessment yields sensitive and reliable measures of cognitive abilities in patients with acquired brain injury, neurodevelopmental disorders, aging and neurodegenerative disorders, as well as neuropsychiatric disorders (see Habekost, 2015, for a review).

A standard paradigm which has emerged in recent years is the CombiTVA paradigm (Vangkilde et al., 2011). This combined whole/partial report paradigm delivers sensitive measures of attention and short-term memory informed by TVA-based modeling of attention functions. The assessment consists of a whole-report part, during which as many stimuli as possible are reported, and a partial-report part, during which only stimuli with a certain target feature are reported. Conventionally, TVA-based assessments are performed with simple letter stimuli, but digits and short words have also been used (Starrfelt et al., 2009; Habekost et al., 2014a). An assessment that is not based on alphanumeric stimuli could be helpful to measure attention and short-term memory impairments in individuals who have difficulties recognizing and processing letters or numbers. For example, such alternative assessments could be valuable in testing for neurodevelopmental disorders in young children who have not yet learned the alphabet, or in individuals whose reading (and/or processing of letters) is impaired by a neuropsychological disorder. Previously, studies have assessed visual processing speed (Peers et al., 2005) or VSTM capacity (Sørensen and Kyllingsbæk, 2012) using images instead of letters, but to date no full TVA-based assessment has been published with non-alphanumeric stimuli.

In the current study, we examined the reliability of TVA-based assessment using a different set of stimuli. To this end, we adapted a whole/partial report paradigm to include stimuli that consist of line-drawings of common, familiar, and easily distinguishable fruits and vegetables, and measured the five quantitative parameters in the same participants for both the "alphabet stimuli" and the "food stimuli." We tested the parallel-form reliability of TVA-based assessments by calculating the correlations between the five basic parameters obtained with the food stimuli versus those obtained with the conventional alphabet stimuli. We also assessed the internal reliability of both stimulus sets by calculating the split-half correlations.

Previous studies assessing attention and short-term memory without TVA-based modeling have shown complex stimuli to have a higher visual information load, which in turn holds an inverse linear relationship to the number of stimuli one can hold in memory (Alvarez and Cavanagh, 2004). We therefore expected the increased complexity of the food stimuli to result in a lower VSTM capacity K as well as a lower processing speed C, which has previously been shown to be correlated to K (Finke et al., 2005). The increased complexity of the stimuli has also been shown to be expressed through a lower efficiency of search for a target among distractors from the same category (Awh et al., 2007), which we, in our TVA-based study, expected to lead to a higher value for the top-down control parameter α. The effect of stimulus type and complexity on the distribution of attentional weights has not been examined yet in the context of TVA. Based on earlier work on perceptual performance at short presentation durations, we expected the perceptual threshold t<sub>0</sub> in our TVA-based study to represent perceptual limitations rather than VSTM capacity limitations, and hence to be correlated to the decrease in C, and to increase for the food stimuli (Eng et al., 2005).

# MATERIALS AND METHODS

#### Participants

A total of 36 right-handed healthy volunteers with normal or corrected-to-normal vision participated in the experiment. We excluded participants with a previous history of neurological or psychiatric disorders or participants with red-blue color blindness. The mean age was 22.5 years (SD = 2.8 years, range: 19–30), 8 were male, and 28 were female. One participant was a secondary school graduate, 28 were current bachelor or master students of the University of Leuven, Belgium, 3 were master graduates, and 4 were current doctoral students. All participants provided written informed consent in accordance with the Declaration of Helsinki. The protocols were approved by the Social and Societal Ethics Committee (Reference number: G2017 02 787).

# Apparatus and Stimuli

The stimuli were displayed on an ASUS VG248QE 1920×1080 24-inch monitor (refresh rate set at 100 Hz). The paradigm was presented using Unity® software (version 5.5.1f1<sup>1</sup>). Unity scripts controlled the timing and durations of the stimuli displays according to the frame rate. Stimuli were chosen from a set of 20 capital alphabet letters or 20 vector line drawings<sup>2</sup> of various fruits and vegetables (Supplementary Materials) with a maximum of 100 pixels in the x- and y-dimensions corresponding to approximately 2.7° of visual angle for both sets. The luminance of the red color of the targets and blue color of the distractors were 22.5 and 28.3 cd/m<sup>2</sup>, respectively.
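The reported stimulus size can be checked with a short calculation. The sketch below converts a size in pixels to degrees of visual angle; it assumes square pixels, a 24-inch viewable diagonal, and the viewing distance of approximately 60 cm given in the Procedure section. The function name and arguments are illustrative and are not taken from the authors' Unity scripts.

```python
import math

def visual_angle_deg(size_px, screen_diag_in, res_x, res_y, distance_cm):
    """Visual angle (in degrees) subtended by a stimulus of size_px pixels."""
    # Physical screen width from the diagonal, assuming square pixels.
    diag_px = math.hypot(res_x, res_y)
    width_cm = screen_diag_in * 2.54 * res_x / diag_px
    # Physical stimulus size on the screen.
    size_cm = size_px * width_cm / res_x
    # Angle subtended at the eye by a centrally viewed object of that size.
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

# 100-pixel stimulus on a 24-inch 1920x1080 monitor viewed from 60 cm.
angle = visual_angle_deg(100, 24, 1920, 1080, 60)
print(f"{angle:.2f} degrees")
```

Under these assumptions the result lies close to the approximate value reported in the text; small deviations are expected because the viewing distance was itself approximate.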

<sup>1</sup>https://unity3d.com

<sup>2</sup>Dreamstime. https://www.dreamstime.com/stock-illustration-fruit-vegetableline-art-icons-big-set-design-vector-modern-thin-outline-fresh-healthy-foodsymbols-image73217933 [accessed April 21, 2017].

# Procedure

In the current study, we adopted the CombiTVA (Vangkilde et al., 2011) paradigm designed as a combination of the two classical experimental paradigms, whole report (Sperling, 1960) and partial report (Shibuya and Bundesen, 1988), allowing full assessment of distinct facets of visual attention within a single task (**Figure 1**; Vangkilde et al., 2011). The established procedures used briefly presented, multi-stimuli displays in which participants were asked to identify all the stimuli (whole report trials, where processing and memory capacity can be measured), or to only report a subset of stimuli with certain features (partial report trials, to measure attentional selection).

The participants were seated in a semi-dark room approximately 60 cm from the screen. The testing session consisted of two parts of approximately 25 min, lasting 1 h and 15 min in total including instructions and breaks. The whole/partial report paradigm was repeated twice, once with alphabet stimuli and once with food stimuli (**Figure 2**). The order was counterbalanced across participants with half of the participants starting with the alphabet stimuli followed by the food stimuli in the second part, and the other half starting with the food stimuli. Participants were given standardized written and verbal instructions. Before the start of the assessment, the participants first practiced matching the alphabet or food stimuli presented centrally on the screen, one-by-one in a randomized order, in order to become acquainted with the stimuli and the response keys on the keyboard. This was repeated twice for the alphabet stimuli and five times for the food stimuli, since the participants were assumed to be familiar with the alphabet letters on the keyboard.

FIGURE 1 | Outline of a single trial in the CombiTVA paradigm (Vangkilde et al., 2011) showing the timing and three display types: six target whole report (red letters), two targets whole report, and the two targets and four distractors partial report trial (red and blue letters).

FIGURE 2 | Alternative food stimuli for the whole (A,B) and partial (C) report paradigm: vector line drawings of fruits and vegetables (not to scale for visibility purposes).

The whole/partial report paradigm consisted of five practice blocks of 26 trials each for both alphabet and food stimuli, and nine experimental blocks of 40 trials each. All trials shared the same basic design as illustrated in **Figure 1**. A trial started with a red cross (approximately 1° of visual angle) presented in the center of the screen, which participants were instructed to fixate throughout the trial. After a delay of 1000 ms, a stimulus display was presented around an imaginary circle (r = 7.5° of visual angle) with six possible stimulus locations. The stimulus display was followed by a mask (made from red and blue stimulus fragments completely covering the six stimulus locations) or a black screen (in unmasked trials) presented for 500 ms, and finally a black screen without fixation cross indicating that the participants should respond by typing in the target stimuli that they had seen on a regular keyboard or a keyboard with customized stickers on the keys (**Figure 3**). The unmasked trials were added to increase the motivation of the participant by making the task easier.

The masked whole report trials used red target stimuli with either two stimuli presented for 80 ms or six stimuli presented for one of six stimulus durations (10, 20, 50, 80, 140, or 200 ms) followed by a mask of 500 ms. The unmasked whole report trials presented six red target stimuli for one of two stimulus durations (10 or 200 ms). The partial report trials consisted of two red target stimuli and four blue distractor stimuli presented for 80 ms followed by a mask of 500 ms. The different trial types are listed in **Table 1**. The sequence was randomized and each trial featured randomly chosen stimuli with the same stimulus appearing only once in that trial.
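The trial structure described above can be sketched in a few lines. The example below is an illustration only and is not the authors' Unity implementation: the stimulus names are placeholders, the random assignment of items to the six locations is an assumption, and the number of trials per type is deliberately left out because it is specified in Table 1. It also shows that every exposure duration is a whole number of frames at the 100 Hz refresh rate.

```python
import random

REFRESH_HZ = 100      # monitor refresh rate reported in the text
N_LOCATIONS = 6       # stimulus positions on the imaginary circle
STIMULUS_SET = [f"stim{i:02d}" for i in range(20)]  # placeholder names

def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Convert a duration to a whole number of display frames."""
    if (duration_ms * refresh_hz) % 1000 != 0:
        raise ValueError("duration is not a whole number of frames")
    return duration_ms * refresh_hz // 1000

def make_trial(n_targets, n_distractors, duration_ms, masked, rng):
    """Build one trial; each stimulus appears at most once per trial."""
    n_items = n_targets + n_distractors
    stimuli = rng.sample(STIMULUS_SET, n_items)          # no repeats
    locations = rng.sample(range(N_LOCATIONS), n_items)  # assumed random
    colors = ["red"] * n_targets + ["blue"] * n_distractors
    return {
        "stimuli": stimuli,
        "locations": locations,
        "colors": colors,
        "frames": ms_to_frames(duration_ms),
        "masked": masked,
    }

rng = random.Random(0)
whole_report = make_trial(6, 0, 140, True, rng)    # six targets, masked
partial_report = make_trial(2, 4, 80, True, rng)   # two targets, four distractors
print(whole_report["frames"], partial_report["colors"])
```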

#### Instructions

Before testing, all participants were told that the speed of their response was irrelevant and they should report as many of the red target stimuli as they were "fairly certain" of having seen. They were informed that feedback on the accuracy of their reports (as a percentage, based only on reported but not missed letters) would be given after each block. We asked them to keep their reports within an accuracy range of 80–90% correct (Vangkilde et al., 2011). The response time was unlimited.
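The feedback score described above counts only reported items, so missed targets do not lower it. A minimal sketch of that computation follows; the function names and the example reports are hypothetical.

```python
def block_accuracy(trials):
    """Percentage of reported items that were targets; misses are ignored.

    trials: list of (reported_items, target_items) pairs for one block.
    Returns None when nothing was reported in the block.
    """
    n_reported = sum(len(reported) for reported, _ in trials)
    if n_reported == 0:
        return None
    n_correct = sum(
        1 for reported, targets in trials for item in reported if item in targets
    )
    return 100.0 * n_correct / n_reported

def within_requested_range(accuracy, low=80.0, high=90.0):
    """True when accuracy falls in the range participants were asked to keep."""
    return accuracy is not None and low <= accuracy <= high

# Hypothetical block: three of four reported items were actual targets.
block = [
    (["A", "B"], {"A", "B", "C", "D", "E", "F"}),
    (["K", "Z"], {"K", "L"}),
]
accuracy = block_accuracy(block)
print(accuracy, within_requested_range(accuracy))
```

A score below the range suggests too much guessing, and a score above it suggests overly conservative reporting.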

## Data Analysis

The five basic attention parameters were obtained for each participant and each session by fitting the accuracy data to an ex-Gaussian distribution using a maximum likelihood fitting procedure provided by the LIBTVA toolbox for MATLAB (R2016b, The Mathworks, Natick, MA, United States; for full details on the fitting procedure, see Kyllingsbæk, 2006; Dyrholm et al., 2011). Briefly, the maximum likelihood fitting procedure estimates attentional abilities in terms of five parameters: (1) the capacity of VSTM (K, in elements); (2) the speed of visual information processing (C, in elements/s); (3) the minimum exposure duration for conscious perception (t<sub>0</sub>, in ms); (4) top-down controlled selection, $\alpha = w_{\text{distractor}} / w_{\text{target}}$; and (5) the distribution of attentional weight, $\omega = w_{\text{left}} / (w_{\text{left}} + w_{\text{right}})$. All parameters in our maximum likelihood fitting model were allowed to vary freely; however, if a negative value for t<sub>0</sub> was found in the initial estimation, the value for t<sub>0</sub> was fixed to 0 and the model was refitted to the data (Gillebert et al., 2016). We also included the unmasked trials in the maximum likelihood fitting procedure, resulting in the sensory decay parameter µ. Since this parameter does not hold much relevance for our research question, we did not include it in the following analysis. Differences in parameter values between the two stimulus types were assessed using paired t-tests. To correct for multiple comparisons in the analysis comparing the five TVA parameters between the two stimulus types, we set the threshold for significance to an uncorrected p < 0.01 (Bonferroni-corrected p < 0.05). To assess the parallel-form reliability of TVA-based parameters with food stimuli instead of the conventional alphabet stimuli, we calculated the Spearman's rank correlation and Pearson correlation, and the respective confidence intervals, across all participants (Fisher, 1921; Bonett and Wright, 2000). We assessed the internal reliability of the TVA parameters for both stimulus types for each participant individually by calculating the split-half correlations. We also obtained a measure of the goodness-of-fit by correlating the observed performance scores with the predicted performance scores from the model.
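The confidence intervals for the correlation coefficients can be sketched as follows. This is a minimal illustration that assumes the usual Fisher z-transformation for Pearson's r and the variance approximation (1 + ρ²/2)/(n − 3) associated with Bonett and Wright (2000) for Spearman's ρ; whether the authors used exactly these formulas is an assumption. The input values are illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal

def pearson_ci(r, n, z_crit=Z_95):
    """Fisher z confidence interval for a Pearson correlation."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

def spearman_ci(rho, n, z_crit=Z_95):
    """Confidence interval for Spearman's rho with the inflated variance
    term (1 + rho^2 / 2) / (n - 3); assumed here, see the lead-in."""
    z = math.atanh(rho)
    se = math.sqrt((1.0 + rho ** 2 / 2.0) / (n - 3))
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Illustrative values: a correlation of 0.44 observed in 36 participants.
p_low, p_high = pearson_ci(0.44, 36)
s_low, s_high = spearman_ci(0.44, 36)
print(f"Pearson  95% CI: [{p_low:.2f}, {p_high:.2f}]")
print(f"Spearman 95% CI: [{s_low:.2f}, {s_high:.2f}]")
```

For the same coefficient and sample size, the Spearman interval is slightly wider because its standard error is larger.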

# RESULTS

The mean performance across all participants, expressed as the percentage of correct responses from all reported responses, was 87.1% (SD = 5.6%) for the alphabet stimuli, and 82.7% (SD = 6.0%) for the food stimuli. A paired t-test showed that the performance in the whole report trials was significantly higher for the alphabet stimuli (91.4%, SD = 6.0%) compared to the food stimuli (86.6%, SD = 7.0%; t = 3.76, p < 0.01, df = 35). Performance was not significantly different between stimulus types in the partial report trials (alphabet 69.2%, SD = 17.1%; food 67.9%, SD = 17.8%; t = 0.56, p = 0.58, df = 35).
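The paired comparisons above rest on the paired t statistic, and the Data Analysis section sets a Bonferroni-adjusted threshold for the five TVA parameters. The sketch below shows both computations using only the standard library; the accuracy values are hypothetical and are not the study data.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

def bonferroni_alpha(family_alpha, n_tests):
    """Per-test threshold that keeps the family-wise error at family_alpha."""
    return family_alpha / n_tests

# Hypothetical accuracy scores (%) for five participants.
alphabet = [90, 85, 88, 92, 80]
food = [85, 83, 84, 90, 79]
t_stat, df = paired_t(alphabet, food)
print(f"t = {t_stat:.2f}, df = {df}")
print(f"per-test alpha for five parameters: {bonferroni_alpha(0.05, 5)}")
```

With 36 participants the same computation gives df = 35, as reported in the text.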

**Figures 4A,B** show the observed whole report data of a representative participant (p17). The solid line represents the predicted scores derived by the maximum likelihood fitting procedure. The point on the x-axis where the curve rises from the abscissa is the minimum effective exposure duration (alphabet t<sub>0</sub> = 4 ms; food t<sub>0</sub> = 16 ms), and the initial slope of the curve represents the processing speed (alphabet C = 54 elements/s; food C = 25 elements/s). Where the curve flattens out with increasing exposure duration, the asymptote represents the VSTM storage capacity (alphabet K = 4.1 elements; food K = 1.6 elements). **Figures 4C,D** show the observed and predicted partial report data of the same representative participant. The attentional weight of the distractors relative to the targets is expressed in the top-down control parameter (alphabet α = 0.32; food α = 0.23). Finally, **Figure 5** illustrates the distribution of attentional weights for the left and right hemifields, showing a shift to the right for the food stimuli compared to the alphabet stimuli (alphabet ω = 0.47; food ω = 0.39).
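The relation between the curve and the parameters t<sub>0</sub>, C, and K can be illustrated with a simplified whole-report model. The sketch below is not the LIBTVA fitting procedure: it assumes equal attentional weights across the six items, independent exponential processing times with a total rate of C, and a non-integer K treated as a probability mixture of the two neighboring integers. The parameter values are those reported above for the representative participant with alphabet stimuli.

```python
import math

def finish_probability(t_ms, c_per_s, t0_ms, n_items):
    """Probability that one item finishes processing within exposure t."""
    effective_s = max(0.0, t_ms - t0_ms) / 1000.0
    rate_per_item = c_per_s / n_items  # equal weights assumed
    return 1.0 - math.exp(-rate_per_item * effective_s)

def _score_integer_k(p, k, n_items):
    """Expected number of stored items when at most k items fit in VSTM."""
    return sum(
        min(j, k) * math.comb(n_items, j) * p ** j * (1.0 - p) ** (n_items - j)
        for j in range(n_items + 1)
    )

def expected_score(t_ms, c_per_s, t0_ms, k, n_items=6):
    """Expected whole-report score; non-integer K mixes floor(K) and floor(K) + 1."""
    p = finish_probability(t_ms, c_per_s, t0_ms, n_items)
    k_low = math.floor(k)
    frac = k - k_low
    return ((1.0 - frac) * _score_integer_k(p, k_low, n_items)
            + frac * _score_integer_k(p, k_low + 1, n_items))

# Alphabet stimuli, representative participant: t0 = 4 ms, C = 54/s, K = 4.1.
for duration_ms in (10, 20, 50, 80, 140, 200):
    score = expected_score(duration_ms, 54, 4, 4.1)
    print(f"{duration_ms:>3} ms: {score:.2f} items")
```

In this model the score is zero up to t<sub>0</sub>, rises with an initial slope of C, and levels off at K, which matches the verbal description of the curve.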

Across the participants, the mean VSTM storage capacity K, processing speed C, perceptual threshold t<sub>0</sub>, top-down selectivity α, and laterality index ω for the alphabet stimuli correspond to values found in previous literature for a similar age group (**Table 2**; e.g., Finke et al., 2005; Gillebert et al., 2012; Espeseth et al., 2014). There were no outliers across the parameters for either stimulus type. Across all participants, K, t<sub>0</sub>, C, and α differed significantly between the two stimulus types, but these parameters were also significantly correlated between the alphabet and food stimuli (**Figure 6** and **Table 2**). K, C, and α were significantly lower, while t<sub>0</sub> was significantly higher for the food stimuli relative to the alphabet stimuli. ω was not significantly different, but a trend for a significant correlation was present (Pearson's r = 0.44, p < 0.01; Spearman's ρ = 0.42, uncorrected p = 0.01). Following Cohen's (1988) conventions, the observed effect sizes ranged from medium (ρ = 0.42) to large (ρ = 0.76) for all five parameters.

TABLE 2 | Descriptive statistics of the TVA parameters split into stimulus type, p-values, t-statistics, and degrees of freedom of the paired t-test, correlation coefficients (95% confidence intervals) and p-values of the Spearman's and Pearson's correlation test between the TVA parameters of the alphabet versus food stimuli.

The numbers reported in parentheses represent the Spearman-Brown predicted correlations corrected to the full session length (Brown, 1910; Spearman, 1910).

Split-half correlations indicating the internal reliability of the TVA-assessment for both stimulus types are shown in **Table 3**. All correlations were significantly different from zero at p < 0.001, both before and after corrections to full session length with the Spearman–Brown prediction formula (Habekost and Rostrup, 2006; Habekost et al., 2014b). The observed and the predicted mean scores for the whole and partial report trials were strongly correlated for both stimulus types, averaged over all participants (alphabet r = 0.95, SD = 0.03; food r = 0.87, SD = 0.07).
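The Spearman–Brown prediction formula mentioned above estimates the reliability of the full-length test from the correlation between its two halves. A minimal sketch follows; the function name is ours and the input values are illustrative rather than taken from Table 3.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Predicted reliability when the test is length_factor times longer.

    With length_factor = 2 this is the usual split-half correction,
    r_full = 2 * r_half / (1 + r_half).
    """
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)

# Illustrative split-half correlations.
for r_half in (0.5, 0.7, 0.9):
    print(f"r_half = {r_half:.2f} -> r_full = {spearman_brown(r_half):.3f}")
```

The corrected value is always at least as large as the split-half value for positive correlations, because each half contains only half of the trials.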

#### DISCUSSION

Previous studies have shown that TVA-based assessment based on letter report can yield sensitive and reliable measures for both visuospatial attention and short-term memory (Bundesen, 1990; Vangkilde et al., 2011). However, assessments of attention, executive control, and short-term memory that are not based on alphanumeric stimuli may be useful for an extended target group. Here we created a set of visual stimuli that are not based on letters or digits, and examined the effect of stimulus type on attention parameters using a TVA-based procedure in the context of Bundesen's theory of visual attention (Alvarez and Cavanagh, 2004; Eng et al., 2005). For healthy participants, our analyses indicated a significant correlation between the parameters K, t<sub>0</sub>, C, and α derived with the alphabet and food stimuli, with a trend present for ω as well. The data collected using food stimuli instead of alphabet stimuli could be closely modeled, as indicated by the high correlation between the observed and the predicted performance scores. The significant split-half correlations indicate a high internal reliability of the TVA paradigm with the food stimuli.

As expected from previous studies examining the capacity of VSTM for stimuli with varying degrees of complexity both with or without TVA-based modeling, the parameter K in our study was significantly lower for food stimuli compared to alphabet stimuli (Alvarez and Cavanagh, 2004; Eng et al., 2005; Sørensen and Kyllingsbæk, 2012). It should, however, be noted that participants were not familiar with the keyboard used for reporting the food stimuli prior to the experiment. Although the participants received ample time to familiarize themselves with the keyboard and to practice reporting, we cannot exclude that the longer and more difficult search may have led to the decay of the representations in short-term memory. The visibility of all 20 food stimuli on the keyboard during the search may also have interfered with the retention of the perceived target stimuli. Finally, it is possible that the food stimuli, although non-alphanumeric, were encoded verbally in short-term memory. Any differences between the verbal short-term memory span of the two stimulus types could have played a role in the decreased VSTM capacity, and this should be investigated further.

The processing speed C was significantly lower for the food stimuli compared to the alphabet stimuli, while the perceptual threshold t<sub>0</sub>, which is the minimum amount of time needed to perceive the stimuli, was higher (Shibuya and Bundesen, 1988). Notably, opposite to our expectations, the top-down attentional control parameter α was significantly lower for the food stimuli compared to the alphabet stimuli. As α denotes the ratio between the distractor weights and target weights, the lower value of α indicated that the participants were relatively better able to prioritize the processing of the targets compared to the distractors for food stimuli compared to alphabet stimuli (Botvinick and Braver, 2015). Unlike the whole report trials where the performance with the food stimuli was worse than with the alphabet stimuli, the performance in the partial report trials with the food stimuli closely matched the performance with the alphabet stimuli. It is possible that the increased complexity of the stimuli enabled the participants to better focus on the red feature of the targets, thereby ignoring the task-irrelevant distractors (Botvinick and Braver, 2015). Future studies should test and correct for these factors as needed. The distribution of attentional weights was not significantly different between the two stimulus types, but a significant Pearson's correlation and a trend for a significant Spearman's correlation were present. The inter-individual variability in this parameter was much larger for the food stimuli compared to the alphabet stimuli, with more extreme biases toward the left or right visual field. Given the lower capacity of VSTM for food stimuli compared to alphabet stimuli, participants may have prioritized one or two spatial locations rather than distributed their attention over all spatial locations.

The significant Spearman correlations of the other four parameters suggested that the individual ranks of the participants were maintained when performing the assessment with the food stimuli. Several previous studies have examined the sensitivity of TVA-based assessments in highlighting the inter-individual differences in visuospatial attention (Foerster et al., 2016). The significant correlations provide an indication that the sensitivity of TVA-based assessments that make use of conventional alphanumeric stimuli also transfers to assessments that make use of the new stimulus set of line-drawings of fruits and vegetables.

# CONCLUSION

Our results indicate that using a set of food stimuli maintains the overall parallel-form reliability and the internal reliability of TVA-parameters acquired with the whole/partial report paradigm, which conventionally includes simple alphabet stimuli. Future studies need to examine the performance of this lab-based assessment task with young children who have not learned to read yet, or any patient populations in which assessments making use of non-alphanumeric stimuli are preferred. Future developments should focus on a more patient-friendly bedside method that can be performed in a short time and is robust against varying visual testing conditions (Habekost, 2015). Specifically, for use in patient populations, the assessment would require a smaller stimulus set size that leaves out stimuli with similar shapes which are easily confused, such as the bell pepper and the pumpkin. This would also decrease the difficulty of reporting by lowering the number of stimuli presented on the keyboard. Further adjustments include shorter test durations, more unmasked trials, and multiple-choice answering (Habekost, 2015). Exploring alternative reporting methods in paradigms such as change detection that decrease the involvement of motor function could open opportunities for use of the assessment in patient populations with motor impairments.

# AUTHOR CONTRIBUTIONS

TW and CG conceptualized the paper. TW acquired and analyzed the data, and drafted the paper. CG critically revised the paper for important intellectual content.

# FUNDING

This work was supported by the Research Foundation – Flanders (G072517N) and the Wellcome Trust (101253/A/13/Z).

# ACKNOWLEDGMENTS

We would like to acknowledge the volunteers who participated in this study for their time and effort. We also thank the colleagues at the Department of Brain and Cognition for participating in the initial versions of the experiment and providing valuable feedback to the experimental paradigm, as well as two reviewers for their constructive comments on the manuscript.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00207/full#supplementary-material

# REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Wang and Gillebert. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Within-Subject Correlation Analysis to Detect Functional Areas Associated With Response Inhibition

Tomoko Yamasaki<sup>1</sup>, Akitoshi Ogawa<sup>1</sup>, Takahiro Osada<sup>1</sup>, Koji Jimura<sup>2</sup> and Seiki Konishi<sup>1,3,4</sup>\*

<sup>1</sup>Department of Neurophysiology, Juntendo University School of Medicine, Tokyo, Japan, <sup>2</sup>Department of Biosciences and Informatics, Keio University School of Science and Technology, Yokohama, Japan, <sup>3</sup>Research Institute for Diseases of Old Age, Juntendo University School of Medicine, Tokyo, Japan, <sup>4</sup>Sportology Center, Juntendo University School of Medicine, Tokyo, Japan

Functional areas in fMRI studies are often detected by brain-behavior correlation, calculating across-subject correlation between the behavioral index and the brain activity related to a function of interest. Within-subject correlation analysis is also employed at the single-subject level, which utilizes cognitive fluctuations in a shorter time period by correlating the behavioral index with the brain activity across trials. In the present study, the within-subject analysis was applied to the stop-signal task, a standard task to probe response inhibition, where efficiency of response inhibition can be evaluated by the stop-signal reaction time (SSRT). Since the SSRT is estimated, by definition, not on a trial basis but from pooled trials, the correlation across runs was calculated between the SSRT and the brain activity related to response inhibition. The within-subject correlation revealed negative correlations in the anterior cingulate cortex and the cerebellum. Moreover, a dissociation pattern was observed in the within-subject analysis when earlier vs. later parts of the runs were analyzed: negative correlation was dominant in earlier runs, whereas positive correlation was dominant in later runs. Regions of interest analyses revealed that the negative correlation in the anterior cingulate cortex, but not in the cerebellum, was dominant in earlier runs, suggesting multiple mechanisms associated with inhibitory processes that fluctuate on a run-by-run basis. These results indicate that the within-subject analysis complements the across-subject analysis by highlighting different aspects of cognitive/affective processes related to response inhibition.

#### Edited by:

Gail Robinson, The University of Queensland, Australia

#### Reviewed by:

Adam David George Hampshire, Imperial College London, United Kingdom Chiang-shan R. Li, Yale University, United States Weidong Cai, School of Medicine, Stanford University, United States

#### \*Correspondence:

Seiki Konishi skonishi@juntendo.ac.jp

Received: 30 October 2017 Accepted: 04 May 2018 Published: 22 May 2018

#### Citation:

Yamasaki T, Ogawa A, Osada T, Jimura K and Konishi S (2018) Within-Subject Correlation Analysis to Detect Functional Areas Associated With Response Inhibition. Front. Hum. Neurosci. 12:208. doi: 10.3389/fnhum.2018.00208

Keywords: human, functional magnetic resonance imaging, cognitive control, executive function, performance

#### INTRODUCTION

In human fMRI studies, brain activity is generally used to identify functional areas associated with brain functions. Brain-behavior correlation is often used to detect functional areas, calculating correlation between the behavioral index and the brain activity related to a particular function at the group level. In the case of the stop-signal task (Logan and Cowan, 1984; Rubia et al., 2001; Aron et al., 2003), a standard task to probe response inhibition, the correlation is calculated between the stop-signal reaction time (SSRT) and the brain activity related to response inhibition. Previous studies have revealed functional areas related to response inhibition, including the inferior frontal cortex, the pre-supplementary motor area, the superior frontal cortex, the anterior cingulate cortex, the striatum, the subthalamic nucleus and the cerebellum (Aron and Poldrack, 2006; Garavan et al., 2006; Li et al., 2006; Aron et al., 2007; Forstmann et al., 2008, 2012; Congdon et al., 2010; Rubia et al., 2010; Boehler et al., 2011; Ghahremani et al., 2012; Hirose et al., 2012; Jimura et al., 2014).


These previous studies of response inhibition calculated the brain-behavior correlations across subjects, regarding data from one subject as one sample for the correlation analysis, based on inter-individual variability. It is also possible to utilize intra-individual variability of executive functions, instead of inter-individual variability, and to calculate correlation across fMRI runs of the same subjects, regarding data from one run of the same subject as one sample for the correlation analysis (**Figure 1A**). Such analyses have been conducted on a trial basis (e.g., Christoff et al., 2001; Yarkoni et al., 2009). It is to be noted, however, that the SSRT is estimated, by definition, not on a trial basis but from pooled trials such as fMRI runs (Logan and Cowan, 1984; Verbruggen et al., 2013). The within-subject analysis may complement the results from the across-subject correlation analysis by focusing on cognitive fluctuations in a shorter time period. However, despite the abundant literature reporting brain-behavior correlation based on the across-subject analysis, very little about response inhibition has been reported based on the within-subject analysis. More broadly, neural mechanisms of learning response inhibition have been studied, mostly tracking time courses of brain activity (Toni et al., 2001; Milham et al., 2003; Kelley et al., 2006; Erika-Florence et al., 2014; Berkman et al., 2014; Hampshire et al., 2016). However, time-related changes of the within-subject correlations have rarely been examined.

In this study, we conducted the within-subject correlation analysis using the data published in a study of the across-subject correlation applied to the stop-signal task (Jimura et al., 2014). Correlation between the SSRT and the brain activity related to response inhibition was calculated based on the across- and within-subject analyses (**Figure 1A**), and the results from both of them were compared. We also examined the time-dependent changes of the within-subject correlation using the same dataset, based on comparison between earlier and later runs as conducted previously (Jimura et al., 2014).
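The difference between the two analyses can be sketched for a single voxel or region. In the example below, the within-subject analysis correlates run-wise SSRT with run-wise activity inside each subject, and the resulting coefficients are Fisher z-transformed before being summarized across subjects. The group-level one-sample t statistic is an assumption on our part, since the exact group statistic is not described in this part of the text, and all data values are hypothetical.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def within_subject_z(ssrt_by_run, activity_by_run):
    """Fisher z of the across-run correlation within one subject."""
    return math.atanh(pearson_r(ssrt_by_run, activity_by_run))

def one_sample_t(values):
    """t statistic testing whether the mean of values differs from zero."""
    n = len(values)
    return statistics.fmean(values) / (statistics.stdev(values) / math.sqrt(n))

# Hypothetical run-wise data (SSRT in ms, activity in arbitrary units).
subjects = [
    ([210, 220, 230, 240], [2.0, 1.0, 4.0, 3.0]),
    ([200, 215, 205, 225], [1.5, 0.9, 1.2, 1.0]),
    ([250, 240, 260, 245], [0.8, 1.1, 0.5, 1.4]),
]
z_values = [within_subject_z(ssrt, act) for ssrt, act in subjects]
print([round(z, 2) for z in z_values], round(one_sample_t(z_values), 2))
```

The across-subject analysis would instead apply `pearson_r` once, to one SSRT value and one activity value per subject.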

#### MATERIALS AND METHODS

#### Subjects

The present study reanalyzed the data published previously (Jimura et al., 2014). Forty-six healthy right-handed subjects (26 males, 20 females; age range: 20–26) participated in this study. This study was carried out in accordance with the recommendations of the guideline regarding the ethics of noninvasive research of human brain functions by Japan Neuroscience Society with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the institutional review board of Juntendo University School of Medicine.

#### Imaging Procedures

The imaging procedures have been described previously in more detail (Jimura et al., 2014). The experiments were conducted using a 3.0 T-MRI system. T1-weighted structural images were then obtained for anatomical reference (76 × 2-mm slices; in-plane resolution: 1 × 1 mm). For functional imaging, a gradient echo echo-planar sequence was used (40 × 4-mm slices; TR = 3000 ms; TE = 50 ms; flip angle = 90°; in-plane resolution: 4 × 4 mm). Each functional run consisted of 64 whole-brain acquisitions. Twelve functional runs were administered for each subject.

#### Behavioral Procedures

The behavioral procedures have been described previously in more detail (Jimura et al., 2014). Subjects performed a stop-signal task (Logan and Cowan, 1984). The stop-signal task is depicted in **Figure 1B**. At the beginning of the trial, a gray circle was presented for 1700 ms. In the GO trial, a green circle was then presented for 800 ms, and the subjects were instructed to make a button press with the right thumb. In the STOP trial, a green circle was presented. After a stop-signal delay (SSD), the green circle was changed to a blue circle, and the subjects were required to withhold the manual response. The colors of the GO signal and the STOP signal were counterbalanced across subjects. The SSD was updated on each STOP trial based on a tracking procedure, allowing us to maintain accuracy of the STOP trial at approximately 50% (Band et al., 2003).
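A tracking procedure of this kind is commonly implemented as a one-up/one-down staircase: the SSD is lengthened after a successful stop, which makes stopping harder, and shortened after a failed stop. The sketch below illustrates the idea; the 50 ms step size and the bounds are hypothetical values chosen for the example, not parameters reported in the text.

```python
def update_ssd(ssd_ms, stop_success, step_ms=50, min_ssd_ms=0, max_ssd_ms=800):
    """One-up/one-down staircase update of the stop-signal delay.

    step_ms, min_ssd_ms and max_ssd_ms are illustrative assumptions.
    """
    if stop_success:
        ssd_ms += step_ms   # successful stop: make the next STOP trial harder
    else:
        ssd_ms -= step_ms   # failed stop: make the next STOP trial easier
    return max(min_ssd_ms, min(max_ssd_ms, ssd_ms))

# Hypothetical sequence of STOP-trial outcomes.
ssd = 200
history = []
for outcome in (True, True, False, True, False, False):
    ssd = update_ssd(ssd, outcome)
    history.append(ssd)
print(history)
```

Because each success is followed by a harder trial and each failure by an easier one, the proportion of successful stops settles near 50%.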

To evaluate the efficiency of response inhibition, a behavioral index, the stop-signal reaction time (SSRT), was estimated for each subject using an integration method (Logan and Cowan, 1984; Verbruggen et al., 2013). Individuals with shorter SSRTs can be considered more efficient in response inhibition (Logan and Cowan, 1984). Each functional run contained 16 STOP trials and 48 GO trials (STOP/GO ratio = 1:3), and each subject underwent a total of 12 runs.
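As a rough illustration, the integration method can be sketched as follows: the GO reaction times are sorted, the RT at the quantile given by the probability of responding on STOP trials is taken, and the mean SSD is subtracted. The function name, the rounding rule for the quantile, and the toy data are assumptions; the exact conventions used in the original study (for example, the handling of GO omissions) are not stated here.

```python
import math


def ssrt_integration(go_rts_ms, ssds_ms, stop_responded):
    """Estimate SSRT with a generic form of the integration method.

    go_rts_ms      : reaction times of GO trials (ms)
    ssds_ms        : stop-signal delays of STOP trials (ms)
    stop_responded : True where the subject failed to stop
    """
    # Probability of responding despite the stop signal
    p_respond = sum(stop_responded) / len(stop_responded)
    sorted_rts = sorted(go_rts_ms)
    # Rank of the GO RT at the p_respond quantile (rounding rule is an assumption)
    n = math.ceil(p_respond * len(sorted_rts))
    n = min(max(n, 1), len(sorted_rts))
    nth_rt = sorted_rts[n - 1]
    mean_ssd = sum(ssds_ms) / len(ssds_ms)
    return nth_rt - mean_ssd


# Toy data (hypothetical): 6 GO trials and 4 STOP trials
go_rts = [400, 450, 500, 550, 600, 650]
ssds = [250, 300, 350, 300]
responded = [True, False, True, False]
ssrt = ssrt_integration(go_rts, ssds, responded)
print(ssrt)  # 200.0
```

In this toy case half of the STOP trials failed, so the third of the six sorted GO RTs (500 ms) is used, and subtracting the mean SSD of 300 ms gives 200 ms.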

#### Data Analysis

The brain activity related to response inhibition was examined in the same way as in the previous study (Jimura et al., 2014). Functional images were preprocessed using SPM8<sup>1</sup>. The images were first realigned, then corrected for slice timing, and spatially normalized to a standard MNI template with interpolation to a 2 × 2 × 2 mm space, followed by spatial smoothing with an 8-mm kernel. Events of interest (GO success and STOP success), together with nuisance events (GO fail and STOP fail), were coded at the onset of the GO signal of each trial and were modeled as transient events in a general linear model. Single-level analysis was performed to estimate signal magnitudes, and the magnitude images were contrasted between STOP success and GO success trials in the 3rd to 12th runs, during which the SSD, SSRT and accuracy of STOP trials were found to be stable (Jimura et al., 2014). Group-level statistics were estimated with a one-sample t-test, treating subjects as a random effect.

As a positive control, an across-subject brain-behavior analysis was performed to replicate the results reported previously (Jimura et al., 2014). A voxel-wise correlation was calculated between the SSRT and the signal magnitude for the contrast STOP success minus GO success during the stable runs (i.e., 3rd to 12th runs). The correlation coefficient was then converted to Fisher's z, which was further normalized to a standard Gaussian z-score to indicate the level of statistical significance.
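The conversion can be sketched for a single voxel as below. Fisher's z is the inverse hyperbolic tangent of the correlation coefficient; the normalization shown here, multiplication by the square root of (n − 3), is the standard large-sample approximation and is an assumption about the exact procedure used. The helper names and example values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher's z-transformation of a correlation coefficient."""
    return math.atanh(r)


def normalized_z(r, n):
    """Approximate standard-normal z-score for a correlation over n samples.

    Assumes the usual approximation that Fisher's z has a standard
    error of 1 / sqrt(n - 3).
    """
    return fisher_z(r) * math.sqrt(n - 3)


# Example: a hypothetical across-subject correlation of -0.5 in 46 subjects
z_score = normalized_z(-0.5, 46)
print(round(z_score, 2))
```

A negative z-score at a voxel indicates that subjects with shorter SSRTs showed larger signal magnitudes there.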

A within-subject brain-behavior analysis was also performed by calculating, for each subject, the correlation across the stable runs (i.e., 3rd to 12th runs) between the SSRT and the signal magnitude for the contrast STOP success minus GO success in each run. The correlation coefficient for each subject was then converted to Fisher's z, and the Fisher's z values were entered into a one-sample group-mean test, treating subjects as a random effect. To correct for multiple comparisons, statistical testing was performed based on nonparametric permutation inference (Eklund et al., 2016) implemented in randomise in the FSL suite (Winkler et al., 2014)<sup>2</sup>. Cluster-wise statistical correction was performed for voxel clusters defined by a threshold (P < 0.01, uncorrected; Eklund et al., 2016), and significance was then assessed at P < 0.05, corrected for multiple comparisons within functional areas associated with response inhibition, as identified by a forward-inference meta-analysis in Neurosynth<sup>3</sup> (Yarkoni et al., 2011), for cortical areas, and across the whole brain for other brain areas.
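The logic of the within-subject analysis can be sketched for a single region as follows. Each subject contributes one Fisher's z computed across runs, and the group mean is tested against zero. The sign-flipping permutation test below is a simplified, single-value stand-in for the cluster-wise inference performed with randomise; all function names and data values are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def within_subject_z(ssrt_by_run, signal_by_run):
    """Fisher's z of the across-run correlation for one subject."""
    return math.atanh(pearson_r(ssrt_by_run, signal_by_run))


def one_sample_t(values):
    """One-sample t statistic against zero (subjects as a random effect)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)


def sign_flip_p(values, n_perm=5000, seed=0):
    """Two-sided p-value from random sign flips of the subject values."""
    rng = random.Random(seed)
    observed = abs(one_sample_t(values))
    count = 0
    for _ in range(n_perm):
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if abs(one_sample_t(flipped)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Hypothetical Fisher's z values from eight subjects in one region
subject_z = [-0.50, -0.30, -0.40, -0.60, -0.20, -0.45, -0.35, -0.55]
t_obs = one_sample_t(subject_z)
p_val = sign_flip_p(subject_z)
print(f"t = {t_obs:.2f}, p = {p_val:.4f}")
```

Sign flipping is appropriate for a one-sample test because, under the null hypothesis of zero mean correlation, each subject's z is equally likely to be positive or negative.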

To examine the temporal changes in correlations, the data set (3rd to 12th runs) was divided into two parts. To retain a minimal number of samples for the within-subject correlation analysis, the first six runs (3rd to 8th runs) and the last six runs (7th to 12th runs) were classified as FIRST and SECOND, with the middle 7th and 8th runs included in both parts. Unlike Jimura et al. (2014), where all 46 subjects could be used for the across-subject correlation analysis of FIRST and SECOND, the 10 runs had to be divided for the within-subject correlation analysis of FIRST and SECOND in the present study. We mitigated this issue by duplicating two runs (7th–8th runs): 3rd–8th for FIRST and 7th–12th for SECOND. However, further duplication (6th–9th runs) would not be acceptable because, in the case of the 3rd–9th runs for FIRST and 6th–12th runs for SECOND, more than half of the data points (four out of seven) would be shared. We therefore chose minimal duplication to preserve statistical power. The across- and within-subject analyses were then performed between the SSRTs and activation magnitudes, and the two correlation maps (FIRST and SECOND) were Fisher's z-transformed and normalized to a standard Gaussian z-score.
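The overlap arithmetic in this paragraph can be checked with a short sketch. The function name is hypothetical; the run numbers follow the text.

```python
def split_runs(runs, window):
    """Return the first and last `window` runs and the runs they share."""
    first, second = runs[:window], runs[-window:]
    shared = [r for r in first if r in second]
    return first, second, shared


stable_runs = list(range(3, 13))  # 3rd to 12th runs

# Split used in the study: six runs per part, two shared runs
first, second, shared = split_runs(stable_runs, 6)
print(first, second, shared)

# Wider split rejected in the text: seven runs per part, four shared runs
_, _, shared_wide = split_runs(stable_runs, 7)
print(shared_wide, len(shared_wide) / 7)
```

With six runs per part, one third of each part is shared (7th and 8th runs); with seven runs per part, four of seven data points would be shared, which exceeds one half.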

# RESULTS

#### Behavioral Results

Behavioral results are shown for the 3rd to 12th runs, during which the SSD, SSRT and accuracy of STOP trials were found to be stable (Jimura et al., 2014). The RT of GO trials, SSD and SSRT in the 10 runs were 514.3 ± 73.2 ms (mean ± SD), 314.2 ± 84.9 ms and 197.8 ± 30.9 ms, respectively. The differences between FIRST (3rd to 8th runs) and SECOND (7th to 12th runs) were not significant in any of these behavioral measures (P = 0.12, P = 0.53, P = 0.07, respectively; **Figure 2A**), suggesting that the behavioral efficiency of response inhibition was constant between these periods.

#### Imaging Results

As a positive control, the brain activity during STOP success relative to GO success during the stable period (3rd to 12th runs) was calculated (**Figure 2B**). Although the same authors conducted the analysis, there were slight differences from Jimura et al. (2014) in the brain activation and the across-subject correlation, presumably due to differences in the versions of the OS (MS Windows), Matlab and SPM8. However, as reported previously, activations were observed in multiple areas including the inferior frontal gyrus, pre-supplementary motor area, temporo-parietal junction and anterior insula (Konishi et al., 1998, 1999; Garavan et al., 1999; de Zubicaray et al., 2000; Liddle et al., 2001; Menon et al., 2001; Rubia et al., 2001; Bunge et al., 2002; Durston et al., 2002a,b; Mostofsky et al., 2003; Hester et al., 2004; Kelly et al., 2004; Matsubara et al., 2004; Brass et al., 2005; Aron and Poldrack, 2006; Chambers et al., 2006, 2009; Li et al., 2006, 2008; Leung and Cai, 2007; Sumner et al., 2007; Nakata et al., 2008; Zheng et al., 2008; Cai and Leung, 2009; Chao et al., 2009; Chikazoe et al., 2009a,b; Sharp et al., 2010; van Gaal et al., 2010; Zandbelt and Vink, 2010; Boecker et al., 2011; Arbula et al., 2017). Correlations were also calculated between the SSRTs and the brain activity (STOP minus GO) in the 3rd to 12th runs (**Figure 2C**, see Supplementary Figure S1 for whole-brain slices). Negative correlations were observed in cortical, subcortical and cerebellar regions, consistent with prior studies (Li et al., 2006, 2008; Aron et al., 2007; Congdon et al., 2010; Boehler et al., 2011; Ghahremani et al., 2012; Hirose et al., 2012).

<sup>1</sup>www.fil.ion.ucl.ac.uk/spm

<sup>2</sup>https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/Randomise

<sup>3</sup>http://neurosynth.org/analyses/terms/response%20inhibition/

Because a shorter SSRT indicates more efficient performance, a negative brain-behavior correlation is expected to be associated with response inhibition. **Figure 3A** shows the within-subject correlation in the 3rd to 12th runs. Negative correlations were revealed in the anterior cingulate cortex (peak coordinate: −10, 4, 36; t(44) = −4.5 at (10, 18, 30) from Neurosynth) and the cerebellum (lobule VIII; peak coordinate: −28, −52, −40; t(44) = −4.4; **Figure 3A**, see Supplementary Figure S2 for whole-brain slices). Scatter plots in these two regions are shown in **Figure 3B** for one representative subject. To compare the negative correlation patterns of the across- and within-subject correlations, 10 common regions of interest were defined by averaging the normalized z-maps of the across- and within-subject correlations and selecting the regions with the 10 greatest z-scores. Although the z-scores of the correlation analyses depend on the data structure, that is, the number of subjects and runs, the present dataset exhibited greater negative correlation in the across-subject analysis than in the within-subject analysis (t(9) = 3.1, P < 0.01; Supplementary Figure S3A). Alternatively, the common regions were defined based on an independent dataset, using the coordinates reported in Chikazoe et al. (2009b), where the same authors used a similar version of the stop-signal task. Greater negative correlation in the across-subject analysis was similarly observed (t(5) = 2.6, P < 0.05; Supplementary Figure S3B).

Jimura et al. (2014), using the across-subject analysis, reported greater negative correlation associated with response inhibition in the latter half of the runs than in the earlier half. The temporal changes in the within-subject analysis were also examined in this study by analyzing the FIRST (3rd to 8th runs) and SECOND (7th to 12th runs) parts of the runs. **Figure 4A** (top) shows the within-subject correlation for FIRST runs (see Supplementary Figure S4 for whole-brain slices). Negative correlations were dominant in the whole brain (t(447.9) = −12.0, P < 0.001, with the degrees of freedom corrected by the number of resels). **Figure 4A** (bottom) shows the within-subject correlation for SECOND runs (see Supplementary Figure S5 for whole-brain slices). Conversely, positive correlations were dominant in the whole brain (t(487.5) = 5.0, P < 0.001). The difference between FIRST and SECOND did not reveal any significant correlation when tested with the statistical procedures used for **Figure 3A**. For comparison, the across-subject correlations for FIRST and SECOND runs in whole-brain slices are shown in Supplementary Figures S6 and S7. Regions of interest analyses were further performed using the coordinates from the independent dataset of Chikazoe et al. (2009b). Greater within-subject correlation in FIRST than in SECOND was observed in the anterior cingulate region (t(45) = 2.4, P < 0.05), whereas no correlation difference was observed in the cerebellar region (**Figure 4B**). Additionally, the within-subject correlation analysis was performed for Go RT instead of SSRT. There was little within-subject correlation in the anterior cingulate or cerebellar regions (Supplementary Figure S8).

# DISCUSSION

The present study employed the within-subject correlation analysis, calculating across-run correlation for each subject between the behavioral index and the brain activity associated with response inhibition. Within-subject correlation was observed in the anterior cingulate cortex and the cerebellum. Moreover, differential patterns of correlation were observed in the earlier vs. later runs. These results suggest that the within-subject correlation analysis complements the across-subject correlation analysis by revealing different aspects of cognitive/affective processes related to response inhibition.

FIGURE 4 | Time-related changes of the within-subject correlation. (A) Statistical maps of the within-subject (across-run) correlation between the SSRT and the brain activity related to response inhibition in FIRST (3rd to 8th) six runs and SECOND (7th to 12th) six runs. The format is similar to that in Figure 2C. (B) Regions of interest analyses of the temporal changes of the within-subject correlation, showing correlation in the whole runs, FIRST runs and SECOND runs. The coordinates were defined based on independent datasets from Chikazoe et al. (2009b).

This study examined both the across- and within-subject correlation analyses using the same data from 46 subjects, with 10 effective runs for each subject. At the whole-brain level, the across-subject negative correlation tended to be greater than the within-subject correlation (Supplementary Figure S3), suggesting that the across-subject variability is greater than the within-subject variability. At the same time, the relative robustness of the correlation analyses depends on the data structure, that is, the number of subjects and runs, and it is possible that the within-subject negative correlation would be more robust if more than 10 effective runs were collected for each subject. However, because the latter runs exhibited a whole-brain tendency toward positive correlation (**Figure 4A**), collecting more than 10 runs may result in a less robust negative correlation. Therefore, it is also possible that the number of runs in the present dataset is reasonable for the within-subject correlation analysis.

The across-subject correlation analysis reveals functional areas where more efficient performers with shorter SSRTs elicit higher brain activity, whereas the within-subject (across-run) correlation analysis reveals functional areas where more efficient performance in a run by the same subject elicits higher brain activity. The within-subject correlation observed in the anterior cingulate cortex (**Figure 3A**) may reflect across-run fluctuation of monitoring processes (Carter et al., 1998; Botvinick et al., 2001; Braver et al., 2001) during performance of the stop-signal task that contributed to response inhibition. The correlation observed in the cerebellum may reflect motor/cognitive control processes (Imamizu et al., 2000; Ito, 2008) that have been observed in previous studies using the across-subject correlation analysis (Ghahremani et al., 2012; Jimura et al., 2014). Regions of interest analyses revealed that the anterior cingulate correlation was dominant in the earlier runs, whereas the cerebellar correlation was relatively constant (**Figure 4B**). These differential results suggest that multiple mechanisms associated with inhibitory processes fluctuate on a run-by-run basis, with the anterior cingulate mechanism contributing only in the earlier runs. The anterior cingulate activity is known to decline more rapidly than the learning of attentional control in the Stroop task, suggesting that the anterior cingulate cortex is involved in aspects other than the implementation of top-down attentional control (Milham et al., 2003), such as monitoring processes (Carter et al., 1998; Botvinick et al., 2001; Braver et al., 2001). It has also been reported that the activity in the anterior insula/inferior frontal operculum network, to which the anterior cingulate cortex belongs, declines more slowly during sequential learning of new tasks than that in other lateral frontal cortex networks (Hampshire et al., 2016). The results may raise the possibility that sequential learning of new tasks requires monitoring processes long after the tasks are learned, in order to inhibit proactive interference from previously acquired tasks.

Interestingly, the positive correlation was observed in the latter runs, primarily in the medial prefrontal cortex (**Figure 4A**), which is known as a part of a cognitive control network (Hu et al., 2016), or as a member area of the default-mode network (Fox and Raichle, 2007). It is unlikely that the brain activity related to cognitive control makes performance worse. Alternatively, subjects might have recruited more brain regions when they performed the task using a less-efficient strategy. Based on the function of the default mode network (Buckner et al., 2008), it is suggested that subjects were not focused on the external environment, which led to worse performance in latter runs.

Brain-behavior correlation changed during 10 runs of performance in the present study. While the across-subject analysis revealed enhanced negative correlation during the second vs. the first half of the runs (Jimura et al., 2014), the within-subject analysis revealed opposing correlations in FIRST and SECOND runs, showing negative and positive correlations in FIRST and SECOND runs, respectively (**Figure 4A**). Although the across-subject correlation has been used to identify robust functional areas, the within-subject correlation analysis may complement the across-subject analysis by shedding light on the cognitive/affective processes that fluctuate in a shorter period and may also contribute to rapid improvement of performance in athletes in the field of sports science (Nakata et al., 2010; Miyashita, 2016).

# AUTHOR CONTRIBUTIONS

TY, AO, TO and SK designed the research and wrote the manuscript. TY, AO, TO, KJ and SK analyzed the data.

#### FUNDING

This work was supported by Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Number 16K16076 to AO and 16K18367 to TO and a grant from Naito Foundation to SK.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00208/full#supplementary-material




**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Yamasaki, Ogawa, Osada, Jimura and Konishi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.