# INTEGRATING TIME & NUMBER: FROM NEURAL BASES TO BEHAVIORAL PROCESSES THROUGH DEVELOPMENT AND DISEASE

EDITED BY: Fuat Balcı, Metehan Çiçek, Karin Kucian and Trevor B. Penney

PUBLISHED IN: Frontiers in Human Neuroscience, Frontiers in Psychology, Frontiers in Behavioral Neuroscience and Frontiers in Computational Neuroscience

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-753-9 DOI 10.3389/978-2-88963-753-9


# INTEGRATING TIME & NUMBER: FROM NEURAL BASES TO BEHAVIORAL PROCESSES THROUGH DEVELOPMENT AND DISEASE

Topic Editors: Fuat Balcı, Koç University, Turkey; Metehan Çiçek, Ankara University, Turkey; Karin Kucian, University Children's Hospital Zurich, Switzerland; Trevor B. Penney, The Chinese University of Hong Kong, China

Citation: Balcı, F., Çiçek, M., Kucian, K., Penney, T. B., eds. (2020). Integrating Time & Number: From Neural Bases to Behavioral Processes Through Development and Disease. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-753-9

# Table of Contents


Karina Hamamouche, Maura Keefe, Kerry E. Jordan and Sara Cordes

*Dilation and Constriction of Subjective Time Based on Observed Walking Speed*

Hakan Karşılar, Yağmur Deniz Kısa and Fuat Balcı


Kenneth R. Light, Brian Cotten, Talia Malekan, Sophie Dewil, Matthew R. Bailey, Charles R. Gallistel and Peter D. Balsam

*Effect of Presentation Format on Judgment of Long-Range Time Intervals*

Camila Silveira Agostino, Yossi Zana, Fuat Balci and Peter M. E. Claessens

# Editorial: Integrating Time & Number: From Neural Bases to Behavioral Processes Through Development and Disease

#### Fuat Balcı<sup>1</sup>, Metehan Çiçek<sup>2,3,4</sup>\*, Karin Kucian<sup>5,6,7</sup> and Trevor B. Penney<sup>8</sup>

<sup>1</sup> Department of Psychology, College of Social Sciences and Humanities, Koç University, Sariyer, Turkey, <sup>2</sup> Department of Physiology, School of Medicine, Ankara University, Ankara, Turkey, <sup>3</sup> Department of Interdisciplinary Neuroscience, Ankara University, Ankara, Turkey, <sup>4</sup> Brain Research Center, Ankara University, Ankara, Turkey, <sup>5</sup> Center for MR Research, University Children's Hospital Zurich, Zurich, Switzerland, <sup>6</sup> Children's Research Center, University Children's Hospital Zurich, Zurich, Switzerland, <sup>7</sup> Neuroscience Center Zurich, Zurich, Switzerland, <sup>8</sup> Department of Psychology, The Chinese University of Hong Kong, Shatin, China

Keywords: interval timing, numerosity, mental magnitudes, psychophysics, dyscalculia, dyslexia, schizophrenia

#### **Editorial on the Research Topic**

#### **Integrating Time & Number: From Neural Bases to Behavioral Processes Through Development and Disease**

Although crucial overlap exists in time and number processing, a thorough explanation of how neurons integrate time and number is lacking. Moreover, researchers have demonstrated the clinical relevance of timing and counting in disorders that directly involve these functions (e.g., dyscalculia, autism spectrum disorder) and in others associated with dopaminergic dysfunction (e.g., Parkinson's Disease, Schizophrenia, ADHD).

The present Research Topic comprises a collection of papers evaluating the neural mechanisms of time and number perception, as well as the diseases which disturb these functions. A wide range of studies, from behavioral, psychophysical, neuroimaging, clinical, to theoretical, and from childhood to adulthood, as well as from typical to atypical processing, provide broad insights into time and number perception in the present Research Topic.

## TIME PERCEPTION STUDIES

Apaydın et al. used event-related fMRI to investigate the interaction of time perception with reward prospects. Participants performed a time perception task in which they judged the velocity of an occluded moving object in "reward" vs. "no-reward" sessions. Findings showed a right-lateralized fronto-parietal activity during timing. Interaction of time and reward showed prefrontal and caudate nucleus activity, which suggests that reward and timing systems of the brain might be integrated in prefrontal-striatal circuitry.

Coull et al. used a temporal bisection task with lateralized arrow direction (to the left/right side) and/or lateralized stimulus (on the left/right side) presentation with children and adults to assess the beginning of mental timeline acquisition. Their findings suggest that while spatial position manipulated in a symbolic way influenced the duration judgments of 8 and 10 year olds, for 5–6 year olds only lateralized stimulus presentation affected timing. In another study, Karşılar et al. tested the effect of observed motion on time perception. They used animations of a walking stick figure with different movement speeds and directions. Their findings suggest that (irrespective of the direction of motion) time dilates while observing faster motion and it contracts while observing slower motion.

Edited and reviewed by: Lutz Jäncke, University of Zurich, Switzerland

> \*Correspondence: Metehan Çiçek mcicek@ankara.edu.tr

#### Specialty section:

This article was submitted to Cognitive Neuroscience, a section of the journal Frontiers in Human Neuroscience

> Received: 16 March 2020 Accepted: 19 March 2020 Published: 09 April 2020

#### Citation:

Balcı F, Çiçek M, Kucian K and Penney TB (2020) Editorial: Integrating Time & Number: From Neural Bases to Behavioral Processes Through Development and Disease. Front. Hum. Neurosci. 14:129. doi: 10.3389/fnhum.2020.00129

Maaß and van Rijn aimed to assess whether clock variability could be reliably measured with an easy 1-s production task, which could then be applied in clinical and/or developmental studies. They suggest that the observed variability adheres to the scalar property and predicts temporal performance observed in a reproduction task. Zeki and Balcı presented a model of time cells with sequentially firing neurons and noisy conductances that exhibit a simple neural wave activity and showed that this simplified model of time cells can account for the prominent properties of the timing behavior including the scalar property.

## STUDIES EVALUATING TIME AND NUMBER PERCEPTION TOGETHER

Hamamouche et al. tested the effect of cognitive load on temporal and numerical processing in an attempt to determine whether the common magnitude system was supported. Participants performed numerical and temporal judgements under the distraction of a verbal working memory task. Results were inconsistent with the common magnitude account. On the other hand, Light et al. trained mice to perform Mechner's counting task in which animals had to press one lever for a required number of times before claiming the reward by pressing a second lever. The results indicate that mice used both counting and timing strategies to complete the task.

Agostino et al. evaluated the estimation of subjective magnitude of long-range time intervals and tested the effect of numerals in duration judgements. The results showed that people map the magnitude of stimuli presented as an abstract number and number + time-unit in a similar way, while time interval magnitudes without the use of numerals (e.g., personal events) are treated differently.

## TIME AND NUMBER PROCESSING DEFICITS IN DISEASE STATES

McCaskey et al. investigated the numerical abilities of typically developing children and those with developmental dyscalculia (DD) by using neuropsychological tests and fMRI scanning during a numerical order paradigm. The results showed that children with DD had persistent deficits in number processing and arithmetical skills, although their performance improved over time, and their brain imaging results revealed an increase in frontal and parietal brain activation over time.

Di Filippo and Zoccolotti evaluated 5 children with dyslexia, 16 with dyscalculia, 7 with a "mixed pattern," and 49 control children in terms of reading and numerical skills. They showed that the deficit of children (with dyscalculia and mixed pattern) on numerical tasks could be described by a single global factor.

Snowden and Buhusi reviewed the neuroimaging literature related to the neural correlates of interval timing problems in schizophrenia. Based on previous work the authors suggest that a disrupted cortico-striatal-thalamo-cortical network may be responsible for timing deficits observed in schizophrenia. Furthermore, Xiong et al. investigated electrophysiological differences in patients with first-episode schizophrenia and chronic schizophrenia. Their findings suggest that duration and frequency mismatch negativity and evoked theta power differences may be used to detect schizophrenia at early stages and are related to poor cognitive functioning in schizophrenic patients.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling Editor declared a shared affiliation, though no other collaboration, with one of the authors KK.

Copyright © 2020 Balcı, Çiçek, Kucian and Penney. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Longitudinal Brain Development of Numerical Skills in Typically Developing Children and Children with Developmental Dyscalculia

Ursina McCaskey<sup>1,2</sup>\*, Michael von Aster<sup>1,2,3,4</sup>, Urs Maurer<sup>4,5,6</sup>, Ernst Martin<sup>1,2</sup>, Ruth O'Gorman Tuura<sup>1,2,7</sup> and Karin Kucian<sup>1,2,4</sup>

*<sup>1</sup> Center for MR-Research, University Children's Hospital Zurich, Zurich, Switzerland, <sup>2</sup> Children's Research Center, University Children's Hospital Zurich, Zurich, Switzerland, <sup>3</sup> Clinic for Child and Adolescent Psychiatry, German Red Cross Hospitals, Berlin, Germany, <sup>4</sup> Neuroscience Center Zurich, University of Zurich and Swiss Federal Institute of Technology Zurich, Zurich, Switzerland, <sup>5</sup> Department of Psychology, University of Zurich, Zurich, Switzerland, <sup>6</sup> Department of Psychology, Chinese University of Hong Kong, Hong Kong, Hong Kong, <sup>7</sup> Zurich Center for Integrative Human Physiology, University of Zurich, Zurich, Switzerland*

Developmental dyscalculia (DD) is a learning disability affecting the acquisition of numerical-arithmetical skills. Studies report persistent deficits in number processing and aberrant functional activation of the fronto-parietal numerical network in DD. However, the neural development of numerical abilities has been scarcely investigated. The present paper provides a first attempt to investigate behavioral and neural trajectories of numerical abilities longitudinally in typically developing (TD) and DD children. During a study period of 4 years, 28 children (8–11 years) were evaluated twice by means of neuropsychological tests and a numerical order fMRI paradigm. Over time, TD children improved in numerical abilities and showed a consistent and well-developed fronto-parietal network. In contrast, DD children revealed persistent deficits in number processing and arithmetic. Brain imaging results of the DD group showed an age-related activation increase in parietal regions (intraparietal sulcus), pointing to a delayed development of number processing areas. Besides, an activation increase in frontal areas was observed over time, indicating the use of compensatory mechanisms. In conclusion, results suggest a continuation in neural development of number representation in DD, whereas the neural network for simple ordinal number estimation seems to be stable or show only subtle changes in TD children over time.

#### Keywords: brain development, child, developmental dyscalculia, longitudinal, number processing

# INTRODUCTION

How does the "numerical brain" develop? Numbers are omnipresent in our lives and our innate ability to detect small numerosities enables us to develop complex mathematical skills at a young age (Starkey et al., 1990; Xu and Spelke, 2000; Izard et al., 2009). Not surprisingly, individuals with Developmental Dyscalculia (DD) struggle in their everyday life. DD is a learning disability affecting the acquisition of numerical-arithmetical skills in children with normal intelligence and age-appropriate school education (WHO, 2010). Many studies have shown that children with DD display various deficits in number processing skills such as magnitude processing or spatial number representation (Landerl et al., 2004; Rousselle and Noël, 2007; Mussolin et al., 2010; Landerl, 2013). Those skills are assumed to predict later arithmetical achievement (Halberda et al., 2008; De Smedt et al., 2009; Geary et al., 2012; Träff, 2013) and are therefore essential for the development of numeracy. DD has a high prevalence (3–7%) (Gross-Tsur et al., 1996; Wyschkon et al., 2009; Reigosa-Crespo et al., 2012) and a persisting character (Shalev et al., 1998, 2005). The fact that difficulties in numeracy result in reduced employment opportunities and high public costs underscores the importance of understanding more about numerical brain development (Parsons and Bynner, 2005; Gross, 2009).

#### Edited by:

*Xiaolin Zhou, Peking University, China*

Reviewed by: *Xinlin Zhou, Beijing Normal University, China; QI Chen, South China Normal University, China*

\*Correspondence: *Ursina McCaskey ursina.mccaskey@kispi.uzh.ch*

Received: *07 September 2017* Accepted: *11 December 2017* Published: *04 January 2018*

#### Citation:

*McCaskey U, von Aster M, Maurer U, Martin E, O'Gorman Tuura R and Kucian K (2018) Longitudinal Brain Development of Numerical Skills in Typically Developing Children and Children with Developmental Dyscalculia. Front. Hum. Neurosci. 11:629. doi: 10.3389/fnhum.2017.00629*

Research performed over the last decades demonstrates that from the first day after birth, infants are capable of discriminating quantities (Xu et al., 2005; Izard et al., 2009) and show specialized neuronal correlates for the processing of numerosities early in development (Hyde et al., 2010; Hyde and Spelke, 2012). Over development, a spatial representation of quantity and numbers, also known as mental number line (Berch et al., 1999; Dehaene, 2003), emerges. With the acquisition of number words and the symbolic number system, the formation of such an internal representation further refines (Siegler and Booth, 2004; von Aster and Shalev, 2007; Ebersbach et al., 2008; Halberda and Feigenson, 2008). Moreover, numerical magnitude processing skills (linearity of the mental number line, performance in quantity comparison tasks) correlate with arithmetical knowledge and predict future mathematical achievement (Booth and Siegler, 2008; De Smedt et al., 2013). A recent study further showed that number line estimation is a good predictor of arithmetic ability at an early age, whilst ordinal processing of numerical symbols was revealed to be a strong predictor of older children's arithmetical skills (Lyons et al., 2014; Zhu et al., 2017). Besides various other deficits in numerical-arithmetical skills, several studies with DD children reported that they are less accurate in placing numbers on a number line (Geary et al., 2008; Landerl, 2013). Piazza et al. (2010) showed that DD children performed at a similar level as 5-years-younger typically developing (TD) children in a task measuring number representation. Furthermore, results of a review reveal that weak performance of magnitude processing skills correlates with low mathematical achievement and DD (De Smedt et al., 2013). These results are supported by neuroimaging findings demonstrating that children with DD show aberrant functional activation in number tasks compared to TD peers. 
Significantly reduced activation is mainly found in domain-specific regions of the parietal lobe, known to be important for magnitude and ordinal processing and supposed to incorporate the mental number line (Kucian et al., 2006, 2011a; Price et al., 2007; Mussolin et al., 2010; Ashkenazi et al., 2012). For instance, children with DD showed reduced activation in the bilateral intraparietal sulcus (IPS) and superior parietal lobe when solving a number processing task (Kucian et al., 2011a). Moreover, when confronted with arithmetical problems, DD children failed to show a task-related modulation in parietal areas. However, findings are not consistent and some studies describe increased activation in DD in these areas (Davis et al., 2009; Kaufmann et al., 2009b, 2011). Rosenberg-Lee et al. (2015), for instance, reported that children with DD show hyper-activation in parietal cortices when solving subtraction problems. Moreover, activation differences are also found in domain-general regions mainly in the frontal brain, attributed to working memory, attention and planning, but also in occipito-temporal areas of the brain (Kucian et al., 2006, 2011a; Price et al., 2007; Davis et al., 2009; Kaufmann et al., 2009b; Rosenberg-Lee et al., 2015). Recent studies further revealed that children with DD show functional hyper-connectivity of the IPS with the bilateral fronto-parietal network (Jolles et al., 2016; Michels et al., 2017). To summarize, these findings describing an aberrant brain activation pattern possibly reflect the typical deficiency in number processing and the greater cognitive resources needed to solve numerical tasks.

A number of cross-sectional studies have been conducted to investigate age-dependent neural differences of numerical functions in TD children. Findings suggest that children activate similar regions to adults when solving numerical tasks (Peters and De Smedt, in press). However, children recruit parietal regions to a lesser extent, in particular the IPS, and show increased frontal activation compared to adults (Ansari et al., 2005; Ansari and Dhital, 2006; Cantlon et al., 2006; Kucian et al., 2008; Holloway and Ansari, 2010). According to these findings, researchers hypothesized that there is a shift from an initially controlled and effortful (frontal activation) to a subsequently more automatic processing of numerical magnitude (parietal activation) (Ansari et al., 2005; Rivera et al., 2005; Kucian et al., 2008; Holloway and Ansari, 2010). Conversely, Rosenberg-Lee et al. (2011) reported an increase in parietal, but also prefrontal and visuo-temporal regions over 1 year in children solving arithmetic problems, suggesting a nonlinear trajectory of development.

Several behavioral long-term studies investigated the development of typical and atypical number processing (such as dot enumeration, counting, and number comparison), showing that its efficiency is a good predictor for arithmetical achievement (Halberda et al., 2008; Desoete et al., 2012; Geary et al., 2012; Passolunghi and Lanfranchi, 2012; Landerl, 2013; Reigosa-Crespo et al., 2013; Träff, 2013). Landerl (2013) followed children's numerical abilities over 2 years and found that even though dyscalculic children showed improvements, their numerical processing remained persistently deficient. This is also in line with the results of a systematic review of longitudinal studies of mathematical difficulties indicating that students with math difficulty improve in mathematical measures over time but do not catch up to their peers (Nelson and Powell, 2017). Further studies revealed that those deficits are already detectable in kindergarten and continue to persist into adolescence (Shalev et al., 1998, 2005; Stock et al., 2009; Geary et al., 2013; Mazzocco et al., 2013).

To date, the current body of research has identified a substantial deficit in numerical processing in children with DD. Studies with TD subjects indicate that there is a functional specialization in the areas devoted to numerical magnitude representation and involved in the development of the mental number line. On the neural level, DD is associated with aberrant activation patterns of the number-specific parietal regions and domain-general areas. Nevertheless, little is known about the neural development of numerical abilities.

**Abbreviations:** ADD/ADHD, Attention Deficit and Hyperactivity Disorder; DD, Developmental Dyscalculia; fMRI, functional Magnetic Resonance Imaging; IQ, Intelligence Quotient; MNI, Montreal Neurological Institute; RT, Reaction Time; TD, Typically Developing.

Hence, the goal of the present study was to investigate the typical and atypical neural development of numerical abilities by means of longitudinal functional Magnetic Resonance Imaging (fMRI) and behavioral data. With fMRI we investigated the ordinal aspect of number processing, as differences between DD and TD children have been reported in parietal and domain general regions during a numerical order task (Kucian et al., 2011a). In addition, ordinal number processing has been shown to be an important predictor for arithmetic skills as development progresses (Lyons et al., 2014). Together with behavioral measures on the spatial representation of quantity and numbers (number line task) we aimed to provide insight into the development of numerical abilities.

Evidence from studies with TD children and adults revealed a shift from frontal to parietal activation over time. Based on this literature, we expect to find an increase in activation in the number-specific parietal regions and a decrease in the domain-general regions reflecting the growing proficiency in number processing in TD children (Ansari and Dhital, 2006; Cantlon et al., 2006; Holloway and Ansari, 2010). To our knowledge there are no studies about the neuro-functional development of children with DD, making predictions about the atypical development difficult. However, studies show that children and adults with DD show aberrant activation in the number-specific parietal areas (Molko et al., 2003; Kucian et al., 2006, 2011a; Kaufmann et al., 2009a). Furthermore, longitudinal behavioral findings show that children with DD show persistent deficits in numerical processing (Geary et al., 2013; Landerl, 2013; Nelson and Powell, 2017). In line with these findings, we hypothesize a persistent deficiency in numerical processing and consistently lower parietal activity in children with DD compared to TD children (Kucian et al., 2006, 2011a; Price et al., 2007). As we predict persistent aberrant parietal activity, but at the same time improvements in number processing (Landerl, 2013; Nelson and Powell, 2017), we further expect to find higher frontal activation over time in DD children, reflecting changing demands on cognitive resources as a result of a delayed development.

# MATERIALS AND METHODS

#### Study Design and Participants

In this longitudinal study, a group of children with DD and a group of TD children were evaluated by neuropsychological tests and fMRI (baseline). After 4.2 (SD = 0.46) years, children returned for a second neuropsychological and fMRI assessment (follow-up) (**Figure 1**).

*Figure 1 (caption abbreviations, partial): SLRT-II, Salzburg Reading and Orthography Test.*

In total, 35 children (23 DD, 12 TD) between 8 and 11 years were recruited into this study, of which 25 took part in a previous study (Kucian et al., 2011a). (Note that while the previous study acquired data both before and after number line training, only the pre-training data from this previous study were included as a baseline measurement for the present study. Therefore, all children included in the present study participated for the first time in a study at our MRI center and performed the same behavioral tests and fMRI paradigm.) Inclusion criteria for all children were an IQ > 85 and no history of a neurologic or psychiatric disorder. Additionally, DD children had to perform below the 10th percentile in the total score or three subtests of a standardized numerical test battery (ZAREKI-R) at baseline. TD children required age-appropriate mathematical performance at baseline and follow-up, defined as performing above the 10th percentile in the ZAREKI-R (range of the TD children: PR 46–100) and above the cut-off of 67 points in the BASIS-MATH 4-8 (range of the TD children: 68–83) (see also **Figure 1** and Supplementary Material). According to these criteria, six DD children were excluded because they exceeded the cut-off in the numerical test and one TD child because of medication. Therefore, the behavioral data analyses are based on 17 DD and 11 TD children.
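The inclusion rules above amount to a small decision procedure. The following Python sketch is only an illustration under stated assumptions, not the authors' screening procedure: the names `Child`, `classify_baseline` and `td_confirmed_at_followup` are hypothetical, "below" and "above" are read as strict inequalities, and scores that fall exactly on a cut-off are left unclassified because the text does not say how they were handled.

```python
from dataclasses import dataclass

IQ_MIN = 85                # inclusion required IQ > 85
PERCENTILE_CUTOFF = 10     # ZAREKI-R percentile-rank cut-off
MIN_DEFICIT_SUBTESTS = 3   # "three subtests" below the cut-off
BASIS_MATH_CUTOFF = 67     # BASIS-MATH 4-8 cut-off (out of 83 points)


@dataclass
class Child:
    """Hypothetical record of the baseline measures named in the text."""
    iq: float
    zareki_total_pr: float         # ZAREKI-R total score as percentile rank
    n_subtests_below_cutoff: int   # subtests scoring below the 10th percentile
    neuro_psych_history: bool = False


def classify_baseline(child: Child) -> str:
    """Return 'excluded', 'DD', 'TD-candidate' or 'unclassified'."""
    # General inclusion criteria for all children.
    if child.iq <= IQ_MIN or child.neuro_psych_history:
        return "excluded"
    # DD: total score OR three subtests below the 10th percentile.
    if (child.zareki_total_pr < PERCENTILE_CUTOFF
            or child.n_subtests_below_cutoff >= MIN_DEFICIT_SUBTESTS):
        return "DD"
    # TD: above the 10th percentile (strict reading, an assumption).
    if child.zareki_total_pr > PERCENTILE_CUTOFF:
        return "TD-candidate"
    return "unclassified"


def td_confirmed_at_followup(basis_math_score: float) -> bool:
    """TD children also had to score above the BASIS-MATH cut-off at follow-up."""
    return basis_math_score > BASIS_MATH_CUTOFF
```

Keeping the cut-offs as named constants makes it easy to see which thresholds come from the text and which boundary decisions are assumptions of the sketch.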

For the fMRI analysis, three data sets at baseline and seven at follow-up were excluded because of task performance <50% (1 data set), scanner problems (3 data sets) or poor image quality caused by dental braces (6 data sets). Hence, subsequent statistical group comparisons are based on 14 DD and 11 TD fMRI data sets at baseline, and 13 DD and 8 TD fMRI data sets at follow-up.

Written informed consent was obtained from all parents and from participants older than 16 years. The study was approved by the Ethics committee of Zurich, Switzerland based on guidelines from the World Medical Association's Declaration of Helsinki (WMA, 2002).

### Behavioral Testing

All children completed age-appropriate neuropsychological tests at baseline and follow-up (for an overview see **Figure 1**).

#### Handedness

Handedness (3 left handed, 8 ambidextrous, 17 right handed) was determined by the Edinburgh Handedness Inventory (Oldfield, 1971).

#### Diagnosis of DD and General Numerical Abilities

At baseline, numerical abilities were assessed using the revised version of the Neuropsychological Test Battery for Number Processing and Calculation in Children (ZAREKI-R) (von Aster et al., 2006). This test battery consists of 12 subtests assessing basic numerical skills as well as calculation (see Supplementary Material for detailed information about the subtests). Based on this test battery children with DD were identified from scores below the 10th percentile in three subtests or in the total test score (test scores are reported in percentile ranks). At the follow-up assessment, the test for Basic Diagnosis in Mathematics Education for Grades 4–8 (BASIS-MATH 4–8) (Moser Opitz et al., 2010) was used instead, because it is the only German test in existence which can identify numerical deficiencies up to the eighth grade. The BASIS-MATH test battery is composed of three difficulty levels measuring several arithmetical abilities (see also Supplementary Material). The test battery assumes that mastery of basic mathematical concepts is not reached if the performance falls under a threshold value of 67 points (out of total 83 points, reported test scores are raw values).

In order to assess children's arithmetic performance at the peer level, the Arithmetic subtest of the Wechsler Intelligence Scale for Children (WISC-III) (Tewes et al., 1999) was performed at baseline. In this subtest children had to solve story problems of increasing difficulty within a set time limit (reported test values are IQ scores). At the follow-up measurement, the Quantity Comparison subtest of the Cognitive Abilities Test (KFT 4-12+R) (Heller and Perleth, 2000) was performed. In the Quantity Comparison subtest subjects had 10 min to solve as many quantity comparisons of increasing difficulty as possible (reported test values are T scores).

#### Number Line Task

The spatial representation of numbers was measured by means of a paper-and-pencil number line task adapted from Kucian et al. (2011a). Children had to estimate the position of 20 Arabic digits on a left to right oriented number line (length 16 cm) with the labeled end points 0 and 100. A single number was presented verbally as well as visually in the form of an Arabic digit on a card. Each number had to be marked on consecutive number lines to avoid the possibility of comparisons between items. Two items per decade were chosen in order to evaluate the entire spatial representation between 0 and 100.

At the follow-up, a computerized and age-adapted version of the number line task was used. Each of the 20 numbers was presented visually on the screen and its position was indicated by mouse-click. The number line was 21.3 cm in length (806 pixels at a screen resolution of 96 dpi) and had labeled end points. For the number range 0–100, two numbers per decade were again selected. Additionally, participants had to solve 20 items in the number range between 0 and 1000. To obtain the items in the range 0–1000, the numbers of the number line test 0–100 were multiplied by 10 and a random digit between 0 and 9 was added in the unit position.

In both test versions accuracy was measured by calculating the percentage distance from the marked to the correct position of the given number (reported test values are raw values).
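The accuracy measure and the construction of the 0–1000 items can be sketched as follows. This is a minimal illustration only: it assumes that the percentage distance is the absolute deviation between the marked and the correct position relative to the line length (the exact formula is not spelled out in the text), and the function names and example items are hypothetical.

```python
import random


def percent_distance(marked_pos, target, line_length, range_max=100):
    """Absolute deviation of the mark from the correct position, in % of line length.

    marked_pos and line_length share one unit (cm or pixels);
    target is the number that had to be placed on the line.
    """
    correct_pos = target / range_max * line_length
    return abs(marked_pos - correct_pos) / line_length * 100


def items_0_1000(items_0_100, rng=random):
    """Derive 0-1000 items: multiply each 0-100 item by 10, add a random unit digit."""
    return [n * 10 + rng.randint(0, 9) for n in items_0_100]


# Paper-and-pencil version: 16 cm line, target 50, mark placed at 9.6 cm
print(percent_distance(9.6, 50, 16.0))
# Computerized version: 806-pixel line, range 0-1000 (hypothetical response)
print(percent_distance(400, 437, 806, range_max=1000))
# Hypothetical 0-100 items turned into 0-1000 items
print(items_0_1000([7, 15, 36], random.Random(0)))
```

Expressing the deviation relative to the line length makes the paper (16 cm) and computerized (806 pixels) versions comparable despite their different physical sizes.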

#### Basic Arithmetic Operations

Children solved 40 basic arithmetic problems (20 addition and 20 subtraction) (Kucian et al., 2011a). Each problem was presented verbally as well as visually on a card. The child had to provide the solution verbally and the examiner noted it on the evaluation sheet. There was no time limit for this test. The items ranged from 1 to 100 with single digit as well as double digit problems (e.g., 7+15, 36+42). The items were balanced for frequency of digits and bridging ten. The number of correctly solved items was quantified (reported test values are raw scores, maximum value 20).

At the follow-up, a computerized and age-adapted version of this task was used. Each of the 20 addition and 20 subtraction problems was presented visually on the screen and solutions were typed on a keyboard. There was no time limit for this test. To prevent ceiling effects, the test was expanded to numbers up to 1000. Items consisted of one-, two- and three-digit numbers (e.g., 811+5, 235+324) and were balanced for frequency of digits and bridging ten/hundred. RT was measured and the number of correctly solved items was quantified (reported test values are raw scores, maximum value 20).

#### Intelligence Quotient

Intelligence was measured with the third edition of the WISC at baseline and the fourth edition at follow-up (Tewes et al., 1999; Petermann and Petermann, 2007) (WISC-III: Similarities, Block Design, Vocabulary, Picture Arrangement; WISC-IV: Similarities, Block Design, Matrix Reasoning). **Table 1** shows the estimated general IQ (reported test values are IQ scores).

#### Working Memory

Visuo-spatial and verbal working memory were assessed in order to control for memory effects. At baseline and follow-up, working memory was measured with the Block-Suppression-Test (Beblo et al., 2004). The task required subjects to reproduce every second block of a previously presented sequence on a board with nine cubes. The sequences had a length of 3–9 cubes. Three items per sequence were presented. The longest sequence that was reproduced correctly twice was quantified (reported test values are raw scores, maximum value 9).

At the follow-up the subtest Digit Span of the WISC-IV (Petermann and Petermann, 2007) was additionally performed. In this task subjects had to repeat an auditorily presented sequence of numerals backwards. The sequences had a length of 2 to 8 numerals. The longest sequence which was reproduced correctly was quantified (reported test values are raw scores, maximum value 8).

*ZAREKI-R, Neuropsychological Test Battery for Number Processing and Calculation in Children [PR], BASIS-MATH 4–8, Basic Diagnostic in Mathematics for Grades 4–8 [raw score]; WISC, Wechsler Intelligence Scale for Children [IQ score]; KFT 4-12*+*R, Cognitive Abilities Test [T score]; BTT, Block-Tapping-Test [raw score]; TAP, Testbattery for Attentional Performance [PR]; SLRT-II, Salzburg Reading and Orthography Test [PR]. <sup>a</sup>t-Test. <sup>b</sup>Fisher's Exact Test. <sup>c</sup>Kolmogorov-Smirnov-Z Test.* \**p* < *0.05,* \*\**p* < *0.01,* \*\*\**p* < *0.001.*

#### Attention

Levels of attention and inhibition were measured at follow-up by means of the subtests Alertness and Go-Nogo of the computerized Testbattery for Attentional Performance (TAP) (Zimmermann and Fimm, 1993). In the Alertness subtest, subjects had to react as quickly as possible when the target stimulus "X" appeared (intrinsic alertness). Half of the trials were preceded by an acoustic cue stimulus (phasic alertness). The test has four runs and a total of 80 target items. For each subject the percentile rank of the median RT was quantified (reported test values are percentile ranks). In the Go-Nogo subtest, subjects had to react as quickly as possible to a target stimulus ("X," go condition), but inhibit reactions to a second stimulus ("+," nogo condition). The test has a total of 40 items (20 go and 20 nogo items). For each subject the percentile rank of the median RT was quantified (reported test values are percentile ranks).

#### Reading

The 1-Min-Reading-Task from the Salzburg Reading and Orthography Test (SLRT-II) (Moll and Landerl, 2010), which assesses word and pseudoword reading fluency, was used to estimate reading performance at follow-up. Two sheets of paper with either 156 words or 156 pseudowords of increasing length and difficulty were presented. Subjects had 1 min per sheet to read as many words as possible. The number of correctly read items was quantified (reported test values are percentile ranks). Because test norms are lacking for grades 7 and 8, we interpolated the norms from the test manual (grade 6) and from Kronschnabel et al. (2013) (grade 9).

### fMRI Task

The fMRI task, adopted from Kucian et al. (2011a), was identical between the baseline and follow-up measurements. In the experimental condition, subjects had to make ordinal judgements (numerical order task: "Are the numbers in an ascending/descending order?"). The control condition was a number identification task ("Is the number 2 present?") (**Figure 2**). The entire paradigm lasted 10.5 min and consisted of four blocks of the numerical order task alternating with four blocks of the number identification task. Blocks were counterbalanced between subjects. At the beginning of each block an instruction was shown for 2 s, followed by 10 trials of one of the two conditions and a rest period with a fixation cross for 20 s, resulting in a total block length of 59.5 s. Every stimulus was presented for 2 s, followed by a blank screen with an interstimulus-interval jittered between 3 and 5 s.

A stimulus consisted of three Arabic digits between "2" and "9" (horizontally aligned) shown simultaneously via a video goggles system (VisuaStimDigital, Resonance Technology Inc., USA). Forty numerical order items were presented, one fourth with ascending (correct), one fourth with descending (correct) and half of them with no specific order (incorrect). The order of the numerals in the ascending condition (e.g., 2 5 7) was reversed to obtain the descending items (e.g., 7 5 2) and mixed up to obtain the items with no specific order (e.g., 5 2 7). In the control condition, 40 number identification items were presented (20 correct and 20 incorrect items). The paradigm was balanced for numerical distance [max(n)-min(n): 5, 6, or 7] between the correct and incorrect as well as ascending and descending items, respectively. Children responded by a button press of the dominant hand (index finger for "yes," middle finger for "no"). The paradigm was programmed in E-Prime (Version 2, Psychology Software Tools Inc., USA) and answers were recorded by an MRI compatible response box (Lumina Response Pad, Cedrus Corporation, USA). Trials with reaction times (RT) shorter than 300 ms and misses were not included in the analyses of the paradigm.
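The construction of the numerical order items can be illustrated with a short sketch. It assumes that the three digits of a stimulus are distinct and that any permutation that is neither ascending nor descending counts as "no specific order"; neither assumption is stated explicitly in the text, and all function names are hypothetical.

```python
import itertools
import random

DIGITS = range(2, 10)    # Arabic digits "2" to "9"
DISTANCES = {5, 6, 7}    # allowed numerical distance max(n) - min(n)


def ascending_triplets():
    """All strictly ascending digit triplets with an allowed numerical distance."""
    return [t for t in itertools.combinations(DIGITS, 3)
            if t[2] - t[0] in DISTANCES]


def descending(triplet):
    """Reverse an ascending triplet, e.g. (2, 5, 7) -> (7, 5, 2)."""
    return tuple(reversed(triplet))


def unordered(triplet, rng=random):
    """Mix up a triplet so that it is neither ascending nor descending."""
    up = tuple(sorted(triplet))
    down = tuple(reversed(up))
    candidates = [p for p in itertools.permutations(triplet)
                  if p not in (up, down)]
    return rng.choice(candidates)


pool = ascending_triplets()
example = (2, 5, 7)
print(len(pool), example in pool)
print(descending(example), unordered(example, random.Random(0)))
```

Under these assumptions the pool of admissible ascending triplets is small, so balancing numerical distance across item types amounts to sampling evenly from the three distance classes.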

#### Image Acquisition

MRI data were acquired on a 3T General Electric Signa Scanner (GE Medical Systems, USA) using an 8-channel head coil. Whole-brain functional images were acquired interleaved with a gradient echo EPI sequence [36 slices, slice thickness (ST) = 3.4 mm, no interslice skip, matrix size (MS) = 64 × 64, field of view (FOV) = 220 × 220 mm, in-plane resolution = 3.4 × 3.4 mm, flip angle (FA) = 45°, echo time (TE) = 31 ms, repetition time (TR) = 2100 ms]. Additionally, a T1-weighted structural image was obtained with a fast spoiled gradient echo sequence (3D FSPGR, ST = 1 mm, no interslice skip, MS = 256 × 192, FOV = 240 × 192 mm, FA = 20°, TE = 2.912 ms, TR = 9.972 ms).

(\*) shown for 3–5 s. Reprinted from Kucian et al. (2011a), copyright (2011) with permission from Elsevier.

Participants were carefully instructed and supplied with hearing protection before entering the scanner. To minimize head motion, the head was stabilized with padding.

### Data Analysis

#### Behavioral Data

Behavioral data were statistically analyzed with SPSS (Version 20). To assess group differences, parametric t-tests for independent samples or, if data deviated from normal distribution, nonparametric Kolmogorov-Smirnov-Z tests were performed. In cases where the assumption of homogeneity of variance was violated, we adjusted the degrees of freedom using the Welch-Satterthwaite method. A mixed-model ANOVA with time (baseline/follow-up) as within-subject factor and group (DD/TD) as between-subject factor was conducted to examine developmental effects. Effect sizes are reported as Cohen's d for t-tests and partial η<sup>2</sup> for the mixed-model ANOVA. As suggested by Cohen (1988), effect sizes are interpreted as small (d = 0.2, η<sup>2</sup> = 0.01), medium (d = 0.5, η<sup>2</sup> = 0.06) or large (d = 0.8, η<sup>2</sup> = 0.14).
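The Welch-Satterthwaite adjustment and Cohen's d can be illustrated with a small standard-library sketch. It is not the SPSS procedure used in the study; the data are invented for illustration, the pooled-standard-deviation form of d is one common definition, and the "negligible" label below d = 0.2 is an added convenience rather than one of Cohen's categories.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def cohens_d(x, y):
    """Cohen's d based on the pooled standard deviation of both groups."""
    nx, ny = len(x), len(y)
    pooled = sqrt(((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2))
    return (mean(x) - mean(y)) / pooled


def label(d):
    """Cohen's (1988) benchmarks: 0.2 small, 0.5 medium, 0.8 large."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"  # added label, not one of Cohen's categories


# Invented scores for two small groups with unequal variances
dd_scores = [10, 12, 9, 14, 11]
td_scores = [16, 17, 15, 18, 16, 17]
t, df = welch_t(dd_scores, td_scores)
d = cohens_d(dd_scores, td_scores)
print(round(t, 2), round(df, 2), round(d, 2), label(d))
```

The adjusted degrees of freedom are generally non-integer, which is why values such as t(10.86) appear in the Results.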

#### fMRI Data

#### **fMRI motion**

For each subject the motion fingerprint according to Wilke (2012) was calculated. Total displacement, a vector combining the measures of translation (x, y, and z) and rotation (pitch, roll, and yaw), was used to check whether motion differed between the baseline and the follow-up measurement and between the TD and DD groups, respectively. In more detail, the motion fingerprint provides a total displacement value and a scan-to-scan displacement value for each volume. For each subject the mean total displacement (td) and mean scan-to-scan displacement (sts) over the time series were calculated (values are reported in mm).

Groups did not differ at baseline in td (DD: 0.34–2.10, Median = 0.66, TD: 0.43–2.34, Median = 0.98, Kolmogorov-Smirnov-Z = 1.02, p = 0.17, two-sided) or sts (DD: 0.06–0.41, Median = 0.14, TD: 0.07–0.73, Median = 0.13, Kolmogorov-Smirnov-Z = 0.48, p = 0.83, two-sided). At follow-up, we likewise did not find any differences between the groups for td (DD: 0.44–1.13, Median = 0.70, TD: 0.31–1.28, Median = 0.90, Kolmogorov-Smirnov-Z = 0.81, p = 0.40, two-sided) or sts (DD: 0.05–0.19, Median = 0.06, TD: 0.05–0.10, Median = 0.06, Kolmogorov-Smirnov-Z = 0.26, p = 0.95, two-sided).

Between the baseline (0.34–2.34, Median = 0.91) and follow-up (0.31–1.28, Median = 0.75) no difference was found for td (Wilcoxon Signed Ranks Test, z = −1.39, p = 0.17, two-sided). However, the sts displacement was significantly higher for the baseline (0.06–0.73, Median = 0.13) compared to the follow-up measurement (0.05–0.19, Median = 0.06, Wilcoxon Signed Ranks Test, z = −3.53, p < 0.001, two-sided). Therefore, it is unlikely that motion affects the results of the group comparison, but it might impact the statistical power of the developmental comparison of the present study.

#### **fMRI preprocessing**

The data were analyzed by means of Statistical Parametric Mapping (SPM8, Wellcome Trust Centre for NeuroImaging, UK) running under Matlab (Release 2012b, The MathWorks Inc., USA).

Three dummy scans, acquired to stabilize magnetization at the beginning of the scan, were excluded from the analysis. Then the subjects' functional scans were realigned with rigid body transformations using the mean image as a reference scan. Six motion parameters (translation in x, y, and z direction as well as rotation in pitch, roll and yaw) were stored and included later in the analysis to control for motion. The mean functional image was then coregistered to the subjects' T1-weighted anatomical scan. In a next step, the individual anatomical scan was segmented into gray and white matter according to tissue probability maps of a pediatric atlas (NIH Paediatric Database) (Fonov et al., 2009, 2011). The parameters from the coregistration and segmentation were applied to the functional scans to normalize images into MNI (Montreal Neurological Institute) space. Finally, the functional images were smoothed with a Gaussian kernel of 6 mm FWHM (full width half maximum).

#### **fMRI statistics**

The first-level analysis was performed using a mass-univariate approach based on the GLM. The time series from each subject were modeled with an event-related design for the experimental and control condition using a canonical HRF (hemodynamic response function). The subjects' motion parameters were entered as additional regressors. Slow signal drifts and serial correlations were accounted for by using a high-pass filter of 180 s and a first-order autoregressive model during maximum-likelihood estimation of the GLM parameters.

At the group level, a full factorial analysis with the factors group and time as well as IQ as a covariate was conducted for the contrast experimental-control condition. For the factor time (repeated measurement), within-subjects correlations were accounted for by estimating the covariance and accordingly adjusting the statistics and degrees of freedom during inference.

Statistical results are shown at p < 0.001, corrected for multiple comparisons using a cluster-extent threshold of k ≥ 19 voxels (513 mm<sup>3</sup>) or at p < 0.005 and k ≥ 22 (594 mm<sup>3</sup>). According to Slotnick (2008), the spatial autocorrelation of the data was estimated. Then a Monte Carlo simulation was run with 10,000 iterations, using a type I error voxel activation probability of 0.001 and an estimated FWHM as a Gaussian smoothing kernel, in order to derive the cluster-extent threshold yielding the desired correction for multiple comparisons at a p < 0.05 level (Slotnick, 2004).
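The relation between the cluster-extent thresholds in voxels and in mm<sup>3</sup> can be checked with a one-line calculation. The voxel volume of 27 mm<sup>3</sup> (3 × 3 × 3 mm after normalization) is inferred here from the reported figures (513/19 = 594/22 = 27) and is not stated explicitly in the text; the function name is hypothetical.

```python
def cluster_volume_mm3(k_voxels, voxel_size_mm=(3.0, 3.0, 3.0)):
    """Volume of a cluster of k voxels; the 3 mm isotropic voxel size is an inference."""
    x, y, z = voxel_size_mm
    return k_voxels * x * y * z


for k in (19, 22):
    print(k, cluster_volume_mm3(k))
```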

Anatomical localization of the fMRI results was attained through the SPM Anatomy Toolbox (Eickhoff et al., 2005, 2007) and is reported in MNI coordinates.

## RESULTS

#### Behavioral Data

Groups did not differ in terms of age, gender and handedness (**Table 1**).

#### Diagnosis of DD and General Numerical Abilities

Numerical abilities differed significantly between DD and TD children at baseline [ZAREKI-R t(10.86) = −12.12, p < 0.001, d = −4.08] and follow-up [BASIS-MATH 4–8 t(26) = −7.80, p < 0.001, d = −3.06; **Table 1**]. Every child in the DD group scored under the threshold value of 67 points in the BASIS-MATH and therefore still fulfilled the diagnostic criteria for DD at the follow-up. Moreover, the BASIS-MATH data revealed that the DD group differed significantly in all difficulty levels of the test (all p < 0.001), showing a substantial deficit in the very basic arithmetical skills at a mean age of 14 years.

Not surprisingly, at both time points DD children also performed significantly worse than the TD group in the tests measuring numerical skills at a peer level [Arithmetic subtest of the WISC-III at baseline: t(15.73) = −3.34, p = 0.004, d = −1.68; Quantity Comparison subtest of the KFT 4–12+R at follow-up: t(22) = −6.90, p < 0.001, d = −2.94; **Table 1**].
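For the equal-variance t-tests, the reported effect sizes are consistent with the common approximation d ≈ 2t/√df, which strictly holds only for groups of equal size. Whether the effect sizes were actually computed this way is an assumption; the sketch below merely reproduces two of the reported values, and the function name is hypothetical.

```python
from math import sqrt


def d_from_t(t, df):
    """Approximate Cohen's d from an independent-samples t value (equal group sizes)."""
    return 2 * t / sqrt(df)


# BASIS-MATH 4-8 at follow-up: t(26) = -7.80, reported d = -3.06
print(round(d_from_t(-7.80, 26), 2))
# KFT Quantity Comparison at follow-up: t(22) = -6.90, reported d = -2.94
print(round(d_from_t(-6.90, 22), 2))
```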

#### Number Line Task

The number line task 0–100 differed slightly between baseline and follow-up assessment, which is why the following results must be interpreted carefully (see also Materials and Methods section). A mixed-design ANOVA with time as within-subject factor and group as between-subject factor showed a significant effect of group [F(1, 17) = 13.02, p = 0.002, η<sup>2</sup> = 0.434] for the number line 0–100 (**Table 2**). Children with DD placed the numbers further away from the correct position compared to the TD group. There was also a significant main effect of time [F(1, 17) = 26.42, p < 0.001, η<sup>2</sup> = 0.609], showing that accuracy increased with development. Finally, the significant interaction time by group indicated that DD children improved more over time than TD children [F(1, 17) = 5.44, p = 0.032, η<sup>2</sup> = 0.243]. The number line test 0–1000, performed at the follow-up assessment, also revealed lower performance for the DD than the TD group (Kolmogorov-Smirnov-Z z = 1.51, p = 0.011, d = 1.21) (**Table 2**). Regarding the measured RT at the follow-up assessment, no significant differences could be found in the number line tasks between groups [number line 0–100: t(26) = −1.15, p > 0.05, d = −0.45; number line 0–1000: t(26) = −0.97, p > 0.05, d = −0.38].
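Partial η<sup>2</sup> can be recovered from an F value and its degrees of freedom through the identity η<sup>2</sup><sub>p</sub> = F · df<sub>effect</sub> / (F · df<sub>effect</sub> + df<sub>error</sub>). The following sketch, with a hypothetical function name, applies it to the three number line effects reported above as a plausibility check.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, value reported in the text)
effects = [
    ("group", 13.02, 1, 17, 0.434),
    ("time", 26.42, 1, 17, 0.609),
    ("time x group", 5.44, 1, 17, 0.243),
]
for name, f, df1, df2, reported in effects:
    print(name, round(partial_eta_squared(f, df1, df2), 3), "reported:", reported)
```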

#### Basic Arithmetic Operations

For the basic arithmetic operations, t-tests revealed significant differences between DD and TD children. At baseline and follow-up, TD children solved more addition [baseline: t(8.67) = −3.43, p = 0.008, d = −2.33; follow-up: t(26) = −2.30, p = 0.030, d = −0.90] and subtraction problems correctly [baseline: Kolmogorov-Smirnov-Z z = 1.72, p = 0.001, d = 1.96, follow-up: t(24.67) = −3.79, p = 0.001, d = −1.53; **Table 2**]. The measured RTs at follow-up show that children with DD took longer to solve the addition [t(19.19) = 2.20, p = 0.041, d = 0.91] and the subtraction problems [t(24.99) = 2.42, p = 0.023, d = 0.97] compared to their peers (**Table 2**).

#### Intelligence Quotient

All participants reached normal range of intelligence during both assessments (IQ range baseline: 93–125; follow-up: 92–122). However, groups differed significantly in the estimated general IQ [baseline: WISC-III t(26) = −4.65, p < 0.001, d = −1.82; follow-up: WISC-IV t(26) = −4.41, p < 0.001, d = −1.73; **Table 1**]. Differences in IQ scores between a group of children with learning disabilities and a control group are often reported in the literature (Geary et al., 2000; Willcutt et al., 2013). One reason for this is that IQ-tests are not independent from numerical skills. The IQ was not entered as a covariate in the subsequent behavioral analysis, since IQ is not independent from the effects of interest (Miller and Chapman, 2001; Dennis et al., 2009; Field, 2009).

#### Attention, Reading and Working Memory

To match the groups for comorbid attention deficit and hyperactivity disorder (ADD/ADHD) and dyslexia, the TAP and the SLRT-II were performed. Groups did not differ significantly in any measurement of attention or reading performance (**Table 1**). Regarding working memory, subjects showed comparable results in the verbal and visuo-spatial memory components at baseline and follow-up. The only significant difference was found in visuo-spatial working memory at the follow-up assessment due to lower performance of children with DD compared to TD (Kolmogorov-Smirnov-Z z = 1.16, p = 0.039, d = −0.76) (**Table 1**).

TABLE 2 | Behavioral results for the spatial representation of numbers (number line), basic arithmetic operations (addition and subtraction), and the fMRI paradigm.

<sup>1</sup>*Number of subjects per group DD/TD.*

*<sup>a</sup>Mixed-design ANOVA. <sup>b</sup>Kolmogorov-Smirnov-Z Test. <sup>c</sup> t-Test.*

\**p* < *0.05,* \*\**p* < *0.01,* \*\*\**p* < *0.001.*

#### Behavioral Results from fMRI Task

A mixed-design ANOVA with time as within-subject factor and group as between-subject factor was calculated. For accuracy, the ANOVA revealed a significant effect of time [F(1, 24) = 40.85, p < 0.001, η<sup>2</sup> = 0.63], showing that children were better able to solve the task with increasing age (**Table 2**). Furthermore, the DD group performed significantly worse than the TD group [F(1, 24) = 4.30, p = 0.049, η<sup>2</sup> = 0.152]. This significant difference arises from their lower performance in the number order task [F(1, 24) = 4.54, p = 0.044, η<sup>2</sup> = 0.159], as performance in the control task was comparable between groups [F(1, 24) = 1.66, p = 0.209, η<sup>2</sup> = 0.065]. The group by time interaction was not significant [F(1, 24) = 1.95, p = 0.176, η<sup>2</sup> = 0.075]. For RT, neither an effect of group [F(1, 24) = 1.91, p > 0.05, η<sup>2</sup> = 0.074] nor an interaction between time and group [F(1, 24) = 0.43, p > 0.05, η<sup>2</sup> = 0.018] was evident. However, children solved the task faster at the second assessment point [F(1, 24) = 50.45, p < 0.001, η<sup>2</sup> = 0.678; **Table 2**].

#### fMRI Results

Analysis of the task (experimental minus control condition) revealed bilateral parietal activation in TD children at baseline and follow-up. DD children showed at baseline only right lateralized activation in parietal regions (**Figure 3**, **Tables 3**, **4**).

#### fMRI Group Differences

At baseline, no significant difference between groups was found at the statistical threshold of p < 0.001.

FIGURE 3 | Activation at baseline and follow-up. Task related brain activation shown on a pediatric template (Fonov et al., 2009, 2011) for the contrast numerical order vs. control task at baseline (Left) and follow-up (Right) for children with developmental dyscalculia (DD) (Upper) and typically developing (TD) children (Lower) (*p* < 0.01, *k* ≥ 24, cluster-extent corrected).

At follow-up, two-sample t-tests revealed significant differences between children with DD and controls (**Figure 4A**, **Table 5**). Children with DD showed increased activation in frontal areas including bilateral middle frontal gyri (MFG) and the left inferior frontal gyrus (IFG). In the parietal lobe, more activation in the bilateral angular gyri (AG), extending into the supramarginal gyri (SMG) and the left IPS was found. TD children did not show any increased activation compared to children with DD at follow-up (p < 0.001).

TABLE 3 | Brain areas that showed significant activation for the numerical order vs. control task from dyscalculic and typically developing children at the baseline assessment (*p* < 0.01, *k* ≥ 24, cluster-extent corrected).

#### fMRI Developmental Effects

Developmental changes took place in the DD group, showing increased activation in the basal forebrain and the left insula at p < 0.001. At a threshold of p < 0.005, additional activation increases in the bilateral IPS, right insula, left IFG, left parahippocampal gyrus (PHG) and left thalamus were observed (**Figure 4B**, **Table 6**). No decrease in activation was found in the DD group over development.

TD children did not show any increase or decrease in activation over development at the statistical threshold of p < 0.005.

The negative interaction time by group indicated that the activation increase over time was more pronounced in children with DD than in TD children. The left IFG was the only region showing interaction effects at the higher cluster-extent threshold (p < 0.001). A lower threshold (p < 0.005) revealed activation in similar regions to the t-test in DD over development, namely in the left middle cingulum extending into somatosensory area, left IPS, left hippocampus, and right AG (**Figure 4C**, **Table 6**).

TABLE 4 | Brain areas that showed significant activation for the numerical order vs. control task from dyscalculic and typically developing children at the follow-up assessment (*p* < 0.01, *k* ≥ 24, cluster-extent corrected).

In order to investigate the developmental effects further, a regression analysis was performed with 14 fMRI data sets, comprised of both baseline and follow-up scans. Each subject's activation increase over time, for the contrast experimental minus control condition (interaction time point x condition), and the number of correctly solved basic arithmetic operations (addition and subtraction) at baseline were included in the analysis. Results revealed a negative correlation between the activation increase over time and the number of correctly solved subtractions and additions at the baseline assessment (cluster-extent corrected p < 0.001). This indicates that children who solved fewer arithmetic problems correctly at baseline showed more activation increase over time in bilateral cingulate cortex extending into right frontal gyri and left supplementary motor area (SMA), bilateral insular lobe extending bilaterally into putamen and caudate nucleus, left inferior (IFG) and middle frontal gyrus (MFG), left superior temporal gyrus (STG) and right cerebellum. In the parietal lobe, broad activation increases were found in bilateral angular gyri (AG) extending into inferior parietal lobe and intraparietal sulcus (IPS), and bilateral precuneus (see Figure S1).

## DISCUSSION

In the present longitudinal study, we investigated the neurofunctional development of children with and without DD by means of neuropsychological tests and fMRI. In line with previous studies, we found that children with DD improved over time, but nonetheless showed persistent deficits in number processing and arithmetical skills when compared to their peers.

*p* < 0.005, *k* ≥ 22, cluster-extent corrected).

Brain imaging results revealed an increase in frontal and parietal brain activation over time in children with DD. In contrast, results of TD children point to a stable activation pattern over development. Furthermore, a lower performance in basic arithmetic operations correlates with a more pronounced increase in the fronto-parietal network over time.

#### Deficient Numerical Processing and Aberrant Neural Networks

As hypothesized, we found considerable deficits in number processing and arithmetic abilities in children with DD compared to a peer group. The more pronounced inaccuracy in a number line task is typically found in dyscalculics and is consistent with a large body of research findings (Geary et al., 2008, 2012; Landerl, 2013). In addition, higher accuracy in a number line task is thought to reflect a better representation of quantity (Siegler and Booth, 2004; Ebersbach et al., 2008). Therefore, our data point to a deficient mental number line representation in 9- and 14-year-old dyscalculic children. Consistent with the study from Piazza et al. (2010), the children with DD performed at the same level as the control group when 4 years younger. Given that numerical magnitude representation further influences arithmetical learning, it is in good agreement with earlier studies (Booth and Siegler, 2008) that our DD group also showed poor performance in basic addition and subtraction problems.

interaction. Activation increase over time was more pronounced in children with dyscalculia compared to typically developing children (group by time interaction,

TABLE 6 | Brain areas that showed significant developmental changes in children with developmental dyscalculia and the negative interaction group by time (*p* < 0.005, *k* ≥ 22, cluster-extent corrected).

Regarding brain activation, group differences were evident at the follow-up assessment. Children with DD showed increased activation in frontal (MFG, IFG) and parietal (AG, left IPS) regions of the numerical network compared to their peers. This contradicts studies reporting reduced activation in the parietal key regions for numeracy. However, our findings are in line with several studies that found increased activation in fronto-parietal regions of DD children (Kaufmann et al., 2009b; Kucian et al., 2011b; Iuculano et al., 2015; Rosenberg-Lee et al., 2015). Similar results were further reported in the meta-analysis by Kaufmann et al. (2011). The authors suggested that the increased IPS and postcentral activation reflects the recruitment of finger-based number representation in DD children, which might also be the case in our study.

At baseline, we did not find any differences in the activation pattern of the groups. A reason for this is that we chose a rather strict significance level to report our results. When the statistical threshold was lowered, activation differences in occipito-parietal, temporal and frontal regions could be detected, which are in line with those reported in the literature (e.g., Kaufmann et al., 2009b; Kucian et al., 2011b).

#### Typical and Atypical Development

Consistent with our expectations, TD children showed a growing proficiency in number processing with development, as seen in a significant improvement in the number line task. DD children also did not stagnate in their development, exhibiting a decrease in error rates when placing numbers on a number line. In fact, our results showed that the number line performance of the dyscalculic children improved more over time than that of the TD children. This result is consistent with other findings of long-term studies (Geary et al., 2012; Landerl, 2013). However, even though the gap between typical and atypical development in the mental number line decreased, children with DD consistently performed significantly worse than their peers. Our results confirm findings from earlier studies (Shalev et al., 1998, 2005) and the result from a systematic review (Nelson and Powell, 2017) showing that number representation in DD is deficient and delayed in development. In addition, children with DD still showed substantial deficits in simple arithmetic throughout the entire study. This result supports the importance of effective and efficient ordinal and magnitude number processing abilities in the development of arithmetical skills (Booth and Siegler, 2008).

The brain activation patterns of TD children revealed no significant difference over the examined time. This result seems surprising, considering findings from earlier studies that showed an age-related activation increase in the IPS and a decrease in frontal areas during magnitude processing (Ansari et al., 2005; Ansari and Dhital, 2006). However, it is important to note that most of these studies compare numeracy-established adults with developing children and therefore assume linearity in development. It might be that some of the mentioned changes occur only at specific periods in development. Moreover, the brain activation pattern from our results is in line with results from a study using the same task (Kucian et al., 2011a). Our findings are further consistent with Kucian et al. (2008), who found no differences when comparing children over a 3-year period, but did find changes between children and adults. Along with the increasing proficiency on the behavioral level, our results speak for a consistent and well-functioning number processing network in TD children. Importantly, the present results do not exclude the possibility that the number processing network continuously develops and refines in typical development over time. Furthermore, these results must be interpreted cautiously and confirmed with bigger group sizes.

Interestingly, children with DD showed a remarkable activation increase in the entire fronto-parietal network over the observed period of development. The growth of activation in the basal forebrain, bilateral insula and bilateral IFG is in good agreement with the literature, indicating that these regions play a crucial role in working memory, attention, and cognitive control. Together with the better performance in the fMRI task, this finding supports the notion that children with DD constantly use domain-general regions to a larger extent, reflecting the higher cognitive demands induced by the task. In addition, children with DD showed an activation increase in the bilateral IPS over developmental time. It is further worth pointing out that in children with DD the activation increase in the left IPS was much greater than in the right IPS, whilst TD peers showed stable bilateral IPS activation over time. Previous findings show that activation in the left IPS changes with growing proficiency in (symbolic) number representation, whereas activation in the right IPS is stable over development (Vogel et al., 2015). Taken together with the improvement in the fMRI paradigm and the catch-up in the number line task, our results lend support to a stronger use of number-specific areas in children with DD. This is in line with the results of the regression analysis, indicating that children who solved fewer arithmetic problems correctly at baseline showed more activation increase over time. Furthermore, the negative interaction also revealed activation in parietal number-specific regions and frontal domain-general regions, indicating that the developmental changes were more pronounced in children with DD. This mirrors the results from our behavioral data and previous studies, showing that the gap between TD and DD performance diminishes over development (Geary et al., 2012; Landerl, 2013).

To our knowledge, no long-term neuroimaging studies exist in the field of dyscalculia, but results from studies with dyslexic children also showed differences in the development of the neural reading system. Comparable to our findings, age-related increases are seen in domain-specific occipito-temporal regions but also in domain-general regions (left IFG) (Shaywitz et al., 2007). Furthermore, Rosenberg-Lee et al. (2011) looked at brain maturation processes between 2nd and 3rd grades during arithmetic problem solving. In line with our results, better behavioral performance and a significant increase in activity were observed in the right superior parietal lobe, IPS and AG, PHG, and frontal regions from grade 2 to 3. Based on these activation increases, which have been associated with initial stages of learning, the developmental effects in our DD group might also reflect neural maturation processes.

To summarize, our results support the notion that TD children have a well-functioning number processing network, and therefore showed only subtle developmental effects over the examined time. Dyscalculic children, however, showed age-related changes in frontal areas of the brain. These can be related to compensatory mechanisms or different but less effective task solving strategies, which are often observed in children with DD. In addition, the increase in domain-specific parietal areas hints at maturation or delayed development of number processing areas. Although these findings are promising, it is important to note that children with DD did not fully catch up to their peer group in numerical-arithmetical skills and showed less focused activation patterns, underscoring that the deficiencies do not fully vanish with time.

#### Methodological Considerations

To our knowledge this is the first longitudinal study looking at neural development in children with and without DD. The lack of other longitudinal studies in DD might arise from several reasons. Firstly, longitudinal fMRI studies in children are especially prone to high drop-out due to more movement artifacts and braces. This was also the case in our study and the reason why we have unequal and small sample sizes. For this reason, our results (in particular the results from the TD children) should be interpreted with caution. However, the same main results were obtained when evaluating the study with equal group sizes, indicating that our results are stable and not based on differences in group size (see Supplementary Material and Figures S2–S6). Furthermore, in order to check the statistical power of our findings, we conducted post-hoc power analyses (G\*Power; Faul et al., 2007) for the significant main results of the behavioral data with α = 0.05, and the effect and sample sizes as reported for the specific statistical test (see Results section, **Tables 1**, **2**). For most of the tests we reached good statistical power (1−β ≥ 0.80). However, for the interaction of the number line test 1–100 and the effect of group in the accuracy of the fMRI paradigm we detected a power of 0.60 and 0.51, respectively. Thus, the likelihood that these results reflect true effects is reduced. In addition, power analyses for the main effects of the fMRI data were conducted by means of the software package fMRIpower (Mumford and Nichols, 2008). This method estimates power for detecting significant activation within specific regions of interest, with the assumption that the planned studies will have the same number of runs per subject, runs of the same length, similar scanner noise characteristics, and data analysis with a comparable model (Mumford and Nichols, 2008). For this purpose, post-training data from Kucian et al. (2011a) (which were acquired with the same fMRI paradigm on the same MR-Scanner, but were not included in the present longitudinal study, see also Materials and Methods section) were used as "pilot data" for the power analysis. The power analyses were carried out for each of the regions of the automated anatomical labeling (AAL) ROI mask with α = 0.05 and the sample sizes as reported for the specific contrast of the present study. For the group differences at the follow-up assessment (**Figure 4A**, **Table 5**), power estimates between 11 and 62% were obtained for the brain areas that showed significant activation, with the highest power estimate of 62% observed in the left IPS and angular gyrus. Similarly, the brain areas that showed significant developmental changes in children with DD (**Figure 4B**, **Table 6**) reached power estimates between 18 and 79%. The highest values of 70 and 79% of power were detected for the left and the right IPS, respectively, whilst lower power estimates were reached for the frontal areas of the brain. Although more subjects would be necessary to increase the power of the present study, the results of the conducted power analyses reveal that the power estimates for the numerical key areas are already close to the desired power of 80% and therefore likely show true effects.
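To make the logic of a post-hoc power check concrete, the following is a minimal sketch in Python. It is not the procedure implemented in G\*Power or fMRIpower: it uses a normal approximation for a two-sided comparison of two independent groups, and the function name `approx_power_two_sample` as well as the effect size and group sizes in the usage line are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d: float, n1: int, n2: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison.

    Assumption: a normal approximation to the test statistic; dedicated
    power software works with noncentral distributions and will give
    somewhat different values for small samples.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Noncentrality: standardized effect size scaled by the effective sample size.
    ncp = d * sqrt(n1 * n2 / (n1 + n2))
    # Probability that the statistic falls beyond either critical value.
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)


# Hypothetical inputs (NOT values taken from the study).
power = approx_power_two_sample(d=0.8, n1=15, n2=11)
print(f"approximate power: {power:.2f}")
```

With an effect size of zero the function returns α, and power grows with either the effect size or the group sizes, which is why small and unequal groups limit the power that can be reached.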

Secondly, the choice of the fMRI paradigm, especially in longitudinal studies, is constrained by the requirements that it must be feasible for children with DD (performance above chance level) and not too easy for TD children (ceiling effects). Adapting the difficulty level of the task would result in a loss of comparability over time, which we wanted to avoid. As a consequence, ceiling effects might have led to a loss of behavioral group differences at the follow-up.

Thirdly, longitudinal study designs are very time consuming regarding (re-)recruitment and maintenance of the participants' motivation. Thus, developmental questions are in many cases examined by more time-efficient methods such as cross-sectional designs. Importantly, cross-sectional designs do not take into account inter-individual differences to the same extent as longitudinal designs. Furthermore, most cross-sectional studies compare adults and children and might therefore miss an opportunity to capture the full developmental trajectory. We think that our results are promising and provide an important contribution to the understanding of the typical and atypical development of number processing, but further work is needed to verify our findings and strengthen the understanding of developmental trajectories.

Despite these methodological considerations, our findings suggest a continuation in the neural development of number representation in children with DD, whereas the neural network for simple ordinal number estimation seems to be stable or show only subtle changes over time in TD children. Furthermore, our results shed light on the behavioral and neural trajectories in dyscalculia and emphasize the importance of longitudinal studies for the understanding of development. This knowledge contributes to the understanding of numeracy and might therefore be meaningful for education and implementation of therapy and support of children with difficulties in mathematics.

### AUTHOR CONTRIBUTIONS

All authors have contributed and have approved the final manuscript. UMc: contributed to the design of the study, the acquisition, analysis, and interpretation of the data, and writing the manuscript; MvA: contributed to the design of the study, data interpretation and revised the manuscript; UM, EM, and RO: contributed to data interpretation and revised the manuscript; KK: contributed to the design of the study, the acquisition and interpretation of the data, and editing and revising the manuscript.

# FUNDING

This research was supported by a grant from the NOMIS Foundation.

#### ACKNOWLEDGMENTS

We would like to thank all children and their parents for participation in this study and Anatol Schauwecker for help with the data collection. The present manuscript is partially based on the thesis of UMc (McCaskey, 2016).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2017.00629/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 McCaskey, von Aster, Maurer, Martin, O'Gorman Tuura and Kucian. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Analyzing Global Components in Developmental Dyscalculia and Dyslexia

#### Gloria Di Filippo<sup>1</sup> \* and Pierluigi Zoccolotti<sup>2,3</sup> \*

<sup>1</sup> Faculty of Educational Sciences, Niccolò Cusano University, Rome, Italy, <sup>2</sup> Department of Psychology, Sapienza University of Rome, Rome, Italy, <sup>3</sup> Neuropsychological Unit, IRCCS Santa Lucia Foundation, Rome, Italy

The study examined whether developmental deficits in reading and numerical skills could be expressed in terms of global factors by reference to the rate and amount (RAM) and difference engine (DEM) models. From a sample of 325 fifth grade children, we identified 5 children with dyslexia, 16 with dyscalculia, 7 with a "mixed pattern," and 49 control children. Children were asked to read aloud words presented individually that varied in frequency and length and to respond (either vocally or manually) to a series of simple number tasks (addition, subtraction, number reading, and number comparisons). Reaction times were measured. Results indicated that the deficit of children with dyscalculia and children with a mixed pattern on numerical tasks could be explained by a single global factor, similarly to the reading deficit shown by children with dyslexia. As predicted by the DEM, increases in task difficulty were accompanied by a corresponding increase in inter-individual variability for both the reading and numerical tasks. These relationships were constant across the four groups of children but differed in terms of slope and intercept on the x-axis, indicating that two different general rules underlie performance in reading and numerical skills. The study shows for the first time that, as previously shown for reading, numerical performance can also be explained with reference to a global factor. The advantage of this approach is that it takes into account the over-additivity effect, i.e., the presence of larger group differences in the case of more difficult conditions over and above the characteristics of the experimental conditions. It is concluded that reference to models such as the RAM and DEM can be useful in delineating the characteristics of the dyscalculic deficit as well as in the description of co-morbid disturbances, as in the case of dyslexia and dyscalculia.

Keywords: dyscalculia, dyslexia, co-morbidity, reading, numerical cognition

#### Edited by:

Fuat Balcı, Koç University, Turkey

#### Reviewed by:

Evelyn Kroesbergen, Radboud University Nijmegen, Netherlands

David Giofrè, Liverpool John Moores University, United Kingdom

#### \*Correspondence:

Gloria Di Filippo gloria.difilippo@unicusano.it

Pierluigi Zoccolotti pierluigi.zoccolotti@uniroma1.it

#### Specialty section:

This article was submitted to Perception Science, a section of the journal Frontiers in Psychology

Received: 24 August 2017; Accepted: 01 February 2018; Published: 20 February 2018

#### Citation:

Di Filippo G and Zoccolotti P (2018) Analyzing Global Components in Developmental Dyscalculia and Dyslexia. Front. Psychol. 9:171. doi: 10.3389/fpsyg.2018.00171

# INTRODUCTION

In previous research on developmental dyslexia we showed the efficacy of examining performance across reading conditions with reference to a global factor (Zoccolotti et al., 2008). Here we describe a study in which the same approach was extended to the study of developmental dyscalculia.

According to the rate and amount model (hereafter RAM, Faust et al., 1999), group differences in speeded tasks are explained by a multiplicative interaction between a factor that marks the difficulty in any given condition ("amount") and one that marks the global slowness of a group across conditions ("rate"). Describing group differences in terms of a global factor allows controlling for the presence of over-additivity, i.e., the tendency of more difficult conditions to yield larger group differences over and above the influence of specific experimental manipulations. Indeed, this is typical of results obtained in various conditions such as Alzheimer's disease (Myerson et al., 1998) and traumatic brain injury (Ferraro, 1996).

The largest group of studies concerns the effect of aging (e.g., Verhaeghen and Cerella, 2002). When compared to younger adults, older individuals show group differences that are progressively larger in more difficult conditions; this effect can be expressed by contrasting in the same plot (often referred to as the Brinley plot) the condition means of the slower and the faster group (Cerella, 1985). The data points lie on a single regression line and the slope of this regression provides some information on the degree of overall impairment of the slower group. According to the RAM (Faust et al., 1999), the slowness shown by older people indicates a rate factor which interacts multiplicatively with the difficulty of the tasks over and above the specific characteristics of the target conditions.

In various studies, we applied this approach to the study of dyslexia (Zoccolotti et al., 2008; De Luca et al., 2010; Paizi et al., 2013; Martelli et al., 2014). By controlling for the influence of over-additivity, it is possible to examine which factors (if any) are genuinely involved in the reading deficit and which group differences can be parsimoniously interpreted as due to over-additivity. We found that the impairment of children with dyslexia concerns all tasks requiring the processing of strings of letters, such as word and pseudo-word reading or lexical decision tasks, and that this deficit can be almost entirely explained by a single global factor which interacts multiplicatively with the difficulty of the experimental conditions (Zoccolotti et al., 2008; Paizi et al., 2013). Thus, compared with typically developing children, children with dyslexia show a greater impairment with pseudo-words than with words, but this greater lexicality effect can be entirely explained in terms of over-additivity; a similar pattern is present in the case of word frequency (Zoccolotti et al., 2008; Paizi et al., 2013). By contrast, a residual specific influence of length was found in some (though not all) studies: children with dyslexia are selectively impaired in the case of long words even after controlling for over-additivity (Zoccolotti et al., 2008).

Research on dyslexia has also evidenced a number of conditions in which performance of children with dyslexia is entirely (or largely) spared. Thus, children with dyslexia are impaired when they read words but not when they name the corresponding pictures (Zoccolotti et al., 2008; De Luca et al., 2017). Furthermore, they perform slowly when word and pseudoword reading and lexical decision tasks are presented in the visual modality but not when they are asked to repeat or make lexical decisions about the same words presented in the auditory modality (Marinelli et al., 2011). Finally, children with dyslexia are impaired in the case of strings of letters (words and nonwords) but not (or minimally) when the stimulus is either a single letter or a bigram (De Luca et al., 2010).

These findings raise the question about how to define the scope of the global factor that accounts for the reading deficit in dyslexia. Based on previous results, it appears that the key deficit concerns the processing of visually presented letter strings independent of their lexical value. A similar proposal was put forward by Marsh and Hillis (2005) based on data from neuroimaging studies and from patients with acquired reading deficits. The authors proposed the key role of the "grapheme description," a pre-lexical orthographic computation independent of case, font, location or orientation. A similar model was proposed by Dehaene et al. (2005) on the basis of their imaging studies on the so-called Visual Word Form Area (VWFA); at both a neural and behavioral level, the Local Combination Detector model envisages the various stages of visual processing which make it possible to process orthographic strings regardless of their location, font and size.

One model designed by Myerson et al. (2003) to formally describe the characteristics of the global factor underlying the differences between groups varying for overall speed of processing is the "difference engine model" or DEM. While the RAM focuses on evaluating the presence of specific factors once the over-additivity effect is controlled for, the DEM aims to describe the characteristics of the global factor itself. Therefore, the two models provide complementary information on the role of global and specific components in performance (Myerson et al., 2003); thus, we will refer to both models in the present study.

According to the DEM, the presence of a global factor is defined by the presence of co-variance between difficulty level (as defined by the condition means) and inter-individual variability (as assessed by the corresponding standard deviations, SD). Thus, with increasing difficulty (i.e., slower reaction times, RTs) SDs tend to grow systematically. This largely linear trend has a negative intercept on the y-axis and, thus, a positive intercept on the x-axis. Myerson et al. (2003) propose that performance on timed tasks can be ascribed to two different (and independent) components (named "compartments") that account for the overall response, i.e., a decisional compartment and a sensory-motor compartment. The intercept on the x-axis represents an estimate of the duration of the sensory-motor processing (sensory-motor compartment), which is expected to be constant across a large variety of tasks that require the planning of a minimal motor response. The cognitive component is marked by the co-variation between condition means and SDs. According to Myerson et al. (2003), the slope of the regression between condition means and SDs indicates the degree of correlation among the durations of the processing steps involved in the performances. In this vein, some individuals tend to have brief processing steps while others tend to have longer processing steps. It must be observed that the analysis made by DEM regards the steps in general terms while it does not consider the specific characteristic of the processing stages. In this perspective, the regression between condition means and SDs is the basic relationship which is hypothesized to hold independent of condition difficulty and group differences in speed of information processing. In the same vein, Wagenmakers and Brown (2007) consider the linear relationship between the mean and the SD of a response time distribution as a general law for RT responses under time pressure in decisional tasks.

The DEM specifies the general rules underlying global differences in performance between groups with different basic levels of information processing but does not specify the expected values of the critical parameters (i.e., the time of the sensory-motor compartment and the slope of the regression between means and SDs). In fact, the latter are empirically defined. Based on the analysis of a large set of conditions, Myerson et al. (2003) noted that most cognitive tasks can be described by a relationship between means and SDs with a slope of ca. 0.30 and an x-intercept of ca. 300 ms.

Focusing on global changes in performance does not mean that all cognitive performances are equally impaired in a given group. For example, it has been reported that older individuals are more impaired in visual-spatial than verbal, lexical tasks (Hale and Myerson, 1996; Lawrence et al., 1998). According to the DEM these impairments indicate deficits in selective "domains," i.e., "verbal-lexical" or "visuo-spatial" domains. Deficits are global because within a given domain the impairment is predicted by a single regression in a Brinley plot; i.e., performances on verbal tasks are explained by a single multiplicative factor and the same holds true for visuo-spatial tasks. At the same time, they are distinct as slopes are clearly different for verbal and visuo-spatial tasks (Lawrence et al., 1998). Even though older individuals are differently impaired on verbal and non-verbal tasks, this does not detract from the general law governing the relationship between mean RTs and SDs. Thus, the verbal and non-verbal tasks of both older and younger groups share the same relationship between condition means and SDs (see Figure 14, Myerson et al., 2003). Myerson et al. (2003) made a clear distinction between the general law defining the global factor, which holds across tasks and groups of individuals with different cognitive speeds, and the presence of domain selective deficits, i.e., the observation that a given group of individuals may be impaired on only a sub-set of tasks.

According to the DEM different domains can be accommodated within the same general law of processing. However, this does not necessarily imply that a single, general rule accounts for all possible tasks. In fact, the possibility should be considered that different general relationships hold for different sets of tasks. Notably, neither the RAM nor the DEM takes this possibility into consideration, but neither do they explicitly rule it out. Both models were devised to examine the possibility of accounting for several group differences using a limited set of predictors; so, their general aim was in this direction. However, it is not incompatible with these models that different general rules apply to different sets of tasks. Empirically, we have noted that the relationship between means and SDs in the case of reading words and pseudo-words was appreciably higher (0.70) than that observed in the case of tasks on single letters and bigrams (0.40; De Luca et al., 2010). Based on these observations we re-analyzed a large dataset from previous experiments on children with dyslexia and control children comparing performances in tasks of lexical decision and reading aloud. In the case of lexical decision tasks, the parameters were similar to previously reported values (i.e., slope of about 0.30 and x-intercept of about 300 ms; Zoccolotti et al., 2017). In the case of reading, the slope was considerably steeper (0.66) and the intercept longer (482.6 ms). Therefore, in the case of reading the inter-individual variability is very sensitive to the level of difficulty of the conditions such that even small increases in condition difficulty produce large increases in inter-individual variability. By contrast, performances in lexical decision tasks indicate parameters similar to several other tasks reported in the literature. Overall, it appears that the general rule underlying the reading task may, indeed, be different from that of most other tasks used in the literature. Notably, these involve a relatively limited number of response alternatives; by contrast, reading requires the identification of a target among many alternatives (in fact, thousands of alternatives). We tentatively proposed that the requirement for a close coupling between orthographic and phonological processing, characteristic of reading tasks, drives the particularly high relationship between performance and inter-individual variability (Zoccolotti et al., 2017). In this view, a peculiar characteristic of reading is that seemingly small increases in difficulty can produce a large increase in variability, i.e., great changes in the tails of the distribution.

In the present research, we applied the same approach to the study of deficits in number processing and calculation. One line of research in dyscalculia has worked on the idea that the disturbance can be interpreted in terms of a deficient numerical processing system or approximate number system (ANS). In turn, this hypothesis rests on the evidence of a "number module" (Landerl et al., 2004; Butterworth, 2005; see also Wilson and Dehaene, 2007). The idea that a deficit in numerical processing is the core deficit in developmental dyscalculia is an interesting working hypothesis from the standpoint of searching for global components in the disturbance.

However, it should be noted that several different theoretical proposals have been put forward in recent years. According to Noël and Rousselle (2011), the deficit in the ANS actually originates from a developmentally earlier disturbance in the ability to build an exact representation of numerical values. Evidence in favor of this hypothesis comes from the observation that deficits in tasks requiring the comparison of non-symbolic numbers are not present in younger children with dyscalculia (e.g., de Smedt and Gilmore, 2011) though they may emerge later in development (Mazzocco et al., 2011). Other authors have emphasized the heterogeneity of the dyscalculic deficit (Fias et al., 2013; Noël et al., 2016) and have proposed that a number of factors may contribute to generating the developmental deficit. For example, recent evidence points to the idea that short-term visuo-spatial memory and inhibition may be critical in distinguishing between children with and without numerical deficits after controlling for several factors including age, IQ etc. (Szucs et al., 2013). On the other hand, it has been observed that the idea that cognitive factors may modulate the performance in numerical tasks and affect the emergence of the deficit does not necessarily detract from the hypothesis that the core deficit of the disturbance rests upon a deficient numerical processing system (Landerl et al., 2013).

To distinguish between reading and calculation, we examined children with deficits in one (or both) of these areas. It is well known that dyslexia and dyscalculia can appear in isolated forms (Rubinsten and Henik, 2006) but tend to be co-morbid (e.g., Lewis et al., 1994; Badian, 1999; Barbaresi et al., 2005; Tressoldi et al., 2007; Dirks et al., 2008). Recent studies investigated the nature of the different cognitive correlates of this comorbidity (Willburger et al., 2008; Landerl et al., 2009; Wilson et al., 2015). Landerl et al. (2009) propose that a deficit in phonological processing is critical in the case of dyslexia while one in a number module is critical in dyscalculia. Partially similar results were reported in the case of young adults with dyscalculia (Wilson et al., 2015). In line with the comorbidity perspective (Pennington, 2006), cognitive deficits in the comorbid dyslexia/dyscalculia group could be accounted for by a combination of the two learning disorders, i.e., the effects were additive (Landerl et al., 2009). Recent research has also tried to detect deficits that could account for the shared variance in the two disorders, but results are still variable. Wilson et al. (2015) reported that candidates, such as domain general deficits in rapid naming or in verbal short-term memory, did not explain the comorbidity. By contrast, Slot et al. (2016) reported that the shared risk for reading and calculation could be accounted for in terms of weakness in phonological awareness tasks.

In the present study, we set out to verify whether disturbances in reading and numerical tasks could be described in terms of global factors in both isolated and co-morbid cases. Our general expectation was that children with dyslexia would be delayed in reading tasks and that their performances across conditions would be accounted for by a single regression (with a relatively steep slope) across all reading tasks but not (or minimally) numerical tasks. We expected the opposite in the case of children with dyscalculia, i.e., that their performance across numerical tasks would be accounted for by a single regression line and their performance on reading tasks would be minimally affected. Finally, we expected that co-morbid cases would perform pathologically on both sets of tasks.

With regard to the DEM, one interesting question is whether the putatively differential deficits in reading and numerical tasks can be explained in terms of different "domains" or, alternatively, in terms of different general laws of performance. In the former case, one would expect children with dyslexia or dyscalculia to be impaired only in the reading or numerical domain, but these two domains would show the same relationship between RT condition means and SDs. In the latter case, one would expect that reading and numerical tasks would actually show distinct relationships between means and SDs. Based on our recent reanalysis (Zoccolotti et al., 2017), the relationship between means and SDs can be hypothesized as steeper in the case of reading tasks than numerical tasks.

Overall, the aims of the study were the following:



# MATERIALS AND METHODS

# Sample

Children were examined in five different schools in Rome in a middle-class environment. All 22 classes in these schools participated in the study. Approximately 15 children per class agreed to participate in the study (out of an average of ca. 20 per class). Overall, the sample included a total of 325 fifth graders (178 male and 147 female; mean age = 10.6).

Identification of reading and calculation deficits was based on the standards identified by the Consensus Conference on specific learning disabilities (Istituto Superiore di Sanità, 2011). To detect calculation deficits, tasks involving specific skills (such as arithmetic facts, additions, subtractions, number comparisons etc.) must be used. The performance of the child should be expressed in terms of standard deviations from a normative sample (not in terms of age equivalents). Tasks measuring both accuracy and speed are envisaged. To detect reading deficits, children must show a deficit in both accuracy and speed based on a quantitative comparison with standard norms. Performance on reading comprehension is not diagnostic of the reading deficit although it provides complementary information for the functional framing of the disturbance. It is envisaged that lists of words and non-words are included in the test battery as they provide a particularly effective measure of the reading deficit.

Children were initially administered two standard tests to detect reading and calculation deficits. As for calculation skills, the group-administered part of the AC-MT test (Cornoldi et al., 2002) includes subtests that investigate addition and subtraction, multiplication and division, number size comparisons, digit transformation, and number ordering. Two overall scores are derived, one for Written operations and one for Numerical knowledge. Normative values for these two scores are available from Cornoldi et al. (2002). As for reading, the MT test for elementary school (Cornoldi and Colpo, 1998) was individually administered to the children. In the MT reading test, the child reads aloud a text passage with a 4-min time limit; speed (s per syllable) and accuracy (number of errors, adjusted for the amount of text read) are scored and can be expressed as z scores with reference to normative values (Cornoldi and Colpo, 1998). The MT scale also envisages a comprehension sub-test, in which the child reads a second passage silently, with no time limit, and responds to 10 multiple-choice questions; following clinical standards (Istituto Superiore di Sanità, 2011) this test was not used as part of the selection criteria. The children were also given Raven's Colored Progressive Matrices to evaluate non-verbal intelligence; only children who performed within normal limits (>10th percentile) were included in the experimental samples.

Based on standard values, we identified children with a selective deficit in reading or in calculation, or with deficits in both. We adopted
a z score cut-off of −1.65 (corresponding to ca. 5% of children under normality assumptions). Out of the total of 325 children examined, 12 failed in reading accuracy and/or speed (i.e., 3.69%) while 23 children (7.08%) failed either in the Written operations or Numeric knowledge indexes on the AC-MT test (for comments on these proportions see section "Discussion"). Overall, five children (F = 4, M = 1) only failed in reading and were considered "dyslexic"; 16 children (F = 6, M = 10) only failed in number tasks and were considered "dyscalculic"; finally, 7 children (M = 4, F = 3) showed a mixed pattern (i.e., they failed at least one reading and one numerical test), consistent with the high co-morbidity between the two deficits. A control group of 49 children (F = 23, M = 26) with spared performance on all screening tests was also examined; the children were selected from the overall original sample based on their willingness to participate in the study (but with no further selection based on screening measures).
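The selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function names, the group labels returned, the use of a strict "below the cut-off" comparison, and the convention that z scores are oriented so that lower values mean worse performance are all assumptions made here.

```python
import math

Z_CUTOFF = -1.65  # cut-off stated in the text


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def classify(reading_z, number_z, cutoff=Z_CUTOFF):
    """Assign a hypothetical group label from screening z scores.

    reading_z: z scores for reading accuracy and speed.
    number_z:  z scores for the two AC-MT indexes.
    Assumption: scores are oriented so that lower = worse, and a
    child "fails" a domain if any score falls strictly below the cut-off.
    """
    reading_fail = any(z < cutoff for z in reading_z)
    number_fail = any(z < cutoff for z in number_z)
    if reading_fail and number_fail:
        return "mixed"
    if reading_fail:
        return "dyslexic"
    if number_fail:
        return "dyscalculic"
    return "spared"


# Share of a normal population expected below the cut-off (about 5%).
expected_share = normal_cdf(Z_CUTOFF)
print(f"{expected_share:.3f}")
print(classify([-2.0, 0.1], [0.0, -0.3]))
```

Under the normality assumption mentioned in the text, `normal_cdf(-1.65)` gives the "ca. 5%" figure quoted for the cut-off.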

Performance of the different groups of children on the two screening tests is presented in **Table 1** in terms of mean z scores (and SDs) based on normative values. For the AC-MT test of calculation skills, values on the Written operations and Numeric knowledge indexes are presented. For the MT test, data for accuracy and time in reading a text passage are separately presented. Data on comprehension are also shown; note that they indicate only very mild deficiency in children with dyslexia and children with mixed pattern. This is in keeping with previous data on Italian children which indicate that comprehension is only mildly (or minimally) affected (Judica et al., 2002) if the comprehension test does not stress decoding skills (in the case of the MT test, no time limit is given to the child to read the text passage before responding to the multiple-choice questions).

A number of other tests were given to confirm the diagnosis on both reading and calculation skills. With regard to reading, we administered the Word and Pseudo-word Reading Test (Zoccolotti et al., 2005). The test includes four lists of 30 words (varying for frequency and length) and two lists of 30 non-words (varying for length). Number of errors and reading times are scored. Mean z score data (and SDs) for these lists are reported in **Table 1**. The children with dyslexia and those in the mixed pattern group were severely affected, with mean standardized scores falling below −2 SDs in several conditions. In particular, the children with dyslexia were more affected in time than in accuracy measures; the disturbance was severe in both time and accuracy measures in the children in the mixed pattern group. Word and non-word reading was impaired to a similar extent. Thus, no selective deficit in reading non-words was detected, similarly to previous observations in Italian children (Zoccolotti et al., 1999). As a group, the children with dyscalculia were minimally affected across all measures.

TABLE 1 | Performance of the different groups of children in the two screening tests and in the additional reading and number tasks in terms of mean z scores (and SDs) based on normative values.

As for numerical tasks, three sub-tests from the Battery for Developmental Dyscalculia (Biancardi and Nicoletti, 2004) were given. In the Complex oral calculation sub-test, the child must perform 10 additions and 10 subtractions with results above 10; a time limit of 15 s is given. In the Triplets sub-test, the child must choose the largest number in a set of three Arabic numbers (1–6 digits); 20 trials are given; both accuracy and speed are measured. In the Insertions sub-test, the child must place a 1- to 5-digit number among three other numbers; 12 trials are given; both accuracy and speed are measured. Mean z scores (and SDs) for these sub-tests are reported in **Table 1**. Children with dyscalculia were moderately affected in all conditions with the exception of accuracy in the Insertions sub-test. Children in the mixed pattern group were generally more affected across conditions, including accuracy on the Insertions sub-test. As a group, the children with dyslexia were generally spared in most conditions; however, a moderate deficit was detected in the Mental Calculation and in the Triplets (time) sub-tests.

Information about the reading and numerical conditions is summarized in **Figure 1**, where data from different measures of reading and number processing are collapsed into overall indexes separately for the screening and additional measures. Inspection of the figure indicates a somewhat more clear-cut separation between reading and numerical tasks in children with dyslexia than in children with dyscalculia. Children in the mixed pattern group showed a severe deficiency across the two sets of tasks.

Due to the selection criteria, the children generally performed well on Raven's Progressive Matrices; in particular, the control children's mean performance was 30.26 (SD = 3.35), the performance of children with dyslexia was 26.75 (SD = 5.42), the performance of children with dyscalculia was 29.05 (SD = 3.96) and the performance of children in the mixed pattern group was 27.29 (SD = 5.44). According to Pruneti et al. (1996), normative values indicate a mean performance of 30.2 (SD = 4.3) in fifth grade. Therefore, although the performance of children with dyslexia and children in the mixed pattern group was in the normal range, it was somewhat lower than expected. However, differences among the four groups did not reach statistical significance [F(3,79) = 2.2, p = 0.09].

FIGURE 1 | Mean and z-score performance (and SDs) in reading and number tasks of the four groups of children (controls, children with dyslexia, children with dyscalculia, and children with a mixed pattern). Values indicate average performance across tests used for screening and for the additional tests used for the evaluation of children (see text for details).

The study was carried out according to the principles of the 2012–2013 Helsinki Declaration. Written informed consent to participate in the study was obtained from the parents of all children. The study was approved by the IRB of the Department of Psychology of Sapienza University of Rome.

#### Experimental Tests

Several experimental tests were given.

To test reading, the children had to read aloud words individually presented at the center of a PC screen. The list of words was derived from Paizi et al. (2013; experiment 3): high- and low-frequency words (based on child-printed frequency counts; Marconi et al., 1993) that varied in length from four to seven letters, for a total of eight different conditions, were selected from the LEXVAR database (Barca et al., 2002). There were 15 stimuli in each orthogonal condition for a total of 120 stimuli. The stimuli were presented in five blocks of 24 words each. At the beginning, a practice block was administered; it consisted of 10 words that were different from the experimental items but had the same characteristics. A short pause was allowed after each block.

Bigram frequency was matched across conditions. Initial phonemes in the four sets were matched for the manner of articulation as well as for the voiced vs. voiceless features. N-size, age of acquisition and orthographic complexity were matched between corresponding length sets in the high- and low-frequency conditions. For a full description of the list please refer to Paizi et al. (2013). Vocal reaction times (RTs) were measured. The median RT for each condition was the dependent measure.

For numerical tasks, five sub-tests were used: (1) One-number addition: the child saw a pair of numbers at the center of the PC screen with the + sign in the middle and had to say aloud the result of the addition as quickly as possible (vocal RTs were measured); 20 trials were given; (2) One-number subtraction: similar to the previous sub-test except that the child had to say the result of the subtraction; (3) One-digit number reading: a digit was presented at the center of the PC screen and the child had to read it aloud; 20 trials were given; (4) Two-digit number reading: the same as (3) but the numbers had two digits; (5) Number comparison: two numbers were presented, one on the left and one on the right of the PC screen, and the child had to press one of two keys to indicate the larger number. Median RTs were used as the dependent measure.

#### Procedure

The screening procedures included both group and individually administered tests, which were given to the children in a single session in a quiet room in their school.

The children who participated in the rest of the study took the additional reading and numerical tests during another individual session.

In the final session, the children were administered the experimental tests. The order of presentation of the reading and calculation tests was counter-balanced across subjects.

## Data Analysis

The RAM and DEM models make predictions in the case of open-scale measures, such as RTs, but not accuracy. Thus, statistical analyses were carried out on RTs, while accuracy values were inspected to exclude the possible presence of a speed-accuracy trade-off. No such trade-off was detected and accuracy data were not further analyzed.

The RAM and DEM envisage different conditions to identify the presence of a global factor:

(1) The RAM (Faust et al., 1999) predicts a linear relationship between the condition means of two groups varying in terms of overall information-processing rate. Thus, we separately plotted the mean RTs for each group of children with learning problems (i.e., with dyslexia, dyscalculia and with a mixed pattern) against the performance of control children. We expected the group differences to increase linearly with the difficulty of the condition with separate regression lines for reading and numerical tasks. We expected that the children with dyslexia would have a steeper slope for reading than for numerical tasks and that those with dyscalculia would show the opposite pattern.

(2) The DEM (Myerson et al., 2003) predicts a linear relationship between the group means and the corresponding standard deviations. Note that homogeneity of variance is a basic assumption in parametric analyses; however, the presence of a clear-cut linear increase in SDs as a function of condition difficulty marks a systematic deviation from such assumption. To test the model prediction, mean RTs of the different groups of children in the different experimental conditions were plotted as a function of the corresponding SDs; data from the various groups were plotted in the same graph as the DEM predicts that the slope of the regression is constant across different groups of children. Furthermore, the model states that the x-intercept represents an estimate of the time for early visual processing, response selection and execution (sensory-motor compartment). Again, the DEM predicts that the same intercept on the x-axis would hold across different groups of children. As indicated in the Introduction, recent evidence (Zoccolotti et al., 2017) raises the question about whether reading tasks have the same general parameters as other cognitive timed tasks. To check this possibility, different plots between SDs and means were made for reading and number conditions.
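The two predictions listed above both reduce to simple linear regressions, which can be sketched as follows. This is a minimal illustration on made-up numbers, not the analysis code used in the study. The function names are hypothetical, and the DEM fit assumes that SDs are regressed on means (so that the x-intercept is expressed in milliseconds, as in the text).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


def brinley_fit(control_means, group_means):
    """RAM check: condition means of one group against those of controls."""
    return fit_line(control_means, group_means)


def dem_fit(condition_means, condition_sds):
    """DEM check: SDs against means; also returns the x-intercept.

    Assumption: SD = slope * (mean - x_intercept), so the x-intercept
    is where the fitted line crosses SD = 0.
    """
    slope, intercept, r_squared = fit_line(condition_means, condition_sds)
    x_intercept = -intercept / slope
    return slope, x_intercept, r_squared


# Hypothetical condition means (ms), chosen only to illustrate the calls.
control = [500.0, 600.0, 700.0, 800.0]
slower_group = [600.0, 800.0, 1000.0, 1200.0]
print(brinley_fit(control, slower_group))

# Hypothetical means and SDs (ms) lying on a single DEM line.
means = [600.0, 800.0, 1000.0, 1200.0]
sds = [100.0, 150.0, 200.0, 250.0]
print(dem_fit(means, sds))
```

A Brinley slope above 1 indicates that group differences grow with condition difficulty; in the DEM fit, a common slope and x-intercept across groups is what the model predicts.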

To remove the effect of over-additivity in the data, we ran analyses on z-score data. Following Faust et al. (1999), z scores are calculated separately for each child by subtracting the overall participant mean from the mean of each condition and dividing the difference by the standard deviation of that child's condition means; thus, z-transformed values represent the deviation of each condition from the overall participant mean. In this way, global components are controlled for, but individual variability across experimental conditions is preserved. We carried out separate ANOVAs on the reading and numerical data assessing the effects of the different variables marking the conditions of these tasks on both raw and z-transformed RTs. Based on Faust et al. (1999), interactions with the group factor which are significant in the z-transformed data (i.e., controlling for global components) highlight a genuine effect; interactions with the group factor which are significant in the raw, but not in the z-transformed, analyses indicate the presence of over-additivity. Whenever appropriate, means were compared with the Tukey post hoc test considering p < 0.05 as a reference. The variables entered in the different analyses are spelled out in the Section "Results."
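The within-participant z transformation can be sketched as below. This is a minimal sketch, assuming that each participant is summarized by one mean (or median) RT per condition and that the sample standard deviation is used; the text does not say which SD estimator was adopted. The function name and all numbers are hypothetical.

```python
from statistics import mean, stdev


def z_transform(condition_rts):
    """Standardize one participant's condition RTs.

    Each value becomes its deviation from the participant's overall
    mean, in units of the SD of that participant's condition values.
    Assumption: sample SD (n - 1 denominator).
    """
    overall = mean(condition_rts)
    spread = stdev(condition_rts)
    return [(rt - overall) / spread for rt in condition_rts]


# Hypothetical control participant (ms) across four conditions.
control = [550.0, 600.0, 700.0, 820.0]

# Hypothetical slower participant whose RTs are a linear function of the
# control RTs: a purely "global" (over-additive) difference.
slowed = [-300.0 + 2.0 * rt for rt in control]

raw_differences = [s - c for s, c in zip(slowed, control)]
print(raw_differences)       # differences grow with condition difficulty
print(z_transform(control))
print(z_transform(slowed))   # same profile as the control participant
```

Because the transformation is invariant to any linear rescaling of a participant's RTs, a group difference that is entirely over-additive leaves no group by condition interaction in the z scores, which is the logic used to separate genuine from global effects.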

# RESULTS

## Analysis of Global Factors

**Figure 2** presents a Brinley plot to examine the prediction that "the condition means for a particular group as a function of the condition means for another group will be linear" (Faust et al., 1999). For children with dyslexia, performance on the word conditions was well fit by a single regression line (y = −4295.6 + 8.36x) that explained a large proportion of variance (R<sup>2</sup> = 0.97). Also, the conditions of the numerical tasks were well fit by a single regression line with a different slope (y = −653.7 + 2.19x; R<sup>2</sup> = 0.98).

We applied the same approach to children with dyscalculia; the resulting Brinley plot is presented in **Figure 3**. Also in this case, we applied a solution with two regression lines, one for reading (y = −208.9 + 1.41x; R<sup>2</sup> = 0.92) and one for numerical
(y = −621.65 + 2.04x; R<sup>2</sup> = 0.99) tasks, which effectively accounted for the experimental data. Inspection of the figure indicates that children with dyscalculia show a very limited spread of performance in the case of reading tasks (as expected). Furthermore, unlike in children with dyslexia, the slope for calculation tasks was steeper than that for reading tasks, although the difference was less marked than in the case of children with dyslexia.

The performance of the children with a mixed pattern (dyslexia and dyscalculia) was severely impaired for both reading (y = −3705.2 + 7.33x; R<sup>2</sup> = 0.98) and numerical (y = −1442.4 + 3.56x; R<sup>2</sup> = 0.99) tasks. The resulting Brinley plot is presented in **Figure 4**. Note that the performance of these children is considerably more impaired; thus, the figure has a much larger scale than the two previous graphs.
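To make the meaning of these slopes concrete, the fitted lines reported above can be evaluated at two control RTs. The regression coefficients below are copied from the text; the two control values (600 and 700 ms) are hypothetical and chosen only for illustration, as are the function and variable names.

```python
# (intercept, slope) of the Brinley lines reported in the text.
BRINLEY_LINES = {
    ("dyslexia", "reading"): (-4295.6, 8.36),
    ("dyslexia", "number"): (-653.7, 2.19),
    ("dyscalculia", "reading"): (-208.9, 1.41),
    ("dyscalculia", "number"): (-621.65, 2.04),
    ("mixed", "reading"): (-3705.2, 7.33),
    ("mixed", "number"): (-1442.4, 3.56),
}


def predicted_rt(group, task, control_rt):
    """RT (ms) predicted by the fitted line for a given control RT (ms)."""
    intercept, slope = BRINLEY_LINES[(group, task)]
    return intercept + slope * control_rt


def predicted_gap(group, task, control_rt):
    """Predicted difference (ms) between the group and the controls."""
    return predicted_rt(group, task, control_rt) - control_rt


# Hypothetical easy and hard reading conditions for controls (ms).
for control_rt in (600.0, 700.0):
    for group in ("dyslexia", "dyscalculia", "mixed"):
        gap = predicted_gap(group, "reading", control_rt)
        print(f"{group:12s} control={control_rt:.0f}  gap={gap:.1f}")
```

With a slope above 1, the predicted gap from controls widens as the control RT increases, which is the over-additive pattern the Brinley plots are meant to reveal; the widening is much larger for slopes such as 8.36 than for slopes such as 1.41.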

We then tested the DEM prediction (Myerson et al., 2003) that there should be a linear relationship between the group means and the standard deviations in the same conditions. Data for the reading and number tasks are presented separately in **Figure 5A** (reading tasks) and **Figure 5B** (number tasks).

A number of general characteristics emerge in these plots. For the number tasks, a single regression line accounts quite well for the performance of all four groups of children (R<sup>2</sup> = 0.91). The slope of the relationship is 0.25 and the intercept on the x-axis is 219.4 ms. As one intercept on the x-axis accounts well for the data of all sub-groups of children, based on the DEM this indicates that they are not different in the sensory-motor (non-decisional) compartment but only in the cognitive compartment. If separate regression lines are used for the four groups of children, slopes vary between 0.18 and 0.32 and determination coefficients vary between 0.82 and
0.97. As only one numerical sub-test required a manual response, we also re-examined the relationship between means and SDs excluding this sub-test. The results were very similar: the slope of the relationship was 0.25 and the intercept on the x-axis was 280.5 ms (R<sup>2</sup> = 0.93).

Also for reading tasks, a single regression line accounts well for the responses of all groups (R<sup>2</sup> = 0.87). In this case, the slope of the relationship is 0.63 and the intercept on the x-axis is 537.25 ms. Again, one intercept on the x-axis accounts well for the data of all sub-groups; this is in keeping with the idea that groups are not different in the sensory-motor (non-decisional) compartment but only in the cognitive compartment. If separate regression lines are used for the four groups of children, slopes vary between 0.47 and 1.02 and determination coefficients vary between 0.71 and 0.96.

Overall, our data indicate that the same relationship between mean performance and variability holds for all groups of children, as predicted by DEM. However, data relative to the orthographic and numerical tasks also indicate distinct linear relationships in terms of both slopes and intercepts on the x-axis. This pattern is consistent with the idea that reading and numerical tasks do not merely represent two separate domains; in fact, performances in these two sets of data point to the presence of two separate general relationships between means and SDs. Further comments on this point will be presented in the Section "Discussion."

#### ANOVAs

#### Reading

Two ANOVAs were carried out, one on raw RTs and one on z-transformed values, with length (4-, 5-, 6-, and 7-letter words) and frequency (high, low) as repeated measures and group (controls, children with dyslexia, children with dyscalculia and mixed group) as the between-subjects factor. The main effect of group was significant in the raw data analysis [F<sub>rt</sub>(3,79) = 34.14, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.56], but (due to the data transformation) not in the z-transformed analysis (F<sub>z</sub> < 1, n.s.; F<sub>rt</sub> refers to the raw data analysis and F<sub>z</sub> to the z-transformed analysis). On average, control children responded in 666.8 ms, which was significantly faster than the RTs of children with dyslexia (1282.9 ms) and of children in the mixed pattern group (1279.4 ms), who did not differ from each other. The RTs of children with dyscalculia (734.4 ms) were not significantly slower than those of controls but were faster than those of children with dyslexia and with a mixed pattern. The effect of word frequency [F<sub>rt</sub>(1,79) = 62.18, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.44; F<sub>z</sub>(1,79) = 20.70, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.63] was significant, indicating faster RTs for high- (732.3 ms) than low-frequency (893.6 ms) words (diff. = 161.3 ms). The main effect of length [F<sub>rt</sub>(3,237) = 96.39, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.55; F<sub>z</sub>(3,237) = 102.92, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.51] was significant, indicating slower RTs for longer words (with an average 82.9 ms increase per letter). The frequency by length interaction was significant [F<sub>rt</sub>(3,237) = 13.97, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.15; F<sub>z</sub>(3,237) = 3.75, p = 0.01, η<sub>p</sub><sup>2</sup> = 0.05], indicating larger length effects for low- than for high-frequency words. All interactions with the group factor (group by length, group by frequency and group by length by frequency) were significant (at least p < 0.01) in the raw data analysis; however, they all vanished in the z-score analysis (all Fs < 1.1, n.s.; all η<sub>p</sub><sup>2</sup> < 0.04), indicating that they could all be accounted for in terms of over-additivity.

#### Numerical Tasks

Two ANOVAs were carried out, one on raw RTs and one on z-transformed values, with condition (one-digit number reading, two-digit number reading, number comparison, one-number addition and one-number subtraction) as repeated measure and group (controls, children with dyslexia, children with dyscalculia and mixed group) as the between-subjects factor. The main effect of group was significant in the raw data analysis [F<sub>rt</sub>(3,79) = 31.03, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.54], but was inherently nil in the z-transformed analysis. On average, RTs of control children (931.3 ms) were significantly faster than RTs of children with dyscalculia (1276.2 ms) and of children in the mixed pattern group (2130.7 ms). The RTs of children in the mixed pattern group also differed from those of children with dyslexia and dyscalculia. The difference between controls and children with dyslexia (1384.8 ms) fell short of significance (p = 0.08). The effect of condition was significant [F<sub>rt</sub>(4,316) = 88.40, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.53; F<sub>z</sub>(1,91) = 634.51, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.89]: one- (575.0 ms) and two-digit (641.9 ms) number reading yielded the shortest RTs (not significantly different from each other), whereas one-number additions (1668.3 ms) and one-number subtractions (2259.7 ms) were slower (and significantly different from each other); RTs for number comparison (920.0 ms) were intermediate (and significantly different from the one-number addition and subtraction conditions). The group by condition interaction was significant in the raw data analysis [F<sub>rt</sub>(12,316) = 14.87, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.36], but vanished in the z-transformed analysis [F<sub>z</sub>(12,316) < 1, n.s.; η<sub>p</sub><sup>2</sup> = 0.04], indicating the presence of over-additivity.
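The partial eta squared values reported in these analyses are tied to the F ratios and their degrees of freedom by a standard identity, η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The short sketch below applies it to two of the raw-data effects reported above; the function name is hypothetical and the check is offered only as a reading aid, not as part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its dfs.

    Uses the identity SS_effect / (SS_effect + SS_error)
    = F * df_effect / (F * df_effect + df_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Group effect on raw RTs in the numerical tasks: F(3,79) = 31.03.
group_effect = partial_eta_squared(31.03, 3, 79)

# Group by condition interaction on raw RTs: F(12,316) = 14.87.
interaction = partial_eta_squared(14.87, 12, 316)

print(round(group_effect, 2), round(interaction, 2))
```

Both values round to the effect sizes given in the text (0.54 and 0.36).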

# DISCUSSION

The first aim of the present study was to establish whether performance on numerical tasks could be described in terms of a global factor. Results are clearly in favor of this hypothesis. For example, RTs of children with dyscalculia grew by a slope of 2.04 as a function of condition difficulty with respect to the performance of control children and this regression accounted for a very large proportion of variance (R <sup>2</sup> = 0.99). This indicates that differences in raw data between children with dyscalculia and controls increase as a function of condition difficulty over and above the specific characteristics of the experimental conditions, pointing to the presence of an over-additive effect for numerical tasks. Consistently, when analyses were carried out to remove the effect of over-additivity by using z-transformed data, conditions requiring additions, subtractions, number comparisons etc. yielded about the same group differences between dyscalculic children and controls. These results are consistent with the idea that performance on numerical tasks can actually be described in terms of a single global factor. Interestingly, a line of research has focused on the idea that performance on different numerical and calculation tasks can be seen in terms of a number module (Landerl et al., 2004; Butterworth, 2005). The present results are broadly in keeping with this idea. However, it might be necessary to examine a larger variety of numerical tasks before a definite conclusion can be reached on this point. In particular, only symbolic stimuli were used; extension of these results to non-symbolic stimuli is required before a clear statement on the number module hypothesis can be made.

Therefore, a large proportion of the variability across number tasks can be accounted for in terms of global components in the data. Models, such as the RAM and DEM, help to define the characteristics of these components. One intriguing question concerns whether clusters of tasks can be expressed in terms of different domains or in terms of different general rules or laws, governing the relationship between performance and inter-individual variability. An example of a distinction in terms of domains is provided by the studies on aging (Hale and Myerson, 1996; Lawrence et al., 1998) which indicate a greater impairment in visuo-spatial than in verbal tasks, although the general relationship underlying these tasks was the same. However, recent evidence indicates that reading tasks may actually point to the presence of a different general law in the data. In a re-analysis of a large set of previous experiments examining vocal RTs to reading isolated words (Zoccolotti et al., 2017) we noted that the regression between means and SDs had a much steeper slope and a larger x-intercept. Accordingly, reading (but not lexical decision) tasks map onto a different general relationship so that inter-individual variability grows at a particularly high rate also with moderate increases in condition difficulty.

In this study, we were able to test this hypothesis on non-retrospective data by comparing performance on reading and
numerical tasks. The results clearly support the idea that the general relationship between means and SDs is quite different for reading and numerical tasks. In particular, the slope was considerably higher in reading (0.63) than in numerical (0.25) tasks. Furthermore, a larger intercept on the x-axis was present for reading (537.25 ms) than for numerical tasks (280.54 ms). On the one hand, the pattern for the number tasks is quite similar to the parameters reported for several visuo-spatial and verbal tasks by Myerson et al. (2003). On the other hand, the pattern for the reading tasks closely replicates the results reported in our retrospective analysis, where the slope was 0.66 and the intercept on the x-axis was 482.6 ms. Therefore, the present data are in keeping with the idea that the differences between reading and numerical tasks do not merely point to the presence of two different domains but actually refer to the presence of two general laws underlying these two sets of data. In trying to understand the origin of this quite general distinction, it is interesting to note that, with one exception (i.e., the number comparison task) all tasks used in the present research required a vocal response. Thus, the present data exclude the possibility that the difference may lie in the nature of the response. Indeed, all tasks used in the studies by Myerson et al. (2003) envisaged a manual response; so, this possibility deserved some consideration. Alternatively, we have proposed that reading is different from all other tasks as it involves a situation in which a very large set of alternative responses is present; indeed, the observer has to name a word from a possible pool of thousands of alternatives. According to the DEM, the slope of the regression between the means and the SDs indicates the degree of correlation among the durations of the processing stages. 
To identify a target the observer has to closely couple the output of the orthographic analysis with the identification of the corresponding phonological code and we have proposed that it may be this requirement that drives the particularly steep relationship between performance and inter-individual variability in reading tasks (Zoccolotti et al., 2017).

We searched for associated and dissociated reading and number difficulties starting from a moderately large school sample. Evidence indicated that these two sets of deficits frequently co-occurred, as expected. Out of a total of 325 children, 12 showed reading deficits (i.e., 3.69%). This figure is very similar to Barbiero et al.'s (2012) recent Italian prevalence data; these authors reported a proportion of children with dyslexia between 3.1 and 3.2%. More than half of the children with a reading deficit (7, or 58.3%) also failed on numerical tasks. The proportion of children with deficits in numerical tasks was higher: 23 children, or 7.08%. As yet, no systematic epidemiological data are available in Italian for this deficit; however, it has been recently reported that prevalence estimates range around 6% (e.g., Wilson et al., 2015). Thus, the proportion of children with deficits in numerical tasks in the present study appears in line with the currently available data in the literature. Seven of the 23 children with numerical deficits (i.e., 30.4%) also showed reading deficits. Thus, the overlap between the two disturbances was high.

This is in keeping with data from the literature; for example, Wilson et al. (2015) estimated that relative overlap between the two disorders averages 37% across different studies. Thus, the present data broadly fit with the figures reported in the literature both in terms of their separate prevalence and in terms of the overlap between the two disorders. Interestingly, although children were in the normal range in intelligence measures there was a tendency for children in the mixed group and in the dyslexic group to score lower than children with dyscalculia or controls in the Raven test. Recent evidence indicates differences in intelligence measures among groups with different learning deficits (Cornoldi et al., 2014; Giofrè et al., 2017; Toffalini et al., 2017). It may be interesting to verify the stability of these differences by using larger groups of children.
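The prevalence and overlap percentages quoted in this part of the Discussion follow directly from the group counts given in the Methods. A small sketch of the arithmetic is shown below; variable and function names are hypothetical.

```python
TOTAL_SCREENED = 325

# Group sizes reported in the Methods.
reading_only = 5    # children considered "dyslexic"
number_only = 16    # children considered "dyscalculic"
mixed = 7           # children failing in both domains

reading_deficit = reading_only + mixed   # all children failing in reading
number_deficit = number_only + mixed     # all children failing in number tasks


def percent(part, whole, digits=2):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


print(reading_deficit, percent(reading_deficit, TOTAL_SCREENED))
print(number_deficit, percent(number_deficit, TOTAL_SCREENED))
print(percent(mixed, reading_deficit, 1))   # co-morbid share of reading deficits
print(percent(mixed, number_deficit, 1))    # co-morbid share of number deficits
```

Note that the overlap looks different depending on the denominator: the same seven co-morbid children are more than half of those with a reading deficit but under a third of those with a numerical deficit.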

From these general figures, we identified two groups of children with putatively isolated reading and numerical deficits and a group with co-morbid symptoms. Empirically, data from both the standard cognitive tests and from the experimental tasks indicated that the dissociation between reading and numerical skills was incomplete in the dyslexic as well as in the dyscalculic group; in other terms, children with dyslexia were not entirely spared in number tasks and vice versa. Different factors may contribute to this outcome. First, dyslexia and dyscalculia are currently considered graded difficulties and cut-offs always maintain a certain degree of arbitrariness (e.g., Pennington and Bishop, 2009). Thus, incomplete dissociation of symptoms may actually be "real" in the sense that co-morbidity in itself may be seen as a graded phenomenon rather than as a categorical one. Alternatively, it is possible that the tests used were only partially sensitive to the underlying dimensions and that some true co-morbid cases were not detected because of the insensitivity of the measures used.

At any rate, the focus of the present research was on the possibility of expressing dissociations and associations between reading and numerical conditions in terms of global components. A partial dissociation between reading and numerical tasks was clear in children with dyslexia in terms of very different slopes in the Brinley plot between the two sets of tasks (8.36 for the reading and 2.19 for the number tasks, respectively). In children with dyscalculia, the pattern was reversed but the difference between the two slopes was much smaller (1.41 for the reading and 2.04 for the number tasks, respectively). In this case (see **Figure 3**), what seems to discriminate best between the two sets of tasks is the presence of a very small range of variability across reading conditions. As discussed above, at a global level of analysis, reading is characterized by a very tight relationship between difficulty level and inter-individual variability. In these terms, seemingly small increases in condition difficulty will generate comparatively large group condition differences between affected and unaffected individuals. Comparing the reading means of **Figures 2** and **3** makes it clear that, unlike what happens in children with dyslexia, in those with dyscalculia this "explosion" does not take place and differences among reading conditions amount to tens (rather than hundreds) of
milliseconds, i.e., in a range very similar to that of controls. Results of children in the mixed pattern group indicate that the pattern of deficiencies in reading and numerical tasks is well accounted for by two regression lines with large slopes (7.33 for the reading and 3.56 for the number tasks, respectively). Overall, it appears that reference to global components provides a good description of both isolated and co-morbid deficiencies. Notably, however, the two sets of data do not yield specular profiles. This is probably due to the fact that different general laws underlie the two sets of tasks. In particular, the presence of a very tight relationship between condition means and inter-individual variability (which is characteristic of reading) is expressed very clearly producing a large spread of performances across different conditions in the affected children (those with dyslexia and with a mixed pattern) but not in the unaffected ones (i.e., children with dyscalculia).

Although research on co-morbidity has increased considerably in recent years, partly due to the seminal work of Pennington (e.g., Pennington, 2006; Pennington and Bishop, 2009), full comprehension of the cognitive underpinnings of the partial overlap of learning and other developmental disturbances is still lacking. Only a few studies have directly attempted to mark the different and common deficits present in children with dyslexia and dyscalculia. Deficits in phonological skills or in a number module mark the performance of children with dyslexia and dyscalculia, respectively (Landerl et al., 2009; Wilson et al., 2015); in fact, the deficits appear in additive fashion in co-morbid cases. However, this only partially explains the phenomenon because in multiple deficit models (e.g., Pennington, 2006) comorbidity between two conditions is also due to shared etiologic and cognitive risk factors. Although some steps have been taken in the search for the cognitive factors that separately mark the two disorders, identifying the common risk factor seems more difficult. For example, domain general deficits in rapid naming and in verbal short-term memory do not seem to account for the co-morbidity (Wilson et al., 2015). More recently, Slot et al. (2016) have reported that, apart from predicting directly the reading deficit, phonological awareness may represent a risk factor for the co-presence of reading and numerical task deficiencies.

In the present study, we aimed to establish whether performance in reading and numerical tasks can be effectively expressed in terms of global factors. We did not examine whether these factors should be viewed as entirely separate or whether they share some common cognitive elements. In fact, the aims of the present study were mainly descriptive and the present data are not informative on the source of the co-morbidity between dyslexia and dyscalculia. However, we feel that the possibility of identifying the global factors that account for performance in reading and numerical tasks might provide an interesting opportunity for studying the critical factors in determining the two deficits as well as their co-morbidity. In particular, taking into account global components in the data allows controlling for the presence of over-additivity and this could be instrumental in searching for the specific factors underlying a deficit (or the communality between two deficits). In the particular case of dyscalculia, one hypothesis sees the disturbance as due to impaired processing in a number module (Landerl et al., 2004; Butterworth, 2005). Within this perspective, one should find similar deficits for symbolic and non-symbolic stimuli. Testing this hypothesis with reference to global factors can be instrumental as this approach is particularly suited to compare group differences across tasks which may vary considerably for general level of difficulty. Furthermore, other authors have posited that other cognitive factors, such as short-term visuo-spatial memory and inhibition may be critical in explaining the deficient performance in number tasks (Szucs et al., 2013). For example, placing manipulations of these factors within a global component analysis may help understanding whether they add only in terms of task difficulty (which would point to over-additivity) or would produce different selective deficits (which might point to different domains for tasks with high or low visuospatial memory components or high and low inhibition components).

One advantage of describing deficits in children with dyslexia and dyscalculia in terms of global models of performance is that it allows determining whether group differences can be ascribed to the decisional or also to the non-decisional part of the response. In particular, the DEM allows isolating a cognitive compartment (marked by the slope of the relationship between condition means and SDs) and a sensory-motor compartment (marked by the intercept of this relationship on the x-axis). Previous research has demonstrated that children with dyslexia have a deficit in the decisional compartment of the response but not in sensory-motor compartment (Martelli et al., 2014; for a similar conclusion reached on the basis of the diffusion model see Zeguers et al., 2011). The present results extend this conclusion to numerical tasks. Compared to controls, the children with dyscalculia and those in the mixed pattern group were severely affected on numerical tasks, but (as shown in **Figure 5**) the same relationship between means and SDs held for these groups as well as for the control readers and the children with dyslexia. Therefore, it can be concluded that performance differences among these groups on numerical tasks are actually confined to the decisional component of the response.

Several limitations of the present study must be pointed out. The original sample was moderately large, but some of the target sub-groups were smaller than optimal. Thus, it is possible that the identification of a small group of children with dyslexia may have made it difficult to detect the contribution of specific factors in modulating the deficit (such as the effect of length, which was found in some previous research; e.g., De Luca et al., 2010). At the same time, it should be noted that global components mark quite stable tendencies in the data and, thus, it seems unlikely that part of the present results are unstable. Furthermore, our focus here was on numerical deficits and the sample of children with dyscalculia was large enough for testing our hypotheses. Nevertheless, confirmation of the present findings in larger subgroups of children is certainly in order. Furthermore, while we tried to have a reasonably comprehensive analysis of reading and number skills, it proved difficult to also examine other comorbidities, such as ADHD, which is well-known to co-occur with learning disorders (Pennington, 2006). ADHD may alter the distribution of RTs (e.g., Hervey et al., 2006). So, it remains a goal for future studies to evaluate whether the co-presence of ADHD symptoms may influence the pattern of findings reported here.

# CONCLUSION

The present study indicates that the approach of describing the deficit in numerical tasks shown by children with dyscalculia in terms of a global factor is effective, as previously shown in the case of reading deficits of children with dyslexia. As both reading and calculation performance can be effectively expressed in terms of global factors, this perspective may open interesting possibilities for the study of the frequent, though partial, co-occurrence of these disturbances.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# FUNDING

This research was supported by Sapienza Università di Roma.

#### REFERENCES

with high grapheme-phoneme correspondence. Appl. Psycholing. 20, 191–216. doi: 10.1017/S0142716499002027

Zoccolotti, P., De Luca, M., Judica, A., and Spinelli, D. (2008). Isolating global and specific factors in developmental dyslexia: a study based on the rate and amount model (RAM). Exp. Brain Res. 186, 551–560. doi: 10.1007/s00221-007-1257-9

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Di Filippo and Zoccolotti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neural Mechanisms Underlying Time Perception and Reward Anticipation

Nihal Apaydın<sup>1,2</sup>, Sertaç Üstün<sup>3</sup>, Emre H. Kale<sup>2</sup>, İpek Çelikağ<sup>2</sup>, Halise D. Özgüven<sup>2,4</sup>, Bora Baskak<sup>2,4</sup> and Metehan Çiçek<sup>3,4</sup>\*

<sup>1</sup> Department of Anatomy, School of Medicine, Ankara University, Ankara, Turkey, <sup>2</sup> Brain Research Center, Ankara University, Ankara, Turkey, <sup>3</sup> Department of Physiology, School of Medicine, Ankara University, Ankara, Turkey, <sup>4</sup> Department of Psychiatry, School of Medicine, Ankara University, Ankara, Turkey

Findings suggest that the physiological mechanisms involved in reward anticipation and time perception partially overlap, but a systematic neuroimaging investigation of a potential interaction between the time and reward systems is lacking. Eighteen healthy volunteers (all right-handed) participated in an event-related functional magnetic resonance imaging (fMRI) experiment that employed a visual paradigm with a monetary reward to assess whether the functional neural representations of time perception and reward prospection are shared or distinct. Subjects performed a time perception task in which observers had to extrapolate the velocity of an occluded moving object in "reward" vs. "no-reward" sessions during fMRI scanning. There were also "control condition" trials in which participants judged the change in the color tone of the stimuli. Time perception showed fronto-parietal (more extensive in the right hemisphere), cingulate and peristriate cortical as well as cerebellar activity. On the other hand, reward anticipation activated the anterior insular cortex, nucleus accumbens, caudate nucleus, thalamus, cerebellum, postcentral gyrus, and peristriate cortex. The interaction between time perception and reward prospect showed dorsolateral, orbitofrontal and medial prefrontal as well as caudate nucleus activity. Our findings suggest that a prefrontal-striatal circuit might integrate the reward and timing systems of the brain.

#### Edited by:

Mikhail Lebedev, Duke University, United States

#### Reviewed by:

Aviv M. Weinstein, Ariel University, Israel

Roee Admon, University of Haifa, Israel

#### \*Correspondence:

Metehan Çiçek mcicek@ankara.edu.tr

Received: 22 December 2017 Accepted: 09 March 2018 Published: 21 March 2018

#### Citation:

Apaydın N, Üstün S, Kale EH, Çelikağ İ, Özgüven HD, Baskak B and Çiçek M (2018) Neural Mechanisms Underlying Time Perception and Reward Anticipation. Front. Hum. Neurosci. 12:115. doi: 10.3389/fnhum.2018.00115

Keywords: time perception, reward anticipation, fMRI, dopaminergic pathways

#### INTRODUCTION

Perception of time, and thus coordination of temporal sequences of events in our internal and external environment, is vital for adapting to the world around us. Although time is a fundamental dimension of life, the neural mechanisms underlying time perception are still unclear (Ivry and Spencer, 2004a,b; Lewis and Walsh, 2005; Burr and Morrone, 2006). Is temporal processing in the brain dependent on a specialized system, is it represented by specialized neural networks, or is it regionally perceived depending on the task? This is one of the most fundamental questions that has not yet been properly answered in the effort to elucidate how the human brain perceives time. The metrical representation of time has generally been explored under two categories: duration estimation (explicit timing) and temporal expectation (implicit timing). The basal ganglia, supplementary motor area (SMA), cerebellum, and prefrontal cortex (Coull and Nobre, 2008; Coull et al., 2011) as well as the inferior parietal and insular cortex (Ferrandez et al., 2003; Pouthas et al., 2005; Üstün et al., 2017) have been suggested to be responsible for explicit timing.

Among the brain regions suggested to have a functionally discrete role in time perception, the dorsal striatum of the basal ganglia and, more specifically, its ascending nigrostriatal dopaminergic pathway have been suggested to be the most crucial, as shown by converging functional neuroimaging, neuropsychological, and psychopharmacological investigations in humans, as well as lesion and pharmacological studies in animals (Coull et al., 2011). Studies in monkeys showed that dopaminergic neurons fire depending on the occurrence and the timing of rewards (Hollerman and Schultz, 1998; Waelti et al., 2001). In a study of experimental striatal lesions in rats, lesioned rats pressed the pedal to receive a reward later than normal rats (Matell et al., 2003). In addition, dopaminergic agents have systematically impaired the performance of timing responses. Administration of dopamine agonists leads to a reduction in perceived time, while antagonists lead to prolongation (Ivry and Spencer, 2004b).

Reward and punishment are also known to be effective upon our future decisions and indeed reward has been shown to promote human performance in multiple task domains. It was suggested that the key brain regions for reward system are the nucleus accumbens (NAc) and the ventral tegmental area (VTA) (Rolls, 2000; Schultz, 2000; Schultz et al., 2000; Kelley and Berridge, 2002; Wise, 2002; Stefani and Moghaddam, 2006; Hikosaka et al., 2008). Recent studies have shown that the striatal and midbrain areas including the entire ventral striatum (VS) and the dopamine neurons of the substantia nigra (SN), are also involved in the reward circuit (Haber and Knutson, 2010). The VS was shown to connect to orbitofrontal cortex and anterior cingulate cortex and receive dopaminergic input from the midbrain. The studies suggest that a circuit structure including VS, VTA/SN, ventral pallidum, prefrontal cortex and thalamus is an integral part of the cortico-basal ganglia system in reward circuit (Haber and Knutson, 2010). A meta-analysis of 142 neuroimaging studies of reward valence processing confirmed that different brain regions are broadly involved in different stages of reward processing such as the orbitofrontal cortex, anterior cingulate, a sub-region of the ventral striatum and the nucleus accumbens (Liu et al., 2011).

Functional MRI studies in humans have shown that different brain regions are engaged in reward expectation and reward processing. A common finding is that the ventral striatum, including the nucleus accumbens, was preferentially activated during expectation of the reward, whereas the ventromedial prefrontal cortex was preferentially activated during reward outcome (Knutson et al., 2001; O'Doherty et al., 2002). Another fMRI study focusing on the outcome of reward-related decision-making showed differential responses to reward and punishment in the dorsal and ventral striatum. Components of the dorsal striatum showed an increase in activation that was more sustained for trials associated with a rewarding outcome than with a punishing outcome (Delgado et al., 2000).

While the above findings provide evidence for the view that reward prospect and time perception may act by utilizing partially overlapping processing routes, a systematic investigation of this proposed overlap, as well as the potential interaction between these two factors is lacking. With the present functional magnetic resonance imaging (fMRI) study, we sought to elucidate the neural processes that are shared and distinct between brain regions responsible for time perception and reward prospection.

# MATERIALS AND METHODS

## Participants

Participants were 18 young adults (aged 18–45, mean = 25.8 ± 5.8 years, 11 female) with at least 8 years of schooling. None reported a history of drug abuse, neurological or psychiatric disease, or head injury. All were right-handed according to the Chapman and Chapman Handedness Inventory (1987) (its validity and reliability for use in the Turkish population was reported by Nalçaci et al., 2002) and had normal or corrected-to-normal visual acuity. The methods and procedures used in the study were approved by the Ankara University Institutional Review Board (AUIRB), and written informed consent was obtained from all participants in accordance with the protocols approved by the AUIRB. All participants were volunteers and were informed prior to the study that they would receive a payment based on their performance.

# Experimental Paradigm

To examine our hypothesis, we employed a temporal attention task in which observers had to extrapolate the velocity of an occluded moving object. The paradigm was designed using the Presentation® program (Version 18.0, Neurobehavioral Systems, Inc., Berkeley, CA, United States). While undergoing fMRI, the participants performed a task comprising two conditions (control and time perception), presented in rewarded and unrewarded sessions.

On each trial, there was a black vertical bar in the middle of the screen with a gray background, which was constantly displayed during the trial. There was a unique cue associated with each condition and it was displayed in the center of the bar. For the time perception condition an arrow image was displayed as a cue and for the control condition a checkerboard was the cue.

After the presentation of the cue, the target, a moving gray rectangle, appeared from the left side of the screen and moved horizontally until it disappeared at the right side of the screen. When the rectangle reached the bar, the part of it that was under the bar became "invisible" to the participant in order to induce the perception that the rectangle was passing under the bar. The speed of the rectangle was slightly increased or decreased while it was occluded by the bar, but the rectangle resumed its initial speed when it reappeared on the right side of the bar. The color tone of the rectangle became lighter or darker than its initial level after the occlusion period.

In the time perception condition, participants were asked to judge whether the rectangle reappeared earlier or later after the occlusion than predicted from its initial velocity. They were required to press the right button of a response pad if the speed of the rectangle had increased while passing under the bar, and to press the left button if the speed of the rectangle had decreased. In the control condition, participants were asked to attend to the contrast change of the rectangle when it reappeared after the occlusion. They were asked to evaluate whether the color tone of the rectangle was darker or lighter compared to its initial tone. They were required to press the right button if the color tone had decreased and the left button if it had increased. The color tone and the velocity of the rectangle changed in both conditions. However, participants were asked to attend to either velocity or color contrast according to the cue presented in the middle of the bar. The participants were instructed to press the buttons as quickly as possible once they had decided on an answer, but were also reminded that answering correctly was important in both rewarded and unrewarded sessions. A fixation point was presented on a gray screen during the inter-stimulus intervals, which were 2, 4, and 6 s, arranged in a pseudo-randomized and logarithmic manner favoring shorter durations. The sessions were also presented in a pseudo-randomized order. Reaction times (RT) were collected and the percentage of correct answers was calculated. In rewarded sessions, the participants gained money depending on their percentage of correct response (PCR) scores (100 Turkish Lira for a 100% correct score).

The tasks were presented on a 28 cm × 37.5 cm screen at a distance of 72.5 cm from the participant's eyes. The monitor resolution was 1920 × 1080 pixels and the refresh rate was 60 Hz. The size of the black bar was 28 cm × 9.47 cm (21.86° × 7.47°) and the size of the moving rectangle was 1.46 cm × 1.94 cm (1.16° × 1.54°). The rectangle passed across the screen horizontally from left to right at one of two possible speeds while it was visible to the participant. The speeds of the rectangle were 435°/1817 ms and 435°/917 ms. If the initial speed of the rectangle was 3.70°/s, it either increased to 7.11°/s or decreased to 1.82°/s while the rectangle passed under the bar, and it returned to its initial speed when the rectangle reappeared on the right side of the screen. If the initial speed of the rectangle was 7.32°/s, it either increased to 10.48°/s or decreased to 3.62°/s while the rectangle passed under the bar. Contrasts of the rectangle were calculated using the Weber contrast equation, (pixel intensity of rectangle − pixel intensity of background)/pixel intensity of background. The initial contrasts of the rectangle were −0.64, −0.52, and −0.37. If the initial contrast of the rectangle was −0.64, it either increased to −0.76 or decreased to −0.52 when the rectangle reappeared on the right side of the screen. If the initial contrast was −0.52, it either increased to −0.64 or decreased to −0.37, and if the initial contrast was −0.37, it either increased to −0.52 or decreased to −0.21 when the rectangle reappeared on the right side of the screen. One trial lasted 2500 ms in total.
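As an illustration only, the conversion from stimulus size to visual angle and the Weber contrast definition given above can be sketched as follows. The function names and the pixel intensities are hypothetical (pixel intensities are not reported in the text); the viewing distance and the bar dimensions are taken from the paragraph above.

```python
import math

VIEWING_DISTANCE_CM = 72.5  # eye-to-screen distance reported in the text


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Full visual angle (in degrees) subtended by an object of a given size."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


def weber_contrast(stimulus_intensity: float, background_intensity: float) -> float:
    """Weber contrast: (I_stimulus - I_background) / I_background."""
    return (stimulus_intensity - background_intensity) / background_intensity


# Bar dimensions from the text (28 cm x 9.47 cm)
bar_long_deg = visual_angle_deg(28.0, VIEWING_DISTANCE_CM)
bar_short_deg = visual_angle_deg(9.47, VIEWING_DISTANCE_CM)

# Hypothetical pixel intensities chosen to give a contrast of about -0.64
background_intensity = 128.0
rectangle_intensity = 46.08
contrast = weber_contrast(rectangle_intensity, background_intensity)

print(f"bar: {bar_long_deg:.2f} x {bar_short_deg:.2f} deg, contrast: {contrast:.2f}")
```

A negative Weber contrast, as for all contrast values reported here, simply means that the rectangle is darker than the gray background.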

An event-related fMRI design was used. Before fMRI acquisition, all subjects performed a training session and received feedback on whether they had answered each trial correctly. All participants completed the training session with more than 60% accuracy. Inside the scanner, the participants performed four 7-min sessions of 32 trials each (16 trials for each condition). Two sessions were rewarded and two sessions were unrewarded (**Figure 1**).
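The session structure described above (four sessions, two rewarded and two unrewarded, each with 16 time perception and 16 control trials and inter-stimulus intervals of 2, 4, or 6 s) could be generated along the following lines. This is a minimal sketch, not the Presentation® script used in the study: the interval weights, the plain shuffling used for pseudo-randomization, and all names are assumptions.

```python
import random

CONDITIONS = ("time", "control")
ISI_SECONDS = (2, 4, 6)
ISI_WEIGHTS = (4, 2, 1)  # assumed weights that favor shorter intervals


def make_session(rewarded, rng, trials_per_condition=16):
    """Build one session as a shuffled list of trial dictionaries."""
    trials = [c for c in CONDITIONS for _ in range(trials_per_condition)]
    rng.shuffle(trials)
    isis = rng.choices(ISI_SECONDS, weights=ISI_WEIGHTS, k=len(trials))
    return [
        {"condition": c, "rewarded": rewarded, "isi_s": isi}
        for c, isi in zip(trials, isis)
    ]


def make_experiment(seed=0):
    """Return four sessions (two rewarded, two unrewarded) in shuffled order."""
    rng = random.Random(seed)
    session_types = [True, True, False, False]
    rng.shuffle(session_types)
    return [make_session(rewarded, rng) for rewarded in session_types]


experiment = make_experiment(seed=1)
for i, session in enumerate(experiment, start=1):
    print(i, session[0]["rewarded"], len(session))
```

A fixed seed makes the schedule reproducible across runs; a real implementation would likely add constraints on the order (for example, limiting runs of the same condition), which are not specified in the text.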

## fMRI Image Acquisition

fMRI images were acquired using a 3-T Siemens Magnetom Trio MRI system with a 32-channel head-coil array. For each participant, a series of high-resolution T1-weighted anatomical images was obtained [repetition time (TR): 2600 ms, echo time (TE): 3.02 ms, field of view (FOV): 256 mm, slice thickness: 1.00 mm]. Functional scans were acquired using 46 3-mm slices with a 0-mm gap (TR: 2600 ms, TE: 28 ms, matrix: 64 × 64, FOV: 192 mm, voxel size: 3 mm × 3 mm × 3 mm).

To display the visual stimuli, Presentation® was run on a PC, which was also used for collecting the participants' responses. The visual paradigm was projected onto a projection screen that was visible via a mirror situated in front of the participant's head. Participants used a response pad with their right hand to perform the tasks while undergoing the fMRI scan.

### fMRI Processing and Data Analysis

Analysis of the fMRI data was performed using SPM8 software (Wellcome Department of Cognitive Neurology, London, United Kingdom) run via MATLAB. The functional images were realigned to correct for movement artifacts. High-resolution anatomical T1 images were coregistered with the realigned functional images to enable anatomical localization of the activations. The structural and functional images were spatially normalized into a standardized anatomical framework using the default EPI template in SPM8, based on the Montreal Neurological Institute (MNI) averaged brain and approximating the normalized probabilistic spatial reference frame of Talairach and Tournoux (1988). Model estimation included a high-pass filter (256 s). Smoothing was performed with a 6-mm full-width half-maximum Gaussian kernel. The GLM design matrix included four task-related regressors (rewarded time, rewarded control, unrewarded time, unrewarded control). The six head-movement regressors derived from the realignment stage of preprocessing were also included as covariates of no interest.
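To make the structure of such a design matrix concrete, the sketch below assembles four task regressors, six motion regressors and a constant term in plain Python. It is not the SPM8 implementation: the double-gamma response function, the rounding of onsets to the nearest scan, the single concatenated run, the onsets and the zero motion parameters are all simplifying assumptions, and the high-pass filter is omitted.

```python
import math

TR_S = 2.6  # repetition time in seconds, from the acquisition parameters
REGRESSOR_NAMES = ("rewarded_time", "rewarded_control",
                   "unrewarded_time", "unrewarded_control")


def double_gamma_hrf(t):
    """Assumed double-gamma response (peak near 5 s, late undershoot)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def task_regressor(onsets_s, n_scans, tr_s=TR_S, hrf_length_s=32.0):
    """Convolve stick functions at the trial onsets with the assumed HRF."""
    kernel = [double_gamma_hrf(k * tr_s) for k in range(int(hrf_length_s / tr_s) + 1)]
    regressor = [0.0] * n_scans
    for onset in onsets_s:
        start = int(round(onset / tr_s))  # onset rounded to the nearest scan
        for k, h in enumerate(kernel):
            if 0 <= start + k < n_scans:
                regressor[start + k] += h
    return regressor


def build_design_matrix(onsets_by_condition, motion_params, n_scans):
    """Return rows of [4 task regressors, 6 motion regressors, constant]."""
    columns = [task_regressor(onsets_by_condition[name], n_scans)
               for name in REGRESSOR_NAMES]
    columns += [list(param) for param in motion_params]
    columns.append([1.0] * n_scans)
    return [list(row) for row in zip(*columns)]


# Hypothetical onsets (seconds) and zero motion, purely for illustration
n_scans = 40
onsets = {"rewarded_time": [0.0, 26.0], "rewarded_control": [13.0],
          "unrewarded_time": [52.0], "unrewarded_control": [65.0]}
motion = [[0.0] * n_scans for _ in range(6)]
X = build_design_matrix(onsets, motion, n_scans)
print(len(X), "scans x", len(X[0]), "columns")
```

Each row of `X` corresponds to one scan; the four task columns are the ones whose parameter estimates would enter the group-level analysis.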

The neuroimaging data were statistically analyzed by a 2 (time/no-time) × 2 (reward/no-reward) repeated measures analysis of variance (ANOVA) using the SPM12 flexible factorial design feature at the group level. The trials with incorrect subject responses were not included in the analysis. For the brain areas showing a significant interaction, the averaged activity in the peak regions was extracted from individual subject images solely to reveal the direction of the effect. To this end, mean percent signal change values were calculated for spheres centered at the peaks of activation clusters with diameters of 5 mm (for the caudate nucleus only) or 10 mm (please see **Table 1** for coordinates).
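The sphere-based extraction can be illustrated with a short sketch that lists the voxels falling inside a sphere of a given diameter around a peak and converts a mean signal into percent signal change. The 2-mm isotropic voxel size of the normalized images and the baseline-relative definition of percent signal change are assumptions, as neither is stated in the text.

```python
import math


def sphere_voxel_offsets(diameter_mm, voxel_size_mm):
    """Voxel offsets (dx, dy, dz) whose centers lie within the sphere."""
    radius = diameter_mm / 2.0
    reach = int(radius // voxel_size_mm)
    offsets = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            for dz in range(-reach, reach + 1):
                distance = voxel_size_mm * math.sqrt(dx * dx + dy * dy + dz * dz)
                if distance <= radius:
                    offsets.append((dx, dy, dz))
    return offsets


def percent_signal_change(condition_mean, baseline_mean):
    """Assumed definition: change relative to a baseline mean, in percent."""
    return 100.0 * (condition_mean - baseline_mean) / baseline_mean


# Assumed 2-mm isotropic voxels in the normalized images
large_sphere = sphere_voxel_offsets(10.0, 2.0)
small_sphere = sphere_voxel_offsets(5.0, 2.0)
print(len(large_sphere), len(small_sphere))
```

With these assumptions the smaller sphere covers only the peak voxel and its six face neighbors, which is consistent with restricting the smaller diameter to a compact structure such as the caudate nucleus.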

# RESULTS

#### Behavioral Results

The PCR and RT of all participants were evaluated by repeated measures ANOVA using R software. The Bonferroni correction was applied for multiple comparisons. Separate two by two ANOVAs were applied to PCR and RT results with task (time perception/control) and reward (reward/no-reward) as factors (see **Figure 1** for behavioral results).

#### TABLE 1 | Significant activations revealed by the 2 × 2 ANOVA (p < 0.05, corrected).

For PCR, the main effect of Task was not significant [F(1,17) = 1.37; p = 0.257; η<sup>2</sup><sub>G</sub> = 0.009]. The main effect of Reward was also not significant [F(1,17) = 0.91; p = 0.353; η<sup>2</sup><sub>G</sub> = 0.021]. These findings suggest that there is no difference in difficulty between the task and control conditions. The interaction between the task and the reward condition was not significant [F(1,17) = 1.88; p = 0.188; η<sup>2</sup><sub>G</sub> = 0.004]. Although the accuracy difference was not significant, accuracy was numerically higher in the rewarded sessions. PCR was 79.7% in the unrewarded control condition and 80.6% in the unrewarded time perception condition, whereas it was 81.9% in the rewarded control condition and 86.1% in the rewarded time perception condition.

The repeated measures ANOVA for RT showed a significant main effect of Task [F(1,17) = 17.70; p < 0.001; η<sup>2</sup><sub>G</sub> = 0.033]. There was also a significant main effect of Reward [F(1,17) = 19.98; p < 0.0001; η<sup>2</sup><sub>G</sub> = 0.251]. These findings showed that subjects responded more slowly during the control condition than during the time perception condition, and that they responded faster during rewarded sessions than during unrewarded sessions.

#### Functional Imaging Results

#### The Main Effect of Time Perception

The group results showed that while participants were performing the time perception task, the inferior parietal lobe, dorsolateral/ventrolateral prefrontal cortex, intraparietal sulcus, peristriate cortex, ACC/SMA and cerebellum were significantly activated (**Table 1** and **Figure 2A**). Activity showed a right hemisphere lateralization: prefrontal and parietal cortex activity was more extensive in the right hemisphere. Overall, a distributed neural network was significantly activated during the timing task compared to the control condition.

#### The Main Effect of Reward

During the reward conditions, the peristriate cortex, precuneus and anterior insular cortex (AIC) were significantly activated. In addition to these cortical activations, the posterior and anterior parts of the cerebellum, NAc, thalamus and caudate nucleus (CN) were significantly activated (**Table 1** and **Figure 2B**).

#### Interaction Between Time Perception and Reward

In this study, significant activations were found for the interaction between time perception and reward prospect. Activations were seen in bilateral dorsolateral prefrontal cortex, orbitofrontal cortex, medial prefrontal cortex and CN (**Table 1** and **Figure 3**).

ANOVA was performed on the percent signal change values obtained from spheres at the peaks of the significant interaction activation clusters. The results showed a significant time × reward interaction effect for the right dorsolateral prefrontal cortex [F(1,17) = 11.23, p < 0.01], left dorsolateral prefrontal cortex [F(1,17) = 8.64, p < 0.01], orbitofrontal cortex [F(1,17) = 6.62, p < 0.05], medial prefrontal cortex [F(1,17) = 6.29, p < 0.05] and CN [F(1,17) = 12.15, p < 0.01] (**Figure 3**). The main effect of time was significant for the left dorsolateral prefrontal cortex [F(1,17) = 5.04, p < 0.05] and right dorsolateral prefrontal cortex [F(1,17) = 16.11, p < 0.001]. The main effect of reward was significant for the CN [F(1,17) = 13.67, p < 0.01].
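For readers less familiar with this design, it may help to note that in a fully within-subject 2 × 2 ANOVA each effect has one degree of freedom, so its F(1, n − 1) value equals the square of a one-sample t statistic computed on per-subject contrast scores. The sketch below illustrates this with hypothetical percent signal change values for four subjects; it is not the analysis code used in the study, and the function names are illustrative.

```python
import math
from statistics import mean, stdev


def contrast_f(scores):
    """F(1, n-1) of a 1-df within-subject contrast (squared one-sample t)."""
    n = len(scores)
    t = mean(scores) / (stdev(scores) / math.sqrt(n))
    return t * t


def rm_anova_2x2(data):
    """data: per-subject tuples (time_rew, time_unrew, ctrl_rew, ctrl_unrew)."""
    task = [(a + b) - (c + d) for a, b, c, d in data]
    reward = [(a + c) - (b + d) for a, b, c, d in data]
    interaction = [(a - b) - (c - d) for a, b, c, d in data]
    return {
        "task": contrast_f(task),
        "reward": contrast_f(reward),
        "interaction": contrast_f(interaction),
    }


# Hypothetical values for four subjects, for illustration only
example = [(1.0, 0.0, 0.0, 0.0), (2.0, 0.0, 0.0, 0.0),
           (3.0, 0.0, 0.0, 0.0), (2.0, 0.0, 0.0, 0.0)]
f_values = rm_anova_2x2(example)
print(f_values)
```

With 18 participants, as here, each such F value has (1, 17) degrees of freedom.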

Follow-up t-tests on the percent signal change values obtained from the ROIs showed that there was no difference in the time perception condition between the rewarded and unrewarded sessions (p > 0.05). On the other hand, the control condition showed more activity during the rewarded sessions than during the unrewarded ones (p < 0.05). These results suggest that the significant ANOVA effects were driven by the reward-related difference in brain activity observed for the color contrast perception task, not for the time perception task.

# DISCUSSION

We assessed common and distinct neural processes of time perception and reward prospection in healthy subjects by way of a visual paradigm that includes a monetary reward. Time perception showed fronto-parietal (more extensive in the right hemisphere), ACC/SMA and peristriate cortical as well as cerebellar activity. On the other hand, reward anticipation activated the AIC, NAc, CN, thalamus, cerebellum, postcentral gyrus and peristriate cortex. The findings suggest that mainly a prefrontal subcortical circuit (prefrontal-caudate) may be responsible for the integration of time perception and reward prospect functions.

# Time Perception

The present results showed that time perception activated bilateral prefrontal, IPL and IPS regions, but more extensively in the right hemisphere. These findings largely replicate recently reported results. Üstün et al. (2017) used a very similar foreperiod paradigm to investigate the relationship between time perception and working memory. They showed right-lateralized fronto-parietal activations related to timing. In fact, their time perception specific activation obtained by the time > memory contrast (Talairach coordinates: 52, −37, 40) was in approximately the same location as the time perception effect in the parietal lobe in the present study (MNI coordinates: 58, −42, 46). Prefrontal activity was also in close proximity in both studies. These findings support the view that while the prefrontal cortex mediates the working memory aspect of timing, the IPL engages in the attentional mechanisms of time perception, with a right hemisphere bias that might result from the visuospatial nature of our foreperiod paradigm (Miall, 2006; Çiçek et al., 2009).
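As a rough check of the proximity of the two parietal peaks, the Euclidean distance between the reported coordinates can be computed as below. This is only an approximation: one peak is given in Talairach space and the other in MNI space, and the sketch compares them directly without any coordinate transform, which is a simplifying assumption.

```python
import math


def euclidean_distance_mm(a, b):
    """Straight-line distance between two (x, y, z) coordinates in mm."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


previous_peak = (52, -37, 40)  # Talairach coordinates, Üstün et al. (2017)
present_peak = (58, -42, 46)   # MNI coordinates, present study

distance = euclidean_distance_mm(previous_peak, present_peak)
print(f"{distance:.2f} mm")
```

Under this simplification the two peaks lie just under 10 mm apart, which is of the same order as the smoothing kernel and the sphere sizes used here.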

Bilateral SMA and ACC activations also replicate our lab's previous results (Üstün et al., 2017) and are in line with the findings of other studies (O'Reilly et al., 2008, 2013). It is suggested that SMA/ACC activity during timing might serve internal model updating and decision processes (Pouthas et al., 2005; O'Reilly et al., 2013).

The significant activity in the V5/MT area is also in line with previous neuroimaging studies using timing paradigms (Bueti et al., 2008b; Üstün et al., 2017). The V5/MT area is suggested to be related to the perception of motion (Born and Bradley, 2005). Beyond visual motion processing, V5/MT has also been suggested to be involved in the visual perception of duration (Bueti et al., 2008a,b). The significant activity observed in the cerebellum might also be related to the timing of moving objects (O'Reilly et al., 2008). Our paradigm required subjects to predict the timing of the reappearance of the occluded stimuli, which might engage cerebellar mechanisms.

## Reward Anticipation

We designed a visual paradigm using a secondary incentive (money) to engage reward anticipation processes. In half of the scanning sessions subjects were rewarded and in the other half they were not (two sessions each). The main effect of reward, in other words, is the result obtained by subtracting unrewarded from rewarded trials. The modeled activations are related to reward anticipation rather than reward outcome. Also, our paradigm does not include punishment; instead, during reward sessions, subjects are rewarded for correct answers and not rewarded for incorrect answers. Indeed, only the trials with correct subject responses were included in the analysis.

The key findings of the present study for the reward effect were the NAc, CN, and AIC activations, which are proposed to be among the key structures of the reward circuit (Haber and Knutson, 2010). These findings are in line with previous neuroimaging studies reporting dorsal (CN) and ventral striatum (NAc) as well as AIC activity during reward anticipation (Breiter et al., 2001; Knutson et al., 2001, 2003; Zink et al., 2004; Knutson and Greer, 2008). A meta-analysis of cued response studies suggested that the NAc and AIC were activated during anticipation of uncertain incentives (Knutson and Greer, 2008). The same study also proposed that while NAc activity is related to gain anticipation, the AIC activates for both gain and loss anticipation.

## Integrating Time and Reward

A significant contribution of our study is the localization of the physiological brain mechanisms integrating time perception and reward prospect. The present results suggest that a prefrontal-striatal circuit might be the hub for integrating the time and reward networks. The interaction effect showed activity in the dorsolateral, orbitofrontal and medial prefrontal cortex and in the CN. Percent signal change values obtained from these brain areas suggest that these regions are activated during the rewarded but not during the non-rewarded control condition. This finding might be interpreted as showing that these regions are affected by the reward manipulation. The activity in these regions is significantly greater during the non-rewarded timing condition than during the non-rewarded control condition, which suggests that timing engages these brain regions too. On the other hand, activity in the rewarded timing condition was not greater than in the non-rewarded timing condition (but was greater than in the non-rewarded control condition). Overall, these findings suggest that the prefrontal-striatal neural network might be involved in both timing and reward processes.

The orbitofrontal cortex, medial prefrontal cortex, and CN were suggested to be activated during reward anticipation (Kim et al., 2006; Knutson and Greer, 2008). The medial prefrontal cortex, however, is also suggested to be activated specifically during the reward outcome (Knutson et al., 2003). Monetary reward-related action selection was suggested to engage the orbital and dorsolateral prefrontal cortex for exploratory decisions, but the medial prefrontal cortex for exploitative decisions (Daw et al., 2006). The dorsolateral prefrontal cortex is suggested to be related to choosing the most valuable among multiple options, which might require working memory (Haber and Knutson, 2010). CN activity, in turn, was proposed to link reward to behavior (Knutson and Cooper, 2005).

The present study showed ventral striatum activity during the rewarded sessions, but we also found dorsal caudate activity related to reward processing. The present work did not find dorsal striatum activity related to timing, but caudate activity was previously shown during a very similar fore-period task (Üstün et al., 2017). Animal studies and human neuroimaging studies (including subjects with cocaine addiction) suggest that while ventral striatum neurons respond to reward processing, the dorsal striatum activates during timing and other cognitive processes such as spatial learning (van der Meer et al., 2010; Coull et al., 2011; Tau et al., 2014). On the other hand, it is also reported that reward processing and reward-based temporal learning activate the ventral and/or dorsal striatum (Delgado et al., 2000; Bischoff-Grethe et al., 2015; Tomasi et al., 2015). It was proposed that differences in response requirements might explain the differential activation of the ventral and dorsal striatum (Bischoff-Grethe et al., 2015).

The nature of our paradigm, which aims to engage both the timing and reward networks, is consistent with everyday life but differentiates neither anticipation from outcome nor exploration from exploitation. Overall, the present results suggest that a prefrontal-striatal circuit contributes significantly to the integration of timing and reward networks. This circuit most likely links processes related to reward-based decisions, the timing of events, and the timing of the behavioral response (Haber, 2016).

What might explain the observed co-activation of the prefrontal-striatal network in terms of the brain's anatomy and cellular physiology? Based on diffusion tensor imaging fiber tracking results, the CN was shown to connect with the dorsolateral, orbitofrontal, and medial prefrontal cortex (Lehéricy et al., 2004; Draganski et al., 2008). Dopamine neurons in the substantia nigra and VTA were reported to respond to the timing of rewards (Hollerman and Schultz, 1998; Schultz et al., 2000). Dopamine neurons send direct axonal connections to the striatum and to a wide area of the prefrontal cortex (Haber and Knutson, 2010). The CN has been suggested to act as a network hub that integrates functions of the medial, orbital, and dorsolateral prefrontal cortex (Averbeck et al., 2014; Haber, 2016). These prefrontal-striatal connections are suggested to be parallel and segregated but to show significant convergence as well (Haber, 2016). The same cortico-striatal connections are suggested to be glutamatergic and modulated by the dopaminergic input to the striatum (Gardoni and Bellone, 2015). Together, these results suggest that the physiological mechanisms and neural networks involved in reward prospect and time perception might converge on the dopaminergic cortico-striatal circuit (Meck and Benson, 2002; Coull et al., 2004; Haber and Knutson, 2010). Overall, co-activation of the prefrontal-striatal network might reflect the integration of time perception and reward systems through direct and/or indirect contributions of dopaminergic and glutamatergic neurons.

Since the present work was submitted to a special issue on time and number, a brief discussion of our findings in terms of number sense is required. It is proposed that the intraparietal sulcus and the prefrontal cortex of the primate are both involved in the encoding of space, time, and number (Burr et al., 2010). Tasks requiring number and time magnitude estimations were suggested to engage basal ganglia, prefrontal, and parietal cortex activity (Allman et al., 2011). COMT-related dopaminergic mechanisms were suggested to be implicated in number processing (Julio-Costa et al., 2013). Findings suggest that while a COMT polymorphism causing higher prefrontal dopaminergic activity affected supra-second timing performance, a DRD2 polymorphism resulting in higher striatal D2 activity affected sub-second timing performance (Wiener et al., 2011). Balcı and coworkers studied the effect of COMT and DRD2 polymorphisms on the relation between time perception and reward anticipation. They found that the gene polymorphism status related to balanced prefrontal (D1 receptor) and striatal (D2 receptor) activity was associated with a significant interaction of reward magnitude with timing performance (Balcı et al., 2013). Overall, number processing, time perception, and reward anticipation might engage a substantially overlapping brain network, probably depending on a cortico-striatal dopaminergic neural circuit (Wiener et al., 2011; Balcı et al., 2013; Julio-Costa et al., 2013).

# CONCLUSION

Our findings suggest that a prefrontal-striatal circuit might integrate the reward and timing systems of the brain. Fundamental functions of the brain such as spatial attention, time perception, working memory, and number sense were suggested to engage a fronto-parietal network (Burr and Morrone, 2006; Çiçek et al., 2009; Üstün et al., 2017). The prefrontal-striatal circuit might also be an important hub for integrating the reward system with other systems of the brain that rely on the fronto-parietal network.

## AUTHOR CONTRIBUTIONS

NA and SÜ: substantial contributions to the conception and design of the work, acquisition, analysis, and interpretation of data and writing the work. EK: substantial contributions to the analysis and interpretation of data and revising the work critically. İÇ: substantial contributions to the acquisition of data. HÖ and BB: substantial contributions to the conception and design of the work and revising the work critically. MÇ: project supervision, substantial contributions to the conception and design of the work, analysis and interpretation of the data, writing and revising the work critically. NA, SÜ, EK, İÇ, HÖ, BB, and MÇ: final approval of the version to be published and agreement to be accountable for all aspects of the work.

## FUNDING

This study was supported by the Ankara University Scientific Research Project Fund (project number, 12B6055001).

## ACKNOWLEDGMENTS

We would like to thank Dr. Anna C. Nobre for helping to discuss the design of the study. We are grateful to all participants for their involvement in the study.

#### REFERENCES

Haber, S. N., and Knutson, B. (2010). The reward circuit: linking primate anatomy and human imaging. Neuropsychopharmacology 35, 4–26. doi: 10.1038/npp.2009.129

Haber, S. N. (2016). Corticostriatal circuitry. Dialogues Clin. Neurosci. 18, 7–21.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Apaydın, Üstün, Kale, Çelikağ, Özgüven, Baskak and Çiçek. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Mental Timeline for Duration From the Age of 5 Years Old

#### Jennifer T. Coull<sup>1</sup> \*, Katherine A. Johnson<sup>2</sup> and Sylvie Droit-Volet<sup>3</sup>

<sup>1</sup> Aix-Marseille University, CNRS, LNC (UMR 7291), Marseille, France, <sup>2</sup> School of Psychological Sciences, University of Melbourne, Melbourne, VIC, Australia, <sup>3</sup> CNRS, Laboratoire de Psychologie Sociale and Cognitive, UMR 6024, Université Clermont Auvergne, Clermont-Ferrand, France

Both time and number can be represented in spatial terms. While their representation in terms of spatial magnitude (distance or size) might be innate, their representation in terms of spatial position (left/right or up/down) is acquired. In Western culture, the mental timeline represents past/future events or short/long duration on the left/right sides of space, respectively. We conducted two developmental studies to pinpoint the age at which the mental timeline for duration begins to be acquired. Children (aged 5–6, 8, or 10 years old) and adults performed temporal bisection tasks in which relative spatial position (left/right) was manipulated by either arrow direction (Experiment 1) and/or lateralized stimulus location (Experiments 1 and 2). Results first confirmed previous findings that the symbolic representation of spatial position conveyed by arrow stimuli influences the perception of duration in older children. Both 8 and 10 year olds judged the duration of leftward arrows to be shorter than that of rightward arrows. We also showed for the first time that as long as position is manipulated in a non-symbolic way by the visual eccentricity of the stimuli, then even 5–6 year olds' perception of duration is influenced by spatial position. These children judged the duration of left-lateralized stimuli to be shorter than that of either right-lateralized or centrally located stimuli. These data are consistent with the use of a mental timeline for stimulus duration from the age of 5 years old, with short duration being represented on the left side of space and long duration on the right. Nevertheless, the way in which left and right were manipulated determined the age at which spatial position influenced duration judgment: physical spatial location influenced duration perception from the age of 5 years old whereas arrow direction influenced it from the age of 8. 
This age-related dissociation may reflect distinct developmental trajectories of automatic versus voluntary spatial attentional mechanisms and, more generally, highlights the importance of accounting for attentional ability when interpreting results of duration judgment tasks.

Keywords: timing, time perception, duration, space, position, development, timeline, magnitude

Edited by: Fuat Balcı, Koç University, Turkey

#### Reviewed by:

Carmelo Mario Vicario, Università degli Studi di Messina, Italy; Alexander Kranjec, Duquesne University, United States

\*Correspondence: Jennifer T. Coull, jennifer.coull@univ-amu.fr

#### Specialty section:

This article was submitted to Perception Science, a section of the journal Frontiers in Psychology

Received: 05 March 2018; Accepted: 15 June 2018; Published: 10 July 2018

#### Citation:

Coull JT, Johnson KA and Droit-Volet S (2018) A Mental Timeline for Duration From the Age of 5 Years Old. Front. Psychol. 9:1155. doi: 10.3389/fpsyg.2018.01155

# INTRODUCTION

Time is often represented in spatial terms. For example, "we waited a long time" or "take a look back over your career." Indeed, the notion that the perception of time is merely a subjective construct derived from the movement of objects through space has been around for decades. James (1890) drew an analogy between the perception of time and space, stating "date in time corresponds to position in space" (p. 610). Piaget (1969) proposed that "time and space form an inseparable whole" (p. 1) and demonstrated that young children cannot disentangle the notions of time and space, with long duration being equated to long distance. Indeed, duration is represented in terms of spatial distance not only in adults and children (Casasanto et al., 2010; Charras et al., 2017), but also even in neonates (de Hevia et al., 2014) and monkeys (Merritt et al., 2010; Mendez et al., 2011).

This spatialization of time has been encompassed into two major theoretical frameworks, in which both time and number are represented in spatial terms. On the one hand, A Theory of Magnitude (ATOM) suggests that the dimensions of space (i.e., size), time (i.e., duration), and number (i.e., quantity) are all processed by a single, innate magnitude processing system (Walsh, 2003). Within this framework, small size is associated with short duration (and low quantity) and large size with long duration (and large quantity). On the other hand, the mental timeline (Bonato et al., 2012; Bender and Beller, 2014; Magnani and Musetti, 2017) or mental numberline (Hubbard et al., 2005) theories suggest that time and number are ordered along a linear spatial axis. As opposed to ATOM, this representation is thought to be culturally acquired rather than innate: in Western cultures, the left side of space is associated with short duration (Vallesi et al., 2008; Vicario et al., 2008) or early/past times (Santiago et al., 2007) while the direction is reversed (Fuhrman and Boroditsky, 2010; Ouellet et al., 2010) or rotated to the vertical axis (Boroditsky et al., 2011) in other cultures. The mental timeline also operates in the frontal (front-back) axis in adults (Torralbo et al., 2006; Ulrich et al., 2012; Eikmeier et al., 2013; de la Fuente et al., 2014) and children (Charras et al., 2017), and this egocentric representation of time may influence the way we conceptualize time even more strongly than horizontal (left–right) or vertical (up–down) orientations (Eikmeier et al., 2015a).

The mental timeline and ATOM are not mutually exclusive theories. Rather, they are based on complementary ways of measuring space (see also Núñez and Cooperrider, 2013; Winter et al., 2015). While ATOM emphasizes spatial magnitude (how big something is), the mental timeline emphasizes spatial position (where something is). To put it another way, ATOM is framed more in terms of coordinate spatial relations (i.e., metrical distance, which allows for measures of magnitude such as short/long) while the mental timeline depends upon categorical spatial relations (which allows for measures of relative position, such as left/right or above/below). Converging behavioral and neuroscientific data underline the distinction between these two forms of spatial measurement (Kosslyn et al., 1989), with categorical processing recruiting left parietal cortex and coordinate processing recruiting the right (Jager and Postma, 2003; Kranjec and Chatterjee, 2010). Moreover, spatial neglect patients with right hemisphere damage underestimate stimulus duration (Oliveri et al., 2009; Magnani et al., 2011), consistent with a role for the right hemisphere in magnitude processing. On the other hand, patients with left hemisphere lesions failed to show the usual effects of leftward prismatic adaptation on timing performance (Magnani et al., 2011), consistent with a role for the left hemisphere in processing relative position. In summary, while both ATOM and the mental timeline theory advocate a spatial representation of time, they conceptualize the nature of the spatial representation in different terms.

Time can also be conceptualized in terms of either magnitude or relative position. Duration refers to the length of time an event lasts (its temporal magnitude) while order refers to the moment in time at which an event occurs (its relative temporal position). The mental timeline has been suggested to represent both of these measures of time (Bonato et al., 2012). In terms of temporal position, stimuli representing events in the past (or future) are processed more quickly when they appear in the left (or right) side of space (Santiago et al., 2007) or when they are responded to with left (or right) response keys (Weger and Pratt, 2008). In terms of temporal magnitude, duration is under (or over)-estimated for stimuli representing the left (or right) side of space (Vicario et al., 2008; Droit-Volet and Coull, 2015) or when attention was shifted to the left (or right) side of space by optokinetic stimulation (Vicario et al., 2007) or prismatic adaptation (Frassinetti et al., 2009). Similarly, reaction times are faster when short- (or long-) duration stimuli are responded to with the left (or right) hand (Conson et al., 2008; Ishihara et al., 2008; Vallesi et al., 2008) or are paired with left (or right) sided primes (Di Bono et al., 2012). This spatial influence even transcends sensory modality, with visual stimuli being under- (or over-) estimated when auditory distractors were presented to the left (or right) ear (Vicario et al., 2009).

As mentioned previously, the mental timeline is acquired through culture and/or experience (Bonato et al., 2012; Casasanto and Bottini, 2014; Magnani and Musetti, 2017). It depends heavily on reading experience, and can be observed in blind, Braille-reading participants, as well as in sighted individuals (Bottini et al., 2015). However, the developmental trajectory of this spatial representation of time appears to differ as a function of the temporal measurement in question (duration or order). For example, Tillman et al. (2017) very recently showed that during language development, children aged 4–6 make use of a left–right mental timeline to convey the meaning of words that refer to relative temporal position (deictic time words, such as "yesterday" or "next year"). By contrast, children's understanding of the relative remoteness of these words from the present time did not develop until age 7. These data suggest that the mental timeline is first used to conceptualize the temporal order of events while its use in representing their temporal magnitude emerges later in development. Consistent with this idea, Droit-Volet and Coull (2015) found that the presentation duration (i.e., temporal magnitude) of rightward-facing arrows was overestimated (and leftward ones underestimated) in 10 year-olds and adults, whereas there was no influence of arrow direction on duration judgments in younger children (5 and 8 year olds). This developmental dissociation provides further evidence that the mental timeline is not innate but is acquired during childhood.

However, it remains possible that the younger children in the study by Droit-Volet and Coull (2015) showed no evidence of using a mental timeline for duration simply because they were not interpreting or processing the arrow stimuli in the same way as the older children or adults. We previously argued that this was an unlikely explanation because the spatial position indicated by an arrow can be used deliberately by children as young as 3–4 years old to correctly locate a hidden object (Leekham et al., 2010; Jakobsen et al., 2013). Moreover, by 5–6 years of age, the spatial position indicated by an arrow is processed so automatically that it modifies reaction times in spatial orienting tasks (Ristic et al., 2002; Jakobsen et al., 2013; Gregory et al., 2016). Nevertheless, to formally test the possibility that the lack of a mental timeline in 5 year olds was not simply due to their inability to use arrows to indicate spatial position, we conducted a study in which relative spatial position was manipulated either by arrows pointing to the left or right, or by stimuli being physically presented on the left or right side of the computer screen. In addition, to test whether the influence of the mental timeline on perceived duration is due to an underestimation of left-sided stimuli and/or an overestimation of right-sided stimuli, we included control conditions (vertical arrows; central location) to which left- or right-sided stimuli could be separately compared. Based on our previous results using arrow stimuli (Droit-Volet and Coull, 2015), we hypothesized that duration judgments in the youngest children would not be modulated by either arrow direction or stimulus location. If, however, their perceived duration were influenced by stimulus location, this would suggest that our previous results did not reflect the lack of a mental timeline in 5 and 8 year olds. Instead, the arrow stimuli might simply reflect a symbolic representation of time ("time's arrow") that has not yet been acquired by these younger children.

#### EXPERIMENT 1

#### Methods

#### Participants

The sample was composed of 58 participants: 19 5-year-olds (mean age = 5.21, SD = 0.47, 10 girls), 20 8-year-olds (mean age = 7.48, SD = 0.22, 11 girls), and 19 10-year-olds (mean age = 10.18, SD = 0.59, 12 girls). Two additional children (one 5-year-old and one 10-year-old) withdrew from the experiment before its completion. Sample sizes were based on prior studies investigating the influence of spatial position on perceived duration (Vallesi et al., 2008; Vicario et al., 2009; Isham et al., 2017). In a sample of three groups of 20 participants, the expected power would be 0.99 for an effect size of 0.15 in our experimental paradigm. All children were recruited from nursery and primary school in Tulle, France and had normal educational backgrounds. Teachers were asked to notify us of any child with learning difficulties. None were identified. Children's parents and the school headmaster signed a formal agreement to conduct the study. The procedure was validated by the local academic inspection committee of the French National Education Minister.

#### Materials

Children were tested individually in a quiet room in their school. Stimulus presentation and data recording were controlled by E-prime software (Psychology Software Tools). Three different stimuli were used, all composed of the same small rectangle and triangle. The triangle was positioned to the left of the rectangle to create a leftward arrow and to its right to create a rightward arrow. A neutral (vertical) arrow had the triangle positioned either above or below the rectangle randomly across trials. Each stimulus subtended a visual angle of 5°. These three stimuli appeared in one of three different locations on the computer screen (left-lateralized, centered, or right-lateralized), all on the same horizontal plane. In the left and right locations, the center of the stimulus was at 14° eccentricity. In our previous study (Droit-Volet and Coull, 2015), responses were given manually with left and right hands. However, given the increased complexity of this experimental design (simultaneous manipulation of arrow direction and stimulus location) we aimed to facilitate the task for the children by asking them to give their responses verbally. Their responses were then keyed in by the experimenter.

#### Procedure

All children performed a temporal bisection task. In the initial training phase, children were shown the vertical (control) arrow, presented in the center of the screen, for either the short (200 ms) or long (800 ms) standard duration. They were trained to say "short" or "long" when the stimulus had been presented for the short or long standard duration respectively. Each trial started with the word "prêt" ("ready") displayed on the center of the computer screen. When the participant was ready, the experimenter pressed the spacebar and the stimulus appeared. The participant then responded either "short" or "long," depending on whether they perceived stimulus duration to be short or long, and the experimenter noted the response. There were eight training trials, with four short and four long durations presented in random order. Previous investigations have shown that eight training trials is sufficient for children in this age-range to understand task instructions (Droit-Volet and Wearden, 2001). Inter-trial intervals were randomized between 0.5 and 1 s. The testing phase immediately followed training. In the testing phase, the stimuli could be leftward, rightward or vertical arrows, and they could be presented on the left, right, or center of the computer screen. For each of these nine experimental conditions, stimulus duration could be 200, 300, 400, 500, 600, 700, or 800 ms. Each testing block thus comprised 63 trials (3 arrows × 3 locations × 7 comparison durations), presented in random order. Participants performed three such blocks, with a short break between blocks, in two separate sessions that were separated by a 15–20 min break. This gave a total of 378 trials (63 trials × 3 blocks × 2 sessions) per participant.
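The factorial structure of the testing phase can be sketched in a few lines of Python. This is only an illustration of the trial arithmetic described above, not the E-prime script used in the study; the function names, the condition labels, and the seed are hypothetical.

```python
import itertools
import random

# Factor levels taken from the Procedure; the string labels are illustrative.
ARROWS = ["leftward", "vertical", "rightward"]
LOCATIONS = ["left", "center", "right"]
DURATIONS_MS = [200, 300, 400, 500, 600, 700, 800]


def make_block(rng):
    """Return one testing block: every arrow x location x duration combination, shuffled."""
    trials = list(itertools.product(ARROWS, LOCATIONS, DURATIONS_MS))
    rng.shuffle(trials)
    return trials


def make_experiment(seed=0, n_sessions=2, blocks_per_session=3):
    """Return a nested list: sessions -> blocks -> trials."""
    rng = random.Random(seed)
    return [[make_block(rng) for _ in range(blocks_per_session)]
            for _ in range(n_sessions)]


sessions = make_experiment()
n_trials = sum(len(block) for session in sessions for block in session)
print(len(sessions[0][0]), n_trials)  # 63 378
```

Each of the nine arrow-by-location conditions therefore contributes six observations per comparison duration (3 blocks × 2 sessions) to a participant's bisection curve.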

#### Data Analysis

The proportion of "long" responses [p(long)] was calculated for each experimental condition. The plot of p(long) as a function of the seven stimulus durations constituted a bisection curve. Bisection curves were constructed for each of the nine experimental conditions (3 arrows × 3 locations) for each individual participant. A pseudo-logistic function was fit to the bisection curves using GraphPad Prism 7 software. From the fit of this function to the data, we calculated Bisection Points (BPs) and Weber Ratios (WRs) for each individual participant. The BP is the point of subjective equality and represents the stimulus duration at which participants respond long as often as short, i.e., p(long) = 0.50. The lower the BP, the longer the perceived duration. The WR is a measure of temporal sensitivity and is calculated as half the difference between the stimulus durations at which p(long) = 0.75 and p(long) = 0.25, divided by the BP. The lower the WR value, the greater the temporal sensitivity and the steeper the psychophysical function.
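To make the BP and WR definitions concrete, the following Python sketch fits a curve to one bisection curve and derives both measures. It is an illustration under stated assumptions rather than the authors' pipeline: the study fit a pseudo-logistic function in GraphPad Prism 7, whereas this sketch substitutes a standard two-parameter logistic and a brute-force least-squares grid search. All function names and grid ranges are hypothetical.

```python
import math

DURATIONS_MS = [200, 300, 400, 500, 600, 700, 800]


def logistic(t, bp, slope):
    """Standard logistic (assumed form): p(long) for duration t, equal to 0.5 at t = bp."""
    return 1.0 / (1.0 + math.exp(-(t - bp) / slope))


def fit_logistic(durations, p_long):
    """Least-squares fit by exhaustive grid search (dependency-free, illustrative only)."""
    best = None
    for bp in range(200, 801, 2):          # candidate bisection points, in ms
        for slope in range(10, 201, 2):    # candidate slope parameters, in ms
            sse = sum((logistic(t, bp, slope) - p) ** 2
                      for t, p in zip(durations, p_long))
            if best is None or sse < best[0]:
                best = (sse, float(bp), float(slope))
    return best[1], best[2]


def duration_at(p, bp, slope):
    """Invert the logistic: the duration at which the fitted curve reaches p."""
    return bp + slope * math.log(p / (1.0 - p))


def weber_ratio(bp, slope):
    """Half the difference between the durations at p = 0.75 and p = 0.25, divided by BP."""
    t75 = duration_at(0.75, bp, slope)
    t25 = duration_at(0.25, bp, slope)
    return ((t75 - t25) / 2.0) / bp


# Synthetic, noise-free bisection curve with a known BP of 500 ms (made-up data).
observed = [logistic(t, 500.0, 80.0) for t in DURATIONS_MS]
bp, slope = fit_logistic(DURATIONS_MS, observed)
print(bp, f"{weber_ratio(bp, slope):.3f}")  # 500.0 0.176
```

With real data the proportions are noisy, so a proper optimizer and the same functional form as the original analysis would be needed to reproduce the reported values.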

For each participant, we calculated the average fit across all experimental conditions and excluded any participants with an average fit of R<sup>2</sup> < 0.7 (sixteen 5 year olds, three 8 year olds, and one 10 year old). Data from one additional 8 year-old were also excluded since the BP for one of the experimental conditions was more than double the maximum comparison duration and was therefore considered an outlier. Unfortunately, the fits of the pseudo-logistic function to the data were very poor for the 5 year-olds [mean (±SD) R<sup>2</sup> = 0.46 (±0.22)]. Due to very low numbers of included 5 year-olds (n = 3), we were therefore forced to analyze data from the 8 and 10 year-old groups only (n = 16 and 18, respectively).
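The exclusion rule can be expressed as a short sketch. The formula below is the usual coefficient of determination; the text does not state how the fitting software computed it for these fits, so this is an assumption, and the helper names and example values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total.

    Assumes the observed values are not all identical (SS_total > 0).
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def include_participant(r2_per_condition, threshold=0.7):
    """Keep a participant only if the mean fit across conditions reaches the threshold."""
    mean_r2 = sum(r2_per_condition) / len(r2_per_condition)
    return mean_r2 >= threshold


# Made-up example: nine per-condition R^2 values for two hypothetical participants.
print(include_participant([0.9] * 9))   # True
print(include_participant([0.46] * 9))  # False
```

Averaging the fit over the nine conditions before applying the threshold means a participant is retained or dropped as a whole, rather than condition by condition.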

We first analyzed p(long) in a four-way ANOVA with arrow direction (leftward, vertical, rightward), stimulus location (left, center, right) and duration (200, 300, 400, 500, 600, 700, 800 ms) as within-subject factors and age group (8 or 10 years old) as a between-subjects factor. To examine the effect of spatial position on perceived duration and temporal sensitivity, we analyzed BP and WR respectively in two separate three-way ANOVAs, with arrow direction (leftward, vertical, rightward) and stimulus location (left, center, right) as within-subject factors and age group (8 or 10 years old) as a between-subjects factor.

#### Results

**Figure 1** shows the proportion of "long" responses [p(long)] in each experimental condition averaged over the 8 and 10 year-old groups. Analysis of p(long) revealed the expected significant main effect of stimulus duration, F(6,192) = 670.64, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.95, with a greater proportion of "long" responses as stimulus duration increased (**Figure 1**).

More interestingly, we found significant main effects of both arrow, F(2,64) = 3.04, p = 0.055, η<sub>p</sub><sup>2</sup> = 0.09 and location, F(2,64) = 20.60, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.39. Post hoc tests indicated that rightward arrows were more likely to be judged long [mean ± standard error (SE) p(long) = 0.48 ± 0.015] than leftward arrows [mean ± SE p(long) = 0.45 ± 0.014] (p < 0.05, Bonferroni corrected) (**Figure 1A**). The response to vertical arrows [mean ± SE p(long) = 0.46 ± 0.016] lay between these two extremes, and was not significantly different to that for either leftward or rightward arrows (both p > 0.6). Stimuli presented in the central location were more likely to be judged long [mean ± SE p(long) = 0.49 ± 0.015] than those presented on the left (p < 0.0001) or right (p < 0.0001), with no significant difference between left and right locations [mean ± SE p(long) = 0.45 ± 0.015 for both left and right] (**Figure 1B**).

The main effect of age was not significant, F(1,32) = 0.37, ns. Moreover, the factor of age did not interact with duration, F(6,192) = 0.87, ns, arrow, F(2,64) = 1.35, ns nor location, F(2,64) = 1.38, ns, indicating that the overall pattern of effect was similar for both age groups.

Analysis of BP confirmed the significant main effects of both arrow, F(2,64) = 3.07, p = 0.053, η<sub>p</sub><sup>2</sup> = 0.09, and location, F(2,64) = 13.35, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.29, on perceived duration. Post hoc tests showed that duration was judged longer for rightward (mean ± SE BP = 493.42 ± 11.67 ms) versus leftward (mean ± SE BP = 512.68 ± 11.55 ms) arrows (p < 0.05), with the perceived duration of vertical arrows lying between these two values (mean ± SE BP = 505.74 ± 12.66 ms) (**Figure 2A**). Duration was overestimated for centrally located stimuli (mean ± SE BP = 485.59 ± 11.40 ms) compared to both left- (mean ± SE BP = 517.48 ± 12.16 ms) and right-lateralized stimuli (mean ± SE BP = 508.87 ± 11.44 ms) (both p < 0.005) (**Figure 2B**). There was no interaction with age for either arrow, F(2,64) = 1.35, ns or location, F(2,64) = 1.38, ns, indicating that the effects of arrow and location on perceived duration were similar for both age groups.

In terms of temporal sensitivity, there were no significant effects of either arrow, F(2,64) = 0.07, or location, F(2,64) = 2.37, on WR, nor an interaction of either of these factors with age (all p > 0.05). We found only a main effect of age, F(1,32) = 4.68, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.13, indicating that 8 year olds had wider curves (mean ± SE WR = 0.23 ± 0.015), and therefore poorer temporal sensitivity than 10 year olds (mean ± SE WR = 0.19 ± 0.014).
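As a consistency check on the effect sizes reported in this section, partial eta squared can be recovered from an F statistic and its degrees of freedom with the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies it to the values quoted above; the function name and dictionary labels are hypothetical, and the check assumes the reported F values are rounded to two decimals.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared), copied from the Results.
reported = {
    "p(long), duration": (670.64, 6, 192, 0.95),
    "p(long), arrow": (3.04, 2, 64, 0.09),
    "p(long), location": (20.60, 2, 64, 0.39),
    "BP, arrow": (3.07, 2, 64, 0.09),
    "BP, location": (13.35, 2, 64, 0.29),
    "WR, age": (4.68, 1, 32, 0.13),
}

for label, (f_value, df1, df2, eta) in reported.items():
    recomputed = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: recomputed {recomputed:.2f}, reported {eta:.2f}")
```

All six recomputed values round to the reported ones, so the F statistics, degrees of freedom, and effect sizes in this section are mutually consistent.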

#### Discussion

These results indicate that stimulus location significantly modified perceived duration in both age groups. However, the pattern of effect was not in the predicted direction. Based on existing literature (Vicario et al., 2008; Bonato et al., 2012), we hypothesized that right-lateralized stimuli would be overestimated compared to left-lateralized ones. However, we found that centrally located stimuli were judged longer than either left- or right-sided stimuli, with no difference between the two lateralized locations. One possible mechanism for this unexpected finding is that the spatial position indicated by arrows and stimulus location interfered with one another, leading participants to focus on the trained and 'spatially neutral' central location. Since the perceived duration of attended stimuli is longer than that of unattended stimuli (Brown, 2008), focusing attention on the central location might have led participants to overestimate the duration of stimuli presented there. In any case, we found no evidence for differential effects of left- versus right-lateralized stimuli on perceived duration in either of the age groups.

Our data do, however, confirm and extend previous results (Droit-Volet and Coull, 2015) that arrow direction had a significant impact on perceived duration in children older than 7 years old, with rightward-facing stimuli being judged longer than leftward stimuli. Nevertheless, effect sizes were rather small and should be interpreted with caution. The small effect size could be due to the fact that in the current study we manipulated both arrow direction and stimulus location simultaneously, whereas we manipulated only arrow direction in our previous study (Droit-Volet and Coull, 2015). The interaction between potentially conflicting representations of spatial position in the current study may have diluted the influence of the mental timeline on perceived duration. Indeed, simultaneous manipulation of arrow direction and stimulus location severely impaired performance in the youngest children. Unfortunately, data from only three of the nineteen 5 year olds were orderly enough to be fit by a psychometric curve, and so this age group had to be excluded from analyses. The increase in combinatorial possibilities of arrow direction and stimulus location (nine experimental conditions presented in randomized order) likely confused these young children, who responded rather randomly. Indeed, Iarocci et al. (2009) also reported that the combined presentation of arrows and lateralized locations impaired performance on a spatial cueing paradigm in 5 year olds, but not 7 or 9 year olds. Although 5 year olds were able to successfully use predictive arrows to quickly detect a lateralized target, these performance benefits were lost when the arrow cue was immediately preceded by presentation of an uninformative, but salient, peripheral stimulus.

FIGURE 1 | Proportion of trials in which participants judged stimulus duration to be "long" [p(long)] for each of the seven comparison durations in Experiment 1. Data are averaged over 8 and 10 year olds, and are plotted as a function of either (A) leftward, vertical (control) and rightward arrows, averaged across the three stimulus locations or (B) left-lateralized, centralized, or right-lateralized stimuli, averaged across the three arrow directions. Asterisks indicate significant differences between conditions.

FIGURE 2 | Bisection points (BPs) for each of the spatial conditions in Experiment 1. The higher the BP, the shorter the perceived duration. Data are shown separately for 8 and 10 year olds although there was no significant difference between groups. Data are plotted as a function of either (A) leftward, vertical (control), or rightward arrows, averaged across the three stimulus locations or (B) left-lateralized, centralized, or right-lateralized stimuli, averaged across the three arrow directions. Error bars represent standard errors.

Importantly, if 5 year-olds were not performing our duration estimation task as required, we could not assess the effects of stimulus location on perceived duration in this age group. In our previous study (Droit-Volet and Coull, 2015), we manipulated a single spatial factor: arrow direction. In the current study, we manipulated both arrow direction and stimulus location, which proved too complicated for the 5 year olds. Therefore, to try and facilitate the task for young children, and so allow us to measure the influence of the mental timeline in 5 year olds, we developed a simpler paradigm in which we modulated stimulus location only. In this follow-up experiment, we compared the perceived stimulus duration of a circle that appeared in either left, right, or central locations. This is equivalent to the task used in Experiment 3 of Vicario et al. (2008), although we used a temporal bisection task rather than temporal discrimination. We used the same short and long anchor durations as Experiment 1 (200 ms and 800 ms) but, to make the task shorter, included just five comparison durations rather than seven.

# EXPERIMENT 2

# Methods

#### Participants

The sample was composed of 44 participants: 20 5–6 year-olds (mean age = 5.62 years, SD = 0.29, range: 5.08–6.08 years; 13 females) and 24 adults (mean age = 20.13 years, SD = 3.71, range: 18–33 years; all female). Although there was a discrepancy in the gender balance for children (65% female) and adults (100% female), Espinosa-Fernández et al. (2003) have reported that there are no gender differences in estimating duration in the seconds range for participants under 60 years old. All children were recruited from nursery schools in Clermont-Ferrand, France and had normal educational backgrounds. Teachers were asked to notify us whether any child had learning difficulties. None were identified. Children's parents and the school headmaster signed a formal agreement to conduct the study. The procedure was validated by the local academic inspection committee of the French National Education Minister.

#### Materials

Children and adults were tested in a quiet room in school or University. Stimulus presentation and data recording were controlled by E-prime software (Psychology Software Tools). The stimulus was a black circle (3° visual angle), which appeared in one of three locations on the computer screen – left-lateralized, centered, or right-lateralized (eccentricity of left and right locations was 14°). To facilitate the task for the children, they gave their responses orally ("short" or "long"), which were then keyed in by the experimenter. Adults gave responses by pressing the upward or downward arrow key of the computer keyboard with their right index finger. Eikmeier et al. (2015b) have previously shown that spatial location influences the processing of temporal words in similar ways whether participants respond vocally or manually. For adults, the upward and downward keys had "short" or "long" stickers placed upon them, with the association between up/down key and short/long response counterbalanced across participants. We chose response keys positioned in the vertical axis (up/down), rather than the horizontal one (left/right), to avoid possible interference effects between stimulus position and side of response (Simon, 1969).
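To make the stimulus geometry concrete, the following minimal sketch converts the reported visual angles (3° stimulus diameter, 14° eccentricity) into on-screen distances. The viewing distance is not reported in this section, so the 57.3 cm value below is purely a hypothetical assumption chosen for illustration, and the function names are ours.

```python
import math

# Hypothetical viewing distance: NOT reported in the Methods above.
VIEWING_DISTANCE_CM = 57.3


def size_from_visual_angle(angle_deg, distance_cm):
    """On-screen size (cm) of an object centred on the line of sight
    that subtends `angle_deg` degrees at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def offset_from_eccentricity(eccentricity_deg, distance_cm):
    """Lateral offset (cm) from fixation for a given eccentricity."""
    return distance_cm * math.tan(math.radians(eccentricity_deg))


circle_diameter_cm = size_from_visual_angle(3.0, VIEWING_DISTANCE_CM)
lateral_offset_cm = offset_from_eccentricity(14.0, VIEWING_DISTANCE_CM)
print(f"circle diameter: {circle_diameter_cm:.2f} cm")
print(f"left/right offset from center: {lateral_offset_cm:.2f} cm")
```

With a different viewing distance, both values scale proportionally.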

#### Procedure

All participants performed a temporal bisection task. In the initial training phase, participants were shown the stimulus in the central location for either the short (200 ms) or long (800 ms) standard duration, four times each (eight trials in total), with short and long durations presented in random order. Each trial started with the word "prêt" ("ready") displayed in the center of the computer screen. When the participant was ready, the spacebar was pressed (either by the experimenter for the children or by the participant for the adults) and the stimulus appeared. The participant gave either a "short" or "long" response, depending on whether they perceived stimulus duration to be short or long. The testing phase immediately followed training. In the testing phase, the stimulus was presented on the left, right, or center of the computer screen. For each of these three experimental conditions, stimulus duration could be 200, 350, 500, 650, or 800 ms (15 trial-types). Participants performed a total of 120 trials (eight repetitions of each of the 15 trial-types).
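The structure of the testing phase described above (3 locations × 5 durations × 8 repetitions = 120 trials) can be sketched as follows. This is only an illustration: the study was run in E-prime, the function name is hypothetical, and the fully randomized trial order is an assumption that this section does not state for Experiment 2.

```python
import itertools
import random
from collections import Counter

LOCATIONS = ("left", "center", "right")
DURATIONS_MS = (200, 350, 500, 650, 800)
REPETITIONS = 8


def build_test_trials(seed=None):
    """Return a shuffled list of (location, duration_ms) test trials."""
    rng = random.Random(seed)
    trial_types = list(itertools.product(LOCATIONS, DURATIONS_MS))  # 15 types
    trials = trial_types * REPETITIONS                              # 120 trials
    rng.shuffle(trials)  # assumption: fully randomized order
    return trials


trials = build_test_trials(seed=1)
counts = Counter(trials)
print(len(trials), "trials,", len(counts), "trial-types")
```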

#### Data Analysis

The proportion of "long" responses [p(long)] was calculated for each experimental condition. Bisection curves were constructed for each of the three stimulus locations for each individual participant. A pseudo-logistic function was fit to the bisection curves using GraphPad Prism 7 software. From the fit of this function to the data, we calculated BPs and WRs for each individual participant. Fits were calculated separately for each of the three locations. Data from only one child (BP more than double the maximum stimulus duration) and one adult (BP could not be calculated due to a very poor fit) had to be excluded from the analyses. The improved fit for 5–6 year olds' data in this experiment compared to Experiment 1 indicates the utility of manipulating only a single experimental factor at a time when measuring duration estimates in young children.
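As a rough illustration of how a bisection point (BP) and Weber ratio (WR) can be derived from a fitted curve, the sketch below fits a standard two-parameter logistic function by least squares. Several things here are assumptions rather than the authors' method: the exact "pseudo-logistic" function used in GraphPad Prism is not given in this section, the coarse grid search stands in for Prism's fitting routine, and WR is taken as the difference limen (half the distance between the 25% and 75% points) divided by the BP, which is a common convention in the bisection literature. The data are simulated, not taken from the study.

```python
import math


def logistic(t_ms, bp, slope):
    """p(long) for duration t_ms; equals 0.5 when t_ms == bp."""
    return 1.0 / (1.0 + math.exp(-(t_ms - bp) / slope))


def fit_bisection(durations_ms, p_long):
    """Coarse least-squares grid search (5 ms steps) for bp and slope.

    Returns (bisection_point_ms, weber_ratio).
    """
    best = None
    for bp in range(200, 801, 5):
        for slope in range(10, 201, 5):
            err = sum((logistic(t, bp, slope) - p) ** 2
                      for t, p in zip(durations_ms, p_long))
            if best is None or err < best[0]:
                best = (err, bp, slope)
    _, bp, slope = best
    # For a logistic curve, the 25% and 75% points lie slope*ln(3)
    # on either side of the BP, so the difference limen is slope*ln(3).
    difference_limen = slope * math.log(3)
    return bp, difference_limen / bp


durations = [200, 350, 500, 650, 800]
# Simulated observer (hypothetical values): BP = 500 ms, slope = 60 ms.
simulated_p_long = [logistic(t, 500, 60) for t in durations]
bp, wr = fit_bisection(durations, simulated_p_long)
print(f"BP = {bp} ms, WR = {wr:.3f}")
```

A lower BP corresponds to a longer perceived duration, and a larger WR to a flatter curve (lower temporal sensitivity).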

We first analyzed p(long) in a three-way ANOVA with stimulus location (left, center, right) and duration (200, 350, 500, 650, or 800 ms) as within-subject factors and age group (children, adults) as a between-subjects factor. To examine the effect of location on perceived duration and temporal sensitivity, we analyzed BP and WR in two separate two-way ANOVAs with stimulus location (left, center, right) as a within-subject factor and age group (children, adults) as a between-subjects factor.

# RESULTS

An ANOVA of p(long) revealed the expected significant main effect of stimulus duration, F(4,168) = 355.84, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.89, with p(long) increasing as a function of stimulus duration (**Figure 3A**). More interestingly, we also found a significant main effect of location, F(2,84) = 5.10, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.11. Post hoc tests revealed that duration was judged longer for stimuli presented on the right [mean ± SE p(long) = 0.48 ± 0.022] versus left [mean ± SE p(long) = 0.43 ± 0.020] of the screen (p = 0.005). There was no significant difference between centrally located stimuli [mean p(long) = 0.47 ± 0.020] and either left- or right-lateralized stimuli (p > 0.05). The main effect of location was qualified by a significant location × duration interaction, F(8,336) = 2.12, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.05 (**Figure 3A**), with the effect being significant for the 500 ms duration only (i.e., midway between the short and long anchor durations). There was no significant main effect of age, F(1,42) = 0.93, ns, nor an age × location interaction, F(2,84) = 2.38, ns. The only effect of age was a significant age × duration interaction, F(4,168) = 8.28, p < 0.0001, η<sub>p</sub><sup>2</sup> = 0.17, with post hoc tests revealing that 5–6 year olds overestimated the two shortest comparison durations [mean p(long) = 0.04 and 0.29 for the 200- and 350-ms durations respectively, p < 0.005] as compared to adults [mean p(long) = 0.01 and 0.08 for the 200- and 350-ms durations respectively, p < 0.0001].
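For readers who want to relate the F statistics above to the reported effect sizes, partial eta squared can be recovered from F and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies it to the main effects of duration and location and to the location × duration interaction. That the authors' software used exactly this computation is an assumption, and the function name is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


reported = {
    "duration": (355.84, 4, 168),
    "location": (5.10, 2, 84),
    "location x duration": (2.12, 8, 336),
}
for effect, (f_value, df1, df2) in reported.items():
    print(effect, round(partial_eta_squared(f_value, df1, df2), 2))
```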

Analysis of BP confirmed a significant main effect of location, F(2,80) = 4.87, p = 0.01, η<sub>p</sub><sup>2</sup> = 0.11, on perceived duration. Post hoc tests showed that duration was judged significantly shorter for left-lateralized stimuli (mean ± SE BP = 523.69 ± 16.75 ms) compared to right-lateralized (mean ± SE BP = 485.97 ± 16.82 ms) or centrally located ones (mean ± SE BP = 480.26 ± 14.16 ms) (all p < 0.05, Bonferroni corrected). The age by location interaction was not significant, F(2,80) = 2.51, ns, although there was a trend (p = 0.088, η<sub>p</sub><sup>2</sup> = 0.06) for the effect of location to be stronger in children than in adults (**Figure 3B**).

As in Experiment 1, there were no significant effects of location on WR, F(2,80) = 1.83, ns, nor a location × age interaction, F(2,80) = 0.10, ns, although the main effect of age was significant, F(1,38) = 14.00, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.31, indicating that children (mean ± SE WR = 0.29 ± 0.024) had a wider curve (i.e., lower temporal sensitivity) than adults (mean ± SE WR = 0.16 ± 0.022).

### Discussion

In Experiment 1, the poor performance of 5 year-olds made it impossible to test the influence of spatial position on perceived duration. We therefore designed a simpler paradigm for Experiment 2, in which a non-spatial stimulus (filled black circle) was presented in one of three locations. Using this simpler paradigm, we found significant differences in perceived duration of left- versus right-lateralized stimuli even in 5–6 year-olds. Stimuli appearing on the left of the screen were perceived as shorter than those appearing on the right. There were no effects of stimulus position on temporal sensitivity, which is in line with results of our previous study (Droit-Volet and Coull, 2015), and of Experiment 1. The mental timeline therefore influences the accuracy, but not precision, of subjectively perceived duration. These data confirm the results of Vicario et al. (2008) in adults and, moreover, extend this finding to children as young as 5–6 years old.

Indeed, there was a trend for the effect to be even stronger in children than in adults. Although this effect was not statistically significant, the pattern of effect might be due to the fact that children gave responses verbally while adults gave responses manually. Although we took care to control for possible interference between the side of stimulus presentation and manual response-mode (see section "Methods"), it's possible that adults might have shown even stronger effects of stimulus location if they had given their responses verbally, for which no response interference is possible. Alternatively, the trend for the effect to be stronger in children than adults might be because children are more susceptible than adults to the influence of non-temporal contextual factors, such as sensory modality (Droit-Volet and Hallez, 2018), stimulus magnitude (Droit-Volet et al., 2008), visual salience (Charras et al., 2017) or, in this case, stimulus location.

The significant difference in the perceived duration of left-lateralized versus centrally located stimuli indicates that the difference between left- and right-lateralized stimuli might be more accurately interpreted as an underestimation of left-sided stimuli, rather than an overestimation of right-sided ones. This finding replicates the pattern observed by Vicario et al. (2009) in a cross-modal timing study of adults. In their study, a visual stimulus was judged to have a shorter duration when auditory distractors were presented to the left ear, as compared to distractors in either the right, or both, ears. By contrast, the temporal effect of distractors presented to the right ear was no different to that of bilateral distractors. Most studies of the mental timeline directly compare left- and right-lateralized stimuli to one another, making it impossible to conclude whether effects are due to an overestimation of right-sided stimuli and/or an underestimation of left-sided ones. It's unclear why left-lateralized stimuli might influence duration processing more than right-sided ones, but our results indicate that future studies of the mental timeline should ideally include a spatially neutral condition to determine whether this pattern is replicable. One speculative explanation is that the right hemisphere specialization for spatial attention (Corbetta and Shulman, 2002) preferentially weights processing of stimuli presented in the left visual field, meaning that the effects of left-lateralized stimuli on duration estimation contribute more to the overall pattern of effect than those of right-lateralized ones.

## GENERAL DISCUSSION

We conducted two experiments to pinpoint the age at which spatial position begins to influence children's temporal judgments, which would indicate the use of a mental timeline to represent duration. Children (aged 5–6, 8, or 10 years old) and adults performed temporal bisection tasks in which relative spatial position (left/right) was manipulated by arrow direction (Experiment 1) and/or lateralized stimulus location (Experiments 1 and 2). An influence of relative spatial position on perceived duration was hypothesized to signify a spatial representation of time, consistent with the use of a mental timeline. Our results confirmed previous findings (Droit-Volet and Coull, 2015) that arrow direction influenced subjective estimates of duration in older children. In the current study, we found that duration of leftward facing arrows was perceived to be shorter than that of rightward facing ones in both 8 and 10 year olds. Contrary to our hypothesis, however, stimulus location modulated perceived duration in children as young as 5–6 years old, with duration of left-lateralized stimuli being judged shorter than that of either right-lateralized or central stimuli. These data are consistent with the use of a mental timeline for stimulus duration from the age of 5–6 years old, with short duration being represented on the left side of space and long duration on the right.

# Children Have a Mental Timeline for Duration

In a prior study, we had concluded that the influence of arrow direction on duration judgments in older, but not younger, children reflected the acquisition of a mental timeline around the age of 8–10 years old (Droit-Volet and Coull, 2015). We took this developmental dissociation as further proof that the mental timeline is a culturally acquired representation of duration (Bonato et al., 2012; Núñez and Cooperrider, 2013; Winter et al., 2015; Magnani and Musetti, 2017). Nevertheless, by representing spatial location physically (left/right side of the screen) rather than symbolically (left/right arrow), we have now found evidence that children as young as 5–6 years old may also represent duration along a left–right axis. By age 5, children in Western cultures have already begun to acquire the habit of reading from left to right, and are likely to have number lines going from left to right pinned to the walls of their classrooms. Because words on the left are read before those on the right, a conceptual association is created between the order of events in time and relative spatial position. Such cultural conventions therefore influence the way in which we conceptualize the notion of temporal order: events that happen first are represented on the left whereas those that happen later are represented on the right (Fuhrman and Boroditsky, 2010; Hendricks and Boroditsky, 2015). Our data, along with many previous findings (e.g., Vallesi et al., 2008; Vicario et al., 2008), suggest that these cultural conventions also influence our notion of duration: events of short duration are represented on the left and longer ones on the right. The mechanism that would create an association between duration and spatial position is perhaps less intuitive than that between temporal order and spatial position. But one obvious possibility is that as we read from left to right, time elapses. 
Therefore, reading duration lengthens as we move from left to right on the page, creating an association between duration and position. Of course, this association would have to be reset every time we began again on the next line, making it less robust and perhaps explaining why there is less experimental evidence for a mental timeline for duration than there is for a mental timeline for temporal order (Bonato et al., 2012). Nevertheless, our data indicate that a mental timeline for duration appears to be in operation from the age of 5 years old.

# Children's Perception of Duration Is Influenced by Spatial Position as Well as Spatial Magnitude

Our findings complement previous developmental studies (Casasanto et al., 2010; Bottini and Casasanto, 2013; Charras et al., 2017) showing that 5 year olds represent duration in terms of spatial magnitude (i.e., size or distance). Young children are therefore able to conceptualize the duration of an event in terms of either spatial magnitude (short duration is equivalent to small size) or spatial position (short duration is equivalent to the left side of space). Importantly, these two distinct ways of representing duration in spatial terms – magnitude or position – are not mutually exclusive and may simply represent two different mechanisms for rendering the abstract notion of time a little more tangible (Núñez and Cooperrider, 2013; Winter et al., 2015). Additional experiments in younger children are now needed to confirm whether the spatial representation of duration develops first in terms of spatial magnitude before then being influenced, through cultural habits, by spatial position.

# Developmental Dissociation in the Effects of Arrows or Location on Perceived Duration

Crucially, the youngest children only showed evidence of a timeline when spatial position was represented physically by the location of the stimulus on the screen, not when represented symbolically by an arrow (Droit-Volet and Coull, 2015). While the physical location of a stimulus captures the focus of spatial attention automatically, or "exogenously," arrows direct spatial attention in a more voluntary, or "endogenous" way. Although there is evidence that children in this age range orient attention automatically to the spatial location indicated by an arrow (Ristic et al., 2002; Jakobsen et al., 2013), it seems that this attentional mechanism is not yet strong enough to induce a knock-on effect on the perception of time. Indeed, the spatial attentional mechanisms induced by arrow stimuli mature later than those induced by the physical location of the stimulus (Brodeur and Enns, 1997; Ristic and Kingstone, 2009). Therefore, arrows might simply be an ineffective way of measuring how the locus of spatial attention modulates perceived duration in these younger children. Alternatively, arrows might be processed as a symbolic representation of "time's arrow" that is acquired only later in development, thus explaining why perceived duration is modulated by arrow direction only from the age of 8 years old (Droit-Volet and Coull, 2015).

# Experimental Parameters Affect Whether Spatial Position Modulates Perceived Duration

Finally, our pattern of findings highlights that specific experimental parameters will dictate whether or not we are likely to find evidence of a mental timeline for duration. For example, we found a significant difference in the perceived duration of stimuli appearing on the left versus right of the screen in Experiment 2, in which stimuli were simple black dots, but not in Experiment 1, in which stimuli were arrows. Task demands are known to modulate the strength of influence of the mental timeline (Rolke et al., 2013). It's therefore possible that the direction in which the focus of spatial attention was oriented endogenously by arrows in Experiment 1 interfered with the way in which attention was oriented exogenously by stimulus location, diluting any effects of lateralized spatial location on perceived duration. The interaction between endogenous and exogenous attentional mechanisms is immature before the age of 6 years old (Iarocci et al., 2009; Ristic and Kingstone, 2009) and, indeed, the factorial combination of arrows and locations proved too much for 5 year olds, whose performance on the task broke down completely. Even in the older children, we found no evidence for differential effects of left- versus right-sided stimuli on perceived duration. Instead, in Experiment 1, perceived duration was longer for stimuli appearing in the center of the screen rather than either the left or the right. Since the duration of attended stimuli is judged longer than that of non-attended ones (Brown, 2008), this unexpected result might be due to the fact that in this complex experiment, children focused their attention on the central location, inadvertently lengthening the perceived duration of stimuli appearing there. We acknowledge this is a post hoc explanation for an unexpected finding. 
Nevertheless, it's important to stress that the simplification of the task in Experiment 2 unveiled the effects of location on perceived duration, unencumbered by the potentially interfering effects of arrow direction.

Even though the combination of arrows and locations in Experiment 1 modified the expected effects of location on duration, it did not alter the effects of arrow direction on duration. In fact, we not only confirmed that 10 year olds perceive leftward arrows as having a shorter duration than rightward ones (Droit-Volet and Coull, 2015), but also found this effect in 8 year olds. This suggests that, at least in older children and adults, effects of symbolic arrows are potentially stronger than effects of physical location. Although this seems counterintuitive, one possible explanation for this comes from the very recent work of Isham et al. (2017). In this study, stimulus laterality (left/right) and stimulus eccentricity (central/peripheral) were manipulated conjointly. Although left-lateralized stimuli were estimated to have shorter duration than right-lateralized ones, this was true only when stimuli were presented within 3 degrees of central fixation. In fact, when stimuli were presented in more peripheral locations (8 degrees of eccentricity), these findings were reversed. Therefore, the influence of the mental timeline on duration depends upon the degree of lateral spatial displacement from the midline (Isham et al., 2017). In Experiment 1, our centrally located arrowhead was 2 degrees from the midline whereas arrowheads in the peripheral locations were 11 or 16 degrees from the midline. According to the results of Isham et al. (2017), the underestimation of left-lateralized stimuli should be more evident when relative position is manipulated by arrow direction (2 degrees) rather than by its peripheral location on the screen (11/16 degrees). This is precisely what we found. Nevertheless, Isham et al. (2017) also showed that in the periphery, left-sided stimuli were overestimated. This contradicts our results in Experiment 2, which showed underestimation of left-lateralized stimuli. 
It may be that when location and eccentricity (Isham et al., 2017) are manipulated together, or indeed location and arrow direction (Experiment 1), the effects of the mental timeline on duration are less clear. In addition, given that the arrow stimuli in our study could be presented on the left, center or right of the screen, the crucial factor may not be eccentricity from the midline of the screen, but eccentricity from a point of reference such as the midline of the arrow stimulus itself. Further experiments would obviously be required to test this post hoc explanation.

# CONCLUSION

We found that spatial position influences perceived duration in children as young as 5–6 years old, indicating the use of a mental timeline to represent duration from a relatively young age. Nevertheless, this effect was found in these younger children only when spatial position was manipulated by varying the left/right location of the stimulus on the screen. If spatial position was conveyed in a more symbolic manner by leftward/rightward facing arrows, then the effect of position on perceived duration was found only in children aged 8 years and older (Droit-Volet and Coull, 2015). This age-related dissociation may reflect development of the conceptual understanding that "time's arrow" flows from left to right around the age of 8 years old.

Alternatively, it could reflect distinct developmental trajectories of automatic versus voluntary attentional mechanisms, which are differentially engaged when spatial position is manipulated by the physical location of a stimulus or the direction of an arrowhead. This explanation highlights the importance of taking attentional ability into account when interpreting results of duration judgment tasks (Droit-Volet, 2016; Hallez and Droit-Volet, 2017). Unfortunately, we did not use neuropsychological tests to assess children's memory or attentional function in the present study. Future investigations should examine how underlying cognitive capacity affects the influence of spatial context on perceived duration.

Furthermore, it would be extremely informative to repeat this experiment in a group of children who read from right-to-left, the prediction being that left-lateralized stimuli would now be overestimated compared to right-lateralized ones. Importantly, such an experiment might help clarify why, in the current experiment, we found an underestimation of left-sided stimuli rather than an overestimation of right-sided ones. If our pattern of effect was indeed due to 5 year olds' bias to process left-lateralized stimuli then, as compared to central stimuli, there should be a disproportionate overestimation of left-sided stimuli in the right-to-left reading population, as opposed to an underestimation of right-sided ones.

Finally, it is important to remember that even though the spatialization of time might provide a useful heuristic for communication, it is not the only way that time can be represented. Kranjec and Chatterjee (2010) note that spatial metaphors for time (e.g., look forward to the party) appear later in language development than purely temporal words (e.g., the party is tomorrow), suggesting that time could be represented independently of space. They suggest, as an alternative framework, that time might be represented in sensorimotor networks of the brain. This hypothesis is supported by converging evidence from both the neuroimaging and developmental domains (Coull and Droit-Volet, unpublished). For example, structures of the brain typically associated with motor function, such as Supplementary Motor Area or basal ganglia, are activated by purely perceptual timing tasks (Wiener et al., 2010; Coull et al., 2011). In parallel, young children appear better able to represent time when it is coupled to a motor act (Droit-Volet, 1998; Droit-Volet and Rattat, 1999). Untangling the distinct, and overlapping, contributions of spatial and sensorimotor experience to our understanding of time is an important challenge for future research.

# DATA AVAILABILITY

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

## AUTHOR CONTRIBUTIONS

JC, KAJ, and SD-V conceived the experiments and wrote the paper. SD-V acquired the data. JC and SD-V analyzed the data.

# FUNDING

This work was partly funded by an Agence Nationale de la Recherche grant (ANR-12-BSH2-0005-05) awarded to JC.

## ACKNOWLEDGMENTS

We thank Jessica Boudrie and Laetitia Bartomeuf for collecting the data. We thank the principal (Mme Marlingue) and teachers of the Joliot Curie school in Tulle, and the principal (Mme Vacher) and teachers of the Diderot school in Clermont-Ferrand for allowing us to conduct this research. We would also like to thank the three reviewers for their thoughtful comments.

## REFERENCES

Winter, B., Marghetis, T., and Matlock, T. (2015). Of magnitudes and metaphors: explaining cognitive interactions between space, time and number. Cortex 64, 209–224. doi: 10.1016/j.cortex.2014.10.015

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Coull, Johnson and Droit-Volet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Cognitive Load Affects Numerical and Temporal Judgments in Distinct Ways

#### Karina Hamamouche<sup>1</sup>\*, Maura Keefe<sup>1</sup>, Kerry E. Jordan<sup>2</sup> and Sara Cordes<sup>1</sup>

<sup>1</sup> Boston College, Chestnut Hill, MA, United States, <sup>2</sup> Department of Psychology, Utah State University, Logan, UT, United States

Prominent theories suggest that time and number are processed by a single neural locus or a common magnitude system (e.g., Meck and Church, 1983; Walsh, 2003). However, a growing body of literature has identified numerous inconsistencies between temporal and numerical processing, casting doubt on the presence of such a singular system. Findings of distinct temporal and numerical biases in the presence of emotional content (Baker et al., 2013; Young and Cordes, 2013) are particularly relevant to this debate. Specifically, emotional stimuli lead to temporal overestimation, yet identical stimuli result in numerical underestimation. In the current study, we tested adults' temporal and numerical processing under cognitive load, a task that compromises attention. Under the premise of a common magnitude system, one would predict cognitive load to have an identical impact on temporal and numerical judgments. Inconsistent with the common magnitude account, results revealed baseline performance on the temporal and numerical task was not correlated and importantly, cognitive load resulted in distinct and opposing quantity biases: numerical underestimation and marginal temporal overestimation. Together, our data call into question the common magnitude account, while also providing support for the role of attentional processes involved in numerical underestimation.

#### Edited by:

Fuat Balcı, Koç University, Turkey

#### Reviewed by:

Giovanna Mioni, Università degli Studi di Padova, Italy; Jennifer T. Coull, Aix-Marseille Université, France

\*Correspondence: Karina Hamamouche, hamamouc@bc.edu

#### Specialty section:

This article was submitted to Perception Science, a section of the journal Frontiers in Psychology

Received: 02 June 2018 Accepted: 04 September 2018 Published: 02 October 2018

#### Citation:

Hamamouche K, Keefe M, Jordan KE and Cordes S (2018) Cognitive Load Affects Numerical and Temporal Judgments in Distinct Ways. Front. Psychol. 9:1783. doi: 10.3389/fpsyg.2018.01783

Keywords: quantity processing, time perception, number processing, cognitive load, quantity estimation

# INTRODUCTION

Throughout our daily lives, we constantly track temporal and numerical information. However, this process is never void of context; in fact, it is often coupled with distractions. We often calculate a tip at a restaurant while simultaneously talking to our friends or estimate how long it will take us to drive home while listening to a child crying in the backseat. Although research has most frequently tested quantity processing in controlled laboratory settings, recent work has revealed quantity processing biases in the presence of external stimuli. For example, these studies have revealed durations to be overestimated and numerosities to be underestimated in the presence of emotional content, namely angry faces (see Gil et al., 2007; Baker et al., 2013; Young and Cordes, 2013). These findings have led many researchers to re-think prominent theories of quantity processing, while also unveiling questions regarding the cognitive mechanism(s) involved in quantity processes. In the current study, we investigated adults' temporal and numerical processing under cognitive load – an attention-distracting working memory task. This manipulation not only mimics real-world quantity processing, but allows us to directly test theories of quantity processing.

Evidence from behavioral, neural, and clinical data reveals many striking parallels in numerical and temporal processing (Dormal et al., 2006; Feigenson, 2007; Provasi et al., 2011; Dormal and Pesenti, 2012, 2013; Vicario et al., 2013). For example, behavioral data indicate that the ease of numerical and temporal judgments follows Weber's Law (Stevens, 1957). That is, it is easier to discriminate between two numerosities or durations when they differ by a larger ratio. Other behavioral work shows that rats and infants are able to generalize a rule learned in a numerical domain to a temporal domain and vice versa (Meck and Church, 1983; de Hevia et al., 2012). Neuroimaging and clinical work also reveal similarities between numerical and temporal processing. For instance, adults' intraparietal sulcus (IPS) is activated while processing both numerical and temporal stimuli (Dormal et al., 2012; Skagerlund et al., 2016). Relatedly, Hayashi et al. (2013; Experiment 1) show comparable activations in the intraparietal cortex and inferior frontal gyrus during temporal and numerical discrimination tasks. Moreover, many individuals suffering from clinical disorders, such as Turner Syndrome, also experience comorbid quantity processing deficits (e.g., Vicario et al., 2013), further emphasizing commonalities in quantity processing. These findings have led researchers to propose that a single neural locus, or a common magnitude system, is responsible for processing both time and number (Meck and Church, 1983; Walsh, 2003; Cantlon et al., 2009).

Despite reports of numerous similarities in numerical and temporal processing, researchers have also identified many striking inconsistencies in processing these two types of quantity (e.g., Baker et al., 2013; Young and Cordes, 2013; Odic, 2017). Although number and time follow comparable developmental trajectories in infancy, different developmental trajectories occur in childhood (Odic, 2017). While some work has suggested that the IPS is activated during both temporal (e.g., Schubotz et al., 2000; Rao et al., 2001) and numerical (e.g., Cantlon et al., 2006; Piazza et al., 2007; Ashkenazi et al., 2008) tasks, the overall consensus is that the IPS is implicated in numerical, but not temporal, processing (Rammsayer and Classen, 1997; Nenadic et al., 2003; Matell and Meck, 2004; Koch et al., 2009). Research revealing unique numerical and temporal biases when making quantity judgments in the presence of emotional content has been particularly damaging to the common magnitude account. In these studies, participants passively view a happy, angry, or neutrally valenced face immediately before making a numerical or temporal judgment. Results show that both children and adults consistently underestimate numerosity in the presence of both negatively and positively valenced stimuli (both happy and angry faces). In contrast, durations are overestimated in the presence of negatively valenced stimuli<sup>1</sup> (angry faces; Gil et al., 2007; Baker et al., 2013; Young and Cordes, 2013). That is, the exact same emotional stimuli differentially bias numerical and temporal processing, challenging claims of a common magnitude system. In the current study, we investigate the effect of cognitive load on temporal and numerical judgments. While comparable temporal and numerical biases would provide evidence in favor of a common magnitude system, unique temporal and numerical biases under cognitive load would call this account into question.

Distinct biases in the presence of emotional content have not only challenged claims of a common magnitude system, but also led researchers to speculate about the cognitive mechanism(s) underlying numerical and temporal processing. Because distinct patterns of estimation occur in the presence of emotional faces – only angry faces lead to temporal overestimation, but both angry and happy faces lead to numerical underestimation – some have explained these results by the differential effects of arousal and attention on quantitative processing (Young and Cordes, 2013). While previous work has tested the effects of altered attention or heightened arousal on temporal and numerical judgments separately (e.g., Thomas and Brown, 1974; Rammsayer and Lima, 1991; Brown and Stubbs, 1992; Macar et al., 1994; Brown, 1997, 2008; Casini and Macar, 1997; Fortin and Rousseau, 1998; Zakay, 1998; Khan et al., 2006; Wearden et al., 2007; Block and Zakay, 2008; Ortega and Lopez, 2008; see Block et al., 2010; Hamamouche et al., 2017), these studies have rarely investigated these manipulations on temporal and numerical processing in the same individuals using the same task. Moreover, attention and arousal are closely related constructs, and are inevitably involved in both temporal and numerical estimation. Thus, in the current study, we aimed to solely manipulate attention – defined as "the appropriate allocation of processing resources to relevant stimuli" (Coull, 1998, p. 344) – by introducing cognitive load during temporal and numerical processing in the same individuals. Using this manipulation, we will be able to (1) assess the impact of attention manipulations on temporal and numerical processing, and (2) determine whether comparable biases occur in the presence of an attention distracting task.

## The Current Study

In the current study, adults made temporal and numerical judgments under cognitive load – a distracting, working memory task – in order to assess the likelihood of the common magnitude system. Under the premise of a common magnitude system, one would predict altered attention from the cognitive load manipulation to identically impact numerical and temporal judgments. However, if numerical and temporal processing are dictated by distinct cognitive systems, temporal and numerical biases under cognitive load may not track together.

# MATERIALS AND METHODS

#### Participants

Eighty Boston College undergraduates participated in this study for course credit or cash compensation (58 females, Mage = 19.15 years). After exclusions (see criteria below), there were 71 participants with complete data.

# Experimental Design and Procedure

Participants completed both a numerical and a temporal bisection task. Within each task, there were two blocks: a baseline block and a cognitive load block. The order of the tasks (numerical vs. temporal) and the order of the blocks within each task (baseline vs. cognitive load) were counterbalanced. However, the order remained consistent within each participant – if the participant completed the cognitive load trials first in the temporal bisection task, s/he would also complete the cognitive load trials first in the numerical bisection task.

<sup>1</sup>While temporal overestimation typically occurs in the presence of angry faces, this effect seems to depend on the temporal measure used (see Gil and Droit-Volet, 2011; Lui et al., 2011).

In the numerical bisection task, participants were first familiarized to a small (15 dots) and a large (60 dots) standard value (modeled after Lewis et al., 2017). Each display was shown twice and labeled both on the screen and by the experimenter with the appropriate size (i.e., small or large). To confirm that participants knew the standard values before beginning the test trials, participants first completed four standard practice trials in which they had to classify the dot arrays containing the standard values as either small or large (by pressing a button on the keyboard). Participants received feedback on each of the four standard practice trials. No other feedback was provided during the experiment.

In the baseline block of the numerical task, participants then completed additional practice trials in which they were presented with arrays containing intermediate numerosities (19, 24, 30, 38, 48) in addition to the standard values and were asked to indicate whether the numerosity of the display was more similar to the small or large standard. Each numerosity (standard values and intermediate values) was shown once in a randomized order, resulting in seven baseline practice trials. Participants completed these baseline practice trials to familiarize themselves with the demands of the task. After completing the practice trials, participants then completed the baseline test trials, during which each of the seven numerosities was presented 12 times in a random order, resulting in a total of 84 test trials. Dot arrays were presented for 750 ms. For each numerosity, there were twelve different configurations of dots. Within each array, the size of all dots was held constant; however, dot sizes varied across arrays. Half of the arrays controlled for cumulative surface area, such that regardless of the numerosity, the cumulative area of the array was held constant (approximately 133.6 cm<sup>2</sup>). Thus, individual dots were smaller as the number of dots in the array increased. The other half of the dot arrays controlled for dot size such that each dot, regardless of the numerosity of the array, had an area of 4.01 cm<sup>2</sup>. Thus, dot arrays with fewer dots also had a smaller cumulative surface area (cumulative areas ranged from 59.9 to 239.99 cm<sup>2</sup>).

The cognitive load block was identical to the baseline block except that participants were required to remember and alphabetize four letters while simultaneously completing the numerical task. On every trial, participants saw four letters on the screen (e.g., M K F J) for 750 ms and were told to remember and alphabetize the letters. Participants were then presented with the dot array. After making their numerical judgment (whether the array was more similar to small or large standard), participants were shown a text box in which they were instructed to type the four letters in alphabetical order (e.g., F J K M). The participants first completed seven practice trials (one with each numerosity), and then completed 84 test trials (12 displays of each numerosity). During test, participants were only asked to type the letters alphabetically on a random two-thirds of the trials; however, participants did not know when they would be required to type the letters in alphabetical order, thus they were required to perform the working memory task on every trial in anticipation of receiving the prompt. The progression of the cognitive load trials can be found in **Figure 1**.

The temporal bisection task was identical to the numerical bisection task, except that participants saw a blue oval in the center of the screen for a specified duration instead of a dot array. Participants were familiarized to two standard durations, a short standard (400 ms) and a long standard (1600 ms; modeled after Young and Cordes, 2013). Again, participants began with four standard practice trials during which they classified the two standard durations and received trial-by-trial feedback. Participants then completed seven additional practice trials with each standard value and the intermediate durations (504, 635, 800, 1008, and 1270 ms) intermixed, during which they were instructed to decide whether the oval's duration was more similar to the short standard or the long standard. Then during the baseline block, participants completed 84 test trials (7 durations × 12 presentations = 84 test trials). The cognitive load block was identical – participants were again shown four letters prior to every temporal stimulus and were asked to remember and alphabetize them during the temporal task as described above.
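The seven test values in each task appear to be spaced in equal ratio steps between the two standards, which fits the ratio-based (Weber's Law) framing of the Introduction, although the text does not state how the values were chosen. A minimal sketch, assuming geometric spacing with rounding to the nearest integer (the function name `geometric_steps` is hypothetical), reproduces both series:

```python
def geometric_steps(low, high, n_values):
    """Return n_values integers spaced in equal ratio steps from low to high.

    Assumption: geometric (log-equal) spacing with rounding to the nearest
    integer. The source lists the test values but not how they were derived.
    """
    ratio = high / low
    return [round(low * ratio ** (i / (n_values - 1))) for i in range(n_values)]


numerosities = geometric_steps(15, 60, 7)      # dots per array
durations_ms = geometric_steps(400, 1600, 7)   # oval durations in ms

print(numerosities)
print(durations_ms)
```

Under this assumption, the middle value of each series (30 dots, 800 ms) is the geometric mean of the two standards, the same value later used to compute the relative measures.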

# Data Coding

#### Exclusion Criteria

Five participants only completed one of the two bisection tasks (numerical or temporal) and an additional three participants only completed one trial type (baseline or cognitive load). Thus, performance on the task or block that was not completed was coded as missing data.

Data from participants who performed poorly on test trials involving the standard values (<75% accuracy on the standard values combined) were excluded (Number: NBaseline = 1, NCognitiveLoad = 2; Time: NBaseline = 1; NCognitiveLoad = 2).

#### **Cognitive load**

To assess participants' accuracy on the cognitive load task, we calculated the percentage of trials during which each participant correctly alphabetized the letters for the numerical and temporal bisection separately. Participants who typed in the letters without alphabetizing them were removed from the cognitive load analyses (NNumber = 2, NTime = 5).
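The scoring rule described above can be sketched as follows. This is an illustrative reading rather than the authors' coding script: the function names, the three response labels, and the handling of spaces and letter case are all assumptions.

```python
def score_load_trial(presented, typed):
    """Classify one typed response on the letter task.

    presented: letters as shown on screen, e.g. "M K F J"
    typed:     letters the participant entered, e.g. "F J K M"
    Returns "correct", "not_alphabetized" (letters repeated in the order
    shown) or "incorrect". The labels are illustrative assumptions.
    """
    shown = [c.upper() for c in presented if c.isalpha()]
    entered = [c.upper() for c in typed if c.isalpha()]
    if entered == sorted(shown):
        return "correct"
    if entered == shown:
        return "not_alphabetized"
    return "incorrect"


def load_accuracy(trials):
    """Proportion of prompted trials with a correctly alphabetized response.

    trials: list of (presented, typed) pairs for one participant and task.
    """
    labels = [score_load_trial(shown, typed) for shown, typed in trials]
    return labels.count("correct") / len(labels)
```

A participant whose responses are mostly labeled `"not_alphabetized"` would correspond to the exclusion case described above; the threshold for exclusion is not given in the text and is left open here.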

#### **Bisection task dependent measures**

Two dependent measures were taken from the bisection task:

**(1) Relative Point of Subjective Equality (PSE)**. The PSE corresponds to the value at which 50% of the responses were classified as "large" (or "long"). First, the proportion of responses during which the participant judged the numerosities (or durations) as being closer to the large (long) standard was plotted as a function of the stimulus numerosity (or duration) and these data were fit with a Cumulative Gaussian function (as per Çoşkun et al., 2015). These curves were then used to determine each participant's PSE. PSEs were calculated separately for each bisection task (numerical and temporal) and block type (baseline and cognitive load). It should be noted that a higher PSE reflects a rightward shift in the curve, that is, a lower likelihood of the participant judging a given value as similar to the long/large standard; i.e., it is indicative of underestimation. PSE values that were three standard deviations above or below the mean were replaced with the next highest/lowest value within that range (Time: NBaseline = 1, NCognitiveLoad = 1). Because PSE varies based on the range of values for each task (15–60 for number and 400–1600 for time), we calculated a Relative PSE for each task and trial block separately. The Relative PSE was calculated by dividing each participant's PSE by the geometric mean of the standard values (30 for the numerical task and 800 for the temporal task), thus allowing for direct comparisons between temporal and numerical performance.

**(2) Relative Difference Limen (DL)**. The DL is a measure of the participant's consistency in responding and corresponds to half the difference between the stimulus values yielding a 75% probability of a large/long response and a 25% probability of a large/long response. Outliers were replaced with the next largest/smallest value within the range (Time: NBaseline = 1, NCognitiveLoad = 1; Number: NBaseline = 1). Again, we calculated the Relative DL by dividing each participant's DL by the geometric mean of the standard values.
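The two dependent measures can be illustrated with a short sketch. It is not the authors' fitting code: the least-squares grid search, the function names, and the example proportions are assumptions made for illustration, and the DL is taken here in its conventional sense of half the distance between the 75% and 25% points of the fitted curve.

```python
from math import sqrt
from statistics import NormalDist


def fit_cumulative_gaussian(xs, p_large, steps=200):
    """Least-squares grid search for (mu, sigma) of a cumulative Gaussian.

    xs:      stimulus values (numerosities or durations)
    p_large: proportion of "large"/"long" responses at each value
    The grid search is an illustrative stand-in for a proper optimizer.
    """
    lo, hi = min(xs), max(xs)
    best = None
    for i in range(steps + 1):
        mu = lo + (hi - lo) * i / steps
        for j in range(1, steps + 1):
            sigma = (hi - lo) * j / steps
            curve = NormalDist(mu, sigma)
            err = sum((curve.cdf(x) - p) ** 2 for x, p in zip(xs, p_large))
            if best is None or err < best[0]:
                best = (err, mu, sigma)
    return best[1], best[2]


def bisection_measures(xs, p_large, short_std, long_std):
    """Return Relative PSE and Relative DL for one participant and block."""
    mu, sigma = fit_cumulative_gaussian(xs, p_large)
    curve = NormalDist(mu, sigma)
    pse = curve.inv_cdf(0.50)                              # 50% point
    dl = (curve.inv_cdf(0.75) - curve.inv_cdf(0.25)) / 2   # half the 25-75% span
    geo_mean = sqrt(short_std * long_std)                  # 30 dots or 800 ms
    return {"relative_pse": pse / geo_mean, "relative_dl": dl / geo_mean}


# Made-up illustrative proportions, not data from the study.
numerosities = [15, 19, 24, 30, 38, 48, 60]
p_large = [0.02, 0.05, 0.17, 0.42, 0.75, 0.95, 1.00]
print(bisection_measures(numerosities, p_large, short_std=15, long_std=60))
```

A Relative PSE above 1 places the 50% point above the geometric mean of the standards, so comparing this value across blocks shows the direction of any shift on a scale shared by the numerical and temporal tasks.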

# RESULTS

First, we tested whether the order in which participants completed the bisection tasks (numerical versus temporal first) or the block order (baseline versus cognitive load trials first) interacted with our variables of interest (Relative PSE, Relative DL). Neither order variable interacted with these measures (p's > 0.05); thus, we collapsed data across these variables in the subsequent analyses.

#### The Relation Between Time and Number

The common magnitude account would predict a correlation between baseline performance on the numerical and temporal bisection. However, performance on the baseline numerical and temporal tasks was not correlated (Relative PSE: r = 0.090, p = 0.445, Relative DL: r = 0.132, p = 0.263).

## Cognitive Load Performance

Next, we tested whether cognitive load accuracy (i.e., correctly alphabetizing the four letters) differed as a function of the task (numerical versus temporal bisection). This was done to ensure that cognitive load affected participants in each task approximately equally. A paired samples t-test revealed no significant difference in cognitive load accuracy across the two tasks, t(72) = 0.723, p = 0.472 (Temporal Task: M = 0.70, SE = 0.02; Numerical Task: M = 0.69, SE = 0.02).

#### Effect of Cognitive Load on Temporal and Numerical Judgments Relative PSE

In order to directly compare biases on the numerical and temporal tasks, we conducted a 2 (Task: numerical vs. temporal bisection) × 2 (Block: baseline vs. cognitive load) repeated measures ANOVA on the Relative PSE. There was a significant Task × Block interaction, F(1, 71) = 22.063, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.237 (see **Figure 2**). No main effects reached significance, p's > 0.2. To follow up on the Task × Block interaction, we next conducted paired samples t-tests comparing the baseline and cognitive load trials within the numerical and temporal task separately. Cognitive load trials (M = 1.13, SE = 0.02) were significantly underestimated compared to baseline trials (M = 1.04, SE = 0.02) in the numerical bisection, t(75) = −4.634, p < 0.001 (see **Figures 3A**, **4A**), whereas cognitive load trials (M = 1.05, SE = 0.02) were marginally overestimated compared to baseline trials (M = 1.10, SE = 0.02) in the temporal bisection, t(75) = 1.890, p = 0.063 (see **Figures 3B**, **4B**). The numerical underestimation under cognitive load provides support for the role of attention in numerical processing. Lastly, we conducted additional paired samples t-tests to compare numerical and temporal processing in each block separately. Relative PSEs on the numerical and temporal tasks were comparable during the baseline trials, t(73) = −1.990, p = 0.05. Under cognitive load, however, the Relative PSE on the numerical task was significantly greater than that on the temporal task, t(71) = 2.743, p = 0.008, emphasizing the unique effect of cognitive load on the two bisection tasks.
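In a fully within-subject 2 × 2 design, the interaction F with (1, n − 1) degrees of freedom equals the square of a one-sample t statistic computed on each participant's difference of differences. The sketch below illustrates that equivalence for the Task × Block interaction; it is not the authors' analysis code, and the function and argument names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def interaction_f(num_base, num_load, time_base, time_load):
    """F for the Task x Block interaction in a 2 x 2 within-subject design.

    Each argument is a list of Relative PSEs, one value per participant,
    in the same participant order. Returns (F, df_error).
    """
    # Load-minus-baseline shift per participant, contrasted across tasks.
    contrast = [
        (nl - nb) - (tl - tb)
        for nb, nl, tb, tl in zip(num_base, num_load, time_base, time_load)
    ]
    n = len(contrast)
    t = mean(contrast) / (stdev(contrast) / sqrt(n))  # one-sample t against 0
    return t ** 2, n - 1
```

A positive mean contrast indicates that the Relative PSE rose more under cognitive load in the numerical task than in the temporal task, which is the pattern the follow-up t-tests describe.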

#### Relative DL

In order to test the effects of cognitive load on the consistency of participants' responding, we conducted an identical repeated measures ANOVA on the Relative DL in both the numerical and temporal tasks. There was a main effect of task, F(1, 71) = 4.841, p = 0.031, η<sub>p</sub><sup>2</sup> = 0.064, such that the Relative DL on the numerical task (M = 0.147, SE = 0.006) was significantly lower than that of the temporal task (M = 0.165, SE = 0.007), indicative of more consistent responding in the numerical task across both Blocks. No other main effects or interactions reached significance, p's > 0.8 (see **Figure 5**).

# DISCUSSION

Understanding how quantity processing occurs in the real world is critical for assessing prominent theories of quantity processing and can also shed light on the cognitive mechanism(s) underlying these processes. Previous research identified numerous similarities in processing quantities such as time and number, leading to the prominent common magnitude system theory (see Walsh, 2003). However, newer work revealing many discrepancies in quantity processing (e.g., Baker et al., 2013; Odic, 2017) has undercut the common magnitude account. Of particular interest to this line of work are findings that angry face stimuli lead to numerical underestimation, yet temporal overestimation. These results have led researchers to rethink the prominent theory of a common magnitude system and consider the alternative that distinct cognitive mechanism(s) may underlie numerical and temporal processing. In the current study, we used a cognitive load manipulation to directly test the effect of altered attention during temporal and numerical processing in adults.

# Temporal and Numerical Processing in Baseline Conditions

First, the common magnitude system would predict performance on comparable temporal and numerical tasks to be correlated. Replicating previous work, our baseline data revealed no correlation between performance on our temporal and numerical tasks (see Agrillo et al., 2013; Young and Cordes, 2013; Odic et al., 2016). This finding matches several null results from other research groups, further undercutting the common magnitude account by suggesting that the abilities to track time and number are dictated by distinct patterns of representational acuity even within the same individuals.

# Quantity Biases Under Cognitive Load

The common magnitude system would not only predict a correlation between temporal and numerical processing, but also temporal and numerical biases to track in the same direction under identical conditions. Despite this, several studies have demonstrated unique temporal and numerical biases in the presence of emotional content (Gil et al., 2007; Baker et al., 2013; Young and Cordes, 2013; Lewis et al., 2017). While these biases have previously been observed in the presence of emotional content, our data provide additional evidence for differential biases in temporal and numerical processing in a new context – during cognitive load. Mirroring earlier work with emotional content, our data reveal that cognitive load led to numerical underestimation. Temporal judgments, however, were marginally overestimated during cognitive load trials. Our findings join others challenging claims of the common magnitude system by identifying distinct and opposing biases in temporal and numerical processing (Gil et al., 2007; Baker et al., 2013; Young and Cordes, 2013; Lewis et al., 2017).

Our study also aimed to further explore the effect of attention on temporal and numerical processing. While previous work has suggested heightened arousal leads to temporal overestimation, but altered attention leads to numerical underestimation (Young and Cordes, 2013), attention and arousal are related constructs. Thus, our study attempted to solely test the effect of altered attention – through the induction of cognitive load – on both temporal and numerical processing within the same individuals.

As predicted, numerical judgments were underestimated during the critical cognitive load trials, suggesting that numerical processing was disrupted while concurrently performing a distracting working memory task.

It is also important to note that temporal judgments were also marginally impacted by cognitive load, but in a different direction and less robustly. Unlike the findings with the numerical data, cognitive load resulted in the marginal overestimation of durations. These findings somewhat replicate previous work demonstrating that individuals tend to overestimate time in the presence of angry stimuli (e.g., Gil et al., 2007; Young and Cordes, 2013). However, they are counter to findings suggesting that altered attention leads to shorter duration estimates (e.g., Brown and Stubbs, 1992; Brown, 1997, 2008; Block and Zakay, 2008; Block et al., 2010). The opposing findings of our study compared to previous work may be accounted for by methodological differences, such as the type of temporal task employed and the durations used. Although prior work has primarily explored attentional manipulations on timing in the context of production or reproduction tasks (which typically assess estimation), the current study employed a bisection task that assesses subjective temporal judgments. In line with this possibility is work revealing differential impacts of arousing stimuli on temporal judgments across different timing tasks. Although angry faces lead to temporal overestimation in bisection, estimation, and production tasks, emotion does not impact temporal performance on reproduction or generalization tasks (Gil and Droit-Volet, 2011). While differences in the task demands could have led to discrepancies between our study and others, research employing bisection tasks, like the one used in the current study, has also revealed temporal underestimation (see Casini and Macar, 1997; Droit-Volet et al., 2010; Tipples, 2010). Moreover, our task focused on short durations (<1600 ms), yet previous work has typically, although not exclusively, tested the effect of cognitive load on longer durations (>2 s). 
This is particularly important given substantial evidence for two separate timing systems for timing sub-seconds (<1 s) and supra-seconds (>1 s, e.g., Lewis and Miall, 2003; Buhusi and Cordes, 2011; Gooch et al., 2011). Thus, it is possible that either the task employed, or the specific durations tested, may have led our findings to conflict with previous work on timing and attention. Regardless, our data further emphasize the need to investigate the effect of cognitive load on both sub- and supra-second judgments across tasks to determine how attentional manipulations impact timing judgments.

Lastly, because cognitive load was expected to impact attention, it was predicted that the inclusion of a dual task paradigm would lead to less consistent responding on the cognitive load trials. Surprisingly, this was not the case. Our data analyses revealed that introducing cognitive load did not impact participants' consistency in making temporal or numerical judgments. In fact, our only finding with regard to response consistency was an overall main effect of task, such that numerical judgments were more precise than temporal judgments. This finding is consistent with evidence indicating that numerical judgments are more accurate than temporal judgments across the lifespan (Droit-Volet et al., 2008, Expt 1; Odic et al., 2016; Odic, 2017).

While our goal was to directly manipulate attention, it is important to note that our cognitive load manipulation not only altered attention, but also necessarily engaged other domain-general abilities such as working memory. Although we intentionally chose a cognitive load manipulation that has been used in the literature for manipulating attention (e.g., Postle et al., 1999), it is clear that this secondary task confounded attention and working memory. Thus, it is possible that the biases obtained in our data may be the result of manipulations to another domain general cognitive process, rather than attention specifically. Future work will be important for teasing apart this alternative. In particular, studies employing eyetracking methods, in which implicit measures of attention could be directly measured, would be particularly beneficial for investigating this possibility.

## CONCLUSION

These data provide evidence against a common magnitude system by (a) demonstrating inconsistencies in representing different types of quantity within individuals, and (b) showing numerical underestimation, but slight temporal overestimation during an attention distracting task. Although our findings suggest that attention is critical for numerical processing, more work is needed to shed light on how attentional manipulations impact quantity processing. Future research will be critical for understanding the unique representational formats of quantities, and additional work exploring the role of attention and arousal as cognitive mechanisms underlying quantity processing may be particularly fruitful for shedding light on the representational patterns of quantitative information.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Institutional Review Board, Boston College. The protocol was approved by the Institutional Review Board, Boston College. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

#### AUTHOR CONTRIBUTIONS

KH and MK completed all data collection. KH wrote the manuscript. MK, KJ, and SC provided thoughtful feedback on the manuscript. All authors contributed to the design of the study and approved of this version of the manuscript.

# FUNDING

Funding was provided by a Boston College Research Expense Grant to SC.

# ACKNOWLEDGMENTS

The authors would like to thank Kylie Gallo and Taylor Reader for their assistance with data collection and coding.

# REFERENCES

Dormal, V., Dormal, G., Joassin, F., and Pesenti, M. (2012). A common right fronto-parietal network for numerosity and duration processing: an fMRI study. Hum. Brain Mapp. 33, 1490–1501. doi: 10.1002/hbm.21300



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Hamamouche, Keefe, Jordan and Cordes. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Dilation and Constriction of Subjective Time Based on Observed Walking Speed

#### Hakan Karşılar<sup>1,2</sup>, Yağmur Deniz Kısa<sup>1</sup> and Fuat Balcı<sup>1,3</sup> \*

<sup>1</sup> Department of Psychology, Koç University, Istanbul, Turkey, <sup>2</sup> Department of Psychology, Özyeğin University, Istanbul, Turkey, <sup>3</sup> Koç University Center for Translational Medicine, Istanbul, Turkey

The physical properties of events are known to modulate perceived time. This study tested the effect of different quantitative (walking speed) and qualitative (walking-forward vs. walking-backward) features of observed motion on time perception in three complementary experiments. Participants were tested in the temporal discrimination (bisection) task, in which they were asked to categorize durations of walking animations as "short" or "long." We predicted the faster observed walking to speed up temporal integration and thereby to shift the point of subjective equality leftward, and this effect to increase monotonically with increasing walking speed. To this end, we tested participants with two different ranges of walking speeds in Experiment 1 and 2 and observed a parametric effect of walking speed on perceived time irrespective of the direction of walking (forward vs. rewound forward walking). Experiment 3 contained a more plausible backward walking animation compared to the rewound walking animation used in Experiments 1 and 2 (as validated based on independent subjective ratings). The effect of walking-speed and the lack of the effect of walking direction on perceived time were replicated in Experiment 3. Our results suggest a strong link between the speed but not the direction of perceived biological motion and subjective time.

#### Edited by:

Kielan Yarrow, City, University of London, United Kingdom

#### Reviewed by:

Micha Pfeuty, Université de Bordeaux, France; Hulusi Kafaligonul, Bilkent University, Turkey

> \*Correspondence: Fuat Balcı fbalci@ku.edu.tr

#### Specialty section:

This article was submitted to Perception Science, a section of the journal Frontiers in Psychology

Received: 27 August 2018 Accepted: 29 November 2018 Published: 21 December 2018

#### Citation:

Karşılar H, Kısa YD and Balcı F (2018) Dilation and Constriction of Subjective Time Based on Observed Walking Speed. Front. Psychol. 9:2565. doi: 10.3389/fpsyg.2018.02565

Keywords: biological motion, speed, psychophysics, temporal bisection, time perception

## INTRODUCTION

Given that accurate timing is essential for the preparation and execution of most motor responses (see Buhusi and Meck, 2005), it can be implicitly assumed that perception of time is highly accurate across situations irrespective of what is being timed. However, it has been shown that changes in a stimulus' properties such as its size, brightness, numerosity or loudness can also modulate time perception (e.g., Thomas and Cantor, 1976; Xuan et al., 2007; Eagleman and Pariyadath, 2009). In line with theories that assume a shared mechanism for the perception of various magnitudes by adhering to a common representational metric (e.g., time, numerosity, space; Walsh, 2003), perceived time also changes in the same direction with the changes in other stimulus properties. In other words, as one perceptual dimension is experimentally increased (e.g., loudness) so does the perceived duration of that stimulus (e.g., Berglund et al., 1969).

The relationship between motion and time perception has also been well documented (Brown, 1995; Kaneko and Murakami, 2009), where an increase in speed can lead to overestimations of durations, and vice versa (Matthews, 2011). Since motion can inherently be defined in terms of change per unit time (Poynter, 1989), it has been theorized that the larger amount of change experienced by the timing agent whilst observing faster motion or higher temporal and spatial frequencies (i.e., events happening more frequently across time and the level of detail in a stimulus per degree of visual angle, respectively) may in fact act as a proxy for the passage of time, and therefore lead to the observed overestimation of durations (Brown, 1995; Kanai et al., 2006; Kaneko and Murakami, 2009).

A prominent information-theoretic approach to modeling these variations in timing behavior generally assumes an internal clock (Treisman, 1963; Gibbon et al., 1984) with three hypothetical components: (1) a pacemaker-accumulator unit which generates and counts pulses, (2) a reference memory unit where the total number of pulses representing the timed interval is encoded, and (3) a decision component which compares the current number of pulses in the pacemaker-accumulator unit (i.e., working memory) to a random sample drawn from the reference memory unit in order to arrive at a temporal judgment (Gibbon et al., 1984). Thus, depending on the task used, stimuli with higher speeds could, for instance, speed up the pacemaker, thereby leading to longer perceived durations as a result of a higher number of pulses being registered per unit time in the accumulator (Zakay and Block, 1997; Wearden, 1999). On the other hand, a similar stimulus may lead to inadvertent attentional lapses which may lead to some of the pulses not getting registered in the accumulator (e.g., Penney, 2003; Karşılar and Balcı, 2016), thereby leading to shorter perceived durations (for a review see Allman et al., 2014). Examples of increases in the pacemaker rate (in addition to those mentioned above) have been shown in response to fast click-trains presented before timing a duration (Penton-Voak et al., 1996), higher body temperature (Wearden and Penton-Voak, 1995), emotional stimuli (Droit-Volet et al., 2004), auditory as opposed to visual timing stimuli (Wearden et al., 1998), physical activity/motion (Sayalı et al., 2018), as well as those manifested in terms of drug effects (see Coull et al., 2011 for a review).
On the other hand, variations in perceived time due to attentional modulation have generally been shown in dual-task paradigms (e.g., Thomas and Weaver, 1975) where attentional resources are directed away from timing (Fortin and Rousseau, 1987; Macar et al., 1994), as well as in the oddball paradigm, where the durations of unexpected stimuli are perceived as longer than those of stimuli that were expected in a given trial (e.g., Tse et al., 2004; see Brown, 2008 for a review).
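The two routes described above (a change in pacemaker rate vs. pulses lost to attentional lapses) can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the Poisson pacemaker, the independent per-pulse registration probability `p_register`, and all rates and trial counts are hypothetical choices made for illustration, not parameters taken from the studies cited.

```python
import random


def accumulated_pulses(duration_s, pacemaker_rate_hz, p_register, rng):
    """Count the pulses registered in the accumulator during one interval.

    Assumptions (illustrative only): pulses are emitted as a Poisson
    process, and each pulse is registered independently with probability
    p_register, which stands in for attention to time.
    """
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.expovariate(pacemaker_rate_hz)  # time to next pulse
        if elapsed > duration_s:
            return count
        if rng.random() < p_register:
            count += 1


def mean_count(duration_s, pacemaker_rate_hz, p_register, n_trials=2000, seed=0):
    """Average accumulator count over many simulated trials."""
    rng = random.Random(seed)
    total = sum(
        accumulated_pulses(duration_s, pacemaker_rate_hz, p_register, rng)
        for _ in range(n_trials)
    )
    return total / n_trials


# Same physical duration (2 s) under three hypothetical regimes.
baseline = mean_count(2.0, pacemaker_rate_hz=20.0, p_register=1.0)
faster_clock = mean_count(2.0, pacemaker_rate_hz=25.0, p_register=1.0)
attention_lapses = mean_count(2.0, pacemaker_rate_hz=20.0, p_register=0.8)
```

In this sketch a faster pacemaker yields more accumulated pulses (a longer perceived duration), whereas lapses in registration yield fewer (a shorter perceived duration), which is the opposite-direction prediction that allows the two accounts to be told apart.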

The neural energy model of timing, which does not contain a pacemaker-accumulator architecture, also readily accounts for the stimulus-property dependent findings outlined earlier. For instance, based on the series of findings that showed reduced neural activity (known as repetition suppression; e.g., Wark et al., 2007) and shorter duration estimates for repeated stimuli (Pariyadath and Eagleman, 2007), Pariyadath and Eagleman (2007, 2008) proposed that the strength of neural responses reflecting metabolic costs of neural information-processing (referred to as neural energy) could be the determinant of the perceived duration. In support of this claim, manipulations of various stimulus properties that are known to lead to longer temporal judgments (e.g., flicker rate, brightness, size) are also known to lead to higher neural signals in the corresponding brain areas (for review see Eagleman and Pariyadath, 2009).

It has been suggested that disparate neural/cognitive systems might be recruited with regard to the perception of animacy vs. inanimacy as well as the biological plausibility vs. implausibility of the observed stimulus (Caramazza and Shelton, 1998; Peuskens et al., 2005; Blake and Shiffrar, 2007; Shi et al., 2010; Zago et al., 2011). As such, research on the relationship between perception of time and perception of motion has been further distinguished in relation to these variables. For instance, the presentation durations of still images of running postures are judged to have lasted longer compared to images of standing postures (Yamamoto and Miura, 2012), while timing of still images that imply human movement are more precise than those with no implied motion (Moscatelli et al., 2011). Relatedly, the presentation durations of still images which show more intense actions that imply having taken longer (and more effort) to complete (Nather and Bueno, 2011), or words implying an action with faster average speed (e.g., "gallop"; Zhang et al., 2014) are generally judged to have lasted longer compared to their counterpart experimental conditions (but see Orgs et al., 2011, for an alternate account).

On the other hand, based on the now-well-documented finding that perception of biological vs. non-biological stimuli recruits different neural structures (Downing et al., 2001; Giese and Poggio, 2003), still other researchers have shown that the modulation of perceived time induced by observing a moving stimulus is directly mediated by the biological nature/plausibility of the observed action (Watanabe, 2008; Wang and Jiang, 2012; Lacquaniti et al., 2014). Similar results have been obtained with stimuli showing animate (i.e., not implied) vs. inanimate motion in real time (e.g., Carrozzo et al., 2010; Carrozzo and Lacquaniti, 2013).

In addition to the information-processing model based "internal clock speeding up due to higher arousal" account outlined above, discussion of results demonstrating a temporal bias in response to changes in biological stimulus properties has adhered to higher-order sensory-motor processes (Yamamoto and Miura, 2012), such as an effect of cognitive embodiment of perceived stimulus properties on perceived durations (Droit-Volet et al., 2013; Zhang et al., 2014). These effects are thought to result from the simultaneous cortical "simulation" of observed actions (Nather and Bueno, 2011; Chen et al., 2013), the potential underlying structures of which employ mirror neurons (Cattaneo and Rizzolatti, 2009). In other words, the ease with which a participant can cognitively simulate the action being observed might act as a mediator in perceiving its temporal properties, where those actions with which the timing agent is more familiar (i.e., regular walking as opposed to a nonbiological action) can induce a stimulus-dependent bias in interval timing (see section "Discussion"). The mechanisms associated with "biological motion" also appear to be specific to certain brain areas. For instance, posterior superior temporal sulcus, fusiform face area, and occipital face area (Grossman and Blake, 2002) as well as premotor frontal areas (Saygin, 2007) have been associated with the processing of biological motion. Although the middle temporal area (MT) sends input to some of these areas (e.g., superior temporal sulcus), its activity is not specifically modulated by the biological nature of motion (Grossman et al., 2000; see also Grossman and Blake, 2002 for other differentiations).

Overall, these studies support a directional relationship between perception of motion and the perception of time. The timing mechanism appears to be susceptible to the perceived speed of actual movement, as well as the implied speed embedded within still images (i.e., no actual physical change per unit time). We hypothesized that the length of perceived durations would increase parametrically with increased observed walking speed. Consequently, smaller differences in walking speed (Experiment 1) were expected to lead to a less pronounced effect of walking speed on perceived time, as opposed to larger effects due to larger differences among levels of the same variable (Experiments 2 and 3). Moreover, we expected larger effects when participants timed forward walking as opposed to backward walking motion, in addition to observing higher precision with which durations are timed in the forward walking condition due to it being a more familiar form of motion (i.e., processed more readily) in comparison to backward walking (Moscatelli et al., 2011; Loeffler et al., 2017). While previous research has conclusively demonstrated an effect of low-level motion on time perception (e.g., Kaneko and Murakami, 2009), no study so far has utilized straightforward representations of human motion through the use of stick-figure actions to test for presumably more readily embodied effect of motion on perceived durations. Using an easily-discernible type of walking motion allows for testing the robustness and the sensitivity of the reported effect of basic motion on time perception when it is embedded in high-level (i.e., biological) motion. Additionally, perceiving biological motion might typically take place at supra-second intervals, since multiple elements are presumably patched together over longer-than-sub-second intervals to perceive various elements of the observed motion, such as its speed, agency and intentionality. 
Below we describe three experiments, all of which utilized the temporal bisection task, which entails categorizing experienced durations as short or long based on their subjective similarity with the short and long reference durations. As such, all three experiments tested time perception at the supra-second level with a large number of participants in order to contribute to the generalizability of earlier effects to larger samples and different procedures. Our results show that the walking speed has a parametric effect on perceived durations, irrespective of the direction of motion (i.e., forward vs. backward), supporting the first of our hypotheses, and not the second one. Importantly, direction of walking motion was chosen in this study as a variable primarily for purposes of operationalizing the qualitative familiarity and unfamiliarity to type of motion (i.e., forward vs. backward walking; see Viviani et al., 2011; Maffei et al., 2014) and not to make inferences about the possibly differential timing of biological motion per se. In essence, we use the term "biological motion" not as a methodological term (i.e., point light displays) as originally suggested by Johansson (1973) but rather as life motion, namely the "visual motion that expresses any sort of aspect characteristic for the motion of living beings" (Troje, 2013, p. 4).

# EXPERIMENT 1

## Methods

#### Participants

Thirty-four participants (11 male, M<sub>age</sub> = 21.8) were tested in Experiment 1. Participants received 1 course credit for their participation in Experiment 1. All experiments were approved by the Institutional Review Panel for Human Subjects of Koç University and were in accordance with the Declaration of Helsinki. All participants provided written consent for their participation. Two participants in Experiment 1 were excluded from the analyses due to more than two excluded fits (see section below "Data Analysis").

#### Stimuli and Apparatus

Stimuli used in both experiments consisted of animations of a walking stick-figure (approx. height = 10 cm, i.e., 9.5 deg. of visual angle) composed of black lines for limbs and torso, as well as a black circle for the head (**Figure 1**; see **Supplementary Materials** for animations). The animations consisted of the stick-figure walking on a rectangular white background, which was placed on a black canvas that encompassed the entire screen. All stimuli and instructions were presented on a 21" LCD screen (60 Hz refresh-rate) on an Apple iMac G4 computer, generated in Matlab using the PsychToolbox extensions (Brainard, 1997). Participants sat at a distance of approximately 60 cm from the screen, in a dimly lit room with no chin-rest or other restrictions. Responses were provided by using a mechanical keyboard (Zalman ZM-K500).
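The reported visual angle follows from the stimulus height and the viewing distance. The short sketch below applies the standard visual-angle formula to the values given above; the function name is a hypothetical helper, and the 60 cm distance is itself only approximate because no chin-rest was used.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of a given size
    viewed from a given distance: 2 * atan(size / (2 * distance))."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Stick-figure height of about 10 cm viewed from about 60 cm.
stick_figure_deg = visual_angle_deg(10.0, 60.0)
print(round(stick_figure_deg, 1))  # 9.5
```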

One "cycle" of the walking animation consisted of two steps taken by the stick-figure, where the posture in the last frame was set so as to continue with the posture in the first frame, allowing us to perceptually wrap-around the walking motion to present it for as long as necessary. The center of the stick figure did not move on the x-axis, which gave the impression of a simultaneously moving camera at a right angle, while small movements of the body on the y-axis represented the characteristic bouncing motion of natural human walking. At 50 frames per second (fps), one cycle (i.e., two steps) lasted 1.5 s, which was considered to be the normal (baseline) speed of walking. Three distinct walking speeds were then produced by modulating the fps of the walking animation (40, 50, and 63 fps), each of which lasted for one of 6 probe durations (1.0, 1.5, 2.0, 2.5, 3.0, and 3.5 s). Hence, lower fps values led to slower and higher fps values led to faster walking speeds. Consequently, higher fps rates for a given duration, or longer durations for a given fps rate, meant that more step cycles (albeit partially) were presented in the animation. Finally, mirror animations were prepared by rewinding (i.e., reversing) the walking action in each animation where the stick-figure walked backward, serving as the less plausible/less familiar walking condition.
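The relationship between frame rate, cycle duration, and the number of step cycles shown can be made concrete with a small calculation. This sketch assumes that walking speed was changed purely by playing the same frame sequence at a different rate; the 75 frames per cycle is derived here from the stated baseline (1.5 s per cycle at 50 fps) rather than reported directly, and the function names are hypothetical.

```python
# Derived from the baseline: 50 frames/s * 1.5 s per cycle = 75 frames.
FRAMES_PER_CYCLE = 50 * 1.5


def cycle_duration_s(fps):
    """Duration of one walking cycle (two steps) at a given frame rate."""
    return FRAMES_PER_CYCLE / fps


def cycles_shown(fps, probe_duration_s):
    """Number of (possibly partial) walking cycles shown in one animation."""
    return probe_duration_s * fps / FRAMES_PER_CYCLE


for fps in (40, 50, 63):
    print(fps, round(cycle_duration_s(fps), 3), round(cycles_shown(fps, 3.5), 2))
```

Under this assumption the slow (40 fps) animation takes 1.875 s per cycle, so a given probe duration contains fewer step cycles than at 50 or 63 fps.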

#### Bisection Procedure

#### **Training**

Each session started with the presentation of two anchor durations at the offset of a space button press (short = 1 s, long = 3.5 s), represented by the presentation duration of a circular mottled texture (white, gray, black; 140, 10, 0.2 cd/m<sup>2</sup> in brightness, respectively; approx. 8 cm/7.6 visual deg. in diameter). Ten training trials then ensued, in which the participants' task was to report if the duration of the automatically presented circular texture was the short or the long one (5 random trials each). A trial was repeated if an incorrect categorization was given by the participant. The buttons denoting a "short" or a "long" response were randomly assigned in each session. Each participant attended a single session, which lasted 50–60 min. Participants were instructed not to count or use any other chronometric methods, an instruction that has been reported to be sufficient to prevent counting (Rattat and Droit-Volet, 2012).

#### **Test**

After 10 correct responses in the training trials, the experimental block commenced, in which the participants' task was to categorize the six probe durations of walking animations as closer to the "short" or "long" anchor durations. Three walking speeds were employed: 40, 50, 63 fps. The animations started with the press of the space button. Once the animation ended, the participant was prompted to respond after a stimulus-to-response interval sampled from an exponential distribution with a mean of 0.5 s and a lower bound of 0.2 s. All possible combinations of walking speeds (3 levels), probe durations (6 levels), and walking direction (2 levels) were randomly presented 12 times, leading to a total of 432 trials per session. No feedback was given after responding either for the reference or intermediate durations.
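The factorial design described above can be sketched as a shuffled trial list. This is not the authors' Matlab/PsychToolbox code; the names are hypothetical, and because the text does not say whether the 0.2 s lower bound was imposed by truncation or by shifting the distribution, the resampling used here is an assumption.

```python
import itertools
import random

SPEEDS_FPS = (40, 50, 63)
PROBE_DURATIONS_S = (1.0, 1.5, 2.0, 2.5, 3.0, 3.5)
DIRECTIONS = ("forward", "backward")
REPETITIONS = 12


def build_trials(rng):
    """Return every speed x duration x direction cell 12 times, shuffled."""
    cells = list(itertools.product(SPEEDS_FPS, PROBE_DURATIONS_S, DIRECTIONS))
    trials = cells * REPETITIONS
    rng.shuffle(trials)
    return trials


def response_delay_s(rng, mean_s=0.5, lower_bound_s=0.2):
    """Draw a stimulus-to-response interval.

    Assumption: the lower bound is enforced by resampling (truncation),
    which raises the effective mean above mean_s; the source does not
    specify how the bound was applied.
    """
    while True:
        delay = rng.expovariate(1.0 / mean_s)
        if delay >= lower_bound_s:
            return delay


rng = random.Random(1)
trials = build_trials(rng)
print(len(trials))  # 432
```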

#### Data Analysis

Mean percentage of "long" responses was plotted as a function of the six probe durations for each combination of walking speed and walking direction conditions, thereby forming six sigmoidal psychometric functions per participant (see **Figure 2**, top panel, for fits to average data). Standard two-parameter cumulative Weibull distribution functions were fit to these data. The parameters of fits with adjusted-R-squared values less than 0.70 (7% of the cases) were substituted by a random value that was drawn from the sample distribution. Points of subjective equality (PSE; the duration at which a short and a long response was equally likely) were calculated as the median of the Weibull fits. We were primarily interested in potential leftward or rightward shifts of the PSE values as a function of experimental conditions, which would typically be interpreted in terms of an increase or a decrease in clock speed (i.e., perceived time), respectively. We also calculated Weber Ratios (WR), a measure of the steepness of the psychometric function that refers to the sensitivity with which the probe durations are categorized. WR values were calculated by dividing the difference limen (half the difference between the durations at which p(long) = 0.75 and p(long) = 0.25) by the PSE. A higher WR value indicates that the participant had lower sensitivity (a more difficult time) categorizing the durations as short or long.

… (circle) and dashed green (square) lines denote slow and fast walking speeds, respectively. Standard Error of the Mean of individually calculated Points of Subjective Equality (PSE) are marked with horizontal lines of identical colors as the walking speed condition. Weber Ratios (WR) are provided as insets.
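The PSE and WR definitions used in this section can be written out for a two-parameter cumulative Weibull function. The sketch below is an illustration only: the brute-force least-squares grid search stands in for whatever fitting routine was actually used (the text does not specify one), the function names are hypothetical, and the data are synthetic values generated from known parameters rather than participant responses.

```python
import math


def weibull_cdf(t, scale, shape):
    """Two-parameter cumulative Weibull: p('long') at probe duration t."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def weibull_quantile(p, scale, shape):
    """Duration at which the fitted function reaches probability p."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def pse(scale, shape):
    """Point of subjective equality: the median of the fitted function."""
    return weibull_quantile(0.5, scale, shape)


def weber_ratio(scale, shape):
    """Difference limen (half the 0.25-0.75 spread) divided by the PSE."""
    limen = (weibull_quantile(0.75, scale, shape)
             - weibull_quantile(0.25, scale, shape)) / 2.0
    return limen / pse(scale, shape)


def fit_weibull(durations, p_long):
    """Least-squares grid search (an assumed, deliberately simple method)."""
    best = None
    for i in range(100, 401):          # scale from 1.00 to 4.00 s
        scale = i / 100
        for j in range(10, 201):       # shape from 1.0 to 20.0
            shape = j / 10
            sse = sum((weibull_cdf(t, scale, shape) - p) ** 2
                      for t, p in zip(durations, p_long))
            if best is None or sse < best[0]:
                best = (sse, scale, shape)
    return best[1], best[2]


# Synthetic example: proportions generated from known (hypothetical) parameters.
durations = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
p_long = [weibull_cdf(t, 2.5, 5.0) for t in durations]
scale_hat, shape_hat = fit_weibull(durations, p_long)
print(round(pse(scale_hat, shape_hat), 2), round(weber_ratio(scale_hat, shape_hat), 3))
```

A leftward shift of the fitted curve lowers the PSE, while a shallower curve (smaller shape parameter) raises the WR.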

A two-way repeated measures ANOVA with walking speed (3 levels; Exp 1: 40, 50, 63 fps; Exp 2: 25, 50, 100 fps) and walking direction (2 levels; forward and backward) as within-subject factors, and PSE values as the dependent variable, was conducted. In addition to conventional frequentist repeated-measures ANOVAs, we also report the results of these tests' Bayesian counterparts, for which we report Bayes factors (BF<sub>10</sub>; the strength of evidence the data provide for the alternative hypothesis compared to the null hypothesis; e.g., Wagenmakers et al., 2018b). BF<sub>10</sub> values between 1–10, 10–100, and 100–300 are interpreted as providing weak to moderate, moderate to strong, and strong to decisive evidence for the alternative hypothesis, respectively. Conversely, the inverses of these factors (1/BF<sub>10</sub>, i.e., BF<sub>01</sub>) provide evidence for the null hypothesis in line with the same rule-of-thumb ranges (Jeffreys, 1961; Goodman, 1999). We used the JASP 0.9.0.1 open-source software with default priors for all Bayesian tests (see JASP Team, 2018; Wagenmakers et al., 2018a). In the manuscript, we indicate the model that has the highest Bayes factor with respect to the null model, except for the model that contains an interaction, which is instead compared to the likelihood of the model that contains the main effects of the factors that were also included in the interaction model.
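The rule-of-thumb ranges above amount to a simple lookup. The sketch below encodes them as stated; the function name, the handling of values that fall exactly on a boundary, and the "decisive" label for values above 300 (which follows how such values are described in the Results) are assumptions made for illustration.

```python
def describe_bf10(bf10):
    """Map a Bayes factor BF10 onto the verbal ranges used in this article.

    Returns (favored_hypothesis, evidence_label). Values below 1 are
    inverted (BF01 = 1 / BF10) and read as evidence for the null.
    """
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    favored = "alternative" if bf10 >= 1 else "null"
    strength = bf10 if bf10 >= 1 else 1.0 / bf10
    if strength < 10:
        label = "weak to moderate"
    elif strength < 100:
        label = "moderate to strong"
    elif strength <= 300:
        label = "strong to decisive"
    else:
        label = "decisive"
    return favored, label


print(describe_bf10(5.0))   # ('alternative', 'weak to moderate')
print(describe_bf10(0.02))  # ('null', 'moderate to strong')
```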

## RESULTS

Our analysis of data from Experiment 1 showed a main effect of walking speed [F(2,62) = 47.04, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.60], and no main effect of walking direction [F(1,31) = 0.34, p = 0.55], or an interaction between the two variables [F(1.57,48.52) = 2.34, p = 0.11, Greenhouse-Geisser corrected]. Post hoc analyses showed that the difference between all walking speeds reached significance (M<sub>Slow</sub> = 2.38, M<sub>Normal</sub> = 2.25, M<sub>Fast</sub> = 2.02; all ps < 0.01, see **Figure 2**). The Bayesian two-way ANOVA revealed an identical pattern of results where the data provided decisive evidence for the walking speed model over the null as the best model (BF<sub>10</sub> > 300).
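The reported effect size can be cross-checked from the F statistic and its degrees of freedom, since partial eta squared equals F × df<sub>effect</sub> / (F × df<sub>effect</sub> + df<sub>error</sub>). The helper below is a hypothetical convenience function, not part of the original analysis pipeline.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of walking speed in Experiment 1: F(2,62) = 47.04.
eta_speed = partial_eta_squared(47.04, 2, 62)
print(round(eta_speed, 2))  # 0.6
```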

Identical repeated measures ANOVAs with WR as the dependent variable and walking speed and walking direction as the independent variables were conducted. In terms of their effects on WR values, neither the main effects nor their interaction reached significance (all ps > 0.05). The same pattern of results was observed with Bayesian analyses where the Bayes factors did not yield more than anecdotal evidence for any of the main or interaction effect models (all BF<sub>10</sub> < 1).

# EXPERIMENT 2

We repeated Experiment 1 with a new group of participants who were tested with a larger range of walking speeds. This allowed us to replicate the first experiment as well as to observe if larger differences between walking speeds indeed lead to larger differences in subjective time compared to Experiment 1. Additionally, by using unnaturally fast and slow movement speeds as timed stimuli, Experiment 2 had the potential to reveal any effects of walking direction which may have been masked in the previous experiment, where fast and slow walking speeds were still within the interpretive limits of normal biological action.

#### Methods

#### Participants

Thirty-two participants were tested in Experiment 2 (10 male, M<sub>age</sub> = 21.2) and received 12 Turkish Liras for their participation.

#### Stimuli and Apparatus

The stimuli, apparatus, and the procedure used in Experiment 2 were identical to those used in Experiment 1 except that participants were tested with 25, 50, and 100 fps in Experiment 2 (as opposed to 40, 50, and 63 fps in Experiment 1). Data analyses were identical to Experiment 1. The parameters of fits with adjusted-R-squared values less than 0.70 (4% of the cases) were substituted by a random value that was drawn from a distribution with identical parameters as the sample distribution.

#### RESULTS

Data from Experiment 2 showed the identical pattern of results, with a larger size of the significant main effect compared to Experiment 1; namely a main effect of walking speed [F(1.34,41.64) = 105.44, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.77], and no main effect of walking direction [F(1,31) = 2.32, p = 0.14], or an interaction between walking speed and walking direction [F(2,62) = 0.60, p = 0.55]. Again, identical with Experiment 1, post hoc analyses in Experiment 2 showed that the difference between all walking speeds reached significance (M<sub>Slow</sub> = 2.72, M<sub>Normal</sub> = 2.24, M<sub>Fast</sub> = 1.77; all ps < 0.001). Bayesian analyses affirmed these findings, where the data provided decisive evidence for walking speed as the best model over the null model (BF<sub>10</sub> > 300).

Different from Experiment 1, a frequentist ANOVA revealed a significant main effect of walking speed on WR [F(2,62) = 7.48, p = 0.001] in Experiment 2. Post hoc analyses showed that WR values in the slow walking condition (M = 0.151) were significantly lower compared to the normal (M = 0.176) and fast (M = 0.188) walking conditions (both ps < 0.05), while the latter two conditions did not differ significantly from each other (p > 0.05). These findings were confirmed by the Bayesian ANOVA, which provided moderate evidence for the walking speed model over the null model (BF<sub>10</sub> = 4.87) in terms of its effect on the WR values.

Finally, we aimed to see if the effects were more prominent with larger differences in walking speed as manifested by the experimental paradigm (i.e., as in Experiment 2 compared to Experiment 1). Thus, the data gathered in both experiments were subjected to a mixed ANOVA with walking speed and walking direction as two within-subjects factors, test group as the between-subjects factor (2 grouping levels; Experiment 1 and Experiment 2), and PSE as the dependent variable. Results showed a main effect of walking speed [F(1.61,99.94) = 150.93, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.71], in addition to an interaction between walking speed and the grouping factor [F(2,124) = 30.23, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.33], while there were no main effects of walking direction or the grouping variable or any other interaction effects among the factors (all ps > 0.05; **Figure 3**). Post hoc independent samples t-tests showed that, in both the forward and backward walking conditions, the PSE values in the slow and fast walking speed conditions in Experiment 2 were significantly higher and lower than those in Experiment 1, respectively (all ps < 0.05, Holm-Bonferroni corrected, see **Figure 3**), consistent with the condition means reported for the two experiments, whereas there were no differences among the normal walking speed conditions in either direction (both ps > 0.05). Complementary analyses using the Bayesian method in a mixed-ANOVA setup revealed the strongest evidence for the interaction model between walking speed and the grouping variable among all models. This interaction model was decisively preferred to the walking speed and the grouping variable main effects model (both BF<sub>10</sub> and BF<sub>inclusion</sub> > 300).

the 0.05 level. Error bars show standard error of the mean.

# EXPERIMENT 3 AND STIMULUS RATINGS

Previous research has shown that the detection of actual backward walking and digitally rewound forward walking require different types of cognitive competences (Viviani et al., 2011), and are likely perceived according to distinct perceptual factors and cognitive dynamics. This is evidenced by the fact that the simplest form of walking, in mechanistic terms, is in fact a sequence of "falling" forward by shifting the center of the body mass, followed by a precisely timed catching of oneself before falling over or tripping, and then repeating this cycle with the contralateral limbs. As such, although walking backward utilizes the same mechanistic principle (i.e., temporarily shifting the center of mass, but this time backward), a different orchestration of a sequence of musculoskeletal coordination (and therefore a different observable movement) is utilized for relocating the body toward a backward position. Therefore, we conducted a third experiment to replicate Experiment 2 with stimuli embedded with more plausible backward walking action animations compared to those used in Experiment 1 and Experiment 2, in which backward walking animation was prepared by mimicking actual animations of forward moving profiles (see **Supplementary Materials**). We had these new animations rated by an independent group of participants in terms of how plausible they appeared.

## Methods

#### Participants

Twenty-nine participants (11 male, M<sub>age</sub> = 21.1) attended the Stimulus Rating experiment, and a different group of 25 participants were tested in Experiment 3 (9 male, M<sub>age</sub> = 21.8). Participants received 1 course credit in Experiment 3 and 0.5 course credit for their participation in the Stimulus Rating experiments. All experiments were approved by the Institutional Review Panel for Human Subjects of Koç University. All participants provided written consent for their participation.

#### Stimuli and Apparatus

#### **Stimulus Rating**

Realistic backward walking animations were prepared using the same software methods used in Experiments 1 and 2 for preparing forward walking animations, this time by observing animations of humans purposefully walking backward in an upright position. Separate backward walking animations were prepared in keeping with all walking speed (3 levels) and probe duration (6 levels) conditions used in Experiment 2. Stimulus presentation and response collection methods, as well as other experimental conditions were identical to those in Experiments 1 and 2.

#### **Experiment 3**

All apparatus and response collection methods were identical to those used in Experiment 1 and 2. All "rewound backward walking" stimuli from Experiment 2 were replaced by the "realistic backward walking" animations in Experiment 3 (see section above "Stimulus Rating").

#### Procedure

#### **Stimulus Rating**

Each trial consisted of the presentation of two consecutive backward walking animations, each of which started at the onset of a space button press by the participant. On each trial, one of the animations was a backward walking animation generated by rewinding forward walking (as in Experiment 1 and 2), and the other was the novel, more realistic backward walking animation. The order of the animations was counterbalanced across trials. The two animations matched in their fps parameters (25, 50, or 100) and presentation durations (1.0, 1.5, 2.0, 2.5, 3.0, or 3.5 s). Combined levels of both variables (fps and duration) were counterbalanced and animations depicting each combination were presented twice in a single 20–25-min-long session. Presentation of the first animation was followed by a blank screen and an inter-stimulus interval (ISI) drawn from an exponential distribution with a mean of 0.5 s and a lower bound of 0.2 s. The ISI was followed by the written instruction to "Press the space button to see the second animation." Immediately after the end of the second animation, the participant was asked to state which of the two animations (i.e., former or the latter) had the more plausible form of backward walking. The button press was followed by a response-to-stimulus interval with identical randomly distributed delays as in the ISI.

#### **Experiment 3**

All procedures, variables and parameters were identical to Experiment 2.

## RESULTS

#### Stimulus Rating

Choice (as more familiar/plausible) proportions for the rewound backward and realistic backward walking animations were calculated. A one-sample t-test comparing choice proportions for the novel stimuli (over the rewound version) to chance level (i.e., 50%) revealed that the participants were significantly more likely to choose the novel backward walking stimuli over the rewound backward walking stimuli used in Experiment 2 as more plausible [M = 58.3, SD = 19.1, t(28) = 2.34, p < 0.05]. A Bayesian one-sample t-test showed that the observed choice proportions for the novel stimuli were 2.03 times more likely under the alternative hypothesis than under chance-level responding.
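The reported t statistic can be reproduced from the summary statistics alone, using t = (M − μ<sub>0</sub>) / (SD / √n). The function below is a hypothetical helper for this check and assumes that the reported SD is the sample standard deviation of the 29 raters' choice percentages.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """One-sample t statistic and degrees of freedom from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Choice percentage for the novel animation vs. the 50% chance level.
t_value, df = one_sample_t(58.3, 19.1, 29, 50.0)
print(round(t_value, 2), df)  # 2.34 28
```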

#### Temporal Bisection Experiment

Analyses identical to those in Experiment 2 were performed, with identical inclusion/exclusion criteria and parameter substitution methods. The PSE and WR values were calculated for each participant by fitting standard two-parameter cumulative Weibull distribution functions to individual data. As in Experiments 1 and 2, this procedure was carried out for each combination of walking speed and walking direction conditions.

A two-way repeated measures ANOVA with walking speed (3 levels; 25, 50, 100 fps) and walking direction (2 levels; forward and backward) as within-subject factors, and PSE values as the dependent variable, was conducted. Results showed a main effect of walking speed [F(1.34,32.2) = 89.38, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.79, Greenhouse-Geisser corrected], and no main effect of walking direction [F(1,24) = 0.99, p = 0.34], or an interaction between the two variables [F(2,48) = 1.45, p = 0.26; see **Figure 4**]. Post hoc analyses showed that, identical with the results of Experiments 1 and 2, the difference between all walking speeds reached significance, where the fastest walking speed led to the lowest PSE followed by the normal and the slow walking speed conditions (M<sub>Slow</sub> = 2.74, M<sub>Normal</sub> = 2.33, M<sub>Fast</sub> = 1.89; see **Figure 4**). There was a monotonic relationship between walking speed and PSE. As with Experiments 1 and 2, Bayesian ANOVAs with identical variables confirmed the results of the traditional ANOVAs such that the walking speed model was preferred against the null as the best model (BF<sub>10</sub> > 300).

Identical repeated-measures ANOVAs with WR as the dependent variable and walking speed and walking direction as the independent variables were conducted. None of the main effects or the interaction between the variables was found to be significant (p > 0.05). These findings were confirmed by the Bayesian ANOVA, which provided no evidence for any of the main or interaction effect models (BF<sub>10</sub> < 1).

# GENERAL DISCUSSION


We conducted three experiments in which participants' task was to categorize six durations of animations depicting a stick-figure, walking forward or backward, at three different walking speeds. The forward-backward walking direction was added to the current study as a variable in a similar vein to previous studies which assigned backward walking the unique property of representing "unfamiliarity" of a given biological motion (i.e., Viviani et al., 2011; Maffei et al., 2014), any natural form of which could be considered "familiar" unless artificially manipulated. The first two experiments differed only in the degree of difference between the faster and the slower walking speed. When data from the first two experiments were examined separately, as well as in conjunction, our results suggested that subjective time dilates with faster observed walking speed and it constricts with slower observed walking speed. On the other hand, the direction in which the stick-figure walked (forward or backward) did not have an effect on perceived time, and it did not interact with walking speed in any of the experiments. In a third experiment, we tested participants with a more naturalistic backward walking animation (as opposed to rewinding forward motion as in the first two experiments) and replicated these findings.

There are two primary mechanisms through which subjective time can be modulated within the "pacemaker-accumulator" theoretic framework: (1) changes in the pacemaker rate and (2) changes in the probability with which pacemaker signals are integrated in the accumulator (Penney, 2003). In relation to our experimental manipulation, faster walking speed can be assumed to either increase the pacemaker rate (e.g., due to arousal) or lead to a decrease in attention to time (e.g., due to divided attention), and vice versa for slower walking speeds. Under the first possibility (i.e., change in pacemaker rate), subjective time would be expected to dilate with faster walking speed, while the opposite prediction would be made if the effects were on attention to time. To this end, our results directly support the effect of observed walking speed on pacemaker rate. Importantly, walking speed had a parametric effect on clock speed: compare the effect sizes in Experiment 1 with those in Experiments 2 and 3, which used different degrees of deviation between walking speeds (see **Figures 3, 4**).

Interval timing models which employ such a switch component also assert the possibility of stimulus effects on switch closure (timing onset) and opening (timing offset) latencies (Gibbon et al., 1984; Zakay and Block, 1995; Wearden et al., 1998). An increase in switch closing or opening latency would lead to under- or over-estimation of perceived durations, respectively, whereas simultaneous increases in both would cancel each other out, leading to no discernible effects (e.g., Wearden et al., 2007; Bratzke et al., 2017). Our results could potentially be explained by an effect of faster motion on switch closure latency, or vice versa. Since we have used one range of durations in this study, we cannot separate the additive effects that would be induced by switch closure latency from the proportional effects that would result from clock speed effects. Comprehensively elucidating an additive switch-based effect on perceived time veiled in our data remains a fertile methodological challenge for future research.

The behavioral effects observed in this study can also be readily accounted for by the neural energy model of timing (Pariyadath and Eagleman, 2007, 2008) since, based on prior work (e.g., Kaufmann et al., 2000), observing faster walking speeds would be expected to lead to stronger neural activation (i.e., more neural processing), which, in the light of the neural energy model, would lead to longer time estimates. The experimental design and tools utilized in the current study, however, cannot distinguish between these two different theoretical accounts. On the other hand, the lack of an effect of walking direction on perceived time contradicts our hypothesis that was derived from the embodied cognition perspective. According to the embodiment perspective, cognition regarding real-world objects is time-pressured and is body-oriented (Anderson, 2003). We rarely interact with backward walking motion in real life and are therefore less familiar with it compared to forward (i.e., regular) walking. Based on this rationale, we expected the effect of walking speed to be more prominent in the forward walking condition than the backward walking condition, which was not the case, even though participants were indeed able to recognize a biologically plausible backward motion compared to an implausible one (see Experiment 3, Stimulus Rating). Briefly, the lack of an effect of walking direction in the light of the previous studies does not support the embodied cognition account of our findings.

On a different level, our results can be interpreted from two perspectives; one view assumes that temporal and spatial information processing are independent and the other view assumes that temporal and spatial information processing can be coupled. According to the first approach, one can think that the effect of visual stimulus (such as the flickering presentation of visual input) would be via the stimulus-dependent arousal-based modulation of the central clock mechanism as discussed above (e.g., Droit-Volet and Wearden, 2002). The second approach is supported by work that shows that time perception can be modulated by adaptation to visual properties in a spatially localized fashion, which points at the effects at the level of sensory information processing (e.g., Johnston et al., 2006).

Although our study does not allow us to differentiate between these two accounts, as part of the second theoretical framework, one can speculate regarding the possible neural mechanisms that mediate the modulation of time perception by the observed walking speed (i.e., biological motion). One of the possibilities is that these effects are mediated by the "When Pathway" containing the right parietal cortex (i.e., the inferior parietal lobe, IPL) that is assumed to process event timing bilaterally in the visual field (Battelli et al., 2007). The parietal lobe has indeed been shown to be associated with event timing in both monkey (Leon and Shadlen, 2003; Morrone et al., 2005 for LIP involvement) and human work (Husain et al., 1997). In this pathway, visual information is assumed to be relayed from V1 to the middle temporal visual area (MT), which is involved in the perception of speed and direction of motion, irrespective of whether the motion is biological (Krekelberg and van Wezel, 2013; Liu and Newsome, 2005). Motion-related information from MT is then relayed to multiple areas including the right IPL, right angular gyrus, supramarginal gyrus, and posterior superior temporal sulcus (Battelli et al., 2007). Among the types of motion, biological motion has been argued to be one of the most prominent signals processed in this high-level attention-based system that contains the IPL as well as other areas such as the posterior superior temporal sulcus (Grossman et al., 2000, 2005; Grossman and Blake, 2002; Battelli et al., 2003, 2007). It might be through this very neural pathway and its functional overlaps that observed biological motion and time interact as robustly as they did in our work.

Motion-related signals (biological or not) from MT modulate the activity of neuronal populations also in the lateral intraparietal area (LIP; Mazurek et al., 2003), which in and of itself has been shown to be involved in interval timing (e.g., Leon and Shadlen, 2003; Janssen and Shadlen, 2005; Jazayeri and Shadlen, 2015). If timing mechanisms are similar to those in perceptual decision making (e.g., Simen et al., 2011; Balcı and Simen, 2016), where MT neurons are associated with momentary motion whereas LIP neurons integrate those motion-related signals over time (as in the context of two alternative random dot motion discrimination), one would expect the rate of temporal integration to be closely coupled with the speed of observed motion. This constitutes another neural pathway that might mediate the effect of observed motion (biological or not) on perceived time. Note that these arguments have been typically made in relation to the timing of relatively short intervals and their extension to longer scale time such as those utilized in our study requires further work (e.g., see Coull and Nobre, 1998; Bruno and Cicchini, 2016).

Another potential mechanism for the modulation of time perception by biological motion is through the premotor frontal areas that have been implicated both in biological motion (Saygin, 2007) as well as time perception (e.g., Mita et al., 2009). Within this framework, the connections between superior temporal sulcus and premotor areas (e.g., Luppino et al., 2001; Yoshida et al., 2011) might support the modulation of perceived time by the speed of biological motion. Lastly, the temporo-occipital junction (TOJ) has been shown to be activated by unfamiliar compared to familiar walking scenes (Maffei et al., 2014). The lack of an effect of walking direction in the current study suggests that the TOJ is probably not recruited with regard to the interaction between perceived walking speed and perceived time, narrowing down the possible neuroanatomical basis of the effect. Future neuroimaging and neuromodulation studies would help differentiate between these different implementational possibilities.

Weber's ratio, in its simplest form, has been suggested to be constant when timing different durations with a constant (i.e., non-modulated) clock-speed (Gibbon, 1977; Grondin, 2001), except for very short (Getty, 1975) or very long durations (Bizo et al., 2006); neither of these ranges includes the durations used in the current study. Simulations conducted based on the decision rules as outlined in Wearden and Ferrara (1995) and the linear modulation of Poisson clock speed by the walking speed showed that WR should remain nearly constant for all walking speeds. Consistent with this prediction, WRs were relatively constant across conditions in Experiments 1 and 3; however, they increased as a function of walking speed in Experiment 2. Therefore, our findings regarding WRs also supported the predictions of the clock-speed modulation account of the effect of observed walking speed.
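The logic of this prediction can be illustrated with a minimal sketch. It is not the simulation reported above: it replaces trial-by-trial simulation with a normal approximation to a Poisson counter, uses a simple midpoint decision rule rather than the full Wearden and Ferrara (1995) model, and defines WR as half the 25%-75% spread divided by the PSE. The pacemaker rate, the anchor durations, and the speed multipliers `k` are hypothetical values chosen only for illustration.

```python
import math

RATE = 20.0                 # hypothetical baseline pacemaker rate (pulses per second)
SHORT, LONG = 1.0, 4.0      # hypothetical anchor durations in seconds


def p_long(t, k):
    """Probability of a "long" response to a probe of duration t (s).

    k is the clock-speed multiplier (k > 1: faster clock). Anchors are assumed
    to be stored at baseline speed; the probe is counted at rate k * RATE.
    """
    criterion = RATE * (SHORT + LONG) / 2.0   # midpoint between the anchor counts
    mean = k * RATE * t                       # expected pulse count for the probe
    sd = math.sqrt(mean)                      # Poisson counter: variance equals mean
    z = (mean - criterion) / sd               # normal approximation to the Poisson
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def duration_at(p, k, lo=1e-6, hi=100.0):
    """Bisection search for the duration at which p_long equals p."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if p_long(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def pse_and_wr(k):
    """Return (PSE, WR) for clock-speed multiplier k."""
    pse = duration_at(0.5, k)
    difference_limen = (duration_at(0.75, k) - duration_at(0.25, k)) / 2.0
    return pse, difference_limen / pse


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):   # hypothetical slow, normal, and fast clock speeds
        pse, wr = pse_and_wr(k)
        print(f"k={k}: PSE={pse:.3f} s, WR={wr:.4f}")
```

Under these assumptions, a faster clock shifts the PSE toward shorter durations while both the PSE and the spread of the psychometric function scale with 1/k, so their ratio is unchanged.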

In all of our experiments, we modulated fps values in order to increase/decrease the speed at which the stick-figure seemed to move. Importantly, a higher fps stimulus (our fast walking condition) by definition employs more frames that are presented to the participant per unit time. In relation to theories of timing that emphasize "perceived change per unit time" as the fundamental index of perceived duration (Poynter, 1989), it can be argued that it wasn't the high speed of movement per se that altered perceived durations in our paradigm, but rather the number of frames perceived by the participant per unit time. However, given that all of the simulation videos used in our study were presented with upward of 24 frames per second, beyond which most participants perceive continuous motion (e.g., Condon and Ogston, 1966; Haggard and Isaacs, 1966), such an argument seems implausible. Nonetheless, this possibility could be tested for by keeping the frame rate constant (e.g., 50 fps) among speed conditions in a future study.

Our experimental manipulation of walking speed was implemented in a fashion isolated from other visual correlates at the background scene (i.e., the rate of change in the background visual scene). This limited the ecological validity of the stimulus manipulation since natural visual processing of objects typically occurs in the presence of complex backgrounds. An experimental design that contains conditions with (a) a constant walking speed coupled with different rates of change in the background visual scene, (b) the rate of change in the background visual scene congruent with the change in walking speed, and (c) the rate of change in the background visual scene incongruent with the change in walking speed would allow future research to capture the differential effect of rate of change in the visual scene on time perception. In such experimental settings, we would expect the observed effects on time perception to be enhanced in visually congruent conditions and diminished in incongruent conditions provided that the participants process the scene (e.g., visual flow) together with the figure. Under this rationale, for a constant walking speed the rate of change in the background visual scene could also be an independent determinant of alterations in time perception. However, given the object-based visual attentional processing and the fact that various brain regions are differentially involved in the processing of biological and non-biological motion (e.g., Grossman et al., 2000), we would also expect the walking speed of the attended agent to have the dominant modulatory effect on time perception. Lastly, the current study employed no eye-tracking-based visual restrictions throughout the trials so as not to prevent voluntary exploration of the stimuli during the timing of presented intervals. Saccadic eye movements are known to affect perceived time (Yarrow et al., 2001; Morrone et al., 2005; Burr et al., 2010; Suzuki and Yamazaki, 2010; Karşılar and Balcı, 2016; Penney et al., 2016) and therefore future studies could test similar effects to ours by forcing some type of foveal fixation at the center of the stimulus, or by allowing fixations only within the area encompassed by the size of the presented videos. Future studies with such experimental designs are needed to complement our understanding of the effect of observed motion on time perception.

Relatedly, all of our experiments employed stimuli depicting a simple walking motion performed by an animated human-like agent, none of which showed an effect of walking direction on perceived time. As mentioned above, biological plausibility is possibly linked to the mechanism by which an object is timed. Therefore, a future study that tests how self-governing, non-biological motion stimuli (as opposed to backward movement used in the current study) are timed in contrast to stimuli depicting biological motion (i.e., walking), could further elucidate the mechanism by which this modulation of time perception was achieved in the current study. As such, it is possible that the backward walking motion used in our experiments failed to tap into the mechanism by which non-biological/unnaturally moving stimuli are processed (Maffei et al., 2014), which is why it might have exerted no discernible effect on perceived time, as opposed to what was hypothesized. We find such an investigation particularly relevant to our overarching research question since the motor system would be more likely to imitate the biological motion due to higher structural overlap between the human motor system and the observed stimulus (for detailed discussion see Wilson, 2001; Shiffrar and Heinen, 2011) and thereby better extract information regarding the motion-related state of the observed stimulus as a result of its stronger embodiment (Loula et al., 2005). Neuroscientific evidence in related fields further bolsters the relevance of addressing this issue as the brain areas (e.g., premotor areas and cerebellum) that have been implicated in the processing of human movement (Stevens et al., 2000; Saygin et al., 2004; Saygin, 2007) are also known to be involved in interval timing (Merchant et al., 2013).

As a final note, most types of biological motion used in experimental settings (including many different forms of walking) are typically represented either by video recordings of actual actors, or by point-light animations (Johansson, 1973) which present the action in terms of coherently moving nodes/joints (see Grosbras et al., 2012 for a comprehensive review). While stimuli composed of video recordings benefit from high fidelity in terms of biological plausibility of the observed motion, these types of stimuli suffer from potential embodiment-related confounds depending on the (dis)similarity between the actor and the timing agent. On the other hand, point-light animations (e.g., Saygin et al., 2004; Watanabe, 2008) sidestep this problem by utilizing a more "symbolic" and flexible expression of biological motion with an otherwise invisible actor projected over a static background, which effectively omits all potential confounds such as color, shape, preconceived biases etc. However, point-light stimuli arguably lack some ecological validity, since timing of motion stimuli entails perception of almost all aspects of the observed organism and not just a sub-component of implied coherent motion vectorized in terms of moving dots. The stimuli used in the current study were, in principle, closer to point-light walker animations compared to video recordings; yet unlike their counterpart, they concretely and visibly represented the human motion in its entirety, including the action of the limbs, torso and the head (see **Supplementary Materials** for animations). To the best of our knowledge, these types of stick-figure stimuli have never been utilized in the context of timing. This methodological novelty, together with the relatively prominent effects, which increase parametrically with the experimental manipulation of observed walking speed, opens the possibility for future studies to employ other similar forms of animation (including 3-dimensional stimuli embedded within virtual or augmented reality environments), which in turn could more accurately elucidate the mechanism by which observing (or interacting with) some form of biological or non-biological motion could exert its effects on how humans perceive the "flow" of time.

## DATA AVAILABILITY STATEMENT

The datasets analyzed for this study can be found in the Drive Folder URL: https://tinyurl.com/ycc8cgkr.

## AUTHOR CONTRIBUTIONS

HK, YK, and FB contributed equally to the conception and design of the study. HK and YK prepared the stimuli and collected the data. HK organized the database. HK and FB performed the statistical analyses. All authors contributed to the first and revised draft of the manuscript, read, and approved the submitted version.

## FUNDING

This study was supported by a New Agendas for the Study of Time and TÜBA (Turkish Academy of Sciences) GEBİP 2015 grant to FB.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02565/full#supplementary-material

DATA SHEET S1 | The model comparison tables for all Bayesian analyses conducted.

PRESENTATION S1 | The stimuli used in all experiments.

## REFERENCES

Jeffreys, H. (1961). Theory of Probability, 3rd Edn. Oxford: Oxford University Press.


Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Percept. Psychophys. 14, 201–211. doi: 10.3758/BF03212378

Johnston, A., Arnold, D. H., and Nishida, S. (2006). Spatially localized distortions of event time. Curr. Biol. 16, 472–479. doi: 10.1016/j.cub.2006.01.032



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Karşılar, Kısa and Balcı. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# 1-s Productions: A Validation of an Efficient Measure of Clock Variability

Sarah C. Maaß<sup>1,2</sup>\* and Hedderik van Rijn<sup>1</sup>

<sup>1</sup>Department of Experimental Psychology, University of Groningen, Groningen, Netherlands, <sup>2</sup>Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands

Objective: Clock variance is an important statistic in many clinical and developmental studies. Existing methods require a large number of trials for accurate clock variability assessment, which is problematic in studies with clinical populations or with young or aged participants. Furthermore, these existing methods often implicitly convolute clock and memory processes, making it difficult to disentangle whether the clock or the memory system is driving the observed deviations. Here we assessed whether 20 repeated productions of a well-engrained interval (1 s), a task that incorporates neither memory updating nor the processing of feedback, could provide an accurate assessment of clock variability.

#### Edited by:

Trevor B. Penney, The Chinese University of Hong Kong, China

#### Reviewed by:

Sven Thoenes, Leibniz-Institut für Arbeitsforschung an der TU Dortmund (IfADo), Germany

Darren Rhodes, Nottingham Trent University, United Kingdom

\*Correspondence: Sarah C. Maaß s.c.maass@rug.nl

Received: 30 September 2018 Accepted: 10 December 2018 Published: 21 December 2018

#### Citation:

Maaß SC and van Rijn H (2018) 1-s Productions: A Validation of an Efficient Measure of Clock Variability. Front. Hum. Neurosci. 12:519. doi: 10.3389/fnhum.2018.00519

Method: Sixty-eight undergraduate students completed two tasks: a 1-s production task, in which they were asked to produce a 1-s duration by ending a tone with a keypress, and a multi-duration reproduction task. Durations presented in the reproduction task were tones lasting 1.17, 1.4 and 1.68 s. No feedback was presented in either task, and the order of presentation was counterbalanced between participants.

Results: The observed central tendency in the reproduction task was better explained by models including the measures of clock variability derived from the 1-s production task than by models without it. Three clock variability measures were calculated for each participant [standard deviation, root mean squared residuals (RMSRs) from an estimated linear slope, and RMSR scaled by mean production duration]. The model including the scaled RMSR was preferred over the alternative models, and no notable effects of the order of task presentation were observed. These results suggest that: (1) measures of variability should account for drift; (2) the presentation of another timing task before a 1-s production task did not influence the assessment of the clock variability; and (3) the observed variability adheres to the scalar property and predicts temporal performance, and is thus a usable index of clock variability.

Conclusion: This study shows that just 20 repeated productions of 1 s provide a reliable index of clock variability. As administering this task is fast and easy, it could prove to be useful in a large variety of developmental and clinical populations.

Keywords: interval timing, precision, clock variance, individual differences, clinical populations

# INTRODUCTION

Estimating and reproducing short intervals in the hundreds of milliseconds to seconds range is central to a wide range of behaviors. Irrespective of the theoretical framework, this type of timing is assumed to be driven by an internal time source, or clock, and memory traces of previously experienced intervals. Because of this dyad, variations in interval timing proficiency can either be driven by changes in the accuracy of the clock, or by variations in the efficacy of the memory mechanisms. Interestingly, deviations in timing performance observed in clinical populations are often attributed to variations in clock precision and accuracy (for a review see Allman and Meck, 2011). For example, patients with Alzheimer's disease demonstrate increased timing variability in a bisection task using sub-second intervals (Caselli et al., 2009), and similar observations have been reported for patients with bipolar disorder (Bolbecker et al., 2014) and Parkinson's disease (Pastor et al., 1992; Harrington et al., 1998; Malapani et al., 1998). Furthermore, performance of autistic children is poorer in temporal discrimination tasks when compared with healthy, age-matched controls (Karaminis et al., 2016). Changes in temporal accuracy are, however, not limited to clinical conditions. Even during normal, healthy aging, temporal precision declines, observable as more variable timing estimations and a general decrease in accuracy (for a review see Paraskevoudi et al., 2018).

Even though these phenomena are often explained in terms of deviations in clock variance, the impairment of temporal precision may also be clock-unspecific. For example, individuals with schizophrenia display greater variability in a rhythmic tapping task, potentially caused by larger timing variability (Carroll et al., 2009). Yet, these effects have also been attributed to procedural learning (Da Silva et al., 2012), or the inability to synchronize to external events (Wilquin et al., 2018). Thoenes and Oberfeld (2017) propose general cognitive deficiencies as a potential explanation for the impaired temporal performance in individuals with schizophrenia.

Distinguishing between clock-based and more general deficiencies is difficult as most tasks that are used to assess the precision of the clock implicitly rely on a convolution of clock and memory processes. Typically, the precision of interval timing is indexed by a rhythmic tapping task or by calculating a Weber fraction (e.g., Harrington et al., 1998; Karaminis et al., 2016), a measure indicating the minimal proportional change for a changed stimulus to be discernible from the original ("just noticeable difference"). The Weber fraction is typically calculated from a psychometric function derived from a bisection or discrimination task. In both tasks, participants have to learn either one or two comparison durations or "anchors" during the scope of the experiment. Performance in both tasks thus depends on generating the correct memory representations during the experiment itself. Moreover, calculating an accurate psychometric function requires a relatively large number of trials, making it unsuitable for populations with shorter attention spans or for those that are more easily fatigued. The rhythmic tapping task on the other hand is closely linked to motor performance, a process that undergoes a separate decline in aging and certain clinical conditions (Paraskevoudi et al., 2018; PD: Jankovic, 2008). Based on similar considerations, Paraskevoudi et al. (2018) proposed that "more appropriate methods for detecting the accuracy and imprecision signatures of a slower clock are verbal estimation tasks, production tasks, and unpaced finger tapping tasks, which presumably reflect the internal tempo in its pure form" (page 11). Here, we present a first validation of a pure clock variability measure that does not incorporate memory updating during the experiment, nor the processing of feedback, and can be quickly administered.

In our study, participants were asked to produce 20 1-s intervals by ending a machine-started tone with a key press, without feedback or prior presentation of the defined interval. As 1-s intervals are likely to be highly familiar or trained to the participants, we assumed a stable internal representation and thus attributed the variability observed in the repeated production of this interval to clock variability. The most straightforward way of determining the precision of the repeated 1-s duration is by calculating a standard deviation. This way of determining precision assumes that the accuracy of the 1-s estimation remains the same over the 20 repeated productions. However, human performance on simple tasks such as continuation tapping (e.g., Lemoine et al., 2006) is known to be subject to drifts, especially over the shorter sequences used in this study (Wagenmakers et al., 2004). These drifts are defined as slow changes of the running mean; for example, participants might speed up or slow down during the course of the experiment. Even though there is disagreement in the literature about the patterns best describing long-term dependencies, it is clear that an accurate measurement of precision in the 1-s production task should account for potential drifts.
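The three variability measures named in the abstract (standard deviation, root mean squared residuals from an estimated linear slope, and RMSR scaled by mean production duration) can be sketched as follows. This is an illustrative reading rather than the authors' analysis code: it assumes that the linear slope is an ordinary least-squares fit of produced duration against trial index and that the RMSR divides the summed squared residuals by the number of trials. The function name `clock_variability` is hypothetical.

```python
import math
import statistics


def clock_variability(productions):
    """Return SD, RMSR, and scaled RMSR for a sequence of produced durations (s)."""
    n = len(productions)
    trials = list(range(n))
    mean_duration = statistics.mean(productions)
    sd = statistics.stdev(productions)        # plain SD: ignores any drift

    # Ordinary least-squares line of duration against trial index (assumed drift model)
    mean_trial = statistics.mean(trials)
    sxx = sum((x - mean_trial) ** 2 for x in trials)
    sxy = sum((x - mean_trial) * (y - mean_duration)
              for x, y in zip(trials, productions))
    slope = sxy / sxx
    intercept = mean_duration - slope * mean_trial

    # Root mean squared residuals around the fitted line (drift removed)
    residuals = [y - (intercept + slope * x) for x, y in zip(trials, productions)]
    rmsr = math.sqrt(sum(r * r for r in residuals) / n)

    return {"sd": sd, "rmsr": rmsr, "scaled_rmsr": rmsr / mean_duration}


if __name__ == "__main__":
    # Hypothetical participant who slows down steadily but is otherwise noise-free
    drifting = [1.0 + 0.01 * i for i in range(20)]
    print(clock_variability(drifting))
```

In this hypothetical example the standard deviation is inflated by the steady slowing, whereas the residual-based measures stay near zero, which is the reason a drift-corrected measure is preferable for short sequences.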

Noise is ubiquitous in human information processing (Faisal et al., 2008), and earlier work has demonstrated that humans use knowledge of their sensory variability to produce estimates that are optimal in the context of the task (e.g., Körding and Wolpert, 2004; Murai and Yotsumoto, 2018): the noisier the incoming information on a particular trial, the stronger the influence of the expectation that was built up during previous trials. Thus, the temporal precision measures derived from the 1-s task should determine how much a learnt temporal context influences the performance on a specific trial. We therefore asked participants to also complete a reproduction task of three different durations. A typical phenomenon observed in these tasks is a central-tendency effect (Hollingworth, 1910, often referred to as Vierordt's law; see Lejeune and Wearden, 2009, for a discussion). This tendency of judgments of quantities to gravitate towards their mean, irrespective of whether it is in terms of spatial distances, durations, or any other perceptual quantity, is a highly robust perceptual effect (Jazayeri and Shadlen, 2010; Petzschner and Glasauer, 2011; Wiener et al., 2016). This effect is typically explained by assuming that the internal representation of a currently observed quantity is a mixture of the actually perceived quantity and a memory representation of all previous quantities. In the context of this study, the consequence of this "central tendency" driven by previously experienced durations is that durations shorter than this central tendency are overestimated, whereas those longer than it are underestimated.

This highly robust phenomenon that demonstrates the influence of memory on perception (for reviews of memory influences on timing, see Shi and Burr, 2016 or van Rijn, 2016) can be captured in terms of Bayesian Inference (Jazayeri and Shadlen, 2010; Shi et al., 2013; see also Taatgen and van Rijn, 2011), in which the perceived duration (sensory likelihood) is integrated with previously perceived intervals that are stored in the reference memory (prior). The key feature of this process for the purpose of this work is that a narrower (or less noisy) likelihood will yield a smaller regression to the mean, and, vice versa, a stronger prior will result in more central tendency, and that the likelihood is driven by the precision of the interval timing processes. For example, highly trained professional musicians, such as percussionists, reproduce auditory durations to perfection, indicative of an extremely peaked likelihood, and resulting in a relatively small influence of the prior (Cicchini et al., 2012). Inversely, as aged participants exhibit more uncertainty, indicative of a wide likelihood, the influence of the prior increases, causing a stronger reliance on prior memories (Turgeon et al., 2016). In other words, the individual differences observed in the 1-s task should correlate with the central tendency observed in the reproduction task. However, both tasks rely on the production or reproduction of intervals in a similar time range. It is therefore possible that sequential effects can be observed (note that these are at a more global level than the trial-by-trial sequential effects discussed in, for example, Taatgen and van Rijn, 2011; Dyjas et al., 2012; Di Luca and Rhodes, 2016). We tested for this possibility by counterbalancing the order in which the production and reproduction task were administered.
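The relation between likelihood width and central tendency can be made concrete with a minimal sketch. It assumes Gaussian prior and likelihood, for which the posterior mean is a precision-weighted average of the measured duration and the prior mean. It is an illustration of the general principle, not the model fitted in this study: the prior standard deviation and the two likelihood widths are hypothetical, and the likelihood width is held constant across durations for simplicity.

```python
import statistics

DURATIONS = [1.17, 1.4, 1.68]            # presented durations in seconds (from the Method summary)
PRIOR_MEAN = statistics.mean(DURATIONS)  # prior centered on the mean presented duration
PRIOR_SD = 0.2                           # hypothetical prior width in seconds


def reproduce(measured, likelihood_sd, prior_mean=PRIOR_MEAN, prior_sd=PRIOR_SD):
    """Posterior mean for a Gaussian prior combined with a Gaussian likelihood."""
    # Weight on the current measurement: close to 1 for a precise clock,
    # close to 0 for a noisy clock
    weight = prior_sd ** 2 / (prior_sd ** 2 + likelihood_sd ** 2)
    return weight * measured + (1.0 - weight) * prior_mean


if __name__ == "__main__":
    for likelihood_sd in (0.05, 0.30):   # hypothetical precise vs. noisy clock
        estimates = [round(reproduce(d, likelihood_sd), 3) for d in DURATIONS]
        print(f"likelihood SD {likelihood_sd}: {estimates}")
```

With the wider likelihood, the estimates for the shortest and longest durations are pulled more strongly toward the prior mean, which corresponds to the stronger central tendency expected for participants with higher clock variability.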

In summary, we set out to test whether clock variability can be assessed with a short production task consisting of 20 repeated productions of a 1-s interval, and we hypothesized that a measure of variability derived from this task should predict the amount of central tendency observed in a multi-duration reproduction task.

# MATERIALS AND METHODS

### Participants

Sixty-eight undergraduate students from the University of Groningen completed the experiment in exchange for course credit. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Psychology Ethical Committee of the University of Groningen. We excluded a total of five participants based on their performance in either the production or reproduction task. One participant was excluded based on failing to follow task instructions (98% of trials failing to meet the inclusion criteria for the Reproduction task discussed below; average response time during the reproduction task was 288 ms). For the production task, four participants were excluded due to large deviations in produced durations, even after excluding the first two startup trials (i.e., out of the 67 remaining participants, 63 did not produce any intervals longer than 3 s, whereas two participants produced intervals of over 3 s on five trials, one participant on 10 trials, and one on 15 trials). In total, 63 participants remained for further analyses (mean age: 21.4, range: 17–54, SD: 5.8, 43 female).

## Apparatus

A MacBook Pro 13'' (2011) controlled all experimental events. Auditory stimuli were presented through headphones (Sennheiser, HD280 Pro), with volume adjusted to comfortable levels. The experiment was programmed using Psychtoolbox-3 (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007) in Matlab R2014b.

## Procedure

The production task consisted of 20 trials. To prevent a rhythmic sequence, each trial commenced with an intertrial interval (ITI) with the presentation of a fixation cross "+" for a random duration between 2 s and 3 s sampled from a uniform distribution. Then a "?" appeared on the screen and simultaneously a 440 Hz pure tone started. Participants were asked to indicate when they thought 1 s since the onset of the tone had passed by pressing the spacebar (see **Figure 1**). As counting has been shown to increase the precision of duration judgments (Thönes and Hecht, 2017), participants were instructed to refrain from counting or keeping track of time in any other way (e.g., tapping). This instruction has been shown to prevent influences of chronometric timing (Rattat and Droit-Volet, 2012).

The reproduction task consisted of two blocks of 120 trials each. Each trial consisted of the presentation of a duration, and the reproduction of that duration (see **Figure 2**). The durations presented were 1.17, 1.4 or 1.68 s long. Each trial commenced with an ITI of a random duration between 2 s and 3 s sampled from a uniform distribution during which a fixation cross ("+") was presented in the center of the screen. Then a "!" appeared on the screen for 700 ms to prepare the subjects for the presentation of the duration. Following this, the duration was presented by means of a 440 Hz pure tone that lasted for the duration associated with the current trial. Within each block of 120 trials, all three durations were presented 40 times, in random order. After completion of the tone, an interstimulus interval (ISI) of 1.5 s was presented with a "?" displayed on screen. Then another 440 Hz pure tone was started. The task was to press the spacebar when the earlier presented duration had passed. To test for order effects, 31 participants performed the task in "production–reproduction" order, and 32 in reversed order (based on parity of their sequential participant number).
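The trial structure of one reproduction block can be sketched as follows. This is not the authors' Psychtoolbox code; it is a hypothetical Python illustration of the schedule described above (three durations, 40 repetitions each, random order, ITI drawn uniformly from 2 s to 3 s), and the names `make_block`, `DURATIONS_S` and `REPS_PER_BLOCK` are assumptions.

```python
import random

DURATIONS_S = (1.17, 1.4, 1.68)   # presented durations, in seconds
REPS_PER_BLOCK = 40               # each duration appears 40 times per block


def make_block(rng):
    """Return one shuffled block of 120 trials with a random ITI per trial."""
    durations = [d for d in DURATIONS_S for _ in range(REPS_PER_BLOCK)]
    rng.shuffle(durations)
    # ITI sampled from a uniform distribution between 2 s and 3 s.
    return [{"duration_s": d, "iti_s": rng.uniform(2.0, 3.0)} for d in durations]


block = make_block(random.Random(2018))  # seed chosen arbitrarily
print(len(block), block[0])
```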

#### Statistical Analysis

For the reproduction task, we marked trials on which response times were shorter than 500 ms or longer than 2.5 s as outliers and removed them from the analyses. In total, 1.4% of all data points in the reproduction task were marked as outliers.
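The exclusion rule above amounts to a simple range filter. The sketch below is a hypothetical Python illustration (the published analysis was done in R); the function name `split_outliers` and the example response times are made up.

```python
def split_outliers(response_times_s, lower_s=0.5, upper_s=2.5):
    """Keep responses within [lower_s, upper_s]; return them and the share removed."""
    kept = [rt for rt in response_times_s if lower_s <= rt <= upper_s]
    removed_share = (len(response_times_s) - len(kept)) / len(response_times_s)
    return kept, removed_share


# Hypothetical response times in seconds.
kept, removed_share = split_outliers([0.288, 1.21, 1.45, 2.7, 1.66])
print(kept, removed_share)
```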

All analyses were performed in R, with the full script and data available at the OSF<sup>1</sup>. The Bayesian analyses were performed with the R package BayesFactor (version 0.9.12-4.2; Morey et al., 2015) using the default prior settings, and are interpreted based on the guidelines provided by Jeffreys (1961), as adapted by Lee and Wagenmakers (2014). The reported Bayes factors summarize the extent to which an observer's opinion of the tested variable should change based on the data. A Bayes factor of 1 indicates that the data are equally likely under both hypotheses and is therefore inconclusive. Bayes factors larger than 1 represent evidence for the alternative hypothesis of an influence of the tested independent variable on the dependent variable, and Bayes factors less than 1 represent evidence for the null hypothesis of no effect of the tested variable. For the Bayesian linear mixed effect models, we built models predicting centered estimated duration by the entered effects including participant as random factor. We then assessed the variable of interest by comparing the Bayes factor of the model including this variable with the Bayes factor associated with a model omitting this variable. To facilitate interpretation, we invert Bayes factors below 1 and describe in the text whether the Bayes factor is evidence for inclusion or exclusion of the factor. This way, all reported Bayes factors express the evidence for presence or absence of an effect as values progressively greater than 1.

<sup>1</sup>https://osf.io/bhe97/?view\_only=9c3ea605d482416691b43588fecf92f5
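The comparison and inversion convention described above can be written as a small sketch. It assumes that both models' Bayes factors are expressed against the same reference model, so that their ratio compares the two models directly; the function name `compare_models` and the example values are hypothetical and not taken from the authors' R script.

```python
def compare_models(bf_with, bf_without):
    """Compare a model including a factor with one omitting it.

    Assumes both Bayes factors are relative to the same reference model.
    Returns a value >= 1 together with the direction of the evidence.
    """
    ratio = bf_with / bf_without
    if ratio >= 1.0:
        return ratio, "inclusion"
    # Invert values below 1 so that evidence is always reported as >= 1.
    return 1.0 / ratio, "exclusion"


print(compare_models(10.0, 2.0))
print(compare_models(2.0, 10.0))
```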

# RESULTS

#### Production Task: Accuracy

As no feedback was given during the production task, the average produced durations provide an index of the accuracy of the internal representation of 1-s veridical time. The first two trials were excluded, as discussed in the next section. **Figure 3** depicts the average produced durations per participant and the resulting distributions as violin plots. The bar graphs depict the mean produced durations for both order conditions (0.98 s, SE = 0.08, when the production task preceded the reproduction task, and 0.76 s, SE = 0.07, for the inverse order). A Bayesian linear effect model indicated that there was no conclusive evidence either in favor of or against an effect of order (BF = 1.46 ± 0.01%). However, the accuracy of the internal representation of a 1-s interval is of secondary relevance in the context of this study, as the purpose of the 1-s production task was to assess clock variability instead of (veridical) accuracy.

#### Production Task: Precision

Assuming that the internal representation of a 1-s interval is firmly encoded in long-term memory, and thus not likely to be affected by a small number of production trials without feedback, the variance observed in the 1-s productions reflects the trial-by-trial clock variability. The mean trial-by-trial estimates are depicted in **Figure 4**, plotted separately for the two order conditions. As can be seen in this figure, the first trial is associated with very long average responses, and both the first and second trial have noticeably bigger error bars than the following trials. In all subsequent analyses (and the analyses reported in the previous section), we have therefore considered these two initial trials to be "start-up trials," and excluded them from further analysis.

FIGURE 3 | Violin plots depicting the distributions of the 1-s productions and the individual participant means, separately for the order conditions (Pr: Production, Repr: Reproduction). The inset depicts the average duration of the 1-s productions, including error bars representing standard errors of the mean with the within-participants Cousineau-Morey correction applied (Morey, 2008).

Reproduction). Error bars represent standard errors of the mean with the within-participants Cousineau-Morey correction applied.

The most straightforward measure of this variability is the variance or standard deviation. **Figure 5** shows, again for both order conditions, the standard deviation of the final 18 trials of the sequence of 20 trials. A Bayesian linear model provided anecdotal evidence against order having an influence on the deviation expressed in SD (BF = 2.23 ± 0%).

The standard deviation is an appropriate measure if the noise can be assumed to be centered around a fixed mean. However, if the repeated samples are drawn from a distribution of which the mean changes over time, the standard deviation calculated assuming a fixed mean will overestimate the true variance as the shift in mean will increase the standard deviation. In the context of this task, if the internal representation of a 1-s interval shifts, the standard deviation will overestimate the clock's noisiness. **Figure 6** depicts the 1-s production data of a single participant. As can be seen, this participant shows a slow drift from productions of around 2 s to productions of 1 s, but actually shows relatively little variance around this estimated trend. Assuming slow drifts in performance, calculating a standard deviation would overestimate this participant's clock noise. Even though **Figure 4** might suggest an overall slope close to 0 (the mean of individual slopes is −0.0055), the range is relatively large (−0.08 to 0.04, reflecting a drift of −1,440 to 720 ms between trial 3 and 20). As there is decisive evidence for a drift (one sample Bayesian t-test on the absolute slopes as depicted in **Figure 6**, BF = 4.83 × 10<sup>4</sup> ± 0%), a reliable measure should account for drift when estimating clock variance. We therefore also calculated the root mean squared residuals (RMSRs) based on a linear regression predicting produced duration as a function of trial number (coded as 1–18, after removing the first two trials) fitted separately for each participant.

FIGURE 5 | Violin plots depicting observed standard deviation per participant, and the resulting continuous distributions, separately for the order conditions (Pr: Production, Repr: Reproduction). Inset depicts mean deviation with error bars representing standard errors of the mean with the within-participants Cousineau-Morey correction applied.

FIGURE 7 | Violin plots depicting observed deviation expressed in root mean squared residual (RMSR) from a linear fit estimated per participant, and the resulting continuous distribution, separately for the order conditions (Pr: Production, Repr: Reproduction). Inset depicts mean deviation with error bars representing standard errors of the mean with the within-participants Cousineau-Morey correction applied.
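The difference between the SD and the drift-corrected RMSR can be made concrete with a short sketch. It is a hypothetical Python illustration rather than the authors' R code: the function names are invented, the example productions are simulated, and dividing the summed squared residuals by the number of trials (rather than by the residual degrees of freedom) is an assumption.

```python
import math


def sample_sd(productions_s):
    """Standard deviation around a fixed mean (n - 1 in the denominator)."""
    mean = sum(productions_s) / len(productions_s)
    return math.sqrt(sum((x - mean) ** 2 for x in productions_s) / (len(productions_s) - 1))


def rmsr(productions_s):
    """Root mean squared residual around a per-participant linear trend over trials."""
    n = len(productions_s)
    trials = list(range(1, n + 1))            # trial number coded 1..n
    t_mean = sum(trials) / n
    x_mean = sum(productions_s) / n
    slope = (sum((t - t_mean) * (x - x_mean) for t, x in zip(trials, productions_s))
             / sum((t - t_mean) ** 2 for t in trials))
    intercept = x_mean - slope * t_mean
    residuals = [x - (intercept + slope * t) for t, x in zip(trials, productions_s)]
    # Assumption: mean of squared residuals taken over all n trials.
    return math.sqrt(sum(r * r for r in residuals) / n)


# Simulated participant drifting from about 2 s to about 1 s with little trial noise.
drifting = [2.0 - i / 17 + (0.03 if i % 2 else -0.03) for i in range(18)]
print(f"SD = {sample_sd(drifting):.3f} s, RMSR = {rmsr(drifting):.3f} s")
```

For such a drifting sequence the SD is dominated by the trend, whereas the RMSR reflects only the scatter around it.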

**Figure 7** depicts the RMSR for the two order conditions, analogous to the way the SD was plotted in **Figure 5**. As with the SD measures, a Bayesian linear model provided anecdotal evidence against order having an influence on the deviation expressed in RMSR (BF = 2.26 ± 0%).

#### Clock Variability

To assess which of the proposed measures is a better estimate of the clock variance, we assessed whether a participant's standard deviation or RMSR was a better predictor of the central tendency observed in a reproduction task. The data of the multi-duration reproduction task are graphically depicted in **Figure 8**.

For the analyses of these data, we centered both presented and reproduced duration by subtracting 1.4 s from the presented and reproduced durations. For the presented durations, this ensures that the predictors have mean 0, making it easier to interpret the effects of additional predictors. Moreover, and this holds for both presented and reproduced durations, it allows for easier interpretation of resulting coefficients. As a baseline model, we fitted a Bayesian linear mixed effect model, with participant as random factor, predicting centered estimated duration by centered presented duration. Evidence in favor of the more complex model that included presented duration was decisive when compared to a model just including an intercept (BF = 9.17 × 10<sup>1244</sup> ± 7.99%). Comparing this model to a model that also included experimental order provided decisive evidence against the more complex model (BF = 407.24 ± 14.59%). Based on the rationale that clock variability should influence the width of the likelihood, and as such the relative contribution of the prior, the best estimate of the veridical clock variability should best predict reproduced duration. Decisive evidence was obtained for the inclusion of each of the clock variance measures when compared to the simpler model that did not include any estimate of clock variance (BF = 4.48 × 10<sup>10</sup> ± 9.74% for the SD-based measure, and BF = 2.47 × 10<sup>13</sup> ± 9.75% for the RMSR-based measure), demonstrating that clock variability as estimated by a 1-s production task does indeed predict the amount of central tendency in a multi-duration reproduction task. This is depicted in **Figure 9**, where participants with higher clock variance also demonstrate a stronger central tendency effect. More importantly, a Bayesian model including RMSR is a decisively better predictor of the estimated durations than a model including the SD-based measure (BF = 5514.79 ± 7.9%), demonstrating the superiority of the new RMSR measure.

FIGURE 8 | Reproduced durations as a function of presented duration, plotted staggered on the horizontal axis, separately for the order conditions (Pr: Production, Repr: Reproduction). Error bars are standard errors of the mean with the within-participants Cousineau-Morey correction applied. The dotted line represents veridical time.

One of the central findings in interval timing is that temporal precision is relative to the duration of the interval being estimated, a phenomenon called the scalar property of variance. As **Figure 3** demonstrates, the average produced durations range from a couple of hundred milliseconds to approximately 2 s. Scaling the observed variances by the mean produced duration would result in a less biased estimate of the internal noisiness of the clock. To test this assumption, we fitted another Bayesian model that included the RMSR variance divided by the mean produced duration. This scaled model is a decisively better predictor of the estimated durations than a model including the non-scaled RMSR measure (BF = 4.15 × 10<sup>7</sup> ± 13.39%).
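The scaling step can be illustrated with a minimal sketch. The function name `scaled_rmsr` and the two example participants are hypothetical; the point is only that, under the scalar property, participants who differ in mean produced duration can have the same relative clock noise.

```python
def scaled_rmsr(rmsr_s, mean_produced_s):
    """Divide the RMSR by the mean produced duration (both in seconds)."""
    return rmsr_s / mean_produced_s


# Hypothetical participants: B produces intervals three times as long as A,
# with a threefold RMSR, so the relative noise is the same.
participant_a = scaled_rmsr(0.05, 0.5)
participant_b = scaled_rmsr(0.15, 1.5)
print(round(participant_a, 3), round(participant_b, 3))
```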

To assess the influence of the scaled RMSR measure on the regression towards the mean, we have plotted the estimated slope of the regression line as shown in, for example, **Figure 8**, against the scaled RMSR (see **Figure 10**). A slope of 1 would indicate a participant whose reproductions are not at all influenced by context, whereas a slope of 0 would indicate a participant who always reproduces the exact same duration, irrespective of the duration of the presented interval. Or, in Bayesian inference terms, a slope of 1 indicates complete reliance on the likelihood, whereas a slope of 0 indicates complete reliance on the prior. As expected, increased clock variance is associated with smaller slopes, and vice versa: results of the Bayesian correlation indicate extreme evidence (BF = 325.58 ± 0%) in favor of a large or moderate negative association between the scaled RMSR measure and the slope, expressing the regression towards the mean [r = −0.45, MAD = 0.11, 90% CI (−0.63, −0.25)].
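The slope used as an index of central tendency can be sketched as an ordinary least-squares fit of reproduced on presented duration. This is a hypothetical Python illustration, not the authors' analysis code; the function name `reproduction_slope` and the three example participants are invented.

```python
def reproduction_slope(presented_s, reproduced_s):
    """Least-squares slope of reproduced duration regressed on presented duration."""
    n = len(presented_s)
    p_mean = sum(presented_s) / n
    r_mean = sum(reproduced_s) / n
    covariance = sum((p - p_mean) * (r - r_mean) for p, r in zip(presented_s, reproduced_s))
    variance = sum((p - p_mean) ** 2 for p in presented_s)
    return covariance / variance


presented = [1.17, 1.4, 1.68]
veridical = presented                                   # reliance on the likelihood only
constant = [1.4, 1.4, 1.4]                              # reliance on the prior only
regressed = [1.4 + 0.5 * (p - 1.4) for p in presented]  # halfway towards the mean

for label, reproduced in (("veridical", veridical), ("constant", constant), ("regressed", regressed)):
    print(label, round(reproduction_slope(presented, reproduced), 3))
```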

# DISCUSSION

The goal of this article was to assess whether clock variability could be reliably measured in a span of a couple of minutes so that it can be applied in both prototypical experimental populations (i.e., young adults) and clinical or developmental studies in which complex or lengthy experiments are often problematic. We determined clock variability in a simple production task, and demonstrate that this variability predicts the central tendency observed in a multi-duration reproduction task. Following the rationale of Bayesian inference, this indicates that the measured variance in the production task is related to the width of the likelihood distribution in the multi-duration reproduction task, which has been associated with the noise in the clock parts of the temporal system (e.g., Shi et al., 2013). Thus, we can assume that the assessed variance in the production task, which takes less than 3 min to administer, is a reliable index of clock variance. Here, we will first discuss a number of methodological issues related to this paradigm and experiment, and then discuss a number of more theoretical considerations.

We asked the participants to produce 1-s durations by ending a tone 1 s after onset by a keypress. We explicitly opted for this duration as we assumed that typical participants will have a reasonably stable and well encoded representation of a second duration, due to the prevalence of this duration in everyday life. Because of this well ingrained duration, participants will hopefully be able to produce this duration without too much effort, and at the same time, it is unlikely that 20 repeated productions of this duration, without any external feedback, would cause noticeable changes in the internal representation.

Before conducting this experiment, we did not have specific information on whether it would be necessary to exclude a number of initial trials as "start-up trials," but as the first two trials were clearly associated with longer and more variable (between participants) produced durations, we categorized these first two 1-s productions as start-up trials. However, the population tested in this experiment are young adults who were well trained in participating in psychophysiological experiments, making it likely that more start-up trials might be needed when this paradigm is administered in other populations. Also, we considered the possibility that order effects might influence performance in either the production or the reproduction task. For example, after performing two blocks of the reproduction task, which contained durations between 1 s and 2 s, the 1-s estimates could be affected. However, **Figure 3** does not show any hints of the reproduction task influencing the 1-s production task (the numerical effects are in the opposite direction from what would be expected), and the Bayesian analyses did not provide any reliable evidence for a difference between the two order conditions. This indicates that the 1-s estimations are immune to perturbations from the multi-duration reproduction task used in this study.

To quantify the clock variability, we assessed the predictive power of different measures of variance. As the produced durations show clear drifts during the 20 trials even though no feedback was provided, standard measures of variance that assume a fixed mean would overestimate clock variability. We therefore fitted a linear model to the produced duration of each individual, and calculated a deviation measure by taking the RMSRs of this linear model. Obviously, it is not unlikely that the drift in produced durations follows a more complex pattern than is captured by a simple linear regression (see, for example, the discussions on short range dependencies in Wagenmakers et al., 2004). However, fitting more complicated patterns would quickly result in overfitting given the limited number of trials acquired in this task. We therefore refrained from estimating more complex patterns. When the different variability measures were contrasted, the linear-model-based RMSR measure provided the best prediction of the central tendency observed in a multi-duration reproduction task. As variance is known to be linearly related to the length of the durations produced (scalar property; see, e.g., Gibbon et al., 1984), we also tested the predictive power of a model in which we divided the RMSR by the produced durations. This scaled RMSR measure outperformed the non-scaled RMSR variance measure, indicating that the measured variance adhered to the scalar property. Bayesian models of time estimations (e.g., Jazayeri and Shadlen, 2010) have demonstrated that the perception of longer durations is reflected in wider likelihoods than those associated with shorter intervals. Here, we demonstrate that a similar effect can be observed between participants: the higher the variance of participants' performance during the 20 repeated 1-s productions, the more their reproduced durations will be affected by the prior.

One potential caveat of this method is that the observed variance in the 1-s production task may be driven by motor noise, which would also result in a stronger reliance on the prior in the reproduction task. This potential convolution of motor and clock noise is a challenge in studies assessing clock variability (see, for example, Cicchini et al., 2012; Turgeon et al., 2016, for discussion). If motor noise were the driving factor, noise should be independent of produced durations, as all durations have the same motor action (i.e., the keypress indicating the end of the interval). Conversely, if the observed variability is driven by a noisy clock, the amount of variability should be directly related to the length of the produced duration. As the variance measure that accounts for produced length, the scaled RMSR, resulted in the best fit, we do not find support for the notion that motor noise is driving the observed phenomena. Yet, a future study could independently measure a participant's motor noise to separately assess its contribution to the central tendency effect.

In this manuscript, we assess whether we can measure clock variance in a short, 20-trial production task. This observed statistic predicts central tendency in a multi-duration task, a phenomenon known to be dependent on clock variability. Another way to assess the reliability of the measured clock variability is to compare the scaled RMSR measure with the Weber fraction, a measure typically used when clock variability is estimated. However, as Weber fractions are derived from paradigms that rely on memory representations learned during the experimental session, the Weber fraction is indicative of the noisiness of the whole temporal system (but note that experimental or pharmacological manipulations might allow separating both influences, e.g., Meck, 1983). Therefore, it would be useful to determine Weber fractions based on paradigms that vary in their reliance on memory processes to determine the consistency of estimated clock variance statistics. At the same time, additional research is needed to relate the 1-s production method presented here to other established measures of clock variability.

To conclude, we have shown that just 20 trials of a 1-s production task result in a reliable measure of clock variance. The observed variability adheres to the scalar property and predicts temporal performance observed in a reproduction task. As no feedback is required, and memory processes are unlikely to play an important role in this paradigm, this clock variance measure can be used to disentangle the extent to which temporal behavior in a task is driven by memory processes or clock deviations. With its fast and easy application, this task is suitable to be implemented in clinical and various developmental populations, even in attention-span limited participants. Hopefully, this task can be a useful addition to the toolkit of researchers interested in unraveling the locus of deviations found in temporal performance.

#### DATA AVAILABILITY

The datasets analyzed for this study can be found in the Open Science Framework (https://osf.io/bhe97/?view\_only=9c3ea605d482416691b43588fecf92f5).


# AUTHOR CONTRIBUTIONS

SM and HR conceived the experiment, analyzed the data, and wrote the manuscript. SM collected all data.

#### FUNDING

This work is part of the NWO VICI research program "Interval Timing in the Real World: a functional, computational and neuroscience approach" with project number 453-16-005, awarded to HR, financed by the Netherlands Organization for Scientific Research (Nederlandse Organisatie voor Wetenschappelijk Onderzoek). HR's work was partially funded by EU Horizon 2020 FET Proactive grant "TimeStorm" #641100.

## ACKNOWLEDGMENTS

We would like to thank Thomas Wolbers and Martin Riemer for fruitful discussions regarding the setup of this study.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Maaß and van Rijn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neural Correlates of Interval Timing Deficits in Schizophrenia

Ariel W. Snowden and Catalin V. Buhusi\*

Interdisciplinary Program in Neuroscience, Department of Psychology, Utah State University, Logan, UT, United States

Previous research has shown that schizophrenia (SZ) patients exhibit impairments in interval timing. The cause of timing impairments in SZ remains unknown but may be explained by a dysfunction in the fronto-striatal circuits. Although the current literature includes extensive behavioral data on timing impairments, there is limited focus on the neural correlates of timing in SZ. The neuroimaging literature included in the current review reports hypoactivation in the dorsal-lateral prefrontal cortex (DLPFC), supplementary motor area (SMA) and the basal ganglia (BG). Timing deficits and deficits in attention and working memory (WM) in SZ are likely due to a dysfunction of dopamine (DA) and gamma-aminobutyric acid (GABA) neurotransmission in the cortico-striatal-thalamo-cortical circuits, which are highly implicated in executive functioning and motor preparation.

Keywords: schizophrenia, interval timing, attention, cognitive dysfunction, working memory, neural correlates

#### Edited by:

Fuat Balcı, Koç University, Turkey

#### Reviewed by:

Anne Giersch, Institut National de la Santé et de la Recherche Médicale (INSERM), France Martin Wiener, George Mason University, United States

#### \*Correspondence:

Catalin V. Buhusi catalin.buhusi@usu.edu

Received: 30 September 2018 Accepted: 09 January 2019 Published: 29 January 2019

#### Citation:

Snowden AW and Buhusi CV (2019) Neural Correlates of Interval Timing Deficits in Schizophrenia. Front. Hum. Neurosci. 13:9. doi: 10.3389/fnhum.2019.00009

## INTRODUCTION

Schizophrenia (SZ) is a complex, heterogeneous psychiatric disorder characterized by a broad range of symptoms including delusions, hallucinations, impaired cognitive functioning, and disorganized speech and behavior (Patel et al., 2014). SZ patients show deficits in interval timing, that is, in perceiving durations in the seconds-to-minutes range (Densen, 1977; Lee et al., 2009). **Table 1** shows the inconsistent pattern of time perception (TP) in SZ, possibly due to population heterogeneity and variability in illness duration (often not reported), symptom severity, and differential effects of typical vs. atypical antipsychotics. As temporal production tasks (Buhusi and Meck, 2010) often report less precision (greater variability) in time estimation (Carroll et al., 2009a,b; Roy et al., 2012; see also recent meta-analyses by Ciullo et al., 2016; Thoenes and Oberfeld, 2017) and temporal estimation tasks often report overestimation of durations (Densen, 1977; Wahl and Sieg, 1980; Tysk, 1983), differences in task demands may also account for inconsistent findings in SZ.

**Figure 1A** shows a schematic of a cognitive model of TP, the Information-Processing model (Gibbon et al., 1984). Regular pulses emitted by an internal clock are collected by an accumulator, stored in working memory (WM), and encoded for later use in the reference memory. At test, the current duration stored in WM and the duration encoded in reference memory are compared, and a response is made at the appropriate time. In this model, duration overestimation in SZ is explained by a faster clock resulting in greater accumulation of pulses at the time of the response, whereas variability in time estimation in SZ may be due to variability at clock, memory, and decision making levels (Gibbon et al., 1984).

An alternative, neurobiological model of TP, the Striatal Beat Frequency (SBF) model (Matell and Meck, 2004; Buhusi and Meck, 2005; Buhusi and Oprisan, 2013), is shown in **Figure 1E**. The model proposes that the cortico-striatal-thalamo-cortical system mediates cognitive processes involved in TP such as attention and WM (Buhusi and Meck, 2005). Cortical oscillations produce oscillatory beat patterns detected and encoded by striatal medium spiny neurons. The onset of a temporal duration initiates a phasic release of dopamine (DA), whereby DA neurons in the ventral tegmental area (VTA)/substantia nigra (SN) synchronize both cortical oscillations and the membrane levels of striatal medium spiny neurons (Matell and Meck, 2004; Buhusi et al., 2016). Indeed, DA is thought to be a neurotransmitter system crucial for both TP and attention to time (Buhusi and Meck, 2002; Buhusi, 2003). The SBF model is supported by data implicating dorsal-lateral prefrontal cortex (DLPFC), the supplementary motor area (SMA), the posterior parietal cortex, basal ganglia (BG), and the thalamus (Buhusi and Meck, 2005; Ivry and Schlerf, 2008) in TP. In this model, duration overestimation in SZ is explained by an increase in DA activation (equivalent to a faster clock, see Oprisan and Buhusi, 2011), whereas variability in time estimation in SZ may be due to abnormalities in neurotransmission/activation along the cortico-striato-thalamo-cortical loop, as discussed in this review article.

healthy controls; m, mean; SD, standard deviation; M, male; F, female; DSM, Diagnostic and statistical manual of mental disorders; ICD, International classification of diseases; yr, year; ↓, decrease/slower/under; ↑, increase/faster/over; ns, non-significant differences; N/A, not available/not provided; PFC, prefrontal cortex; DLPFC, dorsal-lateral prefrontal cortex; SMA, supplementary motor area; BG, basal ganglia.

glutamate; orange arrows, abnormal brain activation.

As SZ patients exhibit both impairments in cognitive functioning and TP, it is unclear whether cognitive deficits in attention and WM are responsible for timing deficits in SZ, or disruption of TP is responsible for greater cognitive dysfunction. The current review will address this issue by examining the neural network involved in TP and highlighting abnormalities in these regions in SZ patients and proposes that both TP and cognitive deficits in SZ patients are a result of a malfunction of neurotransmission in the cortico-striatal-thalamo-cortical network.

# DORSAL-LATERAL PREFRONTAL CORTEX (DLPFC) AND WORKING MEMORY FOR TIME

The DLPFC is responsible for attending to presented stimuli and maintaining a representation of a given duration in WM in timing tasks (Curtis and D'Esposito, 2003; Coull et al., 2004). Neuroimaging and repetitive transcranial magnetic stimulation (rTMS) studies support a role for the DLPFC in TP. Healthy participants underestimate the to-be-timed duration when rTMS is delivered to the right DLPFC (Koch et al., 2003; Jones et al., 2004), while rTMS delivered to the DLPFC has been shown to improve timing accuracy in Parkinson's disease (PD) patients (Koch et al., 2004). The effects of rTMS on timing in PD patients may be due to decreases in DLPFC activation due to lower DA levels in comparison to healthy participants. Functional magnetic resonance imaging (fMRI) studies indicate greater activation in the DLPFC in healthy participants during timing tasks (Rao et al., 2001; Hinton and Meck, 2004; Üstün et al., 2017). Additionally, the PFC is implicated in TP in animals, in lesion studies (Buhusi et al., 2018), pharmacological studies (Matthews et al., 2012), and animal models of SZ (Buhusi et al., 2013).

Very few studies have investigated the neural correlates of TP in SZ patients. Volz et al. (2001) reported DLPFC hypoactivation in SZ patients relative to controls during temporal discrimination (**Table 1**). **Figure 1B** shows a contrast in activation between the control and SZ groups after subtraction of resting activation (bright yellow = greater contrast, dark red = smaller contrast). **Figure 1B** shows that SZ patients exhibited less DLPFC activation than controls, primarily due to DA imbalances in the BG, and secondarily due to abnormalities in the cortico-striatal-thalamo-cortical circuitry which may be responsible for attention and WM deficits (Volz et al., 2001). These findings are consistent with previous research reporting DLPFC hypoactivation in SZ patients in tasks involving sustained attention, such as the Continuous Performance Test (CPT), and tasks requiring WM, such as the Wisconsin Card Sorting Task (WCST, Weinberger et al., 1986; Barch et al., 2001). Decreased DLPFC activation during these tasks typically correlates with worse performance in patients.

One explanation for hypofrontality in SZ is a dysfunction in gamma-aminobutyric acid (GABA), an inhibitory neurotransmitter. Studies in primates indicate an increase in the firing rate of GABAergic DLPFC neurons during a delay in WM tasks (Lewis et al., 2005). Similarly, in the CPT task, decreases in DLPFC activation were observed in SZ patients during a delay presented between a cue stimulus and target stimulus (Barch et al., 2001). These findings indicate that recruitment of the DLPFC, mediated by GABA neurons, is necessary for WM and is disrupted in SZ. Lower GABA levels in the DLPFC in patients may explain hypoactivation in SZ: GABA increases synchronization of pyramidal cell neurons, facilitating task performance and resulting in increased DLPFC activation during WM tasks in healthy participants (Lewis et al., 2005).

# SUPPLEMENTARY MOTOR AREA (SMA) AND TEMPORAL ATTENTION

The SMA is involved in the generation of voluntary movements and storage of learned motor actions (Eccles, 1982). Although the SMA is primarily responsible for generating movement, many timing tasks have shown SMA activation when no motor output is required (e.g., Schubotz et al., 2000). Current research suggests that the role of the SMA in TP is to select populations of neurons to mediate temporal attention. Single-cell recording studies in primates have indicated neuronal coding of interval timing in the pre-SMA (Mita et al., 2009) and many studies have reported increased SMA activation during interval timing (Coull et al., 2004; Macar et al., 2004). In the Information-Processing model (**Figure 1A**), the SMA may act as an accumulator, recording the number of pulses in a given period (Coull et al., 2004). The SMA is also active during tasks involving the processing of rhythms and beat perception (Grahn and Brett, 2007), supporting the SBF model, which proposes that SMA, together with other cortical areas, generates oscillatory input which is detected and encoded in BG medium spiny neurons (**Figure 1E**).

SZ patients exhibit abnormal SMA activation during timing tasks. Ortuño et al. (2005) used positron emission tomography (PET) to examine cerebral blood flow in healthy controls (HCs) and SZ patients during a temporal reproduction task (**Table 1**). Although Ortuño et al. (2005) reported no statistically significant differences in temporal reproduction relative to controls, less SMA activation was reported for SZ patients relative to controls during the timing condition. **Figure 1C** shows SMA activation (yellow) in the control group (left panel) and SZ group (right panel; Ortuño et al., 2005), indicating that SZ patients exhibit less SMA activation relative to controls, and suggesting dysfunction of the right SMA and prefrontal areas in SZ patients, which may reflect a failure in early time processing related to attentional deficits.

A decrease in SMA activation has been previously reported in SZ patients in both fMRI (Schröder et al., 1995) and electroencephalography (EEG) studies (Dreher et al., 1999). For example, SZ patients exhibit an attenuated readiness potential (Dreher et al., 1999). The readiness potential is observed during the planning of motor actions, and source-localization techniques suggest that it is generated in the SMA (Praamstra et al., 1996). SMA recruitment is necessary for facilitating temporal attention, and SZ patients do not actively recruit the SMA during timing tasks, indicated by SMA hypoactivation. As the SMA is responsible for sending information to the BG for motor preparation (DeLong and Wichmann, 2007), it is possible that the SMA encodes a distorted duration before sending it to the BG to be stored in reference memory.

# BASAL GANGLIA (BG) AND REFERENCE MEMORY FOR TIME

The role of the BG in TP is well-established (Buhusi and Meck, 2005; Buhusi and Cordes, 2011). PD patients show both interval timing and motor timing deficits (e.g., reproducing rhythms by finger-tapping), indicating that BG is necessary for TP (Harrington et al., 1998). Similarly, Huntington's disease patients, in which BG degeneration occurs, perform worse than HCs on timing tasks (Cope et al., 2014). Increased BG activation during interval timing tasks has also been observed in fMRI scans of healthy participants (Ferrandez et al., 2003; Nenadic et al., 2003). Further, Rao et al. (2001) used event-related fMRI in a temporal discrimination task designed to activate regions in the fronto-striatal network during the encoding of a standard duration and a comparison of a new duration to the standard duration. An early BOLD signal (2.5–5 s after trial onset) associated with encoding a representation of the standard duration was observed in the BG, whereas a late BOLD signal (7.5–10 s after trial onset) associated with the retrieval and comparison of the two durations was observed in the right DLPFC (Rao et al., 2001).

As DA mediates BG functioning (Rammsayer and Classen, 1997), studies have examined the effects of altering DA levels on TP. In the framework of the Information-Processing model (**Figure 1A**, Gibbon et al., 1984), pacemaker pulses have been described as the firing of DA neurons (Gibbon et al., 1997). Indeed, administration of DA agonists results in a faster clock speed, whereas administration of DA antagonists results in a slower clock speed (Meck, 1996; Buhusi and Meck, 2002).

In the SBF model of TP, DA is assumed to mediate the encoding of temporal durations in the BG, suggesting that a dysfunction in the neuromodulation of DA may be responsible for both timing and other cognitive (e.g., attention, WM) deficits in SZ patients. Indeed, Buhusi (2003) showed that DA has a role both in timing and attention to time. Meyer-Lindenberg et al. (2002) found that PFC hypoactivation in SZ is associated with increased DA levels in the striatum. Relative cerebral blood flow was measured in the DLPFC during the WCST using PET, and the tracer 6-FDOPA was used to measure presynaptic DA activity in the striatum (Meyer-Lindenberg et al., 2002). Patients exhibited activity in the DLPFC that was negatively correlated with presynaptic DA activity; the less activation observed in the DLPFC, the greater the striatal DA uptake. This finding is compatible with the finding that increases in striatal DA are associated with a faster internal clock speed, which is often exhibited in SZ patients. This research suggests that timing deficits in SZ may be due to an interaction of increased DA activity in the BG and dysfunction of the DLPFC.

In addition to the DLPFC hypoactivation discussed above, Volz et al. (2001) reported hypoactivation of the caudate nucleus in SZ patients when compared to HCs, and concluded that the cortico-striatal-thalamo-cortical network is disrupted in SZ. If DA hyperactivity of the BG is associated with hypoactivation of the DLPFC in SZ, as observed in Meyer-Lindenberg et al.'s (2002) study, one should expect increases in activation in the BG to be correlated with decreases in DLPFC activation, which is contradictory to Volz et al.'s (2001) findings of hypoactivation in the caudate nucleus. Additional studies report hypoactivation of the caudate nucleus in SZ during WM tasks (e.g., Koch et al., 2008; Yoon et al., 2013). During the response phase of a WM task (**Table 1**), Yoon et al. (2013) found hypoactivation of the caudate nucleus and reduced functional connectivity between the PFC and the striatum in SZ. **Figure 1D** shows an fMRI contrast of activation between the control and SZ groups, showing brain regions where activation was greater in controls compared to SZ group (light blue = greater contrast). A large contrast is shown in the caudate nucleus, where patients exhibited hypofunction relative to controls. Functional connectivity between the PFC and striatum was also reduced.

The BG receive input primarily from the PFC and send output to the frontal regions via the thalamus (Alexander et al., 1986; Alexander and Crutcher, 1990; **Figure 1E**). It is therefore unclear whether abnormalities in the cortico-striatal-thalamo-cortical circuits are primarily due to DLPFC dysfunction, or rather dysfunction of the BG. Many studies report that symptoms caused by PFC dysfunction in SZ (e.g., cognitive deficits) occur before positive symptoms (e.g., hallucinations and delusions; Lesh et al., 2011), resulting from hyperactivity in the mesolimbic DA pathway. DA activity is typically increased in the striatum in SZ and decreased in the frontal regions, which suggests less DA is released to the DLPFC during tasks involving WM and attention, reflecting an inability to actively recruit these regions during task performance.

# IMPLICATIONS FOR SCHIZOPHRENIA

Frith (1987) proposed that positive symptoms in SZ are a result of abnormalities in the intentions of actions. Patients are unaware of the sources of their actions which results in misattributions of the causes of consequences (e.g., delusions and false beliefs), and a disturbed sense of agency and self (Martin et al., 2017). Patients show a stronger binding of actions and consequences (Haggard et al., 2003), and a stronger binding of separate events in the absence of actions (Franck et al., 2005), judging events as synchronous over larger temporal disparities (Noel et al., 2018). These results suggest that temporal integration of events may lead to misrepresentations of events that are lost (e.g., inability to identify the beginning or end of an action sequence), therefore leading to misattributions. However, there are limitations to these findings, as explicit timing tasks were not employed in all studies.

Voss et al. (2017) identified the angular gyrus and DLPFC as being implicated in matching outcomes to actions and observed decreased connectivity between these regions in SZ. As hypoactivation of the inferior parietal gyri has been observed in SZ patients in TP (Alustiza et al., 2016), future studies should employ binding paradigms (e.g., Haggard et al., 2003) to examine neural activation during TP in SZ. Interestingly, Lošák et al. (2016) found decreased activation of the cerebellar vermis in SZ patients correlated with a faster clock during a time prediction task, suggesting that this region is critical for time prediction during salient events. As many studies report a correlation between a faster clock and positive symptoms in SZ patients, the nature of the relationship between TP and positive symptoms in SZ should be further investigated.

Here, we propose that both timing deficits and cognitive deficits in SZ are a result of neurotransmitter dysfunction in the cortico-striatal-thalamo-cortical loop. As information is transmitted in a loop, dysfunction in one region may cause impairments in another along the loop, rather than disruption occurring in a unidirectional manner. For example, correlations between TP performance and cognitive functioning measures have been observed in SZ (Lee et al., 2009; Roy et al., 2012). Roy et al. (2012) reported greater variability that was negatively correlated with WM performance during a reproduction task; patients with better WM exhibited less variability in the time reproduction task. Neuroimaging may be used to further examine dysfunction in the cortico-striatal-thalamo-cortical circuit and correlations between cognitive functioning and timing. As timing deficits have also been observed in individuals at high risk for SZ (Penney et al., 2005), neuroimaging studies may be used to assess dysfunction in this circuit correlated with TP in first-degree relatives of SZ patients, to assess timing deficits as a potential biomarker for SZ.

# CONCLUSIONS

Current research examining the neural correlates of TP in SZ patients is limited. Previous research suggests a disrupted cortico-striatal-thalamo-cortical network which may be responsible for timing deficits observed in SZ. The DLPFC, SMA, and BG play distinct roles in TP, and abnormal activation in these regions is reported in timing tasks as well as additional tasks involving attention and WM in SZ. Timing deficits in SZ may be primarily due to increased DA levels in the BG and less DA and GABA in the DLPFC, which are necessary to mediate WM and attention during TP. Rather than timing deficits in SZ occurring as a result of cognitive dysfunction or vice versa, a malfunction in the neurotransmission of DA and GABA in the cortico-striatal-thalamo-cortical network may be responsible for disruption in the internal clock and cognitive functioning in SZ. As there are very few studies which examine neural activation during TP tasks in SZ, further research is needed to corroborate current findings. Future research should consider potential differences in TP performance due to duration of the illness and the extent to which abnormalities in neural activation during timing tasks are correlated with positive symptom severity, as disruptions in the internal clock rate may be linked to positive symptoms exhibited in SZ. Lastly, future research including neuroleptic-naive patients is necessary to rule out the potential confounds of antipsychotic medication effects on TP (**Table 1**).

#### AUTHOR CONTRIBUTIONS

Authors contributed equally to all aspects of developing and writing this manuscript.

#### FUNDING

This work was supported by a National Institutes of Health (NIH) grant MH073057 and an Independent Investigator Award from the Brain & Behavior Research Foundation (formerly National Alliance for Research on Schizophrenia and Depression, NARSAD) to CB.

#### ACKNOWLEDGMENTS

We would like to thank Dr. Christopher M. Warren and Mr. Michael Williams for helpful discussions.

#### REFERENCES

flow evidence. Arch. Gen. Psychiatry 43, 114–124. doi: 10.1001/archpsyc.1986.01800020020004

Yoon, J. H., Minzenberg, M. J., Raouf, S., D'Esposito, M., and Carter, C. S. (2013). Impaired prefrontal-basal ganglia functional connectivity and substantia nigra hyperactivity in schizophrenia. Biol. Psychiatry 74, 122–129. doi: 10.1016/j.biopsych.2012.11.018

**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Snowden and Buhusi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Simplified Model of Communication Between Time Cells: Accounting for the Linearly Increasing Timing Imprecision

#### Mustafa Zeki <sup>1</sup> \* and Fuat Balcı <sup>2</sup>

<sup>1</sup> Department of Mathematics, College of Engineering and Technology, American University of the Middle East, Egaila, Kuwait, <sup>2</sup> Department of Psychology, Koç University, Istanbul, Turkey

Many organisms can time intervals flexibly on average with high accuracy but substantial variability between the trials. One of the core psychophysical features of interval timing functions relates to the signatures of this timing variability; for a given individual, the standard deviation of timed responses/time estimates is nearly proportional to their central tendency (scalar property). Many studies have aimed at elucidating the neural basis of interval timing based on the neurocomputational principles in a fashion that would explain the scalar property. Recent experimental evidence shows that there is indeed a specialized neural system for timekeeping. This system, referred to as the "time cells," is composed of a group of neurons that fire sequentially as a function of elapsed time. Importantly, the time interval between consecutively firing time cell ensembles has been shown to increase with more elapsed time. However, when the subjective time is calculated by adding the distributions of time intervals between these sequentially firing time cell ensembles, the standard deviation would be compressed by the square root function. In light of this information the question becomes, "How should the signaling between the sequentially firing time cell ensembles be for the resulting variability to increase linearly with time as required by the scalar property?" We developed a simplified model of time cells that offers a mechanism for the synaptic communication of the sequentially firing neurons to address this ubiquitous property of interval timing. The model is composed of a single layer of time cells formulated in the form of integrate-and-fire neurons with feed-forward excitatory connections. The resulting behavior is simple neural wave activity. When this model is simulated with noisy conductances, the standard deviation of the time cell spike times increases proportionally to the mean of the spike-times. 
We demonstrate that this statistical property of the model outcomes is robustly observed even when the values of the key model parameters are varied.

Keywords: interval timing, scalar variability, time cells, chain models, hippocampus, Weber's law

#### Edited by:

Matjaž Perc, University of Maribor, Slovenia

#### Reviewed by:

Jun Ma, Lanzhou University of Technology, China; Daqing Guo, University of Electronic Science and Technology of China, China

> \*Correspondence: Mustafa Zeki mustafa.zeki@aum.edu.kw

Received: 22 October 2018 Accepted: 31 December 2018 Published: 29 January 2019

#### Citation:

Zeki M and Balcı F (2019) A Simplified Model of Communication Between Time Cells: Accounting for the Linearly Increasing Timing Imprecision. Front. Comput. Neurosci. 12:111. doi: 10.3389/fncom.2018.00111

# 1. INTRODUCTION

Time is a fundamental quantity that cannot be derived from other dimensions. Thus, keeping track of time requires its measurement by a neural "clock" mechanism. To that end, evolution has favored at least two timing mechanisms that operate at different time-scales. One of these timekeeping mechanisms, namely the circadian clock, captures periods with 24-h long cycles based on well-regulated molecular machinery (e.g., Partch et al., 2014). Many events in nature, on the other hand, are rarely periodic and/or too short to be captured based on the time-scales of the molecular events implicated for the circadian clock. Thus, capturing the temporal features of such events requires a rather flexible timekeeping apparatus that can be started and stopped arbitrarily, namely a neural mechanism with stopwatch-like properties (Buhusi and Meck, 2005). Accordingly, a mechanism of the latter type indeed enables many animals ranging from fish (Drew et al., 2005), to mice (Balci et al., 2009), to humans (Rakitin et al., 1998; Çavdaroğlu et al., 2014) to flexibly keep track of time intervals in the range of seconds to minutes. This very ability is referred to as "interval timing."

There are a number of core features of this ubiquitous cognitive timekeeping function. For instance, for a given target time interval, the timed anticipatory responses of animals follow an approximately Gaussian distribution that is typically centered around the target interval, pointing to high timing accuracy on average. However, the flexibility of this mechanism exerts a non-negligible cost in the form of imprecision: predictions/productions of a given target interval exhibit substantial variability between trials, which is reflected in the spread of the response time distributions. Thus, when it comes to precision, the operation of the internal stopwatch is far from perfect (i.e., outputs are not Dirac delta distributed). Importantly, within an individual, the resultant timing imprecision has a well-defined relationship to the target intervals; the standard deviation of the timed responses is proportional to their mean, namely the coefficient of variation of timed responses is virtually constant. This statistical property leads to the timescale invariance of interval timing and accounts for Weber's Law in the timing domain (Gibbon, 1977).

A line of empirical and theoretical research in the timing field has focused on the neurocomputational principles that would explain interval timing with its psychophysical features outlined above (for review see Karmarkar and Buonomano, 2007; Simen et al., 2011; Merchant et al., 2013; Hass and Durstewitz, 2014; Balcı and Simen, 2016). A recent line of neuroscientific evidence has introduced novel empirical ground for these approaches by demonstrating the existence of a specialized neural mechanism for timekeeping, namely the time cell ensembles that fire sequentially during different episodes of a temporally structured task (Kraus et al., 2013; Salz et al., 2016; Tiganj et al., 2016). Importantly, as a feature of information processing in the time cell architecture, the time interval from the activation of one ensemble to the next ensemble (inter-spike interval, ISI) has been shown to lengthen with progressing neuronal activity (MacDonald et al., 2011). In a very simple network of a neural chain architecture with feedforward excitatory connections, we aimed to address how the communication between the consecutive cell ensembles can be set to achieve scalar variability of interval timing behavior.

Even in very simplified cell and network settings, one needs to address several neurocomputational challenges to explain the statistical features of experimentally observed activity/behavior as a function of time. One of these features is the scalar property of interval timing. The challenge faced here is that when independent distributions are added together, the standard deviation of the resulting distribution is compressed by the square root function. For example, when N identical normal distributions with mean = 1,000 and standard deviation = 100 are summed, the mean of the resulting distribution would be 1,000N and its standard deviation would be 100√N. Thus, the standard deviation would not increase linearly with the mean, contrary to what would be required by the scalar property. The important question that arises at this point is whether it is possible to compensate for the compression in variability (i.e., CV) due to the square root function via some inherent property of neural networks and thereby explain the scalar property as an emergent property of the network.
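The square-root compression described above can be checked numerically. The following is a minimal sketch, assuming independent and identically distributed normal intervals with the mean and standard deviation from the example; the function names, the number of simulated trials, and the seed are illustrative choices rather than part of the model.

```python
import math
import random
import statistics


def summed_interval_stats(n_intervals, mean=1000.0, sd=100.0,
                          n_trials=5000, seed=0):
    """Simulate sums of n_intervals independent normal intervals.

    Returns the sample mean and sample standard deviation of the sums.
    """
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(mean, sd) for _ in range(n_intervals))
        for _ in range(n_trials)
    ]
    return statistics.fmean(totals), statistics.stdev(totals)


def predicted_stats(n_intervals, mean=1000.0, sd=100.0):
    """Theoretical mean and SD of the sum under independence."""
    return n_intervals * mean, sd * math.sqrt(n_intervals)


if __name__ == "__main__":
    for n in (1, 4, 16):
        m, s = summed_interval_stats(n)
        pm, ps = predicted_stats(n)
        # The coefficient of variation (s / m) shrinks as n grows,
        # which is the opposite of what the scalar property requires.
        print(f"N={n:2d}  mean={m:9.1f} (theory {pm:9.1f})  "
              f"sd={s:6.1f} (theory {ps:6.1f})  CV={s / m:.4f}")
```

Under these assumptions the simulated coefficient of variation falls as N increases, whereas the scalar property would require it to stay roughly constant.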

The second challenge faced in accounting for the experimentally observed activity with realistic neurons is the limiting timescale of the neuronal currents. Whether intrinsic or synaptic, many known neuronal currents operate in a timescale ranging from milliseconds to a few seconds. But the increase of ISIs in time cells (TC) is observed for as long as tens of seconds (Ermentrout and Terman, 2010). This motivated us to ask how neuronal currents with very short lifetimes can be used to generate effects that last much longer than their individual timescales.

In order to address such neurocomputational issues in signal transmission between the time cells, we constructed a simple time cell model in such a way that the delay from the firing of the ith time cell ensemble to the firing of the (i + 1)th ensemble increases with the inhibition. This is due to the hyperpolarizing intrinsic current that is activated with inhibition and inactivated with excitation. In other words, in this model, the time cells undergo a temporal integration that depends on the level of inhibitory current. In the chain architecture, when the time cells are connected with feedforward excitatory current, the way the time cells are modeled leads to the experimentally observed increase in ISIs as activity propagates along the chain. We simulated the outlined network with noisy conductances multiple times to generate the distribution of spike times of various time cells. We observed that the standard deviation of the time cell spike times indeed increased linearly with the mean spike times, and that the mean-normalized distributions of different time cell activity superposed, as often observed in the empirical data (i.e., timescale invariance). We finally showed that the observed results are robust features of the model outputs that are preserved even after changing the values of the key parameters of the model.

# 2. METHODS: THE MODEL

In the current model, we use a network with feedforward connections among excitatory time cells to simulate the transmission of activity between time cells (see **Figure 1**). Importantly, the model focuses on the time it takes to transmit the excitatory signal from one time cell to the next time cell in the chain, namely, the inter-spike intervals (ISIs). Our assumptions regarding the role of inhibition in the model are explained below. The time cells are modeled using the spikeless integrate-and-fire type neuron model (see Ermentrout and Terman, 2010) with currents that are modeled using the Hodgkin-Huxley type formalism as follows:

$$C_m \frac{dv}{dt} = -\left(I_L + I_D + I_{Exc} + I_{Inh}\right) + I_{input} + I_{noise}$$

where $I_L$, $I_D$, $I_{Exc}$, and $I_{Inh}$ stand for the leak, D-type potassium, excitatory synaptic, and inhibitory synaptic currents, respectively. The membrane potential is reset to $v_R = -85$ mV when $v = V_T$, with $V_T = -50$ mV. $C_m$ is the membrane capacitance, with $C_m = 200$ µF/cm<sup>2</sup>. $I_L$ denotes the leak current, with $I_L = g_L (v - E_L)$, where $g_L$ and $E_L$ denote the leak conductance and the reversal potential, with values $g_L = 8$ µS and $E_L = -65$ mV. The D-type potassium current is described by the equation $I_D = g_D m_d h_d^2 (v - E_K)$, with the maximal conductance and the reversal potential $g_D = 4$ µS and $E_K = -90$ mV, respectively (Storm, 1988; Grissmer et al., 1994). The variables describing the fast activation ($m_d$) and slow inactivation ($h_d$) of the D-current are described by the following differential equations:

$$\frac{dm_d}{dt} = \frac{m_{d\infty} - m_d}{m_{d\tau}}, \qquad \frac{dh_d}{dt} = \frac{h_{d\infty} - h_d}{h_{d\tau}}$$

where $m_{d\infty} = 1 - 1/(1 + \exp((v + 65)/2))$, $m_{d\tau} = 0.6$, $h_{d\infty}(v) = 1/(1 + \exp(v + 65))$, and $h_{d\tau} = 1500$ ms<sup>−1</sup>.
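The D-current expressions above can be written out as a short sketch. It only transcribes the steady-state functions, the current equation, and a single forward-Euler update of a gating variable, using the parameter values quoted in the text. The function names, the Euler scheme, and the step size argument are assumptions made for illustration and are not taken from the original simulations.

```python
import math

# Parameter values as quoted in the text (units as given there).
G_D = 4.0        # maximal D-current conductance
E_K = -90.0      # potassium reversal potential (mV)
TAU_M = 0.6      # activation time constant
TAU_H = 1500.0   # inactivation time constant


def m_inf(v):
    """Steady-state activation of the D-current at membrane potential v (mV)."""
    return 1.0 - 1.0 / (1.0 + math.exp((v + 65.0) / 2.0))


def h_inf(v):
    """Steady-state inactivation of the D-current at membrane potential v (mV)."""
    return 1.0 / (1.0 + math.exp(v + 65.0))


def d_current(v, m, h):
    """D-type potassium current: g_D * m_d * h_d^2 * (v - E_K)."""
    return G_D * m * h ** 2 * (v - E_K)


def gate_step(x, x_inf, tau, dt):
    """One forward-Euler step of dx/dt = (x_inf - x) / tau (illustrative scheme)."""
    return x + dt * (x_inf - x) / tau


if __name__ == "__main__":
    for v in (-85.0, -65.0, -50.0):
        print(f"v={v:6.1f} mV  m_inf={m_inf(v):.4f}  h_inf={h_inf(v):.4f}")
```

With these expressions, the inactivation steady state is close to 1 at hyperpolarized potentials and close to 0 near threshold, so the current is available after inhibition and is removed by depolarization.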

#### 2.1. Excitatory Synaptic Currents

The synaptic excitation is given as $I_{Exc} = I_{Exc1} + I_{Exc2}$. The current $I_{Exc1}$ for the nth time cell, which represents the feedforward excitatory connection from the (n − 1)th time cell to the nth time cell, is given by $I_{Exc1} = g_{Exc} s (v - E_{Exc})$, where $s$ is the synaptic variable of the (n − 1)th time cell. The synaptic variable is reset to 1 with every spike of the corresponding time cell and decays exponentially according to the equation $ds/dt = -\beta s$, with the decay rate $\beta = 0.2$ ms<sup>−1</sup>.

The maximal excitatory conductance and the excitatory reversal potential are given by $g_{Exc} = 15$ µS and $E_{Exc} = 0$ mV, respectively. Another excitatory current, $I_{Exc2}$, represents the recurrent excitatory connections within each time cell ensemble. The recurrent excitation is given by the equation $I_{Exc2} = g_{Excr} s (v - E_{Exc})$, where the maximal conductance is $g_{Excr} = 50$ µS and $s$ is the synaptic variable of the time cell receiving the synaptic current.
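The two excitatory currents can be sketched as follows. The closed-form decay of the synaptic variable follows from the stated reset-to-1 rule and exponential decay. The function names and the way the two synaptic variables are passed in are illustrative assumptions.

```python
import math

# Parameter values as quoted in the text.
BETA = 0.2       # synaptic decay rate (1/ms)
G_EXC = 15.0     # feedforward maximal conductance
G_EXC_R = 50.0   # recurrent maximal conductance
E_EXC = 0.0      # excitatory reversal potential (mV)


def synaptic_variable(t_since_spike, beta=BETA):
    """Solution of ds/dt = -beta * s after a reset to 1 at the last spike.

    Returns 0 if the cell has not spiked yet (negative elapsed time).
    """
    if t_since_spike < 0:
        return 0.0
    return math.exp(-beta * t_since_spike)


def excitatory_current(v, s_prev, s_self):
    """Total excitation: feedforward term from the previous cell plus
    the recurrent term driven by the receiving cell's own variable."""
    i_exc1 = G_EXC * s_prev * (v - E_EXC)
    i_exc2 = G_EXC_R * s_self * (v - E_EXC)
    return i_exc1 + i_exc2


if __name__ == "__main__":
    for t in (0.0, 5.0, 10.0, 20.0):
        print(f"t={t:5.1f} ms  s={synaptic_variable(t):.4f}")
```

Because the membrane potential stays below the excitatory reversal potential, these currents are negative, and the leading minus sign in the membrane equation turns them into a depolarizing drive.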

#### 2.2. Model Assumptions

The model incorporates a slow synaptic inhibitory current, and this current is assumed to increase linearly with every activated time cell. The equation for the inhibitory current to a time cell is given by $I_{Inh} = N g_{Inh} (v - E_{Inh})$, with the maximal conductance $g_{Inh} = 0.02$ µS and the reversal potential $E_{Inh} = -100$ mV. $N$ is the number of time cells that have fired since the beginning of the current simulation. The second assumption is that the active time cell stops firing after the excitatory transmission and thus after the firing of the next time cell.
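The linear growth of inhibition can be sketched as below. The first function transcribes the inhibitory current. The second is an added illustration, not part of the model description: it computes the potential at which leak and inhibition alone would balance, ignoring the D-current, excitation, and noise, to show how the resting level is pulled down as more cells fire. The function names are hypothetical.

```python
# Parameter values as quoted in the text.
G_L = 8.0        # leak conductance
E_L = -65.0      # leak reversal potential (mV)
G_INH = 0.02     # inhibitory conductance added per fired time cell
E_INH = -100.0   # inhibitory reversal potential (mV)


def inhibitory_current(v, n_fired):
    """I_Inh = N * g_Inh * (v - E_Inh), with N the number of cells fired so far."""
    return n_fired * G_INH * (v - E_INH)


def leak_inhibition_equilibrium(n_fired):
    """Potential where leak and inhibition cancel (illustrative simplification:
    all other currents in the membrane equation are ignored)."""
    g_inh_total = n_fired * G_INH
    return (G_L * E_L + g_inh_total * E_INH) / (G_L + g_inh_total)


if __name__ == "__main__":
    for n in (0, 20, 40, 60):
        print(f"N={n:2d}  equilibrium={leak_inhibition_equilibrium(n):.3f} mV")
```

Under this simplification the balance point moves steadily toward the inhibitory reversal potential as N grows, so later cells in the chain start from a more hyperpolarized level.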

#### 2.3. External Input and Initial Conditions

A 10 ms-long square pulse input current $I_{input} = 4$ mA/cm<sup>2</sup> is given to the first time cell at the beginning of each trial. Initial values of the membrane potentials are taken to be equal to the resetting value of −75 mV. The D-current activation and inactivation variables are assumed to have initial values of $m_{d0} = 0$ and $h_{d0} = 1$, as the equilibrium value for the inactivation variable is 1 for a resting neuron. Initial values for all synaptic variables are assumed to be 0.

#### 2.4. Network Architecture

A network of 60 time cells was simulated unless stated otherwise. We assumed feedforward AMPA-type excitatory connections between the time cells (**Figure 1**). All the time cells receive the same inhibitory current that incrementally increases in a linear fashion with the activation of each additional time cell.

#### 2.5. Noise

Large network simulations were run by assigning the D-current and synaptic excitation maximal conductance values as normally distributed random variables. The standard deviation values for generating the data presented in section 3.4 are 1 for the D-current maximal conductance and 5 for the maximal conductance of the synaptic excitation. In addition, the synaptic noise $I_{noise}$ has the form $I_{noise} = \sum_n s_n (v - E_{Exc})$, where the variable $s_n$ from 100 presynaptic neurons is activated at predetermined times $t_k$, $k = 1, 2, \ldots$ from a Poisson process with an average firing rate of 50 Hz (see e.g., Fourcaud and Brunel, 2002). In simulations with non-zero $I_{noise}$, the value of the maximal conductance of the synaptic noise current is given by $g_{noise} = 1$ µS. The variable $s_n$ obeys the differential equation $ds_n/dt = -\beta_n s_n$ with $\beta_n = 0.1$ ms<sup>−1</sup>.
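The Poisson-driven noise input can be sketched as below, using the stated rate, number of presynaptic neurons, and decay rate. Two details are assumptions made for illustration: spike times are generated from exponentially distributed inter-spike intervals, and each $s_n$ is treated as being reset to 1 at every presynaptic spike, mirroring the rule stated for the excitatory synapses. The function names and the seed are hypothetical.

```python
import math
import random

# Values as quoted in the text.
RATE_HZ = 50.0   # average presynaptic firing rate
N_PRE = 100      # number of presynaptic noise neurons
BETA_N = 0.1     # decay rate of each noise synaptic variable (1/ms)


def poisson_spike_times(rate_hz, duration_ms, rng):
    """Spike times (ms) of a homogeneous Poisson process on [0, duration_ms)."""
    rate_per_ms = rate_hz / 1000.0
    times = []
    t = rng.expovariate(rate_per_ms)
    while t < duration_ms:
        times.append(t)
        t += rng.expovariate(rate_per_ms)
    return times


def summed_noise_gate(t_ms, spike_trains, beta=BETA_N):
    """Sum over presynaptic neurons of s_n(t).

    Assumes each s_n is reset to 1 at its most recent spike and then
    decays as exp(-beta * elapsed); neurons with no spike yet contribute 0.
    """
    total = 0.0
    for train in spike_trains:
        past = [tk for tk in train if tk <= t_ms]
        if past:
            total += math.exp(-beta * (t_ms - max(past)))
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    trains = [poisson_spike_times(RATE_HZ, 1000.0, rng) for _ in range(N_PRE)]
    print("total spikes in 1 s:", sum(len(tr) for tr in trains))
    print("summed gate at t = 500 ms:", round(summed_noise_gate(500.0, trains), 3))
```

Multiplying the summed gate by $(v - E_{Exc})$ gives the noise current in the form stated above.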

# 3. RESULTS

#### 3.1. Time Cells in vivo

The characteristic feature of timing behavior is the reduced absolute timing precision with increasing target time intervals. Recent studies that aimed to find the neural correlates of the interval timing behavior in accordance with its statistical signatures discovered the so-called time cells in many different brain areas such as the striatum, hippocampus, and medial prefrontal cortex (Kraus et al., 2013; Eichenbaum, 2014; Salz et al., 2016; Tiganj et al., 2016). The conclusion that time cells can encode time came from the critical fact that different cell ensembles are activated during different time periods. Moreover, the time interval between subsequently activated cell ensembles has been observed to slowly increase with the elapsing time. In fact, it is this very property that leads to the increased absolute imprecision (constant level of relative imprecision) for the representation of longer time intervals. The increase in delays between the sequentially activated cell groups also means that "later periods" of an event are represented with fewer neurons per unit of time. This seemingly simple behavior poses two neurocomputational challenges.

Any given time interval is the combination of time intervals between the activity of consecutively firing time cells; in other words, the combination of inter-spike intervals (ISIs). The sum of the ISIs determines the perceived time interval. The scalar property dictates that the noise in the timing behavior, or in the resultant temporal representation, increases linearly with the target time. That is, if the standard deviation of the timed responses with a mean time of 10 s is 1 s, then the standard deviation of the same type of responses with a mean time of 20 s should be 2 s. The difficulty here is that, when noisy data are added, the relative noise in the combined data decays with the square root of the number of data points combined. For example, if we add $N$ independent normally distributed random variables with the same mean $\mu$ and standard deviation $\sigma$, the mean of the sum is $N\mu$ and the standard deviation of the sum is $\sqrt{N}\sigma$, so the coefficient of variation falls as $1/\sqrt{N}$. On the other hand, for the scalar property to emerge, the standard deviation of the combined data should be proportional to $N$. Note that even if we add distributions with an increasing mean and a standard deviation that is proportional to the mean, it takes a lot of fine-tuning to achieve the scalar property.
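The square-root argument above can be checked numerically. The sketch below sums independent, identically distributed normal intervals and compares the empirical coefficient of variation of the sum with the value expected for independent summation. The mean, standard deviation, and trial count are arbitrary illustrative choices, not parameters of the model.

```python
import math
import random
import statistics


def cv_of_summed_intervals(n_intervals, mu=1.0, sigma=0.2,
                           n_trials=4000, seed=1):
    """Empirical CV (std / mean) of a sum of n_intervals iid normal intervals."""
    rng = random.Random(seed)
    totals = [sum(rng.gauss(mu, sigma) for _ in range(n_intervals))
              for _ in range(n_trials)]
    return statistics.stdev(totals) / statistics.mean(totals)


for n in (1, 4, 16, 64):
    expected = 0.2 / (1.0 * math.sqrt(n))   # sigma / (mu * sqrt(N))
    print(f"N={n:3d}  empirical CV={cv_of_summed_intervals(n):.4f}  "
          f"expected={expected:.4f}")
```

The CV shrinks as more intervals are summed, which is the opposite of the constant CV required by the scalar property.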

The main idea of the current work is that the simple exponential decay of a neuronal current can explain both the lengthening ISIs observed in time cells and the scalar variability. In our model, as in the empirical data, one time cell fires after another via chain-like excitation. We assumed that the inhibition over time cells increases with time, making the firing of the time cells less likely during the later periods of an event. For time cells to fire within such a network setting, some other hyperpolarizing current has to decay to compensate for it. It turns out that simple exponential decay of this particular current indeed explains the lengthening ISIs and, perhaps more importantly, the scalar property is manifested as an emergent property of the model. With the increasing inhibition, the hyperpolarizing intrinsic current has to decay more to account for the increased inhibition, which explains the longer ISIs. What accounts for the non-linear increase in the noise is that, as the intrinsic current has to decay further to compensate for the increased inhibition, the exponential decay becomes more prone to noise: small perturbations in membrane conductance now lead to larger deviations in time. In this work, we used the D-type potassium current as the hyperpolarizing intrinsic current. However, there are other currents, such as the A-type potassium current, that can function like the D-type potassium current (Grünewald, 2003).

# 3.2. Oscillations in One Time Cell

In this section, we stimulate a time cell with a step current stimulus for different amounts of inhibitory current to study the inhibition-dependent increase in the activation time of a time cell (see **Figures 2A,B**). We ran the simulations with three different levels of inhibition (i.e., N = 20, 40, and 60) applied for the time intervals (1,000–4,000 ms), (4,000–7,000 ms), and (7,000–10,000 ms), respectively. A square-pulse input is applied 1,000 ms after the onset of each time interval (**Figure 2C**). The hyperpolarizing D-current is already active at the resting potential without any applied inhibition (**Figure 2D**). With the application of the first external step current at t = 2,000 ms (**Figure 2C**), the D-current begins decreasing (**Figure 2D**) to compensate for the increased inhibition (N = 20). The time cell fires with a delay of about 600 ms. When the inhibition coefficient is increased to N = 40 and N = 60, it takes increasingly longer for the time cell to fire (about 800 and 1,200 ms, respectively). This is because the hyperpolarizing D-current has to decay more to make up for the increased inhibition. Note that even though the increase in the inhibition is the same from N = 20 to N = 40 and from N = 40 to N = 60, the decay time increases non-linearly because of the exponential decay of the D-current inactivation variable (**Figure 2D**). When the intrinsic current is required to decay more with high inhibition during the later stages of interval timing, the delay to spike (or ISI) becomes more prone to noise in the membrane potential. This is because the later stages of the exponential decay occur much more slowly, so small perturbations in the membrane potential cause large deviations in decay time.
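The qualitative argument in this section can be illustrated with a toy calculation that is much simpler than the conductance-based model. The sketch assumes a variable that decays as h(t) = h0 * exp(-t / tau) and a cell that fires once h has fallen to a target level; the linear mapping from the inhibition level N to that target, and all numerical values, are hypothetical and are not fitted to reproduce the delays reported above.

```python
import math


def _target_level(inhibition_n, h0, drop_per_unit):
    # Hypothetical mapping: more inhibition requires decay to a lower level.
    target = h0 - drop_per_unit * inhibition_n
    if not 0.0 < target <= h0:
        raise ValueError("inhibition level outside the range of this toy model")
    return target


def decay_delay_ms(inhibition_n, tau_ms=1000.0, h0=1.0, drop_per_unit=0.01):
    """Time for h(t) = h0 * exp(-t / tau) to reach the target level."""
    target = _target_level(inhibition_n, h0, drop_per_unit)
    return tau_ms * math.log(h0 / target)


def delay_sensitivity_ms(inhibition_n, tau_ms=1000.0, h0=1.0,
                         drop_per_unit=0.01):
    """|d(delay) / d(target)| = tau / target: shift in delay per unit
    perturbation of the target level."""
    target = _target_level(inhibition_n, h0, drop_per_unit)
    return tau_ms / target


for n in (20, 40, 60):
    print(f"N={n}: delay={decay_delay_ms(n):7.1f} ms, "
          f"sensitivity={delay_sensitivity_ms(n):7.1f} ms per unit")
```

In this toy setting, equal steps in N produce growing steps in delay, and the same small perturbation of the target level shifts the delay more at higher N. This mirrors the two properties discussed in the text.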

# 3.3. Larger Network Simulation

We then simulated forty time cells (see **Figure 1** for the architecture) with feedforward excitatory connections to test for the increasing delay in the ISIs with elapsing time (**Figure 3A**). The D-current decay time constant is set to $hd_{\tau} = 3{,}000\ \mathrm{ms}^{-1}$. In order to represent the within-ensemble excitatory connections, each time cell is designed so that it can send feedback excitation to itself (i.e., self-excitation; **Figure 1**). With the onset of the temporary external stimulus, the first time cell begins persistent spiking with feedback excitation and sends excitatory synaptic input to the second time cell. The activity progresses with the excitatory transmission between the time cells. Since we assume that with the activation of each time cell a slow inhibitory cell is activated persistently, the overall inhibition on the time cells increases with the progressing activity (**Figure 3B**). As discussed in the previous section, the increasing inhibition leads to longer delays in the activation of subsequent time cells with elapsing time.

# 3.4. Statistical Results

In this section, we summarize the results of running the larger network 200 times up to the activation of each time cell, in order to evaluate the variation in the spike times of each time cell in the chain under noisy network settings (see Methods). For example, for TC = 30, we run the network 200 times until the activation of the 30th time cell. **Figure 4A** shows the coefficient of variation (CV) of the spike times of the time cells with respect to the mean spike times. The red line refers to the constant value that the observed CV converges on. **Figure 4B** presents the mean spike times of the time cells with the associated standard deviations. Note that the standard deviation increases with the cell index. Finally, **Figure 4C** shows the histograms of mean-normalized spike time distributions corresponding to the 30th and the 40th time cells. As an important property of interval timing behavior, and consistent with the visual inspection of **Figure 4C**, the spike-time distributions that correspond to different time intervals are expected to superpose.
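The summary statistics described in this section can be computed as in the following sketch. It assumes that spike times are stored as one row per simulation run and one column per time cell; this layout and the function names are hypothetical, and the two-run data set is a toy example rather than model output.

```python
import statistics


def spike_time_summary(spike_times_by_run):
    """Return (mean, std, CV) of the first-spike time for each time cell.

    spike_times_by_run: list of runs, each a list of spike times (ms),
    one entry per time cell in chain order.
    """
    summary = []
    for cell_times in zip(*spike_times_by_run):
        mean = statistics.mean(cell_times)
        std = statistics.stdev(cell_times)
        summary.append((mean, std, std / mean))
    return summary


def mean_normalize(cell_times):
    """Express spike times relative to their mean, as used to check
    whether distributions for different cells superpose."""
    mean = statistics.mean(cell_times)
    return [t / mean for t in cell_times]


toy_runs = [[1.0, 2.0], [3.0, 6.0]]   # two runs, two cells (toy values)
for index, (mean, std, cv) in enumerate(spike_time_summary(toy_runs), start=1):
    print(f"cell {index}: mean={mean:.2f}, std={std:.2f}, CV={cv:.3f}")
```

A constant CV across cells, together with overlapping mean-normalized histograms, is what the scalar property predicts.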

We next ran the simulations to observe the dependency of the CV on the D-current decay constant $hd_{\tau}$ and the D-current maximal conductance $g_D$ (**Figures 5**, **6**); $hd_{\tau}$ is a key parameter as it determines the decay rate of the D-current. We simulated the large network for increasing values of $hd_{\tau}$, again 200 times. The results of these simulations are displayed in **Figure 5**. The CV converges to the same constant value for all tested values of the D-current decay rate parameter. As expected, since the D-current decays faster with smaller values of $hd_{\tau}$, the wave speed changes considerably. The maximal D-current conductance $g_D$ alters the delay to spiking. If $g_D$ is small, the D-current has to inactivate more to compensate for the hyperpolarizing inhibition. Hence, smaller values of $g_D$ lead to longer ISIs and higher values of $g_D$ lead to shorter ISIs. When $g_D$ is small, it takes fewer neurons to reach a given time as the ISIs are now longer. Similarly, when $g_D$ is large, it takes more neurons to reach a given time since the ISIs are now shorter. Summing the larger number of ISIs that results from an increased $g_D$ reduces the CV, as seen in **Figure 6**. The addition compresses the noise; hence, as $g_D$ gets larger, the CV value converged on gets smaller. Importantly, the fact that the CV approaches a constant value remains intact irrespective of the changes in $g_D$.

FIGURE 3 | Raster plot of the larger network is displayed in (A). An inhibitory cell contributes to the overall inhibition of the time cells with every activated time cell. The overall inhibition increases along with time. The increase in inhibitory synaptic current delays the firing of time cells as depicted in Figure 2. The buildup of the synaptic inhibitory current is depicted in (B).

FIGURE 5 | Simulation of time cell spike times for different values of the D-current inactivation variable decay rate, $hd_{\tau}$. In (A–D), the values for $hd_{\tau}$ are taken to be 500, 1,000, 1,500, and 2,000 $\mathrm{ms}^{-1}$, respectively.

In order to evaluate the differential effect of the synaptic noise arriving from multiple presynaptic cells ($I_{noise}$), we finally ran the simulations with $I_{noise}$ only, removing the other sources of noise (i.e., noise from the D-current and TC-TC synaptic excitation). When the synaptic noise arriving from multiple presynaptic cells is used, the resultant CV increased depending on the maximal conductance but still remained constant, again supporting the idea that virtually any manipulation that modifies the membrane potential of the time cells would result in a constant CV (see **Figure 7**).

# 4. DISCUSSION

In this paper, we developed a new simple neural model of interval timing that explains the key experimental observations regarding the time cell activity patterns and is closely related to the prominent behavioral and information processing models of interval timing. The model consisted of a single layer of time cells formulated as integrate-and-fire neurons, which results in wave activity with the application of a temporary external stimulus. The observed wave activity propagates with excitatory connections between the time cells and the ISIs increase with the resultant propagating wave. The model has a simple charge-discharge, loading-unloading mechanism. The overall inhibition converging on to all the time cells charges (loads) the time cells. When a time cell begins to receive synaptic excitation, the total hyperpolarizing inhibition over that particular time cell has to be compensated by the decay of the hyperpolarizing internal current (discharge-unload). Since the inhibition increases with the total number of activated time cells, the time it takes for a given time cell to spike also increases, resulting in incrementally longer ISIs. Importantly, the simple exponential decay of the internal current determines the length of the ISI. Hence, another mechanism for the wave activity propagation that utilizes exponential decay of some process would also give similar results to those presented here. For example, each unit can activate an additional slow inhibitory current on to the next time cell. The next time cell now has to wait for the exponential decay of the slow inhibition to spike, which takes longer as the inhibition builds up with the ongoing activity.

It is well-known that inhibition can help generate many neural rhythms, such as neural synchrony, irregularity in spike-times, persistent behavior, bursting, etc. (see e.g., Whittington et al., 2000; Rotstein et al., 2005; Moustafa et al., 2008; Guo et al., 2012, 2016; Neymotin et al., 2016; Zeki and Moustafa, 2017). In the current model, inhibition is used to modulate the behavior of the time cells. In particular, with the buildup of inhibition, the delay to the firing of time cells is increased with the help of a slow intrinsic current (D-current) that activates with inhibition and inactivates with excitation.

This architecture coupled with the biophysically-plausible functional characteristics of the proposed model captures the scalar property of interval timing, namely the constant coefficient of variation of timed responses for different target durations. Due to within- and between-trial noise characteristics, when the timed behavior of the model for different target intervals is expressed on a relative timescale, the predicted timed response curves superpose (**Figure 4C**). Furthermore, the model is also shown to be robust with respect to different values of the key model parameters.

The fact that ISIs extend with the progressing activity alone is not enough to get scalar variability in the model behavior. In fact, we did many numerical simulations by adding normal distributions with increasing mean ISIs and standard deviation that is increasing proportionally with the mean. In these simulations, the distributions obtained by adding the normal distributions with an increasing mean and standard deviation did not result in the scalar property; in fact, a lot of fine tuning would have to be done to obtain such a linear relationship between the standard deviation and mean. This observation supports the fact that the mechanism used in our paper has scalar variability as an emergent property.

One of the important features of these findings is that interval timing behavior emerges with its well-established statistical properties based on the dynamics of the proposed architecture. The memory for time intervals is embedded within the neural circuit itself, which does not require independent encoding/decoding and comparison functions/components as proposed in some other information processing models of interval timing (e.g., Gibbon et al., 1984) (but see Matell and Meck, 2004, for a similar feature of Striatal Beat Frequency Model). Overall, the proposed model extends the scope of particularly behavioral theories of interval timing (Killeen and Fetterman, 1988; Bizo and White, 1997; Machado, 1997) in terms of their neural clock implementation based on biophysically plausible components and neuronal dynamics. In particular, the proposed model provides neural plausibility to LET-like (Machado, 1997) approaches in light of the recent neuroscientific evidence regarding time cells.

The model parameters, for example the decay rate of the D-current inactivation variable or the maximal excitatory synaptic conductance, can easily alter wave speed. In Miller et al. (2006), it is shown that NMDA-type neural excitation is also capable of manipulating wave speed. To keep the model simple and parsimonious, we did not include the NMDA current in this model. However, the possible integration of NMDA currents and/or neuromodulators such as dopamine would be a natural next step for future research. The fact that the wave speed can be easily manipulated with the model parameters gives our model the ability to account for clock speed effects that are usually interpreted in light of the Dopamine-Clock Hypothesis; dopamine agonists have been shown to lead to overestimation of time intervals (e.g., Maricq et al., 1981; Çevik, 2003; Matell et al., 2006; Balci et al., 2008), whereas dopamine antagonists have been shown to lead to underestimation of time intervals (e.g., Meck, 1983; Meck and Church, 1984; Drew et al., 2005).

The capacity of the model in terms of the maximum possible time interval it can measure is limited by many factors. The built-up inhibition is compensated with the exponential decay of internal current (D-current). The inactivation of the internal current puts a cap on the capacity of the model. With the given set of parameters, the model can time intervals from a few hundred milliseconds to tens of seconds, which presents an acceptable range when compared to behavioral interval timing experiments (e.g., Meck, 1983; Meck and Church, 1984; Balci et al., 2008).

Interval timing is a complex process that combines various information-processing components such as reinforcement learning (e.g., Balci et al., 2009; Balcı, 2014; Petter et al., 2018), working memory (e.g., Zeki and Moustafa, 2017), etc. We simplified our model to focus on the mechanism of the transition from one time cell ensemble to another time cell ensemble. We assumed that an inhibitory cell activates with every time cell and stays active throughout the timing episode. This type of inhibitory layer can easily be realized using CAN-type calcium current or h-type depolarizing current that is used in representing persistent activity (e.g., Zeki and Moustafa, 2017). In order to keep the model easy to follow, we limited our focus to the transmission of the signal from one time cell to another. Another assumption was that, with the activation of an excited time cell, the time cell that is already active stops firing. An adaptation current such as the slow afterhyperpolarization (AHP) can make an active time cell stop in a delayed fashion instead.

Another limitation of the proposed model is the dependence of the timing behavior on the specific connectivity pattern between the time cells. This necessitates modular circuits in the brain that have evolved to specifically keep track of time intervals in the way proposed here. A modular perspective on cognitive functions favors the possibility of such specialized timing networks. Thus, it is possible that different timing circuits (e.g., at sensory cortices, cortico-striato-thalamocortical loop, cerebellum) in essence with similar functional characteristics are present. Recent findings on time cells with sequentially dynamic activation patterns indeed strengthen this very possibility.

## AUTHOR CONTRIBUTIONS

FB issued the question. MZ and FB designed the model idea together. MZ designed the model and ran the simulations. MZ and FB wrote the article.

## FUNDING

This study was supported by American University of the Middle East.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zeki and Balcı. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Differential of Frequency and Duration Mismatch Negativity and Theta Power Deficits in First-Episode and Chronic Schizophrenia

Yan-Bing Xiong<sup>1,2,3,4</sup>, Qi-Jing Bo<sup>1,2,3,4</sup>, Chang-Ming Wang<sup>1,2,3,4</sup>, Qing Tian<sup>1,2,3,4</sup>, Yi Liu<sup>1,2,3,4</sup> and Chuan-Yue Wang<sup>1,2,3,4</sup>\*

<sup>1</sup> Department of Psychiatry, Beijing Anding Hospital, Capital Medical University, Beijing, China, <sup>2</sup> Beijing Key Laboratory of Mental Disorders, Beijing, China, <sup>3</sup> Beijing Institute for Brain Disorders Center of Schizophrenia, Beijing, China, <sup>4</sup> The National Clinical Research Center for Mental Disorders, Beijing, China

Background: Due to its impairment in patients with schizophrenia, mismatch negativity (MMN) generation has been identified as a potential biomarker for identifying primary impairments in auditory sensory processing. This study aimed to investigate the dysfunctional differences in different MMN deviants and evoked theta power in patients with first-episode schizophrenia (FES) and chronic schizophrenia (CS).

#### Edited by: Fuat Balcı, Koç University, Turkey

Reviewed by: Deana Davalos, Colorado State University, United States; Klaus Mathiak, RWTH Aachen University, Germany

\*Correspondence: Chuan-Yue Wang wang.cy@163.net

Received: 13 August 2018 Accepted: 13 February 2019 Published: 06 March 2019

#### Citation:

Xiong Y-B, Bo Q-J, Wang C-M, Tian Q, Liu Y and Wang C-Y (2019) Differential of Frequency and Duration Mismatch Negativity and Theta Power Deficits in First-Episode and Chronic Schizophrenia. Front. Behav. Neurosci. 13:37. doi: 10.3389/fnbeh.2019.00037

Methods: We measured frequency and duration MMN from 40 FES, 40 CS, and 40 healthy controls (HC). Evoked theta power was analyzed by event-related spectral perturbation (ERSP) approaches.

Results: Deficits in duration MMN were observed in both FES (p = 0.048, Bonferroni-adjusted) and CS (p < 0.001, Bonferroni-adjusted). However, deficits in frequency MMN were restricted to the CS (p < 0.001, Bonferroni-adjusted). Evoked theta power deficits were observed in both patient groups when compared with the HC (p<sub>FES</sub> = 0.001, p<sub>CS</sub> < 0.001, Bonferroni-adjusted), yet no significant differences were found between FES and CS. Frequency MMN was correlated with the MATRICS consensus cognitive battery (MCCB) combined score (r = −0.327, p < 0.05) and MCCB verbal learning (r = −0.328, p < 0.05) in FES. Evoked theta power was correlated with MCCB working memory in both FES (r = 0.347, p < 0.05) and CS (r = 0.408, p < 0.01).

Conclusion: These findings suggest that duration MMN and evoked theta power deficits may be more sensitive for detection of schizophrenia during its early stages. Moreover, frequency MMN and theta power could potentially be linked to poor cognitive functioning in schizophrenic patients. The findings mentioned above indicate that the neural mechanisms of the three indexes may vary between people.

Keywords: schizophrenia, mismatch negativity, first-episode schizophrenia, chronic schizophrenia, time-frequency analysis

# INTRODUCTION


One of the central features of schizophrenia is cognitive impairment (Elvevåg and Goldberg, 2000; Weickert et al., 2000), which may include both higher-order functions, such as working memory, and essential sensory functions like auditory function (Javitt and Freedman, 2015; Javitt and Sweet, 2015). Mismatch negativity (MMN) is a negative component of the auditory event-related potential (ERP), which may be indicative of the neural mechanisms of cognitive dysfunction that occur in patients with schizophrenia (Näätänen et al., 2014; Hay et al., 2015; Javitt and Sweet, 2015). In addition, MMN is a key component of auditory and visual change detection in the environment (Randeniya et al., 2018). MMN deficits have been identified in patients with first-episode schizophrenia (FES; Hermens et al., 2010; Mondragón-Maya et al., 2013), patients with chronic schizophrenia (CS; Salisbury et al., 2002; Magno et al., 2008), and ultra-high risk individuals (UHR; Higuchi et al., 2014; Perez et al., 2014). Accordingly, MMN has been proposed as a biomarker for the early detection of patients with schizophrenia.

Auditory MMN is typically elicited with the auditory oddball paradigm, in which repeating standards are interrupted by rare deviant stimuli that can differ from the standards in one or more characteristics (Näätänen et al., 2001). MMN deficits have been widely investigated in relation to frequency and duration MMN deviants in patients with schizophrenia. However, the relative degree of these deficits remains to be elucidated (Avissar et al., 2017). Previous studies (Salisbury et al., 2002; Magno et al., 2008) found that frequency MMN deficits were common in CS, yet these deficits were not detected in FES. However, duration MMN deficits have been detected in both FES and CS. For this reason, frequency MMN may be an unreliable biomarker for diagnosing patients with schizophrenia during the early stages of this disease, especially when compared with duration MMN (Haigh et al., 2017).

Despite providing comprehensive information at a physiological level, ERP analyses of MMN offer limited regional details. In contrast to ERP analysis, neuro-oscillatory (event-related spectral perturbation, ERSP) approaches can provide information about underlying ERP disturbances at the circuit and molecular levels simultaneously (Javitt and Sweet, 2015). Previously, auditory MMN was shown to have primary evoked power within the theta frequency band (4–7 Hz) (Fuentemilla et al., 2008; Hsiao et al., 2009; Javitt, 2015), which is the band closely tied to the function of somatostatin (SST)-expressing (Womelsdorf et al., 2014) and multipolar bursting-type (Blatow et al., 2003) GABA interneurons. Some studies have already found that MMN-evoked theta power was impaired in patients with schizophrenia (Javitt et al., 2017). However, no studies have reported whether there is a difference in evoked theta power deficits between FES and CS.

In order to study the mechanism of MMN impairment in patients with schizophrenia, researchers have examined its association with clinical symptoms and cognition. With respect to the symptom domains of schizophrenia, a consistent relationship with MMN amplitude deficits has not been established. Some reports have shown that negative symptoms may be associated with MMN deficits (Grzella et al., 2001; Salisbury et al., 2002), while other studies have found an association with positive symptoms (Thönnessen et al., 2008; Fisher et al., 2011). However, one meta-analysis failed to detect a relationship between MMN amplitude deficits and symptom domains (Umbricht and Krljes, 2005). Interestingly, antipsychotic drugs do not improve MMN amplitude (Umbricht et al., 1999; Oranje et al., 2017). D-serine, which functions as an endogenous ligand for the glycine modulatory site of the N-methyl-D-aspartate-type glutamate receptor (NMDAR), was previously found to improve MMN amplitude in patients with schizophrenia (Kantrowitz et al., 2017). More importantly, many studies have also repeatedly demonstrated that NMDAR antagonists (ketamine or MK-801) can reduce MMN (Catts et al., 2016), suggesting that MMN generation is associated with NMDAR, but not dopamine receptors.

Recently, researchers have shifted their focus to the correlation between MMN measures and cognitive impairment. Some studies reported significant correlations between duration MMN amplitude and cognitive functioning in the domains of verbal fluency (Higuchi et al., 2013), social cognition (Wynn et al., 2010), and executive functioning (Toyomaki et al., 2008). However, a recent study reported significant relationships between MMN amplitude for frequency deviants and cognitive function in the verbal learning domain, but no significant relationships for duration deviants (Lee et al., 2017b). Regarding MMN-evoked theta power, D-serine treatment can also significantly improve theta response (Kantrowitz et al., 2016). More specifically, the improvement in theta occurs during the preparation interval, which correlates explicitly with an increase in auditory cognitive abilities (Kantrowitz et al., 2017).

In general, MMN amplitude deficits have been observed in patients with schizophrenia. However, the basis for the difference between frequency and duration MMN deficits in patients with schizophrenia remains to be elucidated. In this study, we hypothesized that the neural mechanisms of duration and frequency MMN deficits are different and that the deficits occur during different stages of schizophrenia. Therefore, we collected frequency and duration MMN amplitude data from FES and CS to test our hypothesis. Meanwhile, we used ERSP approaches to acquire the evoked power to the MMN standard stimulus. To the best of our knowledge, this is the first study to concurrently assess the electrophysiological indices of evoked power in both FES and CS. We hypothesized that the evoked power deficits in CS are worse than those in FES, similar to the MMN amplitude. Lastly, we analyzed the relationship between MMN measures (amplitude and evoked theta power) and clinical symptoms, sociodemographic measures, and cognitive impairments to further explore the neural mechanism.

## MATERIALS AND METHODS

#### Participants

Eighty-five patients with schizophrenia were recruited from Beijing Anding Hospital, Capital Medical University. For inclusion in this study, the diagnoses were validated with the Structured Clinical Interview for DSM-IV (SCID). FES was classified as appearing within three years of entry into this study, while CS appeared more than five years before entry into this study. HC were recruited by advertisement and had no DSM-IV axis I disorders.

The age range was 18–45 years old, and all participants had IQs ≥ 70. The exclusion criteria for this study included hearing disorders, learning disabilities, neurological impairments, and histories of seizures, head injuries, electroconvulsive therapy, and drug abuse. The Ethics Committee of Beijing Anding Hospital reviewed and approved this study and all subjects provided informed consent prior to inclusion.

#### Procedures

The auditory stimuli consisted of a sequence of binaural tones (825 trials) that were presented in random order with a stimulus onset asynchrony of 500–550 ms. Standard tones (675 trials, 82%) were 1000 Hz, 75 dB, and 50 ms in duration. Deviant tones included frequency and duration deviants. Frequency deviants (75 trials, 9%) were 1500 Hz, 75 dB, and 50 ms in duration. Duration deviants (75 trials, 9%) were 1000 Hz, 75 dB, and 100 ms in duration. At the beginning of the paradigm, the first 15 stimuli were used as the standard.
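A stimulus sequence with the trial counts given above could be generated along the following lines. This is a sketch under stated assumptions: the uniform draw of the stimulus onset asynchrony within 500–550 ms, the unconstrained shuffling of the trials after the 15 leading standards, and all names are illustrative choices, since the text does not specify how the randomization was implemented.

```python
import random
from collections import Counter

# (label, frequency in Hz, intensity in dB, duration in ms), from the text
STANDARD = ("standard", 1000, 75, 50)
FREQUENCY_DEVIANT = ("frequency_deviant", 1500, 75, 50)
DURATION_DEVIANT = ("duration_deviant", 1000, 75, 100)


def make_oddball_sequence(seed=0, n_standard=675, n_per_deviant=75,
                          n_leading_standards=15, soa_range_ms=(500.0, 550.0)):
    """Return (trials, soas_ms): 825 trials starting with 15 standards."""
    rng = random.Random(seed)
    remainder = ([STANDARD] * (n_standard - n_leading_standards)
                 + [FREQUENCY_DEVIANT] * n_per_deviant
                 + [DURATION_DEVIANT] * n_per_deviant)
    rng.shuffle(remainder)                      # assumed: plain random order
    trials = [STANDARD] * n_leading_standards + remainder
    soas_ms = [rng.uniform(*soa_range_ms) for _ in trials]  # assumed: uniform
    return trials, soas_ms


trials, soas_ms = make_oddball_sequence()
print(Counter(label for label, *_ in trials))
```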

# Electroencephalogram (EEG) Data Acquisition and Processing

Electroencephalogram (EEG) data were collected from all of the subjects using the 128-channel electrode system (Electrical Geodesics, Inc., Eugene, OR, United States) with ground procedures and standard reference. The impedance of the signal was adjusted to ≥50 K with a sampling rate of 1000 Hz. During the experiment, the subjects were seated comfortably in a light and sound-attenuated room to remove potentially interfering variables from the study. The test consisted of three sections with a 60 s break between each section.

For the ERP analyses, the EEG data were analyzed and processed using EEGLAB 14.1.1b<sup>1</sup>, which is a neural electrophysiological analysis tool based on MATLAB (MathWorks, Natick, MA, United States). The EEG data were processed using a 0.1–40 Hz bandpass filter (finite impulse response filter). The 50 Hz power frequency noise was subject to notch processing. The reference electrode was changed to a global brain average reference. Artifacts due to eye movement were excluded by independent component analysis (ICA; Makeig et al., 1997). The EEG was segmented from 100 ms before to 500 ms after stimulus onset. MMN waveforms were obtained by subtracting the response to the standard from the response to the deviant stimulus at the frontal midline (Fz) electrode (**Figures 1A,B**).
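The deviant-minus-standard subtraction can be sketched in plain Python as follows. The study itself used EEGLAB in MATLAB; this block is only an illustration of the arithmetic, and the function names and the toy epochs are hypothetical. Epochs are assumed to be equal-length lists of samples from a single electrode.

```python
def average_epochs(epochs):
    """Average equal-length single-trial epochs sample by sample (the ERP)."""
    n_epochs = len(epochs)
    return [sum(samples) / n_epochs for samples in zip(*epochs)]


def mmn_difference_wave(deviant_epochs, standard_epochs):
    """Deviant ERP minus standard ERP at one electrode (e.g., Fz)."""
    deviant_erp = average_epochs(deviant_epochs)
    standard_erp = average_epochs(standard_epochs)
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


def epoch_times_ms(t_start_ms=-100.0, t_end_ms=500.0, srate_hz=1000.0):
    """Sample times for an epoch from t_start_ms up to, not including, t_end_ms."""
    step_ms = 1000.0 / srate_hz
    n_samples = int(round((t_end_ms - t_start_ms) / step_ms))
    return [t_start_ms + i * step_ms for i in range(n_samples)]


# Toy three-sample epochs in microvolts (not real data)
standard_epochs = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
deviant_epochs = [[0.0, -1.0, 0.0], [0.0, -3.0, 0.0]]
print(mmn_difference_wave(deviant_epochs, standard_epochs))
```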

<sup>1</sup>http://sccn.ucsd.edu/eeglab/

For evoked (average) power analyses, ERP waves were transformed with the short-time Fourier transformation (STFT) method using MATLAB (MathWorks, Natick, MA, United States). Continuous wavelet transformation was carried out for the segmented EEG signal time. The EEG time range was from 100 ms before initiation to 500 ms after the stimulus onset, which was relative to the stimulus presentation time. The frequency range of the wavelet transformation was 1–20 Hz. Additionally, the temporal values of power corresponding to each frequency point were averaged across the trials, and thus EEG power time-frequency distribution was attained channel by channel. We extracted the maximum power values between 1–20 Hz and 100–250 ms of each subject for statistical analysis, and the range of the maximum value is within the theta frequency band of 4–7 Hz. We considered the theta frequency band to be the primary active frequency band of the nerve oscillation to the standard stimulus (**Figures 2A1–A3**).
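The peak extraction step can be illustrated with a small helper that searches a time-frequency power matrix for its maximum inside the 1–20 Hz, 100–250 ms window and reports where that maximum lies. The matrix layout (rows are frequencies, columns are time points), the function name, and the toy values are assumptions for illustration and do not reproduce the MATLAB pipeline used in the study.

```python
def peak_power_in_window(power, freqs_hz, times_ms,
                         f_range_hz=(1.0, 20.0), t_range_ms=(100.0, 250.0)):
    """Return (peak_power, freq_hz, time_ms) within the window, or None.

    power[i][j] is the power at freqs_hz[i] and times_ms[j].
    """
    best = None
    for i, freq in enumerate(freqs_hz):
        if not f_range_hz[0] <= freq <= f_range_hz[1]:
            continue
        for j, time in enumerate(times_ms):
            if not t_range_ms[0] <= time <= t_range_ms[1]:
                continue
            if best is None or power[i][j] > best[0]:
                best = (power[i][j], freq, time)
    return best


# Toy matrix: three frequencies by four time points (arbitrary units)
freqs_hz = [2.0, 5.0, 12.0]
times_ms = [50.0, 120.0, 180.0, 300.0]
power = [
    [9.0, 1.0, 1.5, 0.5],
    [0.2, 2.0, 4.0, 8.0],
    [0.1, 0.5, 0.7, 0.3],
]
peak, peak_freq, peak_time = peak_power_in_window(power, freqs_hz, times_ms)
print(peak, peak_freq, peak_time, "theta" if 4.0 <= peak_freq <= 7.0 else "other")
```

Values outside the window are ignored even when they are larger, as with the entries at 50 ms and 300 ms in the toy matrix.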

# Clinical, Intelligence Quotient and Neuropsychological Assessment

The clinical symptoms of each patient were evaluated with the Positive and Negative Symptom Scale (PANSS, Chinese version), which was described previously (He and Zhang, 1997). The Chinese intelligence quotient (IQ) test tool is a revised short form of the Wechsler adult intelligence scale-revised, and the four included subsets for this evaluation were information, similarities, picture completion, and block design (Pang et al., 2011). The MATRICS consensus cognitive battery (MCCB, Chinese version) was used to evaluate cognitive deficits in patients with schizophrenia and healthy controls (Shi et al., 2015).

#### Statistical Analysis

Data analyses were performed using SPSS 20.0 (IBM, Chicago, IL, United States). Continuous variables were compared using one-way analysis of variance (ANOVA) and categorical variables using the Chi-square test. Analysis of covariance was used to adjust for confounding effects of cognition among groups. A two-factorial mixed ANOVA with deviant type as the within-subject factor and group as the between-subject factor was carried out. Associations between MMN values and the scores from the MCCB tasks or the PANSS scale were analyzed using Pearson's correlation analysis. The significance level was set at p < 0.05. P-values were corrected using Bonferroni adjustment (B-adjusted). Differences between group means were quantified with Cohen's effect size d (Cohen, 1988), with d < 0.2 as a negligible effect size, 0.2–0.5 as a small effect size, 0.5–0.8 as a medium effect size, and d > 0.8 as a large effect size.
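The effect-size classification can be sketched as follows. The pooled-standard-deviation form of Cohen's d is an assumption, since the text does not state which variant was computed, and the handling of values that fall exactly on a boundary (0.2, 0.5, 0.8) is likewise a choice made here; the function names are hypothetical.

```python
import math

def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled SD (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    mean_a = sum(group_a) / na
    mean_b = sum(group_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean_a - mean_b) / pooled_sd

def effect_size_label(d):
    """Map |d| onto the categories used in the text (boundaries go to the larger class)."""
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"
```

For example, `effect_size_label(0.37)` gives `"small"` and `effect_size_label(0.81)` gives `"large"` under these thresholds.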

#### RESULTS

#### Demographics and Clinical Characteristics of Patients

Three FES and two CS were excluded from this study due to low-quality EEG data. In total, 40 FES, 40 CS, and 40 HC were enrolled in the study. The demographic characteristics and clinical data for the remaining participants are summarized in **Table 1**. There were no significant differences in age (df = 2, 117, F = 2.742, p = 0.069), gender (χ<sup>2</sup> = 1.364, p = 0.506), or education (df = 2, 117, F = 1.476, p = 0.233) among the three groups.

As expected, the HC had a higher IQ (df = 2, 117, F = 11.667, p < 0.001) and better MCCB task performance than FES and CS. The MCCB domain scores and statistical analyses are shown in **Table 2**. After controlling for IQ as a covariate, the differences among groups remained significant for all the MCCB domains (df = 2, 117; F<sub>speed of processing</sub> = 27.548, p < 0.001; F<sub>attention/vigilance</sub> = 37.489, p < 0.001; F<sub>working memory</sub> = 13.318, p < 0.001; F<sub>verbal learning</sub> = 7.53, p = 0.001; F<sub>visual learning</sub> = 3.69, p = 0.028; F<sub>reasoning and problem-solving</sub> = 11.035, p < 0.001; F<sub>social cognition</sub> = 3.321, p = 0.04; F<sub>MCCB combined</sub> = 31.889, p < 0.001).

# Mismatch Negativity to Different Deviants

Differences in frequency MMN, duration MMN, and standard ERSP among the three groups are shown in **Table 3**. More specifically, these findings describe the relative magnitude of MMN deficits between the groups. As predicted, there was a significant difference among the three groups in both frequency MMN (df = 2, 117, F = 30.968, p < 0.001) and duration MMN (df = 2, 117, F = 23.962, p < 0.001).

Correlations between evoked theta power and MCCB working memory showing a significant relationship between all three groups.

TABLE 1 | Demographic and clinical characteristics of patients with first-episode schizophrenia, chronic schizophrenia, and healthy controls.

<sup>∗</sup>FES = First-episode Schizophrenia; CS = Chronic Schizophrenia; HC = Healthy Controls; IQ = intelligence quotient; – = Not Available.

TABLE 2 | MATRICS consensus cognitive battery (MCCB) domain-scores for first-episode schizophrenia, chronic schizophrenia, and healthy control groups.

<sup>∗</sup>FES = First-episode Schizophrenia; CS = Chronic Schizophrenia; HC = Healthy Controls.

Compared with HC, FES showed no significant difference in frequency MMN (p = 0.269, Bonferroni-adjusted), and the Cohen's effect size was small (d = 0.37). However, the difference was significant (p < 0.001, Bonferroni-adjusted) and the Cohen's effect size was large (d = 1.72) when comparing CS with HC. Meanwhile, in terms of duration MMN, both FES (p = 0.048, Bonferroni-adjusted) and CS (p < 0.001, Bonferroni-adjusted) showed significant differences when compared with HC. The Cohen's effect sizes for FES and CS were approximately medium (d = 0.48) and large (d = 2.15), respectively. When the two patient groups were compared, there was a significant main effect for both frequency MMN (p<sub>F</sub> < 0.001, Bonferroni-adjusted) and duration MMN (p<sub>d</sub> < 0.001, Bonferroni-adjusted).

TABLE 3 | Mismatch negativity amplitudes and event-related spectral perturbation indexes.

<sup>∗</sup>FES = First-episode Schizophrenia; CS = Chronic Schizophrenia; HC = Healthy Controls; MMN = Mismatch Negativity.

The two-factorial mixed ANOVA, with deviant type (frequency and duration) as the within-subject factor and group (FES and CS) as the between-subject factor, showed a significant within-subject effect (F = 167.023, p < 0.001) and a significant between-subject effect (F = 75.835, p < 0.001). The MMN type (frequency vs. duration) × Group interaction was also significant (F = 25.112, p < 0.001).

The relationships between MMN amplitude and several variables, including sociodemographic variables, MCCB tests, and PANSS scale scores, are described in **Tables 4** and **5**. As we expected, no significant correlations were found between MMN amplitude and sociodemographic variables or between MMN amplitude and PANSS scale scores. In terms of the MCCB tests, frequency MMN in both FES (r = −0.327, p = 0.039, **Figure 1D2**) and HC (r = −0.438, p = 0.005, **Figure 1D1**) was strongly correlated with overall cognitive functioning, as assessed by the combined MCCB scores. However, CS frequency MMN was not associated with overall cognitive functioning. Of the three groups, only the healthy control group showed a correlation between duration MMN and overall cognitive function (r = −0.341, p = 0.032). Among the individual MCCB items, a correlation between frequency MMN and verbal learning was detected in FES (r = −0.328, p = 0.039, **Figure 1C2**) and HC (r = −0.583, p < 0.001, **Figure 1C1**), yet it was not observed in CS. In contrast, HC showed weak correlations between duration MMN and speed of processing (r = −0.326, p = 0.04), as well as between duration MMN and social cognition (r = −0.335, p = 0.034). However, after Bonferroni correction, only the correlation between frequency MMN and MCCB verbal learning in HC remained statistically significant, suggesting that the other correlations are significant only on a descriptive level.
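The effect of the Bonferroni adjustment on these correlations can be illustrated with a minimal sketch. The number of tests in the family is not reported in the text; the value of 24 used below is purely an assumption for illustration, the function name is hypothetical, and 0.001 stands in as an upper bound for the value reported as p < 0.001.

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]

# Uncorrected p-values quoted in the paragraph above.
reported = [0.039, 0.005, 0.032, 0.039, 0.001, 0.04, 0.034]

# ASSUMPTION: a family of 24 tests; the source does not state this number.
adjusted = bonferroni(reported, n_tests=24)
still_significant = [p for p in adjusted if p < 0.05]
```

Under that assumed family size, only the smallest reported value stays below 0.05 after adjustment, which is why correlations with uncorrected p-values near 0.03–0.04 are best read as descriptive.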

#### Evoked Theta Power to the Standard Stimulus

The differences in evoked theta power among the three groups are shown in **Table 3**. There was a significant difference among the three groups (df = 2, 117, F = 10.98, p < 0.001). Both patient groups showed significant reductions in evoked theta power when compared with the HC (p<sub>FES</sub> = 0.001, p<sub>CS</sub> < 0.001, Bonferroni-adjusted), yet there was no significant difference between the two patient groups (p = 1.000, Bonferroni-adjusted). The Cohen's effect size for FES was medium (d = 0.74), while the Cohen's effect size for CS was large (d = 0.81).

As shown in **Table 5**, in terms of MCCB domain correlations, evoked theta power in all three groups was significantly correlated with working memory (r<sub>HC</sub> = 0.361, p<sub>HC</sub> = 0.022, **Figure 2B1**; r<sub>FES</sub> = 0.347, p<sub>FES</sub> = 0.028, **Figure 2B2**; r<sub>CS</sub> = 0.408, p<sub>CS</sub> = 0.009, **Figure 2B3**). However, none of these correlations remained statistically significant after Bonferroni correction, which indicated that they were significant only on a descriptive level.

<sup>∗</sup>FES = First-episode Schizophrenia; CS = Chronic Schizophrenia; MMN = Mismatch Negativity; ERSP = event-related spectral perturbation; PANSS = Positive and Negative Syndrome Scale; <sup>∗</sup>p < 0.05.

TABLE 5 | Correlation between mismatch negativity measures and the MATRICS consensus cognitive battery domain.

<sup>∗</sup>FES = First-episode Schizophrenia; CS = Chronic Schizophrenia; HC = Healthy Controls; MMN = Mismatch Negativity; MCCB = the MATRICS consensus cognitive battery; ERSP = event-related spectral perturbation; <sup>∗</sup> p < 0.05; ∗∗ p < 0.01.

#### DISCUSSION

Mismatch negativity generation deficits have been recognized as one of the best potential biomarkers of cognitive impairment in patients with schizophrenia (Näätänen et al., 2016). While MMN deficits have been detected for several deviant types, the most extensively studied are those of duration and frequency, as both have been found to be significantly impaired in patients with schizophrenia (Friedman et al., 2012). Some researchers have found differential impairments in frequency and duration MMN and suggested that deficits in duration deviants may be more sensitive indices of MMN reduction during the early stages of schizophrenia (Todd et al., 2008). In our study, compared to HC, frequency MMN was only impaired in CS, but duration MMN was impaired in both FES and CS. Our results are therefore consistent with this view. A recent meta-analysis (Haigh et al., 2017), which compared the results from several studies that measured MMN reduction in patients with first-episode schizophrenia-spectrum, showed a negligible effect size of 0.04 SD for MMN to frequency deviants and a small-to-medium effect size of 0.47 SD for duration deviants. In individuals at ultra-high risk (UHR; Koshiyama et al., 2017), duration MMN was significantly smaller than in HC, whereas frequency MMN did not differ between the UHR individuals and HC. Together, these findings suggest that duration MMN is a better biomarker than frequency MMN for the early identification of patients with schizophrenia.

Both frequency and duration MMN deficits were more severe in CS than in FES, and the within- and between-subject effects were also significant, suggesting that MMN amplitude deficits may correlate with disease progression in patients with schizophrenia. However, MMN deficits were not significantly associated with disease duration for either deviant type in the two patient groups, which is in agreement with a recent meta-analysis (Erickson et al., 2016). One conceivable explanation is that MMN impairment worsens within the first 1 to 2 years after diagnosis but stabilizes after this critical period. Salisbury et al. (2007) previously reported a progressive course of MMN impairment during the initial 18 months of the disease. Nevertheless, our study did not find that MMN deficits in FES (mean duration of illness was 14.15 months) have any significant relationship with the length of illness. Another possible explanation for this discrepancy is that MMN deficits may be related to the age of the patients (Todd et al., 2008), yet several studies (Lee et al., 2017b,c), including the current investigation, have shown no correlation between MMN and patient age. Additionally, we did not detect an association with age at illness onset.

Although the differential deficits in frequency and duration MMN between FES and CS remain controversial, the correlation analysis with the MCCB suggested that the neural processing mechanisms of duration and frequency MMN might be different. The results of our study showed that frequency MMN deficits were related to MCCB verbal learning (T-score) in the HC and FES, yet duration MMN deficits were not significantly associated with any MCCB cognitive domain in either group. This may suggest that frequency MMN is a potential marker that can be correlated with auditory functioning early on in patients with schizophrenia. Similarly, previous researchers found that frequency MMN is intact in FES (Haigh et al., 2017), especially in patients with high premorbid functioning (Salisbury et al., 2017). However, even in these patients, declines in frequency MMN occur over the first few years of the disease in parallel with structural changes in the auditory cortex (Javitt et al., 2000; Kasai et al., 2003). In summary, duration MMN may be related to premorbid aspects of schizophrenia, while frequency MMN may be relevant to the decline in cognitive functioning that occurs during the early stages of the disease (Lee et al., 2017b).

Another important finding of our study is that theta power responses to standard stimuli were significantly impaired in both patient groups. Similarly, Lee et al. found that lower-frequency oscillations, which may be mapped to the theta frequency range, were elicited, especially by the MMN, and that these oscillations are impaired in patients with schizophrenia (Lee et al., 2017a,c). However, their recent study (Lee et al., 2017b) also found that alpha power was significantly impaired in patients with schizophrenia. In turn, low-range alpha and theta oscillatory frequencies may contribute to MMN, as they are all impaired in patients with schizophrenia. Our study also found that deficits in theta power responses to standard stimuli could not distinguish between FES and CS. More interestingly, theta power response was significantly correlated with MCCB working memory in all three groups. A recent review proposed that theta frequency generation may be tied to the impaired interplay between the cortical pyramidal neurons and local circuit SST-type GABAergic interneurons (Javitt et al., 2017). In addition, another study suggested that working memory may be correlated with GABA levels in patients with schizophrenia (Chen et al., 2014), which may indicate that theta power response to standard stimuli is a marker of auditory working memory in patients with schizophrenia.

# CONCLUSION

In conclusion, we show differential MMN deficits between FES and CS. Frequency MMN was not impaired in FES when compared with HC, and it was correlated with MCCB verbal learning. Duration MMN and theta-evoked power were impaired in both patient groups. In addition, duration MMN deficits were not correlated with any MCCB domain, yet theta-evoked power deficits were correlated with MCCB working memory in all three groups. These results suggest that the mechanisms of frequency and duration mismatch negativity and theta power deficits in FES and CS are different, and that the processes may occur during various stages of the disease. Duration MMN may be a more sensitive biomarker during the early stages of schizophrenia, while frequency MMN and theta power response to standard stimuli may be linked to a reduction in the cognitive functioning of patients with schizophrenia.

# LIMITATIONS

There are two primary methodological limitations of the current study that should be considered. First, some of the FES patients were already receiving antipsychotic therapy, so we were unable to completely rule out the potential effects of concurrent therapeutic intervention. Second, the age at illness onset of CS is generally earlier than that of FES. Because we wanted to ensure that age was matched between the subjects, there may be some heterogeneity between the subjects.

# AUTHOR CONTRIBUTIONS

Y-BX and Q-JB contributed to manuscript preparation. Y-BX and QT performed the neurophysiological data analysis and statistics. Y-BX and YL oversaw MMN data/demographic data collection. C-MW looked over the MMN test. C-YW was in charge of design and implementation of the study and contributed to data interpretation.

# FUNDING

This work was supported by the Major Brain Program of Beijing Science and Technology Plan (Z161100002616017), the National Science Foundation of China (81601169 and 81471365), the Beijing Municipal Administration of Hospitals Clinical Medicine Development of Special Funding Support (ZYLX201807 and XLMX201807), and Capital's Funds for Health Improvement and Research (2018-2-2123).

# ACKNOWLEDGMENTS

The authors thank all the subjects for participating in this study.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Xiong, Bo, Wang, Tian, Liu and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evidence for a Mixed Timing and Counting Strategy in Mice Performing a Mechner Counting Task

Kenneth R. Light<sup>1</sup>, Brian Cotten<sup>1</sup>, Talia Malekan<sup>1</sup>, Sophie Dewil<sup>1</sup>, Matthew R. Bailey<sup>2</sup>, Charles R. Gallistel<sup>3</sup> and Peter D. Balsam<sup>1,2</sup>\*

<sup>1</sup>Department of Psychology, Barnard College of Columbia University, New York, NY, United States, <sup>2</sup>Department of Psychology, Columbia University, New York, NY, United States, <sup>3</sup>Department of Psychology, Rutgers University, Piscataway, NJ, United States

Numerosity, or the ability to understand and distinguish between discrete quantities, was first formalized for study in animals by Mechner (1958a). Rats had to press one lever (the counting lever) n times to arm food release from pressing a second lever (the reward lever). The only cue that n presses had been made to the counting lever was the animal's representation of how many times it had pressed it. In the years that have passed since, many researchers have modified the task in meaningful ways to attempt to tease apart timing-based and count-based strategies. Strong evidence has amassed that the two are fundamentally different and separable skills but, to date, no study has effectively examined the differential contributions of the two strategies in Mechner's original task. By examining performance mid-trial and correlating it with whole-trial performance, we were able to identify patterns of correlation consistent with counting and timing strategies. Due to the independent nature of these correlation patterns, this technique was uniquely able to provide evidence for strategies that combined both timing and counting components. The results show that most mice demonstrated this combined strategy. This provides direct evidence that mice can and do use numerosity to complete Mechner's original task. A rational agent with fallible estimates of both counts made and time elapsed in making them should use both estimates when deciding when to switch to the second lever.

Keywords: counting, numerosity, timing, mice, operant conditioning

# INTRODUCTION

Numerosity is the ability of an organism to understand and distinguish between discrete quantities. Thus, counting to a certain number can be conceived of as a skill built on this ability because it is the ability to recognize when a pre-set number has been met or exceeded. To successfully count to a given number of responses, an organism must maintain an internal representation of the target number, keep track of the number of responses it has already made, and recognize when that number exceeds the target number.

#### Edited by:

Fuat Balcı, Koç University, Turkey

#### Reviewed by:

Federico Sanabria, Arizona State University, United States; William Albert Roberts, University of Western Ontario, Canada

#### \*Correspondence:

Peter D. Balsam, balsam@columbia.edu

Received: 01 February 2019 Accepted: 02 May 2019 Published: 25 June 2019

#### Citation:

Light KR, Cotten B, Malekan T, Dewil S, Bailey MR, Gallistel CR and Balsam PD (2019) Evidence for a Mixed Timing and Counting Strategy in Mice Performing a Mechner Counting Task. Front. Behav. Neurosci. 13:109. doi: 10.3389/fnbeh.2019.00109

Mechner (1958a,b) first tested the ability of rats to count on a task that required the rat to make a set number of responses or more (count requirement) on a counting lever before switching to a reward lever for a single response. These studies examined behavior under conditions where the count requirement was either 8 or 16. Interestingly, he found that under a variety of behavioral contingencies there was a robust pattern of behavior that appeared to indicate rats could effectively count out 8 or 16 responses before switching. Machado and Rodrigues (2007) replicated this study in pigeons using a more parametric approach, varying the count requirement from 4 to 32. They found that pecks increased linearly with count requirement, while the coefficient of variation (CV) remained fairly constant. In mice, Çavdaroğlu and Balcı (2016) demonstrated these same principles at count requirements of 10, 20, and 40.

Twenty-five years after Mechner's publication, Meck and Church (1983) developed bisection procedures where rats were tasked with discriminating either the duration or number of sounds. One of two levers was assigned to the short duration (or small number) while the other was assigned to the long duration (or large number). They then played either intermediate times or intermediate counts to see which lever the rats responded on. Rats effectively discriminated 4:1 ratios in both timing and counting, and the point of subjective equality (where 50% of responses occurred on the "long" or "large quantity" lever) was close to the geometric mean of the ends of the distribution. Thus, rats demonstrated numerosity perception in this task at a level comparable to their ability to discriminate time. Fetterman and Killeen (2010) found similar results in pigeons performing on a three-lever switch task dependent on count.
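To make the geometric-mean result concrete, consider two anchor values $S$ and $L$ in a 4:1 ratio. The specific numbers below are illustrative assumptions and are not taken from the studies cited above.

$$
L = 4S \;\Rightarrow\; \sqrt{S \cdot L} = \sqrt{4S^{2}} = 2S, \qquad \frac{S + L}{2} = 2.5\,S
$$

For hypothetical anchors $S = 2$ and $L = 8$, the geometric mean is $4$ and the arithmetic mean is $5$, so a point of subjective equality near $4$ rather than $5$ indicates bisection at the geometric mean.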

The comparable levels of counting and timing ability make it hard to determine the rats' true basis for the decision to switch levers. Timing and counting are nearly unavoidably correlated through rate. That is, a rat can either press a lever a given number of times or press at a certain rate for a certain duration of time and arrive at the appropriate number of lever presses before switching levers. Even when the stimulus is external, as in the bisection task, the presentation rate of stimuli confounds time and count. The two dimensions can be uncoupled by varying the duration of the cues to be counted, and such experiments (Fernandes and Church, 1982; Meck and Church, 1983) showed that animals can use either time or count.

Attempts to tease apart counting and timing strategies in count production tasks have also been successful. Mechner and Guevrekian (1962) found that the speed of responses increased with deprivation on a fixed ratio schedule but not on a minimum interval schedule. Instead, on the interval schedule the break time between runs decreased with increasing deprivation. Thus, time-based and response-based strategies could be isolated through deprivation. Although this was not done in a counting task, it does demonstrate that they are separate, and separable, abilities in the rat.

Wilkie et al. (1979) attempted to dissociate timing and counting in a modified Mechner counting task in two pigeons. When they introduced a variable interval (but not a fixed interval) between the first and second pecks the animals made, it theoretically interfered with the animals' ability to time this interval. Because performance was not perturbed by this variable interval (specifically in conditions 4 and 5) the authors concluded that timing was not necessary to complete the task.

The research cited above demonstrates that animals can both count and time but it does not indicate what the animals will do when both strategies are possible. To that end, Roberts and Mitchell (1994) demonstrated that pigeons faced with a stimulus that has both timing and number attributes can process both attributes simultaneously. Roberts et al. (2000) then demonstrated that these attributes can be brought under stimulus control with a discriminatory cue presented beforehand. Recently, Berkay et al. (2016) investigated animals' strategies by modeling performance in a numerical switch task. This task, similar to the one employed by Fetterman and Killeen (2010), allowed the group to investigate when the animals chose to move from the "few" response lever to the "many" response lever. This switching behavior is indicative of the mouse discerning that the number of presses it made on the "few" lever was equal to or more than the number required for reward at that lever, causing it to begin responding on the "many" lever to earn a reward on that trial. In this task, the group found evidence for behavior based on numerosity and only weak evidence that timing might also play some role in the performance. Intuitively one would expect animals to rely on the dimensions that they track with greatest precision. Both Fetterman (1993) and Çavdaroğlu and Balcı (2016) measured the variability of both timing and counting in the counting switch task and found that the variability of counting was less than that of timing. Further, Çavdaroğlu and Balcı (2016) showed that counting was more heavily weighted than timing in a regression analysis to predict the switch in response levers, while Fetterman (1993) found individual differences in the use of timing and counting strategies.

The present study sought to discern whether counting and/or timing guided behavior in the original Mechner counting task. In this task, the number of presses made on the counting lever before switching to the reward lever (which will be called terminal count) and the time it takes to make those presses (called the terminal time) are highly correlated. A way of discerning whether counting or timing is the basis of the decision to terminate presses on the counting lever arises from the fact that the rate of pressing varies from trial to trial because pressing is often interrupted by pauses of varying duration at varying points in the sequence. Because of these irregularities, when one ''drops in'' analytically (post hoc) on runs at a fixed time equal to half the average terminal time, the counts at that fixed time vary. The correlations between the varying elapsed counts observed at a fixed drop-in time and the terminal times and terminal counts depend on whether the animal counts its presses or times the elapsing interval(s) [**Figure 1**, red **(A)** and green **(B)** lines/dots]. Similarly, when one drops in at a fixed count equal to half the average terminal count, the times elapsed at that count vary [**Figure 1**, blue **(A)** and purple **(B)** lines/dots]. The correlations between the varying times elapsed at a fixed drop-in count and the terminal counts and terminal times depend on whether count or elapsed time or both is the basis of the animal's behavior.

We estimate the extent to which behavior was based on counting or timing or both by computing the four just-described drop-in correlations: terminal count and terminal time vs. drop-in count and terminal count and terminal time vs. drop-in time. When the drop-in is at a fixed time, t, and a counting strategy is being used, the following quantitative relations are relevant:


Here $c_t$ is the count reached at the drop-in time $t$, $\bar{C}$ is the average terminal count, $\lambda$ is the average rate of pressing, and the subscript $r$ denotes the count or time remaining after the drop-in:

$c_r = \bar{C} - c_t$: the count remaining is the average terminal count minus the count already made at the drop-in time.

$t_r = c_r/\lambda$: the time remaining is the count remaining divided by the average rate of pressing; the latter is a constant, so the smaller $c_r$ is, the shorter $t_r$ is.

$T = t + t_r$: the terminal time is the drop-in time plus the time remaining.

By substitution: $T = t + c_r/\lambda = t + (\bar{C} - c_t)/\lambda$. Because $t$, $\bar{C}$ and $\lambda$ are constants, a bigger than usual $c_t$ predicts a smaller than usual $c_r$, which predicts a shorter than usual $t_r$, which in turn predicts a shorter than usual $T$. Thus, $T$ (terminal time) should be negatively correlated with $c_t$ (**Figure 1C**, red bar on the left side). On the other hand, $C$ (terminal count) should not be correlated with $c_t$ because a smaller than usual $c_r$ offsets the effect on $C$ of a higher than usual $c_t$ (**Figure 1C**, green bar on the left side).

When the analytic drop-in is at a fixed count, $c$, rather than at a fixed time and a counting strategy is being used, then total time, $T = t_c + t_r$, should be longer than usual when $t_c$ is longer than usual, because the average count remaining to the target, hence the expected time remaining, $t_r$, should be a constant. Thus, $T$ should be positively correlated with $t_c$ (**Figure 1C**, blue bar on the left side), but $C$ should not be correlated with $t_c$ (**Figure 1C**, purple bar on the left side).
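A small worked example may help with the fixed-time case under a counting strategy. The numbers are hypothetical and chosen only for illustration: an average terminal count $\bar{C} = 10$, an average press rate $\lambda = 1$ press/s, and a drop-in time $t = 5$ s, with $c_t$ the count reached at the drop-in time, $c_r$ the count remaining, $t_r$ the time remaining, and $T$ the terminal time.

$$
\begin{aligned}
c_t = 6:&\quad c_r = 10 - 6 = 4, \quad t_r = 4/1 = 4\ \text{s}, \quad T = 5 + 4 = 9\ \text{s} \\
c_t = 4:&\quad c_r = 10 - 4 = 6, \quad t_r = 6/1 = 6\ \text{s}, \quad T = 5 + 6 = 11\ \text{s}
\end{aligned}
$$

The run that is ahead at the drop-in ($c_t = 6$) finishes sooner, giving the negative relation between $c_t$ and $T$, while the terminal count $c_t + c_r = 10$ is the same in both cases, so $c_t$ carries no information about $C$.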

A parallel analysis of what is expected when the termination of a run of presses on the counting lever is based solely on timing an interval (either the elapsed trial time or the elapsed run time) yields the pattern of correlations shown in **Figure 1C** on the right side.

As a check on the soundness of our analytic derivation of these correlation patterns, we ran two simulations. In the first, we chose a terminal count, n, for each simulated run at random from among the distribution of those counts for a given subject. We then randomly chose inter-response intervals from the distribution of inter-response intervals for that subject. We cumulatively summed them to obtain response times for the n responses. In this simulation, the decision to end a sequence depends only on count, as it would for a pure counting strategy. In the second simulation, we chose terminal run times at random from among the distribution of those times for a given subject. For each terminal run time, we then chose a long sequence of inter-response intervals at random from the distribution of those intervals for a given subject. We cumulatively summed these intervals and truncated the sequence at the end of the inter-response time that exceeded the target interval. Thus, the decision to end a sequence in this second simulation depended only on elapsed run time, as it would for a pure timing strategy. We then computed all of the correlations as we did with the real data. The drop-in correlation patterns on the simulated data were those predicted by our derivations (**Figure 1**).
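The two simulations and the four drop-in correlations can be sketched as follows. This is a minimal sketch and not the authors' code: it assumes gamma-distributed inter-response intervals and normally distributed terminal counts and target times in place of each subject's empirical distributions, and all function names and parameter values are hypothetical.

```python
import numpy as np

def drop_in_correlations(times, C, T):
    """Four drop-in correlations from cumulative press times.

    times: (runs, presses) cumulative press times; C, T: terminal count/time.
    """
    t0 = T.mean() / 2.0                  # fixed drop-in time
    c0 = int(round(C.mean() / 2.0))      # fixed drop-in count

    live = T > t0                        # runs still in progress at t0
    c_t = (times[live] <= t0).sum(axis=1)

    reached = C >= c0                    # runs with at least c0 presses
    t_c = times[reached, c0 - 1]

    r = lambda x, y: float(np.corrcoef(x, y)[0, 1])
    return {
        "count_at_t vs T": r(c_t, T[live]),
        "count_at_t vs C": r(c_t, C[live]),
        "time_at_c vs T": r(t_c, T[reached]),
        "time_at_c vs C": r(t_c, C[reached]),
    }

def simulate(strategy, runs=5000, max_presses=60, seed=0):
    """Simulate a pure 'count' or pure 'time' stopping rule (assumed distributions)."""
    rng = np.random.default_rng(seed)
    iri = rng.gamma(shape=4.0, scale=0.25, size=(runs, max_presses))  # mean 1 s
    times = np.cumsum(iri, axis=1)
    if strategy == "count":
        # Stop after a randomly drawn number of presses, independent of timing.
        C = np.clip(np.rint(rng.normal(12.0, 2.0, runs)), 2, max_presses).astype(int)
    elif strategy == "time":
        # Stop at the first press whose time reaches a randomly drawn target.
        target = np.clip(rng.normal(12.0, 2.0, runs), 1.0, None)
        C = np.minimum((times < target[:, None]).sum(axis=1) + 1, max_presses)
    else:
        raise ValueError("strategy must be 'count' or 'time'")
    T = times[np.arange(runs), C - 1]
    return drop_in_correlations(times, C, T)

if __name__ == "__main__":
    for name in ("count", "time"):
        print(name, simulate(name))
```

With these assumed distributions, the counting rule should give a negative correlation between the drop-in count and terminal time and a positive one between the drop-in time and terminal time, with near-zero correlations against terminal count; the timing rule should give the complementary pattern.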

Importantly, this correlational analysis can reveal strategies based on mixtures of count and elapsed times. Such strategies will yield mixtures of the "pure" patterns shown in the top panels of **Figures 4** and **5**. We use this correlation analysis in considering five possible strategies that subjects might use to decide when to terminate pressing the counting lever and switch to the reward lever: (i) a strategy based purely on the press count; (ii) a strategy based purely on the elapsed trial time; (iii) a strategy based purely on the elapsed run time (time from the first response); (iv) a strategy based on both the press count and the elapsed trial time; and (v) a strategy based on the press count and the elapsed run time.

## MATERIALS AND METHODS

#### Subjects

Eight adult male C57/BL6 mice (Jackson Labs, Bar Harbor, ME, USA) were used in this experiment. All mice were kept in standard laboratory cages, housed four per cage, in a temperature- and humidity-controlled vivarium on a 12 h light/12 h dark cycle with light onset at 7 am. Ad libitum access to water was maintained at all times while the animals were within their home cages. Feeding was restricted to maintain 85% of free-feeding body weight.

#### Apparatus

Mouse modular operant chambers (MED Associates, Fairfax, VT, USA) were used for all behavioral assessments in this study. The chambers were equipped with grid flooring, a house light, two retractable levers that flanked a feeding trough, and a dipper arm able to deliver 0.01 cc of evaporated milk.

#### Procedure

Mice were acclimated to the vivarium for 1 week prior to experimental testing. During this time, they were also acclimated to handling by being weighed daily. This baseline weight was then used to restrict food access as outlined above.

The first phase of training was dipper and lever press training, which occurred over two consecutive daily sessions. During both sessions, animals were rewarded with a drop of milk if they pressed the left lever. In session 1, the lever was extended and milk was delivered automatically after 30 s. A variable ITI with an average of 5 s then started before the lever extended again, starting the next trial. This continued for 20 trials. The second session was similar to the first, but the lever retracted after 20 s and the procedure continued for 30 trials.

Next, mice were trained on a forced response chain that increased the Fixed Ratio (FR) requirement for the first lever (''count'') over days. The FR requirement for the second lever (''reward lever'') was always 1. To achieve a forced chain, the count lever was extended until the FR requirement was met, at which point it retracted, the reward lever was extended, and a single press on the reward lever always resulted in a drop of milk. After the reward, the reward lever retracted and a variable ITI with an average of 15 s occurred before the count lever was again extended, starting the next trial. Subjects continued in this manner until 40 rewards were earned or 60 min elapsed. Subjects progressed through FR requirements of 1, 2, 3, 4, and 5 at individual paces before moving on to the next phase.

In the final phase of training, the response chain was no longer forced. Instead, both levers were extended at the start of a trial and both were retracted at the end. The trial only resulted in the presentation of a drop of milk if the mouse made at least the required number of responses on the count lever before switching to the reward lever for one press. Extra presses on the count lever were never penalized. Like the previous phase, this procedure continued until 40 rewards were earned or 60 min elapsed. All mice proceeded through requirements of 5, 7, and 10 presses on the count lever at individual paces. A subset of mice (n = 4) were then gradually moved to a requirement of 20 presses. All mice were then trained on their final requirement (10 or 20) for a minimum of 15 sessions.
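The reward contingency of this final phase can be summarized in a few lines. The sketch below is illustrative only: the function name and the reduction of a trial to a single press count are assumptions made here, not the control code that ran the operant chambers.

```python
def trial_rewarded(count_presses: int, requirement: int) -> bool:
    """Return True if a trial in the unforced phase would earn milk.

    count_presses: presses on the count lever before the single press
        on the reward lever (a hypothetical per-trial summary).
    requirement: minimum number of count-lever presses (5, 7, 10 or 20
        in the procedure described above).
    """
    # Extra presses are never penalized, so only the minimum matters.
    return count_presses >= requirement
```

Under this rule, a mouse that overshoots the requirement still earns the reward, whereas a single press short of the requirement forfeits the trial.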

## RESULTS

FIGURE 3 | Comparison of coefficient of variation (CV) across task strategies. Average CV of terminal values across the last five sessions of training is plotted as a function of three possible strategies. Error bars indicate standard errors of the mean. Asterisks indicate statistically significant LSD post hoc comparisons (p < 0.05).

Because this task required animals to perform until they obtained 40 rewards, the number of trials taken to reach those rewards is indicative of their ability to perform the task as well as the efficiency of the strategy they employed to perform it. However, not all subjects earned all 40 rewards every session (subject 2 earned no rewards in session 1 and only 22 in session 5; subject 3 earned 38 rewards in session 5) before the 60-min time out. Therefore, a better measure of efficiency is the number of rewards earned divided by the number of trials taken to earn them. By that metric, in the final five sessions of training, all animals achieved a stable level of efficiency. No differences in overall efficiency were observed between mice under the 10 press requirement and mice under the 20 press requirement (M = 0.64 in both cases; BF 1.9:1 in favor of the null against a bidirectional effect size of ±0.25). No group by trial interaction was observed (**Figure 2**).
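As a minimal sketch of the efficiency measure defined above (rewards earned divided by trials taken), the following may help. The function name and the session values in the example are hypothetical and are not data from the study.

```python
def efficiency(rewards_earned: int, trials_taken: int) -> float:
    """Rewards earned divided by trials taken to earn them."""
    if trials_taken <= 0:
        raise ValueError("trials_taken must be positive")
    return rewards_earned / trials_taken


# Hypothetical session: 32 rewards earned over 50 trials.
example = efficiency(32, 50)
```

Because the measure is a ratio, it remains comparable across sessions that ended at the 60-min time out before 40 rewards were earned.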

Intuitively, one might expect that when there are two evolving and correlated cues that may determine a decision, the cue on which the decision is based will have less variable terminal values, hence a smaller CV. Consequently, we compared the CVs of the three terminal quantities (trial time, run time and press count). The CVs differ significantly, F(2,21) = 5.08, p < 0.05 (**Figure 3**). Least-significant-difference post hoc analyses revealed that the average trial timing CV was significantly larger than both the run-time (p < 0.05) and the count (p < 0.01) CVs, but the latter two did not differ (p = 0.75).
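The CV comparison above rests on a simple computation: the standard deviation of the terminal values divided by their mean. The sketch below assumes the sample standard deviation (the text does not say which estimator was used) and uses made-up press counts purely for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean.

    Assumption: the sample (n - 1) estimator is used; the source does
    not specify which estimator the authors applied.
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical terminal press counts from five trials (not study data).
terminal_counts = [10, 12, 11, 13, 9]
cv_count = coefficient_of_variation(terminal_counts)
```

Applying the same function to terminal trial times and terminal run times would give the three per-animal CVs that enter the comparison.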

The larger trial timing CV does not definitively exclude the possibility of a trial-timing strategy contributing to the animals' decision to switch to the reward lever. To gather further relevant evidence, the four correlations described in the introduction were computed for each animal for the terminal trial times.

The correlations predicted by the two ''pure'' strategies are shown in the upper panel of **Figure 4**, while the correlations actually obtained are shown in the bottom panel. In the bottom panel, we see that the blue correlations were strongly positive in every subject, as is predicted by a counting strategy. A pure counting strategy also predicts that the red correlations should be strongly negative. This is true for only half the subjects; in the other half, the red correlations are non-significant. A pure counting strategy predicts non-significant green correlations, but these are significantly positive in six subjects, as predicted by a timing strategy. Thus, the results of a correlational analysis of elapsed trial times and counts imply that all the subjects terminated pressing on the left lever on the basis of the number of presses at least some of the time, but that the time elapsed played a role in their decision on at least some runs.

We conducted the same analysis—with two different drop-ins and the four correlations—with elapsed run times rather than elapsed trial times. In this analysis, the fixed drop-in time was half the average terminal run time, that is, half the interval from the first to the last press on the count lever. If the decision to terminate pressing is based on the time elapsed since the first press rather than on the time elapsed since the start of the trial, the time correlations will get stronger; hence the overall pattern of correlations might become more mixed.

In shifting from **Figure 4** to **Figure 5**, one sees that the evidence for a timing strategy increases in most subjects. Indeed, in Subject 7, the correlations expected from timing are significant, whereas the correlations expected from counting are non-significant. The pattern in Subject 7 contrasts strikingly with the pattern in Subject 1, whose pattern is exactly what is expected given a pure counting strategy. The other six subjects show clear evidence of dependence on both counting and timing.

FIGURE 6 | Relationship between efficiency and bias towards one strategy. Strategy Bias (difference between contribution of timing and counting) is plotted as a function of Overall Efficiency (number of rewards earned/number of trials for all five sessions). Positive values of contributor bias indicate counting bias while negative values indicate timing bias. The trend line indicates the regression line.

Next, we attempted to estimate the contributions of counting-based and timing-based decisions in individual animals from the run time analyses. To this end, we subtracted the correlation predicted to be negative for each contributor from the correlation predicted to be positive. That is, for counting we subtracted the correlation between terminal time and current count from the correlation between elapsed time and terminal time. For timing, we subtracted the correlation between terminal time and elapsed time from the correlation between current count and elapsed time. We were then able to calculate the individual animals' bias towards one strategy or the other by subtracting our timing measure from our counting measure. Thus, mice with biases towards count as the more important basis of their decision to switch levers had a positive value and mice with biases towards time as the more important basis of their decision to switch levers had a negative value.
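The arithmetic of the bias score can be sketched as follows. The parameter names are hypothetical labels for the four correlations described above, and the example values are invented for illustration; only the two subtractions and the final difference come from the text.

```python
def strategy_bias(count_pos_r, count_neg_r, time_pos_r, time_neg_r):
    """Bias score: counting measure minus timing measure.

    count_pos_r / count_neg_r: correlations predicted to be positive /
        negative under a counting strategy.
    time_pos_r / time_neg_r: correlations predicted to be positive /
        negative under a timing strategy.
    Positive results indicate a counting bias, negative a timing bias.
    """
    counting_measure = count_pos_r - count_neg_r
    timing_measure = time_pos_r - time_neg_r
    return counting_measure - timing_measure


# Invented correlations for one animal (not study data).
bias = strategy_bias(0.8, -0.1, 0.3, 0.0)
```

With these invented inputs the counting measure exceeds the timing measure, so the score is positive, which the text interprets as a counting bias.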

As a group, mice had an average bias of 0.046, indicating that mice generally show no clear bias towards one estimate or the other. However, the group mean is misleading because individual differences in this measure are very large. The values ranged from −0.80 to +0.92 because mouse 7 relied on a pure timing strategy while mouse 1 relied on a pure counting strategy. Most, however, showed evidence for both timing and counting with a mild bias toward one or the other.

Finally, we correlated mice's efficiency at completing the task (see above) with this measure of bias towards one strategy or the other. However, because of the small sample size, we cannot determine statistical significance; we can only examine the coefficients as descriptive of our sample. There was a small positive correlation between the bias score and efficiency (r = 0.23), suggesting that biases towards counting are more effective than biases towards timing (**Figure 6**). However, it is noteworthy that the mouse that was heavily biased towards timing and the mouse that was heavily biased towards counting were the two most efficient mice in the sample. Further, if one measures the correlation among only those that had a substantial count-basis for their decision (biases equal to −0.3 or larger), the correlation rises to 0.69. The same cannot be said when we examine only those with substantial timing-based biases (biases equal to 0.3 or smaller). The correlation then reverses to −0.26, indicating that when added to a counting strategy, timing is detrimental. Generally, then, choosing one variable to base a decision on is best, and mixing one's strategy comes at the sacrifice of efficiency.
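The coefficients reported above are ordinary Pearson correlations between per-animal bias scores and efficiencies. A minimal sketch of that computation, written out by hand so that it depends only on the standard library, is given below; the function name is an assumption, and no study data are reproduced.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of cross-products over the geometric mean of the sums of squares.
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return numerator / denominator
```

The subset analyses described in the text would correspond to filtering the animals by bias score before calling the function. With only eight animals, such coefficients are best read descriptively, as the authors note.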

## DISCUSSION

In the present experiment, a Mechner counting task was evaluated in terms of the animals' abilities to complete the task at 10 and 20 press requirements and the strategies employed to accomplish the task. All mice presented here were able to complete a 10 press requirement and the subset tasked with completing a 20 press requirement did so effectively. Further, no differences in efficiency in completing the task were observed for the different press requirements.

Three variables were considered as possible contributors to the decision to switch levers in this analysis: one count-based and two timing-based contributors. The count-based contributor was the one conceived of in the initial design of the task. That is, mice could track, or count, the number of times they pressed the counting lever and switch to the reward lever once some target number is reached. The first timing-based contributor considered here assumes tracking the time elapsed in a trial. On this assumption, mice track the time from when the levers are inserted into the chamber (the start of the trial) and press until some target trial time is reached. The second timing-based strategy here is a press-run timing strategy. In this case, mice track the time elapsed since their first press on the counting lever until a target elapsed run time is reached.

First, and most importantly, all but one of our subjects based their decision at least in part on counting, in support of the original intention of the task (Mechner, 1958a,b), as well as decades of previous work manipulating the task (Mechner and Guevrekian, 1962; Wilkie et al., 1979; Machado and Rodrigues, 2007; Fetterman and Killeen, 2010) and analyzing data from similar tasks (Berkay et al., 2016; Çavdaroğlu and Balcı, 2016). Most mice, regardless of count requirement and which time (trial or run) the count strategy was compared to, demonstrated a count-based contribution to the decision to switch levers.

The correlational analysis of each animal's efficiency across the final five sessions of training indicated that a strategy based on elapsed trial time is less predictive of accuracy than one based on either run time or counting as might be predicted from the greater relative variability of trial times compared to the other measures. In the correlational efficiency analysis, comparison of the run time strategy alongside of a count-based strategy revealed that most mice used a combination of the two. However, in terms of efficiency, the data seem to suggest that while choosing a nearly pure timing strategy was very effective, when using a more mixed strategy the greater contribution counting made to the decision to switch levers, the more efficiently mice performed the task.

The finding that a combined strategy seems to be the most prevalent is simultaneously the most revealing of the manner in which animals approach this task and others like it, and the greatest strength of this type of analysis. To understand why this is the case, one must first understand that, because count alone determines the outcome of this task, the optimal strategy here would be to count precisely. However, mice cannot count with sufficient precision to produce error-free performance. Therefore, adding a secondary strategy, such as timing a response-run, could lead to rewarded trials that would otherwise have been error trials due to erroneous counts. The evidence here seems to indicate that adding a second strategy may sacrifice efficiency. Presumably, mice would otherwise not be able to complete the task, so that sacrifice is often worth it. Interestingly, Roberts et al. (2000) presented evidence that, at least in pigeons, this compensatory strategy is only present when the animal is counting, but not timing. Due to the counting nature of the task, however, we are unable to make that distinction here.

Finally, the present results have implications for the way complex tasks tend to be analyzed, both in animal models and in humans. By parsing out different strategic contributions for individuals, one can then use a single task to measure multiple abilities. It would then be possible to manipulate one trait and see how that affects the individual contributors to the task as well as how the combination of those strategies might be altered. To use the present line of research as an example, one could alter animals' abilities to time and see how that affects their abilities to time and count on the Mechner task as well as whether a lack of timing ability makes them rely more heavily on a counting strategy. In this manner a much more holistic view of mouse performance becomes possible.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the National Institutes of Health (NIH) Guide for Animal Care and Use and the recommendations of the IACUC of Columbia University and New York State Psychiatric Institute. The protocol was approved by the IACUC of Columbia University and New York State Psychiatric Institute.

#### AUTHOR CONTRIBUTIONS

KL contributed to the design of the study, analysis of the data, writing of the manuscript. BC, TM, and SD contributed to the design and execution of the study. MB contributed to the design and analysis of the study. CG contributed to the analysis of the study and editing of the manuscript. PB contributed to the design of the study, analysis of the data, and editing of the manuscript. All authors contributed intellectually to this study.

## FUNDING

This work was supported by National Institutes of Health (NIH) R01 MH068073 to PB.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Light, Cotten, Malekan, Dewil, Bailey, Gallistel and Balsam. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Effect of Presentation Format on Judgment of Long-Range Time Intervals

#### Camila Silveira Agostino<sup>1,2</sup>, Yossi Zana<sup>2</sup> \*, Fuat Balci<sup>3</sup> and Peter M. E. Claessens<sup>2</sup>

<sup>1</sup> Department of Biological Psychology, Faculty of Natural Science, Otto von Guericke Universität Magdeburg, Magdeburg, Germany, <sup>2</sup> Center for Mathematics, Computing and Cognition, Federal University of ABC, Santo André, Brazil, <sup>3</sup> Department of Psychology, Koç University, Istanbul, Turkey

Investigations in the temporal estimation domain are quite vast in the range of milliseconds, seconds, and minutes. This study aimed to determine the psychophysical function that best describes long-range time interval estimation and evaluate the effect of numerals in duration presentation on the form of this function. Participants indicated on a line the magnitude of time intervals presented either as a number + time-unit (e.g., "9 months"; Group I), unitless numerals (e.g., "9"; Group II), or tagged future personal events (e.g., "Wedding"; Group III). The horizontal line was labeled rightward ("Very short" => "Very long") or leftward ("Very long" => "Very short") for Groups I and II, but only rightward for Group III. None of the linear, power, logistic or logarithmic functions provided the best fit to the individual participant data in more than 50% of participants for any group. Individual power exponents were different only between the tagged personal events (Group III) and the other two groups. When the same analysis was repeated for the aggregated data, power functions provided a better fit than other tested functions in all groups, with a difference in the power function parameters again between the tagged personal events and the other groups. A non-linear mixed effects analysis indicated a difference in the power function exponent between Group III and the other groups, but not between Groups I and II. No effect of scale directionality was found in either of the experiments in which scale direction was included as an independent variable. These results suggest that the judgment of intervals in a number + time-unit presentation invokes, at least in part, processing mechanisms other than those used for the time domain. Consequently, we propose the use of event-tagged assessment for characterizing long-range interval representation. We also recommend that analyses in this field should not be restricted to aggregated data given the qualitative variation between participants.

Keywords: temporal estimation, numerical estimation, personal events, power functions, model comparison

#### Edited by:

Laurence T. Maloney, New York University, United States

#### Reviewed by:

Thiago Leiros Costa, Katholieke Universiteit Leuven, Belgium; Darren Rhodes, Nottingham Trent University, United Kingdom

> \*Correspondence: Yossi Zana yossi.zana@ufabc.edu.br

#### Specialty section:

This article was submitted to Perception Science, a section of the journal Frontiers in Psychology

Received: 20 September 2018 Accepted: 11 June 2019 Published: 28 June 2019

#### Citation:

Agostino CS, Zana Y, Balci F, and Claessens PME (2019) Effect of Presentation Format on Judgment of Long-Range Time Intervals. Front. Psychol. 10:1479. doi: 10.3389/fpsyg.2019.01479

# INTRODUCTION

Most studies of time perception have focused on the short time range (i.e., milliseconds to minutes; Grondin, 2010). Despite the importance of processing longer-range intervals for humans, investigations of time perception in the days-to-years range are less common. Virtually all intertemporal decision-making tasks require the estimation of the magnitude of at least one long time interval at which the consequences of a choice would occur, and thus the psychophysics of time (i.e., the nature of mapping between objective [calendar] and subjective [perceived] time) has significant effects on the choice behavior in such scenarios. For instance, assume that intervals of 4 and 6 years are perceived as almost equally distant from now. A person has to choose between investing his/her money in mutual fund A or B. While fund A guarantees 40% nominal interest yield after 4 years, fund B guarantees 48% after 6 years. Perceiving the 4- and 6-year intervals as equidistant might result in a preference for investing in fund B, although a normative decision might suggest otherwise as the gain in the last 2 years is relatively low.
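The arithmetic behind the fund example can be made explicit. The sketch below assumes annual compounding, which the example does not state, and the function name is hypothetical; it converts each guaranteed total yield into an equivalent yearly rate.

```python
def annualized_rate(total_yield: float, years: float) -> float:
    """Equivalent yearly rate for a total yield, assuming annual compounding."""
    return (1.0 + total_yield) ** (1.0 / years) - 1.0


rate_a = annualized_rate(0.40, 4)  # fund A: roughly 0.088 per year
rate_b = annualized_rate(0.48, 6)  # fund B: roughly 0.068 per year

# Extra growth bought by waiting from year 4 to year 6 with fund B
# instead of cashing out fund A: roughly 0.028 per year.
marginal_rate = (1.48 / 1.40) ** (1.0 / 2.0) - 1.0
```

Under that assumption, fund A yields more per year than fund B, and the two additional years of waiting add comparatively little, which is the sense in which a normative decision would not favor fund B.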

This paper concerns the estimation of the subjective magnitude of long-range time intervals. We will refer to the relation between physical intervals and their internal representation as the psychophysical mapping function, and to the psychometric function when we discuss the experimental measurement of this relationship. For obvious reasons, the direct measurement of the psychometric function of long-range time intervals is impractical. Thus, early studies of long time interval processing were based on the finding that the reaction time of the left hand to small numbers is faster than to large numbers (Dehaene et al., 1993). Gevers et al. (2003, 2004) measured the Spatial Numerical Association of Response Codes (SNARC) effect, or the reaction time difference between right-handed and left-handed responses, as a function of the day of the week or month of the year. They found a linear relation between the magnitude of the SNARC effect and the distance to the reference stimuli.

Arzy et al. (2008, 2009) formulated a different approach. They theorized that, when people consider points in time, they use a mental time line on which they can project personal life events. In order to test this view, they asked participants to indicate if an event occurred before or after a reference point in time. They found that reaction times were slower and responses less accurate for distant events in the past and future in comparison to the reference events. Reaction time decreased logarithmically with distance from the reference point. Thus, differently from previous SNARC effect studies, time perception follows a logarithmic function when measured based on mental time "self-projection." It is noteworthy that both techniques are indirect in the sense that participants are required to discriminate between stimuli of different magnitudes rather than asked to estimate the magnitude of a specific time interval stimulus, and they rely on reaction times.

Instead, Faro et al. (2005, 2010) measured perception of long time intervals directly. In one study (Faro et al., 2005), participants were presented with pairs of historical events and asked to estimate the time between the two events in years. They did not, however, characterize the mapping functions. In another study, Faro et al. (2010) used a variant of the same procedure and reported a weak linear correlation between calendar and subjective time estimation without specifying the exact values. Kim and Zauberman (2009) and Zauberman et al. (2009) used a similar procedure, but presented time intervals textually in the form of a number and a time unit (e.g., "18 months") and used a horizontal line as a scale for responding, i.e., the response was measured by cross-modality matching, usually termed visual analogue scale (VAS). In their study, participants indicated the line-length that corresponded to the subjective magnitude of the presented time interval. Their results suggest that subjective time follows a decelerating power function (see also Han and Takahashi, 2012; Bradford et al., 2014). This cross-modal matching method coupled with textual time-interval as stimulus allows for the estimation of the psychometric function for long-range time intervals. In order for a time interval presented in such a format to be understood accurately, one needs the premise that there is a linear relation between the textual-numerical presentation and its subjective magnitude. Thus, the influence of the perception of symbolic numerals (e.g., conventional Arabic numerals as a decimal numeral system) should be considered. The capacity to represent counts as analog quantities is usually referred to as the Approximate Number System (ANS). There is ample evidence that this representation follows a compressed form. For example, Dehaene et al. (2008), using a horizontal line for subjective estimation, found that adults perceived the quantity of dots in the range of 1–100 in a logarithmically decelerating manner.
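To make the notion of a decelerating power function concrete, a power function S = a·T^b with exponent b < 1 can be fitted by linear regression on log-transformed values. The sketch below is one simple way to do this and is not necessarily the fitting procedure used in the cited studies; the function name and the synthetic data are assumptions for illustration.

```python
import math


def fit_power_function(intervals, judgments):
    """Fit S = a * T**b by least squares on log(S) = log(a) + b * log(T).

    Returns the tuple (a, b). All inputs must be strictly positive.
    """
    log_t = [math.log(t) for t in intervals]
    log_s = [math.log(s) for s in judgments]
    n = len(log_t)
    mean_t = sum(log_t) / n
    mean_s = sum(log_s) / n
    sxx = sum((x - mean_t) ** 2 for x in log_t)
    sxy = sum((x - mean_t) * (y - mean_s) for x, y in zip(log_t, log_s))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_s - b * mean_t)   # intercept back-transformed = scale
    return a, b


# Synthetic, noise-free judgments generated with a = 2 and b = 0.5.
months = [3, 6, 12, 24, 36]
synthetic = [2.0 * t ** 0.5 for t in months]
a_hat, b_hat = fit_power_function(months, synthetic)
```

An estimated exponent below 1 corresponds to the compressive, decelerating mapping described above, whereas an exponent of 1 reduces to a linear mapping.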

Arabic numbers, on the other hand, are symbolic representations of quantities, i.e., they only acquire quantity information (its semantics) indirectly through learned association with analog non-symbolic quantities (Gallistel and Gelman, 1992). Siegler and Booth (2004) asked participants to estimate the magnitude of Arabic numerals. They used numbers in the 0–100 range and a line-length for responding. In the characterization of psychometric functions, they found a progressive shift from a highly decelerating logarithmic function for kindergarteners to a linear function for undergraduate students (see also Siegler and Opfer, 2003).

In a tentative account of the linear mapping of Arabic numerals and logarithmic mapping of analog quantities, Dehaene (1992) proposed the triple-code model in which symbolic representations of quantity are transmitted linearly but processed by the same system used to process analog quantities. However, there is also evidence that suggests a more complex mapping of symbolic numerals. For instance, Dehaene et al. (1990) asked participants to judge if two-digit numbers are larger than 65 and found that reaction times were logarithmically related to the numerical distance from the target. In another study, Schley and Peters (2014) used the line-length technique to estimate numerals in the 1–1000 range and found that participants could be classified as those presenting a linear or those presenting a nonlinear psychometric function. Hybrid models propose that estimation of the magnitude represented by symbolic numbers relies on a learned symbolic-number linear mapping and an innate concave ANS mapping (Schley and Peters, 2014; Peters and Bjalkebring, 2015).

A more general theory of cognitive processing of magnitudes in different domains, called A Theory of Magnitude (ATOM), was proposed in order to understand the relation between time, space, and quantities (Walsh, 2003; Bueti and Walsh, 2009). ATOM argues that temporal and non-temporal dimensions (e.g., space, numerosity, size, speed) share a common representational metric and that they are represented by an innate generalized magnitude system localized in the parietal cortex. Among ATOM's direct predictions are the existence of a shared brain area for processing magnitudes of different domains and interference of magnitude estimation in one domain with the processing in other domains (e.g., Frassinetti et al., 2009). Moreover, ATOM predicts that processing in one domain influences processing in the other domains symmetrically, as all dimensions are processed by a single general-purpose analog magnitude system and no directionality or hierarchy between dimensions is specified (Martin et al., 2017). Supporting ATOM's predictions, there is evidence of shared brain areas for processing space, time, and quantity (e.g., neural activity correlation in the inferior parietal cortex, especially the right intraparietal sulcus; for a review see Bueti and Walsh, 2009), as well as behavioral results in animals (e.g., rats could transfer their judgments across the numerical and temporal domain, Church and Meck, 1984) and humans (e.g., the above-mentioned SNARC effect). However, evidence of linear mapping of numbers in adults, as opposed to logarithmic in children, and significant individual differences cannot be readily accounted for by ATOM's predictions. Furthermore, changes in one domain do not always affect the processing of another dimension, or there are asymmetries (Dormal et al., 2006; Karşılar and Balcı, 2016). For instance, duration judgments were found not to be affected by surface or numerosity (Martin et al., 2017).

Developmental changes in the mapping functions of abstract numerals and asymmetry in the interference between different dimensions of stimuli are predicted by an alternative theory, the metaphor theory (Lakoff and Johnson, 1980, 1999). The metaphor theory proposes that people mentally represent abstract and complex knowledge as conceptual metaphors. A typical linguistic example would be "They moved the meeting forward 2 hours," although it is not possible to perceive by our senses the "motion" of a meeting (Casasanto and Boroditsky, 2008). In western cultures, it is conventional to represent, in printed material such as graphs and comic strips, numbers and time as increasing horizontally from left to right. In view of the metaphor theory, time and space domains are independent at the beginning of development and become linked, asymmetrically, depending on individual experience, language, and culture (for a critical comparison of the two theories, see Winter et al., 2015). In support of the metaphor theory, there is evidence that people from different cultures and writing languages lay out time in space differently (Tversky et al., 1991; Fuhrman and Boroditsky, 2010; Anelli et al., 2018). For example, Tversky et al. (1991) asked participants to map events on ruled-off square sheets of paper. They found that left-to-right organization was dominant for English speakers as compared to right-to-left for Arabic speakers (Hebrew speakers had no dominant preference), in correspondence with the respective direction of the language writing system. In an analogous task, the representation of quantity was evaluated (e.g., amount of sand). In contrast to the horizontal directionality of the representation of temporal properties, quantity properties showed no horizontal directionality: both right-to-left and left-to-right were equally used by participants speaking different languages. However, others found a directionality effect for numerical reasoning that, just like temporal perception of events, was modulated by culture and language (e.g., Dehaene et al., 1993; Bachtold et al., 1998; Zebian, 2005). French-speaking participants showed a left-to-right SNARC effect for large numbers and the inverse for small numbers (Dehaene et al., 1993; see also Bachtold et al., 1998 for similar results; for a review, see Toomarian and Hubbard, 2018).

Taken together, the empirical evidence and models suggest that symbolic numerals are perceived in a linear or concave shape in a way that varies between individuals. This leads to the question as to whether the Arabic numerals in the time interval stimulus type influence the resulting mapping function. This study aimed to bridge the empirical but theoretically critical gap in the literature by characterizing the psychophysical function that best mapped the subjective magnitudes onto objective long-range time intervals using different formats through which time intervals were presented. To this end, we tested participants in three different conditions: presentation of intervals as a number + time-unit (e.g., "9 months," Group I), as unitless numerals (e.g., "9," Group II), or as tagged future personal events (e.g., "graduation," Group III). The effect of directionality of the time scale was also evaluated in the first two conditions under the assumption that different effects of scale direction indicate different cognitive processing.

Our results suggest that, when presented with time-intervals as numerals, people use, at least in part, knowledge from domains other than time to estimate magnitudes in this context. In addition, our results emphasize the importance of considering individual differences. Finally, we discuss the procedural implications of our findings.

# EXPERIMENT I: NUMBERS WITH TIME UNIT

#### Materials and Methods

All participants were healthy undergraduate students who volunteered to take part in the experiment. Experimental protocols followed the conventional cross-modality line-length matching paradigm (Zauberman et al., 2009). In all three experiments, prior to the beginning of the first task, participants provided written informed consent for their participation. The Research Ethics Committee at the Federal University of ABC approved all experimental protocols.

In Experiment I, 18 participants (mean age 22.2 years, 10 women) were seated in an isolated laboratory room, 70 cm from a computer monitor. The following instructions were presented on the screen, in Portuguese: "In this study you will be asked to indicate your subjective feeling of duration between today and many days in the future. Time intervals vary between 3 and 36 months. Please, read the instructions carefully and indicate your response."<sup>1</sup> After confirming that the participant had understood these instructions, the following text was displayed on the upper part of the screen (**Figure 1**, left panel): "Imagine the interval below. Move the bar to indicate how long you consider the duration between today and the given interval."<sup>2</sup> In each of two direction conditions, the time interval in months was presented below these instructions in the format, e.g., "15 months," according to a random permutation of five repetitions from the set {3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36}. Below the numeric time interval, a 180 mm line was presented with labels "very short" and "very long" at the extremes. In one configuration of the task, the labels "very short" and "very long" were placed at the left and right sides, respectively ("forward" condition). In the other configuration, the order was reversed ("backward" condition). The presentation order of direction conditions was counterbalanced between participants. The initial position of the mouse cursor was always at the center of the line. Participants had to move the cursor to the right or left to arrive at the desired segment length and click the left mouse button to confirm their choice. The maximum response window was 10 s, after which a new trial was initiated. No-response trials were treated as missing values during statistical analysis. Each of the 12 time intervals was presented five times, totaling 60 trials per session. Four training trials with a random selection of intervals were presented at the beginning of the task to familiarize participants with the procedure; these data were not included in the analyses. The time intervals below 12 months were presented in a single digit format, e.g., "9," rather than as double digits ("09"). The format used in the current study was chosen to be consistent with previous studies and thus allow for better comparison with their results (e.g., Zauberman et al., 2009).

<sup>1</sup> "Nesse estudo, você será solicitado a indicar seu sentimento subjetivo da duração entre hoje e vários dias no futuro. Os dias podem variar entre 3 e 36 meses. Por favor, leia as instruções cuidadosamente e indique suas respostas."
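The stimulus schedule described above (12 intervals, five repetitions each, in random order) can be sketched as follows. This is a minimal illustration, not the authors' presentation software: the function names and the `seed` argument are hypothetical, and the sketch assumes a single shuffle over all 60 trials, whereas the text does not say whether randomization was blocked.

```python
import random

# Interval set in months, as listed in the text.
INTERVALS = [3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36]
REPETITIONS = 5  # each interval is shown five times per session


def make_trial_sequence(seed=None):
    """Return a shuffled list of 60 interval values (12 intervals x 5 repetitions)."""
    rng = random.Random(seed)  # seed is a hypothetical convenience for reproducibility
    trials = INTERVALS * REPETITIONS
    rng.shuffle(trials)
    return trials


def format_stimulus(months):
    """Format a stimulus label without zero padding, e.g. "9 months" rather than "09 months"."""
    return f"{months} months"


sequence = make_trial_sequence(seed=1)
print(len(sequence), format_stimulus(sequence[0]))
```

Shuffling one pooled list guarantees that every interval appears exactly five times per session, which matches the 60-trial total given in the text.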

#### Analysis

The evaluation of the explanatory power of different quantitative models and the impact of experimental conditions was performed at three different levels of analysis. In all cases, linear, power, logarithmic, and logistic functions were fit to the data, in which each data point represented the average of five repeated measurements per participant, per condition. At a global analysis level, functions were fit to the data points (individual responses) from all the participants in a specific group using linear or nonlinear regression. At an individual level, the same analysis method was repeated, but on the data from each participant. Nonlinear mixed-effects modeling was used as an intermediate-level analysis, as it explicitly represents between-participant variation, while simultaneously estimating group means per condition. Model selection was conducted based on a Bayesian model selection approach. Amongst other advantages, the Bayesian approach provides a probabilistic quantification of relative evidence in favor of all hypotheses, including the "null" hypothesis (Gallistel, 2009). It also allows, with a sound theoretical basis (Raftery, 1995; Wagenmakers, 2007), direct comparison of non-nested models, which is the case for the nonlinear models tested in this paper. As a general procedure, all four functions were fit with no restriction on their parameters. The preferred function was chosen according to the Schwarz weights ω<sub>i</sub> (Schwarz, 1978; Wagenmakers and Farrell, 2004):

$$
\Delta_{i} = \text{BIC}_{i} - \min \text{BIC}
$$

$$
\omega_{i} = \frac{\exp\left(-\Delta_{i}/2\right)}{\sum_{m=1}^{M} \exp\left(-\Delta_{m}/2\right)}
$$

where BIC is the Bayesian Information Criterion (Schwarz, 1978) and M is the number of models. The Schwarz weight for a model can be understood as its relative advantage in comparison with other models of a set, based on the posterior probabilities of all models involved. ω<sub>i</sub> assumes values from 0 to 1 and the sum of the weights of all functions is 1. The higher the weight of a specific function, the stronger the relative evidence in its favor. Since Schwarz weights are calculated using BIC, choosing according to the highest Schwarz weight is identical to choosing according to the lowest BIC; thus, the selection criterion is actually the BIC, and the weight is a translation of the relative evidence for a model. The Bayes Factor (BF) is another BIC-based measurement of relative evidence of models (Vandekerckhove et al., 2015). It is a useful measure to tell how much more probable one model is than another. However, in comparison with ω<sub>i</sub>, it is restricted to pairwise comparisons. The value of the Bayes factor expresses how much more likely one function is than the other and is presented in this paper when a specific nonlinear function is compared to the linear alternative. The computation of a Bayes factor, in principle, takes into account a specific choice of a prior distribution over the model parameters; the exact value obtained depends on the choice of the prior. As a general approach, we calculated Bayes factors based on the Bayesian Information Criterion. In practice, this approach can be considered one specific approximation to other possible Bayes factors, namely one that assumes that parameter priors are flat, and has the advantage of ease of computation. The Bayes Factor was therefore computed based on the BIC, as follows:

<sup>2</sup> "Imagine o intervalo abaixo. Mova a barra para indicar quão longa você considera a duração entre hoje e o intervalo dado."

$$
\Delta \text{BIC} = \text{BIC}_{\text{linear}} - \text{BIC}_{\text{nonlinear}}
$$

$$
\text{BF} = \exp\left(\Delta \text{BIC}/2\right)
$$

A BF higher than 1 represents evidence in favor of the nonlinear function. As a reference, Bayes factors between 3 and 20 constitute "positive" evidence, while values between 20 and 150 constitute "strong" evidence in favor of the function with the lower BIC. Any value above 150 can be taken as "very strong" evidence (Raftery, 1995).
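The Schwarz weights and the BIC-based Bayes factor defined above reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the BIC values are invented rather than taken from the study, and the label returned for Bayes factors below 3 is a placeholder, since the text gives no category for that range.

```python
import math


def schwarz_weights(bics):
    """Map a dict {model_name: BIC} to Schwarz weights that sum to 1."""
    best = min(bics.values())
    raw = {name: math.exp(-(bic - best) / 2.0) for name, bic in bics.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def bic_bayes_factor(bic_linear, bic_nonlinear):
    """BIC approximation to the Bayes factor; values above 1 favor the nonlinear model."""
    return math.exp((bic_linear - bic_nonlinear) / 2.0)


def evidence_label(bf):
    """Verbal category for a Bayes factor, using the cut-offs quoted in the text."""
    if bf > 150:
        return "very strong"
    if bf >= 20:
        return "strong"
    if bf >= 3:
        return "positive"
    return "weak"  # placeholder label: the text names no category below 3


# Hypothetical BIC values, invented for illustration (not results from the study).
example_bics = {"linear": 1002.9, "power": 1000.0, "logarithmic": 1010.0, "logistic": 1006.0}
weights = schwarz_weights(example_bics)
bf = bic_bayes_factor(example_bics["linear"], example_bics["power"])
print({name: round(w, 3) for name, w in weights.items()})
print(round(bf, 2), evidence_label(bf))
```

Subtracting the minimum BIC before exponentiating leaves the weights unchanged but avoids numerical underflow when BIC values are large.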

All analyses were conducted in Matlab and The R Project for Statistical Computing, the latter with the nonlinear mixed-effects (nlme) package to obtain maximum-likelihood fits of nonlinear models (Pinheiro et al., 2016).

#### Results

**Figure 2** shows the responses in each configuration from Group I – Number + time-unit, which performed Experiment I and in which participants were asked to indicate their estimate of the duration of the time interval. In the Forward configuration, in which the left extreme of the horizontal line was labeled "very short" and the right "very long," participants picked a longer line length as the indicated time interval increased. Similar results were found when the scale was inverted (Backward configuration). Linear, power, logarithmic, and logistic regression analyses on the global data showed that, in all experimental conditions, over 88% of the variance was explained, with the exception of the logarithmic function, which had lower values for all conditions (**Supplementary Table 1**). The power function had the highest posterior probability, as compared to the linear, logarithmic, and logistic functions, with ω<sub>i</sub> = 0.81, BF = 4.18 (positive evidence) and ω<sub>i</sub> = 0.88, BF = 8.24 (positive evidence), in the Forward and Backward conditions, respectively. One can say that, in the Forward condition, the data are 4.18 times as likely under the power function as they are under the linear function. Similar results were obtained when AIC (Akaike, 1974) was used instead of BIC. AIC is another popular information criterion that penalizes models less for complexity (number of free parameters). The estimates for the exponent of the power model were 0.77 and 0.75 in the Forward and Backward conditions, respectively. The effect of scale direction was evaluated using a nonlinear mixed-effects model (**Figure 3** and **Supplementary Table 2**). The model was a power function of the form y = c + αx<sup>β</sup>, with the three parameters, α, β, and c, as fixed effects dependent on the Direction condition, with additional Gaussian random-effect components to allow for individual effects, besides the error term. The predicted variable y was the estimated line-length response on the original scale, i.e., in pixels. The result of this model fit showed no significant role for Direction (**Figure 3**, t = 1.17, df = 409, p = 0.24); thus, the scale direction did not substantially influence the responses.<sup>3</sup>
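To make the comparison between the power form y = c + αx<sup>β</sup> and the linear alternative concrete, the sketch below fits both to synthetic data and compares their BIC values. It is not the authors' procedure, which used nonlinear regression and the nlme package: here β is scanned over a hypothetical grid and c and α are solved in closed form, the BIC is the Gaussian least-squares form up to an additive constant, and the data are generated from a known power law purely for illustration.

```python
import math


def ols_line(u, y):
    """Ordinary least squares for y = c + a*u; returns (c, a, rss)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(y) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    a = sxy / sxx
    c = mean_y - a * mean_u
    rss = sum((yi - (c + a * ui)) ** 2 for ui, yi in zip(u, y))
    return c, a, rss


def fit_power(x, y, betas=None):
    """Fit y = c + alpha * x**beta by scanning beta and solving c, alpha in closed form."""
    if betas is None:
        betas = [b / 100.0 for b in range(10, 201)]  # 0.10 .. 2.00, a hypothetical grid
    best = None
    for beta in betas:
        c, alpha, rss = ols_line([xi ** beta for xi in x], y)
        if best is None or rss < best["rss"]:
            best = {"c": c, "alpha": alpha, "beta": beta, "rss": rss}
    return best


def bic_gaussian(rss, n, k):
    """BIC for a least-squares fit with Gaussian errors, up to an additive constant."""
    return n * math.log(rss / n) + k * math.log(n)


# Synthetic data from a known power law plus fixed perturbations (illustration only).
x = [3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36]
noise = [0.4, -0.3, 0.2, -0.5, 0.1, 0.3, -0.2, 0.5, -0.4, 0.2, -0.1, 0.3]
y = [5.0 + 8.0 * xi ** 0.75 + e for xi, e in zip(x, noise)]

power = fit_power(x, y)
_, _, rss_linear = ols_line(x, y)
bic_power = bic_gaussian(power["rss"], len(x), k=3)   # parameters: c, alpha, beta
bic_linear = bic_gaussian(rss_linear, len(x), k=2)    # parameters: intercept, slope
print(power["beta"], bic_power < bic_linear)
```

An exponent below 1 corresponds to the decelerated, concave mapping discussed in the text, while β = 1 reduces the power form to the linear model.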

# EXPERIMENT II: NUMBERS ONLY

Having tested participants and characterized their psychometric functions for long-range intervals with numeral-unit pairs in Experiment I, we performed the same investigation in Experiment II with numbers only. The manipulation aimed to test whether omitting the time units affects the nature of the psychometric function.

#### Materials and Methods

Twenty volunteers (mean age 20.7 years, 7 women) participated in this experiment. Procedures were identical to those of Experiment I, with the exception of the references to time units. The following instructions were presented on the screen:<sup>4</sup> "In this study you will be asked to indicate the MAGNITUDE (small or big) of numbers of values between 3 and 36. Read the instructions carefully and indicate your responses." After the confirmation that the participant had understood these instructions, the following instructions were displayed on the upper part of the screen:<sup>5</sup> "Consider the number below. Move the bar to indicate how small or big it is." The number was presented below these instructions (**Figure 1**, middle panel). All analysis procedures were identical to those performed in Experiment I.

#### Results

**Figure 4** shows the results for Group II – Number, which performed Experiment II with numerical stimuli only and in which participants were asked to estimate the magnitude of numbers. Results were similar to those of Experiment I. Linear, power, logarithmic, and logistic regression analyses on data aggregated across volunteers showed that, in all cases, over 79% of the variance was explained, with the logarithmic function

<sup>3</sup>Group and Direction effects were tested under the assumption that the power function most appropriately represents the data, based on the results of the aggregated data analysis. However, one can argue that, if individual differences would be considered, one of the other functions might be chosen. We verified this possibility taking a hierarchical approach and using non-linear mixed effects analysis, treating model parameters as individual and randomly distributed in the population, i.e., as random effects, to compare the four functions considered in this paper. Since there is inter-individual difference not only in parameters but also in which of the functions best fit the data, the models did not always converge without interventions – we allowed excluding up to two subjects per Group if necessary for convergence of any candidate in the model set, and optimizing parameters on a log rather than a linear scale, in effect testing different population distributions for the parameters. The results are in agreement with the fits on the aggregated data, with the power model outperforming the others when using either BIC or AIC as criterion. In Experiment I, the log Bayes factors for the power model versus the linear model were 23.99 and 54.99 for Forward and Backward configurations, respectively.

<sup>4</sup> "Nesse estudo, você será solicitado a indicar a MAGNITUDE (pequeno ou grande) de números de valores entre 3 e 36. Leia as instruções cuidadosamente e indique suas respostas."

<sup>5</sup> "Considere o número abaixo. Mova a barra para indicar quão pequeno ou grande ele é."

FIGURE 2 | Number + time-unit interval magnitude estimation of the 18 participants in Experiment I. In the Forward configuration (black color), line-length was measured from left to right. In the Backward configuration (orange color), line-length was measured from right to left. Each dot represents the average of five measurements from an individual participant. Continuous lines represent the best fitting power functions, while dotted lines represent the upper and lower prediction bounds for the fitted functions with a confidence level of 95%. It should be noted that, while confidence intervals are usually associated with the distribution of data around a certain value, we show the prediction interval for a given confidence level, which is associated with the probability of the fitted function – without the random component representing inter-individual variation – being within an interval.

having the lowest values (**Supplementary Table 1**). The power function had the highest posterior probability, as compared to the linear, logarithmic and logistic functions, with ω<sub>i</sub> = 0.81 in the Forward condition. However, in the Backward condition, the linear function was preferred with ω<sub>i</sub> = 0.80. Bayes Factors were 4.35 and 4.40 ("positive evidence") in the Forward and Backward


conditions, respectively. However, when AIC was used instead of BIC, the power model was preferred. This outcome shows that, in the Backward configuration, the power model is preferred to the linear model when function complexity is less penalized. The estimates for the exponent (β) of the power model were 0.84 and 0.87 in the Forward and Backward configuration, respectively.

The effect of the scale direction was evaluated using a mixed-effects power model in the same manner as in Experiment I (**Supplementary Table 2**). The result of this model fit showed no significant effect of Direction (**Figure 3**, t = −1.72, df = 455, p = 0.08). However, the close-to-significant effect is in agreement with the finding of different best-fitting models in the Backward and Forward conditions at the global analysis level.<sup>6</sup>

# EXPERIMENT III: PERSONAL EVENTS

There was a large overlap in the presentation formats of Experiment I and Experiment II, as both contained precise magnitude information presented in the form of Arabic numerals. In Experiment III, we followed a novel route by referring to time intervals via personal events and tested whether such a change in the presentation format would affect the psychometric function that maps objective long-range time intervals onto subjective magnitudes.

#### Materials and Methods

Twenty-one volunteers participated in this experiment. Procedures were similar to those of Experiment I, with variations due to the different stimulus types. Participants were asked to list 20 events that were expected to happen in their personal lives between the next month and up to 3 years in the future. Alongside the name of each event, participants also wrote down its expected month and year. Twelve events were chosen such that the spread of the time range was maximized. The experimental session took place after an interval of 2–7 days. Participants received the following instructions:<sup>7</sup> "In this study, you will be asked to indicate your subjective feeling of the time interval duration between today and a certain future event. The events will occur in up to 36 months. Carefully read the instructions and indicate your answers." After the confirmation that the participant had understood these instructions, the following instructions were displayed on the upper part of the screen (**Figure 1**, right panel):<sup>8</sup> "Imagine the time until the event below. Move the bar to indicate how long you consider the duration of time until the event." The name of one of the events was presented as the stimulus. After completing the task, participants were asked to say aloud the expected month and year of the presented events.

Participants were also asked to estimate the valence of the events in the range of a 1–5 Likert-type scale, 5 representing

<sup>6</sup> In a similar manner to the description in footnote 3, we applied nlme analysis on the data from Experiment II. The results are in agreement with the fits on the aggregated data, with the power model outperforming the others when using both BIC and AIC as criteria. In Experiment II, the log Bayes factors for the power model versus the linear model were 24.39 and 13.06 for Forward and Backward configurations, respectively.

<sup>7</sup> "Nesse estudo, você será solicitado a indicar seu sentimento subjetivo da duração do intervalo de tempo entre hoje e algum evento pessoal futuro. Os eventos vão ocorrer em até 36 meses. Leia as instruções cuidadosamente e indique suas respostas."

<sup>8</sup> "Imagine o tempo até o evento abaixo. Mova a barra para indicar quão longa você considera a duração do tempo até o evento."

positive valence. For approximate matching of the valence of the events, events that were rated as negative, i.e., scored 1 or 2, were excluded from the analysis (Peters and Buchel, 2010). This exclusion had minor influence on the results, as only 12 events, across eight participants, met this exclusion criterion. Judgment of the duration of projected slides was previously found to be affected in a crossover interaction manner by valence and arousal levels: duration of negative slides was judged as shorter than that of positive slides for low-arousal stimuli, while the inverse was found for high-arousal stimuli (Angrilli et al., 1997). This might raise the concern that the low rate of cited negative events within this group could lead to more positive mood than their counterparts in the other two groups, thus biasing the results. However, there is some evidence that, in the particular case of the retrospective timing of remembered past events, the emotional content of said events and the participants' mood at the moment of recall do not distort perceived duration (Grondin et al., 2014). Furthermore, any possible mood-changing effect of the interview was avoided by the period of 2-to-7 days between the listing of the events and the actual time-interval estimation task.

At the end of all trials, participants were asked to describe what they had to do in the task. All participants answered correctly, with two exceptions. One participant used the line scale to indicate the valence of the events, while the other indicated their level of importance. Both participants were excluded from further analysis. All analysis procedures were identical to those performed in Experiments I and II, after a conversion that is detailed below.

#### Results

**Figure 5** shows the results of Group III – Events for Experiment III, with personal events as stimuli, in which participants were asked to indicate their feeling of the duration of time until the event in a Forward scale configuration. The conversion from event to time interval was based on the month and year that the volunteers reported in the interview after the experiment (e.g., "Vacation to United States" – May, 2017). We calculated the number of months elapsed between the time of the task and the time of the event. Global level regression analysis using linear, power, logarithmic and logistic functions showed that, in all cases, over 68% of the variance was explained (**Supplementary Table 1**), the lowest found in this study. The power function had the highest posterior probability, with ω<sub>i</sub> = 0.99. As expected from the ω<sub>i</sub> value, the Bayes Factor was very high (BF = 140, "strong evidence"). Similar results were obtained when using AIC instead of BIC. The power model β parameter was as low as 0.58, indicating a more decelerated concave function. For comparison, **Figure 6** presents the best-fit power models for the data in the three experiments in the Forward configuration.<sup>9</sup>
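The event-to-interval conversion described above can be illustrated with a short sketch. The function name and the task date are hypothetical, and counting whole calendar months is an assumption: the text does not specify whether partial months were rounded or kept as fractions.

```python
from datetime import date


def months_until(task_date, event_year, event_month):
    """Whole calendar months from the task date to the reported event month and year."""
    return (event_year - task_date.year) * 12 + (event_month - task_date.month)


# Hypothetical task date; the event month and year follow the example in the text.
task_day = date(2016, 11, 15)
print(months_until(task_day, 2017, 5))
```

Because participants reported only a month and a year, the day of the month on which the task took place is ignored in this version.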

# RESULTS ACROSS EXPERIMENTS I, II, AND III

In all experimental conditions, global level analyses showed that power functions fit the data better, with the exception of the Number experiment in the Backward condition. Furthermore, it was observed that the best-fitting power function for the results of the Event experiment was more decelerated (β = 0.58). In this section we try to answer the following questions: (1) what is the degree of inter-individual variation? and (2) did the manipulation of the stimulus type influence the responses, specifically the curvature of the psychometric function? In addition to the global level analysis, individual and mixed model analyses across experimental groups were adopted, where the mixed model refers to methods that predict data at the group level while allowing for normal variation between individuals in parameter values.

#### Individual Level Analyses

The following evaluation was designed to extend the testing of the working hypothesis that the curvature of the psychometric function depends on the use of numbers and time units taking an individual level approach. Using nonlinear regression analysis, power models were fit to the data of each participant (**Table 1**). Considering the β parameter as an estimator of the curvature of the psychometric function, we applied a linear mixed-effects model on the estimates of the corresponding parameter. Two fixed-effects variables were considered: Group to encode the experiment (Number + time-unit, Number) and Direction to encode the scale direction (Forward or Backward). The Event condition was not included, because the Direction variable has only one level (Forward) in this condition. Additional random-effect components were included to allow for interindividual variation. The result of this model fit showed no statistically significant difference for Group (t = −1.36, df = 72, p = 0.17) or Direction (t = −0.045, df = 72, p = 0.96), thus the curvature of the psychometric function was not affected by including a time unit in the stimuli or by changing the scale direction.

In order to compare the psychometric functions in the three experiments, including the Events, we defined Group as a factor with three experimental conditions: Number + time-unit, Number, and Events, all of them in the Forward configuration, and used linear mixed-effects model analysis to test the effect of Group on the β parameter. There was a significant difference (t = −2.79, df = 55, p = 0.007), indicating that the curvature of the psychometric function in the Event condition was higher than in the other conditions.

Finally, we extended the analysis to the individual subject level and evaluated the four alternative functions using regression analysis at an individual level. According to the model preference based on BIC or Schwarz weights, the power model did not account best for the data of more than 50% of the individuals in any experimental configuration (**Table 2**). In fact, considering all individuals in all experimental conditions, the linear function performed similarly to the power function. An extreme case

<sup>9</sup> In a similar manner to the description in footnote 3, we applied nlme analysis also on the data from Experiment III. The results are partially in agreement with the fits on the aggregated data, with the power model outperforming the others when using the AIC criterion, but the linear when using the BIC criterion. In Experiment III, the Bayes factor for the power versus the linear model was 0.025, and therefore favored the linear model.

is the Event configuration: linear and logistic models better explained the data of 15 individuals, as compared to only four for the logarithmic and power models.

#### Mixed Model Analyses

We already evaluated the effect of the scale direction in Experiments I and II on the power model β parameter using a nonlinear mixed-effects model and reported that no significant effects of scale direction were found. The same analysis procedure was extended to verify the influence of stimulus type on the function parameters (**Supplementary Table 2**). There was no significant difference in the model parameters between the Number + time-unit and Number groups in the Forward (β: t = −1.09, df = 413, p = 0.27) or Backward (β: t = −0.61, df = 413, p = 0.53) conditions. However, the estimates of the α and β parameters of the model for the Event group were


TABLE 1 | Power function (c + αx<sup>β</sup>) regressions of the individual data.


Average values of the estimated model coefficients are given. R<sup>2</sup> represents the average coefficient of determination. Values in parentheses represent standard deviation across participants.

TABLE 2 | Number (and proportion in parentheses) of participants in each experimental condition whose data are better represented by a specific function based on BIC criteria.


∗ In the case of two participants in the Number + time-unit Backward configuration and one participant in the Number Backward configuration, the power function was initially selected based on the Schwarz weights criterion, in which the four functions were compared. However, the Wald test performed afterward did not reject the hypothesis that β = 1, and the linear function was chosen.

significantly different from those in the Number + time-unit (Forward; α: t = 1.99, df = 392, p = 0.04; β: t = −2.46, df = 392, p < 0.01) and Number (Forward; α: t = 2.57, df = 414, p < 0.01; β: t = −3.76, df = 414, p < 0.001) configuration. In summary, there was no evidence that the inclusion of the time unit in the stimulus label altered the parameters of the response model. However, the use of events, instead of Numbers or Numbers with a time unit, altered both the scaling (α) and curvature (β), with the function becoming more decelerated. Additionally, no evidence for the effect of the scale direction was found.

# GENERAL DISCUSSION

We evaluated the effect of stimulus presentation and scale directionality on the psychophysical mapping function. In three experiments, participants were presented with either time intervals in the format of "9 months", numerical quantities in the format of "9", or time intervals indicated by the name of a personal future event such as "wedding." In all experiments, participants used a horizontal line to indicate the estimated magnitude. The horizontal scale was labeled "Very short" and "Very long" from left to right, respectively; the first two experiments were also carried out with the labels in reversed horizontal order. The common feature of the methodological approaches utilized in this paper is that they all measure how calendar times map onto the subjective long-scale timeline (internal metric representation) using a cross-domain transfer between subjective temporal distances and subjective spatial distances. The implied premises are that (a) the metric distances in the internal representation can be translated into metric distances in responding (akin to "amodal" magnitude estimation) and that (b) humans can transfer quantitative judgments across different magnitude domains (e.g., Balci and Gallistel, 2006). Another way to study the mappings between objective calendar times and the corresponding subjective temporal metrics would be to utilize a two-alternative forced-choice paradigm in which, for instance, participants are asked to make "too short" and "too long" judgments for different calendar times (e.g., Roach et al., 2011). In this case, information regarding certain biases in subjective metrics can be derived from the shape of psychometric functions (e.g., asymmetry) fit to these binary judgments.

When evaluated from a global perspective, considering only aggregated data, the responses to abstract numerals with and without a time unit were similar: data from both conditions were well explained by power functions and presented a similar deceleration rate, with the exception being the Number group in the Backward configuration, which tended to favor the linear model instead. These results suggest that the psychometric function of abstract numerals follows a power model and that participants, when presented with time intervals that include numerals, are simply estimating the magnitude of the numerals. However, presenting time intervals with the indication of personal events, without the use of numerals, produced a different behavior. In this experimental condition, the power function clearly fit the data better in comparison with the other functions. Moreover, the power function fit in the Event condition was more decelerated (lower exponent) than the power function fit in the other conditions. Noteworthy is the lack of a scale direction effect in both the Number and Number + time-unit conditions. A significant effect in one of the conditions, but not the other, would have indicated different processing for numbers and for numbers coupled with time units. Although the results do not reject such a hypothesis, they do not support it.

These results partially corroborate previously published results. The psychometric functions of the number-only magnitude estimation were found to be linear or close to linear, a result that corroborates the psychophysical mapping of the magnitude of numerals in adults (Whalen et al., 1999; Dehaene et al., 2008). Power functions were also the best performing models when time intervals were presented with numerals, similar to those found in previous studies (e.g., Zauberman et al., 2009) in which an equivalent global level approach was used. However, it is noteworthy that, while a highly decelerated rate was registered previously with presentations of numerals coupled with a time unit (Kim and Zauberman, 2009), in the current study such a rate was found only when stimuli were based on personal event tags, without the use of any abstract number or time unit. These results imply that people estimate time intervals differently when presented with numbers and/or time units.

The effect of manipulating the stimulus type was more evident when individual differences were considered together with the general tendency of the participants in the experimental group. Nonlinear mixed-effects model analysis showed no difference in power function parameters between the experimental groups with number or number and time-unit, but also that both were different from the parameters of participants who were presented with event-tagged time intervals. At a third, purely individual level of analysis, the curvatures of the estimated psychometric power functions were compared and showed no difference between the experimental conditions with numbers and time-unit and numbers only, but a difference from the event-tagged condition. An evaluation of the best fitting model of each individual revealed that the power model is not the best overall representative of the data, being outperformed by other models in the Number + time-unit condition and particularly in the case of participants in the Event experiment. Results from the global and nonlinear mixed model analyses agree as to the power function fitting better in most experimental conditions, the minor influence of the time-unit, and the influence of using event-tags in place of numbers and time units. These results are compatible with previous studies of estimation of number magnitude and time-intervals, although in the latter case the psychometric function had a higher deceleration rate. However, individual-level analysis revealed a different outcome. Power functions were outperformed by other models in four out of the five experimental conditions. Particularly in the event-tagged setup, the data of several participants were poorly fitted by power functions, and linear and logistic models were better suited. These results confirm the relevance of individual-level analysis in the estimation of time-interval magnitude, although the comparison of global and individual level results should always be taken with caution given the differences in statistical power. In a recent study from our laboratory, using number and time-unit presentation, we showed that linear models fit 98% of the variance of the aggregated data, as compared to approximately 90% in the current study (Agostino et al., 2017). However, the discrepancy can be explained by a normalization procedure in the former study that substantially reduced the variance. More importantly, findings from both studies emphasize the relevance of individual differences, with data from participants in the previous study split almost equally between linear and power models.

Time-interval presentation using references to events is not novel (Peters and Buchel, 2010), but to the best of our knowledge it has not previously been used for magnitude estimation, so the results cannot be compared with those of previous studies. The responses in the Event condition were distinct from those in the other conditions, showing greater compression toward longer time intervals. This result, in addition to the differences observed earlier in the distribution of best-fitted models at the individual level, suggests that, when no abstract numerals are presented, people apply different mechanisms or use a different strategy for time-interval estimation.
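
As a reading aid, the notion of compression can be expressed with a generic power-law psychophysical function. The symbols below ($\psi$ for subjective magnitude, $\phi$ for objective interval length, $k$ for scale, $\beta$ for the exponent) are illustrative assumptions; the exact parametrization fitted in the study is not given in this section.

```latex
\psi(\phi) = k\,\phi^{\beta},
\qquad
\frac{d^{2}\psi}{d\phi^{2}} = k\,\beta\,(\beta - 1)\,\phi^{\beta - 2} < 0
\quad \text{for } k > 0,\ \phi > 0,\ 0 < \beta < 1 .
```

Under this form, $\beta = 1$ corresponds to a linear mapping, and smaller values of $\beta$ correspond to stronger compression (a higher deceleration rate) at long intervals.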

One of the main findings of the current study is that people map the magnitude of stimuli presented as an abstract number and as Number+time-unit in a similar way, whereas the magnitude of time intervals presented without numerals is treated differently. This result is better explained by the metaphor theory, given that it places a high weight on the influence of language and culture, and it can be argued that the relatively linear mapping of abstract numbers, a cultural concept, dominates cognitive processing. In their absence, as with the event-tagged time intervals, mapping could be free of such influences. The metaphor theory would also predict a scale directionality effect, given the assumed influence of spatial metaphors on the representation of time and numbers. However, the lack of a significant directionality effect in the current study does not support this view. The hypothesis that people focus on numbers rather than on the accompanying units is corroborated by the finding that the cooperation level of participants in the prisoner's dilemma game increased when the reward was increased from 3¢ to 300¢, but not when it was increased from 3¢ to \$3 (Furlong and Opfer, 2009). However, presentation format does not always alter time preferences (Lukinova et al., 2019). In an intertemporal decision-making study, time intervals were indicated textually (e.g., "64 days") or by the amplitude of a pure tone. Discount factors in the verbal and non-verbal conditions were positively correlated (r = 0.79).

The ATOM theory does not derive predictions on the effect of scale direction inversion. It would predict similar mapping functions for all three presentation conditions. Under this theory, the representation and processing of time and numbers are assumed to rely on the same magnitude representations and thus would carry the same metric structure (Walsh, 2003). This prediction was not confirmed, considering the differences in representation between the event-tagged condition and the two other conditions. However, we did not control the stimulus presentation for saliency, so it can be argued that saliency differences are responsible for the different effects. In defense of the ATOM theory, it can also be argued that the use of the cross-modality line-length matching paradigm confounded the effects of processing in the numerosity and time dimensions with processing in the spatial dimension. This is possible in principle, but unlikely. Processing in any dimension is composed of at least two mapping processes, stimulus-to-representation and representation-to-action. We chose to manipulate the stimulus dimension and kept the type of action constant. Any significant result could therefore relate to either of the two processes, and we cannot discriminate the locus of the effect. A similar approach was used by Dehaene et al. (2008), where the line-length matching paradigm was used to compare the magnitude estimation of dots (quantity), tones (sound) and numerals (abstract). Corroborating the results of the present study, they found that Western adult participants mapped symbolic numerals linearly, but logarithmically when quantities were presented in a nonsymbolic fashion.

It is possible that, unlike the numerical conditions, the "personal event" condition engaged the prospective memory system, namely the memory representations formed for a future event. One type of prospective memory is time-based prospective memory, which involves remembering to carry out a certain action after a certain time and thus involves time estimation (Graf and Grondin, 2006). To this end, Block and Zakay (2006) adapted their attentional model of interval timing to accommodate the temporal components of time-based prospective memory, but applying an internal-clock mechanism to very long intervals is not realistic given the constraints of cognitive architecture and attentional resources (for a review see Waldum and Sahakyan, 2012). Furthermore, episodic memory is known to be used in making time estimations regarding future events (Roy et al., 2005). In brief, these possibly complex interactions between time estimates and memory systems might underlie the differential effects observed in the "personal event" condition.

In summary, several conclusions can be drawn from the results of the current study. Amodal magnitude estimation of time intervals presented in the form of Number + time-unit, e.g., "15 months," follows a slightly decelerating function, similar to the psychophysical function observed for the estimation of abstract numerals. Additionally, distinct psychophysical functions were found when people estimated the magnitude of time intervals presented without abstract numerals. These results suggest that people do not necessarily invoke temporal-domain mechanisms when presented with time intervals in the form of Number + time-unit, and they can be considered evidence in support of the metaphor theory. The event-tagged time-interval procedure can be argued to be more appropriate for measuring the perception of long time intervals, given that it makes it more difficult for people to use a domain other than time, and it suggests that time perception has a relatively high deceleration rate. Moreover, we have demonstrated that individual differences are of major importance in all experimental setups, and the event-tagged paradigm is no exception. The source of the individual differences is not clear; it could reside in factors such as previous experience, numerical capacity, and type of personal events, and is a focus of ongoing research. In any case, psychophysical functions of time perception derived from aggregated data analysis, for example in models of intertemporal decision making, should not be assumed to represent the perception of every individual indiscriminately. Thus, the implications of the current work are both scientific and methodological: (a) through manipulation of the presentation format of long-range time intervals, we have shown that personal-event-tagged intervals are processed differently from those presented via numerals, and (b) we have shown that there is a substantial degree of individual variation in the function through which subjective magnitudes are mapped onto objective long-range time intervals.

# ETHICS STATEMENT

The Research Ethics Committee at the Federal University of ABC approved all experimental protocols.

# AUTHOR CONTRIBUTIONS

CA contributed ideas, developed code for the experiments, recruited and measured the volunteers, ran the experiments, carried out statistical analyses, and contributed to writing and revising the manuscript. YZ contributed ideas, developed code for the experiments, carried out statistical analyses, and contributed to writing and revising the manuscript. FB contributed ideas and contributed to writing and revising the manuscript. PC contributed ideas, conducted statistical analyses, and contributed to writing and revising the manuscript.

# FUNDING

FB has been granted a full waiver by Frontiers.

# ACKNOWLEDGMENTS

CA had a research scholarship from Capes and International Graduate School ABINEP.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.01479/full#supplementary-material






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Agostino, Zana, Balci and Claessens. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.