# TIME PERCEPTION AND DYSFUNCTION: CLINICAL AND PRACTICAL IMPLICATIONS

EDITED BY : Deana Davalos, Giovanna Mioni, Simon Grondin and Felipe Ortuño PUBLISHED IN : Frontiers in Human Neuroscience

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use. ISSN 1664-8714 ISBN 978-2-88945-704-5 DOI 10.3389/978-2-88945-704-5

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# TIME PERCEPTION AND DYSFUNCTION: CLINICAL AND PRACTICAL IMPLICATIONS

Topic Editors:

Deana Davalos, Colorado State University, United States Giovanna Mioni, Università degli Studi di Padova, Italy Simon Grondin, Université Laval Québec, Canada Felipe Ortuño, Universidad de Navarra, Spain

"Mi tiempo, tu tiempo, su tiempo" by Carlos Iraburu.

Processing time requires a complex set of abilities that dictates how one adapts to the environment. Timing is key in how we communicate, organize our space and understand the world. When timing is intact, it often exists below one's conscious awareness, but when one's ability to process time is disrupted, the effects are noticeable and widespread. To better understand the construct of temporal dysfunction, one has to examine the concept of timing from multiple angles. An integrative approach is required to understand the possible biological, cognitive and psychological etiologies of temporal dysfunction. In addition, expertise in language, measurement and psychopathology are necessary to comprehend how timing affects one's representation of the world. This Research Topic is dedicated to examining timing and temporal dysfunction across a variety of tasks and disorders. Specifically, we seek to understand the most basic types of timing dysfunction and how they may affect a wide range of behaviors and symptoms as well as higher levels of temporal deficits involved in how one perceives the time spent. This Research Topic represents the intersection of medicine, psychology, cognition and perception in a unified attempt to shed light on temporal processing. The collection of original research articles and case studies, highlighting novel methodologies and interventions, illustrates the complexity of timing dysfunction and how understanding these deficits helps us to get a fresh look at a wide range of clinical psychopathologies and to develop better interventions.

Citation: Davalos, D., Mioni, G., Grondin, S., Ortuño, F., eds. (2019). Time Perception and Dysfunction: Clinical and Practical Implications. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-704-5

# Table of Contents

*Editorial: Time Perception and Dysfunction: Clinical and Practical Implications*

Deana Davalos, Giovanna Mioni, Simon Grondin and Felipe Ortuño

*Segregation of Brain Structural Networks Supports Spatio-Temporal Predictive Processing*

Valentina Ciullo, Daniela Vecchio, Tommaso Gili, Gianfranco Spalletta and Federica Piras

Caroline Gagnon, Catherine Bégin, Vincent Laflamme and Simon Grondin

*Time Perception and the Experience of Time When Immersed in an Altered Sensory Environment*

Joseph Glicksohn, Aviva Berkovich-Ohana, Federica Mauro and Tal D. Ben-Soussan

*Multiple Looks of Auditory Empty Durations Both Improve and Impair Temporal Sensitivity*

Tsuyoshi Kuroda, Daiki Yoshioka, Tomoya Ueda and Makoto Miyazaki

*Overlapping but Divergent Neural Correlates Underpinning Audiovisual Synchrony and Temporal Order Judgments*

Scott A. Love, Karin Petrini, Cyril R. Pernet, Marianne Latinus and Frank E. Pollick

*Minimal Self and Timing Disorders in Schizophrenia: A Case Report*

Brice Martin, Nicolas Franck, Michel Cermolacce, Jennifer T. Coull and Anne Giersch

*Dissociating Explicit and Implicit Timing in Parkinson's Disease Patients: Evidence From Bisection and Foreperiod Tasks*

Giovanna Mioni, Mariagrazia Capizzi, Antonino Vallesi, Ángel Correa, Raffaella Di Giacopo and Franca Stablum

*Temporal Resolution Needed for Auditory Communication: Measurement With Mosaic Speech*

Yoshitaka Nakajima, Mizuki Matsuda, Kazuo Ueda and Gerard B. Remijn

*Training in Temporal Information Processing Ameliorates Phonetic Identification*

Aneta Szymaszek, Anna Dacewicz, Paulina Urban and Elzbieta Szelag

*Disturbed Experience of Time in Depression—Evidence From Content Analysis*

David H. V. Vogel, Katharina Krämer, Theresa Schoofs, Christian Kupke and Kai Vogeley


Adrienne Wise and Michael Barnett-Cowan

# Editorial: Time Perception and Dysfunction: Clinical and Practical Implications

Deana Davalos <sup>1</sup> \*, Giovanna Mioni <sup>2</sup> , Simon Grondin<sup>3</sup> and Felipe Ortuño<sup>4</sup>

<sup>1</sup> Department of Psychology, Colorado State University, Fort Collins, CO, United States, <sup>2</sup> Department of General Psychology, Università degli Studi di Padova, Padova, Italy, <sup>3</sup> École de Psychologie, Université Laval, Quebec, QC, Canada, <sup>4</sup> Department of Psychiatry and Medical Psychology, Universidad de Navarra, Pamplona, Spain

Keywords: timing, time processing, temporal processing, timing dysfunction, temporal dysfunction

#### **Editorial on the Research Topic**

#### **Time Perception and Dysfunction: Clinical and Practical Implications**

The relationship between one's ability to process time and psychopathology has been documented for nearly 100 years. Schilder (1936) made detailed observations of temporal dysfunction across various types of psychopathology. More recent research has reached a point where different types of time processing can be parsed apart and eventually regarded as "a precise marker" of certain diagnoses (e.g., schizophrenia) (Ciullo et al., 2018). As research in temporal processing evolves, there is greater interest in understanding not only the different ways to measure time processing, but also how certain disorders may be affected by temporal dysfunction. The 13 articles that examine timing in this Research Topic allow us to better understand the temporal dysfunction encountered across a variety of disorders.

In most domains of daily life, language plays a critical role in one's ability to communicate effectively and to convey thoughts and emotions. Language is a domain that has an intimate connection with time processing. Nakajima et al. tackle the basics of timing in language by exploring the temporal resolution needed for successful speech communication. The authors find that linguistic information is conveyed almost perfectly by auditory blocks of around 40 ms. They note that the process resembles that of the visual system stringing together static movie frames of ∼40 ms into vivid motion, supporting the argument that while the auditory and the visual modalities differ in their peripheries, their cortical organizations may employ similar temporal grids. Dacewicz et al. extend the exploration of language and timing by assessing children with specific language impairment (SLI), who are characterized by deficiencies in temporal information processing (TIP). Their results suggest that temporal training (exercises and games involving sequencing abilities and duration judgment), compared to non-temporal training, may play a unique role in improving both language and cognition in children with SLI. Szymaszek et al. extend these findings, noting that TIP appears to be omnipresent in language communication and is reflected not only in phonemic hearing but also in phonetic identification. The authors explore the etiology of developmental language disorders, specifically the neural basis of SLI. Again, only interventions focused on temporal training, as opposed to non-temporal interventions, resulted in performance similar to that of normal control children.

#### Edited by:

Ryouhei Ishii, Osaka University, Japan

#### Reviewed by:

Hisato Sugata, Oita University, Japan

\*Correspondence: Deana Davalos deana.davalos@colostate.edu

Received: 16 September 2018 Accepted: 03 October 2018 Published: 23 October 2018

#### Citation:

Davalos D, Mioni G, Grondin S and Ortuño F (2018) Editorial: Time Perception and Dysfunction: Clinical and Practical Implications. Front. Hum. Neurosci. 12:435. doi: 10.3389/fnhum.2018.00435

While language appears to be a characteristic that is associated with time processing across development and across disorders, it is schizophrenia that has received possibly the greatest attention over the years as "the" disorder most characterized by timing dysfunction. About schizophrenia, Schilder (1936) quoted a patient who reported, "There is no hour, no noon, no night. . . . Time does not move. I am wavering between past and future." Martin et al. report similar dysfunction in a case report involving a young man with schizophrenia, "AF." AF describes, "I do not feel the time," "The word 'time' has no meaning for me. . . " The authors explore the possibility that distortions of the temporal structure of consciousness contribute to the emergence of "self-disorders," including schizophrenia. The ability to process time is described as an implicit aspect of our relationship with the world, and disturbances in the temporal structure of consciousness appear to be a key issue in the psychopathology of schizophrenia. Wilquin et al. also tested patients with schizophrenia to distinguish between interval timing and synchronization difficulties and, more generally, disturbances in the spatial-temporal organization of voluntary actions. The authors used a new sensorimotor synchronization task to test these abilities. The pattern of results suggested that patients with schizophrenia were able to perceive and produce both simple and complex sequences of time intervals but were impaired in the ability to synchronize their actions with external events. The authors suggested that a specific deficit in predictive timing may be at the core of early symptoms previously described in schizophrenia.

Glicksohn et al. continue the examination of one's subjective experience of time and how it relates to psychopathology by manipulating sensory environments. Participants experienced a whole-body altered sensory environment, including periods of white-colored vs. colored illumination of the chamber. Subjective experience of elapsed time was assessed, as were traditional measures of time production. The authors found that for those participants who reported a marked change in time experience, such as "the sensation of time disappeared," their time production data could not be linearized using a log–log plot, suggesting a "break" in the psychophysical function. The authors report descriptions of time experience induced by the changes in sensory environment that are strikingly similar to descriptions of time made in previous studies focused on depression and timing. Vogel et al. used qualitative research approaches and content analysis to specifically investigate disturbances of time experience in major depressive disorder (MDD). Their findings suggest that a better understanding of the disturbance of lived time in MDD may help specify subgroups of MDD, degrees of symptom severity or distinct depressive syndromes.

Similarly, Wise and Barnett-Cowan study timing in those with a history of concussion to see whether timing can provide a cue to recovery. The authors find that people with a concussion history are much less precise in an audiovisual temporal order judgment (TOJ) task than control participants. The authors also note that precision in the TOJ task increases as time elapses since the concussion diagnosis, and suggest using this task, following concussion, to determine whether a return to work, or to play in the case of sport, is appropriate. Gagnon et al. propose using timing tasks such as bisection to study the emotions elicited by food pictures in people with an eating disorder, anorexia nervosa (AN) or bulimia nervosa (BN), instead of relying on traditional self-report procedures or on invasive psychophysiological techniques. They showed that the duration of food pictures, in comparison to neutral ones, is overestimated by women with AN and that, compared to participants with BN, participants with AN perceived the duration of joyful food pictures as longer and tended to overestimate the duration of the disgusting ones.
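The log–log linearization mentioned above for time production data can be illustrated with a short sketch. It assumes, for illustration only, that produced duration P follows a power function of target duration T, P = k·T^b, so that log P is linear in log T; the function name `fit_power_law` and the example durations are hypothetical and are not taken from Glicksohn et al.

```python
import math

def fit_power_law(targets, productions):
    """Least-squares line through (log T, log P).

    Returns (k, b, r_squared) for the assumed model P = k * T**b.
    A low r_squared would indicate that the data are not well
    linearized by the log-log transform.
    """
    xs = [math.log(t) for t in targets]
    ys = [math.log(p) for p in productions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 1.0
    return math.exp(intercept), slope, r_squared

# Hypothetical target durations (s) and noise-free productions
# generated from P = 1.2 * T**0.9, purely to exercise the fit.
targets_s = [4.0, 8.0, 16.0, 32.0]
productions_s = [1.2 * t ** 0.9 for t in targets_s]
k, b, r2 = fit_power_law(targets_s, productions_s)
print(round(k, 3), round(b, 3), round(r2, 3))
```

With real data, a participant whose points deviate systematically from the fitted line (low `r_squared`) would correspond to the kind of "break" in the psychophysical function described in the text.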

Across all of these studies of temporal processing in clinical populations, there is also a general basic question regarding how one measures timing. Are measures of time processing thought to be interchangeable, does each timing task relate to a different difficulty in everyday living, and most importantly, how should we be measuring timing? Love et al. attempt to disentangle temporal processing by examining the neural basis of synchrony and temporal order judgments (TOJ). As the authors point out, synchrony and TOJ tasks are commonly used to investigate synchrony perception between sensory cues and to measure multisensory perception in general, despite behavioral evidence suggesting that these tasks do not measure identical perceptual processes. The clear differences in BOLD responses observed with fMRI between the two tasks suggest the presence of overlapping, but divergent, neural mechanisms. The authors point out that findings from studies requiring multisensory integration mechanisms may be task dependent, which may have implications for the study of atypical temporal processing in clinical populations and for how we interpret temporal dysfunction. Kuroda et al. further examine how temporal information is presented to better understand how one perceives time-based changes in stimuli. Specifically, the authors examined predictive timing and how one perceives the regular rhythm of information to detect when a sound stimulus is displaced from the predicted sequence.

An additional area of interest is the distinction between explicit and implicit timing. The crucial difference is whether or not the task instructions require subjects to provide an overt estimate of duration. Explicit timing is engaged whenever subjects make a deliberate estimate of a discrete duration in order to compare it with a previously memorized standard. Conversely, implicit timing is engaged, even without a specific instruction to time, whenever sensorimotor information is temporally structured and can be used to predict the duration of future events. Mioni et al. showed preserved temporal ability in patients with Parkinson's disease (PD) in the implicit timing task (tested with a foreperiod paradigm), whereas PD patients failed in the explicit timing task (time bisection task), displaying shorter perceived durations and higher variability compared to controls. Implicit timing was further investigated by Ciullo et al., who tested the ability to generate probabilistic expectancies regarding when and where sensory stimuli will occur. Using graph theoretical analysis applied to structural connectivity data, they tested the extent of brain connectivity properties associated with spatio-temporal predictive performance. When the interaction between explicit and implicit temporal orienting processes was considered at the long interval, the authors found that explicit processes were related to centrality measures of the bilateral inferior parietal lobule, confirming a dissociation between explicit and implicit timing.

Overall, findings across these studies add to the literature in research assessing the role of timing in clinical disorders, how best to study timing, and how timing abilities may be associated with everyday tasks.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## REFERENCES

Ciullo, V., Piras, F., Vecchio, D., Banaj, N., Coull, J. T., and Spalletta, G. (2018). Predictive timing disturbance is a precise marker of schizophrenia. Schizophr. Res. Cogn. 12, 42–49. doi: 10.1016/j.scog.2018.04.001

Schilder, P. (1936). Psychopathology of time. J. Nerv. Ment. Dis. 83, 530–546.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

### ACKNOWLEDGMENTS

We thank all of the authors and reviewers who made substantial contributions to this research topic.

Copyright © 2018 Davalos, Mioni, Grondin and Ortuño. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Segregation of Brain Structural Networks Supports Spatio-Temporal Predictive Processing

Valentina Ciullo<sup>1,2</sup>, Daniela Vecchio<sup>1,3</sup>, Tommaso Gili<sup>1,4</sup>, Gianfranco Spalletta<sup>1,5</sup> \* and Federica Piras<sup>1</sup>

<sup>1</sup> Neuropsychiatry Laboratory, Department of Clinical and Behavioral Neurology, IRCCS Santa Lucia Foundation, Rome, Italy,

<sup>2</sup> Neurosciences, Psychology, Drug Research and Child Health (NEUROFARBA), University of Florence, Florence, Italy,

<sup>3</sup> Department of Psychology, Sapienza University of Rome, Rome, Italy, <sup>4</sup> IMT School for Advanced Studies, Lucca, Italy,

<sup>5</sup> Menninger Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, United States

#### Edited by:

Giovanna Mioni, Università degli Studi di Padova, Italy

#### Reviewed by:

Anne Giersch, Institut National de la Santé et de la Recherche Médicale (INSERM), France

Mariagrazia Capizzi, Università degli Studi di Padova, Italy

\*Correspondence: Gianfranco Spalletta g.spalletta@hsantalucia.it

Received: 30 November 2017 Accepted: 08 May 2018 Published: 24 May 2018

#### Citation:

Ciullo V, Vecchio D, Gili T, Spalletta G and Piras F (2018) Segregation of Brain Structural Networks Supports Spatio-Temporal Predictive Processing. Front. Hum. Neurosci. 12:212. doi: 10.3389/fnhum.2018.00212

The ability to generate probabilistic expectancies regarding when and where sensory stimuli will occur is critical to derive timely and accurate inferences about updating contexts. However, the existence of specialized neural networks for inferring predictive relationships between events is still debated. Using graph theoretical analysis applied to structural connectivity data, we tested the extent of brain connectivity properties associated with spatio-temporal predictive performance across 29 healthy subjects. Participants detected visual targets appearing at one of three locations after one of three intervals; expectations about stimulus location (spatial condition) or onset (temporal condition) were induced by valid or invalid symbolic cues. Connectivity matrices and centrality/segregation measures, expressing the relative importance of, and the local interactions among, specific cerebral areas with respect to the behavior under investigation, were calculated from whole-brain tractography and cortico-subcortical parcellation.

Results: Response preparedness to cued stimuli relied on different structural connectivity networks for the temporal and spatial domains. Significant covariance was observed between centrality measures of regions within a subcortical-fronto-parieto-occipital network (comprising the left putamen, the right caudate nucleus, the left frontal operculum, the right inferior parietal cortex, the right paracentral lobule and the right superior occipital cortex) and the ability to respond after a short cue-target delay, suggesting that the local connectedness of such nodes plays a central role when the source of temporal expectation is explicit. When the potential for functional segregation was tested, we found highly clustered structural connectivity across the right superior, the left middle inferior frontal gyrus and the left caudate nucleus as related to explicit temporal orienting. Conversely, when the interaction between explicit and implicit temporal orienting processes was considered at the long interval, we found that explicit processes were related to centrality measures of the bilateral inferior parietal lobule. Degree centrality of the same region in the left hemisphere covaried with behavioral measures indexing the process of attentional re-orienting. These results represent a crucial step beyond the ordinary description of predictive processing, as we identified the patterns of connectivity characterizing the brain organization associated with the ability to generate and update temporal expectancies in case of contextual violations.

Keywords: predictive timing, spatio-temporal predictive performance, structural connectivity, diffusion tensor imaging, complex network theory, insula, schizophrenia

## INTRODUCTION


The brain is capable of allocating attention dynamically on the basis of expectations concerning event occurrence, and this ability is critical for goal-directed behavior in a changeable environment. Indeed, real-world information is often dynamic and imprecise, and the ability to derive timely and accurate inferences about changing contexts is crucial, given that prospective action control bears the potential for a successful interaction with the environment.

Behavioral findings (Coull and Nobre, 1998; Coull et al., 2000, 2013, 2016; Davranche et al., 2011; Carvalho et al., 2016) have shown faster reaction times (RTs) to target stimuli preceded by informative pre-cues signaling event location and onset, indicating that attention can be deliberately and strategically oriented in space and time when overt (explicit) information is provided. Such behavioral optimization has been described as an endogenous process based on learned cue-target association that enhances action preparation (Coull and Nobre, 2008). Temporal expectations can also be derived more implicitly from the intrinsic predictive power of the flow of time, which is vector-like in nature, implying that the probability of target occurrence increases with passing time (i.e., hazard function, Luce, 1986). Importantly, when prior expectations are violated, sensory (exogenous) information is integrated into the existing prediction model, and attention is re-allocated to the novel context (Courville et al., 2006). For example, invalid trials with delayed target onset are detected faster than premature ones (Coull and Nobre, 1998), due to the increasing conditional probability over time that the event will occur given that it has not already occurred. Thus, both the explicit and the implicit nature of the source of temporal predictions modulate attentional orienting in order to optimize our interaction with unfolding sensory stimulation (Correa et al., 2005; Nobre et al., 2007; Correa, 2012).
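The hazard function described above can be made concrete with a small worked example. The sketch below assumes, for illustration, that the target is equally likely to appear at each of the three foreperiods used in the task described in the Methods (540, 1080 and 1620 ms); the function name `hazard` is hypothetical.

```python
from fractions import Fraction

def hazard(probabilities):
    """Conditional probability that the target appears at each
    successive time point, given that it has not appeared earlier."""
    remaining = Fraction(1)  # probability mass not yet "used up"
    rates = []
    for p in probabilities:
        rates.append(p / remaining)
        remaining -= p
    return rates

foreperiods_ms = [540, 1080, 1620]
# Assumption: the three foreperiods are equiprobable a priori.
prior = [Fraction(1, 3)] * 3

for fp, h in zip(foreperiods_ms, hazard(prior)):
    print(fp, h)
# prints:
# 540 1/3
# 1080 1/2
# 1620 1
```

Under this assumption the conditional probability of target onset rises from 1/3 to 1/2 to 1 as time passes, which is consistent with the faster detection of delayed than of premature targets on invalid trials.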

Accordingly, computational approaches (Friston, 2005; Clark, 2013) have conceptualized the brain as an inference machine that integrates acquired experience about the world (prior beliefs) and incoming sensory information to generate probabilistic predictive models (posterior probabilities). Prior beliefs are updated in the light of new sensory input, allowing the system to arrive at posterior estimates. In other words, the posterior estimate depends on what is already known and on how much is learnt through the evidence, summarized in the probability of event occurrence (Hohwy, 2017). Any imbalance between expected and actual sensory data generates prediction error signals, computed as the difference between the prediction and the new evidence (Mathys et al., 2014), that adjust subjective expectations accordingly (Phillips et al., 2015; Hohwy, 2017). This theoretical framework is of particular clinical relevance, given that positive symptoms of schizophrenia (SZ) have been hypothesized to result from failures of predictive coding (Frith, 2005; Adams et al., 2013; Giersch et al., 2016; Sterzer et al., 2016; Martin et al., 2017), such that predictions are determined more by the new evidence and less by prior belief, while irrelevant stimuli are considered salient (e.g., delusion of reference) (Fletcher and Frith, 2009; Adams et al., 2013).
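As a minimal sketch of the updating scheme described above, the example below treats the prior belief and the sensory evidence as Gaussian and weights the prediction error by their relative reliability. The Gaussian form, the function name `update_belief` and the numerical values are assumptions made for illustration; they are not the specific models used in the cited work.

```python
def update_belief(prior_mean, prior_var, observation, obs_var):
    """One Gaussian belief update driven by a prediction error.

    Returns (posterior_mean, posterior_var, prediction_error).
    """
    # Prediction error: difference between new evidence and prediction.
    prediction_error = observation - prior_mean
    # Weight on the evidence: large when the prior is uncertain
    # relative to the observation.
    gain = prior_var / (prior_var + obs_var)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_var = (1.0 - gain) * prior_var
    return posterior_mean, posterior_var, prediction_error

# Hypothetical example: target expected at 540 ms, observed at 1080 ms.
balanced = update_belief(540.0, 100.0 ** 2, 1080.0, 100.0 ** 2)
weak_prior = update_belief(540.0, 300.0 ** 2, 1080.0, 100.0 ** 2)
print(balanced[0], weak_prior[0])
```

When prior and evidence are equally reliable, the posterior mean falls midway between them; when the prior is weak (large variance), the posterior is dominated by the new evidence, which mirrors the imbalance hypothesized for positive symptoms of SZ.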

For this reason, defining the neural basis of predictive coding is a major challenge in psychiatry and neuroscience. Conversely, functional Magnetic Resonance Imaging (fMRI) investigations of predictive timing in healthy subjects (Coull and Nobre, 1998; Coull et al., 2000, 2013, 2016; Davranche et al., 2011; Carvalho et al., 2016) have implicated complex and distributed networks (including the cerebellum, frontal cortex and inferior parietal cortex) in forming temporal expectations, but the relevance of these structures in optimizing prospective motor behavior as a function of informative cues is still debated, owing to heterogeneity in experimental methods and the intrinsic complexity of the studied process (see Correa et al., 2006).

Pharmacological and fMRI evidence suggest that explicit and implicit processes of temporal expectations are functionally dissociable. Indeed, ketamine selectively impaired the ability to use internal estimates of time to make predictions (Coull et al., 2011), while partially distinct neuronal networks are preferentially activated both by the fixed temporal predictability of temporal cues and by the dynamic updating of temporal probabilities in the neutral cue condition (Coull et al., 2016).

However, as far as we know, the neural substrates of the attentional re-orienting in case of invalid trials and of the explicit forms of temporal expectations have been directly compared in one investigation only (Coull et al., 2000).

In recent years, advances in network neuroscience and graph theory have proved to be effective tools for mapping neural connectomics (i.e., the pattern of neural elements and structural interactions of the neural system, Sporns et al., 2005), providing deeper insight into the organization of brain structural connectivity and its relation to behavior and cognition (Bullmore and Sporns, 2009; Rubinov and Sporns, 2010). Indeed, such a network-based approach to MRI data analysis makes it possible to quantify key structural properties and dynamics, which shape the anatomical substrate for functionally specialized processing and information binding (Sporns et al., 2000, 2005).

Here we aimed at investigating the relationships between behavioral indices of explicit spatial and explicit-implicit temporal orienting effects and the brain structural network in a spatio-temporal predictive task in healthy subjects. To our knowledge, this is the first study to examine brain wiring in the context of explicit (i.e., based on learned cue-target association established through valid cueing) and implicit (i.e., derived from the sensory evidence in case of invalid cueing) predictive timing using centrality and segregation metrics derived from complex network analysis of diffusion tensor imaging (DTI) data. Specifically, we aimed at identifying the connectivity patterns underlying predictive behavior in terms of local connectedness (i.e., individual elements within the whole structural network having the highest number of connections) (Bullmore and Sporns, 2009) and local segregation (i.e., the number of connections that exist between the nearest neighbors of an element within the whole structural network) (Sporns et al., 2000), so as to characterize the relative importance of, and the local interactions among, specific cerebral areas with respect to the behavior under investigation.
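The two families of metrics named above can be illustrated on a toy graph. The sketch below assumes a binary, undirected network and uses degree as the measure of local connectedness and the local clustering coefficient as the measure of local segregation; the node labels and edges are hypothetical, and the study's actual connectivity matrices (which may be weighted) are not reproduced here.

```python
def build_adjacency(edges):
    """Undirected, binary adjacency map: node -> set of neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def degree(adjacency, node):
    """Local connectedness: number of connections of a node."""
    return len(adjacency[node])

def clustering(adjacency, node):
    """Local segregation: fraction of possible links among the
    node's neighbours that actually exist."""
    neighbours = sorted(adjacency[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i, a in enumerate(neighbours)
        for b in neighbours[i + 1:]
        if b in adjacency[a]
    )
    return 2.0 * links / (k * (k - 1))

# Hypothetical four-region network.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
toy = build_adjacency(toy_edges)
for region in sorted(toy):
    print(region, degree(toy, region), round(clustering(toy, region), 3))
```

In this toy network, region A has the highest degree (3) but a low clustering coefficient (1/3), whereas B and C have fewer connections but fully interconnected neighbors, showing that centrality and segregation capture different aspects of network organization.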

## MATERIALS AND METHODS

### Participants

Twenty-nine healthy volunteers (age: 40.1 ± 12.3 years; 13 female, 16 male) were enrolled in the present study. Participants were recruited within the local community through advertising and consecutively assessed at the IRCCS Santa Lucia Foundation in Rome. All participants were screened for current or past diagnosis of any DSM-5 Axis I or II disorder using the SCID-5 Research Version (First et al., 2014) and the SCID-5-PD (First et al., 2016a). Inclusion criteria were: (1) age between 18 and 65 years, (2) at least 8 years of education and (3) suitability for MRI scanning. Exclusion criteria were: (1) the presence of any psychiatric disorder according to DSM-5 criteria, (2) history of psychoactive substance dependence or abuse as investigated by the structured interview SCID-5-CV (First et al., 2016b), (3) history of neurologic illness or traumatic brain injury, (4) major medical illnesses, such as non-stabilized diabetes, obstructive pulmonary disease or asthma, hematological/oncological disorders, B12 or folate deficiency as indicated by blood concentrations below the lower normal limit, pernicious anemia, clinically significant and unstable active gastrointestinal, renal, hepatic, endocrine or cardiovascular system disease, or newly treated hypothyroidism, (5) presence of any brain abnormality or microvascular lesion apparent on conventional FLAIR scans (Iorio et al., 2013), (6) global cognitive dysfunction according to a Mini-Mental State Examination (MMSE) (Folstein et al., 1975) score lower than 24, consistent with normative data for the Italian population (Measso et al., 1993), (7) diagnosis of major neurocognitive disorder according to DSM-5 criteria and (8) not being a native Italian speaker.

All participants were right-handed, with normal or corrected-to-normal vision. They gave written informed consent to participate after the procedures had been fully explained. The study was approved by, and carried out in accordance with the guidelines of, the IRCCS Santa Lucia Foundation Ethics Committee.

### Temporal and Spatial Orienting of Attention Task

Subjects were seated comfortably in a quiet room facing a Toshiba computer screen (1600 × 1200, resolution 1280 × 768 pixels, frame rate 60 Hz).

Predictive timing was investigated with an adaptation of the temporal and spatial orienting of attention task (Coull and Nobre, 1998), which measured RT to a briefly presented target (100 ms) appearing after one of three foreperiod durations (FP: 540 ms/1080 ms/1620 ms) in one of three boxes (left/up/right) depicted on the computer screen (7° eccentricity). Subjects were asked to press a response button as quickly as possible upon detecting the target, using the information provided by one of three types of cue. The cue was presented prior to the target and was either informative (spatial and temporal conditions) or uninformative (neutral condition). During the neutral condition, no spatial or temporal information was provided, as the whole image brightened. Spatial locations and temporal onsets were balanced across trials for all three experimental conditions. In the majority of trials (80%), informative cues validly predicted where (spatial cue) or when (temporal cue) the target would appear ("valid" trials). In the remaining 20% of trials, the cue incorrectly predicted the spatial location or temporal onset of the target ("invalid" trials). The three experimental conditions (spatial/temporal/neutral) were presented separately in three blocks of trials, and block order was counterbalanced across subjects. The inter-trial interval (ITI) varied randomly from 600 to 1000 ms. A practice session (15 trials each for the neutral and spatial conditions, 30 trials for the temporal condition) preceded the main task, which consisted of 90 trials for each experimental condition. In the temporal and spatial conditions, a total of 72 valid trials and 18 invalid trials were presented (**Figure 1**).
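The block structure described above (90 trials per condition, 72 valid and 18 invalid in the cued conditions, three locations crossed with three foreperiods) can be illustrated with a minimal sketch. This is not the E-Prime script used in the study: the function name `build_block`, the dictionary keys, the uniform integer ITI, and the random placement of invalid trials (which does not balance them across location × foreperiod cells) are all assumptions made for illustration.

```python
import random

FOREPERIODS_MS = (540, 1080, 1620)   # short / medium / long foreperiods
LOCATIONS = ("left", "up", "right")  # the three peripheral boxes


def build_block(cue_type, n_trials=90, n_invalid=18, seed=0):
    """Return a shuffled list of trial dicts for one block (hypothetical layout)."""
    rng = random.Random(seed)
    trials = []
    for i in range(n_trials):
        # Cycle through the 9 location x foreperiod combinations so that
        # each combination occurs equally often (10 times in 90 trials).
        trials.append({
            "cue_type": cue_type,
            "location": LOCATIONS[i % 3],
            "foreperiod_ms": FOREPERIODS_MS[(i // 3) % 3],
            "valid": None,   # stays None in the neutral condition
            "cued": None,    # what the cue announced (location or foreperiod)
            "iti_ms": rng.randint(600, 1000),  # assumed uniform ITI
        })
    if cue_type in ("spatial", "temporal"):
        if cue_type == "spatial":
            key, options = "location", LOCATIONS
        else:
            key, options = "foreperiod_ms", FOREPERIODS_MS
        invalid = set(rng.sample(range(n_trials), n_invalid))
        for i, trial in enumerate(trials):
            trial["valid"] = i not in invalid
            if trial["valid"]:
                trial["cued"] = trial[key]
            else:
                # An invalid cue announces one of the two other alternatives.
                trial["cued"] = rng.choice([o for o in options if o != trial[key]])
    rng.shuffle(trials)
    return trials


temporal_block = build_block("temporal", seed=1)
print(sum(t["valid"] for t in temporal_block), "valid trials out of", len(temporal_block))
```

With the default arguments, 18 of 90 trials are invalid, which reproduces the 80%/20% validity split reported for the spatial and temporal conditions.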

Stimulus presentation and the collection of behavioral responses were controlled using E-Prime 2.0 software and a Serial Response Box Model 200a (Psychology Software Tools, Inc.; Schneider et al., 2002), which allows responses to be recorded with millisecond accuracy.

### MRI Acquisition

MRI data were collected at 3T (Philips Achieva) using a 32-channel receive-only head RF coil. A diffusion-weighted spin-echo echo-planar imaging sequence was used to acquire high angular resolution diffusion-weighted images (HARDI) (Jones et al., 1999). One hundred twenty-eight gradient orientations and six unweighted (b = 0 s/mm<sup>2</sup>) images were acquired with the following parameters: TR/TE = 10000/76 ms, b = 1,000 s/mm<sup>2</sup>, 60 slices, slice thickness = 2 mm, FOV = 224 × 224, acquisition matrix = 112 × 112, resulting in data acquired with a 2 mm<sup>3</sup> isotropic resolution. A high-resolution T1-weighted whole-brain structural scan was also acquired (1 mm<sup>3</sup> isotropic).

### MRI Pre-processing

Data were preprocessed and analyzed in ExploreDTI v4.8.6 (Leemans et al., 2009). Motion artifacts and eddy current distortions were corrected with B-matrix rotation using the approach of Leemans and Jones (2009). During this procedure, all brain scans were rigidly normalized to Montreal Neurological Institute (MNI) space as part of the motion-distortion correction step.

FIGURE 1 | Spatio-temporal predictive task. Participants detected, as quickly as possible, visual targets ('+' or '×') appearing at one of three locations (left/up/right) after one of three intervals (short/medium/long). Expectations about stimulus location (spatial condition) or onset (temporal condition) were conveyed by symbolic cues with 80% validity. The cue consisted of a central image (1° eccentricity) composed of a diamond and two rings. Part of the cue briefly brightened (100 ms) to inform participants about the likely spatial location or temporal onset of the upcoming target. During the spatial condition, the left, up or right side of the diamond was highlighted to inform subjects that the target was likely to appear in the left, up or right peripheral box. In the temporal condition, brightening of the inner circle, diamond or the outer circle indicated that the target would occur after a short (540 ms), medium (1080 ms) or long (1620 ms) delay, respectively. During the neutral condition, no spatial or temporal information was provided. The inter-trial interval (ITI) varied randomly from 600 to 1000 ms. The three experimental conditions (spatial/temporal/neutral) were presented separately in three blocks of trials (90 each).

### Tractography

Tensor estimation was performed using a non-linear least-squares method (Jones and Basser, 2004). Whole-brain tractography was performed using the DTI fiber tract-reconstruction algorithm implemented in ExploreDTI. The following tracking parameters were applied: a step size of 1 mm, a minimum FA threshold of 0.2 to initiate and continue tracking, an angle threshold of 30°, and a fiber length range of 10–500 mm (Mori and van Zijl, 2002; Langen et al., 2012).
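To make the role of the tracking parameters concrete, the following sketch expresses them as the stopping and acceptance rules of a generic deterministic streamline tracker. It is not ExploreDTI code: the function names are hypothetical, the inclusive comparisons (`>=`, `<=`) are assumptions, and the direction vectors are assumed to be already consistently oriented (real trackers must handle the sign ambiguity of the principal eigenvector).

```python
import math

FA_MIN = 0.2                      # minimum FA to initiate and continue tracking
MAX_ANGLE_DEG = 30.0              # maximum turning angle between successive steps
STEP_MM = 1.0                     # step size
LENGTH_RANGE_MM = (10.0, 500.0)   # accepted streamline length


def turning_angle_deg(prev_dir, next_dir):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(prev_dir, next_dir))
    norm = math.sqrt(sum(a * a for a in prev_dir)) * math.sqrt(sum(b * b for b in next_dir))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def may_continue(fa, prev_dir, next_dir):
    """True if the tracker is allowed to take the next step."""
    return fa >= FA_MIN and turning_angle_deg(prev_dir, next_dir) <= MAX_ANGLE_DEG


def accept_streamline(n_steps):
    """True if a finished streamline falls within the accepted length range."""
    length_mm = n_steps * STEP_MM
    return LENGTH_RANGE_MM[0] <= length_mm <= LENGTH_RANGE_MM[1]


print(may_continue(0.35, (1.0, 0.0, 0.0), (0.94, 0.34, 0.0)))
print(accept_streamline(5))
```

Under these rules, a streamline stops as soon as it enters a voxel with low anisotropy or would have to bend sharply, and streamlines shorter than 10 mm or longer than 500 mm are discarded after tracking.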

### Network Measures

The automated anatomical labeling (AAL) atlas (Tzourio-Mazoyer et al., 2002) was registered to the HARDI data. A weighted connectivity matrix for each subject was generated in ExploreDTI using the AAL atlas regions as nodes. The atlas used to parcellate the gray matter consists of 116 ROIs in total. However, the 26 cerebellar ROIs were excluded for two main reasons: first, the cerebellum was not homogeneously covered by MRI across subjects, giving rise to large inter-subject variability in the number of streamlines counted between cerebellar ROIs and the rest of the atlas; second, the streamline count from each cerebellar region was unavoidably altered by magnetic susceptibility artifacts due to the inhomogeneity of the local magnetic field induced by the magnetic field gradient used for diffusion tracking. The matrices were weighted by the number of streamlines connecting node i to node j and thresholded at five streamlines. Of the 90 remaining nodes, two (left and right Heschl's gyri) were excluded from statistical analyses in order to avoid singular matrices across subjects. Thus, a total of 88 nodes, 76 cortical and 12 subcortical, were used. Degree centrality and clustering coefficient (commonly used complex network measurements) were used to characterize network topology in terms of connectedness and segregation at the local level. All metrics were computed using the Brain Connectivity Toolbox (Rubinov and Sporns, 2010).
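As an illustration of the two local measures, the sketch below thresholds a streamline-count matrix and computes degree and clustering coefficient for each node with NumPy. It does not reproduce the Brain Connectivity Toolbox routines used in the study; the binary (unweighted) definitions, the "at least five streamlines" reading of the threshold, the symmetrization step, and the function names are assumptions, and the 4-node matrix is a toy example rather than study data.

```python
import numpy as np


def binarize(streamlines, min_streamlines=5):
    """Threshold a streamline-count matrix into a binary undirected adjacency matrix."""
    w = np.asarray(streamlines, dtype=float)
    w = np.maximum(w, w.T)       # enforce symmetry (assumed undirected network)
    np.fill_diagonal(w, 0)       # ignore self-connections
    return (w >= min_streamlines).astype(int)


def degree_centrality(adj):
    """Number of edges incident upon each node."""
    return adj.sum(axis=1)


def clustering_coefficient(adj):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    k = adj.sum(axis=1)
    triangles = np.diag(adj @ adj @ adj) / 2   # closed triangles through each node
    possible = k * (k - 1) / 2                 # neighbor pairs around each node
    return np.divide(triangles, possible, out=np.zeros(len(k)), where=possible > 0)


# Toy streamline counts between four hypothetical regions.
counts = [[0, 10, 7, 0],
          [10, 0, 6, 2],
          [7, 6, 0, 9],
          [0, 2, 9, 0]]
adj = binarize(counts)
print(degree_centrality(adj))        # degrees 2, 2, 3, 1
print(clustering_coefficient(adj))   # 1, 1, 1/3, 0
```

In the toy network, the third node has the highest degree but the lowest non-zero clustering, which shows how the two measures capture different aspects of a node's position: how many connections it has versus how interconnected its neighbors are.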

### Statistical Analysis

#### Behavioral Data

Reaction times to target stimuli faster than 100 ms were considered anticipatory and removed from the analysis. The benefit of temporal and spatial cueing on performance (i.e., individual mean RTs) was analyzed in one repeated measures ANOVA, with cue type (neutral, temporal valid, spatial valid) as the within-subjects factor. In order to examine the costs of invalid cueing and the potential interaction with the foreperiod length, another repeated measures ANOVA with cue type (time, space), validity (valid, invalid) and foreperiod (540, 1080, 1620 ms) as within-subjects factors was performed. A Bonferroni correction for multiple comparisons was applied. The benefits and costs of spatial cueing were explored by calculating two indices: (1) Validity<sub>spatial</sub> as the difference between RTs at spatial valid trials and RTs at neutral trials; (2) Invalidity<sub>spatial</sub> as the difference between RTs at spatial invalid trials and RTs at spatial valid trials. In order to further explore the temporal validity/invalidity effects unmasked by the foreperiod effect, four indices were calculated as follows: (1) Temporal Validity<sub>short</sub> as the difference between RTs at the short interval for early cue (V540) and RTs at the short interval when no cue was provided (N540); (2) Temporal Validity<sub>long</sub> as the difference between RTs at the long delay for late cue (V1620) and RTs at the long delay when no cue was provided (N1620); (3) Temporal Invalidity<sub>short</sub> as the difference between RTs at the short interval for late cue (I540) and RTs at the short interval for early cue (V540); (4) Temporal Invalidity<sub>long</sub> as the difference between RTs at the long interval for early cue (I1620) and RTs at the long interval for late cue (V1620).
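The six indices defined above are simple differences between condition means, as the following sketch shows. The function names, dictionary keys and RT values are hypothetical and chosen only for illustration; the condition labels V540, N540, V1620, N1620, I540 and I1620 follow the text.

```python
from statistics import mean


def mean_rt(rts_ms, min_rt_ms=100):
    """Mean RT after discarding anticipatory responses (faster than 100 ms)."""
    return mean(rt for rt in rts_ms if rt >= min_rt_ms)


def cueing_indices(rt):
    """Compute the six behavioral indices from a dict of condition mean RTs (ms)."""
    return {
        "validity_spatial": rt["spatial_valid"] - rt["neutral"],
        "invalidity_spatial": rt["spatial_invalid"] - rt["spatial_valid"],
        "temporal_validity_short": rt["V540"] - rt["N540"],
        "temporal_validity_long": rt["V1620"] - rt["N1620"],
        "temporal_invalidity_short": rt["I540"] - rt["V540"],
        "temporal_invalidity_long": rt["I1620"] - rt["V1620"],
    }


# Made-up condition means for one hypothetical participant (ms).
example = {
    "neutral": 400, "spatial_valid": 370, "spatial_invalid": 430,
    "V540": 390, "N540": 410, "V1620": 350, "N1620": 365,
    "I540": 450, "I1620": 360,
}
for name, value in cueing_indices(example).items():
    print(f"{name}: {value} ms")
```

With these definitions, a negative validity index indicates a benefit (faster responses after a valid cue than without a cue), whereas a positive invalidity index indicates a cost (slower responses after an invalid cue than after a valid one).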

#### Neuroimaging Data

Behavioral indices (Validity<sub>spatial</sub>, Invalidity<sub>spatial</sub>, Temporal Validity<sub>short</sub>, Temporal Validity<sub>long</sub>, Temporal Invalidity<sub>short</sub>, Temporal Invalidity<sub>long</sub>) and network topology measures from the eighty-eight selected brain regions were included in the analyses to investigate their reciprocal dependencies. In order to examine these relationships without inflating the risk of a type I error (i.e., erroneously concluding that a significant correlation is present) (Draper and Smith, 1998), and to increase the number of descriptors in the regression equation so as to improve the predictors' fit, a forward stepwise multiple regression model (F > 4 to enter) was chosen. Indeed, the forward stepwise procedure starts with no variables in the model and tries out the variables one by one, including them if they are statistically significant, thus identifying the set of predictors that gives the biggest improvement to the model. Simple linear regressions were preliminarily performed so that only variables significantly (p < 0.05) related to the behavioral measures considered were included in the subsequent multiple regression analyses.
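The forward selection logic with an F-to-enter criterion can be sketched as follows. This is a generic NumPy implementation written for illustration, not the StatView procedure used in the study; the function names, the partial F computation on residual sums of squares, and the synthetic data in the usage example are assumptions.

```python
import numpy as np


def rss(design, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)


def forward_stepwise(X, y, f_to_enter=4.0):
    """Add predictors one at a time while the best candidate's partial F exceeds f_to_enter."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    selected = []
    design = np.ones((n, 1))           # start from the intercept-only model
    current_rss = rss(design, y)
    while len(selected) < p:
        best = None
        for j in range(p):
            if j in selected:
                continue
            candidate = np.column_stack([design, X[:, j]])
            df_resid = n - candidate.shape[1]
            if df_resid <= 0:
                continue
            cand_rss = rss(candidate, y)
            # Partial F for adding predictor j to the current model.
            if cand_rss == 0:
                f_stat = np.inf
            else:
                f_stat = (current_rss - cand_rss) / (cand_rss / df_resid)
            if best is None or f_stat > best[0]:
                best = (f_stat, j, candidate, cand_rss)
        if best is None or best[0] <= f_to_enter:
            break                      # no remaining predictor passes the criterion
        _, j, design, current_rss = best
        selected.append(j)
    return selected


# Synthetic example: 29 "subjects", 5 candidate predictors, 2 of them relevant.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(29, 5))
y_demo = 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 2] + 0.1 * rng.normal(size=29)
print(forward_stepwise(X_demo, y_demo))
```

In the study, the candidate set passed to such a procedure would first be restricted to the nodes whose measure was significantly related to the behavioral index in the preliminary simple regressions.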

All statistical analyses were performed with StatView statistical software.

## RESULTS

### Behavioral Data

A main effect of cue type (neutral, temporal valid, spatial valid) [F(2,28) = 5.597; p = 0.0061] confirmed that targets appearing at a predictable location or moment in time are detected faster than uncued ones. Post hoc paired t-tests showed a stronger effect for spatial [t(28) = 3.011; p = 0.0055] than for temporal [t(28) = 1.61; p = 0.1186] cueing, indicating that the latter did not significantly affect performance. The second analysis, concerning temporal and spatial invalidity costs, showed significant main effects of cue type, validity, and foreperiod [F(1,28) = 23.71; p < 0.001; F(1,28) = 89.3; p < 0.001; F(2,56) = 75.83; p < 0.001, respectively], a significant cue × validity interaction [F(1,28) = 32.23; p < 0.001], a significant validity × foreperiod interaction [F(2,56) = 10.42; p < 0.001] and a significant three-way interaction between cue type, validity and foreperiod [F(2,56) = 4.10; p = 0.02]. Post hoc analyses (Bonferroni test) confirmed that, in the temporal domain, RTs at the short interval after an early cue did not differ from RTs at the medium interval after the medium cue, but were significantly slower than RTs at the long interval after the late cue (due to the interaction with the foreperiod effect) [p = 0.0002]. Concurrently, RTs at the short interval after the late cue were slower than RTs at the medium interval after an invalid cue (either early or late) [p < 0.0001] and than RTs at the long interval after an early cue [p < 0.0001], while no difference was observed between the latter two variables. Behavioral performance at the medium interval was therefore excluded from the computation of further indices, and only RTs at the short and long delays were considered (see **Table 1**).

TABLE 1 | Mean reaction times (RTs) to target stimuli split by (a) cue type; (b) validity, and (c) the interaction between validity and foreperiod.


### Local Connectedness

#### Temporal Validity and Invalidity Effects

The covariance between the behavioral benefit of knowing in advance when the target will occur and the centrality of nodes within the whole structural connectivity network was investigated. Statistical analyses were performed separately for the Temporal Validity<sub>short</sub> and Temporal Validity<sub>long</sub> effects, considered as measures of explicit temporal orienting processes. Simple linear regressions showed significant negative correlations between the Temporal Validity<sub>short</sub> effect and degree centrality of a set of cortical regions. Moreover, significant covariance, in the form of both negative and positive correlations, was found between the degree centrality of a widespread pattern of cortical and subcortical nodes and the Temporal Validity<sub>long</sub> effect (see **Table 2**). The subsequent stepwise regressions showed that 51% of the total variance observed in the Temporal Validity<sub>short</sub> effect was explained by variability in the degree centrality of the right inferior parietal (r = −0.41), right superior occipital (r = −0.31) and left rolandic operculum (r = 0.48) nodes [F(3,25) = 10.854; r = 0.75; adjusted R<sup>2</sup> = 0.51; p < 0.0001]. Further, for the Temporal Validity<sub>long</sub> effect, the same analysis revealed that 67% of the total variance was explained by variability in the degree centrality of the right supramarginal (r = −0.29), left inferior parietal (r = −0.39), left fusiform (r = 0.51) and right insula (r = 0.24) nodes [F(4,24) = 15.455; r = 0.85; adjusted R<sup>2</sup> = 0.67; p < 0.0001].
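For readers who wish to relate the reported multiple r, adjusted R<sup>2</sup> and F values, the standard formulas for a regression with n observations and p predictors are given below, together with a worked check for the short-interval model (n = 29, p = 3, r = 0.75). The formulas are textbook definitions rather than a description of the software output, and the small gap between the recomputed and the reported F is presumably due to rounding of r.

$$
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1},
\qquad
F(p,\; n - p - 1) = \frac{R^2 / p}{\left(1 - R^2\right) / (n - p - 1)}
$$

$$
R^2 = 0.75^2 = 0.5625,
\qquad
R^2_{\mathrm{adj}} = 1 - 0.4375 \times \frac{28}{25} = 0.51,
\qquad
F(3, 25) = \frac{0.5625 / 3}{0.4375 / 25} \approx 10.7
$$

The recomputed adjusted R<sup>2</sup> matches the reported value of 0.51, and the recomputed F is close to the reported F(3,25) = 10.854.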

The covariance between the behavioral cost of receiving invalid information about when the target will occur and the centrality of nodes within the whole structural connectivity network was also investigated. Again, in the temporal invalid condition, two different variables were analyzed so as to explore implicit temporal orienting processes (Temporal Invalidity<sub>short</sub>, Temporal Invalidity<sub>long</sub>). Positive and negative correlations were found between the two indices and the centrality of nodes within a cortical-subcortical network (see **Table 2**). The stepwise regressions showed that 40% of the total variance observed in the Temporal Invalidity<sub>short</sub> effect was explained by variability in the degree centrality of the left putamen (r = 0.44), right paracentral lobule (r = 0.4) and right caudate nucleus (r = 0.32) nodes [F(3,25) = 7.434; r = 0.7; adjusted R<sup>2</sup> = 0.4; p = 0.001]. Conversely, for the Temporal Invalidity<sub>long</sub> effect, the stepwise regression showed a negative relationship between the ability to re-orient attention to a later time point and degree centrality of the left inferior parietal lobe [F(1,27) = 6.608; r = 0.44; adjusted R<sup>2</sup> = 0.2; p = 0.02].

#### Spatial Validity and Invalidity Effects

The analyses of subjects' ability to orient attention in space showed that the centrality of two regions was correlated, one positively and one negatively, with RTs to stimuli appearing in a predictable location. The stepwise regression showed that the positive covariance between the Validity<sub>spatial</sub> effect and degree centrality of the para-hippocampal gyrus (r = 0.5), together with the negative covariance with centrality of the post-central gyrus (r = −0.4), could explain 40% of the total covariance [F(2,26) = 9.947; r = 0.7; adjusted R<sup>2</sup> = 0.4; p = 0.0006].

TABLE 2 | Observed covariance between degree centrality and spatio-temporal predictive performance.





The cost of spatial invalidity on performance (Invalidity<sub>spatial</sub>) was found to covary positively with degree centrality of one region. However, the stepwise regression model produced no significant result (**Table 2** and **Figure 2**).

### Local Segregation

#### Temporal Validity and Invalidity Effects

The covariance between the four temporal indices (Temporal Validity<sub>short</sub>, Temporal Validity<sub>long</sub>, Temporal Invalidity<sub>short</sub>, Temporal Invalidity<sub>long</sub>) and the prevalence of clustered connectivity around individual nodes was investigated. Simple linear regressions showed significant negative correlations between the Temporal Validity<sub>short</sub> effect and the clustering coefficient of nodes within a cortical network (see **Table 3**). Conversely, significant positive correlations were found between the Temporal Validity<sub>long</sub> effect and the clustering coefficient of nodes within a cortical-subcortical network (see **Table 3**). The stepwise regression model showed a specific negative relationship between the Temporal Validity<sub>short</sub> effect and the clustering coefficient of the right superior frontal cortex, indicating that its connectedness with topologically neighboring nodes explained 18% of the total variance [F(1,27) = 7.001; r = −0.45; adjusted R<sup>2</sup> = 0.18; p = 0.013]. Moreover, stepwise regression analyses indicated that 75% of the total variance observed in the Temporal Validity<sub>long</sub> effect was explained by the clustering coefficient of the right putamen (r = 0.5), left paracentral lobule (r = 0.6) and right amygdala (r = 0.6) nodes [F(3,25) = 28.908; r = 0.9; adjusted R<sup>2</sup> = 0.75; p < 0.0001]. The same analysis applied to the two temporal invalidity indices showed significant correlations between these indices and the clustering coefficient of a set of cortical nodes and of a single subcortical region (see **Table 3**). In the subsequent stepwise regressions, we found that 50% of the total covariance between the whole sub-network's clustering coefficient and RTs indexing the Temporal Invalidity<sub>short</sub> effect [F(2,26) = 12.31; r = 0.7; adjusted R<sup>2</sup> = 0.5; p = 0.0002] was explained by the left middle frontal gyrus (r = 0.6) and the left caudate nucleus (r = 0.4). Conversely, when temporal expectations had to be updated to a later time point (as expressed by the Temporal Invalidity<sub>long</sub> effect), the clustering coefficient of the right posterior cingulate gyrus (r = −0.4), the right middle temporal pole (r = −0.4) and the left superior temporal pole (r = 0.4) explained 40% of the observed covariance [F(3,25) = 8.339; r = 0.7; adjusted R<sup>2</sup> = 0.4; p = 0.0005].

#### Spatial Validity and Invalidity Effects

In the case of predictable targets, the clustering coefficient of a single area was found to covary with the ability to orient attention in space (Validity<sub>spatial</sub> effect). However, this correlation was no longer significant in the stepwise regression model.

Conversely, positive correlations were found between the Invalidity<sub>spatial</sub> effect and the clustering coefficient of cortical and subcortical areas (see **Table 3**). The stepwise regression showed that 30% of the total covariance between the whole sub-network's clustering coefficient and the cueing cost for spatially invalid trials [F(2,26) = 7.124; r = 0.6; adjusted R<sup>2</sup> = 0.3; p = 0.0034] was explained by the clustering coefficients of the left paracentral lobule (r = 0.4) and the right putamen (r = 0.4) (**Table 3**).

## DISCUSSION

In the present study, we investigated the covariance between performance on a spatio-temporal predictive task and topological measures of complex brain networks. Degree centrality (i.e., a measure of local connectedness identifying critical network elements as those having the highest number of connections, or degree) and clustering coefficient (i.e., a measure of local segregation indicating the presence of clustered connectivity around individual nodes) were computed (Barabási, 2016) and correlated with behavioral indices reflecting temporal and spatial expectations. Specifically, we aimed at identifying critical network elements for explicit spatial and temporal orienting, and at examining potential variations for implicit temporal expectations. Indeed, given that temporal predictions can rely on different sources of relevant temporal information (Correa, 2012), such as explicit predictive cues and probabilistic information associated with the passage of time (hazard function and foreperiod effects), we disentangled explicit processes from more implicit measures reflecting the interaction between these two different types of temporal orienting effects.

To date, although relevant behavioral (MacKay and Juola, 2007; Weinbach et al., 2015; Laidlaw and Kingstone, 2017) and fMRI (Coull and Nobre, 1998; Coull et al., 2015; Peer et al., 2015) investigations examined the interrelation between predictive behavior and processing of spatial and temporal stimuli features, this is the first study that applied complex network analysis to structural connectivity data for studying the brain topological organization underlying spatial and temporal predictive processing.

Here we found that response preparedness to cued stimuli relies on different structural connectivity networks for the temporal and spatial domains. Specifically, while the use of temporally informative cues was found to covary with centrality and segregation measures within a distributed cortical-subcortical network, its spatial counterpart was related to clustered connectivity around a few brain regions. In particular, the behavioral advantage of being validly cued to a certain spatial location was found to covary with the local connectedness (increased when RTs were faster after valid as opposed to neutral cues) of a region in the right parietal lobe (the postcentral gyrus) that is more anterior than those described in previous fMRI reports (see Chica et al., 2013 for a review; Coull and Nobre, 1998; Corbetta and Shulman, 2002). Although this region is traditionally linked to somatosensory perception (including proprioception), some functional and lesion studies (Corbetta, 1998; Balslev et al., 2013) suggested that the postcentral gyrus participates in movement organization and anticipation processes by contributing to coding the locus of attention. Since spatial attention can be defined as the selection of locations for perception and/or for action, the postcentral gyrus may modulate the visual processing stream toward a motor response by using gaze-direction signals that have a proprioceptive component (Balslev et al., 2013). Alternatively, the observed correlation may be related to the participation of the postcentral gyrus in one of the three distinct key subsystems of attention, namely the orienting network (Fan et al., 2005; Posner, 2008). Indeed, clustered connectivity was found around the left paracentral lobule when the behavioral cost of being invalidly cued to a target location was considered. Since both the postcentral and the paracentral gyri participate in the network responsible for directing attention to target stimuli, triggered by specific spatial cues (Petersen and Posner, 2012), the patterns of relations among these brain areas and their topological neighbors may sustain cued orienting of attention to spatial locations. The fact that increased connectivity was found in a local network community comprising the right putamen when spatially invalid trials were considered seems to further support the hypothesis that this region is one of the hubs of the intrinsic connectivity networks of the resting brain involved in attentional processes (Xiao et al., 2016).

As for temporal orienting, significant covariance was observed between centrality measures (calculated according to the number of axonal bundles incident upon a node) of regions within a subcortical-fronto-parieto-occipital network (comprising the left putamen, the right caudate nucleus, the left frontal operculum, the right inferior parietal cortex, the right paracentral lobule and the right superior occipital cortex) and the ability to respond after a short cue-target delay, suggesting that the local connectedness of such nodes plays a central role when the source of temporal expectation is explicit (Correa, 2012). When local segregation was taken into account, we found highly clustered structural connectivity across the right superior frontal gyrus, the left middle inferior frontal gyrus and the left caudate nucleus as related to explicit temporal orienting.

Interestingly, different connectivity clusters were observed when the automatic shifting of attention (related to unexpected premature targets) was separated from the voluntary top–down control of attentional orienting (in the case of valid cueing), since the clustering coefficient of the left middle frontal gyrus and caudate nucleus covaried with the former process, while the latter was related to the right superior frontal gyrus clustering coefficient.

TABLE 3 | Observed covariance between clustering coefficient and spatio-temporal predictive performance.

These findings expand prior results (Coull and Nobre, 1998; Davranche et al., 2011; Coull et al., 2013, 2016; Carvalho et al., 2016) by defining, across the entire brain structural network, the relative importance of specific brain regions in temporal orienting. They also indicate the presence in the whole brain structural network, of a group of regions whose reciprocal connectivity scaled with the increased preparedness in responding to fast upcoming pre-cued stimuli and was specific for the process under investigation (automatic attentional shift vs. voluntary attentional orienting). Indeed, the functionality of a node is defined by the pattern of its connections with other nodes in the network (Bullmore and Sporns, 2009), since centrality increases as the potential for communication from a region increases, while the segregation measure here used is indicative of lasting patterns of relations among brain areas for supporting specific cognitive processes (Rubinov and Sporns, 2010).

Although a large corpus of research has implicated complex and distributed networks (including prefrontal, premotor, parietal and insular cortices, as well as the striatum; Triviño et al., 2010) in temporal orienting of attention, the relevance of these structures in optimizing prospective motor behavior as a function of informative cues is still debated. Our results suggest that the behavioral benefit of being validly cued to a short interval is subtended by the reciprocal connectivity within a fronto-parieto-occipital complex. Activation of the inferior parietal cortex has been widely observed in fMRI studies on temporal orienting (Coull and Nobre, 1998; Coull et al., 2000, 2013, 2016; Cotti et al., 2011; Davranche et al., 2011). Although a frontoparietal network has been implicated in temporal as well as spatial orienting, the inferior parietal cortex seems specifically related to predictive timing (Wiener et al., 2010), as it has been found active not only when participants had to use temporal cues to optimize motor responses, but also when they used them to enhance perceptual discrimination (Davranche et al., 2011), irrespective of the effector and the type of requested action (button press/ocular saccades) (Cotti et al., 2011), in both the visual and auditory modalities (Merchant et al., 2013; Bolger et al., 2014). In terms of the structural network observed here, results confirm a pivotal role of this area in implementing the response benefits of temporal prediction, which is consistent with the thesis that temporal expectations tune action planning by optimizing prospective motor behavior (Coull and Nobre, 1998). The fact that densely connected local clusters were structurally centered around the right superior frontal gyrus in explicit temporal orienting supports the assumption that the latter is a voluntary process that requires more evolved structures such as the frontal cortex, involved in the strategic and voluntary (top–down) regulation of behavior (Konishi et al., 2008; Triviño et al., 2010).

In the case of premature targets (i.e., when the target appears at the short interval after a late cue), a bottom–up, automatic capture of attention has been previously described and linked to the activity of the posterior extrastriate visual cortex (Coull et al., 2000). Here we found that premature targets are related to the local connectedness of the left putamen, the right caudate and the right paracentral lobule. The involvement of the striatum in timing is well known, being described as a 'core timer' of a more distributed neural network underlying temporal processing in the subsecond and multisecond range (Meck et al., 2008). Putaminal activity has been traditionally associated with motor preparation and execution, and specifically with the internal generation of precisely timed movements (Rao et al., 1997). Indeed, previous studies (Filip et al., 2016) suggested that the putamen is engaged in the evaluation of the success and precision of the ongoing prospective temporal analysis, and our result would underscore its function as a neural node in a network engaged in cases of breaches of temporal expectations.

When the interaction between explicit and implicit temporal orienting processes was considered at the long interval, we found that explicit processes (indexed by the net advantage of being validly cued when the foreperiod effect was partialled out) were related to centrality measures of the bilateral inferior parietal lobule. Degree centrality of the same region in the left hemisphere covaried with behavioral measures (RTs at the long interval for early cue minus RTs at the long interval for late cue) indexing the process of attentional re-orienting. These results confirm the key role of the left parietal cortex in instantiating the behavioral benefits of temporal predictability, whether temporal information is conveyed by explicit predictive cues or by the probabilistic information associated with the passage of time (Coull et al., 2016). Densely connected local clusters were structurally centered on the right middle temporal pole and the right posterior cingulum when the re-orienting effect was specifically considered. The supramodal involvement of the temporal lobe in time processing has already been demonstrated (Kanai et al., 2011; Filip et al., 2016), while the posterior cingulate cortex seems necessary for organizing flexible behavior in response to an ever-changing environment. Indeed, this region contributes to signaling environmental changes and, when necessary, to recombining variables into strategy-specific measures of Bayesian evidence that the environment has changed (Pearson et al., 2011). The fact that lasting patterns of relations among the posterior cingulate and its topological neighbors were observable when attention was re-oriented to a later point in time suggests that this area is specifically involved in the dynamic updating of current expectancies as a function of the foreperiod.

The selectivity of our results, indicating segregated structural connectivity networks for temporal and non-temporal stimulus dimensions, but also for explicit and implicit temporal orienting, provides empirical support to the notion that complex network theory enhances behavior-driven neuroimaging data analysis of predictive timing. Indeed, evidence on the neural underpinnings of temporal orienting is currently restricted to task-based fMRI experiments in which the core of the process circuitry is blurred by the recruitment of additional areas related to ancillary cognitive/sensory processes. Conversely, a task-independent structural neuroimaging approach within a network-based modular framework provides quantitative information on anatomical brain connectivity, at both global and regional levels, thus unraveling the lasting patterns of relations among brain areas that support the process under investigation (Sporns, 2013; Mišić and Sporns, 2016). We also think that our findings have a translational impact, since alterations in predictive timing have been evidenced in psychiatric disorders such as schizophrenia (SZ). As such, we believe that our results have the potential to better define the neural circuits involved in the pathophysiology of the illness, and to provide an explanatory framework for symptoms and clinical manifestations. Given that graph theory allows a quantitative analysis of the pattern of interconnections each area has with other areas belonging to the same or different systems (Mišić and Sporns, 2016), its application may provide new insight into the dysfunctional interplay between timing deficits, clinical symptoms and connectopathy in SZ. 
Indeed, neuroimaging investigations evidenced connectivity deficits in patients with SZ, with positron emission tomography (Andreasen et al., 1996; Mallet et al., 1998), fMRI (Fornito and Bullmore, 2015; Crossley et al., 2016; Nelson et al., 2017) and diffusion-tensor imaging techniques (see Kelly et al., 2017 for a meta-analysis), suggesting that SZ pathophysiology may be explained in terms of abnormal or disrupted integration of spatially distributed brain regions (disconnection hypothesis, Friston and Frith, 1995; Friston et al., 2016). Here we observed that centrality and segregation properties of the parietal and frontal cortices modulate explicit and implicit temporal orienting in healthy subjects. Intriguingly, these areas have been identified as putative hubs across the whole brain network (Iturria-Medina et al., 2008; Gong et al., 2009) [i.e., highly connected nodes that allow segregated functional systems to share information by means of neural interaction (Collin et al., 2016)]. Such an infrastructure of pivotal nodes has been found to be disproportional in SZ patients, with abnormally high clustering, anomalous modularity structure (Zalesky et al., 2011; van den Heuvel et al., 2013), and reductions of global efficiency in the overall network structure (Zhang et al., 2012). Moreover, this altered connectivity profile was observed in specific areas (Bassett et al., 2008; van den Heuvel et al., 2010; Wang et al., 2012; Kaplan et al., 2016) that partially overlap with the predictive timing network described here, and are responsible for the computation of prediction error signals to guide learning and updating of expectancies (Kaplan et al., 2016).

Before concluding, a few limitations have to be acknowledged. First, the relatively small sample size might reduce the impact of our findings and rescale the assertions discussed here. However, we intended to provide preliminary evidence of the neural architecture subtending the ability to optimize prospective motor behavior as a function of informative cues. In fact, complex and distributed networks have been thus far implicated in predictive timing, while aberrant connectivity has been suggested to underlie failures in predictive coding (Kraepelin, 1904; Friston et al., 2016). In addition, it might be argued that the network measures chosen are restricted to a small pool; however, we intended to exploit the straightforward characterization given by those used, in order to elicit the cerebral interconnectivity correlates of such a complex process as predictive timing. Moreover, although the small number of temporally invalid trials used here could be a potential limitation, we nevertheless observed the traditional behavioral costs of being invalidly cued in time, further confirming the strength of the temporal orienting effects investigated here, even with a relatively reduced data pool. Lastly, recent studies on attentional orienting suggested that cues in the temporal and spatial conditions from the seminal study by Coull and Nobre (1998) are hardly comparable, as spatial arrow cues elicit both voluntary and involuntary attention shifts (Hommel et al., 2001; Olk, 2014), being over-learned symbols of direction (Ristic and Kingstone, 2012) as opposed to temporal ones. Future behavioral and neuroimaging studies should employ cues that differ strictly in terms of temporal and non-temporal properties, in order to minimize potential confounds on the type of attentional orienting engaged.

### AUTHOR CONTRIBUTIONS

FP conceived the experimental design, supervised the data analysis, contributed to results interpretation, and wrote the paper. VC conceived the experimental design, collected the data, performed the behavioral data analysis, contributed to results interpretation, and wrote the paper. GS supervised the data analysis, contributed to results interpretation, and revised the paper. TG conceived the experimental design, performed the MRI data analysis, contributed to results interpretation, and wrote the paper. DV conceived the experimental design, collected the data, performed the MRI data analysis, contributed to results interpretation, and revised the paper.

### FUNDING

This study was conducted within the project "Multidimensional study of timing abilities and sense of agency in schizophrenia and bipolar patients" funded through 5Xmille 2016 from the Italian Ministry of Health.

### REFERENCES



Barabási, A.-L. (2016). Network Science. Cambridge: Cambridge University Press.

Neuropsychologia 51, 309–319. doi: 10.1016/j.neuropsychologia.2012.08.017

in schizophrenia. JAMA Psychiatry 70, 783–792. doi: 10.1001/jamapsychiatry.2013.1328

Zhang, Y., Lin, L., Lin, C. P., Zhou, Y., Chou, K. H., Lo, C. Y., et al. (2012). Abnormal topological organization of structural brain networks in schizophrenia. Schizophr. Res. 141, 109–118. doi: 10.1016/j.schres.2012.08.021

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer MC and handling Editor declared their shared affiliation.

Copyright © 2018 Ciullo, Vecchio, Gili, Spalletta and Piras. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Training-Induced Changes in Rapid Auditory Processing in Children With Specific Language Impairment: Electrophysiological Indicators

#### Anna Dacewicz<sup>1†</sup>, Aneta Szymaszek<sup>1</sup>\*<sup>†</sup>, Kamila Nowak<sup>2</sup> and Elzbieta Szelag<sup>1</sup>

<sup>1</sup>Laboratory of Neuropsychology, Nencki Institute of Experimental Biology of Polish Academy of Sciences, Warsaw, Poland, <sup>2</sup>Laboratory of Social Psychology, Department of Ergonomics, Central Institute for Labour Protection–National Research Institute, Warsaw, Poland

The brain's ability to recognize acoustic changes occurring in rapid temporal succession is important for speech and successful language development. Children with specific language impairment (SLI) are characterized by deficient dynamics of temporal information processing (TIP) in the millisecond time range accompanied by disordered language development. Furthermore, previous studies have found that intervention based on amelioration of TIP resulted in improvement of both language and other cognitive functions. This study aimed to explain the changes associated with TIP training from the perspective of event-related potentials (ERPs). Thirty-six children aged 5–8 years (26 boys, 10 girls) diagnosed with SLI underwent two types of intense audio-visual computer intervention: experimental TIP training targeted at the millisecond time range (n = 18) or control non-TIP training (n = 18). Paired 50 ms tones of 1000 Hz and 1200 Hz were presented with inter-stimulus intervals (ISIs) of either 50 ms (Short ISI Condition) or 200 ms (Long ISI Condition). Auditory ERPs were measured in a passive oddball paradigm before and after each type of training. The mismatch negativity (MMN) paradigm was applied as an electrophysiological indicator of the brain's ability to automatically detect violations of regularity in paired tones presented in rapid succession. Moreover, the P3a component was also analyzed. After 24 sessions of temporal training (in the experimental group) MMN amplitude enhancement was observed in both ISI conditions, reflecting increased efficiency in perceiving changes in rapid auditory sequences. In both experimental and control groups, P3a amplitude was enhanced in both ISIs. This may be due to the improvement of involuntary attention shifting to the auditory events involved in each training type. To conclude, temporal training, compared to non-temporal control training, improved the ability to detect changes in a rapid auditory stream in children with SLI.

Keywords: specific language impairment (SLI), event related potentials (ERPs), temporal information processing, temporal windows, cognitive training

#### Edited by:

Deana Davalos, Colorado State University, United States

#### Reviewed by:

Mireille Besson, Institut de Neurosciences Cognitives de la Méditerranée (INCM), France

Annalisa Setti, University College Cork, Ireland

\*Correspondence: Aneta Szymaszek, a.szymaszek@nencki.gov.pl

†These authors have contributed equally to this work.

Received: 20 October 2017; Accepted: 16 July 2018; Published: 07 August 2018

#### Citation:

Dacewicz A, Szymaszek A, Nowak K and Szelag E (2018) Training-Induced Changes in Rapid Auditory Processing in Children With Specific Language Impairment: Electrophysiological Indicators. Front. Hum. Neurosci. 12:310. doi: 10.3389/fnhum.2018.00310

## INTRODUCTION

### Temporal Dynamics of Cognitive Function

In recent years, temporal information processing (TIP) has been emphasized as the neural basis of several mental functions, such as learning, memory, attention, decision making, motor control and speech processing (Szelag et al., 2010). All these functions may be characterized by their specific temporal dynamics at different ranges of TIP. Two main ranges crucial to cognitive processes may be distinguished: millisecond and multisecond ranges (Pöppel, 1997, 2004, 2009; Buhusi et al., 2018). The temporal dynamics of these processes provide a framework for understanding the neural mechanisms underlying human mental activity, including a structure for speech and any perceptual or motor activity.

Furthermore, specific distortions of TIP have been found in various clinical populations (e.g., ADHD, autism spectrum disorder, depression, schizophrenia), co-existing with disordered psychological functioning, which may elucidate the consequences of deficient timing (e.g., Davalos et al., 2003; Teixeira et al., 2013; Vatakis and Allman, 2016). This line of research is important as it may foster our understanding of the "timing-behavior" relationship and potentially contribute to the remediation of psychological conditions. On this basis, one cannot claim that certain disorders are due to time distortions per se, but deviations from the "normal" template seem critical to everyday functioning and, hence, are a crucial factor for research and rehabilitation.

The number of experimental studies on the co-existence of "deficient timing-deficient language" in children and adults has grown rapidly. In this article, we concentrate on neurodevelopmental disorders, specifically on children with specific language impairment (SLI). Children suffering from SLI display problems with language acquisition (both comprehension and expression); however, their general cognitive functions and nonverbal intelligence remain in the normal range. Furthermore, these problems cannot be explained in terms of hearing deficits, neurological and speech mechanism abnormalities, or environmental factors. It is estimated that SLI affects approximately 7% of the 5-year-old population (Tomblin et al., 1997). In children with SLI, the coexistence of language difficulties and deficient TIP was first shown in an early article by Tallal and Piercy (1973) and subsequently confirmed in more recent studies (Grondin et al., 2007; Szelag et al., 2015). These findings raised the question of whether the improvement of TIP of nonverbal information may induce subsequent gains in language skills. Hence, the funding of training programs based on improvement of TIP to aid language competency has been given a high priority.

The widely used computer-based remediation Fast ForWord® (FFW) has received a lot of attention in studies on improving language development in children. The program is based on the hypothesis that language impairments result from difficulties in rapid auditory processing (see above). In the early studies of Tallal et al. (1996), as well as of Merzenich et al. (1996), it was found that after performing FFW exercises, there was a significant improvement in the language skills of children with expressive and/or receptive language difficulties and reading problems. However, in some studies, the beneficial effects of FFW were comparable to those of other computer-based interventions (e.g., Cohen et al., 2005; Gillam et al., 2008; Given et al., 2008). Previous research on intervention-related effects indicated a transfer of improvement from the trained time domain to untrained behavioral domains, i.e., speech, auditory processing and other cognitive functions. Moreover, changes in brain function associated with auditory discrimination of both verbal (Lovio et al., 2012) and nonverbal stimuli presented in rapid sequences were reported (McArthur et al., 2010; Heim et al., 2013, 2016).

This research approach was applied in the Dr. Neuronowski® computer program developed at our Institute (Szelag and Szymaszek, 2016). This program offers targeted training in millisecond TIP, sequencing abilities and duration judgment and has been developed on the basis of our many years of research (for an overview, see Szelag et al., 2009). Using this intervention program in our previous study on children with SLI, we found significant improvements in behavioral measures of timing, language, attention, working memory and executive functions (Szelag et al., 2015).

### Time Perception Windows

As mentioned above, time perception can be considered on various time scales or processing units which are often called "temporal windows" (Pöppel, 1997). Experimental support for the existence of such time windows comes from a large number of different paradigms (Pöppel, 1997; Szelag et al., 2004). At this point, the close association between TIP and speech is well established and has been the topic of long discussions in modern neuroscience. This association may be rooted in the temporal dynamics of our verbal utterances which indicate the temporal constraints on different time ranges. These ranges correspond to tens of milliseconds (single phonemes), hundreds of milliseconds (syllables or prosodic elements), or a few seconds (phrases or sentences). Speech processing requires temporal decoding of the signal because the speech stream is a wave of rapidly changing complex sounds (millisecond processing level). Furthermore, it is necessary to parse the perceived signal into manageable chunks (multisecond level).

The concept of temporal characteristics of the speech signal was proposed in **?** (**?**) Asymmetrical Sampling in Time hypothesis. This assumes that speech perception is associated with processing within two different temporal windows—shorter and longer windows—corresponding to two levels of processing: namely, phonological decoding and semantic parsing of the chunks of the incoming speech stream. Accordingly, the short window (∼20–50 ms) is crucial for phoneme reception (e.g., rapid formant transitions, voice-onset-time), whereas the long window (∼150–300 ms) contributes to syllable processing, intonation contour and prosody. Speech perception thus requires the integration of auditory information within these two ranges, which are the topic of the present study.

Existing data reveal that the length of such temporal integration windows (TIWs) of sensory information may be related to developmental fluctuations (Fox et al., 2010). For example, such TIWs are expected to be wider in children than in adults (Horváth et al., 2007; Fox et al., 2010) because the latter age-group can integrate information within shorter temporal segments than the former group. According to EEG studies, the length of the TIW in adults is around 200 ms, while in children of 5–8 years it is about 350 ms (Wang et al., 2005). It is also postulated that the length of such TIWs may change in subjects suffering from speech disorders.

### Mismatch Negativity (MMN) as an Indicator of "Genuine" Timing Efficiency

Different methods for measurement of millisecond timing efficiency have been developed (Szelag et al., 2004). One can distinguish here the classical behavioral methods, such as the detection of the temporal order of incoming events (Szymaszek et al., 2009). In these tasks, two consecutive stimuli are presented in rapid succession with a short gap in-between and the subject must report the order of occurrence (left-right, low-high, short-long, etc.). It is commonly believed that these tasks measure the effectiveness of sequencing ability and temporal ordering of incoming events on the basis of perceptual thresholds (auditory or visual) for order detection. Such threshold values are usually on the order of tens of milliseconds in young healthy subjects (Wittmann and Fink, 2004; Szymaszek et al., 2009). However, the efficiency of TIP measured in such tasks involves strong contributions from resources other than TIP, e.g., attention, working and short-term memory, inhibitory processes and decision making. Hence, behavioral measures of temporal acuity are heavily influenced by these cognitive functions.

On the other hand, the mismatch negativity (MMN) paradigm may be used as an electrophysiological indicator of timing efficiency in the processing of rapid auditory stimuli. The power of such event-related potentials (ERPs) lies in that they provide information about the sequencing, timing and in some cases, location of neural activity elicited by particular stimuli long before subjects produce an overt response. ERPs are increasingly used in developmental research because they are non-invasive and, in many cases, do not necessitate active subject participation, which is a huge advantage when dealing with children or patients.

MMN is a fronto-central negative potential elicited by any discriminable change in a sequence of auditory stimuli (Winkler, 2007; Näätänen et al., 2011). MMN can be induced by unattended stimuli, unlike behavioral methods which require attention and cooperation from a participant (Campbell and Davalos, 2015). Therefore, MMN can be a useful tool for measuring "genuine" timing deficits in subjects whose behavioral measures may be affected by attentional or other cognitive deficits.

MMN obtained in response to auditory stimuli has been proposed as a reliable objective method to measure sensory memory traces, as well as the effectiveness of rapid auditory discrimination processes (Davalos et al., 2003; Ervast et al., 2015). Therefore, MMN has been recognized as a sensitive indicator of auditory processing impairments in children with language difficulties (Davids et al., 2011), as well as in infants of parents with a history of language impairment (Benasich et al., 2002, 2006).

### Experimental Aim

This study aims to verify whether, in children with SLI, the application of the experimental temporal training compared to the control non-temporal training may result in enhanced electrophysiological responses in the detection of changes in a rapid auditory stimulus stream. Another goal was to gain an understanding of the electrophysiological underpinnings of rapid nonverbal auditory processing in both the shorter and longer temporal windows. The application of the MMN paradigm allowed us to concentrate on pure TIP, minimizing the influence of cognitive functions (attention, executive functions, etc.) involved in the behavioral timing measures.

## MATERIALS AND METHODS

### Subjects

Participants were 36 children aged between 5 years and 8 years (26 boys, 10 girls) diagnosed with SLI according to ICD-10 (World Health Organization, 1992). They were recruited from either the Early Intervention Centre or the Children's Memorial Health Institute in Warsaw. The core inclusion criterion was a language development disorder, defined as an overall standard score or score on at least two standard subtests below or equal to the 4th sten according to the Test for Assessment of Global Language Skills (TAGLS; Tarkowski, 2001), which constitutes the screening assessment for language development in Polish children. The recruited children scored below the mean score for their age. Moreover, all participants had a normal level of nonverbal intelligence (IQ of 85 or higher, measured with the Polish version of Raven's Colored Progressive Matrices, CPM; Szustrowa and Jaworowska, 2003) and normal hearing level as verified by pure-tone audiometry screening at 500 Hz, 1000 Hz, 2000 Hz, 4000 Hz (using an AS208 audiometer), which covers the sound frequency spectrum used in this study. All children were monolingual Polish native speakers and were right-handed, based on the modified Edinburgh Handedness Questionnaire (Hill and Khanem, 2009). Participants had no neurological or psychiatric diagnosis, autism spectrum disorder, attention deficits, or socio-emotional disorders as determined by a parent questionnaire and clinician reports. Moreover, children did not attend any other speech or cognitive therapy while participating in this study.

### Ethical Approval

The study protocol was approved by the Bioethics Committee of the Medical University of Warsaw (permission no. KB/162/2010). Written informed consent was obtained from the parents of all children before the study. The children provided verbal approval before each session.

### Procedure

This was a blind study in which children were randomly assigned to two training groups using the RITA® software (Pahlke et al., 2004) according to their age, gender, non-verbal IQ and level of language development: the Experimental Group (EG; n = 18; 13 boys, 5 girls) and the Control Group (CG; n = 18; 13 boys, 5 girls). These two groups did not differ significantly in terms of age, non-verbal intelligence, or level of language development. Descriptive data for EG and CG are shown in **Table 1**. EG underwent the computerized Dr. Neuronowski® intervention program (Szelag and Szymaszek, 2016), focused on the enhancement of TIP in the millisecond time range. CG used a computer program which implemented classical speech therapy extended by 16 simple computer games. Thus, the intervention in EG and CG was matched in terms of the mental load, visual appearance, motivational aspects and training protocol with the exception of TIP exercises which were only included in the experimental training.

### Study Protocol

Each participant underwent a pre-test electrophysiological assessment. Afterwards, the children underwent either experimental (EG) or control training sessions (CG; **Figure 1**). The duration of the whole intervention comprised 24 1-h sessions performed over 6 weeks (four sessions weekly). After completing the training program, a post-test was conducted using the same electrophysiological assessment as applied in the pre-test.

### Audiovisual Training Programs

Both trainings were computer-based intervention programs provided in several visually attractive exercises. To motivate and increase the children's commitment to the training, exercises were conducted with the use of tablets because of some indications that this has a positive impact on young students (Dhir et al., 2013). Each child completed the training individually and each therapy session was guided by a trained consultant (speech-therapist, psychologist, special educator, or psychology student). The training took place in a separate room at the Nencki Institute or at the Early Intervention Centre.

In both experimental and control training, correct and incorrect answers were followed by appropriate visual and auditory feedback. The exercises in both interventions were conducted using a predefined specific agenda which allocated a comparable amount of time to training particular cognitive functions.

### Experimental Training

The experimental intervention procedure was provided with the Dr. Neuronowski® computer program developed at our Institute (Szelag and Szymaszek, 2016). The software consists of 46 games grouped into nine modules for improving particular cognitive functions (e.g., attention, non-verbal auditory processing, verbal short-term memory, executive functions, receptive language and phonemic hearing). In addition, the majority of these exercises involved millisecond TIP, sequencing abilities and duration judgment. The difficulty of particular exercises was adjusted individually on the basis of the actual level of a child's performance.

The games mostly targeted the discrimination and identification of sounds, tones, syllables and words presented at a rapid rate, as well as recognizing the sequence or duration of two sounds, reproducing sequences of sounds or word strings and matching sounds and words.


CPM, Raven's Colored Progressive Matrices; TAGLS, Test for Assessment of Global Language Skills. <sup>a</sup>Two-tailed t-tests for independent samples.

### Control Training

In contrast to the experimental training, none of the control intervention exercises involved TIP. The control training included three computer programs employing classical speech therapy and 16 simple computer games involving attention, short-term and working memory and executive functions.

The games employing speech therapy targeted the identification and discrimination of syllables and words presented at a regular rate of exposition as well as reproducing sequences of words or matching sounds and words. In the computer games, the tasks were to react as fast as possible to particular objects, to remember the pairs of sounds or pictures and to complete some logic puzzles.

### Electrophysiological Assessment

#### Procedure

In the electrophysiological assessment, children were exposed to paired sinusoidal tones in a passive oddball paradigm. This means that they were asked to ignore the sounds while watching silent videos. The standard stimulus was a pair of identical tones at 1000 Hz, while the deviant stimulus was a pair of tones differing in pitch, at 1000 Hz and 1200 Hz. The duration of each tone within a pair was 50 ms (with 5 ms rise and fall times), with an intensity of 80 dB SPL, measured with a Brüel & Kjær sound level meter. The time between consecutive tone pairs was 800 ms. The schema of the passive oddball paradigm is presented in **Figure 2**.

Stimuli were delivered binaurally using Presentation Software version 14.9 (Neurobehavioral Systems Inc., Berkeley, CA, USA) via E·A·RTone® 5A insert earphones placed in the right and left ear canals. We implemented two conditions corresponding to the two inter-stimulus-intervals (ISIs) within paired-tones in standard and deviant stimuli. The ISIs were set at 50 or 200 ms creating the Short ISI Condition (Short ISI) and the Long ISI Condition (Long ISI), respectively (**Figure 2**).
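The timing of a tone pair follows from the parameters given above. The sketch below is only an illustration and not the Presentation script used in the study: it synthesizes two 50 ms sinusoidal tones (the tone duration stated in the abstract) with 5 ms ramps, separated by a silent ISI. The sampling rate (`fs=44100`), the linear ramp shape, the function names `tone` and `tone_pair`, and the reading of the ISI as the silent gap between the offset of the first tone and the onset of the second are all assumptions introduced here.

```python
import math


def tone(freq_hz, dur_ms=50.0, ramp_ms=5.0, fs=44100):
    """Return one sinusoidal tone as a list of samples in [-1, 1].

    Linear onset/offset ramps are an assumption; the text only
    specifies 5 ms rise and fall times.
    """
    n = round(fs * dur_ms / 1000.0)
    n_ramp = round(fs * ramp_ms / 1000.0)
    samples = []
    for i in range(n):
        # Gain rises linearly at the start and falls linearly at the end.
        gain = min(1.0, (i + 1) / n_ramp, (n - i) / n_ramp)
        samples.append(gain * math.sin(2.0 * math.pi * freq_hz * i / fs))
    return samples


def tone_pair(f1_hz, f2_hz, isi_ms, fs=44100):
    """Concatenate tone 1, a silent gap of isi_ms, and tone 2."""
    silence = [0.0] * round(fs * isi_ms / 1000.0)
    return tone(f1_hz, fs=fs) + silence + tone(f2_hz, fs=fs)


# Standard (1000 Hz + 1000 Hz) and deviant (1000 Hz + 1200 Hz) pairs.
standard_short = tone_pair(1000, 1000, isi_ms=50)
deviant_long = tone_pair(1000, 1200, isi_ms=200)
```

Under these assumptions, a Short ISI pair lasts 150 ms and a Long ISI pair lasts 300 ms from the onset of the first tone to the offset of the second.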

In each condition, 600 standard and 150 deviant stimuli were presented (standard to deviant ratio was 75% to 25%) within six blocks (three blocks with Long ISI and three blocks with Short ISI) in two different orders randomized between subjects. In each block, 15 standards were presented at the beginning and the distribution of deviants was quasi-randomized: a minimum of 5 and maximum of 15 consecutive standards could be presented in a row. At the end of data collection, 300 deviant stimuli (150 for each condition) in four control blocks were also presented. This gave us an opportunity to use the same stimulus method (see below for description).

#### Data Acquisition

EEG data were recorded from 32 scalp electrodes (EasyCap, Germany) with Ag/AgCl active electrodes (ActiCAP, Brain Products) placed according to the 10-20 system, using the BrainVision Recorder© v.1.10 software (Brain Products, Germany). The electrodes' contact impedances were kept below 10 kΩ. Data were referenced to the FCz electrode and a bandpass filter of 0.1–100 Hz was applied.

#### Data Analysis

Offline analysis was performed using BrainVision Analyzer® v.2.0 software (Brain Products, Germany). First, the data were down-sampled to 256 Hz and re-referenced to the TP9 and TP10 electrodes. Butterworth zero-phase filters were implemented with a 1 Hz high-pass, a 30 Hz low-pass (both 8th order) and a 50 Hz notch filter. Next, artifacts were removed using Independent Component Analysis. The data were segmented into epochs extending from 200 ms before to 1000 ms after the stimulus (standard or deviant) onset. After baseline correction (from −200 ms to 0 ms), trials exceeding ±120 µV were excluded from the analysis. Epochs were averaged for three stimulus types separately in the two conditions: standards, regular deviants and control deviants (deviants presented in control blocks).
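The segmentation, baseline correction, amplitude rejection and averaging steps can be sketched for a single channel as follows. This is a minimal pure-Python illustration and not the BrainVision Analyzer pipeline: filtering and ICA are omitted, the function names `extract_epochs` and `average` are hypothetical, and stimulus onsets are assumed to be given as sample indices into an already filtered signal in µV.

```python
def extract_epochs(signal_uv, onsets, fs=256, pre_ms=200, post_ms=1000,
                   reject_uv=120.0):
    """Cut epochs around each onset, baseline-correct them and drop
    any epoch whose absolute amplitude exceeds reject_uv."""
    n_pre = round(fs * pre_ms / 1000)    # samples before stimulus onset
    n_post = round(fs * post_ms / 1000)  # samples after stimulus onset
    kept = []
    for onset in onsets:
        start, stop = onset - n_pre, onset + n_post
        if start < 0 or stop > len(signal_uv):
            continue  # epoch would run past the recording edges
        segment = signal_uv[start:stop]
        # Baseline: mean of the pre-stimulus interval (-pre_ms to 0 ms).
        baseline = sum(segment[:n_pre]) / n_pre
        segment = [v - baseline for v in segment]
        # Amplitude criterion applied after baseline correction.
        if max(abs(v) for v in segment) > reject_uv:
            continue
        kept.append(segment)
    return kept


def average(epochs):
    """Point-by-point mean across epochs of equal length."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]
```

In practice the same routine would be run separately for standards, regular deviants and control deviants in each ISI condition, yielding one averaged waveform per stimulus type.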

To obtain difference waves dissociated from the physical properties of the stimuli, the same-stimulus method was used. In this method, the response to control deviants was subtracted from the response to deviants. The control deviants were physically identical to the (regular) deviants presented in the oddball paradigm, but were presented alone in a row in separate control blocks (**Figure 3**). As suggested in Jacobsen and Schröger (2003), a proper control block should control for both the physical properties of the stimuli and the adaptation effect. Nevertheless, in this study the numerous deviant repetitions in the control block did not fulfil the equal probability control block criteria, which might bias the reported effects. We refer to this issue in the "Discussion" section.
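The logic of this subtraction can be shown with a toy example. In the sketch below, the function name `difference_wave` and all numerical values are hypothetical and serve only to illustrate why a response component shared by physically identical stimuli cancels out, leaving the context-dependent part.

```python
def difference_wave(deviant_avg, control_avg):
    """Subtract the control-deviant average from the oddball-deviant
    average, sample by sample."""
    if len(deviant_avg) != len(control_avg):
        raise ValueError("averaged waveforms must have equal length")
    return [d - c for d, c in zip(deviant_avg, control_avg)]


# Toy waveforms in µV: an obligatory response shared by both stimulus
# contexts, plus an extra negativity present only in the oddball context.
shared_response = [0.0, 1.0, 2.0, 1.0, 0.0]
oddball_extra = [0.0, 0.0, -1.5, -0.5, 0.0]

deviant_avg = [s + e for s, e in zip(shared_response, oddball_extra)]
control_avg = shared_response

diff = difference_wave(deviant_avg, control_avg)
```

Here `diff` recovers `oddball_extra`, since the shared response is removed by the subtraction. As noted above, this holds only to the extent that the control block also matches the adaptation state of the oddball block.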

The main aim of this study was the analysis of MMN, which usually has frontocentral topography; hence the following analyses were performed only on FCz, Fz, F3, F4, FC1 and FC2 electrodes.

#### Waveform Analysis: Amplitude and Latency

The main analysis of data concerns ERPs identified on a difference wave, i.e., MMN and P3a, both before and after the interventions (experimental vs. control). Additionally, we performed explanatory analyses comparing N2 elicited by deviant and control deviant stimuli to the training-induced effects indicated by the MMN.

MMN and P3a amplitudes (or latencies) were obtained for each child. MMN was the most negative value in the time window between 195 ms and 395 ms after the first stimulus onset (i.e., 95–295 ms after the onset of deviation) for Short ISI or between 360 ms and 530 ms (110–280 ms after the onset of deviation) for Long ISI (e.g., Näätänen, 2000). The P3a component was the most positive value between 335–560 ms and 480–720 ms after the first stimulus onset for Short and Long ISI, respectively (e.g., Escera and Corral, 2007). The maximum peak amplitudes for each participant were calculated in these time windows (**Table 2**).
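Peak picking within these windows can be sketched as below. The window limits are taken from the text; the function name `peak_in_window`, the `polarity` argument and the assumption that each waveform starts 200 ms before the first tone onset and is sampled at 256 Hz are introduced here for illustration only.

```python
# Search windows in ms relative to the onset of the first tone.
WINDOWS_MS = {
    ("MMN", "short"): (195, 395),
    ("MMN", "long"): (360, 530),
    ("P3a", "short"): (335, 560),
    ("P3a", "long"): (480, 720),
}


def peak_in_window(wave_uv, t_start_ms, t_end_ms, polarity,
                   fs=256, pre_ms=200):
    """Return (amplitude_uv, latency_ms) of the most negative
    (polarity="neg") or most positive (polarity="pos") sample in the
    window. Latency is relative to the onset of the first tone."""
    if polarity not in ("neg", "pos"):
        raise ValueError("polarity must be 'neg' or 'pos'")

    def to_index(t_ms):
        # The waveform is assumed to start pre_ms before stimulus onset.
        return round((t_ms + pre_ms) * fs / 1000)

    lo, hi = to_index(t_start_ms), to_index(t_end_ms)
    window = wave_uv[lo:hi + 1]
    amplitude = min(window) if polarity == "neg" else max(window)
    index = lo + window.index(amplitude)
    latency_ms = index * 1000 / fs - pre_ms
    return amplitude, latency_ms
```

For example, `peak_in_window(diff_wave, *WINDOWS_MS[("MMN", "short")], polarity="neg")` would return the MMN peak of a Short ISI difference wave stored in a hypothetical list `diff_wave`.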

For deviants (presented in oddball) and control deviants (presented in separate blocks), P1-N2 and P1'-N2' components were identified on averaged waveforms, which differed in Short and Long ISIs (Čeponienė et al., 2005). For Short ISI, only one electrophysiological response (P1 and N2) was elicited because of the integration of two tones within a pair. P1 was the most positive value, while N2 was the most negative one in the particular time windows shown in **Table 2** for deviants and control deviants. For Long ISI, on the other hand, doubled electrophysiological responses were obtained, reflecting responses to two separate tones within a pair identified as P1-N2 and P1'-N2' (e.g., Clunies-Ross et al., 2015). P1 and N2 were elicited by the first tone within the doublet, whereas P1' and N2' were elicited by the second tone. In this study, we analyzed only N2 and N2' components (in Short and Long ISI, respectively) because they reflected the electrophysiological response to the second tone within a deviant doublet (change onset). For this second tone, we obtained the MMN by subtracting the control deviant from the deviant. For each participant, the maximum peak amplitudes and latencies for N2 and N2' were analyzed in the particular time windows (**Table 2**).

Finally, the amplitudes of particular ERPs (in µV) were determined as the maximum peak in the particular time windows with regard to the baseline. The latencies of MMN and P3a in both conditions were obtained by subtracting time intervals corresponding to the onset of stimulus change (i.e., 100 or 250 ms for Short and Long ISI, respectively).
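The peak-picking rules described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the 5 ms sampling step and the toy waveforms are assumptions made for illustration, and the waveforms are assumed to be already baseline-corrected. Only the time windows, the deviant-minus-control subtraction and the change-onset offsets (100 and 250 ms) are taken from the text.

```python
# Time windows (ms after first-stimulus onset) and change-onset offsets,
# as stated in the text.
WINDOWS = {
    ("MMN", "short"): (195, 395),
    ("MMN", "long"): (360, 530),
    ("P3a", "short"): (335, 560),
    ("P3a", "long"): (480, 720),
}
CHANGE_ONSET_MS = {"short": 100, "long": 250}


def difference_wave(deviant_uv, control_uv):
    """Deviant minus control deviant, sample by sample."""
    if len(deviant_uv) != len(control_uv):
        raise ValueError("waveforms must have the same length")
    return [d - c for d, c in zip(deviant_uv, control_uv)]


def peak(times_ms, wave_uv, component, condition):
    """Return (amplitude in uV, latency in ms relative to change onset)."""
    lo, hi = WINDOWS[(component, condition)]
    samples = [(v, t) for t, v in zip(times_ms, wave_uv) if lo <= t <= hi]
    if not samples:
        raise ValueError("no samples fall inside the time window")
    # MMN is the most negative value, P3a the most positive one.
    pick = min if component == "MMN" else max
    amp, t_peak = pick(samples, key=lambda s: s[0])
    return amp, t_peak - CHANGE_ONSET_MS[condition]


# Toy, made-up waveforms sampled every 5 ms (not data from the study).
times = list(range(0, 800, 5))
control = [0.0] * len(times)
deviant = [0.0] * len(times)
deviant[times.index(300)] = -4.0  # artificial negativity at 300 ms
deviant[times.index(450)] = 5.0   # artificial positivity at 450 ms

diff = difference_wave(deviant, control)
mmn_amp, mmn_lat = peak(times, diff, "MMN", "short")
p3a_amp, p3a_lat = peak(times, diff, "P3a", "short")
print(mmn_amp, mmn_lat)
print(p3a_amp, p3a_lat)
```

In this toy case the negativity placed at 300 ms after first-stimulus onset yields a latency of 200 ms once the 100 ms change onset of the Short ISI condition is subtracted.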

### RESULTS

### Statistical Analyses

To analyze differences in amplitudes (or latencies) of particular components in the two conditions, we conducted mixed-design analyses of variance (ANOVAs). Greenhouse-Geisser correction was applied when the sphericity assumption was violated. After ANOVAs, post hoc comparisons were performed with the Bonferroni correction for multiple comparisons.

The data were analyzed in two stages.

In Stage 1, we examined the components identified on the difference wave, i.e., MMN and P3a for amplitudes (or latencies). In these ANOVAs, the within-subject factors were Session (pre-test vs. post-test) and Condition (Short ISI vs. Long ISI), whereas the between-subject factor was Group (EG vs. CG).
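To make the logic of the Session × Group term concrete, the following Python sketch uses a standard equivalence: with two sessions and two groups, the interaction F equals the squared independent-samples t statistic computed on post-minus-pre difference scores. This is an illustrative simplification, not the analysis reported here. It assumes that each child's amplitudes have been averaged over Condition, the function name is hypothetical, and the numbers are invented toy values.

```python
from statistics import mean, variance


def session_by_group_f(pre_eg, post_eg, pre_cg, post_cg):
    """F and degrees of freedom for Session x Group via difference scores."""
    d_eg = [post - pre for pre, post in zip(pre_eg, post_eg)]
    d_cg = [post - pre for pre, post in zip(pre_cg, post_cg)]
    n1, n2 = len(d_eg), len(d_cg)
    df_error = n1 + n2 - 2
    # Pooled variance of the difference scores across the two groups.
    pooled = ((n1 - 1) * variance(d_eg) + (n2 - 1) * variance(d_cg)) / df_error
    t = (mean(d_eg) - mean(d_cg)) / (pooled * (1 / n1 + 1 / n2)) ** 0.5
    return t * t, (1, df_error)


# Invented MMN amplitudes in microvolts for four children per group.
pre_eg, post_eg = [-3.0, -4.0, -3.5, -2.5], [-5.0, -5.5, -5.5, -4.0]
pre_cg, post_cg = [-3.0, -3.5, -4.0, -2.5], [-3.5, -3.0, -4.5, -2.5]

f_value, dfs = session_by_group_f(pre_eg, post_eg, pre_cg, post_cg)
print(round(f_value, 2), dfs)
```

A large F here means that the pre-to-post change differs between groups, which is the pattern the interaction term is meant to capture.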

In Stage 2, to understand the intervention-related changes in MMN obtained in Stage 1, we examined the amplitudes (or latencies) of the N2 and N2' waveforms observed for deviants and control deviants in the two groups, considering Short and Long ISI separately. The rationale was to clarify whether the training-related MMN amplitude changes resulted from increased or decreased amplitudes (or latencies) of N2 and N2' elicited by deviants or control deviants. In the former case, the increased amplitudes would correspond to higher sensitivity of deviant detection in the oddball task, whereas in the latter case, the decreased amplitudes would result from repeated exposure. This may reflect reduced resources involved in the perception of a predictable auditory pattern. In Stage 2, the within-subject factors were Session (pre-test vs. post-test) and Block (deviants vs. control deviants), and the between-subject factor was Group (EG vs. CG).

### Stage 1—Difference Wave

Mean amplitudes for MMN and P3a are presented in **Figures 4**, **5**, **6**.

#### MMN

ANOVA conducted on amplitudes revealed significant main effects of Condition and Session modified by Session × Group interaction (see **Table 3**, **Figure 5**). The other factors and interactions were nonsignificant.

The amplitudes were higher (more negative, M = −4.41; SD = 2.25) in the Short ISI than in the Long ISI (M = −3.36; SD = 1.73), irrespective of the Session and Group. Increased MMN in post-test was observed only in EG (p < 0.001). In CG, the difference between MMN amplitudes in pre- and post-test was nonsignificant (p = 0.387). The same pattern of results was observed for Short and Long ISIs. Although in pre-test, the two groups did not differ significantly in MMN amplitudes (p = 0.674), in post-test they were higher (more negative) in EG than in CG (p = 0.031). Mean and SD for MMN amplitudes are presented in **Table 4**.

TABLE 3 | Results of analyses of variances (ANOVAs) with repeated measures: F-values, p-values and effect sizes (η²) for the mismatch negativity (MMN) (top) and P3a (bottom) mean peak amplitudes (or latencies) including Session (pre-test vs. post-test) and Condition (Short ISI vs. Long ISI) as within-subject factors and Group (experimental group, EG vs. control group, CG) as a between-subject factor.

Significant effects are marked in gray background.

ANOVA conducted on latencies revealed a tendency toward significance (p = 0.055) for Condition with shorter MMN latencies in Long ISI (M = 193.34; SD = 25.69) than in Short ISI (M = 202.92; SD = 21.57). Mean (with SD) for MMN latencies are presented in **Table 4** and **Figure 5**.

#### P3a

ANOVA conducted on amplitudes (**Table 3**) revealed a significant main effect of Session, with higher amplitudes in post- (M = 5.00; SD = 1.63) than pre-test (M = 4.37; SD = 1.27), independent of Group and Condition (p = 0.044).

In ANOVA on latencies (**Table 3**), main effects of Session and Condition were observed. Shorter P3a latencies were observed in post- (M = 343.49; SD = 41.12) than pre-test (M = 378.36; SD = 51.54; p = 0.001) and in Short ISI (M = 347.71; SD = 26.76) than in Long ISI (M = 374.13; SD = 66.48; p = 0.025) in both groups. Mean (with SD) for P3a amplitudes are presented in **Table 4** and **Figure 6**.

In summary, increased amplitudes in post-test in comparison to pre-test were observed only in EG for MMN, but in both groups for P3a. Shorter P3a latencies were observed in post- compared to pre-test and in the Short ISI than in the Long ISI Condition.

### Stage 2—N2 and N2'

### N2 (Short ISI Condition)

In ANOVA (Group × Session × Block) on amplitudes, a main effect of Block modified by the Session × Block interaction was observed (see **Table 5**). N2 amplitudes were higher in post- (M = −8.91; SD = 4.17) than in pre-test (M = −7.62; SD = 3.39) but only for deviants (p = 0.006). For control deviants, this difference was nonsignificant (p = 0.531). Moreover, only in post-test were the amplitudes for deviants (M = −8.91; SD = 4.17) higher than for control deviants (M = −7.15; SD = 2.59; p < 0.001). In pre-test, the amplitudes for deviants and control deviants did not differ significantly (p = 0.676).

The Session × Group interaction resulted from higher amplitudes in post- than pre-test in EG (p = 0.001, see **Table 5**). In CG the difference between sessions was nonsignificant (p = 0.135). Moreover, in post-test in EG, amplitudes were higher than in CG (p = 0.003). Pre-test between-groups differences were nonsignificant (p = 0.644). Mean (with SD) for N2 amplitudes are presented in **Table 6**.

In ANOVA on N2 latencies, all effects were nonsignificant (see **Table 6**).

### N2' (Long ISI Condition)

In ANOVA (Group × Session × Block) on amplitudes, the main effect of Session was significant (p = 0.005), whereas Block tended towards significance (p = 0.051). Two interactions, Session × Group and Group × Block, were observed (see **Table 5**).

The Session × Group interaction resulted from higher amplitudes in post- than in pre-test in EG (p = 0.001). In CG, these differences were nonsignificant (p = 0.517). Mean (with SD) for N2' amplitudes are presented in **Table 6**.

Although the groups did not differ significantly in amplitudes for deviants (p = 0.083) or control deviants (p = 0.617), the Group × Block interaction reflected higher amplitudes in EG in response to deviants (p = 0.005) than to control deviants. These differences in CG were nonsignificant (p = 0.904). Mean (with SD) for N2' amplitudes are presented in **Table 6**.

No significant effects in ANOVA on N2' latencies were observed (see **Table 5**).

In summary, enhanced N2 amplitudes post-test compared to pre-test were observed for deviants (presented in oddball), indicating increased deviant detection sensitivity. These differences were nonsignificant for control deviants. In EG, enhanced N2 and N2' amplitudes for both deviants and control deviants were found in post-test. These amplitudes were higher in response to deviants than to control deviants.

### DISCUSSION

This study showed important training-related influences post-test as compared to pre-test: (1) in EG, increased MMN in post-test was accompanied by enhanced N2 and N2' amplitudes for deviants (in oddball); (2) in both groups, higher P3a amplitudes and shorter latencies were observed in post-test. Additionally, in Short ISI, generally higher MMN amplitudes were observed irrespective of training effect.

The applied MMN paradigm allowed the study of the "genuine" timing, minimizing the influence of other cognitive processes which are highly involved in any behavioral task. It is known that MMN is generated automatically, even without overt attention being paid to the presented stimuli, and it reflects the brain's pre-attentive ability to detect any violation of the regularity of auditory stimulation (Näätänen and Picton, 1987; Winkler, 2007). Due to its low cognitive demands (such as attention, decision making, executive function, etc.) it is considered a reliable and objective measure in children. In our study, the use of MMN allowed the verification of whether temporal training caused changes in electrophysiological responses as compared to the control non-temporal training.

Below, we discuss the observed relationships in terms of training-induced effects and general rapid auditory processing.

## Training-Related Changes in Rapid Auditory Processing

#### MMN

After training, increased MMN amplitudes in both conditions were obtained only in EG (**Figure 4**). This result is congruent with data from other studies indicating increased MMN amplitudes after temporal-auditory interventions (Kujala et al., 2001; Heim et al., 2016). For example, Heim et al. (2016) reported normalized ERP amplitudes and latencies of MMN in children with language-learning impairment after the administration of FFW (see "Introduction" section). Nevertheless, they also found enhanced MMN post-test in healthy controls, probably due to task re-exposure. In our study, by contrast, increased MMN was found only in EG, which argues against an effect of repeated measurement and suggests, rather, that the changes were training-related. Moreover, Kujala et al. (2001) found increased MMN amplitude in response to paired tones following 14 sessions of auditory discrimination training in children with dyslexia. They reported a correlation between increased MMN amplitude and improved reading skills. In contrast, in our study we did not find any statistically significant correlations between enhanced MMN and improvement in behavioral measures reported in our previous article (Szelag et al., 2015). This lack of correlation in young children may be caused by the high variability of behavioral indicators which are highly cognitively demanding.

TABLE 5 | Results of ANOVAs with repeated measures: F-values, p-values and effect sizes (η²) for the N2 (left) and N2' (right) mean peak amplitudes (or latencies) including Session (pre-test vs. post-test) and Block (Deviants vs. Control deviants) as within-subject factors and Group (EG vs. CG) as a between-subject factor.

Significant effects are marked in gray background.

TABLE 6 | N2 (left) and N2' (right) mean peak amplitudes in µV (with SD) and latencies in ms in EG and CG, in pre-test and post-test and in Short and Long inter-stimulus interval (ISI) conditions.

In our study, the training-related improvements indexed by MMN amplitudes in EG were accompanied post-test by enhanced N2 and N2' amplitudes for the deviants, but not for the control deviants (see "Results" section, Stage 2). This may suggest increased temporal acuity after temporal training and better detection of the regular deviants. These relationships were not observed in CG.

At this point, some conclusions can be drawn about the potential impact of the control block design on the obtained results. As mentioned before, our control block allows the stimuli's physical properties to be controlled but does not account for the adaptation effect. It should be acknowledged that the reported MMN amplitudes could be artificially elevated because of habituation due to repetition of the control deviants. As significant differences between sessions for control deviant amplitudes were not found, one may expect that the adaptation effect modified both the pre- and post-test measurements in a similar way. Therefore, we expect that our control block design only weakly influenced the training-related changes, which were the main focus of our study. However, we recommend that future studies design the control block more carefully, as suggested by, e.g., Jacobsen and Schröger (2003).

The effects of our temporal training may be important for improving the rapid auditory processing of nonverbal information. There are convincing indications that such processing is the neural basis of language and speech (see "Introduction" section). The training-induced improvements in TIP may result in a transfer of the improvement from the time domain to the language domain in which timing is incorporated. Thus, improving timing in nonverbal information may facilitate speech functions. This seems to be crucial for future clinical applications of temporal training procedures in speech therapy in subjects with language disorders (aphasia or SLI; Szelag et al., 2014, 2015; Szymaszek et al., 2017).

According to previous studies, other auditory trainings for children with language disorders (e.g., FFW, Earobics, phonological interventions) may induce changes on different levels of auditory processing. For example, Pihko et al. (2007), in a MEG study, observed a stronger response to syllables, as reflected in the enhancement of both the P1 component and MMN, corresponding to speech improvement indexed by behavioral measurements in bilingual SLI children. Training-related changes were also observed in ABRs (auditory brainstem responses) in children with SLI or CAPD (central auditory processing disorders; Filippini et al., 2012), indicating that auditory processing may be enhanced at very early (subcortical) stages.

The transfer between improved auditory processing and language processing after auditory interventions (musical trainings) may be associated with enhanced sensitivity to acoustic features creating a base for speech improvement (Besson et al., 2011).

#### P3a

Some training-related changes were also observed in both groups in the P3a component, which reflects involuntary attention shifting (Escera and Corral, 2007). After both interventions, enhanced amplitudes and shortened P3a latencies were found, irrespective of condition. Our results are similar to those obtained by Lovio et al. (2012), who reported P3a enhancement after both the experimental (phonological awareness) and control (math games) interventions, whereas an MMN increment was observed only following the experimental intervention. This pattern of training-related improvements is similar to that found in the present study, i.e., enhanced MMN in EG and increased P3a in both groups.

This training-related P3a increment in our study in both groups may be caused by some features common to the two interventions. These are: (1) similar attentional engagement; (2) comparable mental load; and (3) involvement of auditory discrimination (compare the "Materials and Methods" section). Thus, after both kinds of intense training, involuntary shifting of attention to the presented auditory stimuli might be enhanced, independent of the condition. Moreover, some researchers have claimed that language improvement induced by audiovisual trainings in children with language disorders may be moderated by attentional processes (Cohen et al., 2005; McArthur et al., 2010). Stevens et al. (2008) observed enhanced ERPs associated with selective attention in children with SLI after FFW administration, which also resulted in receptive language improvement. Some support for this observation also comes from children with dyslexia showing diminished P300 amplitudes as well as better behavioral performance in a lexical decision task after audiovisual training (Jucla et al., 2010). The interpretation is that after such training, attentional resources are more efficiently allocated during word recognition.

Regardless of the training approach taken, there is room to create new intervention tools for children with language learning impairments.

### Two Temporal Mechanisms of Auditory Perception in SLI

Looking at the waveforms displayed in **Figure 4** for deviants (red lines) and control deviants (blue lines), we can assume that two mechanisms of auditory processing were active depending on the duration of the ISI separating the two stimuli presented in rapid succession. This may be reflected in two different patterns of neural responses. For Short ISI (paired tones separated by an ISI of 50 ms), one electrophysiological response was elicited and indicated by a complex P1-N2 response which is dominant in children (Ervast et al., 2015). This reflects a situation in which the response to the second tone within a pair began before the termination of the response to the first tone, which is reported in studies using paired stimuli separated by a relatively short ISI (**Figure 4**). Evidence from previous literature has suggested the existence of a 200 ms long TIW following the onset of the first sound, in which a unitary sound representation is formed (Näätänen and Winkler, 1999; Horváth et al., 2007). In contrast, for longer ISIs (above 200 ms) two distinct neural responses were obtained: P1-N2 followed by P1'-N2' for the first and second tone within a pair, respectively. The electrophysiological response to the second tone seemed to appear after termination of the response to the first tone. Similarly, two separated MMN potentials were elicited when the stimulus onset asynchrony exceeded the duration of the TIW (Long ISI Condition; **Figure 4**).

Such unitary sound representation has been reported in previous studies. For example, Wang et al. (2005) and Fox et al. (2010) indicated the integration of two paired tones presented with relatively short ISIs. Fox et al. (2010) obtained two distinct neural responses in children aged 7–9 when paired tones were separated by a 200 ms ISI. For ISIs of 25, 50 or 100 ms, only one neural response was elicited. Furthermore, Wang et al. (2005), using the double deviant method in children aged 5–8, reported two separate MMN responses elicited by two kinds of deviant presented in a row differing either in pitch or intensity, but only when the separating ISI was 250 ms. At shorter ISIs (i.e., 50, 100, or 200 ms) they found only one MMN, indicating an integrated neural response to two different kinds of deviants. In summary, the above research on temporal integration focused on an approximately 200 ms border interval known as the TIW. This research has clearly indicated the elicitation of one neural response when such an interval is short or near the TIW limit, whereas, two distinct neural responses are observed when the time window following the onset of a sound exceeds the TIW.
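The timing arithmetic behind this distinction can be checked with a short Python sketch for the two conditions of the present study. The 50 ms tone duration is an assumption inferred from the reported pattern durations (150 and 300 ms) and change onsets (100 and 250 ms). The simple threshold rule is an illustrative simplification of the TIW account, not a model fitted by the authors, and it is not meant to reproduce the results of the other studies cited above.

```python
TIW_MS = 200   # approximate temporal integration window discussed in the text
TONE_MS = 50   # assumption: inferred from the 150 ms and 300 ms pattern durations


def pair_timing(isi_ms, tone_ms=TONE_MS):
    """Return (second-tone onset, total pattern duration) in ms."""
    soa = tone_ms + isi_ms        # onset of the second tone re first-tone onset
    total = 2 * tone_ms + isi_ms  # first tone + gap + second tone
    return soa, total


def expected_responses(isi_ms):
    """1 if the second tone starts inside the TIW, otherwise 2."""
    soa, _ = pair_timing(isi_ms)
    return 1 if soa <= TIW_MS else 2


for label, isi in (("Short ISI", 50), ("Long ISI", 200)):
    soa, total = pair_timing(isi)
    print(label, soa, total, expected_responses(isi))
```

Under these assumptions the second tone begins 100 ms after the first in the Short ISI condition (inside the window) and 250 ms after it in the Long ISI condition (outside the window).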

The important observation is that the children with SLI studied here presented the same pattern of responses as their normally developing peers (reported in previous studies), i.e., one integrated MMN response at shorter ISIs and two separate responses at longer ISIs. We may speculate that despite the TIP deficits in children with SLI evidenced in behavioral methods, the neural basis of TIP seems preserved. As reported in our previous studies (Szelag et al., 2015), children with SLI displayed deteriorated perception of the temporal order of two sounds presented in rapid succession, i.e., they displayed higher temporal order thresholds (values of approx. 200 ms) than their typically developing peers (approx. 100 ms). Our results are promising as they show that the neural mechanism of TIP evidenced in the electrophysiological method reported here is intact. One might expect that this preserved neural base may constitute the framework for the efficacy of TIP training, resulting in improved language skills.

### Short vs. Long ISI Detection Reflected in MMN

Higher MMN was observed in Short ISI as compared to Long ISI (**Figure 4**). As higher amplitudes are usually observed in easier tasks than in more difficult ones (e.g., Näätänen, 2000), one may expect that for children with SLI, processing in Short ISI was easier than in Long ISI. This was confirmed by greater MMN amplitudes in the former case, irrespective of Session and Group (see "Results" section, Stage 1, **Figure 4**). It seems that processing within the TIW (Short ISI) fosters easier detection of violation of regularities. Conversely, if the stimulus-onset asynchrony exceeds the TIW, a smaller MMN (or even no MMN) occurs. The Long ISI Condition required a longer memory trace, which made the detection of violation more difficult.

According to Näätänen (2000), MMN occurs when the memory trace of the standard is still active during deviant presentation; thus, MMN may depend on sensory memory persistence. The overall duration of the paired-stimulus pattern in Short ISI (150 ms) was half of that in Long ISI (300 ms), resulting in greater sensory memory load in the latter case and creating a more difficult perceptual situation. On the other hand, according to Winkler (2007), increased MMN amplitude in response to the paired-tone patterns may indicate the improvement of temporal grouping of incoming sounds, which seems crucial in the process of auditory object formation.

We emphasized the advantage of the MMN paradigm in testing TIP in children as it does not require the participant to actively perform the task or even to attend to the stimuli. In this respect, this paradigm has a large advantage over behavioral studies. It is worth mentioning that results of electrophysiological and behavioral measurements examining the same phenomena cannot be compared directly. Considering the behavioral data previously reported in children with SLI (Szelag et al., 2015), an opposite relation of detection may emerge—i.e., shorter ISIs may make the temporal ordering more difficult than long ISIs. Szelag et al. (2015) indicated that thresholds of temporal order for two sounds required an ISI of approx. 200 ms, whereas at shorter ISIs it was reported at a chance level. This suggests that longer ISIs created an easier perceptual situation than shorter ones. Such divergence between the electrophysiological results presented here and the behavioral data published previously may be due to the different mechanisms implemented in the sequencing of the auditory stimuli presented with the various ISIs, i.e., perceptual auditory streaming or identification of separate sounds.

### CONCLUSION

In summary, using ERP we have demonstrated that temporal training may enhance the detection of changes in rapid auditory streams in children with SLI. Improved temporal resolution in two temporal windows was measured in terms of increased MMN. Our study also indicated that intense cognitive training (both temporal and non-temporal) in children with SLI may ameliorate involuntary attention shifting as reflected by increased P3a amplitudes.

Results indicate that, not only in normal children (studied by, e.g., Fox et al., 2010) but also in children with SLI, auditory perception of sequences of rapidly changing stimuli depends on the duration of the ISI separating the presented stimuli. At a relatively short ISI (50 ms), one electrophysiological response was elicited, whereas at a long ISI (200 ms) two separate electrophysiological responses occurred.

### AUTHOR CONTRIBUTIONS

AD: subject recruitment, data acquisition, conducting therapy sessions, analysis and interpretation of data, contribution to manuscript writing. AS: data acquisition, conducting therapy sessions, analysis and interpretation of data, manuscript writing. KN: conducting therapy sessions, analysis and interpretation of data. ES: conceptualization and study design, analysis and interpretation of data, manuscript writing. All authors: final approval.

### FUNDING

This research was supported by grant INNOTECH-K1/IN1/30/159041/NCBR/12 from The National Centre for Research and Development (Narodowe Centrum Badań i Rozwoju) and Statutable to the Nencki Institute.

### ACKNOWLEDGMENTS

The authors would like to thank the neuropsychologists and speech-language pathologists who recruited children with SLI for the study. We thank Elżbieta Chruścicka and Monika Kastory-Bronowska from the Early Intervention Centre in Warsaw for their assistance in the recruitment of children. The authors would like to thank Anna Bombinska for her technical assistance during data collection.

### REFERENCES


with acoustically modified speech. Science 271, 81–84. doi: 10.1126/science.271.5245.81


**Conflict of Interest Statement**: ES and AS are the creators of the Dr. Neuronowski® software package, realized as part of a project at the Nencki Institute with funding from the National Centre for Research and Development in Poland. The rights to the software lie with the Nencki Institute, which has an agreement with Harpo Ltd., the company commercializing this software. ES and AS are not the owners of this technology nor do they have a direct financial arrangement with Harpo Ltd. The authors state that this does not affect the scientific validity of the results.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Dacewicz, Szymaszek, Nowak and Szelag. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Temporal Processing of Joyful and Disgusting Food Pictures by Women With an Eating Disorder

#### Caroline Gagnon\*, Catherine Bégin, Vincent Laflamme and Simon Grondin

École de Psychologie, Université Laval, Québec, QC, Canada

The present study used the presentation of food pictures and judgements about their duration to assess the emotions elicited by food in women suffering from an eating disorder (ED). Twenty-three women diagnosed with an ED, namely anorexia (AN) or bulimia nervosa (BN), and 23 healthy controls (HC) completed a temporal bisection task and a duration discrimination task. Intervals were marked with emotionally pre-rated pictures of joyful and disgusting food, and pictures of neutral objects. The results showed that, in the bisection task, AN women overestimated the duration of food pictures in comparison to neutral ones. Also, compared to participants with BN, they perceived the duration of joyful food pictures as longer, and tended to overestimate the duration of the disgusting ones. These effects on perceived duration suggest that AN women experienced an intense reaction of fear when they were confronted with food pictures. More precisely, by elevating the arousal level and activating the defensive system, food pictures seemed to have sped up the rhythm of the AN participants' internal clock, which led to an overestimation of the images' duration. In addition, the results revealed that, in both tasks, ED women presented a lower temporal sensitivity than HC, which was related to their ED symptomatology (i.e., BMI, restraint and concern) and, particularly, to their weaker cognitive abilities in terms of attention, processing speed and working memory. Considered all together, the findings of the present experiment highlight the role of fear and anxiety in the manifestations of AN and point out the importance of considering non-temporal factors in the interpretation of time perception performance.

#### Edited by:

Mark A. Williams, Macquarie University, Australia

#### Reviewed by:

Fuat Balcı, Koç University, Turkey

Joseph Glicksohn, Bar-Ilan University, Israel

#### \*Correspondence:

Caroline Gagnon caroline.gagnon.7@ulaval.ca

Received: 30 November 2017; Accepted: 19 March 2018; Published: 06 April 2018

#### Citation:

Gagnon C, Bégin C, Laflamme V and Grondin S (2018) Temporal Processing of Joyful and Disgusting Food Pictures by Women With an Eating Disorder. Front. Hum. Neurosci. 12:129. doi: 10.3389/fnhum.2018.00129

Keywords: time perception, emotion, eating disorders, food, bisection, discrimination

## INTRODUCTION

Time perception refers to the subjective experience of time, which indicates how a person interprets the duration of an event. The sense of time is modulated by intrinsic (e.g., age, gender, menstrual cycle; Block et al., 2000; Morita et al., 2005; Glicksohn and Hadad, 2012; Pütz et al., 2012; Ferreira et al., 2016) and extrinsic contexts (e.g., medication, drugs, rhythmical stimuli; Droit-Volet et al., 2010; Lake and Meck, 2013; Shahabifar and Movahedinia, 2016). Among the intrinsic factors, emotions are particularly important. As suggested by the saying "time flies when we are having fun," time perception is intimately connected to emotional life (Droit-Volet and Gil, 2009).

Viewed from a pacemaker-accumulator (internal-clock) interpretation of time processing (Gibbon, 1977; Gibbon et al., 1984), emotions can affect time perception through two of the latter's components: the pacemaker or the attentional mechanisms. Several studies have shown that an increase of the arousal level generates a relative lengthening effect of perceived duration (e.g., Wearden and Penton-Voak, 1995; Droit-Volet and Wearden, 2002; MacDonald and Meck, 2005). More precisely, increased arousal speeds up the rate at which the pacemaker emits pulses. If more pulses are amassed in the accumulator, then time seems longer. Therefore, by producing an elevation of arousal, an emotional event should make the internal clock run faster, leading to an overestimation of stimulus duration (Gil and Droit-Volet, 2012). For instance, Droit-Volet et al. (2004) showed that the presentation durations of emotional faces depicting anger, sadness or happiness were perceived as longer than those of neutral faces. In parallel, the number of pulses collected would be under the control of attention, with more attention allowing a larger accumulation (Block and Zakay, 1996; Zakay and Block, 1996, 1997). Thus, assuming that there is a limited pool of attentional resources, being distracted from the passage of time leads to a lower number of pulses reaching the accumulator and to a shortening effect (i.e., durations are underestimated; for review see Lejeune, 1998; Brown, 2008). For example, Gil and Droit-Volet (2011a) showed that the presentation durations of ashamed faces were underestimated in comparison to those of neutral ones (see also Grondin et al., 2015). The feeling of shame could direct the focus of attention on a person's thoughts about the self or on causes of shame (i.e., self-awareness, reflexive activity; Lewis, 1971), and thus divert attentional resources from the events to be timed.
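The two routes described above (a faster pacemaker versus an attentional loss of pulses) can be contrasted in a deliberately simplified Python sketch. It is a toy, deterministic reading of the pacemaker-accumulator idea, not a model taken from the cited studies. The function names, the pulse rates (10 and 12 Hz) and the attention value (0.8) are hypothetical values chosen only for illustration.

```python
def accumulated_pulses(duration_s, pacemaker_hz, attention=1.0):
    """Pulses reaching the accumulator during one interval.

    attention is a hypothetical share of pulses (0 to 1) that passes the
    attentional gate; 1.0 means full attention to the passage of time.
    """
    if duration_s < 0 or pacemaker_hz < 0:
        raise ValueError("duration and rate must be non-negative")
    if not 0.0 <= attention <= 1.0:
        raise ValueError("attention must lie between 0 and 1")
    return duration_s * pacemaker_hz * attention


def perceived_duration(duration_s, pacemaker_hz, attention=1.0, reference_hz=10.0):
    """Read the pulse count back as seconds using a baseline pacemaker rate."""
    return accumulated_pulses(duration_s, pacemaker_hz, attention) / reference_hz


# The same 2 s stimulus under three hypothetical states.
neutral = perceived_duration(2.0, pacemaker_hz=10.0)
aroused = perceived_duration(2.0, pacemaker_hz=12.0)                    # faster pacemaker
distracted = perceived_duration(2.0, pacemaker_hz=10.0, attention=0.8)  # pulses lost
print(neutral, aroused, distracted)
```

In this sketch a higher pulse rate lengthens the judged duration, whereas reduced attention shortens it, which mirrors the lengthening and shortening effects described in the text.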

Interestingly, Angrilli et al. (1997) proposed an integrative view of the influence of emotions on time perception. They reported that perceived duration depends on an interaction between arousal and attentional processes. Precisely, in low-arousing conditions, the duration of negative stimuli is underestimated, whereas the duration of positive stimuli is overestimated. In high-arousing conditions, opposite results are found: the duration of negative stimuli is overestimated, whereas the duration of positive stimuli is underestimated. In consequence, according to Angrilli et al. (1997), two causal forces for the effect of emotions on perceived duration seem to coexist: a controlled-attentional mechanism for low-arousing conditions (i.e., negative pictures distract attentional resources from the passage of time more than positive ones), and an automatic emotion-driven mechanism for high-arousing conditions (i.e., negative stimuli accelerate the rhythm of the pacemaker more than positive ones). Based on the work of LeDoux (1995) on fear processing, the authors explained that negative and high-arousing stimuli, by speeding up the rhythm of the internal clock, activate the defensive system by quickly and automatically processing dangerous cues and preparing the response programs associated with fear behaviors. In other words, the acceleration of the pacemaker in the presence of an intense stress or a threatening situation could have a motivational-survival function (Bradley et al., 2001).

Over the years, a variety of stimuli have been used to elicit emotions in time perception studies: facial expressions (Effron et al., 2006; Gil et al., 2007; Gan et al., 2009; Tipples et al., 2015; Mioni et al., 2016a), music pieces (Bisson et al., 2009; Droit-Volet et al., 2010, 2013), video sequences (Droit-Volet et al., 2011; Grondin et al., 2014a), sounds (Noulhiane et al., 2007; Mioni et al., 2017), words (Zhang et al., 2017), images of mutilated bodies (Grondin et al., 2014b), phobic objects (Watts and Sharrock, 1984; Buetti and Lleras, 2012; Tipples, 2015) and real-life scenes (Lambrechts et al., 2011). Gil et al. (2009) first used food pictures to test the effect of disgust and pleasure on time perception. With a temporal bisection task, they found that the duration of disgusting and pleasant food pictures was underestimated by healthy participants in comparison with a neutral stimulus, and that this shortening effect was more marked for the disgusting pictures than for the liked ones. They attributed their results to an attentional distraction from the passage of time caused by food images, especially the disliked ones.

Food is a particularly sensitive type of stimulus for people suffering from an eating disorder (ED) such as anorexia nervosa (AN) or bulimia nervosa (BN). In fact, when presented with food, women with an ED experience less pleasure and strong negative emotions like fear, disgust and lack of control (Gorini et al., 2010; Giel et al., 2011; Vocks et al., 2011). Most experiments that examine reactions toward food use pictures of food and ask women with an ED to rate their emotions on different dimensions (e.g., valence, arousal, pleasure, disgust, and fear) whilst viewing them (Rodríguez et al., 2007; Hay and Katsikitis, 2014), or to express their affects in an interview with an examiner (McNamara et al., 2008a). Although very informative, studies relying on self-reported measures are prone to biases. First, social desirability and reluctance to provide information are among the most important biases. In that sense, ED participants can soften, hide or alter the intensity of their true affects toward food so as not to seem too disturbed or ill and, consequently, to avoid the initiation or the prolongation of a treatment (Allison and Heshka, 1993; Nordbø et al., 2012; Brown et al., 2016). Second, high levels of alexithymia are found in individuals suffering from an ED (e.g., Schmidt et al., 1993; Gilboa-Schechtman et al., 2006; Nowakowski et al., 2013). Alexithymia is defined by difficulties identifying feelings and differentiating them from bodily sensations, difficulties describing and expressing emotions (i.e., lower emotional awareness), a lack of fantasy and a concrete cognitive style focused on the external environment (Sifneos, 1973; Taylor et al., 1991). Thus, the emotional responses that women with an ED give to food stimuli could be imprecise, unfelt or wrong.
In the same vein, people suffering from AN or BN could report what they "cognitively think" of the stimuli instead of what they really "emotionally feel," a phenomenon referred to as the "cognitive-affective" division (Jenkins and O'Connor, 2012). Again, this phenomenon can reduce the accuracy of women's emotional responses toward food. To circumvent these kinds of biases, indirect measures of emotions toward food are needed. Psychophysiological techniques (e.g., skin conductance, heart rate, startle reflex, cortisol level, electroencephalographic recordings) are an option. However, because these procedures are somewhat invasive and unpleasant, particularly for women with an ED who are biologically monitored on a recurring basis, the development of a behavioral measure of emotions toward food would be useful.

Considering that emotions influence timing and that time distortions (over- or underestimation of durations) give information about how the brain detects and interprets reality in terms of valence and arousal (Angrilli et al., 1997; Teixeira et al., 2013), time perception appears to be a sensitive way to explore emotional effects. In addition, if time perception tasks use food pictures as stimuli, then these tasks may become an innovative way of measuring and understanding emotions elicited in people with AN and BN. Therefore, the main objective of the present experiment was to assess the emotional impact provoked by food pictures in women suffering from an ED by using a temporal perception perspective, which makes it possible to bypass the limits associated with traditional self-reported procedures. Because timing distortions caused by emotions seem to depend on the nature of the task used (Gil and Droit-Volet, 2011b), and because temporal processes present complexities that can be highlighted by specific paradigms (Baudouin et al., 2006; Mioni et al., 2013a,b, 2014; Ogden et al., 2014), two types of temporal tasks were selected for this investigation: a bisection task and a duration discrimination task. According to the literature, the bisection task appears to be the most used method to study the effects of emotions on timing. Typically, this paradigm involves durations ranging from some milliseconds to seconds. The discrimination task, for its part, is a more classical method for investigating the mechanisms involved in the processing of short intervals (Grondin, 2008, 2010).

As ED are multifaceted psychiatric disorders, investigating emotions evoked by food stimuli in terms of time distortions requires supplementary considerations. One of those is the presence of affective comorbidities in women with ED, which can influence time perception. In fact, prevalence of depressive and anxiety symptoms is high among people suffering from AN and BN (Swinbourne et al., 2012; Aspen et al., 2014; Fakra et al., 2014; Godart et al., 2015; Meng and D'Arcy, 2015). In addition, some studies showed that mood disturbances notably modulate time perception (for review, see Droit-Volet, 2013; Teixeira et al., 2013). Among them, Tipples (2008) found that negative emotionality was positively correlated with temporal bias due to angry and fearful expressions, and Mioni et al. (2016b) demonstrated that depressed patients over-produced durations, whereas anxious patients under-reproduced temporal intervals. Thus, in an experiment using temporal tasks with an ED population, the prior affective state of participants appears to be an important factor to take into account. Moreover, several studies have shown that women suffering from ED (or at higher risk) present cognitive difficulties (e.g., Lena et al., 2004; Steinglass and Glasofer, 2011; Jáuregui-Lobera, 2013; Weider et al., 2015; Naor-Ziv and Glicksohn, 2016) and, besides, it is well known that time perception relies on various cognitive processes (Perbal et al., 2002; Pouthas and Perbal, 2004; Zélanti and Droit-Volet, 2012). Even more important, the cognitive abilities most often impaired in ED are those also involved in time processing, that is, attention, processing speed, working memory, inhibition and switching (for ED, see Kemps et al., 2006; Rosval et al., 2006; Roberts et al., 2007; and for timing, see Zélanti and Droit-Volet, 2011; Mioni et al., 2012, 2013a,b; Pütz et al., 2012; Brown et al., 2013; Ogden et al., 2014; Droit-Volet et al., 2015).
Consequently, for a better understanding of the performance of women with an ED in temporal tasks, it is crucial to document their cognitive abilities. Finally, because hunger modulates affective (valence, arousal), cerebral (activation, orientation of attention) and psychophysiological (salivation, heart rate) responses to food stimuli (Spence et al., 2016), can reduce cognitive effectiveness (Doniger et al., 2006; Benau et al., 2014) and is a key concept in ED symptomatology (Brown et al., 2010; Haedt-Matt and Keel, 2011), it appears essential to also obtain information about the participants' level of appetite during the experiment. Keeping in mind all these considerations about the influence of prior affective state, cognitive abilities and appetite level on emotions and time processing, the second aim of the study was to identify factors that could help explain the differences between participants' performance on temporal tasks.

Overall, in accordance with the view of Angrilli et al. (1997) on the interaction between valence and arousal, we predicted that, for women suffering from an ED, food pictures would lead to a general overestimation of durations following an intense reaction of fear and an activation of the defensive system. Conversely, as found by Gil et al. (2009), we posited that for participants without ED, the duration of food pictures, especially the disgusting ones, would be underestimated due to a diversion of attentional resources from the passage of time. For this population, food pictures would not cause a strong reaction of fear and would not increase the arousal level. Finally, we anticipated that the performance of women with an ED on temporal tasks would be related to their clinical characteristics (i.e., BMI, ED symptomatology and affective state), hunger level and cognitive abilities (e.g., Tipples, 2008; Droit-Volet, 2013; Teixeira et al., 2013; Mioni et al., 2016b).

### MATERIALS AND METHODS

### Participants

Twenty-three women suffering from an ED (i.e., ED group) and 23 healthy control women (i.e., HC group) took part in the study. Participants were recruited among students and employees of Université Laval by electronic advertisements. For both groups, the inclusion criteria were the following: participants had to (a) be aged between 18 and 60 years; (b) be of French-Canadian origin; (c) be right-handed; (d) display normal or corrected-to-normal vision and audition; (e) have been free of drug and alcohol abuse for 3 months; (f) not present a psychotic disorder; (g) not present a neurological disorder; and (h) show no history of traumatic brain injury within the last 5 years, or past head trauma associated with permanent cognitive impairments. For the HC group, additional criteria were used: (a) absence of personal or family ED antecedents; (b) no attempt to lose weight in the last month; and (c) absence of psychiatric disorder for which a medication was prescribed.

In accordance with the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5; American Psychiatric Association, 2013), the ED group was composed of 10 women with AN (5 with restrictive subtype, 5 with binge eating/purging subtype) and 13 women with BN. Participants received their ED diagnosis either from a health professional (i.e., psychiatrist, family doctor) a few weeks before the experiment started, and still showed active symptoms during the study (n = 6), or from a doctoral-level psychologist (CG), whose diagnosis was then validated by a clinical researcher specialized in the evaluation and treatment of ED (CB; n = 17). All women suffering from an ED, if not yet engaged in a therapeutic process, were referred to clinical resources for support.

The participants of both groups gave informed written consent in accordance with the Declaration of Helsinki. They received a monetary compensation of CAN \$45 for their participation in the study. The experiment was approved by the Ethics Committee of the CHU de Québec – Université Laval (Project #2012-812, C11-08-088).

### Apparatus

For the temporal tasks, women were seated in front of a 16-inch CRT monitor connected to a PC, at a viewing distance of 60 cm. The room was dimly lit. The stimuli were presented in color, in a 700 × 526 pixel format, at the center of the screen, on a black background. The answers were collected with a keypad. The software E-Prime 2.0 Professional (Psychology Software Tools, Pittsburgh, PA, United States, Released 2012) was used to create and administer the tasks. Except for the Conners Continuous Performance Test – Second edition (CPT-II, Conners, 2000), which was performed on a laptop, the neuropsychological tests were completed in a paper-and-pencil format. Statistical data analyses were performed with SPSS 24.0 for Windows (IBM Corporation, Released 2016) and the free software R 3.3.3 (R Core Team, Released 2017).

### Time Perception Tasks Stimuli

Two types of pictures were used in the temporal tasks: food pictures and object pictures. The food pictures elicited joy or disgust, and the object pictures were emotionally neutral. The stimuli were chosen based on data from a previous study (Gagnon et al., 2018), which aimed to identify, among women with ED, emotional responses to food pictures. Briefly, in that experiment, two groups of women, one with ED and one of HC, were asked to rate, on 9-point Self-Assessment Manikin (SAM; Lang, 1980; Bradley and Lang, 1994) or Likert scales (ranging from 1 to 9), 46 food and 12 object pictures on nine dimensions (i.e., valence and arousal on SAM scales; joy, sadness, anger, disgust, fear, surprise and neutrality on Likert scales). From this collection of pictures, the five food pictures that generated the highest level of joy, the five food pictures that evoked the highest level of disgust, and the five object pictures that were the most neutral (on Likert scales of joy, disgust and neutrality, respectively) were chosen for the present study. The top-5 pictures also needed to evoke, for the joyful ones, a positive valence (>5 value), for the disgusting ones, a negative valence (<5 value), and for the neutral object pictures, an intermediate valence (value between 4 and 6). Because the top-5 food images that provoked joy in women with ED were not all the same as those for HC, some joyful pictures used for the two groups were different. In fact, as the main purpose of the pictures was to elicit strong emotions, their selection was based on the intensity of their targeted affect rather than on the similarity of their content. Conversely, the top-5 disgusting food and the top-5 neutral object pictures were identical for both groups, so the temporal tasks used the same pictures for women with ED and HC. Ultimately, the selected joyful food pictures for the ED group were: strawberries, pieces of pineapple, red grapes, mixed salad leaves and a crepe.
Those for HC were: strawberries, pieces of pineapple, a slice of sugar pie, squares of chocolate bar and a piece of chocolate cake. For both groups, the selected disgusting food pictures were: a black blood sausage, winkles, pieces of chitterlings sausage, a black radish and pieces of kidney in sauce. Finally, the neutral objects for all women were: a snap hook, a lamp, a wall socket, pincers and a screw. **Table 1** shows the values of the selected pictures on the emotional dimensions relevant to the present study, and **Figure 1** shows examples of those pictures.
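The selection rule described above (keep the five pictures with the highest rating on the targeted scale, subject to a valence constraint) can be sketched as follows. This is a hypothetical illustration: the `Rating` structure, the function names, the decision to filter on valence before ranking, and the inclusive reading of "between 4 and 6" are assumptions, and no rating data from Gagnon et al. (2018) are reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    name: str
    target: float   # mean rating on the targeted scale (joy, disgust or neutrality), 1..9
    valence: float  # mean SAM valence, 1..9


def is_positive(valence):
    return valence > 5


def is_negative(valence):
    return valence < 5


def is_intermediate(valence):
    # Assumption: the 4-to-6 range is treated as inclusive.
    return 4 <= valence <= 6


def top_pictures(ratings, valence_ok, k=5):
    """Return the k pictures with the highest targeted rating among those
    that satisfy the valence constraint (assumed order: filter, then rank)."""
    eligible = [r for r in ratings if valence_ok(r.valence)]
    return sorted(eligible, key=lambda r: r.target, reverse=True)[:k]


# Hypothetical ratings, for illustration only.
example = [
    Rating("a", 8.0, 7.5), Rating("b", 7.5, 4.0), Rating("c", 7.0, 6.0),
    Rating("d", 6.5, 8.0), Rating("e", 6.0, 5.5), Rating("f", 5.5, 6.5),
    Rating("g", 5.0, 7.0),
]
joyful = [r.name for r in top_pictures(example, is_positive)]
```

In this toy data set, picture "b" has a high targeted rating but is excluded because its valence is not positive, so the next-ranked eligible picture takes its place.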

All joyful food pictures came from the Web Food Frequency Questionnaire (Web-FFQ; Labonté et al., 2012), while disgusting food pictures were created with copyright-free pictures found on the Internet and stimuli used by Rousset et al. (2005, 2008). The visual parameters of these pictures were standardized (e.g., resolution, background, frame, brightness) and the food was always presented in the same way, that is, at the center of a white plate, flanked by a knife and a fork. Also, the size of the food portion on the dish (i.e., area covered by the food item or the meal) was similar across pictures. Finally, the object pictures were extracted from the International Affective Picture System (IAPS, #7059, 7175, 6150, 7056, 7018; Lang et al., 2008).

### Temporal Bisection Tasks

Inspired by the research of Gil et al. (2009) on time perception and food pictures, the study was composed of two bisection tasks: one with emotional pictures (i.e., joyful and disgusting food), and one with a neutral picture. However, instead of using a white oval as a non-food or neutral stimulus as those authors did, pictures of a neutral object were used here. In fact, for women suffering from an ED, a white oval could represent an empty plate, which, according to clinical practice, seems far from being emotionally neutral.

Each bisection task consisted of two phases: a training phase and a testing phase. During the training phase, women were exposed to a short (S; 400 ms) and a long (L; 1600 ms) standard duration, which were marked by the presentation of a picture of a snap hook. This object was chosen because, as stated previously, it was evaluated as emotionally neutral by both groups, and in order to make the visual parameters of images uniform, its size was comparable to the size of the food used as emotional stimuli. The S and L standard durations were presented five times each, and the women had to memorize them. After that, the participants performed 14 practice trials (2 for each probe duration) in which they had to indicate, by pressing the corresponding button (S or L) on the keypad, whether the presentation duration of a new picture, a screw, was closer to the short (S) or to the long (L) standard duration. As for the snap hook, the picture of the screw was chosen for its neutrality and its size. In order for the emotional salience of the stimuli to take hold, a delay of 1800 ms separated the stimulus to be judged from the screen asking participants to give their answer. The left–right position of keys was counterbalanced across women. Seven probe durations were used: the two standards (400 and 1600 ms) and five intermediate duration values (600, 800, 1000, 1200, and 1400 ms). Feedback was given on every trial of the training phase. The intertrial interval (ITI) was then presented, with a random duration ranging from 1800 to 2300 ms.

During the testing phase, the women performed the same task: they indicated whether the presentation duration of a picture was closer to the S or to the L standard duration. However, whereas the stimuli took the form of an object picture (snap hook) in the neutral bisection task, food pictures were used in the emotional conditions. In the neutral task, each participant completed a total of 280 trials, that is, 40 trials for each probe duration (40 × 7), separated into 5 blocks of 56 trials. In the emotional task with food pictures, women performed 560 trials, that is, 40 trials for each probe duration, for each emotion (40 × 7 × 2)<sup>1</sup>. These 560 trials were divided into 8 blocks of 70 trials. As mentioned above, five specific joyful stimuli were used for ED and HC groups, but both groups viewed the same five images of disgusting food. The use of five different images to evoke the same affect was motivated by a desire to reduce emotional habituation across blocks and trials. In this sense, for example, instead of viewing a joyful food picture 280 times, participants saw it 56 times.

#### TABLE 1 | Values of selected pictures on valence, arousal, joy, disgust and neutrality.

Values represent means (standard deviations) on Likert scales ranging from 1 to 9. <sup>a</sup>Eating disorders group. <sup>b</sup>Healthy controls group.

<sup>1</sup> Initially, the time perception tasks (bisection and discrimination) of this study were designed to be used with electroencephalographic (EEG) recordings. Consequently, a high number of trials per condition (for a better signal-to-noise ratio) and a long delay between the steps of the tasks (e.g., ITI, stimulus-answer delay; for viewing the appearance and the extinction of cerebral components) were planned.

In the two bisection tasks, trials were presented in a random order. The standards were shown at the start of each block, before the round of trials began. Participants took a break between blocks to reduce fatigue. Women were asked to refrain from using segmentation and counting strategies that could help them to track time (e.g., foot tapping, imagery, repetitive movements, or counting seconds; Grondin et al., 1999).
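The trial counts of the two bisection tasks follow directly from the design (7 probe durations, 40 repetitions per probe and emotion). The sketch below rebuilds the trial lists and checks the block sizes. It is an illustration only: the function name `build_trials`, the seed, and the simple shuffle-then-split blocking are assumptions, since the exact randomization constraints used in E-Prime are not described.

```python
import random

PROBES_MS = [400, 600, 800, 1000, 1200, 1400, 1600]  # two standards + five intermediates


def build_trials(emotions, reps=40, n_blocks=5, seed=0):
    """Build a shuffled list of (emotion, probe duration) trials and split it
    into equally sized blocks (assumed blocking scheme)."""
    trials = [(e, d) for e in emotions for d in PROBES_MS for _ in range(reps)]
    random.Random(seed).shuffle(trials)
    size = len(trials) // n_blocks
    return [trials[i * size:(i + 1) * size] for i in range(n_blocks)]


neutral_blocks = build_trials(["neutral"], n_blocks=5)           # 40 x 7 = 280 trials
emotional_blocks = build_trials(["joy", "disgust"], n_blocks=8)  # 40 x 7 x 2 = 560 trials

# With five different pictures per emotion, each picture is shown on one fifth
# of the 280 trials of that emotion.
exposures_per_picture = (40 * len(PROBES_MS)) // 5
```

The same arithmetic explains the habituation argument in the text: spreading the 280 trials of one emotion over five pictures gives 56 presentations per picture.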

#### Duration Discrimination Task

A discrimination task was also administered to participants. In this task, women of both groups had to judge the relative duration of two pictures presented successively and to indicate, by pressing the appropriate button on the keypad (S or L), whether the duration of the second picture was shorter (S) or longer (L) than the first. The left-right position of keys was counterbalanced across participants. The images were presented back to back, in a random order, with a 1 s interstimulus interval (ISI). The participants had to respond as soon as the second picture disappeared from the screen, which then turned black: there was neither a fixed delay between the second stimulus and the response, nor an instruction screen asking participants to enter their answer. No training phase was introduced before the testing blocks and no feedback was given after each response.

The two stimuli presented consecutively could be of three emotions: joy (J), disgust (D), or neutral (N). So, nine conditions composed the task: three in which the same emotion was used to mark the duration of the first and second pictures (J–J, D–D, N–N), and six conditions in which different emotions were used (J–D, J–N, D–N, D–J, N–J, N–D). There was one experimental block for each of these nine combinations. The order of the blocks was randomized and counterbalanced between participants. There was a total of 360 trials, that is 40 trials for each of the 9 blocks. Durations used were 400 and 482 ms. Within each block, the order of the short and the long intervals was randomized, but equiprobable. As for the bisection tasks, one emotion was solicited by five different images, but contrary to that task, the neutral affect was also represented by five pictures. Again, each group tested specific pictures for the joyful food conditions, but were presented with the same pictures for the disgusting ones. The neutral object pictures were also identical for the ED and the HC groups.
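The nine conditions of the discrimination task are simply all ordered pairs of the three emotions. A short sketch of that enumeration, with the resulting trial total and the relative difference between the two durations, is given below; the variable names are illustrative.

```python
from itertools import product

EMOTIONS = ["J", "D", "N"]  # joy, disgust, neutral

# Ordered pairs (first picture, second picture): 3 x 3 = 9 conditions.
conditions = list(product(EMOTIONS, repeat=2))
same_emotion = [c for c in conditions if c[0] == c[1]]
different_emotion = [c for c in conditions if c[0] != c[1]]

TRIALS_PER_BLOCK = 40
total_trials = len(conditions) * TRIALS_PER_BLOCK

SHORT_MS, LONG_MS = 400, 482
relative_difference = (LONG_MS - SHORT_MS) / SHORT_MS  # 82 / 400
```

The two durations differ by 82 ms, that is, 20.5% of the shorter interval.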

## Measures

#### Anthropometric Data

Participants' height and weight were measured, then their body mass index (BMI; kg/m<sup>2</sup> ) was calculated. The direct evaluation of anthropometric data was made instead of using self-reported information because the latter can be invalid (Meyer et al., 2009a,b; Ambwani and Chmielewski, 2013).
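For reference, the BMI computation from measured weight and height is a one-line formula. The function below is a minimal sketch; its name and the input validation are illustrative choices.

```python
def body_mass_index(weight_kg, height_m):
    """Return BMI in kg/m^2 from measured weight (kg) and height (m)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


example_bmi = body_mass_index(60.0, 1.65)  # hypothetical measurements
```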

#### ED Symptomatology

The Eating Disorder Examination – Questionnaire (EDE-Q 6.0; Fairburn and Beglin, 2008; French translation by Carrard et al., 2015) and the Revised Restraint Scale (RRS; Herman and Polivy, 1980; unofficial French translation by the Institut sur la Nutrition et les Aliments Fonctionnels [INAF] of Université Laval), two self-report scales, were used to confirm the absence of ED in the HC group, and to document the ED symptomatology in the clinical group. The EDE-Q includes 22 items assessing the attitudinal features of ED psychopathology over the previous 28 days, which are grouped into four subscales: restraint, eating concern, shape concern and weight concern. These items are answered on a 7-point Likert scale (ranging from 0 to 6). A global score can be calculated by summing the subscale scores and averaging them. The greater the global score, the more severe the symptoms of ED. Another six items assess the frequency of ED behaviors (i.e., binge eating episodes, inappropriate compensatory methods), that is, how many times the behaviors occurred during the previous 28 days. The EDE-Q is a good instrument in terms of internal consistency (α = 0.70–0.90 for clinical sample; α = 0.78–0.93 for community sample; Luce and Crowther, 1999; Mond et al., 2004b; Peterson et al., 2007) and test–retest reliability (r = 0.81–0.94 for 2 weeks, r = 0.57–0.77 for about 1 year; Luce and Crowther, 1999; Mond et al., 2004a). To be included in the study as HC, recruited women could not present any fasting phase or notable restriction of energy intake to lose weight, nor any episode of binge eating with inappropriate compensatory behaviors. Furthermore, HC participants had to be satisfied with their weight and silhouette (dissatisfaction score < 3 on both items), and these elements could not significantly influence their self-esteem (score < 3).
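The global score described above is the mean of the four subscale scores. A minimal sketch is given below; it takes the subscale scores as input and deliberately does not encode which items belong to which subscale. The function name and the dictionary keys are illustrative assumptions.

```python
SUBSCALES = ("restraint", "eating_concern", "shape_concern", "weight_concern")


def edeq_global_score(subscale_scores):
    """Average the four EDE-Q subscale scores (each on the 0..6 item scale)."""
    missing = [s for s in SUBSCALES if s not in subscale_scores]
    if missing:
        raise ValueError(f"missing subscale(s): {missing}")
    values = [subscale_scores[s] for s in SUBSCALES]
    if any(not 0 <= v <= 6 for v in values):
        raise ValueError("subscale scores must lie between 0 and 6")
    return sum(values) / len(values)


# Hypothetical subscale scores, for illustration only.
example_global = edeq_global_score(
    {"restraint": 1.0, "eating_concern": 2.0, "shape_concern": 3.0, "weight_concern": 4.0}
)
```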

The RRS measures restrained eating. It has 10 items divided into two subscales: concern for dieting and weight fluctuation. These items are presented in a multiple-choice format and scored on scales of 0–3 or 0–4 points. A typical cut-off score of 15 or 16 is used to designate high restraint (Polivy et al., 1988; Heatherton et al., 1991), so to take part in the study, recruited HC had to obtain a global score of 15 or lower. The RRS has good psychometric properties: internal consistency ranges from 0.79 to 0.86, and test–retest reliability is estimated at 0.95 for 2 weeks and 0.74 for 2 years (Ruderman, 1983; Allison et al., 1992; Gorman and Allison, 1995).

In addition to the EDE-Q and the RRS, the Structured Clinical Interview for DSM-IV-TR Axis I Disorders – Research version, Patient Edition (SCID-I/P; First et al., 2002), adapted for the DSM-5 criteria (American Psychiatric Association, 2013), was used to attribute diagnoses to women in the ED group. The SCID-I/P is a gold standard semi-structured interview for psychiatric evaluation and diagnosis. The instrument has an interjudge reliability of 0.70–1.0 and a test–retest reliability of 0.82–0.90 for both clinical and community samples (Williams et al., 1992; Segal et al., 1994; Pike et al., 1995; First et al., 2002).

#### Affective State

Depressive symptoms of women in both groups were evaluated with the Beck Depression Inventory II (BDI-II; Beck et al., 1996; French adaptation by Éditions du Centre de Psychologie Appliquée, 1996). This questionnaire has 21 items with ratings ranging from 0 to 3, according to the severity of symptoms. Participants must respond as a function of their state during the last 2 weeks, including the day of testing. The BDI-II global score estimates the intensity of the depressive mood, with higher scores indicating more severe symptoms. The instrument has a high internal consistency (α = 0.90) and a good test–retest stability (r = 0.73–0.96; for review, see Wang and Gorenstein, 2013).

The State-Trait Anxiety Inventory – Y Form (STAI-Y, Spielberger, 1983; French-Canadian adaptation by Gauthier and Bouchard, 1993), a 40-item self-report questionnaire, was used to detect the presence and assess the intensity of anxiety symptoms in participants. Part A assesses State-Anxiety, which can be defined as subjective feelings of tension, nervousness and worry, and the arousal of the autonomic nervous system induced temporarily by situations perceived as dangerous. In contrast, Part B assesses Trait-Anxiety, which refers to a tendency to perceive the world in a fearful way, and to a disposition to react toward it with stress and discomfort in a predictable manner. Part A consists of 20 statements evaluating how respondents feel "right now" or "in a recent past" according to the moment they complete the questionnaire, whereas Part B consists of 20 items assessing how people "generally" feel. Participants respond on 4-point Likert scales. Two raw scores, one for each part, are obtained by adding up the points on each item, and are then transformed into T-scores depending on the age of the participants. For the purposes of the experiment, only Part A scores (i.e., current symptoms; State-Anxiety) were compiled. The test–retest reliability coefficient of the instrument ranges from 0.31 to 0.86 (intervals ranging from 1 h to 104 days) and its internal consistency coefficient ranges from 0.86 to 0.95 (Spielberger, 1983; Gauthier and Bouchard, 1993).

A supplementary measure of participants' affective state was used: the Profile of Mood States (POMS; McNair et al., 1971; French adaptation by Cayrou et al., 2000, 2003). This 65-item questionnaire was introduced in the study to consider feelings other than depression and anxiety, and to collect data about the mood of participants just before they completed the temporal tasks (i.e., their current or "right now" mood). In fact, the POMS provides a score for different emotional conditions (i.e., tension-anxiety, depression, anger-hostility, vigor-activity, fatigue, confusion-bewilderment, and interpersonal relationships), and a total mood disturbances score (TMD) for which all scales except interpersonal relationships are taken into account. For each item, respondents give a self-reported answer on how well it describes them. Five-point Likert scales are used. Raw scores are transformed into Z-scores, then into T-scores. For the present study, only the TMD score, a general distress estimate, was considered. The POMS has satisfactory psychometric qualities: its internal consistency ranges from 0.63 to 0.88 and its 2-week test–retest reliability ranges from 0.66 to 0.83 (McNair et al., 1971; Cayrou et al., 2000, 2003).

### Cognitive Abilities

Five well-known neuropsychological tests were used to evaluate attention components, processing speed, working memory, and executive functioning of all participants. The Continuous Performance Test – Second Edition (CPT-II; Conners, 2000) is a computerized task that requires respondents to press the spacebar each time a letter appears on the screen, except when the letter X is shown. The 14-min duration of the test and the variation of its ISI allow measuring sustained attention and vigilance. Alertness level (reaction time) and its constancy (or stability) are also evaluated, as well as the number of commissions (i.e., number of times the spacebar is pressed when the letter X is shown) and omissions (i.e., number of letters for which no response is provided). Performance is expressed in T-scores: the higher the T-scores, the greater the participants' difficulties.

The Spatial Span subtest of the Wechsler Memory Scale – Third Edition (WMS-III; Wechsler, 1997) assesses nonverbal (or visuospatial) working memory (i.e., maintenance and manipulation components). Ten blocks are arranged asymmetrically on a board. The examiner taps the blocks in a prearranged sequence and the participant tries to reproduce it by pointing to the blocks in the same (Spatial Span Forward; maintenance component) or in the reverse order (Spatial Span Backward; manipulation component), as instructed. The number of blocks to be recalled increases across trials. The task is discontinued when the respondent fails both trials of the same length. The span represents the number of blocks tapped in the longest sequence completed successfully. The performance is expressed in cumulative percentages: the higher the scores, the lower the participant's spans.

Two tasks of the Delis-Kaplan Executive Function System (D-KEFS; Delis et al., 2001) were used to assess women's performance in visual scanning (i.e., selective attention), processing speed, verbal inhibition, and cognitive flexibility. The first one is the Trail Making Test (TMT), which is composed of five parts. In Part 1 (visual scanning), respondents are required to find and circle all the numbers "3" on a worksheet with distractors (numbers and letters). In Part 2 (number sequencing), participants must link a set of 16 numbers (1 to 16) in ascending order, on a worksheet with numbers and letters. In Part 3 (letter sequencing), they have to join a set of letters (A to P) in alphabetical order. In Part 4 (number-letter switching), respondents must connect 16 numbers and 16 letters, in numerical and alphabetical orders, switching between them (i.e., 1-A-2-B-3-C). Finally, in Part 5 (motor speed), women have to draw a line over a dotted line, touching circles along the path, as fast as they can. For every part, the time to complete the task and the number of errors are recorded. The performance is expressed in scaled scores: the higher the scaled scores, the better the participants' abilities. The second task of the D-KEFS that was used is the Color-Word Interference (CWI). This subtest, based on the Stroop effect, has four conditions. In Condition 1 (denomination), respondents name the color of squares printed on a sheet. In Condition 2 (reading), women read color names (words) printed in black ink. In Condition 3 (inhibition), participants have to name the color of the ink in which given color words are printed, that is, to inhibit automatic responses (i.e., reading) and generate incongruent responses. In Condition 4 (switching), participants must shift between reading the color names (words) and naming the ink colors of the words printed. As for the TMT, for all conditions, the time to complete the task and the number of errors are noted. Once again, the performance is expressed in scaled scores.

Finally, participants' verbal working memory (i.e., maintenance and manipulation components) was assessed by the Digit Span subtest of the Wechsler Adult Intelligence Scale – Fourth Edition: Canadian (WAIS-IV CDN; Wechsler, 2008). In this task, the examiner reads aloud a sequence of numbers and the respondent is asked to repeat them in the same order (Digit Span Forward; maintenance component), in reverse order (Digit Span Backward; manipulation component) or in ascending order (Digit Span Sequencing; manipulation component), as instructed. As for the Spatial Span, the task is discontinued when the participant makes errors on both trials of the same length, and the span represents the number of digits recalled in the longest sequence correctly completed. Performance is expressed in cumulative percentages: the higher the scores, the lower the participant's spans.
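The discontinuation rule shared by the Spatial Span and Digit Span subtests can be sketched as follows. This is only an illustrative sketch, not the official scoring procedure of the tests: the function name `span_score`, the data layout, and the assumption that a sequence length counts as "completed" when at least one of its two trials is correct are all hypothetical.

```python
def span_score(trials_by_length):
    """Return the span from a list of (length, (trial1_ok, trial2_ok)) pairs.

    Pairs must be ordered by increasing sequence length.
    Assumption: a length is "completed" if at least one of its two trials is correct.
    """
    span = 0
    for length, outcomes in trials_by_length:
        if any(outcomes):
            span = length  # longest length completed so far
        else:
            break  # both trials of this length failed: the task is discontinued
    return span


# Hypothetical participant: fails both trials at length 5, so later trials are ignored.
results = [
    (2, (True, True)),
    (3, (True, False)),
    (4, (False, True)),
    (5, (False, False)),
    (6, (True, True)),
]
print(span_score(results))
```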

### Procedure

Women of both groups were tested individually in a quiet room of the Laboratoire de Recherche en Psychologie de la Perception of the Université Laval. To reduce the variability of hungriness between participants and its influence on time perception, women were told to eat in the 60 min preceding the experiment and were tested at fixed hours, that is, after breakfast (8:30 am), after lunch (1:30 pm) or after supper (6:30 pm). The study included two sessions that lasted approximately 2.5 h each. At the beginning of each session, participants had to report the time and the content of their preceding meal, rate their level of appetite on a 7-point Likert scale (ranging from 0 to 6) and fill in the POMS about their current general mood. During the first experimental session, women completed the emotional bisection task, filled in questionnaires about their eating behaviors and affective state (i.e., EDE-Q, RRS, BDI-II, and STAI-Y), and completed half of the neuropsychological measures. During the second session, they performed the neutral bisection task, finished the neuropsychological testing and performed the discrimination task. Afterwards, anthropometric data were collected. The order of the sessions and the content of each session were counterbalanced across participants. Considering the cognitive load associated with each experimental session, a long break was taken between the tasks and only one session could be completed per day. However, to reduce the variability of affective state and BMI across sessions, both sessions had to be completed within 2 weeks.

### RESULTS

### Age, BMI, ED Symptomatology, Affective State and Level of Hungriness

Demographic information, clinical characteristics and level of hungriness of participants are reported in **Table 2**. Values for each group of women (i.e., ED and HC), and for each ED subgroup (i.e., AN and BN) are presented. Data were inspected for normality (skewness, kurtosis, Shapiro–Wilk test), and because they did not display a normal distribution, non-parametric analyses (i.e., Kruskal–Wallis and Mann–Whitney U tests) were used to evaluate differences between groups and subgroups. The alpha (0.05) was adjusted with a Bonferroni correction when needed. **Table 3** provides the results of the analyses.
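As a minimal illustration of the Bonferroni adjustment mentioned above (not the authors' analysis code), the per-comparison threshold is the nominal alpha divided by the number of comparisons. The function name `bonferroni`, the comparison labels and the p-values below are hypothetical and chosen only for demonstration.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the adjusted threshold and a significance flag per comparison.

    p_values: dict mapping a comparison label to its unadjusted p-value.
    """
    m = len(p_values)
    threshold = alpha / m  # Bonferroni-adjusted alpha
    flags = {name: p < threshold for name, p in p_values.items()}
    return threshold, flags


# Hypothetical unadjusted p-values for three pairwise subgroup comparisons.
pairwise = {"AN vs BN": 0.004, "AN vs HC": 0.030, "BN vs HC": 0.200}
threshold, flags = bonferroni(pairwise)
print(round(threshold, 4), flags)
```

An equivalent convention multiplies each p-value by the number of comparisons and compares the result to the nominal alpha.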

At the groups level, there were no significant differences between women with ED and HC for age (p = 0.327), BMI (p = 0.489) and level of hungriness before each experimental session (for bisection, p = 0.159; for discrimination, p = 0.382).

TABLE 2 | Demographic information, clinical characteristics, and level of hungriness for groups of eating disorders (ED) and healthy controls (HC), and for anorexia (AN) and bulimia nervosa (BN) subgroups of ED.


BMI, Body mass index; EDE-Q, Eating Disorder Examination – Questionnaire (global score); RRS, Revised Restraint Scale (raw score); BDI-II, Beck Depression Inventory II (raw score); STAI-Y, State-Trait Anxiety Inventory – Y Form (T-score); POMS, Profile of Mood State (Global Mood Disturbances T-score); Hungriness, Level of appetite assessed on a 7-point Likert scale ranging from 0 to 6.



ED, eating disorders group; HC, healthy controls group; AN, anorexia nervosa subgroup; BN, bulimia nervosa subgroup; BMI, body mass index; EDE-Q, Eating Disorder Examination – Questionnaire (global score); RRS, Revised Restraint Scale (raw score); BDI-II, Beck Depression Inventory II (raw score); STAI-Y, State-Trait Anxiety Inventory – Y Form (T-score); POMS, Profile of Mood State (Global Mood Disturbances T-score); Hungriness, level of appetite assessed on a 7-point Likert scale ranging from 0 to 6. <sup>∗</sup>Significant effect (alpha level adjusted with a Bonferroni correction when needed).

However, women suffering from AN or BN as a group showed more attitudinal ED features (p < 0.001), restrained eating (p < 0.001), depression symptoms (p < 0.001), anxiety manifestations (p < 0.001), and global mood disturbances for both experimental sessions (for bisection, p < 0.001; for discrimination, p = 0.001) than HC.

At the subgroups level, there were no significant differences between women with AN, BN and HC for age (p = 0.509) and level of hungriness before the discrimination task (p = 0.219). However, there were differences between subgroups on the BMI (p < 0.001), the BDI-II (p < 0.001), the STAI-Y (p < 0.001), the EDE-Q (p < 0.001), the RRS (p < 0.001), the POMS for both sessions (for bisection, p < 0.001; for discrimination, p < 0.001) and the level of hungriness before the bisection task (p < 0.001). More precisely, the BN subgroup showed a higher BMI than the AN subgroup (p < 0.001) and the HC group (p = 0.021), and a higher level of hungriness during the bisection task than the HC group (p = 0.042). In addition, this subgroup presented lower symptoms of depression than women suffering from AN (p = 0.044), but higher levels of anxiety (p = 0.017), ED attitudes (p < 0.001) and restrained eating (p < 0.001) than HC. For their part, women with AN showed a higher level of depression (p < 0.001), anxiety (p < 0.001), ED features (p < 0.001) and restrained eating (p < 0.001) than HC. They also showed a higher level of general distress than HC at the beginning of each experimental session (p < 0.001).

### Cognitive Abilities

**Table 4** presents the classification of neuropsychological measures according to the cognitive domain assessed, and the results on these measures. Data for each group of women, and for each ED subgroup, are shown. To test for differences between groups and subgroups, non-parametric analyses were used once again because the data were not normally distributed. Each cognitive domain was evaluated separately and the alpha (0.05) was adjusted when required (see **Table 5**).

At the groups level, the analyses revealed that women with an ED showed verbal (maintenance component, p = 0.303; manipulation component, p = 0.295, p = 0.991) and visuospatial (maintenance component, p = 0.143; manipulation component, p = 0.732) working memory, reaction time (p = 0.886), sustained attention (p = 0.362, p = 0.101), verbal inhibition (p = 0.786) and cognitive flexibility abilities (p = 0.936, p = 0.328) similar to those of the HC group. Nevertheless, their processing speed was slower (p = 0.049), their alertness was less constant (p = 0.024), their vigilance was inferior (p = 0.010), and their motor impulsivity (p = 0.026) was higher than those of HC. Furthermore, the ED group tended to have lower selective attention (visual scanning) capacities than women not suffering from AN or BN as a group (p = 0.062).

At the subgroups level, there were no significant differences between women for the manipulation component in verbal (p = 0.092, p = 0.949) and visuospatial (p = 0.908) working memory, the maintenance component in spatial working memory (p = 0.187), the processing speed (p = 0.141), the alertness (level, p = 0.861; stability, p = 0.074), the selective attention (p = 0.170), the sustained attention (p = 0.193, p = 0.156), the motor impulsivity (p = 0.051), the verbal inhibition (p = 0.955), and the cognitive flexibility (p = 0.928, p = 0.313). However, the vigilance capacities of the AN subgroup were poorer than those of the HC group (p = 0.030). In addition, women suffering from AN tended to show a lower verbal span for the maintenance condition than the BN (p = 0.057) and HC participants (p = 0.069).

### Performance on Time Perception Tasks

#### Temporal Bisection Tasks

The data of the two bisection tasks (the one with joyful and disgusting food pictures and the one with a neutral object picture) were combined to compare the influence of these emotions on time perception. First, for each participant of both groups, the proportion of "long" responses (p[long]) for each stimulus duration and for each emotional condition was calculated. Then, the p[long] was plotted against durations. An examination of **Figure 2** revealed that the p[long] increased as a function of stimulus duration for both groups, indicating that participants seemed to estimate time adequately (i.e., the longer the stimulus duration was, the more likely women were to respond "long"). However, the psychophysical functions were not distinctly shifted toward the left or the right, suggesting no clear time distortion due to emotions. To explore the specificities of the ED diagnoses, the same procedure was applied within each ED subgroup: the p[long] was calculated and plotted against durations for women with AN and BN, in comparison with data of the HC group. An inspection of **Figure 3** indicated, in addition to an effect of Duration, a clear shift of the curve toward the left for the AN group when either joyful or disgusting food pictures were presented.
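The computation of p[long] for one participant and one emotional condition can be sketched as below. This is a minimal sketch under stated assumptions: the function name `proportion_long`, the trial format and the stimulus durations (400, 1000 and 1600 ms) are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def proportion_long(trials):
    """Return {duration_ms: proportion of "long" responses}.

    trials: iterable of (duration_ms, response), response being "long" or "short".
    """
    counts = defaultdict(lambda: [0, 0])  # duration -> [n_long, n_total]
    for duration, response in trials:
        counts[duration][0] += response == "long"
        counts[duration][1] += 1
    return {d: n_long / n for d, (n_long, n) in sorted(counts.items())}


# Hypothetical responses of one participant in one emotional condition.
trials = [
    (400, "short"), (400, "short"),
    (1000, "long"), (1000, "short"),
    (1600, "long"), (1600, "long"),
]
print(proportion_long(trials))
```

Plotting these proportions against duration yields the psychophysical function for that participant and condition.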

TABLE 4 | Scores on neuropsychological measures, by cognitive domain, for groups of eating disorders (ED) and healthy controls (HC), and for anorexia (AN) and bulimia nervosa (BN) subgroups of ED.


CPT-II, Continuous Performance Test – Second Edition; RT, reaction time; SD, standard deviation; TMT, Trail Making Test; BC, block change; ISI, interstimulus interval. Types of measures for the neuropsychological tests: Spatial Span and Digit Span, cumulative percentages; CPT-II, T-scores; CWI and TMT, scaled scores. <sup>a</sup>Longest sequence correctly recalled (span); <sup>b</sup>non-verbal modality; <sup>c</sup> verbal modality.

TABLE 5 | Results of the Mann–Whitney U and the Kruskal–Wallis tests on participants' neuropsychological scores, by cognitive domain.


ED, eating disorders group; HC, healthy controls group; AN, anorexia nervosa subgroup; BN, bulimia nervosa subgroup; CPT-II, Continuous Performance Test – Second Edition; RT, reaction time; SD, standard deviation; TMT, Trail Making Test; BC, block change; ISI, interstimulus interval. Types of measures for the neuropsychological tests: Spatial Span and Digit Span, cumulative percentages; CPT-II, T-scores; CWI and TMT, scaled scores. <sup>a</sup>Longest sequence correctly recalled (span); <sup>b</sup>non-verbal modality; <sup>c</sup> verbal modality. <sup>∗</sup>Significant effect (alpha level adjusted with a Bonferroni correction when needed). †Marginally significant effect (alpha level adjusted with a Bonferroni correction when needed).

FIGURE 3 | Proportion of "long" responses plotted against stimulus duration for each emotional condition and each subgroup of participants.

To further investigate these observations, two temporal indexes were calculated and analyzed: the bisection point (BP) and the Weber ratio (WR). The BP refers to the stimulus duration at which the participants would respond "short" or "long" with equal frequency (p[long] = 0.50). A higher BP means that durations are judged to be shorter, shifting the psychophysical function toward the right (i.e., underestimation of time). The WR reflects sensitivity to time. Lower WR indicates better performance (i.e., greater temporal sensitivity). Both measures were derived from a maximum likelihood fit of the proportion of "long" responses to the target durations on a cumulative Gaussian curve. BP was equal to the mean parameter of the estimated Gaussian curve, while the WR was equal to the standard deviation parameter of the estimated Gaussian curve divided by the arithmetic mean of all target durations (i.e., 1000 ms). **Table 6** presents the BP and the WR values for each group, and for each ED subgroup.
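The derivation of the BP and the WR from a cumulative Gaussian fit can be sketched as follows. This is an illustrative sketch only, not the fitting routine used in the study: it assumes a binomial likelihood and replaces a proper optimizer with a coarse grid search; the function names, the grid bounds and the response counts are hypothetical.

```python
import math


def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian: predicted probability of a "long" response."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def neg_log_lik(mu, sigma, data):
    """Binomial negative log-likelihood of (duration, n_long, n_total) triples."""
    nll = 0.0
    for duration, n_long, n_total in data:
        p = min(max(cum_gauss(duration, mu, sigma), 1e-9), 1.0 - 1e-9)
        nll -= n_long * math.log(p) + (n_total - n_long) * math.log(1.0 - p)
    return nll


def fit_bisection(data, mean_duration=1000.0):
    """Grid-search maximum likelihood fit; returns (BP, WR)."""
    best = None
    for mu in range(400, 1601, 5):        # candidate means (ms), assumed bounds
        for sigma in range(20, 801, 5):   # candidate standard deviations (ms)
            nll = neg_log_lik(mu, sigma, data)
            if best is None or nll < best[0]:
                best = (nll, mu, sigma)
    _, mu, sigma = best
    return float(mu), sigma / mean_duration  # BP = mean, WR = SD / mean duration


# Hypothetical counts of "long" responses out of 20 trials per duration.
data = [(400, 0, 20), (600, 2, 20), (800, 6, 20), (1000, 10, 20),
        (1200, 14, 20), (1400, 18, 20), (1600, 20, 20)]
bp, wr = fit_bisection(data)
print(bp, wr)
```

In practice, a numerical optimizer would replace the grid search, but the definitions of the two indexes stay the same.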

A mixed-design analysis of variance (ANOVA) was conducted on the BP, with Emotion (J, D, and N) as a within-subjects factor and Group (ED and HC) as a between-subjects factor. For this analysis (and all subsequent ones), the alpha level was fixed at 0.05 and corrections (Greenhouse–Geisser, Bonferroni) were applied when needed. The ANOVA revealed neither significant main effects of Emotion (p = 0.370) and Group (p = 0.867), nor a significant Emotion × Group interaction (p = 0.436; see **Table 7** for the complete results). To investigate the presence of possible differences on the BP at the ED subgroups level, another ANOVA was carried out, comparing data of women with AN, with BN, and without ED. The analysis showed no significant main effects of Emotion (p = 0.118) and Subgroup (p = 0.129), but the Emotion × Subgroup interaction was significant (p = 0.004; see **Table 7**). The decomposition of this interaction indicated that for joyful food pictures, women suffering from AN showed a lower BP than participants with BN (p = 0.019). A trend was noted for disgusting food images too (p = 0.052). However, there was no significant difference between AN and BN for the neutral object (p = 1.00). In other words, compared to women with BN, participants with AN significantly overestimated the duration of joyful food pictures, and tended to overestimate the duration of disgusting food pictures. Moreover, post hoc tests revealed that for the AN subgroup, the BPs associated with joyful and disgusting food pictures were lower than the BP for the neutral object picture (p = 0.005, p = 0.005). Thus, in comparison to the neutral stimulus, people suffering from AN overestimated the duration of food pictures.

A mixed-design ANOVA was conducted on the WR, with Emotion (J, D, and N) as a within-subjects factor and Group (ED and HC) as a between-subjects factor. The analysis showed no significant main effects of Emotion (p = 0.721) and Group (p = 0.136), but a marginally significant Emotion × Group interaction (p = 0.076; see **Table 7**). Post hoc analyses indicated that, for the neutral object, the WR of the ED group was higher than that of the HC group (p = 0.026), which means that women suffering from an ED had more difficulty discriminating time (lower level of temporal sensitivity). Finally, to test the WR across the ED subgroups, a second ANOVA was run. However, neither the main effects of Emotion (p = 0.839) and Subgroup (p = 0.271), nor the Emotion × Subgroup interaction (p = 0.196), was significant or marginally significant (see **Table 7**).

TABLE 6 | Bisection point (BP) and Weber ratio (WR) associated to emotional stimuli for groups of eating disorders (ED) and healthy controls (HC), and for anorexia (AN) and bulimia nervosa (BN) subgroups of ED.


Values represent means (standard deviations).

TABLE 7 | Results of the ANOVAs for the temporal bisection task and the duration discrimination task.


ED, eating disorders group; HC, healthy controls group; AN, anorexia nervosa subgroup; BN, bulimia nervosa subgroup; BP, bisection point; WR, Weber ratio. <sup>∗</sup>Significant effect (alpha level adjusted with a Bonferroni correction when needed). †Marginally significant effect (alpha level adjusted with a Bonferroni correction when needed).

#### Duration Discrimination Task

The data of the discrimination task were analyzed in terms of perceived duration and discrimination level (or sensitivity). The perceived duration is the probability of responding "long" (p[long]), which indicates, for each experimental condition (i.e., each pair of stimuli), whether the second picture presented is judged as shorter or as longer than the first one. The discrimination level is the probability of responding correctly (p[correct]), that is, responding "long" when the duration of the second picture is actually longer than that of the first one. A higher proportion of correct responses means greater temporal sensitivity. **Table 8** presents, for each pair of images, the p[long] and the p[correct] for the ED and HC groups, and for the ED subgroups.
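The two indexes of the discrimination task can be sketched for one pair of images as follows. This is a minimal sketch under stated assumptions: the function name `discrimination_indexes`, the trial format and the durations are hypothetical, and the code assumes that a "short" response to a shorter second picture also counts as correct.

```python
def discrimination_indexes(trials):
    """Return (p_long, p_correct) for a list of discrimination trials.

    Each trial is a dict with 'first_ms', 'second_ms' and 'response'
    ("long" or "short", judging the second picture relative to the first).
    Assumption: the two durations of a trial are never equal.
    """
    n = len(trials)
    n_long = sum(t["response"] == "long" for t in trials)
    n_correct = sum(
        (t["response"] == "long") == (t["second_ms"] > t["first_ms"])
        for t in trials
    )
    return n_long / n, n_correct / n


# Hypothetical trials for one pair of images.
trials = [
    {"first_ms": 1000, "second_ms": 1200, "response": "long"},
    {"first_ms": 1000, "second_ms": 800, "response": "short"},
    {"first_ms": 1000, "second_ms": 800, "response": "short"},
    {"first_ms": 1000, "second_ms": 1200, "response": "short"},
]
p_long, p_correct = discrimination_indexes(trials)
print(p_long, p_correct)
```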

A mixed-design ANOVA was performed on the p[long] with Emotion of the first picture (i.e., Emotion 1: J, D, and N) and Emotion of the second picture (i.e., Emotion 2: J, D, and N) as repeated variables, and Group (ED and HC) as a non-repeated factor. For this analysis (and all subsequent ones), the alpha level was fixed at 0.05 and corrections (Greenhouse–Geisser, Bonferroni) were applied when needed. The ANOVA did not reveal any effect: neither main effects (Emotion 1, p = 0.371; Emotion 2, p = 0.825; Group, p = 0.547), nor interaction effects (Emotion 1 × Group, p = 0.215; Emotion 2 × Group, p = 0.853; Emotion 1 × Emotion 2, p = 0.295; Emotion 1 × Emotion 2 × Group, p = 0.523) were significant (see **Table 7**). Next, to test for differences at the ED subgroups level, another ANOVA was carried out. Once more, no main or interaction effects emerged (Emotion 1, p = 0.305; Emotion 2, p = 0.803; Subgroup, p = 0.727; Emotion 1 × Subgroup, p = 0.365; Emotion 2 × Subgroup, p = 0.473; Emotion 1 × Emotion 2, p = 0.261; Emotion 1 × Emotion 2 × Subgroup, p = 0.678; see **Table 7**).

For sensitivity, similar ANOVA designs were used. The first ANOVA showed no effect of Emotion 1 (p = 0.102) and Emotion 2 (p = 0.119), but the Group effect was significant (p = 0.035; see **Table 7**). More precisely, women suffering from an ED had a lower probability of responding correctly than the HC group (p = 0.035). However, none of the interaction effects was significant (Emotion 1 × Group, p = 0.598; Emotion 2 × Group, p = 0.525; Emotion 1 × Emotion 2, p = 0.225; Emotion 1 × Emotion 2 × Group, p = 0.273). The second ANOVA, conducted with the subgroups, showed no significant main or interaction effects (Emotion 1, p = 0.096; Emotion 2, p = 0.126; Subgroup, p = 0.110; Emotion 1 × Subgroup, p = 0.894; Emotion 2 × Subgroup, p = 0.191; Emotion 1 × Emotion 2, p = 0.132; Emotion 1 × Emotion 2 × Subgroup, p = 0.461; see **Table 7**).

### Influence of Non-temporal Factors

Correlation analyses on the entire sample were conducted to verify and estimate the relation between the participants' performance on temporal tasks, their clinical characteristics (i.e., BMI, ED features, current mood, depressive and anxiety symptoms), their level of hungriness and their cognitive abilities. The performance on temporal tasks was explored according to the four indexes mentioned earlier, namely the BP, WR, p[long] and p[correct], with all emotions pooled together (i.e., joy, disgust, and neutrality). Because some variables were not normally distributed, Spearman correlations were computed. The alpha level was set at 0.05.
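For readers unfamiliar with the Spearman coefficient, it amounts to a Pearson correlation computed on ranks, with tied values receiving their average rank. The sketch below illustrates this with hypothetical data; the function names and the values of `wr` and `rt_sd` are assumptions for demonstration and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical Weber ratios and reaction-time variability scores.
wr = [0.18, 0.22, 0.25, 0.31, 0.40]
rt_sd = [45, 52, 50, 61, 70]
print(spearman(wr, rt_sd))
```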

For the bisection task, there was no significant or marginally significant link between the BP and participants' clinical characteristics, appetite and performance on neuropsychological tests. However, the WR significantly correlated with: BMI (r<sub>s</sub> = 0.30, p = 0.046), Spatial Span Forward (r<sub>s</sub> = 0.37, p = 0.013), Digit Span Backward (r<sub>s</sub> = 0.33, p = 0.025), CWI Part 2 (r<sub>s</sub> = −0.36, p = 0.014), CPT-II Hit RT SD (r<sub>s</sub> = 0.33, p = 0.026) and CPT-II RT ISI Change (r<sub>s</sub> = 0.45, p = 0.002). For the discrimination task, the p[long] was significantly associated with CPT-II Hit RT SD (r<sub>s</sub> = 0.29, p = 0.047), and showed a marginally significant association with CPT-II Hit RT (r<sub>s</sub> = 0.29, p = 0.053). The p[correct] significantly correlated with: TMT Part 1 (r<sub>s</sub> = 0.45, p = 0.002), CPT-II Omissions (r<sub>s</sub> = −0.29, p = 0.049), CPT-II Hit RT SD (r<sub>s</sub> = −0.35, p = 0.019), CPT-II Hit RT ISI Change (r<sub>s</sub> = −0.46, p = 0.001). It also showed a marginally significant relation with: RRS (r<sub>s</sub> = −0.29, p = 0.050), EDE-Q (r<sub>s</sub> = −0.29, p = 0.052), Digit Span Sequencing (r<sub>s</sub> = −0.29, p = 0.053), CWI Part 2 (r<sub>s</sub> = 0.28, p = 0.061) and CPT-II Hit RT (r<sub>s</sub> = −0.26, p = 0.082).

TABLE 8 | Proportion of "long" and correct responses associated to each pair of images, for groups of eating disorders (ED) and healthy controls (HC), and for anorexia (AN) and bulimia nervosa (BN) subgroups of ED.


Values represent means (standard deviations).

To further examine the influence of non-temporal factors on participants' temporal performance, ANCOVAs were run<sup>2</sup>. In other words, we were interested in seeing whether the statistical control of these factors would affect the differences between groups and subgroups of women. If so, we could then infer that the observed differences seemed to be due to these factors. For each of the ANCOVAs performed, the covariates were the measures significantly correlated with the temporal index investigated, as presented above (i.e., the marginally significant relations were not included). Because no measure was correlated with the BP, no ANCOVA was performed for this index. The alpha level was fixed at 0.05 and corrections (Greenhouse–Geisser, Bonferroni) were applied when needed.

For the bisection task, controlling for the influence of BMI and specific cognitive abilities (i.e., alertness constancy, vigilance, processing speed, and working memory) on the WR made the marginally significant Emotion × Group interaction effect disappear (p = 0.210). More precisely, for the neutral condition (i.e., pictures of an object), the ED group no longer showed more difficulty discriminating time than the HC group (p = 0.572). However, at the subgroups level, the inclusion of covariates did not change the results: the ANOVA and the ANCOVA showed no significant or marginally significant effect.

For the discrimination task, there was no consequence, on the results of the analyses, of removing the influence of alertness (constancy) on the p[long]: both the ANOVA and the ANCOVA, performed for groups and subgroups of participants, showed no significant or marginally significant effect. However, for the p[correct], the Group effect previously found with the ANOVA was lost after controlling for the participant's cognitive abilities (i.e., selective attention, sustained attention, alertness constancy, vigilance; p = 0.485). In this sense, the ANCOVA showed that time sensitivity of ED and HC participants was no longer different. However, at the subgroups level, the results remained unchanged (once again, no effect was found).

### DISCUSSION

The main objective of the present experiment was to study the emotional reactions of women suffering from an ED. Rather than using traditional self-report measures, which are prone to some biases, we worked with a different method, namely the presentation of food pictures and judgements about the duration of these presentations. The time perception perspective adopted is based on the fact that timing is sensitive to emotions and that temporal distortions give insights into how the environment is processed. Temporal bisection and duration discrimination tasks were used, which involved pre-rated joyful and disgusting food pictures and neutral object pictures.

The results demonstrated that, for both tasks, women with an ED, when pooled together regardless of their diagnosis, did not show any time distortion when food pictures were presented, compared to object pictures and to HC. However, in the bisection task, when the ED group was split according to diagnosis, women suffering from AN overestimated the duration of joyful and disgusting food pictures in comparison to neutral ones. Also, they perceived durations of joyful food pictures as longer than did women with BN, and tended to judge durations of disgusting pictures as longer too. HC, for their part, did not show any time distortion, for both tasks, in the food conditions compared to the object condition. In addition, the results showed that women suffering from an ED presented a lower temporal sensitivity than HC. In the bisection task, this was limited to neutral pictures, but in the discrimination task, this lower sensitivity occurred with both food and neutral pictures.

The lengthened-duration effect observed in women with AN for food pictures, regardless of their initial valence, suggests a general fear reaction to their presentation. According to the view of Angrilli et al. (1997) about the interaction between valence and arousal on time perception, an overestimation of durations is caused by a rise in the arousal level in response to a negative cue. More precisely, a threatening stimulus generates an elevation of the arousal level through an automatic response of the amygdala, which in turn speeds up the rhythm of the internal clock and activates the defensive system (Lang et al., 1997; LeDoux, 2000). The more intense the fear elicited by a threatening cue, the higher the arousal response to it (Lang et al., 1998; Bradley et al., 2001). In that regard, the overestimation of the presentation duration of food pictures by women with AN appears to be due to a rise in their arousal level, following a fear reaction. In other words, for these women, viewing food pictures seems to be strongly unpleasant, and even anxiety-provoking. That result is supported by behavioral experiments, which have shown that women suffering from an ED, particularly from AN, rate food pictures as less pleasurable and more frightening than HC (Rodríguez et al., 2007; Giel et al., 2011; Hay and Katsikitis, 2014). It is also supported by many psychophysiological studies that have demonstrated that exposure of AN women to food stimuli elicits an increase in their arousal level, as measured by heart rate, skin conductance and eye blink startle response (e.g., Perpiña et al., 1998; Gordon et al., 2001; Gorini et al., 2010), suggesting an automatic and unconscious reaction of anxiety. Similarly, fMRI studies have revealed that presentation of food pictures to women with AN increases activation of their amygdala, medial prefrontal cortex (including the anterior cingulate) and insula (Ellison et al., 1998; Uher et al., 2003, 2004), three cerebral areas related to innate signals of fear (for review, see Damasio, 1994; Adolphs, 2013; Silva et al., 2016). Finally, the time distortions caused by a fear reaction in participants with AN are reminiscent of those of spider-fearful people, for whom the durations of phobic stimuli are overestimated (Watts and Sharrock, 1984; Buetti and Lleras, 2012). For example, Tipples (2015) showed that the presentation of threatening stimuli to individuals especially reactive to them, such as pictures of spiders for highly phobic individuals, generates a rapid reaction of fear and a prompt rise of the arousal level, thus speeding up the internal clock, which results in increasing biases toward "long" responses. In brief, as shown by studies with different methodologies, food pictures appear to be synonymous with fear for women suffering from AN.

<sup>2</sup> In reference to the arguments of Miller and Chapman (2001) and those of Suckling (2010), we acknowledge that the use of ANCOVAs to "control for" or to "correct for" pre-existing group differences (e.g., cognitive abilities) when participants are not randomly assigned can be controversial. Accordingly, we think that a more appropriate way to explore the influence of non-temporal factors on participants' temporal performance, with respect to the assumptions of the statistical tests and the conclusions they allow us to draw, could have been to run mediation analyses. However, as these analyses are based on structural equation modeling, the detection of significant effects requires a large sample size (statistical power is otherwise low) and is strongly dependent on the variables studied/entered in the model. Because the present experiment had a small sample size and the selection of the measures to be investigated could have been very ambiguous considering their number and their interrelations, we decided not to perform these mediation analyses.

### Outstanding Questions About Temporal Distortions Demonstrated in the Bisection Task

Two questions arise from the lengthened-duration effect observed in AN in the bisection task. The first one is why BN participants did not show the same results, that is, an overestimation of durations for food pictures in comparison to object pictures. A possible explanation is that women suffering from BN are not as frightened by food as people with AN are. According to the DSM-5 (American Psychiatric Association, 2013), AN is defined by an "intense fear of gaining weight or becoming fat," a clinical criterion that does not characterize, per se, women with BN. To reduce their fear, women with AN rigidly avoid high-calorie food and are strongly reluctant to consume food outside a very narrow range (American Psychiatric Association, 2013; Savuskoski et al., 2016). Thus, many authors see evident overlaps between attitudes of patients with AN and those of anxious individuals in terms of worries/ruminations, intolerance of uncertainty, fear conditioning, avoidance strategies and reassurance-seeking rituals (e.g., Pallister and Waller, 2008; Steinglass et al., 2011; Startup et al., 2013; Guarda et al., 2015; Kesby et al., 2017). Some of them even conceptualize AN as a phobia of food, weight gain and fat (Russell, 1967; Habermas, 1996; Crisp, 2006). By contrast, the relation of people suffering from BN with food stimuli does not seem so frightening. For instance, Léonard et al. (1998) demonstrated that, contrary to patients with AN, BN women did not show a rise in their arousal level (skin conductance) when confronted with a test meal, suggesting an absence of fear reaction to it. Friederich et al. (2006), for their part, revealed that BN people not only reacted without fear toward food pictures, but also presented an exaggerated appetite response when exposed to them. Similarly, Drobes et al. (2001) showed that BN-like participants gave higher pleasantness ratings of food pictures than HC and AN-like participants. 
Taken together, these studies suggest that women suffering from BN do not feel threatened whilst viewing food pictures (rather, they seem to experience a certain form of pleasure) and, thus, their arousal level does not increase abruptly in comparison to a baseline (non-food related) level. From that perspective, because a certain hedonic value of food seems preserved in BN women, it is also possible that these participants were more motivated than women with AN to perform the temporal tasks. In fact, Gable and Poole (2012) showed that a high-approach motivated state shortens the perception of time, causing time to be perceived as passing more quickly. So, if BN participants were more positively responsive to food pictures than AN women, they could have judged the durations of these stimuli as being shorter. Consequently, in the present study, the absence of an overestimation of durations of food pictures in comparison to those of objects (i.e., no rise of arousal) and the fact that BN women perceived the duration of food pictures as shorter than AN participants did (i.e., positive reaction instead of fear) are better understood. In brief, for BN women, the absence of time distortions caused by food pictures in comparison to neutral ones, and the fact that their psychophysical functions for food are inversely shifted compared to those of AN, suggest that this ED subgroup reacts differently to that kind of stimuli. Therefore, when results of AN and BN subgroups are pooled together as a whole group for comparison to HC, the differences between ED and HC can disappear.

The second and last question arising from the results is why the lengthened-duration effect observed with food pictures in the bisection task for women with AN did not also appear in the discrimination task. The different results are not so surprising considering that Gil and Droit-Volet (2011b) recently showed that time distortions due to emotions could depend on the task used. More precisely, they revealed that "the magnitude of the effect of emotions is reduced when more cognitive resources are required for the processing of time." Thus, a first way to explain the difference between the results in the bisection and discrimination tasks is to explore the mental operations required and the cognitive processes solicited by each one. According to traditional models of bisection, participants who accomplish that task have to: (a) learn and store in long-term memory the S and L standard durations; (b) measure the length of the probe duration; (c) retrieve from memory the value of the S and L standard durations; (d) compare the probe duration to S and L standards; and (e) make a decision based on that comparison (Church and Deluty, 1977). Thus, from that classical perspective, participants' long-term memory processes, and access to them, seem heavily solicited. However, recently, some studies suggested that, instead of referring to S and L standard representations stored in memory, participants who perform a bisection task could rather conceptualize the probe durations as either S or L based on a criterion they form with the progression of the task, trial by trial (for review, see Kopec and Brody, 2010). In that view, the involvement of long-term memory processes could be diminished. By contrast, the discrimination task is recognized to strongly rely upon working memory abilities (Mioni et al., 2013a). In that sense, participants must keep the first duration active while processing the second duration presented (Grondin, 2008, 2010).
When the emotions conveyed by the stimuli are manipulated (i.e., are not the same for the first and second pictures of a trial), participants must also switch back and forth between the affective loads associated with the two durations when comparing them. So, along with working memory capacities (maintenance and manipulation/updating), attentional resources (alertness, processing speed, selective, and sustained attention) and some executive functions (inhibition and cognitive flexibility) seem required when executing a discrimination task (Mioni et al., 2013a,b). In summary, the cognitive load related to the temporal discrimination task appears greater than the one required in the bisection task. Consequently, the influence of emotions on temporal processing could be reduced in the discrimination task (Gil and Droit-Volet, 2011b), decreasing, at the same time, the opportunity to detect differences between conditions and groups. This hypothesis makes sense considering that, in the present study, the discrimination task appeared to be significantly or marginally significantly correlated with more cognitive abilities than the bisection task, and the ED group, compared to the HC group, showed cognitive weaknesses in processing speed, attention, and executive functions. These weaknesses, which are in accordance with the literature (e.g., Steinglass and Glasofer, 2011; Jáuregui-Lobera, 2013; Weider et al., 2015), could have contributed, in the discrimination task, to decreasing the influence of emotions provoked by food pictures on time perception (see section "Non-temporal Factors Influencing Performance on Temporal Tasks" for a more detailed explanation).

Besides the inherent processes of the temporal paradigms, another factor, the duration of the stimuli used, could contribute to explaining the different results found for the bisection and the discrimination tasks. In fact, by analyzing affective reactions to emotional pictures, Codispoti et al. (2001) demonstrated that short presentations of unpleasant images resulted in less defensive activation (lower arousal) than sustained presentations of the same stimuli. In particular, the researchers showed that the longer a subject is exposed to emotional pictures, particularly fearful images, the stronger his/her affective reaction (engagement) to them is at a psychophysiological level. Because the discrimination task designed in the present study used short durations (400 and 482 ms) while the bisection task exploited a broader range of durations including longer presentations (400 to 1600 ms), the more sustained durations of pictures in the latter task could have favored the activation of the participants' defensive system by promoting a deeper affective impregnation. That impregnation, in turn, could have facilitated the production of time distortions.

### Different Methodological Choices: Different Results

A supplementary question emerging from the study is why the results for the HC group are not similar to those reported by Gil et al. (2009). More precisely, with a comparable bisection task (i.e., pleasant and disgusting food pictures, durations from 400 to 1600 ms), these authors showed an underestimation of the duration of food images, whatever their valence, compared to a neutral stimulus. They also demonstrated that the shortening effect was more marked for the disliked food pictures than for the liked ones. In our experiment, no time distortion was found for the HC group. These different results may stem from some specific methodological variations between the studies.

Firstly, all food pictures used by Gil et al. (2009) led to low arousal. Consequently, according to the view of Angrilli et al. (1997), time perception would then depend on controlled attentional mechanisms. Viewed from a pacemaker-accumulator framework (internal clock), that means that the more the stimuli captured the participants' attentional resources, the fewer pulses were accumulated in the counter, resulting in shorter perceived durations. Thus, Gil et al. (2009) explained their shortening effect by the capacity of food pictures to divert participants' attention from the passage of time. In the present study, all food pictures were high-arousing, so time perception did not depend mainly on the allocation of attention, but on the motivational-survival system: the more the stimuli represented a threat for participants, the more automatically and quickly their durations were processed. As the survival of HC does not appear to be compromised by food pictures, they did not alter their ability to track the passage of time. Secondly, Gil et al. (2009) used a picture of a white oval as a neutral stimulus, whereas the present study used an object picture. The use of an image of a geometric figure, a stimulus different from a photograph, may have induced a bias in the processing of time information associated with it. In that sense, food pictures of Gil et al. (2009) appeared more visually complex than their neutral stimulus (e.g., 3D vs. 2D, plate with food on it vs. empty oval, multicolored item vs. single-colored item) and, for that reason, they probably captured the participants' attention. So, time distortions induced by the direction of subjects' attention to food pictures could be due to the visual properties of these stimuli, not to their emotional nature per se. In our experiment, food pictures were more comparable to the neutral stimulus used (e.g., both were photos of 3D items, food and object images had similar sizes).
As a result, an attentional bias toward them (and time distortions deriving from it) was less probable. In the same vein, because HC participants in the study of Gil et al. (2009) were told not to eat for 1 h before the testing phase, an extra attentional bias in favor of food pictures could have been induced by their non-sated state. In fact, hunger modulates attention to food-associated cues by producing an approach reaction (Loeber et al., 2013). Specifically, attentional capture by food pictures is more marked for people who are not sated than for people who are (Lavy and van den Hout, 1993; Stockburger et al., 2008, 2009; Siep et al., 2009; Piech et al., 2010). Therefore, in Gil et al. (2009), the direction of participants' attention toward food pictures and the time distortions associated with them could have been generated by the degree of satiation, not by emotions provoked by food pictures. By contrast, in the present research, participants were told to eat in the hour preceding the experiment and were tested at fixed moments, just after mealtimes, to reduce and control their hunger. Thus, it was less probable that they felt hungry (as indicated by the low scores on the Likert scales completed before the temporal tasks) and had an approach reaction toward food pictures due to a non-sated state.
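The pacemaker-accumulator reasoning invoked above can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not a model fitted in the present study or in Gil et al. (2009): the function name, the base pulse rate, and the `arousal_gain` and `attention_share` parameters are all hypothetical. Arousal is assumed to multiply the pacemaker rate, and attention is assumed to determine the share of emitted pulses that reach the counter.

```python
def perceived_duration_ms(physical_ms, arousal_gain=1.0, attention_share=1.0,
                          base_rate_hz=50.0):
    """Toy pacemaker-accumulator: return the perceived duration in ms.

    All parameters are hypothetical illustration values.
    arousal_gain    -- multiplier on the pacemaker rate (>1 means a faster clock)
    attention_share -- fraction of emitted pulses reaching the counter (0..1)
    """
    # Pulses emitted by the pacemaker during the physical interval.
    emitted = base_rate_hz * arousal_gain * physical_ms / 1000.0
    # Only pulses that pass the attentional gate are accumulated.
    accumulated = emitted * attention_share
    # The pulse count is read out against the baseline rate.
    return 1000.0 * accumulated / base_rate_hz


neutral = perceived_duration_ms(800)
threatening = perceived_duration_ms(800, arousal_gain=1.2)     # faster pacemaker
distracting = perceived_duration_ms(800, attention_share=0.8)  # pulses lost
print(neutral, threatening, distracting)
```

Under these assumptions, a high-arousal stimulus yields a longer perceived duration than a neutral one, whereas a stimulus that diverts attention from time yields a shorter one, which is the contrast drawn between the two studies.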

Another methodological difference between the study of Gil et al. (2009) and the present one is that women and men participated in the former, while only women took part in the latter. Knowing that men show happier responses to food images than women (McNamara et al., 2008b), and considering that in low-arousing conditions, positive valence stimuli lead to an underestimation of time (Angrilli et al., 1997), the presence of men in Gil et al. (2009) could have contributed to the shortening effect observed. Conversely, because the present experiment was conducted with only women, a positive reaction toward food pictures was less plausible, especially considering that they were not hungry. Finally, Gil et al. (2009) did not document the affective state and the cognitive abilities of their HC participants, although it has been shown that these aspects can influence time perception (for review, see Droit-Volet, 2013; Teixeira et al., 2013; see also Tipples, 2008; Mioni et al., 2016b). Nor did they evaluate whether their participants presented ED-like concerns, attitudes and behaviors, which is an essential point to consider when working with food pictures and exploring emotions induced by them. Thus, it cannot be excluded that the performance on temporal tasks of the participants in Gil et al. (2009) was modulated by these factors.

### Non-temporal Factors Influencing Performance on Temporal Tasks

The second aim of the study was to identify and better understand the non-temporal factors that could help explain the performance on the bisection and discrimination tasks. For that purpose, correlation analyses were run on the entire sample to estimate the relation between the women's performance on temporal tasks, their clinical characteristics (i.e., BMI, ED features, current mood, depressive, and anxiety symptoms), their hunger level and their cognitive abilities.

In the bisection task, the correlation analyses revealed that the bisection point (BP) was not associated with any of the variables tested. Similarly, the sensitivity to time (WR) was not significantly related to participants' hunger, current mood and depressive or anxiety symptoms. However, it was correlated with BMI. Concerning the cognitive functions involved in the task, it appears that the WR was moderately associated with spatial (maintenance component) and verbal (manipulation component) working memory, processing speed, stability of alertness and vigilance. In other words, participants' time sensitivity improved with these cognitive processes. Besides, the present study showed that the ED and HC groups differed on WR for the neutral stimulus, with women suffering from an ED having a higher WR (poorer sensitivity) than healthy women. The experiment also demonstrated that the two groups differed on all cognitive variables significantly correlated with temporal sensitivity, except working memory. Interestingly, when the measures related to the WR were statistically controlled for, the ED vs. HC difference disappeared. Thus, it cannot be excluded that, in the bisection task, the weaker temporal sensitivity of women with ED, in comparison to HC, was accounted for (or explained) by their BMI and their cognitive weaknesses.
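To make the two bisection indices discussed above concrete, the following minimal sketch derives a BP and a WR from the proportion of "long" responses. It rests on assumptions that are not taken from the present study: the seven probe durations and the p(long) values are hypothetical data, the indices are read off by linear interpolation between adjacent points, and the WR follows one common convention (half the difference between the durations giving 75% and 25% "long" responses, divided by the BP). The function names are illustrative only.

```python
def duration_at(p_target, durations, p_long):
    """Linearly interpolate the duration at which p(long) reaches p_target."""
    points = list(zip(durations, p_long))
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        if p0 <= p_target <= p1 and p1 > p0:
            return t0 + (p_target - p0) * (t1 - t0) / (p1 - p0)
    raise ValueError("p_target is not bracketed by the data")


def bisection_indices(durations, p_long):
    """Return (BP, WR) under the conventions assumed in the lead-in."""
    bp = duration_at(0.50, durations, p_long)  # bisection point
    # Difference limen: half the 25%-75% spread of the psychophysical function.
    dl = (duration_at(0.75, durations, p_long)
          - duration_at(0.25, durations, p_long)) / 2
    return bp, dl / bp  # Weber ratio = difference limen / bisection point


# Hypothetical proportions of "long" responses for probes from 400 to 1600 ms.
durations = [400, 600, 800, 1000, 1200, 1400, 1600]
p_long = [0.05, 0.15, 0.35, 0.60, 0.80, 0.92, 0.98]

bp, wr = bisection_indices(durations, p_long)
print(f"BP = {bp:.0f} ms, WR = {wr:.3f}")
```

With indices defined this way, a lower BP corresponds to a leftward shift of the psychophysical function (durations judged as longer), and a higher WR corresponds to a flatter function (poorer sensitivity to time).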

In addition, correlation analyses were performed for the discrimination task. The perceived duration of stimuli (p[long]) was related to only one measure: the stability of alertness capacity. The temporal sensitivity of participants (p[correct]) was positively associated with many cognitive processes, that is, selective attention (visual scanning), sustained attention, alertness (constancy), and vigilance. A notable positive trend was also observed for verbal working memory (manipulation component), processing speed and reaction time (alertness level). Concerning hunger, BMI and the clinical characteristics of the women, only restrained eating and ED symptoms, as assessed by the RRS and the EDE-Q, were marginally and negatively related to time sensitivity. Once again, the fact that the ED and HC groups differed on most variables correlated with p[correct] seems to have helped explain why their performance on the task, when considering this index, was dissimilar. Indeed, controlling for measures significantly correlated with p[correct], namely their cognitive weaknesses, showed that ED and HC women no longer differed in their time sensitivity. Accordingly, the lower temporal discrimination level of participants suffering from an ED seemed to be due to the symptoms related to their diagnosis (i.e., restraint and concern) and, especially, to their cognitive weaknesses.

Taken together, the results of the correlation analyses suggest that perceived duration (and time distortions), as reflected by the BP shift, is not associated with variations in BMI, ED restraint and concern, premorbid affective state, level of appetite and cognitive abilities per se. Something rather specific to the processing of temporal information at a psychophysiological level, such as the running of the internal clock and its modulation by biological aspects (e.g., arousal level, defensive reaction), seems to influence the perceived duration of stimuli and cause time distortions. The p[long] appears to be related to the stability of participants' alertness, but this relation is likely not specific to perceived duration. It probably results from the higher cognitive load required by the discrimination task rather than from alertness in particular. Conversely, temporal sensitivity (i.e., WR, p[correct]) appears to be correlated with and affected by non-temporal factors (in other words, by factors other than those affecting the rhythm of the pacemaker) such as ED symptoms and cognitive acuity. Consequently, the influence of these non-temporal factors on sensitivity to time is particularly important to consider, as it was demonstrated by two different tasks and corroborates the results of earlier studies showing the strong involvement of attention, processing speed and working memory in time perception (for example, see Zélanti and Droit-Volet, 2011, 2012; Brown et al., 2013; Mioni et al., 2013a,b; Ogden et al., 2014; Droit-Volet et al., 2015). Furthermore, the ED vs. HC difference for the discrimination level disappeared when the cognitive weaknesses of women with ED were controlled for.

Finally, a detailed analysis of cognitive measures correlated with WR and p[correct], compared to those related to BP and p[long], may support the hypothesis of a heavier cognitive load required by the discrimination task than by the bisection one. In that sense, the correlation analyses showed that, in both tasks, the temporal sensitivity was associated with working memory, processing speed, constancy of alertness and vigilance. However, the discrimination task was also moderately-to-strongly linked to selective attention, sustained attention and reaction time (alertness level), three important cognitive processes that were not correlated (significantly or marginally significantly) with the bisection task. Therefore, by having a supplementary and heavier attentional component, the duration discrimination task seems to be more demanding than the bisection one, and so it could be more difficult to perceive the influence of emotions on the processing of time when using it (Gil and Droit-Volet, 2011b). Such a more intense cognitive load could thus explain the difference in results between the discrimination and the bisection tasks on perceived duration, that is, the detection of temporal distortions in the latter only. More precisely, as the bisection task seems to require fewer cognitive resources than the discrimination one, the effects of the emotions induced by food pictures were probably more easily detected. In other words, in that task, the participants were not "cognitively saturated": some resources were still available for reacting to and processing the emotional value of the stimuli. Along this line of thought, the absence of difference on the p[long] between the ED and HC groups in the discrimination task could come from the higher cognitive load associated with it, diminishing the possibility for the emotions to notably influence the perception of time, and then to cause temporal distortions. Moreover, if we consider that ED participants, in comparison to HC women, showed cognitive weaknesses in attention, processing speed, working memory, and executive functions, we can believe that they were even more "cognitively busy" with the discrimination procedure, and consequently, less likely to be affected by the emotions generated by food pictures. So, in that specific task, the detection of significant differences between food and object pictures for those participants was unlikely.
Finally, because the ED subgroups were comparable on most of the cognitive measures used (AN women only tended to show a lower verbal working memory capacity than BN women on the maintenance component), neither was in a better "cognitive state" to allow more resources for the completion of the discrimination task. Along these lines, for both subgroups of ED, due to their similar cognitive profile, there was a limited window of opportunity for the emotion factor to influence the processing of temporal information. Consequently, no difference between emotional conditions and ED subgroups was observed.

### Limitations and Strengths of the Study

The generalization of the results of the present study may be limited by the sample size and the fact that women of the ED group were not critically ill, as revealed by their outpatient status and their BMI in the normal range. In fact, there was no difference in BMI between ED and HC participants. However, that is far from signifying that there were no notable symptoms of ED in the ED group. In that sense, the results on the EDE-Q and RRS clearly demonstrated that the ED group presented significant symptoms of ED, and higher ED symptoms than the HC group. The absence of differences in BMI could be explained by three elements. The first is that, when AN and BN subgroups were pooled together in the ED group, the BMI of women with BN seems to have inflated the mean BMI of the ED group. The second element is that HC participants were young university students, whereas BMI often increases with age (for example, see Williamson, 1993; Drøyvold et al., 2006; Reas et al., 2007). The third factor is the outpatient and community status of participants with an ED. In that sense, a very low BMI, which frequently occurs with serious medical complications, leads to immediate medical support such as hospitalization (Hay et al., 2014; Hilbert et al., 2017). Consequently, women suffering from an ED who receive outpatient services or who continue their daily activities, like attending their university classes, often do not present a critical health status (i.e., their BMI is much closer to the normal range).

To the best of our knowledge, the present experiment is the first one investigating time perception in women suffering from an ED. This research is also innovative in its adoption of a temporal perspective for studying the emotions elicited by food pictures in people with AN and BN. In fact, studying time distortions caused by the effects of emotions on time processing allowed us to better understand, in a subtle manner, how women with an ED react to food. In other words, this was done without relying on traditional self-reported procedures, which are regularly argued to be associated with important biases blurring conclusions (e.g., social desirability, low emotional awareness), and without relying on invasive psychophysiological techniques. In light of that, time perception could be considered as a new clinical tool for investigating, in an indirect but efficient way, the emotions of AN women toward food products. This way of approaching the problem was shown to be efficient, even with patients who normally show much resistance when invited to share information about their illness or to verbalize their affective state. In addition, by using two different temporal tasks (bisection and discrimination), which led to different results in terms of time distortions (i.e., presence vs. absence), this experiment supports the growing literature showing that different temporal paradigms lead to different results (for example, see Gil and Droit-Volet, 2011b; Mioni et al., 2013a, 2014; Jisha and Thomas, 2015). It also offers further insight into the complexities of time perception and the specificities of the methods used to examine it. Following this line of thought, in further research, it could be interesting to explore the latent decision processes accompanying the responses of participants on temporal tasks.
More precisely, some recent studies examine the response times of subjects, as a complement to their choice proportions (e.g., p[long]), by using drift-diffusion modeling (for relevant experiments, see Balci and Simen, 2014; Tipples, 2015; and for theoretical aspects, see Luzardo et al., 2013; Ratcliff, 2014; Vandierendonck, 2017). This perspective gives access to additional material regarding the cognitive processing of temporal information and the dynamic influence of emotions on duration judgements, information that is difficult to detect if only choice proportions are analyzed. Such an approach would require new data, however, as participants in the present study were not required to respond as quickly as possible. Finally, by adding affective and neuropsychological measures to temporal tasks, the present experiment contributes to a better understanding of how non-temporal factors influence (or not) perceived duration and temporal sensitivity.
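As a rough illustration of what a drift-diffusion analysis adds to choice proportions, the sketch below simulates a generic two-boundary diffusion process in which each trial yields both a "short"/"long" response and a response time. It is a textbook-style simulation under assumed parameters, not the specific models used in the studies cited above; the function name and every parameter value (drift, boundary, noise, time step, non-decision time) are hypothetical.

```python
import random


def simulate_ddm_trial(drift, boundary=1.0, noise=1.0, dt=0.001,
                       non_decision_s=0.3, rng=random):
    """Simulate one two-boundary drift-diffusion trial.

    Returns (choice, rt_s): choice is "long" if the upper boundary is hit
    and "short" if the lower boundary is hit. All values are hypothetical.
    """
    evidence, elapsed = 0.0, 0.0
    step_sd = noise * dt ** 0.5  # standard deviation of one Euler step
    while abs(evidence) < boundary:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
    choice = "long" if evidence >= boundary else "short"
    return choice, elapsed + non_decision_s


rng = random.Random(1)  # fixed seed so the simulation is reproducible
trials = [simulate_ddm_trial(drift=1.0, rng=rng) for _ in range(200)]
p_long = sum(choice == "long" for choice, _ in trials) / len(trials)
mean_rt = sum(rt for _, rt in trials) / len(trials)
print(f"p(long) = {p_long:.2f}, mean RT = {mean_rt:.2f} s")
```

In such a framework, two conditions could produce similar values of p[long] while differing in their response-time distributions, which is why speeded responses would need to be collected.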

### CONCLUSION

The present study showed that women suffering from AN presented an intense reaction of fear when they were confronted with food pictures, regardless of the emotion associated with said pictures (i.e., joy and disgust). From a temporal processing perspective, this emotional reaction was suggested by the overestimation of durations of food stimuli in comparison to object ones, due to a rise of arousal and the activation of the defensive system. The lengthened-duration effect observed in AN women was not detected in participants with BN, suggesting, for this ED subgroup, a less negative emotional reaction toward food, and even the preservation of a hedonic response to it. Thus, for women suffering from BN, food seems to have kept some appetitive and motivational properties. Taken together, these results support the idea of a distinctive functioning of the reward pathway in AN and BN (for example, see Davis and Woodside, 2002; Keating, 2010; Keating et al., 2012), and consolidate the thesis of etiological and symptom similarities between AN and anxiety disorders (e.g., Pallister and Waller, 2008; Steinglass et al., 2011; Guarda et al., 2015; Kesby et al., 2017). Therefore, the elaboration of treatment strategies addressing these aspects seems essential to generate positive and long-lasting changes in women with an ED.

### REFERENCES


### AUTHOR CONTRIBUTIONS

CG, SG, and CB designed the study, and CG ran it. CG and VL conducted the statistical data analyses. CG wrote the first draft of the manuscript. All authors contributed equally to the revision of the paper and then approved the final version of the manuscript.

### FUNDING

CG was the recipient of scholarships from the Canadian Institutes of Health Research (CIHR) and the Fonds de Recherche du Québec – Santé (FRQS). This study was also supported by research grants from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Social Sciences and Humanities Research Council (SSHRC) awarded to SG.

### ACKNOWLEDGMENTS

The authors would like to thank the INAF of Université Laval, especially Marie-Ève Labonté, Ph.D., nutritionist, for sharing the food pictures that were used in the temporal tasks as joyful stimuli. They would also like to thank Sylvie Rousset, Ph.D., and Sylvie Droit-Volet, Ph.D., for sharing the food photographs exploited in their own experiments and from which some of the disgusting food stimuli used in the present study were created. Moreover, the authors are grateful to the neuropsychologists Karine Morasse, Ph.D., and Catherine Lehoux, Ph.D., for their help with the selection of the neuropsychological measures used.



of food in patients affected by eating disorders. Ann. Gen. Psychiatry 9:30. doi: 10.1186/1744-859X-9-30



orbitofrontal cortex. Behav. Brain Res. 198, 149–158. doi: 10.1016/j.bbr.2008.10.035


in eating disorders. Am. J. Psychiatry 161, 1238–1246. doi: 10.1176/appi.ajp.161.7.1238


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Gagnon, Bégin, Laflamme and Grondin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Time Perception and the Experience of Time When Immersed in an Altered Sensory Environment

Joseph Glicksohn<sup>1,2</sup>\*, Aviva Berkovich-Ohana<sup>3</sup>, Federica Mauro<sup>4</sup> and Tal D. Ben-Soussan<sup>5</sup>

<sup>1</sup> Department of Criminology, Bar-Ilan University, Ramat Gan, Israel, <sup>2</sup> The Leslie and Susan Gonda (Goldschmied) Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat Gan, Israel, <sup>3</sup> The Edmond J. Safra Brain Research Center for the Study of Learning Disabilities, Faculty of Education, University of Haifa, Haifa, Israel, <sup>4</sup> Department of Psychology, Sapienza University of Rome, Rome, Italy, <sup>5</sup> Research Institute for Neuroscience, Education and Didactics, Patrizio Paoletti Foundation for Development and Communication, Assisi, Italy

The notion that exposure to a monotonous sensory environment could elicit reports indicating aberrant subjective experience and altered time perception is the impetus for the present report. Research has looked at the influence of exposure to such environments on time perception, reporting that the greater the environmental variation, the shorter is the time estimation obtained by the method of production. Most conditions for creating an altered sensory environment, however, have not facilitated an immersive experience, one that directly impacts both time perception and subjective experience. In this study, we invited our participants to enter a whole-body altered sensory environment for a 20-min session, wherein they were asked to relax without falling asleep. The session included white-colored illumination of the chamber with eyes closed (5 min), followed by 10 min of illuminating the room with color, after which a short report of subjective experience was collected using a brief questionnaire; this was followed by an additional 5 min of immersion in white light with closed eyes. The participants were then interviewed regarding their subjective experience, including their experience of time within the chamber. Prior to entering the chamber, the participants completed a time-production (TP) task. One group of participants then repeated the task within the chamber, at the end of the session; a second group of participants repeated the task after exiting the chamber. We shall report on changes in TP, and present data indicating that when produced time is plotted as a function of target duration, using a log–log plot, the major influence of sensory environment is on the intercept of the psychophysical function. 
We shall further present data indicating that for those participants reporting a marked change in time experience, such as "the sensation of time disappeared," their TP data could not be linearized using a log–log plot, hence indicating that for these individuals there might be a "break" in the psychophysical function.

Keywords: time perception, sensory environment, whole-body perceptual deprivation, Ganzfeld, time production

#### Edited by:

Felipe Ortuño, Universidad de Navarra, Spain

#### Reviewed by:

Mariella Pazzaglia, Sapienza Università di Roma, Italy

Chie Takahashi, University of Birmingham, United Kingdom

> \*Correspondence: Joseph Glicksohn jglick@post.bgu.ac.il

Received: 15 June 2017 Accepted: 21 September 2017 Published: 06 October 2017

#### Citation:

Glicksohn J, Berkovich-Ohana A, Mauro F and Ben-Soussan TD (2017) Time Perception and the Experience of Time When Immersed in an Altered Sensory Environment. Front. Hum. Neurosci. 11:487. doi: 10.3389/fnhum.2017.00487

## INTRODUCTION

Exposure to an altered sensory environment, such as that entailing what Marcusson-Clavertz et al. (2012) have termed a "sensory homogenization procedure," has a marked impact on subjective experience (Glicksohn, 1991; Wackermann et al., 2008). Sensory homogenization is achieved by means of a Ganzfeld (homogeneous perceptual field), which can be experienced if one is placed ". . .in the midst of an actual fog which would be perfectly evenly illuminated" (Koffka, 1935, p. 111). As Koffka, citing the pivotal experimental study by Metzger, writes, "the observer will 'feel himself swimming in a mist of light. . .'." (Koffka, 1935, p. 111). Koffka (1935, p. 114) expands on Metzger's procedure for creating such a Ganzfeld in the lab: "The observer sat in front of a carefully whitewashed wall. . .at a distance of 1.25 m. . . . wings bent towards the observer had to be added on all four sides, care being taken that the inhomogeneities thereby introduced were as small as possible . . . The illumination was supplied by a projection lantern . . . ." Ash (1995, p. 230) provides a photograph of this Ganzfeld setup.

A more convenient solution to creating a Ganzfeld is to employ halved ping-pong balls covering the eyes, coupled with exposure to red-colored stimulation, as pioneered by Hochberg et al. (1951) – this being the technique that we employed in previous research (Glicksohn, 1991, 1992; see also Wackermann et al., 2002, 2008). Yet, as Avant (1965, p. 249) correctly asserted, "It is highly likely that this technique produces a different field from that produced by a larger stimulus field at a greater distance from the eyes." All such Ganzfeld techniques entail pattern reduction, monotony, homogeneity and perceptual deprivation (Suedfeld, 1980, pp. 8–9).

The notion that exposure to a monotonous sensory environment could elicit reports indicating aberrant subjective experience (Marks et al., 1968; Mason and Brady, 2009; Daniel et al., 2014; Daniel and Mason, 2015) presents an interesting inroad into studying the relationship between such aberrant subjective experience and altered time perception. For example, Niedenthal (2002, p. 253) reports that "some visitors to a Turrell Ganzfeld installation at the Stedelijk museum in Amsterdam felt so disembodied they had to crawl through the space on hands and knees . . . ." And Gadassik (2016, p. 309), reporting on her own experience within a Turrell Ganzfeld setup, writes: "My ten-minute Hard program inside Light Reignfall gave me the impression of lasting two minutes, and when my reclining tray was rolled out I worried I had pressed the panic button by mistake. One of the lab coat assistants assured me that almost everyone experiences the Hard program at half the time or less of its measurable duration."

Research looking at the influence on time perception of exposure to such altered sensory environments indicates that the greater the environmental variation, the shorter is the time estimation obtained by the method of production; hence, exposure to a monotonous sensory environment should result in a lengthening of time production (TP; Glicksohn, 1992, 1996). This is clearly not because such environments are monotonous, hence are inevitably boring, for if this condition were boring, then one would expect a shortening of TP (Doob, 1971, p. 292; Zakay, 2014, p. 3). We stress that most conditions for creating an altered sensory environment have not facilitated an immersive experience, one that directly affects both time perception and subjective experience (Glicksohn and Berkovich-Ohana, 2012). In the present study, we use a unique environment of Whole-Body Perceptual Deprivation (WBPD; see **Figure 1**) to enable us to investigate the relationship between aberrant subjective experience and time perception.

FIGURE 1 | Whole-Body Perceptual Deprivation (WBPD). The person in the photo has volunteered to be photographed for the illustration of the WBPD paradigm, and did not take part in the current research.

Our TP task requires the participant to produce a duration (P) by signaling when the target duration (T) is thought to have elapsed. For example, if the required duration to be estimated is 8 s, individual **A** might produce a duration of 8 s, individual **B** one of 10 s and individual **C** one of 6 s. Note that for all three individuals, produced duration (P) is subjectively viewed as lasting 8 s (T). Individual **A** exhibits veridical time perception (i.e., P = T; 1 subjective second = 1 s). Individual **B** would be viewed as having a slower internal clock (P > T), and individual **C** would be viewed as having a faster internal clock (P < T). The power function relating P to T (Glicksohn and Hadad, 2012) is given by: P = aT<sup>β</sup>, subsequently linearized as log(P) = log(a) + βlog(T) = α + βlog(T). For individual **A**, α = 0, and β = 1. When α ≠ 0, there is a consistent bias in producing durations; when β ≠ 1, then the untransformed data are not consistent with a linear function. We compute individual regressions of P on T, after log transformation, and thereby derive individual estimates for the two parameters, the intercept (α) and the slope (β) of this psychophysical function. In this study, as in previous studies (Glicksohn, 1992, 1996), we employ short target durations of 4, 8, 16, and 32 s. The TP task is not overly demanding, and is completed in less than 5 min. In previous reports employing a Ganzfeld, this task with these durations was found to reflect both the influence of the altered sensory environment (Glicksohn, 1992), and that of the participant's personality interacting with such an environment (Glicksohn, 1996). In both cases, it is the intercept of the function which reveals both state and trait effects. In more recent work, we have shown how the same task also reveals the influence of trait mindfulness on time perception using experienced practitioners of mindfulness meditation (Berkovich-Ohana et al., 2012).
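As a worked illustration of the intercept, assume (purely for the sake of the example) that β = 1 for all three individuals and that logarithms are taken to base 2. Under these assumptions, the single 8-s production fixes the intercept:

$$
\alpha = \log_2 P - \beta \log_2 T = \log_2\!\left(\frac{P}{8}\right)
$$

$$
\alpha_A = \log_2\!\left(\tfrac{8}{8}\right) = 0, \qquad
\alpha_B = \log_2\!\left(\tfrac{10}{8}\right) \approx 0.32, \qquad
\alpha_C = \log_2\!\left(\tfrac{6}{8}\right) \approx -0.42
$$

On this reading, a positive intercept corresponds to consistent overproduction (individual **B**), and a negative intercept to consistent underproduction (individual **C**). In practice both α and β are estimated from all four target durations, so these values are illustrative only.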

The current study improves on our earlier explorations in three notable ways. First, we employ a total whole-body immersive Ganzfeld coupled with both red and indigo-colored stimulation. The effect on time perception of exposure to such colored light has generated its own literature (Küller and Mikellides, 1983; Caldwell and Jones, 1985; Huang et al., 2012; Shibasaki and Masataka, 2014). We note that while Caldwell and Jones (1985) did not find a consistent effect of red versus blue light in their TP data, they do acknowledge that this might be due to the very short duration of exposure to such colored light (45 s). Indeed, exposure to colored light for such a short duration will not facilitate an immersive experience. In contrast, the Huang et al. (2012) study, allowing for exposure time in excess of 30 min to each colored light (red, blue, green), each on separate days, most certainly enabled an immersive experience. They reported a consistent shortening of produced duration when exposed to red colored light in their TP task, employing target durations of 180 s (they also employed a target duration of 600 s). To our mind, the use of such lengthy target durations is less informative than is our own use of different short target intervals, and this is on three counts. First, when employing lengthy target durations, in excess of what seems to be a maximal duration for time perception of around 100 s (Wackermann, 2007, p. 26), the very notion of time perception is compromised. Second, the use of a number of short target intervals, in contrast to one long target duration, enables the investigation of the psychophysical function for time perception, which is preferable to a focus on a single duration (Eisler, 1996, p. 67). Third, a lengthy target duration used for a TP task must surely be overtly disruptive of the effects of WBPD on the participant, because the participant is involved more in the TP task and less in the ongoing experience (Glicksohn, 2001a, p. 350). 
Given that in the present study, our participants are immersed in such colored light, for a period of 5 min for each color, we thereby enable better conditions for investigating such effects, on two counts. First, a period of 5 min exposure to such an altered sensory environment is ample time to enable an immersive experience (Hochberg et al., 1951, p. 155). Second, such exposure to each of two colored lights will enable us to investigate whether there is such a difference between the arousal potential of red and blue light (Caldwell and Jones, 1985).

Our second improvement on our earlier explorations is that instead of employing these four time intervals, each to be produced once (Glicksohn, 1992, 1996; Lipperman-Kreda and Glicksohn, 2006; Glicksohn and Hadad, 2012), we employ the same intervals, in two separate series. This is in line with other papers (Glicksohn et al., 2009; Berkovich-Ohana et al., 2012), hence providing us with additional data for assessing the hypothesized lengthening of TP.

Third, our participants are all experienced practitioners of breathing meditation. What we gain from this is the distinct possibility of obtaining a more informative phenomenological report of both subjective experience and of temporal experience. Such proficient meditators have been reported to be more introspectively accurate than are novices (Lutz et al., 2007), including regarding their bodily sensations (Fox et al., 2012), and to have an enhanced ability to sustain attention (Lutz et al., 2009). The downside of this prior extensive experience with meditation should also, however, be noted, and this is in two areas. First, while in the WBPD chamber, even if they are just "observing" or "resting," they are also probably entering into their meditative mode (Tei et al., 2009, p. 163). While this is not at all detrimental to the goals of this study, which focuses on temporal experience, this should, nevertheless, be noted from the start. Second, and of more relevance to our working hypothesis, that exposure to – in fact, immersion in – WBPD should result in a lengthening of TP, is that such experienced practitioners of meditation should characteristically exhibit such a lengthening of TP (Kramer et al., 2013; Droit-Volet et al., 2015). Nevertheless, given that this is a within-subject design, we are looking at such a lengthening of produced duration post- relative to pre-exposure to WBPD.

## MATERIALS AND METHODS

### Whole-Body Perceptual Deprivation (WBPD)

The WBPD chamber is in the shape of an egg (**Figure 1**), created by Patrizio Paoletti (Paoletti et al., 2017), and is located in the Cognitive Neurophysiology Laboratory, at the Research Institute for Neuroscience, Fondazione Patrizio Paoletti, Assisi, Italy. Two WBPD chambers were used. The first WBPD chamber had a diameter of 3 m and a height of 3.5 m, and opened and closed its top electronically. Following the translocation of the lab, a second WBPD chamber was utilized, having a diameter of 1.7 m and a height of 2.22 m; this chamber opens and closes manually (for security reasons, to avoid problems in case of an earthquake). In both chambers, the participants could sit comfortably inside upon a chair. Instructions were given verbally; sounds were transmitted via concealed speakers. The chamber was first flooded with white light, followed by red light and indigo light (these two colored-light conditions were presented in a counterbalanced order across participants), enabling a totally immersive WBPD. The participant's verbal reports were heard through a microphone, and were recorded.

### Participants and General Procedure

The complete sample of this study comprised 32 participants, and included EEG recording (these EEG data will be presented elsewhere). All are experienced practitioners of breathing meditation, chosen to participate due to their enhanced introspective and reporting abilities. They were recruited from the Ideas – Knowledge of Excellence, International School of Self-Awareness<sup>1</sup>, and had been practicing breathing meditation for between 182 and 7280 h. They all completed a number of questionnaires prior to entering the WBPD chamber, which was illuminated with white light (5 min, eyes-closed condition). This was followed by red and indigo light, each presented for 5 min (eyes-open conditions). At the end of the session, the participants underwent an extensive interview. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the ethics committee of Bar-Ilan University. Unfortunately, we have missing TP data for 13 participants, for the following reasons: (1) a malfunction of the chamber for one participant (S13); (2) a problem in recording post-WBPD TP for six participants (P2, P6, P8, P10, S10, and S11); and (3) a problem in recording TP, both pre- and post-WBPD, for six participants (P11, P13, P16, S1, S2, and S4). Hence, only 19 participants (10 males and 9 females, whose age ranged between 27 and 66 years) provided complete data (both TP and verbal report), and it is their data which are presented here.

<sup>1</sup>http://schoolofselfawareness.org/index.php/pages/page/40

### Time Production

Prior to entering the chamber, the participants completed a TP task. One group of 16 participants (S1–S16) then repeated the task within the chamber, at the end of the session; a second group of 16 participants (P1–P16) repeated the task after exiting the chamber.

Four short durations of 4, 8, 16, and 32 s served for this TP task. The participants were required to remain with eyes closed while producing each of these target durations, pressing a finger button when they estimated that the time that passed following a "beep" sound equaled the target duration. Each target interval was produced twice, the target durations being presented in two series, each having a random order of the four target durations. Produced (P) and target (T) durations were both log-transformed (to base 2), thereby rendering a linear scale for both; for T, this scale ranges between 2 and 5, with a midpoint value of 3.5. Mean log(P) served as one dependent measure, having an expected value of 3.5. Log(P) was then regressed on log(T), providing for each participant two further dependent measures, an intercept value, and a slope value (**Table 1**), where the slope is equivalent to the exponent of the power function relating P to T.
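The derivation of these three dependent measures can be illustrated with a minimal Python sketch. It assumes that all productions supplied for a participant are pooled into a single ordinary least-squares regression; the text does not state whether the two series were pooled or averaged, so the pooling, the function name `tp_profile`, and the example values are illustrative assumptions rather than the authors' analysis code.

```python
import math


def tp_profile(targets, produced):
    """Return mean log2(P), intercept and slope of log2(P) regressed on log2(T).

    `targets` and `produced` are equal-length sequences of durations in seconds.
    Pooling all productions into one regression is an assumption of this sketch.
    """
    x = [math.log2(t) for t in targets]   # log2 target durations
    y = [math.log2(p) for p in produced]  # log2 produced durations
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # exponent (beta) of the power function
    intercept = mean_y - slope * mean_x    # alpha = log2(a)
    return {"mean_log_p": mean_y, "intercept": intercept, "slope": slope}


# Hypothetical veridical participant: each target produced exactly, in two series.
targets = [4, 8, 16, 32, 4, 8, 16, 32]
veridical = tp_profile(targets, targets)
```

For the hypothetical veridical participant, the sketch yields a mean log(P) of 3.5, an intercept of 0, and a slope of 1, matching the expected values given above.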

### The Semi-structured Interview

In the semi-structured interview, the participant was first asked to give a free description related to his/her experience. The participant was then asked to freely describe and also to rate on a one to nine scale (1 = low; 9 = high) his/her sense of: time, space, positive and negative emotions, bodily arousal, external and internal environment, metacognition and types of thoughts, concentration, insight, synesthesia, motor movements, sense of agency, and ownership inside the WBPD chamber. In addition, the participants were asked regarding thoughts about the past (memories), the future (concrete imagination), and momentary experiencing.

## RESULTS

### Mean Log(P)

Mean log(P) ranged between 1.39 and 4.19 (M = 3.21, SD = 0.67) prior to exposure to the WBPD, and between 1.29 and 4.40 (M = 3.29, SD = 0.64) following exposure to the WBPD. As can be deduced from the minimum values reported above, one individual (S3) produced time intervals indicative of an extremely fast internal clock. For example, for a target duration of 32 s, she produced pre-WBPD durations of 6.1 and 5.0 s, and post-WBPD durations of 6.2 and 6.0 s. We shall analyze the data both including her, and when removing her, to see whether her data had a marked effect.
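To place these productions on the scale used in the analysis (logarithms to base 2), note that the 32-s target corresponds to a value of 5, whereas the two pre-WBPD productions of S3 correspond to:

$$
\log_2 32 = 5, \qquad \log_2 6.1 \approx 2.61, \qquad \log_2 5.0 \approx 2.32
$$

Both productions thus fall more than two log units below the target value, that is, at less than one quarter of the target duration.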

The first question to consider is whether there is an increase in mean log(P) from the first series to the second series of TP estimates, either pre- or post-WBPD. For the first set of data (pre-WBPD), for our group of 19 participants there is such an increase [F(1,18) = 6.31, MSE = 0.06, p < 0.05], with mean log(P) increasing from 3.11 (SD = 0.69) to 3.31 (SD = 0.66), thus tending to veridical time perception. Similar values hold when the outlying individual was removed. For the second set of data (post-WBPD), this trend should be analyzed taking into account the difference in experimental protocol. When this difference in protocol is entered as a grouping factor in a two-way ANOVA, the Group [TP post-WBPD assessed within the chamber (n = 10), or after exiting the chamber (n = 9)] × Series (first, second) interaction is significant [F(1,17) = 20.20, MSE = 0.02, p < 0.0005]. This interaction indicates that for those participants who remained within the chamber while performing TP, mean log(P) increased from 3.03 (SD = 0.79) in the first series to 3.31 (SD = 0.82) in the second; this replicates the same trend seen pre-WBPD. In contrast, for those participants who exited the chamber, mean log(P) decreased from 3.47 (SD = 0.38) to 3.39 (SD = 0.42). Again, similar values hold when the outlying individual was removed.

Thus, mean log(P) presents the following curvilinear trend when assessed four times, for those who remained within the chamber: 2.82, 3.16, 3.03, and 3.31. Note specifically the decrease in value from time 2 to time 3, in opposition to the hypothesized lengthening of TP as a function of exposure to WBPD. For the second group of participants, who exited the chamber, mean log(P) presents the following trend: 3.44, 3.49, 3.47, and 3.39. Note specifically the stability in value from time 2 to time 3, followed by a reduction in value at time 4.

For those nine participants exiting the chamber, a Time (pre-WBPD, post-WBPD) × Series two-way ANOVA revealed that neither Series nor Time (nor their interaction) was significant, indicating that if there was an effect for WBPD on TP, as noted above looking at all four data points, this was "washed out" quite quickly. In contrast, for those 10 participants who remained within the chamber, there was a main effect for Series [F(1,9) = 18.16, MSE = 0.05, p < 0.005], but no interaction of Series with Time. For Time there was a suggestive trend, which on removal of the outlying individual approached significance [F(1,8) = 4.82, MSE = 0.08, p = 0.059], whereby TP pre-WBPD (M = 3.16, SD = 0.59) increased post-WBPD (M = 3.37, SD = 0.53), in line with the hypothesis. Clearly, the inclusion of the outlying individual masks this trend.

#### TABLE 1 | TP profile for each participant, both pre- and post-WBPD.

An asterisk indicates a significant (p < 0.05) effect.

The second question to consider is whether mean log(P) reveals a gender difference in TP, as has been previously reported (Glicksohn and Hadad, 2012). For the pre-WBPD data, the increase in mean log(P) from the first series to the second series of TP estimates is not moderated by Gender, neither is there a main effect for Gender, when employing a Gender × Series ANOVA. For the post-WBPD data, a Group × Gender × Series × Time ANOVA revealed no main effect and no interactions with Gender. Hence, gender is not a relevant factor in this study.

Our participants were first exposed to white light, followed by red and indigo light – with these two colored-light conditions being presented in a counterbalanced order across participants. Red light, compared to blue (indigo) light, has been argued to be more arousing, hence should speed up the internal clock (Glicksohn, 2001b), resulting in shorter TP (Huang et al., 2012). Hence, participants exposed to red-then-indigo illumination in the WBPD chamber should exhibit a lengthening of TP over time, while participants exposed to indigo-then-red illumination should exhibit comparatively shorter TP. One should also consider a potential gender difference here (Delay and Richardson, 1981; Shibasaki and Masataka, 2014). We ran a Color × Gender ANOVA on the difference score (post-WBPD – pre-WBPD), which uncovered a significant interaction [F(1,15) = 5.64, MSE = 0.07, p < 0.05], and no main effects. While our nine female participants showed practically no difference in TP score for either red-then-indigo illumination (M = 0.057, SD = 0.116, n = 4) or indigo-then-red illumination (M = −0.001, SD = 0.328, n = 5), our 10 male participants showed a marked increase in TP score for indigo-then-red illumination (M = 0.518, SD = 0.452, n = 3) compared to red-then-indigo illumination (M = −0.035, SD = 0.178, n = 7). An effect for color illumination only found for men has been previously reported (Shibasaki and Masataka, 2014); nevertheless, one would predict a decrease in TP score for indigo-then-red illumination, as noted above.

### Power Function

Log(P) was regressed on log(T), providing for each participant an intercept value and a slope value (**Table 1**). Given only four target durations, we consider an r<sup>2</sup> value ≥ 0.95 as supporting linearity, as in previous publications (Lipperman-Kreda and Glicksohn, 2006; Glicksohn and Hadad, 2012). Inspection of both the individual log–log plots and the individual r<sup>2</sup> values revealed that for pre-WBPD, the data of five individuals could not be considered to exhibit linearity following the log transformation. A similar inspection of the post-WBPD data revealed that for one of these individuals, this lack of linearity persisted. In addition, for two individuals, while their pre-WBPD data exhibited linearity following the log transformation, their post-WBPD data did not. We shall be analyzing the data for these seven individuals separately.
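The linearity criterion can be sketched in Python as follows. The sketch assumes that r<sup>2</sup> is the squared Pearson correlation between log<sub>2</sub>(T) and log<sub>2</sub>(P), which equals the coefficient of determination for a simple linear regression; the function names and the sample productions are hypothetical and are not taken from the study's data.

```python
import math


def r_squared_loglog(targets, produced):
    """Squared Pearson correlation between log2(T) and log2(P)."""
    x = [math.log2(t) for t in targets]
    y = [math.log2(p) for p in produced]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if syy == 0:
        # Identical productions for every target: r^2 is undefined.
        raise ValueError("produced durations show no variance")
    return (sxy * sxy) / (sxx * syy)


def shows_linearity(targets, produced, threshold=0.95):
    """Apply the r^2 >= 0.95 criterion described in the text."""
    return r_squared_loglog(targets, produced) >= threshold


# Hypothetical productions (seconds) for targets of 4, 8, 16 and 32 s.
targets = [4, 8, 16, 32]
orderly = [5.0, 10.0, 20.0, 40.0]   # constant ratio to target: a straight line in log-log space
erratic = [4.0, 12.0, 6.0, 20.0]    # non-monotonic productions
print(shows_linearity(targets, orderly), shows_linearity(targets, erratic))
```

With only four points per fit, a strict threshold of this kind guards against treating a noisy or discontinuous function as a power function.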

For the remaining 12 participants, Group × Time (pre-WBPD, post-WBPD) ANOVAs were run separately on the slope and the intercept. For the slope, there was both a main effect for Time [F(1,10) = 5.94, MSE = 0.04, p < 0.05] and also a Group × Time interaction [F(1,10) = 7.77, MSE = 0.04, p < 0.05]. As can be seen in **Figure 2A**, for those participants exiting the chamber there is practically no increase in slope due to WBPD, while for those participants who remained within the chamber, their pre-WBPD slope is, surprisingly, markedly lower than the other slope values. We have no ready explanation for this, and thus suggest treating the slope measure in this study with caution.

Turning to the intercept, we found both a main effect for Time [F(1,10) = 6.09, MSE = 0.05, p < 0.05] and a Group × Time interaction [F(1,10) = 5.63, MSE = 0.05, p < 0.05]. As can be seen in **Figure 2B**, for those participants exiting the chamber there is practically no decrease in intercept due to WBPD, while for those participants who remained within the chamber, their post-WBPD intercept is markedly lower. An increase in mean log(P) post-WBPD, even if only appearing as a trend in the data (see above), coupled with a lower intercept suggests an interaction between Time and Duration. In the next analysis, we looked at the individual log–log plots, both pre- and post-WBPD.

### Variability in Individual Power Functions

We uncovered four common profiles within our data, both for participants remaining within the chamber and for those exiting the chamber, with exemplars appearing in **Figure 3**. These four profiles indicate a discontinuity in psychophysical function (**Figure 3A**), signifying aberrant TP; an overlap of functions pre-WBPD and post-WBPD (**Figure 3B**), suggesting no clear influence of WBPD; longer productions post-WBPD, in line with our hypothesis (**Figure 3C**); and shorter productions post-WBPD, in contrast to our hypothesis (**Figure 3D**).

**Figure 3A** exemplifies the data of S15, who is one of three (S15, S5, and P5) of the seven individuals not exhibiting linearity following the log transformation (two of these three had remained within the chamber), together with the group means [±standard error (SE)], which appear in the upper curves. For all four curves appearing in this panel, we have used the "smooth curve" fit provided by KaleidaGraph software. What one notes here is that in spite of this lack of linearity, the post-WBPD data exhibit longer productions than the pre-WBPD data, in line with the hypothesis. An intriguing possibility is suggested by these data: Linearity for the first two data points, and linearity for the last two data points, with discontinuity of the function between these regions. The difference between these four linear functions would be primarily revealed in the intercepts.

**Figure 3B** exemplifies the data of S16, who is one of the remaining four (S16, S12, S14, and S6) of the seven individuals not exhibiting linearity following the log transformation (all four of whom had remained within the chamber). We also present the group means (±SE). What one notes here is that pre-WBPD and post-WBPD data are intertwined – with no clear dominant trend in the data.

**Figure 3C** exemplifies the data of S7, who is one of four (S7, S9, S3, and P3) individuals (three of these had remained within the chamber), for whom the post-WBPD function diverges from the pre-WBPD function primarily for the larger target durations, indicating longer productions post-WBPD, in line with the hypothesis. Their data are suggestive of the Duration × Time interaction noted above, wherein both intercept and slope change from pre- to post-WBPD. We also present the group means (±SE), supporting such an interaction. For both individual data and for group data, a linear fit in this log–log plot is clearly seen.

**Figure 3D** exemplifies the data for P1, who is one of six (P1, P7, P14, S8, P4, and P15) individuals (five of these had exited the chamber), for whom the post-WBPD data exhibit uniformly shorter productions, in opposition to the hypothesis. We also present the group means (±SE). The reverse pattern, whereby post-WBPD data exhibit uniformly longer productions, in line with the hypothesis, is found for the remaining two (P9 and P12) individuals, both of whom had also exited the chamber. For both individual data and for group data, a linear fit in this log–log plot is clearly seen.

### The Experience of Time

A question is raised as to the extent to which this variability in our TP data is related to variability in the experience of time while immersed in WBPD. We block our experiential data in line with the four major groups appearing in **Figure 3**.

Two participants whose TP data are exemplified by **Figure 3A**, who had remained within the chamber, reported the following in the interview at the end of the session: "There was no focus on [time]. The time dimension lost its meaning and significance" (S5); "It felt as if more time had passed. Time was expanded, I perceived more the passage of time; time passes" (S15). These reports do not indicate any apparent discontinuity in time perception, and in fact seem to be dissociated from TP performance.

Three other participants, who had also remained within the chamber, whose data are exemplified by **Figure 3B**, reported the following in the interview: "[Time] disappeared" (S12); "Slow, longer" (S14); and "No time" (S16). If the sensation of time "disappeared," then perhaps their TP performance should be haphazard, as seen in **Figure 3B**.

Three participants who had also remained within the chamber, whose data are exemplified by the type of interaction appearing in **Figure 3C**, reported the following: "Time was slower" (S7); "Expanded, flowing" (S9); and "The cognition of time after the lights, I didn't know what would have happened" (S3). If "time was slower," then one would expect to see this experience reflected in the TP data by longer productions, especially for longer target intervals, as can be seen in **Figure 3C**.

The majority of the participants presenting with TP data in opposition to the hypothesis of there being a lengthening of produced time had exited the chamber. The reports of three of these, whose data are exemplified by **Figure 3D**, are as follows: "Concerning time, time was not fast or slow, but I was firm with myself. There was absence of time, but I was setting my own time, for example with my breath, or my sensations, thus it was set on the present, it was just what it was" (P15); "It seemed that there was no time. I didn't think about it" (P7); and "Inside there I didn't have time perception, however when I did the exercise afterward, I realized that I had a more refined time perception . . . I didn't perceive time, it was expanded" (P4). It could well be that on exiting the chamber, post-WBPD TP performance became dissociated from the timelessness that these participants reported regarding their WBPD experience. Two other participants, who had also exited the chamber, present the reverse pattern to that seen in **Figure 3D**, in line with the hypothesis. This is what they reported: "Very expanded. I had the feeling to be in a not defined space, and also time was not defined. However, practical thought about what time it is, what I have to do later, set in" (P9); "It was very slow" (P12). If time was "very expanded" and/or "very slow," then one would expect to see this reflected in the TP task by longer productions, which was the case for these participants (even though they had exited the chamber).

## DISCUSSION

In the current study, we examined the effects of a total whole-body immersive Ganzfeld coupled with both red and indigo-colored stimulation on TP and temporal experience. Clearly, as Block (1979, p. 202) suggests, "it is quite reasonable to determine whether gross reductions in external stimulus information affect temporal experience." We are, of course, in full agreement with Morrison and Hunt (1996, p. 118) regarding the need "to turn to the content-analyzed interview when assessing subjective experience." In doing so, we find wide individual differences in both temporal experience and TP among participants immersed in our WBPD environment.

Our working hypothesis was that exposure to – in fact, immersion in – WBPD should result in a lengthening of TP. Our study was designed such that we could maximize this effect, for our participants were all experienced meditators (in the widest sense of the term). Such a population should exhibit a lengthening of TP during meditation (Glicksohn, 2001b), and also exhibit longer TP while not meditating, at baseline (Berkovich-Ohana et al., 2012). Consider the following: The Ganzfeld (and other conditions of restricted environmental stimulation) comprises ". . . an externally structured analog of meditation and similar states" (Suedfeld, 1980, p. 44); and the effects of a Turrell Ganzfeld have been ". . . frequently described as calming, relaxing, womblike, uplifting, meditative and so on" (Benson, 2001, p. 125). Conversely, "certain meditative practices . . . have perceptual and cognitive outcomes similar to sensory deprivation" (Lindahl et al., 2014). Hence, we have compatibility between trait and state in expecting such a lengthening of TP.

Not all our participants exhibit the hypothesized lengthening effect; some, in fact, exhibited shorter TP following WBPD. We find that for those participants whose data exhibited linearity in the log–log plot of produced duration to target duration, it was the intercept of this function which was the locus of the effect for WBPD, much as was reported in a previous study (Glicksohn, 1996) employing both altered sensory environments (including Ganzfeld) and TP (using the same target durations). While an increase in the intercept might be due to the repetition of the task, and not necessarily due to exposure to an altered sensory environment (Glicksohn, 1996, p. 368), here we note a marked decrease in intercept due to exposure to WBPD, for those participants who remained in the chamber. For those exiting the chamber, on the other hand, there was practically no decrease in intercept due to WBPD.

We also noted that for a number of participants not exhibiting such linearity in their data, the difference between the two functions describing their data would be revealed in the intercept. For them, their post-WBPD intercept is higher. It might well be that we have uncovered the same type of "break" in the psychophysical function, emphasized by Eisler (1990) using the method of reproduction. In our data, we noted such a discontinuity in function between 4 and 8 s – something never observed before using our TP task (but also never actively investigated before). A discontinuity in function above 4 s, while not quite conforming to the hypothesized 3-s "subjective" or "sensible" present (Pöppel, 1997; Wackermann, 2007), would nevertheless conform to the temporal location of the break observed by Eisler (1996). As Eisler (1996, p. 77) writes: "For durations below about 4 s on the average there seems to be no difference between male and female subjects. Longer durations, above the break, yield longer reproductions for female subjects." Furthermore, as Eisler et al. (2004, p. 265) have indicated, ". . . for almost all subjects the psychophysical function showed a break or discontinuity at different temporal locations for different individuals . . . ." The explanation suggested for finding the locus of the effect in the intercept of the function was that the intercept reflects some "bias or error in production" (Glicksohn, 1996, p. 367). Such an error aligns with the claim that the internal clock incorporates a "fallible" counter (Killeen and Taylor, 2000), which would further predict such discontinuities in the psychophysical function, as seen here for these three individuals.

In spite of the existence of these individual differences – or, better, because of these individual differences – we can make the following tentative claims. First, when "time disappeared," TP becomes haphazard. Second, when "time was slower" or "time was expanded," TP is lengthened. We have also learned that the effects of WBPD are not long-lasting: Participants who remained in the chamber tended to report time as being slower, and tended to exhibit a lengthening of TP, as hypothesized; participants who exited the chamber tended to exhibit shorter productions, in opposition to the hypothesis. One might question whether it would have been better to ask our participants to produce durations during WBPD, rather than following WBPD. One could argue either way: If TP reflects time-in-passing (Glicksohn, 2001b), then performing TP during WBPD would be more tightly related to temporal experience during WBPD. TP can, in fact, serve as a measure of mental workload (Zakay et al., 1999, pp. 568–570; Baldauf et al., 2009), and will fluctuate as one's level of vigilance changes – but that is exactly what should be happening in the Ganzfeld (Avant, 1965).

On the other hand, by performing TP during WBPD, using a task employing the production of multiple target durations, this might very well disrupt one's temporal experience, and one's subjective experience in general, which is of prime interest for studies of WBPD. In fact, any task might disrupt the effects of WBPD (Suedfeld, 1980, pp. 67–68; Glicksohn, 2001a, p. 350). Thus, TP following WBPD is not necessarily a limitation of the present study. This, however, is an issue worth considering in future studies in this domain.

Another point to consider is the fact that, as part of the structured interview, we had asked our participants to rate their sense of time on a scale from one to nine (1 = low; 9 = high). For the majority of our participants (n = 18), this was a nonsensical idea, and they could not make such a rating. Two gave a rating of "0" (not on the scale), one gave the verbal rating of "neutral," another gave a verbal rating of "medium," and the other participants gave a numerical rating. The astonishment at the very question expressed by a number of our participants brings to mind a comment made by Klüver (1926, p. 512): "A question of the experimenter concerning time was considered rather ridiculous. It seemed to me incommensurate to speak about the experienced abundance of phenomena in terms of minutes and hours."

A reviewer of this paper has astutely commented that in employing two different WBPD chambers, we might have influenced TP, because the second chamber was smaller than the first. We further note that one group of 16 participants (S1–S16) repeated the TP task within the larger chamber, at the end of the session, while the second group of 16 participants (P1–P16) repeated the task after exiting the smaller chamber. The literature bearing on the possible effect of WBPD chamber size on temporal experience is scant. DeLong (1981) found that observers of differently scaled environments (e.g., one-sixth of the full size of a room), who were asked to imagine themselves as being embedded within the scaled model, exhibited a TP of a target duration of 30 min that was proportional to the scale of the environment. More recently, Zäch and Brugger (2008) reported a study in which participants were asked to imagine, with eyes closed, a railway clock at either a close (30 cm) or a far (6 m) distance from them, focusing on the imagined movement of the clock's second hand. TP of 15 and 30 s was shorter for the clock imagined to be farther away. If chamber size did have an impact in this study, one would assume that TP would be shorter in the larger chamber. However, as we have stressed above, individual differences in both TP and temporal experience seem to predominate over other design factors in this study.

Finally, what bearing does the present study have for the study of time perception and clinical disorders? One can view this study in analogy to another research domain, presenting with similar reports of such aberrant temporal experience – depression. Consider the reports of some of our participants, indicating for them that "time was slower," that "time was expanded," or that "time disappeared." Would such reports be readily distinguished from reports made by depressed individuals that ". . . time seems to pass more slowly than usual or even stops" (Droit-Volet, 2013, p. 260), as also emphasized by other researchers looking at time perception and depression (Oberfeld et al., 2014, p. 1)? If "time was slower," then one would expect to see this experience reflected in the TP data by longer productions, especially for longer target intervals, as can be seen in the present study (**Figure 3C**). One would conclude that the internal clock in both the present Ganzfeld setup and in depression had slowed down (Glicksohn, 2001b). Indeed, this would be the same inference to be drawn regarding meditation (Glicksohn, 2001b; Wittmann and Schmidt, 2014; Wittmann et al., 2015).

And yet the study of time perception in depression is plagued by inconsistency, at a number of levels. Oberfeld et al. (2014, p. 4) assume that if ". . . altered time perception in depressive patients in terms of a faster running internal clock is true, we expect the subjects in the depressive group to underproduce . . . time intervals in the time production task. . . ." In contrast, we would argue for a slower internal clock, hence for longer TP. We make this claim, irrespective of actual findings in the literature, which are mixed (Droit-Volet, 2013; Thönes and Oberfeld, 2015; Mioni et al., 2016).

If the results of the present study do have relevance for the literature on time perception in depression, this would be in two ways. First, the locus of the effect should be found in the intercept of the psychophysical function, when produced time is plotted as a function of target duration, using a log–log plot, as in the present study. Preliminary support for this is clearly seen in the study reported by Kornbrot et al. (2013). Second, there are individual differences in both TP and temporal experience – and hopefully an analysis of both in studies of depression will move that line of research forward. We can conclude in full support of Droit-Volet (2013, p. 260), writing about time perception and depression, but with clear import for our own study: "It is therefore important to distinguish between the explicit awareness of time and the direct perception of time. A disturbance in the former does not systematically imply a disturbance in the latter. That said, time awareness may sometimes affect time judgments to a certain extent."

### AUTHOR CONTRIBUTIONS

JG, AB-O, and TB-S sponsored the study and wrote the manuscript. AB-O, TB-S, and FM ran the study.

### FUNDING

This research was supported by a grant from the Bial Foundation (228/14).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer MP declared a shared affiliation, though no other collaboration, with one of the authors FM to the handling Editor.

Copyright © 2017 Glicksohn, Berkovich-Ohana, Mauro and Ben-Soussan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multiple Looks of Auditory Empty Durations Both Improve and Impair Temporal Sensitivity

Tsuyoshi Kuroda<sup>1,2</sup>\*, Daiki Yoshioka<sup>3</sup>, Tomoya Ueda<sup>3</sup> and Makoto Miyazaki<sup>1,3</sup>

<sup>1</sup>Faculty of Informatics, Shizuoka University, Hamamatsu, Japan, <sup>2</sup>Yamaha Motor Co., Ltd, Fukuroi, Japan, <sup>3</sup>Department of Informatics, Graduate School of Integrated Science and Technology, Shizuoka University, Hamamatsu, Japan

Discrimination of two neighboring empty durations that are marked by three successive sounds is improved when the presentation of the first (standard, S) duration is repeated before that of the second (comparison, C), as SSSSC. This improvement in sensitivity, called the multiple-look effect, has been explained by a statistical model regarding variability. This model assumes that the perceived duration of the standard is averaged across observations (within a trial within an individual). Increasing the number of observations thus reduces the standard error of the mean perceived duration. Alternatively, the multiple-look effect is attributed to the listener's prediction based on regular rhythm. Listeners perceive regular rhythm during the repetition of the standard, predict the timing of subsequent sounds, and detect a sound that is displaced from the predicted timing. These models were tested in the present experiment in which the main factor was a temporal separation between the standard and the comparison; i.e., these durations were adjacent to each other as SSSSC or separated by a temporal blank as SSSS\_C. The results differed between stimulus structures. First, the multiple-look effect was replicated in the SSSSC condition (yielding a higher performance than SC), but disappeared in SSSS\_C (showing no difference from S\_C). Second, no multiple-look effect occurred in CSSSS (no difference from CS), and moreover, an impairment effect was observed in C\_SSSS (a lower performance than C\_S). Finally, discrimination was improved in SSSS\_CCCC compared with SSSSCCCC, the effect being kept even when sounds were aligned at irregular intervals. These findings are not consistent with those expected from the statistical model because the temporal separation should have produced no effects if the number of standards had been the sole parameter determining the multiple-look effect. The prediction-based model can explain the first finding; inserting a blank between the standard and the comparison violates the listener's prediction based on regular rhythm, thus reducing the multiple-look effect. However, it did not predict the other findings and requires revision. Notably, the second finding indicates that the formation of regular rhythm can impair temporal discrimination. In other words, an inverse multiple-look effect occurs.

Keywords: temporal sensitivity, multiple-look effect, rhythm, prediction, regularity

#### Edited by:

Deana Davalos, Colorado State University, United States

#### Reviewed by:

Alana M. Campbell, University of North Carolina, Chapel Hill, United States

Fuat Balcı, Koç University, Turkey

> \*Correspondence: Tsuyoshi Kuroda tkuroda@inf.shizuoka.ac.jp

Received: 29 October 2017 Accepted: 19 January 2018 Published: 02 February 2018

#### Citation:

Kuroda T, Yoshioka D, Ueda T and Miyazaki M (2018) Multiple Looks of Auditory Empty Durations Both Improve and Impair Temporal Sensitivity. Front. Hum. Neurosci. 12:31. doi: 10.3389/fnhum.2018.00031

### INTRODUCTION

People utilize rhythm for predicting subsequent events. Rhythm is formed by the repetition of identical temporal structures, and thus, perceiving a certain type of rhythm indicates that a temporal structure will recur in the future. In music, a sudden change of rhythm violates listeners' predictions, resulting in emotions such as surprise.

Predicting the timing of events enhances the perceptual processing of those events. Jones et al. (2002) demonstrated that the discrimination of pitch was enhanced when a target sound was located at the end of a regular rhythm so that listeners could predict its timing. Such rhythmic prediction may also be involved in improvements in temporal sensitivity, as found in the multiple-look effect (Schulze, 1989; Drake and Botte, 1993; Miller and McAuley, 2005; Ten Hoopen et al., 2011). This effect is typically tested with three successive sounds that delimit two neighboring empty durations, namely the standard (S) and the comparison duration (C). Discrimination of these durations is improved when the presentation of the standard duration is repeated before the comparison (as SSSSC; see Schulze, 1989; Ten Hoopen et al., 2011; Li et al., 2016).

The multiple-look effect is usually explained by a statistical model in which the variability of the perceived duration of the standard decreases as the number of observations increases. In other words, if the perceived duration is averaged across observations (within a trial within an individual), the standard error of the mean is reduced as the number of observations is increased (Miller and McAuley, 2005; Li et al., 2016). Such models indicate that the size of the multiple-look effect is mainly determined by the number of observations as well as some weighting parameters. Therefore, there is no reason to expect a change in its size, whether the standard is repeated before or after the comparison (i.e., whether the sequence is SSSSC or CSSSS). This idea seems consistent with the reports of several studies in which the multiple-look effect also occurs when the standard is repeated after the comparison (Miller and McAuley, 2005; Ten Hoopen et al., 2011).
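As a minimal illustration of the statistical account, assume (for simplicity, and not necessarily as in the cited models) that the $n$ observations of the standard are independent, equally weighted, and share a common standard deviation $\sigma$. The symbols $\sigma$, $n$, $\hat{S}_i$ and $\bar{S}_n$ are introduced here for illustration only. The standard error of the mean perceived standard is then

$$\bar{S}_n = \frac{1}{n}\sum_{i=1}^{n}\hat{S}_i, \qquad \mathrm{SE}(\bar{S}_n) = \frac{\sigma}{\sqrt{n}}$$

so that four presentations of the standard would halve the standard error relative to a single presentation ($\sigma/\sqrt{4} = \sigma/2$). Under these assumptions the expression depends only on $n$, not on whether the standards precede or follow the comparison.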

However, Ten Hoopen et al. (2011) provided compelling evidence, in their third experiment, that the multiple-look effect is stronger when the standard is repeated before the comparison than when the standard is repeated after the comparison<sup>1</sup>. This difference is difficult to explain with the statistical model. Furthermore, in Drake and Botte (1993), the multiple-look effect was reduced when the regularity of sounds in the consecutive standards, as well as in the consecutive comparisons, was broken. Given this finding, we assume that not only the statistical mechanism but also prediction based on regular rhythm underlies the multiple-look effect. In the latter mechanism, listeners perceive regular rhythm during the repetition of the standard, predict the timing of subsequent sounds, and detect a sound that is displaced from the predicted timing, thus improving temporal sensitivity.

This prediction-based explanation falls within the framework of dynamic attending theory (Jones and Boltz, 1989; Jones et al., 2002; McAuley and Jones, 2003; McAuley and Fromboluti, 2014). This theory holds that the attentional level is not static but changes dynamically over time. The most recent version of the theory posits that the attentional level periodically oscillates, and that this attentional oscillation is entrained by the rhythm of external stimuli (Jones et al., 2002; McAuley and Jones, 2003; McAuley and Fromboluti, 2014). In other words, the attentional level synchronizes its peak with successive stimuli if those stimuli appear at regular intervals. For stimulus sequences causing the multiple-look effect (as SSSSC), the attentional oscillation is entrained during the repetition of the standard. This entrained oscillation enables listeners to discriminate the standard and the comparison based on whether the last stimulus of the comparison appears earlier than, later than, or simultaneously with the peak of the oscillation. Such benefits are not available when the standard is repeated after the comparison (as CSSSS).

The present experiment was conducted to examine whether the multiple-look effect would occur when the standard and the comparison were separated by a temporal blank (as SSSS\_C; see **Figure 1**). If the attentional entrainment contributes to the occurrence of the multiple-look effect, a temporal blank between the standard and the comparison should violate the listener's prediction based on regular rhythm because sounds are lacking at the predicted timing, thus resetting the attentional oscillation. In this article, the sequences in which the standard and the comparison are adjacent (as SSSSC) are called the "continuous" sequences while the sequences in which the standard and the comparison are separated (as SSSS\_C) are called the "discontinuous" sequences.

There are two hypotheses: (1) If the multiple-look effect is attributed only to the statistical reduction of variability with an increase in the number of standards, it should take place whether or not the standard and the comparison are temporally separated, resulting in no differences between the continuous and discontinuous sequences. (2) If the multiple-look effect also stems from the rhythmic prediction, a temporal blank should violate the listener's prediction based on regular rhythm, resulting in a lower performance in the discontinuous than in the continuous sequences. However, this blank effect should be observed only when the standard is repeated before the comparison (SSSSC vs. SSSS\_C). In other words, it should not be observed when the standard is repeated after the comparison (CSSSS vs. C\_SSSS) because this condition does not benefit from the rhythmic prediction. Similarly, no blank effect should be observed when the regularity of sounds in the consecutive standards is broken, as depicted in **Figure 1** (in the "repeat" and "irregular" conditions).

No studies have yet compared the continuous vs. discontinuous sequences to test the occurrence of the multiple-look effect, except Grondin (2001) using visual stimuli. This author reported that the multiple-look effect occurred with the discontinuous sequences but not with the continuous ones. In the auditory modality, the continuous sequences (Schulze, 1989; Ten Hoopen et al., 2011; Li et al., 2016) and the discontinuous ones (Drake and Botte, 1993; Miller and McAuley, 2005; Grondin, 2012) have been used in separate studies. The present study is the first one comparing the continuous vs. discontinuous sequences directly with auditory stimuli.

<sup>1</sup>More precisely, these authors manipulated the number of the standard and that of the comparison independently. The standard preceded (SC) or followed (CS) the comparison. The repetition of the standard for SC and of the comparison for CS led to a stronger multiple-look effect than that of the comparison for SC and of the standard for CS. Note that in the present article only the number of the standard was manipulated as an independent variable.

Drake and Botte (1993) and Miller and McAuley (2005) demonstrated the occurrence of the multiple-look effect with discontinuous sequences in the auditory modality, and this result does not seem to support our prediction-based hypothesis. However, in their experiments, the standard and the comparison were separated by a blank that was twice as long as the standard duration. In this case, only one sound was missing at the predicted timing between the last sound of the standard and the first of the comparison. The blank thus might have been too short to reset the attentional oscillation. The oscillation would then have continued after the blank, with its peak coinciding with the first stimulus of the comparison, preserving the multiple-look effect. In the present experiment, a longer blank was adopted to enhance the effects of resetting the attentional oscillation. The standard and the comparison were separated by an inter-onset interval lasting at least three times the standard duration; thus, two sounds were lacking at the predicted timing between the last sound of the standard and the first of the comparison. Furthermore, the duration of the blank was jittered across trials to prevent listeners from memorizing the duration and thus compensating for the blank in their mind such that the attentional oscillation could be kept (see McAuley and Kidd, 1998).

The present experiment consisted of two sessions (**Figure 2**). In the location-varied session, the position of the standard and comparison (SC vs. CS) and the number of presentations of the standard (1 vs. 4) were manipulated, as in Miller and McAuley (2005) and Ten Hoopen et al. (2011). In the regularity-varied session, the standard and the comparison were presented four times each, and the regularity of sounds in the consecutive standards, as well as in the consecutive comparisons, was manipulated, as in Drake and Botte (1993). As a novel parameter, the sequence continuity (continuous vs. discontinuous) was examined in both sessions, based on the hypotheses mentioned above.

### MATERIALS AND METHODS

### Participants, Ethics and Experimental Structures

The experiment was conducted in accordance with the Declaration of Helsinki as well as with procedures approved by the ethics review board of Shizuoka University. Written informed consent was obtained from each participant.

Sixteen participants who self-reported having normal hearing were recruited and divided into two groups. Eight of them (1 female), aged 20–23 years, listened to the continuous sequences, while the others (2 females), aged 19–23 years, listened to the discontinuous sequences. Thus, the continuity effect was tested with a between-participants design. The other effects were tested with a within-participants design.

Each group performed both the location-varied and regularity-varied sessions, as depicted in **Figure 2**. Half (4) of the participants in each group performed the location-varied session first and then the regularity-varied session, whereas the others performed these sessions in the opposite order. Participants assigned to the continuous sequences took about 2 h to complete the experiment, and those assigned to the discontinuous sequences took about 2.5 h.

### Location-Varied Session

#### Stimuli and Apparatus

The location-varied session consisted of three independent variables: location, continuity and standard duration. The first two were of interest for examining the statistical and prediction-based hypotheses. The first variable represents whether the standard was repeated before or after the comparison (there were also control conditions in which the standard was presented only once). The second represents whether the standard and the comparison were adjacent to each other or separated by a temporal blank. If the multiple-look effect is just a matter of the statistical reduction of variability with an increase in the number of standards, it should occur whether or not the standard and the comparison are separated. However, if the rhythmic prediction contributes to the multiple-look effect, inserting a blank between the standard and the comparison should violate the listener's prediction based on regular rhythm and reduce the multiple-look effect. This reduction should be found only when the standard is repeated before the comparison, not when it is repeated after the comparison, because the latter condition does not benefit from the rhythmic prediction. Finally, the last variable represents how long the standard was; this manipulation was adopted to check whether the continuity and location effects would be limited to a specific duration.

There were four location conditions. In SC, a standard interval (S) preceded a comparison interval (C). In SSSSC, four standard intervals were successively presented before a comparison interval. In CS, a comparison interval preceded a standard interval. In CSSSS, a comparison interval was presented before four standard intervals.

The standard duration was varied in three conditions: S = 200, 300 or 400 ms. As explained later, participants were asked to compare the standard and the comparison, and there was a 60-ms difference between these intervals: C = S + 60 or S − 60 ms. The duration was manipulated in terms of an inter-onset interval (i.e., an interval between the onsets of two successive sounds).

There were two continuity conditions. In the continuous condition, the standard and the comparison were adjacent to each other. In the discontinuous condition, they were separated by a temporal blank. For example, the continuous SC consisted of three sounds; the first and second sounds delimited the standard, and the second and third ones delimited the comparison. However, the discontinuous SC consisted of four sounds; the first and second sounds delimited the standard, and the third and fourth sounds delimited the comparison. For the latter case, a blank between the second and third sounds lasted k × S (S = 200, 300 or 400 ms). The coefficient k was randomly varied between 3 and 3.5 across trials, as mentioned in the Introduction, to prevent listeners from memorizing the duration and thus compensating for the blank in their mind such that the attentional oscillation could be kept (McAuley and Kidd, 1998).
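The sequence structures described above can be sketched as a list of sound onset times. This is only an illustrative sketch, not the authors' stimulus code: the function name `build_onsets` and its parameters are hypothetical, the blank is treated as an inter-onset interval of k × S, and k is assumed to be drawn from a uniform distribution on [3, 3.5] (the text states only that k varied randomly in that range).

```python
import random

def build_onsets(n_standard, standard_first, s_ms, c_ms, continuous,
                 k=None, rng=random):
    """Return sound onset times (ms) for one trial.

    n_standard: number of standard intervals (1 or 4 in the session).
    standard_first: True for SC / SSSSC, False for CS / CSSSS.
    continuous: if False, a blank of k * s_ms separates the two parts.
    k: blank coefficient; if None, drawn uniformly from [3.0, 3.5]
       (the uniform distribution is an assumption of this sketch).
    """
    standards = [s_ms] * n_standard
    first, second = (standards, [c_ms]) if standard_first else ([c_ms], standards)

    onsets = [0.0]
    for interval in first:
        onsets.append(onsets[-1] + interval)
    if not continuous:
        if k is None:
            k = rng.uniform(3.0, 3.5)
        # The second part starts with a new sound after the blank.
        onsets.append(onsets[-1] + k * s_ms)
    for interval in second:
        onsets.append(onsets[-1] + interval)
    return onsets

# Continuous SC: three sounds; discontinuous SC: four sounds.
continuous_sc = build_onsets(1, True, 200, 260, continuous=True)
discontinuous_sc = build_onsets(1, True, 200, 260, continuous=False, k=3.0)
```

In the continuous case the sound that ends the first part also starts the second, which is why the continuous SC has one sound fewer than the discontinuous SC.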

Each sound was a sinusoid of 1000 Hz and 70 dB SPL. Its length was 10 ms, including the rise and the decay ramps of 4 ms with raised-cosine windows to avoid spectral splatter. Digital signals of stimuli were sampled at 44,100 Hz and quantized to 16 bits. These signals were converted into analog ones by a USB DAC (Onkyo SE-U33GXVII) and were presented from headphones (Sennheiser HD 380 pro) connected to an amplifier (Teac A-H01).
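A tone burst with these parameters could be synthesized roughly as follows. This is a minimal sketch rather than the authors' implementation: `tone_burst` and its arguments are hypothetical names, and the digital `amplitude` (relative to full scale) is an arbitrary assumption, because the 70 dB SPL level depends on hardware calibration that cannot be expressed in code.

```python
import math

def tone_burst(freq_hz=1000.0, dur_ms=10.0, ramp_ms=4.0, fs=44100,
               amplitude=0.5):
    """Return 16-bit integer samples of a sinusoid with raised-cosine ramps.

    amplitude is relative to digital full scale (assumed value; the
    acoustic level must be calibrated on the playback chain).
    """
    n_total = round(fs * dur_ms / 1000.0)
    n_ramp = round(fs * ramp_ms / 1000.0)
    samples = []
    for i in range(n_total):
        if i < n_ramp:
            # Rise: gain grows from 0 towards 1 along a raised cosine.
            gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        elif i >= n_total - n_ramp:
            # Decay: mirror image of the rise, ending at 0.
            gain = 0.5 * (1.0 - math.cos(math.pi * (n_total - 1 - i) / n_ramp))
        else:
            gain = 1.0
        value = amplitude * gain * math.sin(2.0 * math.pi * freq_hz * i / fs)
        samples.append(int(round(value * 32767)))  # quantize to 16 bits
    return samples

burst = tone_burst()
```

The smooth onset and offset limit the abrupt amplitude changes that would otherwise spread energy across frequencies.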

#### Procedure

The four location conditions (SC, SSSSC, CS and CSSSS) were presented in separate sub-sessions, resulting in four sub-sessions in the location-varied session (**Figure 2**). The order of these sub-sessions was counterbalanced: the possible order was SC-SSSSC-CS-CSSSS, SSSSC-SC-CSSSS-CS, CS-CSSSS-SC-SSSSC, or CSSSS-CS-SSSSC-SC.

Before each sub-session, participants were given a schematic explanation of stimulus sequences (as in **Figure 1**) as well as the definition of two intervals to be discriminated. We did not use the terms "standard" and "comparison" when explaining the task to participants. We instead asked them to judge whether "d2" was shorter or longer than "d1". For the SC and SSSSC sequences, d1 and d2 indicated the standard and the comparison, respectively, but for the CS and CSSSS sequences, d1 and d2 indicated the comparison and the standard, respectively. Thus, d1 was always followed by d2, making the task of participants consistent throughout the session (thus preventing confusion). Participants pushed the left button of a computer mouse to respond that d2 was "shorter" than d1 and the right button to respond that d2 was "longer" than d1. The next trial began 1–2 s after the response.

Each sub-session included two blocks. In each block, six stimulus sequences (= 3 standards × 2 comparisons) were presented eight times each, resulting in 48 trials. The order of trials was randomized with the restriction that an identical sequence was not presented in two consecutive trials. A break of a few seconds was taken between the blocks. Two warm-up trials, in which randomly selected conditions were presented, were conducted before the beginning of each block.
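One way to produce such a constrained trial order is sketched below. The article does not describe the algorithm actually used, so this is an assumed approach: trials are drawn one at a time from the sequences that still have repetitions left and differ from the previous trial, and the draw restarts if it reaches a dead end. The names `constrained_order` and `SEQUENCES` are hypothetical, and the procedure does not guarantee that every valid order is equally likely.

```python
import random

# Six stimulus sequences: 3 standards (ms) x 2 comparison offsets (ms).
SEQUENCES = [(s, diff) for s in (200, 300, 400) for diff in (-60, +60)]

def constrained_order(conditions, reps, rng):
    """Return a trial list with each condition `reps` times and no
    identical condition on two consecutive trials."""
    total = len(conditions) * reps
    while True:
        remaining = {cond: reps for cond in conditions}
        order = []
        while len(order) < total:
            candidates = [c for c, n in remaining.items()
                          if n > 0 and (not order or c != order[-1])]
            if not candidates:
                break  # dead end: only the previous condition is left
            # Weight by remaining count so no condition is left to pile up.
            weights = [remaining[c] for c in candidates]
            pick = rng.choices(candidates, weights=weights, k=1)[0]
            order.append(pick)
            remaining[pick] -= 1
        else:
            return order  # reached only when the inner loop did not break

block = constrained_order(SEQUENCES, reps=8, rng=random.Random(1))
```

A plain shuffle followed by rejection of invalid orders would also satisfy the restriction, but with eight repetitions of six sequences most shuffles contain at least one immediate repeat, so the sequential draw is the cheaper option.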

A training block was conducted before the beginning of each sub-session. This training included two warm-ups plus 48 trials as in the experimental block. However, the duration difference between the standard and the comparison was 140 ms instead of 60 ms, and the participant's response was followed by a feedback message indicating whether the response was correct or incorrect. For example, "correct" was presented on a computer display when participants responded "longer" for sequences in which d2 was physically longer than d1, whereas "incorrect" was presented when participants responded "shorter" for sequences in which d2 was physically longer than d1.

### Regularity-Varied Session

#### Stimuli and Apparatus

The regularity-varied session consisted of three independent variables: regularity, continuity and standard duration. The last two were also adopted in the location-varied session, but a sharper focus was put on the first variable in this session. As shown in **Figure 1**, each sequence consisted of the four-standards pattern and the four-comparisons pattern. The regularity of sounds included in each pattern was manipulated. The prediction-based hypothesis expects the occurrence of the multiple-look effect if sounds are aligned at regular intervals and if no blank is inserted between the standard and the comparison pattern. If the regularity of sounds in each pattern is broken (i.e., if those sounds are aligned at irregular intervals in each pattern), the rhythmic prediction should not work, resulting in no multiple-look effect, in the continuous as well as the discontinuous sequences.

TABLE 1 | Intervals of the standard and comparison patterns for the repeat and irregular conditions.

Note. The standard duration (S) and comparison duration (C) were multiplied by the coefficient indicated in this table. For example, no. 1 means that the first interval of the standard lasted S × 0.6, the second lasted S × 1.1, the third lasted S × 1.4, and the last lasted S × 0.9.

There were three regularity conditions (**Figure 1**). In the regular condition, sounds were aligned at regular intervals in each pattern. In the repeat and irregular conditions, sounds were aligned at irregular intervals in each pattern. However, in the repeat condition, the ratios of intervals in the standard pattern were identical to those in the comparison pattern. Therefore, this condition had a repetition of identical (interval-ratio) structures, resulting in regularity in a higher level of rhythmic hierarchy (Jones and Boltz, 1989), and might benefit from the rhythmic prediction even though to a lesser extent than the regular condition.

The following are the technical details of the regularity manipulation. In the regular condition, the standard pattern consisted of four identical intervals, each one (S) being 200, 300 or 400 ms. The comparison pattern consisted of four identical intervals, each one (C) being 60 ms shorter or longer than S.

In the repeat condition, the standard pattern consisted of an interval lasting S × 0.6, an interval lasting S × 0.9, an interval lasting S × 1.1, and an interval lasting S × 1.4. The possible orders of these intervals are listed in **Table 1**; an order was randomly chosen for each trial. The comparison pattern had the same order as the standard pattern, except that S was replaced by C (= S + 60 or S − 60 ms); thus, both patterns had the same interval-ratio structure. Note that the list did not include accelerating and decelerating structures, such as [×1.4]−[×1.1]−[×0.9]−[×0.6] and [×0.6]−[×0.9]−[×1.1]−[×1.4], as well as structures in which shorter and longer intervals were alternated, e.g., [×0.6]−[×1.1]−[×0.9]−[×1.4], because they might have yielded a specific impression of regularity.

In the irregular condition, the same method as in the repeat condition to manipulate each interval was adopted. However, as indicated in **Table 1**, the standard and comparison patterns had different structures; the first and second intervals in the standard pattern were interchanged with each other in the comparison pattern, and so were the third and fourth intervals.
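The interval manipulation for the repeat and irregular conditions can be illustrated with a short sketch. It uses the one order quoted in the note to Table 1 (0.6, 1.1, 1.4, 0.9); the function name `build_patterns` and the constant `EXAMPLE_ORDER` are hypothetical, and the remaining orders of Table 1 are not reproduced here.

```python
# Order no. 1 as described in the note to Table 1.
EXAMPLE_ORDER = (0.6, 1.1, 1.4, 0.9)

def build_patterns(order, s_ms, c_ms, condition):
    """Return (standard_intervals, comparison_intervals) in ms.

    "repeat": the comparison pattern keeps the same coefficient order.
    "irregular": the first and second coefficients are interchanged,
    and so are the third and fourth.
    """
    standard = [coef * s_ms for coef in order]
    if condition == "repeat":
        comparison_order = list(order)
    elif condition == "irregular":
        first, second, third, fourth = order
        comparison_order = [second, first, fourth, third]
    else:
        raise ValueError("condition must be 'repeat' or 'irregular'")
    comparison = [coef * c_ms for coef in comparison_order]
    return standard, comparison

std_rep, cmp_rep = build_patterns(EXAMPLE_ORDER, 300, 360, "repeat")
std_irr, cmp_irr = build_patterns(EXAMPLE_ORDER, 300, 360, "irregular")
```

Because the four coefficients sum to 4.0, each pattern has the same total duration as four regular intervals of S (or C), regardless of the order chosen.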

The standard and comparison patterns were adjacent to each other in the continuous condition while they were temporally separated in the discontinuous condition. For the latter case, the last sound of the standard and the first sound of the comparison were separated by k × S (S = 200, 300 or 400 ms). The coefficient k was randomly varied between 3 and 3.5 across trials. The same apparatus and sounds as in the location-varied session were used.

#### Procedure

Participants were instructed to judge whether the last half of sounds (in the comparison pattern) were presented in a "faster" or "slower" tempo than the first half (in the standard pattern) in each trial. The session consisted of eight blocks. In each block, 18 stimulus patterns (= 3 regularities × 3 standards × 2 comparisons) were presented twice, resulting in 36 trials. The order of trials was randomized with the restriction that an identical sequence was not presented in two consecutive trials. A break of a few seconds was taken between the blocks. Two warm-up trials, in which randomly selected sequences were presented, were conducted before the beginning of each block.

A training block was conducted before the first experimental block. This training included two warm-ups plus 36 trials as in the experimental block. Furthermore, the difference between S and C was 140 ms instead of 60 ms, and the participant's response was followed by a feedback message.

### Data Analysis and Statistics

The data of the location-varied session and the regularity-varied session were analyzed with the same methods. The warm-up trials and the training blocks were removed from the analysis.

$d'$ was estimated based on signal detection theory (Stanislaw and Todorov, 1999; MacMillan and Creelman, 2005) to examine temporal sensitivity, following previous studies in time perception (Schulze, 1989; Grondin, 1998; Kuroda and Grondin, 2013; Kuroda et al., 2016)<sup>2</sup>. This dependent variable expresses how well participants discriminated between the −60 ms and the +60 ms comparison interval; a higher value indicates better discrimination. It was calculated with the following equation:

$$d' = \Phi^{-1}(H) - \Phi^{-1}(F) \tag{1}$$

$\Phi^{-1}(H)$ is the z score of the hit probability and $\Phi^{-1}(F)$ is the z score of the false-alarm probability.

In the location-varied session, the hit probability means how frequently participants responded "longer" when the comparison was physically longer than the standard. The false-alarm probability means how frequently participants responded "longer" when the comparison was physically shorter than the standard. Note that, since participants were asked to judge whether d2 was shorter or longer than d1, they responded "longer" for the SC and SSSSC conditions when the comparison (d2) was perceived as longer than the standard (d1), but responded "shorter" for the CS and CSSSS conditions when the comparison (d1) was perceived as longer than the standard (d2). We, therefore, read "shorter" for the CS and CSSSS conditions as meaning "longer" in the data analysis.

In the regularity-varied session, the hit probability means how frequently participants responded ''slower'' when the comparison was physically longer than the standard. The false-alarm probability means how frequently participants responded ''slower'' when the comparison was physically shorter than the standard.

For both sessions, each probability was based on 16 responses for each condition for each participant. Furthermore, the loglinear method was adopted to correct each probability, avoiding the extreme values (0 and 1) that become infinite when converted into z scores (Hautus, 1995).
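As an illustration of Equation (1) combined with the loglinear correction, the following minimal sketch computes d′ from raw response counts. It assumes the common form of the loglinear correction (adding 0.5 to each response count and 1 to each trial total) and 16 responses per probability, as stated above. The function names `loglinear_rate` and `d_prime` are hypothetical and do not come from the authors' analysis scripts.

```python
from statistics import NormalDist


def loglinear_rate(count, n_trials):
    """Loglinear correction: add 0.5 to the count and 1 to the number of trials.

    Assumption: this is the standard form of the correction; the article
    cites Hautus (1995) without spelling out the formula.
    """
    return (count + 0.5) / (n_trials + 1)


def d_prime(hits, false_alarms, n_signal=16, n_noise=16):
    """d' = z(H) - z(F), with H and F corrected to stay strictly inside (0, 1)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = loglinear_rate(hits, n_signal)
    fa_rate = loglinear_rate(false_alarms, n_noise)
    return z(hit_rate) - z(fa_rate)


# Perfect performance (16 hits, 0 false alarms) stays finite after correction.
print(d_prime(16, 0))
# Chance performance (equal hit and false-alarm counts) gives d' = 0.
print(d_prime(8, 8))
```

Without the correction, a hit probability of 1 or a false-alarm probability of 0 would map to an infinite z score, which is the case the correction is meant to avoid.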

The location-varied session was based on a 2 (continuities) × 4 (locations) × 3 (standards) design with repeated measures for the last two factors. The regularity-varied session was based on a 2 (continuities) × 3 (regularities) × 3 (standards) design with repeated measures for the last two factors. An analysis of variance (ANOVA) was conducted for each session. The F distribution was evaluated with degrees of freedom corrected by the Greenhouse-Geisser epsilon to guard against potential violations of sphericity. When an interaction was significant, the simple main effect was tested by a one-way ANOVA with the Greenhouse-Geisser correction. Pairwise contrasts were conducted based on the Holm method when a main effect or a simple main effect was significant.
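The Holm method mentioned above is a step-down procedure for controlling the family-wise error rate across a set of pairwise contrasts. A minimal sketch of the adjusted p-value computation is given below. The function name `holm_adjust` and the example p-values are hypothetical; the article does not report which software was used for the contrasts.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise contrasts.
print(holm_adjust([0.01, 0.04, 0.03]))
```

A contrast is then declared significant when its adjusted p-value falls below the chosen alpha level (for example 0.05).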

### RESULTS

### Location-Varied Session

The mean d′ for each experimental condition in the location-varied session is shown in **Figure 3**. In general, the SSSSC conditions yielded the highest d′ (sensitivity) in the continuous sequences, whereas the CSSSS conditions yielded the lowest d′ in the discontinuous sequences. Indeed, the ANOVA revealed that the location effect, F(1.63,22.85) = 11.115, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.443, as well as its interaction with the continuity effect, F(1.63,22.85) = 9.866, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.413, was significant. The standard effect was also significant, F(1.62,22.67) = 15.365, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.523. No other effects were significant (p > 0.259).

The results of the post hoc contrasts, as shown in **Figure 3**, are summarized as follows: (1) the SSSSC condition resulted in a higher d′ than the control (SC and CS) conditions, indicating the occurrence of the multiple-look effect, but only in the continuous sequences. (2) The CSSSS condition resulted in a lower d′ than the control (SC and CS) conditions in the discontinuous sequences. (3) The 200-ms standard yielded the highest d′ and was followed by the 300-ms and then the 400-ms standard.

### Regularity-Varied Session

The mean d′ for each experimental condition in the regularity-varied session is shown in **Figure 4**. In general, d′ was the highest for the regular condition. The ANOVA revealed that all main effects were significant: the continuity effect, F(1,14) = 15.891, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.532, the regularity effect, F(1.69,23.72) = 33.824, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.707, and the standard effect, F(1.86,26.03) = 12.759, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.477. No interactions were significant (p > 0.051).

The results of the post hoc contrasts, as shown in **Figure 4**, are summarized as follows: (1) the regular condition resulted in the highest d′ and was followed by the repeat and then the irregular condition. (2) d′ was higher for the discontinuous than for the continuous sequences. (3) The 200-ms as well as the 300-ms standard yielded a higher d′ than the 400-ms standard.

### DISCUSSION

### Statistical vs. Prediction-Based Mechanisms (in the Location-Varied Session)

The present study was conducted to examine whether the multiple-look effect would occur when the standard and the comparison were separated by a temporal blank, based on two hypotheses. The first hypothesis attributes the multiple-look effect to the statistical reduction of variability with an increase in the number of standards, and it does not predict any difference between the continuous and discontinuous sequences. The second hypothesis attributes the multiple-look effect to the listener's prediction based on regular rhythm, and it predicts that a temporal blank between the standard and the comparison resets the attentional oscillation, thus inhibiting the multiple-look effect.

<sup>2</sup> See Appendix (**Figures 5**, **6**) for the results of β expressing perceived duration.

The results of the location-varied session may be evidence against the statistical hypothesis. For the continuous sequences, the SSSSC condition yielded a higher performance than the control (SC and CS) conditions, indicating the occurrence of the multiple-look effect. However, for the discontinuous sequences, the SSSSC and control conditions led to almost identical performances. If the multiple-look effect had been just a matter of the statistical reduction of variability, this effect should have occurred, whether or not the standard and the comparison were separated by a temporal blank.

The results instead seemed consistent with those expected from the prediction-based hypothesis. For the SSSSC condition, the attentional oscillation was entrained during the repetition of the standard. When the comparison was adjacent to the preceding standard (i.e., in the continuous sequences), participants could utilize this entrained oscillation for discrimination; they discriminated between the standard and the comparison based on whether the last sound of the comparison appeared earlier or later than the peak of the oscillation. However, when a temporal blank was inserted between the standard and the comparison (i.e., in the discontinuous sequences), the attentional oscillation was reset during the blank, canceling out the benefits from the attentional oscillation. Therefore, the multiple-look effect occurred in the continuous sequences but not in the discontinuous sequences.

### The Inversed Multiple-Look Effect (in the Location-Varied Session)

However, the CSSSS condition yielded a lower performance than the control (SC and CS) conditions when the comparison was separated from the subsequent standard (i.e., in the discontinuous sequences). The prediction-based hypothesis did not expect any effects in the CSSSS condition, in which the attentional oscillation could not help discrimination because the comparison was presented before the oscillation was entrained by the repetition of the standard. Therefore, both the continuous and discontinuous sequences should have yielded no multiple-look effect, i.e., identical performances, in the CSSSS condition. However, this condition indeed exhibited an impairment effect (lower sensitivity than SC and CS) for the discontinuous sequences, suggesting the occurrence of an inversed multiple-look effect.

One might explain this impairment effect by the decay of memory. Since participants responded after the presentation of the last stimulus, they had to sustain the memory of the comparison longer when a temporal blank was inserted between the comparison and the subsequent standard. The memory would then be more likely to decay, impairing the discrimination performance. However, this explanation seems implausible because, in the control (SC and CS) conditions, the discontinuous sequences yielded a higher d′ than the continuous sequences (**Figure 3**). For these conditions, the discontinuous sequences should have led to lower performances than the continuous sequences if the temporal blank of the present experiment had been long enough to facilitate the decay of memory.

Although speculative, the inversed multiple-look effect might be explained by adding two further assumptions to the prediction-based hypothesis: (1) the repetition of the standard after the comparison could interfere with the listener's decision process before the response. For the C\_SSSS (discontinuous) condition, listeners could not utilize the rhythmic prediction and thus simply compared the comparison and the first standard. This seemed to be the simplest strategy, but since the standard was consecutively repeated, listeners had to segregate the first standard from the others in their mind. This cognitive demand reduced the performance in the C\_SSSS compared with the C\_S condition. With this assumption alone, however, it is difficult to explain why there were no differences between the CSSSS and CS (continuous) conditions, because the repetition of the standard could also have caused interference in the CSSSS condition. We therefore need the next assumption. (2) Even during the presentation of only two sounds, the attentional oscillation could be entrained slightly and thus could be utilized for discrimination. In the CSSSS condition, the attentional oscillation was entrained during the presentation of the comparison, and even though this entrainment was weak, participants could discriminate the comparison and the standard based on whether the second sound of the first standard appeared earlier or later than the peak of the oscillation. The same strategy could be applied to the CS condition, resulting in an identical performance to the CSSSS condition.

In brief, the formation of regular rhythm produces two opposite effects; it both improves and impairs temporal sensitivity. The impairment effect seemed difficult to explain with the statistical hypothesis, but it also required substantial revision of the prediction-based hypothesis. A simpler, more comprehensive model should be constructed to explain the multiple-look effect and the inversed one.

### Border Effects (in the Regularity-Varied Session)

Inserting a blank between the standard and the comparison led to another effect in the regularity-varied session; the discontinuous sequences resulted in a higher d′ than the continuous sequences (**Figure 4**). This result was not predicted by either the statistical hypothesis or the prediction-based one. However, it is not surprising, given that in this session participants performed discrimination based on not only tempo but also the whole duration (i.e., between the beginning sound and the last sound) of the standard and the comparison pattern. For the discontinuous sequences, the borders (beginning and end) of each pattern were clear because the standard pattern and the comparison pattern were separated by a temporal blank. However, for the continuous sequences, the borders of each pattern were less clear because the end of the standard pattern and the beginning of the comparison pattern were delimited by an identical sound. Therefore, the whole duration of the standard pattern and that of the comparison pattern might be discriminated more easily in the discontinuous sequences than in the continuous sequences.

Notably, this explanation could be applied to the results of the control (SC and CS) conditions in the location-varied session. These conditions also exhibited a higher d′ for the discontinuous sequences than the continuous ones. The standard and comparison intervals were each delimited clearly when they were separated by a temporal blank (for the discontinuous sequences) but not when the end of the standard and the beginning of the comparison shared an identical sound (for the continuous sequences).

Nevertheless, there might be another way to explain the results of the control conditions in the location-varied session. Temporal assimilation might occur in the continuous sequences; the perceived durations of the standard and the comparison were assimilated when those intervals were adjacent to each other (Nakajima et al., 2004; Grondin et al., 2017). This might result in a lower sensitivity for the continuous than the discontinuous sequences. However, such assimilation typically occurs when the first interval is 200 ms or briefer (Nakajima et al., 2004), or when the total of the two intervals is 540 ms or briefer (Miyauchi and Nakajima, 2007). Only the 200-ms standard in the present experiment fulfilled these duration-range criteria.

### Prediction Based on Rhythmic Hierarchy (in the Regularity-Varied Session)

More importantly, in the regularity-varied session, the discrimination performance changed as a function of regularity; the regular condition yielded the highest sensitivity, and was followed by the repeat and then the irregular condition. Sounds were aligned irregularly in the last two conditions, but the standard and comparison patterns had the same interval-ratio structure in the repeat condition. In other words, the repeat condition had a repetition of identical structures (i.e., regularity) at a higher hierarchical level. The result thus suggests that temporal sensitivity benefits from the formation of rhythm at any hierarchical level, whereas the benefits become smaller with a more complex hierarchical structure (Jones and Boltz, 1989).

### Relation with the Oddball Paradigm and Potential Integration of Two Mechanisms

We have contrasted the statistical vs. prediction-based hypotheses, but it might be possible to regard these hypotheses as complementary to each other. The results of the location-varied session clearly indicate that the temporal relationship between the standard and the comparison is a crucial factor determining the multiple-look effect. The statistical hypothesis has not taken this factor into account. However, several studies have demonstrated a good fit of the statistical model to behavioral data (Schulze, 1989; Miller and McAuley, 2005; Ten Hoopen et al., 2011; Li et al., 2016), even when the standard was repeated after the comparison (Miller and McAuley, 2005; Ten Hoopen et al., 2011). Given this, it would be reasonable to posit that the statistical mechanism can also work in the multiple-look effect, but the prediction-based mechanism is dominant in some cases (as in the present experiment), resulting in a stronger effect when the standard is repeated before the comparison than when repeated after the comparison.

In order to discuss the potential integration of the two hypotheses, it is worth noting that the stimulus sequences used in the present experiment are very similar to those used in neurophysiological studies with the oddball paradigm (for review, see Garrido et al., 2009). An oddball stimulus that deviates, for example, in pitch or duration from the other sequential stimuli elicits the mismatch negativity (MMN) that is recorded with electroencephalography (EEG) or magnetoencephalography (MEG). Psychophysical studies have also shown that the perceived (filled) duration of the oddball stimulus is distorted (typically overestimated) compared with that of the non-oddball ones (Tse et al., 2004; Pariyadath and Eagleman, 2007; McAuley and Fromboluti, 2014). In the present experiment, the last sound of the comparison in the SSSSC condition can be regarded as the oddball. This sound might have activated neural responses reflected by MMN. Indeed, MMN is interpreted to be evoked when the regularity of successive sounds is broken and generated by a neural process comparing the current sensory input with a memory trace of previous stimuli (Garrido et al., 2009). If the activation level of those neural responses increases as a function of the number of standards and correlates with temporal sensitivity, it could determine the occurrence of the multiple-look effect. This explanation is compatible with both the statistical and the prediction-based hypothesis. Therefore, further investigations of the multiple-look effect with neurophysiological techniques will give new insights into the duration processing of successive intervals.

## CONCLUSION

We have discussed the results of the present experiment, focusing on the validity of the statistical vs. prediction-based hypotheses that explain the multiple-look effect. Inserting a temporal blank between the standard and the comparison produced several effects that the statistical hypothesis did not predict. The multiple-look effect resulting from the repetition of the standard before the comparison was diminished when the standard and the comparison were separated by a blank, supporting the prediction-based hypothesis. However, this hypothesis required substantial revision to explain an impairment effect that was observed when the standard was repeated after the comparison. In this condition, discrimination was impaired when a temporal blank was inserted between the comparison and the subsequent standard. Further investigations with neurophysiological techniques recording MMN may lead to a comprehensive theory explaining the multiple-look effect and the inversed one. A recent neurophysiological study reported that multimodal training of musical rhythm, including motor production, induces cortical plasticity associated with improvements in temporal sensitivity (Lappe et al., 2011). The current finding that the formation of regular rhythm both improves and impairs temporal sensitivity might add to the clinical literature seeking effective methods that use rhythmic activities to adjust human timing performance.

### AUTHOR CONTRIBUTIONS

TK, DY and MM conceived the project. TK designed and prepared the experiments. TK, TU and DY collected the data. TK and DY analyzed the data. TK drafted the article. All authors revised the article.

### FUNDING

This research was made possible by research grants awarded to TK by Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Number 16H01502, 15K21195 and to MM by JSPS KAKENHI Grant Number 16H01866, 16K12969.

### ACKNOWLEDGMENTS

We would like to express our gratitude to Kei Nihashi for her help in recruiting participants.


**Conflict of Interest Statement**: The first author, TK, is currently employed by Yamaha Motor Co., Ltd. The study presented in this article was conducted when TK was employed by Shizuoka University (until March 2017).

The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Kuroda, Yoshioka, Ueda and Miyazaki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

### APPENDIX: PERCEIVED DURATION

β was also estimated, expressing how long participants perceived the comparison duration to be relative to the standard. A lower value indicates more ''longer (slower)'' responses than ''shorter (faster)'' ones. Note that β is usually used in detection tasks to express the tendency of participants to prefer one of the two response alternatives. However, in the present study, this measure is interpreted as a sign of perceived duration; for example, if duration is perceived as longer, participants should respond ''longer'' more frequently than ''shorter'' (Grondin, 1998; Kuroda et al., 2016). β was calculated by the following equation:

$$\log \beta \, \,=\, \frac{\left[\Phi^{-1}(F)\right]^2 - \left[\Phi^{-1}(H)\right]^2}{2} \tag{1}$$

The natural logarithm was adopted to keep the scale linear. log β was zero when the number of ''shorter (faster)'' responses was equal to that of ''longer (slower)'' responses; i.e., the standard and the comparison were perceived as equivalent. log β was higher than zero when participants were likely to perceive the comparison duration (or tempo) as shorter (faster) than the standard, whereas it was lower than zero when they were likely to perceive the comparison as longer (slower) than the standard.
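The sign convention of log β described above can be checked with a minimal sketch of the appendix equation. It assumes that the hit and false-alarm probabilities have already been corrected so that they lie strictly between 0 and 1. The function name `log_beta` and the example probabilities are hypothetical and are not data from the experiment.

```python
from statistics import NormalDist


def log_beta(hit_rate, fa_rate):
    """Natural log of beta: (z(F)^2 - z(H)^2) / 2.

    Both rates must lie strictly between 0 and 1 (e.g., after a
    loglinear correction); otherwise the z transform is undefined.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return (z(fa_rate) ** 2 - z(hit_rate) ** 2) / 2


# Symmetric rates: "longer" and "shorter" responses are balanced, log beta ~ 0.
print(log_beta(0.8, 0.2))
# Mostly "longer (slower)" responses: log beta is below zero.
print(log_beta(0.9, 0.5))
# Mostly "shorter (faster)" responses: log beta is above zero.
print(log_beta(0.5, 0.1))
```

In the second example the participant answers ''longer'' on 90% of physically longer comparisons and on 50% of physically shorter ones, an overall preference for ''longer'' that yields a negative value, in line with the interpretation given in the text.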

The mean log β for each experimental condition of the location-varied session is shown in **Figure 5**. An ANOVA according to a 2 (continuities) × 4 (locations) × 3 (standards) design with repeated measures for the last two factors revealed that the standard effect as well as the interactions involving this factor were significant: the standard effect, F(1.95,27.35) = 13.574, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.492, the continuity × standard interaction, F(1.95,27.35) = 4.860, p = 0.016, η<sub>p</sub><sup>2</sup> = 0.258, the location × standard interaction, F(3.92,54.84) = 9.011, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.392, and the three-way interaction, F(3.92,54.84) = 4.875, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.258. No other effects were significant (p > 0.056).

Since the three-way interaction was significant, the post hoc contrasts were conducted without pooling any factors, as shown in **Figure 5**. Note that this figure indicates only the results of the contrasts among the standard conditions; the results of the other contrasts delivered almost the same messages. The results are summarized as follows: (1) for the discontinuous sequences, a longer standard yielded a lower log β in the SC and SSSSC conditions, but yielded a higher log β in the CS and CSSSS conditions. (2) For the continuous sequences, only the SSSSC condition exhibited the standard effect in which log β was decreased as the standard was lengthened.

The mean log β for each experimental condition of the regularity-varied session is shown in **Figure 6**. An ANOVA according to a 2 (continuities) × 3 (regularities) × 3 (standards) design with repeated measures for the last two factors revealed that the standard effect was significant, F(1.82,25.46) = 71.291, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.836, and the continuity × regularity interaction was significant, F(1.97,27.63) = 4.842, p = 0.016, η<sub>p</sub><sup>2</sup> = 0.257. No other effects were significant (p > 0.189).

The results of the post hoc contrasts, as shown in **Figure 6**, are summarized as follows: (1) log β was decreased as the standard was lengthened. (2) The discontinuous sequences produced a higher log β than the continuous sequences in the irregular condition.

In summary, for both sessions, the standard duration was the main factor determining the perceived duration of the comparison relative to the standard interval. There seemed to be no clear relationship between temporal sensitivity (d′) and perceived duration (log β) in the present study.

# Overlapping but Divergent Neural Correlates Underpinning Audiovisual Synchrony and Temporal Order Judgments

#### Scott A. Love<sup>1,2</sup>\*, Karin Petrini<sup>3</sup>, Cyril R. Pernet<sup>4</sup>, Marianne Latinus<sup>1,5</sup> and Frank E. Pollick<sup>1</sup>

<sup>1</sup> School of Psychology, University of Glasgow, Glasgow, United Kingdom, <sup>2</sup> Physiologie de la Reproduction et des Comportements, Institut National de la Recherche Agronomique, Centre National de la Recherche Scientifique, IFCE, Université de Tours, Nouzilly, France, <sup>3</sup> Department of Psychology, University of Bath, Bath, United Kingdom, <sup>4</sup> Brain Research Imaging Centre, Imaging Sciences, University of Edinburgh, Edinburgh, United Kingdom, <sup>5</sup> UMR1253, iBrain, Université de Tours, Inserm, Tours, France

Multisensory processing is a core perceptual capability, and the need to understand its neural bases provides a fundamental problem in the study of brain function. Both synchrony and temporal order judgments are commonly used to investigate synchrony perception between different sensory cues and multisensory perception in general. However, extensive behavioral evidence indicates that these tasks do not measure identical perceptual processes. Here we used functional magnetic resonance imaging to investigate how behavioral differences between the tasks are instantiated as neural differences. As these neural differences could manifest at either the sustained (task/state-related) and/or transient (event-related) levels of processing, a mixed block/event-related design was used to investigate the neural response of both time-scales. Clear differences in both sustained and transient BOLD responses were observed between the two tasks, consistent with behavioral differences indeed arising from overlapping but divergent neural mechanisms. Temporal order judgments, but not synchrony judgments, required transient activation in several left hemisphere regions, which may reflect increased task demands caused by an extra stage of processing. Our results highlight that multisensory integration mechanisms can be task dependent, which, in particular, has implications for the study of atypical temporal processing in clinical populations.

#### Keywords: multisensory, audiovisual, fMRI, temporal processing, asynchrony

#### Edited by:

Deana Davalos, Colorado State University, United States

#### Reviewed by:

Julian Keil, Christian-Albrechts-Universität zu Kiel, Germany

Jennifer T. Coull, Aix-Marseille Université, France

> \*Correspondence: Scott A. Love love.a.scott@gmail.com

Received: 08 January 2018 Accepted: 13 June 2018 Published: 03 July 2018

#### Citation:

Love SA, Petrini K, Pernet CR, Latinus M and Pollick FE (2018) Overlapping but Divergent Neural Correlates Underpinning Audiovisual Synchrony and Temporal Order Judgments. Front. Hum. Neurosci. 12:274. doi: 10.3389/fnhum.2018.00274

### INTRODUCTION

Temporal correspondence is a key principle of multisensory integration; thus, manipulating the relative timing of the unimodal components (e.g., audio and visual cues) of a cross-modal stimulus is one of the most extensively and widely used tools for investigating multisensory processing. The most commonly used tasks for studying temporal processing, whether in a unimodal or cross-modal context, are synchrony judgment (SJ) and temporal order judgment (TOJ) paradigms. During an SJ task, participants decide whether cues are in synch or out of synch, whereas in a TOJ task, they decide which cue came first (or last). Both tasks allow for the extraction of a point of subjective simultaneity (PSS) and a temporal integration window (TIW) to index temporal processing ability. In their review, Keetels and Vroomen (2012) observed that SJs and TOJs have often been used interchangeably to investigate temporal processing, despite the fact that comparative studies report differences between the tasks. Indeed, accumulating behavioral evidence suggests that these tasks actually measure different processes, or at least different aspects of the same process, even within the same participant for the same stimulus (van Eijk et al., 2008; Vatakis et al., 2008; Fujisaki and Nishida, 2009; Petrini et al., 2010; Maier et al., 2011; Vroomen and Stekelenburg, 2011; Love et al., 2013). At the group level, the PSS derived from an audiovisual SJ task tends to be visual-leading, i.e., the onset of the visual cue needs to occur before the auditory cue for synchrony to be maximally perceived, whereas when it is derived from TOJ using the same stimuli and participants it is often found to be audio-leading (van Eijk et al., 2008; Petrini et al., 2010; Love et al., 2013). At the individual level, there is no correlation between the two tasks for either PSS or TIW (van Eijk et al., 2008; Vatakis et al., 2008; Fujisaki and Nishida, 2009; Vroomen and Stekelenburg, 2011; Love et al., 2013). Moreover, improved performance due to training on one of the tasks does not transfer to the other (Mossbridge et al., 2006).
Here we aimed to investigate how these differences are manifested in brain activity by examining whether they reflect completely independent temporal processing networks, different levels of activity from the same network, or overlapping (share some mental processes and associated neural architectures) but divergent networks (require specific processes).

Several experiments have investigated the neural mechanisms involved in audiovisual SJs (e.g., Miller and D'Esposito, 2005; Lewis and Noppeney, 2010; Stevenson et al., 2010; Noesselt et al., 2012), but until very recently there was no evidence of the mechanisms involved in audiovisual TOJs, nor any direct comparison of the networks underlying these two tasks. Recently, however, using a simple beep-flash stimulus and an event-related functional magnetic resonance imaging (fMRI) analysis, Binder (2015) highlighted several left hemisphere regions (superior/inferior lobule, middle/inferior gyrus and lateral occipital cortex) that activate more during audiovisual TOJs than SJs. Similarly, but in response to unimodal tactile stimulation, Miyazaki et al. (2016) observed more activation for TOJs than SJs in left premotor cortex, left posterior parietal cortex, right premotor cortex and bilateral thalamus. In contrast, within left insular cortex they observed greater activation for SJs than TOJs. These studies were important and timely in reporting differences between TOJs and SJs; nevertheless, many questions about the nature of these neural differences are still unanswered. For example, are the differences between TOJ and SJ only evident in overlapping brain regions or do they have divergent neural correlates? Are these differences present for more complex/natural audiovisual stimuli (i.e., stimuli for which we have accumulated prior experience about their visual and auditory correspondence)? Do these neural differences depend on similar or different processing time-scales?

The current study used a more complex audiovisual stimulus than that of Binder (2015), a point-light drummer (Petrini et al., 2009a,b; Love et al., 2013), to test the hypothesis that SJs and TOJs have different neural correlates in the human brain. Point-light drumming is formed by a continuous visual and auditory stream of information that is closer to the stimulation we receive in real life than simple beep-flash stimuli are. Additionally, this stimulus represents a human action but at the same time is similar to Binder's (2015) stimulus in terms of low-level characteristics (e.g., white dots appearing on a black background and absence of contextual information). The decision to use a more complex stimulus was necessary to understand whether the neural differences between SJ and TOJ are stimulus dependent, as stimulus complexity is known to influence performance (Dixon and Spitz, 1980; Vatakis and Spence, 2006; Love et al., 2013; Stevenson and Wallace, 2013). This in turn would substantiate Binder's (2015) findings with simple beep-flash stimuli by extending them to more complex and ecological situations. Indeed, we rarely experience events formed by one isolated visual and auditory stimulus; rather, we commonly experience complex audiovisual events formed by streams of visual and auditory events. Hence, it is essential to examine whether neural differences previously found with single-event simple stimuli extend to more complex everyday situations.

We used a mixed block/event-related design rather than a simple event-related design because research has demonstrated (e.g., Chawla et al., 1999; Donaldson et al., 2001; Visscher et al., 2003) that two different time-scales of neural activity can be investigated with fMRI: transient (event-related) and sustained (task/state-related) activity. Sustained effects are characterized by rises in the BOLD signal that plateau, or remain elevated, for a significant duration rather than quickly descending back to baseline as would a transient effect. This is an important distinction to make when exploring differences between two related tasks such as SJs and TOJs, as task differences could be explained by different sustained activity ("states-of-mind") and/or transient trial-related activity (e.g., decision-making). Standard block and event-related designs pool (confound) these two different levels of processing and prevent their independent investigation (Donaldson, 2004), thus limiting our understanding of neural differences between SJ and TOJ tasks.

As both tasks are involved in temporal processing but show clear behavioral differences, it was predicted that the neural mechanisms underpinning these two tasks would be overlapping but also divergent. That is, a network of task-independent temporal processing regions would be involved in both tasks, along with other task-specific networks dependent on the judgment being made. Such an inherent divergence in the neural mechanisms underpinning the tasks should be evident regardless of the stimulus type being presented. Therefore, despite our use of a more complex stimulus than Binder (2015), convergent overall results between the two studies would be expected.

### Experiment Overview

Participants first completed a pre-fMRI behavioral experiment in which they made SJs and TOJs to synchronous and asynchronous audiovisual stimuli. Depending on TOJ performance (R<sup>2</sup> goodness-of-fit between data and fitted function) with the current stimulus, participants were classified, for that stimulus, into one of two groups: TOJ-able (R<sup>2</sup> > 0.5) or TOJ-unable (R<sup>2</sup> < 0.5). In the following fMRI experiment, participants made SJs and TOJs, but to a reduced stimulus set of synchrony conditions: individually defined task-specific PSSs, largest audio-leading (333 ms), largest video-leading (333 ms), and physically synchronous stimuli. Both TOJ-able and TOJ-unable participants were included in the fMRI experiment as statistical comparisons between these groups could be informative about how and why potential differences between the tasks occur.

### Participants

Twenty right-handed participants (10 female, mean age [range] = 24 [20–32]) took part. None had received any professional musical training, and all described themselves as "musical novices." All had normal or corrected to normal vision and reported no hearing difficulties or any history of neurological disorders. All participants gave informed written consent in accordance with the Declaration of Helsinki and were paid for their participation. The University of Glasgow, College of Science and Engineering ethics committee approved the protocol.

### Stimulus Preparation

The stimuli had previously been used in other studies, and a complete description of them can be found elsewhere (Petrini et al., 2009a,b; Love et al., 2013). They comprised dynamic audiovisual movies (3 s) containing the point-light representation of a drummer playing a swing groove at 120 beats per minute, with an accent on the second beat (**Figure 1**). Audio and visual cues were shifted relative to each other to produce stimuli with different cue onset asynchrony (COA). The video was shifted to begin either after the audio (−333, −267, −200, −133, and −67 ms) or before the audio (+333, +267, +200, +133, and +67 ms), producing a total of 10 asynchronous stimuli to be used in the pre-fMRI experiment. Negative and positive numbers will be used to refer to audio-leading and video-leading COA levels respectively, and 0 COA will refer to the synchronous condition. To prevent participants from having to stay in the MRI scanner for an uncomfortably long time only 4 COA levels were used during the fMRI experiment: two asynchronous (−333, +333 COA) and two "synchronous" (0 COA and the individually defined PSS). The −333, 0 and +333 COA conditions are provided as Supplementary Videos 1–3, respectively.

Stimuli were presented using MATLAB 2007b (MATHWORKS Inc., Natick, MA) and the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997).

### Procedure and Analysis of Behavioral Experiment

The behavioral session (∼20 min) took place in a darkened room, where participants sat approximately 65 cm from a CRT monitor (FormacProNitron 19.380; 1,024 × 768 pixel resolution and 60 Hz refresh rate). Auditory cues were presented via Beyerdynamic DT headphones.

The two-alternative forced-choice experiment consisted of 20 blocks; in half of the blocks participants responded as to whether the audio and video cues were synchronous or not (SJ) and in the other half as to which cue came first (TOJ). Block order was randomized. Instructions describing whether a SJ or a TOJ was to be performed were displayed on screen for 4 s before the beginning of each block. There were 10 trials per COA level for each task. Written instructions emphasized the importance of understanding the difference between the two tasks and that it was crucial to be constantly aware of what the current task was.

Best-fitting Gaussian curves (least squares minimization using iterative maximum likelihood) to the number of synchronous and visual first responses were calculated for the SJ and TOJ data respectively. PSS was derived as the peak of the SJ probability function and as the 50% point of the TOJ cumulative distribution function, while the TIW was taken as the standard deviation of the Gaussian curve for each task (Petrini et al., 2010). Previous research has shown that some participants cannot successfully make TOJs (random responses or responses completely biased toward one response) for some stimulus types (Petrini et al., 2010; Love et al., 2013). In the current study, R<sup>2</sup> values (goodness-of-fit between data and fitted function) below 0.5 were regarded as indicating that a participant could not successfully make the TOJ. This criterion was defined in our previous work as it corresponded well with subjective interpretations of what constituted good and bad data fits (Love et al., 2013). Furthermore, it represents a quantitative, replicable criterion that can be used and compared across experiments.
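As an illustration of the TOJ analysis described above, the following minimal sketch fits a cumulative Gaussian to the proportion of video-first responses, reads off the PSS (50% point) and TIW (standard deviation), and applies the R<sup>2</sup> criterion of 0.5. It is not the fitting routine used in the study: the coarse grid search stands in for the iterative procedure, and the function names, grid ranges, and example data are assumptions made for illustration only.

```python
import math

# COA levels of the pre-fMRI experiment (ms); negative = audio-leading.
COAS_MS = [-333, -267, -200, -133, -67, 0, 67, 133, 200, 267, 333]


def cumulative_gaussian(coa, pss, sigma):
    """Predicted proportion of 'video first' responses at a given COA."""
    return 0.5 * (1.0 + math.erf((coa - pss) / (sigma * math.sqrt(2.0))))


def r_squared(observed, predicted):
    """Goodness-of-fit between observed proportions and the fitted function."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # A perfectly flat response pattern has no variance to explain.
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0


def fit_toj(proportions):
    """Grid-search least-squares fit (an assumption of this sketch).

    Returns (pss_ms, tiw_ms, r2), with the TIW taken as the Gaussian SD.
    """
    best = None
    for pss in range(-333, 334, 3):          # hypothetical search range
        for sigma in range(10, 601, 5):      # hypothetical search range
            pred = [cumulative_gaussian(c, pss, sigma) for c in COAS_MS]
            sse = sum((o - p) ** 2 for o, p in zip(proportions, pred))
            if best is None or sse < best[0]:
                best = (sse, pss, sigma, pred)
    _, pss, sigma, pred = best
    return pss, sigma, r_squared(proportions, pred)


def classify(r2, criterion=0.5):
    """Apply the R^2 criterion: fits below 0.5 count as TOJ-unable."""
    return "TOJ-unable" if r2 < criterion else "TOJ-able"


# Hypothetical noise-free observer with an audio-leading PSS.
example = [cumulative_gaussian(c, -54, 190) for c in COAS_MS]
pss_ms, tiw_ms, r2 = fit_toj(example)
group = classify(r2)
```

The SJ data could be treated analogously by replacing the cumulative Gaussian with a scaled Gaussian whose peak location gives the PSS.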

### Procedure and Analysis of fMRI Experiment

The fMRI procedure was similar to the behavioral experiment, except that a reduced stimulus set was presented: −333, 0, PSS, and +333. The PSS values were obtained individually from the pre-fMRI experiment separately for SJ and TOJ. To be as accurate as possible, the individual PSS conditions were selected as the closest COA level to that of the PSS value derived from the pre-fMRI data fits. Although COA levels in the pre-fMRI experiment were restricted to ±333, ±267, ±200, ±133, ±67, and 0 ms, COA values for the PSS condition in the fMRI experiment could be any COA level between 0 and ±333 in 16 ms increments, i.e., one frame at a time. This use of an individually determined stimulus level (PSS) is similar to the approach used by Binder (2015) to determine stimuli levels, but did not use the simultaneity threshold approach based on separate sound-first and flash-first trials. For TOJ-unable participants, we used average results from a behavioral study using identical stimuli (Petrini et al., 2010). An optimized mixed block/event-related design was used to enable investigation of differences between the tasks at both transient and sustained levels of processing.
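The selection of the individual PSS condition can be sketched as snapping a fitted PSS to the nearest available frame step. The sketch below assumes that the "16 ms" step corresponds to one frame at the 60 Hz refresh rate reported for the displays (about 16.7 ms); the function name and the clipping to ±333 ms are illustrative assumptions, not details taken from the study.

```python
FRAME_MS = 1000.0 / 60.0   # assumption: one frame at a 60 Hz refresh rate
MAX_COA_MS = 333.0         # largest asynchrony used in the experiment


def nearest_frame_coa(pss_ms):
    """Return the frame-aligned COA (ms) closest to a fitted PSS value.

    Values beyond the tested range are clipped to +/-333 ms first
    (an assumption of this sketch).
    """
    clipped = max(-MAX_COA_MS, min(MAX_COA_MS, pss_ms))
    n_frames = round(clipped / FRAME_MS)
    return n_frames * FRAME_MS


# Hypothetical fitted PSS values (ms) for the two tasks.
sj_condition = nearest_frame_coa(70.0)    # a shift of 4 frames
toj_condition = nearest_frame_coa(-55.0)  # a shift of 3 frames, audio-leading
```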

Each of two functional runs (∼22 min each) consisted of 32 stimulation blocks (half SJ and half TOJ, randomized) and after every two stimulation blocks there was a 16 s fixation block (**Figure 1**). Within a stimulation block (25 s) there were 9 events: 5 stimuli (each 3 s) separated by 4 fixation events (1, 2, 3, or 4 s in pseudorandom order). Each COA condition was presented a total of 40 times (20 per run) per task. To minimize the correlation between the transient (stimuli) and sustained (stimulation block) regressors the number of times an individual COA condition was presented within a single stimulation block was manipulated as follows: in a run, a COA level was presented 0 times during 4 stimulation blocks, once in 6 blocks, twice in 4 blocks and 3 times in 2 blocks, i.e., a total of 20 presentations for each COA level and task. One thousand sequences with different randomizations of the order of events and blocks were created and the best chosen by balancing efficiency and correlation. In the chosen sequence, the mean correlation between sustained and transient regressors was 0.47, which enabled reliable estimation of both types of BOLD response (Otten et al., 2002).
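The counts reported for the design can be verified with simple arithmetic. The short sketch below recomputes the block duration and the number of presentations per COA level from the figures given in the text; the variable and function names are illustrative only.

```python
STIM_S = 3                     # duration of each stimulus (s)
GAPS_S = [1, 2, 3, 4]          # the four fixation events within a block (s)
STIM_PER_BLOCK = 5
BLOCKS_PER_RUN = 32
N_TASKS = 2
N_COA = 4
N_RUNS = 2

# How often one COA level appears within a block -> number of blocks
# (per task and run) in which it appears that often.
REPEAT_PLAN = {0: 4, 1: 6, 2: 4, 3: 2}


def block_duration_s():
    """Five 3-s stimuli plus four fixation gaps of 1, 2, 3, and 4 s."""
    return STIM_PER_BLOCK * STIM_S + sum(GAPS_S)


def presentations_per_run(plan):
    """Presentations of one COA level for one task within a run."""
    return sum(times * n_blocks for times, n_blocks in plan.items())


blocks_per_task = BLOCKS_PER_RUN // N_TASKS           # 16 blocks per task per run
stimuli_per_task = blocks_per_task * STIM_PER_BLOCK   # stimulus slots per task per run
per_run = presentations_per_run(REPEAT_PLAN)
per_experiment = per_run * N_RUNS
```

The repeat plan covers all 16 blocks of a task in a run, and 4 COA levels × 20 presentations fills the 80 stimulus slots available to that task.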

Auditory stimuli were presented via Sensimetrics S14 insert headphones at approximately 85 dB. The visual cue was backprojected (Panasonic PT-D7700E DLP; 1,024 × 768 pixel resolution, 60 Hz refresh rate) onto a screen behind the participant's head, visible via a mirror mounted on the MR head coil with an approximate viewing distance of 65 cm.

Functional images covering the whole brain (field of view: 210 mm, number of slices: 32, voxel size: 3 × 3 × 3 mm) were acquired with a 3T Tim Trio Scanner (Siemens) and a 32 channel head coil using an echoplanar imaging (EPI) sequence (ascending-interleaved, TR: 2 s, TE: 30 ms, flip angle: 77°). At the end of the fMRI session, high-resolution T1-weighted images (anatomical scan) were obtained (field of view: 256 mm, number of slices: 192, voxel size: 1 × 1 × 1 mm, flip angle: 9°, TR: 1.9 s, TE: 2.52 ms).

SPM8 software (Wellcome Department of Imaging Neuroscience, London, UK) was used to pre-process and analyse the imaging data. First, the anatomical scan was AC-PC centered; this correction was then applied to all EPI volumes. Functional data were slice-time corrected and subsequently motion corrected using a two-pass six-parameter rigid-body spatial transformation (Friston et al., 1996), which realigned all functional volumes to the first volume of the scan closest to the anatomical scan, and subsequently realigned all the images to the mean volume. The anatomical scan was co-registered to the mean volume and segmented. The functional and anatomical images were then normalized to the Montreal Neurological Institute (MNI) template using the parameters issued from the segmentation, keeping the voxel resolution of the original scans (3 × 3 × 3 and 1 × 1 × 1 mm respectively). Functional images were smoothed with an 8 × 8 × 8 mm full width at half maximum Gaussian kernel. Global linear trends and rapid aliased noise were minimized through high-pass filtering the data with a cutoff period of 128 s and an autoregressive [AR(1)] filter during statistical model estimation. All the analyses were conducted in a masked skull-stripped search volume, created by combining three tissue maps (white and gray matter and cerebrospinal fluid) output at the segmentation procedure.

Data were analyzed in a two-level random-effects analysis, with each run entered as a separate session. The first-level analysis involved a design matrix with 18 regressors per session. There were 10 regressors of interest: two for sustained-effects and eight for transient-effects (4 conditions × 2 tasks). SJ and TOJ sustained-effects were modeled using 25-s boxcar functions; transient-effects were modeled separately for each task and COA level with event-related impulse responses. Both the sustained and the transient regressors were convolved with a canonical hemodynamic response function. Eight regressors of no interest were included to account for the instruction periods, six realignment motion parameters and the grand mean.
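To make the distinction between sustained and transient regressors concrete, the sketch below builds one 25-s boxcar and the five stimulus impulses of a single hypothetical block, convolves both with a double-gamma response function, and correlates them. The double-gamma shape (peak term of order 6, undershoot term of order 16, ratio 1/6) is a common approximation of a canonical hemodynamic response and is an assumption here, as are the 1-s time grid, the block onset, the gap order, and the reading of the eight regressors of no interest as 1 instruction + 6 motion + 1 grand-mean regressor. The resulting correlation describes this toy block only and is not the value reported for the chosen sequence.

```python
import math

DT_S = 1.0  # time grid of the sketch; all onsets in the design fall on whole seconds


def double_gamma_hrf(t):
    """Peak term minus a smaller, later undershoot term (assumed shape)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def convolve_with_hrf(signal, hrf_len_s=32):
    """Causal discrete convolution, truncated to the length of the input."""
    kernel = [double_gamma_hrf(k * DT_S) for k in range(int(hrf_len_s / DT_S))]
    return [
        sum(signal[n - k] * kernel[k] for k in range(min(n + 1, len(kernel))))
        for n in range(len(signal))
    ]


def boxcar(n_samples, onset_s, duration_s):
    """Sustained (block-level) predictor before convolution."""
    return [1.0 if onset_s <= i * DT_S < onset_s + duration_s else 0.0
            for i in range(n_samples)]


def impulses(n_samples, onsets_s):
    """Transient (event-level) predictor before convolution."""
    idx = {int(round(o / DT_S)) for o in onsets_s}
    return [1.0 if i in idx else 0.0 for i in range(n_samples)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


# One hypothetical stimulation block: 5 stimuli of 3 s, gaps in a made-up order.
block_onset_s, gaps_s = 10, [2, 1, 4, 3]
stim_onsets, t = [], block_onset_s
for i in range(5):
    stim_onsets.append(t)
    t += 3 + (gaps_s[i] if i < 4 else 0)

n_samples = 80
sustained = convolve_with_hrf(boxcar(n_samples, block_onset_s, 25))
transient = convolve_with_hrf(impulses(n_samples, stim_onsets))
r = pearson(sustained, transient)

# Regressor bookkeeping per session as described in the text.
n_interest = 2 + 4 * 2       # 2 sustained + (4 conditions x 2 tasks) transient
n_no_interest = 1 + 6 + 1    # assumed split: instructions, motion, grand mean
n_regressors = n_interest + n_no_interest
```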

Using the general linear model, parameter estimates for each regressor were calculated and used to create contrast images for a condition relative to baseline (Friston et al., 1995). The resulting images were used in repeated-measures ANOVAs conducted using the GLM Flex software (http://mrtools.mgh.harvard.edu/index.php/Main\_Page#Welcome.21). The first ANOVA examined any differences in sustained-effects produced by the factors Task (SJ/TOJ) and Group (TOJ-able/TOJ-unable), plus their interaction. The second tested for differences in transient-effects produced by the factors Group (TOJ-able/TOJ-unable), Task (SJ/TOJ), COA Condition (−333/0/PSS/333), and their interactions. We report all clusters that were significant after multiple comparisons correction (p < 0.05) based on cluster-extent false discovery rate (Chumbley and Friston, 2009) with the auxiliary voxel-level threshold set at p < 0.0001.

### RESULTS

### Behavioral Results

In line with previous research (Petrini et al., 2010; Love et al., 2013), data from the pre-fMRI experiment indicated that some participants could not successfully make TOJs (**Figure 2**). Eleven out of 20 participants were deemed unable to make TOJs based on an R<sup>2</sup> value of < 0.5; from now on they will be referred to as the TOJ-unable group, with the other 9 participants being the TOJ-able group. The mean SJ PSS of all participants was a +70 ms (s.e.m = 5.7) video-leading stimulus, while the mean TOJ PSS from the TOJ-able participants was a −55 ms (s.e.m = 24.6) audio-leading stimulus. Paired-samples t-tests, using the pre-fMRI data of participants able to achieve both tasks (TOJ-able), highlighted a significant difference [t(8) = 3.54, p = 0.008] between TOJ and SJ PSS but not TIW [SJ = 127 ms (s.e.m = 12), TOJ = 190 ms (s.e.m = 55), t(8) = 1.243, p = 0.249]. Comparison between the TOJ-able and TOJ-unable group using independent-samples t-tests indicated there was no difference in SJ PSS [difference 17.8 ms, t(18) = 1.628, p = 0.121] or TIW [difference 30.1 ms, t(18) = 1.759, p = 0.096] between the groups.

Behavioral responses to the four conditions presented during fMRI scanning are displayed separately for each group and for each task in **Figure 3**. Furthermore, a direct comparison of individual performance inside and outside the MRI environment can be visualized in Supplementary Image 1. To compare behavior from pre-fMRI and during fMRI, separate three-factor (Group: TOJ-able/TOJ-unable × Time: pre-fMRI/fMRI × COA: −333, 0, and +333) repeated-measures ANOVAs were conducted on the SJ and TOJ data. Both highlighted significant interactions between the Time and COA Factors [SJ: F(2,34) = 20.39, p < 0.001; TOJ: F(2,34) = 3.67, p < 0.036]. Visual inspection of the data showed that for both SJ and TOJ there was a difference between pre-fMRI and during-fMRI performance, but only for the +333 COA condition. For the SJ task this difference appeared as a higher proportion of synchronous responses during fMRI, and for the TOJ task it appeared as a higher proportion of video-first responses during fMRI.

### fMRI Results

The main aim of this study was to explore differences in neural activity underpinning SJs and TOJs. Significant differences (p < 0.05 cluster-corrected) were found between the two tasks at both sustained and transient levels of processing. One region showed a significant difference in sustained activity between the tasks: the left middle occipital cortex (MOC) was activated more by SJ than TOJ. Investigation of percentage signal change relative to baseline also indicates that on average sustained task-related deactivation was observed during TOJ vs. activation during SJ (sustained main-effect of Task, **Table 1** and **Figure 4A**). During transient events, TOJs revealed significantly more activation than SJs in the middle occipital, middle frontal, precuneus and medial superior frontal regions of the left hemisphere (transient main-effect of Task, **Table 1** and **Figure 4B**). Within the right anterior cingulate there was a significant Task by COA Condition interaction driven by larger deactivations to audio- and video-leading conditions than to PSS and 0 COA, but only during TOJs.

A significant transient main-effect of COA Condition was found in the bilateral putamen, insula, superior temporal cortex, left angular gyrus and anterior cingulate and right superior medial frontal cortex (**Table 1**, **Figure 5**). While our main focus was on effects of task on brain activity, the significant transient main-effect of COA Condition highlights a network of regions involved in processing temporal information in audiovisual stimuli, regardless of the task performed.

No other significant main effects or interactions were found. Since the Group factor (TOJ-able/TOJ-unable) did not produce any significant results, we repeated the above analyses with this factor removed. The results were highly consistent with those described, with the addition that the transient TOJ>SJ effects found in the left middle occipital and middle frontal cortex were now also observed in the right hemisphere.

### DISCUSSION

In the current experiment, the same participants made SJs and TOJs to an identical set of synchronous and asynchronous audiovisual point-light-drumming stimuli, while their sustained and transient task-related BOLD responses were recorded using fMRI. The results show that, even under identical stimulus conditions, TOJs and SJs have overlapping (main-effect of COA Condition) but divergent neural correlates for both sustained and transient BOLD responses. This neuroimaging evidence supports previous behavioral research indicating that the two tasks measure different processes, or aspects, of temporal processing (e.g., van Eijk et al., 2008; Vatakis et al., 2008; Fujisaki and Nishida, 2009; Petrini et al., 2010; Maier et al., 2011; Vroomen and Stekelenburg, 2011; Love et al., 2013). The current transient level results are also largely consistent with the results of a recent study which tested a similar hypothesis using an event-related fMRI analysis and single-event, simpler, audiovisual stimuli (Binder, 2015). In both the event-related results of Binder (2015) and the current transient effects, no region was more activated during SJs than TOJs, and all regions that activated more to TOJs than SJs were in the left hemisphere. Some of these left hemisphere regions, required during TOJs but not, or at least less so, during SJs, were consistent across studies. For example, both found similar MFG clusters, and there is a likely overlap between the cluster labeled as superior/inferior parietal lobule by Binder (2015) and our MOC cluster. While our results and those of Binder (2015) are consistent, the current work also suggests that the difference in neural activity between these two tasks is consistent regardless of the stimulus being processed. This is an important and novel finding as it demonstrates that, although behavioral performance on both tasks depends on stimulus type/complexity, the overall network differences between them do not.

Although we predicted differences in the neural correlates underpinning SJs and TOJs, our design did not preclude finding brain activity common to both tasks. Correspondingly, regions showing sensitivity to COA under both task conditions represent a network for true audiovisual synchrony processing that is independent of task. Notably, we found that COA modulated activity in the bilateral putamen, insula and superior temporal cortex while participants performed either task–all areas previously found to be involved in the processing of audiovisual synchrony either during passive viewing (Calvert et al., 2001), other related tasks (Bushara et al., 2001; Olson et al., 2002) or SJs (Miller and D'Esposito, 2005; Stevenson et al., 2010). As similar regions play key roles in unimodal visual (Davis et al., 2009), auditory (von Steinbüchel et al., 1999), and tactile TOJs (Takahashi et al., 2013), it is possible that these regions represent a network which processes the relative timing of events rather than audiovisual synchrony per se. That is, they appear to be modulated by the relative timing of events, independent of the stimulated sensory modality or modalities and of the task being performed.

In contrast to transient level results, SJs produced more sustained activity in the left MOC than TOJs. The percentage signal change data actually indicate, however, that on average this region deactivated during TOJs (**Figure 4A**). The MOC has previously been found to exhibit task-induced deactivations, i.e., lower BOLD responses during a task than during baseline, and, in addition, these deactivations increase as a function of task difficulty (McKiernan et al., 2003; Hairston et al., 2008). In our previous and current work, we have shown that TOJs are perceived as being more demanding than SJs, a subjective measure that echoes more objective criteria such as wider TIWs and larger exclusion rates (Love et al., 2013). Therefore, we propose that the sustained deactivation found in the left MOC is indicative of a reallocation of resources (McKiernan et al., 2003) necessary for an extra stage of processing required during TOJs but not SJs (Jaśkowski, 1993; Binder, 2015; Miyazaki et al., 2016).

TABLE 1 | Significant clusters with peak MNI coordinate, extent, and statistical values.

Clusters showing a voxel-level FWER p < 0.05 based on peak height (Chumbley and Friston, 2009) are indicated by a star (\*).

During TOJs but not SJs, several regions of the left hemisphere (middle occipital, middle frontal, precuneus and superior medial frontal cortex) displayed increased transient activity compared to baseline. These left-hemisphere results are in line with a voxel-based lesion-symptom mapping (VLSM) study, which argued that regions uniquely involved in visual TOJs, as compared to relative size judgments, were lateralized in the left hemisphere (Wencil et al., 2010). In that study, lesions in both the left inferior frontal and left posterior parietal cortex correlated with visual TOJ deficits. Likewise, lesions of the left hemisphere have also been associated with deficits in auditory TOJs (von Steinbüchel et al., 1999; Wittmann et al., 2004). The current experiment and that of Binder (2015) add to these findings by showing that left hemisphere regions were also uniquely involved in audiovisual TOJs, even when contrasted with another synchrony-based task. However, it is worth noting that, similarly to Wencil et al. (2010), we are not suggesting TOJs are solely supported by the left hemisphere and SJs by the right hemisphere; indeed, several bilateral regions were activated during both tasks (see **Table 1** and **Figure 5**). In addition, a left-lateralized TOJ>SJ effect should be treated with caution since similar effects were also observed in the right hemisphere when the Group factor was removed from the current analysis. Furthermore, despite noting that during tactile stimulation the regions activating more to TOJs than SJs were primarily distributed in the left hemisphere, Miyazaki et al. (2016) also found two such regions in the right hemisphere. Comparing unimodal visual TOJs to a shape discrimination task in two experiments, Davis et al. (2009) found bilateral temporal parietal junction activation to be larger for TOJs in the first experiment, while the same effect was found only in the left hemisphere during the second, better controlled, experiment. 
Taking all this evidence into consideration it appears prudent to talk about a prevalence of left hemisphere regions being more activated for TOJs than SJs rather than about a lateralization of this effect.

Regardless of whether the extra neural responses required to make TOJs are lateralized or not, they do highlight cognitive processing that is over and above that needed to make SJs. Binder (2015) argued that the additional activation was evidence in favor of a two-stage cognitive processing architecture for TOJs (Jaśkowski, 1991), requiring the perception of both (a)synchrony and order–SJs require only the first. Our results are coherent with this argument and further support it by highlighting that sustained task-induced deactivation of left MOC may facilitate this extra cognitive processing. This evidence is in opposition to the theory that TOJs can be made using the same information (arrival-time difference between the cues) and cognitive architecture as SJs (Sternberg and Knoll, 1973; Allan, 1975). Comparison of these two tasks using more time-sensitive neuroimaging techniques, such as electroencephalography and magnetoencephalography, would elucidate whether these stages are conducted serially or in parallel. Use of the VLSM technique to search for a double-dissociation between audiovisual TOJs and SJs, similar to that found between visual TOJs and relative size judgments by Wencil et al. (2010), could also greatly increase our knowledge of the overlapping nature of these two processes.

### LIMITATIONS

One of the strengths and novelties of our study could also, counterintuitively, be seen as one of its limitations. As mentioned, we used a more ecological and complex stimulus formed by a series of events (nine impact movements and nine resulting sounds) rather than a single, well-defined event (e.g., beep-flash). Having multiple events means that creating the different asynchrony levels between the visual and auditory streams has the effect of realigning sensory inputs at longer COAs. This could have implications and perhaps partly explain the performance of TOJ-unable participants.

However, there are several reasons why we do not consider this a limitation. First of all, it is unclear why this realignment would affect TOJ more than SJ. Considering that participants could have used all the events in both tasks to make their judgments, we could assume that the effect of a decrease in asynchrony for some events, due to realignment, should have had an influence on both tasks, not only TOJ. Second, we would predict that any influence this realignment had on participants' responses should have been seen for both audio- and video-leading conditions. However, this was not the case. Hence, though the effect of auditory and visual event realignment in complex stimuli needs to be considered and discussed, we do not believe it undermines the task-related effects presented here. Finally, our interpretation is supported by the high level of consistency between our findings and those of Binder (2015), who used a single-event flash and beep stimulus, for which no realignment at larger audiovisual lags could occur.

Another possible limitation of the current work relates to the relative difficulty of the tasks. In general, when comparing tasks it is prudent to equate difficulty across the tasks. However, in reality this is non-trivial, particularly when there is an inherent difference in difficulty between them. Participants in this and our previous work (Love et al., 2013) reported that TOJs were in general more difficult than SJs based on their experience with the tasks for a wide variety of cue onset asynchronies and a wide variety of stimulus types. For simple beep-flash stimuli, for example, we previously (Love et al., 2013) found no quantitative indication of TOJs being more difficult than SJs except for verbal reports, in which 71% of participants thought TOJs were more difficult. This example helps to highlight two different concepts of task difficulty: (1) between-task and (2) within-task difficulty. By within-task difficulty we refer to, for example, performing at a 75% correct level on two different tasks. While difficulty could be believed to be equal in this situation, we would argue that this is not necessarily so. For example, it is inherently more difficult to perform at 75% when solving differential equations compared to performing at 75% on a multiplication task. Clearly, there are cases when equating this type of within-task difficulty is not possible. We believe that there is an inherent within-task difficulty difference between SJs and TOJs that cannot easily be equated. Perhaps, as suggested above, this could be due to an extra stage of cognitive processing required for TOJs. That said, it would be informative to compare the two tasks at audio- and video-leading just-noticeable-difference COA levels, as this may be the best control of within-task difficulty. We chose not to do this, as it would lead to different stimulus conditions (COA levels) being presented for each task. 
In general, behavioral experiments highlighting differences between SJ and TOJ have used identical stimulus conditions to compare the tasks. Here we aimed to investigate the underlying neural mechanisms that reflect the findings of such behavioral work.

As detailed in the Behavioral Results section, there was a significant difference in group mean behavioral performance for the +333 COA condition depending on whether it was conducted inside or outside the MRI environment. Furthermore, while it was not possible to statistically compare PSS performance from outside and inside the MRI environment, it appears that, at least for some participants, performance on the PSS condition was also affected by the MRI environment (**Figure 3A**). One obvious difference between the experimental procedures in these two situations was the ratio of synchronous to asynchronous conditions presented. In the pre-fMRI experiment multiple asynchronous COA levels were presented, whereas during the fMRI experiment an equal number of synchronous (0 COA and PSS) and asynchronous (−333 and +333) COA levels were presented. It is possible that these different experimental contexts influenced behavioral performance. Another possibility is that the noise produced by the MRI scanner made the information from the auditory cue less reliable, thus widening the TIW for the participants during the scan, which would result in a lower ability to detect asynchrony, particularly when vision led sound.

Our subsample sizes of 9 (TOJ-able group) versus 11 (TOJ-unable group) should be considered as a limitation in the ability to detect differences between the two groups. Indeed, no significant differences involving the Group factor were observed. However, our failing to observe significant differences between the groups should not, as with all null results, be interpreted as evidence of no difference. A study designed specifically to test for differences between these two groups involving a larger number of observations per group may well highlight significant differences. Unthresholded statistical maps from the current study (http://neurovault.org/collections/UMJLMEEJ/) indicate, for example, a possible main-effect of Group in bilateral putamen and in the right STS.

In the current fMRI experiment the visual cue was presented on a screen approximately 65 cm from the head of participants, while the auditory cue was presented via headphones. The relative spatial location of the sensory cues of a multisensory stimulus is one of the main factors in regulating multisensory integration mechanisms (Stein and Meredith, 1993). We have previously shown that for an SJ task, using headphones or speakers placed next to the screen led to no significant difference in behavioral performance when using the same stimuli used in the present study (Petrini et al., 2009a). Therefore, we believe that it is unlikely that this spatial discordance significantly influenced the current results; however, an effect of such spatial discrepancy for TOJs cannot be ruled out. This is a limitation we share with the study by Binder (2015) and it is dictated by the common use of headphones during fMRI studies to reduce background noise. It would be interesting in future studies to ascertain the effect of spatial displacement on both tasks by conducting the TOJ and SJ tasks with both headphones and speakers.

### CONCLUSION

In conclusion, important differences between the neural correlates of synchrony judgments (SJs) and temporal order judgments (TOJs) have been highlighted at both the sustained and the transient BOLD response levels. The similarity between the current results and those of Binder (2015) provides converging evidence that the divergent neural correlates of these two tasks likely exist regardless of stimulus complexity; however, it is important to also confirm this using contextually natural (not only white dots on a black background) stimuli. We speculate that the more demanding, possibly two-stage, cognitive processing required for TOJs induces a deactivation of the MOC to reallocate resources to regions required to make the judgment: the middle occipital, middle frontal, precuneus and superior medial frontal cortex. One important conclusion arising from this study is that care must be taken during future attempts to use atypical temporal processing as a diagnostic tool, or to inform the creation of remediation strategies for clinical disorders such as autism and schizophrenia. As SJs and TOJs are underpinned not only by overlapping but also by divergent neural mechanisms, atypical processing found for one task may or may not exist for the other (Capa et al., 2014). Neuroimaging studies examining differences in temporal processing between the mentioned clinical populations and the typically developing population could focus on the task-independent activated regions (e.g., putamen, insula and superior temporal cortex) to identify useful diagnostic markers.

## DATA AVAILABILITY

Unthresholded statistical maps were uploaded to the NeuroVault.org database and are available at http://neurovault.org/collections/UMJLMEEJ/. The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

### AUTHOR CONTRIBUTIONS

All authors participated in experimental design, interpretation, manuscript editing and approval. SL and ML analyzed the data. SL collected the data and wrote the first draft of the manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00274/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Love, Petrini, Pernet, Latinus and Pollick. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Minimal Self and Timing Disorders in Schizophrenia: A Case Report

Brice Martin<sup>1</sup>, Nicolas Franck<sup>1</sup>, Michel Cermolacce<sup>2</sup>, Jennifer T. Coull<sup>3</sup> and Anne Giersch<sup>4</sup>\*

<sup>1</sup> Centre Ressource de Réhabilitation Psychosociale et de Remédiation Cognitive, Centre Référent Lyonnais en Réhabilitation et en Remédiation Cognitive Hôpital du Vinatier, Centre National de la Recherche Scientifique UMR 5229, Lyon, France, <sup>2</sup> Service Universitaire de Psychiatrie, Hôpital Ste Marguerite, Marseille, France, <sup>3</sup> Laboratoire des Neurosciences Cognitives (UMR 7291), Aix-Marseille Université & Centre National de la Recherche Scientifique, Marseille, France, <sup>4</sup> Institut National de la Santé et de la Recherche Médicale U1114, Pôle de Psychiatrie, Fédération de Médecine Translationnelle de Strasbourg, Centre Hospitalier Régional Universitaire of Strasbourg, Université de Strasbourg, Paris, France

For years, phenomenological psychiatry has proposed that distortions of the temporal structure of consciousness contribute to the abnormal experiences described before schizophrenia emerges, and may relate to basic disturbances in consciousness of the self. However, considering that temporality refers mainly to an implicit aspect of our relationship with the world, disturbances in the temporal structure of consciousness remain difficult to access. Nonetheless, previous studies have shown a correlation between self disorders and the automatic ability to expect an event in time, suggesting timing is a key issue for the psychopathology of schizophrenia. Timing disorders may represent a target for cognitive remediation, but this requires that disorders can be demonstrated at an individual level. Since cognitive impairments in patients with schizophrenia are discrete, and there is no standardized timing exploration, we focused on timing impairments suggested to be related to self disorders. We present the case report of AF, a 22-year-old man suffering from schizophrenia, with no antipsychotic intake. Although AF shows few positive and negative symptoms and has a normal neurocognitive assessment, he shows a high level of Minimal Self Disorders (SDs) (assessed with the EASE scale). Moreover, AF has a rare ability to describe his self and time difficulties. An objective assessment of timing ability (variable foreperiod task) confirmed that AF had temporal impairments similar to those previously described in patients, i.e., a preserved ability to distinguish time intervals, but difficulty benefiting from the passage of time to expect a visual stimulus. He presents additional difficulties in benefiting from temporal cues and adapting to changes in time delays. The impairments were ample enough to yield significant effects with analyses at the individual level. Although causal relationships between subjective and objective impairments cannot be established, the results show that exploring timing deficits at the individual level is possible in patients with schizophrenia. Moreover, the results are consistent with hypotheses relating minimal self disorders (SDs) to timing difficulties. They suggest that both subjective and objective timing investigations should be developed further so that their use at an individual level can be generalized in clinical practice.

Keywords: self disorders, schizophrenia, timing and time perception, minimal self, hazard function, simultaneity judgment, implicit

#### Edited by:

Deana Davalos, Colorado State University, United States

#### Reviewed by:

Gianluca Serafini, Dipartimento di Neuroscienze e Organi di Senso, Ospedale San Martino (IRCCS), Italy Diego A. Golombek, Universidad Nacional de Quilmes (UNQ), Argentina

#### \*Correspondence:

Anne Giersch giersch@unistra.fr

Received: 15 January 2018 Accepted: 22 March 2018 Published: 06 April 2018

#### Citation:

Martin B, Franck N, Cermolacce M, Coull JT and Giersch A (2018) Minimal Self and Timing Disorders in Schizophrenia: A Case Report. Front. Hum. Neurosci. 12:132. doi: 10.3389/fnhum.2018.00132

## INTRODUCTION

Psychiatrists in the field of both phenomenology and experimental psychology have proposed that disturbances of temporality in schizophrenia represent a key issue for psychopathology (Fuchs, 2007; Vogeley and Kupke, 2007). Timing disorders may relate especially to basic disturbances in consciousness of the self, i.e., the "minimal self" (Martin et al., 2014; Giersch and Mishara, 2017a). The minimal self is defined by the pre-reflexive "mineness" of conscious experience (Gallagher, 2000). For example, we say "I see a tree" when pointing out a tree to another person. In this sentence, the subject says "I" without thinking about it, as a result of his/her natural first-person perspective. The presence of the "I" signals the sense of "minimal self." As explained by Stanghellini (2009) "I experience myself as the perspectival origin of my experiences (i.e., perceptions or emotions), actions and thoughts." Minimal Self disorders (SDs) are considered as trait markers. They have been described in prodromal, as well as chronic stages of schizophrenia (Møller and Husby, 2000; Parnas et al., 2003, 2005; Raballo et al., 2011). They are considered stable, core features of schizophrenia (Lysaker and Lysaker, 2010; Gallese and Ferri, 2013; Nelson et al., 2013b; Hur et al., 2014; Mishara et al., 2014; Nordgaard and Parnas, 2014; Nordgaard et al., 2017), and discriminate the schizophrenia spectrum from bipolar disorders with psychotic experiences and other psychoses (Haug et al., 2012a), or borderline personality (Nelson et al., 2013a). However, descriptions of temporality disorders are rarer. Temporality is the sense of continuity that is inherent to the concept of minimal self: we implicitly think of ourselves as unique beings that are continuous in time. Our sense of self continuity is a given, which we do not naturally question. It is so strong a feeling that it is difficult to imagine its perturbation, making it difficult to access by verbal report.
Timing disorders are thus often ignored in clinical settings, despite their potential role in pathophysiology. They have however been demonstrated in group studies, and have been related to minimal self disorders (Martin et al., 2017). Here we check whether it is possible to demonstrate timing disorders at an individual level, which is a critical step toward the integration of such evaluations into clinical practice.

We present a case report of AF because he has the rare ability to describe his time and self disorders ("subjective" timing). We also report experimental results ("objective" timing) analyzed at the single case level in AF, focusing on temporal expectation in the seconds time-range. Complementary results on a simultaneity judgement task in the millisecond range (Martin et al., 2014; Giersch and Mishara, 2017a,b), can be found in Supplementary Materials. We explored implicit timing, which is defined by the processing of information in time independently of any explicit temporal judgement (Coull and Nobre, 2008). This type of timing is very close to the phenomenological description of human time in that it shapes conscious experience rather than representing the contents of time. We favor this approach because explicit duration judgements are characterized by variability rather than a clear over- or underestimation (Thoenes and Oberfeld, 2017), can be affected by attention or working memory impairments (e.g., Campbell and Davalos, 2015), and hence are difficult to interpret at an individual level (see Giersch and Mishara, 2017b, for a discussion on implicit vs. explicit measures in schizophrenia).

The temporal expectation task explores the ability to continuously extract temporally predictive information from the elapse of time, i.e., to benefit from the predictive information inherent in the unidirectional flow of time. A disconnection from time's flow would mean a lack of immersion in the world, which should impact the patient's ability to feel present and embodied in his actions, feelings and thoughts. The observation of both types of impairment in this patient would reinforce the hypothesis of a link between implicit timing and the sense of self.

### AF Case Report

### Anamnesis and Symptomatology

AF is a 22-year-old man, who has never worked since his professional training in electronics. He attends a rehabilitation center because of social and professional difficulties. He has been followed psychiatrically for 4 years, following a suicide attempt and a diagnosis of schizophrenia. When we met him, he had no more psychiatric support and no treatment. Written informed consent was obtained from AF for the publication of this case report.

AF was born in Lyon (France). He was adopted at the age of 2 years. His childhood contained no great peculiarities, though he describes himself as "shy" since childhood. He has always been imbued with the diffuse feeling of "being different." During childhood, he was a good pupil. The family atmosphere was serene. However, a state of psychic suffering gradually settled in during early adulthood. It was marked by a difficulty in maintaining good school performance ("everything required more and more effort... maybe I became too analytical"). AF also describes growing difficulty in interacting with others "I gradually became isolated... It is as if contact with others was not natural anymore...I began to wonder how others behaved so naturally." AF adds: "Everything was a question for me... stupid questions like 'why does the mouth go upwards when you smile'." He gradually quit school, became isolated and withdrew to the family home. At the age of 18, he attempted suicide "As amazing as it may seem... I could no longer stand to always be thinking about everything." He was then hospitalized for 3 months in a psychiatric ward for a delirious state, marked by ideas of persecution. Antipsychotic drugs (risperidone and then olanzapine) were then prescribed and the diagnosis of schizophrenic disorder was made on the presence of delusions, negative symptoms, major impairments in interpersonal functioning, and lack of evidence for schizoaffective disorder and autism.

When we encountered AF, functional impairments persisted, i.e., difficulty in social and professional integration, but there was no longer any obvious behavioral symptomatology (see clinical details in Supplementary Material). The patient voiced two major complaints. The first concerned the feeling of being oneself, and the second his experience of time.

### Lack of Ownership—Minimal Self Disorder (SDs)

On several occasions, AF reported feelings of being at a distance from his environment and his own sensation of being a subject.

"I often observe myself from the outside... When I'm speaking, I'm seeing myself speaking at the same time that I'm speaking... when I'm walking, it is as if I am controlling myself artificially at the same time that I am walking... as if I were outside myself, and it is not really I who perceive... as if my perception is bland, artificial... as if I am not there...."

In addition, AF described a loss of natural evidence, associated with an experience of hyper-reflexivity. "Everything raises questions... it is as if the world would be so far from me that everything is questionable... why is the table called a table... why is the sky blue... why do I have two arms and not three."

Finally, AF described: "it is as if the content of my perception is wrong... as if there is me on one side, and an artificial environment on the other side that reflects my perception... as if what I perceive is not really what is there to be perceived.... I sometimes have the impression that it is cut in two parts...or that I'm alone, without a real perception."

AF's bodily feelings, and relationships to his own thoughts were also disturbed, and descriptions can be found in Supplementary Materials.

### Disturbance of Temporality

Timing disturbances are usually difficult to explore, but come up frequently in AF's complaints: "I do not feel the time," "The word 'time' has no meaning for me...[I have to] use tools, tricks, to know that time has passed." "For example, I often look at my watch to know that time elapsed..." "You see, I can use a metaphor to explain to you... Birds, they have a sense that allows them to orient themselves... a kind of magnetism... It is an innate thing... If they do not have it, they cannot navigate... Me, it's the time I do not have... I'm like blind to time... but I cannot explain it better... I try to find out how to talk about it... but I can't manage to explain. . . It may be the most important thing to understand..."

Although time is his main complaint, AF is unable to better specify the particularity of his relationship to time.

### OTHER INVESTIGATIONS—MATERIALS AND METHODS

### EASE Scale

The EASE is a 57 item semi-structured interview designed to explore SDs. The scale was administered by BM, who was trained in its use by one of the authors of the EASE scale (JP). Five domains are explored, as detailed in Supplementary Materials.

AF had a total score of 16, which corresponds to a score frequently found in the population of people with schizophrenia. Detailed sub-scores can be found in Supplementary Material.

### Neuro Cognition

A neurocognitive assessment was performed by a neuropsychologist, including psychomotor speed, working memory, verbal memory and executive functions. No major deficit was observed, except a flexibility deficit. Scores are detailed in **Table 1**.

TABLE 1 | Scores of AF in the neurocognitive battery.

The battery was designed to explore the main cognitive functions known to be impaired in patients with schizophrenia (Heinrichs and Zakzanis, 1998), and known to affect explicit duration estimation in these patients (Bonnot et al., 2011; Roy et al., 2012; Ward et al., 2012; Campbell and Davalos, 2015; de Montalembert et al., 2016). It is to be noted that the patient did not display thymic disorders (see Supplementary Material for details).

### Temporal Prediction Task

A visual target was displayed at various intervals after an initial fixation point (400 or 1,000 ms), and the task was to press a response key as quickly as possible once the target was displayed. The conditional probability of target presentation increases with the length of the interval (termed the "hazard function"), resulting in a heightening sense of expectation over time. Increasing expectation results in shorter reaction times (RTs) as the delay between fixation point and target increases (Niemi and Näätänen, 1981). This decrease in RT is referred to as the variable foreperiod effect.

To investigate the patient's ability to implicitly take such conditional probabilities into account, he performed the task in two different types of experimental blocks. In one set of blocks, the initial fixation point was not followed by the target in 25% of trials (25% catch trials) whereas in the other blocks the target always followed the fixation point (0% catch trials). Catch trials decrease the expectancy that the target will be presented and lead to "dispreparation," yielding a decrease in the variable foreperiod effect.
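As an illustration of the hazard function described above, the following minimal sketch computes the conditional probability of target onset at each foreperiod, given that the target has not yet appeared. It assumes the two foreperiods (400 and 1,000 ms) are equally likely on target-present trials, which the text does not state; the function name `hazard` and its parameters are hypothetical and are not taken from the study materials.

```python
def hazard(foreperiods, probs, catch_rate=0.0):
    """Conditional probability of target onset at each foreperiod,
    given that no target has appeared so far.

    foreperiods: candidate delays in ms
    probs: probability of each delay on target-present trials (ASSUMED values)
    catch_rate: proportion of trials with no target at all
    """
    remaining = 1.0  # probability mass not yet "used up" by earlier delays
    result = {}
    for fp, p in sorted(zip(foreperiods, probs)):
        p_abs = p * (1.0 - catch_rate)  # unconditional probability of this delay
        result[fp] = p_abs / remaining
        remaining -= p_abs
    return result


# ASSUMPTION: both foreperiods equiprobable on target-present trials.
no_catch = hazard([400, 1000], [0.5, 0.5], catch_rate=0.0)
with_catch = hazard([400, 1000], [0.5, 0.5], catch_rate=0.25)
print(no_catch)
print(with_catch)
```

Under these assumptions, the conditional probability at the long foreperiod reaches certainty when there are no catch trials but stays well below certainty with 25% catch trials, which is one way to picture why catch trials would be expected to weaken the variable foreperiod effect.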

We also investigated the ability to voluntarily orient attention in time. In half of the blocks, we used visual cues to indicate the time of occurrence of the target, either 400 or 1,000 ms after the fixation point (**Figure 1**). This temporal cue condition was contrasted with a neutral cue condition (which provided no temporal information).

FIGURE 1 | … absent in 25% of the cases (25% catch trials, 2 lower quadrants).

All experimental conditions were crossed orthogonally (**Figure 1**). A detailed description is provided in Supplementary Materials.

To further explore the implicit processing of time intervals we analyzed trial-to-trial sequential effects in blocks with 0% catch trials (Los and van den Heuvel, 2001). These effects rely on automatic mechanisms (Vallesi et al., 2014), whereby the time interval of the preceding trial influences performance on the current trial: if the foreperiod on trial N is shorter than that on trial N-1, reaction times are slowed. However, no such effect is observed when the foreperiod is longer on trial N than N-1.
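The trial-to-trial classification described above can be sketched as follows. This is only an illustrative outline, not the authors' analysis code: the function name `sequential_means` is hypothetical, and the example foreperiods and reaction times are invented for demonstration.

```python
from collections import defaultdict
from statistics import mean


def sequential_means(foreperiods, rts):
    """Mean RT on trial N, grouped by how its foreperiod compares
    with the foreperiod on trial N-1 (the first trial is skipped)."""
    groups = defaultdict(list)
    for prev, cur, rt in zip(foreperiods, foreperiods[1:], rts[1:]):
        if cur == prev:
            label = "same"
        elif cur < prev:
            label = "shorter"  # typically slowed in healthy participants
        else:
            label = "longer"   # typically no cost in healthy participants
        groups[label].append(rt)
    return {label: mean(values) for label, values in groups.items()}


# INVENTED example data (ms), for illustration only.
example_fps = [400, 1000, 1000, 400, 400]
example_rts = [300, 310, 305, 340, 300]
seq = sequential_means(example_fps, example_rts)
print(seq)
```

Comparing the "shorter" and "longer" groups against the "same" group is one simple way to check whether the cost is asymmetric, as described for typical participants, or present in both directions.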

The temporal orienting test has been used widely (Coull and Nobre, 1998; Correa et al., 2006; Nobre and van Ede, 2018) and participants matched to AF on age and education level showed typical results on this test: RTs were faster for temporal vs. neutral cue conditions and, in the neutral condition, for long vs. short foreperiods (Martin et al., 2017).

### RESULTS—TEMPORAL PREDICTION TASK

AF committed only 3.3% errors, which were not analyzed further. Analyses of variance were conducted on RTs for correct trials, with each RT as a random variable. In the first analysis we included cue type (temporal vs. neutral), catch-trial percentage (0% vs. 25%), and foreperiod (400 vs. 1,000 ms) as between-group variables. Results showed a significant interaction between foreperiod and catch-trial percentage [F(1, 752) = 6.5, p < 0.05]. Post-hoc Tukey analysis showed that with 0% catch trials there was no difference in RTs between 400 ms (323 ms) and 1,000 ms (321 ms), whereas with 25% catch trials, RTs increased between 400 ms (333 ms) and 1,000 ms (357 ms), p < 0.05. There was also a main effect of cue type [F(1, 752) = 13.8, p < 0.001]. RTs were longer for temporal (340 ms) than for neutral cues (321 ms) (**Figure 2**).
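To make the direction of the foreperiod by catch-trial interaction concrete, the short sketch below recomputes the foreperiod effect in each catch-trial condition from the four cell means reported above. It only restates arithmetic on those published means and is not a re-analysis; the variable and function names are hypothetical.

```python
# Mean RTs (ms) reported in the text, keyed by (catch-trial %, foreperiod in ms).
cell_means = {
    (0, 400): 323,
    (0, 1000): 321,
    (25, 400): 333,
    (25, 1000): 357,
}


def foreperiod_effect(catch_percent):
    """RT at the long foreperiod minus RT at the short foreperiod.
    A negative value would indicate the usual speeding with longer delays."""
    return cell_means[(catch_percent, 1000)] - cell_means[(catch_percent, 400)]


effect_no_catch = foreperiod_effect(0)    # 321 - 323
effect_catch = foreperiod_effect(25)      # 357 - 333
interaction_contrast = effect_catch - effect_no_catch
print(effect_no_catch, effect_catch, interaction_contrast)
```

The near-zero difference without catch trials and the clear slowing with catch trials correspond to the absent variable foreperiod effect and the dispreparation effect discussed in the text.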

In a second analysis, we analyzed sequential effects during neutral cue 0% catch trial blocks only. Results showed that RTs were longer when the foreperiods of two consecutive trials were different rather than identical [337 vs. 307 ms, F(1, 444) = 21.4, p < 0.001]. This effect was significant both when the foreperiod was 400 ms on trial N [342 vs. 305 ms, F(1, 224) = 18.9, p < 0.001], and when it was 1,000 ms [333 vs. 311 ms, F(1, 221) = 5.4, p < 0.05] (**Figure 3**).

FIGURE 2 | Mean reaction times for AF as a function of catch-trial condition (0% catch trials in the left-hand graph vs. 25% catch trials in the right-hand graph), foreperiod between cue and target (400 vs. 1,000 ms, on the X axis), and cue type (temporal in red and neutral in blue).

### DISCUSSION

The case of AF is characterized by low intensity schizophrenia symptoms. In particular, AF shows no delusion or disorganization of thought. However, AF presents significant subjective complaints that can be attributed to SDs. A normal sense of self involves automatic, not-reflexive, self presence, and immersion in the world (Parnas et al., 2005). We perceive and act in the world "from the inside." Because our thinking is "glued" to ourselves, our consciousness is non-spatial in nature and we cannot locate our consciousness, our thoughts, which are intuitively experienced as ours.

In contrast, AF describes experiences reflecting a distance from his own perception and self-awareness, i.e., a lack of immersion in the world. His own consciousness of himself, as well as his own perceptions, no longer seem embodied by the self and appear to him as mechanical and distinct from himself. In addition, the patient presents a loss of natural evidence, and a loss of immersion in his environment (Parnas et al., 2005). Finally, AF's descriptions of a "perceptualization of inner discourse" and "spatialization of thought" (see Supplementary Material) suggest that there is a gap between his thoughts and the self, reflecting and/or leading to the loss of the "mineness" of mental experience.

Taken together, these disturbances reflect a basic alteration of the feeling of presence, i.e., immersion in the world, which represents a fundamental feature of SDs. AF's disturbances are attested by his high score on the EASE scale.

AF had a normal neuro-cognitive assessment, despite his minimal self disturbances being quite large. This dissociation between normal neurocognitive outcome and SDs is consistent with the literature (Haug et al., 2012b), which does not indicate a correlation between typical neuro-cognitive disturbances and minimal SDs.

Usual neuro-cognitive batteries are not exhaustive though, and AF insists that he does not "have a sense of time." AF also insists on his great difficulty in being able to describe more precisely the peculiarities of his time experience.

### AF Shows Several Timing Impairments

AF does not benefit from the flow of time to optimize his temporal expectancies, i.e., his reaction time does not decrease as the probability of target occurrence increases. In addition, in the presence of catch trials, there is a large dispreparation effect (Correa et al., 2006) resulting in slower RTs at 1,000 ms than at 400 ms. These results are consistent with those observed at the group level (Martin et al., 2017), especially in patients with SDs. As in this previous study, our results suggest a fragility in the ability to make temporal predictions. Moreover, results fit with AF's complaint that he does not feel that time has elapsed. Despite these difficulties, some basic ability to encode time intervals is preserved, as shown by the fact that, as in healthy participants, RTs increase when the foreperiod between consecutive trials differs. This result shows that the difference between 400 and 1,000 ms has been encoded. This is not surprising, since automatic coding of short durations has been found in electrophysiological studies of cortical slices, i.e., without requiring the entire cortex (Goel and Buonomano, 2016). In AF, it is not interval estimation that is impaired, but the ability to use this interval to predict the probability of target occurrence. It is also to be noted that AF had no difficulty in taking probabilities into account, since performance changed when the probability of target presentation was manipulated, i.e., in the case of catch trials. It is thus the temporal prediction itself that seems affected.

However, some of AF's results had not been observed at the group level. Sequential effects were preserved in our group of patients with schizophrenia (see Supplementary Material). For AF, performance was similarly affected whether the foreperiod on trial N was shorter or longer than that on trial N-1. It's as if AF expected the foreperiod to be strictly identical on consecutive trials and was unable to reinitialize expectation once the target had not been displayed after the short foreperiod. This effect is reminiscent of previous studies suggesting heightened sequential effects in patients (Zahn et al., 1963). Together with AF's difficulty in using temporal cues (unusually, RTs were slower, rather than faster, after temporal cues), the results suggest that temporal prediction is especially fragile and lacks flexibility in this patient.

Overall, AF's complaints about time were supported by objective exploration. The paucity of other symptoms and neurocognitive impairment makes it tempting to propose a relationship between objective and/or subjective timing difficulties and minimal SDs, consistent with previous hypotheses (Martin et al., 2014, 2017; Giersch and Mishara, 2017a,b). However, such a relationship cannot be proven, and more extensive exploration of timing might be useful, e.g. using longer time intervals. Yet, the important result in the present study is that timing disorders can be demonstrated at an individual level, independent of antipsychotics or impairments in attention and memory. This is the first step toward personalized evaluation and therapeutics. We suggest that subjective and objective explorations of timing may usefully complete patients' clinical evaluation.

### ETHICS STATEMENT

The project was approved by a local ethics committee (CPP Sud Est VI), and informed written consent was obtained, before the study, from each patient and control participant. All methods have been conducted in accordance with the recommendation of the Declaration of Helsinki.

### AUTHOR CONTRIBUTIONS

AG and JC: designed the study; BM and NF: provided their clinical expertise and acquired the experimental and clinical data; BM: conducted the clinical interviews; BM and MC: scored the EASE; AG: analyzed the experimental data; AG, JC, and BM: interpreted the data; BM: wrote the first draft of the manuscript; AG and JC: edited and finalized the manuscript. All authors reviewed, amended, and approved the manuscript.

### FUNDING

The research was funded by INSERM, a grant from the Conseil Scientifique de la recherche de l'Hôpital le Vinatier (CSR F05, AO2014) and an ANR grant (Autotime, ANR-16-CE37-0004).

### ACKNOWLEDGMENTS

We thank Zelda Prost for her help with neuro-cognitive assessment. INSERM, the University Hospital of Strasbourg and of Lyon (Le Vinatier) provided logistical support for this research, as well as a grant from the Conseil Scientifique de la recherche de l'Hôpital le Vinatier (CSR F05, AO2014) and an ANR grant (Autotime, ANR-16-CE37-0004).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00132/full#supplementary-material

### REFERENCES

neurocognitive dysfunction in schizophrenia. Schizophr. Res. 135, 79–83. doi: 10.1016/j.schres.2011.11.015

Shallice, T., and Burgess, P. W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain 114, 727–741. doi: 10.1093/brain/114.2.727


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Martin, Franck, Cermolacce, Coull and Giersch. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Dissociating Explicit and Implicit Timing in Parkinson's Disease Patients: Evidence from Bisection and Foreperiod Tasks

Giovanna Mioni<sup>1</sup>\*<sup>†</sup>, Mariagrazia Capizzi<sup>2†</sup>, Antonino Vallesi<sup>2,3</sup>, Ángel Correa<sup>4,5</sup>, Raffaella Di Giacopo<sup>6,7</sup> and Franca Stablum<sup>1</sup>

<sup>1</sup>Department of General Psychology, University of Padova, Padua, Italy, <sup>2</sup>Department of Neuroscience, University of Padova, Padua, Italy, <sup>3</sup>San Camillo Hospital IRCCS, Venice, Italy, <sup>4</sup>Centro de Investigación Mente, Cerebro y Comportamiento, University of Granada, Granada, Spain, <sup>5</sup>Departamento de Psicología Experimental, University of Granada, Granada, Spain, <sup>6</sup>Institute of Neurology, San Bortolo Hospital, Vicenza, Italy, <sup>7</sup>Center for Mind/Brain Sciences (CIMeC), University of Trento, Trento, Italy

A consistent body of literature reported that Parkinson's disease (PD) is marked by severe deficits in temporal processing. However, the exact nature of timing problems in PD patients is still elusive. In particular, what remains unclear is whether the temporal dysfunction observed in PD patients regards explicit and/or implicit timing. Explicit timing tasks require participants to attend to the duration of the stimulus, whereas in implicit timing tasks no explicit instruction to process time is received but time still affects performance. In the present study, we investigated temporal ability in PD by comparing 20 PD participants and 20 control participants in both explicit and implicit timing tasks. Specifically, we used a time bisection task to investigate explicit timing and a foreperiod task for implicit timing. Moreover, this is the first study investigating sequential effects in PD participants. Results showed preserved temporal ability in PD participants in the implicit timing task only (i.e., normal foreperiod and sequential effects). By contrast, PD participants failed in the explicit timing task as they displayed shorter perceived durations and higher variability compared to controls. Overall, the dissociation reported here supports the idea that timing can be differentiated according to whether it is explicitly or implicitly processed, and that PD participants are selectively impaired in the explicit processing of time.

Keywords: Parkinson participants, basal ganglia, time bisection task, foreperiod, explicit timing, implicit timing, sequential effects

#### Edited by:

Charlotte A. Boettiger, University of North Carolina at Chapel Hill, United States

#### Reviewed by:

Nandakumar Narayanan, University of Iowa, United States Richard B. Reilly, Trinity College, Dublin, Ireland

#### \*Correspondence:

Giovanna Mioni mioni.giovanna@gmail.com

†These authors have contributed equally to this work.

Received: 28 November 2017 Accepted: 15 January 2018 Published: 06 February 2018

#### Citation:

Mioni G, Capizzi M, Vallesi A, Correa Á, Di Giacopo R and Stablum F (2018) Dissociating Explicit and Implicit Timing in Parkinson's Disease Patients: Evidence from Bisection and Foreperiod Tasks. Front. Hum. Neurosci. 12:17. doi: 10.3389/fnhum.2018.00017

### INTRODUCTION

Parkinson's disease (PD) is a progressive neurodegenerative disease characterized by motor and non-motor disorders, such as bradykinesia, tremor, rigidity, olfactory loss, sleep, behavioral and cognitive impairment (Nalls et al., 2015). This heterogeneous disease involves dysfunctions in several circuits, including the loss of dopaminergic neurons in the substantia nigra pars compacta, which has strong implications for the efficacy of the nigrostriatal dopaminergic pathway (Alberico et al., 2017), the loss of dopaminergic neurons in the ventral tegmental area, which mostly affects the mesocortical pathway to the prefrontal cortex (Parker et al., 2015a; Kim et al., 2017), and the degeneration of the cholinergic system, which helps explain the cognitive symptoms of patients with PD (Calabresi et al., 2006).

A well-known hypothesis in the timing literature is that temporal processing in the milliseconds to seconds range involves the basal ganglia and is modulated by the level of dopamine (Meck et al., 2008; Marinho et al., 2018). Supporting the role of the basal ganglia in timing processes, PD patients usually have severe deficits on various temporal tasks and with various temporal ranges (for a review, see Jones and Jahanshahi, 2014). However, the exact nature of timing problems in PD is still elusive. In a critical review of the literature on time perception, Coull and Nobre (2008) fractionated temporal processing and timing tasks on the basis of whether the underlying mechanisms were explicitly or implicitly engaged. In explicit timing tasks, participants are instructed to attend to the duration of the stimulus, which is hence explicitly task-relevant. Conversely, in implicit timing tasks, no explicit instruction to process time is received, albeit timing is inherent in the task to be performed and usually affects behavior. The main goal of the present study is to delve further into the dissociation between explicit and implicit timing in PD.

The dopaminergic system has been associated with both the perception of time intervals in the supra-seconds range and the regulation of speed of a hypothesized internal clock, which is consistent with its effect on the rate of an internal pacemaker that varies between individuals leading to a ''faster'' clock for some and a ''slower'' clock for others (Coull et al., 2012). The involvement of the basal ganglia and the dopaminergic system in explicit timing would thus explain the deficit of PD participants in the most commonly used temporal tasks, such as finger-tapping (Artieda et al., 1992; Pastor et al., 1992b; O'Boyle et al., 1996), time reproduction, time production, and time estimation tasks (Pastor et al., 1992a; Lange et al., 1995; Perbal et al., 2005). However, some studies reported that the temporal deficit associated with PD might be explained by impairment of other cognitive processes, such as memory and attention, rather than by a real ''clock problem'' (Malapani et al., 1998; Koch et al., 2008). Moreover, most of the previous commonly used temporal tasks included a motor component that might have emphasized the observed temporal impairment (for a review, see Jones and Jahanshahi, 2014). Indeed, when the motor component was reduced by using a time bisection task, mixed results have been observed. In the time bisection task, participants are instructed to categorize the presented duration as being more similar to the short or to the long standard interval. Employing the time bisection task, for instance, Smith et al. (2007) showed lower temporal abilities in PD participants compared to controls when using long (1–5 s) temporal intervals, but not when using short ones (100–500 ms). In the study by Merchant et al. (2008), PD participants displayed higher temporal variability than their controls when presented with brief temporal intervals (350–1000 ms). Wearden et al. (2008) found no evidence of temporal impairment in PD participants within the sub-second range (100–800 ms). Finally, Zhang et al. (2016) showed temporal overestimation and higher variability in PD participants with respect to controls using auditory stimuli (330–750 ms). Overall, these mixed results may be explained by clinical (severity and/or medication state) and methodological differences between the studies such as the specific modality and temporal range used.

A classic example of an implicit timing task is given by the foreperiod paradigm, in which participants have to respond to a target stimulus preceded by a warning signal (for reviews, see Niemi and Näätänen, 1981; Coull, 2009; Vallesi, 2010). The foreperiod is the time interval between warning and target. When one foreperiod only (e.g., either a short interval of 1000 ms or a long interval of 3000 ms) is presented during a block of trials, response times (RTs) are usually shorter for the blocks with the shorter foreperiod, a phenomenon dubbed the ''fixed foreperiod effect'' (e.g., Mattes and Ulrich, 1997; Vallesi et al., 2009). The fixed foreperiod effect has been explained in terms of better time estimation of short intervals relative to long intervals (see Gibbon, 1977), which in turn will lead to shorter RTs in the short foreperiod blocks (e.g., Bausenhart et al., 2008).

Unlike the fixed foreperiod paradigm, when shorter and longer foreperiods are randomly and equiprobably intermixed across trials, the pattern of results usually reverses with shorter RTs for the long foreperiod trials, a phenomenon known as the ''variable foreperiod effect'' (e.g., Niemi and Näätänen, 1981; Mento et al., 2015). A further phenomenon that emerges in the variable foreperiod paradigm concerns the ''sequential effects''. Sequential effects consist of a performance benefit when the current short foreperiod is preceded by another short rather than a longer foreperiod. Performance at the current long foreperiod is instead fast irrespective of whether the previous foreperiod has been shorter than or as long as the current one (e.g., Los and van den Heuvel, 2001; Steinborn et al., 2008; Capizzi et al., 2015; Mento, 2017).

Converging evidence from behavioral (Vallesi et al., 2013, 2014), neuropsychological (Vallesi et al., 2007a; Triviño et al., 2010), developmental (Vallesi and Shallice, 2007) and transcranial magnetic stimulation (Vallesi et al., 2007b) studies suggests that dissociable processes may underlie foreperiod and sequential effects (but see Los et al., 2014, for an alternative model). In particular, it seems that sequential effects are mediated by more automatic processes than those at the basis of the foreperiod effect. From a neural point of view, for instance, while the foreperiod effect has been shown to rely on the functioning of prefrontal structures related to executive processes, this is not the case for sequential effects. Such a neural dissociation has led to the hypothesis that sequential effects probably rely upon more primitive brain areas that develop earlier as compared to prefrontal structures (Vallesi and Shallice, 2007; see also Mento and Tarantino, 2015). Among these sub-cortical regions, the basal ganglia might be a likely neural substrate for sequential effects. This expectation, however, was not fulfilled in the study by Triviño et al. (2010), which showed normal sequential effects in participants with basal ganglia lesions. The authors attributed the null finding to the fact that their participants had suffered a unilateral stroke that mainly affected the striatum (putamen and caudate nucleus) while leaving intact the substantia nigra and the dopamine production (see also Triviño et al., 2016). Therefore, the administration of a variable foreperiod task to PD participants is critical to directly investigate the involvement of the basal ganglia in the generation of sequential effects.

As regards the foreperiod effect, early studies on PD hypothesized a reduction of the foreperiod effect for PD participants on the grounds that such an effect should also depend on intact dopaminergic pathways (e.g., Zahn et al., 1963; Brown and Robbins, 1991). However, a reduced foreperiod effect in PD has not always been confirmed (e.g., Rafal et al., 1984). For instance, Jurkowski et al. (2005) found that PD participants had a normal foreperiod effect in a reflexive (startle-eyeblink) task but not in a voluntary (hand-grip) one. Their conclusion was that interval processing associated with lower level reflexive behavior was intact in PD participants. Likewise, Lee et al. (2012) tested phasic arousal and temporal preparation. Considerable benefit was indeed observed from the warning stimulus; however, the benefit was not greater for the controls than it was for PD participants.

As far as we know, only a few studies have directly compared the performance of PD participants and control participants in both explicit and implicit timing tasks within a single experimental session. Amongst these, de Hemptinne et al. (2013) employed an oculomotor paradigm, which required anticipation of a salient target that moved along a circular path and reversed direction after a short (1200 ms) or long (2400 ms) forward path. The results showed that the explicit timing of target motion but not the implicit one was impaired in PD participants. Most germane to our study for the kind of tasks employed, the study by Jones et al. (2008) used time production (30, 60 and 120 s) and time reproduction tasks (250, 500, 1000 and 2000 ms) as measures of explicit timing and warned and unwarned reaction time tasks (250, 500, 1000, 2000 ms fixed between blocks) as measures of implicit timing. In the time reproduction and the warned and unwarned reaction time tasks, PD participants were as accurate as controls when temporal processing within the 250–2000 ms range was required. Exploratory factor analysis also suggested that the time production task used mechanisms distinct from those employed in time reproduction and warned and unwarned reaction time tasks. The authors concluded that the integrity of the basal ganglia is necessary for producing time in the seconds range and that explicit and implicit timing are mediated by dissociable mechanisms.

Taken together, the previous studies investigating explicit and implicit timing in the same group of PD participants suggest that these two ways of processing time may be differently affected in PD. This is also supported by functional magnetic resonance imaging (fMRI) studies showing that explicit timing engages the basal ganglia, whereas implicit timing does not (Coull and Nobre, 2008). In the present study, we shall further investigate the performance of PD participants when tested with explicit and implicit timing tasks. We opted for a time bisection task to investigate explicit timing (Mioni et al., 2016, 2017) and a foreperiod task to test implicit timing (Vallesi et al., 2014). The time bisection task has been extensively used to study temporal processing and, importantly, has been previously employed with PD participants because the motor component is limited (Mioni et al., 2016, 2017). The foreperiod task was a simple detection task also with low motor demands, in which participants were required to respond to a target stimulus presented either after a fixed foreperiod or a variable one. The use of a variable foreperiod design allowed us to also analyze sequential effects in PD. In addition to explicit and implicit timing tasks, all participants performed neuropsychological tests that evaluated attention, working memory and executive functions, which are usually reduced in PD (Kudlicka et al., 2011).

To sum up, on the basis of prior studies (de Hemptinne et al., 2013; Jones and Jahanshahi, 2014), we expected a deficit of PD participants in the explicit timing task, but not in the implicit one, which would confirm the distinction between the two types of time processing in PD. Importantly, our work also aimed to shed new light on another implicit temporal phenomenon, namely, sequential effects, which so far have been neglected in the study of time processing in PD.

### MATERIALS AND METHODS

### Participants

Twenty right-handed PD participants (11 males, 9 females) and 20 right-handed healthy controls (9 males, 11 females) matched for age (t(38) = 0.45; p = 0.658; d = 0.14) and years of education (t(38) = 0.11; p = 0.911; d = 0.03) were examined (**Table 1**). The sample size was based on previous literature about explicit and implicit temporal processing in PD participants (Jurkowski et al., 2005; Jones et al., 2008; Lee et al., 2012; Mioni et al., 2016, 2017; Zhang et al., 2016). PD participants were recruited and tested at the Center for Neurocognitive Rehabilitation (CeRiN), Center for Mind/Brain Sciences (CIMec), University of Trento (Italy). Control participants were volunteers from the local community (Trento, Italy). All PD participants received the PD diagnosis (Diagnosis and Treatment of Parkinson's disease: Italian Guidelines. Health Care Institute and Italian League for Parkinson's disease, Extrapyramidal Syndrome and Dementia, 2015) from a movement disorders neurologist. All PD participants were assessed while ''on'' medication. The motor involvement of participants was mild, according to the score of the Unified Parkinson's Disease Rating Scale (UPDRS; Movement Disorder Society Task Force on Rating Scales for Parkinson's Disease, 2003) Part III (medium score: 17/108 points) and the Hoehn & Yahr Scale (score ≤ 3; Goetz et al., 2004).

The exclusion criteria included: dementia or severe cognitive impairment (Dubois et al., 2007), medications (apart from PD treatments) known to interfere with cognitive functioning, history of neurosurgery or brain injury, psychiatric disorders, or any condition (e.g., drowsiness) that would interfere with testing. All recruited participants scored at least 24/30 on the Mini-Mental State Examination (MMSE; Folstein et al., 1975). All participants underwent an extensive neuropsychological evaluation to investigate their cognitive abilities (**Table 1**).



Note: MMSE, Mini Mental State Examination; TMT, Trail Making Test; CPM, Colored Raven's Progressive Matrices; MCST, Modified Card Sorting Test; CDT, Clock Drawing Test. <sup>∗</sup>p < 0.05.

### Materials

### Explicit Timing: Time Bisection Task

The experimental session started with the learning phase in which each participant memorized two standard durations: 400 ms (short standard) and 1600 ms (long standard; Mioni et al., 2016; **Figure 1A**). The stimulus used was a dark gray circle on a white background. Both standard durations were presented 10 times in a fixed presentation order. After the learning phase, participants were required to judge the duration of new intervals and determine if they appeared more similar in duration to the short standard or long standard. Seven comparison durations were used: 400, 600, 800, 1000, 1200, 1400 and 1600 ms. Participants performed four blocks and within each block each duration was presented 10 times in a random order. They were asked to respond with their left and right index fingers and response keys were counterbalanced between participants. After each response, there was a 1000-ms inter-trial interval.
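The trial structure of the test phase can be sketched in a few lines of Python. This is only an illustration of the design described above (seven comparison durations, each presented 10 times in random order within each of four blocks). The function name, the `seed` parameter and the use of Python are assumptions, since the original task was programmed in E-Prime.

```python
import random

# Comparison durations reported in the text (ms).
COMPARISON_DURATIONS_MS = [400, 600, 800, 1000, 1200, 1400, 1600]


def build_bisection_blocks(n_blocks=4, repetitions=10, seed=None):
    """Return a list of blocks; each block is a shuffled list of durations.

    Hypothetical helper: every comparison duration appears `repetitions`
    times per block, in random order.
    """
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        block = COMPARISON_DURATIONS_MS * repetitions  # 7 x 10 = 70 trials
        rng.shuffle(block)
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    blocks = build_bisection_blocks(seed=1)
    print(len(blocks), "blocks of", len(blocks[0]), "trials")
```

With the default arguments this yields 70 trials per block and 280 test trials in total.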

### Implicit Timing: Foreperiod Task

The foreperiod paradigm was a shortened version of the task used in Vallesi et al. (2014; **Figure 1B**). Each trial started with the presentation of a ''XX'' (2 cm × 2 cm), which was displayed in the center of the screen simultaneously with an auditory warning signal (a 1500 Hz pure tone) played for 50 ms via laptop internal speakers. The sound intensity was set at a comfortable level for all the participants. The ''XX'' remained on the screen for either 1000 or 3000 ms, depending on the foreperiod for that trial. The target was a downward-pointing white arrow (with maximum length and width of 2 cm) fitted in a black square, which appeared once the foreperiod duration elapsed. Participants were instructed to respond to it by pressing the spacebar as quickly as possible. Following the response to the target, or after 2000 ms in case of a missed response, the next trial began.

The foreperiod task comprised three types of blocks: fixed-short (1000 ms), fixed-long (3000 ms) and variable (1000 or 3000 ms), which were presented in a counterbalanced order across participants. In total, there were two fixed blocks of 30 trials each (i.e., one block for the short foreperiod and one block for the long foreperiod) and two blocks of 30 trials each for the variable foreperiod. In the variable foreperiod blocks only, the current foreperiod could be preceded with the same probability either by a short or long foreperiod. An initial training phase with four trials was used before each type of block to ensure that participants correctly understood task instructions.

### Neuropsychological Assessment

A complete neuropsychological evaluation was conducted<sup>1</sup>. Specifically, to assess attention we used the Digit Span forward and backward tests (Mondini et al., 2011), the Trail Making Test (TMT; Giovagnoli et al., 1996), and the Attentional Matrices test (Spinnler and Tognoni, 1987). To assess executive functions we used the Semantic Fluency test (Novelli et al., 1986), the Phonemic Fluency test (Carlesimo et al., 1996) and the Modified Card Sorting Test (MCST, Caffarra et al., 2004). Finally, to evaluate general cognitive abilities we used the Colored Progressive Matrices (CPM; Carlesimo et al., 1996) and the Clock Drawing Test (CDT; Mondini et al., 2011).

### Procedure

Controls were tested in their own home in the area of Trento (Italy), whereas PD participants were tested at CeRiN, Trento (Italy). During the tasks, participants were seated at a distance of approximately 60 cm in front of a 15-inch PC monitor screen. E-Prime 2.0 (Schneider et al., 2002) was used to program and run the experiments. PD participants were tested during one experimental session that lasted approximately 60 min. Neuropsychological information of PD participants was collected from clinical records. Controls were tested in two separate experimental sessions lasting approximately 60 min each for completion of the neuropsychological assessment and experimental tasks, respectively. Written informed consent was collected from all the participants and the study was conducted in accordance with the Helsinki Declaration (59th WMA General Assembly, Seoul, 2008). The study was approved by the ethics committee of the Department of General Psychology and the CeRiN-CIMec ethical committee.

<sup>1</sup>A complete description of the neuropsychological tasks used can be found in Mioni et al. (2016, 2017).

### Statistical Analyses

For the time bisection task, for each participant a 7-point psychometric function was traced, plotting the seven comparison intervals on the x-axis and the probability of responding ''long'' on the y-axis. The cumulative normal function was fitted to the resulting curves. We calculated two indices, one that defines the perceived duration and one for sensitivity. The first was the Point of Subjective Equality (PSE), that is, the stimulus duration at which participants responded ''short'' or ''long'' with equal frequency. An observed shift of the bisection point can be interpreted as an indicator of differences in perceived duration, with smaller bisection point values meaning longer perceived durations. The second dependent variable was the Weber ratio (WR), which is based on one standard deviation (SD) on the psychometric function. The WR is the SD divided by the midpoint duration used in the experiment. This is a measure of temporal sensitivity; smaller values indicate more sensitive timing (Mioni et al., 2016). Separate t-tests were conducted on PSE and WR and we estimated effect size with Cohen's d. One PD participant was excluded from the analyses because their score was more than 3 SD above the mean. Therefore, for the time bisection task, the analyses were conducted on 39 participants (19 PD participants and 20 controls).
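To make the two indices concrete, the following is a minimal sketch of how a PSE and a WR could be derived from one participant's proportions of ''long'' responses. The text does not state which fitting procedure was used, so the least-squares grid search here is an assumption, as are all function names. The midpoint of 1000 ms is taken as the centre of the 400–1600 ms range.

```python
import math

DURATIONS_MS = [400, 600, 800, 1000, 1200, 1400, 1600]
MIDPOINT_MS = 1000.0  # assumed midpoint of the 400-1600 ms range


def cumulative_normal(x, mu, sigma):
    """Cumulative normal distribution with mean mu and SD sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_cumulative_normal(durations, p_long):
    """Least-squares fit via coarse-to-fine grid search (assumed method).

    Returns (mu, sigma): mu is the PSE, sigma the SD of the fitted curve.
    """
    def sse(mu, sigma):
        return sum((cumulative_normal(x, mu, sigma) - p) ** 2
                   for x, p in zip(durations, p_long))

    best_mu, best_sigma = 1000.0, 300.0
    step = 100.0
    for _ in range(6):  # each pass refines the grid by a factor of 5
        candidates = [(best_mu + i * step, best_sigma + j * step)
                      for i in range(-10, 11)
                      for j in range(-10, 11)
                      if best_sigma + j * step > 0]
        best_mu, best_sigma = min(candidates, key=lambda c: sse(*c))
        step /= 5.0
    return best_mu, best_sigma


def weber_ratio(sigma, midpoint=MIDPOINT_MS):
    """WR = SD of the psychometric function divided by the midpoint."""
    return sigma / midpoint


if __name__ == "__main__":
    # Illustrative proportions of "long" responses (not real data).
    p_long = [0.02, 0.10, 0.30, 0.55, 0.80, 0.93, 0.99]
    pse, sd = fit_cumulative_normal(DURATIONS_MS, p_long)
    print(f"PSE = {pse:.0f} ms, WR = {weber_ratio(sd):.2f}")
```

A maximum-likelihood fit, or a routine such as `scipy.optimize.curve_fit`, would be a reasonable alternative. The grid search is used only to keep the sketch free of external dependencies.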

For the foreperiod task, data from practice trials, the first trial in each block, trials with premature responses (i.e., responses before target onset, 5.02% of the remaining trials for PD and 3.87% for controls), trials with RT below 150 ms (0.43% of the remaining trials for PD and 0.79% for controls) and trials without responses (0.64% of the remaining trials for PD and 0% for controls) were rejected from the analysis. Additionally, for each participant, trials with an RT above 3 SD from their individual task mean condition were treated as outliers and discarded from the RT analysis (1.45% of the remaining trials for PD and 1.8% for controls).
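The rejection steps listed above can be summarized as a small filtering routine. This is a sketch under stated assumptions: the trial representation (a dictionary with the hypothetical keys `practice`, `index_in_block`, `premature`, `rt` and `condition`) is invented for illustration, and the 3 SD criterion is applied per condition using the sample SD.

```python
from statistics import mean, stdev


def clean_trials(trials, min_rt=150, sd_cutoff=3.0):
    """Apply the rejection criteria to one participant's trials.

    Each trial is a dict with hypothetical keys: 'practice' (bool),
    'index_in_block' (0 = first trial), 'premature' (bool),
    'rt' (ms, or None if no response) and 'condition' (any label).
    """
    # Steps 1-4: practice trials, first trial of each block, premature
    # responses, missing responses and RTs below the minimum.
    kept = [t for t in trials
            if not t["practice"]
            and t["index_in_block"] > 0
            and not t["premature"]
            and t["rt"] is not None
            and t["rt"] >= min_rt]

    # Step 5: discard RTs more than sd_cutoff SDs above the condition mean.
    by_condition = {}
    for t in kept:
        by_condition.setdefault(t["condition"], []).append(t)

    cleaned = []
    for cond_trials in by_condition.values():
        rts = [t["rt"] for t in cond_trials]
        if len(rts) < 2:  # SD undefined; keep the trial as is
            cleaned.extend(cond_trials)
            continue
        limit = mean(rts) + sd_cutoff * stdev(rts)
        cleaned.extend(t for t in cond_trials if t["rt"] <= limit)
    return cleaned
```

Because the percentages in the text are expressed relative to the "remaining trials", the order of the steps matters, and the sketch applies them in the order in which they are listed.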

Mean RTs for each participant and condition were analyzed through a three-way mixed factorial analysis of variance (ANOVA) with Group (PD, controls) as a between-subjects factor, and Type of block (fixed, variable) and Foreperiod (1000 ms, 3000 ms) as within-subjects factors. Sequential effects were analyzed on the variable foreperiod trials only with a three-way mixed factorial ANOVA involving Group (PD, controls) as a between-subjects factor and Foreperiod of the previous trial (1000 ms, 3000 ms) and Foreperiod of the current trial (1000 ms, 3000 ms) as within-subjects factors. One PD patient and one participant from the control group were excluded as they had fewer than 50% of correct trials in some task conditions. Moreover, two PD participants and one participant from the control group did not complete the task. Therefore, for the foreperiod task, the analyses were conducted on 35 participants (17 PD participants and 18 controls). All significant effects were followed by two-tailed paired t-tests and the effect size was estimated either with partial eta squared (η<sub>p</sub><sup>2</sup>) or Cohen's d.
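As a reading aid (not part of the original analysis), partial eta squared for an effect can be recovered from its F ratio and degrees of freedom through the standard identity below. For example, the main effect of Foreperiod of the current trial reported in the Results, F(1,33) = 64.66, corresponds to a value of about 0.66.

$$
\eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}} = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}, \qquad \frac{64.66 \times 1}{64.66 \times 1 + 33} \approx 0.66
$$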

Separate t-tests were conducted on neuropsychological tasks between PD participants and controls.
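For readers who wish to check group comparisons of this kind from summary statistics, the sketch below computes an independent-samples t statistic and Cohen's d with a pooled SD. It assumes equal-variance (Student) t-tests, which is consistent with the reported degrees of freedom (n1 + n2 − 2) but is not stated explicitly in the text. The function names are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                     / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d: mean difference divided by the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance t statistic and its degrees of freedom."""
    se = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1.0 / n1 + 1.0 / n2)
    return (m1 - m2) / se, n1 + n2 - 2


if __name__ == "__main__":
    # PSE summary statistics from the Results (PD: n = 19; controls: n = 20).
    t, df = student_t(956, 123, 19, 863, 126, 20)
    d = cohens_d(956, 123, 19, 863, 126, 20)
    print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

Applied to the rounded PSE means and SDs given in the Results, this yields values close to the reported t(37) = 2.32 and d = 0.74. Small discrepancies are expected because the inputs are rounded.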

### RESULTS

### Explicit Timing: Time Bisection Task

**Figure 2** represents the probability of ''long'' responses, for each comparison interval in PD participants and controls.

When data were analyzed in terms of PSE, a significant main effect of Group (t(37) = 2.32, p = 0.026, Cohen's d = 0.74) was found (**Figure 3A**); the PD participants' PSE was shifted toward the right, indicating shorter perceived durations (PD participants PSE = 956 ms, SD = 123; controls PSE = 863 ms, SD = 126).

When data were analyzed in terms of WR, a significant main effect of Group (t(37) = 2.03, p = 0.049, Cohen's d = 0.64) was also found (**Figure 3B**); the PD participants' WR was higher than that of the controls, indicating lower temporal sensitivity (PD participants WR = 0.33, SD = 0.23; controls WR = 0.22, SD = 0.07).

### Implicit Timing: Foreperiod Task

#### Foreperiod Effects

The significant main effect of Group (F(1,33) = 16.9, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.33) showed that PD participants were slower than their controls (mean RT: 467 and 344 ms, respectively). The significant Type of block (F(1,33) = 9.2, p = 0.005, η<sub>p</sub><sup>2</sup> = 0.21) and Foreperiod (F(1,33) = 38.6, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.53) main effects revealed faster responses in the fixed foreperiod paradigm compared to the variable one, and in the long foreperiod compared to the short foreperiod, respectively. Further, there was a significant interaction between Type of block and Foreperiod factors (F(1,33) = 36.7, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.52). This interaction was explained by the fact that participants were faster after the long foreperiod compared to the short foreperiod in the variable foreperiod paradigm (t(34) = 8.97, p < 0.001, Cohen's d = 1.56), while there was no difference between the two foreperiods in the fixed paradigm (t(34) = 1.1, p = 0.27, Cohen's d = 0.19). Hence, this result reflects the presence of the typical variable foreperiod effect and the absence of the fixed foreperiod one. Inspection of the data (see **Figure 4**) also showed that participants were faster at the short foreperiod when it was kept fixed across the block as compared to when it was intermixed across trials with the long foreperiod (t(34) = 5.67, p < 0.001, Cohen's d = 0.98). By contrast, participants were equally fast at the long foreperiod in both Types of blocks (t(34) = 1.09, p = 0.28, Cohen's d = 0.20). There were no significant interactions involving Group (all ps > 0.40). Please note that results remained the same even after logarithmic transformation of raw RT data, which controls for the difference in speed between the two groups (e.g., Ben-David et al., 2014).

#### Sequential Effects

The analysis on the sequential effects showed significant main effects of the Foreperiod of the previous trial (F(1,33) = 17.04, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.34) and Foreperiod of the current trial (F(1,33) = 64.66, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.66), which were further explained by a significant interaction involving these two factors (F(1,33) = 24.56, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.42). This interaction reflected the typical pattern of asymmetrical sequential effects, that is, faster responses for the current short foreperiod when it was preceded by another short rather than long foreperiod (t(34) = 5.15, p < 0.001, Cohen's d = 0.87), while equally fast responses were observed for the current long foreperiod irrespective of the type of foreperiod occurring in the previous trial (t(34) = 0.05, p = 0.95, Cohen's d = 0.009). Apart from a significant Group main effect (F(1,33) = 17.26, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.34), the interactions involving the group factor were not significant (all ps > 0.14; **Figure 5**). As for the analysis on the foreperiod effects, all the results concerning sequential effects were replicated using log-transformed RT data.

#### Neuropsychological Evaluation

T-test analyses were conducted to test performance on neuropsychological tests in PD participants and controls (**Table 1**). Significant differences were observed between groups on TMT part-B (p = 0.050) and TMT B–A (p = 0.014), attentional matrices (p = 0.001) and CPM (p = 0.029), indicating that PD participants had lower attentional, visual search and non-verbal intelligence abilities. No differences between groups were observed on the other measures (all ps > 0.05).

Exploratory Pearson correlational analyses were conducted between performance on the neuropsychological tasks and performance on the timing tasks separately for PD participants and controls. As a measure of explicit timing we used the PSE, whereas for implicit timing we calculated a variable foreperiod effect index (short foreperiod minus long foreperiod RTs) and a sequential effects index (previous long minus previous short foreperiod RTs for current short foreperiod trials; for similar measures, see Triviño et al., 2011, 2016). Previous studies have suggested an involvement of attention, working memory and executive functions in explicit timing (Perbal et al., 2005; Aarsland et al., 2010; Parker et al., 2013b). Although the small sample size prevents us from drawing clear conclusions regarding the relationship between cognitive functions and processing of explicit and implicit time, the following correlations emerged. Briefly, within the control group, negative correlations were observed between explicit timing and MMSE (r = −0.563, p = 0.012) and Semantic fluency (r = −0.525, p = 0.021), suggesting that participants with a lower MMSE score and a lower score at semantic fluency underestimated more in the explicit timing task (see **Figure 6**). No significant correlations were observed between implicit timing and any of the measures included in the neuropsychological evaluation (all r ≤ 0.23, all p ≥ 0.05). Within the PD group, a negative correlation was observed between explicit timing and Digit span forwards (r = −0.471, p = 0.042), suggesting that PD participants who had lower span underestimated time intervals in explicit timing tasks. Moreover, a negative correlation was observed in the implicit timing between the sequential effects index and CPM (r = −0.526, p = 0.030), indicating that PD participants who scored higher on the CPM test had smaller sequential effects (see **Figure 7**).
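The two implicit timing indices and the correlation coefficient described above reduce to simple arithmetic on condition means. The sketch below uses hypothetical function names and takes per-participant mean RTs (in ms) as inputs. It illustrates the definitions in the text rather than reconstructing the authors' analysis scripts.

```python
import math


def variable_foreperiod_index(rt_short, rt_long):
    """Short-foreperiod mean RT minus long-foreperiod mean RT (variable blocks)."""
    return rt_short - rt_long


def sequential_effects_index(rt_short_after_long, rt_short_after_short):
    """For current short foreperiods: previous-long minus previous-short mean RT."""
    return rt_short_after_long - rt_short_after_short


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Under these definitions, a larger positive value on either index means a stronger effect, so a negative correlation with a test score means that higher scorers show a smaller effect.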

### DISCUSSION

While there have been several studies investigating either explicit or implicit timing in PD, to our knowledge, only a few of them have compared both explicit and implicit processes within a single experimental session (Jones et al., 2008; de Hemptinne et al., 2013). Overall, our results showed that impaired processing of explicit timing in PD could occur in the presence of spared implicit timing.

Specifically, regarding explicit timing, our results are in line with some previous studies (Smith et al., 2007; Mioni et al., 2016, 2017) showing higher PSE and higher WR in PD participants than in controls, indicating temporal under-estimation and lower temporal sensitivity in PD participants as compared to controls. By contrast, Wearden et al. (2008) found no evidence of temporal impairment in PD participants within the sub-second range (100–800 ms) regardless of whether they were tested ''on'' or ''off'' medication. Moreover, Merchant et al. (2008) only reported higher temporal variability in PD participants compared to controls when tested with brief temporal intervals (350–1000 ms) but no differences in the perceived duration. These results occurred only when PD participants were tested ''off'' medication.

Some methodological differences might explain the different results with respect to our study. Merchant et al. (2008) and Wearden et al. (2008) used auditory stimuli and the superiority of audition over vision for temporal processing is well known. Sensitivity to time is much higher (lower threshold, or less variability) when intervals are marked by auditory rather than by visual signals (for a review, see Grondin, 2003). The underestimation observed in Smith et al. (2007) was restricted to long temporal intervals (1–5 s) while participants equally judged short temporal intervals (100–500 ms). Clinical characteristics were similar in Smith's study and ours and in both studies PD participants were tested ''on'' medication. Moreover, we used visually filled intervals whereas Smith's study included visually empty intervals. Previous studies showed that for brief temporal intervals (<300 ms), time discrimination was better with empty intervals in both visual and auditory modalities (Grondin, 1993). It is, then, possible that both PD and controls benefited from the presentation of empty intervals. The two studies reported by Mioni et al. (2016, 2017) showed under-estimation in PD participants with mild cognitive impairment (MCI) indicating that part of the temporal impairment observed in PD participants was explained by reduced cognitive abilities. In the present study, we were not able to differentiate the sample on the basis of diagnosis of MCI.

More consistent across different studies are the results regarding the WR. Here we observed higher variability in PD participants compared to controls. Merchant et al. (2008) also showed higher temporal variability in PD participants compared to controls when tested with brief temporal intervals (350–1000 ms), and Smith et al. (2007) showed higher variability across modalities (visual vs. auditory) and temporal ranges (1–5 s). Higher temporal variability in PD participants is often observed on tasks requiring motor responses (Jones and Jahanshahi, 2014). Interestingly, here we used a timing task (time bisection) that has a minimal motor component (Kopec and Brody, 2010; Gil and Droit-Volet, 2011), and we analyzed temporal performance excluding measures of reaction times that might have been affected by PD participants' motor dysfunction (Jones and Jahanshahi, 2014).

Regarding implicit timing, participants with PD performed at the same level as controls in the variable foreperiod task, thus displaying the typical pattern of foreperiod and sequential effects. On the one hand, these findings are in line with previous observations that the variable foreperiod effect is generally preserved in PD (e.g., Bloxham et al., 1987; Jahanshahi et al., 1992), at least when using relatively short durations for the longest foreperiod (∼3 s; see Jurkowski et al., 2005). On the other hand, our findings extend previous research by also revealing unimpaired sequential effects in participants with PD. The presence of sequential effects in PD suggests that such effects are not related to dopamine-dependent neural mechanisms.

An unexpected result concerning implicit timing was the lack of the fixed foreperiod effect in both PD and control groups. In contrast to the literature showing an RT advantage on the fixed short blocks compared to the long ones, here there was no difference between short and long blocks. It is interesting to note, however, that performance on short foreperiods significantly changed as a function of the task context (fixed vs. variable), although not strongly enough to produce the typical fixed foreperiod effect. This result thus suggests that there was still a difference in the processing of the short foreperiods between the fixed and variable foreperiod tasks.

Beyond PD, similar dissociations between explicit and implicit timing have also been documented in other types of participants and kinds of tasks. For instance, Bégel et al. (2017) recently showed that individuals with ''beat deafness'', a congenital anomaly associated with difficulties in synchronizing to the beat, performed poorly on explicit rhythm but not on implicit rhythm tasks. Likewise, it has been shown that participants with right frontal damage have trouble orienting attention to time intervals when they are cued by explicit information but not when using implicit rhythmic patterns (Triviño et al., 2010, 2011). Further supporting the idea that explicit and implicit timing reflect distinct processes, it seems that the two follow distinct developmental trajectories, with the explicit aspect being more variable across age groups compared to the implicit one (e.g., Droit-Volet and Coull, 2016; but also see Mento and Tarantino, 2015). Finally, explicit and implicit timing have been related to distinct neural regions. Specifically, explicit timing is usually associated with the supplementary motor area, basal ganglia, cerebellum and right inferior frontal and parietal cortices (Coull and Nobre, 2008; Wiener et al., 2010). Implicit timing, when measured through the variable foreperiod effect, has been linked to the functioning of the right lateral prefrontal cortex (Arbula et al., 2017) and at least in one occurrence also of the left one (Triviño et al., 2010), whereas sequential effects have been related to the motor/premotor circuitry (Vallesi et al., 2007a) and left subcortical structures (Triviño et al., 2016).

Taken together, our results add to the neural dissociation between explicit and implicit timing by demonstrating that accurate performance in the time bisection task, but not in the foreperiod task, depends on intact basal ganglia and dopaminergic functions. This is in line with several findings in rodents showing that manipulation of dopamine in the substantia nigra changes the perception of time (Meck, 2006; Soares et al., 2016). It has recently been proposed that medial frontal dopamine, which can degenerate in PD (Parker et al., 2015a; Kim et al., 2017), is also critical for accurate timing behavior in rodents (Parker et al., 2015b). Interestingly, depletion of dopamine input from the ventral tegmental area to the medial prefrontal cortex disrupts the foreperiod effect in rats engaged in a simple reaction time task (Parker et al., 2013a). This result is thus at odds with both our findings and those of others showing a normal foreperiod effect in participants with PD (Bloxham et al., 1987; Jahanshahi et al., 1992). As suggested by the authors, however, it might be that only those patients with executive dysfunctions have impaired prefrontal dopamine regulation, which would explain the discrepancy between animal and human findings reported in the context of foreperiod tasks. Indeed, our participants with PD did not have a severe executive dysfunction, as shown by the neuropsychological evaluation.

Regarding the contribution of cognitive factors to temporal processing, despite the evidence of reduced cognitive abilities in PD, very few studies have identified cognitive dysfunction in temporal processing in PD participants (Jones and Jahanshahi, 2014). One interesting exception is Perbal et al. (2005), who used time production and reproduction tasks as well as neuropsychological measures with PD participants and controls. Correlations conducted on the entire sample showed that participants with higher temporal variability in the time reproduction and time production tasks had lower short-term memory and working memory abilities. Among the studies that used similar explicit and implicit timing tasks, Mioni et al. (2016, 2017) showed greater under-estimation (higher PSE) and higher variability (WR) in PD participants with MCI, confirming that cognitive factors can influence performance on explicit time processing, which aligns with the documented cognitive deficits of this group (see also Merchant et al., 2008, for different findings). The results obtained from the preliminary correlations conducted in our study are in line with previous findings indicating that participants with lower cognitive abilities produced a greater underestimation (in the explicit timing task). Only in PD participants was a negative correlation observed in implicit timing between the sequential effects index and CPM, indicating that PD participants who scored higher on the CPM test had smaller sequential effects. According to this exploratory correlation, it seems that PD participants with greater cognitive abilities were less influenced by the foreperiod duration provided by the previous trial. This might imply more focus on the temporal information of the current trial and greater resistance to lower-level influences from previous trial durations. However, since the correlation for the implicit task represents a novel finding, caution is needed before drawing firm conclusions on the role of general cognitive abilities in the expression of sequential effects.

Among the limitations of the present study, it is important to acknowledge the relatively small sample size and the difference in the temporal ranges used for the explicit and implicit timing tasks. Regarding the former point, our sample size is comparable to sample sizes used in previous work with PD participants (Jones and Jahanshahi, 2014). Moreover, the differences found between the two groups can still provide interesting insights into the different processes underlying explicit and implicit timing. Regarding the latter point, it would be highly informative in future studies to match the durations between explicit and implicit timing. For the explicit part of our design, we opted for this range of durations (400–1600 ms) to reduce the use of counting strategies that are often engaged when longer intervals (>1 s) are processed (Grondin, 2010). Conversely, for the implicit part, we decided to employ longer durations to take into account the motor deficits associated with PD (Jones and Jahanshahi, 2014). Despite these limitations, however, our work provides useful evidence on the dissociation between explicit and implicit timing in clinical populations. Future research should further explore such a dissociation by employing other measures of implicit and explicit timing with a reduced motor component and by adopting the same temporal range in the two types of tasks.

To conclude, our results support the existence of two different processes underlying explicit and implicit timing in PD participants. Moreover, we extend previous studies in the field of implicit timing by providing the first experimental evidence of preserved sequential effects in PD.

### AUTHOR CONTRIBUTIONS

All the authors were involved in the conception of the work. GM and MC are co-first authors, performed data analysis, drafted the manuscript and were involved in all subsequent revisions. GM and RDG were involved in data collection. AV, ÁC, RDG and FS provided ongoing contributions and feedback throughout the experimental process. They also provided additional revisions to the manuscript. All the authors have approved the final version of the manuscript and agree to be accountable for all aspects of the work.

### ACKNOWLEDGMENTS

The information in this manuscript and the manuscript itself have never been published either electronically or in print. There are no financial or other relationships that could be interpreted as a conflict of interest affecting this manuscript. This research received no specific grant from any funding agency in either the commercial or the not-for-profit sectors. The authors gratefully acknowledge the participants and their families and friends who participated in this study; moreover, the authors gratefully acknowledge the staff of the Center for Neurocognitive Rehabilitation (CeRiN), Center for Mind/Brain Sciences, University of Trento (Italy) and Nicola Cellini for his useful advice. MC and AV were funded by the European Research Council Starting Grant No. 313692 (FP7/2007–2013) to AV.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Mioni, Capizzi, Vallesi, Correa, Di Giacopo and Stablum. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Temporal Resolution Needed for Auditory Communication: Measurement With Mosaic Speech

Yoshitaka Nakajima<sup>1</sup> \*, Mizuki Matsuda<sup>2</sup> , Kazuo Ueda<sup>1</sup> and Gerard B. Remijn<sup>1</sup>

<sup>1</sup> Department of Human Science, Faculty of Design/Research Center for Applied Perceptual Science, Kyushu University, Fukuoka, Japan, <sup>2</sup> Nihon Kohden Corporation, Tokyo, Japan

Temporal resolution needed for Japanese speech communication was measured. A new experimental paradigm that can reflect the spectro-temporal resolution necessary for healthy listeners to perceive speech is introduced. As a first step, we report listeners' intelligibility scores of Japanese speech with a systematically degraded temporal resolution, so-called "mosaic speech": speech mosaicized in the coordinates of time and frequency. The results of two experiments show that mosaic speech cut into short static segments was almost perfectly intelligible with a temporal resolution of 40 ms or finer. Intelligibility dropped for a temporal resolution of 80 ms, but was still around 50%-correct level. The data are in line with previous results showing that speech signals separated into short temporal segments of <100 ms can be remarkably robust in terms of linguistic-content perception against drastic manipulations in each segment, such as partial signal omission or temporal reversal. The human perceptual system thus can extract meaning from unexpectedly rough temporal information in speech. The process resembles that of the visual system stringing together static movie frames of ∼40 ms into vivid motion.

#### Edited by:

Felipe Ortuño, Universidad de Navarra, Spain

#### Reviewed by:

Massimo Grassi, Università degli Studi di Padova, Italy

Michel Hoen, Oticon Medical, Sweden

#### \*Correspondence:

Yoshitaka Nakajima nakajima@kyudai.jp

Received: 24 November 2017 Accepted: 04 April 2018 Published: 24 April 2018

#### Citation:

Nakajima Y, Matsuda M, Ueda K and Remijn GB (2018) Temporal Resolution Needed for Auditory Communication: Measurement With Mosaic Speech. Front. Hum. Neurosci. 12:149. doi: 10.3389/fnhum.2018.00149

Keywords: speech, spectro-temporal resolution, intelligibility, mosaic, movie frames

### 1. INTRODUCTION

Speech can remain considerably intelligible even when it is drastically manipulated in the temporal domain. One example is the word intelligibility (articulation) of speech in which 50-ms portions are alternately played and silenced, as reported by Miller and Licklider (1950), who discovered the illusory continuity (see also Vicario, 1960). Intelligibility does not change in such "gated speech" even if the temporal gaps are simply removed, shortening the total duration (Fairbanks and Kodman, 1957; Shafiro et al., 2016). The perception of locally time-reversed speech is also to be noted. When speech is cut into segments of 50 ms, for example, and if each segment is reversed in time, intelligibility is still quite well preserved (Steffen and Werani, 1994; Saberi and Perrott, 1999; Ueda et al., 2017).

Neuroscientific research performed in the last decade has suggested that different types of neural oscillations are involved in the segmentation and organization of speech into perceptual units (Giraud and Poeppel, 2012; Chait et al., 2015). Neural oscillations at modulation frequencies around 30–50 Hz, corresponding to a temporal resolution around 20–33 ms, are considered to be involved in phonemic processing. If this temporal resolution is attained, the human auditory system should receive sufficient information to grasp the rhythmic intensity fluctuations in speech and music (Ding et al., 2017).
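The correspondence between oscillation frequency and temporal resolution quoted above can be checked with a short worked calculation, assuming that the resolution is taken to be the period of one oscillation cycle (this reading is an assumption made for illustration):

```latex
T = \frac{1}{f}: \qquad \frac{1}{50\ \text{Hz}} = 20\ \text{ms}, \qquad \frac{1}{30\ \text{Hz}} \approx 33\ \text{ms}
```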

To some extent, the segmentation and organization of speech into perceptual units resembles the way the visual system strings together static movie frames into motion pictures. The motion of visual objects can be expressed vividly by presenting successive still frames at a rate of 24 frames per second: a temporal resolution of 42 ms is sufficient to perceive motion pictures. This raised our research question: Is it possible to express speech sentences utilizing auditory counterparts of movie frames? One way to deal with this issue is to use locally time-reversed speech (Saberi and Perrott, 1999). This paradigm is used widely to measure the temporal resolution needed for speech communication (Ueda et al., 2017). Basically, this procedure is considered to make the temporal information within each segment unavailable. Thus, each segment plays the role of a movie frame. This analogy does not seem to work precisely, however, since the reversed segment still keeps the original temporal change in the reversed direction. A very short explosion, for example, may be preserved as a similar noise burst in a reversed segment, but it may appear with distorted timing. This can either improve or reduce the listeners' performance. In other words, experimental noise, whose influence is unpredictable even qualitatively, remains. We thus created "mosaic speech" in analogy with visual mosaic images (Harmon, 1973) of monochrome pictures (**Figure 1**). This was done by concatenating local spectra of speech signals, which by definition were static, resulting in a new type of degraded speech suitable for studying the temporal resolution needed for speech perception. Since each local spectrum in the signal should be playable as a steady-state sound, we added up 17 narrowband noises, whose frequency bands covered a range of 0.1–4.4 kHz, corresponding to critical bands simulating the auditory periphery (Fastl and Zwicker, 2007).

The established paradigm of using locally time-reversed speech has played an important role in shedding light on the temporal resolution needed for speech communication. As mentioned above, however, this paradigm contains unavoidable experimental contamination for that purpose, since the temporal fine structure of each speech segment is preserved but reversed; the fine structure is not suppressed.

The preserved temporal fine structure may inappropriately facilitate speech perception. For example, stop consonants often begin with explosions, which are approximately very short band-noise bursts, and this is one of the cues to identify stop consonants (Liberman, 1996). If an explosion is located in the middle of a segment, a short portion of band noise is still there even when the segment is reversed in time, and this may help the perception of the stop consonant. In this case, locally time-reversed speech fails to suppress the fine temporal structure that should be excluded.

On the other hand, the reversing procedure may have a negative influence on speech perception. Stop consonants are also differentiated perceptually by formant transitions just before the following vowel, i.e., ascending or descending formant frequencies (Liberman, 1996). If a clear formant transition is included in a temporal segment, then the ascending or descending movement is reversed in time by the time-reversing procedure; this is very likely to distort consonant identification.

FIGURE 1 | Original (A; taken by Feng Li) and mosaic image (B) of the Kinkakuji, Kyoto, Japan.

Thus, locally time-reversed speech patterns are not necessarily well suited to controlling temporal resolution systematically. The unchanged, although reversed, waveforms may preserve temporal information that should not be used by the participants, or the reversing procedure may introduce unrelated disturbances into the experiment. In order to avoid this kind of experimental contamination, the only practical way is to replace each speech segment with a sound that has no clear temporal structure but keeps the basic spectral shape. This was the most important reason we created the mosaic speech paradigm. Once the paradigm is established, it can be employed for many other purposes, just as the mosaic paradigm in vision.

It is important to compare experimental results obtained in the locally time-reversing paradigm and in the mosaic speech paradigm. If there is a discrepancy, perhaps the past experiments in the locally time-reversed paradigm should be interpreted with some caution.

Speech signals can be mosaicized in the coordinates of time and frequency. By manipulating how fine or rough the time-frequency mosaic tiles are, we are able to generate various mosaic speech stimuli, within the constraint of the time-frequency uncertainty principle (Cohen, 1989). This can be done without adding or changing irrelevant cues, making it possible to create standardized tests to check the temporal or the frequency resolution given to or needed by the auditory system for speech perception. In the present study, we mainly investigated the precision of temporal resolution (see also Supplementary Material). The frequency resolution was fixed at that of critical bandwidths, and the temporal (time) resolution was varied systematically. In order to obtain behavioral data on the temporal resolution needed for speech perception, we measured the intelligibility of Japanese mosaic speech. Mosaic speech contains neither distinct pitch information nor cues as to the temporal fine structures of the original speech.

In detail, mosaic speech was made as follows; relevant speech signals are illustrated in **Figures 2**–**4**. Note that Japanese speech was used in the present experiments, but an English sentence is used in this explanation to enable the reader to see the correspondence between the original speech and its spectrogram. The time axis and the frequency axis of a sound spectrogram are not completely independent of each other, contrary to the horizontal and the vertical axis of visual images. For acoustic signals, the uncertainty principle between time and frequency (the inverse of time) makes it essentially impossible to control both time and frequency very accurately (Cohen, 1989). Since our direct purpose was to gain insight into the temporal aspects of speech communication, we put our priority on obtaining a temporal resolution of 20 ms, the inverse of 50 Hz, considering the fact that a period of vocal-folds vibration of male speakers can be around 10 ms (Raphael et al., 2011). Fortunately, this was still compatible with the finest frequency resolution of the critical bandwidth, i.e., 100 Hz (Fastl and Zwicker, 2007). The smallest possible size of the temporal segmentation for mosaicization was thus determined as 20 ms.

For generating mosaic speech, we first separated the speech signal (**Figure 2A**) into critical bands. Each critical band contains a temporal intensity fluctuation presumably conveying linguistic information. We generated a band noise in each critical band, which was amplitude-modulated to make its intensity fluctuation equivalent to that observed in the same frequency band of the original speech signal. This follows basically the procedure to make noise-vocoded speech (Shannon et al., 1995; Smith et al., 2002; Kishida et al., 2016), which is exemplified in **Figure 3A**.

This noise-vocoded speech was almost perfectly intelligible (see Ellermeier et al., 2015, for related data); it contains linguistic information sufficient for speech perception. We calculated the intensity fluctuation of the speech signal in each critical band, which enabled us to calculate average intensity in any given temporal segment. We then cut the intensity fluctuation for each critical band into segments of 80 ms, for example, and calculated the average intensity in each segment. By replacing each temporal segment of each critical band with a band noise portion of the same average intensity, mosaic speech was obtained. Each noise portion was smoothed with a rise and a fall time of 5 ms to avoid spectral splatters (**Figure 3B**).
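The segment-averaging step described above can be sketched for a single critical band as follows. This is a minimal illustration, not the authors' implementation: the function name `mosaicize_band` and the toy envelope values are hypothetical, the segment length is given in samples rather than milliseconds, and the band-noise carrier and the 5-ms rise and fall times are left out.

```python
def mosaicize_band(intensity, segment_len):
    """Replace each run of `segment_len` samples by its mean intensity.

    `intensity` is the intensity fluctuation of one critical band
    (hypothetical toy values here); the last segment may be shorter.
    """
    out = []
    for start in range(0, len(intensity), segment_len):
        segment = intensity[start:start + segment_len]
        mean = sum(segment) / len(segment)
        # Every sample in the segment gets the same (average) intensity.
        out.extend([mean] * len(segment))
    return out


# Toy intensity envelope of one critical band (arbitrary units).
envelope = [1.0, 3.0, 2.0, 6.0, 5.0, 1.0, 4.0]
mosaic = mosaicize_band(envelope, 2)
print(mosaic)
```

Applying the same function to each of the 17 bands with a common segment length would give the block-wise constant intensity pattern that the text calls mosaic speech.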

For one of the intelligibility experiments (Experiment 2), we used not only mosaic speech, but also locally time-reversed speech with smoothing and without smoothing, as well as the original speech. Locally time-reversed speech was made by reversing each segment in time as in **Figure 4A** (Steffen and Werani, 1994; Saberi and Perrott, 1999). This is a well-established way to degrade the temporal resolution of speech (Ueda et al., 2017). Without smoothing, the abrupt edges of the segments are heard clearly as click-like sounds as can be seen in the spectrogram (**Figure 4B**). We also prepared the original speech with the same rise and fall times for control (**Figure 2B**).
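Local time reversal without smoothing amounts to reversing the waveform inside each fixed-length segment. The sketch below illustrates this on a toy sequence; the function name `locally_time_reverse` is hypothetical, and the conversion of 40 ms to samples assumes the 22.05-kHz sampling frequency reported in the Stimuli section.

```python
def locally_time_reverse(samples, segment_len):
    """Reverse the waveform inside each segment of `segment_len` samples."""
    out = []
    for start in range(0, len(samples), segment_len):
        # Each segment keeps its position but is played backwards.
        out.extend(reversed(samples[start:start + segment_len]))
    return out


# Segment length in samples for a 40-ms window at 22.05 kHz.
sampling_rate = 22050
segment_len_40ms = round(0.040 * sampling_rate)

# Toy "waveform": sample indices make the reordering easy to see.
toy = list(range(10))
reversed_toy = locally_time_reverse(toy, 4)
print(segment_len_40ms, reversed_toy)
```

Applying the function twice with the same segment length restores the original sequence, which is a convenient sanity check.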

### 2. RESULTS

Twenty-four participants (n = 4 in Experiment 1 and n = 20 in Experiment 2) were asked to write down what they heard in Japanese hiragana letters, each corresponding clearly in most cases to one mora (morae are basic phonological units of Japanese, which are in many cases equal to and sometimes shorter than syllables). The percentage of correct mora identification was calculated by counting the number of written morae that corresponded to the morae pronounced in the three sentences assigned to each stimulus condition (**Figure 5**).

FIGURE 3 | The original speech as in Figure 2 was used. (A) Noise-vocoded speech made of amplitude-modulated noises in 17 critical bands; (B) mosaic speech: the noise-vocoded speech mosaicized into 80-ms segments. Intuitively, the mosaic speech was constructed by cutting the original speech as it appeared in the spectrogram into time-by-frequency blocks of 80 ms and one critical bandwidth, and by leveling the sound-energy density in each block. Because the uncertainty principle between time and frequency does not allow the spectrum of a noise portion to be kept within a narrow frequency band if the portion appears and disappears abruptly, each block was shaped with a rise time and a fall time of 5 ms.

The intelligibility of original speech, measured as the percentage of correct mora identification, was almost perfect for any segment duration. The intelligibility of mosaic speech and locally time-reversed speech, with or without rise and fall times, was nearly perfect when the segment duration was 20 or 40 ms. Intelligibility decreased monotonically after that as the segment duration increased. The results of Experiment 1 were very close to the results obtained in the same conditions in Experiment 2; Experiment 1 may therefore be considered a kind of pilot experiment whose results were fully replicated in Experiment 2. Thus, only the results of Experiment 2 were statistically analyzed. For each participant and for each stimulus type, except for original speech, the segment duration corresponding to 50% correct mora identification was calculated by linear interpolation. The average values were 75.3 ms for mosaic speech, 65.8 ms for locally time-reversed speech with rise and fall times, and 64.3 ms for locally time-reversed speech without rise and fall times.
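The 50% point can be obtained by interpolating between the two neighboring segment durations whose scores bracket the criterion. The sketch below is an illustration under stated assumptions: the function name `threshold_50` is hypothetical, the scores are invented for the example rather than taken from the experiments, and interpolation is done on a linear duration axis, which the text does not specify.

```python
def threshold_50(durations, percent_correct, criterion=50.0):
    """Segment duration at which the score crosses `criterion`.

    Uses linear interpolation between the two adjacent conditions that
    bracket the criterion; returns None if the score never crosses it.
    """
    points = list(zip(durations, percent_correct))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 >= criterion >= p1 and p0 != p1:
            return d0 + (p0 - criterion) * (d1 - d0) / (p0 - p1)
    return None


durations = [20, 40, 80, 160, 320]       # segment durations in ms
scores = [98.0, 95.0, 45.0, 10.0, 2.0]   # hypothetical % correct, one listener
print(threshold_50(durations, scores))
```

With these invented scores the crossing lies between 40 and 80 ms, at 76 ms.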

A Friedman two-way analysis of variance by ranks (Siegel and Castellan, 1988) was performed on these 50% points for these three stimulus types. The effect of the stimulus types was significant (N = 20, k = 3, F<sub>r</sub> = 17.5; p < 0.001). Multiple comparisons after that (following Siegel and Castellan) indicated significant differences between mosaic speech and locally time-reversed speech with/without rise and fall times (p < 0.01; p < 0.01). The difference between the two types of locally time-reversed speech was not significant. In sum, the participants' performance was significantly better for mosaic speech than for locally time-reversed speech, and the 50% threshold for Japanese mosaic speech exceeded 70 ms.
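For readers unfamiliar with the test, the Friedman statistic is computed from the rank sums of the conditions within participants. The sketch below uses the standard textbook formula without a correction for ties and does not compute the p value; the function names and the four rows of thresholds are hypothetical and are not the data of Experiment 2.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Friedman F_r for N participants (rows) by k conditions (columns)."""
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Hypothetical 50% points (ms): mosaic, reversed with ramps, reversed without.
rows = [
    [75, 66, 64],
    [80, 70, 65],
    [72, 60, 62],
    [78, 68, 66],
]
print(friedman_statistic(rows))
```

For larger samples the statistic is usually referred to a chi-square distribution with k − 1 degrees of freedom.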

### 3. DISCUSSION

For all stimulus conditions, mora identification was nearly perfect for temporal windows as fine as 20 or 40 ms. It dropped sharply, however, except in the original-speech condition, as the temporal windows widened from 40 to 80 ms and further. Mora identification dropped to 10% or below with even wider temporal windows.

FIGURE 5 | ... standard deviations. Twenty-four participants were asked to write down what they heard in Japanese hiragana letters, indicating morae, basic phonological units of Japanese, which are in many cases equal to and sometimes shorter than syllables.

### 3.1. Locally Time-and-Frequency-Reversed Stimuli

As mentioned earlier, measuring something related to the temporal resolution of the auditory system or acoustic signals is often difficult because time and frequency are not completely independent. In a preliminary study, we reversed speech not only in time but also in frequency, thus imposing a temporal and frequency grid upon Japanese spoken sentences (see Supplementary Material). These locally time-and-frequency-reversed stimuli generated results similar to the present results. Intelligibility dropped sharply when the temporal window widened from 40 to 80 ms.

### 3.2. Mosaic vs. Locally Time-Reversed Speech

The results of the experiments showed that the participants' performance was significantly better for mosaic speech than for locally time-reversed speech, either with or without smoothing. This indicates that the perception of locally time-reversed speech was degraded by the reversed temporal fine structure, which itself can be a very interesting research topic in the future. In order to measure the temporal resolution needed to make speech signals intelligible, in other words, to measure how far the temporal resolution can be lowered without harming intelligibility, mosaic speech seems more suitable than locally time-reversed speech. Another advantage of mosaic speech is that both the temporal and the frequency resolution can be manipulated for future studies, including clinical ones.

### 3.3. Temporal Resolution of the Auditory System and Speech Perception

The problem of the time-frequency uncertainty principle is usually more obvious when very short time intervals are concerned. Some experimental paradigms have been employed to circumvent this problem, e.g., phase detection or temporal gap detection. Based on research with these paradigms, the ultimate temporal acuity in the auditory modality, except for dichotic situations, is close to 2 ms (Eddins and Green, 1995). There are often temporal changes in speech taking place in periods around 40 ms. Examples of these changes are formant transitions as in /w + (vowel)/ and /p + (vowel)/ or explosions as in /p + (vowel)/ and /g + (vowel)/, followed by a far weaker intensity. Therefore, it is surprising that a temporal resolution comparable to that of the motion picture system is sufficient to perceive the linguistic content of speech.

One way to assess temporal acuity is to measure the gap detection threshold. This paradigm is used, for example, to investigate decreased speech intelligibility in the elderly with otherwise preserved pure tone thresholds (Ozmeral et al., 2016). Along this line, it would be also necessary to measure the listeners' capacity to grasp linguistic contents even when the temporal resolution of the speech signals is limited, as in a reverberant room. The "mosaic speech" paradigm introduced in the present study can be very useful for this purpose.

The present study shows that linguistic information is conveyed almost perfectly by auditory blocks around 40 ms. It is interesting that linguistic contents of speech can be conveyed almost perfectly by presenting successive spectra at intervals of 40 ms just as the movie system can represent motions by presenting successive static pictures at similar intervals. Although the auditory and the visual modality are different in their peripheries, their cortical organizations may employ similar temporal grids (deCharms et al., 1998).

The present behavioral data corroborate neuroscientific research indicating that neural oscillations around 30–50 Hz are involved in the segmentation and organization of ongoing speech signals into perceptual units (Giraud and Poeppel, 2012; Chait et al., 2015). If phonemic processing as related to formant transitions or noise-vowel transitions is really based on such oscillations, processing should deteriorate for degraded speech whose temporal segment size exceeds ∼20–33 ms. This agrees with the present data.

Widening the temporal windows from 40 to 80 ms caused a considerable drop of mora identification both for mosaic speech and locally time-reversed speech. We thus conclude that the temporal resolution needed to convey linguistic information is close to 40 ms. Since the average mora duration in the 60 sentences utilized here was 120–150 ms, this indicates that one Japanese mora, typically corresponding to one consonant and one vowel in this order ("Kinkakuji" has 5 morae including a special mora for "n"), can be conveyed by 3–4 still (spectral) frames. It thus is possible to synthesize intelligible speech connecting still frames in time, as if they were Lego blocks.
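The estimate of 3–4 frames per mora follows from dividing the reported mora durations by the frame length. A short worked check, assuming a frame length of exactly 40 ms:

```latex
\frac{120\ \text{ms}}{40\ \text{ms}} = 3, \qquad \frac{150\ \text{ms}}{40\ \text{ms}} = 3.75 \approx 4
```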

In order to understand the mechanism of speech communication, it is of vital importance to determine how far the speech signal can be degraded in the temporal dimension. An unavoidable step is to examine how long homogeneous temporal units can be while still conveying linguistic information. The systematic use of locally time-reversed speech has been a substitute for this paradigm (e.g., Ueda et al., 2017). The reversed temporal units are never static, however, and this makes the interpretation of the perceptual data difficult. Mosaic speech whose frequency resolution was as fine as critical bands solved this problem; it is now established that static temporal units of 40 ms are sufficient for reasonable speech communication.

### 4. MATERIALS AND METHODS

### 4.1. Participants

Four (Experiment 1) or 20 (Experiment 2) native-Japanese speakers participated. They were 4 men aged 21–25 years (Experiment 1) or 9 women and 11 men aged 19–25 years (Experiment 2), all with normal hearing.

### 4.2. Stimuli

Four stimulus types were generated: (1) original speech with a 5-ms rise and a 5-ms fall time (**Figure 2B**), (2) mosaic speech (**Figure 3B**), (3) locally time-reversed speech with a 5-ms rise and a 5-ms fall time (**Figure 4A**), and (4) locally time-reversed speech without rise and fall times (**Figure 4B**). Only mosaic speech was employed in Experiment 1, while all four types of stimuli were employed in Experiment 2.

Japanese speech samples were obtained from the "NTT-AT Multi-Lingual Speech Database 2002." The samples were spoken by a female native-Japanese speaker and recorded with a sampling rate of 16 kHz and 16-bit quantization. The samples were edited to remove irrelevant silent portions and noises, and converted into computer-oriented audio (.wav) files with a sampling frequency of 22.05 kHz using Praat (Boersma and Weenink, 2016).

Original-speech stimuli were shaped with a temporal grid as in the other types of stimuli. The width of the grid window was varied in 5 steps, 20, 40, 80, 160, and 320 ms, and the window edges were smoothed with 5-ms rise and fall times.

Mosaic speech stimuli were created by calculating the average sound-energy density within each spectrographic block of a speech sample. The speech signals were first delimited by narrow frequency bands, and then shaped by temporal windows with 5-ms rise and fall times, turning into mosaic speech. The width of each frequency band was determined so as to simulate a critical band in the same frequency range (Fastl and Zwicker, 2007). Seventeen frequency bands covering the range 0.1–4.4 kHz were utilized. This frequency range was enough to make noise-vocoded speech almost perfectly intelligible (Shannon et al., 1998). How sound energy density should change in time in each frequency band was calculated as a target, utilizing a moving average of intensity with a Gaussian window in time (σ = 5 ms). To realize this target, we generated a white noise as long as the speech signal, adding temporal margins. This noise was divided into the same 17 frequency bands. Sound energy density of this noise as a function of time was calculated from each frequency band utilizing the same moving average as above. This indicates an unavoidable small level fluctuation of the noise in each frequency band, which was going to be canceled to a certain degree in the next step. Finally, the noise within each frequency band was amplitude-modulated so that the original sound energy density of the noise was transformed to the calculated target density. Thus, the original speech was converted into a combination of band noises whose intensities were nearly constant within each time window, but with 5-ms rise and fall times.

Locally time-reversed speech stimuli were shaped with a temporal grid, and the waveform in each temporal window was reversed in time, with and without 5-ms rise and fall times.
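The final amplitude-modulation step can be sketched for one frequency band as follows. This is a simplified illustration under several assumptions: the function names are hypothetical, the Gaussian window is truncated at three standard deviations, σ is given in samples (σ = 5 ms would be about 110 samples at 22.05 kHz), and the band-pass filtering that produces the band noise is not shown.

```python
import math


def gaussian_smooth(values, sigma_samples):
    """Moving average with a Gaussian window truncated at 3 sigma."""
    half = int(3 * sigma_samples)
    weights = [math.exp(-0.5 * (i / sigma_samples) ** 2)
               for i in range(-half, half + 1)]
    out = []
    for n in range(len(values)):
        acc = norm = 0.0
        for i, w in enumerate(weights):
            m = n + i - half
            if 0 <= m < len(values):  # ignore window parts outside the signal
                acc += w * values[m]
                norm += w
        out.append(acc / norm)
    return out


def modulate_to_target(noise, target_density, sigma_samples):
    """Scale band noise so its smoothed energy density follows the target."""
    power = [x * x for x in noise]
    noise_density = gaussian_smooth(power, sigma_samples)
    # Amplitude gain is the square root of the energy-density ratio.
    return [x * math.sqrt(t / d) if d > 0 else 0.0
            for x, t, d in zip(noise, target_density, noise_density)]


# Toy band "noise" with constant power 1 and a constant target density of 4.
noise = [1.0, -1.0] * 10
target = [4.0] * len(noise)
shaped = modulate_to_target(noise, target, sigma_samples=2)
print(shaped[:4])
```

Because the gain is derived from the smoothed density of the noise itself, small level fluctuations of the noise are partly compensated, in the spirit of the cancellation described in the text.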

### 4.3. Conditions

In Experiment 1, only mosaic speech was used, and the grid window was varied in 4 steps. Three different speech sentences of 17–20 morae within the duration range of 2.27–2.98 s were used. Each participant encountered 3 sentences for each stimulus condition, adding up to 55–58 morae in total. The average duration of one mora calculated for each sentence was 0.13–0.15 s.

In Experiment 2, combining the 4 stimulus types and the 5 steps of the temporal grid resulted in 20 stimulus conditions in total. For each stimulus condition, 3 different speech sentences of 16–19 morae within the duration range of 2.13–2.66 s were used, adding up to 53 or 54 morae in total. Each participant thus encountered 60 stimuli in total. The average duration of one mora calculated for each sentence was 0.12–0.15 s.

In both experiments, each sentence appeared only once for each participant and for each stimulus condition. The sound energy per unit time of the speech stimuli was equalized.

### 4.4. Procedure

Each stimulus was presented once, diotically through headphones, 0.4 s after the participant clicked a "Play" button on the computer screen. After presentation, the participant wrote down on paper the morae he/she had heard in Japanese hiragana letters, avoiding guessing from the context as far as possible. A hiragana letter in most cases corresponds to a clearly distinguishable speech sound, but in some cases it was necessary for the participant to use both a hiragana letter and a few Roman letters to avoid ambiguity. The participant was clearly instructed how to do this before the training trials. The 60 speech stimuli were randomly presented to each participant in 4 blocks. A warm-up trial was added to the 15 trials (stimuli) for each block. The participant was first asked to perform a practice block consisting of 20 trials. The sentences used in the practice block and the warm-up trials were not used again.

The research was conducted with prior approval of the Ethics Committee of Kyushu University; all methods employed were in accordance with the guidelines provided by the Japanese Psychological Association. The participants provided written informed consent prior to their participation.

The stimuli were presented to the participant in a soundproof room, from a computer (Frontier KZFM71/N) with an audio board (E-MU 0404) that was installed outside the soundproof room. From the computer, the stimuli were passed through an audio processor (Onkyo SE-U55GX), a low-pass filter (NF DV-04 DV8FL; cut-off frequency 15 kHz), a graphic equalizer (Roland RDQ-2031), and a headphone amplifier (STAX SRM-3235), before being presented to the participant through headphones (STAX SR-307). The equalizer was used to keep the flat shape of the frequency characteristics of the headphones, while the low-pass filter was used for anti-aliasing. Since the sampling frequency of the speech files was 22.05 kHz, the cut-off frequency of 15 kHz could not suitably deal with aliasing in between 14.05 and 15 kHz. The sound energy related to this range, however, was negligible. The audio output level was calibrated so that a 1-kHz pure tone of the same intensity as the average intensity of the stimuli was at 70 dBA with a precision sound level meter (Naganokeiki 2071) mounted with an artificial ear (Brüel and Kjær 4153).


### AUTHOR CONTRIBUTIONS

YN: designed the study, wrote prototypes of computer programs, analyzed the data, and wrote the paper; MM: designed the study, and collected and analyzed the data; KU and GR: prepared for the experiment, analyzed the data, and wrote the paper.

### ACKNOWLEDGMENTS

This study was supported by the Japan Society for the Promotion of Science (25242002 in FYs 2013–2017 and 17H06197 in FYs 2017–2019 to YN) and the Kyushu University Center for Clinical and Translational Research (FY 2017). Yu Tanaka assisted us in analyzing data.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnhum.2018.00149/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Nakajima, Matsuda, Ueda and Remijn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Training in Temporal Information Processing Ameliorates Phonetic Identification

Aneta Szymaszek\*, Anna Dacewicz, Paulina Urban and Elzbieta Szelag

Laboratory of Neuropsychology, Nencki Institute of Experimental Biology of Polish Academy of Sciences, Warsaw, Poland

Many studies have revealed a link between temporal information processing (TIP) in a millisecond range and speech perception. Previous studies indicated a dysfunction in TIP accompanied by deficient phonemic hearing in children with specific language impairment (SLI). In this study we concentrate on phonetic identification in SLI, using the voice-onset-time (VOT) phenomenon in which TIP is built-in. VOT is crucial for speech perception, as stop consonants (like /t/ vs. /d/) may be distinguished by an acoustic difference in time between the onsets of the consonant (stop release burst) and the following vibration of vocal folds (voicing). In healthy subjects two categories (voiced and unvoiced) are determined using the VOT task. The present study aimed at verifying whether children with SLI show a similar pattern of phonetic identification as their healthy peers and whether the intervention based on TIP results in improved performance on the VOT task. Children aged from 5 to 8 years (n = 47) were assigned to two groups: normal children without any language disability (NC, n = 20), and children with SLI (n = 27). In the latter group participants were randomly classified into two treatment subgroups, i.e., experimental temporal training (EG, n = 14) and control non-temporal training (CG, n = 13). The analyzed indicators of phonetic identification were: (1) the boundary location (α) determined as the VOT value corresponding to 50% voicing/unvoicing distinctions; (2) ranges of voiced/unvoiced categories; (3) the slope of the identification curve (β) reflecting the identification correctness; (4) percent of voiced distinctions within the applied VOT spectrum. The results indicated similar α values and similar ranges of voiced/unvoiced categories between SLI and NC. However, β in SLI was significantly higher than that in NC. After the intervention, a significant improvement in β was observed only in EG. They achieved the level of performance comparable to that observed in NC. The training-related improvement in CG was non-significant. Furthermore, only in EG the β values in post-test correlated with measures of TIP as well as with phonemic hearing obtained in our previous studies. These findings provide further evidence that TIP is omnipresent in language communication and reflected not only in phonemic hearing but also in phonetic identification.

Keywords: temporal information processing (TIP), specific language impairment (SLI), voice-onset-time, phonetic identification, temporal intervention, voicing contrast detection

#### Edited by:

Deana Davalos, Colorado State University, United States

#### Reviewed by:

Marianne Latinus, INSERM U930 Imagerie et Cerveau, France

Yoshitaka Nakajima, Kyushu University, Japan

> \*Correspondence: Aneta Szymaszek a.szymaszek@nencki.gov.pl

Received: 20 October 2017 Accepted: 08 May 2018 Published: 06 June 2018

#### Citation:

Szymaszek A, Dacewicz A, Urban P and Szelag E (2018) Training in Temporal Information Processing Ameliorates Phonetic Identification. Front. Hum. Neurosci. 12:213. doi: 10.3389/fnhum.2018.00213

## INTRODUCTION

### Characteristics and Associated Features of Specific Language Impairment

Specific Language Impairment (SLI, diagnosed as F.80.1 and F.80.2 according to ICD 10; Pużyński and Wciórka, 2000) is a form of developmental language impairment in which children demonstrate difficulties in understanding and/or producing speech. However, their general cognitive functioning and non-verbal intelligence remain within the normal range. The language impairment cannot be explained by hearing problems, neurological and speech mechanism abnormalities or environmental factors. The prevalence of SLI is estimated to be approximately 7% among 5-year-olds (Tomblin et al., 1997).

Specific language impairment generates pervasive social problems, possibly related to lower future academic achievement. Hence, it is necessary to identify the causal factors of SLI and to develop effective speech therapy that may provide language-disordered children with the same opportunities as their typically developing peers.

Although developmental language disorders have been investigated for almost 150 years, the neural basis of SLI still remains unclear. One theoretical approach assumes difficulties associated with deficient perception of auditory input. In the early seventies Tallal and colleagues revealed that SLI children are less efficient in discriminating between verbal (Tallal and Piercy, 1974; Tallal, 1975) and non-verbal (Tallal and Piercy, 1973; Tallal, 1975) sounds presented in rapid succession. Since that time, the researchers considered difficulties in temporal information processing (TIP) of rapidly changing acoustic events as one of the core problems in SLI.

This indication is in line with a long discussion about the relationship between TIP and language in norm and pathology. Several subject populations, including children with language-learning impairment (Benasich and Tallal, 2002; Fitch and Tallal, 2003), children or adults with dyslexia (Tallal et al., 1995; Rey et al., 2002) and patients with aphasia following left hemisphere brain lesions (Swisher and Hirsh, 1972; von Steinbüchel et al., 1999; Wittmann et al., 2004; Fink et al., 2006) displayed deficits in the perception of the temporal order of two stimuli presented in rapid succession. They showed an elevated temporal order threshold (TOT), i.e., they needed a longer time interval between two sounds to report correctly their temporal relation 'before-after.'

In our previous study (Szelag et al., 2015) the coexistence of deficient TIP and disordered phonemic hearing was confirmed inter alia in children with SLI. We indicated that they displayed higher TOT (of about 184 ms) than normal peers (about 96 ms) accompanied by deficient phonemic hearing. In the present study we verify in children with SLI the co-occurrence of these deficits with disordered phonetic identification using the voice-onset-time (VOT), which may be considered a sensitive measure combining TIP with phonetic aspects of speech perception.

### TIP in Voiced/Voiceless Categorization

Speech perception requires neural encoding of both spectral acoustic and temporal cues. Different speech sounds (consonants and vowels) vary in their spectrotemporal characteristics. Whereas vowels present a relatively steady-state pattern of formants, stop consonants are temporally much more transient and acoustically variable (Obleser et al., 2007). Subjects with deficient TIP have accompanying difficulties in stop consonants reception which are critical in time, whereas such deficits rarely affect the perception of vowels. Moreover, stop consonant – vowel syllables (e.g., /TO/ and /DO/) are distinguished by acoustic differences in time between the onset of the consonant (stop release burst) and the onset of the following vowel (voicing). Such voiced/unvoiced categorical perception is the most frequently studied phonetic feature. It is measured with the VOT phenomenon defined as the time interval between the release from stop closure (onset of the consonant) and the onset of vibration of the vocal folds (onset of the following vowel, Lisker and Abramson, 1964; Molfese, 1980). It is worth mentioning that different languages are characterized by various temporal cues for VOT, thus, the relationship between the burst and laryngeal pulsing. The positive VOT reflects a situation when the laryngeal pulsing is preceded by a burst. In contrast, in case of negative VOT the burst is preceded by the laryngeal pulsing (Szelag and Szymaszek, 2014).

Some languages, like American English or German, use only positive VOT, reflected in values from 0<sup>1</sup> to 20 ms for voiced categorizations, like /BA/, /DA/, /GA/ and longer intervals of around 30–80 ms for voiceless categorizations, like /PA/, /TA/, /KA/ (King et al., 2008). In other languages, e.g., Slavic languages or French, negative VOT is also observed. The voiced contrast detection varies along a continuum of VOT values. Among Slavic-speaking (including Polish) environments these values are located from −90 to 20 ms. Thus, the differentiation between voiced and unvoiced contrasts is based on both negative and positive VOT.

Independently of these cross-linguistic differences in voiced/unvoiced categorical perception, some language universals emerge. They comprise a similar time gap of some tens of milliseconds with either positive or negative values critical for such differentiation in a given language. Efficient TIP in the millisecond time range seems to be crucial for the categorical voiced/unvoiced contrast perception, independently of the natural language. The VOT phenomenon as an important aspect of phonetic identification in speech perception has been a topic of many studies including both normal and clinical populations (Giraud et al., 2007; King et al., 2008; Doellinger et al., 2011; Chobert et al., 2012).

### Timing-Related Approaches to Remediation of Language Acquisition

As difficulties in rapid auditory processing were indicated as the crucial deficit at least in some children with SLI and the conventional speech therapy often seems not efficient enough, the interventions based on TIP were developed (Merzenich et al., 1996; Tallal et al., 1996; Wittmann and Fink, 2004; Cohen et al., 2005; Gillam et al., 2008; Given et al., 2008). The widely known remediation software Fast ForWord® was successfully implemented in children with SLI and resulted in improvement of expressive and receptive language skills. In addition, the effectiveness of the Dr. Neuronowski® software, focused on TIP, developed in our Institute (Szelag and Szymaszek, 2016), was verified in children with SLI in our previous study (Szelag et al., 2015). We found that such intervention resulted in lowered TOT values, reflecting improved TIP in comparison to the non-significant change after control non-temporal training. The improvement in TIP was accompanied by amelioration of language skills in both phonemic hearing and global speech comprehension tasks.

<sup>1</sup>VOT value of 0 ms reflects an overlapping of burst and vibration onsets.

### Study Aims

In the present study we tested whether children with SLI present the same boundaries for the typical categories of phonetic identification as their healthy peers and whether the application of intervention based on rapid auditory processing may result in improved performance on VOT task.

## MATERIALS AND METHODS

### Participants

Forty-seven right-handed (Edinburgh Inventory) children aged between 5 and 8 years participated in the study. They were classified into two groups: (1) normal children without any language disability (NC, n = 20) and (2) children affected by SLI (n = 27). In the latter group children were randomly assigned using the RITA® software (Pahlke et al., 2004) into two intervention subgroups, i.e., experimental temporal training subgroup (EG, n = 14) and control non-temporal training subgroup (CG, n = 13).

All children were monolingual Polish native speakers. NC were recruited at kindergartens in the area of Warsaw, whereas children with SLI were recruited at either the Early Intervention Centre or the Children's Memorial Health Institute in Warsaw. All participants showed a normal hearing level (ANSI, 2004), which was verified with screening audiometry for 500, 1000, 2000, and 4000 Hz frequencies (audiometer AS 208). These frequencies covered the frequency spectrum of auditory stimuli presented in this study. All children had a normal level of non-verbal intelligence (IQ of at least 85, measured by Raven's Colored Progressive Matrices; Szustrowa and Jaworowska, 2003).

In the case of children with SLI, the developmental language impairment was defined as reduced language competency, evidenced by the Test for Assessment of Global Language Skills (TAGLS; Tarkowski, 2001), a screening assessment tool for language development in Polish children. All participants with SLI obtained an overall standard language score at or below the 4th sten on at least two standard language subtests. The exclusion criteria were neurological and psychiatric disorders, attention deficits or socio-emotional disorders (as determined by the parental report) and participation in any other therapy program during our data collection which might have influenced the obtained results.

All three groups (NC, EG, and CG) were balanced according to age, gender, non-verbal IQ based on either one-way analysis of variance (ANOVAs) (for age and IQ) or Pearson's chi-squared test (for gender). Controlling such variables was important for the efficacy of the applied interventions. It was a blinded randomized controlled study. The detailed subject characteristics are given in **Table 1**.

### Ethics Statement

The study protocol was approved by the Bioethical Commission at the Warsaw Medical University (Permission No. KB/162/2010). Prior to testing a written informed consent was obtained from the parents of each child participating in the study; children provided a verbal approval.

### Stimuli and Experimental Procedures

The study comprised both assessment and intervention procedures. The assessment procedures included three tasks: (1) phonetic identification tested with the VOT Task (Szelag and Szymaszek, 2014), (2) phonemic hearing with the Phoneme Discrimination Test (PDT, Szelag et al., 2015), and (3) TIP with auditory TOT (Fink et al., 2006). These assessment procedures in the SLI group were performed twice, i.e., before (pre-test) and after (post-test) the applied intervention. In the case of NC, the phonetic identification data were collected at the beginning of the study; this group remained without any intervention.

### Assessment Procedures

### **Measurement of phonetic identification**

The voice-onset-time task is a sensitive tool for evaluating phonological deficits on the basis of phonetic identification during speech perception. The task has a built-in millisecond TIP component, which is crucial for differentiating between the voiced and unvoiced category of the initial stop consonant. This was achieved by parametrizing a single acoustic temporal dimension of VOT across the synthesized spectrum of presented stimuli. The stimulus continuum was selected on the basis of our previous studies (Szelag and Szymaszek, 2014).

A series of voiced/unvoiced stimuli were created on a basis of the Polish word /**T**OMEK/ (with unvoiced initial consonant, naturally spoken with a female voice, in English: Tom). In all created stimuli the segment /OMEK/ was spectrally identical. The voiced/unvoiced contrast was achieved by the manipulation (Adobe Audition 3.0) in a single acoustic dimension of VOT within the initial consonant in semi-synthesized word **T**OMEK. The created stimuli differed in VOT values that separated the onset of the stop burst and subsequent voicing, thus, the relationship between the burst and laryngeal pulsing. It created a continuum of VOT values comprising 13 stimuli: −90, −80, −70, −60, −50, −40, −30, −20, −10, 0, +5, +10, and +20 ms. For VOT from −90 to −10 ms the burst was preceded by the laryngeal pulsing (negative VOT), while from +5 to +20 ms the laryngeal pulsing was preceded by a burst (the positive VOT; for explanations of positive vs. negative VOTs see section "Introduction"). Accordingly, in Slavic languages (including Polish) the word of −90 ms VOT is identified as **D**OMEK, whereas, that of 20 ms as **T**OMEK with transition zone (chance level identification) for VOTs of −30 and −20 ms. Illustrative waveforms of two endpoint stimuli of applied VOT continuum are shown in **Figure 1**.

TABLE 1 | The detailed characteristic of the subject pool.

Children were presented with these stimuli binaurally through headphones at a comfortable listening level. The measurement was based on identification of the presented words as either /**T**OMEK/ or /**D**OMEK/. Children were asked to listen to the presented words and to associate each stimulus they heard with one of two pictures. These pictures were presented on one response card (format A4, **Figure 2**). The upper picture displayed a boy named Tom (in Polish: /**T**OMEK/) and the lower picture a house (in Polish: /**D**OMEK/).

In each child the experiment was preceded by an introductory practice session. First, the children were introduced to the above two pictures, hearing examples of /**D**OMEK/ and /**T**OMEK/ with the VOT corresponding to the continuum endpoints (−90 and 20 ms). Next, 16 presentations (8 repetitions of the 2 words from the VOT continuum endpoints, i.e., −90 ms and 20 ms) were presented in random order. After each presentation children were requested to point to the proper picture, and feedback on correctness was given. The introductory practice session ended when four responses in a row were correct. Then, the experiment started without any feedback on correctness.

The measurement comprised 78 presentations (6 series, each consisting of the 13 basic stimuli presented in random order).
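The structure of the measurement (6 series of the 13 VOT stimuli, each series in random order) can be sketched as follows. This is only an illustration: the function name and the optional seed are hypothetical, and the randomization actually used in the experiment may have differed.

```python
import random

# The 13 VOT values (ms) of the stimulus continuum described above.
VOT_CONTINUUM_MS = [-90, -80, -70, -60, -50, -40, -30, -20, -10, 0, 5, 10, 20]

def build_presentation_order(n_series=6, seed=None):
    """Return the VOT value of every presentation: n_series series,
    each containing all 13 stimuli once in an independently shuffled order."""
    rng = random.Random(seed)          # seed is optional, for reproducibility
    order = []
    for _ in range(n_series):
        series = VOT_CONTINUUM_MS[:]   # copy so the continuum itself is untouched
        rng.shuffle(series)
        order.extend(series)
    return order
```

With the default of 6 series the list holds 6 × 13 = 78 presentations, and every VOT value occurs exactly 6 times, so the percent of voiced identifications per VOT value is computed over 6 responses.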

Outcome measures: We analyzed the percent of voiced distinctions within the whole VOT spectrum.

In further analyses the psychometric (sigmoid) function<sup>2</sup> was fitted to all responses given by each child (Treutwein and Strasburger, 1999; Strasburger, 2001). This function was based on the percent of voiced /DOMEK/ identifications for each of the 13 applied VOT values. The sigmoid function was a Z-shaped curve bounded between 0 and 1. A value of 0 for a given VOT stimulus corresponded to the absence of any voiced identification in the child's responses, whereas a value of 1 meant 100% of such identifications. The sigmoid function f(x) was defined by the formula:

$$f(x) = \frac{1}{1 + \exp\left(\frac{x - \alpha}{\beta}\right)}$$

where α reflects the categorical boundary location (the x coordinate of the 'half-way up' point) corresponding to 50% voiced/unvoiced discriminations and β indicates the slope of the identification curve. In such a generalized logistic model, lower categorical precision is reflected by a shallower slope of the function, i.e., the higher the β, the flatter the curve (less categorical perception), corresponding to worse performance. In the above formula, x denotes the VOT value of the presented stimulus and exp the exponential function.


In our data analysis both α and β values were used as indicators of phonetic identification (see section "Results").
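To make the roles of α and β concrete, the following Python sketch evaluates the sigmoid function and estimates both parameters from identification data. The least-squares grid search, the grid limits and all names are assumptions chosen for illustration; the fitting procedure actually applied (cf. Treutwein and Strasburger, 1999) may differ.

```python
import math

def identification_curve(vot_ms, alpha, beta):
    """Predicted proportion of voiced identifications at a given VOT (ms).
    The curve falls from 1 to 0 as VOT increases and equals 0.5 at alpha."""
    return 1.0 / (1.0 + math.exp((vot_ms - alpha) / beta))

# Hypothetical search grids (not taken from the study).
ALPHA_GRID = [a / 2.0 for a in range(-180, 41)]   # -90.0 ... 20.0 ms
BETA_GRID = [b / 2.0 for b in range(1, 81)]       # 0.5 ... 40.0

def fit_alpha_beta(vot_values, voiced_proportions, alphas, betas):
    """Return the (alpha, beta) pair on the grid with the smallest
    sum of squared errors between the curve and the observed proportions."""
    best = None
    for a in alphas:
        for b in betas:
            err = sum((identification_curve(x, a, b) - p) ** 2
                      for x, p in zip(vot_values, voiced_proportions))
            if best is None or err < best[0]:
                best = (err, a, b)
    return best[1], best[2]
```

Calling `fit_alpha_beta(vots, proportions, ALPHA_GRID, BETA_GRID)` on one child's 13 proportions yields one (α, β) pair. A small β gives an abrupt switch from voiced to unvoiced around α, whereas a large β spreads the transition over a wide VOT range.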

<sup>2</sup> In our preliminary data analyses, the sigmoid function seemed to be best fitted to data obtained in this study.

### **Measurements of phonemic hearing and TIP**

As stated before, the efficiency of phonemic hearing was evaluated with the PDT, whereas that of TIP with auditory TOT task. The measurement procedures of PDT and TOT were described in detail in our earlier report (Szelag et al., 2015). In the present study we refer to some data collected previously and published in Szelag et al. (2015; see Tables 2, 3, p. 9). From the subject pool published in this previous report (n = 32) we selected the data of 27 children with SLI (considering EG and CG subgroups) who were tested also with the VOT task in the present study. These previous data were used here to test in EG and CG the correlations between phonemic hearing (or TIP) and β value used as an indicator of phonetic identification in VOT task performance. Below we summarize briefly the measurement procedures of phonemic hearing and TIP.

The Phoneme Discrimination Test comprises 64 word pairs, of which 75% were different, e.g., górnik – kurnik (in English: miner – hen house), and 25% the same, e.g., mama – mama (in English: mother – mother). The task was to judge whether the two words within the presented pair were the same or different. Responses were given by pointing to one of two response cards corresponding to these two situations. The different word pairs differed in one phoneme and contrasted in place of articulation, plosive, fricative, voicing, or nasality.

The TIP measurement was based on the auditory TOT, defined as the minimum time gap between two auditory stimuli presented in rapid succession that is necessary for a participant to report correctly their temporal order, i.e., the relation before-after at 75% correctness. The stimuli were paired 1-ms clicks presented monaurally (i.e., to each ear separately) with various inter-stimulus intervals (ISIs). The task was to report the order of two clicks, thus: left–right or right–left. ISIs varied adaptively from 1 to 600 ms, according to the adaptive maximum-likelihood-based algorithm (Treutwein, 1997) until the TOT was located with a probability of 95% inside a ±5 ms interval around the currently estimated threshold (Treutwein, 1995).

### Intervention Procedures

As mentioned above, in children with SLI two types of interventions were applied, i.e., experimental temporal intervention (in EG subgroup) and control non-temporal intervention (in CG). Detailed description of both intervention programs was provided in Szelag et al. (2015).

### **Experimental temporal intervention**

The experimental temporal intervention procedure used the multimedia intervention program Dr. Neuronowski®. It was designed in our Institute on the basis of our previous prototype interventions addressing TIP (Szelag et al., 2014; Szymaszek et al., 2017). This software consists of nine modules containing 46 basic computer games. The crucial aspect of this software is that the majority of games involved TIP in the millisecond time range, sequencing abilities or duration judgment. Moreover, the temporal-based games were extended by tasks exercising other cognitive functions, i.e., language comprehension, attention, working memory and executive functions. The software was auto-adaptive, i.e., task difficulty changed adaptively based on the correctness of the child's actual performance. Task difficulty was modified according to numerous parameters, i.e., number, length and presentation rate of verbal and non-verbal stimuli, rate of modified speech, various ISIs in stimuli presented sequentially, application of various distractors, and time limitations for responses.

### **Non-temporal control intervention**

Non-temporal control intervention was based on freely available computer games (e.g., Memory or Tetris), as well as on educational speech-therapy exercises. The combination of such software trained phonemic hearing, articulation and vocabulary, as well as attention, working memory and executive functions. Contrary to the experimental temporal intervention, none of these tasks engaged any rapid auditory processing in the millisecond time range.

#### Study Protocol

The assessment procedures and the intervention programs (**Figure 3**) were conducted with each child individually. The intervention (experimental and control) consisted of 24 sessions of 1-h each, performed 3 times per week.

### Statistical Analyses

Statistical analyses comprised four steps. They included: (1) comparison of phonetic identification between NC and SLI children, (2) training-related differences separately in EG and CG, (3) post-test performance in EG and CG in comparison to NC, and (4) relationships in EG and CG between the phonetic identification indexed with the function slope (β) and results of phonemic hearing and TIP in pre- and in post-test assessment, separately. In Steps 1–3 (**Figure 4**) we analyzed both α and β<sup>3</sup> values using the Mann–Whitney U test (Step 1), the Wilcoxon signed-rank test (Step 2) and the Kruskal–Wallis one-way ANOVA followed by Mann–Whitney U tests (Step 3).

Additionally, in Step 1 these comparisons were extended by 2-way ANOVA with the percent of voiced responses for the whole spectrum of VOT values in NC and SLI. This ANOVA aimed at testing the differences in phonetic identification for particular VOT stimuli within the voiced, unvoiced and transition categories<sup>4</sup>. It included 'Group' (NC vs. SLI) as between-subject factor and 'VOT value' (13 values: −90, −80, −70, −60, −50, −40, −30, −20, −10, 0, +5, +10, and +20 ms) as a within-subject factor<sup>5</sup>. In SLI group only pre-test (summed EG and CG) data were considered.

In Step 4, using Spearman correlations, we tested the relationships between phonetic identification indexed with β values and other cognitive skills reflected by phonemic hearing and TIP in EG and CG subgroups in pre- and in post-test assessment, separately. Referring to Szelag et al. (2015), TOT was the indicator of TIP efficiency and PDT of phonemic hearing.
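The rank-based statistics named in these steps can be illustrated with a short, dependency-free Python sketch. It computes only the test statistics, not p-values, and the function names are hypothetical; reporting the smaller of the two U values is a common convention assumed here, and the statistical package actually used is not stated in the text.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic (smaller of the two U values) for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

In this setting, `mann_whitney_u` would take the β (or α) values of two groups, e.g., NC and SLI, and `spearman_rho` the β values and TOT (or PDT) scores of one subgroup. Because both functions operate on ranks, they do not require normally distributed α and β values.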

## RESULTS

### Phonetic Identification in NC and SLI

Between-group differences for the boundary location (assessed with α, **Figure 5**) were non-significant (U = 244, p = 0.59). The obtained results indicated similar boundaries for voiced/unvoiced distinctions in NC (α = −24.5) and SLI (α = −22.2).

Significant between-group differences for the slope of the identification function (assessed with β, **Figure 5**) were found (U = 96, p = 0.0002). The NC group presented a steeper slope (β = 3.62) than the SLI group (β = 26.3), corresponding to better performance in the former group.

Analysis of variance with the voiced identification scores for the VOT spectrum revealed a significant effect of 'VOT value' [F(12, 540) = 271.593, p < 0.001, η<sup>2</sup> = 0.858] modified by the interaction 'VOT value' × 'Group' [F(12, 540) = 9.313, p < 0.001, η<sup>2</sup> = 0.171]. In both groups the results were patterned by two phonetic categories (**Figure 6**). The voiced category comprised the VOT values ranging from −90 to −40 ms, whereas the unvoiced one ranged from −10 to 20 ms, independently of the group. The ranges of these two categories were established on the basis of a significant jump in the identification score between −40 and −30 ms (p < 0.001 in both groups) indicating the voiced category and between −20 and −10 ms (p < 0.001 in both groups) for the unvoiced category. Despite these similarities in both groups, better performance within each category was observed in NC than in SLI ('VOT value' × 'Group' interaction).

<sup>3</sup>α and β exponents did not have a normal distribution.

<sup>4</sup>Polish-speaking children perceived voiced category at VOT: −90, −80, −70, −60, −50, −40 ms, unvoiced at: −10, 0, +5, +10, +20 ms with the transition zone (a chance level performance) located for −30 and −20 ms. These categories were established on the basis of literature (Rojczyk, 2010) and our previous studies (Szelag and Szymaszek, 2014; Oron et al., 2015) as well as were confirmed in the present study (ANOVA, see below).

<sup>5</sup>The data presented normal distribution.

### The Effect of Applied Experimental vs. Control Intervention

The effect of training on the boundary location (**Figure 7**) was non-significant in EG (Z = 1.10; p = 0.27; αpre−test = −21.80 and αpost−test = −19.93) and in CG (Z = 0.73; p = 0.47; αpre−test = −22.63 and αpost−test = −24.34). Thus, the categorical boundary location remained relatively stable following each type of intervention.

The training-related changes in the slope of the identification function indicated significantly lower β in EG in post- than in pre-test (Z = 2.73; p = 0.007; βpre−test = 29.59, βpost−test = 6.72), corresponding to improved performance (**Figure 7**). In contrast, β in CG did not differ significantly between post- and pre-test (Z = 0.52; p = 0.60; βpre−test = 22.77, βpost−test = 19.80), indicating a similar level of performance in pre- and post-test.

### Post-Test Performance in EG and CG in Comparison to NC

Between-group comparisons for the boundary location (**Figure 8**) were non-significant (H = 3.5, p = 0.18; αEG = −19.93, αCG = −24.34, αNC = −24.54).

Significant between-group differences in the slope of the identification function were observed (H = 7.61, p = 0.03), specifically between CG and NC (U = 62, p = 0.02; βCG = 19.80, βNC = 3.62). In contrast, the difference between EG and NC was non-significant (U = 139, p = 0.99; βEG = 6.72, βNC = 3.62). These results indicated that EG in post-test reached the level of NC performance, whereas the post-test performance in CG remained significantly below that of NC.

### Relationships Between Phonetic Identification and Other Cognitive Skills

In pre-test the β values did not correlate significantly with TOT (r = 0.23, p = 0.43 for EG and r = −0.09, p = 0.78 for CG) and PDT (r = 0.48, p = 0.09 for EG and r = 0.22, p = 0.47 for CG) in any subgroup. In contrast, in post-test only in EG significant correlations were found between β and TOT (r = 0.58, p = 0.03), as well as between β and PDT scores (r = 0.56, p = 0.04). For CG correlations between β and TOT (r = 0.25, p = 0.42), as well as between β and PDT scores (r = 0.15, p = 0.62) were non-significant.

### DISCUSSION

Considering the aims of our study, the discussion of obtained results is focussed, firstly, on the differences in phonetic identification between children with SLI and healthy peers, followed by the training-related changes in such identification. Finally, we concentrate on relationships between the level of phonetic identification assessed with the VOT test and results of TIP and phonemic hearing obtained in our previous study (Szelag et al., 2015).

### Voicing Contrast Detection in Children With SLI on the Background of Healthy Controls

Although some studies in the literature concern phonetic identification in children with SLI, the boundaries of categorical perception of voiced/unvoiced detection in such children have rarely been studied. Therefore, an important result of the present study was the non-significant difference in the boundary location (α) between children with SLI and NC (**Figure 5**). Moreover, similar ranges of categories for voiced/unvoiced detection were distinguished in both groups. The voiced category was identified at VOT values from −90 to −40 ms, and the unvoiced category from −10 to 20 ms (ANOVA, **Figure 6**), which is congruent with previous reports in Polish subjects (Rojczyk, 2010). Despite these similarities, the deficient voiced contrast detection reported in this study was reflected in significantly lower correctness within these categories in children with SLI than in NC (**Figures 5**, **6**). This was evidenced in the statistical analyses (ANOVA) as well as in a shallower slope of the identification function (higher β values) in children with SLI, which corresponds to worse performance.

As mentioned above, speech perception difficulties have been frequently reported in SLI in literature studies. For example, Ziegler et al. (2005, 2011) investigated the perception of speech in noise considering various features, such as: voicing, manner and place of articulation. Although the perception of all these features was impaired in children with SLI, as compared to age-matched healthy controls, the voicing was impaired to a greater extent. It was interpreted as the strongest deficit compared to other features of speech perception. The deficient voiced categorical perception may reflect the deteriorated millisecond TIP which is incorporated in such voiced/unvoiced categorization (Benasich and Tallal, 2002; Fitch and Tallal, 2003; Szelag et al., 2015).

As indicated before, particular languages are characterized by specific boundaries for voiced detection (see section "Introduction"). Nevertheless, literature studies provided evidence that infants who experienced any linguistic environment are sensitive to some universal boundaries of phonetic identification, which were located at VOT values between −30 and 30 ms (Lasky et al., 1975; Streeter, 1976). It was concluded that infants have a specialized biological predisposition to discriminate a universal set of phonetic contrasts (Eimas, 1991). The process of language acquisition during child development involves reorganization of this universal sensitivity under the influence of specific environmental conditions. Based on this view, infants at around 6 months of life transfer from a language-general to a language-specific mode of speech perception with phonetic boundaries typical for the experienced language. This was evidenced for Spanish (Lasky et al., 1975) as well as for French (Hoonhorst et al., 2009). One may expect that skilled functioning in the range of some tens of milliseconds is crucial for language development from the early years of life. In the VOT task, as in the TIP task, it is necessary to perceive effectively two sequential events (e.g., burst–vibration or two sounds) separated in time by some tens of milliseconds. This statement is supported by literature evidence on a shared neural network for rapid auditory processing and speech processing (e.g., Zaehle et al., 2004).

FIGURE 6 | The voiced identification score for the presented VOT spectrum in NC and SLI indicating the similar ranges of voiced/unvoiced categories in both groups with poorer performance in SLI than in NC.

A number of studies have revealed deficient categorical contrast detection in children with SLI. For example, Sussman (1993) compared the performance of VOT in syllables, using the discrimination and identification methods. During discrimination measurement the performance in children with SLI was comparable to that reported in healthy controls. In contrast, in identification they were significantly less accurate. The impaired phonetic identification in the VOT continuum in children with SLI was further confirmed by Gerrits and de Bree (2009). Our results are congruent with these literature reports (**Figures 5**, **6**). Although several studies indicated lower accuracy of phonetic identification in children with SLI, our important finding was the observation that categorical boundaries for voiced/unvoiced identification were still preserved and remained comparable to those observed in healthy controls. What should be noted at this point is the similar boundary location reflected in α (**Figure 5**) as well as the typical voiced/unvoiced category ranges (**Figure 6**) in 5–8-year-old normally developing children and in children with SLI. The indication of the typical temporal framework for phonetic identification in SLI seems to be promising in the context of speech neurorehabilitation.

### Training-Related Changes in Voiced Contrast Detection

Despite existing literature controversies on the contribution of deficient TIP and declined temporal precision to deficient speech perception (Fink et al., 2006; Vander Werff and Burns, 2011; Parbery-Clark et al., 2012), our results confirmed that training in TIP improved the phonetic identification measured by the VOT task. As stressed before, although the boundaries of categorical voiced distinctions in children with SLI were the same as in healthy peers, the correctness of performance in SLI was significantly lower (**Figures 5**, **6**). Only in EG after temporal intervention we observed significant improvement reflected in lowered β values (βpre−test = 29.59 vs. βpost−test = 6.72). On the contrary, in CG after control intervention no significant differences were revealed (**Figure 7**).

Furthermore, it should be stressed that only following temporal intervention, the children with SLI reached the level of performance comparable to that observed in NC (non-significant differences between EG and NC). In contrast, following non-temporal intervention (in CG), the level of performance still remained below that of NC (significant differences between CG and NC, **Figure 8**).

Previous studies indicated beneficial effects of various interventions based on rapid auditory processing which resulted in amelioration of speech reception evidenced, e.g., by phonemic hearing in children with SLI (e.g., Tallal et al., 1996; Szelag et al., 2015). However, in the present study we confirmed that increased temporal precision in auditory processing resulted in more effective phonetic identification which seems to be more complex and nuanced than simple correct/incorrect phoneme differentiation measured with phonemic hearing tests.

In our previous studies, the effectiveness of intervention based on TIP was investigated in aphasic patients using the prototype version of Dr. Neuronowski® software (Szelag et al., 2014; Szymaszek et al., 2017). Patients were trained in sequencing two sounds presented in rapid succession. This resulted in significant improvement of both TIP and speech reception (evidenced in phonemic hearing, global speech comprehension and VOT tests).

According to some theories on SLI, these children present impaired working memory and selective attention, which affect the phonological processes (Bishop et al., 1999; McArthur and Bishop, 2001; Vandewalle et al., 2012). Although the two intervention programs applied here (temporal vs. control) contained exercises focused on cognitive functions, like working memory, attention or executive functions, only the temporal intervention (addressing TIP) caused the improvement in speech perception, i.e., phonetic identification (studied here), as well as phonemic hearing and global speech comprehension (Szelag et al., 2015). Thus, one may emphasize that an efficient temporal framework is fundamental for broader aspects of speech perception, but the training in working memory and attention incorporated in both applied interventions was not sufficient to improve speech perception skills in children with SLI.

For better understanding training-related benefits reported here in the context of our previous reports (Szelag et al., 2015), we conducted correlation analyses between phonetic identification indexed with β and phonemic hearing (indexed with the percent of errors) or TIP (TOT in millisecond time range).

### Correlations Between Phonetic Identification, Phonemic Hearing and TIP

Literature evidence indicated that some aspects of speech reception, i.e., both phonetic identification and phonemic hearing are rooted in millisecond temporal frame (Pöppel, 1997, 2009). In that context, we investigated whether the TOT values and the effectiveness of phonemic hearing tested in our previous study (Szelag et al., 2015) correlated with the β values obtained here, considered as the indicator of voiced contrast identification efficiency.

Only in EG in post-test did both these measures (TOT and phonemic hearing) correlate significantly with the β values. Thus, lower β (better contrast detection) was accompanied by better phonemic hearing and lower TOT (better TIP performance). This may suggest the existence of a neural mechanism underlying speech perception rooted in TIP, which was improved during temporal intervention. The application of such exercises may result in a transfer of improvement from the trained non-verbal timing processing into the untrained verbal processing, i.e., some aspects of speech perception in which the temporal component is built-in. Such transfer of improvement was documented in amelioration of both phonemic hearing and phonetic identification. One may hypothesize that following the temporal training the improved temporal acuity resulted in more coherent processing of both speech and non-speech stimuli. In pre-test, such correlations were non-significant in both groups, probably due to more variable and less precise responses. In CG the applied intervention did not influence TIP; thus, the persisting deficit in the millisecond time frame resulted in a non-significant relation between timing and speech perception.

### Final Remarks

The obtained results revealed that children with SLI, despite lower correctness in phonetic identification, show the same boundary location for categorical voicing contrast detection as their healthy peers. Temporal intervention in children with SLI resulted in significant improvement of phonetic identification as compared to non-temporal control intervention. In CG, intervention based on cognitive functions such as working memory, attention and executive functions, extended by typical speech therapy exercises (non-temporal control intervention), did not benefit speech perception assessed by phonetic identification.

### AUTHOR CONTRIBUTIONS

AS: data acquisition, conducting therapy sessions, analysis and interpretation of data, and manuscript writing. AD: subject recruitment, data acquisition, conducting therapy sessions, analysis and interpretation of data, and contribution to manuscript writing. PU: analysis of psychometric function. ES: conceptualization and study design, analysis and interpretation of data, and manuscript writing. All the authors: final approval.

### FUNDING

The research was supported by the INNOTECH-K1/IN1/30/159041/NCBR/12 grant from the National Centre for Research and Development, Poland.

### ACKNOWLEDGMENTS

The authors would like to thank Anna Bombinska for her technical assistance during the data collection.

### REFERENCES

order judgement of two consonants. Brain Lang. 80, 576–591. doi: 10.1006/brln.2001.2618

Treutwein, B. (1995). Adaptive psychophysical procedures. Vis. Res. 35, 2503–2522. doi: 10.1016/0042-6989(95)00016-X

Treutwein, B. (1997). YAAP: yet another adaptive procedure. Spat. Vis. 11, 129–134.


**Conflict of Interest Statement:** ES and AS are the creators of the software package Dr. Neuronowski®, realized as a part of a project at the Nencki Institute with funding from the National Centre for Research and Development in Poland. The rights to the software lie with the Nencki Institute that has an agreement with Harpo Ltd., the company commercializing this software. ES and AS are not the owners of this technology nor do they have a direct financial arrangement with Harpo Ltd. The authors state that it does not affect the scientific validity of the results.

The remaining authors state that the research was realized in the absence of any commercial or financial relationships that could generate any potential conflict of interest.

Copyright © 2018 Szymaszek, Dacewicz, Urban and Szelag. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Disturbed Experience of Time in Depression—Evidence from Content Analysis

David H. V. Vogel<sup>1</sup>\*, Katharina Krämer<sup>1</sup>, Theresa Schoofs<sup>1</sup>, Christian Kupke<sup>2</sup> and Kai Vogeley<sup>1,3</sup>

<sup>1</sup> Department of Psychiatry, University of Cologne, Cologne, Germany, <sup>2</sup> Department of Psychiatry, Society for Philosophy and Sciences of the Psyche, Charité - Universitätsmedizin Berlin, Humboldt - Universität zu Berlin, Berlin, Germany, <sup>3</sup> Institute for Neuroscience and Medicine, Cognitive Neuroscience (INM-3), Research Centre Juelich, Juelich, Germany

Disturbances in the experience of time have been a commonly reported feature of depressive disorders since the beginning of modern psychiatry and psychological research. However, qualitative research approaches to investigate the phenomenon are rarely used. We employed content analysis to investigate disturbances of time experience in Major Depressive Disorder. Our analysis of data from 25 participants showed that individuals with Major Depressive Disorder subjectively seem to have lost the ability to influence or change the present, resulting in an impersonal and blocked future. The present is rendered meaningless, the past unchangeably negative, and the passage of time turned into a dragging, inexorable, and viscous continuance. The overall, possibly intersubjective, concept of time experience remains largely intact, causing or adding to depressive mood and suffering. We elaborate on how these findings reflect previous theories on the experience of time in depression. This study might encourage future inquiries into both the phenomenal and neuroscientific foundation of time experience under psychopathological conditions.

#### Edited by:

Giovanna Mioni, Università degli Studi di Padova, Italy

#### Reviewed by:

Gianluca Serafini, Department of Neuroscience, San Martino Hospital, University of Genoa, Italy

Domenico De Berardis, Azienda Usl Teramo, Italy

#### \*Correspondence:

David H. V. Vogel david.vogel@uk-koeln.de

Received: 27 November 2017 Accepted: 05 February 2018 Published: 20 February 2018

#### Citation:

Vogel DHV, Krämer K, Schoofs T, Kupke C and Vogeley K (2018) Disturbed Experience of Time in Depression—Evidence from Content Analysis. Front. Hum. Neurosci. 12:66. doi: 10.3389/fnhum.2018.00066

Keywords: depression, time experience, temporality, phenomenological psychopathology, content analysis

## INTRODUCTION

Temporal dysfunctions are a commonly reported feature of Major Depressive Disorder (MDD). A variety of experimental approaches has been employed in order to assess and investigate these disturbances, with varying conclusions. For instance, research exploring human time perception has yielded inconsistent results concerning its potential disturbance in MDD (Thönes and Oberfeld, 2015). One of the more robust results from this line of research is the observation of a subjective decrease in the experienced velocity of the flow of time in depressive disorders. This decrease is possibly the most prominent and most often reported disturbance of time experience in MDD in the literature. It was either obtained from patients' qualitative descriptions (Lewis, 1932; Hartocollis, 1975; Wyrick and Wyrick, 1977; Kuhs, 1991; Mundt et al., 1998; Bschor et al., 2004; Stanghellini et al., 2016, 2017) or by making additional use of questionnaires or subjective quantitative measures (Lehmann, 1967; Bech, 1975; Kitamura and Kumar, 1982; Richter and Benzenhoefer, 1985; Blewet, 1992).

In this context, phenomenological psychopathology offers a unique opportunity. Structural psychopathology investigating MDD has aptly identified disturbances in time experience as the basis underlying the disorder and linking its symptoms, ranging from impairments in neurocognitive and psychomotor functions, through changes in affect and mood, to depressive delusions (Stanghellini et al., 2017). The emerging syndrome has adequately been referred to as a disturbance of lived time (Fuchs, 2001, 2013, 2014; Broome, 2005; Kupke, 2005; Wyllie, 2005; Gallagher, 2012; Moskalewicz, 2015; Bloc et al., 2016; Stanghellini et al., 2016); the slowing down of the subjective experience of time has been conceptualized to be a part of this syndrome (Wyllie, 2005; Fuchs, 2013, 2014; Stanghellini et al., 2017). The investigation of such an underlying principle or mechanism is of special significance, as current insights increasingly emphasize the importance of recognizing depressive symptoms apart from affect and mood for diagnosis, therapy, and prognosis (Gonda et al., 2015). Despite the wealth of theoretical accounts on time experience, there has been only limited empirical research on the construct of lived time (Stanghellini et al., 2016, 2017). To the best of our knowledge, we present the first prospective empirical study employing qualitative content analysis as an empirical tool for phenomenological psychopathology to investigate disturbances of time experience in MDD. We will compare our results to those from a recent study on healthy individuals using the same method (Vogel et al., in revision).

### MATERIALS AND METHODS

### Participants

Participants were recruited from patients admitted for inpatient treatment to the Department of Psychiatry at the University Hospital Cologne. Patients were considered for inclusion in this study if the clinical diagnosis of a Severe Depressive Episode (ICD-10, F32.2) (World Health Organization, 1992) had been established after admission. Patients were only considered for inclusion if they had no record of comorbid neurological or psychiatric disease and if no neurological or psychiatric comorbidity was detectable in the clinical diagnostic procedures. Comorbidities included organic brain disease, mental retardation (IQ < 70), manic or hypomanic episodes, bipolar disorder, psychosis or schizophrenia, personality disorders, and addiction. Patients eligible for inclusion were clinically rescreened approximately 2 weeks after admission by the principal investigators qualified for making a clinical diagnosis (DHVV, KV) for meeting both the diagnostic criteria for a Severe Depressive Episode (ICD-10, F32.2) (World Health Organization, 1992) and the diagnostic criteria for non-chronic MDD as defined by the DSM-5 (American Psychiatric Association, 2013). Patients were deemed eligible regardless of the number of prior depressive episodes; however, patients diagnosed with chronic or persistent depressive disorder or dysthymia were not considered for inclusion. Patients meeting these inclusion criteria were provided with the study material (see below). Patient screening and subsequent study inclusion were conducted from January 2015 until August 2017. None of the patients included were taking any neuropsychiatric or otherwise psychoactive or illegal drug not explicitly prescribed for anti-depressive treatment over the period of investigation. All participants received pharmacological anti-depressive treatment according to the S3 guidelines by the German Association for Psychiatry, Psychotherapy and Psychosomatics (Härter et al., 2017) (see **Table 1**). Two participants received electroconvulsive therapy. Participants were included in a post-acute state, only after acute treatment interventions (e.g., treatment with benzodiazepines) had been successfully terminated. All participants underwent additional cognitive-behavioral therapy in group settings over the period of investigation.

We administered the Beck-Depression-Inventory (BDI-II) (Hautzinger et al., 1995) and a verbal IQ measure (WST) (Schmidt and Metzler, 1992). We used BDI-Scores to approximate depressive symptom severity and used WST-Scores to guarantee verbal speech comprehension, proper production of written material, and sufficient intellectual capability to cope with the complex subject of time experience. Demographics and results from BDI-II and WST are presented in **Table 2**.

### The Time Questionnaire

The Time Questionnaire (TQ) (**Figure 1**) was specifically designed to address as many aspects of the experience of time as possible making use of open questions, and it has been successfully tested in a previous study on the experience of time in healthy participants (Vogel et al., in revision).

Questions (Q) 1–3 address the flow of time, with Q1 being as broad and open as possible and Q2–3 inquiring further into the dependency on interpersonal and situational context. Q4–7 are designed to address experience, concepts, and subjective meaning of the past, present, and future. Additional space is provided for participants to address anything they found relevant concerning the experience of time that could, in their opinion, not be assigned to a specific question. Participants were instructed to write as much or as little as they found sufficient. There were no time limits for completion.

## TABLE 1 | Prescribed daily medication and weekly Electro Convulsive Therapy (ECT).

All returning documents were analyzed using summarizing content analysis (acc. to Mayring, 2014) (SCA). Superfluous examples and redundancies were deleted from the material. The remaining text was paraphrased into common language. The smallest possible part of the material to be categorized and coded was defined as a single statement within the material, ranging from a single word to multiple sentences. The largest possible part of the material that could form a coded category was defined as everything addressing one question. From the paraphrased material individual statements were identified. For each participant descriptive categories were generated for each single paraphrased statement. After all paraphrases had been assigned a category, the material was re-screened, and categories of similar or identical content were merged. This was repeated until a coherent categorical system had been established.

To evaluate the reliability of the final categorical system, it was presented to two independent inter-coders (K.K., T.S.). The inter-coders were provided with an explication, a coding rule, an anchor example, and the code for each of the categories, together with the uncoded paraphrased material (**Table 3**). The inter-coders assigned each paraphrased statement to one category by employing the explication and coding rules. Inter-coder agreement was calculated by comparing the codes from the initial analysis with those assigned by both inter-coders, using Krippendorff's alpha (Krippendorff, 2004; Hayes and Krippendorff, 2007).
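The agreement statistic named above can be illustrated with a minimal sketch of Krippendorff's alpha for nominal codes, computed from a coincidence matrix. The function name `krippendorff_alpha_nominal` and the toy codes in the usage example are hypothetical and are not the study's data or software; the sketch only shows how observed disagreement is related to the disagreement expected by chance.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: one list per coded statement, holding the category codes that the
    coders assigned to it (missing codes are simply left out).
    """
    # Coincidence matrix: each ordered pair of codes from different coders
    # within a unit contributes 1 / (m - 1), where m is the number of codes.
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # a unit coded only once carries no agreement information
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category and the overall number of pairable codes.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one category in use: no disagreement is possible
    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical codes from three coders for four statements.
    toy_units = [["K01", "K01", "K01"], ["K11", "K11", "K10"],
                 ["K17", "K17", "K17"], ["K25", "K25", "K25"]]
    print(round(krippendorff_alpha_nominal(toy_units), 3))
```

A value of 1 indicates perfect agreement and a value of 0 indicates agreement no better than chance, so values close to 1 are read as strong inter-coder agreement.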


### RESULTS

We identified 235 statements in total. We identified and explicated 30 categories which could be assigned to five different groups roughly corresponding to the initial formulation of questions. The first group consisted of eleven categories and addressed the passage of time (87 statements, app. 37% of all statements, categories K01-11). The second group consisted of nine categories concerning the present (64 statements, app. 27% of all statements, categories K12-20). The third group consisted of four categories and addressed the past (38 statements, app. 16% of all statements, categories K21-24). The fourth group, addressing the future, consisted of five categories (44 statements, app. 19% of all statements, categories K25-29). The last group was a single category labeled Rest (2 statements, app. 1% of all statements, category K30). This group coded for all statements which did not address the experience of time or were too unspecific or too divergent from the explications of the other categories. The categorical system for the analyzed material is depicted in **Table 3**. The number of counts per category is depicted in **Figure 2**. Calculation of Krippendorff's alpha yielded a strong inter-coder agreement of α = 0.93.
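As a quick arithmetic check, the group shares quoted above can be recomputed from the reported statement counts. The following sketch uses only the counts given in the text; the names `GROUP_COUNTS` and `shares` are illustrative.

```python
# Statement counts per category group, as reported in the text.
GROUP_COUNTS = {
    "Passage of Time": 87,
    "Present": 64,
    "Past": 38,
    "Future": 44,
    "Rest": 2,
}


def shares(counts):
    """Return each group's share of all statements, rounded to whole percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


if __name__ == "__main__":
    print("total statements:", sum(GROUP_COUNTS.values()))
    for name, pct in shares(GROUP_COUNTS).items():
        print(f"{name}: approx. {pct}%")
```

The group counts add up to the 235 statements reported, and the rounded shares agree with the approximate percentages given for each group.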

### Category Group "Passage of Time"

The majority of categories related to the Passage of Time concerned experiences of a changed velocity of passing time (categories 01, 02, 03, 04, 05, 10, 11; in total: n = 58 statements, app. 24% of all statements). Five categories related to changes in situational or social contexts (categories 01, 02, 03, 04, 05; in total: n = 36 statements, app. 15% of all statements). Most of these categories specified an acceleration, i.e., an increase in the velocity of the passage of time, during activities (category 01, n = 9 statements, 36% of participants), during engagement in pleasant situations (category 02, n = 10 statements, 40% of participants), or during engagement in interaction with others (category 05, n = 10 statements, 40% of participants). Fewer participants reported a deceleration, i.e., a decrease in the experienced velocity of the passage of time, namely during unpleasant situations (category 03, n = 4 statements, 16% of participants), or during the presence of others (category 04, n = 3 statements, 12% of participants). Two categories referred to general changes in the velocity of time irrespective of whether this experience was related to an activity (categories 10, 11; n = 22 statements, app. 9% of all statements), associated either with an acceleration (category 10, n = 7 statements, 28% of participants) or a deceleration (category 11, n = 15 statements, 60% of participants). In the acceleration group, 5/7 participants stated this to be generally the case, and one found it to be distinctive for his/her depressive episode. In the deceleration group, 8/15 participants stated a general deceleration, and seven found it to be specific to their depressive episode. Most participants reported either an increase or a decrease in velocity corresponding to the onset of the current depressive episode; however, two participants gave statements revealing both an increase and a decrease since exacerbation, with one stating the increase to occur in retrospect.

FIGURE 1 | The Time Questionnaire.

#### TABLE 3 | The Categorical System.

Three categories described changes in the passage of time unrelated or not specifically related to velocity (categories 06, 08, 09; in total: n = 20 statements, app. 8.5% of all statements). Participants reported a normalization or improvement of the experience of the passage of time while being engaged in social or pleasant activities (category 06, n = 6 statements, 24% of participants). A similar number of participants reported an inability to feel, perceive or give meaning to the passage of time (category 08, n = 5 statements, 20% of participants). Relatedly, participants further described the passage of time as inexorable, endless, circular or at a standstill (category 09, n = 9 statements, 36% of participants). Additionally, and in contrast to various statements mentioned before, several depressed participants reported the absence of situational influence on the passage of time (category 07, n = 9 statements, 36% of participants).

### Category Group "Present"

Concerning the Experience of the Present, several categories described the present as enduring (categories 12, 15, 17; in total n = 19 statements, 8% of all statements). Several participants stated that the present was experienced as extended (category 17, n = 10 statements, 40% of participants). Correspondingly, participants gave account of the present being a current activity (category 15, n = 6 statements, 24% of participants) or being contained within 1 day (category 12, n = 3 statements, 12% of participants). Concordantly, participants described the present as taking place as a conscious "now" or within a current moment (category 13, n = 7 statements, 28% of participants). Strikingly, a considerable number of participants experienced the present as a point in time instead of an extended duration (category 16, n = 10 statements, 40% of participants). One category of statements concluded that the present could be influenced (category 19, n = 7 statements, 28% of participants); however, the same number of participants judged the present to be out of their own control (category 18, n = 7 statements, 28% of participants). Nine participants judged the present to be related to negative emotions (category 14, n = 9 statements, 36% of participants), and five judged it to be meaningless (category 20, n = 5 statements, 20% of participants).

### Category Group "Past"

Within the group concerning the Experience of the Past participants stated that the past had a negative effect on the present and/or was dominated by feelings of guilt (category 21, n = 15 statements, 60% of participants). Additionally, participants experienced the past as over and unchangeable (category 22, n = 10 statements, 40% of participants) and that the past had to be accepted (category 23, n = 3 statements, 12% of participants). Ten participants stated that the past had influenced and shaped their personality or that it could be learned from (category 24, n = 10 statements, 40% of participants).

### Category Group "Future"

Within the group Experience of the Future statements describing the future as being related to negative emotions such as fear, worry, and hopelessness were identified most frequently (category 25, n = 10 statements, 40% of participants). Surprisingly, the same number of participants referred to the future as being related to thoughts of hope (category 29, n = 10 statements, 40% of participants). Slightly fewer statements referred to a diminished, blocked future, meaningless, and no longer part of subjective experience (category 26, n = 8 statements, 32% of participants), or uncertain, non-projectable, and out of one's own control (category 27, n = 7 statements, 28% of participants).

### DISCUSSION

The results obtained from content analysis confirmed the very few retrospective studies (Stanghellini et al., 2016, 2017), case reports and theoretical accounts of a disturbance of the experience of time in MDD. Whereas the overall nature of a fluent and variable passage of time seemed to be intact, with acceleration in pleasant and deceleration in unpleasant situations (Vogel et al., in revision), depressed patients showed general changes in their experience of the passage of time. The most common was that of time having slowed down, the most reliably found experiential disturbance in MDD within the literature (Thönes and Oberfeld, 2015).

Seemingly in contradiction, we also found statements on an increase in the experienced velocity of the passage of time. Although we cannot completely rule out hypomanic or manic experiences, an involvement of corresponding symptoms seems highly unlikely, as pre-inclusion diagnostic procedures were conducted rigorously according to clinical guidelines, and all participants showed clear depressive symptoms in a clinical range as additionally assessed by BDI. We therefore speculate that this apparent increase may be due to one or more of three conceivable associations. First, there is the possibility of a retrospective judgement of the duration of the past course of the depressive episode, a phenomenon found within the data and describable by the so-called time paradox (Wittmann, 2016, also see Straus, 1928). According to this phenomenon, uneventful durations (in this case the depressive episode) are retrospectively judged to have been shorter than real time, or the contained time seems to have gone by more quickly than real time.

Second, clinical improvement of participants after having started treatment may explain statements of an increase in the overall speed of the flow of time. The ostensible discrepancy in patient reports may in part have been influenced by the heterogeneous treatment regimen found in our patient group. Furthermore, it has recently been shown that the experience of time passing quickly can be found in healthy individuals (Vogel et al., in revision). In other words, the increase in the velocity of the passage of time would not appear to be a feature distinct to MDD but, in our case, more closely reflected decreasing symptom severity.

Third, there may be a discrepancy between the description of internal time/ego-time and external time/world-time (Straus, 1928; Lehmann, 1967; Minkowski, 1971) or intersubjective time (Fuchs, 2013), with some subjects describing their own passage of time as slowed down, and fewer subjects describing the external world time as moving faster and passing by. Concordantly, our finding of both an increase and a decrease in the velocity of time in MDD may effectively describe the same phenomenon from two different points of view (internal vs. external). As a speculation, this finding may describe two psychopathological subgroups of MDD, with one group comparing external time to internal time (e.g., "everyone is passing me by"), causing the velocity of the flow of time to be described as increased, and the other comparing internal time to external time (e.g., "I am slower than everyone else"), causing the velocity of the flow of time to be described as decreased. In a simpler line of reasoning, the differing reports on the velocity of the flow of time may psychopathologically delineate MDD subgroups, with decreasing velocity being the well-known, more prevalent phenomenon and an increase in velocity posing a newly described symptom, specific to a subordinate division of depressive disorder.

When focusing on the aspect of context-dependency of the passage of time we find a discrepancy between the number of statements stating that the passage of time will speed up when engaged in a pleasant activity or situation and that it will slow down when engaged in an unpleasant activity or situation. Although our method is not entirely suitable for quantitative evaluation, it relates to a somewhat similar finding, which has very recently been reported in the judgement of the velocity of the experience of time in an Experience Sampling Method (Dupuy, 2017). The authors were unable to find a significant reduction in the subjective speed of time in unpleasant situations in participants with MDD. We argue that it seems likely that an individual in a depressed state will not report any further deceleration in the velocity of the passage of time in an unpleasant situation due to the overarching unpleasantness of the depressed state and the related deceleration of the passage of time.

Regarding intersubjective and social aspects of time experience our results are heterogeneous. We speculate that an increase in the velocity of the passage of time in the presence of others, just as the reported normalization of its experience relative to situation, reflects that social interactions are usually experienced as pleasant and engaging, therefore related to an improvement in affect or mood, which in turn causes the passage of time to be experienced as moving faster (Vogel et al., in revision). It is important to note that our method of qualitative inquiry into inner experience is insufficient to provide proof of implicit interpersonal aspects of time experience and temporal desynchronization, which have previously been convincingly conceptualized as a possible underlying mechanism of disturbances in time experience in depressive disorders (Fuchs, 2001, 2005, 2013; Ratcliffe, 2012).

Concerning what has been referred to as the experience of the structure of time (i.e., past, present, and future) (Kupke, 2009; Vogel et al., in revision), we find, similarly to the passage of time, an overall conserved nature of the experience of time structure, with the notion of an experience of time advancing from the past, through the present, into the future. As previously observed in healthy individuals (Vogel et al., in revision), and as opposed to persons suffering from schizophrenia (Vogeley and Kupke, 2006; Stanghellini et al., 2016), participants with MDD seem to retain this structured directedness of time from the past to the future. In this context, both depressed and healthy individuals experience their lives in the present, where it may be possible to influence the future through a present activity drawing from previously acquired knowledge and past experiences. However, several distinct features arise from our analysis. It can be concluded that depressed participants experience a diminished ability to influence the present and the future. It seems as if they feel detached from the otherwise intact structured directedness of time. Time seems unchangeable, and the present is reduced to circling (daily) repetition. It can further be shown that the present and the past are strongly experienced as negative and, in the former case, as meaningless. Our participants experienced the past as a source of guilt and as providing the reasons for one's present suffering. The future was experienced as blocked; to our participants it seemed out of reach and meaningless. Both the experience of the future being uncertain and that of the future being frightening seemed much more pronounced than previously observable in healthy individuals (Vogel et al., in revision). Both experiences of uncertainty and fear carried with them additional feelings of hopelessness and inevitability.
These findings are in good concordance with theoretical considerations on the psychopathology of time in depression. The above-mentioned disturbance of lived time has also been referred to as blocked future (Straus, 1928; von Gebsattel, 1928; Fuchs, 2001; Wyllie, 2005; Stanghellini et al., 2017), disturbance of (vital) becoming (Straus, 1928; von Gebsattel, 1954; Minkowski, 1971; Fuchs, 2013), disturbance of protention (Binswanger, 1960), or disorder/disturbance of conation (Stanghellini et al., 2017). In simple terms, this terminology describes the depressed patient's inability to advance into the future through goal-directed or planned action. This renders the future inaccessible, turning the past into the dominant temporal domain. Subsequently, past experiences are deemed responsible for the depressed present state. Statements within our data conveying an experience of a decreased or diminished ability to influence the present and the future, an experience of an intangible future further foregrounded by a slowed down, halted or endless passage of time, and the addition of feelings of guilt and negativity to an otherwise still influential past, correspond closely to the hypothesized disorder of lived time. Although it was not possible to draw conclusions as to the dynamic of the onset and interrelationship of depressive symptoms of time experience from our data, we conclude that the experience we captured by employing the presented method empirically supports MDD as a disorder of lived time, including its correlating symptoms.

Our findings coincide with conclusions from a study by Stanghellini et al. (2017) on the Abnormal Time Experience (ATE) in MDD. The study found three changes in temporal experience, namely (1) a standstill of bodily functions ("vital retardation"), (2) a present and future dominated by the past, and (3) a slowing or blocking of the flow of time ("slackening of the flow of time"). The second category roughly corresponds to statements of guilt and the negative influence of the past. The third category corresponds to statements on the decrease in velocity of the passage of time, and on the standstill and endlessness of the passage of time. However, we were unable to detect statements referring to the first category of ATE. This is most likely due to the fact, pointed out by Stanghellini et al. (2017), that this category refers to more implicit changes of time experience. Hence, it will most likely not be detectable with a questionnaire explicitly inquiring into the experience of time.

Observations and analyses leave us with a coherent account of disturbed time experience in MDD. As clearly represented in **Table 4**, participants with MDD experience an inability or difficulty to influence or advance from a meaningless and unpleasant present. Their future seems blocked and the past overwhelmingly negative. These phenomena present themselves to the depressed individual before the unaffected background of a concept of time, meaning that time not only keeps its structured directedness, but for the individual suffering from MDD the basic notion of how time was experienced before the onset of depression is preserved during illness. This background may arise from implicit intersubjective aspects of temporality, and be the distinguishing origin of depressive suffering (Ratcliffe, 2012; Fuchs, 2013).

The syndrome of disturbed lived time in MDD emerging from our analysis suggests a variety of implications for further research with both phenomenological and cognitive neuroscientific approaches. The relationship between the inseparable formats of time experience—passage of time and structure of time—remains largely uninvestigated in phenomenological, cognitive and in neuroscientific research. The syndrome of a disturbed lived time in MDD demonstrates convincingly the intricate relationship of the passage or flow of time and the structure of time. Neither the experience of the passage of time, nor the experience of the dimensions of time, i.e., past, present, and future, are clearly separable. They interpenetrate and influence each other in a dynamic, complex and "dialectic" way, creating what we refer to as structured directedness.

We speculate that changes in the experience and judgment of the subjective flow of time reflect an individual's present state of becoming, i.e., the congruence of the present activity or state and a desired state. Thus, an individual will experience an unpleasant deceleration in the flow of time during an unpleasant or non-engaging activity (Zakay, 2014), or an acceleration in the flow of time or pleasant fading out of the experience of time (e.g., Csikszentmihályi, 1990; Hancock and Weaver, 2005) during pleasant or engaging activities. The same may hold true for mental disorders and especially affective disorders. Inversely, it has been discussed that the experience of the velocity of the passage of time may be causative of a pleasant or unpleasant experience during a specific activity (Sackett et al., 2010); transferred to MDD, this would suggest that a decrease in the velocity of the passage of time is causative of depressed mood. In either case it seems plausible that the mechanism behind time experience serves as a specific indicator for the current state of what has been referred to as an individual's becoming (Minkowski, 1923, 1971; Straus, 1928; von Gebsattel, 1954; Fuchs, 2013), and that experiences of velocity changes are an integral component of the disturbance of lived time in MDD (Straus, 1928). Relatedly, the subjective speed of time passage has been put forward as reflective of overall status of health (Droit-Volet, 2013; Droit-Volet and Wearden, 2016).

It has recently been proposed that the repeated findings of a subjectively slowed down flow of time in MDD, when contrasted with the heterogeneous findings from time perception research in depressive disorders, in addition to other diverging results between subjective time experience and timing, may be due to two separate underlying mechanisms (Droit-Volet, 2013; Lamotte et al., 2014; Droit-Volet and Wearden, 2016). The mechanism proposed for time perception is the heavily investigated internal clock (Church, 1984; Droit-Volet, 2013; Shi et al., 2013; Allman et al., 2014), whereas the mechanism for time experience remains unclear, although introspection has been suggested as a possible underlying process (Fayolle et al., 2013; Lamotte et al., 2014). These disturbances of subjective time experience have been associated with impaired decision-making (Wittmann and Paulus, 2008; Woods et al., 2014; Owen et al., 2015). Additionally, a slowed down subjective flow of time has been conceptualized to be linked to psychomotor retardation in MDD (Kitamura and Kumar, 1982; Blewet, 1992; Bschor et al., 2004; Gil and Droit-Volet, 2009; Stanghellini et al., 2017), although diverging findings exist (Bech, 1975).

As changes in the brain's resting state (Kaiser et al., 2015; Drysdale et al., 2017) have been implicated as possible underlying neuropathophysiology in MDD, it has been speculated that phenomenological (spatio-)temporal abnormalities and (spatio-)temporal abnormalities observed through neuroimaging in psychiatric conditions including MDD may be corresponding phenomena (Northoff, 2016a,b; Northoff and Stanghellini, 2016). We argue that in the case of MDD, the psychopathological specifiers lie in the disturbance of lived time.

In a different context, resting state data have provided results for further differentiation of subtypes in MDD (Drysdale et al., 2017; Price et al., 2017), and in line with current opinion we suggest that psychopathology may offer a complementary approach to neuroscientific proceedings in identifying secondary types or specifiers of MDD and vice versa (Sullivan et al., 2002; Fountoulakis et al., 2004; Stanghellini and Rossi, 2014; Owen et al., 2015; Stanghellini and Aragona, 2016). In accordance with our findings, we suggest that a possible subgroup distinction may lie in reports of experiences of an increase and a decrease in the velocity of the flow of time. As specified above, accounts of a slow flow of time may be specific to more severe depressed states, and those of a fast flow of time may either reflect recovery from depression, hence possibly posing a symptom specific to less severe depressive episodes, or may in fact be a distinctive feature of a previously unidentified clinical subtype of MDD. Unfortunately, due to the limitations of qualitative analysis (specifically vulnerability to false negatives, possible oversimplification, and the only partially quantitative assessment of categories), it is not feasible to search for the assumed correlations between the occurring symptoms of disturbances in subjective time and, e.g., overall depressive symptom severity on the basis of our data. Although we consider the sample size saturated, fully adequate and sufficient for our qualitative assessment (Sandelowski, 1995; Pope et al., 2000; Mason, 2010; Glaser and Strauss, 2017), it is too small to search for discrete subtypes within the identified syndrome, as the qualitative method is not intended for this type of statistical investigation (Brown and Lloyd, 2001). Further integrative research approaches will be necessary to properly examine the possibility of distinct depression subtypes within the corresponding disorder of subjective time experience.

### LIMITATIONS AND CONCLUSION

Using a time questionnaire (TQ) specifically designed for the investigation of time experience, we acquired qualitative material from 25 participants suffering from MDD. Although patients were recruited after a rigorous diagnostic process, we did not administer psychometric procedures after inclusion and the treatment regimen was relatively heterogeneous. This may have interfered with patients' self-reports. These limitations of our study are accompanied by general methodological limitations of content analysis, primarily vulnerability to false negatives, possible oversimplification, and the only partially quantitative assessment of categories. Despite these restrictions, we were able to identify a specific disturbance of lived time and corresponding clinical symptoms in MDD. These findings hold several implications. As disturbances in time experience in MDD have been repeatedly described in previous literature, but have not yet been included as diagnostic criteria, the detailed description of an underlying syndrome provided herein might improve diagnostic accuracy. Moreover, our findings suggest a probable specification of MDD subgroups based on the disturbance of lived time, be it between degrees of severity or genuinely distinct depressive syndromes. Further research will be needed to investigate the implications posed by a syndrome of disturbed time experience in MDD. We suggest that integrative research designs combining both psychopathological and neuroscientific methods will provide the most fruitful approach to the further investigation of time experience.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Ethics Commission of the Faculty of Medicine of Cologne University with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethics Commission of the Faculty of Medicine of Cologne University.

### AUTHOR CONTRIBUTIONS

DV, CK, and KV: Conceptualization; DV: Data Curation; DV: Investigation; DV: Qualitative analysis; DV, KK, and TS: Coding and Intercoding; DV, TS, and KK: Statistical Analysis; KV: Supervision; CK and KV: Resources; DV: Writing—original draft; DV, TS, CK, and KV: Writing—review and editing.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Vogel, Krämer, Schoofs, Kupke and Vogeley. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Motor Synchronization in Patients With Schizophrenia: Preserved Time Representation With Abnormalities in Predictive Timing

#### Hélène Wilquin<sup>1</sup> \*, Yvonne Delevoye-Turrell<sup>2</sup> , Mariama Dione<sup>3</sup> and Anne Giersch<sup>4</sup>

<sup>1</sup> Aix Marseille Univ, Laboratory of Clinical Psychology, Psychopathology and Psychoanalysis, Aix-en-Provence, France, <sup>2</sup> SCALab, UMR 9193 – National Center for Scientific Research, University of Lille, Villeneuve d'Ascq, France, <sup>3</sup> Department of Physiology, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden, <sup>4</sup> INSERM U1114, Department of Psychiatry, Federation of Translational Medicine of Strasbourg, Strasbourg University Hospital, Strasbourg, France

Objective: Basic temporal dysfunctions have been described in patients with schizophrenia, which may impact their ability to connect and synchronize with the outer world. The present study was conducted with the aim to distinguish between interval timing and synchronization difficulties and more generally the spatial-temporal organization disturbances for voluntary actions. A new sensorimotor synchronization task was developed to test these abilities.

#### Edited by:

Giovanna Mioni, Università degli Studi di Padova, Italy

#### Reviewed by:

Sven Thoenes, Leibniz Research Centre for Working Environment and Human Factors (LG), Germany Michael H. Thaut, University of Toronto, Canada

> \*Correspondence: Hélène Wilquin helene.wilquin@univ-amu.fr

Received: 15 January 2018 Accepted: 20 April 2018 Published: 16 May 2018

#### Citation:

Wilquin H, Delevoye-Turrell Y, Dione M and Giersch A (2018) Motor Synchronization in Patients With Schizophrenia: Preserved Time Representation With Abnormalities in Predictive Timing. Front. Hum. Neurosci. 12:193. doi: 10.3389/fnhum.2018.00193

Method: Twenty-four chronic schizophrenia patients matched with 27 controls performed a spatial-tapping task in which finger taps were to be produced in synchrony with a regular metronome to six visual targets presented around a virtual circle on a tactile screen. Isochronous (time intervals of 500 ms) and non-isochronous auditory sequences (alternated time intervals of 300/600 ms) were presented. The capacity to produce time intervals accurately versus the ability to synchronize own actions (tap) with external events (tone) were measured.

Results: Patients with schizophrenia were able to produce the tapping patterns of both isochronous and non-isochronous auditory sequences as accurately as controls producing inter-response intervals close to the expected interval of 500 and 900 ms, respectively. However, the synchronization performances revealed significantly more positive asynchrony means (but similar variances) in the patient group than in the control group for both types of auditory sequences.

Conclusion: The patterns of results suggest that patients with schizophrenia are able to perceive and produce both simple and complex sequences of time intervals but are impaired in the ability to synchronize their actions with external events. These findings suggest a specific deficit in predictive timing, which may be at the core of early symptoms previously described in schizophrenia.

Keywords: schizophrenia, predictive timing, sensorimotor synchronization, timing and time perception, tapping

## INTRODUCTION


Many approaches, in phenomenology, neurobiology, and experimental psychology, have led to the idea that time disorders may play a role in the pathophysiology of schizophrenia. Timing disorders may affect the patients' ability to connect to and synchronize with the outer world, possibly explaining why patients feel disconnected from reality. However, the precise mechanisms at play in this difficulty may be diverse. In the present work, a sensorimotor synchronization task (SMS) was used to investigate the importance of time interval production and synchronization abilities in patients with schizophrenia. The findings will be discussed in the light of possible clinical and practical implications of predictive-timing abnormalities in schizophrenia.

Phenomenologists have long reported that the thought flow is fragmented in time in patients with schizophrenia (Lewis, 1932; Minkowski, 1933/2005; Kimura, 1994; Fuchs, 2007, 2013; Vogeley and Kupke, 2007; Stanghellini, 2009; Mishara, 2010). Andreasen (1999) proposed the hypothesis of cognitive dysmetria (see also Friston, 2005), which is based on an analogy between the loosening of associations, the observation of cerebellum-related disorders (Feinberg and Guazzelli, 1999; Bernard and Mittal, 2014; reviews in Picard et al., 2008; Bachmann et al., 2014; Giersch et al., 2016), and the newly found role of the cerebellum in the sequencing of cognitive actions (Leiner et al., 1993; Ito, 2008; Koziol et al., 2014). Timing abilities are necessarily required to plan and sequence actions correctly through time, although the model of cognitive dysmetria does not specify the type of timing impairments involved. Recently, studies in experimental psychology have reported a difficulty in schizophrenia to discriminate between simultaneous and asynchronous information in visual perception, and to predict and expect information at the millisecond (ms) and second levels (Giersch et al., 2009; Lalanne et al., 2012a,b; Martin et al., 2017). For example, Martin et al. (2017) used the variable foreperiod paradigm, in which a visual target is displayed at variable delays after an initial fixation point. The probability that the target is displayed increases as time passes, and subjects benefit from time passage to increase expectation and be better prepared to process the target when it is displayed. This effect typically resulted in better performance in controls for longer foreperiods. However, patients showing self-disorders (i.e., lacking the feeling of being present here and now) did not benefit from the passage of time. Furthermore, all patients were impaired when expectancy was globally weakened by the addition of catch trials, i.e., trials in which targets were absent.
These findings suggest that patients have a fragility in the ability to predict the moment of appearance of a visual target, consistent with prior studies suggesting predictive timing impairments at the milliseconds level (Lalanne et al., 2012a,b).
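
The rise in target probability with elapsed time, and its weakening by catch trials, can be illustrated with a conditional-probability (hazard) calculation. The following is a minimal sketch only: the four equiprobable foreperiods and the 20% catch-trial proportion are hypothetical values chosen for illustration and are not the parameters used by Martin et al. (2017); the function name `hazard_rates` is likewise an assumption.

```python
from fractions import Fraction


def hazard_rates(foreperiod_probs, catch_prob=Fraction(0)):
    """Probability that the target appears at each foreperiod,
    given that it has not appeared at any earlier foreperiod.

    foreperiod_probs: probability of each foreperiod among target-present
                      trials (sums to 1), ordered from shortest to longest.
    catch_prob:       proportion of catch trials (no target at all).
    """
    remaining = Fraction(1)  # probability that the target is still pending
    rates = []
    for p in foreperiod_probs:
        p_trial = p * (1 - catch_prob)  # probability over all trials
        rates.append(p_trial / remaining)
        remaining -= p_trial
    return rates


# Hypothetical design: four equally likely foreperiods.
equal = [Fraction(1, 4)] * 4
no_catch = hazard_rates(equal)
with_catch = hazard_rates(equal, catch_prob=Fraction(1, 5))

print(no_catch)    # 1/4, 1/3, 1/2, 1
print(with_catch)  # 1/5, 1/4, 1/3, 1/2
```

Under these assumed values, the conditional probability climbs to certainty at the longest foreperiod when there are no catch trials, but reaches only one half when a fifth of the trials contain no target, which is one way to picture how catch trials globally weaken expectancy.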

Such a fragility may also impact the patients' ability to interact with their environment. Predictive timing is required to decode rhythmic information and track time regularities contained within the environment, and to act in synchrony with the environment. Interacting with the environment indeed requires that motor planning takes the dynamics of perceptual information, e.g., rhythmic patterns, into account. A fragility in the process of sensory predictions may affect this ability and directly impact the accuracy and the stability of sequential motor planning. However, other timing difficulties, such as encoding time durations, may also account for the patients' difficulty to connect to the real world. Such difficulties have been evidenced in several studies based on duration estimation, production, reproduction and discrimination tasks (Rabin, 1957; Orme, 1966; Johnson and Petzel, 1971; Tracy et al., 1998; Volz et al., 2001; Davalos et al., 2003, 2005; Elvevag et al., 2004; Bonnot et al., 2011; Bolbecker et al., 2014). Recent meta-analyses of time perception and temporal processing in schizophrenia point to a heightened variance in the patients' performances rather than a clear over- or under-estimation of interval durations (Ciullo et al., 2016; Thoenes and Oberfeld, 2017). Disturbances in encoding time intervals in patients with schizophrenia may be related to impaired executive functions rather than to timing deficits per se (Elvevag et al., 2004; Roy et al., 2012). Nevertheless, even if timing difficulties in schizophrenia may have several origins, these temporal dysfunctions might affect the ability of the patients to benefit from regularities in the environment and thus to interact with it in a suitable way.

In the present study, the ability to connect and interact with external regularities was tested by exploring the subjects' abilities to coordinate sequences of motor actions (finger taps) with external sequences of tones using an SMS task. During SMS, the finger tapping actions must be coordinated temporally with the predictable external event, i.e., the tones, which requires both time perception and coordination abilities. Any of these abilities may be affected in patients with schizophrenia. As a matter of fact, a number of studies have evaluated SMS tasks in patients with schizophrenia (Greenwood et al., 2007; Carroll et al., 2009; Wilquin et al., 2010; Da Silva et al., 2012; Giersch et al., 2013), overall suggesting lower accuracy and greater variability in patients with schizophrenia than in controls. However, motor timing disturbances observed through SMS tasks may be of different natures, which have not yet been accurately determined.

When a tapping action must be synchronized rhythmically with an auditory metronome, it entails the production of regular time intervals, but also the anticipation of when the action must be initiated in order for it to be synchronized with the external sound. These are distinct aspects of SMS, inasmuch as the production of regular intervals can coexist with asynchronies, e.g., during offbeat tapping.
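
The distinction between interval production and synchronization can be made concrete with a small numerical sketch. The code below is illustrative only: the function names, the index-based pairing of taps with tones, and the sign convention (tap time minus tone time, so that positive values mean the tap follows the tone) are assumptions, not the analysis pipeline of the present study. The simulated offbeat tapper is likewise hypothetical.

```python
def inter_response_intervals(tap_times_ms):
    """Time between the onset of each tap and the onset of the next one."""
    return [b - a for a, b in zip(tap_times_ms, tap_times_ms[1:])]


def asynchronies(tap_times_ms, tone_times_ms):
    """Signed tap-minus-tone difference, pairing taps and tones by index.

    Negative values: the tap precedes the tone; positive values: it follows.
    """
    return [tap - tone for tap, tone in zip(tap_times_ms, tone_times_ms)]


# Isochronous metronome with a 500 ms inter-stimulus interval.
tones = [i * 500 for i in range(6)]

# Hypothetical offbeat tapper: perfectly regular, but always 250 ms late.
offbeat_taps = [t + 250 for t in tones]

iris = inter_response_intervals(offbeat_taps)
asyncs = asynchronies(offbeat_taps, tones)

print(iris)    # every interval equals the 500 ms target
print(asyncs)  # every tap is 250 ms away from its tone
```

In this toy case the inter-response intervals match the target interval exactly while every tap is asynchronous, which is why the two measures need to be analyzed separately.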

The 'inter-stimuli interval' (ISI) refers to the time interval between the onset of one event (tone) and the onset of its successor. The extraction of these auditory intervals is achieved by means of an automatic, implicit timing process associated with attention dynamics (Barnes and Jones, 2000; Jones et al., 2002). It does not require explicit estimation of duration. In the case of isochronous intervals, Ehrlich (1958) proposed that the SMS task refers to a specific mechanism: a self-maintained repetitive activity of initiating and executing voluntary series of movements on the basis of the extracted cadence of the series of tones. This view has been recently corroborated by findings suggesting that an isochronous sequence of tone onsets induces an underlying attentional periodicity that cyclically targets a focus of attention to expected temporal locations (Jones et al., 2002). The phenomenon may be considered as the activity of a persisting and periodic process that is synchronized with an external event, which can tacitly continue despite the introduction of potential interpolated rhythms. Owing to the automatic nature of the perception of auditory sequences composed of isochronous intervals and consistent with previous findings suggesting preserved time perception abilities in schizophrenia (Carroll et al., 2009; Wilquin et al., 2010), it was hypothesized that the patients in the present study would be able to perform correctly the production of successive target intervals of time (Delevoye-Turrell et al., 2012).

However, in SMS tapping tasks, the structure of sound sequences can be more complex. In such a case, regularities in auditory arrays do not arise from successive isochronous time intervals but from, e.g., alternations between two different time intervals (300 ms and 600 ms intervals). This more complex pattern, even if metric, generates temporal expectancies guided by the explicit auditory cues detected within the auditory sequences. As proposed by the dynamic attending theory, this metrical structure with its relative "strong" and "weak" beats, modulates attentional resources over time and in turn, affects the functioning of both perceptual and motor preparatory systems (Bolger et al., 2014). Empirical findings have highlighted the complexity of alternations between two different time intervals (Fraisse, 1966; Grahn and Brett, 2007). For instance, Grahn and Brett (2007) showed that regular beats were reproduced more accurately than metric complex rhythms. In the present study, the use of complex patterns of intervals was intended to heighten the probability of evidencing impairments in rhythmic pattern extraction in the patients with schizophrenia and thus, provide the means to specify those timing mechanisms that may be at the origin of sensorimotor synchronization disturbances. The durations of the inter-response intervals (IRIs) were analyzed in addition to movement fluency and asynchrony indicators in order to assess grouping mechanisms as well as temporal anticipation.

As a matter of fact, complex auditory patterns involve perceptual organization, i.e., the ability to group perceptual information into coherent patterns (Silverstein and Keane, 2011). In particular, features coded close in time are likely to be bound together. Several studies indicate dynamic grouping impairment in schizophrenia, especially in the visual domain (Uhlhaas and Silverstein, 2005; van Assche and Giersch, 2011), but also in audition (Silverstein et al., 1996). At the same time, patients are also known to have deficits in attention flexibility (Braff, 1993). These difficulties may account for accuracy loss when patients have to produce sequences of alternating time intervals (Wilquin et al., 2010; Giersch et al., 2013). In the present study, we expected such accuracy loss to be revealed by difficulties in the case of rhythmic alternations as compared to isochronous intervals.

Crucially, a performance measure was added in order to evaluate synchronization ability, i.e., the tightness of fit of sound and tap. This important aspect of the functional processes underlying SMS is often overlooked in patient studies using finger tapping tasks. Indeed, the level of the patients' synchronization skills is usually obtained by measuring the accuracy and the variability in the production of time intervals, i.e., by calculating the time interval between consecutive finger taps (IRIs). However, as emphasized above, the tempo and rhythmic structure can be accurate even if the tap occurs systematically halfway between two consecutive beeps. Thus, to assess the degree of synchronization capacities, it is necessary to compute asynchronies (also called synchronization errors), which are defined as the time interval between the start of each tap and the start of the corresponding tone contained within the external rhythm. Signed asynchronies, i.e., negative and positive asynchronies, correspond to taps occurring before and after the beeps, respectively. They are the indicators of the ability to produce taps in synchrony with an imposed metronome. Moreover, they indicate whether the taps are anticipated or not, i.e., whether they have been planned in advance of the tone occurrence. Thus, these data provide key information about anticipatory processes in SMS, i.e., predictive timing abilities. To the best of our knowledge, no SMS studies have measured these asynchrony errors in patients with schizophrenia.

Given the recent results suggesting difficulties in time prediction, it was hypothesized that patients with schizophrenia would have difficulties in planning their actions (taps) in anticipation of the predicted moment of sound occurrence. The typical tapping task was modified slightly in order to make it more sensitive to possible anticipation impairments by including a spatial aspect to the SMS task. Indeed, our previous studies suggested that patients were more impaired at planning motor sequences than performing simple one-element motor actions (Delevoye-Turrell et al., 2007, 2012). In sum, the present study was designed to provide the means to distinguish between the roles of distinct temporal difficulties when planning through space a series of voluntary movements in synchrony with an auditory rhythmic pattern. A difficulty in estimating durations should lead to imprecise and variable time intervals between successive taps. In contrast, a difficulty in anticipating the moment of occurrence of the external auditory event should mainly lead to tap-tone asynchronies. The manipulation of the type of tone sequence (isochronous or not) allowed us to assess to what extent the difficulties observed in patients with schizophrenia are a function of the necessity to extract a complex pattern of time regularities within the perceptual world.

### MATERIALS AND METHODS

### Participants

Two groups participated in the current study. The first group was composed of 24 chronic outpatients aged between 25 and 57, who were diagnosed with schizophrenia (Patient group). Diagnoses were based on a structured clinical interview that was given by a single psychiatrist following the DSM-IV. Symptom severity was evaluated using the Positive and Negative Syndrome Scale (PANSS; Kay et al., 1989). All patients were recruited in the local job center (ESAT), which is associated with the Psychiatric Sector of the University Hospitals of Strasbourg. The second group was composed of 27 healthy volunteers matched for age and sex (Control group).

For both groups, exclusion criteria were substance dependence, mental retardation, history of epilepsy and physical illnesses. The present study was approved by the local ethics committee (Comité Consultatif de Protection des Personnes dans la Recherche Biomédicale d'Alsace IV) and all participants provided written informed consent prior to the beginning of the study, in accordance with the recommendations of the Helsinki Declaration.

For all patients, the mean duration of illness and the mean years of education were measured. Patients were stabilized on one or more antipsychotic medications with a calculated mean daily dose in chlorpromazine equivalents. Demographics and clinical data are presented in **Table 1**.

### Apparatus and Instructions

Participants were seated comfortably on a chair in front of a touch screen (EloTouch, 23 cm × 36 cm × 30 cm), which was placed on a narrow support at knee-height. The participants' task was to produce a sequence of pointing movements to visual targets, with their dominant index finger, in synchrony with a series of beeps emitted by a computer. Each beep was 100 ms long and had a frequency of 333 Hz. The visual targets consisted of six equidistant circles (diameter: 1.2°) arranged in the form of a hexagon. Participants were asked to point to each circle in clockwise order, one after the other, starting with the bottom-right circle (**Figure 1**). The trials lasted 15 s and thus, participants circled the visual pattern five or six times as a function of the type of auditory sequence to perform (N = 30 or 34 taps depending on isochronous or non-isochronous sequences). At the end of the trial, the visual pattern disappeared and participants could relax their hand on the side of the touch screen. Participants were instructed to be as accurate as possible both in space and in time throughout.

### Experimental Conditions

In the present study, the software Audacity was used to create two different types of auditory sequences, which are illustrated in **Figure 2**. The isochronous patterns were trials composed of equivalent inter-stimulus intervals (ISIs), providing a regular sequence of auditory beeps (R-eq; ISI = 500 ms). The second series was composed of non-isochronous ISI patterns, i.e., trials made up of alternating time intervals (R-alt; ISI = 300/600 ms or ISI = 600/300 ms), affording a more complex rhythmic sequence. An overall tempo of 900 ms could be extracted from the pooled alternated short and long ISIs. The isochronous R-eq sequences were composed of 34 tones and the non-isochronous R-alt sequences were composed of 30 tones. In all conditions, the tones lasted 100 ms.
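As an illustration of how the two sequence types differ, the following minimal sketch computes tone onset times from a repeating ISI pattern. It is not the authors' procedure (the sequences were built in Audacity); the function name `tone_onsets` and the assumption that the first tone starts at 0 ms are hypothetical choices made for the example.

```python
from itertools import accumulate, cycle, islice


def tone_onsets(isi_pattern_ms, n_tones):
    """Return onset times (ms) for n_tones tones.

    isi_pattern_ms is cycled to obtain successive inter-stimulus intervals,
    e.g. [500] for R-eq or [300, 600] for R-alt. The first tone is assumed
    to start at 0 ms (an assumption made for this illustration).
    """
    if n_tones < 1:
        raise ValueError("n_tones must be at least 1")
    if not isi_pattern_ms:
        raise ValueError("isi_pattern_ms must not be empty")
    # n_tones tones are separated by n_tones - 1 intervals.
    isis = islice(cycle(isi_pattern_ms), n_tones - 1)
    # Cumulative sum of the intervals, starting from the first onset at 0 ms.
    return list(accumulate(isis, initial=0))


r_eq = tone_onsets([500], 4)               # isochronous pattern
r_alt_300_600 = tone_onsets([300, 600], 5)  # alternating short/long pattern
r_alt_600_300 = tone_onsets([600, 300], 5)  # alternating long/short pattern
```

In the alternating patterns, every second onset falls on a multiple of 900 ms, which corresponds to the overall tempo mentioned above.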

### Experimental Design

Participants were tested individually in a quiet room. They performed a familiarization phase in which they were instructed to tap to the beat of the three auditory sequences (R-eq; R-alt300−600; R-alt600−300). No feedback on performance level was provided. When participants were comfortable with the setup, instructions and equipment, they performed the test session, with each trial consisting of a listening phase (4.5 s), a waiting phase without sound (2.7 s) and a producing phase. The waiting phase was introduced in order to allow the participants to perceive the beat and to be ready to point to the first target at the start of the trial.

Participants performed nine trials, three trials per condition semi-randomized. Pre-analyses of the non-isochronous rhythmic sequences (R-alt300−600; R-alt600−300) revealed similar result patterns. Hence, these two trial types were collapsed for the main analysis, and will be referred to as R-alt in the following.

TABLE 1 | Demographic and psychometric characteristics for the patient and the control groups.

SZ, patients with schizophrenia; HC, healthy controls; s, significant; ns, non-significant.

<sup>a</sup>n = 24, <sup>b</sup>n = 27.

FIGURE 1 | Experimental protocol. (Left) The left panel illustrates a top view of the tactile screen on which the picture of the targets was displayed throughout each trial. The visual targets consisted of six equidistant circles (diameter: 1.2°) arranged in the form of a hexagon. (Right) The right panel is a picture of a participant who was asked to point to each circle in clockwise order, one after the other, in synchrony with a series of beeps (N = 30 or 34 depending on auditory sequences). These sounds were emitted by a computer and played through table-top speakers.

### Data Collection

The participants' task was to tap clockwise around a pattern of six visual targets, tapping one circle on each metronome beat. For each participant and trial, the accuracy of motor performance was measured both in time and in space.

Spatial accuracy was assessed by measuring, for each trial, the dispersion ellipses of the endpoints of the pointing movements performed toward each visual target. Using principal component analysis, spatial ellipses were fitted to the movement endpoint distributions (for details on the calculations, see Gordon et al., 1994). The area of each spatial ellipse (AE, calculated in mm<sup>2</sup>) was then computed, with small values indicating better spatial accuracy (Dione and Delevoye-Turrell, 2015). To characterize the fluency aspect of the task, the contact duration (CD) of the finger upon the tactile screen during each tap was also measured, as it may be abnormal in chronic patients with schizophrenia (Delevoye-Turrell et al., 2012).
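The following sketch shows one way a dispersion ellipse area could be derived from endpoint coordinates via the principal components of their covariance matrix. It is only an illustration under stated assumptions: the exact scaling of the ellipse axes used in the study is given in Gordon et al. (1994) and is not reproduced here, so the `scale` parameter (1 standard deviation per semi-axis by default) and the function name `ellipse_area` are hypothetical.

```python
import math


def ellipse_area(points, scale=1.0):
    """Area of a dispersion ellipse fitted to 2D endpoints (x, y) in mm.

    The semi-axes are scale * sqrt(eigenvalue) of the sample covariance
    matrix, i.e. they lie along the principal components of the endpoints.
    The choice scale=1.0 is an assumption made for this illustration.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two endpoints are required")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    det = sxx * syy - sxy ** 2
    disc = math.sqrt(max(half_trace ** 2 - det, 0.0))
    lam_major = half_trace + disc
    lam_minor = max(half_trace - disc, 0.0)
    semi_major = scale * math.sqrt(lam_major)
    semi_minor = scale * math.sqrt(lam_minor)
    return math.pi * semi_major * semi_minor


round_cloud = ellipse_area([(-1, 0), (1, 0), (0, -1), (0, 1)])
collinear = ellipse_area([(0, 0), (1, 1), (2, 2)])
```

A cloud of endpoints lying on a single line yields a zero area, whereas endpoints scattered equally in all directions yield a circular ellipse.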

Regarding the timing aspect of the task, the following variables were considered and for better comprehension, they will be presented in two different sections: (1) producing a rhythmic pattern and (2) being synchronized to an external metronome.

#### Production Parameters

The IRI refers to the time interval between the onsets of two successive taps produced by a participant (**Figure 2**, top). This parameter, commonly used in the tapping literature (Repp, 2005), reflects the subjects' capacity to accurately produce a timed motor sequence. Through the use of a tactile screen, the IRIs were calculated for each trial and for each participant by detecting the first point of contact between the finger and the screen (in ms) for each pointing action.

FIGURE 2 | Isochronous sequences were constituted of equivalent inter-stimulus intervals (ISIs) of time (Top: R-eq; ISI = 500 ms). Non-isochronous sequences were composed of alternating time intervals (Bottom: R-alt; ISI = 300/600 ms or ISI = 600/300 ms). This figure also specifies the indicators used to characterize the participants' timing performances. The inter-response interval (IRI) refers to the time interval between the onsets of two successive taps produced by a participant (Top) and reflects the participants' ability to accurately produce a timed motor sequence. The asynchrony indicator (ASYNC) was calculated as the time interval between the first detected point of contact between the finger and the screen (i.e., the tap action) and the start of the nearest tone. It is a marker of the participants' ability to produce a motor response in synchrony with a predictable external event. By convention, signed asynchronies are negative when the tap is ahead of the target beep (e.g., ASYNC 1), and positive when the tap lags behind (e.g., ASYNC 3). ISI, inter-stimulus interval; IRI, inter-response interval; ASYNC, asynchrony (in ms).

Using the IRI measurements for each individual, the coefficient of variation (CV) was also calculated for each trial following the equation: $\mathrm{CV}_{\mathrm{IRI}} = 100 \times \mathrm{SD}(\mathrm{IRI}) / \mathrm{mean}(\mathrm{IRI})$ (in %). The CV was used as an indicator of within-subject performance variability and thus, of performance stability.
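The production parameters can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not state whether the sample or population formula was used.

```python
from statistics import mean, stdev


def inter_response_intervals(tap_onsets_ms):
    """IRIs (ms): differences between the onsets of successive taps."""
    return [later - earlier
            for earlier, later in zip(tap_onsets_ms, tap_onsets_ms[1:])]


def coefficient_of_variation(values):
    """CV in %: 100 * standard deviation / mean.

    The sample standard deviation is assumed here.
    """
    return 100 * stdev(values) / mean(values)


# Hypothetical tap onsets (ms) for a short excerpt of one trial.
taps = [0, 400, 1000]
iris = inter_response_intervals(taps)
cv_iri = coefficient_of_variation(iris)
```

Perfectly regular tapping gives a CV of 0%, and the CV grows as successive intervals deviate from their mean.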

#### Synchronization Parameters


SMS is characterized by the predictability of the external beep, which arises from its regular recurrence. It is in fact this feature of predictability that allows good synchronization between one's own self-initiated movements and an external event. This is a clear distinction from simple reaction time tasks, for which an action is made as quickly as possible after a beep and thus, is characterized by a reaction time of 180 ms or more. Hence, the ability to anticipate the beep occurrence leads to tap-beep asynchronies close to zero, or even to negative asynchronies, in healthy volunteers, as they have been shown to have the tendency to over-estimate sensorimotor latencies (Aschersleben, 2002).

The synchronization ability of the participants in the present study was characterized by calculating an asynchrony indicator (**Figure 2**, top), which was computed for each trial and each participant. ASYNC was calculated for each tap as the time interval (in ms) between the first detected point of contact between the finger and the screen and the start of the nearest tone (TapStart − BeepStart). This difference is referred to as the "signed asynchrony", which indicates the direction of the synchronization error. By convention, signed asynchronies are negative when the tap precedes the target beep and positive when the tap is late. An "absolute asynchrony" was also calculated for each tap to illustrate the error amplitude of asynchronies, independently of error direction. Finally, a measure of synchronization variability was also computed. To this aim, a CV of asynchronies was calculated for each trial following the equation: $\mathrm{CV}_{\mathrm{ASYNC}} = 100 \times \mathrm{SD}(|\mathrm{ASYNC}|) / \mathrm{mean}(|\mathrm{ASYNC}|)$ (in %).
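The three synchronization parameters can be sketched as follows. This is an illustrative reading of the definitions above, not the authors' implementation: the function names are hypothetical, ties between two equally distant tones are resolved in favor of the earlier tone, and the sample standard deviation is assumed for the CV.

```python
from statistics import mean, stdev


def signed_asynchronies(tap_onsets_ms, tone_onsets_ms):
    """TapStart - BeepStart for each tap, using the nearest tone onset.

    Negative values: tap ahead of the beep; positive values: tap late.
    If two tones are equally distant, the earlier one is used (assumption).
    """
    return [tap - min(tone_onsets_ms, key=lambda tone: abs(tap - tone))
            for tap in tap_onsets_ms]


def asynchrony_summary(tap_onsets_ms, tone_onsets_ms):
    """Mean signed asynchrony, mean absolute asynchrony and CV (%) of |ASYNC|."""
    signed = signed_asynchronies(tap_onsets_ms, tone_onsets_ms)
    absolute = [abs(value) for value in signed]
    return {
        "mean_signed_ms": mean(signed),
        "mean_absolute_ms": mean(absolute),
        # Sample standard deviation assumed for the CV.
        "cv_percent": 100 * stdev(absolute) / mean(absolute),
    }


# Hypothetical data: one early tap followed by two late taps.
tones = [0, 500, 1000]
taps = [-20, 510, 1030]
signed = signed_asynchronies(taps, tones)
summary = asynchrony_summary(taps, tones)
```

Note that the mean signed asynchrony can be close to zero even when individual taps are far from the beeps, which is why the absolute asynchrony is reported alongside it.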

### Statistical Analyses

Descriptive time series analyses were first conducted to detect possible outliers in the tapping time series. In both groups, it took a few beeps for the participants to get into the rhythm set by the metronome. Hence, in the following analyses the first six taps of each trial were discarded. Analyses of variance (ANOVA) were conducted on the different dependent variables to reveal main effects of pathology and auditory sequence complexity, and interactions. These analyses were conducted with Statistica Software and the significance level was set to an alpha of 0.05.

### RESULTS

Overall, participants showed no difficulties in producing series of taps in synchrony with the auditory sequences and this was true for both types of auditory sequences. The following analyses were conducted on mean CDs, with group and type of sequence as between-group variables (since isochronous and non-isochronous sequences were run in separate blocks). The patients with schizophrenia showed significantly longer CDs (M = 172 ms, SD = 53) than controls (M = 140 ms, SD = 45); F(1,98) = 12.961; p = 0.001, η<sub>p</sub><sup>2</sup> = 0.117. All participants produced shorter CDs in non-isochronous (M = 137 ms, SD = 42) than in isochronous sequences (M = 174 ms, SD = 54); F(1,98) = 17.908; p < 0.001, η<sub>p</sub><sup>2</sup> = 0.155. No significant Group × Sequence interaction effects were obtained [F(1,98) = 2.786; p = 0.098, η<sub>p</sub><sup>2</sup> = 0.028].

Spatial performances in the control group (M = 71 mm<sup>2</sup>, SD = 21) were more accurate than those observed in the patient group, which revealed larger areas in the spatial ellipses (M = 119 mm<sup>2</sup>, SD = 75), F(1,98) = 21.595; p < 0.001, η<sub>p</sub><sup>2</sup> = 0.181. Both groups were less accurate in space when producing non-isochronous sequences (M = 110 mm<sup>2</sup>, SD = 70) compared to isochronous sequences (M = 78 mm<sup>2</sup>, SD = 40); F(1,98) = 9.664; p = 0.002, η<sub>p</sub><sup>2</sup> = 0.090. The Group × Sequence interaction effect was not significant [F(1,98) = 1.587; p = 0.211, η<sub>p</sub><sup>2</sup> = 0.016], suggesting a similar increase in difficulty in both groups when synchronizing to more complex metronome sequences. These results are summarized in **Table 2**.
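For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F value and its degrees of freedom through the standard identity F × df_effect / (F × df_effect + df_error). The sketch below applies this general relationship to two of the F values reported above; it is not a description of the computations performed by the statistical software used in the study, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Group effect on contact durations: F(1,98) = 12.961.
eta_cd_group = round(partial_eta_squared(12.961, 1, 98), 3)
# Group effect on ellipse areas: F(1,98) = 21.595.
eta_area_group = round(partial_eta_squared(21.595, 1, 98), 3)
```

Both values agree, to three decimals, with the effect sizes reported in the text (0.117 and 0.181).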

### Producing Isochronous and Non-isochronous Rhythmic Patterns

In this section, results are presented on the participants' capacity to produce successive target intervals of time without taking into account the fact that participants were synchronized or not to the beeps.

#### Isochronous Sequences (R-eq)

The target time interval was an ISI of 500 ms. A total of 15 IRI values greater than 1000 ms (absolute value) were filtered out, leaving 93% of the observations.
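The preprocessing steps described so far (discarding the first six taps of each trial, then filtering out aberrant IRIs) might be sketched as follows. The order of the two steps, the function name `clean_iris` and the example data are assumptions made for illustration; the text does not specify these implementation details.

```python
def clean_iris(tap_onsets_ms, n_discard=6, max_iri_ms=1000):
    """Return the IRIs (ms) kept for analysis in one trial.

    The first n_discard taps are dropped (entrainment period), IRIs are
    computed on the remaining taps, and IRIs whose absolute value exceeds
    max_iri_ms are filtered out. The order of these steps is assumed.
    """
    kept_taps = tap_onsets_ms[n_discard:]
    iris = [later - earlier for earlier, later in zip(kept_taps, kept_taps[1:])]
    return [iri for iri in iris if abs(iri) <= max_iri_ms]


# Hypothetical trial: irregular start, then one missed tap (1500 ms gap).
trial_taps = [0, 480, 1010, 1500, 2000, 2500, 3000, 3500, 5000, 5500]
kept = clean_iris(trial_taps)
```

For the non-isochronous sequences, the same sketch would apply with a cutoff of 1,800 ms.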

The results from the one-way ANOVA suggested that the differences between mean IRIs in the control group (M = 497.9 ms, SD = 5.80) and in the patient group (M = 501.76 ms, SD = 11.20) were not statistically significant; F(1,49) = 2.431; p = 0.125, η<sub>p</sub><sup>2</sup> = 0.05. These results are presented in **Figure 3** (left) and indicate that all participants were able to accurately produce an isochronous auditory sequence, with IRIs close to the 500 ms target interval.

The results from the one-way ANOVA conducted on the coefficients of variation (CV) revealed that the differences in IRI variability between the patient group (M = 10.13%, SD = 9.108) and the control group (M = 11.82%, SD = 10.00) were not statistically significant, F(1,49) = 0.391; p = 0.535, η<sub>p</sub><sup>2</sup> = 0.008. The CV values were low (around 10%), indicating an overall correct production of IRIs in both groups (**Table 3**).

#### Non-isochronous Sequences (R-alt)

The target time interval was an overall ISI of 900 ms (short and long target intervals pooled). A total of 21 IRI values greater than 1,800 ms (absolute value) were filtered out, leaving 81% of the observations.

A one-way ANOVA was conducted on the mean IRIs. Analyses revealed an absence of Group effect, F(1,49) = 1.926; p = 0.172, η<sub>p</sub><sup>2</sup> = 0.038. The mean IRIs for the patient group (M = 922.38 ms, SD = 62.00) and the control group (M = 905.40 ms, SD = 13.78) were close to the overall target interval of 900 ms. Findings are presented in **Figure 3** (right).

TABLE 2 | Summary of findings obtained for the spatial accuracy and contact durations (CD in ms) in the patient and in the control groups as a function of the different rhythmic complexity conditions.


SZ, patients with schizophrenia; HC, healthy controls. Analyses of variance were conducted on both contact duration and area of ellipses to reveal main effects of pathology.

A one-way ANOVA conducted on the coefficients of variation (CV) did not indicate significant Group differences between the patient group (M = 28.28%, SD = 11.61) and the control group (M = 27.75%, SD = 6.15), F(1,49) = 0.043; p = 0.837, η<sub>p</sub><sup>2</sup> = 0.001 (**Table 3**). Overall, these findings indicate a preserved capacity in both controls and patients to produce non-isochronous auditory sequences.

### Maintaining Synchrony With an External Metronome

As mentioned in the data collection section, three synchronization parameters were used to test the ability of the participants to synchronize their motor actions with an external metronome: signed asynchronies (to reveal the error direction of the asynchronies: taps lagging or ahead of the metronome beep), absolute asynchronies (to indicate the amplitude of these asynchronies) and asynchrony variability (to characterize the stability of the asynchrony errors). Results for each of these measures are presented below for isochronous and non-isochronous auditory sequences, respectively. More specifically, for the non-isochronous sequences, the signed and absolute asynchronies were measured on the first tap of the overall interval of 900 ms.

#### Isochronous Sequences (R-eq)

Concerning the direction of synchronization error (signed ASYNC), a one-way ANOVA was conducted, which revealed significant differences between the control group (M = −21.82 ms, SD = 45.40) and the patient group (M = 10.50 ms, SD = 44.90). Findings indicated that controls tapped in advance of the beeps whereas patients tapped late, F(1,49) = 6.505; p = 0.014, η<sub>p</sub><sup>2</sup> = 0.117. These results are presented in **Figure 4**. The absolute ASYNC did not significantly differ between the control group (M = 42.97 ms, SD = 36.80) and the patient group (M = 62.21 ms, SD = 36.10), F(1,49) = 3.531; p = 0.066, η<sub>p</sub><sup>2</sup> = 0.067.

A one-way ANOVA conducted on the coefficients of variation (CV) of ASYNCs did not reveal significant Group differences between the patient group (M = 100.80%, SD = 40.25) and the control group (M = 99.30%, SD = 24.79), F(1,49) = 0.026; p = 0.872, η<sub>p</sub><sup>2</sup> = 0.001. The findings indicated a similar variability in synchronization performances in both controls and patients.

TABLE 3 | Summary of findings obtained for the coefficients of variation (CV) for the temporal variables.

SZ, patients with schizophrenia; HC, healthy controls.

FIGURE 4 | Mean signed asynchronies (ASYNC in ms) (with their respective standard errors) in the equivalent isochronous rhythmic condition (Left: R-eq) as a function of the group (controls; patients). On the (Right), the time series of these results are presented for the total of 34 taps in controls (black dots) and in the patients (gray dots). The black vertical line illustrates the boundary from which the first six taps of each trial were discarded for the analyses. Indeed, it took a few beats for the participants to get into the pace set by the metronome.

#### Non-isochronous Sequences (R-alt)

The one-way ANOVA conducted on the signed ASYNC revealed significant differences between the control group (M = 8.23 ms, SD = 32.27) and the patient group (M = 28.27 ms, SD = 29.12). In both groups, asynchronies were positive but nevertheless the controls were more synchronized with the metronome than the patients, F(1,49) = 5.36; p = 0.025, η<sub>p</sub><sup>2</sup> = 0.099. These results are presented in **Figure 5**.

A one-way ANOVA conducted on the absolute ASYNC confirmed significant differences between the patient group (M = 76.19 ms, SD = 29.35) and the control group (M = 42.48 ms, SD = 15.55), indicating greater synchronization errors in the patients than in the controls, F(1,49) = 27.12; p < 0.001, η<sub>p</sub><sup>2</sup> = 0.356.

The results from the one-way ANOVA conducted on the coefficients of variation (CV) of ASYNCs revealed an absence of group differences between the patient group (M = 76.81%, SD = 26.77) and the control group (M = 76.38%, SD = 16.31), F(1,49) = 0.005; p = 0.945, η<sub>p</sub><sup>2</sup> = 0.001, indicating that the patients were no more variable in their performance than the controls.

### Correlations

In the case of non-isochronous sequences, the Lepine disorganization score (sum of items P2, N5, G10 and G11 in the PANSS, van Assche and Giersch, 2011) correlated with the spatial area of the taps (r = 0.49, N = 24, p < 0.05). No other correlations were observed and in particular, there were no significant correlations between the performance scores obtained in the isochronous sequences and those coding the chlorpromazine equivalents.

### DISCUSSION

Distortions of time have been described in schizophrenia for over a century (Lewis, 1932; Minkowski, 1933/2005). And yet, the scientific literature still offers little knowledge about the exact processes involved in the timing deficits reported in patients with schizophrenia (Ciullo et al., 2016; Thoenes and Oberfeld, 2017). Using a tapping SMS task, the present study contrasted patients with schizophrenia and controls in their abilities to produce sequences of time intervals in synchrony with a series of regular auditory tones, as a function of the rhythmic complexity of the sensory information.

As in previous studies, patients revealed longer contact times than controls (Delevoye-Turrell et al., 2007, 2012), which was previously proposed as an index of the patients' difficulty in planning voluntary motor sequences composed of (1) a downward movement of the finger, (2) a tactile surface-touching instant and (3) a lifting-up movement of the finger. The findings reported here support the idea that schizophrenia is characterized by a fluency difficulty in the sensorimotor integration domain of voluntary movement. The remainder of the present study was then designed to gain a better understanding of the nature of the fluency deficits related to motor and perceptual timing inabilities in schizophrenia.

The IRIs were well produced by both the controls and the patients, indicating that the cognitive abilities required for time production of both isochronous and non-isochronous intervals are preserved in schizophrenia. The relatively small amplitude of impairment in time interval production fits with the meta-analysis of Thoenes and Oberfeld (2017), who reported that duration-related performance was only moderately altered in patients with schizophrenia. In the present study, patients were able to take into account the tempo of the auditory metronome and produce the required inter-tap intervals, confirming a somewhat preserved perception and production of time interval durations in schizophrenia, even if the movements remained less fluent in the patients than in the controls, i.e., with longer contact durations between the finger and touch screen.

Tap-tone asynchronies were accurate in both groups. Yet only control participants were able to tap in advance of the sound and thus showed clear anticipation, as is classically reported in the literature (for a review, see Repp, 2005). The patients, in contrast, tapped after the beep. As such, the group differences in the tap-tone asynchronies were significant, indicating a clear effect in the patients. Such findings are consistent with previous results, which have been reported using a similar paradigm, albeit with more complex visual grouping features (Giersch et al., 2013). Nevertheless, the present study provided the means to gain a better understanding of the mechanisms at play in the patients' alteration of motor prediction. First, the positive tap-tone asynchronies in the patients suggest that some degree of anticipation is preserved. Indeed, a complete lack of prediction would have forced subjects to wait to perceive the sound before tapping, which would have led to positive tap-tone asynchronies greater than +100 ms (Repp and Su, 2013). This was the case in neither group. Second, in both groups, learning curves were observed (presented in **Figures 3**, **4**), which confirm that controls and patients were able to detect asynchrony errors and use them to improve synchrony performances. Indeed, at the start of the trials, all participants showed a large tap-tone delay, probably because a certain number of beep intervals were needed in order to pick up the beat. Nevertheless, both groups quickly reached a plateau of tap-tone synchronies within the first six taps of a trial. It has been suggested that such adjustments do not require a conscious detection of synchrony errors (Weibel et al., 2015). Importantly, the adjustment patterns in the patients were consistent with prior results indicating that patients automatically adjust motor actions in the case of a mismatch between expected and real sensory outcomes, at least as long as the detection of this mismatch does not need conscious awareness (Fourneret et al., 2002; Delevoye-Turrell et al., 2003; Knoblich et al., 2004; Lencer et al., 2017). Another important point of these learning curves is that they suggest that the differences in tap-tone asynchronies between patients and controls appear only late, when controls start to tap in advance. Patients' and controls' performance is similar during the first beep intervals of the trials. If anything, patients are rather faster than controls, suggesting that the group differences are not due to a non-specific motor slowing in the patients.

Even if negative asynchronies seem to be the golden rule for isochronous intervals, sensorimotor anticipations were less pronounced in the non-isochronous than in the isochronous rhythmic sequences in both controls and patients. Repp et al. (2011) have reported negative asynchronies even for non-isochronous intervals, but these authors tested musicians only. Furthermore, Repp (2005) and Repp and Su (2013) used solely tapping paradigms for which no displacement through space was required: tapping was performed at different tempi and with different rhythmic patterns but always to a unique spatial target. A spatial aspect has been incorporated here, as most voluntary motor behavior in our everyday life requires not only timing but also spatial coordination of body movements. In doing so, Giersch et al. (2013) reported that tap-tone asynchronies were always positive, supporting the results observed in the present study, for which positive asynchronies were observed both in the patients and in the controls in the non-isochronous sequences. Furthermore, the changes in the nature of the asynchronies as a function of task complexity suggest that tapping in a predictive manner is related to the attentional load of the task. More specifically, participants may tap all the more in advance when they perform the task automatically and when the synchronization parameter is adjusted on the basis of the predicted sensory consequences of action, and not on the initiation of the outgoing efferent command. This explanation fits well with the results observed in musicians, who need to coordinate their actions with the sounds produced by others, and who usually show negative mean asynchronies.

Anticipatory mechanisms may be less efficient when task complexity increases. This would indicate that for the patients, even the isochronous tapping task would be effortful. This hypothesis of an increase in task difficulty (i.e., in cognitive load) in both isochronous and non-isochronous sequences in the patients is consistent with the observed increase in spatial errors, which are larger in the patients than in the controls across conditions. An increased difficulty in synchronizing taps with beats is also supported by previous work. For example, in Turgeon et al. (2012), participants were instructed to detect a shift in the occurrence of a sound, presented within a series of otherwise isochronous tones. Contrary to controls, the patients had more difficulty detecting the phase shifts when they tapped in synchrony with the tones than when they simply listened to the tone series. This finding suggested a particular difficulty associated with SMS, namely a difficulty in predictive timing abilities. This occurred in a task based on simple isochronous rhythms, which is usually believed to require only basic, automatic mechanisms. Moreover, the chosen tempo was close to the spontaneous tempo observed in adults suffering from schizophrenia (Delevoye-Turrell et al., 2012), meaning that the tempo was especially comfortable for the patients (Delevoye-Turrell et al., 2014). Hence, what is usually a very easy task for healthy controls may present a real challenge for patients with schizophrenia.

It has been proposed that the natural flow of events may have lost its continuity in schizophrenia (Martin et al., 2014; Giersch and Mishara, 2017a,b). This might account for a difficulty in benefiting from the flow of time when expecting an event, and in anticipating external events (Northoff and Duncan, 2016). Previous studies have suggested a link between time prediction difficulties and elementary forms of self-disorders, i.e., those enabling subjects to feel the present as being here and now. This disorder of the self usually emerges during the prodromal phase of schizophrenia, and may pre-exist full-blown symptoms (Møller and Husby, 2000; Parnas et al., 2005; Wilquin and Delevoye-Turrell, 2012). As a matter of fact, previous studies have not found correlations between the most elementary time disorders and clinical symptoms (Giersch et al., 2015). Likewise, in the present study, no correlations were found between asynchronies and clinical symptoms. A correlation with clinical disorganization was found only with spatial errors, echoing the significant correlations found between spatial organization and clinical disorganization (Silverstein and Keane, 2011; van Assche and Giersch, 2011). Future studies will have to investigate whether or not tap-tone asynchronies correlate with elementary forms of self-disorders in patients with schizophrenia, perhaps by making the task more difficult through modulation of the tempo of the sequences to be produced.

### CONCLUSION

Our results suggest a clear dissociation between the preserved ability to produce inter-tap intervals in schizophrenia and the difficulties in timing self-initiated action to predictable external events. A dissociation between these two mechanisms in a sensori-motor synchronization task had already been reported in a study conducted in healthy adults by Palmer et al. (2014). In their study, beat-deaf patients were able to produce time intervals but were impaired in synchronizing self-produced taps with the beats of the metronome. However, contrary to patients with schizophrenia, they tapped well in advance of the sounds. The findings reported here in patients with schizophrenia thus appear to show a specific atypical pattern in synchronizing self-initiated body movements to outer world events.

### AUTHOR CONTRIBUTIONS

AG and YD-T designed the study and edited and finalized the manuscript. AG acquired the clinical data. HW acquired the experimental data. HW, MD, and YD-T analyzed the experimental data. AG, HW, and YD-T interpreted the data. HW wrote the first draft of the manuscript. All authors reviewed, amended, and approved the manuscript.

### FUNDING

This study was conducted in association with INSERM and the University Hospitals of Strasbourg (France). The research was supported by the French National Research Agency for the scientific projects (ANR-10-BLAN-1903-01 and ANR-EQPX-0023) and the doctoral student HW (ANRt, CIFRE n° 558/2007).



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Wilquin, Delevoye-Turrell, Dione and Giersch. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Perceived Simultaneity and Temporal Order of Audiovisual Events Following Concussion

Adrienne Wise and Michael Barnett-Cowan\*

Department of Kinesiology, University of Waterloo, Waterloo, ON, Canada

The central nervous system allows for a limited time span referred to as the temporal binding window (TBW) in order to rapidly determine whether multisensory events correspond with the same event. Failure to correctly identify whether multisensory events occur simultaneously and their sequential order can lead to inaccurate representations of the physical world, poor decision-making and dangerous behavior. Damage to the neural systems that coordinate the relative timing of sensory events may explain some of the long-term consequences associated with concussion. The aim of this study was to investigate whether the perception of simultaneity and the discrimination of temporal order of audiovisual stimuli are impaired in those with a history of concussion. Fifty participants (17 with concussion history) were recruited to complete audiovisual simultaneity judgment (SJ) and temporal order judgment (TOJ) tasks. From these tasks, the TBW and point of subjective simultaneity (PSS) were extracted to assess whether the precision and/or the accuracy of temporal perception changes with concussion, respectively. Results demonstrated that those with concussion history have a significantly wider TBW (less precise), with no significant change in the PSS (no change in accuracy), particularly for the TOJ task, but no significant differences were found between the SJ and TOJ tasks. Importantly, a negative correlation between the time elapsed since last concussion and TBW width in the TOJ task suggests that precision in temporal perception does improve over time. These findings suggest that those with concussion history display an impairment in the perceived timing of sensory events and that monitoring performance in the SJ and TOJ tasks may be a useful additional assessment tool when making decisions about returning to regular work and play following concussion.

Edited by: Simon Grondin, Laval University, Canada

#### Reviewed by:

Anne Giersch, Institut National de la Santé et de la Recherche Médicale (INSERM), France; Leah Fostick, Ariel University, Israel

> \*Correspondence: Michael Barnett-Cowan mbc@uwaterloo.ca

Received: 13 January 2018 Accepted: 27 March 2018 Published: 13 April 2018

#### Citation:

Wise A and Barnett-Cowan M (2018) Perceived Simultaneity and Temporal Order of Audiovisual Events Following Concussion. Front. Hum. Neurosci. 12:139. doi: 10.3389/fnhum.2018.00139

Keywords: concussion, time perception, auditory, visual, simultaneity, temporal order

### INTRODUCTION

A sport-related concussion is defined as the rapid onset of a short-lived impairment of neurological function due to a direct or indirect force, which resolves spontaneously (McCrory et al., 2017). While this definition suggests concussion results in short-term impairments, the research community is gaining more insight into its long-term consequences and the potential for experiencing prolonged symptoms (McCrory et al., 2017). Long-term consequences include cognitive and motor impairments (Brown et al., 2015; Dalecki et al., 2016), second impact syndrome (Cantu, 1998), chronic traumatic encephalopathy (McKee et al., 2009) and an increased risk of mild cognitive impairment (MCI) and earlier onset of Alzheimer's disease (AD; Guskiewicz et al., 2005). In sport, accurate and timely assessment of a suspected concussion on the sideline (i.e., at the site or field of play) is critical to prevent athletes from returning to play too soon and risking a subsequent, often more serious, concussion. Currently, sideline assessment tools depend heavily on self-reporting of symptoms (e.g., Sport Concussion Assessment Tool, SCAT; Echemendia et al., 2017) and visually evaluated performance tests (e.g., Vestibular Ocular Motor Screening, VOMS; Mucha et al., 2014). While these tests are valuable for concussion assessment based on symptoms, they may be insensitive to subtle deficits associated with mild concussions, leading to these long-term consequences. A deficit that could be associated with concussions and that current tests do not account for is impaired perceived timing of sensory events, specifically perceived temporal order and simultaneity. Damage to the neural systems that govern how the central nervous system (CNS) coordinates the relative timing of sensory events may explain problems that concussed individuals continue to experience.

To form a coherent representation of the world, the CNS plays a role in combining multiple sensory modalities accurately together in the temporal domain. However, the CNS faces a multitude of challenges when integrating sensory information to perceive a unified percept and determining the temporal coincidence of events in the environment. Challenges include the differences in the physical propagation of energies in the environment (Spence and Squire, 2003), and the differences in the internal transmission time, processing speed and axonal lengths of each sensory modality (Pöppel et al., 1990). For example, visual stimuli require a longer processing time compared to auditory stimuli (Pöppel et al., 1990). In order for multisensory integration to occur, the stimuli must fall in a specific temporal proximity of one another, known as the temporal binding window (TBW; Meredith et al., 1987; Stein et al., 1998). Thus, the CNS must calculate these temporal latencies between stimuli to determine whether or not information needs to be bound together or remain distinct.

What do we know about the perceived timing of sensory events in the damaged CNS? While research has thoroughly investigated the perceived relative timing of multisensory stimuli in the healthy adult CNS, some research has also demonstrated how multisensory information is integrated in the developing and damaged CNS. For example, it has been shown that the width of the TBW shrinks from young childhood and throughout adolescence, settling on a minimal width in young adulthood (Hillock et al., 2011; Hillock-Dunn and Wallace, 2012). It has also been shown that the width of the TBW is extended with damage to the CNS in developmental disorders such as autism (Stevenson et al., 2014) and in patients with schizophrenia (Capa et al., 2014). Furthermore, an extended TBW has been shown in older adults (Setti et al., 2011; Bedard and Barnett-Cowan, 2016), and it can be extended even further in patients with MCI and AD (Wu et al., 2012). Thus, the TBW has the potential to be a sensitive marker for changes in the CNS's ability to determine the timing of sensory events following global changes to neural function.

The perceived timing of the temporal order and simultaneity of sensory events can be assessed using various psychophysical methods. These methods experimentally manipulate the temporal delay between two sensory modalities—defined as the stimulus onset asynchrony (SOA) between visual and auditory stimuli for example. In these tasks, observers are asked to report their perception of the event, based on the task at hand. A simultaneity judgment (SJ) task asks observers to report whether two stimuli were simultaneous, while a temporal order judgment (TOJ) task asks observers to report which of two stimuli was presented first. From these two tasks, the size of the TBW can be extracted. Additionally, the point of subjective simultaneity (PSS) can be measured—the amount of asynchrony between two sensory stimuli that likely results in the individual perceiving simultaneity (Love et al., 2013).

The purpose of this study is to provide a more detailed understanding of how persons with a history of concussion perceive the temporal order and simultaneity of audiovisual events. Considering the importance of the initial assessment for concussion diagnosis and ongoing clinical assessment for those recovering from concussion, better sideline tools are needed to ensure full functional recovery. Assessing the TBW and PSS in both the TOJ and SJ tasks may provide timely evidence to better inform return to play, work and school decisions. To our knowledge, no studies have considered the relationship between perceived temporal order and simultaneity in those with concussion history. Here, the TBW is measured in those with and without a history of concussion to investigate whether and how impaired timing of sensory events is associated with concussion history. Based on previous evidence of age-related changes in the perceived timing of sensory events (Bedard and Barnett-Cowan, 2016), it is hypothesized that those with concussion history will show a similar TBW width and PSS compared to a healthy control group on the SJ task. For the TOJ task, it is hypothesized that, compared to healthy controls, those with concussion history will have an extended TBW width and require a larger visual lead-time in order to perceive simultaneity.

### MATERIALS AND METHODS

### Participants

Fifty-one participants with no known auditory or visual deficits were recruited for the study. Participants were recruited through posters located on the University of Waterloo campus. To increase the likelihood of recruiting those with concussion history, athletes from the Waterloo Warriors Varsity program were approached at practices and were invited to participate. Some participants were paid \$10/h for their time, while others volunteered to participate with no pay. Following collection, one participant was excluded because their data could not be interpreted. Participants (n = 50; 18–35 years; mean age = 21.9, SD = 2.8) were categorized into one of two groups: a concussion history group or a healthy control group. Participants were eligible to be a part of the concussion group if they had experienced one or more concussion(s) in the past 10 years, and participants were eligible as a healthy control if they had never been diagnosed with a concussion. Categorization into these two groups was based on self-report data. All participants with concussion history reported being asymptomatic at the time of testing. Seventeen participants were recruited into the concussion history group (n Females = 10; Mean age = 21.3, SD = 3.0) and 33 individuals were recruited as healthy controls (n Females = 24; Mean age = 21.4, SD = 2.2). This study was carried out in accordance with the recommendations of Canada's Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS2) by the University of Waterloo's Human Research Ethics Committee with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the University of Waterloo's Human Research Ethics Committee.

### Study Design

A between-subjects design was used to assess differences between groups with and without concussion history.

### Experimental Procedure

Participants read and signed the information and consent form upon arrival. To confirm their self-reported concussions, participants in the concussion history group were also asked to complete a Concussion Safety Program Injury Report and a SCAT 5 Symptom Evaluation Form (Echemendia et al., 2017). The Concussion Safety Program Injury Report Form is used by the Athletic Therapy team at the University of Waterloo, and consists of questions regarding specific details of an individual's concussion at the time of injury. The SCAT 5 Symptom Evaluation Form asks individuals to rate the severity of a list of concussion-related symptoms on a Likert scale ranging from 0 (no symptoms) to 6 (severe symptoms), for a possible total severity score of 132. For the purposes of this research, participants were asked to rate their current symptoms relative to before their concussion. For participants who had experienced more than one concussion, these forms were based on their most recent concussion. Additionally, participants in both the healthy control and concussion history groups were asked to complete a Clinical Information Form. This form consisted of questions regarding their history of concussion, headaches, migraines, neurological and attention disorders, and whether or not they took medications for any of the above conditions.
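As a quick check on the scoring arithmetic, a maximum total of 132 with ratings from 0 to 6 implies 22 rated symptoms (132 / 6). The sketch below only illustrates that arithmetic; the function name, the constant names and the validation rules are assumptions made for illustration and are not taken from the SCAT 5 materials or from the study.

```python
MAX_RATING = 6     # highest rating on the 0-6 Likert scale described above
N_SYMPTOMS = 22    # inferred from the stated maximum: 132 / 6


def total_severity(ratings):
    """Sum per-symptom ratings into a total severity score (0 to 132)."""
    if len(ratings) != N_SYMPTOMS:
        raise ValueError(f"expected {N_SYMPTOMS} ratings, got {len(ratings)}")
    if any(not 0 <= r <= MAX_RATING for r in ratings):
        raise ValueError("each rating must lie between 0 and 6")
    return sum(ratings)


# Hypothetical participant: mild ratings on three symptoms, none elsewhere.
example_ratings = [2, 1, 3] + [0] * 19
example_score = total_severity(example_ratings)
```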

Each questionnaire and each task was explained to the participant before being asked to complete it. Participants completed practice trials until it was clear they were familiar with each task, and researchers encouraged participants to ask any questions to clarify confusion before the commencement of each task.

### Experimental Setup

Participants were seated in a dark, quiet room with their heads stabilized on a chin rest, which was placed 57 cm away from a MacBook Pro computer (OS 10.9.5 Mavericks; 15 inch, 2880 × 1800). The chin rest kept the head stable, and the 57 cm distance ensured that all participants viewed the visual stimuli from a consistent distance. VPixx Technologies DataPixx software (version 3.01) was used to produce the visual and auditory stimuli accurately in time relative to one another. Visual stimuli were presented on the MacBook Pro monitor as a white circle (0.4°), while auditory stimuli were presented through speakers located adjacent to the computer (Altec Lansing Multimedia computer speaker system, ACS95W). Participants responded to each trial using either the right or left arrow key on the MacBook Pro, each key being coded to a different response depending on the task.
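A viewing distance of 57 cm is convenient because 1 cm on the screen then subtends roughly 1° of visual angle. The following sketch shows the standard geometric conversion; it assumes that the 0.4° value refers to the diameter of the circle, which is not stated explicitly, and the function and variable names are illustrative only.

```python
import math


def size_for_visual_angle(angle_deg, distance_cm):
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 57.0

# Extent of 1 degree at 57 cm: very close to 1 cm.
one_degree_cm = size_for_visual_angle(1.0, VIEWING_DISTANCE_CM)

# Assumed diameter of the 0.4 degree circle: about 0.4 cm on the screen.
circle_cm = size_for_visual_angle(0.4, VIEWING_DISTANCE_CM)
```

Converting these sizes to pixels would additionally require the physical dimensions of the display, which are not reported here.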

### Experimental Tasks

SJ and TOJ tasks were completed in a randomized order, and are represented in **Figure 1** below. Both experimental tasks had an identical design: a trial began when a fixation cross was presented in the center of the computer monitor. Participants were instructed to fixate on this cross throughout the duration of the study. The visual stimulus was presented 1.3° below the fixation cross for 17 ms against a black background. The visual stimulus was preceded or followed by an auditory stimulus (1850 Hz, 7 ms duration). The following SOAs were used between the presentation of the two stimuli: −300, −200, −150, −100, −50, −25, 0, 25, 50, 100, 150, 200 and 300 ms, where negative values indicate that the sound was presented first. Ten trials were completed for each SOA in a randomized order, resulting in 130 experimental trials. All participants were given the opportunity to practice each task until they felt comfortable with completing the experimental trials.

For the SJ task, participants were asked to report whether the visual and auditory stimuli were presented simultaneously (right arrow key) or successively (left arrow key). For the TOJ task, participants were asked to report whether the auditory stimulus (right arrow key) or visual stimulus (left arrow key) was presented first.
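The trial structure described above (13 SOAs, 10 repetitions each, randomized order) can be sketched as follows. This is a minimal illustration and not the DataPixx code used in the study; the function name and the optional `seed` argument are assumptions added so that the example is reproducible.

```python
import random

# SOAs in ms; negative values mean the sound was presented first.
SOAS_MS = [-300, -200, -150, -100, -50, -25, 0, 25, 50, 100, 150, 200, 300]
REPS_PER_SOA = 10


def build_trials(soas=SOAS_MS, reps=REPS_PER_SOA, seed=None):
    """Return a shuffled list with `reps` trials for every SOA."""
    rng = random.Random(seed)
    trials = [soa for soa in soas for _ in range(reps)]
    rng.shuffle(trials)
    return trials


trials = build_trials(seed=1)
```

The resulting list contains 13 × 10 = 130 entries, matching the number of experimental trials reported above.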

### Statistical Analysis

Data analysis was carried out using SigmaPlot version 12.5. As a proxy of the TBW, the just noticeable difference (JND) was measured. The JND indicates the sensitivity of an individual to reliably detect a change of asynchrony between the two stimuli (Harris et al., 2009). To estimate the TBW width and PSS values for each task, SJ and TOJ data were fitted with Gaussian (Equation 1) and sigmoidal (Equation 2) psychometric functions respectively:

$$y = a \cdot e^{-0.5\left(\frac{x - x_0}{b}\right)^2}\tag{1}$$

where a is a scaling factor, $x_0$ is the PSS and b is the JND (proxy for the TBW).

$$y = \frac{100}{1 + e^{\frac{x - x_0}{b}}}\,\% \tag{2}$$

where $x_0$ is the PSS and b is the JND (proxy for the TBW).
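To make the role of each parameter concrete, the sketch below implements the two psychometric functions as they are printed above and recovers the Gaussian parameters from synthetic data. The fitting in the study was done in SigmaPlot; the brute-force grid search used here is a simple stand-in chosen so that the example needs no external libraries, and the data, parameter grids and function names are all hypothetical.

```python
import math


def sj_gaussian(x, a, x0, b):
    """Equation 1: a = scaling factor, x0 = PSS (ms), b = JND (ms)."""
    return a * math.exp(-0.5 * ((x - x0) / b) ** 2)


def toj_sigmoid(x, x0, b):
    """Equation 2 as printed: x0 = PSS (ms), b = JND (ms); returns percent."""
    return 100.0 / (1.0 + math.exp((x - x0) / b))


def fit_gaussian_grid(soas, responses, a_grid, x0_grid, b_grid):
    """Least-squares fit of Equation 1 by exhaustive search over a grid."""
    best = None
    for a in a_grid:
        for x0 in x0_grid:
            for b in b_grid:
                sse = sum((sj_gaussian(x, a, x0, b) - y) ** 2
                          for x, y in zip(soas, responses))
                if best is None or sse < best[0]:
                    best = (sse, a, x0, b)
    return best[1:]


SOAS_MS = [-300, -200, -150, -100, -50, -25, 0, 25, 50, 100, 150, 200, 300]

# Synthetic, noise-free "percent simultaneous" data (not study data).
synthetic = [sj_gaussian(x, 90.0, 20.0, 120.0) for x in SOAS_MS]

a_hat, x0_hat, b_hat = fit_gaussian_grid(
    SOAS_MS, synthetic,
    a_grid=range(50, 101, 10),    # candidate scaling factors
    x0_grid=range(-50, 51, 10),   # candidate PSS values (ms)
    b_grid=range(40, 201, 20),    # candidate JND values (ms)
)
```

In both functions the curve passes through its characteristic point at x = x0: the Gaussian peaks there, and the sigmoid crosses 50%. A wider TBW corresponds to a larger b, i.e., a broader Gaussian or a shallower sigmoid.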

All participants were included for further analysis. To determine interactions between group and task, independent sample t-tests were run to determine significant differences between those with and without concussion history for each parameter for the SJ and TOJ tasks. In cases where normality was violated as assessed by the Shapiro-Wilk test, the Mann-Whitney test was employed. In cases where outliers were identified, the Bayesian independent t-test was additionally employed. To assess change of the TBW and PSS as a function of time since last concussion, the Pearson product-moment correlation coefficient was calculated. In cases where TBW or PSS data was not normally distributed, the Spearman's rank correlation coefficient was employed.
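For readers unfamiliar with the Mann-Whitney test, the sketch below computes its U statistic from rank sums for two independent groups. It is illustrative only: the TBW values are invented, the function names are assumptions, and the p-value calculation performed by a statistics package is deliberately left out.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical TBW widths in ms (not study data).
concussed_tbw = [210, 260, 300, 400]
control_tbw = [150, 170, 180, 220, 230]
u_concussed, u_control = mann_whitney_u(concussed_tbw, control_tbw)
```

The two statistics always sum to the product of the group sizes, so with 17 and 33 participants the largest possible U would be 561.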

### RESULTS

**Figure 2** shows the results for the SJ task. Both the TBW and PSS values for the SJ task failed normality, and therefore the Mann-Whitney test was employed to determine significant differences between those with and without concussion history. No significant effect was found between those with and without concussion history in terms of the PSS value in the SJ task (U(48) = 278, p = 0.484, Cohen's d = −0.021). For the TBW in the SJ task, a significant between-subject effect was found (U(48) = 193, p = 0.037, Cohen's d = −0.281).

**Figure 3** shows the results for the TOJ task. Both the TBW and PSS values for the TOJ task failed normality, and therefore the Mann-Whitney test was employed to determine significant differences between those with and without concussion history. No significant effect was found between those with and without concussion history in terms of the PSS value in the TOJ task (U(48) = 289, p = 0.572, Cohen's d = 0.109). For the TBW in the TOJ task, a significant between-subject effect was found (U(48) = 163, p = 0.008, Cohen's d = −0.588).

Note that statistical outliers were found for the TBW for both SJ and TOJ data. As Bayesian estimation for two groups can handle outliers by describing the data as heavy-tailed distributions instead of normal distributions (Kruschke, 2013), we performed Bayesian independent t-tests on the TBW values for both tasks between concussed and controls using JASP v0.8.0.1. Here Bayes Factors (BF) provide a numerical value that quantifies how well a hypothesis (H1; concussed TBW different from control TBW) predicts the data relative to a competing null hypothesis (H0; no difference in TBW values across groups), where a BF10 between 0 and 1 indicates support for H0, and a BF10 greater than 1 indicates support for H1. Despite the inclusion of outlier participants, our results show support for the alternative hypothesis (H1) that the average TBW of the concussed participants was wider than that of control participants for the TOJ task (BF10 = 1.378; default Cauchy prior width = 0.707), but not for the SJ task (BF10 = 0.422; default Cauchy prior width = 0.707).
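The reading rule for BF10 given above can be written out as a small helper. This is only an illustration of that rule applied to the two reported values; the function names are hypothetical and nothing here reproduces the JASP computation itself.

```python
def bf10_direction(bf10):
    """Return which hypothesis a Bayes factor BF10 favors."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 > 1:
        return "H1"
    if bf10 < 1:
        return "H0"
    return "neither"


def bf01(bf10):
    """Evidence for H0 relative to H1, the reciprocal of BF10."""
    return 1.0 / bf10


toj_direction = bf10_direction(1.378)   # reported BF10 for the TOJ task
sj_direction = bf10_direction(0.422)    # reported BF10 for the SJ task
sj_bf01 = bf01(0.422)
```

Taking the reciprocal makes the SJ result easier to read: a BF10 of 0.422 means the data are about 2.4 times more likely under H0 than under H1.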

Correlational analyses for the SJ and TOJ tasks, conducted using Spearman's rank correlation, are displayed in **Figure 4** below. While the correlation in the SJ task was not found to be significant (Spearman's rho = −0.316, p = 0.216), a significant negative correlation was observed between the time since last concussion and TBW width in the TOJ task (Spearman's rho = −0.65, p = 0.006). For the TOJ task, the less time between the concussion and the time of testing, the wider the TBW. Here two participants were outside the 95th percentile of the control group range for the TOJ TBW. No significant correlations were found for the PSS values in either task. No significant correlations were found for the TBW or PSS as a function of severity of concussion (average severity score: 18.12 out of 132, SD: 19.53).
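Spearman's rho is the Pearson correlation computed on ranks, which is why it tolerates non-normal data. The sketch below shows this on invented values in which TBW width tends to fall as time since concussion grows; the data and function names are hypothetical and do not come from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: months since last concussion and TOJ TBW width (ms).
months_since = [2, 6, 12, 24, 60]
tbw_ms = [400, 310, 250, 260, 180]
rho = spearman_rho(months_since, tbw_ms)
```

A negative rho, as in this toy example and in the TOJ result reported above, indicates that longer times since concussion go with narrower TBWs.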

### DISCUSSION

We investigated changes in the perceived timing of sensory events in those with concussion history, specifically the perceived simultaneity and temporal order of audiovisual stimuli. The results show that compared to a healthy control group, those with concussion history exhibit an extended TBW for audiovisual events and a relationship between the TBW width and time since concussion onset. Evidence in support of this effect was strongest for the TOJ task, though similar trends were found for the SJ task as well. These findings are therefore consistent with our hypothesis. To our knowledge, this is the first study to specifically examine the perceived timing of audiovisual events in those with concussion.

While there is currently limited evidence on the perceived timing of sensory events in those with concussion history, evidence has demonstrated impaired audiovisual integration with aging, which could help inform potential changes in those with concussion history. For example, while young and older adults have similar TBW widths for the SJ task, older adults have an extended TBW width compared to younger adults in the TOJ task (Bedard and Barnett-Cowan, 2016). Not only do this finding and those of the current study reinforce the theory that there are different underlying perceptual mechanisms for the SJ and TOJ tasks (Love et al., 2013), they also demonstrate the flexibility of the TBW size with changes to the CNS.

What might explain a widened TBW in older adults and those who have had a concussion? A widened TBW in the older adult population found in the TOJ task has been proposed to be associated with the age-related loss of the inhibitory neurotransmitter GABA (Bedard and Barnett-Cowan, 2016). GABA is responsible for modulating the excitation of cortical and thalamocortical networks that relay sensory information, such as visual and auditory stimuli (Castro-Alamancos and Connors, 1997). In an aging population, GABA has been shown to be reduced. This reduction could lead to less inhibition of irrelevant sensory stimuli and result in behavioral changes such as impaired discrimination of temporal delays (Caspary et al., 2005; Bedard and Barnett-Cowan, 2016). Loss of GABA reduces inhibitory signaling in the CNS, diminishing the ability to inhibit irrelevant stimuli and therefore impairing the discrimination of temporal delays between sensory stimuli (Caspary et al., 2005). Importantly, concussion is also known to alter the relative amounts of glutamate and GABA in the CNS (Guerriero et al., 2015). While these changes in neurotransmitter levels following concussion may not follow a similar pattern to the changes observed in the older adult population, these changes may add noise to signal processing in the CNS. With increased noise in the system, the temporal characteristics of the sensory inputs may not be clearly represented, possibly leading to a wider TBW following concussion.

In addition to the aging population, loss of GABA-producing cells has been shown following a brain injury (mild to severe), and the imbalance between GABA and glutamate results in excitotoxicity in the brain (Caspary et al., 2005; Giza and Hovda, 2014). Beyond the initial injury, however, chronic changes associated with brain injury are less well understood. Using transcranial magnetic stimulation, individuals were shown to exhibit an upregulation in GABA 9 months following concussion, which may be a long-term compensatory mechanism for the excitotoxicity immediately following the injury (De Beaumont et al., 2012b). However, upon looking at individuals 3 years following concussion, no significant differences in neural transmission were found (Tremblay et al., 2014). While these neural transmission changes are different from what is observed in the aging population, as noted above it is possible that the influx of neurotransmitters immediately following concussion and fluctuations throughout recovery may increase the noise in the CNS. With increased noise when processing sensory information, the temporal characteristics of these inputs may not be clear, leading to increased and erroneous integration of sensory stimuli and, in turn, a widened TBW. However, as GABA neurons interact with other GABA neurons, it is hard to derive any firm conclusion from global changes in GABA/glutamate levels and their association with the TBW. We therefore suggest that research should continue to investigate this potential mechanistic relationship between neural transmission and TBW width.

Figure caption (fragment): points are color coded based on number of concussions: white = 1 concussion; light gray = 2; dark gray = 1 individual with 3 concussions; black = 1 individual with ∼20 concussions.

Research has demonstrated that those with MCI and AD also exhibit an extended TBW compared to the healthy population when integrating audiovisual information (Wu et al., 2012). Interestingly, one of the long-term consequences of sustaining a concussion is the increased risk of developing MCI and earlier onset of AD. Because those with MCI and AD show sensory integration impairments similar to those with concussion history, it is possible that the impaired integration that occurs following concussion is a factor that increases the individual's susceptibility to MCI and earlier onset of AD. The relationship between the TBW and these populations is a potential research avenue that should be further explored.

Not only do the results of this study show a widened TBW in those with concussion, they also demonstrate that the size of the TBW when discriminating the temporal order of stimuli is dependent on the amount of time elapsed since the concussion was sustained. This is yet another case where individuals have returned to play or daily activities, but impairments in neurological function are still present. Research has also shown impaired neurological function following concussion in tasks related to cognitive-motor integration (Brown et al., 2015; Dalecki et al., 2016), episodic memory and response inhibition (De Beaumont et al., 2009) and when completing an audio-visual dual-tasking paradigm (Tapper et al., 2017), none of which can be detected by symptom-based assessments. These findings reinforce the need for quantitative metrics to be used by clinicians in order to make informed judgments about whether patients have fully recovered from their concussion. The TBW has been shown to be a sensitive measure for tracking impaired audiovisual integration over time in those with concussion, specifically for the TOJ task. The functional significance of an extended TBW and whether it negatively impacts performance in sport or daily activities is unknown from this research, but the TBW could eventually play a role in concussion assessment.

There are limitations to this study that should be addressed. First, with a small and unbalanced sample size, we were unable to examine relationships between the number of concussions and how this affected performance on the audiovisual tasks. Additionally, visual processing deficits are commonly found following concussion, particularly in those with multiple concussions (De Beaumont et al., 2012a). Of the 17 participants with concussion, two reported blurred vision, five reported dizziness and nine reported sensitivity to light in their SCAT 5 Symptom Evaluation Form. Therefore, a visual test should have been administered to confirm whether these individuals were capable of detecting the visual stimuli as reliably as the rest of the participants who reported no visual changes, especially given that the duration of the visual stimulus was only 17 ms. However, the two participants with the largest TBWs (**Figure 4**, right panel with TBWs of ∼400 ms) were not the same two participants who reported blurred vision or any visual changes following their concussion(s).

### CONCLUSION

Here, we have demonstrated an extended TBW in those with concussion history when discriminating the temporal order of audiovisual stimuli compared to the healthy population. Furthermore, we have established a relationship between the size of the TBW and the time elapsed since the concussion when discriminating the temporal order of audiovisual stimuli. These findings suggest that the TBW may be a sensitive marker for tracking the functional recovery of impaired perceived timing of sensory events in those with concussion, and the TOJ task in particular could be a candidate test for assessing multisensory integration following concussion. Future research should replicate this study with a larger, balanced sample while controlling for visual deficits, and should explore how audiovisual integration following concussion may impact sport performance and daily activities.

### REFERENCES


### AUTHOR CONTRIBUTIONS

AW and MB-C designed and implemented the study, analyzed the data, and wrote the article. AW collected the data.

### FUNDING

This work was supported by Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (#RGPIN-05435-2014) and a University of Waterloo Research Incentive Fund Grant to MB-C.

### ACKNOWLEDGMENTS

We thank Robert Burns, David Gonzalez, Robyn Ibey and Travis Wall for study design and participant recruitment and testing assistance.


Alzheimer's disease compared with normal age controls. J. Alzheimers Dis. 32, 317–328. doi: 10.3233/JAD-2012-111070

**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Wise and Barnett-Cowan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.