# EXPLORING THE NATURE, CONTENT, AND FREQUENCY OF INTRAPERSONAL COMMUNICATION

EDITED BY: Thomas M. Brinthaupt, Alain Morin and Małgorzata M. Puchalska-Wasyl

PUBLISHED IN: Frontiers in Psychology

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88966-271-5 DOI 10.3389/978-2-88966-271-5

# EXPLORING THE NATURE, CONTENT, AND FREQUENCY OF INTRAPERSONAL COMMUNICATION

Topic Editors:

Thomas M. Brinthaupt, Middle Tennessee State University, United States

Alain Morin, Mount Royal University, Canada

Małgorzata M. Puchalska-Wasyl, The John Paul II Catholic University of Lublin, Poland

Citation: Brinthaupt, T. M., Morin, A., Puchalska-Wasyl, M. M., eds. (2020). Exploring the Nature, Content, and Frequency of Intrapersonal Communication. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88966-271-5

# Table of Contents

*Editorial: Exploring the Nature, Content, and Frequency of Intrapersonal Communication*

Thomas M. Brinthaupt, Alain Morin and Małgorzata M. Puchalska-Wasyl

Judy L. Van Raalte, Andrew Vincent and Yani L. Dickens

*Individual Differences in Self-Talk Frequency: Social Isolation and Cognitive Disruption*

Thomas M. Brinthaupt

Charles Fernyhough, Ashley Watson, Marco Bernini, Peter Moseley and Ben Alderson-Day

*Endorsement and Constructive Criticism of an Innovative Online Reflexive Self-Talk Intervention*

Alexander T. Latinjak, Cristina Hernando-Gimeno, Luz Lorido-Méndez and James Hardy

*A Penny for Your Thoughts: Children's Inner Speech and its Neuro-Development*

Sharon Geva and Charles Fernyhough

*The ConDialInt Model: Condensation, Dialogality, and Intentionality Dimensions of Inner Speech Within a Hierarchical Predictive Control Framework*

Romain Grandchamp, Lucile Rapin, Marcela Perrone-Bertolotti, Cédric Pichat, Célise Haldin, Emilie Cousin, Jean-Philippe Lachaux, Marion Dohen, Pascal Perrier, Maëva Garnier, Monica Baciu and Hélène Lœvenbruck

*Types of Inner Dialogues and Functions of Self-Talk: Comparisons and Implications*

Piotr K. Oleś, Thomas M. Brinthaupt, Rachel Dier and Dominika Polak

# Editorial: Exploring the Nature, Content, and Frequency of Intrapersonal Communication

#### Thomas M. Brinthaupt<sup>1</sup>\*, Alain Morin<sup>2</sup> and Małgorzata M. Puchalska-Wasyl<sup>3</sup>

*<sup>1</sup> Department of Psychology, Middle Tennessee State University, Murfreesboro, TN, United States, <sup>2</sup> Department of Psychology, Mount Royal University, Calgary, AB, Canada, <sup>3</sup> Institute of Psychology, Department of Personality Psychology, The John Paul II Catholic University of Lublin, Lublin, Poland*

Keywords: self-talk, inner speech, intrapersonal communication, internal dialogue, imaginary companions, individual differences

**Editorial on the Research Topic**

#### **Exploring the Nature, Content, and Frequency of Intrapersonal Communication**

The goal of this Research Topic was to explore the myriad ways that researchers conceptualize and study the phenomenon of "talking to oneself" and associated experiences of intrapersonal communication. It is clear that people vary widely in the kinds of intrapersonal communication they experience, how frequently they engage in it, and what functions it serves. In this Research Topic, the contributors explore a range of explanations for how and why people differ in their inner speech, self-talk, or internal dialogue. Our nine contributors examine the phenomenology of intrapersonal communication, its development in childhood, personality and individual differences in the phenomenon, and its occurrence and use in sport contexts.

Variations in intrapersonal communication have been studied using multiple methods, including questionnaires, open-ended self-reports, thinking aloud protocols, imaging techniques, and descriptive experience sampling. Researchers have also started to examine ways that inner speech can be manipulated and the effects of those manipulations on thoughts, emotions, and behavior.

Interest in inner speech (covert self-communication) and private speech (self-communication that occurs aloud) has a long history. However, only recently have researchers begun in earnest to explore the wide range of features of intrapersonal communication. For example, research on various aspects of the neuroanatomy of inner speech and the development of inner speech is very active. Recent work also examines individual and personality differences in the nature, content, and frequency of intrapersonal communication. There is also substantial interest in applied work with self-talk in the domain of sport and athletic performance. This Research Topic highlights work in these areas.

# THE NEUROLOGICAL UNDERPINNINGS AND DEVELOPMENT OF INTRAPERSONAL COMMUNICATION

What is inner speech and how does it occur? With the ConDialInt Model, Grandchamp et al. present a neurocognitive predictive control framework of the condensation, dialogical, and intentionality dimensions of inner speech. They illustrate how the form and syntax of inner speech can be condensed or abbreviated, how inner speech can include monologic and dialogic forms involving the self and others, and how it can be produced both spontaneously and willfully. Through an fMRI protocol, they provide neuroanatomical evidence for the intentionality and dialogicality dimensions and how they work together to produce intrapersonal communication.

Edited and reviewed by: *Mario Dalmaso, University of Padua, Italy*

\*Correspondence: *Thomas M. Brinthaupt tom.brinthaupt@mtsu.edu*

#### Specialty section:

*This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology*

Received: *01 September 2020* Accepted: *17 September 2020* Published: *22 October 2020*

#### Citation:

*Brinthaupt TM, Morin A and Puchalska-Wasyl MM (2020) Editorial: Exploring the Nature, Content, and Frequency of Intrapersonal Communication. Front. Psychol. 11:601754. doi: 10.3389/fpsyg.2020.601754*


Geva and Fernyhough highlight the neuro-development of children's inner speech. They show how the dorsal language stream (i.e., the connection between the brain's auditory-phonological and motor systems crucial for speech production) supports the development and phenomenon of inner speech. Their review of pediatric and adult studies of the dorsal language stream supports the idea that there are parallels between the neuro-anatomical and psychological development of inner speech. This overlap suggests that the maturation of the dorsal language pathway is closely linked to the development of inner speech in childhood.

Fernyhough et al. report two studies of the relations among imaginary companions, inner speech, and auditory verbal hallucinations. Noting that imaginary companions in childhood are associated with a variety of positive developmental outcomes, they compare adults with and without a history of imaginary companions on experiences of "hearing voices" and other aspects of inner experience. The results showed that, compared to those without a history of imaginary companions, people with such a history reported more frequent auditory verbal hallucinations and higher scores on social-related inner speech. The authors propose that imaginary companions represent a hallucination-like experience that is closely linked to the development of inner speech.

# PERSONALITY AND INDIVIDUAL DIFFERENCES IN INTRAPERSONAL COMMUNICATION

Several contributors examined personality and individual differences in intrapersonal communication. For example, Brinthaupt reviews research on individual differences in self-talk frequency according to social isolation and cognitive disruption hypotheses. Individuals who report high levels of social isolation are expected to show higher levels of self-talk, and those with experiences of cognitive disruption (e.g., anomalous, upsetting, or disturbing self-related experiences) should also show increased self-talk frequency. Research provides moderate support for the social isolation hypothesis and strong support for the cognitive disruption hypothesis.

The relations between inner dialogue types and self-talk functions are the focus of Oleś et al. They define inner dialogues as intrapersonal communication characterized by different voices and mutual expressions representing self and a wide variety of others, and self-talk as self-referent or self-directed speech. Comparing two multidimensional measures of inner dialogue and self-talk among a sample of Polish and US participants, their results show a significant degree of common variance between these two modes of intrapersonal communication. They suggest that inner dialogues appear to serve contemplative or reflective functions of intrapersonal communication, whereas self-talk may serve dynamic, active processing functions.

Łysiak examines the relations between inner dialogues, self-talk, and pathological personality traits, using the DSM-5's new hybrid personality disorder system. She finds that people who report more ruminative and confronting inner dialogues also report higher levels of unusual beliefs, psychoticism, and negative affectivity (e.g., anxiety, separation insecurity). However, specific self-talk facets were unrelated to DSM-5 pathological personality traits. Łysiak suggests that inner dialogues and self-talk are complementary, relating to different aspects of intrapersonal communication.

Finally, Heavey et al. provide a more expansive view of individual differences in intrapersonal communication with the development and validation of their Nevada Inner Experience Questionnaire. The authors show how intrapersonal communication in the form of inner speech is one of several kinds of inner experience, including inner seeing, thinking without symbols, and feelings or emotional experiences. They discuss possible reasons for the relative frequency of these experiences and ways that researchers can increase participants' understanding and awareness of these experiences.

# INTRAPERSONAL COMMUNICATION IN SPORT CONTEXTS

Two contributions focused specifically on the role of self-talk in the sport domain. Van Raalte et al. explore how Dialogical Self Theory and the method of Descriptive Experience Sampling (DES) can be used to enhance our understanding of inner experience and self-talk in many sports. They argue that focusing on the dialogical aspects of athlete experiences (such as I-positions and interlocutors, power dynamics, and confrontational vs. integrative inner dialogues) opens new avenues for theory and research in sport psychology. DES can provide the tools to assess these theoretical ideas and insights into the phenomenon of athlete self-talk.

Latinjak et al. describe an innovative reflexive self-talk online intervention that targets goal-directed self-talk. During the 4-week program, the researchers encouraged participants to describe challenging scenarios in training or competition, examine how they use self-talk in those situations, determine its effectiveness, and explore alternative kinds of self-talk that they could use in the future. Results showed enhanced awareness of self-talk use and content refinements that appeared to benefit the emotions, motivation, and confidence of the participants. The authors discuss several implications for sport psychologists and other applied practitioners.

In summary, intrapersonal communication is a complex phenomenon, covering concepts such as inner and private speech, self-talk, inner dialogue, and imaginary companions. This topic is an attempt to exemplify the variety of approaches to studying this multi-faceted phenomenon. At the same time, it represents a first step toward a needed synthesis of knowledge about intrapersonal communication. Several questions remain to be explored, such as whether non-human animals engage in forms of intrapersonal communication and what the similarities and differences are between adaptive and dysfunctional intrapersonal communication. We hope that this selection of articles provides a useful jumping-off point for future intrapersonal communication theorists, researchers, and practitioners.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Brinthaupt, Morin and Puchalska-Wasyl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Measuring the Frequency of Inner-Experience Characteristics by Self-Report: The Nevada Inner Experience Questionnaire

Christopher L. Heavey, Stefanie A. Moynihan, Vincent P. Brouwers, Leiszle Lapping-Carr, Alek E. Krumm, Jason M. Kelsey, Dio K. Turner II and Russell T. Hurlburt\*

Department of Psychology, University of Nevada, Las Vegas, Las Vegas, NV, United States

#### Edited by:

Alain Morin, Mount Royal University, Canada

#### Reviewed by:

Tanya Luhrmann, Stanford University, United States

Arnaud Delorme, UMR5549 Centre de Recherche Cerveau et Cognition (CerCo), France

Glenn Carruthers, Charles Sturt University, Australia

\*Correspondence: Russell T. Hurlburt russ@unlv.nevada.edu

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 25 September 2018 Accepted: 05 December 2018 Published: 11 January 2019

#### Citation:

Heavey CL, Moynihan SA, Brouwers VP, Lapping-Carr L, Krumm AE, Kelsey JM, Turner DK II and Hurlburt RT (2019) Measuring the Frequency of Inner-Experience Characteristics by Self-Report: The Nevada Inner Experience Questionnaire. Front. Psychol. 9:2615. doi: 10.3389/fpsyg.2018.02615

Descriptive experience sampling has suggested that there are five frequently occurring phenomena of inner experience: inner speaking, inner seeing, unsymbolized thinking, feelings, and sensory awareness. Descriptive experience sampling is a labor- and skill-intensive procedure, so it would be desirable to estimate the frequency of these phenomena by questionnaire. However, appropriate questionnaires either do not exist or have substantial limitations. We therefore created the Nevada Inner Experience Questionnaire (NIEQ), with five subscales estimating the frequency of each of the frequent phenomena, and examine here its psychometric adequacy. Exploratory factor analysis produced four of the expected factors (inner speaking, inner seeing, unsymbolized thinking, feelings) but did not produce a sensory awareness factor. Confirmatory factor analysis validated the five-factor model. The correlation between an existing self-talk questionnaire (Brinthaupt's Self-Talk Scale) and the NIEQ inner speaking subscale provides one piece of concurrent validation.

Keywords: inner experience, questionnaire, descriptive experience sampling, inner speech, inner seeing, unsymbolized thinking, feelings, sensory awareness

# INTRODUCTION

The term inner experience as we will use it here refers to directly apprehended "before the footlights of consciousness" inner events such as inner speaking, visual images, and sensations. Pristine inner experience refers to inner experiences in their natural state, undisturbed by the act of apprehension, not manipulated by psychological experiment or any other specific intervention (Hurlburt and Akhter, 2006; Hurlburt, 2011).

Descriptive experience sampling (DES; Hurlburt, 1990, 1993, 2011; Hurlburt and Heavey, 2002, 2006; Hurlburt and Akhter, 2006) is an explorational method aimed at pristine inner experience. It uses a random beeper and "expositional" interviews to investigate instances of pristine inner experience. Of course, it falls short—the beep and its response requirements by definition disturb the pristine nature of the experience. Therefore, the aim of DES is to get a glimpse of pristine inner experience in as high fidelity as the current state-of-the-art allows.

The DES method has been described in detail elsewhere (Hurlburt and Heavey, 2006, 2017; Hurlburt, 2011, 2017), and its methodological adequacy has been discussed (Hurlburt and Schwitzgebel, 2007; Caracciolo and Hurlburt, 2016; all the papers in Weisberg, 2011).


Heavey and Hurlburt (2008) have said that there are five frequent phenomena (subsequently dubbed the "5FP" by Kühn et al., 2014) of inner experience: inner speaking (sometimes called "inner speech"; Hurlburt et al., 2013), inner seeing (sometimes called "visual imagery"; Hurlburt, 2011), unsymbolized thinking (a thought directly present without words, images, or other symbols; Hurlburt and Akhter, 2008a,b), feeling (the experience of emotion; Heavey et al., 2012), and sensory awareness (attending to some sensory aspect of the internal or external environment without regard for instrumentality; Hurlburt et al., 2009). Each of the five occurs in roughly a quarter or more of samples (adding to more than 1 because several features can occur simultaneously). To say something like "a characteristic occurs a quarter of the time" implies the necessity of measuring the frequency of these characteristics. Heavey and Hurlburt (2008) measured the 5FP frequencies in the scientifically standard way: they used DES to obtain random samples of inner experience, counted the number of those samples that contain the characteristic and divided by the total number of samples.
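The frequency calculation described above (count the samples that contain a characteristic, then divide by the total number of samples) can be sketched in a few lines of Python. This is a minimal illustration only: the coded samples below are invented, and the function name `phenomenon_frequencies` is a hypothetical helper, not part of any DES tooling.

```python
from collections import Counter

# Hypothetical coded samples (invented for illustration): each set lists the
# phenomena judged to be present in one randomly sampled moment.
samples = [
    {"inner speaking"},
    {"inner seeing", "feeling"},
    {"sensory awareness"},
    {"inner speaking", "unsymbolized thinking"},
    set(),  # a sample containing none of the five phenomena
    {"feeling"},
    {"inner seeing"},
    {"inner speaking", "feeling"},
]


def phenomenon_frequencies(samples):
    """Return, for each phenomenon, the proportion of samples containing it."""
    counts = Counter(p for sample in samples for p in sample)
    total = len(samples)
    return {phenomenon: n / total for phenomenon, n in counts.items()}


freqs = phenomenon_frequencies(samples)
for phenomenon, proportion in sorted(freqs.items()):
    print(f"{phenomenon}: {proportion:.3f}")
```

Because a single sample can contain several phenomena at once, the proportions are computed independently and need not sum to 1, which matches the remark above that the frequencies add to more than 1.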

Descriptive experience sampling is a labor-intensive procedure, so it would be desirable, if possible, to have a more efficient way of estimating frequency of the 5FP, such as by questionnaire. However, no such questionnaires exist. There are two questionnaires that consider the frequency of inner speech: the Self-Talk Scale (STS; Brinthaupt et al., 2009) and the Varieties of Inner Speech Questionnaire (VISQ; McCarthy-Jones and Fernyhough, 2011; and the revised version VISQ-R, Alderson-Day et al., 2018). The STS has two frequency-related limitations. First, it does not inquire directly about frequency in natural settings. Instead, the STS inquires about frequency in specific situations, by presenting the stem "I talk to myself when. . ." followed by a list of situations such as "I should have done something differently," or "I want to reinforce myself for doing well" (Brinthaupt et al., 2009, p. 88). There is no measure of how frequent those situations are and therefore no way of translating to overall natural-setting frequency. Second, it uses anchors (1 = Never, 2 = Seldom, 3 = Sometimes, 4 = Often, and 5 = Very Often) that are ambiguous: "Often" might refer to five times a day ("I often brush my teeth") or five times a year ("Hurricanes often make landfall in the US"). Despite these limitations, the STS is occasionally used as an overall frequency measure (Brinthaupt et al., 2015) by recoding the ratings from 0 to 4 instead of 1 to 5, adding them, and dividing by 64 (the maximum possible sum of the recoded scores), a procedure that assumes (with little warrant) equality of frequency across situations and across people.

The VISQ (McCarthy-Jones and Fernyhough, 2011) is a questionnaire designed to measure features of inner speech inspired by Vygotsky. Like the STS, it has two frequency-related limitations. First, instead of inquiring about frequency directly, it asks about Vygotsky-inspired characteristics of inner speech. Second, it uses ambiguous anchors (1 = Certainly does not apply to me, 2 = Possibly does not apply to me, 3 = If anything, slightly does not apply to me, 4 = If anything, applies to me slightly, 5 = Possibly applies to me, and 6 = Certainly applies to me), which are not really measures of frequency at all. Here is a typical item: "I hear the voice of another person in my head. For example, when I have done something foolish I hear my mother's voice criticizing me in my mind" (McCarthy-Jones and Fernyhough, 2011, p. 1589); there is no measure of how frequent "doing something foolish" is, and no direct way of mapping applies to me onto frequency. The recently revised version (VISQ-R, Alderson-Day et al., 2018) reduces the anchor ambiguity by using as anchors 1 = Never to 7 = All the time, but the VISQ-R remains a consideration of the characteristics of inner speech when it occurs, not a measure of its frequency of occurrence.

There are questionnaires inquiring about emotion (e.g., the Positive and Negative Affect Scale; PANAS; Watson et al., 1988), but such questionnaires typically rate the intensity of emotion, not the frequency of feelings. There are questionnaires inquiring about visual imagery (e.g., the Vividness of Visual Imagery Questionnaire; VVIQ; Marks, 1973), but those questionnaires typically rate vividness of imagery, not its frequency. There are, that we know of, no questionnaire measures at all for unsymbolized thinking or sensory awareness as DES defines them.

Many psychologists believe that inner experience is important for both theoretical and practical reasons. Using inner speech as an example, theoretically, Baddeley and Jarrold (2007) held that inner speech instances are recitations in a phonological loop designed to keep information readily at hand. Practically, inner speech is held to be important, for example, in a wide variety of sport (basketball, football, golf, tennis, cricket, cross country running, swimming, volleyball and many others) performance (Hardy, 2006; Van Raalte et al., 2014), in psychotherapy (Meichenbaum, 1977), in self-awareness and metacognition (Morin, 2005, 2011; Carruthers, 2011), and so on. However, claims about the frequency of inner speech vary widely, from "Human beings talk to themselves every moment of the waking day" (Baars, 2003, p. 106) to the 28% found by Heavey and Hurlburt (2008). Any theory about the role of inner speech in information processing, sport success, psychotherapy, and so on must account for or dismiss claims about individual differences in inner speech frequency (Hurlburt et al., 2013).

Thus, inner experience (including inner speech) is important, and the measurement of the frequency of inner experiences is a basic scientific endeavor. DES is the best method we know of for such frequency measurement; however, DES is time intensive, so it would be desirable to estimate frequencies by questionnaire. Current questionnaires, if they exist at all for inner phenomena, typically measure characteristics such as vividness rather than frequency, and their response anchors are often ambiguous.

To overcome all those limitations, we created a questionnaire (the Nevada Inner Experience Questionnaire; NIEQ) that (a) inquires about the same inner phenomena that DES frequently finds (the NIEQ has five subscales, one for each of the 5FP); (b) inquires directly about the frequency of experience, rather than its vividness, etc. (by asking "How frequently do you. . .?" and "Generally speaking, what portion of your inner experience is. . .?"); (c) inquires about frequency in the natural environment (not about a specified list of situations or a specified list of characteristics); and (d) reduces the ambiguity of anchors by using visual-analog scales (Wewers and Lowe, 1990) with anchors from Never to Always (for the "How frequently do you. . .?" questions) or from None to All (for the "Generally speaking, what portion of your inner experience is. . .?" questions). The complete NIEQ is shown in **Table 1**. The present study investigates the psychometric adequacy of the NIEQ.

TABLE 1 | The Nevada Inner Experience Questionnaire (NIEQ).

9. Generally speaking, what portion of your inner experience consists of focusing on internal or external sensory experiences, like a tickle or pain, or the color or shape of something you are seeing?

10. Generally speaking, what portion of your inner experience consists of thinking about something specific but without using any words or mental images?

# MATERIALS AND METHODS

# Participants

The participants were undergraduate subject-pool volunteers (N = 260) taking introductory psychology courses at a large urban university. It was a diverse sample: mean age = 20.6 years (SD = 4.35; range = 18–49); 28.5% male, 63.5% female, 8% did not provide gender information; 39% self-identified as white or Caucasian, 17% Hispanic, 15% African American, 15% Asian, and 8% Pacific Islander. Each received subject-pool credits for participation.

# Instruments

## The Self-Talk Scale (STS; Brinthaupt et al., 2009)

The STS is a 16-item questionnaire that uses 5-point frequency scales (1 = Never, 5 = Very Often) to ask about the frequency of self-talk in various situations. It thus produces a total score between 16 and 80. Brinthaupt et al. (2009) showed that the STS has adequate test-retest reliability [r(99) = 0.66, p < 0.001] over a 3-month period. The STS defines self-talk as including either aloud self-talk or inner speech, without differentiating the two.

## Nevada Inner Experience Questionnaire (NIEQ)

The NIEQ is a 10-item set of visual-analog scales with one pair of items (a Frequently item and a Generally item) for each of the 5FP. The scale items were written collaboratively by a team of researchers familiar with DES. One question ("How frequently. . .?") was aimed at the participant's perception of how frequently they experience the phenomenon without regard for any other phenomena, whereas the other question ("Generally speaking, what portion. . .?") used softer language to evoke the participant's perception of how frequently they experience the phenomenon, with an appreciation for time spent engaged in other phenomena. Thus, the two items of each pair were designed to ask basically the same question in two different ways. For example, the two inner speech items are "How frequently do you talk to yourself in your inner voice?" rated on a visual analog scale from Never to Always; and "Generally speaking, what portion of your inner experience is in inner speech (thinking in words)?" rated on a visual analog scale from None to All. The complete NIEQ questionnaire is shown in **Table 1**. The visual analog scales were treated as running from 0 to 100. Measurement was double-entry (Barchard and Pace, 2011): two raters independently measured each rating (for example, the "sample" mark in **Table 1** would be measured as 78). The correlation between raters was >0.99 for each item. Where the two raters' measurements differed by 3 or more, two independent judges resolved the discrepancy. The rating for each item was entered as the average of the two raters' measurements. Ratings for each item pair were averaged to produce subscale scores for the frequencies of inner speaking (averaging items 1 and 6), inner seeing (items 2 and 7), unsymbolized thinking (items 5 and 10), feelings (items 3 and 8), and sensory awareness (items 4 and 9).
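The scoring steps just described can be sketched as follows. This is an illustrative reading of the procedure, not the authors' actual scoring script: the rater measurements are invented, the function names are hypothetical, and the sketch only flags items whose two measurements differ by 3 or more (the resolution by independent judges is not modeled, so flagged items are still simply averaged here).

```python
# Item pairs per subscale, as stated in the text above.
SUBSCALES = {
    "inner speaking": (1, 6),
    "inner seeing": (2, 7),
    "feelings": (3, 8),
    "sensory awareness": (4, 9),
    "unsymbolized thinking": (5, 10),
}
DISCREPANCY_THRESHOLD = 3  # differences of 3 or more went to independent judges


def combine_raters(rater_a, rater_b):
    """Average two raters' 0-100 measurements and flag large discrepancies."""
    combined, flagged = {}, []
    for item in sorted(rater_a):
        a, b = rater_a[item], rater_b[item]
        if abs(a - b) >= DISCREPANCY_THRESHOLD:
            flagged.append(item)  # would be resolved by judges in the study
        combined[item] = (a + b) / 2
    return combined, flagged


def subscale_scores(item_ratings):
    """Average the Frequently and Generally items of each subscale."""
    return {
        name: (item_ratings[i] + item_ratings[j]) / 2
        for name, (i, j) in SUBSCALES.items()
    }


# Hypothetical measurements for one participant (invented for illustration).
rater_a = {1: 78, 2: 40, 3: 55, 4: 30, 5: 20, 6: 70, 7: 44, 8: 61, 9: 25, 10: 18}
rater_b = {1: 78, 2: 41, 3: 55, 4: 34, 5: 20, 6: 71, 7: 44, 8: 60, 9: 25, 10: 18}

ratings, flagged = combine_raters(rater_a, rater_b)
scores = subscale_scores(ratings)
print(flagged)  # items needing resolution
print(scores)
```

Keeping the item-to-subscale mapping in a single dictionary makes the pairing explicit and easy to check against the description above.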

TABLE 2 | NIEQ item and scale means (and standard deviations), percentages<sup>a</sup>, and STS score and percentage.


ISpeaking, inner speaking; ISeeing, inner seeing; UnsTh, unsymbolized thinking; SensAw, sensory awareness. <sup>a</sup>N = 260. <sup>b</sup>Derived from the STS Score following Brinthaupt et al. (2015): STS percentage = 100 × (STS total – 16)/64. <sup>c</sup>Participants' responses on each NIEQ item ranged from 0 to 100% except Feeling/Frequently (range 9-100%) and Unsymbolized/Generally (range 0-98%). <sup>d</sup>Scale score = average of Frequently item and Generally item.

#### TABLE 3 | NIEQ item correlations<sup>a</sup>.



ISpeaking, inner speaking; ISeeing, inner seeing; UnsTh, unsymbolized thinking; SensAw, sensory awareness. <sup>a</sup>df = 258.

#### A Demographic Form

Designed for this study, the form asked participants to provide name, preferred phone number, age, race/ethnicity, sex, marital status, education level, and employment.

## Procedure

After obtaining informed consent, participants were administered the STS, NIEQ, and the demographic form. This took approximately 20 min.

# RESULTS

The NIEQ item and scale means (as percentages) and standard deviations are shown in **Table 2**. As expected, within each phenomenon (inner speaking, inner seeing, etc.), the Frequently and Generally item pairs had similar means (with the possible exception of sensory awareness). For example, the ISpeaking subscale suggests that our participants believed that inner speaking occurred on average 68.3% of the time.

**Table 2** also shows the mean STS Score for our participants, as well as the STS percentage, an estimate derived (following Brinthaupt et al., 2015) from the STS Score by recoding the anchors from 0 to 4 (instead of 1 to 5), adding the new item codes, and dividing by 64 (the number of items × 4, the maximum score for each item). Thus, on the STS our participants reported self-talk (including both inner speech and external self-speech) as occurring in 67.2% of potential situations, a value very close to their NIEQ inner-speaking percentage (68.3% of the time).
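As a worked illustration of the recoding just described, the following minimal Python sketch converts a set of STS responses to a percentage. The function name `sts_percentage` and the example responses are hypothetical; the 16-item count follows from the divisor of 64 (items × 4) given in the text.

```python
STS_ITEM_COUNT = 16   # 64 / 4, per the derivation in the text
MAX_RECODED = 4       # anchors recoded from 1-5 to 0-4


def sts_percentage(responses):
    """Convert 16 STS responses on the 1-5 scale to a 0-100 percentage."""
    if len(responses) != STS_ITEM_COUNT:
        raise ValueError("expected 16 item responses")
    if any(not 1 <= r <= 5 for r in responses):
        raise ValueError("responses must lie on the 1-5 scale")
    recoded = [r - 1 for r in responses]  # shift anchors to 0-4
    return 100 * sum(recoded) / (STS_ITEM_COUNT * MAX_RECODED)


# Hypothetical participant: eleven 4s and five 3s (STS total = 59).
print(sts_percentage([4] * 11 + [3] * 5))
```

This is algebraically the same as 100 × (STS total – 16)/64, so a hypothetical STS total of 59 corresponds to roughly 67.2%.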

The NIEQ item correlations are shown in **Table 3**. As expected, within each phenomenon (inner speaking, inner seeing, etc.), the Frequently and Generally item pairs correlated fairly strongly with each other (see main diagonal) and the off-pair item correlations were relatively low (with some exceptions, mostly involving sensory awareness).

Because there is no existing factor model of the NIEQ, we include the results of an exploratory factor analysis in **Table 4**, which shows the Varimax rotated factor components when the eigenvalues are constrained to be greater than 1. Factors emerge as expected (highest loading on the pair of Frequently and Generally items), so the respective factors are easily named Inner Speaking, Inner Seeing, Unsymbolized Thinking, and Feeling.

TABLE 4 | Varimax rotated factor components of the NIEQ (eigenvalues > 1).


ISpeaking, inner speaking; ISeeing, inner seeing; UnsTh, unsymbolized thinking; SensAw, sensory awareness.

TABLE 5 | Goodness of fit statistics for NIEQ confirmatory factor analysis (robust solutions for one- and five-factor models).


CFI, Comparative Fit Index; RMSEA, Root Mean Square Error of Approximation; AIC, Akaike's Information Criterion; S-B χ<sup>2</sup>, Satorra-Bentler scaled chi-square statistic.

A sensory awareness factor did not emerge; the sensory awareness items loaded on all the factors.

Because the test construction was designed around a five-factor model, we used EQS (Bentler, 2008) to conduct two confirmatory factor analyses of the NIEQ, first assuming one factor (to determine whether the NIEQ represents a general inner experience factor) and then five factors (to determine whether the NIEQ reflects the five 5FP factors as designed). **Table 5** presents the confirmatory factor analysis goodness of fit statistics. Because Mardia's coefficient for the analysis was 21.38 (that is, greater than 5.00; Bentler, 2008), the data violated assumptions of normality, so robust fit statistics are displayed. The first row of **Table 5** shows that the one-factor analysis did not meet the CFI > 0.90 (Bentler, 1990) and RMSEA < 0.08 (Steiger and Lind, 1980) criteria for good fit. However, the second row shows that the five-factor model provided a much better fit (AIC = −4.655) than did the one-factor model (AIC = 99.332); the Comparative Fit Index was 0.939 and the Root Mean Square Error of Approximation was 0.056.

TABLE 6 | Coefficient alpha (on main diagonal), intercorrelations<sup>a</sup> of NIEQ subscales, and subscale correlation with the STS.


ISpeaking, inner speaking; ISeeing, inner seeing; UnsTh, unsymbolized thinking; SensAw, sensory awareness. <sup>a</sup>df = 258.

The confirmatory factor analysis results for the five-factor model are illustrated in **Figure 1**. The items typically loaded as expected: one factor was composed primarily of the inner speaking Frequently and Generally items; another primarily of the inner seeing Frequently and Generally items; and so on for each of the five factors. The weakest factor loadings (0.43 and 0.48) and strongest between-factor correlations (e.g., 0.91 with inner seeing) involved sensory awareness. Thus, the five-factor model largely (with the possible exception of sensory awareness) supports the structural validity of the NIEQ.

**Table 6** shows on the main diagonal coefficient alpha for each of the five NIEQ subscales; these are acceptably high for two-item scales (between 0.50 and 0.66) except for sensory awareness (0.34). The subscale intercorrelations are shown off the diagonal. Again except for sensory awareness, these are, as is desirable, relatively low.
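For readers unfamiliar with coefficient alpha, the following Python sketch applies the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of totals), to a two-item scale. It is an illustration under stated assumptions rather than the analysis reported here: the function name `cronbach_alpha` and the small data set are hypothetical, and the software actually used to compute the values in Table 6 is not specified in the text.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a scale.

    items is a list of k lists, one per item, each holding that item's
    scores for the same participants in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per participant across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the individual item (sample) variances.
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical Frequently and Generally ratings for four participants.
frequently = [1, 2, 3, 4]
generally = [2, 1, 4, 3]
print(cronbach_alpha([frequently, generally]))
```

With only two items, alpha depends heavily on the single inter-item correlation, which is one reason values for two-item scales tend to be modest.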

**Table 6** also shows the relatively high correlation (0.52) between the NIEQ-ISpeaking subscale and the STS percentage.

# DISCUSSION

The NIEQ was designed to measure directly by questionnaire the five frequent phenomena (5FP) of inner experience identified by DES studies. Psychometric evaluation showed that the NIEQ behaved as it was designed: confirmatory factor analysis showed that the five-factor model was a good fit for the NIEQ items and that the items loaded in the expected way (with the possible exception of sensory awareness).

To situate the NIEQ in the context of other questionnaires, we investigated the relationship of the NIEQ-ISpeaking subscale with the STS (Brinthaupt et al., 2009), a questionnaire that has been used to estimate the frequency of self-talk. We found very similar percentages between the NIEQ-ISpeaking subscale average and the STS frequency average (68.3% vs. 67.2%) across our 260 participants; the confidence interval for the difference between the NIEQ-ISpeaking subscale and the STS included zero. [Our STS percentage was somewhat higher than the 58.6% STS percentage reported by Brinthaupt et al. (2009) and the 53.9% reported by Brinthaupt and Kang (2014); we have no explanation for this other than the samples were from different universities.] Furthermore, we found, as expected, a relatively high correlation (0.52) between the NIEQ-ISpeaking subscale and the STS. The correlation should not be expected to be higher because (a) whereas the NIEQ-ISpeaking and the STS have substantial overlap (both measure inner speaking), their aims are not identical (the STS, unlike the NIEQ, also includes aloud self-talk, and the STS measures frequency in defined situations, rather than in the natural environment); and (b) there are only two NIEQ-ISpeaking items.

It would be desirable to subject the other NIEQ subscales to similar concurrent validity analysis. We did not do so because, as we have seen, such questionnaires either do not measure frequency (for imagery and feelings) or do not exist (for unsymbolized thinking and sensory awareness).

The NIEQ-SensAw subscale had lower within-scale (Frequently vs. Generally) correlation and higher between-subscale correlations than the other NIEQ subscales. We offer two potential explanations. First, sensory awareness, as DES defines it, involves a variety of sensations of both the external environment (color, smell, shape, etc.) and inner environment (tickle, soreness, stomach ache, etc.). However, the NIEQ SensAw Frequently item ("How frequently do you pay attention to the colors, smells, or sounds or your environment?") inquires only about the external world, whereas the NIEQ SensAw Generally item ("Generally speaking, what portion of your inner experience consists of focusing on internal or external sensory experiences, like a tickle or pain, or the color or shape of something you are seeing?") inquires about both the inner and the external world. That difference in focus might lower the between-item correlation, even though the two items together may do a better job of measuring sensory awareness as conceptualized in the 5FP than would either item alone.

Second, the concept of sensory awareness does intersect with the other 5FP. For example, feelings can importantly involve sensations (e.g., of a teary eye or a heavy heart); inner seeing may involve a specific sensory focus (e.g., on the color of what is imaginarily seen). Thus, it may be a desirable feature (not a weakness) of the NIEQ to demonstrate the correlation of sensory awareness with other aspects. Further research, including the sampling of experience in the natural environment, is required to tease apart possibilities.

We can compare our results to those derived from Lapping-Carr (unpublished), which administered the NIEQ as part of a larger study. Those participants (N = 60) responded to a Qualtrics version of the NIEQ where they used the mouse to click the NIEQ visual analog scales. **Table 7** shows that the results of performing the Varimax-rotated four-factor exploratory factor analysis on Lapping-Carr's unpublished data are very similar to our own results shown in **Table 4**: Factors emerged as expected (highest loading on the pair of Frequently and Generally items) for Inner Speaking, Inner Seeing, Unsymbolized Thinking, and Feeling, but a sensory awareness factor did not emerge; the sensory awareness items loaded on all the factors. That is, the psychometric conclusions we drew from our own study are consonant with the Lapping-Carr (unpublished) NIEQ data.

TABLE 7 | Varimax rotated factor components derived from Lapping-Carr (unpublished).


ISpeaking, inner speaking; ISeeing, inner seeing; UnsTh, unsymbolized thinking; SensAw, sensory awareness.

Thus, overall we conclude that by the usual psychometric standards, the NIEQ measures the 5FP with consistent estimated frequencies and reliabilities. However, the inner experience frequencies shown in **Table 2** (which ranged from 38 to 74%) are substantially higher than those reported by Heavey and Hurlburt (2008, p. 6) using DES: inner speech = 26%, inner seeing = 34%, unsymbolized thinking = 22%, feeling = 26%, and sensory awareness = 22%. These discrepancies might be due to the fact that the NIEQ, like the STS, VISQ, and other questionnaires, measures participants' self-reports about inner experience rather than attempting to sample experience itself (Hurlburt et al., 2013). Without training and practice, participants may not have an adequate understanding of their own inner experience, so self-reports (including with the NIEQ) might be expected to over-estimate general experiential frequencies as measured by DES (Hurlburt and Heavey, 2015). We would value studies that seek to measure experience more directly, such as in the experience sampling studies by Brinthaupt et al. (2015) and in DES studies. Now that the NIEQ has been validated as a psychometric instrument, a direct comparison of NIEQ and DES results using the same participants would be desirable.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the UNLV Human Subjects Research Policy of the UNLV Office of Research Integrity, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the UNLV Social/Behavioral Sciences Institutional Review Board.

# AUTHOR CONTRIBUTIONS

CH, VB, JK, DT, and RH: planning. CH, VB, LL-C, JK, DT, and RH: NIEQ design and data collection. SM and RH: analyses. CH, SM, LL-C, AK, and RH: writing.

# REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Heavey, Moynihan, Brouwers, Lapping-Carr, Krumm, Kelsey, Turner and Hurlburt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Dialogical Consciousness and Descriptive Experience Sampling: Implications for the Study of Intrapersonal Communication in Sport

#### Judy L. Van Raalte<sup>1,2</sup>\*, Andrew Vincent<sup>3</sup> and Yani L. Dickens<sup>4</sup>

*<sup>1</sup> Department of Psychology, Springfield College, Springfield, MA, United States, <sup>2</sup> College of Health Sciences, Wuhan Sports University, Wuhan, China, <sup>3</sup> Counseling Center, SUNY Oneonta, Oneonta, NY, United States, <sup>4</sup> Counseling Services, University of Nevada, Reno, NV, United States*

Keywords: athlete, inner experience, open-beginninged methods, presupposition, self-talk

#### Edited by:

*Thomas M. Brinthaupt, Middle Tennessee State University, United States*

#### Reviewed by:

*Hubert Hermans, Radboud University Nijmegen, Netherlands James Hardy, Bangor University, United Kingdom*

#### \*Correspondence:

*Judy L. Van Raalte jvanraal@springfieldcollege.edu*

#### Specialty section:

*This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology*

Received: *13 January 2019* Accepted: *08 March 2019* Published: *03 April 2019*

#### Citation:

*Van Raalte JL, Vincent A and Dickens YL (2019) Dialogical Consciousness and Descriptive Experience Sampling: Implications for the Study of Intrapersonal Communication in Sport. Front. Psychol. 10:653. doi: 10.3389/fpsyg.2019.00653*

Inner experience and intrapersonal communication research in sport psychology has been largely dominated by a focus on self-talk, which has typically been examined using retrospective self-report measures. Although the existing self-talk literature has addressed aspects of athletes' inner experience, attempts to extend the theoretical scope of intrapersonal communication in sport have been limited by an adherence to linear, causal models of self-talk, as well as by methodological challenges associated with assessing inner experience. The purpose of this paper is to present theoretical and methodological approaches that can be used for further understanding of intrapersonal communication and inner experience in sport. The paper begins with a brief history of sport self-talk theory and research. Next, a discussion of dialogical self (Hermans et al., 1992; Hermans and Hermans-Konopka, 2010) and dialogical consciousness (Larrain and Haye, 2012; Haye and Larrain, 2013) as they relate to sport self-talk theory is presented. Descriptive Experience Sampling (DES), a promising method for exploring inner experience and self-talk in sport, is described. We conclude with suggestions related to integrating dialogical theories and DES into the study of intrapersonal communication in sport.

# HISTORY OF SELF-TALK IN SPORT PSYCHOLOGY

Examining the origins and history of self-talk research in sport psychology provides important insight into strengths and limitations of the literature. Early sport psychology self-talk research primarily involved linear experimental designs that assessed the effects of assigned self-talk on laboratory-based motor learning and motor performance tasks (Landers, 1995). These experimental approaches required self-talk phrases to be categorized, so that hypotheses about how types of self-talk affect learning and performance could be tested. Although linear, causal theories can provide insight related to the effects of self-talk on certain tasks, it is not possible to answer questions such as "How do athletes experience their own self-talk?" "What is the purpose of self-talk in sport?" and "How does self-talk work?" through categorization and experimental testing alone.

The self-talk literature was subsequently shaped by cognitive and cognitive behavioral theories (CBT) of Ellis (1957) and Beck (1975), which focused on self-talk as emblematic of deeply held "core beliefs" related to self-esteem, confidence, self-concept, and self-efficacy. Although cognitive behavioral paradigms advanced the application of mental skills interventions, such theoretical approaches were limited by their conceptualization of the self as autonomous, unitary, and self-contained (Hermans and Hermans-Konopka, 2010). For instance, the assumption that an athlete's critical self-statement reflects low self-esteem leaves little room for the experience of inner conflict (an athlete who oscillates between positive and negative self-concept) or self-talk that echoes the voice of some important other (an athlete hearing a coach saying "that's not good enough" in their head). Researchers who consider self-talk in a broader paradigmatic context and apply methods that circumvent the limitations of retrospective self-report may inspire new inquiry and advance understanding of inner experience and self-talk.

# EXPANDING THEORY IN INTRAPERSONAL COMMUNICATION: DIALOGICAL CONSCIOUSNESS

Theories from discursive psychology, especially ideas about dialogical self (Hermans et al., 1992; Hermans and Hermans-Konopka, 2010) and dialogical consciousness (Larrain and Haye, 2012; Haye and Larrain, 2013), provide alternative perspectives with potential for expanding current theory, research, and practice in sport psychology. Dialogical theories of self are based on philosophical assumptions of constructivism, which view the self as multifaceted, contextual, and created through interaction with the social world (Hermans et al., 1992; Hermans and Hermans-Konopka, 2010). Perhaps the most notable feature of theories of dialogical consciousness is that key aspects of inner experience are viewed as taking place in the form of a dynamic conversation that is polyphonic, consisting of many "voices" (Hermans et al., 1992; Larrain and Haye, 2012). These voices, which can be based in language, emotion, or other forms of experience, reflect different viewpoints, perspectives, or positions that might occur to a person (Puchalska-Wasyl, 2016). For example, these voices might take the form of internalized I-positions that reflect different versions of self (e.g., ideal self, undesired self, real self), internalized interlocutors who represent external figures such as a coach, a close friend, or a therapist, or norms or rules that have been internalized from culture and society (Hermans and Hermans-Konopka, 2010; Puchalska-Wasyl, 2016).

Ideas pertaining to dialogical consciousness were introduced to the sport psychology self-talk literature via the sport-specific model of self-talk, which raises questions pertaining to inner discourse such as "If we already know everything we know, then why do we talk to ourselves?" and "What are we doing when we engage in self-talk?" (Van Raalte et al., 2016, pp. 140–141). Although answers to these questions cannot be understood using linear, causal models that focus on self-talk categorization, they can be addressed through the lens of dialogical self whereby intrapersonal communication is not about messages being sent and received by a singular self but rather a conversation between internalized positions taking place in the society of the mind (Hermans and Hermans-Konopka, 2010). For instance, an athlete who misses a pass may have self-talk such as "not good enough, you have to make that play" and "no worries, you can do it." If we focus solely on the content, we lose a chance to gain understanding of that athlete's internal world where the first statement may reflect the internalized voice of a critical coach or parent, and the latter may reflect the internalized voice of a mentor or a fan.

Understanding intrapersonal communication in this way opens additional avenues for research, some of which are currently under study in the area of dialogical consciousness but missing from sport psychology. For instance, Hermans (2003) has discussed the importance of power differential between I-positions and interlocutors, suggesting that certain voices are likely to be more influential in consciousness by being more dominant in internal dialogue. In sport psychology, practitioners and researchers would benefit from better understanding which internal voices are dominant and passive and how intentionally used self-talk interacts with athletes' dominant and passive internal voices and performance.

Integrative and confrontational dialogue types present a second avenue for exploration. Integrative internal dialogues move toward synthesis and solution between internalized voices as existing positions come together as part of the construction of a new position, whereas confrontational internal dialogues accentuate difference and result in cognitive dissonance (Hermans and Hermans-Konopka, 2010; Puchalska-Wasyl, 2016, 2017). Intrapersonal communication that takes place between an athlete's inner critic and inner fan could serve as an example of this. In a confrontational dialogue, one position becomes dominant while the other is silenced; this might result in self-talk such as "ignore that positive talk, you are playing like garbage." Oppositely, an integrative dialogue would move toward a position that includes both "inner critic" and "inner fan" and may result in self-talk such as "you can finish this game strong, but let's work on that in practice next week." Exploring the extent to which integrative and confrontational dialogues occur for athletes and the ways these different types of dialogues shape athlete experiences could prove useful in understanding intrapersonal communication in sport, especially given the nature of existing applied interventions such as thought stopping and thought replacement, which employ confrontational approaches designed to silence unwanted voices in internal dialogue (Hardy and Oliver, 2014).

The connection between self, culture, and social context is a key feature of dialogical self theory, as internal dialogue is seen as a reflection of both individual experience and larger cultural forces (Hermans, 2003; Hermans and Hermans-Konopka, 2010). Viewing internal dialogue as being inextricably interconnected with the social context has important implications for self-talk in sport and could provide several avenues for future study. For instance, a given internalized position may be an internalization of a prominent cultural narrative or, in the case of sport, some aspect of team culture. This connection between social context, culture, and the internal world of an athlete stands in contrast to traditional causal, linear, category-focused, information-processing views of sport self-talk and provides a theoretical lens through which cultural differences in self-talk can be understood. Integrating theories of dialogical consciousness into existing theories of intrapersonal communication in sport can also direct applied and research attention to racism, sexism, and other oppressive forces that may be manifested as voices that play out in the internal dialogue of athlete consciousness. One of the major challenges associated with these dialogical concepts pertains to their assessment. Standardized self-report questionnaires are limited in capturing athletes' experiences related to dialogical processes.

# EXPLORING INNER EXPERIENCE: THE DESCRIPTIVE EXPERIENCE SAMPLING (DES) METHOD

Self-talk research in sport has been constrained by the ways self-talk is studied (Hardy and Jones, 1994; Brinthaupt et al., 2015). Self-report questionnaires have traditionally served as primary sources of self-talk data, despite concerns about their validity (Van Raalte et al., 2014; Van Raalte and Vincent, 2017; Thibodeaux and Winsler, 2018), extensive evidence that these and other retrospective observations are unreliable (e.g., Brewer et al., 1991; Wells and Loftus, 2003), and the fact that recalling inner events is problematic (Hurlburt and Melancon, 1987; Koriat and Bjork, 2005). Approaches that improve upon existing methods have occasionally been used in sport and exercise psychology research, such as think-aloud methods (Fuhrer, 1985; McPherson, 1999; Whitehead et al., 2015), Ecological Momentary Assessment (EMA; Biddle et al., 2009), and the Experience Sampling Method (ESM; Cerin et al., 2001). Although each of these methods samples inner experience during sport performance, each has limitations (Dickens et al., 2018). One method that overcomes many of these shortcomings and is well-suited to the exploration of the dialogical self, dialogical consciousness, and the discursive nature of athletes' inner experiences is DES.

DES is a method that uses a random beeper to sample "pristine" inner experience directly and contemporaneously, circumventing many of the limitations of self-report measures and retrospection. DES is "open-beginninged," open-ended, and uses focused non-leading questions like "what was your inner experience, if any, at the moment of the beep" to direct participants to real-time, momentary experience. Whereas standardized questionnaires, EMA, and ESM are often influenced by the theory of inner experience that they are designed to measure, DES brackets presuppositions to prevent experimenter expectancies from contaminating observed inner experience. DES also offers several methodological improvements that yield high fidelity samples of inner experience. For example, DES includes collections of random representative samples; intensive training to help participants observe and report inner experience; and extensive collaboration with participants around investigating their inner experience through video-recorded interviews within 24 h of sample collection. DES studies have shown high inter-observer reliability (Hurlburt and Heavey, 2002), DES has been validated with Functional Magnetic Resonance Imaging (fMRI) (Kühn et al., 2014), and DES has been shown to be feasible during sport performance (Dickens et al., 2018). The major cost of implementing DES is the quantity-for-quality tradeoff. DES is labor-intensive, requiring 5–10 h of interview time per participant (Hurlburt and Akhter, 2006; McKelvie, 2019).

DES researchers suggest that DES advances understanding of actual momentary inner experience, often yielding unique contributions. For instance, although many have presumed that self-talk is pervasive, if not ubiquitous, in activities such as silent reading or sport performance, DES research has shown that inner experience typically consists of five frequent phenomena (5FP) (Kühn et al., 2014) including inner speaking, inner seeing, sensory awareness, feeling, and unsymbolized thinking. Inner speaking is self-talk spoken silently to oneself, inner seeing is visual imagery, sensory awareness includes bodily sensation (e.g., pain, tension, hunger), and feeling is emotion (e.g., anxiety, anger, joy). Unsymbolized thinking is a seldom recognized but explicit thought process that takes place without the presence of words or images (see Hurlburt and Akhter, 2008) and occurs about as frequently as the other, better-known 5FP (Lapping-Carr and Heavey, 2017). DES research suggests that inner experience is idiosyncratic since inner experiences outside of the 5FP can and do occur, including being in a flow state and completely absorbed in an activity (Lapping-Carr and Heavey, 2017) and having no inner experience occurring at the moment of the beep (Hurlburt and Schwitzgebel, 2011). In a sport context, Dickens et al. (2018) found that inner experience during golf performance included all 5FP; that speaking aloud and inner speaking both occurred during golf; that self-talk was a frequent but not the predominant inner experience; that inner-speaking self-talk was 6 times as frequent as speaking-aloud self-talk; and that effortful, intentional use of self-talk (i.e., System 2 self-talk) was rare. Also, some participants experienced no self-talk, and one participant reported no inner experience in over half of their samples, illustrating the idiosyncratic nature of inner experience during sport performance.

# FUTURE DIRECTIONS

Taken together, theories of dialogical self, dialogical consciousness, and DES challenge assumptions and inspire new theorizing and research in the area of intrapersonal communication and inner experience in sport. Considering athlete experience as dialogical allows us to move beyond CBT cause-effect paradigms that focus on categorization of self-talk and explore possible theories related to I-positions and interlocutors, power dynamics, and confrontational vs. integrative inner-dialogue types. DES provides the tools necessary for precise empirical assessment of these theoretical ideas and can provide insights related to self-talk. Indeed, DES research has already shown that self-talk is a less prevalent aspect of inner experience than previously suggested in the sport psychology literature (Dickens et al., 2018). Together, theories of dialogical self, dialogical consciousness, and DES have the potential to advance theoretical and practical knowledge by validating previous findings and/or uncovering new findings.

# AUTHOR CONTRIBUTIONS

All three authors developed and contributed to this work. AV developed ideas related to dialogical self and dialogical consciousness. YD developed ideas related to Descriptive Experience Sampling (DES).

# ACKNOWLEDGMENTS

The authors would like to acknowledge the theoretical, conceptual, and emotional support provided by Jessica Younger Dickens, Russell T. Hurlburt, and Rebecca Vincent. We would also like to thank James Harnsberger and the Springfield College Office of Academic Affairs for financial support and the Springfield College Athletic Counseling Research Team whose feedback facilitated completion of the manuscript in a timely manner.

# REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Van Raalte, Vincent and Dickens. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Individual Differences in Self-Talk Frequency: Social Isolation and Cognitive Disruption

#### *Thomas M. Brinthaupt\**

*Middle Tennessee State University, Murfreesboro, TN, United States*

Despite the popularity of research on intrapersonal communication across many disciplines, there has been little attention devoted to the factors that might account for individual differences in talking to oneself. In this paper, I explore two possible explanations for why people might differ in the frequency of their self-talk. According to the "social isolation" hypothesis, spending more time alone or having socially isolating experiences will be associated with increased self-talk. According to the "cognitive disruption" hypothesis, having self-related experiences that are cognitively disruptive will be associated with increased self-talk frequency. Several studies using the Self-Talk Scale are pertinent to these hypotheses. The results indicate good support for the social isolation hypothesis and strong support for the cognitive disruption hypothesis. I conclude the paper with a wide range of implications for future research on individual differences in self-talk and other kinds of intrapersonal communication.

#### *Edited by:*

*Stefan Berti, Johannes Gutenberg University Mainz, Germany*

#### *Reviewed by:*

*James Hardy, Bangor University, United Kingdom*

*Judy Van Raalte, Springfield College, United States*

*\*Correspondence: Thomas M. Brinthaupt, tom.brinthaupt@mtsu.edu*

#### *Specialty section:*

*This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology*

*Received: 08 February 2019 Accepted: 25 April 2019 Published: 10 May 2019*

#### *Citation:*

*Brinthaupt TM (2019) Individual Differences in Self-Talk Frequency: Social Isolation and Cognitive Disruption. Front. Psychol. 10:1088. doi: 10.3389/fpsyg.2019.01088*

Keywords: self-talk, intrapersonal communication, self-talk scale, social isolation, cognitive disruption

Several researchers have studied individual differences in the frequency of intrapersonal communication (e.g., Honeycutt, 2010; Morin et al., 2011; Hurlburt et al., 2013; Ren et al., 2016). It is clear that people differ in how often they typically talk to themselves. What is less clear are the factors that might account for such individual differences in intrapersonal communication. Considering these factors is likely to have implications for a wide range of research and practice domains. For example, cognitive-behavioral interventions (e.g., Hollon and Beck, 2013) may be more (or less) effective for frequent compared to infrequent self-talkers. Sport psychologists who are interested in enhancing athletic performance through self-talk manipulations (e.g., Hatzigeorgiadis, 2006) might improve their efforts by taking into account individual differences in self-talk frequency. Educational practices that utilize self-talk as a self-regulatory tool (e.g., Deniz, 2009) could be adjusted based on how frequently or infrequently students talk to themselves.

In this paper, I examine two potential sources of individual differences in self-talk frequency. These sources focus on the potential interpersonal aspects of self-talk (i.e., how different kinds of social experiences might relate to its frequency) and how a variety of intrapersonal events (such as cognitive, perceptual, and sensory experiences) might relate to self-talk frequency. First, I define self-talk as a category of intrapersonal communication and examine the various self-regulatory functions that it serves. Next, I review the characteristics and research examining the psychometric properties of the Self-Talk Scale (STS; Brinthaupt et al., 2009), a measure designed to assess self-talk frequency. In the next sections of the paper, I examine the findings that are pertinent to the "social isolation" and "cognitive disruption" hypotheses of individual differences in self-talk. I conclude the paper with recommendations for how to test these hypotheses further using the STS and related measures. Implications for future research on self-talk and intrapersonal communication frequency are also presented.

# SELF-TALK AND OTHER KINDS OF INTRAPERSONAL COMMUNICATION

As the current Research Topic contributors and others illustrate, the research literature on intrapersonal communication is alive and well. Among the varieties of this kind of communication are silent self-talk (inner speech; McCarthy-Jones and Fernyhough, 2011), out loud self-talk (private speech; Duncan and Cheyne, 1999), internal dialogues (Hermans, 1996), auditory imagery (MacKay, 1992), and self-statements (Kendall et al., 1989). Researchers and reviewers have identified a wide range of possible functions served by self-talk (Langland-Hassan and Vicente, 2018). For example, psychologists propose that self-talk plays a role in inhibiting impulses, guiding courses of actions, and monitoring goal progress (Mischel et al., 1996). Self-talk has also been conceived of as a "meta-monitoring" of behavior and goal progression that can affect emotional reactions and responses to behavioral deficits (Carver and Scheier, 1998).

Sport psychologists highlight the importance of instructional (e.g., giving directions) and motivational (e.g., psyching oneself up) self-talk as well as other kinds of intrapersonal communication with respect to sport or athletic performance (Hatzigeorgiadis et al., 2011; Latinjak et al., 2019). Clinical psychologists have long been interested in the content of self-talk, particularly whether it is positive or negative (Kendall et al., 1989) and whether what one says to oneself is maladaptive or dysfunctional (e.g., Ellis, 1962; Beck, 1976). Others (e.g., Fernyhough, 2016; Van Raalte et al., 2016) further differentiate between condensed/automatic and expanded/elaborated self-talk.

Fernyhough's (2016) summary nicely captures many of the everyday self-regulatory functions served by self-talk: "[Self-talk] can help us to plan what we are about to do and to regulate a course of action once it has started; it can give us a boost in keeping information in mind about what we are supposed to be doing, and in psyching ourselves up for action in the first place. For many of us, it provides a central thread to our conscious experience and is integral to our sense that we have a coherent, enduring self" (p. 107).

In summary, conceptual and research distinctions focus on the audible/overt, automatic, affective, and conversational aspects of self-talk. Following these distinctions, I define self-talk as self-directed or self-referent speech (either silent or aloud) that serves a variety of self-regulatory and other functions. This broad definition is designed to capture some of the primary features of the general phenomenon of talking to oneself that are amenable to the study of individual differences.

# THE SELF-TALK SCALE

The Self-Talk Scale (STS) (Brinthaupt et al., 2009) is a measure of the frequency with which individuals talk to themselves under a variety of circumstances. It assumes a functional approach by measuring how often people talk to themselves (silently or aloud) in response to specific events or situations. The STS measures four specific self-talk functions: self-criticism, self-reinforcement, self-management, and social-assessment. Respondents rate the 16 STS items with a 5-point frequency scale (1 = *never*, 5 = *very often*) and using the common stem "I talk to myself when…" Self-critical self-talk assesses negative events (e.g., when something bad has happened or when feeling ashamed of something one has done). Self-reinforcing self-talk refers to positive events (e.g., when feeling happy for oneself or proud of something one has done). Self-managing self-talk measures general self-regulation (e.g., when mentally exploring a possible course of action or when giving oneself directions or instructions about what to do or say). Finally, social-assessing self-talk applies to people's social interactions (e.g., when replaying something one has said to another person or analyzing something that someone recently said).

In our research, we find that total STS scores are normally distributed among college student samples. Test-retest stability of total scores (over 3 months) is good (i.e., *r* = 0.69; Brinthaupt et al., 2009, Study 7). Total and subscale internal consistencies are good (i.e., in the 0.85–0.94 range). We typically conduct correlational research using total and subscale STS scores (e.g., Brinthaupt et al., 2009, Study 4; Shi et al., 2015) as well as compare infrequent (lower 25%) with frequent (upper 25%) STS groups on a variety of measures (e.g., Brinthaupt et al., 2015, Study 2; Brinthaupt et al., 2009, Study 5).
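The scoring and grouping procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the published scoring key: the assignment of items to subscales (four contiguous items each) and the names `SUBSCALES`, `score_sts`, and `quartile_groups` are assumptions introduced here, and the quartile cut points follow the default method of Python's `statistics.quantiles`, which may differ from the procedure used in the cited studies.

```python
from statistics import quantiles

# Hypothetical item-to-subscale mapping (assumption: four items per subscale,
# in contiguous blocks; the published item order may differ).
SUBSCALES = {
    "self_criticism": [0, 1, 2, 3],
    "self_reinforcement": [4, 5, 6, 7],
    "self_management": [8, 9, 10, 11],
    "social_assessment": [12, 13, 14, 15],
}


def score_sts(responses):
    """Return subscale and total scores for one respondent.

    `responses` holds 16 ratings on the 1 (never) to 5 (very often) scale.
    """
    if len(responses) != 16:
        raise ValueError("expected 16 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    scores = {
        name: sum(responses[i] for i in items)
        for name, items in SUBSCALES.items()
    }
    scores["total"] = sum(responses)
    return scores


def quartile_groups(totals):
    """Label each total score as infrequent (lower 25%), frequent (upper 25%), or middle."""
    q1, _, q3 = quantiles(totals, n=4)  # three cut points: Q1, median, Q3
    labels = []
    for total in totals:
        if total <= q1:
            labels.append("infrequent")
        elif total >= q3:
            labels.append("frequent")
        else:
            labels.append("middle")
    return labels
```

With this sketch, `score_sts([3] * 16)["total"]` gives a total in the possible 16 to 80 range, and `quartile_groups` can be applied to a list of such totals to form the comparison groups.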

Rasch analysis has supported the use of the STS response format and the use of the STS total score as a unidimensional measure of self-talk frequency (Brinthaupt and Kang, 2014). Brinthaupt et al. (2015) found that the self-talk situations included in the STS are frequently reported occurrences in people's lives and that STS scores (from 6 weeks earlier) were significantly related to reports of self-talk in response to relevant situations that had occurred over the past 2 days (*r* = 0.45; Study 1). We also found, in a week-long experience sampling study, that frequent self-talkers (measured one month earlier) reported talking to themselves significantly more often during recent events over the past 2 h compared to infrequent self-talkers (*d* = 0.83; Study 2). Qualitative (open-ended) research on when, where, and why people talk to themselves supports the four STS subscales/functions (e.g., Morin et al., 2018). Finally, there is good cross-cultural support for the structure and properties of the STS (e.g., Khodayarifard et al., 2014; Ren et al., 2016).

In summary, as a measure of individual differences in self-talk frequency, the structure and properties of the STS have been well supported. Although research indicates wide individual variation in the frequency of self-talk, there are few systematic assessments of the possible factors that might account for why people differ in their self-talk frequency. In the following sections, I present two hypotheses that are informed by our and others' research using the STS.

# THE SOCIAL ISOLATION HYPOTHESIS

One potential reason for why people differ in their self-talk frequency is the extent of their social isolation. Research shows that social isolation is a significant risk factor for physical and mental health (Cacioppo and Cacioppo, 2014). In addition, the frequency of self-referential pronoun use is positively associated with a variety of socially isolating physical and mental illnesses (Fineberg et al., 2016). According to this hypothesis, individuals who spend more time alone or who have more socially isolating experiences will report more frequent self-talk. The rationale here is that people may be motivated to create or manage their "social" interactions (*via* self-talk) when their social experiences are limited or unsatisfactory. Several published studies are pertinent to this hypothesis.

Some research has examined how childhood social experiences might be associated with differences in self-talk frequency. For example, adult only-children report significantly higher levels of overall (*d* = 0.28) and self-critical (*d* = 0.46) self-talk frequency than sibling children (Brinthaupt and Dove, 2012, Study 1). Adults who report having had an imaginary companion in childhood report significantly higher levels of overall (*d* = 0.16), self-reinforcing (*d* = 0.23), and self-managing (*d* = 0.17) self-talk frequency than those who did not have an imaginary companion (Brinthaupt and Dove, 2012, Study 2). We speculated that only children may be more comfortable being alone, more likely to engage in self-socialization, and more self-focused and autonomous compared to children with siblings. Having an imaginary companion in childhood might be associated with greater use of imagery, increased awareness of internal states, and being more creative and fantasy-prone compared to not having had such an experience. These factors might play a role in determining the levels of self-talk frequency in both childhood and adulthood.

In a study of loneliness, self-talk, and well-being using a German adult sample, Reichl et al. (2013) found that need to belong (*r* = 0.26) and loneliness (*r* = 0.29) scores were positively correlated with overall self-talk frequency, with similar relationships for all of the STS subscales. They also found higher negative correlations between loneliness and mental health for frequent compared to infrequent self-talkers. These results pertain directly to the rationale for the social isolation hypothesis, indicating that having limited or unsatisfactory social relationships was associated with increased self-talk frequency.

Other research has studied social-related variables and their relationship to STS scores. For example, using a Persian translation of the STS, Akbari-Zardkhaneh et al. (2018) found that extraversion scores were negatively related to the frequency of self-managing self-talk (*r* = −0.29) and that insensitivity scores (e.g., being unwilling to accept other people's opinions) were negatively related to self-critical self-talk frequency (*r* = −0.27). In other words, people who are more introverted tended to report more self-managing self-talk, whereas people who do not believe that they are superior to other people (lower insensitivity; Van Kampen, 2000) reported higher levels of self-critical self-talk.

In summary, there is good support for the social isolation hypothesis, with a consistent pattern across the studies, and most effect sizes in the small range. It is clear that certain features of social experiences (e.g., having limited or unsatisfactory relationships) are associated with increased levels of self-talk frequency. Further systematic assessment of the social isolation hypothesis is needed. For example, researchers could examine fear of negative evaluation (e.g., Tanaka and Ikegami, 2015), shyness (e.g., Tang et al., 2017), and social anxiety disorder (e.g., Poole et al., 2017). According to the social isolation hypothesis, each of these characteristics should relate positively to overall self-talk frequency as well as to the self-critical facet of self-talk frequency, based on the findings of previous research.
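The group comparisons summarized here are reported as Cohen's *d*. As a rough sketch, *d* can be computed from two groups' scores with a pooled standard deviation and then labeled against Cohen's conventional benchmarks (0.2, 0.5, and 0.8). The function names and the use of these benchmarks are assumptions made for illustration; the source does not state which thresholds were applied when describing effects as small, moderate, or large.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def label_d(d):
    """Label an effect size using Cohen's conventional cut-offs (an assumption here)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"
```

Under these benchmarks, the only-child differences reported above (*d* = 0.28 and *d* = 0.46) both fall in the small range, which matches the summary given in this section.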

Exploring different kinds of internal dialogues (e.g., Oleś, 2009) might also help to assess the validity of the social isolation hypothesis. For example, integrative dialogues (i.e., internal conversations that resolve opposing views or reduce self-discrepancies) might be characterized by high levels of self-reinforcing, self-managing, and social-assessing self-talk, whereas confrontational dialogues (i.e., those that create internal dissonance or favor one viewpoint over another) might be characterized by high levels of self-critical and self-managing self-talk (Puchalska-Wasyl, 2017). The "helpless child" interlocutor identified by Puchalska-Wasyl (2015) should be associated with frequent self-critical self-talk, as it is characterized by feelings of powerlessness and isolation.

Finally, it would be interesting to explore self-talk frequency with respect to other facets of social isolation, such as being socially disconnected, living alone or with pets, and having recently suffered the termination of a romantic relationship. Individuals experiencing such short- or long-term social features might be motivated to compensate for their limited or unsatisfactory experiences through increased levels of overall or specific kinds of self-talk. For example, researchers could measure self-talk levels of participants before and after they experience a socially isolating event. Investigators might also expose participants to hypothetical threats to or affirmations of their social connections and assess the content and frequency of self-talk in response to those manipulations.

# THE COGNITIVE DISRUPTION HYPOTHESIS

Cognitive disruption related to the need to explain or understand personal events or experiences is another potential reason for individual differences in self-talk frequency. Research shows that people who experience cognitive disruption following negative or stressful events demonstrate performance and self-regulatory decrements (e.g., Gunther et al., 2007; Helton et al., 2011). According to this hypothesis, self-related experiences that are cognitively disruptive (such as anxiety, obsessive-compulsive tendencies, and schizotypy) will be associated with increased self-talk frequency. The rationale here is that having anomalous, upsetting, or disturbing self-related experiences should press a person into trying to resolve, understand, or clarify those experiences. Self-talk is one self-regulatory tool that is predicted to be used under these circumstances. There are several research studies using the STS that are pertinent to this hypothesis.

Large percentages of people report that they feel anxious about speaking in public (e.g., Stein et al., 1996). Because of its prominence, anxiety about public speaking is an excellent case for studying the relationship between self-talk and the cognitive disruptions caused by anxiety. Research conducted by Shi et al. (2015) examined whether individuals who were anxious about delivering a forthcoming public speech reported more self-talk related to that speech. Just prior to delivering their speech, college student participants completed the STS (adapted to their speech preparation) and a measure of public speaking anxiety (PSA). The results showed that self-critical (*β* = 0.15) and social-assessing (*β* = 0.31) self-talk were positively related to PSA, whereas self-reinforcing self-talk was negatively related to PSA (*β* = −0.28). We interpreted these results to suggest that individuals with high PSA were cognitively "busier" than those with low anxiety as they prepared for their upcoming speech. In a follow-up study (Shi et al., 2017), we found that self-managing self-talk was positively associated with the rated organization of an actual speech (*r* = 0.23) and that PSA mediated the effects of self-critical and social-assessing self-talk on rated speech delivery, with self-critical self-talk indirectly decreasing speech delivery scores by increasing speakers' PSA levels.

Research shows that people normally have a variety of intrusive and ruminative thoughts and that these thoughts can sometimes develop into the serious clinical obsessions that characterize obsessive-compulsive disorder (e.g., Mancini et al., 1999). Studies also show that obsessional, compulsive tendencies are associated with an over-awareness of self-processes (e.g., Baumeister and Heatherton, 1996). Thus, it seems reasonable that obsessive-compulsive tendencies might be related to increased self-talk frequency. Research using the STS supports this line of reasoning. For example, compared to infrequent self-talkers, frequent self-talkers report higher levels of obsessive-compulsive tendencies (*d* = 0.80), in particular, impaired control over mental activities (*d* = 0.77) and checking behaviors (*d* = 0.83) (Brinthaupt et al., 2009; Study 5). Khodayarifard et al. (2014) also found moderate positive correlations (i.e., in the 0.32–0.34 range) between obsessive-compulsive tendencies and overall and subscale self-talk frequency.

Another kind of self-related cognitive disruption is associated with the occurrence of schizotypy tendencies, which are milder forms and predictors of schizophrenia (e.g., Kwapil et al., 2018). Schizophrenia and schizotypy have long been considered to be disorders of the self by researchers and theorists, and a variety of self-related impairments and self-experience anomalies have been reported by those with schizotypy tendencies (Parnas, 2003). In a recent study using the STS (Brinthaupt, Smartt, and Long, under review), we found that positive (e.g., thought disruptions, perceptual anomalies) and disorganized (e.g., disruptions of current behavior, situational confusion) schizotypy factors were positively and significantly correlated with self-talk factors (*r*s in the 0.28–0.44 range), but that negative schizotypy factors (e.g., speech impairments, diminished reactivity and affect) were unrelated to self-talk frequency. We interpreted these results as consistent with a "self-regulatory focus" explanation rather than reflecting self-regulatory or intrapersonal deficits.

There are additional studies that are pertinent to the cognitive disruption hypothesis. Using a Chinese college student sample, Ren et al. (2016) found significant relationships between impulsivity and self-talk frequency. In particular, motor impulsiveness scores (e.g., doing things without thinking) were positively related to self-critical self-talk (*r* = 0.31), whereas cognitive impulsiveness scores (e.g., making quick cognitive decisions) were negatively related to self-reinforcing self-talk (*r* = −0.27). Indirect support for the cognitive disruption hypothesis comes from research that examines general cognitive variables and their relationship to self-talk frequency. For example, overall self-talk frequency is positively correlated with scores on private self-consciousness (*r* = 0.37) and using verbal information processing strategies (*r* = 0.47), and people who report frequent self-talk show higher need for cognition scores than do infrequent self-talkers (*d* = 0.64) (Brinthaupt et al., 2009, Studies 4–6). Furthermore, Ren et al. (2016) found that self-managing self-talk was positively but weakly correlated with a variety of reasoning and working memory tasks (*r*s in the 0.16–0.22 range).

In summary, there is strong support for the cognitive disruption hypothesis, with moderate-to-large effect sizes reported in the research literature. A variety of self-related and general cognitive measures are associated with increased levels of overall or subscale self-talk frequency. If the cognitive disruption hypothesis is accurate, it is likely that other kinds of self-related disruption, such as identity disturbance (e.g., Kaufman et al., 2015), will be associated with increases in self-talk frequency. Conducting experimental manipulations would be the best way to provide direct support for the cognitive disruption hypothesis. For example, researchers might create situations that result in anomalous perceptual or sensory experiences and then monitor overt and covert self-talk as participants attempt to explain or understand those experiences.

Other examples of relevant cognitive disruption might include dissociative experiences (e.g., Alderson-Day et al., 2018), perfectionism (e.g., Moore et al., 2018), and academic procrastination (e.g., Grunschel et al., 2016). In each of these cases, higher scores on the characteristic would be expected to be associated with increased overall or subscale self-talk frequency (particularly the self-critical facet of self-talk). For example, research shows that perfectionism is associated with increased levels of stress and stress reactivity (Flett et al., 2016) as well as increased intrusive imagery and difficulty completing tasks (Lee et al., 2011). Such tendencies should increase the need for self-regulatory self-talk. To date, no research has examined these possibilities.

# OTHER POSSIBLE FACTORS RELATED TO INDIVIDUAL DIFFERENCES IN SELF-TALK FREQUENCY

This paper reports research that examines the relationship of personality and personal experience factors to self-talk frequency. There are likely to be shorter-term, less stable factors that might affect when, where, and how much one talks to oneself (Hardy et al., 2009). For example, it is possible that unstable, situational experiences of social isolation or disruption (e.g., experiencing anger or rejection from a friend or family member) or cognitive disruption (e.g., experiencing an acute stressful life event) will be associated with more frequent self-talk, regardless of one's normal levels of self-talk. Future research could explore these possibilities. Sport psychology appears to be particularly well-equipped to test many of these ideas.

Although the results reported here do not directly assess this possibility, it appears likely that self-regulatory disruptions (e.g., Baumeister and Heatherton, 1996) will precipitate increased self-talk. For example, disruption of plans, failure to engage in desirable behaviors or to stop engaging in undesirable behaviors, and having difficulty meeting one's internalized standards should all increase the need to engage in the self-regulatory functions served by self-talk. Future research could explore these possibilities as well. Conducting research along the lines described here will help to clarify the extent that self-talk frequency differs based on stable, individual differences and as a response to short-term events and experiences.

Future research should also contrast the social isolation and cognitive disruption hypotheses. The results reported in this paper suggest that cognitive disruption is more strongly related to self-talk frequency than are socially isolating experiences. Brinthaupt et al. (under review) found that the interpersonal superordinate schizotypy facet was much less strongly related to self-talk frequency than were the cognitive-perceptual anomalies and disorganized thinking superordinate facets. This result provides an initial comparison of the relative strength of the social isolation and cognitive disruption hypotheses, with stronger support for the latter.

The social isolation and cognitive disruption hypotheses can be further tested using measures that include other varieties of inner speech (Alderson-Day et al., 2018) or dialogic functions (Puchalska-Wasyl, 2017) not assessed by the STS. As reported earlier, there is some evidence of a weak, negative relationship between extraversion and self-managing self-talk frequency. However, overall, the Big 5 personality traits appear to be weakly related to self-talk frequency. As Uttl et al. (2011) found, most measures of inner speech or self-talk show very weak relationships with the NEO traits. Thus, the issue is not one that is specific to the STS. Upon reflection, the need or desire to talk to oneself should not be specific to high or low levels of core personality traits. Being generally sociable, talkative, trusting, curious, organized, or distress-prone should not, per se, incline people to talk more or less frequently to themselves. People who are low versus high in agreeableness or openness will probably differ less in the frequency of their self-talk than in its content (e.g., its valence, whether it is more approach or avoidance in nature).

An additional hypothesis for individual differences in self-talk frequency might be that having emotionally disruptive experiences will precipitate the need for more self-talk. To date, there is some support for this "emotional disruption" hypothesis. For example, depression (e.g., Khodayarifard et al., 2014), self-esteem (e.g., Brinthaupt et al., 2009, Studies 4 and 6), and neuroticism (e.g., Uttl et al., 2011; Akbari-Zardkhaneh et al., 2018) are weakly related to overall self-talk frequency, but more strongly related to self-critical self-talk. Self-criticism has been identified as a trans-diagnostic process related to a variety of negative clinical outcomes (Shahar et al., 2012). Observational research of tennis players shows that negative self-talk increases in frequency following lost points during a competitive match (Van Raalte et al., 1994). Future research could explore whether experiencing negative emotions is most strongly associated with self-critical self-talk frequency, whereas experiencing positive emotions is most strongly associated with self-reinforcing (and possibly self-managing) self-talk frequency.

In conclusion, research exploring individual differences in self-talk frequency has uncovered moderate support for the social isolation hypothesis and strong support for the cognitive disruption hypothesis. As alluded to in the introduction, measuring individual differences in self-talk frequency has the potential to be useful and informative for a variety of therapeutic, sport, and educational interventions. It is conceivable that cognitive or behavioral interventions might "take" more easily and readily with individuals who frequently rather than infrequently talk to themselves. By using the Self-Talk Scale and related measures, researchers can examine these possibilities and a wide range of other interesting questions.

# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and has approved it for publication.

# FUNDING

Funding for this paper comes from Middle Tennessee State University.

# REFERENCES

Scale: a new measure for assessing positive, negative, and disorganized schizotypy. *Schizophr. Res.* 193, 209–217. doi: 10.1016/j.schres.2017.07.001


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Brinthaupt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Inner Dialogical Communication and Pathological Personality Traits

#### Małgorzata Łysiak\*

Department of Clinical Psychology, Institute of Psychology, John Paul II Catholic University of Lublin, Lublin, Poland

Dialogicality and its relation to personality traits have been extensively explored since the evolution of dialogical self theory. However, the latest edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) proposes a new hybrid personality disorder system and, thereby, a new model of pathological personality traits. As of now, there are no studies which show the relationships between self-talk, internal dialogicality, and pathological traits. Thus, the aim of this study was twofold: (a) to investigate the relationship between self-talk and pathological personality traits and (b) to explore the possible affinity between pathological structure of personality and dialogicality. A representative sample of 458 individuals from the non-clinical population, aged 18–67 (M = 30.99, SD = 10.27), including 52% women, completed three questionnaires: the Self-Talk Scale by Brinthaupt et al. (2009), the Internal Dialogical Activity Scale by Oleś (2009), and the Personality Inventory for DSM-5 by Krueger et al. (2012). To verify the correspondence between self-talk, internal dialogues, and pathological personality traits, the Pearson product–moment correlation coefficients (Pearson's r) and canonical correlation analysis were used. The results supported the hypotheses about the specific relationship between internal dialogical activity and five crucial dysfunctional personality traits related to the hybrid DSM-5 system of diagnosis. People characterized as having emotional lability, anxiousness, and separation insecurity (high negative affectivity), with unusual beliefs and experiences, as well as eccentricity (high psychoticism), are prone to having ruminative and confronting dialogues. The correlations between pathological personality traits and self-talk were statistically significant, but the relationships were very small.

#### Edited by:

Thomas M. Brinthaupt, Middle Tennessee State University, United States

#### Reviewed by:

Joshua Weller, Tilburg University, Netherlands

Lisa James, University of Minnesota Twin Cities, United States

Annabelle Christiaens, Tilburg University, in collaboration with reviewer JW

\*Correspondence: Małgorzata Łysiak, lysiak@kul.pl

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 18 March 2019 Accepted: 01 July 2019 Published: 16 July 2019

#### Citation:

Łysiak M (2019) Inner Dialogical Communication and Pathological Personality Traits. Front. Psychol. 10:1663. doi: 10.3389/fpsyg.2019.01663

Keywords: inner speech, internal dialogues, self-talk, pathological personality traits, DSM-5

# INTRODUCTION

One of my patients suddenly said during a session: "Oh, my God, I'm talking to myself... do you think I'm abnormal?" When we started to question one of her dysfunctional beliefs, she started to go back to her past and analyze what she could have done if she had had the baggage of experience she has today. Naturally, she had a dialogue-like conversation with herself. When she realized what she was doing, her reaction was as emotional as the first: "well, well, well! Not only do I talk to myself, but I am making a dialogue to myself. Are you sure I need this therapy?" There is no need to explain what happened next, but my patient's observations led me to think about internal speech and internal dialogues as a special form of intrapersonal communication that requires more attention, especially in research. It is not without significance that I start by reflecting on an example from psychotherapeutic practice, because it shows how intrapersonal communication may work. It is not only patients who "talk to themselves and conduct internal dialogue." Such intrapersonal communication is a process studied by philosophers, literary scholars, and psychologists. There are many types of inner speech that fit into the category of intrapersonal communication, as well as individual differences in the frequency with which people experience internal speech (Hurlburt et al., 2013). Brinthaupt (2019) offers two hypotheses to explain individual differences in intrapersonal communication: the social isolation hypothesis and the cognitive disruption hypothesis. In the context of pathological personality traits and intrapersonal communication, the cognitive explanation is especially important. According to Beck's cognitive-behavioral theory, dysfunctional beliefs are thought to underlie pathological behavior (Beck and Freeman, 1990). The counselor's task in the conversation with the patient is to find these dysfunctional beliefs and help him/her to reformulate them. Dysfunctional beliefs are often expressed in the thoughts of patients, which often reflect their inner speech and inner dialogues. This is the first reason why it is interesting to explore the nature of intrapersonal communication and whether it is related to personality traits. Beck and Freeman (1990) also claim that "personality 'traits' identified by adjectives such as 'dependent,' 'withdrawn,' 'arrogant,' or 'extraverted' may be conceptualized as the overt expression of these underlying (belief) structures" (p. 18). The intensity of the traits differs depending on the type of personality. Zawadzki et al.
(1995) claim that narcissistic personality is associated with low agreeableness and high neuroticism; antisocial personality disorder with elevated neuroticism and low conscientiousness and agreeableness; and obsessive-compulsive personality with elevated neuroticism and reduced openness to new experiences and compromise. In DSM-5, the newest classification of mental disorders, the concept of personality traits and disorders has changed. The Diagnostic and Statistical Manual of Mental Disorders proposes a new hybrid personality disorder system, which entails a new model of personality traits.

Combining individual differences in intrapersonal communication, clinical practice, Brinthaupt's hypotheses, and pathological personality traits, the purpose of the present study is to explore potential links between self-talk (e.g., Brinthaupt et al., 2009), internal dialogicality based on Hermans' dialogical self theory (Hermans, 1996), and the construct of pathological personality traits based on the DSM-5 hybrid personality system.

Because people reflect on their inner experiences, we define inner speech as a dialogue with oneself which has a central role in self-regulation, self-reflection, and development (Alderson-Day and Fernyhough, 2015; Gut et al., 2018). Inner communication plays a crucial role in self-observation, where people can observe their experiences, emotions, and dispositions. It is considered to be the mental simulation of speech, as well as a cognitive function that serves as the main device for problem-solving, decision-making, and setting goals (Perrone-Bertolotti et al., 2014; Morin et al., 2018). Morin (2005, p. 5) suggests that "With inner speech one can engage in verbal conversations with oneself and replicate comments emitted by others (Cooley's mechanism) or internalize others' perspective (Mead's mechanism)." With self-talk one can recall observations, emotions, and appraisals made by others, and might imprint one's own inner speech remarks on these recollections. Self-talk permits people to recreate the perspectives of others "in their private speech and to incorporate these multiple perspectives into their concept of self" (DePape et al., 2006). Inner speech also allows people to regulate their mental states and can be involved in recalling past situations and emotions, along with autobiographical memories (Morin, 2012). As with the past, internal speaking and internal dialogues are important in planning future situations and thinking, which can be helpful for creating psychological distance between the self and mental states created by the mind (Morin, 2005; Łysiak and Puchalska-Wasyl, 2018). Brinthaupt et al. (2009) identify the functions of self-talk which support the self-regulatory aspects of the self: social assessment, self-reinforcement, self-criticism, and self-management. The social assessment function refers to "self-talk related to a person's social interactions" (Brinthaupt and Dove, 2012, p. 326).
Focusing on positive events (e.g., feeling proud of something one has done or when something good has happened) is the self-reinforcement function, while regarding negative events (e.g., feeling discouraged about oneself or criticizing oneself for something one has said or done) refers to self-criticism (Brinthaupt and Dove, 2012). Lastly, self-management refers to giving oneself instructions or directions about what to do or say, or needing to figure out what to do or say, which is generally self-regulatory self-talk.

These functions express the dialogical nature of self-talk. Hermans' dialogical self theory conceives of the self "in terms of dynamic multiplicity of voiced positions in the landscape of mind intertwined as this mind is with minds of other people" (Hermans, 2003, p. 90). Relatively autonomous I-positions can interact with other I-positions in an open dialogical space and time, and reflect the different roles that a person can perform (e.g., values and ideals, identity, thoughts, and the ideas of others). I-positions, which are in constant motion, are associated with a probable story, and they move from one self-position to another. The "conversations" between the positions give expression to experiences, beliefs, and feelings. A person can consider a problem from the point of view of the group to which they belong, express some aspect of themselves, feel important and separate in relation to other aspects of themselves, or represent a given person at different moments of their life (Łysiak and Puchalska-Wasyl, 2018). Internal dialogical activity seems to be very important in inner conflicts, where the positions confront different points of view. Self-dialogues may lead to re-evaluation of crucial individual experiences from different perspectives. Studies on the functions of internal dialogues concern support, substitution, exploration, bond, self-improvement, insight, and self-guiding (Puchalska-Wasyl, 2007). The terms "inner dialogue" and "inner speech," as used in psychology, carry several related meanings, such as inner voice, verbal thinking, private speech, inner speaking, self-talk, internal monologue, and internal dialogue (e.g., Piaget, 1959; Vygotsky, 1962; Hermans, 2003; Brinthaupt et al., 2009). Because it is difficult to choose one appropriate definition, for the purpose of this research these terms will be used interchangeably, with the main meaning being internal dialogical communication. However, it is important to distinguish the different dialogical activities. Self-talk is defined as self-directed speech, silent or aloud, which mainly concerns self-regulatory functions (Brinthaupt, 2019), while internal dialogicality is an active process similar to interpersonal dialogues. Just as two people exchange views and thoughts, discuss, or argue with each other, so two inner positions can interact with each other in similar ways. As there are many types of interpersonal communication, there are many types of intrapersonal dialogues, from identity dialogues to rumination or confrontation dialogues.

Although the cited research emphasizes the positive and adaptive functions of internal speech, internal dialogues can also be non-adaptive and have negative consequences. First of all, inner speech has implications for patients with psychiatric conditions or developmental disorders (Alderson-Day and Fernyhough, 2015). When inner speech becomes too intense, it can turn into pathological symptoms, such as the insistent inner voices that characterize, for example, schizophrenia, or excessive rumination, especially in social anxiety and depression (Perrone-Bertolotti et al., 2014). In psychotic disorders, inner speech is associated with auditory verbal hallucinations, that is, hearing voices in the absence of an interlocutor. This is typical of a diagnosis of schizophrenia, but it is worth noting that this phenomenon can appear in the general population as well. Negative self-reflection, that is, rumination with negative thoughts, is one of the risk factors in affective disorders. It is mainly cognitive-behavioral theories that provide data on maladaptive self-talk, which is very important in the development of anxiety and depressive disorders (e.g., Padesky and Greenberger, 1995; Kendall and Choudhury, 2003). Calvete et al. (2005) explored how positive and negative content occurs in self-talk and how it affects mood. The researchers used the Negative and Positive Self-Talk Scale and explored its connections with psychopathological traits. As expected, the trait "depression" was strongly predicted by depressive self-talk and the trait "anxiety" by anxious and depressive self-talk. Positively oriented self-talk is connected with lower depression but higher anger, while negative inner speaking correlates with anxiety but not with depression (Alderson-Day and Fernyhough, 2015). Research by Brinthaupt et al. (2009) on the frequency of self-talk showed that frequent self-talkers tended to be inwardly self-focused and had obsessive–compulsive tendencies.
While the negative aspects of self-talk are correlated with anxiety, the positive ones appear in manic and narcissistic tendencies (Brinthaupt and Dove, 2012).

The conflict between various I-positions may cause neurotic problems if there is no efficient inner dialogue or if assertive voices are suppressed (Stróżak, 2018). Puchalska-Wasyl and Oleś (2013) claim that doubtfulness is characteristic of conducting a dialogue. In some way, uncertainty is needed to sustain the inner dialogue, while inner dialogue is one of the ways of reducing doubtfulness (Hermans and Hermans-Konopka, 2010). Research by Chin et al. (2012) on uncertainty reveals that people who experience uncertainty are likely to demonstrate those personality traits that may be seen as positive. Hermans and Hermans-Konopka (2010) postulate that living in times of uncertainty may contribute to engaging all forms of dialogicality to reduce doubtfulness and open new ways of understanding reality. On the other hand, the variety of possibilities and narrations, together with dynamic and constant change, may lead an individual toward what are among the most important values nowadays, such as being resilient and ready to change.

The psychopathological side of living is linked to an American classification system, the DSM (Diagnostic and Statistical Manual of Mental Disorders); its latest edition, DSM-5, proposes a new hybrid diagnostic system for personality disorders with a dimensional pathological trait model. It is also a five-factor trait model, but one that represents a pathological variant of the Five-Factor Model (FFM) of normal personality; thus, it is called the "Pathological Big Five" (Krueger et al., 2011, 2012; Rowiński et al., 2018). In DSM-5, there are four criteria to diagnose personality disorder, but two of them are the most original: the level of personality functioning (Criterion A) and the model of maladaptive personality traits (Criterion B). The first one consists of self and interpersonal functioning and the second refers to personality traits (Waugh et al., 2017). The new DSM-5 model consists of 25 lower order personality facets that are classified into five higher order domains: negative affectivity, detachment, antagonism, disinhibition, and psychoticism. Negative affectivity (like FFM: neuroticism) involves tendencies to experience lability in feelings, especially unpleasant ones, together with antagonistic or inactive behaviors. Detachment (like FFM: low extraversion) involves depressive feelings with anhedonia, general interpersonal withdrawal, and suspiciousness. Antagonism (like FFM: low agreeableness) means callousness with a tendency to manipulate and seek attention. Disinhibition (like FFM: low conscientiousness) means irresponsibility, impulsivity, and risk-taking behaviors, together with a lack of rigid perfectionism. Psychoticism (like FFM: openness to experience) includes the features of eccentricity and odd and unusual beliefs and behaviors (Hopwood et al., 2012).
Just as the FFM has an instrument to measure its traits, the DSM-5 model has a dedicated instrument in which each trait is represented by a scored dimension: the Personality Inventory for DSM-5 (PID-5; Krueger et al., 2011). Analysis of the relationships between the FFM and the PID-5 confirmed four of the five expected correspondences; no association was found between psychoticism and openness to experience (Góngora and Solano, 2017).

To date, several studies have been conducted to explore the nature and correlates of internal dialogical activity and personality, and they have not yielded the same results. Regarding personality traits (FFM), the studies confirmed that internal dialogical activity is moderately associated with openness to experience and neuroticism. According to the research by Oleś et al. (2010), people with high neuroticism tend to conduct ruminative dialogues, whereas people with high openness have a tendency to use internal dialogues for identity clarification (Oleś and Puchalska-Wasyl, 2012). The same team researched attachment styles and internal dialogicality. It appears that secure attachment correlates positively with identity dialogues and negatively with ruminative dialogues, and anxious attachment correlates with the simulation of social relationships and ruminative dialogues. Avoidant attachment style has a negative relationship with supporting dialogues and identity dialogues, and a positive relationship with ruminative dialogues (Bątory et al., 2010; Oleś et al., 2010; Oleś and Puchalska-Wasyl, 2012). Studies on attachment styles and core beliefs, which are related to personality traits, found that individuals with an anxious style see others as difficult to understand and believe they have little control over outcomes in their lives, while people with a secure attachment style are more assertive and interpersonally oriented (Platts et al., 2002). The research conducted by Zapała (2018) on imaginary dialogue and personality traits showed that openness to experience did not enhance dialogical activity but was a predictor of creative dialogue as a personal dimension. Walasek's (2018) study on Eysenck's personality types and internal dialogical activity confirmed the relationship between neuroticism and three types of dialogues: ruminative, confronting, and the simulation of social ones; no relationship was found between psychoticism or extraversion and inner communication. Her analysis also showed connections between neuroticism and self-criticism, but only in a group of adolescents. While there are correlations between internal dialogicality and FFM personality traits, Uttl et al. (2011) found very weak relationships between self-talk and Big Five personality traits. Regarding the frequency of self-talk, there is only a weak positive correlation with extraversion. An interesting study by Reichl et al. (2013) found negative correlations between loneliness and mental health, suggesting that people in weak or unsatisfactory relationships tend to use self-talk more frequently. Loneliness seems to be associated with uncertainty, and these two traits are very characteristic of personality disorders. This conclusion leads us to seek links between self-talk and pathological personality traits. As mentioned above, uncertainty and doubtfulness are also linked to internal dialogicality, which also leads us to seek links between the inverted Big Five and internal communication.

In the outlined context, considering the adaptive and non-adaptive functions of inner communication and, to an extent, the pathological traits of the DSM-5 hybrid personality model, the purpose of this research is to evaluate the degree to which the main pathological personality domains account for variance in the functions of self-talk and types of dialogues. Two main questions were posed: What are the relationships between the pathological personality traits and self-talk functions? What is the relationship between the pathological personality traits and internal dialogues? As this was an exploratory study, no hypotheses were formulated.

# MATERIALS AND METHODS

The participants in the study were 498 individuals aged 18 to 67 (M = 30.99, SD = 10.27, 52% women). All of them completed three questionnaires: the Self-Talk Scale (STS), the Personality Inventory for DSM-5-SF (PID-5-SF), and the Internal Dialogical Activity Scale (IDAS). The study was conducted by assistants recruited from among psychology students. Each student invited 10 to 20 people from among their friends and acquaintances to take part in the study. All the participants were informed about the purpose of the study and gave signed informed consent to participate. They completed the questionnaires in paper-and-pencil form and did not receive any compensation. The study was conducted on a non-clinical sample, which means the results should be treated with caution. On the other hand, the dimensional approach presupposes the existence of specific traits that are found, with different degrees of intensity, in every person; a disorder is marked by a high intensity of these traits.

To examine the functions of self-talk, the Self-Talk Scale (STS) by Brinthaupt et al. (2009) was used. This self-report questionnaire includes sixteen items and examines self-talk as described in relatively abstract terms and as generalized across time and situations. The participants responded to each item using a 5-point scale, in which 1 was "never," 2 was "hardly ever," 3 was "sometimes," 4 was "fairly often," and 5 was "very often." The STS yields scores for four scales: social assessment, including wanting to replay something said to another person and imagining how other people respond to things one said (e.g., I'm imagining how other people respond to things I've said); self-reinforcement, which includes feeling proud of something one has done or when something good has happened (e.g., I'm proud of something I've done); self-criticism, which involves feeling discouraged about oneself and criticizing oneself for something said or done (e.g., I should have done something differently); and self-management, which entails giving oneself instructions or directions about what one should do or say, and needing to figure out what one should do or say (e.g., I need to figure out what I should do or say). The authors provide some initial evidence for the internal consistency, test–retest reliability, and construct validity of data collected from the measure (Brinthaupt and Dove, 2012). In the present study, the following alpha coefficients were obtained for the STS factors: social assessment, 0.76; self-reinforcement, 0.83; self-criticism, 0.75; and self-management, 0.73.

The Personality Inventory for DSM-5 (PID-5-SF) is a short form of a 220-item self-report inventory, PID-5, designed to assess the twenty-five pathological personality trait facets and the five higher-order domains of criterion B of the DSM-5 AMPD. The PID-5-SF-Adult is a 25-item self-rated personality trait assessment scale for adults aged 18 and older. It assesses five personality trait domains, including negative affectivity, detachment, antagonism, disinhibition, and psychoticism, with each trait domain consisting of five items. Each item on the measure is rated on a 4-point scale from 0–very false to 3–very true or often true. Each trait domain ranges in score from 0 to 15, with higher scores indicating greater dysfunction in the specific personality trait domain. Negative affectivity is defined as intense experiences of high levels of a wide range of negative emotions (e.g., anxiety, depression, guilt/shame, worry, and anger) and their behavioral manifestations (e.g., I worry about almost everything). Detachment is understood as avoidance of socioemotional experience, including both withdrawal from interpersonal interactions as well as restricted affective experience and expression, and, particularly, limited hedonic capacity (e.g., I steer clear of romantic relationships). Antagonism is a trait which puts the individual at odds with other people and includes an exaggerated sense of self-importance and a concomitant expectation of special treatment, as well as a callous antipathy toward others, encompassing both an unawareness of others' needs and feelings, and a readiness to use others in the service of self-enhancement (e.g., I crave attention). Disinhibition is an orientation toward immediate gratification, leading to impulsive behavior driven by current thoughts, feelings, and external stimuli, without regard for past learning or consideration of future consequences (e.g., People would describe me as reckless). Psychoticism exhibits a wide range of culturally incongruent, odd, eccentric, or unusual behaviors and cognitions (e.g., I have seen things that weren't really there), including both processes (e.g., perception, dissociation) and contents (e.g., beliefs). In the present study, the following alpha coefficients were obtained: negative affectivity, 0.74; detachment, 0.63; antagonism, 0.76; disinhibition, 0.76; and psychoticism, 0.70.
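
The scoring rule described above (five items per domain, each rated 0 to 3, summed to a 0 to 15 domain score) can be sketched in a few lines of Python. This is only an illustration: the item-to-domain key used here is hypothetical (contiguous blocks of five items), and the names `DOMAINS`, `HYPOTHETICAL_KEY`, and `score_domains` are introduced for this example; the actual key must be taken from the instrument's scoring instructions.

```python
# Illustrative sketch of domain scoring for a 25-item, five-domain inventory.
# ASSUMPTION: the item-to-domain key below is hypothetical (blocks of five);
# the real key comes from the instrument's scoring instructions.
DOMAINS = [
    "negative_affectivity",
    "detachment",
    "antagonism",
    "disinhibition",
    "psychoticism",
]

# Hypothetical key: item indices 0-4 -> first domain, 5-9 -> second, and so on.
HYPOTHETICAL_KEY = {d: list(range(i * 5, i * 5 + 5)) for i, d in enumerate(DOMAINS)}


def score_domains(responses, key=HYPOTHETICAL_KEY):
    """Sum item ratings (each 0-3) into five domain scores (each 0-15)."""
    if len(responses) != 25:
        raise ValueError("expected 25 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")
    return {domain: sum(responses[i] for i in items) for domain, items in key.items()}


# Made-up responses for one respondent, for illustration only.
example = [3] * 5 + [0] * 5 + [1] * 5 + [2] * 5 + [1, 0, 2, 3, 1]
scores = score_domains(example)
print(scores)
```

Higher sums indicate greater dysfunction in the corresponding domain, as stated in the description of the measure.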

The Internal Dialogical Activity Scale (IDAS) by Oleś (2009) enables the assessment of the intensity of general dialogical activity in everyday life (general score) and seven kinds of internal dialogues measured by subscales: (1) pure dialogical activity (AD) – meaning spontaneous conduct of internal dialogues, thinking, and solving various issues in the form of dialogue (e.g., I converse with myself); (2) identity dialogues (ID) – internal dialogues aimed at better self-knowledge and answering identity questions, such as who am I, what is important to me, and what is the meaning of my life? (e.g., Sometimes I debate with myself about who I really am); (3) supportive dialogues (SD) – dialoguing which confirms beliefs, and supports or understands the imagined interlocutor, which may replace real conversations and give instructions (e.g., In some stressful situations, I attempt to calm myself with my thoughts); (4) ruminative dialogues (RD) – conducting internal dialogues about unpleasant topics, evoking difficult topics in thoughts, and pursuing them in the form of dialogue, accompanied by a sense of fatigue and frustration, and even a breakdown associated with internal dialogue activity (e.g., After failures, I blame myself in my thoughts and discuss how the failures could have been avoided); (5) confronting dialogues (CD) – conducting dialogues between two clearly separated parts of oneself, playing out internal conflicts in the form of dialogue (e.g., Sometimes I think that my "good" side argues with my "bad" side); (6) simulation of social dialogues (SS) – dialogues that are a continuation of conversations or a reflection of social dialogue relations: quarrels, discussions or exchanges of ideas (e.g., Sometimes when I am preparing to talk to someone, I rehearse the conversation in my mind); (7) taking a point of view (PV) – measures willingness to take a different viewpoint from one's own, the viewpoint of another person, or to question one's own opinion and
attempt to assess events from a different personal perspective, and to objectify problems by looking at them from a new, different perspective (e.g., Often in my thoughts I use the perspective of someone else). Answers are given on a 5-point Likert scale, ranging from 1–definitely disagree to 5–definitely agree. The higher the score, the higher the intensity of internal dialogical activity.

In the present study, the following alpha coefficients were obtained for the IDAS subscales: pure dialogical activity, 0.78; identity dialogues, 0.82; supportive dialogues, 0.72; ruminative dialogues, 0.79; confronting dialogues, 0.80; simulation of social dialogues, 0.81; and taking a point of view, 0.65.
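
For readers less familiar with the alpha coefficients reported for the three instruments, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from item-level responses. The data are invented for illustration and are not taken from the study; the function name `cronbach_alpha` is introduced here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented ratings: 5 respondents x 4 items on a 1-5 scale (illustration only).
demo = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(demo)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items of a subscale vary together, which is how the coefficients listed above should be read.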

# RESULTS

The basic statistics for each variable are given in **Table 1**. As this is a non-clinical sample, it is worth noting the kurtosis and skewness values for the PID variables. The distributions are skewed to the left, but mostly within acceptable limits (sk < 1). To examine the relationship between pathological personality traits and self-talk functions, Pearson's correlation was used. The analysis showed significant but weak positive relationships between negative affectivity and self-criticism (r = 0.25), self-assessment (r = 0.15), and self-management (r = 0.14). Also, psychoticism and disinhibition are weakly correlated with self-assessment (r = 0.25 and 0.15), while self-management is correlated with psychoticism (r = 0.20) (**Table 2**).
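
Why a correlation can be "significant but weak" is easy to see numerically. The sketch below is not the author's analysis script: it uses the Fisher z approximation for the p-value (an assumption made here to stay within the Python standard library), takes n = 498 from the participants description, and assumes 20 tests for the Bonferroni adjustment (5 domains × 4 self-talk functions); the number of tests actually used is not stated in the text.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_p_value(r, n):
    """Two-sided p-value for H0: rho = 0 (Fisher z normal approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * NormalDist().cdf(-abs(z))


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


r_reported = 0.25   # negative affectivity vs. self-criticism, from the text
n = 498             # sample size from the participants description
n_tests = 20        # ASSUMPTION: 5 PID domains x 4 STS functions

p_raw = fisher_z_p_value(r_reported, n)
p_adj = bonferroni(p_raw, n_tests)
shared_variance = r_reported ** 2   # proportion of variance in common

print(f"adjusted p = {p_adj:.2e}, shared variance = {shared_variance:.1%}")
```

With a sample of this size, even r = 0.25 is far from chance level, yet it corresponds to only about 6% of shared variance, which is why the relationships are described as weak.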

To verify whether there was a relationship between the pathological personality traits and internal dialogues, the same statistical calculations were made. In the first step, Pearson's correlation was used, which showed not very strong but positive relationships between personality domains and types of internal dialogicality (**Table 3**). The results showed that ruminative dialogues, confronting dialogues, and taking a different point of view are related to all pathological personality traits, while negative affectivity and psychoticism are related to all types of dialogues. Also, disinhibition is associated with more than half of the types of dialogues (**Table 3**). In view of these results, further analysis concerning a general exploratory question about the mutual relationship between the DSM-5 pathological personality traits and inner dialogues was carried out.

STS: SOCL\_AS, social assessment; SELF\_RE, self-reinforcement; SELF\_CR, self-criticism; SELF\_ME, self-management; IDAS: AD, pure dialogical activity; ID, identity dialogues; SD, supportive dialogues; RD, ruminative dialogues; CD, confronting dialogues; SS, simulation of social dialogues; PV, taking different points of view; PID: NA, negative affectivity; DET, detachment; ANT, antagonism; DIS, disinhibition; PSY, psychoticism.

TABLE 2 | Pearson's correlation for pathological personality traits (PID) and self-talk functions (STS).

p-values were adjusted with Bonferroni's correction. ∗p < 0.01 and ∗∗p < 0.05.

TABLE 3 | Pearson's correlation for pathological personality traits (PID) and internal dialogicality (IDAS).

AD, pure dialogical activity; ID, identity dialogues; SD, supportive dialogues; RD, ruminative dialogues; CD, confronting dialogues; SS, simulation of social dialogues; PV, taking a point of view. p-values were adjusted with Bonferroni's correction. ∗p < 0.001 and ∗∗p < 0.01.

In order to answer the research question, canonical correlation analysis was used as a multivariate statistical model which allows the simultaneous prediction of multiple dependent variables from multiple independent variables. A canonical correlation analysis was conducted using the five personality traits as predictors and the types of internal dialogue as criteria. Framing the variables as predictors and criteria implies a direction of impact; nevertheless, such an interpretation should be treated with great caution.
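
For reference, the standard textbook formulation of canonical correlation analysis is given below. The notation is introduced here and is not the author's: X denotes the vector of the p = 5 PID domain scores, Y the vector of the q = 7 IDAS subscale scores, a_k and b_k the weight vectors of the k-th pair of canonical variates, and Σ the corresponding covariance matrices.

```latex
% Notation assumed for illustration: X = five PID domains, Y = seven IDAS subscales.
\begin{aligned}
U_k &= a_k^{\top} X, \qquad V_k = b_k^{\top} Y, \\
R_{c,k} &= \max_{a_k,\, b_k} \operatorname{corr}(U_k, V_k)
         = \max_{a_k,\, b_k}
           \frac{a_k^{\top} \Sigma_{XY}\, b_k}
                {\sqrt{a_k^{\top} \Sigma_{XX}\, a_k}\;\sqrt{b_k^{\top} \Sigma_{YY}\, b_k}}, \\
\Sigma_{XX}^{-1} \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{YX}\, a_k &= R_{c,k}^{2}\, a_k,
\qquad k = 1, \dots, \min(p, q).
\end{aligned}
```

Each successive pair of variates is constrained to be uncorrelated with the earlier pairs, so with five predictors and seven criteria at most five canonical functions can be extracted; the squared canonical correlation of a function is the variance shared by its pair of variates.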

TABLE 4 | Canonical correlations with personality traits as predictors and internal dialogicality as criteria.


Predictors entered in the analysis: negative affectivity, detachment, antagonism, disinhibition, psychoticism (PID). Criteria entered in the analysis: pure dialogical activity, identity dialogues, supportive dialogues, ruminative dialogues, confronting dialogues, simulation of social dialogues, taking a point of view (IDAS). n.s., non-significant.

The analysis provided three statistically significant functions (**Table 4**), but the second and the third explained only 8.8 and 4.7% of the remaining variance (unexplained by the first function). Therefore, only the first function, explaining 28.6% of the total shared variance between pathological personality traits (as predictors) and internal dialogical activity (as criteria), was considered in further analyses. As shown in **Table 5**, the first canonical variable representing DSM-5 personality traits is mainly loaded by psychoticism (to a high degree: 0.82), negative affectivity (high: 0.81), and disinhibition (moderate: 0.61). This canonical variable represents 41.7% of the variance shared by these three personality domains. The opposite canonical variable, created by inner dialogicality, represents 47.4% of the variance shared by all types of dialogues and is loaded mainly by ruminative dialogues (0.94), confronting dialogues (0.78), and taking different points of view (0.67).

Pathological personality traits and inner dialogicality have much in common; 28.6% of shared variance is quite substantial. The redundancy analysis shows that the latent variable, personality traits, explains 13.5% of internal dialogicality variability, whereas the particular types of dialogue explain 11.9% of DSM-5 personality traits.
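
The redundancy figures can be checked against the other values reported for the first canonical function. Under the standard definition of the redundancy index (the squared canonical correlation multiplied by the proportion of a set's variance captured by its own canonical variate), the short sketch below reproduces the reported percentages up to rounding. The formula is the conventional one and is assumed here; the text does not state which computation was used.

```python
import math


def redundancy(canonical_r_squared, variance_extracted):
    """Redundancy index: share of a set's variance explained by the other set.

    ASSUMPTION: conventional definition, Rc^2 times the variance the set's
    own canonical variate extracts from that set.
    """
    return canonical_r_squared * variance_extracted


RC_SQUARED = 0.286       # shared variance of the first canonical function
VE_PERSONALITY = 0.417   # variance of the PID set captured by its variate
VE_DIALOGUES = 0.474     # variance of the IDAS set captured by its variate

# Dialogicality variance explained by the personality variate.
rd_dialogues = redundancy(RC_SQUARED, VE_DIALOGUES)
# Personality variance explained by the dialogicality variate.
rd_personality = redundancy(RC_SQUARED, VE_PERSONALITY)
# Canonical correlation implied by the shared variance.
canonical_r = math.sqrt(RC_SQUARED)

print(f"{rd_dialogues:.3f} {rd_personality:.3f} {canonical_r:.3f}")
```

The products come to roughly 0.136 and 0.119, in line with the reported 13.5% and 11.9%; the small difference in the first figure is what one would expect from rounding of the inputs.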

Because canonical loadings with the same sign indicate a positive correlation of the variables, it could be said that the higher negative affectivity, psychoticism, and disinhibition, the higher is the degree of ruminative dialogues, confronting dialogues, and taking different points of view. Thus, those with a greater intensity of emotional lability, anxiousness, and separation insecurity (high negative affectivity) with unusual beliefs and experiences, as well as eccentricity (high psychoticism) and a tendency to be irresponsible and impulsive (high disinhibition), are prone to present ruminative, confronting dialogues, as well as taking different points of view.

TABLE 5 | Results for the first canonical function.

To establish whether there were correlations among self-talk functions and internal dialogical activity, Pearson's correlations among the STS and IDAS subscales were computed. The results are presented in **Table 6**. All correlation coefficients are positive and significant and range from 0.20 to 0.46. The highest, but still moderate, correlation is between pure dialogical activity and social assessment (0.46). Social assessment also correlates with simulation of social dialogues (0.45). Pure dialogical activity and supportive dialogues have moderate correlations (around 0.4) with self-criticism and self-management.

# DISCUSSION

The purpose of this research was to investigate the relationship between inner speech and pathological traits, using the model described in DSM-5. Two specific objectives were set: (1) to analyze the relationships between the functions of self-talk by Brinthaupt et al. (2009) and DSM-5 pathological personality traits; (2) to explore the possible affinity between the pathological structure of personality and types of internal dialogical activity. Two overall findings were observed. First, the lack of any substantial correlation between the functions of self-talk and DSM-5 pathological traits is noteworthy; second, there is a correspondence between inner dialogicality and pathological personality domains. The results of the canonical correlation showed quite substantial correspondence between DSM-5 personality traits and the types of internal dialogicality. The common variance approaches 30% and shows a clear affinity between the two sets of variables.

In light of the obtained results, the weak correlations between self-talk and pathological personality traits and, at the same time, the relationship between the pathological big five and types of internal dialogicality are puzzling. The first explanation of these results concerns the relationships between the internal speech constructs. It is worth noting that the correlations between functions of self-talk and internal dialogical activity are not very strong. The strongest ones link pure dialogical activity, supportive dialogues, and simulation of social dialogues with social assessment, self-criticism, and self-management as functions of self-talk. Although the strength of these correlations is moderate, this does not make them invalid. Based on the definitions of these functions and types and on previous research (e.g., Padesky and Greenberger, 1995; Calvete et al., 2005), it can be assumed that inner speech may be positive or negative. When combined with internal dialogical activity, functions of self-talk seem to be much more correlated with positive internal speech. A special relationship exists between self-criticism and supportive dialogues, which may suggest that critical self-reflection and dialoguing are part of a "productive" life (Hermans and Hermans-Konopka, 2010, p. 123). Confronting negative events might also be positive, especially in the process of self-reflection, when people return to their negative experiences or feelings in order to trigger a supportive internal dialogue. McCarthy-Jones and Fernyhough (2011) distinguished four types of inner speech: dialogic inner speech – a back-and-forth conversational quality; condensed inner speech – a short, fragmentary form of inner speech; other people in inner speech – a representation of others' voices or what someone else would say; and evaluative/motivational inner speech – judging or assessing one's own behavior. The results indicate that evaluative/motivational inner speech and dialogic inner speech were most commonly chosen. Such a situation may relate in particular to the role of the critical internal voice, which in a healthy person can play a constructive, mobilizing role. However, this interpretation should be treated with caution: because the study concerned the intensification of pathological features, the inner critic is more likely to have intensified its maladaptive strategies. According to this research, it is likely that the scales of internal dialogical activity and self-talk complement each other but explore different aspects of internal speech. These results may support the view that self-talk functions are not associated with pathological personality traits, although Brinthaupt et al. (2009) showed that self-talkers have obsessive–compulsive tendencies and tend to be inwardly self-focused.

TABLE 6 | Correlations among subscales of types of internal dialogues (IDAS) and self-talk (STS).

p-values were adjusted with Bonferroni's correction; <sup>∗</sup>p < 0.05. STS: SOCL\_AS, social assessment; SELF\_RE, self-reinforcement; SELF\_CR, self-criticism; SELF\_ME, self-management. IDAS: AD, pure dialogical activity; ID, identity dialogues; SD, supportive dialogues; RD, ruminative dialogues; CD, confronting dialogues; SS, simulation of social dialogues; PV, taking different points of view.

While there is no correlation between functions of self-talk and pathological personality traits, canonical correlation analysis revealed a main pattern with negative affectivity and psychoticism as predictors and ruminative and confronting dialogues as criteria. With higher emotional lability, anxiousness, submissiveness, and insecurity (negative affectivity), and with more unusual beliefs and experiences and greater eccentricity (psychoticism), people are prone to dialogues that focus on unpleasant themes and are usually conducted with frustration (ruminative dialogues), and to dialogues in which two strictly divided parts of oneself try to resolve an internal conflict (confronting dialogues). In this context, the obtained results may be interpreted by viewing the DSM-5 pathological big five as a reversed Five-Factor Model (FFM). Studies on dialogicality and the FFM show that internal dialogical activity is associated with openness to experience. There are also low but significant correlations between dialogical activity and neuroticism and, more interestingly, only for two types of dialogue: ruminative and confronting (Puchalska-Wasyl et al., 2008; Oleś et al., 2010). The results from the research on the DSM-5 pathological personality and FFM personality models confirm a strong correlation between general traits and pathological traits (Krueger et al., 2012; Quilty et al., 2013; Thomas et al., 2013; Strus et al., 2017): negative affectivity with neuroticism (positive), detachment with extraversion (negative), antagonism with agreeableness (negative), and disinhibition with conscientiousness (negative). Ambiguous results were obtained when comparing psychoticism and openness to experience – some research found a relationship between these two traits (e.g., Thomas et al., 2013; Chmielewski et al., 2014), while other studies did not find any association (Quilty et al., 2013; Suzuki et al., 2015). Given the potential for integrating models of normal and abnormal personality, the results on dialogicality appear to be compatible, because negative affectivity is the counterpart of neuroticism and psychoticism is a counterpart of openness to experience. In both models of personality traits, ruminative and confronting types of dialogue are most characteristic. This would suggest that the dimensional approach presupposes the existence of specific traits that are found with different degrees of intensity in every person, and that a disorder is marked by a high intensity of these traits. Personality traits (normal or pathological) participate in explaining inner communication in its adaptive and non-adaptive functions and types.

Rumination in the categories of inner speech and dialogicality is the aspect of negatively experienced positions which dominate the self. When "ruminating," the I is constrained by the cluster of internal and external positions that are accessible, but they do not allow any exit. It is like a prison from which a person cannot escape. A lack of any innovation and the dominance of one position is the reason why ruminating seems to be more monological than dialogical (Hermans and Hermans-Konopka, 2010, p. 176). Positive inner speech is not accessible while ruminating. This seems compatible with the two domains of pathological traits, negative affectivity and psychoticism, where high levels of a wide range of negative emotions (such as anxiety, depression, or worrying), a wide range of eccentric or bizarre behavior, appearance, and/or speech, and strange and unpredictable thoughts may intensify ruminative dialogues. It is like a "vicious cycle" such as that experienced in, for example, anxiety disorders: If a person has most of the features within the domains of negative affectivity or psychoticism, it makes sense that she or he may try to do things that are opposite to these features. The more the person tries not to dialogue in a negative way, the more difficult it is. Although this may be possible for a nonclinical population, it may be very difficult for people with personality disorders.

Confronting dialogues are dialogues in which two internal voices are in conflict and try to push an individual in different, sometimes opposite, directions. This "war" in the mind can have many consequences for feelings or behavior. It may bring tension or frustration, yet it can be developmental for the self, even leading to creative insight. Intense and frequent internal confronting dialogues cause emotional exhaustion and may become maladaptive. The relationship between psychoticism and negative affectivity and confronting dialogues confirms that if the personality develops into disorder, the loading of confronting dialogues is stronger. According to Morin (1993), the discrepancies between perspectives are accompanied by negative emotionality, and self-awareness is constricted. At first sight, confronting dialogues usually seem to be accompanied by negative emotions, which cause discomfort and inconvenience; however, they are stronger if loaded by traits in which dissociative experiences and feelings of nervousness are present. According to the cognitive behavioral literature (Beck and Freeman, 1990; Padesky and Greenberger, 1995), negative thoughts and negative internal speaking can cause non-realistic beliefs, which are related to personality traits. Negative thinking is supposed to be balanced by positive thoughts, which a person can integrate into their generally positive and emotionally healthy sense of self (e.g., Padesky and Greenberger, 1995).

The obtained results are worth considering in terms of clinical implications. As already mentioned, research on intrapersonal communication is part of a trend in practice toward integrating cognitive behavioral insights, in which clinical psychologists and psychotherapists often try to change the content of inner speech to help their clients alter emotional responding and function in a more adaptive way. Imagined dialogues stimulate thinking as much as real ones and are more effective in constructing a solution (Staudinger and Baltes, 1996). Cognitive-behavioral interventions, especially those using "experimental techniques" to explore the dialogues (e.g., classic empty chair, dialogical temporal chair technique; Łysiak, 2017), can be easier and quicker to use when the pathological traits are known as predictors and the types of dialogues as criteria. These techniques can be useful for reconstructing ruminative or confronting dialogues in a more effective way. Furthermore, if the pathological personality structure is known to the psychologist, it will be useful to check the intensity of internal dialogical activity in order to plan different kinds of interventions, for example to change the emotions these dialogues may cause. This last remark relates to a question for further research: Can dialogical activity foster overcoming problems related to personality disorders and, if so, under which conditions and using which kind(s) of dialogical activity?

This study has some limitations. First, it is a strictly correlational study based on self-report questionnaires. Second, the sample consisted only of adults from one country; although participants had different kinds of academic education and came from different areas of the country, the research should be replicated in a different population. Third, the study was conducted on a non-clinical sample; even if a disorder is marked by a high intensity of pathological traits, a clinical sample should be examined in further research to increase confidence in the results. The next limitation is the procedure of participant recruitment. The research assistants were psychology students who had been given guidelines on the sample specification and the criteria for data collection. However, it is possible that they recruited their friends and acquaintances, which is a further reason to treat the results with caution. Another limitation is that the results regarding the DSM-5 pathological personality traits were limited to basic personality traits only. The current study examined only the main trait domains and their connections to inner speech; an investigation of the lower-order facets is an important issue and might provide more information. A better understanding of the relationship between the DSM-5 personality trait structure and inner dialogicality needs further exploration.

In summary, the findings from this research concerning the relationships between pathological personality traits and inner communication allow us to identify a main role of two out of five pathological personality traits which mainly favor two out of seven types of internal dialogicality.

# DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the supplementary files.

# REFERENCES


# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of "name of guidelines, name of committee" with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Committee on Ethics in Scientific Research of the Institute of Psychology in John Paul II Catholic University of Lublin.

# AUTHOR CONTRIBUTIONS

MŁ prepared the manuscript with the help of students who collected the data.



**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Łysiak. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Imaginary Companions, Inner Speech, and Auditory Verbal Hallucinations: What Are the Relations?

*Charles Fernyhough<sup>1</sup>\*, Ashley Watson<sup>1</sup>, Marco Bernini<sup>2</sup>, Peter Moseley<sup>1,3</sup> and Ben Alderson-Day<sup>1</sup>*

*<sup>1</sup>Department of Psychology, Durham University, Durham, United Kingdom, <sup>2</sup>Department of English Studies, Durham University, Durham, United Kingdom, <sup>3</sup>School of Psychology, University of Central Lancashire, Preston, United Kingdom*

#### *Edited by:*

*Alain Morin, Mount Royal University, Canada*

#### *Reviewed by:*

*Karin Slotema, Parnassia Psychiatric Institute, Netherlands Tomohisa Asai, Advanced Telecommunications Research Institute International (ATR), Japan*

*\*Correspondence: Charles Fernyhough c.p.fernyhough@durham.ac.uk*

#### *Specialty section:*

*This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology*

*Received: 28 February 2019 Accepted: 02 July 2019 Published: 30 July 2019*

#### *Citation:*

*Fernyhough C, Watson A, Bernini M, Moseley P and Alderson-Day B (2019) Imaginary Companions, Inner Speech, and Auditory Verbal Hallucinations: What Are the Relations? Front. Psychol. 10:1665. doi: 10.3389/fpsyg.2019.01665*

Interacting with imaginary companions (ICs) is now considered a natural part of childhood for many children, and has been associated with a range of positive developmental outcomes. Recent research has explored how the phenomenon of ICs in childhood and adulthood relates to the more unusual experience of hearing voices (or auditory verbal hallucinations, AVH). Specifically, parallels have been drawn between the varied phenomenology of the two kinds of experience, including the issues of quasi-perceptual vividness and autonomy/control. One line of research has explored how ICs might arise through the internalization of linguistically mediated social exchanges to form dialogic inner speech. We present data from two studies on the relation between ICs in childhood and adulthood and the experience of inner speech. In the first, a large community sample of adults (*N* = 1,472) completed online the new Varieties of Inner Speech – Revised (VISQ-R) questionnaire (Alderson-Day et al., 2018) on the phenomenology of inner speech, in addition to providing data on ICs and AVH. The results showed differences in inner speech phenomenology in individuals with a history of ICs, with higher scores on the Dialogic, Evaluative, and Other Voices subscales of the VISQ-R. In the second study, a smaller community sample of adults (*N* = 48) completed an auditory signal detection task as well as providing data on ICs and AVH. In addition to scoring higher on AVH proneness, individuals with a history of ICs showed reduced sensitivity to detecting speech in white noise as well as a bias toward detecting it. The latter finding mirrored a pattern previously found in both clinical and nonclinical individuals with AVH. These findings are consistent with the view that ICs represent a hallucination-like experience in childhood and adulthood which shows meaningful developmental relations with the experience of inner speech.

Keywords: hallucination proneness, signal detection, theory of mind, social cognition, imagination, development

# INTRODUCTION

Between a third and two-thirds of young school-age children will engage with imaginary companions (ICs), defined as invisible characters with whom children converse and interact (Taylor et al., 2004). These characters can include invisible characters which nevertheless have an air of reality for the child (Svendsen, 1934), and personified objects (imaginary beings that are embodied in a toy or object). Since research in this area adopted new methodological standards in the 1990s, ICs have been associated with a range of positive developmental outcomes (Taylor, 1999). Several studies have linked engagement with an IC to superior social cognition (Taylor and Carlson, 1997; Roby and Kidd, 2008; Davis et al., 2011), while other studies have indicated that children with an IC are more creative (Schaefer, 1969; Seiffge-Krenke, 1997; Hoff, 2005), more sociable (Mauro, 1991), and capable of constructing more complex narratives (Trionfi and Reese, 2009).

Historically, however, engaging with ICs has been considered a cause for concern, and even a possible marker of future mental illness. Although this view has now been discredited (Taylor, 1999), several features of engaging with ICs raise parallels with an experience that is often considered pathological: the experience of auditory verbal hallucinations (AVH) or "hearing voices." Hallucinations are defined as percept-like experiences which occur in the absence of an appropriate stimulus, which have the full force or impact of the corresponding veridical perception, and which are not under the experiencer's direct or voluntary control (Slade and Bentall, 1988). Several researchers have considered whether the experience of engaging with ICs bears commonalities with that of AVH. Intuitively, a point of commonality should reside in the fact that both ICs and AVH generate disembodied yet percept-like social agents with whom to interact. Unlike ICs, however, AVH are not usually experienced as willfully created by the subject, but rather as spontaneously occurring emergences of quasi-perceptual agents (Nayani and David, 1996; Woods et al., 2015). ICs might also be assumed to show cooperative and positive interactional social behavior compared to AVH, which often have a negative emotional valence. In addition, AVH can be perceived as located either internally or externally in space, whereas ICs usually tend to be projected as agents in the external world. In short, there seem to be good reasons for testing productive comparisons between AVH and ICs, and research and theoretical insights on the latter might inform and challenge theoretical and empirical work on the former.

One line of research has examined engagement with ICs as involving non-veridical percept-like experiences. Pearson et al. (2001) found that children's reporting of ICs in middle childhood related to their tendency to report hearing words in an ambiguous auditory stimulus. Using a more rigorous methodology, Fernyhough et al. (2007) replicated this effect in two samples, linking the childhood experience of ICs to a Vygotskian view of development by which thinking develops through the gradual internalization of linguistically mediated social exchanges to form inner speech (see Alderson-Day and Fernyhough, 2015, for a review). This interpretation was subsequently supported by Davis et al.'s (2013) finding that children with ICs were more advanced (relative to their peers without ICs) on the internalization of private speech, considered by Vygotsky to be a precursor of inner speech.

Another line of research has considered the extent to which ICs are under the experiencer's voluntary control. There is growing recognition that the behavior of ICs is not always under children's control, providing a further rationale for considering at least some manifestations of ICs as hallucination-like phenomena. Hoff (2005) and Taylor et al. (2007) have presented findings suggesting considerable variability in the extent to which children report that their ICs can have alternative thoughts, feelings, and/or behaviors to their own. Taylor and colleagues have referred to this as the "illusion of independent agency." In this article, we use the equivalent term IC *autonomy* to refer to IC behaviors that are not compliant with the host's own cognitions, emotions, and intentions.

A further way in which research into ICs has developed in recent years concerns a growing recognition that engagement with ICs can persist into adulthood. Taylor et al. (2004) found continued engagement with ICs (in a sample that had originally been studied in the preschool years) at age 7 and on into adolescence. Although anecdotal evidence suggests that some adults engage with ICs (Taylor, 1999), to date, there has been no systematic study of the persistence of ICs into adulthood. Beyond the question of the continued engagement with ICs in adulthood, another avenue of research involves examining what – if any – cognitive differences in adulthood may be observed in those with a history of ICs. For example, Firth et al. (2015) found that adults reporting having had an IC in childhood scored more highly on a scene construction task, employed as an objective measure of imaginative capacity, as well as rating themselves as more imaginative.

We set out to explore several hypotheses concerning the relations between ICs and hallucinatory experiences. In the first study, we asked a large sample of online respondents about their experience of ICs in childhood and adulthood. In line with the reasoning of Pearson et al. (2001) and Fernyhough et al. (2007), we predicted that individuals reporting engagement with ICs would show greater susceptibility to hallucination-like experiences in adulthood. We additionally took measures of the sensory vividness of reported IC interactions and IC autonomy.

We also examined ideas from Fernyhough et al. (2007) and Davis et al. (2013) on the relation between ICs and the development of inner speech. Using a new questionnaire assessment of the quality of inner speech in adulthood, we investigated relations among IC status, varieties of inner speech, and hallucination proneness in our large sample of online respondents. Specifically, we predicted that those with experience of ICs would evidence more expanded, social-like experiences of inner speech, such as reporting other people in inner speech, or inner speech with dialogic characteristics. We also gathered, in the largest sample examined to date, novel data on the persistence of ICs into adulthood.

In the second study, we worked with a smaller, separate sample of participants to explore the cognitive processes involved in distinguishing real events from imagined ones. We assessed this capacity with an auditory signal detection paradigm. Biased performance on such tasks has been linked to reality-monitoring processes and strongly implicated in the experience of AVH (Bentall, 1990; Brookwell et al., 2013), but has never previously been examined in relation to IC engagement. We also assessed social cognition (theory of mind) to test specificity of any cognitive effects.

# STUDY 1: METHOD

# Participants

A sample of 1,472 participants (age *M* = 38.84; SD = 13.42; 1,112 females) were recruited *via* an online survey originally designed to explore inner speech and reading imagery (Alderson-Day et al., 2017). The survey was advertised *via* a UK national newspaper (*The Guardian*) and the Edinburgh International Book Festival. The majority of participants were based in the UK (*n* = 748) or USA (*n* = 213) and education levels were high, with over 80% of the sample possessing a graduate degree or above (for a full description of the sample, see Alderson-Day et al., 2018).

# Measures

### Imaginary Companions Questionnaire

Due to the lack of measures for assessing ICs in adulthood, a bespoke schedule of questions was devised to assess IC status (past and current), plus characteristics of IC experiences (see **Table 1**).

The following questions were used:


Questions 1, 2 and 5 were answered with a yes/no response. Questions 3 and 4 were completed with the following response options: *Never, Very occasionally, Some of the time, Most of the time, All of the time*.

### Launay-Slade Hallucination Scale – Revised, Auditory Subscale

A 5-item version of the Launay-Slade Hallucination Scale (henceforth LSHS-A) was chosen to examine proneness to auditory hallucination-like experiences ("hearing voices") in the sample (Bentall and Slade, 1985; Morrison et al., 2000). Participants answer items describing a range of perceptual errors (such as hearing one's name being called momentarily) and rate their frequency from 1 (*Never*) to 4 (*Almost Always*). Despite being a short measure, the 5-item LSHS-A has moderate/good internal reliability (Cronbach's alpha = 0.69; McCarthy-Jones and Fernyhough, 2011). Online assessment of psychopathological variables has been shown to be reliable compared to traditional pen-and-paper methods (Jones et al., 2008).
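For readers less familiar with the reliability index cited above, Cronbach's alpha can be computed from item-level responses as in the following minimal sketch. The responses are invented for illustration (five items rated 1 to 4, mirroring the response format described above) and are not LSHS-A data; the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose to per-item columns
    item_vars = [variance(col) for col in items]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 6 participants x 5 items, each scored 1-4
toy_responses = [
    [1, 1, 2, 1, 1],
    [2, 1, 2, 2, 1],
    [1, 2, 1, 1, 2],
    [3, 3, 4, 3, 2],
    [2, 2, 2, 3, 2],
    [4, 3, 3, 4, 4],
]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}")
```

Alpha rises as items covary more strongly relative to their individual variances, which is why short scales such as a 5-item measure often show only moderate values.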

### Varieties of Inner Speech Questionnaire – Revised

The Varieties of Inner Speech Questionnaire – Revised (VISQ-R) is a 26-item scale that requires participants to report on the frequency of various phenomenological characteristics of inner speech (Alderson-Day et al., 2018). It has five factors: dialogic inner speech, evaluative/critical inner speech, condensed inner speech, other people in inner speech, and positive/regulatory inner speech. The scale has strong internal reliability (alphas > 0.8) and is consistently related to various psychopathological traits, such as hallucination proneness, dissociation, anxiety, and depression (Alderson-Day et al., 2018).

# Analysis

*Percentages for items 3–5 were calculated from the total of all participants who had an imaginary companion at some point (n = 632).*

All data were analyzed in R, unless otherwise stated. Differences in hallucination proneness and inner speech characteristics were compared between four groups based on their IC status: those with no history of having an IC, those with a childhood IC only, those with a childhood and current IC, and those with a current IC only. For inferential statistics, skewed distributions were corrected using either log transformations (LSHS, Condensed VISQ, Other People VISQ) or square root transformations of reflected scores (Dialogic VISQ, Evaluative VISQ). For ease of interpretation, all figures and tables report untransformed scores. No transformed outcomes failed Levene's test (all *p* > 0.05).
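The two skew corrections named above can be sketched as follows. This is an illustration in Python rather than the R code used for the study, and it rests on assumptions the text does not state: a natural-log base, and reflection computed as (maximum + 1) minus the score. The function names and example values are hypothetical.

```python
import math


def log_transform(scores):
    """Natural log, for positively skewed scores (assumes every score > 0)."""
    return [math.log(s) for s in scores]


def reflect_sqrt_transform(scores):
    """Square root of reflected scores, for negatively skewed data.

    Reflection is (max + 1) - score, so the highest raw score maps to 1.
    """
    top = max(scores) + 1
    return [math.sqrt(top - s) for s in scores]


# Hypothetical subscale totals, not data from the study
positively_skewed = [5, 5, 6, 6, 7, 9, 14, 20]
negatively_skewed = [8, 15, 19, 21, 22, 23, 23, 24]

print(log_transform(positively_skewed))
print(reflect_sqrt_transform(negatively_skewed))
```

Reflection reverses the ordering of scores, so after this transformation higher values correspond to lower raw scores; the direction of any group difference has to be read with that in mind.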

# STUDY 1: RESULTS

## Characteristics of Imaginary Companions

**Table 1** displays the main IC characteristics reported across the sample. The majority – 56% – of participants had never had any experience of an IC, but as many as 41% had an IC in childhood. A total of 69% of participants with an IC (at any point in their life) reported having had an experience of hearing the IC's voice on at least one occasion, while 61% had had visual or other sensory experiences. Of those who responded to both questions 1 and 2 of the survey (*N* = 1,463), the four groups separated out as follows: those with no history of an IC (*n* = 831), those with a childhood IC only (*n* = 522), those with a childhood and current IC (*n* = 84), and those with a current IC only (*n* = 26). **Table 2** displays mean ages and scores for hallucination proneness (LSHS-A) and inner speech characteristics (the VISQ-R) across the four groups.

# Relations With Hallucination Proneness and Inner Speech

As can be seen in **Figure 1**, scores for LSHS-A were positively skewed, with a majority of participants across all groups reporting very little experience of hallucinations. Nevertheless, a one-way ANOVA on log-transformed scores for the LSHS-A indicated a significant main effect of group, *F*(3, 1,459) = 10.74, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.022. *Post hoc* Games-Howell tests (which correct for multiple comparisons) indicated that those with a childhood and current IC scored higher for hallucination proneness than all three other groups (all *p* < 0.044), while those with a past IC were also more hallucination prone than participants with no IC at all (*p* = 0.002).

With an alpha correction to 0.01 (to account for multiple testing across the five VISQ-R subscales), similar results were observed for dialogic inner speech, *F*(3, 1,459) = 9.15, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.018; evaluative/critical inner speech, *F*(3, 1,459) = 5.58, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.011; and other people in inner speech, *F*(3, 1,459) = 15.84, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.032. Differences in positive inner speech were marginal but non-significant, *F*(3, 1,459) = 3.43, *p* = 0.016, η<sub>p</sub><sup>2</sup> = 0.007, while no group differences were observed for condensed inner speech, *F*(3, 1,459) = 1.96, *p* = 0.117, η<sub>p</sub><sup>2</sup> = 0.004.
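The effect sizes reported above can be recovered from the F statistics and their degrees of freedom, since partial eta squared equals (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies that identity to the reported values; the function and variable names are hypothetical, and the block is a consistency check rather than part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


DF_EFFECT, DF_ERROR = 3, 1459

# (reported F, reported partial eta squared) for each outcome
reported = {
    "LSHS-A": (10.74, 0.022),
    "Dialogic": (9.15, 0.018),
    "Evaluative/critical": (5.58, 0.011),
    "Other people": (15.84, 0.032),
    "Positive": (3.43, 0.007),
    "Condensed": (1.96, 0.004),
}

recomputed = {
    name: round(partial_eta_squared(f, DF_EFFECT, DF_ERROR), 3)
    for name, (f, _) in reported.items()
}
for name, (_, eta) in reported.items():
    print(f"{name}: reported {eta}, recomputed {recomputed[name]}")
```

With the degrees of freedom given in the text, each recomputed value rounds to the reported one.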

Broadly similar pairwise differences were observed in *post hoc* analysis, again using Games-Howell tests. For dialogic inner speech, those with a past and current IC scored higher than those without an IC (*p* < 0.001) and those with only a childhood IC (*p* = 0.019), but not those with a current IC only (*p* = 0.051), while more dialogic inner speech was also observed in those with a childhood IC compared to those with no IC history (*p* = 0.005). The same pattern of group comparisons was evident for the Other People inner speech factor (all *p* < 0.01). For evaluative/critical inner speech, scores only significantly differed between those with both current and childhood ICs compared to the childhood IC group (*p* = 0.019) and those with no IC (*p* < 0.001).

# STUDY 1: DISCUSSION

Study 1 set out to explore for the first time relations between IC status in childhood and adulthood, the quality of inner speech, and proneness to AVH. Data gathered from a large sample of online respondents supported predictions that experience of ICs would be associated with a greater susceptibility to AVH. The highest scores for AVH proneness were observed in those who had both had an IC in childhood and continued to have one in adulthood. Comparable findings were observed in relation to measures of social-like experiences of inner speech, particularly on the Dialogic, Other People, and Evaluative/ Critical factors.

*IC, Imaginary companion; LSHS-A, Launay-Slade Hallucination Scale – Auditory; VISQ-R, Varieties of Inner Speech Questionnaire – Revised.*

A further aim of Study 1 was to gather novel data on the persistence of ICs into adulthood, using the largest sample employed to date in such analyses. The proportion of individuals reporting experience of ICs in childhood (41%) was roughly in line with previous studies. A total of 110 participants (representing around 7.5% of the sample) reported experience of ICs in adulthood. Of those reporting a childhood IC, 13.8% reported continued IC engagement in adulthood. Our findings are also consistent with previous observations that the behavior of ICs is not always fully under the experiencer's control (Taylor et al., 2007).
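The prevalence figures above follow from the four group sizes reported in the Study 1 results. The sketch below recomputes them in Python; it assumes the denominator is the 1,463 respondents who answered both IC questions, which the text does not state explicitly, and the variable names are hypothetical.

```python
# Group sizes as reported for respondents who answered both IC questions
groups = {
    "no_ic": 831,
    "childhood_only": 522,
    "childhood_and_current": 84,
    "current_only": 26,
}

total = sum(groups.values())                  # respondents classified
ever_ic = total - groups["no_ic"]             # an IC at any point in life
childhood_ic = groups["childhood_only"] + groups["childhood_and_current"]
adult_ic = groups["childhood_and_current"] + groups["current_only"]


def pct(part, whole):
    """Share of `whole` made up by `part`, in percent."""
    return 100 * part / whole


adult_share = pct(adult_ic, total)
persistence = pct(groups["childhood_and_current"], childhood_ic)

print(f"ever had an IC: n = {ever_ic}")
print(f"IC in adulthood: n = {adult_ic} ({adult_share:.2f}% of {total})")
print(f"childhood ICs persisting into adulthood: {persistence:.2f}%")
```

Under these assumptions the recomputed shares agree with the reported 7.5% and 13.8% up to rounding or truncation of the final digit, and the count of participants who ever had an IC matches the n = 632 used for the item percentages.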

Several limitations of Study 1 need to be mentioned. Although the group differences reached high levels of significance, they represent what would conventionally be described as small effects (η<sub>p</sub><sup>2</sup> between 0.011 and 0.032). Further limitations were the embedding of our data collection within a wider study of reading imagery (Alderson-Day et al., 2017), and the exclusive reliance on online self-report as a method of data gathering. The consequence of the former is that this sample may be skewed toward those high in imagery vividness and imaginative tendencies in the general population, while the latter limitation might have served to increase correlations among variables within the sample (i.e., common-method variance). Accordingly, in our second study, we used cognitive tasks that have previously been associated with the presence of hallucinations to obtain arguably more objective measures of relevant processes. The fact that we were working with a smaller sample also allowed us to obtain parental corroboration of childhood IC status, which is considered best practice in IC research (Taylor and Carlson, 1997).

# STUDY 2: METHOD

Employing a smaller sample of participants in a lab-based study, we replicated the measures used in Study 1 and added two tasks to assess cognitive processes previously implicated in IC status and AVH proneness. One such task is auditory signal detection, a measure which requires participants to detect speech clips embedded in noise. Previous findings have indicated that AVH proneness is associated with a tendency to falsely detect speech in noise, with signal detection parameters indicating that this is due to a response bias, rather than reduced task sensitivity. This has been linked to reality monitoring processes (i.e., the processes used to distinguish between self- and non-self-generated stimuli; Brookwell et al., 2013), or the influence of top-down processes on perception (Moseley et al., 2016), and has never previously been examined in relation to IC engagement. The second process, social cognition or theory of mind, has been associated with IC status in childhood, but has not been consistently associated with hallucination proneness in adult population samples (e.g., Fernyhough et al., 2008). To further examine the role of theory of mind, we therefore included a commonly used measure of "mentalizing" abilities: the Reading the Mind in the Eyes test (Baron-Cohen et al., 2001).

# Participants

A sample of 14 adults with a history of imaginary companions (age *M* = 21.21, SD = 2.26, 4 males) and 34 adults with no imaginary companions (age *M* = 21.18, SD = 2.18, 13 males) were recruited *via* a university participant pool, email circular, social media, and a recruitment blog article (Watson, 2017). Participants received either course credit or a gift voucher for their participation. On recruitment into the study, participants were asked to complete a short schedule about their history of imaginary companions (see **Table 3**), and to ask their parents to complete three questions: whether their child (1) had an IC when they were younger, (2) spoke to the IC, or (3) actively played with the IC. No participants who reported an IC failed the parental verification check; however, two participants did not recall having an IC when their parents reported that they had (including outwardly interacting with the IC). The latter two participants were included in the IC group, but were marked in later analysis in case they unduly influenced the group results.

# Measures

### Imaginary Companions Questionnaire

**Table 3** shows the questions asked of participants about their IC history. The questions used were broadly similar to those used in Study 1, although specific questions about observable behaviors (e.g., playing with the IC) were also included to allow for comparison with parent reports. Each question was answered with a binary response (*Yes* or *No*).

TABLE 3 | Participant IC schedule and response frequencies in IC group (*n* = 14).

*\*Two participants were included because their parents reported them having an IC in childhood, even though they did not recall having one.*

### Launay-Slade Hallucination Scale – Revised

For Study 2, a 9-item version of the LSHS was used which incorporated the five auditory LSHS items used in Study 1, and added four items from the full LSHS relating to visual experiences (for example, *I see shadows and shapes when there is nothing there*) (Bentall and Slade, 1985; Morrison et al., 2000). The longer scale provides a more reliable estimate of hallucination proneness, and is in line with use of the LSHS in the wider hallucinations literature (which often focuses on general hallucination proneness; e.g., Siddi et al., 2019).

### Signal Detection Task

An auditory signal detection task (SDT) was used, modeled on those of Smailes et al. (2014) and Moseley et al. (2014). Participants were asked to listen to 60 trials containing 5-s bursts of white noise, played over headphones. In 12 trials, speech was clearly present in the white noise at an audible volume; in 24 trials, no speech was present; and in 24 trials, speech was played at a threshold volume calibrated in piloting to allow a 50% success rate (pilot sample *n* = 10). The speech was identical to that employed in previous studies and first used by Barkus et al. (2007): a 1.5-s clip of a male voice reading aloud from an instruction manual. On each trial, participants were asked to indicate whether speech was present or absent, providing four response outcomes: *hits* (correctly identifying speech when present), *misses* (failing to identify speech when present), *correct rejections* (identifying when speech is absent), and *false alarms* (hearing speech when none is being played). Following Stanislaw and Todorov (1999), these outcomes were used to calculate beta (*β*), a measure of response bias, and d-prime (*d*′), a measure of sensitivity or discrimination. Scores below 1 for beta indicate a bias toward classifying trials as containing speech, while scores above 1 indicate a bias away from identifying speech. Higher scores on discrimination indicate better sensitivity on the task. Following previous studies of hallucination proneness and signal detection, the primary outcome on the task was beta (see Brookwell et al., 2013, for a review), while *d*′ – on which people with hallucinations usually do not differ from control participants – was used as a control outcome.
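To make the two signal detection indices concrete, the following is a minimal Python sketch of how *d*′ and *β* can be derived from hit and false alarm counts using the standard formulas summarized by Stanislaw and Todorov (1999). It is an illustration only, not the analysis code used in the study (which was run in R). The function name `sdt_measures`, the optional log-linear correction, and the example counts are assumptions introduced here.

```python
from math import exp
from statistics import NormalDist


def sdt_measures(hits, signal_trials, false_alarms, noise_trials, correct=False):
    """Return (d_prime, beta) from raw response counts.

    d' = z(H) - z(F)                    (sensitivity)
    beta = exp((z(F)**2 - z(H)**2) / 2) (response bias; < 1 favors "speech present")
    """
    if correct:
        # Log-linear correction: an assumption of this sketch, used to keep
        # rates away from exactly 0 or 1, where the z-transform is undefined.
        hit_rate = (hits + 0.5) / (signal_trials + 1)
        fa_rate = (false_alarms + 0.5) / (noise_trials + 1)
    else:
        hit_rate = hits / signal_trials
        fa_rate = false_alarms / noise_trials

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(fa_rate)

    d_prime = z_hit - z_fa
    beta = exp((z_fa ** 2 - z_hit ** 2) / 2)
    return d_prime, beta


# Hypothetical participant: 18 hits in 36 speech trials,
# 6 false alarms in 24 noise-only trials.
d_prime, beta = sdt_measures(18, 36, 6, 24)
print(f"d' = {d_prime:.2f}, beta = {beta:.2f}")
```

In this hypothetical case *β* comes out above 1, which corresponds to a conservative criterion (a bias away from reporting speech). Without the correction, the function raises an error for hit or false alarm rates of exactly 0 or 1.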

### Reading the Mind in the Eyes Task

This social cognition (theory of mind) task was used as a control task to determine specificity of any effects relating to signal detection (Baron-Cohen et al., 2001). The revised adult version (Baron-Cohen et al., 2001) was used to accommodate the age of the sample and was presented in printed form. Both validity and test-retest reliability have been found to be high enough to treat scores as a good approximation of theory of mind ability (including cross-culturally). Participants were asked to select one of four words that they believed best described the emotional or mental state of 30 different sets of eyes. The selection of words varied for each question. Definitions were available for each participant, including an example sentence. All participants reported being proficient in English.

# Procedure

All testing took place in a quiet room away from auditory distractors. Following consent, participants completed the LSHS and a paper version of the Reading the Mind in the Eyes test, and then the signal detection task. Participants wore over-ear Sennheiser HD206 headphones with the volume set to 20%. The SDT was run using E-Prime 2.0 on a 17″ Lenovo laptop.

# Analysis

All analyses were conducted in R. Group differences for hallucination proneness, signal detection bias (*β*), and social cognition performance were compared using Welch's *t*-tests. *d*′ (or sensitivity) on the SDT was also analyzed as a control variable. Prior to analysis, log-transforms were applied to LSHS scores and *β* scores on the SDT, while a square root transformation was applied to *d*′ scores; this reduced skew in the data and served to normalize distributions within each IC group. However, for ease of interpretation, raw scores are included in the reporting of descriptive statistics.
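As an illustration of the comparison described above, the sketch below computes Welch's *t* statistic and the Welch-Satterthwaite degrees of freedom on log-transformed scores. It is written in Python with the standard library only and is not the authors' R code; the function name `welch_t` and the example scores are hypothetical. Obtaining the *p*-value would additionally require the *t* distribution, which is omitted here.

```python
from math import log, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each group (sample variance / n)
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical beta scores for two small groups (not data from the study)
ic_beta = [1.2, 0.9, 1.6, 1.1, 2.0]
control_beta = [2.4, 3.1, 1.8, 2.9, 2.2, 3.5]

# Log-transform before testing, mirroring the transformation described above
t, df = welch_t([log(x) for x in ic_beta], [log(x) for x in control_beta])
print(f"t({df:.2f}) = {t:.2f}")
```

Because the groups differ in size and are not assumed to have equal variances, the degrees of freedom are generally non-integer, as in the values reported in the results.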

# STUDY 2: RESULTS

**Table 4** shows the mean scores for each IC group on the LSHS, signal detection task, and Reading the Mind in the Eyes task. To correct for multiple comparisons across the main outcomes for the questionnaire and two tasks, the alpha level was adjusted to 0.016 (0.05/3). When the groups were compared, significant differences were evident for LSHS, *t* (18.79) = 2.73, *p* = 0.013, *d* = 0.99, indicating higher hallucination proneness in the IC group.
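The multiple-comparison adjustment can be checked with a short Python sketch. The *p*-values are those reported in this section for the three main outcomes; the variable names are illustrative. Note that 0.05/3 is approximately 0.0167, which the text reports as 0.016.

```python
alpha = 0.05

# Reported p-values for the three main outcomes (from the text)
p_values = {"LSHS": 0.013, "SDT beta": 0.005, "RMET": 0.909}

# Bonferroni-adjusted threshold: family-wise alpha divided by the number of tests
threshold = alpha / len(p_values)

significant = {name: p < threshold for name, p in p_values.items()}
print(f"threshold = {threshold:.4f}")
print(significant)
```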

On the signal detection task, both groups were more likely to say speech was absent than present (as indicated by mean scores over 1), but IC participants showed significantly lower *β* scores than controls (i.e., they exhibited more bias toward responding that there was speech present), *t* (26.92) = 3.00, *p* = 0.005, *d* = 0.96. However, group differences were also evident on the control variable, *d*′, indicating lower sensitivity in the IC group, *t* (17.57) = 2.37, *p* = 0.030, *d* = 0.87. No group differences were observed for scores on the Reading the Mind in the Eyes task, *t* (22.23) = 0.12, *p* = 0.909, *d* = 0.04, n.s.<sup>1</sup>

<sup>1</sup> These analyses were also checked for (1) LSHS auditory items only (in line with Study 1) and (2) group differences following the omission of the two participants who did not recall ICs despite their parents indicating otherwise. As for LSHS total scores, IC participants scored significantly higher for auditory LSHS, *t* (23.72) = 3.78, *p* < 0.001. With the omission of the two participants, group differences were still evident for LSHS, *t* (15.06) = 2.14, *p* = 0.0488, and signal detection bias, *t* (20.09) = 2.58, *p* = 0.018, but no longer for sensitivity, *t* (14.12) = 1.87, *p* = 0.083, n.s.


*IC, Imaginary companion; LSHS, Launay-Slade Hallucination Scale; SDT, Signal Detection Task; RMET, Reading the Mind in the Eyes Test. Hit percentages are calculated from a total of 36 trials; false alarms from a total of 24 trials.*

Finally, although a Fisher's exact test suggested that the distribution of gender across the two groups did not deviate from parity (*p* = 0.741), we compared LSHS, beta, and *d*′ scores by gender to gauge their potential influence on IC group differences. No gender differences were observed, with the closest to significance being beta scores, *t* (38.04) = 1.71, *p* = 0.09. As this was in the direction of males showing more bias toward reporting speech to be present (*M* = 1.83) than females (*M* = 2.74), with a majority of males being in the non-IC group, this seemed unlikely to have affected the difference observed between IC groups in response bias.
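For readers who wish to check the gender comparison, a two-sided Fisher's exact test on a 2 × 2 table can be sketched in plain Python as below. The counts come from the Participants section (IC group: 4 males, 10 females; no-IC group: 13 males, 21 females). The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used (summing all tables no more probable than the observed one) is one common convention, assumed here rather than confirmed by the source.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed
        return comb(row1, k) * comb(row2, col1 - k) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Rows: IC group, no-IC group; columns: males, females
p = fisher_exact_two_sided(4, 10, 13, 21)
print(f"p = {p:.3f}")
```

Under this convention the result should be close to the reported value of *p* = 0.741.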

# STUDY 2: DISCUSSION

Study 2 presented us with the opportunity to investigate associations between IC status and hallucination proneness in the context of measures of relevant cognitive processes. We replicated Study 1's finding of higher hallucination proneness in the group of adults with childhood ICs. Our findings also aligned with previous results showing a relation between hallucination proneness and bias (*β*) on an auditory signal detection task, with participants in the IC group showing a greater bias toward responding that speech was present. We did not replicate the previously observed finding of no differences in sensitivity between groups high and low in hallucination proneness; in our sample, participants in the IC group showed reduced sensitivity. This is in line with a few studies that have reported patients with AVH showing reduced sensitivity as well as bias (e.g., Vercammen et al., 2008). The two IC status groups did not differ on social cognition (theory of mind) performance, suggesting that the group effects on cognitive task performance were specific to the signal detection task.

One limitation of Study 2 was the small sample. However, our methodology did require recruiting people with ICs into a lab-based study, as well as requiring parental verification, which made recruitment more challenging. Our findings form part of a small but growing body of research into the neglected area of cognitive processes in adults with a history of ICs (e.g., Firth et al., 2015). In addition, despite our relatively small sample, our findings are in line with previous work on the cognitive processes implicated in hallucinations, with, for example, very similar false alarm rates in the no-IC group compared to those observed in previous studies (Moseley et al., 2014, 2016).

# GENERAL DISCUSSION

The two studies reported here were designed to explore several hypotheses concerning the relations between ICs and hallucinatory experiences. In Study 1, a large sample of online respondents were asked about their experience of ICs in childhood and adulthood. In line with predictions, experience of ICs was associated with a greater susceptibility to AVH, with the highest scores for AVH proneness observed in individuals who had both had an IC in childhood and continued to have one in adulthood. The inner speech reported by individuals with ICs was more likely to include social-like qualities such as dialogicality, other people, and evaluation/criticism. Study 1 also presents the largest dataset yet gathered on the persistence of ICs into adulthood, with around 7.5% of the sample reporting experience of ICs in adulthood.

Study 2 represents the first attempt to link IC engagement with cognitive processes relevant to hallucination proneness, specifically auditory signal detection and social cognition (theory of mind). Individuals reporting ICs showed a greater bias toward reporting the presence of speech in noise, along with reduced sensitivity. The groups did not differ on theory of mind performance, suggesting that the cognitive task effects were specific to auditory signal detection.

Taken together, the two studies reported here are in line with the view that engaging with an IC bears some similarities with psychotic experiences, specifically hallucinations. As noted in section "Introduction," a small body of research has attempted to explore these relations, including Pearson et al.'s (2001) suggestion that engaging with ICs involves non-veridical percept-like experiences, and Fernyhough et al.'s (2007) proposal that engaging with ICs is a by-product of a developmental process involving the gradual internalization of dialogic social exchanges. The present findings are not sufficient either to confirm or disconfirm these theoretical proposals, but they are at least consistent with them. For the first time, the research presented here has been able to relate these experiences to the quality of inner speech, which has been linked both to childhood engagement with ICs (Davis et al., 2013) and to AVH (see, e.g., Alderson-Day and Fernyhough, 2015).

The studies reported here also speak to the question of whether, and how, childhood ICs persist into adulthood. The research described here was cross-sectional rather than longitudinal, and thus cannot address whether the ICs engaged with in childhood were, for those with persistent ICs, identical to those experienced in adulthood. It does, however, suggest that adults who had childhood ICs show cognitive differences from those without such experiences. In other words, the association observed in childhood between IC status and hallucination proneness appears to persist into adulthood.

That is not to say that ICs that emerge in adulthood are underpinned by the same processes that give rise to ICs in childhood. Establishing continuity in IC experience between childhood and adulthood would require long-term longitudinal data, and one should resist the assumption that adult ICs necessarily represent childhood ICs that have not gone away. There may indeed be such continuity, but ICs may also be constructed anew in adulthood, raising the possibility that such ICs are underpinned by separate cognitive mechanisms to those in operation in childhood. This is particularly pertinent for individuals who *only* develop ICs in adulthood: for both hallucination proneness and inner speech, this group were most similar to those who had never experienced an IC at all. It is likely that there are multiple cognitive routes toward hallucination-like experiences in the nonclinical population (Waters and Fernyhough, 2019), especially for those who deliberately cultivate such experiences (Luhrmann et al., 2019). Tulpamancers (Mikles and Laycock, 2015; Veissière, 2015) and spiritualists (Powers et al., 2017), for example, describe non-self, agentic experiences that in some ways parallel ICs, but which often rely on long periods of focused practice (such as meditation). It is possible that such practices could "unlock" ICs for adults who did not otherwise have a childhood proneness or tendency to have IC experiences.

The experience of shaping and engaging with ICs has also been linked to the creative imaginative act of molding fictional characters into existence, where literary writers displace agency into externalized imaginary beings (Taylor et al., 2003; Bernini, 2014). The creation of fictional characters and the generation of imaginary friends arguably share a feeling of distributed agency paired with knowledge of the subjective source of these creative acts. Looking into how readers represent fictional minds can also offer insight into the links between ICs and AVH. There is growing evidence that readers experience fictional voices as highly vivid, personified, and agentive (Alderson-Day et al., 2017; Maslej et al., 2017). Sometimes the personified voices and worldviews of fictional characters even cross into the reader's experience of the everyday, in what some authors have termed "experiential crossing" (Alderson-Day et al., 2017). This type of crossing between imagination and reality resembles hallucinatory dynamics in terms of the spontaneous emergence of social agents within the mind, thus reinforcing possible links between AVH, the creation and reception of fictional characters, and the experience of ICs.

Data from the cognitive task measures included in Study 2 suggested that there is at least some overlap between the cognitive processes associated with hallucinations and those associated with childhood ICs, supporting the conclusions from self-report measures used in Study 1. Specifically, participants in the IC group were more likely to report the presence of speech in noise than those in the non-IC group in the signal detection task. While Study 1 evidenced elevated levels of inner speech with social qualities (dialogic or evaluative inner speech, or use of inner speech involving other people) in those with ICs, Study 2 suggested that performance on the Reading the Mind in the Eyes Test, an index of social-cognitive processes involved in theory of mind, was not linked to the presence of ICs, suggesting no impairment in mentalizing in individuals with past ICs.

Performance on the signal detection task has previously been linked to an externalizing bias in reality monitoring (i.e., a bias toward misattributing imagined events as real; Brookwell et al., 2013), or over-weighted top-down processes influencing perception (Moseley et al., 2016), suggesting that ICs may be linked to these cognitive processes. However, it is noteworthy that participants in the IC group also showed a lower sensitivity (*d*′) on the signal detection task, indicating that they also were less able to distinguish the speech from the noise. This pattern is divergent from previous studies showing that hallucinating psychosis patients showed a difference in response bias but *not* sensitivity (e.g., Bentall and Slade, 1985; Varese et al., 2012), though some previous studies have reported reductions in both measures (e.g., Vercammen et al., 2008). While a bias toward speech detection may be consistent with reality-monitoring or top-down accounts of hallucinations, a reduction in sensitivity may also indicate more basic perceptual disturbances. Further research is needed to untangle specific patterns of performance and their association with ICs and proneness to hallucinations. Overall, cognitive data from Study 2 support the continuity across age in IC engagement suggested by the questionnaire data in Study 1 – and indicate more of a link with basic perceptual disturbance than social cognition or theory of mind – but at the same time are slightly different from a patient profile (in highlighting differences in sensitivity).

Although the signal detection task is widely used in the hallucination literature, it is possible that alternative tasks might shed further light on the cognitive processes involved (Brookwell et al., 2013). For example, a limitation of signal detection tasks in understanding AVH is that they do not typically manipulate the amount of auditory verbal imagery used by participants in performing the task. Future research in this area might utilize paradigms which can manipulate engagement in such imagery (Moseley et al., 2016). Other reality-monitoring tasks, particularly those drawn from the episodic memory literature, might reveal different associations with the variables of interest (e.g., Garrison et al., 2017). Future research might also consider the role of autistic traits in the observed relations among ICs, AVH, and inner speech. Such traits are known to affect weighting of sensory information (Karvelis et al., 2018), although their relation to ICs is only beginning to be explored (Davis et al., 2018). Although there are practical difficulties with long-range longitudinal research, investigating the development of these traits and abilities over the life course would be highly desirable.

As summarized above, limitations of the present study include the relatively small effect sizes in Study 1, the embedding of our data collection in a wider study of reading imagery and the use of online self-report (Study 1), and the relatively small size of the sample in Study 2. A further potential limitation of both studies is that recall of childhood experiences might be unreliable (the reason why we sought parental corroboration in Study 2). In addition, it is possible that the presence of AVH is associated with autobiographical memory biases that might increase the likelihood of childhood ICs being recalled.

Notwithstanding these limitations, the present findings provide some support for the view that ICs develop in childhood as a by-product of typical developmental processes. A challenge for future research is to find out more about those ICs that either persist into, or are generated anew in, adulthood, along with the cognitive and neural mechanisms that make continued engagement with ICs possible.

# DATA AVAILABILITY

The datasets generated for this study are available on request to the corresponding author.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the University of Durham Ethics Committee, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the University of Durham Ethics Committee.

# AUTHOR CONTRIBUTIONS

CF, BA-D, AW, and MB conceived the study. BA-D and AW collected the data. BA-D, AW, and PM analyzed the data. All authors wrote the manuscript.

# FUNDING

This research was supported by Wellcome Trust grants WT098455 and WT108720.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Fernyhough, Watson, Bernini, Moseley and Alderson-Day. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Endorsement and Constructive Criticism of an Innovative Online Reflexive Self-Talk Intervention

Alexander T. Latinjak<sup>1,2</sup>, Cristina Hernando-Gimeno<sup>1,3</sup>\*, Luz Lorido-Méndez<sup>3</sup> and James Hardy<sup>4</sup>

<sup>1</sup> School of Psychology and Education, University of Suffolk, Ipswich, United Kingdom, <sup>2</sup> School of Health and Sport Sciences (EUSES), Universitat de Girona, Catalonia, Spain, <sup>3</sup> Universitat Autònoma de Barcelona, Bellaterra, Spain, <sup>4</sup> Institute for Psychology of Elite Performance, Bangor University, Bangor, United Kingdom

This study prospectively followed the experiences of skilled athletes who were involved in an innovative reflexive self-talk online intervention targeting goal-directed self-talk. Four experienced female athletes between the ages of 20 and 40 years were invited to an initial interview, a 4-week intervention, and two post-intervention interviews. Two applied sport psychologists used an online Socratic questioning approach to encourage their athletes to describe challenging scenarios, think about their use of self-talk and its effectiveness, and explore alternative self-statements that could be used in future situations. Data were multi-sourced, stemming from the psychologists, athletes, and third parties (e.g., coach). Three athletes completed the intervention, whereas one athlete withdrew prematurely, mainly because the Socratic questioning approach and the online mode of delivery did not meet her preferences. Among the three athletes who completed the intervention, there was endorsement and constructive criticism of the intervention and its online delivery mode. The intervention, largely due to the accompanying raised awareness of self-talk use and refined content, seemingly benefited a range of variables including emotions, motivation, and confidence, both inside and outside of the athletes' sports life domain. Accordingly, this new type of online intervention warrants further consideration in the literature.

### Keywords: self-esteem, anxiety, thoughts, self-regulation, inner speech, sports

# INTRODUCTION

This study reports on a cognitive intervention that aims to change and strengthen athletes' goal-directed self-talk in sports. This approach is aligned with interventions framed within cognitive therapy (Beck, 1976). Cognitive therapy emphasizes the role of internal dialog in influencing an individual's subsequent feelings and behavior. According to Beck (1976), individuals are not always aware of their internal dialog, but they can learn to identify it, and, therefore, become able to monitor and, if necessary, replace automatic, emotion-filled thoughts. Cognitive-behavioral therapy (Meichenbaum, 1977) and rational emotive behavior therapy (Ellis, 1976) are two classical examples of cognitive therapy, which have successfully been applied to sport contexts (e.g., Neil et al., 2013; Turner and Barker, 2014) and in which self-talk plays a key role in cognitive change (Michie et al., 2013).

#### Edited by:

Thomas M. Brinthaupt, Middle Tennessee State University, United States

#### Reviewed by:

Csilla Horvath, Radboud University Nijmegen, Netherlands; Véronique Boudreault, Université du Québec à Trois-Rivières, Canada

#### \*Correspondence:

Cristina Hernando-Gimeno c.hernandogimeno@gmail.com

#### Specialty section:

This article was submitted to Personality and Social Psychology, a section of the journal Frontiers in Psychology

> Received: 27 March 2019 Accepted: 22 July 2019 Published: 06 August 2019

#### Citation:

Latinjak AT, Hernando-Gimeno C, Lorido-Méndez L and Hardy J (2019) Endorsement and Constructive Criticism of an Innovative Online Reflexive Self-Talk Intervention. Front. Psychol. 10:1819. doi: 10.3389/fpsyg.2019.01819


The terms inner dialog and self-talk were used by Beck (1976) and Meichenbaum (1977) mainly to refer to the critical inner voice that tends to encourage caution and self-doubt and can over time negatively impact upon self-esteem and self-worth (Palmer and Williams, 2013). In sport, the term self-talk is applied to a variety of processes that can occur simultaneously (Boudreault et al., 2018). To provide a conceptualization of self-talk that summarizes these processes, we describe it as follows: Self-talk takes form in verbalizations addressed to the self, overtly or covertly, characterized by interpretative elements associated to their content; and it either (a) reflects dynamic interplays between organic, spontaneous, and goal-directed, cognitive processes or (b) conveys messages to activate responses through the use of predetermined cues developed strategically, to achieve performance-related outcomes (Latinjak et al., 2019a).

In sport, self-talk interventions are usually beneficial for learning and performance and performance-related variables such as confidence or anxiety (Tod et al., 2011; Hatzigeorgiadis et al., 2014). However, in studies on the effects of self-talk, intervention protocols may be remarkably different (Latinjak et al., 2019a). Whereas traditional interventions focus on the effects of repetition of predetermined cue words (e.g., Hardy et al., 2015), some recent interventions aim to improve athletes' rational self-regulatory skills by creating metacognitive knowledge (Brick et al., 2016). Changes in metacognition in these recent interventions result from repeated reflections on past organic self-talk (both spontaneous and goal-directed) and future use of self-instructions (Latinjak et al., 2016). This reflexive self-talk intervention aims to enhance the use of goal-directed self-talk, which is a controlled mental process deliberately employed toward solving a problem or making progress on a task (Latinjak et al., 2014).

According to a recent review on self-talk interventions (Latinjak et al., 2019a), there are three main differences between the traditional, strategic self-talk interventions, and the newly proposed reflexive intervention. First, the content of strategic self-talk interventions is typically pre-determined (Hardy, 2006), while the self-talk discussed in the reflexive interventions emerges from sport situations and is thus always self-determined. Second, the moment when the self-instructions are verbalized in strategic self-talk interventions is usually fixed to before or during the execution of the task. In reflexive self-talk interventions, participants must decide in situ when they want to use self-instructions. Third, while verbalizing self-instruction is essential in strategic self-talk interventions, the actual use of goal-directed self-talk is optional in the context of reflexive self-talk interventions. The result of a reflexive self-talk intervention could therefore even be to use less goal-directed self-talk, for example, to prevent ironic processes of mental control (Wegner, 1994).

Compared with the existing self-talk literature that deals intensively with research on interventions using predetermined cue words (Tod et al., 2011), the research with reflexive self-talk interventions (also known as goal-directed self-talk interventions) is still in its infancy (Latinjak et al., 2016, 2018). Nonetheless, diverse psychotherapeutic approaches [e.g., Rational Emotive Therapy (Ellis, 1976) and Cognitive-Behavior Modification (Meichenbaum, 1977)] previously applied effectively to the sports setting (Neil et al., 2013; Turner and Barker, 2014) serve as indirect support for the efficacy of reflexive self-talk interventions in sport. This is because our reflexive self-talk intervention shares several core features with these psychotherapeutic approaches. For instance, both cognitive-behavior approaches and reflexive self-talk interventions aim at making athletes conscious about their internal dialog, identifying automatic, emotion-filled thoughts, and when dysfunctional, replacing them with functional self-instructions (Beck, 1976; Latinjak et al., 2016). To this end, Socratic questioning (McArdle and Moore, 2012) is used to develop metacognitive skills that enable athletes to non-judgmentally observe their own thoughts, and subsequently think logically and empirically in order to challenge, correct, and replace them. In cognitive-behavioral therapy, Socratic questioning, which consists of asking a person a series of open-ended questions to help promote reflection, is considered useful for raising awareness and improving problem-solving thinking (Neenan, 2009).

A unique and contemporary aspect of the reflexive self-talk intervention presented in this study was the use of an online text-messenger service for the intervention. With an estimated 3 billion Internet users worldwide, the development of online interventions could be of considerable utility (Lane et al., 2016). To the best of the authors' knowledge, only a single experiment has examined the effects of an online self-talk intervention in the performance context. Lane et al. (2016) examined the effects of strategic self-talk directed to outcome goals, process goals, instruction, and arousal-control, in a brief online intervention, on a competitive (non-sport) computer task. In their study, only self-talk directed to outcome and process goals helped participants' performance. That said, at a more general level, their findings support the utility of the online modality to teach psychological skills.

Despite the lack of online interventions within the sport psychology literature, meta-analytic data from other disciplines provide useful guidance. Specifically, research has emphasized the potential of online interventions in different areas of application including behavioral change, health, and clinical practice (e.g., Webb et al., 2010). An important matter in online interventions is the mode of delivery. Webb et al. (2010) performed a meta-analysis of online interventions, indicating that interventions that allowed for scheduled contact with an advisor showed significant effects, whereas interventions that provided automatic follow-up messages tended not to show significant effects. In addition, interventions using smart phones showed the largest effect sizes among online interventions. Therefore, in our study, automated feedback was discarded in favor of scheduled contact with an advisor via an online text-messenger service.

The present study included an innovative and longitudinal (4 weeks) self-talk intervention aimed at improving goal-directed self-talk using an online delivery format. The aim was to give a clear idea of what a successful reflexive self-talk intervention might look like and what variables should be considered to increase the likelihood of a satisfactory application. Aimed at applied practitioners, this study sought to provide the most detailed presentation of reflexive self-talk intervention procedures to date, as well as offer relevant and innovative guidance on adapting the standard procedures to the needs and preferences of individual athletes. In addition to investigating a novel form of self-talk intervention, the highly unusual but contemporary online format of the intervention is noteworthy. Our online delivery format has obvious scope and potential beyond just sport-related self-talk; yet we are aware of very few published examples of online interventions in the sports psychology literature.

Overall, a 4-week reflexive self-talk online intervention was delivered and qualitative reports on implementation and perceived effects were collected. In order to gain a comprehensive understanding of the intervention, data from different sources (Tracy, 2010) were collected to compare different experiences of athletes, applied practitioners, and researchers. The experiences of athletes and psychologists were expected to reveal meaningful information for refining the intervention and highlight moderating factors that practitioners should consider when adjusting the intervention to their client's specific needs.

# MATERIALS AND METHODS

# Participants

#### Philosophical Orientation

Due to the investigation's subjective focus, emphasizing the experiences of the participants, we adopted a constructivist epistemology enabling us to develop an appreciation of the lived experience and the identification of themes across our stakeholders. We assumed that there is no one knowledgeable truth and that knowledge involves a process of interpretation and the construction of individual knowledge representations (Jonassen, 1991). To this end, we collected data from a variety of sources – athletes, practitioners, and coaches – to provide a multifaceted understanding of a 4-week reflexive self-talk online intervention. Since our intervention was tailored to the individual circumstances of each athlete, it was expected that the experiences of our participants would be complex and dynamic. Therefore, a multiple single-case study approach was chosen as the most appropriate method. This approach is particularly useful because it allows analysis within and across individual cases, enabling us to examine in detail the subjective experiences of individuals who are part of the intervention and to highlight similarities and differences between them. Accordingly, an interpretative phenomenological analysis was chosen to analyze the data, since it is relatively sensitive to exploring differences in experiences between participants (Sparkes and Smith, 2014).

### The Athletes

To enhance the scope of our case study approach, four athletes were purposefully recruited for the study. We looked for athletes from different sports, of different ages and performance levels, but with relatively extensive experience in their sports. All athletes participated in official competitions while the intervention took place. The four athletes, between the ages of 20 and 40 years, were involved in contact, choreographic, and team sports, and participated in recreational and professional competitions. They all had over 7 years of experience and practiced over 10 h a week at the time of the intervention. Please note that, for ethical reasons, we have changed the names of the participants and have not specified their exact sport and age.

## The Psychologists

For this study, two novice sport psychologists of different ages (early 20s and early 40s) were selected. Both had less than 1 year of experience in working as sport psychologists. Psychologist 1 and the younger Psychologist 2 were graduate psychologists and specialists in sport and exercise psychology. In addition, Psychologist 1 had special training in Rational Emotive Behavior Therapy. Both participated in the design of the intervention and, only after completion of the data collection, in the discussion of the results. Both worked at an elite sports academy with talented junior basketball players. They were selected for their interest in researching the use of online interventions and in providing self-talk interventions for athletes. Two athletes were randomly assigned to each psychologist. Psychologist 1 worked with Maria and Julia, while Psychologist 2 worked with Anna and Sandra.

# Procedures

#### Intervention Design

The main thrust of our intervention was based on Latinjak et al.'s (2016) reflexive self-talk intervention. Nonetheless, some experiences collected in that study and the decision to deliver the intervention via online text-messenger required further deliberation. To create the intervention protocol, the first author prepared a script that was discussed with the practitioners performing the intervention. After adapting and modifying the script, the intervention design was sent to the fourth author, who acted as critical friend in this study. Taking into account his comments, the first author elaborated the final protocol of the intervention.

### Ethics and Athlete Sampling

After obtaining all necessary institutional permissions, athletes were selected, following recommendations about purposeful sampling in qualitative studies (Robinson, 2014). Accordingly, we defined a sample universe, we decided upon a sample size, through the conjoint consideration of epistemological and practical concerns, we selected convenience sampling as our sampling strategy, and we decided on contacting partner clubs and high-performance centers for sample sourcing. Suitable candidates were identified and contacted for an initial evaluation, via Skype, 1 week prior to the intervention. At the beginning of this interview, the athletes were informed about the procedures of the study and signed an informed consent form. Regarding confidentiality, athletes were informed that their names would be changed in the final report and none of their actual intervention discussions (i.e., text messages) would be published. In addition, the athletes were told that they would receive a copy of the summary of each interview to highlight sections that we should not quote in the article.

# Initial Interview

fpsyg-10-01819 August 3, 2019 Time: 14:38 # 4

One week before the intervention, the athletes participated in a brief interview conducted by a researcher independent of the intervention. The interview consisted of three parts. In Part 1, the athletes were asked personal descriptive questions (e.g., age, hours of practice, and best results in competitions). In Part 2, the athletes commented on their emotions, confidence, motivation, and thoughts in sport, and their corresponding self-regulation skills. In Part 3, the athletes were asked about their self-talk, in terms of frequency, typical things they say to themselves, and the effects of their self-talk on their sport participation.

Less than 48 h after the completion of the interview, each athlete was sent a transcript of her interview and a short summary, so that she could undertake modifications by rephrasing, eliminating, or adding ideas (if necessary). Once each athlete reflected on the interview transcriptions and the summary, the latter was sent to the psychologist who conducted the intervention.

# Introductory Video

On day 1 of the intervention, the psychologists contacted each of the athletes, sending them an introductory video via WhatsApp messenger. In this video, the leading researchers were introduced, and the general goals of the study were described. Specifically, the athletes were informed that this study aimed to test the effects of an online intervention on goal-directed thoughts in sport. Furthermore, the athletes were introduced to the idea of goal-directed self-talk, described as self-talk used intentionally to solve a problem or make progress on a task (Latinjak et al., 2019a). Several non-sport-related examples were offered in the video so as to inform but not bias participants.

After defining goal-directed self-talk, the general procedures of the intervention were outlined. That is, (a) all communications between you and the psychologist will take place in WhatsApp; (b) a typical session consists of you describing a problematic situation in your sport, reflecting on your goal-directed self-talk in that situation, and evaluating potential alternative self-statements; (c) the aim of the intervention is to encourage you to reflect on your goal-directed self-talk, and so, the psychologist solely formulates questions and hardly ever provides answers; and (d) because research protocols have to be followed, other issues besides goal-directed self-talk cannot be discussed over the course of this intervention. Each of these points was accompanied by non-sport-related examples. After seeing the introductory video, the athletes were invited to formulate questions and they were informed that a psychologist would contact them within the next 3 days to start the intervention.

### Intervention Sessions

During the intervention period, athletes were contacted every 3–4 days by their psychologist via WhatsApp, as scheduled by the athlete at the end of the previous session. Two days after the introductory video, the athletes were contacted for the first scheduled session. The psychologist opened the conversation asking the athlete "is it a good time to talk?" A typical session consisted of five consecutive main questions: (a) report a problematic situation that has occurred to you recently during training or competition; (b) what did you say to yourself in that situation to cope with your problems; (c) why did this statement help you to cope with the problems in that situation, or why did it not; (d) think of any alternative self-statement you could have used instead to self-regulate more effectively; and (e) why would this alternative statement be better compared with the original one to cope with the problems in the situation. Nonetheless, variations to this typical flow of the sessions were also foreseen (**Figure 1**).

# Post-intervention Interview

In the week following the intervention, the athletes were contacted again via Skype by the same researcher who conducted the initial evaluation for a second interview. The post-intervention interview consisted of three parts. In Part 1, the athlete was asked to evaluate the general procedures of the intervention. Specific attention was paid to (a) the WhatsApp conversations and (b) the Socratic questioning approach. Both endorsement and constructive criticism were encouraged. In Part 2, the athletes were asked to reflect on changes they noted, or failed to notice ("have you hoped for some changes to take place, that haven't taken place?"), with regard to the experience of and coping with emotions, confidence, motivation, and thoughts and attention. Finally, in Part 3, the athletes were asked to reflect on changes they noted, or failed to notice, regarding their use of self-talk as a self-regulation strategy. Again, less than 48 h after the interview, each athlete received a transcript of her interview and a short summary, so that modifications could be made, if necessary.

# Third-Party Interviews

During the post-intervention interview, permission was requested to contact a significant person related to their sport (e.g., coach). The choice of that person was left to the athlete. Interviews with the third persons were conducted, within 2 weeks post-intervention, via Skype, by the same researcher who conducted the previous interviews with that athlete. During this interview, generic open-ended questions inquired into any changes in the athlete the coaches had observed during the past month.

# Follow-Up Interview

Three months post-intervention, the athletes were contacted via Skype by the same researcher who conducted the previous evaluations, for a third interview. During the follow-up interview, the athletes were asked to reflect on changes in their sport, or even outside sport, that might (partly) be explained by the intervention conducted 3 months earlier. Some questions were also directed at exploring habits of self-reflection about self-talk participants might have acquired. Identical member checking procedures to those used previously were employed.

# Psychologist's Reflections

During the intervention, the psychologists followed a structured diary, enabling several intervention-control variables to be assessed: number of sessions per athlete (excluding the initial video), number of athlete messages, and a word count of athlete messages. Additionally, after the intervention had terminated, they were asked several questions regarding each athlete. In particular, they reflected on (a) the general functioning of the sessions, (b) any progress they had noted, and (c) shortcomings or limitations of the interventions. Once this information was compiled and structured, the psychologists were given a copy and asked to reflect on the information, correcting any mistakes, reformulating ideas, and adding missing information.

# Data Analysis

An interpretative phenomenological analysis was chosen to evaluate the data in this study. This approach enabled us to focus in depth on the interpretations and experiences of the athletes and psychologists. Furthermore, interpretative phenomenological analysis has recurrently been used in previous studies with small numbers of participants (Robinson, 2014).

In this study, the interpretative phenomenological analysis consisted of four steps that were consecutively performed on the transcripts of each athlete and psychologist. On an individual level, the analysis included (a) searching for themes by reading and re-reading all interviews and text messages of the intervention and (b) identifying and labeling themes that characterize the experience and perceived effects of the intervention. On a group level, the two remaining steps consisted of (c) connecting the themes to make global sense of the athletes' and psychologists' reports and (d) producing a table for each participant (**Tables 1**–**3**) and two tables to summarize the reports of the athletes (**Table 4**) and psychologists (**Table 5**). For Sandra, no individual table was prepared, as she withdrew prematurely after 2 weeks of intervention. She completed only the initial interview and, after quitting, agreed to answer a few questions regarding her withdrawal instead of taking part in the post-intervention interview.

### Establishing Confidence

Regarding the list of universal criteria for rigor in qualitative research (Tracy, 2010), a relativist approach was adopted in the present study (Sparkes and Smith, 2014). The following criteria were included: the worthiness of the topic; the significant contribution of the work; rich rigor, that is, sampling diverse athletes and psychologists to gather a variety of data from different sources that allow a complex phenomenon to be understood; and the meaningful coherence of the research, indicating how well the study interlinks in terms of the aim, method, and results. Furthermore, the authors practiced self-reflexivity to consider how their perspectives influenced data collection and analysis. For example, having identified the first author's potential bias in favor of the intervention's effects, it was decided to have independent psychologists perform the intervention, to collect data from multiple sources, and to use multiple voices in the data analysis.

Regarding multiple sources of data and multiple voices in the analysis, these allow for different facets of problems to be explored to deepen our understanding. Besides the first author, the psychologists, and the athletes, the fourth author of this study served as a critical friend, reviewing the intervention procedures and commenting critically on the final draft of the manuscript. In agreement with Cowan and Taylor (2016), the role of the critical friend was to encourage reflections upon, and exploration of, multiple and alternative explanations and interpretations of the data sampled in this study. For example, the critical friend was very important when we discussed the reasons why one of the athletes stopped the intervention prematurely. Based on his comments, we considered the relative lack of experience of the younger psychologist as a contributing factor. Furthermore, in order to facilitate a balanced perspective, efforts were made during all interviews to capture and interpret both endorsements and constructive criticisms of the intervention.

# RESULTS

In this section, the implementation and perceived effects of the intervention are described. First, we present the psychologists' reports on the intervention and on the progress and limitations of their athletes. Subsequently, we summarize the evaluations athletes made during the post-intervention interviews. Furthermore, a third section outlines the specific outcomes of the intervention as interpreted by the researchers from the athletes' interviews. Lastly, some testimony is offered, from third persons who were close to the participants during the intervention.

# The Psychologist's Evaluation

### Intervention Sessions

Two athletes, Maria and Anna, responded well to the established timetable (**Table 5**). Julia frequently changed the schedule and Sandra stopped the intervention after 2 weeks. Before canceling, Sandra had skipped several sessions and delayed others for several hours. Most sessions lasted between 20 and 45 min. With Anna, the sessions lasted much longer, up to 90 min. During the first sessions, she required up to 30 min to find a situation to discuss. After the third session, however, the sessions got noticeably shorter.

A total of 49 sessions were planned (12–13 per participant) and a total of 39 sessions were completed (6–12 per participant). All athletes but Sandra completed most of their scheduled sessions. Sandra only completed 6 out of 13 planned sessions. With regard to messages, 522 messages were sent from the psychologists to the athletes (83–169 per participant) and 499 messages were sent back from the athletes to the psychologists (78–156 per participant). See **Table 5** for more detailed information concerning the sessions.

### Content of Sessions

The athletes discussed a wide variety of idiosyncratic situations, including sport-specific situations, such as difficulties with a choreography (Maria), negative self-talk during competitions (Anna), problems concentrating (Julia), and situations in which things do not work out the way they were supposed to (Sandra). Furthermore, almost a third of the situations were not directly related to sport practice and performance. For instance, athletes talked about balancing free time and sports (Maria), and diet and injuries (Anna). Social conflicts, related to peers and coaches (Sandra), were also discussed. In all situations, athletes used self-talk to cope with exclusively negative experiences such as anxiety, fear, stress, anger, shame, guilt, sadness, frustration, and pressure. The thoughts related to the situations were also negative; "I can't stand the fatigue," "I am not helping the team," or, simply, "I can't" are typical examples. Both negative experiences and thoughts occurred in competition, training, and outside of sport practice. The absence of positive emotions can be explained by the difficulties that athletes have in identifying positive experiences as detrimental for performance (Latinjak et al., 2016).

With a particular emphasis on athletes' self-talk, two athletes, Maria and Julia, were able to discuss their self-talk in detail. Maria reported using instructions such as "Come on, concentrate, do it with energy" or "You can't do everything 100%." In her case, these instructions worked well, for example, when they helped her to accept the situation and not to see work as a loss of time. She also developed some new instructions during the intervention, such as, "Trust more in yourself and take your decisions" or "Think about the fun you will have tomorrow and that it was worth the effort." With regard to Julia, she reported having used instructions such as "You can do better, prove it" or "Calm down." These instructions helped if she managed to calm down. Nonetheless, often they ceased to work because she lost concentration or because some negative thoughts came back to debilitate her.

Online Reflexive Self-Talk Intervention

Latinjak et al.

August 2019 | Volume 10 | Article 1819

TABLE 1 | A summary of Maria's initial, post-intervention, and follow-up interview.

TABLE 2 | A summary of Anna's initial, post-intervention, and follow-up interview.

TABLE 3 | A summary of Julia's initial, post-intervention, and follow-up interview.

TABLE 4 | Athletes' reflections about the online goal-directed self-talk intervention during the final interview.

TABLE 5 | Overview of the length of the intervention, and basic reflections of the psychologists on their athletes' intervention, progress, and limitations.

Skill-execution-related instructions, often studied in predetermined instructional self-talk interventions (Hardy et al., 2015), were not discussed by the athletes. Nor did the psychologists feel the need to direct the athletes through questions toward instructional statements. According to the conscious processing hypothesis (Masters, 1992), modes of conscious control should mostly be used in early stages of learning, as they contrast with the typical automatic functioning of experts like the athletes in this study.

## Evaluation of Athletes' Progress

The psychologists noted a positive development in three of the athletes (**Table 5**). For example, according to the perceptions of Psychologist 1, Maria "gained awareness of her negative self-talk." Once awareness was raised, "she was able to reflect on it and turn it into positive thoughts." Finally, at the end of the intervention "she was able to look for alternatives in her self-talk." The importance of awareness and motivation to change negative self-talk has received support in earlier studies in sports (Hardy et al., 2009b). In comparison, Julia, even from the beginning, "had few problems identifying situations, emotions, thoughts, and self-talk, and reflecting on the effects of the latter." For athletes with less awareness, it might be valuable to complete a self-talk diary ahead of their first session, to help them raise awareness and make the sessions run more efficiently. Another characteristic, related to awareness, is the athletes' belief in their self-talk. While working with Maria, Psychologist 1 noticed that "she recognized that sometimes it's hard for her to believe in what she says . . . ." Similarly, Julia "had some difficulties believing in her self-talk and, hence, it often does not work." Previous studies have already focused on athletes' belief in their self-talk (Hardy et al., 2009a). It therefore seems important to strengthen athletes' beliefs in their inner voice, so that a change in self-talk content can be effective.

In the case of Sandra, who abandoned the intervention after only a few sessions, Psychologist 2 had not noted any progress. Sandra's considerations indicate that it was the use of the online text-messenger service, rather than the potential relative lack of experience of Psychologist 2, that may explain her withdrawal. Based on Sandra's discontent with the intervention format, Psychologist 2 felt she "never wanted to discuss any problematic situation that was really significant to her." Psychologist 2 also noted Sandra's resistance to talking sincerely and to changing her current self-regulation strategies (Zaltman and Duncan, 1977). Admittedly, Psychologist 2 lacked some of the experience needed to deal better with resistance. However, it was mainly the intervention protocol that failed to include evidence-based techniques to deal with resistance (Hatcher, 2015). Because resistance is to be expected in cognitive-behavioral interventions, future studies on reflexive self-talk interventions should include strategic responses to optimize client experience and outcomes.

### Advice for Practitioners

Based on their personal experiences, both psychologists formulated a series of proposals for applied practitioners. First, it is paramount to take your time to explore to some depth the situations that the athletes want to solve. It is those aspects they have not considered before that provide the best innovative solutions. Questions such as why anxiety is making you perform worse or why others do not have the same problem can help the athletes take an alternative perspective that leads to alternative goal-directed self-talk. Second, both psychologists agreed that a combination of text messages, voice recordings, and video-calls could be beneficial in applied practice.

# Athletes' Reflections

## Evaluation of the Intervention Format

The use of WhatsApp messenger received generally positive evaluations from Maria, Anna, and Julia, and negative evaluations from Sandra (**Table 4**). Generally, Maria and Anna acknowledged that the intervention fit very well into their daily routines. Nonetheless, this positive fit can be mediated by the tendency of athletes to use their mobile phones during the day. Sandra, on the contrary, frequently forgot her phone at home, where she did not return until very late every day.

The written messenger service format was rated positively because the athletes had time to think (Anna) and to write their answer, to change their answers, or to complete their answer before sending it (Julia). The disadvantages of the written messenger service were a lack of personal contact (Maria) and the absence of gestures (Anna). Although Maria and Sandra suggested that video chats might be an alternative to the written messenger service, for Julia it was the written format that had advantages over the video chat.

The Socratic questioning approach (McArdle and Moore, 2012) elicited disparate opinions among the athletes. Generally, Maria rated the questioning approach positively, Anna both positively and negatively, and Julia and Sandra rather negatively. Both Maria and Anna acknowledged that the Socratic questioning approach required finding solutions on their own. For instance, Maria told us that "we are used to get feedback on our responses, but then I thought maybe that is not necessary; things are neither good nor bad . . . ." Additionally, Anna appreciated that she did not feel judged by the psychologist. Regarding the criticism of the Socratic questioning approach, both Anna and Julia found it frustrating not to receive any feedback from the psychologist. For instance, Anna explained that she "felt sometimes lost, in need for orientation or assessment." For Sandra, the problem was that the questions were repetitive. She reported that "one day I told her [Psychologist 2] that I had trained very well; either way if I had told her that the training was terrible, she would have asked me the exactly same question." For context, please keep in mind the earlier argument on resistance in the relationship between Psychologist 2 and Sandra.

### Overall Impression of the Intervention


When asked to critically evaluate the intervention, Maria, Anna, and Julia had a generally positive opinion (**Table 4**). For example, Julia told us that "I didn't expect anything . . . but it went very well." Anna specified that she "expected many questions and answers, and it was like that," and then she "noticed spectacular changes." Maria based her positive opinion on her increased awareness of self-talk. She reported that she "gained much more consciousness than before" when she "was never aware of how some thoughts can affect you . . . they can change things in some situations." Sandra had a negative experience with the intervention. Specifically, the structured nature of the intervention did not meet her expectations and preferences. She declared that "she [Psychologist 2] says something the first day, the same on the second and on the third day, and the fourth day I got tired."

# Follow-Up Interviews

In follow-up interviews, Maria, Anna, and Julia reported that some of the intervention effects on their self-talk were still noticeable (see **Tables 1**–**3** for Maria, Anna, and Julia, respectively). Consistent with their post-intervention interviews, they kept noticing an enhanced awareness of self-talk. Maria told us that she "was much more aware of self-talk while [practicing my sport]." Moreover, she also detected that her self-talk was much more positive, insofar as "no more 'I can't' or 'you are doing it wrong' or 'people are watching you'." Instead, she used much more constructive statements, such as "come on, go!"

Furthermore, the three athletes acknowledged that for 3 months the changes in self-talk had a continuous impact on other performance-related variables. For example, Anna noted improvements in her concentration. She told us that she still had "a tendency for mind wandering" but now she told herself "it's time to focus on the here and now." Julia, in turn, noted improvements in her emotional control. She reported that before she was "very nervous, like so scared" and "now it's like 'no, we can win and calm down, and if we don't, nothing happens'." These comments were deemed positive, although it is unlikely that these changes can be attributed exclusively to the intervention. Be that as it may, the comments provide support for the athletes' engagement with and acceptance of the intervention, as all three see the intervention as the cause of positive changes in their sport.

According to Maria, Anna, and Julia, the intervention had positive long-term effects that were not restricted to sports because they identified changes in self-talk in other areas of life. Anna for example used self-talk consciously "in other areas of life, such as in academics or now when looking for a new job." Julia believed that "the whole program, the 'what do you say to yourself' and 'what could you have said differently to yourself', can help you in all your life."

It was found that even 3 months after the intervention, the athletes still evaluated the intervention as a positive experience. For Maria, it was important to discover "that I have a psychologist inside, an 'inner I,' that I believe a lot in this inner I, that she can talk to you, help you, or, on the contrary, haunt you." More specifically, Julia remembered that "the WhatsApp (. . .) was a key point because in person I sometimes just think 'I don't know,' but because I could write, you get your moment to think and answer . . . about things I couldn't imagine to be potentially interesting." On the basis of her experience with the intervention, Anna had even expressed her wish to continue working with her psychologist beyond the reflexive self-talk intervention. This suggests that online interventions for athletes can be a simple first step to commence working on psychological aspects, with positive experiences, leading to engagement in broader collaborations with sport psychologists.

# Interpreting Changes Across Athletes' Pre- and Post-intervention Interviews

In this section, we present our interpretation of the pre- and post-intervention interviews (**Table 4**). This was possible only for Maria (**Table 1**), Julia (**Table 2**), and Anna (**Table 3**), as Sandra withdrew from the intervention. Sandra agreed to the final interview, but only to evaluate the intervention and briefly explain, from her point of view, what went wrong. Overall, our interpretation of the interviews suggests that the potential benefits of the intervention on performance are likely to result from the following sequence: the reflexive self-talk intervention (a) raises awareness of previous self-talk, (b) changes self-talk content, and (c) helps with performance-related variables like emotions, motivation, or confidence.

Generally, Maria, Julia, and Anna justified the positive effects of the intervention with an increase in metacognitive knowledge. Both Maria and Julia underlined that they gained awareness as they realized how they "might have thought things more unconsciously (Maria)" or that now they "have seen all that I say to myself, there are many things" and that they "see how important these things are (Julia)." Similarly, Anna reported that the intervention had helped her to understand the importance of self-talk ("I see how important the things I was telling myself before were") and how self-talk had influenced her previous decisions ("It was fantastic, getting aware of unconscious decisions I had taken based on my negative self-talk").

Alongside their increased awareness, all three athletes also noted positive experiences in refining their self-talk. For example, Maria changed her self-talk patterns as she "tried to be more positive than before, when I had more negative self-talk (. . .) to psych up and not to drag me down." Anna even managed to transfer past successful self-talk experiences to future situations. She explained that "in situation in which I got conscious that my self-talk was positive, I kept these statements, as a tool, for other moments." Julia managed to overcome a problem she had previously experienced when attempting to purposefully use self-talk: "The confidence self-talk hadn't work well before. Now I know I need to take a different approach . . . I have seen that I shouldn't tell myself 'you can' and eliminate all the negative thoughts." Julia now focuses her self-talk on finding solutions for her problems instead of increasing confidence. She understood that confidence is a consequence of having found viable solutions. This last quote shows a connection between the reflexive self-talk intervention and the coping literature, where studies have found that female athletes use emotion-oriented rather than problem-oriented coping strategies (Crocker et al., 2015), although the latter generally lead to better outcomes (Nicholls and Polman, 2007).

The awareness of and changes in self-talk were associated with improvements in performance-related variables. Anna, for example, detected progress in her emotion regulation. She reported that "before fear stopped me (. . .); I wouldn't even try to compete in my gym (. . .); Now I go!" Maria also described positive changes in her motivation, as the intervention helped her "to motivate myself better in all these short moments when I decay." For Julia, the most important change was related to her confidence. She told us that confidence is "where I've seen most changes, and for the good, of course; (. . .) I approached things differently, and that was like a door that opened."

Nevertheless, the athletes also recognized that further changes in the awareness and content of self-talk were required to better self-regulate. Maria, for example, admitted that "I think I still can't cope [with self-talk] 100%, to think always positive things." Specifically, she told us that "sometimes I am unaware of self-talk, and I don't comprehend that I could cope with things using self-talk." Maria and Anna argued that they would have needed more time. For instance, Maria told us that she "had not enough time to assimilate it all," and Anna recognized that "the intervention showed me the basics, I gained consciousness, but there still is a large way to go." On the positive side, Maria and Anna were keen to continue the intervention even 3 months after it had ended.

# Third-Party Reflections on the Intervention

Two athletes, Maria and Julia, gave us permission to contact a significant person in their sport environment to corroborate the effects of the intervention. By contrast, Anna did not allow us to contact anyone close to her. She preferred "those few people, who know me well enough to evaluate any changes, not to be involved with the intervention." Marc, Maria's training partner, noted meaningful changes that confirm her reports on enhanced self-motivation. Before the intervention, "Maria tended to react negatively to challenges and mistakes," Marc explained. She "was the first to say things like 'I can't do it'," which "had effects on others, because if you have someone telling your constantly 'I can't, I can't', (. . .) well, we have to be positive." After the intervention, Marc noticed that "she lets herself go more, she's focused on enjoying herself." In summary, Marc saw her "more motivated, more optimistic." Julia's coach, Oriol, also corroborated the positive changes his pupil had noticed in her confidence. He explained that "she started to show a lot of confidence, she finished off plays, and she took responsibility in very important moments during the games, something anyone wouldn't do without confidence."

# DISCUSSION

In this study, an online version of a novel reflexive self-talk intervention (Latinjak et al., 2016) was implemented, and experiences of its application and perceived effects were gathered over a prolonged period of time from multiple sources. The online text-messenger format received both approval and criticism. The potential beneficial effects of the intervention seem to be based on (a) raised awareness of previous self-talk, (b) refined self-talk content, and (c) effects on performance-related variables such as emotions, motivation, or confidence. The intervention was rated positively by three of the four participants, who noted positive effects both in sport and outside their sport.

Self-awareness has been identified as a fundamental psychological skill for athletes and one of four fundamental components of effective self-regulation (Vealey, 2007; Heatherton, 2011). Awareness is also connected to metacognition, insofar as Zimmerman (2000, p. 65) defined metacognition as "the awareness of and knowledge about one's own thinking." This is relevant as metacognition is an essential component of self-regulation and its primary functions are to monitor and control the thoughts and actions required for sport performance (Brick et al., 2016). As a result, it is thought that the effects of our online reflexive self-talk intervention are accompanied by an improvement in metacognition, which is caused by the reflection and planning of self-talk.

According to Zinsser et al. (2006), it is possible that this heightened awareness can cause athletes to change their self-talk patterns in order to improve sport performance. However, it is likely that self-talk does not affect performance per se, but through changes in performance-related mechanisms (Galanis et al., 2016). In the present study, the participants reported benefits in terms of concentration, confidence, motivation, and emotional control. This is in line with goal-directed self-talk categories that have been uncovered in previous studies. Boudreault et al. (2018) described, as an example, motivational and emotion control functions of goal-directed self-talk, which reflect many of the participants' comments on the outcomes of the present intervention. Likewise, concentration and confidence-oriented statements are among the most replicated findings in the research on goal-directed self-talk (e.g., Latinjak et al., 2014, 2019b). However, all of these studies were descriptive and therefore cannot establish a causal link between goal-directed self-talk and performance-related variables. In order to find inferential evidence, one must refer to research with strategic self-talk interventions (e.g., Hardy et al., 2015), in which self-talk is, however, far less self-determined. These studies indirectly support the findings of this project as they demonstrate that self-talk can have a positive effect on concentration, confidence, motivation, and emotional control (e.g., Tod et al., 2011), and that changes in these factors may partly explain how self-talk improves sport performance (Hatzigeorgiadis et al., 2014).

# Issues Relevant to Applied Practice

Several considerations are important before utilizing the reflexive self-talk intervention. These relate to expectancies and/or preferences of athletes when working with sport psychologists. First, when opting for self-talk interventions, cognitive processing preference should be considered. For instance, it was apparent that Anna had very little preference for self-talk, and this coincided with her being relatively unaware of her inner dialog and how her inner dialog affected her sport participation. Conversely, Julia showed a strong preference for using self-talk and was, thus, relatively conscious of her self-talk even prior to the intervention. Nevertheless, there remains little evidence about the impact of cognitive processing preference on the use of self-talk and its effects (for an exception, see Thomas and Fogarty, 1997). In the present case, cognitive processing preference might explain why the intervention was considered too short by Anna, and why the time gap between sessions was perceived as too narrow by Julia, who eventually ran out of self-talk to discuss. With regard to further individual differences and their effect on self-talk use, it is noteworthy in view of the present study that previous studies found differences between males and females (Latinjak et al., 2017; Ada et al., 2019).

Second, applied practitioners need to decide whether to use the traditional strategic or the innovative reflexive self-talk intervention. Strategic self-talk interventions are simpler and lead to fixed self-talk plans to be used at particular instances to deal with fixed and specific performance issues (e.g., see also the IMPACT-ST model by Hatzigeorgiadis et al., 2014). Alternatively, more self-determined interventions, such as the reflexive self-talk intervention (Latinjak et al., 2016), are more malleable and less controlled, and aim to improve metacognitive skills. Within reflexive self-talk interventions, a Socratic questioning approach is indispensable. Maria and Anna evaluated the Socratic questioning approach positively, whereas Julia and Sandra would have preferred more guidance and assurance. Both psychologists advocated the use of scaffolding for applied practice. Scaffolding is when the psychologist provides temporary support to the athletes to gain a deeper understanding of their psychological challenges and the role of self-talk as a psychological skill (James et al., 2010). In this context, athletes may first become familiar with basic aspects of cognitive therapy. Such psychoeducation on the influence of thought on emotions and behavior has proven to be important for cognitive interventions (Kazantzis et al., 2018). Along these lines, guidelines on the use of feedback should also be included in the intervention protocol. In time, the scaffolds used at the beginning of the intervention would gradually be removed as athletes progressively gain an understanding of the reflective task they are to perform. Overall, it will be important in future studies to add detailed guidelines for the provision of scaffolding to the intervention procedures, and thus successfully overcome challenges such as resistance. This information could be particularly useful for relatively inexperienced practitioners such as the psychologists who participated in this study.

Third, use of an online intervention delivery format or traditional face-to-face sessions (Latinjak et al., 2016) is worthy of further consideration. In the present study, the athletes communicated with their psychologist by mobile phone. This format was chosen because online interventions, administered via mobile phones, will become more and more accessible to different populations as the rate of ownership of smart phones rises [e.g., in the United Kingdom, from 60% ownership in 2013 to 80% by the end of 2017 (García et al., 2016)]. However, in practice, athletes should feel comfortable with mobile phones for this delivery option to be viable. Sandra, for instance, used her discomfort with mobile phones to partially explain her withdrawal from the intervention. To contextualize the experiences reported in this study, it should also be noted that demographic studies have shown that men and women use online messenger services differently (Rosenfeld et al., 2018).

Fourth, having chosen the online format, the applied practitioner is still left with the choice of written or verbal communication. Julia explicitly acknowledged the importance of the written response format as it allowed her to take her time and to write and rewrite her answers. However, Maria and Sandra would have preferred video chat in combination with the text-messenger application. Our decision to employ a text-based format was informed by Pennebaker's (1997) work investigating expressive writing. Nonetheless, Pennebaker and Seagal (1999) reported that expressive writing and expressive talking should have comparable effects. Based on the available evidence, practitioners may consider taking the athletes' preference of one or the other communication format into account.

# Methodological Considerations

In this investigation, member reflecting was used as a means to enhance rigor in the qualitative research design. Member reflections were considered to ensure the manuscript would reflect the subjective experience of both athletes and psychologists. Hence, in this study epistemological constructivism and ontological relativism were preferred over ontological realism (Smith and McGannon, 2017). This approach was aligned with the present goal to collect qualitative evidence about the delivery and perceived effects of our online reflexive self-talk intervention. Future studies, however, should also be grounded within an ontological realism framework; that is, gather objective evidence to confirm the effects of the reflexive self-talk intervention in sports, and its broader effects beyond the boundaries of sport. With regard to future research, we also recommend testing the application and effects of reflexive self-talk interventions in non-sport contexts. The use of goal-directed self-talk is common in a variety of contexts, including but not limited to physical activity and academic and professional activities.

# CONCLUSION

This innovative study provides a detailed insight into an online version of a reflexive self-talk intervention. The steps of the intervention protocol and involvement of the client are best summarized by: (a) a description of recurrent problematic situations in and around sports, (b) reflections on situation-specific goal-directed self-talk and its effectiveness, and (c) the development of alternative statements that can be used in future situations. The online text-messenger may be beneficial as it allows athletes (a) to engage with the intervention when it best suits them, at any location of their convenience, (b) to take as much time as they require to reflect on the intervention questions, and (c) to give their concise responses in a written format. The potential beneficial effects of the intervention seem to be based on (a) raised awareness of previous self-talk, (b) refined self-talk content, and (c) effects on performance-related variables like emotions, motivation, or confidence. The intervention protocol displayed in **Figure 1** can be taken as a starting point for applied practice. Yet, some sections of the protocol would benefit from further development. To improve the protocol, applied practitioners should: (a) integrate guidelines for dealing with resistance, (b) consider using scaffolding during the initial sessions, and (c) combine text messenger and video chat options. The increasingly popular voice recording function in text-messenger applications is another suitable option.

The intervention described in this study is very different from the traditional strategic self-talk interventions investigated over the last three decades, which have yielded generally positive effects for sport performance (Tod et al., 2011). Whereas strategic self-talk interventions targeted changes in psychological processes, such as confidence or emotions (Hatzigeorgiadis et al., 2009), this new reflexive self-talk intervention aims to enhance metacognitive knowledge. Athletes are encouraged to learn about themselves, and to use this knowledge to better self-regulate both in and outside of their sport. Hence, this is a self-talk intervention developed and applied in sport, with potential beneficial effects for the athlete in other areas of life. Our highly unusual online delivery format also serves as a reminder to both practitioners and researchers of the need to be responsive to changes in, and athletes' use of, information technology. This intervention represents one of very few in the sports psychology literature that embrace an online methodology.

# DATA AVAILABILITY

The datasets generated for this study are available on request to the corresponding author.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the University of Suffolk Research Ethics Committee, with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the University of Suffolk Research Ethics Committee.

# AUTHOR CONTRIBUTIONS

AL and JH designed the study. AL, CH-G, and LL-M collected and analyzed the data. CH-G and LL-M ran the intervention. AL wrote the manuscript. JH made extensive comments on the manuscript. CH-G and LL-M provided feedback on the manuscript.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Latinjak, Hernando-Gimeno, Lorido-Méndez and Hardy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Penny for Your Thoughts: Children's Inner Speech and Its Neuro-Development

*Sharon Geva1 \* and Charles Fernyhough2*

*1 Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom, 2 Department of Psychology, Durham University, Durham, United Kingdom*

Inner speech emerges in early childhood, in parallel with the maturation of the dorsal language stream. To date, the developmental relations between these two processes have not been examined. We review evidence that the dorsal language stream has a role in supporting the psychological phenomenon of inner speech, before considering pediatric studies of the dorsal stream's anatomical development and evidence for its emerging functional roles. We examine possible causal accounts of the relations between these two developmental processes and consider their implications for phylogenetic theories about the evolution of inner speech and the accounts of the ontogenetic relations between language and cognition.

#### *Edited by:*

*Thomas M. Brinthaupt, Middle Tennessee State University, United States*

#### *Reviewed by:*

*Emily M. Elliott, Louisiana State University, United States Cyrille Magne, Middle Tennessee State University, United States*

> *\*Correspondence: Sharon Geva s.geva@ucl.ac.uk*

*Specialty section: This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology*

*Received: 09 October 2018 Accepted: 09 July 2019 Published: 14 August 2019*

#### *Citation:*

*Geva S and Fernyhough C (2019) A Penny for Your Thoughts: Children's Inner Speech and Its Neuro-Development. Front. Psychol. 10:1708. doi: 10.3389/fpsyg.2019.01708*

Keywords: neural developmental mechanism, dorsal language pathway, ventral language pathway, arcuate fasciculus, superior longitudinal fasciculus

# DEVELOPMENT OF INNER SPEECH

Inner speech – the experience of speaking silently in one's head – is an enigmatic everyday phenomenon. It has been suggested to play an important role in psychological processes as diverse as memory, cognition, emotional regulation, auditory verbal hallucinations, and even consciousness and self-reflection (Alderson-Day and Fernyhough, 2015). Various domains of scholarship, including philosophy, psychology, and neuroscience, have seen renewed interest in inner speech, where it is seen as providing a context for exploring questions about the relationship between language and thought, the boundary between typical and atypical experience, and the emergence and maintenance of self-regulation (Fernyhough, 2016).

The origins of modern interest in inner speech can be traced to the Russian developmental psychologist, Vygotsky, who proposed that it develops through the gradual internalization of linguistic interactions that have been shaped by social interaction. Vygotsky argued that infants begin life embedded in social exchanges which, with the emergence of language, become linguistically mediated. In time, words that had previously been used to regulate the behavior of others are "turned back on the self" to regulate the child's own behavior. In the preschool and early school years, such self-directed speech is mainly overt and audible, constituting a developmental stage known as private speech. With further development, these overt dialogues with the self become internalized so that they are entirely covert and inaudible, marking the development of inner speech.

Research in the last few decades has largely confirmed Vygotsky's view of the development and functions of private and inner speech (Winsler et al., 2009). In particular, empirical studies have supported Vygotsky's insight that private speech peaks in the preschool and early school years (between 4 and 7 years of age) and gradually reduces in frequency in middle childhood (Winsler et al., 2009). Although studying inner speech in childhood is fraught with difficulty, there is a consensus that this pattern corresponds to the emergence of fully internalized inner speech as private speech "goes underground" (Vygotsky, 1987), and the findings suggest that children begin to understand the concept of inner speech in the preschool and middle school years (Flavell et al., 1993, 1997, 2001; Fernyhough, 2009). Furthermore, there has been a growing recognition that overt self-directed speech (or private speech) continues to have important psychological functions into adulthood (Duncan and Tarulli, 2009). Fernyhough (2004) has proposed that adults can move flexibly between inner and overt private speech.

Studies of the various linguistic parameters in inner speech have so far focused on adult inner speech. Oppenheim and Dell (2008) have suggested that inner speech is phonetically impoverished in comparison to overt speech because inner speech lacks some of the phonetic components present in overt speech or because the internal monitoring system fails to detect the full range of phonetic features of the produced inner speech. However, others have shown that phonetics is fully specified in inner speech. For example, Corcoran (1966) has shown that readers automatically access phonetics in inner speech during silent reading. Özdemir et al. (2007) reported that the "uniqueness point", the place in the sequence of the word's phonemes at which it deviates from every other word in the language, influenced phoneme monitoring in inner speech suggesting that inner speech is specified to the same level as overt speech. Slevc and Ferreira (2006) documented a phonemic similarity effect in inner speech, again suggesting phonemic representation in inner speech. An fMRI study showed that manipulation of phonetic variables affects activation in phonological regions, even during a covert condition (Kell et al., 2017). Lastly, people's ability to detect verbal transformations in inner speech (Sato et al., 2004) also suggests that the phonological representation is highly specified in inner speech. Others found that inner speech monitoring is influenced by lexical bias, suggesting that it is specified at the lexical level (Nooteboom, 2005; Geva and Warburton, 2019). While Slevc and Ferreira (2006) showed that monitoring of inner speech is not subject to the semantic similarity effect, this should not be simply interpreted as inner speech lacking semantic information. Rather, it might be that semantic information is not used for the task of monitoring errors. 
Recent studies have also suggested that inner speech carries prosodic information (Breen and Clifton, 2011; Filik and Barber, 2011; Geva and Warburton, 2019). However, it has been argued that information about prosody can be accessed by speakers even before inner speech is evoked (Coltheart et al., 1993; Rastle and Coltheart, 2000), and studies of tip-of-the-tongue states lend some support to this argument (reviewed in Geva and Warburton, 2019).

Drawing on ideas of Vygotsky (1987), Fernyhough (2004) has suggested that inner speech can vary from fully specified expanded inner speech to a highly condensed form, with these variations reflecting levels of specification of syntax, semantics, and phonology. Expanded inner speech bears fully specified linguistic information and is similar to overt speech, while condensed inner speech lacks phonology (and all linguistic levels that follow, such as prosody and articulation) and full syntactic structure, and its semantics may be different to that of overt speech, such as being more idiosyncratic and personal in nature. Fernyhough (2004) further suggests that the transition from expanded to condensed inner speech is part of a developmental process and that adults can move flexibly between different forms of inner speech and overt private speech as conditions and task demands change.

# NEURAL CORRELATES OF INNER SPEECH

With advances in neuroscientific methodology, attention has turned to the neural correlates of self-directed speech, although to date, this has mostly focused on inner speech in adults (Perrone-Bertolotti et al., 2014; Geva, 2018). Recent studies of inner speech function in adults with brain damage have shown that, for some patients, inner speech can be preserved while there is marked impairment in overt speech. More interestingly, other individuals can have preserved overt speech, but at the same time a salient impairment in inner speech (Geva et al., 2011a; Langland-Hassan et al., 2015; Stark et al., 2017). This dissociation suggests that somewhat distinct neural mechanisms support each type of speech. Although inner speech is (in the Vygotskian view) seen as developing out of overt speech, the process of internalization involves various types of semantic and syntactic transformation (Vygotsky, 1987) which make plausible the involvement of distinct neural substrates.

In the last 40 years, hundreds of functional imaging studies have examined the neural correlates of inner speech. These studies have used diverse tasks ranging from silent word repetition (Shuster and Lemieux, 2005; Pei et al., 2011), verb generation (Frings et al., 2006), stem completion (Rosen et al., 2000), and rhyme judgment (Paulesu et al., 1993; Pugh et al., 1996; Lurito et al., 2000; Poldrack et al., 2001; Owen et al., 2004; Hoeft et al., 2007) to silent reading (Bookheimer et al., 1995). Converging evidence from these studies of task-dependent inner speech points to the involvement of the left inferior frontal gyrus (IFG), and the left angular (AG) and supramarginal gyri (SMG) in the production and processing of inner speech (reviewed in Geva, 2018). These areas are connected *via* the dorsal language stream (Hickok and Poeppel, 2007; Saur et al., 2008), suggesting that it is involved in inner speech processing (Geva et al., 2011c; Rijntjes et al., 2012).

Spontaneous inner speech has scarcely been studied, but findings so far support those from studies of task-dependent inner speech. A study by Doucet et al. (2012) found higher levels of spontaneous inner speech to be associated with an increase in spontaneous fluctuations of activity (tested using resting state fMRI) in a fronto-parietal network, which includes the IFG, temporo-parietal junction, and superior temporal regions. In accordance with this result, it was shown that during resting state (while participants lie inside the scanner without performing any task and without exposure to any specific external stimulus), significant bursts of activation can be recorded in bilateral auditory cortex, which might be related to spontaneous occurrences of inner speech (Hunter et al., 2006). A detailed study of a single participant experiencing spontaneous inner speech in the scanner showed activation in left IFG and superior temporal sulcus (STS) as well as superior and middle temporal gyri during inner speech compared with rest. Left IFG activation was also present when comparing inner speech to other inner experiences (Kühn et al., 2014). In the only fMRI study that has directly compared spontaneous and elicited inner speech, a Region of Interest (ROI) analysis was used to contrast inner speech elicited by a task with occurrences of spontaneous inner speech. The results showed distinct patterns of activation associated with the two speech types, with left IFG activating in elicited, but not in spontaneous inner speech (Hurlburt et al., 2016). The implication of this finding is that activations associated with task-based inner speech should not be assumed to reflect those found when inner speech arises spontaneously.

Buchsbaum and D'Esposito (2008) suggested that area Spt (Sylvian parietal temporal area, which is located within the Sylvian fissure at the parietal-temporal boundary) is the key area along the dorsal language stream that acts as an interface between the auditory-phonological system and the motor system. This function would implicate it in inner speech production in adults and would point to its potential as a starting point for exploring the neural substrates of inner speech in childhood. In the next sections, we present the current knowledge of dorsal stream anatomy and then discuss its development during childhood, as well as what is known about its function in pediatric populations.

# DORSAL LANGUAGE STREAM ANATOMY

The dorsal language stream has been studied for more than a century, beginning with the seminal work of Dejerine (1895) and Wernicke (1874). It is specified in the classical Wernicke-Lichtheim-Geschwind anatomical model, where it is suggested that Broca's area is connected to Wernicke's area in the posterior temporal cortex *via* the arcuate fasciculus (AF). Advances in neuroimaging allowed further anatomical characterization of the dorsal language stream. In the past, connections between various areas in the human brain were mainly studied postmortem. Today, the preferred methodology for defining anatomical white matter connections *in vivo* is diffusion tensor imaging (DTI). DTI images quantify the level and direction of the movement of water molecules in a tissue. As water molecules behave differently in different types of tissue, DTI can reliably distinguish between cell bodies (gray matter), tracts (white matter), and cerebrospinal fluid (CSF) (Pierpaoli et al., 1996; Pierpaoli and Basser, 1996; Basser and Pierpaoli, 1998). In recent years, DTI studies have refined, altered, and expanded upon the classical Wernicke-Lichtheim-Geschwind anatomical model of the language system (Hagoort, 2014). For terms related to DTI methodology, see **Box – DTI Glossary**. For a review of the use of DTI in language studies, see Geva et al. (2011b).
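As a rough illustration of how DTI separates tissue types by the directionality of water movement, the sketch below computes fractional anisotropy (FA), a standard summary index in the DTI literature that is not named in the paragraph above. FA is derived from the three eigenvalues of the diffusion tensor: it is 0 when diffusion is equal in all directions and approaches 1 when diffusion occurs along a single axis. The function name and the eigenvalue triples are hypothetical and chosen only for illustration; they are not taken from any study cited here.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Return fractional anisotropy from three diffusion-tensor eigenvalues.

    FA = sqrt(3/2) * sqrt(sum((l_i - mean)^2)) / sqrt(sum(l_i^2))
    """
    mean = (l1 + l2 + l3) / 3.0
    deviation = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    magnitude = l1 ** 2 + l2 ** 2 + l3 ** 2
    if magnitude == 0:
        # No measurable diffusion: treat as isotropic by convention.
        return 0.0
    return math.sqrt(1.5 * deviation / magnitude)


# Hypothetical eigenvalues (arbitrary units), for illustration only.
csf_like = (3.0, 3.0, 3.0)    # equal diffusion in every direction
tract_like = (1.7, 0.3, 0.2)  # diffusion mostly along one axis

print(fractional_anisotropy(*csf_like))
print(fractional_anisotropy(*tract_like))
```

Under these assumed values, the tract-like voxel yields a much higher FA than the CSF-like voxel, which is the kind of contrast that tractography relies on when following white matter pathways such as the AF.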

Catani et al. (2005) suggested that in addition to the direct AF pathway between posterior temporal and inferior frontal regions (termed by Catani and colleagues *the long segment*), there are two other tracts: the anterior segment, which connects the posterior IFG with the inferior parietal lobe; and the posterior segment, which connects the inferior parietal lobe with the posterior temporal gyrus (see **Figure 1**). Later studies confirmed these findings in both adults (Parker et al., 2005; Frey et al., 2008) and children (Eluvathingal et al., 2007; Tak et al., 2016). These three segments are also referred to as the fronto-temporal (FT) segment (the long segment); fronto-parietal (FP) segment (the anterior segment); and temporo-parietal (TP) segment (the posterior segment) (Eluvathingal et al., 2007; see **Table 1**). In addition, imaging studies have suggested that a separate tract, the superior longitudinal fasciculus (SLF), also forms part of the dorsal language stream (Frey et al., 2008; Saur et al., 2008). The SLF can be divided into three components, of which only SLF III forms part of the dorsal language stream, connecting parietal area 40 (SMG), the ventral parts of pericentral Brodmann Areas (BA) 43, 2, 4, and 6, and BA 44 (pars opercularis) (Makris et al., 2005). SLF III differs from the long segment of the AF, which in its posterior part reaches the SMG (BA 40), posterior superior temporal gyrus (pSTG; BA 22), and the temporo-occipital region (BA 37). Lastly, it has been suggested that the dorsal pathway can be divided into two sections according to their frontal termination point: dorsal pathway I includes AF/SLF fibers which terminate at the premotor cortex, while dorsal pathway II includes AF/SLF fibers which terminate in the IFG BA 44 (Friederici, 2011, 2012). For details, see **Table 1**.

BOX | DTI Glossary (adapted from Geva et al., 2011b).

*Diffusion Tensor Imaging* (DTI) – An MRI technique which is sensitive to the microscopic motion of water molecules in a tissue.

Diffusion tensor images are based on measurements of the movement of molecules:

*Isotropic movement* is a completely random movement which occurs in the absence of any restriction. This movement is equal in every direction and it is a characteristic of the movement of water molecules in neuronal cells (gray matter) and the cerebrospinal fluid (CSF).

*Anisotropic movement* is movement which occurs in the presence of physical restriction and is therefore larger in one direction. As axons constrain water molecules to move mainly parallel to the trajectory of the axon, the movement in the white matter is more anisotropic.

*Eigenvector* is the direction of movement of the water molecules (the diffusivity), while *eigenvalue* is the value of the diffusivity along the direction of the associated eigenvector. The *tensor* represents the overall movement of the water molecules, derived by averaging the strength of movement along the x, y, and z axes.

DTI studies commonly report the following parameters:

*Fractional Anisotropy (FA)* – A function of the eigenvalues, normalized to be between 0 (movement is completely unrestricted) and 1 (movement is restricted towards one direction), representing how much the diffusivity values differ between the different directions.

*Axial Diffusivity (AD)* – The value of the main (largest) eigenvalue. Also referred to as *Longitudinal Diffusivity*.

*Radial Diffusivity (RD)* – The average of the two smaller eigenvalues. Also referred to as *Transverse Diffusivity*.

*Mean Diffusivity (MD)* – The average of the three eigenvalues. This value describes the average distance traveled within a specific voxel.

*Apparent Diffusion Coefficient (ADC)* – The diffusion coefficient along a particular direction. In the context of DTI, MD and ADC are often used interchangeably.
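The parameters defined in the DTI Glossary can be illustrated with a minimal Python sketch that derives AD, RD, MD, and FA from the three eigenvalues of a single voxel's diffusion tensor. The function name `dti_scalars` and the example eigenvalues are hypothetical and chosen only for illustration; FA is computed with the standard normalized-variance formula, which the glossary describes in words but does not spell out.

```python
import math

def dti_scalars(l1, l2, l3):
    """Return AD, RD, MD and FA for one voxel from its three tensor eigenvalues."""
    ev = sorted((l1, l2, l3), reverse=True)   # largest eigenvalue first
    ad = ev[0]                                # axial (longitudinal) diffusivity
    rd = (ev[1] + ev[2]) / 2                  # radial (transverse) diffusivity
    md = sum(ev) / 3                          # mean diffusivity
    num = sum((v - md) ** 2 for v in ev)      # spread of eigenvalues around MD
    den = sum(v ** 2 for v in ev)
    # FA = sqrt(3/2) * sqrt(num / den); defined as 0 when all eigenvalues are 0
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return {"AD": ad, "RD": rd, "MD": md, "FA": fa}

# Hypothetical eigenvalues (arbitrary units), not measurements from any study.
isotropic = dti_scalars(1.0, 1.0, 1.0)   # equal diffusion in all directions
fibre = dti_scalars(1.7, 0.3, 0.3)       # diffusion dominated by one direction

print(isotropic["FA"])          # 0.0
print(round(fibre["FA"], 2))    # 0.8
```

Equal eigenvalues give an FA of 0 (isotropic movement), whereas a single dominant eigenvalue pushes FA towards 1, as expected for coherently organized white matter.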

Based on these anatomical definitions, the most likely tracts to support inner speech, within the dorsal language stream, are either the fronto-temporal or fronto-parietal segments. However, note that the exact anatomical end points of the various tracts are not agreed upon (see Martino et al., 2013 for an excellent discussion regarding the differences between various anatomical studies). In addition, in many imaging studies, these tracts are not distinguished, due to the methodological limitations of DTI, and are referred to as simply the dorsal stream or AF/SLF (Friederici, 2009).

FIGURE 1 | Tractography reconstruction of the three segments of the dorsal language pathway: the fronto-temporal (FT) segment/long segment (red); the fronto-parietal (FP) segment/anterior segment (green); and the temporo-parietal (TP) segment/posterior segment (yellow). The figure is adapted from Catani et al. (2005), and is being used with the permission of John Wiley and Sons.

In addition to the dorsal language stream, the human language system is supported by a ventral language stream (Hickok and Poeppel, 2007; Weiller et al., 2009), which mostly runs medially to the temporal lobe. This pathway connects occipital and temporal areas with frontal regions. It includes the inferior fronto-occipital fascicle (IFOF), which connects the occipital lobe, parietal lobe, and the posterior temporal cortex with the frontal lobe. In addition, the inferior longitudinal fascicle (ILF) connects the posterior occipito-temporal region and the temporal pole. Lastly, the uncinate fasciculus (UF) connects the anterior temporal cortex to inferior frontal areas (reviewed in Duffau, 2016).

# PEDIATRIC STUDIES OF THE DORSAL LANGUAGE STREAM

# Anatomical Studies

TABLE 1 | Descriptions of the subcomponents of the dorsal language pathway, according to different studies.

*The description of "x to y" is arbitrary, as anatomical studies are blind to the directionality of the fibers. AF, arcuate fasciculus; AG, angular gyrus; BA, Brodmann area; IFG, inferior frontal gyrus; IP, inferior parietal; ITG, inferior temporal gyrus; MTG, middle temporal gyrus; pMTG, posterior middle temporal gyrus; pSTG, posterior superior temporal gyrus; SLF, superior longitudinal fasciculus; SMG, supramarginal gyrus; STG, superior temporal gyrus.*

The field of developmental cognitive neuroscience has seen a recent increase in interest in the role of the dorsal language stream in both typical development (Tak et al., 2016) and language and speech disorders (Morgan et al., 2016). In a pioneering study, full-term newborns were scanned within the first 3 days of life. DTI images showed that dorsal pathway I, which terminates in the premotor cortex, was already fully present at birth, while dorsal pathway II, which connects to the IFG, was undetectable (Perani et al., 2011). Similarly, in a study of language pathways among 6- to 22-week-old infants, it was shown that all language tracts were detectable at this age (both ventral and dorsal), although the AF showed the highest variability, terminating in the pre-central gyrus in most cases, and not reaching the IFG (Dubois et al., 2016). Among 0- to 54-month-olds, the SLF was found to be the least developed tract in the newborns, when compared to projection, callosal, brainstem, limbic, and other association fibers, and in fact, it could not be delineated before the age of 12 months (Hermoye et al., 2006). A study which included participants spanning the entire age range from neonates to adults showed that the SLF was difficult to identify in neonates and that it was significantly smaller in infants up to the age of 1 year. However, it could easily be identified in late childhood (6–10 years) (Zhang et al., 2007). While data from these studies converge to suggest that the dorsal language stream, or at least its portion which terminates in the IFG, is under-developed at birth, the explanation for this finding varies. Most authors interpret their findings as reflecting a genuine anatomical difference between infants/children and adults (Hermoye et al., 2006; Zhang et al., 2007; Perani et al., 2011). However, Dubois et al. (2016) argue that the difference can be attributed to methodological issues, as studies of infants do not take into account the differences between the dorsal and ventral bundles in adults. Interestingly, post-mortem dissections of fetal human brains at 19–20 weeks gestational age showed that some of the ventral pathway, but not the dorsal one, is already present at this gestational age. In the ventral pathway, the external capsule (which contains the ILF and IFOF) was not clearly visible, but the UF was clearly identified. In healthy neonates, both the ILF and IFOF were identified, though they were not developed enough to reveal their projection to the frontal, temporal, and occipital lobes using DTI (Huang et al., 2006). The SLF was also not visible in the fetus, and it could also not be identified in the neonate. The temporal projection of the SLF was only clearly identifiable in the DTI scans of 5- to 6-year-olds (Huang et al., 2006).
Hence, the finding of an existing ventral, but not dorsal, pathway in the fetus, suggests that the under-developed presentation of the dorsal pathway in DTI studies of infancy and childhood might be a genuine anatomical finding, rather than a methodological artifact. However, as the cause of death of the fetuses in the study by Huang et al. is not reported, and as fetal brains are rarely obtained without damage, these results should be interpreted with caution.

Further studies included school-aged children as well. Brauer et al. (2013) expanded on Perani et al.'s (2011) findings, showing that 7-year-olds already have both dorsal pathways I and II in place, similarly to adults, therefore obtaining very similar results to those obtained by Zhang et al. (2007). However, fractional anisotropy (FA) values, a commonly used DTI parameter (see **Box – DTI Glossary**), were still lower for 7-year-olds, compared to adults. Significant correlations between age and diffusivity parameters were found among cohorts of various age ranges [6- to 17-year-olds, examining the three segments of the AF (Eluvathingal et al., 2007); 4- to 17-year-olds, examining white matter integrity in the area of the AF (Paus et al., 1999); 5- to 18-year-olds, examining the AF (Schmithorst et al., 2002)]. Eluvathingal et al. (2007) distinguish between patterns of maturation based on different diffusivity parameters: The AF fronto-parietal segment showed a significant increase in FA with age, accompanied by significant decreases in mean, transverse, and axial diffusivity, suggesting increases in myelination. The authors suggest that this tract undergoes development mainly at the tested age range (6–17 years of age). The fronto-temporal and temporo-parietal segments of the AF showed significant age-related decreases in mean, transverse, and axial diffusivity measures that were not accompanied by significant increase in FA, which, according to the authors, suggest that much of the tracts' maturation occurred before the age of 6 (Eluvathingal et al., 2007). A more recent DTI study of the maturation of the dorsal language pathway examined typically developing children in five age groups: 0–2, 3–5, 6–8, 9–11, and 12–14 years. It was found that the posterior segment developed first and actually showed an almost complete maturation already in the youngest age groups. 
This was followed by the anterior segment, which showed maturation in the middle age groups (around 6–8 years). Finally, the direct segment was suggested to mature only in the early teen years (Tak et al., 2016). Skeide et al. (2016) examined three age groups, similar to the middle ones of Tak et al. (2016), 3–4, 6–7 and 9–10-year-olds, as well as a group of adults. They showed a gradual and steady increase in FA of the dorsal pathway between the four age groups. While data in these studies suggest that the AF reaches maturation around the early school age years, non-linear relations were not statistically evaluated, and it is therefore difficult to determine at which age the maturation plateaus, signifying the age in which the language tracts reach an adult level of development. In addition, some of these studies did not include a group of adults, for comparing the level of maturation of the white matter tracts.

A few studies directly evaluated the age of maturation of various white matter tracts. Maturation was defined as the age at which diffusivity parameters reach a plateau. A longitudinal study which scanned children (aged 5–17) three times over a period of 3 years found an increase in FA for both the AF and ILF, the latter forming part of the ventral language stream. However, the slopes were not dependent on initial age of testing, suggesting that the rate of change is equivalent across this age range (Yeatman et al., 2012). Studying participants aged 6–30 years, Lebel et al. (2008) suggested that the AF reaches full maturation between the teen years and early 20s. A study of 7- to 68-year-olds found similar results, showing that all three segments of the dorsal language stream (anterior, posterior, and direct) reach full maturation around age 20–30 (Hasan et al., 2010). The authors further suggest that developmental studies should evaluate maturation of anatomical brain structures using non-linear relations.
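To make the notion of a maturational plateau concrete, the sketch below assumes a saturating exponential relation between age and FA. This functional form, the function names, and the parameter values (including a time constant of 8 years) are illustrative assumptions only; they are not the models or estimates reported in the studies cited above. Under this assumption, the age at which a given fraction of the total change is complete follows directly from the time constant, which is something a linear fit cannot provide.

```python
import math

def exp_maturation(age, asymptote, amplitude, tau):
    """Assumed model: FA(age) = asymptote + amplitude * exp(-age / tau).
    A negative amplitude gives FA that rises with age towards the asymptote."""
    return asymptote + amplitude * math.exp(-age / tau)

def age_at_fraction(fraction, tau):
    """Age at which `fraction` of the total change from age 0 to the asymptote
    is complete: solves 1 - exp(-age / tau) = fraction for age."""
    return -tau * math.log(1.0 - fraction)

# Hypothetical parameters, chosen only to illustrate the calculation.
ASYMPTOTE, AMPLITUDE, TAU = 0.50, -0.15, 8.0

age_90 = age_at_fraction(0.90, TAU)                      # about 18.4 years
fa_90 = exp_maturation(age_90, ASYMPTOTE, AMPLITUDE, TAU)

print(round(age_90, 1))
print(round(fa_90, 3))
```

With these assumed values, 90% of the change in FA is complete at about 18 years, so the estimated age of maturation depends on the fitted time constant rather than on the age range that happened to be sampled.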

In summary, there is agreement in the literature that the ventral language pathway is already detectable at birth (Perani et al., 2011; Tak et al., 2016) and matures faster than the dorsal language pathway (Brauer et al., 2013; Dubois et al., 2016; Tak et al., 2016). In addition, by late childhood, children's dorsal pathway has a similar anatomical structure to that of adults, although full maturation (as reflected in diffusivity parameters) is only achieved in the late teens or even early 20s (see **Figure 2**).

# The Functional Role of the Dorsal Pathway During Development

We have argued that the dorsal language stream supports the development and maintenance of inner speech. Much research has been done on the role of the dorsal language stream in language processing. Here, we ask whether some of the more well-established functions of this pathway have overlaps with inner speech and try to establish how it can support various and potentially distinct functions at the same time.

Two influential models of language development and processing assign specific functions to the dorsal language stream. The first describes language processing in general, suggesting that acoustic speech signals which are processed in posterior brain regions are transferred through the dorsal language stream to the frontal lobe, where they are converted into articulatory representations (Hickok and Poeppel, 2007). This process is essential for language acquisition, as infants and children learn to produce heard words (Hickok and Poeppel, 2007). Later in adulthood, this processing stream can be used for repetition (Saur et al., 2008; Kümmerer et al., 2013). However, based on the anatomical findings showing that the dorsal stream is under-developed in early childhood, developmental studies of the two language pathways suggested that, in early childhood, language development is actually dependent on the ventral pathway, not the dorsal one, while the dorsal pathway only subserves higher language functions which develop later (Brauer et al., 2013; Skeide et al., 2016).

Reconciling this apparent contradiction, Friederici (2009) suggested that language acquisition is dependent on dorsal pathway I, which terminates in the premotor cortex and develops early, while higher language functions depend on dorsal pathway II, which develops later and terminates more anteriorly in the IFG. This suggestion is supported by studies of adults learning an artificial language. In one study, a significant correlation was found between performance on an artificial language learning task and the integrity of the left long segment, which connects auditory and motor regions. No correlation was found between language learning and the integrity of any of the other language tracts examined (the anterior segment, the posterior segment, or the IFOF) (Lopez-Barroso et al., 2013). Another study demonstrated that performance on an artificial language learning task was reduced when participants' subvocal rehearsal was blocked (using articulatory suppression), compared to a condition of no suppression, therefore allowing rehearsal. Additionally, task performance correlated with the integrity of the fibers running through the extreme capsule/external capsule, only when subvocal rehearsal was suppressed. The authors suggest that in adults, language learning without subvocal rehearsal is associated with the ventral pathway (Lopez-Barroso et al., 2011). Together, these studies suggest that the association between adult language learning and the dorsal pathway is mediated by inner speech, a suggestion that supports our hypothesis.

A second influential and extensively studied model describes the process of adult reading. According to the Dual-Route model (Paap et al., 1987; Paap and Noel, 1991; Coltheart et al., 1993; Rastle and Coltheart, 2000), word reading can be achieved through one of two routes. The first is a lexical route, dedicated to reading frequent regular, as well as irregular, words by means of whole word recognition. The second is the sublexical route, which supports the reading of new words and non-words, by utilizing direct grapheme to phoneme translation (but see connectionist models, for example Seidenberg and McClelland, 1989). It has been suggested that the lexical and sublexical routes are supported by the ventral and dorsal systems, respectively (Schlaggar and McCandliss, 2007). However, the dorsal portion relevant for reading was found to be the temporo-parietal segment (Pugh et al., 2000; Schlaggar and McCandliss, 2007; Vandermosten et al., 2012). Later studies have extended this model, adding the frontal segments (fronto-temporal and fronto-parietal) (Vanderauwera et al., 2017), showing their association with phonological awareness (reviewed in Vandermosten et al., 2012). Among a group of children aged 7–11, higher phonological awareness (the ability to parse the word into syllables and phonemes and manipulate these phonemes to make up new words) was associated with lower FA in the left AF, over and above age. The correlation was specific to the tract and task (compared with word reading, verbal short-term memory, and repetition tasks). The negative correlation is interpreted as experience-based successful pruning (Yeatman et al., 2011). 
A longitudinal study of 5-year-old pre-readers found similar results: children were tested at the start and end of their last nursery year, and it was found that better phonological awareness (end phoneme and rhyme identification tasks) was a significant predictor of FA in the left dorsal fronto-temporal segment, over and above naming and letter identification. This correlation was not found for the temporoparietal segment (Vandermosten et al., 2015). Paralleling the early internalization of overt speech, studies have shown that during reading acquisition, children slowly switch from overt to covert reading (Kragler, 1995; Prior and Welling, 2001). However, studies have yet to test whether this transition is associated with anatomical developments of the ventral or the dorsal routes of language.

Studies of word learning and repetition emphasize a specific functional directionality of the dorsal language pathway, in which processing of input phonological data in posterior regions precedes retrieval of articulatory information in frontal regions, therefore suggesting that information propagates from posterior temporal to anterior frontal regions (Friederici, 2009; Agosta et al., 2010). Direct cortical stimulation of posterior language areas (SMG, middle and posterior STG and the adjacent middle temporal gyrus; MTG) of awake adults resulted in evoked potentials in anterior language areas (Broca's area or adjacent regions), supporting the idea of processing progressing from posterior to anterior regions. However, in addition, stimulation of anterior regions also resulted in evoked potentials in all posterior regions tested (Matsumoto et al., 2004). A similar study using direct cortical stimulation in adult patients also showed bidirectional connectivity between pSTG and IFG (David et al., 2013), further suggesting that the connection is direct, and also providing evidence that propagation of information is faster from posterior to anterior regions, compared to the other direction. Koubeissi et al. (2012) also highlighted the bidirectionality of the connection, by showing that some patients have evoked response in posterior regions after stimulation of anterior regions, while others show the opposite response. Lastly, a neuro-computational model of the dorsal language stream also suggested a bidirectional transfer of information in this route (Schomers et al., 2017).

In summary, adult patient studies show that information propagates along both anterior and posterior directions within the human dorsal language pathway, and hence, one should be cautious in assuming posterior-to-anterior direction. Most developmental studies have so far focused on those language functions which are supported by unidirectional propagation of information in the dorsal route from posterior to anterior parts. We suggest that some reciprocal fibers in this pathway which send information in the other direction might be essential for inner speech development.

# The Dorsal Language Stream in Atypical Development

Some studies suggest a reduced use of inner speech among individuals with autistic spectrum disorder (ASD) (reviewed in Alderson-Day and Fernyhough, 2015). The reduction in inner speech use in some, but not all tasks, might be explained by the difference between dialogic and monologic thinking, with the former having its roots in communication with others, and the latter rooted in communication with the self (Fernyhough, 1996). Accordingly, it is expected that dialogic inner speech will be more affected among individuals with ASD (Alderson-Day and Fernyhough, 2015), a hypothesis that is confirmed in one study (Williams et al., 2012). A comprehensive review of DTI studies of ASD showed that people with ASD have white matter abnormalities across the brain, including in the AF/SLF, but not exclusively (Travers et al., 2012). In addition, correlations between diffusivity parameters and behavioral measurements have been inconsistent (Travers et al., 2012). A single study suggested that inner speech develops more slowly among children with specific language impairments (SLI), compared to typically developing children (Lidstone et al., 2012), but no neural correlates were studied. To the best of our knowledge, no other studies have examined inner speech in atypical pediatric populations. In cases where inner speech has been studied in atypical development, findings regarding white matter abnormalities are inconsistent, and associations with behavioral measurements vary greatly. However, this area of research offers an opportunity to further our understanding of the normal and abnormal development of inner speech and its neural correlates. We suggest that future studies of inner speech developmental abnormalities also examine whether behavioral performance correlates with dorsal stream anatomical integrity.

# DORSAL LANGUAGE STREAM – INNER SPEECH HYPOTHESIS

By combining findings from different disciplines, we have presented evidence that the maturation of the dorsal language stream, especially the fronto-temporal and fronto-parietal segments, during childhood occurs in parallel with the development of inner speech. We therefore suggest that there is a link between these neuro-anatomical and psychological developments. This suggestion is based on findings from three separate lines of research. First, inner speech emerges around the early school years; second, the FT and FP segments of the AF/SLF mature around the same time; and third, adult studies suggest the involvement of those dorsal pathway segments in inner speech processing.

In addition, there is also more specific experimental evidence to support this hypothesis: firstly, studies suggested that language learning in adults is mediated by subvocal rehearsal and is correlated with the integrity of the dorsal tracts (Lopez-Barroso et al., 2011, 2013); and secondly, children's performance on phonological awareness tasks, often requiring inner speech, is correlated with dorsal pathway development (Yeatman et al., 2011; Vandermosten et al., 2012, 2015).

Evidence for the parallel emergence of the neural pathway of the dorsal stream and the psychological process of inner speech should not, however, be interpreted uncritically as evidence for causation in any particular direction. The development of language is, of course, not solely influenced by maturation of brain structures. Large variability in both brain maturation and language abilities among individuals is partly due to environmental exposure (Kidd et al., 2018). It is well established that environment induces brain changes, especially during childhood (Sale, 2018). It is also known that induced white matter changes can be documented in animals *in vivo* (Sale, 2018) and in humans using DTI (Scholz et al., 2009). For example, in the area of language development, it has been shown that following a 100-h training program, poor readers showed changes in diffusivity parameters, suggesting increased myelination. Moreover, these changes occurred in the same frontal region where the children with poor reading ability showed lower FA than children with normal reading abilities. Lastly, changes were specific to the group which underwent the remediation program (Keller and Just, 2009). Together, these studies suggest that observed changes in brain maturation can be environmentally induced.

It would therefore be a mistake to assume that the emergence of inner speech is only developmentally constrained by dorsal pathway maturation. Following Vygotsky, Luria argued for bidirectional causation between biological maturation and sociocultural experience, fitting with the view that the internalization of social exchanges creates a new functional system of inner speech (Luria, 1965; Fernyhough, 2010). This view is in keeping with similar views of developmental interplay between interaction with the environment and biological maturation in the human brain (Gómez-Robles et al., 2015).

Lastly, we do not intend to minimize the role of the ventral language stream in inner speech development. Tasks requiring internal content analysis, as is the case in most occurrences of natural inner speech, probably rely on an interaction between the dorsal and the ventral streams (Rijntjes et al., 2012). However, as the ventral stream is already highly developed at birth, it is the maturation of the dorsal stream that presents the main constraint on inner speech development during childhood. Further research on the interplay between the ventral and the dorsal language streams may pay dividends for our understanding of functionally relevant distinctions between forms of inner speech, such as the distinction that can be made between subvocal rehearsal and planning (Alderson-Day and Fernyhough, 2015).

# A COMMENT ON INNER SPEECH AND THE ORIGIN OF LANGUAGE

Understanding the neurodevelopment of inner speech could be significant for current discussions about the origin of language in human evolution. There are contentious debates on whether language evolved as a mechanism for symbolic thought (using inner speech) (Everaert et al., 2015, 2017) or as a means of communication (Pinker and Jackendoff, 2005; Corballis, 2017). Jackendoff (1996) and others (Rijntjes et al., 2012) have discussed the importance of inner speech in human evolution, suggesting that the development of inner speech supported more complex and abstract thought. However, Pinker and Jackendoff (2005) emphasize that, in their view, language evolved initially as a means of communication, and that inner speech is a "by-product": a later evolutionary development which is a result of internalizing external speech, which in turn supports more complex thinking. Here, we extend this hypothesis to suggest that this evolutionary development is related specifically to anatomical changes in the dorsal language stream.

Comparative studies have found some substantial differences between dorsal stream tracts in humans, monkeys, and apes, suggesting an evolutionary change affecting these tracts. The human SLF III (the fronto-parietal segment) is similar to that of rhesus monkeys (Thiebaut de Schotten et al., 2012) and macaques (Croxson et al., 2005). The long segment of the AF, on the other hand, shows inter-species variations. In macaques (Rilling et al., 2008) and rhesus monkeys (Petrides and Pandya, 2009; Thiebaut de Schotten et al., 2012), AF connectivity in both anterior and posterior sites is limited. In these monkey species, the AF does not reach the middle or inferior temporal gyri in the posterior end and has less widespread connectivity in the anterior end. In chimpanzees, both parietal and frontal connectivities are wider than in the macaque; however, the AF is still not as developed as in humans (Rilling et al., 2008).

Additionally, in the macaque (Rilling et al., 2008) and rhesus monkey (Thiebaut de Schotten et al., 2012), the ventral pathway is substantially more developed than the dorsal pathway, as is the case in human infants (see section "Anatomical Studies"). The monkey ventral pathway resembles the human one in its anatomy (Croxson et al., 2005; Thiebaut de Schotten et al., 2012). In chimpanzees, the opposite is found: the dorsal pathway is more developed than the ventral one, as is the case in adult humans (Rilling et al., 2008).

Using neurocomputational modeling, Schomers et al. (2017) demonstrated that inter-species anatomical differences along the dorsal pathway are associated with functional differences. They suggest that compared with the monkey, the human anatomy of the dorsal pathway gives rise to stronger and longer-lasting neural activations, as well as parallel, rather than serial, activation (Schomers et al., 2017). They further suggest that the activity in the human model but not in the monkey model "can be viewed as reflecting (subvocal) articulation" (Schomers et al., 2017, p. 3051).

In summary, comparative studies show that monkeys and even chimpanzees have a substantially less developed AF, compared with humans. It has already been suggested that changes in the dorsal tracts were the key element in human language evolution (Aboitiz and García, 2009; Friederici, 2009; Aboitiz, 2012). Aboitiz and colleagues further argue that these changes gave rise to inner speech and its associated function: phonological working memory (Aboitiz and García, 2009; Aboitiz, 2012). If early humans had an under-developed AF, and if a highly developed AF is the neural substrate for inner speech production (as we argue here), then one might suggest that early humans had no, or at least limited, inner speech. In the absence of inner speech, language would have been initially used as a means of communication (Pinker and Jackendoff, 2005; Corballis, 2017) rather than as a mechanism for symbolic thought (Everaert et al., 2015, 2017).

Another line of evidence connecting inner speech with language evolution comes from genetic studies. The FOXP2 gene has long been associated with speech and language in humans (Lai et al., 2001; Vargha-Khadem et al., 2005), and later, it has been argued that both FOXP2 and its target genes have undergone adaptive protein evolution during human evolution (Enard et al., 2002; Zhang et al., 2002). The FOXP2 gene was first identified in the KE family, whose affected members have a mutation in this gene, and they suffer from speech and language deficits (Lai et al., 2001). A later study has shown that those affected individuals suffer from phonological loop impairments, even when the task requires only inner speech, with no overt recitation (Schulze et al., 2018). Others have shown an association between FOXP2 mutations and auditory hallucinations in schizophrenia (Sanjuan et al., 2006; Tolosa et al., 2010). Building on these findings, Crespi et al. (2017) have studied more than 800 healthy individuals, finding an association between a specific variant of the gene and inner speech scores (based on self-rating). Together, these studies link inner speech to one of the main genes implicated in the evolution of language, putting inner speech as a main component in the evolution of language as a whole (Crespi et al., 2017).

Lastly, we do not argue that the ontogeny (of inner speech) recapitulates its phylogeny. That is, the anatomical changes in the language pathways that occur during embryonic development and early childhood are somewhat different from those that came about in the course of evolution. The bidirectional causal view that we have espoused here is in keeping with the finding that human infants are born without a fully matured dorsal pathway. It is the development of this neural system, in parallel with human infants' socially and linguistically patterned experience, that makes the emergence of inner speech possible.

# CONCLUSION

The anatomy of the arcuate fasciculus was described more than 200 years ago, and its role in language processing has been discussed extensively (Catani and Mesulam, 2008). Together with subcomponents of the SLF, it forms the dorsal language stream. Neurodevelopmental studies have shown that humans are born with a dorsal language stream which is not fully developed and that it slowly matures throughout early childhood. Based on the temporal co-occurrence of dorsal stream maturation and the emergence of inner speech in children, as well as findings from studies of language development and adult language processing, we have suggested that the maturation of the dorsal language stream is closely linked to inner speech development. Studies of the neural mechanisms associated with inner speech in children are scarce. However, recent methodological advances in the study of neurodevelopment (Satterthwaite et al., 2014) and brain networks (Bassett and Bullmore, 2017) on the one hand, and inner speech (Geva and Warburton, 2019) on the other hand, can all contribute to our ability to make progress in this area. By linking findings from different disciplines, studies on the neural mechanisms of inner speech development can further our understanding of the role of inner speech and bridge the gap between research into language, cognition, development, and evolution.

# AUTHOR CONTRIBUTIONS

SG initiated the article and conducted the literature review. Both authors drafted parts of the manuscript and critically revised the work for intellectual content. Both authors approved the submitted version.

# FUNDING

CF is supported by the Wellcome Trust grant WT108720.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Geva and Fernyhough. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The ConDialInt Model: Condensation, Dialogality, and Intentionality Dimensions of Inner Speech Within a Hierarchical Predictive Control Framework

Romain Grandchamp<sup>1</sup>, Lucile Rapin<sup>1</sup>, Marcela Perrone-Bertolotti<sup>1</sup>, Cédric Pichat<sup>1</sup>, Célise Haldin<sup>1</sup>, Emilie Cousin<sup>1</sup>, Jean-Philippe Lachaux<sup>2</sup>, Marion Dohen<sup>3</sup>, Pascal Perrier<sup>3</sup>, Maëva Garnier<sup>3</sup>, Monica Baciu<sup>1</sup> and Hélène Lœvenbruck<sup>1</sup>\*

<sup>1</sup> Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LPNC, Grenoble, France, <sup>2</sup> INSERM U1028, CNRS UMR5292, Brain Dynamics and Cognition Team, Lyon Neurosciences Research Center, Bron, France, <sup>3</sup> Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, Grenoble, France

#### Edited by:

Thomas M. Brinthaupt, Middle Tennessee State University, United States

#### Reviewed by:

Charles Fernyhough, Durham University, United Kingdom; Sharon Geva, University College London, United Kingdom

\*Correspondence:

Hélène Lœvenbruck Helene.Loevenbruck@univ-grenoble-alpes.fr

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 16 February 2019; Accepted: 19 August 2019; Published: 18 September 2019

#### Citation:

Grandchamp R, Rapin L, Perrone-Bertolotti M, Pichat C, Haldin C, Cousin E, Lachaux J-P, Dohen M, Perrier P, Garnier M, Baciu M and Lœvenbruck H (2019) The ConDialInt Model: Condensation, Dialogality, and Intentionality Dimensions of Inner Speech Within a Hierarchical Predictive Control Framework. Front. Psychol. 10:2019. doi: 10.3389/fpsyg.2019.02019

Inner speech has been shown to vary in form along several dimensions. Along condensation, condensed inner speech forms have been described, that are supposed to be deprived of acoustic, phonological and even syntactic qualities. Expanded forms, on the other extreme, display articulatory and auditory properties. Along dialogality, inner speech can be monologal, when we engage in internal soliloquy, or dialogal, when we recall past conversations or imagine future dialogs involving our own voice as well as that of others addressing us. Along intentionality, it can be intentional (when we deliberately rehearse material in short-term memory) or it can arise unintentionally (during mind wandering). We introduce the ConDialInt model, a neurocognitive predictive control model of inner speech that accounts for its varieties along these three dimensions. ConDialInt spells out the condensation dimension by including inhibitory control at the conceptualization, formulation or articulatory planning stage. It accounts for dialogality, by assuming internal model adaptations and by speculating on neural processes underlying perspective switching. It explains the differences between intentional and spontaneous varieties in terms of monitoring. We present an fMRI study in which we probed varieties of inner speech along dialogality and intentionality, to examine the validity of the neuroanatomical correlates posited in ConDialInt. Condensation was also informally tackled.
Our data support the hypothesis that expanded inner speech recruits speech production processes down to articulatory planning, resulting in a predicted signal, the inner voice, with auditory qualities. Along dialogality, covertly using an avatar's voice resulted in the activation of right hemisphere homologs of the regions involved in internal own-voice soliloquy and in reduced cerebellar activation, consistent with internal model adaptation. Switching from first-person to third-person perspective resulted in activations in precuneus and parietal lobules. Along intentionality, compared with intentional inner speech, mind wandering with inner speech episodes was associated with greater bilateral inferior frontal activation and decreased activation in left temporal regions. This is consistent with the reported subjective evanescence and presumably reflects condensation processes. Our results provide neuroanatomical evidence compatible with predictive control and in favor of the assumptions made in the ConDialInt model.

Keywords: inner speech, auditory verbal imagery, mind wandering, condensation, dialogality, intentionality, fMRI, predictive control

# INTRODUCTION

# Three Dimensions of Inner Speech

Inner language can be defined as the subjective experience of verbalization in the absence of overt articulation or sign (Alderson-Day and Fernyhough, 2015). It can be produced independently of overt speech. It contributes to enriching and shaping our inner existence and is instrumental in the maintenance of a coherent self-narrative (Perrone-Bertolotti et al., 2014; Lœvenbruck, 2018). Given the scarcity of data on inner sign language production (but see e.g., Max, 1937; McGuire et al., 1997; MacSweeney et al., 2008 and references in Lœvenbruck et al., 2018) the present article is restricted to the description of inner speech, although most of the theoretical principles we endorse presumably also apply to inner sign.

The cognitive functions (or rather uses) of inner speech have been investigated by means of introspective questionnaires and behavioral methods, in typical and atypical populations (for reviews, see e.g., Perrone-Bertolotti et al., 2014; Alderson-Day and Fernyhough, 2015; Martínez-Manrique and Vicente, 2015; Alderson-Day et al., 2018; and the volume edited by Langland-Hassan and Vicente, 2018). Previous works suggest that inner speech plays an important role in many cognitive operations, including working memory (Baddeley, 1992; Marvel and Desmond, 2012), autobiographical and prospective memory (Meacham, 1979; Conway, 2005; Morin and Hamper, 2012; Pavlenko, 2014), orientation and spatial reasoning (Loewenstein and Gentner, 2005), mental arithmetic (Sokolov, 1972), executive control (Emerson and Miyake, 2003; Laurent et al., 2016), complex problem solving (Sokolov, 1972; Baldo et al., 2005, 2015), and theory of mind judgment (Newton and de Villiers, 2007). It has also been considered that inner speech serves metacognitive functions. By making our thoughts auditorily salient (in expanded varieties of covert speech, see below), inner speaking makes us aware of our thinking processes and allows us to focus our attention on our thoughts and activities. This metacognitive ability in turn contributes to our taking perspectives on self and others and to generating self-knowledge. It has thus been suggested that inner speech fosters metacognition (Vygotsky, 1934/1986; Carruthers, 2002; Clark, 2002; Martínez-Manrique and Vicente, 2010; Jackendoff, 2011; Langland-Hassan et al., 2017), self-regulation and self-motivation (Hardy, 2006; Clowes, 2007), and self-awareness (Peirce, 1934; Vygotsky, 1934/1986; Ricœur, 1990; Dennett, 1991; Merleau-Ponty, 1948/2002; Wiley, 2006b; Morin et al., 2011; Wilkinson and Fernyhough, 2017). This diversity of uses comes with a plurality of forms.
It has been suggested that inner speech varies along several dimensions (McCarthy-Jones and Fernyhough, 2011). This article seeks to provide an integrative description of these dimensions, which accounts for the occurrence of various inner speech forms.

A first dimension along which inner speech can vary is condensation. Overt speech production is classically viewed as involving three main stages: conceptualization, formulation and articulation (e.g., Dell, 1986, 2013; Bock, 1987; Kempen and Hoenkamp, 1987; Levelt, 1989). Conceptual preparation consists in planning an utterance's meaning and purpose. The preverbal message that results can be described as highly condensed in form. Formulation translates the condensed preverbal message delivered by the conceptualizer into a linguistic structure. Formulation includes prosodic, syntactic and morpho-phonological encoding. It ends with the sketching of a phonetic goal (or plan), expressed in a less condensed (semi-expanded) form. The articulation stage follows, consisting of articulatory planning, then execution, with full elaboration and expansion. Covert speech has been conceived of as truncated overt speech, but the stage at which the production process is interrupted is still debated. According to some scholars, inner language predominantly pertains to semantics and is unconcerned with phonological, phonetic, articulatory or auditory representations (see e.g., Vygotsky, 1934/1986; MacKay, 1992; Oppenheim and Dell, 2008, 2010). Vygotsky, for instance, claims that syntax in inner speech is maximally simplified and can be elliptical, with the omission of words and an extreme condensation of meaning. In his view, inner speech is highly predicated, in the sense that only the necessary information is supplied. In line with Vygotsky's view that inner speech precedes word-level formulation, Knobloch (1984, p. 230, cited by Friedrich, 2001) posits that inner language is the preliminary form of all overt language utterances. It is the mechanism by which quasi-linguistic material is supplied to semantico-syntactic processes, in a "condensed, compact and indicative form."
In this view, inner speech can therefore be conceived of as the conceptual message, cast in a pre-linguistic compact form, before formulation and articulatory planning take place. Bergounioux (2001, p. 120) likewise states that inner speech generally employs asyndeton (the omission of coordinating conjunctions), anaphora (the use of expressions whose interpretations depend on the context) and predication (the use of expressions in which only the predicate, not the subject, is formulated). In the same vein, Wiley (2006a) argues that the "syntax of inner speech is abbreviated and simplified" (p. 321) and that its semantics is also condensed, with fewer words used relative to overt language, given that key words may be used, that carry "large numbers of words or their possible meanings" (p. 323). These introspective observations of condensation are supported by several psycholinguistic experiments on the relative rates of overt and covert speech (e.g., Korba, 1990; but see Netsell et al., 2016) or on the different biases exhibited by speech slips in overt and covert modes (Oppenheim and Dell, 2008, 2010; but see Corley et al., 2011). These empirical findings suggest that, compared with overt public speech, inner language is sketchy and can be viewed as abbreviated or condensed, at the syntactic, lexical, and even phonological levels. Such condensation implies that the formulation and articulation stages may be suppressed or limited in inner language.

An alternative view is that inner speech is a simulation of overt speech production, encompassing all its stages, only interrupted prior to motor execution. In this view, inner speech entails phonological and articulatory specification and is associated with the subjective experience of a voice percept (see e.g., Postma and Noordanus, 1996; Corley et al., 2011; Scott et al., 2013). Several empirical arguments for the proposition that inner speech involves multisensory representations, together with the recruitment of the speech motor system, are provided in Lœvenbruck et al. (2018). These include psycholinguistic data, such as the verbal transformation effect (Reisberg et al., 1989; Smith et al., 1995; Sato et al., 2004) as well as electromyographical findings (McGuigan and Dollins, 1989; Nalborczyk et al., 2017) and neuroimaging data (Lœvenbruck et al., 2005; Perrone-Bertolotti et al., 2012; Yao et al., 2012; Vercueil and Perrone-Bertolotti, 2013; Kell et al., 2017). These data, in turn, suggest that inner speech may well possess many of the properties of overt speech, including its articulatory specification.

These two views can be reconciled if various degrees of unfolding of inner speech are considered. Building on Vygotsky's view of inner speech as the outcome of a developmental process, Fernyhough (2004, see also Geva et al., 2011; Alderson-Day and Fernyhough, 2015) has suggested that inner speech varies between two extremes. The first one, which he calls "expanded inner speech," is claimed to correspond to an early developmental stage of inner speech, which (according to Vygotsky, 1934/1986) is an internalization of overt dialog and which includes turn-taking qualities as well as syntactic, lexical and phonological properties. The other extreme, referred to as "condensed inner speech," is argued to correspond to Vygotsky's (1934/1986) description of the latest developmental form of inner speech, which has lost most of the acoustic and structural qualities of overt speech. Fernyhough (2004) has suggested that inner speech varies with cognitive demands and emotional conditions between these two extreme forms. A similar position is taken by Vicente and Martínez-Manrique (2016), who conceive of unsymbolized thinking (as described by Hurlburt et al., 2013) as the most condensed form of inner speech and as in continuity with expanded forms of inner speech. Therefore, the two views of inner speech (abbreviation vs. simulation) can be construed as descriptions of two opposite poles on the condensation dimension. The fully condensed form only involves the highest linguistic level (semantics), and has lost most of the acoustic, phonological and even syntactic qualities of overt speech. Expanded inner speech, on the other hand, presumably engages all linguistic levels down to articulatory planning and the perception of an inner voice. It retains many of the phonological and phonetic properties of overt speech. Between the fully condensed form (preverbal message) and the expanded articulation-ready form, it can be assumed that various semi-condensed forms may exist, depending on the level at which the speech production process is truncated.

A second dimension is dialogality. As argued by Fernyhough (2004) or Jones and Fernyhough (2007a), inner speech may be considered as "irreducibly dialogic," in that it results from a gradual process of internalization of dialogs, in which differing perspectives on the world are held and self-regulated (but see Gregory, 2017 for a slightly different view). In the Vygotskian developmental approach taken by Fernyhough, a child's first utterances are set within external dialogs with their caregivers. Later in development, the utterances remain dialogic, with the child overtly producing both questions and answers, in an egocentric fashion (private speech, speech directed toward the self). In the last developmental stage, these dialogs become fully internalized into inner speech. Yet, even though self-directed speech may become fully internalized, Fernyhough (2004) claims that it retains the dialogic character of overt dialog, with the ability to hold differing attitudes or views on reality. In French pragmatics, a distinction is made between dialogal discourse, in which two distinct speakers are involved in an interpersonal way, and dialogic discourse, where two points of view are confronted (for the distinction between dialogic and dialogal, see Roulet, 1984; Bres, 2005; Roulet and Green, 2006). Dialogal discourse occurs in a communicative interaction whereas dialogic discourse occurs in a reflexive argumentation. An overt discourse can be "monologal dialogic," when it is uttered by one speaker who asserts, refutes, and questions. In other words, it can be an argumented soliloquy. A discourse can also be "dialogal monologic," when two speakers convey a single view, with no alternative. It can then be described as a unitary conversation (Maingueneau, 2016). Although it may be considered that inner speech is dialogic in content, since multiple perspectives can be entertained internally, we claim that it can be either monologal (soliloquial) or dialogal in form.
Monologal inner speech occurs when we engage in internal soliloquy. In monologal situations, we can use our own voice or we can covertly imitate someone speaking, which means we can produce internal soliloquy in another person's voice; yet we primarily are the speaker (although obviously also the listener), and only one voice is controlled and monitored. Dialogal inner speech occurs when we imagine hearing someone, which is often referred to as auditory verbal imagery (Shergill et al., 2001). In dialogal situations, when we imagine someone talking to us, with their own voice, we primarily are the addressee (although perhaps also the speaker). This happens for instance when we recollect past dialogs or when we practice future conversations. Dialogal inner speech involves the representation and monitoring of our own voice as well as those of other people. It also sometimes requires the ability to entertain differing perspectives (Fernyhough, 2004; Jones and Fernyhough, 2007a). Therefore, we claim that inner speech can vary between two extremes: internal monolog or soliloquy – i.e., inner speaking using one's own voice ("Self") – and internal dialog, which includes inner speaking and imagining others speaking with their voices ("Self and Other"). Imitative soliloquy, or monolog with another voice as one's own, can be conceived of as lying between these two extremes. Our model seeks to account for these three distinct situations: inner speaking as self, inner speaking as modified self, inner speaking as self and other.

A third dimension is intentionality. We sometimes deliberately engage in inner speech (when we rehearse material in short-term memory), which can be called willful or intentional inner speech. Other times, we find ourselves unintentionally using inner language, which has been called verbal mind wandering (Perrone-Bertolotti et al., 2014) or spontaneously occurring inner speech (Hurlburt et al., 2016). Verbal mind wandering has been described as evanescent, fading (Egger, 1881; Saint-Paul, 1892; Hurlburt, 2011; Smadja, 2018) and its auditory qualities are often reported as fainter than those of intentional inner speech (Lœvenbruck et al., 2018).

As depicted in **Figure 1**, inner speech can therefore vary along condensation, dialogality and intentionality dimensions. It can be assumed that the expanded forms most frequently arise during intentional inner speech (verbal mind wandering is often reported as fading and fleeting), but this is debatable, as unintentional varieties with expanded, audible, forms have been reported (Hurlburt, 2011).

# Monitoring of Multidimensional Inner Speech Varieties

**Figure 1.** … perception of an inner voice (bottom box). On the dialogality dimension, inner speech can vary between two extremes: internal monolog or soliloquy with own voice ("Self") and internal dialog, which includes inner speaking and imagining others speaking with their voices ("Self and Other"). Monolog with another voice as one's own lies in between these two extremes. On the horizontal intentionality axis, inner speech can vary between verbal mind wandering, on the left, and intentional inner speech, on the right.

The question of monitoring during inner speech is still an open one. Overt language production relies on verbal self-monitoring, a mechanism which allows us to control and regulate our own language productions. We can detect errors or disruptions from our initial language goals, and even correct for these errors online, sometimes even before articulation takes place (Levelt, 1983; Postma, 2000; Huettig and Hartsuiker, 2010). In many psycholinguistic models of overt speech production (e.g., Laver, 1980; Levelt, 1989), errors are detected by monitoring and parsing the phonetic plan, also called "inner speech," prior to articulation. In our view, as described above, inner speech production is embedded in overt speech production. It engages speech production mechanisms, which can be interrupted at different stages, according to the degree of condensation. The mechanisms by which errors can be anticipated online during overt speech production are therefore engaged during inner speech production. This implies that errors in inner speech can be detected using these mechanisms. Introspective accounts suggest indeed that inner speech itself can be monitored (Bergounioux, 2001). Evidence for inner speech monitoring can be found in psycholinguistic data. Studies of inner recitation of tongue-twisters show that speech errors can be detected, even in a covert mode (e.g., Dell and Repka, 1992; Nooteboom, 2005; Oppenheim and Dell, 2008, 2010; Corley et al., 2011). The Verbal Transformation Effect (VTE) refers to the perceptual phenomenon in which listeners report hearing a new speech percept when an ambiguous stimulus is repeated rapidly (Warren, 1961). It has been shown to also occur in a covert mode (Reisberg et al., 1989; Smith et al., 1995; Sato et al., 2004). These studies suggest that inner speech alterations can be monitored, at least when participants are asked to do so. The level at which inner slips are detected is debated, however. Tongue-twister inner recitation studies suggest that errors are detected at the phonological (formulation) level. Oppenheim and Dell (2008, 2010), for instance, observed a lexical bias, which reveals that phonological representations are monitored. They found that the errors reported by the participants, when covertly repeating tongue-twisters, tend to produce more words than non-words ("reef" replaced by "leaf" is more likely than "wreath" replaced by "leath"). In overt speech, in addition to the lexical bias, a phonemic similarity effect is observed, i.e., a tendency for slips to involve similarly articulated phonemes ("reef" slips more often to "leaf," with /r/ and /l/ sharing voicing and approximant features, than to "beef," with /r/ and /b/ only sharing voicing). This effect relies on subphonemic, articulatory representations. The covert speech errors reported by the participants in Oppenheim and Dell's experiments do not exhibit this effect.
These findings therefore suggest that monitoring for errors occurs at the formulation stage, not at the articulatory planning stage. Corley et al. (2011), however, did observe a phonemic similarity effect in the errors reported by the participants in their own tongue-twister recitation experiment. This suggests that inner slips could in fact be detected at the articulation planning level. In addition, research on covert VTE has indicated that the effect is disrupted during auditory interference, which suggests that auditory processes are engaged during the search for VTE (Smith et al., 1995). Altogether these studies suggest that intentional inner speech monitoring can at least take place at the lower two linguistic levels, i.e., formulation and articulatory planning. Beyond these levels, it is still an open question whether inner speech monitoring may occur at the conceptualization level. Studies of self-repairs in spontaneous overt speech production show that speakers do monitor the intended pre-verbal message for appropriateness (e.g., Levelt, 1983; Blackmer and Mitton, 1991). In the overt speaking mode, monitoring seems therefore to occur during conceptualization. In children's private speech, which, as mentioned above, has been argued to be a precursor to inner speech, self-repairs are also present at the conceptualization level, as shown by occurrences of re-wording or amending of utterances (e.g., Manfra et al., 2016). Consequently, the feedback arrows in **Figure 1** represent the self-editing processes that may take place at all levels during intentional inner speech, including conceptualization. However, this monitoring may be less stringent than the one that operates in the overt mode. 
As mentioned above, Egger (1881), Vygotsky (1934/1986), Bergounioux (2004), or Wiley (2006a) claim that inner speech only needs to be understood by ourselves, which implies that we can be less distinct, that we can abbreviate inner sentences and that we can even sometimes produce erroneous forms, as long as meaning is preserved. Wiley (2006a, 2014) proposed that the control processes in overt and covert modes are different. In inner speech, efficiency rules prevail, so that production can be sped up and economized. Linguistic rules are therefore weakened and monitoring can be considered as more lax in intentional inner speech than in overt speech. As for less intentional forms of inner speech, which occur during mind wandering, there are, to our knowledge, no studies showing that monitoring mechanisms are at play. By definition, mind wandering operates without executive control, or with only intermittent control (but see Smallwood et al., 2012). In the present paper, we therefore assume that verbal monitoring is reduced during verbal mind wandering, hence the absence of self-editing arrows on the unintentional side in **Figure 1**.

# The ConDialInt Model: Functional Neuroanatomy of Multidimensional Inner Speech

We propose a neurocognitive model that accounts for the varieties of inner speech along the three dimensions described above, and for their monitoring. The ConDialInt model (for Condensation-Dialogality-Intentionality) is based on the preliminary account presented in Lœvenbruck et al. (2018), which focused on the latest stage of the production of intentional inner speech, i.e., articulatory planning. In this preliminary account, inner speech monitoring was based on a predictive control scheme, inspired by Frith et al. (2000) and also described in Rapin et al. (2013) and Perrone-Bertolotti et al. (2014). In Lœvenbruck (2018), a provisional extension of this account was sketched, in which formulation and conceptualization stages were added to the articulatory planning stage. We further elaborate on these propositions and consider a more comprehensive neurocognitive model which addresses the three dimensions of inner speech (**Figure 2**). The ConDialInt model is limited to oral language (inner speech), since available data on inner sign language production are too scant, but we speculate that the auditory processes and representations invoked here for inner oral language may be replaced with visual elements to account for inner sign language.

In the ConDialInt model, verbal monitoring is based on a hierarchical predictive control scheme. Such a scheme was originally proposed for complex movement control by Haruno et al. (2003) and Pacherie (2008). Predictive control has been successfully implemented in speech motor control (e.g., Postma, 2000; Guenther et al., 2006; Houde and Nagarajan, 2011). It is based on the pairing of two types of internal models, a forward model (predictor) and an inverse model (controller). The inverse model computes a motor command, while the forward model predicts the consequence of the ongoing command, using an efference copy of this command. Monitoring is based on several comparisons between desired, predicted and actual sensory outcomes. The crucial comparison involves predicted and desired signals: it allows errors to be monitored before the action is even accomplished. In hierarchical predictive control, pairs of controllers and predictors are organized in cascade, with bidirectional information processing across levels. This type of control has been applied to overt language production by Pickering and Garrod (2013). According to them, monitoring can take place at all stages of language production, using a predictive scheme: actual and predicted semantics can be compared, as well as actual and predicted syntax, and actual and predicted phonology. Any mismatch between actual outputs and predictions may trigger a correction, by tuning the internal models at each stage. The ConDialInt model is an adaptation and extension of Pickering and Garrod's (2013) hierarchical predictive control model of overt speech production to covert speech production. Importantly, compared with Pickering and Garrod's original model, it provides a detailed implementation of the predictive control scheme at each of the hierarchical levels. This fine-grained implementation of predictive control enables us to describe the varieties of inner speech along the condensation dimension by integrating an inhibitory control mechanism that can be applied at different levels in the hierarchy. The higher in the hierarchy the speech production flow is interrupted, the more condensed the inner speech variety. The model accounts for dialogality by replacing the speaker's own internal models with internal models that simulate other speakers' vocal productions and by including perspective switching mechanisms (from speaker to addressee). Finally, it accounts for intentionality by incorporating different degrees of production monitoring.
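The pairing of an inverse model with a forward model, and the comparisons on which monitoring relies, can be illustrated with a deliberately simple numerical sketch. It is a toy under strong assumptions and not an implementation of ConDialInt: signals are single numbers, the controlled system is a linear gain, and the names `LinearPlant`, `InternalModelPair` and `monitor` are hypothetical labels introduced only for this illustration. The `execute=False` case stands for a situation in which motor execution is withheld, so that only the desired and predicted signals are available for comparison.

```python
from dataclasses import dataclass


@dataclass
class LinearPlant:
    """Toy controlled system (assumption): outcome = gain * command."""
    gain: float

    def execute(self, command: float) -> float:
        return self.gain * command


@dataclass
class InternalModelPair:
    """An inverse model (controller) paired with a forward model (predictor).

    The two gains are separate internal estimates, so the pair can disagree
    with itself, which is what makes the desired-vs-predicted check useful.
    """
    controller_gain: float
    predictor_gain: float

    def inverse(self, desired: float) -> float:
        # Controller: command expected to produce the desired outcome.
        return desired / self.controller_gain

    def forward(self, efference_copy: float) -> float:
        # Predictor: outcome predicted from a copy of the command.
        return self.predictor_gain * efference_copy


def monitor(pair: InternalModelPair, plant: LinearPlant,
            desired: float, execute: bool = True) -> dict:
    """Return the comparison errors available with or without execution."""
    command = pair.inverse(desired)
    predicted = pair.forward(command)  # fed with the efference copy
    # This comparison is available before any action is carried out.
    errors = {"desired_vs_predicted": desired - predicted}
    if execute:
        actual = plant.execute(command)
        errors["predicted_vs_actual"] = predicted - actual
        errors["desired_vs_actual"] = desired - actual
    return errors


if __name__ == "__main__":
    pair = InternalModelPair(controller_gain=2.5, predictor_gain=2.0)
    plant = LinearPlant(gain=2.0)
    print(monitor(pair, plant, desired=4.0, execute=False))
    print(monitor(pair, plant, desired=4.0, execute=True))
```

In this sketch the mis-tuned controller is caught by the desired-versus-predicted comparison alone, without the plant ever being driven, which is the property the text singles out as crucial. A hierarchical version would chain several such pairs, each level passing its output down as the next level's desired signal.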

Another predictive account of inner speech has been provided by Wilkinson and Fernyhough (2017). Their account takes a predictive processing approach, stemming from Friston's (2005) active inference theory. Our own model is compatible with many of their hypotheses, but differs slightly in a number of ways. First, as explained below, we claim that inner speech, in its most expanded form, does entail a stimulus, a sensation, and that this sensation is a prediction, derived from motor commands. Second, we argue that inner speaking (in an expanded way) is indeed imagining oneself speaking, i.e., simulating the act of speaking, and that this simulation can take place with different voices, giving rise to different percepts. We speculate that speakers develop internal (or generative) models of themselves as well as of others, and that these internal models allow them to simulate different voices. Third, we assume that the ability to engage in dialogs (covertly and overtly) comes with a mechanism by which speakers can keep track of perspectives. This mechanism allows one to imagine that someone is speaking to them. As we describe below, it is precisely this ability which explains the move from "me speaking" to "other speaking" that Wilkinson and Fernyhough argue is lacking in more traditional self-monitoring models of inner speech. We contend that this perspective switching ability, together with voice modulation (own voice vs. other voice), lies at the origin of auditory verbal hallucination, when self-monitoring goes awry.

Our model resolves a few ambiguities in Pickering and Garrod's original model, which does not specify in detail what the forward-inverse pairs implement at each of the hierarchical levels. In our view, at the lowest level (articulatory planning), the predictor-controller pair functions just as described in typical predictive control models of action control (e.g., Miall and Wolpert, 1996). The predictor model is thus a model of the biophysical speech apparatus, that converts motor commands (or rather efference copies of motor commands) into predicted articulatory movements and their resulting sounds and somatosensory percepts. At the higher levels (formulation and conceptualization), however, there is no biophysical apparatus to be modeled, and no movements or sounds to be predicted. The predicted representations at these levels are abstract phonetic goals and preverbal messages. We assume, therefore, that the pairs of predictors and controllers in the two highest hierarchical levels are not models of any biophysical apparatus. They are computational procedures that convert one type of mental representation (e.g., broad language goal) into another type of mental representation (e.g., preverbal message). Consequently, in the ConDialInt model, hierarchical predictive control of inner speech runs as follows. At the conceptualization stage, the broad language goal is converted into a desired preverbal message by a conceptualization controller. This desired preverbal message is the highly condensed inner speech percept. It is sent back as input to a conceptualization predictor, which predicts the language goal that would derive from it. Desired and predicted language goals can thus be compared, provided that the desired goal is buffered, so that desired and predicted signals are temporally aligned (as represented by the Δt triangle in **Figure 2**).
Any error at this early monitoring stage can be corrected for, by sending an error signal to the conceptualization controller and by delaying lower level processes. At the formulation stage, the desired preverbal message is converted into a desired phonetic goal by a formulation controller. This desired phonetic goal corresponds to a semi-expanded inner speech percept and can be transformed (in the articulatory planning stage) into motor commands. In robotics or limb control theory, goals are desired configurational states of the peripheral motor system, specified in terms of position and velocity of the motor apparatus (e.g., Miall and Wolpert, 1996). This is appropriate for movements of the hand or arm. In the case of dynamic speech control, it is unlikely that the phonetic targets of the speakers are exclusively specified in terms of spatial configurations, i.e., positions and velocities of the speech articulators. Many studies suggest instead that speech targets are defined in both auditory and articulatory terms (for arguments on auditory targets see e.g., Perkell et al., 1997 or Guenther et al., 2006; for arguments on articulatory, i.e., somatosensory, targets, see Saltzman and Munhall, 1989, Browman and Goldstein, 1989 or Tremblay et al., 2003; for arguments on auditory-somatosensory targets, see e.g., Lœvenbruck, 1996, Patri et al., 2018, Perkell, 2012 or Perrier et al., 1996). We therefore argue that the phonetic goal is a supramodal integration of auditory and somatosensory (and perhaps even visual) representations. A formulation predictor can transform the phonetic goal back into a predicted preverbal message, which can be compared with the (buffered, see Δt triangle) desired one. Any error at this intermediate monitoring stage can be corrected for by sending an error signal to the formulation controller (and perhaps also, by bottom-up cascade, to the conceptualization controller) and by delaying lower level processes.
It has been claimed that the formulation stage itself can be divided into grammatical and phonological encoding (see e.g., Levelt, 1989). In this case, then, the pair of controller-predictor at the formulation stage should be replaced with two pairs, one for each sublevel. Lastly, at the articulatory planning stage, the desired phonetic goal is converted into motor commands by an articulatory-planning controller. In the case of overt speech, the motor commands are fed to the speech apparatus, resulting in articulatory movements and sounds. In the case of covert production, the motor commands are inhibited, resulting in no movement of the speech apparatus. In both overt and covert cases, an efference copy of the motor commands is sent to an articulatory-planning predictor which generates a predicted sensory experience (ahead of the actual experience, in the case of overt speech).
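
The three-stage cascade described above can be sketched as follows. This is a simplified illustration, not the model's implementation: representations are plain strings, each controller merely wraps its input and each predictor unwraps it, and the articulatory-planning predictor is collapsed into a single step, whereas in the text it produces a multisensory prediction that is then integrated into a supramodal representation. The names `Level`, `wrap`, `unwrap`, and `run_hierarchy` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Level:
    name: str
    controller: Callable[[str], str]  # desired input -> desired output of this stage
    predictor: Callable[[str], str]   # stage output -> predicted input, for comparison


def wrap(tag: str) -> Callable[[str], str]:
    return lambda s: f"{tag}({s})"


def unwrap(tag: str) -> Callable[[str], str]:
    return lambda s: s[len(tag) + 1:-1]


def run_hierarchy(levels: List[Level], goal: str) -> List[Tuple[str, str, bool]]:
    trace = []
    signal = goal
    for level in levels:
        buffered = signal                      # desired signal held in a buffer (the Δt delay)
        output = level.controller(buffered)
        predicted = level.predictor(output)    # prediction sent back one level up
        match = predicted == buffered
        trace.append((level.name, output, match))
        if not match:
            break                              # lower-level processes are delayed on error
        signal = output
    return trace


levels = [
    Level("conceptualization", wrap("message"), unwrap("message")),
    Level("formulation", wrap("phonetic"), unwrap("phonetic")),
    Level("articulatory planning", wrap("motor"), unwrap("motor")),
]

for name, output, match in run_hierarchy(levels, "define ball"):
    print(name, output, match)
```

Replacing one predictor with a function that returns a different string makes the comparison fail at that level, and the loop stops there, which mirrors the idea that an error at an early monitoring stage delays the lower-level processes.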

This sensory experience corresponds to the percept of an inner voice, with auditory as well as somatosensory qualities. As we have argued in Lœvenbruck et al. (2018) and Perrone-Bertolotti et al. (2014), inner speech can be associated with auditory as well as somatosensory representations. Somatosensory representations include tactile and proprioceptive sensations in the speech organs, that, like auditory sensations, result from imagined articulatory gestures. The claim that the inner voice has auditory qualities is supported by introspective data on timbre, pitch, and intensity (e.g., Egger, 1881), by behavioral findings (e.g., Reisberg et al., 1989; Smith et al., 1995; Corley et al., 2011; Dell and Oppenheim, 2015) and by neuroimaging data (e.g., Bookheimer et al., 1995; Sato et al., 2004; Lœvenbruck et al., 2005; Basho et al., 2007). The assumption that somatosensory representations may sometimes also be at play comes from introspective data (Taine, 1870; Paulhan, 1886) as well as a few neuroimaging results (e.g., Rosen et al., 2000; Huang et al., 2002). Further empirical data are needed to define whether somatosensory signals are systematically involved during expanded inner speech. Our model includes this possibility. The argument that these multisensory signals result from simulated motor actions of the speech organs is itself supported by introspective experiments (Bain, 1855; Stricker, 1885), physiological measurements (Jacobson, 1931; Sokolov, 1972; Conrad and Schönle, 1979; McGuigan and Dollins, 1989; Livesay et al., 1996) as well as neuroimaging data (Bookheimer et al., 1995; McGuire et al., 1996; Baciu et al., 1999; Palmer et al., 2001; Shergill et al., 2001; Huang et al., 2002; Basho et al., 2007; Partovi et al., 2012).

The multisensory experience is integrated into a predicted supramodal representation which can be compared with the (buffered, see Δt triangle) desired phonetic goal. Any error at this last monitoring stage can be corrected for by sending an error signal to the articulatory-planning controller (this error signal may perhaps also be fed back to higher-level controllers) to issue new commands. In the case of overt speech production, this allows for errors to be corrected before the utterance is even produced, a strong argument for predictive control. In action control, it has been claimed (by Frith et al., 2000, for instance) that the efference copy mechanism is crucial to the sense of agency, the feeling of being the agent of our own action. In Rapin et al. (2013) and Lœvenbruck et al. (2018), it was argued that, in inner speech, the sense of agency is derived from the comparison between desired and predicted signals (see also Tian and Poeppel, 2012 and Swiney and Sousa, 2014). We further elaborate on this assumption, by claiming that the comparisons between desired and predicted signals at each level provide a sense of agency (referred to as "A" in **Figure 2**) of the inner production. This is represented with a "<" sign at each level, symbolizing the presence of a desired signal ahead of the predicted signal. Several studies have reported dampened neural response in auditory cortex during inner speech and silently mouthed speech compared with speech perception (e.g., Ford and Mathalon, 2004; Agnew et al., 2013). One interpretation is that the monitoring mechanism not only makes it possible to check that predicted signals are similar to the desired ones, but also plays a role in sensory attenuation. When desired and predicted signals match, a dampening of the self-generated sensory experience takes place, so that any external sensory experience is easier to detect (e.g., Blakemore et al., 2002; Ford and Mathalon, 2004).
The ConDialInt model therefore includes an attenuation mechanism at the articulatory planning stage, when desired and predicted signals are consistent.

As concerns the condensation dimension, the ConDialInt model includes inhibitory control mechanisms at each hierarchical level (orange arrow in **Figure 2**). The level at which the speech production flow is inhibited defines the degree of condensation. Inhibition at the formulation stage interrupts production at the preverbal message and results in highly condensed inner speech. Inhibition at the articulatory planning stage terminates production at the phonetic goal, giving rise to a semi-expanded variety. When inhibition occurs further down the production flow, it cancels out motor commands but a predicted sensory experience can still be computed. Therefore, inhibition at this level prevents articulatory gestures from being generated but releases the experience of expanded inner speech, with auditory and somatosensory qualities, i.e., the little voice we can hear in our head.
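
The relation between the level of inhibition and the resulting inner speech variety can be summarized in a short sketch. The stage label "motor execution" is an illustrative name for the point at which motor commands would reach the speech apparatus, and the function names are hypothetical. Inhibition at the conceptualization stage is not described in the text and is therefore not covered.

```python
from typing import List, Optional

# Production flow, from highest to lowest level ("motor execution" is an assumed label).
PIPELINE = ["conceptualization", "formulation", "articulatory planning", "motor execution"]

# Variety obtained when the flow is inhibited at a given stage (None = no inhibition).
VARIETY = {
    "formulation": "highly condensed inner speech (preverbal message)",
    "articulatory planning": "semi-expanded inner speech (phonetic goal)",
    "motor execution": "expanded inner speech (predicted inner voice)",
    None: "overt speech",
}


def completed_stages(inhibit_at: Optional[str]) -> List[str]:
    """Stages that run to completion before the inhibited stage."""
    if inhibit_at is None:
        return list(PIPELINE)
    return PIPELINE[:PIPELINE.index(inhibit_at)]


def variety(inhibit_at: Optional[str]) -> str:
    return VARIETY[inhibit_at]


for stage in ["formulation", "articulatory planning", "motor execution", None]:
    print(stage, "->", completed_stages(stage), "->", variety(stage))
```

Read this way, the earlier the inhibition occurs in the list, the fewer stages are completed and the more condensed the resulting variety.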

The ConDialInt model also accounts for dialogality. When inner speech is produced with one's own voice, the processes described above simply unfold, stopping at various stages, depending on the condensation dimension. When one covertly imitates someone else's voice, the controller and predictor internal models are adapted, modulated, in order to control and predict a voice other than one's own. Pickering and Garrod (2013) have claimed that their hierarchical predictive control scheme can also account for efficient speech comprehension, by deriving predictions of the interlocutor's language goals, using predictor models. This implies that listeners are able to build adapted internal models of their interlocutor, at the different stages of language processing. Indeed, when we know someone's voice, and know them well, we can often also recognize their phonological, lexical, syntactic, and prosodic habits. In such cases, we can therefore, presumably, make reasonably accurate adaptations of our own predictors and controllers, that fit with our interlocutors' features, at each linguistic level. Similarly, when we covertly imitate someone, adaptations of the controller-predictor pairs at each stage could also be made, resulting in predicted signals that correspond to a different inner voice than our own. In **Figure 2**, the possibility of adapting predictors and controllers is represented with a blue-red fading pattern (with blue for self, and red for others). The outputs of the predictors and controllers at each stage (which correspond to inner speech varieties) are represented with blue-red bordered boxes. Moreover, dialogality (in the polyphonic sense explained above) also implies switches in perspective. Not only can we mentally imitate someone's voice, but we can also imagine that someone else is talking to us.
Dedicated neural mechanisms have been shown to be at play when participants are asked to imagine being the agent of the action or when they imagine another person being the agent (Ruby and Decety, 2001). Compared with imagining being the agent (first-person perspective), imagining another person being the agent (third-person perspective) has been shown to elicit responses in the right inferior parietal lobule, the precuneus, the posterior cingulate, and the fronto-polar cortex. In line with these findings on motor imagery, we assume that the dialogality dimension involves a perspective switching mechanism, as well as further monitoring and executive control processes. In monologal inner speech, a first-person perspective is taken, in which one imagines being the agent of the speech action. In dialogal inner speech, a third-person perspective is taken, in which one imagines another person being the agent. The perspective switch, from first-person to third-person, probably occurs during the latest stage of speech production, i.e., during articulatory planning, when physical embodiment takes place and the voice is being generated (predicted). The initial stages, conceptualization and formulation, are more abstract, less embodied, and can be initiated with one's own or someone else's linguistic habits. Up to these stages, imagining someone else speaking (rather than oneself) merely requires using internal models that are adapted to that individual's linguistic characteristics (lexicon, syntax, prosody). Changing the agent of the imagined verbal action does not otherwise modify conceptualization and formulation. Articulatory planning, on the other hand, is affected by the change in agent, since it is the stage at which the verbal material becomes physically instantiated, with full articulatory specification. Articulatory planning involves predicting the temporal dynamics of the position and velocity of the speech articulators. When one imagines oneself speaking, these articulatory configurations are computed from a first-person perspective. When one imagines another individual speaking, the dynamics of the configurations of the speech apparatus are computed from a third-person perspective. The ConDialInt model therefore includes a mechanism by which this change in point of view can operate.
This is illustrated in **Figure 2**, by the addition of purple boxes at the articulatory planning stage, which account for the perspective switch that operates in dialogal inner speech.

As concerns the intentionality dimension, we argue that verbal monitoring only concerns intentional inner speech. During intentional inner speech, the signals generated by the controllers at each level are converted by predictors into predicted signals that are sent back one level up in the hierarchy to be compared with initial desired signals. As stressed above, the comparison process is more lenient than in overt speech, hence the approximate symbols in **Figure 2**. In unintentional inner speech, we assume that no verbal monitoring takes place: unbidden verbal thoughts arise, but they are not compared with initial objectives. Therefore, the control is merely feedforward, but comparisons between predictions and goals may still take place, for agency to be felt. Even unintentional inner speech comes with a feeling of agency. When that feeling is defective, auditory verbal hallucination may occur. In the ConDialInt model a distinction is therefore made between verbal monitoring (M), which only concerns intentional varieties (represented in green in **Figure 2**), and agency attribution (A), which concerns all varieties.

We speculate on a tentative neuroanatomical grounding for this functional account, based on previous neuroimaging studies and descriptions. The predominantly left-lateralized neural regions associated with the different processes are listed in each box in **Figure 2**. As concerns the conceptualization stage, following considerations by Blank et al. (2002), Caplan et al. (2000), Duffau et al. (2014), Gernsbacher and Kaschak (2003), Haller et al. (2005), Hickok (2009), Indefrey et al. (2001), Indefrey and Levelt (2004), Lœvenbruck et al. (2005), Rauschecker and Scott (2009), Tian and Poeppel (2013), and Tremblay and Dick (2016), we assume that the regions engaged form a predominantly left-lateralized ventral stream that includes the dorsolateral prefrontal cortex (DLPFC), the orbitofrontal cortex, the pars orbitalis of the inferior frontal gyrus, the temporal pole and the posterior middle temporal gyrus, with ventral temporofrontal connections presumably involving the inferior occipitofrontal fasciculus (fascicles are not mentioned in **Figure 2**, for simplification).

Next, based on consideration by Duffau et al. (2014), among others, we presume that the formulation stage, which generates lexico-prosodico-syntactico-morpho-phonological representations, involves a dorsal stream, with recruitment of the posterior part of the left superior and middle temporal lobe as well as the left inferior frontal gyrus (IFG, pars opercularis) and with dorsal connections via the superior longitudinal fasciculus, as well as the arcuate fasciculus. We add that the left inferior parietal lobule (IPL) is recruited at this stage, to form the supramodal phonetic goal. We have argued that the phonetic goal is in an integrated supramodal format, which is consistent with IPL recruitment. But it is still an open question whether, at this formulation stage, the activation of the left IFG precedes that of the IPL or whether, instead, the IPL itself provides efferences to the IFG. **Figure 2** opts for the first scheme (at the formulation stage).

*Figure caption (fragment): … activations. The cross sign refers to the comparison that takes place between the intended phonetic goal and the integrated multisensory prediction.*

We claim that, for expanded varieties of inner speech, articulatory planning follows. A preliminary neural network for this last stage was presented in Lœvenbruck et al. (2018). This proposition was based on considerations and models by Indefrey (2011), Guenther and Vladusich (2012), Hickok (2012), and Tian and Poeppel (2013), among others. We slightly revise this initial proposition to better capture the notion of supramodal phonetic goal described above, to allow for suggestions by Flinker et al. (2015) and by Duffau et al. (2014) on temporo-frontal connections, and to include recent considerations on the role of the cerebellum in language production and internal models (see e.g., Imamizu and Kawato, 2009; Buckner et al., 2011; Smet et al., 2013; Mariën et al., 2014; Diedrichsen and Zotow, 2015; Sokolov et al., 2017). Our speculation takes advantage of the double representation of cerebral regions in the anterior and posterior lobes of the cerebellum (see e.g., Sokolov et al., 2017). **Figure 3** illustrates this revised view of the left cerebral and right cerebellar regions involved. The phonetic goal is sent from the left inferior parietal lobule (or the left IFG, if IPL-IFG connections are in the reversed order, see above) to the cerebellum (possibly the anterior lobe), via the pons. A conversion takes place through the controller in the cerebellum, which generates a motor specification sent to the left frontal regions via the thalamus. Motor programs are then issued, by coordinating the motor specification, stemming from the cerebellum, with ongoing speech actions. We speculate that the regions involved in this process are the triangular and opercular IFG and the anterior insula, then the ventral premotor cortex, the supplementary motor area
and the primary motor cortex (via the frontal aslant tract, not shown in **Figures 2**, **3**). There are arguments for the hypothesis that the IFG recruitment precedes ventral premotor cortex activation (e.g., the electrocorticography speech production study by Flinker et al., 2015) and that the inferior parietal lobule (supramarginal gyrus) sends efferences toward the ventral premotor cortex, via the anterior part of the superior longitudinal fascicle (Duffau et al., 2014). There are also arguments for the existence of connections from the IPL toward the cerebellum (Miall, 2003; Imamizu and Kawato, 2009) and from the cerebellum to the frontal motor and premotor areas, possibly including the IFG (Imamizu and Kawato, 2009; Murdoch, 2010). What remains unclear is whether the direct (not mediated by the cerebellum) parieto-frontal connection is associated with the articulatory planning stage or only relevant to the formulation stage (as assumed here). We claim that the motor commands that result from the motor specification are not issued to the speech apparatus (inhibition) but they are sent, via the pons, to the cerebellum (possibly the posterior lobe), which, we speculate, includes a predictor. We further speculate that the cerebellum issues, via the thalamus, a multisensory prediction, which is processed by the auditory cortex (superior temporal gyrus) and the somatosensory cortex (postcentral gyrus). This multisensory prediction gives rise to the percept of an inner voice, that unfolds over time. The sequence of activation from inferior parietal to temporal cortex (mediated, we argue, by cerebellum and inferior frontal regions) is compatible with the MEG data obtained by Tian and Poeppel (2010). In an articulation imagery task, they found that the auditory response was elicited around 170 ms after a posterior parietal activity (where we think the phonetic goal is built) was recorded.
We speculate that the auditory and somatosensory responses are further integrated into a supramodal representation, via the temporo-parietal junction (TPJ). The resulting supramodal phonetic prediction is compared with the desired phonetic goal within the IPL and monitoring can take place. Note that in this account, the IFG is involved at two stages. In an early stage, during formulation, we consider that the triangular part of the IFG plays a role in the monitoring of thematic roles (who-does-what-to-whom) that is crucial to morphosyntactic processing (see Caplan and Hanna, 1998; Caplan et al., 2000; Indefrey et al., 2001; Lœvenbruck et al., 2005). In a later (articulatory planning) stage, we claim that the opercular part may be involved in the coordination and sequencing of articulatory gestures (Blank et al., 2002; Indefrey and Levelt, 2004).

Moreover, we presume that cognitive control, which has been defined as the "ability to orchestrate thought and action in accordance with internal goals" (Miller and Cohen, 2001) must take place to inhibit motor execution and to interrupt production before articulatory planning, when appropriate (condensation dimension). Cognitive control is also needed to launch the adaptation of internal models (controllers/predictors) at each stage, when different voices are imagined (dialogality dimension), and to tune the strength of the monitoring processes depending on the degree of willfulness (intentionality dimension). Cognitive control has been shown to recruit various regions of the prefrontal cortex (PFC), including dorsolateral PFC, ventrolateral PFC, orbitofrontal cortex, and anterior cingulate. It is still debatable what the roles of the different subregions of PFC are and it is beyond the purpose of this paper to describe them. We refer to Ridderinkhof et al. (2004) for more detail. We have therefore added the prefrontal cortex and the anterior cingulate cortex (ACC) above all processes. In addition, the modulation and adaptation of internal models during dialogal inner speech presumably requires memory retrieval processes, in search of the voice quality and linguistic features of the imagined other.

We have therefore added the hippocampus in the set of crucial regions. Furthermore, as mentioned above, the right IPL, the precuneus, the posterior cingulate, and the fronto-polar cortex are claimed to play a role in first-/third-person perspective taking (Ruby and Decety, 2001; Decety, 2005). Decety and Grèzes (2006) provide further argument for the role of the right IPL in the attribution of actions, emotions, and thoughts to their respective agents when one mentally simulates actions for oneself or for another individual. Their review of the literature shows that it is difficult to assess whether the crucial region in this process is the rostral part of the right IPL or the right TPJ. The purple boxes in **Figure 2** for the operations of phonetic goal construction, sensory experience processing and multisensory integration, represent the perspective switching operations, which presumably include a shift in hemispheric dominance, from left to right IPL and/or TPJ, as well as recruitment of the precuneus and posterior cingulate.

## Assessing the Neural Networks Mediating Multidimensional Inner Speech

The aim of the present study is to examine the neuroanatomical assumptions of the ConDialInt model by investigating the neural correlates of multidimensional inner speech using fMRI. Previous fMRI studies of inner speech did not address dialogality and intentionality simultaneously.

Along the dialogality dimension, the study by Tian et al. (2016) compared inner speaking (articulation imagery) and imagining someone else speaking (hearing imagery), but only single syllables were used, which is restrictive. In addition, the participants were explicitly trained to mentally articulate during inner speaking, while they were asked to minimize articulatory feeling and rely instead on auditory memory processes during hearing imagery. These results are interesting but they are not sufficiently informative as to which neural networks are involved in less constrained inner speech (i.e., during full sentence production and with less attentional focus on articulatory sensation and auditory memory). The study by Alderson-Day et al. (2016) addressed dialogality in a more ecological way, using scenarios designed to elicit either monologal (soliloquial) or dialogal (imagining a dialog with another person) inner speech. Participants used one single voice in the monologal condition and several voices in the dialogal condition. Therefore, comparing these two conditions does not make it possible to isolate the processes that specifically underlie perspective shifting from the confounding factor of voice modulation.

Along the intentionality dimension, Hurlburt et al. (2016) carefully addressed the difference between intentional monologal and unintentional monologal inner speech (which they refer to as spontaneous inner speaking). They also investigated unintentional dialogal inner speech (referred to as spontaneous inner hearing). Although unintentional monologal inner speech was relatively frequent, occurring in 29 percent of their samples and for each of their five participants, unintentional dialogal inner speech was rare (occurred zero times or twice) for three participants. Further data are therefore needed on dialogal inner speech.

The conditions in the present study were specifically designed to compare inner speech varieties along the two dimensions of dialogality and intentionality. To explore dialogality, three controlled inner speech conditions were compared, during which participants were instructed to mentally generate verbal definitions of visually presented words (they were primed with a written word and its pictorial illustration). In the intentional monologal self-voice condition, participants were asked to covertly produce a definition, with their own voice. In the intentional dialogal other-voice condition, they were instructed to imagine that someone was producing an utterance addressed to them. Compared with the monologal self-voice condition, this condition requires two additional processes: mentally altering one's voice, which implies prosodic and voice quality control, and taking an allocentric perspective. To specifically examine perspective taking, without the confounding factor of voice alteration control, we added an intermediate condition in which participants were asked to covertly produce a definition, with someone else's voice (intentional monologal other-voice). To explore the intentionality dimension, in addition to these conditions, a mind wandering session took place, after which participants were asked to report any spontaneously occurring verbal material. The mind wandering session was also meant to allow us to explore the condensation dimension. To assess to what extent auditory processes are at play during inner speech, we added a speech perception condition.

# MATERIALS AND METHODS

## Participants

Twenty-four healthy native speakers of French were included (10 men; mean age = 29.5 years, SD = 10.04; 14 women, mean age = 28.07 years, SD = 8.14). All participants were right-handed (Edinburgh Handedness Inventory; Oldfield, 1971), scored average on a mental imagery questionnaire (based on Sheehan, 1967), had normal or corrected-to-normal vision and had no history of neurological or language disorders. Each participant gave written informed consent and received 30€ for their participation. Ethical approval was granted by the Comité de Protection des Personnes (CPP) Sud-Est V and by the National Competent Authority France-ANSM (Ref. CPP: 14-CHUG-39, Ref. Promoteur: 38RC14.304, ID-RCB: 2014-A01403-44, Ref. ANSM: 141200B-31, ClinicalTrials.gov ID: NCT02830100).

## Tasks

Participants were first introduced to an avatar, who gave them instructions and provided training for the five conditions. The avatar had a saliently high-pitched voice which was sufficiently strange (outside of an adult's typical pitch range), yet easy to imitate for everyone. The first four conditions included one speech perception condition and three intentional inner speech conditions. In these four conditions, each trial started with the visual presentation of a written word and its illustration. For example, the written word "ball," with a picture of a ball (framed within a stylized clock) was visually presented for 2 s, after which the clock rotated and the participant performed the task, which lasted for 4 s. Each trial was repeated several times in each condition (see section "Stimuli"). In the "Speech Perception" (SP) condition, participants had to listen to the definitions presented to them via MR-compatible earphones. The definitions were pronounced by the avatar with the high-pitched voice. Each definition began with "This is something. . .". In the Monologal Self-voice inner speech (MS) condition, participants had to mentally generate definitions of each of the visually presented objects, using a sentence beginning with "This is something." Participants were not reading sentences; they had to generate their own definitions. The stimuli were purely visual (no audio presentation of the word). The Monologal Other-voice inner speech (MO) condition was similar to the MS condition, except that participants had to mentally imitate the high-pitched voice of the avatar. In the Dialogal Other-voice (DO) condition, participants had to imagine that the avatar was addressing them, producing a sentence starting with "Here is a typical image of a. . ." and ending with the name of the object, without generating a definition (to reduce cognitive load). The fifth condition investigated "Verbal Mind Wandering" (VMW). In this condition, a written word and its illustration were first visually presented for 2 s, in order to provide the same initial visual stimulation as in the other four conditions. After the initial 2 s written word-illustration presentation, participants were asked to fixate a stylized clock rotating for 30 s. They were instructed to monitor spontaneously occurring thoughts. At the end of the trial, they reported the periods during which they experienced verbal thoughts, by selecting time portions on the stylized clock which appeared on the screen, using a joystick.
The stimulus presentation and collection of joystick responses were controlled using the Presentation software (Neurobehavioral systems)<sup>1</sup> .

## Stimuli

Four 30-word lists of nouns were created using the LEXIQUE database (New et al., 2001). In order to facilitate the generation of definitions, only frequent and imageable words were chosen. All nouns were of neutral affective content and included the categories of food, houseware, furniture, clothing and transportation devices. Each list was randomly assigned to one of the first four conditions. The lists were the same for all participants. They were carefully matched for syllable counts, frequency, familiarity, concreteness and imageability. Only one item was presented (a clock) in the fifth condition (VMW).

The audio stimuli (for the SP condition) and the instructions were recorded by two female native speakers of French in a quiet room. One speaker generated the avatar's voice contents, i.e., task instructions for SP, MO and DO, as well as definitions used in the SP condition. The other speaker generated instructions for the remaining conditions (VMW and MS). Audio signals were digitized with a sampling frequency of 44100 Hz and 32-bit resolution, then normalized in amplitude to the mean power of all stimuli. The recorded definitions in the 30 test trials for the SP condition lasted on average 2.87 s (SD = 0.44).
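The amplitude normalization step can be illustrated with a minimal sketch. It assumes that "power" means the mean squared amplitude of each recording and that each recording is rescaled by a single gain. The function names and the toy signals are hypothetical and are not taken from the study's processing pipeline.

```python
import math

def mean_power(signal):
    """Mean squared amplitude of one signal (a list of samples)."""
    return sum(s * s for s in signal) / len(signal)

def normalize_to_mean_power(stimuli):
    """Scale every stimulus so that its power equals the mean power of the set.

    Assumes no stimulus is pure silence (power of zero).
    """
    powers = [mean_power(s) for s in stimuli]
    target = sum(powers) / len(powers)
    normalized = []
    for signal, power in zip(stimuli, powers):
        gain = math.sqrt(target / power)  # amplitude gain, hence the square root
        normalized.append([gain * s for s in signal])
    return normalized

# Hypothetical toy "recordings", not the study's audio
stimuli = [[0.1, -0.1, 0.1, -0.1], [0.4, -0.4, 0.4, -0.4]]
normalized = normalize_to_mean_power(stimuli)
```

After this step, every stimulus in the toy set has the same power, so loudness differences between recordings no longer distinguish the items.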

# Expected Outcomes

Comparing the monologal self-voice (MS) condition with baseline should help assess the predictive control hypothesis. Namely, it is expected that expanded inner speech in the MS condition should recruit speech production processes down to articulatory planning, resulting in a predicted signal, the inner voice, with auditory qualities. It is expected that, compared with baseline, MS should recruit the hippocampus and posterior middle temporal gyrus for the conceptualization stage. The posterior temporal lobe and left inferior frontal gyrus should be recruited for the formulation stage. The left inferior parietal lobule should be activated for the articulatory planning stage (for the specification of the supramodal phonetic goal), as well as the right cerebellum (controller model, for motor commands specification and predictor model for sensory prediction), the left premotor cortex, left IFG and insula (for motor command coordination) and the auditory cortex (for sensory processing). Somatosensory cortex might also be recruited. Furthermore, the prefrontal cortex (middle and superior frontal regions) should be recruited to issue inhibitory control signals, preventing movement of the speech apparatus.

Comparing the MS condition with the speech perception (SP) condition should further assess whether auditory processing is at play during expanded inner speech and whether some attenuation occurs, relative to actual speech perception, as predicted by the model.

Comparing monologal other-voice (MO) and dialogal other-voice (DO) each with the baseline and with SP should further test the predictive control hypothesis and assess the recruitment of motor and auditory processes. Comparing MO with MS should shed light on the first aspect of dialogality, namely voice modulation. Given that the most striking feature of the voice to be mentally imitated was its high pitch, it can be speculated that in MO, intonation control regions should be recruited. In particular, it can be expected that the right inferior frontal gyrus should be activated. In addition, the internal models used in MS (and presumably associated with right cerebellar activation) should be replaced with internal models adapted to this new voice. The cerebellar recruitment might therefore differ in these two conditions.

Comparing DO with MO should shed light on the second aspect of dialogality, namely perspective shifting. Based on Ruby and Decety's (2001) study on perspective shifting, it can be expected that, relative to MO, DO should additionally activate the right parietal cortex, and more specifically, the inferior and superior parietal lobules as well as the precuneus and the posterior cingulate.

Comparing the verbal mind wandering (VMW) condition to the baseline should help to better describe the intentionality dimension and could potentially shed light on the condensation dimension. It can be expected that, compared with the baseline, VMW should activate the default mode network as well as speech production regions. Comparing VMW with MS, MO and DO could potentially provide insight into the neuroanatomical differences between varieties of inner speech along the intentionality dimension.

<sup>1</sup>http://www.neurobs.com

# fMRI Protocol

fpsyg-10-02019 September 16, 2019 Time: 16:33 # 13

A repeated-block design paradigm was used, with two runs, each including all conditions (see **Figure 4**). In all five conditions, participants were asked to remain perfectly still, not to make any head movement and not to articulate. They were trained to do so before entering the scanner. Each run consisted of a sequence of blocks for the five conditions (e.g., SP, MS, MO, DO, VMW) which was repeated three times. Each sequence contained five trials of each of the five conditions. Thus, in each run, each condition was presented in three different blocks of five trials, resulting in 15 trials for each condition. In the SP, MS, MO, and DO conditions, trials were separated by a fixation cross displayed for 2 s. At the beginning of each block, an instruction screen was displayed for 6 s while a recording of the instructions was played in the earphones. Then five trials of the same condition were run. A fixation cross was displayed for 8 s before and after each block. When a participant was doing a task for the first time in the run, the block started with three training trials. The sequence of conditions was pseudo-randomized across participants, with DO always after MO, to reduce confusion between tasks. For each participant, the same sequence order was used for all six repetitions of sequences. This resulted in 30 test trials (two runs, three blocks of five trials in each run) plus six training trials (two runs, three training trials in each run) per condition per participant (i.e., a total of 144 trials for the first four conditions).
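The trial counts stated above follow from the design parameters, as the short sketch below shows. The constant names are illustrative only; the numbers are those given in the text.

```python
RUNS = 2
BLOCKS_PER_RUN = 3            # each condition appears in three blocks per run
TRIALS_PER_BLOCK = 5
TRAINING_TRIALS_PER_RUN = 3   # given only the first time a task occurs in a run
WORD_CONDITIONS = ["SP", "MS", "MO", "DO"]  # the first four conditions

test_trials = RUNS * BLOCKS_PER_RUN * TRIALS_PER_BLOCK       # 30 per condition
training_trials = RUNS * TRAINING_TRIALS_PER_RUN             # 6 per condition
trials_per_condition = test_trials + training_trials         # 36 per condition
total_trials = trials_per_condition * len(WORD_CONDITIONS)   # 144 in total
```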

# Pre- and Post-experiment Questionnaires

One day before the experiment, participants filled in the Edinburgh Handedness Inventory (Oldfield, 1971) and a mental imagery questionnaire, based on and translated from Sheehan (1967). On the day of the experiment, before entering the scanner, they were trained to report on inner speech and to intentionally produce different varieties of inner speech, without articulating. After the experiment, they filled in a recall questionnaire with a list of 60 words, for which they checked whether they had generated a definition in the scanner (20 words were distractors). This aimed at testing their attention during the tasks: if participants were focused on defining the words presented to them during the intentional inner speech tasks in the scanner, when presented with those words after the experiment, they should remember finding a definition for them. Participants also filled in subjective questionnaires to report how well they performed the tasks and to describe their thought contents during VMW.

# fMRI Acquisition

Experiments were performed using a whole-body 3T MR Philips imager (Achieva 3.0T TX Philips, Philips Medical Systems, Best, Netherlands) with a 32-channel head coil at IRMaGe MRI facility (Grenoble, France). The manufacturer-provided gradient echo planar imaging sequence (FEEPI) was used. Forty-two adjacent axial slices parallel to the bi-commissural plane were acquired in non-interleaved mode. Slice thickness was 3 mm. The in-plane voxel size was 3 × 3 mm (240 × 240 mm field of view with an 80 × 80 pixel data matrix). The main sequence parameters were: TR = 2.5 s, TE = 30 ms, flip angle = 82°. Two fMRI runs were conducted while subjects performed the tasks. During the break between the two runs, a T1-weighted high-resolution 3D anatomical volume was acquired, with a 3D T1 TFE sequence (field of view = 256 × 224 × 175 mm; resolution: 0.89 × 0.89 × 1.37 mm; acquisition matrix: 192 × 137 × 128 pixels; reconstruction matrix: 288 × 288 × 128 pixels). Participants' gazes were monitored with an eyetracker to ensure they followed instructions.
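As a quick check of the functional acquisition geometry, the in-plane voxel size is the field of view divided by the matrix size. The sketch below also derives the extent of the slice stack, under the assumption (not stated explicitly in the text) that the adjacent slices have no gap between them. Variable names are illustrative.

```python
FOV_MM = (240.0, 240.0)       # in-plane field of view
MATRIX = (80, 80)             # in-plane data matrix
SLICE_THICKNESS_MM = 3.0
N_SLICES = 42

voxel_x = FOV_MM[0] / MATRIX[0]                            # 3.0 mm
voxel_y = FOV_MM[1] / MATRIX[1]                            # 3.0 mm
voxel_volume_mm3 = voxel_x * voxel_y * SLICE_THICKNESS_MM  # 27.0 mm^3
slab_height_mm = N_SLICES * SLICE_THICKNESS_MM             # 126.0 mm, assuming no inter-slice gap
```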

# fMRI Data Analysis


Image preprocessing and analyses were completed using SPM12<sup>2</sup> (Wellcome Institute of Cognitive Neurology, London, United Kingdom). Standard preprocessing steps were implemented, including slice time correction, rigid body motion correction, a high-pass filter at 1/512 Hz to filter low-frequency non-linear drifts, coregistration of the functional images to each subject's T1 anatomical images, and normalization to the Montreal Neurological Institute (MNI) template. All normalized functional images were smoothed using a Gaussian filter with a full width at half maximum of 8 mm.

Individual subject analyses were conducted by constructing a general linear model for each condition. Five regressors were defined: SP, MS, MO, DO, and VMW. For all conditions, regressors were modelled as box-car functions convolved with a canonical hemodynamic response function (Friston et al., 1994). Inspection of the movement parameters derived from realignment corrections suggested that head movement was limited. Movement parameters were still included as factors of no interest. The run number was added as an additional factor. For the first-level analysis, five contrasts corresponding to each regressor of interest vs. implicit baseline were computed.

For the second level, several analyses were carried out: (i) one-sample T-tests, in order to measure main effects of experimental conditions, (ii) conjunction analyses between each inner speech condition and SP, between all five conditions, between all four inner speech conditions, and between all inner speech conditions grouped together and SP, in order to examine whether perception processes were recruited in all varieties of inner speech, and (iii) one-way within-subject ANOVA, in order to measure differential effects between conditions (Friston et al., 2005; Henson and Penny, 2005).

To study the varieties of intentional inner speech along the dialogality dimension, MS was compared with MO (effect of changing voice) and MO was compared with DO (switching from monolog to dialog). To explore the intentionality dimension, activations in the VMW condition were compared with activations in the intentional MS condition. In all analyses (except for the contrasts between MS and MO), significant voxel clusters on each t-map were identified with Family Wise Error (FWE) correction at p < 0.05. For the MS > MO and MO > MS contrasts, no activation was found at an FWE-corrected threshold. This was not completely unexpected, given that these two conditions are very similar and differ only subtly in the quality of the voice to be mentally produced. Although this is statistically fragile, we report the results at an uncorrected threshold (p < 0.001), since these contrasts are interesting in the framework of our model. Moreover, these preliminary results might guide future neuroimaging studies on inner speech production and imitation, and might help identify regions of interest. Location of cluster maxima was determined using the Automated Anatomical Labeling (AAL) map (Tzourio-Mazoyer et al., 2002).

In order to quantify potential hemispheric asymmetry changes between conditions (from MS to MO and DO), percent MR signal intensity variations, or percent signal changes (%SC), were extracted within a set of regions of interest (ROIs). These ROIs included Frontal Inferior Opercularis, Frontal Inferior Triangularis, Frontal Inferior Orbitalis, Precentral gyrus, Supplementary Motor Area, Superior Temporal, Middle Temporal, Supramarginal gyrus, Inferior Parietal lobule and Superior Parietal lobule, which are among the crucial regions expected to be recruited during expanded inner speech production, according to the ConDialInt model. The ROIs were anatomically defined using the AAL atlas, in both left and right hemispheres.
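To make the regressor construction concrete, the following sketch builds a box-car function and convolves it with a double-gamma hemodynamic response function, then samples the result once per TR. It is an illustration only, not the SPM12 implementation: the HRF shape parameters (response peak shape 6, undershoot shape 16, ratio 1/6, 32 s kernel) are commonly used defaults assumed here, and the onsets in the example are hypothetical. The 4 s task duration and the 2.5 s TR come from the text.

```python
import math

TR = 2.5  # seconds, repetition time reported in the acquisition section

def canonical_hrf(t):
    """Double-gamma HRF at time t (s); shape parameters are assumed defaults."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def boxcar(onsets, duration, n_scans, dt):
    """1 while a task period is on, 0 otherwise, sampled every dt seconds."""
    n_samples = int(round(n_scans * TR / dt))
    return [1.0 if any(o <= i * dt < o + duration for o in onsets) else 0.0
            for i in range(n_samples)]

def regressor(onsets, duration, n_scans, dt=0.1):
    """Box-car convolved with the HRF, then sampled at the start of each scan."""
    box = boxcar(onsets, duration, n_scans, dt)
    kernel = [canonical_hrf(j * dt) * dt for j in range(int(round(32.0 / dt)))]
    convolved = [0.0] * len(box)
    for i, value in enumerate(box):
        if value:
            for j, weight in enumerate(kernel):
                if i + j < len(convolved):
                    convolved[i + j] += value * weight
    step = int(round(TR / dt))
    return convolved[::step]

# Hypothetical example: two 4 s task periods starting at 2 s and 8 s, 16 scans
example = regressor(onsets=[2.0, 8.0], duration=4.0, n_scans=16)
```

The resulting vector has one value per scan and would form one column of the design matrix, alongside the movement parameters and run factor described above.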

# RESULTS

# Behavioral Data

For the recall task carried out after the fMRI experiment, the mean accuracy score across subjects was 84.42% ± 16.63. Only one participant performed poorly (below 50% accuracy). This high mean score, together with the eyetracker monitoring, suggests that participants were focused on the tasks.

After each VMW trial, participants used a joystick to report the presence of verbal episodes on the stylized clock displayed on the screen. Over the two runs (six VMW trials), participants reported between 4 and 22 verbal episodes, with a mean of 13 episodes. The proportion of time spent on verbal thought in all VMW trials ranged from 4 to 67%, with a mean of 35.6% (SD = 15.04).

The subjective post-scan questionnaires also confirmed that the VMW condition contained verbal episodes. More specifically, concerning the condensation dimension, as the graph across all participants presented in **Figure 5** suggests, the VMW condition included various degrees of condensation, from fully expanded sentences (reported as "sometimes present" in 17% of the participants and "often present" in 46%) to speech fragments (reported as "sometimes present" in 38% and "often present" in 29%), words ("sometimes present" in 4% and "often present" in 13%) and even semantic concepts without words ("sometimes present" in 21%).

In addition, the post-scan questionnaires indicate that participants rated their overall performance as correct. The MS condition was rated as easier than the MO condition, itself easier than the DO condition.

# Functional MRI Data

### Effects of Conditions: Cerebral Correlates of Speech Perception and Inner Speech Varieties

Contrasts between each condition and the baseline are presented in **Table 1**, all p < 0.05, FWE correction. All contrasts revealed activation of the right middle and superior occipital cortex and inferior temporal (fusiform) gyrus.

In addition to the activation in visual cortex, the contrast between speech perception (SP) and baseline revealed increased activation in bilateral superior temporal gyri (STG, Brodmann Area (BA) 21, 22, 41), left supramarginal gyrus (SMG, BA 40), left inferior frontal gyrus (IFG, BA 44, 47), left superior frontal gyrus (SFG, BA 8), bilateral premotor (PM) cortex, left supplementary motor area (SMA), left motor cortex, and left hippocampus (**Figure 6A**).

<sup>2</sup>https://www.fil.ion.ucl.ac.uk/spm/

Compared with baseline, intentional monologal self-voice inner speech (MS) yielded greater left hemisphere activation in the IFG (BA 44, 45, 47), middle frontal gyrus (MFG, BA 10), SFG (BA 8), SMG (BA 39), posterior middle/superior temporal gyrus (MTG/STG, BA 21, 22), hippocampus, together with bilateral SMA, bilateral PM cortex, and right cerebellum (**Figure 6B**).

Compared with baseline, intentional monologal other-voice inner speech (MO) revealed greater left hemisphere activation in IFG (BA 44, 47), MFG (BA 10), hippocampus, together with bilateral PM cortex, bilateral SMA, right insula (BA 13) and right cerebellum (**Figure 6C**).

Compared with baseline, intentional dialogal other-voice inner speech (DO) yielded greater left hemisphere activation in MFG (BA 10), middle occipital gyrus (BA 19), left insula (BA 13), together with bilateral PM cortex, IFG (BA 44, 47), and SMA (**Figure 6D**).

Compared with baseline, verbal mind wandering (VMW) yielded greater left hemisphere activation in SMA, together with bilateral IFG (BA 45, 47), insula (BA 13), MFG (BA 9, 10), SMA, medial SFG (BA 9), inferior (BA 39) and superior (BA 7) parietal cortex, precuneus, and left caudate, thalamus, and cerebellum (**Figure 6E**).

# Common Neural Correlates for Inner Speech and Speech Perception

To investigate whether perception processes were recruited in all varieties of inner speech, conjunctions between SP and either MS, MO, DO, or VMW were examined. Conjunctions between each condition and SP are presented in **Table 2**, all p < 0.05, FWE correction.

The conjunction between MS and SP (**Figure 7A**) confirmed that the left IFG, SFG, MTG/STG, SMA, SMG, hippocampus, bilateral PM cortex, and occipital/posterior MTG were recruited by both conditions. The conjunction between MO and SP (**Figure 7B**) yielded activation in left IFG, SFG, MTG, and hippocampus, as well as bilateral SMA, PM, and occipital/posterior MTG, thus revealing a weaker middle temporal cortex activation.

The conjunction between DO and SP (**Figure 7C**) yielded activation in left IFG, SFG, bilateral PM, SMA, right insula and bilateral occipital/posterior MTG but no middle temporal cortex activation.

The conjunction between VMW and SP (**Figure 7D**) yielded activation in left IFG, SFG, bilateral PM, SMA, and occipital/posterior MTG but no middle temporal cortex activation.

Conjunctions between all four inner speech conditions (MS, MO, DO, VMW), between all five conditions (MS, MO, DO, VMW, SP), and between all inner speech conditions grouped together and SP are listed in **Table 2**. Commonly activated regions in all four inner speech conditions (MS, MO, DO, VMW) and in all five conditions (MS, MO, DO, VMW, SP) include the left IFG, and bilateral SMA, but do not include the auditory cortex. The regions that show a conjunction of activity in SP and all inner speech conditions grouped together are illustrated in **Figure 7E**. In addition to left IFG and SMA, they include left supramarginal and middle temporal gyri.
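The logic of a conjunction analysis can be sketched as follows. The text does not specify which conjunction test was applied, so this example assumes a minimum-statistic approach, in which a voxel counts as commonly activated only if its statistic exceeds the threshold in every contrast. The function name, the t-values and the threshold are hypothetical.

```python
def conjunction(t_maps, threshold):
    """Indices of voxels whose statistic exceeds the threshold in every map."""
    n_voxels = len(t_maps[0])
    return [v for v in range(n_voxels)
            if min(t_map[v] for t_map in t_maps) > threshold]

# Hypothetical t-values for three voxels in two contrasts (e.g., MS and SP vs. baseline)
ms_vs_baseline = [5.1, 2.0, 6.3]
sp_vs_baseline = [4.8, 7.0, 1.2]
common = conjunction([ms_vs_baseline, sp_vs_baseline], threshold=4.5)
```

In this toy case only the first voxel survives, because the other two are above threshold in just one of the two contrasts.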

#### Grandchamp et al. The ConDialInt Model of Inner Speech

#### TABLE 1 | Contrasts between each condition and the baseline.

Multiple peaks in each cluster are presented at p < 0.05 FWE correction. Main clusters are represented in bold font, with their extent size provided. Sub-clusters are represented in regular font.

To further examine the degree of auditory activation in the different conditions, we extracted the %SC within a large temporal ROI including the Superior and Middle Temporal gyri (anatomically defined using AAL), in each hemisphere. The values are displayed in **Figure 8** for each of the 5 conditions, in the left and right hemispheres. For each hemisphere, a one-way ANOVA was run on the %SC with condition as a factor. In the left ROI, results showed that the %SC in the SP condition was significantly different from each of the inner speech conditions (p < 0.001), with higher left temporal activation in SP than in each of the inner speech conditions. In addition, the MS condition was significantly different from VMW (F(1,23) = 7.92, p < 0.01), with higher left temporal activation in MS than VMW. In the right ROI, the %SC in the SP condition was significantly higher than in each of the inner speech conditions (p < 0.001). In addition, the %SC in the right ROI in the DO condition was significantly higher than in MS (F(1,23) = 16.11, p < 0.001) and MO (F(1,23) = 16.72, p < 0.001), and the %SC in the right ROI was higher in VMW than MS (F(1,23) = 5.96, p = 0.02).
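For readers unfamiliar with the %SC measure used in these ROI analyses, a minimal sketch is given below. The text does not give the exact formula, so the definition used here (mean task signal relative to mean baseline signal, expressed as a percentage) is an assumption, and the signal values are hypothetical.

```python
def percent_signal_change(task_signal, baseline_signal):
    """Percent change of the mean task signal relative to the mean baseline signal."""
    baseline = sum(baseline_signal) / len(baseline_signal)
    task = sum(task_signal) / len(task_signal)
    return 100.0 * (task - baseline) / baseline

# Hypothetical mean ROI intensities (arbitrary scanner units)
psc = percent_signal_change(task_signal=[101.0, 103.0],
                            baseline_signal=[100.0, 100.0])
```

One such value per participant, ROI and condition would then enter the ANOVA with condition as a factor.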

## Contrasts Between Conditions: Dialogality and Intentionality Dimensions

Contrasts between MS and MO, MO and DO, and VMW and MS are presented in **Table 3**, all for p < 0.05, FWE correction, except for the contrasts between MS and MO (p < 0.001, uncorrected).

#### **Dialogality dimension: voice control in inner speech**

The contrasts between MS and MO suggest that covertly using someone else's voice (MO) vs. one's own voice (MS) resulted in an increased involvement of the right hemisphere (**Figures 9A,B**). More specifically, in the MS > MO contrast, greater left hemisphere recruitment was observed, with activation in left IFG (BA 45), SFG (BA 8), medial SFG (BA 8, 32), middle cingulate, postcentral, and superior parietal lobule (BA 7). In MO > MS, greater right hemisphere involvement was found, with activation in right IFG (BA 44, 45), SMA, MFG (BA 10) and inferior parietal lobule (BA 40).

### **Dialogality dimension: perspective control in inner speech**

Perspective switching, from monologal other-voice to dialogal other-voice, was examined through the MO vs. DO contrasts (**Figures 9C,D**). In MO > DO, greater activation was observed in left IFG (BA 44), SMA, and ACC. In DO > MO, we found greater recruitment of right IFG (BA 44), MFG (BA 8, 10, 46), SFG (BA 8), as well as bilateral inferior (BA 39, 40) parietal lobules, precuneus and posterior cingulate cortex. This last contrast indicates an increase in right hemisphere activation in DO relative to MO.

To quantify the increase in right hemisphere involvement and relative disengagement of the left hemisphere, the %SC values within a symmetrical left-right set of ROIs were submitted to an ANOVA crossing the factors hemispheric lateralization (right, left) and condition (MS, MO, DO). As illustrated in **Figure 10**, results showed a main effect of lateralization (F(1,23) = 55.63, p < 0.001) and a significant lateralization-by-condition interaction (F(2,46) = 18.63, p < 0.001), indicating that condition affected hemispheric lateralization. Further tests showed that %SC values in MS and DO were significantly different, both for the right (F(1,23) = 17, p < 0.001) and the left (F(1,23) = 5.08, p = 0.03) hemispheres, with more left lateralization for MS than DO and more right lateralization for DO than MS. The difference between MS and MO was not statistically significant for either the right (F(1,23) = 0.12, p = 0.73) or the left (F(1,23) = 3.73, p = 0.06) hemisphere.
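The relation between the F statistics and the p-values reported for the pairwise tests can be illustrated with a short sketch. With a single numerator degree of freedom, an F(1, df) test is equivalent to a two-tailed t-test with t equal to the square root of F. The code below integrates the Student t density numerically (Simpson's rule) and uses only the standard library. It is a didactic approximation, not the procedure used by the authors, and it does not cover the F(2,46) interaction test.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def p_from_f(f_value, df_error, n_steps=2000):
    """Approximate p-value for F(1, df_error), via the two-tailed t-test with t = sqrt(F)."""
    t = math.sqrt(f_value)
    h = t / n_steps  # n_steps must be even for Simpson's rule
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    area = total * h / 3            # probability mass between 0 and t
    return max(0.0, 1 - 2 * area)   # mass in both tails beyond |t|

# Pairwise tests from the text, all with 1 and 23 degrees of freedom
p_right_ms_do = p_from_f(17.0, 23)
p_left_ms_do = p_from_f(5.08, 23)
p_right_ms_mo = p_from_f(0.12, 23)
```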

### **Intentionality dimension**

The switch from intentional to unintentional inner speech was examined through the MS vs. VMW contrasts (**Figures 9E,F**), since the VMW condition, according to participants, contained verbal episodes. In MS > VMW, greater activation was observed in left SMA, primary motor, IFG (BA 44, 45, 47), insula, MTG/STG (BA 21, 22), SMG, ACC, putamen, caudate, and bilateral PM. In VMW > MS, greater activation was observed in right inferior parietal (BA 7, 40), precuneus, IFG (BA 47), SFG (BA 9, 10), MFG (BA 10), insula, ACC, thalamus, and left SFG (BA 6). Some of these activations might reflect the involvement of the Default Mode Network (DMN, Buckner et al., 2008). In order to further describe the specificity of the VMW condition relative to the DMN, the participants were split into two groups (High-verbal and Low-verbal) based on their amount of reported verbal episodes during the VMW condition (above and below the median, respectively). A two-sample t-test was used to compare the two groups on this condition. Compared to Low-verbal, High-verbal participants did not show any additional activation. However, the opposite contrast showed that the Low-verbal participants showed more activation of the dorsomedial prefrontal cortex than the High-verbal participants (p < 0.05, FWE corrected), as detailed in **Table 4**.
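The median split used to form the High-verbal and Low-verbal groups can be sketched as follows. The participant labels and episode counts are hypothetical, and the handling of values equal to the median (assigned here to the Low-verbal group) is an assumption, since the text does not say how ties were treated.

```python
from statistics import median

def median_split(episode_counts):
    """Split participants into High-verbal and Low-verbal groups around the median."""
    cutoff = median(episode_counts.values())
    high = sorted(p for p, n in episode_counts.items() if n > cutoff)
    low = sorted(p for p, n in episode_counts.items() if n <= cutoff)  # ties go to Low (assumption)
    return cutoff, high, low

# Hypothetical numbers of reported verbal episodes per participant
counts = {"p01": 4, "p02": 9, "p03": 13, "p04": 15, "p05": 18, "p06": 22}
cutoff, high_verbal, low_verbal = median_split(counts)
```

The two resulting groups would then be compared on the VMW contrast with a two-sample t-test, as described above.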

# DISCUSSION

Our fMRI protocol allowed us to investigate varieties of inner speech along dialogality and intentionality dimensions, with the aim of examining the validity of the neuroanatomical correlates posited in the ConDialInt model. To explore dialogality, three controlled inner speech conditions were elicited. This allowed us to compare monologal inner speech with own and other voices, probing for prosodic and voice aspects of dialog. The comparison between monologal and dialogal inner speech (both produced with other voice) allowed us to reveal aspects specifically associated with perspective shifting. To explore intentionality, willful inner speech was compared with mind wandering, during which verbal activity was reported.

# Intentional Monologal Expanded Inner Speech: The Inner Voice as an Efference Copy Prediction

Occipital activation in all conditions can be related to the visual processing required at the beginning of each trial when the pictures are presented. The pattern of activation observed in the SP condition (compared with the baseline or in conjunction with inner speech conditions) was consistent with previous studies on auditory sentence perception and argues in favor of speech perception theories that include a premotor component (see e.g., Friederici, 2011 for a review).

The contrast between MS and baseline (as well as the conjunction between MS and SP) indicates that intentional monologal own voice inner speech was associated with left hemisphere activation in regions compatible with the predictive control scheme assumed in the ConDialInt model. The contrast between MS and baseline reveals prefrontal cortex activation, in MFG and SFG, regions which have been associated with cognitive control (Ridderinkhof et al., 2004). It has been suggested that the orbitofrontal cortex plays an inhibitory role during motor imagery (Jeannerod, 2001). The recruitment of the orbitofrontal cortex could therefore indicate that inhibitory processes are engaged, to prevent overt production. More detailed effective connectivity or sEEG data are needed, however, to assess whether this orbitofrontal cortex activation does reflect inhibitory influence on areas involved at the various stages of language production. An alternative account, which does not appeal to inhibitory processes, could be that the highest processing levels are too weakly activated for the last stage (motor execution) to be launched. The contrast between MS and baseline also shows activation in the hippocampus and posterior MTG, which were presumably related to conceptualization. The recruitment of IFG can be associated with formulation and articulatory planning, whereas SMG activation can be related to phonetic goal integration. The activation of the right cerebellum is consistent with the recruitment of controller/predictor models. We can speculate that the phonetic goal issuing from the SMG was sent to a controller in the right cerebellum, which converted it into a motor specification. This motor specification was then coordinated with ongoing motor actions via the recruitment of left IFG, bilateral SMA and PM cortex, resulting in motor commands. An efference copy of these commands could then have been sent to a predictor model in the cerebellum. We have argued above for the role of the cerebellum in both motor command preparation (controller) and sensory experience prediction (predictor), with perhaps a distinction between anterior and posterior lobes. Our data do not allow us to assess whether this distinctive pattern of activation occurred, however, given that the field of MR acquisition provided full coverage of the cerebrum but did not cover the entire cerebellum. The observed cluster of activation crossing posterior STG and MTG suggests that auditory percepts were experienced. The recruitment of the right cerebellum together with the auditory activation is compatible with the hypothesis made in the ConDialInt model that the cerebellar predictor model issues predicted sensory signals processed by the auditory cortex. More refined connectivity analyses or neuroimaging data with better temporal resolution could further test this hypothesis.

The ConDialInt model posits an attenuation mechanism for self-generated auditory experience relative to externally generated sounds. Our data are consistent with this hypothesis, since less STG/MTG activation was observed during MS than SP. In their study of elicited vs. spontaneous inner speaking, Hurlburt et al. (2016) even found a deactivation of Heschl's gyrus during elicited inner speech compared with the baseline (not only compared with speech perception). They used a region of interest (ROI) analysis centered on Heschl's gyrus, however, and do not report whole-brain analysis results. Agnew et al. (2013) have observed an anterior-posterior division of activity profiles within the STG, where anterior fields are suppressed during (aloud or silent) motor output, whereas posterior fields remain engaged. It is possible that there was some STG/MTG activation during intentional inner speech in Hurlburt et al.'s study, but the restricted ROI analysis may have missed it. Therefore, the neural network that was observed in the present study supports the claim that intentional monologal inner speech involves the inhibited production of motor commands, generated in left frontal regions. Efference copies of the commands would be processed by the cerebellar predictor, giving rise to a sensory experience, the inner voice, albeit a weaker one than during actual speech perception. The ConDialInt model conjectures that the predictor should issue both auditory and somatosensory responses, later integrated into a supramodal representation, via the temporo-parietal junction (TPJ). Except in the MS vs. MO (uncorrected) contrast, we could not observe any somatosensory activation during any of the intentional tasks. This could be due to a lack of power, but we cannot conclude that multisensory representations are indeed at play. The fact that we did register SMG activation (with a cluster encompassing the TPJ) is compatible with an integration process after auditory response, however.

#### TABLE 2 | Conjunction analyses.

Multiple peaks in each cluster are presented at p < 0.05 FWE correction. Main clusters are represented in bold font, with their extent size provided. Sub-clusters are represented in regular font.

Intentional monologal inner speech with someone else's voice (MO) or intentional dialogal inner speech with someone else's voice (DO) also resulted in networks of IFG and motor activations consistent with our predictive account. The lack of superior temporal gyrus activation can be attributed to the fact that, during MO and DO, internal models are less accurate than during MS, and presumably generate more precarious auditory predictions. This could explain the lesser auditory cortex activation. This account is supported by the participants' subjective experience of a fainter voice percept in these more cognitively demanding conditions (see also Shergill et al., 2001).

# Dialogality Dimension: Neural Correlates of Producing Another Voice

Along the dialogality dimension, covertly using someone else's voice (MO) vs. one's own voice (MS), in a monolog, resulted in a marginally significant decrease of left hemisphere activation in the ROIs. More specifically, greater left IFG, postcentral and superior parietal activation was observed in MS > MO, whereas greater right IFG and parietal activation was detected in MO > MS (uncorrected contrasts). In addition, the cerebellar activation observed in MS was reduced in MO. The MO condition required a mental shift in fundamental frequency range, and perhaps even in voice quality, as the avatar's voice to be imitated was extremely high-pitched. Some prosodic fluctuations, especially those related to affective, emotional or attitudinal aspects, are considered to involve the right hemisphere, typically the right inferior frontal gyrus (Baum and Pell, 1999; Lœvenbruck et al., 2005; Pichon and Kell, 2013). Thus, in the framework of predictive control, the present results suggest that mentally imitating a high-pitched voice requires modifying the controller/predictor pair, at least at the articulatory planning stage. The self-adapted controller/predictor models that are suspected to involve the right cerebellum in MS are not adequate, and right frontal region recruitment seems to take place instead. Participants reported that the MO task was more difficult than MS. An alternative interpretation could be that increased cognitive load resulted in the recruitment of contralateral homologous regions. The fact that MS resulted in greater left postcentral and superior parietal activation than MO could suggest that the somatosensory representations evoked when inner speaking with self-voice are stronger than when a different voice is used.

When comparing DO relative to MS, our analyses on the set of frontal, temporal and parietal ROIs (**Figure 10**), revealed a significant increase in the recruitment of the right hemisphere (also observed on the temporal ROI alone, **Figure 8**) together with a significant decrease in left hemisphere activation. Crucially, the DO > MS contrast showed activity in right IFG, MTG and SMG. Similar right hemisphere activation was found in Shergill et al.'s (2001) fMRI study, in six participants who were examined during (first, second and third person) auditory verbal imagery. Linden et al. (2011) also found significant right hemisphere activation in fronto-temporal regions during voluntary auditory imagery. These findings also chime with the fMRI data obtained by Sommer et al. (2008). They compared the cerebral activation of patients diagnosed with schizophrenia while they experienced auditory verbal hallucination (AVH) and while they produced normal inner speech. They found that the main difference between the two conditions was lateralization, with a predominant engagement of the right inferior frontal region during AVH. An influential account formulates AVH as inner speech misattributed to an external source due to a dysfunction in efference copy and predictive control mechanisms (Feinberg, 1978; Frith, 1992; Jones and Fernyhough, 2007b; but see Gallagher, 2004). Rapin et al. (2013, 2016) have argued that this account leaves several questions open, however. First, with this rationale, all inner speech should be mistaken as coming from an external agent, yet patient interviews show that this is not the case (Larøi and Woodward, 2007; Aleman and Larøi, 2008). Secondly, this model does not describe how "other" voices are heard, yet patients with schizophrenia often report that they can precisely identify the voice they hear as being clearly that of someone they know and as addressing them in the second person (Hoffman et al., 2011). 
In our view, AVH does not result from a disruption in MS but from MO or rather DO. In the Sommer et al. (2008) study, when patients experienced AVH, right IFG activation occurred, just like when the participants of the present study imagined the avatar addressing them. The lack of agency felt by the patients could be due to a faulty agency attribution mechanism when other-adapted controller/predictor models are used. If controller and predictor, for instance, are not symmetrical or are temporally misaligned, then the prediction could differ from the desired signal. This would make the predicted auditory experience feel alien, leading to a misattribution to an external source. This interpretation is consistent with an fMRI study by Shergill et al. (2000) on eight patients with schizophrenia who had had experiences of AVH but were in remission at the time of study. They found that the activation pattern of patients during inner speech was not different from that of healthy control subjects, but that attenuated activation was evident in posterior cerebellar cortex, hippocampi, and lenticular nuclei bilaterally and the right thalamus, middle and superior temporal cortex, and left nucleus accumbens, during auditory verbal imagery (similar to what we refer to here as DO). This implies that in patients with a history of AVH, auditory verbal imagery (DO), but not monologal self-voice inner speech (MS), is associated with an atypical neural activation pattern. This pattern, when exacerbated in pathological conditions, may contribute to the spurring of AVH.

# Dialogality Dimension: Neural Correlates of Imagining Another Voice Speaking (Third-Perspective Taking)

TABLE 3 | Contrasts between inner speech conditions.

Multiple peaks in each cluster are presented at p < 0.05 FWE correction (except for MS vs. MO, p < 0.001 uncorrected). Main clusters are represented in bold font, with their extent size provided. Sub-clusters are represented in regular font.

To study perspective switching by itself, the contrast between MS and DO is not adequate, because a change in voice (self-voice vs. other-voice) is confounded with a change in perspective (self speaking vs. other speaking). We therefore examined the contrast between MO and DO, since both conditions required the generation of another voice. Relative to MO, DO additionally recruited the right IFG, MFG, SFG, right superior and inferior parietal lobules as well as bilateral precuneus and posterior cingulate cortex. The recruitment of right frontal regions seems therefore even more important in DO than in MO. As argued above, right frontal activation can be related to prosody control at the articulatory planning stage, and this could mean that suprasegmental control is even more demanding
in DO. It could alternatively suggest that increased cognitive load in DO, relative to MO, resulted in the recruitment of contralateral regions homologous to the regions associated with articulatory planning. The recruitment of right parietal cortex is consistent with several studies on perspective switching and imagination of others' actions. Ruby and Decety (2001) found that imagining someone perform an action (what they refer to as third person perspective) involves the inferior parietal lobule, the precuneus, the posterior cingulate, and the frontopolar cortex. Tian et al. (2016) have examined the neural correlates of articulation imagery and hearing imagery. Articulation imagery consisted in imagining producing a syllable (/ba/ or /ki/) and can be considered as close to our MS condition. Hearing imagery consisted in imagining hearing those same syllables, produced by a (previously introduced) female speaker. The authors did not report any right parietal activation during hearing imagery. But their task was aimed at eliciting memory retrieval of previously heard syllables, and participants were specifically asked to minimize production. Therefore, the discrepancy between their results and our own can be explained by the different nature of the tasks. In their fMRI study of auditory imagery, Linden et al. (2011) did not find any parietal activation either. The participants' task consisted in simply imagining one or several familiar voices speaking to them for a few seconds. Using a region of interest analysis, they observed bilateral activation in the superior temporal sulcus (the voice selective region). In addition, they found bilateral activation in IFG, SMA, ACC and cuneus. The lack of parietal activation could also be explained by the nature of the task, which resembles the hearing imagery task by Tian et al. (2016). Linden et al. 
(2011) state that the most common strategy for participants was to imagine voices of familiar people, such as family conversations or messages left on the phone. Therefore, participants may have been more strongly focusing on memory retrieval rather than actual verbal production with an allocentric perspective. Alderson-Day et al. (2016) used a novel fMRI paradigm in which matched scenarios elicited either monologal (speaking from a single perspective) or dialogal (dialogs between two people) inner speech. The contrast between dialogal and monologal inner speech revealed increased activation in STG bilaterally, left IFG and MFG, left precuneus, and right posterior cingulate. The observed precuneus and posterior cingulate activation converges with our results and those of studies on egocentric and allocentric perspective handling (see e.g., Ruby and Decety, 2001, or Blanke, 2012 for a review) and suggests that these regions are critically involved in perspective switching. Contrary to our own results, however, there was no increase in right IFG and MTG in dialogal inner speech compared with monologal inner speech in their study. The fact that their dialogal condition used several scenarios which involved different voices (a teacher, a job recruiter, a relative, the prime minister) whereas our MO and DO conditions involved one single high-pitched voice, could explain this discrepancy. The auditory experience related to a single caricatural voice may be easier to predict than the many sensations associated with many voices.

# Intentionality: Neural Correlates of Verbal Mind Wandering

TABLE 4 | Contrasts between the two groups of participants (Low verbal > High verbal) in the VMW condition (p < 0.05 FWE correction).

Finally, along the intentionality dimension, when compared with the baseline, VMW displayed greater left hemisphere activation in SMA, together with bilateral IFG, insula, MFG, SMA, medial SFG, inferior and superior parietal cortex, precuneus, and left caudate, thalamus, and cerebellum. The activation of medial SFG, precuneus, posterior inferior parietal regions and lateral temporal cortex is compatible with the default mode network. The addition of the bilateral IFG and insula fits with the verbal quality of this mind wandering period. When the participants were split into Low-verbal vs. High-verbal groups, it was found that, compared with the High-verbal group, the Low-verbal group showed more activation in the dorsomedial prefrontal cortex, classically related to cognitive control (Venkatraman et al., 2009). This could suggest that for unintentional inner speech to occur, cognitive control should be turned down. Further data are required to confirm this result. The contrast between MS and VMW yielded an increase in right hemisphere involvement for VMW relative to MS. Increased activation was observed in left parieto-fronto-temporal regions in MS compared with VMW, whereas VMW yielded greater activation than MS in right parieto-fronto-temporal regions, as well as precuneus, ACC, and thalamus (see also the ROI analysis in temporal regions, **Figure 8**). Since an increase in right hemisphere activation was also observed in DO, this could suggest that the VMW condition may include periods of monologal as well as dialogal inner speech. This is consistent with the post-scan questionnaires: participants reported that they experienced verbal material, and this could be addressed to them or spoken by them. The occipital activation decreased in VMW with respect to MS. This is possibly due to the higher visual stimulation in the latter condition. In
the MS condition, a new picture, with the associated word to define, was presented every 8 s, whereas in the VMW condition, a picture was presented only once, for 2 s, at the beginning of the trial and then the visually neutral rotating clock appeared. The left STG-MTG activation decreased in VMW compared with MS, just as it did for MO and DO, presumably reflecting the fainter auditory percepts in these conditions. Spontaneous inner speech, i.e., inner speaking episodes during a mind wandering session, was examined in Hurlburt et al.'s (2016) study cited above, using an ROI analysis focused on Heschl's gyrus and the left IFG. Contrary to our results, compared with baseline, their spontaneous speech samples yielded increased activation in Heschl's gyrus and no difference was observed in the left IFG. Although our participants were trained to report on spontaneous inner speech, they did not go through the thorough descriptive experience sampling and expositional interview process used in the Hurlburt et al. (2016) study. The five participants in Hurlburt et al.'s (2016) study had been extensively trained and received guidance to distinguish between spontaneous inner speaking (unintentional monologal inner speech) and spontaneous inner hearing (unintentional dialogal inner speech). Their data only concern inner speaking, which was the most frequent of the spontaneous speech forms. The more limited training undergone by the participants in our own study probably reduces the validity of the reports. Yet, the observed left IFG activation during VMW suggests that participants did produce inner speech, at least in a semi-expanded form (LIFG is supposed to be already recruited at the formulation stage). It is somewhat surprising that the left IFG was not recruited in Hurlburt et al.'s (2016) spontaneous inner speaking samples. One explanation for the presence of left IFG in our data and the absence in theirs could lie in the different types of contrasts used. 
Whereas we compared the entire VMW condition with an implicit baseline, Hurlburt et al. (2016) contrasted spontaneous inner-speaking-dominant with spontaneous not-inner-speaking-dominant samples. DES samples rarely contain only one kind of experience; inner speaking may be accompanied by inner seeing or other phenomena (Hurlburt et al., 2013). Inner speaking occurrences were carefully selected using the DES method. Inner-speaking occurrences (20 of all 180 spontaneous samples, across the five participants) only included samples for which three interviewers unanimously rated that inner speaking was the predominant feature of the inner experience. These 20 samples were compared with 85 not-inner-speaking samples that were unanimously rated as not containing inner speaking. As acknowledged by the authors, it cannot be excluded that the absence of significant difference in left IFG activation during these two sets of samples could be due to a lack of power. The other difference between our findings and those of Hurlburt et al. (2016) lies in the pattern of temporal lobe activation. We have found a gradient of left temporal activation, from high STG-MTG involvement during SP to minimal activation during VMW via medium recruitment during MS, whereas Hurlburt et al. (2016) observed a strong activation in Heschl's gyrus during spontaneous inner speech, and a deactivation during intentional inner speech. The fact that we observed weaker left auditory activation during VMW could be explained by the variety of inner speech at play. As mentioned, in Hurlburt et al.'s (2016) study, inner speaking occurrences were unanimously rated by three interviewers as containing inner speaking. Presumably, these instances were expanded forms of inner speech, with full inner production down to the articulatory planning stage and inner voice prediction. 
In our own study, participants reported any verbal material, which may have included full-fledged inner voice as well as less expanded forms. We did not select specific instances, but kept instead the entire VMW session. Some of the verbal forms experienced by our participants may therefore have been more condensed than the inner speaking samples selected in Hurlburt et al.'s (2016) study. Therefore, the reduced left auditory activation observed in the present study could be a result of higher condensation in the spontaneous speech observed (as the subjective reports presented in **Figure 5** suggest). We did observe an increase in right temporal activation during VMW (and DO) relative to MS, however. This could suggest that VMW included dialogal inner speech occurrences, be they semi-condensed or expanded. Alternatively, our finding on the reduction of left temporal activation could be due to a lack of power and an insufficient number of spontaneous inner speech fragments, since verbal episodes were only transient during each VMW trial.

# CONCLUSION

On the basis of recent psycholinguistic and neuroimaging data combined with early introspective descriptions, we have proposed ConDialInt, a comprehensive neurocognitive model of inner speech, aiming to account for typical varieties.

We have presented an fMRI study in which we probed varieties of inner speech along dialogality and intentionality dimensions, with the aim of examining the neuroanatomical assumptions of the ConDialInt model. We designed several carefully controlled tasks specifically fit to compare inner speech along those two dimensions. The condensation dimension was also informally tackled.

Our findings support the predictive control hypothesis that expanded inner speech recruits speech production processes down to articulatory planning, resulting in a predicted signal, the inner voice, with auditory qualities. More specifically, the data are compatible with an account in which a supramodal phonetic goal, instantiated in the inferior parietal lobule, is presumably converted into motor commands that are inhibited by cognitive control signals originating from prefrontal cortex, so that no movement of the speech apparatus occurs. The specification of motor commands is supposed to involve a controller model that may be sustained by the right cerebellum, as well as further coordination processes handled by the left IFG, insula, and premotor cortex. An efference copy of the motor commands may be used by a predictor model supported by the right cerebellum, giving rise to auditory percepts handled in STG and MTG.

Along the dialogality dimension, covertly using an avatar's voice with a high pitch, instead of one's own voice, during monologal other-voice inner speech, recruited right hemisphere homologs of the regions involved in own-voice soliloquy. These right hemisphere regions are presumably associated with pitch control. The lesser cerebellar activation indicates that self-adapted controller/predictor models are inadequate in such a task. Changing perspective, from monologuing to imagining another speaking, was associated with activations in precuneus and parietal lobules, in addition to the pitch-control regions. In line with previous studies on imagination of others' actions or others' speech, we suggest that these regions play a crucial role in first-person and third-person perspective handling.

Finally, along the intentionality dimension, mind wandering with unintentional inner speech episodes was associated with bilateral inferior frontal activation and less activation in left temporal regions than intentional inner speech. This is consistent with the evanescent quality subjectively reported by the participants and presumably reflects condensation processes. Whereas the intentional inner speech tasks all implied speech production down to articulatory planning and generation of an inner voice, the verbal episodes during the mind wandering trials were presumably less expanded. Yet the observation of left IFG activation in this condition does suggest that the initial stages of speech production were launched.

The ConDialInt model includes informed speculations on the neural correlates of the conceptualization, formulation and articulatory planning stages of inner speech. Although our data are consistent with these propositions, further studies are needed to test the model more thoroughly and to refine the descriptions. Several questions are still open. Most notably, we have hypothesized that the phonetic goal, generated from conceptualization and formulation, is in a supramodal format that integrates somatosensory and auditory representations. We argue that this phonetic goal is formed within the IPL, before it is sent to the cerebellar controller and later to prefrontal and premotor regions. This is speculative and more refined neuroimaging or electrocorticography (ECoG) studies, with more precise temporal and spatial resolution, should help better describe the temporal sequence of cerebral activations between IPL, cerebellum and IFG-PM cortex. We have also assumed that both controller and predictor models are sustained by the cerebellum, based on recent findings on the double representation of the cerebral regions in the anterior and posterior lobes of the cerebellum. But the present fMRI data do not cover enough of the cerebellum to assess whether different parts of the cerebellum were involved. Furthermore, they do not allow us to test whether the assumed cortico-cerebello-cortical sequence of activation is appropriate. Our model conjectures that multisensory responses are the predicted outputs of internal predictors. Yet we mainly registered an auditory response and little somatosensory activity. Further studies are necessary to assess whether somatosensory activation can be detected. We also speculated that the auditory and somatosensory responses are integrated (via the TPJ) to form a supramodal response, comparable to the initial phonetic goal. This too needs to be better tested, by examining inferior parietal cortex activity in more detail. 
Furthermore, we have conjectured that the prefrontal activation observed is associated with inhibitory control (suppressing the motor output), as well as with executive control, related to monitoring one's inner speech in intentional instances, and to holding different perspectives in dialogal varieties. Further studies should help disentangle these different types of control. Moreover, we have speculated that the lack of left auditory cortex responses in the mind wandering condition was due to our participants producing more condensed varieties of inner speech during these trials. Unintentional inner speech is often reported as faint and evanescent, as if its auditory quality were dimmer or even absent. Given that another study did find a strong auditory response during spontaneous speech, further phenomenological and neuroimaging studies are needed to better describe the degree of expansion during unintentional inner speech. Whether or not expanded varieties of inner speech mostly arise during intentional inner speech remains an open question.

# DATA AVAILABILITY

The datasets generated for this study are available on request to the corresponding author.

# ETHICS STATEMENT

Each participant gave informed written consent and received 30€ for their participation. The study was approved by the local ethics committee (38RC14.304/ID-RCB: 2014-A01403-44).

# AUTHOR CONTRIBUTIONS

All authors contributed to the conception and design of the study, discussion of the results, revision of the manuscript, and read and approved the submitted version. LR, RG, and CP collected the fMRI data. RG, CP, HL, MP-B, MB, and EC designed fMRI data analysis methods. RG, CH, CP, and EC performed the data analysis. HL wrote the first draft and revised version of the manuscript. RG wrote sections of the manuscript.

# FUNDING

This research was supported by the ANR project INNERSPEECH (Grant Number ANR-13-BSH2-0003-01; http://lpnc.univ-grenoble-alpes.fr/InnerSpeech). The IRMaGe MRI/Neurophysiology facility was partly funded by the French program "Investissement d'Avenir" run by the "Agence Nationale pour la Recherche" (Grant "Infrastructure d'avenir en Biologie Santé" – ANR-11-INBS-0006).

# ACKNOWLEDGMENTS

We thank all participants. We are grateful to Luciano Fadiga, Yanica Klein, Laurent Lamalle, Irène Troprès, Anne Vilain, and Todd Woodward for helpful advice and suggestions. We thank Flora Gautheron and Alexandra Steinhilber for their contribution in the analyses of the subjective questionnaires and verbal mind wandering reports. We thank the two reviewers for constructive comments on a previous version of this manuscript.

# REFERENCES



Dennett, D. (1991). Consciousness Explained. New York, NY: Little Brown & Co.



and to hear, see, feel," in Inner Speech: New Voices, eds P. Langland-Hassan, and A. Vicente (Oxford: Oxford University Press), 131–167.


schizophrenia. J. Speech Lang. Hear. Res. 56, S1882–S1893. doi: 10.1044/1092-4388(2013/12-0210)


Sokolov, A. N. (1972). Inner Speech and Thought. New York, NY: Plenum Press.

Sommer, I. E. C., Diederen, K. M. J., Blom, J. D., Willems, A., Kushan, L., Slotema, K., et al. (2008). Auditory verbal hallucinations predominantly activate the right inferior frontal area. Brain 131, 3169–3177. doi: 10.1093/brain/awn251


Taine, H. (1870). De L'intelligence, 2 Vols. Paris: Hachette.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Grandchamp, Rapin, Perrone-Bertolotti, Pichat, Haldin, Cousin, Lachaux, Dohen, Perrier, Garnier, Baciu and Lœvenbruck. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Types of Inner Dialogues and Functions of Self-Talk: Comparisons and Implications

#### Piotr K. Oleś<sup>1</sup>\*, Thomas M. Brinthaupt<sup>2</sup>, Rachel Dier<sup>2</sup> and Dominika Polak<sup>1</sup>

<sup>1</sup> Institute of Psychology, The John Paul II Catholic University of Lublin, Lublin, Poland, <sup>2</sup> Department of Psychology, Middle Tennessee State University, Murfreesboro, TN, United States

Intrapersonal communication occurs in several modes including inner dialogue and self-talk. The Dialogical Self Theory (Hermans, 1996) postulates a polyphonic self that is comprised of a multiplicity of inner voices. Internal dialogical activity implies an exchange of thoughts or ideas between at least two so-called "I-positions" representing specific points of view. Among the functions served by self-talk are self-criticism, self-reinforcement, self-management, and social assessment (Brinthaupt et al., 2009). This paper explores the relationships among different types of internal dialogues and self-talk functions. Participants included college students from Poland (n = 181) and the United States (n = 119) who completed two multidimensional measures of inner dialogue and self-talk. Results indicated moderately strong relationships between inner dialogue types and self-talk functions, suggesting that there is a significant overlap between the two modes of communication. We discuss several implications of these findings for exploring similarities and differences among varieties of intrapersonal communication.

#### Edited by:

Stefan Berti, Johannes Gutenberg University Mainz, Germany

#### Reviewed by:

Peter Rober, KU Leuven, Belgium Aleksandra Kaurin, University of Pittsburgh, United States

> \*Correspondence: Piotr K. Oleś oles@kul.pl

#### Specialty section:

This article was submitted to Cognitive Science, a section of the journal Frontiers in Psychology

Received: 19 July 2019 Accepted: 31 January 2020 Published: 06 March 2020

#### Citation:

Oleś PK, Brinthaupt TM, Dier R and Polak D (2020) Types of Inner Dialogues and Functions of Self-Talk: Comparisons and Implications. Front. Psychol. 11:227. doi: 10.3389/fpsyg.2020.00227

Keywords: inner dialogue, intrapersonal communication, self-talk, inner speech, identity, self-regulation

# INTRODUCTION

Intrapersonal communication occurs in several modes and includes research on a wide range of processes and behavioral domains (see this Research Topic). Two such modes are self-talk and internal dialogue. With respect to self-talk, psychologists originally described inner and private speech in the context of developmental processes including the affinity between speaking and thinking (Vygotsky, 1962). Although inner dialogues had long been recognized by philosophers such as Thomas Aquinas and Saint Augustine, and by writers, poets, and other thinkers, formal psychological theorizing about such phenomena was only recently introduced at the end of the 20th and beginning of the 21st century (Hermans and Kempen, 1993; Markova, 2005).

The possible relationship and mixing of these two phenomena occurs within theory and empirical research. For example, according to Kross et al. (2014), "Self-talk is a ubiquitous human phenomenon. We all have an internal monologue that we engage in from time to time" (p. 321). How people engage in internal monologues (or dialogues) and self-talk is likely to vary. For example, people might instruct themselves to "Try again" or relax themselves by saying "Don't worry." In a different context, one might ask oneself "What can I do?" or "Are my talents and knowledge enough to argue in a coming debate?"

These examples of self-talk can also involve dialogic features. From the perspective of Dialogical Self Theory (Hermans, 1996; Hermans and Gieser, 2012), people can take at least two points of view or "I-positions" within their intrapersonal communication. We might discuss in our minds multiple options, like a fiddler on a roof: "on the one hand . . ., but on the other hand . . ." Such dialogues can show even greater complexity and detail. For example, a man might imagine how a request for a divorce will affect his spouse, how she would likely respond to that request, whether he should reconsider based on her likely response, etc. This kind of inner dialogue involves posing questions on behalf of the imagined partner and giving answers.

As the previous example suggests, an inner monologue can easily evolve into an internal dialogue between two subjects inside one's mind—between different parts of oneself or between oneself and the imagined partner. In other words, there may be qualitative and quantitative differences in the nature of self-talk and internal dialogues. Self-talk appears to involve basic self-regulatory functions like self-control or self-direction ("Try again"), whereas internal dialogues involve more extended communicative functions ("When I say X, she will answer Y"). In the present study, we aimed to explore the degree of overlap between these two forms of intrapersonal communication.

For our purposes, self-talk can be defined as "self-directed or self-referent speech (either silent or aloud) that serves a variety of self-regulatory and other functions" (Brinthaupt, 2019, para. 7). Internal dialogical activity is defined as "engagement in dialogues with imagined figures, the simulation of social dialogical relationships in one's own thoughts, and the mutual confrontation of the points of view representing different I-positions relevant to personal and/or social identity" (Oleś and Puchalska-Wasyl, 2012, p. 242).

Most definitions of self-talk and inner speech assume that, in this form of intrapersonal communication, both sender and recipient represent the same person (e.g., Fernyhough, 2016). In contrast, inner dialogical activity does not imply that. Inner dialogues refer to various forms of intrapersonal communication where different voices can represent not only the self but also close persons, imagined friends, lost relatives and spouses, teachers and mentors, media stars, voices of culture, and others (Hermans, 1996). Self-talk can be just a single word, comment, or command without any answer or an extended "conversation," while mutual exchange of expressions is an essence of the internal dialogue.

Whereas everyday self-regulation is an important feature of self-talk (Brinthaupt et al., 2009), internal dialogical activity emphasizes confrontation or integration of different points of view as a way to help a person understand new or strange experiences. In other words, self-talk seems to occur in reaction to or anticipation of specific events or circumstances, whereas inner dialogue appears to involve more reflective or contemplative kinds of intrapersonal communication. Furthermore, inner dialogues frequently involve a person's identity (e.g., Bhatia, 2002; Batory, 2010), whereas self-talk seems to apply to identity questions only indirectly.

In this paper, we first describe theoretical and research conceptions of self-talk and inner dialogical activity. We then propose possible relationships between these two forms of intrapersonal communication. Next, we report the results of a study that compares total and subscale scores of these constructs. The nature of the relationship between inner dialogues and self-talk has important implications for the phenomenon of intrapersonal communication. We discuss some of these implications in the conclusion of the paper.

# Self-Talk and Its Different Functions

Most approaches to studying self-talk assume that it encompasses self-referent or self-directed speech. Research examines several variants of the phenomenon, including positive and negative self-statements (Kendall et al., 1989), silent self-talk (i.e., inner speech) (McCarthy-Jones and Fernyhough, 2011), and out loud self-talk (i.e., private speech) (Duncan and Cheyne, 1999). Self-talk research has long been popular in the domains of clinical (e.g., Schwartz and Garamoni, 1989), sport and exercise (e.g., Hardy, 2006), developmental (e.g., Diaz and Berk, 1992), educational (e.g., Deniz, 2009), and personality (e.g., Brinthaupt et al., 2009) psychology.

Extensive research explores how and why people talk to themselves and whether variations in self-talk content result in different effects on the speaker. Among the self-talk functions are general self-regulation (e.g., Mischel et al., 1996; Carver and Scheier, 1998), self-distancing (Kross et al., 2014), providing instruction and motivation (Hatzigeorgiadis et al., 2011), and self-awareness, self-evaluation, self-knowledge, and self-reflection (White et al., 2015; Morin, 2018).

Evidence suggests that self-talk also plays a role in facilitating a variety of cognitive processes (Langland-Hassan and Vicente, 2018) including emotion regulation (Orvell et al., 2019), coping with painful experiences (Kross et al., 2014, 2017), monitoring of language development and speech production (e.g., Pickering and Garrod, 2013), and perspective taking (e.g., Fernyhough, 2009). Recent studies show that non-first-person self-talk can promote self-distancing and adaptive self-reflection (e.g., Kross et al., 2014; White et al., 2015). Referring to oneself in the third person (he/she/they) or by one's name appears to promote coping with stressful experiences and is associated with appraising future stressors as challenges rather than threats (Kross et al., 2014, 2017). This kind of self-talk is also connected to specific forms of brain activity that constitute effortless self-control (Moser et al., 2017) and emotion regulation (Orvell et al., 2019).

A detailed functional view emerged from the development of the Self-Talk Scale (STS) (Brinthaupt et al., 2009), which measures the self-reported frequency of different kinds of self-talk. Relying on an initial pool of items assessing multiple situations where self-talk might occur and the possible common functions served by it, Brinthaupt et al. identified four broad types. The STS includes subscales on self-criticism (i.e., situations when bad things have happened to a person), self-reinforcement (i.e., relating to positive events), self-management (i.e., determining what one needs to do), and social-assessment (i.e., referring to past, present, or future social interactions).

Research on the psychometric properties of the STS supports these four factors as well as other features of the measure (e.g., Brinthaupt et al., 2009, 2015; Brinthaupt and Kang, 2014). Additional research (Morin et al., 2018) suggests that the kinds of self-talk measured by the STS are common occurrences in the everyday experience of this kind of intrapersonal communication. Thus, one way to provide an initial assessment of the relationship between the varieties of self-talk and inner dialogues is to utilize a measure that captures at least some of the possible functions served by self-talk.

# The Dialogical Self and Inner Dialogues

Bakhtin (1973) introduced the notion of the polyphonic novel with his analysis of Fyodor Dostoevsky's literary works. That analysis showed possible splitting of the self into voices that were not exactly coherent, and each of them represented relatively autonomous points of view. According to the Dialogical Self Theory (DST) (Hermans, 1996), human consciousness functions as a similar "society of mind" containing mental representations of numerous voices of culture, family members, close friends, significant others, and other sources. These voices can engage in a variety of communications, including posing questions and answers to, and having agreements and disagreements with, each other (Hermans, 2003).

Assuming a multiplicity of inner voices, internal dialogical activity specifically applies to the exchange of thoughts or ideas between at least two I-positions representing specific points of view (Hermans, 1996). Research shows that inner dialogues play an important role in identity construction (e.g., Bhatia, 2002; Hermans and Dimaggio, 2007; Batory, 2010), differentiating and integrating the self as part of the process of self-organization (e.g., Raggatt, 2012; Valsiner and Cabell, 2012), the simulation of social dialogues (e.g., Puchalska-Wasyl et al., 2008; Puchalska-Wasyl, 2011), and general self-reflection and insight (e.g., Markova, 2005; Hermans and Hermans-Konopka, 2010; Rowan, 2011).

Developments within DST (Hermans and Hermans-Konopka, 2010) and associated research (e.g., Oleś and Hermans, 2005; Hermans and Gieser, 2012; Puchalska-Wasyl, 2016; Puchalska-Wasyl et al., 2018) have led to the identification of several forms and functions of internal dialogical activity. For example, Nir (2012) distinguished contrasting (or confrontational) and integrating dialogues. Contrasting dialogues refer to the clashing of opposing points of view and argumentation until one of them obtains an evident advantage over another. Integrating dialogues tend toward compromising solutions or the integration of opposing points of view into higher levels of abstract meanings. Puchalska-Wasyl (2010) highlighted differences between three forms of dialogical activity: monologue (that implies an interlocutor or audience), dialogue, and changing point of view. This last form refers to the polyphony described by Bakhtin (1973) and Hermans (1996). While dialogue means real exchange of ideas between two or more points of view (I-positions), monologue refers to one-sided communications (whether to oneself or to another person) in which an answer is not expected.

Researchers have recently engaged in efforts to measure individual differences in inner dialogues. For example, the Varieties of Inner Speech Questionnaire (VISQ) (McCarthy-Jones and Fernyhough, 2011; Alderson-Day et al., 2018) measures different phenomenological aspects of inner speech, including a factor on dialogicality (or self-talk occurring as a back-and-forth conversation). Oleś (2009) and Oleś and Puchalska-Wasyl (2012) developed the Internal Dialogical Activity Scale (IDAS), which focuses specifically on the range of different kinds of inner dialogues postulated by DST. Some of the dimensions of this measure include identity, social, supportive, confronting, and ruminative dialogues. The IDAS therefore permits a more thorough examination of DST concepts than the VISQ.

In summary, DST views intrapersonal communication as a complex process of inner dialogues. These dialogues take a wide variety of forms and functions that play important roles in the development of self and identity. However, to date, there has been little research attention devoted to the relationship of these kinds of forms and functions to other kinds of intrapersonal communication. Self-talk appears to be one kind of intrapersonal communication that is similar to inner dialogues.

# Possible Linkages Between Self-Talk and Inner Dialogues

As we noted earlier, the levels of focus are different for the STS and the IDAS. Internal dialogues tend to apply more to a higher level, or meta-features, of intrapersonal communication, compared to the self-regulatory functions assessed by the STS. That is, the STS measures why and when people might talk to themselves, whereas the IDAS primarily assesses the phenomenology of how people talk to themselves.

The potential relationships among self-talk and inner dialogues are theoretically interesting for several reasons. It is conceivable that different kinds of self-talk reflect different I-positions. For example, self-critical self-talk might reveal the presence of confrontational dialogues, whereas self-managing self-talk might be more frequent when people engage in integrative dialogues. Individuals reporting frequent ruminative inner dialogues might also report higher levels of self-critical self-talk.

There are also some likely differences between these two kinds of intrapersonal communication. Self-talk includes a variety of non-dialogical features, such as internal monologues that reflect observations of or commentary on one's experiences that are not interpersonally or socially directed (e.g., Duncan and Cheyne, 1999; Langland-Hassan and Vicente, 2018) or simple auditory rehearsals (e.g., MacKay, 1992) that do not involve more than one I-position. Thus, it is reasonable to expect that some kinds of self-talk may be unrelated to the frequency of inner dialogues.

Fernyhough (2009, 2016) argues that inner speech is fundamentally dialogic and permits people to take perspectives on, understand, and integrate their internal and external worlds. This process includes creating representations of the inner experiences of other people. As such, it is reasonable to predict that some kinds of self-talk will be positively associated with certain types of inner dialogues. For example, social-assessing self-talk is probably similar to dialogues that include an imagined social mirror.

Some research on the frequency of self-talk is relevant to theoretical conceptions of inner dialogues. For example, Brinthaupt and Dove (2012) found that adults who reported having had an imaginary companion in childhood reported more frequent self-talk than those who did not have one. In addition, they found that adults who grew up as only children without siblings reported more frequent self-talk than those growing up with siblings. Such childhood social experiences might play a role in people's levels of comfort with, or awareness of, their self-talk as well as the nature of their inner dialogues. Other contributors to the current Research Topic (e.g., Brinthaupt, 2019; Łysiak, 2019) provide additional insights into possible relationships between internal dialogues and self-talk.

# Aims of the Study


Our research examines two specific modes of intrapersonal communication. In particular, we explore the relationships among functions of self-talk and types of inner dialogues in order to clarify the similarities between these modes of intrapersonal communication. Previous research has extensively studied the self-talk and internal dialogue types and functions measured by the STS and IDAS-R. However, no research, to date, has examined the ways that these self-talk and internal dialogue facets relate to and overlap with each other. Brinthaupt et al. (2009) constructed and validated the Self-Talk Scale in the United States, whereas Oleś (2009) published the Internal Dialogical Activity Scale in Poland. In this study, we decided to compare each of these constructs using both United States and Polish samples. We examine the relationships among these two measures through the use of correlational and factor analytic approaches. We are not introducing new ways to assess intrapersonal communication; nor are we primarily interested in cross-cultural differences.

This study explores relationships among the different functions of self-talk defined by the STS and the types of internal dialogues identified by the IDAS. Our general expectation was that individuals who report frequent levels of internal dialogical activity will also report frequent self-talk. However, the strength of these relationships will depend on the specific types and subscales of both kinds of intrapersonal communication. By exploring these relationships, we hoped to better clarify the theoretical and conceptual similarities between self-talk and inner dialogues.

# MATERIALS AND METHODS

# Participants

Participants were two college student samples. The Polish sample consisted of 181 students (117 women, 64 men), with ages ranging from 18 to 34 (M = 24.94, SD = 4.24), who attended courses leading to a master's degree. We drew the United States sample from the university's General Psychology research pool, which consisted mostly of freshmen and sophomores. This sample consisted of 119 students (66 women, 51 men, two missing), with ages ranging from 18 to 29 (M = 19.18, SD = 1.86). The two samples differed significantly in age, t(297) = 13.92, p < 0.001, but did not differ significantly in their gender proportions, χ²(2) = 3.39, p = 0.18.

# Measures

#### Self-Talk Scale (STS)

Self-Talk Scale (STS) (Brinthaupt et al., 2009). The STS consists of 16 items, representing the four self-talk functions of self-criticism, self-reinforcement, self-management, and social-assessment. Respondents rate the STS items using a five-point frequency scale (1 = never, 5 = very often) and using the common stem "I talk to myself when." Each subscale contains four items. To calculate subscale and total frequency scores, items are summed, with higher scores indicating more frequent self-talk. Research provides good support for the psychometric properties of the STS and the integrity of the four subscales (e.g., Brinthaupt et al., 2009, 2015; Brinthaupt and Kang, 2014).

Self-criticism pertains to self-talk about negative events (e.g., "I should have done something differently" and "I feel ashamed of something I've done"). Self-reinforcement refers to self-talk about positive events (e.g., "I am really happy for myself" and "I want to reinforce myself for doing well"). Self-management assesses self-talk about features of general self-regulation (e.g., "I am mentally exploring a possible course of action" and "I want to remind myself of what I need to do"). Social-assessment applies to self-talk about people's future and past social interactions (e.g., "I try to anticipate what someone will say and how I'll respond to him or her" and "I want to analyze something that someone recently said to me").
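The STS scoring rule (four items per subscale, summed, with all 16 items summed for the total) can be sketched in a few lines of Python. This is only an illustration: the item-to-subscale key below is hypothetical, because the published item order is not reproduced here, and the function name is ours. Only the 16-item, four-subscale structure and the 1 to 5 response range come from the description of the measure.

```python
# Hypothetical item key: which of the 16 items (0-indexed) belongs to which
# subscale. The actual STS item order is not given in this section.
HYPOTHETICAL_STS_KEY = {
    "self_criticism": [0, 1, 2, 3],
    "self_reinforcement": [4, 5, 6, 7],
    "self_management": [8, 9, 10, 11],
    "social_assessment": [12, 13, 14, 15],
}


def score_sts(responses, key=HYPOTHETICAL_STS_KEY):
    """Sum item ratings into four subscale scores and one total score."""
    if len(responses) != 16:
        raise ValueError("expected 16 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be a rating from 1 to 5")
    # Each subscale score is the sum of its four item ratings.
    scores = {name: sum(responses[i] for i in items) for name, items in key.items()}
    # The total frequency score is the sum of all 16 ratings.
    scores["total"] = sum(responses)
    return scores
```

Under this rule, each subscale score can range from 4 to 20 and the total from 16 to 80, with higher values indicating more frequent self-talk.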

#### Internal Dialogical Activity Scale-R (IDAS-R)

Internal Dialogical Activity Scale-R (IDAS-R). The IDAS-R is a 40-item tool aimed at measuring an overall level of internal dialogical activity as well as eight different kinds of inner dialogues. The original version of the Questionnaire (IDAS) consisted of 47 items and contained seven subscales (Oleś, 2009; Oleś and Puchalska-Wasyl, 2012). Respondents rate the applicability of each item using a five-point scale. In the current revision of the scale, we changed the response format from the original intensity of agreement (1 = I strongly disagree, 5 = I strongly agree) to a frequency scale (1 = never, 2 = seldom, 3 = sometimes, 4 = often, 5 = very often). Additional revisions included (1) splitting two complex sentences into simple items containing clear meanings, (2) adding four items, (3) reformulating the wording of several items due to the new response format, and (4) deleting one item as irrelevant.

To test the structure and psychometric properties of the IDAS-R, we collected data from 654 Polish participants (449 women, 205 men) ranging in age from 16 to 80 years (M = 31.83, SD = 10.93). All participants provided informed consent prior to completing the measure. For the exploratory factor analysis, we used the least squares method for the extraction of factors, with Oblimin rotation and Kaiser normalization. The results provided nine extracted factors, which explained 63% of the variance. However, one of these factors contained low loadings, so we settled on eight factors for the final version explaining 61% of the variance. Each factor consists of five items, resulting in the final 40-item version. We describe the factor scales, their associated internal consistency values, and sample items below.

Identity Dialogues refer to questions and answers concerning identity, values, and life priorities (e.g., "Thanks to dialogues with myself, I can answer the question, 'Who am I?'" and "Through internal discussions I come to certain truths about my life and myself"). Such dialogues pertain to searching for authenticity and may precede important life choices.

Maladaptive Dialogues are internal dialogues treated as undesirable, unpleasant, or annoying (e.g., "I would prefer not to carry on internal conversations" and "The conversations in my mind upset me"). The content and occurrence of such dialogues imply task disturbances or avoidance behavior.

Social Dialogues are inner dialogues that reflect future and past conversations (e.g., "When preparing for a conversation with someone, I practice the conversation in my thoughts" and "I continue past conversations with other people in my mind"). These items capture the frequency of continuation of talk with others, preparation for conversation, finishing discussions, or creating alternative conversational scenarios.

Supportive Dialogues include intrapersonal communications with persons who have given support and whose closeness is valued (e.g., "When I cannot speak with someone in person, I carry on a conversation with him/her in my mind" and "I carry on discussions in my mind with the important people in my life."). Such dialogues might provide bolstering of social bonds and help to overcome loneliness by giving support to, and strengthening, the self.

Spontaneous Dialogues are inner conversations that occur spontaneously in everyday life (e.g., "I converse with myself" and "I talk to myself"). Such dialogues refer to the consideration of different thoughts or opinions as well as a dialogical form of self-consciousness.

Ruminative Dialogues consist of dialogues involving self-blame, mulling over failures, and recalling of sad or annoying thoughts or memories (e.g., "After failures, I blame myself in my thoughts" and "I have conversations in my mind which confuse me"). These items capture general rumination tendencies within one's internal dialogues.

Confronting Dialogues are internal dialogues conducted between two sides of the self, such as the "good me" and "bad me" (e.g., "I feel that I am two different people, who argue with each other, each wanting something different" and "I argue with that part of myself that I do not like"). Such internal disputes imply a sense of incoherence, polarization, or even fragmentation of the self.

Change of Perspective refers to changes in point of view in service of understanding challenging situations or searching for solutions (e.g., "When I have a difficult choice, I talk the decision over with myself from different points of view" and "In my thoughts I take the perspective of someone else"). Such dialogues might involve taking a fruitful or conflicted perspective of another person.

For each of these subscales, summing the five items creates a total score, with higher scores indicating greater frequency of that kind of dialogue. It is also possible to compute an overall inner dialogue score by summing the ratings of all 40 items. In the current study, this total score, called Internal Dialogical Activity, reflects a person's general frequency of engagement in internal dialogues.

# Procedure

We created two parallel Polish and English language versions of the measures. For the STS, one of the research team members who speaks both Polish and English first translated the scale into Polish. A different colleague then back translated the Polish STS version to English. A native English speaking team member reviewed this version and indicated any areas of clarification, confusion, and discrepancy. We then created the final Polish version of the STS. For the IDAS-R, a team member translated the original (Polish) version of the measure into English. A native English-speaking team member then reviewed this version for clarity. A team member then back translated this version into Polish and identified any discrepancies or areas of confusion. We then implemented necessary corrections to create the final English version of IDAS-R.

The study received approval from the Institutional Review Board (IRB), Middle Tennessee State University, United States. Participants provided their written informed consent when the institution required it. They completed the main measures in counterbalanced order individually or in small groups of 5–10 people. Demographic items appeared at the end of the survey.

# RESULTS

Descriptive statistics for both samples appear in **Table 1**. As the table shows, the alpha coefficients for the STS and IDAS-R were similar across the United States and Polish samples, with comparable and acceptable values. Both samples also showed similar patterns in the relative frequency of the four types of self-talk, with self-managing self-talk most common and self-reinforcing self-talk least common. Among the IDAS-R facets, both samples reported relatively low levels of maladaptive and confronting dialogues and relatively high levels of social and spontaneous dialogues.

Comparison of the two samples revealed that the United States students reported significantly higher scores than their Polish peers on the total STS [t(297) = 7.09, p < 0.001, g = 0.84] as well as the social-assessment [t(297) = 5.71, p < 0.001, g = 0.67], self-reinforcement [t(297) = 4.06, p < 0.001, g = 0.48], self-criticism [t(297) = 6.49, p < 0.001, g = 0.77], and self-management [t(297) = 5.40, p < 0.001, g = 0.64] STS subscales. A similar pattern emerged for overall IDAS-R and five of its eight subscales. In particular, United States students reported higher scores than the Polish students on the total IDAS-R [t(297) = 3.33, p < 0.001, g = 0.39], as well as the identity [t(297) = 1.92, p < 0.05, g = 0.23], spontaneous [t(298) = 3.84, p < 0.001, g = 0.45], ruminative [t(298) = 3.40, p < 0.001, g = 0.40], confronting [t(298) = 3.06, p < 0.002, g = 0.36], and change of perspective [t(298) = 6.61, p < 0.001, g = 0.78] dialogues.
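The effect sizes reported with these t tests can be roughly cross-checked from the t statistics and sample sizes. The sketch below uses a standard conversion from an independent-samples t to Cohen's d, followed by the usual small-sample correction that yields Hedges' g. It assumes a pooled-variance t test and the nominal sample sizes (119 and 181); the paper does not state how g was computed, so this is a plausibility check and not a reproduction of the authors' procedure. The function name is illustrative.

```python
import math


def hedges_g_from_t(t, n1, n2):
    """Approximate Hedges' g from an independent-samples t statistic."""
    # Cohen's d recovered from t under a pooled-variance t test.
    d = t * math.sqrt(1.0 / n1 + 1.0 / n2)
    # Common approximation to the small-sample bias correction factor.
    df = n1 + n2 - 2
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return d * correction


# Total STS comparison: t = 7.09 with 119 United States and 181 Polish students.
g_total_sts = hedges_g_from_t(7.09, 119, 181)
```

For the total STS comparison this gives a value of about 0.83, close to the reported g = 0.84; small differences are expected from rounding of t and from missing data in individual analyses.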

TABLE 1 | Descriptive statistics for the Self-Talk Scale and the Internal Dialogical Activity Scale-Revised for United States, Polish, and combined samples.

TABLE 2 | Correlations between the STS and IDAS-R: results from Polish sample above the diagonal and for United States sample below the diagonal.

United States sample: n = 119; Polish sample: n = 181; \*p < 0.001.

**Table 2** reports the correlations among the STS and IDAS-R measures for each sample and indicates those correlations that reached the 0.001 level of significance. The correspondence between these two kinds of intrapersonal communication turned out to be consistently positive, with most correlations in the moderate to strong range. For the Polish sample, 36 of the 44 correlations between the STS and IDAS-R total and subscale scores were significant. For the United States sample, 35 of 44 of these correlations were significant. In the Polish sample, significant correlations ranged between 0.24 and 0.59; in the United States sample, significant relationships ranged between 0.29 and 0.62. Moreover, the patterns of relationships in both samples were similar. Total STS and IDAS-R scores correlated 0.56 in the Polish sample and 0.62 in the United States sample.
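One conventional way to read the total-score correlations is as shared variance, that is, the square of r. The short sketch below applies that convention to the two total-score correlations. It is a generic interpretive aid, not an analysis step described by the authors, and the function name is illustrative.

```python
def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r * r


polish_overlap = shared_variance(0.56)         # about 0.31
united_states_overlap = shared_variance(0.62)  # about 0.38
```

Read this way, the STS and IDAS-R total scores share roughly a third of their variance in each sample.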

On the one hand, these results show moderate, positive relationships between several self-talk functions and types of internal dialogues. On the other hand, there is evidence of possible independence of these kinds of intrapersonal communication. For our next set of analyses, we sought to determine the extent of independence of STS and IDAS-R subscales. We used both canonical correlational and exploratory factor analysis with the combined samples to address this question.

To answer the question of overlap between the two measures of intrapersonal communication, we first used canonical correlational analysis, which permitted us to explore mutual relationships between STS and IDAS-R subscales in a more complex and advanced way. This analysis allows us to find features that are important for explaining the covariation between the subscales of the STS and IDAS-R. We conducted the analysis on the combined samples with each participant represented by their scores on the four STS and the eight IDAS-R subscales. Because of the potential negative effects of outliers on CCA, we first eliminated respondents who scored three standard deviations above or below the mean on the total score of either measure. This resulted in a new sample size of 293 (180 women, ages 18–34). The results of this analysis showed three significant canonical correlations: 0.64, 0.43, and 0.33 (all p < 0.001), explaining, respectively, 41%, 19%, and 11% of the variance (see **Table 3**). The first canonical variable represented over half of the variance from the original set of variables and explained about 25% of the variance from the opposite set of variables.
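The outlier screening step that preceded the canonical correlation analysis can be sketched as follows. The sketch assumes each participant is represented by a pair of total scores (STS total, IDAS-R total) and uses the sample standard deviation; the text does not specify the data layout or which standard deviation was used, so both are assumptions, and the function names are illustrative.

```python
from statistics import mean, stdev


def within_three_sd(values):
    """Return one boolean per value: True if it lies within mean +/- 3 SD."""
    # Sample SD is an assumption; the text does not say which SD was used.
    m, s = mean(values), stdev(values)
    return [abs(v - m) <= 3 * s for v in values]


def screen_outliers(participants):
    """Keep participants whose STS and IDAS-R totals are both within 3 SD.

    `participants` is a list of (sts_total, idas_total) pairs; this layout
    is assumed for the sketch.
    """
    sts_ok = within_three_sd([p[0] for p in participants])
    idas_ok = within_three_sd([p[1] for p in participants])
    # A participant is dropped if either total score is an outlier.
    return [p for p, a, b in zip(participants, sts_ok, idas_ok) if a and b]
```

Applying the cutoffs to both totals before fitting keeps a single extreme respondent from dominating the canonical variates, which is the concern the screening is meant to address.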


Interestingly, all loadings were negative, with lack of self-talk functions (see canonical loadings) corresponding to reduced inner dialogues of all kinds. However, according to the reversed loadings, this variable represented the presence of four self-talk functions, namely, self-management, social assessment, self-criticism, and, to a lesser degree, self-reinforcement, and almost all types of inner dialogues. This variable can be labeled "dialogical self-talk." The second and third canonical variables represented only a small amount of residual variance from the original variables (both 16%) and explained very little of the residual variance (3% and 2%) from the opposite set of variables.

In order to examine similarities of both kinds of intrapersonal communication, we also used exploratory factor analysis, principal components with Varimax rotation, and the Scree test for factor extraction. The 12 subscales (four STS, eight IDAS-R) served as the variables in this analysis. We identified a four-factor solution, according to the Scree test. The four extracted factors explained 79% of the variance (for loadings see **Table 4**).

The factors explained 49.3%, 11.7%, 8.9%, and 7.2% of the variance, respectively. Factor 1 (Internal Dialogicality) represented the different kinds of IDAS-R inner dialogues except for maladaptive and confronting dialogues. This factor explained almost half of the variance in the data, with six of the 12 subscales having relatively high loadings on it. Regarding the content of this factor, the IDAS-R subscales related to contact and union with the self's and others' inner dialogues, representing the adaptive side of inner dialogues. Interestingly, the STS functions did not load strongly on this factor.

Factor 2 (Self-Regulatory Self-Talk) contained three STS subscales/functions: Social Assessment, Self-Management, and Self-Criticism. These subscales seem to represent self-talk aspects that are different from the types of internal dialogues. Factor 3 (Disruptive Dialogicality) contained the maladaptive and confronting IDAS-R subscales. These types of inner dialogues represent a kind of psychic burden caused or accompanied by unpleasant or tension-producing dialogues. Factor 4 (Self-Enhancing Self-Talk) included only the Self-Reinforcement STS subscale.

TABLE 4 | Results of EFA: loadings for four-factor solution.

Summing up, both CCA and EFA showed some overlap between self-talk and inner dialogical activity. However, the results are not strong enough to identify these two modes of intrapersonal communication as variable aspects of the same phenomenon. Instead, they seem to be complementary types of intrapersonal communication that serve different functions.

# DISCUSSION

The purpose of this study was to examine the similarities between two kinds of intrapersonal communication using two recent multidimensional measures of inner dialogue and self-talk. As we expected, there were moderate to strong relationships among the total and subscale scores of the IDAS-R and STS. These results suggest that internal dialogical activity shares a good deal of variance with common self-talk functions. In other words, there is a significant self-talk component to internal dialogues. Although Brinthaupt et al. (2009) developed the STS independently of Dialogical Self Theory, the self-regulatory functions identified by their measure provide some conceptual and theoretical support for that theory.

Both the zero-order correlational data as well as the canonical correlations showed significant relationships between the self-talk functions and the types of inner dialogues. The results generally showed STS and IDAS-R overlap of between 30% and 40%. The common variance of the subscales of STS and IDAS-R, according to the canonical correlation analysis, was about 41%. Such results show that self-talk functions and inner dialogue types are, on the one hand, clearly related variables.

On the other hand, there are elements of each kind of intrapersonal communication mode that are different. For example, the STS functions appear to represent dynamic aspects of intrapersonal communication, involving active processing of current or recent situations and compensation for behavioral challenges and cognitive disruptions (see Brinthaupt, 2019, this Research Topic). Alternatively, different types of inner dialogical activity seem to represent contemplative aspects of intrapersonal communication, such as reflecting about oneself or deliberating about different facets of one's identity. The types of inner dialogues illustrate qualities of awareness of human consciousness: representations of others in one's mind, overcoming of loneliness, keeping bonds with significant others, fighting for autonomy, and controlling of a social mirror (e.g., Puchalska-Wasyl et al., 2008; Rowan, 2011; Stemplewska-Żakowicz et al., 2012; Valsiner and Cabell, 2012).

Research on self-esteem suggests that inner dialogues and self-talk serve possibly different roles. Oleś et al. (2010) found that total and subscale IDAS scores correlated negatively and significantly with self-esteem. However, Brinthaupt et al. (2009) found that self-esteem did not correlate significantly with total and subscale STS scores (except for self-critical self-talk). Both studies measured self-esteem with the same tool, Rosenberg's Self-Esteem Scale, but collected data from different populations/countries (Poland and the United States).

In the present study, there was evidence for more frequent intrapersonal communication activity in the United States sample, especially with respect to the self-talk functions. It is not clear whether these results reflect cultural or age differences between the two samples. The American students were a few years younger than the Polish participants. It is conceivable that younger people might engage in more intrapersonal communication (both IDAS-R and STS) than older people. If younger adults experience the uncertainty of adult life (Hermans and Hermans-Konopka, 2010) and engage more frequently in identity construction processes during late adolescence and emerging adulthood (Arnett, 2000; Hermans and Dimaggio, 2007), then we would expect increases in reports of inner dialogues and self-talk.

Cultural differences between the United States and Polish samples might also account for the differences in reported frequency of self-talk and inner dialogues. Research shows that higher identity integration is associated with less frequent internal dialogical activity measured by the IDAS (Oleś, 2011) and that higher self-concept clarity is associated with less frequent internal dialogical activity (Oleś et al., 2010). If the two samples differed in their identity or self-concept clarity (something that could be associated with the age differences), this might account for the frequency differences we observed on the STS and IDAS-R. Thus, exploring age and cultural differences in intrapersonal communication appears to be a fruitful avenue for future research.

# Limitations and Implications for Future Research

We operationalized aspects of intrapersonal communication using two self-report measures. As such, this study's data refer mainly to aspects of internal dialogue and self-talk that respondents are consciously aware of or can access upon reflection. As others (e.g., Beck, 1976) have noted, not all intrapersonal communication is conscious, and the present measures are limited to those situations and experiences that respondents are able to recall or infer based on other information. In addition, the list of functions and types of self-talk and internal dialogues tapped by the STS and IDAS-R is not exhaustive. For example, the STS does not measure the frequency of self-distancing and adaptive coping that have been shown to be implicit functions of third-person self-talk (Kross et al., 2014) or the generic "you" that is used for general meaning making to help "people 'normalize' negative experiences by extending them beyond the self" (Orvell et al., 2017, p. 1299). There may be additional cognitive, motivational, or emotional functions not tapped by the STS and IDAS-R (e.g., Alderson-Day et al., 2018; Latinjak et al., 2019).

We believe that methodological artifacts are unlikely to explain the results. The factor analysis loadings do not reflect solely positive and negative valenced items from the measures. For example, ruminative dialogues appeared within Factor 1, and self-critical self-talk appeared in Factor 2. The results appear to map more closely to the overall frequency of use of each kind of intrapersonal communication, with the three least frequent facets (maladaptive and confronting dialogues and self-reinforcing self-talk) emerging as separate, minor factors. In addition, both scales used the same response format, which should reduce response artifacts. However, the STS uses a specific instructional prompt ("I talk to myself when..." certain situations occur), whereas with the IDAS-R, participants rate statements related to self- and other-related dialogical thinking situations. Thus, there is a distinction between when one talks to oneself (STS) and how one talks to oneself (IDAS-R). Future research is needed for a careful and systematic examination of the item content and construct indicators of the STS and IDAS-R.

Because the STS and IDAS-R have semantically overlapping item content, it is important to examine the predictive value of each scale with external criteria. Although we have yet to examine external criteria that might address the differentiation of self-talk and inner dialogues, there is some evidence that internal dialogues are more strongly related to self-esteem than is self-talk (Brinthaupt et al., 2009; Oleś et al., 2010), suggesting potential differences in the functions served by these two kinds of intrapersonal communication. Studying the operation of internal dialogues and self-talk in specific self-regulatory contexts (e.g., novel or stressful situations) could provide additional insight into the predictive value and overlap of the measures.

Future research will need to continue examining the structure and properties of the STS and IDAS-R. One possible direction is to examine situation-specific intrapersonal behavior. For example, within specific contexts or situations (e.g., coping with stress, arriving at a decision, or construing personal identity), there may be specific behavioral signatures (Mischel and Shoda, 1995) containing different combinations of internal dialogue or self-talk types. As the contributions to this Research Topic show, there are other kinds of intrapersonal communication beyond internal dialogues and self-talk. Exploring the relationships among the varieties of intrapersonal communication would also be a worthy goal for future research.

We have shown that the relationship between inner dialogue and self-talk is interesting and complex and that the study of this relationship is a theoretically valuable research goal. There are several additional modes, categories, and functions served by, or relevant to, intrapersonal communication (e.g., Heavey and Hurlburt, 2008). Researchers might find it profitable to utilize the IDAS-R and the STS to explore further the overlap between, and distinctions among, these phenomena.

# DATA AVAILABILITY STATEMENT

The datasets generated for this study are available on request to the corresponding author.

# ETHICS STATEMENT

The studies involving human participants were reviewed and approved by the Institutional Review Board (IRB), Middle Tennessee State University, United States. This study spanned over two countries. The patients/participants provided their written informed consent to participate in this study when it was required by the national legislation and the institutional requirements.

# AUTHOR CONTRIBUTIONS

PO conceptualized the research, led the empirical research in Poland, and wrote and revised the first version of the manuscript. TB prepared the idea of the research, led the empirical research in the United States, and edited, revised, and corrected the first version of the manuscript. RD conducted the empirical research in the United States and prepared the database. DP conducted the empirical research in Poland, prepared the database, and performed computations.

# FUNDING

Publication of this article was supported by The John Paul II Catholic University of Lublin, Poland.



**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Oleś, Brinthaupt, Dier and Polak. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.