Front. Hum. Neurosci., 21 July 2014
Sec. Speech and Language

Mind what you say—general and specific mechanisms for monitoring in speech production

  • 1School of Psychology, University of Queensland, Brisbane, QLD, Australia
  • 2Department of Experimental Psychology, Ghent University, Ghent, Belgium
  • 3Neurobiology of Language Department, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands


For most people, speech production is relatively effortless and error-free. Yet it has long been recognized that we need some type of control over what we are currently saying and what we plan to say. Precisely how we monitor our internal and external speech has been a topic of research interest for several decades. The predominant approach in psycholinguistics has assumed monitoring of both is accomplished via systems responsible for comprehending others' speech.

This special topic aimed to broaden the field, firstly by examining proposals that speech production might also engage more general systems, such as those involved in action monitoring. A second aim was to examine proposals for a production-specific, internal monitor. Both aims require that we also specify the nature of the representations subject to monitoring.

Domain General Mechanisms

Some of the first evidence that speech production engages a domain general monitoring or attentional selection mechanism came from functional magnetic resonance imaging (fMRI) investigations of semantic interference effects in picture-naming paradigms. Those studies identified differential activity in the anterior cingulate cortex (ACC), noting that similar activity had been observed in fMRI studies of picture-word interference (PWI), Stroop, and manual interference paradigms (e.g., de Zubicaray et al., 2006). Piai et al. (2013) provide the first confirmatory evidence of domain general ACC involvement during performance of Stroop, semantic PWI, and manual Simon tasks. Although PWI might be considered a generalization of Stroop-like interference effects (e.g., MacLeod, 1991), the Simon task elicits an interference effect in manual responding by manipulating the spatial location of target stimuli (e.g., a square or triangle) on congruent and incongruent trials. Thus, any overlap in ACC activity across the three tasks can be interpreted as reflecting a domain general mechanism.
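For readers less familiar with these paradigms, the behavioral interference effect is conventionally summarized as the difference in mean response time between incongruent and congruent trials. The following is a minimal sketch of that calculation, not the analysis used in any of the cited studies; the function name `interference_effect` and the trial values are hypothetical and invented purely for illustration.

```python
from statistics import mean


def interference_effect(trials):
    """Return mean incongruent RT minus mean congruent RT (in ms).

    `trials` is a list of (condition, rt_ms) pairs, where condition is
    either "congruent" or "incongruent". This layout is an assumption
    made for the sketch, not a format taken from the cited studies.
    """
    congruent = [rt for condition, rt in trials if condition == "congruent"]
    incongruent = [rt for condition, rt in trials if condition == "incongruent"]
    if not congruent or not incongruent:
        raise ValueError("need at least one trial per condition")
    return mean(incongruent) - mean(congruent)


# Hypothetical response times (ms), invented for illustration only.
example_trials = [
    ("congruent", 500),
    ("congruent", 520),
    ("incongruent", 560),
    ("incongruent", 580),
]

effect = interference_effect(example_trials)
print(f"Interference effect: {effect} ms")
```

A positive value indicates slower responding on incongruent trials. The same difference score can be computed for Stroop, PWI, and Simon tasks, which is what makes a cross-task comparison possible.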

Electrophysiological studies provide another source of evidence for a domain general fronto-central monitoring mechanism via both stimulus- and response-locked analyses (Riès et al., 2013; Trewartha and Phillips, 2013; Acheson and Hagoort, 2014). Using a phoneme substitution task, Trewartha and Phillips (2013) report an error-related negative potential (ERN) similar to that observed for manual actions. They conclude that speech errors are detected by a general conflict monitoring mechanism supported by the ACC. However, Acheson and Hagoort (2014) tested the domain generality assumption by directly comparing event-related potentials (ERPs) on tongue twister (TT) and Flanker tasks. The TT task required participants to repeatedly and rapidly read sequences of regular non-words. Despite observing similar ERNs, Acheson and Hagoort observed few correlations between behavioral or electrophysiological measures from the two tasks.

Together, the results from the above studies suggest that competition among lexical-level representations might engage a domain general mechanism in the ACC, while conflicting sub-lexical (i.e., phoneme) level representations might engage a different mechanism, either in another subdivision of the ACC or elsewhere in superior medial frontal cortex.

The working memory requirements of production paradigms might also be crucial for the engagement of domain general mechanisms. Riès et al. (2013) address this issue in patients with lesions of the lateral prefrontal cortex, with results indicating that picture naming might not require mechanisms mediated by this region unless working memory load is increased, as in the verbal Simon task they employed. Crowther and Martin (2014) demonstrate that working memory is clearly involved in more complex production paradigms that manipulate both semantic context and item repetition, such as blocked cyclic naming, finding significant correlations with verbal memory span. In addition, they report that the decreasing slope of naming latencies within cycles derives from a process of narrowing down available responses from the set of items within the block as the trials progress. This indicates that participants monitor the item names they produce in order to perform the task. Further, they argue that this reflects a strategic, task-specific process rather than a mechanism involved in word production in more naturalistic settings. Intriguingly, the converging conclusions of Riès et al. and Crowther and Martin regarding the involvement of working memory in production are supported by a recent study by Wirth et al. (2011). The latter authors employed anodal transcranial direct current stimulation (atDCS), an electrical brain stimulation technique that induces more efficient neural processing at the stimulation site. They reported that atDCS applied over the left dorsolateral prefrontal cortex (DLPFC) reduced the semantic interference effect observed in the blocked cyclic naming paradigm.
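The "slope of naming latencies within cycles" can be made concrete as the least-squares slope of naming latency regressed on trial position within a cycle. The sketch below is one plausible way to compute such a slope and is not a reproduction of Crowther and Martin's analysis; the function name `ols_slope` and the latency values are hypothetical and invented for illustration.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("xs and ys must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical data: trial position within one cycle and the mean naming
# latency (ms) at that position. Values are invented for illustration.
positions = [1, 2, 3, 4]
latencies_ms = [700, 680, 660, 640]

slope = ols_slope(positions, latencies_ms)
print(f"Within-cycle slope: {slope} ms per trial position")
```

A negative slope means naming becomes faster as a cycle progresses, which is the pattern the narrowing-down account is intended to explain.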

Common vs. Distinct Mechanisms for Monitoring Inner and Overt Speech

While the dominant approach to speech monitoring has invoked a general role for the comprehension system, more recent approaches have emphasized distinct mechanisms for monitoring of inner speech. Pickering and Garrod (2014), for example, adapt a well-established mechanism of forward models to provide an account of how a production-internal monitor might operate. Riès et al. (2013) provide electrophysiological evidence that could be considered consistent with both approaches: They observed a larger error-related response over the left temporal cortex that started before, and peaked around, vocal onset during picture naming. Conceivably, this result could reflect a process of computing discrepancies between planned and upcoming utterances or the perception of an erroneous pre-articulatory representation.

Gauvin et al. (2013) provide a direct within-participant comparison of perception and production modalities using eye-tracking in a visual-world paradigm. Their findings support the involvement of the speech comprehension system in monitoring the output of one's own speech in addition to that of others. However, they found no evidence of robust changes in eye movements before or around production onset, leading them to conclude that the comprehension system is unlikely to be involved in the monitoring of inner speech. As the authors note, this result might reflect a lack of sensitivity of the paradigm to monitoring of inner speech. Alternatively, it might reflect something about the nature of inner speech. Lind et al. (in press) introduced a novel real-time speech exchange paradigm in which participants in a Stroop task occasionally received manipulated feedback of a single word they had spoken (i.e., they heard the name of the distractor word in their own voice, from an earlier recording). Surprisingly, the participants often considered the manipulated word to be an error in their own production. Based on these data, Lind et al. (2014) propose that inner speech might be relatively under-specified, and that auditory feedback of one's own voice has a more important role in monitoring as it provides a sense of agency in addition to verifying the intended meaning.


The articles in this special topic have broadened the available evidence base by providing novel data from a range of behavioral, lesion, electrophysiological, and neuroimaging investigations. In addition, they have provided new theoretical perspectives on speech monitoring that future research will need to address. Results across the studies suggest that monitoring in language production is likely to involve not just a single mechanism, but multiple systems.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References


Acheson, D. J., and Hagoort, P. (2014). Twisting tongues to test for conflict-monitoring in speech production. Front. Hum. Neurosci. 8:206. doi: 10.3389/fnhum.2014.00206

Crowther, J. E., and Martin, R. C. (2014). Lexical selection in the semantically blocked cyclic naming task: the role of cognitive control and learning. Front. Hum. Neurosci. 8:9. doi: 10.3389/fnhum.2014.00009

de Zubicaray, G., McMahon, K., Eastburn, M., and Pringle, A. (2006). Top-down influences on lexical selection during spoken word production: a 4T fMRI investigation of refractory effects in picture naming. Hum. Brain Mapp. 27, 864–873. doi: 10.1002/hbm.20227

Gauvin, H. S., Hartsuiker, R. J., and Huettig, F. (2013). Speech monitoring and phonologically-mediated eye gaze in language perception and production: a comparison using printed word eye-tracking. Front. Hum. Neurosci. 7:818. doi: 10.3389/fnhum.2013.00818

Lind, A., Hall, L., Breidegard, B., Balkenius, C., and Johansson, P. (2014). Auditory feedback of one's own voice is used for high-level semantic monitoring: the “self-comprehension” hypothesis. Front. Hum. Neurosci. 8:166. doi: 10.3389/fnhum.2014.00166

Lind, A., Hall, L., Breidegard, B., Balkenius, C., and Johansson, P. (in press). Speakers' acceptance of real-time speech exchange indicates that we use auditory feedback to specify the meaning of what we say. Psychol. Sci. doi: 10.1177/0956797614529797

MacLeod, C. M. (1991). Half a century of research on the Stroop effect: an integrative review. Psychol. Bull. 109, 163–203. doi: 10.1037/0033-2909.109.2.163

Piai, V., Roelofs, A., Acheson, D. J., and Takashima, A. (2013). Attention for speaking: domain-general control from the anterior cingulate cortex in spoken word production. Front. Hum. Neurosci. 7:832. doi: 10.3389/fnhum.2013.00832

Pickering, M. J., and Garrod, S. (2014). Self-, other-, and joint monitoring using forward models. Front. Hum. Neurosci. 8:132. doi: 10.3389/fnhum.2014.00132

Riès, S. K., Xie, K., Haaland, K. Y., Dronkers, N. F., and Knight, R. T. (2013). Role of the lateral prefrontal cortex in speech monitoring. Front. Hum. Neurosci. 7:703. doi: 10.3389/fnhum.2013.00703

Trewartha, K. M., and Phillips, N. A. (2013). Detecting self-produced speech errors before and after articulation: an ERP investigation. Front. Hum. Neurosci. 7:763. doi: 10.3389/fnhum.2013.00763

Wirth, M., Rahman, R. A., Künecke, J., König, T., Horn, H., Sommer, W., et al. (2011). Effects of transcranial direct current stimulation (tDCS) on behaviour and electrophysiology of language production. Neuropsychologia 49, 3989–3998. doi: 10.1016/j.neuropsychologia.2011.10.015

Keywords: speech production and perception, lexical access, monitoring, attention, control

Citation: de Zubicaray GI, Hartsuiker RJ and Acheson DJ (2014) Mind what you say—general and specific mechanisms for monitoring in speech production. Front. Hum. Neurosci. 8:514. doi: 10.3389/fnhum.2014.00514

Received: 16 May 2014; Accepted: 26 June 2014;
Published online: 21 July 2014.

Edited and reviewed by: Hauke R. Heekeren, Freie Universität Berlin, Germany

Copyright © 2014 de Zubicaray, Hartsuiker and Acheson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: greig.dezubicaray@uq.edu.au