Attention for speaking: domain-general control from the anterior cingulate cortex in spoken word production

Piai, Vitoria; Roelofs, Ardi; Acheson, Daniel J.; Takashima, Atsuko

doi:10.3389/fnhum.2013.00832

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 09 December 2013

Sec. Speech and Language

Volume 7 - 2013 | https://doi.org/10.3389/fnhum.2013.00832

This article is part of the Research Topic: Mind what you say - general and specific mechanisms for monitoring in speech production

Vitória Piai¹,²*

Ardi Roelofs¹

Daniel J. Acheson¹,³

Atsuko Takashima¹,⁴

¹Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands
²International Max Planck Research School for Language Sciences, Nijmegen, Netherlands
³Neurobiology of Language Department, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
⁴Behavioural Science Institute, Radboud University Nijmegen, Nijmegen, Netherlands

Accumulating evidence suggests that some degree of attentional control is required to regulate and monitor processes underlying speaking. Although progress has been made in delineating the neural substrates of the core language processes involved in speaking, substrates associated with regulatory and monitoring processes have remained relatively underspecified. We report the results of an fMRI study examining the neural substrates related to performance in three attention-demanding tasks varying in the amount of linguistic processing: vocal picture naming while ignoring distractors (picture-word interference, PWI); vocal color naming while ignoring distractors (Stroop); and manual object discrimination while ignoring spatial position (Simon task). All three tasks had congruent and incongruent stimuli, while PWI and Stroop also had neutral stimuli. Analyses focusing on common activation across tasks identified a portion of the dorsal anterior cingulate cortex (ACC) that was active in incongruent trials for all three tasks, suggesting that this region subserves a domain-general attentional control function. In the language tasks, this area showed increased activity for incongruent relative to congruent stimuli, consistent with the involvement of domain-general mechanisms of attentional control in word production. The two language tasks also showed activity in anterior-superior temporal gyrus (STG). Activity increased for neutral PWI stimuli (picture and word did not share the same semantic category) relative to incongruent (categorically related) and congruent stimuli. This finding is consistent with the involvement of language-specific areas in word production, possibly related to retrieval of lexical-semantic information from memory. The current results thus suggest that in addition to engaging language-specific areas for core linguistic processes, speaking also engages the ACC, a region that is likely implementing domain-general attentional control.

Introduction

Accumulating evidence suggests that speakers need to engage attentional control for certain language processes (e.g., Ferreira and Pashler, 2002; Roelofs and Hagoort, 2002; Roelofs, 2003, 2008; Roelofs and Piai, 2011; Piai and Roelofs, 2013). Attentional control refers to the regulatory and monitoring processes that ensure that our actions are in accordance with our goals, especially in the face of distraction (e.g., Posner and Petersen, 1990; Roelofs, 2003). For example, when planning a word or a multi-word utterance, speakers need to prevent interference from concurrent information in the environment, such as speech from an interlocutor or visual input from objects surrounding the referent. The object that one wants to refer to may have more than one name, in which case top-down regulation is needed to resolve the conflict between alternative responses. Attentional control also includes self-monitoring, through which speakers assess whether planning and performance are consistent with intent (e.g., Levelt et al., 1999; Hartsuiker and Kolk, 2001; Roelofs, 2004; Christoffels et al., 2007; van de Ven et al., 2009). For example, Levelt (1989) suggests that “Message construction is controlled processing, and so is monitoring” (p. 21).

The present study was designed to address the extent to which these controlled processes may be language-specific or domain-general. In particular, we used functional magnetic resonance imaging (fMRI) to examine brain activity associated with performance in three tasks varying both in the amount of attentional control and in the amount of linguistic processing needed: vocal picture naming with distractor words (picture-word interference, PWI); vocal color naming with distractor words (Stroop); and object discrimination using manual responding with spatial compatibility (Simon task). All three tasks contained stimuli with two dimensions that were either congruent or conflicting with each other, and required responding to a relevant dimension while ignoring an irrelevant one. Given that such conflict often leads to increases in error rates or to the selection of an inappropriate response, people must constantly monitor and regulate their performance (e.g., Posner and Petersen, 1990; Petersen and Posner, 2012). Thus, these three tasks measure the extent to which attentional control is required to select a target response (e.g., Posner and Petersen, 1990; Roelofs, 2003; Hommel, 2011; Petersen and Posner, 2012), with conflicting stimulus dimensions in the incongruent condition increasing response time (RT) relative to neutral and congruent trials.

Attentional control functions have been extensively studied with the Stroop (1935; see also MacLeod, 1991) and Simon tasks (Simon and Small, 1969; see also Hommel, 2011). In the Stroop task, participants name the ink color of words, with the ink color being either congruent (e.g., red printed in red ink), incongruent (e.g., blue in red ink), or neutral (e.g., dream in red ink) with respect to the written word. In the Simon task, participants are instructed to respond to a color or to the identity of an object with lateralized button presses (e.g., press right for a triangle and left for a square), and spatial congruency is manipulated either by presenting the object in the same (i.e., congruent) or opposite (i.e., incongruent) spatial position relative to the response. To examine attentional control functions in spoken word production, tasks such as Stroop and PWI can be used. In the PWI task (Rosinski, 1977; see for review Glaser, 1992), participants name pictures while trying to ignore superimposed distractor words that are, for example, semantically related (e.g., pictured car with distractor bus), semantically unrelated (e.g., pictured car, distractor table), or identical to the picture name (e.g., pictured car, distractor car). Thus, in addition to providing insight into lexical access, PWI is often seen as an experimental method that allows us to examine monitoring and regulation processes in spoken word production (e.g., Lupker, 1979; Glaser and Düngelhoff, 1984; MacLeod, 1991; Roelofs, 2003; Dhooge and Hartsuiker, 2011). In the remainder of this article, we refer to the semantically related condition as incongruent, the unrelated as neutral, and the identical condition as congruent.

A network of brain areas has commonly been implicated in attentional control functions, as measured with the Stroop and Simon tasks (e.g., Peterson et al., 2002; Fan et al., 2003; Liu et al., 2004). In particular, the effects of conflict in these tasks, i.e., more activity for incongruent relative to congruent stimuli, have been co-localized to the lateral prefrontal cortex (PFC) and the dorsal anterior cingulate cortex (ACC) (Fan et al., 2003; Liu et al., 2004). The dorsal ACC includes Brodmann areas 24 and 32 (Devinsky et al., 1995; Paus, 2001; Ridderinkhof et al., 2004), referred to as “anterior” and “mid” cingulate in the Automated Anatomical Labeling (AAL) template (Tzourio-Mazoyer et al., 2002). The dorsal ACC is part of a frontoparietal network underlying domain-general attentional control (e.g., Duncan, 2010; Barbey et al., 2012; Niendam et al., 2012), both at the task and response level (Aarts et al., 2009). Although the exact function of the dorsal ACC within this network is still debated in the literature (e.g., conflict monitoring, Botvinick et al., 2004; response selection, Awh and Gehring, 1999; top-down regulation of selection processes, Roelofs et al., 2006; Aarts et al., 2008; see also Alexander and Brown, 2011 for a recent proposal encompassing several other accounts), all theoretical frameworks acknowledge that the engagement of the dorsal ACC increases with incongruent relative to congruent or neutral stimuli.

In the past few years, significant progress has been made in delineating the neural substrates of the core language processes underlying speaking through the use of tasks such as picture naming, word generation, and word/pseudoword reading (for overviews see Indefrey and Levelt, 2004; Indefrey, 2011; Price, 2012). Despite this progress, the neural substrates associated with the processes of regulating and monitoring language production have remained relatively underspecified (cf. Indefrey, 2011; for recent advances, see Nozari et al., 2011; Riès et al., 2011), in part because the manipulations and comparisons within these tasks may not have been sensitive to attentional control functions. As concerns vocal utterances, the ACC plays an important role in controlling the initiation and suppression of non-verbal vocalizations in humans, such as laughing and crying (Jürgens, 2002). Because of its connections with the lateral PFC, which is involved in broad aspects of top-down control (e.g., Paus, 2001; Petrides, 2005), it has been argued that the ACC has the appropriate characteristics to mediate the attentional control necessary for producing language (e.g., Roelofs, 2008). Evidence for this proposal comes, for example, from a review of two decades of language production neuroimaging research, indicating a critical role for the dorsal ACC during word selection in the context of non-target words (Price, 2012).

Despite this evidence, some important questions about the role of the dorsal ACC in language production have remained unanswered. In their meta-analysis of neuroimaging studies on word production, Indefrey and Levelt (2004) identified the mid-cingulate (part of the dorsal ACC as more commonly defined) as one of the brain areas that are active in all production tasks examined (i.e., picture naming, word generation, and word/pseudoword reading). This suggests that the dorsal ACC may implement a production-general function (i.e., regulation and monitoring) rather than making a specific contribution to core language production processes (i.e., conceptual preparation, lexical selection, and word-form encoding). However, whether the production-general contribution of the dorsal ACC is also domain-general (i.e., also engaged outside the language domain) could not be assessed in the meta-analysis of Indefrey and Levelt. Moreover, it is still unclear whether regulation and monitoring processes in word production, as measured by the PWI task, involve the dorsal ACC. The first study to report ACC activity in PWI compared categorically related (incongruent) picture-distractor pairs with a control picture-distractor pair (i.e., a string of Xs) (de Zubicaray et al., 2001). Note that the comparison between categorically related picture-word pairs and pictures paired with a string of Xs concerns a contrast between a word and non-word condition rather than between different word conditions (e.g., semantically related and unrelated words). Subsequent studies examining the contrast between categorically related and unrelated picture-word pairs (often referred to as the semantic effect) failed to observe modulations of ACC activity as a function of distractor type (Spalek and Thompson-Schill, 2008; de Zubicaray and McMahon, 2009; de Zubicaray et al., 2013). Importantly, the portion of the ACC that was sensitive to distractor type in the study of de Zubicaray et al. (2001) does not correspond to areas previously associated with domain-general control, but rather to those observed in tasks involving the processing and control over emotion, reward, and pain (see Torta and Cauda, 2011) in the anterior portion of the ACC. Thus, it is unclear whether the system for attentional control in word production, commonly measured with the PWI task, is part of the same domain-general, attentional control system that has been implicated outside of language.

An additional goal of the present study was to determine whether common brain activation associated with lexical-semantic processing in word production can be found for the PWI and Stroop tasks. Although retrieval of words from long-term memory may rely on general processes for retrieving diverse information from memory, the storage of lexical-semantic knowledge has been mainly associated with the left superior and middle temporal cortex (see for overviews Indefrey and Levelt, 2004; Price, 2012). In an extensive lesion-deficit analysis concerning semantic errors in picture naming by individuals with post-stroke aphasia, Schwartz et al. (2009) identified the left anterior temporal cortex as the brain area that is critically involved in mapping concepts onto words in production (i.e., conceptually driven “lemma retrieval”). This anterior temporal area included the mid-temporal region identified by Indefrey and Levelt (2004) as being involved in conceptually driven word retrieval, providing converging evidence for the functional role assigned to this area. PWI studies have consistently revealed sensitivity of the left superior temporal gyrus (STG) and middle temporal gyrus (MTG) activity to experimental manipulations (de Zubicaray et al., 2001, 2002, 2013; de Zubicaray and McMahon, 2009), but in Stroop studies, activity in left temporal cortex is generally absent (e.g., Bench et al., 1993; Banich et al., 2000). Despite these previous results, it seems reasonable to predict that both tasks might activate elements of the temporal cortex as the distracting information is lexical-semantic in nature.

To recapitulate, the present study was designed to clarify the inconclusive evidence for the involvement of a domain-general control mechanism, possibly supported by the dorsal ACC, in language production. We used three tasks that are known to require attentional control, but crucially two of them were language tasks with vocal responding (PWI and Stroop), whereas the third was a spatial congruency task requiring manual responding (Simon). By examining the activity in the dorsal ACC that is common to all three tasks, we aimed to identify a domain-general portion of the cingulate cortex that is active with incongruent (i.e., more difficult) trials. If domain-general control is involved in language production, then such a common dorsal ACC area should be found. Furthermore, we investigated the activity in the left superior and middle temporal cortex, areas shown to be consistently involved in lexical-semantic retrieval in language production (Indefrey and Levelt, 2004; Indefrey, 2011).

Methods

Participants

The experiment was approved by the Ethics Committee for Behavioral Research of the Social Sciences Faculty at Radboud University Nijmegen. Twenty-six young adults (mean age = 21.2 years, range = 18–29) from the participant pool of Radboud University Nijmegen took part in the experiment for monetary compensation or course credits. All participants gave informed written consent to their participation after the nature and possible consequences of the study were explained. Three female participants were excluded from the analyses: one revealed having dyslexia after the data were acquired; for another, a technical failure caused an imprecision in the registration of the time parameters; and the third was discarded for excessive movement in the scanner (>6 mm). The remaining 23 participants (11 male) were right-handed, native speakers of Dutch with normal or corrected-to-normal vision, and no history of neurological or reading deficits.

Materials and Design

Picture-word interference task

Forty pictures were selected from the picture database of the Max Planck Institute for Psycholinguistics, Nijmegen, together with their basic-level names in Dutch. The pictures belonged to ten different semantic categories with four objects pertaining to each category. All pictures were white line drawings on a black background. The pictures subtended between 1° and 1.3° of the participant's visual angle. A list of the materials can be found in the Appendix. Three picture-word conditions were created. In the incongruent (categorically related) condition, each target picture was combined with a distractor word from the same semantic category (i.e., the distractor words were the names of the other category-coordinate pictured objects from our materials). For the neutral (categorically unrelated) condition, the pictures were re-combined with the names of the pictures from the other semantic categories. Finally, in the congruent condition, the distractor words were the Dutch name of the pictures. Thus, all distractor words belonged to the response set and distractor type was varied within participants and within items. Each picture appeared once in each condition, totalling 40 trials per condition. The distractors were presented in font Arial size 30 in white, centered on the picture. The picture-word trials were randomized using Mix (Van Casteren and Davis, 2006), with one unique list per participant. Participants were instructed to name the picture and to ignore the distractor word.
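
The pairing scheme described above can be sketched as follows. This is a minimal illustration, not the authors' stimulus-preparation procedure: the item labels are hypothetical placeholders (the actual materials are listed in the Appendix), and the rotation used to pick a same-category or other-category distractor is an assumption made only to keep the example deterministic.

```python
# Hypothetical placeholders standing in for the 10 categories x 4 pictures.
N_CATEGORIES, N_PER_CATEGORY = 10, 4
ITEMS = [(c, i) for c in range(N_CATEGORIES) for i in range(N_PER_CATEGORY)]


def item_name(item):
    """Return a placeholder label such as 'cat3_item1'."""
    category, index = item
    return f"cat{category}_item{index}"


def build_pwi_trials():
    """Pair every picture with one distractor per condition (within items)."""
    trials = []
    for category, index in ITEMS:
        target = (category, index)
        # Assumed rotation: next item of the same category.
        related = (category, (index + 1) % N_PER_CATEGORY)
        # Assumed rotation: same position in the next category.
        unrelated = ((category + 1) % N_CATEGORIES, index)
        for condition, distractor in (
            ("incongruent", related),
            ("neutral", unrelated),
            ("congruent", target),
        ):
            trials.append(
                {
                    "picture": item_name(target),
                    "distractor": item_name(distractor),
                    "condition": condition,
                    "picture_category": category,
                    "distractor_category": distractor[0],
                }
            )
    return trials


pwi_trials = build_pwi_trials()
```

Because both rotations are one-to-one, every picture name serves as a distractor exactly once per condition, so all distractors belong to the response set, as in the design described above.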

Stroop task

All words were presented in red, green, and blue font. There were three Stroop conditions: congruent, incongruent, and neutral. In the incongruent condition, the color words (red, green, and blue) were displayed in an incongruent ink color (e.g., red was presented in green and in blue, etc.). In the neutral condition, the Dutch words taak (“task”), droom (“dream”), and klant (“client”) appeared 5 times in each ink color. In the congruent condition, each color word appeared in its corresponding ink color. Each color word appeared 15 times in each condition, totalling 45 trials per condition. The Stroop stimuli were presented in the center of the screen in Arial font size 20, subtending between 1° and 1.3° of the participant's visual angle. The color-word trials were randomized using Mix (Van Casteren and Davis, 2006), with one unique list per participant. Participants were instructed to name the ink color of the words.

Simon task

A square and a triangle were used as white line drawings presented on a black background, subtending about 3° of the participants' visual angle. Half of the participants were instructed to press a button with their left index finger in response to squares and another button with their right index finger to triangles. The other half of the participants received the opposite shape-button press mapping. Each shape appeared 33 times to the left of a centred fixation cross and 33 times to the right, yielding 66 congruent- and 66 incongruent-location trials. Note that this task lacked a neutral condition as this is not typically employed within this task. All 132 trials were randomized using Mix (Van Casteren and Davis, 2006), with one unique list per participant. For the Simon task, two button boxes were resting on the participant's body, one near each hand.
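
As a rough illustration of how congruency follows from the shape-to-hand mapping and the stimulus position, the trial list can be sketched as below. The function and field names are hypothetical and the sketch omits randomization, which the authors performed with Mix.

```python
import itertools

REPETITIONS_PER_CELL = 33  # each shape appears 33 times on each side


def simon_trials(square_hand="left"):
    """Build the 132 Simon trials for one of the two counterbalanced mappings.

    A trial is congruent when the stimulus appears on the same side as the
    hand that must respond to its shape.
    """
    triangle_hand = "right" if square_hand == "left" else "left"
    response_hand = {"square": square_hand, "triangle": triangle_hand}
    trials = []
    for shape, side in itertools.product(("square", "triangle"), ("left", "right")):
        congruent = response_hand[shape] == side
        for _ in range(REPETITIONS_PER_CELL):
            trials.append(
                {
                    "shape": shape,
                    "side": side,
                    "response_hand": response_hand[shape],
                    "congruent": congruent,
                }
            )
    return trials
```

Either mapping yields 66 congruent and 66 incongruent trials, which is why counterbalancing the mapping across participants does not change the design counts.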

Procedure and Apparatus

Outside the scanner, participants read the instructions and were familiarized with the pictures and the names to be used in the experiment. Both speed and accuracy were emphasized for all three tasks. Next, participants practiced each task with eight trials (PWI and Stroop) or 14 trials (Simon) in the same order they would perform them in the scanner, i.e., PWI, Stroop, Simon task. For the PWI task, two line drawings (heart and star) were selected as practice items. For the Stroop and Simon tasks, the same items were used for the practice and experimental sessions.

The presentation of stimuli (screen resolution 1024 × 768 × 32, 60 Hz refresh rate) and the recording of responses were controlled by Presentation Software 14.1 (Neurobehavioral Systems, Albany, CA). A noise-cancelling microphone, placed above the participant's mouth, was connected to the presentation computer, enabling the recording of vocal responses and the measurement of vocal response latencies. The experiment started with the PWI task. A prompt on the screen indicated the end of one task and the beginning of the next task, with the instructions presented once more for 20 s. The Stroop task followed the PWI task, and the Simon task was performed last. For all three tasks, a trial started with the presentation of a fixation cross in the center of the screen for 500 ms. Next, the stimulus was displayed for 1 s. PWI and Stroop stimuli were displayed in the center of the screen. Simon stimuli were presented either to the right or to the left of the fixation cross, depending on the Simon condition of the trial. A black screen followed for the duration of the jitter period (varying between 2.4 and 6 s, following a normal distribution, randomly assigned to each trial). The registration of the vocal and manual responses started as soon as the stimuli were displayed and lasted until the next trial started. For each task, the stimuli were presented in three blocks with breaks of 20 s between blocks.
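
The trial timing can be sketched as follows. The fixation, stimulus, and jitter bounds come from the description above; the mean and standard deviation of the jitter distribution are not reported, so the values used here are assumptions, as is the use of rejection sampling to keep draws within the stated range. Block breaks are not modeled.

```python
import random

FIXATION_S = 0.5
STIMULUS_S = 1.0
JITTER_MIN_S, JITTER_MAX_S = 2.4, 6.0


def draw_jitter(rng, mean_s=4.2, sd_s=0.9):
    """Draw one jitter from a normal distribution truncated to the stated range.

    mean_s and sd_s are assumed values; the source gives only the range.
    """
    while True:
        value = rng.gauss(mean_s, sd_s)
        if JITTER_MIN_S <= value <= JITTER_MAX_S:
            return value


def stimulus_onsets(n_trials, seed=0):
    """Return stimulus onset times (s) and the jitter drawn for each trial."""
    rng = random.Random(seed)
    clock = 0.0
    onsets, jitters = [], []
    for _ in range(n_trials):
        clock += FIXATION_S        # fixation cross
        onsets.append(clock)       # stimulus appears
        clock += STIMULUS_S        # stimulus stays on screen
        jitter = draw_jitter(rng)  # black screen
        jitters.append(jitter)
        clock += jitter
    return onsets, jitters
```

With these durations, consecutive stimulus onsets are separated by between 3.9 and 7.5 s.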

Data Acquisition

Participants were scanned with a 1.5-T Siemens Avanto Scanner with a 32-channel head coil. For the acquisition of the functional data, we used a parallel-acquired inhomogeneity-desensitized fMRI sequence (Poser et al., 2006), which is a multiecho echo-planar imaging sequence that reduces image artefacts and is therefore suitable for acquiring data of participants while they speak (e.g., Menenti et al., 2011; Segaert et al., 2012). In this sequence, the images are acquired at multiple echo times (TEs) following a single excitation. The repetition time (TR) used was 2.31 s, with the five TEs acquired at 8.3, 27.6, 37, 46, and 55 ms (echo spacing = 0.5 ms, flip angle = 80°). Each volume comprised 36 slices of 3 mm thickness [ascending slice acquisition, voxel size = 3.5 × 3.5 × 3 mm³, slice gap = 17%, field of view (FOV) = 224 mm, matrix = 64 × 64]. GRAPPA parallel imaging was used (acceleration factor = 3). Functional scans were acquired in one run. First, 30 volumes were acquired and used for weight calculation of each of the echoes (pre-task volumes), followed by the three tasks one after the other.

For the anatomical MRI, T1-weighted images were acquired using a magnetization-prepared, rapid-acquisition gradient echo sequence (MPRAGE; TR = 2.25 s, TE = 2.95 ms, echo spacing = 8.7 ms, flip angle = 15°). We acquired 176 sagittal slices (isotropic voxel size = 1 mm³, FOV = 256 mm, matrix = 256 × 256).

Behavioral Data Analysis

For each trial of the PWI and Stroop tasks, the experimenter evaluated the participants' vocal responses. Trials that contained a disfluent response, a wrong pronunciation of the word, or a wrong response word were coded as errors and subsequently excluded from the statistical analyses of the naming RTs. Errors in the Simon task were also excluded from the statistical analysis of the manual RTs. Vocal RTs shorter than 200 ms and manual RTs shorter than 100 ms were also excluded from the analyses.
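
The exclusion rules can be expressed as a simple filter. The trial fields (`error`, `modality`, `rt_ms`) are hypothetical names introduced for this sketch only.

```python
VOCAL_MIN_RT_MS = 200   # vocal RTs shorter than this are excluded
MANUAL_MIN_RT_MS = 100  # manual RTs shorter than this are excluded


def keep_for_rt_analysis(trial):
    """Return True if a trial enters the RT analysis.

    Error trials are dropped, as are responses faster than the modality's
    minimum plausible latency.
    """
    if trial["error"]:
        return False
    minimum = VOCAL_MIN_RT_MS if trial["modality"] == "vocal" else MANUAL_MIN_RT_MS
    return trial["rt_ms"] >= minimum


example_trials = [
    {"modality": "vocal", "rt_ms": 650, "error": False},
    {"modality": "vocal", "rt_ms": 150, "error": False},   # too fast
    {"modality": "vocal", "rt_ms": 700, "error": True},    # coded as error
    {"modality": "manual", "rt_ms": 150, "error": False},  # fine for manual
    {"modality": "manual", "rt_ms": 90, "error": False},   # too fast
]
kept = [t for t in example_trials if keep_for_rt_analysis(t)]
```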

RTs were averaged over trials per condition and per participant and submitted to by-participant analyses of variance (ANOVA) for the Simon and Stroop tasks separately, and additionally to by-item ANOVA for the PWI task, with stimulus type (neutral, incongruent, congruent) as the independent variable. Planned contrasts were examined with paired t-tests (two-tailed). Errors were submitted to logistic regression analyses on single-trial data. For the relevant contrasts (i.e., incongruent vs. congruent, incongruent vs. neutral), 95% confidence intervals (CI) around the mean difference are reported, as well as Cohen's d (a measure of effect size), calculated as the difference between two conditions divided by the square root of the averaged variance of the two conditions (Cumming, 2012). Due to technical failures, vocal RTs were not registered for six participants and manual RTs were not registered for one participant (errors were registered). Thus, the statistical analyses of the vocal responses comprised 17 participants and the analyses of the manual responses comprised 22 participants.
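
An effect size of this kind can be computed as in the sketch below. It assumes that the standardizer averages the variances of the two conditions entering the contrast (the d_av form described by Cumming, 2012) and that each list holds one mean RT per participant; the numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d_av(condition_a, condition_b):
    """Mean difference divided by the root of the averaged condition variances.

    Assumes one value per participant in each list and uses the sample
    variance (n - 1 denominator) of each condition.
    """
    difference = mean(condition_a) - mean(condition_b)
    standardizer = sqrt((variance(condition_a) + variance(condition_b)) / 2)
    return difference / standardizer


# Made-up per-participant mean RTs (ms), not data from the study.
incongruent = [700, 800, 900]
congruent = [600, 700, 800]
d = cohens_d_av(incongruent, congruent)  # 100 ms difference, SD of 100 ms
```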

fMRI Data Pre-Processing

The pre-processing steps were conducted using Matlab and SPM8 (www.fil.ion.ucl.ac.uk/spm/software/spm8). First, all volumes were realigned to the first volume and re-sliced. Then the five echoes of each volume were combined to yield one volume per TR using an in-house Matlab script (see for details Poser et al., 2006). For each voxel, optimal weights for the five echoes were calculated from the 30 pre-task volumes, and these weights were applied to the rest of the functional volumes, resulting in one volume per TR. Then these images were slice-time corrected to the first slice. The participant's mean image of the functional run after realignment was co-registered with the participant's anatomical volume. Finally, the functional and anatomical images were spatially normalized to Montreal Neurological Institute (MNI) space and smoothed (3D isotropic Gaussian smoothing kernel, full-width at half-maximum = 8 mm).
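
The echo-combination step can be illustrated for a single voxel as below. The weighting rule used here (weights proportional to TE multiplied by the temporal signal-to-noise ratio estimated from the pre-task volumes) is an assumption in the spirit of Poser et al. (2006); the in-house Matlab script is not described in detail and may differ. The two-echo toy data are invented.

```python
from statistics import mean, stdev


def echo_weights(pretask_series, tes_ms):
    """Compute normalized per-echo weights for one voxel.

    pretask_series[e] holds the voxel's signal across the pre-task volumes
    for echo e. Assumed rule: weight proportional to TE * (mean / SD).
    """
    raw = [te * mean(series) / stdev(series)
           for series, te in zip(pretask_series, tes_ms)]
    total = sum(raw)
    return [value / total for value in raw]


def combine_echoes(echo_values, weights):
    """Weighted sum of one voxel's echoes for a single volume."""
    return sum(w * v for w, v in zip(weights, echo_values))


# Invented two-echo example: both echoes end up with equal weight.
weights = echo_weights([[100, 102, 98], [50, 52, 48]], tes_ms=[10, 20])
combined = combine_echoes([100, 50], weights)
```

Estimating the weights once from separate pre-task volumes and then holding them fixed keeps the combination from tracking task-related signal changes.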

fMRI Data Analysis

Statistical analyses were performed within a general linear model (GLM) framework. For the analysis on individual participants' data, the model included eight regressors timelocked to the onset of each condition of each task (PWI incongruent, PWI neutral, PWI congruent, Stroop incongruent, Stroop neutral, Stroop congruent, Simon incongruent, and Simon congruent), one regressor for trials in which an error was made, and one regressor to model the intra- and inter-task period. The onsets of each event were modeled as a delta function, or stick-function (i.e., duration = 0), temporally convolved with the canonical hemodynamic response function along with the first temporal derivative. The model also included the six motion parameters and their first derivatives to account for residual movement-related artefacts. Since participants were overtly producing the words during the PWI and Stroop tasks, we specifically included the first derivatives of the motion parameters to account for signals that might be affected by sudden movements due to overt responses. A high pass filter was implemented (1/128 Hz cutoff) to account for slow drifts of the signal. The effects were estimated with a subject-specific fixed-effects model. (We also modeled the RT as durations for each of the trials, but given that the results were quite similar to the ones reported below and we did not have the RTs for all participants, these results are not reported here.)
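
To make the construction of an event regressor concrete, the sketch below convolves zero-duration events with a double-gamma response and samples the result once per TR. The shape parameters (peak shape 6, undershoot shape 16, undershoot ratio 1/6) are assumed to resemble commonly used canonical defaults and are not taken from the source; the temporal derivative, motion regressors, and high-pass filter are not modeled here.

```python
import math


def canonical_hrf(t):
    """Double-gamma hemodynamic response at time t (s) after an event.

    Assumed parameters: gamma shapes 6 (peak) and 16 (undershoot),
    unit rate, undershoot scaled by 1/6.
    """
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def event_regressor(onsets_s, n_scans, tr_s=2.31):
    """Predicted response at each scan for zero-duration events.

    Convolving stick functions with the response function is equivalent to
    summing one shifted copy of the response per event onset.
    """
    return [
        sum(canonical_hrf(scan * tr_s - onset) for onset in onsets_s)
        for scan in range(n_scans)
    ]


single_event = event_regressor([0.0], n_scans=10)
```

For a single event at time zero, the sampled regressor starts at zero, is largest at the third scan (4.62 s), and later dips below baseline during the undershoot.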

Specific contrasts of interest were calculated for each participant and these contrast images were used as random variables on the group level. All clusters reported as significant had voxels thresholded at p = 0.001 (uncorrected), with the cluster-size statistics thresholded at p ≤ 0.05 (family-wise error corrected) (Hayasaka and Nichols, 2003). First, we looked into areas that were significant in a whole-brain analysis. Since we were interested in domain-general activations, we localized shared areas that were active in all three tasks. For this aim, ANOVAs were performed on participants' individual contrast images with task and stimulus type as independent variables. We then conducted a “conjunction analysis” by identifying overlapping voxels that were above the threshold (voxel level p = 0.001, uncorrected) in the incongruent condition of each of the three tasks. For the linguistic-vocal tasks, images of each stimulus type were contrasted for each task separately using paired t-tests on the group level.
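
The overlap logic of such a conjunction can be sketched as a voxel-wise logical AND over thresholded maps. The maps below are tiny invented lists and the threshold of 3.5 is a hypothetical statistic value standing in for the voxel-level criterion; cluster-size correction is not modeled.

```python
def conjunction_voxels(stat_maps, threshold):
    """Return indices of voxels that exceed the threshold in every map."""
    n_voxels = len(stat_maps[0])
    if any(len(m) != n_voxels for m in stat_maps):
        raise ValueError("all maps must cover the same voxels")
    return [
        voxel
        for voxel in range(n_voxels)
        if all(m[voxel] > threshold for m in stat_maps)
    ]


# Invented statistic values for five voxels in three incongruent-condition maps.
pwi_map = [4.1, 2.0, 5.3, 3.9, 0.5]
stroop_map = [3.8, 4.4, 4.9, 1.2, 0.7]
simon_map = [3.6, 4.0, 6.1, 3.7, 0.2]
shared = conjunction_voxels([pwi_map, stroop_map, simon_map], threshold=3.5)
```

Only voxels that pass in all three maps survive, so a region strongly active in two tasks but sub-threshold in the third is excluded.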

ROI analyses

Given our interest in the involvement of the dorsal ACC, STG and MTG, a region of interest (ROI) analysis was performed by restricting our search volume within these ROIs defined anatomically using the AAL template (Tzourio-Mazoyer et al., 2002). Furthermore, we were interested in the specific part of the dorsal ACC that was active during the conflict trials in all three tasks. For this, a conjunction analysis was performed within the bilateral cingulate cortices in the same way as reported above. The dorsal portion of the cingulate cortex that was commonly active in all three incongruent conditions, as shown in this conjunction analysis, was selected as the functional Cingulate ROI. To determine the involvement of this specific Cingulate ROI in the tasks separately, the beta weights from the functional Cingulate ROI were extracted and averaged for each participant and condition separately using the MarsBar toolbox (Brett et al., 2002). Paired t-tests were used to test the conflict conditions in a pair-wise fashion for each task separately. Since we had an a priori hypothesis that the congruent conditions would elicit the least conflict, one-tailed tests were used.
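
The ROI step (averaging beta weights over the ROI voxels, then comparing conditions across participants) can be sketched as below. This does not use the MarsBar toolbox; the function names and the numbers are illustrative assumptions, and only the t statistic is computed, since the one-tailed p-value would require a t distribution that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def roi_average(beta_map, roi_voxels):
    """Average one participant's beta weights over the voxels in the ROI."""
    return mean(beta_map[voxel] for voxel in roi_voxels)


def paired_t(condition_a, condition_b):
    """Paired t statistic: mean difference over its standard error.

    Each list holds one ROI-averaged beta per participant, in the same
    participant order. Degrees of freedom are len(condition_a) - 1.
    """
    differences = [a - b for a, b in zip(condition_a, condition_b)]
    standard_error = stdev(differences) / sqrt(len(differences))
    return mean(differences) / standard_error


# Invented ROI-averaged betas for four participants.
incongruent_betas = [3.0, 4.0, 6.0, 7.0]
congruent_betas = [1.0, 3.0, 4.0, 4.0]
t_value = paired_t(incongruent_betas, congruent_betas)
```

A positive t value here indicates larger ROI responses for incongruent than congruent stimuli, which is the direction tested by the one-tailed comparison.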

For the linguistic-vocal tasks, the ROI analyses comprised the left superior and middle temporal cortex (Indefrey and Levelt, 2004), according to the AAL template. The Stroop task showed a significant effect for incongruent > congruent condition in the left temporal cortex. To observe activity differences between conditions for the PWI task in this area, we extracted averaged beta values of each PWI condition from this functional ROI for each participant using MarsBar. Paired t-tests (two tailed) were then used to test the conditions in a pair-wise fashion for the PWI task.

Results

Behavioral Data

Table 1 presents the mean RTs and standard deviations for correct responses and the error rates as a function of stimulus type and task.

Table 1. Mean response time (M) and standard deviation (SD) in milliseconds, and percent error (E%) as a function of stimulus type in each task.

Errors

Table 2 presents the results of the logistic regression analysis on the errors. In sum, in the PWI task, errors were more likely in the incongruent and neutral conditions than in the congruent condition, and equally likely in the incongruent and neutral conditions. In the Stroop task, errors were more likely in the incongruent than in the congruent and in the neutral conditions, but equally likely in the neutral and congruent conditions. Finally, in the Simon task, errors were more likely in the incongruent than in the congruent condition.

Table 2. Results of the logistic regression analysis on the errors for the three tasks.

RTs

Table 3 presents the results for the main effect of stimulus type, which was statistically significant for all three tasks. Table 4 presents the results of the pair-wise comparisons of condition for the three tasks. In sum, for all three tasks, RTs in the incongruent condition were longer than in the congruent condition and, for PWI and Stroop, also longer than in the neutral condition. Vocal RTs were also longer in the neutral than in the congruent condition.

Table 3. Results of the analyses of variance on response times for the main effect of stimulus type in the picture-word interference, Stroop, and Simon tasks.

TABLE 4

Table 4. Results of the pair-wise comparisons of response times between conditions for the picture-word interference, Stroop, and Simon tasks.

fMRI Data

Cross-domain activity

Areas that were commonly activated by incongruent stimuli in all three tasks in the whole-brain analysis are shown in Table 5 and in Figures 1A and 2. The incongruent stimuli in all three tasks commonly activated the cerebellum (bilaterally), a large cluster in the left Rolandic operculum and STG (Figure 2), and the dorsal ACC (Figure 1A). Furthermore, in line with the whole-brain analysis, two peaks of activity were observed in the dorsal ACC (BA 24; MNI: −4, 12, 36; and BA 32; MNI: 4, 18, 36) in the Cingulate ROI analysis, shown in the lower part of Table 5.

TABLE 5

Table 5. Statistically significant activations in the whole-brain and ROI analyses for the conjunction of the PWI, Stroop, and Simon tasks.

FIGURE 1

Figure 1. (A) Activity common to incongruent stimuli in the picture-word interference (PWI), Stroop, and Simon tasks in the anterior cingulate cortex (BA 24; peak MNI: −4, 12, 36; and BA 32; peak MNI: 4, 18, 36). (B) Averaged beta weights of active voxels in the anterior cingulate cortex (shown in A) as a function of task and stimulus type. Inc, incongruent; Neu, neutral; Con, congruent; n.s., non-significant. Error bars represent the standard error of the mean. *p-values ≤ 0.05, **p-values ≤ 0.01, ***p-values ≤ 0.005.

FIGURE 2

Figure 2. Activity common to incongruent stimuli in the picture-word interference, Stroop, and Simon tasks in a cluster comprising left Rolandic operculum (BA 22; peak MNI: −50, −6, 4) and left superior temporal gyrus.

Note that ideally, analyses would have targeted regions showing increased BOLD responses for the incongruent relative to the congruent conditions across all three tasks. However, this analysis proved to be untenable in the present investigation as the BOLD responses in the dorsal ACC in the Simon task were elevated in both congruent and incongruent conditions (see below), preventing us from detecting regions showing increased activity for the incongruent relative to the congruent condition in this task. Thus, we were not able to detect brain areas that were commonly modulated by stimulus type (i.e., incongruent > congruent) across all three tasks. Importantly, the cross-task conjunction of incongruent conditions still entails a contrast, i.e., vs. a low-level baseline. Hence, with this contrast, we detect the activity from the most difficult condition in all three tasks relative to this low-level baseline. This is comparable to the approach taken by Indefrey and Levelt (2004) in their meta-analysis, where activity common to different production tasks was detected by means of a comparison to a low-level baseline.
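
The logic of a cross-task conjunction can be illustrated with a short sketch. It assumes a minimum-statistic rule, in which a voxel counts as commonly active only if its incongruent-versus-baseline statistic exceeds the threshold in every task; this section does not specify the exact conjunction procedure, so that rule, the threshold, the function name `conjunction`, and the statistic values below are all assumptions made for illustration.

```python
def conjunction(stat_maps, threshold):
    """Return indices of voxels whose statistic exceeds threshold in every map.

    stat_maps: list of equal-length lists, one statistic per voxel per task.
    Requiring the minimum across maps to pass is the same as requiring
    each map to pass individually.
    """
    n_voxels = len(stat_maps[0])
    if any(len(m) != n_voxels for m in stat_maps):
        raise ValueError("all maps must cover the same voxels")
    return [v for v in range(n_voxels)
            if min(m[v] for m in stat_maps) > threshold]


# Hypothetical t-values for five voxels (incongruent vs. low-level baseline).
pwi = [4.2, 1.0, 5.1, 3.6, 0.4]
stroop = [3.9, 4.8, 4.4, 2.1, 0.2]
simon = [3.5, 4.1, 0.9, 3.8, 0.3]

common = conjunction([pwi, stroop, simon], threshold=3.0)
print("voxels active in all three tasks:", common)
```

Under this rule a voxel that is strongly active in two tasks but not in the third is excluded, which is why the conjunction isolates activity shared by the linguistic-vocal and the manual task.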

Figure 1B shows the mean beta weights extracted for each stimulus type in the three tasks from the Cingulate ROI, which was generated from the conjunction of the incongruent conditions across all three tasks. In the Stroop task, dorsal ACC activity was higher with incongruent than with congruent stimuli, t₍₂₂₎ = 2.61, p = 0.008; and higher with incongruent than neutral stimuli, t₍₂₂₎ = 3.02, p = 0.003; but similar for neutral and congruent stimuli, t₍₂₂₎ < 1. In the PWI task, dorsal ACC activity was higher with incongruent than with congruent stimuli, t₍₂₂₎ = 1.99, p = 0.030; and higher with neutral than congruent stimuli, t₍₂₂₎ = 2.87, p = 0.009; but similar for neutral and incongruent stimuli, t₍₂₂₎ = 1.43, p = 0.083. In the Simon task, elevated dorsal ACC activity did not differ between the incongruent and congruent conditions, t₍₂₂₎ < 1. The same pattern of activity was observed in the beta weights when we constrained the analyses to the 17 participants for whom RT data was available.

Language-specific activity

When testing for differences in brain activation between conditions for each task separately with paired t-tests, only the Stroop task yielded significant results for the contrasts incongruent > congruent and incongruent > neutral. These results are presented in Table 6 and in Figure 3A. In the whole-brain analysis, shown in the upper part of Table 6, both conflict contrasts (i.e., incongruent vs. neutral and incongruent vs. congruent) showed increased activity in the right inferior frontal gyrus (rIFG). In the Cingulate ROI analysis, shown in the lower part of Table 6, dorsal ACC activations were also increased for incongruent stimuli relative to neutral and congruent stimuli. Interestingly, in the Left Temporal ROI analysis, shown in Figure 3A, activity in the left STG was also increased for incongruent relative to congruent stimuli in the Stroop task. Note that this left STG ROI area (MNI −50, 0, −12 and −46, −10, −12) is slightly more ventral than the left STG area (MNI −46, −30, 16 and −48, 4, 0) that was identified by the conjunction of the incongruent conditions in all three tasks (section Cross-domain activity). That is, this region of the left STG is not activated by the Simon task, suggesting language-specific activation.

TABLE 6

Table 6. Statistically significant activations for the Stroop task in the whole-brain and ROI analyses (cingulate and left superior/middle temporal cortex).

FIGURE 3

Figure 3. (A) Active voxels for incongruent versus congruent in the Stroop task (BA 38; peak MNI: −50, 0, −12; and −46, −10, −12). (B) Averaged beta weights of active voxels in (A) in the picture-word interference (PWI) task as a function of stimulus type. Inc, incongruent; Neu, neutral; Con, congruent; n.s., non-significant. Error bars represent the standard error of the mean. *p-values ≤ 0.05, **p-values ≤ 0.01.

To examine language-specific activity in the PWI task, the averaged beta weights within this left STG cluster were extracted; they are shown in Figure 3B. Activity in left STG was higher with neutral than with congruent (identical) stimuli, t₍₂₂₎ = 2.31, p = 0.030; and higher with neutral than with incongruent (categorically related) stimuli, t₍₂₂₎ = 2.87, p = 0.009; but similar for congruent and incongruent stimuli, t₍₂₂₎ < 1. Importantly, activity in this left STG cluster was not significantly increased from baseline for the Simon task (incongruent: beta weight = 0.008, t₍₂₂₎ < 1; congruent: beta weight = 0.37, t₍₂₂₎ = 1.73, p = 0.097), nor did it differ between incongruent and congruent conditions, t₍₂₂₎ < 1. The same pattern of activity was observed in the beta weights when we constrained the analyses to the 17 participants for whom RT data was available.

Discussion

In the present study, we compared three control-demanding tasks, two of which had linguistic stimuli requiring vocal responding (Stroop and PWI), and the third had visual-spatial stimuli requiring manual responding (Simon task). Participants responded to congruent and incongruent stimuli in all three tasks, and in the Stroop and PWI tasks to neutral stimuli as well. Behaviorally, RTs were longer for incongruent than for congruent stimuli in all three tasks. Furthermore, in the linguistic-vocal tasks, RTs were longer for neutral than for congruent stimuli. These results are in line with previous literature for all three tasks (for reviews: PWI: Glaser, 1992; Stroop: MacLeod, 1991; Simon: Hommel, 2011).

Regarding the neuroimaging data, an analysis was performed to identify areas showing increased BOLD responses common to the incongruent condition in all three tasks (cross-domain activation). The areas identified by this conjunction analysis were the bilateral cerebellum, the left Rolandic operculum extending to the left STG, and the dorsal ACC.

Top-down control of task performance has been associated with a frontoparietal network of brain areas, including the lateral prefrontal cortex, the anterior insula/frontal operculum, the pre-supplementary motor area (SMA) and the ACC, and regions in and around the intraparietal sulcus (e.g., Dosenbach et al., 2006; Duncan, 2010; Power et al., 2011; Barbey et al., 2012; Niendam et al., 2012; Petersen and Posner, 2012). Our finding of common activation in the left operculum, SMA, and ACC across incongruent conditions in all tasks is in line with the evidence that a domain-general attentional control system is implemented by frontoparietal areas. Given our specific interest in the involvement of the ACC in speech production, we further examined activity in this area for the language tasks.

Cross-Domain Anterior Cingulate Cortex Activity in Language Tasks

An extensive meta-analysis of the cingulate cortex has linked different portions of this area to different behavioral domains, i.e., attention, action, emotion, language, memory, and pain (Torta and Cauda, 2011). In this meta-analysis, two adjacent regions were shown to be involved in all six domains examined, suggesting the exercise of a general function that is commonly called upon by performance in multiple tasks. Notably, the portion of the cingulate cortex where we observed the common activity across our tasks is a part of this multi-domain area identified by the meta-analysis. The activity we observed in the domain-general portion of the cingulate cortex was common to the incongruent condition of all three tasks and was thus independent of the response modality and nature of the stimuli (linguistic vs. non-linguistic). Therefore, the most plausible account for our results is that this activity reflects a domain-general attentional control function, a proposal that is also in line with the functional interpretation of the frontoparietal network of brain areas (e.g., Dosenbach et al., 2006; Duncan, 2010; Barbey et al., 2012; Niendam et al., 2012; Petersen and Posner, 2012). As indicated previously (section Introduction), researchers have not reached agreement about what exactly this domain-general function of the ACC is (e.g., conflict monitoring, top-down regulation), but our results at least show that activity in this region is present when controlled responses are required in both linguistic and non-linguistic domains.

The evidence for the involvement of the dorsal ACC in spoken word production has thus far remained inconclusive in the literature. To address this issue, we examined the portion of the dorsal ACC that was activated across tasks for modulations in activity as a function of stimulus type in the language tasks (Stroop and PWI). In the Stroop task, activity was higher for incongruent than for neutral and congruent color words. In the PWI task, activity was higher for incongruent and neutral picture-word pairs relative to congruent pairs. These results provide the first direct neuroimaging evidence for the involvement of a domain-general portion of the cingulate cortex in the control over spoken word production (for a comparison between Stroop and Simon tasks with manual responding see Peterson et al., 2002; Liu et al., 2004). Our results agree with the proposal of Roelofs and colleagues (e.g., Roelofs and Hagoort, 2002; Roelofs, 2003; Roelofs et al., 2006), who argued for a regulation function of the ACC, in line with the evidence for a regulatory role of the ACC in non-verbal vocalizations (Aitken, 1981; Ploog, 1981; Jürgens, 2002, 2009). Moreover, our results also agree with the recent proposal of Nozari et al. (2011), who suggested that the ACC is implicated in self-monitoring in language production, in line with the ACC conflict-detection view (Botvinick et al., 2004). The present results do not allow us to adjudicate between the regulation and monitoring views, so future studies explicitly addressing this issue are needed.

Interference effects in behavior and brain activity

We observed a discrepancy in the language tasks between the condition differences in the RTs (incongruent > neutral > congruent) and the beta estimates in the dorsal ACC (see Figure 1). For the Stroop task, the incongruent condition led to an increased BOLD response relative to both the neutral and congruent conditions (incongruent > neutral = congruent), whereas for the PWI task, the incongruent and neutral conditions both had higher BOLD responses than the congruent condition (incongruent = neutral > congruent). Conflict, and thus the amount of conflict detected (Botvinick et al., 2004) or the amount of top-down regulation needed (Roelofs et al., 2006), is thought to be highest in the incongruent condition, followed by the neutral, and then the congruent condition. This pattern was clearly present in the RT data, but not in the neuroimaging data, even when the analyses of the neuroimaging data were constrained to the subjects for whom behavioral data was available. Based on this pattern, it could be argued that the present results do not agree with either the conflict monitoring or the top-down regulation views of ACC function.

The apparent discrepancy between RTs and ACC activity, however, can be resolved (and the theoretical views can be saved) if the magnitude of the conflict effects as evident in the RTs is taken into account. The largest RT effects in the PWI and Stroop tasks (>58 ms on average) are also the effects being detected in the BOLD estimates for each task, whereas the contrasts from the smaller behavioral effects, i.e., on average 25 ms for incongruent vs. neutral in PWI and 35 ms for neutral vs. congruent in Stroop, resulted in no statistically significant differences in the BOLD response. The relatively small behavioral effect sizes may suggest that the discrepancy between the behavioral interference effects and the activity in dorsal ACC may well be a matter of low statistical power. Despite the lack of an exact parallel between condition differences in RTs and dorsal ACC activity, the present results support our claim that a domain-general attentional control mechanism in the dorsal ACC is engaged during spoken word production.

Anterior cingulate cortex activity in picture-word interference studies

As mentioned in the introduction, only one PWI study had observed increased dorsal ACC activity for categorically related picture-word stimuli (equivalent to our incongruent condition) relative to a low-level control condition (de Zubicaray et al., 2001), whereas subsequent PWI studies did not observe differential activity in this area for categorically related (incongruent) and unrelated (neutral) picture-word pairs (Spalek and Thompson-Schill, 2008; de Zubicaray and McMahon, 2009; de Zubicaray et al., 2013). Similar to some of these previous results, we also did not observe activation differences in the dorsal ACC for categorically related relative to unrelated picture-word pairs. As discussed above, the difference in the amount of conflict between these two conditions may not have been large enough to give rise to detectable differences in brain activity. However, different from all previous studies, our design also included congruent picture-word pairs, for which conflict is absent. Relative to the congruent condition, conflicting picture-word pairs were associated with increased dorsal ACC activity, in line with the hypothesis that the ACC is involved in attentional control over word production (e.g., conflict monitoring or top-down regulation). Previous fMRI investigations comparing categorically related picture-word pairs with no-conflict pairs (i.e., pictures paired with a string of Xs) observed activity in an orbito-frontal ACC area not previously associated with domain-general control (cf. de Zubicaray et al., 2001; Torta and Cauda, 2011). Thus, our study provides evidence for the involvement of the dorsal ACC in control over word production.

Language-Specific Activity

Stroop task

The Stroop task has been well studied with fMRI, although the large majority of these studies have used manual responding (e.g., Bench et al., 1993; Banich et al., 2000; Liu et al., 2004; for a brief overview, see MacLeod and MacDonald, 2000), rather than vocal responding (e.g., Carter et al., 1995; Brown et al., 1999; Barch et al., 2001). In our task, participants responded overtly to incongruent, neutral, and congruent stimuli. In line with previous literature using manual and vocal responding, an increased BOLD response in the dorsal ACC was observed for incongruent relative to congruent and neutral color words (e.g., Banich et al., 2000; Barch et al., 2001; Fan et al., 2003; for an overview see also MacLeod and MacDonald, 2000). Moreover, the dorsal ACC coordinates we obtained are similar to those obtained by de Zubicaray et al. (2002) when contrasting phonologically related with unrelated picture-word pairs in PWI. Regarding other areas, rIFG and insular activity was also increased for incongruent relative to neutral and congruent stimuli, which is also consistent with previous studies using manual responding (e.g., Peterson et al., 2002; Floden et al., 2011). Earlier studies have suggested that the rIFG is involved in inhibition (e.g., Aron et al., 2004) or in the detection of salient or task-relevant cues indicating the need for top-down regulation (e.g., Hampshire et al., 2007). Our findings are compatible with both views. However, the literature suggests that the inhibition function implemented by rIFG is domain-general, whereas we observed activity in this area only in relation to the language tasks. This finding is consistent with the view that inhibition is not necessarily engaged to resolve conflict and can be optionally employed (Verhoef et al., 2009; Roelofs et al., 2011).

In addition to the areas that were common to the Stroop contrasts (incongruent vs. congruent and incongruent vs. neutral), increased BOLD responses were also observed in the right striatum (caudate and putamen) for incongruent relative to congruent stimuli. This finding is in line with the evidence that the caudate nucleus and the putamen are among the primary subcortical areas that underlie attentional control (e.g., Aarts et al., 2010; Wiecki and Frank, 2013), both at the task and response levels (Aarts et al., 2009). These results thus suggest that speech production, like other motor tasks, engages a frontal-striatal network implicated in attentional control. Finally, we also observed increased BOLD responses in the left anterior STG for incongruent relative to congruent stimuli, a less common finding in the literature (e.g., Fan et al., 2003). We will elaborate on this left STG activation in the next section.

Picture-word interference task and left temporal cortex

For the left anterior STG area showing BOLD response differences in the Stroop task, activity was increased for neutral (categorically unrelated) relative to the incongruent (categorically related) and congruent stimuli in the PWI task. The STG area we observed is located within the left anterior temporal lobe, a structure crucial for semantic memory (Patterson et al., 2007; Binder et al., 2009; Visser et al., 2010; Bonner and Price, 2013), including the mapping of concepts onto words in production (Indefrey and Levelt, 2004; Schwartz et al., 2009). Furthermore, our left temporal cortex activity is similar to a previous report of a PWI study also using categorically related and unrelated picture-word pairs (de Zubicaray et al., 2013). In that study, the left MTG activity was also interpreted in terms of lexical-semantic memory (Indefrey and Levelt, 2004).

Previous fMRI studies investigating the categorically related condition either in comparison to the unrelated condition (de Zubicaray and McMahon, 2009; de Zubicaray et al., 2013) or to a control condition (de Zubicaray et al., 2001) have observed modulations in the BOLD signal in the left STG and MTG as a function of picture-word type. For example, a recent fMRI study (de Zubicaray et al., 2013) observed longer picture-naming RTs for related than unrelated stimuli, but a reduction in activity in the left MTG for related relative to unrelated stimuli, similar to our finding of reduced activity in the left STG for incongruent (i.e., categorically related) relative to neutral (i.e., unrelated) stimuli. In line with these findings, our results provide independent evidence of increased picture-naming RT and decreased activity in the left temporal cortex for categorically related picture-word pairs relative to unrelated pairs. This finding is also in line with a recent magnetoencephalography (MEG) study, which used stimulus materials very similar to those in the present fMRI study (Piai et al., 2013). In the MEG study, responses from the left middle temporal cortex between 300 and 500 ms after picture-word presentation were smaller for categorically related (and congruent) picture-word pairs relative to unrelated pairs. Importantly, the behavioral data showed the usual pattern of longer picture-naming RTs for related than unrelated stimuli.

How can we interpret this difference between RTs and brain responses for related and unrelated conditions in the PWI task? In order to name a picture, speakers have to retrieve its name from long-term memory. Upon picture presentation, activation from the pictured concept spreads through the lexical-semantic network, leading to the activation of a cohort of words that belong to the network (e.g., Roelofs, 1992; Abdel Rahman and Melinger, 2009). Similarly, the distractor word also activates representations in this network. Crucially, in PWI, the picture activates the distractor word on related but not on unrelated trials. This “reverse priming” makes related distractors stronger competitors than unrelated ones (Roelofs, 1992). Such priming in the lexical-semantic memory system (e.g., Collins and Loftus, 1975; Roelofs, 1992) may explain why categorically (and semantically) related picture-word pairs show less brain activity in the left temporal cortex relative to unrelated pairs (de Zubicaray et al., 2013; Piai et al., 2013; and the present results).

Although this account can explain why we observed reduced activity in the left STG, it requires an additional mechanism to account for the slowdown in naming associated with categorically related picture-word pairs. Such a mechanism has been proposed by Roelofs (1992), who presented computational simulations demonstrating that the semantic interference effect in RTs is explained by reverse priming and selection of a word only if its activation exceeds that of alternative words by a critical amount. Moreover, the simulations by Roelofs et al. (2006) demonstrated that if the ACC is involved in enhancing the activation of a target concept until a corresponding word is selected, then the patterns of ACC activity in Stroop-like tasks (including those in the present study) can also be explained. Our fMRI results not only corroborate previous findings regarding the left temporal cortex, for which the activation reflects priming in the lexical-semantic memory system, but also highlight the involvement of the dorsal ACC, especially when selection and monitoring processes are more demanding due to the co-activation of categorically related words.
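
The interplay of reverse priming and a critical-difference selection rule can be made concrete with a toy simulation. This is a deliberately simplified sketch and not the model of Roelofs (1992): the update rule, the parameter values, and the function name `steps_to_selection` are all hypothetical. The target word gains activation at a constant rate, the distractor word decays but receives a small boost from the picture on related trials, and the target is selected once it leads the distractor by a critical amount.

```python
def steps_to_selection(reverse_priming, distractor_start=1.0,
                       critical_diff=1.0, target_rate=0.2,
                       decay=0.9, max_steps=1000):
    """Count time steps until the target exceeds the distractor by critical_diff.

    reverse_priming: activation the picture sends to the distractor word on
    every step (zero on unrelated trials, positive on related trials).
    Returns None if the criterion is not met within max_steps.
    """
    target = 0.0
    distractor = distractor_start
    for step in range(1, max_steps + 1):
        target += target_rate                                # picture activates its name
        distractor = distractor * decay + reverse_priming    # decay plus priming
        if target - distractor >= critical_diff:
            return step
    return None


unrelated = steps_to_selection(reverse_priming=0.0)
related = steps_to_selection(reverse_priming=0.05)
print("unrelated distractor:", unrelated, "steps")
print("related distractor:", related, "steps")
```

In this sketch the related distractor stays active for longer, so the critical difference is reached later, mirroring the direction of the semantic interference effect in RTs. The sketch says nothing about the BOLD response, which the account above attributes to priming within the lexical-semantic network.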

Conclusions

The present study was designed to address whether a common neural substrate might be engaged in the attentional control over linguistic and non-linguistic tasks with varying degrees of conflict. We observed activity in the dorsal ACC that was common to incongruent conditions of three different attentional control tasks, regardless of the response modality (vocal vs. manual) and nature of the stimuli (linguistic vs. non-linguistic). This common activation suggests a domain-general substrate that is called upon by all three tasks. More focused analysis of this commonly activated region of the dorsal ACC in the linguistic-vocal tasks showed that it was sensitive to more difficult (i.e., incongruent) relative to easier linguistic stimuli. Finally, in the PWI task, increased activity was observed in the left anterior superior temporal cortex for picture-word pairs that did not belong to the same semantic category relative to picture-word pairs that did, probably reflecting the extent to which categorically related words were co-activated through target and distractor cues. These results suggest that language production engages brain areas implementing domain-general mechanisms for attentional control, as well as areas related to core language processes, such as lexical-semantic retrieval.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

This research was supported by a grant from the Netherlands Organization for Scientific Research under grant number MaGW 400-09-138 to Ardi Roelofs. The authors would like to thank Peter Indefrey for helpful suggestions about the design of the experiment, Paul Gaalman for assistance during data collection, and Kristoffer Dahlslätt for helpful discussion.

References

Aarts, E., Roelofs, A., Franke, B., Rijpkema, M., Fernández, G., Helmich, R. C., et al. (2010). Striatal dopamine mediates the interface between motivational and cognitive control in humans: Evidence from genetic imaging. Neuropsychopharmacology 35, 1943–1951. doi: 10.1038/npp.2010.68

Aarts, E., Roelofs, A., and Van Turennout, M. (2008). Anticipatory activity in anterior cingulate cortex can be independent of conflict and error likelihood. J. Neurosci. 28, 4671–4678. doi: 10.1523/JNEUROSCI.4400-07.2008

Aarts, E., Roelofs, A., and Van Turennout, M. (2009). Attentional control of task and response in lateral and medial frontal cortex: brain activity and reaction time distributions. Neuropsychologia 47, 2089–2099. doi: 10.1016/j.neuropsychologia.2009.03.019

Abdel Rahman, R., and Melinger, A. (2009). Semantic context effects in language production: a swinging lexical network proposal and a review. Lang. Cogn. Process. 24, 713–734. doi: 10.1080/01690960802597250

Aitken, P. G. (1981). Cortical control of conditioned and spontaneous vocal behavior in rhesus monkeys. Brain Lang. 13, 171–184. doi: 10.1016/0093-934X(81)90137-1

Alexander, W. H., and Brown, J. W. (2011). Medial prefrontal cortex as an action-outcome predictor. Nat. Neurosci. 14, 1338–1344. doi: 10.1038/nn.2921

Aron, A. R., Robbins, T. W., and Poldrack, R. A. (2004). Inhibition and the right inferior frontal cortex. Trends Cogn. Sci. 8, 170–177. doi: 10.1016/j.tics.2004.02.010

Awh, E., and Gehring, W. J. (1999). The anterior cingulate cortex lends a hand in response selection. Nat. Neurosci. 2, 853–854. doi: 10.1038/13145

Banich, M. T., Milham, M. P., Atchley, R., Cohen, N. J., Webb, A., Wszalek, T., et al. (2000). fMRI studies of Stroop tasks reveal unique roles of anterior and posterior brain systems in attentional selection. J. Cogn. Neurosci. 12, 988–1000. doi: 10.1162/08989290051137521

Barbey, A. K., Colom, R., Solomon, J., Krueger, F., Forbes, C., and Grafman, J. (2012). An integrative architecture for general intelligence and executive function revealed by lesion mapping. Brain 135, 1154–1164. doi: 10.1093/brain/aws021

Barch, D. M., Braver, T. S., Akbudak, E., Conturo, T., Ollinger, J., and Snyder, A. (2001). Anterior cingulate cortex and response conflict: effects of response modality and processing domain. Cereb. Cortex 11, 837–848. doi: 10.1093/cercor/11.9.837

Bench, C. J., Frith, C. D., Grasby, P. M., Friston, K. J., Paulesu, E., Frackowiak, R. S. J., et al. (1993). Investigations of the functional anatomy of attention using the Stroop test. Neuropsychologia 31, 907–922. doi: 10.1016/0028-3932(93)90147-R

Binder, J. R., Desai, R. H., Graves, W. W., and Conant, L. L. (2009). Where is the semantic system? A critical review and meta-analysis of 120 functional neuroimaging studies. Cereb. Cortex 19, 2767–2796. doi: 10.1093/cercor/bhp055

Bonner, M. F., and Price, A. R. (2013). Where is the anterior temporal lobe and what does it do? J. Neurosci. 33, 4213–4215. doi: 10.1523/JNEUROSCI.0041-13.2013

Botvinick, M. M., Cohen, J. D., and Carter, C. S. (2004). Conflict monitoring and anterior cingulate cortex: an update. Trends Cogn. Sci. 8, 539–546. doi: 10.1016/j.tics.2004.10.003

Brett, M., Anton, J.-L., Valabregue, R., and Poline, J.-B. (2002). “Region of interest analysis using an SPM toolbox,” in Presented at the 8th International Conference on Functional Mapping of the Human Brain (Sendai).

Brown, G. G., Kindermann, S. S., Siegle, G. J., Granholm, E., Wong, E. C., and Buxton, R. B. (1999). Brain activation and pupil response during covert performance of the Stroop Color Word task. J. Int. Neuropsychol. Soc. 5, 308–319. doi: 10.1017/S1355617799544020

Carter, C. S., Mintun, M., and Cohen, J. D. (1995). Interference and facilitation effects during selective attention: an H₂¹⁵O PET study of Stroop task performance. Neuroimage 2, 264–272. doi: 10.1006/nimg.1995.1034

Christoffels, I. K., Formisano, E., and Schiller, N. O. (2007). Neural correlates of verbal feedback processing: an fMRI study employing overt speech. Hum. Brain Mapp. 28, 868–879. doi: 10.1002/hbm.20315

Collins, A. M., and Loftus, E. F. (1975). A spreading-activation theory of semantic processing. Psychol. Rev. 82, 407–428. doi: 10.1037/0033-295X.82.6.407

Cumming, G. (2012). Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. New York, NY: Routledge.

Devinsky, O., Morrell, M. J., and Vogt, B. A. (1995). Contributions of anterior cingulate cortex to behaviour. Brain 118, 279–306. doi: 10.1093/brain/118.1.279

de Zubicaray, G. I., Hansen, S., and McMahon, K. L. (2013). Differential processing of thematic and categorical conceptual relations in spoken word production. J. Exp. Psychol. Gen. 142, 131–142. doi: 10.1037/a0028717

de Zubicaray, G. I., McMahon, K., Eastburn, M., and Wilson, S. (2002). Orthographic/phonological facilitation of naming responses in the picture-word task: an event-related fMRI study using overt vocal responding. Neuroimage 16, 1084–1093. doi: 10.1006/nimg.2002.1135

de Zubicaray, G. I., and McMahon, K. L. (2009). Auditory context effects in picture naming investigated with event-related fMRI. Cogn. Affect. Behav. Neurosci. 9, 260–269. doi: 10.3758/CABN.9.3.260

de Zubicaray, G. I., Wilson, S. J., McMahon, K. L., and Muthiah, S. (2001). The semantic interference effect in the picture-word paradigm: an event-related fMRI study employing overt responses. Hum. Brain Mapp. 14, 218–227. doi: 10.1002/hbm.1054

Dhooge, E., and Hartsuiker, R. J. (2011). Lexical selection and verbal self-monitoring: Effects of lexicality, context, and time pressure in picture-word interference. J. Mem. Lang. 66, 163–176. doi: 10.1016/j.jml.2011.08.004

Dosenbach, N. U. F., Visscher, K. M., Palmer, E. D., Miezin, F. M., Wenger, K. K., Kang, H. C., et al. (2006). A core system for the implementation of task sets. Neuron 50, 799–812. doi: 10.1016/j.neuron.2006.04.031

Duncan, J. (2010). The multiple-demand (MD) system of the primate brain: mental programs for intelligent behaviour. Trends Cogn. Sci. 14, 172–179. doi: 10.1016/j.tics.2010.01.004

Fan, J., Flombaum, J. I., McCandliss, B. D., Thomas, K. M., and Posner, M. I. (2003). Cognitive and brain consequences of conflict. Neuroimage 18, 42–57. doi: 10.1006/nimg.2002.1319

Ferreira, V., and Pashler, H. (2002). Central bottleneck influences on the processing stages of word production. J. Exp. Psychol. Learn. 28, 1187–1199. doi: 10.1037/0278-7393.28.6.1187

Floden, D., Vallesi, A., and Stuss, D. T. (2011). Task context and frontal lobe activation in the Stroop task. J. Cogn. Neurosci. 23, 867–879. doi: 10.1162/jocn.2010.21492

Glaser, W. R. (1992). Picture naming. Cognition 42, 61–105. doi: 10.1016/0010-0277(92)90040-O

Glaser, W. R., and Düngelhoff, F. J. (1984). The time course of picture-word interference. J. Exp. Psychol. Human. 10, 640–654. doi: 10.1037/0096-1523.10.5.640

Hampshire, A., Chamberlain, S. R., Monti, M. M., Duncan, J., and Owen, A. M. (2010). The role of the right inferior frontal gyrus: inhibition and attentional control. Neuroimage 50, 1313–1319. doi: 10.1016/j.neuroimage.2009.12.109

Hartsuiker, R. J., and Kolk, H. H. J. (2001). Error monitoring in speech production: a computational test of the perceptual loop theory. Cogn. Psychol. 42, 113–157. doi: 10.1006/cogp.2000.0744

Hayasaka, S., and Nichols, T. E. (2003). Validating cluster size inference: random field and permutation methods. Neuroimage 20, 2343–2356. doi: 10.1016/j.neuroimage.2003.08.003

Hommel, B. (2011). The Simon effect as tool and heuristic. Acta Psychol. 136, 189–202. doi: 10.1016/j.actpsy.2010.04.011

Indefrey, P. (2011). The spatial and temporal signatures of word production components: a critical update. Front. Psychol. 2:255. doi: 10.3389/fpsyg.2011.00255

Indefrey, P., and Levelt, W. J. M. (2004). The spatial and temporal signatures of word production components. Cognition 92, 101–144. doi: 10.1016/j.cognition.2002.06.001

Jürgens, U. (2002). Neural pathways underlying vocal control. Neurosci. Biobehav. Rev. 26, 235–258. doi: 10.1016/S0149-7634(01)00068-9

Jürgens, U. (2009). The neural control of vocalization in mammals: a review. J. Voice 23, 1–10. doi: 10.1016/j.jvoice.2007.07.005

Levelt, W. J. M. (1989). Speaking: from Intention to Articulation. Cambridge, MA: MIT Press.

Levelt, W. J. M., Roelofs, A., and Meyer, A. S. (1999). A theory of lexical access in speech production. Behav. Brain Sci. 22, 1–75. doi: 10.1017/S0140525X99001776

Liu, X., Banich, M. T., Jacobson, B. L., and Tanabe, J. L. (2004). Common and distinct neural substrates of attentional control in an integrated Simon and spatial Stroop task as assessed by event-related fMRI. Neuroimage 22, 1097–1106. doi: 10.1016/j.neuroimage.2004.02.033

Lupker, S. J. (1979). The semantic nature of response competition in the picture-word interference task. Mem. Cognition 7, 485–495. doi: 10.3758/BF03198265

MacLeod, C. M. (1991). Half a century of research on the Stroop effect: an integrative review. Psychol. Bull. 109, 163–203. doi: 10.1037/0033-2909.109.2.163

MacLeod, C. M., and MacDonald, P. A. (2000). Interdimensional interference in the Stroop effect: uncovering the cognitive and neural anatomy of attention. Trends Cogn. Sci. 4, 383–391. doi: 10.1016/S1364-6613(00)01530-8

Menenti, L., Gierhan, S. M. E., Segaert, K., and Hagoort, P. (2011). Shared language: overlap and segregation of the neuronal infrastructure for speaking and listening revealed by functional MRI. Psychol. Sci. 22, 1173–1182. doi: 10.1177/0956797611418347

Niendam, T. A., Laird, A. R., Ray, K. L., Dean, Y. M., Glahn, D. C., and Carter, C. S. (2012). Meta-analytic evidence for a superordinate cognitive control network subserving diverse executive functions. Cogn. Affect. Behav. Neurosci. 12, 241–268. doi: 10.3758/s13415-011-0083-5

Nozari, N., Dell, G. S., and Schwartz, M. F. (2011). Is comprehension necessary for error detection? A conflict-based account of monitoring in speech production. Cogn. Psychol. 63, 1–33. doi: 10.1016/j.cogpsych.2011.05.001

Patterson, K., Nestor, P. J., and Rogers, T. T. (2007). Where do you know what you know? The representation of semantic knowledge in the human brain. Nat. Rev. Neurosci. 8, 976–987. doi: 10.1038/nrn2277

Paus, T. (2001). Primate anterior cingulate cortex: where motor control, drive and cognition interface. Nat. Rev. Neurosci. 2, 417–424. doi: 10.1038/35077500

Petersen, S. E., and Posner, M. I. (2012). The attention system of the human brain: 20 years after. Annu. Rev. Neurosci. 35, 73–89. doi: 10.1146/annurev-neuro-062111-150525

Peterson, B. S., Kane, M. J., Alexander, G. M., Lacadie, C., Skudlarski, P., Leung, H. C., et al. (2002). An event-related functional MRI study comparing interference effects in the Simon and Stroop tasks. Cogn. Brain Res. 13, 427–440. doi: 10.1016/S0926-6410(02)00054-X

Petrides, M. (2005). Lateral prefrontal cortex: architectonic and functional organization. Philos. Trans. R. Soc. B Biol. Sci. 360, 781–795. doi: 10.1098/rstb.2005.1631

Piai, V., and Roelofs, A. (2013). Working memory capacity and dual-task interference in picture naming. Acta Psychol. 142, 332–342. doi: 10.1016/j.actpsy.2013.01.006

Piai, V., Roelofs, A., Jensen, O., Schoffelen, J.-M., and Bonnefond, M. (2013). Distinct patterns of brain activity characterize lexical activation and competition in speech production [Abstract]. J. Cogn. Neurosci. 25(Suppl.), 106. Available online at: http://hdl.handle.net/11858/00-001M-0000-000E-FD7B-2

Ploog, D. (1981). Neurobiology of primate audio-vocal behavior. Brain Res. Rev. 3, 35–61. doi: 10.1016/0165-0173(81)90011-4

Poser, B. A., Versluis, M. J., Hoogduin, J. M., and Norris, D. G. (2006). BOLD contrast sensitivity enhancement and artifact reduction with multiecho EPI: parallel-acquired inhomogeneity-desensitized fMRI. Magn. Reson. Med. 55, 1227–1235. doi: 10.1002/mrm.20900

Posner, M. I., and Petersen, S. E. (1990). The attention system of the human brain. Annu. Rev. Neurosci. 13, 25–42. doi: 10.1146/annurev.ne.13.030190.000325

Power, J. D., Cohen, A. L., Nelson, S. M., Wig, G. S., Barnes, K. A., Church, J. A., et al. (2011). Functional network organization of the human brain. Neuron 72, 665–678. doi: 10.1016/j.neuron.2011.09.006

Price, C. J. (2012). A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading. Neuroimage 62, 816–847. doi: 10.1016/j.neuroimage.2012.04.062

Ridderinkhof, K. R., Ullsperger, M., Crone, E. A., and Nieuwenhuis, S. (2004). The role of the medial frontal cortex in cognitive control. Science 306, 443–447. doi: 10.1126/science.1100301

Riès, S., Janssen, N., Dufau, S., Alario, F.-X., and Burle, B. (2011). General-purpose monitoring during speech production. J. Cogn. Neurosci. 23, 1419–1436. doi: 10.1162/jocn.2010.21467

Roelofs, A. (1992). A spreading-activation theory of lemma retrieval in speaking. Cognition 42, 107–142. doi: 10.1016/0010-0277(92)90041-F

Roelofs, A. (2003). Goal-referenced selection of verbal action: modeling attentional control in the Stroop task. Psychol. Rev. 110, 88–125. doi: 10.1037/0033-295X.110.1.88

Roelofs, A. (2004). Error biases in spoken word planning and monitoring by aphasic and nonaphasic speakers: comment on Rapp and Goldrick (2000). Psychol. Rev. 111, 561–572. doi: 10.1037/0033-295X.111.2.561

Roelofs, A. (2008). Attention to spoken word planning: chronometric and neuroimaging evidence. Lang. Linguist. Compass 2, 389–405. doi: 10.1111/j.1749-818X.2008.00060.x

Roelofs, A., and Hagoort, P. (2002). Control of language use: cognitive modeling of the hemodynamics of Stroop task performance. Cogn. Brain Res. 15, 85–97. doi: 10.1016/S0926-6410(02)00218-5

Roelofs, A., and Piai, V. (2011). Attention demands of spoken word planning: a review. Front. Psychol. 2:307. doi: 10.3389/fpsyg.2011.00307

Roelofs, A., Piai, V., and Garrido Rodriguez, G. (2011). Attentional inhibition in bilingual naming performance: evidence from delta-plot analyses. Front. Psychol. 2:184. doi: 10.3389/fpsyg.2011.00184

Roelofs, A., Van Turennout, M., and Coles, M. G. H. (2006). Anterior cingulate cortex activity can be independent of response conflict in Stroop-like tasks. Proc. Natl. Acad. Sci. U.S.A. 103, 13884–13889. doi: 10.1073/pnas.0606265103

Rosinski, R. R. (1977). Picture-word interference is semantically based. Child Dev. 48, 643–647. doi: 10.2307/1128667

Schwartz, M. F., Kimberg, D. Y., Walker, G. M., Faseyitan, O., Brecher, A., Dell, G. S., et al. (2009). Anterior temporal involvement in semantic word retrieval: voxel-based lesion-symptom mapping evidence from aphasia. Brain 132, 3411–3427. doi: 10.1093/brain/awp284

Segaert, K., Menenti, L., Weber, K., Petersson, K. M., and Hagoort, P. (2012). Shared syntax in language production and language comprehension - an fMRI study. Cereb. Cortex 22, 1662–1670. doi: 10.1093/cercor/bhr249

Simon, J. R., and Small, A. M. Jr. (1969). Processing auditory information: interference from an irrelevant cue. J. Appl. Psychol. 53, 433–435. doi: 10.1037/h0028034

Spalek, K., and Thompson-Schill, S. L. (2008). Task-dependent semantic interference in language production: an fMRI study. Brain Lang. 107, 220–228. doi: 10.1016/j.bandl.2008.05.005

Stroop, J. R. (1935). Studies of interference in serial verbal reactions. J. Exp. Psychol. 18, 643–662. doi: 10.1037/h0054651

Torta, D. M., and Cauda, F. (2011). Different functions in the cingulate cortex, a meta-analytic connectivity modeling study. Neuroimage 56, 2157–2172. doi: 10.1016/j.neuroimage.2011.03.066

Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., et al. (2002). Automated anatomical labelling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single subject brain. Neuroimage 15, 273–289. doi: 10.1006/nimg.2001.0978

Van Casteren, M., and Davis, M. H. (2006). Mix, a program for pseudorandomization. Behav. Res. Methods 38, 584–589. doi: 10.3758/BF03193889

Verhoef, K., Roelofs, A., and Chwilla, D. (2009). Role of inhibition in language switching: evidence from event-related brain potentials in overt picture naming. Cognition 110, 84–99. doi: 10.1016/j.cognition.2008.10.013

van de Ven, V., Esposito, F., and Christoffels, I. K. (2009). Neural network of speech monitoring overlaps with overt speech production and comprehension networks: a sequential spatial and temporal ICA study. Neuroimage 47, 1982–1991. doi: 10.1016/j.neuroimage.2009.05.057

Visser, M., Jefferies, E., and Lambon Ralph, M. A. (2010). Semantic processing in the anterior temporal lobes: a meta-analysis of the functional neuroimaging literature. J. Cogn. Neurosci. 22, 1083–1094. doi: 10.1162/jocn.2009.21309

Wiecki, T. V., and Frank, M. J. (2013). A computational model of inhibitory control in frontal cortex and basal ganglia. Psychol. Rev. 120, 329–355. doi: 10.1037/a0031542

Appendix

Table A1. Materials from the experiment (English translations between parentheses).

Keywords: attentional control, anterior cingulate cortex, superior temporal cortex, picture-word interference, Simon, Stroop, word production

Citation: Piai V, Roelofs A, Acheson DJ and Takashima A (2013) Attention for speaking: domain-general control from the anterior cingulate cortex in spoken word production. Front. Hum. Neurosci. 7:832. doi: 10.3389/fnhum.2013.00832

Received: 03 July 2013; Accepted: 18 November 2013;
Published online: 09 December 2013.

Edited by:

Greig I. De Zubicaray, University of Queensland, Australia

Reviewed by:

F.-Xavier Alario, CNRS and Aix-Marseille Université, France
Stefan Heim, RWTH Aachen University, Germany

Copyright © 2013 Piai, Roelofs, Acheson and Takashima. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Vitória Piai, Donders Institute for Brain, Cognition and Behaviour, Centre for Cognition, Radboud University Nijmegen, Montessorilaan 3 B.01.05, Nijmegen 6525 HR, Netherlands. e-mail: v.piai@donders.ru.nl
