Motor Response Selection in Overt Sentence Production: A Functional MRI Study

Many different cortical areas are thought to be involved in the process of selecting motor responses, from the inferior frontal gyrus, to the lateral and medial parts of the premotor cortex. The objective of the present study was to examine the neural underpinnings of motor response selection in a set of overt language production tasks. To this aim, we compared a sentence repetition task (externally constrained selection task) with a sentence generation task (volitional selection task) in a group of healthy adults. In general, the results clarify the contribution of the pre-SMA, cingulate areas, PMv, and pars triangularis to the process of selecting motor responses in the context of sentence production, and shed light on the manner in which this network is modulated by selection mode. Further, the present study suggests that response selection in sentence production engages neural resources similar to those engaged in the production of isolated words and oral motor gestures.


INTRODUCTION
How are our innermost thoughts converted into an articulated verbal message? The neural mechanisms that underlie this fascinating conversion include the selection of words to express an intended meaning, and the selection and sequencing of motor programs to realize them. Motor response selection in the context of spoken language production can be broadly construed as the process by which a set of lexical units forming a message is transformed into a sequence of motor programs; it is a complex process that links cognitive, linguistic, and sensorimotor systems.
Despite the importance of motor response selection, attempts to incorporate this process into contemporary biological models of language remain scarce (but see for example Crosson et al., 2001). Most models of speech and/or language (e.g., Levelt, 1999; Hickok and Poeppel, 2004; Indefrey and Levelt, 2004; Riecker et al., 2005; Guenther et al., 2006) postulate a lexical selection stage, which is a non-motor, language-specific process that can, with some difficulty, be integrated into a broader action execution framework. However, these models postulate that competition for selection occurs only at the lexical stage, and thus never incorporate motor response selection. Although cascaded models of spoken language production (e.g., Morsella and Miozzo, 2002) do not postulate a motor selection stage per se, they do assume that lexical competition spreads to phonological representations, thereby supporting the idea that competition occurs at different levels of representation.
Notwithstanding the lack of a theoretical framework for response selection in spoken language production, several recent studies suggest a role for frontal premotor regions in this process. For example, results of a recent electroencephalographic (EEG) study comparing volitional and externally cued word selection demonstrate modulation of medial frontal activity, suggesting a role for these areas in response selection (Tremblay et al., 2008). Consistent with this finding, several fMRI studies have shown that manipulating response selection during overt or covert single word production modulates large brain networks including the pre-SMA (Brodmann's area 6 m; supplementary motor cortex; SMA), but also the adjacent cingulate motor area (CMA), the inferior frontal gyrus, and the ventral premotor (PM) cortex (Thompson-Schill et al., 1997; Crosson et al., 2001; Zhang et al., 2004; Alario et al., 2006; Gracco, 2006, 2010; Nagel et al., 2008). One important finding is that the pre-SMA appears to be involved not only in selecting single words (Alario et al., 2006; Tremblay and Gracco, 2006) but also in selecting non-communicative oral motor gestures (Tremblay and Gracco, 2010). Further support for a role for this region is provided by results of a repetitive TMS study (Tremblay and Gracco, 2009), which showed that pre-SMA is essential for volitional motor response selection, but not for stimulus-based selection, and that this pattern is similar for selecting words and non-communicative oral motor gestures. Further evidence for a role for pre-SMA in motor response selection was provided by Braun et al. (2001), who found that production of self-organized sequences of lip, jaw, and tongue movements, as well as the production of language, are both associated with activation in pre-SMA. Taken together, these results suggest that the pre-SMA may play a central role in selecting motor responses during spoken language production.
It could be argued that pre-SMA activation in these studies is related to other linguistic or cognitive processes associated with the production of spoken language. However, there is some evidence to suggest that this is not the case. First, in some of these studies, non-linguistic actions, such as oral gestures (Braun et al., 2001; Tremblay and Gracco, 2010) and hand actions (Tremblay et al., 2008) were compared to word production tasks and similar patterns of neural activity were found across domains (linguistic, non-linguistic). Furthermore, in the realm of motor control per se, several neuroimaging studies have examined the process of selecting motor responses and shown that the magnitude of activation in pre-SMA increases commensurate with demands on response selection. For instance, activation in pre-SMA is enhanced when participants are free to choose a motor response from among several alternatives (i.e., "volitional" selection) compared to when they are required to execute a specific, stimulus-driven, motor response (e.g., Deiber et al., 1996; Van Oostende et al., 1997; Hadland et al., 2001; Ullsperger and Von Cramon, 2001; Weeks et al., 2001; Lau et al., 2004, 2006). Despite a long-standing tendency to conceptualize language as "unique" or "special," that is, as being independent from other behaviors, it is becoming increasingly accepted that language relies on largely distributed (that is, presumably non-language-specific) neural networks, though the degree and nature of the overlap between language and other functional systems needs to be further characterized. At the behavioral level, several experiments have demonstrated a connection between speech and hand gestures (Gentilucci et al., 2001; Gentilucci, 2003), and between language and oral motor gestures (Alcock et al., 2000; Alcock, 2006). In this context, the finding of similar neural circuits engaged in motor response selection across domains is not surprising.
Taken together, these findings are consistent, at least in part, with a hypothesis that is referred to as the "medio-lateral gradient of control" hypothesis, according to which the more an action requires internal (volitional) control, the greater the involvement of medial premotor areas (which correspond to the medial portion of Brodmann area 6). In contrast, externally (stimulus) driven actions tend to rely on lateral (rather than medial) premotor areas (Goldberg, 1985). Traditionally, the medial portion of Brodmann area 6 was considered to be a single area, the supplementary motor area (Penfield and Welch, 1951; Woolsey et al., 1952). However, it is now widely accepted that this large cortical area divides into at least two distinct areas approximately at the level of the anterior commissure (see for example Rizzolatti et al., 1998; and Luppino and Rizzolatti, 2000, for reviews), with the SMA-proper forming the caudal part of the region, posterior to the VAC line, and the pre-SMA forming the anterior part. The pre-SMA has a connectivity pattern that is ideal for linking cognitive and motor processes, a sine qua non for the implementation of motor response selection, with important projections from the prefrontal cortex, particularly the dorsolateral prefrontal cortex (Luppino et al., 1993; Lu et al., 1994; Wang et al., 2005), and connections with several premotor areas such as the SMA-proper and the lateral PM (Luppino and Rizzolatti, 2000), for controlling motor output. In addition to the pre-SMA, the lateral premotor cortex has also been discussed in the context of response selection, particularly in relation to stimulus-based hand movement selection (Goldberg, 1985; Mushiake et al., 1991; Deiber et al., 1996; Dirnberger et al., 1998), though evidence of distinct pathways for volitional and stimulus-based selection remains scarce.
In sum, a review of the current literature suggests an important contribution of the pre-SMA, along with a potential contribution of the adjacent CMA, the inferior frontal gyrus (IFG), and the ventral PM, in selecting motor programs for single words, single oral non-communicative gestures, and finger movements. One important question that follows from these findings is whether the pattern of results in isolated single word processing bears any resemblance to the pattern associated with production of the phrases, sentences, and discourse that characterize naturalistic spoken language. Given the heavy reliance of connected speech on selection, and the accelerated pace at which selection occurs (adult speakers may produce as many as 14 phonemes per second, i.e., up to six to nine syllables per second; e.g., Kent, 2000), it is reasonable to ask whether selection in this setting relies on the same neural mechanisms as in isolated single word production. The objective of the present study was to test the generalizability of previous results by examining the neural underpinnings of motor response selection in a set of sentence production tasks. To this aim, we compared a sentence repetition task with a sentence generation task in a group of 21 healthy adults. Based on the literature, we predicted a stronger involvement of pre-SMA and possibly ventral PM (PMv) in sentence generation than in sentence repetition, reflecting the increased requirements for selection during generation. We also expected regions involved in response selection to be active in both production modes, as both require selection.

PARTICIPANTS
Twenty-one healthy right-handed (Oldfield, 1971) native speakers of English (mean age 25 ± 4.4 years; 10 males), with a mean of 15.4 years of education, participated in the fMRI experiment. All participants had normal hearing sensitivity, as measured by normal pure-tone thresholds and normal speech recognition scores (92.3% accuracy on the Northwestern University auditory test number 6). The Institutional Review Board for the Division of Biological Sciences at The University of Chicago approved the study.

BEHAVIORAL TASKS
To evaluate spontaneous production of words under restricted search conditions, a category fluency task was administered to participants prior to the fMRI session. Participants were instructed to produce as many animal and vegetable words as possible in 1 min (in two separate trials). To examine participants' verbal comprehension skills, an auditory memory span task was administered to participants (an auditory version of the reading span task developed by Daneman and Carpenter, 1980). Participants' responses were recorded and stored to disk for offline analysis. A research assistant naive to the purpose of the study transcribed all the responses.

EXPERIMENTAL PROCEDURES
Participants underwent five different tasks while in the scanner: (1) passive observation of object pictures, (2) passive sentence listening, (3) listening and repeating sentences, (4) generating sentences from object pictures, and (5) passive observation of short action movies. The comparison of the language tasks and the non-language tasks has been reported elsewhere (Tremblay and Small, 2011). Each condition was acquired in separate runs, and alternated with "rest" epochs during which the participants were asked to relax. For each condition, the order of the conditions and number of rest trials were optimized using OPTseq2 (http://surfer.nmr.mgh.harvard.edu/optseq/). Stimuli were presented using Presentation Software (Neurobehavioral Systems).
The tasks of interest for this study were the two sentence production tasks (sentence repetition and sentence generation). During sentence repetition, participants heard a set of 80 sentences (40 action, 40 object sentences) interleaved with 30 rest trials; their task was to repeat the sentence. Both stimulus presentation and response occurred while the gradients were switched off for a 4.5-s period of silence ("sparse sampling"). At the beginning of the silent interval, a Go cue was presented, instructing participants to start repeating the sentence. Participants' responses were recorded and stored to disk for offline analysis. In sentence generation, participants were asked to generate 80 sentences (40 action, 40 object) from a set of 40 object pictures interleaved with 28 rest trials. The pictures were simple black-and-white line drawings representing common man-made objects selected from the International Picture Norming Project corpus from the Center for Research in Language at the University of California San Diego (Szekely et al., 2003). In each experimental trial, a picture was presented for 1 s and was followed, after 500 ms, by a visual cue ("go") instructing participants to start generating the sentence. As noted, all speaking occurred while the MR gradients were switched off.
In addition to these two sentence production tasks, we included two passive tasks, sentence listening and picture observation, as controls for sentence repetition and sentence generation, respectively. During sentence listening, 80 short sentences (0.9-1.3 s) interleaved with 30 rest trials were presented to participants. Half of these sentences described manual object-directed actions and the other half described visual properties of the same set of objects. The sentence stimuli were presented while the gradients were switched off, which ensured ease of auditory processing for participants. During picture observation, a set of 40 simple black-and-white line drawings was presented one per trial for 1 s and interleaved with 37 rest trials (crosshair fixation). Participants were instructed simply to attend to the pictures.

Image acquisition
The data were acquired on a 3 T General Electric (Milwaukee, WI) Signa HDx imager with EXCITE. Participants wore MR compatible headphones and goggles (NordicNeuroLab Audio/Visual system). Thirty-four axial slices (3.125 mm × 3.125 mm × 3.6 mm, no gap, FOV = 256 mm × 256 mm, matrix = 64 × 64) were acquired in 1.5 s using a multislice EPI sequence with parallel imaging (ASSET = 2; TE = 26 ms; FOV = 20 cm; 64 × 64 matrix; flip angle = 73°). To eliminate movement artifacts associated with speaking, and to ensure that participants could hear the auditory stimuli, a sparse image acquisition technique was used during all the language tasks. A silent period (1.5 s for listening, 4.5 s for repetition and generation) was interleaved between each volume acquisition. Trials containing errors (corresponding to 1.2% of the trials in sentence repetition and 13.5% in sentence generation) were excluded from the analysis of the behavioral and fMRI data. High-resolution T1-weighted volumes were acquired for anatomical localization.
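The timing of the sparse acquisition described above can be sketched as follows. This is only an illustration: the function name `sparse_schedule` is hypothetical, and the assumption that each cycle begins with the silent period and ends with the volume acquisition is ours, not a detail stated in the text. Only the durations (1.5 s per volume; 1.5 s or 4.5 s of silence) come from the description.

```python
ACQUISITION_S = 1.5  # time needed to acquire one volume (from the text)

def sparse_schedule(n_volumes, silent_s, acquisition_s=ACQUISITION_S):
    """Return (silence_start, acquisition_start) pairs in seconds.

    Assumes each cycle is a silent period followed by one volume
    acquisition (an assumption made for illustration).
    """
    cycle = silent_s + acquisition_s
    return [(i * cycle, i * cycle + silent_s) for i in range(n_volumes)]

# Production tasks: 4.5 s of silence per cycle, so one volume every 6 s.
production = sparse_schedule(3, silent_s=4.5)
# Listening task: 1.5 s of silence per cycle, so one volume every 3 s.
listening = sparse_schedule(3, silent_s=1.5)

print(production)
print(listening)
```

Under these assumptions, each production trial spans 6 s, which leaves the full 4.5-s silent window for stimulus presentation and overt speech.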

Timeseries analyses
The timeseries were spatially registered, motion-corrected (within and across runs), de-spiked and converted to percentage of signal change using AFNI (Cox, 1996). A linear least squares model was used to establish a fit to each time point of the hemodynamic response function for each of these conditions. There were separate regressors for each of the experimental conditions. Additional regressors were the mean, linear, and quadratic trend components, as well as the six motion parameters (x, y, z, roll, pitch, yaw). We modeled the entire trial duration (i.e., 6 s), which included stimulus presentation and speech production. Event-related signals were calculated by linear interpolation, beginning at stimulus onset, and continuing for 12 s, using AFNI's tent function (i.e., a piecewise linear spline model). The fit was examined at two different time lags (0-6 s, and 6-12 s) to identify the time point showing the strongest hemodynamic response in our regions of interest (ROI). All subsequent analyses focused on the beta values from the first 6 s post-stimulus onset time lag.
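The piecewise linear ("tent") model mentioned above can be illustrated with a small NumPy sketch. It is not AFNI's implementation: the helper names, the knot placement (0, 6, and 12 s), the sampling times, and the simulated signal are all assumptions chosen so that the example is small and noise-free.

```python
import numpy as np

def tent(x):
    """Unit tent (hat) function: 1 at x = 0, falling linearly to 0 at |x| = 1."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def tent_design(sample_times, onsets, knots):
    """Build one regressor per knot (assumes evenly spaced knots).

    Each column sums, over stimulus onsets, a tent centred at onset + knot.
    """
    knots = np.asarray(knots, dtype=float)
    spacing = knots[1] - knots[0]
    design = np.zeros((len(sample_times), len(knots)))
    for j, knot in enumerate(knots):
        for onset in onsets:
            design[:, j] += tent((sample_times - onset - knot) / spacing)
    return design

# Hypothetical timing: one volume every 6 s, two stimulus onsets.
sample_times = np.arange(0.0, 72.0, 6.0)
onsets = [0.0, 36.0]
knots = [0.0, 6.0, 12.0]  # assumed knots spanning the 12-s window

X = tent_design(sample_times, onsets, knots)

# Simulate a noise-free response and recover it by linear least squares.
true_betas = np.array([0.2, 1.0, 0.4])
signal = X @ true_betas
betas, *_ = np.linalg.lstsq(X, signal, rcond=None)
print(betas)
```

Each fitted coefficient is the estimated response amplitude at one knot, which is why the fit can be inspected separately at different post-stimulus lags.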
Participants' anatomical scan was aligned to the registered EPI timeseries (Saad et al., 2009). FreeSurfer (Fischl et al., 1999) was used to create surface representations of each participant's anatomy. Once these surfaces were created, they were exported into SUMA (Saad et al., 2004), which was used to project the functional data resulting from the first-level analysis onto two-dimensional surfaces. Prior to running the group analyses, we applied a 6-mm smoothing kernel to increase the signal-to-noise ratio. Smoothing data on the surface instead of the volume ensures that smoothing avoids inclusion of white matter data, and it prevents averaging data across sulci and gyri (Argall et al., 2006). The group analyses were performed using SUMA on the smoothed beta values. First, we examined the main effect of each condition (repetition, generation) compared to their respective baselines (sentence listening, picture observation). We then examined the difference between sentence generation and sentence repetition. These standard subtraction-type analyses were complemented by a "conjunction" analysis (Nichols et al., 2005) to uncover brain regions commonly active across the speaking tasks. In particular, we identified a task-independent speech production network by computing the intersection (or conjunction) of brain activity for repetition ∩ generation. The conjunction analysis only includes regions that survived correction for multiple comparisons in both repetition and generation. For each analysis, a permutation approach (Nichols and Holmes, 2002) was used to identify significant clusters of activated vertices, with an individual vertex threshold of p < 0.005, corrected for multiple comparisons to achieve a family-wise error (FWE) rate of p < 0.05 (clusters ≥ 168 vertices).
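The logic of the conjunction step (repetition ∩ generation) can be sketched as a vertex-wise logical AND of two significance masks. This is a toy illustration only: the array names and values are hypothetical, and the actual analysis operated on cluster-corrected surface maps in SUMA rather than on short boolean arrays.

```python
import numpy as np

def conjunction(mask_a, mask_b):
    """Vertex-wise intersection of two boolean significance masks."""
    return np.logical_and(mask_a, mask_b)

# Hypothetical masks over five surface vertices: True means the vertex
# survived correction for multiple comparisons in that contrast.
repetition_sig = np.array([True, True, False, False, True])
generation_sig = np.array([True, False, True, False, True])

shared = conjunction(repetition_sig, generation_sig)
print(shared)
```

A vertex enters the conjunction only if it is significant in both contrasts, so the result can never be larger than either input map.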

Anatomical region of interest analysis
In addition to the whole brain analyses, further analyses were conducted on two sets (frontal lateral and fronto-medial) of anatomical ROIs selected a priori. The lateral ROIs included the rostral and caudal portions of PMv (rostral PMv: precentral sulcus; caudal PMv: precentral gyrus), and the pars opercularis and pars triangularis of the IFG. The medial ROIs included the pre-SMA and SMA-proper, as well as the rostral and caudal parts of the cingulate gyrus. Each of the ROIs was identified on the individual's cortical surface representation using an automated parcellation scheme as implemented in FreeSurfer (Fischl et al., 2002, 2004; Desikan et al., 2006). This procedure uses a probabilistic labeling algorithm that incorporates the anatomical conventions of Duvernoy (1991), and thus is based on macroanatomical landmarks, not on cytoarchitectonic maps, and therefore represents an approximation to the actual motor and premotor areas. Such an anatomical approach is very robust as it takes into account each individual participant's anatomy; moreover, it avoids the common problem of selection bias in fMRI research, whereby only those voxels exhibiting a particular pattern are chosen for further analyses (for a discussion of this issue, see for example Vul and Kanwisher, 2009): here, all the voxels in each pre-determined region are selected for analysis.
The ROIs were defined as follows: (1) Rostral PMv: this region was operationalized as the ventral part of the precentral sulcus, defined as the part of the sulcus below the junction of the inferior frontal sulcus with the precentral sulcus. The resulting rostral PMv was bounded rostrally by pars opercularis, caudally by the precentral gyrus, and dorsally by the dorsal PM.
(2) Caudal PMv: this region was defined as the part of the precentral gyrus below the junction of the inferior frontal sulcus with the precentral gyrus. The resulting caudal PMv was bounded rostrally by the rostral PMv, caudally by the central sulcus, and dorsally by the dorsal PM.
(3) Pars triangularis was defined as the gyrus immediately anterior to pars opercularis; bounded caudally by pars opercularis, and rostrally by pars orbitalis, not including the inferior frontal sulcus.
(4) Pars opercularis was defined as the part of the IFG immediately anterior to the precentral gyrus, bounded caudally by the precentral sulcus, and rostrally by pars triangularis, and not including the inferior frontal sulcus.
(5) Pre-SMA was defined as the portion of the medial superior frontal gyrus that is anterior to the VAC line, which is a (virtual) vertical line passing through the anterior commissure, and posterior to a virtual line passing through the genu of the corpus callosum. The ventral boundary of the pre-SMA is the cingulate sulcus.
(6) SMA-proper was defined as the portion of the medial superior frontal gyrus posterior to the VAC line, and anterior to the medial precentral gyrus.
(7) The rostral cingulate region was defined as the part of the cingulate gyrus anterior to the VAC line, and posterior to a virtual line passing through the genu of the corpus callosum.
(8) The caudal cingulate was defined as the portion of the cingulate gyrus posterior to the VAC line, and anterior to the medial precentral gyrus.
The mean percentage of BOLD signal change was extracted for each ROI and each condition. We then calculated two difference scores to isolate the effects specific to producing language, over and above perception of the stimuli: (1) repetition (sentence repetition - sentence listening), and (2) generation (sentence generation - picture observation). These scores were entered in a three-way ANOVA with repeated measurement on Task (Repeat, Generate), Hemisphere (Left, Right), and ROI. We conducted this analysis separately for each ROI group (lateral, medial). We used FDR corrected two-tailed comparisons to examine whether the activation magnitude in each ROI was significantly different from zero (positively or negatively) for repetition and generation. When a region showed significant activation in either of the tasks, we also performed an FDR corrected two-tailed pairwise comparison to examine a potential task effect.
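The two steps described above (difference scores, then FDR-corrected comparisons) can be sketched as follows. The text does not name the FDR procedure; the Benjamini-Hochberg step-up procedure is assumed here purely for illustration, and all numeric values and variable names are hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Boolean mask of tests rejected at FDR level q.

    Implements the Benjamini-Hochberg step-up procedure (an assumed
    choice; the source only says "FDR corrected").
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject

# Hypothetical mean percent signal change in two ROIs.
repeat_psc = np.array([0.8, 0.5])
listen_psc = np.array([0.3, 0.4])
repetition_score = repeat_psc - listen_psc  # production over and above perception

# Hypothetical p-values from five ROI-wise tests against zero.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
significant = benjamini_hochberg(p_values, q=0.05)
print(repetition_score, significant)
```

With these made-up p-values, only the two smallest fall at or below their rank-scaled thresholds (0.01 and 0.02), so only those two tests are declared significant.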
In addition to these analyses, we also examined the relation between regional activation and behavior. Specifically, we correlated the mean activation in each ROI during sentence generation with a set of five behavioral measures: (i) accuracy during the sentence generation task (percentage of correct sentences produced); (ii) number of words produced; (iii) number of syllables generated; (iv) category fluency score (total number of words produced for animal and vegetable fluency combined); and (v) verbal working memory score (reading span; total words recalled per participant). We postulated that these last two measures would be highly related to performance on the sentence generation task, because, like the sentence generation task, they involve word search and response selection. Using partial correlations (with participants as a covariate of no interest), we investigated potential linear relationships between the magnitude of brain signal in each of our ROIs and these measures.

ONLINE BEHAVIORAL DATA ANALYSES
Participants' responses during the fMRI session were recorded online using LabVIEW (National Instruments, Austin, TX, USA) and stored to disk. The responses for two participants could not be stored due to technical difficulty. A research assistant naive to the purpose of the study transcribed the responses for the 19 remaining participants. For each sentence, we verified accuracy (whether or not it conformed to task instructions) and grammaticality (whether the sentence was correctly formed). In addition, we calculated the number of syllables and words for each sentence. Finally, we calculated the number of departures from the primed sentence structure.
Trials containing errors were removed from the analysis of the behavioral and fMRI data.

ONLINE BEHAVIORAL DATA
Complete details on the analysis of the behavioral data have been reported elsewhere (Tremblay and Small, 2011). Of particular importance, the sentence repetition and sentence generation tasks did not differ from one another on any of the online measures (number of words, number of syllables, accuracy).
Moreover, without having been instructed to do so, participants spontaneously imitated the structure they had been exposed to (primed) during the sentence generation and sentence listening tasks, as anticipated based on known "structural persistence" in sentence production (Bock, 1986). The primed sentence structures were simple sentences containing a subject and a predicate. Half the sentences consisted of a noun subject and a simple predicate such as "The drawer is open" or "The scissors are sharp" (the object-related sentences). The other half of the sentences consisted of the first person pronoun ("I") followed by a predicate, such as "I drag the suitcase" (the action-related sentences). All action sentences used the present tense. Results show that participants employed the primed sentence structures in the majority of the trials, with "novel" sentence structures occurring in only 156 (of 1200 total) trials, representing fewer than 13% of all uttered sentences. Most of these novel structures were simple modifications of the primed structure, such as a change from the present to the past tense (representing 49% of all novel structures), deletion of the pronoun (representing 9% of all novel structures), or deletion of the determiner (representing 8% of all novel structures). The details of the deviations from the primed syntactic structure are reported in Table 1.
Figure 1 reveals the brain areas jointly activated for sentence repetition and sentence generation, after removal of baseline activation (sentence listening and picture observation, respectively). These areas included the precentral gyrus and central sulcus bilaterally, as well as the transverse temporal gyrus and sulcus bilaterally. An exhaustive list of all regions is presented in Table 2. Figure 2A shows task-related activation during sentence repetition, after removing the effect of sentence listening.
As can be seen in the Figure, activation was largely bilateral and included clusters of activated nodes along the precentral gyrus and central sulcus covering both the ventral primary motor cortex and the PMv, as well as clusters of activation in the medial frontal area, the bilateral transverse temporal gyrus, and the planum temporale bilaterally. Figure 2B shows task-related activation during sentence generation, after removing the effect of picture observation. Activation was distributed across a large network of bilateral brain areas, including primary and secondary visual areas, the precentral gyrus and central sulcus covering both the ventral primary motor cortex and the ventral premotor cortex, the medial frontal area, the bilateral transverse temporal gyrus and bilateral planum temporale, and the left IFG. Compared to sentence repetition, in which activation was equally distributed across both hemispheres, activation in sentence generation was stronger in the left than in the right hemisphere. An exhaustive list of all task-related activation for the basic contrasts (repetition - listening and generation - picture observation) is presented in Table 3. Direct comparison of the repetition and generation tasks is shown in Figure 3. This contrast revealed activation in the left pre-SMA, as well as activation in the left IFG and in the primary visual cortex bilaterally. These results are detailed in Table 4.
To further examine these results, we tested the activation level in each of the ROIs against zero using a set of FDR corrected pairwise comparisons. These comparisons revealed that overall the left pre-SMA was significantly more active than all other medial regions. Activations in the caudal cingulate gyri and SMA-proper bilaterally were not significantly different from zero in either production task. Activation in the rostral cingulate gyrus was lower than zero (relative deactivation) for the generation task,

Brain-behavior correlations
In addition to examining the activation patterns in the ROIs, we also examined the relationship between activation magnitude during the sentence generation task and a set of five behavioral measures (accuracy during the sentence generation task, number of words produced, number of syllables produced, category fluency, and verbal working memory). In the animal fluency task, participants generated an average of 25.3 (±6.09 SD; range: 15-37) words. In the vegetable fluency task, they generated on average 14.4 (±14.4 SD; range: 8-23) words. We used the total number of words generated as our measure of fluency. In the auditory span task, participants were able to recall a mean of 53/100 words (±10.4 SD; range: 33-67). The average number of words produced in the sentence generation task was 4.49 (±0.55 SD; range: 4-7); the average number of syllables was 5.62 (±0.65 SD; range: 4-8). The results of the correlation analyses are detailed in Table 5. Participants' verbal working memory, as measured by the auditory span task, did not correlate with activation during sentence generation in any of the ROIs. One interesting finding is that activation in the left or right pre-SMA did not correlate with any of the online or offline language measures. In PMv (rostral and caudal) and IFG (pars triangularis and opercularis), activation was negatively correlated with the number of words produced; that is, the more words produced, the less activation was found in these regions.

DISCUSSION
The objective of the present study was to test the generalizability of previous results related to the neural basis of motor response selection by examining the neural underpinnings of this process during a sentence production task, focusing on premotor areas of the cerebral cortex. As discussed in the Introduction, previous studies of hand and finger response selection suggest the existence of a response buffer in which candidate motor programs are co-activated and compete for selection during response planning (e.g., Deiber et al., 1996;Van Oostende et al., 1997;Hadland et al., 2001;Ullsperger and Von Cramon, 2001;Weeks et al., 2001;Lau et al., 2004Lau et al., , 2006. In addition, previous imaging studies (Braun et al., 2001;Tremblay et al., 2008;Tremblay and Gracco, 2010) provide some evidence that this motor response selection mechanism may also be involved during speech production. In the current study we wanted to examine whether such a mechanism could play a role in the production of connected speech. Indeed, most of the research reported in the literature focuses on single word production. However, it is unclear if single word production is an adequate proxy for more complex forms of language, which involve the production of connected speech. To address the question of response selection in a more natural production context, we compared sentence repetition with sentence generation in a group of healthy adults. Sentence generation requires selection of a set of words to express meaning, and the selection of motor programs to realize them, and thus relies heavily on response selection mechanisms; sentence repetition, in contrast, relies less heavily on selection because it involves producing a set of pre-defined words. 
While sentence generation, in addition to requiring semantic processing, also requires syntactic processing, the demands on the syntactic system are limited in our study by the fact that participants had just listened to over 150 sentences with similar syntactic structure prior to sentence generation. We used this design to take advantage of structural persistence (Bock, 1986, 1990; see Pickering and Branigan, 1999, for a review), the priming phenomenon in which people tend to use syntactic constructions they have most recently encountered. Indeed, this part of our design was successful: the sentences participants generated were largely identical in structure to those they had heard, thus controlling for the syntactic complexity of the repetition and generation tasks. Hence, while both sentence generation and sentence repetition required selection of motor programs, the generation task included a competition/selection component that was minimized during sentence repetition. Our hypothesis was that competing words are associated with competing motor programs. Thus, in this context, we expected regions involved in motor response selection to be modulated (generation > repetition), but, critically, we also expected such regions to be active in both sentence production tasks since both require selection and sequencing of motor programs. Based on the literature, we expected to find such a pattern in the pre-SMA and possibly ventral premotor cortex (PMv).

TASK-RELATED ACTIVATION AND DEACTIVATION IN MEDIAL CORTICAL AREAS
Our findings demonstrate that a region of the left medial wall, the pre-SMA, was active in both sentence repetition and sentence generation, and showed a unique and significant task-related modulation, suggesting a role in response selection at the sentence level, and hence extending previous results at the single word level. Interestingly, this effect was restricted to the left pre-SMA and did not extend into the right pre-SMA, suggesting a degree of functional specialization of the left pre-SMA. This pattern of activation is consistent with previous reports of a selection mode effect in the left but not the right pre-SMA (Tremblay and Gracco, 2010). It is also consistent with results of a study in which participants were required to generate sentences aloud from incomplete stimuli ("the child throws the ball" from "throw child ball"). The comparison of this task, which places a high demand on selection and sequencing mechanisms, with a sentence-reading task, which is less taxing, revealed activation in the left pre-SMA (Haller et al., 2005). It could be argued that activation in pre-SMA is related to semantic processing, though this would be surprising given the known involvement of this region in tasks requiring volitional selection without semantic processing. For instance, Tremblay and Gracco (2010) recently showed that when participants freely choose a word or a non-speech oral motor gesture from a pool of potential responses, activation in left pre-SMA is stronger than when they produce a word or a non-speech oral motor gesture based on specific instructions. In this task, semantic processing is minimal, and importantly, in the free selection condition, selection is not based on semantics. Moreover, the fact that activation in pre-SMA does not correlate with any of our language measures supports the claim that activation in the pre-SMA is not tied specifically to language, but rather to a domain-general process. In keeping with previous findings, the present results thus suggest that the left pre-SMA is involved in selecting a response in the context of sentence production. Further, it appears that despite increased complexity, response selection in the context of sentence production engages similar mechanisms to response selection for isolated words and oro-facial gestures.

All coordinates are in Talairach space and represent the peak surface node for each cluster (minimum cluster size: 168 contiguous surface nodes, each significant at p < 0.005).

FIGURE 3 | Family-wise error-corrected group-level (N = 20) task difference (generation > repetition). Activation is shown on the group average smoothed white matter folded surface.

In addition to task-related activation in the pre-SMA, we also found task-related deactivation in the rostral cingulate gyrus during sentence generation that was not present during sentence repetition. The rostral cingulate area is known to be part of a putative default mode network (DMN), which was first identified through a meta-analysis of positron emission tomography studies (Shulman et al., 1997). In addition to the anterior cingulate, the DMN also includes the medial frontal cortex, the posterior cingulate cortex, precuneus, inferior parietal cortex, and the amygdala/hippocampus. It is now recognized that parts of the DMN are differentially engaged depending on task (e.g., Hasson et al., 2009; Newton et al., 2011), and it is postulated that these deactivations are the consequence of either increased or reduced task-related effort (Lin et al., 2011). In a recent study, it was shown that a cortical region including both the rostral cingulate region and the anterior medial frontal cortex was deactivated during a working memory task, and further, that deactivation in this region was positively correlated with working memory performance (Hampson et al., 2006). The deactivation observed here may therefore reflect increased working memory demands for the sentence generation condition relative to the sentence repetition condition.

LATERAL PREMOTOR AREAS IN MOTOR RESPONSE SELECTION
In the present study, we examined three anatomically distinct parts of the lateral premotor area: the pars opercularis of the IFG, the rostral PMv corresponding to the ventral precentral sulcus, and the caudal PMv, corresponding to the ventral precentral gyrus.
In the left hemisphere, all three areas were significantly active in both sentence repetition and sentence generation, while in the right hemisphere, only the caudal PMv was significantly active (for both tasks). The left rostral and caudal parts of PMv both exhibited a significant task-related modulation, extending previous findings of a modulation of PMv activation during single word selection under different selection modes (Tremblay and Gracco, 2010). While this pattern of activation suggests a role in response selection, the finding that activation magnitude in both regions is negatively correlated with the number of words produced during the sentence generation task seems counterintuitive. Indeed, if a linear relationship exists between these two factors, one would predict that the more words are produced (hence the more motor programs compete for selection), the more activation there should be in a region involved in response selection; this pattern was not found. Additional studies are required to examine further the contribution of the left PMv in response selection. Nevertheless, the present results clearly demonstrate that no lateral premotor area is more strongly involved in stimulus-driven actions than the medial regions, which challenges the "medio-lateral gradient of control" hypothesis of Goldberg (1985). In this seminal article, Goldberg described two separate systems (medial and lateral) for the control of voluntary actions. The medial system was organized around the SMA/pre-SMA, was sensitive to internal events, and operated in an anticipatory mode, being primarily concerned with "volitional" actions. In contrast, the lateral system was organized around the lateral premotor cortex, was sensitive to the external world, and operated in a responsive, interactive manner rather than being focused on internal events. The present results do not support the idea of a dual system for the control of actions. Instead, we suggest that motor response selection (whether it is volitional or stimulus-driven) is accomplished within a single system involving both the pre-SMA and the rostral and caudal parts of PMv.

THE CASE OF PARS TRIANGULARIS
The role of Broca's area in language has been a central theme in language neuroscience since the nineteenth century. Multiple functions have been proposed to account for the complex and seemingly multifold contribution of this cortical area to language, including domain-specific functions such as syntactic processes (e.g., Grodzinsky and Friederici, 2006), and more general functions such as action understanding (e.g., Fadiga et al., 2009) and information integration (Hagoort, 2005). Of particular interest in the context of the current framework is the hypothesis that the anterior sector of Broca's area, the pars triangularis, is involved in a domain-general, response selection process (Thompson-Schill et al., 1997; Robinson et al., 2005). In the present study, the left pars triangularis was significantly active in sentence generation, a task that is contingent upon semantic processing, but not in sentence repetition, a task with a limited reliance on semantic processes. This finding challenges the hypothesis of a general role for this area in response selection. As noted above, our hypothesis was that regions involved in response selection should be modulated by selection mode (generation > repetition), but also, we expected these regions to be active in both sentence production tasks since both require selection and sequencing of motor programs. Admittedly, it is possible that selection mode (volitional, externally constrained) does not affect response selection. If that were the case, one would still expect a region involved in motor response selection to be active both during sentence generation and sentence repetition, a pattern that was not found in pars triangularis.
One possible interpretation of these results is that the left pars triangularis is involved in response selection by helping resolve response competition, consistent with Thompson-Schill et al. (1997), but only when competition occurs in the linguistic/semantic domain. In line with this interpretation, previous results have shown that pars triangularis is not active for selecting single words and single oral communicative gestures when selection is not dependent upon semantic or linguistic processes (Nagel et al., 2008; Tremblay and Gracco, 2010). Moreover, evidence for a role of pars triangularis in semantic/linguistic processing abounds (e.g., Poldrack et al., 1999; Wagner et al., 2001; Devlin et al., 2003; Amunts et al., 2004; Costafreda et al., 2006). For example, results of a combined fMRI/rTMS study show that the left pars triangularis is involved in the process of making semantic decisions about words presented visually, and further show that rTMS over the pars triangularis interferes with a semantic decision task (Devlin et al., 2003), thereby demonstrating the importance of this region for semantic processing. Taken together, these results suggest that one way in which pars triangularis contributes to language production is by helping resolve response competition when competition occurs in the semantic domain. At a more general level, the entirety of the IFG is likely to participate in a large number of neural networks that act upon language input for a variety of context-dependent purposes.

MOTOR VS. LEXICAL SELECTION IN SPOKEN LANGUAGE PRODUCTION
It could be argued that the patterns of response that were found in the left pre-SMA and PMv (generation > repetition) in the present study reflect lexical rather than motor response selection. Indeed, from sentence repetition to sentence generation, the demands on lexical selection processes increase because different lexical entries compete for expressing a given meaning. However, another explanation (that we favor) is that during spoken language production, competition for selection occurs simultaneously at multiple levels of representation (lexical, motor). Although inconsistent with serial cognitive models of spoken language production such as that of Levelt (1999), such an interpretation is in line with cascaded models of spoken language production, such as those of Peterson and Savoy (1998) and Morsella and Miozzo (2002), both of which postulate that activation spreads (cascades) from lexico-semantic representations to phonological-motor representations during the preparation for speech production, until a selection is made.
Neurobiological models also support the existence of multiple simultaneous processes. Previous biological studies suggest that lexical and motor competition/selection rely on (at least partially) distinct neural circuits (pre-SMA and PMv for motor selection, left middle temporal gyrus for lexical selection). For instance, based on a comprehensive meta-analysis of the literature on spoken language production, Indefrey and Levelt (2000, 2004) identified one region that appears to be critical for lexical selection: the central portion of the left middle temporal gyrus. In contrast, selection of non-speech oro-facial actions (which does not involve lexical selection) activates the pre-SMA and PMv, but not the central portion of the left middle temporal gyrus (Braun et al., 2001; Tremblay and Gracco, 2010). Moreover, studies on finger/hand response selection have shown that motor response selection occurs at the level of the pre-SMA and PMv (e.g., Deiber et al., 1996; Van Oostende et al., 1997; Hadland et al., 2001; Ullsperger and Von Cramon, 2001; Weeks et al., 2001; Lau et al., 2004, 2006). Finally, an imaging study in which a primed picture-naming paradigm was used to elicit overt verbal responses supports the claim of parallel processing through anatomically segregated circuits (de Zubicaray et al., 2006). In this study, semantically primed pictures were compared to unprimed pictures and activation was found in regions involved in both phonological retrieval and lexical-conceptual processing during picture-naming, as well as in the pre-SMA, suggesting multiple levels of competition during lexical access in spoken language production. In sum, both cognitive models and neurobiological data support the claim that selection occurs simultaneously at multiple levels during spoken language production.
Admittedly, the current study was not specifically designed to disentangle the possible levels of competition. It is therefore possible, although the evidence presented here suggests otherwise, that the activation patterns found in pre-SMA and PMv reflect lexical rather than motor competition. It is also possible, though unlikely, that lexical and motor competition processes are not dissociable anatomically. Additional studies are needed to characterize further the neural underpinnings of competition and selection during spoken language production, and to the extent possible, to disentangle the simultaneous competition mechanisms and the neural networks that implement them.

CONCLUSION
In general, the results of the present study help clarify the contribution of the pre-SMA, cingulate areas, PMv, and pars triangularis to the process of selecting motor responses in the context of sentence production. Further, the present results suggest that motor response selection during sentence production engages neural resources similar to those engaged in the selection of isolated words and oral motor gestures, in particular the left pre-SMA and the left rostral and caudal parts of PMv.