Original Research ARTICLE
Neural responses to complex auditory rhythms: the role of attending
- 1 Music Dynamics Laboratory, Center for Complex Systems and Brain Sciences, Florida Atlantic University, Boca Raton, FL, USA
- 2 Gazzaley Laboratory, Departments of Neurology and Physiology, University of California San Francisco, San Francisco, CA, USA
- 3 Human Cognition and Neural Dynamics Laboratory, Department of Psychology, Western Washington University, Bellingham, WA, USA
- 4 Human Brain and Behavior Laboratory, Center for Complex Systems and Brain Sciences, Florida Atlantic University, Boca Raton, FL, USA
- 5 Intelligent Systems Research Centre, University of Ulster, Derry, Northern Ireland
- 6 University MRI of Boca Raton, Boca Raton, FL, USA
The aim of this study was to explore the role of attention in pulse and meter perception using complex rhythms. We used a selective attention paradigm in which participants attended to either a complex auditory rhythm or a visually presented word list. Performance on a reproduction task was used to gauge whether participants were attending to the appropriate stimulus. We hypothesized that attention to complex rhythms – which contain no energy at the pulse frequency – would lead to activations in motor areas involved in pulse perception. Moreover, because multiple repetitions of a complex rhythm are needed to perceive a pulse, activations in pulse-related areas would be seen only after sufficient time had elapsed for pulse perception to develop. Selective attention was also expected to modulate activity in sensory areas specific to the modality. We found that selective attention to rhythms led to increased BOLD responses in basal ganglia, and basal ganglia activity was observed only after the rhythms had cycled enough times for a stable pulse percept to develop. These observations suggest that attention is needed to recruit motor activations associated with the perception of pulse in complex rhythms. Moreover, attention to the auditory stimulus enhanced activity in an attentional sensory network including primary auditory cortex, insula, anterior cingulate, and prefrontal cortex, and suppressed activity in sensory areas associated with attending to the visual stimulus.
Rhythms in music are complex sequences of acoustic events made up of repeating patterns of alternating sounds and silences that flow in time. Beat is a periodicity perceived in a rhythm, while metrical accent, or meter, refers to the perception of alternating stronger and weaker beats. Pulse refers to the most salient level of beats, i.e., the periodicity at which one is most likely to tap along with a rhythm. Figure 1A illustrates these concepts using the notation of Lerdahl and Jackendoff (1983). Pulse and meter are thought to correspond to temporal expectations, which are expectations for when rhythmic events should occur (e.g., Large and Jones, 1999; London, 2004). Pulse and meter develop over time through a process called induction, and rhythms that give rise to pulse and meter perception are called metrical rhythms. Metrical rhythms are easier to remember and reproduce than rhythms that are less likely to give rise to metrical percepts (see also Essens and Povel, 1985; Grahn and Brett, 2007). The degree of metricality affects the precision of the temporal encoding of rhythmic sequences (Grube and Griffiths, 2009), and pulse and meter are thought to enable synchronistic entrainment of body movements to complex musical rhythms (Large, 2000). Interestingly, pulse and meter persist in the face of considerable rhythmic complexity, such as syncopated rhythms (Figure 1B), in which event onset times violate temporal expectancies. For example, a periodic pulse is commonly perceived in syncopated rhythms even when no corresponding objective frequency exists among the acoustic events that comprise the rhythm (cf. Patel et al., 2005).
Figure 1. An illustration of the concepts of rhythm, beat, pulse, meter, and syncopation. S, Strong beat; W, Weak beat; (A) is a simple rhythm on a grid showing metrical structure and accent, with 16 beats at the eighth-note metrical level, eight (strong) beats at the quarter-note level (the pulse), and four beats at the half-note level; (B) is a syncopated rhythm shown on the same grid. The syncopated example shows violation of expectation based on metrical structure of strong/weak beats, with the absence of events on some strong beats and the presence of events on weak beats.
Investigations of the neural circuitry underlying rhythm and meter perception reveal overlap between brain regions sensitive to the production of rhythmic sequences and those related to movement (Dhamala et al., 2003; Chen et al., 2006, 2008b; Karabanov et al., 2009; Thaut et al., 2009). Rhythm perception recruits motor related areas even in the absence of overt movement, showing activity in premotor cortex (PMC) (Schubotz et al., 2000; Grahn and Brett, 2007; Chen et al., 2008a; Bengtsson et al., 2009; Grahn, 2009; Grahn and Rowe, 2009), cerebellum (Schubotz et al., 2000; Grahn and Brett, 2007; Chen et al., 2008a; Bengtsson et al., 2009), pre-supplementary motor area (pre-SMA) (Schubotz et al., 2000; Grahn and Brett, 2007; Bengtsson et al., 2009), supplementary motor area (SMA) (Schubotz et al., 2000; Grahn and Brett, 2007; Chen et al., 2008a; Bengtsson et al., 2009; Grahn, 2009; Grahn and Rowe, 2009), and basal ganglia (Schubotz et al., 2000; Grahn and Brett, 2007; Grahn, 2009; Grahn and Rowe, 2009). Basal ganglia and SMA have been implicated specifically in meter and pulse perception and have been shown to be more active while listening to metrical rhythms than while listening to rhythms not likely to induce a pulse percept (Grahn and Brett, 2007; Grahn, 2009; Grahn and Rowe, 2009). The role of basal ganglia in mediating pulse perception is further supported by the finding that Parkinson’s patients do not show the same benefit for beat-based rhythms as normal controls in a rhythm discrimination task (Grahn and Brett, 2009). Moreover, functional connectivity between basal ganglia (putamen) and both cortical motor areas (PMC and SMA) and auditory cortex is greater when listening to rhythms that have a perceived beat than when listening to non-beat rhythms (Grahn, 2009; Grahn and Rowe, 2009).
The foregoing results stress that the perception of pulse and meter involves integration across widespread auditory and motor related brain regions (Todd et al., 1999; Warren et al., 2005; Stewart et al., 2006; Zatorre et al., 2007). It has been proposed that the interaction between auditory and motor networks is mediated through the dorsal auditory pathway that leads from posterior superior temporal gyrus (planum temporale, PT) to prefrontal, premotor, and motor cortices (Warren et al., 2005; Zatorre et al., 2007). The dorsal auditory pathway is activated in the production of rhythmic sequences regardless of whether the rhythm was learned through auditory or visual modalities, suggesting that all rhythms learned for the purposes of production, at least with short-term training, are maintained through an auditory-motor representation (Karabanov et al., 2009). Both the PT and PMC have been shown to be recruited when tapping to increasingly metrical rhythms (Chen et al., 2006), to be functionally correlated when tapping to increasingly complex rhythms (Chen et al., 2008b), and to be active during passive listening to rhythms (Chen et al., 2008a).
Because of the role that beta band activity plays in motor processes (Stancák and Pfurtscheller, 1996; McFarland et al., 2000) and in long-range coordination of brain areas (Kopell et al., 2000; Brovelli et al., 2004), it has been suggested that auditory responses might be modulated by the motor system via high-frequency activity in the beta band (Iversen et al., 2009). Moreover, recent studies have found that the time course of high-frequency neural activity in certain brain areas provides a good temporal correlate of pulse and meter perception (Snyder and Large, 2005; Fujioka et al., 2009; Iversen et al., 2009). These results are consistent with the theory of dynamic attending, which hypothesizes that neural oscillation underlies the perception of pulse and meter (Large and Kolen, 1994; Large, 2000), targeting attentional energy toward expected points in time (Large and Jones, 1999). Dynamic attending is supported by a number of studies that have observed perceptual facilitation of temporally expected events (McAuley and Kidd, 1995; Jones and Yee, 1997; Large and Jones, 1999; Barnes and Jones, 2000; Jones et al., 2002; Jones and McAuley, 2005; Quene and Port, 2005). We reasoned that, if high-frequency bursting mediates not only attention to events in rhythmic sequences but also the temporal coordination between brain areas, then attention may play a role in coordinating the interaction between auditory and motor areas in pulse and meter perception (Large and Snyder, 2009). Neural responses to metrical changes (Geiser et al., 2009) and behavioral responses to tempo changes have been shown to be attention dependent (Repp and Keller, 2004). Thus, it is possible that pulse and meter perception in complex, syncopated rhythms is also attention dependent. This hypothesis leads to several predictions. 
Here, we ask whether differences in functional activation may be observed in auditory and motor areas depending on whether attention is directed toward or away from a rhythmic stimulus. To test this hypothesis, participants were instructed to selectively attend to either a complex rhythmic sequence or a visually presented list of words so that activation related specifically to auditory attention to complex rhythms could be observed.
The current experiment was designed to uncover neural activation associated with attending to complex rhythms. Syncopated stimuli were constructed such that observed neural correlates of pulse perception would necessarily reflect endogenous processes, not merely responses to acoustic events in the rhythmic stimulus. For these stimuli, it was hypothesized that activations in auditory and motor areas associated with pulse and meter perception would depend on whether attention was directed toward or away from the rhythm. Specifically, activity in motor areas thought to support pulse perception, such as basal ganglia and SMA, was expected to be seen when participants were instructed to selectively attend to the rhythms but not when participants were instructed to attend to the visual stimuli. Selective attention to the auditory rhythms was also expected to reveal modality related differences in areas known to be involved in attention such as anterior cingulate (ACC), which, given its role in error detection (Bush et al., 2000), could be implicated in temporal expectancy as well. While some behavioral (Duncan et al., 1997) and electrophysiological (measuring MMN) (Alho et al., 1994) studies have suggested independent processing of simple visual and auditory stimuli using attention monitoring tasks, we predicted that selectively attending to complex rhythms in an auditory working memory task would modulate activity in cortex specific to the modality (e.g., greater primary auditory activity seen when attending to the rhythms) (Woodruff et al., 1996; Johnson and Zatorre, 2006; Lakatos et al., 2008). Finally, because we used syncopated rhythms with no cues for pulse, pulse induction depended in part upon repetition of the rhythmic pattern and was expected to unfold over two or more pattern repetitions. Therefore, it was hypothesized that activations in pulse-related areas, such as basal ganglia and SMA, would be observed only after a sufficient number of pattern repetitions.
Materials and Methods
Thirteen right-handed participants, five female and seven male (aged 20–46 years, M = 28.83 years), gave informed consent before participating in the study. Musical experience ranged from 0 to 24 years (one participant had 24 years of experience playing music, three had 20 years, one had 7 years, one had 2 years, and seven had none).
Stimuli and Task
Auditory and visual stimuli were presented simultaneously in the fMRI scanner in two conditions: (1) auditory and (2) visual. Participants were instructed to either (1) perform an auditory working memory task in which they attended to rhythmic patterns while ignoring visual stimuli or (2) perform a visual working memory task in which they attended to visual stimuli while ignoring the rhythms. Performance on auditory and visual reproduction tasks was used to gauge whether participants successfully attended to the appropriate stimulus. This allowed us to directly compare activity associated with attending to complex rhythms with activity related to passive exposure to rhythmic stimuli. Two stages of rhythm perception were investigated: an early phase during which pulse and meter are first induced and a later phase in which the listener has developed a stable pulse and meter percept.
Ten complex rhythms were based on a metrical grid with 16 beats at the eighth-note metrical level, and eight (strong) beats at the quarter-note level (the pulse). Each of the eighth-note level beats was the possible temporal location of an acoustic event. Acoustic events were 440 Hz pure tones with a duration of 80 ms and 10 ms rise and fall times. The inter-beat-interval (IBI) at the eighth-note level was 250 ms and each pattern was 4 s long. Syncopated patterns were constructed as follows. Each pattern contained eight tones. The first tone always occurred on the first beat (which was a strong beat) and a rest always occurred on the final beat (a weak beat). Patterns were constructed in this way to facilitate the perception of the pattern repetition. The remaining seven tones were distributed such that half of the tones of the pattern occurred on strong beats and half occurred on weak beats. Thus, each pattern was expected to give rise to a basic pulse at 500 ms (i.e., the quarter-note level of the metrical grid) but would be highly syncopated (half of the pulse times would not be marked by a tone onset; see Figure 2). Fourier analysis of the rhythms verified that none of the patterns contained significant energy at the pulse frequency (i.e., 1/0.500 s = 2 Hz). A higher pitched 880 Hz tone began and ended the interval in which participants were asked to reproduce the rhythm. The auditory stimulus was adjusted to a comfortable listening level.
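The construction rules and the Fourier check described above can be illustrated with a minimal sketch. It assumes that strong beats are the even-numbered slots of the 16-slot grid, that tone onsets are treated as unit impulses, and that tones are placed at random within the stated constraints; the function names and the random placement are illustrative and are not the procedure used to build the actual stimuli.

```python
import cmath
import random

GRID_SLOTS = 16   # eighth-note positions in one 4 s pattern
IBI_S = 0.250     # inter-beat interval at the eighth-note level (s)
PULSE_HZ = 2.0    # quarter-note pulse frequency: 1 / 0.500 s


def make_syncopated_pattern(rng):
    """Return the sorted grid slots (0-15) that carry the eight tone onsets."""
    strong = range(2, GRID_SLOTS, 2)    # strong beats other than slot 0
    weak = range(1, GRID_SLOTS - 1, 2)  # weak beats other than the final rest (slot 15)
    # Slot 0 always carries a tone; add 3 more strong-beat and 4 weak-beat tones,
    # so that half of the eight tones fall on strong beats and half on weak beats.
    return sorted([0] + rng.sample(strong, 3) + rng.sample(weak, 4))


def pulse_component(slots):
    """Magnitude of the Fourier component at PULSE_HZ for unit impulses at the onsets."""
    return abs(sum(cmath.exp(-2j * cmath.pi * PULSE_HZ * IBI_S * k) for k in slots))


pattern = make_syncopated_pattern(random.Random(0))
print(pattern, pulse_component(pattern))
```

Under these assumptions each onset contributes +1 (strong slot) or -1 (weak slot) to the 2 Hz component, so four tones on strong beats and four on weak beats cancel, whereas an unsyncopated pattern with a tone on every strong beat would not.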
Figure 2. Auditory stimuli consisted of 10 syncopated rhythms with eight acoustic events, each placed at 1 of 16 possible event locations (i.e., eighth-note level beats with an IBI = 250 ms). During the pulse synchronization experiment, participants were asked to tap quarter-note level beats [i.e., the pulse, corresponding to strong beats (S), IBI = 500 ms].
Participants looked at a fixation cross surrounded by three-letter words (see Figure 3) while they listened to the rhythmic patterns. Words were randomly selected from a list of 300 three-letter English words. The visual stimulus was arranged in such a way that the participant could see the entire word list even though s/he was fixating on the cross. The same word list/auditory pattern pairing was used in both auditory and visual conditions.
In the auditory condition, the participant was instructed to attend to the rhythmic pattern, which repeated for six cycles (attend = 24 s), mentally rehearse the rhythm for the duration of three cycles (rehearsal = 12 s), and reproduce the rhythm (using the right hand) for three cycles (reproduction = 12 s). The rhythm reproductions corresponded to the events illustrated in Figure 2. The stimulus presentation portion of the experiment was divided into two parts, termed attend 1 (first 3 repetitions – 12 s) and attend 2 (second 3 repetitions – 12 s). Stimulus presentation was continuous through both attend 1 and attend 2.
In the visual condition, the participant was instructed to attend to the words surrounding the fixation cross (attend = 24 s), mentally rehearse the words once they disappeared (rehearsal = 12 s), and then verbally report the remembered words (reproduction = 12 s).
Prior to participation in the fMRI experiment, participants were tested in a preliminary pulse synchronization experiment (cf. Patel et al., 2005). The goal of this behavioral experiment was to determine the extent to which each participant was able to perceive the pulse of the complex rhythmic stimuli. Participants were seated in an IAC sound-attenuated experimental chamber wearing Sennheiser HD250 linear II headphones. The rhythms were presented by a custom Max/MSP program running on a Macintosh G3 computer. Participants tapped on a Roland Handsonic HPD-15 drumpad that sent the time and velocity of the taps via MIDI (Musical Instrument Digital Interface) to the Max/MSP program. The experimenter instructed participants to listen to the pattern and begin tapping the pulse when they could “‘feel’ the beat” at a rate equal to how they would “normally tap (their) foot to a song.” The experimenter demonstrated tapping the pulse for two practice patterns (not used in the study) at a rate corresponding to the pulse (strong beats) illustrated in Figure 2. Participants were encouraged to practice pulse synchronization while listening to the practice patterns. Once they felt comfortable synchronizing with the practice patterns, participants began the experiment.
Magnetic Resonance Imaging
As a correlate for neural activity, changes in blood oxygenation (BOLD response) were measured using echo-planar imaging on a 3.0-T Signa Scanner equipped with real time fMRI capabilities (General Electric Medical Systems, Milwaukee, WI, USA). Echo-planar images were collected using a single shot, gradient-echo, echo-planar pulse sequence [field of view (FOV) = 24 cm, echo time (TE) = 35 ms, flip angle (FA) = 90°, in plane matrix = 64 × 64]. All images were collected using a sparse sampling technique with an effective repetition time (TR) of 12 s. Adequate coverage of the brain was achieved by collecting 30 interleaved 5 mm thick axial slices with no spacing between (voxel size = 3.75 mm × 3.75 mm × 5 mm). Immediately following the functional imaging, high resolution anatomical spoiled gradient-recalled at steady state (SPGR) images (5 mm thick, no spacing, number of excitations = 2, TE = in phase, TR = 325 ms, FA = 90°, in plane resolution 256 × 256, bandwidth = 31.25) were collected at the same slice locations as the functional images. Using an eight-channel head coil another set of high resolution FSPGR images (1 mm thick, no spacing, 180 locs per slab, TE = min full, TR = prep time 400 ms, FA = 12°, in plane resolution 256 × 256, bandwidth = 31.25) were collected.
A sparse sampling technique was used in the scanner to increase the signal response from baseline (which was silence) and to avoid non-linear interaction of the scanner sound and the auditory stimulus (see Figure 4; Hall et al., 1999). Participants were presented with six 10-min blocks (three auditory attend and three visual attend, in counterbalanced order), with 10 trials in each block. A custom Visual Basic 5 program running on a Dell Optiplex GX260 was used to generate both auditory and visual stimuli. Sound stimuli were presented using custom noise-attenuating headphones (Avotec Inc., Stuart, FL, USA). Visual stimuli were presented through a set of fiber optic goggles (Avotec Inc., Stuart, FL, USA) mounted to the head coil. Participants were instructed to tap with their right index finger on an MR compatible button box.
Figure 4. A schematic representation of the fMRI scanning session for both auditory and visual conditions. A sparse sampling approach was adopted by clustering image acquisition into a 2 s interval preceded by 10 s of scanner silence. This approach gave an effective TR of 12 s.
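The timing implied by this design can be laid out in a short sketch. The attend 1, attend 2, rehearse, and reproduce phases (12 s each) and the 10 s silence plus 2 s acquisition come from the text; the fifth 12 s "rest" phase is an assumption, inferred only from the fact that 10 trials fill a 10-min block, and the placement of each acquisition at the end of its phase is likewise assumed.

```python
# Phase names follow the text; the trailing "rest" slot is an assumption.
PHASES = ["attend 1", "attend 2", "rehearse", "reproduce", "rest"]
EFFECTIVE_TR_S = 12   # one sparse-sampling cycle
SILENT_S = 10         # scanner silence preceding each acquisition
ACQUISITION_S = 2     # clustered image acquisition
TRIALS_PER_BLOCK = 10


def trial_schedule(trial_index):
    """Return (phase, phase_start_s, acquisition_start_s) for each phase of one trial."""
    trial_start = trial_index * len(PHASES) * EFFECTIVE_TR_S
    schedule = []
    for i, phase in enumerate(PHASES):
        phase_start = trial_start + i * EFFECTIVE_TR_S
        # Assumed: the 2 s acquisition falls in the last 2 s of each 12 s phase.
        schedule.append((phase, phase_start, phase_start + SILENT_S))
    return schedule


block_duration_s = TRIALS_PER_BLOCK * len(PHASES) * EFFECTIVE_TR_S

for row in trial_schedule(0):
    print(row)
print("block duration (min):", block_duration_s / 60)
```

With these assumptions each trial lasts 60 s and yields one volume per phase, which is consistent with the reported block length.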
Performance on the pulse synchronization experiment was measured by calculating the synchronization coefficient, also called vector strength (Batschelet, 1981; Pikovsky et al., 2001), which quantified how well taps were time locked to the perceived pulse of the rhythms. Synchronization coefficients ranged from 0 (no synchronization) to 1 (perfect synchronization). Performance on the rhythm reproduction task was measured by correlating the participants’ inter-tap-intervals (ITI) with the inter-onset-intervals (IOI) of the rhythms.
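Both behavioral measures can be sketched in a few lines. The vector strength below is the standard mean resultant length of tap phases relative to the pulse period; the Pearson correlation between inter-tap and inter-onset intervals assumes that taps and onsets have already been paired one to one, which the text does not specify. The function names and the example tap times are hypothetical.

```python
import cmath
import math


def vector_strength(tap_times_s, period_s):
    """Mean resultant length of tap phases relative to a pulse of the given period."""
    phasors = [cmath.exp(2j * cmath.pi * t / period_s) for t in tap_times_s]
    return abs(sum(phasors)) / len(phasors)


def intervals(times_s):
    """Successive differences: ITIs for taps, IOIs for stimulus onsets."""
    return [b - a for a, b in zip(times_s, times_s[1:])]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: taps near a 500 ms pulse, and a reproduced rhythm.
taps = [0.02, 0.51, 0.99, 1.52, 2.01]
print(vector_strength(taps, period_s=0.5))

onsets = [0.0, 0.5, 0.75, 1.5, 2.0]
reproduced = [0.0, 0.48, 0.77, 1.46, 2.03]
print(pearson_r(intervals(reproduced), intervals(onsets)))
```

A constant tapping lag does not lower the vector strength, since only the consistency of tap phase matters; taps that alternate between on-pulse and mid-pulse positions drive it toward zero.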
Reproduction of the rhythmic patterns was used to gauge whether participants had successfully attended to the auditory rhythms. In the attend auditory condition, trials in which the participant correctly reproduced the pattern were included in the fMRI analysis. Exclusion criteria for rhythm reproduction trials were based in part on the correlation between the participants’ ITIs and the IOIs of the rhythms. In addition, two judges listened to each reproduction and agreed on whether or not participants had tapped the qualitatively correct pattern. The judgment allowed us to retain four trials in which the participant tapped the correct pattern but did not have a high ITI/IOI correlation (e.g., because they started tapping in the middle of the pattern). Using these criteria, 103/390 trials were judged unsuccessful, and therefore excluded from the fMRI analysis. However, this did not represent a sufficient number of unsuccessful rhythm reproductions to enable comparison of trials in which reproduction was successful to trials in which participants were not able to reproduce the rhythms accurately. Similarly, attend visual trials in which participants remembered four or more words were included in the fMRI analysis. Using this criterion, 46/390 trials were unsuccessful and therefore excluded from the analysis.
Except where noted, data analysis was performed using AFNI (Cox, 1996; Cox and Hyde, 1997) running on an Apple G5. Functional data sets were corrected for motion and smoothed spatially by convolution with a Gaussian kernel (FWHM 4 mm). Data were high-pass filtered at 1/90 s (∼0.0111 Hz) to correct for low frequency drift. A hemodynamic response function (HRF) was convolved with a binary vector representing the off/on timing of each condition to create a model time series. Multiple regression was used to determine the contribution of the model to the data at each voxel. Functional images were registered to a template brain in the coordinate system of Talairach and Tournoux (1988) with SPM2 (Wellcome Department of Imaging Neuroscience, London) in a two-step process. First, the high resolution SPGR image of each participant was registered to the template brain. Second, the same transformation matrix was applied to each of the low-resolution functional images. Group analysis was conducted by submitting individual beta weights to one-sample t-tests. To correct for multiple comparisons, a Monte Carlo simulation was conducted to determine the random distribution of voxel cluster sizes for a given threshold (for similar approaches, see Ledberg et al., 1998). A corrected alpha of p < 0.002 was achieved by the combination of a per voxel threshold of p < 0.01 and a cluster size of eight contiguous voxels (512 mm³).
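The model-building and regression steps can be illustrated with a minimal single-voxel, single-condition sketch. The gamma-shaped HRF and its parameters, the 1 s time grid, and the function names are assumptions chosen for illustration; they are not AFNI's implementation, and the actual analysis used multiple regressors and sparse-sampled data.

```python
import math


def gamma_hrf(t_s, shape=6.0, scale_s=1.0):
    """Gamma-shaped haemodynamic response (illustrative parameters, peak near 5 s)."""
    if t_s <= 0:
        return 0.0
    return (t_s ** (shape - 1) * math.exp(-t_s / scale_s)) / (
        math.gamma(shape) * scale_s ** shape
    )


def model_time_series(boxcar, kernel_len_s=30):
    """Convolve a 0/1 condition vector (1 s steps) with the HRF, truncated to input length."""
    kernel = [gamma_hrf(t) for t in range(kernel_len_s)]
    out = [0.0] * len(boxcar)
    for i, on in enumerate(boxcar):
        if on:
            for j, k in enumerate(kernel):
                if i + j < len(out):
                    out[i + j] += k
    return out


def fit_beta(signal, regressor):
    """Ordinary least squares for signal = intercept + beta * regressor."""
    n = len(signal)
    mx, my = sum(regressor) / n, sum(signal) / n
    sxx = sum((x - mx) ** 2 for x in regressor)
    sxy = sum((x - mx) * (y - my) for x, y in zip(regressor, signal))
    beta = sxy / sxx
    return beta, my - beta * mx


# Condition "on" from 12 s to 36 s within a 120 s run (hypothetical timing).
boxcar = [1 if 12 <= t < 36 else 0 for t in range(120)]
regressor = model_time_series(boxcar)
voxel = [100.0 + 3.0 * r for r in regressor]  # synthetic, noise-free voxel signal
print(fit_beta(voxel, regressor))
```

The fitted beta is the per-voxel weight that would then be carried forward, one value per participant and condition, into the group-level one-sample t-tests.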
In the preliminary pulse synchronization experiment, the mean time to begin pulse synchronization was 1.37 pattern repetitions (equal to 5.48 s, SD = 0.93 s). Therefore, participants perceived and attempted to synchronize with the pulse roughly a third of the way through the second repetition of the pattern. A wide range of synchronization coefficients was observed in pulse synchronization (0.26 ≤ rsync ≤ 0.86, mean rsync = 0.60). In the fMRI experiment, performance on the rhythm reproduction task varied as well (0.32 ≤ rcorr ≤ 0.88, mean rcorr = 0.62). Thus, some participants had an easier time perceiving and synchronizing to the pulse and some participants had an easier time reproducing the rhythmic patterns. Correlation analysis revealed a significant relationship between performance on the pulse synchronization task and rhythm reproduction (r = 0.74, p = 0.0064, Figure 5) after one outlier was removed (r = 0.50, p = 0.0804 when the outlier was included). Thus, the ability to perceive the pulse of a complex rhythm predicted the ability to accurately reproduce the rhythm, as has been previously observed (Essens and Povel, 1985). On average, subjects remembered slightly more than half of the words during the visual reproduction task (mean = 5.23 words, SD = 1.66 words).
Figure 5. Scatter plot showing pulse synchronization coefficients and rhythm reproduction values. Each subject’s data is represented by a blue cross, with the outlier circled in red. The solid green line represents the regression line with the outlier removed (p = 0.0064) and the dotted green line represents the regression line with the outlier included (p = 0.0804).
In evaluating the imaging results, the auditory conditions were first compared to rest. BOLD signal increases during auditory attend 1 (Figure 6A; Table 1) were restricted to bilateral superior temporal gyrus (STG, BA 22, 41) in areas compatible with primary auditory cortex. Similar activity in primary and secondary auditory areas (BA 41, 22) was associated with auditory attend 2 (Figure 6B; Table 1). Additionally, for auditory attend 2 we observed an increase in the BOLD response in motor areas including left SMA, right basal ganglia (caudate, globus pallidus, extending into nucleus accumbens), and left postcentral gyrus (BA 3).
Figure 6. Brain regions where BOLD signal was significantly different during (A) the auditory attend 1 condition compared to rest (p < 0.002 corrected) and (B) auditory attend 2 compared to rest. Red to yellow colored voxels represent brain areas where auditory attend 1 > rest and attend 2 > rest. Blue areas show where auditory attend 1 < rest and auditory attend 2 < rest. The coronal slice is shown with the left (L) on the left side of the figure. The colorbar reflects t-values. STG, superior temporal gyrus; SMA, supplementary motor area.
Auditory rehearse (Figure 7; Table 2) was associated with BOLD increases in motor areas including bilateral SMA, bilateral basal ganglia (right side caudate, lentiform nucleus, and putamen, extending into nucleus accumbens; left side lentiform nucleus, putamen, lateral globus pallidus), right precentral gyrus (BA 6), left postcentral gyrus (BA 3, 2), cerebellum (uvula, culmen), left prefrontal cortex, and secondary auditory cortices. During auditory reproduce (Figure 8; Table 2), increased activation was observed in left postcentral gyrus [BA 3, 2, extending into precentral gyrus (BA 4)], ventral PMC, SMA, inferior parietal lobe (IPL, BA 40), left basal ganglia (lentiform nucleus, putamen, lateral globus pallidus), right cerebellum (declive, culmen, uvula), bilateral inferior frontal gyrus (IFG, BA 44), as well as in primary and secondary auditory areas (mainly left lateralized).
Figure 7. Brain regions where BOLD signal was significantly different during the auditory rehearse condition compared to rest (p < 0.002 corrected). Red to yellow colored voxels represent brain areas where auditory rehearse > rest. Blue areas show where auditory rehearse < rest. The coronal slice is shown with the left (L) on the left side of the figure. The colorbar reflects t-values. SMA, supplementary motor area; PMC, premotor cortex; LN, lentiform nucleus; Put, putamen.
Figure 8. Brain regions where BOLD signal was significantly different during the auditory reproduce condition compared to rest (p < 0.002 corrected). Red to yellow colored voxels represent brain areas where auditory reproduce > rest. Blue areas show where auditory reproduce < rest. The axial slice is shown with the left (L) on the left side of the figure. The colorbar reflects t-values. SMA, supplementary motor area; STG, superior temporal gyrus; IPL, inferior parietal lobe; PMC, premotor cortex; LN, lentiform nucleus; Put, putamen.
Activations associated with attending to complex auditory rhythms were revealed by comparing auditory attend 2 with visual attend 2 (Figure 9; Table 3). Increased BOLD responses associated with auditory attention were seen in right basal ganglia (caudate), left primary auditory cortex, left superior frontal gyrus (BA 8, extending into pre-SMA), and right medial prefrontal cortex (extending to bilateral ACC and cingulate).
Figure 9. Brain regions where BOLD signal was greater during the auditory attend 2 compared to visual attend 2 condition (p < 0.002 corrected). Red to yellow colored voxels represent brain areas where auditory attend 2 > visual attend 2. The coronal slice is shown with the left (L) on the left side of the figure. The colorbar reflects t-values. STG, superior temporal gyrus; pre-SMA, pre-supplementary motor area; MPFC, medial prefrontal cortex; ACC, anterior cingulate cortex.
Activity associated with attending to a rhythm once a pulse percept had sufficient time to fully develop was uncovered by comparing auditory attend 2 with auditory attend 1 (Figure 10; Table 3). Increased BOLD responses were seen in left IFG [BA 47, extending into bilateral basal ganglia (caudate), nucleus accumbens], left STG [BA 22, 41, extending to insula, basal ganglia (lentiform nucleus, putamen)], left postcentral gyrus (BA 3, extending into primary and secondary auditory cortex), left medial prefrontal cortex [extending to ventral ACC, cingulate (BA 24, 32)], and left dorsal ACC (BA 24).
Figure 10. Brain regions where BOLD signal was greater during the auditory attend 2 compared to auditory attend 1 condition (p < 0.002 corrected). Red to yellow colored voxels represent brain areas where auditory attend 2 > auditory attend 1. The coronal slice is shown with the left (L) on the left side of the figure. The colorbar reflects t-values. STG, superior temporal gyrus; LN, lentiform nucleus; Put, putamen; ACC, anterior cingulate cortex.
Rehearsing rhythms compared to rehearsing words (Figure 11; Table 3) revealed greater activity in bilateral basal ganglia (lentiform nucleus, putamen, caudate), left medial prefrontal cortex [BA 9, extending to bilateral cingulate, ACC (BA 24, 32), and pre-SMA], left postcentral gyrus (BA 3), and left primary auditory cortex. Results for relevant visual conditions (attend 2 and rehearse) compared to rest are reported in Table 4.
Figure 11. Brain regions where BOLD signal was significantly greater during the auditory rehearse compared to visual rehearse condition (p < 0.002 corrected). Red to yellow colored voxels represent brain areas where auditory rehearse > visual rehearse. The coronal slice is shown with the left (L) on the left side of the figure. The colorbar reflects t-values. STG, superior temporal gyrus; pre-SMA, pre-supplementary motor area; SFG, superior frontal gyrus; LN, lentiform nucleus; Put, putamen; ACC, anterior cingulate cortex.
In this experiment, we observed that brain activations related to selective attention, rehearsal, and reproduction of complex auditory rhythms unfolded over time in a meaningful way. Attending to the first three repetitions of a complex rhythmic pattern activated primary sensory areas. During the next three repetitions of the pattern, the activation became more complex. Areas related to pulse and meter perception (Grahn and Brett, 2007, 2009; Grahn, 2009; Grahn and Rowe, 2009), such as basal ganglia and SMA, were recruited as the participant attended to additional repetitions of the pattern. After the external stimulus stopped, the pattern was maintained by these same structures with the added support of the dorsal auditory pathway (PT, PMC, prefrontal cortex) and insula. Reproduction of the rhythmic pattern recruited primary auditory sensory areas (mainly lateralized to the left), insula, and the dorsal auditory pathway, in addition to motor areas, which may be indicative of the utilization of an auditory sensory memory.
Basal ganglia activity was observed when subjects were instructed to attend to the rhythms but not when they were instructed to attend to the visual stimulus. This finding supports the hypothesis that attention is necessary to recruit basal ganglia when listening to complex rhythms. Basal ganglia activity was observed only after the rhythms had been presented a sufficient number of times for the listener to perceive a pulse. Because there is evidence linking pulse perception to basal ganglia activation (Grahn and Brett, 2007, 2009; Grahn, 2009; Grahn and Rowe, 2009), the current observations suggest that attention may be necessary for the induction of a pulse percept when listening to complex (syncopated) rhythms that contain no energy at the pulse frequency. This would be consistent with the role of attention in more complex rhythmic tasks (Repp and Keller, 2004; Geiser et al., 2009), though this prediction needs further testing in future behavioral experiments. Moreover, basal ganglia have also been discussed as playing a role in “training” more frontal areas during learning of musical sequences (Leaver et al., 2009). In agreement with this notion, basal ganglia were found to remain active during rhythm rehearsal (and more so than during word rehearsal), when frontal areas were also recruited to maintain and learn the rhythm in preparation for reproduction.
Supplementary motor area and PMC activations were found when participants were instructed to direct their attention toward the auditory rhythms. However, increased activation in these areas was not found in comparison with the visual attend condition, possibly because SMA and PMC activations were also observed during the attend visual conditions (compared to rest, see Table 4). Activation of these areas during visual attend conditions could reflect their involvement in the visual working memory task or indicate automatic engagement of the motor system in response to rhythm presentation regardless of the modality to which attention is directed. SMA and PMC have been implicated in the semantic processing of words (Chee et al., 1999) and in the maintenance of verbal working memory (Smith and Jonides, 1996, 1998). Furthermore, in the current study, rehearsal of the words was also associated with SMA and PMC activation (when no stimulus was present). Thus, while automatic engagement of these areas during rhythm presentation cannot be ruled out, these results suggest that the activations seen in SMA and PMC during the attend visual condition were due to the role of the motor system in perception and working memory for verbal information. The activity of the SMA and PMC during both visual and auditory attend conditions may therefore reflect the inherent role of the motor system in verbal and rhythm perception, respectively.
Instructions to attend to the auditory rhythms additionally led to greater activity in an attentional sensory network including primary auditory cortex, insula, ACC, and prefrontal cortex, indicating the role of attention in modulating activity in primary sensory areas through higher-level cognitive areas involved in learning complex sequences. Similar areas, such as STG, insula, and prefrontal cortex, have also been correlated with selective attention to different streams in polyphonic music (Janata et al., 2002). Dorsal ACC activity was seen when comparing the auditory to visual attend and rehearse conditions and when comparing the later and earlier phases of the auditory attend condition. Consistent with its involvement in on-line monitoring of expectancies (Bush et al., 2000), the dorsal portion of the ACC may be related to temporal expectancy in the complex rhythms presented in this study. ACC has also been correlated with tracking dynamic changes in tonality in autobiographically salient musical excerpts (Janata, 2009). In addition, in line with previous work, selective attention to the auditory stimulus (Figure 9) enhanced activity in auditory sensory areas (Woodruff et al., 1996; Johnson and Zatorre, 2006) and suppressed activity in sensory areas associated with attending to the visual stimulus (Johnson and Zatorre, 2006; Lakatos et al., 2008). Previous work has suggested independent processing for simple auditory and visual stimuli using dual task (Duncan et al., 1997) and oddball detection (Alho et al., 1994) paradigms. However, suppression of auditory cortex has been observed during visual working memory tasks (Crottaz-Herbette et al., 2004) and selective attention tasks (Johnson and Zatorre, 2006). The current study used more demanding working memory tasks with complex stimuli in a selective attention paradigm.
While we did not observe suppression of auditory areas during the visual attend condition, we did observe greater activation of auditory cortex during selective attention to the auditory stimulus. The current findings provide evidence that selective attention for complex stimuli and tasks results in differential activity depending on the attended modality and that there is an asymmetry in suppression of activity in the unattended modality.
Verbal working memory has been modeled as a phonological loop that consists of articulatory rehearsal and phonological store components (Baddeley, 1986). Paulesu et al. (1993) attributed the articulatory rehearsal component to activation in Broca’s area and the phonological store to activity in supramarginal gyrus. Smith and Jonides (1996, 1998) observed activity in Broca’s area along with activation of PMC and SMA during verbal working memory. In the current study, activity was observed in PMC, SMA, and Broca’s area during both rhythm rehearsal and visual rehearsal conditions, which could be indicative of subvocal rehearsal seen in verbal working memory tasks. Similar activations due to verbal working memory in both auditory and visual rehearsal conditions would explain why these areas are not seen in the contrast between the two conditions. However, rehearsal of rhythms compared to rehearsal of words does result in other areas of activation, including basal ganglia, dorsal and ventral ACC, and primary auditory cortex, showing that maintaining a rhythmic pattern recruits additional areas that may be related to pulse perception, temporal expectancy, and auditory memory.
As predicted, activity in pulse-associated areas (basal ganglia and SMA) was seen during the second half of stimulus presentation in the auditory attend condition, whereas activation in these areas was not seen during the first three repetitions of the rhythms. Together with our observation that pulse synchronization begins during the second pattern repetition, this represents additional evidence that these functional activations reflect pulse perception. On the basis of these data alone it cannot be ruled out that the observed activation of motor areas during attention to the auditory stimuli is related to imagination and preparation of the subsequent rehearsal/reproduction stages. However, our interpretation would be consistent with previous findings that these circuits are associated with pulse and meter perception during passive listening in the absence of any motor demands (Schubotz et al., 2000; Grahn and Brett, 2007; Chen et al., 2008a; Bengtsson et al., 2009; Grahn, 2009; Grahn and Rowe, 2009). The role of the frontal motor circuit in rhythm generation is not surprising given the established role of these motor areas in human timing (Meck et al., 2008), selective attention to time (Coull et al., 2004), and sequencing (reviewed in Nachev et al., 2008). In light of this previous work, the current observations further support the growing understanding that pre-motor regions such as the SMA (Chen et al., 2008a) and basal ganglia are important for the representation of pulse and rhythm even in the absence of movement (Grahn and Brett, 2007; Zatorre et al., 2007). Here, this finding has been extended to demonstrate that the proposed auditory to motor mapping is not automatic for syncopated rhythms, but requires attention to the rhythmic stimulus and requires time to develop.
In general, the current results confirm previous findings and illustrate the fundamental importance of an extended motor network in pulse and meter perception (Grahn and Brett, 2007; Chen et al., 2008a; Grahn, 2009; Grahn and Rowe, 2009). Integrated auditory–motor activity corresponding to meter may help explain the universal subjective experience of the spontaneous urge to move to rhythmic music. This interaction may also explain why the most common tempo for popular dance music (van Noorden and Moelants, 1999), preferred and spontaneous tapping rates (Fraisse, 1982), and preferred gait frequency are all well matched (averaging around 2 Hz) (for review see Todd et al., 1999), as well as the benefit that rhythmic stimuli have on those with movement disorders (McIntosh et al., 1997; Thaut et al., 1997; Whitall et al., 2000). Moreover, auditory–motor interactions are reciprocal such that movement can influence meter perception in infants (Phillips-Silver and Trainor, 2005) and adults (Phillips-Silver and Trainor, 2007). Rhythm perception can even be influenced without any overt motion by the illusory sensation of movement induced through vestibular stimulation (Trainor et al., 2009).
It was observed that attention modulates the brain networks responsible for the perception of complex, syncopated rhythms. Most significantly, the current observations show that attention is necessary for the activation of basal ganglia when listening to complex rhythms that do not contain energy at the pulse frequency. Whether attention is similarly necessary when such a frequency component is exogenously present is not yet clear, but previous work suggests that the answer to this question may be “no” (Grahn and Brett, 2007; Chen et al., 2008a). Additionally, we observed that for syncopated rhythms, sufficient time is needed for basal ganglia activations to develop. How can we incorporate these observations with our current knowledge of pulse, meter, and attention? Previous empirical and theoretical work suggests that pulse and meter are essentially a form of attentional allocation, serving to direct processing resources toward expected points in time; and performance on change detection tasks confirms that perception is facilitated for metrically regular sequences in both adults (Jones and Yee, 1997; Jones et al., 2002) and infants (Bergeson and Trehub, 2006; Trehub and Hannon, 2009). Within this context, the current results suggest that attention may be responsible not only for the temporal coordination of neural activity with external events, but also for the integration of brain regions necessary for task performance. This raises the possibility that both aspects of attention may be manifest in neural activity that coordinates brain areas in the perception of meter and rhythm. Future work is needed to understand the mechanisms mediating dynamic attending and the relationship between rhythmic entrainment and network coordination.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
This work was supported by NSF CAREER award BCS-0094229 and a Fulbright visiting research chair awarded to Edward W. Large, and by NIMH grant MH19116, NINDS grant NS48229, NIMH grant MH080838, and the Pierre de Fermat Chair to Scott J. A. Kelso. We would like to thank Judith Becker, Petr Janata, Armin Fuchs, Bruno Repp, Summer Rankin, and Marc Velasco for their helpful comments on the manuscript and Steve Sedita for his advice on analysis.
References
Brovelli, A., Ding, M., Ledberg, A., Chen, Y., Nakamura, R., and Bressler, S. L. (2004). Beta oscillations in a large-scale sensorimotor cortical network: directional influences revealed by Granger causality. Proc. Natl. Acad. Sci. U.S.A. 101, 9849–9854.
Chen, J. L., Penhune, V. B., and Zatorre, R. J. (2008b). Moving on time: brain network for auditory motor synchronization is modulated by rhythm complexity and musical training. J. Cogn. Neurosci. 20, 226–239.
Crottaz-Herbette, S., Anagnoson, R. T., and Menon, V. (2004). Modality effects in verbal working memory: differential prefrontal and parietal responses to auditory and visual stimuli. Neuroimage 21, 340–351.
Hall, D. A., Haggard, M. P., Akeroyd, M. A., Palmer, A. R., Summerfield, A. Q., Elliott, M. R., Gurney, E. M., and Bowtell, R. W. (1999). “Sparse” temporal sampling in auditory fMRI. Hum. Brain Mapp. 7, 213–223.
McIntosh, G. C., Brown, S. H., Rice, R. R., and Thaut, M. H. (1997). Rhythmic auditory-motor facilitation of gait patterns in patients with Parkinson’s disease. J. Neurol. Neurosurg. Psychiatr. 62, 22–26.
Stancák, A. Jr., and Pfurtscheller, G. (1996). Event-related desynchronization of central beta-rhythms during brisk and slow self-paced finger movements of dominant and nondominant hand. Cogn. Brain Res. 4, 171–183.
Thaut, M. H., Stephan, K. M., Wunderlich, G., Schicks, W., Tellmann, L., Herzog, H., McIntosh, G. C., Seitz, R. J., and Hömberg, V. (2009). Distinct cortico-cerebellar activations in rhythmic auditory motor synchronization. Cortex 45, 44–53.
Whitall, J., Waller, S. M., Silver, K. H. C., and Macko, R. F. (2000). Repetitive bilateral arm training with rhythmic auditory cueing improves motor function in chronic hemiparetic stroke. Stroke 31, 2390–2395.
Woodruff, P. W., Benson, R. R., Bandettini, P. A., Kwong, K. K., Howard, R. J., Talavage, T., Belliveau, J., and Rosen, B. R. (1996). Modulation of auditory and visual cortex by selective attention is modality-dependent. Neuroreport 7, 1909–1913.
Keywords: fMRI, attention, rhythm, timing, auditory perception
Citation: Chapin HL, Zanto T, Jantzen KJ, Kelso SJA, Steinberg F and Large EW (2010) Neural responses to complex auditory rhythms: the role of attending. Front. Psychology 1:224. doi: 10.3389/fpsyg.2010.00224
Received: 29 April 2010;
Accepted: 26 November 2010;
Published online: 24 December 2010.
Edited by: Claude Alain, Rotman Research Institute, Canada
Reviewed by: Petr Janata, University of California Davis, USA
Steve A. Arnott, Rotman Research Institute, Canada
Ben Dyson, Ryerson University, Canada
Copyright: © 2010 Chapin, Zanto, Jantzen, Kelso, Steinberg and Large. This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.
*Correspondence: Edward W. Large, Center for Complex Systems, Florida Atlantic University, 777 Glades Rd, Building 12, Rm 316, Boca Raton, FL 33431, USA. e-mail: firstname.lastname@example.org