Original Research ARTICLE
The impact of auditory working memory training on the fronto-parietal working memory network
- 1Department of Psychology, Saarland University, Saarbrücken, Germany
- 2Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- 3School of Psychology, Southwest University, Chongqing, China
- 4Liaoning Normal University, Dalian, China
Working memory training has been widely used to investigate working memory processes. We have shown previously that visual working memory benefits only from intra-modal visual but not from across-modal auditory working memory training. In the present functional magnetic resonance imaging study we examined whether auditory working memory processes can also be trained specifically and which training-induced activation changes accompany these effects. It was investigated whether working memory training with strongly distinct auditory materials transfers exclusively to an auditory (intra-modal) working memory task or whether it generalizes to a (across-modal) visual working memory task. We used adaptive n-back training with tonal sequences and a passive control condition. The memory training led to a reliable training gain. Transfer effects were found for the (intra-modal) auditory but not for the (across-modal) visual transfer task. Training-induced activation decreases in the auditory transfer task were found in two regions in the right inferior frontal gyrus. These effects confirm our previous findings in the visual modality and extend intra-modal effects in the prefrontal cortex to the auditory modality. As the right inferior frontal gyrus is frequently found to be involved in maintaining modality-specific auditory information, these results might reflect increased neural efficiency in auditory working memory processes. Furthermore, task-unspecific (amodal) activation decreases in the visual and auditory transfer task were found in the right inferior parietal lobule and the superior portion of the right middle frontal gyrus, reflecting less demand on general attentional control processes. These data are in good agreement with amodal activation decreases within the same brain regions on a visual transfer task reported previously.
The ability to keep representations in an active and accessible state is crucial for adaptive, intelligent behavior and is assumed to underlie a vast amount of cognitive functions such as language learning or problem solving (Baddeley, 1986, 2002, 2003). The temporary storage and manipulation of information has been termed working memory. One of the prominent working memory models, the multicomponent model (Baddeley and Hitch, 1974; Baddeley, 2002, 2003), suggests a system that comprises a central executive and subsystems specialized for maintaining specific types of information (Baddeley and Logie, 1999). The phonological loop stores auditory and phonological information and uses a subvocal rehearsal system to refresh information whereas the visual-spatial sketchpad is specialized for holding spatial and non-spatial visual information (e.g., Baddeley, 1986; Baddeley and Logie, 1999). Although the distinction between the two slave systems has triggered a considerable amount of research, the question to which degree these systems are plastic and trainable and whether training might affect the respective neural networks was rarely investigated.
This distinction between visual and auditory working memory systems can be found in several contemporary working memory models (e.g., Baddeley, 2003; Zimmer, 2008). However, most functional neuroimaging studies showed that across a wide variety of tasks such as the n-back task, item recognition or delayed matching tasks the bilateral fronto-parietal working memory network is active largely independent of stimulus type (Nystrom et al., 2000; Wager and Smith, 2003; Owen et al., 2005). From these data it follows that a clear modality-specific dissociation for visual and auditory information might not exist in the working memory network, which is constituted by direct and reciprocal connections between posterior brain regions including the intraparietal sulcus and posterior and mid-dorsolateral frontal brain regions (Petrides and Pandya, 2002; Mecklinger and Opitz, 2003).
Only a few studies have directly contrasted working memory for visual and auditory information. Studies using non-verbal visual and auditory material found subtle differences in the activity of the prefrontal cortex for information that differed in input modality. A working memory study by Rämä and Courtney (2005) with non-spatial visual (faces) and auditory materials (human voices) used a delayed recognition task and found subtle activation differences in the ventral prefrontal cortex: faces activated the dorsal part at Brodmann Area (BA) 44/45 more strongly than voices, while voices more strongly activated the inferior part at BA 45/47 of the ventral prefrontal cortex. These data provide evidence for a functional segregation within the ventral prefrontal cortex with ventral regions recruited by auditory and dorsal regions recruited by visual working memory processes. In a similar vein, Protzner and McIntosh (2007) compared auditorily and visually presented white noise bursts in simple working memory tasks and found modality-specific activations in the fronto-parietal network in addition to activations in sensory cortices. The auditory task version led to stronger activations in the right putamen and left posterior cingulate gyrus, while for the visual version stronger activations in the right middle frontal cortex, left middle cingulate, and left inferior parietal temporal cortex were found. Functional brain imaging studies using visually and auditorily presented verbal material also found modality-specific activation patterns (Crottaz-Herbette et al., 2004; Rodriguez-Jimenez et al., 2009). Both studies investigated working memory for auditorily and visually presented verbal stimuli, using digit numbers (Crottaz-Herbette et al., 2004) or letters (Rodriguez-Jimenez et al., 2009) and a 2-back task. 
They report greater activations for auditory material in the left dorsolateral prefrontal cortex, whereas the visual version of the task led to stronger activations in the left posterior parietal cortex (Rodriguez-Jimenez et al., 2009). However, these modality-specific dissociations need to be interpreted cautiously because by using verbal materials activations found for visual materials could actually represent phonological transformation processes rather than effects which are specific for processing visual input (Smith and Jonides, 1997; Baddeley et al., 1998; Suchan et al., 2006). Even though the studies examining the dissociation between holding auditory and visual information in working memory leave a rather inhomogeneous picture, most of the studies refer to a relative dissociation of modality-specific activity.
Functional brain imaging studies on auditory memory for pitch further specified the neural circuitry for auditory object working memory, i.e., working memory for sound identity information (Zatorre et al., 1994; Griffiths et al., 1999; Gaab et al., 2003; Koelsch et al., 2009). Using different kinds of pitch working memory tasks, activations in the right inferior frontal region (Zatorre et al., 1994; Griffiths et al., 1999) or the left inferior frontal gyrus (Gaab et al., 2003) were found, besides more inhomogeneous activations between the studies in the cerebellum, posterior temporal and parietal regions. Furthermore, Koelsch et al. (2009) found that rehearsal of either the pitch information or the verbal information of sung syllables activated the ventrolateral premotor cortex (encroaching Broca's area), dorsal premotor cortex, the planum temporale, inferior parietal lobule, the anterior insula as well as subcortical structures and the cerebellum. Thus, rehearsal of tonal and verbal information seems to recruit strongly overlapping neural networks. Notably, although the results of the studies are not homogeneous, all of them found the prefrontal cortex, especially the left or right inferior frontal cortex, to be involved in working memory for melodic and pitch information. Together the functional brain imaging studies contrasting auditory vs. visual material and the studies on the neural correlates of auditory object working memory speak for a specific involvement of the inferior frontal gyri for holding and rehearsing auditory object information in working memory.
To examine the functional plasticity of holding specific information in working memory, few recent studies have employed working memory training (Sayala et al., 2006; Schneiders et al., 2011; see Lövdén et al., 2010, for a review). More precisely, they used this method to disentangle specific components or processes improved by the training. This aim is based on the idea that cognitive training leads to improvements only in those tasks which share processing components with the trained task and thus might involve similar or overlapping brain regions (Jonides, 2004; Dahlin et al., 2008; Jaeggi et al., 2008; Lövdén et al., 2010; Morrison and Chein, 2011). From this commonality logic it follows that one approach to investigate trained processes is to compare two (or more) training tasks, which differ only in terms of a processing component of interest (Lövdén et al., 2010; Schneiders et al., 2011). This approach will be referred to as “training-specificity approach” in the following because multiple training regimens are compared with respect to the differential effects they have on one and the same transfer task. Another approach is to investigate the degree to which one specific training regime results in improved performance on multiple transfer tasks which do or do not share the processing component of interest (for a similar approach see Dahlin et al., 2008). Thus, if the training was effective and in turn the processing component of the training task improved, transfer effects should be found only for those transfer tasks, which engage that process. In the following this approach is referred to as “task-specificity approach.”
In a previous training study we applied the “training-specificity approach” to investigate the impact of intra-modal and across-modal working memory training on a visual working memory task (Schneiders et al., 2011). Larger improvements after visual working memory training compared to auditory or no training were found in a visual 2-back task with abstract black and white pattern stimuli. These intra-modal effects were accompanied by training-related decreases in activation in the right middle frontal gyrus at BA 9 resulting from visual training only. Both trainings—in the visual and auditory modality—led to decreased activation in the superior portion of the right middle frontal gyrus at BA 6 and the right posterior parietal lobule at BA 40. These results support the view that working memory for visual materials can be trained separately from auditory materials and leads to increased neural efficiency i.e., reduced brain activation in combination with better performance in the visual 2-back task after visual training. This effect can functionally be dissociated from amodal activation decreases which were present after both, visual and auditory training at BA 6 and BA 40. These effects were taken to reflect more effective general control processes. Together these data could convincingly demonstrate that intra-modal training effects occur on the behavioral and neural level in the visual modality. As there was no auditory transfer task in our previous study, the data do not speak to the question whether working memory is also trainable specifically for auditory material.
The aim of the present study was to investigate whether auditory working memory training (training task) leads to specific improvements in the intra-modal auditory modality (near transfer task) or to general (across-modal) improvements also in visual working memory (far transfer tasks). By this we follow the task-specificity approach of using one training regimen to elucidate the nature of plasticity for holding specific types of information in working memory. To increase the likelihood of obtaining training gains in auditory working memory, we used highly salient tonal sequences in an auditory adaptive n-back training paradigm, in which the global pitch contour pattern, i.e., the relative pitch of tones in a sequence, had to be compared to the pattern presented n positions back in the stimulus train. As it was already shown that such pitch contour discrimination can be trained (Foxton et al., 2004), we assume that this stimulus material is highly suitable to train holding and rehearsing auditory information in working memory. Similarly to what we already demonstrated for the visual modality (Schneiders et al., 2011), it was hypothesized that working memory is specifically trainable for auditory material and thus its training results in considerable improvements in an intra-modal working memory task (near transfer effects) whereas more far transfer effects on a visual working memory task should be absent or decidedly smaller.
Additionally, we examined whether intra-modal and across-modal transfer effects of auditory working memory training are accompanied by differential activation changes in the fronto-parietal working memory network. Previous studies reported a great variety of activation patterns resulting from cognitive training (e.g., Jonides, 2004; Kelly and Garavan, 2005; Kelly et al., 2006; Buschkuehl et al., 2012). First, activation decreases in the same brain areas before and after training were consistently reported in studies using short-term working memory training (within-session practice) (Garavan et al., 2000; Jansma et al., 2001; Landau et al., 2004; Sayala et al., 2006). This pattern was usually taken to reflect more efficient processing in task-specific brain areas as a consequence of training. However, studies using more prolonged working memory training over several separate sessions exhibited a more inconsistent pattern of results. Most of the studies found activation decreases in the fronto-parietal working memory network (Olesen et al., 2004; Dahlin et al., 2008; Schneiders et al., 2011). Some studies additionally (Olesen et al., 2004; Dahlin et al., 2008) or exclusively (Jolles et al., 2010) report activation increases in brain regions that were active before and after training which are usually taken as an expansion of neural structures involved in the processing of the task. Furthermore, Hempel et al. (2004) report a combination of both patterns, i.e., an inverted u-shaped function of activation changes during training of an n-back working memory task. According to Kelly and Garavan (2005) different patterns of brain activity within the same areas before and after working memory training are referred to as redistribution and are taken to reflect a combination of more efficient engagement of task-specific cognitive processes and reduced demands on attentional control processes as a function of training. 
Particularly, prefrontal cortex, anterior cingulate, and posterior parietal cortex are assumed to fulfill such a “scaffolding” function that becomes redundant after extensive practice. Those “scaffolding” areas broadly overlap with the common fronto-parietal working memory network.
Another pattern of training-related changes in brain activation, namely the activation of new brain areas after training, has been termed reorganization and is assumed to lead to a qualitative change in the processes used to solve the trained task (Kelly and Garavan, 2005; Kelly et al., 2006). Although this pattern of results is commonly found in various cognitive training studies (e.g., Poldrack et al., 1998; Poldrack and Gabrieli, 2001; Erickson et al., 2007), to our knowledge there is no single study reporting such a pattern of activation change as a result of working memory training.
Although activation decreases in fronto-parietal brain regions are the most frequent activation changes after working memory training, there is still some inconsistency in the literature on the nature of neural activation changes after working memory training. Consistent with a number of studies mentioned above, we assume that within the prefrontal cortex, there exists a relative specialization for auditory object working memory with the ventrolateral prefrontal cortex being involved in auditory working memory tasks (for a review see also Rämä, 2008). Thus, this region might be recruited for maintaining and rehearsing auditory material over short periods of time. Auditory working memory training should therefore enhance the processing efficiency in this region, as indicated by activation decreases in an auditory but not a visual working memory task as well as behavioral improvements specifically in the auditory task.
Activation changes in a visual working memory task after auditory working memory training should be found in more posterior regions of the fronto-parietal working memory network, which are commonly recruited by amodal control and attentional processes in working memory tasks and for which activation decreases after n-back working memory training have been reported independently of training modality and behavioral improvements (Schneiders et al., 2011). The latter prediction is based on the assumption that the posterior parietal cortex reflects training-unspecific (Schneiders et al., 2011) and task-unspecific effects (present study) to a similar extent.
Materials and Methods
Participants and Procedure
Thirty-two undergraduate students of Southwest University, Chongqing, China, 17 females and 15 males, mean age = 21.31 years (age range = 18–24 years), participated in this study. All participants were right-handed as assessed by the Edinburgh Inventory (Oldfield, 1971) and indicated on a screening form to be physically and psychologically healthy, to have normal hearing and normal or corrected to normal vision. Subjects were unselected for musical training: most of them had received some musical instruction as part of their elementary or high school education, but none were professional musicians or had more than five years of learning to play an instrument. They gave written informed consent before testing and received 10 Yuan/h for their participation.
As shown in Figure 1, participants were assigned to either the auditory training group (n = 16; mean age = 21.13 years, age range = 18–24 years) or the no training control group (n = 16; mean age = 21.50 years, age range = 19–23 years). The groups were matched according to age (p = 0.43), gender (p = 0.73), and fluid intelligence as measured by the Bochumer Matrizentest (BOMAT) (Hossiep et al., 1999) (p = 0.60).
Figure 1. Schematic description of the experimental design. Both groups performed the same auditory and visual 2-back and 0-back control task in the pretest and posttest fMRI session. During the training interval, the auditory training group was trained on an adaptive n-back task using auditory tonal sequences, whereas the control group did not receive any training.
Before training, participants took part in an initial fMRI pretest. The training group received eight training sessions within two weeks following the initial fMRI pretest. During the training participants performed an auditory adaptive n-back task with tonal sequences. Twenty-one to 22 days after the initial fMRI pretest all participants participated in the fMRI posttest.
To train auditory working memory, we used an adaptive n-back paradigm adapted from Jaeggi et al. (2008) (see Figure 2). In the n-back task, a sequence of stimuli is presented consecutively. It has to be decided whether the present stimulus matches the stimulus that was presented n positions back in the sequence. Stimuli were presented sequentially at a rate of one per 3700 ms (stimulus length = 700 ms, inter-stimulus interval = 3000 ms). Each block contained six targets with their positions determined randomly. To avoid non-targets that are most likely to distract participants' attention, non-targets immediately preceding or following a target had to be different from the target such that those trials did not function as lure trials. All other non-target stimuli were assigned randomly. Participants had to respond manually to every stimulus by pressing either the letter “M” or “C” of a standard computer keyboard. Response mappings were counterbalanced across participants and were maintained throughout training and fMRI sessions. To implement adaptivity in the task, the level of n changed from one block of 20 + n trials to the next according to each participant's individual performance. If the participant performed better than 78% correct, the level of n increased by 1; if accuracy was worse than 67% correct, it decreased by 1. In all other cases n remained unchanged. Each training session comprised 40 blocks and started with the n level of 1. Starting level was always n = 1 for motivational reasons and to ensure that participants were actually able to perform the task well before n increased. As compared to our previous study (Schneiders et al., 2011), the current auditory stimulus material as described below rendered the training task more difficult.
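The adaptivity rule described above can be illustrated with a minimal sketch. The thresholds (78% and 67%), the block length of 20 + n trials, and the starting level n = 1 are taken from the text; the function names and the lower bound of n = 1 are assumptions made for illustration, since the text does not state what happened when accuracy fell below 67% at n = 1.

```python
def next_n_level(current_n, accuracy, raise_above=0.78, lower_below=0.67, min_n=1):
    """Return the n level for the next block, given block accuracy in [0, 1].

    min_n is an assumed floor; the source does not specify one explicitly.
    """
    if accuracy > raise_above:
        return current_n + 1
    if accuracy < lower_below:
        return max(min_n, current_n - 1)
    return current_n


def trials_in_block(n, base_trials=20):
    """A block at level n comprises 20 + n trials."""
    return base_trials + n


def run_session(block_accuracies, start_n=1):
    """Return the n level that was in force during each block of one session."""
    levels = []
    n = start_n
    for accuracy in block_accuracies:
        levels.append(n)
        n = next_n_level(n, accuracy)
    return levels
```

With hypothetical block accuracies of 0.90, 0.90, 0.70 and 0.50, `run_session` yields the levels 1, 2, 3, 3: the level rises twice, stays constant after the intermediate score, and would drop only for the block following the poor one.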
Figure 2. Schematic description of the adaptive auditory n-back task during training illustrated for a 2-back condition. Targets were defined as tonal sequences comprising the same sequence that was transposed in pitch. Non-targets were defined as tonal sequences comprising a different sequence that was also transposed in pitch.
Rhythmic three-tone melodies were employed for the auditory working memory training. They consisted of two short pure tones lasting 175 ms (20 ms gating windows) and one long pure tone lasting 350 ms (20 ms gating windows), resulting in a total length of 700 ms. The three different tones within each melody were taken from an atonal scale with the octave divided into seven equally spaced logarithmic steps (“tones”) (see also Foxton et al., 2003, 2004). Starting pitch varied from 224.48 Hz for the lowest-pitched scale to 356.30 Hz for the highest-pitched scale. In each training session a completely new set of eight stimuli was used to ensure that effects were not due to highly familiar stimulus material and to prevent verbal and semantic encoding strategies as much as possible. In each stimulus set, two stimuli each featured a pitch pattern of two falls, two rises, a rise followed by a fall, or a fall followed by a rise, respectively. Stimuli with the same pitch pattern differed in the amount of frequency change between the tones [e.g., tone 1 (224.48 Hz), tone 4 (317.19 Hz), tone 5 (345.96 Hz) of the scale vs. tone 1 (224.48 Hz), tone 2 (266.64 Hz), tone 3 (290.82 Hz) of the scale]. However, the absolute pitch varied between all of the stimuli within one block. Tones were not repeated within one melody. Targets were defined as melodies comprising exactly the same melody (“pitch contour”) but transposed in absolute pitch. Non-targets were pitch patterns that differed in one rise or fall compared to the original melody and were also transposed in absolute pitch.
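To make the notion of a pitch contour concrete, the following sketch maps a three-tone melody to its pattern of rises and falls and shows that transposition in absolute pitch leaves this pattern unchanged. The frequencies are the two example melodies given in the text; the function names and the transposition ratio are hypothetical. The sketch compares contours only and makes no claim about whether the size of the frequency steps also had to match for a stimulus to count as a target.

```python
def contour(melody_hz):
    """Map a tone sequence (in Hz) to its rises (+1) and falls (-1)."""
    steps = []
    for previous, current in zip(melody_hz, melody_hz[1:]):
        if current == previous:
            # Tones were not repeated within one melody.
            raise ValueError("repeated tone within a melody")
        steps.append(1 if current > previous else -1)
    return tuple(steps)


def transpose(melody_hz, ratio):
    """Shift a melody in absolute pitch by a constant frequency ratio."""
    return tuple(frequency * ratio for frequency in melody_hz)


# The two example melodies from the text: same contour, different step sizes.
wide_rise = (224.48, 317.19, 345.96)
narrow_rise = (224.48, 266.64, 290.82)

# Hypothetical transposition; the contour is unaffected.
transposed = transpose(wide_rise, 1.2)
same_contour = contour(transposed) == contour(wide_rise)
```

Both example melodies share the contour of two rises, which is why stimuli with the same pitch pattern could still differ in the amount of frequency change between tones.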
The procedure was self-paced from one block to the next such that the amount of time to complete one training session varied between participants, resulting in an average of about 50 min per session. The training comprised eight sessions taking place within two weeks. The time lag between sessions was between one and four days.
A repeated measures analysis of variance (ANOVA) with the factor Session (collapsed across two consecutive sessions) was calculated on the mean level of n as an indicator of the participants' mean performance for each session. In each training session, the first ten blocks were excluded from calculating the mean level of n because participants first had to pass those levels of n that were below their individual performance level.
Pretest and Posttest Tasks
To examine whether auditory working memory training leads to specific improvements of auditory working memory and whether it also transfers to visual working memory, an auditory and a visual 2-back task were employed as transfer tasks in the fMRI pretest and posttest (Figure 3).
Figure 3. Schematic description of the auditory and visual 2-back transfer tasks in the pre- and posttest fMRI sessions. In the auditory task, auditory tonal sequences equivalent to those used during training were presented. In the visual task, black-and-white pattern stimuli were used.
The auditory task was different from the training task in that a constant level of n = 2 was employed. It thus poses fewer demands on maintenance and updating processes engaged by the n-back task as compared to the adaptive version of the task, which requires the updating of the actual n-level every 20 + n trials. As during training, new sets of melodies were used; stimuli were randomly assigned to the pretest and the posttest and were taken from the same pool of stimuli used in the training sessions. An auditory 0-back task using the same stimuli throughout the block was applied as a control task. In this task, a pure tone (stimulus length = 400 ms, frequency = 440 Hz, 20 ms gating windows) was overlaid on the melody. Similar to the transfer task, subjects were required to press one button upon the presentation of a target (i.e., whenever the tone was added to the melody) and another if it was not. Six targets were presented in each block. Five blocks of the auditory transfer task consisting of 22 trials alternating with five blocks of the auditory control task comprising 20 trials were completed.
After completion of the auditory transfer task an analogous visual transfer task was employed. The visual transfer task was equivalent to the task used in our previous study (Schneiders et al., 2011). Stimulus presentation was 500 ms, the inter-stimulus interval lasted 2500 ms. As in the previous study abstract black and white pattern stimuli were employed for the visual transfer and control task. In the visual control task a gray dot was added to the center of one of the stimuli. Subjects were instructed to respond upon the presentation of the target (with gray dot) by pressing one button and by pressing another button to respond to non-targets (without gray dot). Five blocks of the visual transfer task consisting of 22 trials alternating with five blocks of the visual control task comprising 20 trials were completed. During the fMRI sessions an additional run with a language task was performed which will not be reported here.
A Two-Way ANOVA with the factors Time (pretest vs. posttest) and Group (auditory working memory training vs. no training) was performed on the auditory and visual transfer task using the discrimination index Pr [P(hits to targets) − P(false alarms to non-targets)] (Snodgrass and Corwin, 1988) as dependent variable.
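As an illustration of the dependent variable, the discrimination index Pr can be computed from response counts as sketched below. The trial structure (five transfer blocks of 22 trials with six targets each, i.e., 30 targets and 80 non-targets per task) follows the task description; the hit and false alarm counts in the example are hypothetical, and the function name is an assumption.

```python
def discrimination_index(hits, n_targets, false_alarms, n_nontargets):
    """Pr = P(hits to targets) - P(false alarms to non-targets)."""
    if n_targets <= 0 or n_nontargets <= 0:
        raise ValueError("need at least one target and one non-target")
    return hits / n_targets - false_alarms / n_nontargets


# Five 2-back blocks of 22 trials with six targets each.
n_targets = 5 * 6
n_nontargets = 5 * (22 - 6)

# Hypothetical participant: 24 hits and 8 false alarms.
pr_example = discrimination_index(24, n_targets, 8, n_nontargets)
```

Pr ranges from −1 to 1; a value of 0 indicates that targets and non-targets were endorsed equally often, i.e., no discrimination.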
Before the pretest fMRI session, participants performed one block of each task outside the scanner to get familiar with the tasks.
fMRI Acquisition and Analyses
Imaging data collection was performed on a 3 T scanner (Magnetom Trio, Siemens Medical Systems, Erlangen, Germany). Each participant was tested twice, in a pretest and a posttest, with separate blocks for each task (i.e., transfer task and control task) and modality (visual and auditory modality). Visual stimuli were presented through a projector onto a translucent screen. Participants viewed the stimuli through a mirror attached to the head coil. Head motions were restricted using foam padding. Responses were collected using two-button response grips. Responses were given using the left and right index finger. A T2*-weighted gradient echo planar imaging sequence was used for fMRI scans (matrix = 64, field of view = 220 mm, in-plane resolution = 3.5 × 3.5 mm, slice thickness/gap thickness = 3 mm/1 mm, repetition time/echo delay time/flip angle = 2300 ms/30 ms/90°). Thirty-two axial slices were acquired per volume. An intra-session high-resolution structural scan was acquired using a T1-weighted 3D magnetization prepared rapid gradient echo sequence (1 mm³ voxel size).
The functional imaging data were analyzed using BrainVoyager QX (Brain Innovation; Goebel et al., 2006). The first four volumes of each subject's functional data set were discarded to allow for T1 equilibration. For the remaining 646 volumes, standard preprocessing was performed: the images were slice time corrected (sinc interpolation), motion corrected (trilinear interpolation), and spatially smoothed using an isotropic Gaussian kernel at 5 mm full width at half maximum. The data were high-pass filtered at three cycles per run (i.e., at approximately 0.002 Hz). Functional slices were coregistered to the anatomical volume of the pretest session using position parameters and intensity-driven fine-tuning and were rescaled to a 3 × 3 × 3 mm resolution before they were transformed into Talairach coordinates (Talairach and Tournoux, 1988).
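The correspondence between three cycles per run and a cutoff of approximately 0.002 Hz can be checked with a short calculation. The sketch assumes that the 646 retained volumes form a single run acquired at the repetition time of 2300 ms reported above; the function name is hypothetical.

```python
def highpass_cutoff_hz(cycles_per_run, n_volumes, tr_seconds):
    """Convert a cutoff given in cycles per run into Hz."""
    run_duration_seconds = n_volumes * tr_seconds
    return cycles_per_run / run_duration_seconds


# 646 volumes at TR = 2.3 s give a run of roughly 1486 s.
cutoff = highpass_cutoff_hz(cycles_per_run=3, n_volumes=646, tr_seconds=2.3)
```

Under this assumption the cutoff comes out at slightly above 0.002 Hz, in line with the approximate value stated in the text.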
Functional time series were analyzed using random effects multi-subjects general linear model (GLM) (Friston et al., 1999). All levels of the factor Task (transfer vs. control) and the factor Time (pretest vs. posttest) were modeled as separate predictors for each subject; motion parameters were added as predictors of no interest to the design matrix of each run. Thus, the resulting GLM contained eight parameters of interest per subject: auditory transfer and auditory control, visual transfer and visual control for each of the pretest and posttest sessions. Predictor time courses were adjusted for the hemodynamic response delay by convolution with a double-gamma hemodynamic response function (Friston et al., 1998). All time points not associated with one of the eight parameters served as the implicit baseline.
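How a predictor time course is adjusted for the hemodynamic delay can be sketched as follows: a boxcar marking the volumes of one condition is convolved with a double-gamma function sampled at the repetition time. The shape parameters (6 and 16) and the undershoot ratio (1/6) are commonly used illustrative values and are assumptions here, not necessarily the settings applied by BrainVoyager QX in this study; all function names and the example block layout are hypothetical.

```python
import math


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (
        math.gamma(shape) * scale ** shape
    )


def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, undershoot_ratio=1 / 6):
    """Positive response minus a scaled, later undershoot (assumed parameters)."""
    return gamma_pdf(t, peak_shape) - undershoot_ratio * gamma_pdf(t, undershoot_shape)


def convolve_predictor(boxcar, tr_seconds, hrf_length_seconds=32.0):
    """Convolve a volume-wise boxcar with the HRF sampled every TR."""
    n_kernel = int(hrf_length_seconds / tr_seconds) + 1
    kernel = [double_gamma_hrf(i * tr_seconds) for i in range(n_kernel)]
    predictor = []
    for i in range(len(boxcar)):
        total = 0.0
        for k, weight in enumerate(kernel):
            if i - k >= 0:
                total += boxcar[i - k] * weight
        predictor.append(total)
    return predictor


# Hypothetical layout: 5 baseline volumes, 10 task volumes, 15 baseline volumes.
boxcar = [0] * 5 + [1] * 10 + [0] * 15
predictor = convolve_predictor(boxcar, tr_seconds=2.3)
```

The convolved predictor rises only after the block onset and peaks several volumes later, which is the delay the GLM predictors are meant to capture.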
To explore training-induced activation changes from pretest to posttest between the groups we performed voxel-wise whole-brain repeated measures ANOVAs. As for the analysis of the behavioral data, we focused our analysis on the Time (pretest vs. posttest) by Group (training vs. no training group) interaction with the % signal changes relative to the implicit baseline for the auditory and for the visual transfer task as dependent variable. Within this analysis a main effect of Time would reflect unspecific effects of task repetition from pre- to posttest and was therefore not evaluated. To achieve a desirable balance between Type I and Type II error rates, i.e., to avoid missing potential activity while also avoiding an unnecessarily high rate of false positives, the resulting F-maps were thresholded at a more liberal threshold of p < 0.005 (uncorrected) with a cluster extent of more than 135 anatomical voxels (see Lieberman and Cunningham, 2009, for a detailed discussion). To further specify the Time by Group interaction we defined functional volumes-of-interest (VOI) on the basis of the clusters showing a significant Time by Group interaction. The difference of the mean activity of these clusters between pre- and posttest was then compared within each group and task.
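The cluster extent can be translated into functional voxels with a short calculation. The sketch assumes that an anatomical voxel corresponds to the 1 mm³ resolution of the structural scan and that a functional voxel corresponds to the rescaled 3 × 3 × 3 mm resolution; the function name is hypothetical.

```python
def anatomical_to_functional_voxels(n_anatomical, anatomical_edge_mm=1.0, functional_edge_mm=3.0):
    """Convert a cluster size in anatomical voxels into functional voxels."""
    anatomical_volume = anatomical_edge_mm ** 3
    functional_volume = functional_edge_mm ** 3
    return n_anatomical * anatomical_volume / functional_volume


# 135 voxels of 1 mm^3 occupy the volume of five 27 mm^3 functional voxels.
cluster_extent_functional = anatomical_to_functional_voxels(135)
```

Under these assumptions, the criterion of more than 135 anatomical voxels corresponds to clusters larger than five contiguous functional voxels.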
Performance increases during training as measured by the mean level of n collapsed across two consecutive training sessions are shown in Figure 4A. Participants improved their performance on average by 0.782 n (min = 0.21, max = 1.30, SEM = 0.0815) from the first two training sessions to the last two training sessions. The repeated measures ANOVA revealed that the training group improved its performance as indicated by a significant main effect of Session [F(3, 45) = 54.12, p < 0.001, η2p = 0.78]. Moreover, a significant difference between performance at the first and second training session compared to the seventh and eighth training session substantiates these training improvements [t(15) = 9.59, p < 0.001] and allows for testing the effects the training had on the posttest tasks.
Figure 4. Performance increase in the n-back task for the auditory training group. (A) The mean level of n as an indicator of the participants' performance for each session and corresponding standard errors of the mean are shown. (B) Mean Pr scores and corresponding standard errors of the mean of the auditory transfer task (left panel) and of the visual transfer task (right panel) for both groups during fMRI pretest and posttest.
The most interesting analysis according to our predictions concerns the effects of auditory training on the auditory and visual 2-back tasks from pretest to posttest compared to no training (intra-modal and across-modal transfer effects). The Three-Way ANOVA with the factors Time (pretest vs. posttest), Group (auditory training vs. no training) and Task Modality (auditory vs. visual task) revealed significant main effects of Time [F(1, 30) = 41.58, p < 0.001, η2p = 0.58], and Task Modality [F(1, 30) = 19.71, p < 0.001, η2p = 0.40]. The main effect of Group was not significant [F(1, 30) = 1.59, p = 0.22, η2p = 0.05]. The Two-Way interactions Time by Group [F(1, 30) = 4.26, p < 0.05, η2p = 0.12], Task Modality by Group [F(1, 30) = 4.61, p < 0.05, η2p = 0.13] and Time by Task Modality [F(1, 30) = 4.68, p < 0.05, η2p = 0.14] were also significant, as was the Three-Way interaction [F(1, 30) = 11.63, p < 0.01, η2p = 0.28]. To further explore the Three-Way interaction, Two-Way ANOVAs with the factors Time (pretest vs. posttest) and Group (auditory training vs. no training) were performed separately for the two tasks. The Two-Way ANOVA on the auditory transfer task revealed significant main effects of Time [F(1, 30) = 66.46, p < 0.001, η2p = 0.69] and Group [F(1, 30) = 4.65, p < 0.05, η2p = 0.13] and a significant Time by Group interaction [F(1, 30) = 25.23, p < 0.001, η2p = 0.46], reflecting group-specific improvements from pretest to posttest (see Figure 4B). Performance did not differ between the groups in the pretest [t(30) = 0.02, p = 0.98]. However, the posttest performance was significantly greater after auditory training as compared to no training [t(30) = 4.23, p < 0.001]. The analogous Two-Way ANOVA on the visual transfer task revealed a significant main effect of Time [F(1, 30) = 7.61, p < 0.05, η2p = 0.20], but the main effect of Group [F(1, 30) = 0.01, p = 0.99, η2p < 0.01] and the Time by Group interaction [F(1, 30) = 0.44, p = 0.51, η2p = 0.01] were not reliable.
Taken together, the behavioral data show a specific improvement of the working memory training group compared to the control group in the auditory but not in the visual transfer task.
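For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F value and its degrees of freedom as η2p = (F × df_effect) / (F × df_effect + df_error). The following sketch applies this standard identity. The function name is hypothetical, and because the published F values are rounded, recomputed effect sizes may differ from the reported ones in the last digit.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the text: main effect of Time and the Three-Way interaction.
print(round(partial_eta_squared(41.58, 1, 30), 2))  # 0.58
print(round(partial_eta_squared(11.63, 1, 30), 2))  # 0.28
# Main effect of Session in the training data, F(3, 45) = 54.12.
print(round(partial_eta_squared(54.12, 3, 45), 2))  # 0.78
```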
Brain Imaging Results
As the main interest of the present study was to explore changes in brain activity from pretest to posttest after auditory working memory training compared to no training, the present analysis focused on voxel-wise whole-brain Time by Group interactions on the auditory transfer task. Such interactions were found in four clusters of activation: the right postcentral gyrus at BA 5, the right middle temporal gyrus at BA 21, and two clusters in the right inferior frontal gyrus, one in BA 46 and one in BA 47 (for a list of peak cluster coordinates and local maxima coordinates, see Table 1A). To test whether those interactions arose due to pretest activation differences between the two groups, we compared the mean activity of these clusters in the pretest auditory transfer task between the two groups. Significant pretest group differences were found in the right postcentral gyrus [t(30) = −2.01, p = 0.05] and the right middle temporal gyrus [t(30) = 3.58, p = 0.001]. These pretest group differences, for obvious reasons, could not be related to working memory training. Moreover, as both groups were equally naïve with respect to the 2-back task, these differences are not related to the specific task demands but rather reflect some unspecific differences between groups. For this reason both clusters were excluded from further analyses, and VOI analyses were restricted to the remaining two clusters in the right inferior frontal gyrus, for which no pretest differences between the two groups were found [BA 46: t(30) = 0.52, p = 0.61; BA 47: t(30) = 1.54, p = 0.14].
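The pretest group comparisons reported above are independent-samples t-tests with 30 degrees of freedom, which corresponds to 16 participants per group. A minimal sketch of such a comparison is given below. It assumes a pooled-variance (Student) t-test, which is consistent with the integer degrees of freedom but is not stated explicitly in the text; the function name and the activation values are hypothetical.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_value = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t_value, df


# Hypothetical pretest % signal change values for one cluster (16 per group).
training = [0.42, 0.35, 0.51, 0.38, 0.47, 0.40, 0.33, 0.45,
            0.39, 0.44, 0.36, 0.48, 0.41, 0.37, 0.43, 0.46]
control = [0.40, 0.37, 0.49, 0.36, 0.45, 0.42, 0.35, 0.43,
           0.38, 0.46, 0.34, 0.47, 0.39, 0.41, 0.44, 0.48]
t_value, df = independent_t(training, control)
print(df)  # 30
```

A cluster would be carried forward to the VOI analyses only if this pretest comparison gave no indication of a group difference.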
Table 1A. Brain regions activated in the voxel-wise Time by Group Interaction for the auditory transfer task.
VOI analyses revealed that after working memory training activation in the auditory transfer task significantly decreased in both VOIs [BA 46: t(15) = 3.17, p < 0.01, and BA 47: t(15) = 2.50, p < 0.05], whereas activation significantly increased after no training in BA 46 [t(15) = −2.72, p < 0.05] and BA 47 [t(15) = −2.92, p < 0.05] (see Figures 5A,B). A further analysis tested whether the activation decreases in BA 46 and BA 47 were specific to the auditory 2-back task. To this end, a one-tailed paired t-test was calculated to test whether the posttest-pretest difference was significantly larger in the auditory than in the visual transfer task. This analysis revealed significantly larger training-related changes in BA 47 [t(15) = 1.95, p < 0.05] for the auditory as compared to the visual transfer task. The same analysis for BA 46 revealed a marginally significant effect [t(15) = 1.38, p < 0.10]. Thus, activation decreases in the two regions in the right inferior frontal gyrus after working memory training seem to be specific to the auditory transfer task.
Figure 5. Intra-modal training-related activation changes during the performance of the auditory transfer task (left panel). The activation changes for the visual transfer task are shown in the right panel. Percent signal change values of functional volumes of interest thresholded at p < 0.005 (135 voxel extent) are shown for the training and the control groups [right inferior frontal gyrus at BA 46 (A upper panel) and right inferior frontal gyrus at BA 47 (B lower panel)]. Note that the activation decrease in the training group from pretest to posttest was larger in the auditory than in the visual transfer task. See results section for details.
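The specificity test described above compares, within the training group, the pretest-to-posttest change in the auditory task with the corresponding change in the visual task. The sketch below shows one way such a paired comparison on change scores could be computed. The sign convention (pretest minus posttest, so that decreases are positive), the function name, and the data are assumptions made for illustration only.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched lists of values."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical per-participant activation decreases (pretest minus posttest)
# in one VOI, for the auditory and the visual transfer task.
decrease_auditory = [0.30, 0.12, 0.25, 0.18, 0.05, 0.22, 0.15, 0.28]
decrease_visual = [0.10, 0.08, 0.20, 0.02, 0.07, 0.11, 0.09, 0.14]
t_value, df = paired_t(decrease_auditory, decrease_visual)
print(df)  # 7
```

For a one-tailed test at the 0.05 level with 15 degrees of freedom, as in the present sample of 16 trained participants, the t value would need to exceed the critical value of approximately 1.753.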
To test for effects the training had on the visual transfer task, an analogous voxel-wise whole-brain Time by Group analysis was performed for the visual transfer task. Significant Time by Group interactions were found in three clusters in the right hemisphere: postcentral gyrus at BA 5, posterior parietal lobule at BA 40, and superior frontal gyrus at BA 6 (for a list of peak cluster coordinates and local maxima coordinates, see Table 1B). As marginally significant pretest differences between the groups were found in the right postcentral gyrus [t(30) = −1.75, p < 0.10], this cluster was excluded from further analyses. No pretest differences between groups were obtained for BA 40 [t(30) = 0.84, p < 0.41] and BA 6 [t(30) = 1.30, p < 0.15]. VOI analyses revealed significant activation decreases after auditory training in the right posterior parietal lobule at BA 40 [t(15) = 4.43, p < 0.001] and in the right superior frontal gyrus at BA 6 [t(15) = 3.32, p < 0.01] (see Figures 6A,B). In the control group, activation increased significantly in BA 6 [t(15) = −2.30, p < 0.05] and marginally significantly in BA 40 [t(15) = −1.73, p = 0.10]. To crosscheck whether those activation changes were specific to the visual transfer task, we applied the analogous VOI analyses to the auditory transfer task, although there were no significant interactions in these regions in the voxel-wise whole-brain analyses. We found a similar pattern of results for the auditory task: activation decreased after auditory training in BA 40 [t(15) = 3.78, p < 0.01] and in BA 6 [t(15) = 3.12, p = 0.01]. In the no-training control group, activation did not change in BA 40 [t(15) = −1.32, p = 0.21] and showed a trend towards an increase in BA 6 [t(15) = −2.04, p < 0.10]. These results point to modality-general effects in the posterior parietal lobule and the superior frontal gyrus after auditory working memory training, as those effects were found equivalently for the auditory and visual transfer task.
Table 1B. Brain regions activated in the voxel-wise Time by Group Interaction for the visual transfer task.
Figure 6. Amodal training-related activation changes during the performance of the auditory (left panel) and visual transfer task (right panel). Percent signal change values of functional volumes of interest thresholded at p < 0.005 (135 voxel extent) are shown for the training (solid line) and the control group (dotted line) [right inferior parietal lobule at BA 40 (A upper panel) and superior part of the right middle frontal gyrus at BA 6 (B lower panel)].
In this study behavioral and neural effects of auditory working memory training on an auditory and a visual working memory task were investigated. The group that performed an adaptive working memory training was compared to a control group receiving no training. Before and after training, participants were tested on an auditory and visual transfer working memory task while being scanned. Reliable training gains were found which allowed us to test for transfer effects on the pretest and posttest tasks. Performance in the auditory transfer task at posttest was higher for the training group than for the control group whereas performance in the visual transfer task did not differ from the control group after auditory working memory training.
Regarding training-related neural effects, the main finding was that auditory adaptive working memory training resulted in reduced brain activity in the right inferior frontal gyrus in the auditory task but not in the visual task. In contrast, training led to task-unspecific activation decreases in the right inferior parietal lobule at BA 40 and the superior part of the right middle frontal gyrus at BA 6.
Performance improvements across the training period (training gains) were a necessary precondition for testing the effects the training had on the auditory and visual working memory tasks at posttest. The transfer effect on the auditory task was modality-specific insofar as performance in an equivalent visual working memory task was not affected by the training and was thus indistinguishable from that of the no-training control group. These data clearly support our hypothesis of an advantage of modality-specific training also in the auditory modality and corroborate similar modality-specific training effects for the visual modality (Schneiders et al., 2011).
Notably, those transfer effects can potentially be attributed to the specific auditory stimulus material. In the current auditory working memory training paradigm we used a set of eight global pitch sequences comprising three tones as stimulus material (adopted from Foxton et al., 2003, 2004). It is noteworthy that we found those specific training effects using stimulus material that had already been shown to provide a large potential for improvement in a perceptual discrimination task. A previous training study compared the trainability of discriminating global pitch patterns, i.e., tonal sequences in which the pitch contour had to be compared independently of the melody's absolute pitch level, with training effects for local pitch patterns, i.e., tonal sequences in which the pitch contour differed but absolute pitch was always held constant (Foxton et al., 2004). It was shown that global pitch sequences benefited more strongly from training than local pitch patterns (Foxton et al., 2004). Presumably our modality-specific transfer effects arose because global pitch patterns are particularly distinctive and thus more memorable than other auditory material such as bird sound stimuli (Schneiders et al., 2011). In this context it needs to be acknowledged that with three-tone sequences only four categories of rises and falls within a sequence are possible. Participants can therefore identify the regularity in the patterns and recode them semantically, which may have additionally enhanced their memorability. Although it is still an open question whether comparable behavioral training improvements could also have been obtained with local pitch pattern sequences or other less distinct kinds of auditory information, our data clearly support the view that auditory processes can be trained specifically.
Moreover, it needs to be mentioned that we found main effects of Time in both the auditory and the visual transfer task. In the visual transfer task, training and control groups likewise showed improved performance at posttest, indicating improvements attributable to pure repetition only. In the auditory transfer task a similar retest effect was found for the control group. These data indicate that all participants improved performance from pretest to posttest in both tasks independently of whether they received any working memory training. This shows that even a small amount of within-session practice can lead to retest effects (Garavan et al., 2000). This result is in line with many working memory training studies that likewise found main effects of Time or pure retest effects in the control group (e.g., Smith et al., 2009; Jolles et al., 2010; Owen et al., 2010; Schneiders et al., 2011) and thus makes a control group indispensable. Thus, the transfer effects on the auditory task are additive to these retest effects.
It needs to be acknowledged that there were performance differences between the auditory and the visual transfer tasks in the pretest. Thus, missing transfer effects on the visual transfer task might be explained by ceiling effects, i.e., the initially high performance level may have made further improvements impossible. However, Pr scores in the visual task, although higher than in the auditory task, were between 0.5 and 0.6 for the two groups and thus still not at ceiling. Additionally, the initial Pr scores in the visual transfer task were comparable to the Pr scores in an analogous visual transfer task in a previous training study (Schneiders et al., 2011), in which we found transfer effects after visual training. On that account it is rather unlikely that higher initial performance in the visual task of the present study prevented transfer effects on the behavioral level.
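To make the ceiling argument concrete, the following sketch computes a Pr score under the assumption that Pr denotes the commonly used discrimination index, i.e., hit rate minus false alarm rate, which has a maximum of 1. The function name and the response counts are hypothetical and serve only to show that a score between 0.5 and 0.6 leaves considerable room for improvement.

```python
def pr_score(hits, misses, false_alarms, correct_rejections):
    """Discrimination index Pr = hit rate - false alarm rate (assumed definition)."""
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_rejections)
    return hit_rate - false_alarm_rate


# Hypothetical 2-back run: 30 targets and 60 non-targets.
score = pr_score(hits=24, misses=6, false_alarms=15, correct_rejections=45)
print(round(score, 2))  # 0.55

# Perfect performance would yield the ceiling value of 1.
print(pr_score(hits=30, misses=0, false_alarms=0, correct_rejections=60))  # 1.0
```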
Training-induced intra-modal activation decreases after working memory training were found in the auditory transfer task in two adjacent regions within the right inferior frontal gyrus (BA 46 and BA 47). These effects were accompanied by specific performance improvements. As analogous transfer effects in the visual transfer task were substantially smaller, these effects are assumed to be rather specific for auditory information. Even though the effect size of this finding is small and the results are exploratory in nature, they support the view that the right inferior frontal gyrus is specifically sensitive to auditory information although it is part of the common fronto-parietal working memory network which was assumed to be widely independent from input modality (Owen et al., 2005). In support of this view several lines of research indicate especially the ventral part of the inferior frontal gyrus to be selectively involved in maintaining and rehearsing auditory and phonological material (Zatorre et al., 1994; Griffiths et al., 1999; Gaab et al., 2003; Rämä and Courtney, 2005; Koelsch et al., 2009; Jerde et al., 2011).
According to the framework proposed by Kelly and Garavan (2005), the current findings can be classified as redistribution effects and suggest that auditory working memory training increased efficiency in storage, access, updating, and rehearsing of purely auditory information mediated by the inferior frontal gyrus (see also Petersen et al., 1998). Intensive and demanding updating training made these processes highly efficient, such that less neural activity is needed and better performance is achieved. According to Kelly and Garavan (2005), reorganization effects are unlikely to occur after working memory training (e.g., Garavan et al., 2000; Landau et al., 2004; Olesen et al., 2004; Sayala et al., 2006; Schneiders et al., 2011), because training of working memory is less likely to result in strategic changes or enhanced automaticity during the training of the task. Instead, the kind of information that needs to be maintained in working memory differs for each trial and thus always requires cognitive control processes, which is why highly similar brain regions are recruited before and after training.
Furthermore, the present study also revealed across-modal training effects at the neural level, i.e., effects auditory working memory training had on the visual transfer task. As similar effects were also observed for the auditory transfer task, they are task-unspecific in nature. Thus, the activation decreases in the superior part of the right middle frontal gyrus at BA 6 and the right inferior parietal lobule at BA 40 can be taken to reflect alterations in amodal general control processes. Importantly, highly similar activation decreases in BA 6 and BA 40 in a visual 2-back task were found in our previous study irrespective of whether participants had been trained in the visual or auditory modality before (Schneiders et al., 2011), accentuating the task- and training-unspecific nature of these effects.
The superior portion of the right middle frontal gyrus is assumed to be one of the major areas for continuous updating processes in working memory (Wager and Smith, 2003), which is especially crucial for solving the n-back task irrespective of stimulus type. Moreover, Schubotz (2007) provides convincing support for the notion that this region is particularly recruited when predicting relevant dynamics of events, i.e., the next stimulus in serial prediction tasks. This task requires participants to monitor a sequence of abstract stimuli to work out how this sequence will evolve. Thus, participants have to update their mental representation of the sequence upon the encounter of the next stimulus. They are also asked to indicate whether the sequential order was correct until the end of presentation or whether it was violated. Importantly, to successfully solve the task participants have to predict the upcoming stimulus and to compare this predicted stimulus with the encountered one. It is reasonable to assume that successful performance in the n-back task entails similar predictions of the target stimulus on the basis of the prior sequence of events. For this reason, we suppose that processing requirements are functionally similar in serial prediction tasks and n-back tasks and are thus similarly reliant on brain structures in the right middle frontal gyrus. The present task-unspecific amodal effect in this region further supports the view that n-back working memory training leads to more efficient sequencing and prediction processes irrespective of task modality, as reflected in decreased activation in this brain region in both transfer tasks.
Training-related activation decreases in the right inferior parietal lobule (BA 40) are in good agreement with findings in several working memory training studies (Hempel et al., 2004; Dahlin et al., 2008; Schneiders et al., 2011). In our previous study an equivalent decrease in the right inferior parietal lobule was found in a visual transfer task irrespective of whether the participants trained with auditory or visual materials (Schneiders et al., 2011). The inferior parietal lobule is part of the fronto-parietal working memory network. This region is considered to be specifically involved in the attentional control of working memory (Jonides et al., 1998). Thus, training-induced task-unspecific activation decreases are most likely to reflect reduced scaffolding as storage and continuous updating became more efficient, resulting in less demand on attentional control. It needs to be acknowledged that training-related activation decreases in the superior part of the right middle frontal gyrus at BA 6 and in the right inferior parietal lobule at BA 40 were accompanied by performance improvements in the auditory but not in the visual transfer task. It seems that the degree of auditory training was not yet sufficient to be manifested in significant performance improvements in the visual transfer task as well. It might be that the training was not intensive enough to result in performance increases in a far transfer task that does not match the trained modality. Thus, with a longer and more intense training we would expect substantial transfer effects of auditory working memory training to the visual transfer task as well, although less pronounced than to the auditory task, due to the non-matching training and transfer modalities.
Furthermore, it was surprising that we found activation increases from pretest to posttest without any training in the control group in both the auditory and the visual task. There is some evidence that within-session practice of working memory tasks can lead to altered brain activity (see Klingberg, 2010, and Buschkuehl et al., 2012, for recent reviews). However, in these studies activation decreased independently of performance. Nevertheless, some studies on working memory training found activation increases (Olesen et al., 2004; Jolles et al., 2010) or an inverted u-shaped function of activation changes (Hempel et al., 2004). But in those studies increases or the rising part of an inverted u-shaped function were only found for the training groups that trained longer than one or two sessions. Thus, the findings in our control group are not in line with those patterns of results. Alternatively, the increase of activation in the control group might be related to an increase in performance. As the control group did not practice, neural processing might not have become more efficient, such that the slight increase in performance might have been accompanied by more mental operations per time unit, which could have resulted in stronger activations in the respective brain areas.
Moreover, one limitation of this study is that we used a passive control group that did not receive any training. Consequently, the groups differed in how often they came to the lab and interacted with the experimenter, which can lead to motivational differences for task performance. However, if there were a motivational decline in the control group, one would expect performance to decrease from pretest to posttest. In our data, we do not find such an effect; instead we find performance increases in the control group that are numerically comparable for the auditory and visual transfer task. In particular, behavioral performance in the visual task is nearly identical to the performance of the training group. This is why we assume that factors other than working memory training are rather unlikely to account for the present data.
In conclusion, the present behavioral and functional data further strengthen the view that modality-specific training is not only possible within visual working memory (Schneiders et al., 2011) but also within the auditory modality. Specific behavioral improvements after auditory training were accompanied by specific activation decreases in the right inferior frontal gyrus. In an auditory working memory transfer task this intra-modal effect can be separated from amodal activation decreases in the right inferior parietal lobule and the superior part of the right middle frontal gyrus.
If one considers the activation changes of both our working memory training studies in conjunction, the data suggest a differentiation of the redistribution effects. Modality-specific decreases in the prefrontal cortex co-occurred with behavioral improvements: This was the case after visual training on a visual working memory task in the right middle frontal gyrus (Schneiders et al., 2011) and after auditory training on an auditory task in the right inferior frontal gyrus in the current study. In contrast, amodal activation decreases were found in more posterior regions independently of behavioral improvements irrespective of training modality in a visual transfer task (Schneiders et al., 2011) and after auditory training for a visual and an auditory transfer task in the present study.
The post-training modality-specific activation decreases in the prefrontal cortex that were accompanied by improved task performance suggest that the prefrontal cortex provides the most capacity for training-related efficiency. As it is known that IQ scores negatively correlate with prefrontal cortex activation, i.e., more intelligent participants show reduced activation in frontal regions compared to less intelligent ones in cognitively demanding tasks (Neubauer and Fink, 2009), it might be that prefrontal regions provide modality-specific capacities for cognitive plasticity. Last but not least, these results add to our understanding of working memory systems and processes by demonstrating that, in addition to a distinction between holding auditory and visual information in working memory (Baddeley, 2002, 2003; Zimmer, 2008), these systems seem to be plastic and trainable in a modality-specific way.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We deeply thank Xuchu Weng for organizational support on scanning facilities, Anna-Lena Scheuplein for her assistance with stimulus generation and programming, Changming Chen, Suo Tao, and Lijun Deng for their assistance with fMRI data acquisition. This research was funded by the German Research Foundation (IRTG 1457).
Crottaz-Herbette, S., Anagnoson, R. T., and Menon, V. (2004). Modality effects in verbal working memory: differential prefrontal and parietal responses to auditory and visual stimuli. Neuroimage 21, 340–351.
Erickson, K. I., Colcombe, S. J., Wadhwa, R., Bherer, L., Peterson, M. S., Scalf, P. E., Kim, J. S., Alvarado, M., and Kramer, A. F. (2007). Training-induced functional activation changes in dual-task processing: an fMRI study. Cereb. Cortex 17, 192–204.
Foxton, J. M., Talcott, J. B., Witton, C., Brace, H., McIntyre, F., and Griffiths, T. D. (2003). Reading skills are related to global, but not local, acoustic pattern perception. Nat. Neurosci. 6, 343–344.
Goebel, R., Esposito, F., and Formisano, E. (2006). Analysis of functional image analysis contest (FIAC) data with brainvoyager QX: from single-subject to cortically aligned group, general linear model analysis and self-organizing group independent component analysis. Hum. Brain Mapp. 27, 392–401.
Hempel, A., Giesel, F. L., Garcia Caraballo, N. M., Amann, M., Meyer, H., Wüstenberg, T., Essig, M., and Schröder, J. (2004). Plasticity of cortical activation related to working memory during training. Am. J. Psychiatry 161, 745–747.
Jolles, D. D., Grol, M. J., van Buchem, M. A., Rombouts, S. A. R. B., and Crone, E. A. (2010). Practice effects in the brain: changes in cerebral activation after working memory practice depend on tasks demands. Neuroimage 52, 658–668.
Jonides, J., Schumacher, E. H., Smith, E. E., Koeppe, R. A., Awh, E., Reuter-Lorenz, P. A., Marshuetz, C., and Willis, C. R. (1998). The role of parietal cortex in verbal working memory. J. Neurosci. 18, 5026–5034.
Landau, S. M., Schumacher, E. H., Garavan, H., Druzgal, T. J., and D'Esposito, M. (2004). A functional MRI study of the influence of practice on component processes of working memory. Neuroimage 22, 211–221.
Nystrom, L. E., Braver, T. S., Sabb, F. W., Delgado, M. R., Noll, D. C., and Cohen, J. D. (2000). Working memory for letters, shapes, and locations: fMRI evidence against stimulus-based regional organization in human prefrontal cortex. Neuroimage 11, 424–446.
Petrides, M., and Pandya, D. N. (2002). “Association pathways of the prefrontal cortex and functional observations,” in Principles of Frontal Lobe Function, eds D. T. Stuss and R. T Knight (New York, NY: Oxford University Press), 31–50.
Rodriguez-Jimenez, R., Avila, C., Garcia-Navarro, C., Bagney, A., Aragon, A. M., Ventura-Campos, N., Martinez-Gras, I., Forn, C., Ponce, G., Rubio, G., Jimenez-Arriero, M. A., and Palomo, T. (2009). Differential dorsolateral prefrontal cortex activation during a verbal n-back task according to sensory modality. Behav. Brain Res. 20, 299–302.
Schneiders, J. A., Opitz, B., Krick, C. M., and Mecklinger, A. (2011). Separating intra-modal and across-modal training effects in visual working memory: an fMRI investigation. Cereb. Cortex 21, 2555–2564.
Smith, G. E., Housen, P., Yaffe, K., Ruff, R., Kennison, R. F., Mahncke, H. W., and Zelinski, E. M. (2009). A cognitive training program based on principles of brain plasticity: results from the improvement in memory with plasticity-based adaptive cognitive training (IMPACT) study. J. Am. Geriatr. Soc. 57, 594–603.
Keywords: auditory, n-back task, training, visual, working memory, plasticity, fMRI
Citation: Schneiders JA, Opitz B, Tang H, Deng YA, Xie C, Li H and Mecklinger A (2012) The impact of auditory working memory training on the fronto-parietal working memory network. Front. Hum. Neurosci. 6:173. doi: 10.3389/fnhum.2012.00173
Received: 16 February 2012; Accepted: 29 May 2012;
Published online: 12 June 2012.
Edited by: Torsten Schubert, Ludwig-Maximilians University Munich, Germany
Reviewed by: Yvonne Brehmer, Karolinska Institute, Sweden
Andre Szameitat, Ludwig-Maximilians University Munich, Germany
Copyright: © 2012 Schneiders, Opitz, Tang, Deng, Xie, Li and Mecklinger. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.
*Correspondence: Julia A. Schneiders, Department of Psychology, Brain and Cognition Unit, Saarland University, Campus, Building A 2 4, 66123 Saarbrücken, Germany. e-mail: firstname.lastname@example.org