Working-memory training in younger and older adults: training gains, transfer, and maintenance

Working memory (WM), a key determinant of many higher-order cognitive functions, declines in old age. Current research attempts to develop process-specific WM training procedures, which may lead to general cognitive improvement. Adaptivity of the training as well as the comparison of training gains to performance changes of an active control group are key factors in evaluating the effectiveness of a specific training program. In the present study, 55 younger adults (20–30 years of age) and 45 older adults (60–70 years of age) received 5 weeks of computerized training on various spatial and verbal WM tasks. Half of the sample received adaptive training (i.e., individually adjusted task difficulty), whereas the other half worked on the same task material but at a low task difficulty level (active controls). Performance was assessed using criterion, near-transfer, and far-transfer tasks before training, after 5 weeks of intervention, as well as after a 3-month follow-up interval. Results indicate that (a) adaptive training generally led to larger training gains than low-level practice, (b) training and transfer gains were somewhat greater for younger than for older adults in some tasks, but comparable across age groups in other tasks, (c) far-transfer was observed to a test on sustained attention and for a self-rating scale on cognitive functioning in daily life for both young and old, and (d) training gains and transfer effects were maintained across the 3-month follow-up interval across age.


INTRODUCTION
Working memory (WM), the ability to maintain and manipulate information over short periods of time in the context of concurrent processing or distraction, is a key determinant of several higher-order cognitive functions, such as reasoning, fluid intelligence, problem solving, and language comprehension (Engle, 2002; Borella et al., 2010; Nettelbeck and Burns, 2010). WM functioning declines in late adulthood (Bopp and Verhaeghen, 2005; Payer et al., 2006; Borella et al., 2008) and is considered one of the main contributing factors to various cognitive impairments in old age (Park et al., 2002). Hence, investigating the possibilities of improving WM functioning in older adults should be highly relevant to everyday cognition in late life. A large number of training studies have investigated the trainability of WM across the lifespan (for reviews, see Klingberg, 2010; Shipstead et al., 2010; Takeuchi et al., 2010; Morrison and Chein, 2011).
To investigate the effects of a training program, the choice of control group is critical. No-contact (passive) control groups are most commonly used. Here, participants perform pre- and post-training tests to rule out effects based on the fact that the same test is performed twice (i.e., test-retest effects), but participants are not contacted during the training phase (e.g., Li et al., 2008; Dahlin et al., 2008a; Chein and Morrison, 2010; Schmiedek et al., 2010). However, in addition to test-retest effects, the task environment (e.g., performing a specific task regularly, receiving feedback, being challenged with a new testing situation, having contact with test leaders, expectations about performance improvements due to training) might influence performance. These influences cannot be eliminated by using a passive control group. Only a few studies have used active control groups. In these studies, the controls typically perform activities unrelated to the targeted cognitive function (e.g., quizzes, questionnaires of autobiographical memory and well-being, physical activity, watching DVDs), matched on time and effort with the actual training program (e.g., Mahncke et al., 2006; Buschkuehl et al., 2008; Borella et al., 2010; Richmond et al., 2011). However, this procedure has the disadvantage that participants are engaged in quite different tasks that might affect performance differently.
In the present study, we investigated training gains, transfer effects, and 3-month maintenance effects of an intensive computerized WM training in younger and older adults. Experimental as well as control groups worked on the same training software, the only difference being that the experimental groups received adaptive training, while the control groups worked at a constant low task difficulty level. Adaptive training (i.e., individualized adjustment of task difficulty levels) is known to contribute to the efficiency of memory training and to allow individuals to make optimal use of their latent potential (Baltes et al., 1989; Klingberg et al., 2002; Brehmer et al., 2007; Hertzog et al., 2009). We used a process-specific WM training regimen (Park et al., 2007; Morrison and Chein, 2011), with abstract and new stimulus configurations presented at each trial, designed to target domain-general WM mechanisms and to minimize the formation and use of domain-specific strategies. Based on previous findings, we expected (a) younger and older adults to benefit from WM training, (b) near-transfer effects to non-trained WM tasks but also some far-transfer to tasks that share similar underlying processes (i.e., attention, reasoning), and (c) maintenance effects for younger as well as older adults across the 3-month time interval for the training gains as well as for potential transfer effects.

PARTICIPANTS
Participants were recruited through a newspaper advertisement according to the following inclusion criteria: (a) aged between 20 and 30 years or between 60 and 70 years, (b) healthy, with no history of psychiatric or neurological disease, (c) no prior experience with computerized WM training, and (d) access to a PC with Internet connection at home.
One hundred and six adults who fulfilled the inclusion criteria were randomized to either adaptive training or low-level practice (active control) groups. Six persons withdrew from the study after baseline testing (four from the training group and two from the control group) due to technical problems, lack of time, or illness. In the final study sample, 55 younger adults (M age = 26.0, 32 females) and 45 older adults (M age = 63.8, 27 females) completed: (a) cognitive baseline assessment, (b) 5 weeks of intervention, and (c) cognitive post-training assessment. One younger adult did not attend the 3-month follow-up assessment due to moving abroad. Hence, valid results of 99 individuals were available for the 3-month follow-up assessment.
The adaptive training and control groups did not differ significantly in age, education, or gender distribution (ps > 0.80; see Table 1 for sample descriptives). Regarding completed training days, the two intervention groups did not differ significantly, F(1, 96) = 2.8, p = 0.10; however, older adults trained on average 1 day more than younger adults, 24.6 days and 23.5 days, respectively, F(1, 96) = 10.60, p < 0.05, ηp² = 0.10. The study was approved by the local ethics committee at the Karolinska Hospital, Stockholm, Sweden. All participants were paid SEK 3000 (approximately 440 USD) for participation.

DESIGN AND PROCEDURES
This study focuses on age- and intervention-related effects in the cognitive tests assessed before and after 5 weeks of adaptive WM training/low-level practice, as well as at a 3-month follow-up. A more detailed examination of the 5 weeks of intervention is described elsewhere (Brehmer et al., 2009; Bellander et al., 2011).

Cognitive intervention
The WM training was implemented using a commercial software product (Cogmed QM), which ran on the participants' PCs at home. Individuals trained for 20-25 days (minimum 20 days) on seven verbal and non-verbal WM tasks. All tasks involved: (1) maintenance of multiple stimuli at the same time, (2) short delays during which the representation of stimuli should be held in WM, and (3) unique sequencing of stimulus order in each trial (for details of the trained tasks, see Bellander et al., 2011).

Adaptive training
In total, individuals trained on 90 WM trials per day, and needed on average 26 minutes to complete a training session. In the first session, individuals started each task at the same low difficulty level, namely remembering 2 items. Across training, task difficulty was adjusted as a function of individual performance. Specifically, task difficulty was adjusted by increasing/decreasing the number of items individuals had to remember, such that they reached approximately 60% correct per day for each task (for details about the trained tasks and the adaptive training algorithm, see Cogmed QM; www.cogmed.com, Klingberg et al., 2002). Each training session started at the task difficulty level where the participant ended in the previous session. The test leader provided feedback on the training data once a week via e-mail and controlled the data for potential breaks, interruptions, and unusual performance fluctuations. No problems were observed for any participant.
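The adjustment rule described above can be illustrated with a minimal sketch. The actual Cogmed QM algorithm is not specified in this text, so the block-wise up/down rule below is an assumption chosen only to show how span length could be steered toward roughly 60% correct. The function name `adjust_span` and the parameters `target`, `tolerance`, and `min_span` are hypothetical.

```python
def adjust_span(span, outcomes, target=0.60, tolerance=0.10, min_span=2):
    """Return the number of items to remember in the next block of trials.

    Assumed rule (not the Cogmed QM algorithm): compare the proportion of
    correct trials in the last block with a target accuracy and move the
    span up or down by one item when accuracy leaves the tolerance band.
    """
    if not outcomes:
        return span  # no trials completed yet, keep the current difficulty
    accuracy = sum(outcomes) / len(outcomes)
    if accuracy > target + tolerance:
        return span + 1  # task is too easy: add one item
    if accuracy < target - tolerance:
        return max(min_span, span - 1)  # too hard: remove one item, never below the start level
    return span  # accuracy is close to the target: keep the span


# Example: a participant starting at 2 items who gets every trial right moves to 3 items.
next_span = adjust_span(2, [True, True, True, True, True])
```

In this sketch the low-level practice condition corresponds to never calling `adjust_span`, so that the span stays at the starting level of two items.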

Low-level practice
Individuals in the active control groups worked on the same computerized WM program as the adaptive training groups. The main difference between the groups was that task difficulty remained constant at the same low starting level for the controls, namely remembering two items. In addition, to adjust for differences in time on task due to the increased number of items per task in the adaptive training group, the control groups worked on 120 stimuli on each task and day. For motivational reasons, individuals were told that they were participating in speed training that may have a positive impact on cognitive functioning.

Cognitive assessment at baseline, post-training, and follow-up
Before and after the 5 weeks of intervention as well as after a 3-month time interval, all individuals were examined with the same set of eight cognitive tests to assess training-related performance gains in criterion tasks (WM tasks) similar to the ones participants trained on for 5 weeks (Span Board forward, Digit Span backward; Wechsler, 1981), near-transfer tasks (Span Board backward, Digit Span forward; Wechsler, 1981), as well as far-transfer tasks (sustained attention, PASAT, Gronwall, 1977; interference control, Stroop, Dodrill, 1978; episodic memory, RAVLT, Lezak, 1983; and non-verbal reasoning, RAVEN, Raven, 1995). In addition, participants completed a self-rating scale for cognitive functioning in daily life (CFQ; Broadbent et al., 1982) at all three measurement occasions. For more details on the tasks, see Klingberg et al. (2002) and Westerberg et al. (2007). Before baseline assessment, participants were randomly assigned to two groups receiving either adaptive training or low-level practice. Participants as well as test leaders were blind to group assignment. In addition, individuals' training accounts were locked after post-training assessment. Thus, it was not possible to practice further between post-training and 3-month follow-up assessment.

Performance gains during training
This analysis was restricted to the adaptive training groups, because the performance of the control groups was fixed at a low level across the 5 weeks of intervention. All participants in the adaptive training groups completed at least 20 training sessions. Participants' daily performance on the seven different WM tasks was aggregated into one T-standardized WM performance score. Weekly WM performance scores were used for analysis (for details on the rationale and implementation of this score, see Bellander et al., 2011 and Brehmer et al., 2011). A mixed repeated-measures ANOVA was conducted with age (young, old) as between-subject factor and time (weeks 1-4) as within-subject factor to investigate performance gains during the training period and potential age differences therein.
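The aggregation step can be sketched as follows. The exact reference distribution used for standardization is described in the cited papers rather than in this text, so the sketch assumes that each task is standardized against a given reference mean and standard deviation (T = 50 + 10z) and that the composite is the mean of the task T-scores per week. The names `t_score`, `weekly_composite`, and the example values are hypothetical.

```python
from statistics import mean


def t_score(raw, ref_mean, ref_sd):
    """Convert a raw score to a T-score (mean 50, SD 10) relative to a reference distribution."""
    return 50 + 10 * (raw - ref_mean) / ref_sd


def weekly_composite(weekly_raw, reference):
    """Average task-wise T-scores into one WM composite per week.

    weekly_raw: {task name: [raw score in week 1, week 2, ...]}
    reference:  {task name: (reference mean, reference SD)}  # assumed to be known
    """
    n_weeks = len(next(iter(weekly_raw.values())))
    composites = []
    for week in range(n_weeks):
        t_values = [
            t_score(scores[week], *reference[task])
            for task, scores in weekly_raw.items()
        ]
        composites.append(mean(t_values))
    return composites


# Illustrative values for two tasks and two weeks (not data from the study).
example = weekly_composite(
    {"task_a": [4, 5], "task_b": [3, 5]},
    {"task_a": (4, 1), "task_b": (3, 2)},
)
```

Putting all tasks on a common T-scale before averaging keeps tasks with larger raw score ranges from dominating the composite.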

Cognitive performance
One-way ANOVAs were conducted separately for the eight criterion and transfer tasks and the self-rating scale of cognitive functioning to examine potential baseline differences between the age and intervention groups. To determine differences in training-related changes in the age and intervention groups, mixed repeated-measures ANOVAs were conducted with age (young and old) and intervention (adaptive training and low-level practice) as between-subject factors and time (baseline, post-training, and follow-up) as within-subject factor for the eight cognitive tasks and the self-rating scale, respectively. Follow-up analyses were conducted comparing baseline to post-training and post-training to 3-month follow-up assessment. For all analyses, alpha levels were set to 0.05 and effect sizes refer to partial eta-square values.
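For readers who wish to check reported effect sizes, partial eta-square can be recovered from an F value and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it; the function name `partial_eta_squared` is hypothetical and the code is not taken from the authors' analysis scripts.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-square computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Age effect on completed training days reported above: F(1, 96) = 10.60.
age_effect = round(partial_eta_squared(10.60, 1, 96), 2)
```

With these inputs the result rounds to the value of 0.10 reported for that effect.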

COGNITIVE BASELINE PERFORMANCE
Older adults' baseline performance was lower in all cognitive tasks and the self-rating scale compared to that of younger adults (ps < 0.05), the only exceptions being Digit Span forward and backward, where both age groups performed equally well (Fs < 1). The two intervention groups did not differ in their baseline performance (Fs < 1.3) in the self-rating scale or any of the different cognitive tasks apart from the RAVLT, where the low-level practice groups performed better than the adaptive training groups, F(1, 96) = 4.53, p = 0.04, ηp² = 0.05.

Far-transfer tasks
For all far-transfer tasks (i.e., PASAT, Stroop, RAVLT, and RAVEN), main effects of age (ps < 0.01) were observed, indicating higher overall performance for younger than for older adults. Apart from the RAVLT (F < 1), all tests showed an additional main effect of time (ps < 0.01), indicating general performance improvements across time for all groups. More importantly, an intervention × time interaction was observed for PASAT, F(2, 192) = 7.64, p = 0.001, ηp² = 0.07, indicating that the adaptive training groups improved more than the low-level practice groups across the 5 weeks of intervention. No other effects reached significance (ps > 0.05, see Table 2).
Regarding the self-rating scale on cognitive functioning (CFQ), younger adults generally reported fewer memory complaints than older adults, F(1, 96) = 9.78, p = 0.002, ηp² = 0.09, and memory complaints generally decreased across time, F(2, 192) = 9.06, p < 0.001, ηp² = 0.09. Further, an intervention × time interaction was obtained, F(2, 192) = 3.22, p = 0.045, ηp² = 0.03, reflecting that the adaptive training groups reduced their memory complaints more than the low-level practice groups across the 5 weeks of intervention (see Table 2).

Maintenance effects
Further inspection of the time-related effects revealed that in all criterion and near-transfer tasks, as well as in two far-transfer tasks (i.e., RAVEN, PASAT) and the CFQ, the significant main effect of time was based on the difference between baseline and post-training (ps < 0.05), whereas post-training and 3-month follow-up performance did not differ reliably (ps > 0.05). Thus, the performance level reached after 5 weeks of intensive WM training was maintained across 3 months. Further, for RAVLT no performance change across time was observed; for Stroop, decreases in time on task were observed after the 5 weeks of intervention as well as at the 3-month follow-up (F(1, 96) = 43.24, p < 0.001, ηp² = 0.31; F(1, 96) = 13.06, p < 0.001, ηp² = 0.12, respectively). For the intervention × time interaction in the criterion and near-transfer tasks, as well as for PASAT and CFQ, follow-up analyses again revealed a significant difference between baseline and post-training (ps < 0.05), but not between post-training and follow-up (ps > 0.05), indicating that the difference between adaptive training and low-level practice groups was maintained across the 3-month follow-up interval.

DISCUSSION
The present study investigated the effects of 5 weeks of intensive domain-general adaptive WM training in comparison to low-level practice in younger and older adults. Performance was assessed using criterion, near-transfer, and far-transfer tasks before training, after 5 weeks of intervention, as well as after a 3-month time interval. Younger as well as older adults gained considerably from adaptive WM training. Although younger adults showed larger training gains than older adults during the first week, both age groups gained similarly after the second week. Both younger and older adults gained more in some criterion and non-trained WM tasks (Digit Span) in comparison to controls receiving low-level practice, although we observed larger gains and transfer effects for the young in other criterion and near-transfer tasks (Span Board). Regarding far-transfer, similar performance improvements for the adaptive training as well as the active control groups were observed for tests of interference control (Stroop) and reasoning (RAVEN). These findings demonstrate general test-retest effects. More interestingly, both younger and older adults receiving adaptive training showed larger performance gains in a test measuring sustained attention (PASAT) and reported fewer memory complaints (CFQ) after the 5 weeks of intervention than the controls. Further, the observed training gains and transfer effects were maintained across a 3-month time interval. The same set of eight cognitive tests was used to assess training-related performance gains and transfer effects at the three assessment occasions (i.e., baseline, post-training, 3-month follow-up). Thus, potential retest influences on the observed performance changes cannot be excluded. However, this possibility does not affect the observed training and transfer effects, because the inclusion of an active control group accounted for test-retest effects, ensuring that the additional performance changes resulted from the adaptive WM training.

Frontiers in Human Neuroscience
www.frontiersin.org
We used a domain-general computerized WM training paradigm, which has been employed in previous studies with children, younger adults, as well as persons with acquired brain lesions (Klingberg et al., 2002, 2005; Olesen et al., 2004; Westerberg et al., 2007; Holmes et al., 2009; Thorell et al., 2009; Jolles et al., 2010). These studies consistently observe near-transfer effects to non-trained WM tasks and often far-transfer effects to tests of attention, interference control, and reasoning. Our findings are in line with this previous work and other training studies in the aging domain regarding near-transfer effects to non-trained WM tasks and far-transfer to sustained attention (e.g., Mahncke et al., 2006; Mozolic et al., 2010, 2011; Richmond et al., 2011). Our expectations regarding transfer effects relied on the assumption that training and transfer tasks have to tap similar underlying processes required for successful performance (Thorndike and Woodworth, 1901). Our domain-general WM training included processes like attention control, gating the flow of information, reducing interference while requiring maintenance of stored information, and rapid shifting between encoding and retrieval demands. Hence, the transfer of our WM intervention to the PASAT suggests that the training also improved attentional focusing.
Most often, transfer effects in older adults are difficult to demonstrate and, when observed, they are reduced compared to younger adults (Li et al., 2008; Dahlin et al., 2008a,b; Karbach and Kray, 2009; Schmiedek et al., 2010; Richmond et al., 2011; Zinke et al., 2012). In accordance with this research, we observed larger training gains for younger than for older adults in one of the two criterion tasks (Span Board Forward) and one of the near-transfer tasks (Span Board Backward). However, younger and older adults did not differ in training and transfer effects in the two Digit Span tasks. It remains unclear why we observed age differences in the magnitude of gains for the Span Board tasks, but not for the Digit Span tasks. With regard to Digit Span, previous studies have not observed any improvements after WM training in older adults (Li et al., 2008; Dahlin et al., 2008b; Richmond et al., 2011). Future training studies should consider the approach suggested by Lövden et al. (2010), using established hierarchical structures of cognitive abilities instead of single tests (see also Schmiedek et al., 2010) to assess training and transfer effects. However, in our study performance gains on the PASAT as well as for the CFQ were also comparable in younger and older adults. These findings are in line with some previous studies (e.g., Bherer et al., 2006; Carretti et al., 2007), suggesting that our training paradigm is a sensitive means of detecting cognitive plasticity even in older individuals. This might reflect the adaptive algorithm used in our study, which provides a challenging task situation for the participants, the variety of verbal and non-verbal tasks used in the training program (i.e., domain-general instead of process-specific), as well as the structure and appearance of the training program.
Younger as well as older adults reported fewer memory complaints (CFQ) after adaptive WM training in comparison to participants in the active control groups. This was the case even though participants were blind to group assignment; hence placebo/expectancy effects would be an unlikely explanation for this finding. These self-reported cognitive improvements may have important implications for everyday cognitive functioning and should be investigated further in future studies (see also Richmond et al., 2011).
The nature of our control group needs to be highlighted. To our knowledge, this is the first study in the aging domain using an active control group where individuals worked on the same task material as the experimental group, the only difference being that task difficulty was fixed at a low level. The use of such an active control group (as opposed to no-contact controls or active controls performing different tasks) provides a conservative assessment of training effects, because the influence of various unspecific factors (e.g., stimulus-response mappings, motivation, test familiarity, performance anxiety, expectations) is attenuated (Zehnder et al., 2009; Shipstead et al., 2010). Although no direct measure of motivation was included in the present study, the uniformly high number of training sessions (i.e., 24.6 for older adults and 23.5 for younger adults) for the experimental and control groups (a minimum of only 20 sessions was required) suggests highly engaged and committed participants in both training groups.
This conservative assessment of training and transfer gains strengthens the impact of our observed effects and suggests caution in comparing our results with other studies using passive control groups, especially in light of the fact that we observed comparable performance improvements for adaptive training groups and the controls for some of the transfer tasks (interference control, reasoning). To be able to disentangle different performance-influencing factors and to make assumptions about the value of adaptive training over low-level practice, future studies should include both active and passive (no-contact) control groups.