Altered Function of Ventral Striatum during Reward-Based Decision Making in Old Age

Normal aging is associated with a decline in different cognitive domains and local structural atrophy as well as decreases in dopamine concentration and receptor density. To date, it is largely unknown how these reductions in dopaminergic neurotransmission affect human brain regions responsible for reward-based decision making in older adults. Using a learning criterion in a probabilistic object reversal task, we found a learning stage by age interaction in the dorsolateral prefrontal cortex (dlPFC) during decision making. While young adults recruited the dlPFC in an early stage of learning reward associations, older adults recruited the dlPFC when reward associations had already been learned. Furthermore, we found a reduced change in ventral striatal BOLD signal in older as compared to younger adults in response to high probability rewards. Our data are in line with behavioral evidence that older adults show altered stimulus–reward learning and support the view of an altered fronto-striatal interaction during reward-based decision making in old age, which contributes to prolonged learning of reward associations.

INTRODUCTION
Aging is associated with a decline in different cognitive domains, such as working memory, episodic memory, fluid aspects of intelligence, and executive functioning (Lindenberger et al., 1993; Craik and Salthouse, 2000). Many aspects of cognitive abilities as well as motor functions rely on the functional integrity of the dopaminergic system (Burns et al., 1983; Sawaguchi and Goldman-Rakic, 1991; Wittmann et al., 2005; Schultz, 2007 for review). During aging, dopamine concentration and receptor density gradually decrease, especially in the prefrontal cortex (PFC) and the striatum (Bäckman and Farde, 2005). Furthermore, the PFC and basal ganglia appear to be susceptible to age-related decreases in volume (Raz, 2000; Raz and Rodrigue, 2006). Based on these findings, the dopamine hypothesis of cognitive aging states that neurochemical alterations of the dopaminergic system give rise to declines in various cognitive functions in old age (Bäckman et al., 2006).
The dopaminergic system also plays a major role in reward-based learning and decision making (Pagnoni et al., 2002; O'Doherty et al., 2003; Montague et al., 2006). Neurophysiological studies in animals and, more recently, neuroimaging studies in healthy young humans have identified a neural network, which represents different modalities and aspects of reward processing and reward association learning. This network includes the dopaminergic midbrain (substantia nigra/ventral tegmental area, SN/VTA) and its target structures such as the PFC [orbitofrontal cortex (OFC), ventromedial and dorsolateral PFC (dlPFC)], the amygdala, and the ventral striatum (VST, Schultz, 2000 for review). It has been proposed that dopamine neurons generate a "prediction error" serving as a global learning signal (Montague et al., 1996; Schultz et al., 1997). Neuroimaging studies in humans have confirmed corresponding responses in SN/VTA (Duzel et al., 2009) and the striatum (McClure et al., 2003). The VST among other frontal cortical areas has been shown to be involved in processing the delivery (Delgado et al., 2000; Elliott et al., 2000; Breiter et al., 2001), anticipation (Knutson et al., 2001), and predictability of rewards (Pagnoni et al., 2002). Most of these findings are in line with the hypothesis that VST responses signal errors in the prediction of rewards (McClure et al., 2004; Schultz, 2007).
Little is known, however, about how these processes are altered in the aging brain. Behavioral evidence suggests that reward-based decision making is impaired in old age. In probabilistic reversal learning tasks, older participants are deficient in the acquisition and reversal learning of reward associations (Mell et al., 2005; Weiler et al., 2008). Moreover, older participants show insufficient category learning and set shifting as assessed with the Wisconsin Card Sorting Test (WCST, Ridderinkhof et al., 2002; Rhodes, 2004) and impaired decision making as assessed with the Iowa Gambling Task (Denburg et al., 2005).
Few human neuroimaging studies have focused on age-related neurofunctional changes of the reward system and reported striatal dysfunction in old age (Nagahama et al., 1997; Esposito et al., 1999; Fera et al., 2005; Mohr et al., 2009). Tasks used in those studies lack the separation of specific processes, such as decision making and reward processing. Furthermore, it is unclear whether the neurofunctional differences observed in those studies are due to performance differences or changes of the reward system.
More recent functional magnetic resonance imaging (fMRI) studies have compared younger and older participants performing a monetary incentive delay (MID) task (Samanez-Larkin et al., 2007;Schott et al., 2007). Schott et al. found greater increases in BOLD signal in the VST in younger adults during gain anticipation and in older adults during reward feedback (Schott et al., 2007). Another study reported age-related differences in VST during loss anticipation indicating an altered reward system in old age (Samanez-Larkin et al., 2007). In the MID task, associations between cues and outcomes are learned before the actual experiment; the task does not focus on the learning process and does not have a choice component.
To study reward-based decision making in old age, we used a probabilistic object reversal task (pORT, Heekeren et al., 2007). We hypothesized that the PFC and VST would show an altered response during reward association learning in old age. Importantly, our design allows the separation of reward processing and decision making. Furthermore, a learning criterion enabled us to: (1) assess learning-related changes, and (2) control for the potentially confounding effect of performance differences by comparing responses in trials where older adults performed as well as younger adults.

PARTICIPANTS
Twenty-eight healthy right-handed participants, 14 younger (8 females, mean age 26.48, SD ± 3.96 years) and 14 older adults (7 females, mean age 67.82, SD ± 5.01 years), participated in this study. Participants were recruited through newspaper advertisements and received 40€ for participation. Groups were matched for formal education (see Table 1 for sample characterization).
None of the participants reported cardiovascular pathology, medication, history of neurological or psychiatric episodes or substance abuse. We applied a neuropsychological test battery to further characterize the sample and to assess for signs of dementia in each participant (see Table 2 for descriptive statistics). This included assessing episodic memory [Ten-Word-List with Encoding Enhancement (Reischies et al., 2000), Free and Cued Selective Reminding Test (Grober et al., 1988)], processing speed [Reitan Trail Making Task Part A and B (Reitan, 1958), Digit Letter Test (Lindenberger et al., 1993)], crystallized intelligence [Multiple Choice Vocabulary Test, Mehrfach-Wortwahl-Wortschatz test Version B (Lehrl et al., 1995)], reasoning [Leistungsprüfsystem Test 3 (LPS-3, Horn, 1983)], and executive functions [Self Ordered Pointing Task (Petrides and Milner, 1982), Tower of London Task (Shallice, 1982), Controlled Oral Word Association Task (Benton and Hamsher, 1976), Stroop Color-Word Interference Test (Stroop, 1935), WCST (Grant and Berg, 1948)]. As expected, younger participants performed significantly better than older participants on most tasks assessing processing speed, reasoning, executive functions, and episodic memory. Older participants scored higher on the multiple choice vocabulary test and were above normal age performance on the LPS-3 and the Digit Letter Test (see Table 2).
In conclusion, psychometric data indicate that we included healthy older participants with a high level of cognitive abilities. The study was approved by the local ethics committee of the Charité-University Medicine Berlin, and written consent was obtained from each participant prior to participation.

PROBABILISTIC OBJECT REVERSAL TASK
In object reversal tasks, the participant typically has to learn object-reward associations between repetitively presented items. After a given time or criterion, the reward schedule changes so that another item is maximally rewarding. In this version, participants viewed four of six letters (C, F, H, N, R, S) simultaneously on a screen (Figure 1A). Each letter corresponded to an abstract non-monetary feedback cue ranging between −40 and +40 points (40, 20, 0, −20, −40; Reischies, 1999). Participants had to choose one of the letters and indicate their choice with a button press on a four-button response box. The participant's task was to collect as many points as possible; that is, to find the maximally rewarding letter.
To increase task difficulty and to reduce predictability, we introduced a probabilistic variation in the outcome schedule: for each letter, a fixed payoff was used in 80% of the trials. In the remaining 20%, only a reduced amount of points was delivered (e.g., 20 instead of 40 points). After six to eight continuous successful trials, reward contingencies covertly changed; that is, the participant had to learn new reward associations. If they did not learn this association within 15 trials, the feedback schedule also covertly changed. Thirty control trials were also randomly inserted. The experiment consisted of a total of 12 blocks with different feedback schedules. Stimuli were presented for 3500 ms. The duration of this period was adjusted according to reaction times in previous studies using this task in older adults to account for age differences in processing speed. The stimulus presentation was separated from the presentation of feedback (1000 ms) by inserting a fixation period of variable duration (2000 ± 1500 ms). The feedback presentation was followed by a variable intertrial interval (ITI, 2500 ± 2000 ms) during which a fixation cross was presented. In control trials, the participant had to choose the letter "X" from a display of three "Y" and one "X" and received 0 points. Prior to the fMRI experiment, participants performed one block without reversal and were told not to use any strategy or rule.
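The outcome schedule described above can be illustrated with a minimal sketch. It is not the original task code (which was written in C++); the function names, the fixed seed, and the size of the reduction (20 points, generalized from the single example of 20 instead of 40 points) are assumptions made for illustration only.

```python
import random


def draw_outcome(fixed_payoff, rng, p_fixed=0.8, reduction=20):
    """Return the points delivered for one choice of a letter.

    With probability p_fixed the letter's fixed payoff is delivered;
    otherwise a reduced amount is delivered. The reduction of 20 points
    is an assumption generalized from the example in the text.
    """
    if rng.random() < p_fixed:
        return fixed_payoff
    return fixed_payoff - reduction


def schedule_should_change(successes_in_a_row, trials_in_block,
                           success_criterion, max_trials=15):
    """Covert reversal rule: the success criterion (six to eight
    consecutive successful trials) was reached, or 15 trials passed
    without learning (assumed here to mean trials_in_block >= 15)."""
    return (successes_in_a_row >= success_criterion
            or trials_in_block >= max_trials)


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
outcomes = [draw_outcome(40, rng) for _ in range(10_000)]
share_fixed = outcomes.count(40) / len(outcomes)
print(f"share of full payoffs for the best letter: {share_fixed:.3f}")
```

Over many simulated choices of the maximally rewarding letter, the share of full payoffs should settle close to 0.8, which is why a single reduced outcome is not sufficient evidence for a reversal.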
For post hoc analysis, we defined an additional learning criterion that defines a "learned" trial as any correct trial after two successful trials and a continuous correctness of 80%. According to this criterion, we grouped trials into two stages of learning. In "search" trials, participants looked for the maximally rewarding letter by trial and error until they reached the learning criterion, whereas in "learned" trials they showed successful learning according to the criterion (cf. Figure 1B).
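One possible reading of this learning criterion is sketched below. The exact operationalization is not spelled out in the text, so the interpretation is an assumption: a trial counts as "learned" if it is correct, follows a run of at least two consecutive correct trials, and accuracy from the start of that run up to the current trial is at least 80%. All other trials, including errors made after the criterion was reached, are labeled "search" here. The function name and the example sequence are hypothetical.

```python
def label_trials(correct, min_run=2, min_accuracy=0.8):
    """Label each trial of one block as 'search' or 'learned'.

    correct: sequence of 0/1 (or False/True), one entry per trial.
    Assumed rule: after min_run consecutive correct trials, every later
    correct trial is 'learned' while accuracy since the start of that
    run stays >= min_accuracy.
    """
    labels = []
    run_start = None  # index of the first trial of the qualifying run
    streak = 0        # current number of consecutive correct trials
    for i, is_correct in enumerate(correct):
        learned = False
        if run_start is not None:
            window = correct[run_start:i + 1]
            accuracy = sum(window) / len(window)
            if accuracy < min_accuracy:
                run_start = None  # criterion lost, back to searching
            elif is_correct:
                learned = True
        labels.append("learned" if learned else "search")
        streak = streak + 1 if is_correct else 0
        if run_start is None and streak >= min_run:
            run_start = i - min_run + 1
    return labels


# Hypothetical block of ten trials (1 = most profitable letter chosen).
example = [0, 1, 0, 1, 1, 1, 1, 0, 1, 1]
labels = label_trials(example)
for trial, (c, lab) in enumerate(zip(example, labels), start=1):
    print(trial, c, lab)
```

Under this reading, the two trials that establish the run are still "search" trials, and a single error after the criterion does not reset the stage as long as running accuracy stays at or above 80%.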
We assessed several behavioral parameters: the total amount of collected points throughout the task ("global score"), and the number of trials in which participants did not choose the most profitable letter of the current block ("errors"). The amount of total errors was subdivided into "perseverative errors" in which participants kept consistently choosing a letter no longer associated with the maximum feedback, and "random errors". In addition, the number of blocks in which the learning criterion was reached was determined for each participant (number of successful blocks; for a more detailed description, see Mell et al., 2005).

FIGURE 1 | Probabilistic object reversal task. (A) Experimental design: Participants chose one of four letters presented for 3.5 s (letter chosen marked red). After a randomized delay, abstract non-monetary feedback cues ("points") were presented. (B) Performance of one participant. Choices and resulting outcomes are plotted throughout one block. The red marked trials were assigned to "search" trials whereas the blue marked trials were treated as "learned" trials.

fMRI DATA ACQUISITION
fMRI was performed on a 1.5-T Magnetom Vision scanner (Siemens Medical Systems, Erlangen, Germany) with a standard head coil. Head movement was minimized using a vacuum pad. First, two structural 3D data sets were acquired for all participants using a T1-weighted sequence (FLASH, TR 20 ms, TE 5 ms, FA 30°, FOV 256 mm, matrix 256 × 256, one hundred and eighty 1-mm slices, in-plane resolution: 1 mm²). Thereafter, three runs, each consisting of 210 functional images using a T2*-weighted gradient echo sequence, were acquired (TR 2500 ms, TE 40 ms, FA 90°, FOV 256 mm, matrix 64 × 64, twenty-six 4-mm slices, inter-slice gap 0.6 mm, in-plane resolution: 4 mm², ascending acquisition of images). The task was programmed in C++ and was projected on a back-projection screen using an LCD projector. We used a rapid event-related design. The event schedule was optimized using Optseq2 (http://surfer.nmr.mgh.harvard.edu/optseq, Dale, 1999). Durations of the delay period and ITI were determined by a genetic algorithm (Wager and Nichols, 2003). Variation of the delay period and ITI allowed us to acquire scans in temporal asynchrony to the task in order to avoid a systematic bias in sampling over peristimulus time. During functional scanning, each participant completed 164.6 ± 3.6 trials.

Behavioral data
Raw data were tested for homogeneity of variance using Levene's test. The global score as well as the number of successful blocks violated this assumption. Therefore, group comparisons for the global score as well as the number of successful blocks were subjected to non-parametric statistics (Mann-Whitney U test). All other scores were calculated using t-tests for independent groups. Analyses were performed using the SPSS software package (SPSS Inc., Cary, NC, USA).
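For readers unfamiliar with the non-parametric comparison, the following sketch shows how a Mann-Whitney U statistic is obtained from two independent samples by counting pairwise wins, with ties counted as one half. It is not the SPSS procedure used in the study and does not compute a p-value; the function name and the two small score lists are hypothetical and serve only as illustration.

```python
def mann_whitney_u(x, y):
    """Return the smaller of the two Mann-Whitney U statistics.

    U for sample x counts the pairs (xi, yj) with xi > yj, with ties
    counted as 0.5. The two U values sum to len(x) * len(y).
    """
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)


# Hypothetical global scores, NOT data from the study.
younger_scores = [410, 395, 430, 388, 402]
older_scores = [350, 395, 372, 340, 361]
u = mann_whitney_u(younger_scores, older_scores)
print("U =", u)
```

A half-integer value such as the reported U = 38.5 arises when at least one pair of observations from the two groups is tied.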

fMRI data
Imaging data were analyzed using a mixed effects approach within the framework of the general linear model as implemented in the statistical parametric mapping software package SPM2. Six functional volumes were excluded to avoid magnetic saturation effects. Slice-time correction, realignment and spatial smoothing (Gaussian Kernel, FWHM = 10 mm × 10 mm × 11.5 mm) were applied. Our design allowed us to define separate regressors for decision making and reward processing within each trial. Decision making was defined as the period between presentation of the letters and the button press response. Reward processing was defined as the time during which participants watched the feedback cue indicating the number of points earned for the preceding choice. Additional regressors for "learned" and "search" trials were defined for decision making and reward processing, respectively. These regressors resulted in the following contrasts: decision making "learned" vs. "search" and reward processing "learned" vs. "search". Note that for the analysis of reward processing we only compared trials in which participants received 40 points, either by chance ("search") or as the expected outcome ("learned"). The resulting contrast images were used to analyze the main effects of learning stage ("search"/"learned") within each group using one-sample t-tests for younger and older adults separately in a mixed effects model, treating participants as random (p < 0.005, uncorrected). Afterwards, a between-groups analysis was performed using a two-sample t-test in order to study age differences (interaction of learning stage × age, p < 0.005, uncorrected). We assigned neuroanatomical labels to coordinates of each contrast map by converting the MNI coordinates to Talairach coordinate space (Talairach and Tournoux, 1988) using the Talairach Daemon (Lancaster et al., 2000).
For a region of interest analysis in right VST and right dlPFC, we functionally defined a sphere with a diameter of 10 mm around each peak voxel in those contrast maps where a significant signal increase was found. GLM beta weights were obtained for each participant. To test whether age-related differences in BOLD response in dlPFC could be explained by age-related differences in other cognitive abilities we conducted an analysis of covariance (ANCOVA). We entered parameter estimates in the dlPFC region modulated by the learning stage × age interaction as dependent variable, scores of other tests assessing executive functions, processing speed, or episodic memory as covariates, and age group as a two-level independent group variable (younger vs. older).
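The construction of a spherical region of interest can be sketched as follows. This is an illustration only, not the SPM2-based procedure used in the study: the 2 mm isotropic grid and the function names are assumptions, and the peak coordinate is the right dlPFC peak reported for the learning stage × age interaction (x = 52, y = 16, z = 28).

```python
import itertools
import math


def sphere_offsets(radius_mm, voxel_size_mm):
    """Integer voxel offsets whose centers lie within radius_mm of the peak."""
    n = int(math.floor(radius_mm / voxel_size_mm))
    offsets = []
    for i, j, k in itertools.product(range(-n, n + 1), repeat=3):
        dist = voxel_size_mm * math.sqrt(i * i + j * j + k * k)
        if dist <= radius_mm:
            offsets.append((i, j, k))
    return offsets


def sphere_coords(peak_mm, radius_mm=5.0, voxel_size_mm=2.0):
    """Coordinates (in mm) of all grid points inside the sphere.

    radius_mm = 5.0 corresponds to the 10 mm diameter in the text;
    voxel_size_mm = 2.0 is an assumed grid spacing.
    """
    return [
        tuple(p + voxel_size_mm * o for p, o in zip(peak_mm, off))
        for off in sphere_offsets(radius_mm, voxel_size_mm)
    ]


dlpfc_peak = (52, 16, 28)  # right dlPFC peak from the interaction contrast
roi = sphere_coords(dlpfc_peak)
print(len(roi), "grid points in the ROI")
```

Averaging the GLM beta weights over the grid points in such a list would yield one parameter estimate per participant and condition, which is the kind of value entered into the ANCOVA.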

BEHAVIORAL DATA
Consistent with an earlier behavioral study, older adults as compared to younger adults collected fewer points throughout the entire task (U = 38.5, p < 0.01,

Effect of learning on BOLD signal changes during reward processing ("learned" trials vs. "search" trials)
In younger adults, during reward processing in "learned" relative to "search" trials, we found a bilateral BOLD signal increase in the VST (Table 4). Direct group comparison (interaction of learning stage × age) confirmed greater recruitment of the right VST in younger as compared to older participants (Figure 2; Table 4).

Effect of learning on BOLD signal changes during decision making ("learned" trials vs. "search" trials)
Both younger and older adults showed greater BOLD signal increases in the dlPFC when associations had been learned ("learned" trials) as compared to trials when associations had not yet been learned ("search" trials, Table 5). There was no significant signal increase in striatal regions in either of the two groups. A direct group comparison (interaction of learning stage × age), however, showed a greater signal increase in older adults in the bilateral frontal cortex, including the dlPFC, the cingulate gyrus, and the left parietal cortex (Figure 3; Table 5).
In addition, we tested whether age-related differences in the dlPFC BOLD signal in "learned" vs. "search" trials during decision making could be explained by age-related differences in other tests of executive functions (i.e., SOPT, TOL, FAS, RTMT-B, Stroop), processing speed (Digit Letter Test), or episodic memory (Ten-Word-List). An ANCOVA showed that differences in SOPT, Stroop, TOL, FAS and RTMT-B could not explain differences in dlPFC BOLD signal. After correcting for all scores of executive functions separately as well as simultaneously, the effect of age on changes in BOLD signal in dlPFC remained statistically significant (F = 4.98, p < 0.05; Table 6). Furthermore, the age effect found in dlPFC remained statistically significant after correcting for measures of processing speed (Digit Letter Test, corrected age effect: F = 11.33, p < 0.005), and episodic memory (Ten-Word-List, corrected age effect: F = 7.45, p < 0.01).

FIGURE 2 | Interaction of learning stage and age in VST in reward processing. (A) Increased activity in bilateral VST in reward processing comparing "learned" trials relative to "search" trials in younger adults (x = 8, y = 12, z = −5). In older adults no significant signal changes were observed. (B) Contrast estimates (mean, standard error) of younger and older participants for "search" and "learned" trials in VST.

DISCUSSION
In this study we investigated age-related changes in reward-based decision making and its neural correlates. At the behavioral level, older adults collected fewer points than younger adults, completed fewer blocks successfully, and needed more trials to learn the stimulus-response association. Remarkably, there was no statistically significant difference in the number of perseverative errors between the two groups, i.e., learning was slower in older adults even in the absence of persistent responding to the previously rewarded stimulus. The results suggest that there is no age-related deficit in reversing learned stimulus-response mappings but instead in learning the stimulus-reward associations, which replicates previous findings.
To study the neuronal correlates of reward-based decision making in old age, we applied an event-related design that allowed separation of age-related differences in brain activity during decision making and reward processing. Furthermore, to test the course of learning and to account for performance differences typically found between young and older adults in a pORT, we grouped trials into "search" and "learned" trials depending on a learning criterion in a post hoc analysis.
During reward processing, we found a greater signal increase in the VST in younger participants when reward associations had been learned relative to when they had not yet been learned. As is known from studies in animals and humans, the VST serves as a key structure in reward processing. It responds to predictors (Knutson et al., 2001) and the outcome of expected rewards (Breiter et al., 2001). Our data confirm the involvement of the VST in reward association learning by signaling an expected outcome under probabilistic conditions in younger adults (e.g., Heekeren et al., 2007). In contrast, we did not observe an analogous signal increase in the VST in older adults. The group × learning stage interaction revealed that while younger adults recruited the VST significantly more in those trials when reward associations had been learned, older adults showed the opposite pattern. They recruited the VST in response to a rewarding stimulus when associations were not yet learned, that is, when the participant was rewarded by chance, rather than when associations were learned, as seen in the young participants. Note that during "search" trials, the participant is rewarded by chance and should use the respective feedback information to guide further decision making. Older adults, while showing a greater signal increase in VST after reward delivery compared to the young adults, still needed longer to learn the reward association as illustrated by the behavioral difference, suggesting that an increase in VST activity might not facilitate reward association learning in old age. Our results are compatible with results of other age-comparative fMRI studies suggesting age-related differential recruitment of the VST (Samanez-Larkin et al., 2007; Schott et al., 2007; Dreher et al., 2008). In our study we found no group difference in VST signals during decision making.
This is in contrast to a recent study showing specific recruitment of the VST in young as compared to older adults (Dreher et al., 2008). Our results agree with one study reporting stronger ventral striatal activity in older participants during gain (Schott et al., 2007). Note that in contrast to the pORT used in the present study, in the MID task, associations are known prior to the start of the experiment. Therefore, only the state of anticipation and the occurrence of anticipated outcomes were investigated in those studies, but not the process of reward-based learning and decision making. Our data confirm the view of differential VST function in old age and show that the difference in VST activity is dependent on the stage of learning.

FIGURE 3 | Decision making - interaction of learning stage and age. (A) Interaction of learning stage × age in the right dlPFC (x = 52, y = 16, z = 28). (B) Contrast estimates (mean, standard error) of "learned" and "search" trials in younger and older participants in the right dlPFC.

During decision making, older adults compared to younger participants recruited dorsolateral prefrontal areas differentially depending on the stage of learning. While younger adults activated the dlPFC during the initial stage of learning reward associations, older adults activated the dlPFC when reward associations had been learned successfully. The dlPFC is thought to be involved in various aspects of executive functions, such as maintenance, updating, and monitoring of working memory contents (Petrides et al., 2002) and decision making (Heekeren et al., 2004, 2008). Note that the learning by age interaction found in dlPFC could not be explained by age-related differences found in other cognitive measures such as executive functions, processing speed or episodic memory, suggesting an age-related differential recruitment of dlPFC during decision making.
With regard to reward-based decision making, the dlPFC is assumed to determine relevant reward information to plan and execute behavior directed toward rewards (Schultz, 2000; Hornak et al., 2004; Ichihara-Takeda and Funahashi, 2008). To succeed in our task, participants have to consider the previously experienced rewards and punishments elicited by the presented letters; that is, they have to monitor their own performance and use the respective information for decision making. The interaction of learning stage × age found during decision making could reflect the fact that in the initial learning stages, younger adults successfully use the dlPFC to monitor past and current selections as well as their outcomes, which contributes to finding the maximally rewarding stimulus. In contrast, older adults allocate more resources to actively maintain and monitor performance and choice selection at the later stage of task performance when associations have already been learned.
Studies in non-human primates have shown that the rhinal cortex and the OFC play an important role in reward reversal learning (e.g., Lee et al., 2007; Watanabe and Sakagami, 2007; Phillips et al., 2008 for reviews). It should be noted that we did not observe statistically significant changes in BOLD signal in these regions during decision-making or reward processing in the pORT. Due to susceptibility artifacts, BOLD signal changes in the OFC as well as the hippocampal formation of the brain are more difficult to detect (Asano et al., 2004; Du et al., 2007).
The present results may support the view of alteration in the functional integrity of the dopaminergic system as suggested by neurocomputational models that hypothesize that age-related loss of dopamine increases neural noise, which results in less distinctive representations in the brain (Li et al., 2001). Accordingly, a less distinctive representation of behaviorally relevant feedback information in the VST may cause impoverished inputs to extrastriatal areas (for example, the dlPFC), resulting in diminished frontostriatal interaction for further decision making in old age. This view is supported by a recent study using 6-[(18)F]FluoroDOPA (FDOPA) positron emission tomography (PET) and fMRI showing a significant interaction between midbrain dopamine synthesis and reward-related lateral PFC function in young compared to older adults. Midbrain measures of FDOPA correlated positively with BOLD signals in the lateral PFC during reward processing in young, but negatively in older participants (Dreher et al., 2008).
Age-related changes in the serotonergic system, which have been shown using PET (Yamamoto et al., 2002; Moller et al., 2007), could also contribute to the observed deficit in reward association learning in older adults. Alterations in the serotonergic system have been reported to result in deficits in learning changing reward-associations, reversal learning and in the evaluation of immediate and delayed rewards (Doya, 2008 for a review; Rogers et al., 1999; Clarke et al., 2004; Tanaka et al., 2007). Thus, age-related decline of serotonin receptors and serotonin transporters could also contribute to an altered neuromodulation of cortical and subcortical regions that mediate important aspects of flexible reward association learning in old age (see Mohr et al., 2009 for a review).
In conclusion, the data support the view of altered fronto-striatal interaction during reward-based decision making in old age, which contributes to altered reward-based learning and decision making.