Activation in the VTA and Nucleus Accumbens Increases in Anticipation of Both Gains and Losses

To represent value for learning and decision making, the brain must encode information about both the motivational relevance and affective valence of anticipated outcomes. The nucleus accumbens (NAcc) and ventral tegmental area (VTA) are thought to play key roles in representing these and other aspects of valuation. Here, we manipulated the valence (i.e., monetary gain or loss) and personal relevance (i.e., self-directed or charity-directed) of anticipated outcomes within a variant of the monetary incentive delay task. We scanned young-adult participants using functional magnetic resonance imaging (fMRI), utilizing imaging parameters targeted for the NAcc and VTA. For both self-directed and charity-directed trials, activation in the NAcc and VTA increased to anticipated gains, as predicted by prior work, but also increased to anticipated losses. Moreover, the magnitude of responses in both regions was positively correlated for gains and losses, across participants, while an independent reward-sensitivity covariate predicted the relative difference between gain- and loss-related activation on self-directed trials. These results are inconsistent with the interpretation that these regions reflect anticipation of only positive-valence events. Instead, they indicate that anticipatory activation in reward-related regions largely reflects the motivational relevance of an upcoming event.


INTRODUCTION
Neural representations of anticipated reward value are core to models of the mechanisms for learning (Schultz et al., 1997; Sutton and Barto, 1998; O'Doherty et al., 2004; Seymour et al., 2004) and decision making (Montague and Berns, 2002; Bayer and Glimcher, 2005; Balleine et al., 2007; Rangel et al., 2008). These models associate predictive cues with their subsequent outcomes, in order to describe behavior. Accordingly, the subjective experience of the cue-outcome association prior to the occurrence of the outcome reflects "anticipation".
The most common functional neuroimaging paradigms for studying reward anticipation use learned cue-response-outcome contingencies (Delgado et al., 2000; Knutson et al., 2001, 2005). On each trial an initial cue indicates a potential reward (e.g., a monetary gain). Then, following a short delay, a target appears, and if participants respond sufficiently quickly and/or accurately, they receive a reward. Studies using variants of this approach have demonstrated that the ventral striatum (vSTR), particularly its nucleus accumbens (NAcc), exhibits increases in blood oxygenation level-dependent (BOLD) contrast (hereafter, "activation") to anticipated rewards (Knutson et al., 2000; Ernst et al., 2005; Adcock et al., 2006; Knutson and Gibbs, 2007; Dillon et al., 2008). Yet, despite the prevalence of this approach, several important questions about reward anticipation remain incompletely answered: How do these findings generalize to other regions within the dopaminergic system (e.g., the ventral tegmental area, VTA)? Does activation of [...] modulatory role of VTA in shaping memory, demonstrating specifically improved recall for stimuli associated with greater potential rewards. Using a combination of standard regression analyses and functional connectivity measures, they found that voxels within the anatomical location of VTA both increased in activation to larger potential rewards and exhibited functional connectivity with the hippocampus in effective memory formation.
More recently, D'Ardenne et al. (2008) described VTA responses to the experience of primary and secondary rewards, as a functional neuroimaging analog of the prediction error signals previously reported in single-unit recordings (Ljungberg et al., 1992; Schultz et al., 1997; Bayer and Glimcher, 2005). They found that VTA activation increased to unexpected rewards, both primary (liquid) and secondary (money), consistent with single-unit studies showing that its neurons convey a positive reward prediction error. Of note, D'Ardenne et al. found no significant changes in VTA activation to the omission of an expected liquid reward nor to an unexpected monetary loss, as would be expected if that region also signaled negative reward prediction errors. Where imaging volumes have allowed, some prior studies have reported qualitatively similar results in both NAcc and midbrain (Knutson et al., 2005) and NAcc and VTA (Moll et al., 2006), although a systematic comparison is needed.

WHAT DOES NEURAL ACTIVITY DURING REWARD ANTICIPATION REPRESENT?
Understanding how the brain encodes, represents, and manipulates signals that indicate potential and experienced rewards has been an area of considerable basic (Montague and Berns, 2002; Bayer and Glimcher, 2005; Phillips et al., 2007; Delgado et al., 2008a,b; Knutson and Greer, 2008) and clinical research (Kilts et al., 2001; Grusser et al., 2004; Kienast and Heinz, 2006; Bjork et al., 2008; Knutson et al., 2008a; Scott et al., 2008; Strohle et al., 2008; Pizzagalli et al., 2009). The common thread in this extensive literature is that the neural representation of reward does not reflect any simple unitary construct. In particular, there has been an ongoing debate about whether and how the brain represents two different aspects of reward. The first aspect is the absolute value of the outcome (i.e., important vs. unimportant outcomes), referred to as energization (Elliot, 2006), salience (Zink et al., 2003), incentive salience (Berridge et al., 2009), and magnitude (Knutson et al., 2001). A second aspect differentiates positive from negative outcomes; this aspect has been described in terms of affect (Knutson and Greer, 2008), valence, and approach/avoidance (Elliot, 2006). In the current paper, we will refer to these two aspects, which we intended to manipulate separately, as motivation and affective valence.
In the influential framework advanced by Berridge and colleagues, there are functional and neural dissociations between the valenced and non-valenced aspects of reward (Berridge, 2004; Berridge et al., 2009). Specifically, these authors contend that the response of dopaminergic neurons in the VTA and NAcc reflects a motivational signal associated with information about future rewards (i.e., "wanting" the reward). In contrast, other neurotransmitters (e.g., opioids) affect the valence component of reward [i.e., "liking" the reward (Wise, 1980)]; they make pleasurable stimuli more pleasurable and aversive experiences less aversive (Pecina and Berridge, 2005). These potentially dissociable concepts (motivational significance and affective valence) recur in functional neuroimaging studies of reward anticipation and experience, although some reports discuss activation in these brain regions from the perspective of approach/avoidance behavior (Elliot, 2006), others invoke changes in affect evoked by rewards (Knutson and Greer, 2008), and still others consider responses in these regions as markers of prediction error [both valenced and non-valenced (O'Doherty et al., 2004; Seymour et al., 2007)].
Despite this ongoing debate, motivation and affective valence can be difficult to tease apart experimentally. Rewards in neuroeconomic research are commonly monetary gains implemented in paradigms where they have both motivational significance and affective valence. The resulting activation in NAcc, VTA, or other reward-related regions may thus be attributed to either motivation or valence. Some reports indicate that stimuli of similar motivational significance but different valence (e.g., monetary gains and losses) evoke similar activation in reward-related regions. For example, Cooper and Knutson (2008) show that when an outcome is uncertain, activation in the NAcc increases for both gain and loss anticipation. Other studies have suggested that activation in some components of the reward system does indeed depend on valence, whether because of distinct spatial loci evoked by positive and negative stimuli (Seymour et al., 2007) or because of decreases in activation to negative events (Breiter et al., 2001). Tom et al. (2007) tracked parametric effects of gain and loss magnitudes in a loss-aversion paradigm, and found that activation in regions including the vSTR increased with magnitude for decisions about potential gains and decreased with magnitude for potential losses. Based on these and other conflicts in the literature, how motivation and affective valence information interact within the multiple regions that constitute the reward system remains unknown.

ARE THE NEURAL SUBSTRATES OF OUTCOME ANTICIPATION SIMILAR WHEN PLAYING FOR SELF AND OTHERS?
Finally, there exists considerable evidence that anticipatory activation, at least in the NAcc, generalizes across a wide range of rewards. Most neuroimaging studies of reward have used monetary outcomes, typically repeated opportunities to gain or lose about a dollar (Knutson et al., 2001; Daw, 2007). Yet, similar patterns of NAcc activation can be evoked using fluid rewards (Valentin et al., 2007), food items (Hare et al., 2008), valuable consumer goods (Knutson et al., 2008b), social cooperation (Rilling et al., 2002), and even the opportunity to punish others (Singer et al., 2006). Recent studies have related the increases in NAcc activation preceding a decision to the value of rewards earned for others (Moll et al., 2006; Harbaugh et al., 2007). Based on these studies, one natural conclusion is that any anticipated reward, even one with reduced personal relevance (and thus motivational salience), would evoke activation in multiple regions within the reward system (e.g., NAcc and VTA). While plausible, this conjecture has not yet been demonstrated.

OVERVIEW OF THE CURRENT EXPERIMENT
In the current study, we manipulated the valence (i.e., gain vs. loss) and motivational relevance (i.e., oneself vs. charity as beneficiary) of anticipated rewards, using an incentive-compatible response-time game modeled on common paradigms in the literature (Knutson et al., 2000, 2001). In these paradigms, the trial cue is the earliest possible predictor of the potential gain or loss, and thus initiates anticipation. We focus on reward anticipation, rather than reward outcome, because the motivational and affective explanations for reward-system activation make clear and opposing predictions. If motivational influences alone drive activation during anticipation, and if manipulating the beneficiary of the reward changes the motivational salience (Mobbs et al., 2009), then gain- and loss-related activation should be positively correlated across individuals, with greater responses observed to self- compared to other-directed outcomes. Conversely, if affective valence alone determines anticipatory activation, activation should be greatest when playing for gains and least when playing to avoid losses (relative to neutral outcomes), but with no differences between Self and Charity treatments. Moreover, by assessing participants' reward sensitivity and other-regarding preferences, we obtained independent predictors of individual differences in the neural responses to each reward type.
This paradigm can also test predictions of temporal difference (TD) models of anticipatory association. According to common TD models (Sutton and Barto, 1990), a well-learned reward cue should evoke activation that reflects the value of the expected outcome. This prediction error signal can be described in terms of the value of the associated outcome (i.e., valenced) or the association value (i.e., the strength of the prediction), as discussed further below. In the case where prediction error is valenced, a pattern similar to the valence interpretation of anticipation would be expected: positive for gains, negative for losses. In the case where prediction error mirrors the strength of the association, a result similar to the motivational salience signal model would be expected: positive for both gains and losses, with neutral cues producing the least activation. Importantly, both prediction error accounts would dictate identical results in charity and self conditions, unless the predictive system also represented the motivational significance of the cues.

PARTICIPANTS
Twenty young adults (mean age 24 years; range 19-29 years; 10 females) participated in this study. Two were excluded because of misalignments in acquisition coverage, and one was excluded due to a Beck Depression Inventory (BDI) score indicating depression, leaving 17 participants in the reported data. All participants provided informed consent under a protocol approved by the Duke Medical Center Internal Review Board.

EXPERIMENTAL PROCEDURE
The experimental session comprised initial selection of a charity, task training outside the scanner, an fMRI session using a reward anticipation task, and completion of questionnaires to assess reward attitudes.
Following informed consent, subjects read descriptions of four non-profit organizations (Easter Seals, Durham Literacy Center, Animal Protection Society, and the American Red Cross) and then selected one as their charitable target. They were then provided full information about the task structure and payment contingencies (see below for task details), and were told that no deception was used in the experiment. All participants reported that they understood the task procedures and that they believed that their earnings for charity would go to the selected target. Before entering the scanner, they completed one practice run of the task using only gain trials. We separated gain trials and loss trials into different runs, to minimize cue conflict. Then, the participants were taken to the scanner for the MRI session. During acquisition of initial structural images, each participant completed a second practice run (using only loss trials). Participants then completed four 7-min task runs during collection of fMRI data. The first run always involved monetary gains, so that subjects built up balances within cumulative banks, and the second run always involved monetary losses. The last two runs consisted of one gain run and one loss run, with their order randomly determined.
Each run consisted of 50 trials (Figure 1), evenly split between five conditions according to potential outcome: Self $4, Charity $4, Self $0, Charity $0, and Neutral Control $0. Every trial began with a 500-ms cue whose composition indicated the target (picture), monetary amount at stake [background color: red (Self) or blue (Charity) for $4, yellow for $0 control conditions], and valence (gain: square frame, loss: circular frame). Following a variable delay of between 4 and 4.5 s, a target appeared on the screen. The subject's task was to respond by pressing a button with the index finger of the right hand, before the target disappeared. Within gain runs, responses that were sufficiently fast added $4 to the subject's or charity's bank (visually indicated by a coin), and responses that were longer than the current threshold had no financial consequences (visually indicated by a '0'). Within loss runs, responses that were sufficiently fast resulted in no financial consequences (visually indicated by a '0'), whereas responses that were longer than the current threshold subtracted $4 from the subject's or charity's bank (visually indicated by a red circle with a diagonal line). The presentation time of the target was determined by an adaptive algorithm; using information about response times on previous similar trials, the algorithm estimated the response time threshold at which the subject would be successful on approximately 65% of trials. We emphasize that independent thresholds were used for each trial type.
At the end of all runs, the participants exited the scanner and completed a series of behavioral questionnaires (see below). Participants were paid a base sum of $15. In addition, cumulative bank totals were calculated for both the participant (M $22.35, SD $11.75) and charity (M $22.59, SD $6.62), and participants were paid the full amount of their bank in cash (participants were guaranteed a minimum of $40 for participation). Following completion of data collection from all subjects, the researchers paid the cumulative earnings to each charity.
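The text does not specify how the adaptive algorithm estimated the threshold, so the following is only a minimal sketch of one way such a rule could work: it takes the response times from earlier trials of the same type and returns the cutoff that roughly 65% of them would have beaten. The function and class names, the 0.30 s starting threshold, and the history window are all hypothetical choices made for illustration, not details from the study.

```python
import math


def estimate_threshold(previous_rts, target_hit_rate=0.65):
    """Return the RT cutoff (s) that at least `target_hit_rate` of the
    supplied earlier responses would have met (an empirical quantile)."""
    if not previous_rts:
        raise ValueError("need at least one earlier response time")
    ordered = sorted(previous_rts)
    # Smallest rank k such that k / n >= target_hit_rate.
    k = math.ceil(target_hit_rate * len(ordered))
    return ordered[max(k, 1) - 1]


class AdaptiveThresholds:
    """Keeps an independent RT history, and so an independent threshold,
    for every trial type (e.g., Self $4, Charity $4, Self $0)."""

    def __init__(self, initial_threshold=0.30, window=20):
        self.initial_threshold = initial_threshold  # assumed starting value
        self.window = window                        # assumed history length
        self.history = {}

    def threshold(self, trial_type):
        rts = self.history.get(trial_type, [])
        if not rts:
            return self.initial_threshold
        return estimate_threshold(rts[-self.window:])

    def record(self, trial_type, rt):
        """Store the RT; return True if it beat the threshold then in force."""
        cutoff = self.threshold(trial_type)
        self.history.setdefault(trial_type, []).append(rt)
        return rt <= cutoff
```

Because each trial type has its own history, a participant who responds faster on Self $4 trials than on $0 trials would face a stricter cutoff on the former, which keeps the expected hit rate comparable across conditions.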

BEHAVIORAL QUESTIONNAIRES
After completing the experiment, participants were asked to fill out a series of psychological questionnaires. These included: the BDI (a screening tool for depression) (Beck et al., 1961); Behavioral Inhibition System/Behavioral Activation System (BIS/BAS, an index of approach and avoidance tendencies) (Carver and White, 1994); Interpersonal Reactivity Index (IRI, an assessment of other-regarding behavior) (Davis, 1983); Personal Altruism Level (PAL, a questionnaire using indices of other-regarding personal efforts) (Tankersley et al., 2007); Self Report Altruism Scale (SRAS, an index of other-regarding preferences) (Rushton et al., 1981); and Temporal Experience of Pleasure Scale (TEPS, an index of reward experience and anticipation) (Gard et al., 2006). By taking the average of Z-score-transformed subscales from these measures, we constructed three individual-difference covariates. We defined the covariates based on a priori relations between the above scales: a personal reward-sensitivity covariate (BAS and TEPS, combined); an other-regarding preference covariate (PAL, IRI, and SRAS); and a behavioral inhibition covariate (BIS and BDI). A factor analysis presented by Pulos et al. (2004) suggests that the personal-distress subscale of the IRI, included in our other-regarding preference covariate, may differ from the other-regarding trait targeted by the rest of the included subscales. Therefore, as a control test, we also evaluated a more limited empathy covariate that eliminated the personal distress subscale from the IRI.

fMRI ACQUISITION AND PREPROCESSING
At the beginning of the scanning session, we collected initial localizer images to identify the participant's head position within the scanner, followed by IR-SPGR high-resolution whole-volume T1-weighted images to aid in normalization and registration (voxel size: 1 mm × 1 mm × 1 mm). We also collected 17 slice IR-SPGR images, coplanar with the BOLD contrast images described below, for use in registration and normalization.
We acquired BOLD contrast images using a standard echo-planar sequence on a 3T GE Signa MRI scanner. Each of the four runs comprised 416 volumes (TR: 1 s; TE: 27 ms; Flip angle: 77°; voxel size: 3.75 mm × 3.75 mm × 3.8 mm) of 17 axial slices positioned to provide coverage of the midbrain and striatum (Figure 2). A TR of 1 s, and consequently a smaller acquisition volume, was chosen to increase the sampling rate in our ROIs (NAcc and VTA). We note that the GE Signa EPI sequence automatically passes images through a Fermi filter with a transition width of 10 mm and radius of half the matrix size, which resulted in an effective smoothing kernel of approximately 4.8 mm³. Thus, we did not include additional smoothing as part of our preprocessing protocol. Following reorientation, raw BOLD images were skull stripped using FSL's BET, corrected for intervolume head motion using MCFLIRT (Jenkinson et al., 2002), intensity normalized by a single multiplicative factor, and subjected to a high-pass temporal filter (Gaussian-weighted least-squares straight line fitting, with sigma = 50.0 s). Registration to high-resolution structural and standard-space images was carried out using FLIRT (Jenkinson and Smith, 2001; Jenkinson et al., 2002). All coordinates are reported in MNI space.

FIGURE 1 | Participants performed a monetary incentive reaction time task.
An initial cue marked the start of the trial and indicated whether money was at stake and, if so, who would receive it. Each trial offered either $4 or $0, for the participant (Self), a charity (Charity), or no one. Gain and loss outcomes occurred in separate runs, to minimize cue conflict. After a variable wait (4-4.5 s) a response target appeared indicating that participants were to press a button using their right index finger as quickly as possible. The trial was scored as a hit if the participant responded in time or as a miss if they did not. Changes to the bank as a result of that trial were then displayed for 0.5 s. In gain runs on $4 trials, if the subject responded to the target in time they won $4 for themselves or a charity; if they missed the trial there was no change to that bank. During loss runs on $4 trials, if the subject responded to the target in time there was no change to that bank; if they responded too slowly, they lost $4 for either themselves or their charity. Control trials resulted in no change to the bank but participants were asked to respond as quickly as possible. Reaction time thresholds for hits and misses were set using an adaptive algorithm to allow the subject to win approximately 65% of the time. Thresholds were set independently for each trial type.

fMRI ANALYSIS: GENERAL LINEAR MODEL
All fMRI analyses were carried out using FEAT (FMRI Expert Analysis Tool) Version 5.92, part of FSL (FMRIB's Software Library, www.fmrib.ox.ac.uk/fsl). Time-series statistical analyses used FILM with local autocorrelation correction (Woolrich et al., 2001). Our first-level (i.e., within-run) analysis model included five regressors for the anticipation period with two regressors (gain and loss) for the outcome period of each trial type. The anticipation period was modeled as a unit-amplitude response with 1 s duration following the disappearance of the trial indicator cue. The outcome period was modeled as a unit-amplitude response with 1 s duration following the onset of feedback. Trial timing and numbers are noted in the task description above. Self $4 trials were contrasted against Self $0 trials (and Charity $4 against Charity $0) to examine anticipation of gain and loss. The Neutral Control $0 trials were modeled but not analyzed. Second-level (i.e., across-run, but within-subject) analyses used a fixed-effects model, while third-level (i.e., across-subjects) mixed-effects analyses (FLAME 1) included the main effects of each regressor from the lower level analysis, along with three covariates: reward sensitivity, empathy (other-regarding preference), and inhibition. Whole-brain analyses used a voxel significance threshold of z > 2.3 and a cluster-significance threshold of p < 0.05, fully corrected for all voxels in our imaging volume (Worsley, 2001). Because clustering algorithms do not easily differentiate large areas of activation, Tables 1-4 report the top ten peak voxels present using the elevated threshold indicated in each table.
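The three covariates entering the third-level model were described in the questionnaire section as averages of Z-score-transformed subscales. A minimal sketch of that construction is given here for concreteness. The function names and the example scores are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def zscore(values):
    """Z-transform one subscale across participants (sample SD assumed)."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("subscale has no variance across participants")
    return [(v - m) / s for v in values]


def composite_covariate(subscales):
    """Average Z-scored subscales into one value per participant.

    `subscales` maps a subscale name to a list of raw scores, with
    participants in the same order in every list.
    """
    lengths = {len(scores) for scores in subscales.values()}
    if len(lengths) != 1:
        raise ValueError("every subscale needs one score per participant")
    z_by_scale = [zscore(scores) for scores in subscales.values()]
    # zip(*...) regroups the values so each tuple belongs to one participant.
    return [mean(per_participant) for per_participant in zip(*z_by_scale)]


# Hypothetical raw scores for three participants (not data from the study).
reward_sensitivity = composite_covariate({
    "BAS": [38, 42, 46],
    "TEPS": [60, 75, 90],
})
```

Z-scoring before averaging puts scales with different ranges on a common footing, so that no single questionnaire dominates the composite.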

fMRI ANALYSIS: REGIONS OF INTEREST
Our primary analyses used two anatomically defined ROIs: NAcc and VTA. Hand-drawn anatomical ROIs were identified based on the average of all participants' normalized high-resolution anatomical images. The NAcc ROIs were drawn in each hemisphere according to Breiter et al. (1997). The VTA ROI was drawn by isolating the region medial and anterior to the substantia nigra, following the work of Adcock et al. (2006). Only ROI voxels that fell within the group coverage area were included in the analysis.

Anticipating gain and loss for self
All analyses reported in this manuscript use regressors associated with reward anticipation (i.e., time-locked to the disappearance of the initial reward cue). We first contrasted parameter estimates between trials that offered the chance to make $4 and trials where no money was at stake (Self-Gain $4 > Self-Gain $0). Activation associated with anticipated monetary gains was widely distributed throughout the imaged volume (Table 1), with peaks in the dorsal striatum and vSTR, bilateral operculum/insula (Figure 3A, top), midbrain (Figure 3A, bottom), mediodorsal thalamus, medial prefrontal, medial orbitofrontal, anterior pole, and visual cortex. These results replicate those found in previous studies of gain anticipation (Knutson et al., 2001; Knutson and Greer, 2008). Next, we conducted a similar analysis for anticipated monetary losses, by contrasting trials that offered the chance to avoid losing $4 and trials where no money was at stake (Self-Loss $4 vs. Self-Loss $0). Activations in this loss-anticipation contrast (Table 2) were distributed similarly to the gain condition. Peaks of activation were also similar to those noted under the gain condition, including in the dorsal striatum and vSTR, bilateral operculum/insula (Figure 3B, top), midbrain (shown in Figure 3B, bottom), mediodorsal thalamus, and orbitofrontal and visual cortex.
The direct contrast between gain and loss anticipation (Self-Gain $4 > Self-Loss $4) identified only one cluster along the inferior parietal sulcus (Z = 3.2; max: 32, −82, 20), and no differential activation overlapping our ROIs or in other regions implicated in reward anticipation by prior literature. No significant clusters of activation were identified in the reverse contrast (Self-Loss $4 > Self-Gain $4). Moreover, no clusters exhibited significantly decreased activation during either self-directed gain or loss trials compared to control trials (i.e., Self-Gain $0 > Self-Gain $4, or Self-Loss $0 > Self-Loss $4).

Anticipating gain and loss for charity
We repeated all of the analyses from the previous section for trials that offered the chance to gain or lose money for the selected charity. Anticipating potential gains and losses for a charity evoked activation in regions within the dorsal striatum and vSTR, midbrain, thalamus, prefrontal cortex, bilateral insula, and visual cortex. Note that there was a very good match between the peak loci of activation for self-directed and charity-directed rewards (Tables 3 and 4). Direct contrasts of trials involving potential gains and potential losses (Charity-Gain $4 > Charity-Loss $4, or Charity-Loss $4 > Charity-Gain $4) revealed no clusters of activation that survived whole-volume correction.

Playing for Self vs. playing for charity
We next identified regions that exhibited significant differences in activation depending on whether participants were anticipating playing for themselves or for their charity. The direct contrast of self-directed gains greater than charity-directed gains (Self-Gain $4 > Charity-Gain $4) identified activations similar to those found for self-gains (i.e., Self-Gain $4 > Self-Gain $0); i.e., within reward-related regions like the NAcc and VTA. Activation in these regions was greatest to self-directed rewards, intermediate to charity-directed rewards, and least on trials where no reward could be obtained. Additional regions whose activation increased to self-directed gains (Table 5) included the prefrontal cortex, temporal-parietal-occipital junction (TPO), and posterior insula/inferior parietal lobule (IPL). Likewise, the direct contrast of self-directed losses greater than charity-directed losses (Self-Loss $4 > Charity-Loss $4, Table 6) evoked activation in reward-related regions, along with additional clusters in the TPO and IPL.

Anticipatory activations in the VTA and NAcc are similar for self and charity
We defined ROIs in the VTA and NAcc, collapsed across hemispheres (see Section "Materials and Methods" for details). For each subject, we calculated parameter estimates for each ROI and reward type within a two-factor (beneficiary: Self vs. Charity; valence: gain vs. loss) repeated measures ANOVA. Note that for each trial type, we subtracted the mean activation associated with the matched $0-reward trial (e.g., Self-Gain $4 minus Self-Gain $0), to control for non-task-related processing (e.g., cue perception). We found that both VTA and NAcc showed greater activation to self-directed rewards compared to charity-directed rewards [VTA:  Figure 4]. We also note that we found no significant differences in these regions between mean signal changes in the $0 conditions, indicating that these effects are contingent upon the presence of anticipated reward.

FIGURE 3 | Whole-brain analysis reveals similar patterns of activation during anticipation of gains and losses, whether participants played for self or a charity. Activated regions were larger and more significant in the Self conditions. Activation peaks were present in the NAcc and VTA in all four treatments (i.e., anticipating gain, anticipating loss, playing for self, playing for a charity). ROIs for bilateral NAcc (A and B, top)

To assess the localization of each ROI and test for potential spatial inhomogeneity, we also restricted our analyses to the single voxel with the highest Z-score (i.e., most significant) to self-directed gains within the NAcc (MNI coordinates: 12, 6, −6) and VTA (MNI coordinates: 4, −16, −10) ROIs. Results of these ANOVAs are consistent with the results of the whole ROI ANOVAs in both the VTA and NAcc with two exceptions. First, in the NAcc, the three-way interaction present in the complete NAcc ROI (Gain vs. Loss × Reward Sensitivity) was non-significant for the peak voxel alone [F(1,13) = 2.38, p = 0.15]. Second, in the VTA, the peak reward-anticipation-sensitive voxel showed a significant main effect of valence [Gain vs. Loss: F(1,13) = 8.30, p < 0.05], an effect only significant at the trend level in the analysis of the complete VTA ROI. Note that this increase in significance may simply reflect a selection bias, given that this voxel was selected for its robust responses in the self-gain condition. As further confirmation of a motivational salience signal, significant increases in activation to self-directed losses, to charity-directed gains, and to charity-directed losses were also present in the whole-brain analysis at this voxel. In addition, there were no voxels in the VTA or NAcc that showed negative activity on loss trials (with respect to the $0 condition) across participants.
Recent work by Matsumoto and Hikosaka (2009) indicates there are two varieties of dopaminergic neurons in the VTA, one population that responds to positive conditions and one population that responds to both positive and negative conditions. With this in mind we also interrogated voxels in the VTA (MNI coordinates: −10, −16, −12) and NAcc (MNI coordinates: −12, 10, −6) that showed the peak activation increase (i.e., greatest Z-score) during anticipation on loss trials. The NAcc loss peak results were consistent with those of the complete ROI and gain peak in that they showed a positive average response in all conditions, a main effect of beneficiary, a trend toward an effect of participant reward sensitivity, and a significant three-way interaction of Self vs. Charity × Gain vs. Loss × Reward Sensitivity. Consistent with Matsumoto and Hikosaka, the peak loss voxel in the VTA differed from the peak gain voxel in that it exhibited no significant main effect of valence [F(1,13) = 0.713, p = 0.41]. We caution that these analyses do not directly test spatial inhomogeneity effects and that such results may be attributable to selection bias because, although our initial definition of ROIs was independent, definitions of the peak voxels were based on non-independent tests.

Individual gain and loss anticipation traits in the NAcc and VTA
In the current study, a main effect of affective valence would manifest in increased activation for anticipation of gains and decreased activation for anticipation of losses (or vice versa). Conversely, a main effect of motivation would lead to increased activation for anticipation of both gains and losses, compared to trials without the possibility of reward. As described above, our whole-volume analyses provided no suggestions of opposite responses for gains and losses within reward-related regions; to the contrary, we found that gains and losses each evoked significant increases in activation within the NAcc and VTA, among other regions. We repeated these analyses for our anatomically defined ROIs and found a similar result: increased activation for both gain and loss trials, with greater activation for self-directed compared to charity-directed trials. Thus, we found no evidence for group-level main effects of valence in our target regions.

The peaks listed are only significantly active when playing for self.

FIGURE 4 | Percent signal change in the NAcc and VTA for $4 vs. $0 trials.
Mean activations, relative to $0 conditions, were positive for all trial types. Activations were larger in the Self than Charity treatment condition, reflecting reliable differences on both gain and loss runs. A trend for a main effect of valence was present in the VTA but not the NAcc. Valence effects that are modulated by the reward sensitivity of the participant were present in both regions. We found no significant differences between $0 conditions. Error bars are ± standard error of the mean.
We next investigated whether there were any across-subjects relationships between the magnitude of the responses to gain and loss trials. If there were a negative correlation across individuals between activations to gain and loss trials, even though the mean activation for both types of trials was positive, then that would be strong evidence that both motivation and affective valence modulate activation in reward-related regions. Alternatively, a positive correlation between activations to gain and loss trials would provide evidence in favor of a motivational explanation, alone. Our results support the motivation explanation. In the NAcc, activations during gain anticipation scaled positively with loss anticipation (Figure 5), with a significant correlation on self-directed trials (r = 0.64) and a non-significant but numerically positive correlation on charity-directed trials. In the VTA (Figure 6), activations during gain and loss anticipation were positively correlated for both self-directed (r = 0.58) and charity-directed (r = 0.63) trials.
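The inferential logic of this across-subjects test can be sketched in a few lines of Python. This is an illustration only, not the analysis code used in the study: the function names, the labels, and the per-participant values below are hypothetical, and the sketch ignores significance testing.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two equal-length 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def classify_pattern(gain, loss):
    """Map group means and the gain-loss correlation onto the two accounts.

    gain, loss: per-participant percent signal change relative to $0 trials.
    Returns (r, label). The labels are illustrative shorthand only.
    """
    r = pearson_r(gain, loss)
    both_positive = np.mean(gain) > 0 and np.mean(loss) > 0
    if both_positive and r > 0:
        return r, "motivation only"
    if both_positive and r < 0:
        return r, "motivation plus valence"
    return r, "other"


# Hypothetical values for five participants (not data from the study).
gain = np.array([0.10, 0.20, 0.30, 0.40, 0.50])
loss = np.array([0.15, 0.18, 0.35, 0.33, 0.52])

r, label = classify_pattern(gain, loss)
print(f"r = {r:.2f}, pattern: {label}")
```

In this toy example, participants with larger gain responses also have larger loss responses, so the pattern falls under the motivation-only reading; reversing the order of one vector would yield a negative correlation with positive means, the pattern that would instead implicate valence as well.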
We next used a hierarchical regression analysis to evaluate whether the neural bias toward gains, compared to losses, was predicted by our reward sensitivity covariate. We found that there were strong correlations between reward sensitivity and the differential activation between gains and losses (e.g., Self-Gain $4 minus Self-Loss $4) in both the NAcc and VTA (Figures 5A and 6A). Individuals who had the greatest reward sensitivity exhibited the greatest relative increment in activation to gains compared to losses. (We note that this is a fully independent correlation, in that we are using an independent behavioral test, an anatomical ROI, and the residual activation following a contrast of conditions.) This effect was significant for self-directed trials in both NAcc and VTA, but not for charity-directed trials in either ROI. We conducted similar analyses using covariates for other-regarding preferences and behavioral inhibition, and found no significant effects. Based on these results, we conducted a post hoc test looking at the relationship between our reward sensitivity covariate and activation to each trial type (as opposed to the difference between trial types described above). We found that, within our sample, the NAcc and VTA responses to self-directed gains were largely similar regardless of reward sensitivity, but that high reward-sensitivity scores correlated with a relative decrease in activation on the other trial types (Figure 7; see also colored circles on the upper right panels of Figures 5 and 6).
FIGURE 5 | BOLD responses in the NAcc during gain and loss anticipation are positively correlated when participants play for themselves.
Top: Average percent signal change differences (paid-control) for anticipation of gain and loss trials for the Self (left) and Charity (right) treatments. Each point is colored according to the participant's relative reward sensitivity index (z transformed). Bottom: Individual Gain vs. Loss signal change differences are plotted against the participant's reward sensitivity index (z transformed). Each plot includes the orthogonal distance regression best fit line, as well as the correlation coefficient (r) and the p-value of that correlation (p). Only the regression in the Self condition was significant.
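For readers unfamiliar with orthogonal distance regression, the following Python sketch shows one common way to obtain such a line: the first principal axis of the centered data, which minimizes perpendicular rather than vertical distances. It assumes equal error variance in both variables and is not necessarily the implementation used to produce the figures; the function name `odr_line` and the example values are hypothetical.

```python
import numpy as np


def odr_line(x, y):
    """Fit y = slope * x + intercept by minimizing perpendicular distances.

    Uses the first right-singular vector of the centered data (total least
    squares with equal error variance on both axes).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    centered = np.column_stack([x - x.mean(), y - y.mean()])
    # Rows of vt are principal directions; vt[0] is the best-fit direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dx, dy = vt[0]
    if np.isclose(dx, 0.0):
        raise ValueError("best-fit line is vertical; slope is undefined")
    slope = dy / dx
    # The fitted line always passes through the centroid of the data.
    intercept = y.mean() - slope * x.mean()
    return float(slope), float(intercept)


# Hypothetical points lying exactly on y = 2x + 1 (not data from the study).
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = 2.0 * x_demo + 1.0
slope, intercept = odr_line(x_demo, y_demo)
print(slope, intercept)
```

Unlike ordinary least squares, this fit treats both axes symmetrically, which is a natural choice when both variables are measured with error, as when gain and loss responses are plotted against each other.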

DISCUSSION
We examined brain activation during the anticipation of monetary rewards that varied in their valence (i.e., gain vs. loss) and beneficiary (i.e., self-directed vs. charity-directed). We found that activation in putatively reward-related regions, specifically the NAcc and VTA, increased during both gain and loss anticipation, with greater responses to self-directed than charity-directed trials. Moreover, there was a strong positive correlation between these responses across individuals, such that those individuals with the greatest anticipatory response to potential gains also had the greatest response to potential losses. Together, these results indicate that anticipatory activation reflects the motivational properties of the potential reward, not its valence. However, we found evidence, using an independent behavioral covariate, that individual differences in reward sensitivity modulated the relative response to gains and losses, with more reward-sensitive individuals exhibiting relatively more activation to gains compared to losses. Below, we consider the implications of each of these results.

REWARD ANTICIPATION: MOTIVATION VS. VALENCE
In group analyses, we found no evidence that anticipatory activations in either VTA or NAcc reflect a univariate value signal that scales according to both the valence and magnitude of the potential reward (i.e., gain > neutral > loss). Both potential gains and potential losses evoked increased activation compared to control stimuli in the NAcc and VTA, as shown within a whole-volume analysis, an anatomical ROI analysis, and in an analysis restricted to the most-active voxel in each region. And, as even stronger evidence that anticipatory activations reflect motivational salience, we found that activations associated with gains and with losses were positively correlated across participants. These results lie in contrast to some previous studies that have shown increased NAcc activation to anticipated gains, compared to anticipated losses (Knutson et al., 2001), or have failed to find increased activation to anticipated losses compared to a neutral control condition (Knutson et al., 2003, 2008a). For example, a study of loss aversion by Tom et al. (2007) showed that activation in the vSTR to decisions about mixed gambles (i.e., that involved a potential gain and a potential loss) increased with increased size of gain, but decreased with increasing size of loss. This result was interpreted as reflecting the response of a single reward mechanism that codes for both gains and losses along a single axis of reward value. We note that gains and losses were always paired in the design of Tom et al. (2007), such that the magnitude of the loss attenuated the overall value (i.e., magnitude) of the gamble. Within our study, in contrast, the potential losses were presented in isolation and thus reflected an independent and negative potential outcome, allowing a differentiation between magnitude and valence.
FIGURE 6 | BOLD responses in the VTA during the anticipation of gain and loss are positively correlated whether participants play for themselves or for a charity.
Top: Average percent signal change differences (paid-control) for anticipation of gain and loss trials for the Self (left) and Charity (right) treatments. Each point is colored according to the participant's relative reward sensitivity index (z transformed). Gain and loss response differences are positively correlated in both the Self and Charity conditions. Bottom: Individual Gain vs. Loss signal change differences are plotted against the participant's reward sensitivity index (z transformed). Each plot includes the orthogonal distance regression best fit line, as well as the correlation coefficient (r) and the p-value of that correlation (p). Only the regression in the Self condition was significant.
Prior research has suggested that under certain conditions NAcc activation reflects task factors other than value of a potential reward. Activation in the NAcc has been reported to correlate with both the salience of the stimulus presented (Zink et al., 2003) as well as the unpredictability of the potential outcome (Berns et al., 2001). It could be argued that salience or risk are inherently rewarding. However, there is also evidence that NAcc responses positively correlate with aversive stimuli (Delgado et al., 2004; Jensen et al., 2007; Salamone et al., 2007; Levita et al., 2009), as well. In the current study, although activations to gain and loss anticipation both exceeded those present during all control conditions ($0), we found evidence that reward valence modulates the amplitude of this activation in the NAcc and VTA. In the VTA we found valence modulations related to reward sensitivity. In the NAcc modulation of valence was dependent on both the beneficiary of the reward and the reward sensitivity of the participant. One reasonable possibility is that the VTA and NAcc are primarily sensitive to aspects of motivational salience but that those responses are modulated by affective valence (Cooper and Knutson, 2008), especially in those participants who are most reward sensitive. The relative strength of affective valence modulation would then likely be dependent upon task context. This mixed signal may also reflect spatial inhomogeneity within the VTA and NAcc, as discussed below.
A striking result came from the imperfect matching between the neurometric responses to potential gains and to potential losses. While the gain:loss ratio across the entire subject sample was approximately 1:1, some subjects showed a relatively increased response to gains, while others showed a relatively increased response to losses. This residual variation turned out to be systematically related to participants' reward-sensitivity scores. This behavior-brain correlation could reflect a contribution of some subcomponent of these reward-related regions, or the influence of another region that itself was sensitive to affective valence. An important direction for future research will be identifying the pattern of functional connectivity across regions that predicts both trial-to-trial effects of cue value and across-subjects factors that bias those value signals.

THE ROLE OF THE VTA IN REWARD ANTICIPATION
Most prior neuroimaging studies of reward processing have focused on the vSTR, specifically the NAcc, which has been reliably reported to exhibit increased activation during anticipation. Much less evidence exists for the modulatory effects of anticipation in the VTA, the primary dopaminergic input to the NAcc (Swanson, 1982; Ikemoto, 2007). Prior research on VTA function, mostly using single-unit recording in non-human primates, has implicated that region in the processing of rewards, generally, and in transient responses to changing reward expectations (Ljungberg et al., 1992). Based on data showing that VTA neurons respond to both unexpected primary rewards and cues that predict future rewards, it has been theorized that these neurons code a reward prediction error, critical for TD learning (Schultz et al., 1997). It would be difficult to account for our results using prediction error signals that treat gains and losses as a single continuum. Because we separated our gain and loss cues into separate blocks, and used two types of rewards, a single continuum prediction error model would predict that we should observe the greatest anticipatory responses to Self-Gain cues, smallest (or most negative) responses to Self-Loss cues, responses in the same directions, but possibly attenuated, to both types of Charity cues, and minimal responses to the non-rewarded control cues. In contrast, we found very similar activation, both in spatial pattern and amplitude, for anticipated gains and anticipated losses, with both gains and losses greater than control cues or charitable cues. Alternatively, the opportunity to avoid losses might be seen as rewarding. In fact, there is evidence that relief from pain (Seymour et al., 2005) and even avoiding potential negative outcomes (Kim et al., 2006) can be viewed as rewards. However, this kind of "pure valence" explanation is inconsistent with the observation that activations on loss trials were still greater than neutral ($0) trials.
We emphasize that these results are not necessarily incompatible with the numerous prior demonstrations that prediction errors modulate the responses of VTA neurons, for three reasons. First, as is proposed by Seymour et al. (2007), there may be multiple and potentially valence-dependent prediction error signals. That is, separate neuronal prediction-error signals may increase in anticipation of gains and of losses, each contributing to observed BOLD activation. Second, monetary losses may not have similar psychological and neural effects as omitted primary rewards or aversive stimuli. In particular, the loss of money reflects an opportunity cost that affects the total value of a future reward, rather than an immediate negative consequence (e.g., a painful shock). Accordingly, humans frequently reframe decision problems to minimize decision difficulty or to maximize perceived value (Tversky and Kahneman, 1974); in our paradigm, like many others, the loss cue may have been reframed as an opportunity to avoid negative consequences. Third, activation measured using fMRI does not necessarily map onto the firing rate of individual neurons. Substantial methodological work suggests that the amplitude of BOLD activation matches best to local field potential and multi-unit activity within a region (Goense and Logothetis, 2008), and less well to single-unit activity. The relatively coarse timescale of fMRI data collection, combined with the filtering effects of the BOLD hemodynamic response, precludes determination of the relative timing of the contributing neuronal activity. In addition, evidence from Ungless et al. (2004) indicates the VTA may not be homogeneous in its responsiveness to gain and loss. They find evidence of two distinct populations of neurons in the VTA, one responsive to positive stimuli and the other to aversive stimuli. Inhomogeneity in the VTA within dopaminergic neurons is supported in recent work by Matsumoto and Hikosaka (2009), who not only show distinct populations of neurons responsive to positively valenced stimuli but also provide evidence of a dorsal/ventral spatial distinction. Preliminary findings in the current study suggest that an fMRI study designed to look for spatial separation of gain-specific neuronal populations in the VTA may be able to isolate them from those responsive to both gains and losses. Given these caveats, our results should be interpreted as showing that some aspect of information processing in VTA is driven by motivational properties of anticipated rewards or by a prediction error that increases with the magnitude of anticipated punishment. We also note that individuals who are more reward sensitive display effects of valence not present in those relatively less sensitive to reward.

MODULATION OF ANTICIPATORY REWARD SIGNALS BY SELF- VS. OTHER-DIRECTED CONTEXT
The NAcc not only responds to meaningful self-directed outcomes, it also responds to a variety of other-directed outcomes: social cooperation (Rilling et al., 2002), altruistic punishment (Singer et al., 2006), and rewards for a favored charity (Moll et al., 2006; Harbaugh et al., 2007), among others. In these latter cases, the reward may be emulated as if it were being personally received and is therefore represented within the same system, albeit with reduced magnitude. We note that prior research showing activation in the reward system to charitable rewards used tasks involving active decisions or passive receipt of those rewards. Here, we show that mere anticipation of potential reward is sufficient to evoke activation within the NAcc; moreover, like the work of Moll et al., we extend our conclusions to include VTA, as well.
Notably, all of our main-effect analyses indicated that self-directed and charity-directed rewards evoked very similar patterns of activation: for both types of rewards, activation in the NAcc and VTA increased for both anticipated gains and anticipated losses. What differentiated these two reward types was our participants' relative reward sensitivity, such that individuals with higher reward sensitivity showed lower responses for all charitable rewards. Somewhat surprisingly, we found no similar across-participant effect of our other-regarding-preference covariate. We note that prior studies have shown that the relative subjective value of different charitable rewards, defined by the participant's willingness to engage in a transaction as opposed to individual differences in overall other-regarding preferences, modulates activation of the vSTR (Moll et al., 2006; Harbaugh et al., 2007). In contrast, self-reported trait measures of other-regarding preferences have been reported to relate to structural (Yamasue et al., 2008) and functional (Tankersley et al., 2007) differences in other brain regions associated with social cognition. The independence of other-regarding preferences and likelihood of engaging in a charitable transaction is worthy of further investigation.
We have presented evidence that motivational salience modulates activation in the VTA and NAcc. Activations during the anticipation phase of all trial types were positive with respect to a $0 trial. However, the magnitude of this positive activation was modulated by three factors. First, the beneficiary: activations were smaller in magnitude when the outcome of the trial was not directed toward the participant, suggesting that a single system processes social and personal rewards according to their motivational salience. Second, the valence: in the VTA, the anticipation of gains evokes greater activation than the anticipation of losses, even though both conditions are greater than trials where no reward or punishment could be obtained. Third, the reward sensitivity of the individual: for participants who are more reward sensitive, the magnitude of activations to anticipation in the VTA and NAcc is largest on gain trials played for themselves. We conclude that both the VTA and NAcc provide anticipatory signals that largely reflect the motivational significance of potential rewards.