Nucleus Accumbens Core and Shell are Necessary for Reinforcer Devaluation Effects on Pavlovian Conditioned Responding

The nucleus accumbens (NA) has been hypothesized to be part of a circuit in which cue-evoked information about expected outcomes is mobilized to guide behavior. Here we tested this hypothesis using a Pavlovian reinforcer devaluation task, previously applied to assess outcome-guided behavior after damage to regions such as the orbitofrontal cortex and amygdala that send projections to NA. Rats with sham lesions or neurotoxic lesions of either the core or shell subdivision of NA were trained to associate a 10-s CS+ with delivery of three food pellets. After training, half of the rats in each lesion group received food paired with illness induced by LiCl injections; the remaining rats received food and illness unpaired. Subsequently, responding to the CS+ was assessed in an extinction probe test. Both sham and lesioned rats conditioned to the CS+ and formed a conditioned taste aversion. However only sham rats reduced their conditioned responding as a result of reinforcer devaluation; devalued rats with lesions of either core or shell showed levels of responding that were similar to lesioned, non-devalued rats. This impairment was not due to the loss of motivational salience conferred to the CS+ in lesioned rats as both groups responded similarly for the cue in conditioned reinforcement testing. These data suggest that NA core and shell are part of a circuit necessary for the use of cue-evoked information about expected outcomes to guide behavior.

These data suggest that NA, along with ABL and afferent regions in lateral OFC, might be part of a circuit critical for the use of cue-evoked information about expected outcomes to guide behavior. We tested this hypothesis by using a Pavlovian reinforcer devaluation task. In the critical extinction probe test, rats must use a representation of the outcome evoked by the CS+. To adaptively guide behavior, this CS+ evoked representation must be updated to reflect the current value of the outcome. Previous evidence has demonstrated that OFC and ABL are essential to updating and using the current value of the outcome to guide behavior. Here, we tested the role of NA core and shell in the Pavlovian reinforcer devaluation task and found that pre-training lesions disrupted the ability of the animal to alter conditioned responding based on the current value of the outcome.

Materials and Methods
Subjects
Fifty-six male Long-Evans rats (Charles River Laboratories, Wilmington, MA, USA), weighing between 275 and 325 g on arrival, were individually housed and given ad libitum access to food and water, except during testing. Beginning 5 days before testing and continuing until testing ended, rats were food deprived to 85% of their baseline body weight. The animals were kept on a 12-h light/dark cycle and tested only during the light cycle. Rats were tested at the University of Maryland, School of Medicine in accordance with University of Maryland and NIH guidelines.

Introduction
A critical feature of adaptive behavior is the ability to use expectations of outcomes to appropriately guide behavior. Experiments in rats, monkeys, and humans using reinforcer devaluation have implicated the orbitofrontal cortex (OFC) and basolateral amygdala (ABL) in the ability to use information about expected outcomes to guide Pavlovian behavior (Hatfield et al., 1996; Malkova et al., 1997; Gallagher et al., 1999; Gottfried et al., 2003; Machado and Bachevalier, 2007).
ABL and OFC, defined broadly as including ventrolateral and lateral orbital regions as well as parts of insular cortex (Ongur and Price, 2000; Schoenbaum et al., 2002), send projections to the nucleus accumbens (NA) (McDonald, 1991; Berendse et al., 1992; Brog et al., 1993; Voorn et al., 2004; Schilman et al., 2008). The NA has long been implicated in a variety of Pavlovian behaviors (Mogenson et al., 1980; Cardinal et al., 2002a). Neurons in the NA fire to cues in a manner that appears to signal their associative significance (Nicola et al., 2004; Peoples et al., 2004; German and Fields, 2007; Hollander and Carelli, 2007; Kimchi and Laubach, 2009; Roesch et al., 2009; van der Meer and Redish, 2009), and this region is critical to Pavlovian-to-instrumental transfer and other aspects of Pavlovian responding (Parkinson et al., 1999a,b, 2000; Corbit et al., 2001; Hall et al., 2001; Cardinal et al., 2002b; de Borchgrave et al., 2002; Balleine and Corbit, 2005).

Surgery
We performed aseptic surgeries to make bilateral neurotoxic lesions of the NA core and NA shell under isoflurane anesthesia. All surgeries were conducted prior to behavioral testing, and the animals were given 1 week to recover. NMDA was delivered at 0.1 μl/min at AP: 1.6, ML: ±0.9 at three DV sites: DV −7.8 (0.2 μl), DV −7.2 (0.1 μl) and DV −6.5 (0.1 μl). These coordinates and neurotoxic agents have been demonstrated to selectively lesion either NA shell or NA core with little to no damage to surrounding regions (Ito et al., 2008). Sham surgeries (n = 20) were performed by lowering the Hamilton syringe without infusing neurotoxin.

Apparatus
Sixteen standard behavior boxes (12″ × 10″ × 12″) in sound attenuating cubicles were used for testing (Coulbourn Instruments, Allentown, PA, USA). A recessed food cup was placed in the center of the right wall approximately 2 cm above the floor. A feeder, mounted outside of the behavioral box, contained 45 mg sucrose pellets (Bio-Serv., Frenchtown, NJ, USA) and was connected to the food cup. The house light was mounted above the food cup in the center panel.

Pavlovian conditioning
All behavioral training procedures are outlined in Table 1. After rats reached 85% of their baseline body weight, they were trained to eat from the food cup over two days during daily 64-min shaping sessions that each included 16 deliveries of three 45 mg grain pellets (Bio-Serv, Frenchtown, NJ, USA). No additional stimuli were present during food cup shaping. Following food cup shaping, animals underwent 10 days of Pavlovian conditioning. Rats received sixteen 10-s presentations of the house light (CS+), each followed by delivery of three sucrose pellets. Intertrial intervals varied from 3 to 5 min. The 10-s pre-CS period was used to calculate baseline responding.
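The trial structure described above can be sketched in a few lines. This is only an illustration: the function name, the uniform sampling of intertrial intervals, and the convention that each interval is timed from the previous CS offset are assumptions, since the text states only that intervals varied from 3 to 5 min.

```python
import random

def make_session_schedule(n_trials=16, cs_s=10.0, iti_range_s=(180.0, 300.0), seed=None):
    """Return a list of (cs_onset_s, cs_offset_s) tuples for one conditioning session.

    Hypothetical helper: ITIs are drawn uniformly from iti_range_s (an assumption)
    and are measured from the previous CS offset (also an assumption).
    """
    rng = random.Random(seed)
    schedule = []
    t = 0.0
    for _ in range(n_trials):
        t += rng.uniform(*iti_range_s)   # variable intertrial interval, 3 to 5 min
        schedule.append((t, t + cs_s))   # 10-s CS+; pellets would follow at offset
        t += cs_s                        # advance the clock to CS offset
    return schedule

schedule = make_session_schedule(seed=0)
```

With these defaults the sixteen trials span roughly 48 to 83 min, depending on the intervals drawn.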

Reinforcer devaluation
Following conditioning, rats were matched for performance and divided into devalued and non-devalued groups. On days 1 and 3, non-devalued groups were given 10-min access to a ceramic bowl containing 100 sucrose pellets. On days 2 and 4, devalued groups were given 10-min access to a ceramic bowl containing 100 sucrose pellets, and immediately following this consumption period, they were given an intraperitoneal (i.p.) injection of 0.3 M LiCl (Hatfield et al., 1996; Pickens et al., 2003, 2005; Johnson et al., 2009). Non-devalued animals also received i.p. injections of 0.3 M LiCl on days 2 and 4, but these were not contiguously paired with sucrose pellets. In Figure 2B, pellet consumption is depicted as a function of trial. Thus, trial 1 indicates the first exposure to pellets, when no learning has taken place. Trial 2 reflects consumption after rats have experienced one pairing of food and illness, or the LiCl injection alone. Finally, trial 3 is the final consumption test, which reflects learning from two pairings of food and illness, or illness alone.

Probe test
Following reinforcer devaluation (but prior to the final consumption test), rats were given a probe test that was exactly the same as Pavlovian conditioning outlined above, except that this test was run under extinction conditions (i.e., at the end of CS+ presentation, no pellets were delivered).

Response measures
We measured percent time spent in the food cup with an infrared photo beam positioned at the front of the food cup. For purposes of analysis, we examined the last 5 s of the CS+. Previous reports have demonstrated that responses are confined to this segment of the CS+ (Pickens et al., 2003). All of the reported behavior is conditioned responding and thus by definition occurs prior to food pellet delivery. Conditioned behavior was measured as the percent of time during the CS during which the photobeam was broken, indicating presence in the food cup. This would sum across multiple entries; historically this measure correlates closely with rate of responding and also latency to respond during the CS (Holland, 1977).
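As a minimal sketch of this measure, the percent time in the food cup can be computed by summing the overlap between photobeam-break intervals and the analysis window. The function names, the interval representation, and the example timestamps are hypothetical; only the 10-s CS+, the last-5-s analysis window, and the 10-s pre-CS baseline come from the text.

```python
def percent_time_in_cup(beam_breaks, window_start, window_end):
    """Percent of the window during which the photobeam was broken.

    beam_breaks: iterable of (start_s, end_s) intervals, assumed non-overlapping.
    Overlap is summed across entries, so multiple visits within the window add up.
    """
    if window_end <= window_start:
        raise ValueError("window must have positive duration")
    broken = 0.0
    for start, end in beam_breaks:
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            broken += overlap
    return 100.0 * broken / (window_end - window_start)

def cs_and_baseline(beam_breaks, cs_onset, cs_s=10.0, analysis_s=5.0):
    """Return (percent time in last analysis_s of the CS, percent time in the pre-CS period)."""
    cs_offset = cs_onset + cs_s
    cs_pct = percent_time_in_cup(beam_breaks, cs_offset - analysis_s, cs_offset)
    pre_pct = percent_time_in_cup(beam_breaks, cs_onset - cs_s, cs_onset)
    return cs_pct, pre_pct

# Hypothetical beam-break intervals (seconds) around a CS+ that starts at t = 100 s.
breaks = [(92.0, 94.0), (106.0, 107.5), (108.0, 110.5)]
cs_pct, pre_pct = cs_and_baseline(breaks, cs_onset=100.0)
print(cs_pct, pre_pct)  # 70.0 20.0
```

The third interval extends past CS offset and is clipped at the window boundary, so only responding that precedes pellet delivery is counted.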

Conditioned reinforcement
Conditioned reinforcement testing took place over two consecutive days, beginning two days after the probe test. Each day consisted of a single 30-min session. For each session, two levers were inserted into the behavioral box, one on either side of the food cup. Responding on one lever resulted in a 1-s presentation of the CS+ (the same CS used in prior conditioning); responding on the other had no programmed consequences. Cues were presented on an FR2 schedule, and lever-cue associations were counterbalanced across animals.
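The FR2 contingency can be illustrated with a short sketch in which every second press on the cue-paired lever triggers the CS+ and presses on the other lever are only counted. The function name, the lever labels, and the press times are hypothetical, and the sketch assumes that presses made while the 1-s cue is on still count toward the ratio, which the text does not specify.

```python
def run_fr2_session(presses, ratio=2):
    """Apply a fixed-ratio schedule to a list of (time_s, lever) presses.

    lever is "active" (cue-paired) or "inactive" (no programmed consequence).
    Returns per-lever press counts and the times at which the CS+ would be presented.
    """
    counts = {"active": 0, "inactive": 0}
    cue_onsets = []
    since_last_cue = 0
    for time_s, lever in sorted(presses):
        counts[lever] += 1
        if lever == "active":
            since_last_cue += 1
            if since_last_cue == ratio:      # ratio requirement met
                cue_onsets.append(time_s)    # 1-s CS+ presentation begins here
                since_last_cue = 0
    return counts, cue_onsets

# Hypothetical press record for part of a session.
presses = [(1.0, "active"), (2.0, "inactive"), (3.0, "active"),
           (4.0, "active"), (9.0, "active"), (10.0, "active")]
counts, cue_onsets = run_fr2_session(presses)
print(counts, cue_onsets)  # {'active': 5, 'inactive': 1} [3.0, 9.0]
```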

Histology
After conditioned reinforcement testing, the rats were deeply anesthetized with isoflurane and perfused with 4% paraformaldehyde (PFA). Brains were extracted and kept in 4% PFA for 24 h and then transferred to a 30% sucrose solution for 24-48 h. Finally, 40-μm sections were made, mounted on slides, and Thionin stained to verify lesions.

Data analysis
Data were collected using Graphic State 2 software from Coulbourn Instruments (Allentown, PA, USA). The data were then processed in Matlab to extract response rates and percent time spent in the food cup during CS presentations. Finally, the data were analyzed using Statistica, version 9. Post hoc comparisons were either planned comparisons or Tukey's honestly significant difference (HSD) tests. Planned comparisons assume that one group (i.e., devalued) will be different than another (i.e., non-devalued) and calculate a suitable p value.
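For readers unfamiliar with the logic of a two-group comparison, the sketch below computes a pooled-variance t statistic for a devalued versus a non-devalued group. It is a simplified stand-in and not the procedure Statistica applies: a planned comparison within an ANOVA would normally use the error term from the full design. The function name and the data values are hypothetical and do not come from the experiment.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / se, df

# Hypothetical percent-time-in-cup scores, for illustration only.
non_devalued = [40.0, 50.0, 60.0]
devalued = [20.0, 30.0, 40.0]
t_stat, df = pooled_t(non_devalued, devalued)
```

The resulting statistic would then be referred to a t distribution with the returned degrees of freedom to obtain a p value.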

Results
Rats with sham (n = 20) or neurotoxic lesions of the nucleus accumbens shell (n = 16) or core (n = 20) were trained in a Pavlovian devaluation task. An experimental timeline is shown in Table 1. Histology after testing showed that 17 core lesioned rats and 16 shell lesioned rats had acceptable (>85%) bilateral damage to the respective region with limited damage to surrounding regions (Figure 1). The remaining rats were excluded from the analysis.
Training began with 10 days of conditioning. As illustrated in Figure 2A, conditioned responding in the last 5 s of the CS+ increased in both groups across sessions. A three-factor ANOVA (lesion × session × cue/pre) revealed significant main effects of session (F(9,387) = 58.63, p < 0.001) and cue/pre (F(1,43) = 150.46, p < 0.0001), and a significant session × cue/pre interaction (F(9,387) = 41.44, p < 0.001). In addition to these effects, there was also a significant overall effect of lesion (F(2,43) = 4.05, p < 0.05). Notably, while post hoc testing revealed that sham rats responded significantly more during the pre-CS baseline period than core and shell lesioned rats (Tukey's HSD, p < 0.05, df = 43), on the final day of conditioning there were no significant differences in responding during the CS+ in any group. Moreover, shell and core lesioned rats responded similarly to sham rats in the 5 s after CS termination (when the US is delivered), indicating that all groups were consuming pellets in the food cup and that they had similar motivation to retrieve the food reward (Fs < 1, ps > 0.1).
After reinforcer devaluation (and prior to the final consumption test, which occurred after completion of all testing), all rats underwent a probe test in which they were exposed to the CS+ again under extinction conditions. As illustrated in Figure 2C, reinforcer devaluation had a significant effect on conditioned responding in sham but not core or shell lesioned rats. A two-factor ANOVA (lesion × pairing) revealed a significant lesion × pairing interaction (F(1,29) = 4.83, p < 0.05). Planned comparisons demonstrated that sham-paired rats responded significantly less than sham-unpaired rats (p < 0.05), whereas both shell and core lesioned rats in the paired and unpaired groups responded similarly (p > 0.2).
Finally, the rats were tested to see if the CS+ would support the acquisition of a new instrumental response. As illustrated in Figure 2D, all groups learned to respond selectively on a lever that activated the CS+. A two-factor ANOVA (lesion × lever) demonstrated a significant main effect of lever (F(1,30) = 13.14, p < 0.01) but no significant main effect of group nor any interaction with group (F < 1.2, p > 0.1). Planned comparisons revealed that rats in both the sham and the core lesioned groups responded significantly more on the lever that led to the CS+ (ps < 0.05); rats with shell lesions showed the same pattern, although the direct comparison did not reach significance (p = 0.098). Comparison of lever pressing in the current experiment to lever pressing by controls in previous studies in which we have tested conditioned reinforcement immediately following the final Pavlovian conditioning session (Burke et al., 2007, 2008) revealed no significant differences in lever pressing (Fs < 2.0, ps > 0.1). Thus the imposition of a probe test had no effect on conditioned reinforcement. This is consistent with reports that extinction does not impact underlying CS-US associations that form the basis of conditioned reinforcement (Rescorla, 1996).

Discussion
Here we show that NA lesions impair changes in Pavlovian conditioned responding after reinforcer devaluation, in a task that is sensitive to damage to upstream regions of the OFC and ABL. Neurotoxic lesions of either core or shell prior to behavioral testing had little effect on conditioning or the formation of a conditioned taste aversion; however, rats with lesions of either NA core or shell failed to show the normal reduction in conditioned responding in the probe test after devaluation. This suggests that lesions of either region affect the ability of rats to utilize the cue to evoke a representation of the expected outcome; because rats were given pre-training lesions, this deficit could be in the formation of the original associations (i.e., in the ability of the CS to evoke a representation of the expected outcome including its value), updating of the associated outcome representation during devaluation (i.e., linking the cue-evoked representation to the new value of the US after devaluation), or in the subsequent use of this information in the probe test. Importantly, the deficit cannot be attributed to the inability of the CS to gain motivational salience in NA lesioned rats, because lesioned rats responded like control rats in conditioned reinforcement. Conditioned reinforcement conducted using similar procedures has been shown to be insensitive to reinforcer devaluation (Parkinson et al., 2005; Burke et al., 2007). Thus the normal performance here by lesioned rats is consistent with preservation of these devaluation-insensitive or general affective representations.
Similarly, while there was a non-specific effect of lesions on behavior, evident in attenuated baseline and conditioned responding during training and in the probe test, the lack of an effect of devaluation was not easily attributable to this general effect. Lesioned rats clearly conditioned to the cue, and conditioned responding to the cue in the probe test, though reduced relative to controls, was roughly double baseline responding during conditioning. Thus there was not a floor to prevent an effect of devaluation on responding in lesioned rats. Nevertheless there was no effect of devaluation on responding of rats in either lesioned group; in both cases, lesioned rats that had received food-illness pairings (and stopped eating the food) responded the same as lesioned rats that had received food and illness on alternate days (and continued eating the food). While it has been suggested to us that this pattern of results could reflect a generalized effect of illness on responding, this seems unlikely given that illness was experienced days removed from training and in the rats' home cages. Further, rats showed no lingering effects of illness, and those that received food and illness on alternate days continued to consume the food pellets like controls. Thus we would suggest the much more parsimonious explanation of these data, which is that lesions of NA, either core or shell, had two somewhat independent effects: a generalized effect on behavior and a more specific impairment in outcome-guided responding, due to effects on acquisition, updating or the use of the specific CS-US association.
Although the single-reinforcer Pavlovian reinforcer devaluation paradigm used here has been extensively applied to understand the functions of OFC and ABL, to the best of our knowledge it has not been applied previously to study NA. Instead, prior studies using devaluation to explore NA function have typically focused on tasks with significant instrumental components; these studies have led to conflicting evidence (Corbit et al., 2001; de Borchgrave et al., 2002). One notable exception to this is a recent study examining the effects of dopamine depletion in NA core on changes in Pavlovian and instrumental responding after satiation (Lex and Hauber, 2010). Rats were trained that different responses or cues predicted either a food or sucrose reinforcer. Subsequently rats were satiated on one reinforcer prior to an extinction test, similar to the probe test employed here. Dopamine depletion within NA core had no effect on instrumental behavior, but it generally reduced Pavlovian responding and also abolished the normal effect of devaluation induced by satiation. Our results are consistent with this prior report and extend it to show that both core and shell are necessary for normal changes in conditioned responding following reinforcer devaluation, even if training involves only a single reinforcer and devaluation is induced by pairing with illness.
The involvement of NA in outcome-guided responding is consistent with anatomical studies showing that NA receives input from the region we would define as rat OFC (Ongur and Price, 2000; Schoenbaum et al., 2002), particularly the most lateral agranular insular regions, and ABL (McDonald, 1991; Berendse et al., 1992; Brog et al., 1993; Voorn et al., 2004; Schilman et al., 2008). Experiments using pre- and post-training lesions of these areas in the same behavioral paradigm applied here demonstrate that OFC is critical for using outcome-expectancies to guide responding (Gallagher et al., 1999; Pickens et al., 2003, 2005). ABL also seems to be critically involved in this setting, particularly for the acquisition of cue-outcome associations in the present task (Hatfield et al., 1996). Pre-training ABL lesions abolish the devaluation effect, whereas post-training lesions do not (Pickens et al., 2003), unless multiple reinforcers are used (Wellmann et al., 2005; Johnson et al., 2009). Based on these results, it is tempting to speculate that these upstream regions interact with NA to drive changes in Pavlovian responding after reinforcer devaluation.
Notably this hypothesis was addressed recently in a study of reward-based decision-making in primates. In this study (Izquierdo and Murray, 2010), monkeys were allowed to make choices between visual objects that had been paired with different food rewards. Choices were made before and after the monkeys were fed to satiety on one of the foods. Prior work using this task has shown that normal monkeys bias their choices away from objects associated with the food devalued by satiation, whereas monkeys with damage to the orbitofrontal-amygdalar circuit show no effect of devaluation on choice behavior (Malkova et al., 1997; Baxter and Murray, 2000). In this follow-up study, monkeys that had exhibited deficits after ipsilateral damage to the orbitofrontal-amygdalar circuit 26 months earlier were retested before and after lesions of contralateral NA (both core and shell, thereby disconnecting the orbitofrontal-amygdalar regions from NA). These monkeys, who had recovered function during the time between the two studies, had no difficulty shifting their choice performance away from the objects paired with the devalued food after removal of contralateral NA. These results are contrary to the hypothesis that the role of OFC and amygdala in devaluation involves input to NA.
The apparent discrepancy between this study and the implications of the current results is striking and may be due to a number of factors. One possibility is the use of multiple reinforcers, though this seems unlikely since upstream regions are, if anything, more important when representations become more complex (Wellmann et al., 2005; Johnson et al., 2009), and the study mentioned earlier using 6-OHDA lesions of accumbens core found results comparable to ours using two different reinforcers (Lex and Hauber, 2010). A second possibility is the presence of both instrumental contingencies and Pavlovian associations in the choice task (Izquierdo and Murray, 2010); this might make possible more complex strategies for solving the devaluation problem not available to rats in the current study. A third, related possibility is that the recovery of function shown by the monkeys between the initial study, when the ipsilateral orbital and amygdalar lesions were made, and the later study when NA was removed (Izquierdo and Murray, 2010) may have allowed them to sustain performance in the face of an insult that would have impaired devaluation in a naïve animal. For example, these monkeys may have come to rely more heavily on mediodorsal thalamic circuits; mediodorsal thalamus is implicated in reinforcer devaluation (Mitchell et al., 2007), and contralateral damage to this region in monkeys with prior orbitofrontal-amygdalar lesions resulted in a return to impaired performance (Izquierdo and Murray, 2010).
Lastly, it may be that lesions of both core and shell of NA are less effective at impairing devaluation than partial damage to one or the other subdivision. This might occur if the impairment caused by damage to one region is mediated by abnormal processing in the remaining area. We have found something analogous to this in studying the role of amygdala in mediating OFC-dependent reversal deficits, where removal of basolateral amygdala actually mitigated impairment of reversal learning caused by OFC damage (Stalnaker et al., 2007). Notably there is some basis for such speculation; Balleine and colleagues have reported that lesions of NA restricted to the core impair changes in instrumental responding after reinforcer devaluation (Corbit et al., 2001), whereas lesions that encompass both core and shell have no effect (de Borchgrave et al., 2002).
Of course, even if one accepts the involvement of NA in Pavlovian reinforcer devaluation, the exact nature of that involvement remains to be determined using more detailed parametric manipulations. For example, here we made lesions prior to any training and used only a single outcome; prior work in ABL and OFC has shown that lesions (or inactivation) later in training may have effects that differ from pre-training lesions (Pickens et al., 2003; Wellmann et al., 2005; Johnson et al., 2009). Likewise, the use of multiple outcomes can reveal dissociations not evident using the procedures applied here (Pickens et al., 2003; Wellmann et al., 2005; Johnson et al., 2009). Based on its anatomical position and prior evidence from second-order conditioning work (Setlow et al., 2002a,b), it seems likely that NA functions to integrate prefrontal and amygdalar input. This would be consistent with long-standing notions that NA functions as the limbic motor interface (Mogenson et al., 1980) and with more recent proposals that ventral striatum generally functions as a critical final processing station for computing expected value in a common currency (Joel et al., 2002).

Acknowledgment
This work was supported by grants from NIDA to Geoffrey Schoenbaum.