Decision Salience Signals in Posterior Cingulate Cortex

Heilbronner, Sarah; Hayden, Benjamin Yost; Platt, Michael

doi:10.3389/fnins.2011.00055

ORIGINAL RESEARCH article

Front. Neurosci., 19 April 2011

Sec. Decision Neuroscience

volume 5 - 2011 | https://doi.org/10.3389/fnins.2011.00055

This article is part of the Research Topic: Behavioral and neuroscientific analysis of economic decision making in animals

Sarah R. Heilbronner¹*

Benjamin Y. Hayden¹

Michael L. Platt¹,²

¹ Department of Neurobiology, Center for Cognitive Neuroscience, Duke University, Durham, NC, USA
² Department of Evolutionary Anthropology, Duke University, Durham, NC, USA

Despite its phylogenetic antiquity and clinical importance, the posterior cingulate cortex (CGp) remains an enigmatic nexus of attention, memory, motivation, and decision making. Here we show that CGp neurons track decision salience – the degree to which an option differs from a standard – but not the subjective value of a decision. To do this, we recorded the spiking activity of CGp neurons in monkeys choosing between options varying in reward-related risk, delay to reward, and social outcomes, each of which varied in level of decision salience. Firing rates were higher when monkeys chose the risky option, consistent with their risk-seeking preferences, but were also higher when monkeys chose the delayed and social options, contradicting their preferences. Thus, across decision contexts, neuronal activity was uncorrelated with how much monkeys valued a given option, as inferred from choice. Instead, neuronal activity signaled the deviation of the chosen option from the standard, independently of how it differed. The observed decision salience signals suggest a role for CGp in the flexible allocation of neural resources to motivationally significant information, akin to the role of attention in selective processing of sensory inputs.

Introduction

Although posterior cingulate cortex (CGp) dysfunction is associated with both Alzheimer’s Disease (Minoshima et al., 1997; Hirono et al., 1998; Yoshiura et al., 2002) and schizophrenia (Newell et al., 2006, 2007), the cognitive function of this brain area remains unclear. Neuroimaging studies (Maddock et al., 2003; Buckner and Vincent, 2007; Kable and Glimcher, 2007; Luhmann et al., 2008), lesion studies (Gabriel et al., 1991; Bussey et al., 1996), and neurophysiological studies (McCoy et al., 2003; McCoy and Platt, 2005; Hayden et al., 2008; Pearson et al., 2009) support two distinct functional roles for CGp in decision making.

On one hand, correlations between neural activity and individual decision preferences suggest CGp contributes to decision making by signaling the subjective value of a chosen option (McCoy and Platt, 2005; Kable and Glimcher, 2007; Levy et al., 2011). Indeed, firing rates of neurons in this area track the subjective value of preferred risky options in a choice task (McCoy and Platt, 2005), and BOLD signal correlates with the subjective value of a delayed option in an inter-temporal choice task (Kable and Glimcher, 2007).

However, modulations in neural activity by task engagement, learning, and memory suggest CGp plays a more fundamental role in the allocation of neural resources to cognitive control akin to that of attention in the selective processing of sensory stimuli (Maddock et al., 2001, 2003; Greicius et al., 2004; Luhmann et al., 2008). Firing rates of CGp neurons are modulated by the omission of predicted rewards as well as larger than average rewards (McCoy et al., 2003), signal whether monkeys will switch from a preferred option to a non-preferred one (Hayden et al., 2008), and predict when monkeys will strategically shift from exploiting an option with known value to learning about alternatives (Pearson et al., 2009). Moreover, increased tonic firing rates in CGp predict lapses in task performance (Hayden et al., 2009), corroborating brain imaging studies that have linked high CGp activity to decline in task engagement (Weissman et al., 2006). Finally, CGp lesions in rodents and rabbits impair several forms of associative learning (Gabriel and Sparenborg, 1987; Bussey et al., 1996).

Adjudicating between these two possibilities is difficult because motivational variables associated with cognitive control may covary with valuation (Pearce and Hall, 1980; Maunsell, 2004; Rangel et al., 2008). We tested these two hypotheses by dissociating the subjective value of an option as revealed by choice preference from the degree to which that option differed from a standard, herein defined as decision salience. Monkeys made decisions in three distinct contexts, each offering a choice between options differing in a single relevant variable: risk (McCoy and Platt, 2005), delay to reward (Hwang et al., 2009; Louie and Glimcher, 2010), and the potential to acquire social information at a juice cost (Deaner et al., 2005; Klein et al., 2008). Each variable assumed one of three different levels of decision salience (i.e., risk, delay, or price). We found that, across decision contexts, neuronal activity was uncorrelated with subjective value as estimated from choice frequencies. Instead, firing rates reflected decision salience, the degree of deviance of a chosen option from the standard. Our findings thus argue against the subjective value hypothesis and support the idea that CGp contributes to the motivational allocation of cognitive resources – in part by signaling decision salience.

Materials and Methods

Two male rhesus macaques participated in this experiment (monkeys N and S). Monkeys began each trial by fixating on a central square. Following a fixation period (2 s for monkey N, 0.3–2 s for monkey S), they were required to shift gaze to one of two eccentric targets. After a successful gaze shift, a fluid or a fluid plus social reward was delivered (see Figure 1A). On each trial, the monkeys chose between a standard target, offering an immediate, safe, medium-sized reward (200 μL of juice) with no social reward, and a second, non-standard target. The identity of this second option determined the trial type and varied in blocks. Following reward delivery, an inter-trial interval (ITI) began. The ITI was 5 s on all trials except choices of the delayed option, for which it was adjusted so that the total trial length was approximately the same as on all other trials.
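
The ITI adjustment can be illustrated with a minimal sketch. The text states only that total trial length was kept approximately constant; the exact rule used here (subtracting the reward delay from the 5 s base ITI) and the function name are assumptions made for illustration.

```python
BASE_ITI_S = 5.0  # ITI on all non-delay choices, as stated in the text


def adjusted_iti(reward_delay_s, base_iti_s=BASE_ITI_S):
    """Return an ITI such that delay + ITI equals the base ITI.

    Assumed rule: the paper says only that trial lengths were
    "approximately the same", not how the adjustment was computed.
    """
    return max(0.0, base_iti_s - reward_delay_s)


for delay in (0.0, 1.0, 2.0, 3.0):
    print(f"delay {delay:.0f} s -> ITI {adjusted_iti(delay):.0f} s")
```

Under this assumed rule, the time from reward-period onset to the next trial is 5 s regardless of which option was chosen, so choosing the immediate option does not raise the reward rate per unit time.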

Figure 1. Task design and decision contexts. (A) Trial events. Trials began when a central fixation light appeared. Once the monkeys looked at the fixation light, it changed color to indicate the current context. After a stable fixation period, the fixation light extinguished, and two eccentric yellow dots appeared. When the monkey had shifted gaze to one of these targets, the reward period began. Juice was either delivered immediately, or, in the case of LL choices, after some delay. An adjusting delay followed such that all trials were of approximately the same total length. (B) Reward matrix showing outlier outcomes for each level of each context. All recording sessions included blocks composed of one of nine trial types: three conditions (risk, delay, social), each with three levels. Clock indicates delay to reward, the droplet indicates amount of juice delivered, and the picture indicates that a social reward was presented just before juice delivery. The standard option was always 200 μL of juice available immediately, with no picture.

Each trial offered one of three possible decision salience levels within one of three possible decision contexts. Thus there were nine trial types (see Figure 1B). Monkeys completed at least two blocks (one for each side of the monitor) of each of the nine trial types within each session. The first context gave monkeys a choice between a sure reward and a risky gamble on a larger or smaller reward (McCoy and Platt, 2005). We defined risk as the coefficient of variation (CV) in reward value, to permit easy comparison with other studies. While the safe option remained the same across all three levels (200 μL of juice), the risky option could be either high risk (280 μL of juice 50% of the time; 120 μL of juice 50% of the time), medium risk (253 μL of juice 50% of the time; 147 μL of juice 50% of the time), or low risk (227 μL of juice 50% of the time; 173 μL of juice 50% of the time). The second context was a form of a standard delay discounting task (Mazur, 1987; Ainslie and Haslam, 1992; Kim et al., 2008). Monkeys chose between a small, immediately available reward (200 μL of juice – the standard) and a large, delayed reward (233 μL of juice). The delay associated with the large reward could be short (1 s), medium (2 s), or long (3 s), depending on the condition. The final context was a social decision making task based loosely on the “pay-per-view” task described previously (Deaner et al., 2005; Klein et al., 2008), in which monkeys chose between a large amount of juice without an associated picture (200 μL) and a smaller amount of juice paired with a small photograph of a familiar monkey. The amount of juice associated with the picture could be either small (120 μL of juice), medium (147 μL of juice), or large (173 μL of juice), depending on the condition. In contrast with previous studies, photographs of different monkeys with different ranks within the colony were randomly interleaved. The safe option (risk context), immediate option (delay context), and non-picture option (social context) were identical, so we refer to this as the standard option. Thus, a standard option was available on every choice, and the identity of the non-standard (outlier) option determined the decision context and level.
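
As a worked check of the risk levels, the following sketch computes the expected value and CV of each gamble from the juice amounts listed in the text. It assumes the CV is the probability-weighted (population) standard deviation divided by the mean; the paper does not spell out the formula, and the function name is illustrative.

```python
import math

# Risky-option outcomes in microliters, taken from the task description;
# each outcome occurs with probability 0.5.
RISKY_OPTIONS = {
    "low": (227.0, 173.0),
    "medium": (253.0, 147.0),
    "high": (280.0, 120.0),
}


def gamble_stats(high, low, p_high=0.5):
    """Return (expected value, coefficient of variation) of a two-outcome gamble."""
    mean = p_high * high + (1.0 - p_high) * low
    variance = p_high * (high - mean) ** 2 + (1.0 - p_high) * (low - mean) ** 2
    return mean, math.sqrt(variance) / mean


for level, (high, low) in RISKY_OPTIONS.items():
    ev, cv = gamble_stats(high, low)
    print(f"{level:>6}: EV = {ev:.0f} uL, CV = {cv:.3f}")
```

Under this assumption every gamble has an expected value of 200 μL, equal to the safe option, and the CVs are 0.135, 0.265, and 0.400 for the low, medium, and high risk levels.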

Each block consisted of 11 to 21 trials; the specific number was chosen randomly so as to prevent the monkey from guessing when the block would end. Each block contained choices belonging to only one of the nine possible conditions (three levels and three contexts). Each block began with a forced-choice trial in which only the outlier option was available. This trial served to inform the subject about the new block’s context and level. In addition, the color of the central fixation square was associated with the decision context, so monkeys always had information about whether they were making choices about risk, time, or social/juice reward tradeoffs. The standard and outlier options were randomly assigned to the two target locations at the start of each block and remained there for the duration of the block. On the next block of the same type, these assignments reversed. Thus, locations were roughly counterbalanced.
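
The block structure described above can be sketched as a small schedule generator. This is an illustration, not the task code used in the study: the level labels, the shuffled block order, and the choice to count the forced-choice trial within the 11 to 21 trials are all assumptions.

```python
import random

CONTEXTS = ("risk", "delay", "social")
LEVELS = ("low", "medium", "high")  # generic salience labels (hypothetical names)


def make_block(context, level, outlier_side, rng):
    """Build one block: a forced-choice trial followed by free-choice trials."""
    n_trials = rng.randint(11, 21)  # inclusive bounds, as given in the text
    return [
        {
            "context": context,
            "level": level,
            "outlier_side": outlier_side,
            "forced": i == 0,  # first trial offers only the outlier option
        }
        for i in range(n_trials)
    ]


def make_session(seed=0):
    """Two blocks per condition, with the outlier on opposite sides."""
    rng = random.Random(seed)
    conditions = [(c, l) for c in CONTEXTS for l in LEVELS]
    first_side = {cond: rng.choice(("left", "right")) for cond in conditions}
    flip = {"left": "right", "right": "left"}
    blocks = []
    for repeat in range(2):
        order = conditions[:]
        rng.shuffle(order)  # assumed: block order is not specified in the text
        for cond in order:
            side = first_side[cond] if repeat == 0 else flip[first_side[cond]]
            blocks.append(make_block(cond[0], cond[1], side, rng))
    return blocks


session = make_session(seed=1)
print(len(session), "blocks,", sum(len(b) for b in session), "trials")
```

Reversing the side assignment on the second block of each condition is what gives the rough spatial counterbalancing mentioned in the text.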

Surgical Procedures

All procedures were approved by the Duke University Institutional Animal Care and Use Committee and were designed and conducted in compliance with the Public Health Service’s Guide for the Care and Use of Animals. Two male rhesus monkeys (Macaca mulatta) served as subjects. A small head-holding prosthesis was implanted in both animals using standard surgical techniques. Six weeks later, animals were habituated to training conditions and trained to perform oculomotor tasks for liquid reward. A second surgical procedure was then performed to place a stainless steel recording chamber (Crist Instruments) over CGp at the intersection of the interaural and midsagittal planes. Animals received analgesics and antibiotics after all surgeries. Throughout both behavioral and physiological recording sessions, the chamber was kept sterile with regular antibiotic washes and sealed with sterile caps.

Behavioral Techniques

Monkeys were placed on controlled access to fluid outside of experimental sessions. Horizontal and vertical eye positions were sampled at 1000 Hz by an infrared eye-monitoring camera system (SR Research, Osgoode, ON, USA). Stimuli were controlled by a computer running Matlab (Mathworks, Natick, MA, USA) with Psychtoolbox (Brainard, 1997) and Eyelink Toolbox (Cornelissen et al., 2002). Visual stimuli were small colored squares on a computer monitor placed directly in front of the animal and centered on his eyes. A standard solenoid valve controlled the duration of juice delivery. Monkeys were generally familiar with this type of task, and had performed one of the context types described (risk) previously. Monkeys performed the entire task, consisting of the three contexts, for at least three sessions prior to recording.

Microelectrode Recording Techniques

We recorded action potentials from 71 single neurons in two monkeys (53 in monkey N, 18 in monkey S) during the performance of the task. Single electrodes (Frederick Haer Co.) were lowered under microdrive guidance (Kopf) until the waveforms of one to four neurons were isolated. Individual action potentials were identified by standard criteria and isolated on a Plexon system (Plexon Inc, Dallas, TX, USA). Neurons were selected for recording on the basis of the quality of isolation only, and not on task-related response properties.

We approached CGp through a standard recording grid. CGp was identified using a hand-held digital ultrasound device (Sonosite 180) placed against the recording chamber (Glimcher et al., 2001). We confirmed that we were in CGp using stereotactic measurements, as well as by listening for characteristic sounds of white and gray matter during recording. CGp recordings were made in areas 23 and 31 in the cingulate gyrus and ventral bank of the cingulate sulcus. These recordings were made from areas equivalent to those reported in McCoy et al. (2003), Dean and Platt (2006), and Hayden et al. (2008).

Analysis

We used an alpha of 0.05 as a criterion for significance. Peri-stimulus time histograms (PSTHs) were constructed by aligning spikes to saccade offset, averaging across trials, and smoothing with a 200-ms boxcar. Statistics were performed on binned firing rates, as described for each analysis. To compare firing rates across trials for single neurons, tests were performed on individual trials; to compare firing rates across neurons, tests were performed on average rates for individual neurons. The post-reward epoch was 900 ms, beginning at the completion of reward delivery. The pre-saccadic (pre-choice) epoch was 900 ms, beginning 1300 ms prior to saccade completion. The peri-saccadic epoch was 400 ms, ending at the completion of the saccade. Analyses were performed using Matlab (Mathworks, Natick, MA, USA).
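
The PSTH construction and epoch windows can be sketched as follows. The original analyses were run in Matlab; this Python version is only an illustration, and the 10 ms bin width, the edge handling of the boxcar, and all function and constant names are assumptions.

```python
def psth(trials, t_start, t_stop, bin_s=0.01, boxcar_s=0.2):
    """Trial-averaged firing rate (spikes/s), smoothed with a boxcar.

    trials: list of lists of spike times in seconds, each already aligned
    so that t = 0 is saccade offset.
    """
    n_bins = int(round((t_stop - t_start) / bin_s))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = int((t - t_start) // bin_s)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    rate = [c / (len(trials) * bin_s) for c in counts]

    # Moving average over `width` bins, roughly centered on each bin;
    # bins near the edges average over the samples that are available.
    width = max(1, int(round(boxcar_s / bin_s)))
    smoothed = []
    for i in range(n_bins):
        start = i - width // 2
        lo, hi = max(0, start), min(n_bins, start + width)
        smoothed.append(sum(rate[lo:hi]) / (hi - lo))
    return smoothed


# Epoch windows as (start, duration) in seconds relative to saccade offset.
PRE_SACCADIC = (-1.3, 0.9)   # begins 1300 ms before saccade completion
PERI_SACCADIC = (-0.4, 0.4)  # 400 ms ending at saccade completion


def epoch_rate(spikes, start_s, dur_s):
    """Mean firing rate (spikes/s) of one trial in [start_s, start_s + dur_s)."""
    n = sum(1 for t in spikes if start_s <= t < start_s + dur_s)
    return n / dur_s
```

The post-reward epoch is anchored to the end of reward delivery rather than to the saccade, so `epoch_rate(spikes, 0.0, 0.9)` would apply only after re-aligning spike times to that event.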

The main subjective value model was based on a model used in a previous study (Hayden et al., 2007), in which daily choice frequencies were transformed to equivalent juice amounts. The model takes advantage of the roughly linear relationship between choice frequency and juice amount identified in a previous experiment (McCoy et al., 2003). For this study, to reduce day-to-day noise, we added one additional (hypothetical) choice frequency per context (risk: CV of 0; delay: 0 s delay on large reward; social: juice amount equivalent to standard).

Results

Monkeys Exhibit Stable, Ordered Preferences in Three Decision Contexts

Each day, monkeys performed a single task with three different embedded decision contexts: risk, delay, and social valuation, each associated with three levels of decision salience. All three contexts required monkeys to shift gaze in order to choose between two eccentric targets associated with different reward properties. In the risk context, monkeys chose between a risky option (50% chance of high reward, 50% chance of low reward) and a safe option (100% chance of medium reward) of equal expected values. We varied decision salience by changing the risk level (CV) of the risky option. In the delay context, monkeys chose between a larger, delayed amount of juice (LL: larger, later) and a smaller amount of juice available immediately (SS: smaller, sooner). Delays could either be 1, 2, or 3 s, depending on the block. In the social valuation context, monkeys chose between a large amount of juice and a small amount of juice paired with a photograph of a familiar monkey (mix of dominant and subordinate males). The photograph option was paired with different small amounts of juice (small, medium, large), depending on the block. The safe, immediate, and non-social options were identical (standard) across the three contexts; only the “outlier” option changed according to block. Thus, the identity of the outlier option determined the decision context. In the risk context, monkeys preferred the probabilistically rewarded option to the safe one (Figure 2A), as described previously (McCoy and Platt, 2005; Hayden et al., 2008). In the delay context, monkeys preferred immediate rewards to delayed ones (Figure 2B). 
Finally, in the social context, they preferred larger juice rewards to smaller rewards paired with photographs of familiar monkeys [Figure 2C; p < 0.0001 in all cases, two-tailed single-sample t-tests; Monkey N risk: M = 0.59, SE = 0.008, t(4247) = 11.5; delay: M = 0.42, SE = 0.008, t(4310) = −11.3; social: M = 0.25, SE = 0.007, t(4243) = −37.9; Monkey S risk: M = 0.59, SE = 0.02, t(708) = 4.8; delay: M = 0.41, SE = 0.02, t(726) = −4.9; social: M = 0.36, SE = 0.02, t(745) = −7.8]. These effects were highly significant for both monkeys.
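
The preference tests reported here can be sketched as a single-sample t-test of the outlier-choice proportion against indifference (0.5). The reported degrees of freedom suggest that individual trials were the unit of analysis, but that reading, like the function name and the 0/1 coding of choices, is an assumption.

```python
import math


def one_sample_t(choices, null=0.5):
    """t statistic and degrees of freedom for mean(choices) against `null`.

    choices: sequence of 0/1 values, 1 = outlier option chosen.
    """
    n = len(choices)
    mean = sum(choices) / n
    var = sum((c - mean) ** 2 for c in choices) / (n - 1)  # sample variance
    se = math.sqrt(var / n)
    return (mean - null) / se, n - 1


# Toy data, not from the study: 59 outlier choices out of 100 trials.
t_stat, df = one_sample_t([1] * 59 + [0] * 41)
print(f"t({df}) = {t_stat:.2f}")
```

The same arithmetic applied to the rounded summary values for monkey N in the risk context, (0.59 − 0.50) / 0.008 ≈ 11.3, is in line with the reported t of 11.5 given rounding of M and SE.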

Figure 2. Behavioral preferences used to compute subjective value in the risky, delay, and social contexts. (A) Preferences in risk context. Monkeys significantly preferred a risky reward to a safe reward and had stronger preferences for higher levels of risk. (B) Preferences in delay context. Monkeys were significantly delay-averse, preferring smaller, immediate rewards to larger, delayed rewards, and had stronger preferences against longer delays. (C) Preferences in social valuation context. Monkeys preferred the standard juice reward to smaller rewards coupled with images, but were more likely to choose to view the image as juice volume was increased.

As noted above, there were three different levels of the outlier option in each decision context (high risk, medium risk, low risk; long delay, medium delay, short delay; large juice, medium juice, small juice), meaning there were nine different possible block types in total (three contexts × three levels). The standard option (no risk, no delay, no picture) was available on all choice trials. We found that preference for the risky option increased with increasing CV in reward [F(2, 4805) = 31.5, p < 0.0001] as described previously (McCoy and Platt, 2005). As expected (e.g., Myerson and Green, 1995; Reynolds et al., 2002; Freeman et al., 2009), monkeys also chose the smaller, immediate option more often when the delay to the larger option was longer [F(2, 4877) = 119.8, p < 0.0001]. Finally, as the amount of juice associated with the photograph increased, monkeys were more likely to choose it [F(2, 4822) = 15.0, p < 0.0001].
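
The F statistics above, each with 2 numerator degrees of freedom, are consistent with a one-way analysis of variance across the three levels of a context. The sketch below computes such an F statistic; it assumes a plain one-way design with one observation per unit, which the text does not state explicitly, and it omits the p-value because that requires the F distribution.

```python
def one_way_f(groups):
    """F statistic and (df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of observations per level.
    """
    all_vals = [v for g in groups for v in g]
    grand_mean = sum(all_vals) / len(all_vals)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, (df_between, df_within)


# Toy data, not from the study: one list per level of a single context.
f_stat, dfs = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F{dfs} = {f_stat:.1f}")
```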

Neuronal Firing Rates in CGp Do Not Track Behavioral Preferences Independent of Context

We first examined the neuronal response to choice of a risky, delayed, or social option. Figure 3 shows the firing rates of a single neuron to choice of the outlier option (shown in red) over the standard option (shown in blue). Firing rates were higher for choices of a social or risky option over the standard. Figure 4 demonstrates that, generally, average population activity was stronger when monkeys chose the risky option, which was preferred, but was also stronger when monkeys chose the delayed option and social options, which were not preferred.

Figure 3. Firing rates of a single CGp neuron are modulated by choice but do not signal value independent of decision context. Plots are aligned to end of choice saccade (dotted line). (A) Risk context. PSTH shows the average response of the neuron when the monkey chose the risky option (red) or the safe option (blue). (B) Delay context. PSTH separated by whether the monkey chose the LL option (red) or the SS option (blue). (C) Social valuation context. PSTH separated by whether the monkey chose the picture option (red) or the non-picture option (blue). Pre-choice modulations likely reflect block structure (see main text). Statistics are for correlation between subjective value and firing rate, within context.

Figure 4. The CGp population response increases when monkeys choose the risky, delayed, and social options – independent of preference. (A) Risk context. Population PSTH separated by whether the monkey chose the risky option (red) or the safe option (blue). On the right is proportion choice of the risky (red) and safe (blue) options. (B) Delay context. Population PSTH separated by whether the monkey chose the LL (red) or SS (blue) options. On the right is proportion choice of the LL (red) and SS (blue) options. (C) Social valuation context. Population PSTH separated by whether the monkey chose the picture option (red) or non-picture option (blue). On the right is proportion choice of the picture (red) and non-picture (blue) options.

We next quantified these effects in our population of studied neurons. During the post-reward epoch (see Materials and Methods), the population as a whole showed higher firing rates during choice of a risky option than during choice of a safe option [t(70) = 2.39, p = 0.02]. Overall, 19 of the 71 (27%) recorded neurons were significantly modulated by risky versus safe choice; 16 of these showed higher activity when monkeys chose the risky option (see Table 1). Of the 19 neurons significantly modulated during the post-reward phase, 11 were also modulated by upcoming choice of the risky option prior to the saccade (10 with higher firing rate for risky option). Seven were modulated during the peri-saccadic period, six of those with higher firing for the risky option. Because both monkeys preferred the risky option, this positive relationship between firing rate and risk replicates previous behavioral and neuronal results (McCoy and Platt, 2005) and is consistent with the hypothesis that CGp firing rates encode the subjective value of a chosen option.

Table 1. Neurons modulated by choice of the outlier option over the standard option within each task context.

Quantification of data from the other two contexts suggests otherwise. In the delay context, the population showed a higher firing rate when monkeys chose the delayed option than when they chose the immediate option, although this difference was not significant [t(70) = 0.61, p > 0.05]. Nonetheless, the activity of a substantial minority of neurons (17/71; 24%) was significantly modulated by choices of the larger, later (LL) reward versus the smaller, sooner (SS) reward. Although monkeys generally preferred the SS option to the LL, roughly half (nine) of these neurons showed higher activity for choices of the delayed option. Eight of the 17 neurons significant during the post-reward epoch were also significantly modulated during the pre-choice epoch, five of them showing higher firing rates prior to choice of the LL option. Seven of the 17 neurons were significantly modulated during the peri-saccadic epoch, three of them showing higher firing rates during choice of the LL option. Since firing rates were also generally higher when monkeys chose the risky option, this pattern of results is inconsistent with the hypothesis that CGp encodes subjective value.

Finally, the population showed a significantly higher firing rate when monkeys chose the social option over the non-social one [t(70) = 2.8, p < 0.008], even though the non-social option was strongly preferred. Overall, 14 neurons (20%) were modulated by the choice of the social option over the non-social option. Of these, 11 fired at higher rates when monkeys chose the picture. Five of the 14 were significantly modulated prior to saccade (three with higher firing rates for upcoming choice of the picture option). Six of the 14 were significantly modulated in the peri-saccadic epoch (five with higher firing rates during choice of the picture option). Again, these findings are inconsistent with the idea that CGp signals the subjective value of a chosen option independent of context. Figure 3 demonstrates that all of these effects can be observed in the activity of a single neuron: this example cell shows higher firing rates during choice of the outlier options relative to the standard, despite contrasting preferences in the three conditions.

CGp Neurons Do Not Encode Subjective Value Independent of Decision Context

We next quantified the relationship between firing rates of CGp neurons and the subjective value of the chosen option. Subjective value signals in the brain can be identified by correlations between neuronal activity and the preference functions that serve as the basis for estimating value (cf. Montague and Berns, 2002; O’Doherty, 2003; Padoa-Schioppa and Assad, 2006).

We used a measure of subjective value based on revealed preferences that allowed us to assign a value estimate to each option across all decision contexts. Subjective value was estimated using the frequency of choosing the risky, delayed, and social options (outlier options) in each of the nine possible conditions in each session. For each context (risk, delay, social), we fit a line to the day’s preference points – one for each level of the non-standard option (low, medium, high risk; short, medium, long delay; small, medium, large juice). In a previous study (McCoy et al., 2003), we gave monkeys (one of whom is also used in this study) choices between different amounts of juice to determine the relationship between reward size and choice frequency. We then used that data to convert choice frequencies to equivalent juice values to model subjective value in the current study (cf. Hayden et al., 2007). For example, a 75% preference for the risky option over the safe one would be equivalent to the frequency with which monkeys choose 220 μL over 200 μL of juice rewarded deterministically. We examined the relationship between our estimate of subjective value and firing rate following delivery of the reward for each neuron, within each context. Figure 3 shows these relationships for a single neuron. In this example cell, firing rate was positively correlated with subjective value in the risk context, but was negatively correlated with subjective value in the social context.
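
A minimal sketch of this conversion is given here under strong, explicitly hypothetical assumptions. The paper does not report the calibration from the earlier experiment, so the mapping below simply pins indifference (50% choice) at the 200 μL standard and takes the slope implied by the worked example in the text (75% choice corresponding to 220 μL versus 200 μL). The anchor frequency of 0.5 for the added hypothetical point and all names are likewise assumptions.

```python
STANDARD_UL = 200.0
# Hypothetical slope: (220 - 200) uL / (0.75 - 0.50) = 80 uL per unit of
# choice frequency, derived only from the single example in the text.
UL_PER_CHOICE_FREQ = 80.0


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def subjective_values(levels, choice_freqs):
    """Juice-equivalent value (uL) of the outlier option at each level.

    A line is first fit to the day's choice frequencies to reduce noise,
    then the fitted frequencies are mapped to juice amounts.
    """
    slope, intercept = fit_line(levels, choice_freqs)
    fitted = [slope * x + intercept for x in levels]
    return [STANDARD_UL + UL_PER_CHOICE_FREQ * (p - 0.5) for p in fitted]


# Toy risk-context session: CV levels plus an assumed anchor at CV = 0.
cvs = [0.0, 0.135, 0.265, 0.4]
freqs = [0.5, 0.56, 0.58, 0.66]  # illustrative choice frequencies, not data
print([round(v, 1) for v in subjective_values(cvs, freqs)])
```

Because the mapping is monotonic, the sign of any correlation between firing rate and subjective value does not depend on the particular slope chosen, only on the ordering of preferences.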

Overall, during the post-reward epoch the population of CGp neurons was biased toward a positive relationship between firing rates and subjective value in the risk context [M = 0.066, t(70) = 3.5, p < 0.001, Figure 5A], biased toward a negative relationship between firing rates and subjective value in the social context [M = −0.064, t(70) = −3.67, p < 0.001, Figure 5C], and trended toward a negative relationship between firing rates and subjective value in the delay context [t(70) = −1.64, p = 0.10, Figure 5B]. This sign inversion contradicts the hypothesis of subjective value encoding. We also examined the relationship between firing rates and subjective value across all three decision contexts by incorporating all types of trials into our model. When all trials from all contexts and levels were included, there was no relationship between subjective value and firing rate across the population [t(70) = −1.10, p > 0.2, Figure 5D]. We next examined whether these correlation coefficient distributions differed not just from zero, but also from each other. As expected, the risk and delay correlation coefficient distributions were significantly different from each other [t(70) = 3.72, p = 0.0006], as were the risk and social correlation coefficient distributions [t(70) = 5.13, p < 0.00001]; however, the social and delay correlation coefficient distributions were not significantly different from each other [t(70) = 1.46, p = 0.15]. If these neurons were encoding subjective value, we would have expected little difference across conditions.
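
The population analysis can be sketched in two steps: a correlation coefficient per neuron, then a t-test across neurons. The use of Pearson correlation and the paired form of the between-context comparison are assumptions (the latter is suggested by the 70 degrees of freedom for 71 neurons); the function names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def population_t(values):
    """One-sample t statistic (against zero) and degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean / (sd / math.sqrt(n)), n - 1


def context_bias(neurons):
    """neurons: list of (subjective_values, firing_rates) pairs, one per cell.

    Returns the per-cell correlation coefficients and the population test
    of whether their mean differs from zero.
    """
    rs = [pearson_r(sv, fr) for sv, fr in neurons]
    return rs, population_t(rs)


def compare_contexts(rs_a, rs_b):
    """Paired comparison of per-cell coefficients from two contexts."""
    return population_t([a - b for a, b in zip(rs_a, rs_b)])
```

With real data, `context_bias` would be called once per context and `compare_contexts` on each pair of the resulting coefficient lists.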

Given that CGp neurons show a weak bias for contralateral choices, we repeated these analyses using the original model on trials that included only contraversive saccades. Results were qualitatively similar to those reported for all saccades, but with significant (negative) encoding in the delay context: Risk, t(60) = 2.13, p = 0.04; Delay, t(61) = −2.04, p = 0.046; Social, t(61) = −2.5, p = 0.01.

Figure 5. The CGp population does not encode value independent of context. Histogram of correlation coefficients for subjective value (see Materials and Methods) for the: (A) Risk context. (B) Delay context. (C) Social valuation context. (D) All contexts combined.

To guard against the possibility that the particular model of value we chose biased our results against the subjective value hypothesis, we also estimated subjective value using an alternative approach. We examined whether firing rates matched daily choice frequencies, without any additional transformations. Under this model, both the standard and non-standard options could attain different relative values across different contexts and levels, simply based on different preference levels. Results were highly similar to those of the original model. In the post-reward epoch, we observed a significant positive correlation between subjective value and firing rate in the risk context [t(70) = 2.86, p = 0.006], a significant negative correlation in the social context [t(70) = −3.37, p = 0.001], and a non-significant negative correlation in the delay context [t(70) = −0.83, p = 0.40].

Thus, although post-reward firing rates varied with which choice the animal made, they were not correlated with subjective value in a consistent fashion across decision contexts.

Given the heterogeneity in response direction amongst CGp neurons, we were concerned that different subsets of neurons may have been activated by one task over another, thus muddling the population results described above. We asked whether the divergent relationships between firing rates and subjective value observed across contexts were the result of separate neuronal populations contributing exclusively to one of the three types of decisions. When we divided cells into positive or negative correlations with subjective value (without regard to significance) in each of the three contexts, we observed the largest number of cells with positive modulation in the risk context, negative modulation in the delay context, and negative modulation in the social context (18/71 cells, see Table 2). Furthermore, cells that were significantly modulated in one context were not less likely to be modulated in either of the other contexts than those cells that were non-significant (independent-samples t-tests, all p > 0.1). For example, neurons that showed an effect of subjective value in the risk context were not less likely to signal value in the delay context than cells that did not signal value in the risk context. This was also true for choice effects (e.g., risky versus safe). Indeed, neurons with higher firing rates for delayed choice over standard (without regard to significance) fired at higher rates for picture choices and risky choices versus the standard option (p < 0.05 in both cases). Likewise, neurons with higher firing rates following risky choices also had higher firing rates following picture choices, relative to the standard (p = 0.03), and vice-versa (p < 0.02). Collectively, these results suggest that there are not special populations of neurons that only respond to decisions involving risk, delay, or social information in CGp. Rather, the relationship between firing rates and value, as estimated from revealed preferences, differs depending on context.
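
The sign-based classification of cells can be sketched as a simple tally over per-cell correlation coefficients in the three contexts. The function name is illustrative, and treating a coefficient of exactly zero as negative is an arbitrary choice made for the sketch.

```python
from collections import Counter


def sign_patterns(r_risk, r_delay, r_social):
    """Count cells by the sign of their value correlation in each context.

    Each argument is a list of per-cell correlation coefficients, in the
    same cell order. Keys are (risk, delay, social) sign triples.
    """
    def sign(r):
        return "+" if r > 0 else "-"

    return Counter(
        (sign(a), sign(b), sign(c))
        for a, b, c in zip(r_risk, r_delay, r_social)
    )


# Toy coefficients for three cells, not data from the study.
counts = sign_patterns([0.1, 0.2, -0.1], [-0.1, -0.2, 0.3], [-0.3, -0.1, 0.2])
print(counts.most_common(1))
```

In the reported data the most common of the eight possible patterns was positive in risk and negative in both delay and social contexts (18/71 cells).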

Table 2. Number of neurons with firing rates modulated positively and negatively by subjective value, according to decision context.

Firing Rates of CGp Neurons Vary with Decision Salience

The observed pattern of results – higher activity for choice of a risky, delayed, or social option compared to the standard (certain, immediate, non-social) option – suggests the hypothesis that CGp neurons signal the deviation of the chosen option from an anchor (in this case, the standard option), that is, decision salience, rather than the subjective value of the chosen option. If this were the case, we would expect overall higher firing rates for choices of the outlier option than for the standard, regardless of decision context. Combining data across all contexts, outlier choices did yield significantly higher firing rates during the post-reward epoch than standard choices, as expected based on the context-specific results presented above [t(70) = 2.2, p = 0.03]. Overall, 24 neurons showed significant differences in firing rate after choosing the outlier versus standard options, collapsing across all contexts and levels. Out of these 24 significant cells, 21 showed higher firing rates following choice of the outlier option than following choice of the standard option. This argues strongly for a value-independent decision salience signal.
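The population-level comparison of outlier and standard choices can be illustrated with a minimal sketch, assuming a paired-samples t-test in which each cell contributes one mean post-reward firing rate per choice type. The function name `paired_t` and the firing rates below are hypothetical and are not taken from the recorded data.

```python
import math
import statistics

def paired_t(outlier_rates, standard_rates):
    """Return (t, df) for a paired-samples t-test across cells."""
    if len(outlier_rates) != len(standard_rates):
        raise ValueError("need one outlier and one standard mean per cell")
    # Within-cell difference: outlier choice minus standard choice.
    diffs = [o - s for o, s in zip(outlier_rates, standard_rates)]
    n = len(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return statistics.fmean(diffs) / sem, n - 1

# Hypothetical mean post-reward firing rates (spikes/s) for five cells.
outlier = [12.0, 9.5, 15.2, 7.8, 11.1]
standard = [11.0, 9.0, 14.0, 8.0, 10.0]
t_stat, df = paired_t(outlier, standard)
print(f"t({df}) = {t_stat:.2f}")
```

With 71 cells, the same calculation has 70 degrees of freedom, matching the form of the statistic reported above. A positive t indicates higher firing rates for outlier than for standard choices.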

Furthermore, if CGp neurons encode decision salience this would predict higher firing rates for riskier, later, and smaller juice outlier options, as they are progressively different from the standard. Indeed, neurons responded differently to the various outlier options. We combined data across all contexts and regressed neuronal responses for each cell against outlier level (only outlier choices). We found that regression coefficients were significantly skewed in the positive direction, meaning higher firing rates for more salient options [t(70) = 3.8, p < 0.001, Figure 6A]. Twenty out of 71 cells were significantly modulated by outlier salience (14 in the positive direction). Once again, although these task contexts are quite different, examining them all along the dimension of salience proves useful. This effect was not present in choices of the standard option, either for the post-reward or pre-choice epoch, indicating that this signal does not reflect overall environmental salience. Because the outlier option was also available on these trials, this lack of an effect also argues against processing of available options, and instead ties these signals more closely to choice.
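The two-stage analysis described above (a regression per cell, followed by a test on the distribution of coefficients) might be sketched as follows. This assumes an ordinary least-squares slope for each cell and a one-sample t-test of the slopes against zero; the helper names, the numeric coding of outlier level, and all data values are hypothetical.

```python
import math
import statistics

def slope(levels, rates):
    """Ordinary least-squares slope of firing rate on outlier level."""
    mean_x = statistics.fmean(levels)
    mean_y = statistics.fmean(rates)
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, rates))
    return sxy / sxx

def one_sample_t(values):
    """Return (t, df) testing whether the mean of values differs from zero."""
    n = len(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return statistics.fmean(values) / sem, n - 1

# Hypothetical data: outlier level and firing rate (spikes/s) per trial,
# restricted to trials on which the outlier option was chosen.
cells = {
    "cell_a": ([1, 2, 3, 1, 2, 3], [8.0, 9.0, 10.0, 8.5, 9.5, 10.5]),
    "cell_b": ([1, 2, 3, 1, 2, 3], [5.0, 5.5, 7.0, 5.2, 6.0, 6.8]),
    "cell_c": ([1, 2, 3, 1, 2, 3], [12.0, 11.5, 12.5, 12.2, 12.0, 13.0]),
}
slopes = {name: slope(x, y) for name, (x, y) in cells.items()}
t_stat, df = one_sample_t(list(slopes.values()))
print(slopes, f"t({df}) = {t_stat:.2f}")
```

A positive slope means a cell fired more for options that deviated further from the standard. A positive t across cells corresponds to a coefficient distribution shifted in the positive direction.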

FIGURE 6

Figure 6. Firing rates of CGp neurons track decision salience. (A) Histogram of regression coefficients for firing rates as a function of choice salience for all neurons in the population. (B) Average population response was significantly higher for choosing high risk (versus low risk) and low juice (versus high juice) options. Higher firing rates were associated with options more different from the standard.

We examined the least and most deviant outlier options from each context in order to more fully quantify this effect (Figure 6B). As reported previously (McCoy and Platt, 2005), CGp neurons fired at higher rates when monkeys chose the risky option with the highest CV compared with when monkeys chose the risky option with the lowest CV [t(70) = 3.33, p = 0.001, paired-samples t-test]. In the social context, neurons fired at higher rates during the post-reward epoch when monkeys chose the picture paired with the smallest amount of juice (most different from standard) compared with when monkeys chose the picture paired with the largest amount of juice [t(68) = −2.41, p = 0.019]. In the case of delayed rewards, firing rates were higher when monkeys chose the 3-s delayed option than when they chose the 1-s delayed option, although this difference was not significant [t(69) = −1.69, p = 0.096]. However, as clearly evident in the population response (Figure 4), firing rates during LL choices were higher during the delay period than prior to the choice, meaning neurons increased their responses during the delay, in anticipation of the reward [t(70) = 2.5, p = 0.015]. This effect disappears following reward delivery, when firing rates return to pre-choice baselines (p > 0.9). Given our other results, this suggests that the delay period itself was the more salient outlying feature in this context. Thus, firing rates were consistently higher when the outlier option deviated more from the standard, strongly suggesting that CGp encodes decision salience rather than the subjective value of a chosen option.

We considered the possibility that these results could be explained by a relatively simple arousal signal. We assessed whether there was a consistent relationship between firing rate and reaction time in this task. Previous studies have shown that, in certain contexts, CGp activity increases with slower reaction times (Hayden et al., 2009), as tonic increases in firing rate in CGp are associated with task disengagement (Raichle et al., 2001). However, here, we did not observe any consistent relationship across cells between firing rate and reaction time (mean correlation coefficient = 0.011, p = 0.4), even when only examining significant cells (p = 0.3). Moreover, the bias toward higher firing rates during choice of outlier options relative to the standard option was maintained while controlling for reaction times [t(70) = 3.36, p = 0.001].
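One simple way to control for reaction time, shown here only as an illustrative sketch, is to remove the linear effect of reaction time from each cell's trial-by-trial firing rates and then compare the residuals for outlier and standard choices. The text does not specify the procedure that was used, so this residualization approach, the function names, and the trial data are all assumptions.

```python
import statistics

def residualize(rates, rts):
    """Remove the linear effect of reaction time from firing rates."""
    mean_rt = statistics.fmean(rts)
    mean_rate = statistics.fmean(rates)
    sxx = sum((rt - mean_rt) ** 2 for rt in rts)
    sxy = sum((rt - mean_rt) * (r - mean_rate) for rt, r in zip(rts, rates))
    beta = sxy / sxx  # change in firing rate per second of reaction time
    return [r - mean_rate - beta * (rt - mean_rt) for r, rt in zip(rates, rts)]

def outlier_effect(rates, rts, chose_outlier):
    """Mean residual firing rate for outlier minus standard choices."""
    res = residualize(rates, rts)
    out = [r for r, c in zip(res, chose_outlier) if c]
    std = [r for r, c in zip(res, chose_outlier) if not c]
    return statistics.fmean(out) - statistics.fmean(std)

# Hypothetical trials for one cell: firing rate (spikes/s), reaction time (s),
# and whether the outlier option was chosen.
rates = [12.0, 12.5, 13.0, 10.0, 10.5, 11.0]
rts = [0.20, 0.25, 0.30, 0.20, 0.25, 0.30]
chose_outlier = [True, True, True, False, False, False]
effect = outlier_effect(rates, rts, chose_outlier)
print(f"outlier - standard after removing RT: {effect:.2f} spikes/s")
```

The per-cell effects obtained this way could then be tested against zero across the population. If reaction time and choice are correlated, a multiple regression with both predictors would be a more appropriate control than residualization.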

Discussion

Our data show that CGp neurons do not signal behavioral preferences consistently across different decision contexts. The population of CGp neurons responded with higher firing rates when monkeys chose the risky option, which was preferred, and the delayed and social options, which were non-preferred. Furthermore, firing rates increased as delay and risk increased, and as amount of juice associated with the social option decreased. These data demonstrate that CGp does not track subjective value in a manner that is independent of the type of decision being made. Instead, CGp neurons appear to encode variables that sometimes covary with preference.

One such variable is what we are calling decision salience: neurons tended to fire at higher rates when the chosen option was more aberrant from the standard option available on every trial. This type of outlier encoding may be useful for guiding learning and memory (Pearce and Hall, 1980), a function previously linked to CGp (Cabeza and Nyberg, 2000; Maddock et al., 2003; McCoy et al., 2003).

We have operationally defined decision salience purely in the context of choosing outcomes that deviate from a standard option. Unfortunately, the task we used was not designed to examine learning, but rather to examine preference signals across distinct decision contexts, although these signals would certainly be useful for learning about unusual events. That being said, we do not think this signal fits with simple cue processing for associative learning. Instead, these signals seem to track the salience of the chosen option. One way to examine this is to compare trials on which the monkey chose the standard option even though the outlier option was also available. If this signal reflects broader option or cue processing, then the neural signal should track salience regardless of option chosen. Instead, firing rates when the standard option was chosen do not vary based on the salience of the outlier option. Thus, although we believe these signals to be useful for learning, at this point we remain agnostic as to the details of this process.

Our lab previously showed that firing rates of neurons within CGp predict preferences for chosen options in a risky choice task similar to the one used here (McCoy and Platt, 2005). The present study replicates those previous results. By contrast, our finding that the CGp population tends to respond more strongly when monkeys choose the delayed but non-preferred option conflicts with recent fMRI work, which found that hemodynamic responses in human CGp vary with subjective value in a delay discounting task (Kable and Glimcher, 2007; Levy et al., 2011). The discrepancy between the present study and these earlier findings may reflect species differences in neuronal processing, differences in task design (i.e., the use of primary versus monetary rewards), or discontinuities between the BOLD signal and single unit firing (Logothetis et al., 2001). Other studies have suggested the BOLD signal in CGp is stronger during decisions concerning delay than risk (Weber and Huettel, 2008). Furthermore, Luhmann et al. (2008) reported that activation of human CGp increased with choice of a delayed reward – an effect we confirmed at the level of the single neuron. They hypothesized that such signals may be linked to self-projected time rather than decision processing. Firing rate modulations observed here, however, suggest that CGp activation may not indicate self-projection specifically, but may instead reflect neural processing involved in tracking salience.

Overall, our findings suggest that CGp signals decision salience or even uncertainty more broadly (Critchley et al., 2001; Behrens et al., 2007). The consistently higher firing rates we observed for the “outlier” options (risky, delayed, social) may signal deviation from standard or predicted outcomes, a variable important in attentional models of learning (Mackintosh, 1975; Pearce and Hall, 1980). Such a signal would indicate when and how rapidly learning or behavioral adjustment would occur, but would not provide information about precisely what should be learned. Consistent with this idea, a previous study found that firing rates of CGp neurons were higher when monkeys explored their options than when they pursued a single source of reward (Pearson et al., 2009). With prominent connections to the medial temporal lobes, CGp is well-positioned anatomically to provide an instructional signal to engage learning.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

This work was supported by a NIDA pre-doctoral fellowship 028133 (Sarah R. Heilbronner), NIDA K99 027718-01 and by a fellowship from the Tourette Syndrome Foundation (Benjamin Y. Hayden), NIH grant R01EY013496 (Michael L. Platt), and the Duke Institute for Brain Sciences (Michael L. Platt).

References

Ainslie, G., and Haslam, N. (1992). “Hyperbolic discounting,” in Choice Over Time, eds G. Loewenstein and J. Elster (New York, NY: Russell Sage Foundation), 177–209.

Behrens, T. E., Woolrich, M. W., Walton, M. E., and Rushworth, M. F. (2007). Learning the value of information in an uncertain world. Nat. Neurosci. 10, 1214–1221.