Original Research ARTICLE
Discounting in pigeons when the choice is between two delayed rewards: implications for species comparisons
- Department of Psychology, Washington University, St. Louis, MO, USA
Studies of delay discounting typically have involved choices between smaller, immediate outcomes and larger, delayed outcomes. In a study of delay discounting in humans, Green et al. (2005) added a period of time prior to both outcomes, creating a delay common to both. They found that the subjective value of the more delayed reward was well described by a hyperboloid discounting function and that the degree to which that outcome was discounted decreased as the common delay increased. In two experiments, we examined the effect of adding a common delay on the discounting of food rewards in pigeons. In Experiment 1, an adjusting-amount procedure was used to establish discounting functions when the common delay was 0, 3, 5, and 10 s, and different stimuli signaled time to the smaller, sooner and larger, later rewards. In contrast to humans, the pigeons showed increases in the degree of discounting when a common delay was added. In Experiment 2, the delay common to both rewards and the delay unique to the larger, later reward were each specifically signaled. With this procedure, the degree of discounting decreased as the common delay increased, a result consistent with that obtained with humans (Green et al., 2005). These findings reveal fundamental similarities between pigeons’ and humans’ choice behavior, and provide strong interspecies support for the hypothesis that choice between delayed outcomes is based on comparison of their hyperbolically discounted present subjective values.
People and other animals often have to choose between an immediate reward and another, larger reward of the same kind that is available only after a delay. When the delay to the later reward is long, a small amount of immediate reward may be chosen over the delayed reward, but if the delay is brief, then the choice may be to wait for the larger reward. This difference in preference is assumed to reflect the fact that the value of a delayed reward is discounted, with longer delays leading to greater discounting, and it is observed in both humans (Green et al., 1994; Kirby, 1997) and non-human species (rat and pigeon, Richards et al., 1997; Mazur, 2000; Green et al., 2004; monkey, Freeman et al., 2009). The decrease in the value of a reward as the delay to its receipt increases is well described by a simple hyperbolic function (Mazur, 1987):
$$V = \frac{A}{1 + kD} \tag{1}$$
where V is the subjective value of the delayed reward, A is the amount of the delayed reward, and D is its delay. The parameter k governs the degree of discounting, with larger values indicating steeper discounting.
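As a rough illustration of the simple hyperbolic function of Mazur (1987), V = A/(1 + kD), the sketch below computes the subjective value of a 30-pellet reward at several delays. The function name `hyperbolic_value` and the value k = 0.5 are assumptions chosen for illustration; they are not estimates from the present study.

```python
def hyperbolic_value(amount, delay, k):
    """Subjective value of a delayed reward: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


# Illustrative parameter only (an assumption, not a fitted value).
K_EXAMPLE = 0.5

for delay in (0, 2, 5, 10, 25):
    value = hyperbolic_value(30, delay, K_EXAMPLE)
    print(f"delay = {delay:>2} s -> subjective value = {value:.2f} pellets")
```

Larger values of k make the printed values fall off more quickly with delay, which is what is meant by steeper discounting.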
Often, of course, the choice is not between an immediate and a delayed reward, but rather between two delayed rewards. If the sooner reward is also the larger one, choice is straightforward, but when the sooner reward is smaller, then the decision is more complicated. For example, consider the situation depicted in Figure 1, in which the heights of the bars represent the actual (undiscounted) value of two rewards and the curved lines depict their subjective (i.e., discounted) values as predicted by Eq. 1. The likelihood of choosing a particular alternative at any point in time depends on the relative subjective values of the two rewards. As may be seen, choice of the larger reward is more likely if the decision is made at an earlier point in time (e.g., at T1), whereas choice of the smaller reward is more likely if the decision is made later (e.g., at T2). Indeed, both humans and non-human animals show the preference reversals predicted by Figure 1 (e.g., Ainslie and Herrnstein, 1981; Green et al., 1981, 1994).
Figure 1. Hyperboloid discounting of smaller, sooner and larger, later rewards. The x-axis represents the time until a reward, and the y-axis represents its subjective value. The portion of the delay that is common to both the smaller, sooner and larger, later rewards is labeled A, and the portion of the delay to the larger, later reward that is unique is labeled B.
The present study provides a systematic examination of choice between delayed rewards in pigeons and a test of the mechanism that is hypothesized to underlie such choices. In two experiments, the delays to the smaller, sooner and larger, later rewards were varied, as was the amount of the smaller, sooner reward, while the amount of the larger, later reward was held constant. The amount of time corresponding to the delay from the choice point until the smaller, sooner reward (designated A in Figure 1) is common to both alternatives, whereas the delay from the choice point to the larger, later reward consists of the common delay plus an additional delay (designated B in Figure 1) that is unique to the larger, later reward.
The framework depicted in Figure 1, which implicitly assumes that choices are made based on comparison of the present subjective values of hyperbolically discounted outcomes, predicts that the subjective value of the larger, later reward should be discounted less steeply when the common period is longer, and indeed, this result was observed in humans by Green et al. (2005). To see why this is so, consider that according to Eq. 1, the subjective value of the smaller, sooner reward (VS) is given by
$$V_S = \frac{A_S}{1 + kD_c} \tag{2a}$$
and the subjective value of the larger, later reward (VL) is given by
$$V_L = \frac{A_L}{1 + k(D_c + D_u)} \tag{2b}$$
where AS and AL are the amounts of the sooner and later rewards, respectively, and Dc and Du are the common and unique portions of the delay to the later reward (see Figure 1).
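It follows from the preceding two equations that the amount of the sooner reward that will be equal in subjective value to the later reward is given by
$$A_S = \frac{A_L (1 + kD_c)}{1 + k(D_c + D_u)}$$
Expanding the denominator, dividing both the numerator and denominator by (1 + k Dc), and rearranging yields
$$A_S = \frac{A_L}{1 + \left[\dfrac{k}{1 + kD_c}\right] D_u}$$
which may be rewritten as
$$A_S = \frac{A_L}{1 + k' D_u} \tag{3}$$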
where k′ = k/(1 + k Dc). It may be seen that as the duration of the common delay, Dc, increases, the value of the fraction k/(1 + k Dc) decreases. Thus, Eq. 1 predicts that when the choice is between a smaller, sooner reward and a larger, later reward, discounting will be hyperboloid in form, and as the common delay increases, the parameter k′ will decrease, leading to shallower discounting. Note that whereas k governs the degree of discounting when subjective value is measured in terms of the amount of immediate reward, k′ governs the rate of discounting when subjective value is measured in terms of the amount of delayed reward (available at the end of the common delay).
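The algebra can be checked numerically. The sketch below computes the indifference amount in two ways: directly, by equating the hyperbolically discounted values of the sooner and later rewards, and through the rewritten form with k′ = k/(1 + k Dc). The function names and the value k = 0.5 are assumptions for illustration only.

```python
def hyperbolic_value(amount, delay, k):
    """Hyperbolically discounted value: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def k_prime(k, common_delay):
    """Effective discounting parameter: k' = k / (1 + k * Dc)."""
    return k / (1.0 + k * common_delay)


def indifference_amount(later_amount, common_delay, unique_delay, k):
    """Sooner amount whose discounted value equals that of the later reward."""
    v_later = hyperbolic_value(later_amount, common_delay + unique_delay, k)
    # Solve A_S / (1 + k * Dc) = V_L for A_S.
    return v_later * (1.0 + k * common_delay)


K_EXAMPLE = 0.5  # illustrative assumption, not a fitted value
UNIQUE_DELAY = 10

for dc in (0, 5, 10, 20):
    direct = indifference_amount(30, dc, UNIQUE_DELAY, K_EXAMPLE)
    via_k_prime = 30 / (1.0 + k_prime(K_EXAMPLE, dc) * UNIQUE_DELAY)
    assert abs(direct - via_k_prime) < 1e-9  # the two forms agree
    print(f"Dc = {dc:>2} s: k' = {k_prime(K_EXAMPLE, dc):.4f}, A_S = {direct:.2f}")
```

As the common delay grows, k′ shrinks and the indifference amount rises, which is the predicted shallower discounting.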
Because the same equation (Eq. 1) fits delay discounting data from both humans and pigeons when choice is between an immediate and a delayed reward, the question arises as to whether the extension of this equation to choice between two delayed rewards (represented by Eqs 2a, 2b, and 3) describes pigeon as well as human data. In two experiments, pigeons chose between smaller amounts of food available after a shorter delay and larger amounts of food available after a longer delay. An adjusting-amount procedure was used to estimate the amount of the smaller, sooner reward that was approximately equal in subjective value to the larger, later reward. The two experiments differed in how the common and the unique portions of the delays were signaled. In Experiment 1, pigeons’ choices between delayed rewards appeared to be quite different from those of humans, suggesting that quite different decision processes were involved. In Experiment 2, however, when signals were provided to facilitate discrimination of the common and unique portions of the delay to the later reward, the pigeons’ choices were similar to those of humans in analogous situations.
Five naïve, female White Carneau pigeons (numbered P15–P19) were individually housed in an animal colony room with a 12:12-h light/dark cycle. The pigeons had water and health grit continuously available in their home cages, and they were provided supplemental post-session food (Pigeon Checkers) to maintain their weights between 80 and 85% of their individually determined free-feeding body weights. The experiments were performed in accordance with relevant institutional and national guidelines and regulations, and were approved by the Animal Studies Committee of Washington University.
Two experimental chambers (Coulbourn Instruments, Inc.) were used, each measuring 28 cm long, 23 cm wide, and 30.5 cm high. The experimental chambers were enclosed within sound- and light-attenuating chambers equipped with ventilation fans that also provided masking noise during experimental sessions. A MED Associates interface and MED-PC™ software running on a personal computer located in an adjacent room were used to present stimuli and record responses.
Three response keys, spaced 8 cm apart, were mounted on the front panel of each experimental chamber. The right- and left-most keys (which could be transilluminated green and red, respectively) were 25 cm above the grid floor and 3.5 cm from the side walls of the chamber. The center response key (which could be transilluminated yellow) was 21 cm above the grid floor and mounted in the center of the front panel. A triple-cue light was mounted 6 cm above the center key, and could be illuminated red, yellow, and green (from left to right). A 7-W house light, mounted on the ceiling of the chamber, provided ambient illumination. A food magazine was located below the right key and another magazine was located below the left key; in both cases, the bottom of the magazine was 4 cm above the grid floor. Food pellets (20-mg pellets; TestDiet, Formula 5TUZ) were dispensed at a rate of one every 0.3 s. There was a 7-W light located inside each magazine to provide illumination during reinforcement and an infrared photo-detector to detect when a pigeon’s head entered and left the magazine.
Pigeons were trained to peck the response keys, following which they were studied daily using a discrete-trials procedure in which each block of trials consisted of two forced-choice trials followed by two free-choice trials. Of the two forced-choice trials, one was a smaller–sooner-reward trial and the other was a larger–later-reward trial; which type of trial was first was varied randomly across blocks. Experimental sessions were conducted daily and ended either after 40 blocks of trials or after 75 min had elapsed, whichever came first.
The beginning of all trials, both free- and forced-choice, was signaled by the illumination of the center yellow response key and the yellow cue light. On free-choice trials, a single response darkened both the center key and the yellow cue light and illuminated the right (green) and left (red) side keys as well as the green and red cue lights. The red side key was associated with a larger, later reward (30 food pellets), and the green side key was associated with a smaller, sooner reward (an adjusting number of pellets). A single response on either side key darkened both side keys. If the left key was pecked, the green cue light was extinguished and the red cue light remained illuminated; if the right key was pecked, the red cue light was extinguished and the green cue light remained illuminated. On forced-choice trials, only one side key and its associated cue light were illuminated; a single response darkened the key but not the cue light.
Pigeons experienced a delay to reinforcement on every trial. As depicted in Figure 1, the time until the smaller, sooner reward was the common delay (corresponding to A in the figure), and the time until the larger, later reward consisted of two intervals: the common delay plus a unique delay specific to the later reward (corresponding to B in the figure). On smaller–sooner-reward trials, the green cue light and the house light remained illuminated until the common delay elapsed, at which point the right magazine light was illuminated and an adjusting number of pellets was delivered. On larger–later-reward trials, the red cue light and the house light remained illuminated through the common delay and until the unique delay elapsed, at which point the left magazine light was illuminated and 30 pellets were delivered. The magazine light remained illuminated until 3 s had elapsed since the pigeon removed its head from the magazine, after which the magazine light was extinguished and the house light was illuminated. A new trial began 70 s after the pigeon had made its choice (i.e., pecked a side key) on the preceding trial.
The (common) delay to the smaller, sooner reward was either 0, 3, 5, or 10 s, depending on the condition (plus an additional 0.5 s to allow the pigeon time to get its head down to the magazine; Mazur, 2000). Within each common-delay condition, there were four unique-delay conditions (2, 5, 10, and 25 s). For example, in the 25-s unique-delay condition of the 3-s common-delay condition, the pigeon chose between an adjusting number of pellets that could be received after 3 s and 30 pellets that could be received after 28 s (again, plus an additional 0.5 s). For each pigeon, a unique-delay condition was terminated once the subjective value of the larger, later reward was determined. Once all the unique-delay conditions within a common-delay condition had been completed, a new common-delay condition began. The order of common-delay conditions and the order of unique-delay conditions at each common delay (16 conditions in all) were varied non-systematically across pigeons.
In the first block of the first session of each condition, the amount of the smaller, sooner reward was one pellet. Within each session, the amount of the smaller, sooner reward was adjusted from one block of trials to the next in order to determine the amount of smaller, sooner reward that subjects judged equal in value to the larger (30-pellet), later reward. If a pigeon chose the smaller, sooner reward on both free-choice trials in a block, then the amount of smaller, sooner reward was decreased by one pellet for the next block of trials; if the pigeon chose the larger, later alternative on both free-choice trials in a block, then the amount of the smaller, sooner reward was increased by one pellet for the next block of trials. Otherwise, the amount of the smaller, sooner reward remained the same for the next block of trials. The amount of sooner reward in the last block of trials of a session was used as the initial amount of smaller, sooner reward in the following session of the current condition.
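The adjustment rule just described can be sketched as a small function. This is a minimal illustration, not the authors' MED-PC program: the function name, the string labels for choices, and the lower bound of zero pellets are assumptions (the text does not state a floor or ceiling on the adjusting amount).

```python
def next_sooner_amount(current, free_choices, step=1, minimum=0):
    """Return the smaller, sooner amount for the next block of trials.

    free_choices holds the outcomes of the two free-choice trials in the
    block, each either "sooner" or "later". The `minimum` floor is a
    hypothetical safeguard, not something specified in the text.
    """
    if all(choice == "sooner" for choice in free_choices):
        return max(minimum, current - step)  # sooner chosen twice: decrease
    if all(choice == "later" for choice in free_choices):
        return current + step                # later chosen twice: increase
    return current                           # split choices: no change


# Hypothetical sequence of blocks, starting at one pellet as in the text.
amount = 1
for block in [("later", "later"), ("later", "later"), ("sooner", "later"),
              ("sooner", "sooner")]:
    amount = next_sooner_amount(amount, block)
    print(block, "->", amount)
```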
Conditions were run for a minimum of 200 blocks and ended when a pigeon’s preference was judged stable, indicating that the smaller, sooner reward was equal in value to the larger, later reward. To assess stability, the last 50 blocks of trials were divided into 10 groups of five consecutive blocks each, and both the overall mean amount of smaller, sooner reward for the 50 blocks of trials and the mean for each of the ten five-block groups were determined. Preference was considered to be stable when (i) none of the means of the 10 groups deviated by more than two pellets from the overall mean, (ii) neither the first nor the last of these 10 group means contained the highest or the lowest amount, and (iii) there was no upward or downward trend in the group means.
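A hedged sketch of the stability check follows. It implements the minimum of 200 blocks and criteria (i) and (ii) only; criterion (iii), the absence of a trend, is omitted because the text does not say how trend was judged. The function name is hypothetical, and treating tied extremes in the first or last group as a violation of (ii) is an assumption.

```python
def is_stable(amounts, min_blocks=200, max_deviation=2.0):
    """Apply criteria (i) and (ii) to the per-block sooner amounts.

    amounts: the smaller, sooner amount in each block of the condition,
    in order. Criterion (iii), no trend, is NOT checked here.
    """
    if len(amounts) < min_blocks:
        return False
    last50 = amounts[-50:]
    overall_mean = sum(last50) / 50
    # Ten groups of five consecutive blocks each.
    group_means = [sum(last50[i:i + 5]) / 5 for i in range(0, 50, 5)]
    # (i) no group mean deviates by more than two pellets from the overall mean
    within_band = all(abs(m - overall_mean) <= max_deviation
                      for m in group_means)
    # (ii) neither end group holds the highest or lowest mean
    # (ties at the ends count as violations; this is an assumption)
    ends = (group_means[0], group_means[-1])
    ends_ok = max(group_means) not in ends and min(group_means) not in ends
    return within_band and ends_ok
```

In practice a trend check would have to be added for criterion (iii) before a condition could be ended on this basis.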
Results and Discussion
Figure 2 shows the amount of smaller, sooner reward equal in value to the later 30-pellet reward (i.e., the subjective value of the later reward measured in pellets available at the end of the common delay) plotted as a function of the unique delay. The curved lines represent Eq. 1 with D equal to the duration of the unique delay. Table 1 shows the estimated k parameters (higher values indicate steeper discounting) and R² values for each pigeon. In the 0-s common-delay condition, in which pigeons were choosing between an almost immediate reward and a larger, later reward, discounting was comparable to that observed in previous discounting studies with pigeons (e.g., Mazur, 2000; Green et al., 2004). When the common delay was increased, however, pigeons discounted the value of the larger, later reward much more steeply. This finding is clearly inconsistent with the predictions of Eq. 3 and opposite to what has been observed when human subjects discount delayed hypothetical monetary rewards (Green et al., 2005); for humans, the degree of discounting decreased as the common delay increased.
Figure 2. Discounting of the larger, later reward in the 0-, 3-, 5-, and 10-s common-delay conditions in Experiment 1. Symbols represent subjective values (in pellets) for the four common-delay conditions. Curves represent the best-fitting discounting functions (Eq. 1).
Table 1. The estimated k parameter and the proportion of variance in subjective values accounted for by Eq. 1 for each individual pigeon in each common-delay (CD) condition of Experiment 1.
Caution is required, of course, before concluding that this difference in results between pigeons and humans represents a true species difference in decision making. For one thing, different procedures typically are used when studying different species. In the present case, the pigeons received real, biologically important reinforcers, and experienced the delays associated with their delivery on every trial, whereas in the Green et al. (2005) study, the participants neither received the reward nor experienced the delay, but rather were asked to imagine the choices they would make if the delays and rewards were real. It is unclear, however, how these differences could lead to opposite findings like the difference between the present results and those of Green et al. (2005).
Another notable difference between the human and pigeon procedures may be in the salience of the common delay. In one condition of the Green et al. (2005) study, for example, participants were asked to choose between a smaller amount of money available in 2 years and a larger amount available in 2 years and 6 months. Thus, the durations of both the common delay (2 years) and the unique delay (6 months) were specifically indicated. Even in a condition where the choice was between a smaller amount available in 2 years and a larger amount available in 7 years, participants could easily reframe the choice in terms of a common delay of 2 years and a unique delay of 5 years. In contrast, pigeons chose between a smaller reward available after a brief delay, signaled by one stimulus (a green cue light), and a larger reward available after a longer delay, signaled by a different stimulus (a red cue light), and there was nothing to specifically signal the portion of time common to both delayed rewards.
It seemed possible that a difference in the salience of the common and unique delays between the human and pigeon experiments was responsible for the difference in the results. In Experiment 2, therefore, we changed the stimuli to more clearly signal the common and unique delays. Specifically, the same stimulus that was present during the (common) delay until the smaller, sooner reward was also present during the initial (common) portion of the delay until the larger, later reward; on trials ending in a larger, later reward, a different stimulus signaled the final (unique) portion of the delay. Whereas in Experiment 1, different stimuli signaled the shorter delay and the longer delay (SD/LD-signal procedure), in Experiment 2, different stimuli signaled the common delay and the unique delay (CD/UD-signal procedure). Other aspects of the procedure, as well as the subjects, remained unchanged from Experiment 1.
Subjects and apparatus
The subjects and apparatus were the same as in Experiment 1.
The procedure was basically the same as in Experiment 1 with the principal exception being the way in which the common and unique delays were signaled (compare the SD/LD-signal procedure used in Experiment 1 with the CD/UD-signal procedure used in Experiment 2, as shown in Figure 3). In Experiment 2, regardless of which key the pigeon chose, the house light flashed twice per second throughout the common delay and then was extinguished. If the pigeon had chosen the right (green) key associated with the smaller, sooner reward, then when the common delay ended, the green cue light flashed once for 0.5 s and an adjusting number of pellets was delivered in the right food magazine. If the pigeon had chosen the left (red) key associated with the larger, later reward, then when the common delay ended, the red cue light illuminated for the duration of the unique delay, after which 30 pellets were delivered in the left food magazine.
Figure 3. Procedures for Experiments 1 and 2. SD and LD refer to the shorter and longer delays to reinforcement; CD refers to the portion of the delay to the smaller, sooner and larger, later rewards that they have in common, and UD refers to the portion of the delay to the larger, later reward that is unique to that reward. The circles at the top represent the response keys (R = red; G = green), the rectangles represent the cue lights, and the light bulbs represent the flashing house light.
In different conditions, the duration of the common delay was either 5, 10, or 20 s (plus an additional 0.5 s to allow the pigeon time to get its head down to the magazine). Within each common-delay condition, the durations of the four unique delays were the same as in Experiment 1: 2, 5, 10, and 25 s. All pigeons completed the 5-s common-delay condition first; three pigeons then completed the 10-s common-delay condition followed by the 20-s common-delay condition; the other two pigeons completed the 20-s common-delay condition first followed by the 10-s common-delay condition. In addition, four SD/LD-signal conditions (two unique delays at a 3-s common delay and two at a 5-s common delay) like those in Experiment 1 were interpolated among the CD/UD-signal conditions. Each pigeon experienced the 16 experimental conditions just described in a unique order.
At the end of the experiment, each pigeon completed a final pair of conditions. The SD/LD-signal procedure was used in the first condition of the pair and the CD/UD-signal procedure was used in the second condition. For each pigeon, the durations of the common and unique delays used in this final pair of conditions were the same as in the last CD/UD-signal condition they experienced.
Figure 4 shows the amount of smaller, sooner reward equal in value to the later 30-pellet reward (i.e., the subjective value of the later reward measured in pellets available at the end of the common delay) plotted as a function of the unique delay; the 0-s common-delay condition from Experiment 1 is replotted for comparison purposes. Within each common-delay condition, the subjective value of the larger, later reward tended to decrease with increases in the unique delay, whereas subjective value increased as the common delay was increased across conditions.
Figure 4. Discounting of the larger, later reward in the 5-, 10-, and 20-s common-delay conditions in Experiment 2. The data for the 0-s common-delay condition are replotted from Experiment 1. Symbols represent subjective values (in pellets) for the four common-delay conditions. Curves represent the best-fitting discounting functions (Eq. 1).
The curved lines in Figure 4 represent Eq. 1 with D equal to the duration of the unique delay. Eq. 1 tended to provide a good description of the individual data from each common-delay condition (median R² = 0.86). As may be seen in Table 2, the k parameter decreased with increases in the common delay for each pigeon. This decrease in k reflects the fact that in contrast to Experiment 1, discounting in Experiment 2 became progressively shallower as the common delay increased. The difference between the results of the two experiments may be seen clearly in Figure 5, which shows the normalized areas under the observed subjective values (i.e., the area under the curve or AuC; Myerson et al., 2001) for each common-delay condition in Experiments 1 and 2. Areas were calculated based on the observed subjective values depicted in Figures 2 and 4. Note that because they are normalized, AuC values can range between 0.0 and 1.0, with higher values indicating shallower discounting. Whereas AuC increased with the duration of the common delay in Experiment 2, reflecting a systematic decrease in the degree of discounting as predicted by Eq. 3, no such decrease was observed in Experiment 1.
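For readers unfamiliar with the AuC measure, the sketch below shows one common way of computing it: delays are normalized by the longest delay, subjective values by the nominal amount, and the areas of the trapezoids between successive points are summed. Anchoring the curve at zero delay with a normalized value of 1.0 is an assumption about how the calculation was set up here, and the function name and example values are hypothetical.

```python
def area_under_curve(delays, values, amount):
    """Normalized area under the observed discounting data.

    delays: delays in ascending order; values: the subjective value at
    each delay; amount: nominal amount of the delayed reward.
    A point at zero delay with normalized value 1.0 is prepended
    (an assumption about the anchor point).
    """
    longest = delays[-1]
    xs = [0.0] + [d / longest for d in delays]
    ys = [1.0] + [v / amount for v in values]
    # Sum of trapezoid areas between successive points.
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for x1, x2, y1, y2 in zip(xs, xs[1:], ys, ys[1:]))


# Hypothetical subjective values (pellets) at the four unique delays.
shallow = area_under_curve([2, 5, 10, 25], [28, 25, 22, 18], 30)
steep = area_under_curve([2, 5, 10, 25], [15, 8, 4, 2], 30)
print(f"shallow: {shallow:.3f}, steep: {steep:.3f}")
```

Because the measure is computed from the observed points rather than from a fitted curve, it does not depend on any particular discounting model.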
Table 2. The estimated k parameter and the proportion of variance in subjective values accounted for by Eq. 1 for each individual pigeon in each common-delay (CD) condition of Experiment 2.
Figure 5. Area under the discounting curve for each common-delay condition for each pigeon in Experiments 1 and 2. Shallower discounting is indicated by higher values.
Two types of replications comparing the CD/UD procedure introduced in Experiment 2 with the SD/LD procedure of Experiment 1 were conducted in order to establish whether the shallower discounting observed in Experiment 2 at longer common delays reflected an order effect or was the consequence of the change in how the common delay was signaled. In the first type of replication, subjective values for selected unique delays from both the 3- and 5-s common-delay conditions were re-determined for each pigeon, and in all cases, the replication closely matched the original determination from Experiment 1. In the second type of replication, each pigeon completed a final pair of conditions, both with either a 10-s or a 20-s common delay, the first of which used the SD/LD signaling procedure, followed by the CD/UD procedure. The results from these two conditions, as well as those from the preceding CD/UD condition, are depicted in Figure 6. For each pigeon, the subjective values obtained using the CD/UD procedure were much higher than those obtained using the SD/LD procedure, demonstrating the powerful effect of explicitly signaling the common delay.
Figure 6. Subjective value of the larger, later reward in the final series of replications for each pigeon in Experiment 2. CD/UD refers to the signaling procedure introduced in Experiment 2, and SD/LD refers to replications using the signaling procedure originally used in Experiment 1. Note that the common and unique delays were different for each pigeon.
In order to determine whether Eq. 3, which has only a single k parameter, will suffice to describe the systematic change in the degree of discounting across all four (0-, 5-, 10-, and 20-s) common-delay conditions, this equation was fitted to the group mean subjective values from all of the common-delay conditions simultaneously. The proportion of variance from all four common-delay conditions accounted for by Eq. 3 was then compared to variance accounted for by fitting Eq. 1 (with D equal to the duration of the unique delay) to each condition separately. Notably, Eq. 3 accounted for 91% of the variance in the data, whereas a model with four discounting parameters (i.e., one for each common-delay condition) accounted for only 2% more of the same variance, a difference that was not statistically significant [F(3, 12) = 1.30]. It is important to recall that even though Eq. 3 assumes a single underlying k parameter, it predicts the observed decreases in the degree of discounting because increases in the common delay produce decreases in the value of the equation’s k′ parameter.
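The model comparison reported above is a nested-model (incremental) F-test. The sketch below shows the arithmetic in general form. The function name is hypothetical, and the count of 16 data points (four common delays by four unique delays) is inferred from the reported degrees of freedom rather than stated in the text. The residual values in the example are derived from the rounded percentages (91% and 93%) purely for illustration, so the result should not be expected to reproduce the reported F exactly.

```python
def nested_f(sse_restricted, sse_full, n_points, p_restricted, p_full):
    """F statistic for comparing a restricted model with a fuller one.

    sse_*: residual sums of squares; p_*: number of free parameters.
    Returns (F, numerator df, denominator df).
    """
    df_num = p_full - p_restricted
    df_den = n_points - p_full
    f_stat = ((sse_restricted - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den


# Residual variance as a proportion of total variance, taken from the
# ROUNDED percentages in the text (illustrative only).
f_stat, df_num, df_den = nested_f(
    sse_restricted=0.09,  # one shared k: 91% of variance accounted for
    sse_full=0.07,        # four separate k values: about 93%
    n_points=16,          # inferred: 4 common delays x 4 unique delays
    p_restricted=1,
    p_full=4,
)
print(f"F({df_num}, {df_den}) = {f_stat:.2f}")
```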
Equation 3 is a special case of the discounting model, based on the common-aspect attenuation hypothesis proposed by Green et al. (2005), which describes choice between delayed rewards in humans. According to this hypothesis, the k′ parameter in Eq. 3 is equal to k/(1 + k w Dc), where the additional parameter w reflects differential weighting of the common delay. For humans, the value of w was less than 1.0, indicating that human participants placed less weight on the duration of the common delay than on the duration of the unique delay. In order to determine whether pigeons also underweighted the common delay, we compared the fits of Eq. 3 with k′ equal to k/(1 + k w Dc) when the w parameter was fixed at 1.0 with the fit when w was free to vary. Making w a free parameter did not significantly improve the fit to group mean data [F(1, 14) < 1.0], suggesting that pigeons (on average) do not weight the common delay differently from the unique delay.
In the present experiment, in which the common delay was explicitly signaled regardless of which alternative (i.e., the smaller, sooner or the larger, later reward) was chosen, adding a common delay tended to decrease the degree to which the larger, later reward was discounted. Indeed, the degree of discounting decreased systematically as the common delay was increased for every pigeon. A simple hyperbolic discounting model (Eq. 3), with only one free discounting parameter, predicted the observed changes in the degree of discounting of the larger, later reward in all four common-delay conditions.
These results stand in contrast to those of Experiment 1, in which the common delay was not explicitly signaled and in which adding a common delay tended to increase the degree to which the larger, later reward was discounted. Why the addition of a common delay in Experiment 1 not only did not decrease the degree of discounting, but instead actually increased discounting, is puzzling. One possibility is that by making the delay to the larger reward even longer, the common delay made the signal for the larger, later reward (i.e., the red light) more aversive. Of course, adding a common delay also increased the delay to the smaller, sooner reward, but it is possible that, as is the case with observing stimuli, the stimulus that signals the longer wait to primary reinforcement is a conditioned punisher just as the stimulus that signals the shorter wait is a conditioned reinforcer (e.g., Fantino, 1977; Dinsmoor, 1983).
Regardless of the mechanism underlying the extremely steep discounting in Experiment 1, the difference between the results of the two experiments is clearly due to the difference in the stimuli that were associated with the common delay. This may perhaps be most clearly seen in the results of the final experimental manipulations (see Figure 6), in which the signaling procedure of Experiment 1 was reintroduced. In every case, this manipulation markedly increased the degree of discounting, which returned to its previous level when the signaling procedure of Experiment 2 was reinstated. These results suggest that pigeons’ discounting is controlled not just by the choice alternatives, but also by the way in which the choice is framed.
The effect of explicitly cueing the common delay in Experiment 2 is reminiscent of the effect of explicitly cueing the post-reward interval on discounting in rhesus macaques (Pearson et al., 2010). In the monkey study, explicit cueing reduced the degree of discounting relative to a condition in which the post-reward interval was uncued, again indicating that the way in which questions are framed may have significant effects on animals’ choices.
The question of major interest in the present study was whether the hyperboloid discounting model describes pigeons’ choices between two delayed rewards just as it describes humans’ choices. Indeed it does, at least under the conditions studied in Experiment 2. This is not to say that there are no differences. Green et al. (2005) reported that humans, on average, underweight the common delay when choosing between two delayed rewards. In contrast, pigeons in the current experiment, on average, weighted both the common and unique portions of the delay to the larger, later reward equally. Taken together, the results of Experiment 2 reveal both similarities and differences between discounting by pigeons and humans. Although the two species appear to differ in whether or not equal weighting is given to the common and unique portions of the delays, their behavior is similar in that when the common portion of the time until delayed rewards is increased, the degree of discounting decreases.
In two experiments, pigeons were given choices between two delayed food rewards, a smaller amount available sooner and a larger amount available later. In Experiment 1, the delay common to both rewards was not explicitly signaled. Compared to choice between an immediate and a delayed reward, the addition of a common delay resulted in an increase in the degree to which the later reward was discounted. In contrast, when the common delay was explicitly signaled in Experiment 2, the extent to which the larger, later reward was discounted decreased systematically as the common delay was increased. The fact that differences in the signaling of the delays could have such a marked effect on the degree of discounting, even though the procedures were otherwise the same, highlights the important role that signaling plays in discounting in particular and reinforcement processes in general (Lattal, 2010).
Comparisons of Different Discounting Models
The pattern of shallower discounting with increases in the common delay observed in Experiment 2 is similar to what has been observed in humans (Green et al., 2005). It is inconsistent, however, with what would be predicted based on exponential or quasi-hyperbolic models of discounting. Exponential discounting assumes that the subjective value of a delayed reward decreases by a constant proportion with the passage of each additional unit of time; quasi-hyperbolic discounting assumes that the subjective value of a delayed reward drops sharply with the passage of just a single time period, but decreases exponentially thereafter (Laibson, 1997).
If discounting were exponential, and people and other animals made choices between delayed outcomes by comparing their present (i.e., discounted) values, then the degree of discounting would be unaffected by the duration of the common delay. Similarly, if discounting were quasi-hyperbolic, then once the time until the smaller, sooner outcome exceeded one time period, the degree of discounting would be unaffected by further increases in the common delay.
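The reasoning behind these two predictions can be written out in a short derivation. The notation here is ours and is introduced only for illustration: c is the common delay, d is the delay unique to the larger, later reward, A_S and A_L are the sooner and later amounts, k is an exponential discounting rate, and β and δ are the usual quasi-hyperbolic parameters with time counted in discrete periods.

```latex
% Exponential discounting: V = A e^{-kD}.
% Indifference between the sooner and later rewards:
\begin{align*}
A_S\, e^{-kc} &= A_L\, e^{-k(c+d)} \\
A_S &= A_L\, e^{-kd}
\end{align*}
% The common delay c cancels, so the equivalent sooner amount
% depends only on the unique delay d.

% Quasi-hyperbolic discounting: V = A for D = 0, and
% V = \beta \delta^{D} A for D \ge 1.
% When the sooner reward is itself delayed by at least one period (c \ge 1):
\begin{align*}
\beta\, \delta^{c} A_S &= \beta\, \delta^{c+d} A_L \\
A_S &= A_L\, \delta^{d}
\end{align*}
% Both \beta and \delta^{c} cancel, so further increases in c
% leave the equivalent sooner amount unchanged.
```

Under both models the ratio A_S / A_L is fixed by the unique delay alone, which is why neither can accommodate discounting that becomes shallower as the common delay grows.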
In contrast, the present discounting framework, which assumes that choices are made based on comparison of the present subjective values of hyperbolically discounted outcomes (as instantiated in Eq. 3), correctly predicts the observed pattern of results in Experiment 2. As predicted, increases in the common delay resulted in decreases in how steeply the later reward was discounted as a function of the unique delay. As the time until the sooner reward (i.e., the common delay) was increased, the subjective value of the later reward decreased less, relative to that of the sooner reward. This change was reflected in the amount of sooner reward that was equivalent in subjective value to the later reward. Importantly, a mathematical model (Eq. 3) that assumed only a single, fundamental discounting parameter predicted the observed changes in the degree of discounting of the larger, later reward as measured in terms of the amount of smaller, sooner reward of equivalent value.
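The contrast between the hyperbolic and exponential predictions can be illustrated numerically. The following is a minimal sketch, not the fitting procedure used in the study: it assumes the simple hyperbola V = A / (1 + kD) with equal weighting of the common and unique delays, rather than the exact form of Eq. 3, and the amount, rate, and unique delay are hypothetical values chosen for readability.

```python
import math


def equivalent_sooner_hyperbolic(later_amount, common_delay, unique_delay, k):
    """Sooner amount whose hyperbolically discounted value matches the later reward.

    Assumes V = A / (1 + k * D), with the later reward delayed by
    common_delay + unique_delay and the sooner reward by common_delay.
    """
    value_later = later_amount / (1.0 + k * (common_delay + unique_delay))
    # Solve A_s / (1 + k * c) = value_later for A_s.
    return value_later * (1.0 + k * common_delay)


def equivalent_sooner_exponential(later_amount, common_delay, unique_delay, k):
    """Same indifference point under exponential discounting, V = A * exp(-k * D)."""
    value_later = later_amount * math.exp(-k * (common_delay + unique_delay))
    # Solve A_s * exp(-k * c) = value_later for A_s.
    return value_later / math.exp(-k * common_delay)


# Hypothetical illustration values (not estimates from the experiments).
LATER_AMOUNT = 30.0   # units of the larger, later reward
K = 0.5               # discounting rate per second
UNIQUE_DELAY = 8.0    # seconds unique to the later reward

for common_delay in (0.0, 3.0, 5.0, 10.0):
    hyp = equivalent_sooner_hyperbolic(LATER_AMOUNT, common_delay, UNIQUE_DELAY, K)
    exp_ = equivalent_sooner_exponential(LATER_AMOUNT, common_delay, UNIQUE_DELAY, K)
    print(f"common delay {common_delay:>4.1f} s: "
          f"hyperbolic {hyp:6.2f}, exponential {exp_:6.3f}")
```

With these values the hyperbolic equivalent amount rises as the common delay lengthens (shallower discounting), whereas the exponential equivalent amount stays the same at every common delay.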
Implications for Species Comparisons
The present effort provides a cautionary tale for those making species comparisons. What initially appeared to be a clear species difference (i.e., the addition of a common delay, which leads to shallower discounting in humans, led to steeper discounting in pigeons) turned out to be peculiar to the way in which the choice was framed. That is, the way in which the common portion of the delays to smaller, sooner and larger, later rewards was signaled turned out to determine the way in which pigeons chose between delayed rewards. When the common delay was made more salient, pigeons’ choice behavior resembled that of humans choosing between delayed monetary rewards, although the time scale differed by orders of magnitude. We would point out, however, that recent studies reveal that this apparent species difference in scale breaks down when the choices presented to human and non-human animals are framed in more similar ways. That is, the subjective value of directly consumable rewards declines over seconds in deprived humans (Jimura et al., 2009, 2011) just as it does in deprived non-human animals (Mazur, 2000; Green et al., 2004).
We do not contend, however, that these discounting rates are representative of foraging in the natural environment (Stephens et al., 2004), either for humans or other animals. For laboratory experiments, researchers have designed tasks that allow them to examine discounting rates while holding the time between choice opportunities constant, regardless of how representative such situations are of those encountered in the natural environment. Discounting as observed under such circumstances is only one aspect of what determines choice behavior in the natural environment, but it presumably does play a role, and tasks like those in the present study are designed to allow examination of the discounting process in relative isolation.
The focus of the present study, however, was not on the role that discounting plays in foraging, although this is an important (and controversial) issue (e.g., Stephens et al., 2004; Kalenscher and Pennartz, 2008). Rather, the question here was whether the hyperboloid discounting model that describes human choices between two delayed rewards would also describe pigeon choices when both species are tested under somewhat analogous circumstances. And indeed, in Experiment 2, when the procedure used in Experiment 1 was modified so as to make more salient the variables that the model assumes control human discounting, the hyperboloid discounting model did describe pigeon choices.
The present findings also demonstrate how research with human and non-human animals can be mutually informative and, as such, are consistent with the view that species comparisons can increase our understanding of human decision making (Hackenberg, 2005; Shettleworth, 2010). Although the results of Experiment 1 suggested striking differences between humans and pigeons with respect to their choice between delayed rewards, consideration of recently proposed models of human discounting (Green et al., 2005) suggested critical procedural changes that were made in Experiment 2. The results observed with this modified procedure, in turn, revealed fundamental similarities between pigeons’ and humans’ choice behavior. More specifically, the present findings extend the generality of the hyperboloid discounting model and provide interspecies support for the hypothesis that choice between delayed outcomes is based on comparison of their hyperbolically discounted present values.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The research was supported by a grant from the National Institutes of Health, Grant Number MH055308. We thank members of the Psychonomy Cabal for their assistance in the running of the experiments.
- Typically, human delay discounting data are better described by a generalized form of Eq. 1 in which the denominator is raised to a power (e.g., Myerson and Green, 1995), but a simple hyperbola (i.e., the special case where the exponent equals 1.0) usually suffices to describe data from non-human animals.
Jimura, K., Myerson, J., Hilgard, J., Braver, T. S., and Green, L. (2009). Are people really more patient than other animals? Evidence from human discounting of real liquid rewards. Psychon. Bull. Rev. 16, 1071–1075.
Jimura, K., Myerson, J., Hilgard, J., Keighley, J., Braver, T. S., and Green, L. (2011). Domain independence and stability in young and older adults’ discounting of delayed rewards. Behav. Process. 87, 253–259.
Mazur, J. E. (1987). “An adjusting procedure for studying delayed reinforcement,” in Quantitative Analyses of Behavior: Vol. 5: The Effect of Delay and of Intervening Events on Reinforcement Value, eds M. L. Commons, J. E. Mazur, J. A. Nevin, and H. Rachlin (Hillsdale, NJ: Erlbaum), 55–73.
Keywords: delay discounting, hyperbolic, discounting function, pigeons, humans
Citation: Calvert AL, Green L and Myerson J (2011) Discounting in pigeons when the choice is between two delayed rewards: implications for species comparisons. Front. Neurosci. 5:96. doi: 10.3389/fnins.2011.00096
Received: 21 April 2011;
Accepted: 16 July 2011;
Published online: 17 August 2011.
Edited by: Tobias Kalenscher, Heinrich-Heine University Duesseldorf, Germany
Reviewed by: Benjamin Hayden, Duke University Medical Center, USA
Howard Rachlin, Stony Brook University, USA
James Mazur, Southern Connecticut State University, USA
Copyright: © 2011 Calvert, Green and Myerson. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.
*Correspondence: Leonard Green, Department of Psychology, Washington University, Campus Box 1125, St. Louis, MO 63130, USA. e-mail: firstname.lastname@example.org