Impulsive actions and choices in laboratory animals and humans: effects of high vs. low dopamine states produced by systemic treatments given to neurologically intact subjects

Increases and decreases in dopamine (DA) transmission have both been suggested to influence reward-related impulse-control. The present literature review suggests that, in laboratory animals, the systemic administration of DA augmenters preferentially increases susceptibility to premature responding; with continued DA transmission, reward approach behaviors are sustained. Decreases in DA transmission, in comparison, diminish the appeal of distal and difficult to obtain rewards, thereby increasing susceptibility to temporal discounting and other forms of impulsive choice. The evidence available in humans is not incompatible with this model but is less extensive.

the go/no-go task (de Wit et al., 2002), the 5-CSRT (Robbins, 2002), the DRL task (e.g., Seiden et al., 1979), and the simple reaction time task (Amalric and Koob, 1987). Versions of all five tasks have been used in laboratory rodents and humans, but the 5-CSRT literature consists predominantly of animal studies (Winstanley, 2011). While all tasks can measure premature responding, the SST and go/no-go tasks were also designed to measure the ability to inhibit a motor response. The go/no-go task measures the ability to inhibit responses to inappropriate cues, whereas the SST measures the speed at which an already initiated response can be inhibited (Eagle and Baunez, 2010). The 5-CSRT, DRL and simple reaction time tasks involve "waiting" before making a response to obtain a reinforcer (see Box 1 for a more detailed description of these tasks). Impulsive choice, in comparison, reflects the preference for immediately available small rewards over larger but more distal ones, and is commonly evaluated with temporal discounting procedures such as the delay discounting task (DDT; Ainslie, 1975), effort discounting paradigms (e.g., Floresco et al., 2008), probabilistic discounting tasks (e.g., St Onge and Floresco, 2009) and gambling-like tasks, such as the Iowa Gambling Task (IGT; Bechara et al., 1994; see Box 2 for more detailed descriptions of these tasks).
While multiple other neurotransmitters also influence performance on these tasks (Winstanley et al., 2003; Winstanley, 2011), the focus of the present review is on the role of DA, both in laboratory animals and in healthy human subjects. With a few noted exceptions, we will specifically review studies on the effects of acutely administered drugs given systemically.
1. The five-choice serial reaction time task (5-CSRT; Robbins, 2002) A visual stimulus is presented at one of five locations, and responding must be withheld until the stimulus signals that responding is appropriate. Impulsive behavior is measured by the number of responses made before the onset of the stimulus.
2. The differential reinforcement of low rates (DRL) of responding task (e.g., Seiden et al., 1979) To obtain a reinforcer, subjects are required to withhold responding for a fixed period of time and then to respond. The periods during which subjects are required to withhold responding typically range from 10 s (DRL 10) to 72 s (DRL 72). Premature responses on this task consist of those made before this period of time has elapsed. Such responses reset the trial and are not reinforced. In some instances, subjects are required to respond after a fixed period of time has elapsed, but the response must occur within a limited time window (e.g., DRL 10-14); late responses are not reinforced.
3. The simple reaction time task (Amalric and Koob, 1987) Each trial begins when subjects press on a lever. They must hold the lever down for a variable period of time, until a visual or auditory stimulus is presented. Following the presentation of this stimulus, subjects must release the lever within a pre-determined delay. Incorrect trials consist of those during which the lever was released prior to the stimulus onset (i.e., anticipated or premature responses) or after the delay has elapsed following the stimulus onset (i.e., delayed responses).
4. The stop-signal task (SST; Logan et al., 1984) Subjects initiate a motor response following a go signal, and reaction times (RT) are determined. On a small proportion of trials, a stop signal follows the go signal. Sometimes the stop signal appears well before the subject's RT limit, thereby providing sufficient time to inhibit the response. On other trials, though, the stop signal occurs very close to when the subject would normally respond, providing little time to inhibit the behavior. The longer the interval required to inhibit responses, the longer the stop signal response time (SSRT), and the poorer the inhibitory control.
5. The go/no-go task (de Wit et al., 2002) Only one signal is presented per trial. The go signal is much more frequent than the no-go signal, thus priming subjects to initiate a motor response. On this task, poor inhibitory control is quantified by the number of responses on no-go trials (i.e., errors of commission).
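As an illustration of how the SSRT described for the stop-signal task can be derived from trial data, the following minimal sketch shows two commonly used estimates based on the race model of stopping. It is not taken from any of the studies reviewed here: the function names, the notion of a stop-signal delay (the interval between go and stop signals), and the example values are all assumptions introduced for illustration.

```python
import math


def ssrt_mean(go_rts, stop_delays):
    """Mean method: mean go RT minus mean stop-signal delay.

    Assumption: the stop-signal delay was adjusted so that subjects
    inhibited successfully on roughly half of the stop trials.
    """
    mean_go = sum(go_rts) / len(go_rts)
    mean_delay = sum(stop_delays) / len(stop_delays)
    return mean_go - mean_delay


def ssrt_integration(go_rts, stop_delays, p_respond):
    """Integration method: go RT at the p_respond quantile minus mean delay.

    p_respond is the (hypothetical) proportion of stop trials on which
    the subject failed to inhibit the response.
    """
    ordered = sorted(go_rts)
    # Rank of the go RT that cuts off the fastest p_respond share of trials.
    rank = math.ceil(p_respond * len(ordered))
    index = min(len(ordered) - 1, max(0, rank - 1))
    mean_delay = sum(stop_delays) / len(stop_delays)
    return ordered[index] - mean_delay


# Hypothetical data, in milliseconds.
go_rts = [400, 420, 440, 460, 480, 500, 520, 540, 560, 580]
stop_delays = [200, 250, 300]

mean_estimate = ssrt_mean(go_rts, stop_delays)
integration_estimate = ssrt_integration(go_rts, stop_delays, p_respond=0.5)
```

In both estimates a longer SSRT corresponds to poorer inhibitory control, which is the direction of interpretation used throughout this review.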
2. Effort discounting tasks (e.g., Floresco et al., 2008) Subjects must choose between a small reinforcer that can be easily obtained and a large reinforcer that requires greater effort to obtain. On separate trials, the value of the easy reinforcer or the amount of effort required to obtain the large reinforcer is changed, until subjects reach a point where both options are chosen equally. This point is called the indifference point. This step is repeated several times to obtain multiple points, which yield a hyperbolic discounting curve. The measure of interest is the steepness of that curve, which is called the discounting rate, with greater steepness indicating greater effort discounting, meaning that subjects are less willing to exert effort to obtain the larger reinforcer (i.e., impulsive choice).
3. Probability discounting tasks (e.g., St Onge and Floresco, 2009) Subjects must choose between a small reinforcer that is delivered with greater certainty, and a larger, more uncertain reinforcer that is delivered according to various probabilities. On separate trials, the value of the small reinforcer or the probability at which the large reinforcer is delivered is changed, until subjects reach a point where both options are chosen equally. This point is called the indifference point. This step is repeated several times to obtain multiple points, which yield a hyperbolic discounting curve. The measure of interest is the steepness of that curve, which is called the discounting rate, with greater steepness indicating greater probability discounting, meaning that subjects choose the large reinforcer despite low probabilities of its delivery (i.e., impulsive choice).
4. The Iowa Gambling Task (IGT; Bechara et al., 1994) Subjects are asked to pick cards from four decks. Two decks result in an overall gain due to frequent small gains and infrequent small losses, and the remaining two result in an overall loss due to frequent large gains but infrequent larger losses. As subjects are not explicitly told about these contingencies, they must learn the patterns of wins and losses associated with each deck, and guide their choice toward the advantageous decks (i.e., those providing smaller immediate rewards in prospect of avoiding a long-term loss) to obtain a positive outcome. Therefore, the IGT requires learning for optimal performance (Fellows and Farah, 2005).
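The indifference-point procedure described for the discounting tasks in this box can be illustrated with a short sketch. It assumes the commonly used hyperbolic form V = A / (1 + kC), where A is the size of the large reinforcer, C is its cost (delay, effort, or odds against delivery) and k is the discounting rate. The function names, the grid-search fit, and the example data are hypothetical and do not reproduce the analysis of any study cited here.

```python
import math


def hyperbolic_value(amount, cost, k):
    """Subjective value of a reinforcer of size `amount` at a given cost.

    `cost` can stand for delay, effort, or odds against delivery;
    `k` is the discounting rate (steeper curve for larger k).
    """
    return amount / (1.0 + k * cost)


def fit_discount_rate(costs, indifference_points, amount,
                      k_min=1e-4, k_max=10.0, steps=2000):
    """Least-squares estimate of k by grid search over log-spaced values.

    A simple stand-in for a proper nonlinear fit, chosen so that the
    example needs only the standard library.
    """
    best_k, best_error = None, math.inf
    for i in range(steps + 1):
        k = k_min * (k_max / k_min) ** (i / steps)
        error = sum((hyperbolic_value(amount, c, k) - v) ** 2
                    for c, v in zip(costs, indifference_points))
        if error < best_error:
            best_k, best_error = k, error
    return best_k


# Hypothetical indifference points generated from a known rate (k = 0.1).
delays = [0, 5, 10, 20, 40]
large_amount = 100.0
points = [hyperbolic_value(large_amount, d, 0.1) for d in delays]

k_hat = fit_discount_rate(delays, points, large_amount)
```

A subject whose indifference points fall off more quickly with increasing cost yields a larger fitted k, which is the "steeper" curve that the box describes as greater discounting.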
In studies using the SST and go/no-go tasks, the results have been similar. An acute dose of GBR 12909 (5, 10 mg/kg, i.p.) accelerated both go and stop SST RT, leading to an increased number of premature responses (Bari et al., 2009). Although low to moderate doses of amphetamine (0.5, 1.0 mg/kg, i.p. in mice; Loos et al., 2010) and cocaine (5, 10 mg/kg, i.p. in rats; Paine and Olmstead, 2004) had no effect, a higher dose of cocaine (15 mg/kg, i.p.) increased premature responding, as indexed by increased commission errors (responding during no-go trials; Paine and Olmstead, 2004).
The ability to inhibit an initiated response is also altered by DAergic drugs, but the pattern of effects differs from that seen for premature responding. In studies using the SST, the effects of amphetamine, methylphenidate and modafinil depended on whether the animals had slow or fast inhibitory responses under placebo. In rats with poor inhibitory control (slow responders), all three drugs improved performance by shortening the time required to inhibit an initiated response (Feola et al., 2000; Eagle and Robbins, 2003; Eagle et al., 2007, 2009). The opposite effect was observed in fast responders (0.3, 1.0 mg/kg modafinil, i.p.; Eagle et al., 2007). Administration of the D1/D2 receptor antagonist flupenthixol (0.01-0.125 mg/kg, i.p.), in comparison, had no effect on the ability to inhibit responses on the SST in either fast or slow responders (Eagle et al., 2007).
Together then, the weight of evidence from these studies in laboratory rodents indicates that elevations in DA transmission can have two main effects on impulsive actions: they increase premature responding while also improving the ability to inhibit prepotent responses in impulsive animals.

Human studies
To our knowledge, there are no studies of the effects of DA augmenters on premature responding in healthy humans. There is a small literature, though, describing effects on the ability to inhibit initiated responses (Table 3). In agreement with studies in laboratory animals, the acute administration of low-to-moderate doses of oral d-amphetamine (10-20 mg) had differential effects on SST performance depending on the subject's baseline performance. In individuals with poor baseline inhibitory control (i.e., slow stoppers), d-amphetamine improved the ability to inhibit an initiated response, while having no effect in fast stoppers (de Wit et al., 2002). A similar pattern has been observed on the go/no-go task, where more impulsive subjects at baseline showed a reduction in the number of commission errors following both 10 and 20 mg of d-amphetamine (de Wit et al., 2002). These results are consistent with those seen in children and adults with attention-deficit/hyperactivity disorder, where the ability to inhibit responses is improved following the administration of oral methylphenidate (Tannock et al., 1989; Aron et al., 2003). They are also in accordance with a recent study by Aarts et al. (2014), in which individuals with greater DA synthesis capacity performed poorly when they anticipated a large reward on a modified Stroop task. The authors proposed that DA release in response to the large rewards may "overdose" an already DA-rich system, and have detrimental effects on performance. These detrimental effects would potentially emerge with higher doses of d-amphetamine in fast stoppers. In one study, methylphenidate administration (40 mg) did not affect healthy adults' performance on the SST or on the go/no-go task, but it reduced intra-individual RT, which is indicative of increased attention (Costa et al., 2013). The lack of effect on overall performance could be explained by the low rate of inhibition errors in this particular sample of participants.
Decreasing DA release, in comparison, using the acute phenylalanine/tyrosine depletion method, has been reported to increase go/no-go commission errors, particularly in response to reward cues, and diminish the ability to suppress incorrect impulses, as measured with a sensitive electromyography index (Ramdani et al., 2014). In the converse experiment, administration of the DA precursor, tyrosine (2.0 g, p.o.), improved SST performance by reducing the time required to inhibit initiated responses (Colzato et al., 2014). This effect of tyrosine might reflect greater cognitive control in the prefrontal cortex, as the same research group has found that tyrosine improves performance on a demanding condition of the N-Back task (Colzato et al., 2013), while tyrosine depletion tended to reduce N-Back performance in people carrying the low-activity Met allele of the gene encoding the enzyme catechol-O-methyltransferase (Kelm and Boettiger, 2013). These effects on the ability to inhibit prepotent responses, though, have not been observed consistently in other studies following tyrosine depletion (McLean et al., 2004; Lythe et al., 2005) or following administration of d-amphetamine (7.5-15 mg/kg, p.o.; Fillmore et al., 2005) or the DA agonist, pramipexole (0.25-0.5 mg, p.o.; Hamidovic et al., 2008). It remains to be tested whether these divergent findings reflect baseline differences in performance or the need for more sensitive measures (Ramdani et al., 2014).
Dopamine antagonists, such as haloperidol (0.1-0.2 mg/kg, i.p.) and flupenthixol (0.25-0.5 mg/kg, i.p.), also increase effort discounting (Salamone et al., 1991; Denk et al., 2005; Floresco et al., 2008) and decrease the willingness to sustain effort as measured by progressive ratio breakpoints for natural rewards, such as food (Salamone et al., 2009), and pharmacological rewards, such as cocaine (Roberts et al., 2013). Interestingly, Cocker et al. (2012) reported evidence for differential effects of amphetamine on effort discounting in rats that exert high vs. low effort at baseline. In hard-working rats, a low dose (0.3 mg/kg, i.p.) increased their willingness to work to obtain the large reward, but higher doses (0.6, 1.0 mg/kg, i.p.) had the opposite effect. In the so-called "slacker" rats, these high doses enhanced their willingness to exert effort to obtain the large reward. These results are consistent with the findings of Aarts et al. (2014) of reduced cognitive control when DA levels are too high in individuals with greater DA synthesis capacity. It is possible that having too much DA impaired cognitive control in a way that made the smaller, but immediately available reward more appealing than the larger reward, which required greater effort to obtain.
Compared to temporal and effort discounting tasks, the results overall differ in probabilistic tasks where rats choose between a smaller but certain reward, and a larger but uncertain reward. In such tasks, low DA states induced by administration of flupenthixol (0.4 mg/kg, i.p.), eticlopride (0.01-0.03 mg/kg), and SCH 23390 (0.005-0.01 mg/kg, i.p.) decreased risky choices, even when probabilities of obtaining the larger reward were high (St Onge and Floresco, 2009; St Onge et al., 2010). In comparison to these effects of DA antagonists, the administration of small to moderate doses of amphetamine (0.125-1.0 mg/kg, i.p.) shifts choice preferences toward larger, more uncertain rewards, even when the probability of delivery is very low (St Onge and Floresco, 2009; St Onge et al., 2010). This effect has been observed when such probabilities are presented in descending order, while the opposite was reported when probabilities increased over time (St Onge et al., 2010). The same research group has also found that low doses of amphetamine reduced delay (0.25 mg/kg) and effort discounting (0.125, 0.25 mg/kg, i.p.), but that a higher dose (0.5 mg/kg, i.p.) increased effort discounting, meaning that animals were less willing to exert effort to obtain larger rewards (Floresco et al., 2008).
The above findings suggest that distinct mechanisms underlie delay, effort, and risk discounting. They further highlight that methodological differences in task requirements or the order of presentation of various contingencies can be crucial when interpreting the effects of DA manipulations. Other differences, such as the presence of reward cues during the delay, the type of reinforcer used, and variations in the paradigms, are also worth considering. It is noteworthy that the standard delay discounting paradigm shares features with premature responding tasks such as the 5-CSRT, as both assess the ability to wait in order to get a reinforcer. This is supported by correlations between levels of delay discounting and premature responding in the same rats. It is therefore possible that large increases in DA levels, which are known to induce premature responding, interfered with performance on the DDT, and as such, masked the potential benefits of DA agonists on the ability to tolerate delays to maximize rewards. It thus appears that having too little or too much DA can impair performance on the delay and effort discounting tasks, whereas high DA might result in greater impulsivity on probabilistic discounting tasks when high probabilities are presented first.
On a recently developed rat version of the IGT, the rGT, d-amphetamine (0.25, 0.5, 1.0, 1.5, 2.5 mg/kg, i.p.) increased preference for smaller but certain rewards, resulting in a poorer overall outcome (Zeeb et al., 2009; Baarendse et al., 2013; van Enkhuizen et al., 2013), whereas eticlopride (0.01 mg/kg, i.p.) improved performance by shifting preference toward larger but riskier rewards (Zeeb et al., 2009). It should be noted that rats were punished for losses by timeout periods during which no reward could be earned, which may indicate that amphetamine exerted its effects by increasing sensitivity to punishment. This interpretation is consistent with the observation that d-amphetamine (0.3-1.5 mg/kg, i.p.) dose-dependently decreased choice of a large but risky reinforcer in a probability discounting paradigm in which the risky reinforcer was associated with a mild footshock (Simon et al., 2009; Mitchell et al., 2011). Together, the above findings suggest that, in these studies, amphetamine affects risk aversion more clearly than reward sensitivity. There are challenges, though, in comparing rewards and punishments on features such as stimulus salience, intensity, etc. Other DA augmenters, such as cocaine (5-15 mg/kg, i.p.; Simon et al., 2009), GBR 12909 (2.5-28.5 mg/kg, i.p.; Baarendse et al., 2013; van Enkhuizen et al., 2013), and modafinil (16-64 mg/kg, i.p.; van Enkhuizen et al., 2013), as well as the DA antagonist SCH 23390 (0.001-0.03 mg/kg, i.p.; Zeeb et al., 2009; Simon et al., 2011), have not significantly affected performance on the rGT or on the probabilistic task with mild footshock.

Human studies
The DA-impulsive choice literature in healthy humans remains quite small (see Table 6). Most of the evidence, direct and indirect, suggests that low DA states aggravate impulsive choice while modest increases improve it (Trifilieff and Martinez, 2014). For example, transiently decreasing DA synthesis, using the tyrosine depletion method, impairs the ability to resist short-term, large gains despite long-term, larger losses on the IGT (Scarnà et al., 2005; Sevy et al., 2006), and decreases the willingness to sustain effort as measured by progressive ratio breakpoints when subjects work for alcohol (Barrett et al., 2008), tobacco (Venugopalan et al., 2011) and money (Cawley et al., 2013). On a guessing game in which probabilities of making the right decision vary, poor performance has also been observed following tyrosine depletion when probabilities were low (McLean et al., 2004), although a lack of effect of tyrosine depletion on probability discounting has also been reported (Lythe et al., 2005). Following administration of haloperidol (3 mg, p.o.), performance was impaired on a betting game in which there were no contingencies between responses and outcomes. Specifically, healthy controls who won money on a given trial subsequently increased the size of their bet on the next trial (Tremblay et al., 2011). Thus, greater reward expectancies resulted in increased risk-taking. However, the same dose of haloperidol did not affect performance on a slot machine game (Zack and Poulos, 2007), nor did a smaller dose (1.5 mg, p.o.) affect rates of delay discounting in healthy adults (Pine et al., 2010). Pramipexole (0.5 mg, p.o.) increased risky choice on a gambling task following unexpected double wins (Riba et al., 2008). Again, reward expectancies influenced risk-taking. It thus seems that both high and low DA states enhance risk-taking when a large reward is expected.
In comparison to these effects of decreasing DA transmission, healthy volunteers' tolerance for delayed rewards on the DDT was increased by a moderate dose of oral d-amphetamine (20 mg; de Wit et al., 2002), but not following a lower dose of amphetamine (10 mg, p.o.; de Wit et al., 2002), 150 and 300 mg (p.o.) of the weak DA reuptake inhibitor bupropion (Acheson and de Wit, 2008), or the direct DA D2 agonist pramipexole (0.25-0.50 mg, p.o.) (Hamidovic et al., 2008). Low to moderate doses of d-amphetamine (10, 20 mg, p.o.) also decreased effort discounting, meaning that participants were more willing to work hard to obtain large rewards (Wardle et al., 2011). In contrast to the above findings, administration of the immediate DA precursor, L-DOPA (150 mg, p.o.), has been reported to increase delay discounting (Pine et al., 2010), while a smaller dose (100 mg, p.o.) had no effect on a gambling task in which no feedback was provided (Symmonds et al., 2013). It has been proposed that DA might affect decision-making through its effects on learning from different forms of feedback (Collins and Frank, 2014). The absence of ongoing feedback during the gambling task might have prevented L-DOPA from exerting effects, as no learning was involved. It remains unclear whether the conflicting results reviewed above reflect lack of specificity of some of the compounds used, different paradigms affecting different aspects of performance, different behavioral effects from changes in phasic vs. tonic DA release, spurious findings in a still small literature, or something else.
In summary, the evidence is less consistent when it comes to impulsive choice. Animal studies point to dose-dependent effects, with small increases in DA improving performance on the DDT and larger doses leading to impairment. In humans, decreasing DA transmission increases impulsive effort discounting, but the effects of DA augmenters and behavioral responses on other tasks are less consistent. In studies using gambling paradigms, poorer performance is seen following elevated DA transmission in rats and lowered DA in healthy human subjects. Additional research in humans is needed where different drugs and a wide range of doses are directly compared.

SUMMARY AND CONCLUSIONS
It was previously proposed that increased vs. decreased DA transmission might predispose individuals to premature responding vs. delay discounting (Leyton, 2007). Since then, the animal literature has grown, and the proposed demarcation stands up well. Studies in neurologically intact humans, though, remain scarce, and caution is warranted since the exact effects in humans and rodents are not always the same. For now, it remains unclear whether these differences reflect methodology (e.g., different drugs, routes of administration and tests), neurobiology (e.g., larger, more complex and more dense DA innervation of primate frontal cortex), or, more simply, the smaller number of studies in healthy human subjects.

ACKNOWLEDGMENTS
This work was supported by an operating grant from the Canadian Institutes of Health Research to Marco Leyton (MOP-36429). Valérie D'Amour-Horvat is a recipient of a student award from Fonds de la recherche du Québec-Santé (FRQ-S). We thank Yogita Chudasama for providing feedback on an earlier version of the manuscript.