# Memory Systems of the Addicted Brain: The Underestimated Role of Drug-Induced Cognitive Biases in Addiction and Its Treatment

Edited by: Vincent David, Daniel Béracochéa and Mark E. Walton. Published in: Frontiers in Psychiatry

#### Frontiers Copyright Statement

© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714. ISBN 978-2-88945-487-7. DOI 10.3389/978-2-88945-487-7

# Memory Systems of the Addicted Brain: The Underestimated Role of Drug-Induced Cognitive Biases in Addiction and Its Treatment

Topic Editors:

Vincent David, Institut de Neurosciences Cognitives et Intégratives d'Aquitaine, CNRS UMR 5287, France and Université de Bordeaux, France

Daniel Béracochéa, Institut de Neurosciences Cognitives et Intégratives d'Aquitaine, CNRS UMR 5287, France and Université de Bordeaux, France

Mark E. Walton, University of Oxford, United Kingdom

Citation: David, V., Béracochéa, D., Walton, M. E., eds. (2018). Memory Systems of the Addicted Brain: The Underestimated Role of Drug-Induced Cognitive Biases in Addiction and Its Treatment. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-487-7

# Editorial: Memory Systems of the Addicted Brain: The Underestimated Role of Cognitive Biases in Addiction and Its Treatment

*Vincent David<sup>1,2</sup>\*, Daniel Béracochéa<sup>1,2</sup> and Mark E. Walton<sup>3</sup>*

*<sup>1</sup>Institut de Neurosciences Cognitives et Intégratives d'Aquitaine, CNRS UMR 5287, Pessac, France, <sup>2</sup>Université de Bordeaux, Pessac, France, <sup>3</sup>Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom*

Keywords: addiction, memory systems, decision-making, habit learning, alcohol

#### **Editorial on the Research Topic**

#### **Memory Systems of the Addicted Brain: The Underestimated Role of Cognitive Biases in Addiction and Its Treatment**

Drug addiction has often been viewed as an aberrant form of learning during which strong associations linking actions to drug seeking are expressed as persistent stimulus–response habits, thereby maintaining a vulnerability to relapse. However, an increasing body of data suggests a more complex picture, in which several distinct cognitive processes are altered by drug use or abuse. These alterations need to be taken into account to better understand addictive behaviors, as they are likely to contribute both to the persistence of these behaviors and to their response to pharmacological and nonpharmacological treatments. The aim of this research topic is therefore to provide an overview of current work investigating the long-term impact of drug use on learning, memory, and decision-making processes, how multiple memory systems modulate drug-seeking behavior, and how drug-induced cognitive biases could contribute to the persistence of addictive behaviors. The research topic also presents new animal models of cognitive processes pertaining to addiction, which supports the translational interest of these tasks.

The research topic begins with a commentary repositioning the initiative on precision medicine launched by the National Institutes of Health in the context of addictions (Ghitza), and a comprehensive presentation of the neuropsychological consequences of chronic drug use across a wide range of substances (Cadet and Bisagno). The emerging picture is that drugs of abuse have effects on cognitive processes that go far beyond their well-known habit-forming action. In fact, there is now evidence that, under certain circumstances, repeated cocaine exposure appears to *promote* more complex goal-directed behaviors (Halbout et al.).

Chronic drug use also increasingly appears to have long-lasting effects on interactions between memory systems, which are a normal aspect of learning. Both human and rodent studies support the view that the hippocampus and the dorsal striatum can interact in either a cooperative or a competitive manner during learning, with the prefrontal cortex being involved in the selection of an appropriate learning strategy. Building on the original studies of Norman M. White 20 years ago, a comprehensive review describes how chronic consumption of drugs of abuse impacts normal interactions between these memory systems (Goodman and Packard). Within this theoretical framework, an experimental report further shows that opiate self-administration eventually leads to a functional imbalance characterized by an exclusive use of striatum-dependent learning strategies, at the expense of hippocampus-dependent processes, in rodents performing navigational tasks (Baudonnat et al.). One structure that may be critical for acting as a switch between memory systems, the ventral tegmental area, is the focus of an in-depth review looking closely at its afferent circuits and their specific involvement in drug-related behaviors (Oliva and Wanat).

#### *Edited by:*

*Yasser Khazaal, Université de Genève, Switzerland*

#### *Reviewed by:*

*Aviv M. Weinstein, Ariel University, Israel*

*\*Correspondence: Vincent David vincent.david@u-bordeaux.fr*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 16 November 2017 Accepted: 25 January 2018 Published: 19 February 2018*

#### *Citation:*

*David V, Béracochéa D and Walton ME (2018) Editorial: Memory Systems of the Addicted Brain: The Underestimated Role of Cognitive Biases in Addiction and Its Treatment. Front. Psychiatry 9:30. doi: 10.3389/fpsyt.2018.00030*

In the recent version of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), alcohol and drug addiction have been combined under the new classification of substance use disorders. Common behavioral symptoms with diagnostic value now include existing criteria such as loss of control, negative affect upon withdrawal, and vulnerability to relapse, together with the more recently added criterion of craving, defined as an urgent desire to use the target substance. In the present issue, several reviews and experimental studies present compelling evidence that alcohol abuse leads to long-lasting changes in learning processes, which may contribute to persistent alcoholism (Corbit and Janak; Staples and Mandyam). Complementing the description of these behavioral changes, other authors have extensively reviewed what is currently known about the role of epigenetic marks (histone deacetylation) in the glucocorticoid-dependent dysregulation of hypothalamic–pituitary–adrenal axis activity (Mons and Beracochea).

A loss of cognitive flexibility may also be observed through the assessment of decision-making processes, an essential component of daily life. Such deficits may be uncovered by imposing rule changes on the subject, such as requiring an attentional shift between different perceptual features of a complex stimulus, as in the attentional set-shifting task, which was recently adapted to rodents (Besson and Forget). In this issue, Granon and colleagues (Pittaras et al.) provide evidence implicating β2 nicotinic receptors in the excitation/inhibition balance in the prefrontal cortex using β2<sup>−/−</sup> mice, which exhibit inappropriate decision-making and a blunted sensitivity to punishment when outcome uncertainty is high. These reports are especially interesting in that they also provide new means of carefully evaluating decision-making in rodents.

The importance of a better understanding, at both the experimental and theoretical levels, of decision-making processes for the purpose of addiction treatments is highlighted by the study of Regier and Redish on contingency management. The authors challenge the view that the success of this approach relies solely on alternative reinforcement. Instead, they provide evidence that access to deliberative decision-making processes, and bypass of automatic action-selection systems, may be the key to the therapeutic efficacy of contingency management. It is striking that, although formulated within a different theoretical framework, the conclusions drawn here point to cognitive processes similar to those described at the neurobiological level by Baudonnat et al. and Goodman and Packard. Finally, noting the efficacy of eye movement desensitization and reprocessing (EMDR) in posttraumatic stress disorder, an elegant study asked what effects EMDR has on nicotine-related mental imagery and craving (Littel et al.). These intriguing results open an interesting debate about EMDR therapeutic approaches, encouraging future work to determine how long EMDR-induced improvements may be maintained during protracted abstinence.

In conclusion, the present collection of articles provides original data and new perspectives on a highly promising line of research examining the dynamics of cognitive processes throughout the main steps of the addiction cycle, from its initiation to treatment.

### AUTHOR CONTRIBUTIONS

Equal contribution by all the authors.

### FUNDING

This work was supported by the Centre National de la Recherche Scientifique (CNRS), the University of Bordeaux, and the Regional Council of the Aquitaine Region. MEW was funded by a Wellcome Trust Senior Research Fellowship (202831/Z/16/Z).

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2018 David, Béracochéa and Walton. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Morphine Reward Promotes Cue-Sensitive Learning: Implication of Dorsal Striatal CREB Activity

*Mathieu Baudonnat<sup>1,2</sup>, Jean-Louis Guillou<sup>1,2</sup>, Marianne Husson<sup>1,2</sup>, Veronique D. Bohbot<sup>3</sup>, Lars Schwabe<sup>4</sup> and Vincent David<sup>1,2</sup>\**

*<sup>1</sup>CNRS UMR 5287, Institut de Neurosciences Cognitives et Intégratives d'Aquitaine, Pessac, France, <sup>2</sup>Département des Sciences de la Vie et de la Santé, Nouvelle Université de Bordeaux, Pessac, France, <sup>3</sup>Department of Psychiatry, Douglas Institute, McGill University, Montreal, QC, Canada, <sup>4</sup>Department of Cognitive Psychology, University of Hamburg, Hamburg, Germany*

Different parallel neural circuits interact and may even compete to process and store information: whereas stimulus–response (S–R) learning critically depends on the dorsal striatum (DS), spatial memory relies on the hippocampus (HPC). Strikingly, despite its potential importance for our understanding of addictive behaviors, the impact of drug rewards on the dynamics of memory systems has not been extensively studied. Here, we assessed the long-term effects of drug vs food reinforcement on the subsequent use of S–R vs spatial learning strategies and their neural substrates. Mice were trained in a Y-maze cue-guided task, during which either food or morphine injections into the ventral tegmental area (VTA) were used as rewards. Although drug- and food-reinforced mice learned the Y-maze task equally well, drug-reinforced mice exhibited a preferential use of an S–R learning strategy when tested in a water-maze competition task designed to dissociate cue-based and spatial learning. This cognitive bias was associated with a persistent increase in the phosphorylated form of cAMP response element-binding protein (pCREB) within the DS, and a decrease in pCREB expression in the HPC. Pharmacological inhibition of the striatal PKA pathway in drug-rewarded mice limited the morphine-induced increase in pCREB levels in the DS and restored a balanced use of spatial vs cue-based learning. Our findings suggest that drug (opiate) reward biases the engagement of separate memory systems toward a predominant use of the cue-dependent system *via* an increase in learning-related striatal pCREB activity. A persistent functional imbalance between striatal and hippocampal activity could contribute to the persistence of addictive behaviors, or counteract the efficacy of pharmacological or psychotherapeutic treatments.

Keywords: reward, drug self-administration, CREB, memory, morphine, striatum, ventral tegmental area

### INTRODUCTION

Drug addiction may be viewed as an aberrant form of learning during which strong associations linking actions to drug seeking are expressed as persistent stimulus–response (S–R) habits, thereby increasing the vulnerability to relapse (1–3). Whereas the hippocampal memory system encodes relationships between events and supports their later flexible use, the dorsal part of the striatum plays a critical role in habit/procedural learning (4–7). Studies in both rodents and humans support the view that the hippocampus (HPC) and the dorsal striatum (DS) interact in either a cooperative (8–10) or competitive manner during learning (11–14). It is well documented that emotional, stressful events are potent modulators of striatum–HPC interactions: they promote habitual over cognitive forms of learning, through the interaction of glucocorticoids and noradrenaline (15–19). The amygdala plays a key role in orchestrating the switch from hippocampal to striatal learning (20, 21). Stress decreases hippocampal long-term potentiation (LTP) in rodents with an intact amygdala, but not in amygdala-lesioned animals (22). In contrast, we know surprisingly little about the impact of rewards on interactions between memory systems.

#### *Edited by:*

*Alain Dervaux, Centre hospitalier Sainte-Anne, France*

#### *Reviewed by:*

*Roberto Ciccocioppo, University of Camerino, Italy*

*Robert F. Leeman, University of Florida, United States*

*\*Correspondence: Vincent David vincent.david@u-bordeaux.fr*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 15 November 2016 Accepted: 01 May 2017 Published: 30 May 2017*

#### *Citation:*

*Baudonnat M, Guillou J-L, Husson M, Bohbot VD, Schwabe L and David V (2017) Morphine Reward Promotes Cue-Sensitive Learning: Implication of Dorsal Striatal CREB Activity. Front. Psychiatry 8:87. doi: 10.3389/fpsyt.2017.00087*

All rewards, whether they are sensory (e.g., food) or pharmacological (e.g., drugs of abuse), activate an ascending dopamine (DA) mesolimbic circuit composed of neurons projecting from the ventral tegmental area (VTA) to the nucleus accumbens (NAC) (23, 24). This circuit mediates appetitive learning (25, 26) and is implicated in the transition from goal-directed to habitual behavior through a succession of loops that progressively recruit the nigrostriatal system following novelty-elicited activation of the mesolimbic pathway (27–30). The VTA also provides direct innervation to the HPC, forming a loop that could act as a gating mechanism allowing access to long-term memory (31, 32). The VTA therefore appears to be a key locus for modulating interactions between memory systems (33, 34). We have previously reported that drug, but not food, rewards lead to a deficit in a spatial memory task, while sparing a cued version of the same task (35). These effects were related to an increase in the PKA-dependent phosphorylation of the cAMP response element-binding protein (pCREB) in the DS. pCREB is involved in the acquisition/consolidation of both cue-guided, striatum-dependent learning and spatial, HPC-dependent learning (12, 36–40). Interestingly, spatial learning produces transient waves of pCREB in the HPC, and a long-term increase in pCREB levels lasting up to 72 h (41). pCREB has been linked to changes in synaptic plasticity and to late long-term potentiation (l-LTP) (42, 43). l-LTP is clearly involved in long-term memory formation (44), and DA is a potent modulator of these cellular adaptations (45, 46), further suggesting that the reward system modulates interactions between different forms of learning. These cellular adaptations may reinforce information processing by a particular memory system and thereby determine which learning strategies are subsequently used.

In the present study, we investigated the impact of drug-induced activation of the reward system on the subsequent use of different learning strategies, i.e., HPC-dependent spatial vs striatum-dependent cue learning. We first tested the acquisition of a cued Y-maze discrimination task in animals rewarded with either food or intra-VTA drug self-injections. To compare the impact of these two forms of reward on subsequent learning processes, we then evaluated the preferential use of cued vs spatial learning strategies in a competition task and linked this preference to regional CREB phosphorylation in the brain. We used two successive, different tasks to avoid direct drug-related effects on performance and to assess new learning as opposed to the expression of a consolidated memory. Finally, we tested whether pharmacological manipulation of the PKA/CREB pathway within the DS can modulate learning strategies in animals with a history of drug self-administration.

### ANIMALS AND METHODS

### Experiment I: Effects of Drug vs Food Reward on Learning Strategies

### Animals

Male C57BL/6J mice (13 weeks old; Charles River) were housed individually and maintained on a 12 h light–dark artificial cycle (lights on at 7:00 a.m.) in a temperature-controlled colony room (22 ± 1°C). They were provided with food and water *ad libitum*. During the week before behavioral testing, the food ration was adjusted individually so that animals reached 95% of their *ad libitum* weights during the Y-maze task. Immediately after the end of Y-maze testing, food was again provided *ad libitum*. All experiments were approved by the local Ethics Committee for Animal Experiments (Comité d'Ethique pour l'Expérimentation Animale de Bordeaux, CEE50) and were performed in accordance with the European Communities Council Directive of 1st February 2013 (2010/63/UE).

### Surgery

Mice were anesthetized with a ketamine/xylazine mixture (Ketamine 1000 Virbac®: 100 mg/kg/Rompun® 2%: 8 mg/kg i.p.), and lidocaine HCl (Xylocaine®, 5%) was applied locally before opening the scalp and trepanation. The incisor bar was leveled with the interaural line. A guide cannula (30 gauge, Le Guellec®, Douarnenez, France) was implanted unilaterally, in a counterbalanced left and right order, 1.5 mm above the posterior VTA (from interaural line: AP: +0.40 mm, ML: ±0.30 mm, DV: −3.30 mm from skull surface). Mice were allowed to recover from surgery for 1 week. After the experiments, animals were anesthetized with Avertin (10 ml/kg, i.p.) and perfused transcardially with 4% paraformaldehyde in 0.1 M phosphate buffer (PB) for the histological control of all surgical implantations (see **Figure 1**) using thionin blue staining (35).

### The Y-Maze Task

All procedures started with a 10-day Y-maze training protocol and are schematized in **Figure 2**. The Y-maze discrimination protocol was identical to the one described in Ref. (35). Briefly, animals (*n* = 47) had to learn that a visual intra-maze cue (black–white striped laminated paper) was associated with the delivery of reward. They were separated into four groups: the first group was rewarded using a self-administration system allowing the delivery of microinjections of morphine into the VTA (morphine reward: 50 ng/50 nl/inj, *n* = 17); the second group was rewarded with small pieces of crisps (5 mm<sup>2</sup> of naturally flavored crisps, Vico®, *n* = 15); and the third group received artificial cerebrospinal fluid (aCSF, Phymep, France) (*n* = 15). A fourth, yoked-control group (yoked, *n* = 16) was submitted to the same protocol as morphine-rewarded animals, except that they could not trigger any injection. Instead, the computer did so each time a paired self-administering animal reached the correct arm, so that the number of morphine injections (and thus the dose) received by yoked controls was equivalent but independent of their behavior or location in the maze, as previously described (35).
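The yoked-control contingency described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Trial` class, the function name, and the four-trial session are hypothetical, and only the 50 ng dose per injection comes from the text.

```python
from dataclasses import dataclass

DOSE_NG = 50  # ng of morphine per injection, as stated in the text


@dataclass
class Trial:
    master_correct: bool  # paired self-administering mouse reached the cued arm
    yoked_correct: bool   # yoked mouse's own choice (never used for delivery)


def deliver_injections(trials):
    """Return per-trial injection flags (1/0) for the master and yoked mouse."""
    master = [1 if t.master_correct else 0 for t in trials]
    # The yoked mouse is injected whenever its paired master earns an
    # injection, whatever the yoked mouse itself is doing at that moment.
    yoked = list(master)
    return master, yoked


# Hypothetical four-trial session
session = [Trial(True, False), Trial(False, True), Trial(True, True), Trial(True, False)]
master, yoked = deliver_injections(session)
total_master_ng = sum(master) * DOSE_NG
total_yoked_ng = sum(yoked) * DOSE_NG
```

In this sketch the two animals always receive the same cumulative dose, while only the master's injections are contingent on its own choices, which is the property the yoked design relies on.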

Small pieces (5 mm<sup>2</sup>) of naturally flavored crisps were chosen as food reward after pilot studies showing that motivation to learn the task was obtained with a very low level of deprivation (<5%). Therefore, the same level of deprivation was applied to all groups to ensure a comparable physiological state in all animals. Intracranial drug self-administration was used as a model of reinforcement learning, similarly to intracranial self-stimulation (47). This model presented several advantages. Food or drugs were self-administered under the same conditions, avoiding handling during behavioral tests and thus allowing direct comparison of learning in drug- and food-reinforced animals. We used morphine as a means of pharmacologically activating VTA–DA neurons without directly altering function across all brain regions (35). The dose of morphine was selected on the basis of optimal learning performance established in dose–effect curves reported previously using the same task (48).

#### The Water-Maze Competition Task

The test used is an adaptation of the previously published water-maze competition task in the mouse (6, 13, 38). The training regimen is an important factor in the modulation of interactions between memory systems (49, 50). We used an acquisition protocol allowing a balanced expression of HPC- and striatum-dependent learning (13). The last training session of the Y-maze learning task was followed by a 72 h resting period, after which the water-maze task started in a subgroup of mice [*n* = 33, composed of the following: morphine reward (*n* = 8); crisp reward (*n* = 8); aCSF (*n* = 9); and yoked morphine (*n* = 8)]. This delay allowed for a complete washout of morphine from the animal's brain (51), thus avoiding any effect of residual morphine on brain function during the competition task. Briefly, the task is composed of two stages. During the acquisition phase (10 trials, ITI 10 min), animals started from a constant position and had to reach a submerged platform whose location was indicated both by a cue at its center and by numerous extra-maze visual cues. The platform remained in a fixed position for the whole acquisition phase. On the following day, mice underwent the retention test (five trials, ITI 10 min). One platform remained in the spatial location learned the day before, whereas a second, new platform marked by the cue used during acquisition was introduced and located in the opposite quadrant. The starting point was changed to be equidistant from both platforms.
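To make the retention-test readout concrete, the following minimal sketch scores each of the five retention trials as a "cue" response (the mouse reaches the new, cue-marked platform) or a "place" response (it reaches the platform in the learned location) and converts the counts to percentages. The function name and the example choices are hypothetical and are not data from the study.

```python
from collections import Counter


def strategy_percentages(choices):
    """Percentage of 'cue' and 'place' responses over the retention trials."""
    counts = Counter(choices)
    n = len(choices)
    return {kind: 100.0 * counts.get(kind, 0) / n for kind in ("cue", "place")}


# Hypothetical mouse: one label per retention trial (five trials in the task)
hypothetical_mouse = ["cue", "cue", "place", "cue", "cue"]
pct = strategy_percentages(hypothetical_mouse)
```

Because the start point is equidistant from both platforms, a percentage far from 50% in either direction is what would indicate a preferred strategy in this kind of scoring.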

#### Immunohistochemistry

Concurrently with the water-maze competition task, i.e., 72 h after completion of the Y-maze training, brains of another subgroup of mice [*n* = 30; composed of the following: morphine reward (*n* = 8); crisp reward (*n* = 7); aCSF (*n* = 7); and yoked morphine (*n* = 8)] were removed to assess changes in regional expression of pCREB as previously described (41). We used unbiased stereology in the following areas according to Paxinos and Franklin (52): subfields of the dorsal HPC (CA1, CA3), the DS, the shell part of the NAC, and the prefrontal cortex (PFC; infralimbic and prelimbic parts merged). Cell counts were expressed as the mean number of pCREB-positive nuclei per square millimeter. Under anesthesia, animals were perfused transcardially with a cold (4°C) solution of 4% paraformaldehyde in PB (0.1 M, pH 7.4). Brains were then removed and postfixed overnight in the same fixative at 4°C. Brains were then placed in a sucrose solution (30% in Tris buffer 0.1 M, pH 7.4) overnight and frozen to cut 50-μm coronal free-floating sections with a freezing microtome (Leica) for pCREB immunohistochemistry. All solutions contained the phosphatase inhibitor sodium fluoride (2.1 g/L). Sections were collected in Tris buffer (0.1 M). After elimination of endogenous peroxidase activity by a 30 min incubation in H<sub>2</sub>O<sub>2</sub> and a preincubation step in saturation buffer (bovine serum albumin 1%, goat serum 3%, Triton X100 0.2%), sections were incubated for 48 h with rabbit anti-pCREB antibody (1:6,000 in saturation buffer, Millipore, Billerica, MA, USA). Subsequently, sections were incubated with biotinylated goat anti-rabbit antibody (1:2,000 in Tris buffer, Jackson Immunoresearch), followed by an avidin-biotinylated horseradish peroxidase complex (Vectastain Elite Kit, Vector Laboratories, Burlingame, CA, USA). The peroxidase reaction end product was visualized in a Tris solution containing diaminobenzidine tetrahydrochloride (5%). Sections were mounted on gelatin-coated slides, air-dried, dehydrated, coverslipped with Eukitt, and examined by light microscopy. The quantification of pCREB-positive nuclei was carried out at 10× magnification, which yielded a field of view of 849 μm × 637 μm. At least six serial sections for each brain region were digitized bilaterally and analyzed using a computerized image analysis system (Biocom, Visiolab 2000, V4.50). The number of nuclei was quantified blind to experimental conditions.
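As a worked example of the unit conversion implied here, the sketch below turns a raw count of nuclei in one field into a density per square millimeter. It assumes the whole 849 μm × 637 μm field is counted, which the text does not state (the actual counting region may have been smaller), and the count of 120 nuclei is purely hypothetical.

```python
FIELD_WIDTH_UM = 849.0   # field of view width at 10x, from the text
FIELD_HEIGHT_UM = 637.0  # field of view height at 10x, from the text


def field_area_mm2(width_um=FIELD_WIDTH_UM, height_um=FIELD_HEIGHT_UM):
    """Area of the field of view in square millimeters."""
    return (width_um / 1000.0) * (height_um / 1000.0)


def density_per_mm2(nuclei_count, area_mm2):
    """Convert a raw nuclei count into nuclei per square millimeter."""
    return nuclei_count / area_mm2


area = field_area_mm2()               # about 0.54 mm^2
density = density_per_mm2(120, area)  # 120 nuclei is a hypothetical count
```

Under this assumption one field covers roughly half a square millimeter, so a raw count is a little under doubled when expressed per square millimeter.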

## Experiment II: Inhibition of PKA Activity within the DS

#### Surgery

An additional cohort of mice (*n* = 15) received a guide cannula 1.5 mm above the VTA and were implanted bilaterally with two guide cannulae (gauge 30) 1 mm above the mediolateral midline of the DS (from Bregma: AP: +0.5 mm, ML: ±1.9 mm, DV: −2.0 mm from skull surface), so that the stainless-steel injection cannulae (gauge 36) used for bilateral infusions projected to 1 mm below the tip of the guide-cannula.

#### Rp-8Br-cAMPS Infusions

The 8-bromoadenosine-3′,5′-cyclic monophosphorothioate, Rp-isomer (Rp-8Br-cAMPS; Enzo Life Science) is a lipophilic analog of Rp-cAMPS, a well-characterized membrane-permeable competitive inhibitor of cyclic AMP-dependent protein kinase (PKA), which discriminates between PKA and other cAMP receptors (53). On the basis of previous behavioral and CREB expression studies in C57BL/6 mice (35, 54), Rp-8Br-cAMPS was dissolved in aCSF and delivered at a dose of 0.4 nmol in 0.5 μl per hemisphere. Bilateral infusions were performed before the last Y-maze session to avoid disruption of encoding during the water-maze task, which was run 72 h later. Ten minutes before the last training session, mice were infused over 3 min in their home cage with either Rp-8Br-cAMPS (*n* = 6) or aCSF (*n* = 6) into the DS, using a double infusion pump (Elite 11, Harvard®). Injectors remained connected for 2 min after the infusion. Mice were then allowed to rest for 5 min.
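The infusion parameters can be checked with simple arithmetic. The sketch below derives the solution concentration from the stated amount and volume, and confirms that the three stated intervals add up to the 10 min lead time before the session. It assumes that the 3 min refers to the time taken to deliver the full 0.5 μl per hemisphere, which the text implies but does not state outright; variable names are illustrative.

```python
amount_nmol = 0.4  # per hemisphere, from the text
volume_ul = 0.5    # per hemisphere, from the text

# 1 nmol/ul equals 1 mmol/L, so this ratio is the concentration in mM
concentration_mM = amount_nmol / volume_ul

infusion_min = 3  # infusion duration
dwell_min = 2     # injectors left in place after the infusion
rest_min = 5      # rest before the training session

# Flow rate, assuming the full volume is delivered over the 3 min infusion
rate_ul_per_min = volume_ul / infusion_min
total_lead_time_min = infusion_min + dwell_min + rest_min
```

The resulting concentration is 0.8 mM, and the 3 + 2 + 5 min sequence matches the statement that infusions began ten minutes before the last training session.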

## Statistical Analysis

#### Y-Maze

The mean number of correct responses and the mean choice latency per trial were analyzed using a two-way analysis of variance (ANOVA) (StatView 5.01 statistical software, Abacus Concept, Piscataway, NJ, USA) with "Reward" type as the between-subjects factor and "Session" as the within-subjects repeated factor. Day-by-day between-group comparisons for latencies and responses were performed using a one-way ANOVA with "Reward" as the between-subjects factor. Significant main effects were further analyzed (*post hoc*) using Newman–Keuls tests. One-sample *t*-tests were used to compare performance in the last training session against chance level (5/10 correct responses).
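For readers unfamiliar with the comparison against chance, the sketch below computes a one-sample *t* statistic against a chance level of 5 correct responses out of 10, using only the Python standard library. The scores are hypothetical, the function name is illustrative, and the *p*-value lookup (which needs a *t* distribution) is deliberately left out.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


CHANCE_LEVEL = 5  # 5/10 correct responses, as stated in the text

# Hypothetical last-session scores (correct responses out of 10) for eight mice
scores = [8, 9, 7, 8, 9, 8, 7, 8]
t_stat, df = one_sample_t(scores, CHANCE_LEVEL)
```

A large positive statistic with these degrees of freedom would indicate performance reliably above chance; a group performing near 5/10 would give a value close to zero.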

### Water Maze

Analysis of the swim distance within the acquisition or retention phase was performed using a two-way ANOVA with "Reward" type as the between-subjects factor and "Trial" as the within-subjects repeated factor. Mean swim speed over all acquisition or retention trials was analyzed using a one-way ANOVA with "Reward" as the between-subjects factor. For the water-maze retention test, the percentage of cue or place responses and the percentage of time spent in the enlarged platform area were compared across groups using unpaired Student's *t*-tests.

#### Immunochemistry

Immunostaining data were expressed as the mean number of pCREB-positive nuclei per square millimeter for each hemisphere. Six consecutive serial sections were examined bilaterally for all regions. We found no left–right difference; therefore, data were averaged to produce group mean ± SEM. One-way ANOVAs with "Reward" as between-group factor followed by *post hoc* Newman–Keuls *t*-tests were performed.
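The quantification described above (counts converted to densities, averaged over sections and hemispheres, then summarized as group mean ± SEM) can be sketched as follows. All counts, the sampled area, and the function names are hypothetical and serve only to make the arithmetic explicit; this is not the image-analysis pipeline used in the study.

```python
import math
from statistics import mean, stdev

def density_per_mm2(cell_count, area_mm2):
    """Convert a nucleus count in one section into a density (nuclei/mm^2)."""
    return cell_count / area_mm2

def animal_density(left_counts, right_counts, area_mm2):
    """Average the density over all sections of both hemispheres for one animal."""
    densities = [density_per_mm2(c, area_mm2) for c in left_counts + right_counts]
    return mean(densities)

def mean_sem(values):
    """Return (mean, SEM) for a list of per-animal values."""
    return mean(values), stdev(values) / math.sqrt(len(values))

# Hypothetical example: three animals, two sections per hemisphere, 0.5 mm^2 each.
group = [
    animal_density([10, 20], [30, 40], 0.5),
    animal_density([15, 25], [20, 30], 0.5),
    animal_density([12, 18], [22, 28], 0.5),
]
group_mean, group_sem = mean_sem(group)
print(f"{group_mean:.1f} ± {group_sem:.1f} nuclei/mm^2")
```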

### RESULTS

### No Differential Effect of Food vs Drug Rewards on Learning Performance in the Y-Maze Task

As illustrated in **Figure 3A**, crisp- and morphine-rewarded mice learned the cue-guided Y-maze discrimination task similarly. The number of correct responses for these two groups increased over sessions, whereas aCSF controls performed at chance level and did not improve across trials (two-way ANOVA: Reward effect: *F*2,44 = 46.90, *p* < 0.001; Session effect: *F*9,396 = 4.18, *p* < 0.001; Reward × Session interaction: *F*18,396 = 3.18, *p* < 0.001; *post hoc*: Crisps vs aCSF, *p* < 0.001; Morphine vs aCSF, *p* < 0.001; Morphine vs Crisps, *p* > 0.05). Both crisp- and morphine-rewarded mice chose the reinforced arm significantly more often than aCSF controls from day 2 to day 10 (all *p* < 0.05) and displayed very similar learning rates, as evidenced by their overlapping learning curves. Analysis of the mean latency to complete trials (**Figure 3B**) revealed that this parameter significantly decreased over sessions in both morphine- and crisp-rewarded mice, but not in mice that received aCSF (Reward effect: *F*2,44 = 8.72, *p* < 0.001; Session effect: *F*9,396 = 8.38, *p* < 0.001; Reward × Session interaction: *F*18,396 = 2.23, *p* = 0.027; *post hoc*: Crisps vs aCSF, *p* < 0.01; Morphine vs aCSF, *p* < 0.01; Morphine vs Crisps, *p* > 0.05).
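The degrees of freedom reported above follow the usual bookkeeping of a mixed-design ANOVA with one between-subjects and one repeated factor. The sketch below makes that bookkeeping explicit; the function name is illustrative, and the total of 47 animals is an inference from the reported error term (44 + 3 groups), not a figure stated in this section.

```python
def mixed_anova_df(n_subjects, n_groups, n_sessions):
    """Degrees of freedom (effect, error) for a two-way mixed-design ANOVA."""
    between = n_groups - 1                 # Reward effect
    error_between = n_subjects - n_groups  # subjects within groups
    within = n_sessions - 1                # Session effect
    error_within = error_between * within  # Session x subjects within groups
    return {
        "reward": (between, error_between),
        "session": (within, error_within),
        "interaction": (between * within, error_within),
    }

# Three reward groups, ten sessions, and an assumed total of 47 mice.
print(mixed_anova_df(n_subjects=47, n_groups=3, n_sessions=10))
```

Under these assumptions the function returns (2, 44), (9, 396), and (18, 396), which matches the subscripts of the *F* values given in the text.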

### Morphine Self-administration Elicits Long-lasting CREB Phosphorylation in the DS while Reducing pCREB Expression in the HPC

pCREB immunostaining was performed to reveal the brain regional activation state in animals of each group 72 h after the last Y-maze session. Expression levels are detailed in **Figure 4**. At this delay, previously food-rewarded mice and aCSF controls exhibited similar pCREB levels in the analyzed structures. In contrast, morphine-exposed animals exhibited higher pCREB levels than the other groups in the DS, and this effect was significantly heightened when morphine was self-administered as compared with yoked subjects (Reward effect: *F*3,26 = 26.70, *p* < 0.001; *post hoc*: Morphine vs aCSF, *p* < 0.001; Morphine vs Crisps, *p* < 0.001; Morphine vs Yoked, *p* < 0.001; Yoked vs aCSF, *p* = 0.04; Yoked vs Crisps, *p* = 0.03). Statistical analysis also yielded an elevated level of pCREB in the NAC of morphine self-administering mice (Reward effect: *F*3,26 = 3.19, *p* = 0.039; *post hoc*: Morphine vs aCSF, *p* = 0.006; Morphine vs Crisps, *p* = 0.056; Morphine vs Yoked, *p* = 0.071). In contrast, pCREB expression in the dorsal CA1 of the HPC was significantly reduced in mice with a history of morphine self-administration (Reward effect: *F*3,26 = 4.21, *p* = 0.014; *post hoc*: Morphine vs Crisps, *p* = 0.02; Morphine vs Yoked, *p* = 0.002; Morphine vs aCSF, *p* > 0.05). A similar, although non-significant, tendency was also observed in the CA3 (Reward effect: *F*3,26 = 1.30 ns). In the PFC, pCREB levels were slightly elevated in Yoked subjects but this effect did not reach significance (Reward effect: *F*3,26 = 2.83 ns). **Figure 5** summarizes region-dependent relative changes and points to a drastic increase in the DS, but a decrease in the dorsal HPC (CA1–CA3).

Crisps group from day 2 to 10: \*\**p* < 0.01; vs Morphine group from day 2 to 10: °°*p* < 0.01). (B) Analysis of mean (±SEM) latencies to complete a trial (in seconds) over the 10 training sessions. Both rewarded groups decreased their choice latency over trials and completed trials faster than the aCSF group (vs Crisps group: \**p* < 0.05; \*\**p* < 0.01; \*\*\**p* < 0.001; vs Morphine group: °*p* < 0.05; °°*p* < 0.01; °°°*p* < 0.001).

### History of Morphine Self-administration Promotes Cue-Guided Learning Strategy

As shown in **Figure 6A**, all animals learned to find the platform efficiently over trials. However, the previously morphine-rewarded group displayed better learning performance than aCSF-injected animals, whereas subjects having experienced non-contingent morphine administrations (yoked controls) swam longer distances than any other group (ANOVA Reward effect: *F*3,29 = 6.71, *p* = 0.001; Trial effect: *F*9,261 = 24.35, *p* < 0.001; *post hoc*: Morphine vs aCSF, *p* = 0.03; Yoked vs aCSF, *p* = 0.02; Yoked vs Morphine, *p* = 0.001; Yoked vs aCSF, *p* = 0.009; Crisps vs aCSF, n.s.; Crisps vs Morphine, n.s.). These differences were abolished during the competition task. Analysis of the mean swim speed over acquisition trials pointed to group differences (Reward effect: *F*3,326 = 26.57, *p* < 0.001): previously drug-rewarded mice swam faster than food-rewarded subjects (all *p* < 0.001) and aCSF controls (all *p* < 0.001) (**Figure 6B**). These differences were also observed in the retention test (Reward effect: *F*3,161 = 11.26, *p* < 0.001; Yoked vs aCSF, Yoked vs Crisps, and Morphine vs aCSF, *p* < 0.001; Morphine vs Crisps, *p* = 0.02).

Spatial vs cue-oriented responses during the retention test are shown in **Figure 7A**. Behavior of previously drug self-administering mice was dominated by the single cue, whereas behavior of food-rewarded, yoked, and aCSF control animals was equally influenced by spatial information and the cue (*t*-test vs chance level of 50%: Morphine *t* = 2.75, *p* = 0.02; aCSF, Crisps, Yoked all *p* > 0.20). Animals that had previously experienced morphine self-administration spent more time in the enlarged cued-platform zone than all the other groups (Reward effect: *F*3,161 = 2.66, *p* < 0.05; *post hoc* tests: Morphine vs aCSF, *p* < 0.05; Morphine vs Crisps, *p* < 0.01; Morphine vs Yoked, *p* < 0.05).

Moreover, morphine self-administering animals swam more in the enlarged cued-platform zone than in the spatial one during retention trials (unpaired *t*-test: Morphine, *p* = 0.008; Crisps, Yoked, and aCSF all *p* > 0.05; **Figure 7B**).

### Inhibition of PKA/CREB Pathway in the DS Abolishes the Bias toward Cue-Oriented Learning

Pre-injection of Rp-8Br-cAMPS had no effect on performance during the last Y-maze acquisition session (**Figure 8A**). Treated animals were tested in the water-maze competition task 72 h later. Rp-8Br-cAMPS or aCSF injections into the DLS did not alter swim distances to the platform during either the acquisition or retention phase of the water-maze task (**Figure 8B**). Rp-8Br-cAMPS pretreatment, however, completely abolished the preferential use of the cue-guided learning strategy that was observed in aCSF-treated mice. As evidenced by the percentage of responses over the five retention trials summarized in **Figure 8C**, Rp-8Br-cAMPS-treated animals displayed as many spatial as cue-oriented responses (*t*-test against theoretical 50% chance level: *p* > 0.05), whereas subjects receiving the vehicle persisted in choosing the cued platform over the spatial platform (*t*-test against chance level: *t* = 3.47, *p* = 0.02). Histological control of all pretreated animals showed that injection sites were located mainly in the DLS (**Figure 8D**), as can be estimated from the study of Yin and Knowlton (55).

### DISCUSSION

We previously reported that drug-reinforced animals are selectively impaired in the acquisition of a spatial discrimination task, but not in the cued version of the same task (35). This finding suggests that drug rewards may induce a shift toward cue-oriented behavior and striatum-dependent forms of learning. In the present study, we challenged this view by assessing the selection of spatial vs cue-oriented learning strategies in a water-maze competition task (13). We compared mice having experienced a Y-maze discrimination task rewarded with either food, non-contingent or self-administered morphine. We now show that animals with a history of drug self-administration rely almost exclusively on a cue-guided strategy to reach the platform. In contrast, animals that passively received the same amount of morphine, as well as food-rewarded subjects, retained a flexible use of spatial and cued strategies. Along with their cue-dependent behavior, animals with a history of morphine self-administration displayed a persistent increase in pCREB within the DS and the NAC, but a decrease in the dorsal CA1. This expression pattern was bilateral, thus ruling out any possibility that unilateral activation of these brain regions may underlie cognitive inability. Such an inverse relationship between striatal and hippocampal pCREB expression, as demonstrated by the present behavioral, CREB-imaging, and pharmacological data, fits well with the view that a functional antagonism between HPC and DS takes place during learning. Consistently, decreasing HPC function or enhancing DS processing using pharmacological or genetic manipulation of pCREB levels induces a predominant use of striatum-dependent learning in navigational tasks (12, 38, 39, 56). Humans using response strategies in navigational tasks exhibit increased fMRI activity and gray matter in the DS (57, 58).

The habit-forming effects of drugs of abuse are well documented (3, 59). Repeated systemic or intra-VTA administration of amphetamine or morphine induces an increase in locomotor activity and repetitive, stereotyped behaviors (60–62). This behavioral sensitization can disrupt action–outcome (A–O) learning, and repeated preexposure to a psychostimulant promotes habitual responding in a DA-D1 receptor-dependent manner (63, 64). We show here that VTA morphine reward not only promotes S–R learning but also increases the bias toward subsequent striatum-dependent learning. This is consistent with the view that repeated cued drug self-administration facilitates the use of striatum-dependent learning strategies (65). This cue attractiveness could be related to a sign-tracking profile as recently defined in rats (66). Sign-tracking refers to individuals that are more likely to approach cues in a novel environment, whereas goal trackers try to locate the reward directly (food tray). Interestingly, sign trackers exhibit phasic DA signals shifting from the unconditional stimulus (US food) to the conditional stimulus (CS cue), whereas goal trackers maintain an elevated DA response to the CS and US. Rats selectively bred for high reactivity to a novel environment show a sign-tracking response and an increased propensity to self-administer cocaine, suggesting that they could represent an animal model of addiction vulnerability (67). Identification of common neural features of sign-tracking (rat) and cue attractiveness (mouse) is an interesting prospect for future addiction research.

There is ample evidence that cue-dependent control of behavior in drug addiction relies on neuroadaptations occurring in the PKA/pCREB signaling pathway within cortico-limbic-striatal and amygdala circuits (1, 68–70). Chronic drug use leads to an aberrant over-learning of drug-related cues, and craving or relapse can be induced by presenting such cues (71–73). Here, we provide evidence that morphine self-administration upregulates CREB activity within the DS, facilitating the recruitment of a learning strategy depending on cues. Concurrently, pCREB level was reduced in dorsal CA1 of the HPC, a region involved in flexible, spatial learning. Reward-dependent increase in striatal DA facilitates LTP at the level of medium spiny neurons of the direct pathway (74), and this form of LTP depends on D1-DA receptors or co-activation of D1/NMDA receptors (75, 76). Chronic drug-induced modulation of DA D1/D2 receptor ratio in the DS leads to an increased excitability of this brain region in humans (77). Together, these data strongly suggest that drug-reinforced learning resulted in hyperactivity of the DS. Consistently, we show that blocking striatal PKA activity with Rp-8Br-cAMPS restored a balanced expression of cued and spatial navigation strategies. PKA is the main kinase involved in CREB phosphorylation through DA D1 signaling (78–80). PKA activity maintains cue-dependent control of behavior through a DA/glutamate signaling cascade (68). Importantly, CREB may also be phosphorylated *via* the extracellular signal-regulated kinase pathway, its recruitment depending mainly on glutamatergic inputs (81–83). The efficiency of Rp-8Br-cAMPS in restoring spatial learning could reflect either a predominant role of the DA-dependent striatal PKA, or an alteration of coincident DA-glutamate signaling. In any case, it is consistent with a role of DS DA in navigational tasks (55, 84), the inhibiting effects of DS electrical stimulation on the HPC (85), and the improving effect of DS lesions on spatial learning (12).

Since we previously demonstrated that Rp-8Br-cAMPS did not block CREB activity in the adjacent ventral striatum, it is unlikely that this inhibitor had to reach distant, extra-striatal regions to exert its effect (35). This view is also supported by the observation that transgenic mice expressing a dominant-negative mutant of CREB show specific impairments in both CREB activity in the DS and cued learning (12). However, at least three subregions have been described within the DS itself based on functional data: the anterior dorsomedial, the posterior dorsomedial, and the DLS (37, 55, 86–90). One limitation of our PKA/CREB inhibition study is that Rp-8Br-cAMPS injections targeted the midline of the DS; therefore, it is not possible to attribute its effects selectively to one of these subregions. Yet, histological control points to the DLS; thus, the present restorative effects of PKA inhibition on place learning are consistent with the lateral/medial dissociation of the DS, respectively, associated with habitual/A–O responses in instrumental and drug-maintained behaviors, or response/place learning (37, 55, 86–90). Finally, since food-trained mice exhibited neither persistent CREB activity nor learning bias in the WM competition task, they were not tested for Rp-8Br-cAMPS, leaving open the question of its action in non-biased animals. We and others have reported that the effects of PKA inhibitors on memory typically depend on the region that is targeted: intra-HPC administration blocks spatial memory, whereas intra-DS and intra-PFC infusions disrupt striatum-dependent learning and cue-induced relapse (35, 91–93).

One intriguing observation of the present study is that yoked morphine did not have the same cognitive impact as self-administered morphine. During the Y-maze task, all mice were trained on a cued protocol, raising the possibility that a morphine-training interaction might explain subsequent preference for the cued learning strategy. The absence of preferential cued learning (and DS-CREB hyperactivity) in the yoked-control group, in which each subject received non-contingently the same amount of morphine as self-administering animals, demonstrates that this interaction is not sufficient to elicit this learning bias. Instead, it suggests that response contingency is involved in this form of neuroplasticity. Profound differences between self-administered and yoked cocaine rats have been reported in electrically evoked [<sup>3</sup>H]DA release (94). Self-administering animals exhibit sensitized DA release in the NAC, DS, and medial prefrontal cortex up to 3 weeks after cessation of cocaine self-administration, whereas terminal DA release is sensitized only in the NAC core in yoked subjects (94). Although the response contingency is clearly necessary, it is not sufficient to elicit such a cognitive bias, as it was not observed in food-rewarded animals. Our results suggest that reward value may be another critical component required for this long-lasting behavioral/cellular plasticity. The strong morphine-induced CREB activity observed in the NAC argues in favor of this hypothesis. Indeed, there is evidence that the reinforcer value plays a role in the facilitation of S–R learning (64).

There are striking similarities in the impact of emotional events on learning processes, whether their valence is positive (reward) or negative (stress). Both stress and drugs promote habit learning (15–19). Mechanisms underlying this effect remain to be fully understood, yet it has been proposed that drugs favor S–R association by impairing retrieval or utilization of outcomes (3). A growing body of evidence suggests that in humans, chronic consumption of drugs of abuse impairs HPC- and PFC-dependent learning tasks (95, 96), whereas habit learning is mostly spared or even enhanced by drug consumption (30, 97, 98). Accordingly, our results further reveal that morphine self-administration leads to a functional imbalance between the HPC and DS, prompting the use of the striatal-dependent habit learning system. Future work should aim at detecting a similar hippocampo-striatal imbalance in human abstinent drug users, using functional or structural brain imaging. Enduring states of differential excitability could represent a form of disconnection syndrome contributing to the maintenance of addictive behaviors. Interestingly, young adults expressing a response learning strategy in a virtual navigational task use more drugs than spatial learners (99). These data raise a critical question that remains to be specifically addressed by future research: could emotional events such as rewards, stressors, or even prenatal stress promote the habit system early on in life (100)? A corollary issue of tremendous therapeutic interest is whether pharmacological treatments or cognitive therapies aiming at restoring HPC activity could maintain protracted abstinence or prevent relapse.

In conclusion, we provide behavioral, pharmacological, and cellular evidence suggesting that morphine reward elicits a cognitive bias toward the use of cue-guided learning strategies, an effect specifically observed in animals receiving contingent drug injections (self-administration). This cognitive bias relies on the persistent upregulation of learning-induced CREB phosphorylation in the DS and could be reversed by locally inhibiting the PKA/CREB signaling pathway. We suggest that such drug-induced biases are likely to play a critical, yet overlooked role in addictive behaviors, as they could counteract pharmacological treatments of addiction. This calls for further exploration of neural mechanisms involved in drug-induced cognitive biases toward cue-sensitive forms of learning.

### AUTHOR CONTRIBUTIONS

MB, J-LG, VB, LS, and VD contributed to the writing of the manuscript. MB and MH performed experiments. MB, J-LG, and VD designed experiments.

### ACKNOWLEDGMENTS

The authors thank all the personnel of the Animal Facility of the Institute de Neuroscience Cognitive et Intégrative d'Aquitaine (INCIA) for their help throughout the study, Gilles Courtand for his technical support in image analysis, and Laurence Decorte and Nadia Henkous for their technical assistance regarding these experiments. The authors thank Dr. M. E. Walton and D. Bannerman for their comments on the initial version of this manuscript.

### FUNDING

This work was supported by the Centre National de la Recherche Scientifique (CNRS), the University of Bordeaux, and the Regional Council of the Aquitaine Region (MB and VD). MB was awarded a PhD fellowship co-funded by the CNRS and the Regional Council of the Aquitaine Region.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Baudonnat, Guillou, Husson, Bohbot, Schwabe and David. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Neuronal Nicotinic Receptors Are Crucial for Tuning of E/I Balance in Prelimbic Cortex and for Decision-Making Processes

*Elsa Cécile Pittaras<sup>1,2‡</sup>, Alexis Faure<sup>1‡</sup>, Xavier Leray<sup>1†</sup>, Elina Moraitopoulou<sup>1</sup>, Arnaud Cressant<sup>3</sup>, Arnaud Alexandre Rabat<sup>2</sup>, Claire Meunier<sup>1</sup>, Philippe Fossier<sup>1</sup> and Sylvie Granon<sup>1</sup>\**

*<sup>1</sup>CNRS 9197, Institut de Neuroscience Paris Saclay, Orsay, France, <sup>2</sup>Institut de Recherche Biomédicale des Armées et Unité Fatigue & Vigilance, Brétigny-sur-Orge, France, <sup>3</sup>Brain@vior, Saint-Prest, France*

#### *Edited by:*

*Daniel Beracochea, University of Bordeaux 1, France*

#### *Reviewed by:*

*Roberto Ciccocioppo, University of Camerino, Italy Karine Guillem, CNRS, France*

*\*Correspondence: Sylvie Granon sylvie.granon@u-psud.fr*

#### *†Present address:*

*Xavier Leray, Neurophotonics Laboratory, CNRS UMR8250, Sorbonne Paris Cité, Paris Descartes University, Paris, France*

*‡ Elsa Cécile Pittaras and Alexis Faure contributed equally to this work.*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 29 May 2016 Accepted: 26 September 2016 Published: 14 October 2016*

#### *Citation:*

*Pittaras EC, Faure A, Leray X, Moraitopoulou E, Cressant A, Rabat AA, Meunier C, Fossier P and Granon S (2016) Neuronal Nicotinic Receptors Are Crucial for Tuning of E/I Balance in Prelimbic Cortex and for Decision-Making Processes. Front. Psychiatry 7:171. doi: 10.3389/fpsyt.2016.00171*

Rationale: Decision-making is an essential component of our everyday life that is commonly impaired in a myriad of psychiatric conditions, such as bipolar and impulse control disorders, addiction and pathological gambling, or schizophrenia. A large cerebral network encompassing the prefrontal cortex, the amygdala, and the nucleus accumbens is activated for efficient decision-making.

Methods: We developed a mouse gambling task well suited to investigate the influence of uncertainty and risk in decision-making and the role of neurobiological circuits and their monoaminergic inputs. Neuronal nicotinic acetylcholine receptors (nAChRs) of the PFC are important for decision-making processes but their presumed roles in risk-taking and uncertainty management, as well as in cellular balance of excitation and inhibition (E/I) need to be investigated.

Results: Using mice lacking β2-containing nAChRs (β2<sup>−</sup>/<sup>−</sup> mice), we provide the first evidence of the crucial role of nAChRs in the fine tuning of prefrontal E/I balance, together with PFC, insular, and hippocampal alterations in gambling behavior that are likely due to altered sensitivity to penalties and altered flexibility. Risky behaviors and perseveration in the extinction task were largely increased in β2<sup>−</sup>/<sup>−</sup> mice as compared to control mice, suggesting an important role of nAChRs in the ability to make appropriate choices adapted to the outcome.

Keywords: brain activation, cfos, prefrontal cortex, gambling behaviors, risk-taking, anxiety, social behavior

## INTRODUCTION

Decision-making is an essential component of our everyday life. According to Doya (1), a decision follows four steps: recognizing the decision situation, evaluating the possible options (valuation), selecting the appropriate action while inhibiting all other non-optimal ones (action selection), and eventually learning about this action by evaluating the outcome (learning). These processes are modulated by various factors, such as motivational internal state, risk, and uncertainty. The role of valuation in decision-making can be studied by modifying the value of each option using devaluation procedures or by changing their relative quantity or quality (2). The ability to inhibit non-optimal actions can be revealed using reversal and/or extinction procedures that require adaptation to a novel rule (3). Finally, the influence of uncertainty and risk-taking in decision-making can be challenged with gambling tasks initially developed in humans, and recently adapted for rodents (4–8).

At a neurobiological level, making decisions requires activation of cortico-striatal loops that can be separated into a limbic loop (affective/emotion) and a cognitive loop (executive/motor). The limbic loop would encompass the orbitofrontal cortex, the amygdala and the nucleus accumbens (NAcc), and the cognitive loop would be composed of the prelimbic, infralimbic and anterior cingulate cortices, and the dorsal striatum (9). The limbic loop would participate in the evaluation of behavioral outcomes in terms of cost, risk, and amount (9), while the cognitive loop would rather play a role in selecting and adapting behavioral choices in response to change. When facing high uncertainty and risk, as in gambling tasks or in social situations, both loops are involved (9, 10). Recent evidence also implicates the insular cortex in decision-making under risk or uncertainty (11) and in the development of compulsive behaviors (12). Multiple neuromodulators, such as dopamine, noradrenaline, and serotonin, are highly involved in these loops and affect various components of the decision-making process (1, 4, 9). At a cellular level, decision-making processes are suggested to require a precise control of the E/I balance within cortico-striatal circuits (13, 14). In a recent rat study, modulation of GABAergic function within the medial prefrontal cortex (PFC) has been demonstrated to modulate decision-making in a gambling task (15).

In numerous psychiatric pathologies, alteration of processes involved in decision-making leads to maladaptive choices. These disabilities might underpin behavioral defects in many psychiatric disorders, such as bipolar and impulse control disorders (16, 17), addiction, or pathological gambling (18, 19). Elevation in the E/I balance within cortico-striatal circuits has been associated with many of these pathologies (14, 20). Indeed, alteration of the PFC E/I ratio has been proposed to trigger cognitive and social dysfunctions in pathologies such as autism and schizophrenia (20–22). Better knowledge of the factors that could influence E/I balance within cortico-striatal circuits, and of its impact on decision-making abilities, is therefore crucial.

The major neuronal nicotinic receptors, nicotinic acetylcholine receptors (nAChRs), are pentameric oligomers whose principal subunit combinations are α4β2 for heteromeric receptors and α7 for homomeric ones (23, 24). Endogenous acetylcholine (ACh) modulates the release of numerous neurotransmitters in these cortico-striatal circuits via its binding onto nAChRs presynaptically located on dopaminergic, noradrenergic, and serotonergic terminals (25). β2<sup>−</sup>/<sup>−</sup> mice (null mice for nAChRs containing the beta2 subunit) exhibited marked alterations in exploration and navigation (26, 27), and in the organization of social behaviors, reflecting impaired behavioral flexibility (28–30). In the PFC, both functional β2-nAChRs and monoaminergic inputs are necessary for showing organized social behaviors (28, 31). Alterations in β2<sup>−</sup>/<sup>−</sup> mice have previously been reported in a social decision-making task in which natural rewards like food, novelty seeking, and social contact compete, with a high level of uncertainty associated with a social conspecific having, by nature, unpredictable behavior (29, 32, 33). By contrast, when such competition existed without uncertainty, β2<sup>−</sup>/<sup>−</sup> mice were not impaired and exhibited normal choices (33). This highlighted the crucial importance of uncertainty in decision-making for β2<sup>−</sup>/<sup>−</sup> mice. In addition, as β2-nAChRs are crucial for PFC activity (28, 34), it is relevant to question their putative implication in the PFC E/I balance. To date, we lack information on the abilities of β2<sup>−</sup>/<sup>−</sup> mice in complex decision-making with high risk and/or under uncertainty aside from social situations.

In this framework, our current aim is to test whether β2-nAChRs could be one of the factors influencing the excitation/inhibition balance within the PFC. In addition, we address the selective role of these receptors in behavioral tasks that target different aspects of decision-making processes: a gambling task that involves uncertainty and risk management, and a novel decision-making task that involves the valuation and devaluation of various outcomes (social, food, and novelty) and which allowed us to investigate behavioral extinction. Finally, we measured cFos expression in multiple brain structures following the gambling task completion.

### MATERIALS AND METHODS

### Animals

In all the behavioral experiments, male C57Bl/6J mice and β2<sup>−</sup>/<sup>−</sup> knockout mice, bred in Charles River facilities (L'Arbresle Cedex, France), were used. β2<sup>−</sup>/<sup>−</sup> knockout mice were generated from a 129/sv embryonic stem cell line as previously described (35) and backcrossed onto the C57Bl/6J strain for 20 generations. As they were shown to be more than 99.99% C57Bl/6J by a genomic analysis using 400 markers, C57Bl/6J mice were used as controls for β2<sup>−</sup>/<sup>−</sup> knockout mice. Mice were housed in a temperature-controlled room (21 ± 2°C) with a 12 h light/dark cycle (light on at 8:00 a.m.). All experiments were performed during the light cycle between 9:00 a.m. and 5:30 p.m. All experimental procedures were carried out in accordance with the EU Directive 2010/63/EU, Decree N 2013-118 of February 1, 2013, and the French National Committee (87/848).

### Experiment I. Electrophysiological Study of the Excitation/Inhibition Balance in the PFC

To better understand how the lack of the β2 subunit in β2<sup>−</sup>/<sup>−</sup> animals modulates the activity of the prefrontal cortex, we investigated the specific roles of α4β2 or α7 nAChRs in the activity of the PFC. For that, we determined the balance between excitatory and inhibitory (E–I) inputs onto the soma of layer 5 pyramidal neurons (L5PyNs) and checked the effects of α4β2 or α7 antagonists on this E–I balance. Experiments were done on 67 C57Bl/6 mice and on 38 β2<sup>−</sup>/<sup>−</sup> mice from post-natal days 20–25. Electrophysiological study of the PFC was done following the methods extensively described elsewhere (36–38). Briefly, electrical stimulations (1–10 μA, 0.2 ms duration) were delivered in layer 2–3 or in layer 6 using 1 MΩ impedance bipolar tungsten electrodes (TST33A10KT; WPI). Evoked synaptic responses recorded in L5PyNs were measured and averaged at several holding potentials. The I–V relationship was then determined at each time-point of the response. An average estimate of the input conductance waveform of the cell was calculated. The decomposition of this input conductance into its excitatory and inhibitory components makes it possible to assess the E–I balance.
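The logic of this decomposition can be sketched for a single time point as follows. The sketch fits a line to the I–V relationship, takes its slope as the total synaptic conductance and its zero-current intercept as the apparent reversal potential, and then splits the conductance linearly between two fixed reversal potentials. The reversal potentials (0 mV and −80 mV), the function names, and the synthetic data are assumptions chosen for illustration; the procedure actually used is described in the cited methods papers (36–38) and may differ, for instance in how the resting conductance is subtracted.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def decompose_conductance(holding_mv, currents_pa, e_exc=0.0, e_inh=-80.0):
    """Split the total conductance at one time point into (g_exc, g_inh) in nS.

    Assumes I = g_total * (V - e_syn), with g_total = g_exc + g_inh and
    g_total * e_syn = g_exc * e_exc + g_inh * e_inh.
    """
    g_total, intercept = fit_line(holding_mv, currents_pa)  # pA/mV equals nS
    e_syn = -intercept / g_total                            # apparent reversal (mV)
    g_exc = g_total * (e_syn - e_inh) / (e_exc - e_inh)
    g_inh = g_total * (e_exc - e_syn) / (e_exc - e_inh)
    return g_exc, g_inh

# Synthetic currents generated from 2 nS of excitation and 6 nS of inhibition.
holding = [-70.0, -60.0, -50.0]
currents = [2.0 * (v - 0.0) + 6.0 * (v + 80.0) for v in holding]
g_exc, g_inh = decompose_conductance(holding, currents)
print(f"excitation: {100 * g_exc / (g_exc + g_inh):.0f}% of total conductance")
```

Repeating this computation at every time point of the evoked response would yield excitatory and inhibitory conductance waveforms, whose relative contributions define the E–I balance.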

The α4β2 antagonist Dihydro-β-erythroidine (DhβE) hydrobromide (Sigma), and the α7 antagonist methyllycaconitine (MLA, Tocris) were perfused in the bath solution for at least 15 min before recording.

### Experiment II. Mouse Gambling Task

Our aim in this task was to test the gambling profile of β2<sup>−</sup>/<sup>−</sup> knockout mice and to determine how their brains were activated during the gambling task, using cellular imaging with c-fos immunohistochemistry.

Twenty-four male C57Bl/6J and 21 β2<sup>−</sup>/<sup>−</sup> mice, 3–6 months old, were used. Mice were group-housed (three or four mice per cage) and were food deprived (maintained at 85% of their free-feeding weight) with water *ad libitum*.

### Behavioral Procedures of the Mouse Gambling Task

This decision-making task inspired by the human Iowa Gambling Task (39) was previously adapted to mice (4, 8).

#### *Habituation in Operant Chambers*

Mice were habituated to handling by the experimenters, to eating pellets, and to making an effort to obtain food pellets in operant chambers for 10 days before starting the mouse gambling task (MGT). The central hole was the only hole available. A nose poke led to the delivery of one food pellet in the magazine. After consumption, a fixed 5-s delay occurred before a new trial began. The daily session continued until 65 pellets were obtained or for 30 min, whichever came first.

### *Mouse Gambling Task Apparatus and Protocol*

The task took place in a maze with four transparent arms (20 cm long × 10 cm wide) containing an opaque start box (20 cm × 20 cm) and a choice area. We used standard food pellets as a reward (dustless Precision Pellets, Grain-based, 20 mg, BioServ®, New Jersey) and food pellets previously steeped in a 180 mM solution of quinine as a penalty (7, 8). The quinine pellets were unpalatable but not inedible. Each mouse performed 10 trials in the morning and 10 trials in the afternoon for 5 days, i.e., 100 trials by the end of the experiment.

Two of the four arms gave access to "advantageous" outcomes: immediate access to a small reward (1 pellet), followed by additional small rewards (3 or 4 pellets) 18 times out of 20 and by a small penalty (3 or 4 quinine pellets) 2 times out of 20. The two other arms gave access to "disadvantageous" outcomes: immediate access to 2 pellets, followed most of the time by 4 or 5 quinine pellets (19 trials out of 20) or by a large reward (4 or 5 pellets) in 1 trial out of 20. Despite the less attractive immediate reward, "*advantageous*" choices are thus more advantageous in the long term and "*disadvantageous*" choices are less advantageous in the long term (**Figure 1**). Mice thus had to favor the small immediate reward ("*advantageous*" choices) to obtain the largest number of pellets at the end of the day.
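The long-term payoff of each arm type can be checked with a short calculation. The sketch below assumes that the immediate reward is obtained on every trial and that quinine pellets contribute nothing to the count of pellets earned; the function name is hypothetical.

```python
def pellets_over_block(immediate, delayed_reward, rewarded_trials, n_trials=20):
    """Food pellets earned over a block of trials in one arm.

    immediate       : pellets obtained immediately on every trial
    delayed_reward  : food pellets delivered on a rewarded trial
    rewarded_trials : number of trials (out of n_trials) ending in food
    Quinine (penalty) trials are ASSUMED to add zero pellets.
    """
    return n_trials * immediate + rewarded_trials * delayed_reward


# Advantageous arms: 1 pellet immediately, then 3 or 4 pellets on 18/20 trials
advantageous = [pellets_over_block(1, r, 18) for r in (3, 4)]
# Disadvantageous arms: 2 pellets immediately, then 4 or 5 pellets on 1/20 trials
disadvantageous = [pellets_over_block(2, r, 1) for r in (4, 5)]

print(advantageous)     # [74, 92]
print(disadvantageous)  # [44, 45]
```

Under these assumptions, an advantageous arm yields 74 or 92 pellets per 20 trials against 44 or 45 for a disadvantageous arm, even though the latter gives twice the immediate reward.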

Between each trial, the maze was cleaned with distilled water, and between each mouse, it was cleaned with a 10% alcohol solution. During the first session, animals were put into the maze for 5 min with food pellets scattered everywhere (habituation). If mice did not eat any food pellets during the first habituation in the morning, a second 5-min habituation period was conducted during the afternoon. For the following sessions, habituation lasted only 2 min, without food pellets. At the beginning of each trial, the mouse was placed in an opaque tube in the start box to avoid directing the future choice of the animal. After 5 s, we removed the opaque tube and let the animal freely choose one arm of the maze.

[Adapted from Ref. (8)]. White circles represent food pellets and black circles quinine pellets. "Advantageous" choices gave access to one food pellet and then to 3 or 4 food pellets (18/20) or quinine pellets (2/20). "Disadvantageous" choices gave access to 4 or 5 food pellets (1/20) or quinine pellets (19/20). We distinguished "advantageous" choices from "disadvantageous" ones because mice earned more pellets (74 or 92 pellets vs. 45 or 44 pellets) after 20 trials by choosing the "advantageous" ones.

We measured the time taken by the mouse to choose one arm (i.e., when the animal crossed 1/3 of the arm) and we scored the arm chosen and the pellet consumption (pellets earned).

The rigidity score of an animal is the highest percentage of choices made in a single arm during a given period. The first step is to calculate the percentage of choices in each of the four arms relative to the total number of possible choices. In the first two gambling sessions, an animal has 40 possible choices. If it chooses arm 1 21 times, the score for this arm is [(21/40) × 100] = 52.5%; 9 choices for arm 2 give [(9/40) × 100] = 22.5%, 4 choices for arm 3 give [(4/40) × 100] = 10%, and 6 choices for arm 4 give [(6/40) × 100] = 15%. The rigidity score of this mouse over these 2 days of gambling is thus the maximal percentage of choice, i.e., 52.5%. For example, the rigidity score was 25% if animals chose all four arms, advantageous and disadvantageous, equally often. A 50% score indicated that animals chose one arm one time out of two, and a 75% score that animals chose one arm 3 times out of 4.
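The worked example above can be reproduced with a few lines of code. This is an illustrative sketch only; the function name and the arm numbering (1 to 4) are assumptions.

```python
from collections import Counter


def rigidity_score(choices, n_arms=4):
    """Return (rigidity score, per-arm percentages) for a list of arm choices.

    choices: one arm label (1..n_arms) per trial over the period considered.
    The score is the largest per-arm percentage of choices.
    """
    counts = Counter(choices)
    total = len(choices)
    percentages = {
        arm: 100.0 * counts.get(arm, 0) / total for arm in range(1, n_arms + 1)
    }
    return max(percentages.values()), percentages


# Example from the text: 40 choices over the first two sessions
choices = [1] * 21 + [2] * 9 + [3] * 4 + [4] * 6
score, per_arm = rigidity_score(choices)
print(per_arm)  # {1: 52.5, 2: 22.5, 3: 10.0, 4: 15.0}
print(score)    # 52.5
```

With four arms, the score is bounded below by 25% (perfectly even choices) and above by 100% (a single arm chosen on every trial).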

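In summary, the rigidity score is the maximum, over the four arms, of [(number of choices of that arm/total number of choices) × 100].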


The data are shown as percentage of "advantageous" choices that encompass choices made on the two advantageous arms.

*Subgroups Formation.* To build subgroups of choices, we calculated the mean of the last 30 trials (i.e., when performance was stable and strategies established) and applied k-means clustering using Statistica software (version 12) (40). Each animal belonged to the set whose mean was closest to its own performance value. Animals were thus separated into three groups: those which made a majority of advantageous (safe) choices at the end of the experiment, called "safe"; those which maintained some visits to the disadvantageous arms until the end of the experiment, called "risky"; and those which had an intermediate behavior, with a majority of choices in the advantageous arms but some infrequent visits to risky options, called "average." For each mouse, we calculated a rigidity score at the beginning (first two days) and at the end (last two days) of the experiment.
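The principle of this separation can be sketched with a one-dimensional k-means (Lloyd's algorithm) on the per-animal percentage of advantageous choices. The sketch is not a reproduction of the Statistica procedure: the initialization (evenly spaced order statistics), the function names, and the example values are all assumptions made for illustration.

```python
def assign(values, centers):
    """Index of the nearest center for each value."""
    return [min(range(len(centers)), key=lambda j: abs(v - centers[j])) for v in values]


def kmeans_1d(values, k=3, n_iter=100):
    """Plain 1-D k-means; returns (labels, centers)."""
    srt = sorted(values)
    # ASSUMED deterministic initialization: evenly spaced order statistics
    centers = [srt[round(i * (len(srt) - 1) / (k - 1))] for i in range(k)]
    for _ in range(n_iter):
        labels = assign(values, centers)
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign(values, centers), centers


def name_subgroups(values):
    """Map each animal to 'risky', 'average' or 'safe' by cluster mean."""
    labels, centers = kmeans_1d(values, k=3)
    order = sorted(range(3), key=lambda j: centers[j])  # lowest mean first
    names = {order[0]: "risky", order[1]: "average", order[2]: "safe"}
    return [names[lab] for lab in labels]


# Hypothetical % advantageous choices over the last 30 trials (not study data)
example = [90, 85, 95, 60, 65, 62, 40, 35, 45]
print(name_subgroups(example))
```

Each animal ends up in the cluster whose mean is closest to its own value, which is the membership rule stated in the text; the naming by ascending cluster mean follows the definitions of "risky", "average", and "safe".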

### C-fos Immunohistochemistry

The brains of WT mice (*n* = 24) and β2<sup>−</sup>/<sup>−</sup> mice (*n* = 11) that had performed the MGT were processed for c-fos immunohistochemistry.

### *Brain Removal and Storage*

Animals were anesthetized [for 2 ml: Rompun 2%, 50 μl; Ketamine 500, 600 μl; phosphate buffered solution (PBS) 1×, 1350 μl. 1 ml for 10 g] exactly 90 min after the end of the last MGT trial of the week. This timing allows the synthesis of the c-fos (immediate early gene) protein in the nuclei of activated neurons. Mice were then perfused transcardially with 20 ml of PBS followed by 50 ml of 4% paraformaldehyde (PFA). Brains were removed, post-fixed for 24 h in PFA, and cryoprotected in sucrose solutions of increasing concentration for 3 days at 4°C. Brains were then stored at −20°C in glycerol.

### *Brain Slicing and Immunohistochemistry*

Brains were sliced with a vibratome (Leica, VT1000E) in the coronal plane into 40-μm sections. Between two series of 4 × 10 min rinses in PBS, endogenous peroxidases were neutralized for 30 min in PBS containing 3% H2O2. To block non-specific sites, we used a PBS solution with 1% bovine serum albumin (BSA), 3% normal goat serum (NGS), and 0.2% Triton X-100 for 2 h. c-fos immunolabeling was performed with a purified polyclonal rabbit IgG anti-human c-fos [anti c-fos (Ab-5) (4-17) rabbit pAb, CALBIOCHEM] diluted 1:20,000 in 1% BSA, 3% NGS, and 0.2% Triton X-100 for 38 h. After 4 × 10 min rinses in PBS, sections were incubated for 2 h with a secondary biotinylated antibody (Biotin Goat anti-rabbit IgG (H + L), INTERCHIM) diluted 1:2.000000 in 1% BSA, 3% NGS, and 0.2% Triton X-100. After 4 × 10 min rinses in PBS, the staining was revealed using H2O2 and diaminobenzidine (D-5905, SIGMA) for 3 min. After rinsing, sections were flattened on SuperFrost glass slides (Menzel-Gläser, Braunschweig, Germany), dehydrated with xylene, and mounted with Eukitt solution.

#### *Image Acquisition and Quantification of c-Fos<sup>+</sup> Nuclei*

Quantification was performed by identifying spot positions. Images were acquired using a digital camera (Nikon DXM 1200) mounted on an Olympus BX600 microscope coupled to acquisition software (Mercator Pro; Explora Nova, La Rochelle, France), and c-Fos<sup>+</sup> nuclei were then counted with ICY software (http://icy.bioimageanalysis.org/). A ×10 Plan Apo objective was used throughout, which gave adequate resolution for c-fos immunohistochemistry. The focus was set on the upper face of each section before digitization. Each region of interest (ROI) was delimited on the screen for each picture based on the mouse atlas (41). ICY software directly counts the number of cells in the ROI. Cell density per square micrometer was thereafter calculated. The ROIs included the prelimbic (PrL) and infralimbic (IL) cortex, the lateral, medial, dorsolateral, and ventral orbitofrontal cortex (OFC), the NAcc, the caudate putamen (CPu), the basolateral nucleus of the amygdala (BLA), the hippocampus (Hipp), the motor cortex (M), and the agranular and granular, dorsal and ventral insular cortex (CIns). Figures 7, 8, and 9 from the atlas were used to analyze PrL and OFC; Figures 17, 18, and 19 to analyze PrL, IL, Cg, M, CIns, NAcc, and CPu; and Figures 41, 42, and 43 to analyze BLA, Amy (amygdala), and H (hippocampus).

### Experiment III: Valuation and Inhibition Processes in a Decision-Making Task with Three Concurrent Motivations. (Explicit Choice, Motivational Modulation of Explicit Choice, Change in Rule)

We first aimed to test whether β2<sup>−</sup>/<sup>−</sup> mice are able to rank competing rewards efficiently and to make choices when no uncertainty or risk is involved. Second, we tested their ability to adapt and modulate their choices as a function of the nature and the value of the reward, or when the rule changes in extinction (for time schedule, see **Figure 2C**).

#### Animals

Eight C57Bl/6J male mice and 8 β2<sup>−</sup>/<sup>−</sup> male mice were used for the task. Animals were 8 weeks old at their arrival in the colony room (obtained from Charles River, L'Arbresle Cedex, France). Two weeks after arrival, animals underwent 3 weeks of social isolation before the first step (**Figure 2A**). They underwent a mild water restriction in order to increase their motivation for food and water retrieval. Water restriction was established as follows: 24 h total restriction, 3 days with 4 h/day access to water, 12 days with 1 h access to water, 12 days with 30 min access to water, and finally 25 days with 15 min access to water. During water restriction, the animals' weight progressively decreased to 95% of the free-feeding weight and came back to 98–100% at the end of the procedure. An additional group of group-housed C57Bl/6J male mice (four or five mice per cage; *n* = 18) was used as social reward in the behavioral tasks. These "social" mice were age-matched with the isolated mice and were given food and water *ad libitum*. All experiments were performed during the light cycle (from 9:00 a.m. to 6:00 p.m.). The general health of isolated mice was regularly checked, and body weights were assessed every day throughout the experimental period.

### Apparatus

The maze (**Figure 2A**) consisted of four identical opaque Plexiglas boxes with a front sliding door, a flexible plastic door, and a transparent Plexiglas arena (L: 22 cm × l: 61 cm × H: 24 cm).

One of the opaque Plexiglas boxes was used as a start box that opened onto the transparent arena, and the three other boxes were goal boxes, also connected to the transparent arena, with doors set equidistant from the start box door (30 cm). Once the mouse was released from the start box, it could roam in the arena and reach one of the goal boxes. To hide the reward from view, we inserted a flexible plastic door that animals could easily push to enter the box. Light levels were set at around 25–30 lux in the boxes and at ~35 lux in the arena. Social mice were placed under a large cup (L: 7 cm × l: 7 cm × H: 10 cm) containing holes (0.8 cm diameter), so that animals could smell and touch each other. Food reward was placed in a food cup (5.5 cm in diameter, 1 cm high).

### Behavioral Protocols

#### *Explicit Choice*

In this part of the protocol, we aimed at assessing how β2<sup>−</sup>/<sup>−</sup> mice organized their explicit choices between each reward. For that, we first scored the latency to collect reward as an index of motivation, and then we tested their choices between each reward.

Animals were taken out of the animal facility in groups of four (2 C57Bl/6J and 2 β2<sup>−</sup>/<sup>−</sup>) and kept in the maze room on a nearby table for 15 min before the beginning of the test. Food reward consisted of 15 μl of 0.1% liquid saccharin (0.1 g saccharin sodium salt hydrate from Sigma in 100 ml water) in a cup in the food reward box, and social reward consisted of a 20-s contact with a social mouse restrained under the cup in the social reward box (only nose–nose contact was allowed). For each kind of trial, the four animals were put successively in the maze. Social mice were habituated to mild restraint under the cup in a 5-min session in another box before being gently placed in the social reward box. The third reward box simply consisted of an empty box allowing novelty exploration.

*Habituation to Maze and Reward (5 Days).* Mice were individually placed in the maze for a 10-min habituation session on two consecutive days. All doors of the goal boxes were kept open, but the boxes contained no reward. To avoid potential neophobia, mice received 2 ml of 0.1% saccharin in their home cage during these first two days. On day 3, isolated animals were habituated to reward consumption in the maze. For each animal, each reward was permanently assigned to a specific goal box (food, social, and novelty exploration), and the positions of the rewards in the goal boxes were counterbalanced among groups. During this reward habituation day, animals were submitted to six trials, two trials per reward. In each trial, animals were directly placed in a goal box with its reward (access to 15 μl of liquid saccharin until full consumption, 20 s of access to a social mouse, or 20 s in a novel empty box). On the fourth day, animals were submitted to a 15-min free choice habituation paradigm, during which ad libitum rewards (food: 8 ml 0.1% saccharin, social mouse under a cup, and empty novel box) were available and all goal boxes were open. Social reward was provided by a novel mouse.

*Reward Ranking (4 Days).* After 2 days off, we began 4 days of forced choice in order to collect the latency to reach each reward. Each day, mice were submitted to 12 forced choice trials during which they had to enter one of the three goal boxes to get the reward (4 trials of food followed by 4 trials of social and 4 trials of exploration). Each trial started with 10 s in the start box before the door was opened and the mouse was allowed to enter the central arena. If the mouse did not exit the start box within 30 s, it was gently pushed into the central arena and the sliding door was closed. During these trials, only the door of the target reward was open. Once the mouse entered the goal box, the sliding door was manually closed. If the mouse failed to enter the goal box within 60 s, it was removed from the maze and the trial ended. At the end of the trial, the mouse was put back in its home cage. Between each trial, the maze was cleaned with tap water in order to homogenize odors. The order of the four trials was randomized during the 4 days. In this part, we measured online the latency to reach the goal boxes.

*Explicit Choices (1 Day).* On the following day, animals were given a choice between the three rewards at each trial. On each of the 12 trials, animals had to choose one of the rewards (food, social, or novelty exploration). Once the animal had entered the chosen goal box, the sliding door was manually closed and the animal could consume the reward for 20 s. The maximum choice latency was set at 180 s. The same social mouse was used during four choice trials of the four animals in the group, i.e., for a total of 16 trials and ~15–20 min.

#### *Motivational Modulation of Explicit Choice*

In this part, we aimed at testing adaptation of β2<sup>−</sup>/<sup>−</sup> mice choices when social or food reward value was modulated.

*Devaluation of Social Reward (4 Days).* After 2 days off, we submitted all animals to a devaluation of the social reward. Devaluation of the social reward was achieved by inducing social reward "satiety" in mice through 1 h of exposure to the social reward. More precisely, on the social devaluation day (D), all animals were first placed in groups of four for 1 h in a devaluation cage located in the maze room, with three mice in the middle with which nose–nose social contact was possible (Figure 2B). Immediately after, they were tested in the explicit choice protocol (12 trials) with these social mice as social rewards. During the control day of social non-devaluation (ND), all animals were put in the maze room in a cage for 1 h, resulting in no social reward "satiety." Immediately after, they were tested in the explicit choice protocol (12 trials). The social devaluation day (D) preceded the social non-devaluation day (ND), and on the two following days, called postD1 and postD2, animals were submitted to 12 trials of explicit choices.

*Increase Food Reward Value: Change of Saccharin Quality and Quantity (5 Days).* After 3 days off, animals were submitted to the free choices protocol for 1 day, and on the next day we exposed them to a change of food reward from 1 drop of 0.1% saccharin to 2 drops of 1% saccharin. During this reward habituation day, animals were submitted to six trials (two trials per reward) as described above. This novel food reward was maintained for the rest of the experiment.

*Devaluation of Food Motivation (5 Days).* In order to test the impact of food reward devaluation, we pre-exposed the mice to food reward ad libitum (8 ml of 1% liquid saccharin) in the maze room for 1 h before the free choices protocol. In the non-devalued control condition, the same was done but the food cup was empty. Animals were thereafter exposed to 5 days of free choices. For the first day, we followed a classical free choices protocol. On day 2, half of the mice were submitted to devaluation of food motivation (devalued) and the other half to the not-devalued procedure. On day 3, we followed a classical free choices procedure to minimize possible long-lasting impact of ad libitum consumption of 1% liquid saccharin. On day 4, we alternated the animals that were devalued or not. On day 5, we followed the classical free choices protocol.

#### *Adaptation to a Change of Rule*

*Extinction (5 Days).* After 2 days off, animals were submitted to an extinction protocol. Extinction consisted of presentation of no reward in any goal boxes. Animals were left 20 s in the chosen box.

We measured the number of choices made in each goal box, the choice latency to enter goal boxes, and the number of social contacts made.

## Statistical Analysis

#### Experiment I

Differences between means were evaluated for statistical significance using the *t*-test for paired and unpaired samples, and the Mann–Whitney *U*-test when data did not follow a normal distribution.

### Experiment II

When considering all animals (i.e., before subgroup separation), we used ANOVAs computed with VAR3 statistical software (42) with an alpha level of 0.05. To test global differences from chance level (50%), we used the Wilcoxon rank sum test, paired version (the *Z* of the Wilcoxon test is displayed in Statistica software). Once subgroups were made and the number of animals was below 30, we considered that the data would not follow a normal distribution. We thus used the Mann–Whitney or Kruskal–Wallis non-parametric test when appropriate.
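For readers unfamiliar with the Mann–Whitney statistic reported for subgroup comparisons, the sketch below computes *U* directly from its definition by counting pairwise wins, with ties counted as one half. It is an illustration only, not the Statistica implementation: the function name is hypothetical, no *p*-value is computed, and reporting the smaller of the two possible *U* values is an assumed convention.

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for two independent samples.

    U_x counts, over all pairs (a in x, b in y), how often a > b
    (ties count 0.5). The smaller of U_x and U_y is returned, which is
    an ASSUMED reporting convention; software packages differ on this.
    """
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)


# Hypothetical scores (not study data): complete separation gives U = 0
group_a = [55.0, 60.0, 62.5]
group_b = [70.0, 75.0, 80.0, 90.0]
print(mann_whitney_u(group_a, group_b))  # 0.0
```

A value of *U* = 0 therefore means that every animal in one group scored below every animal in the other, the most extreme outcome possible for given group sizes.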

### Experiment III

Non-parametric analyses were performed using R software (version 2.13.2, 2011-09-30; R Foundation for Statistical Computing) with the Rcmdr package, as some of the scored behaviors did not follow a Gaussian distribution. We used the Wilcoxon rank sum test for two samples, the Wilcoxon signed-rank test for paired data, and the Friedman chi-squared test.
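As a reading aid for the *V* values reported for paired comparisons, the following sketch computes the signed-rank statistic as the sum of the ranks of the positive differences, dropping zero differences and averaging tied ranks, which is the convention used by R's `wilcox.test` for paired data. The Python function names are hypothetical and no *p*-value is computed.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_v(x, y):
    """Wilcoxon signed-rank V: sum of ranks of the positive differences x - y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # zero differences dropped
    ranks = average_ranks([abs(d) for d in diffs])
    return sum(r for d, r in zip(diffs, ranks) if d > 0)


# Hypothetical paired latencies in seconds (not study data)
condition_1 = [10, 12, 9, 15]
condition_2 = [8, 13, 9, 11]
print(signed_rank_v(condition_1, condition_2))  # 5.0
```

For *n* non-zero paired differences, *V* ranges from 0 to *n*(*n* + 1)/2, so with 16 animals the maximum is 136; values close to either bound indicate that nearly all animals changed in the same direction.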

### RESULTS

### Experiment I

### Beta2-nAChRs Are Necessary for the Regulation of the Prefrontal E/I Balance

To determine the role of nAChRs in PFC cellular activity, we determined the balance between excitatory and inhibitory (E–I) inputs onto the soma of layer 5 pyramidal neurons (L5PyNs) and tested the effects of α4β2 or α7 antagonists on this E–I balance. This strategy made it possible to analyze the role of endogenous release of ACh in the activity of cortical excitatory and inhibitory networks.

Stable somatic voltage-clamp recordings of L5PyNs subthreshold postsynaptic responses (composite E–I responses) evoked by layer 2–3 or 6 electrical stimulation (inset A,B **Figure 3**) were obtained in the PFC, and the decomposition method (43) was applied to extract E and I. For each recording (e.g., **Figure 3C**), the total input conductance (gT) was first extracted (**Figure 3D**), and its decomposition allowed us to further evaluate the relative contribution of evoked excitatory and inhibitory inputs reaching the soma of the recorded L5PyN (**Figure 3D**). Typically, layer 2–3 or 6 electrical stimulation produces a fast excitatory conductance (gE) elicited before a long-lasting inhibitory conductance (gI). Quantification of these somatic conductances showed that the control stimulus-locked composite signal at the soma of L5PyNs is composed of 18% of E and 82% of I, whichever layer was stimulated (**Figure 3E**, *n* = 25 cells and *n* = 11 cells for stimuli in layer 2–3 or 6, respectively, *p* = 0.8).

We further explored whether the E–I balance was modulated by ACh around its set-point. To do so, we determined the balance in the PFC of β2<sup>−</sup>/<sup>−</sup> mice and compared the effects of α4β2 or α7 antagonists on the balance between C57Bl6 mice and β2<sup>−</sup>/<sup>−</sup> mice (**Figure 4**). The E–I balance in β2<sup>−</sup>/<sup>−</sup> mice was 24–76% in response to layer 2–3 stimulation (*n* = 16) and 23–77% in response to layer 6 stimulation (*n* = 6). These values of the E–I balance were significantly different from the values obtained in C57Bl6 mice (*p* < 0.05, Mann–Whitney *U*-test). This result supports a modulation of synaptic inputs onto L5PyNs by ACh. Surprisingly, in C57Bl6 mice, the α4β2 nicotinic antagonist DhβE (500 nM) had no effect on E and I when the stimulation was applied in layer 2–3, as compared to the control condition (*n* = 10, *p* = 0.8). However, the α7 nicotinic antagonist MLA (10 nM) increased E by 43% (*n* = 10, *p* < 0.05) and I by 44% (*n* = 10, *p* = 0.02) without changing the E–I balance (18–82%, *p* = 0.7). By contrast, MLA had no effect on E and I in β2<sup>−</sup>/<sup>−</sup> mice (*n* = 10, *p* = 0.8 for E and *p* = 0.3 for I). We conclude that in superficial layers ACh decreases synaptic inputs onto L5PyNs through the activation of α7 receptors and that this modulatory effect is lost in β2<sup>−</sup>/<sup>−</sup> mice.

The modulation exerted by ACh on synaptic inputs is more complicated in deep layers of the PFC. In C57Bl6 mice, the stimulation of layer 6 in the presence of DhβE induced an increase of E by 37% (*n* = 7, *p* = 0.051) and of I by 55% (*n* = 7, *p* = 0.057) without changing the E–I balance significantly (*p* = 0.4). In addition, MLA increased E by 29% (*n* = 4, *p* < 0.05) and I by 48% (*n* = 4, *p* < 0.05) with no significant change of the E–I balance (*p* = 0.5). However, in β2<sup>−</sup>/<sup>−</sup> mice, MLA had no effect on E (*n* = 6, *p* = 0.3) or I (*n* = 6, *p* = 0.5).

Our results showed that the control of excitatory and inhibitory inputs by ACh through α7 receptors was lost in the PFC of mice lacking β2-nAChRs. Moreover, we established a link between the laminar and cellular segregation of nAChRs and specific functional effects on synaptic inputs onto L5PyNs. The change in the modulatory effects of α7 receptors in β2<sup>−</sup>/<sup>−</sup> mice supports the possibility of crossed modifications of the expression and function of nAChR types.

### Experiment II

#### β2<sup>−</sup>/<sup>−</sup> Mice Show Altered Performance in the Mouse Gambling Task

As illustrated in **Figure 5**, mice initially chose advantageous and disadvantageous options equally. A two-way ANOVA revealed that the choices of β2<sup>−</sup>/<sup>−</sup> mice and WT mice evolved significantly differently over time, as there was a genotype × sessions interaction [*F*(4, 172) = 2.42, *p* < 0.05], with WT mice coming to favor advantageous choices [*F*(4,92) = 2.9, *p* < 0.05] while β2<sup>−</sup>/<sup>−</sup> mice did not [*F*(4, 80) < 1, ns]. This difference in choice evolution led to a global genotype effect for the last 2 days [*F*(1,43) = 4.43, *p* < 0.05]. Indeed, WT mice chose more advantageous options (Sessions 3, 4, and 5 differed from chance, Wilcoxon rank sum test, paired: S1 *Z* = −1.120, ns; S2 *Z* = −1.640, ns; S3 *Z* = −2.273, *p* < 0.05; S4 *Z* = −3.071, *p* < 0.01; S5 *Z* = −3.511, *p* < 0.001). By contrast, β2<sup>−</sup>/<sup>−</sup> mice were not able to distinguish advantageous options from disadvantageous ones, even at the end of the task (S1 *Z* = −1.784, ns; S2 *Z* = −1.784, ns; S3 *Z* = −0.983, ns; S4 *Z* = −0.282, ns; S5 *Z* = −0.678, ns). Choice latencies (data not shown) globally decreased across gambling sessions [*F*(4,172) = 12.28, *p* < 0.05], but this decrease was not the same in the two genotypes (genotype × sessions interaction [*F*(4, 172) = 4.34, *p* < 0.05]). β2<sup>−</sup>/<sup>−</sup> choice latencies were shorter than those of WT mice at the beginning of the task and did not change with time [*F*(4, 80) = 2.03, ns]. By contrast, WT mice showed a decrease in choice latency across the five gambling sessions [*F*(4,92) = 14.31, *p* < 0.05]. This differential evolution of choice latencies led to a genotype effect restricted to the first two gambling days [*F*(1, 43) = 12, *p* < 0.05].

The k-means clustering made it possible to separate WT and β2<sup>−</sup>/<sup>−</sup> mice into three performance subgroups: "safe" (WT *n* = 5, β2<sup>−</sup>/<sup>−</sup> *n* = 8), "risky" (WT *n* = 6, β2<sup>−</sup>/<sup>−</sup> *n* = 5), and "average" (WT *n* = 13, β2<sup>−</sup>/<sup>−</sup> *n* = 8). Safe WT animals (**Figure 5**) developed a preference for advantageous options from the fourth session until the end (S1, S2, S3, ns; S4 *Z* = −2.023, *p* < 0.05; S5 *Z* = −2.023, *p* < 0.05), whereas safe β2<sup>−</sup>/<sup>−</sup> mice had already developed a stable preference for advantageous options in the first session (S1 *Z* = −2.366, *p* < 0.05; S2 *Z* = −2.366, *p* < 0.05; S3 *Z* = −2.310, *p* < 0.05; S4 *Z* = −2.251, *p* < 0.05; S5 *Z* = −2.521, *p* < 0.05). Unlike average WT mice, average β2<sup>−</sup>/<sup>−</sup> mice were not able to distinguish advantageous options from disadvantageous ones at the end of the task (average WT S4 *Z* = −2.795, *p* < 0.01; S5 *Z* = −3.059, *p* < 0.01; average β2<sup>−</sup>/<sup>−</sup> S4 *Z* = −0.676, ns; S5 *Z* = −1.120, ns). Except for the first session, WT risky mice chose advantageous and disadvantageous options equally throughout sessions (risky WT S1 *Z* = −2.023, *p* < 0.05; S2, S3, S4, S5, ns). Conversely, β2<sup>−</sup>/<sup>−</sup> risky mice exhibited a marked preference for disadvantageous options (risky β2<sup>−</sup>/<sup>−</sup> S1, S2, S3, S4, ns; S5 *Z* = −2.023, *p* < 0.05). On the last gambling session, there was a significant genotype effect in the average (Mann–Whitney: S5 *U* = 0, *p* < 0.05) and risky subgroups (S5 *U* = 30, *p* < 0.01), but not in the safe ones (S5 *U* = 7, ns).

Figure 3 | (A) Coronal slice of the prefrontal cortex of a 22-day-old male mouse. Arrows show the position of the patch-pipette in layer 5 and of the stimulating electrode in layer 2–3. (B) Scheme of the coronal slice from Allen Brain Atlas Resources Seattle (WA): Allen Institute for Brain Science. ©2009 (Available from: http://www.brain-map.org). (C) Representative current responses of a L5PyN to layer 2/3 and 6 stimulation in prefrontal cortex of a WT mouse recorded under voltage-clamp at various holding potentials (each response is the mean of five recordings). Vertical arrows indicate the stimulation onset. (D) Corresponding conductance change gT (black line) of the response. Excitatory (gE, dark gray line) and inhibitory (gI, light gray line) conductance changes were obtained from gT decomposition. Data reported are mean ± SD of the mean of *n* layer five pyramidal neurons (L5PyNs). (E) E–I balance determined in layer five pyramidal neurons after a stimulation in layer 2–3 (*n* = 25 neurons) or in layer 6 (*n* = 11 neurons).

In all animals, rigidity significantly increased from the first two sessions to the last two [*F*(1) = 31.078, *p* < 0.0001]. However, there was no session × genotype interaction [*F*(1, 1) < 1, ns] (**Figure 6**). For β2<sup>−</sup>/<sup>−</sup> mice, there was an effect of session [*F*(1) = 30.44, *p* < 0.0001] and a session × subgroup (safe, average, and risky) interaction [*F*(1,2) = 11.28, *p* < 0.001]. For WT mice, however, there was only a session effect [*F*(1) = 22.28, *p* = 0.0001] and no session × subgroup interaction [*F*(1, 2) = 2.55, ns]. The increase of the rigidity score was significant for average WT mice (Wilcoxon: *Z* = −3.1, *p* < 0.05) but not for safe (*Z* = −1.461, *p* = 0.1441) or risky mice (*Z* = −0.674, ns). In β2<sup>−</sup>/<sup>−</sup> mice, the increase of rigidity was significant for safe (*Z* = −2.366, *p* < 0.05) and risky mice (*Z* = −2.023, *p* < 0.05) but not for average animals (*Z* = −0.734, ns). Moreover, rigidity scores were significantly different between safe and risky WT mice (Mann–Whitney: *U* = 2, *p* < 0.05), between average and risky β2<sup>−</sup>/<sup>−</sup> mice (*U* = 6, *p* < 0.05), and between risky β2<sup>−</sup>/<sup>−</sup> and WT mice (*U* = 0, *p* < 0.01) during the last two sessions.

### Differential Activation of Neuronal Circuits in Beta2 vs. WT during Gambling

We measured the brain expression of cFos 90 min after the last gambling session in WT and β2<sup>−</sup>/<sup>−</sup> mice, which provided an estimate of the activation of brain structures during the last gambling session (for an example of cFos labeling in Prl, see **Figure 7C**). This method showed that β2<sup>−</sup>/<sup>−</sup> mice have significantly lower cFos activation in the infralimbic cortex, insular cortex, and hippocampus (*U* = 46, *U* = 31, and *U* = 62, respectively, *p* < 0.05). By contrast, all other regions were similarly activated in both genotypes (Prelimbic cortex, *U* = 103, Cingulate cortex, *U* = 96, Motor cortex, *U* = 83, Amygdala, *U* = 93, NAcc, *U* = 79, Orbitofrontal cortex, *U* = 97, CPu, *U* = 87, and BLA, *U* = 101, all ns) (**Figure 7A**).

For WT animals, cFos expression differed significantly between subgroups only in Prl (Kruskal–Wallis, H = 8.63, *p* < 0.05) and not in the other structures (InfraL, H = 0.58, Cins, H < 1, Cg, H = 1.14, Motor, H = 0.59, Amy, H < 1, Nacc, H = 2.83, OFC, H = 3.20, Hippocampus, H = 1.06, Cpu, H = 4.56, BLA, H = 2.87, ns). In β2<sup>−</sup>/<sup>−</sup> mice, cFos activity was not related to subgroups (Prl, H = 3.39, InfraL, H < 1, Cins, H = 2.45, Cg, H = 2.86, Motor, H < 1, Amy, H < 1, Nacc, H < 1, OFC, H < 1, Hippocampus, H < 1, Cpu, H = 1.74, BLA, H < 1, ns). In the prelimbic cortex, WT "safe" animals showed significantly lower cFos expression than β2<sup>−</sup>/<sup>−</sup> "safe" animals (*U* = 0, *p* < 0.05), and WT "risky" animals showed significantly greater cFos expression than β2<sup>−</sup>/<sup>−</sup> "risky" animals (*U* = 0, *p* < 0.05). Average animals displayed the same cFos expression in Prl whatever the genotype (*U* = 14, ns) (**Figure 7B**).

### Experiment III

### β2<sup>−</sup>/<sup>−</sup> Mice Show Normal Explicit Choice between Three Natural Motivations

Once animals had experienced the rewards during goal exposure and had been habituated to the presence of rewards for 15 min, we assessed their motivation for each reward independently during forced choices (**Figure 8**). During the forced choices, all animals (β2<sup>−</sup>/<sup>−</sup> and WT) showed a shorter latency to reach the social goal box than the food or empty one (explo vs. social; Wilcoxon rank sum test, paired, *V* = 131, *p* < 0.001, food vs. social *V* = 132, *p* < 0.001), with no difference between the food and exploration goal boxes (*V* = 52, ns). This low latency to reach the social goal was similar in both genotypes (genotype effect for Food; Wilcoxon rank sum test, two samples, *W* = 25, ns; Social; *W* = 35, ns; Explo; *W* = 45, ns). During the following explicit choice session, mice of both genotypes clearly chose the social goal box in a majority of trials (social vs. food, *V* = 133, *p* < 0.001; social vs. explo, *V* = 0, *p* < 0.001), and they also preferred the food goal box over the empty box for exploration (*V* = 133, *p* < 0.001), demonstrating a clear ranking of motivations: Social > Food > Exploration. The absence of the β2 subunit had no significant impact on this ranking (genotype effect for Food; *W* = 36, ns; Social; *W* = 37, ns; Explo; *W* = 35.5, ns).

### β2<sup>−</sup>/<sup>−</sup> Mice Adapt Normally to Changes in Motivation

### *Social Devaluation*

During the 4 days of the social devaluation protocol, comprising the social non-devalued day (ND), the social devaluation day (D), and the two following days, only social choices were affected, in contrast to the other choices (for social choice; Friedman = 8.15, df = 3, *p* < 0.05, for food choice; Friedman = 3.56, df = 3, ns and for explo choice; Friedman = 2.36, df = 3, ns) (**Figure 9A**). The number of contacts with the social mouse also decreased significantly (Friedman = 12.66, df = 3, *p* < 0.01) (data not shown). This significant decrease in social choice and contact was mainly due to a significant decrease between the day without devaluation and the day with devaluation (for social choice, ND vs. D; *V* = 8, *p* < 0.01, contact; *V* = 30, *p* = 0.05). Moreover, neither the number of social choices nor the number of social contacts returned to the non-devalued level, with no further change on the following days (for social choice, evolution between D, and postD1 and postD2; Friedman = 1.08, df = 2, ns; social contact; Friedman = 2.41, df = 2, ns). This devaluation-induced decrease in the number of social choices and contacts was unaffected by the absence of the β2 subunit (genotype effect for devalued day; social choice *W* = 20.5, ns; social contact *W* = 21, ns; and for non-devalued day; social choice *W* = 20.5, ns; social contact *W* = 23.5, ns) and there was no genotype effect on the following days (social choice: postD1; *W* = 14.5, ns and postD2; *W* = 22, ns; social contact: postD1; *W* = 23, ns and postD2; *W* = 15, ns). Finally, on the last day (postD2), the numbers of food and social choices were equivalent (*V* = 48, ns) and were significantly higher than the number of exploration choices (food vs. explo; *V* = 20, *p* < 0.05, social vs. explo; *V* = 5.5, *p* < 0.01).
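The Friedman statistics reported with df = 3 are consistent with a repeated-measures comparison across the four days (df = *k* − 1 for *k* conditions). As a minimal sketch, assuming the textbook formula without tie correction, the statistic can be computed from within-animal ranks as follows; the choice counts and the function name are hypothetical.

```python
def friedman_statistic(rows):
    """Friedman chi-square for per-animal rows (one value per day).

    Textbook formula without tie correction, so each row is assumed
    to contain no tied values. Returns (statistic, degrees of freedom).
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0] * k
    for row in rows:
        # Rank the k days within this animal, 1 = smallest value
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    q = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    return q, k - 1


# Hypothetical social-choice counts for four animals over four days
# (ND, D, postD1, postD2); NOT data from the study
choices = [
    [9, 4, 5, 6],
    [8, 3, 4, 5],
    [10, 5, 6, 7],
    [7, 2, 4, 3],
]

q, df = friedman_statistic(choices)
print(round(q, 2), df)
```

The statistic is largest, at *n*(*k* − 1), when every animal orders the days identically, and it is compared against a chi-square distribution with *k* − 1 degrees of freedom.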

Choice latency for food and social rewards decreased steadily over this 4-day paradigm (data not shown) (for social choice; Friedman = 14.47, df = 3, *p* < 0.01, for food choice; Friedman = 12.375, df = 3, *p* < 0.01 and for explo choice; Friedman = 6.9, df = 3, ns), with no significant difference between D and ND days (for social choice, ND vs. D; *V* = 80, ns; for food choice; *V* = 61, ns) and with no genotype effect on D (social; *W* = 18, ns, food; *W* = 29, ns) or ND days (social; *W* = 34, ns; food; *W* = 26, ns).

### *Change of Saccharin Value and Quantity*

We observed a significant rise in the number of food choices and a decrease in social choices from the day with one drop of 0.1% saccharin through the 3 days with two drops of 1% saccharin (food choice; Friedman = 9.13, df = 3, *p* < 0.05; social choice: Friedman = 9.08, df = 3, *p* < 0.05), with no change in choice of the empty box (Friedman = 5.26, df = 3, ns) (**Figure 9B**). During these days, increasing the value and quantity of the food reward significantly decreased the latency to reach the food goal box but also the social one (Friedman = 11.1, df = 3, *p* < 0.05; Friedman = 20.92, df = 3, *p* < 0.001, respectively), with no genotype effect (social latencies *W* = 23, 39, 32, and 13, ns; food latencies, *W* = 33, 29.5, 41, and 43, ns). Latency to enter the empty box was not analyzed in the following manipulations because the number of empty-box choices was too low to yield meaningful latencies. Animals went from a ranking in which social choice was higher than exploration (*V* = 17.5, *p* < 0.01) and equivalent to food (*V* = 27.5, ns) to a ranking with a predominant choice of food over social or exploration (respectively *V* = 98, *p* < 0.05 and *V* = 18.5, *p* < 0.05). Even with this predominant increase in food choice, the number of social choices remained significantly higher than the number of exploration choices (*V* = 18.5, *p* < 0.05). On the first day, before the shift, β2<sup>−</sup>/<sup>−</sup> mice chose the social box as often as WT mice (*W* = 14, ns) but made significantly fewer social contacts (*W* = 12, *p* < 0.05), with no genotype difference in food or exploratory choice (*W* = 40.5, ns; *W* = 32, ns). However, β2<sup>−</sup>/<sup>−</sup> mice adapted their choices in a similar manner to WT mice (genotype effect on food choice for saccharin days 1, 2, and 3, respectively, *W* = 23.5, 26, and 28.5; ns; on social choice for saccharin days 1, 2, and 3, respectively, *W* = 33, 40.5, and 34.5). Interestingly, the mean number of social contacts over these 3 days with the novel reward was significantly lower in β2<sup>−</sup>/<sup>−</sup> than in WT mice (statistic on the mean of the three days: *W* = 9.5, *p* < 0.05; WT, mean of 6.06 ± 0.44 contacts, β2<sup>−</sup>/<sup>−</sup> mean of 4.78 ± 0.23).

### Food Devaluation

Devaluation of food had no significant effect on food, social, or exploratory choice (*V* = 50, ns; *V* = 49, ns and *V* = 25.5, ns) nor on social contact (*V* = 65, ns). Moreover, both genotypes made the same kind of choices on the non-devalued (food, *W* = 28.5, ns, social, *W* = 36, ns; explo, *W* = 33.5, ns) and devalued days (food, *W* = 28, ns, social, *W* = 34.5, ns; explo, *W* = 33.5, ns). However, food devaluation significantly increased the latency to reach the food box (*V* = 118, *p* < 0.01) but not the latency for social choice (*V* = 85, ns). This impact of devaluation on latency was similar for WT and β2<sup>−</sup>/<sup>−</sup> mice (WT vs. β2<sup>−</sup>/<sup>−</sup>, food latency on D; *W* = 26, ns on ND, *W* = 25, ns; social latency on D, *W* = 33, ns, on ND; *W* = 27, ns). As in the previous manipulation, β2<sup>−</sup>/<sup>−</sup> mice tended to make fewer social contacts than WT mice (on D day, *W* = 12, *p* < 0.05, ND day, *W* = 15.5, ns, on the mean of both days *W* = 7, *p* < 0.01).

#### *β2<sup>−</sup>/<sup>−</sup> Mice Show Altered Adaptation to Rule Change in Extinction*

When all rewards were removed, animals significantly decreased their choice of the previously food-rewarded box, i.e., ex-food (Friedman = 32.93, df = 4, *p* < 0.001) and increased their choice of the previously social-rewarded box, i.e., ex-social (Friedman = 19.90, df = 4, *p* < 0.001) (**Figure 10**). They also slightly increased their choice of the previously empty box (Friedman = 12.55, df = 4, *p* < 0.05). On closer inspection, these changes drove the choices of all animals from food predominance (Extinction D1; food vs. empty, *V* = 0, *p* < 0.001, social vs. food *V* = 4.5, *p* < 0.01, and empty vs. social *V* = 24.5, ns) toward near equivalence of all empty boxes, i.e., four choices in each one, although a tendency remained on ExtD5 (food vs. empty, *V* = 31, *p* = 0.057 and empty vs. social *V* = 13.5, *p* < 0.05), with no remaining difference between ex-food and ex-social (*V* = 71.5, ns). On the graph, the evolution of choices is slower in β2<sup>−</sup>/<sup>−</sup> mice, leading to a preserved difference between ex-food and ex-social choices on the second day, unlike in WT mice (ExtD1: ex-food vs. ex-social; β2<sup>−</sup>/<sup>−</sup>, *V* = 0, *p* < 0.05, WT; *V* = 1.5, *p* < 0.05; ExtD2, ex-food vs. ex-social; β2<sup>−</sup>/<sup>−</sup>, *V* = 0, *p* < 0.05, WT; *V* = 4.5, ns) and a trend on the third day (ex-food vs. ex-social; β2<sup>−</sup>/<sup>−</sup>, *V* = 4, *p*-value = 0.057, WT; *V* = 18, ns). This genotype-related slowing was significant only for ex-social choice on extinction days 3 and 4 (*W* = 3.5, *p* < 0.01, *W* = 13, *p* < 0.05). Moreover, during these 5 days of extinction, the latency to choose ex-social and ex-food significantly increased for both genotypes (ex-food, Friedman = 44.85, df = 4, *p* < 0.001; ex-social, Friedman = 19.01, df = 4, *p* < 0.001).

### DISCUSSION

In this paper, we clearly demonstrate that β2 nicotinic acetylcholine receptors (β2-nAChRs) within the prelimbic area of the prefrontal cortex are major actors influencing E–I balance. Using β2<sup>−</sup>/<sup>−</sup> mice, we demonstrate that the value of the E-I balance was significantly elevated compared to WT mice (E–I, 18–82% in WT to E–I, 24–23% to 76–77% in β2<sup>−</sup>/<sup>−</sup>). Our results also show that the control of excitatory and inhibitory inputs by ACh through α7 receptors is lost in the prelimbic cortex of mice lacking the nicotinic β2 subunit.

Previous measurements of E–I balance had been successfully used to show the effect of ACh or serotonin in the rat visual cortex (44, 45) and in the mouse PFC (37, 38). Here, we show that the E–I balance (18–82%) in the C57Bl/6 strain was not significantly different from the E–I balance (20–80%) in the PFC of 129/Sv mice (38). This result shows that coordinated functions of neuronal networks regulate the E–I balance of synaptic inputs on layer 5 pyramidal neurons (L5PyNs) in the PFC of C57Bl/6 mice similarly to other mouse strains, and this is crucial for keeping neuronal networks of the PFC in a functional range.

Our results also show that the control of excitatory and inhibitory inputs by ACh through α7 receptors is lost in the prelimbic cortex of β2<sup>−</sup>/<sup>−</sup> mice. α7-nAChRs are highly involved in the development of the cortex, and disruption of their function might lead to neurodevelopmental disorders, such as schizophrenia or other psychiatric disorders (46). Moreover, α7-nAChRs play a major role in the development of cortical parvalbumin-containing GABAergic interneurons (47). Thus, absence of α7 regulation in the PFC of β2<sup>−</sup>/<sup>−</sup> mice might lead to alteration in the wiring of inhibitory circuits within the PFC and altered PFC functioning. Additional studies are necessary to decipher the exact roles of β2 vs. α7 in the regulation and development of PFC E/I balance.

Alteration (increased excitation and decreased inhibition) of E/I balance was measured in adolescent β2<sup>−</sup>/<sup>−</sup> mice, while decision-making defects were evidenced in adults. We can, thus, wonder whether the prefrontal E/I alteration during development led to altered prefrontal functioning and wiring, which itself had consequences in adulthood, or whether the altered E/I balance plays a direct role in adulthood and impairs prefrontal functioning *per se*. One argument for an effect that is not limited to development is the fact that viral re-expression of the β2 subunit in the PFC of β2<sup>−</sup>/<sup>−</sup> mice was sufficient to restore social interactions (28). Interestingly, optogenetically mediated elevation of the PFC E/I balance in adult mice was shown to decrease social choice (20) and conditional neuroligin-2 knockout adult mice exhibited a reduction of PFC inhibition associated with altered social interactions (48). We, thus, suggest that PFC E/I balance modifications in β2<sup>−</sup>/<sup>−</sup> mice persist into adulthood and may be at least partially responsible for decision-making alterations in both social and non-social situations. This remains at this point only speculative. It would, however, be of interest to measure individual E/I balance in animals previously subjected either to the gambling task or to the social choice task.

We demonstrate here an involvement of β2-nAChRs in the MGT, in which uncertainty and risk have to be managed as outcomes are probabilistic. Indeed, β2<sup>−</sup>/<sup>−</sup> mice were not able to choose long-term advantageous options over disadvantageous ones until the end of the task. This choice profile led β2<sup>−</sup>/<sup>−</sup> mice to make far fewer advantageous choices than WTs. As previously reported (8), a majority of WT mice (54%) preferred advantageous options without neglecting alternative but rare – potentially more risky – choices, i.e., *average* mice. A small subgroup of mice (21%) continued throughout the experiment to explore all available options despite a putative risk, i.e., *risky* mice. Another small proportion of mice (25%) strongly preferred long-term advantageous choices, avoided exploring alternative options and presented a more rigid behavior compared to the others, i.e., *safe* mice. β2<sup>−</sup>/<sup>−</sup> mice could also be classified into three subgroups, but the evolution of their choices across sessions was very different from that shown by WTs. Indeed, the β2<sup>−</sup>/<sup>−</sup> average mice did not prefer the advantageous options at the end of the task; they had the same percentage of advantageous choices as WT risky mice at the end of the task. Moreover, risky β2<sup>−</sup>/<sup>−</sup> mice showed a marked preference for disadvantageous options. In that regard, their choice profile resembled the poor performance of human patients with bilateral lesions of the ventromedial prefrontal cortex (vmPFC) (39, 49).

Furthermore, the distribution of β2<sup>−</sup>/<sup>−</sup> mice among the three subgroups was quite distinct from that of WTs: there was a similar proportion of safe and of average mice (i.e., 38%) while 24% of the mice belonged to the risky subgroup. As a result, the absence of β2-nAChRs led mainly to extreme profiles, with no real average subgroup and only safe and risky mice. In addition, a new behavioral profile appeared as some mice strongly preferred disadvantageous options. It is noticeable that the rigidity score of WT mice was roughly similar to that observed previously (8), and particularly that it increased across sessions. This increase reflects the establishment of a fixed choice pattern, away from exploration of multiple options. Average β2<sup>−</sup>/<sup>−</sup> mice, however, did not show any increase in rigidity scores across sessions, thus supporting the idea that these mice behaved like the risky WT mice and continued to explore available options until the end of the task. Risky β2<sup>−</sup>/<sup>−</sup> mice strongly increased their rigidity score at the end of the task by choosing nearly exclusively disadvantageous options. We never observed such an extreme profile in WT mice (4, 8). Multiple factors might explain the choice profiles of β2<sup>−</sup>/<sup>−</sup> mice, such as alterations in sensitivity to punishment/risk-taking and/or flexibility.

It was proposed that vmPFC patients could be more sensitive to reward, insensitive to punishment, or insensitive to future positive or negative consequences (49). Moreover, vmPFC patients increased betting regardless of the odds of winning during the Cambridge Gamble Task (CGT), a task in which the probabilities of losing are presented explicitly (50). Interestingly, patients with insular cortex lesions also failed to adjust their bets to the odds of winning (50). The latter study indicated a necessary role of the vmPFC in decision-making regulation and of the insular cortex in the signaling of aversive outcomes (50).

Here, we observed that β2<sup>−</sup>/<sup>−</sup> mice had a hypoactivation of the infralimbic (IL) and insular (CIns) cortices, and of the hippocampus (H). The IL cortex was proposed to be functionally equivalent to the vmPFC in humans (51). Altogether, these data support the idea that in β2<sup>−</sup>/<sup>−</sup> mice this hypoactivation led to poor MGT performance because of a difficulty in regulating decision-making (IL) and in integrating the value of negative outcomes (CIns). During the forced and explicit choice tasks, no negative outcome existed. Likewise, during the food or social devaluation task there was no negative outcome. Conversely, during the extinction task mice were not presented with the reward, which could be perceived as a negative condition. Therefore, the slower evolution of β2<sup>−</sup>/<sup>−</sup> mice choices during the extinction task could be linked to the hypoactivation of CIns, hence, to a difficulty in detecting changes in outcomes. At the level of the prelimbic cortex, in which β2<sup>−</sup>/<sup>−</sup> mice displayed E/I balance alteration, c-fos activation in β2<sup>−</sup>/<sup>−</sup> mice was not related to gambling performance. This contrasted with WTs' c-fos activity, for which higher expression correlated with lower rigidity scores. Thus, poor performance of β2<sup>−</sup>/<sup>−</sup> mice might be linked to differential activation of neuronal circuits including IL, PL, CIns, and hippocampus.

It was previously demonstrated that β2<sup>−</sup>/<sup>−</sup> mice were hyperactive while displaying less exploratory behavior than WT animals (27, 30–32). Our current results showing reduced choice latency in gambling are reminiscent of our previous data (26) and might be related to the unbalanced locomotion/exploration previously shown to be controlled by nAChR activity on dopaminergic neurons of the substantia nigra pars compacta (SNpc) and ventral tegmental area (VTA) (27). It was suggested that decision-making processes result in a balance between exploiting existing options and exploring new possibilities (52), with a main involvement of dopamine (DA) in cortico-striatal circuits. Thus, it may be that β2<sup>−</sup>/<sup>−</sup> mice, which are less explorative, are more prone to favor the exploitation of a chosen strategy during the MGT, thus resulting in more extreme profiles and increasing rigidity. In β2<sup>−</sup>/<sup>−</sup> mice, exploration was restored with re-expression of the subunit in the VTA and not the SNpc, suggesting a role of nAChRs in accumbal and prefrontal DA input (27). In β2<sup>−</sup>/<sup>−</sup> mice, alteration in basal levels of dopamine and serotonin in fronto-striatal circuits (25, 31) might have altered the valuation process when different rewards compete. Indeed, dopamine signaling in the prelimbic cortex plays a major role in goal-directed behavior and the ability to detect the motivational value of outcomes (53), as well as in selective attention to cues predicting reward (54). Previous data (29) and current results clearly demonstrate that β2<sup>−</sup>/<sup>−</sup> mice can adapt their behavior normally when the choice to be made is essentially underpinned by the motivational value of the outcome with no uncertainty or risk involved. This strongly suggests that the decision alteration seen in the gambling task in β2<sup>−</sup>/<sup>−</sup> mice was not due to a deficit in valuation or motivation processes *per se*.
We, thus, suggest that dopamine alteration in fronto-striatal circuits of β2<sup>−</sup>/<sup>−</sup> mice may underpin, at least in part, the decision-making alteration seen in the MGT. Accordingly, the fact that β2<sup>−</sup>/<sup>−</sup> mice showed perseveration in the extinction task, together with the well-demonstrated role of the prelimbic cortex in flexibility (28, 34), suggests that gambling alterations of β2<sup>−</sup>/<sup>−</sup> mice are due to prefrontal dysfunction leading to lower exploration and higher rigidity.

### CONCLUSION

In conclusion, we demonstrate for the first time that β2-nAChRs play a critical role in the fine tuning of prefrontal E/I balance and that lack of these receptors changes α7-mediated modulation of prefrontal activity. A shifted set-point of the E/I balance may promote dysfunction of the infralimbic, prelimbic, and insular cortices and of the hippocampus, leading behaviorally to decision-making defects that originate in a lack of flexibility and blunted sensitivity to punishment, specifically when uncertainty regarding outcome is high.

### REFERENCES


### AUTHOR CONTRIBUTIONS

SG designed the gambling task, supervised the behavioral experiments and their analyses, and wrote the paper. AF conducted the social task experiments, performed statistical analyses, and wrote the paper. EP, AC, and EM conducted the gambling task, performed statistical analyses, and wrote the paper. XL and CM performed the electrophysiology experiments and analyzed the data. PF designed the electrophysiology experiments, analyzed the data, and wrote the paper. AR supervised the behavioral experiments and wrote the paper.

### FUNDING

This project was supported by Centre National de la Recherche Scientifique (CNRS), Université Paris-Sud, and Institut de recherche Biomédical des Armées (IRBA).


**Conflict of Interest Statement:** Dr. AC is the head of a CRO dedicated to behavioral research. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Pittaras, Faure, Leray, Moraitopoulou, Cressant, Rabat, Meunier, Fossier and Granon. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Changes in the Influence of Alcohol-Paired Stimuli on Alcohol Seeking across Extended Training

*Laura H. Corbit<sup>1</sup>\* and Patricia H. Janak<sup>2,3</sup>*

*<sup>1</sup>School of Psychology, The University of Sydney, Sydney, NSW, Australia, <sup>2</sup>Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD, USA, <sup>3</sup>Department of Neuroscience, Johns Hopkins School of Medicine, Johns Hopkins University, Baltimore, MD, USA*

Previous work has demonstrated that goal-directed control of alcohol-seeking and other drug-related behaviors is reduced following extended self-administration and drug exposure. Here, we examined how the magnitude of stimulus influences on responding changes across similar training and drug exposure. Rats self-administered alcohol or sucrose for 2 or 8 weeks. Previous work has shown that 8 weeks, but not 2 weeks of self-administration produces habitual alcohol seeking. Next, all animals received equivalent Pavlovian conditioning sessions where a discrete stimulus predicted the delivery of alcohol or sucrose. Finally, the impact of the stimuli on ongoing instrumental responding was examined in a Pavlovian–instrumental transfer (PIT) test. While a significant PIT effect was observed following 2 weeks of either alcohol or sucrose self-administration, the magnitude of this effect was greater following 8 weeks of training. The specificity of the PIT effect appeared unchanged by extended training. While it is well established that evaluation of the outcome of responding contributes less to behavioral control following extended training and/or drug exposure, our data indicate that reward–predictive stimuli have a stronger contribution to responding after extended training. Together, these findings provide insight into the factors that control behavior after extended drug use, which will be important for developing effective methods for controlling and ideally reducing these behaviors.

Keywords: outcome devaluation, Pavlovian–instrumental transfer, ethanol, stimuli, habit learning

## INTRODUCTION

While early recreational drug use is largely driven by the reinforcing properties of the drug, over extended use, many of the positively reinforcing effects of drugs are diminished. The continued drug use by some individuals under such conditions suggests that drug-seeking behavior has become disconnected from expectations regarding the outcome of that behavior. An increasing automatization of responding could explain this shift. Although the notion that responding for drug rewards becomes habitual is prevalent in the addiction field (1–3), it has only been relatively recently that empirical studies have directly assessed this claim. There is now accumulating evidence that with prolonged drug use, control of drug-seeking behaviors transitions from flexible and goal-directed to habitual.

Tests developed in the animal learning field can dissociate goal-directed actions from response habits. Goal-directed actions rely on their relationship to, and the value of, their associated outcome.

#### *Edited by:*

*Vincent David, Centre National de la Recherche Scientifique (CNRS), France*

#### *Reviewed by:*

*Kate M. Wassum, University of California Los Angeles, USA; Richard J. Lamb, University of Texas Health Science Center at San Antonio, USA*

> *\*Correspondence: Laura H. Corbit laura.corbit@sydney.edu.au*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 03 August 2016; Accepted: 23 September 2016; Published: 10 October 2016*

#### *Citation:*

*Corbit LH and Janak PH (2016) Changes in the Influence of Alcohol-Paired Stimuli on Alcohol Seeking across Extended Training. Front. Psychiatry 7:169. doi: 10.3389/fpsyt.2016.00169*

Corbit and Janak PIT after Extended Training

Thus, responding tracks both the action–outcome contingency and current value of the outcome and is normally reduced when either the former is degraded or the latter reduced (4, 5). In contrast to the knowledge of the action–outcome relationship and evaluation of outcome value that characterize goal-directed behaviors, habits are argued to rely on an independent learning process. Habits are acquired as stimulus–response (S–R) associations that are gradually strengthened each time a response is reinforced, explaining why the relative dominance of habitual control grows with extended training (6, 7). Because habitual responding is controlled by an S–R association that does not include a representation of the outcome or its value, changes in the value of the outcome have no immediate effect on the performance of habitual responses (6, 8). Thus, by specifically manipulating outcome value or the action–outcome contingency and observing consequent effects on performance, the outcome devaluation and contingency degradation tests have become useful tools for identifying goal-directed and habitual actions (5), and evidence of drug-induced habits has largely been derived from studies using these tests. The outcome devaluation task, in particular, has been effective in demonstrating that drug exposure can promote habitual control. For example, sensitizing doses of psychostimulant drugs prior to training with food reward can promote rapid habit formation evidenced by impaired sensitivity to devaluation (9–13). Likely of more direct relevance to human addiction, extensive, but not limited self-administration training with cocaine (14), alcohol (15), or nicotine (16) results in drug seeking that is no longer sensitive to outcome devaluation.

These failures of goal-directed control imply that drug seeking is habitual; nonetheless, they do not directly assess the S–R learning that is thought to underlie habitual behavior. While habits are thought to rely on the formation of an S–R association, the stimuli that support the S–R association and, consequently, habitual performance in a free operant paradigm are typically poorly defined. The S–R association is established during instrumental training when the response is repeatedly reinforced, incrementally strengthening the association between that response and situational cues that are present. These stimuli could be derived from the physical context. However, since these cues are incidental, it is not clear what exact information the animal uses (context, elements of the context, sight of the lever, aspects of their own behavior, the outcome itself, etc.), and this could differ animal-by-animal, making the stimuli difficult to manipulate. While there is an independent literature implicating drug-related stimuli in craving and subsequent relapse risk (17–21), how the nature of such influences changes across the course of extended drug use has rarely been assessed and deserves further study, particularly in relation to whether behavior is under goal-directed or habitual control.

Stimulus influences in general can be readily manipulated and examined using the Pavlovian–instrumental transfer (PIT) task. This task examines the influence of stimuli on the choice and vigor of responses that earn drug or other rewards. It involves three independent stages. In the Pavlovian conditioning phase, a stimulus or stimuli are paired with an outcome or outcomes (such as drug, food, or other reward). Separately, animals are trained to perform one or more instrumental actions, such as a lever-press response, to earn reward. Importantly, the Pavlovian stimuli are not present during the instrumental training phase. In the final test stage, the instrumental action(s) is available and, for the first time, the Pavlovian stimuli are presented in order to assess their influence on instrumental performance. Changes in instrumental responding in the presence of the Pavlovian stimuli relative to stimulus-free periods constitute the PIT effect. Tests of PIT are typically conducted under extinction conditions (i.e., no rewards are delivered following either stimulus presentations or performance of the instrumental response) to prevent new learning at the time of testing and to allow confidence that effects rely on associations previously established during training. There is some evidence that the magnitude of PIT effects increases with extended instrumental training with food reward (7); however, the relationship between the amount of training and the magnitude of PIT effects is not straightforward (22). Furthermore, how stimulus effects related to drug seeking may change over the course of extended training has not been extensively investigated.
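One simple way to quantify the PIT effect described above is a difference score: mean responding during the stimulus minus mean responding during the stimulus-free baseline. The sketch below illustrates this with made-up lever-press counts; the difference-score definition, the function name, and all numbers are assumptions for illustration and are not taken from the study.

```python
def pit_score(cs_presses, baseline_presses):
    """Mean lever presses during the stimulus minus mean presses at baseline."""
    mean_cs = sum(cs_presses) / len(cs_presses)
    mean_baseline = sum(baseline_presses) / len(baseline_presses)
    return mean_cs - mean_baseline


# Hypothetical lever presses per 2-min bin for one animal; NOT study data
cs_plus = [14, 10]       # reward-paired stimulus presentations
pre_cs_plus = [6, 4]     # stimulus-free bins preceding each CS+
cs_minus = [5, 5]        # unpaired stimulus presentations
pre_cs_minus = [6, 4]    # stimulus-free bins preceding each CS-

cs_plus_effect = pit_score(cs_plus, pre_cs_plus)
cs_minus_effect = pit_score(cs_minus, pre_cs_minus)
print(cs_plus_effect, cs_minus_effect)
```

A positive score for the reward-paired stimulus alongside a score near zero for the unpaired stimulus would indicate that the elevation in responding is selective to the stimulus that predicted reward.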

We have previously shown that an alcohol-seeking response is sensitive to devaluation of the alcohol reward following 2 weeks, but not 8 weeks of training, providing evidence of a failure of goal-directed control after this extended training and drug exposure (11, 12, 15). In the current study, we examined the influence of alcohol-predictive stimuli on an alcohol-seeking response across this same timeframe. Given that habits are thought to be driven by stimuli rather than outcome and that the relative dominance of the habit system increases across extended training, we predicted that stimulus influences on responding should increase with training, that is, the magnitude of the PIT effect would increase from 2 weeks of training, where behavior is goal-directed, to 8 weeks of training, where behavior is habitual. We compared any changes in the magnitude of the PIT effect in animals trained to self-administer alcohol versus sucrose reward. Furthermore, we tested whether the specificity of PIT changes over extended training.

### MATERIALS AND METHODS

### Experiment 1: Pavlovian–Instrumental Transfer Following 2 or 8 Weeks of Alcohol Self-Administration

#### Subjects and Apparatus

Sixteen male Long–Evans rats (approximately 300 g at the start of the experiment; Harlan, Indianapolis, IN, USA) were singly housed with free access to food and water. This study was conducted in accordance with the recommendations of the National Institutes of Health Office of Laboratory Animal Welfare. All procedures were approved by the Institutional Animal Care and Use Committee of the Ernest Gallo Clinic and Research Center at the University of California, San Francisco, CA, USA. Training and testing took place in 16 Med Associates (East Fairfield, VT, USA) operant chambers housed within sound-attenuating shells. Each chamber was equipped with a pump fitted with a syringe that delivered a fixed volume of solution into a recessed magazine in the chamber when activated. The chambers contained retractable levers to the left and right of the magazine. A houselight mounted on the top-center of the opposite wall provided illumination.

#### Alcohol Acclimation in the Home Cage

To familiarize the rats with the taste and pharmacological effects of alcohol, they were given free access to 10% ethanol (10E) (v/v) in filtered water in the home cage, for 24 h/day for 14 days, followed by 14 days of 1-h access to 10E at the time that training would subsequently occur. Water was always available in a separate bottle fixed to the home cage. Rats were weighed daily, and EtOH consumption was recorded.

#### Instrumental Training

Animals were assigned to either a 2-week or an 8-week group (*N* = 8/group) in an effort to match home cage alcohol consumption. The 2-week group completed 14 daily training sessions, whereas the 8-week group completed 56 daily sessions before Pavlovian training and PIT testing. Training started with a single 30-min magazine training session, where 10E was delivered under a random time (RT) 60-s schedule. Rats were next trained to make a lever-press response to deliver small aliquots (0.1 ml) of 10E in 60-min sessions. The first 2 days of training were under a continuous reinforcement schedule; reinforcement was then shifted to a random ratio-2 schedule for 3 days, followed by a random ratio-3 schedule for the remainder of training. Animals failing to respond at levels sufficient to achieve alcohol intake of at least 0.3 g/kg for 5 out of 7 days/week were excluded from the study. In sum, for the experiments reported here, four animals were excluded on this basis; however, group sizes reported here reflect animals that met the instrumental training criterion as only those animals went on to Pavlovian conditioning and PIT testing. The reward receptacle was examined at the end of each session to ensure that the earned rewards were consumed; apart from the initial training day, this was always the case. At the end of instrumental training, animals were tested for sensitivity to outcome devaluation by outcome-specific satiety. These procedures and data are reported elsewhere (15).
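To give a sense of scale for the 0.3 g/kg intake criterion, the following sketch converts earned rewards into grams of ethanol per kilogram of body weight. It assumes an ethanol density of about 0.789 g/ml and, for the example, a body weight of 0.3 kg (the approximate starting weight; actual weights would change over training). The function and constant names are illustrative.

```python
import math

ETHANOL_DENSITY_G_PER_ML = 0.789  # approximate value, assumed here


def intake_g_per_kg(rewards, body_weight_kg, volume_ml=0.1, ethanol_fraction=0.10):
    """Ethanol intake (g/kg) from a number of 0.1-ml rewards of 10% (v/v) ethanol."""
    grams = rewards * volume_ml * ethanol_fraction * ETHANOL_DENSITY_G_PER_ML
    return grams / body_weight_kg


def rewards_needed(target_g_per_kg, body_weight_kg, volume_ml=0.1, ethanol_fraction=0.10):
    """Smallest whole number of rewards that reaches the target intake."""
    grams_per_reward = volume_ml * ethanol_fraction * ETHANOL_DENSITY_G_PER_ML
    return math.ceil(target_g_per_kg * body_weight_kg / grams_per_reward)


# Example for a hypothetical 0.3-kg rat and the 0.3 g/kg criterion
minimum_rewards = rewards_needed(0.3, 0.3)
print(minimum_rewards, round(intake_g_per_kg(minimum_rewards, 0.3), 3))
```

Under these assumptions, a 0.3-kg rat would need to earn roughly a dozen rewards per session to meet the criterion; heavier animals would need proportionally more.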

#### Pavlovian Training

Pavlovian training and PIT testing followed our previously published methods (23). Briefly, following instrumental training and devaluation testing, the rats received eight sessions of Pavlovian conditioning. Two auditory stimuli (white noise and clicker) served as conditional stimuli. One of these stimuli (CS+) was paired with ethanol delivery, while the other stimulus (CS−) had no programmed consequences (counterbalanced). Six presentations of each stimulus were given in each session in random order, separated by periods in which no stimuli were present. The intertrial interval varied, with a mean of 4.5 min. The stimulus presentations were 2 min long. During each CS+ presentation, 0.2 ml of 10E was delivered on an RT 30-s schedule. Because the schedule of 10E delivery was random, the number of outcomes varied across sessions. On average, the animals received 4.8 ml of 10E across the 75-min session, which should lead to significant blood alcohol levels. The number of magazine entries during each stimulus and pre-stimulus interval of equal length (2 min) was measured. The magazine was inspected at the end of the training sessions to ensure that the solutions had been consumed.
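The 4.8-ml average follows directly from the schedule parameters given above, as the short calculation below shows. It is a sketch that treats the RT 30-s schedule as yielding one delivery per 30 s on average; the function and constant names are hypothetical.

```python
CS_DURATION_S = 120      # each stimulus presentation lasts 2 min
MEAN_INTERVAL_S = 30     # RT 30-s schedule: on average one delivery every 30 s
CS_PLUS_PER_SESSION = 6  # six CS+ presentations per session
DELIVERY_ML = 0.2        # volume of 10E per delivery


def expected_session_volume_ml(cs_duration_s=CS_DURATION_S,
                               mean_interval_s=MEAN_INTERVAL_S,
                               n_cs_plus=CS_PLUS_PER_SESSION,
                               delivery_ml=DELIVERY_ML):
    """Expected volume of 10E delivered across all CS+ presentations in a session."""
    expected_deliveries_per_cs = cs_duration_s / mean_interval_s  # 4 per CS+
    return n_cs_plus * expected_deliveries_per_cs * delivery_ml
```

Four expected deliveries per CS+ across six presentations gives 24 deliveries of 0.2 ml, or 4.8 ml per session; the volume in any single session varies around this value because the schedule is random.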

#### Pavlovian–Instrumental Transfer Test

Rats received a single PIT test in which the lever was available, and each stimulus was presented twice, interspersed with intervals of no stimulus (Ø). No rewards were delivered during testing. The 22-min test contained eight 2-min bins [two white noise trials (N) and two clicker trials (C) alternated with four Ø trials in the following order: N, C, C, N]. Each stimulus presentation was separated from the subsequent baseline (Ø) interval by 1 min, and there was an additional 2-min extinction period prior to the first pre-CS interval.
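The test structure can be laid out as a timeline to check that the segments sum to 22 min. The sketch below is one reading of the description, not a confirmed protocol: it assumes that each 2-min baseline (Ø) bin immediately precedes its stimulus and that a 1-min gap follows every stimulus, including the last. The function name and segment labels are hypothetical.

```python
def build_pit_timeline(order=("N", "C", "C", "N"),
                       extinction_min=2, pre_min=2, cs_min=2, gap_min=1):
    """Return (label, start_min, end_min) segments for the PIT test."""
    # Initial extinction period, then baseline -> stimulus -> gap for each trial.
    plan = [("extinction", extinction_min)]
    for stim in order:
        plan += [("pre", pre_min), (stim, cs_min), ("gap", gap_min)]

    timeline, t = [], 0
    for label, duration in plan:
        timeline.append((label, t, t + duration))
        t += duration
    return timeline


if __name__ == "__main__":
    for label, start, end in build_pit_timeline():
        print(f"{start:>2}-{end:<2} min  {label}")
```

With these assumptions the timeline contains eight 2-min bins (four baseline, four stimulus) and ends at 22 min.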

#### Data Analysis

Data were analyzed using analysis of variance (ANOVA). Significant main effects and interactions were analyzed with further ANOVA, and significant simple effects were examined with pairwise comparisons.
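For readers unfamiliar with the statistic, a between-group comparison at a single level of stimulus reduces to a one-way ANOVA. The sketch below computes the F ratio and its degrees of freedom in plain Python; it illustrates the general calculation only and is not the authors' analysis code, which is not described in the text. The function name is hypothetical.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

The function takes one list of scores per group, for example lever presses during the CS+ for each animal in the 2-week and 8-week groups.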

## Experiment 2: Pavlovian–Instrumental Transfer Following 2 or 8 Weeks of Sucrose Self-Administration

#### Subjects and Apparatus

The housing conditions and training apparatus were identical to those described in Experiment 1. Seventeen rats were assigned to either a 2-week (*N* = 8) or 8-week (*N* = 9) group. Rats were given free access to a 2% sucrose solution (2S) (weight/volume in filtered water) in the home cage for 48 h before training. The 2S solution was chosen based on pilot studies suggesting it would produce response rates similar to those for 10E.

#### Instrumental and Pavlovian Training and PIT Test

The training and test parameters were identical to those described for Experiment 1, except that 2S instead of 10E was used as the reinforcer.

## Experiment 3: The Specificity of Pavlovian–Instrumental Transfer Following 2 or 8 Weeks of Alcohol Self-Administration

#### Subjects and Apparatus

The housing conditions and training apparatus were identical to those described in Experiment 1.

#### Instrumental Training

Thirty rats were assigned to either a 2-week (*N* = 14) or 8-week (*N* = 16) group and trained to self-administer 10E as in Experiment 1.

#### Pavlovian Training

The rats received eight sessions of Pavlovian conditioning similar to that described above, except that two rewards (10E and 2S) were paired with the two stimuli (white noise and clicker). Six presentations of each stimulus were given in each session in random order separated by stimulus-free intervals. During each stimulus presentation, 0.2 ml of the appropriate solution was delivered on a RT 30-s schedule. All other parameters matched those described in Experiment 1.

#### Pavlovian–Instrumental Transfer

The PIT test was identical to that described in Experiment 1.

### RESULTS

### Experiment 1: Pavlovian–Instrumental Transfer Is Enhanced Following Extended Alcohol Self-Administration

#### Training

Response rates at the end of instrumental training for the 2- and 8-week groups are included in **Table 1**. Magazine entries across days of Pavlovian training are shown in **Figure 1A**. Responding during the CS+ increased across days relative to responding during either the CS− or the baseline period. ANOVA confirmed these observations with a significant effect of stimulus [*F*(2,28) = 82.4, *p* < 0.001], day [*F*(7,98) = 12.3, *p* < 0.001], and an interaction between these factors [*F*(14,196) = 20.1, *p* < 0.001]. Importantly, there was no effect of group [*F*(1,14) = 0.002, *p* = 0.961], and none of the interactions involving group were significant (*F*s < 1).

#### Pavlovian–Instrumental Transfer

We tested the hypothesis that stimulus influences on responding would grow with extended training by measuring the magnitude of the PIT effect following 2 or 8 weeks of training. The data are presented in **Figure 1B**, which shows that the alcohol-predictive stimulus elevated the alcohol-seeking response from baseline, and that this effect was larger after 8 weeks of training. The analyses confirmed these impressions, revealing an effect of stimulus [pre, CS+, CS−; *F*(2,28) = 16.7, *p* < 0.001], no effect of group [*F*(1,14) = 1.4, *p* = 0.253], but an interaction between these factors [*F*(2,28) = 4.3, *p* = 0.024]. To examine the nature of the interaction and to address whether the impact of the CS+ was specifically enhanced with extended training, simple effects analyses comparing groups for each level of stimulus were conducted. The groups did not differ in responding during the baseline [pre; *F*(1,15) = 0.45, *p* = 0.511] or CS− [*F*(1,15) = 0.51, *p* = 0.486] intervals. However, responding during the CS+ was greater for the 8-week than for the 2-week group [*F*(1,15) = 5.21, *p* = 0.039]. Furthermore, responding during the CS+ was greater than during the baseline period for both the 2- and 8-week groups [2 weeks: *F*(1,7) = 2.5, *p* = 0.041; 8 weeks: *F*(1,7) = 6.31, *p* < 0.001], confirming significant PIT in each group.

Table 1 | Instrumental response rates for alcohol prior to PIT testing in Experiment 1.

*Mean (SEM) lever-press responses, volume of alcohol consumed, and gram/kilogram ethanol levels for the final 3 days of instrumental training.*

### Experiment 2: Pavlovian–Instrumental Transfer Following 2 or 8 Weeks of Sucrose Self-Administration Training

Instrumental response rates at the end of training are shown in **Table 2**. Pavlovian training is shown in **Figure 2A**. As with alcohol reward, responding during the CS+ increased across days relative to responding during either the CS− or baseline period. ANOVA confirmed these observations with a significant effect of stimulus [*F*(2,30) = 97.5, *p* < 0.001], day [*F*(7,105) = 3.0, *p* = 0.006], and an interaction between these factors [*F*(14,210) = 9.6, *p* < 0.001]. Again, there was no effect of group [*F*(1,15) = 3.0, *p* = 0.103], and none of the interactions involving group were significant (*F*s < 1).

#### Pavlovian–Instrumental Transfer

Data from the PIT test are shown in **Figure 2B**, which shows that a sucrose-predictive stimulus also elevates performance of a sucrose-seeking response, and that this effect appears to grow with extended training. Analyses revealed an effect of stimulus [*F*(2,30) = 8.64, *p* = 0.001], an effect of group [*F*(1,15) = 5.66, *p* = 0.029], and an interaction between these factors [*F*(2,30) = 4.61, *p* = 0.017]. As above, to address whether the impact of the CS+ was enhanced with extended training, simple effects analyses comparing groups for each level of stimulus were conducted. The groups did not differ in responding during the baseline [pre; *F*(1,16) = 0.06, *p* = 0.808] or CS− [*F*(1,16) = 0.76, *p* = 0.395] intervals. However, responding during the CS+ was greater for the 8-week than for the 2-week group [*F*(1,16) = 8.2, *p* = 0.011]. Furthermore, responding during the CS+ was greater than during the baseline period for both the 2- and 8-week groups [2 weeks: *F*(1,7) = 8.59, *p* = 0.022; 8 weeks: *F*(1,8) = 16.58, *p* = 0.002], confirming significant PIT in each group.

Figure 1 | Pavlovian–instrumental transfer is greater following extended alcohol self-administration. (A) Mean magazine entries (+SEM) during the pre-CS (baseline) period and presentations of the CS+ and CS− across days of Pavlovian training for the 2- and 8-week training groups. (B) Mean lever presses (+SEM) during the pre-CS (baseline) period and presentations of the CS+ and CS− during the Pavlovian–instrumental transfer test. The excitatory effects of the CS+ are greater for the 8-week group. \*indicates responding during the CS+ is greater than during the baseline period, *p* < 0.05. \*\*indicates responding is greater for the 8-week group than for the 2-week group, *p* < 0.05.

### Experiment 3: The Specificity of Pavlovian–Instrumental Transfer Following 2 or 8 Weeks of Alcohol Self-Administration

#### Training

Instrumental response rates at the end of training are shown in **Table 3**. Pavlovian training is shown in **Figure 3A**. Responding during both stimuli increased similarly across days relative to responding during the baseline period. ANOVA confirmed these observations with a significant effect of stimulus [*F*(2,56) = 78.8, *p* < 0.001], day [*F*(7,196) = 47.8, *p* < 0.001], and an interaction between these factors [*F*(14,392) = 18.8, *p* < 0.001]. The stimulus effect was driven by increased responding during the stimuli relative to the baseline period. Responding during E+ and S+ did not differ [*F*(1,28) = 0.426, *p* = 0.519]. Again, there was no effect of group [*F*(1,28) = 1.6, *p* = 0.223], and none of the interactions involving group were significant (*F*s < 1).

Table 2 | Instrumental response rates for sucrose prior to PIT testing in Experiment 2.

*Mean (SEM) lever-press responses and volume of sucrose consumed for the final 3 days of instrumental training.*

#### Pavlovian–Instrumental Transfer

Data from the PIT test are shown in **Figure 3B**. There was an effect of stimulus [*F*(2,56) = 29.38, *p* < 0.001] and an effect of group [*F*(1,28) = 5.86, *p* = 0.023]. The interaction between these factors was not significant [*F*(2,56) = 2.39, *p* = 0.101], potentially because baseline responding was slightly higher in the 8-week group in this experiment. Based on the results of Experiments 1 and 2, we further explored whether the magnitude of the stimulus effects differed between groups. The groups did not differ in responding during the baseline [pre; *F*(1,29) = 3.23, *p* = 0.083]. However, responding during the E+ was greater for the 8-week than for the 2-week group [*F*(1,29) = 5.82, *p* = 0.023]. Responding during the S+ did not differ between groups [*F*(1,29) = 0.77, *p* = 0.389]. Responding during the E+ was greater than during the baseline period for both the 2- and 8-week groups [2 weeks: *F*(1,13) = 35.25, *p* < 0.001; 8 weeks: *F*(1,14) = 32.79, *p* < 0.001]. Responding was also greater during the S+ for the 2-week group [*F*(1,13) = 8.16, *p* = 0.014] but failed to reach significance for the 8-week group [*F*(1,14) = 4.18, *p* = 0.059], overall confirming PIT effects in each group. Finally, responding during the E+ was greater than responding during the S+ in both the 2- and 8-week groups [*F*(1,13) = 6.36, *p* = 0.026; *F*(1,14) = 13.09, *p* = 0.003], providing evidence of specific PIT in addition to some general excitatory effects of the S+.

Table 3 | Instrumental response rates for alcohol prior to PIT testing in Experiment 3.

*Mean (SEM) lever-press responses, volume of alcohol consumed, and gram/kilogram ethanol levels for the final 3 days of instrumental training.*

Figure 2 | Pavlovian–instrumental transfer is greater following extended sucrose self-administration. (A) Mean magazine entries (+SEM) during the pre-CS (baseline) period and presentations of the CS+ and CS− across days of Pavlovian training for the 2- and 8-week training groups. (B) Mean lever presses (+SEM) during the pre-CS (baseline) period and presentations of the CS+ and CS− during the Pavlovian–instrumental transfer test. The excitatory effects of the CS+ are greater for the 8-week group. \*indicates responding during the CS+ is greater than during the baseline period, *p* < 0.05. \*\*indicates responding is greater for the 8-week group than for the 2-week group, *p* < 0.05.

Figure 3 | (A) Mean magazine entries (+SEM) during the pre-CS (baseline) period and presentations of the alcohol-predictive (E+) and sucrose-predictive (S+) stimuli across days of Pavlovian training for the 2- and 8-week training groups. (B) Mean lever presses (+SEM) during the pre-CS (baseline) period and presentations of the E+ and S+ during the Pavlovian–instrumental transfer test. The excitatory effects of the E+ are greater for the 8-week group. While S+ enhanced responding from baseline, this effect did not differ for the 2- and 8-week groups. \*indicates responding during the CS+ is greater than during the baseline period, *p* < 0.05. \*\*indicates responding is greater for the 8-week group than for the 2-week group, *p* < 0.05. # indicates responding is greater during E+ than during S+, *p* < 0.05.

### DISCUSSION

Previous work has shown that exposure to drugs, including alcohol, promotes the development of habitual control of responding. The extant data have largely been generated using the outcome devaluation task, where insensitivity to changes in the value of the outcome produced by responding demonstrates a lack of goal-directed control. Such findings are taken as evidence of habitual control since the outcome of responding is not part of the underlying associative structure that supports habitual behavior, and as such, manipulations of the outcome are expected to have no immediate effect on performance of a habitual response. Nonetheless, since habit learning is thought to rely on an independent learning process involving the formation of associations between stimuli present when responding is reinforced and the response itself, it is reasonable to expect changes in the influence of stimuli on responding as behavior transitions from goal-directed to habitual. While there is some evidence that stimulus influences grow with the development of habitual control (7), this has not been explored with drug reward, where drug-related stimuli are thought to contribute to sustained drug use and precipitate relapse following periods of abstinence.

Here, we find that the magnitude of the PIT effect is greater following 8 weeks of self-administration training than it is after 2 weeks, time points where related work has shown responding to be habitual and goal-directed, respectively, based on sensitivity to outcome devaluation (15). This effect is not explained by changes in overall response rates, which were similar for the 2- and 8-week training groups. Importantly, the amount of Pavlovian training was the same for the two groups, and the measure of Pavlovian performance during training, the magazine entry response, despite including both CS- and US-related responding, did not differ between groups. This suggests that it is not the strength of the Pavlovian conditioning that differs, but rather that, with extended instrumental training, the susceptibility of the instrumental response to Pavlovian influences increases. Similar results were found in animals trained to self-administer alcohol or sucrose reward, suggesting that this phenomenon relates to extended training rather than something specific about drug reward. Of note, the previous study by Holland (7), showing evidence of enhanced PIT with extended training, also used natural rewards; thus, the current finding with sucrose reward is not entirely unexpected. It is important to note that while few studies have manipulated the amount of training to examine effects on PIT within a single study, a meta-analysis performed by Holmes et al. (22) found a complex relationship between the amount of training and the magnitude of PIT effects. For example, they found that PIT effects were greater with more instrumental training when instrumental training was conducted after, but not before, the Pavlovian training phase, in apparent contrast to the current findings. However, the meta-analysis only included studies that trained rats on interval schedules and excluded studies using drug reward, including alcohol. 
Further, the range of instrumental training for studies included in the analysis was 2–20 sessions with the majority using 6–12 sessions, which is a fairly narrow range. Rats in the current experiments underwent almost three times as much training as the maximum reported by Holmes et al. (22), reinforcement was according to ratio schedules, which could produce different learning and performance patterns than interval schedules, and, in Experiments 1 and 3, rats earned alcohol reward. With these important procedural details in mind, it is not clear that the results of the meta-analysis can be extended to the current results. Nonetheless, it appears that multiple factors contribute to the magnitude of PIT effects, and even the relationship with the amount of training may be complex, meaning enhanced PIT may not always be observed following extended training.

Interestingly, an experimental study included in Holmes et al. (22) found that extensive (16 sessions) Pavlovian training reduced rather than enhanced PIT in comparison to shorter training (4 sessions). They interpreted this result in terms of response competition as they also found evidence of increased magazine entries in the extensively trained group. However, absolute response rates for the lever-press and magazine entry responses were not high in relation to the 2-min stimulus interval, suggesting response competition is less likely, although an effect of unmeasured Pavlovian responses in addition to the magazine response cannot be ruled out. In the current study, magazine entries during Pavlovian training that followed instrumental training did not differ between groups. Furthermore, for response competition to account for the current results, this competition would have to be greater in the 2-week groups. Further experimentation would be required to provide any support for such a claim; however, as it was the amount of instrumental rather than Pavlovian training that varied in the current study, it seems more likely that some change to the nature of instrumental performance between groups is responsible for the effects observed here.

As noted above, while habits are thought to rely on an S–R association, the stimuli that support habitual performance in a free operant paradigm are typically poorly defined. One possibility is that these stimuli are derived from the physical context. Indeed, there is some evidence that contextual stimuli can contribute to habitual responding. For example, studies using designs where goal-directed and habitual responses are generated in the same animal train these two responses in distinct contexts that differ in a range of visual and tactile properties (24, 25). Furthermore, instrumental performance is sometimes decreased when animals are tested in a context that is distinct to where they were trained, suggesting that the context contributes partially to instrumental performance (26). In contrast, the PIT procedure measures the effects of stimuli conditioned in a separate Pavlovian training phase rather than those that are incidentally present as animals perform the instrumental response. While it does not directly assess the strength of the S–R association thought to underlie response habits, it nonetheless provides evidence of the susceptibility of instrumental responding to Pavlovian influences. How the independently trained Pavlovian stimuli interact with the S–R association thought to underlie responding is currently unknown. In addition to a role for the training context, it is also possible that the animals' own behavior sets the occasion for further responses or otherwise contributes to the S that drives S–R based responding. For example, animals may learn to follow magazine entry with a sequence of lever-press responses and as such, CS-elicited magazine entries could provoke additional lever presses in the presence of the CS. 
To the extent that behavior is more automatized following extended training or that sequences of behavior have been organized into "chunks," it is possible that such effects could grow with extended training and account for the elevated PIT observed in the 8-week groups. Future work involving detailed analyses of response microstructure within PIT testing could address these possibilities.

Another possibility is that the outcome serves not only as a reinforcer but also as a stimulus that directs subsequent responses. Strong evidence that animals use the outcome in this way comes from some elegant experiments by Ostlund and Balleine examining outcome-specific reinstatement effects (27). For example, they trained animals under circumstances where different outcomes (O1 and O2) not only served as reinforcers for responding (R1–O1; R2–O2) but also served as antecedents of the response. The critical manipulation was that the outcome of responding and that which preceded the subsequent response were either congruent (O1–R1–O1; O2–R2–O2) or incongruent (O1–R2–O2; O2–R1–O1). Ostlund and Balleine then tested the ability of, say, O1 to reinstate extinguished responding. They found that presentation of O1 reinstated R1 in the group with congruent training; however, O1 reinstated R2 after incongruent training, suggesting that the antecedent O–R association is responsible for reinstatement of instrumental responding. Thus, outcomes can serve as stimuli to direct responding. Applying this to an expectancy- or cueing-based explanation of PIT (28, 29), presentation of S will retrieve a representation of the outcome it was trained with, which in turn, through this O–R association, will promote performance of a response also associated with that O. In the current experiments, the free operant training of a single response is most similar to the congruent training of Ostlund and Balleine (27), and it would be expected that the earned outcome, say alcohol, serves not only as a reinforcer but also as a signal for performance of a response that earns alcohol. Presentation of the E+ then can invigorate performance of the alcohol response to generate the observed PIT effect. Based on the results of Ostlund and Balleine (27), one would expect this effect to be selective, which would explain why in Experiment 3, the effects of E+ but not S+ grow with extended training. 
To explain the enhanced PIT following extended training, this view assumes that the O–R association is incrementally strengthened with extended training, much as is suggested for the more general S–R association proposed to underlie habit learning. With a stronger O–R association, retrieval of O as a signal for responding by S should have a greater effect on responding in the extended training group, which could account for the enhanced PIT that was observed in these groups. Importantly, Ostlund and Balleine (27) found that while the magnitude of outcome-specific reinstatement effects was reduced by devaluation, the specificity of these effects remained intact, indicating that the influence of the outcome on response selection does not depend on outcome value. This finding parallels reports that outcome-specific PIT is not dependent on outcome value and explains how PIT effects could grow under conditions where outcome value plays little role in controlling performance (that is, the devaluation-insensitive performance of the extended training groups).

Several different types of PIT have been identified. Stimuli may produce an enhancement (or suppression) of responding as a result of the motivational consequences of association with reinforcement generally (referred to as non-selective or general transfer). Alternatively, a stimulus may have quite specific effects impacting only response(s) associated with the same outcome as is predicted by the stimulus (referred to as specific transfer). As noted above, to explain such PIT effects, some theoretical accounts suggest that stimuli produce an expectancy regarding a particular outcome that, through a form of S–R process (S–O–R), elevates the performance of actions associated with the predicted outcome [e.g., Ref. (28, 30)]. Interestingly, when rats were trained with two stimuli that predicted alcohol and sucrose, respectively, while both stimuli elevated responding (on a response trained with alcohol), the alcohol-predictive stimulus was more effective in elevating responding, providing some evidence of specific PIT, and importantly, only the influence of the alcohol-predictive stimulus grew with extended training, suggesting predominantly an outcome-specific effect rather than an energizing effect that should have impacted both stimuli. The meta-analysis conducted by Holmes et al. (22) found no evidence of changes to the specificity of PIT in experiments with designs that allowed examination of stimuli paired with the same or different outcomes as the target response. We have previously observed that alcohol-predictive stimuli are unique in that they also enhance performance of a response earning an alternate reward (sucrose) under training conditions that typically produce outcome-specific PIT (23). Thus, the lack of change in the influence of the sucrose stimulus on responding for alcohol in Experiment 3 is consistent with previous results (22). Whether the amount of training would have any impact on the previously reported general effects of an alcohol stimulus on responding for an alternate outcome, such as sucrose, was not tested in the current experiments and thus requires future experimentation.

While insensitivity to devaluation provides the most direct evidence of performance that is independent of goal value, it is worth noting several important demonstrations that the ability of stimuli to trigger responding does not depend on the predicted outcome being valuable at the time of testing. While the current study demonstrates particularly strong stimulus effects after training shown elsewhere to generate responding that is insensitive to devaluation, we did not examine the effects of devaluation on expression of PIT. However, others have shown that the ability of a stimulus to augment the performance of an action predicting the same outcome as the stimulus is not altered by outcome devaluation (31–33), although baseline response rates may be reduced. These types of findings demonstrate that the ability of stimuli to invigorate responding can be independent of evaluative processes related to the consequences of that responding. PIT effects also persist following manipulations that degrade the stimulus–outcome (S–O) contingency, such as extinction of S, pairing of S with a new outcome, or switching the S–O contingency to either a random or explicitly unpaired relationship with the outcome following initial training (34). These results, like those found with various recovery phenomena (spontaneous recovery, renewal, and reinstatement), suggest that S–O associations and their influence on behavior are persistent and difficult to change once established.

Of note, outcome devaluation and PIT tests are typically conducted under extinction conditions where reward is withheld, similar to other recovery phenomena used to model human relapse. This differs from the human situation where drug seeking is likely to produce the desired drug. With this in mind, increases in the magnitude of effects, such as PIT, perhaps speak to the power of drug-associated stimuli to provoke the initiation of drug-seeking behaviors. The stronger these effects, the more likely stimuli are to trigger a drug-seeking response, which in real-world settings could result in drug use. Findings such as the current results, which suggest that the ability of stimuli to drive behavior increases under conditions that promote habitual control, provide some insight into the factors that control responding when it is not generated by expectation and evaluation of a particular outcome, and may help explain why habitual responding is resistant to change. Such findings may also improve understanding of the factors that contribute to relapse to drug use in individuals with a stated desire to abstain and who are aware of, but apparently insensitive to, the negative consequences of continued drug use.

### AUTHOR CONTRIBUTIONS

LC conducted the experiments. LC and PJ designed the experiments, analyzed the data, and prepared the manuscript.

### FUNDING

This research was supported by National Institutes of Health Grants AA014925 and AA018025 to PHJ and Australian National Health and Medical Research Council 1051037 and ABMRF to LHC.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Corbit and Janak. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Behavioral Neuroadaptation to Alcohol: From Glucocorticoids to Histone Acetylation

*Nicole Mons\* and Daniel Beracochea*

*CNRS UMR 5287, Institut des Neurosciences cognitives et intégratives d'Aquitaine, Nouvelle Université de Bordeaux, Pessac, France*

A prime mechanism that contributes to the development and maintenance of alcoholism is the dysregulation of the hypothalamic–pituitary–adrenal axis activity and the release of glucocorticoids (cortisol in humans and primates, corticosterone in rodents) from the adrenal glands. In the brain, sustained, local elevation of glucocorticoid concentration even long after cessation of chronic alcohol consumption compromises functional integrity of a circuit, including the prefrontal cortex (PFC), the hippocampus (HPC), and the amygdala (AMG). These structures are implicated in learning and memory processes as well as in orchestrating neuroadaptive responses to stress and anxiety. Thus, potentiation of anxiety-related neuroadaptation by alcohol is characterized by abnormal AMG hyperactivity coupled with hypofunction of the PFC and the HPC. This review describes research on molecular and epigenetic mechanisms by which alcohol causes distinct region-specific adaptive changes in gene expression patterns and ultimately leads to a variety of cognitive and behavioral impairments on prefrontal- and hippocampal-based tasks. Alcohol-induced neuroadaptations involve the dysregulation of numerous signaling cascades, leading to long-term changes in transcriptional profiles of genes, through the actions of transcription factors such as cAMP response element-binding protein (CREB) and chromatin remodeling due to posttranslational modifications of histone proteins. We describe the role of the prefrontal–HPC–AMG circuit in mediating the effects of acute and chronic alcohol on learning and memory, and the region-specific molecular and epigenetic mechanisms involved in this process. 
This review first discusses the importance of brain region-specific dysregulation of glucocorticoid concentration in the development of alcohol dependence and describes how persistently increased glucocorticoid levels in PFC may be involved in mediating working memory impairments and neuroadaptive changes during withdrawal from chronic alcohol intake. It then highlights the role of cAMP–PKA–CREB signaling cascade and histone acetylation within the PFC and limbic structures in alcohol-induced anxiety and behavioral impairments, and how an understanding of functional alterations of these pathways might lead to better treatments for neuropsychiatric disorders.

#### *Edited by:*

*Stefan Borgwardt, University of Basel, Switzerland*

#### *Reviewed by:*

*Giovanni Martinotti, Università degli Studi "G. d'Annunzio" Chieti – Pescara, Italy; Diana Martinez, Columbia University, USA*

> *\*Correspondence: Nicole Mons nicole.mons@u-bordeaux.fr*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 30 May 2016; Accepted: 21 September 2016; Published: 06 October 2016*

#### *Citation:*

*Mons N and Beracochea D (2016) Behavioral Neuroadaptation to Alcohol: From Glucocorticoids to Histone Acetylation. Front. Psychiatry 7:165. doi: 10.3389/fpsyt.2016.00165*

Keywords: alcoholism, epigenetic, learning and memory, glucocorticoid, anxiety, signaling, CREB, brain

## INTRODUCTION

Alcoholism is a chronic, often relapsing brain disorder characterized by periods of sustained, compulsive alcohol intake, relying in part on allostatic changes within the prefrontal cortex (PFC) and limbic structures [i.e., the hippocampus (HPC) and the amygdala (AMG)] [for review, see Ref. (1)]. This circuit plays key roles in behavior and cognitive function as well as in orchestrating neuroadaptive responses to stress and anxiety. The transition from recreational use to alcohol dependence and compulsive alcohol drinking takes place *via* neuroadaptive changes in the stress-related neural circuits, caused partly by repeated cycles of alcohol intoxication and withdrawal (2, 3). A prime mechanism that contributes to the development and maintenance of alcoholism is the dysregulation of the hypothalamic–pituitary–adrenal (HPA) axis activity (4) and the release of glucocorticoids (cortisol in humans and primates, corticosterone in rodents) from the adrenal glands. Clinical and preclinical evidence in both humans (5–7) and rodents (4, 8, 9) has shown that acute and chronic alcohol consumption, as well as withdrawal, markedly affects plasma glucocorticoid levels. Released glucocorticoids can influence brain function by readily crossing the blood–brain barrier and exerting effects through a dual glucocorticoid binding receptor system, i.e., the type I high-affinity mineralocorticoid receptors (MRs) and the type II low-affinity glucocorticoid receptors (GRs) (10), which act as ligand-dependent transcription factors to modulate target gene transcription. The MRs display a restricted expression in the brain, with highest densities in the HPC (11–13). The GRs are widely distributed throughout the brain (10, 14, 15) with a predominant expression in the three areas involved in learning and memory and particularly sensitive to the effects of stress, namely, the PFC, the dorsal HPC, and the AMG (16–18).
Indeed, human studies of Cushing's syndrome have shown that sustained cortisol elevation over the years compromises the integrity of the HPC–PFC circuitry and thus influences the onset and/or the severity of cognitive decline in various tasks, including spatial, decision-making, and working memory processes (19–23). Further, sustained, high local concentration of glucocorticoids is responsible for long-lasting cognitive impairments occurring several weeks after the cessation of alcohol in rodents (24, 25) and abstinent patients (26, 27). As to how elevation of glucocorticoids might be implicated in the enduring cellular, molecular, and behavioral changes, it has been suggested that neuroadaptation induced by alcohol exposure involves the dysregulation of numerous signaling cascades, leading to long-term changes in transcriptional profiles of genes, through the actions of transcription factors such as cAMP response element-binding protein (CREB) and chromatin remodeling due to posttranslational modifications of histone proteins [for review, see Ref. (28)]. In the following, we provide an overview of how transcriptional and histone acetylation changes in the PFC, the HPC, and the AMG play a central role in the glucocorticoid-dependent neuroadaptation and behavioral deficits that occur during acute and chronic alcohol exposure. While this review focuses on how spatial and temporal changes in histone acetylation drive alcohol-induced alterations in neural plasticity and behavior, it should be emphasized that other histone modification marks, such as histone phosphorylation and histone lysine methylation, occur in parallel and are also involved in the long-term adaptations in neural function and behavioral responses to alcohol exposure.

### BRAIN REGIONAL GLUCOCORTICOID RESPONSE TO CHRONIC ALCOHOL EXPOSURE

Surprisingly, little is known about the long-lasting neuroadaptive changes of glucocorticoids caused by prolonged alcohol consumption and withdrawal within neural circuits involved in learning and memory and emotional events and about their behavioral consequences. Studies, including our own, have shown that the initial phase of the alcohol withdrawal period produces elevation in both circulating and brain glucocorticoid levels (29–31). Importantly, Little and colleagues (30) were first to show that during the initial phase of withdrawal from chronic (*8 months in rats*) alcohol consumption, rats and mice display an abnormal, exaggerated corticosterone level selectively in the medial PFC and the dorsal HPC. Strikingly, the authors found that the withdrawal-associated excessive corticosterone response in the PFC persists for up to 2 months, that is, long after plasma corticosterone levels have returned to baseline. In the PFC, the sustained elevation of corticosterone concentration was associated with enhanced GR activation in mice undergoing a 2-week withdrawal period from chronic alcohol consumption (30). Further, administration of the GR antagonist mifepristone or the dihydropyridine calcium channel blocker nimodipine, given just prior to withdrawal from chronic alcohol exposure, not only reduced the rises in brain corticosterone but also prevented persistent memory deficits seen several weeks later in mice (24) or rats (32), suggesting that the withdrawal-associated rise in glucocorticoid levels specifically within the medial PFC may be an early index of maladaptive persistent behaviors in alcohol-dependent subjects. Indeed, chronic treatment with the GR antagonist mifepristone attenuated escalation of ethanol intake following intermittent ethanol vapor exposure (33) as well as the development of alcohol dependence and ultimately withdrawal-associated behavioral deficits (34).
Endogenous glucocorticoids have been suggested to play an essential role in maintaining PFC-dependent cognitive functions, mainly *via* complex interaction with dopaminergic and glutamatergic receptors (35–37). Both human and animal studies have demonstrated that alcohol withdrawal impairs a variety of cognitive functions during tests that require prefrontal cortical processing (38–40). In this regard, pharmacological (hydrocortisone administration) or pathological (Cushing's disease) increase of cortisol was found to predict frontal cortex-based cognitive impairments, including alterations in executive processes and working memory dysfunction (19, 23, 41–43). Long-lasting deficits on tasks that rely on the PFC are also observed in rodent models in which alcohol dependence is induced by chronic alcohol exposure or by chronic intermittent ethanol exposure, which involves repeated cycles of exposure to alcohol vapors (44, 45). However, in addition to PFC dysfunction, there is evidence that a functional disconnection between the (dorsomedial) PFC and the central nucleus of the AMG may also contribute to the alcohol-induced working memory impairments in rats (46).

Recent work in our laboratory has employed *in vivo* microdialysis in freely moving mice to investigate effects of chronic alcohol treatment and withdrawal (early and prolonged) periods on brain corticosterone concentrations by simultaneously measuring the time course of corticosterone concentration in the medial PFC and dorsal HPC before, during, and after completion of a working memory task in a T-maze (31, 47). This task is based on spontaneous alternation behavior, known to require intact connections between the two structures for successful performance (48, 49). Specifically, alternation behavior is the innate tendency of rodents to alternate at each successive trial the choice of the goal arm over a series of trials run in a T-maze (except for the first trial). From trial to trial, accurate performance at a given trial (*N*) requires subjects to be able to discriminate the specific target trial *N* − 1 from the interfering trial *N* − 2. Thus, the target information required for successful performance varies from trial to trial, so that the subject is not only required to temporarily keep specific information in short-term storage but also to reset it over successive runs. The resetting mechanisms and cognitive flexibility required to alternate over successive runs are major components of working memory processes. Working memory is a component of the sequential alternation task, since spontaneous alternation rates are dependent on the length of the inter-trial delay interval and/or the place of the trial in the series. Indeed, repetitive testing constitutes a potent source of proactive interference. Thus, the sequential alternation procedure is relevant to assess delay-dependent working memory in mice (50–52). Using *in vivo* microdialysis in freely moving mice, we observed that early (1 week) and protracted (6 weeks) withdrawal periods from prolonged (6 months) alcohol exposure cause an exaggerated corticosterone rise in the medial PFC.
In addition, withdrawn mice having abnormal corticosterone concentration in the PFC displayed impaired working memory performance, effects that were not observed in animals still submitted to chronic alcohol consumption. Moreover, early and protracted withdrawal periods had no effect on the dynamic pattern of corticosterone response in the dorsal HPC, indicating that alcohol impacts glucocorticoid regulation in a brain region-specific fashion. During the 6-week withdrawal period, the degree of working memory impairment correlated with the magnitude of prefrontal corticosterone concentration, which is in accordance with the notion that there is a functional link between excessive corticosteroid signaling and PFC dysfunction (53–55). Many neuroimaging studies have indicated consistently that structural and functional deficits in PFC regulatory regions are associated with chronic alcoholism [for review, see Ref. (56)]. Another study using SPECT imaging showed that detoxified alcoholic patients who relapsed 2 months later displayed working memory deficits associated with low blood flow in the medial frontal lobe (57). Given the importance of frontal cortical regions in the modulation of AMG reactivity and the mediation of effective emotion regulation, weakened PFC function associated with a specific functional disconnection between the PFC and the AMG has been proposed as an early index of neuroadaptation in alcohol dependence that predicts PFC-dependent cognitive impairments observed during abstinence (38, 39, 46).
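
The alternation measure that underlies the T-maze task described above can be made concrete with a short sketch. It assumes, for illustration only, that each trial is recorded as a left ("L") or right ("R") arm entry and that performance is scored as the percentage of trials, from the second onward, on which the animal entered the arm opposite to its previous choice. The function name `alternation_rate` and the example series are hypothetical and are not taken from the cited studies, whose exact scoring procedures may differ.

```python
from typing import Sequence


def alternation_rate(choices: Sequence[str]) -> float:
    """Percentage of trials (from trial 2 onward) on which the chosen arm
    differs from the arm chosen on the immediately preceding trial."""
    if len(choices) < 2:
        raise ValueError("need at least two trials to score alternation")
    # Compare each trial N with trial N - 1; the first trial has no
    # predecessor and is therefore excluded from scoring.
    alternations = sum(
        1 for prev, curr in zip(choices, choices[1:]) if curr != prev
    )
    return 100.0 * alternations / (len(choices) - 1)


# Hypothetical 7-trial series (illustrative values, not experimental data):
# five of the six scored trials are alternations.
series = ["L", "R", "L", "L", "R", "L", "R"]
print(f"alternation rate: {alternation_rate(series):.1f}%")
```

Under this scoring, chance performance in a two-arm maze corresponds to 50%, so a delay-dependent drop toward that level would be read as a working memory deficit.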

Subsequently, we studied whether local glucocorticoid blockade in the medial PFC would prevent the long-term deficits in working memory induced by protracted withdrawal from chronic alcohol consumption (31). Intraperitoneal administration of the corticosterone synthesis inhibitor metyrapone prior to testing prevented the withdrawal-associated working memory impairments, confirming the essential role of persistently increased glucocorticoid levels in behavioral impairments during withdrawal from chronic alcohol intake. Similarly, a single bilateral infusion into the medial PFC of spironolactone, which diminished MR activation, and to a lesser extent of mifepristone, which diminished GR activation, fully restored working memory function in withdrawn mice. In contrast, neither spironolactone nor mifepristone had any effect when infused into the dorsal HPC, thus highlighting the importance of glucocorticoids specific to the PFC in neural substrates mediating the prolonged, detrimental effects of alcohol on behavioral performance. These findings are reminiscent of data showing that elevated glucocorticoid levels, *via* either systemic injection of corticosterone or local infusion of the GR agonist RU 28362 into the medial PFC shortly before testing, similarly impair working memory (55), while the GR antagonist RU 38486 infused into the PFC can restore stress-induced deficits in executive function (58). Collectively, these data support the view that long-term adaptive behavioral effects of chronic alcohol exposure are mediated in large part through long-lasting glucocorticoid dysregulation within the PFC circuitry.

### MOLECULAR MECHANISMS UNDERLYING ANXIETY-LIKE AND ALCOHOL-DRINKING BEHAVIORS: THE ROLE OF cAMP–PKA–CREB CASCADE

The transcription factor CREB is a key downstream target of a variety of kinases, including cAMP-dependent protein kinase A (PKA), Ca<sup>2+</sup>/calmodulin-dependent kinase, and extracellular signal-regulated kinase/mitogen-activated protein kinase (ERK/MAPK) (59, 60). The resulting activation/phosphorylation of CREB and recruitment of CREB-binding protein (CBP) along with other transcriptional components enable transcription of specific CREB target genes, including those implicated in long-term memory and plasticity as well as in the development of anxiety-like and alcohol-drinking behaviors, such as the neuropeptide Y (NPY) and the brain-derived neurotrophic factor (BDNF) (61–64). There is mounting evidence to support a role for phosphorylated CREB (pCREB), through a PKA-dependent mechanism, and for downstream CREB target genes in the adaptive changes and behavioral effects associated with acute and chronic alcohol exposure [for review, see Ref. (65–68)]. Acute and chronic ethanol exposures have long been known to modulate the various steps of the cAMP-dependent pathways in the rodent brain and in other cell systems (69–71). Exposure to ethanol affects a cascade of events allowing for sustained translocation of the PKA catalytic subunit into the nucleus (72), ultimately resulting in long-lasting increased CREB activation/phosphorylation (73) and downstream expression of many target genes (74). In this context, abnormal PKA-dependent CREB functioning has been implicated in the molecular mechanisms of neuroplasticity that underlie alcoholism and alcohol drinking. There is evidence for a biphasic temporal effect of ethanol on the cAMP–PKA-dependent signaling cascade, with acute and prolonged exposure to ethanol potentiating (75) and decreasing (76, 77), respectively, adenylyl cyclase–cAMP–PKA activity in the cortex and HPC (78) in mice.
Using a combination of genetic and pharmacological approaches, *Drosophila* and rodent studies have shown that maintaining the integrity of cAMP–PKA activity is central to establishing sensitivity to the sedative effect of ethanol as well as to modulating ethanol consumption (79–81). Acute withdrawal (24 h) from chronic ethanol treatment produced a decrease in Ser133–pCREB within specific neurocircuitry of the frontal, parietal, and piriform cortex in rats (82), suggesting the possibility that CREB-dependent events in these cortical structures may be involved in the development of alcohol dependence. Among the mechanisms responsible for reduced pCREB and downregulation of cAMP-dependent genes, chronic intermittent alcohol exposure has been shown to increase expression of the protein kinase inhibitor-α (PKI-α) in the PFC, nucleus accumbens, and AMG in Wistar rats (83). Given the wealth of data for the recruitment of the cAMP–PKA signaling pathways upon acute ethanol exposure, it has been proposed that the increased PKI-α expression may be part of the adaptation of the cAMP–PKA pathway induced by intermittent alcohol exposure.

Investigations into the role of CREB in amygdaloid brain structures with regard to anxiety-like and alcohol-drinking behaviors have shown that CREB activity fluctuates depending on brain structures and alcohol "condition" (acute, chronic, or withdrawal). For instance, a series of studies by Pandey's group conducted in the rat AMG clearly indicates a strong relationship between decreased CREB phosphorylation and high anxiety-like responses associated with acute withdrawal from 2-week ethanol treatment (62, 82). Decreases in CREB phosphorylation and downstream cAMP-inducible genes, including NPY in the central and medial, but not the basolateral, nuclei of the AMG, have been associated with a predisposition to both anxiety-like and excessive alcohol-drinking behaviors in alcohol-preferring rats (60, 84–86). Restoring CREB function to optimal level or enhancing NPY signaling in the central AMG prevented the onset of anxiety-like behaviors (84, 87, 88), while alcohol-associated anxiety disorders can be mimicked by pharmacological blockade of PKA in ethanol-naïve-preferring rats or non-preferring rats (60, 84). Thus, anxiety-induced downregulation of CREB function in the AMG may constitute a critical neuroadaptation central to the development and maintenance of alcohol dependence. In this regard, dysregulation of the PFC associated with a functional disconnection between the PFC and AMG central nucleus during abstinence and renewed access to alcohol has been implicated in long-lasting cognitive impairment and excessive alcohol drinking in rats (46).

Clinical evidence from alcohol-dependent patients also indicates that acute and protracted withdrawal/abstinence is strongly associated with depressive-like behaviors, such as anhedonia.

The catecholamines dopamine and noradrenaline, acting *via* the cAMP–PKA–CREB signaling cascade, provide an essential modulatory influence on PFC-dependent behaviors, producing an inverted "U-shaped" dose–response influence, whereby moderate levels improve PFC function while either too little or too much catecholamine leads to cognitive impairments [for review, see Ref. (89)]. A number of studies including work in our laboratory (51) have shown that blocking the cAMP–PKA–CREB signaling cascade *via* local infusion of Rp-cAMPS (a compound known to inhibit CREB phosphorylation) into the PFC prevents the impairing effect of stress or aging on working memory performance, while drugs that increase cAMP–PKA signaling, either by direct intra-PFC infusion of the cAMP analog Sp-cAMPS or a dopamine D1 receptor agonist or by i.p. administration of the phosphodiesterase (PDE) inhibitor rolipram, impair cognitive functions [for reviews, see Ref. (89–91)]. As mentioned above, we recently reported that consumption of an alcohol-containing liquid diet for 6 months followed by a 1-week withdrawal period produces working memory impairment in a T-maze spontaneous alternation task in mice, which persists for at least 6 weeks after the cessation of alcohol intake (31, 47). Moreover, withdrawn mice displaying impaired working memory performance were those that had the lowest pCREB level in the PFC along with a persistent rise of prefrontal corticosterone concentration. Because glucocorticoids in the PFC interact with β-adrenoceptor–cAMP/PKA activity to influence working memory function (92), one route by which elevated glucocorticoid levels may impair PFC-mediated cognitive function long after the cessation of alcohol exposure is by inhibiting the cAMP–PKA cascade. In this context, growing evidence supports a central role for PDE, which is responsible for the breakdown of cAMP, in the regulation of alcohol drinking in rodents [for review, see Ref. (93)].
For example, treatment with various PDE4 inhibitors, including rolipram, produces long-lasting reduction of alcohol intake and preference in C57BL/6J mice (94). Chronic rolipram treatment also results in sustained reduction of alcohol seeking and consumption in alcohol-preferring rats (95, 96). As mentioned earlier, mice subjected to 1- or 6-week withdrawal from chronic alcohol consumption exhibited working memory impairments accompanied by enhanced anxiety level (at 1 week only) as well as persistently elevated corticosterone and sustained decreased pCREB levels in the PFC. Intraperitoneal administration of the PDE4 inhibitor rolipram before working memory testing abolished these withdrawal-associated behavioral, endocrine, and neuronal alterations (31) – a finding consistent with other observations demonstrating that, in rats, heightened anxiety during acute alcohol withdrawal was accompanied by elevated expression of *Pde10a* isoform mRNA levels in the interconnected medial PFC–AMG circuit, which persisted in the AMG after protracted (6 weeks) alcohol withdrawal (97). Together, these observations strongly support further research with regard to isoform-specific PDE-selective inhibitors that are promising pharmacotherapy targets for alcohol use disorders.

As discussed above, long-term adaptive behavioral effects of chronic alcohol exposure are mediated in large part through long-lasting glucocorticoid dysregulation within the PFC but not the dorsal HPC. Confirming differential sensitivity of the PFC and dorsal HPC to chronic alcohol-induced damage, recent work in our laboratory has shown that, unlike the PFC in which withdrawal from prolonged alcohol intake caused persistent working memory impairments along with sustained inhibition of the cAMP–PKA–CREB signaling cascade, both alcohol (unimpaired) and alcohol withdrawal (impaired) mice display reduced levels of pCREB in the dorsal HPC (namely, the CA1 region), compared with water-drinking mice (31, 47). Furthermore, intraperitoneal administration of rolipram was able to correct the deficit in pCREB in the dorsal HPC but did not reverse working memory impairments in withdrawn animals (47). Together, these observations support the notion that disruption of the cAMP–PKA–CREB signaling cascade specifically in the PFC (but not in the dorsal HPC) has an essential role in promoting long-term neuroadaptive changes accompanying persistent behavioral changes during withdrawal from chronic alcohol intake. Interestingly, early pioneering work in our laboratory emphasized a key role for PKA–CREB signaling as a sustained "molecular switch" that gradually converts acute "drug" responses into relatively stable adaptations that contribute to drug and alcohol addiction-mediated long-lasting neural and behavioral plasticity. Under conditions of drug- and food-reinforced behavior, drug-induced reward impaired spatial discrimination learning in a Y-maze task and caused drastic decreases in pCREB and downstream target c-Fos expression in the dorsal HPC and the PFC while sparing the cued version of the task and pCREB in the dorsal striatum in mice (98).
Further, pharmacological blockade of the cAMP–PKA cascade in the striatum before training normalized CREB activity within the HPC–PFC circuit and subsequently prevented the drug-induced modulation of multiple memory systems.

Emerging evidence indicates that brain region-specific alteration of CREB signaling is also an important regulator involved in depression-like behavior that emerges during abstinence following alcohol drinking. As a key symptom of clinical depression, anhedonia reflects reduced interest in enjoying pleasure-seeking behavior and plays a key role in relapse (99, 100) and in the perpetuation of excessive alcohol consumption in dependent individuals (101). Important clinical evidence clearly demonstrated that the persistence and intensity of some behavioral withdrawal symptoms positively correlated with anhedonia scales in detoxified alcohol-dependent subjects (102), extending previous findings of strong correlation between anhedonia and substance-related symptoms particularly in detoxified opiate-dependent subjects (103). The presence of depression-related behavioral phenotypes during protracted abstinence was also reported in rodent models (104–106). In mice undergoing 2 weeks of abstinence from chronic alcohol consumption, the persistent increase in plasma corticosterone response and upregulation of GR expression correlated with the development of depressive-like phenotypes, including anhedonia and helplessness (105), and reduced hippocampal neurogenesis (104). Further, several lines of evidence suggest that downregulation of the BDNF–TrkB–CREB signaling pathway may serve as a common link between the development of alcohol-induced depression-like symptoms and reduced hippocampal neurogenesis (104, 105, 107, 108). Finally, enhancing BDNF–CREB activity through pharmacological treatments with various classes of antidepressant drugs or through environmental enrichment abolished the alcohol-induced anhedonia and depressive behaviors seen during protracted abstinence (104, 107, 108), supporting the hypothesis that the BDNF–CREB signaling pathway may be a potential therapeutic target for interventions in alcoholism–depression coincidence.

### ALCOHOL ALTERS THE BALANCE BETWEEN HISTONE ACETYLATION AND DEACETYLATION

Equally important for providing precise, long-lasting changes in brain function associated with alcohol intake are histone modifications, which exert lasting control over transcriptional activity of target genes through modifications of the chromatin structure and function that make the DNA less or more accessible to transcription factors and enzymes. The basic unit of chromatin, the nucleosome, is a histone octamer wrapped by approximately 147 base pairs of DNA. Each core histone (H2A, H2B, H3, and H4) has a highly conserved amino (N)-terminal tail, which is subject to a range of posttranslational modification (PTM) marks at distinct residues/sites, including acetylation and methylation of lysine residues and phosphorylation of serine residues (109). Histone acetylation and phosphorylation are associated with transcriptional activation, whereas histone methylation reflects both transcriptional activation and repression depending on the specific site and context of the modification. An important feature of histone PTMs is that they can influence each other in a synergistic or antagonistic manner, leading to a complex "histone code" (110). Of these histone PTMs, histone acetylation is the most widely investigated in terms of epigenetic mechanisms underlying region-specific changes in brain gene networks required for long-term memory processes. Many rodent studies have detailed how different learning paradigms trigger distinct histone acetylation patterns in the brain, which are accompanied by region-, task-, and age-specific changes in memory-associated genes [for reviews, see Ref. (111–115)]. For instance, increased acetylation of histones H3 and H4 occurred in the dorsal HPC or the dorsal striatum, depending on whether mice were subjected to a spatial or cued training in the water maze task, respectively (116, 117).

The degree of histone acetylation/deacetylation is finely orchestrated by the dynamic balance of antagonistic enzymes that "write" [histone acetyltransferases (HATs)] and "erase" [histone deacetylases (HDACs)] acetylation sites (113, 118–121). Systemic administration of HDAC inhibitors (HDACi), such as sodium butyrate (NaB) or trichostatin A (TSA), can improve memory formation and also prevent or reverse cognitive impairments associated with normal and pathological aging. However, this enhancing effect of HDACi on HPC-dependent memory required intact CREB activity (116, 117, 122). Furthermore, infusing HDACi directly into the HPC was not only effective in promoting HPC-dependent learning and memory processes but could also influence the relative use of multiple memory processes by affecting transcriptional events within subcortical and prefrontal cortical structures (116, 123).

A growing set of studies in both humans and animals has indicated that alcohol exposure causes widespread, dynamic changes of histone acetylation patterns, and thereby dysregulation in gene expression profiles across multiple brain regions (28, 124–126). Most of the studies have focused on acetylation of histones H3 and H4 and chromatin-related events within the PFC, the HPC, and the AMG. In mouse and rat brain, studies reported that alcohol's effects on histone acetylation patterns depend on the alcohol treatment paradigm, the timing of alcohol exposure or withdrawal, and the brain structures examined; even within a structure, alcohol can affect subregions differently. For example, work from Pandey's lab has shown that anxiolytic-like responses caused by acute ethanol i.p. injection were accompanied by increased HAT CBP activity and associated increased acetylation of histone H3 at lysine 9 and histone H4 at lysine 8 (H3K9 and H4K8, respectively), leading to rapid elevation of NPY (mRNA and protein level) specifically in the central and medial, but not the basolateral, amygdaloid nuclei (125). The same group observed that a 2-week ethanol exposure followed by acute ethanol withdrawal (24 h) switches alcohol's effect to anxiogenic-like responses, effects that involve a shift from HDAC hypoactivity to HDAC hyperactivity and subsequently decreased histone acetylation and transcriptional repression of NPY function in the two AMG nuclei (125, 127, 128). Correcting histone acetylation deficits in the AMG *via* administration of the pan-HDACi TSA can reverse the rapid tolerance to the anxiolytic effects of ethanol (128) and prevent the development of alcohol withdrawal-related anxiety in rats (125). Alcohol-induced neuroadaptation in the AMG also implicated deficits of BDNF activity and its target [activity-regulated cytoskeleton-associated protein (Arc)], two key signaling factors involved in synaptic transmission and plasticity.
While acute ethanol exposure caused an upregulation of the BDNF–Arc signaling pathway and subsequently increased dendritic spine densities in the central and medial AMG nuclei, withdrawal from prolonged ethanol exposure or binge ethanol consumption potently inhibited BDNF and Arc expression and reduced dendritic arborization in these nuclei and other regions, leading to increased anxiety-like and drinking behaviors (66, 129, 130). Importantly, these long-lasting adaptive changes associated with alcohol dependence were reversed upon treatment with the HDACi TSA (61, 128–130). In another study by Moonat and colleagues (61) examining the role of HDAC2 in the development of alcohol dependence, investigators found lower baseline BDNF protein levels in the AMG (and also the bed nucleus of stria terminalis) of alcohol-preferring rats, a well-established model used to study the genetic predisposition to alcoholism (131), relative to the low-drinking NP rats. In addition, innate HDAC2 overexpression and decreased H3K9 acetylation in the central nucleus of alcohol-preferring rats correlated with low levels of BDNF, Arc, and NPY and were accompanied by high levels of anxiety-like and alcohol-drinking behaviors. These HDAC2-associated molecular and behavioral deficits were rescued either *via* specific knockdown of HDAC2 expression by direct infusion of small interfering RNA (siRNA) against HDAC2 into the central AMG nucleus (61, 66) or by TSA treatment (127, 130, 132). Collectively, these observations raised the possibility that adaptive epigenetic changes involving HDACs, and in particular HDAC2, in the AMG may be important regulatory mechanisms that underlie expression of genes implicated in the development and pathogenesis of alcohol dependence.

Using a chronic intermittent ethanol exposure model, a robust H3K9 hyperacetylation was seen in the AMG and cortical areas of rats, which displayed motivation to self-administer ethanol after a 6-h withdrawal period, compared with non-dependent rats (133). Treatment with the HDACi NaB or MS-275 (i.p. or i.c.v.) was able to counteract the effects of alcohol in dependent rats but not in non-dependent rats. Treatment with NaB, when administered prior to ethanol self-administration, was also able to reverse H3K9 hyperacetylation and counteract excessive alcohol intake and relapse in alcohol-dependent rats. In order to identify brain region-specific regulatory molecular (*epigenetic*) signatures potentially involved in adaptive processes that lead to alcohol tolerance and dependence, Smith and colleagues (134) recently examined brain regional expression network responses to acute (0–8 h) and late (72 h to 7 days) withdrawal from chronic intermittent ethanol exposure in mice. Remarkably, the authors showed that neuroinflammatory responsive genes can be seen across all brain regions at 0–8 h after the beginning of alcohol withdrawal, while sustained over-representation of subsets of genes related to neurodevelopment and synaptic plasticity (such as *Bdnf*), to histone acetylation (such as *HDAC4* and *HDAC6*), and to histone/DNA methylation is found at 3- to 7-day withdrawal periods specifically in the PFC and the HPC. These results illustrate how transient and persistent histone acetylation changes could serve as a key mechanism for tight regulation of the expression of large sets of genes within specific brain regions of animals predisposed to excessive ethanol drinking or exposed to protracted abstinence.
A functional disconnection of the CeA–PFC circuit during abstinence (72 h) and renewed access to alcohol has been recently implicated in long-lasting PFC-dependent cognitive dysfunction and the development of anxiety-like behavior, and more specifically, the resulting PFC hypofunction was shown to facilitate the transition from moderate to excessive and uncontrolled alcohol intake in rats (46).

Persistent changes of the HAT CBP activity and H4 acetylation were observed in the frontal cortex of C57BL/6 mice given 5-month chronic alcohol consumption followed by a 15-day withdrawal period (135). In that study, withdrawal-associated H4 hypoacetylation correlated with neuroinflammatory damage and the persistently altered memory and anxiety-related behaviors. Nonetheless, these changes were absent in mice lacking the Toll-like receptor 4 (TLR4) that had undergone the same treatment, suggesting a critical role for TLR4-mediated epigenetic modifications in mediating long-lasting deleterious effects of chronic alcohol on PFC-dependent behaviors (135). This is in line with findings in our laboratory showing a robust decrease in histone H4 acetylation in the medial PFC of C57/BL mice at 1 week after withdrawal from chronic alcohol consumption; this decrease was maintained for at least 6 weeks after alcohol withdrawal and correlated with the persistent impairment of working memory noted during abstinence (31, 47). Alcohol's effects on H4 acetylation closely paralleled effects on CREB activation in the PFC. Further, systemic delivery of the corticosterone inhibitor metyrapone or local intra-PFC blockade of MRs (*via* spironolactone) or GRs (*via* mifepristone) similarly reversed long-lasting deficits in pCREB and H4 acetylation levels in the PFC and alleviated working memory deficits associated with alcohol withdrawal (31). Thus, these findings suggest that long-lasting glucocorticoid-induced neuroadaptive changes in CREB and H4 acetylation in the PFC may be involved in the enduring working memory impairments caused by prolonged alcohol consumption and withdrawal. Cumulative evidence indicates that structural and functional integrity of the HPC was also compromised in rats after prolonged alcohol exposure and even greater alterations were found after cessation of alcohol exposure (136–138).
Prolonged ethanol intake caused enduring deficits in HPC-dependent spatial reference memory in the water maze (138–140). Chronic ethanol treatment also caused a long-lasting decrease of histone acetylation in the dorsal HPC. However, contrary to the PFC where there was a strong relationship between alcohol-induced decrease of H4 acetylation and long-lasting working memory impairments, H4 acetylation in the HPC (*the CA1 region*) was decreased in behaviorally "unimpaired" alcohol-treated mice and even continued to decrease in "impaired" withdrawal-treated mice, compared with water-treated mice (31, 47). However, the drugs that prevented alcohol's effects in the PFC did not rescue alcohol's effects on HPC function, underscoring a region-specific influence of regulatory epigenetic signature on adaptive processes that lead to alcohol tolerance and dependence.

### ALCOHOL AND HISTONE H3 MODIFICATION CROSS TALKS

Ethanol's effects on histone H3 phosphorylation at serine 10 (H3ser10phos) and concurrent H3 phosphoacetylation are of particular interest as their rapid elevation is critical for learning/memory-associated induction of immediate-early genes (e.g., *c-fos* and *egr-1*) (141–143), an effect shown to mediate adaptive responses to psychologically stressful events such as forced swimming or novelty stress paradigm exposure (142–145). In rats, acute ethanol dose dependently alters the number of H3ser10phos in the dentate granular cells of the HPC, and these changes are paralleled by changes in c-fos protein expression (146). The same group has shown that, in ethanol-dependent rats, both H3ser10phos and c-fos levels are reduced in dentate granule cells during excessive alcohol intake, while opposite effects are evident at withdrawal peak in the HPC. Elevation of H3ser10phos and histone H3 phosphoacetylation is achieved through a direct interaction of the GR with mitogen- and stress-activated protein kinase 1 (MSK1) and ETS-domain protein Elk-1 that are downstream of the ERK/MAPK signaling cascade (143, 145, 147, 148). Conversely, nuclear type 1 protein phosphatase (PP1), a nuclear protein Ser/Thr phosphatase that acts as a universal negative regulator of memory and synaptic plasticity, interfered with H3ser10phos in several brain areas such as the HPC and the AMG (149–152).

Combinatorial modifications of acetylated H3 and histone H3 lysine 4 trimethylation (H3K4me3) have been implicated in long-term adaptive changes in the HPC resulting from prolonged alcohol intake (126, 153). Using a 3-week mouse model of chronic ethanol consumption, Stragier and colleagues (154) recently reported that ethanol-induced BDNF-mediated neuroplastic changes in the HPC are controlled by combinatorial modifications of acetylated H3 and H3K4me3 around individual *Bdnf* gene promoters in dorsal CA3 region and the dentate gyrus and by decreased *Bdnf* DNA methylation in CA1–CA3 regions of the HPC. These ethanol-induced changes were associated with a deficit in HPC-dependent (contextual fear and novel object recognition) memory while sparing AMG-based cued fear memory. Chronic intermittent ethanol vapor exposure followed by 2–5 days of abstinence robustly and selectively increased histone H3K9 acetylation and DNA demethylation in PFC neurons with a parallel decrease of the repressive H3K9 methylation mark as well as a downregulation of a set of histone methyltransferases (HMT) (155, 156). These changes mostly occurred after ethanol removal and contributed to the development of physical dependence on alcohol through an adaptive long-lasting upregulation of the NMDA receptor 2B (NR2B) gene expression (155). Moreover, systemic treatment with TSA during ethanol exposure increased H3K9 acetylation at the NR2B promoter in PFC neurons and potentiated voluntary ethanol consumption (157). Together, these data suggest that persistent upregulation of the NR2B-containing NMDA receptors through deregulation of the balance between histone H3K9 acetylation and methylation states in the PFC may act as a potentially important contributor to the development of alcohol dependence.

### CONCLUDING REMARKS

This review summarizes recent advances in our comprehension of endocrine, epigenetic, and transcriptional changes that serve as determining factors in controlling alcohol-associated changes in the expression of gene networks and behavior and play a central role in the regulation of alcohol dependence, withdrawal, and relapse (**Figure 1**). Most of the studies conducted thus far focused mainly on epigenetic and transcriptional regulation of adaptive responses to acute and chronic alcohol that occur within a single brain region (mostly the AMG). This review highlights new evidence from clinical and preclinical studies on how long-term adaptations arising from disruption of the fine coordination of highly interconnected brain structures within a circuit, including, but not limited to, the PFC, the HPC, and the AMG, may contribute to excessive alcohol consumption and alcohol dependence as well as behavior impairments. The findings reviewed in this article support the view that brain region- and cell type-specific histone acetylation modification (both in terms of global/genome-wide changes as well as promoter-specific changes) is a key mechanism underlying anxiety-like and alcohol-drinking behaviors. Thus, treatments designed to counteract alcohol-associated epigenetic changes may be promising targets for novel medications in the treatment of alcoholism.

### AUTHOR CONTRIBUTIONS

NM and DB contributed to the writing of this review article.

### FUNDING

This work was supported by a grant from FRA (Fondation pour la Recherche en Alcoologie, Paris, France).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Mons and Beracochea. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Thinking after Drinking: Impaired Hippocampal-Dependent Cognition in Human Alcoholics and Animal Models of Alcohol Dependence

*Miranda C. Staples† and Chitra D. Mandyam\*†*

*Committee on the Neurobiology of Addictive Disorders, The Scripps Research Institute, La Jolla, CA, USA*

#### *Edited by:*

*Vincent David, Centre National de la Recherche Scientifique (CNRS), France*

#### *Reviewed by:*

*Gabriel Rubio, Hospital Universitario 12 De Octubre, Spain; Myeong Ok Kim, Gyeongsang National University, Jinju, South Korea*

#### *\*Correspondence:*

*Chitra D. Mandyam cmandyam@scripps.edu*

#### *†Present address:*

*Miranda C. Staples, MileStone Research Organization, San Diego, CA, USA; Chitra D. Mandyam, Veterans Administration San Diego Healthcare System, San Diego, CA, USA*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 15 July 2016 Accepted: 13 September 2016 Published: 30 September 2016*

#### *Citation:*

*Staples MC and Mandyam CD (2016) Thinking after Drinking: Impaired Hippocampal-Dependent Cognition in Human Alcoholics and Animal Models of Alcohol Dependence. Front. Psychiatry 7:162. doi: 10.3389/fpsyt.2016.00162*

Alcohol use disorder currently affects approximately 18 million Americans, with at least half of these individuals having significant cognitive impairments subsequent to their chronic alcohol use. This is most widely apparent as frontal cortex-dependent cognitive dysfunction, where executive function and decision-making are severely compromised, as well as hippocampus-dependent cognitive dysfunction, where contextual and temporal reasoning are negatively impacted. This review discusses the relevant clinical literature to support the theory that cognitive recovery in tasks dependent on the prefrontal cortex and hippocampus is temporally different across extended periods of abstinence from alcohol. Additional studies from preclinical models are discussed to support clinical findings. Finally, the unique cellular composition of the hippocampus and cognitive impairment dependent on the hippocampus is highlighted in the context of alcohol dependence.

Keywords: alcohol use disorder, cognitive impairment, abstinence, hippocampus, prefrontal cortex

### OCCURRENCE AND IMPACT OF ALCOHOL USE DISORDERS IN THE UNITED STATES

In the United States, 18 million individuals (7.4% of the 15 and older population, according to estimates from 2010) report having an alcohol use disorder (AUD), with nearly 12 million of these individuals reporting alcohol dependence (1). Recent changes to the diagnostic definition of AUDs in the updated DSM-V eliminate the clinical distinction between AUDs and alcohol dependence, opting to categorize them together under the umbrella category of AUDs and describe the broad disorder as a "… problematic pattern of alcohol use leading to clinically significant impairment or distress …" as well as requiring concurrent escalation of alcohol intake, craving for alcohol, and significant disruptions to personal and professional conduct (2). In 2011, AUDs cost the United States \$223.5 billion, an estimation which includes the cost of medical treatment, judiciary involvement, and loss of productivity (3).

However, these statistics, while useful in conveying the gravity of the alcohol abuse problem in the United States, do not provide insight into the recovery process nor the continuing health and cognitive disparities these individuals face into periods of abstinence from alcohol consumption. Additionally, long-term alcohol abuse results in significant, non-economic personal costs, including devastating bodily harm, with some of the most striking effects apparent in the brain. Evidence from human and animal studies suggests that select regions of the cortex, particularly the prefrontal cortex (PFC) and hippocampus, may be more sensitive to the deleterious and damaging effects of long-term alcohol use than others, and recovery of cognitive function sensitive to these regions may occur at different times into periods of prolonged abstinence (4–7).

**Abbreviations:** AUD, alcohol use disorder; BALs, blood alcohol levels; BOLD, blood-oxygen-level dependent; CIE, chronic intermittent ethanol vapor exposure; DG, dentate gyrus; fMRI, functional magnetic resonance imaging; GABAa, gamma-aminobutyric acid A subunit; GABAaR, GABAa Receptor; GluN, *N*-methyl-d-aspartate glutamatergic receptor; PFC, prefrontal cortex; TFC, trace fear conditioning.

### IMPACT OF ALCOHOL ON COGNITION: CLINICAL FINDINGS

Alcohol is widely known to acutely alter cortical function by modulating inhibitory and excitatory receptor function on neuronal processes (8–10). By repressing excitatory transmission (8, 11–15) and concurrently enhancing inhibitory transmission (16–21), alcohol acutely acts as a systemic depressant. Over repeated, chronic exposures, neuronal transmission achieves a homeostatic state in the presence of alcohol (22), and cognition can resemble that of non-dependent function. However, during periods of abstinence when alcohol is absent from the system for extended phases, effectively disrupting the previously described modified homeostasis, cognitive function is significantly impaired (due to the absence of alcohol as critical modulating factor), and these cognitive impairments persist for some time. Interestingly, these cognitive perturbations, in some instances, do recover to or near pre-dependency levels. What follows is a description and synthesis of how alcohol modulates PFC and hippocampal function, what changes occur as occasional alcohol consumption becomes chronic consumption, and what cognitive impairments are present during acute withdrawal.

It is worth noting, while outside the general scope of this review, that chronic alcohol use does result in structural and/or functional atrophy in regions outside of the PFC and hippocampus and that these additional changes cannot be eliminated as potential modulators of the deleterious effects observed in the PFC and hippocampus (23). Further, research into the cognitive capacities of alcoholic individuals has identified cognitive disorders, such as Wernicke–Korsakoff syndrome, alcohol dementia, and Marchiafava–Bignami disease, which are directly related to long-term alcohol abuse and cloud our understanding of alcohol's solitary effects on cognitive functioning (24, 25). Similarly, age and concurrent drug use can additionally complicate our understanding of alcohol's impact; therefore, for the purpose of this review, studies including subjects with chronic alcohol use without poly drug use were evaluated.

### COGNITIVE IMPAIRMENT FOLLOWING NONDEPENDENT ALCOHOL USE

### Prefrontal Cortex

The PFC is a region of the cerebrum that has been colloquially referenced as the switchboard of the cortex due to its role in planning and selecting appropriate responses and actions to events and stimuli (26–28). Behaviors such as impulsivity (29), decision-making (30), and attentional focus (31) are all under the control of the PFC and are often manipulated and impaired in individuals with an AUD (discussed subsequently). When assessed in a controlled setting, acute doses of alcohol (0.4–0.8 g/kg) given to nondependent subjects impair numerous PFC functions, including disrupted planning (32), increased impulsive actions (33–36), decreased behavioral inhibition (37–39), reduced perseverance (40), and poorer decision-making (41). In many studies, these dysfunctions were correlated with reductions in typical lateralization (asymmetric distribution of activity) (36) as well as reduced functional magnetic resonance imaging (fMRI) activity during false responses (42). Further, studies in humans have demonstrated subtle structural abnormalities (43), increased blood flow (as an indicator of cortical activity) (44–47), and reduced hemispheric dominance (36, 48–50). Taken together, it is clear that the function of the PFC is significantly impaired with acute exposures to alcohol.

### Hippocampus

Similar to the inhibition observed in the PFC, the hippocampus is a sensitive target of alcohol's actions in the brain. The hippocampus is defined, in part, by its characteristic trisynaptic circuit, and human and animal studies have demonstrated that it is critical for spatial memory [reviewed in Ref. (51)], context discrimination (52), pattern separation (53), and time-sensitive memories (54). A critically unique region of the hippocampus, the dentate gyrus (DG), contains neural stem cells that continue to divide and primarily generate functional neurons into adulthood in nearly all mammalian species (55) and have proved critical for pattern separation functionality (56). Beyond its role in the previously described functions, the hippocampus plays a critical role in emotional and stress regulation (57), critical components of the development and cyclical nature of addiction (58). In human subjects, hippocampal function is typically assessed as contextual memory or episodic memory, both of which have been shown to be impacted during acute alcohol exposure (49, 59).

### COGNITIVE IMPAIRMENTS DURING AND FOLLOWING HEAVY ALCOHOL USE

### Prefrontal Cortex

When compared with healthy subjects, individuals reporting chronic alcohol abuse demonstrate structural abnormalities, including reduced frontal cortical volume (60–64), compromised white matter integrity (65–67), reduced quantities of frontal–cerebellar connections (68), and aberrant patterns of frontal cortical activity (69, 70). Further, Kril et al. (71) confirmed previously reported reductions in PFC white matter and found a significant reduction in the number of neurons in postmortem tissue of alcoholics when compared with healthy control subjects, confirming losses to cortical gray matter (60). Finally, it is possible that these pathological changes are underlying the diminished cognitive function often observed in human alcoholics.

In order to test the deleterious effects of chronic alcohol abuse on the intellectual capacities of alcohol-dependent individuals, tests of memory, impulsivity, risk, and attention are often employed. While individuals struggling with alcohol dependence rarely exhibit impairments on assessments of generalized intelligence, specialized complex tasks are uniquely able to elucidate potentially subtle differences between dependent and non-dependent populations. Estimates suggest that at least half of individuals diagnosed as alcohol-dependent are also cognitively challenged (4). One early study assessing a group of recently abstinent alcoholics, individuals with frontal lobe damage, and healthy controls found, as expected, no difference on assessments of IQ, but did report that alcoholic individuals were significantly impaired compared with both controls and individuals suffering from frontal lobe trauma in tasks that were designed to explicitly test frontal lobe function (72, 73). More recent studies have demonstrated explicit impairments on tasks involving executive functioning (74, 75), working memory (76, 77), and impulsivity (76, 78–81). Structural abnormalities have been directly linked to frontal cortical function in within-subject experimental designs. One study measuring frontal cortical electrical activity (electroencephalogram recordings) during a Go/No Go task, a test where subjects are asked to learn and perseverate changing rules pertaining to cues, demonstrated blunted activity during the task in alcoholics as compared with non-dependent controls (82). Most recently, Nakamura-Palacios et al. (83) reported that the damage to the PFC was predictive of the cognitive impairments on tests of executive function.
Additionally, studies have identified abnormal patterns of activity during cognitive tasks in alcohol-dependent subjects, whose intellectual performance is comparable to non-dependent subjects (84); this finding is particularly intriguing as it implies that individuals with significant disruptions in cognitive capacities may lack the capacity to form adaptive connections in the presence of chronic alcohol. Taken together, these findings present solid evidence that the PFC is subject to extensive damage as a result of chronic alcohol use, some of which could potentially be mediated by certain individual characteristics.

### Hippocampus

Studies involving human subjects with chronic alcohol use have demonstrated reduced hippocampal volume (85–87), postmortem evidence of prior neuronal loss (88), and severely reduced hippocampal activity, including reductions in blood flow (89). Recently, one study comparing mild and heavy drinkers demonstrated no significant impairment of general cognition but an increased fMRI blood-oxygen-level-dependent (BOLD) response, an indicator of regional activity, in the hippocampus during correct responses to a visual encoding and memory task, implying a compensatory mechanism for cognitive function (90). However, tasks capable of identifying explicit hippocampal-sensitive cognitive impairments in adults, particularly those with substance dependency issues, are scarce beyond those investigating episodic memory. Episodic memory, or the function of remembering events in specific spatial and temporal context (in contrast to factual or semantic memory), is an important hippocampal function in humans (91, 92) and has been demonstrated to be significantly impaired in alcoholic patients (93–96). However, it should be noted that, as described by Noel et al. (96), episodic memory is also sensitive to alcohol-induced damage to the PFC, so the findings of reduced episodic memory function cannot be explicitly attributed to impaired hippocampal function.

### RECOVERY OF COGNITIVE CAPACITIES

A strong body of evidence in alcohol-dependent individuals has demonstrated that various cognitive capacities do return to (or nearly to) non-dependence levels of performance. However, the details of this recovery vary widely in terms of temporal resolution based primarily on the cortical structure of interest, and it is difficult to distinguish apparent recovery of damaged regions from compensation by other cortical regions with regard to behavioral function and performance alone. For example, studies appear to suggest that cognitive deficits due to PFC damage from alcohol abuse recover on a shorter timescale compared with those dependent on the hippocampus. However, as the functionality of the PFC and hippocampus is intricately related, there is a clear challenge to designing studies to directly address the explicit temporal recovery of specific structures in humans. Therefore, the findings presented here are from studies addressing broader questions of functionality in alcoholics.

With respect to PFC damage, recovery of cognitive function in this region is critical to the persistence of abstinence from alcoholism and avoidance of relapse in dependent individuals (97). A recent meta-analysis of human literature (62 sources in all) demonstrated that cognitive impairments sensitive to the PFC in individuals with AUDs identified in recent abstainers (98–101) are primarily alleviated or "normalized" (meaning performance is comparable to non-dependent individuals) by 1 year of abstinence from alcohol use (102). Similarly, improvements in executive functioning occurring as soon as 6 months into abstinence have been reported (95, 103). However, as proposed and reviewed by Oscar-Berman et al. (104), it is plausible that the recovery of PFC function is more the result of compensatory activity in associated regions of the cortex rather than distinct recovery or repair of the PFC itself.

With regard to hippocampal functionality, human studies evaluating episodic memory in dependent, long-term abstinent individuals have reported similar findings to those relating to the PFC, but the outcomes of the studies have not been entirely unequivocal. For example, multiple studies have reported impaired performance on tasks of episodic memory (105–107), with "normalization" of episodic memory performance in alcohol-dependent subjects taking place by 1 year of abstinence (95). However, there is evidence that hippocampal function remains impaired years after abstinence (5, 108). The discrepancy between these two seemingly disparate findings may result from (A) many of the studies not evaluating function beyond 1 year of abstinence and (B), as described previously, episodic memory not being entirely exclusive to hippocampal function. Therefore, it is possible that, while episodic memory function returns, other facets of hippocampal function remain perturbed long into abstinence from alcohol. Taken together, the current evidence suggests that the recovery of cognitive functionality in abstinent alcohol-dependent individuals is sensitive to the duration of the abstinence period, with the PFC returning to "normative" levels prior to the hippocampal formation.

### LIMITATIONS OF CLINICAL FINDINGS

A wealth of evidence from clinical findings demonstrates that acute alcohol exposures can inhibit cognitive capacities. Interestingly, it is primarily following withdrawal from chronic alcohol exposure that individuals experience persisting, severe cognitive impairments. As eloquently described in Oscar-Berman et al. (104), studies involving human subjects and drugs of abuse are often rife with complicating and confounding factors, including family history, genetic predisposition, and past life events and experience, much of which cannot be controlled for. While clinical studies are limited to observational investigations into the deleterious cortical adaptations subsequent to chronic alcohol exposure, preclinical models have been successful at informing and elaborating our understanding of the cellular and molecular changes, which may explain the mechanisms underlying cognitive disparities in abstinent alcohol-dependent subjects. Further, preclinical models of alcohol dependence have generated evidence suggesting that the distinct cellular compositions of the PFC and the hippocampus may be the basis for the differential cognitive recovery in these regions in abstinent individuals. Therefore, the following sections will discuss preclinical models of alcohol addiction and dependence with specific focus on cognitive impairments dependent on the PFC and hippocampus and will elucidate the associated cellular and molecular changes in these regions.

### IMPACT OF ALCOHOL ON COGNITION: PRECLINICAL FINDINGS

Rodent models of alcohol dependence have been instrumental in furthering our understanding of both the cognitive and neurobiological impact of withdrawal from alcohol dependence, as well as providing critical insight into the potential mechanisms of the pathological state associated with and resulting from alcohol withdrawal in dependent animals. While studies targeting examination of one explicit region or feature are impossible in human populations, particularly with regards to the effects of drugs of abuse, animal models have been instrumental tools in allowing for the fine manipulation of explicit cortical regions and functions.

### ALCOHOL IMPAIRS PFC FUNCTION

Multiple studies employing rodent models have investigated the impact of alcohol dependence on prefrontal cognitive capacity. Growing evidence suggests that the rodent medial prefrontal cortex (mPFC) likely represents a functional homolog of the human medial and dorsolateral PFC (109). Reports using various rodent models of alcohol dependence [including chronic intermittent ethanol vapor exposure (CIE), liquid diet, two bottle choice; for paradigm overviews, see Ref. (110)] have found behavioral inflexibility (111), impaired extinction (112), impaired set-shifting (113), and impaired working memory (114, 115), all tasks which require a fully functioning PFC. Further, two of these studies (112, 113) linked the disruption in frontal cortical function to alcohol-induced dysregulation of the *N*-methyl-d-aspartate glutamatergic receptor (GluN) system. Two studies have investigated PFC functions into periods of abstinence following chronic ethanol exposure via CIE [10 days of abstinence (116)] or liquid diet [6 weeks of abstinence (114)]. Interestingly, at 10 days into abstinence there was a lack of impairment in cognitive flexibility, while at 6 weeks into abstinence there were severe impairments in working memory. Furthermore, investigation of anxiety-like behavior, 6 weeks into abstinence, demonstrated a lack of emotional behavioral deficit in abstinent animals (114). Taken together, it is evident that the paradigm of ethanol experience and the type of behavioral investigation are critical when determining alterations in PFC-dependent functions during abstinence, and that some PFC-dependent behaviors are less sensitive to the neurobiological alterations in the PFC in abstinent animals compared with others.

### ALCOHOL IMPAIRS HIPPOCAMPAL FUNCTION

Animal models have also been critical in resolving the explicit impact of chronic alcohol on the functionality of the hippocampus. Similar to the studies in animal models of alcohol dependence, which replicated the PFC impairments observed in humans, studies in animals exposed to translationally relevant models of chronic alcohol exposure have reproduced and expanded on the findings from human subjects. These studies have resulted in numerous structural and functional abnormalities of the rodent hippocampus similar to those seen in human studies. For example, studies in rodents employing forced chronic consumption demonstrate long-term exposures to alcohol resulted in extensive impairment in spatial memory (117–122). Unfortunately, behavioral disparities in these preclinical models have been limited to the spatial and contextual processing functions of the hippocampus with no reference to the temporal discrimination role of this structure. Nevertheless, it is clear that chronic alcohol exposure critically impairs hippocampal function in preclinical models similar to those previously discussed in clinical settings, although there remain unanswered questions in this field with regard to the complete profile of hippocampal cognitive impairments. The remainder of the review will focus on the hippocampus and provide a brief overview of the cellular and molecular mechanisms in the hippocampus that could contribute to the long-term impairments in the behaviors dependent on the hippocampus in preclinical models of AUDs.

### MOLECULAR ACTIONS OF ALCOHOL IN THE HIPPOCAMPUS

### Acute Effects on GluNs

Animal models of acute alcohol exposure have been instrumental in elucidating our understanding of the molecular actions of alcohol with regard to excitatory and inhibitory transmission in the mammalian cortex (see **Figure 1A** for a summary). GluNs are one of the main components of excitatory transmission in the hippocampus (as well as the cortex at large) and are critical for learning and memory (123). The receptors are comprised of four subunits, two obligatory GluN1 subunits, and two additional subunits, which can be any of GluN2A-D or GluN3A-B. Evidence suggests that the 2A and 2B subunits, expressed in high density in the hippocampus, are particularly sensitive to alcohol's inhibitory effects (124–127). Further, early evidence suggests that alcohol dose-dependently inhibits GluN-dependent current in cells (8) by decreasing the time the channel spends open (128).

### Acute Effects on Gamma-Aminobutyric Acid A Receptors

Inhibitory transmission plays a similarly critical role in cognition, learning, and memory in the hippocampus (and the cortex at large) (129). In addition to reducing glutamatergic transmission *via* impairment of GluN function, alcohol also acts as a non-competitive agonist, directly enhancing the chloride transmission of the gamma-aminobutyric acid A (GABAa) channel (130) and effectively hyperpolarizing the neural cells (see **Figure 1A** for a summary). Like GluN, the GABAa receptor (GABAaR) is assembled from multiple subunits: five in total, typically two alpha (A1-6), two beta (B1-3), and one further subunit, which can be a gamma (G1-3) or delta. However, unlike GluN, the precise site of action on a given subunit is debated [reviewed in Ref. (21)]: many subunits demonstrate sensitivity to alcohol (131), and much of the evidence is contradictory. For example, Wallner et al. (20) suggested that the B3 subunit mediates the receptor's sensitivity to alcohol, but this was later contradicted by a mutant mouse model lacking the B3 subunit that still demonstrated GABA-ergic enhancement following alcohol administration (132). It is highly possible that alcohol's capacity to enhance the inhibitory function of the GABAaR depends on the specific conformation of subunits rather than on action at a single subunit.

### Chronic Effects on GluNs

*N*-methyl-d-aspartate glutamatergic receptors and associated intracellular signaling molecules adapt to the recurring presence of alcohol, facilitating the development of the dependent phenotype. Post-translationally, the GluN2B subunit is phosphorylated subsequent to alcohol exposure (133), particularly in the hippocampus (13), resulting in an increase in receptor function. Over repeated alcohol exposures, an increase in expression of GluN subunits 2A and 2B (134, 135), synaptic-specific clustering of GluNs (136), as well as an increase in GluN-mediated currents (136) are observed (**Figure 1B**). It is probable that this increase in expression and function of the GluN receptor is a compensatory mechanism against chronic alcohol's inhibition of the receptor; however, when alcohol is absent from the cortical system during withdrawal, the pathologic over-expression of GluNs (137), along with the normalized GABA-ergic function in the absence of alcohol's facilitating effects, results in cortical hyperactivity and excitotoxicity.

### Chronic Effects on GABAaRs

In addition to the molecular changes observed in the GluN system following long-term alcohol exposures, the GABAaRs are subject to dynamic regulation by the drug (see **Figure 1B** for a summary). The subunits of the GABAaR are differentially expressed subsequent to chronic alcohol in a region- and subunit-specific manner [for detailed review see Ref. (138)]. Evidence suggests an exchange of subunits expressed on the cell surface with a reported reduction in A1 subunits in the hippocampus (139) and an increase of A4 (140–142) and A5 (139) following CIE. However, subunit expression is not the only element of GABAaR modulation that is altered by chronic alcohol exposure. Following withdrawal from CIE, neurons displayed heightened excitability, which was pharmacologically attributable to increases in the number of A4 containing GABAaRs (142) as well as reductions in tonic current modulators (143), increase in A4 synaptic localization (144), and subunit-specific changes in trafficking (145), leading to a preferential increase in A4 expression over other subunits. Therefore, following chronic alcohol exposure, there is a generalized reduction of GABAaR functionality, leading to heightened neuronal activity in the absence of alcohol's modulating effects.

### POTENTIAL BIOLOGICAL MECHANISM OF HIPPOCAMPAL SENSITIVITY TO AUDs: IMPACT OF ALTERED GluN AND GABAR SIGNALING IN THE HIPPOCAMPUS ON ADULT NEUROGENESIS

The regionally differential rates of cognitive recovery following abstinence from alcohol use are potentially consequent to the neurogenic properties (or lack thereof) of each region. To be more specific, cognitive function relying on the frontal cortical region in humans has been described as being recovered at an earlier time in abstinence than cognitive functions specific to the hippocampal formation of the limbic system as previously discussed. It is possible that this disparity is due to, at least in part, the ongoing adult neurogenesis in the hippocampus which occurs at a much lesser rate in the PFC of mammals (146); neurons which would be generated during critical periods of withdrawal would be developing into mature neurons during a time of negative affect (147, 148), potentially resulting in a pathologic phenotype and dysfunctional characteristics (149). This problematic phenomenon would be far more impactful in a region with high neurogenesis (such as the hippocampus) as compared with a region of low or absent neurogenesis, where the typical functioning of the existing circuitry may return upon complete washout of the drug.

Adult mammalian neurogenesis is a widely accepted phenomenon, as evidence demonstrates the existence of mitotically active cells in distinct regions of the brain, one of which is the granule cell layer of the DG of the hippocampus. Neurogenesis, the process of proliferation, differentiation, and maturation of neural progenitor cells into fully functional and integrated neuronal components of the surrounding network (150, 151), has been confirmed in numerous mammalian species, including humans (152). Assessment of cell number and structure at various time points following cell birth can provide insight into the impact of exogenous factors on the neurogenic process in the hippocampus [for comprehensive review of granule cell development see Ref. (153)].

The explicit functionality of these adult-born cells is still a topic of contention. Hippocampal-sensitive learning has been shown to positively influence proliferation and survival of new neurons [reviewed in Ref. (154)]; conversely, increases in proliferation or survival of newly born neurons can increase performance on hippocampal-sensitive tasks, while reductions or ablations of neuronal proliferation result in impaired cognitive performance [reviewed in Ref. (155)]. Acquisition, retention, and extinction of trace fear conditioning (TFC; a hippocampus-sensitive task) have been shown to be sensitive to changes in neurogenesis (156) as a result of the task's hippocampal dependence (157), but as yet, investigations into the impact of clinically relevant models of chronic alcohol on TFC performance have not been reported.

### Regulation of Neurogenesis by GluNs

Glutamatergic signaling *via* GluNs is of critical importance in regulating neural stem cells in the hippocampus, particularly in the withdrawal/abstinence period in alcohol-dependent subjects. Under basal conditions, some stages of immature neural progenitors (proliferating and differentiating cells) in the hippocampus express GluNs (158). When coupled with the evidence that GluN-dependent long-term potentiation in the DG can increase progenitor proliferation (159, 160) and survival (159), these findings imply that regulation of hippocampal neurogenesis is sensitive to GluN stimulation on newly born granule cells. Alcohol's long-term actions *via* GluNs would, therefore, affect proliferation, survival, and function of the newly born neurons in a dynamic manner that would change over the course of abstinence from alcohol. Alcohol, as described previously, has the consequence of maintaining GluNs at the synapse, effectively impairing cycling of receptors back into the cell for degradation or reuse. Therefore, the effect of alcohol on hippocampal neurogenesis would be mediated by either GluN dysregulation, GABA-ergic dysregulation, or a balance of both.

### Regulation of Neurogenesis by GABAaRs

The granule cells of the hippocampus are maintained in a quiescent state by the mossy fibers of the hilus *via* GABA-ergic regulation [reviewed in Ref. (161)]. Evidence has demonstrated that these cells do express GABAaRs (162), as do the surrounding cells of the DG (163, 164); therefore, not only are the granule cells sensitive to enhanced GABA-ergic transmission during exposure to chronic alcohol but are also subject to secondary regulation due to the modulation of activity of surrounding cells by alcohol's actions on the GABAaR. As specific subunit compositions of the GABAaR can modulate important stages of neurogenesis (particularly the maintenance of quiescent cells and proliferation), this could provide a potential mechanism by which alcohol could be modulating neurogenesis in dependent individuals. During periods of alcohol intake, GABAaR function would be supported and facilitated such that quiescent cells would be maintained (165, 166) as such and proliferation would be reduced (167–169). In the acute absence of alcohol, the facilitation of GABAaR activity would be lost and quiescent cells would be allowed to proliferate, and these effects could result in increase or decrease in cell survival in the days following withdrawal (169–171). However, impaired GABA-ergic receptor function has been shown to restrict morphology of newly born cells (172), which could reduce the number of synaptic connections and network integration required for survival and function of the granule cells and, therefore, result in net reduction of the number of surviving cells during protracted abstinence (171). This finding serves as a potential argument for the reduced survival subsequent to the increased proliferation following withdrawal in dependent animals (171).

### Regulation of Neurogenesis by Alcohol

In addition to a general understanding of neurogenesis, we are beginning to understand how alcohol exposure impacts hippocampal neurogenesis and what this may imply for cognitive performance and capacity (see **Figures 1A–C** for a summary). For example, while cellular proliferation and neurogenesis are reduced during excessive alcohol-induced dependence (167–169), early withdrawal from excessive alcohol is documented to result in an increase in cellular proliferation in the DG (169–171). The survival capacity of progenitors born during this period of increased proliferation and their functional importance are still unclear; however, reports using alcohol gavage [blood alcohol levels (BALs) reaching >400 mg%] demonstrate increased survival of newly born neurons subsequent to the proliferative burst (170, 173, 174). In contrast, animals made dependent on alcohol *via* ethanol vapor exposure (BALs maintained between 150 and 250 mg%) demonstrate a marked reduction in the number of surviving young neurons in the DG (169, 171). This difference could be attributed to differences in BALs and negative affect symptoms resulting from the exposure paradigm (gavage vs. CIE). Unfortunately, there is no conclusive evidence linking aberrant neurogenesis subsequent to alcohol dependence and impaired hippocampal cognitive function. Future studies will be required to demonstrate the plausibility of this mechanism as an underlying explanation for the deleterious effect of alcohol dependence on hippocampal function.

### SUMMARY AND CONCLUSION

The goal of this review was to provide initial evidence in support of the proposal that the cognitive recovery of the hippocampus and the PFC following abstinence from long-term alcohol abuse occurs at different rates, potentially due to their differences in cellular composition and neurogenic functionality. For example, clinical evidence supports recovery of certain PFC-dependent tasks during abstinence from alcohol at different rates compared with hippocampal-dependent tasks. Preclinical findings in animal models of alcohol exposure support the clinical observation; mechanistic studies suggest that this temporally differential rescue of PFC-dependent tasks is potentially due to the neurogenic deficits in the hippocampus during abstinence, such that the birth of new neurons during periods of negative affect results in the persistence of the hippocampal-specific cognitive deficits.

### FUTURE PERSPECTIVE

Many questions remain unanswered with regard to human hippocampal function during periods of alcohol abstinence. For example, it is clear that employing cognitive therapy can support individuals in successful attempts at abstinence. Given that extinction training is being adopted in clinical behavioral therapy to promote recovery from relapse (175), it is critical to investigate similar potential therapeutic strategies (be it behavioral or pharmacological), which will serve this purpose not only to ameliorate the cognitive disparities in these individuals but to facilitate dependent individuals in avoiding relapse to alcohol abuse.

### AUTHOR CONTRIBUTIONS

MS was responsible for the article concept and drafted the manuscript. CM and MS provided critical revision of the manuscript for important intellectual content. Both authors critically reviewed content and approved final version for publication.

### FUNDING

The authors would like to acknowledge McKenzie Fannon-Pavlich for her critical reviewing of this manuscript. Funding for this manuscript was provided by National Institute on Alcohol Abuse and Alcoholism grants to MS (T32AA00747 and F32AA023690) and CM (AA020098 and AA06420). This is manuscript number 28063 from The Scripps Research Institute.

### REFERENCES


function, and decreases behavioral responses to positive allosteric modulators of GABAA receptors. *Mol Pharmacol* (2003) 63:53–64. doi:10.1124/mol.63.1.53


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Staples and Mandyam. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Cognitive Dysfunction, Affective States, and Vulnerability to Nicotine Addiction: A Multifactorial Perspective

#### *Morgane Besson\* and Benoît Forget*

*Unité de Neurobiologie Intégrative des Systèmes Cholinergiques, Department of Neuroscience, CNRS UMR 3571, Institut Pasteur, Paris, France*

Although smoking prevalence has declined in recent years, certain subpopulations continue to smoke at disproportionately high rates and show resistance to cessation treatments. Individuals showing cognitive and affective impairments, including emotional distress and deficits in attention, memory, and inhibitory control, particularly in the context of psychiatric conditions, such as attention-deficit hyperactivity disorder, schizophrenia, and mood disorders, are at higher risk for tobacco addiction. Nicotine has been shown to improve cognitive and emotional processing in some conditions, including during tobacco abstinence. Self-medication of cognitive deficits or negative affect has been proposed to underlie high rates of tobacco smoking among people with psychiatric disorders. However, pre-existing cognitive and mood disorders may also influence the development and maintenance of nicotine dependence, by biasing nicotine-induced alterations in information processing and associative learning, decision-making, and inhibitory control. Here, we discuss the potential forms of contribution of cognitive and affective deficits to nicotine addiction-related processes, by reviewing major clinical and preclinical studies investigating either the procognitive and therapeutic action of nicotine or the putative primary role of cognitive and emotional impairments in addiction-like features.

#### *Edited by:*

*Mark Walton, University of Oxford, UK*

#### *Reviewed by:*

*François Paille, Université de Lorraine, France Patricia Robledo, Universitat Pompeu Fabra, Spain*

#### *\*Correspondence:*

*Morgane Besson morgane.besson@pasteur.fr*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 25 November 2015 Accepted: 06 September 2016 Published: 21 September 2016*

#### *Citation:*

*Besson M and Forget B (2016) Cognitive Dysfunction, Affective States, and Vulnerability to Nicotine Addiction: A Multifactorial Perspective. Front. Psychiatry 7:160. doi: 10.3389/fpsyt.2016.00160*

Keywords: nicotine, predisposition, psychiatric disorders, cognition, addiction, emotion

## INTRODUCTION

Smoking tobacco remains the most preventable cause of morbidity and mortality worldwide. Nicotine is the main psychoactive component of tobacco responsible for its addictive properties and modifies the function of the brain *via* its interaction with the nicotinic acetylcholine receptors (nAChRs) (1, 2). Drug addiction is a complex psychiatric disorder, and there are individual differences in the vulnerability to develop this pathology that can be conceptualized at different levels interacting with each other, such as environmental, genetic, and psychological contributions. Only a percentage of individuals starting to smoke tobacco eventually develop an addiction (3). In particular, there is a high prevalence of smoking in patients with psychiatric disorders. However, it has been difficult to define in clinical studies the nature of the causal interactions between these pathologies. The psychological and neural processes that underlie addiction have been shown to overlap with those that support cognitive and emotional functions. One critical question is to what extent psychiatric conditions pre-date smoking or develop after chronic exposure to nicotine. One of the main obstacles to resolving this issue is the difficulty of conducting longitudinal prospective studies in humans and of controlling for co-use of multiple substances in patient cohorts. As a consequence, preclinical research has increasingly aimed at identifying distinctive endophenotypes that may predispose individuals to nicotine addiction-like processes and/or that are influenced by nicotine exposure. Animal models can never entirely encompass the complexity of the psychological processes underlying behavior related to addiction and other psychiatric conditions in humans with full face and construct validity. Yet, through the implementation of longitudinal studies, they provide a valuable tool to precisely control the environmental (and genetic) context and the conditions of drug delivery, and to determine whether prior drug consumption influences the risk of developing specific endophenotypes or whether pre-existing endophenotypes confer vulnerability to addiction. They also allow detailed investigations of the distinct stages of addiction that may be connected to some endophenotypes to varying extents. In fact, the defining criteria of addiction are still a matter of debate, and this pathology exhibits complex dynamics with different stages, from the initiation and maintenance of drug taking to a switch toward a loss of control over drug intake and compulsive drug taking and seeking, i.e., continued despite negative consequences, together with a high rate of relapse after abstinence (4–8). With the use of experimental models of distinct addiction-like behaviors in addition to epidemiological and neurocognitive studies in human subjects, specific behavioral endophenotypes of presumed genetic origin have been identified as significant risk factors for drug addiction according to different modalities. Understanding the causal relationship between nicotine addiction and psychiatric disorders may significantly contribute to the treatment of comorbid psychiatric conditions and smoking. This review will describe and discuss both clinical and preclinical studies that have provided significant insight into this matter.

### TOBACCO SMOKING, PERSONALITY TRAITS, AND PSYCHIATRIC CONDITIONS

Vulnerability to addiction varies across individuals. Thus, although many people experiment with drugs of abuse, most do not develop drug addiction as defined by diagnostic criteria for substance-use disorder (9). Individual differences in vulnerability to abuse are thought to exist before the first drug experience, and clinical evidence suggests that these differences reflect both genetic and environmental determinants, including social influences, as well as their interaction [see Ref. (10) for review]. Cigarette smoking is the leading preventable cause of death in the Western world (11), with a prevalence considerably higher in individuals with a psychiatric diagnosis. In this part of the review, we will examine non-exhaustively the relationships described in clinical studies between smoking behavior and personality traits or psychiatric disorders, such as impulsivity, novelty/sensation seeking, attention-deficit hyperactivity disorder (ADHD), depression, and anxiety disorders (see **Table 1**).

### Impulsivity, Novelty Seeking, and Tobacco Smoking

Impulsivity is a heritable and multifaceted psychiatric construct defined by the tendency to engage in inappropriate, premature, poorly planned, and unduly risky actions without adequate forethought about the potential consequences of this behavior (50–53). It has been associated with drug addiction, including tobacco smoking (54).

#### Table 1 | Mental disorders/personality trait and nicotine addiction-related features in humans.

*References in bold describe longitudinal studies.*

*ADHD, attention-deficit hyperactivity disorder; PTSD, posttraumatic stress disorder.*

Current theories differentiate between motor and cognitive aspects of impulsive behavior. Motor impulsivity reflects a failure in motor inhibition leading to impulsive actions and can be assessed by the ability to exert volitional control over a response that has already been initiated or rendered prominent with extensive training. This type of impulsivity can notably be measured in the "stop-signal reaction time task," in which subjects are trained to respond as quickly as possible but must inhibit their response when a stop signal is presented, or in a go/no go task (54). While several studies have linked deficits in this type of impulse control with alcohol (55), cocaine (56, 57), and methamphetamine (58) addiction, the data about tobacco addiction are less clear. Tobacco smoking has been shown to decrease inhibitory control in a stop-signal task, where an increased number of errors during the stop signal and increased stop latencies were observed (59). However, another study reported no baseline differences between smokers and non-smokers in the same task (60). In addition, an increase in response inhibition failures in both stop-signal and go/no go tasks was observed after nicotine deprivation in tobacco smokers (61, 62), suggesting that nicotine withdrawal induces deficits in inhibitory control. Interestingly, a recent longitudinal prospective study showed that alterations in neural correlates of response inhibition in adolescents increase the risk for subsequent regular cigarette smoking (15), suggesting that functional brain correlates of response inhibition can be used as a marker of risk for tobacco addiction.

Cognitive aspects of impulsivity include response inhibition, delay discounting, and reward/punishment-based decision-making skills and represent the cognitive processes that regulate impulse control (54, 63–65). Delay discounting describes the tendency to discount the value of a reward as a function of the length of the delay to its delivery. Higher delay discounting rates have been associated with cigarette smoking. Thus, current smokers tended to discount future monetary reinforcers more than ex-smokers and non-smokers (66), suggesting that smoking increases cognitive impulsivity in this task and that this effect is reversible. Another study confirmed the increased delay discounting in smokers but found no differences in discounting rates for either money or cigarettes between light and heavy smokers (67), a result confirmed in a recent report (68).
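To make the notion of a discounting rate concrete, the sketch below uses the hyperbolic model that is commonly fitted to delay discounting data, V = A / (1 + kD), where A is the reward amount, D the delay, and k the discounting rate. The studies cited above are not stated here to have used this exact model, and the function names, reward amounts, and k values are hypothetical, chosen only for illustration.

```python
"""Illustrative hyperbolic delay discounting, V = A / (1 + k * D).

All names and numeric values are hypothetical; they are not taken
from the studies cited in the text.
"""


def discounted_value(amount: float, delay: float, k: float) -> float:
    """Subjective present value of `amount` delivered after `delay`."""
    if amount < 0 or delay < 0 or k < 0:
        raise ValueError("amount, delay, and k must be non-negative")
    return amount / (1.0 + k * delay)


def indifference_delay(small_now: float, large_later: float, k: float) -> float:
    """Delay at which the larger reward is worth exactly `small_now`.

    Solving small_now = large_later / (1 + k * D) for D gives
    D = (large_later / small_now - 1) / k.
    """
    if small_now <= 0 or large_later < small_now or k <= 0:
        raise ValueError("need 0 < small_now <= large_later and k > 0")
    return (large_later / small_now - 1.0) / k


def prefers_delayed(small_now: float, large_later: float,
                    delay: float, k: float) -> bool:
    """True if the discounted delayed reward exceeds the immediate one."""
    return discounted_value(large_later, delay, k) > small_now


if __name__ == "__main__":
    # Hypothetical choice: 50 units now versus 100 units in 30 days.
    for label, k in (("shallow discounter", 0.01), ("steep discounter", 0.1)):
        choice = "delayed" if prefers_delayed(50, 100, 30, k) else "immediate"
        print(f"{label} (k={k}): chooses the {choice} reward")
```

Under this model, a higher k means that delayed rewards lose subjective value faster, which is the sense in which steeper discounting is read as greater cognitive impulsivity.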

Interestingly, performance in delay discounting at age 10 was shown to predict the initiation of smoking behavior in adolescents at age 14 (12). Also, delay-discounting rate has been identified as a strong prognostic indicator of smoking relapse (13), suggesting that cognitive impulsivity can be a risk factor for subsequent tobacco smoking. Trait impulsivity has also been positively associated with the subjective rewarding effects of nicotine (14) as well as explicit expectancies about nicotine reward (16). A longitudinal study using a sample of college men and women showed that trait impulsivity predicts subsequent smoking initiation (17).

Novelty or sensation seeking can be defined as a heritable tendency to seek out varied, novel, complex, and intense sensations and emotional experiences and to show enhanced behavioral responses to novel situations (69–73). It is one of the most critical individual difference factors predicting drug use among humans (74, 75). Novelty seeking is typically measured in humans by using questionnaires such as the Tridimensional Personality Questionnaire (76), the Zuckerman Sensation Seeking Scale, or Cloninger's Temperament and Character Inventory (77). This personality trait was shown to predict tobacco use during adolescence (75, 78) and the early onset of smoking in adolescents (79, 80). In line with this, a study of longitudinal smoking patterns in adolescents found that individuals with high novelty seeking were significantly more likely to become regular smokers than to remain never smokers (18). In addition, novelty seeking was increased in heavy smokers (81) and was positively associated with sensitivity to the initial reinforcing effect of acute nicotine under controlled laboratory conditions (14, 82). A longitudinal study also showed that sensation seeking in college men and women predicts the initiation of smoking and its continuation 20 years later (17). Finally, high levels of novelty seeking have been negatively correlated with smoking-cessation success, with reduced odds of cessation compliance and poorer outcomes (19, 20).

Thus, novelty seeking seems to predict tobacco addiction, but more studies are needed in order to determine the effect of tobacco exposure on this personality trait.

One should nevertheless bear in mind that, although the association between some personality traits and drug addiction is frequently observed, there are no structured and established pre-addictive personalities. Some dissociable personality profiles, including impulsiveness and novelty seeking, may rather be considered as vulnerability factors and facilitate some aspects of the addiction process.

### Attention-Deficit Hyperactivity Disorder and Tobacco Smoking

Attention-deficit hyperactivity disorder is a developmental disorder characterized by hyperactivity, high impulsivity, and an inability to sustain directed attention (83). ADHD affects approximately 6.5–8.4% of children and between 1.9 and 6% of adults (84–86). Evidence suggests that ADHD is a predisposition factor for tobacco smoking. For example, ADHD predicted future smoking (21) and adolescents with ADHD were more likely to experiment with cigarettes and become smokers (22). In addition, ADHD symptoms during childhood, particularly hyperactivity/impulsivity, predicted later nicotine dependence in adulthood (87). ADHD status in childhood was also shown to predict time to relapse to smoking after controlling for gender, history of depression, and baseline smoking variables (23). Smokers with ADHD present an earlier onset of regular smoking, have a higher frequency of smoking behavior, show greater withdrawal symptoms, are willing to work harder for cigarette puffs, and exhibit a higher level of nicotine dependence than smokers without ADHD (24–29, 88, 89). In addition, there is an increase of ADHD symptoms during periods of abstinence in smokers that was associated with an increased risk of relapse (90). This suggests that the increased withdrawal symptoms observed in ADHD patients negatively affect the success of quitting tobacco smoking. Since ADHD is a neurodevelopmental disorder, there are no data on the influence of tobacco smoking on the emergence of ADHD. Smoking during pregnancy, however, has previously been strongly associated with the risk of ADHD in offspring (91–95), suggesting direct causality. Yet these studies did not rule out the potential influence of unmeasured familial factors (96, 97), and the association no longer holds in recent studies that used different designs accounting for these factors (97–99). This suggests that maternal smoking during pregnancy reflects a genetic predisposition rather than a causal risk factor for ADHD in offspring. Individuals with ADHD may also be more susceptible to the negative effects of smoking. Thus, smokers exhibited a greater increase in attention deficits over the years than their never-smoking twins (100), suggesting that smoking can worsen attention problems.

In conclusion, there is a complex relationship between ADHD and smoking with ADHD contributing to smoking, but smoking may also contribute to the development of attention deficits.

### Depression and Tobacco Smoking

Depression is characterized by depressed mood, anhedonia, vegetative symptoms, and impaired psychosocial functioning. Cigarette smoking and depression both account for significant morbidity, mortality, and economic burden. Depression is overrepresented among adult smokers and contributes to lower smoking-cessation rates, and cigarette smoking is overrepresented among adults prone to depression (101, 102). Longitudinal studies are useful to determine whether depressive states can influence tobacco smoking. A 21-year longitudinal study found an association between major depression (MD) and smoking, with a 19% increase in the average daily smoking rate and a 75% increase in the odds of being nicotine dependent from mid-adolescence to young adulthood (30) in people with an MD episode. In addition, adolescents with a history of MD had a 50% higher risk of progressing to daily smoking and were significantly less likely to quit by age 25 compared with controls (31). These results suggest a strong influence of MD on the likelihood of developing tobacco addiction, but several studies suggested that less severe depressive symptoms are also a risk factor for tobacco dependence. For example, depression symptoms at mid-adolescence predicted smoking progression across mid-to-late adolescence (103). Adolescents with higher depressive symptoms were more likely to start smoking (34) and to progress to regular smoking compared with adolescents with lower depressive symptoms (35–37). Another longitudinal study found that depressive symptoms in early adolescence predict faster increases in smoking behavior (104).
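Because an increase in odds is easily misread as an increase in probability, the short sketch below shows how a 75% increase in odds translates into a change in probability. The 20% baseline used here is purely hypothetical and is not a figure from the cited study (30); the function names are likewise illustrative.

```python
"""Converting a percentage increase in odds into a probability.

The baseline probability used in the example is hypothetical and is
not reported in the text.
"""


def probability_to_odds(p: float) -> float:
    """Odds corresponding to probability p, i.e. p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Probability corresponding to the given odds, odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


def apply_odds_increase(baseline_p: float, percent_increase: float) -> float:
    """Probability after scaling the baseline odds by (1 + percent/100)."""
    scaled = probability_to_odds(baseline_p) * (1.0 + percent_increase / 100.0)
    return odds_to_probability(scaled)


if __name__ == "__main__":
    baseline = 0.20  # hypothetical baseline probability of dependence
    after = apply_odds_increase(baseline, 75.0)
    print(f"baseline probability: {baseline:.3f}")
    print(f"after a 75% increase in odds: {after:.3f}")
```

With this assumed baseline, the odds rise from 0.25 to 0.4375, so the probability rises to roughly 0.30 rather than to 0.35; the gap between the two readings widens as the baseline probability grows.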

In addition, depression seems to have a negative influence on smoking cessation since history of MD reduced the odds of short- and long-term smoking abstinence (32, 33). An increase in negative mood in the early stages of treatment for tobacco dependence was predictive of failure to quit smoking or smoking relapse (105, 106).

These data clearly indicate that depression is a risk factor for tobacco addiction, but other studies also support the opposite, i.e., that smoking influences the development of depression. Cigarette smoking during adolescence was shown to predict the development of depressive symptoms (107–111), and a longer duration of smoking dependence has been correlated with an increased risk of depression. This suggests that the vulnerability to depression increases with higher rates of smoking (110).

In addition, quitting smoking has been associated with a significant decrease in depression compared with continued smoking (112), supporting the hypothesis that smoking might be the cause for mental health problems and not necessarily the inverse.

In conclusion, despite the fact that some of these studies failed to identify a reciprocal relationship between tobacco addiction and depression (30, 37, 108), the relationship seems to be bidirectional (113). As described earlier, tobacco dependence predicts the development of depressive symptoms and MD, while a history of MD predicts the onset of daily smoking and progression to tobacco dependence. This conclusion is supported by a meta-analysis of 15 longitudinal studies in adolescents that reported evidence for a bidirectional relationship, with a larger effect of depression status on smoking likelihood than the effect of smoking on depression (114).

### Anxiety Disorders and Tobacco Smoking

Anxiety disorders, such as panic disorders, phobias, generalized anxiety disorder, and posttraumatic stress disorder (PTSD), are among the most common mental disorders (115, 116). A strong relationship between anxiety disorders and tobacco smoking has been established in humans. Indeed, while tobacco smoking rates significantly declined from 2004 to 2011 in people without psychiatric illness, this is not the case in people with anxiety disorders (117). Along this line, patients with anxiety disorders had significantly higher smoking rates than a control population (38, 39), and anxiety disorders were significantly more prevalent in people diagnosed with nicotine dependence than in a nondependent population (118). In addition, patients with social anxiety or generalized anxiety disorders exhibited more severe nicotine dependence at baseline and smokers with a lifetime history of anxiety disorder were resistant to pharmacotherapy for abstinence (40).

PTSD is one of the most common anxiety disorders that can develop in humans after exposure to one or more traumatic events, with a lifetime prevalence of approximately 8% in the general population (119). Smoking initiation and daily smoking rates were shown to increase after trauma (120, 121), and the presence of PTSD symptoms, such as hyperarousal and emotional numbing, is a predictor of tobacco dependence (43–46). Taken together, these data suggest that anxiety disorders are risk factors for the development of tobacco addiction, but prior smoking has also been found to be associated with an increased risk of developing panic disorder or PTSD after a trauma (122, 123). In addition, smoking or smoke exposure in early life increased the likelihood of developing an anxiety disorder later in life (124, 125).

Finally, anxiety disorders have also been associated with greater difficulty in quitting tobacco smoking, since smokers with a lifetime anxiety disorder have significantly lower rates of abstinence and report more severe withdrawal symptoms than control smokers (41, 42, 126, 127). PTSD patients also exhibited lower rates of quitting and shorter times to first smoking relapse after quitting (38, 47, 48), and experienced worse nicotine withdrawal symptoms than a non-PTSD population (49). However, as for depression, follow-up studies showed that anxiety and stress decreased in abstinent subjects (112). This suggests that the assumption of beneficial effects of nicotine on anxiety and mood, which probably contributes to the maintenance of smoking in populations with mental health problems, should be challenged more strongly to motivate quitting.

Thus, the relationship between anxiety disorders and tobacco addiction is probably bidirectional, a conclusion supported by several additional studies (120, 128–130).

### Schizophrenia and Tobacco Smoking

Schizophrenia is a chronic disabling disorder characterized by positive symptoms (hallucinations and delusions), negative symptoms (blunted affect, alogia, reduced sociability, and anhedonia), and persistent cognitive deficits (memory, concentration, and learning). It affects approximately 1% of the population (131). Cigarette smoking is highly prevalent in persons with schizophrenia and schizoaffective disorder, with rates ranging from 45 to 88%, compared with <20% in the general population (132). Individuals with schizophrenia smoke more cigarettes per day, are more nicotine dependent, and also have more difficulty quitting smoking than smokers with no history of mental health problems (38), leading to high mortality due to tobacco-related illnesses (39). Interestingly, smokers with schizophrenia have higher plasma and urine levels of nicotine, even when matched for the number of cigarettes smoked per day and other indices of nicotine dependence (133–135). This is not due to a difference in nicotine metabolism (136) but rather to the manner in which cigarettes are smoked by schizophrenic patients. Indeed, schizophrenic patients take significantly more puffs, have shorter inter-puff intervals, and have larger total cigarette puff volumes compared with matched healthy control smokers (137). Smokers with schizophrenia also exhibited a higher intensity of demand and greater consumption and expenditure in a cigarette purchase task, suggesting a higher incentive value of cigarettes in this population (138).

Thus, schizophrenia appears to be a strong risk factor for tobacco addiction, and individuals with schizophrenia may sustain smoking because of its higher reinforcing effect and to remedy certain symptoms of the disorder (139). Further research is now needed to look at the alternative possibility that tobacco smoking may confer vulnerability to the development of schizophrenia.

### EFFECTS OF NICOTINE ON COGNITION, PERSONALITY TRAITS, AND PSYCHIATRIC DISORDERS IN HUMANS

As described in the first part of this review, several clinical studies have linked tobacco addiction with impulsivity, novelty seeking, attention, mood disorders, ADHD, and schizophrenia. However, investigating the effects of nicotine itself on these personality traits and psychiatric disorder-associated phenotypes is important to better understand these relationships (see **Table 2**).

### Cognition

In addition to its abuse liability, nicotine can also enhance cognitive functions, including attention and memory (156). Thus, nicotine and other nAChR ligands have been proposed as potential therapeutics for the treatment of cognitive deficits in pathologies, such as schizophrenia, ADHD, and Alzheimer's disease (157, 158). However, chronic cigarette smoking has also been associated with decreased cognitive performance in middle age (159, 160) and increased risk for cognitive decline and dementia later in life (161).

Few studies have investigated the impact of nicotine on attention in humans. Transdermal nicotine improved performance in a rapid visual information-processing task (140, 141), and nicotine exposure through nasal spray decreased reaction times in a visual oddball task in smokers (142), suggesting that acute nicotine increases sustained attention in smokers. Transdermal nicotine also significantly improved attention in both schizophrenic patients and controls (145) and visual attentional performance in mildly deprived smokers (162, 163). These studies indicate that nicotine has a pro-attentional effect in humans. Along this line, there is evidence to suggest that nicotine may be useful in treating the symptoms of ADHD. Positive effects of nicotine have been reported on attention, concentration, and other ADHD symptoms among adults with ADHD (22, 148, 149, 164, 165), indicating that ADHD patients may smoke as a form of self-medication.

Some studies further suggest a promnesic effect of smoking. Thus, abstinent smokers exhibited more impairment in visuospatial working memory (VSWM) compared with current smokers (166), and overnight smoking abstinence in schizophrenic patients impaired VSWM performance, an effect reversed by reinstatement of cigarette smoking. The effect of smoking reinstatement was blocked by the non-selective nAChR antagonist mecamylamine (167), indicating that the procognitive effect of tobacco smoking in VSWM tasks is mediated by nAChR activation in patients with schizophrenia. Nicotine administration *via* gum, patch, or injection also improved short-term memory recall in non-smokers (168–170). Interestingly, the effect of nicotine on memory seems to depend on baseline performance. Thus, Niemegeers et al. showed that the effect of subchronic nicotine (1 or 2 mg through oromucosal spray three times daily for 3 days) was dependent on baseline performance in working and visual memory in young and elderly healthy subjects (171). Subjects with lower baseline performance benefited from nicotine administration, while subjects with higher baseline performance performed worse after nicotine administration. This suggests that subjects with lower cognitive performance, irrespective of age, may benefit from nicotine.

*PPI, prepulse inhibition of startle reflex; ADHD, attention-deficit hyperactivity disorder; OCD, obsessive–compulsive disorder.*

There have been few publications on the effect of nicotine on executive functions, and it is difficult to draw conclusions due to the heterogeneity of the procedures and results. For example, nicotine (1 mg through nasal spray) improved prospective memory in minimally deprived (2 h) smokers and non-smokers when the subjects were able to devote resources to that task, but impaired the performance when they completed a concurrent auditory monitoring task (143). Nicotine (2 mg gum) has been shown to improve performance in complex flight simulation tasks, which involve high cognitive load, in non-smoking pilots, but had no effect on the executive function aspect of attention in never smokers (2 and 4 mg gums) (172).

In a study investigating the effect of nicotine on the performance of male non-smokers with high or low attentiveness on the Wisconsin Card Sorting Test (WCST), nicotine administration (7 mg patch) in the high attentiveness group impaired the performance (173). This suggests a deleterious effect of nicotine on strategic planning, set-shifting, and mental flexibility in this subpopulation. Finally, in a study using a virtual reality paradigm that assesses multiple cognitive constructs simultaneously (144), nicotine improved the overall performance, time-based prospective memory, and event-based prospective memory in minimally (2 h) deprived smokers (4 mg nicotine gum), but not in never smokers (2 mg nicotine gum). At the same time, action-based prospective memory was enhanced in both groups.

Thus, nicotine seems capable of improving, impairing, or having no effect on executive functions depending on the task, the dose of nicotine, or the target population, highlighting the need for new studies to obtain a clearer picture of this issue.

Several studies show that cigarette smoking impairs decision-making processes assessed through different neurocognitive tasks (174–177). However, these studies do not discriminate the effects of nicotine alone from the effects of other psychoactive compounds found in tobacco smoke. Further studies are needed to provide clear information about the consequences of chronic nicotine exposure on decision-making.

Deficits in pre-attentive sensory information processing, characterized by the inability to filter out or gate sensory information, are thought to contribute to the higher order cognitive deficits observed in schizophrenia. These include deficits in attention, working memory, verbal learning and memory, decision-making, and executive functioning (178, 179). One measure of sensory processing is P50 suppression, which measures the inhibition of the electroencephalographic cortical response (the P50 wave, a positive deflection occurring about 50 ms after stimulus onset) to the second of two auditory stimuli presented in close succession. Patients with schizophrenia fail to suppress the response to the second auditory stimulus, reflecting gating deficits (180). Several studies have shown that nicotine can improve P50 suppression. Thus, cigarette smoking improved P50 suppression in abstinent smokers with schizophrenia (181), and nicotine gum improved P50 suppression in non-smoking subjects with impaired gating or healthy controls (182–184).

Another measure of sensory information processing is the prepulse inhibition (PPI) of the startle reflex, which reflects the inhibition of a blinking reflex to a loud startling stimulus when it is preceded by a weak prepulse stimulus. This gating mechanism is also impaired in patients with schizophrenia (185), and nicotine (administered *via* nasal spray or subcutaneous injection) improved PPI in smokers and non-smokers with schizophrenia or in healthy subjects (146, 147). In addition, PPI of satiated smokers with schizophrenia is comparable to PPI of smokers without schizophrenia (186). Taken together, these data suggest that nicotine can improve sensory information processing and that patients with schizophrenia may smoke in part to alleviate their deficit in sensory gating.
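As a point of reference, both gating measures are commonly summarized as simple ratios. The expressions below are a generic sketch under that assumption; the symbols are introduced here for illustration only, and the cited studies may use variants of these indices.

$$
\text{P50 ratio} = \frac{A_{S2}}{A_{S1}}, \qquad
\%\text{PPI} = 100 \times \left(1 - \frac{R_{\text{prepulse+pulse}}}{R_{\text{pulse alone}}}\right)
$$

Here $A_{S1}$ and $A_{S2}$ denote the P50 amplitudes evoked by the first and second auditory stimuli, and $R$ denotes the startle response magnitude. Under this convention, a lower P50 ratio and a higher %PPI both indicate stronger sensory gating, so an improvement by nicotine corresponds to a decrease in the first index and an increase in the second.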

Very few studies have investigated the effect of nicotine on impulsivity in humans. A positive correlation between levels of nicotine exposure and discounting of delayed monetary reinforcers has been observed in chronic smokers but not in ex-smokers (187, 188), suggesting that nicotine administration through smoking increases cognitive impulsivity, an effect that is reversible. However, a positive effect of nicotine on the Stop Signal Reaction Time measure of the Stop Signal Task has been observed in non-smoking adolescents and young adults with ADHD, and in a control population (151–153), indicating that nicotine can reduce motor impulsivity. Thus, nicotine appears to have a differential effect on these two types of impulsivity, but more studies are needed before firm conclusions can be drawn.
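For readers unfamiliar with delay discounting, the steepness of discounting is often described with a hyperbolic model. The expression below is a minimal sketch of that standard model, not necessarily the analysis used in the cited studies; the symbols $V$, $A$, $D$, and $k$ are introduced here for illustration.

$$
V = \frac{A}{1 + kD}
$$

Here $V$ is the subjective value of a reward of amount $A$ delivered after a delay $D$, and $k$ is a fitted discounting parameter. A larger $k$ means that delayed rewards lose value more quickly, which is interpreted as greater cognitive (choice) impulsivity.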

We did not find additional clinical data on the effects of nicotine on cognitive impulsivity or on novelty seeking, highlighting the need for such investigations.

### Depression

Self-medication is one of the possible explanations for the impact of depression on cigarette smoking since nicotine reduces negative affect and can have antidepressant effects (189). This theory is supported by the fact that patients with MD increased their smoking behavior when they experienced depressive symptoms (190). In addition, several clinical studies reported that nicotine administration through transdermal patches reduced symptoms of depression, even in non-smoking depressed patients (154, 191) and relieved self-reported depression in regular smokers (150).

Interestingly, chronic administration of low levels of nicotine, as delivered by the nicotine patch, is thought to desensitize, rather than activate, nAChRs (192, 193), suggesting that the therapeutic effect of nicotine on depression may be mediated by inactivation of nAChRs. This is supported by the fact that mecamylamine, a non-selective antagonist at heteromeric nicotinic receptors, decreased depression-like symptoms in patients with Tourette's disorder (194–197) and enhanced the effects of a selective serotonin reuptake inhibitor (SSRI) in depressed subjects (198).

In conclusion, nicotine can relieve some symptoms of depression, potentially *via* desensitization of nAChRs. This supports the self-medication hypothesis, although it may not be the only valid one.

### Anxiety Disorders

Several studies have shown a positive association between symptom severity in PTSD patients and their desire to smoke in order to reduce negative affect (129, 199–201). Other studies also suggested that this association was mediated by the expectancy that smoking would reduce negative affect (202) and that patients with PTSD smoked and relapsed to smoking in response to negative affect and trauma (48, 203). This suggests that people with PTSD smoke to relieve negative affect and anxiety as a form of self-medication, a hypothesis supported by the fact that PTSD symptoms are reduced by nicotine intake (43–46) and by the anxiolytic effect of nicotine patches in non-smokers with obsessive–compulsive disorder (155). Thus, people with anxiety disorders may smoke to alleviate their symptoms, but more clinical studies on the effect of nicotine on anxiety are needed to support this conclusion.

### PREDISPOSING ENDOPHENOTYPES FOR NICOTINE TAKING AND SEEKING IN PRECLINICAL STUDIES

Some psychological constructs have been repeatedly associated with vulnerability to addiction, in particular sensation seeking, impulsivity, and anxiety (6, 7, 204, 205). To date, most preclinical animal research on individual differences in the response to drugs of abuse has focused on cocaine. Additional work is now needed for nicotine, although some interesting data have nevertheless been generated, as detailed in the following paragraphs (see **Table 3**). In this review, we will strictly focus on behaviors reflecting processes that directly contribute to the addiction cycle, such as those related to (i) drug rewarding properties (e.g., conditioned place preference (CPP), acquisition of self-administration), (ii) later stages of self-administration (e.g., increasing fixed ratios), (iii) motivation for the drug (e.g., progressive ratio schedules of reinforcement), (iv) persistence of drug seeking (e.g., extinction of self-administration), (v) relapse, and (vi) withdrawal syndrome during abstinence.

### Impulsivity

High impulsivity has been associated with a wide range of neuropsychiatric disorders, including ADHD (224), mood disorders (225), and also drug addiction (64, 226, 227). Findings in trait-impulsive laboratory animals suggest that high impulsivity represents a vulnerability factor for addiction to several classes of drugs including cocaine (228–230), alcohol (231), and nicotine (53, 206). One plausible hypothesis is that high impulsivity results from a dysfunction of the frontal cortex and that this pre-existing dysfunction may facilitate the progressive incapacity of the frontal cortex to suppress maladaptive responses that develop following repeated exposure to a drug (232). Alternatively, drug intake may normalize excessive impulsivity in some individuals and may therefore represent a form of self-medication (53). As described earlier, impulsivity encompasses a complex array of behavioral processes, which can be categorized through at least two major components: motor/action impulsivity (motor disinhibition) and cognitive/choice impulsivity (impulsive decision-making). Several procedures have been developed to provide objective measures of impulsivity in animals, including delay-discounting tasks and the 5-choice serial reaction time task, an analog of the human continuous performance task (233, 234).

Table 3 | Association between pre-existing endophenotypes and nicotine addiction-related features in animal studies.

*5-CSRTT, 5-choice serial reaction time task; IVSA, intravenous self-administration; PR, progressive ratio; CPP, conditioned place preference; EPM, elevated plus maze.*

Very few preclinical studies have examined the putative link between pre-existing manifestations of impulsivity and nicotine addiction-like behaviors. Yet, one comprehensive study has shown that poor impulse control influences the motivational properties of nicotine and of nicotine-associated cues in a self-administration procedure in rats, and that sub-dimensions of impulsivity predict vulnerability to distinct stages of nicotine-seeking behavior (206). The authors found that high motor impulsivity on a 5-choice serial reaction time task predicts both enhanced self-administration of nicotine during acquisition and increased motivation for nicotine under a progressive ratio schedule of reinforcement. At the same time, high choice impulsivity on a delayed reward task was mostly predictive of both increased resistance to extinction of nicotine seeking and increased cue-induced relapse of nicotine seeking after extinction. High choice impulsivity was also associated with higher motivation for nicotine when response requirements were increased, an observation that these authors confirmed in a second study (207). In contrast, high- and low-impulsive rats selected on a delay-discounting task appear to show similar somatic withdrawal syndrome intensity after chronic exposure to a low dose of nicotine (208). These data suggest that the two sub-dimensions of impulsivity influence both distinct and overlapping processes during the development of addiction in vulnerable individuals.

### Response to Novelty

The second behavioral factor strongly linked to addiction including smoking is the novelty/sensation seeking trait (7, 205, 235). Like impulsivity, novelty/sensation seeking represents a multifaceted behavioral construct and can be divided into a number of dimensions. Several tasks have been developed in animal models to assess responses to novelty.

The primary animal model of sensation seeking measures it as enhanced locomotor activity in a novel and inescapable environment (236, 237). As for impulsivity, only a small number of preclinical studies have examined the relationship between a pre-existing high locomotor response to novelty and nicotine addiction-like behaviors. Consistent with what was reported for other psychostimulants (237), one study found that high locomotor responding to a novel environment predicted the propensity to self-administer nicotine under both fixed and progressive ratios of reinforcement in rats (209). However, such an association was not observed in a more recent study where rats screened as high and low responders to novelty displayed similar levels of nicotine self-administration, although high responders were more prone to self-administer nicotine when it was delivered concomitantly with monoamine oxidase inhibitors (MAOIs) (210). In contrast, a study reported that mice showing low basal locomotor activity manifested nicotine-induced CPP, while mice exhibiting high basal locomotor activity did not (211). However, in this study, the mice had previously been exposed to nicotine for prior experimental testing, which might have influenced subsequent nicotine rewarding effects (238). Consistently, other authors showed that rats classified as low responders according to their locomotor response to novelty following an injection of nicotine showed nicotine-induced CPP after a long- but not short-term conditioning procedure, while rats classified as high responders did not show CPP under any condition (212). Also, rats selected as high locomotor responders to novelty showed enhanced social anxiety-like behavior during abstinence after repeated nicotine exposure (213–216).

In addition to the sensation seeking trait that is modeled as high locomotor reactivity to novel environments, novelty seeking has been proposed to reflect a distinct dimension of sensation seeking that would differentially contribute to the vulnerability to develop addiction (239, 240). The terms sensation seeking and novelty seeking are nevertheless often used interchangeably throughout the literature. In animal studies, novelty seeking *per se* is modeled by a high propensity to visit a novel object or environment in a free choice procedure, the so-called novelty preference. Very few studies have attempted to identify the predictive value of novelty seeking for the appetence for nicotine. Interestingly, it has been shown that rats screened as high novelty seekers, as measured by their preference for a novel object in a procedure where they could freely explore either a novel or a familiar object, were also characterized as high locomotor responders to novelty, as measured by the number of rears they displayed in an open field (217). However, high novelty seeker rats did not differ from rats screened as low novelty seekers when subsequently tested for oral nicotine consumption. In another study, the same authors also observed no enhanced nicotine-induced CPP in rats with high rearing activity, although it is difficult to draw conclusions since they did not observe nicotine CPP in any of the rat subpopulations tested in this study (218). Using multiple regression analysis, other authors reported that novelty seeking measured as exploration of a novel object predicted nicotine self-administration in female, but not in male, rats (219). Another animal model of novelty seeking, based on the number of head-dips in the hole-board apparatus, has been used (241). Mice preselected for high novelty seeking in this test showed a marked increase in oral nicotine intake over time, while mice with low novelty seeking did not (220). However, mice that showed high head-dip behavior in the hole-board task and had been exposed to nicotine during gestation and suckling tended to consume less nicotine when tested during adolescence (242). In contrast, the same study showed that mice similarly exposed to nicotine and showing high rearing or high general locomotor behavior in the hole-board displayed increased oral nicotine intake.

Taken together, these data suggest that additional work is clearly needed to determine conclusively whether a high response to novelty or high novelty seeking represents a significant risk factor for nicotine addiction and, if so, for which specific features of this disorder. Novelty seeking measured as high novelty preference, but not high novelty-induced locomotor activity, has notably been shown to predict the compulsive use of cocaine in rats, a hallmark feature of addiction (243). The existence of a similar causal association has not been investigated for nicotine, partly because behaviors reflecting loss of control over nicotine intake and compulsive nicotine taking and seeking have not been accurately modeled so far. The recent development of increasingly reliable models may open new paths for such longitudinal investigations (244–247).

### Anxiety and Mood Disorders

There is a high prevalence of tobacco smoking in subjects with mood or anxiety disorders (235, 248–250). It has been proposed that individuals may use drugs including nicotine as a coping strategy to self-regulate affective distress states (251–253). Drug users may self-medicate for affective distress existing before the initiation of drug use and also to alleviate mood and anxiety distress that are part of the withdrawal syndrome resulting from abstinence (254). Alternative explanations for the strong association between smoking and mood and anxiety disorders are also to be considered, notably since repeated use of nicotine significantly impacts anxiety and mood processing. Below, we review the preclinical studies that assessed whether the manifestation of such disorders beforehand may predict the future response to nicotine.

In preclinical studies, anxiety is usually assessed using procedures that exploit the emotional conflict occurring between the innate strong tendency to explore novel environments and the natural fear of open and/or brightly lit spaces. In particular, the elevated plus maze (EPM) is commonly used with anxiety measured as the preference of animals for closed versus open arms (255). High anxiety in this task predicts several features of cocaine and alcohol, but not heroin, addiction (7). Adolescent mice with high anxiety in this test showed similar levels of oral nicotine intake as mice with low anxiety in a free choice procedure (221). However, during a withdrawal period after 2 weeks of exposure to nicotine through their drinking bottles, adolescent mice with high anxiety consumed less nicotine than mice with low anxiety when tested in a free choice procedure (221). The same group further showed no differences in oral consumption of nicotine in a free choice procedure between adolescent mice with high and low anxiety classified according to their percentage of center squares crossed in a hole-board activity box (220). Another study also reported no association between prior behavioral measurements on the EPM and oral nicotine consumption in rats (217). In contrast, a study in adolescent rats reported that individuals with high anxiety measured as the time spent in the white versus the black chamber of a biased CPP apparatus manifested subsequent nicotine-induced CPP while individuals with low anxiety did not (222). Furthermore, in a comprehensive study assessing several risk factors for nicotine self-administration in a social context in rats, multiple regression analysis found that anxiety measures on the EPM were a predictor of nicotine intake in males, but not in females, while measures of depression on the tail suspension test were predictors of nicotine intake in both males and females (219). 
In males, both depression- and anxiety-related measures also predicted context-induced nicotine reinstatement. Interestingly, mice generated from the intercross of high (C57BL/6J) and low (C3H/J) emotional mouse strains and classified as "high stress reactive" according to their scores in elevated zero maze, light–dark box, startle response, and forced swim tests showed higher vulnerability to relapse, but not to initiation or maintenance of nicotine self-administration, compared with low and average stress reactive animals (223).

In addition to data regarding the causal link between inter-individual differences in anxiety- and depression-like behaviors and appetence for nicotine, it was demonstrated that acute stressor exposure through a single episode of intermittent footshock administered 24 h before the start of place conditioning dose-dependently facilitated acquisition of CPP to nicotine in adolescent rats (256). Prenatal stress in rats also increased the reinforcing properties of nicotine in a CPP procedure and anxiety-like withdrawal symptoms at the cessation of nicotine exposure (257, 258). Finally, chronic mild stress, considered a model of depression, was found to exacerbate the nicotine withdrawal syndrome in rats when delivered prior to nicotine exposure (259).

Although these data are heterogeneous, they suggest that anxiety and mood disorders may represent a significant predictor of nicotine addiction and may notably influence the vulnerability to relapse after abstinence, depending on the sex and the age of the individual.

### Cognitive Impairments

In addition to alleviating stress and anxiety and improving mood, nicotine has the ability to enhance cognition. Nicotine use has also been proposed as a self-treatment for cognitive deficits that are encountered in numerous psychiatric diseases strongly represented in smoker populations, such as schizophrenia or ADHD (260). As for other aspects of the comorbidity between smoking and psychiatric conditions, one fundamental pending question is whether cognitive deficits are of premorbid origin or develop after long-term exposure to nicotine and subsequent withdrawal. Animal models have proven to be useful tools for helping to resolve these issues, since they allow well-controlled longitudinal studies to be conducted. Nevertheless, while many studies have looked at the effects of nicotine on cognitive processes, there is a great lack of preclinical studies investigating the relationship between inter-individual differences in cognitive functions, such as baseline impairments in attention, learning, and memory functions, and addiction-like behaviors, especially with regard to nicotine. One study provided evidence for a causal link between prior cognitive deficits and behavioral response to nicotine, by looking at individual differences in baseline PPI of the acoustic startle reflex and subsequent nicotine-induced locomotor effects including locomotor sensitization. Disruption of PPI is a model of cognitive impairment in schizophrenia and reveals deficits in the sensorimotor gating system, which is critical for the integration of sensory and cognitive information processing and the execution of appropriate motor responses. The authors showed that the acute effect of nicotine on locomotion was higher in rats classified as high-inhibitory, while locomotor sensitization after repeated exposure to nicotine developed only in low-inhibitory rats (261). 
Another study reported that neonatal ventral hippocampal lesions that produced post-adolescent onset, pharmacological, neurobiological, and cognitive features of schizophrenia, such as spatial learning and working memory deficits, increased nicotine self-administration and nicotine seeking during extinction in adult rats (262). Furthermore, spontaneously hypertensive rats, considered as the most valid animal model of ADHD and that display symptoms of inattentiveness, impulsivity, and hyperactivity, show enhanced nicotine self-administration (263) and CPP (264). It has also been shown that social interaction phenotypes are predictor of nicotine self-administration and nicotine seeking in rats, although it is difficult to conclude about which cognitive functions – if any – were implicated in such a causal association (219).

Taken together, these data suggest that different behavioral factors may preferentially contribute to some of the many dimensions of the addiction cycle. Combinations of some predisposing behavioral traits may result in specific vulnerability profiles predicting higher risk for starting nicotine use or shifting toward nicotine abuse, or for relapse during abstinence. For instance, outbred rats classified as high locomotor responders to novelty show decreased anxiety as compared with low responders (265). Also, as mentioned earlier, a study based on a dimensional analysis approach within a single and large population of rats reported that high locomotor reactivity to novelty predicts the propensity to self-administer cocaine, while high novelty seeking in a free choice procedure predicts the transition to compulsive cocaine seeking (243). Additional studies measuring the inter-individual vulnerability for different personality traits and addiction-like phenotypes in the same population of animals may significantly improve our understanding of vulnerability to nicotine addiction.

### EFFECTS OF NICOTINE ON COGNITIVE AND AFFECTIVE ENDOPHENOTYPES IN PRECLINICAL STUDIES

### Impulsivity

In addition to a possible influence of pre-existing impulsivity on the later development of drug abuse, psychostimulant abuse may itself lead to the increased impulsivity often observed in chronic drug abusers, including nicotine users, and thereby help to develop and maintain addiction (see **Table 4**) (348).

Animal studies on the effects of nicotine on inhibitory control have mostly focused on motor impulsivity using attentional tasks. Acute nicotine exposure consistently increased premature responding on serial reaction time (266–272) and go/no-go tasks (273) in rats. These effects appear to be long-lasting, although data about the effects of chronic nicotine exposure on motor impulsivity are fewer and less consistent (268, 271, 274, 276). One recent study in mice demonstrated that chronic oral nicotine, but not acute nicotine injections, attenuated phencyclidine-induced increases in motor impulsivity (349). Increased motor impulsivity was further reported in rats after prenatal exposure to nicotine, while cognitive impulsivity was not affected (350, 351). In adolescent, but not post-adolescent, rats, repeated exposure to nicotine increased impulsive action but not impulsive choice (275).

Few animal studies have focused on the consequences of nicotine exposure on cognitive impulsivity using delay-discounting tasks, and the data are more heterogeneous. Acute injections of nicotine dose-dependently increased impulsive choice in rats, while repeated injections of nicotine also increased impulsive choice, but to the same extent regardless of the dose (277). After nicotine treatment cessation, impulsive choice remained enhanced for a long period before gradually returning to baseline, suggesting that chronic nicotine exposure can produce long-lasting, although reversible, alterations in inhibitory control. Acute exposure to nicotine increased both impulsive action in a go/no-go task and impulsive choice in a delayed reward task in rats, with greater sensitivity of impulsive choice to nicotine (273). Both acute and subchronic injections of nicotine increased impulsive choice in rats in a procedure where the delayed reward was made preferable by decreasing the probability rather than the magnitude of the immediate reward (278). In contrast, one study reported decreased impulsive choice in rats after acute nicotine, and this effect was abolished after repeated nicotine injections (279). Finally, in rats with high cognitive impulsivity, chronic nicotine exposure and nicotine withdrawal had no effect on impulsive choice, while chronic nicotine exposure increased impulsive choice in low-impulsive rats, with no effects on animals with intermediate impulsivity levels (352). Nicotine may thus have varying effects on choice processing, depending on key parameters such as basal levels of impulsivity, reinforcement amount, or delay (e.g., adjusting versus fixed delay), and the genetic background of the rats.

### Anxiety and Mood Disorders

The effects of acute nicotine exposure on anxiety-like behavior are highly dependent on the task, dose, timing of testing, sex, strain, age, and basal anxiety levels of the animals (353, 354). In the EPM, acute or subchronic systemic nicotine was found to be anxiolytic in some studies (280, 285, 293), anxiogenic at both low and high doses in others (288, 289, 292, 294), or to have no effects (288), in rats. Inconclusive data have also been obtained in mice, with anxiolytic effects at low doses and anxiogenic effects at high doses of nicotine in C57BL/6J, CD1, and BALB/c mice (283, 284, 286, 287), and anxiogenic effects with an intermediate dose with anxiolytic action when given subchronically in Swiss mice (290, 291). In the social interaction test, it is also generally found that low doses of nicotine induce anxiolytic effects, while high doses are anxiogenic (281). However, one study reported that acute nicotine injections performed 5 min before testing induced anxiogenic effects, whereas nicotine injections using the same dose but performed 30 min before the task elicited anxiolytic effects (282). Nicotine also reduced stress-induced hyperthermia (355).

Interestingly, a tolerance to nicotine's effects on anxiety may develop over time. Chronic exposure to nicotine was found to no longer have effects on anxiety or to induce anxiolytic effects to which tolerance also develops eventually in the EPM and the social interaction test, in rats and mice (289–291, 293, 299, 300). The consequences of chronic nicotine exposure also depend on several factors such as sex or basal levels of anxiety. For instance, mice that overexpress the R isoform of acetylcholinesterase exhibit increased anxiety that is normalized by chronic forced nicotine consumption (356). Chronic nicotine treatment also reversed affective deficits produced by chronic mild stress (357). Yet, increased anxiety was also observed in the EPM and the light–dark box after chronic nicotine consumption (296–298). One study reported increased anxiety in the social interaction test in rats after nicotine self-administration, which may appear contradictory to the self-medication hypothesis (295).

#### Table 4 | Effects of nicotine administration on affective and cognitive processes in animal studies.

*DRL, differential reinforcement of low rate; EPM, elevated plus maze; 5-CSRTT, 5-choice serial reaction time task.*

Increased anxiety is consistently observed when testing is performed during nicotine withdrawal in the EPM, light–dark box, or social interaction test (221, 282, 295, 358–361) and is reduced by nicotine injection (289). Nicotine withdrawal also increased sensitivity to stressors in the light-enhanced startle paradigm (362).

These studies suggest that nicotine effects on anxiety are dependent on various factors such as the source of anxiety, baseline levels, and genetic background of the individuals. Nicotine may be used to self-medicate anxiety-related distress associated with abstinence or in people with predisposing phenotypes, while it may have opposite effects on anxiety in other individuals or under different conditions. In the latter case, smoking behavior might be sustained by the belief that nicotine consumption will alleviate the anxiety that was essentially induced by smoking itself in the first place, while long-term smoking cessation would actually be much more beneficial for reversing such anxiety-related problems.

The effects of nicotine on fear conditioning in rodents are clearer than those on anxiety-like behavior (363). Studies have consistently reported enhanced hippocampus-dependent fear conditioning in mice after acute nicotine exposure (302–305, 307), while there is no effect on hippocampus-independent fear conditioning or on general freezing behavior (301, 302). Acute nicotine was further shown to impair contextual safety discrimination in a safety learning paradigm (310). A tolerance to these effects seems to develop under chronic nicotine exposure in mice and rats, while nicotine withdrawal altered fear conditioning (306, 308, 309, 353, 363). Furthermore, one study showed that nicotine had differential effects on extinction of fear conditioning depending on when it was administered, during training and/or during extinction, and on the context during extinction (364), suggesting that nicotine may strengthen contextual fear memories and interfere with extinction. Chronic nicotine administration 2 weeks prior to training impaired subsequent cued – but enhanced contextual – fear extinction (365). Studies on the extinction of fear conditioning are particularly relevant to the hypothesis that nicotine abuse reflects self-medication for emotional distress. Further investigation will hopefully be carried out along these lines in the future.

Numerous studies showed antidepressant-like effects of nicotine in rat and mouse models, such as in learned helplessness (316) and forced swim tests (311–314, 317). However, some authors have observed decreased depression-like phenotypes in response to nicotine only in rat strains that display enhanced basal levels of depressive features, with contradictory effects depending on the post-injection time of the testing (311, 315, 318). As for anxiety, factors including age, sex, and genetic background may also influence the action of nicotine on mood. One study notably demonstrated that while acute nicotine decreased depression-like behavior in adult Sprague-Dawley rats, it had no effect in adolescent rats (285). There is also evidence for decreased depression-like phenotypes following chronic nicotine exposure (312, 316). Furthermore, chronic administration of nicotine results in an enhanced response to classical antidepressants (314, 366) and reverses anhedonia induced by chronic stress (367). Acute and chronic exposure to nicotine also had antidepressant effects in environmentally induced rat models of depression (357, 368, 369). Interestingly, chronic oral nicotine intake or repeated nicotine injections diminished depressive symptoms more than transcranial magnetic stimulation (369). However, one study found no depression-like phenotypes in response to chronic nicotine in the tail suspension task in male and female rats, regardless of the dose of nicotine tested (300). By contrast, nicotine withdrawal is clearly associated with enhanced depression-like behaviors, including elevated reward thresholds (370) in rats. At early stages of withdrawal, mice exhibited a depression-like profile similar to that observed following a chronic stress regimen (367). Acute administration of the antidepressant fluoxetine reversed nicotine withdrawal-induced intra-cranial self-stimulation threshold elevations when coadministered with a 5-HT1A receptor antagonist (371).

Overall, there is evidence supporting the self-medication hypothesis for anxiety and depressive-like symptoms, including those resulting from the cessation of nicotine exposure. Subsequent nicotine-seeking relapse may be driven by negative reinforcement mechanisms that anticipate such affective distress (260). However, nicotine-elicited improvements of anxiety and mood appear to depend strongly on several conditions. Nicotine can also worsen affective states in some conditions, an important fact that may paradoxically contribute to smoking maintenance and should be taken into account to provide appropriate smoking cessation help.

### Cognitive Impairments

Accumulating evidence suggests that cognitive enhancement may contribute to nicotine addiction through different modalities. Research using experimental animals has provided a better understanding of the effects of nicotine on cognitive processes.

Nicotine administration has been shown to improve learning and memory (157, 319, 321, 329, 331, 372, 373). Single injections of nicotine notably improved working memory in rodents (157, 320). Acute nicotine administration also enhanced acquisition, consolidation, and retrieval of information in an object recognition task in rats (319). Yet, it was reported that acute nicotine did not improve acquisition in the water maze in group-housed mice and even impaired performance in this task in individually housed mice (322). Importantly, many preclinical studies show that the efficacy of nicotine on memory does not diminish with chronic administration. For instance, chronic nicotine exposure improves memory performance in rats (323–326) and memory consolidation in mice (332). Nevertheless, some studies found no effects of chronic administration on memory function. Notably, chronic nicotine in NMRI male mice did not significantly change performance in the water maze (333). Age may be a significant factor influencing the action of nicotine on memory. One study reported that nicotine improved the acquisition of a serial pattern learning task in young but not old Fischer 344 rats, while no effects were found on reference memory in either group (330). Chronic nicotine administration also failed to improve working memory in old rats (328). Yet, other studies obtained contrasting data, with improvements of memory in response to nicotine in senescence-accelerated mice (374) and aged rats (327). Nicotine also alleviated memory deficits induced by chemical or pharmacological agents (375, 376) and brain lesions (377, 378). By contrast, nicotine withdrawal resulted in learning and memory impairments, including in contextual fear conditioning (306, 379, 380).

Although these data suggest primary mnemonic effects of nicotine, there has been much debate as to whether the beneficial effects of nicotine in tasks of learning and memory may be secondary to effects on attentional functions. A first study reported that small doses of nicotine reversed deficits in 5-CSRTT accuracy in basal forebrain lesioned rats, but not in non-lesioned animals (381). Nevertheless, other studies showed improvements in 5-CSRTT response accuracy following acute (266, 334, 338, 339) and chronic (271, 274, 340, 341) exposure to nicotine, although these effects may be strain dependent (337). Nicotine also induced improvements in choice accuracy in a two-choice stimulus detection task (335, 336). As observed for learning and memory, nicotine reversed attentional impairments caused by brain or pharmacologically induced lesions (325, 381, 382). Nicotine withdrawal was shown to impair choice accuracy, to increase omission errors in the 5-CSRTT (271, 383), and to impair PPI of acoustic startle in mice (384), although contrasting results were found with another strain of mice (385).

Apart from learning, memory, and attention functions, very few studies have focused on the consequences of nicotine exposure on executive functions in animals. Some studies have evaluated the effects of nicotine on measures of cognitive flexibility. Deficits in cognitive flexibility may contribute to drug addiction through an inability to change a response to stimuli previously associated with a drug stimulus or reward (386). Acute nicotine injections impaired decision-making, and this effect was associated with deficits in behavioral flexibility measured as perseverative responding in rats (342). The same authors reported that chronic neonatal nicotine did not impair decision-making in rats (343). Yet, chronic exposure to a high, but not low, dose of nicotine impaired response reversal learning in mice (344, 345). In contrast, other authors (346, 347) reported that acute and repeated nicotine administration improved attentional set-shifting in rats.

### CONCLUSION

The studies discussed in this review strongly support the idea that inter-individual differences in cognitive and affective processing, both preceding and resulting from repeated exposure to nicotine, contribute to nicotine addiction. There is growing evidence that nicotine addiction arises from the combined interactions of various processes underlying cognition and emotion with nicotine exposure according to several modalities.

First, human studies, but mostly preclinical investigations, clearly indicate that nicotine can have direct facilitating effects on cognitive processing and alleviate negative affective states, supporting the hypothesis of tobacco smoking as a form of self-medication. This seems to be particularly the case for memory and attention deficits, as well as anxiety and depression-like phenotypes. Reversal of such cognitive and affective deficits by nicotine is even clearer for withdrawal-associated phenotypes. Tobacco smoking may thus also be maintained as a form of self-medication in individuals who show moderate cognitive or affective impairments and who are not diagnosed with a particular psychiatric condition. However, despite demonstrable nicotine-induced improvements of affective states and cognitive deficits, this is only indirect evidence supporting the self-medication hypothesis, which should not be considered as the only plausible explanation for high rates of smoking behavior in psychiatric populations. It should also be emphasized that chronic exposure to nicotine can worsen anxiety and mood in some conditions, a point that may help reduce hesitation in smoking-cessation attempts. Second, pre-existing phenotypes, such as high impulsivity and sensation seeking, appear to influence the appetence for nicotine according to most studies and may drive the propensity for initiating and pursuing smoking behavior. However, additional preclinical longitudinal studies need to be performed to resolve this issue, particularly to investigate the relationship between predisposing phenotypes and behavioral models that still need to be developed to truly capture addiction-like features such as habitual and compulsive nicotine taking and seeking. Last but not least, numerous studies reviewed here show that nicotine can trigger "pro-addiction" phenotypes such as impulsivity and deficits in cognitive flexibility.
Nicotine-induced enhancements of learning, memory, and attention may also promote the shift toward nicotine addiction by facilitating the associations between smoking and contextual cues that underlie habitual drug use, craving, and relapse.

The great heterogeneity regarding the effects of nicotine observed across the different studies that we reviewed further suggests that the underlying reasons for smoking may vary across individuals, according to their pre-existing differences in genetics, life experiences, tobacco history, or personality traits.

### AUTHOR CONTRIBUTIONS

All authors listed have made substantial, direct, and intellectual contributions to the work and approved it for publication.

### ACKNOWLEDGMENTS

The authors would like to thank Dr. Uwe Maskos for editing and improving the use of English in the manuscript.

### REFERENCES


smoking from adolescence to young adulthood. *J Pediatr Psychol* (2007) 32:1203–13. doi:10.1093/jpepsy/jsm051


terms of visuospatial attention and inhibition before and after single-blind nicotine administration. *Neuroscience* (2014) 277:375–82. doi:10.1016/j.neuroscience.2014.07.016


smoking during pregnancy – a reexamination using a sibling design. *J Child Psychol Psychiatry* (2015) 57(4):532–7. doi:10.1111/jcpp.12478


panic psychopathology. *J Anxiety Disord* (2008) 22:1214–26. doi:10.1016/j.janxdis.2008.01.003


adolescent C57BL/6 mice. *Behav Brain Res* (2006) 167:175–82. doi:10.1016/j.bbr.2005.09.003


in different cost-benefit decision making tasks in rats. *Psychopharmacology (Berl)* (2012) 224:489–99. doi:10.1007/s00213-012-2777-y


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Besson and Forget. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A Closer Look at the Effects of Repeated Cocaine Exposure on Adaptive Decision-Making under Conditions That Promote Goal-Directed Control

#### *Briac Halbout*<sup>1,2</sup>*\*, Angela T. Liu*<sup>1,2</sup> *and Sean B. Ostlund*<sup>1,2</sup>

<sup>1</sup>*Department of Anesthesiology and Perioperative Care, School of Medicine, University of California Irvine, Irvine, CA, USA,* <sup>2</sup>*UC Irvine Center for Addiction Neuroscience, Irvine, CA, USA*

It has been proposed that compulsive drug seeking reflects an underlying dysregulation in adaptive behavior that favors habitual (automatic and inflexible) over goal-directed (deliberative and highly flexible) action selection. Rodent studies have established that repeated exposure to cocaine or amphetamine facilitates the development of habits, producing behavior that becomes unusually insensitive to a reduction in the value of its outcome. The current study more directly investigated the effects of cocaine pre-exposure on goal-directed learning and action selection using an approach that discourages habitual performance. After undergoing a 15-day series of cocaine (15 or 30 mg/kg, i.p.) or saline injections and a drug withdrawal period, rats were trained to perform two different lever-press actions for distinct reward options. During a subsequent outcome devaluation test, both cocaine- and saline-treated rats showed a robust bias in their choice between the two actions, preferring whichever action had been trained with the reward that retained its value. Thus, it appears that the tendency for repeated cocaine exposure to promote habit formation does not extend to a more complex behavioral scenario that encourages goal-directed control. To further explore this issue, we assessed how prior cocaine treatment would affect the rats' ability to learn about a selective reduction in the predictive relationship between one of the two actions and its outcome, which is another fundamental feature of goal-directed behavior. Interestingly, we found that cocaine-treated rats showed enhanced, rather than diminished, sensitivity to this action–outcome contingency degradation manipulation. Given their mutual dependence on striatal dopamine signaling, we suggest that cocaine's effects on habit formation and contingency learning may stem from a common adaptation in this neurochemical system.

Keywords: habit learning, contingency degradation, outcome devaluation, rat, goal-directed, sensitization, choice, cognitive control

#### *Edited by:*

*Mark Walton, University of Oxford, UK*

#### *Reviewed by:*

*Martin Zack, Centre for Addiction and Mental Health, Canada; Michael Saddoris, University of Colorado Boulder, USA*

> *\*Correspondence: Briac Halbout, halboutb@uci.edu*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 25 November 2015; Accepted: 07 March 2016; Published: 21 March 2016*

#### *Citation:*

*Halbout B, Liu AT and Ostlund SB (2016) A Closer Look at the Effects of Repeated Cocaine Exposure on Adaptive Decision-Making under Conditions That Promote Goal-Directed Control. Front. Psychiatry 7:44. doi: 10.3389/fpsyt.2016.00044*

## INTRODUCTION

For many, recreational drug use can develop into a pathological behavior that is difficult to control or abstain from despite its many harmful consequences. Similarly, when rodents are given extensive opportunity to self-administer cocaine, they can develop a compulsive tendency to seek out the drug even when doing so leads to physical punishment (1, 2). Understanding how this pathological decision-making develops is a major objective of addiction research and theory.

Some have proposed that compulsive tendencies are caused by drug-induced dysregulation of neural systems that normally mediate adaptive reward-related learning and decision-making (3–8). Although this hypothesis draws heavily on literature regarding animal learning, current evidence shows that humans and rodents use analogous action selection strategies when pursuing rewards (9–13). For instance, when first encountering a task or problem, both species tend to apply a sophisticated goal-directed strategy that allows for rapid learning and flexible decision-making. The term *goal-directed*, here, refers to a reward-seeking action that is performed because an individual infers that doing so will lead to a desired outcome, as opposed to automatically performing an action that has become habitual or routine. One way to determine if an action is goal-directed is to change the value of its outcome between initial training and testing. For instance, rats trained to perform a lever-press action for food pellets will withhold this behavior if they are fed to satiety on those food pellets (instead of some other type of food) immediately before the test session (9, 14, 15). Importantly, outcome devaluation tests are conducted in extinction to ensure that changes in performance are based on previously encoded action–outcome learning.

Another test of goal-directed performance involves changing the causal relationship between an action and its outcome. The contingency degradation procedure accomplishes this by delivering the outcome with the same probability regardless of whether an action is performed or not. In such studies, rats trained to lever press for food pellets will exhibit a decline in this behavior if it is no longer needed to produce pellets (9, 16, 17).
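The logic of contingency degradation can be made concrete with the ΔP index, a common summary of action–outcome contingency. The index is not used in the text above and is introduced here only for illustration; the probabilities below are hypothetical values, not data from any of the cited studies. A minimal sketch:

```python
def delta_p(p_outcome_given_action: float, p_outcome_given_no_action: float) -> float:
    """Return the contingency index: P(outcome | action) - P(outcome | no action)."""
    for p in (p_outcome_given_action, p_outcome_given_no_action):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_outcome_given_action - p_outcome_given_no_action


# Hypothetical per-opportunity outcome probabilities, chosen for illustration only.
intact = delta_p(0.05, 0.0)     # the outcome is delivered only when the action is performed
degraded = delta_p(0.05, 0.05)  # the outcome is equally likely with or without the action
```

Under these assumptions, the intact contingency gives a positive index, whereas the degraded contingency gives zero: the action no longer changes the likelihood of the outcome, which is the condition under which goal-directed responding is expected to decline.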

Because goal-directed control involves executive processes that tax cognitive resources (18), both rodents and humans tend to shift to a more efficient, but less flexible, habit-based strategy when appropriate. For instance, rats given extensive training on a simple task tend to be insensitive to manipulations of outcome value or action–outcome contingency (10, 14). Relying on a habitual action selection strategy allows an individual to automatically perform routine reward-seeking tasks while freeing up cognitive resources for other activities.

Based on this conceptual framework, it has been suggested that neuroadaptations caused by chronic drug intake bias action selection in favor of habitual control of drug and adaptive reward seeking (3–5, 7, 19). In line with this general account, there have been many reports that drug and alcohol seeking become insensitive to post-training outcome devaluation (or related treatments), particularly after extensive training (20–24). Those studies, aimed at modeling a loss of control over volitional, drug-directed actions, have shown that initial drug taking can become habitual with prolonged drug use. Interestingly, there is further evidence that the impact of chronic drug experience (volitional or not) on behavioral control is so profound that it even alters the way animals pursue other non-drug rewards. For example, rats given repeated exposure to cocaine or amphetamine before learning to lever press for food reward develop habitual (devaluation insensitive) performance under limited-training conditions that support goal-directed performance in drug-naive rats (25–29).

It is important to note, however, that under normal conditions the transition from goal-directed to habitual performance is neither final nor mandatory. For instance, normal individuals tend to rapidly re-exert goal-directed control over habitual actions if they encounter response-contingent punishment or other salient stimuli (4, 18). Of even greater relevance to the current study, it is known that certain training factors discourage the transition to habitual control. For instance, rats trained with multiple action–outcome relationships typically maintain goal-directed performance even after extensive training (30–34), presumably because executive processes continue to be engaged in settings that encourage consideration of distinct action–outcome relationships (13, 35, 36).

With this in mind, it is interesting that most studies investigating whether drug pre-exposure disrupts the balance between goal-directed and habitual control have used simple reward-seeking tasks that would normally support habitual performance in drug-naive animals if sufficient training were provided. Although such findings indicate that chronic drug exposure can facilitate the development of habits, they do not address whether it also compromises goal-directed control in more complex decision-making scenarios that require choice between different response options. This is significant because, for human addicts, the decision to use drugs would seem to occur in situations where countless other more adaptive activities are available. Interestingly, of the few animal studies that have addressed this issue, there is evidence that certain aspects of goal-directed behavior may be unimpaired (28, 37, 38), or perhaps even enhanced (39, 40), following repeated drug exposure.

The current study tests this hypothesis by giving rats repeated experimenter-administered injections of saline or cocaine prior to training them on a challenging instrumental learning protocol involving two distinct action–outcome contingencies. Their ability to exert goal-directed control over task performance was then assessed using outcome devaluation and action–outcome contingency degradation tests. We found that cocaine pre-exposure had no impact on rats' ability to learn about multiple action–outcome relationships or use these associations when adapting to a change in reward value. Interestingly, rather than being impaired, cocaine-exposed rats displayed enhanced sensitivity to instrumental contingency degradation training. Thus, in a behavioral scenario that discourages habitual control, repeated cocaine exposure actually enhances certain features of flexible goal-directed behavior, which has important implications for our understanding of the neural and behavioral substrates of drug addiction.

### MATERIALS AND METHODS

### Subjects and Apparatus

Adult male Long–Evans rats (*n* = 30) weighing ~375 g at the start of the experiment were used as subjects. Rats were pair-housed and had *ad libitum* access to water throughout the experiment. Rats had unrestricted access to food during the cocaine sensitization and withdrawal phases of the experiment but were maintained at ~85% of their free-feeding body weight during the following behavioral phases.

Behavioral procedures took place in Med Associates (St Albans, VT, USA) operant chambers located in sound- and light-attenuated cubicles. The chambers were equipped with four photobeams for monitoring locomotor activity across a horizontal plane ~2 cm above a stainless steel grid floor. Each chamber was also equipped with two retractable levers positioned to the right and left of a food magazine, which was mounted on the right end wall. Two pellet dispensers were connected to the magazine and were used to deliver either plain (i.e., grain) or chocolate-flavored purified dustless precision pellets (45 mg, BioServ, Frenchtown, NJ, USA). The hind wall and the hinged front door were made out of transparent Plexiglas. A single houselight (3 W, 24 V) located on the left end wall illuminated the chamber.

During the cocaine sensitization phase of the experiment, we added visual, tactile, and olfactory cues to the bare chamber described above in order to create a distinctive context. Panels with vertical black-and-white stripes were positioned outside the transparent hind wall and front door; a white perforated Plexiglas sheet covered the grid floor; and 0.2 ml of pure almond extract (McCormick and Co. Inc., Baltimore, MD, USA) was poured directly into the stainless steel waste pan located under the grid floor.

All experimental procedures involving rats were approved by the UC Irvine Institutional Animal Care and Use Committee and were in accord with the National Research Council Guide for the Care and Use of Laboratory Animals.

### Drugs

Cocaine hydrochloride (Sigma-Aldrich, St. Louis, MO, USA) was dissolved in sterile saline (0.9% NaCl). Cocaine and saline (i.e., vehicle) solutions were injected i.p. at a volume of 1 ml/kg.

### Cocaine Exposure Protocol

To establish basal locomotor responding, all rats were first given a single injection of saline and were immediately placed in the operant chambers, where photobeam breaks were recorded for 60 min. Rats were then divided into three groups: two cocaine groups receiving cocaine injections at either 15 or 30 mg/kg, and one saline group (all *n*'s = 10) receiving saline injections. Rats were injected once daily for 15 consecutive days. Immediately after each injection, the rats were placed in the behavioral chambers (with modified context as described above) for 60 min during which locomotor activity was recorded. All rats remained undisturbed in their home cages for a further 29 days before being put on food restriction for subsequent behavioral testing.

### Instrumental Training

Starting on withdrawal day 32, rats received magazine training for 2 days. In each session, they received 20 grain and 20 chocolate food pellets randomly delivered on a random time (RT) 30 s schedule while the levers were retracted. Rats were then given 8 days of instrumental training on two distinct action–outcome contingencies (i.e., R1 → O1 and R2 → O2). Training with the right and left levers was carried out in two separate sessions each day. The specific lever-outcome arrangements were counterbalanced with drug treatment conditions, such that for half of the rats in each treatment group right lever pressing was paired with the delivery of the chocolate pellet while left lever pressing earned the grain pellet, whereas the other half received the opposite arrangement. During each session, only one lever was extended. The session was terminated after 30 min elapsed or 20 pellets were earned. The two daily sessions were separated by at least 2 h, and their order was alternated every day. For the first 2 days of the instrumental training phase, lever pressing was continuously reinforced (CRF). Instrumental training under a random ratio (RR) as opposed to a random interval schedule of reinforcement is known to discourage the emergence of habitual control over reward seeking (41). Because our study looked specifically at the effect of cocaine on goal-directed control, the schedule of reinforcement was gradually shifted to an RR-5 schedule for the next 2 days (i.e., lever presses resulted in a pellet delivery with *p* = 0.2), followed by an RR-10 schedule (*p* = 0.1) for an additional 2 days, and finally to an RR-20 schedule (*p* = 0.05) for the last 2 days of the instrumental training.
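The staged schedule described above can be illustrated with a minimal simulation. This is only a sketch, not the control program used in the study: it assumes that CRF corresponds to a reinforcement probability of 1.0 and that each lever press is an independent draw, and the names `SCHEDULES`, `TRAINING_PLAN`, and `presses_to_earn`, as well as the random seed, are hypothetical.

```python
import random

# Per-press reinforcement probability for each schedule named in the text.
# CRF is treated as p = 1.0 (assumption); RR-n corresponds to p = 1/n.
SCHEDULES = {"CRF": 1.0, "RR-5": 0.2, "RR-10": 0.1, "RR-20": 0.05}

# Two training days per schedule, in the order described in the text.
TRAINING_PLAN = [name for name in ("CRF", "RR-5", "RR-10", "RR-20") for _ in range(2)]


def presses_to_earn(n_pellets, p_reward, rng):
    """Count the lever presses needed to earn n_pellets when each press pays off with p_reward."""
    presses = 0
    earned = 0
    while earned < n_pellets:
        presses += 1
        if rng.random() < p_reward:
            earned += 1
    return presses


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical seed, for reproducibility only
    for day, name in enumerate(TRAINING_PLAN, start=1):
        n_presses = presses_to_earn(20, SCHEDULES[name], rng)
        print(f"day {day}: {name:>5} -> {n_presses} presses for 20 pellets")
```

On average, earning the 20-pellet session limit requires about 20/*p* presses, which is why response requirements rise steeply as the ratio is increased.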

### Devaluation Testing

In order to selectively diminish the value of one food outcome, relative to the other, all rats were allowed to become satiated on grain or chocolate pellets by providing them with 60 min of unrestricted access to that food (25 g/rat placed in a bowl, counterbalanced with the drug treatment conditions) in the home cage. Immediately following home cage pre-feeding (induction of specific satiety), rats underwent a devaluation test to assess their tendency to perform the two lever-press responses. Rats had continuous access to both levers throughout the test. Each test began with a 5-min extinction phase, during which lever pressing was recorded, but was not reinforced, which was done to assess response tendencies in the absence of explicit feedback. This was immediately followed by a 15-min reward phase, during which each response resulted in the delivery of its respective outcome according to CRF (for the first 5 pellets) and RR-20 (for the remainder of the test) schedules of reinforcement. On the following experimental day, rats underwent instrumental retraining sessions identical to the instrumental sessions described above, with the exception that the schedule of reinforcement shifted from CRF to RR-20 within the session (three pellets at CRF, two pellets at RR-5, one pellet at RR-10, and the remainder at RR-20). Retraining sessions lasted 30 min or were terminated after the delivery of 20 pellets. On the following day, all rats were given a second outcome devaluation test with the opposite outcome devalued. The order according to which each outcome was tested in a devalued state was counterbalanced between animals and treatment groups. Data presented are the average responses on devalued and non-devalued outcomes from the two testing days.
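The within-session schedule shift used during retraining can be expressed as a simple lookup from the number of pellets already earned to the current reinforcement probability. The sketch below is illustrative only; it assumes CRF means *p* = 1.0, and `RETRAINING_STAGES`, `FINAL_PROBABILITY`, and `retraining_probability` are hypothetical names.

```python
# (pellets delivered under this stage, per-press reinforcement probability)
RETRAINING_STAGES = [(3, 1.0), (2, 0.2), (1, 0.1)]  # CRF, RR-5, RR-10
FINAL_PROBABILITY = 0.05  # RR-20 for the remainder of the session


def retraining_probability(pellets_earned):
    """Return the per-press reinforcement probability given the pellets earned so far."""
    threshold = 0
    for n_pellets, probability in RETRAINING_STAGES:
        threshold += n_pellets
        if pellets_earned < threshold:
            return probability
    return FINAL_PROBABILITY


if __name__ == "__main__":
    for earned in range(8):
        print(earned, retraining_probability(earned))
```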

### Action–Outcome Contingency Degradation Training

Following a day of instrumental retraining (same as during devaluation testing), rats underwent a contingency degradation protocol during which each lever-press action continued to produce its original pellet outcome on a modified RR-20 schedule commonly used in such studies (9, 16, 39, 42). Specifically, sessions were divided into a series of 1-s periods and the first press that was performed in each period had a 1-in-20 chance of producing reward [p(O/A) = 0.05]. As before, the two actions were trained in separate daily sessions, though these sessions were now limited to 20 min and did not have a limit on the number of rewards that could be earned. Most importantly, however, during this phase of the experiment, one of the two pellets was additionally delivered in a non-contingent manner. Specifically, during each 1-s period without a lever-press response, either grain or chocolate pellets were delivered with the same probability that they would have been delivered following performance of the appropriate response [p(O/no A) = 0.05], thus degrading this action–outcome contingency. This outcome was delivered non-contingently during both daily contingency degradation training sessions, regardless of which lever was being trained. For degraded sessions, the non-contingent outcome was the same as that which was earned by a response on the available lever, whereas for non-degraded sessions, the non-contingent outcome was different from the earned outcome. Consequently, the non-contingent outcome could be expected with the same probability whenever the rat was placed in the behavioral chamber, regardless of whether it pressed the lever or not. In contrast, the alternative outcome could only be obtained by performing the non-degraded action. Grain pellets were non-contingently delivered for half of the rats (counterbalanced with action–outcome contingency and drug treatment conditions), whereas the remaining rats received non-contingent chocolate pellets.
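The logic of this protocol can be made concrete with a short sketch. It is a simplified illustration under stated assumptions, not the program used to run the sessions: the contingency index (the difference p(O/A) − p(O/no A), often written ΔP) is a common convention rather than a measure reported here, `press_prob` is a hypothetical per-second probability that the rat presses at least once, and all function names are invented for the example.

```python
import random

P_OUTCOME_GIVEN_ACTION = 0.05     # p(O/A), from the text
P_OUTCOME_GIVEN_NO_ACTION = 0.05  # p(O/no A), from the text
SESSION_SECONDS = 20 * 60         # 20-min session divided into 1-s periods


def delta_p(p_o_given_a, p_o_given_no_a):
    """Contingency index: how much pressing raises the probability of the outcome."""
    return p_o_given_a - p_o_given_no_a


def simulate_session(press_prob, noncontingent_same_outcome, rng):
    """Simulate one session and return (earned, free_same, free_other) pellet counts.

    press_prob is a hypothetical probability of at least one press per 1-s period.
    noncontingent_same_outcome is True for degraded sessions, False for non-degraded ones.
    """
    earned = free_same = free_other = 0
    for _ in range(SESSION_SECONDS):
        if rng.random() < press_prob:
            # First press in this period: response-contingent delivery.
            if rng.random() < P_OUTCOME_GIVEN_ACTION:
                earned += 1
        elif rng.random() < P_OUTCOME_GIVEN_NO_ACTION:
            # Period without a press: non-contingent delivery.
            if noncontingent_same_outcome:
                free_same += 1
            else:
                free_other += 1
    return earned, free_same, free_other


if __name__ == "__main__":
    print("Degraded action:", delta_p(0.05, 0.05))      # free pellet matches the earned pellet
    print("Non-degraded action:", delta_p(0.05, 0.0))   # earned pellet is never delivered for free
    print(simulate_session(0.3, True, random.Random(0)))
```

Under these assumptions, pressing the degraded lever does not change the probability of its pellet in any given second, whereas the non-degraded lever remains the only source of its pellet.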

#### Testing

After 5 days of contingency degradation training, all rats underwent a 5-min choice extinction test, during which both levers were made available (Test 1). Lever presses were continuously recorded but did not produce any outcomes, nor were any outcomes delivered non-contingently. Rats then received an additional 5 days of contingency degradation training, followed by a second 5-min extinction test (Test 2).

### Data Analysis

Data were analyzed using mixed-design analysis of variance (ANOVA). Drug treatment was a between-subjects variable. Within-subjects variables included treatment day for the cocaine sensitization, training day for the instrumental training, outcome value for the devaluation tests, contingency and training day for the contingency degradation training, and contingency for the contingency degradation extinction test. When Mauchly's test indicated that the assumption of sphericity had been violated, we used the Greenhouse–Geisser correction. To examine the source of interactions, Dunnett's *post hoc* tests were used to assess group differences in the simple effects of Devaluation or Degradation (i.e., the difference in response rates across the two actions) and individual one-way ANOVAs were conducted to assess within-subjects effects. We also assessed group differences in choice of Devalued (or Degraded) actions during these tests, calculated as a percentage of total lever presses [Action 1/(Action 1 + Action 2) × 100]. Because these data had a binomial distribution, they underwent arcsine transformation before we analyzed them using a one-way ANOVA followed by Dunnett's *post hoc* testing, when appropriate. We also conducted one-sample *t*-tests against the test value of 50% (i.e., no preference on either lever) for each group.
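The choice measure and its transformation can be sketched as follows. This is a minimal illustration rather than the analysis code used for the study: the text does not specify which arcsine variant was applied, so the common arcsine square-root form is assumed, and the function names and the press counts in the usage example are hypothetical.

```python
import math
import statistics


def choice_percentage(action1_presses, action2_presses):
    """Percentage of total presses made on Action 1, as in the formula in the text."""
    total = action1_presses + action2_presses
    if total == 0:
        raise ValueError("no lever presses recorded")
    return action1_presses / total * 100


def arcsine_transform(percentage):
    """Arcsine square-root transform of a percentage (assumed variant), in radians."""
    return math.asin(math.sqrt(percentage / 100))


def one_sample_t(values, test_value=50.0):
    """t statistic for the mean of values against test_value (df = n - 1)."""
    n = len(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - test_value) / sem


if __name__ == "__main__":
    # Hypothetical (Degraded, Non-degraded) press counts for a handful of rats.
    counts = [(12, 30), (8, 25), (15, 22), (10, 28)]
    percentages = [choice_percentage(a, b) for a, b in counts]
    print("choice %:", [round(p, 1) for p in percentages])
    print("transformed:", [round(arcsine_transform(p), 3) for p in percentages])
    print("t vs 50%:", round(one_sample_t(percentages), 2))
```

A negative *t* value in this comparison indicates that the Devalued (or Degraded) action was chosen less often than expected by chance.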

### RESULTS

### Locomotor Sensitization

To assess baseline locomotor activity, all rats were given a single injection of saline before being placed in the behavioral chamber. No effect of Treatment group was detected [*F*(2,27) = 0.40; *p* = 0.68], indicating that basal activity did not differ between groups. However, as shown in **Figure 1**, subsequent cocaine treatment did significantly increase locomotor activity over days, relative to saline treatment. A mixed ANOVA (Day × Treatment) detected a significant main effect of Day [*F*(6.63,178.98) = 2.99; *p* < 0.01], a main effect of Treatment [*F*(2,27) = 42.79; *p* < 0.001], and a Day × Treatment interaction [*F*(13.26,178.98) = 3.91; *p* < 0.001]. To further explore this interaction, we performed repeated-measures ANOVAs on the locomotor activity for each treatment group. Whereas this confirmed a significant increase in activity over days in cocaine-treated rats [*F*(4.27,38.41) = 4.17 and *F*(4.98,44.85) = 2.73; *p*'s < 0.05 for cocaine 15 and 30 mg/kg, respectively], the analysis showed that saline-treated rats displayed a gradual decrease in activity [*F*(3.84,34.5) = 12.08; *p* < 0.001], indicating habituation to the context.

### Instrumental Training

Average rates of responding on the two levers for the 8 days of instrumental training are presented in **Figure 2A**. Rats in all treatment groups rapidly acquired lever pressing and increased their response rates as the ratio schedule requirements were augmented. Statistical analysis revealed that the cocaine treatment had no effect on the acquisition of lever pressing during the training phase. A mixed ANOVA (Day × Treatment) revealed a significant main effect of Day [*F*(2.94,79.45) = 94.18; *p* < 0.001], but found no effect of Treatment [*F*(2,27) = 0.81; *p* = 0.45], or Day × Treatment interaction [*F*(5.88,79.45) = 0.87; *p* = 0.52].
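The fractional degrees of freedom reported here follow from the Greenhouse–Geisser correction, which multiplies both the effect and the error degrees of freedom by an estimated sphericity parameter ε. The sketch below is illustrative only: it takes the design values from the text (8 training days, 30 rats, 3 treatment groups) and back-calculates ε from the reported effect df; the value of ε itself is not reported in the text, and the function names are hypothetical.

```python
def uncorrected_dfs(n_levels, n_subjects, n_groups):
    """Uncorrected df for a within-subjects effect and its error term in a mixed design."""
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_subjects - n_groups)
    return df_effect, df_error


def implied_epsilon(reported_df_effect, n_levels):
    """Sphericity estimate implied by a corrected effect df (a back-calculation, not a reported value)."""
    return reported_df_effect / (n_levels - 1)


if __name__ == "__main__":
    df_effect, df_error = uncorrected_dfs(n_levels=8, n_subjects=30, n_groups=3)
    epsilon = implied_epsilon(2.94, n_levels=8)
    print("uncorrected df:", df_effect, df_error)
    print("implied epsilon:", round(epsilon, 2))
    print("corrected error df:", round(epsilon * df_error, 2))
```

With these inputs, the uncorrected degrees of freedom are 7 and 189, the implied ε is about 0.42, and the corrected error df is about 79.4, which matches the reported value of 79.45 up to rounding of the effect df.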

### Outcome Devaluation

Outcome devaluation testing was then conducted to assess the degree to which the rats could flexibly modify their choice between the two lever-press actions following a selective reduction in the incentive value of one of the two reward outcomes, accomplished using a sensory-specific satiety procedure. Data presented in **Figures 2B, C** represent the average lever press rate during the two devaluation tests (see Materials and Methods for details).

#### Extinction Phase

During the first 5 min of each devaluation test, the two levers were present but did not result in outcome delivery. All three groups showed a reduction in their performance of the action whose outcome was currently devalued (Devalued action), relative to the action whose outcome was non-devalued (Non-devalued action), demonstrating that, regardless of drug treatment, all groups exhibited the capacity to use action–outcome learning to adapt their food-seeking behavior in a goal-directed manner. Supporting this interpretation, a mixed ANOVA (Devaluation × Treatment) revealed a significant main effect of Devaluation [*F*(1,27) = 26.33; *p* < 0.001], but found no main effect of Treatment [*F*(2,27) = 0.15; *p* = 0.86], or Devaluation × Treatment interaction [*F*(2,27) = 1.79; *p* = 0.19]. We went on to look at the effect of devaluation at the group level. Paired *t*-tests revealed a significant effect of the devaluation procedure on lever pressing for each treatment group (*t*'s < −2.5). Furthermore, when looking at the percentage of total presses directed toward the Devalued action (**Figure 2B**), a one-way ANOVA showed no significant differences between groups [*F*(2,27) = 1.41; *p* > 0.05]. For all groups, the Devalued action was chosen at a significantly lower rate than would be expected by chance (i.e., 50%; all *t*'s < −2.84), indicating a preference for the Non-devalued action.

#### Reinforced Phase

During the last 15 min of each devaluation test, both levers were reinforced with their respective outcomes according to an RR-20 schedule (**Figure 2C**). Here too, all groups exhibited a selective reduction in lever pressing for the devalued outcome, relative to the alternate action. A mixed ANOVA detected a significant main effect of Devaluation [*F*(1,27) = 22.38; *p* < 0.001], but found no main effect of Treatment [*F*(2,27) = 0.73; *p* = 0.49], or Devaluation × Treatment interaction [*F*(2,27) = 1.2; *p* = 0.32]. As during the extinction test, choice of the Devalued action (% of total presses) did not significantly differ among groups [*F*(2,27) = 0.44; *p* > 0.05], and all groups displayed a significant preference for the Non-devalued action (all *t*'s < −2.49).

### Contingency Degradation

#### Training

Next, we investigated the effects of cocaine treatment on rats' capacity to adjust their instrumental food-seeking behavior to accommodate a selective reduction in action–outcome contingency. **Figure 3A** shows the rats' average response rates during contingency degradation training sessions, plotted separately for each treatment group, for the action whose outcome was non-contingently presented (Degraded action) and for the alternate action (Non-degraded action), whose outcome was only delivered in a response-contingent manner. Data are expressed as percentage of performance from the instrumental training baseline (i.e., last day of instrumental retraining), whose values are presented in **Table 1**. A mixed ANOVA conducted on these baseline data found no effect of Treatment [*F*(2,27) = 0.42; *p* = 0.66], or Degradation [to-be Degraded vs. to-be Non-degraded; *F*(1,27) = 0.0; *p* = 0.99], and found no evidence of a pre-existing Treatment × Degradation interaction [*F*(2,27) = 1.37; *p* = 0.27].

**Figure 3A** shows the results of contingency degradation training. As is frequently the case in such experiments (39, 42, 43), we did not observe any response-specific effect of the non-contingent reward delivery during contingency degradation training sessions, though we did observe a general decline in response rates over days, an effect that was similar for all groups. A mixed ANOVA (Day × Degradation × Treatment) detected a significant effect of Day [*F*(2.7,73.04) = 3.47; *p* < 0.05], but found no effect of Degradation or Treatment [*F*(1,27) = 1.71; *p* = 0.2, and *F*(2,27) = 0.2; *p* = 0.82, respectively]. Nor were there any significant interactions (greatest *F* value = 1.97; *p* > 0.15).

#### Testing

Non-contingent rewards are known to have acute action-biasing effects on instrumental performance (15, 44–47) that can oppose and potentially obscure the expression of contingency degradation learning (48, 49). Therefore, our primary test of sensitivity to contingency degradation involved assessing rats' choice between the two lever-press actions in a choice extinction test. An initial test administered between contingency training sessions 5 and 6 (Test 1; see **Figure 3B**) found no Degradation effect [*F*(1,27) = 0.82; *p* = 0.37], Treatment effect [*F*(2,27) = 0.05; *p* = 0.95], or Degradation × Treatment interaction [*F*(2,27) = 0.65; *p* = 0.53]. Choice of the Degraded lever (percentage of total lever presses; see **Figure 3B**) did not differ between groups [one-way ANOVA, *F*(2,27) = 0.22; *p* > 0.05], and no groups exhibited a preference that significantly differed from chance (i.e., 50%, all *t*'s > −0.2). However, when rats were re-tested following contingency degradation session 10 (Test 2; **Figure 3C**), we found that cocaine-treated groups had learned to selectively reduce their performance of the Degraded action. A mixed ANOVA detected a significant main effect of Degradation [*F*(1,27) = 5.59; *p* = 0.03], but found no effect of Treatment [*F*(2,27) = 1.79; *p* = 0.19]. More importantly, however, there was a significant Degradation × Treatment interaction [*F*(2,27) = 3.68; *p* = 0.04], indicating that the groups differed in their choice between the two actions. Interestingly, the Degradation effect was significant for the group given repeated exposure to the high dose of cocaine (*p* = 0.02), but was not significant, according to paired *t*-tests, for saline-treated rats (*p* = 0.44), or for rats treated with the low dose of cocaine (*p* = 0.06). Moreover, *post hoc* analysis of the response difference score showed that the group treated with cocaine 30 mg/kg significantly differed from the saline-treated group (Dunnett's test; *p* < 0.05).
However, analysis of the choice measure found evidence of contingency learning for the group given exposure to the low dose of cocaine. An ANOVA revealed a significant effect of Group [*F*(2,27) = 4.19; *p* < 0.05], and *post hoc* Dunnett's test found that the group treated with 15 mg/kg, but not 30 mg/kg, cocaine significantly differed from the saline-treated group (*p*'s = 0.02 and 0.07, respectively). Only the two cocaine-treated groups chose the Degraded action significantly below levels that would be expected by chance (50%; *t*'s < −3.15, while for the saline-treated group, *t* = 0.64).

### DISCUSSION

The current study examined the effects of repeated cocaine exposure on adaptive goal-directed behavior under conditions that discourage habitual control. Rats pre-treated with cocaine exhibited normal sensitivity to outcome devaluation, demonstrating that they had encoded the two action–outcome relationships and were unimpaired in using this information when adapting to an acute, outcome-specific reduction in the value of a behavioral goal. Interestingly, cocaine-treated rats displayed augmented, rather than impaired, sensitivity to action–outcome contingency degradation.

#### Table 1 | Instrumental training baseline.

*Summary of the mean (±SEM) rate of lever pressing (presses per minute) during the last day of instrumental training (RR-20 schedule of reinforcement) before the start of contingency degradation training.*

These findings would seem to be at odds with a vast body of data indicating that chronic exposure to cocaine or other abused drugs can bias adaptive behavioral control in favor of habits (25–29, 50). Nelson and Killcross (25), for instance, were the first to show that rats given repeated experimenter-administered amphetamine injections prior to learning to lever press for food reward developed devaluation-insensitive (habitual) performance under limited-training conditions that support devaluation-sensitive performance in drug-naive rats. Repeated cocaine pre-exposure is known to have a similar habit-promoting effect (27, 29, 50, 51). Such findings are consistent with the view that pathological behaviors observed in addiction, and in animal models of cocaine seeking, reflect an excessive reliance on automatic, inflexible response selection (3–5, 7).

An important question raised by such findings is whether this overreliance on habits is caused by an enhancement in habit-related processes or if it is simply a compensatory response to dysfunction in goal-directed processes. Some insight into this issue was provided early on by Nelson and Killcross (25), who found that instrumental performance remained goal-directed (devaluation sensitive) when rats were exposed to amphetamine after initial training but before testing. This result suggests that the drug-induced bias toward habitual performance that Nelson and Killcross observed when rats were exposed to amphetamine prior to training was caused by an enhancement of habit *formation* and not a disruption of goal-directed control. However, it is worth noting that LeBlanc et al. (27) found that rats previously exposed to cocaine displayed insensitivity to food outcome devaluation even when they were given response-contingent reinforcement at test, which is remarkable because normal (drug-naive) rats are known to rapidly re-exert goal-directed control over their behavior under such conditions (4). Consequently, this finding could reflect a deficit in goal-directed control or at least the acquisition of habits that resist transition back to goal-directed control.

It is important to emphasize that most studies on this subject, whether investigating the effect of repeated amphetamine [e.g., Ref. (25)] or cocaine treatment [e.g., Ref. (27, 29)], have employed relatively simple instrumental tasks that provide subjects with only one reward option. Although this approach is useful for studying habit formation, it is not optimal for assessing the integrity of goal-directed learning and decision-making processes. As just discussed, when this approach is used, performance that is insensitive to outcome devaluation may either reflect an overreliance on habitual control, or a failure to properly encode or use the detailed action–outcome representations needed to respond in a goal-directed manner. Another problem with this approach is that it is more susceptible to concerns about the role of incidental Pavlovian learning in expression of task performance. There is evidence that Pavlovian context-reward learning can facilitate instrumental reward seeking (52), and that the strength of its influence is sensitive to changes in physiological need state (53, 54). Such findings support the long-standing view that Pavlovian learning processes contribute to the motivational control of instrumental behavior (55). Consistent with this, it was recently shown (56) that when rats are given limited training on a simple (one reward) lever-press task, it is possible to eliminate the sensitivity of instrumental performance to outcome devaluation by extinguishing the training context prior to testing. Such findings suggest that, for instrumental tasks involving only one reward option, outcome devaluation performance may be largely mediated by stimulus–outcome rather than action–outcome learning.

These concerns can be avoided by using a more complex instrumental decision-making task, such as the one used in the current study, in which animals are allowed to choose between two distinct reward-motivated actions. Although poorly understood, it is known that decision-making scenarios such as this discourage the acquisition of habitual control (30, 33, 34). Interestingly, it has been shown that rats can develop response-specific habits when given extensive training with one of two distinct action–outcome contingencies [e.g., Ref. (57)]. However, in such studies, each action is trained and tested in a unique context. In contrast, rats given extensive training with two action–outcome contingencies in a common context fail to develop habitual performance (30, 33). This has been observed even when rats are given a choice between responses during training and test sessions (34), which suggests that contextual changes across phases of the experiment (i.e., shifting from training sessions with only one response to test sessions in which two responses are available) are not primarily responsible for disrupting habitual performance during choice tests. Although more research is needed to characterize the psychological and neurobiological mechanisms that arbitrate between habitual and goal-directed action selection strategies, such findings suggest that having a choice between distinct response options at test is an important factor that biases behavioral control in favor of the latter. Assessing goal-directed control in rats trained (and tested) on two action–outcome contingencies in a common context also has another practical benefit in terms of data interpretation. Because the test context is associated with both the devalued and non-devalued reward, it alone (i.e., as a Pavlovian cue) is unlikely to provide the kind of reward-specific information needed to support differential action selection based on expected reward value.

For these reasons, two-option choice tasks provide a more direct approach for assaying goal-directed learning and action selection. Therefore, the current findings provide strong evidence that goal-directed processes are largely spared following repeated exposure to cocaine, at least for the drug exposure regimens tested here. As animals here were passively exposed to cocaine, our study did not address whether chronic cocaine self-administration also spares goal-directed decision-making, nor does the current study speak to whether rats come to rely on a habitual or goal-directed strategy when seeking or taking cocaine. However, our findings may shed light on a recent study investigating changes in behavioral control over cocaine self-administration. It is known that rats given extensive opportunity to self-administer cocaine tend to develop a compulsive pattern of intake characterized by an insensitivity to response-contingent punishment (1, 2, 58). More recently, however, it was shown that providing rats with concurrent access to an alternative response option (sugar self-administration) attenuates the development of compulsive cocaine seeking under these exposure conditions (59). This fits nicely with the current results and further suggests that two-option scenarios such as the one used here promote goal-directed decision-making over habitual control. However, further research will be needed to more directly test this hypothesis.

Because our aim was to investigate the long-term behavioral effects of this treatment, we used a relatively lengthy (15-day) cocaine exposure regimen that included both intermediate (15 mg/kg) and high (30 mg/kg) drug doses, followed by a relatively lengthy (32-day) interval between drug exposure and the initiation of behavioral training for food. This is notable because previous findings of drug-induced facilitation of habit formation have typically used shorter drug exposure (6–10 days) and exposure-to-training intervals (7–14 days). Such procedural differences, however, are unlikely to explain our findings given that cocaine exposure regimens similar to those used here are known to be effective in causing persistent alterations in reward-motivated behavior (51, 60). For instance, Schoenbaum and Setlow (51) found that cocaine-treated rats (14 injections; 30 mg/kg) given a 21-day withdrawal period before training on a simple food-motivated Pavlovian approach task developed rigid conditioned approach behavior that was insensitive to reward devaluation. Furthermore, using a two-option task such as the one used here, LeBlanc (37) found normal sensitivity to outcome devaluation in rats pre-treated with a shorter cocaine exposure regimen known to facilitate habit formation (27).

The study by LeBlanc (37) is one of very few that has assessed the effects of repeated drug exposure on adaptive goal-directed behavior using a two-option choice task that discourages habit formation. Another such study (38) found that rats given repeated amphetamine injections prior to training also showed normal sensitivity to reward devaluation during a two-option choice test. Together with the current results, such findings suggest that although chronic experience with psychostimulant drugs can profoundly alter adaptive behavior, this is not related to generalized hypofunction in neural systems underlying goal-directed behavior. That said, recent studies have shown that alcohol- and methamphetamine-associated contextual cues are effective in disrupting goal-directed choice between different reward options (61, 62), suggesting that Pavlovian stimulus-drug learning may contribute to drug-induced behavioral dysregulation. Importantly, this possibility was not investigated in the current study, as rats were exposed to cocaine in the presence of contextual cues that were clearly discriminable from those present during instrumental training and testing and were repeatedly handled and exposed to the main behavioral apparatus (without further cocaine exposure) prior to testing, which likely extinguished any unintended drug-related learning that happened to occur.

Our finding that repeated cocaine exposure heightened rats' sensitivity to action–outcome contingency degradation demonstrates that the cocaine regimen used here was, in fact, effective in altering goal-directed processes, albeit in a manner that is at odds with the view that cocaine exposure disrupts goal-directed control. However, this finding was not entirely unanticipated. Though few in number, studies assessing the impact of chronic drug exposure on this aspect of learning have observed similar effects (39, 40). Most relevant to the current study, Phillips and Vugler (39) used a two-option task, similar to the one used here, to investigate the effects of a sensitizing regimen of amphetamine injections on contingency degradation learning. They found that amphetamine-treated rats displayed enhanced sensitivity to contingency degradation, in that they selectively suppressed their performance of an action that was no longer needed to produce its outcome, an effect that emerged well before it did in saline-treated rats (39). It should be noted that, in this study, amphetamine-treated rats did not significantly differ from saline-treated rats during a final (non-reinforced) choice test. However, because this test was conducted after both groups displayed evidence of contingency sensitivity during training sessions, it was not likely to reveal group differences in the *rate* of contingency degradation learning. This was not an issue in the current study since we conducted choice extinction tests before saline-treated rats showed evidence of contingency degradation learning, an effect that can require many sessions of training to emerge in some studies (39, 43), and which may have been particularly slow to develop for the task used here due to our use of highly similar reward options.

The differential effects of cocaine exposure on devaluation and contingency testing suggest that this drug treatment does not augment goal-directed learning or control in a general way. Instead, it is possible that this finding reflects a fundamental alteration in the way animals adapt to changes in action-reward contingencies. For instance, it has been shown that cocaine-treated rats exhibit heightened sensitivity to differences in reward delay and magnitude when deciding between reward options (60). Another possibility is that cocaine exposure alters processes specific to contingency degradation learning, including the ability to track information about non-contingent reward deliveries and integrate this with information about response-contingent reward probabilities. Because non-contingent rewards occur in the absence of other, more predictive cues, it is believed that the likelihood of their occurrence is tracked through context conditioning (63, 64). This view assumes that the probability that an instrumental action will be performed depends on its ability to serve as a reliable predictor of reward, relative to other potential predictors, including contextual cues. Given this competition, the rate at which an action is performed should be inversely related to the degree to which the test context predicts the delivery of the reward earned by that action. From this perspective, the key to understanding cocaine's impact on contingency learning may be related to its well-established facilitative influence on Pavlovian (stimulus-reward) learning (27, 65–70), since this should allow the context to better compete with instrumental actions for cocaine-treated rats.

Drug-induced enhancement in stimulus-reward learning has been linked to hyper-responsivity in ascending dopamine systems (70, 71). This is interesting given the finding that dopamine-depleting lesions of the dorsomedial striatum (DMS) disrupt rats' sensitivity to action–outcome contingency degradation but spare their ability to select between actions during outcome devaluation testing (72), even though this structure is known to be a key mediator of both of these features of goal-directed behavior (41). There is, in fact, quite strong evidence that dopamine transmission is not critical for the instrumental incentive learning process responsible for encoding changes in value of rewards or in using such information to control instrumental goal-directed behavior (31, 73), which may explain our finding that these processes were relatively unaffected by repeated cocaine exposure. Interestingly, Corbit et al. (29) recently found evidence that a habit-facilitating cocaine exposure regimen augmented presynaptic glutamate signaling in the DMS. While it was suggested that this phenomenon could reflect a state of DMS dysfunction, leading to impaired goal-directed control, we suggest that it may also contribute to the augmented contingency degradation effect reported here.

It remains unclear if drug-induced augmentation of instrumental contingency degradation learning is a harmless side effect of drug intake or if it contributes in some way to the addiction process. For example, it has been suggested that some individuals may use psychostimulants in order to cope with poor cognitive performance associated with pathologies, such as attention deficit and hyperactivity disorder (ADHD). Interestingly, it was recently shown that treatment with the psychostimulant methylphenidate could restore certain features of goal-directed control in a rat model of ADHD (74). However, the immediate beneficial effects of such drugs may lead to drug abuse and addiction. For instance, it is believed that the use of psychostimulants for self-medication purposes could be an important contributor to the high comorbidity rate of ADHD and substance use disorder (75). Alternatively, it is possible that the augmentation of goal-directed contingency learning following chronic cocaine exposure may actually have disruptive effects on behavioral control that were not observed in the current study. For instance, it has been suggested that in some circumstances, chronic drug intake may disrupt the development of adaptive habits for routine tasks (40), which could overburden the goal-directed system and impair decision-making when cognitive resources become taxed. The hypothesis that drug exposure disrupts behavioral flexibility by misallocating cognitive resources should be explored further, as it could have important implications for addiction theory and research.

### AUTHOR CONTRIBUTIONS

BH contributed to experimental design and execution, performed statistical analysis and generated figures, and participated in the writing of the manuscript. AL contributed to experiment design and execution and assisted with data analysis. SO contributed to experimental design, data analysis, and manuscript preparation. All authors read, provided feedback, and approved the final version of the paper.

### FUNDING

This research was funded by NIDA grant DA029035 to SO.


**Conflict of Interest Statement:** The authors declare that this research was conducted in the absence of any commercial, financial, or other relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Halbout, Liu and Ostlund. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Ventral Tegmental Area Afferents and Drug-Dependent Behaviors

#### *Idaira Oliva and Matthew J. Wanat\**

*Department of Biology, Neurosciences Institute, University of Texas at San Antonio, San Antonio, TX, USA*

Drug-related behaviors in both humans and rodents are commonly thought to arise from aberrant learning processes. Preclinical studies demonstrate that the acquisition and expression of many drug-dependent behaviors involve the ventral tegmental area (VTA), a midbrain structure comprised of dopamine, GABA, and glutamate neurons. Drug experience alters the excitatory and inhibitory synaptic input onto VTA dopamine neurons, suggesting a critical role for VTA afferents in mediating the effects of drugs. In this review, we present evidence implicating the VTA in drug-related behaviors, highlight the diversity of neuronal populations in the VTA, and discuss the behavioral effects of selectively manipulating VTA afferents. Future experiments are needed to determine which VTA afferents and what neuronal populations in the VTA mediate specific drug-dependent behaviors. Further studies are also necessary for identifying the afferent-specific synaptic alterations onto dopamine and non-dopamine neurons in the VTA following drug administration. The identification of neural circuits and adaptations involved with drug-dependent behaviors can highlight potential neural targets for pharmacological and deep brain stimulation interventions to treat substance abuse disorders.

#### *Edited by:*

*Mark Walton, University of Oxford, UK*

#### *Reviewed by:*

*Giovanni Martinotti, University G. d'Annunzio, Italy*
*Miriam Melis, University of Cagliari, Italy*
*Elyssa Margolis, University of California San Francisco, USA*

#### *\*Correspondence:*

*Matthew J. Wanat matthew.wanat@utsa.edu*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 15 December 2015 Accepted: 23 February 2016 Published: 07 March 2016*

#### *Citation:*

*Oliva I and Wanat MJ (2016) Ventral Tegmental Area Afferents and Drug-Dependent Behaviors. Front. Psychiatry 7:30. doi: 10.3389/fpsyt.2016.00030*

Keywords: VTA, substance use disorders, addiction, dopamine, plasticity

### INTRODUCTION

Illicit drug use is a significant global problem, with the United Nations Office on Drugs and Crime estimating that 246 million people worldwide used illicit drugs in 2013. More problematic is the high incidence of substance use disorders (SUDs), which in 2014 was estimated to afflict roughly 21.5 million people in the US, corresponding to ~8% of the population (1). In addition to the personal impact of a SUD, there is a significant economic impact due to lost productivity, crime, and health care costs, which the US Office of National Drug Control Policy estimates at \$180.8 billion per year in the US alone.

SUDs are now recognized to exist along a continuum where the severity of the disorder is related to the number of diagnostic criteria met by an individual within the past year. According to the DSM-5, the criteria for a SUD fall into four major symptomatic clusters: impaired control (i.e., use more than intended), social impairment (i.e., substance use at the expense of personal relationships and impaired job performance), risky behavior (i.e., use despite known adverse consequences), and pharmacological effects (i.e., tolerance and withdrawal). One of the most daunting aspects in treating SUDs is the high incidence of relapse, which occurs in ~40–60% of individuals (2). In drug users, exposure to drug-paired cues elicits craving, which in turn can increase the likelihood of relapse (3). Weakening the relationship between drugs and associated cues holds promise as a non-pharmacological method for treating SUDs (4). However, our understanding of the specific neural circuits and neural adaptations responsible for drug-related behaviors is incomplete.

### RODENT MODELS OF DRUG-DEPENDENT BEHAVIORS

Rodent model systems are commonly employed to examine the effects of abused drugs on behavior. In this review, we will concentrate on psychostimulants and opiates, as extensive laboratory research has focused on these drug categories. The non-contingent administration of psychostimulants or opiates increases locomotor activity in rodents (5). Repeated non-contingent drug injections can lead to a progressive and long-lasting increase in this drug-induced locomotor activity, a phenomenon referred to as behavioral sensitization (5). A single injection of cocaine at high doses is also capable of eliciting sensitization (6, 7). Furthermore, even when no drug is administered, locomotor activity is elevated in the same context where animals received a single drug injection on the preceding day (8). These results illustrate that the association between a drug and the context where the drug is experienced is rapidly learned following a single exposure.

Drug-paired cues exert a powerful influence over behavioral actions in individuals with a SUD (3). The development of an association between drugs and cues can be examined in humans in the laboratory (9, 10), as well as in rodents by utilizing a conditioned place preference (CPP) behavioral paradigm (11). This rodent assay involves repeated non-contingent drug injections in one chamber and control injections in an adjacent, but contextually distinct chamber. The relative preference between the drug-paired and control contexts is subsequently assessed in a test session where the rodent can freely access both chambers in a drug-free state (11). The CPP training procedure can include an extinction phase and a reactivation test (12, 13), which models drug abstention and relapse observed in humans suffering from a SUD. While CPP paradigms examine contextual learning involving reinforcing outcomes, conditioned place aversion (CPA) assays examine learning involving aversive outcomes. In particular, CPA paradigms are commonly utilized to study the negative affective state following drug withdrawal (14, 15).
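The relative preference measured in the CPP test session can be expressed as a simple score. The sketch below shows two ways such a score might be computed, a time difference and a fraction of time. It is an illustration only: the function name `cpp_preference`, the choice of metrics, and the example durations are assumptions rather than values or methods taken from the cited studies.

```python
def cpp_preference(time_drug_paired_s, time_control_s):
    """Return (difference in seconds, fraction of time in the drug-paired chamber).

    Both inputs are times (in seconds) spent in each chamber during a
    drug-free test session. Hypothetical helper for illustration.
    """
    total = time_drug_paired_s + time_control_s
    if total <= 0:
        raise ValueError("total time in the two chambers must be positive")
    difference = time_drug_paired_s - time_control_s
    fraction = time_drug_paired_s / total
    return difference, fraction


# Hypothetical test session: 540 s in the drug-paired chamber, 360 s in the control chamber.
difference, fraction = cpp_preference(540, 360)
print(difference, fraction)
```

A positive difference (or a fraction above 0.5) would indicate a preference for the drug-paired context, whereas a negative difference would correspond to an aversion, as in CPA assays.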

Behavioral sensitization and CPP paradigms are relatively easy to implement, but they require experimenter-administered drug injections. Rodents can be readily trained to self-administer drugs via an intravenous catheter. A number of drug self-administration assays have been developed to model the behavioral symptoms observed in humans with a SUD. For example, rodents with limited access (1 h) to drugs in daily self-administration sessions maintain stable drug intake. However, rodents with extended access (6 h) to drugs increase their intake over multiple training sessions, similar to the escalated drug consumption that can be observed in individuals diagnosed with a SUD (16–18). Just as drug use does not necessarily lead to a SUD, not every rodent that self-administers drugs will develop an addiction-related phenotype. When rodents are extensively trained to self-administer drugs (~3 months), a subset of rats exhibit characteristics found in humans with SUDs, such as persistent drug seeking in the absence of reinforcement, exerting greater effort to obtain a drug infusion, and seeking drugs despite aversive consequences (19). Rodents trained to self-administer drugs are also used to model relapse. Relapse in humans is often precipitated by three major factors: taking the drug, exposure to cues previously associated with the drug, or experiencing a stressful life-event (20–22). These same triggers (drug intake, exposure to drug-related cues, or stress) can reinstate drug-seeking behaviors in rodent drug self-administration models as well (23).

Just as with humans with a SUD, drug-dependent behaviors in rodents involve a component of learning, whether it is contextual (behavioral sensitization, CPP, CPA, and cue-induced reinstatement) or operant (drug self-administration). While numerous brain regions are involved with mediating learning and drug-related behaviors, we will focus on the ventral tegmental area (VTA) in this review. We will also discuss the major inputs to the VTA, how these inputs influence VTA neuron activity, and present recent findings on how these VTA afferents are involved with drug-dependent behaviors.

### VTA INVOLVEMENT IN DRUG-DEPENDENT BEHAVIORS

The dopamine neurons arising from the VTA that project to the nucleus accumbens (NAc) are involved with mediating the reinforcing actions of abused substances (24–26). While abused drugs increase dopamine levels in the NAc (27, 28), many non-habit forming drugs do not affect dopamine overflow (27). Psychostimulants affect dopamine levels primarily by altering dopamine clearance from the extracellular space (29, 30), whereas opiates indirectly elevate dopamine transmission by suppressing inhibitory input onto dopamine neurons (31–33).

The neural circuitry mediating any behavior is complex, though extensive research over the past few decades illustrates that the VTA is critically involved with both rewarding and aversive drug-dependent behaviors. For example, the VTA is required for behavioral sensitization induced by amphetamine or mu-opioid receptor agonists, though evidence for the involvement of the VTA in cocaine behavioral sensitization is mixed (5). The VTA is also involved with CPP for both psychostimulants and opiates (34–39), and with CPA elicited by kappa opioid receptor activation (15). The VTA is also necessary for stress-, cue-, and drug-primed reinstatement in rodents self-administering cocaine (23, 40–42) or heroin (43–45). While VTA-dependent behaviors are often mediated by dopamine neurons, increasing evidence illustrates the involvement of non-dopamine VTA neurons in regulating behavioral outcomes.

### DIVERSE NEURONAL POPULATIONS WITHIN THE VTA

The VTA along with the neighboring substantia nigra pars compacta are the primary dopamine producing nuclei in the brain (46). Early electrophysiological recordings indicated that the VTA was comprised of two distinct neuronal populations, presumed to be dopamine neurons and local GABA interneurons (31, 47). However, a subset of VTA neurons exhibited a unique electrophysiological response to serotonin and opioid receptor agonists, providing evidence for the existence of an additional neuronal population in the VTA (48). Accumulating evidence over the past decade has highlighted the complexity of the VTA with regard to both neuronal composition and projection targets.

Dopamine neurons comprise the largest neuronal population within the VTA, as tyrosine hydroxylase (TH), the rate-limiting enzyme for dopamine synthesis, is found in ~60% of VTA neurons (46, 49). VTA dopamine neurons typically innervate only a single target region, with different populations projecting to numerous brain nuclei, including the NAc, dorsal striatum, cortex, amygdala, globus pallidus, and lateral habenula (LHb) (46, 50, 51). However, recent evidence indicates that dopamine neurons projecting to the medial NAc also send collaterals outside of the striatum (50). Traditionally, dopamine neurons have also been identified based upon electrophysiological properties, including the presence of a long triphasic action potential, a low baseline firing rate, burst firing, and the presence of the *I*h current (52, 53). However, action potential duration may not be sufficient to identify the neurotransmitter content of VTA neurons (49, 54). Additionally, many neurons within the medial aspects of the VTA have *I*h but do not contain TH. While action potential duration and *I*h are not always indicative of dopamine content, these electrophysiological properties can be related to where VTA neurons project (55–57).

The second largest neuronal population in the VTA consists of GABA neurons (~25%) that are commonly identified by the presence of glutamic acid decarboxylase (GAD) (58, 59). While initially thought to function primarily as local interneurons (31), VTA GABA neurons directly influence the activity of VTA dopamine neurons (60, 61) and also project to the ventral pallidum (VP), lateral hypothalamus (LH), and LHb, with smaller projections to the amygdala, prefrontal cortex (PFC), and NAc (62–64). Recently, dopamine neurons were identified as an additional source of GABA in the VTA, as these neurons can synthesize GABA through an aldehyde dehydrogenase-mediated pathway (65). VTA and substantia nigra dopamine neurons package GABA into vesicles through the vesicular transporter for dopamine, indicating that GABA can be coreleased with dopamine to elicit electrophysiological effects on medium spiny neurons in both the NAc and dorsal striatum (66, 67).

In addition to dopamine and GABA neurons, a small percentage of VTA neurons contain vesicular glutamate transporter 2 (VGluT2), a marker for glutamate neurons. These neurons predominately reside in the medial aspects of the VTA and project to the ventral striatum, PFC, VP, amygdala, and LHb, as well as synapse onto local dopamine neurons (57, 64, 68–72). A subset of the VGluT2 positive neurons in the VTA also express TH and can project to the PFC and ventral striatum (70). These neurons release both dopamine and glutamate (73–77) though they are not typically released at the same site or from the same synaptic vesicles (78). While the VTA was thought to be comprised solely of dopamine and GABA neurons, recent studies illustrate that the VTA is comprised of dopamine neurons that can corelease GABA, dopamine neurons that corelease glutamate, GABA neurons, and glutamate neurons.

Optogenetic modulation of VTA neurons can elicit either appetitive or aversive behavioral outcomes depending upon the neuronal population that is targeted. Activation of dopamine neurons is acutely reinforcing and sufficient for establishing a CPP, whereas silencing dopamine neurons is aversive and elicits a CPA (60, 79, 80). Stimulating VTA dopamine neurons also enhances reinforcing behaviors in operant tasks (81–84). In contrast, selective activation of VTA GABA neurons is aversive, elicits a CPA, and reduces reward consumption by inhibiting the activity of local VTA dopamine neurons (60, 61). Interestingly, activating VTA GABA neurons that synapse onto cholinergic interneurons in the NAc enhances the discrimination between neutral and aversive stimuli (63). Optogenetic activation of VGluT2-containing neurons in the VTA is also sufficient for establishing CPP, an effect that is mediated by activating local VTA dopamine neurons (72). Collectively, these studies suggest that VTA-mediated behavioral effects, including drug-dependent behaviors, likely involve a complex interplay between the distinct neuronal populations in the VTA.

### AFFERENT REGULATION OF THE VTA

The VTA is innervated by a diverse array of inputs, many of which are interconnected. Large afferents to the VTA include the rostromedial tegmental nucleus (RMTg), VP, bed nucleus of the stria terminalis (BNST), LH, pedunculopontine tegmental nucleus (PPT), laterodorsal tegmental nucleus (LDT), dorsal raphe nucleus (DR), NAc, PFC, and amygdala (50, 85–87). While VTA dopamine and GABA neurons are innervated by many of the same brain regions (50), little is known about the inputs to VGluT2 positive neurons in the VTA. Below, we will discuss how notable inputs to the VTA can influence the activity of VTA neurons, how these inputs influence VTA-dependent behaviors, and recent findings on VTA afferents involved with drug-dependent behaviors.

### Rostromedial Tegmental Nucleus

The RMTg (also referred to as the tail of the VTA) is a nucleus comprised of GABA neurons that function as an inhibitory relay between the LHb and the VTA (86, 88–92). Lesions of the RMTg demonstrate a critical role for this brain region in modulating aversive behaviors (86). Additionally, neurons in the RMTg are activated by aversive stimuli and inhibited by rewards (86). The RMTg heavily influences the firing of VTA neurons, as RMTg inactivation increases dopamine neuron firing (93), whereas stimulating the RMTg attenuates dopamine neuron firing (93–95).

The RMTg is increasingly recognized as an important nucleus in mediating the effects of abused drugs. The reinforcing effect of opiates was originally thought to arise from activation of mu-opioid receptors on VTA GABA interneurons (31), though accumulating evidence suggests the major target of opiates is instead the RMTg afferents to the VTA (33, 96, 97). The administration of morphine decreases RMTg cell firing, which reduces the inhibition onto VTA dopamine neurons, resulting in elevated dopamine neuron firing (94–96). Indeed, selective activation of mu-opioid receptors in RMTg neurons projecting to the VTA is sufficient for eliciting a real-time place preference (98). Following opiate withdrawal, inhibiting RMTg neurons no longer elevates VTA dopamine neuron firing. This inability of the RMTg to disinhibit dopamine neurons is mediated in part by an alteration in VTA glutamatergic tone (93). While the RMTg projection to the VTA mediates the acute reinforcing effects of opiates (33, 96, 98), additional VTA afferent pathways are involved with dopamine neuron tolerance to opiates following withdrawal (93).

Psychostimulants also influence the activity of RMTg neurons (94). The non-contingent administration of cocaine elevates the levels of Fos, a transcription factor associated with increased neuronal activity, in RMTg neurons (99, 100). Interestingly, Fos levels in RMTg neurons projecting to the VTA are elevated following extinction in rats self-administering cocaine (101). The RMTg is also necessary for cocaine-related aversive behaviors that are observed once the rewarding effect of cocaine dissipates (102). Further experimentation is needed to validate whether the RMTg projection to the VTA is involved with both aversive and reinforcing behaviors elicited by cocaine.

### Ventral Pallidum

The VP is involved in processing rewarding stimuli and motivated behavior (103). GABA neurons in the VP provide a large source of inhibitory input to the VTA (87, 104). Activating VP neuron terminals elicits inhibitory GABA currents in both dopamine and non-dopamine VTA neurons (105). Inactivating the VP increases the population activity of putative dopamine neurons (106), though the effect on non-dopamine VTA neurons is unknown. Numerous lines of evidence implicate the VP in drug-dependent behaviors. VP neurons projecting onto dopamine and non-dopamine neurons are acutely inhibited by opiates (105). Additionally, VP lesions or pharmacological manipulations in the VP can block morphine-induced sensitization (107, 108), drug-induced CPP (35, 109, 110), self-administration (111), and reinstatement (40, 41, 112). VP neurons projecting to the VTA are Fos activated following cue-induced reinstatement for cocaine (101) and silencing these neurons is sufficient for blocking cue-induced reinstatement (113). While VP neurons project to both dopamine and non-dopamine neurons in the VTA (105), it is unclear what neuronal population(s) in the VTA are influenced by the VP inputs during drug-dependent behaviors.

### Bed Nucleus of the Stria Terminalis

The BNST is involved in mediating fear and anxiety (114–120) and is considered to be a relay nucleus between stress and reward pathways (121, 122). The neuronal composition of the BNST is diverse, with efferent populations of GABA and glutamate neurons along with local GABA and cholinergic interneurons (122, 123). BNST neurons also express an assortment of neuropeptides including neuropeptide Y, corticotropin-releasing factor, enkephalin, dynorphin, and substance P (124). Electrical stimulation of the BNST exerts an excitatory influence on midbrain dopamine neurons (122, 125, 126) and elevates dopamine release in the NAc (127). Recent studies suggest that this excitatory effect on dopamine neurons is predominately mediated through GABA BNST neurons inhibiting VTA GABA neurons, which disinhibits dopamine neurons and results in anxiolytic and rewarding behavioral outcomes (128–130). Interestingly, glutamate neurons in the BNST also innervate VTA GABA neurons, and activation of these neurons elicits aversive and anxiogenic behaviors (129). Within the context of drug-dependent behaviors, local pharmacological manipulations illustrate a critical role of the BNST in the stress-induced reinstatement of drug seeking (41, 131, 132). Furthermore, recent studies implicate the BNST–VTA pathway in the locomotor-activating effects of cocaine (133) and in the expression of cocaine CPP (134), though the involvement of this pathway in other drug-dependent behaviors has not yet been explored.

### Lateral Hypothalamus

The LH is critical for the expression of motivated behaviors including feeding and drug seeking (135). The LH provides both glutamate and GABA inputs to the VTA (85, 136). In addition, LH neurons projecting to the VTA also contain neuropeptides such as neurotensin and orexin/hypocretin (137, 138). Electrical stimulation of the LH increases the activity of putative dopamine neurons and inhibits the activity of putative GABA neurons in the VTA (139). Many lines of evidence demonstrate that activation of this LH–VTA pathway is reinforcing. Rodents will readily self-stimulate for electrical activation of the LH, but this behavioral effect is inhibited by dopamine receptor antagonism (140) or inactivation of the VTA (141). Furthermore, optogenetic activation of LH inputs to the VTA also supports self-stimulation through a neurotensin-dependent mechanism (142).

Accumulating evidence over the past decade highlights the importance of orexin-containing neurons in feeding, the sleep/wake cycle, and drug-dependent behaviors (143). Orexin-producing neurons are exclusively localized within the hypothalamus and project widely throughout the brain (144), though it is the projection to the VTA that is heavily involved with drug-dependent behaviors. Intra-VTA injections of orexin receptor antagonists attenuate morphine CPP (145, 146), which is consistent with the reduced morphine dependence observed in orexin-deficient mice (147). Conversely, intra-VTA administration of orexin reinstates morphine CPP (12). Orexin antagonists targeting the VTA also diminish behavioral sensitization to cocaine (148), cocaine self-administration (149), and cue-induced reinstatement (150). Interestingly, orexin neurons in the LH also contain dynorphin, which inhibits the activity of VTA dopamine neurons. A recent study suggests that orexin in the VTA facilitates drug-related behaviors in part through attenuating the effects of dynorphin (149). Although the orexin-containing neurons in the LH have received considerable attention in the context of addiction, additional neuronal populations in the LH–VTA pathway are also likely involved in drug-dependent behaviors, as the non-orexin-producing neurons in the LH are Fos activated following cue-induced reinstatement (101).

### Laterodorsal Tegmental Nucleus and Pedunculopontine Tegmental Nucleus

The LDT and PPT are involved in modulating arousal and reward-driven behaviors (92, 151–154). These nuclei are comprised of distinct populations of acetylcholine, GABA, and glutamate neurons that project to the midbrain dopamine system (155, 156). Anatomical studies indicate that the VTA primarily receives input from the LDT (87, 155, 157). *In vivo* electrophysiological experiments illustrate that electrical stimulation of the LDT elicits burst firing in putative VTA dopamine neurons (158). Selective activation of LDT inputs to the VTA evokes excitatory currents in VTA dopamine neurons projecting to the lateral NAc (92). Stimulating this LDT–VTA pathway *in vivo* elicits CPP and reinforces operant responding (92, 154). Increasing evidence indicates that the LDT is also involved in drug-dependent behaviors. Specifically, local pharmacological manipulations demonstrate the LDT is critical for the acquisition and expression of cocaine CPP (159), as well as for cocaine-primed reinstatement of drug seeking (160). Interestingly, the cholinergic neurons of the LDT are involved with the behavioral responsiveness to cocaine-paired cues (161). Further studies are needed to ascertain whether drug-dependent behaviors also involve the GABA and glutamate projections from the LDT to the VTA.

Whereas the VTA is preferentially innervated by the LDT, the PPT primarily targets the substantia nigra (87, 155). Although the anatomical evidence indicates there is a small PPT projection to the VTA (87, 155), electrophysiological studies *in vivo* and *in vitro* suggest a functional relationship exists between the PPT and VTA (106, 162, 163). The reason for the discrepancy between the anatomical and electrophysiological studies is unclear, though proposed explanations include the possibility that a single PPT neuron innervates numerous VTA neurons or that electrical stimulation excites fibers of passage or nearby regions, such as the LDT (87). Regardless, electrical stimulation targeting the PPT increases burst firing of putative VTA dopamine neurons (106), while PPT inactivation reduces dopamine neuron firing to salient stimuli (162). The PPT is also implicated in drug-dependent behaviors, as lesions attenuate amphetamine- and morphine-induced locomotor activity (164), and PPT inactivation reduces cocaine-primed reinstatement of drug seeking (160). PPT lesions reduce both heroin self-administration and morphine CPP (165, 166). However, PPT cholinergic neurons are not involved with cocaine self-administration, heroin self-administration, cocaine CPP, or heroin CPP (167), suggesting the involvement of PPT glutamate and/or GABA neurons in these drug-related behaviors.

### Dorsal Raphe

The DR is the primary source of serotonin in the brain, but also contains glutamate (85), GABA (168), and dopamine neurons (169). While the DR is often studied within the context of controlling affective state (170), it is also involved in reinforcing instrumental behavior (171). Serotonin exerts a variety of electrophysiological responses in VTA neurons. The predominant *in vitro* response in putative dopamine neurons is excitatory, though a small proportion of dopamine neurons are inhibited by serotonin (172). In contrast, equal numbers of putative GABA neurons are excited and inhibited by serotonin (172). The net effect of these electrophysiological responses appears to be excitatory, as *in vivo* intra-VTA administration of serotonin elevates dopamine levels in the NAc (173).

Serotonin influences drug-related behaviors (174), which could involve the DR serotonin neurons projecting to the VTA. However, the DR projection to the VTA is primarily comprised of glutamate neurons that predominantly innervate dopamine neurons (85, 87, 175). Activation of DR glutamate neurons evokes excitatory currents in VTA dopamine neurons and elicits dopamine release in the NAc (175). Selective activation of the non-serotonergic DR–VTA pathway reinforces instrumental behavior and is sufficient for eliciting CPP (175, 176). In contrast, activation of serotonergic DR neurons projecting to the VTA is only weakly reinforcing (176). These anatomical and behavioral findings suggest that the VTA is likely not a primary locus where serotonin acts to influence drug-related behaviors. Instead, the non-serotonergic DR neurons projecting to the VTA are well positioned to mediate drug-dependent behaviors, though this has not yet been experimentally examined.

### Nucleus Accumbens

GABA neurons in the NAc project to the VTA and are thought to mediate a "long-loop" inhibitory feedback to regulate dopamine neuron activity (177). Mu-opioid receptor agonists acutely inhibit the GABA afferents from the NAc to the VTA (33, 178). The inhibitory transmission from the NAc inputs onto VTA GABA neurons is enhanced following repeated injections of cocaine, which in turn disinhibits VTA dopamine neurons (179). In addition to being influenced by opiates and psychostimulants, the NAc afferents to the VTA are Fos activated during cocaine cue-induced reinstatement (101). While these results suggest the NAc–VTA pathway is involved in drug-related behaviors, no experiments to date have examined the behavioral effect of selectively perturbing this pathway.

### Prefrontal Cortex

The medial PFC mediates a variety of cognitive functions (180), is involved in the reinstatement of drug-seeking behavior (23), and exhibits Fos activation following an acute administration of amphetamine (181). The VTA receives a dense glutamate projection from the medial PFC (85), with pyramidal neurons synapsing onto both dopamine and non-dopamine VTA neurons (62, 182). Electrically stimulating the PFC can either inhibit or excite putative dopamine neurons within the VTA (183, 184). Whereas single pulse or low frequency PFC stimulation inhibits a majority of VTA dopamine neurons (183–185), burst stimulation of the PFC excites >90% of VTA dopamine neurons (184). The mechanism behind the dopamine neuron excitation is unclear, as VTA dopamine neurons receive sparse input from the PFC (87, 186), with <15% of VTA dopamine neurons being excited by selective activation of medial PFC inputs (50). These findings collectively suggest the medial PFC preferentially targets VTA GABA neurons, though the relevance of this PFC-VTA pathway in drug-dependent behaviors has not been examined.

### Amygdala

The amygdala is an interconnected group of nuclei involved with attributing emotional value to cues (187, 188). The VTA receives amygdala input arising from the central nucleus of the amygdala (CeA) subdivision (87, 189). The CeA contains predominantly GABA neurons and is involved with fear conditioning (187, 188, 190), as well as with mediating the general motivational influence of rewarding cues (191, 192). In the context of drug-dependent behaviors, the CeA facilitates the expression of conditioned responding (193) and is also involved with mediating stress-induced reinstatement of drug-seeking behavior (194, 195). While the CeA projects to the VTA, it is currently unknown how this pathway influences VTA neuron activity and whether it is crucial for drug-dependent behaviors.

### DRUG-INDUCED SYNAPTIC PLASTICITY ON VTA NEURONS

The transition of an individual from drug-naive or casual drug use to a SUD involves changes in the function of specific neural circuits (196). Given the importance of the VTA in drug-related behaviors, the synaptic adaptations in VTA dopamine neurons have been extensively studied and reviewed elsewhere (197–201). Numerous studies from a variety of laboratories have consistently demonstrated an increase in excitatory synaptic strength onto VTA dopamine neurons after *in vivo* exposure to abused drugs (202–208). Many of these studies examined the effect of drugs on the ratio of the AMPA receptor current to the NMDA receptor current (AMPA/NMDA) in VTA neurons, which allows for comparing the excitatory synaptic strength between different groups of animals (i.e., drug treated vs. control). *In vivo* exposure to drugs of abuse increases the AMPA/NMDA ratio (202–204, 206, 207), an effect mediated by insertion of calcium-permeable AMPA receptors and removal of NMDA receptors in VTA dopamine neurons (205, 208).
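To make the group comparison concrete, the AMPA/NMDA measure described above can be sketched as a simple ratio of peak current amplitudes, averaged within each group. This is only an illustrative sketch: the function name `ampa_nmda_ratio`, the choice of peak amplitudes in pA, and all example values are hypothetical and are not taken from the cited studies.

```python
from statistics import mean


def ampa_nmda_ratio(ampa_peak_pa: float, nmda_peak_pa: float) -> float:
    """Return the AMPA/NMDA ratio from two peak current amplitudes (pA)."""
    if nmda_peak_pa == 0:
        raise ValueError("NMDA peak amplitude must be non-zero")
    # Use magnitudes so the sign convention of the recording does not matter.
    return abs(ampa_peak_pa) / abs(nmda_peak_pa)


# Hypothetical (AMPA peak, NMDA peak) pairs, one pair per recorded neuron.
control_cells = [(40.0, 100.0), (45.0, 100.0)]
drug_cells = [(80.0, 100.0), (70.0, 100.0)]

# Average the per-cell ratios within each group before comparing groups.
control_mean = mean(ampa_nmda_ratio(a, n) for a, n in control_cells)
drug_mean = mean(ampa_nmda_ratio(a, n) for a, n in drug_cells)

print(f"control: {control_mean:.3f}, drug-treated: {drug_mean:.3f}")
```

Because the measure is a ratio within each cell, it can be compared across animals without assuming that the same number of synapses was stimulated in every recording. A higher ratio can reflect more AMPA current, less NMDA current, or both.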

In addition to the excitatory synaptic alterations in VTA dopamine neurons, *in vivo* exposure to drugs also modulates inhibitory synaptic inputs to the VTA. For example, repeated injections of cocaine potentiate the NAc inhibitory input to VTA GABA neurons, which results in a disinhibition of dopamine neurons (179). This disinhibition also facilitates the ability to elicit excitatory long-term potentiation (LTP) in VTA dopamine neurons (209). VTA dopamine neurons are also capable of undergoing inhibitory LTP, and this inhibitory LTP is blocked following an *in vivo* exposure to opiates (210, 211). A myriad of drug-induced synaptic alterations have been reported, though it is important to note that the full complement of electrophysiological changes and the duration of these alterations in VTA neurons depend upon the drug, the drug dose, and the manner in which the drug is administered (202–204, 206, 207, 212). Few studies to date have examined whether these drug-induced synaptic changes occur in an afferent-specific manner (179, 212). Indeed, *in vivo* exposure to different classes of abused drugs results in alterations in distinct excitatory inputs to VTA dopamine neurons (212). Although much has been learned regarding synaptic alterations in the VTA following non-contingent injections of abused drugs, additional studies are needed to ascertain the similarities and differences in the synaptic changes evoked by different classes of abused drugs (psychostimulants, opiates, alcohol, nicotine, etc.). Furthermore, electrophysiological studies are also needed to identify which VTA afferents and what VTA neuronal populations undergo synaptic alterations following contingent drug self-administration.

### CONCLUSION

The high incidence of relapse illustrates the need for identifying new therapeutic approaches for the treatment of SUDs. The treatment of opioid dependence is complicated by the severe withdrawal symptoms experienced by individuals when ceasing drug intake. The current treatment options for opioid SUDs typically focus on opioid maintenance with methadone or buprenorphine and detoxification with alpha-2 receptor agonists. However, these current treatment options often result in relapse (213). Currently there is no FDA-approved pharmacotherapy for the treatment of cocaine SUDs, though *N*-acetylcysteine is a promising and well-tolerated drug that reduces cocaine-seeking in rodents and craving in cocaine-dependent humans (214–217). Over the past decade, research on effective pharmacological treatments for alcohol SUDs has identified many potential targets, including opioid receptors (218), dopamine receptors (219), glutamate receptors (220), GABA receptors (221), and adrenergic receptors (222). Preclinical research highlighted the cannabinoid system as a promising target for multiple SUDs (223, 224). However, a cardiovascular clinical study examining the efficacy of rimonabant, a cannabinoid receptor antagonist, revealed severe negative neuropsychiatric effects (225), which has dampened enthusiasm for targeting the endocannabinoid system for treating SUDs. Unfortunately, no single pharmacotherapy currently exists for treating a broad spectrum of SUDs.

An alternative therapeutic direction for the treatment of SUDs involves the use of deep brain stimulation (DBS), which commonly has been utilized for the treatment of movement disorders. In preclinical studies, DBS targeting the NAc reduced cocaine behavioral sensitization (226), morphine CPP (227), reinstatement of heroin-seeking (228), and reinstatement of cocaine-seeking (229–231). Additionally, DBS targeting the LHb reduces cocaine self-administration and the reinstatement of cocaine-seeking (232). Consistent with the rodent DBS experiments, clinical studies indicate a complete remission or prolonged cessation of heroin use after DBS in the NAc in humans (233, 234). A considerable drawback of implementing DBS in humans is the invasive nature of implanting the probe. However, a couple of recent reports illustrate that non-invasive transcranial magnetic stimulation of the PFC is effective at reducing drug use and craving (235, 236). While there are promising new therapeutic approaches for treating SUDs, the ultimate goal for any intervention is to be effective and as specific as possible to limit side effects. Thus, additional basic science research is needed for identifying the specific neural circuits and adaptations responsible for the development of drug-dependent behaviors.

The implementation of optogenetic and chemogenetic approaches in behavioral experiments has validated and identified specific neural circuits that mediate a range of appetitive and aversive behaviors. Many of these studies manipulated brain regions implicated in SUDs (237), though relatively few have modulated neural circuits within the context of drug-dependent behaviors (98, 113, 133). While activity within the VTA is central to numerous drug-dependent behaviors, many questions remain. Future experiments are needed to (i) determine which VTA afferents and what neuronal populations in the VTA mediate a particular drug-dependent behavior and (ii) elucidate the associated afferent-specific synaptic changes on both dopamine and non-dopamine neurons within the VTA. Identifying the neural circuits and adaptations responsible for drug-dependent behaviors in rodents can highlight specific neural circuits for targeted pharmacological and DBS therapeutic interventions to treat humans suffering from a SUD.

### REFERENCES


### AUTHOR CONTRIBUTIONS

MW and IO contributed to the writing of this review article.

### FUNDING

This work was supported by National Institutes of Health Grant DA033386 (MW).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Oliva and Wanat. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Memory Systems and the Addicted Brain

#### *Jarid Goodman and Mark G. Packard\**

*Department of Psychology, Texas A&M Institute for Neuroscience, Texas A&M University, College Station, TX, USA*

The view that anatomically distinct memory systems differentially contribute to the development of drug addiction and relapse has received extensive support. The present brief review revisits this hypothesis as it was originally proposed 20 years ago (1) and highlights several recent developments. Extensive research employing a variety of animal learning paradigms indicates that dissociable neural systems mediate distinct types of learning and memory. Each memory system potentially contributes unique components to the learned behavior supporting drug addiction and relapse. In particular, the shift from recreational drug use to compulsive drug abuse may reflect a neuroanatomical shift from cognitive control of behavior mediated by the hippocampus/dorsomedial striatum toward habitual control of behavior mediated by the dorsolateral striatum (DLS). In addition, stress/anxiety may constitute a cofactor that facilitates DLS-dependent memory, and this may serve as a neurobehavioral mechanism underlying the increased drug use and relapse in humans following stressful life events. Evidence supporting the multiple systems view of drug addiction comes predominantly from studies of learning and memory that have employed as reinforcers addictive substances often considered within the context of drug addiction research, including cocaine, alcohol, and amphetamines. In addition, recent evidence suggests that the memory systems approach may also be helpful for understanding topical sources of addiction that reflect emerging health concerns, including marijuana use, high-fat diet, and video game playing.

#### *Edited by:*

*Vincent David, Centre National de la Recherche Scientifique (CNRS), France*

#### *Reviewed by:*

*Jacques Micheau, University of Bordeaux 1, France Roberto Ciccocioppo, University of Camerino, Italy*

#### *\*Correspondence:*

*Mark G. Packard markpackard@tamu.edu*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 01 December 2015 Accepted: 11 February 2016 Published: 25 February 2016*

#### *Citation:*

*Goodman J and Packard MG (2016) Memory Systems and the Addicted Brain. Front. Psychiatry 7:24. doi: 10.3389/fpsyt.2016.00024*

Keywords: memory, drug addiction, hippocampus, striatum, amygdala, stress, anxiety

## INTRODUCTION

Investigators often look to mechanisms of learning and behavior to explain how human psychopathology is acquired and expressed. An example of such an application was provided by Norman M. White, who employed tenets of classical learning theory and experimental evidence supporting the existence of multiple memory systems in the brain to provide a novel, influential approach to drug addiction (1). Specifically, White indicated that drugs can play the part of "reinforcers" that, like food or water in a learning task, strengthen associations among drug-related stimuli, context, and behavior to promote drug taking and, over time, addiction. White also incorporated the emerging hypothesis that there are different types of memory that are mediated by dissociable neural systems. According to this novel view, drugs can directly modulate multiple neural systems, and these neural systems go on to encode distinct components of the drug-related memory that, when expressed, promote further drug taking.

The year 2016 marks the 20th anniversary of the multiple memory systems view of drug addiction as described by White. The present review revisits this influential hypothesis, while highlighting some important recent developments that have not only substantiated the original hypothesis but have also produced additional insights into how multiple memory systems potentially support drug addiction.

### THE MULTIPLE MEMORY SYSTEMS VIEW OF ADDICTION

Converging evidence from studies employing humans and lower animals indicates that mammalian memory is mediated by relatively independent neural systems [for reviews, see Ref. (2–4)]. The early experiments dissociating multiple memory systems were primarily conducted in the radial maze and indicated unique mnemonic functions for the hippocampus, dorsal striatum, and amygdala (5, 6). The hippocampus mediates a cognitive/spatial form of memory, whereas the dorsal striatum mediates stimulus–response (S–R) habit memory. The amygdala mediates Pavlovian and stimulus–affect associative relationships (6, 7), while also subserving the modulatory role of emotional arousal on other types of memory (8–12).

Within the context of the multiple systems view of memory, White (1) suggested that the hippocampus, dorsal striatum, and amygdala encode unique components of drug-related memories (see **Figure 1**). The hippocampus encodes explicit knowledge pertaining to the relationship between cues and events (i.e., stimulus–stimulus associations) in the drug context. Importantly, the hippocampus does not encode behavioral responses, but rather the information acquired by the hippocampus can be used to generate the appropriate behavioral responses to receive drug reinforcement. On the other hand, the dorsal striatum encodes associations between drug-related stimuli and behavioral responses. This may allow the presentation of a drug-related cue to activate an automatic behavioral response that results in drug taking (e.g., running approach or instrumental lever press). The amygdala encodes Pavlovian-associative relationships, thus allowing neutral cues in the drug context to become associated with the drug reward. Animals later react to these conditioned cues similarly to how they originally reacted to the drug. Specifically, the conditioned cues activate conditioned emotional responses, including internal affective states and conditioned approach toward (or in some cases avoidance from) the conditioned cue. Another critical component of White's hypothesis is that drugs can modulate memory function of each of these brain regions. Thus, drugs can potentially enhance their own self-administration via augmenting consolidation of the drug-related memories encoded by the hippocampus, amygdala, and dorsal striatum (see **Figure 1**).

Consistent with the multiple memory systems view of drug addiction, extensive evidence indicates critical roles for the hippocampus, dorsal striatum, and amygdala in drug addiction and relapse for a variety of abused substances [for review, see Ref. (13)]. The dorsal hippocampus appears to have a role in the contextual control of drug seeking for cocaine (14–16). The lateral region of the dorsal striatum (DLS) mediates S–R habitual lever pressing for cocaine and alcohol (17, 18), and the basolateral amygdala (BLA) mediates conditioned drug seeking for cocaine, alcohol, and heroin (19–22). Also consistent with White's hypothesis, substances of abuse can modulate the mnemonic functions of the hippocampus, dorsal striatum, and amygdala (23–31).

Recent studies have contributed novel amendments to the multiple memory systems approach to drug addiction. Key features of this contemporary view include (1) a neuroanatomical shift over time to DLS-dependent habit memory, (2) competitive interactions between memory systems, (3) the role of stress and anxiety in enhancing habitual drug seeking, and (4) the application of this hypothesis to new emerging sources of addiction.

### THE NEUROANATOMICAL SHIFT FROM COGNITION TO HABIT

In experimental learning situations, subjects typically employ purposeful behavior when initially solving a task. However, following extensive training, behavior becomes autonomous and can be performed with little attention, intention, or cognitive effort, constituting a "habit" [for review, see Ref. (32)]. In early demonstrations of this shift from cognitive control of behavior to habit, rodents were trained using food reward in a dual-solution plus-maze task (33–35). In this task, rats were released from the same starting position (e.g., the south arm) and had to make a consistent body-turn at the maze intersection to receive food reward always located in the same goal arm (e.g., always make a left turn to find food in the west arm). Rats could solve this task by either learning a consistent body-turn response or by making whatever response was necessary to go to the same spatial location. To determine which strategy the rats employed, investigators implemented a probe test in which animals were released from the opposite start arm (e.g., the north arm). If animals made the opposite body-turn to go to the original goal location, they were identified as place learners. If animals made the same body-turn as during training (i.e., going to the arm opposite to the original goal location), animals were identified as response learners. Evidence indicates that after some training, most animals display place learning, whereas after extensive training, animals shift to habitual response learning (34–36). Interestingly, this shift from place learning to response learning may reflect a neuroanatomical shift. The initial use of place learning in this task is mediated by the hippocampus and dorsomedial striatum [DMS (36, 37)], whereas the use of response learning after extended training is mediated by the DLS (36).
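The scoring rule for the probe test described above can be sketched in a few lines. This is a minimal illustration, assuming that each probe trial is recorded only as the direction of the body-turn; the function names and the example turn sequences are hypothetical and do not come from the cited experiments.

```python
from collections import Counter

VALID_TURNS = {"left", "right"}


def classify_probe_trial(training_turn: str, probe_turn: str) -> str:
    """Label one probe trial run from the start arm opposite to training.

    Repeating the trained body-turn takes the animal away from the original
    goal location (response learning). Making the opposite turn takes it to
    the original goal location (place learning).
    """
    if training_turn not in VALID_TURNS or probe_turn not in VALID_TURNS:
        raise ValueError("turns must be 'left' or 'right'")
    return "response" if probe_turn == training_turn else "place"


def summarize_probe(training_turn: str, probe_turns: list[str]) -> Counter:
    """Count place and response learners across a group of animals."""
    return Counter(classify_probe_trial(training_turn, t) for t in probe_turns)


# Hypothetical groups trained to turn left, probed from the opposite arm.
early_probe = summarize_probe("left", ["right", "right", "left", "right"])
late_probe = summarize_probe("left", ["left", "left", "right", "left"])

print("after some training:", dict(early_probe))
print("after extensive training:", dict(late_probe))
```

In this made-up example, most animals in the early probe are scored as place learners and most in the late probe as response learners, which mirrors the direction of the shift reported in the text.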

In addition to early demonstrations using the plus-maze (34, 35), the behavioral shift to habit memory was later demonstrated using operant lever pressing paradigms (38–42). In these instrumental learning tasks, animals initially lever press purposefully in order to obtain the outcome and will cease lever pressing once the food outcome is devalued. However, following extensive training animals will shift to habitual responding and will continue pressing the lever even after the food outcome has been devalued (40). As originally demonstrated in the plus-maze (36), the transition from cognition to habit in instrumental learning tasks might also be attributed to a neuroanatomical shift. The initial cognitive control of behavior in these instrumental learning tasks is mediated by the hippocampus and DMS (43, 44), whereas later habitual responding is mediated by the DLS (18, 45, 46).
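One simple way to express sensitivity to outcome devaluation numerically is the share of lever presses made after devaluation relative to presses in a comparable non-devalued test. The sketch below assumes this particular index, which is an illustrative choice and is not described in the studies cited above; the function name and the press counts are hypothetical.

```python
def devaluation_ratio(presses_devalued: int, presses_valued: int) -> float:
    """Share of responding that occurs when the outcome has been devalued.

    Values near 0 mean responding collapsed after devaluation, as expected
    for goal-directed behavior. Values near 0.5 mean responding was about
    the same in both tests, as expected for habitual behavior.
    """
    if presses_devalued < 0 or presses_valued < 0:
        raise ValueError("press counts must be non-negative")
    total = presses_devalued + presses_valued
    if total == 0:
        raise ValueError("at least one lever press is required")
    return presses_devalued / total


# Hypothetical press counts from one test session per condition.
moderately_trained = devaluation_ratio(presses_devalued=5, presses_valued=45)
extensively_trained = devaluation_ratio(presses_devalued=24, presses_valued=26)

print(f"moderate training: {moderately_trained:.2f}")
print(f"extensive training: {extensively_trained:.2f}")
```

No cutoff for labelling an animal "habitual" is implied here. The index only places behavior on a continuum between devaluation-sensitive and devaluation-insensitive responding.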

Numerous investigators have suggested that the neuroanatomical shift to habit memory demonstrated in maze and instrumental learning tasks might also underlie the shift from recreational drug use to compulsive drug abuse (13, 47–50). Consistent with this hypothesis, investigators have demonstrated for a variety of abused substances that the DMS mediates goal-directed responding for drug reinforcement and the DLS mediates habitual responding for drug reinforcement (18, 31, 51–53).

Considering the high abuse potential of some drugs, investigators have suggested that addictive drugs might enhance DLS-dependent habit memory function and thereby accelerate the shift from cognitive to habitual control of behavior. Consistent with this hypothesis, repeated exposure to amphetamine or cocaine facilitates the shift from goal-directed to habitual responding for food reinforcement in instrumental lever pressing tasks (31, 54–59). In addition, lever pressing for addictive substances (e.g., alcohol or cocaine) versus food reward has been associated with greater habitual responding versus goal-directed responding (24, 60, 61). In humans, alcohol-dependent individuals show greater habitual responding in an instrumental learning task, relative to non-dependent control individuals (62). This enhancement of DLS-dependent habit memory by addictive drugs has also been observed in rodent maze learning tasks. Cocaine, amphetamine, and alcohol exposure have been associated with enhanced learning in DLS-dependent maze tasks or greater use of DLS-dependent response strategies in dual-solution versions of the maze (25, 63, 64). In humans, the use of abused substances, including alcohol and tobacco, has been correlated with the greater use of dorsal striatum-dependent navigational strategies in a virtual maze (65). Thus, some drugs of abuse might enhance DLS-dependent habit memory, and this heightened engagement of the DLS memory system might accelerate the transition from recreational drug use to habitual drug abuse. This proposed mechanism is consistent with White's (1) original contention that drugs of abuse might sometimes facilitate their own self-administration by enhancing function of memory systems.

### COMPETITION BETWEEN MEMORY SYSTEMS

Although it is possible that addictive drugs enhance habit memory directly by enhancing function of the DLS [e.g., Ref. (29)], another possibility is that drugs of abuse enhance habit memory indirectly via modulation of other memory systems. This alternative mechanism invokes the hypothesis that in some learning situations, memory systems compete for control of learning and that by impairing the function of one memory system, function of another intact system might be enhanced (11, 66). Notably, the hippocampus and DLS might sometimes compete for control of learning, whereby lesion of the hippocampus enhances DLS-dependent memory function (5, 6, 67, 68). Competitive interactions can also be demonstrated in dual-solution tasks, when impairing one memory system results in the use of a strategy mediated by another intact system. For instance, animals given DMS lesions display DLS-dependent habitual responding for food reward in instrumental learning tasks (44).

Considering the competitive interactions that sometimes arise between memory systems, one possibility is that some drugs of abuse might enhance DLS-dependent habit memory indirectly by impairing cognitive memory mechanisms mediated by the DMS and hippocampus. As noted previously, alcohol is associated with greater use of DLS-dependent habit memory in maze and operant lever pressing paradigms (24, 61, 62, 64, 65). Evidence also indicates that alcohol impairs learning in hippocampus-dependent spatial memory tasks [(64, 69–72); for review, see Ref. (73)], as well as in DMS-dependent reversal learning tasks (74–77). Consistent with a competitive interaction between memory systems, it has been hypothesized that alcohol may facilitate DLS-dependent habit memory indirectly via impairing cognitive memory mechanisms (78).

It should be noted that aside from alcohol, numerous drugs have been associated with cognitive memory deficits. Exposure to morphine, heroin, methamphetamine, MDMA (ecstasy), or chronic cocaine similarly produces hippocampus-dependent spatial memory impairments across a variety of tasks (79–89). It is tempting to speculate that, as suggested for alcohol, cognitive memory impairments produced by addictive drugs might indirectly enhance DLS-dependent habit memory, and that this might be one mechanism allowing drug self-administration to become habitual in human drug abusers. On the other hand, it is also possible that spatial learning deficits produced by addictive drugs might occur indirectly via enhancement of DLS-dependent memory processes. Consistent with this hypothesis, stimulating CREB activity in the DLS impairs hippocampus-dependent spatial memory (90), whereas inhibition of CREB activity in the DLS reverses the spatial memory impairments produced by morphine (91).

### ROLE OF STRESS AND ANXIETY

An additional consideration regarding the multiple memory systems approach to drug addiction is the role of stress. Converging evidence indicates that robust emotional arousal facilitates DLS-dependent habit memory in rodents and humans [for reviews, see Ref. (9–12)]. Administration of anxiogenic drugs enhances DLS-dependent response learning in the water plus-maze (92–97). This enhancement of DLS-dependent habit memory is also observed following exposure to unconditioned behavioral stressors [e.g., chronic restraint, tail shock, predator odor, etc. (98–101)] and exposure to fear-conditioned stimuli [tone previously paired with shock (102, 103)]. Although originally demonstrated in rodents (92), this enhancement of habit memory induced by robust emotional arousal has also been demonstrated extensively in humans (99, 104–110).

The mechanisms allowing stress/anxiety to facilitate habit memory remain largely unknown; however, evidence indicates a critical modulatory role of the BLA (93–95, 100). Consistent with a competitive interaction between memory systems, some evidence also suggests that stress/anxiety might enhance DLS-dependent habit memory indirectly by impairing hippocampal function (94, 95).

Enhancement of habit memory following stress or anxiety may be relevant to understanding some prominent factors leading to drug abuse. Namely, stressful life events or chronic prolonged periods of stress/anxiety are associated with increased vulnerability to drug addiction and relapse in humans (111–117), and similar observations have been made in animal models of drug self-administration [for review, see Ref. (118)]. Investigators have suggested that consistent with the influence of emotional arousal on multiple memory systems (10), acute or chronic stress may enhance drug addiction and relapse in humans by engaging DLS-dependent habit memory processes (9, 49, 119). Consistent with this suggestion, stress in cocaine-dependent individuals is associated with decreased blood-oxygen-level-dependent (BOLD) activity in the hippocampus and increased activity in the dorsal striatum, and these BOLD activity changes are associated with stress-induced cocaine cravings (120).

### EMERGING SOURCES OF ADDICTION

Aside from drugs of abuse, the multiple memory systems hypothesis has also been recently employed for understanding other emerging sources of addiction. For instance, the rise in obesity over the past few decades has led to a comparable surge in experimental interest, with many investigators drawing parallels between drug addiction and overeating [for review, see Ref. (121–123)]. Some recent evidence has suggested that like drug addiction, food addiction might be partially attributed to heightened engagement of DLS-dependent habit memory. In rats, binge-like food consumption facilitates the shift from cognitive to habitual control of behavior (124, 125). Moreover, habitual behavior in bingeing animals is associated with increased DLS activity and may be prevented by blocking AMPA or dopamine D1 receptors in the DLS (125). Diet-induced obesity has also been recently associated with the use of habit memory in a Y-maze task (126).

Another emerging behavioral disorder that parallels some features of drug addiction is pathological video game playing or video game addiction [for review, see Ref. (127)]. Like drug addiction, long-term excessive video game playing has been associated with reduced dopamine D2 receptor binding in the dorsal striatum (128). Video game playing is also correlated with increased activation of the dorsal striatum (129, 130), and greater dorsal striatal volumes predict higher levels of video game skill (131). People who regularly play action video games are more likely to use dorsal striatum-dependent habit memory in a virtual maze (132), and pre-training video game playing leads to habitual responding over goal-directed responding in a two-stage decision-making task (133). Thus, as proposed for drugs of abuse, playing video games might enhance video game addiction via engaging the DLS-dependent habit memory system.

Finally, the multiple memory systems approach might also be useful for understanding marijuana addiction. Although marijuana may have lower abuse potential than other illicit substances classically considered within the context of drug addiction research (e.g., cocaine, morphine, heroin, etc.), heavy cannabis use can nevertheless promote drug dependence and withdrawal symptoms as observed with other drugs of abuse (134–137). It has recently been suggested that marijuana addiction might be partially attributed to increased engagement of DLS-dependent habit memory (138). Whereas acute cannabinoid exposure impairs DLS-dependent memory function (139, 140), repeated cannabinoid exposure leads to greater DLS-dependent habitual responding in an instrumental learning task (141). In addition, heavy cannabis users display greater activation of the dorsal striatum, relative to non-users, when performing a marijuana version of the implicit association task (142), and participants with a history of cannabis use are more likely to use dorsal striatum-dependent habit memory in the virtual maze (65).

Given the successful application of the memory systems approach to emerging sources of addiction, it is reasonable to hypothesize that multiple memory systems might also be implicated in other behavioral pathologies associated with addiction, such as compulsive shopping, Internet addiction, and sex addiction. Indeed, whether the memory systems approach might be useful for understanding pathological gambling has also received some attention (143, 144).

### CONCLUSION

Twenty years of experimental evidence has largely corroborated White's (1) multiple memory systems approach to drug addiction. Evidence indicates that the hippocampus mediates contextual control of drug self-administration, the DLS mediates S–R habitual responding for drug reinforcement, and the amygdala mediates conditioned drug seeking. In addition, subsequent research has led to additional insights regarding the multiple memory systems view of drug addiction, including the shift to habit memory, competition between memory systems, and the role of stress and anxiety.

Future research should attempt to integrate the memory systems approach with other theories of addiction, such as opponent motivational processes (145). It would also be useful to incorporate into the memory systems view additional features of addiction, such as drug dependence, tolerance, and withdrawal. Although the present review predominantly focused on the brain regions originally considered by White (i.e., the hippocampus, dorsal striatum, and amygdala), it should be noted that additional brain regions related to learning and memory have also been critically implicated in drug addiction and relapse, including the medial prefrontal cortex and nucleus accumbens [for review, see Ref. (13)]. Finally, although beyond the scope of the present review, it should be acknowledged that extensive evidence suggests that cellular and molecular changes in the midbrain dopaminergic system also contribute to addiction (146).

Although habit memories might be especially difficult to control, some evidence indicates that DLS-dependent memory, once acquired, can in some circumstances be suppressed (147) or even reversed (148, 149). Thus, it is possible that the pharmacological manipulations and behavioral procedures leading to the reversal or suppression of habit memory in animal models of learning might potentially be adapted to treat drug addiction and relapse in humans.

### AUTHOR CONTRIBUTIONS

JG and MP both contributed ideas and writing of the present mini-review.


water maze while sparing working memory. *Synapse* (2003) **48**(3):138–48. doi:10.1002/syn.10159


stress disorder. *Rev Neurosci* (2012) **23**(5–6):627–43. doi:10.1515/ revneuro-2012-0049


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Goodman and Packard. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Desensitizing Addiction: Using Eye Movements to Reduce the Intensity of Substance-Related Mental Imagery and Craving

#### *Marianne Littel\*, Marcel A. van den Hout and Iris M. Engelhard*

*Clinical Psychology, Utrecht University, Utrecht, Netherlands*

#### *Edited by:*

*Daniel Beracochea, Institut de Neurosciences Cognitives et Intégratives d'Aquitaine (INCIA), France*

#### *Reviewed by:*

*Giovanni Martinotti, University G. d'Annunzio, Italy; Kesong Hu, Cornell University, USA*

> *\*Correspondence: Marianne Littel m.littel@uu.nl*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 25 September 2015; Accepted: 25 January 2016; Published: 08 February 2016*

#### *Citation:*

*Littel M, van den Hout MA and Engelhard IM (2016) Desensitizing Addiction: Using Eye Movements to Reduce the Intensity of Substance-Related Mental Imagery and Craving. Front. Psychiatry 7:14. doi: 10.3389/fpsyt.2016.00014*

Eye movement desensitization and reprocessing (EMDR) is an effective treatment for posttraumatic stress disorder. During this treatment, patients recall traumatic memories while making horizontal eye movements (EM). Studies have shown that EM not only desensitize negative memories but also positive memories and imagined events. Substance use behavior and craving are maintained by maladaptive memory associations and visual imagery. Preliminary findings have indicated that these mental images can be desensitized by EMDR techniques. We conducted two proof-of-principle studies to investigate whether EM can reduce the sensory richness of substance-related mental representations and accompanying craving levels. We investigated the effects of EM on (1) vividness of food-related mental imagery and food craving in dieting and non-dieting students and (2) vividness of recent smoking-related memories and cigarette craving in daily smokers. In both experiments, participants recalled the images while making EM or keeping eyes stationary. Image vividness and emotionality, image-specific craving and general craving were measured before and after the intervention. As a behavioral outcome measure, participants in study 1 were offered a snack choice at the end of the experiment. Results of both experiments showed that image vividness and craving increased in the control condition but remained stable or decreased after the EM intervention. EM additionally reduced image emotionality (experiment 2) and affected behavior (experiment 1): participants in the EM condition were more inclined to choose healthy over unhealthy snack options. In conclusion, these data suggest that EM can be used to reduce intensity of substance-related imagery and craving. Although long-term effects are yet to be demonstrated, the current studies suggest that EM might be a useful technique in addiction treatment.

Keywords: EMDR, eye movements, addiction, food craving, cigarette craving, working memory taxation, mental imagery, addiction memory

### INTRODUCTION

Eye movement desensitization and reprocessing (EMDR) is a well-established, effective treatment for posttraumatic stress disorder [PTSD: (1, 2)]. During EMDR, patients recall their traumatic memories while making horizontal eye movements (EM). This decreases the sensory richness of the memories and makes them less emotionally intense. Interestingly, mounting research shows that EM can also decrease the vividness and emotionality of positively laden memories (3, 4), and images of possible future events (flash-forwards) (5–8). This suggests that EMDR might be suitable for the treatment of other types of psychopathology in which maladaptive memory and mental imagery play a role, including addictive disorders (9).

Addictive disorders are chronic and relapsing in nature and pose a widespread problem with great societal, economic, and personal costs. Relapse rates are extremely high, with more than 85% of individuals returning to substance use within 1 year after quitting (10). Over the past years, there has been little progress in identifying new, effective interventions, and relatively few existing interventions have been validated experimentally (11). The present studies were designed to provide proof-of-principle for the use of EMDR in the treatment of addiction. More specifically, it was examined whether making EM during the recall of substance-related images can reduce their vividness, emotionality, and ability to elicit craving, as well as general craving and substance-use behavior.

Eye movement desensitization and reprocessing was originally developed by Shapiro (12) to facilitate the cognitive processing of traumatic memories. In the basic EMDR protocol (13), the client is instructed to hold an unpleasant memory in mind, while EM are induced by having the client follow the side-to-side motion of the therapist's index finger. The client then reports current sensations, cognitions, and emotions, including the distress caused by the memory. Sets of EM are repeated until the client reports that the distress has been reduced to a minimal level. Then, the client is guided to practice a positive cognition to go with the memory. Multiple meta-analyses show that EMDR is effective in the treatment of PTSD (1, 2, 14). Practice guidelines now consider both cognitive behavior therapy (CBT) and EMDR to be treatments of choice. Importantly, a meta-analysis by Lee and Cuijpers (15) shows that the EM component of the therapy has significant additional value over and above repeated activation of the memory without EM. In addition, numerous lab studies [e.g., Ref. (16, 17); and also see Ref. (18) for an overview] show that autobiographical memories become less vivid and emotional after applying only the EM component of EMDR, as compared to memory recall only. Hence, EM seem important for EMDR to have its effects, but it is still unclear how this works.

A plausible explanation of the effects of EM is provided by the working memory (WM) theory. WM is a cognitive system for temporary storage and manipulation of information (19, 20) and has limited capacity. During EMDR, people simultaneously recall traumatic memories and make EM, two processes that have both been demonstrated to tax WM (17, 21). The subsequent competition for its limited capacity affects memory recall. Memories are processed in a more detached manner and become less vivid and emotional. This memory "blurring" does not only take place during or immediately after the intervention but also appears to have long-term effects [i.e., 1 day or week later; (16, 22)]. EMDR seems to exploit the fact that the retrieval of memories returns them to a labile state, during which they can be altered or updated (23, 24). After memory recall plus EM, less vivid, less emotional, and less detailed versions of memories are reconsolidated into long-term storage.

Evidence for the WM theory of EMDR is provided by many well-controlled lab studies. They show that simultaneous EM reduce memory vividness, but so do other dual WM tasks, such as mental arithmetic (7) or copying a complex drawing (22), compared to memory recall without a dual task. Furthermore, and as noted before, negative memories are affected by dual WM tasks, but so are other kinds of taxing mental images, including positive memories [e.g., Ref. (3–5)] and distressing images about possible future events (flash-forwards) (3, 5–8).

In addictive disorders, the retrieval of substance-related memories is crucial to the experience of craving, which is, in turn, a strong predictor of substance use maintenance and relapse (25–27). These substance-related memories include classically and instrumentally learned associations between cues and effects (e.g., the association between feeling stressed and smoking and between smoking and becoming relaxed). They also include episodic memories, such as memories of specific encounters with the substance (e.g., a great first use experience), memories of substance use consequences, and memories of loss of self-control and relapse (9, 28). Craving is often maintained and augmented by sensory imagery [e.g., imagining sight, smell, future use: (29, 30)]. Research shows that instructions to form mental images of substance use increase craving [e.g., Ref. (31, 32)], with more vivid imagery predicting higher craving intensity (31, 33–35).

Craving can be reduced by dual task procedures. Many studies have shown that engaging in non-substance-related imagery or visuospatial tasks while experiencing high craving levels reduces craving frequency and intensity [for overviews, see Ref. (36, 37)]. Concurrent cognitive activity therefore provides a valuable way of coping with the acute effects of craving and can be easily implemented in clinical practice [e.g., Ref. (38)]. When craving is experienced, one can engage in a dual task. However, this method requires substance-dependent persons to identify craving while it can still be controlled, whereas self-monitoring, self-evaluation, and cognitive control are often compromised in addiction (39). Furthermore, and in contrast to EMDR, this method is not designed to alter substance-related representations in memory storage, and long-term effects are not expected after one quits using it. To achieve prolonged craving reduction, specific instructions must be given to retrieve the images before engaging in the dual task. Only reactivated memories enter a labile state and are susceptible to alteration or disruption (23, 24).

Three studies so far have investigated the effects of visuospatial WM tasks (33, 40, 41) during *instructed* imagery of favorite foods in samples of healthy (non-preselected) students. All tasks significantly reduced the vividness of the food-related imagery and craving compared to a control condition. Although long-term effects were not measured, these studies provide first indications that concurrent tasks can degrade substance-related images.

Research on the effectiveness of the full EMDR procedure in addiction is limited. In most studies, EMDR predominantly focused on traumatic memories constituting comorbid PTSD and not on memory representations or sensory imagery constituting substance craving and dependence itself (42). The investigations of EMDR that did specifically target substance-related memories are clinical anecdotes or case reports [for a list, see Ref. (43)]. Although most of them describe positive results, some found mixed (44) or negative results (45). Only one controlled study has been published so far (46). In this study, thirty alcohol-dependent patients received either treatment as usual (TAU) along with two EMDR sessions or TAU only. Target memories were memories of specific instances of intense craving and relapse. Patients in the TAU + EMDR group showed a significant reduction in alcohol craving at both one and six months posttreatment, compared to patients receiving TAU only. In addition, fewer patients from the TAU + EMDR group relapsed. Unfortunately, the study has several limitations, including small sample sizes and multiple drop-outs on follow-up measures. Nonetheless, the results are encouraging for the application of EMDR targeting specific addiction memories, especially because the effects were obtained after only two sessions.

In order to determine whether EMDR can serve as a promising adjunct to current treatment options for addiction, more research is necessary, including well-controlled proof-of-principle studies showing that EM can desensitize addiction-relevant memory representations and imagery. In the present studies, the effects of EM on the vividness and emotionality of substance-related images and associated craving were investigated. Because craving is triggered by addiction memories and exacerbated by mental imagery, both were used as targets in each of the two studies. In the first study, EM targeted food-related *imagery* and food craving in healthy dieting and non-dieting participants. It extends the studies by Kemps et al. (33), McClelland et al. (40), and Steel et al. (41), by placing more emphasis on the retrieval or formation of food-related mental images before the dual task was introduced. Moreover, our study solely focused on the effects of EM as dual task to reduce craving. Furthermore, there were methodological differences, such as the use of a between-subjects design, which prevents possible carry-over effects of interventions on craving. The second study was concerned with smoking-related *memories* and cigarette craving and was conducted in smokers. Both studies employed the EMDR lab model [cf., Ref. (3, 5, 47)], in which half of the participants recalled a substance-related image while making EM (recall + EM), whereas the other half of the participants recalled the image while keeping eyes stationary (RO). Image vividness, emotionality, and craving were measured before (pretest) and after the intervention (posttest). We expected that recall + EM, relative to RO, would decrease image vividness, emotionality, and craving from pre- to posttest.

### STUDY 1: THE EFFECTS OF EM ON FOOD-RELATED IMAGERY AND FOOD CRAVING

The first study focused on craving for food. Although food craving is commonly experienced and plays a significant evolutionary role (48), it is associated with unfavorable outcomes, including high-calorie food consumption and body mass index (BMI) (49), binge eating (50), development of obesity (51), and having difficulty in maintaining a diet (52). Many lines of research demonstrate that parallels exist between drug and food cravings in neuroanatomy, neurochemistry, and learning (53–55), providing the rationale for study 1.

Dieting and non-dieting participants were instructed to actively imagine eating their favorite food. We compared the effects of recall + EM versus RO on the vividness and emotionality of these food-related images, as well as specific craving in response to these images and more general craving for their favorite food. Furthermore, we compared snack choice at the end of the task. It was expected that, compared to RO, recall + EM would decrease craving, vividness, and emotionality of the food-related imagery. We also expected healthier snack choices after EM than RO. Because dieters are trying to exert control over their food intake, they are likely to experience motivational conflict when they think of their favorite food (56). Therefore, we expected that food-related imagery would be more taxing for dieters, resulting in greater effects of the intervention in this group. Generalizability of effects was explored by comparing craving for two other favorite foods at pre- and posttest.

Both the present study and study 2 were approved by the local ethics committee of the Faculty of Behavioral and Social Sciences of Utrecht University. All participants provided written informed consent.

### Methods

#### Participants

Eighty-nine female students (*M* age = 21.5, SD = 2.2) participated in experiment 1. They were recruited *via* advertisements at Utrecht University, specifically calling for non-dieters and dieters. Dieters (*n* = 42) were eligible if they reported being on a diet with the goal of losing weight. They had been on the diet for 3.2 months (SD = 4.4) on average. Individuals with explicit knowledge of EMDR were excluded. Participants received either financial compensation or course credit for participation.

### Materials

#### Eye Movement Task

An EM task [cf., Ref. (3, 5)] was used to simulate the EM component of EMDR. Either a white dot was presented on a black screen, moving from side to side at 1 s per cycle, or a blank screen was presented. The moving dot and blank screens were displayed during four intervals of 24 s separated by 10 s breaks. Participants sat at a 50 cm distance from the computer screen. Participants recalled their food-related image while tracking the dot (recall + EM) or watching the blank screen (eyes stationary; RO).
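The timing of the task can be laid out explicitly. The following Python sketch builds the sequence of intervals and breaks from the parameters stated above (four 24 s intervals, 10 s breaks, 1 s per dot cycle). The function names `em_schedule` and `summarize` and the tuple layout are illustrative assumptions and are not taken from the experiment software.

```python
def em_schedule(n_intervals=4, interval_s=24, break_s=10):
    """Return (label, start_s, end_s) tuples for one EM or RO session.

    Defaults follow the parameters described in the text; the function
    itself is a hypothetical illustration.
    """
    schedule = []
    t = 0
    for i in range(n_intervals):
        schedule.append(("interval", t, t + interval_s))
        t += interval_s
        # Breaks separate intervals, so none follows the last interval.
        if i < n_intervals - 1:
            schedule.append(("break", t, t + break_s))
            t += break_s
    return schedule


def summarize(schedule, cycle_s=1):
    """Total stimulation time, total session time, and number of dot cycles."""
    stim = sum(end - start for label, start, end in schedule if label == "interval")
    total = schedule[-1][2]
    return {"stimulation_s": stim, "total_s": total, "dot_cycles": stim // cycle_s}


study1 = summarize(em_schedule())
print(study1)
```

Under these assumptions, the session comprises 96 s of recall with or without EM inside a 126 s window. The number of intervals is kept as a parameter so that longer variants of the task can be described with the same function.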

#### Visual Analog Scales

Before (pretest) and after (posttest) the EM task, participants recalled their food-related images and rated them on vividness using 10 cm Visual Analog Scales (VASs) ranging from 0 (not vivid) to 100 (very vivid), on emotionality using a VAS ranging from 0 (very unpleasant) to 100 (very pleasant), and on image-specific craving ("How strong is your urge to eat [target food] at this very moment") using a VAS ranging from 0 (no craving) to 100 (intense craving). The EM task and VASs were presented using OpenSesame v.0.27.1 (57).

#### General State Food Cravings Questionnaire

Current craving for the target food was assessed with the Dutch translation of the General State Food Cravings Questionnaire [G-FCQ-S: (58)]. This questionnaire consists of 15 items (e.g., "I know I'm going to keep on thinking about tasty [food] until I actually have it") that are scored on 5-point Likert scales, ranging from "I totally disagree" to "I totally agree." The reliability is excellent (Cronbach's α = 0.93). For the purpose of this study, the word "food" was replaced with the participants' favorite food.

#### General Trait Food Cravings Questionnaire

The General Trait Food Cravings Questionnaire [G-FCQ-T: (58)] was used to measure trait craving, i.e., the tendency to experience craving for food in general. It is composed of 21 questions (e.g., "I feel like I have food on my mind all the time"), which are scored on 6-point Likert scales. The Dutch translation has good validity and reliability (Cronbach's α = 0.90).

#### Behavioral Task

As a behavioral outcome measure of EM and RO interventions, participants' snack choice was measured. At the end of the experiment, all participants were offered an apple or a candy bar. They could pick one of these or refuse both. Choosing an apple and refusing a snack were considered healthy choices, whereas choosing the candy bar was considered an unhealthy choice.

### Procedure

Upon arrival, participants were screened for study eligibility. After signing informed consent, participants were asked several questions about their diet and reported their height and weight, in order to calculate their BMI, and filled out the G-FCQ-T. Then, participants were instructed to select three food items that they craved most at that specific moment. These were entered into the software, and intensity of craving for each food was assessed using on-screen VASs. Out of the three selected foods, participants then picked their favorite one, i.e., the food they craved most at that specific moment. This food became the target for the EM or RO intervention, whereas the other two foods did not (non-targets). First, participants filled out the G-FCQ-S, of which the word "food" was replaced with participants' target food. They were asked to vividly picture this food and imagine its taste and smell as if they were eating it right now. When the image was clear, they rated vividness and emotionality of this image and image-specific craving using VASs. Subsequently, the EM task started. Half of the participants recalled their image while making EM. The other half recalled their image while keeping eyes stationary (RO). Immediately after the recall + EM or RO intervention, target images were again scored on vividness, emotionality, and craving. The two non-target images were also scored on craving. Then, the G-FCQ-S was filled out for a second time. After finishing this questionnaire, participants proceeded to the behavioral task and were offered the choice between a candy bar and an apple. Participant assignment to recall + EM or RO was counterbalanced. At the end of the experiment, participants were debriefed and given their reward.

### Design and Statistical Analyses

A 2 × 2 × 2 crossed design was used with group (2; dieters, non-dieters) and condition (2; recall + EM, RO) as between-subjects factors and time (2; pretest, posttest) as within-subjects factor.

Five 2 (group) × 2 (condition) × 2 (time) mixed model ANOVAs were conducted to assess whether food-related image vividness, emotionality, and craving VAS scores, G-FCQ-S scores, and non-target craving scores were more reduced after recall + EM than after RO. A Chi-square goodness of fit test was performed to assess whether the healthy snack option would be selected more frequently than would be expected by chance after EM [cf., Ref. (59)]. An alpha level of 0.05 was used for all statistical tests. When the direction of the differences was predicted, one-tailed *p*-values are reported.

### Results

Dieters had significantly higher BMIs (*M* = 23.7, SD = 4.1) than non-dieters (*M* = 21.4, SD = 2.4), *t*(87) = 3.12, *p* < 0.01, and showed greater trait craving (*M* = 72.2, SD = 11.8) than non-dieters (*M* = 64.6, SD = 11.8), *t*(87) = 3.06, *p* < 0.01, indicating that two distinct groups were recruited.

#### Vividness VAS

Findings are graphically depicted in **Figure 1**. There were no main effects of Time, *F*(1,85) = 0.00, *p* = 1, Condition, *F*(1,85) = 0.75, *p* = 0.39, or Group, *F*(1,85) = 1.09, *p* = 0.30. The crucial Condition × Time interaction was significant, *F*(1,85) = 4.01, *p* = 0.05, η<sup>2</sup> = 0.05. For the RO condition, a significant increase was observed between the pre- and posttest vividness scores, *t*(43) = 1.69, *p* = 0.05, *d* = 0.52. For EM, there was a non-significant trend toward a decrease instead, *t*(44) = 1.33, *p* = 0.10, *d* = 0.41. The Condition × Time interaction effect was not moderated by dieting Group, *F*(1,85) = 0.68, *p* = 0.41.
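The effect size reported for the interaction can be related to its *F* ratio. The sketch below assumes that the reported η<sup>2</sup> is partial eta squared, which the text does not state explicitly; under that assumption it can be recovered from *F* and its degrees of freedom. The function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Condition x Time interaction for vividness, as reported above: F(1,85) = 4.01
eta_p2 = partial_eta_squared(4.01, 1, 85)
print(round(eta_p2, 2))  # prints 0.05
```

The result agrees with the reported value of 0.05, which is consistent with (though not proof of) the assumption that partial eta squared was used.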

#### Emotionality VAS

There were no significant main or interaction effects, all *F*'s < 2.36, all *p*'s > 0.13.

#### Craving VAS Target Food

There were no significant main effects of Time, *F*(1,85) = 0.00, *p* = 0.96; Condition, *F*(1,85) = 0.63, *p* = 0.43; or Group, *F*(1,85) = 2.44, *p* = 0.12. However, the crucial Condition × Time interaction was significant, *F*(1,85) = 4.14, *p* = 0.05, η<sup>2</sup> = 0.05. Paired sample *t*-tests showed that there was a trend for increasing pre- to posttest craving scores in the RO condition, *t*(43) = 1.38, *p* = 0.09, *d* = 0.42, whereas for recall with EM, craving scores dropped significantly from pre- to posttest, *t*(44) = 1.66, *p* = 0.05, *d* = 0.51.

There was a trend for a Condition × Time × Group interaction, *F*(1,85) = 3.32, *p* = 0.07, η<sup>2</sup> = 0.04. The non-dieting group showed a pre- to posttest increase in craving in the RO condition, *t*(22) = 2.35, *p* = 0.01, *d* = 0.70, and a craving decrease in the EM condition, *t*(23) = 1.90, *p* = 0.04, *d* = 0.57. Craving did not increase or decrease in response to RO or EM in the dieting group, all *t*'s < 0.16, all *p*'s > 0.44.

#### Craving VAS Non-Target Food

A significant main effect of Time was observed, *F*(1,85) = 31.20, *p* < 0.001, indicating a decrease of craving for non-preferred, non-targeted foods over time. Overall, dieters showed significantly less craving in response to their non-preferred foods, *F*(1,85) = 3.99, *p* < 0.05. No other significant effects were found, all *F*'s < 0.36, all *p*'s > 0.55.

#### G-FCQ-State

There was a trend toward a main effect of Time, *F*(1,85) = 2.97, *p* = 0.09, indicating a slight increase of state food craving over time across groups. There were no significant main effects for Condition, *F*(1,85) = 0.76, *p* = 0.58; or Group, *F*(1,85) = 0.09, *p* = 0.76. The crucial Condition × Time interaction was significant, *F*(1,85) = 4.15, *p* = 0.05, η<sup>2</sup> = 0.05. Paired sample *t*-test showed that for the RO condition, G-FCQ-S scores significantly increased from pre- to posttest, *t*(43) = 3.31, *p* < 0.01, *d* = 0.71, whereas in the EM condition G-FCQ-S scores remained stable over time, *t*(44) = 0.18, *p* = 0.43, *d* = 0.04. There was no significant Condition × Time × Group interaction, *F*(1,85) = 0.01, *p* = 0.91.

#### Snack Choice

Results are shown in **Figure 2**. After RO, the frequency of healthy snack choices did not differ from chance, χ<sup>2</sup> (1) = 0.36, *p* = 0.55. However, after recall + EM, the healthy snack option was more frequently chosen than would be expected by chance alone, χ<sup>2</sup> (1) = 3.76, *p* = 0.05.
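The goodness-of-fit logic can be sketched in a few lines of Python using only the standard library. The example assumes that "chance" means an equal split between healthy and unhealthy choices, which the text does not spell out, and the counts are made up for illustration; they are not the study's data. With one degree of freedom, the *p*-value follows directly from the complementary error function, so the reported statistics can also be checked.

```python
import math


def chi_square_gof(observed, expected):
    """Pearson chi-square goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With df = 1, chi2 is the square of a standard normal deviate,
    # so the tail probability reduces to the complementary error function.
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts (NOT the study's data): healthy vs. unhealthy choices
counts = [30, 15]
chance = [sum(counts) / 2] * 2  # assumed 50/50 expectation
stat = chi_square_gof(counts, chance)
print(stat, p_value_df1(stat))

# p-values implied by the chi-square statistics reported above
print(p_value_df1(0.36))  # RO condition
print(p_value_df1(3.76))  # recall + EM condition
```

The values computed from the reported statistics round to the *p*-values given in the text (0.55 and 0.05, respectively).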

### Discussion

A brief session of EM significantly reduced craving evoked by food-related images compared to a control condition in which no EM were made. This effect was most pronounced in non-dieting participants. In addition, there was a trend for recall + EM to decrease food image vividness, whereas it increased after recalling the image without making EM. General craving for the selected food (G-FCQ-S) did not decrease after recall + EM, but remained stable over time. Note that general craving for food increased after RO, which can be expected due to the passage of time (60) and repeated craving imagery (31, 32). Accordingly, one might argue that making EM during recall attenuates craving. Given that EM were applied only briefly (4 × 24 s), we consider this a clinically relevant result, especially because the G-FCQ-S is a broad measure that incorporates items that do not specifically refer to the preferred food (e.g., "I'm hungry"). Finally, a brief session of EM during food-related imagery affected subsequent snack choice; participants in the EM condition chose the healthier options more often than expected by chance, whereas participants in the RO condition did not.

### STUDY 2: THE EFFECTS OF EM ON SMOKING-RELATED MEMORIES AND CIGARETTE CRAVING

In study 2, we compared the effects of recall + EM versus RO on the vividness and emotionality of smoking-related memories, memory-related cigarette craving, and general cigarette craving in daily smokers. In contrast to study 1, RO and EM interventions targeted memories<sup>1</sup> instead of mental images formed in the lab. Moreover, we presented EM in six sets of 24 s instead of four in order to increase WM taxation [cf., Ref. (8)], and we used a small craving manipulation at the start of the experiment in order to increase craving. Furthermore, because cravings are emotionally ambivalent and likely to involve both positive and negative affect (61), we changed the positive endpoint of the emotionality scale to a neutral one (see Methods). We expected that, compared to RO, recall + EM would decrease craving and the vividness and emotionality of smoking-related memories.

### Methods

#### Participants

Fifty smokers (*M* age = 23.4, SD = 6.6, 58% females, 42% males) participated in experiment 2. They were recruited *via* advertisements at Utrecht University and word-of-mouth and were eligible if they smoked at least five cigarettes per day for 7 days per week. On average, they smoked 10.4 cigarettes per day (SD = 5.8) and had smoked for 6.5 years (SD = 6.5). Their mean nicotine dependence level, as measured with the Fagerström test for nicotine dependence (FTND), was 2.0 (SD = 1.9), which can be considered low. They had not smoked for 4.2 h (SD = 4.8) prior to the experiment. Participants received either financial compensation or course credit for participation.

#### *EM Task*

The EM task was similar to the task used in experiment 1, except that we presented horizontally moving white dots or blank screens during six intervals of 24 s. Participants recalled the image of their smoking-related memory while either tracking the dot or watching the blank screen.

#### *Visual Analog Scales*

Similar to experiment 1, participants rated their smoking-related memories on vividness and memory-specific craving using 10 cm VASs ranging from 0 (not vivid/no craving) to 100 (very vivid/ intense craving) before (pretest) and after (posttest) the EM task. Emotionality was now measured on a 10-cm VAS ranging from not emotional to very emotional.

#### *Fagerström Test for Nicotine Dependence*

Nicotine dependence levels were measured with the Dutch translation of the FTND (62, 63). The FTND is composed of six items, has good reliability, and correlates significantly with number of cigarettes smoked per day.

#### *QSU-Brief*

Upon arrival and at pre- and posttest, cigarette craving was measured with the 10-item brief Questionnaire on Smoking Urges [QSU-brief: (64)]. This questionnaire is scored on a 7-point Likert scale and contains items like "All I want right now is a cigarette" and "I am going to smoke as soon as possible." The Dutch translation was used, which has adequate psychometric properties (65).

### Procedure

Participants were instructed to refrain from smoking for at least 1 h prior to the experiment. As an incentive, participants were told that this would be checked with a breath analyzer. Upon arrival, participants were screened for study eligibility and subjected to a non-invasive CO ppm estimate utilizing the Bedfont piCO simple Smokerlyzer (Bedfont Scientific, Harrietsham, England, 2011; *M* = 14.6 CO ppm, SD = 9.7). After providing informed consent, participants recalled a recent memory of a specific situation or an emotional state<sup>2</sup> in which they experienced craving and smoked a cigarette, for example, a get-together with friends in a bar, or feelings of stress. In line with the Dutch EMDR protocol (66), they were asked to "play" these memories in their minds and make a "screen shot" of the most vivid moment. They had to write down keywords of the resulting image. Participants then sat down behind the computer and filled out on-screen questions about demographics and smoking history, the FTND, and the QSU-brief. Then, they underwent a simple craving induction procedure, in which five smoking-related pictures were shown of people smoking or holding cigarettes and inhaling or exhaling cigarette smoke. These pictures were presented full-screen for 5 s. Afterwards, the QSU-brief was administered for a second time. Then, keywords of the selected smoking-related image were entered into the software, and participants were instructed to recall their specific memory for 10 s and rate it on vividness, emotionality, and craving. Next, the EM task started. Half of the participants recalled their memory while making EM. The other half recalled their memory while keeping eyes stationary (RO). Immediately after the recall + EM or RO intervention, memories were again scored on vividness, emotionality, and craving. Then, the QSU-brief was filled out for a third time. Participant assignment to EM or RO was counterbalanced. At the end of the experiment, participants were debriefed and given their reward.

<sup>1</sup>More specifically, EM targeted images of memories, cf., the EMDR protocol. See Section "Procedure" for more detailed information. To avoid confusion with the images formed in study 1, we will describe the images of study 2 as "memories."

<sup>2</sup>An exploratory differentiation was made between the two types of memories, but no significant differences were observed. For the sake of comprehensiveness, we confine ourselves to the variables of primary concern.

### Design and Statistical Analyses

A 2 × 2 crossed design was used with condition (2; recall + EM, RO) as between-subjects factor and time (2; pretest, posttest) as within-subjects factor.

Four 2 (condition) × 2 (time) mixed model ANOVAs were conducted to assess whether smoking-related image vividness, emotionality, and craving VAS scores, and QSU-brief scores decreased more after recall + EM than after RO. An alpha level of 0.05 was used for all statistical tests. When the direction of the differences was predicted, one-tailed *p* values are reported.
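As a minimal sketch of the design described above: with only two time points, the Condition × Time interaction of a 2 × 2 mixed ANOVA is equivalent to a pooled-variance independent-samples *t*-test on the posttest minus pretest change scores, with *F*(1, *N* − 2) = *t*². The scores and the helper name `interaction_f` below are hypothetical illustrations, not the study's data or analysis script.

```python
from math import sqrt
from statistics import mean, variance


def interaction_f(change_a, change_b):
    """Return (F, df_error) for the Condition x Time interaction.

    change_a / change_b hold posttest - pretest change scores per group.
    The interaction F equals the squared pooled-variance t statistic
    comparing the two groups' mean change scores.
    """
    n_a, n_b = len(change_a), len(change_b)
    df_error = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(change_a)
                  + (n_b - 1) * variance(change_b)) / df_error
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(change_a) - mean(change_b)) / se
    return t ** 2, df_error


# Hypothetical VAS change scores (posttest - pretest), NOT the study's data.
em_change = [-12.0, -5.0, 3.0, -8.0, 0.0, -10.0]
ro_change = [6.0, 10.0, -2.0, 8.0, 4.0, 12.0]

f_value, df_error = interaction_f(em_change, ro_change)
print(f"F(1,{df_error}) = {f_value:.2f}")
```

Working with change scores keeps the sketch dependency-free; a full analysis would normally use a statistics package that also reports the main effects.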

### Results

The craving manipulation caused a significant increase in craving, *t*(49) = 5.37, *p* < 0.001.

#### Vividness VAS

There were no significant main effects of Time, *F*(1,48) = 0.06, *p* = 0.81 or Condition, *F*(1,48) = 1.22, *p* = 0.28. The crucial Condition × Time interaction was significant, *F*(1,46) = 4.76, *p* = 0.03, η<sup>2</sup> = 0.09. Paired-samples *t*-tests showed that for the RO condition, vividness scores significantly increased from pre- to posttest, *t*(27) = 1.85, *p* = 0.04, *d* = 0.71, whereas in the EM condition vividness scores remained stable over time, *t*(21) = 1.28, *p* = 0.11, *d* = 0.56.

#### Emotionality VAS

For memory emotionality, there were no significant main effects of Time, *F*(1,48) = 0.79, *p* = 0.38 or Condition, *F*(1,48) = 1.85, *p* = 0.18. The crucial Condition × Time interaction showed a non-significant trend, *F*(1,48) = 2.80, *p* = 0.10, η<sup>2</sup> = 0.06. In the RO condition, there were no significant differences between pre- and posttest emotionality scores, *t*(27) = 0.56, *p* = 0.29, *d* = 0.22. For EM, the pre- to posttest emotionality scores showed a significant decrease, *t*(21) = 1.87, *p* = 0.04, *d* = 0.82.

#### Craving VAS

There were no significant main effects of Time, *F*(1,48) = 1.62, *p* = 0.21 or Condition, *F*(1,48) = 0.79, *p* = 0.38. However, a significant Condition × Time interaction was observed, *F*(1,48) = 4.19, *p* = 0.05, η<sup>2</sup> = 0.08. Craving scores significantly increased in the RO condition, *t*(27) = 2.32, *p* = 0.01, *d* = 0.89, whereas for recall + EM, craving scores remained constant over time, *t*(21) = 0.59, *p* = 0.28, *d* = 0.26.

#### QSU-Brief

There were no significant main or interaction effects, all *F*'s < 0.24, all *p*'s > 0.24.
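The effect sizes reported in this section can be checked against two common conversion formulas. This is a sketch under an assumption: the text does not state how *d* and η<sup>2</sup> were computed, but the reported values are consistent with *d* = 2*t*/√df and partial η<sup>2</sup> = *F*/(*F* + df<sub>error</sub>). The function names are hypothetical, and the degrees of freedom are used exactly as reported.

```python
from math import sqrt


def cohens_d_from_t(t, df):
    """Cohen's d via d = 2t / sqrt(df) (assumed conversion, see lead-in)."""
    return 2 * t / sqrt(df)


def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)


# (label, t, df, reported d), copied from the Results text.
t_tests = [
    ("Vividness, RO", 1.85, 27, 0.71),
    ("Vividness, EM", 1.28, 21, 0.56),
    ("Emotionality, RO", 0.56, 27, 0.22),
    ("Emotionality, EM", 1.87, 21, 0.82),
    ("Craving, RO", 2.32, 27, 0.89),
    ("Craving, EM", 0.59, 21, 0.26),
]

# (label, F, df_effect, df_error, reported eta squared), as reported.
interactions = [
    ("Vividness", 4.76, 1, 46, 0.09),
    ("Emotionality", 2.80, 1, 48, 0.06),
    ("Craving", 4.19, 1, 48, 0.08),
]

for label, t, df, reported in t_tests:
    print(f"{label}: d = {cohens_d_from_t(t, df):.2f} (reported {reported})")

for label, f, df1, df2, reported in interactions:
    eta = partial_eta_squared(f, df1, df2)
    print(f"{label}: eta^2 = {eta:.2f} (reported {reported})")
```

Agreement to two decimals supports, but does not prove, that these conversions were the ones used.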

### Discussion

When WM was not taxed during smoking-related memory recall, both memory vividness and memory-evoked craving increased, which is to be expected due to the passage of time (60) and repeated substance-related imagery (31, 32). Because these significant increases were not observed in the recall + EM condition, it might be concluded that image vividness and craving were attenuated by recall + EM. In addition, there was a trend for recall + EM to decrease the emotional intensity of smoking-related images compared to RO.

### GENERAL DISCUSSION

Results of the current studies indicate that brief sets of EM during the recall of substance-related images can decrease (study 1) or attenuate (study 2) the craving that is specifically evoked by these images, can attenuate general craving (study 1), can decrease (study 1) or attenuate (study 2) substance image vividness, can decrease image emotionality (study 2), and affect subsequent behavioral choices (study 1), compared to a control condition of substance-related imagery or memory retrieval without EM.

These results are in line with previous studies where EM significantly decreased the vividness and emotionality of autobiographical memories and flash-forwards (18), with three earlier studies in which visual–spatial tasks during food-related imagery decreased image vividness and craving (33, 40, 41), and one RCT among alcohol-dependent patients where two sessions of EMDR in addition to TAU reduced alcohol craving and relapse (46). We extended these studies by applying the methodologically sound EMDR lab model to investigate the effects of EM on substance-related mental images, with special emphasis on the reactivation of these images prior to the EM task [cf., the EMDR protocol (66)]. In addition, we investigated effects of EM on both sensory imagery and substance-related memory representations, we used a different range of outcome measures, including a behavioral measure in study 1, and we were the first to test effects of recall + EM in smokers.

Both in study 1 and 2, intervention effects were partially driven by vividness and craving increments in the RO condition (see also **Figures 1** and **3**). However, observing post-intervention dissociations between EM and RO after only four or six sessions of 24 s, while craving naturalistically increases during abstinence (60), and even more after active imagery (32, 34), is still clinically relevant. Because craving increases over time and/or due to imagery, one might assume that *if* craving were increased to a maximum level prior to the experiment, an EM intervention would reduce craving. In the second study, we used a small craving induction procedure. Although craving significantly increased, the mean QSU-brief score after the craving induction was still only 3.3 (SD = 1.5), which is, on a scale from 1 to 7, definitely not maximal. Future studies should employ more thorough craving induction procedures to maximize craving levels at the start of the experiment.

Not all follow-up tests reached statistical significance. This might be explained by the fact that participants did not select the most suitable target images and memories. In study 1, not all participants selected foods that are typically craved (67, 68). In fact, only 47.2% selected high-caloric, sweet, or fatty foods (e.g., chocolate, cookies, fries, etc.). The other participants selected fruit, vegetables, and lunch or dinner meals. These more "neutral" foods are probably less sensitive to the EM intervention (69). In study 2, participants selected memories of specific, recent situations, and emotional states during which they experienced craving and smoked a cigarette. However, at pretest, image-specific craving scores were only 43.0 (SD = 30.8) on a scale ranging from 0 to 100. These relatively low pre-intervention craving scores might have prevented more substantial effects of EM and might explain why effects did not generalize to the more general craving measure (QSU-brief) at the end of the session. Also, because of these low craving scores, it is unclear how relevant these recent smoking memories actually are to smoking dependence. Memories of craving instances further back in the past might have a larger impact on current craving and smoking behavior. In future studies, an effort should be made to find out what specific memories contribute to current craving in each participant, and these memories should be targeted during the intervention. Other mental representations might serve as suitable targets as well, such as trigger situations that someone is confronted with daily or weekly (e.g., waiting at a bus stop), associations of substance use with extremely pleasurable memories (e.g., first shot), or memories of prior relapse and loss of control (43).

In contrast to our hypothesis, dieting participants did not exhibit larger decreases in craving or image vividness after EM. This is in line with results from Kemps et al. (33) showing that watching dynamic visual noise during food imagery reduced image vividness and craving in both dieting participants and non-dieting controls. In the present study, however, there was a trend for dieters to show reduced intervention effects compared to non-dieters. This unpredicted finding might, however, be explained by their selection of food-related images: 33.3% chose a fruit or vegetable as their target food, compared to 12.8% in the non-dieter group. As noted before, these foods are not typically craved, and more "neutral" targets have been observed to be less sensitive to recall + EM (69).

Furthermore, no spontaneous generalization effects were found for recall + EM on craving for non-recalled favorite foods, indicating that making EM during the imagination of one favorite food does not simply cause any other favorite food to become less desired. However, this finding might be explained by current methodology: non-targets were not explicitly retrieved prior to craving scorings, which might have prevented elaboration upon craving-related thoughts. Pretest craving scores were indeed lower for non-target foods (*M* = 66.9, SD = 15.2) than for target foods (*M* = 80.7, SD = 14.0).

In sum, despite non-maximal craving levels at the start of the experiment, and despite the fact that suboptimal target images were selected, we found significant effects of EM on the sensory richness of substance-related memories and imagery, associated craving, and subsequent behavior in two non-clinical samples. In line with previous studies, these data suggest that EM can be used as a coping skill to temporarily reduce the intensity of craving. It remains to be investigated if EM can definitively alter substance-related memory and serve to reduce the occurrence or intensity of future cravings, without simultaneous taxing of WM. However, as noted before, several studies that adopted a similar design, i.e., the EMDR lab model, did observe effects that lasted over time [e.g., Ref. (22)].

It seems implausible that very short recall + EM interventions of the type used here will result in therapeutic effects. Note that in EMDR for PTSD, a series of sessions lasting 1 h or more are used to reduce the intensity and occurrence of trauma-related flashbacks outside the clinic. It would be fascinating to test if the full EMDR procedure for food or drug craving may decrease craving in the long run and reduce relapse rates. However, it should first be established which images should best be targeted (memories, imagery, associations, cues, etc.), whether the effects are observed for all facets of craving [reward, relief, obsessive craving, see Ref. (70)], whether the effects of EM generalize to actual substance use behavior, and whether they generalize to people trying to control or quit their substance use, i.e., the eventual target group for the EMDR intervention.

### AUTHOR CONTRIBUTIONS

ML: designed the experiments and supervised students during data collection, analyzed the data, and wrote the paper. MH and IE: provided valuable feedback.

### ACKNOWLEDGMENTS

This study was supported with a TOP-grant (40-00812-98-12030) from the Netherlands Organization for Health Research and Development (ZonMw) awarded to MH. IE is supported with a Vidi grant (452-08-015) from the Netherlands Organization for Scientific Research (NWO). We thank Helena van Hove and Julia van Alphen for their assistance with data collection for study 1, and Charlotte Woertman, Ditta Zwegers, Kelly Zuijdervliet, Minta van Hall, Nienke Feis, Renée Grevers, Wouter van Hattum, and Wouter Tielemans for their assistance with data collection for study 2.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Littel, van den Hout and Engelhard. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Neuropsychological Consequences of Chronic Drug Use: Relevance to Treatment Approaches

*Jean Lud Cadet<sup>1</sup>\* and Veronica Bisagno<sup>2</sup>\**

*<sup>1</sup>National Institute on Drug Abuse Intramural Program, Molecular Neuropsychiatry Research Branch, Baltimore, MD, USA, <sup>2</sup>Instituto de Investigaciones Farmacológicas (ININFA), Universidad de Buenos Aires-CONICET, Buenos Aires, Argentina*

Heavy use of drugs impacts the daily activities of the individuals who use them. Several groups of investigators have indeed documented changes in cognitive performance by individuals who have a long history of chronic drug use. In the case of marijuana, a wealth of information suggests that heavy long-term use of the drug may have neurobehavioral consequences in some individuals. In humans, heavy cocaine use is accompanied by neuropathological changes that might serve as substrates for cognitive dysfunctions. Similarly, methamphetamine users suffer from cognitive abnormalities that may be consequent to alterations in brain structures and functions. Here, we detail the evidence for these neuropsychological consequences. The review suggests that improving the care of our patients will necessarily depend on the better characterization of drug-induced cognitive phenotypes because they might inform the development of better pharmacological and behavioral interventions, with the goal of improving cognitive functions in these subsets of drug users.

Keywords: marijuana, cocaine, methamphetamine, frontal cortex, cognition

#### *Edited by:*

*Vincent David, Centre National de la Recherche Scientifique, France*

#### *Reviewed by:*

*Aviv M. Weinstein, University of Ariel, Israel; Hedy Kober, Yale University, USA*

#### *\*Correspondence:*

*Jean Lud Cadet jcadet@intra.nida.nih.gov; Veronica Bisagno vbisagno@ffyb.uba.ar*

#### *Specialty section:*

*This article was submitted to Addictive Disorders, a section of the journal Frontiers in Psychiatry*

*Received: 19 April 2015; Accepted: 27 December 2015; Published: 15 January 2016*

#### *Citation:*

*Cadet JL and Bisagno V (2016) Neuropsychological Consequences of Chronic Drug Use: Relevance to Treatment Approaches. Front. Psychiatry 6:189. doi: 10.3389/fpsyt.2015.00189*

## INTRODUCTION

Substance use disorders continue to be a major health concern worldwide. Chronic use of various drugs can impact brain structures and functions (1, 2). Use of these drugs may also be associated with both acute and chronic neuropsychological abnormalities (3). The present review summarizes some of the evidence documenting cognitive changes reported in drug users [with a focus on marijuana, cocaine, and methamphetamine (METH)]. We also discuss potential biological substrates for these observations. The neuropathological changes associated with the use of larger quantities of some of these drugs have been recently reviewed (1). In addition to having differential abuse liability, the use of some of these substances is also associated with differential pathoanatomic changes in the brain (1). There is also evidence that a history of substance use may exacerbate pre-existing neuropsychological deficits (4) and comorbid neurological or psychiatric disorders (3). It is also clear that substance-related changes in neuropsychological functions may negatively impact activities of daily living, including the ability to manage finances and/or hold on to jobs (5). A meta-analysis of cognition in METH users revealed small-to-medium effect sizes for the association between neurocognitive impairment and employment (6). Cognitive domains associated with employment status included executive function, learning and memory, attention, and general intellectual ability (6). In the present review, we will discuss alterations that are linked to psychological and neural mechanisms that detect error signals and generate suitable behavioral responses (7). Also discussed is the accumulated evidence of poor learning and memory, diminished executive functions, and risky decision-making in some individuals with a history of heavy drug use (8–11).

### MARIJUANA USE

Marijuana is the most commonly used illicit substance (12). Investigations of cognitive functions in heavy marijuana users have recently documented poor performance in a number of cognitive subdomains. Some of these deficits appear to be related to frequency of drug use and can impact activities of daily living.

### Neuropsychological Findings

Adult marijuana users suffer from changes measured in broad cognitive domains (13, 14). These include memory (9, 13, 14, 15), attention (16), decision-making (17), and psychomotor speed (9, 18). Bolla et al. (9) reported that impairments observed in marijuana users could be measured in heavy users even after 28 days of forced abstinence during their stay on a closed research unit, with light use of marijuana not being associated with any significant decrements in performance (9). In a recent study, Colizzi et al. (19) studied whether functional variations in the cannabinoid receptor 1 (CNR1) gene and marijuana exposure interact to modulate prefrontal functions and related behaviors. The authors suggested that deleterious effects of marijuana use may be more evident in individuals with specific genetic backgrounds that might impact receptor expression (19). Additionally, it is important to note that, even if marijuana use during early adulthood is associated with cognitive impairments in selected domains, prolonged abstinence may promote improvement in performance (13, 14, 20). These data are summarized in **Table 1**.

Functional imaging studies comparing activation in both adult and adolescent chronic marijuana users to healthy controls during the performance of different cognitive tasks have reported that chronic marijuana users showed altered patterns of brain activity [Ref. (31–38), see **Table 2**]. There is also evidence to suggest that heavy marijuana use may produce deficits on measures of decision-making and inhibitory control that persist for long periods of time (27). Among recreational marijuana users, lack of inhibitory control depends on contextual or situational factors, with loss of control being evident only when situations or tasks involve a motivational component (27). Also, poorer cognitive performance in areas of risk-taking, decision-making, and episodic memory may influence the degree to which marijuana users engage in risky behaviors with negative health consequences (39). In addition, it has been reported that the main active ingredient in marijuana, delta-9 tetrahydrocannabinol (THC), can alter time perception by impairing time estimation and production in the seconds range (30). Temporal processing changes may have functional consequences because temporal processing is relevant to many everyday tasks, including driving (30).

Interestingly, although much more in-depth research remains to be done on this controversial issue, marijuana use during adolescence has been reported to increase the risk of developing psychotic disorders later in life (40). THC was also reported to induce acute psychotic symptoms in healthy individuals (41) and to increase the risk of psychotic disorders after long-term use (42). A recent study by Bhattacharyya et al. (43) reported a significant relationship between the effects of THC on striatal activation, its effects on task performance, and appearance of positive psychotic symptoms, suggesting that THC might induce psychosis by influencing the neural substrate of attentional salience processing (43). Although more research is needed on this subject, there are plausible biochemical pathways that marijuana can impact to induce psychotic responses in some individuals. Specifically, the endocannabinoid system consists of cannabinoid receptors and endogenous cannabinoid ligands that interact with these receptors to impact the release of several neurotransmitters, including GABA, glutamate, and dopamine (44, 45). Therefore, it seems possible that exposure to marijuana-based psychoactive substances during adolescence could negatively impact glutamatergic and GABAergic systems, with subsequent alterations of maturation processes of these systems, resulting in psychosis-like phenomena (46). The appearance of psychiatric disturbances might also depend on the exact dose, time windows during adolescence, and/or duration of drug exposure (24, 28, 40). Interestingly, hair analyses also revealed that marijuana users with high THC concentration were more likely to exhibit schizophrenia-like symptoms (47, 48). Some of the neuroimaging and cognitive changes reported in marijuana users appear to be moderated by gender (24, 49). These findings highlight potential THC-induced neuroadaptations in the adolescent brain and support the importance of prevention and treatment of adolescent users (28). Nevertheless, this topic needs to be further investigated before any firm conclusion can be reached concerning the relationship of THC to psychosis and other psychiatric diseases.

↓*, Cognitive deficits.*

*ACC, anterior cingulate cortex; DLPC, dorsal lateral prefrontal cortex; PFC, prefrontal cortex;* ↓*, decreased brain activation;* ↑*, increased brain activation;* ↓*, cognitive deficits.*

### COCAINE USE

Although cocaine is a highly addictive agent, the vast majority of cocaine users use it recreationally over extended periods of time without developing dependence (50). Thus, documenting the potential cognitive effects of cocaine is an important public health issue because of its high prevalence in the general population. Recent neurobehavioral studies have shown that heavy cocaine users show a number of cognitive decrements that may be secondary to cocaine-induced changes in brain structure and function (1). These cognitive deficits are detailed below.

### Neuropsychological Findings

Heavy cocaine use is associated with decrements in performance in several cognitive domains [Ref. (51), detailed in **Table 3**]. These include problems in executive function, decision-making, increased impulsivity, abnormal visuoperception, abnormal psychomotor speed, impaired manual dexterity, poor verbal learning, and decrements in memory functions (8, 52–58). Additionally, cocaine users showed different patterns of brain activation while performing cognitive tasks [Ref. (59–67), see **Table 4**]. Chronic cocaine users show poor insight and judgment, lack foresight, and are also disinhibited (68). These cognitive changes are probably related to dysfunctions in the prefrontal cortex (69) since patients who suffer damage in this brain region manifest similar cognitive problems (70). This suggestion is supported by neuroimaging studies demonstrating hypofrontality in cocaine users performing tasks of attention and executive function (62, 71). From this perspective, a core deficit in executive functions, such as context processing, might contribute to the well-documented impairments in top-down control that are commonly associated with heavy cocaine use (72). In addition to those observations in chronic heavy cocaine users, subtle cognitive deficits have been reported in non-dependent, recreational cocaine users (50, 73–76).

There is compelling evidence to suggest that cocaine-associated impairments in cognitive functioning might be secondary to cocaine-induced dysfunctions in dopaminergic systems (88–93). Cerebral hypoperfusion observed in the frontal and temporoparietal cortical areas of cocaine users (77, 94) may also subserve some of the observed cognitive deficits in these patients. These suggestions are consistent with the report of increased cerebral vascular resistance in cocaine users, abnormalities that lasted for at least 1 month of monitored abstinence (95).

In addition to specific deficits observed in cocaine users, these individuals may also suffer from psychosocial impairments. For example, a recent study by Preller et al. (87) suggests a relationship between social cognition test outcomes in cocaine-dependent patients and real-life social functioning. Specifically, participants showing more empathy and better mental processing abilities had a larger social network. In addition, social network size was correlated with duration and amount of cocaine use. This suggests that cocaine use and the associated altered empathy and insight may have consequences in everyday life, including fewer social contacts and deprivation of emotional support (87). Additionally, Preller et al. (85) also reported that individuals with cocaine dependence have blunted reward responses to social interactions as well as reduced orbitofrontal cortex signals while performing a social cognition test. Taken together, these observations suggest that the treatment armamentarium may need to include interventions that promote patients' interactions with other individuals in various social networks. This argument may explain, in part, why the affiliation-promoting peptide, oxytocin, may have beneficial effects in substance use treatment (96, 97). The possibility that social reward deficits might precede or be consequent to cocaine use needs to be investigated further (96).

#### Table 3 | Cognitive deficits reported in cocaine users.

↓*, Cognitive deficits;* ↑*, cognitive improvement;* ↑*, neurobehavioral symptoms.*

In summary, although these cocaine-associated changes in cognitive functions have been well documented, their biological substrates have yet to be understood. Recent functional and structural imaging data provide ample support for impaired connectivity in frontostriatal (4, 98) and striatal-insular (99) connections that serve as neuroanatomical and functional substrates for some of the cognitive deficits reported in cocaine-using individuals. A clinical approach that takes into consideration the fact that some patients may actually suffer from cognitive impairments should stimulate investigations in order to provide more details on the basic substrates of cocaine use by humans (74).

### METHAMPHETAMINE USE

Methamphetamine use is a serious public health problem (100). Long-term exposure to the drug has been shown to cause severe neurotoxic and neuropathological effects with consequent disturbances in several cognitive domains (1). These neuropsychological impairments, which can impact the daily lives of METH users, are detailed below.

### Neuropsychological Findings

Chronic METH users show mild signs of cognitive decline (10) affecting a broad range of cognitive functions [Ref. (5, 6, 101–112), see details in **Table 5**; but see also Ref. (113) for a counterargument]. A meta-analysis by Scott et al. (107) identified significant deficits of a medium magnitude in several different cognitive processes that are dependent on the functions of frontostriatal and limbic circuits. The affected domains include episodic memory, executive functions, complex information processing speed, and psychomotor functions (107). Additionally, METH use often results in irritability, agitation, and numerous other forms of psychiatric distress probably related to the myriad of interpersonal problems experienced by these patients (114, 115). METH dependence is also associated with complaints of cognitive dysfunctions including memory problems and self-reported deficits in everyday functioning (110). Additionally, impulsive behaviors may exacerbate their psychosocial difficulties and promote maintenance of drug-seeking behaviors, especially by those who use large amounts of the drug (116, 117). The nature and magnitude of cognitive deficits associated with chronic METH use increase the risk of poorer health outcomes, high-risk behaviors, treatment non-adherence, and repeated relapses (110, 118). These adverse consequences might be secondary to poor executive function and memory deficits that may contribute to continuous drug-seeking behaviors (70). It needs to be noted that partial recovery of neuropsychological functioning and improvement in affective distress can be achieved after a period of sustained abstinence from METH (5). Hart et al. (113) have reviewed the literature and suggested that the deficits reported may be statistically but not clinically significant. In a follow-up analysis of similar data, Dean et al. (10) came to a different conclusion. These issues are important to clinicians who are responsible for the daily and/or long-term care of patients because small deficits may be of substantial importance when it comes to patients being able to follow instructions that would help them to participate in their own care, given the high rate of recidivism in that patient population (119, 120). Therefore, identifying patients with neuropsychological deficits would allow for the development of specific cognitive or pharmacological approaches that would benefit them.

*ACC, anterior cingulate cortex; DLPC, dorsal lateral prefrontal cortex; NAcc, nucleus accumbens; OFC, orbitofrontal cortex; PFC, prefrontal cortex;* ↓*, decreased brain activation;* ↑*, increased brain activation;* ↓*, cognitive deficits.*

Neuroimaging studies have documented several alterations in brain activation patterns induced by METH [Ref. (104, 121–128), see **Table 6**]. These studies reported decreased frontal activation associated with impaired decision-making (104) and cognitive control (127). Other brain regions sensitive to METH effects include the cingulate gyrus and insula (122, 128). METH users who showed impaired attention (122) and impaired cognitive control (128) exhibited abnormalities in these brain regions (see **Table 6**). It is worth mentioning that, in some cases, stimulant-dependent patients report clinically significant neuropsychological abnormalities prior to lifetime initiation of psychostimulant use (68).

### Recovery of Neurocognitive Functioning and Treatment Implications

Chronic use of several illicit drugs is associated with variable degrees of impaired cognitive functioning that shows different levels of improvement during sustained abstinence (3). Recovery from METH dependence is associated with improved performance in tests of mental flexibility, attention, processing speed, verbal memory, fine motor functioning, and verbal fluency (5). Improvements in performance are also seen in abstinent marijuana users (13, 14). Moreover, Brewer et al. (131) found that activation in corticostriatal regions, linked to cognitive control, correlated with abstinence and cocaine-free urine toxicology (131). There was also an inverse correlation between prefrontal cortex activation and treatment retention (131), thus supporting the notion that identification of patients with cognitive deficits is important for the long-term care of these patients (3, 132). This suggestion is supported by the results of a recent report that the strength of craving for METH can be reduced by cognitive strategies (133). In addition, patients who participated in computer-assisted cognitive behavioral therapy showed improved task performance and reduced task-related signal changes in several regions implicated in cognitive control, impulse control, and motivational salience, including the anterior cingulate and midbrain (134).

#### Table 5 | Cognitive deficits reported in methamphetamine users.

↓*, Cognitive deficits;* ↑*, cognitive improvement;* ↑*, neurobehavioral symptoms.*

### CONCLUSION

Chronic use of illicit substances, including marijuana, cocaine, and METH, is associated with abnormal goal-directed behaviors that are thought to be the manifestations of altered corticostriatal-limbic circuits (2, 135). Nevertheless, the wealth of clinical presentations, neuroimaging studies, and some pathological findings suggest that the biochemical and structural effects of chronic heavy use of drugs may reach beyond the boundaries of these reward circuits (1). The data reviewed here indicate that chronic use of illicit drugs is accompanied by moderate cognitive impairments in some patients. These observations may be related to functional and structural changes in various brain regions, including both cortical and subcortical regions of the human brain (1, 98, 136). In addition, it has been reported that frontal deficits in psychostimulant-dependent patients with current clinically significant neurobehavioral abnormalities may be linked to pre-existing abnormalities (68). Because drug dependence develops over many months, it is likely that drug-related changes of behaviors may be modulated by some of these pathological phenomena in such a way as to significantly impact the clinical course of chronic use of these substances. Thus, impaired learning and memory functions might negatively impact the ability of a specific subset of patients to benefit from general treatment approaches. This inability may explain, in part, the high rate of recidivism in this patient population. This argument suggests that approaches to these individuals should take into consideration the diversity of patterns of substance use and clinical presentations. It also suggests that thorough neuropsychological and neuroimaging assessments should be undertaken to identify these subsets of drug users. This approach should help to dichotomize patients as being unimpaired or impaired, with specific cognitive and pharmacological treatments targeting subgroups of patients. Present approaches that group all patients together need to be revamped to allow for more rational and data-driven approaches to treatment.

#### Table 6 | Functional neuroimaging studies on methamphetamine users performing cognitive tasks.

*ACC, anterior cingulate cortex; DLPC, dorsal lateral prefrontal cortex; PFC, prefrontal cortex;* ↓*, decreased brain activation;* ↑*, increased brain activation;* ↓*, cognitive deficits;* ↑*, cognitive improvement;* ↑*, neurobehavioral symptoms.*

### FUNDING

The Intramural Research Program of the National Institute on Drug Abuse, NIH, and DHHS supports JC. VB is supported by a grant from ANPCyT, PICT 2012-0924, Argentina.

cannabis users: relationship to cortisol levels. *J Neurosci* (2011) **31**:17923–31. doi:10.1523/JNEUROSCI.4148-11.2011


spatial working memory. *Psychiatry Res* (2008) **163**:40–51. doi:10.1016/j.pscychresns.2007.04.018


*Neuropsychopharmacol Biol Psychiatry* (1998) **22**:1061–76. doi:10.1016/S0278-5846(98)00057-8


sex with men. *Drug Alcohol Depend* (2004) **76**:203–12. doi:10.1016/j.drugalcdep.2004.05.003


methamphetamine-dependent subjects. *Psychiatry Res* (2011) **194**:287–95. doi:10.1016/j.pscychresns.2011.04.010


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2016 Cadet and Bisagno. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **A commentary on "A new initiative on precision medicine"**

*Udi E. Ghitza\**

*Center for the Clinical Trials Network, National Institute on Drug Abuse, National Institutes of Health, Bethesda, MD, USA*

**Keywords: substance use disorders, alcohol, marijuana, cannabis, nicotine, tobacco**

**A commentary on**

#### **A new initiative on precision medicine** *by Collins FS, Varmus H. N Engl J Med (2015) 372:793–5. doi:10.1056/NEJMp1500523*

### **Commentary**

U.S. President Barack Obama recently announced a new Precision Medicine Initiative, and Drs. Francis Collins and Harold Varmus have begun to provide a vision for how some of this initiative might be implemented by the U.S. National Institutes of Health (NIH) (1). Precision medicine may be defined as "an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person" (2). A vision of the NIH portion of the Precision Medicine Initiative is to launch a large-scale national cohort study of a million or more Americans to advance understanding of how to optimize treatments customized to individual variability in genomic and environmental health-determinants (2). A precision-medicine approach, using shared decision-making with patients and their providers as partners in patient-centered care, offers an important opportunity to improve substance-use disorders (SUDs) prevention and treatment outcomes (3, 4). Pertinent to precision medicine, the Collaborative Research on Addictions at NIH, comprising the National Institute on Drug Abuse, National Institute on Alcohol Abuse and Alcoholism, and the National Cancer Institute, in partnership with other NIH Institutes, Centers and Offices, is currently planning to launch a longitudinal cohort study of Adolescent Brain and Cognitive Development (ABCD). This study will follow 10,000 youth over up to a 10-year period, approximately ages 9–10 at baseline when largely naïve to use of alcohol, marijuana, nicotine, and other drugs. This national cohort study presents a key opportunity to answer fundamentally important questions for informing a precision-medicine approach regarding prevention of SUDs in youth (5). Several relevant questions are: (1) How does repeated exposure to abused substances, such as nicotine, alcohol, and cannabis, impact normative brain development essential for memory and cognitive functioning? (2) How do drug-altered brain-maturation pathways inform precision-medicine-tailored SUDs prevention approaches targeting high-risk youth? (3) Which brain-development events altered following adolescent drug use heighten likelihood of transformation of unhealthy drug use into full-blown SUDs in subpopulations with, or without, co-occurring mental health disorders? (4) How do drug-induced alterations in brain-development and memory impairments interact with genomic and epigenetic risk factors in these different subpopulations to increase vulnerability to SUDs? (5) In what manner does use of specific substances impact use of other substances? Thus, the objective of the ABCD study is timely to precision medicine: to better understand how exposure to abused substances modifies brain-development trajectories and how this relates to emotional and mental health, social development, memory and other cognitive function, as well as academic and other outcomes (5).

*Edited by:*

*Mark Walton, University of Oxford, UK*

*Reviewed by: Daniel Beracochea, Bordeaux 1 University, France*

> *\*Correspondence: Udi E. Ghitza ghitzau@nida.nih.gov*

#### *Specialty section:*

*This article was submitted to Addictive Disorders and Behavioral Dyscontrol, a section of the journal Frontiers in Psychiatry*

> *Received: 31 March 2015 Accepted: 27 May 2015 Published: 08 June 2015*

#### *Citation:*

*Ghitza UE (2015) A commentary on "A new initiative on precision medicine". Front. Psychiatry 6:88. doi: 10.3389/fpsyt.2015.00088*

Numerous studies suggest that heavy substance use during childhood and adolescence influences long-term brain and cognitive development and heightens risks for SUDs and co-occurring mental disorders (6–9). Therefore, it is critical to recruit youth in the early, pre-symptomatic phase in order to measure mental health and psychosocial factors over time to understand how they contribute to observed changes in brain and cognitive development (5). To inform how clinicians may optimally intervene early to prevent escalation of unhealthy drug use in youth, this research will prospectively identify and characterize developmental processes across behavioral, cognitive, and neurobiological domains that give rise to transitions between hazardous substance use and SUDs trajectories in diverse populations of youth. Such longitudinal research will also evaluate how critical factors mediate or modify these relationships during sensitive brain-development windows. In such a large-scale longitudinal cohort study, an important consideration will be to implement a sampling strategy that includes a community-based sample broadly representative of the U.S. general population. Biospecimens will also be collected for subsequent genomic/epigenomic and other analyses in future research studies.

The ABCD study will leverage the latest brain imaging advances, bioinformatics methods for analyzing biomedical big data, and electronic health records information to determine how substance use affects brain-development trajectories, relevant gene-environment interactions, memory capabilities, mental disorders, and other medical and functional outcomes. Another consideration is achieving sufficient statistical power and comprehensive controls to account for the many possible confounds in which youth who choose to frequently use alcohol or other drugs might also have other co-occurring problems either naturally or due to other lifestyle choices or circumstances. The ABCD study will also carefully characterize and control for socio-demographic, prenatal drug exposure, drug availability, family history, physical or sexual abuse, head trauma, behavioral, and other environmental risk factors (5).

Open data sharing and safeguarding privacy need to be cornerstones for such lines of research, to build a trustworthy scientific knowledge base and support a national network of scientists with innovative precision-medicine approaches to SUDs prevention and treatment. Collected genetic biospecimens need to be appropriately paired with other relevant health information and suitably processed, curated, and stored, with informed consent obtained in a manner that secures participants' permission for their future research use. Furthermore, to maintain high-quality repositories of biomedical big data, such research areas would need to develop sustainable operational and governance standards and conform to industry best practices (10). Moreover, to permit data sharing, procedures need to be put in place to enable harmonization of data collection, querying, extraction, and storage, across study sites with disparate electronic-health-record-system standards and data structures. Standardization of collected measures and data harmonization are needed to return clinical data in a consistent manner to a centralized repository and permit semantic mapping to achieve health information interoperability. The above research directions require a collaborative, sustained national effort involving many scientists, clinicians, and bioinformatics experts.

In summary, the ABCD study and similar research offer a valuable opportunity to inform precision-medicine research on how to leverage bioinformatics advances in genomics and health information technology to guide customization of molecular, clinical, and environmental information toward optimizing SUD-prevention in youth. Findings from such research may also guide precision medicine through systematic identification of risk/protective factors, biomarkers, and individual variations in these, which critically mediate effects of substance use on the trajectory of the developing brain, memory, and other cognitive areas in youth.

### **Acknowledgments**

UG is an employee of the Center for the Clinical Trials Network, NIDA, which is the funding agency for the National Drug Abuse Treatment Clinical Trials Network. The opinions in this paper are those of the author and do not represent the official position of the U.S. government.

### **References**

midlife. *Proc Natl Acad Sci U S A* (2012) **109**:E2657–64. doi:10.1073/pnas.1206820109


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Ghitza. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Contingency management and deliberative decision-making processes**

#### *Paul S. Regier <sup>1</sup>† and A. David Redish<sup>2</sup> \**

*<sup>1</sup> Graduate Program in Neuroscience, University of Minnesota, Minneapolis, MN, USA, <sup>2</sup> Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA*

#### *Edited by:*

*Mark Walton, University of Oxford, UK*

#### *Reviewed by:*

*Serge H. Ahmed, CNRS, France Luigi Janiri, Università Cattolica del Sacro Cuore, Italy*

#### *\*Correspondence:*

*A. David Redish, Department of Neuroscience, University of Minnesota, 6-145 Jackson Hall, 321 Church Street SE, Minneapolis, MN 55455, USA redish@umn.edu*

#### *† Present address:*

*Paul S. Regier, Center for Studies of Addiction, Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, USA*

#### *Specialty section:*

*This article was submitted to Addictive Disorders and Behavioral Dyscontrol, a section of the journal Frontiers in Psychiatry*

> *Received: 27 February 2015 Accepted: 07 May 2015 Published: 01 June 2015*

#### *Citation:*

*Regier PS and Redish AD (2015) Contingency management and deliberative decision-making processes. Front. Psychiatry 6:76. doi: 10.3389/fpsyt.2015.00076*

Contingency management is an effective treatment for drug addiction. The current explanation for its success is rooted in alternative reinforcement theory. We suggest that alternative reinforcement theory is inadequate to explain the success of contingency management and produce a model based on demand curves that shows how little the monetary rewards offered in this treatment would affect drug use. Instead, we offer an explanation of its success based on the concept that it accesses deliberative decision-making processes. We suggest that contingency management is effective because it offers a concrete and immediate alternative to using drugs, which engages deliberative processes, improves the ability of those deliberative processes to attend to non-drug options, and offsets more automatic action-selection systems. This theory makes explicit predictions that can be tested, suggests which users will be most helped by contingency management, and suggests improvements in its implementation.

**Keywords: decision-making, deliberation, addiction, contingency management, neuroeconomics, impulsivity, addiction treatment**

### **1. Contingency Management**

Contingency management is a method of driving behavioral change through reinforcement with tangible rewards (1). It has been shown to significantly reduce drug-using behavior and increase continuous abstinence rates (2–9).

There are two main variations of contingency management: voucher-based and prize-based. In voucher-based treatment, patients are awarded points that accumulate for submission of drug-negative urine samples (3–5, 8). These points start out very low and can be exchanged for merchandise at any time. For example, in the Higgins et al. (5) study, points for the first clean sample were worth \$2.50 and each subsequent sample was worth \$1.50 more. By the end of the first month, a drug-negative sample was worth \$16.25.

In prize-based treatment, patients earn a chance to win a prize with each drug-negative sample (1, 9–12). Typically, in these studies, prizes were worth around \$1, \$5, \$20, and \$100, and the probability of winning a higher-valued prize was lower than that of winning a lower-valued prize (0.4% for a \$100 prize and 68% for a \$1 prize). Overall, the chance of a drug-negative sample having a monetary value of anything over a dollar was <7%.

## **2. Current Theories: Alternative Reinforcement**

The success of contingency management is thought to be primarily due to the reinforcing properties of an alternative reward that is offered to patients for remaining abstinent (1, 5).

The conceptualization of contingency management is that drug consumption is much like any other consumption of goods, and thus that increasing the cost of drugs should decrease use. Contingency management increases the cost of drugs because it creates an opportunity cost that is lost (the alternative reinforcer) when the user takes drugs. Reasoning for this is based on operant conditioning theories, noting that targeted behaviors increase with reinforcement and decrease in the presence of substitutes (13–16).

In economic terms, this change in use with cost can be measured as *elasticity*, which can be quantitatively defined as the change in the number of choices selected as cost increases (17–21). To determine this, one can measure the amount of effort an agent is willing to expend in order to gain the reward as a function of the cost. The function that results is called the *demand curve* (see **Figure 1**). A commodity that decreases quickly with cost is said to be "highly elastic," while a commodity that decreases slowly with cost is said to be "inelastic." (See **Table 1** for definitions of the behavioral/neuroeconomic concepts used in this article.)

Quantitatively, the effectiveness of an alternative reinforcement depends on the elasticity of the drug that the alternative reinforcer is substituting for. Although early descriptions of drug use assumed that drugs were taken irrespective of cost, Becker and Murphy (17) pointed out that drugs were economic objects, and, as such, should show elasticity. While there are theoretical reasons to expect differences in the elasticity between drugs and natural rewards (17, 22), nevertheless, drugs do show elasticity both in non-human animals (23–27) and in humans (28–32). This means that increasing the cost (or increasing the size of the alternate options, which increases the opportunity cost) of taking the drug should decrease use. *Alternative reinforcement theory* predicts that the change in drug use from contingency management should be proportional to the elasticity of drug use.

As reviewed above, contingency management provides relatively low-value monetary rewards for abstinence (especially in the first month of treatment). For example, in voucher-based contingency management, rewards are as low as \$2.50 for the very first negative urine sample and \$16.25 for a negative sample after remaining abstinent the entire first month (5). The pre-clinical experiments suggest that the value of alternative reinforcement rewards used in contingency management should not reduce drug consumption as much as it does: either the cost of the drug or the magnitude of the reinforcer would need to be significantly higher than what is typically used in contingency management if alternative reinforcement alone were to account for the observed reductions of drug use in contingency management studies.

#### **TABLE 1 | Economic theoretical constructs used in this article**.

- **Willingness to pay**: a measure of the valuation of an object as a function of the amount of money or effort an agent is willing to put into achieving it
- **Revealed preference**: a measure of valuation of an object as a function of whether it is preferred or not when given in contrast to another option

Experiments find these measures can produce incompatible outcomes.

### **3. The Problem with the Alternative Reinforcement Theory**

If we assume that drugs are economic objects, and thus are subject to change in demand or price, then one way to quantitatively measure level of consumption as a function of price is with a *demand curve*. The demand curve measures a fundamental concept of consumption: as price of the economic object increases, the consumption of that object will decrease (33, 34).

**Figure 1** shows the structure of a typical demand curve. These curves can be well-fit with Eq. 1 measuring the relationship of the cost of some commodity (*C*) and the consumption of that commodity (*Q*) (35):

$$Q = L C^b e^{-aC} \tag{1}$$

where *L* measures consumption at *C* = 1, and *b* and *a* are parameters that relate to the slope and the acceleration of the slope, respectively. The slope of the curve (in log-log coordinates) predicts the elasticity of the commodity:

$$E = b - aC \tag{2}$$

*Pmax* is the point at which the elasticity *E* = −1, that is, the point at which elasticity transitions from less than 1 unit of decreased use per unit of increased cost (inelastic) to more than 1 unit of decreased use per unit of increased cost (highly elastic). Because the elasticity terms *a* and *b* and the cost *C* appear in the exponents in Eq. 1, once the cost crosses *Pmax* [when *C* > (*b* + 1)/*a*], consumption drops off very quickly. Using demand curves, we can construct a quantitative model to determine how monetary rewards should affect consumption of a drug. As mentioned previously, monetary values early in treatment are relatively low, and demand curve modeling suggests that these rewards alone would affect consumption of the drugs very little.
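The relationship between the demand curve, its elasticity, and *Pmax* can be illustrated with a minimal sketch. It assumes the multiplicative form *Q* = *L C^b e^(−aC)*, which is the form whose log-log slope equals *b* − *aC* as in Eq. 2. The function names and the parameter values are hypothetical, chosen only for illustration, and are not fitted to any data set.

```python
import math


def demand(cost, L, b, a):
    """Consumption Q at cost C, assuming Q = L * C**b * exp(-a * C)."""
    return L * cost**b * math.exp(-a * cost)


def elasticity(cost, b, a):
    """Log-log slope of the demand curve, d(ln Q)/d(ln C) = b - a*C."""
    return b - a * cost


def p_max(b, a):
    """Cost at which elasticity equals -1, i.e. C = (b + 1) / a."""
    return (b + 1.0) / a


# Hypothetical parameters for illustration only (not from any study).
L, b, a = 20.0, -0.1, 0.05

for cost in (1.0, 5.0, p_max(b, a), 40.0):
    q = demand(cost, L, b, a)
    e = elasticity(cost, b, a)
    print(f"C={cost:6.2f}  Q={q:7.3f}  E={e:6.2f}")
```

With these illustrative parameters, consumption changes slowly at costs below *Pmax* and falls steeply above it, which is the qualitative behavior described in the text.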

### **3.1. Modeling Contingency Management: The Monetary Value of Vouchers Early in Contingency Management Treatment Should have a Negligible Effect on the Consumption of Cocaine**

Bruner and Johnson (21) constructed demand curves for individuals that regularly use cocaine by asking subjects how much cocaine they would buy as the cost increased. As noted above, providing alternative rewards increases the cost of the commodity (here the drug) through lost opportunities (an *opportunity cost*) – if the person takes the drug, then they do not get the alternative reward. This means that we can use these demand curves to predict how this opportunity cost should change the choices made.

Individuals in treatment get a voucher value of \$2.50 the first time they provide a clean sample<sup>1</sup>. Using the assumption that individuals seeking treatment spend an average of \$99/day (Petry, personal communication) during a typical day of cocaine use, and given that 1 unit of reward in the Bruner and Johnson (21) data was worth \$5 on the street, a starting contingency management reward value of \$2.50/day is worth approximately \$0.13/unit.

A shift of \$0.13/unit on the demand curve would be predicted to produce a negligible effect on cocaine consumption [the Bruner and Johnson (21) demand curve predicts a 1.6% change]. Even at the end of the first month of contingency management treatment, when patients receive a voucher worth \$16.25 (\$0.82/unit), there should be little change in consumption [the Bruner and Johnson (21) demand curve predicts a 17% change].
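The per-unit figures quoted above follow from simple arithmetic, sketched here using only the numbers stated in the text (\$99/day spending, \$5 per unit, and voucher values of \$2.50 and \$16.25). The variable and function names are illustrative assumptions.

```python
DAILY_SPEND = 99.00  # dollars spent on a typical day of cocaine use
UNIT_PRICE = 5.00    # street value of one unit, in dollars

# Number of units consumed on a typical day.
units_per_day = DAILY_SPEND / UNIT_PRICE


def opportunity_cost_per_unit(voucher_value):
    """Spread one day's voucher value across that day's units of drug."""
    return voucher_value / units_per_day


for voucher in (2.50, 16.25):
    shift = opportunity_cost_per_unit(voucher)
    print(f"${voucher:.2f} voucher -> ${shift:.2f}/unit")
```

The predicted percentage changes in consumption (1.6% and 17%) additionally depend on the fitted demand-curve parameters, which are not reproduced here.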

In order to quantitatively measure whether these economic changes could explain contingency management's effects, we took the effect sizes reviewed in the meta-analysis by Lussier et al. (36) and asked how large a change the Bruner and Johnson (21) demand curve would predict from the economic change in cost alone. Of course, patients seeking treatment have increased costs for drug use due to many factors beyond the simple loss of the contingent alternate reward. Similarly, there is a large variability in how contingency management studies are run and what additional treatments they are paired with. Finally, the Bruner and Johnson (21) analysis is from one set of cocaine addicts, while the studies reviewed by Lussier et al. (36) range from alcoholics to stimulant addicts. Nevertheless, 21/27 studies had predicted changes less than the observed effect size, and the predicted effect was typically less than half the observed effect (median ratio = 0.43). **Figure 2** shows the distribution of observed effect sizes against the economically predicted changes. The predicted changes are significantly less than the observed changes (matched pairs median test, *p* = 0.00008).

This analysis suggests that the simple economic description of contingency management is inadequate – the rewards offered in contingency management are too small to have the observed effects. We suggest that this is because the microeconomic model on which the economic explanation for contingency management is based is inadequate – human decision-making depends on more than simple cost-benefit analyses. Instead, the human decision-making process is better described as an interaction between multiple competing components (37–43), each of which uses different processes to combine reward information (value) with past experiences (memory) to select actions (make decisions). We suggest that contingency management taps into certain aspects of these multiple decision-making systems to drive behavior to be more likely to reject the drug-taking choice.

### **4. Valuation**

Early psychological and economic research postulated that reinforcers are *transsituational*, meaning that the efficacy of the reinforcer remains consistent across different experimental conditions (44–46). However, studies have shown that reinforcers do not consistently elicit reliable behavioral outputs in different contexts (47).

In the fields of behavioral and neuroeconomics, decisions are assumed to derive from an underlying "value" or "utility" placed on outcomes. However, this value cannot be directly observed experimentally, and thus must be interpreted from experimental conditions. The two primary methods for deriving this value are *willingness to pay* experiments, in which an agent is given an opportunity to pay a cost for an outcome, and *revealed preference* experiments, in which an agent is given a choice between two or more options. In willingness-to-pay experiments, the agent has to decide whether to continue to pursue a given option or not. In revealed-preference experiments, the agent has to decide which option to pursue. Importantly, experiments in rats, monkeys, and humans all find differences between how animals value options under these two measurements, often finding incompatible outcomes (43, 47–49). Thus, converting experiments from single option (*Go or don't?*) to multiple option (*Which one?*) can change how animals appear to value a given option.

<sup>1</sup> In the Higgins et al. (5) study, subjects got a voucher worth \$2.50 for the first clean sample. Taking this voucher as covering staying abstinent for more than 1 day would only decrease the predicted impact of the voucher. Since our analysis will show that voucher size is inadequate to drive changes in the demand curve, any increased required abstinence will not change our conclusions.

A typical willingness-to-pay experiment would be the *breakpoint* procedure, in which an animal presses a lever to receive reward. The first reward is delivered after only a single lever press, but the second requires two lever presses, the third requires four, the fourth eight, and so on, doubling each time. At some point, the cost becomes too high and the animal stops pressing the lever (48, 50–52). In humans, willingness-to-pay can be assessed by simply asking "how much would you pay for this outcome?" (47).
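The doubling schedule described above can be sketched as a small simulation. This is an illustration only: the single threshold `max_presses_tolerated` is a hypothetical stand-in for the point at which "the cost becomes too high," and the breakpoint is taken here as the last press requirement the agent completed, which is one common convention among several.

```python
def breakpoint_session(max_presses_tolerated):
    """Simulate a doubling progressive-ratio (breakpoint) schedule.

    The agent completes each press requirement (1, 2, 4, 8, ...) as long
    as it does not exceed max_presses_tolerated (a hypothetical threshold).
    Returns (rewards_earned, breakpoint), where breakpoint is the last
    requirement completed, or 0 if the agent never responds.
    """
    requirement = 1
    rewards = 0
    last_completed = 0
    while requirement <= max_presses_tolerated:
        rewards += 1                  # requirement met, reward delivered
        last_completed = requirement
        requirement *= 2              # next reward costs twice as much
    return rewards, last_completed


# An agent that tolerates at most 10 presses per reward completes the
# requirements 1, 2, 4, and 8, then quits when faced with 16.
print(breakpoint_session(10))
```

A higher breakpoint is read as a greater willingness to pay for the reward.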

By contrast, a typical revealed-preference experiment would provide an animal two levers, one of which provides one type of reward (A), while the other provides another type of reward (B) (48, 52, 53). The animal is only able to select one lever at any given time and thus must choose between the separate options. The implication is that the selected option is more valuable than the non-selected option. In humans, revealed preference can be assessed by asking "which option would you prefer?" (47).

Extensive evidence exists within the behavioral and neuroeconomics literature that these two measures can produce incompatible valuations, in which human subjects may be willing to pay more for option A than for option B, even when they would prefer to take option B when faced with the two options together (47). Recently, Ahmed (48) found in self-administering rats that measuring value by means of a breakpoint procedure (willingness-to-pay) can produce a different ordering than measuring value by means of a choice procedure (revealed preference); that is, subjects were willing to pay more for drug than saccharin but preferred saccharin to drug when given the choice (48). This strongly suggests that value is not an intrinsic (transsituational) property, but is highly dependent on the surrounding context.

These analyses imply that single-option experiments, in which an agent is tasked with deciding whether to pursue a given object or not, may access different processes than multiple-option experiments, in which an agent is tasked with deciding which option to pursue.

### **4.1. Valuation Inconsistencies Arise from Multiple Decision-Making Systems**

Current theories suggest that this underlying lack of transsituationality arises because animals (including humans) make decisions based on several incompatible decision-making systems, each of which processes information about the decision in fundamentally different ways. Because these different systems drive behavior at different times, the same agent can show different valuations under different experimental conditions.

Classically, the idea that valuation is inconsistent and not transsituational has been addressed in terms of *dual-process theories*, which hold that humans (and presumably other animals as well) have two separable components of decision-making, one which is impulsive and depends on reacting to immediate, concrete rewards, and another which is more rational and capable of waiting for larger, more abstract rewards (54–57). Importantly, the impulsive (often called "reactive") system is not necessarily always chasing positive rewards; it can also avoid negative consequences (58). Nevertheless, the key difference between the two systems in the dual-process hypothesis is that the impulsive system attends to immediate consequences while the other (cognitive, often called "reflective") system takes into account farther future consequences (59–62). In many of these discussions, the impulsive system is identified as more "emotional" and more related to an animal's history, while the rational system is identified with more cognitive processing. In many of these theories, the rational system is assumed to inhibit the activity of the impulsive system (63–65), which is often referred to as a form of "self-control" (66, 67). This theory has a very long history (68–70) and there are good summaries of the modern perspectives on this dichotomy (40, 59, 63, 65, 67, 71). Anatomically, the impulsive system is associated with the nucleus accumbens and amygdala, while the rational system is associated with the prefrontal cortex (54, 56, 57, 59, 72, 73).

Recent computational work examining how agents process information to make a decision (such as taking a drug or not) suggests that multiple action-selection systems compete and interact to produce that decision. Current theories suggest that decisions arise from as many as four separable systems, each depending on different information-processing computations (37, 42, 43, 74–77). Each system uses past experience differently and processes information about the world differently, and thus each has advantages and disadvantages in different situations. An agent that correctly identifies the best action-selection system to use in a given situation will outperform a different agent that does not. Because different systems drive behavior at different times, valuation is not necessarily self-consistent.

Following these recent taxonomies (43), we identify four decision-making systems each of which selects actions through a different computation: (1) reflexes, in which evolutionarily useful stimulus–response pairs are hard-wired within a neural system (78, 79), (2) Pavlovian actions, in which an animal learns when to release a species-specific behavior (80–82), (3) procedural actions, in which arbitrary action chains are stored and released on cue (83, 84), and (4) deliberation, which entails a slow, goal-oriented search and evaluate process (42, 85–87). Each of these systems is instantiated in a different anatomical network – reflexes in spinal cord and brainstem (88), Pavlovian actions with amygdala and the periaqueductal gray (89, 90), procedural with motor cortex, cerebellum, and the basal ganglia (91–93), and deliberation with hippocampus and the prefrontal cortex (87, 94–96).

There are many similarities between the dual-process and multiple decision-making systems theories, particularly in the separation between more automatic and more cognitive systems (40, 43, 65). Both theories, for example, suggest that stress and cognitive load will disrupt the more cognitive systems, shifting behavior to more automatic systems. Both theories suggest that the more automatic systems tend to react to more immediate stimuli, while the more cognitive system is capable of incorporating information that is not immediately present.

However, there are important differences between the theories. First, the information-processing theories do not imply that the more automatic systems are more impulsive, as hypothesized by the classical dual-process distinction. For example, a fire chief with extensive expertise uses a fast, non-deliberative process to make the right choice (83); no one would argue that the fire chief is making an impulsive choice. The more recent models have shown that intuition and developed expertise arise from a different computational process than emotion, suggesting that these are different systems (43). Additionally, the information-processing theory provides for interacting components that can make cognitive systems react differently in the face of concrete stimuli (97, 98).

In addition, the hypothesized causes of addictive behavior are different in the two theories, which has implications for how contingency management should be used and for how modifications would affect its success. These subtle differences lead the theories to make different predictions and change some of the implications of our fundamental hypothesis (that contingency management accesses deliberative processes, see below). We will address the differences between these theories below, but first we address the main implications of our hypothesis, which are similar under the two theories.

### **5. Hypothesis: Contingency Management Accesses Deliberative Systems**

Our hypothesis is that the provision of a concrete, identified, alternative reward in contingency management both engages deliberative processes and improves the ability of those deliberative processes to attend to non-drug options. In a sense, contingency management transitions the drug-valuation process from a willingness-to-pay condition to a revealed-preference condition. In addition, we propose that the concrete and more immediate rewards provided by contingency management increase the ability of deliberative systems to attend to, value, and select the alternative (non-drug) reward. (This may be why the prize-based CM systems are more effective with lower-value rewards than comparably more expensive monetary-based voucher systems.)

### **5.1. Pre-Clinical Experimental Support for this Hypothesis**

Non-human animal self-administration studies have also found that drugs are economic objects and show a non-zero elasticity. As with human studies, increasing the cost (measured in terms of number of lever presses required to receive drug) decreases the number of self-administered drug-taking events (28, 99–101). Similarly, providing an alternative reinforcer reduces the amount of drug self-administration in both rats and monkeys (23–25, 27, 48, 53, 100, 102–106). These studies fall into two categories, which require dramatically different levels of alternative reward to decrease drug use.

Classically, the simplest measure of the cost-dependence of drug self-administration in non-human animals is the *breakpoint* analysis (52, 99). These studies find that much larger costs are required before an animal will cease drug self-administration than before an animal will cease taking non-drug rewards (51, 100). This suggests that it would require very large non-drug rewards to counteract drug self-administration. The first set of studies (24, 25, 27, 104) confirmed this hypothesis, in that they used single-response conditions and found that reductions in drug self-administration were only observed after very large alternative rewards. For example, Woolverton et al. (27) found that the opportunity cost of the drug option needed to be increased 100-fold (for low-drug concentrations) to 1000-fold (for average and high-drug concentrations) in order to significantly reduce self-administration. In these studies, animals could switch between conditions that either provided cocaine on pressing the primary lever or alternative reward on pressing the same primary lever. In other words, the animal could switch between situations that enabled non-deliberative processes. Other studies using similar techniques have found similar proportions (24, 25, 107, 108).

Interestingly, Ahmed [(48), see Ref. (100, 105, 106)] found much smaller alternatives could reduce drug self-administration. In these studies, the animals had two options directly available to them on opposite sides of the chamber – one lever provided cocaine, while the other provided saccharin. Preference was measured by whether the animals selected the saccharin lever or the cocaine lever. These studies also examined single-option breakpoints, in which only one lever was provided and cost was measured as the number of lever presses required before the animal gave up. These studies found that although the breakpoints for cocaine were much higher than the breakpoints for saccharin, animals preferred saccharin when provided with a revealed-preference two-lever choice paradigm. Similarly, LeSage (53) showed that providing a small amount of sucrose for not self-administering nicotine was sufficient to reduce the number of nicotine responses.

These studies support the proposed dichotomy between willingness-to-pay valuations (measured by single-lever breakpoint studies and situation-change studies, theoretically dependent on non-deliberative processes) and revealed-preference valuations (measured as forced choices between two explicit levers). The revealed-preference studies required much smaller rewards to decrease drug self-administration than the willingness-to-pay studies. The difference in size of alternate reward required to change behavior under the two paradigms suggests that the difference lies in fundamental processes underlying decision-making across multiple species (including at least rats, monkeys, and humans).

### **6. Components of Contingency Management that Affect Deliberation**

The information processing that underlies deliberative decision-making processes is now beginning to be elucidated (87, 98, 109), particularly in contrast to other decision-making systems (39, 43, 110). Deliberation requires recognition of a situation, a serial consideration of the potential actions available, and evaluation and comparison of those potential options (42, 87).

The main advantage of deliberation is that, because the expected consequences of each option are represented during the decision process, they can be evaluated during that process in the context of the agent's current goals (86). This means that the individual options must be found (85, 98, 111, 112) and then the valuation constructed (40, 47, 73, 113). Both the search process and the construction of value will be modulated by processes that computationally affect neural information processing (98, 114). Examples of these include working memory abilities (57, 115), whether the consequence is phrased as a win or a loss (40, 47, 116, 117), attention (113, 118), emotional state (119), surrounding options (120), and even the presence of unrelated numbers, such as in anchoring [where unrelated anchors such as one's social security number can be used to change one's expected cost and thus one's willingness to pay for a reward (40, 47, 117, 118)].

The deliberative process is slow and computationally intensive, likely because of the cumbersome memory-retrieval and imagination-construction system needed to calculate the possible outcomes in order to evaluate them (83, 87, 98, 112). The evaluation achieved through deliberation depends on a number of stimulus factors, including the expected delay to the reward (121), and the concreteness of the reward (97). Deliberation also depends on a number of internal factors, such as one's perceived needs and desires (86, 119), as well as one's cognitive and executive-function abilities (98), such as episodic future thinking (95, 96), working memory (115, 122), and ability to hold attention (123, 124).

Valuation derived from deliberation depends on a direct imagination of expected outcomes and a comparison between choices (87, 98, 109). As the preclinical studies reviewed above show (48, 53), when an explicit choice between the drug and non-drug reward options is available, the drug option is less likely to be chosen; therefore, factors that increase the likelihood of engaging deliberative processes or that increase the deliberative valuation of a non-drug option should increase the efficacy of contingency management.

### **6.1. Delay to Reward**

Rewards that are only available in the future are less valuable than rewards provided immediately (125–127) – something could happen between now and the time one expects to receive the reward (thus diminishing the usefulness of that reward) and immediate rewards can be invested (thus increasing the usefulness of immediate rewards). The diminishing value of future rewards relative to immediate rewards is quantifiably measurable through questionnaires in which subjects make decisions between immediate and delayed amounts of money, drug, or both (121).

Drug users reliably show faster discounting rates than non-addicts (128–132). Recovered addicts, however, show normal discounting rates (128). Although this early study was unable to determine whether this was a selection process in which the addicts with more normal discounting rates responded better to treatment, a more recent study has determined that successful treatment has the effect of normalizing over-fast discounting rates (133).

Many theoreticians have suggested that these preferences for more immediately available rewards can drive drug use because drugs provide very strong immediate rewards (euphoria, relief from dysphoria) while abstinence provides only long-term rewards (health, family, financial) (134, 135). Contingency management may have the effect of bringing the long-term rewards closer by providing more proximal rewards for abstinence (money, vouchers, draws from the prize-bowl).

Given the actual discounting rates reported in realistic subjects (128, 130, 136), \$2.50 for the first drug-negative sample would be discounted quickly and seems unlikely to be able to deflect the user away from drugs, especially in the beginning of contingency management treatment. The delay-discounting rates that would be necessary to make these small rewards provided at the end of a week strong enough to affect decisions made days earlier in the week are unreasonably slow (137, 138), particularly for addicts, who have faster discounting rates than non-addicts [for review, see Ref. (28)]. Studies have shown that individuals discount smaller values more quickly than larger values [discounting curves are steeper, Ref. (139)], which would further reduce the discounted effectiveness of the small rewards provided early in treatment.
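As a rough illustration of the scale involved, consider the hyperbolic form that is commonly used to fit discounting data, where $A$ is the reward amount, $D$ the delay, and $k$ the discounting rate. The symbols and the two values of $k$ used here are assumptions chosen purely for illustration; they are not rates reported in the studies cited above.

$$V = \frac{A}{1 + kD}$$

For a 2.50 reward delivered after a 7-day delay:

$$k = 0.1\ \text{day}^{-1}:\quad V = \frac{2.50}{1 + 0.1 \times 7} \approx 1.47 \qquad\qquad k = 1\ \text{day}^{-1}:\quad V = \frac{2.50}{1 + 1 \times 7} \approx 0.31$$

Under these assumed rates, the subjective value of the reward at the start of the week is a fraction of its face value, and the fraction shrinks rapidly as $k$ grows.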

Furthermore, both human and non-human subjects tend to show hyperbolic discounting functions (121, 140, 141). Any non-exponential (including hyperbolic) discounting function will show *preference reversals* in which one choice is preferred when both choices are far in the future, but the other becomes preferred as the subject approaches the time of that second choice (142). Thus, even if a user decided at the beginning of the week to prefer the contingent reward (\$2.50) to taking drugs, when faced with the immediate choice, the user would seem likely to choose the drug-use option.
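The preference reversal described above can be reproduced in a few lines. The following is a minimal sketch, not a model fitted to any data: the function names, the two options (expressed in arbitrary subjective-value units), their availability times, and the discounting rate `k` are all hypothetical. The later option is assumed to have the larger undiscounted value, which is the situation in which a hyperbolic discounter reverses.

```python
import math


def hyperbolic(amount, delay, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential(amount, delay, k):
    """Exponential discounting: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay)


def preferred(discount, now, k, sooner=(4.0, 5.0), later=(10.0, 7.0)):
    """Return which option has the higher discounted value at time `now`.

    Each option is a hypothetical (amount, day_available) pair.
    """
    (a_s, t_s), (a_l, t_l) = sooner, later
    v_sooner = discount(a_s, t_s - now, k)
    v_later = discount(a_l, t_l - now, k)
    return "later" if v_later > v_sooner else "sooner"


if __name__ == "__main__":
    # Evaluate the same pair of options from day 0 up to the day the
    # sooner option becomes available (day 5).
    for day in range(0, 6):
        print(day,
              preferred(hyperbolic, float(day), k=1.0),
              preferred(exponential, float(day), k=1.0))
```

With these illustrative values, the hyperbolic agent prefers the later option when viewed from day 0 but switches to the sooner option by day 5, whereas the exponential agent makes the same choice on every day, because the ratio of the two exponentially discounted values does not depend on the time of evaluation.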

During treatment in prize-based contingency management, upon submission of a drug-negative sample, individuals immediately earn a chance to win a tangible prize. In addition, individuals have a chance (albeit low in probability) to win a high-value prize for every draw they earn. This means that even though the average overall value of reinforcers earned by subjects tends to be lower in prize-based contingency management compared to voucher-based contingency management, the availability of a more immediate reward and the chance to win a high-value prize may cause individuals to discount less. These differences in discounting rates between the two versions of contingency management may help to explain similar treatment efficacy even with differing value of total potential reward.

### **6.2. Concreteness**

The long-term rewards of abstinence tend to be more abstract than the short-term reinforcement provided by drug use (135). Several authors have suggested that the major difference between immediate rewards and delayed rewards is the concreteness of immediate rewards and the abstractness of delayed rewards (98, 143, 144).

Trope and Liberman (143) suggest that high-temporal distance creates difficult-to-conceptualize (high-level, more abstract) construals that are more difficult to reason about, while low-temporal distance creates easier-to-conceptualize (low-level, more concrete) construals. They hypothesize that more concrete options are considered to be more valuable than more abstract options. For an addict, abstinence is a high-level construal placed in the hard-to-imagine far future and is more abstract and less valuable than a concrete reinforcer, such as the option to use drugs in the present or near future, which is a low-level construal.

Current decision-making theories suggest that evaluating future outcomes depends on constructing episodically-imagined futures (87, 109, 113, 145). Kurth-Nelson and Redish (98) suggested that discounting rates may depend on how difficult it is for this construction process to find those potential future possibilities. Supporting this hypothesis is evidence that fronto-parietal areas are more active when people select the delayed option (56, 57), that subjects with better working memory and higher IQs tend to discount more slowly (115), and that training working memory can slow discounting rates (122, 133). Rewards placed in concrete episodic futures (€35 on vacation in Paris next month) are discounted more slowly than abstract future rewards (€35 next month) (146). Kurth-Nelson and Redish (98) suggest that the decreased discounting of concrete options is due to concrete futures being easier to find and construct in the deliberative search process.

Taken together, these theories imply that more concrete rewards have higher subjective value compared to abstract rewards. What does this mean for addiction? Typically, an addict has a choice between using a drug and not using a drug. The option of using the drug has immediate and concrete rewarding effects. The drug's rewarding effects include subjective pleasurable effects and relief from withdrawal, and both of these effects are expected and concrete. The option of *not* using has immediate negative effects (147), but the primary distal rewarding effects are very abstract (135).

Contingency management changes this scenario by providing the addict with a concrete reward (money, a voucher, a specific prize) contingent upon abstinence, which is more proximal than rewards for abstinence alone. This allows the addict to achieve the goal of reducing drug consumption and increasing abstinence by focusing, not on the abstract abstinence, but rather on the concrete alternative.

This theory suggests that one effect of contingency management is to make both options immediate and concrete. The combination of the discounting/proximity and the concreteness theories suggests that contingency management creates a situation where the alternate reward (i.e., the reward for abstinence over drug use) is both more concrete and closer in temporal distance, thus making it more nearly equal to the drug-use option.

The importance of concreteness is highlighted by comparing voucher- and prize-based treatments. Although subjects were encouraged to imagine concrete items that the voucher could be used for (5), in prize-based studies, the prizes are physically present in a show-cabinet right there with the prize-bowl (9). Vouchers were also useable for a variety of rewards, while winning a given prize meant that that was the concrete prize you got.

Both voucher- and prize-based methods have been found to be similarly effective, even though the value of possible earned rewards is much lower in the prize-based studies (5, 9, 11, 12, 148). In both versions, high-value rewards have been found to be more successful than low-value rewards; however, the size of rewards offered in these conditions differs considerably. Even though the total value of possible rewards received in the high-value prize-based method was lower than in the low-value voucher-based method, the high-value prize-based method was still effective for significantly reducing drug consumption, while the low-value voucher-based method was not. This exemplifies not only the importance of value but also how the concreteness of the reward affects perceived value. The presence of more concrete alternative rewards (specific prizes) appears to have more of an effect than less concrete alternative rewards (voucher exchanged for money, in turn, used for unspecified merchandise).

## **7. Conclusion and Further Discussion**

In summary, we propose that contingency management's success occurs because it provides an alternate reinforcer that forces the subject into a deliberative mode, which allows different valuation processes than non-deliberative modes. It also provides both a decreased time-to-reward and increased concreteness for the alternate reward, which should increase the valuation of the alternate reward relative to the valuation of the drug and move the agent from a willingness-to-pay valuation mode to a choice between/revealed-preference valuation mode.

### **7.1. Relationship to Classical Dual-Process Theories**

Many theoreticians have suggested that addiction arises from a mismatch between the balance of two systems (typically called a "hot" or impulsive system and a "cold," rational system) (64, 149, 150). While it is possible to place our hypotheses for contingency management within that two-system framework, we believe that the evidence suggests that addiction is more complicated than the simple out-of-balance theory proposes. Instead, we work from the theory that continued drug use can arise from computation errors in a number of places within the decision-system, of which a mismatch in balance between systems is only one potential failure mode (39, 43).

It is important to differentiate the *vulnerabilities* theory of addiction that arises from the *multiple action-selection-system* theory from the *out-of-balance* theory of addiction that arises from the *dual-process* theory. (See **Table 2** for a list of these decision-concepts used in this paper.) Our proposal that contingency management drives subjects toward deliberative processes could follow from either of these two addiction/decision-making theories, but the implications are different, depending on which theory pertains.

#### **TABLE 2 | Economic theoretical constructs used in this article**.

- **Out-of-balance theory**: the idea that addiction arises from an imbalance between the impulsive and rational systems
- **Vulnerabilities theory**: the idea that addiction arises out of processing failures in one or more of the action-selection systems

The *out-of-balance* hypothesis of addiction is that addicts have a problem with the balance between the two systems in the dual-process theory (54, 55, 66, 67, 149). These systems can be driven out of balance either from hyperactivity in the impulsive system or hypoactivity in the rational system (55, 56, 151, 152). In either case, improving the strength of the rational system [for example, by providing working memory training (122) or by increasing activity in the prefrontal cortex (153)] should decrease drug use because it should shift the balance toward the more rational system. Our proposal that contingency management drives decision-making toward deliberation implies that if the dual-process and out-of-balance theories are correct, then what contingency management is doing is shifting the balance between these two systems. Evidence supporting this concept was recently published by Wesley et al. (57), who found that in an explicit cocaine-money choice, choosing money later over cocaine now produced additional activity in the dorsolateral prefrontal cortex.

The vulnerabilities hypothesis of addiction is that there are many potential "failure modes" within these systems, any of which can lead to addictive behaviors (39, 43, 77, 154). The concept that there are many vulnerabilities implies that addiction can arise from multiple causes. Our proposal that contingency management drives decision-making toward deliberation implies that if the multiple-action-selection systems and vulnerabilities theories are correct, then what contingency management is doing is twofold: (1) it is shifting the decision-making system into deliberation because it is providing two choices, and (2) it is improving the deliberation system algorithm, by making the goals more concrete and more immediate.

There are similarities and differences between these theories. Both theories include separate action-selection systems, only one of which includes an explicit planning component.


However, the vulnerabilities theory further proposes that there are failure modes within the deliberative system as well, and thus suggests that only a subset of patients will be helped by contingency management, and that different aspects of contingency management will help different patients.

- For patients who have vulnerabilities in the Pavlovian or procedural systems, who may express a desire to quit in the absence of drug-related cues but find themselves unable to do so when faced with those cues, contingency management can provide a second option to attend to, even when faced with drug-related cues, which can enable the deliberative system to retain control. This likely relates to the difference in valuation between single-option choices (go/no-go, willingness to pay) and dual-option choices (select between).


### **7.2. Predictions and Implications**

#### 7.2.1. Identify Patients Capable of Deliberating

The idea that contingency management primarily accesses deliberative systems implies that it will be most successful in patients with viable deliberative systems. This suggests that identifying patients with intact deliberative systems would help identify patients most likely to be helped by contingency management programs. There are a number of cognitive tasks known to access deliberative systems (94, 146, 158–161). Whether these tasks are changed in addicts, however, remains unknown. The vulnerabilities theory predicts that some addicts will continue to show deliberative abilities in these tasks, and that those addicts will be best served by contingency management.

This hypothesis further suggests that patients with deficient deliberative systems would be helped by first training those systems. Working memory training, for example, decreases discounting rates as much as drug treatment (133).

#### 7.2.2. Prediction: Contingency Management will Depend on Prefrontal Integrity

The two hypotheses that contingency management depends on deliberative processes and that deliberative processes depend on prefrontal integrity predict that contingency management will be most successful in patients with strongly active prefrontal systems. Evidence that prefrontal cortical interactions with hippocampus and other neural systems are a necessary component for deliberative decision-making processes is well-established (71, 94, 96, 145, 162, 163). For example, functional connectivity between prefrontal cortex and nucleus accumbens predicts success in drug-dependence treatment and an avoidance of relapse (152). In rats, optogenetic stimulation of prelimbic (prefrontal) cortices decreases compulsive drug seeking, while optogenetic inhibition of prelimbic (prefrontal) cortices increases it (153). Similarly, in humans, repetitive transcranial magnetic stimulation (rTMS) over the dorsolateral prefrontal cortex reduced reported craving in nicotine addicts (164).

This prediction also suggests that patients with improved cognitive abilities (115) and with prefrontal cortices more likely to play active roles in decision-making (56, 57, 152) will be more capable of using contingency management. These hypotheses imply that further improvements in cognitive resources [such as with working memory training (122, 133)] or increases in prefrontal activity (153) will make patients more capable of using contingency management.

#### 7.2.3. Combine Contingency Management with Working Memory Training and Cognitive Reassessment Therapy

Contingency management is often provided together with pharmacological and sociological treatments (counseling, 12-step group work, methadone or nicotine-replacement treatment, etc.) (1). While these additional treatments provide potential rectification of decision-making vulnerabilities and failure modes, we suggest that they do not directly address the reasons for the success of contingency management. Under the hypothesis that contingency management depends on deliberative processes, improvements in those deliberative processes should provide additional improvements in the success of contingency management.

Deliberative decision-making entails the creation and imagination of hypothetical episodic futures and evaluation of those futures (43, 85, 109, 111, 145). As such, it requires a search process and memory to compare those evaluations (87, 98, 163). Changes in the recognition of the underlying paths through those futures affect the decisions made (138, 140, 165). For example, the famous dictum that "there is no such thing as one drink for an alcoholic" implies that decisions are not between drinking one drink and not, but between drinking many drinks and not. This process leads to *bundling*, in which future decisions are bundled together, which changes the underlying valuation of those future decisions (140, 165).
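The effect of bundling on valuation can be illustrated with a minimal sketch. It assumes hyperbolic discounting, V = A / (1 + kD), and a choice that recurs at a fixed interval; the function names, the two options (in arbitrary subjective-value units), the weekly spacing, and the discounting rate `k` are hypothetical values chosen only to show the direction of the effect.

```python
def hyperbolic(amount, delay, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def bundle_value(amount, first_delay, n_choices, spacing, k):
    """Summed discounted value of the same option taken at each of
    `n_choices` future choice points, `spacing` days apart."""
    return sum(
        hyperbolic(amount, first_delay + i * spacing, k)
        for i in range(n_choices)
    )


def bundled_choice(n_choices, k=1.0, spacing=7.0,
                   sooner=(4.0, 0.0), later=(10.0, 2.0)):
    """Compare 'always take the sooner option' with 'always take the
    later option' over a bundle of `n_choices` recurring decisions.

    Each option is a hypothetical (amount, delay_in_days) pair measured
    from the moment of the first decision.
    """
    v_sooner = bundle_value(sooner[0], sooner[1], n_choices, spacing, k)
    v_later = bundle_value(later[0], later[1], n_choices, spacing, k)
    return "later" if v_later > v_sooner else "sooner"


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, bundled_choice(n))
```

With these illustrative numbers, an isolated decision (a bundle of one) favors the immediately available option, while treating the decision as a bundle of three or more recurring choices favors the later option, even though nothing about the underlying discounting function has changed.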

Changes in the ability to create, imagine, test, and remember those futures will also likely increase the ability to engage that deliberative system. It is possible to improve executive function and working memory through training (122, 166). These procedures decrease impulsivity as measured by discounting experiments. Given the data that cognitive load decreases engagement of the deliberative system (67, 124, 160), merely recognizing that patients are particularly vulnerable under stress and in situations of increased cognitive load (165, 167) could suggest proactive procedures (such as increased rewards or increased reminders) during times of stress and cognitive load.

#### 7.2.4. Increasing Value of the Alternate Option

From the very first introductions of contingency management, it has been clear that providing an increased value of the alternate rewards increases the success rate (1, 11, 148). This is a straightforward prediction of the alternate reinforcement theory. However, as expected from the discussion of the pre-clinical data (above), dramatic changes would require very large alternate rewards. For example, increasing the payout from \$0.50 on the first negative drug urine sample to \$7.00 produces a significant effect (168). Given the political difficulty of paying for drug treatment programs, finding ways to increase the success of contingency management without dramatically increasing costs would be particularly useful. Prize-based contingency management is one example of reducing costs without decreasing efficacy (1, 11).

#### 7.2.5. Concrete Options are Discounted Less than Abstract Options – Provide Reminders of the Concrete Alternate Reward

If one could increase the proximity of the rewards at the moment of decision, one could further increase the value of the alternative option. Thus, one potential improvement would be to provide a concrete reminder of the alternate reward (such as what the current voucher value is) on an easily accessible place (such as a smartphone app) that could be accessed at the actual moment of decision.

Although concrete options are more valuable than abstract options, symbolic reminders of concrete options might also increase the value of alternate options. For example, simply stating a delayed reward will be delivered during an episodic event decreases discounting and increases value relative to equivalent, but less concrete rewards (146). Similarly, pictures of food rewards are more valuable than text descriptions of those rewards (169). Thus, visual symbols can improve both concreteness and deliberation. This suggests that providing the picture of the specific concrete option being worked toward is likely to further improve the reminder. Similarly, providing direct information about the values of the alternative options (such as days clean, days remaining to reward, points that would be lost due to relapsing) would make it easier for the patient to evaluate the alternative outcome, which should make it easier for the patient to attend to (and select) the alternative outcome. This could also be accomplished through a smartphone app that shows the picture of the reward being worked toward and information about the voucher points needed to achieve that goal.

#### 7.2.6. Preventing Relapse after Contingency Management Treatment

As with any treatment, many patients relapse after treatment. The vulnerabilities theory suggests that addiction is caused by a multitude of potential failure modes (39, 43). Although contingency management is a support mechanism that can aid in a person's recovery, other failure modes may still remain even after completion of the contingency management series. However, contingency management can be combined with other treatments (1, 5, 9). Studies have shown that the cognitive and discounting impairments that arise during drug and alcohol use improve with continued abstinence (133, 170–174). Thus, contingency management can create a span of time for an individual to repair these failure modes, while also learning important skills to increase the chance to remain abstinent in the future.

One potential solution would be to teach users to create their own contingency management process, providing their own deliberative alternatives. Changes in expectations and representations of the outcomes of potential options can change decision-making choices, even without changes in the underlying action-selection processes (135, 138, 140).

### **Author Contributions**

The manuscript was co-written by both authors.

### **References**


### **Acknowledgments**

We would like to thank Nancy Petry, Josh Gordon, and Zeb Kurth-Nelson for discussion, and Nancy Petry for comments on an early draft of the manuscript. *Funding:* This work was supported by NIH grants R01-DA024080, R01-DA030672, and T32-DA007234.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Regier and Redish. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
