# TWENTY YEARS AFTER THE IOWA GAMBLING TASK: RATIONALITY, EMOTION, AND DECISION-MAKING

EDITED BY: Jong-Tsun Huang, Yao-Chu Chiu, Ching-Hung Lin and Jeng-Ren Duann

PUBLISHED IN: Frontiers in Psychology and Frontiers in Neuroscience

#### Frontiers Copyright Statement

© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714; ISBN 978-2-88945-528-7; DOI 10.3389/978-2-88945-528-7

Topic Editors:

Jong-Tsun Huang, China Medical University, Taiwan

Yao-Chu Chiu\*, Soochow University, Taiwan

Ching-Hung Lin\*, Kaohsiung Medical University, Taiwan

Jeng-Ren Duann, National Central University, Taiwan

Image configuration design by Yao-Chu Chiu; Ching-Hung Lin; Ting Chiu. Photograph by Ting Chiu. Graphic Design by Ting Chiu.

The world is full of uncertainty. In unpredictable circumstances, can emotions facilitate advantageous decision-making? A neuroscience team, led by Antonio Damasio, explored this question using the Iowa Gambling Task (IGT). To the present day, the findings of numerous IGT-related investigations strongly influence clinical and interdisciplinary research, for example, in neuroeconomics and neuromarketing.

This special issue examines IGT-based research progress over the past 20 years through literature reviews, clinical examinations, model construction, theoretical integration, and brain imaging technology. Both supportive and opposing viewpoints are provided to frame correlations between rationality, emotion, decision-making, and IGT. Potential future directions for IGT studies are discussed.

\*Yao-Chu Chiu, yaochu@mail2000.com.tw

\*Ching-Hung Lin, eandy924@gmail.com

Citation: Huang, J-T., Chiu, Y-C., Lin, C-H., Duann, J-R., eds. (2018). Twenty Years After the Iowa Gambling Task: Rationality, Emotion, and Decision-Making. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-528-7

# Table of Contents

*Editorial: Twenty Years After the Iowa Gambling Task: Rationality, Emotion, and Decision-Making*

Yao-Chu Chiu, Jong-Tsun Huang, Jeng-Ren Duann and Ching-Hung Lin

## SECTION I

## REVIEWS


Damien Brevers, Antoine Bechara, Axel Cleeremans and Xavier Noël

*The Iowa Gambling Task and the Three Fallacies of Dopamine in Gambling Disorder*

Jakob Linnet


## SECTION II

## CLINICAL EXAMINATIONS


Varsha Singh

*The Impact of Frontal and Cerebellar Lesions on Decision Making: Evidence From the Iowa Gambling Task*

Caroline de Oliveira Cardoso, Laura Damiani Branco, Charles Cotrena, Christian Haag Kristensen, Daniela Di Giorge Schneider Bakos and Rochele Paz Fonseca

## SECTION III

## MODEL CONSTRUCTION


Junyi Dai, Rebecca Kerestes, Daniel J. Upton, Jerome R. Busemeyer and Julie C. Stout


## SECTION IV

## THEORETICAL INTEGRATION

*It's All in How You Think About It: Construal Level and the Iowa Gambling Task*

Bradley M. Okdie, Melissa T. Buelow and Kurstie Bevelhymer-Rangel

*Decision Making in Healthy Participants on the Iowa Gambling Task: New Insights From an Operant Approach*

Peter N. Bull, Lynette J. Tippett and Donna Rose Addis

*A Potential Role of Reward and Punishment in the Facilitation of the Emotion-Cognition Dichotomy in the Iowa Gambling Task*

Varsha Singh

*Sex-Differences, Handedness, and Lateralization in the Iowa Gambling Task*

Varsha Singh

## SECTION V

## BRAIN IMAGING TECHNOLOGY

*Altered Dynamics Between Neural Systems Sub-Serving Decisions for Unhealthy Food*

Qinghua He, Lin Xiao, Gui Xue, Savio Wong, Susan L. Ames, Bin Xie and Antoine Bechara

*Decision and Dopaminergic System: An ERPs Study of Iowa Gambling Task in Parkinson's Disease*

Daniela Mapelli, Elisa Di Rosa, Matteo Cavalletti, Sami Schiff and Stefano Tamburin

*Cognition and Emotional Decision-Making in Chronic Low Back Pain: An ERPs Study During Iowa Gambling Task*

Stefano Tamburin, Alice Maier, Sami Schiff, Matteo F. Lauriola, Elisa Di Rosa, Giampietro Zanette and Daniela Mapelli

*Learning on the IGT Follows Emergence of Knowledge but not Differential Somatic Activity*

Gordon Fernie and Richard J. Tunney

# Editorial: Twenty Years After the Iowa Gambling Task: Rationality, Emotion, and Decision-Making

Yao-Chu Chiu<sup>1</sup>, Jong-Tsun Huang<sup>2</sup>\*, Jeng-Ren Duann<sup>3</sup> and Ching-Hung Lin<sup>4,5</sup>\*

*<sup>1</sup> Department of Psychology, Soochow University, Taipei, Taiwan, <sup>2</sup> Graduate Institute of Biomedical Sciences, China Medical University, Taichung, Taiwan, <sup>3</sup> Institute of Cognitive Neuroscience, National Central University, Taoyuan, Taiwan, <sup>4</sup> Department of Psychology, Kaohsiung Medical University, Kaohsiung, Taiwan, <sup>5</sup> Research Center for Nonlinear Analysis and Optimization, Kaohsiung Medical University, Kaohsiung, Taiwan*

Keywords: rationality, emotion, decision-making, Iowa Gambling Task, somatic marker hypothesis, ventromedial prefrontal cortex, expected value, gain-loss frequency

**Editorial on the Research Topic**

**Twenty Years After the Iowa Gambling Task: Rationality, Emotion, and Decision-Making**

#### Edited by:

*Antonio Damasio, Brain and Creativity Institute, University of Southern California, United States*

#### Reviewed by:

*Marco Verweij, Jacobs University Bremen, Germany*

#### \*Correspondence:

*Ching-Hung Lin, eandy924@gmail.com; Jong-Tsun Huang, jongtsun@mail.cmu.edu.tw*

#### Specialty section:

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology*

Received: *26 September 2017* Accepted: *22 December 2017* Published: *25 January 2018*

#### Citation:

*Chiu Y-C, Huang J-T, Duann J-R and Lin C-H (2018) Editorial: Twenty Years After the Iowa Gambling Task: Rationality, Emotion, and Decision-Making. Front. Psychol. 8:2353. doi: 10.3389/fpsyg.2017.02353*

## RATIONALITY AND EMOTION IN DECISION-MAKING

Traditionally, the role of "emotion" has received little attention in research studies of decision-making (Finucane et al., 2000). However, 20 years ago, the "Somatic Marker Hypothesis" (SMH) proposed by the neuroscientist Antonio Damasio was introduced to explore decision-making under uncertainty (Bechara et al., 1994; Damasio, 1994). The SMH suggested that, under uncertain situations, second-level processing of the intact emotion system could facilitate rational decision-making in the long term. The core brain regions of the somatic marker (SM) system are believed to be located in the ventromedial prefrontal cortex (VMPFC) and orbitofrontal cortex, which integrate bodily signals from the peripheral to the central nervous system to create a response such as subjective feeling, and can also modulate and monitor decision-making (e.g., gut-feeling). The signals in the SM system can be regarded as a representation of certain positive or negative events or circumstances. In short, the intact SM system helps decision makers avoid disadvantageous choices or situations and instead consider advantageous choices or situations (Damasio, 1994, 1996).

Damasio and other notable neuroscientists also designed an examination tool referred to as the Iowa Gambling Task (IGT) that can be used to simulate dynamic real-life decision-making behavior as well as test the SMH (Bechara et al., 1994, 1997, 2000). This group of researchers evaluated patients with VMPFC lesions using the IGT as a testing tool and recorded skin conductance responses (SCRs) to create an ideal experimental paradigm for exploring rationality and emotion in decision-making. The IGT has been used both as an indexical tool for studying the interaction between emotions and decision-making, and as a tool for clinical research and assessment (Bechara, 2007, 2016). The IGT has made a significant impact on cross-field research. In preparation for the publication of this special issue, "Iowa Gambling Task: 20 Years After," we searched the PubMed database using the phrase "Iowa Gambling Task" and found more than 400 IGT-related articles in 2012. Notably, the number of relevant articles has nearly doubled over the last 5 years to more than 800 in 2017. As numerous indices show, the IGT has provided a communal experimental platform for research in multiple fields that focus on issues related to emotions and decision-making.

## VALIDITY ISSUES WITH IGT INVESTIGATIONS

The IGT is a gambling game that simulates a gain–loss experience in an uncertain environment. The gain–loss structure of the IGT utilizes four decks of cards marked A, B, C, and D. The selection of decks A and B results in a relatively large gain (US\$100) in each trial and large losses (e.g., US\$1250 in deck B) in some trials. The selection of decks C and D results in a small gain (US\$50) in each trial and small losses (e.g., US\$250 in deck D) in some trials. On average, selections from decks A and B over 10 trials will cause decision-makers to lose US\$250, and as such, these are defined as disadvantageous decks. Conversely, selections from decks C and D over 10 trials will cause decision-makers to gain US\$250, so these are defined as advantageous decks. The advantageous decks (C, D) provide small immediate gains in each trial, but the long-term outcome is positive; by contrast, the disadvantageous decks (A, B) provide large immediate gains in each trial, but the long-term outcome is negative (see **Table 1**, Bechara et al., 1994). Before playing the IGT, experimenters encourage participants to earn or avoid losing as much money as possible. However, at the start of the game, participants have insufficient information to guide them in making the right choice. They are also unaware of the internal gambling structure and the end result of the game. Theoretically, the participants are therefore situated in an uncertain environment. Furthermore, in order to gain the best outcome, participants would have to use their intuition based on their emotions determined by the SM system (Bechara et al., 1994, 1997, 2000).
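The deck arithmetic described above can be checked with a short sketch. It uses only the per-trial gains and the 10-trial net outcomes quoted in the text. The total loss per block of 10 trials for each deck is derived from those two figures (10 × gain − net) rather than read from the original payoff table, and the names `DECKS`, `net_per_block`, and `classify` are illustrative, not part of any IGT software.

```python
# Net outcome of each IGT deck over a block of 10 selections,
# using the figures quoted in the text (Bechara et al., 1994).
# Assumption: total losses per block are derived as (10 * gain - net),
# not taken from the original payoff table.

TRIALS_PER_BLOCK = 10

# deck -> (gain per trial in US$, total loss per 10 trials in US$)
DECKS = {
    "A": (100, 1250),
    "B": (100, 1250),  # e.g., a single US$1250 loss
    "C": (50, 250),
    "D": (50, 250),    # e.g., a single US$250 loss
}


def net_per_block(gain_per_trial, total_loss):
    """Return the net outcome after one block of 10 selections."""
    return gain_per_trial * TRIALS_PER_BLOCK - total_loss


def classify(net):
    """Label a deck by the sign of its long-term outcome."""
    return "advantageous" if net > 0 else "disadvantageous"


for deck, (gain, loss) in DECKS.items():
    net = net_per_block(gain, loss)
    print(f"Deck {deck}: net {net:+d} per 10 trials ({classify(net)})")
```

Decks A and B pay twice as much per trial as decks C and D, yet their block total is negative, which is the conflict between immediate and long-term outcomes that the task is built around.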

Decision-makers receive gain/loss information after each round of card selection in IGT-related experiments. It is impossible to guess the internal gambling structure in advance, or to predict how to make the most money, but once the game is in progress, decision-makers gradually tend to prefer the good decks and avoid choosing the bad decks, potentially drawing upon physiological feedback. For example, their SCRs could be construed as an alarm signal that encourages the decision-maker to avoid selecting the bad decks before the cards are overturned. At the start of the game, participants are unable to differentiate between good or bad decks, but they exercise a "gut-feeling" in making selections for the IGT. This emotion thus influences decision-makers by guiding them to eventually choose only the good decks and thus obtain the best outcomes. Conversely, participants affected by VMPFC lesions are devoid of the SM system and are therefore unable to register gain/loss experience during the IGT. Therefore, VMPFC patients were unable to inhibit their preference for the bad decks, lost consecutively, and presented a shortsighted choice pattern (Bechara et al., 1997, 1999, 2000).

Nonetheless, some researchers have questioned the relevance of the IGT in testing the SMH (Dunn et al., 2006). Several other research teams have adopted the IGT to examine the SMH and have provided evidence that does not match the results obtained by Damasio's team. For instance, Tomb et al. (2002) have revealed that the amplitude of SCRs was unaffected by monetary and expected values (EVs) of cards during the IGT. Furthermore, Maia and McClelland (2004, 2005) have found that decision-makers possess sufficient knowledge to detect the gambling structure during the early stages of the game, and as a consequence their processing is explicit, not implicit. In reply, Bechara et al. (2005) have emphasized that the SM signal does not just represent implicit processing. More specifically, healthy decision-makers mostly perform the IGT rationally and can be influenced by the SM system in either a covert or overt manner. In this manner, the original Iowa group has argued that the data reported by Maia and McClelland (2004) do not invalidate the SMH.

Furthermore, several researchers (Wilder et al., 1998; Fernie and Tunney, 2006; Lin et al., 2007; Chiu et al., 2008, 2012; Upton et al., 2012; Steingroever et al., 2013; Seeley et al., 2014) have discovered another critical issue that guides selection behavior during the IGT. All these authors have highlighted the importance of the number of gains or losses obtained, and not their expected value. The decision-makers in these studies favored decks B and D because of the associated high-frequency gains and low-frequency losses, without considering the long-term outcome. Notably, the SMH has mostly been based upon evidence gained by comparing the IGT performances and SCRs of VMPFC patients with those of healthy decision-makers. An important point to note is that, based on the basic assumption of the SMH, healthy decision-makers should perform well and gradually approach the positive expected value choice in the IGT because of the alarm signals created by somatic markers and vice versa. However, empirical and modeling observations based on the prominent deck B (PDB) phenomenon and gain/loss frequency have clearly demonstrated a decision-maker's inability to consider long-term outcomes (or EV) in the IGT (Wilder et al., 1998; Ahn et al., 2008; Upton et al., 2012; Lin et al., 2013; Seeley et al., 2014; Worthy and Maddox, 2014; Lin et al.; Worthy et al.). Consequently, findings related to the PDB phenomenon and gain/loss frequency have clearly echoed the main points reported in previous literature concerning behavioral decision-making (Lichtenstein et al., 1969; Kahneman and Tversky, 1979; Tversky and Kahneman, 1981). In particular, the two viewpoints (SMH vs. behavioral decision) have separately represented foresighted and myopic viewpoints to interpret the decision-maker's behavior. Consequently, the two explanatory schemes were obviously controversial and incongruent in terms of understanding choice behavior under uncertainty.
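The contrast between the two accounts can be made concrete with a small sketch. It selects the preferred decks once by net outcome per 10 trials (the EV account) and once by the number of loss trials per 10 (the gain/loss-frequency account). The net outcomes are those stated in the text. The loss counts (five in decks A and C, one in decks B and D) follow the commonly described original schedule and are an assumption here, since the payoff table is not reproduced in this text. The function names are hypothetical.

```python
# deck -> (net US$ per 10 trials, loss trials per 10 trials)
# Net values come from the text; loss counts are an assumed schedule.
DECKS = {"A": (-250, 5), "B": (-250, 1), "C": (250, 5), "D": (250, 1)}


def preferred_by_ev(decks):
    """Decks an expected-value-driven chooser should prefer (positive net)."""
    return sorted(d for d, (net, _) in decks.items() if net > 0)


def preferred_by_frequency(decks):
    """Decks a frequency-driven chooser should prefer (fewest loss trials)."""
    fewest = min(losses for _, losses in decks.values())
    return sorted(d for d, (_, losses) in decks.items() if losses == fewest)


print("EV account:       ", preferred_by_ev(DECKS))         # ['C', 'D']
print("Frequency account:", preferred_by_frequency(DECKS))  # ['B', 'D']
```

Under these assumptions the two accounts agree on deck D and disagree on decks B and C, so a preference for deck B is the observation that separates them.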

If the frequency of gains or losses largely influences a participant's poor performance during the IGT, this finding not only belies the basic assumption of the SMH proposed by Damasio's team, but also calls into question whether the effects of gain/loss frequency could be observed in the data reported by Tomb et al. (2002) and Maia and McClelland (2004). It is particularly important to resolve the latter point because the findings of these two studies generally hinge upon the basic assumption of the SMH, in that the SM system assists the decision-maker in obtaining the best long-term outcome. If this basic assumption needs to be reexamined, then the arguments proposed in these two studies will also need to be reevaluated.

It is also important to highlight that an increasing number of studies are showing evidence that healthy participants exhibit myopic choice behavior similar to VMPFC patients (Caroselli et al., 2006). Furthermore, over the last 20 years, advancements in brain imaging technology have allowed such studies to include more clinical patients (Ernst et al., 2002; Fukui et al., 2005; Lin et al., 2008, 2015; Li et al., 2010), thus allowing the IGT to gain ground in becoming a useful tool for investigating the correlation between rationality, emotions, and decision-making.

*Note: The red marked the loss event; the blue marked the gain event; This table was sourced from Bechara et al. (1994).*

In the meantime, modeling-related studies have also gradually enhanced our existing knowledge by shedding light on the cognitive processing of decision-makers while playing the IGT (Busemeyer and Stout, 2002; Ahn et al., 2008; Worthy et al., 2012; Steingroever et al., 2014). The papers we solicited for inclusion in this book also echo and expand on many of these issues.

## THE SPECIAL ISSUE OF "IGT: 20 YEARS AFTER"

In 2012, we started preparing this special publication, entitled "Iowa Gambling Task: 20 Years After," and invited researchers from various fields related to IGT development from across the world to submit contributions. The proposed content includes reviews, prospective notes, as well as empirical, modeling, behavioral, and brain imaging studies. The invited researchers presented their knowledge and perspectives on these issues, and their contributions were peer-reviewed. In line with our suggestions, we expected the contributed papers to discuss advances in IGT-related issues. Papers were solicited from August 2012 until the end of 2015. A total of 24 papers were accepted that reflect the entire picture of IGT development over the past 20 years. These 24 papers can be divided into five categories as detailed below.

**Category I: Reviews:** (1) Must et al. review IGT and depression-related issues; (2) Brevers et al. review studies on IGT and gambling disorders; (3) Linnet provides a review of IGT in the context of dopamine and gambling disorders; (4) Cassotti et al. review IGT in relation to developmental studies; (5) Turnbull et al. consider IGT performance as the processing of emotion-based learning; (6) Overman and Pierce examine the effects of real plus virtual cards and additional trials; and (7) van den Bos et al. provide a global overview of rodent versions of the IGT.

**Category II: Clinical examinations:** (1) Sallum et al. discuss the IGT and attention deficit hyperactivity disorder; (2) Xiao et al. combine the IGT and functional magnetic resonance imaging (fMRI) in order to investigate adolescent smoking behavior; (3) Singh describes the connection between sleep deprivation and IGT performance; and (4) de Oliveira Cardoso et al. provide a behavior-image study that investigates the correlation between frontal and cerebellar lesions and IGT performance.

**Category III: Model construction:** (1) Worthy et al. compare predictability between win-stay/lose-shift and Value-Plus-Perseveration (VPP) models in the IGT; (2) Steingroever et al. validate the predictive power of the Prospect Valence Learning–Delta model; (3) Dai et al. provide an improved cognitive model for predicting IGT choice behavior; (4) Lin et al. refine a simplified model for estimating IGT performance; and (5) Ahn et al. compare three advanced IGT-related computational models.

**Category IV: Theoretical integration:** (1) Okdie et al. provide a statement on construal level theory for IGT-related performance; (2) Bull et al. consider sensitivity toward reward and punishment in healthy IGT participants; (3) Singh suggests a potential role for reward and punishment during the IGT; and (4) Singh considers the influence of sex-differences, handedness, and lateralization on IGT performance.

**Category V: Brain imaging technology:** (1) He et al. combine IGT and fMRI to investigate decisions involving unhealthy food; (2) Mapelli et al. utilize the IGT and event-related potentials (ERPs) to depict the behavioral performance and brain activation of patients with Parkinson's disease; (3) Tamburin et al. combine the IGT and ERPs to detect choice behavior and brain activation in patients with chronic lower-back pain; and (4) Fernie and Tunney describe a study on the correlation between SCRs and knowledge effects in the IGT.

The articles selected for inclusion in this special issue provide good coverage of neuroimaging modalities (ERP, fMRI, and SCR) used in previous IGT experiments. However, there might still be some room for a data-driven analysis method (McKeown et al., 2003) to relieve the limitation brought about by the fixed event structure used in a model-based method. After all, the brain responses to such a complex process might not always be time-locked to the event onset (Duann and Chiou, 2016).

## CONCLUSION

The 24 papers that form this new book are mostly consistent with IGT developmental issues over the past 20 years, such as the application of IGT in clinical scenarios, integrative investigations with combined brain imaging technology and the establishment of new models and theories. However, it is also necessary to continue global investigations and debate with regards to some existing and unresolved issues related to the IGT. For example: (1) What types of brain lesions (mental dysfunction) does the IGT truly measure? (2) Can SCRs be combined with the IGT to form a critical index of somatic markers? (3) Does the IGT measure ability for implicit or explicit learning? (4) Does EV or gain/loss frequency primarily guide decision-making behavior in the IGT? (5) Is it possible to devise a more sensitive data analysis method that can allocate more specific brain responses to the precise behaviors of IGT performance, such as the events of win, loss, and the switching of card decks? We recommend that future studies of IGT consider these questions seriously and provide in-depth investigations and discussions.

## AUTHOR CONTRIBUTIONS

Y-CC, C-HL, and J-TH discussed the main structure of this article. Y-CC and C-HL drafted the preliminary title, literature review, and chapter categorization, as well as the initial draft. J-RD and C-HL provided additional viewpoints for future development in the use of brain imaging for studying the IGT. J-TH and J-RD provided final refinements to this article.

## ACKNOWLEDGMENTS

The authors of this article and editors of this book would like to thank the Ministry of Science and Technology (Taiwan) for its financial support, under Contract No. NSC 102-2410-H-031-014-. C-HL's work was supported in part by Kaohsiung Medical University, Taiwan (KMUTP103F00-03). Special thanks go to Prof. Verweij's careful review as well as the NOVA and Charlesworth Editing Groups for their valuable help with English editing and proofreading services for this manuscript. Furthermore, we deeply appreciate the kind and helpful contributions and coordinated actions of the contributors, reviewers, FIP editorial officers and chief editors. Without the contributions of everyone involved, the collective viewpoints in this publication would not be as provocative and comprehensive.

## REFERENCES

Gambling Task. Psychol. Assess. 14, 253–262. doi: 10.1037/1040-3590.14.3.253


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Chiu, Huang, Duann and Lin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## The Iowa Gambling Task in depression – what have we learned about sub-optimal decision-making strategies?

*Anita Must<sup>1</sup>\*, Szatmar Horvath<sup>1,2</sup>, Viola L. Nemeth<sup>1</sup> and Zoltan Janka<sup>1</sup>*

<sup>1</sup> Department of Psychiatry, University of Szeged, Szeged, Hungary

<sup>2</sup> Department of Psychiatry, Vanderbilt University, Nashville, TN, USA

#### *Edited by:*

Ching-Hung Lin, Kaohsiung Medical University, Taiwan

#### *Reviewed by:*

Benjamin Hayden, University of Rochester, USA

V. S. Chandrasekhar Pammi, University of Allahabad, India

#### *\*Correspondence:*

Anita Must, Department of Psychiatry, University of Szeged, 57 Kalvaria Avenue, P.O. Box 427, Szeged H-6701, Hungary; e-mail: must.anita@med.u-szeged.hu

Our earlier study found patients with depression to show a preference for larger reward as measured by the Iowa Gambling Task (IGT). In this IGT version, larger rewards were associated with even larger consequent losses. In the light of the clinical markers defining depressive disorder, this finding might appear controversial at first. Performance of depressed patients on various decision-making (DM) tasks is typically found to be impaired. Evidence points toward reduced reward learning, as well as the difficulty to shift strategy and integrate environmental changes into DM contingencies. This results in an impaired ability to modulate behavior as a function of reward, or punishment, respectively. Clinical symptoms of the disorder, the genetic profile, as well as personality traits might also influence DM strategies. More severe depression increased sensitivity to immediate large punishment, thus predicting future decisions, and was also associated with higher harm avoidance. Anhedonic features diminished reward learning abilities to a greater extent, even predicting clinical outcome. Several questions about how these aspects relate remain to be clarified. Is there a genetic predisposition for the DM impairment preceding mood symptoms? Is it the consequence of clinical signs or even learned behavior serving as a coping strategy? Are patients prone to develop an aversion to loss or are they unable to sense or deal with reward or the preference of reward? Does the DM deficit normalize or is a persisting impairment a predictor of clinical outcome or relapse risk? To what extent is it influenced by medication effects? How does a long-lasting DM deficit affect daily life and social interactions? Strikingly, research evidence indicates that depressed patients tend to behave less deceptively and in a more self-focused manner, resulting in impaired social DM. The difficulty in daily interpersonal interactions might contribute to social isolation, further intensifying depressive symptoms.

**Keywords: decision-making, depression, anhedonia, habenula, vmPFC, outcome**

## **INTRODUCTION**

Depression is traditionally considered an affective disorder. Yet, research in the past decades has drawn attention to the substantial impairment in cognitive function. Various aspects of cognitive disturbance have frequently been reported in the acute phase of the illness (Harvey et al., 2004; Rogers et al., 2004). These include domains of executive function, such as planning and problem solving (Naismith et al., 2003), inhibition and semantic fluency (Ravnkilde et al., 2002; Gohier et al., 2009) – present even in first episode major depressive disorder (Schmid and Hammar, 2013) – decision-making (DM; Chamberlain and Sahakian, 2006) and various aspects of memory processes (Rose and Ebmeier, 2006; Taylor Tavares et al., 2007). Convincing research evidence has accumulated about the key cognitive deficits characterizing a major depressive episode (for a review and meta-analysis see Castaneda et al., 2008; Lee et al., 2012). The cognitive alterations affect several aspects of daily life functioning including work performance, planning and DM and even social interactions. However, the cognitive profile defined during a depressive episode might not merely be the consequence of depressive symptoms (Hammar and Ardal, 2009). Findings indicate that improvement of the cognitive disturbance and aspects of daily life functioning are not always in accordance with the remission of a depressive episode (Kennedy et al., 2007). Nevertheless, the cognitive deficit plays a crucial role in functional recovery from depression (Jaeger et al., 2006) while a persistent cognitive impairment might be an important factor associated with long-lasting disability in everyday functioning. In his thought-provoking review, Kendler offers the concept that our own decisions might well intervene in causal pathways from the genome to behavior and phenotype. Kendler argues that human cognitive DM capacity may either suppress or augment the expression of risk genes and heritability of a trait (Kendler, 2013). Consequently, the DM capacity might have a fundamental effect on social skills and coping strategies, influencing vulnerability, preventing symptoms or even enhancing relapse risk.

The Iowa Gambling Task (IGT) has had a substantial impact on our understanding of the complex aspects of DM in the past two decades. Here we aim to provide a targeted review of the literature in the effort to shed some light on its revealing role on the DM deficit in depression. Special emphasis is directed to the influence of anhedonic symptoms, the role of habenula and the ventromedial prefrontal cortex (vmPFC) in dysfunctional DM and their combined effect on outcome prediction.

"fpsyg-04-00732" — 2013/10/8 — 21:14 — page 1 — #1

## **THE IOWA GAMBLING TASK – A BENCHMARK OF REAL-LIFE DECISION-MAKING**

Designed by Bechara et al. (1994), the IGT resembles the real-life DM process, relying on contingencies of reward and penalty and taking advantage of the uncertainty of outcomes. Development of the task was guided by the somatic marker hypothesis, which assumes that bodily signals arising in reaction to the experience of reward or punishment guide behavior toward long-term beneficial choices (Damasio, 1994). The IGT involves four decks of cards, and participants are asked to freely choose one card at a time from any of the decks. In the original version ("ABCD"), selection from two decks (A and B) is followed by a high immediate reward in play money on a computerized system, but at unpredictable points an even higher penalty occurs. Picking from the other two decks (C and D) is associated with a smaller gain but an even smaller future loss, which proves to be more advantageous in the long term. Thus, participants start to show a preference for the more advantageous decks of cards and tend to avoid decks A and B, which carry disadvantageous future consequences. This DM tendency is also predicted by anticipatory skin conductance responses among healthy participants. Driven by the interest to understand the neuroanatomical background and the motivational aspects of the DM process, Bechara et al. (2000) designed a variant ("EFGH") of the original IGT version by reversing the order of reward and punishment. Here the advantageous decks (E and G) yielded an immediate high loss but an even higher consequent gain, while decks F and H contained the cards that are more disadvantageous in the long term, with smaller penalties but even lower rewards at unforeseeable time points (Bechara et al., 2000). Patients suffering from major depressive disorder are typically found to show altered sensitivity to reward and punishment on both IGT variants. 
This involves fewer selections from the advantageous decks on the "ABCD" version (Han et al., 2012) and less shifting of DM strategies in the light of encountered experiences during both the standard and the contingency-shift phases of the IGT (Cella et al., 2010). Our earlier study detected a preference for larger reward as measured by the IGT in a group of depressed patients. While performing the "ABCD" version, participants suffering from depressive disorder tended to choose from the disadvantageous decks offering high immediate reward. Despite the consequent increased punishment, patients failed to shift strategy and to develop a long-term beneficial DM tendency (Must et al., 2006). Increased reward preference in depression might appear controversial at first. However, the critical underlying factor might rather be an impaired ability to process reinforcement. The reward-related processing deficit revealed in depression leads to difficulty in integrating feedback information to guide future behavior. Consequently, depressed patients focus on the immediate outcome, thus preferring the decks with higher short-term reward. The decreased ability to integrate reward-related reinforcement history might thus be considered a manifestation of reduced reward responsiveness (Eshel and Roiser, 2010). Depressed patients appear to experience a more pronounced decisional conflict in DM situations, explained by a dysfunctional processing of seemingly unpredictable or counterfactual outcomes (Chase et al., 2010). Moreover, depressed patients have been characterized by a prolonged attenuation of temporal discounting of rewards (Lempert and Pizzagalli, 2010), also suggesting the impairment of DM processing. Considering the difficulty to shift strategy even after encountering large subsequent penalties, we might even speculate that depressed patients consider the loss to be inevitable, inherent to a rewarding stimulus. 
Findings suggest that depressed individuals presume punishing consequences to be more likely to occur than rewarding ones. This is supported by the observation that depressed patients did not change their behavior under conditions of absent versus negative feedback, seemingly expecting some penalty, as if predestined (Elliott et al., 1998). If faced with immediate punishment, as in the "EFGH" version of the IGT, a potential large gain in the future might not outweigh a high loss in the present. In this case individuals with depression might prefer to make fewer selections from the risky decks, picking more cards defined by low-magnitude punishment (Cella et al., 2010), though these are disadvantageous in the long term. Strikingly, acutely depressed patients have also been shown to learn to avoid risky responses better than controls (von Helversen et al., 2011). This might be related to higher harm avoidance, enhanced sensitivity to aversive stimuli and a bias toward negative self-evaluation, and is also consistent with clinical symptoms of depression (Paulus and Yu, 2012). The possibility might even be raised that certain subgroups of depressed patients with different leading clinical symptoms are characterized by distinct DM strategies. In the next section we discuss factors potentially influencing the DM process, including clinical markers and neuroanatomical correlates of depression. Special emphasis is given to the role of DM strategies based on different aspects of reward contingencies in predicting the social and functional outcome of the disorder.
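The long-run contrast between the "ABCD" decks described above can be made concrete with a minimal sketch. The payoff values are not given in this review; they are the figures commonly cited for the original task (a gain of 100 per card on decks A and B with 1250 in total penalties per 10 cards, and a gain of 50 per card on decks C and D with 250 in total penalties per 10 cards) and should be read as assumptions for illustration only. The names `DECKS` and `net_per_10_cards` are hypothetical.

```python
# Illustrative "ABCD" payoff schedule (assumed values, not taken from this review).
# Each deck lists the gain per card and the total penalty per block of 10 cards.
DECKS = {
    "A": {"gain": 100, "penalty_per_10": 1250},
    "B": {"gain": 100, "penalty_per_10": 1250},
    "C": {"gain": 50, "penalty_per_10": 250},
    "D": {"gain": 50, "penalty_per_10": 250},
}


def net_per_10_cards(deck: str) -> int:
    """Net outcome of drawing 10 consecutive cards from a single deck."""
    schedule = DECKS[deck]
    return 10 * schedule["gain"] - schedule["penalty_per_10"]


if __name__ == "__main__":
    for name in DECKS:
        print(name, net_per_10_cards(name))
```

Under these assumed values, the decks with the larger immediate gain end each block of 10 cards with a net loss, whereas the decks with the smaller gain end with a net win, which is the sense in which C and D are "advantageous in the long term".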

## **DECISION-MAKING IN DEPRESSION**

## **THE INFLUENCE OF CLINICAL SYMPTOMS: ANHEDONIC vs. NON-ANHEDONIC PATIENTS**

The effect of depressive symptoms on the DM process constitutes an area of particular interest, and not only in association with illness state, i.e., acute phase or remission. Converging evidence examining cognitive disturbances in the longitudinal course of depression suggests that certain neuropsychological domains are more related to the clinical state than others. Among the latter, the deficit in executive function and attention might constitute the most trait-like impairment (Douglas and Porter, 2009). Neurocognitive alterations involving executive functions are present in groups of depressed adolescents (Maalouf et al., 2011) and can be detected in unmedicated patients with major depressive disorder (Porter et al., 2003). Disturbance of the complex construct of DM might also be present before the onset of depressive symptoms and contribute to their persistence. A mechanism of critical importance implied in this process is reduced reward learning. Depressed patients tend to show a difficulty in modulating behavior as a function of reward (Elliott et al., 1996). Evidence suggests that this impairment is particularly associated with anhedonia. Anhedonia is defined as the inability to experience pleasure and to respond to positive reinforcers, resulting in dysfunctional DM and consequently in an impairment in goal-directed behavior (Der-Avakian and Markou, 2012). Recent evidence indicates that depressed patients with anhedonic symptoms are characterized by a significant deficit in reward learning abilities. Moreover, anhedonic features not only influenced behavioral modulation in the light of reward contingencies, but were also found to predict persistence of the diagnosis of major depression for at least 8 weeks despite antidepressive treatment (Vrieze et al., 2013). This raises the notion of an interaction between a persistent DM deficit and symptoms of anhedonia in depression serving as predictors of clinical outcome. In the past decades, the IGT has proven an effective method to address this trait-like DM disturbance.

## **THE NEUROANATOMICAL BACKGROUND: A FOCUS ON INTERACTIONS OF THE HABENULA AND THE vmPFC**

Historically, the IGT was a pioneering method in the examination of lesions of the vmPFC. Patients with bilateral damage to the vmPFC develop severe impairments in social and personal DM, otherwise having largely preserved intellectual abilities. These patients are characterized by "myopia" for the future, repeatedly engaging in decisions with long-term negative consequences in spite of previous experiences (Bechara et al., 1994). Structural and functional alterations of the vmPFC have long been implicated in the etiology of depression (Drevets et al., 2008). However, the exact role of the vmPFC and its interconnected subregions is not clearly understood (Myers-Schulz and Koenigs, 2012). Imaging studies report abnormally high levels of resting-state activity within the vmPFC in major depression (Greicius et al., 2007), potentially resulting in an altered DM process. An influential model of the role of vmPFC in affective disorders emphasizes the top-down inhibition of the amygdala and consequent control of the ventral tegmental area (VTA) dopaminergic neurons (Price and Drevets, 2010). When assessing the neuroanatomical correlates of DM and particularly reward processing, attention has more recently been directed to the habenular complex. Increased activity of the habenula has been implicated in the etiology of major depression (Shumake and Gonzalez-Lima, 2003). Functional hyperactivity of the habenula results in the suppression of dopamine cell activity in the VTA and, subsequently, inhibition of the striatum, amygdala, nucleus accumbens, and prefrontal cortical areas, including the vmPFC (Hikosaka et al., 2008). The habenula plays a crucial role in behavioral responses to decisional consequences. In the absence of an expected reward, increased activity of the habenula occurs (Hikosaka, 2010). After encountering a large reward during the IGT, a subsequent and unexpected penalty in addition to the absence of the presumably expected win might be associated with habenula activation. An excessive firing of the habenula would result in low striatal dopamine and subsequently decrease vmPFC activation. However, we might speculate that some depressed patients rather tend to expect an inevitable punishment after obtaining a large reward during the IGT. Thus the excessive firing of the habenula might not occur. This in turn would lead to a relative increase in vmPFC activity, in particular as a reaction to larger rewards (**Figure 1**). Strikingly, anhedonia has been associated with an excess of activity of the ventral region of the prefrontal cortex including the vmPFC, with a significant role of dopamine (Gorwood, 2008). Anhedonia in depression might be associated with a distinct pattern of regulation between interconnected neuroanatomical correlates. This is then manifested in a DM deficiency as measured by the IGT having a specific underlying mechanism.

**FIGURE 1 | After encountering a large penalty eventually exceeding an expected reward while playing the Iowa Gambling Task (IGT), healthy control participants tend to switch strategy.** The absence of an expected reward is associated with habenula activation and subsequent decrease in ventromedial prefrontal cortex (vmPFC) activity. Adequate integration of reinforcement history favors long-term advantageous decision-making (DM). Depressed patients might be influenced by immediate reinforcers, such as high rewards during a DM task including the IGT. After encountering a large reward, a subsequent and unexpected penalty in addition to the absence of the presumably expected win might be associated with excessive habenula activation. Increased firing of the habenula results in low striatal dopamine and subsequently decreases vmPFC activation. This might then contribute to improving DM strategies and to a better outcome of the disorder. A specific subgroup of depressed patients with anhedonic symptoms might expect an inevitable punishment after obtaining a large reward during the DM task. Consequently the excessive firing of the habenula might not occur, leading to a relative increase in vmPFC activity. Dysfunction of the overactivated vmPFC affects DM strategies and is associated with "myopia" for the future as well as symptomatic persistence.

## **THE PREDICTING EFFECT OF A PERSISTENT DM DEFICIT ON DAILY LIFE AND SOCIAL INTERACTIONS**

Depressed patients with anhedonic symptoms are characterized by reduced ability to modulate their DM strategies as a function of reward. They repeatedly opted for disadvantageous choices, disregarding long-term consequences. Furthermore, this DM tendency associated with anhedonia had a predictive value for symptom persistence and outcome in major depression (Vrieze et al., 2013). A neuroanatomical correlate of this deficiency is presumed to be the abnormally increased activity of the vmPFC resulting in functional alteration. Strikingly, depressed patients more responsive to antidepressive treatment exhibit a decrease in activation of vmPFC areas after medication is administered (Drevets et al., 2002). Similarly, depressed patients showing symptomatic remission to deep brain stimulation also exhibit a decrease in the activation of a vmPFC subregion after therapy (Mayberg et al., 2005). This suggests an association between the occurrence or reduction of clinical symptoms and the activity of specific brain areas. Furthermore, the regulatory processes between these interconnected brain structures might be reflected in DM strategies as measured by the IGT. Parallel to a decline in vmPFC activation, patients tend to be influenced more by aversive stimuli (Koenigs and Grafman, 2009). An excess of vmPFC activity relates to a disturbance in reward learning manifested in the preference for immediate reward disregarding future consequences, i.e., "myopia" for the future. Thus, the opposite condition, i.e., a decrease in activation, might be the one that is beneficial in the long term, leading to more advantageous choices. Of critical importance is the assumption that a more preserved DM strategy with intact reward learning is associated with correct value recognition in social life, supportive of remission (Zhang et al., 2012).

## **CONCLUDING REMARKS**

This targeted review aimed to direct special emphasis on the influence of anhedonic symptoms, the role of habenula and the vmPFC in dysfunctional DM and their combined effect on outcome prediction in depression. We propose that depressed patients with anhedonic symptoms tend to expect an inevitable punishment after obtaining a large reward during the IGT. Thus an excessive firing of the habenula typically detected in the absence of an expected reward does not occur. A consequential relative increase in vmPFC activity would then lead to dysfunctional DM strategies, disadvantageous choices, and a reduced ability to modulate behavior in the light of previous experiences. Disturbed reward responsiveness and reinforcement processing in association with anhedonic symptoms affect persistence of clinical symptoms and value recognition in everyday social life, thus predicting outcome in depression (**Figure 1**).

The above concept integrating clinical markers, cognitive strategies and neuroanatomical correlates serving outcome prediction in major depression is targeted and by no means exclusive. Another mechanism of significant importance involves the glutamatergic – GABAergic imbalance reflected by altered prefrontal levels of GABA and glutamate during value-guided choices reported in patients with major depressive disorder (Jocham et al., 2012). Future dedicated studies will not only favor clinical and neurocognitive research in depression but also assist clinical practice in treatment and outcome prediction.

## **REFERENCES**

Bechara, A., Damasio, A. R., Damasio, H., and Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. *Cognition* 50, 7–15. doi: 10.1016/0010-0277(94)90018-3

Bechara, A., Tranel, D., and Damasio, H. (2000). Characterization of the decision-making deficit of patients with ventromedial prefrontal cortex lesions. *Brain* 123(Pt 11), 2189–2202. doi: 10.1093/brain/123.11.2189

Castaneda, A. E., Tuulio-Henriksson, A., Marttunen, M., Suvisaari, J., and Lonnqvist, J. (2008). A review on cognitive impairments in depressive and anxiety disorders with a focus on young adults. *J. Affect. Disord.* 106, 1–27. doi: 10.1016/j.jad.2007.06.006

Cella, M., Dymond, S., and Cooper, A. (2010). Impaired flexible decision-making in Major Depressive Disorder. *J. Affect. Disord.* 124, 207–210. doi: 10.1016/j.jad.2009.11.013

Chamberlain, S. R., and Sahakian, B. J. (2006). The neuropsychology of mood disorders. *Curr. Psychiatry Rep.* 8, 458–463. doi: 10.1007/s11920-006-0051-x


depression: distinct roles for ventromedial and dorsolateral prefrontal cortex. *Behav. Brain Res.* 201, 239–243. doi: 10.1016/j.bbr.2009.03.004


*Scand. J. Psychol.* 43, 239–251. doi: 10.1111/1467-9450.00292


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 05 July 2013; accepted: 22 September 2013; published online: 10 October 2013.*

*Citation: Must A, Horvath S, Nemeth VL and Janka Z (2013) The Iowa Gambling Task in depression – what have we learned about sub-optimal decision-making strategies? Front. Psychol. 4:732. doi: 10.3389/fpsyg.2013.00732*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Must, Horvath, Nemeth and Janka. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**REVIEW ARTICLE** published: 30 September 2013 doi: 10.3389/fpsyg.2013.00665

## *Damien Brevers<sup>1,2,3</sup>\*, Antoine Bechara<sup>2</sup>, Axel Cleeremans<sup>3</sup> and Xavier Noël<sup>1</sup>*

*<sup>1</sup> Department of Medicine, Psychological Medicine Laboratory, Faculty of Medicine, Université Libre de Bruxelles, Brussels, Belgium*

*<sup>2</sup> Department of Psychology, Brain and Creativity Institute, University of Southern California, Los Angeles, CA, USA*

*<sup>3</sup> Department of Psychology, Consciousness, Cognition & Computation Group, Center for Research in Cognition & Neuroscience, Université Libre de Bruxelles, Brussels, Belgium*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*V. S. Chandrasekhar Pammi, University of Allahabad, India*

*Jakob Linnet, Aarhus University Hospital, Denmark*

#### *\*Correspondence:*

*Damien Brevers, Research-Fellow FNRS, Psychological Medicine Laboratory, Faculty of Medicine, Université Libre de Bruxelles, Brugmann-campus, 4 place Van Gehuchten, 1020 Brussels, Belgium e-mail: dbrevers@ulb.ac.be*

The Iowa Gambling Task (IGT) involves probabilistic learning via monetary rewards and punishments, where advantageous task performance requires subjects to forego potential large immediate rewards for small longer-term rewards to avoid larger losses. Pathological gamblers (PG) perform worse on the IGT compared to controls, relating to their persistent preference toward high, immediate, and uncertain rewards despite experiencing larger losses. In this contribution, we review studies that investigated processes associated with poor IGT performance in PG. Findings from these studies seem to fit with recent neurocognitive models of addiction, which argue that the diminished ability of addicted individuals to weigh short-term against long-term consequences of a choice may be the product of a hyperactive automatic attentional and memory system for signaling the presence of addiction-related cues (e.g., high uncertain rewards associated with disadvantageous deck selection during the IGT) and for attributing pleasure and excitement to such cues. This incentive-salience associated with gambling-related choice in PG may be so high that it could literally "hijack" resources ["hot" executive functions (EFs)] involved in emotional self-regulation and necessary to allow the enactment of further elaborate decontextualized problem-solving abilities ("cool" EFs). A framework for future research is also proposed, which highlights the need for studies examining how these processes contribute specifically to the aberrant choice profile displayed by PG on the IGT.

#### **Keywords: gambling disorder, Iowa Gambling Task, decision-making, dual-process model, willpower**

## **INTRODUCTION**

Gambling, defined as an activity in which something of value is risked on the outcome of an event when the probability of winning or losing is less than certain (Korn and Shaffer, 1999), is a very popular recreational activity. Between 50 and 80% of the general population gamble at least one time per year (e.g., Abbott and Volberg, 1995; Welte et al., 2002). However, for some individuals (about 15% of frequent gamblers and about 1.6% of the general population; Wardle et al., 2007; INSERM, 2008), gambling can spiral out of control and become a financial burden on the individual and his/her family.

Gambling disorder is defined as persistent and recurrent maladaptive gambling behavior characterized by an inability to control gambling that disrupts personal, family or vocational pursuits (APA, 2013). More specifically, similar to substance (e.g., alcohol, cocaine) addictions, pathological gamblers (PG) exhibit a loss of willpower to resist gambling: they persist in gambling despite the occurrence of negative consequences (e.g., loss of a significant relationship, job or career opportunity) (APA, 2013).

Over the last decade, research has focused on the neurocognitive determinants of gambling disorder and found a number of similarities between drug addiction and gambling addiction (for a review, see Leeman and Potenza, 2012), suggesting that gambling addiction shares common mechanisms with substance addiction. These findings are in line with the new classification of gambling disorder in the DSM-V (APA, 2013), which views gambling disorder as a "behavioral addiction" that, unlike substance abuse, does not involve intake of an exogenous substance. Hence, given the absence of the confounding effect of chemical substances that can alter the brain in many non-specific ways, the study of gambling disorder offers one critical approach to understand and extract components specifically involved in the development of addiction.

With respect to the study of impaired decision-making in addiction, the Iowa Gambling Task (IGT; Bechara et al., 1994) has been regarded as the most widely used and ecologically valid measure of decision making in this clinical population. One of the reasons for this ecological validity is that performing advantageously on this task requires, as in real-life, dealing with uncertainty in a context of punishment and reward, with some choices being advantageous in the short-term (high reward), but disadvantageous in the long run (higher punishment); other choices are less attractive in the short-term (low reward), but advantageous in the long run (lower punishment). Hence, the key feature of this task is that participants have to forgo short-term benefits for long-term benefits, a process that is presumably severely hampered in drug and gambling addicts (APA, 2013). Accordingly, performance on the IGT has been shown to be a sensitive measure of impaired decision-making in a diversity of neurological and psychiatric conditions (Bechara, 2007). For instance, patients with frontal lesions (Bechara et al., 1994, 2000; Manes et al., 2002) and substance dependent (SD) individuals (Petry et al., 1998; Grant et al., 2000; Bechara et al., 2001; Whitlow et al., 2004) have demonstrated a preference for short-term gains despite larger net losses while performing the IGT. With regard to PG, it also appears that they display a stubborn preference for disadvantageous deck selection during the IGT (see **Table 1**).
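A "preference for disadvantageous deck selection" is usually expressed as a number. One widely used scoring convention, assumed here for illustration rather than taken from the studies cited above, is a net score per block of trials: selections from the advantageous decks (C and D) minus selections from the disadvantageous decks (A and B). The function name `net_scores` and the default block size of 20 trials are hypothetical choices for this sketch.

```python
from collections import Counter


def net_scores(choices, block_size=20):
    """Return (C + D) - (A + B) for each consecutive block of deck choices.

    `choices` is a sequence of deck labels ("A", "B", "C" or "D"),
    one per trial, in the order the cards were drawn.
    """
    scores = []
    for start in range(0, len(choices), block_size):
        counts = Counter(choices[start:start + block_size])
        advantageous = counts["C"] + counts["D"]
        disadvantageous = counts["A"] + counts["B"]
        scores.append(advantageous - disadvantageous)
    return scores
```

With this convention, a participant who gradually learns the task shows net scores that rise across blocks, whereas a persistent preference for decks A and B keeps the net score at or below zero.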

But what are the processes underlying this inability to optimally weigh immediate vs. long-term consequences of a choice (Bechara, 2005)? On the basis of the dual-process model of self-regulation (e.g., Bechara, 2005; Everitt and Robbins, 2005; Redish et al., 2008), the ability to decide advantageously according to short-term and long-term outcomes involves the optimal activation of two neural systems: (i) an "*impulsive,*" amygdala-striatum dependent, neural system that promotes automatic, habitual, and salient behaviors; and (ii) a prefrontal "*reflective*" neural system that forecasts the future consequences of a behavior and allows inhibitory control of automatic responses. The "*impulsive*" system is critical for processing the incentive motivational effects of a variety of natural (e.g., food) and non-natural rewards (e.g., money), which are mainly processed through an amygdala-striatal neural system (Robbins et al., 1989; Wise and Rompre, 1989; Robinson and Berridge, 1993; Di Chiara, 1999). Importantly, this is also the neural system that has been argued to be responsible for the transfer of reward seeking from controlled to automatic and habitual behaviors (Everitt et al., 1999; Everitt and Robbins, 2005). The "*reflective*" system is necessary to control basic impulses and allow the more flexible pursuit of long-term goals. This system includes executive functions (EFs), which could be understood as a variety of cognitive abilities that allow the conscious control of thought, emotion and action. The action of the reflective system depends on the integrity of two sets of neural systems: a "cool" and a "hot" EFs system (Zelazo and Müller, 2002). These "cool" and "hot" EFs are achieved through relatively slow, controlled processes and allow one to hold on to a mental representation for contemplation and self-reflection (Smith and DeCoster, 2000). 
"Cool" EFs are mediated by lateral inferior and dorsolateral frontostriatal and frontoparietal networks and refer to abstract decontextualized reasoning (Kerr and Zelazo, 2004). More specifically, "cool" executive processes include problem-solving abilities that require the capacity to represent a dilemma, maintain and organize information in working memory, strategically plan and execute a response, evaluate the efficacy of the solution, and make necessary changes based on the outcome (e.g., shifting back and forth between multiple tasks and the ability to deliberately suppress prepotent responses that are no longer relevant) (Zelazo and Müller, 2002). Hence, "cool" EFs are associated with rational and cognitive determinations of the risks and benefits associated with options, and require knowledge of the risk/benefit ratio, the ability to retrieve this knowledge from memory, and the ability to hold it in mind while comparing and contrasting options through working memory processes (Seguin et al., 2007). In contrast, "hot" EFs refer to one's ability to monitor the self and the situation for what are considered to be acceptable social behaviors, regulate emotional responses, and inhibit impulsive reactions. These EFs are mediated by ventromedial prefrontal (VMPC) and orbitofrontal (OFC) cortex structures that are closely connected to the limbic system, which confers on "hot" EFs a critical role in regulating affective and motivational processes (Zelazo and Müller, 2002). Hence, by overcoming impulsive triggers, "hot" executive processing results in the ability to advantageously weigh short-term gains against long-term losses, that is, to optimally anticipate the potential outcomes of a given decision (Damasio, 1996). 
Importantly, several theoretical accounts advance that before elaborate decontextualized problem-solving abilities and other related cognitive skills (i.e., "cool" EFs) can begin to be enacted, the ability to control emotional reactions and inhibit basic behavioral impulses may be required first (Barkley, 1997; Sonuga-Barke et al., 2002; Giancola et al., 2012). More specifically, the ability to control emotional reactions and inhibit basic behavioral impulses by "hot" EFs would allow rational and cognitive determinations of risks and benefits associated with options (Giancola et al., 2012). For instance, when exposed to high-uncertain rewards, individuals with intact "hot" EF capacities will be capable of controlling their emotional responses and inhibiting their impulses directed at the reward, which will then make it significantly more likely that they will engage in the more "cool" abstract reasoning/problem-solving aspects of EF. In turn, the enactment of those "cool" EFs would reinforce the efficiency of reward anticipation processes (e.g., to weigh short-term gains against long-term losses on both emotional and rational bases). Thus, adequate decision-making reflects an integration of cognitive (i.e., "cool" EFs) and affective (i.e., "hot" EFs) systems, and the ability to more optimally weigh short-term gains against long-term losses or probable outcomes of an action. One important consequence of this assumption is that, if learning is suddenly interrupted (e.g., absence of deck selection outcomes during an IGT "blind" phase, occurring after a standard 100-choice interaction with the IGT; Stocco et al., 2009), individuals can still make their decisions based on representations they have previously acquired through cognitive and affective processes (e.g., Stocco et al., 2009).

In the present review, based on this dual-process model and on recent influential theoretical accounts (Hofmann and Friese, 2008; Hofmann et al., 2009; Verdejo-Garcia and Bechara, 2009; Stacy and Wiers, 2010; Noël et al., 2013), we argue that PGs exaggerate the salience associated with gambling cues to the point that these cues literally "hijack" the cognitive and affective reflective processes necessary to choose on the basis of both short-term and long-term outcomes. In other words, the "working hypothesis" here is that the extreme saliency associated with high short-term rewards in PG detrimentally impacts their decision-making profile during the IGT.

### **GAMBLING DISORDER AND IGT PERFORMANCE**

There is a convergence in findings from studies examining decision-making using the IGT in PG (see also **Table 1**). More specifically, abstinent (e.g., Goudriaan et al., 2005) or nonabstinent (e.g., Power et al., 2012) PG with (e.g., Cavedini et al., 2002) or without co-morbid substance (e.g., Brevers et al., 2012a) abuse seem to display a stubborn preference for disadvantageous deck selection during the IGT, as compared with

#### **Table 1 | Studies using the IGT in gambling disorder.**


*(Continued)*

#### **Table 1 | Continued**


*(Continued)*


*These studies were selected in the basis of a comprehensive literature search conducted in PUBMED and PsychINFO with key search terms, including: Iowa gambling task, IGT, decision making, uncertain\*, ambig\* in combination with the key word gambl\*. Cross-references were searched in the selected articles. A total of 1387 hits were retrieved in PUBMED and PsychINFO using the search terms. Selection criteria for studies were inclusion of the original or adapted version of the IGT, presence of a gamblers group (ranging from frequent to severe pathological gamblers). After this selection, 28 papers remained, 7 articles were excluded because no control group was included in the study (n* = *1) or it concerned review articles (n* = *6). SOGS, South Oaks Gambling Screen; HC, healthy controls; PG, pathological gamblers; PrG, problem gambler; SD, substance dependent.*

healthy control participants. Nevertheless, several studies reported no significant difference between PG and controls on the IGT (Tanabe et al., 2007; Linnet et al., 2011a,b, 2012; De Wilde et al., 2013). This finding could be due to the small size of the PG groups recruited in these studies (see **Table 1**). This absence of a significant difference might also stem from the heterogeneity of gambling addiction (although PGs' preferred gambling activity was not reported in these studies). More specifically, the literature dichotomizes gambling activities into non-strategic (e.g., slot machine games) and strategic (e.g., poker) gambling (e.g., Potenza, 2001; Grant et al., 2012). Strategic gambling conceivably involves different cognitive demands than non-strategic gambling. Poker, for example, in addition to involving "hot" emotional self-regulation (bluffing, regulation of loss-induced frustration; Palomäki et al., 2013), requires "cool" executive processes such as working memory and mental flexibility (e.g., keeping track of cards played to determine the odds of receiving a certain card). Hence, one may infer that strategic gamblers differ from non-strategic gamblers on several neuropsychological processes. Grant et al. (2012) recently examined this possibility but did not report any difference between strategic (e.g., poker, sports betting, stock market) and non-strategic gamblers (e.g., slots, roulette) with regard to their ability to shift between multiple tasks (i.e., set-shifting) and to inhibit a prepotent motor response. With regard to the IGT, Goudriaan et al. (2005) found a difference in decision-making strategies between slot machine gamblers and casino gamblers (engaged mainly in strategic card games), with the former performing worse than the latter, and the latter not differing from their controls.

In light of the limited research, further studies are needed to explore the multiple aspects of "hot" and "cool" EFs in strategic and non-strategic PG. Moreover, the use of complementary profile analyses may bring important information with regard to the multifaceted aspect of the gambling dependence state. For instance, despite a significant between-group difference, up to 30% of healthy controls have been reported to exhibit poor performance on the IGT (Li et al., 2010) and normal performance has also been observed among PGs (Álvarez-Moya et al., 2011). In addition, Peterson et al. (2010) observed that, in both PG and controls, highly sensation-seeking subjects had a significant increase in neural activity in a brain region that receives dopamine projections, i.e., in the ventral striatum (a brain area involved in the anticipation of monetary rewards; Knutson et al., 2003) during the IGT. As a whole, these results support the view that gambling disorder is a multifaceted psychopathological state and that PG may be clustered into distinct subgroups (e.g., high sensation-seeking PG vs. low sensation-seeking PG; Peterson et al., 2010) in future IGT studies.

## **HYPERACTIVITY OF IMPULSIVE PROCESSES TOWARD GAMBLING-RELATED CUES IN PG**

The amygdala-striatal "*impulsive*" system has been argued to be responsible for the transfer of reward seeking from controlled to automatic and habitual behaviors (Everitt et al., 1999; Everitt and Robbins, 2005). Those incentive automatic/habitual behaviors are assumed to emerge from the activation of certain associative clusters in long-term memory by perceptual (e.g., words, images, video) or imagined stimulus input (Strack and Deutsch, 2004). These associations are created and strengthened gradually through classical conditioning processes, that is, by the learning history of temporal or spatial coactivation between external stimuli and affective reactions (Hofmann et al., 2008, 2009). These associative clusters endow the organism with the ability to evaluate and respond to the environment quickly in accordance with its current needs and previous learning experiences (Hofmann et al., 2008, 2009). When, for example, the gambler encounters gambling-related cues, the "gambling cluster" may get reactivated, which will automatically trigger a corresponding impulse, consisting of a positive incentive value attributed to gambling and a corresponding behavioral schema to approach it (Stacy and Wiers, 2010). In other words, through repeated gambling experiences marked by a pronounced "high", learned associations between the hedonic effects of gambling rewards and stimuli in the environment endow these gambling-related cues with the ability to directly access the mental representations associated with the action of gambling and, like gambling itself, make them attractive (Hofmann et al., 2009). As a result, gambling-related cues may be flagged as salient and automatically trigger motivation-relevant associative memories (i.e., implicit association) and may also grab the addicts' attention (i.e., attentional bias) (Stacy and Wiers, 2010).

So far, two studies (Yi and Kanetkar, 2010; Brevers et al., 2013a) have directly investigated implicit association (i.e., spontaneous associations between addiction-related cues and affective, arousal, or motivational representations in memory, which are independent of, or not available to, conscious awareness; Greenwald and Banaji, 1995) toward gambling-related cues in PG. More specifically, these studies showed that PG exhibited positive, but not negative, implicit associations toward gambling cues on the well-known Implicit Association Test (Greenwald et al., 1998). Several studies have also emphasized the presence of attentional bias for gambling-related stimuli in PG. For instance, two recent studies (Brevers et al., 2011a,b) found that PG exhibit attentional bias (i.e., a modified attentional processing for addiction-relevant stimuli; Franken, 2003) toward gambling-related cues at an early stage of attentional processing (e.g., attentional encoding; initial orientation of attention), which depends essentially on automatic-habit processes (Browning et al., 2010; Cisler and Koster, 2010). Other evidence for the presence of attentional bias in problem gambling comes from Zack and Poulos (2004), who investigated whether gambling-like drugs could prime the addiction-related implicit cognition network. More specifically, these authors observed that, during a rapid reading task in which target words were degraded with asterisks (e.g., w∗a∗g∗e∗r), the dopamine agonist amphetamine (dopamine is a neurotransmitter that plays a major role in reward-driven learning for every type of reward) heightened PGs' readiness to read gambling-related words while concurrently slowing their reading of neutral words (Zack and Poulos, 2004). In addition, Zack and Poulos (2004) showed that the dopamine agonist enhanced self-reported motivation to gamble in PG. These results suggest that activation of the mesolimbic dopamine system gives rise to an incentive "seeking" state, which also involves the collateral suppression of alternative motivations.

Enhanced saliency for gambling-related cues in problem gamblers has also been highlighted by functional magnetic resonance imaging (fMRI) research on cue reactivity (Crockford et al., 2005; Goudriaan et al., 2010; but see Potenza et al., 2003). For instance, Goudriaan et al. (2010) observed that, while viewing gambling-related pictures, PG exhibited higher brain activation than controls in areas involved in the reactivity to emotional information (i.e., the amygdala; Gallagher and Chiba, 1996), in the formation of interoceptive representation (i.e., the insular cortex; Craig, 2009), and in the regulation of emotional input (i.e., the VMPC; Rolls and Grabenhorst, 2008). In addition, these authors observed that subjective ratings of craving in PG correlated positively with brain activation in the VMPC and in the insular cortex. These results are important because they suggest that the perception of gambling cues in PG triggers a gambling urge, which engages brain areas involved in impulsive emotional processes (the amygdala, the insula), as well as in "hot" EFs (i.e., VMPC activation).

## **HYPERACTIVE IMPULSIVE PROCESSES AND IMPAIRED IGT PERFORMANCE IN PG**

Findings described in the previous section suggest that problem gambling is underpinned by powerful impulsive motivational-habit machinery directed at gambling-related cues, which could possibly interfere with or "hijack" the top-down reflective mechanisms necessary for triggering alarm signals about future outcomes. Therefore, one can assume that similar processes may bias PGs' decision-making during the IGT toward options featuring high, short-term rewards.

Findings from brain-imaging studies on the IGT in gambling disorder are in line with this assumption. Indeed, recent positron emission tomography (PET) studies found that, in contrast to their comparison controls, disadvantageous performance on the IGT was associated with dopaminergic release in the ventral striatum in PG (Linnet et al., 2010, 2011a). More specifically, whereas in healthy controls dopamine is released in response to advantageous deck choices, in PG, disadvantageous deck selections (Linnet et al., 2010, 2011a) and subjective excitement (Linnet et al., 2011b) are higher in response to dopamine release. Using fMRI, Power et al. (2012) observed that, during high-risk choice in the IGT, PG exhibited increased activation in regions encompassing the extended reward pathway, including brain areas involved in the integration of emotional and cognitive input (i.e., the orbitofrontal cortex, OFC; Rolls and Grabenhorst, 2008), in the reactivity to emotional information (i.e., the amygdala), and in short-term reward-based behavioral learning (i.e., the caudate nucleus; Haruno and Kawato, 2006). However, in another fMRI study, Tanabe et al. (2007) observed diminished VMPFC activation during the IGT in SD individuals and in individuals who are both SD and PG (SDPG). Since this study did not focus on pure PG, it is important to caution that the observed diminished VMPFC activation might not be due to gambling addiction alone, but rather to repeated ingestion of exogenous substances that cause harmful effects in the brain.

A main limitation of these brain-imaging studies (both PET and fMRI) is that components of decision-making during the IGT have not been broken down into more specific processes that allow a better evaluation of the differential brain activation associated with different steps of decision-making. More specifically, it is unclear whether enhanced impulsive processes toward disadvantageous deck selection are related to outcome anticipation (i.e., when the subject is pondering potential options before making a decision; Cohen and Ranganath, 2005), outcome expectation (i.e., the subject has made a decision and awaits the outcome; van Holst et al., 2012) or outcome processing (i.e., the subject receives feedback on the chosen option). This issue has recently been addressed by two fMRI studies that investigated neural activation associated with the outcome anticipation (Miedl et al., 2010) and expectation (van Holst et al., 2012) phases of gambling-related decision-making in PG. Specifically, Miedl et al. (2010) observed that, before making high-risk decisions in a quasi-realistic blackjack scenario, PG exhibited enhanced brain responses in the inferior OFC and in the medial pulvinar nucleus (the pulvinar is a relay thalamic nucleus that receives interoceptive input and in turn projects to the insula, all of which are brain areas associated with impulsive urges; Sewards and Sewards, 2003), whereas controls showed a significant signal increase in low-risk conditions, which might reflect a cue-induced signal increase for high-risk situations in PG (Miedl et al., 2010). With regard to outcome expectation, van Holst et al. (2012) showed that, compared with their controls, PG exhibited higher activity in the ventral striatum and the OFC during the expectation of gambling-related outcomes.

Altogether, findings from brain-imaging studies suggest that disadvantageous decision-making during the IGT (or during other situations of monetary gambling) in PG may be due to their hypersensitivity, or exaggerated salience, to immediate and larger monetary rewards. In other words, in PG, the need to make a gambling-related choice (i.e., disadvantageous decks during the IGT) could be so high that it could literally "hijack" the "hot" reflective resources (evidenced through OFC activation) toward short-term gratification. Nevertheless, it is noteworthy that these brain-imaging findings are in apparent contradiction with psychophysiological findings from Goudriaan et al. (2006), who observed lowered skin conductance and heart rate responses associated with disadvantageous deck selection in PG, as compared to controls. Indeed, hyperactivity in the fronto-striatal brain reward pathway is typically associated with higher autonomic-arousal responses. For instance, striatal (e.g., Salimpoor et al., 2011) and VMPC (e.g., Wong et al., 2007) activations have been associated with greater heart rate and skin conductance responses. Hence, further studies are needed to implement a careful online measurement of autonomic arousal during fMRI scanning (for a review on integrating fMRI with psychophysiological measurements during the IGT, see Wong et al., 2011), which would complement fMRI findings in providing a more comprehensive understanding of the physiological and neural mechanisms of impaired decision-making in PG. Moreover, additional studies are needed in order to examine the association between the IGT and other indexes of "hot" executive processes, that is, processes involved in the regulation of short-term reward in PG. One option would be to examine the association between the IGT and the delay discounting task (DDT; Madden et al., 1997). In this task, individuals choose between smaller immediate rewards and larger, delayed rewards (e.g., \$9 immediately vs. \$15 in 1 week). Several studies showed that, as compared with their controls, PG exhibited a higher intolerance to delayed gratification on the DDT (e.g., Brevers et al., 2012b). Moreover, evidence suggests that the OFC plays an important role in the capacity to delay reward on the DDT (e.g., Rogers et al., 1999; Rahman et al., 2001; Krawczyk, 2002). In addition, Monterosso et al. (2001) found that performance on the IGT was significantly correlated with performance on the DDT in a group of cocaine-dependent individuals. These findings suggest that the IGT and the DDT tap similar affective decision-making processes.
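To make the DDT trade-off concrete, the following minimal sketch works through the example choice above (\$9 now vs. \$15 in 1 week). It assumes a hyperbolic discounting function, V = A / (1 + kD), which is a common way to model such choices but is not specified in the studies reviewed here; the function names and the two illustrative discount rates are hypothetical.

```python
def hyperbolic_value(amount, delay_days, k):
    """Subjective present value of a delayed reward, assuming
    hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def indifference_k(immediate, delayed, delay_days):
    """Discount rate k at which the immediate and the delayed
    reward have the same subjective value under the model above."""
    return (delayed / immediate - 1.0) / delay_days


def prefers_immediate(immediate, delayed, delay_days, k):
    """True if the immediate reward is valued above the discounted delayed one."""
    return immediate > hyperbolic_value(delayed, delay_days, k)


# Example item from the text: $9 immediately vs. $15 in 1 week (7 days).
k_star = indifference_k(9.0, 15.0, 7)

# Hypothetical discount rates (per day), chosen only for illustration.
steep_discounter = prefers_immediate(9.0, 15.0, 7, k=0.2)
shallow_discounter = prefers_immediate(9.0, 15.0, 7, k=0.01)
```

Under this assumed model, a participant whose discount rate exceeds `k_star` takes the \$9, which is one way to express "intolerance to delayed gratification" as a single parameter.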

Importantly, it appears that there is no association between impairments in "cool" executive functioning and IGT performance in PG (for a review on "cool" EFs impairments in PG, see Goudriaan et al., 2004; van Holst et al., 2010). Roca et al. (2008) examined IGT performance and prepotent motor response inhibition (i.e., the ability to deliberately suppress dominant, automatic responses that are no longer relevant or required) in 11 PG and 11 controls. These authors showed that PG performed worse than controls on the IGT, and that they had a poorer ability to inhibit prepotent responses as assessed with a GO/NO-GO task. However, there was no significant correlation between GO/NO-GO commission errors and overall IGT performance. More recently, based on some evidence that inhibitory processes may be more important during the latter half of the IGT (Noël et al., 2007; see also **BOX 1** for a discussion on the association between "cool" EFs and latter stages of the IGT), Kertzman et al. (2011) examined the association between IGT performance and prepotent motor response inhibition (GO/NO-GO and Stroop task) as a function of early (trials 1–40) and latter (trials 41–100) stages of IGT performance. However, as in Roca et al. (2008), Kertzman et al. (2011) found no significant relationship between impaired response inhibition in PG and their disadvantageous decision-making during the latter stages of the IGT. According to these authors, the fact that impaired IGT performance in PG was not a direct result of their impaired inhibition functioning may be an expression of more general executive functioning deficits (e.g., working memory, cognitive flexibility). However, this assumption is not congruent with findings from a recent study by Brevers et al. (2012a), which highlighted that PGs' impaired performance on dual tasking (a main central executive component of working memory) was not correlated with their lowered IGT performance, at either the early or the latter stages of the IGT. These findings suggest that impaired IGT performance in PG is independent of their deficit in "cool" executive processes. To a broader extent, these results are in line with theoretical accounts which advance that before elaborate decontextualized problem-solving abilities and other related cognitive skills can begin to be enacted, the ability to control emotional reactions and inhibit basic behavioral impulses is required first (Barkley, 1997; Sonuga-Barke et al., 2002; Giancola et al., 2012). Put differently, the "hijack" of the "hot" reflective resources by impulsive incentive processes would hamper further elaborated decontextualized problem-solving abilities (i.e., "cool" executive processes). Further studies are needed in order to confirm that impaired "cool" executive processes do not impact PGs' IGT performance. One option would be to increase the number of IGT trials (e.g., from 100 to 120) and to examine the association between these later trials and performance on tasks estimating "cool" EFs. Indeed, the impact of "cool" EFs is higher during the later trials of the IGT (see **BOX 1**). Another option would be to use the IGT with the reversal contingencies condition (Fellows and Farah, 2005). In this task, the initial reward/punishment schedule is rearranged such that the two disadvantageous decks no longer have an initial advantage in the opening trials. Hence, if PG perform as well as healthy controls, it would suggest that a difficulty in reversing early learning underpins the behavioral profile of PG on the IGT (Dunn et al., 2006).
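The early/latter split used in these analyses (trials 1–40 vs. trials 41–100) can be sketched as follows. This is only an illustration: it assumes the conventional IGT labelling in which decks A and B are disadvantageous and C and D advantageous, and the conventional net score (advantageous minus disadvantageous selections), neither of which is spelled out in this section; the participant data are made up.

```python
# Assumed conventional labelling: A/B disadvantageous, C/D advantageous.
DISADVANTAGEOUS = {"A", "B"}
ADVANTAGEOUS = {"C", "D"}


def net_score(choices):
    """Advantageous minus disadvantageous selections for a run of trials."""
    good = sum(1 for deck in choices if deck in ADVANTAGEOUS)
    bad = sum(1 for deck in choices if deck in DISADVANTAGEOUS)
    return good - bad


def stage_net_scores(choices, split=40):
    """Net scores for the early stage (first `split` trials) and the
    latter stage (remaining trials), mirroring the 1-40 / 41-100 split."""
    return {
        "early": net_score(choices[:split]),
        "late": net_score(choices[split:]),
    }


# Hypothetical 100-trial participant: only bad decks early, only good decks later.
choices = ["A", "B"] * 20 + ["C", "D", "C"] * 20
scores = stage_net_scores(choices)
```

Stage-wise scores of this kind are what would then be correlated with measures of "cool" EFs (e.g., GO/NO-GO commission errors) separately for each stage.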

## **GAMBLING DISORDER AND POST-DECISION APPRAISALS DURING THE IGT**

Throughout this paper, we have seen that PG exhibit poor deck selection during the IGT. But how do they react to the consequences of their choices? More specifically, are PG impaired in their ability to react to loss and reward during the IGT? Goudriaan et al. (2006) demonstrated that PGs' heart rate decreased after choosing from either the good or bad decks, whereas the heart rate of their controls decreased after disadvantageous choices, but increased after advantageous choices. These findings indicate that, as compared to controls, PG exhibit decreased reactivity to rewards and losses during the IGT. Furthermore, in another study, Goudriaan et al. (2005) observed that, compared to controls, PG displayed a higher response speed and lower response shifting after rewards and net losses. Taken together, findings from Goudriaan et al. (2005, 2006) are consistent with several brain imaging studies that observed a reduction of cerebral activity for the processing of rewards and losses in PG during monetary gambling tasks (Reuter et al., 2005; de Ruiter et al., 2009). Nevertheless, Oberg et al. (2011) recently observed that disadvantageous IGT deck selection in PG was associated with a hypersensitive neural response at a very early (i.e., 185 ms) post-feedback latency (i.e., the MedioFrontal Negativity, which is involved in the early, rapid positive vs. negative appraisal of feedback; Yeung et al., 2004), but lower neural activity at a later phase (i.e., 300 ms) of feedback processing (i.e., the P300 Theta Amplitude, which reflects a later, attention-sensitive, more elaborated appraisal of outcome evaluation; Sato et al., 2005). Hence, these results indicate that, although PG may exhibit a blunted absolute response to outcome signals in general, the neurobiology of feedback processing in problem gambling is probably more complex. Notably, the mean age of PG participants recruited by Oberg et al. (2011) was 23 and their scores of problem gambling

## **Box 1 | The impact of "cool" EFs during the IGT**

The IGT has been shown to tap into "hot" EFs, that is, aspects of decision-making that are influenced by affect and emotion (Bechara, 2004). Specifically, Bechara and colleagues have demonstrated that, whereas healthy controls learn to avoid the disadvantageous decks, patients with damage to VMPFC continue to choose from these disadvantageous decks (e.g., Bechara et al., 1994, 1997, 2000). Nevertheless, several recent findings suggest that not all aspects of the IGT are equal at detecting "hot" decision-making processes. Consistent with this view, performances on working memory (Brevers et al., 2012a), dominant response inhibition (Noël et al., 2007) and cognitive flexibility (Brand et al., 2007; Iudicello et al., 2013) have been associated with performance of healthy controls on the latter stages of the IGT. Hence, these results suggest that "cool" executive processes may be involved in the latter trials of the IGT.

One explanation for these findings is that, across trials, the IGT may vary according to its level of uncertainty (Brand et al., 2006). More specifically, selections during the last block of trials may be referred to as decision-making under risk (i.e., situations of decision-making in which probabilities of reward and loss are known) because participants should have experienced the different win/loss contingencies enough to know which decks are risky and which are not. By contrast, because there has not been time for a participant to experience any of the win/loss contingencies during early deck choices, the first blocks of the IGT refer to decision-making under ambiguity (i.e., situations of decision-making in which probabilities of reward and loss are unknown).

Several theoretical accounts advance that processes underlying decision-making may depend upon the degree of uncertainty and the amount of information offered to the decision-maker (e.g., Brand et al., 2006; Krain et al., 2006). More specifically, because it does not offer explicit rules for possible outcomes or probabilities, decision-making under ambiguity has to be made via the reactivation of emotions associated with similar previous experiences (i.e., "hot" executive processes; Brand et al., 2006; Krain et al., 2006). By contrast, making a decision under risk, which offers explicit rules for reinforcement and punishment, would involve the integration of both pre-choice emotional processes and rational analytical system aspects (i.e., "cool" executive processing; Brand et al., 2006; Krain et al., 2006). In other words, deteriorations in "hot" and "cool" executive functions could differently alter decision-making under risk and decision-making under ambiguity. For instance, Brand et al. (2007) observed that individuals with lowered "cool" executive functioning (i.e., concept formation, shifting between multiple tasks, and dominant response inhibition) but with intact "hot" executive processing (i.e., pre-choice emotional reactivity associated with an advantageous decision-making profile) exhibited fewer disadvantageous choices in situations of decision-making under ambiguity as compared with situations of decision-making under risk. By contrast, Brand et al. (2007) also found that individuals with selective deficits in pre-choice emotional activation but with intact "cool" executive functioning exhibited disadvantageous choices in decision-making under risk and under ambiguity. Additional studies have shown that advantageous decision-making under risk, but not under ambiguity, is associated with efficient "cool" executive processing (i.e., calculative strategies; Brand, 2008; Brand et al., 2009). Moreover, advantageous decision-making under risk (Starcke et al., 2011), but not under ambiguity (Turnbull et al., 2005), is lowered when subjects have to make a decision while concurrently performing a secondary task (i.e., random number generation), which is known to load "cool" executive resources (Baddeley and Della Sala, 1996).

severity were relatively low. Hence, in Oberg et al. (2011), PGs' hypersensitivity to reward at early post-feedback latency might be due to the fact that they were at an early stage of problem gambling and had not yet suffered the long-term consequences of excessive gambling (e.g., tolerance to monetary reward). Further longitudinal investigations would be helpful in evaluating the potential use of the findings of Oberg et al. (2011) as an early indicator of predisposition to gambling or other addictive behaviors.

As a whole, these results indicate that, through the repetition of gambling behaviors, PG acquire extensive experience in making complex financial decisions involving variable wins, losses and probabilities. Thus, while gambling disorder does not entail exogenous drug administration, neural systems that process rewards may nonetheless undergo neuroadaptive change as the gambler experiences a chronic regime of winning and losing, coupled with the changes in arousal that are induced by those events. Because of this tolerance, problem gamblers may start to act out more frequently and, sometimes, in more dangerous ways, often gambling with greater and greater stakes on options featuring high but uncertain rewards.

Are PG also impaired in their ability to assess the quality of their already poor decisions? In other words, is there a dissociation between PGs' subjective evaluation of IGT performance and their actual performance (i.e., metacognitive ability)? Such impairment of metacognitive capacity in individuals suffering from addiction may be reflected in one of the most common clinical observations in addiction, that is, impaired recognition of the severity of the disorder by the addict (i.e., lack of insight; Goldstein et al., 2009). For instance, only 4.5% of the 21.1 million persons classified as needing (but not receiving) substance use treatment reported a perceived need for therapy (SAMHSA, 2007). Hence, when metacognitive judgment becomes exceedingly disrupted, the repetition of addiction-related behaviors may be heightened by the underestimation of addiction severity.

Metacognitive judgment during the IGT has been recently examined in PG by Brevers et al. (2013b). These authors examined metacognitive capacities in PG by asking participants to wager on their own decisions after each choice during the IGT (i.e., IGT with post-decision wagering; Persaud et al., 2007). These authors observed that, unlike controls, PG participants tend to wager high while performing poorly on the IGT. This result suggests that PG exhibited impairments not only in their ability to correctly assess risk in situations that involve ambiguity, but also in their ability to correctly express metacognitive judgments about their own performance. That is, PG not only perform poorly, but they also erroneously estimate that their performance is much better than it actually is. In line with these findings, Goudriaan et al. (2005) showed that PG exhibited lower IGT conceptual knowledge than their controls when they were asked to indicate which decks were advantageous or disadvantageous. Interestingly, in another recent study, Brevers et al. (2013c) showed that PG were also impaired in their capacity to evaluate accurately the quality of their decisions during a non-gambling task in which the quality of choice remains uncertain throughout the task (i.e., an artificial grammar-learning paradigm). After each trial of this task, participants had to indicate how confident they were in their grammaticality judgments. Results showed that, by contrast with their controls, there was no correlation between PGs' grammaticality judgments and their level of confidence, which suggests a disconnection between performance and confidence in PG. To a broader extent, these findings indicate that PG are impaired in their metacognitive abilities on a non-gambling task, which suggests that gambling disorder is associated with poor insight as a general factor.

Future studies are needed to confirm this assumption. The use of functional neuroimaging studies, which could probe the neural basis of these deficits, is one option. Indeed, recent investigations showed that the prefrontal cortex, and especially areas involved in "cool" EFs, such as the dorsolateral prefrontal cortex, is activated while subjects report metacognitive judgments on their performance during "neutral" situations of decision-making. For instance, Del Cul et al. (2009) demonstrated that prefrontal lesions could affect subjective reports of visual experience more than visual task performance. Moreover, Slachevsky et al. (2001, 2003) showed that lesions affecting the prefrontal cortex also affect awareness as well as the monitoring of actions or sensory-motor readjustments. Other studies showed that bilaterally depressing activity in the dorsolateral prefrontal cortex, through transcranial magnetic stimulation, can affect metacognition but not task performance during a visual discrimination task (Turatto et al., 2004; Rounis et al., 2010).
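The dissociation between wagering and performance described for the IGT with post-decision wagering can be illustrated with a simple index. The sketch below is hypothetical and is not the analysis reported by Brevers et al. (2013b): it assumes each trial is coded as a pair (chose an advantageous deck, wagered high) and compares the rate of high wagers after advantageous vs. disadvantageous choices. All names and data are invented for illustration.

```python
def high_wager_rate(trials, advantageous):
    """Proportion of high wagers among trials whose choice type matches
    `advantageous`. Each trial is (chose_advantageous, wagered_high)."""
    wagers = [high for adv, high in trials if adv == advantageous]
    return sum(wagers) / len(wagers) if wagers else float("nan")


def wagering_index(trials):
    """High-wager rate after advantageous choices minus the rate after
    disadvantageous choices. Values near 0 mean wagers do not track
    the quality of the choice (an assumed, simplified index)."""
    return high_wager_rate(trials, True) - high_wager_rate(trials, False)


# Hypothetical participant whose wagers track choice quality.
calibrated = ([(True, True)] * 6 + [(True, False)] * 2
              + [(False, True)] * 1 + [(False, False)] * 3)

# Hypothetical participant who wagers high regardless of choice quality
# while mostly choosing disadvantageous decks.
miscalibrated = ([(True, True)] * 3 + [(True, False)] * 1
                 + [(False, True)] * 6 + [(False, False)] * 2)

calibrated_index = wagering_index(calibrated)
miscalibrated_index = wagering_index(miscalibrated)
```

In this toy coding, the second participant reproduces the pattern described above (wagering high while performing poorly): the index is at zero even though most choices are disadvantageous.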

## **SUMMARY**

PG display a stubborn preference for disadvantageous deck selection throughout the IGT, which suggests that they are hampered in their ability to resist short-term high and uncertain rewards. In this paper, based on the dual-process model of willpower (e.g., Bechara, 2005; Everitt and Robbins, 2005; Redish et al., 2008), and on recent influential theoretical accounts (Hofmann et al., 2008, 2009; Verdejo-Garcia and Bechara, 2009; Stacy and Wiers, 2010; Noël et al., 2013), we advanced the view that this inability to forgo short-term benefits for long-term benefits may be underpinned by an exaggerated response to cues predicting immediate and large monetary rewards (see **Figure 1** for a framework summarizing processes underlying A. advantageous deck selection in healthy controls and B. disadvantageous deck selection in pathological gamblers).

We first reviewed findings showing that gambling-related cues automatically trigger PGs' motivation-relevant associative memories (Yi and Kanetkar, 2010; Brevers et al., 2013a) and grab the addicts' attention (e.g., Brevers et al., 2011a,b). In addition, findings from cue reactivity studies suggest that scores of subjective craving correlate positively with PGs' brain activation in areas involved in impulsive/automatic emotional processes (i.e., the amygdala, the insula) but also in "hot" EFs (i.e., the VMPC) (Crockford et al., 2005; Goudriaan et al., 2010). These results suggest that gambling disorder is underpinned by powerful impulsive motivational-habit machinery directed at gambling-related cues, which could possibly bias PGs' decision-making during the IGT toward options featuring high, short-term rewards.

**FIGURE 1 | (A)** A framework for advantageous deck selection in healthy controls. *Pathway (a)*: Impulsive motivational processes directed at options featuring short-term salient rewards. *Pathway (b)*: The moderation of impulsive processes by "hot" reflective processes involved in the reduction of impulsive-incentive reactions and in the ability to anticipate the potential outcomes of a given decision on an emotional basis. *Pathway (c)*: The ability to control emotional reactions and inhibit basic behavioral impulses by "hot" executive/reflective functions allows rational and cognitive determinations of risks and benefits associated with options (only during the last trials of the IGT, that is, when participants have experienced the different win/loss contingencies enough and become aware of which decks are more at risk than others), which further reinforce the efficiency of reward anticipation processes (e.g., to weigh short-term gains against long-term losses on both emotional and rational bases). *Pathway (d)*: Adequate sensitivity to loss and reward and accurate assessment of the quality of the decision, which would bias advantageously forthcoming deck selections. **(B)** A framework for disadvantageous deck selection in pathological gamblers. *Pathway (a)*: Hyperactive impulsive motivational processes directed at options featuring high, short-term rewards (as evidenced with attentional bias and implicit association toward gambling-related cues in PG; *see* Hyperactivity of impulsive processes toward gambling-related cues in PG). These impulsive processes could possibly interfere with or "hijack" the top-down "hot" reflective mechanisms necessary for triggering alarm signals about future outcomes (as evidenced by fMRI studies which showed that, during disadvantageous IGT choice or during gambling-related choice, PG exhibit increased activation in brain regions encompassing both impulsive [amygdala, ventral striatum, caudate nucleus, medial pulvinar nucleus] and "hot" reflective [orbitofrontal cortex] processes; *see* Hyperactive impulsive processes and impaired IGT performance in PG). As a result, disadvantageous deck options may be flagged as salient and preferred to advantageous decks. *Pathway (b)*: The "hijack" by impulsive incentive processes of the "hot" reflective resources would hamper further elaborated decontextualized problem-solving abilities (suggested by the absence of correlation between PGs' impairments in "cool" executive functioning and their lowered IGT performance, at either the early or the latter stages of the IGT; *see* Hyperactive impulsive processes and impaired IGT performance in PG). *Pathway (c)*: Hyposensitivity to loss and reward in PG (as evidenced by fMRI studies which observed a diminished ventral striatal response in PG after receiving monetary rewards and losses; *see* Gambling disorder and post-decision appraisals during the IGT) and failure to correctly assess the quality of their already poor decisions (evidenced by studies which observed a dissociation between PGs' subjective assessment of performance and objective performance; *see* Gambling disorder and post-decision appraisals during the IGT). As a result, PG might fail to properly integrate the outcomes of their actions over time, which could lead them to persist in making high-risk choices, despite suffering large losses.

Accordingly, we then focused on studies investigating processes involved in PGs' impaired IGT performance. PET studies highlighted that disadvantageous performance on the IGT was associated with dopaminergic release in the ventral striatum in PG (Linnet et al., 2010, 2011a,b, 2012). Moreover, fMRI findings (Power et al., 2012) showed that, in line with cue-reactivity studies (e.g., Goudriaan et al., 2010), high-risk choice during the IGT in PG was underpinned by increased neural activation in regions involved in the reactivity to emotional information (i.e., the amygdala), in short-term reward-based behavioral learning (i.e., the caudate nucleus), and in the integration of emotional and cognitive input (i.e., the OFC). In other words, these results suggest that the incentive-salience associated with gambling-related choice (i.e., disadvantageous deck selection during the IGT) in PG is so high that it could literally "hijack" the "hot" reflective resources toward short-term gratifications. In addition, it appears that PGs' impairments in "cool" executive processes, including working memory (Brevers et al., 2012a) and response inhibition (Roca et al., 2008; Kertzman et al., 2011), are not associated with their disadvantageous deck selection, at either early (e.g., trials 1–40) or late (e.g., trials 41–100) stages of IGT performance. These findings suggest that PGs' impaired IGT performances are not due to their lower level of "cool" EFs.

In the last part of this paper, we highlighted the issue that gambling disorder might also be associated with a diminished feedback reactivity during the IGT. In addition, recent findings suggest that PG not only perform poorly on the IGT, but they also erroneously estimate that their performance is much better than it actually is (Brevers et al., 2013b). These findings on feedback reactivity and metacognitive capacity imply that PG might fail at properly integrating the outcomes of their actions over time in order to form a global impression of the trade-offs between risk and reward, which could lead them to persist in taking high-risk choices, despite suffering large losses.

## **FUTURE STUDIES**

As suggested throughout this paper, additional studies are needed in order to further examine the processes associated with impaired IGT performance in PG. For instance, future studies should examine the association between IGT and other tasks estimating "hot" executive processes, such as the delay discounting task (e.g., Hongwanishkul et al., 2005). Moreover, additional fMRI studies are also needed in order to better evaluate differential brain activation as it relates to different phases of decision-making during the IGT (i.e., outcome anticipation, outcome expectation, and outcome processing). It would also be useful to implement a careful online measurement of autonomic arousal during the fMRI scanning, which would complement fMRI findings in providing a more comprehensive understanding of the physiological and neural mechanisms underlying impaired decision-making in PG (e.g., Wong et al., 2011). Further studies are also needed in order to confirm that impaired "cool" executive processes do not impact PGs' IGT performance, by using, for instance, the IGT with the reversal contingencies condition (Fellows and Farah, 2005) or by increasing the number of IGT trials (because the impact of "cool" executive processes is higher during the later trials of the IGT). Finally, future studies should also assess pre- and post-IGT gambling-related craving in PG. Indeed, recent theoretical accounts argue that the subjective experience of urge and craving may increase the drive and motivation to gamble (and to choose decks featuring high reward but higher losses during the IGT) in PG by sensitizing or exacerbating the activity of the habit/impulsive system, and by subverting attention, reasoning, planning, and decision-making processes to seek and access gambling (Verdejo-Garcia and Bechara, 2009; Sutherland et al., 2012; Noël et al., 2013).

## **CONCLUSION**

In conclusion, because it mimics both real life and gambling-related decision-making situations, the IGT may be the most ecologically valid estimation of decision-making impairments in PG. Accordingly, through the use of this task, studies on gambling addiction have yielded a consistent view of disadvantageous decision-making in PG. In this review, we advanced that this aberrant profile of decision-making may be underpinned by a hyperactivity of impulsive processes toward highly uncertain rewards, which can interfere with "hot" and "cool" reflective resources necessary for self-regulation. Nevertheless, much remains to be done, as it is still unclear how these processes contribute specifically to the aberrant choice profile displayed by PG on the IGT.

## **REFERENCES**


alcohol and stimulant abusers. *Neuropsychologia* 39, 376–389. doi: 10.1016/S0028-3932(00)00136-6


A. (2007). Decisions under ambiguity and decisions under risk: correlations with executive functions and comparisons of two different gambling tasks with implicit and explicit rules. *J. Clin. Exp. Neuropsychol.* 29, 86–99. doi: 10.1080/13803390500507196


(2012b). Impulsive action but not impulsive choice determines problem gambling severity. *PLoS ONE* 7:e50647. doi: 10.1371/journal.pone.0050647


*Sci.* 351, 1413–1420. doi: 10.1098/rstb.1996.0125


attitudes, self-esteem, and stereotypes. *Psychol. Rev.* 102, 4–27. doi: 10.1037/0033-295X.102.1.4


performance in pathological gamblers and healthy controls. *Scand. J. Psychol*. 106, 383–390. doi: 10.1111/j.1360-0443.2010.03126.x


*Brain Sci*. 31, 461–470. doi: 10.1017/S0140525X08004986


407–411. doi: 10.1097/00001756-200503150-00020


Brand, M. (2011). Decision-making under risk conditions is susceptible to interference by a secondary executive task. *Cogn. Process.* 12, 177–182. doi: 10.1007/s10339-010-0387-3


931–959. doi: 10.1037/0033-295X.111.4.931


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 June 2013; accepted: 05 September 2013; published online: 30 September 2013.*

*Citation: Brevers D, Bechara A, Cleeremans A and Noël X (2013) Iowa Gambling Task (IGT): twenty years after – gambling disorder and IGT. Front. Psychol. 4:665. doi: 10.3389/fpsyg.2013.00665*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Brevers, Bechara, Cleeremans and Noël. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The Iowa Gambling Task and the three fallacies of dopamine in gambling disorder

## *Jakob Linnet 1,2,3,4\**

*<sup>1</sup> Research Clinic on Gambling Disorders, Aarhus University Hospital, Aarhus, Denmark*

*<sup>2</sup> Clinical Department, Center of Functionally Integrative Neuroscience, Medical School of Aarhus University, Aarhus, Denmark*

*<sup>3</sup> Division on Addiction, Cambridge Health Alliance, Cambridge, MA, USA*

*<sup>4</sup> Department of Psychiatry, Harvard Medical School, Harvard University, Cambridge, MA, USA*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*Wael Asaad, Brown University, USA*

*Eric M. Wassermann, NIH/NINDS, USA*

#### *\*Correspondence:*

*Jakob Linnet, Research Clinic on Gambling Disorders, Aarhus University Hospital, Nørrebrogade 44, Building 30, DK-8000 Aarhus C, Denmark e-mail: linnet@cfin.au.dk*

Gambling disorder sufferers prefer immediately larger rewards despite long term losses on the Iowa Gambling Task (IGT), and these impairments are associated with dopamine dysfunctions. Dopamine is a neurotransmitter linked with temporal and structural dysfunctions in substance use disorder, which has supported the idea of impaired decision-making and dopamine dysfunctions in gambling disorder. However, evidence from substance use disorders cannot be directly transferred to gambling disorder. This article focuses on three hypotheses of dopamine dysfunctions in gambling disorder, which appear to be "fallacies," i.e., have not been supported in a series of positron emission tomography (PET) studies. The first "fallacy" suggests that gambling disorder sufferers have lower dopamine receptor availability, as seen in substance use disorders. However, no evidence supported this hypothesis. The second "fallacy" suggests that maladaptive decision-making in gambling disorder is associated with higher dopamine release during gambling. No evidence supported the hypothesis, and the literature on substance use disorders offers limited support for this hypothesis. The third "fallacy" suggests that maladaptive decision-making in gambling disorder is associated with higher dopamine release during winning. The evidence did not support this hypothesis either. Instead, dopaminergic coding of reward prediction and uncertainty might better account for dopamine dysfunctions in gambling disorder. Studies of reward prediction and reward uncertainty show a sustained dopamine response toward stimuli with maximum uncertainty, which may explain the continued dopamine release and gambling despite losses in gambling disorder. The findings from the studies presented here are consistent with the notion of dopaminergic dysfunctions of reward prediction and reward uncertainty signals in gambling disorder.

**Keywords: gambling disorder, Iowa Gambling Task (IGT), dopamine, addiction, positron-emission tomography**

## **INTRODUCTION**

Impaired performance on the Iowa Gambling Task (IGT) is associated with a range of substance use disorders and behavioral addictions including gambling disorder. The term "gambling disorder" was recently introduced in version 5 of the Diagnostic and Statistical Manual of Mental Disorders (DSM) (American Psychiatric Association DSM 5, 2013) as a separate chapter on "behavioral addiction" under the substance use classification. In DSM-IV (American Psychiatric Association DSM-IV, 1994) the disorder was classified as "pathological gambling" under "impulse control disorders." The change in classification and grouping has two important implications. First, it suggests that gambling disorder shares the clinical characteristics of substance use disorders rather than impulse control disorders. This change is significant because it underscores the relevance of comparing gambling disorder with other forms of addiction with regard to, for instance, clinical, epidemiological, and neurobiological aspects of the disorder. Second, it uniquely differentiates gambling disorder as a "behavioral addiction" from other substance use disorders, which emphasizes that addiction can be purely behavioral and need not involve the intake of exogenous substances.

The research approach on neurobiological markers of IGT performance in gambling disorder presented here focuses on these two distinctions. On the one hand it identifies common features of dopaminergic dysfunctions and impaired IGT performance in gambling disorder and related substance use disorders; on the other hand it seeks to identify unique patterns of dopamine dysfunctions in relation to impaired IGT performance of gambling disorder sufferers compared with evidence from the literature on substance use disorders.

The present article suggests that there are three hypotheses of dopaminergic dysfunctions in gambling disorder, which appear to be fallacies, i.e., it has not been possible to find support for the hypotheses in a series of positron emission tomography (PET) studies on gambling disorder. The first hypothesis suggests that gambling disorder sufferers have lower baseline binding potentials, as seen in substance use disorders; the second hypothesis suggests that gambling activity is associated with higher dopamine release in gambling disorder, i.e., that gambling disorder sufferers have dopaminergic hypersensitivity toward gambling; the third hypothesis suggests that winning is associated with higher dopamine release in gambling disorder, i.e., that gambling disorder sufferers have dopaminergic hypersensitivity toward winning. Finally, it is suggested that reward prediction and reward uncertainty signals, which are learning mechanisms associated with dopamine release, might better account for the dopaminergic dysfunctions and impaired IGT performance in gambling disorder, and evidence is presented to support this vantage point.

## **THE IOWA GAMBLING TASK IN SUBSTANCE USE DISORDERS AND GAMBLING DISORDER**

The IGT is an executive functions task, which simulates real life decision making in the way that it factors reward and punishment (Bechara et al., 1994). Individuals choose between four decks of cards labeled A, B, C, and D, with the objective to win as much money as possible. In decks A and B ("disadvantageous decks"), choosing a card is followed by an immediately high gain of money, but at unpredictable trials the selection is followed by a high loss, leading to a net loss over time. In decks C and D ("advantageous decks"), the immediate gain is smaller, but the future loss is also smaller, leading to a net gain over time. Other versions of the IGT have been developed, where, for instance, the contingencies are reversed (Bechara et al., 2002).
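The deck structure described above can be illustrated with a short sketch. The payoff values below are assumptions chosen for illustration (they follow the schedule commonly reported for the original task, and are not given in this article); the names `DECKS` and `net_per_block` are hypothetical.

```python
# Illustrative payoff schedule per block of 10 cards.
# ASSUMPTION: these amounts are illustrative values, not taken from this article.
DECKS = {
    "A": {"gain": 100, "losses": [150, 200, 250, 300, 350]},  # frequent losses
    "B": {"gain": 100, "losses": [1250]},                     # one rare large loss
    "C": {"gain": 50, "losses": [25, 50, 50, 50, 75]},        # frequent small losses
    "D": {"gain": 50, "losses": [250]},                       # one rare small loss
}


def net_per_block(deck: str, cards_per_block: int = 10) -> int:
    """Net outcome of drawing one full block of cards from a deck."""
    spec = DECKS[deck]
    return spec["gain"] * cards_per_block - sum(spec["losses"])


if __name__ == "__main__":
    for name in "ABCD":
        label = "advantageous" if net_per_block(name) > 0 else "disadvantageous"
        print(name, net_per_block(name), label)
```

Under these assumed values, decks A and B pay more per card but lose money over a block, whereas decks C and D pay less per card but end with a net gain.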

Originally, findings on the IGT showed that patients suffering from lesions in the ventromedial prefrontal cortex (sometimes referred to as the orbitofrontal cortex) have a higher preference for immediate rewards despite negative future consequences (Bechara et al., 1994, 2000). These findings led to the suggestion that lesion patients suffer from insensitivity to future consequences. The findings of impaired decision-making in lesion patients were replicated in individuals suffering from substance use disorders, suggesting that these individuals prefer immediate rewards despite negative long-term consequences (Bechara et al., 2001; Bechara, 2003). The impairments were linked to prefrontal cortex dysfunctions, based on the evidence from lesion patients. The findings were later extended to gambling disorder, where gambling disorder sufferers show decision-making impairments similar to individuals suffering from substance use disorders (Grant et al., 2000; Petry, 2001; Cavedini et al., 2002; Goudriaan et al., 2005, 2006a; Linnet et al., 2006).

Linnet et al. (2006) investigated "chasing one's losses," a key diagnostic symptom of gambling disorder. The authors compared 61 gambling disorder sufferers with 39 healthy controls. Gambling disorder sufferers were recruited through a treatment center, and healthy controls were selected from a pool of first-year psychology students. All participants completed a modified version of the IGT called the "Mouse Game" where individuals had to help a mouse gather cheese, rather than winning money. The contingencies were the same as the IGT, but units were converted into grams of cheese and the winning and losing sounds were removed, in order to reduce the association with gambling. The decks on the Mouse Game were stacked with 100 cards, such that participants could not deplete the decks during trials; the last 40 cards were added to the original 60-card stack on the IGT.

The study aimed at developing a quantifiable behavioral measure of chasing in a gambling situation where decision-making and skill would determine the outcome of the game. It was hypothesized that gambling disorder sufferers would have impaired IGT performance and increased *episodic chasing* (i.e., sequences of persistent poor choices leading to losses) compared with healthy controls, suggesting that they would be less likely to use negative feedback to change their behavior. To define chasing on the IGT, an index of behavior focused on choice sequences was compiled. An *advantageous choice sequence* was defined as five consecutive advantageous decisions (cards from deck C or D) and a *disadvantageous choice sequence* as five consecutive disadvantageous decisions (cards from deck A or B). The chance of choosing five consecutive good cards (or five consecutive bad cards) at random is 2<sup>−5</sup> = 0.03125 (*p* < 0.05).
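As a minimal sketch of how such a choice-sequence index could be computed, the function below counts runs of five consecutive choices of the same type. How the original study handled runs longer than five is not stated here, so counting non-overlapping runs is an assumption, and the name `count_choice_sequences` is hypothetical.

```python
ADVANTAGEOUS = {"C", "D"}
DISADVANTAGEOUS = {"A", "B"}


def count_choice_sequences(choices, run_length=5):
    """Count non-overlapping runs of `run_length` same-type choices.

    ASSUMPTION: a run of 10 identical-type choices counts as two sequences;
    the source does not specify how longer runs were scored.
    """
    counts = {"advantageous": 0, "disadvantageous": 0}
    current_type = None
    streak = 0
    for deck in choices:
        if deck in ADVANTAGEOUS:
            kind = "advantageous"
        elif deck in DISADVANTAGEOUS:
            kind = "disadvantageous"
        else:
            raise ValueError(f"unknown deck: {deck!r}")
        if kind == current_type:
            streak += 1
        else:
            current_type = kind
            streak = 1
        if streak == run_length:
            counts[kind] += 1
            streak = 0  # start counting a fresh sequence
    return counts


# Probability that five random picks all fall in one given category,
# assuming each pick is advantageous with probability 1/2.
CHANCE_OF_RUN = 0.5 ** 5

if __name__ == "__main__":
    example = list("AABBACDCDCCA")
    print(count_choice_sequences(example), CHANCE_OF_RUN)
```

In the example sequence, the first five picks (A, A, B, B, A) form one disadvantageous sequence and the next five (C, D, C, D, C) form one advantageous sequence.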

The result showed that gambling disorder sufferers had significantly higher chasing on the IGT than healthy controls (*df* = 4, *F* = 3.61, *p* ≤ 0.01). The advantageous and disadvantageous chasing episodes were distributed evenly throughout the game. In other words, individuals did not solely have, e.g., advantageous decision-making sequences in the beginning of the game and disadvantageous sequences toward the end of the game. Rather, a pattern emerged for players with several behavior episodes in which both advantageous and disadvantageous decisions were present at the beginning of the game, developing into a "learning curve" of predominantly advantageous or disadvantageous sequences as the game unfolded. These results are consistent with the notion that gambling disorder sufferers are more impulsive and less likely to adopt a long term advantageous strategy, even in the face of negative feedback, than healthy controls. They are also consistent with the notion of reduced PFC functions and/or dopamine dysfunctions in the disorder.

## **SUBSTANCE USE DISORDERS AND THE DOPAMINE SYSTEM**

Using drugs such as cocaine, amphetamine, and methamphetamine increases extracellular dopamine in the synaptic cleft, so that more dopamine binds to the dopamine receptors of the synapses (Stahl, 2006). In healthy individuals increased dopamine binding to dopamine D<sub>2/3</sub> receptors is associated with a higher self-reported hedonic pleasure (Volkow et al., 2002a). The hedonic pleasure from drug use is linked to two factors: (1) the baseline level of dopamine receptor availability before drug use; and (2) the change in dopamine receptor availability following drug use. Dopamine receptor availability is measured using, for instance, PET, where a radioactive ligand such as [<sup>11</sup>C]raclopride is injected into the blood stream, and measured based on its binding properties. Raclopride binds to available dopamine D<sub>2/3</sub> receptors in the brain, and the raclopride binding potential is an index of dopamine receptor availability:

$$\text{BP}_{\text{ND}} = \frac{B_{\text{max}} - B}{V_d K_d} \tag{1}$$

where *B*<sub>max</sub> is the maximum binding capacity of the receptors, *B* is the binding of the radioligand, *V*<sub>d</sub> is the volume of distribution, and *K*<sub>d</sub> is the ligand's half-saturation concentration.

A higher baseline raclopride binding potential is interpreted as a higher number of dopamine receptors available for binding; a higher (positive) change in raclopride binding potential from a baseline to an experimental condition is interpreted as an increased release of dopamine because more dopamine is bound to the receptors in the experimental condition. Substance use disorders are associated with lower baseline dopamine receptor availability and reduced dopamine release from drug use.
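Equation (1) can be written as a small function. This is only a sketch of the formula as stated above; the function name `binding_potential` and the numeric inputs are arbitrary illustrations, not measured values.

```python
def binding_potential(b_max: float, b: float, v_d: float, k_d: float) -> float:
    """Binding potential following Equation (1): (B_max - B) / (V_d * K_d)."""
    if v_d <= 0 or k_d <= 0:
        raise ValueError("v_d and k_d must be positive")
    return (b_max - b) / (v_d * k_d)


if __name__ == "__main__":
    # ASSUMPTION: arbitrary illustrative numbers in consistent units.
    print(binding_potential(b_max=30.0, b=6.0, v_d=2.0, k_d=4.0))   # (30 - 6) / 8
    print(binding_potential(b_max=30.0, b=14.0, v_d=2.0, k_d=4.0))  # (30 - 14) / 8
```

With the other terms held fixed, a larger *B* in the numerator gives a smaller value of the expression.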

#### *Baseline levels of dopamine receptor availability*

Healthy individuals with lower baseline dopamine receptor availability have higher hedonic pleasure from drug use than individuals with higher levels of dopamine receptor availability (Volkow et al., 1999, 2002b). These findings have been interpreted to suggest that lower baseline dopamine receptor availability is a risk factor for developing substance use disorders, while higher receptor availability could help prevent developing substance use disorder.

In a study of 15 methamphetamine use disorder sufferers and 20 healthy control subjects Volkow et al. (2001) found that the methamphetamine use disorder sufferers had significantly lower dopamine binding than control subjects. The authors note that the results could either reflect a pre-conditioned vulnerability toward addiction, or a down-regulation of dopamine receptors or loss of dopamine transporters following the methamphetamine use disorder.

Later, Volkow et al. (2006) compared dopamine receptor availability of non-addicted family members from families with a history of alcoholism and family members from families without a history of alcoholism. Individuals from families with a history of alcoholism had significantly lower baseline dopamine receptor availability than individuals from families without alcoholism. The results are consistent with the notion that higher baseline dopamine receptor availability is a protective factor against alcoholism and substance dependence. Individuals from families without a history of alcoholism may have been "protected" from developing substance use disorder by higher dopamine receptor availability, while individuals from families with a history of alcoholism may be at risk for developing substance use disorder due to lower receptor availability.

#### *Dopamine release and substance use*

Healthy volunteers show a significant correlation between change in dopamine binding and hedonic response to substance use; individuals with larger dopamine release from substance use report larger hedonic response (Volkow et al., 2002a). However, the evidence of change in dopamine release and hedonic response in substance use disorders is more complex (Volkow et al., 1997, 2002a, 2008; Kalivas and Volkow, 2005). Volkow et al. (1997) investigated dopamine release and hedonic response from methamphetamine use in 20 detoxified cocaine use disorder individuals and 23 healthy controls. Participants were given a moderate dosage of intravenously injected methamphetamine, a substance similar to cocaine. The results confirmed previous reports that cocaine use disorder individuals had lower baseline dopamine receptor availability than healthy controls. They further showed that healthy controls had significantly increased dopamine release throughout the striatum and felt significantly more "high" and "restlessness" from drug use compared to cocaine use disorder individuals.

The results suggest a blunted dopaminergic effect toward methamphetamine and reduced feelings of "high" in cocaine use disorder sufferers. In other words, individuals with cocaine use disorder neither have increased dopamine release nor increased pleasure from using drugs similar to cocaine compared with healthy control individuals. Substance use disorders therefore cannot be explained by increased dopamine release from substance use or higher pleasure from dopamine release per se. The involvement of dopamine in substance use disorders is more complex.

## **GAMBLING DISORDER AND THE DOPAMINE SYSTEM**

The dopamine system is sensitive to behavioral stimulation related to monetary reward (Koepp et al., 1998; Breiter et al., 2001; Zald et al., 2004). For instance, Koepp et al. (1998) found that skilled video game players had significant dopamine release in the striatum when playing a video game for money.

Another line of evidence of the role of dopamine in gambling disorder comes from the literature on gambling disorder in Parkinson's disease sufferers in agonist treatment. Parkinson's disease sufferers who are treated with dopamine agonists have significantly higher prevalence of gambling disorder than individuals who receive other forms of treatment (Grosset et al., 2006; Lu et al., 2006; Weintraub et al., 2006). Agonist treatment is also associated with other impulse control disturbances such as hypersexuality, compulsive shopping, and compulsive eating (Steeves et al., 2009). These data suggest that certain changes to the dopamine system are associated with increased risk of addiction and impulse control disorders, including gambling disorder. While the dopaminergic mechanism behind the increased risk of gambling disorder is currently unknown, Steeves et al. (2009) found significant dopamine release in the ventral striatum of Parkinson's disease patients suffering from gambling disorder who gambled for money. Furthermore, de la Fuente-Fernandez et al. (2002) found significant dopamine release in the ventral striatum of Parkinson's patients expecting a drug reward in placebo trials. The authors concluded that the dopamine release was mediated by the expectation of reward. Unlike other gambling disorder sufferers, Parkinson's disease sufferers in agonist treatment with gambling disorder have reduced binding potentials as a consequence of Parkinson's disease, and they therefore represent an atypical case of gambling disorder. For this reason the present review predominantly focuses on dopaminergic dysfunctions in gambling disorder without Parkinson's disease.

While use of substances such as cocaine is associated with dopamine release throughout the striatum, the ventral striatum is specifically involved in drug expectation and monitoring of reward (Delgado et al., 2000; de la Fuente-Fernandez et al., 2002), and this region appears to be central to gambling disorder and substance use disorder (Reuter et al., 2005; Abler et al., 2006; Linnet et al., 2010, 2011a,b). Evidence from the animal literature also supports the involvement of the ventral striatum in drug seeking and addictive behavior (Dalley et al., 2007; Uhl, 2007; Doya, 2008). Dopaminergic dysfunctions in the ventral striatum might therefore contribute to the decision making impairments on the IGT seen in gambling disorder. However, while substance use disorder and gambling disorder may share a common neurobiological basis, there might be differences in dopaminergic dysfunctions related to drug use and gambling.

The present review examines similarities and differences in dopaminergic dysfunctions between substance use disorder and gambling disorder based on a series of articles investigating the relation between dopaminergic neurotransmission and IGT performance in gambling disorder (Linnet et al., 2010, 2011a,b, 2012). In these studies we scanned gambling disorder sufferers and healthy controls with PET using the radioligand [<sup>11</sup>C]raclopride to measure dopaminergic neurotransmission during a baseline and a gambling condition of the IGT. In the baseline condition participants played a non-decision IGT similar to that of Bolla et al. (2003, 2004), where the computer automatically instructed the participants which cards to choose, and no winning or losing sounds were used; during the gambling scan participants chose freely between the decks, and received auditory feedback when winning or losing. Since each PET scan lasted 60 min, and it only takes ∼20 min to administer the IGT, three versions of the IGT were used: the regular ABCD version, and subsequent KLMN and QRST versions, where the contingencies between decks become increasingly ambiguous. Raclopride binding potentials (BP<sub>ND</sub>) and change in binding potential (ΔBP<sub>ND</sub>) between baseline and gambling condition were recorded. Higher raclopride binding potentials (BP<sub>ND</sub>) indicate a higher number of D<sub>2/3</sub> receptors available for dopamine binding. Decreased raclopride binding potentials from baseline to gambling condition indicate dopamine release because dopamine occupies more receptors during gambling and leaves fewer receptors available for raclopride binding. Raclopride binding potentials were measured using the ERLiBiRD method (Gjedde et al., 2005), and a ventral striatum mask using criteria similar to those of Mawlawi et al. (2001) was used to determine the anatomical location of the ventral striatum. Other masks were used for the putamen and caudate nucleus. The raclopride emission recordings were co-registered with structural MR images for each individual using MNI tools, and transformed into a common stereotaxic coordinate space (Talairach and Tournoux, 1988).
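The sign convention for the change in binding potential can be made explicit with a short sketch. It assumes that the change is computed as baseline minus gambling condition, so that a decrease in binding potential during gambling yields a positive value interpreted as dopamine release; the function names and the example numbers are hypothetical.

```python
def delta_bp(bp_baseline: float, bp_gambling: float) -> float:
    """Change in binding potential.

    ASSUMPTION: computed as baseline minus gambling, so a drop in raclopride
    binding during gambling gives a positive change.
    """
    return bp_baseline - bp_gambling


def interpret(delta: float) -> str:
    """Label a change as dopamine release (>= 0) or inhibition (< 0)."""
    return "release" if delta >= 0 else "inhibition"


if __name__ == "__main__":
    # Arbitrary illustrative values, not data from the studies.
    for baseline, gambling in [(2.5, 2.0), (2.0, 2.25)]:
        change = delta_bp(baseline, gambling)
        print(baseline, gambling, change, interpret(change))
```

Under this convention, a participant whose binding potential falls from 2.5 to 2.0 is labeled as showing release, and one whose binding potential rises from 2.0 to 2.25 as showing inhibition.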

The study findings gave rise to the notion of the three "fallacies" of the role of dopamine in gambling disorders. Specifically, we found no support for the hypotheses that: (1) gambling disorder sufferers have lower baseline dopamine receptor availability; (2) gambling disorder sufferers have increased dopamine release when gambling; and (3) gambling disorder sufferers have increased dopamine release when winning.

## *Fallacy 1: gambling disorder sufferers have lower baseline dopamine receptor availability*

While studies of substance use disorder have consistently and independently shown lower baseline dopamine receptor availability throughout the brain in substance use disorder (Volkow et al., 1997, 2001), we found no such differences in gambling disorder (Linnet et al., 2010, 2011a,b, 2012). Linnet et al. (2010) compared raclopride binding potentials (BP<sub>ND</sub>) in the ventral striatum of 16 gambling disorder sufferers and 15 healthy controls. The results showed no significant differences in baseline binding potentials between the two groups. Follow-up studies expanding the cohort to 18 gambling disorder sufferers and 16 healthy controls (Linnet et al., 2012) confirmed these findings throughout the striatum. Later independent studies support that gambling disorder sufferers do not differ in baseline dopamine receptor availability compared with healthy controls (Clark et al., 2012; Boileau et al., 2013).

The findings differ from the literature on substance use disorder, where individuals with substance use disorder have significantly lower binding potentials than healthy controls (Volkow et al., 2001). The differences in results may suggest a downregulation of receptor availability as a consequence of substance use disorder, which is not present in gambling disorder. Co-morbidity between gambling disorder and substance use disorders is generally high (Crockford and el-Guebaly, 1998; Ibanez et al., 2001; Kausch, 2003; Petry et al., 2005; Dannon et al., 2006; Kessler et al., 2008), and presence of substance use disorders increases severity of gambling disorder (Rush et al., 2008) or risk thereof (el-Guebaly et al., 2006). However, our population of gambling disorder sufferers (Linnet et al., 2010, 2011a,b, 2012) was screened for substance use disorders. It is therefore possible that lower levels of baseline dopamine binding potentials are found in individuals suffering from co-morbidity of gambling disorder and substance use disorders. More importantly, the findings might have implications for understanding the role of dopamine in the behavioral addictions (Holden, 2001; Shaffer and Kidman, 2003; Petry, 2006; Potenza, 2006; Grant et al., 2010), and may indicate neurobiological distinctions between behavioral addictions and substance use disorders at the level of the striatum and ventral striatum.

## *Fallacy 2: gambling disorder sufferers have increased dopamine release when gambling*

Despite the evidence of a blunted dopamine response in substance use disorder (Volkow et al., 1997), the fallacy of a hyperdopaminergic response to reward in substance use disorder has transcended into the field of gambling disorder. The dopamine system is sensitive to behavioral stimulation related to monetary reward (Koepp et al., 1998; Breiter et al., 2001; Zald et al., 2004), which has led to the suggestion of dopamine dysfunctions in gambling disorder (Holden, 2001). However, the evidence of a hyperdopaminergic response to reward in gambling disorder is mixed. Steeves et al. (2009) reported an increased dopamine response to winning in a PET study of Parkinson's disease patients with gambling disorder compared with Parkinson's disease patients without gambling disorder. However, we (Linnet et al., 2011b) found that some gambling disorder sufferers and healthy controls had significant dopamine release in the ventral striatum when gambling on the IGT, compared with the no-gambling condition, but we did not find differences between the two groups in the magnitude of dopamine release (see **Figure 1**). **Figure 1** shows gambling disorder sufferers (PG) and healthy controls (HC) with positive changes in binding potential (ΔBPND ≥ 0, black bars) from baseline to gambling condition, suggesting dopamine release. It can be seen from the figure that the two groups do not differ in the magnitude of dopamine release from gambling. Similarly, we found no group differences in negative changes in binding potential (ΔBPND < 0, white bars), suggesting dopamine inhibition. Comparing gambling disorder sufferers and healthy controls throughout the striatum revealed similar results (Linnet et al., 2012).

Even if the evidence supported the fallacy of a hyperdopaminergic response to reward in substance use disorder, PET activation paradigms used to study substance use disorder and gambling disorder may be too different to enable conclusions about differences or similarities in dopamine release toward reward in the two populations, because administering a drug may activate the dopamine system in a very different way than a gambling simulation.

More importantly, the blunted dopamine response to reward in substance use disorder might poorly explain the mechanisms of addiction and a possible common neurobiological pathway of addiction. What, then, might explain dopaminergic dysfunctions in addiction? Robinson and Berridge (1993, 2000, 2003, 2008) have suggested that the dopaminergic response to *anticipated* reward ("wanting"), rather than to the reward itself ("liking"), constitutes a fundamental dopaminergic mechanism in addiction. In addiction, "wanting" increases while "liking" decreases, and this decrease in "liking" might correspond with the blunted dopamine response to reward. Dysfunctions in the dopaminergic response to *anticipated* reward, on the other hand, might constitute a common mechanism of addiction, because the response occurs in the absence of reward and therefore may have a similar (dys)function whether the reward is food, drugs or gambling. This mechanism might correspond to the common clinical symptoms in addictions such as preoccupation or craving. It might also be involved in continued use despite negative consequences such as depressed mood or loss chasing.

disorder sufferers (PG, *n* = 8) and healthy controls (HC, *n* = 9) show no significant differences in magnitude of dopamine inhibition from baseline to gambling condition (ΔBPND < 0, white bars). The ordinate shows the change in binding potential (ΔBPND), while the error bars indicate the standard error of the mean (SEM). Star symbols (∗) indicate *p*-values in comparison to baseline. Reprint with permission from Linnet et al. (2011b).

In gambling disorder, dopaminergic coding of uncertainty might represent a dysfunctional reward anticipation that reinforces the gambling behavior despite losses (see the section on "Dopaminergic coding of reward prediction and uncertainty in gambling").

## *Fallacy 3: gambling disorder sufferers have increased dopamine release when winning*

Steeves et al. (2009) found that Parkinson's disease sufferers with gambling disorder had increased dopamine release when winning on a modified version of the IGT compared with Parkinson's disease sufferers without gambling disorder. The task was rigged with a 3:1 reward vs. penalty ratio, so it always produced an overall gain. The authors attributed the increased dopamine release in gambling disorder to the gains from gambling, and suggested that the increase might reflect a priming effect or premorbid dopaminergic hypersensitivity of the ventral striatal circuits.

However, these results contrast with the findings of Linnet et al. (2010). We found that gambling disorder individuals who lost money had significantly higher dopamine release in the left ventral striatum than healthy controls, *F*(1, 29) = 5.52, *p* < 0.05 (*p* < 0.02 one-tailed). Furthermore, a two-way ANOVA showed a significant interaction effect, *F*(2, 28) = 4.18, *p* = 0.05, where dopamine release was associated with losses in the gambling disorder group and gains in the healthy control group (see **Figure 2**). No group differences were found in the right ventral striatum.

**FIGURE 2 | Binding potential changes (ΔBPND) in left ventral striatum.** Gambling disorder sufferers who lose money (PG, black bar, *n* = 8) have significantly higher dopamine release in the left ventral striatum than healthy controls (HC, white bar, *n* = 5). Gambling disorder sufferers who win money (PG, black bar, *n* = 8) do not differ in dopamine release from healthy controls (HC, white bar, *n* = 10). Mean and Standard Errors are illustrated in the bars and error bars, respectively. Dopamine release results in positive values because raclopride binding potentials decrease from baseline to gambling condition (baseline > gambling = positive value). Conversely, dopamine inhibition results in negative values because raclopride binding potentials increase from baseline to gambling condition (baseline < gambling = negative value). Reprint with permission from Linnet et al. (2010).

These apparent differences raise the question of whether or not alternative models can better explain the role of dopamine release in relation to gains and losses in gambling disorder. Dopaminergic coding of reward prediction and uncertainty might offer such a model.

## **DOPAMINERGIC CODING OF REWARD PREDICTION AND UNCERTAINTY IN GAMBLING**

Reward prediction error in the dopamine system refers to a mechanism that updates positive and negative reward predictions of a stimulus. The mechanism constitutes a neural correlate of the mathematical and behavioral Rescorla-Wagner learning rule (Schultz, 2006). For instance, in random binary outcome situations (e.g., reward vs. no-reward) the expected value (EV) is the average value that can be expected from a given stimulus, which is a linear function of reward probability (**Figure 3A**). In contrast, uncertainty, defined as the variance (σ²) of the probability distribution, is the mean squared deviation from the expected value, which is an inverted quadratic function of reward probability (Schultz et al., 2008) (**Figure 3B**).
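The two functions described above can be illustrated with a minimal sketch. It assumes a binary outcome that pays a fixed reward with probability *p* and nothing otherwise; the function names and the unit reward are illustrative assumptions, not part of the cited studies.

```python
# Illustrative sketch only: a binary outcome paying `reward` with
# probability p and 0 otherwise. Names and the unit reward are assumptions.

def expected_value(p, reward=1.0):
    """Expected value: a linear function of reward probability p."""
    return p * reward


def outcome_variance(p, reward=1.0):
    """Variance: the mean squared deviation from the expected value."""
    ev = expected_value(p, reward)
    return p * (reward - ev) ** 2 + (1.0 - p) * (0.0 - ev) ** 2


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"p={p:.2f}  EV={expected_value(p):.4f}  "
              f"variance={outcome_variance(p):.4f}")
```

With a unit reward the variance reduces to *p*(1 − *p*), which is zero at *p* = 0 and *p* = 1 and peaks at *p* = 0.5, whereas the expected value rises linearly with *p*.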

Midbrain and striatal dopamine coding of expected value and uncertainty follow linear and quadratic functions similar to their mathematical expressions (Fiorillo et al., 2003; Preuschoff et al., 2006; Schultz, 2006). Fiorillo et al. (2003) found distinct phasic and sustained midbrain activation toward reward probability in monkeys using electrophysiological measures of dopamine neurons in the ventral midbrain areas A8, A9, and A10. Phasic dopamine activation was larger in anticipation of stimuli with larger reward probability, and smaller in anticipation of stimuli with smaller reward probability. The sustained activation was largest toward stimuli with maximum uncertainty (*P* = 0.5) and declined toward higher and lower probabilities. The phasic and sustained activation patterns were distinct both in terms of timing of signal and dopamine neurons coding the response.

occupies fewer receptors. Reprint with permission from Linnet et al. (2012). **(A, B)** are amended from **Figure 1**, **(C)** is amended from **Figure 2**.

Preuschoff et al. (2006) found distinct neural coding of expected value and uncertainty in the ventral striatum of healthy men and women in a monetary card-guessing task. Expected value was linearly associated with early anticipatory blood oxygen level dependent (BOLD) activation, such that higher reward probabilities were associated with higher anticipatory BOLD activation, and lower reward probabilities were associated with lower anticipatory BOLD activation. In contrast, uncertainty showed an inverse quadratic association with late anticipatory BOLD activation, such that the highest BOLD activation was seen around maximum uncertainty (*P* = 0.5) and the lowest BOLD activation was seen around maximum certainty (*P* = 1.0 and *P* = 0.0).

Linnet et al. (2012) hypothesized that dopaminergic coding of outcome uncertainty on the IGT in gambling disorder would follow the reward prediction error signal, i.e., have the properties of an inverted U-shaped curve. The IGT consists of two "advantageous" and two "disadvantageous" decks that will lead to long-term gains and losses, respectively. The person is free to choose between decks, and the IGT performance can therefore be expressed as the probability (*P*) of advantageous deck selection, such that the variance is (1 − *P*) × *P*.
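As a rough illustration of this performance measure, the sketch below computes *P* and the associated variance from a sequence of deck choices. It assumes the standard ABCD labeling in which decks C and D are the advantageous ones; the choice sequence is hypothetical and is not data from the study.

```python
# Illustrative sketch only. Assumes the ABCD version of the IGT, with
# decks C and D as the advantageous decks. The choices are hypothetical.

ADVANTAGEOUS_DECKS = {"C", "D"}


def selection_probability(choices):
    """Proportion P of selections made from advantageous decks."""
    if not choices:
        raise ValueError("at least one deck selection is required")
    good = sum(1 for deck in choices if deck in ADVANTAGEOUS_DECKS)
    return good / len(choices)


def selection_variance(p):
    """Variance of advantageous deck selection: (1 - P) * P."""
    return (1.0 - p) * p


if __name__ == "__main__":
    hypothetical_choices = ["A", "C", "B", "D", "C", "D", "A", "C"]
    p = selection_probability(hypothetical_choices)
    print(f"P = {p:.3f}, variance = {selection_variance(p):.4f}")
```

A participant who selects advantageous and disadvantageous decks equally often (*P* = 0.5) has the maximum variance of 0.25, while a participant who selects exclusively from one kind of deck has a variance of zero.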

The results confirmed the hypothesis of a significant inverse quadratic relationship between dopamine release and IGT performance among gambling disorder sufferers, which was strongest in the combined striatum, *F*(2, 15) = 9.28, *p* = 0.002 (see **Figure 3C**). The quadratic relationship between dopamine release and IGT performance did not reach significance level in the healthy control group.

These results have implications for the findings by Steeves et al. (2009) and Linnet et al. (2010). In the study by Steeves et al. (2009) the computer program used a random sequence generator to determine the card sequence, and the outcome was therefore random, or uncertain, even though it always resulted in an overall gain. It is therefore possible that the dopaminergic coding in gambling disorder was also related to the variance of the task and not solely to the overall gain. The finding by Linnet et al. (2010) that gambling disorder sufferers had increased dopamine release during periods of losing, not winning, could suggest that dopamine release reinforces the gambling behavior despite losses and prevents the individual from inhibiting the gambling behavior in order to stop gambling or avoid further losses. Both studies can be explained in terms of dopaminergic coding of reward prediction and uncertainty.

Since variance is a common feature in all forms of gambling, and since uncertainty and variance are maximized in most forms of gambling, the dopaminergic response to maximum variance might reinforce the gambling behavior despite losses, and this might constitute a common underlying mechanism in gambling disorder. The odds structure in the most addictive forms of gambling is optimized toward maximum uncertainty and variance, where the payback percentage is around 80–99% (e.g., slot machines typically have payback rates of 82–90%, and blackjack has payback rates as high as 99%). Since the odds only slightly favor the house, and the variance is maximized, these games provide the optimal conditions for dopaminergic coding of uncertainty and reinforcement of gambling behavior despite losses, which may underlie clinical behavior such as "chasing one's losses."
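The arithmetic behind the payback percentages can be sketched as follows. The function name and the wagered amount are illustrative assumptions; only the payback rates are taken from the text above.

```python
# Illustrative sketch only: expected long-run result implied by a payback
# rate. The wagered amount is a made-up example value.

def expected_net_result(total_wagered, payback_rate):
    """Expected net result for the player (negative values are losses)."""
    return total_wagered * (payback_rate - 1.0)


if __name__ == "__main__":
    for label, rate in (("payback 82%", 0.82),
                        ("payback 90%", 0.90),
                        ("payback 99%", 0.99)):
        net = expected_net_result(1000.0, rate)
        print(f"{label}: expected net result on 1000 wagered = {net:.2f}")
```

A payback rate below 100% therefore implies an expected loss that grows in proportion to the total amount wagered, even though any single session may end in a gain.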

From the perspective of gambling disorder, the outcome of winning or losing does not matter in the short term. What matters is that the game properties will always lead to losses in the long run, and the variance in outcome will always lead to dopaminergic reinforcement of the gambling behavior. This combination constitutes an inherent risk for gambling disorder sufferers and individuals at risk for developing gambling disorder.

Dopaminergic coding of reward prediction and uncertainty offers a model for explaining why: (1) gambling disorder sufferers are drawn toward the risk and uncertainty of gambling; (2) gambling disorder sufferers continue gambling despite losses; and (3) gambling disorder sufferers do not adopt an advantageous strategy despite negative feedback. At the same time it is clear that this model does not account for all behavior. For instance, our data are limited to PET and dopamine D2/3 receptors. While our findings are consistent with findings from fMRI studies (e.g., Preuschoff et al., 2006), the temporal resolution of PET does not allow us to differentiate between anticipation and outcome evaluation in gambling. Furthermore, it is possible that other dopamine receptors, e.g., D1-class receptors, might interact with and contribute to the dopamine dysfunctions in gambling disorder. Finally, the IGT performance in healthy controls was not reinforced by dopaminergic coding of uncertainty. The following sections therefore address the possible differences of dopamine functions in IGT performance between gambling disorder sufferers and healthy controls.

#### **DOPAMINE RELEASE AND IGT PERFORMANCE IN GAMBLING DISORDER**

To investigate adaptive learning functions of dopamine in IGT performance, we (Linnet et al., 2011a) compared IGT performance in relation to dopamine release in the ventral striatum of 16 gambling disorder sufferers and 14 healthy controls. We used the regular ABCD version and the combined ABCD, KLMN and QRST versions, where group differences were measured as the average performance across the three different versions, (ABCD + KLMN + QRST)/3. The study compared overall group differences in IGT performance as well as group differences of IGT performance in relation to dopamine release in the ventral striatum.

A previous IGT study (Sevy et al., 2006) showed that pharmacological reduction of dopaminergic activity is associated with impaired IGT performance in healthy control volunteers, while an increase of dopamine is associated with better IGT performance. We (Linnet et al., 2011a) therefore hypothesized that dopamine release in the ventral striatum would improve performance in healthy controls. Based on suggestions that risk and outcome uncertainty are associated with dopamine release in gambling disorder (Fiorillo et al., 2003), it was further hypothesized that dopamine release in the ventral striatum of gambling disorder sufferers would be associated with more risky decision-making, reflected in lower IGT performance.

The results showed that gambling disorder sufferers and healthy controls did not differ in IGT performance on the ABCD version or combined performance across the three tasks. However, when comparing IGT performance between gambling disorder sufferers and healthy controls dependent on dopamine release, a highly significant pattern emerged. Healthy controls with dopamine release in the ventral striatum had significantly higher IGT performance on the ABCD version than gambling disorder sufferers, *F*(4, 11) = 14.40, *p* < 0.0005 (**Figure 4A**). In contrast, gambling disorder sufferers and healthy controls without dopamine release (dopamine inhibition) did not differ in IGT performance, *F*(4, 15) = 1.78, *ns* (**Figure 4B**). Gambling disorder sufferers who released dopamine in the ventral striatum had significantly *lower* IGT performance than gambling disorder sufferers who did not release dopamine, *F*(4, 14) = 8.25, *p* = 0.005, while healthy controls who released dopamine had significantly *higher* IGT performance than healthy controls who did not, *F*(4, 12) = 4.85, *p* < 0.05.

binding potentials decrease (ΔBPND ≥ 0) in ventral striatum have significantly higher IGT performance on the ABCD version than gambling disorder sufferers (PG, white squares, *n* = 8), *F*(5, 13) = 14.40, *p* < 0.0005. The abscissa shows trial blocks (1–20, 21–40, and so forth), while the

binding potentials (ΔBPND ≥ 0) do not differ in IGT performance on the ABCD version compared with gambling disorder sufferers (PG, white squares, *n* = 8), *F*(5, 17) = 1.78, ns. Reprint with permission from Linnet et al. (2011b).

The findings suggest that dopamine release was associated with adaptive behavior in healthy control individuals, but maladaptive behavior in gambling disorder sufferers. This might suggest that the function of dopamine differed between the two groups. Among gambling disorder sufferers the dopamine function appears to code uncertainty and reinforce risky and disadvantageous decision making. Among healthy controls the dopamine function appears to code outcome and reinforce adaptive and advantageous decision making. The dopamine dysfunctions and maladaptive gambling behavior in gambling disorder could further be fueled by the subjective experience of gambling. To address this aspect, the levels of gambling excitement were investigated.

#### *Dopamine and subjective experience*

Subjective gambling experiences such as increased excitement are central to gambling disorder (Neighbors et al., 2002; Rockloff and Dyer, 2006; Pantalon et al., 2008; Vachon and Bagby, 2009). Gambling excitement is associated with physiological measures of arousal (Moodie and Finnigan, 2005; Wulfert et al., 2005, 2008), and physiological arousal is generally increased during gambling (Leary and Dickerson, 1985; Dickerson et al., 1992; Coventry and Constable, 1999; Coventry and Hudson, 2001; Ladouceur et al., 2003; Moodie and Finnigan, 2005; Wulfert et al., 2005). Individuals with problem gambling or gambling disorder do not necessarily differ in physiological arousal from individuals without gambling problems (Griffiths, 1993; Sharpe et al., 1995; Coventry and Norman, 1997; Brown et al., 2004; Sodano and Wulfert, 2010), but some studies find an interaction between specific patterns of excitement and physiological arousal in gambling disorder (Goudriaan et al., 2006b). It is therefore possible that a similar interaction exists between gambling excitement and dopaminergic neurotransmission in gambling disorder.

We (Linnet et al., 2011a) investigated the relation between subjective experience of gambling excitement and dopamine release in the ventral striatum of 18 gambling disorder sufferers and 16 healthy controls. It was hypothesized that dopamine release would be associated with increased excitement levels in gambling disorder sufferers compared with healthy controls.

Measures of excitement levels were obtained during PET scans, after each gambling round (ABCD, KLMN, and QRST). The computer automatically asked participants to rate their excitement level ("How exciting do you think the game is right now?") on a scale ranging from 1 to 10, where 1 was the lowest rating and 10 was the highest.

The results showed that gambling disorder sufferers had significantly higher excitement levels than healthy controls throughout the three games, *F*(2, 31) = 6.45, *p* = 0.01. However, these differences were due to increased excitement levels in gambling disorder sufferers with dopamine release. Gambling disorder sufferers with dopamine release had significantly higher excitement levels throughout the games than healthy controls with dopamine release, *F*(2, 12) = 10.69, *p* < 0.005 (**Figure 5A**), while no differences in excitement levels were found between gambling disorder sufferers and healthy controls without dopamine release (dopamine inhibition) (**Figure 5B**). Gambling disorder sufferers with dopamine release also had significantly higher excitement levels than gambling disorder sufferers without dopamine release, *F*(2, 15) = 6.94, *p* = 0.01, while there were no differences between healthy controls with dopamine release and healthy controls without dopamine release.

Furthermore, there was a significant positive correlation between dopamine release and excitement level in gambling disorder sufferers, *r*(18) = 0.52, *p* < 0.05, which did not reach significance level among healthy controls (see **Figure 6**). No linear interaction was found between excitement level and IGT performance or between IGT performance and dopamine release in either group. This suggests that the higher excitement levels in gambling disorder sufferers were specifically associated with increased dopamine release and not with better IGT performance.

**FIGURE 5 | Excitement levels between gambling disorder sufferers and healthy controls. (A)** Gambling disorder sufferers (PG, filled circles) with dopamine release (ΔBPND ≥ 0) have significantly higher excitement across games than healthy controls (HC, open circles) with dopamine release. **(B)** Gambling disorder sufferers (PG) and healthy controls (HC) without dopamine release (ΔBPND < 0) do not differ in excitement level across games. Reprint with permission from Linnet et al. (2011b).

**FIGURE 6 | Correlation between binding potential changes and excitement level.** Gambling disorder sufferers (PG, filled circles) show a significant correlation between excitement level on the abscissa and change in binding potential (ΔBPND) on the ordinate, while the correlation fails to reach significance level in healthy controls (HC, open circles). Values above zero indicate dopamine release, while values below zero indicate dopamine inhibition. Reprint with permission from Linnet et al. (2011b).

These data might suggest that individuals with gambling disorder suffer from a dopaminergic "double deficit" condition, where dopamine release is associated with both impaired gambling behavior and increased excitement levels, and that both factors may contribute to the gambling disorder.

## **CONCLUSION**

The studies presented here suggest that gambling disorder sufferers: (1) do not have lower baseline dopamine binding; (2) do not have dopaminergic hypersensitivity toward gambling per se; (3) do not have dopaminergic hypersensitivity toward winning; (4) show dopaminergic sensitivity toward uncertainty in outcomes consistent with reward prediction error; (5) show maladaptive gambling behavior with dopamine release; and (6) show increased gambling excitement with dopamine release.

Together, the evidence suggests that dopamine is involved in adaptive as well as maladaptive decision making in gambling. Dopamine may guide and reinforce advantageous decision making, as seen in healthy controls, and may have helped these individuals develop a strategy and stay on task. From the perspective of reward prediction error, healthy controls might have taken a problem solving approach to the IGT, where the dopamine release was associated with a phasic dopamine response from the adaptive decision making of identifying advantageous decks. In other words, healthy controls received a dopaminergic "reward" from developing good strategies.

On the other hand, dopamine might also be linked to disadvantageous decision making, and lead to long-term losses, as seen in gambling disorder sufferers. From the perspective of dopaminergic coding of uncertainty, these individuals might have seen the IGT as a game of chance and assumed a more risk-taking approach, where the dopamine release was associated with a sustained dopamine response from uncertainty. In other words, these individuals received a dopaminergic "reward" from uncertainty. Altogether, the dopaminergic dysfunctions may represent a "double deficit" condition, where dopaminergic dysfunctions toward risk and uncertainty reinforce maladaptive gambling behavior and increase excitement levels in gambling disorder.

However, the role of dopamine in gambling is complex and the suggestion of the three "fallacies" is therefore limited to the presented research. For instance, while there were no differences in PET measures of dopamine release between gambling disorder sufferers and healthy controls playing the IGT, there may be dopaminergic group differences in other contexts such as timing (e.g., dopaminergic activation in early or late anticipation or evaluation), type of game (e.g., real life gambling vs. IGT), motivational state, etc. Moreover, the temporal resolution of PET does not allow differentiation between phasic and sustained dopamine response. Furthermore, the findings are limited to the level of dopamine D2/3 receptors; dopaminergic neurotransmission may differ at, e.g., the level of dopamine D1 receptors. Finally, the list is not exhaustive, i.e., there may be other types of "fallacies," which challenge our understanding of the role of dopamine in gambling disorder and addiction.

In conclusion, the suggested "fallacies" and role of dopaminergic dysfunctions in the coding of reward prediction and uncertainty in gambling disorder presented here may serve as a starting point for further development of a dopaminergic model of addiction in gambling disorder and substance use disorders.

## **ACKNOWLEDGMENTS**

This study was supported by funding from the Danish Agency for Science, Technology and Innovation grant numbers 2049-03-0002, 2102-05-0009, 2102-07-0004, and 12-130953; and from the Ministry of Health grant numbers 1001326 and 121023; and from the National Center for Responsible Gaming as provided by The Institute for Research on Pathological Gambling and Related Disorders in the Division on Addictions at Cambridge Health Alliance. Its contents are solely the responsibility of the author and do not necessarily represent the official views of the Danish Agency for Science, Technology and Innovation, the Ministry of Health or the National Center, the Institute, or Cambridge Health Alliance. The author declares that he has no competing financial interests. The author wishes to thank the following persons for contributing to the research presented: Albert Gjedde, Doris Doudet, Kim Mouridsen, Arne Møller, and Ericka A. Peterson.

## **REFERENCES**

39, 376–389. doi: 10.1016/S0028-3932(00)00136-6


et al. (2013). The D2/3 dopamine receptor in pathological gambling: a positron emission tomography study with [11C]-(+)-propylhexahydro-naphtho-oxazin and [11C]raclopride. *Addiction* 108, 953–963. doi: 10.1111/add.12066


task and its neurological correlates. *Cereb. Cortex* 14, 1226–1232. doi: 10.1093/cercor/bhh083


*Drug Alcohol Depend.* 84, 231–239. doi: 10.1016/j.drugalcdep.2006.02.007


arousal. *Addiction* 98, 733–738. doi: 10.1046/j.1360-0443.2003.00412.x


*Psychol.* 54, 25–53. doi: 10.1146/annurev.psych.54.101601.145237


with pathological gambling: a [11C] raclopride PET study. *Brain* 132(Pt 5), 1376–1385. doi: 10.1093/brain/awp054


A. R., et al. (2008). Dopamine increases in striatum do not elicit craving in cocaine abusers unless they are coupled with cocaine cues. *Neuroimage* 39, 1266–1273. doi: 10.1016/j.neuroimage.2007.09.059


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 June 2013; accepted: 17 September 2013; published online: 08 October 2013.*

*Citation: Linnet J (2013) The Iowa Gambling Task and the three fallacies of dopamine in gambling disorder. Front. Psychol. 4:709. doi: 10.3389/fpsyg.2013.00709*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Linnet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## What have we learned about the processes involved in the Iowa Gambling Task from developmental studies?

*Mathieu Cassotti<sup>1,2</sup>\*, Ania Aïte<sup>1</sup>, Anaïs Osmont<sup>1</sup>, Olivier Houdé<sup>1,2</sup> and Grégoire Borst<sup>1</sup>*

*<sup>1</sup> Laboratory for the Psychology of Child Development and Education, CNRS Unit 8240, Sorbonne Paris Cité, Paris Descartes University, Paris, France <sup>2</sup> Institut Universitaire de France, Paris, France*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*William Hedgcock, University of Iowa, USA; Maggie E. Toplak, York University, Canada*

#### *\*Correspondence:*

*Mathieu Cassotti, Laboratory for the Psychology of Child Development and Education, CNRS Unit 8240, Université Paris Descartes, Sorbonne, Laboratoire A. Binet, 46 rue Saint Jacques, 75005 Paris, France e-mail: mathieu.cassotti@parisdescartes.fr*

Developmental studies using the Iowa Gambling Task (IGT) or child-friendly adaptations of the IGT converged in showing that children and adolescents exhibit a strong bias in favor of disadvantageous choices whereas adults learn to decide advantageously during the course of the task. In the present article, we reviewed developmental studies that used the IGT or child-friendly adaptations of the IGT to show how these findings provide a better understanding of the processes involved in decision-making under uncertainty. For instance, developmental studies have underlined that until late adolescence, the dominant strategy is to focus only on the frequency of punishment and to choose among options with infrequent losses. Indeed, school-aged children and adolescents' choices in the IGT seem to be guided by the loss frequency leading them to fail in distinguishing between advantageous and disadvantageous options. In addition, recent developmental studies revealed that adults switch less often after losses than school-aged children and adolescents. These findings suggest that psychological tolerance to loss may facilitate learning the characteristics of each option, which in turn improves the ability to choose advantageously. In conclusion, developmental studies help us refine our understanding of decision-making.

**Keywords: Iowa Gambling Task, emotion-based learning, executive control, inhibition (psychology), children and adolescents, developmental psychology, loss aversion**

## **INTRODUCTION**

In most situations of everyday life, people make decisions in circumstances where some information about the potential outcomes of choices is lacking and must be inferred from experience. Over the last two decades, considerable efforts in the field of psychology and neuroscience have been leveled at identifying the processes involved in this category of decision-making situations (Bechara and Damasio, 2005; Dunn et al., 2006). In particular, from a developmental perspective, a growing body of research suggests that the ability to decide advantageously evolves with age (see **Table 1**; Crone et al., 2005; Cassotti et al., 2011; Aïte et al., 2012; Beitz et al., 2014). Therefore, in the present article, we reviewed these developmental studies to demonstrate how they help us refine our understanding of decision-making under uncertainty.

According to the Somatic Marker Hypothesis (SMH) (Bechara et al., 2000; Bechara and Damasio, 2005), emotion-related signals, developed from past experience of the emotional consequences following choices, guide decision making in situations of uncertainty. More specifically, this theory assumes that advantageous decisions rely on the development of an integral emotional reactivity (i.e., somatic marker). These somatic markers allow one to avoid disadvantageous options and to develop a preference for advantageous ones.

Most of the empirical support for this model came from studies using the Iowa Gambling Task (IGT, Bechara and Damasio, 2005), a task initially developed to simulate the inherent uncertainty of daily-life decision situations through an opaque gain-loss schedule. The IGT consists of a card game in which participants are instructed to win as much money as possible by selecting among four possible decks of cards (labeled A, B, C, or D). Importantly, the decks' characteristics are not disclosed and must be gradually inferred from the feedback obtained during the game. Indeed, feedback is provided after each selection so that participants systematically win some money, but also unpredictably lose some. The four decks differ in the magnitude of wins and losses in such a way that, to succeed at this game, players must withdraw from decks that are attractive but disadvantageous in the long term (A, B) and opt for decks that are less attractive but advantageous in the long term (C, D).

Typically, healthy adults progressively and implicitly learn to choose advantageously during the course of the task (Crone and van der Molen, 2004; Dunn et al., 2006; Cassotti and Moutier, 2010; Cassotti et al., 2011; Turnbull et al., 2014). In addition, advantageous performance has been linked to an anticipated emotional reactivity, as measured by Skin Conductance Responses (SCRs). Indeed, healthy participants display gradually higher anticipatory SCRs before picking a card in disadvantageous decks than in advantageous ones, suggesting that an emotional warning signal leads them to avoid disadvantageous choices (Bechara et al., 2000; Carter and Smith-Pasqualini, 2004; Guillaume et al., 2009). Similarly, anticipatory heart rate responses and SCRs were found to be critical to distinguish good and bad performers in such decision-making under uncertainty tasks, suggesting a key role of this anticipated emotional reactivity in the ability to make advantageous decisions (Crone et al., 2004; Denburg et al., 2006).

**Table 1 | Example of developmental studies on decision-making under ambiguity.** *Each line refers to authors, age and number of participants, task design and major findings of the study.*

In agreement with the SMH, both neuropsychological and neuroimaging studies have shown the involvement of specific brain regions implicated in emotional processes in the IGT (Reimann and Bechara, 2010). More specifically, poor IGT performance and lower anticipatory SCRs were observed in amygdala-damaged patients (Bechara et al., 1999) as well as in patients with ventromedial prefrontal cortex (VMPFC) lesions (Bechara and Damasio, 2005). Using functional Magnetic Resonance Imaging, several studies have reported the activation of a prefrontal network including the VMPFC in the IGT, which provides convergent evidence that decision-making under uncertainty might rely on an emotional neural circuitry (Lawrence et al., 2008; Li et al., 2010).

Additional empirical evidence in favor of the SMH came from studies exploring the influence of the emotional context on advantageous decisions in the IGT (Hinson et al., 2006; Davies and Turnbull, 2011; Aïte et al., 2013; see Turnbull et al., 2014 for a review). In response to remaining criticism about the reliability of electrophysiological measures such as SCRs (Tomb et al., 2002; Dunn et al., 2006), Aïte et al. (2013) recently designed an emotional priming paradigm of the IGT to determine whether the ability to choose advantageously in ambiguous situations is driven by an integral emotional signal, as assumed by the SMH. In this study, the emotional context was either congruent or incongruent with the feedback delivered after each choice and was manipulated using pictures of either happy or fearful faces (Tottenham et al., 2009). The results of this study strongly support the SMH by showing that decision making was improved when the integral emotional signal was reinforced by a congruent emotional context and impaired when the integral emotional signal was disrupted by an incongruent emotional context.

## **CHILD FRIENDLY ADAPTATION OF THE IGT AND ADVANTAGEOUS DECISION-MAKING**

Given that neuroimaging studies over the past 20 years have consistently shown continuing neuroanatomical and neurofunctional development of the prefrontal cortex across childhood and adolescence (Crone and Dahl, 2012), numerous developmental studies examining decision making under uncertainty have focused on changes in performance on child-friendly adaptations of the IGT between school-aged children and adolescents (Crone et al., 2005; Cassotti et al., 2011; Beitz et al., 2014).

One of the first studies that investigated developmental changes in decision-making ability during adolescence showed that this ability continues to improve until late adolescence and even in adulthood (Crone and van der Molen, 2004). The authors of this study designed an age-appropriate version of the IGT, the Hungry Donkey Task (HDT), in which the stimulus presentation was modified to make the task more meaningful for children (Crone and van der Molen, 2004). Indeed, rather than picking cards to win money for themselves, participants are invited to assist a hungry donkey in winning as many apples as possible by opening doors. Two doors (A and B) constitute disadvantageous choices (resulting in overall loss), and two doors (C and D) advantageous choices (resulting in overall gain). Critically, this task maintains the basic format and a similar schedule of rewards and losses as those described by Bechara et al. (1994). As in other IGT studies, participants' performance is measured in terms of changes in individuals' net scores across blocks of 20 trials, where the net score is obtained by subtracting the number of choices in disadvantageous doors from the number of choices in advantageous doors.
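The net score described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `net_scores`, the door labels as single letters, and the example choice sequence are hypothetical and are not taken from the reviewed studies.

```python
from typing import List

ADVANTAGEOUS = {"C", "D"}      # doors/decks resulting in overall gain
DISADVANTAGEOUS = {"A", "B"}   # doors/decks resulting in overall loss


def net_scores(choices: List[str], block_size: int = 20) -> List[int]:
    """Return (advantageous - disadvantageous) choice counts for each block."""
    scores = []
    for start in range(0, len(choices), block_size):
        block = choices[start:start + block_size]
        good = sum(c in ADVANTAGEOUS for c in block)
        bad = sum(c in DISADVANTAGEOUS for c in block)
        scores.append(good - bad)
    return scores


# Hypothetical 40-trial sequence: mostly bad doors first, mostly good doors later.
example = ["A", "B"] * 8 + ["C", "D"] * 2 + ["C", "D"] * 9 + ["A", "B"]
print(net_scores(example))  # [-12, 16]
```

A net score that rises from negative to positive across blocks corresponds to the learning pattern reported for adults.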

In a series of behavioral studies, Crone and colleagues have demonstrated that 6- to 9-year-old children and 10- to 12-year-old children fail to avoid disadvantageous options during the course of the task, as opposed to 13- to 15-year-old adolescents, who gradually learn to choose advantageously (Crone and van der Molen, 2004, 2007; Crone et al., 2005; Huizenga et al., 2007). Nevertheless, adolescents' performance is still suboptimal compared to that of adults, suggesting that the ability to distinguish between advantageous and disadvantageous options continues to improve during late adolescence (Overman, 2004; Overman and Pierce, 2013). Although some authors point out that decision making involving personal gain or loss may develop differently from decision making aimed at helping another, as in the HDT, developmental studies using the standard IGT have confirmed that advantageous decision making progressively increases until early adulthood (Hooper et al., 2004; Cassotti et al., 2011; Beitz et al., 2014). It has been proposed that this age-related improvement in performance on child adaptations of the IGT during adolescence may be due to the slow functional maturation of the VMPFC until early adulthood (Crone and van der Molen, 2004, 2007).

## **DEVELOPMENT OF THE FREQUENCY BIAS**

Because the IGT is a complex task, different factors may contribute to the similar decision-making deficit observed in children and VMPFC patients. In support of this hypothesis, developmental studies revealed that children have a marked preference for options associated with infrequent losses, regardless of whether these options are advantageous or disadvantageous in the long run (Crone and van der Molen, 2004, 2007; Crone et al., 2005; Huizenga et al., 2007; Carlson et al., 2009; Cassotti et al., 2011). Indeed, the four options proposed in the IGT and the child-friendly versions of this task also differ in the frequency of losses, with two options associated with a low frequency of losses (10% for decks/doors B and D) and two options associated with a medium frequency of losses (50% for decks/doors A and C). Developmental studies converged in showing that children and adolescents increasingly opt for choices associated with infrequent, rather than frequent losses during the task and that this frequency bias decreases with age (Huizenga et al., 2007). In contrast, adults display a strong preference for low loss frequency choices early in the task but progressively opt for advantageous options during the course of the task. To the best of our knowledge, this specific sensitivity to options associated with a low frequency of losses has not been observed in VMPFC patients.
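Because the four options cross long-run outcome with loss frequency, a preference for infrequent losses can be separated from a preference for advantageous options. The sketch below illustrates this with two simple indices. The deck properties come from the description above; the function `choice_indices` and the `frequency_index` are illustrative constructs assumed here, not measures taken from the reviewed studies.

```python
from collections import Counter
from typing import Dict, Iterable

# Deck properties as described in the text: long-run outcome and loss frequency.
DECKS = {
    "A": {"advantageous": False, "loss_frequency": 0.5},
    "B": {"advantageous": False, "loss_frequency": 0.1},
    "C": {"advantageous": True, "loss_frequency": 0.5},
    "D": {"advantageous": True, "loss_frequency": 0.1},
}


def choice_indices(choices: Iterable[str]) -> Dict[str, int]:
    """Score a choice sequence on long-run outcome and on loss frequency."""
    counts = Counter(choices)
    # (C + D) - (A + B): the standard net score.
    outcome = sum(n if DECKS[d]["advantageous"] else -n for d, n in counts.items())
    # (B + D) - (A + C): hypothetical index of preference for infrequent losses.
    frequency = sum(
        n if DECKS[d]["loss_frequency"] < 0.5 else -n for d, n in counts.items()
    )
    return {"outcome_index": outcome, "frequency_index": frequency}


# A chooser guided only by loss frequency alternates between B and D.
print(choice_indices(["B", "D"] * 10))
# {'outcome_index': 0, 'frequency_index': 20}
```

A participant guided purely by loss frequency therefore obtains a net score of zero, which is how a frequency bias can produce a failure to distinguish advantageous from disadvantageous options.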

Furthermore, Crone and van der Molen (2007) demonstrated that adolescents display higher anticipatory SCRs preceding a choice among options associated with frequent losses in contrast to adults who generate progressively higher anticipatory SCRs before disadvantageous selections (Bechara and Damasio, 2005). These results suggest (a) that loss frequency highly influences children and adolescents' decision making, and (b) that children and adolescents exhibit difficulties in considering both the frequency and the amount of loss leading them to make long-term disadvantageous decisions (see also Jansen et al., 2011; van Duijvenvoorde et al., 2012). Interpreted within the framework of the SMH, these data suggest that adolescents reactivate the negative emotional responses associated with high-loss frequency options without taking into account the final outcome.

Interestingly, recent evidence underlined a comparable inability to integrate both loss frequency and final outcome in adults' decision making (Lin et al., 2009; Cassotti and Moutier, 2010). For example, using the Soochow Gambling Task (SGT), an adaptation of the IGT designed to directly contrast the impact of loss frequency and final outcome on decision making, Lin et al. (2009) demonstrated that adults substantially based their choices on loss frequency rather than on final outcome (i.e., the advantageous or disadvantageous nature of the decks in the long run). More specifically, when the frequency of loss in the decks is manipulated to take either one of two extreme values (80 vs. 20%), adults are guided by gain–loss frequency, leading them to fail in distinguishing between advantageous and disadvantageous choices. Given that this loss-frequency bias was observed only in children and adolescents when the classical IGT was used, Aïte et al. (2012) further explored developmental changes in the ability to consider both the loss frequency and the final outcome in decision making using an age-adapted version of the SGT. Results confirmed that children and adolescents not only preferred choices associated with infrequent losses in the SGT but also failed to differentiate advantageous options from disadvantageous ones, a developmental pattern similar to the one previously evidenced using the IGT. Contrary to Lin et al.'s study (2009), findings indicated that adults did manage to consider the final outcome when making their decision but only for options associated with low-loss frequency. Thus, adults are guided not only by the loss frequency but also by the amount of the loss, as evidenced by a preference for the advantageous option among the low-loss frequency options.
Taken together, developmental studies using the IGT, the HDT, or the SGT converge in showing a robust frequency bias in childhood and adolescence that decreases in adulthood (Crone et al., 2005; Crone and van der Molen, 2007; Aïte et al., 2012).

## **STRATEGIC ADJUSTMENTS OF DECISION MAKING**

A large majority of previous studies measured performance in the IGT or child adaptations of the IGT in terms of changes in the difference between the number of advantageous selections and the number of disadvantageous selections. However, such measures provide no information on the strategic adjustments that immediately follow gains and losses over the course of the task. Given that participants can choose any of the four options on each trial, age differences observed in the standard net score and the tendency to focus on loss frequency could result from age differences in the strategic exploration of the four options.

In line with this hypothesis, Cassotti et al. (2011) have recently examined response-switching behavior following rewards and punishments in children, adolescents and adults. In this study, adults tended to persevere with the same choice after a win (i.e., a "win–stay" strategy) and to shift to a new choice after a loss (i.e., a "loss–shift" strategy). In contrast, children and adolescents failed to control a spontaneous tendency to explore the different options, as shown by a higher frequency of switches following gains and losses in children and adolescents than in adults. Another study has not only confirmed these developmental differences but has also outlined a possible role for these strategic adjustments following gains and losses (Aïte et al., 2012). Given that the proportion of switches after losses correlated negatively with the number of advantageous selections, the reduced loss-shift pattern of response observed in adults, as compared to the two younger groups, could constitute a critical adaptive process allowing one to choose advantageously. Indeed, adults' tolerance to loss may allow them to learn the characteristics of each option and to increase their ability to consider not only loss frequency but also final outcome.
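The switching measures discussed above can be computed from a trial-by-trial record. The following sketch assumes each trial is stored as a hypothetical `(choice, lost)` pair, where `lost` marks trials on which a loss was delivered; the function name `switch_rates` and the example data are illustrative only.

```python
from typing import Dict, List, Tuple


def switch_rates(trials: List[Tuple[str, bool]]) -> Dict[str, float]:
    """Proportion of trials on which the choice changed, split by prior feedback.

    `trials` is an ordered list of (choice, lost) pairs; `lost` is True when
    the trial included a loss.
    """
    switches = {"after_loss": 0, "after_gain": 0}
    totals = {"after_loss": 0, "after_gain": 0}
    for (prev_choice, prev_lost), (choice, _) in zip(trials, trials[1:]):
        key = "after_loss" if prev_lost else "after_gain"
        totals[key] += 1
        switches[key] += choice != prev_choice
    return {
        key: (switches[key] / totals[key] if totals[key] else float("nan"))
        for key in totals
    }


# Hypothetical record of six trials.
trials = [("A", False), ("A", True), ("B", False),
          ("B", False), ("C", True), ("C", False)]
rates = switch_rates(trials)
print(rates["after_loss"])  # 0.5
```

A lower `after_loss` rate corresponds to the greater tolerance to loss described for adults.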

In line with these developmental studies (Cassotti et al., 2011; Aïte et al., 2012; van Duijvenvoorde et al., 2012), computational models that have included win-stay/loss-shift strategies to predict performance in the IGT (Worthy et al., 2007; Worthy and Maddox, 2012) converged in showing that the tendency to inhibit the loss-shift response is a central component of decision-making behavior in the IGT. Altogether, these results suggest that immature decision making may be due to a difficulty in exerting inhibitory control over an automatic loss-shift response.
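To make the link between a loss-shift response and loss frequency concrete, the toy simulation below runs a win-stay/loss-shift agent on four decks whose loss frequencies follow the values given earlier (50% for A and C, 10% for B and D). It is a minimal sketch under explicit assumptions and not the model of Worthy and colleagues: losses are drawn independently on each trial, the agent ignores loss magnitude, and the parameters `p_stay_win` and `p_shift_loss` and the function `simulate_wsls` are hypothetical.

```python
import random
from collections import Counter

# Loss frequencies per deck, as described in the text.
LOSS_FREQUENCY = {"A": 0.5, "B": 0.1, "C": 0.5, "D": 0.1}


def simulate_wsls(n_trials: int, p_stay_win: float, p_shift_loss: float,
                  seed: int = 0) -> Counter:
    """Count deck choices made by a win-stay/loss-shift agent."""
    rng = random.Random(seed)
    decks = sorted(LOSS_FREQUENCY)
    choice = rng.choice(decks)
    counts = Counter()
    for _ in range(n_trials):
        counts[choice] += 1
        # Assumption: a loss occurs independently with the deck's loss frequency.
        lost = rng.random() < LOSS_FREQUENCY[choice]
        p_shift = p_shift_loss if lost else 1.0 - p_stay_win
        if rng.random() < p_shift:
            # Shift to one of the three other decks, chosen at random.
            choice = rng.choice([d for d in decks if d != choice])
    return counts


# An agent that always stays after a win and always shifts after a loss.
counts = simulate_wsls(10_000, p_stay_win=1.0, p_shift_loss=1.0)
low_freq_share = (counts["B"] + counts["D"]) / sum(counts.values())
print(round(low_freq_share, 2))
```

Under these assumptions, an agent that always shifts after a loss ends up spending most of its trials on the low loss frequency decks B and D, whatever their long-run value, because it is driven away from decks A and C more often.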

## **CONCLUSION**

In the present review, we have discussed how developmental studies have made significant progress in the understanding of the processes involved in emotion-based learning in the IGT. Behavioral and electrophysiological studies have clearly demonstrated a focus on loss frequency (Crone et al., 2005; Crone and van der Molen, 2007) and a preponderance of a loss-shift strategy in children and adolescents as compared to adults (Aïte et al., 2012). In line with recent behavioral and neuroimaging studies (Spear, 2011; Habib et al., in press), children and adolescents might be more focused on loss frequency and might rely more on a loss-shift strategy because of an exacerbated aversion to losses. Given that only adults have the ability to choose advantageously (Crone and van der Molen, 2004, 2007; Overman, 2004; Overman and Pierce, 2013), the subtle developmental differences regarding the factors that guide decision making and the strategic adjustments that are used might reflect the critical components needed for making advantageous decisions. Indeed, the data collected in developmental studies suggest that the ability to develop emotion-related signals that integrate both loss frequency and final outcome requires inhibitory control of intuitive exploration strategies. As such, the present article provides new fuel for the current debates on the respective contribution of executive control and emotion-based learning in the IGT.

## **REFERENCES**

Aïte, A., Borst, G., Moutier, S., Varescon, I., Brown, I., Houdé, O., et al. (2013). Impact of emotional context congruency on decision making under ambiguity. *Emotion* 13, 177–182. doi: 10.1037/a0031345


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 March 2014; accepted: 31 July 2014; published online: 20 August 2014. Citation: Cassotti M, Aïte A, Osmont A, Houdé O and Borst G (2014) What have we learned about the processes involved in the Iowa Gambling Task from developmental studies? Front. Psychol. 5:915. doi: 10.3389/fpsyg.2014.00915*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Cassotti, Aïte, Osmont, Houdé and Borst. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Emotion-based learning: insights from the Iowa Gambling Task

## *Oliver H. Turnbull\*, Caroline H. Bowman, Shanti Shanker and Julie L. Davies*

School of Psychology, Bangor University, Bangor, UK

#### *Edited by:*

Yao-Chu Chiu, Soochow University, Taiwan

#### *Reviewed by:*

Masataka Watanabe, Tokyo Metropolitan Organization for Medical Research, Japan

Varsha Singh, Indian Institute of Technology Delhi, India

#### *\*Correspondence:*

Oliver H. Turnbull, School of Psychology, Bangor University, Bangor, LL57 2AS Wales, UK e-mail: o.turnbull@bangor.ac.uk

Interest in the cognitive and/or emotional basis of complex decision-making, and the related phenomenon of emotion-based learning, has been heavily influenced by the Iowa Gambling Task. A number of psychological variables have been investigated as potentially important in understanding emotion-based learning. This paper reviews the extent to which humans are explicitly aware of how we make such decisions; the biasing influence of pre-existing emotional labels; and the extent to which emotion-based systems are anatomically and functionally independent of episodic memory. A review of the literature suggests that (i) an aspect of conscious awareness does appear to be readily achieved during the IGT, but as a relatively unfocused emotion-based "gut-feeling," akin to intuition; (ii) several studies have manipulated the affective pre-loading of IGT tasks, and make it clear that such labeling has a substantial influence on performance, an experimental manipulation similar to the phenomenon of prejudice; and (iii) complex emotion-based learning can remain intact despite profound amnesia, at least in some neurological patients, a finding with a range of potentially important clinical implications: in the management of dementia; in explaining infantile amnesia; and in understanding the possible mechanisms of psychotherapy.

**Keywords: emotion-based learning, intuition, prejudice, psychotherapy, episodic-memory**

## **INTRODUCTION**

Over the last few decades, there has been a growing interest in the cognitive and/or emotional basis of complex decision-making (e.g., Bechara et al., 1994; Damasio et al., 1996; Rogers et al., 1999; Manes et al., 2002; Turnbull et al., 2003; Bowman et al., 2005; Peatfield et al., 2012). This interest was, in large part, inspired by the well-established finding that neurological patients with lesions to ventromesial (VM) frontal lobes often showed normal intelligence, with normal or near-to-normal performance on a range of "executive" tasks (e.g., Bechara et al., 2000b). However, in spite of these domains of preservation, such individuals often displayed difficulties in learning from past mistakes, with real-life manifestations such as entering repeatedly into inappropriate relationships and unsuitable business agreements. Such decisions may immediately seem rewarding, but typically prove to be counter-productive in the long run, often leading to career termination and financial losses (Damasio et al., 1991; Bechara et al., 2000a). Notably, such individuals display failures in using emotional feedback from previous situations (i.e., the punishing consequences of impulsive actions) in the guidance of their future choices.

Measuring these decision-making failures in the real-world is challenging, both ethically and methodologically. The Iowa Gambling Task (IGT) was developed as a simple neuropsychological tool to tap into such deficits in emotional-processing, which might be associated with complex decision-making difficulties, as observed in individuals with frontal lobe lesions (Rolls et al., 1994; Damasio et al., 1996; Lezak et al., 2012). In a poetic turn of phrase, patients with VM lesions were argued to have "myopia for the future" (p. 217), where their focus was on the *immediate* outcome of decisions, with an apparent indifference to the long-term consequences of their actions (Bechara et al., 1994; Bechara, 2005).

A key element of the recent complex decision-making literature has been the role of *emotion* (Bechara et al., 1994; Damasio et al., 1996; Rogers et al., 1999; Manes et al., 2002), and indeed its ability to drive emotion-based learning (EBL) during complex decision-making (Damasio et al., 1996; LeDoux, 1996, 2000; Turnbull et al., 2003, 2006). EBL systems are known to facilitate insights about the possible outcome of complex decisions, based on prior experience of the emotional consequences of actions, with particular objects and/or agents (Claparède, 1951; Johnson et al., 1985; Tranel and Damasio, 1993; Bechara et al., 1994; Damasio et al., 1996; Rogers et al., 1999; LeDoux, 2000). The role of emotion in such decision-making is supported by studies of patients with VM frontal, amygdala, and insular lesions (e.g., Bechara et al., 1997, 1999, 2003; Clark et al., 2008), as well as studies measuring skin-conductance changes (e.g., Bechara et al., 1996, 1997, 1999; see also Suzuki et al., 2003). Importantly (see below), this class of memory (or learning) appears to be independent of the episodic memory systems of the medial temporal lobe (Claparède, 1951; Tulving and Schacter, 1990; Turnbull and Evans, 2006; Evans-Roberts and Turnbull, 2011).

#### **THE IOWA GAMBLING TASK**

The IGT (Bechara et al., 1994) has become the key experimental paradigm in the evaluation of emotion-based decision-making, especially when humans are faced with emotion-mediated information, ambiguous contingencies, and uncertain consequences (e.g., Rogers et al., 1999; Manes et al., 2002; Bowman and Turnbull, 2004; Happaney et al., 2004). The IGT has been extraordinarily influential, with Bechara et al.'s (1994) original paper having already acquired over 3000 citations according to a Google Scholar search (November 2013). The spread of influence is also remarkably diverse, spanning a range of theoretical and clinical papers in psychiatry (e.g., Cavedini et al., 2002; Evans et al., 2005; Must et al., 2006), psychology (e.g., Schmitt et al., 1999; Blair et al., 2001), neuropsychology (Turnbull and Evans, 2006; Torralva et al., 2007), and neurology (e.g., Bechara et al., 1999; North and O'Carroll, 2001; Cavedini et al., 2002; Anderson et al., 2006).

A number of psychological variables have been investigated as potentially important in understanding the nature of these EBL systems. The most prominent of these are (i) the extent to which we are explicitly aware of the basis of such decisions; (ii) the biasing influence of pre-existing emotional labels in complex decision making; and (iii) the extent to which EBL systems are anatomically and functionally independent of episodic memory systems. Each of these issues is briefly reviewed in this article.

## **DECISION-MAKING OUTSIDE AWARENESS**

An important element in our understanding of the nature of emotion-based learning, and the factors that drive learning on the IGT in particular, is the question of conscious awareness. The Iowa group have argued that the IGT is extremely complex in nature (Damasio, 1994, pp. 205–222), and that participants do not appear to explicitly understand the contingencies of the game (Bechara et al., 1994, 1996, 1997, 2000b). In analyzing this issue, it is important to keep in mind the definition of "awareness" used by the original Iowa group (e.g., Bechara et al., 1997) – an issue which may explain some of the emergent controversies amongst IGT researchers.

Bechara et al. (1997) explored how participants "conceptualized" the task, by which they appear to have meant the broad understanding of the contingency values on the IGT, and the types of (explicit) strategies used on the task. In their study, they asked participants (patients with VM lesions and neurologically normal controls) two questions: "(i) Tell me all you know about what is going on in this game? (ii) Tell me how you feel about the game" (Bechara et al., 1997, p. 1293). In other words, they sought a definition of "awareness" which emphasized formal but also general (and, arguably, rational or cognitive) understanding, as well as broadly based feelings about the task. An initial phase of task awareness was labeled as the "*hunch*" period, where neurotypical participants experienced conscious, but poorly formed impressions about the task (Bechara et al., 1997). During this period, neurotypical participants reported "liking" or "disliking" certain decks, often guessing the general contingencies of the decks. During a later phase of the task, most neurotypicals reached a "conceptual" period – developing a better awareness of the rewarding nature of the decks. Notably, after encountering losses on specific decks, (neurotypical) participants developed pre-decision anticipatory skin conductance responses (SCRs). Neurological patients, by contrast, did not generate these anticipatory SCRs, nor did they tend to enter the "*hunch*" period. In the later periods most neurological patients were unable to shift their pattern of choice away from the "bad" decks, though many did develop a conceptual awareness. However, Bechara et al. (2000a, p. 301) reported some instances of the famous "knowing versus doing" dissociation (first noted by Teuber, 1964), where "... patients "say" the right thing but "do" the wrong thing".
Even more paradoxically, they reported that some neurotypical participants did *not* reach the conceptual period in that they did not describe an awareness regarding which decks were good and which were bad, *yet* they still made increasingly advantageous choices over time (Bechara et al., 1996, 1997).

In sum, they appear to suggest that conscious awareness on the task, and good performance are unrelated. The Iowa group explained the "unconscious" (unaware) nature of these decisions in terms of the somatic marker hypothesis (SMH; Damasio, 1989a,b, 1994; Damasio et al., 1991, pp. 205–222; Bechara et al., 2000a), where "bodily" (i.e., extra-cerebral) systems play a role in facilitating decisions (Bechara et al., 1997). This proposition has received some experimental support (Bechara et al., 2000a), but it has also attracted criticism (Tomb et al., 2002; see Dunn et al., 2006 for a review).

Further support for advantageous decision-making occurring *outside* of explicit awareness, might be argued to come from the "BLINK" task (Peatfield et al., 2012). An analog of the classic IGT, BLINK is some 25 times faster to complete than the conventional computerized IGT (Bechara et al., 1999). Here, individual decisions are presumably so rapid that little opportunity arises for conscious awareness to develop, thus meeting the criteria for "fast and frugal" decision-making (Gigerenzer, 2004). Notably, in spite of the rapid response rate on the BLINK paradigm, participants show IGT-like performance (see Peatfield et al., 2012 for a detailed discussion of BLINK).

In recent years, Bechara's claim of advantageous decision-making outside of awareness has been shaped by a series of papers which suggest that *some* forms of conscious awareness *are* available to participants on the IGT. The first of these more formal investigations was reported in Maia and McClelland's (2004) study, based on a structured questionnaire that assessed participants' knowledge of the IGT. Maia and McClelland probed the general awareness of task contingencies, without asking participants to specify the cognitive details underlying their understanding. Importantly, most participants who made advantageous choices, and thus showed preference for one or more of the decks, also demonstrated conscious feelings about the decks. Indeed, by the end of Block 1 (i.e., 20 card selections made) participants were able to report basic affective properties of decks, and by 50 card selections, the majority of participants could correctly report "good" decks. Such understanding would readily correspond to participants' decision preference (see Maia and McClelland, 2004).

Their results suggested that when behaving advantageously, participants not only had access to some explicit knowledge about the "goodness" or "badness" of the deck, but also had reportable knowledge that was well placed to facilitate choice (Maia and McClelland, 2004, p. 16078). Maia and McClelland (2004) therefore claimed that participants playing the IGT *did* have access to explicit awareness about the contingencies of the game. They argued that this resulted from the self-paced nature of the task, which allowed ample time for deliberative reasoning, and also that the outcomes of choices were presented in a clear numerical form, which aided explicit tracking and learning of the incentive nature of each deck (at least to *some* degree – though see Peatfield et al., 2012 above). Thus, Maia and McClelland (2004) posited a degree of conscious awareness of the task in participants, albeit of a different form to that proposed by Bechara et al. (1997). Indeed, this difference was captured by asking participants probing questions about the task, rather than by assessing notoriously difficult-to-verbalize and general feelings. Therefore, Maia and McClelland's (2004) quantitative method successfully examined explicit awareness, but failed to tap affectively mediated *qualitative* knowledge (feelings) about the game that may indeed facilitate favorable choices. Importantly, Maia and McClelland (2004) suggest that multiple sources of information might guide choice during complex decisions.

Further empirical support on the question of awareness comes from the work of Bowman et al. (2005). In this study, participants quantitatively rated the "goodness/badness" of each deck after each twenty-card block. Bowman et al.'s (2005) data suggested that participants could explicitly report affective evaluation (i.e., the relative goodness/badness) of the task objects, even during the "pre-hunch" phase (Bechara et al., 1997). In fact, participants showed obvious awareness of the "valence" of the decks, even following the *first* 20 trials of the task. Other studies (e.g., Evans et al., 2004; Cella et al., 2007) using the same method of tracking subjective task awareness confirm these original findings, and indeed extend them to a psychiatric population (Evans et al., 2005). However, Turnbull et al. (2007) confirmed, in neurotypicals, that dissociations do occur between explicit deck ratings and behavioral choices on the IGT – suggesting that participants can and do actively ignore explicit knowledge regarding the incentive values of their choices, in favor of implicit emotion-mediated knowledge, especially in situations where varying sources of information come into conflict.

Thus, it appears that explicit (emotion-mediated) knowledge of the incentive values of choices is available *much* earlier than originally claimed by Bechara et al. (1997). This form of awareness is also substantially different in quality from that encountered during explicit cognitive approaches to decision making (Gilhooly and Murphy, 2005). The descriptions of these decisions emphasize the fact that the non-cognitive choices are, in contrast, poorly formed ("a hunch") and laden with affect ("a gut feeling"). It is perhaps this knowledge that subserves the phenomenon that has been long described as "*intuition*" (see also Kahneman and Tversky, 1973; Stanovich and West, 2000; Kahneman, 2003; Turnbull et al., 2003, 2005).

## **INTUITION?**

We are therefore faced with an interesting, and under-investigated, phenomenon, whereby humans are aided in navigating complex and uncertain problem-spaces, via the awareness of emotion-based signals – presumably derived from prior experience of objects and/or agents. Kahneman et al. (1982; see Kahneman, 2003) have long described the properties of such *intuitive* responses as being fast, explicit, effortless, and emotionally laden. Stanovich and West (2000) have proposed a similar dichotomy (e.g., Hogarth, 2001; Myers, 2002). Both seek to discriminate between systems underpinning "intuition" versus "reasoning" (Kahneman, 2003). One approach (intuition; or System 1) generates an overall and apparently imprecise general impression of objects or situations, through an involuntary process sometimes described as *natural assessment* (Tversky and Kahneman, 1983). This phenomenon emerges without intention or effort, and could not (they argued) be verbalized explicitly. In contrast, the reasoning pathway (System 2) is involved when more formal *judgements* are made, even if these are not overtly expressed (Kahneman, 2003; for more on this in relation to the IGT see Bechara, 2005; Cella et al., 2007; Stocco et al., 2009). However, such reason-based decisions were always intentional and explicit.

"*Intuitive*" is therefore a label which appears to capture a decision process reflecting imprecise and emotion-based impressions. We have argued that such EBL systems may pre-empt or guide reason-based choice, when faced with settings involving combinations of a complex problem space; high levels of uncertainty and ambiguity; and laden or infused with affect. Interestingly, this literature potentially links to emotion-based systems of the sort found in psychiatric disorders (Evans et al., 2005), or neurological disorders of emotion regulation (Fotopoulou et al., 2004) – where both groups show impaired understanding in the form of delusional beliefs. These affectively laden biases may perhaps appear without conscious awareness, and lack explicit understanding, even when producing successful outcomes (Damasio, 1994, pp. 187–189; Turnbull et al., 2007).

In sum, one *form* of conscious awareness *does* appear to be readily achieved during the IGT, but this is in the sense of an emotion-based impression: "How much do I like this object?" (Bowman et al., 2005; Evans et al., 2005), though this may also explain why Bechara et al. (1997) report that optimal IGT decision-making operates outside of formal *cognitive* scrutiny.

## **PRE-EXISTING AFFECTIVE BIAS ON THE IGT**

The IGT is usually regarded as a good simulation of the complexity of real-world decision-making, given that it involves exploratory decisions under both risk *and* ambiguity (e.g., Brand et al., 2007), with shifting contingencies over time. Although other tasks may provide a better psychological dissection of the decision-making processes (e.g., Fellows, 2004; Dunn et al., 2006, 2010; Brand et al., 2007), the IGT is typically regarded as affording an ecologically rich and complex problem space (Damasio et al., 1991; Bechara et al., 1994). Of particular interest is the "balance," or trade-off, between cognition and affect, as a measure of adaptive task performance (e.g., Manes et al., 2002; Fellows and Farah, 2005; Dunn et al., 2006, 2010; Cassotti et al., 2011). For instance, affective states appear to especially underpin adaptive decisions in the early "opaque" and ambiguous period of the IGT, with the latter phase of the task (as discussed above) more readily informed by conscious awareness of the incentive properties (e.g., Maia and McClelland, 2004; Bowman et al., 2005; Dunn et al., 2006; Brand et al., 2007; Wagar and Dixon, 2007; Stocco and Fum, 2008).

What then of the fact that humans are often biased, or predisposed, toward objects even before they first encounter them? And how does this bias shift over time? Notably, the IGT involves an intrinsic affective *shift,* where initially learned associations require reversal for adaptive behavior on the task (Fellows and Farah, 2005). In many ways, such pre-existing affective biases might be regarded as the psychological foundation of prejudice – for example where humans express a pre-existing negative evaluation, in the absence of knowledge of the object's intrinsic properties (e.g., Allport, 1954/1979). Overcoming such biases clearly requires reversal of an affectively laden association. Notably, such social biases are understood to be both common and well-established, with the potential to linger outside full awareness (Devine, 1989; Amodio et al., 2003; Gregg et al., 2006). Indeed, the notion that most objects rapidly and automatically evoke affective states is now well-established (e.g., Zajonc, 1980; LeDoux, 1996; Ito and Urland, 2003; Cunningham et al., 2004). Therefore, an ecologically valid *starting* point for the IGT would be a set of objects which are affectively laden, rather than neutral.

A relevant distinction, and one often stressed by the social cognition literature, is that affect can be sourced from an evaluation of the features of the target itself (*integral affect*), or influenced by the background mood state or another unrelated source (*incidental affect*, Pham et al., 2001; Mussweiler and Bodenhausen, 2002; Finucane et al., 2003). Thus, integral affect may result from actual, perceived, or even imaginary characteristics of the decision targets – i.e., with a focus on the object itself. In contrast, incidental affect is sourced from temporary mood states, trait affective states (e.g., anxiety), or transferred from other diffuse sources distinct from the target object (e.g., Cohen et al., 2008).

How might these sources of affect influence complex decision-making? It is likely they are incorporated into an online affective state, which is readily placed to infuse and bias choices (Damasio, 1994; Finucane et al., 2003; Cohen et al., 2008). Here, the literature is patchy in its coverage. The influence of "incidental" affect on judgment and decision-making has been well-studied, suggesting that there are gains in the flexibility and openness of problem-solving in positive mood states (e.g., Isen, 2001), and risk-aversion in states of anxiety (e.g., Raghunathan and Pham, 1999). Indeed, *incidental* affect appears to have important impacts on IGT performance (Schmitt et al., 1999; Carter and Smith-Pasqualini, 2004; Suhr and Tsandis, 2007). However, the primacy (e.g., Zajonc, 1980; LeDoux, 1996) and importance of *integral* (object biased) affect for judgement and decision-making has been less well-investigated (Pham et al., 2001; Finucane et al., 2003). Surprisingly, only a few studies (Hinson et al., 2006; Davies and Turnbull, 2011; Aïte et al., 2013) have assessed integral affective bias in decision-making paradigms like the IGT – although questions of this sort are highly relevant for human social decision-making (e.g., Bechara et al., 1994).

Notably, real-world social behavior involves encountering agents and objects that develop, and ultimately come to *possess*, ambiguous and ambivalent characteristics (e.g., Cacioppo and Berntson, 1994; Cunningham et al., 2003). Thus, an appraisal of a well-known individual (e.g., Tony Blair, Barack Obama, Lance Armstrong, Edward Snowden) may well evoke both negative and positive evaluations, potentially resulting in a net-weighted (heuristic-based) attitude (e.g., Van Harreveld et al., 2004). Ecologically rich paradigms such as the IGT have only recently been employed to examine the impact of affective biases in complex and dynamic decision-making (Hinson et al., 2006; Davies and Turnbull, 2011; Aïte et al., 2013). The following section presents an overview of this research.

## **INSIGHTS FROM TASKS INVOLVING AFFECTIVE BIAS**

Given the proposed primacy of emotion-based processes (Bechara et al., 1994, 1997), it is perhaps surprising that only three studies have examined affective bias within IGT-style decision-making. While each study uses different variants of the IGT, and a range of affective biases, the data are broadly consistent – demonstrating that pre-existing bias readily impacts complex decision-making (Hinson et al., 2006; Davies and Turnbull, 2011; Aïte et al., 2013).

Using a three-deck variant of the IGT, Hinson et al. (2006) invoked affective bias by associating task decks with emotional words, which varied according to deck incentives. In the incongruent condition, the "good" deck was labeled with negative words, and the "bad" deck was labeled with positive words (with the associations reversed for the congruent condition). Additionally, a third "neutral" deck was labeled with emotionally neutral words. As one might predict, incongruent affective bias impaired performance, while congruent bias enhanced decision-making. Thus, the use of stable emotional landmarks from the outset of the task readily biased IGT-style decision-making.

The SCR data collected during the experiments (Hinson et al., 2006) were used to examine the development of discriminating anticipatory SCRs. Incongruent affective bias was found to hinder the development of these physiological markers – with little discrimination in differential SCRs across the three decks. However, in the congruent condition, these anticipatory markers appeared to selectively distinguish bad deck choices from both good and neutral options. These responses are often viewed as an index of decision biasing "somatic markers" (Bechara et al., 1997). However, in this study the somatic signals produced no causal influence on decision behavior, merely acting as one index of adaptive decision-making (Hinson et al., 2006).

Building on these findings, Davies and Turnbull (2011) investigated features of the classic Gambling Task potentially influenced by affective bias – expanding the topic to include features such as sensitivity to punishment cues (Dalgleish et al., 2004), and the dynamic tracking and weighting of overall deck attitudes (e.g., Van Harreveld et al., 2004; Bowman et al., 2005) that were not explored in the Hinson et al. (2006) experiments. The Davies and Turnbull (2011) tasks introduced affective bias using visual stimuli that were either non-social (International Affective Picture System; Lang et al., 2001) or more socially salient, in the form of racially diverse faces (Tottenham et al., 2009). To control for individual variation, the stimuli were also customized for each participant, by pre-evaluation. As in the Hinson et al. (2006) studies, there was a growing preference for selections from the advantageous decks. Importantly, affective bias altered selection in both congruent and incongruent conditions; in particular, both experiments demonstrated that affective labels impaired selection behavior specifically under *incongruent* conditions. Additionally, the study (experiment 2) showed a clear influence of affective bias on *subjective* ratings of task objects over the task.

This sparks the question of *how* such decision-making is changed. Congruency did not influence shifts from the frequently punishing decks, nor did it alter preferences for decks with lower loss-frequency. Also, decoupling subjective evaluation data into absolute deck ratings showed that the weighting of deck attitudes was unaltered by the congruency manipulation. However, incongruent association *selectively* modulated evaluation of the disadvantageous decks. Indeed, consolidating the importance of awareness of the affective nature of the punishing bad decks, subjective awareness of their incentive nature was strongly associated with adaptive task performance (cf. Maia and McClelland, 2004; Bowman et al., 2005). Such dissociation between deck ratings suggests that deck attitudes in general were not influenced by affective bias. Instead it appears that sensitivity to *accumulating* losses is a major driving force in IGT decision-making (Christakou et al., 2009; Weller et al., 2007, 2010; cf. Dunn et al., 2010).

## **PRE VERSUS POST-DECISIONS**

Both of the above studies (i.e., Hinson et al., 2006; Davies and Turnbull, 2011) introduced affective bias at the *decision* level. In contrast, a recent study (Aïte et al., 2013) suggests that placing affective stimuli during the *post-decision* (feedback stage) phase of decision-making also affects performance on the IGT. Here, the ability to make an advantageous choice increases when the emotional context is congruent with the feedback, while this is impaired in an incongruent condition. Indeed, facial emotion appears to carry intrinsic incentive value (Shore and Heerey, 2011); therefore presenting bias during the feedback phase should modulate the net decision feedback. For example, providing a reward of \$10 with a smile would provide more positive reinforcement than the same reward with a fearful face.

The findings of Aïte et al. (2013) are thus consistent with affective bias influencing the decision process via a range of plausible pathways and mechanisms – both affective and cognitive (e.g., Hinson et al., 2002, 2006; Dunn et al., 2006, 2010). Incongruent affective bias again leads to a robust impact on IGT decision-making (cf. Hinson et al., 2006; Davies and Turnbull, 2011). This would be consistent with the observations made by Davies and Turnbull (2011), and further implies that affective bias within IGT variants disrupts adaptive shifting of decision behavior in the face of changing contingencies (i.e., reversal learning).

A notable inference derived from this study concerns the use of additional supporting feedback: the IGT (and other decision-making paradigms) often present additional feedback with affective value (e.g., smiley faces). Such feedback probably *consolidates* reinforcement of primary incentive feedback, potentially complicating task interpretation (Shore and Heerey, 2011). Accordingly, as highlighted by Aïte et al. (2013), the use of such feedback may be unhelpful methodologically, and should therefore be discouraged in IGT experiments.

In sum, a modest number of studies have manipulated the affective loading of IGT tasks – with positive and negative biases, and pre or post-decision influences. All make it clear that affective labeling has a substantial effect on performance, biasing outcome in the direction of the emotion-based influence. Psychophysiological data showed that anticipatory SCRs did not appear to be an important (or necessary) indicator of good decisions. Finally, the awareness of accumulating loss was found to be critical for adaptive task performance (cf. loss aversion; Weller et al., 2007, 2010). In demonstrating these effects, the studies show a useful analogy for the biases of prejudice in everyday decision-making, while demonstrating the flexibility of the IGT as a research tool.

## **DISSOCIATING EPISODIC MEMORY AND EMOTION-BASED LEARNING**

The remarkably rich literature on the IGT has been a central source of evidence for the role of the frontal lobes in EBL (e.g., Bechara et al., 1994, 1997; Rogers et al., 1999; Bowman and Turnbull, 2004). Indeed, Bechara et al.'s (1997) original paper especially emphasized the role of ventromedial pre-frontal cortex (VMPFC). A later set of studies narrowed the focus, to investigate which specific frontal regions (right or left, dorsal or lateral, or ventral or medial) played the most significant role in EBL (e.g., Rogers et al., 1999; Duncan and Owen, 2000; see Manes et al., 2002; for a detailed discussion).

However, this focus on the frontal lobes, and thus on executive functions, has potentially ignored the role of other brain areas, and indeed other classes of psychological ability. An especially interesting question is the relationship between EBL and episodic memory. In this section of the paper, we will present evidence from lesion (e.g., Damasio et al., 1996; Bechara et al., 1998; Turnbull and Evans, 2006) and neuroimaging studies (e.g., Patterson et al., 2002; Fukui et al., 2005), to understand the relationship between these key psychological systems.

## **EMOTION-BASED LEARNING AND EPISODIC MEMORY**

The neurobiology of EBL is far less well understood than that mediating episodic memory. However, an introductory survey of likely brain regions might include a full range of subcortical emotion systems (e.g., Panksepp, 1986, 1998; Davidson and Irwin, 1999; LeDoux, 2000; Rolls, 2000; Calder et al., 2001; Phan et al., 2002; Bechara et al., 2003; Berridge, 2003; Patterson and Schmidt, 2003; Adolphs et al., 2005), as well as the connection between these systems and pre-frontal cortex, in many cases through the VM frontal lobes (Davidson and Irwin, 1999; Bechara et al., 2000a; Bechara, 2004; Anderson et al., 2006).

Consistent with this, studies also suggest that certain emotional-learning processes clearly involve medial prefrontal cortex (e.g., Lane et al., 1997; Reiman et al., 1997). A meta-analysis of neuroimaging studies, for example, suggests that medial prefrontal cortex is involved in emotion-based tasks, while the anterior cingulate and insula are involved when tasks have both emotional and cognitive load (see Phan et al., 2002).

However, lesion-study and imaging findings have suggested that *episodic* memory systems (Tulving, 1972, 1983) particularly include the medial *temporal* lobes and associated structures (e.g., Zola-Morgan et al., 1986; McDonald and White, 1993; Schacter et al., 1995; Nyberg et al., 1996; Schacter et al., 1996; Rugg et al., 1997; Clark and Squire, 1998). In principle, if EBL and episodic memory systems are anatomically independent (Tranel and Damasio, 1993) it should be possible to disrupt one system and leave the other intact.

Evidence of such intact EBL has long been reported, notably in a classic patient with amnesia (Claparède, 1951). In this well-known report, Claparède concealed a pin in his palm, before shaking the hand of an amnesic patient. On the day following this painful episode, the patient refused to shake the physician's hand, despite having no conscious recollection of the incident (Claparède, 1951; for review see Eichenbaum and Cohen, 2001). Modern and systematic evidence for the claim comes from the work on the profoundly amnesic patient, Boswell (Tranel and Damasio, 1990, 1993; Feinstein et al., 2010). In the experiment (Tranel and Damasio, 1993), Boswell engaged in inter-personal encounters with stooges who played a "good," "neutral," or "bad" character in their interactions. After a week, Boswell was shown sets of photographs that included the face of one of the individuals, and an unfamiliar face, and was asked to "Pick the person you would like best?" (p. 83). Naturally, Boswell had no explicit memory of any of the individuals (tested with a free or cued recall). However, when asked to make a forced-choice response, Boswell chose the "good" character almost 80% of the time, and virtually never chose the bad character (Tranel and Damasio, 1993).

What of *complex* learning tasks that also have a reward-based element? Interestingly, some studies have reported relatively normal performance by amnesic patients on the Wisconsin Card Sorting Test (WCST; e.g., Leng and Parkin, 1988; Shoqeirat et al., 1990), and the probabilistic "Weather Prediction" Task (WPT; Knowlton et al., 1994, 1996). A plausible hypothesis is that these tasks also have an emotion-based preference – given that the experimenter provides "correct" or "incorrect" feedback after each trial.

## **COMPLEX EMOTION-BASED LEARNING**

Empirical evidence from such studies (see also Johnson et al., 1985; Tranel and Damasio, 1989, 1990, 1993) thus suggests that the capacity to learn *complex* emotional valence may be retained in profoundly amnesic patients. However, many of the reports of the sort described above relate to relatively *simple* patterns of emotional valence learning (uniformly good versus uniformly bad, e.g., Tranel and Damasio, 1993; Feinstein et al., 2010), rather than the more sophisticated patterns of valence which characterize everyday life (e.g., Barraclough et al., 2004). As noted earlier, it is precisely this complicated pattern of reward and punishment that the IGT was designed to assess (Bechara et al., 2000a).

In this context, Turnbull and Evans (2006) measured the IGT performance of a patient (SL) who had suffered a posterior cerebral artery stroke, producing profound amnesia. On the IGT, SL performed at a comparable level to controls across a 3-week period: each week his performance was no different from (or, in one case, much better than) that of controls. This learning was also seen despite the fact that the reward-contingency pattern was shifted between sessions (cf. Fellows and Farah, 2003), and that SL was unable to explicitly recall any aspect of the previous sessions, or recognize the examiner – evidence suggesting that EBL was preserved.

Thus, complex EBL can remain intact despite profound amnesia – though this effect is not universal. Turnbull and Evans' (2006) patient may have been a relatively rare example of such a powerful dissociation. Gutbrod et al. (2006) report patients with lesions to the basal forebrain (*N* = 5) or medial temporal lobe (*N* = 6) who performed the IGT. Here two patients *did* develop a behavioral preference, though the other nine patients' performance remained at chance. In a further study, Gupta et al. (2009) investigated five patients who had bilateral hippocampal damage, and reported that no patients developed a preference for advantageous over disadvantageous choice.

Further evidence of preserved implicit EBL has been reported in patients with Alzheimer's disease (AD), another pathology targeting the medial temporal lobe (e.g., Winograd et al., 1999; Blessing et al., 2006). For example, Evans-Roberts and Turnbull (2011) investigated EBL using the IGT in a patient with dementia of the Alzheimer's type (Mr. A) – who had profound impairment of both verbal and visual recent episodic memory, and completed the Gambling task over three weeks (as in Turnbull and Evans, 2006). Mr. A again performed consistently above chance, an effect which seems unlikely to be a result of the more "liberal" response bias of Alzheimer's patients (Budson et al., 2006).

An interesting related finding was the remarkably good performance in SL's recognition of paired-associate items (Turnbull and Evans, 2006). He had comprehensively failed to bring even a single one of these pairs to conscious recall on any of his 40 previous exposures to the pairs, but nevertheless appeared to have encoded at least some aspect of a memorial linkage between them. One explanation might be that he had stored some emotional marker associated with each pair ("rose–bag," good; "elephant–glass," bad). Another possibility might be that the previously tested items had acquired some positive emotional valence through the "mere-exposure" effect (Zajonc, 1980; see Turnbull and Evans, 2006 for detailed report of SL).

These data support the growing evidence that there are multiple memory systems in the brain, especially supporting an anatomical and functional dissociation between episodic (e.g., Schacter and Tulving, 1994; Schacter et al., 2000) and emotion-based memory (Tranel and Damasio, 1993; Damasio, 1994; Panksepp, 1998; see Eichenbaum and Cohen, 2001; Phan et al., 2002 for reviews). These findings are consistent with the performance of amnesic patients in other non-declarative memory and learning systems (e.g., motor learning). The evidence suggests that EBL systems encode more sophisticated patterns of valence learning than have previously been reported, and sustain these over substantial periods of time, especially in patients with "hippocampal" amnesia (Turnbull and Evans, 2006).

## **CLINICAL IMPLICATIONS**

These findings have a range of potentially important clinical implications. For example, the Evans-Roberts and Turnbull (2011) study on preserved EBL in dementia clearly supports claims from the "person-centered" literature (e.g., Kitwood, 1997; Sabat and Collins, 1999) – that in spite of progressive memory loss (Blessing et al., 2006) patients with AD are able to learn and retain emotion-based knowledge. Unfortunately, the behavior of many of those who care for patients with AD is less than optimal (Sabat and Collins, 1999). Such carers may hold the opinion that they can perhaps speak critically of such patients, because they will inevitably forget the experience. The systematic findings reported above suggest that patients with AD may retain emotion-based memories, which may have direct impact on interpersonal relationships with patients with memory loss, both in a personal and therapeutic context.

In addition, the finding of preserved emotional learning in the face of profound amnesia is of some interest in the context of infantile amnesia. It is well established that humans have poor, or non-existent, episodic memory for the first few years of life (Freud, 1905; Dudycha and Dudycha, 1941; Sheingold and Tenney, 1982; for review see Pillemer and White, 1989). Indeed, there is some consensus that the earliest adult autobiographical memories are for events that occurred between 2.5 and 4 years of age (Waldfogel, 1948; Wetzler and Sweeney, 1986; Bruce et al., 2000; MacDonald et al., 2000).

Some researchers posit that language development plays a crucial causal role in such childhood amnesia (Allport, 1937; Schachtel, 1947; Simcock and Hayne, 2002; see also Hayne, 2004 for a review). Modern neuroscientific accounts of the phenomenon, by contrast, especially stress the late development of hippocampal (conscious) memory systems (for further discussion, see Yovell, 2000; see also Jacobs and Nadel, 1985; Turnbull and Evans, 2006). However, surely these children are learning from this period of early childhood? It is now clear that infants *do* possess a well-developed capacity for learning of *emotional* valence in relation to objects, for example, the quality of attachment relationships with specific adults (Winnicott, 1960; Bowlby, 1969; Ainsworth et al., 1978; Ainsworth, 1985; Fonagy et al., 1991a,b). The empirical evidence from childhood amnesia studies suggests EBL systems might be available to infants possibly well before the hippocampus-based systems develop.

Interestingly, this issue may also be important for our understanding of the mechanisms of psychotherapy. It has been suggested that aspects of the therapeutic alliance might (for example) be mediated by emotion-based non-episodic memory systems (Turnbull et al., 2006). In principle, this topic could be investigated through the study of neurological patients with amnesia in a psychotherapeutic setting (Turnbull et al., 2006). In a report of a patient with severe and stable amnesia, Mr. N (see Kaplan, 1994, pp. 590–624 for details), there is at least some evidence that the patient shows therapeutic gains from the interaction with the therapist (Turnbull et al., 2006). Moore et al. (2012) report a similar finding. These preliminary data suggest that during psychotherapy the interpersonal properties of the therapeutic relationship may still exist in patients with profound amnesia, suggesting that the therapeutic alliance may be mediated by a class of memory system that is separate to that of episodic recall.

#### **CONCLUSION**

Summarizing the literature over the last two decades, it is evident that EBL, in the face of a complex ambiguous decision-making landscape, is an important psychological process that occurs rapidly, and is remarkably flexible. This specific form of learning contributes to the scientific understanding of psychological phenomena such as intuition and prejudice, which were long ignored and often difficult to define functionally.

For much of its history, psychological science focused on rational choice, rather than the less well-specified and emotion-based intuitive aspects of human choice (Gilhooly and Murphy, 2005). These latter systems are clearly enormously important for human beings, and this paper has reviewed our growing understanding of a range of important issues: the flexibility of these systems, their access to conscious awareness, their relationship to episodic memory, their role in prejudice, and a number of potentially important implications for psychotherapy and care of the elderly. However, this strand of research is clearly still only in the early stages of development, and we anticipate a range of future discoveries on this scientifically important topic.

#### **REFERENCES**


Bowlby, J. (1969). *Attachment and Loss*, Vol. 1, Attachment. New York: Basic Books.


after stereotactic subcaudate tractotomy. *Am. J. Psychiatry* 161, 1913–1916. doi: 10.1176/appi.ajp.161.10.1913


A-5. Gainesville, FL: The Center for Research in Psychophysiology, University of Florida.


Zola-Morgan, S., Squire, L. R., and Amaral, D. G. (1986). Human amnesia and the medial temporal region: enduring memory impairment following a bilateral lesion limited to field CA1 of the hippocampus. *J. Neurosci.* 6, 2950–2967. doi: 10.1093/neucas/2.4.259-aw

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 December 2013; accepted: 10 February 2014; published online: 21 March 2014.*

*Citation: Turnbull OH, Bowman CH, Shanker S and Davies JL (2014) Emotionbased learning: insights from the Iowa Gambling Task. Front. Psychol. 5:162. doi: 10.3389/fpsyg.2014.00162*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Turnbull, Bowman, Shanker and Davies. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**REVIEW ARTICLE** published: 12 December 2013 doi: 10.3389/fpsyg.2013.00935

## Iowa Gambling Task with non-clinical participants: effects of using real + virtual cards and additional trials

## *William H. Overman\* and Allison Pierce*

Department of Psychology, University of North Carolina Wilmington, Wilmington, NC, USA

#### *Edited by:*

Ching-Hung Lin, Kaohsiung Medical University, Taiwan

#### *Reviewed by:*

Gordon Fernie, University of Aberdeen, UK Sarah E. MacPherson, University of Edinburgh, UK

#### *\*Correspondence:*

William H. Overman, Department of Psychology, University of North Carolina Wilmington, 601 South College Road, Wilmington, NC 28403, USA e-mail: overmanw@uncw.edu

Performance on the Iowa Gambling Task (IGT) in clinical populations can be interpreted only in relation to established baseline performance in normal populations. As in all comparisons of assessment tools, the normal baseline must reflect performance under conditions in which subjects can function at their best levels. In this review, we show that a number of variables enhance IGT performance in non-clinical participants. First, optimal performance is produced by having participants turn over real cards while viewing virtual cards on a computer screen. The use of only virtual cards results in significantly lower performance than the combination of real + virtual cards. Secondly, administration of more than 100 trials also enhances performance. When using the real/virtual card procedure, performance is shown to significantly increase from early adolescence through young adulthood. Under these conditions young (mean age 19 years) and older (mean age 59 years) adults perform equally. Females, as a group, score lower than males because females tend to choose cards from high-frequency-of-gain Deck B. Groups of females with high or low gonadal hormones perform equally. Concurrent tasks, e.g., presentation of aromas, decrease performance in males. Age and gender effects are discussed in terms of a dynamic between testosterone and orbital prefrontal cortex.

**Keywords: Iowa Gambling Task, optimal performance, real cards, virtual cards, non-clinical populations, age, gender, orbital prefrontal cortex**

## **BACKGROUND**

#### **OUTLINE OF PRESENT REVIEW**

In this review, we discuss results from over 1,500 non-clinical subjects performing our real/virtual version of the Iowa Gambling Task (IGT), and we compare our results with those from previous IGT studies. First, we describe our laboratory real/virtual card IGT task. The sections of this paper include: (1) a detailed description of our real/virtual card IGT, (2) an experimental comparison of performance on the real/virtual IGT vs. four versions of a commercialized IGT from Psychological Assessment ResourcesTM (PARTM IGT), (3) results from our laboratory using the real/virtual card IGT that study the relationship of participant age and IGT performance, (4) results from our laboratory using the real/virtual IGT of gender differences in performance, (5) a general discussion of this review, and (6) a brief summary.

## **DESCRIPTION OF OUR REAL/VIRTUAL CARD IGT VERSION**

#### *Caveat*

The ventromedial prefrontal cortex (VMPFC) and orbital prefrontal cortex (ORBPFC) are not anatomically equivalent (Zald and Rauch, 2006). Nevertheless, in the IGT literature these two areas are often used interchangeably or without anatomical precision. Consequently, in this review we use the designation that is employed in the particular study to which we are referring at that point in the paper.

#### *Brief history*

Prior to 1997 our laboratory conducted numerous studies that revealed gender differences on cognitive tasks known to be dependent on the integrity of the VMPFC in both children and young monkeys (Overman et al., 1996b, 1997). In order to investigate functions across the life span, we were searching for an adult-level cognitive task that was related to the VMPFC. The IGT was relatively new and especially appealing to our goals because performance was significantly impaired by damage to the VMPFC (Bechara et al., 1994, 1997). In 1997 there was no readily available computerized version of the task so we developed a computerized IGT that followed the exact win–loss sequence used by Bechara et al. (1994, 1997). For more than a year we administered this computerized task to college-aged participants and found little or no learning, i.e., they did not learn to preferentially choose advantageous cards. Rarely did a participant choose more than 60% advantageous cards across 100 trials. When we asked participants about their experience and strategies, they frequently said that they believed the four decks of virtual cards were interactive. For example, they might say "if I choose from Deck A three times in a row, this will change the next card in Deck B and prevent a loss." The strategies of interactive decks persisted despite our telling participants to treat the virtual decks as real, physical decks of cards. Thus, it appeared to us that with the computerized IGT, subjects based decisions on two things: (1) card value and (2) erroneous strategies of how the decks interacted on the computer.

Obviously, if one were to use real decks of cards, IGT decisions must be based solely on the values of the selected cards (which cannot interact). Consequently, we developed a version of the IGT that employed the simultaneous use of real and virtual cards (real/virtual card IGT). In this task, participants chose from decks of paper cards while an experimenter mimicked their card choice on virtual decks on an adjacent computer screen. The real and virtual cards were prearranged so that each card, when turned over, exactly matched the wins and/or losses as used by Bechara et al. (1994). In addition, the computer kept score of wins and losses and displayed the ongoing total of money.

"fpsyg-04-00935" — 2013/12/10 — 20:33 — page 1 — #1

This technique dramatically improved performance, presumably because it was obvious to the participant that paper decks could not be interactive and, thus, erroneous strategies of deck interaction were eliminated. With the real/virtual card IGT, participants showed significant and progressive learning on the task, gradually increasing their choice of advantageous cards up to approximately 70–80%+ depending on how many trials were administered (Reavis and Overman, 2001; Overman et al., 2004, 2006). Furthermore, we discovered that administration of more than the traditional 100 trials revealed important data not otherwise shown. Not only did overall IGT performance continue to improve, i.e., percentage of advantageous cards continued to increase beyond 100 trials, but clear and significant gender differences in task performance emerged (e.g., Reavis and Overman, 2001). Specifically, females chose significantly fewer advantageous cards from Decks C and D than did males. This gender difference was driven by females' preference for cards from disadvantageous Deck B, which has a high win-to-loss ratio. Other researchers have confirmed similar gender differences (for review of sex differences on the IGT, see van den Bos et al., 2013).

#### *Description of the real/virtual IGT*

In our IGT, the subject sits in front of four decks of paper cards, behind which is a computer screen showing four virtual decks of cards. As the subject selects a paper card, the adjacent experimenter selects the same virtual card. The real and virtual cards have exactly the same value of wins and losses. The computer shows a running total of "money." The real/virtual card IGT version has the identical sequence of wins and losses for every card in the task as used by Bechara et al. (1994, 1997). Each deck contains 40 cards as in Bechara et al. (1997), and a deck can be reused if depleted. Participants start the task with \$2,000 in points. There are two advantageous and two disadvantageous decks. Throughout the task, the advantageous decks (Decks C and D) always reward \$50 and the disadvantageous decks (Decks A and B) always reward \$100. Ten consecutive choices from \$50 advantageous Decks C or D result in a net gain of \$250, while 10 consecutive choices from \$100 disadvantageous Decks A or B result in a net loss of \$250. Advantageous \$50 Deck D and disadvantageous \$100 Deck B contain 10 wins and one loss per 10 trials [analyzed below as "high frequency of gain (HFOG)" decks], while advantageous \$50 Deck C and disadvantageous \$100 Deck A contain 10 wins and five losses per 10 trials (analyzed below as "low frequency of gain" decks). Consistent selection from \$100 disadvantageous Decks A and B results in long-term monetary losses, whereas consistent selection from \$50 advantageous Decks C and D results in long-term monetary gains. We use color designation (blue, yellow, green, and red) for the decks (e.g., Reavis and Overman, 2001; Overman et al., 2004, 2011) because pilot studies showed that color names were easier for the experimenter to attend to when mimicking the participant's choice on the computer. The letter/color variable does not affect performance (Overman et al., 2011, 2013).
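The payoff structure described above can be checked with simple arithmetic. The following Python sketch is illustrative only: it takes the per-card reward, the number of losses per 10 cards, and the net outcome per 10 cards as stated in the text, and derives the total and mean loss these figures imply. The names `DECKS` and `implied_losses` are hypothetical, and the loss amounts on individual cards are not given here, so only totals and averages are computed.

```python
# Per-deck values as stated in the text: reward per card, number of loss
# cards per 10 cards, and net outcome per 10 consecutive cards.
DECKS = {
    "A": {"reward": 100, "losses_per_10": 5, "net_per_10": -250},
    "B": {"reward": 100, "losses_per_10": 1, "net_per_10": -250},
    "C": {"reward": 50, "losses_per_10": 5, "net_per_10": 250},
    "D": {"reward": 50, "losses_per_10": 1, "net_per_10": 250},
}


def implied_losses(deck):
    """Return (total loss, mean loss per loss card) implied for 10 cards."""
    total_gain = 10 * deck["reward"]
    # net = gain - loss, so loss = gain - net
    total_loss = total_gain - deck["net_per_10"]
    return total_loss, total_loss / deck["losses_per_10"]


for name, deck in DECKS.items():
    total_loss, mean_loss = implied_losses(deck)
    print(f"Deck {name}: total loss {total_loss}, mean loss per loss card {mean_loss:.0f}")
```

On these figures, the disadvantageous decks lose \$1,250 per 10 cards against \$1,000 in gains, and the advantageous decks lose \$250 against \$500 in gains. In Deck B the whole \$1,250 falls on a single card per 10 trials.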

## **USE OF REAL CARDS AND ADDITIONAL TRIALS INCREASE IGT PERFORMANCE: COMPARISON OF SIX IGT VERSIONS INCLUDING THE COMMERCIALLY AVAILABLE PARTM IGT**

#### **NUMEROUS VERSIONS OF IGT HAVE BEEN USED**

The IGT has been used in hundreds of scientific studies but, unfortunately, testing procedures have varied widely. The variations include, but are not limited to, specified details of test procedures, instructions to the participant, number of trials, analysis by gender, analysis of performance in terms of percent advantageous cards vs. a net score, analysis by deck type, the use of real vs. virtual cards, use of real money, and education level of subjects (for reviews, see Fernie and Tunney, 2006; Overman et al., 2013). Perhaps one reason there have been so many IGT versions is that details of the procedures, such as instructions, were not published until several years after the task's introduction (Bechara et al., 1999), and type of instruction is known to affect IGT performance (see Balodis et al., 2006; Fernie and Tunney, 2006).

## **ARE RESULTS EQUIVALENT WHEN USING VIRTUAL AND REAL CARDS?**

In the original IGT (Bechara et al., 1994), construct assessment was based only on the number of real cards chosen from each deck type. In 2000 a computerized version was utilized (Bechara et al., 2000). Recently, a similar computerized IGT has become commercially available from Psychological Assessment Resources (PARTM; 2007). This test is designed as an assessment tool for clinical populations and as a complement to other neuropsychological tests (Bechara, 2007). The PARTM IGT differs from the original IGT on two dimensions: (1) it uses virtual cards and (2) it employs an increasing progression of wins and losses every 10 trials [see Bechara et al. (2000)].

Changes in test instrumentation must proceed with caution. Sometimes such a change can introduce confounding variables (Steinmetz et al., 2010). In the case of the IGT, there may have been unintended consequences of using virtual cards rather than real cards, as has been documented for another well-known test of frontal function, the Wisconsin Card Sorting Task (Steinmetz et al., 2010).

Performance equivalence between test versions is critically important because one of the requirements of a sound assessment tool of a psychological construct is that all versions of the test should use procedures that yield optimal performance for all test takers, i.e., there should be nothing about the *test procedures*, *per se*, that restricts performance. This is a basic element of construct validity. Only by establishing the optimal baseline of decision-making in normal participants can comparisons with clinical populations be accurate. Others have questioned the construct validity of the traditional IGT (e.g., Dunn et al., 2006).

#### **EXPERIMENTAL EFFECTS OF USING REAL + VIRTUAL CARDS AND EXPLICIT INSTRUCTIONS**

There are multiple components of construct validity for the IGT. Among those constructs that have been studied are the definition of decision-making, reliability, and the impact of personality and mood (Buelow and Suhr, 2009). To expand research in this area, we addressed the *validity* component of optimal performance on the IGT, especially the PARTM IGT (Overman et al., 2013). In this study, we compared performance on five versions of the IGT including four versions of the commercially available PARTM IGT. Across the five versions, several procedural variables were systematically manipulated: (a) method of delivery (computerized versions vs. versions using real decks of cards); (b) number of trials (100, 200, and 400); (c) instructions given to the participant; and (d) incentives for the subject to perform as well as possible.

This study had two primary experiments. In Experiment 1, we compared performance on five versions of a 100-trial IGT: four versions of the PARTM IGT and one version that more closely resembled the original IGT. In Experiment 2, we compared performance on the 100-trial IGT used in our laboratory with performance on the same IGT with 200 trials and an additional 200 trials plus incentive. In the first experiment, 214 male and 364 female college students were randomly assigned to one of five versions of the IGT. Full descriptions are presented in Overman et al. (2013), but a brief description is necessary for understanding of our data:

*IGT Version 1.* Commercially available computerized PARTM version of the 100-trial IGT using the standard instructions included with the PARTM IGT: the participant was told that some decks are "worse than others" and that they were "to try to win as much money as possible and avoid losing money as much as possible."

*IGT Version 2.* PARTM IGT (100 trials) with explicit instructions: the participant was told there were two types of decks, "good" and "bad," and that if they consistently chose from the good decks, they would win more money than they would lose and that their goal was to figure out which were good and bad decks to win as much money as possible.

*IGT Version 3.* PARTM IGT (100 trials) with original PARTM instructions, but using paper cards from four tangible decks.

*IGT Version 4.* PARTM IGT (100 trials) with explicit instructions *and* with paper cards from four tangible decks.

*IGT Version 5.* IGT traditionally used in our lab using paper cards and virtual cards (real/virtual card IGT; 100 trials). Version 5 employed an identical pattern of wins and losses across all cards as in the original version of the IGT in that Decks A and B always paid \$100 and Decks C and D always paid \$50 (Bechara et al., 1994). In addition, if a deck was depleted, it could be reused, giving the participant a choice between all four decks. The instructions were identical to those used in IGT Version 2.

[Note: For all of the PARTM versions, each deck contained 60 cards and if a deck was depleted, the participant was forced to choose from the remaining three decks. In addition, the PARTM decks paid an *average* of \$100 or \$50; the net loss or gain from each deck increased across each block of 10 cards. For example, for Deck A, at the outset of the test, the total gain for Block 1 is \$1000 and the total loss for Block 1 is \$1250 for a net loss of \$250. In each subsequent block of 10 trials the average gain increases by \$10 per block, i.e., the average win in Block 2 will be \$110 and in Block 3 the average win will be \$120 and so forth. In addition beyond the first block, the number of losses increases by one card per block. Thus, there are six losses in Block 2, seven losses in Block 3 and so on. The total net loss per block increases by \$150 such that for Block 2 the net loss is \$400, rather than the net loss of \$250 in Block 1 and for Block 3 the net loss is \$550 and so on. While the number of losses increases from block to block, the amount of the loss per card remains within the range of \$150–350 for each block. The incremental changes in gains and losses continue through all six blocks so that the total net loss for 60 cards in Deck A is \$3750. The progression of wins and losses is explained for each deck in the PARTM IGT manual (Bechara, 2007).]

Note: The PARTM IGT has optional "slot machine" sounds that can accompany the visual display of wins and losses; however, these sounds were not employed in any version of the task in our study.
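The Deck A progression described in the note above can be verified numerically. The sketch below is a minimal illustration that assumes only the rules stated in the text: the average win rises by \$10, the net loss by \$150, and the number of loss cards by one in each successive block of 10 cards. The function name `deck_a_block` is hypothetical, and the loss amounts on individual cards are not modeled.

```python
def deck_a_block(block):
    """Summarize block `block` (1-6) of PAR IGT Deck A from the stated rules."""
    step = block - 1
    avg_win = 100 + 10 * step      # average win rises $10 per block
    total_gain = 10 * avg_win      # 10 cards per block
    net_loss = 250 + 150 * step    # net loss rises $150 per block
    n_losses = 5 + step            # one more loss card per block
    total_loss = total_gain + net_loss
    return {
        "total_gain": total_gain,
        "total_loss": total_loss,
        "net_loss": net_loss,
        "n_losses": n_losses,
        "mean_loss": total_loss / n_losses,
    }


blocks = [deck_a_block(b) for b in range(1, 7)]
print([b["net_loss"] for b in blocks])     # [250, 400, 550, 700, 850, 1000]
print(sum(b["net_loss"] for b in blocks))  # 3750
```

The six block-wise net losses sum to \$3750, matching the total given for the 60-card deck. The implied mean loss per loss card stays at \$250 in every block, which is inside the stated \$150–350 range.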

#### **RESULT #1: NO EFFECT OF TYPE OF INSTRUCTION**

A 2 (gender) × 5 (IGT Version) × 4 (blocks of trials) analysis was conducted. As discussed below, there were significant effects of (a) IGT Version and (b) blocks of trials, and significant interactions between (c) gender and block and (d) version and block.

There was no significant effect of the nature of instructions, explicit or not. This was shown by the dual facts that (1) performance was not statistically different on Version 1 (PARTM IGT) and Version 2 (PARTM IGT + explicit instructions), and (2) performance was not statistically different on Version 3 (PARTM IGT with real cards and regular instructions) and Version 4 (PARTM IGT with real cards and explicit instructions).

It is important to note that all instruction types employed "hints" about what the subject was expected to do. Fernie and Tunney (2006) reported that IGT instructions including a hint about the nature of the task significantly improved performance relative to instructions with no hint. The hint referred to by Fernie and Tunney (2006) concerned instructing the subject that "some decks are worse than others and you can win if you stay away from the worst decks." In the present study both types of instructions contained a similar hint. The PARTM IGT instructions were essentially the same as those used by Bechara et al. (1999) and said "the goal of the task is to win as much as possible and lose as little as possible; some decks are worse than others; you will win if you stay away from the worse decks." The "explicit" instructions we used in IGT Versions 2, 4, and 5 said "there are good decks and bad decks; if you pick from the good decks you will win more money than you lose, but if you pick from the bad decks you will lose more money than you win; your job is to figure out which are the good decks and which are the bad decks."

It is possible that hints may affect the degree of awareness (Dunn et al., 2006; Persaud et al., 2007), which in turn, could affect performance. Perhaps the inclusion of similar "hints" in the two instructional sets in this study eliminated any performance effect of this variable. Nevertheless, in our comparison of IGT versions, there was no systematic difference in IGT performance between versions that used originally published instructions (Bechara et al., 1999, and PARTM IGT) or "explicit" instructions.

## **RESULT #2: USE OF REAL CARDS + VIRTUAL CARDS ENHANCES PERFORMANCE**

IGT performance (percent of advantageous cards selected: Decks C + D) was significantly higher when real/virtual cards were used vs. virtual cards alone. As shown in **Figure 1**, performance was higher in IGT Versions 3, 4, and 5 than in Versions 1 and 2 (when only virtual cards were used). There were no significant performance differences between Versions 1 and 2.

#### **RESULT #3: USE OF REAL/VIRTUAL CARDS PROMOTES LEARNING THROUGHOUT THE 100 TRIAL TASK**

Another finding emerged from the analysis of performance across four blocks of 25 trials each. As shown in **Figure 2**, performance was equal among all IGT versions during the first block of 25 trials (the exploration period), but in all versions, performance in Block 2 was significantly higher than performance in Block 1. In other words, in each IGT version learning occurred within the first 50 trials. However, as shown in **Figure 2**, when real cards were used (Versions 3–5), learning continued to improve in Blocks 3 and 4. In contrast, when only virtual cards were used (Versions 1 and 2), performance leveled off for the remainder of the trials after the second block. This result is more dramatic when presented as a comparison between combined versions using real/virtual cards (Versions 3–5) and combined versions using only virtual cards (Versions 1 + 2; **Figure 3**).
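The block-wise measure used in these analyses, the percentage of cards chosen from Decks C and D within each block of 25 trials, is straightforward to compute from a record of deck choices. The Python sketch below is a hypothetical illustration, not the scoring code used in the study; the function name `percent_advantageous_by_block` and the example choice sequence are invented purely to show the calculation.

```python
def percent_advantageous_by_block(choices, block_size=25):
    """Percent of choices from Decks C or D within each consecutive block."""
    if len(choices) % block_size != 0:
        raise ValueError("number of trials must be a multiple of block_size")
    percents = []
    for start in range(0, len(choices), block_size):
        block = choices[start:start + block_size]
        n_good = sum(1 for deck in block if deck in ("C", "D"))
        percents.append(100.0 * n_good / block_size)
    return percents


# Hypothetical 100-trial record, invented only to exercise the function.
example = (
    list("ABCD" * 6 + "A")   # Block 1: sampling all four decks
    + list("CDCDB" * 5)      # Block 2
    + list("CCDDB" * 5)      # Block 3
    + list("CDCDC" * 5)      # Block 4
)
print(percent_advantageous_by_block(example))  # [48.0, 80.0, 80.0, 100.0]
```

The same function applies to longer administrations by changing `block_size`, for example 50 for a 200-trial task scored in four blocks.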

One important feature of our real/virtual card IGT should be emphasized at this point. In our task, if a deck was depleted, it was turned over and could be reused. This is not the case for the PARTM IGT, in which a depleted deck cannot be reused. This means that the participant would then be forced to choose between three decks, some of which might not be his/her preferred advantageous deck type. This situation would penalize subjects who learn early in the game, as noted by Dunn et al. (2006).

#### **WIDESPREAD ASSUMPTION OF EQUIVALENT RESULTS WITH REAL AND VIRTUAL CARDS**

The data presented above raise an important point that is relevant for all research using the IGT. Until now, there has been a widespread assumption in the IGT literature that performance is equal when using real or virtual cards. Two specific IGT papers have been frequently cited as the basis for this assumption: Bowman et al. (2005) and Bechara et al. (2000).

#### *Bowman et al. (2005)*

Bowman et al. (2005) compared IGT performance with real cards and with a computerized format. They reported no significant difference in performance between the two formats. However, these results are difficult to interpret because of the low number of subjects and the almost exclusive use of females. There were only 22 subjects in each of three experiments. Across all experiments there were 56 females and 10 males, i.e., 85% female. Given the consistent finding that females do not perform as well as males on the IGT (Reavis and Overman, 2001; Bolla et al., 2004; Overman, 2004; Overman et al., 2004, 2006), the data from Bowman et al. (2005) may represent something of a floor effect among groups. This lower compression of scores may have obscured significant differences between groups that might have been apparent if there had been more subjects and a balanced number of males and females.

#### *Bechara et al. (2000)*

Published papers frequently cite Bechara et al. (2000) when stating that IGT performance is equal when real or virtual cards are used (Bechara, 2007; Schneider et al., 2007; Buelow and Suhr, 2009). Close analysis of the paper by Bechara et al. (2000) shows a different picture. The authors tested normal participants and patients with damage to the VMPFC on two IGT versions using real cards [A, B, C, D and E, F, G, H (card task with different order of cards and payments from original A, B, C, D)] and two computerized IGT versions using virtual cards [A'B'C'D' and E'F'G'H']. Not only were the latter two versions computerized, they also introduced progressive wins and losses for every 10 trials. In other words, two factors were changed during the switch to a computerized task: (a) real vs. virtual cards and (b) stable vs. increasing wins and losses per block. As expected, in all four versions of the IGT, VMPFC patients were impaired relative to normal subjects. In this regard, the authors write: "the results from the computer tasks mirrored those from the original task (ABCD) and variant (EFGH) task" (Bechara et al., 2000, p. 2197). Given the context of the paragraph, the authors appear to be documenting the fact that VMPFC patients were impaired relative to normal participants regardless of whether real vs. virtual cards were used or whether wins and losses were stable or progressive. However, this does not mean that normal participants performed equally well with real vs. virtual cards. In fact, their data indicate that normal participants performed better when real cards were used. As shown in Figure 4A (Bechara et al., 2000, p. 2197), normal participants' learning leveled off after the first two blocks of trials when virtual cards were used; however, with real cards, normal participants' learning increased throughout the task from Block 1 to 5 (Figure 2A, p. 2195). The paper contained no statistical analyses of the data. However, inspection of the SEM bars indicates two important things: first, when real cards were used, there was little or no overlap, block to block, from Block 1 through 5 (Figure 2A), i.e., learning continued after the second block and throughout the task. Secondly, when virtual cards were used, there was considerable overlap in Blocks 2–5 (Figure 4A), i.e., learning plateaued after the second block. So, it appears that normal participants performed better when real cards were used than when virtual cards were used.

*Caveat.* The virtual card version employed progressive wins and losses which the real card versions did not. Although the progressive win/loss schedule in Bechara et al. (2000) was not described, it may have been similar to the progressive version of the PARTM. So the differences in performance in normal subjects may have been due to either the real/virtual variable or the progressive consequences variable. Our study only compares the variable of real vs. virtual cards and we show that the card variable is critically important for IGT performance.

## **EFFECTS OF ADMINISTERING MORE THAN 100 TRIALS ENHANCES IGT PERFORMANCE**

In part A of the second experiment by Overman et al. (2013), the same subjects who participated in 100 trials of the Version 5 IGT were given an additional 100 trials. There was a significant effect for both number of trials and gender (discussed in Section "Gender Differences on IGT Performance: Deck-by-Deck Analysis"). In the first set of 100 trials, participants chose an average of 62% advantageous cards. This significantly increased to 72%. Furthermore, in the last (eighth) block of trials, males and females chose 85 and 67% advantageous cards, respectively. These results clearly show that IGT performance is significantly enhanced with the addition of extra trials. In addition, males outscored females in the last block of the first 100 trials, and they continued to do so throughout the second set of 100 trials. This indicates that females did not "catch up" with the males even given additional trials, i.e., the gender difference was not simply due to a slow start in performance. Most importantly, the continued increase in selection of advantageous cards during the second 100 trials by all subjects means that the decision-making process was not complete after only 100 trials. If the purpose of the IGT is to "measure decision-making," one presumes it is meant to assess *complete* or *finished* decision processes. Our results indicate that for the IGT, decision-making processes are not complete until well after 100 trials. Others have noted that the administration of more than 100 trials might reveal important insights about different populations, e.g., that patient groups may be slow to learn and show increased performance beyond 100 trials (Dunn et al., 2006).

## **SUMMARY OF SECTION "USE OF REAL CARDS AND ADDITIONAL TRIALS INCREASE IGT PERFORMANCE: COMPARISON OF SIX IGT VERSIONS INCLUDING THE COMMERCIALLY AVAILABLE PARTM IGT"**

Iowa Gambling Task performance is maximized when real/virtual cards are used and there are more than 100 trials. This real/virtual card procedure is inconvenient as compared to a simple computerized IGT, in part because the task requires an experimenter to mimic responses on the computer. However, convenience is not a substitute for complete and accurate assessment of performance.

## **EFFECTS OF AGE ON THE IGT**

## **EXPERIMENTAL EXAMINATION OF PERFORMANCE ON REAL/VIRTUAL CARD IGT FROM EARLY ADOLESCENCE THROUGH OLD AGE**

Because of its sensitivity to decision-making impairments among patients with circumscribed brain damage, the IGT has been used as a behavioral proxy for brain development across the life span (Wahlstrom et al., 2010). The traditional IGT is too complex for young children, so simplified versions have been developed for this population. Traditional IGTs, including our real/virtual card version, have been administered to participants from early adolescence to old age.

## **PERFORMANCE OF CHILDREN ON VARIATIONS OF THE IGT**

To our knowledge, the real/virtual card IGT has not been administered to children. But it is important to review, even if briefly, findings from children who perform age-appropriate, "child-friendly" versions of the IGT. These studies consistently show increases in performance within several age ranges: from ages 3 to 4 years (Kerr and Zelazo, 2004), from ages 3 to 5 years (Hongwanishkul et al., 2005), and from ages 6 years to adulthood (Crone and van der Molen, 2004). The latter study employed a widely used child IGT version known as the "hungry donkey task." In this task, subjects choose between four virtual doors that reveal wins and losses of apples to feed a hungry donkey. The win/loss schedule is essentially the same as in the traditional IGT. On this task, adults (ages 18–25) performed significantly better than adolescents (ages 13–15), who performed significantly better than both an older group of children (ages 10–12) and a younger group of children (ages 6–9). Younger and older groups of children performed equivalently (Crone and van der Molen, 2004). In a study by Garon and Moore (2004), 3-, 4-, and 6-year-old children were given a 40-trial variant IGT that involved "bears" and "tigers" and candy rewards. There were no significant age or gender differences in performance. There was a block × gender effect in which females made more advantageous choices than males in the second block of 20 trials. The meaning of this finding is unclear. The authors acknowledge that their task was quite different from the IGT in terms of the nature of instructions and performance feedback and those differences may have contributed to the unexpected gender effect.

## **PERFORMANCE ON REAL/VIRTUAL CARD IGT FROM ADOLESCENCE TO OLD AGE**

Since there are substantial brain changes, especially in the prefrontal cortex (PFC), during adolescence, the IGT is an ideal task to use with this population. Indeed, several hypotheses attribute poor decision-making among adolescents to neuroanatomical changes in areas within the PFC. Optimal performance on the IGT is dependent on the integrity of several regions in the PFC including the ORBPFC (Bechara et al., 1994), the dorsolateral (DL) PFC (Fellows and Farah, 2005), or dorsomedial (DM) PFC (Manes et al., 2002). Damage to any of these areas impairs IGT performance as defined by selection of more cards from disadvantageous decks than from the advantageous decks. These brain areas and others change during adolescence. In general, during this period, cortical gray matter increases and decreases at somewhat different schedules in different brain regions (Giedd et al., 1999). In the frontal cortex, overall gray matter increases during early adolescence and peaks at age 12 for males and age 11 for females (Giedd et al., 1999). This peak is followed by a decrease in cortical gray matter volume until late adolescence.

However, the specifics of prefrontal development are exceedingly complex. The frontal cortex is heterogeneous and not all sub-areas develop simultaneously during adolescence. There is regionally specific development with some areas being pruned, while other areas are showing increases in synapses (Giedd et al., 1999). Some researchers have suggested that changes in DLPFC are most highly correlated with adolescent behavior patterns (e.g., Lewis, 1997; Sowell et al., 2001; Paus, 2005) while others have suggested that changes in VMPFC are most highly correlated to such patterns (Hooper et al., 2004; Schwartz et al., 2010). Our research with IGT performance during adolescence was not designed to determine the underlying neural bases for behavior changes. Rather we have studied behavioral changes in IGT performance throughout adolescence.

In two studies, detailed below, we administered the real/virtual card IGT (with more than 100 trials) to non-clinical participants ranging in age from 11 to 62 years. In the first study, we administered 200 trials of the real/virtual card IGT to children in the sixth through the 12th grade, as well as to college students (Overman et al., 2004). In the second study, we administered 150 trials of the real/virtual card IGT to adults ranging from college-age to 60+ years (Reavis and Overman, 2001). In that study, hormone levels were determined from blood samples for young women (low or high hormone) and older women (with or without estrogen replacement therapy, ERT).

### **AGES 11–23 YEARS: PERFORMANCE OF ADOLESCENTS ON REAL/VIRTUAL CARD IGT**

We measured the performance of adolescents in sixth through 12th grade (11–18 years) and college students (17–23 years) on 200 trials of our real/virtual IGT and a control task, the Wisconsin Card Sorting Task (WCST; Overman et al., 2004). In addition, we administered surveys of impulsivity and excitement-seeking (impulsivity and excitement-seeking subscales of the NEO Personality Inventory; Costa and McCrae, 1992). The WCST was used as a control task for generalized executive dysfunction because it is more dependent upon DLPFC systems than on VMPFC systems (Cabeza and Nyberg, 2000, but see Manes et al., 2002; Fellows and Farah, 2005 for evidence of IGT impairment following damage to the DLPFC). Thus, normal performance on the WCST plus impaired performance on the IGT would suggest localized dysfunction (VMPFC) rather than a generalized prefrontal dysfunction. A number of studies have reported little relationship between IGT and WCST performance measures (see Bechara, 2007), although there may be some correlation between WCST and performance in the last blocks of the IGT (Brand et al., 2007).

Performance was analyzed using the percentage of advantageous cards (C + D) selected across 200 trials. As shown in **Figure 4**, there was a steady and statistically significant increase in performance across age. Performance of sixth and seventh grade participants was significantly lower than performance of participants in the ninth grade and above. In addition, performance of participants in the eighth grade was significantly lower than that of participants in the 11th grade and higher. Performance of participants in the ninth grade and higher was equivalent. There were 30 males and 30 females in each age group and we speculate that the use of more subjects would have revealed additional statistical differences between groups.

Furthermore, an analysis of performance in each block of 50 trials showed, across subjects, significantly lower performance in Block 1 as compared to Blocks 2–4; performance in Blocks 2 and 3 was statistically equivalent, but significantly lower than that in the last block of trials. In other words, performance improved throughout the 200 trials on the real/virtual card IGT.
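The block-wise measure described above (percentage of cards chosen from Decks C + D within each block of 50 trials) can be sketched as follows. This is a minimal illustration, not the analysis code used in the studies; the function name, the deck labels as single letters, and the example choice sequence are all hypothetical.

```python
from typing import List

# Decks C and D are the advantageous decks in the IGT.
ADVANTAGEOUS = {"C", "D"}


def percent_advantageous_by_block(choices: List[str], block_size: int = 50) -> List[float]:
    """Return the percentage of C/D picks in each consecutive block.

    If the number of trials is not a multiple of block_size, the final
    block is shorter and is scored over the trials it actually contains.
    """
    blocks = [choices[i:i + block_size] for i in range(0, len(choices), block_size)]
    return [100.0 * sum(c in ADVANTAGEOUS for c in block) / len(block) for block in blocks]


# Hypothetical participant: 100 trials, i.e., two blocks of 50.
example_choices = ["A", "B", "C", "D", "C"] * 10 + ["C", "D", "C", "D", "B"] * 10
print(percent_advantageous_by_block(example_choices))
```

In this made-up sequence the participant selects 30 of 50 advantageous cards in the first block and 40 of 50 in the second, which is the kind of block-to-block improvement the analysis is designed to detect.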

In addition to significant effects of age on IGT performance, a gender difference also emerged. This difference is discussed more thoroughly in the following section. We did not find a significant correlation between performance on the IGT and measures of substance use, impulsivity, or excitement-seeking. This lack of a significant correlation was due to two factors: (a) there was relatively little substance use within this particular cohort, who volunteered to stay after school hours to be tested, and (b) because 25 pair-wise correlations were run, the risk of a type 1 error was substantial, and the standard correction for multiple correlations generated a stringent criterion alpha level of 0.0009. In other studies there is evidence that IGT performance is negatively correlated with substance use (e.g., Bechara and Martin, 2004) and impulsivity (e.g., Best et al., 2002). In addition, we did not find a significant correlation between IGT and WCST performance.

age accounted for 90% of the variance. Vertical bars indicate SEM.

The steady increase in IGT performance throughout adolescence to young adulthood can be interpreted in many ways, one of which is that increases in performance are related to the ongoing neuroanatomical and neurochemical development of the frontal lobe. Regardless of the interpretation, the data unambiguously reveal a clear distinction between adolescent and young adult participants.

Our findings with adolescents have been replicated (Hooper et al., 2004). In that study, participants were given 100 trials (five blocks of 20 trials) of a computerized IGT with a contingency value scaled below the traditional IGT in order to employ real monetary rewards or punishments. Overall, 14- to 17-year-old participants performed significantly better than 9- to 10-year-old participants. In Block 4, the 14- to 17-year-old group performed better than both the 9- to 10- and 11- to 13-year-old groups. In Block 5, the 14- to 17-year-old group performed better than only the 9- to 10-year-old group. These data confirm our finding that older adolescents learned the task earlier and to a greater extent than younger participants.

## **AGES 19–63 YEARS: PERFORMANCE ON REAL/VIRTUAL CARD IGT FROM EARLY ADULTHOOD TO OLD AGE**

In a separate study of possible age-related changes in IGT performance we tested non-clinical adults ranging in age from 19 to 63 years (Reavis and Overman, 2001). These subjects were also tested on a probability-learning task, the California Weather Task (CWT; Knowlton et al., 1994, 1996a,b). The CWT was chosen for two reasons. First, as is the case for the IGT, learning is gradual across multiple trials. Secondly, as is the case for the early trials of the IGT, individuals can learn without being aware of the information they have acquired. This notion is an essential component of the somatic marker hypothesis (SMH; Damasio et al., 1991). In contrast to the IGT, performance on the CWT is dependent upon the dorsal striatum (Packard et al., 1989), and, as such, is impaired in Parkinson's and Huntington's patients, but not in amnestic adults (Knowlton et al., 1994). The CWT is a probabilistic classification habit task. Participants are shown up to four cards on a computer screen and must gradually learn which combinations of cards predict one of two weather outcomes: rain or sunshine. A particular card is associated with the outcome of sunshine 75, 57, 43, or 25% of the time, and is thus associated with the outcome of rain 25, 43, 57, or 75% of the time, respectively. Participants are exposed to any combination of cards on a given trial, and they must gradually learn which cards and combinations are probabilistically related to a given outcome. The computer provides visual and auditory feedback corresponding to a correct or incorrect response.
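The card-outcome probabilities just described can be made concrete with a small simulation. The sketch below covers only trials on which a single card is shown, because the rule for combining several cards into one outcome probability is not specified in this text; the card names and the function name are hypothetical.

```python
import random

# P(sunshine) for each of the four cue cards, as stated in the text.
# The card names are placeholders for illustration only.
P_SUN = {"card1": 0.75, "card2": 0.57, "card3": 0.43, "card4": 0.25}

# P(rain) is the complement of P(sunshine) for each card.
P_RAIN = {card: round(1.0 - p, 2) for card, p in P_SUN.items()}


def simulate_single_card_trials(card: str, n_trials: int, rng: random.Random) -> list:
    """Draw weather outcomes for trials on which only `card` is shown."""
    return ["sun" if rng.random() < P_SUN[card] else "rain" for _ in range(n_trials)]


rng = random.Random(0)  # fixed seed so the run is repeatable
outcomes = simulate_single_card_trials("card1", 1000, rng)
print(P_RAIN)
print(outcomes.count("sun") / len(outcomes))  # proportion of sunshine outcomes
```

Because feedback is probabilistic, even the most predictive card is "wrong" on roughly a quarter of trials, which is why learning on this task is gradual across many trials.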

In addition to the real/virtual card IGT and CWT, sensation-seeking (sensation-seeking scales; Zuckerman, 1979) and depression (Center for Epidemiological Studies Depression Scale; Radloff, 1977) were assessed, as well as hormone status (estradiol, progesterone, and testosterone were assayed from blood samples). This resulted in six distinct groups as described in the Section "Gender." The groups were: (1) young males, mean age 19.1 years; (2) older males, mean age 59.4 years; (3) young menstruating females, mean age 19.8 years; (4) young mid-luteal females, mean age 22.4 years; (5) older women on ERT, mean age 54.5; (6) older women not on ERT, mean age 62.7. The order of the IGT and CWT was counterbalanced. Performance was measured by the percentage of advantageous cards (from Decks C + D) across 150 trials. Additionally, rule-stating was measured by asking the participant to tell the experimenter "all they knew about the game and how they felt about the game" at intervals of 10 trials. If they did not state which two decks they thought were good or bad, they were prompted to do so. After the response they were reminded that the good and bad decks always remained the same. We recorded at what point during the task the rule was stated correctly, i.e., that Decks C and D were the "advantageous", or "best", etc., decks.

There was no significant effect of age on IGT performance between young adults (groups 1 + 3 + 4) and older adults (groups 2 + 5 + 6). Nor was there a significant effect of hormones within groups of males or females. Across 150 trials, young participants selected 65.6% advantageous cards and older participants selected 60.6% advantageous cards (Reavis and Overman, 2001). As shown in **Figure 5**, both young and old participants improved performance across blocks of trials.

There were no differences in IGT performance among the four hormonal groups of women, so all were collapsed into one group. Similarly, there was no significant difference in IGT performance between the two groups of men, so both were collapsed into one group. A comparison of males vs. females revealed a significant gender difference. This is discussed in detail below, but essentially men and women were equal in performance on the first block of trials, but males performed significantly better than females on the second and third blocks.

While there were no significant age differences in adults in IGT performance, there was a significant age difference in rule stating. Significantly more college-age participants (*M* = 63%) stated the correct rule at some point throughout the task than did older participants (*M* = 38%), indicating perhaps a higher level of cognitive awareness of the task and supporting the claim by Dunn et al. (2006) that the IGT is not cognitively impenetrable. There was also an interesting test order effect, but only for men. When the CWT preceded the IGT, younger men improved in their IGT performance. This is in contrast to older men's IGT performance, which declined when the CWT preceded the IGT (Reavis and Overman, 2001). Since fatigue does not seem to be a viable explanation, it appears that having the CWT first was somehow a benefit to younger men and a detriment to older men. At this time, we do not have an explanation for the order effect, and it is an important topic for future research because it indicates that administration of multiple tests might affect performance on the IGT.

With regard to the questionnaires, we found that (1) across all subjects, sensation-seeking was significantly correlated with performance on the real/virtual IGT (*r* = 0.162, *p* = 0.025), (2) males scored higher than females on this scale, and (3) depression scale scores were not correlated with performance on either the IGT or CWT (Reavis and Overman, 2001).

#### *Comparison with previous studies of young vs. older adults*

There are mixed reports about the performance of young and older adults on the IGT.

*Failure to document age changes on IGT.* Some studies have been consistent with our finding of no age differences. For example, when using the computerized PAR™ IGT, Wood et al. (2005) found no performance differences between young adults (ages 18–25) and older adults (ages 65–88). There have been some indications for age changes in tasks that rely on DLPFC in the face of no age changes for tasks that rely on VMPFC. MacPherson et al. (2002) tested young (mean age = 28.8 years), middle-age (mean age = 50.3 years) and older (mean age = 69.9 years) adults on two batteries of frontal tasks: (1) "DLPFC tasks": WCST, Self Ordering Pointing Task, and Delayed Response and (2) so-called "VMPFC tasks": IGT (using real cards), Faux Pas Task, i.e., detecting social slips, and an Emotional Identification Task. The results revealed age-related declines on all three DLPFC tasks but no age-related changes on the VMPFC tasks with the exception of identifying sadness on the Emotional Identification Task (MacPherson et al., 2002). A somewhat similar study found partially contrasting results. Lamar and Resnick (2004) tested young (mean age = 28 years) and older adults (mean age = 69 years) on four DLPFC tasks and three orbitofrontal (OFC) tasks. There were no age differences on any of the DLPFC tasks but there were some age differences on the OFC tasks. Specifically, younger adults scored better than older adults on delayed matching and delayed non-matching to sample tasks. However, there were no age differences on the OFC task of the IGT. In fact, both young and older adults performed exactly the same on the IGT, and chose 55% advantageous cards (Lamar and Resnick, 2004). The selection of only 55% advantageous cards across 100 trials seems to be low compared to most other IGT findings. The authors mention "decks of cards" but it is not clear whether this referred to real paper cards or virtual cards. Finally, Kovalchik et al. (2005) found no performance differences between young adults (18–26 years) and elderly adults (70–95 years) when using a two-deck variation of the IGT.

*Documentation of age changes on IGT.* In contrast to the failure to find age-related IGT changes, there are a few reports of IGT impairments among some older adults, at least when defined by subgroups of older adults. Denburg et al. (2006) found that there were two significantly different groups among older adults: impaired (as indicated by a negative net score) or unimpaired (as indicated by a positive net score). Similarly, Denburg et al. (2005) found that a subset of older adults (56–85 years of age) showed impairments on the IGT relative to younger adults (26–55 years of age). Specifically, they found that 14 out of 40 (35%) of the older adult group were impaired while the majority (65%) were unimpaired. These results were supported by Fein et al. (2007), who found that a greater number of adults between the ages of 56 and 85 years were impaired on the IGT in comparison to adults between the ages of 18 and 55 years. Thus, some, but not all, older adults are reported to show IGT impairments. However, the same can be said of young healthy adults. Some, but not all, young adults score poorly on the IGT (in terms of advantageous card selection) and prefer decks with infrequent losses, as documented by Steingroever et al. (2013) and by Reavis and Overman (2001).
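The impaired/unimpaired split by the sign of the net score can be sketched as below. The sketch assumes the conventional IGT net score, (C + D) − (A + B); the exact scoring and any statistical criteria used in the cited studies are not given in this text, so the function names, the treatment of a zero score, and the example data are hypothetical.

```python
from collections import Counter
from typing import List


def net_score(choices: List[str]) -> int:
    """Conventional net score: advantageous picks (C + D) minus disadvantageous picks (A + B)."""
    counts = Counter(choices)
    return (counts["C"] + counts["D"]) - (counts["A"] + counts["B"])


def classify_by_net_score(choices: List[str]) -> str:
    """Label a participant by the sign of the net score (assumed rule for illustration)."""
    score = net_score(choices)
    if score > 0:
        return "unimpaired"
    if score < 0:
        return "impaired"
    return "unclassified"  # a score of exactly zero is left undecided in this sketch


# Hypothetical participant who favors disadvantageous decks across 100 trials.
example = ["A"] * 25 + ["B"] * 35 + ["C"] * 20 + ["D"] * 20
print(net_score(example), classify_by_net_score(example))
```

A sign-based split of this kind is coarse: two participants with net scores of −2 and −60 receive the same label, which is one reason deck-by-deck and block-by-block analyses are informative.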

## **SUMMARY OF SECTION "EFFECTS OF AGE ON THE IGT"**

Age clearly impacts IGT performance, as shown by the differential levels of performance of adolescents through young adulthood (Hooper et al., 2004; Overman et al., 2004). However, with regard to older adults, there are mixed results depending on procedure and type of analysis. In the brief review cited above, four studies failed to find age differences on the IGT and three found age differences in subgroups of older adults. When using our real/virtual IGT, we find little or no evidence for age-related IGT decrements beyond young adulthood.

## **GENDER DIFFERENCES ON IGT PERFORMANCE: DECK-BY-DECK ANALYSIS**

#### **BASIC FINDINGS**

Both normal males and females show learning on the IGT across 100 or 200 trials, in that they learn to select significantly more advantageous cards than disadvantageous cards. However, males perform at higher levels than females. Gender differences in performance on the IGT were first documented by Reavis and Overman (2001) and have been replicated frequently (for review, see van den Bos et al., 2013). While males, as a group, choose significantly more advantageous cards than do females, there is always overlap on IGT scores for populations of males and females (e.g., Reavis and Overman, 2001; van den Bos et al., 2013). The male bias has been documented with real/virtual cards (e.g., Reavis and Overman, 2001; Overman et al., 2004, 2006) or virtual cards (e.g., Bolla et al., 2004; Denburg et al., 2009).

In terms of card selection, the sex difference is the result of females' preference for HFOG cards, either from disadvantageous Deck B (van den Bos et al., 2013) or from both HFOG decks, disadvantageous Deck B + advantageous Deck D (Reavis and Overman, 2001; Overman et al., 2006). In terms of a biological basis for these performance differences, there are a number of hypotheses that are discussed below.

The general IGT literature is muddled with reference to gender differences for several reasons. First, many studies have not analyzed IGT performance for gender (Bechara et al., 1997, 2000; Nagy et al., 2006). Secondly, some studies have analyzed for gender and failed to find a difference; however, there was no deck-by-deck analysis (van Honk et al., 2003). Additionally, some studies have conducted a deck-by-deck analysis but not a gender analysis (Wilder et al., 1998; Lin et al., 2007). In the article by Lin et al. (2007), the lack of gender analysis is of considerable concern, because the authors make a particular note that HFOG *disadvantageous* Deck B is selected more than, say, *advantageous* Deck C, and results in a "prominent Deck B phenomenon." Unfortunately, there is no way of knowing whether the preference for HFOG decks was driven by females or not. As discussed below, IGT analyses by gender, deck type, and blocks of trials are essential for the formulation of refined hypotheses about how and why various groups display differential performance.

#### **IGT GENDER DIFFERENCES ACROSS THE LIFE SPAN**

There is evidence that the sex difference in IGT performance persists from childhood to young adulthood, and perhaps longer. Males as young as 7–15 years of age perform significantly better than females on a child-friendly variant of the IGT, the hungry donkey task (Crone et al., 2005). Adolescent males (11–18 years of age; Reavis and Overman, 2001) as well as adult males (18–62 years of age) choose significantly more advantageous cards than do females on a 200 trial real/virtual card IGT.

#### **INTERPRETATION OF FEMALE PREFERENCE FOR HFOG DECKS**

At this point in time, it is not known precisely why females as a group tend to prefer HFOG cards relative to males. However, several possibilities can be ruled out: gender differences in math ability, response perseveration, and hormones.

#### *IGT gender differences are not due to differential math ability*

The IGT can be classified as employing "arithmetic" cards because every card contains a "plus" value and many have a "minus" value. Thus, the participant must have rudimentary calculation skills to determine which decks are "paying off." Several studies have shown a male advantage in certain mathematical domains (Hedges and Nowell, 1995; Benbow et al., 2000). So, perhaps males make these calculations more rapidly and accurately than do females and, thus, are more efficient on the IGT. To test this theory, we created a new real/virtual card IGT version in which every card contained a win and a loss, and thus required a calculation (Overman et al., 2006). The net outcome of each card and deck matched the corresponding card and deck in the original IGT (Bechara et al., 1994). Since females were inferior to males on the traditional IGT (which requires calculation on 30% of the cards), one would predict that they would score even more poorly on the new version, in which calculation was required on 100% of the cards, if females' poorer performance were due to differential math abilities. Two hundred trials of this real/virtual card IGT were administered to 31 females and 30 males ranging in age from 17 to 29 years. Results showed that across 200 trials, both males and females learned the task; females did not perform significantly differently than men in terms of choosing advantageous cards, although the trend approached significance, *p* = 0.08, with males selecting 69% advantageous cards and females selecting 62% advantageous cards, which is similar to results on the normal IGT. A finer analysis of card choices revealed the gender difference previously found in our laboratory. Specifically, females chose significantly more cards from disadvantageous Deck B than did males. Moreover, the magnitude of this difference increased as the task progressed so that in the last block of trials females chose almost twice as many cards from Deck B as did males (25 vs. 13%). Thus, additional math requirements neither aided nor hindered females' IGT performance relative to that of females with traditional math requirements.

#### *IGT gender differences are not due to differential response perseveration*

Infant female non-human primates (Clark and Goldman-Rakic, 1989) and infant female humans (Overman et al., 1996b) perseverate significantly more than respective males on reversal tasks that rely on the ORBPFC. This phenomenon may be relevant for adult females' differential preference for HFOG disadvantageous cards in Deck B. In the original IGT (Bechara et al., 1994), within the first 10 trials, the \$1250 penalty card is the ninth card in Deck B. Because almost all participants explore all decks in the first 20–40 trials, the first penalty card from Deck B is typically not selected until well into the task, e.g., on average the 36th draw (Bechara et al., 1997) and the 26th and 29th draw for males and females, respectively (Overman et al., 2006). Until these relatively late trials, cards in Deck B have always paid \$100, while cards in the other decks have been both rewarded and punished. Thus, participants may gain a sense that Deck B has great positive weight, which may lead to perseveration on this deck throughout the task. This may be more likely in females than in males. In other words, females may not reverse preferences as early as men, as is the case for monkeys and infants in the object reversal tasks cited above. Several authors have claimed that impaired IGT performance of patients with VMPFC damage is the result of a deficit in reversal learning (Rolls, 1999, 2005; Fellows and Farah, 2005).

This hypothesis has been directly tested with brain-damaged patients. Fellows and Farah (2005) administered the IGT to subjects with VMPFC damage. These authors hypothesized that these patients perform poorly on the IGT because (a) they do encounter high paying cards from Deck B several times without a penalty in the early trials, and (b) they have deficits in shifting their choices to low paying advantageous cards. Indeed, when penalty cards were moved to the front of the decks in the IGT, the VMPFC patients performed as well as normal controls (Fellows and Farah, 2005).

We tested this perseveration hypothesis in normal participants by creating a new version of the real/virtual card IGT in which the \$1250 loss card from Deck B was moved to the third place in the first block of 10 trials and to the first place in Blocks 2–4 (Overman et al., 2006). College-aged males (*n* = 31) and females (*n* = 30) were given this task. Results showed that the position manipulation did not alter sex differences. Although both males and females learned the task, i.e., chose increasingly more advantageous cards as the task progressed, females still chose significantly more cards from Deck B than did males, which resulted in a significantly lower overall performance than males. Thus, differential response perseveration does not appear to be the reason for IGT gender differences.

#### *IGT gender differences are not due to hormones within gender*

Although there are consistent findings of sex differences on the IGT, an analysis of high and low gonadal hormones in males as a group and females as a group has not revealed an effect of hormonal status. Reavis and Overman (2001) verified hormonal status (estradiol, progesterone, and testosterone) in young and older adults by blood sample assay. This resulted in six distinct groups. Men: young males (mean age = 19.1) and older males (mean age = 59.4). Women: young females with low hormones (mean age = 19.8), young females with high hormones (mean age = 22.4), older post-menopausal females on ERT (mean age = 54.5), and older post-menopausal females not on ERT (mean age = 62.7). As expected, males were significantly higher (by a factor of 20) in levels of testosterone than females. Results across 150 real/virtual card IGT trials revealed no significant differences in IGT performance (advantageous Decks C + D) between young and old groups or between the two male hormone groups or among the four female hormone groups. Thus, variations in menstrual cycle do not affect IGT performance in terms of selection of advantageous cards.

However, there was a significant gender difference with males choosing 67.7% advantageous cards and females choosing 60.7% advantageous cards. As shown in **Figure 6**, males and females were equal in IGT performance in the first block of trials, but males scored significantly higher in Blocks 2 and 3. Also, significantly more males (68%) stated the correct rule of which decks are the two good decks than did females (48%). Furthermore, males stated the correct rule significantly earlier in the task (75th trial on average) than did females (97th trial on average).

Our findings have been recently replicated by van den Bos et al. (2013), who found that while both males and females learn to choose advantageous cards across the task, males choose more cards from Decks C and D while females choose more cards from Decks B–D. This once again reveals females' differential preference for cards from HFOG Deck B as reported above.

## **TESTOSTERONE AS A POSSIBLE CONTRIBUTOR TO IGT GENDER DIFFERENCES**

Hormone assays in the study by Reavis and Overman (2001) revealed that, as expected, males have significantly more testosterone than females (1.88 vs. 0.09 ng/ml blood). Furthermore, males outperformed females on the IGT. This raises the question of a link between testosterone and differential IGT performance and perhaps a link with the ORBPFC. As shown in an experimental double dissociation, perinatal testosterone in monkeys accelerates the functional maturation of ORBPFC and slows the functional maturation of area TE in the inferior temporal lobe (see Overman and Bachevalier, 2001). In brief summary, infant male monkeys significantly outperform females on an object reversal task that is dependent upon ORBPFC. Exposure to perinatal testosterone renders female monkeys equal to males on reversal learning, and lesions of ORB in adult monkeys dramatically impair reversal learning in both males and females. In the second part of the dissociation, infant female monkeys outperform males on a TE-dependent concurrent discrimination task; however, castrated infant males perform as well as normal females and better than normal males on this task (Overman and Bachevalier, 2001).

Perinatal testosterone status is very similar in infant monkeys and humans. Thus, one might predict that young male children would outperform young female children on the object reversal task. This is exactly the finding from studies in our laboratory (Overman et al., 1996a,b). Here, children were tested with nonverbal procedures exactly as in previous studies with monkeys. The findings from those studies were as follows: (1) Male and female children performed equally on object discrimination tasks, indicating no differences in general learning ability. (2) Males under the age of 34 months were superior to age-matched females in object reversal learning with a pattern of results almost exactly like that in infant monkeys. (3) In addition to slower learning relative to males, about 20% of female children under the age of 29 months showed hyperemotional behaviors commensurate with the start of reversal training. (4) Females under the age of 36 months were superior to age-matched males in learning a concurrent discrimination task.

Thus, infant male and female children display differences in their learning abilities almost exactly like those shown by infant monkeys and, in infant monkeys, task performance is clearly dependent on perinatal differences in testosterone. With important implications for IGT performance (which is dependent upon ORBPFC), one of the monkey/child tasks, object reversal, is also known to be dependent upon functions of the ORBPFC, the functional maturation of which depends upon testosterone. All of these data are suggestive of a "testosterone/ORBPFC dynamic" that is related to gender differences on the IGT. This dynamic is undoubtedly complex, for as shown in the next section there is strong evidence that different regions of ORBPFC are related to performance on the IGT in males and females. However, participation of the ORBPFC in the IGT is clearly not the whole story, as other regions of the PFC, such as the DLPFC, have been implicated in IGT performance (e.g., Bechara, 2007).

## **IMPLICATIONS FOR INVOLVEMENT OF ORBPFC: WIN–LOSS SENSITIVITY AND RISK TAKING**

In a comprehensive review of sex differences on the IGT, van den Bos et al. (2013) evaluated performance results from six risk-taking tasks in order to explore sensitivity to punishment as a possible explanation for IGT gender differences. These authors concluded that performance data from these six tasks, as a group, do not support win–loss sensitivity as an explanation for IGT gender differences. However, when analyzing results specifically from the Cambridge Gambling Task (Deakin et al., 2004; van den Bos et al., 2012) and the Risky Gain Task (Lee et al., 2009), van den Bos et al. (2013) propose that *oversensitivity to loss* may be driving female performance in risk-taking tasks.

In contrast to the hypotheses of van den Bos et al. (2013), others have interpreted the IGT gender difference as being driven by a male aversion to loss and a female preference for reward (Overman et al., 2006). This interpretation is based upon the results of a PET imaging study by Bolla et al. (2004), who reported that while performing the IGT, males and females showed differential activation in subregions of ORBPFC. Specifically, males showed increases in activation in large regions of lateral ORBPFC (BA 47), while females showed increases in activation in regions of medial ORBPFC (BA 11). This is important because O'Doherty et al. (2001) have shown that in humans, the lateral ORBPFC is sensitive to punishment, whereas the medial ORBPFC is involved in reward and guessing when outcomes are uncertain. With these differences in mind, Overman et al. (2011) attempted to disrupt or override lateral OFC activity in men and render them more similar to females in IGT performance. These authors presented different aromas to participants every 10 trials during a 200 trial real/virtual card IGT, with the hypothesis that, since medial ORBPFC was shown by Bolla et al. (2004) to increase in activation in females during the IGT, presentation of aromas might increase activation in males and females and render the genders equivalent in IGT performance. Female IGT performance was predicted to remain at the normal low level because of a floor effect. Indeed, when aromas were presented (and medial ORB putatively increased in activity), male IGT performance declined to the level of females. Furthermore, males who received aromas performed significantly below control males who did not experience aromas. If it were true that increasing activation in medial ORBPFC resulted in males adopting a reinforcement strategy more like that of females, then they (males) might show a preference for the HFOG cards of Deck B. That is exactly what the data confirmed. In the control no-aroma IGT task, females chose significantly more cards from Deck B than did males, but in the aroma IGT task, males chose a number of cards from Deck B equal to that of females.

These results support the hypothesis that IGT sex differences may be driven, in part, by differential patterns of activation in brain regions, particularly subregions of the ORBPFC, which, in turn, lead to differential sensitivity to reward and punishment by males and females.

## **DISCUSSION**

## **OVERVIEW**

There are five main points in this review, each of which is discussed below.

(1) In the past 20 years, there have been many different procedural task variations in IGT research, making it difficult to directly compare results across studies. If one is to make accurate comparisons of IGT performance across various populations, it is necessary to have a baseline of optimal performance among normal participants. Two variables result in optimal performance in non-clinical, college-age participants: the use of a combination of real and virtual cards and the administration of more than 100 IGT trials. Optimal performance on the PAR™ IGT does not occur with the use of virtual cards alone across 100 trials.

(2) The use of real/virtual card IGT procedures has revealed a positive linear relationship between age and IGT performance from ages 11 to 25 years. We did not find a significant difference in performance between normal young adults and older adults up to 65 years of age. However, there have been reports that subgroups of older adults may be impaired on the IGT.

(3) The use of real/virtual card IGT procedures has revealed a significant gender difference in performance with males choosing more advantageous cards than females. This difference exists from adolescence to old age.

(4) A deck-by-deck analysis reveals that the gender difference is driven by females' preference for cards from HFOG decks, B and D. The gender differences are not due to differences in math ability, response perseveration, or hormonal differences within gender.

(5) The IGT gender difference may be related to a dynamic between testosterone and orbital prefrontal systems.

#### **ENHANCED IGT PERFORMANCE WITH REAL/VIRTUAL CARDS**

Our comparison of performance on five versions of the IGT, including four versions of the PAR™ IGT, demonstrated that the use of virtual cards alone did not result in optimal performance in non-clinical, college-age participants. IGT procedures that employed both real and virtual cards yielded significantly higher scores in two measures of performance. First, as shown in **Figure 1**, choice of advantageous cards across 100 trials was significantly higher with the use of real/virtual cards. Secondly, as shown in **Figure 2**, the difference was more pronounced when examining performance in blocks of the task, especially in later blocks. With both procedures, significant learning (choice of advantageous cards) occurred by the second block of trials, as shown in previous reports (e.g., Bechara et al., 2000; Overman et al., 2013), but as also shown in both of these reports, the use of virtual cards alone appears to produce little if any additional learning beyond the second block. Furthermore, our data clearly show that participants achieve significantly higher performance when they are administered 200 IGT trials.

#### *Implications for the somatic marker hypothesis (SMH)*

These findings have two very important implications for the SMH and for the use of the PAR™ IGT. In IGT tests of the SMH
(Damasio et al., 1991), Bechara et al. (1997) propose that the level of cognitive understanding continues to develop throughout the IGT. Specifically, conscious realization that Decks C and D are advantageous is said to arise after approximately the 80th trial (Bechara et al., 1997, but see Maia and McClelland, 2004, who argue that participants may have knowledge about the decks early in the game). It would seem that selection of advantageous cards would increase with awareness that they are "good." This appeared to be the case when real cards were used (Bechara et al., 1994). In these data, the single "typical control" showed increasing selection of advantageous cards throughout the task: choosing 56, 72, 80, and 96%, respectively, in the four blocks of 25 trials. In our study, increasing performance across the task was seen only when real/virtual cards were used but not with virtual cards only, as in the PAR™ IGT. This raises the question of whether all components of the SMH (i.e., emotional and conscious) come into play in the computerized IGT.

## *Implications for the PAR™ IGT*

Taken together, our data and the data from Bechara result in a conundrum. First, the SMH (Damasio et al., 1991) was tested and supported with IGT procedures using real cards (Bechara et al., 1994, 1997). Secondly, our data and (perhaps) those by Bechara et al. (2000) show that selection of advantageous cards increases throughout the task with the use of real cards but not with the use of virtual cards. However, the PAR™ IGT utilizes only virtual cards. Thus, when using the PAR™ IGT, accurate characterization of clinical and non-clinical populations becomes problematic.

## **PERFORMANCE ACROSS AGE**

It is clear that performance on the IGT or similar child-friendly tasks improves from approximately ages 6–25 years. During adolescence and early adulthood (approximately 11–25 years), IGT performance improves significantly (Hooper et al., 2004; Overman et al., 2004). With regard to adolescents' "real world" decision-making, the incidence of risky decisions decreases during the same time period in which performance on the IGT increases (see Spear, 2000). This is a period of significant changes in brain connectivity that occur throughout the brain, including in the frontal lobe and the PFC (which is closely involved in decision-making in general and the IGT in particular). Since the ORBPFC is strongly involved with IGT performance, it is reasonable to speculate that improved IGT performance has its underpinnings in the functional maturation of the ORBPFC and related networks. Of course, there are numerous social and environmental changes that occur concomitantly with IGT improvement during this time frame. Undoubtedly, there are complex interactions between changing brain systems and changing external variables.

It is less clear whether there are significant changes in IGT performance between young and elderly non-clinical adults. Several laboratories have failed to find significant changes between younger and older adults using virtual card IGT (Wood et al., 2005) or real/virtual card IGT (Reavis and Overman, 2001; see **Figure 5** in this review). Of particular note is the fact that the normative data used to validate the PAR™ IGT reveal little, if any, change in IGT performance from ages 18 to 79 years (Bechara, 2007). These data report advantageous card selection in three age groups of normal adults: young (18–39 years), older (40–59 years) and elderly (60–79 years), in which on average the scores were 60.5, 58.5, and 57%, respectively. While no statistical comparisons are presented in the PAR™ IGT manual, there is almost certainly not a significant difference between these three groups given the normal variance that occurs during IGT performance. However, by using subgroup analyses of IGT performance, others have collected data that suggest there are more "impaired" older adults (56–85 years) as compared to younger adults (18–55 years; Denburg et al., 2006; Fein et al., 2007).

## **DIFFERENTIAL PERFORMANCE BY MALES AND FEMALES**

## *Basic findings that males choose more advantageous cards*

The finding that males, as a group, choose significantly more advantageous cards than females is robust across several IGT procedures (see van den Bos et al., 2013). In terms of the larger IGT literature, three points are of particular importance. (1) Most IGT studies have not analyzed for gender. (2) Gender differences may not be apparent in 100 IGT trials; however, they become apparent when performance is analyzed across 200 trials, and in particular, in the latter blocks of trials. (3) A deck-by-deck analysis reveals more detailed information about the behavioral underpinnings of the gender difference. Specifically, females score lower than males because they have a tendency to select more cards from Deck B, which is a HFOG deck. It would seem to follow that to best understand IGT performance in clinical populations, studies should include gender analyses, more than 100 trials, and deck-by-deck analyses.

## *How to report IGT data*

The procedure of a deck-by-deck analysis raises the question of how best to report IGT performance. Net score reports (advantageous cards minus disadvantageous cards) cannot reveal information about preferences for different decks. Information about deck selection is better presented in terms of percentage scores. Furthermore, net scores are simply transformations of more readily understandable percentage data. For example, a comparison of net scores of 6 vs. 10 is less easily interpreted than the equivalent statement that 65 vs. 75% of the selected cards were advantageous. The transformation from percent to net score does not seem to be necessary, and in fact, it prevents a complete analysis of performance.
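The relationship between the two reporting formats can be made explicit. The following is a minimal sketch, assuming that a net score is computed over a fixed number of trials as advantageous minus disadvantageous selections. The function names are illustrative, and the block size of 20 trials is an assumption: it is the block size at which net scores of 6 and 10 correspond to 65 and 75% advantageous selections.

```python
def net_to_percent(net_score, n_trials):
    """Convert a net score (advantageous minus disadvantageous picks)
    into the percentage of advantageous picks over n_trials."""
    # A net score must have the same parity as n_trials and cannot exceed it.
    if abs(net_score) > n_trials or (n_trials + net_score) % 2:
        raise ValueError("net score is not attainable with this many trials")
    advantageous = (n_trials + net_score) // 2
    return 100.0 * advantageous / n_trials


def percent_to_net(percent_advantageous, n_trials):
    """Inverse transformation: percentage of advantageous picks to net score."""
    advantageous = percent_advantageous / 100.0 * n_trials
    return 2 * advantageous - n_trials


# Assumed block size of 20 trials (not stated in the text).
print(net_to_percent(6, 20), net_to_percent(10, 20))
```

Because the transformation is one-to-one for a given number of trials, the net score adds no information to the percentage score, and neither measure alone separates the four decks.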

## *Deck-by-deck analysis and gender differences*

The value of a deck-by-deck analysis is clearly shown in a recent review of IGT studies (Steingroever et al., 2013). These authors argue that individual deck analysis reveals critical information about the process of decision-making during the IGT. For example, in a review of 17 studies, Steingroever et al. (2013) show that cards from Deck B (disadvantageous with infrequent losses) are chosen as often as cards from the two good decks (C and D). The implication is that subjects are choosing on the basis of two factors: (a) a preference for advantageous decks that yield long-term gain and (b) a preference for cards with a HFOG, even though they may be disadvantageous in the long run. Our results add another twist to this analysis: females drive the preference for high-frequency-of-gain cards. There are additional studies that report "high frequency of gain" preference
(Chiu and Lin, 2007; Lin et al., 2007; Chiu et al., 2008). However, the data in these papers are difficult to interpret for two reasons. First, none of the reports analyzed/reported effects of gender. Secondly, these authors employ task versions that differ significantly from the mainstream IGT procedures. For example, in the IGT modification used by Chiu et al. (2008), the schedule of wins and losses repeats every five trials for each deck. This would seem to render the task much more transparent than the original IGT (e.g., Bechara et al., 1994; Overman et al., 2006).

Our results clearly show that in normal college-age participants, the lower IGT score by females is driven by their tendency to choose HFOG cards from Deck B. But females also learn to choose good decks across the task. Thus, their decision-making processes seem to be driven by frequency of gain as well as long-term gain.
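A deck-by-deck analysis of the kind discussed in this section can be sketched in a few lines. The example below is illustrative only: the deck labels A–D and the designation of C and D as advantageous and of B and D as HFOG decks follow the text, whereas the function names, the block size, and the short choice sequence are hypothetical and do not represent data from any study.

```python
from collections import Counter

DECKS = ("A", "B", "C", "D")
ADVANTAGEOUS = ("C", "D")  # long-term advantageous decks
HFOG = ("B", "D")          # high-frequency-of-gain decks


def deck_percentages(choices):
    """Return the percentage of selections made from each deck."""
    if not choices:
        raise ValueError("no choices supplied")
    counts = Counter(choices)
    unknown = set(counts) - set(DECKS)
    if unknown:
        raise ValueError(f"unknown deck labels: {sorted(unknown)}")
    return {deck: 100.0 * counts[deck] / len(choices) for deck in DECKS}


def percent_from(choices, decks):
    """Percentage of selections that fall in the given group of decks."""
    by_deck = deck_percentages(choices)
    return sum(by_deck[deck] for deck in decks)


def by_block(choices, block_size=20):
    """Split a choice sequence into consecutive blocks and analyse each."""
    return [deck_percentages(choices[i:i + block_size])
            for i in range(0, len(choices), block_size)]


# Hypothetical sequence of ten selections (not real data).
example = list("ABBDCBDDCB")
print(deck_percentages(example))
print(percent_from(example, ADVANTAGEOUS), percent_from(example, HFOG))
```

Reporting the per-deck percentages alongside the advantageous-deck total keeps a preference for Deck B visible, which a single net score would hide.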

#### *Interpretation of gender differences*

The meaning of gender differences in IGT performance is not completely understood at this time. The differences are not due to gender differences in math ability, response perseveration, or hormone fluctuations within females (Reavis and Overman, 2001; Overman et al., 2004). There are some interesting speculations relating differential IGT performance by males and females to differences in females' preference for reward and males' aversion to loss (Bolla et al., 2004), which, in turn, may be related to sex differences in a testosterone/ORBPFC dynamic (Overman et al., 1996b, 2011; van den Bos et al., 2013). The fact that women, as a group, do not perform as highly as men on the IGT should not be interpreted to mean that females are "inferior decision makers." Such speculation does not agree with the fact that significantly more males than females make poor "real-life" decisions, e.g., regarding substance abuse and gambling. Rather, it appears that within the context of the IGT, some, but not all, females are responding differently than males to specific components of the task.

#### **SUMMARY**

The IGT has been an incredibly fruitful research tool during the past 20 years. However, the findings from the multitude of studies are difficult to integrate and interpret due to wide variations in task methodologies and analyses. Dunn et al. (2006) make a valuable effort to integrate data from a variety of IGT studies in order to critically evaluate the SMH. However, they suggest that there are a variety of potential designs that could be used to better elucidate IGT decision-making as well as the SMH. In this review, we argue that a more complete understanding of IGT phenomena will best evolve if the performance of all populations is compared and contrasted to a common baseline of optimal performance. We believe that this baseline is established by the use of real/virtual cards across more than 100 trials, deck-by-deck analyses, and analyses for gender.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 06 August 2013; accepted: 26 November 2013; published online: 12 December 2013.*

*Citation: Overman WH and Pierce A (2013) Iowa Gambling Task with non-clinical participants: effects of using real + virtual cards and additional trials. Front. Psychol. 4:935. doi: 10.3389/fpsyg.2013.00935*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Overman and Pierce. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## A rodent version of the Iowa Gambling Task: 7 years of progress

## *Ruud van den Bos<sup>1</sup>\*, Susanne Koot<sup>2,3</sup> and Leonie de Visser<sup>3†</sup>*

<sup>1</sup> Department of Organismal Animal Physiology, Faculty of Science, Radboud University Nijmegen, Nijmegen, Netherlands

<sup>2</sup> Division Behavioural Neuroscience, Department of Animals in Science and Society, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands

<sup>3</sup> Department of Neuroscience and Pharmacology, Brain Centre Rudolf Magnus, University Medical Centre Utrecht, Utrecht, Netherlands

#### *Edited by:*

Ching-Hung Lin, Kaohsiung Medical University, Taiwan

#### *Reviewed by:*

Bauke Buwalda, University of Groningen, Netherlands

Darrell A. Worthy, Texas A&M University, USA

#### *\*Correspondence:*

Ruud van den Bos, Department of Organismal Animal Physiology, Faculty of Science, Radboud University Nijmegen, Heyendaalseweg 135, NL-6525 AJ, Nijmegen, Netherlands e-mail: ruudvdbos@science.ru.nl; ruudvandenbos1@gmail.com

#### *†Present address:*

Leonie de Visser, Excerpta Medica BV-Adelphi Group, Amsterdam, Netherlands

In the Iowa Gambling Task (IGT) subjects need to find a way to earn money in a context of variable wins and losses, conflicting short-term and long-term pay-off, and uncertainty of outcomes. In 2006, we published the first rodent version of the IGT (r-IGT; Behavior Research Methods 38, 470–478). Here, we discuss emerging ideas on the involvement of different prefrontal-striatal networks in task-progression in the r-IGT, as revealed by our studies thus far. The emotional system, encompassing, among others, the orbitofrontal cortex, infralimbic cortex and nucleus accumbens (shell and core area), may be involved in assessing and anticipating the value of different options in the early stages of the task, i.e., as animals explore and learn task contingencies. The cognitive control system, encompassing, among others, the prelimbic cortex and dorsomedial striatum, may be involved in instrumental goal-directed behavior in later stages of the task, i.e., as behavior toward long-term advantageous options is strengthened (reinforced) and behavior toward long-term poor options is weakened (punished). In addition, we suggest two directions for future research: (1) the role of the internal state of the subject in decision-making, and (2) studying differences in task-related costs. Overall, our studies have contributed to understanding the interaction between the emotional system and cognitive control system as crucial to navigating human and non-human animals alike through a world of variable wins and losses, conflicting short-term and long-term pay-offs, and uncertainty of outcomes.

**Keywords: decision-making, humans, rats, prefrontal cortex, foraging behavior, behavioral models**

## **INTRODUCTION**

In 1994, Bechara and colleagues published the first paper on the Iowa Gambling Task (IGT; Bechara et al., 1994). In this task subjects need to find a way to earn money in a context of variable wins and losses, conflicting short-term and long-term pay-off, and uncertainty of outcomes. The IGT mimics daily, real-life, decisions (Damasio, 1994) and has given a strong impetus to understanding the role of the emotional system in the organization of decision-making behavior as well as the role of different prefrontal structures herein (e.g., Bechara, 2005; de Visser et al., 2011a; Gläscher et al., 2012). Furthermore, it has proven to be a useful neuropsychological tool to assess deficits in decision-making behavior underlying disorders related to, e.g., anxiety, eating, and addiction (reviews: Dunn et al., 2006; van den Bos and de Ridder, 2006; de Visser et al., 2011a; van den Bos et al., 2013a).

A number of rodent versions of the IGT (r-IGT) have been published during the last decade (van den Bos et al., 2006a; Pais-Vieira et al., 2007; Rivalan et al., 2009; Zeeb et al., 2009), allowing the study of general, cross-species principles underlying decision-making at a behavioral and a neural level (review: de Visser et al., 2011a). Elsewhere, we have reviewed IGT-like decision-making behavior related to eating behavior (van den Bos and de Ridder, 2006), different r-IGT models (de Visser et al., 2011a), neural structures (de Visser et al., 2011a), sex differences (van den Bos et al., 2013b), social modulation (van den Bos et al., 2013c), stress (van den Bos et al., 2013c) and (pathological) gambling (van den Bos et al., 2013a). Here, we review emerging ideas on the involvement of the emotional system and cognitive control system in r-IGT task-progression, i.e., we discuss the involvement of different prefrontal-striatal networks underlying task-progression (see IGT: Involvement of Prefrontal Structures). In addition, we suggest two directions for future research: (1) the role of the internal state of the subject in decision-making, and (2) studying differences in task-related costs (see New Directions for the r-IGT). We start by introducing our r-IGT (see A Rodent Model of the IGT) and end with a few general remarks (see Final Remarks).

## **A RODENT MODEL OF THE IGT**

In 2001 Spruijt, van den Bos, and Pijlman published a review in which they discussed, among other topics, the economy of animal behavior: which neurobiological mechanisms underlie foraging-related decision-making behavior in animals such that long-term behavior is, by and large, optimal (Spruijt et al., 2001). As discussed by Cabanac (1971, 1992), emotions are important causal factors in steering behavior toward the best long-term option (Cabanac, 1971: *pleasant is useful*). Similar ideas have emerged from studies using the IGT (Damasio, 1994; Bechara et al., 1997, 1999). We therefore adopted the IGT as a research tool to address questions related to guiding behavior toward a long-term optimal solution and the underlying neural circuits (van den Bos, 2004; van den Bos et al., 2006a).

To model the IGT we developed a choice-box with one arm containing 1 sugar pellet with 2 out of 10 times a quinine-saturated sugar pellet (8 pellets win per 10 choices; "long-term advantageous arm") and one arm containing 3 sugar pellets with 9 out of 10 times quinine-saturated sugar pellets (3 pellets win per 10 choices; "long-term disadvantageous arm"; van den Bos et al., 2006a; de Visser et al., 2011a). In this way we introduced a conflict between short-term and long-term pay-off of options as in the human IGT (de Visser et al., 2011a). We also introduced two empty arms as a control for non-specific effects, such as those related to memory. Recently, we have automated the task for use in the home-cage (Koot et al., 2009a, 2012; de Visser et al., 2011a).
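The pay-off arithmetic of the two baited arms can be reproduced with a short sketch. It assumes, in line with the totals quoted above, that pellets delivered on a quinine visit yield no gain; the function and variable names are illustrative and are not taken from the original task software.

```python
def pellets_per_block(pellets_per_visit, quinine_visits, block=10):
    """Edible sugar pellets gained over `block` visits to one arm.

    Assumption: on a quinine visit all pellets are quinine-saturated
    and therefore count as zero gain.
    """
    if not 0 <= quinine_visits <= block:
        raise ValueError("quinine_visits must lie between 0 and block")
    rewarded_visits = block - quinine_visits
    return pellets_per_visit * rewarded_visits


# Values taken from the task description above.
advantageous = pellets_per_block(pellets_per_visit=1, quinine_visits=2)
disadvantageous = pellets_per_block(pellets_per_visit=3, quinine_visits=9)
print(advantageous, disadvantageous)
```

The arm with the larger immediate reward (3 pellets on a rewarded visit) yields fewer pellets per 10 choices than the arm with the smaller immediate reward, which is the short-term versus long-term conflict the task is designed to create.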

When we compare the performance of rats and mice in the r-IGT to performance of humans in the IGT we observe similar patterns. In the first part of the task subjects explore the different options [first 40–60 trials in humans (100 trials in total), first 40–60 trials in animals (120 trials in total)], while in the second part they choose the long-term advantageous option more often (see van den Bos et al., 2006a). In contrast to other r-IGT models (see Rivalan et al., 2009; Zeeb et al., 2009) and the human IGT (Bechara et al., 1994) we have not differentiated between long-term outcome and frequency of reward/punishments of options in our r-IGT. However, given the strong similarity between our human and animal data thus far (e.g., de Visser et al., 2010, 2011b; van den Bos et al., 2012, 2013b), this has as yet not proven to be a setback or inherent problem of our task.

## **IGT: INVOLVEMENT OF PREFRONTAL STRUCTURES**

The output of decision-making processes, i.e., which action is taken in the end, is suggested to be determined by an interaction of two different forebrain systems: an emotional (limbic) system and a cognitive control (associative) system (e.g., McClure et al., 2004; Bechara, 2005; van den Bos et al., 2006b; de Visser et al., 2011a; Gläscher et al., 2012; **Figure 1**). During IGT performance these systems are activated in parallel, i.e., act as feed-forward and feedback systems, to optimize long-term behavior, and only differ in relative weight in different phases of the task (de Visser et al., 2011a). While the emotional system may be dominating the early phase in healthy individuals, the cognitive control system may be dominating the late phase, suppressing (eventually) activity in the emotional system.

At the level of prefrontal structures, in humans the emotional system may encompass the orbitofrontal cortex (OFC) and the ventromedial prefrontal cortex (VMPFC), while the cognitive control system encompasses the dorsolateral prefrontal cortex (DLPFC) and anterior cingulate cortex (ACC; e.g., McClure et al., 2004; Northoff et al., 2006; Lin et al., 2008; Lawrence et al., 2009; Li et al., 2010; de Visser et al., 2011a; Gläscher et al., 2012). The development of rodent versions of the IGT has led to the question whether activity of similar structures underlies IGT-like decision-making in rodents. This would not only enhance the validity of the models, but also allow for specific manipulations in different structures.

In our studies thus far, we clearly observed a role for the lateral orbitofrontal cortex and medial prefrontal cortex [infralimbic (IL) and prelimbic (PrL) cortex] in task-performance (de Visser et al., 2011b,c; van Hasselt et al., 2012; Koot et al., 2013, 2014).

**FIGURE 1 |** [Caption truncated in extraction; only the closing fragment survives: "... system; transparent blue, see text for further explanation".]

More specifically, focussing on the medial prefrontal cortex, we observed that inactivation of the PrL cortex was effective when rats were already choosing the long-term advantageous option, but not when they were still exploring the different options (de Visser et al., 2011c). In contrast, manipulations of the IL cortex were effective regardless of whether rats were still exploring or were already choosing the long-term advantageous option (Koot et al., 2014). Thus, these data tend to suggest that activity in the IL cortex may precede activity in the PrL cortex. If so, one would predict that a correlation will be found between c-Fos expression (as a marker of neuronal activity; see de Visser et al., 2011b; van Hasselt et al., 2012; Koot et al., 2013) in the IL cortex and task-performance in trial block 51–60 when only 60 trials are given (as in de Visser et al., 2011b; van Hasselt et al., 2012; Koot et al., 2013), while no such correlation will be found for the PrL cortex. Pilot experiments have confirmed this prediction. Given that data from different experiments seem to converge on the notion that the IL cortex may be (functionally) equivalent to the VMPFC in humans, while the PrL cortex may be equivalent to the dACC and DLPFC in humans (Milad and Quirk, 2012; Gass and Chandler, 2013; Mihindou et al., 2013), data in the r-IGT seem to match the data in the human IGT (in line with de Visser et al., 2011a). These findings are in line with data which suggest that the IL and PrL may play different roles in the organization of behavior, as shown in studies of fear-conditioning (Milad and Quirk, 2012), appetitive behavior (Burgos-Robles et al., 2013; Horst and Laubach, 2013), and control in addictive behavior (Gass and Chandler, 2013; Mihindou et al., 2013).

In general, our findings on the involvement of prefrontal areas in r-IGT performance are in line with those of other studies (Rivalan et al., 2011; Zeeb and Winstanley, 2011; Paine et al., 2013; Pittaras et al., 2013; Zeeb and Winstanley, 2013). In addition to performance-related differences in activity in prefrontal areas, we have observed performance-related differences in activity in striatal areas (e.g., de Visser et al., 2011b). **Figure 2** incorporates our findings in a broader perspective of prefrontal-striatal areas underlying r-IGT task-progression. This tentative neurobehavioral model of task-progression in the r-IGT is based upon models of cortico-basal ganglia systems (Yin and Knowlton, 2006; Yin et al., 2008). As discussed by Yin et al. (2008), areas encompassing the nucleus accumbens/ventral striatum are involved in Pavlovian processes, while areas encompassing the dorsal striatum are involved in instrumental behavior. When we relate this difference more specifically to the earlier discussion of the medial prefrontal cortex, this amounts to the following tentative picture (see legend **Figure 2** for other areas). The core area of the accumbens has been implicated in anticipatory/preparatory behavior related to Pavlovian cues signaling the expected value of commodities (Yin et al., 2008). In a similar vein, the VMPFC in humans is involved in anticipatory (Pavlovian) signaling of good *versus* bad options in the IGT, aiding in directing decision-making behavior toward the best long-term option (Bechara et al., 1999), which has been framed in a broader context as "affective meaning" (Roy et al., 2012). Given the suggestion that the IL cortex in rats may be related to the VMPFC in humans (Milad and Quirk, 2012; Gass and Chandler, 2013; Mihindou et al., 2013), the IL cortex and core area of the nucleus accumbens in tandem may help to direct behavior toward the best long-term option by anticipating expected values of options.
In contrast, the dorsomedial striatum is involved in organizing instrumental goal-directed behavior, i.e., in reinforcing behavioral acts and/or behavioral patterns which are conducive to reaching the goal, while punishing behavioral acts and/or patterns which deviate from reaching the goal (Yin et al., 2008; Paton and Louie, 2012; Kravitz et al., 2012). The PrL cortex as rodent equivalent of the dorsal ACC and DLPFC (Milad and Quirk, 2012; Gass and Chandler, 2013; Mihindou et al., 2013) may play a role in assessing final cost-benefit options of instrumental behavior by error-monitoring as well as outcome feedback, working memory and organizing goal-directed behavioral actions (Killcross and Coutureau, 2003; Ostlund and Balleine, 2005; Kolling et al., 2012). In tandem, therefore, the PrL and dorsomedial area of the striatum may play a role in organizing instrumental goal-directed behavior toward the best long-term option (Ostlund and Balleine, 2005).

In sum, these systems exert different levels of control over decision-making behavior (see van den Bos et al., 2006b; de Visser et al., 2011a; van den Bos et al., 2013b). The emotional system is involved in immediate responding to (potential) rewards, losses or threats (i.e., impulsive behavior) as well as in emotional control, i.e., adjusting behavior to changing contingencies and anticipating the value of intended choices. In this way it allows the organism to label the environment in terms of "long-term hot and not spots." This emotion-laden information is input for the cognitive control system, which subsequently "organizes" instrumental behavior toward the best long-term option, i.e., this system is more involved in response inhibition, error-monitoring, switching and long-term/future perspectives. At a behavioral level this would amount to the differentiation between responses to emotion-laden stimuli, such as anticipatory responses, and developing consistent behavior toward the best long-term option (instrumental learning).

**FIGURE 2 | Hypothetical neurobehavioral model of task-progression in the r-IGT (Yin and Knowlton, 2006; Yin et al., 2008).** It should be noted that the cingulate areas and insular cortex are not included (see New Directions for the r-IGT). Furthermore, the subdivisions of the OFC are not shown (see van den Bos et al., 2013b). In transparent red, the emotional system is shown, of which striatal areas are involved in Pavlovian behavior (see Yin et al., 2008): while the shell is involved in immediate (hedonic) responses (stimulus-outcome; (un)conditioned consummatory/hedonic responses), the core is involved in anticipatory/preparatory behavior (stimulus–stimulus relation). In transparent blue, the cognitive control system [dorsomedial (DM) striatum; action-outcome; goal-directed behavior] and sensorimotor/habit system [dorsolateral (DL) striatum; stimulus-response; habit-like behavior] are shown, of which striatal areas are involved in instrumental behavior. Thus far, we have not trained animals to the point of showing habitual behavior. Arrows indicate mutual interactions between midbrain dopaminergic areas and striatal areas, while the dotted arrows indicate disinhibition of dopaminergic areas by striatal areas [see Yin and Knowlton (2006) for discussion]. Dopaminergic projections to the prefrontal cortex are not shown. Also the interaction with the serotonergic (5-HT) system is not shown (see for discussion: Homberg et al., 2008; Koot et al., 2012; van den Bos et al., 2013b). Abbreviations: amy, amygdala; OFC, orbitofrontal cortex; IL, infralimbic cortex; PrL, prelimbic cortex; SI/MI, primary and sensory motor cortices; VTA, ventral tegmental area; SNPc, substantia nigra pars compacta.

Both the human IGT and our rodent version of the IGT are associative learning tasks which tap into learning-related processes under conditions of uncertainty without any prior training, i.e., in the very early stages of processing information, and subsequently organizing a consistent behavioral response toward the best long-term option. Therefore, it is critical to assess to what extent neural findings underlying task-performance relate to other paradigms, which use more extensive training protocols. In such paradigms, animals may have acquired competing responses during earlier training that affect subsequent behavior and activation of structures (Rivalan et al., 2011; see New Directions for the r-IGT).

## **NEW DIRECTIONS FOR THE r-IGT**

The r-IGT has contributed to understanding neurobiological mechanisms of how subjects may arrive at the best long-term option. Thus far, we have not systematically investigated the role of hunger levels in decision-making behavior. In our experiments we have used a very moderate level of food deprivation (90–95% of free feeding weight). However, increasing levels of deprivation may lead to different behavioral outcomes. It is known that hunger levels (or current energy budget) have an effect on decision-making, exploration, impulsivity and risk-taking behavior (Krebs and Davies, 1993; Inglis et al., 1997, 2001; de Visser, 2003; Koot et al., 2009b; Proctor, 2012). Thus, in the r-IGT both hunger levels before the task and increasing satiation during the task may have an effect on subsequent choices made. For instance, when subjects are extremely hungry they may become more risk-taking and focus on short-term rather than long-term options. The insular cortex may play a role in shifting between these behavioral strategies, as this structure has been implicated in interoceptive awareness, homeostatic control and energy expenditure (Butti and Hof, 2010; Prévost et al., 2010; Craig, 2011). Furthermore, the insular cortex has connections with the dorsal and ventral striatum and thereby may exert an effect on immediate and long-term focus (see Tanaka et al., 2004). Moreover, we have already seen a relation between insular cortex activity and r-IGT performance in rats (van Hasselt et al., 2012; Koot et al., 2013), in line with other studies that have shown a relation between insular cortex activity and decision-making/risk-taking in rats and humans (Clark et al., 2008; Xue et al., 2010; Ishii et al., 2012). Thus, one direction for future research using the r-IGT may be to study the role of the internal state and insular cortex activity in decision-making.

In the r-IGT new information is acquired which is not integrated with earlier obtained information. However, in real "rodent" life, decision-making is an ongoing process of using earlier acquired information, checking/updating "known" options, and deciding to explore new options should they occur. More precisely, real-life decision-making consists of coding the value of options, assessing the overall value of the environment (rich/poor) and assessing whether to engage with a current option or move to another location (see Kolling et al., 2012). Studies in humans and animals have shown that the ACC may be critically involved in assessing levels of energy expenditure of instrumental behavior or actions in relation to reward, i.e., in assessing physical or action-related costs (Walton et al., 2003; Rudebeck et al., 2006; Croxson et al., 2009; Prévost et al., 2010; Cowen et al., 2012; Kolling et al., 2012). These studies have in addition shown that the OFC and VMPFC are more involved in delays and probabilities related to reward and punishments. In line with this, both we (probabilities; e.g., de Visser et al., 2011b; van Hasselt et al., 2012) and Rivalan et al. (2011; delays) have seen only little correlation of c-Fos expression in cingulate areas with task-performance or effects of lesions of cingulate areas on task performance. Thus, costs may be dissected into different components with different underlying neural structures: physical (or foraging) costs related to instrumental behavior/actions associated with activity in the ACC, and costs associated with delays to reward and frequencies of punishments/omissions associated with activity in the OFC and VMPFC. From this perspective the barrier-climbing based decision-making task that we have used earlier (van den Bos et al., 2006c) may be remodeled to assess the effects of physical costs on IGT-like performance. In addition, new tasks may be developed. For instance, animals could first learn the value of different options in an environment (costs associated with frequencies/delays) and subsequently be presented with a choice between a pair of options with a relatively low pay-off (but one slightly better than the other) and a pair of options with a relatively high pay-off (but one slightly better than the other) that is associated with a physical (or foraging) cost, for instance, climbing barriers; or with a choice between a known pair and a completely novel option associated with a foraging cost. Recently, such paradigms in humans have dissected the role of the ACC (engage or leave; foraging decision) and VMPFC (decision based on differences within a pair of options) in decision-making behavior (Kolling et al., 2012). Thus, a direction for future research using the r-IGT may be to study the role of foraging decisions, foraging costs and ACC activity in decision-making.

## **FINAL REMARKS**

Here we discussed a few aspects related to IGT-like decision-making, i.e., decision-making in a context of variable wins and losses, conflicting short-term and long-term pay-offs, and uncertainty of outcomes. Our interest in engaging with the IGT was fuelled by questions related to understanding the mechanisms underlying long-term successful foraging behavior in animals (Spruijt et al., 2001; van den Bos, 2004). The r-IGT that we developed has contributed to understanding the interaction between the emotional system and the cognitive control system as crucial systems in this respect. Recently, we have discussed how to bridge the gap between these mechanisms and evolutionary models that focus on the function or long-term consequences of behavior (van den Bos et al., 2013c). Along with understanding the role of the internal state and understanding different task-related costs, this will be one of the challenges for future research.

## **ACKNOWLEDGMENTS**

The authors wish to thank the many students who have performed experiments in the course of their Master program. In particular, they wish to thank Wilma Lasthuis, Sietse Jonkman and Esther den Heijer, whose help has been invaluable in setting up the r-IGT in the early stages. Furthermore, they wish to thank the technicians Annemarie Baars, Peter Hesseling, Marla Lavrijsen and Jose van 't Klooster for their expert technical assistance over the years. In addition, they would like to thank their colleagues for constructive discussions and collaborations in the different stages of this research, in particular Bart Houx, Berry Spruijt, Denise de Ridder, Judith Homberg, Walter Adriani, Gianni Laviola, Louk Vanderschuren, and Marian Joels. They would also like to thank the reviewers of this MS for their constructive comments that helped in focussing this paper.

Finally, the senior author of this paper (RvdB) wishes to dedicate this MS to the late professor Alexander Cools; not only for his long-lasting inspiration in the field of behavioral neuroscience but also for pointing out the work by Damasio when discussing the role of emotions in the economy of behavior.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 September 2013; accepted: 23 February 2014; published online: 18 March 2014.*

*Citation: van den Bos R, Koot S and de Visser L (2014) A rodent version of the Iowa Gambling Task: 7 years of progress. Front. Psychol. 5:203. doi: 10.3389/fpsyg.2014.00203*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 van den Bos, Koot and de Visser. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Staying and shifting patterns across IGT trials distinguish children with externalizing disorders from controls

#### *Isabela Sallum<sup>1</sup>\*, Fernanda Mata<sup>1,2</sup>, Débora M. Miranda<sup>1</sup> and Leandro F. Malloy-Diniz<sup>1</sup>*

*<sup>1</sup> Laboratório de Investigações Neuropsicológicas, Instituto Nacional de Ciência e Tecnologia de Medicina Molecular, Faculdade de Medicina – Universidade Federal de Minas Gerais, Belo Horizonte, Brazil*

*<sup>2</sup> Faculty of Medicine, School of Psychology and Psychiatry, Monash University, Melbourne, VIC, Australia*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*Krishna P. Miyapuram, Indian Institute of Technology Gandhinagar, India*

*Maggie E. Toplak, York University, Canada*

#### *\*Correspondence:*

*Isabela Sallum, Laboratório de Investigações Neuropsicológicas, Instituto Nacional de Ciência e Tecnologia de Medicina Molecular, Faculdade de Medicina – Universidade Federal de Minas Gerais, Av. Alfredo Balena, 190, Belo Horizonte, MG 30130-100, Brazil e-mail: belasallum@gmail.com*

The Iowa Gambling Task (IGT) is the most widely used instrument in the assessment of affective decision-making in several populations with frontal impairment. The standard performance measure on the IGT is obtained by calculating the difference between the advantageous and the disadvantageous choices. This standard score does not allow the assessment of the use of different strategies to deal with contingencies of gains and losses across the task. This study aims to compare the standard score method used in the IGT with a method that analyses the patterns of staying and shifting among different decks across the 100 choices, considering contingencies of choices with and without losses. We compared the IGT performance of 24 children with externalizing disorders (Attention Deficit Hyperactivity Disorder and Oppositional Defiant Disorder) and 24 healthy age-matched children. The analyses of the standard score across all blocks failed to show differences between children with externalizing disorders and control children. However, healthy children showed a pattern of shifting more from disadvantageous decks to advantageous decks and choosing more consecutive cards from the advantageous decks across all blocks, independently of the contingency of losses. On the other hand, children with externalizing disorders presented a pattern of shifting more from advantageous decks to disadvantageous ones in comparison to healthy children and repeatedly chose cards from the B deck across all blocks. These findings show that even though differences between groups might not be found when using the standard analyses, a different type of analysis might be able to show distinct strategies in the execution of the test.

**Keywords: attention deficit hyperactivity disorder, oppositional defiant disorder, iowa gambling task, strategy, decision making, externalizing disorders**

## **INTRODUCTION**

Children diagnosed with Attention Deficit Hyperactivity Disorder (ADHD) have been characterized as poor decision makers whose response in decisions involving risk is guided by attractive immediate choices independent of their negative outcomes in the long term (Barkley, 1997). For instance, ADHD patients are at greater risk of accidental injuries in childhood (Byrne et al., 2003). In adolescence and adulthood, ADHD patients have been found to have an increased likelihood to impulsively quit a job (Halmøy et al., 2009), to express aggressive behaviors in response to driving-related anger and to experience crash-related outcomes (Richards et al., 2006), and to experience antisocial activities and arrests as a consequence of illegal drug use (Barkley et al., 2004). Several studies have demonstrated the impulsive, immediacy-driven response style of ADHD children in a laboratory setting using the Iowa Gambling Task (IGT) and child-friendly versions of this task (Garon and Moore, 2006; Malloy-Diniz et al., 2008; Masunami et al., 2009). Nevertheless, some studies did not find affective decision-making deficits in ADHD children using this same instrument (Geurts et al., 2006; Suhr et al., 2008; Hobson et al., 2011; Ibáñez et al., 2011).

The IGT (Bechara et al., 1994) is a measure of affective decision-making under uncertainty that is well known worldwide, and it has become available as a clinical instrument in the past decade (Bechara, 2007). In the task, participants are given a \$2000 loan of play money and are instructed to win as much money as possible by repeatedly choosing cards from four different decks. The expected value of the decks varies such that two decks are associated with high immediate gains, but repeated selections result in financial loss (disadvantageous decks, A and B). Conversely, the other two decks are associated with low immediate gains, but repeated selections result in financial gain in the long run (advantageous decks, C and D).

Standard measures frequently used for analyzing IGT performance combine the difference between total advantageous and disadvantageous cards selected throughout the task and the pattern of this difference across five blocks of 20 trials over the course of the 100 card selections (Bechara et al., 1998). Other outcome measures used for analysing IGT performance include total money won (van den Bos et al., 2006); total number of cards selected from individual decks (Chiu and Lin, 2007); comparison between the number of cards selected from decks A and C (high-frequency losses) and decks B and D (low-frequency losses) (Chiu and Lin, 2007); and analyses of deck selection in all 100 trials vs. the last 50 trials (Rocha et al., 2011).

However, it has been demonstrated that the most used performance score (the simple difference score between advantageous and disadvantageous choices) has important limitations (Buelow and Suhr, 2009; Ferguson et al., 2009; Visagan et al., 2012). These limitations arise because the score only takes into account long-term outcomes (Horstmann et al., 2012) and ignores the strategy used by the participant during the task. For instance, participants who do not adopt any strategy during the task might have a score close to or even above zero if they choose cards randomly and do not show a preference for one of the decks. On the other hand, participants who choose predominantly cards from the disadvantageous decks in the first block of the IGT and demonstrate a slow and gradual preference for the advantageous decks over the task can have a lower score compared to participants who choose randomly. Furthermore, it should be noted that none of the outcome measures mentioned above allows any interpretation of shifts between decks and stays. Even though the search for more detailed methods to analyze IGT performance in different clinical populations characterized by orbitofrontal cortex deficits has received attention in the past decade, to our knowledge no study has employed an analysis based on shift frequencies between the decks and stays to investigate IGT performance in ADHD children.
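The limitation described above can be made concrete with a short worked calculation. It assumes a participant who picks uniformly at random among the four decks on each of the 100 trials, and a hypothetical gradual learner who makes 4, 8, 10, 12 and 13 advantageous choices in the five 20-trial blocks; these block counts are invented purely for illustration and are not taken from any data set.

$$
\mathbb{E}\left[\text{net}_{\text{random}}\right] = 100 \cdot \tfrac{1}{2} - 100 \cdot \tfrac{1}{2} = 0,
\qquad
\text{net}_{\text{learner}} = (4 + 8 + 10 + 12 + 13) - (16 + 12 + 10 + 8 + 7) = 47 - 53 = -6 .
$$

Under these assumptions the learner, who clearly develops a preference for the advantageous decks, ends with a lower net score than the expected score of a participant with no strategy at all.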

We propose that the inconsistent findings regarding decision-making deficits in ADHD children might be at least partially explained by the often exclusive use of the standard net score to compare IGT performance between children with externalizing disorders and typically developing children. It should be noted that these conflicting findings raise questions about the appropriateness of this instrument in ADHD diagnosis (Buelow and Suhr, 2009) and should be investigated in a more detailed way. We hypothesized that ADHD children present difficulty in using the information about the gain and loss aspects of the decks to efficiently select cards from the advantageous decks throughout the task. As pointed out by Meel et al. (2005), advantageous decision-making requires frequent monitoring and updating of current strategies to take into account new information. The examination of the appropriateness and success of performance plays an important role in determining and implementing behavioral adjustments (Ridderinkhof et al., 2004). Importantly, it has been suggested that online monitoring of external feedback may be relatively preserved in ADHD children (Meel et al., 2005; Groen et al., 2013), although they fail to properly utilize internal feedback to adjust their current response strategies.

Given these findings, it is important to compare the standard score method most used to analyze IGT performance with a method that analyzes the patterns of staying and shifting among different decks considering contingencies of choices with and without losses. This comparison could help to investigate whether this alternative analysis method is capable of characterizing more accurately the decision-making deficits of children with externalizing disorders.

## **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Twenty-four children diagnosed with externalizing disorders from a public health service (Attention Deficit Hyperactivity Disorder and/or Oppositional Defiant Disorder; 6 girls; Mean age = 10.04 years, *SD* = 1.654) and 24 age-matched controls (9 girls; Mean age = 10.29 years, *SD* = 1.546), all ranging from 7 to 14 years old, participated in the present study. Clinical diagnoses were made by a psychiatrist using the K-SADS-PL (Schedule for Affective Disorders and Schizophrenia for School-Age Children, Present and Lifetime Version; Kaufman et al., 1997). Of our clinical sample, 83% met criteria for ADHD only (20 children, 7 classified with Predominantly Inattentive subtype and 13 classified with Combined subtype), 4% met criteria for ODD only (1 boy) and 13% met criteria for both ADHD and ODD (3 children, 1 classified with Predominantly Hyperactive subtype and 2 classified with Combined subtype). The participants had similar socioeconomic backgrounds (as measured by the Brazilian Criterion of Economic Classification; see the Measures section), with predominantly middle to low socioeconomic status, and attended public schools, except for one boy in the clinical group who attended a private school. The children from the clinical group did not take their medication for 24 h before the assessment.

#### **MEASURES**

#### *The Brazilian Criterion of Economic Classification (CCEB)*

Socioeconomic status was measured using the CCEB (Brazilian Research Enterprises Association; ABEP, 2008), a widely used measure of the purchasing power of families living in urban areas in Brazil. The questionnaire assesses available resources at home and the educational level of the householder, resulting in a scale ranging from 0 to 46. The families are further classified into eight economic classes, from top to bottom: A1 (42–46 points), A2 (35–41), B1 (29–34), B2 (23–28), C1 (18–22), C2 (14–17), D (8–13), and E (0–7). Our sample had a mean of 20.33 (*SD* = 5.164). Only one child from the control group was classified as being part of class E. The others ranged from classes B2 to D. There were no differences between groups.

#### *Standard IGT analyses*

A computerized version of the IGT developed by Malloy-Diniz et al. (2007) for the Brazilian population was used. For the standard analyses of the IGT, the choices across the task are divided into 5 blocks of 20 trials each, and the measure analysed is the proportion of choices in advantageous decks (C and D) minus disadvantageous decks (A and B) across the task.
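The block-wise standard score described above can be sketched in a few lines of Python. This is a minimal illustration rather than the scoring code of the computerized task: the function name `net_scores_by_block` and the example sequence are hypothetical, and the sketch counts choices (advantageous minus disadvantageous); dividing each value by the block size would give the corresponding difference in proportions.

```python
from typing import List

ADVANTAGEOUS = {"C", "D"}      # low immediate gain, long-run gain
DISADVANTAGEOUS = {"A", "B"}   # high immediate gain, long-run loss


def net_scores_by_block(choices: List[str], block_size: int = 20) -> List[int]:
    """Return advantageous-minus-disadvantageous counts for each block."""
    scores = []
    for start in range(0, len(choices), block_size):
        block = choices[start:start + block_size]
        adv = sum(deck in ADVANTAGEOUS for deck in block)
        dis = sum(deck in DISADVANTAGEOUS for deck in block)
        scores.append(adv - dis)
    return scores


# Hypothetical 100-trial sequence, invented for illustration only.
example = ["A", "B"] * 10 + ["A", "C"] * 10 + ["C", "D"] * 30
print(net_scores_by_block(example))  # [-20, 0, 20, 20, 20]
```

Summing the five block scores gives the overall net score for the task (40 for this invented sequence).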

#### *Strategy use analyses—Staying and Shifting patterns in the IGT and deck preferences*

In order to verify different strategies used by the two groups in the IGT, analyses of the patterns of staying and shifting among different decks across the 100 choices were done for each participant. For these analyses, we considered how many times each participant would choose to stay in a certain deck or shift to another deck according to the presence or absence of losses after each choice. Staying was defined as choosing the same deck immediately after this deck was chosen (for example, choosing the A deck right after this deck was chosen). Shifting was defined as choosing a different deck from the immediately preceding one (for example, choosing the B deck after choosing the A deck). Different levels of complexity were encompassed, ranging from considering only the number of overall choices of staying and shifting, to considering patterns of staying and shifting according to different types of decks (advantageous × disadvantageous; high-frequency losses decks × low-frequency losses decks), different decks (A, B, C, D) and contingencies of losses (with × without losses). The number of overall choices in each separate deck was also analysed in order to test whether the groups would differ in their preferences.

Further analyses were run in order to identify the different strategies used by the groups across the 5 different blocks. For that, eight conditions of shifting/staying were considered based on the division of decks into advantageous and disadvantageous decks: (1) staying in an advantageous deck without losses; (2) staying in an advantageous deck after losses; (3) staying in a disadvantageous deck without losses; (4) staying in a disadvantageous deck after losses; (5) shifting from an advantageous deck without losses; (6) shifting from an advantageous deck after losses; (7) shifting from a disadvantageous deck without losses; (8) shifting from a disadvantageous deck after losses.

During the task, the choices each participant makes define how likely this person is to receive a punishment (for example, choosing predominantly from decks A and C will lead to a higher chance of encountering losses, while choosing predominantly from decks B and D will lead to a smaller chance); therefore, the analyses were done using proportions. For example, for condition 1, the raw number of choices to stay in an advantageous deck without losses was divided by the overall number of choices without losses. This was done for all of the conditions. This method of analysis also allows comparison between conditions without losses and after losses.
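The eight conditions and their proportions can be sketched as follows. This is an illustrative reading of the procedure described above, not the code used in the study: the function name `condition_proportions`, the `(deck, had_loss)` input format and the example data are hypothetical, and the sketch assumes that the denominator is the number of choices made after a trial without (or with) a loss, so the final trial, which has no following choice, does not contribute.

```python
from collections import Counter
from typing import Dict, List, Tuple

ADVANTAGEOUS = {"C", "D"}


def condition_proportions(trials: List[Tuple[str, bool]]) -> Dict[int, float]:
    """Map conditions 1-8 to proportions, given ordered (deck, had_loss) trials."""
    counts: Counter = Counter()
    n_without = 0  # choices made right after a trial without a loss
    n_after = 0    # choices made right after a trial with a loss
    for (prev_deck, prev_loss), (deck, _) in zip(trials, trials[1:]):
        stayed = deck == prev_deck
        from_adv = prev_deck in ADVANTAGEOUS
        # Odd condition numbers are "without losses"; the next even one is "after losses".
        if stayed:
            base = 1 if from_adv else 3
        else:
            base = 5 if from_adv else 7
        counts[base + int(prev_loss)] += 1
        if prev_loss:
            n_after += 1
        else:
            n_without += 1
    proportions = {}
    for condition in range(1, 9):
        denominator = n_after if condition % 2 == 0 else n_without
        proportions[condition] = counts[condition] / denominator if denominator else 0.0
    return proportions


# Hypothetical six-trial excerpt, invented for illustration only.
example_trials = [("A", False), ("A", True), ("C", False),
                  ("C", False), ("D", True), ("D", False)]
props = condition_proportions(example_trials)
print(props)
```

Because each choice after a no-loss trial falls into exactly one of conditions 1, 3, 5 and 7, those four proportions sum to 1, and likewise for conditions 2, 4, 6 and 8 after losses. This is what makes the two contingencies comparable even when participants encounter different numbers of losses.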

## **RESULTS**

To analyse the standard measure of the IGT (number of advantageous choices minus number of disadvantageous choices at each block), a 2 (groups) × 5 (blocks) mixed-model analysis of variance (ANOVA) was run. Huynh-Feldt corrections were used since the sphericity assumptions were violated. No main effects or interactions were found to be significant. The effect of group was not significant, *F*(1, 46) = 0.052, *p* = 0.821. These results thus show that, according to this analysis, children with externalizing disorders presented a similar performance when compared to healthy controls.

Analyses of overall differences in shifting/staying and deck preferences among the groups were run using the Kolmogorov-Smirnov test, showing 16 statistically different variables (see **Table 1**). These first analyses showed that healthy children overall chose to stay in any given deck more often than children with externalizing disorders. Healthy children would also stay more in advantageous decks, regardless of the presence or absence of losses, and would shift more from disadvantageous decks to advantageous ones whenever there were penalties (losses). On the other hand, the clinical group was more prone to shift from advantageous decks to disadvantageous ones and chose more from deck B in comparison to healthy controls. When analyzing high-frequency losses decks (Hfl) and low-frequency losses decks (Lfl), again the control group would stay more in any of these conditions in comparison to the clinical group. This is probably an effect of the overall preference of the control group for staying in any condition in comparison to the clinical group. Healthy children would also shift more from one Lfl deck to another Lfl deck, showing a stronger preference than the clinical group for choosing cards with a low frequency of losses.

*Mdn, median; m, mean; sd, standard deviation; K–S, Kolmogorov–Smirnov; p, p-value; Adv, advantageous decks (C and D); Disadv, disadvantageous decks (A and B); Hfl, high-frequency losses decks (A and C); Lfl, low-frequency losses decks (B and D). The bold values indicate the choices that were more frequent for the clinical group in comparison to the control group.*

*ADHD, Attention Deficit Hyperactivity Disorder; ODD, Oppositional Defiant Disorder.*

A 2 (groups) × 4 (conditions 1–4) × 5 (blocks) three-way mixed-model analysis of variance (ANOVA) was conducted to analyse differences in staying across blocks. Sphericity was not assumed and Huynh-Feldt corrections were used. There was a main effect of conditions, *F*(2.313, 106.390) = 4.277, *p* = 0.012, and of blocks, *F*(3.728, 171.478) = 7.557, *p* < 0.001, and only a non-significant trend toward a two-way interaction between groups × blocks, *F*(3.728, 171.478) = 1.936, *p* = 0.111. Further analyses were run to compare the choices between groups for each condition separately using a Bonferroni-corrected *p*-value of 0.0125 (one analysis for each condition) in order to control for type 1 error. There was only a borderline significant interaction between groups × blocks for condition 1 (staying in an advantageous deck without losses), *F*(4, 184) = 2.629, *p* = 0.036, showing that the pattern of choices for this condition would only change across blocks for the clinical group. As can be seen in **Figure 1**, children from the clinical group would increase their choices for staying in an advantageous deck without losses across the blocks, whilst children from the control group would already start the task choosing this condition more often. Furthermore, analyses of simple effects were run verifying each condition for each group separately (Bonferroni-corrected *p*-value of 0.0125). For some of the analyses, sphericity was not assumed and Huynh-Feldt corrections were used. For the control group, the number of choices for staying in a disadvantageous deck without losses increased across blocks, *F*(3.059, 70.361) = 4.045, *p* = 0.01, which was not observed for the clinical group since they would already start the task choosing this condition more often. On the other hand, the clinical group presented a borderline significant increase across blocks in choosing to stay in a disadvantageous deck after losses, *F*(4, 92) = 2.838, *p* = 0.029, while this was not observed for the control group. Further analyses comparing each condition for each group were run, but no significant differences were found.

**healthy controls and children with externalizing disorders (ADHD and ODD). (A)** shows the choices in the staying conditions; Condition (1) staying in an advantageous deck without losses/overall choices without losses; Condition (2) staying in an advantageous deck after losses/overall choices after losses; Condition (3) staying in a disadvantageous deck without losses/overall choices without losses; Condition (4) staying in a disadvantageous

losses/overall choices without losses; Condition (6) shifting from an advantageous deck after losses/overall choices after losses; Condition (7) shifting from a disadvantageous deck without losses/overall choices without losses; Condition (8) shifting from a disadvantageous deck after losses/overall choices after losses. *ADHD, Attention Deficit Hyperactivity Disorder; ODD, Oppositional Defiant Disorder*.

To analyse the shifting conditions, a 2 (groups) × 4 (conditions 5 to 8) × 5 (blocks) three-way mixed-model analysis of variance (ANOVA) was conducted. Sphericity was not assumed and Huynh-Feldt corrections were used. There was a main effect of conditions, *F*(2.681, 123.329) = 7.235, *p* < 0.001, and of blocks, *F*(3.939, 181.2010) = 9.622, *p* < 0.001; two two-way interactions, between blocks and groups, *F*(3.939, 181.2010) = 3.020, *p* = 0.020, and between blocks and conditions, *F*(9.341, 429.677) = 11.656, *p* < 0.001; and a higher order interaction between blocks, conditions and groups, *F*(9.341, 429.677) = 1.898, *p* = 0.048. Since the main effects and two-way interactions are contained in the higher order interaction, analyses of simple effects were run focusing on this interaction. Analyses of simple effects were run to compare the conditions with each other using a Bonferroni-corrected *p*-value for 6 comparisons (*p* = 0.008), but there was no significant interaction between condition × groups for any of the analyses. Furthermore, analyses were run to compare the choices between groups for each condition separately (Bonferroni-corrected *p*-value of 0.0125). There was a significant blocks × groups interaction for choices of shifting from a disadvantageous deck after losses (Condition 8), *F*(4, 184) = 3.509, *p* = 0.009, showing that the pattern of choices for this condition changed more across blocks for the clinical group, which presented a decrease. When analyzing each condition separately for each group (four analyses, Bonferroni-corrected *p*-value of 0.0125), it was shown that both groups would decrease their choices of shifting from an advantageous deck across the task [control group: *F*(4, 92) = 16.308, *p* < 0.001; clinical group: *F*(4, 92) = 8.887, *p* < 0.001]; both would also present a decrease in choosing to shift from a disadvantageous deck without losses [control group: *F*(4, 92) = 4.649, *p* < 0.001; clinical group: *F*(4, 92) = 5.205, *p* = 0.002], and both presented a change across blocks in choosing to shift from a disadvantageous deck without losses [control group: *F*(4, 92) = 6.492, *p* < 0.001; clinical group: *F*(4, 92) = 7.728, *p* < 0.001]. However, the pattern of choices for this condition was different for each group: the control group increased their choices in this condition in the beginning of the task and then maintained a constant number of shifts, while the clinical group presented an increase of choices in the beginning of the task, followed by a decrease at the end, as shown in **Figure 1**.

When both the staying and shifting analyses are taken together, it is possible to see that the children with externalizing disorders start the task shifting more and take longer to establish a pattern of staying in a deck, even though in the beginning they choose to stay more in disadvantageous decks without losses. The clinical group also shows a significant decrease in shifting across the blocks. On the other hand, even though the control group also shows a decrease in shifts across blocks, they already start the task staying more in advantageous conditions and their pattern of shifts does not change as much as that of the clinical group. Overall, the clinical group seemed to present more changes in the pattern of shifting and staying across blocks than the control group, and seemed to start using a strategy of shifting less and staying more in a deck in the last blocks.

Since the overall analyses of shifting, staying and deck preferences showed that the clinical group presented a preference for deck B in comparison to controls, a 2 (groups) × 4 (decks; A, B, C, D) × 5 (blocks) three-way mixed-model ANOVA was conducted to analyse preference for a specific deck across the task. Greenhouse-Geisser corrections were used since the sphericity assumptions were violated. No main effects or interactions were found to be significant. To further analyze a possible effect within the different groups, a 4 (decks) × 5 (blocks) two-way ANOVA was conducted separately for each group. Sphericity was assumed. For the control group, there were no significant main effects or interactions. However, for the clinical group, there was a main effect of decks, *F*(3, 69) = 4.883, *p* = 0.004. Analyses of simple effects, Bonferroni-corrected for 6 comparisons (*p* = 0.008), showed that there was a preference for choices in deck B over deck A, *F*(1, 23) = 14.796, *p* = 0.001, and B over D, *F*(1, 23) = 8.822, *p* = 0.007. The average number of choices for each deck across the blocks is shown in **Figure 2**, for the clinical and control groups.
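The Bonferroni-corrected thresholds reported in this section can be reproduced with a short calculation. The sketch below assumes a family-wise alpha of 0.05, which is not stated explicitly in the text but is consistent with the reported values of 0.0125 and 0.008; the helper name `bonferroni_alpha` is hypothetical.

```python
from math import comb


def bonferroni_alpha(n_comparisons: int, alpha: float = 0.05) -> float:
    """Per-comparison threshold under a Bonferroni correction."""
    return alpha / n_comparisons


# One analysis per condition (four conditions per ANOVA).
per_condition = bonferroni_alpha(4)

# All pairwise comparisons among four decks or four conditions: C(4, 2) = 6.
pairwise = bonferroni_alpha(comb(4, 2))

print(round(per_condition, 4))  # 0.0125
print(round(pairwise, 3))       # 0.008
```

With six pairwise comparisons the exact threshold is 0.05/6 ≈ 0.0083, which the text rounds to 0.008; the deck B over deck D contrast (*p* = 0.007) therefore falls just under it.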

## **DISCUSSION**

The present study aimed to analyse strategy use in the performance of the IGT amongst healthy children and children with externalizing disorders, comparing the standard score analyses used in the IGT with a more detailed analysis based on shift frequencies between the decks and staying frequencies in each deck. In the present study, standard performance analysis did not reveal any statistically significant difference between children with externalizing disorders and healthy controls. This finding suggests that the clinical group may not be impaired in affective decision-making, which is not in agreement with the real-life decision-making problems observed in children with externalizing disorders. Nevertheless, differences in the strategies adopted by the participants of the different groups in the execution of the task could be observed when the analysis based on shift frequencies between the decks and staying frequencies in each deck was used.

Analyzing shift and stay frequencies amongst decks showed overall that children from the control group shifted from disadvantageous decks to advantageous ones more frequently than the clinical group, while children from the clinical group shifted more often from a disadvantageous deck to another disadvantageous one whenever they had losses. Overall, staying in either type of deck was statistically more frequent in typical children than in children with externalizing disorders across the task. These findings indicate that healthy controls might choose more cards from the advantageous decks than the clinical group, even though this could not be observed in the standard performance analysis. When analyzing the overall choices of shifting and staying, the clinical group did not seem to present a clear strategy, as evidenced by the fact that shifts from deck D without losses to deck B were statistically more frequent in participants from the clinical group compared to controls. However, when observing the performance throughout the task, the clinical group seemed to start choosing to shift less and stay more in a deck in the last blocks. This shows that they might have established a strategy at the end of the task, even though it is not necessarily a good one, since they start staying more in both advantageous and disadvantageous decks and start shifting less from disadvantageous decks. It is important to note that the control group presented a significant change in the shifting conditions across blocks, but their pattern of staying and shifting did not change as much throughout the task when compared to the clinical group. These results do not show a clear strategy emerging throughout the task; instead, they show that the control group had already established a pattern of choices in the beginning of the task. This corroborates the overall analyses showing that in the entire task, healthy children choose to stay more in any given deck when compared to the clinical group.

As we hypothesized, children and adolescents with externalizing disorders seem to have some difficulty using information about the gains and losses from past choices to select cards advantageously throughout the task. The shifting patterns of the clinical group, as observed in the overall analyses, showed that they shifted to a disadvantageous deck more often than controls. This could possibly be explained by a difficulty in discriminating between the "good" and the "bad" decks of the task, as proposed by Meel et al. (2005). In a study investigating decision-making and autonomic response to reinforcement in ADHD children, Meel et al. (2005) demonstrated that this clinical population had difficulty discriminating between positive and negative outcomes associated with affective evaluation.

In addition to analyzing the shift frequencies between decks and stays, more detailed methods of analysis to investigate IGT performance have also focused on preferences for individual decks during the task. By employing such an analysis in the present study, it was found that healthy controls did not present a clear preference for a specific deck, whereas children with externalizing disorders demonstrated a preference for the deck B throughout the task.

Toplak et al. (2005) showed that both ADHD participants and healthy controls preferred cards from deck B over cards from other decks, which is partially similar to our findings. Moreover, Horstmann et al. (2012) showed that healthy young adults were more prone to choose cards from decks B and D in the IGT than cards from decks A and C, because the former present a lower frequency of losses. Overall, for decks A and C, 50% of all choices yield a loss, while for decks B and D, only 10% of choices yield a loss, although those losses are larger. The authors argued that the frequency of punishment, rather than its magnitude, seems to control gambling behavior on the IGT.

In the present study, healthy children chose to stay in any of these types of decks more often than the clinical group, probably reflecting their overall tendency to stay in any given deck more often than children with externalizing disorders. Furthermore, the preference shown by the clinical group for deck B, but not for deck D, can probably be explained by the magnitude of reinforcement, since deck B presents a reinforcement that is double that of deck D, even though its punishment is 10 times higher than that of deck D. Either a working memory issue or a higher sensitivity of children with externalizing disorders to reinforcement than to punishment might explain this preference pattern.

Regarding the first possible explanation, it should be noted that deck B demands less tracking of expected values because losses are less frequent, so selecting cards from this deck may reflect less recruitment of working memory, as highlighted by Toplak et al. (2005). It has been shown that ADHD children and adolescents perform worse than controls in tasks evaluating working memory (Nikolas and Nigg, 2013), and so do children with ODD only and with comorbid ADHD and ODD (Rhods et al., 2011). Importantly, cognitive research has shown that working memory plays an important role in subserving the active mental representation of an individual's self-regulatory goals and of the ways in which these goals can be achieved (Miller and Cohen, 2001).

Regarding the second explanation, it has been shown that ADHD children seem to be oversensitive to rewards and less sensitive to punishments (Luman et al., 2005). Luman et al. (2008) investigated the performance of ADHD children in a decision-making task involving a choice between an advantageous deck and disadvantageous decks in two conditions: one in which the frequency of penalties increased and another in which the magnitude of penalties increased. The authors found that ADHD children performed similarly to controls when the frequency of penalties increased, but did worse whenever the magnitude of penalties increased. This indicates that ADHD children are sensitive to the frequency, but not to the magnitude, of losses. The preference for deck B found in these analyses, in conjunction with the analyses of staying and shifting, shows that although the children from the clinical group stopped consistently staying in a specific deck after the second block, they maintained deck B as a reference and chose to shift from other decks to this deck.

Compared to controls, children with externalizing disorders also chose cards from deck B significantly more frequently. Toplak et al. (2005) found that ADHD adolescents selected more cards from deck B and fewer cards from deck D compared to controls, which partially confirms our findings. The wide age range of our sample may explain the absence of a preference for a specific deck in the healthy control group in the present study. Possibly, a preference for deck B would be found if only children and adolescents over 10 years old were investigated. In fact, in a study measuring affective decision-making in typical children and adolescents 8 to 17 years old, Smith et al. (2012) found that younger children failed to show a preference for either deck, whereas IGT performance of children from ages 10 to 13 was characterized by persistent selections of cards from the disadvantageous decks.

This study has important limitations. Intelligence was not assessed in the current study, although it is unlikely to have affected the present findings, since it has been shown to be unrelated to affective decision-making (Mata et al., 2013). The wide age range of the present study is also a limitation, since IGT performance in children and adolescents is known to be influenced by development (Smith et al., 2012). Another limitation is that the number of participants is small and the clinical group is very heterogeneous. It is possible that children with different ADHD subtypes, ODD only, or comorbid ADHD with ODD would show different strategies in the execution of the IGT. For example, Toplak et al. (2005) showed that participants with ADHD of the Combined subtype chose decks B and D more frequently than children with ADHD of the Inattentive subtype, while the latter chose decks A and C more often by comparison. Regardless of these limitations, this study showed that even though differences in affective decision-making between children with externalizing disorders and controls were not found using standard IGT performance analyses, a more detailed analysis that considers how the task was executed and the strategies used might be more precise in identifying patterns of performance across this task. Different authors have suggested other types of analyses of the IGT, such as the analysis of dyadic moves across blocks proposed by Ferguson et al. (2009). This analysis considers the number of times participants made one type of choice (advantageous or disadvantageous) followed by another advantageous or disadvantageous choice. This type of analysis also encompasses strategy use during the performance of the IGT. Combining different types of analyses of the IGT with other cognitive measures, such as working memory tasks, might further help to clarify performance on this task.

#### **REFERENCES**




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 June 2013; accepted: 14 November 2013; published online: 02 December 2013.*

*Citation: Sallum I, Mata F, Miranda DM and Malloy-Diniz LF (2013) Staying and shifting patterns across IGT trials distinguish children with externalizing disorders from controls. Front. Psychol. 4:899. doi: 10.3389/fpsyg.2013.00899*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Sallum, Mata, Miranda and Malloy-Diniz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The cognitive processes underlying affective decision-making predicting adolescent smoking behaviors in a longitudinal study

#### *Lin Xiao1 \*, Gilly Koritzky1, C. Anderson Johnson2 and Antoine Bechara1,3,4*

*<sup>1</sup> Department of Psychology, Dana and David Dornsife Cognitive Neuroscience Imaging Center, Brain and Creativity Institute, University of Southern California, Los Angeles, CA, USA*

*<sup>2</sup> School of Community and Global Health, Claremont Graduate University, Claremont, CA, USA*

*<sup>3</sup> Psychiatry Department and Faculty of Management, McGill University, Montreal, QC, Canada*

*<sup>4</sup> Department of Neurology, University of Iowa Hospitals and Clinics, Iowa City, IA, USA*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*Krishna P. Miyapuram, University of Trento, Italy*

*Melissa T. Buelow, The Ohio State University Newark, USA*

#### *\*Correspondence:*

*Lin Xiao, Department of Psychology, Dana and David Dornsife Cognitive Neuroscience Imaging Center, Brain and Creativity Institute, 3641 Watts Way, HNB B28C, Los Angeles, CA, USA e-mail: linxiao@usc.edu*

This study investigates the relationship between three different cognitive processes underlying the Iowa Gambling Task (IGT) and adolescent smoking behaviors. We conducted a longitudinal study of 181 Chinese adolescents in Chengdu City, China. The participants were followed from 10th to 11th grade. When they were in the 10th grade (Time 1), we tested these adolescents' decision-making using the IGT and working memory capacity using the Self-ordered Pointing Test (SOPT). Self-report questionnaires were used to assess school academic performance and smoking behaviors. The same questionnaires were completed again at the 1-year follow-up (Time 2). The Expectancy-Valence (EV) Model was applied to distill the IGT performance into three different underlying psychological components: (i) a motivational component which indicates the subjective weight the adolescents assign to gains vs. losses; (ii) a learning-rate component which indicates the sensitivity to recent outcomes vs. past experiences; and (iii) a response component which indicates how consistent the adolescents are between learning and responding. The subjective weight to gains vs. losses at Time 1 significantly predicted current smokers and current smoking levels at Time 2, controlling for demographic variables and baseline smoking behaviors. Therefore, by decomposing the IGT into three different psychological components, we found that the motivational process of weighting gains vs. losses may serve as a neuropsychological marker to predict adolescent smoking behaviors in a general youth population.

**Keywords: adolescents, smoking, decision-making, EV model, Iowa Gambling Task (IGT), longitudinal study**

## **INTRODUCTION**

Affective decision-making is one of the most important social functions in real life, enabling us to choose wisely according to long-term outcomes rather than short-term immediate rewards (Bechara, 2005). Impaired affective decision-making has been shown in a variety of neurological and psychiatric conditions such as addiction (Bechara and Damasio, 2002), obsessive-compulsive disorder (Whitney et al., 2004), pathological gambling (Cavedini et al., 2002), and schizophrenia (Sevy et al., 2007; Yip et al., 2009). Recent longitudinal studies also found that affective decision-making capability could predict relapse in addicts (De Wilde et al., 2013) and adolescent binge drinking behaviors (Xiao et al., 2009).

One of the most frequently used neuropsychological tasks to assess affective decision-making in the laboratory is the Iowa Gambling Task (IGT) (Bechara et al., 1994). Compared to other tasks, which assess brain functions related to the calculation of probability or expected value, the IGT requires participants to learn and infer outcome probabilities from their past experience (such as the rewards and punishments encountered during the task) (Bechara, 2004). Affective and emotional systems therefore play a critical role in such learning processes (Werner et al., 2009; Heilman et al., 2010). Decision-making in the IGT is guided by an emotional signal that assigns negative value to disadvantageous choices and positive value to advantageous choices, thereby steering behavior toward options that are favorable in the long term (Bechara and Damasio, 2005). Recently, the IGT or IGT-analogous tasks have been used widely in samples ranging from early adolescence to adulthood (Crone and van der Molen, 2004; Hooper et al., 2004; Overman, 2004). Research has also found that affective decision-making can be modified by social and environmental factors and is still developing during adolescence (Xiao et al., 2011).

Recently, researchers found that the IGT is a complex task involving several psychological processes (Busemeyer and Stout, 2002). Using a mathematical model, three different psychological processes can be dissociated from IGT performance: (1) a learning-rate component which indicates the sensitivity to recent outcomes vs. past experiences; (2) a motivational component which indicates the subjective weight the adolescents assign to gains vs. losses; and (3) a response component which indicates how consistent the adolescents are between learning and responding (Busemeyer and Stout, 2002). The model has been successfully used to discriminate IGT performance among different clinical groups (Stout et al., 2004; Yechiam et al., 2005). Here we apply this approach to study the effects of the psychological processes revealed by the model on the development of adolescent real-life risky behaviors, namely smoking.

Affective decision-making can be affected by other cognitive functions such as working memory (Bechara and Martin, 2004; Toplak et al., 2010). Therefore, in this study, we used the Self-ordered Pointing Test (SOPT) (Peterson et al., 2002), a task originally developed by Petrides and Milner (1982), to assess working memory capacity. In each trial, this task requires an individual to memorize up to 12 items, either visually or phonologically encoded, and hold them "online" for further operations. Because there are six trials of the SOPT, the maximum capacity is not required in the first trial but the amount of information increases cumulatively over the course of each trial. This process resembles that of transient online storage (Perry et al., 2001), or active monitoring and retrieving of the increasing amount of information (Petrides, 1995) in the concept of working memory. This task has been linked to neural activity within the Dorsolateral Prefrontal Cortex (DLPC) (Petrides et al., 1993) and has been used to assess working memory capacity in several studies (Chey et al., 2002; Pukrop et al., 2003; Chaytor and Schmitter-Edgecombe, 2004; Ward et al., 2005). Moreover, studies have shown that working memory is highly related to general cognitive functions such as reading, mathematics, and reasoning (Engle et al., 1992; Colom et al., 2004; Jarrold and Towse, 2006).

Some previous studies have found impaired affective decision-making, as measured by the IGT, in adolescent and college smokers (Xiao et al., 2008; Buelow and Suhr, 2012). However, due to the cross-sectional design of these studies, the temporal and causal relationship between neuropsychological functions and smoking behaviors remained unclear. That is, these studies could not determine whether abnormalities in smokers' neural systems were the consequence of long-term cigarette use, or whether these abnormalities reflected a developmental predisposition that led to cigarette use. Therefore, the current study tests the ability of the psychological processes of affective decision-making to account for changes in adolescents' smoking behaviors at a 1-year follow-up. Prospective cohort studies are informative for investigating a causal relationship between risk factors and adolescent substance use behaviors because they observe the same individuals longitudinally through time. We also take into account other risk factors reported in previous studies, including previous smoking behaviors, working memory capacity, and academic performance. We hypothesize that the psychological processes underlying the IGT at baseline (Time 1) would predict adolescent smoking behaviors 1 year later (Time 2), even after adjusting for baseline smoking behaviors, working memory, and academic performance.

#### **METHODS**

#### **SAMPLE**

Data collection for this study was supported by the Pacific Rim Transdisciplinary Tobacco and Alcohol Use Research Center, which is investigating social, environmental, and biological determinants of tobacco and alcohol use and abuse among youth in China. All research protocols and instruments were approved by the University of Southern California, Claremont Graduate University, and Chengdu, China CDC Institutional Review Boards. With the assistance of the Municipal Education Committee and the Chengdu Center for Disease Control and Prevention (CCDCP), in Chengdu City, Sichuan Province, four schools were recruited for the study. To ensure maximum variability across the student sample, two academic high schools, one of high- and one of low/middle academic status, and two vocational schools, one of middle- and one of low academic status, were selected. School administrators and teachers from the selected schools agreed to participate in the research after receiving a thorough explanation of the project from the CCDCP staff. One 10th grade class from each of the four schools was randomly selected, and a total of 223 students were invited to participate. Students voluntarily took part in the study and were told that they could discontinue their participation at any time. Out of that total, 16 participants at baseline (Time 1) and twenty-six in the one year follow-up (Time 2) were excluded from the data analysis due to computer malfunctions or failure to complete the survey or follow instructions on the SOPT. The analytic data set included 181 participants (81.2% of total participants) at both the baseline and year 1 study sessions.

## **MEASURES**

Baseline (Time 1) measures included two computer-assisted neurocognitive assessments and a paper-and-pencil self-report questionnaire. One year follow-up (Time 2) measures included a paper-and-pencil self-report questionnaire. The instructions for the neuropsychological tasks and the questionnaires were translated into Mandarin Chinese (the only language used in the surveys) and back-translated by the Chinese graduate students in the Pacific Rim Transdisciplinary Tobacco and Alcohol Use Research Center prior to use.

#### *Baseline measures*

*Iowa gambling task (IGT).* As described in previous studies (Bechara et al., 1994, 1999), the IGT is a computerized version of the gambling task with an automated method for collecting data. In the IGT, four decks of cards labeled A′, B′, C′, and D′ are displayed on the computer screen. The backs of the cards resemble real decks of cards. The participant starts the task with a sum of make-believe money in his or her account (i.e., \$2000), represented by a green bar that changes in length as the participant "wins" or "loses" money during the task. The participant is required to select one card at a time from one of the four decks. When the participant selects a card, a message indicating the amount of money won or lost is displayed on the screen. The pre-programmed schedules of gain and loss are controlled by the computer. Turning each card brings an immediate reward of \$100 in Decks A′ and B′ and \$50 in Decks C′ and D′. As the game progresses, there are also unpredictable losses among the cards. Total losses amount to \$1250 in every 10 cards in Decks A′ and B′ compared to \$250 in Decks C′ and D′. Decks A′ and B′ are equivalent in terms of overall potential net losses, and Decks C′ and D′ are equivalent in terms of overall potential net gains over the course of the trials. The difference is that in Decks A′ and C′ the punishments are more frequent but of smaller magnitude, whereas in Decks B′ and D′ the punishments are less frequent but of greater magnitude. Thus, Decks A′ and B′ are disadvantageous because they yield high immediate gains but greater losses in the long run (i.e., a net loss of \$250 for every 10 cards), and Decks C′ and D′ are advantageous in that they yield lower immediate gains but smaller losses in the long run (i.e., a net gain of \$250 for every 10 cards).
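The per-deck arithmetic described above can be checked with a short sketch. It uses only the figures stated in the text (reward per card and total losses per 10 cards); the dictionary layout and the function name `net_per_10_cards` are illustrative assumptions, not part of the task software, and plain letters stand in for the deck labels.

```python
# Per-deck figures taken from the task description above:
# reward per card and total losses over every 10 cards (in dollars).
DECKS = {
    "A": {"gain_per_card": 100, "loss_per_10": 1250},
    "B": {"gain_per_card": 100, "loss_per_10": 1250},
    "C": {"gain_per_card": 50, "loss_per_10": 250},
    "D": {"gain_per_card": 50, "loss_per_10": 250},
}


def net_per_10_cards(deck):
    """Net outcome over 10 cards: ten rewards minus the total losses."""
    d = DECKS[deck]
    return 10 * d["gain_per_card"] - d["loss_per_10"]


for name in DECKS:
    # Negative values mark disadvantageous decks, positive values advantageous ones.
    print(name, net_per_10_cards(name))
```

Under these figures the two high-reward decks come out at a net loss and the two low-reward decks at a net gain, which is the contrast the net score is designed to capture.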

The following variables were used for data analysis: (1) After the IGT was completed, a net score was obtained by subtracting the total number of selections from the disadvantageous decks (A′ + B′) from the total number of selections from the advantageous decks (C′ + D′). (2) In light of more recent evidence that individuals have a preference for decks with infrequent punishments (Decks B′ and D′) (Overman et al., 2004; Buelow and Suhr, 2012), we calculated scores for each of the four decks. (3) IGT net scores for the first 40 trials and the latter 60 trials were obtained, given that decision-making differs between the first trials (decision-making under ambiguity) and the latter trials (decision-making under risk) (Brand et al., 2006, 2007; Buelow and Suhr, 2012). (4) Parameters of the revised expectancy-valence model over 100 trials were calculated. In the modeling we employed the revised *Expectancy Valence* model (rEV; Busemeyer and Stout, 2002; Yechiam et al., 2005). This is a learning model that predicts the next choice ahead in repeated choice-making. The model has three components, each represented by an estimated parameter.

(a) Relative weight to gains and losses, measured by the attention-weight parameter. The subjective evaluation of the gains and/or losses obtained upon making a choice is called a valence, and denoted *v*(*t*). It is calculated as a weighted average of the gains and losses resulting from the chosen option in each trial *t*.

$$v(t) = w \cdot \mathrm{win}(t) - (1 - w) \cdot \mathrm{loss}(t),$$

where *win (t)* and *loss (t)* are the amounts of money won or lost on trial *t*; and *w* is the attention weight parameter (0 ≤ *w* ≤ 1).

(b) Relative sensitivity to recent vs. past outcomes, measured by the recency parameter. The outcomes produced by each alternative *j* are summarized by an expectancy score, denoted *Ej(t)*, and updated as follows:

$$E_j(t) = E_j(t-1) + \phi \cdot [v(t) - E_j(t-1)],$$

where *j* is the selected alternative. The recency parameter, φ, describes the degree to which subjective expectancies reflect the influence of the most recent relatively to more distant past experiences (0 ≤ φ ≤ 1).

(c) The effect of expectancies on further choice, measured by the choice consistency parameter. The probability of choosing an alternative is a strength ratio of the expectancy of that alternative, relative to all choice options (using Luce's rule):

$$\Pr[G_j(t+1)] = \frac{e^{\theta(t) \cdot E_j(t)}}{\sum_k e^{\theta(t) \cdot E_k(t)}},$$

where Pr[G*j*(*t* + 1)] is the probability that alternative *j* will be selected on trial *t* + 1. The term θ(*t*) controls the consistency between the choice probabilities and the expectancies, where θ(*t*) = *c*<sup>5</sup> − 1, and *c* is the choice consistency parameter (0 ≤ *c* ≤ 10).
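The three components above can be combined into a single-trial update, sketched below under stated assumptions. The function names, the example parameter values (`w`, `phi`, `theta`), and the payoff amounts are hypothetical and chosen only for illustration; the sensitivity term is passed in directly as `theta` rather than derived from *c*, and no rescaling of money amounts is applied. This is not the authors' estimation code.

```python
import math


def valence(win, loss, w):
    """v(t) = w * win(t) - (1 - w) * loss(t); loss is given as a positive amount."""
    return w * win - (1.0 - w) * loss


def update_expectancy(expectancies, chosen, v, phi):
    """Delta-rule update of the chosen deck's expectancy; other decks are unchanged."""
    updated = list(expectancies)
    updated[chosen] += phi * (v - updated[chosen])
    return updated


def choice_probabilities(expectancies, theta):
    """Ratio-of-strengths rule: exp(theta * E_j) normalized over all decks."""
    scaled = [theta * e for e in expectancies]
    shift = max(scaled)  # subtracting the maximum avoids overflow, result is unchanged
    strengths = [math.exp(s - shift) for s in scaled]
    total = sum(strengths)
    return [s / total for s in strengths]


# Hypothetical trial: the deck at index 1 pays 100 and also incurs a 1250 loss.
E = [0.0, 0.0, 0.0, 0.0]
v = valence(win=100, loss=1250, w=0.5)
E = update_expectancy(E, chosen=1, v=v, phi=0.2)
probs = choice_probabilities(E, theta=0.01)
print(v, E, probs)
```

In this sketch a large loss lowers the expectancy of the chosen deck, so its predicted choice probability on the next trial falls below that of the untouched decks; a larger `w` would weaken that effect by discounting losses.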

The accuracy of the model is assessed by comparing its prediction to that of a baseline model. In the baseline model, choices are estimated based on the respondent's mean choices. The estimation procedure is described in detail in Busemeyer and Stout (2002). The statistical test used for comparing the fit of the models was the Bayesian Information Criterion (BIC) for log likelihood differences. Positive values of the BIC statistic indicate that the cognitive model performs better than the baseline model. In the present study, the mean BIC value was 6.24, hence the model fit was adequate.

*Self-ordered pointing test (SOPT).* We used a computerized version of the SOPT (Peterson et al., 2002), which was based upon a task originally developed by Petrides and Milner (1982). The SOPT has both verbal and non-verbal components with 3 trials of each. In the verbal component, subjects view pictures of concrete, nameable objects (clock, book, bus, etc.); whereas in the non-verbal component, subjects view abstract designs that are difficult to name or encode verbally. In each trial, 12 pages are presented sequentially, with each page depicting the same 12 pictures but in a different spatial arrangement on each page. Subjects are instructed to point to a different picture in each presentation. To effectively select a different picture each time, subjects must retain pictures in working memory. The total number of correct selections of different pictures represents the working memory score. There is a maximum possible score of 12 on each trial and a total of 72 for all 6 trials. In our study, the internal consistency across the 6 trials was 0.86.
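The scoring rule described above can be sketched as follows, assuming each trial is recorded as the sequence of pictures a subject pointed to across the 12 pages. The function names and the example data are hypothetical; a selection counts as correct when the picture has not been chosen earlier in the same trial, so the trial score equals the number of distinct pictures selected.

```python
def sopt_trial_score(selections):
    """Number of distinct pictures selected within one trial (maximum 12)."""
    return len(set(selections))


def sopt_total_score(trials):
    """Sum of trial scores across all trials (maximum 72 for 6 trials)."""
    return sum(sopt_trial_score(trial) for trial in trials)


# Hypothetical trial: pictures are numbered 0-11, and pictures 3 and 7 are each repeated once.
example_trial = [0, 1, 2, 3, 3, 4, 5, 6, 7, 7, 8, 9]
perfect_trial = list(range(12))

print(sopt_trial_score(example_trial))
print(sopt_total_score([perfect_trial] * 6))
```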

#### *Questionnaire measurements*

*Current smoking.* Current smoking status was assessed with this item: "During the past 30 days, have you smoked cigarettes?" Those who indicated smoking in the past 30 days were classified as current smokers. Current smoking levels were assessed with this item: "During the past 30 days, on the days you smoked, how many cigarettes did you smoke per day?" The response options ranged from "I did not smoke cigarettes during the past 30 days" and "Less than 1 cigarette per day" to "More than 20 cigarettes per day."

*School academic performance (SAP).* Students self-reported their academic performance in school by answering the following question: "What is your usual academic performance at your current school or the last school where you received grades?" The five response options ranged from: "Mostly A's, or 90 or more points, or Superior" to "Mostly F's, or Below 60 points, or Failing." A higher score represented a higher academic performance.

#### *One-year follow-up questionnaire measurements*

*Measures.* The same questions in the baseline were used to ask current smoking and current smoking levels. Those who indicated smoking in past 30-days were classified as current smokers.

### **PROCEDURES**

At baseline (Time 1), trained data collectors from the CCDCP and the University of Southern California provided written and verbal instructions to the students and administered the computer-based assessments and questionnaires in temporary computer labs set up at each school. Students completed the questionnaire and the computer-based assessments (the IGT and SOPT) during a 1 h period. All the students completed the IGT first and then finished the SOPT. Students were provided with earphones to muffle any potentially distracting noises in the environment. One year later (Time 2), students completed the follow-up questionnaire.

### **DATA ANALYSIS**

Data were analyzed with the Statistical Package for the Social Sciences for Windows, Version 17.0 (SPSS Inc., Chicago, IL). Since our sample size (*N* = 181) was relatively large, and since the residuals from the models satisfied the normality and homoscedasticity assumptions, the variables from the EV model, IGT net scores, SOPT, and SAP were treated as continuous without any transformation. The relationships between Time 1 and Time 2 smokers were analyzed using Chi-square tests separately for different levels of current smoking. Independent *t* tests were used to compare measures at Time 1 between current and non-current smokers at Time 1 and Time 2. To reveal potential predictors of current smokers/current smoking levels at Time 2, logistic/linear regression models were used with current smokers/current smoking levels (Time 2) as the dependent variable and each of the three psychological processes obtained from IGT performance (Time 1) as the independent variable, conditioning on Time 1 demographic characteristics, working memory, academic performance, and current smokers/current smoking levels.

## **RESULTS**

#### **RELATIONSHIP BETWEEN CURRENT SMOKERS TIME 1 AND TIME 2**

The relationship between smokers at baseline and at year one is shown in **Table 1**. In total, 84.5% (*N* = 153) of adolescents were non-current smokers at both Time 1 and Time 2; 9.4% (*N* = 17) and 12.2% (*N* = 22) were current smokers at Time 1 and Time 2, respectively; and 6.1% (*N* = 11) were current smokers at both Time 1 and Time 2. Smoking status at year one was significantly associated with smoking status at baseline [χ<sup>2</sup>(1) = 48.53, *P* < 0.001].
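As a worked check, the reported test statistic can be recomputed from the counts given in the text. The sketch below assumes a 2 × 2 table of Time 1 by Time 2 smoking status and a Pearson chi-square without continuity correction; the off-diagonal cells are derived by subtraction from the reported totals, and the function name `chi_square_2x2` is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: Time 1 status (non-current, current); columns: Time 2 status (non-current, current).
both_non = 153            # non-current smokers at both time points
both_smokers = 11         # current smokers at both time points
t1_only = 17 - both_smokers   # current at Time 1 only
t2_only = 22 - both_smokers   # current at Time 2 only

chi2 = chi_square_2x2(both_non, t2_only, t1_only, both_smokers)
print(round(chi2, 2))  # 48.53
```

The four cells sum to the analytic sample of 181, and the statistic agrees with the value reported for the test with one degree of freedom.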

## **MEASURES AMONG TIME 1 AND TIME 2 CURRENT SMOKERS AND NON-CURRENT SMOKERS**

**Table 2** shows the measures among Time 1 and Time 2 current smokers and non-current smokers. At Time 1, 88.2% of current smokers but only 46.3% of non-current smokers were males [χ<sup>2</sup>(1) = 10.81, *P* < 0.001]. Moreover, 76.5% of current smokers but only 42.1% of non-current smokers were vocational school students [χ<sup>2</sup>(1) = 7.35, *P* < 0.01]. However, current smokers did not differ significantly from non-current smokers on the IGT net score. Although current smokers chose more from Decks A and B and less from Decks C and D than non-current smokers, these differences were not statistically significant for any deck. There were also no differences on the three psychological processes underlying decision-making (recency, weight to gain vs. loss, and consistency) or on working memory scores. Current smokers scored significantly lower on academic performance than non-current smokers at Time 1 (*P* < 0.05). Of the current smokers at Time 1, 41.2% reported having smoked less than 1 cigarette per day in the past 30 days.

At Time 2, 81.8% of current smokers but only 45.9% of non-current smokers were males [χ<sup>2</sup>(1) = 9.97, *P* < 0.01]. Moreover, 68.2% of current smokers but only 42.1% of non-current smokers were vocational school students [χ<sup>2</sup>(1) = 5.29, *P* < 0.05]. However, current smokers did not differ significantly from non-current smokers on the IGT net score. Although current smokers chose more from Decks A and B and less from Decks C and D than non-current smokers, these differences were not statistically significant for any deck. There were also no differences on two of the psychological processes underlying decision-making (recency and consistency) or on working memory scores. However, current smokers scored significantly lower on weight to gain vs. loss and on SAP compared to non-current smokers (*P* < 0.05). Of the current smokers at Time 2, 45.5% reported having smoked 2–5 cigarettes per day in the past 30 days.

#### **BEHAVIORAL PERFORMANCE ON THE IGT**

Previous research showed that the IGT taps into two decision-making contexts: decisions under ambiguity in the first trials and decisions under risk in the latter trials (Brand et al., 2006, 2007; Buelow and Suhr, 2012). We therefore computed the original IGT net score for the first 40 cards selected and for the last 60 cards selected. For each block, we counted the number of selections from Decks A′ and B′ (disadvantageous) and the number of selections from Decks C′ and D′ (advantageous), and then derived a net score for that block [(C′ + D′) − (A′ + B′)]. A net score above zero implied that the participants were selecting cards advantageously, and a net score below zero implied disadvantageous selection.
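The block-wise net score described above can be sketched as follows, assuming a participant's 100 selections are stored in order as deck letters. The function names and the example choice sequence are hypothetical, and plain letters stand in for the deck labels.

```python
def net_score(choices):
    """(C + D) - (A + B): advantageous minus disadvantageous selections."""
    advantageous = sum(1 for c in choices if c in ("C", "D"))
    disadvantageous = sum(1 for c in choices if c in ("A", "B"))
    return advantageous - disadvantageous


def block_net_scores(choices, split=40):
    """Net scores for the first `split` trials and for the remaining trials."""
    return net_score(choices[:split]), net_score(choices[split:])


# Hypothetical participant: 40 disadvantageous picks, then 45 from C and 15 from B.
choices = ["A", "B"] * 20 + ["C"] * 45 + ["B"] * 15
first_40, last_60 = block_net_scores(choices)
print(first_40, last_60)
```

A participant like this one would show a negative net score in the first block and a positive one in the second, the pattern expected when learning takes place over the task.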

**Figure 1** presents the net scores as a function of group (non-current smokers and current smokers) and block at Time 1 and Time 2 after controlling for age, gender, and school type. The IGT performance at Time 1 is shown in the left panel of **Figure 1**. Comparison of the plots shows that current smokers at baseline did not differ from non-current smokers in the first 40 trials on the IGT scores. Although current smokers chose more from the disadvantageous decks than non-current smokers, a between-within ANCOVA did not reveal any significant effect of group (non-current smokers vs. current smokers) or any interaction between group and block after controlling for age, gender, and school type (*P* > 0.1).

#### **Table 2 | Measures in non-current and current smokers.**

*\*P < 0.05; compared with non-current smokers.*

The IGT performance at Time 2 is shown in the right panel of **Figure 1**. The group effect was not significant. However, there was a significant interaction between group and block after controlling for age, gender, and school type [*F*(1, 176) = 6.65; *P* < 0.05]. Current smokers did not differ from non-current smokers in the first 40 trials of the IGT. However, they performed significantly worse than non-current smokers in the latter trials (*P* < 0.05).

#### **VARIABLES PREDICTING CURRENT SMOKERS AT YEAR ONE**

Logistic regressions were performed to predict current smoker status at year one (model I in **Table 3**). The IGT overall net score and the three psychological variables were examined individually in four different models after controlling for demographic variables, working memory, academic performance, and baseline current smoking. Among these four predictors, only weight to gain vs. loss significantly predicted current smoker status at year one (*P* < 0.05, *OR* = 0.07, 95% *CI* = 0.01, 0.55). Baseline current smoking also significantly predicted current smoker status at year one (*P* < 0.001, *OR* = 21.65, 95% *CI* = 5.17, 90.61).
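As a reading aid for the odds ratios above, the following sketch shows how a reported OR and its 95% CI relate to the underlying log-odds coefficient. It assumes a Wald-type interval (symmetric on the log scale), which the text does not state, and the function name `log_odds_summary` is illustrative only.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Back out the log-odds coefficient and an approximate standard error
    from a reported odds ratio and its 95% confidence interval.

    Assumes a Wald-type interval, i.e., exp(coef +/- z * se).
    """
    coef = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    # Under the Wald assumption, the OR is the geometric mean of the bounds.
    geometric_mean = math.sqrt(ci_low * ci_high)
    return coef, se, geometric_mean


# Reported values for baseline current smoking (model I).
coef, se, gm = log_odds_summary(21.65, 5.17, 90.61)
```

For the baseline smoking predictor, the geometric mean of the reported bounds is about 21.64, which agrees with the reported OR of 21.65 to within rounding.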

Linear regressions were performed to predict current smoking levels at year one (model II in **Table 3**). The IGT overall net score and the three psychological variables were examined individually in four different models after controlling for demographic variables, working memory, academic performance, and baseline current smoking levels. Among these four predictors, only weight to gain vs. loss significantly predicted current smoking levels at year one (*P* < 0.05, Beta = −0.122, 95% *CI* = −0.536, −0.024). Baseline current smoking levels also significantly predicted current smoking levels at year one (*P* < 0.001, Beta = 0.643, 95% *CI* = 0.618, 0.002). Only the results of the model including weight to gain vs. loss are presented in **Table 3**.

## **DISCUSSION**

We investigated the potential contribution of three different psychological processes (recency, weight to gain vs. loss, consistency) to affective decision-making as measured by the IGT in Chinese adolescents, and their relationship to real-life risky behaviors, namely their smoking behavior, using a longitudinal study design. We found that only weight to gain vs. loss significantly predicted current smoking behavior one year later. To our knowledge, this is one of the first longitudinal studies to investigate the different psychological processes underlying affective decision-making measured by the IGT in the development of smoking behaviors among adolescents.

Previous studies show that individuals have a preference for decks with infrequent punishments (Decks B and D) (Overman et al., 2004; Buelow and Suhr, 2012). We therefore calculated scores for each of the four decks and found that current smokers chose more from Decks A and B and less from Decks C and D than non-current smokers. However, these differences were not statistically significant for any deck. Therefore, our results could not be explained by a preference of the current smokers for the decks with infrequent punishment. In this study and our previous study (Xiao et al., 2008), current smokers did not differ from non-current smokers on the IGT total net score over 100 trials. However, in this study, we found that the current smokers at year one performed worse than the non-current smokers on the latter but not the first trials of the IGT, which suggests that the current smokers showed impaired decision-making capacity, especially for decisions under risk (Brand et al., 2006, 2007; Buelow and Suhr, 2012). Consistent with these findings, by decomposing the IGT into three different psychological components, we also found that the motivational process of weight to gain vs. loss, but not the consistency and recency processes, serves as a neuropsychological marker to predict smoking behaviors one year later in the general youth population. These results also suggest that the subprocesses of affective decision-making measured by the IGT may be more sensitive indicators of adolescent risky behaviors than the IGT net score alone.

Our results were consistent with previous studies which revealed that several populations, including young polydrug users, patients with Asperger's disorder, and individuals with lesions of the right somatosensory and insular cortex, showed impairment in the motivational process of weight to gain vs. loss as measured by the IGT (Yechiam et al., 2005, 2008; Johnson et al., 2006b). Therefore, the current smokers in this general adolescent population would be similar in this respect to these young polydrug users, patients with Asperger's disorder, and individuals with lesions of the right somatosensory and insular cortex. As mentioned in Yechiam et al. (2005), the impairment in the motivational process of weight to gain vs. loss may represent difficulties in establishing an emotional representation of the different decks in the IGT. Other studies also show that the right somatosensory and insular cortex is critical for mapping somatic states and translating the raw physiological signals into what one subjectively experiences as a feeling toward the pleasures of gain or the pain of loss (Damasio, 1994; Bechara and Damasio, 2005).

#### **Table 3 | Variables predicting current smokers (model I) and current smoking levels (model II) at Time 2.**

*<sup>a</sup>Female as reference group; <sup>b</sup>Academic school as reference group. \*\*\*P < 0.001; \*P < 0.05.*

It is interesting that we found only weight to gain vs. loss, but not the consistency and recency processes, to be linked to the development of adolescent smoking behaviors. The different psychological processes underlying affective decision-making measured by the IGT may engage different neural systems. Although to our knowledge no functional imaging study has addressed this topic directly to date, one study examined how the three psychological processes underlying affective decision-making correlated with gray matter volume (GMV) in healthy controls and patients with schizophrenia (Premkumar et al., 2008). In healthy participants, weight to gain vs. loss was associated with GMV in frontal, temporal, parietal, and striatal regions. Recency was associated with GMV in temporal regions, and consistency was associated with GMV in the frontal, temporal, posterior cingulate, and occipital regions (Premkumar et al., 2008). Another study also found genetic factors related to dopaminergic and serotonergic neurotransmitter systems linked to the psychological process of weight to gain vs. loss (Sevy et al., 2006). Future functional imaging and other studies are needed to examine the distinct neural or genetic basis for the three psychological processes underlying affective decision-making.

In our study, working memory as measured by SOPT performance did not differ significantly between current smokers and non-current smokers. Although current smokers showed lower school academic performance than non-current smokers, academic performance did not predict smoking behaviors at Time 2. Considerable evidence shows that structural maturation of the brain continues well through adolescence, especially in regions and systems associated with risk and reward seeking, emotion regulation, and response inhibition (Spear, 2000; Fuster, 2002; Paus, 2005). Specifically, among the last brain regions to mature, not reaching adult levels until the 20s, are some portions of the PFC, including orbitofrontal, ventrolateral, and medial prefrontal regions (Giedd, 2004; Gogtay et al., 2004). However, studies also show that developmental increases in IGT performance could not be explained by developmental changes in working memory capacity, inductive reasoning, and behavioral inhibition (Crone and van der Molen, 2004; Hooper et al., 2004), suggesting that maturation of the ventromedial prefrontal cortex may be a developmentally distinct process from maturation of other regions of the prefrontal cortex.

One limitation of the current study is its reliance on self-reports of cigarette use, raising the question of whether the respondents reported accurately on their smoking behaviors. However, empirical studies have shown that self-reported data are, by and large, valid across racial, ethnic, and cultural groups (Wallace and Bachman, 1993; Johnston et al., 1994). Another limitation of our study is that the sample of current smokers was relatively small. However, the prevalence of cigarettes smoked per day during the past 30 days in our sample was very similar to that in other large-scale population studies of school students in China (Grenard et al., 2006; Johnson et al., 2006a). Although the rate of cigarette use in our study is lower than the overall rate in U.S. samples, it is comparable to that of Asian students reported in both national and regional studies in the United States. For example, compared to high school youth of other racial/ethnic groups, Asian American high school students smoke cigarettes at the lowest rate: 10% of Asian American students smoke cigarettes compared with 22% of white and 19% of Hispanic high school students (http://www.healthwellnc.com/TUPCHERITAGETOOLKIT/May/1Fact%20Sheets/Legacy%20Asian%20Americans%20and%20Smoking%20Fact%20Sheet.pdf). Although no legal age has been specified for cigarette use in mainland China, environmental circumstances may be more protective of children in China than in Western countries (more time spent in school and the home, less free time with peers, and less pocket money). This might help explain why in China, uptake and progression to regular smoking continue well into middle adulthood, rather than leveling off in adolescence as in the West.
Furthermore, statistical significance on both models of current smokers and smoking levels indicates that the effects are robust, and population representativeness of the sample, bolstered by inclusion of students from both major types of Chinese high schools, suggests that the findings are widely generalizable to Chinese youth. However, future studies are needed to establish replicability to other cultural/environmental settings.

In summary, by decomposing the IGT into three different psychological components, we found that the motivational process of weight gain vs. losses significantly predicted adolescent smoking behaviors one year later. Thus, distilling complex decision processes into their underlying components can shed light on real-world choices made by adolescents in the general population. As the EV model has mainly been used in the literature for characterizing the cognitive style of clinical or delinquent populations, the present work demonstrates its potential in a new field. These findings also suggest that intervention targeting the adolescent's motivational process—namely, the relative weighting of gain and loss—may help to reduce smoking behaviors at an early stage.

### **ACKNOWLEDGMENTS**

The authors would like to thank the USC TTUARC Intervention Team and the Chengdu Center for Disease Control (CDC) Team for their assistance with project coordination and data collection. We also express our gratitude to the municipal government, Health Bureau, and Education Committee in Chengdu, China for their support.

## **FUNDING**

This work was supported by the Claremont Graduate University–University of Southern California (USC) Pacific Rim Transdisciplinary Tobacco and Alcohol Use Research Center (TTUARC), funded by the National Cancer Institute (grant #1 P50 CA84735-02 to C. Anderson Johnson), the National Institute on Drug Abuse (grant #DA16708 to Antoine Bechara) and National Cancer Institute (NCI) (grant # CA152062 to Antoine Bechara).

## **REFERENCES**


86–99. doi: 10.1080/13803390500507196


alcoholics. *Eur. Addict. Res*. 19, 21–28. doi: 10.1159/000339290


Asperger's disorder: evidence from the Iowa Gambling Task. *J. Int. Neuropsychol. Soc*. 12, 668–676. doi: 10.1017/S1355617706060802


Kumari, V. (2008). Emotional decision-making and its dissociable components in schizophrenia and schizoaffective disorder: a behavioural and MRI investigation. *Neuropsychologia* 46, 2002–2012. doi: 10.1016/j.neuropsychologia.2008.01.022


analysis of decision-making processes in cocaine abusers. *Psychon. Bull. Rev*. 11, 742–747. doi: 10.3758/BF03196629


schizophrenia: a preliminary examination of the influence of tobacco smoking and relationship to Wisconsin card sorting task performance. *Schizophr. Res*. 110, 156–164. doi: 10.1016/j.schres.2009.01.012

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 June 2013; accepted: 11 September 2013; published online: 01 October 2013.*

*Citation: Xiao L, Koritzky G, Johnson CA and Bechara A (2013) The cognitive processes underlying affective decision-making predicting adolescent smoking behaviors in a longitudinal study. Front. Psychol. 4:685. doi: 10.3389/fpsyg.2013.00685*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Xiao, Koritzky, Johnson and Bechara. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Dual conception of risk in the Iowa Gambling Task: effects of sleep deprivation and test-retest gap

## *Varsha Singh\**

*Humanities and Social Science, Indian Institute of Technology Delhi, New Delhi, India*

#### *Edited by:*

*Yao-Chu Chiu, Soochow University, Taiwan*

#### *Reviewed by:*

*V. S. Chandrasekhar Pammi, University of Allahabad, India Shunsuke Kobayashi, Fukushima Medical University, Japan William Killgore, McLean Hospital, USA*

#### *\*Correspondence:*

*Varsha Singh, Humanities and Social Science, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India e-mail: vsingh.iitb@gmail.com*

Risk in the Iowa Gambling Task (IGT) is often understood in terms of intertemporal choices, i.e., preference for immediate outcomes over delayed outcomes is considered risky decision making. According to behavioral economics, healthy decision makers are expected to refrain from choosing the short-sighted immediate gain because, over time (10 trials of the IGT), the immediate gains result in a long-term loss (net loss). Instead, decision makers are expected to maximize their gains by choosing options that, over time (10 trials), result in delayed or long-term gains (net gain). However, task choices are sometimes made on the basis of the frequency of reward and punishment such that frequent rewards/infrequent punishments are favored over infrequent rewards/frequent punishments. The presence of these two attributes (intertemporality and frequency of reward) in IGT decision making may correspond to the emotion-cognition dichotomy and reflect a dual conception of risk. Decision making on the basis of the two attributes was tested under two conditions: delay in retest and sleep deprivation. An interaction between sleep deprivation and time delay was expected to attenuate the difference between the two attributes. Participants were 40 male university students. Analysis of the effects of IGT attribute type (intertemporal vs. frequency of reinforcement), sleep deprivation (sleep deprivation vs. no sleep deprivation), and test-retest gap (short vs. long delay) showed a significant within-subjects effect of IGT attribute type, thus confirming the difference between the two attributes. Sleep deprivation had no effect on the attributes, but the effect of test-retest gap and the three-way interaction between attribute type, test-retest gap, and sleep deprivation were significant. *Post-hoc* tests revealed that sleep deprivation and short test-retest gap attenuated the difference between the two attributes.
Furthermore, the results showed an expected trend of increase in intertemporal decision making at retest suggesting that intertemporal decision making benefited from repeated task exposure. The present findings add to understanding of the emotion-cognition dichotomy. Further, they show an important time-dependent effect of a universally experienced constraint (sleep deprivation) on decision making. It is concluded that risky decision making in the IGT is contingent on the attribute under consideration and is affected by factors such as time elapsed and constraint experienced before the retest.

**Keywords: Iowa Gambling Task, decision making, risk, reward–punishment, sleep deprivation, test-retest gap**

## **INTRODUCTION**

The Iowa Gambling Task (IGT; Bechara et al., 1994) is used to test a hypothesis about emotion and decision making called the somatic marker hypothesis (SMH; Damasio, 1994). The main assumption in the SMH–IGT framework is that risk is perceived in terms of its intertemporal attribute, i.e., choice of immediate as opposed to delayed reward and punishment is considered risky (Bechara et al., 2005). However, IGT task choices also differ on the basis of the frequency of immediate rewards and punishments; thus, task choices differ in two ways. To clarify, the IGT offers a choice among four decks of cards, labeled A′, B′, C′, and D′. Unlike the original paper-and-pencil based task (ABCD), the computerized task (A′B′C′D′) has increased delayed punishment and therefore it amplifies the effect of disadvantageous choices (see Bechara et al., 2000 for differences between the two variants). Unbeknown to the decision maker, decks A and B have high immediate rewards (100 points per card-pick) with 50% of cards drawn from deck A giving a loss of 35–100 points and 10% of cards drawn from deck B giving a loss of 1250 points, such that 10 cards drawn from decks A and B result in a net loss of 250 points. Decks C and D have small immediate rewards (50 points per card-pick) with 50% of cards drawn from deck C giving a loss of 25–75 points and 10% of cards drawn from deck D giving a loss of 250 points, such that 10 cards drawn from decks C and D result in a net gain of 250 points.
Therefore, the four decks differ in two ways: (a) net outcome across time (i.e., intertemporal attribute) by which decks A and B could be considered risky in the long term, whereas decks C and D could be considered safe in the long term, and (b) frequency of immediate reward and punishment notwithstanding net or long-term outcomes (i.e., frequency attribute) by which decks A and C could be perceived as risky due to frequent punishments/infrequent rewards and decks B and D could be perceived as safe due to infrequent punishments/frequent rewards.
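The net outcomes quoted above can be checked with simple arithmetic for the two decks whose losses are given as single values (decks B and D). The sketch below is illustrative only: the dictionary layout and the helper name `net_per_ten_cards` are assumptions made here, and decks A and C are omitted because the text gives their losses only as ranges.

```python
# Per-card reward, loss probability, and loss size as stated in the text
# for the two decks with a single fixed loss value.
DECKS = {
    "B": {"reward": 100, "loss_prob": 0.10, "loss": 1250},
    "D": {"reward": 50, "loss_prob": 0.10, "loss": 250},
}


def net_per_ten_cards(reward, loss_prob, loss):
    """Net outcome over 10 cards drawn from one deck."""
    n_cards = 10
    n_losses = round(n_cards * loss_prob)  # 10% of 10 cards -> 1 loss card
    return n_cards * reward - n_losses * loss


net = {deck: net_per_ten_cards(**spec) for deck, spec in DECKS.items()}

# The two classifications described in the text.
long_term_safe = {"C", "D"}          # intertemporal attribute
infrequent_punishment = {"B", "D"}   # frequency attribute
```

Deck B therefore yields 10 × 100 − 1250 = −250 points per 10 cards and deck D yields 10 × 50 − 250 = +250 points, matching the stated net loss and net gain. Deck B is risky on the intertemporal attribute but safe on the frequency attribute, which is why the two attributes can pull choices in opposite directions.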

It is commonly understood that risk perception and decision making in the IGT is governed by the intertemporal attribute (Bechara et al., 2005), and that choices on the basis of the frequency attribute have no long-term advantage (Dunn et al., 2006). Nevertheless, there have been many observations of decision making on the basis of the frequency attribute (Wilder et al., 1998; Ritter et al., 2004; Bark et al., 2005; Fishbein et al., 2005; Shurman et al., 2005; Toplak et al., 2005; van den Bos et al., 2006). This preference is incompatible with the SMH–IGT framework as demonstrated, for example, by the finding that deck B, which is considered "risky" on the basis of the intertemporal attribute, is preferred to other "safe" decks (Lin et al., 2007), whereas deck C, which is considered "safe," is avoided by healthy participants (Chiu and Lin, 2007). Furthermore, dispositional risk seekers who were assessed using a modified risk-taking scale (Domain-Specific Risk-Taking; Weber et al., 2002) preferred decks A and C and avoided decks B and D (Singh and Khan, 2008). Together, these findings suggest that, in the IGT, risk might be perceived in two ways, either by the intertemporal attribute or by frequency of reward and punishment.

This dual conception of risk in the form of two attributes (intertemporality and frequency) represents an important dichotomy of cognition-emotion in IGT decision making. Support for this dichotomy comes from dual process theories of reasoning according to which there are two systems that process information differently. One system is automatic, emotion-based, and concerned with the present, whereas the second is reflective, cognition-based, and concerned with the future (Tversky and Kahneman, 1971). Decision making on the basis of the intertemporal attribute in the IGT reflects explicit learning (Maia and McClelland, 2004), is dependent on hippocampus-mediated memory systems such as the declarative memory system (Gupta et al., 2009), engages working memory (Hinson et al., 2002), and requires cognitive processing (Stocco et al., 2009). However, decision making on the basis of the frequency attribute is attributed to automatic processing (Wilder et al., 1998; Stocco et al., 2009). These findings suggest that decision making on the basis of the intertemporal attribute is the result of activity in the cognition-based system whereas decision making on the basis of the frequency attribute may reflect activity in the emotion-based system. Indeed, Stocco et al. (2009) found a double dissociation in decision making on the basis of the two attributes (intertemporality and frequency). These researchers tested the role of cognitive resources first by introducing a secondary task during learning of the deck payoffs, and second, by restricting display of the outcome, that is, by restricting access to information about the deck payoffs. Contrary to their expectation, absence of a secondary task (working memory load) was associated with greater decision making on the basis of the frequency attribute. Thus, absence of a secondary task, assumed to benefit the cognition-based system, instead appeared to benefit the emotion-based system.

Unlike previous research, the present study was aimed at differentiating decision making on the basis of the two attributes (intertemporality and frequency) by manipulating test-retest gap and sleep deprivation, factors known to influence decision making on the IGT. The dual process theory suggests that task familiarity (e.g., at retest) is conducive to activity of the cognition-based system (rather than to activity of the emotion-based system). Accordingly, decision making on the basis of the intertemporal attribute is observed to improve at retest (i.e., preference for safe long-term advantageous decks increases at retest) (Bechara et al., 2000); this supports the contention that intertemporal decision making is cognition-based. However, it is unclear whether task familiarity at retest reduces reliance on the emotion-based system and results in a decrease in decision making on the basis of the frequency attribute (i.e., preference for infrequent-punishment/frequent-reward decks decreases at retest).

Furthermore, the difference between the two attributes should be attenuated by two factors: (1) time delay, i.e., test-retest gap, and (2) sleep deprivation. For example, it has been observed that a lengthy (1 month) test-retest gap strengthens intertemporal decision making (i.e., produces a greater increase in choices made from the long-term advantageous decks) (Bechara et al., 2000) much more than a shorter (1 week) test-retest gap (Turnbull and Evans, 2006), suggesting that the task familiarity offered by a retest combined with a long test-retest gap benefits the intertemporal attribute. The present study investigates the interaction between attribute type and the test-retest gap. A few studies have investigated the effects of sleep deprivation on the IGT (e.g., Killgore et al., 2006, 2007); however, none have compared decision making on the basis of both attributes. Sleep deprivation impairs performance on tasks that rely on the explicit memory system (Drosopoulos et al., 2005; Fischer et al., 2006), the same system that governs intertemporal decision making in the IGT (Maia and McClelland, 2004). Although decision making is often analyzed only on the basis of intertemporality (i.e., the cognition-based system) (e.g., Killgore et al., 2006, 2007), the impairment caused by sleep deprivation has been explained as a failure of integration of both cognitive and affective processes (Killgore et al., 2007). This makes it essential to understand the effects of sleep deprivation on both attributes of decision making in the IGT.

A few studies have investigated the combined effects of sleep deprivation and test-retest gap on the IGT (e.g., Killgore et al., 2006, 2007); however, none have compared decision making on the basis of both attributes. Killgore et al. (2006) found that a short (1 day) test-retest gap, when combined with sleep deprivation, impaired decision making and increased risky choices in the IGT (greater number of choices made from the short-term advantageous decks). At least one animal study has shown that sleep deprivation and a short test-retest gap disrupt learning of a hippocampus-dependent task, whereas sleep deprivation fails to cause a disruption with a longer delay (Graves et al., 2003), suggesting that the effects of sleep deprivation might be time-dependent. As pointed out earlier, intertemporal decision making is dependent on hippocampus-mediated memory systems (Gupta et al., 2009); therefore, it was expected that a long test-retest gap would reduce sleep deprivation impairments on the IGT in the case of the intertemporal attribute. Overall, for the intertemporal attribute, a short test-retest gap and sleep deprivation are expected to inhibit performance whereas a long test-retest gap is expected to counteract (at least partially) the negative effects of sleep deprivation. Thus, the present study was focused on the interaction between sleep deprivation and test-retest gap and it was expected that this interaction would attenuate the difference between the two attributes (intertemporality and frequency).

The research aims of the present study were to compare the two attributes (intertemporality and frequency of immediate reinforcement) when conditions were varied along two dimensions, test-retest gap and sleep deprivation. It was hypothesized that decision making would differ across the type of attribute (intertemporal/frequency of reinforcement); it was expected that sleep deprivation (sleep deprived/not sleep deprived) and test-retest gap (short/long delay) would affect the two attributes differently. A three-way interaction between attribute type, sleep deprivation condition, and test-retest gap was expected. Specifically, advantageous intertemporal decision making (i.e., net scores calculated on the basis of intertemporal attribute) was expected to decrease under conditions of sleep deprivation and short test-retest gap.

## **MATERIALS AND METHODS**

#### **SAMPLE**

Forty healthy, non-smoking, right-handed, Indian male students volunteered for the study (Mean age = 24.92 years; *SD* = 1.99). Even though the use of caffeine does not reverse sleep deprivation impairments in intertemporal decision making (Killgore et al., 2007), self-reported consumption of tea/coffee greater than 4–5 cups per day was an exclusion criterion. An all-male sample was employed because gender plays a critical role in sleep-deprivation-related risk behavior (Acheson et al., 2007) and in IGT decision making (Tranel et al., 2005). In addition, female students were reluctant to stay overnight (a condition of testing) due to the sociocultural environment (gender roles) of the country where the research was conducted (i.e., India). All participating students were enrolled in a PhD program in either the Department of Biosciences and Bioengineering (90%) or the Department of Humanities and Social Sciences (10%). The students were told that the study aimed to understand decision making and would require them to be available for two sessions. No incentives (money or course credit) were offered because these could produce each participant's superficially "best" task performance rather than mimic real or natural task performance.

#### **DESIGN AND ANALYSIS**

A 2 × 2 × 2 mixed repeated-measures design was employed with scoring type (intertemporal; frequency of reinforcement), sleep deprivation (sleep deprived; not sleep deprived), and test-retest gap (short; long) as factors. The analysis was repeated on the first factor. The variables were the difference between total net IGT scores at retest (T2) and baseline (T1) sessions, (1) scored according to the intertemporal attribute [T2((C′ + D′) − (A′ + B′)) − T1((C′ + D′) − (A′ + B′))] and (2) scored according to frequency of preference for immediate reinforcement [T2((B′ + D′) − (A′ + C′)) − T1((B′ + D′) − (A′ + C′))].
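The two dependent variables defined above can be sketched as follows. The function names and the deck-count dictionaries are illustrative assumptions, and the example counts are hypothetical rather than study data.

```python
def attribute_scores(counts):
    """Net scores under each attribute from one session's deck selection counts.

    `counts` maps deck labels "A".."D" to the number of cards selected.
    """
    a, b, c, d = (counts[k] for k in "ABCD")
    intertemporal = (c + d) - (a + b)  # long-term advantageous minus disadvantageous
    frequency = (b + d) - (a + c)      # infrequent- minus frequent-punishment decks
    return intertemporal, frequency


def change_scores(t1_counts, t2_counts):
    """Retest (T2) minus baseline (T1) net score for each attribute."""
    it1, fr1 = attribute_scores(t1_counts)
    it2, fr2 = attribute_scores(t2_counts)
    return it2 - it1, fr2 - fr1


# Hypothetical selection counts over 100 trials per session.
t1 = {"A": 20, "B": 35, "C": 15, "D": 30}
t2 = {"A": 15, "B": 25, "C": 25, "D": 35}
d_intertemporal, d_frequency = change_scores(t1, t2)
```

In this made-up case the intertemporal change score is positive while the frequency change score is negative, illustrating that the same set of choices can move in opposite directions on the two attributes.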

In the present study, the difference between total net IGT scores at test (T1) and retest (T2) is considered. This differs from previous studies where decision making was analyzed using block-wise scores at retest (T2). Such studies have either used an alternate version of the IGT at retest (Killgore et al., 2006), or have changed the deck payoffs at retest (Turnbull and Evans, 2006) to maintain uncertainty in decision making at retest. This method of analyzing block-wise performance is appropriate for comparing participants' rates of learning across trials because the initial trials of the IGT (even at baseline) are considered to involve decision making under uncertainty, whereas latter trials are considered to involve decision making under risk or known payoffs (Brand et al., 2007). However, the present study aimed to test decision making under risk (i.e., under knowledge of payoffs) rather than under uncertainty (i.e., under none/partial knowledge of payoffs). Therefore, it was deemed acceptable to utilize a consistent variant of the IGT with the same deck payoffs throughout the entire study. Furthermore, in a sleep deprivation study, Killgore et al. (2006) used a within-subjects design, that is, participants served as their own controls, a design that did not require accounting for differences between the participants at the baseline, making it appropriate to analyze decision making only at retest. However, the present study used a mixed design; therefore, it was essential to take into account differences in performance at baseline (T1) and retest (T2) for all participants.

#### **MATERIALS**

A computerized version of the IGT (A′B′C′D′) and task instructions were presented on a computer screen. There were 60 cards in a deck, and the exclusion criterion was exhausting any of the four decks at either Time 1 (T1) or at Time 2 (T2); none of the participants exhausted a deck. In the present study, deck payoffs matched those used by Bechara et al. (2000) such that the task amplified the negative consequences of selecting disadvantageous decks.

#### **PROCEDURE**

Participants filled in a questionnaire giving their demographic information. They were then presented with an overview of the experiment, and gave informed consent. Participants were also informed that their participation was voluntary, and that they could drop out of the experiment at any stage. The study received the approval of three committees comprising interdisciplinary experts: (1) a thesis committee (Research Progress Committee), (2) a departmental committee, and (3) an institute-level committee for the post-graduate research program (competent authority for giving clearance for conducting research on human participants). Participants were then randomly assigned to one of four groups (short test-retest gap/sleep deprivation, long test-retest gap/sleep deprivation, short test-retest gap/no sleep deprivation, or long test-retest gap/no sleep deprivation). Each participant was tested to measure baseline IGT decision making (T1 consisted of 100 trials). The two groups with short test-retest gaps were retested (T2 consisted of 100 trials) 24 h after the baseline session, whereas the two groups with long test-retest gaps were retested 12 weeks after the baseline session. Participants in the sleep-deprivation conditions were retested after one night of sleep deprivation and participants in the no-sleep-deprivation conditions were retested after a single restful night of sleep. Sleep deprivation introduced immediately after baseline (T1) and a long test-retest gap would have allowed investigation of post-task learning and consolidation; however, the focus of the present experiment was on comparing decision making on the basis of the two attributes (intertemporality and frequency) based on the presence (or absence) of sleep deprivation and the length of test-retest gap.
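The random assignment to the four cells of the 2 × 2 between-subjects design (test-retest gap × sleep deprivation) might look like the following minimal sketch. Equal cell sizes, the cell labels, and the function name are assumptions made for illustration; the text does not describe the randomization procedure itself.

```python
import random
from itertools import product


def assign_groups(participant_ids, seed=None):
    """Randomly assign participants to the four gap x sleep cells.

    Shuffles the participants, then deals them out to the cells in turn,
    so cell sizes differ by at most one (an assumed balancing scheme).
    """
    rng = random.Random(seed)
    cells = list(product(("short_gap", "long_gap"),
                         ("sleep_deprived", "rested")))
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: cells[i % len(cells)] for i, pid in enumerate(ids)}


# Hypothetical usage with 40 participant IDs
assignment = assign_groups(range(40), seed=1)
```

With 40 participants this yields 10 per cell, and each participant's cell determines both the retest delay (24 h or 12 weeks) and whether the night before retest is spent awake or asleep.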

All participants spent the night before the retest session in a dormitory in the presence of a male research assistant who observed no tossing and turning or other discomfort among the participants as they slept. The environment matched that of dormitories that are a regular feature of student life in the engineering institutes in India. The dormitory had furniture (beds, tables, chairs, side-tables, ceiling fans), lighting (tube lights), and computers, and was maintained at a temperature similar to the students' own rooms. Participants in the sleep-deprivation group were allowed to read books or magazines, watch movies, or complete college assignments while in the dormitory room. Participants in the sleep-deprivation group were in the company of a male research assistant and refrained from drinking caffeinated beverages (e.g., tea, coffee) throughout the night. Participants in the no–sleep-deprivation group were asked to sleep (in the presence of a male research assistant who observed no discomfort among participants, and who woke participants in time for the retest session). All participants were discouraged from taking afternoon naps the day before the retest and were reminded not to discuss the study with others. All IGT testing was done between 7:00 a.m. and 9:00 a.m., and in each retest half of the participants were sleep-deprived and half were not. Baseline and retest times were matched for all participants; for example, if a participant underwent the baseline session at 8:00 a.m., his or her retest session was also at 8:00 a.m.

#### **RESULTS**

**Table 1** gives descriptive statistics for the IGT decision-making scores, calculated in two ways (intertemporality and frequency) for the four groups. Larger standard deviations (suggesting greater variability) have been observed for intertemporal decision making in the IGT (Bowman and Turnbull, 2003; Newman et al., 2008) and were also observed in the present study.

There was a significant main effect of attribute type, $F(1, 36) = 7.51$, $p < 0.01$, $\eta_p^2 = 0.17$. There was a significant interaction between attribute type and test-retest gap, $F(1, 36) = 5.01$, $p < 0.05$, $\eta_p^2 = 0.12$. There was also a significant three-way interaction among attribute type, sleep deprivation, and test-retest gap, $F(1, 36) = 5.16$, $p < 0.05$, $\eta_p^2 = 0.12$. The results showed a difference between the total net scores calculated on the basis of two different conceptualizations of risk in the IGT: one based on the intertemporal nature of reward and punishment, and the other based on preference for a specific frequency of immediate reward and punishment. The test-retest gap interacted with attribute type, suggesting that risk taking (as understood according to the two different attributes) is differentially susceptible to the time delay between the two exposures to the IGT. Contrary to expectations, sleep deprivation did not have an effect on IGT decision making analyzed via the two attributes. However, the three-way interaction between sleep deprivation, time delay, and attribute type was significant.
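As a rough consistency check on the reported effect sizes, partial eta squared can be recovered from an $F$ ratio and its degrees of freedom. This is the standard identity rather than a computation described in the text; applied to the main effect of attribute type it gives:

$$
\eta_p^2 = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}} = \frac{7.51 \times 1}{7.51 \times 1 + 36} \approx 0.17
$$

The result agrees with the reported value for that effect.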

To further investigate the role of the test-retest gap, the three-way interaction (attribute type × time delay × sleep deprivation) was probed with a repeated measures ANOVA on the data split according to the short and long test-retest gaps. For the short test-retest gap, the effect of attribute type was not significant, but the interaction between sleep deprivation and attribute type was significant, $F(1, 18) = 4.55$, $p < 0.05$, $\eta_p^2 = 0.20$. In contrast, for the long test-retest gap, there was a significant effect of attribute type, $F(1, 18) = 9.61$, $p < 0.01$, $\eta_p^2 = 0.35$, whereas the interaction of sleep deprivation with attribute type was not significant. These results suggest that the difference between the two attributes is unaffected by a short test-retest gap, but that sleep deprivation introduced with a short test-retest gap attenuates the difference between the two attributes. Conversely, the difference between the two attributes is affected by a long test-retest gap, but the difference between the two attributes is unaffected by introducing sleep deprivation with a long test-retest gap. **Figures 1**, **2** depict the time-dependent effects (short vs. long test-retest gap, respectively) of sleep deprivation on the two attributes.

**Table 1 | Groupwise differences between total net IGT scores at retest and baseline (T2 – T1), calculated according to the intertemporality and frequency attributes (*n* = 40).**

*Note: Values given are means (standard deviations). Long time/Sleep dep., Long test-retest gap, sleep deprivation; Short time/Sleep dep., Short test-retest gap, sleep deprivation; Long time/No sleep dep., Long test-retest gap, no sleep deprivation; Short time/No sleep dep., Short test-retest gap, no sleep deprivation.*

To test whether decision making at retest (T2) differed from that at baseline (T1) for the two attributes, a paired *t*-test was done for total net scores derived via the two attributes. There was a significant improvement in total net IGT scores between baseline ($M = 12.53$, $SD = 33.05$) and retest ($M = 24.38$, $SD = 24.09$) when scored on the basis of the intertemporal attribute [i.e., (C + D) – (A + B)], $t(39) = -2.32$, $p < 0.05$. When total net IGT scores were calculated according to preference for immediate reinforcement, they showed a slight decline from baseline ($M = 18.58$, $SD = 19.38$) to retest ($M = 11.95$, $SD = 26.12$); however, this difference was not significant (**Figure 3**). As expected, the results suggested that, overall, performance on the basis of the intertemporal attribute increased with an increase in task exposure.
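The two ways of scoring the same 100 choices can be sketched as follows. The sketch assumes the conventional definitions: the intertemporal score as (C + D) − (A + B), and the frequency score as (B + D) − (A + C), i.e., the infrequent-punishment decks minus the frequent-punishment decks. The frequency formula, the function name, and the plain-letter deck labels are assumptions for illustration and are not spelled out in the text.

```python
from collections import Counter


def attribute_scores(choices):
    """Score one session of deck choices under both attributes.

    intertemporal: long-term advantageous decks minus disadvantageous decks.
    frequency: infrequent-punishment decks minus frequent-punishment decks
               (assumed conventional definition).
    """
    counts = Counter(choices)
    a, b, c, d = (counts[deck] for deck in "ABCD")
    return {
        "intertemporal": (c + d) - (a + b),
        "frequency": (b + d) - (a + c),
    }


# Hypothetical session: 10 A, 30 B, 20 C, 40 D
session = list("A" * 10 + "B" * 30 + "C" * 20 + "D" * 40)
scores = attribute_scores(session)
```

A participant who favors decks B and D, as in this hypothetical session, scores higher on the frequency attribute than on the intertemporal attribute, which is why the two scorings can move in different directions between T1 and T2.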

### **DISCUSSION**


The present study tested a dual conception of risk in the IGT as manifested by two decision-making attributes (the intertemporal attribute and the frequency of reward and punishment). As expected, the ANOVA showed a difference between the two total net IGT scores derived from the two attribute types (intertemporality and frequency). Thus, the data support the hypothesis that there is a distinction between the two conceptualizations of risk in the IGT, indicating that decision making of the cognition-based system differs from that of the emotion-based system at retest.

As expected, differences in the two attributes were affected by the length of the test-retest gap, suggesting that temporal stability in risk taking is contingent both on the attribute under consideration and on the time gap between test and retest. Contrary to expectations, sleep deprivation had no independent effect on the two attributes. This could be, in part, because rewards and punishments are present in both attribute types, and because sleep deprivation alters risk differentially for reward and punishment (McKenna et al., 2007; Venkatraman et al., 2007). For instance, decision making on the basis of frequency of reinforcement (i.e., cards drawn from decks B′ and D′) is thought to reflect a preference for frequent rewards rather than for infrequent punishments (Wilder et al., 1998; Lin et al., 2007). Decision making in the IGT is believed to be complex in nature (Upton et al., 2011). For instance, when sleep deprivation induces risk-taking (intertemporal risk) and such risk-taking is tested via the IGT, the mitigating effects of caffeine cannot be observed (Killgore et al., 2007). However, when risk-taking was tested via another task, the Balloon Analogue Risk Task (BART), caffeine was found to restore risk taking (in sleep-deprived individuals) to baseline level (Killgore et al., 2008). Killgore et al. (2008) attribute this difference in the mitigating effects of stimulants to the fact that the IGT has a "gain" frame whereas the BART has a "loss" frame. In fact, it is believed that risk perception in the IGT may further differ between the domains of reward and punishment (Levin et al., 2012).

In support of a hypothesized distinction between the attributes, sleep deprivation in conjunction with the test-retest gap had a significant effect on the two attributes. This suggests that the length of a test-retest gap plays a crucial role in how sleep deprivation affects risky decision making when conceptualized in two different ways. Follow-up analysis showed that a short test-retest gap did not affect the two attributes differentially, but that introducing sleep deprivation with a short test-retest gap enhanced the difference between the two attributes. On the other hand, a long test-retest gap did affect the two attributes differentially, and introducing sleep deprivation after a long delay did not have any differential effect on the two attributes. These results are consistent with those of Killgore et al. (2006, 2007) in which sleep deprivation and short test-retest gaps (49, 51, 72 h) impaired intertemporal decision making. It is possible that the combination of a short test-retest gap and sleep deprivation creates fatigue, which promotes dichotomizing of the two attributes. This explanation is aligned with that given by Killgore et al., in which fatigue due to sleep deprivation (Killgore et al., 2007) or due to even a modest self-reported decrease in sleep duration (Killgore et al., 2012) is believed to contribute to a failure of cognition-emotion integration. In other words, fatigue might contribute to differentiation of the cognition-based and emotion-based systems. Even though the present study did not test post-task consolidation, it is possible that the effect of sleep deprivation is time dependent for the widely used intertemporal attribute in the IGT. The current results imply that temporal stability of the two attributes is different and that learning of the two attributes might be differentially vulnerable to the effects of test-retest gap and sleep deprivation.

As expected, repeated task exposure (by retest) appeared to be conducive to activity of the cognition-based system. At retest, there was a marked increase in choices made on the basis of the intertemporal attribute. In the IGT, the intertemporal attribute embodies a common conception of risk; that is, risk is considered as an anticipated tradeoff between immediate and delayed outcomes. On the other hand, choices made on the basis of the frequency-of-reinforcement attribute suggest that risk perception in the IGT may be automatic and reflect spontaneous processing of the frequency of rewards and punishments (Stocco et al., 2009). In line with dual process theory (e.g., Evans, 2003; Kahneman and Frederick, 2007), the frequency attribute may be the "default attribute" and decision making on the basis of the intertemporal attribute may require inhibition or overriding of this "default" mode. For example, in the present study, it is possible that repeated task exposure at retest overrode the response of the emotion-based system while at the same time strengthening intertemporal decision making. This dual conception of risk in the IGT is aligned with the behavioral decision-making literature that considers risk as "anticipated as well as anticipatory" and "a deliberate as well as instinctive process" (Loewenstein et al., 2001; Slovic et al., 2004, 2005; Slovic and Peters, 2006).

Consistent with the findings of Kahneman and Frederick (2007), the present results indicate that two distinct types of reasoning and rationality are manifested in IGT decision making. Could the conceptualization of risk and rationality advanced by the SMH–IGT framework—that is, risk as an intertemporal choice and rationality as making long term advantageous decisions—be a reflection of the environment where the task was developed and the cognitive demands of that environment? Decision making in the IGT is observed to be governed by frequency of reinforcement rather than the intertemporal attribute in several cultural contexts including Taiwan (Chiu and Lin, 2007; Lin et al., 2007; Chiu et al., 2008), Iran (Ekhtiari et al., 2009), Brazil (Schneider et al., 2010), and India (Singh and Khan, 2008). Future studies could utilize the IGT to understand cultural variations in risk perception and risk taking at the behavioral as well as the neural level.

Apart from the small sample size and the use of an all-male sample, the present study had other limitations, such as the lack of physiological monitoring to ascertain the effects of sleep deprivation and a lack of accounting for individual disposition (Franken and Muris, 2005) and mood (Suhr and Tsanadis, 2007). One disadvantage of varying the test-retest gap (short and long test-retest gap) is the inability to equate the affective and motivational states of the two groups between the two testing sessions. Importantly, studying the effects of both sleep and test-retest gap on the IGT decision-making task will require weighing the advantages and disadvantages of the research paradigm utilized (Pace-Schott et al., 2011). For example, decision making without reward (as in the present IGT study) has the advantage of ensuring that performance does not depend on incentives that compensate for the effects of sleep deprivation and that performance is not bolstered by reward incentives. A lack of incentive can also be disadvantageous, in that it may be difficult to make inferences about motivation in a decision-making task where no incentive is provided. Nevertheless, at least one study has shown that there is no difference in IGT decision making based on whether incentives are real (monetary) or facsimiles (Bowman and Turnbull, 2003).

## **CONCLUSION**

The present results contribute to current understanding of IGT decision making related to two important attributes, the intertemporal attribute and the frequency of reinforcement attribute. The results also add to knowledge concerning the larger question of the dichotomy between cognition and emotion in decision making. For instance, it is possible that failure to incorporate the cognition-emotion dichotomy is responsible for the instability that is observed in risky decision making (Fox and Tannenbaum, 2011; Vlaev, 2011). Apart from pointing out that stability in risk taking in the IGT is contingent on the attribute under consideration, the current results also suggest that inconsistency in risk taking (across a time span) observed in decision-making tasks could be due to factors (such as time elapsed and constraint) affecting dichotomization of the emotion-cognition processes.

## **ACKNOWLEDGMENTS**

The study was completed in partial fulfillment of the requirements for a doctoral degree at the Indian Institute of Technology–Bombay. I thank Naveen Kashyap for assistance with data collection.

## **REFERENCES**


measures of prefrontal cortical dysfunction in schizophrenia. *Schizophr. Res.* 68, 65–73. doi: 10.1016/S0920-9964(03)00086-0


decks in the Iowa Gambling Task. *Biol. Psychol.* 71, 155–161. doi: 10.1016/j.biopsycho.2005.05.003


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 25 August 2013; published online: 19 September 2013.*

*Citation: Singh V (2013) Dual conception of risk in the Iowa Gambling Task: effects of sleep deprivation and test-retest gap. Front. Psychol. 4:628. doi: 10.3389/fpsyg.2013.00628*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Singh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The impact of frontal and cerebellar lesions on decision making: evidence from the Iowa Gambling Task

## *Caroline de Oliveira Cardoso\*, Laura Damiani Branco, Charles Cotrena, Christian Haag Kristensen, Daniela Di Giorge Schneider Bakos and Rochele Paz Fonseca*

*Graduate Department of Psychology, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, Brazil*

#### *Edited by:*

*Jong-Tsun Huang, China Medical University, Taiwan*

#### *Reviewed by:*

*Ifat Levy, Yale University School of Medicine, USA Michele Poletti, Azienda Sanitaria Locale di Reggio Emilia, Italy*

#### *\*Correspondence:*

*Caroline de Oliveira Cardoso, Graduate Department of Psychology, Pontifícia Universidade Católica do Rio Grande do Sul, Av. Ipiranga 6690, Building 11 – Rm 932, Rio Grande do Sul, 90610-900 Porto Alegre, Brazil e-mail: carolineocardoso@yahoo.com.br*

Although the frontal lobes have traditionally been considered the neural substrates of executive functioning (EF), recent studies have suggested that other structures, such as the cerebellum, may be associated with these abilities. The role of the cerebellum has only been sparsely investigated in connection with decision making (DM), an important component of EF, and the few results obtained on this front have been inconclusive. The current study sought to investigate the role of the cerebellum in DM by comparing the performance of patients with cerebellar strokes, frontal-damaged patients, and a healthy control group on the Iowa Gambling Task (IGT). A total of nine cerebellar-damaged adults participated in the study, as well as nine individuals with frontal strokes and 18 control individuals. Patients were administered a version of the IGT adapted to the population of Southern Brazil. There was a marginal difference in mean IGT net scores between the two clinical groups, although both displayed impaired performance as compared to the control group. Overall, the DM ability of patients with cerebellar damage proved to be more preserved than that of individuals with frontal lobe strokes, but less preserved than that of the control group. These data suggested that, while the frontal lobes may be the most important brain structures for DM, the cerebellum might also play an active role in this cognitive function. Future studies assessing participants with lesions in different cerebellar regions and hemispheres will prove invaluable for the understanding of the neural structures involved in DM, and make significant contributions to the globalist-localizationist debate in DM neuroscience.

**Keywords: decision making, Iowa Gambling Task, cerebellum, frontal lobe, stroke, executive functions**

## **INTRODUCTION**

Of all the cognitive processes explored by clinical and cognitive neuropsychology, executive functioning (EF) stands as one of the most extensively studied due to its complexity, interrelations with other cognitive processes, and the ongoing search for a sufficiently comprehensive theoretical model. Traditionally, EF has been considered dependent on, or even synonymous with, frontal lobe functioning (Baddeley, 1986; Funahashi, 2001; Elliott, 2003; Demakis, 2004; Heyder et al., 2004; Barkley, 2011).

The frontal lobes have been identified as the neural substrate of EF by a large number of studies (Alvarez and Emory, 2006; Jurado and Rosselli, 2007). Some of the most robust evidence linking frontal lobe activity to EF comes from patients with frontal lesions, who often present with impairments in tasks that assess EF (Bechara et al., 1994; Burgess and Shallice, 1996; Stuss et al., 2000). Such studies have also shed light on the more specific anatomical bases of different EF. For instance, executive subcomponents which are more heavily based on rational thinking, such as logical reasoning and planning, are generally associated with the dorsolateral prefrontal cortex, whereas functions which depend on emotional and motivational processing, such as social behavior regulation and decision making (DM), are more closely associated with ventromedial prefrontal cortex (VMPFC) functioning (Ardila, 2008; Chan et al., 2008; Brock et al., 2009). However, scientists have begun to question the exclusive role of the frontal lobes in EF in light of evidence that points toward the involvement of other brain regions in this set of cognitive abilities.

Executive impairments in patients with lesions in areas other than the prefrontal cortex (Cummings, 1993; Kramer et al., 2002), as well as functional neuroimaging studies of healthy participants during EF tasks (Fassbender et al., 2004; Collette et al., 2006) have indicated that this set of cognitive abilities does not reside in a single cerebral structure, but is instead the result of associations between a number of brain regions. These associations include reciprocal projections between the prefrontal cortex and other cortical and subcortical regions, such as the anterior cingulate cortex, the thalamus, the basal ganglia, and the cerebellum (Heyder et al., 2004; Collette et al., 2005, 2006; Alvarez and Emory, 2006; Verdejo-García and Bechara, 2010). Some authors suggest that EF is a product of the activation of frontal-subcortical circuits (Cummings, 1993; Tekin and Cummings, 2002), such as the frontal-cerebellar connection (Middleton and Strick, 2000; Heyder et al., 2004; Krienen and Buckner, 2009).

The cerebellum has long been considered essential to posture as well as motor control and coordination. However, studies published since the 1990s have expanded this perspective by showing that this structure is also involved in functions that are not exclusively related to motor control (Leiner et al., 1986; Schmahmann et al., 2007). Evidence obtained from clinical (Schmahmann and Sherman, 1998; Hayter et al., 2007) and neuroimaging studies (Stoodley and Schmahmann, 2009; Baillieux et al., 2010) has shown that the cerebellum is involved in a series of cognitive functions, such as verbal and working memory, EF, language, emotion processing, and attention (Karatekin et al., 2000; Timmann and Daum, 2007; Baillieux et al., 2010; Grimaldi and Manto, 2012).

Cerebellar structures contain a series of efferent and afferent connections to a number of other brain regions, such as the dorsolateral and dorsomedial prefrontal cortices, portions of the posterior parietal cortex, the superior temporal region, the thalamus, and the limbic system (Schmahmann and Pandya, 1995; Middleton and Strick, 2000; Riva and Giorgi, 2000; Bugalho et al., 2006; Krienen and Buckner, 2009). Given its localization and connections, it seems likely that the cerebellum contributes to both motor and cognitive/emotional abilities (Rapoport et al., 2000; Fonseca and Parente, 2007). However, the functional implications of this pattern of connectivity have still to be investigated.

Schmahmann and Sherman (1998) assessed participants with cerebellar lesions and suggested the term "cerebellar cognitive-affective syndrome" to describe the pattern of dysfunctions observed. This syndrome includes alterations in EF (planning, abstract reasoning, verbal fluency) and working memory, visuospatial disorganization, difficulties in language production, and personality changes. The authors hypothesize that these impairments occur due to interruptions in the neural circuitries linking the cerebellum to prefrontal, temporal, and posterior parietal cortices, as well as to the limbic system. This syndrome has been observed in both children and adults with acquired lesions of different etiologies, such as strokes (Neau et al., 2000) and degenerative diseases of the cerebellum (Cooper et al., 2010).

Studies of EF have identified performance deficits in patients with cerebellar damage in the same assessment instruments used to detect executive dysfunction in patients with prefrontal lesions (Manes et al., 2009). Patients with cerebellar damage have been found to perform worse than control groups in tasks such as the Stroop Test (Gottwald et al., 2004), the Wisconsin Card Sorting Task (Karatekin et al., 2000; Abel et al., 2007), in instruments which assess cognitive flexibility (Manes et al., 2009) and verbal fluency (Gottwald et al., 2004; Dienberger et al., 2010; Arasanz et al., 2012), as well as in ecological tasks such as the Multiple Errands Test—Hospital Version (Manes et al., 2009).

Although the evidence allows for the possibility that patients with cerebellar damage could have similar cognitive profiles to individuals with frontal lobe damage (Abel et al., 2007; Manes et al., 2009), there is a markedly low number of studies comparing these two patient groups in terms of their cognitive functioning. In one of the few studies that made such a comparison, Casini and Ivry (1999) investigated the performance of individuals with frontal and cerebellar damage in a perceptual task. While both patient groups had impaired performance in the task, the impairments in patients with frontal lobe damage were associated with deficits in divided attention, while impairments in patients with cerebellar damage occurred due to alterations in temporal processing abilities.

The role of the cerebellum in DM has also been very sparsely investigated, even though DM has been shown to be dependent on a series of executive processes in which the cerebellum has been implicated (Del Missier et al., 2012). DM abilities in patients with neurological conditions are often assessed through the Iowa Gambling Task (IGT), a tool developed by Bechara et al. (1994) based on the somatic marker hypothesis (SMH). Somatic markers consist of combinations of physiological and emotional reactions elicited by particular decisional behaviors. As a result of implicit learning, the markers become associated with the behaviors by which they were initially caused, and serve as positive and negative cues to guide subsequent decisions. According to the SMH, the brain circuitry responsible for DM processes consists primarily of the VMPFC and its connections to the limbic system. The most significant evidence for this hypothesis was obtained from patients with lesions to the VMPFC (Bechara et al., 1999), who displayed significant DM deficits in spite of an absence of any other executive or intellectual impairments. The decisional pattern displayed by these individuals, which involved an inability to delay gratification and a tendency toward impulsively selecting immediately pleasurable alternatives, was described as "myopia for the future." Bechara et al. (1999) found that, unlike healthy individuals, these patients did not experience increased autonomic activation prior to making risky decisions on the IGT. Based on these findings, the authors suggested that the cause of the DM impairment observed in patients with VMPFC lesions was the inability to access somatic markers.

The IGT investigates DM under uncertainty, as the participant is asked to choose among decks of cards without any prior knowledge of the contingencies associated with each deck (Bechara et al., 1997; Escartin et al., 2012). Although the IGT was initially developed to detect DM impairment in patients with lesions in the VMPFC, it has also been successful in detecting DM deficits in individuals with neurological conditions such as traumatic brain injury (Bonatti et al., 2008; Yasuno et al., 2014), subarachnoid hemorrhage (Escartin et al., 2012), and Parkinson's disease, as well as psychiatric disorders such as substance dependence (Bechara and Damásio, 2002), compulsive gambling (Kertzman et al., 2011), schizophrenia (Bellani et al., 2009), autism spectrum disorder (South et al., 2014), attention-deficit/hyperactivity disorder (Malloy-Diniz et al., 2007), and bipolar disorder (Martino et al., 2010; Powers et al., 2013).

The IGT has also been used to assess the role of different brain regions on DM performance through studies of patients with damage to specific cerebral structures. For instance, a study conducted by Brand et al. (2007) on patients with Urbach-Wiethe disease found significantly lower IGT scores and skin conductance responses in the clinical group as compared to a healthy control group. These results suggested an association between amygdala damage and impaired learning from experience, which has a particularly negative effect on DM under ambiguity, where the outcomes of different choices are not explicitly stated and one must rely solely on their own experience to calculate probabilities and assess the risks associated with each of the alternatives available. Similar findings were obtained in a study conducted by Kobayakawa et al. (2008), who assessed the IGT performance of patients with basal ganglia damage as a result of Parkinson's disease. The authors found that these patients displayed riskier DM and lower skin conductance responses to both reward and punishment when compared to control participants. Lastly, the role of the hippocampus in IGT performance was assessed by Gupta et al. (2009), in a study of patients with bilateral hippocampal damage. The authors found that these individuals displayed significantly impaired IGT performance, failing to develop a preference for advantageous decks or to exhibit a learning curve throughout the task. This study identified the importance of hippocampal activity and, consequently, declarative memory systems in the IGT.

In spite of the valuable information produced by studies of the IGT in patients with lesions in different cerebral locations, to the best of the authors' knowledge, only two studies so far have used the IGT to assess DM in patients with cerebellar damage (Abel et al., 2007; Gerschcovich et al., 2011). Although the results obtained by these two studies were inconclusive, studies with healthy participants support the idea of cerebellar involvement in the IGT, as neuroimaging studies by Ernst et al. (2002) and Christakou et al. (2009), for instance, detected cerebellar activation during IGT performance. Given the state of current research on this front, further investigation of the role of the cerebellum in the IGT is required, especially since, although the evidence linking this brain structure to EF is quite robust, little is known about its involvement in DM.

The IGT has been adapted to the Brazilian population in two different studies (Schneider and Parente, 2006; Malloy-Diniz et al., 2008), which produced slightly different versions of the task. Studies have been conducted using the first version of this task (Schneider and Parente, 2006) to investigate the influence of participants' sociodemographic characteristics on task performance (Carvalho et al., 2011, 2012), as well as to ascertain its psychometric properties (Cardoso et al., 2010). The task has also demonstrated adequate validity in the assessment of DM deficits in substance-dependent individuals (Verdejo-García et al., 2007). The fact that this same version of the IGT has been successfully used in the assessment of neurological populations, such as individuals with traumatic brain injury (Sigurdardottir et al., 2010), speaks to its sensitivity in the detection of EF deficits in populations with acquired brain lesions.

On that note, the study of patients with acquired lesions is one of the most effective clinical paradigms in the investigation of the roles of different brain regions in cognitive functioning. Therefore, by studying patients with isolated cerebellar strokes, it may be possible to identify this structure's contribution to cognition as a whole (Heyder et al., 2004). It is also important to compare patients' performance with control groups and other clinical groups in which executive dysfunction is likely to be present, such as patients with frontal lobe damage. In this way, the cognitive performance of patients with cerebellar damage can be compared and contrasted with both normal cognitive function and executive dysfunction. Therefore, the current study sought to compare the IGT performance of patients with cerebellar damage to ones with frontal lobe damage as well as healthy adults. It was hypothesized that the two clinical groups would display impairments in IGT performance as compared to the control group. However, in comparing the two patient groups, it was expected that individuals with frontal lobe damage would exhibit greater impairment than those with a cerebellar stroke.

## **METHOD**

## **PARTICIPANTS**

The study recruited three participant groups, consisting of (1) nine patients with cerebellar damage, (2) nine patients with frontal lobe damage, and (3) 18 healthy controls, so that there were two controls for every patient in each clinical group (a 2:1:1 design). Participants were selected from public and private hospitals in the area. All patients in the clinical samples had suffered an ischemic stroke, as diagnosed by routine neurological and neuroimaging assessments carried out in local hospitals. Patients were assessed at 1–60 months post-stroke. Data regarding the size and site of patient lesions were obtained through a retrospective review of patient records and of the results of neuroimaging examinations conducted at the hospitals from which the patients were recruited. A neuroradiologist was consulted for assistance with the interpretation of neuroimaging results. Members of the control group were recruited by convenience from the university where the study was conducted, as well as from other similar environments. Participants in the sample were native Portuguese speakers, with at least 1 year of formal schooling and at least 19 years of age. Exclusion criteria consisted of: neurological disorders (other than the ischemic stroke in patients in the clinical sample); being left-handed or ambidextrous (screened by the Edinburgh Handedness Inventory; Oldfield, 1971); symptoms of aphasia which would impair the comprehension of and response to experimental tasks (as assessed by the oral language subtests of the Brazilian Brief Neuropsychological Assessment Battery NEUPSILIN; Fonseca et al., 2008); uncorrected sensory deficits (self-report in a sociodemographic questionnaire); history of alcohol abuse (screened by a score ≥2 on the CAGE Scale, in the version used by Amaral and Malbergier, 2004); history of illicit drug use, use of benzodiazepines and/or antipsychotics (self-report in a sociodemographic questionnaire); psychiatric disorders other than post-stroke depression (self-report in a sociodemographic questionnaire). Symptoms of depression were screened through the Geriatric Depression Scale (GDS-15; Yesavage and Sheikh, 1986; adapted to the Brazilian population by Almeida and Almeida, 1999); however, scores on this scale were used to describe the sample and not as exclusion criteria. Participants who took part in speech therapy or neuropsychological rehabilitation programs were also excluded from the sample. The following exclusion criteria were additionally applied to the control group: symptoms suggestive of depression (as measured by scores above 19 on the BDI; Beck et al., 1996, adapted to Brazilian Portuguese by Cunha, 2001) and signs of dementia [screened by scores < 24 on the Mini Mental State Examination (MMSE), adapted to the local population by Chaves and Izquierdo, 1992, and by the clock drawing test; Juby et al. (2002)].

**Table 1** displays the descriptive sociodemographic and clinical data pertaining to the patients in the clinical samples. Socioeconomic status was assessed based on Brazilian criteria for economic classification (2008). The control group was formed by individuals aged between 40 and 77 (*M* = 59.28; *SD* = 10.25) with between 4 and 20 years of formal schooling (*M* = 12.08; *SD* = 6.18), 77% of whom were female.


#### **Table 1 | Clinical sample description.**

*M, mean; SD, standard deviation; F, female; M, male; R/W, reading and writing; MMSE, Mini Mental State Examination; SES, Socioeconomic Status; L, left; R, right; 55.5% of patients were diagnosed through computerized tomography and 44.4% through magnetic resonance imaging. Specific data describing the location of the lesion was obtained for one participant (U. C.: hypodensity in the right cerebellar hemisphere on the lateral wall of the fourth ventricle), while the remaining participants' exams only pointed to a general location within the affected lobe.*

Statistical analyses did not identify significant differences in sociodemographic characteristics between groups. The two clinical groups did not differ with regard to clinical variables (no comparisons were significant at *p* < 0.05).

## **PROCEDURE AND INSTRUMENTS**

All participants provided written informed consent. Participants were assessed during a single session lasting roughly an hour and a half, during which all data pertaining to sociodemographic characteristics, exclusion criteria, and cognitive assessment were collected. The instruments used are described below:

• Sociocultural and health questionnaire (Fonseca et al., 2012a,b). This questionnaire includes questions about gender, age, education, socioeconomic status, frequency of reading and writing, and handedness. It also allows for the identification of health conditions which constitute exclusion criteria. The reading and writing inventory (Pawlowski et al., 2012) inquires as to the frequency with which individuals read newspapers, magazines, books or other types of material, and write essays, notes or other types of text. The frequency of each activity is assigned a score from 0 to 4 depending on whether the individual engages in the activity every day (4), some days a week (3), once a week (1), or never (0), for a maximum possible score of 16 for reading and 12 for writing habits. The frequency of these activities is classified as high or low depending on whether the sum scores of reading and writing frequency fall above or below 14.


performance. Similar net scores were calculated for each 20 trial block. The total number of cards drawn from each deck was also calculated for each participant, so that patterns of advantageous or disadvantageous deck choices could be identified. Lastly, a score based on the frequency of losses was calculated. This measure was introduced by Schneider and Parente (2006) to investigate whether patients base their choices on the win to loss ratios associated with each deck. This score is calculated through the equation [(B + D) − (A + C)], where positive scores indicate that more cards were chosen from decks with high win-to-loss ratios. The IGT is similar to real-world DM under uncertainty in that participants are not informed of the total number of turns involved in the task or of the probability of winning or losing associated with selecting cards from each deck. Therefore, any information used to formulate and implement DM strategies must be gathered through experiential learning over the course of the task.
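The scores described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' scoring software: the function names are hypothetical, choices are assumed to be recorded as a sequence of deck labels, and the net score uses the conventional IGT definition [(C + D) − (A + B)], which is assumed here because its definition is not given in the surrounding text.

```python
from collections import Counter

def net_score(choices):
    """Conventional IGT net score (assumed): (C + D) - (A + B)."""
    counts = Counter(choices)
    return (counts["C"] + counts["D"]) - (counts["A"] + counts["B"])

def loss_frequency_score(choices):
    """Loss-frequency score from the text: (B + D) - (A + C)."""
    counts = Counter(choices)
    return (counts["B"] + counts["D"]) - (counts["A"] + counts["C"])

def block_net_scores(choices, block_size=20):
    """Net score for each consecutive block of `block_size` trials."""
    return [net_score(choices[start:start + block_size])
            for start in range(0, len(choices), block_size)]

def deck_counts(choices):
    """Total number of cards drawn from each deck."""
    counts = Counter(choices)
    return {deck: counts[deck] for deck in "ABCD"}

if __name__ == "__main__":
    # Hypothetical 40-trial record: disadvantageous decks first, then advantageous ones.
    example = list("AB" * 10 + "CD" * 10)
    print(block_net_scores(example))
    print(loss_frequency_score(example))
    print(deck_counts(example))
```

Positive values of either score indicate a preference for the advantageous decks or for the decks with high win-to-loss ratios, respectively.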

## **DATA ANALYSIS**

Descriptive and inferential statistics were used in data analysis. Homogeneity analyses showed that, in spite of the small sample size, the data produced was normally distributed. As such, parametric tests were used to analyze the data. Demographic and clinical characteristics were compared between groups using Fisher's Exact test for categorical variables and One Way ANOVA followed by Tukey *post-hoc* tests for continuous ones. The variables related to IGT performance (total net score, loss frequency scores, and deck preferences) were analyzed through One Way ANOVA with Tukey *post-hoc* tests. The analyses of participants' learning curves (i.e., net scores per 20-trial block) were carried out via a repeated measures ANOVA. Lastly, Fisher's exact test was used to compare the proportion of participants with impaired vs. non-impaired performance in each group. Significance was considered at α = 0.05.

## **RESULTS**

## **IGT TOTAL NET SCORE**

The groups' DM performance was first investigated through a comparative analysis of total net scores obtained in the IGT. Results of a One Way ANOVA indicated a significant difference between patients with a frontal stroke [*M (SD)* = −16.44 (21.48)], a cerebellar stroke [*M (SD)* = 3.78 (13.17)], and controls [*M (SD)* = 23.00 (19.23)] (*F* = 13.90; *p* < 0.001). *Post-hoc* analyses indicated that the control group's performance was significantly different from that of patients with a frontal lobe (*p* < 0.001) or a cerebellar stroke (*p* = 0.042). Although the clinical groups did not differ from each other in terms of IGT net scores, the comparative analyses approached significance (*p* = 0.068). Furthermore, the analysis of coefficients of variation (frontal stroke *CV* = 1.30; cerebellar stroke *CV* = 3.78; controls *CV* = 0.83) suggests less homogeneity in the performance of patients with cerebellar damage compared to the other two groups.
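As a worked illustration of the coefficient of variation, the sketch below computes it as the standard deviation divided by the absolute value of the mean, using the rounded group means and standard deviations reported above. The formula and the function name are assumptions, since the text does not state how the coefficients were obtained.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation, assumed here to be SD / |mean|."""
    return sd / abs(mean)

# Rounded (mean, SD) pairs of total net scores as reported in the text.
reported = {
    "frontal stroke": (-16.44, 21.48),
    "cerebellar stroke": (3.78, 13.17),
    "controls": (23.00, 19.23),
}

if __name__ == "__main__":
    for group, (mean, sd) in reported.items():
        print(f"{group}: CV = {coefficient_of_variation(mean, sd):.2f}")
```

Because the inputs are rounded summary statistics, values recomputed in this way may differ from the published coefficients; the ordering of the groups, with the cerebellar group showing the largest relative dispersion, is the point of interest.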

#### **IGT NET SCORE PER BLOCK**

Net scores for each 20-trial block of the IGT were calculated for participants in all three groups. The descriptive data pertaining to these variables and the results of group comparisons are displayed in **Table 2**. **Figure 1** displays the learning curve observed in IGT performance calculated for each group based on average net scores per block.

**Table 2** and **Figure 1** show that patients with frontal damage obtained the lowest scores in all but one of the five IGT blocks, showing a persistent pattern of selections from disadvantageous decks. Control participants consistently displayed positive net scores starting in block 2. Patients with a cerebellar stroke obtained positive net scores starting in block 3, although a noticeable decrease in scores occurred between blocks 4 and 5. A Tukey *post-hoc* analysis indicated that patients with frontal lobe damage differed significantly from controls in blocks 3 (*p* < 0.001), 4 (*p* = 0.019) and 5 (*p* = 0.041), while in block 2 the only significant difference found was between cerebellar stroke patients and the control group (*p* = 0.044).

A Mixed-design Repeated Measures ANOVA compared control and clinical participants (between-groups factor) on IGT performance across all five blocks (within-subject factor). This analysis did not indicate a main effect of block (*F* = 1.472; *p* = 0.224); however, a block × group interaction was observed (*F* = 2.553; *p* = 0.013), with scores changing significantly between blocks 1 and 4 in the control group (*p* = 0.025) and in patients with a cerebellar stroke (*p* = 0.016). The results suggest that these two groups' learning curves were significantly different from the curve observed in participants with frontal lobe damage.



### **AVERAGE NUMBER OF SELECTIONS PER DECK**

**Table 3** displays group comparisons of deck preference, as assessed by the average number of cards drawn from each deck.

**Table 3** suggests that frontal stroke patients selected a significantly higher number of cards from the disadvantageous decks A (*p* = 0.039) and B (*p* = 0.029) than control participants. The latter group selected a significantly higher number of cards from the advantageous deck D (*p* = 0.007) than patients with frontal lobe damage. No significant differences were observed between these two groups and patients with a cerebellar stroke.

#### **PERFORMANCE CLASSIFICATION BY CUTOFF SCORE**

Participants' total net scores were classified according to the cutoff scores suggested by Bechara (2007), and the percentage of participants in each category was compared between groups. Only one participant with a frontal stroke was classified as having non-impaired IGT performance (11.1%), compared to five out of nine patients with a cerebellar stroke (55.5%). In the control group, 15 participants' performance was classified as non-impaired (83.3%). An analysis through Fisher's Exact test identified significant differences between participant classifications in the control group and the patients with a frontal stroke (*p* < 0.001). Lastly, an equation representing the tendency to choose from high vs. low punishment frequency decks was created using the average number of deck selections in each group. However, no group differences were found on this variable (*F* = 0.543; *p* = 0.586).
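The comparison between controls and patients with a frontal stroke can be reproduced from the counts given above (15 of 18 controls and 1 of 9 frontal patients classified as non-impaired). The sketch below implements a two-sided Fisher's exact test for a 2 × 2 table with the standard library only; it assumes a two-sided test defined as the total probability of all tables no more likely than the observed one, which may not be the exact procedure used by the authors, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the summed hypergeometric probability of every table with the
    same margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def ways(x):
        # Number of arrangements with x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = ways(a)
    # Integer counts are compared directly, avoiding floating-point ties.
    extreme = sum(ways(x) for x in range(lowest, highest + 1)
                  if ways(x) <= observed)
    return extreme / comb(total, col1)

if __name__ == "__main__":
    # Rows: controls, frontal stroke; columns: non-impaired, impaired.
    p_value = fisher_exact_two_sided(15, 3, 1, 8)
    print(f"p = {p_value:.6f}")
```

Under these assumptions the resulting p-value falls below 0.001, in line with the result reported for this comparison.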

## **DISCUSSION**

The current study sought to assess the DM process in patients with vascular lesions in the cerebellum, comparing it with the performance of patients with frontal lobe lesions and controls. Net IGT scores differed between the clinical groups and the control participants, while the two clinical groups trended toward a significant difference from each other. Analysis of learning curves throughout the task showed that patients with frontal lobe damage did not learn to avoid disadvantageous decks in the task, showing a distinct DM pattern from that observed in the other two groups. The data indicate that the DM performance of patients with a cerebellar stroke is worse than that of controls, but superior to the performance of patients with frontal lobe lesions. These findings are corroborated by results in the block scores, analysis of deck preference, and classification of performance based on net scores. A pronounced degree of heterogeneity was also identified in the group of patients with cerebellar damage: some performed more similarly to controls, and others more similarly to the other clinical group. These results support the hypothesis that the frontal lobes play a key role in the DM process as assessed by the IGT. However, this structure should not be considered the single neural substrate of affective DM, as there is evidence to suggest that the cerebellum also plays an important role in this ability.

Results regarding the role of the frontal lobes in the DM process are supported by other findings in the literature. Observational assessments of patients with prefrontal cortex lesions indicate that, in spite of an absence of intellectual impairments, they tend to be more impulsive, indecisive, and have trouble predicting the long-term consequences of their actions (Damasio, 1996). These participants also tend to perform poorly in behavioral tasks that involve short- and long-term decisions due to a tendency toward risky decision-making behaviors (Bechara et al., 1994; Anderson et al., 1999).

Although the involvement of the cerebellum in DM has been far less studied in the literature, the few investigations conducted on the topic make for some interesting considerations regarding the way in which this brain region may be associated with decisional processes. The role of the cerebellum in DM under uncertainty has already been noted by both neuroimaging and experimental studies. Investigations of patients with brain lesions have suggested that the cerebellum may be part of a neural network which also involves regions such as the (ventromedial and dorsolateral) prefrontal cortices, the cingulate, parietal cortex, thalamus, amygdala, and the insular cortex, and is activated during IGT performance (Ernst et al., 2002; Lawrence et al., 2009). Cerebellar activation, specifically, was also investigated in a study by Blackwood et al. (2004). These authors noticed that the cerebellum belonged to a group of brain regions whose activation was observed during DM under both certainty and uncertainty, but was more pronounced in conditions of uncertainty. The authors suggested that the cerebellum plays a role in the internal representation of uncertain events, facilitating the prediction of future outcomes as well as inductive processes.

The results obtained by Blackwood et al. (2004) may help explain present findings regarding the DM performance of patients with cerebellar lesions. Impairments in the ability to maintain internal representations of uncertain events may have influenced these patients' ability to successfully complete the IGT. Alternatively, these findings could be explained through the role of the cerebellum in temporal organization. Together with the right prefrontal cortex and the basal ganglia, the cerebellum has been implicated in the internal representation of geographical and temporal distances between locations and events (Wheeler et al., 1997; Picton et al., 2006). If cerebellar lesions lead to impairments in the ability to establish temporal connections between actions and their consequences, it is possible that they also impair the ability to learn from experience, making it difficult to identify the advantageous and disadvantageous decks in the IGT, and to develop adequate strategies to conduct the task. Lastly, another explanation for the present findings is offered by Manes et al. (2009), who found that, although patients with cerebellar damage may obtain adequate scores in neuropsychological tasks, they may have significant difficulty planning and implementing effective strategies to conduct these tasks. Such a pattern of behavior could also be responsible for the impaired IGT performance observed in the patients with cerebellar lesions who took part in the present study.

In spite of the evidence pointing to the role of the cerebellum in DM, few studies have examined this cognitive function in patients with cerebellar damage. This fact is especially surprising given the large number of studies suggesting that this population displays impaired performance in other aspects of EF (Karatekin et al., 2000; Gottwald et al., 2004; Manes et al., 2009). One of the few studies which examined DM in patients with cerebellar damage was conducted by Gerschcovich et al. (2011). These authors assessed a patient with extensive bilateral cerebellar damage, and found that this individual displayed impairments in the IGT, as he consistently selected cards from the disadvantageous decks. These findings support those obtained by the present study. However, Abel et al. (2007) found that patients with cerebellar degeneration performed similarly to controls in the IGT, in that they learned to avoid the disadvantageous decks as the task progressed. Nonetheless, two methodological differences between these studies should be noted: one examined a group of individuals with cerebellar degeneration (Abel et al., 2007), while the other was a case study of a single individual with an acquired cerebellar lesion (Gerschcovich et al., 2011). These findings show that research into the role of the cerebellum in DM is still in its infancy, and there is little convergence in the results of the few studies conducted.

The diversity in results regarding the role of the cerebellum in DM could be attributable to the variability in the cognitive repercussions of cerebellar damage. The present findings would support such a hypothesis, as it was observed that some patients with a cerebellar stroke displayed disadvantageous DM—as did individuals with a frontal stroke—while others performed similarly to controls. The heterogeneity in sample performance could also explain why some studies have not found DM impairments in patients with cerebellar damage.

It is also important to investigate whether the cognitive impairment observed in patients could be attributable to the location of the cerebellar lesion. Although the functional connectivity of the cerebellum has only begun to be explored, evidence suggests that frontal-cerebellar connections involve only a few specific areas in this brain structure (Krienen and Buckner, 2009; O'Reilly et al., 2010). The role of frontal-cerebellar connections in DM has been discussed in the literature, and it has been suggested that disruptions in this connection could impair DM (Manes et al., 2009). Therefore, it is possible that cerebellar damage leads to DM impairment only when it affects areas involved in frontal-cerebellar circuits. To investigate this possibility, further studies of patients with cerebellar damage must be carried out, and involve detailed analysis of neuroimaging exams so that the role of different cerebellar areas in DM can be more precisely outlined.

The present results also highlight how little is known about the functional connectivity of the cerebellum, and suggest that behavioral studies may help in this regard by identifying cognitive functions in which the cerebellum is involved. Results obtained from comparisons between patients with frontal and cerebellar lesions also suggest that comparative studies could contribute significantly to knowledge of the cerebellum's role in cognition. If associations are found between cerebellar damage and impaired performance in tasks whose underlying cognitive functions and brain structures are well-known, it will be possible to generate more robust hypotheses about the other cortical and subcortical structures to which the cerebellum may be connected. Very valuable results in this regard could also be obtained by comparing, for instance, patients with lesions in the cerebellum with individuals who suffered strokes in the left- vs. right prefrontal cortex or the basal ganglia. Such studies could also elucidate the similarities and differences in EF between these neurological conditions.

The present results also speak to the differences in the severity of executive dysfunctions associated with different types of acquired lesions. Patients with strokes in the frontal cortex presented with more severe executive impairments than ones with cerebellar damage. These results are in agreement with those obtained in a study by Alexander et al. (2012), who found that cognitive impairments after cerebellar damage tend to be less severe and last for a shorter period of time, occurring mostly during the acute period following the stroke. The patients in the current study were assessed on average 10 months post-lesion, so that their condition could be considered chronic (Rousseaux et al., 2010); as such, the patterns of cognitive dysfunction identified can be considered permanent consequences of the stroke as opposed to the result of temporary brain changes following the lesion.

In summary, the present results suggest that patients with cerebellar strokes display impairments in DM, although these are less severe than the impairments found in patients with frontal strokes. The findings also speak to the role of brain structures outside the frontal lobes in DM, and should be further investigated in behavioral and neuroimaging studies. Furthermore, these results are in agreement with other studies in the literature that point to executive impairment following cerebellar damage. However, the findings must be interpreted in light of some limitations, such as the small sample size and the use of a single behavioral paradigm to assess DM and EF. Although the IGT has proved to be sensitive in detecting DM impairments in a number of populations, the task does not provide a reliable indication of the specific cognitive alterations that cause the impairments identified. It is also important to note that, since neuroimaging data was collected through retrospective chart reviews, the quality of the information obtained regarding the size and location of patient lesions was limited by the precision with which these exams were originally conducted. Since the records reviewed varied widely in terms of the level of detail with which patient lesions were described, it was not possible to determine lesion locations within the cerebellum and frontal lobes with much specificity. This may be considered an important limitation of the present study. Future studies should therefore be conducted with larger samples, other experimental paradigms and more detailed neuroimaging data so as to analyze the cognitive repercussions of lesion laterality and location in patients with cerebellar strokes. The association between IGT performance and symptoms of dementia and depression, which in the present study were only used as exclusion criteria, may also be an interesting topic for further investigation. 
It is also suggested that future studies include a control group involving post-stroke patients with lesions in areas other than the frontal lobes and cerebellum, such as basal ganglia injury, so as to control for the general effects of the presence of a vascular lesion.

## **REFERENCES**


traumatic brain injury. *Brain Cogn.* 84, 63–68. doi: 10.1016/j.bandc.2013.11.005

Yesavage, J. A., and Sheikh, J. I. (1986). Geriatric depression scale (GDS) recent evidence and development of a shorter version. *Clin. Gerontol.* 5, 165–173. doi: 10.1300/J018v05n01\_09

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 November 2013; accepted: 15 March 2014; published online: 08 April 2014.*

*Citation: Cardoso CO, Branco LD, Cotrena C, Kristensen CH, Schneider Bakos DDG and Fonseca RP (2014) The impact of frontal and cerebellar lesions on decision making: evidence from the Iowa Gambling Task. Front. Neurosci. 8:61. doi: 10.3389/fnins.2014.00061*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Cardoso, Branco, Cotrena, Kristensen, Schneider Bakos and Fonseca. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Decomposing the roles of perseveration and expected value representation in models of the Iowa gambling task

## *Darrell A. Worthy\*, Bo Pang and Kaileigh A. Byrne*

*Department of Psychology, Texas A&M University, College Station, TX, USA*

#### *Edited by:*

*Yao-Chu Chiu, Soochow University, Taiwan*

#### *Reviewed by:*

*Eric-Jan Wagenmakers, University of Amsterdam, Netherlands; Eldad Yechiam, Technion - Israel Institute of Technology, Israel*

#### *\*Correspondence:*

*Darrell A. Worthy, Department of Psychology, Texas A&M University, 4235 TAMU, College Station, TX 77843-4235, USA; e-mail: worthyda@tamu.edu*

Models of human behavior in the Iowa Gambling Task (IGT) have played a pivotal role in accounting for behavioral differences during decision-making. One critical difference between models that have been used to account for behavior in the IGT is the inclusion or exclusion of the assumption that participants tend to persevere, or stay with the same option over consecutive trials. Models that allow for this assumption include win-stay-lose-shift (WSLS) models and reinforcement learning (RL) models that include a decay learning rule where expected values for each option decay as they are chosen less often. One shortcoming of RL models that have included decay rules is that the tendency to persevere by sticking with the same option has been conflated with the tendency to select the option with the highest expected value because a single term is used to represent both of these tendencies. In the current work we isolate the tendencies to perseverate and to select the option with the highest expected value by including them as separate terms in a Value-Plus-Perseveration (VPP) RL model. Overall the VPP model provides a better fit to data from a large group of participants than models that include a single term to account for both perseveration and the representation of expected value. Simulations of each model show that the VPP model's simulated choices most closely resemble the decision-making behavior of human subjects. In addition, we also find that parameter estimates of loss aversion are more strongly correlated with performance when perseverative tendencies and expected value representations are decomposed as separate terms within the model. The results suggest that the tendency to persevere and the tendency to select the option that leads to the best net payoff are central components of decision-making behavior in the IGT. Future work should use this model to better examine decision-making behavior.

#### **Keywords: decision-making, computational modeling of decision, perseveration, expected value, Iowa Gambling Task**

The Iowa Gambling Task (IGT) has played a critical role in the vast amount of progress that has taken place over the past two decades to develop a more complete understanding of human decision-making behavior. One of the most interesting developments in research that has utilized the IGT to examine decision-making processes has been the emergence and use of computational models to account for various aspects of behavior in the task. The Expectancy Valence (EV) model has been perhaps the most widely used model to quantitatively characterize human behavior in the task (Busemeyer and Stout, 2002; Yechiam et al., 2005, 2010; Agay et al., 2010; Hochman et al., 2010; Weller et al., 2010; Wetzels et al., 2010).

The EV model has been very useful in examining how different clinical or neuropsychological disorders affect different decision-making processes. For example, Yechiam et al. (2005) used the model to identify groups that attend more to gains than to losses (cocaine users, cannabis users, and seniors), attend more to losses than to gains (Asperger's patients), or attend to only the most recent outcomes (ventromedial prefrontal cortex patients). The EV model and other RL models have been a dominant class of models used to characterize decision-making behavior in numerous studies (Sutton and Barto, 1998; Worthy et al., 2007; Gureckis and Love, 2009a,b). The basic assumptions underpinning the EV model, and other related RL models, are that outcomes of past decisions are integrated to determine expected reward values for each option, and that decision-makers select options with higher expected rewards with greater probability than options with lower expected rewards.
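The second assumption, that options with higher expected rewards are chosen with greater probability, is commonly implemented in RL models with a softmax (exponential) choice rule. The sketch below illustrates that rule under stated assumptions: the function name and the sensitivity parameter `theta` are hypothetical labels, and the exact form used in any particular model may differ.

```python
import math

def softmax_choice_probabilities(expected_values, theta=1.0):
    """Map expected values to choice probabilities with a softmax rule.

    theta (assumed name) controls how strongly choices favor the option
    with the highest expected value; theta = 0 gives random choice.
    """
    scaled = [theta * value for value in expected_values]
    largest = max(scaled)  # subtracted for numerical stability
    weights = [math.exp(value - largest) for value in scaled]
    total = sum(weights)
    return [weight / total for weight in weights]

if __name__ == "__main__":
    # Hypothetical expected values for decks A, B, C and D.
    print(softmax_choice_probabilities([-0.5, -0.5, 0.25, 0.25], theta=2.0))
```

With equal expected values the rule yields equal probabilities, and raising `theta` concentrates probability on the highest-valued options.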

Although the EV model has been widely used, recent work has found that other models can provide a better account of behavior in the task. One such model is another RL model called the Prospect Valence Learning (PVL) model (Ahn et al., 2008, 2011). One advantage of the PVL model is that it assumes that the weight people give to gains and losses follows the assumptions of Prospect Theory (Kahneman and Tversky, 1979). An additional assumption of the best-fitting version of the PVL model is the assumption that expected values for each option decay over trials. The EV model has primarily utilized a Delta learning rule that is also known as a Rescorla-Wagner rule (Rescorla and Wagner, 1972; Sutton and Barto, 1998; Yechiam and Busemeyer, 2005). This rule assumes that the expected values for each option are recency-weighted averages of the rewards received on each trial. These expected values remain unchanged until an option is chosen on a different trial. In contrast, a Decay learning rule assumes that expected values for each option decay on each trial (Erev and Roth, 1998).

The Decay rule effectively assumes that options that are not chosen will decline in expected value. Consequently, an option will become increasingly more likely to be selected the more frequently it has been selected in the recent past because its value, relative to the value of all other options, will increase due to the decaying values of the unchosen options. Thus, models that assume a Decay rule allow for the assumption that participants will *persevere* by repeatedly selecting the same option.

Another model that allows for the same assumption of perseveration, and has also provided good fits to IGT data, is a win-stay-lose-shift (WSLS) model (Worthy et al., 2012, 2013). The WSLS model assumes that participants stay (persevere) with a certain probability by picking the same option if the net reward on the previous trial was greater than zero (a "win" trial), and switch with a certain probability by picking a different option if the net reward on the previous trial was less than zero (a "lose" trial). The win-stay and lose-shift probabilities are free parameters in the model, allowing the model to account for perseverative behavior in which people sample an option repeatedly over several trials.

The WSLS and PVL models both provide better fits to data than the EV model that utilizes a Delta learning rule (Worthy et al., 2013). However, the PVL and WSLS models that have been utilized to date have a critical shortcoming in how they represent the expected values for each option. The WSLS model assumes that participants do not use any information about the relative value of each option and respond only based on whether the previous trial had a positive or negative outcome. This is a questionable assumption, at best, as it is very likely that participants give at least some consideration to the rewards they expect to receive when they select each option. The PVL model is structured so that expected reward values for each option are compared against each other to determine choice. However, the tendency to select the option with the highest expected value is conflated with the tendency to persevere by picking the same option over consecutive trials because the model uses a single value to represent both of these tendencies.

In the current work we decompose the tendency to persevere and the tendency to select options based on their reward value by developing a Value-Plus-Perseveration (VPP) model that includes separate terms to represent perseveration and expected value. Similar approaches have been utilized in other decision-making tasks by adding autocorrelation terms that are identical in form to the Decay rule (Lau and Glimcher, 2005; Schönberg et al., 2007; Kovach et al., 2012). The assumption underlying this modeling approach is that tendencies for perseveration and maximization of expected value are two fundamental, but separate, aspects of decision-making. As we will show, fits of the VPP model provide a better account of data from human participants. The parameter estimates are also more informative in that parameters measuring important aspects of behavior that are assessed using the IGT, like loss aversion (Weller et al., 2010), are more strongly associated with behavior when expected value representation is decomposed from the tendency to persevere. Additionally, simulations from the VPP model are more closely aligned with participants' data when including the number of trials that participants switched to a different option over the course of the task. Models that don't include a perseveration component tend to over-predict switch trials or under-predict perseverative behavior, while models that conflate perseveration and maximization of expected value tend to under-predict switch trials.

In the following sections we first present the models we fit to our data. We then present the methods for our experiment where participants performed the original version of the IGT (Bechara et al., 1994), followed by the behavioral and modeling results which include a comparison of each model's simulated performance and the performance of our participants. We conclude by discussing the implications of our results and by suggesting that this approach, or similar modeling approaches, be utilized to examine IGT behavior in different participant groups.

#### **MODEL DESCRIPTIONS**

The RL models that have been fit previously to IGT data have had three components: a utility function, a value-updating rule, and an action-selection rule. The first component, the utility function, determines the degree to which gains are weighed relative to losses. The EV utility function assumes that gains and losses are simply differentially weighted. After a choice is made and feedback [points gained, win(*t*), and lost, loss(t)] is presented, the utility *u*(*t*) for the choice made on trial *t* is given by:

$$u(t) = w \cdot \text{win}(t) - (1 - w) \cdot \text{loss}(t) \tag{1}$$

*w* (0 ≤ *w* ≤ 1) represents the degree to which participants weigh gains vs. losses. Values greater than 0.50 indicate greater weight for gains than losses.
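As a quick numeric check of Equation (1), the sketch below evaluates the EV utility for a single hypothetical trial. The function name `ev_utility` and the example payoffs are illustrative assumptions, and `loss` is assumed to be entered as a non-negative magnitude.

```python
def ev_utility(win: float, loss: float, w: float) -> float:
    """EV utility from Equation (1); `loss` is a non-negative magnitude."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * win - (1.0 - w) * loss


# Hypothetical trial: 100 points gained and 250 points lost.
balanced = ev_utility(100, 250, w=0.5)      # gains and losses weighted equally
gain_biased = ev_utility(100, 250, w=0.75)  # gains weighted three times as much as losses
print(balanced, gain_biased)
```

With equal weighting the trial is evaluated as a net loss, whereas a strongly gain-weighted decision-maker evaluates the same trial as a net gain.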

The Prospect Valence utility function assumes that the evaluation of each outcome follows the utility function derived from Prospect Theory (Kahneman and Tversky, 1979; Ahn et al., 2008), which has diminishing sensitivity to increases in magnitude, and different sensitivity to losses vs. gains. The utility, *u(t)*, on trial *t,* of each net outcome, *x(t)*, is:

$$u(t) = \begin{cases} x(t)^{\alpha} & \text{if } x(t) \ge 0 \\ -\lambda |x(t)|^{\alpha} & \text{if } x(t) < 0 \end{cases} \tag{2}$$

Here α is a shape parameter (0 < α < 1) that governs the curvature of the utility function, and λ is a loss aversion parameter (0 < λ < 5) that determines the sensitivity to losses compared to gains. A value of λ greater than 1 indicates that the individual is more sensitive to losses than to gains, and a value less than 1 indicates greater sensitivity to gains than to losses.
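The two properties of Equation (2), diminishing sensitivity and asymmetric weighting of losses, can be seen in a small sketch. The function name `pv_utility` and the parameter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def pv_utility(x: float, alpha: float, lam: float) -> float:
    """Prospect Valence utility from Equation (2) for a net outcome x."""
    if x >= 0:
        return x ** alpha
    return -lam * abs(x) ** alpha


# Hypothetical parameters: alpha = 0.5, lambda = 2 (losses weigh twice as much as gains).
for x in (100, 400, -100):
    print(x, pv_utility(x, alpha=0.5, lam=2.0))
```

With these values, quadrupling a gain from 100 to 400 only doubles its utility, and a loss of 100 has twice the absolute utility of a gain of 100.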

The second component, the value-updating rule, determines how the utility *u*(*t*) is used to update expected values or expectancies *Ei*(*t*) for the chosen option, *i*, on trial *t*. The Delta rule assumes that Expectancies are recency-weighted averages of the rewards received for each option:

$$E_i(t) = E_i(t-1) + \phi \cdot [u(t) - E_i(t-1)] \tag{3}$$

The recency parameter (0 ≤ φ ≤ 1) describes the weight given to recent outcomes in updating expectancies. Higher values indicate a greater weight to recent outcomes.
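A minimal sketch of the Delta rule in Equation (3) follows. The helper name `delta_update`, the starting expectancies of zero, and the sequence of utilities are assumptions made for illustration; only the chosen deck's expectancy changes on each trial.

```python
def delta_update(E, chosen, u, phi):
    """Return a new expectancy list after applying Equation (3) to the chosen deck."""
    updated = list(E)
    updated[chosen] += phi * (u - updated[chosen])
    return updated


E = [0.0, 0.0, 0.0, 0.0]
for u in (10.0, 10.0, -20.0):   # hypothetical utilities from three draws of deck 0
    E = delta_update(E, chosen=0, u=u, phi=0.5)
    print(E)
```

The expectancies of the three unchosen decks stay at their previous values throughout, which is the property that distinguishes this rule from the Decay rule.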

The Decay rule (Erev and Roth, 1998) assumes that Expectancies of all decks decay, or are discounted, over time, and that the current outcome utility is then added to the Expectancy of the chosen deck:

$$E_j(t) = A \cdot E_j(t-1) + \delta_j(t) \cdot u(t) \tag{4}$$

The decay parameter *A* (0 ≤ *A* ≤ 1) determines how much the past expectancy is discounted. δ*j*(*t*) is a dummy variable that is 1 if deck *j* is chosen and 0 otherwise.
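The following sketch applies Equation (4) to all four decks at once. The helper name `decay_update`, the starting values, and the constant utility of 1 are hypothetical; the point is only to show how repeated selection of one deck raises its value relative to the others.

```python
def decay_update(E, chosen, u, A):
    """Apply Equation (4): discount every expectancy, then add u to the chosen deck."""
    return [A * e + (u if j == chosen else 0.0) for j, e in enumerate(E)]


E = [1.0, 1.0, 1.0, 1.0]
for _ in range(3):              # deck 0 is drawn three times, each with utility 1
    E = decay_update(E, chosen=0, u=1.0, A=0.5)
print(E)
```

After three draws the chosen deck's expectancy has grown while the unchosen decks have decayed toward zero, so the gap between them widens on every repeated selection.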

The third component, the action-selection rule, determines the predicted probability that deck *j* will be chosen on trial *t*, Pr[*Gj(t)*], which is calculated using a Softmax rule (Sutton and Barto, 1998):

$$Pr(G_j(t)) = \frac{e^{\theta(t) \cdot E_j(t)}}{\sum_{k=1}^{4} e^{\theta(t) \cdot E_k(t)}} \tag{5}$$

In the present work we utilize a trial-independent action-selection<sup>1</sup> rule for all the RL models fit to the data:

$$\theta(t) = 3^{c} - 1 \tag{6}$$

where *c* (0 ≤ *c* ≤ 5) is the response consistency or exploitation parameter. Larger values of *c* indicate a greater tendency to select options with higher expected values, while smaller values indicate a greater tendency to explore options with lower expected values.
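To illustrate how the consistency parameter shapes choice, the sketch below combines the Softmax rule of Equation (5) with the trial-independent sensitivity of Equation (6). The function name `choice_probs` and the expectancy values are assumptions; subtracting the largest value before exponentiating is a standard numerical safeguard that leaves the probabilities unchanged.

```python
import math


def choice_probs(values, c):
    """Softmax of Equation (5) with theta = 3**c - 1 from Equation (6)."""
    theta = 3.0 ** c - 1.0
    peak = max(values)  # subtracting the maximum avoids overflow; probabilities are unchanged
    weights = [math.exp(theta * (v - peak)) for v in values]
    total = sum(weights)
    return [wt / total for wt in weights]


E = [1.0, 0.5, 0.0, -0.5]       # hypothetical expectancies for four decks
print(choice_probs(E, c=0.0))   # theta = 0: every deck is equally likely
print(choice_probs(E, c=2.0))   # theta = 8: the best deck dominates
```

At *c* = 0 the rule reduces to random choice, and as *c* grows the probability mass concentrates on the option with the highest expectancy.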

We first fit a total of four single-term RL models that were derived from the factorial combination of two utility functions (PVL and EV) and two value-updating rules (Decay and Delta rules). As will be described in greater detail below, we found that the PVL Delta rule model provided a better fit to the data than the EV Delta rule model. Given the better fit of the PVL Delta rule model, we used the PVL utility function and a Delta rule to determine the expected reward value on each trial for the two-term VPP model. The PVL utility function has also been found to outperform the EV utility function in other recent work (Ahn et al., 2008). Thus, in the VPP model the values for the first term, the expected values or expectancies [*Ej*(*t*)] for each *j* choice, were determined based on Equations (2) and (3) above.

The second term, the perseveration [*Pj*(*t*)] strengths for each *j* option were determined by a more general form of the Decay rule that has been used to model perseveration or autocorrelation among choices in recent work (Schönberg et al., 2007; Kovach et al., 2012). The perseveration term for chosen option *i*, on trial *t,* differed based on whether the net outcome, *x(t)*, was positive or negative:

$$P_i(t) = \begin{cases} k \cdot P_i(t-1) + \varepsilon_{\text{pos}} & \text{if } x(t) \ge 0 \\ k \cdot P_i(t-1) + \varepsilon_{\text{neg}} & \text{if } x(t) < 0 \end{cases} \tag{7}$$

Here *k* (0 ≤ *k* ≤ 1) is a decay parameter similar to *A* in Equation (4) above. The tendency to perseverate or switch is incremented each time an option is chosen by εpos and εneg which we allowed to vary between −1 and 1. Positive values indicate a tendency to persevere by picking the same option on succeeding trials, while negative values indicate a tendency to switch.

The overall value of each option was determined by taking a weighted average of the two terms in the model, the expected value and the perseveration strength of each *j* option:

$$V_j(t) = w_{E_j} \cdot E_j(t) + (1 - w_{E_j}) \cdot P_j(t) \tag{8}$$

where *wEj* (0 ≤ *wEj* ≤ 1) is the weight given to the expected value for each option. Values greater than 0.5 indicate greater weight based on the expected value of each option, and values less than 0.5 indicate greater weight based on the perseverative strength of each option.

These values *Vj*(*t*) were entered into a Softmax rule to determine the probability of selecting each option, *j*, on each trial, *t*:

$$Pr(G_j(t)) = \frac{e^{\theta(t) \cdot V_j(t)}}{\sum_{k=1}^{4} e^{\theta(t) \cdot V_k(t)}} \tag{9}$$

where θ(*t*) was determined based on Equation (6) above.
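Putting Equations (2), (3), and (6)-(9) together, one trial of the VPP model might be sketched as follows. This is an illustrative reading rather than the authors' code: the names `VPPParams` and `vpp_step` and all parameter values are hypothetical, and the sketch assumes that the perseveration strengths of unchosen options simply decay by *k*, in line with the Decay rule that Equation (7) generalizes.

```python
import math
from dataclasses import dataclass


@dataclass
class VPPParams:
    alpha: float    # utility shape, Equation (2)
    lam: float      # loss aversion, Equation (2)
    phi: float      # recency, Equation (3)
    k: float        # perseveration decay, Equation (7)
    eps_pos: float  # perseveration increment after a net gain
    eps_neg: float  # perseveration increment after a net loss
    w_E: float      # weight on expected value, Equation (8)
    c: float        # response consistency, Equation (6)


def vpp_step(E, P, chosen, x, p):
    """Update both terms after net outcome x and return next-trial choice probabilities."""
    u = x ** p.alpha if x >= 0 else -p.lam * abs(x) ** p.alpha
    E = list(E)
    E[chosen] += p.phi * (u - E[chosen])
    # Assumption: every perseveration strength decays by k; only the chosen one is incremented.
    P = [p.k * strength for strength in P]
    P[chosen] += p.eps_pos if x >= 0 else p.eps_neg
    V = [p.w_E * e + (1.0 - p.w_E) * s for e, s in zip(E, P)]
    theta = 3.0 ** p.c - 1.0
    peak = max(V)
    weights = [math.exp(theta * (v - peak)) for v in V]
    total = sum(weights)
    return E, P, [wt / total for wt in weights]


params = VPPParams(alpha=0.5, lam=2.0, phi=0.3, k=0.5,
                   eps_pos=0.5, eps_neg=-0.5, w_E=0.5, c=1.0)  # hypothetical values
E, P, probs = vpp_step([0.0] * 4, [0.0] * 4, chosen=2, x=100, p=params)
print(probs)
```

Because the expectancy and perseveration terms are stored separately, the loss aversion parameter affects only `E`, while the tendency to stay or switch is carried entirely by `P`.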

In addition to fitting the RL models described above we also fit a WSLS model and a Baseline model. The WSLS model we used in the present work has two free parameters and is identical to the model used in prior work from our lab (Worthy et al., 2013). The first parameter represents the probability of staying with the same option on the next trial if the net gain received on the current trial is equal to or greater than zero:

$$P(G_j(t) \mid \text{choice}_{t-1} = G_j \,\&\, r(t-1) \ge 0) = P(\text{stay} \mid \text{win}) \tag{10}$$

In Equation (10) *r* represents the net payoff received on a given trial where any loss is subtracted from the gain received. The probability of switching to another option following a win trial is 1−P(stay | win). To determine a probability of selecting each of the other three options we divide this probability by three, so that the probabilities for selecting each of the four options sum to one.

The second parameter represents the probability of shifting to a different option on the next trial if the net reward received on the current trial is less than zero:

$$P(G_j(t) \mid \text{choice}_{t-1} = G_j \,\&\, r(t-1) < 0) = P(\text{shift} \mid \text{loss}) \tag{11}$$

This probability is divided by three and assigned to each of the other three options. The probability of staying with an option following a "loss" is 1 − *P*(shift|loss).
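The verbal description of the WSLS model can be turned into a small function that returns the predicted choice probabilities for the next trial. The function name `wsls_probs` and the parameter values below are hypothetical; the logic follows the text, with the shift probability split evenly across the three other options.

```python
def wsls_probs(prev_choice, prev_net, p_stay_win, p_shift_loss, n_options=4):
    """Choice probabilities for the next trial under the WSLS model."""
    if prev_net >= 0:                       # "win" trial
        stay = p_stay_win
    else:                                   # "lose" trial
        stay = 1.0 - p_shift_loss
    other = (1.0 - stay) / (n_options - 1)  # shift probability split evenly
    return [stay if j == prev_choice else other for j in range(n_options)]


print(wsls_probs(prev_choice=0, prev_net=50, p_stay_win=0.7, p_shift_loss=0.6))
print(wsls_probs(prev_choice=0, prev_net=-150, p_stay_win=0.7, p_shift_loss=0.6))
```

Note that the size of the previous payoff plays no role beyond its sign, which is why this model carries no information about the relative value of the options.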

Finally, the Baseline model assumes fixed choice probabilities (Yechiam and Busemeyer, 2005; Gureckis and Love, 2009a; Worthy and Maddox, 2012). The Baseline model has three free parameters that represent the probability of selecting Deck A, B, or C (the probability of selecting Deck D is 1 minus the sum of the three other probabilities).

<sup>1</sup>A trial-dependent rule has also been applied to models that have been fit to IGT data (Yechiam and Busemeyer, 2005). We found that the pattern of relative fits across the models we present was the same regardless of which action-selection rule was used, and that the trial-independent rule fit best in most cases. Therefore, for simplicity we only use the trial-independent rule in the present work.

The right column of **Table 2** lists the equations used for each model.

## **METHOD**

### **PARTICIPANTS**

Thirty-five (22 females) undergraduate students from Texas A&M University participated for partial fulfillment of a course requirement.

### **MATERIALS AND PROCEDURE**

Participants performed the experiment on PCs using Matlab software with Psychtoolbox (version 2.5). Participants were given the following instructions:

In this study we are interested in how people use information to make decisions.

You will repeatedly select from one of four decks of cards, and you could gain or lose points on each draw. You will be given 2000 points to start and your goal is to try to finish with at least 2500 points.

Each time you draw, the card you picked will be turned over and the number of points you gained and lost will be displayed.

You will press the 'Z', 'W', 'P', and '?/' keys to draw from each deck.

Just do your best to maximize your gains and minimize your losses so you can finish with at least 2500 points.

Press any key to begin.

On each of 100 trials four decks appeared on the screen and participants selected one deck. Upon each selection the computer screen displayed the card choice, reward, penalty and net gain beneath the card decks. The total score was displayed on a score bar at the bottom of the screen. The task was self-paced, and participants were unaware of how many card draws they would receive. The schedule of rewards and penalties was identical to those used in the original IGT (**Table 1**; Bechara et al., 1994).

## **RESULTS**

We first computed a performance measure that was the proportion of trials when participants selected the good decks minus the proportion of trials that they selected the bad decks. **Figure 1** shows these performance values over five 20-trial blocks. A repeated measures ANOVA revealed a significant effect of block, *F*(4) = 5.46, *p* < 0.001, partial η<sup>2</sup> = 0.14, which suggests that participants learned to select the advantageous decks more over the course of the experiment.
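The performance measure can be computed per block with a few lines of code. The sketch below assumes the standard IGT labeling in which Decks C and D are the good decks and Decks A and B are the bad decks, and that every recorded choice is one of these four decks; the function name `block_scores` and the example choice sequence are hypothetical.

```python
def block_scores(choices, block_size=20, good=("C", "D")):
    """Proportion of good-deck picks minus proportion of bad-deck picks, per block."""
    scores = []
    for start in range(0, len(choices), block_size):
        block = choices[start:start + block_size]
        n_good = sum(deck in good for deck in block)
        # good minus bad equals n_good - (n - n_good), expressed as a proportion of n
        scores.append((2 * n_good - len(block)) / len(block))
    return scores


# Hypothetical participant: only bad decks in block 1, mostly good decks in block 2.
choices = ["A"] * 20 + ["C"] * 15 + ["B"] * 5
print(block_scores(choices))
```

Scores range from −1 (only bad decks chosen) to 1 (only good decks chosen), with 0 indicating no preference.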

#### **MODELING RESULTS**

**Table 1 | Reward schedule for the IGT.**

*See Bechara et al. (1994) for the full table which lists payoffs for the first 40 cards drawn from each deck. In the present task the sequence was repeated for cards 41–80 and 81–100 so that a participant could potentially select the same deck on all 100 draws. Bold values indicate amount lost on each trial.*

Models were fit individually to each participant's data by maximizing the log-likelihood for each model's prediction on each trial. We used Akaike's Information Criterion (AIC) (Akaike, 1974) and the Bayesian Information Criterion (BIC) (Schwarz, 1978) to examine the fit of each model relative to the fit of the Baseline model. AIC penalizes models with more free parameters. For each model, *i*, AIC*i* is defined as:

$$\text{AIC}_{i} = -2\log L_{i} + 2V_{i} \tag{12}$$

where *Li* is the maximum likelihood for model *i*, and *Vi* is the number of free parameters in the model. BIC is defined as:

$$\text{BIC}_{i} = -2\log L_{i} + V_{i}\log(n) \tag{13}$$

where *n* is the number of trials. Smaller AIC and BIC values indicate a better fit to the data. Average AIC and BIC values for each single-term model are listed at the top of **Table 2**. The fits of the two Decay rule models were very similar, and better than the fits of the Delta rule models. Of the two Delta rule models, the model with a PVL utility function provided a much better fit than the model with an EV utility function. Overall, the VPP model provided the best fit to the data, based on both AIC and BIC.
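Equations (12) and (13) are straightforward to compute once a model's maximized log-likelihood is available. In the sketch below the log-likelihood values and the parameter count of the "Candidate" model are invented for illustration and are not results from the present study; natural logarithms are assumed throughout.

```python
import math


def aic(log_lik, n_params):
    """Equation (12)."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_trials):
    """Equation (13), using the natural logarithm of the number of trials."""
    return -2.0 * log_lik + n_params * math.log(n_trials)


# Hypothetical maximized log-likelihoods for one participant over 100 trials.
fits = {"Baseline": (-135.0, 3), "Candidate": (-120.0, 8)}
for name, (log_lik, n_params) in fits.items():
    print(name, aic(log_lik, n_params), round(bic(log_lik, n_params, 100), 2))
```

With 100 trials, each additional parameter costs about 4.6 points under BIC but only 2 under AIC, so the two criteria can rank a richer model differently relative to the Baseline model.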

#### *Simulations*

Next we performed simulations for each learning model (all models except the Baseline model) to examine the proportion of trials that each model selected each option. We also examined the proportion of trials that each model switched to a different option, which is an index of the general propensity to persevere or switch. We used the parameter values that best fit our participants' data for the simulated data sets. For each model we generated 1000 data sets using parameter combinations that were sampled with replacement from the best-fitting parameter combinations for participants in our Experiment. Thus, for the EV Delta rule model we randomly sampled a combination of *w,* φ, and *c* that provided the best fit to one participant's data and used those parameter values to perform one simulation of the task. We generated 1000 simulated data sets in this manner, and performed the same simulation procedure with each learning model. This is the same approach that we've followed in recent work from our lab (Worthy et al., 2012, 2013).
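The resampling step of this procedure can be sketched as below. The names `simulate_datasets` and `toy_simulate` are hypothetical, and the toy simulator is only a placeholder with fixed choice probabilities; in practice it would be replaced by a function that runs the learning model of interest for 100 trials with the sampled parameters.

```python
import random


def simulate_datasets(fitted_params, simulate_one, n_datasets=1000, seed=0):
    """Draw parameter sets with replacement and simulate one data set per draw."""
    rng = random.Random(seed)
    return [simulate_one(rng.choice(fitted_params), rng) for _ in range(n_datasets)]


def toy_simulate(params, rng, n_trials=100):
    """Placeholder model: fixed choice probabilities over four decks (not an RL model)."""
    return rng.choices(range(4), weights=params, k=n_trials)


fitted = [[0.1, 0.2, 0.3, 0.4], [0.25, 0.25, 0.25, 0.25]]  # hypothetical per-participant fits
datasets = simulate_datasets(fitted, toy_simulate)
print(len(datasets), len(datasets[0]))
```

Sampling whole parameter combinations, rather than each parameter independently, preserves any dependencies among the parameters estimated for a given participant.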

*Standard deviations are listed in parentheses.*

**Figure 2A** shows the average proportion of times participants and each model selected each option throughout the task. The VPP model's simulated choices most closely mirror the choices made by participants, although it slightly under-predicts Deck A and B selections and slightly over-predicts Deck C and D selections. **Figure 2B** shows the proportion of switch trials by participants and by each model in 20-trial blocks of the task. Across all trials, the simulated switch trials for the VPP model are nearly equivalent to the average number of switch trials for participants, and are equivalent if rounded to the nearest whole number (62; 62.4 for participants and 61.75 for the VPP model's simulations). Relative to the average switches made by participants, the two single-term Delta rule models, which do not have mechanisms to allow for perseveration, switched more often during their simulations. In contrast, the two single-term Decay rule models, which do have mechanisms to allow for perseveration, switched less often during their simulations. Thus, the Delta rule models under-predicted perseverative behavior, and the Decay rule models slightly over-predicted perseverative behavior.
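The switch-trial index used in this comparison is simple to compute from a sequence of choices. The function name `count_switches` and the example sequence are hypothetical.

```python
def count_switches(choices):
    """Number of trials on which the chosen deck differs from the previous trial's deck."""
    return sum(prev != cur for prev, cur in zip(choices, choices[1:]))


print(count_switches(["A", "A", "B", "B", "B", "C", "A"]))
```

The same function can be applied to observed and simulated choice sequences alike, so that a model's tendency to over- or under-predict perseveration can be read directly from the difference in counts.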

#### *Parameter estimates*

**Table 3** lists the average best fitting parameter values for each model along with the correlations between each parameter and performance over the entire task (proportion of Advantageous minus Disadvantageous deck selections). Of the four single-term RL models, the only parameter that was significantly associated with performance was the learning rate parameter (φ) for the PVL Delta rule model. Lower values of this parameter were associated with better performance. This could suggest that less attention to the most recent outcomes, and more attention to outcomes received over longer periods of time, may have led to better estimates of each option's expected value.

Additionally, the VPP model's estimated exploitation parameter values (*c*) were also positively associated with performance. We also observed a significant positive association between the WSLS model's estimated lose-shift *P*(shift|loss) parameter values and performance, which suggests that participants performed better if they were more likely to select a different option following a net loss.

**FIGURE 2 | (A)** Observed and simulated choices of each deck. Simulations randomly sampled with replacement sets of the best-fitting parameters for participants for each model. **(B)** Number of "switch" trials where participants selected a different deck than the one selected on the previous trial in 20-trial blocks.

**Table 3 | Average parameter estimates from maximum likelihood fits and association with performance for each parameter.**

*Standard deviations are listed in parentheses. \*Significant at p* < *0.05 level, \*\*\*Significant at p* < *0.001 level.*

Recent work suggests that greater attention to losses than to gains is beneficial in the IGT (Weller et al., 2010). Therefore, we were interested in examining how estimates of parameters that accounted for attention to gains vs. losses were associated with performance in the task. **Figure 3** plots these associations for each single-term model. The attention to gains parameter (*w*) in the EV Delta rule model was negatively associated with performance, and the loss aversion parameter (λ) from the PVL Delta rule model was positively associated with performance. Although the associations between these parameters and performance only approached significance, estimated values of these same parameters had essentially no relationship with performance in the EV Decay (*r* = −0.04 for *w*) and PVL Decay (*r* = 0.05 for λ) models, where the tendency to select options based on their expected values is conflated with the tendency to persevere.

There was a strong association between performance and estimated loss aversion (λ) parameter values from the VPP model (**Figure 4A**). One point to note is that many participants' data were best fit by extreme values along the bounds for these parameters from both the VPP and the single-term models. Recent work has demonstrated that a potential anomaly of estimating parameters for individual participants via maximum likelihood is that many estimates will fall on the bounds of the parameter space (Wetzels et al., 2010; Ahn et al., 2011). Thus, it is difficult to determine whether the extremely low or extremely high loss aversion parameter values indicated exclusive attention to gains or losses by some subjects, or whether those values were due to problems with estimating parameters for individual subjects via maximum likelihood.

To address this issue we estimated the VPP model's parameters using a Bayesian hierarchical procedure that has recently been used to estimate parameters from the EV Delta rule model for IGT data (Wetzels et al., 2010). While the maximum likelihood approach provides a single best-fitting set of parameters for each subject, the Bayesian hierarchical approach yields posterior distributions for each parameter that quantify the uncertainty about each parameter, given the data. Posterior distributions were estimated based on a total of 30,000 MCMC samples from three chains, after 1000 burn-in samples. **Figure 4B** plots the association between performance and the mode of each subject's posterior distribution for the loss aversion parameter from the VPP model. Similar to the estimates from maximum likelihood, there is a strong positive association between performance and loss aversion estimates (*r* = 0.52, *p* < 0.001). However, the modes of the posterior loss aversion parameter distributions are not as close to the bounds of the parameter space as the point estimates provided by the maximum likelihood fits. Thus, the relationship between loss aversion parameter estimates and performance is similar for both approaches, but maximum likelihood estimation is more likely to yield estimates near the bounds of the parameter space.

Because the measure of performance we used is only one measure among many possible ways to characterize performance on the IGT, we also examined the relationship between the mode of each subject's posterior loss aversion parameter distribution and the proportion of trials participants selected Decks A and B. These are plotted in **Figure 5**. There were negative associations between the VPP model's loss aversion parameters and selections of both options, but the association was only significant for Deck B selections (Deck A, *r* = −0.19, *p* > 0.10; Deck B, *r* = −0.51, *p* < 0.01).

## **DISCUSSION**

We presented a VPP model that included separate terms to account for perseverative behavior and tendencies to select options based on their expected values. Overall, this model provided the best fit to the data and its simulations most closely mirrored human behavior, both in the proportion of times people selected each option and in how often they tended to switch to a different option. This supports our assertion that it is critical to account for both perseveration and maximization of expected value in models of human decision-making behavior in tasks like the IGT, and it is also critical to ensure that these tendencies are decomposed in the model. People vary in both their tendency to select more advantageous options and in their tendency to "stay" or "switch" on successive trials.

There was a very strong relationship between the VPP model's best-fitting loss aversion parameter values and performance in the IGT using both maximum likelihood and Bayesian hierarchical approaches to obtain individual parameter estimates. This supports recent work that suggests that loss aversion is a critical component, perhaps the most critical component, of successful performance in the IGT (Weller et al., 2010). The role of loss aversion is intuitively obvious in that the distinguishing feature between the advantageous and disadvantageous decks is that, over time, the latter provide net losses, while the former provide net gains. The relationships between estimated loss aversion parameter values and performance sharply differed based on the learning rule that was used. Parameters that accounted for attention to losses vs. gains from the single-term Delta rule models both showed associations with performance (albeit weak ones) that suggest that enhanced attention to losses improves IGT performance. In contrast, there was basically no relationship between parameter estimates of attention to losses and performance for the single-term Decay rule models. This is an important point because these models differ based on their assumptions of how important loss aversion is for successful performance in the task. We propose that the null relationship between loss aversion parameter estimates and performance for the Decay rule models is due to the conflation between representations for expected value maximization and perseverative behavior.

Additionally, it is important to note that loss aversion and attention to gains parameter estimates from all the models we fit via maximum likelihood estimation were not normally distributed. Many data sets were best fit by extreme values for these parameters, which may be an anomaly that comes from estimating parameters using maximum likelihood. Bayesian hierarchical parameter estimation is an alternative method of estimating parameters that has several advantages over maximum likelihood estimation, particularly at the individual subject level (Wetzels et al., 2010).

In an elegant and very thorough analysis of model performance in the IGT and the Soochow gambling task (Lin et al., 2007; Chiu et al., 2008), Ahn et al. (2008) recently suggested that Decay learning rules are better at making short-term predictions, like which option would be chosen on the next trial, while Delta rule models are better at making long-term predictions, like an entire sequence of choices. For example, a model that included a Delta rule may provide a poorer fit to a participant's data, but parameter estimates from a Delta rule model would be better at predicting behavior for the same individual in another decision-making task. We propose that the advantage in short-term prediction for Decay rule models is due to their ability to account for perseverative behavior, and the advantage in long-term prediction for Delta rule models is due to their ability to better account for things like loss aversive tendencies, which affect how participants value options. While we did not use the generalization criterion method (Busemeyer and Wang, 2000) of using parameter estimates from fits to data from one task to predict subsequent behavior in another task in the current work, we predict that isolating perseveration and expected value representation in learning models, like the VPP model we presented here, would improve both short- and long-term predictions. Indeed, prior work has found that the EV model, which does not conflate expected value representation with perseveration, was more successful in the generalization criterion method than in fits to a single dataset (Yechiam and Busemeyer, 2008; Kudryavtsev and Pavlodsky, 2012). Although our study did not utilize the generalization criterion method, we would predict that the VPP model would perform well in predicting behavior on subsequent tasks.

The development of the IGT 20 years ago has led to excellent cross-cutting research across various sub-disciplines in psychological science. Decision-making is a critical component of everyday behavior, and the IGT has been the most frequently used experimental task designed to assess poor decision-making, particularly among patient groups (Bechara et al., 2001; Boeka and Lokken, 2006; Lakey et al., 2007). However, the IGT is also a complex task and basic analyses of performance in the task, like the proportion of advantageous vs. disadvantageous choices, do not provide a full account of decision-making behavior. We argue that model-fitting is a critical tool that can be applied to IGT data to allow for a more complex examination of how decision-making varies among groups and individuals. We found a strong link between loss aversion and performance in the IGT. However, other approaches, like the ones used by Yechiam et al. (2005), can be used to compare parameter estimates between different patient populations to identify how different groups attend to recent outcomes, attend to gains vs. losses, select options with greater expected values or tend to persevere vs. frequently switch options. It is our view that the biggest insights into decision-making behavior in tasks like the IGT will continue to come from approaches that include both behavioral and computational analyses of data that are collected from a wide variety of participants and groups.

## **ACKNOWLEDGMENTS**

We would like to thank Kala Battistelli, Katie Chamberlain, Courtni East, Lindsey Ferris, Monica Gamboa, Kaitlynn Goldman, Karla Gomez, Alexis Gregg, Jordan Hall, Christy Ho, Lauren Laserna, Samantha Mallec, Megan McDermott, Michael Pang, Anthony Schmidt, Candice Tharp, and Lucas Weatherall for their help in collecting the data.

## **REFERENCES**


load and temporal myopia in dynamic decision making. *J. Exp.Psychol. Lear. Mem. Cogn.* 38, 1640–1658. doi: 10.1037/a0028146


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 June 2013; accepted: 28 August 2013; published online: 30 September 2013.*

*Citation: Worthy DA, Pang B and Byrne KA (2013) Decomposing the roles of perseveration and expected value representation in models of the Iowa gambling task. Front. Psychol. 4:640. doi: 10.3389/fpsyg.2013.00640*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Worthy, Pang and Byrne. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Validating the PVL-Delta model for the Iowa gambling task

#### *Helen Steingroever <sup>1</sup> \*, Ruud Wetzels <sup>2,3</sup> and Eric-Jan Wagenmakers <sup>1</sup>*

*<sup>1</sup> Psychological Methods, Department of Psychology, University of Amsterdam, Amsterdam, Netherlands*

*<sup>2</sup> Informatics Institute, University of Amsterdam, Amsterdam, Netherlands*

*<sup>3</sup> Spinoza Centre for Neuroimaging, Amsterdam, Netherlands*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*Jan Glaescher, University of Hamburg, Germany Shunsuke Kobayashi, Fukushima Medical University, Japan*

#### *\*Correspondence:*

*Helen Steingroever, Psychological Methods, Department of Psychology, University of Amsterdam, Weesperplein 4, 1018 XA Amsterdam, Netherlands e-mail: helen.steingroever@gmail.com*

Decision-making deficits in clinical populations are often assessed with the Iowa gambling task (IGT). Performance on this task is driven by latent psychological processes, the assessment of which requires an analysis using cognitive models. Two popular examples of such models are the Expectancy Valence (EV) and Prospect Valence Learning (PVL) models. These models have recently been subjected to sophisticated procedures of model checking, spawning a hybrid version of the EV and PVL models—the PVL-Delta model. In order to test the validity of the PVL-Delta model we present a parameter space partitioning (PSP) study and a test of selective influence. The PSP study allows one to assess the choice patterns that the PVL-Delta model generates across its entire parameter space. The PSP study revealed that the model accounts for empirical choice patterns featuring a preference for the good decks or the decks with infrequent losses; however, the model fails to account for empirical choice patterns featuring a preference for the bad decks. The test of selective influence investigates the effectiveness of experimental manipulations designed to target only a single model parameter. This test showed that the manipulations were successful for all but one parameter. To conclude, despite a few shortcomings, the PVL-Delta model seems to be a better IGT model than the popular EV and PVL models.

**Keywords: reinforcement learning, expectancy valence model, prospect valence model, test of selective influence, parameter space partitioning**

## **1. INTRODUCTION**

The Iowa gambling task (IGT; Bechara et al., 1994) is arguably the most popular neuropsychological paradigm to assess decision-making deficits in clinical populations. In order to isolate and identify the psychological processes that drive performance on the IGT, behavioral analyses of IGT data are insufficient. A promising alternative analysis approach is to use cognitive process models. The IGT imposes high demands on these models because it is a complex task producing various types of choice patterns that a good model should be able to generate (Steingroever et al., 2013a,b). In addition, the models should also account for individual differences and for participants' switch behavior on the task (e.g., Zhao and Costello, 2007; Steingroever et al., 2013b). Despite the high demands, some plausible and elegant IGT models have been proposed. Two of the most frequently used representatives include the Expectancy Valence model (EV; see Steingroever et al., 2013b, for references), and the Prospect Valence Learning model (PVL; see Steingroever et al., 2013b, for references and a detailed description of the models). The parameters of these models correspond to distinct psychological processes such as motivation, learning/memory, and response consistency (Busemeyer et al., in press).

Since the development of the EV model in 2002, reinforcement-learning (RL) models for IGT data have been subjected to sophisticated procedures of model checking (e.g., Busemeyer and Stout, 2002; Yechiam and Busemeyer, 2005; Yechiam and Ert, 2007; Ahn et al., 2008; Yechiam and Busemeyer, 2008; Fridberg et al., 2010; Steingroever et al., 2013b). These model comparison efforts spawned a hybrid version of the EV and PVL models—the PVL-Delta model (Ahn et al., 2008; Fridberg et al., 2010; Steingroever et al., in press; see next section for a detailed description of the PVL-Delta model and recent model comparison efforts). This model seems to be promising for IGT data because it can generate a variety of empirical choice patterns better than its competitors (Steingroever et al., in press).

Whereas previous procedures of model checking focused mostly on relative comparisons of different RL models for IGT data, no efforts have been carried out to validate the PVL-Delta model (i.e., assess its adequacy in isolation). Here, we focus on two different ways of validating the PVL-Delta model: first, we conduct a parameter space partitioning (PSP) study that systematically assesses which choice patterns the PVL-Delta model generates across its entire parameter space. Thus, with this first validity check we aim to answer the question: can the PVL-Delta model generate typical empirical choice patterns over a wide range of parameter settings? Second, we conduct a test of selective influence that investigates the effectiveness of experimental manipulations designed to target only one of the model parameters. Thus, with this second validity check we aim to answer the question: do the parameters of the PVL-Delta model indeed correspond to the proposed psychological processes?

The outline of this article is as follows. In the first section, we explain the IGT, outline the PVL-Delta model, and review previous efforts to compare RL models for IGT data. In the second and third sections, we present the PSP study and the test of selective influence. In the last section, we summarize our findings and discuss their ramifications. To anticipate our results, our PSP study shows that the PVL-Delta model can account for empirical choice patterns featuring a preference for the good decks or the decks with infrequent losses; however, the model fails to account for empirical choice patterns featuring a preference for the bad decks. Our test of selective influence shows that the manipulations were successful for all but one parameter.

## **2. THE IOWA GAMBLING TASK AND THE PVL-DELTA MODEL**

#### **2.1. THE IOWA GAMBLING TASK**

In this section we describe the IGT (see also Steingroever et al., 2013b, in press). The purpose of the IGT is to measure decision-making deficits of clinical populations in an experimental setting. In the traditional IGT, participants are initially given \$2000 facsimile money and are presented with four decks of cards. Participants are instructed to choose cards in order to maximize their long-term net outcome (Bechara et al., 1994, 1997). Unbeknownst to the participants, the task typically contains 100 trials. After each choice, participants receive feedback on the rewards and the losses (if any) associated with that card, and the running tally.

The task aims to determine whether participants learn to prefer the good, safe decks over the bad, risky decks because this is the only choice pattern that maximizes the long-term net outcomes. The good, safe decks are typically labeled C and D, whereas the bad, risky decks are labeled A and B. **Table 1** presents the traditional payoff scheme as developed by Bechara et al. (1994). This table illustrates that decks A and B yield high immediate, constant rewards, but even higher unpredictable, occasional losses: hence, the long-term net outcome is negative. Decks C and D, on the other hand, yield low immediate, constant rewards, but even lower unpredictable, occasional losses: hence, the long-term net outcome is positive. In addition to the different payoff magnitudes, the decks also differ in the frequency of losses: two decks yield frequent losses (decks A and C) and two decks yield infrequent losses (decks B and D).

#### **2.2. THE PVL-DELTA MODEL**

In this section, we describe the PVL-Delta model in detail. The model formalizes participants' performance on the IGT through the interaction of four model parameters that represent distinct psychological processes (Ahn et al., 2008; Fridberg et al., 2010; Steingroever et al., in press).

The first model assumption is that after choosing a card from deck *k* ∈ {1, 2, 3, 4} on trial *t*, participants evaluate the net outcome associated with the just-chosen card by means of a non-linear utility function from Prospect theory (Tversky and Kahneman, 1992)—the Prospect Utility function:

$$u_k(t) = \begin{cases} X(t)^A & \text{if } X(t) \ge 0 \\ -w \cdot |X(t)|^A & \text{if } X(t) < 0. \end{cases} \tag{1}$$

Here *X*(*t*) represents the net outcome on trial *t*, that is, the sum of the experienced reward and loss (i.e., *X*(*t*) = *W*(*t*) − |*L*(*t*)|). The Prospect Utility function contains the first two model parameters: the shape parameter *A* ∈ [0, 1], which determines the shape of the utility function, and the loss aversion parameter *w* ∈ [0, 5]. As *A* approaches zero, the shape of the utility function approaches a step function. The implication of such a step function is that given a positive net outcome *X*(*t*), all utilities are similar because they approach one, and given a negative net outcome *X*(*t*), all utilities are also similar because they approach −*w*. On the other hand, as *A* approaches one, the subjective utility *u<sub>k</sub>*(*t*) increases in direct proportion to the net outcome, *X*(*t*). A value of *w* larger than one indicates a larger impact of negative net outcomes than positive net outcomes on the subjective utility, whereas a value of *w* approaching one indicates identical impact of negative net outcomes and positive net outcomes. As *w* approaches zero, the model predicts that negative net outcomes will be neglected.
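The Prospect Utility function of Equation 1 can be written as a short function. The following is a minimal sketch rather than the authors' code; the function name `prospect_utility` and the range check are illustrative assumptions.

```python
def prospect_utility(x, A, w):
    """Subjective utility of the net outcome x on one trial (Equation 1).

    x: net outcome X(t); A: shape parameter in [0, 1];
    w: loss aversion parameter in [0, 5].
    The name and the range check are illustrative, not from the paper.
    """
    if not (0.0 <= A <= 1.0 and 0.0 <= w <= 5.0):
        raise ValueError("expected A in [0, 1] and w in [0, 5]")
    if x >= 0:
        return x ** A
    # Net losses are weighted by the loss aversion parameter w.
    return -w * abs(x) ** A


# With A = 0.5 and w = 2, a net loss of 4 weighs twice as much as a net gain of 4.
gain = prospect_utility(4.0, A=0.5, w=2.0)
loss = prospect_utility(-4.0, A=0.5, w=2.0)
```

Setting `A=1.0` and `w=1.0` makes the utility equal to the net outcome itself, which matches the limiting cases described above.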

The PVL-Delta model further assumes that, after having formed the utility of the just chosen deck through Equation 1, decision makers update their expected utility of the just chosen deck, while keeping the expected utilities of the remaining decks unchanged. This updating process is described by the Delta learning rule:

$$Ev_k(t) = Ev_k(t-1) + a \cdot (u_k(t) - Ev_k(t-1)). \tag{2}$$

The Delta learning rule states that the expected utility of the chosen deck *k* is adjusted upward if the experienced utility *u<sub>k</sub>*(*t*) is higher than expected. If the experienced utility *u<sub>k</sub>*(*t*) is lower than expected, the expected utility of deck *k* is adjusted downward<sup>1</sup>. This updating process is influenced by the third model parameter, the updating parameter *a* ∈ [0, 1]. This parameter quantifies the memory for rewards and losses. A value of *a* close to zero indicates slow forgetting and weak recency effects, whereas a value of *a* close to one indicates rapid forgetting and strong recency effects.

<sup>1</sup>We initialized the expectancies of each deck *k* to zero, *Ev<sub>k</sub>*(0) = 0.
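The Delta learning rule of Equation 2 amounts to a one-line update of the chosen deck's expectancy. The sketch below is illustrative only; the function name `delta_update` and the use of deck indices 0 to 3 are assumptions made for the example.

```python
def delta_update(ev, chosen, utility, a):
    """Return the expectancies after one trial (Equation 2).

    ev: list of expected utilities, one per deck;
    chosen: index of the deck chosen on this trial;
    utility: experienced utility u_k(t) of the chosen card;
    a: updating parameter in [0, 1].
    Only the chosen deck changes; the other decks keep their values.
    """
    updated = list(ev)
    updated[chosen] = ev[chosen] + a * (utility - ev[chosen])
    return updated


ev = [0.0, 0.0, 0.0, 0.0]  # all expectancies start at zero
ev = delta_update(ev, chosen=1, utility=1.0, a=0.5)   # better than expected: moves up
ev = delta_update(ev, chosen=1, utility=-4.0, a=0.5)  # worse than expected: moves down
```

With `a=0.5` each update moves the expectancy halfway toward the experienced utility, so the second, negative outcome outweighs the first one.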

#### **Table 1 | Payoff scheme of the traditional IGT as developed by Bechara et al. (1994).**


In the next step, the model assumes that the expected utilities of each deck guide participants' choices on the next trial *t* + 1. This assumption is formalized by the softmax choice rule, also known as the ratio-of-strength choice rule. The PVL-Delta model uses this rule to compute the probability of choosing each deck on each trial (Luce, 1959; Equation 3). This rule contains a sensitivity parameter θ that indexes the extent to which trial-by-trial choices match the expected deck utilities. Values of θ close to zero indicate random choice behavior (i.e., strong exploration), whereas large values of θ indicate choice behavior that is strongly determined by the expected utilities (i.e., strong exploitation).

$$P[S_k(t+1)] = \frac{e^{\theta \cdot Ev_k(t)}}{\sum_{j=1}^{4} e^{\theta \cdot Ev_j(t)}} \tag{3}$$

The PVL-Delta model assumes a trial-independent sensitivity parameter θ, which depends on the final model parameter: the response consistency *c* ∈ [0, 5] (Equation 4). Small values of *c* cause a random choice pattern, whereas large values of *c* cause a deterministic choice pattern.

$$
\theta = 3^{c} - 1 \tag{4}
$$
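Equations 3 and 4 together map the expected utilities onto choice probabilities. The following sketch shows one way to compute them; the function names are illustrative, and subtracting the maximum expectancy before exponentiating is a standard numerical safeguard that is not mentioned in the text and does not change the probabilities.

```python
import math


def sensitivity(c):
    """Trial-independent sensitivity theta (Equation 4)."""
    return 3.0 ** c - 1.0


def choice_probabilities(ev, c):
    """Softmax choice probabilities for the next trial (Equation 3)."""
    theta = sensitivity(c)
    highest = max(ev)
    # Shifting by the maximum avoids overflow for large theta.
    weights = [math.exp(theta * (value - highest)) for value in ev]
    total = sum(weights)
    return [weight / total for weight in weights]


ev = [1.0, -2.0, 0.5, 0.0]
random_choice = choice_probabilities(ev, c=0.0)   # theta = 0: pure exploration
greedy_choice = choice_probabilities(ev, c=5.0)   # theta = 242: near-deterministic
```

At `c=0` every deck is equally likely regardless of the expectancies, whereas at `c=5` almost all probability mass falls on the deck with the highest expectancy.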

In sum, the PVL-Delta model has four parameters: (1) The shape parameter *A*, which determines the shape of the utility function, (2) the loss aversion parameter *w*, which quantifies the weight of net losses over net rewards, (3) the updating parameter *a*, which determines the memory for past expectancies, and (4) the response consistency parameter *c*, which determines the amount of exploitation vs. exploration.

#### **2.3. PREVIOUS COMPARISONS OF RL MODELS**

This section reviews previous model comparison studies. These studies compared the EV model, PVL model, and alternative RL models using a large variety of methods, for instance: the *post hoc* fit criterion (i.e., Busemeyer and Stout, 2002; Yechiam and Busemeyer, 2005; Yechiam and Ert, 2007; Ahn et al., 2008; Yechiam and Busemeyer, 2008; Fridberg et al., 2010)<sup>2</sup>, the simulation method (i.e., Ahn et al., 2008; Fridberg et al., 2010; Steingroever et al., in press; Worthy et al., 2013), tests of generalizability (i.e., Yechiam and Busemeyer, 2005; Yechiam and Ert, 2007; Ahn et al., 2008; Yechiam and Busemeyer, 2008), tests of parameter consistency (i.e., Yechiam and Busemeyer, 2008), and PSP (i.e., Steingroever et al., 2013b)<sup>3</sup>.

The above model comparison studies revealed many positive properties of RL models: first, RL models predict the choices on the next trial better than a Bernoulli baseline model (Busemeyer and Stout, 2002; Yechiam and Busemeyer, 2005; Yechiam and Ert, 2007; Ahn et al., 2008; Yechiam and Busemeyer, 2008; Fridberg et al., 2010)<sup>4</sup>. Second, parameters from the RL models estimated from one RL task can be used to predict performance on a different RL task (Yechiam and Busemeyer, 2005; Yechiam and Ert, 2007; Ahn et al., 2008; Yechiam and Busemeyer, 2008). Third, the loss aversion parameter and the updating parameter of the EV model are stable across different tasks (Yechiam and Busemeyer, 2008). Fourth, the estimated model parameters can be used to improve the prediction of group membership (i.e., chronic cannabis users vs. healthy controls; Fridberg et al., 2010).

These positive properties confirm that cognitive modeling analyses are indeed useful to learn more about the psychological processes that drive performance on the IGT. However, previous model comparison studies also revealed that, even though the EV and PVL models are frequently used, they fail to outperform their competitors consistently. It appears that the performance of the RL models depends on the data set and the method used to assess model performance (i.e., fit performance vs. simulation performance; see Steingroever et al., in press, for a more detailed discussion on previous comparisons of RL models).

Instead of accepting the EV and PVL models as default models to describe IGT data, there is growing evidence that the PVL-Delta model may be a promising alternative IGT model: first, Ahn et al. (2008) showed that the PVL-Delta model results in the best simulation performance (i.e., prediction of the entire sequence of choices on the IGT under a new, unobserved payoff sequence) among the EV model, PVL model, and any combination of the components of the two models. Second, Fridberg et al. (2010) showed that, in two data sets, the PVL-Delta model outperforms the EV model in terms of *post hoc* fit and simulation performance. Third, Steingroever et al. (in press) showed that, among the EV, PVL, and PVL-Delta models, the PVL-Delta model is the only model that adequately generated the choice patterns shown by seven IGT data sets.

Even though the PVL-Delta model has recently come to the fore as a promising model for IGT data, it has not yet been sufficiently validated. Our goal here is to pursue two methods of validating the PVL-Delta model: a PSP study and a test of selective influence.

## **3. PARAMETER SPACE PARTITIONING**

## **3.1. METHODS**

We performed a PSP study to evaluate the flexibility of the PVL-Delta model (Pitt et al., 2006, 2008; see also Steingroever et al., 2013b, who performed a PSP study of the EV model, PVL model, and another hybrid model: the EV model with Prospect Utility function). The PSP method systematically assesses the choice patterns predicted by the PVL-Delta model across its entire parameter space. A model is overly flexible when it can generate not only all choice patterns that are observed empirically, but also choice patterns that are logically possible, but never observed. Instead, one should prefer a less flexible, parsimonious model that, ideally, only generates choice patterns that are also frequently observed in experiments (Pitt et al., 2006, 2008).

<sup>2</sup>The *post hoc* fit criterion is also known as the one-step-ahead prediction method.

<sup>3</sup>Note that the PSP study of Steingroever et al. (2013b) did not focus on the PVL-Delta model, but on the EV model, the PVL model, and another hybrid model: the EV model with the Prospect Utility function.

<sup>4</sup>The Bernoulli baseline model assumes that a participant's probability of choosing a given deck on a given trial equals the overall proportion of choices the participant actually made from that deck.

Note that PSP is a global method (i.e., the full range of parameter values is considered), whereas the other methods that were used to compare RL models are local (i.e., assessment at a particular point in the model's parameter space; for instance, *post hoc* fit criterion, simulation method, tests of generalizability, and tests of parameter consistency). The advantage of global methods is that they enable one to assess the full range of choice patterns a model can generate, whereas the results of local methods always depend on the idiosyncrasies of any single data set (Pitt et al., 2006, 2008).

Pitt et al. (2006) describe a new search algorithm to implement PSP. In our implementation we did not use their sophisticated search algorithm, but followed the conceptual idea of PSP, and used a grid search that works as follows (see also Steingroever et al., 2013b): for each parameter of the PVL-Delta model, we chose 60 values that were equally spaced over the corresponding parameter range. Each combination of these parameter values was used to generate data for 100 synthetic participants completing a 100-trial IGT. For all analyses in this paper, we scaled the traditional payoffs of the IGT as presented in **Table 1** by dividing by 100 (cf. Ahn et al., 2011).
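The grid search described above can be sketched as follows. This is a simplified illustration, not the authors' implementation (their code is referenced in the text). Two assumptions are made loudly: the payoff generator `DECKS` is a hypothetical stochastic stand-in for the fixed payoff sequence of the traditional IGT (constant reward per card plus a loss that occurs with a fixed probability, already divided by 100), and the grid uses 3 values per parameter and 10 synthetic participants instead of 60 and 100 so that the example runs quickly.

```python
import itertools
import math
import random

# Hypothetical stand-in payoffs, already divided by 100:
# deck -> (reward per card, probability of a loss, loss size).
DECKS = {"A": (1.0, 0.5, 2.5), "B": (1.0, 0.1, 12.5),
         "C": (0.5, 0.5, 0.5), "D": (0.5, 0.1, 2.5)}
NAMES = list(DECKS)


def simulate_participant(A, w, a, c, n_trials=100, rng=random):
    """Simulate one synthetic participant; return the chosen deck names."""
    ev = [0.0] * 4                      # expectancies start at zero
    theta = 3.0 ** c - 1.0              # Equation 4
    choices = []
    for _ in range(n_trials):
        highest = max(ev)
        weights = [math.exp(theta * (v - highest)) for v in ev]  # Equation 3
        k = rng.choices(range(4), weights=weights)[0]
        reward, p_loss, loss = DECKS[NAMES[k]]
        x = reward - (loss if rng.random() < p_loss else 0.0)
        u = x ** A if x >= 0 else -w * abs(x) ** A               # Equation 1
        ev[k] += a * (u - ev[k])                                 # Equation 2
        choices.append(NAMES[k])
    return choices


def grid(low, high, n):
    """n equally spaced values from low to high, inclusive."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def mean_choice_proportions(params, n_participants, rng):
    """Choice proportions per deck, averaged over trials and participants."""
    counts = dict.fromkeys(NAMES, 0)
    for _ in range(n_participants):
        for deck in simulate_participant(*params, rng=rng):
            counts[deck] += 1
    total = sum(counts.values())
    return {deck: counts[deck] / total for deck in NAMES}


rng = random.Random(1)
n_values = 3  # the study used 60 values per parameter
combos = itertools.product(grid(0, 1, n_values), grid(0, 5, n_values),
                           grid(0, 1, n_values), grid(0, 5, n_values))
results = {p: mean_choice_proportions(p, n_participants=10, rng=rng)
           for p in combos}
```

The dictionary `results` maps each parameter combination (*A*, *w*, *a*, *c*) to the generated mean choice proportions, which is the input needed for classifying choice patterns. With 60 values per parameter the grid contains 60<sup>4</sup> combinations, so a full-scale run is far more expensive than this sketch suggests.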

The generated data were used to analyze which choice patterns the PVL-Delta model can generate across its entire parameter space. Such analysis naturally requires a definition of choice patterns. Here we used two different definitions—the "broad definition of choice patterns" and the "restricted definition of choice patterns." These definitions are the same as used by Steingroever et al. (2013b).

## *3.1.1. Broad definition of choice patterns*

The "broad definition of choice patterns" is intended to provide a general idea of which choice patterns the PVL-Delta model can generate. Following Steingroever et al. (2013b), we defined five possible choice patterns: (1) Preference for the good decks over bad decks (i.e., {*C*, *D*} ≻ {*A*, *B*}), (2) preference for the bad decks over good decks (i.e., {*A*, *B*} ≻ {*C*, *D*}), (3) preference for the decks with infrequent losses over decks with frequent losses (i.e., {*B*, *D*} ≻ {*A*, *C*}), (4) preference for the decks with frequent losses over decks with infrequent losses (i.e., {*A*, *C*} ≻ {*B*, *D*}), and (5) remaining choice patterns. For each parameter combination, we computed the proportion of choices from each deck averaged across all 100 trials and all 100 repeated data generations. These average choice proportions were then sorted to determine the generated rank order of deck preferences for each parameter combination. Finally, we computed the proportion of the entire parameter space occupied by each of the defined choice patterns. Even though we defined five possible types of choice patterns, we assume based on the theory underlying the IGT (Bechara et al., 1994, 1997) and our IGT review (Steingroever et al., 2013a) that a good model for IGT data should only generate the first three types of choice patterns.
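The broad classification can be sketched in a few lines. The sketch assumes that a pattern such as "good decks over bad decks" holds when the two most frequently chosen decks are C and D, in either order, and that a tie between the second and third ranked deck falls under "remaining"; the tie rule and the function names are assumptions, since the text does not specify them.

```python
from collections import Counter

BROAD_PATTERNS = {
    frozenset("CD"): "good decks over bad decks",
    frozenset("AB"): "bad decks over good decks",
    frozenset("BD"): "infrequent losses over frequent losses",
    frozenset("AC"): "frequent losses over infrequent losses",
}


def classify_broad(proportions):
    """Classify mean choice proportions (dict deck -> proportion)."""
    ranked = sorted(proportions, key=proportions.get, reverse=True)
    # Assumed tie rule: the top pair must be strictly preferred.
    if proportions[ranked[1]] == proportions[ranked[2]]:
        return "remaining"
    return BROAD_PATTERNS.get(frozenset(ranked[:2]), "remaining")


def pattern_shares(all_proportions):
    """Share of parameter combinations that produce each pattern."""
    counts = Counter(classify_broad(p) for p in all_proportions)
    return {label: n / len(all_proportions) for label, n in counts.items()}


example = [
    {"A": 0.10, "B": 0.20, "C": 0.30, "D": 0.40},
    {"A": 0.15, "B": 0.40, "C": 0.10, "D": 0.35},
    {"A": 0.40, "B": 0.10, "C": 0.20, "D": 0.30},
    {"A": 0.05, "B": 0.15, "C": 0.45, "D": 0.35},
]
shares = pattern_shares(example)
```

Applied to the output of a grid search, `pattern_shares` gives the proportion of the parameter space occupied by each choice pattern.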

## *3.1.2. Restricted definition of choice patterns*

Note that the broad definition of choice patterns only considers the rank order of the overall proportions of choices from each deck averaged over 100 repeated data generations with the same parameter combination. This means that it does not matter whether the PVL-Delta model generated, for example, a very strong or a very weak preference for the good decks over the bad decks. Both generated choice patterns are classified as the choice pattern "good decks over bad decks" (i.e., {*C*, *D*} ≻ {*A*, *B*}). To go beyond this coarse classification, we also analyzed the model's behavior when confronted with pronounced deck preferences. To get an indication of pronounced deck preferences shown by healthy participants on the IGT, we used Steingroever et al. (2013b)'s definition of pronounced deck preferences: specifically, Steingroever et al. (2013b) searched their IGT data pool (*N* = 394; Steingroever et al., 2013a) for healthy participants who chose at least 65% of their cards from either the good decks (i.e., (*C* + *D*) ≥ 0.65), the bad decks (i.e., (*A* + *B*) ≥ 0.65), or the decks with infrequent losses (i.e., (*B* + *D*) ≥ 0.65). By using the 0.65-criterion, Steingroever et al. (2013b) included healthy participants with pronounced deck preferences and excluded healthy participants with random choice behaviors. For each of these three groups, Steingroever et al. (2013b) computed the mean proportions of choices from each deck (as shown in **Table 2**). For instance, participants classified to the group "pronounced preference for the good decks" chose on average 36 cards from deck C and 40 cards from deck D. Note that 53.6% of all participants in the Steingroever et al. (2013a) data pool showed a pronounced deck preference by making at least 65% of their choices from the two most preferred decks. This empirical popularity of pronounced deck preferences underscores how important it is that an RL model for the IGT is able to produce such choice patterns.

**Table 2 | Mean proportions of choices from each deck and mean proportions of switches during the last 50 trials of healthy participants showing a pronounced deck preference [see Table 4 in Steingroever et al. (2013b)].**

*Healthy participants are selected from the Steingroever et al. (2013a) data pool (N* = *394).*

**Table 2** thus provides an indication of pronounced deck preferences shown by healthy participants on the IGT. We used the mean proportion of choices from these three constructed groups for our second, restricted definition of choice patterns. Specifically, we define a pronounced preference for the good decks as at least 36 and 40 choices from decks C and D, respectively; we define a pronounced preference for the bad decks as at least 25 and 52 choices from decks A and B, respectively; and we define a pronounced preference for the decks with infrequent losses as at least 37 and 39 choices from decks B and D, respectively. Based on our simulations, we then determined the proportion of the parameter space of the PVL-Delta model that produced choice patterns that satisfy this second, restricted definition.
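The restricted definition reduces to a pair of thresholds per pattern. The sketch below encodes the thresholds stated in the paragraph above as mean numbers of choices out of 100 trials; the dictionary and function names are illustrative.

```python
# Minimum mean number of choices (out of 100 trials) from the two preferred decks.
PRONOUNCED = {
    "good decks": {"C": 36, "D": 40},
    "bad decks": {"A": 25, "B": 52},
    "decks with infrequent losses": {"B": 37, "D": 39},
}


def classify_pronounced(counts):
    """Return the pronounced pattern met by the mean choice counts, if any.

    counts: dict deck -> mean number of choices out of 100 trials.
    The three sets of thresholds cannot be met at the same time, so at
    most one label applies; None means no pronounced preference.
    """
    for label, minimum in PRONOUNCED.items():
        if all(counts.get(deck, 0) >= n for deck, n in minimum.items()):
            return label
    return None


generated = {"A": 8.0, "B": 14.0, "C": 37.5, "D": 40.5}
label = classify_pronounced(generated)
```

The proportion of the parameter space that satisfies the restricted definition is then the share of parameter combinations for which this function returns a label.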

#### *3.1.3. Switch behavior*

Finally, a good RL model for the IGT should also capture the switches participants make on the IGT (Zhao and Costello, 2007). Steingroever et al. (2013b) therefore determined the mean proportion of switches during the last 50 trials for the three groups of healthy participants showing pronounced deck preferences (revisited here in the last column of **Table 2**). The table contains for each of the three groups of healthy participants with pronounced choice patterns the mean proportion of switches during the last 50 trials and statistics quantifying the distribution of switch proportions (i.e., the interquartile range and the minimum and maximum switch proportions during the last 50 trials). This information is visualized by the boxplots shown in the left column of **Figure 1**. From **Table 2** and **Figure 1** it is evident that, in general, in all three groups participants switch frequently. However, the interquartile ranges and the minimum and maximum proportion of switches during the last 50 trials also indicate that there is large variability in the proportion of switches, such that the switch behavior of healthy participants varies from no switches at all to switches on every trial. This tendency to switch frequently, but also the large individual differences in the switch behavior of healthy participants, are illustrated by **Figures 2**, **4**, **6** (see also Figures 3, 7, 10 in Steingroever et al., 2013b), which show the trial-by-trial choices (i.e., deck selection profiles) of representative healthy participants with a pronounced preference for the good decks, bad decks, and decks with infrequent losses, respectively<sup>5</sup>.

We investigated whether the PVL-Delta model captures the empirical switch behavior by comparing the empirical and generated mean proportions of switches during the last 50 trials. Specifically, the generated mean proportions of switches were obtained by determining the mean proportions of switches during the last 50 trials for all parameter combinations that produced pronounced deck preferences. The code for the PSP study is available on www.helensteingroever.com.
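The proportion of switches during the last 50 trials can be computed from a sequence of deck choices as sketched below. The sketch assumes that a switch on a given trial means choosing a different deck than on the immediately preceding trial, so each of the last 50 trials is compared with its predecessor; the text does not spell out this detail, and the function name is illustrative.

```python
def switch_proportion(choices, last_n=50):
    """Proportion of the last `last_n` trials on which the deck changed.

    choices: sequence of deck labels, one per trial, in trial order.
    Assumption: each of the last `last_n` trials is compared with the
    trial directly before it, giving `last_n` comparisons.
    """
    if len(choices) <= last_n:
        raise ValueError("need more than last_n trials")
    tail = choices[-(last_n + 1):]
    switches = sum(prev != cur for prev, cur in zip(tail, tail[1:]))
    return switches / last_n


stay = switch_proportion(["D"] * 100)                   # exploits one deck
alternate = switch_proportion(["C", "D"] * 50)          # switches on every trial
one_switch = switch_proportion(["A"] * 75 + ["B"] * 25)
```

Averaging this quantity over the synthetic participants of every parameter combination that produces a pronounced deck preference yields the generated mean proportions of switches that are compared with the empirical ones.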

## **3.2. RESULTS**

## *3.2.1. Broad definition of choice patterns*

**Table 3** presents the proportion of the parameter space of the PVL-Delta model occupied by each of the five different types of choice patterns. From this table, it is evident that the PVL-Delta model can generate all five different types of choice patterns. However, if we consider its partitioned parameter space more closely, we detect substantial differences between the popularity of the different choice patterns: the choice pattern "good decks over bad decks" is the most central to the model's overall performance, as this choice pattern occupies the largest part of the model's parameter space. The second and third largest parts of its parameter space are occupied by the choice patterns "remaining" and "infrequent losses over frequent losses." It is thus evident that choice patterns that are typically shown by healthy participants, namely "good decks over bad decks" and "infrequent losses over frequent losses" (Steingroever et al., 2013a), occupy a major part of the model's parameter space.

<sup>5</sup>See Steingroever et al. (2013b) for the deck selection profiles of all healthy participants that showed a pronounced deck preference (i.e., at least 65% choices from the two most preferred decks).

**Table 3 | Proportions of choice patterns generated by the PVL-Delta model.**

**Table 3** also shows that the choice pattern "bad decks over good decks" is only generated over a minor part of the model's parameter space. We therefore have grounds to conclude that this choice pattern is uncharacteristic of the PVL-Delta model, and is thus almost irrelevant to its overall performance (Pitt et al., 2006). This finding is important because the choice pattern "bad decks over good decks" is considered characteristic for participants with decision-making deficits (e.g., patients with lesions to the ventromedial prefrontal cortex; Bechara et al., 1994, 1997). These patients are thought to display decision-making deficits on the IGT because their inability to foresee the long-term consequences of their choice behavior leads them to only focus on the immediate rewards.

#### *3.2.2. Restricted definition of choice patterns*

**Table 4** presents the proportion of all choice patterns generated by the PVL-Delta model that satisfy the restricted definition of choice patterns. The table also presents the mean and standard deviation of the parameter combinations that generated these pronounced deck preferences. The table shows that only minor parts of the parameter space of the PVL-Delta model are occupied by the three types of pronounced choice patterns, even though these patterns are frequently observed in experiments. For instance, 139 healthy participants from the Steingroever et al. (2013a) data pool (35.3%) show a pronounced preference for the decks with infrequent losses (i.e., (*B* + *D*) ≥ 0.65). However, the PVL-Delta model only generates this choice pattern over 1.6% of its parameter space.

### *3.2.3. Switch behavior*

In addition to the generated choice proportions, we also determined the generated proportion of switches during the last 50 trials for all parameter combinations that satisfy the restricted definition of choice patterns (Columns 2−6 of **Table 4**). We averaged these generated switch proportions separately for each of the three types of pronounced deck preferences (last column of **Table 4**). The table also contains statistics quantifying the distribution of the generated switch proportions, that is, the interquartile range and the minimum and maximum proportion of switches during the last 50 trials. This information is visualized by the right column of **Figure 1**.

When comparing the generated and observed mean proportion of switches during the last 50 trials given pronounced deck preferences, it is apparent that the PVL-Delta model underestimates the observed switch proportions, that is, the generated mean proportion of switches equals or falls below 0.07 for all generated pronounced choice patterns, whereas the observed mean proportion of switches equals or exceeds 0.35 for all observed pronounced choice patterns (**Tables 2**, **4**). In addition, for all three types of pronounced choice patterns, the interquartile range of the observed proportion of switches exceeds the interquartile range of the model-generated proportion of switches (**Figure 1**, **Tables 2**, **4**). However, the largest generated switch proportions given a pronounced preference for the good decks and the decks with infrequent losses, respectively, lie within the corresponding interquartile ranges of the observed switch proportions. This suggests that for a few parameter combinations, the PVL-Delta model meets both empirical regularities: pronounced deck preferences and a tendency to switch frequently.

*Note that this definition is only based on the mean proportion of choices of the two strongest preferred decks (first column). For the selected choice patterns, the corresponding mean and standard deviation of the model parameters, and the mean proportion of switches during the last 50 trials are presented.*

To illustrate the differences and commonalities between the data and the predictions, we plot in **Figures 2**–**7** observed and generated deck selection profiles. **Figures 2**, **4**, **6** show the deck selection profiles of representative healthy participants with a pronounced preference for the good decks, bad decks, and decks with infrequent losses, respectively. **Figures 3**, **5**, **7** show the deck selection profiles that were generated with those parameter combinations that resulted in a pronounced preference for the good decks, bad decks, and decks with infrequent losses, respectively, and the maximum number of switches during the last 50 trials. From the figures it is evident that there are large discrepancies between the observed and generated deck selection profiles in the case of the pronounced preference for the bad decks: The PVL-Delta model generates a few switches in the beginning of the IGT and then exploitation of a single deck, even though healthy participants keep switching across the entire IGT. However, the observed and generated deck selection profiles look very similar in the case of the pronounced preference for the good decks and the decks with infrequent losses.

To conclude, many healthy participants from the Steingroever et al. (2013a) data pool (53.6%) showed pronounced deck preferences, that is, a pronounced preference for the good decks ((*C* + *D*) ≥ 0.65), a pronounced preference for the bad decks ((*A* + *B*) ≥ 0.65), or a pronounced preference for the decks with infrequent losses ((*B* + *D*) ≥ 0.65) (**Table 2**). This empirical popularity of pronounced deck preferences is only partly reflected by the PVL-Delta model; the model produces choice patterns that satisfy the restricted definition of choice patterns only within minor parts of its parameter space (**Table 4**). In addition, healthy participants in general show many switches during the last 50 trials. However, the PVL-Delta model in general predicts that participants who show pronounced deck preferences switch rarely during the last 50 trials; all generated mean proportions of switches during the last 50 trials equal or fall below 0.07, whereas the observed mean proportions of switches lie around 0.40. But compared to the popular EV and PVL models (Steingroever et al., 2013b), the PVL-Delta model performs better: the PVL-Delta model generates higher mean proportions of switches than its two competitors for almost all pronounced choice patterns; the only exception is that the EV model generates a higher mean proportion of switches for the choice pattern featuring a pronounced preference for the bad decks than the PVL-Delta model.

Moreover, healthy participants show large individual differences in the proportion of switches during the last 50 trials, such that their switch behavior varies from no switches at all to switches on every trial. However, the PVL-Delta model tends to generate very few switches, given pronounced deck preferences, and fails to generate large proportions of switches (i.e., switch proportions higher than 0.65). But compared to the popular EV and PVL models (Steingroever et al., 2013b), the PVL-Delta model again performs better because the EV and PVL models' failure to generate large proportions of switches, given a pronounced choice pattern, is even stronger: Given a pronounced choice pattern, the EV and PVL models fail to generate switch proportions higher than 0.35 and 0.46, respectively. Despite these discrepancies between the empirical and the generated switch behavior, we showed that, given a pronounced preference for the good decks or the decks with infrequent losses and those parameter combinations that yielded the maximum number of switches during the last 50 trials, the PVL-Delta model can produce choice patterns that strongly resemble the empirical choice patterns of healthy participants.
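The two behavioral summaries used throughout this section, the pronounced deck preferences and the proportion of switches during the last 50 trials, can be sketched as follows. This is a minimal illustration rather than the analysis code used for the reported results: the function names are hypothetical, the proportions are assumed to be computed over all trials of one participant, and a switch is assumed to be a trial on which the chosen deck differs from the deck chosen on the preceding trial.

```python
from collections import Counter


def deck_proportions(choices):
    """Proportion of choices from each deck (A-D) over all trials."""
    counts = Counter(choices)
    n = len(choices)
    return {deck: counts.get(deck, 0) / n for deck in "ABCD"}


def pronounced_patterns(choices, threshold=0.65):
    """Names of all pronounced preferences shown in one choice sequence."""
    p = deck_proportions(choices)
    pair_sums = {
        "good decks": p["C"] + p["D"],
        "bad decks": p["A"] + p["B"],
        "infrequent losses": p["B"] + p["D"],
    }
    return [name for name, value in pair_sums.items() if value >= threshold]


def switch_proportion(choices, last_n=50):
    """Share of the last `last_n` trials on which the deck differs from
    the deck chosen on the preceding trial (assumed definition)."""
    start = max(1, len(choices) - last_n)  # trial 0 has no predecessor
    switches = sum(choices[t] != choices[t - 1]
                   for t in range(start, len(choices)))
    return switches / (len(choices) - start)
```

Note that a participant can satisfy more than one definition at once (e.g., exclusive choices from deck D count toward both the good decks and the decks with infrequent losses), which is why the sketch returns a list.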

#### **4. TEST OF SELECTIVE INFLUENCE**

In this section we investigate whether the parameters of the PVL-Delta model indeed correspond to distinct psychological processes. We will therefore carry out a test of selective influence for the PVL-Delta model. This means that we fit the model to data collected from the standard IGT, but also from conditions that were designed to affect selectively one of the model parameters. These data were collected by Wetzels et al. (2010), and their experiment was originally designed as a test of selective influence for the EV model. However, the experimental manipulations that were intended to affect the parameters of the EV model should also be reflected by the parameters of the PVL-Delta model because of the high similarity between the two models.

#### **4.1. METHODS**

We fit the PVL-Delta model separately to four data sets reported by Wetzels et al. (2010). Specifically, Wetzels et al. (2010) conducted an experiment with a standard condition and three additional conditions that were designed to affect selectively one of the model parameters.<sup>6</sup> In the "standard condition", 19 participants completed a 150-trial IGT under the standard administration. In the "rewards condition", 20 participants completed a 150-trial IGT under the instruction to pay more attention to rewards and to consider losses as less important. We expected this manipulation to decrease the loss aversion parameter *w*.

In the "updating condition", 19 participants completed a 150-trial IGT under the standard administration. However, each choice was followed by an on-screen presentation of five numbers that the participants had to remember because, after the next choice, participants were asked about the relative position of one of the numbers. We expected this manipulation to increase the updating parameter *a*.

In the "consistency condition", 16 participants completed a 150-trial IGT under the standard administration. However, they were told that after every 10 trials the payoff schemes for the decks could have changed. We expected this manipulation to decrease the consistency parameter *c*.

To fit the PVL-Delta model, we used a Bayesian hierarchical approach detailed in the next section. This estimation procedure has been consistently shown to outperform alternatives such as maximum likelihood estimation and Bayesian individual estimation (Ahn et al., 2011; Wetzels et al., 2010).

To assess whether the chains of all parameters had converged successfully from their starting values to their stationary distributions, we visually inspected the Hamiltonian Monte Carlo (HMC) chains and used the $\hat{R}$ statistic (Gelman and Rubin, 1992). The $\hat{R}$ statistic is a formal diagnostic measure of convergence that compares the between-chain variability to the within-chain variability. Values close to 1.0 indicate convergence to the stationary distribution, whereas values greater than 1.1 indicate inadequate convergence.
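To make the comparison of between-chain and within-chain variability concrete, the following sketch computes the basic form of the statistic for a single parameter. It is an illustration under stated assumptions and not the routine that Stan uses internally: it implements the plain potential scale reduction factor without split chains or degrees-of-freedom corrections, and the function name is hypothetical.

```python
import math
import statistics


def gelman_rubin_rhat(chains):
    """Basic potential scale reduction factor for one parameter.

    `chains` is a list of equally long lists of posterior samples,
    one inner list per chain (burn-in already discarded).
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least two chains of equal length")
    chain_means = [statistics.fmean(chain) for chain in chains]
    # Between-chain variance: n times the variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Within-chain variance: average of the per-chain sample variances.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)
```

When all chains sample from the same distribution, the between-chain term is small and the statistic is close to 1.0; a chain that is stuck in a different region inflates the between-chain variance and pushes the value above the 1.1 threshold.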

To assess model performance in absolute terms, we used two different methods: the *post hoc* absolute fit method and the simulation method (see also Steingroever et al., in press). These two methods allow us to assess the model's ability to fit and generate the choice patterns present in each of the four conditions. Our implementation of both methods relies on visually contrasting—separately for each deck as a function of 15 bins each containing 10 trials—the observed mean choice proportions from the experiment against the mean choice probabilities from the model.

Both methods start by sampling parameter values from the joint posterior distributions over the individual-level parameters (hereafter individual-level joint posteriors). In the case of the *post hoc* absolute fit method, the model is provided with the sampled parameter values, but also with the actual choices and payoffs of each participant. The *post hoc* absolute fit method computes the probability of choosing each deck on the next trial based on the information on the observed choices and payoffs up to and including the current trial. The simulation method, on the other hand, is only provided with the sampled parameter values, and relies on generating choices for another sequence of payoffs that could have been observed.<sup>7</sup> In particular, on each trial, the simulation method generates a choice based on the predicted choice probabilities. For both methods and for each participant, we repeated the process of obtaining the predicted choice probabilities 100 times to account for uncertainty in the individual-level joint posteriors (for detailed recipes see Steingroever et al., in press).<sup>8</sup>
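The difference between the two methods can be illustrated with a short sketch for a single parameter draw. Equations 1, 2, and 4 are not reproduced in this section, so the functional forms below (a prospect utility function with shape *A* and loss aversion *w*, a delta learning rule with updating parameter *a*, and a softmax choice rule with sensitivity θ = 3<sup>*c*</sup> − 1) are assumptions based on the usual formulation of the PVL-Delta model. All function names, the zero starting expectancies, and the `payoff_fn` argument are hypothetical.

```python
import math
import random


def utility(net_outcome, A, w):
    """Prospect utility of a net outcome (assumed form of Equation 1)."""
    if net_outcome >= 0:
        return net_outcome ** A
    return -w * abs(net_outcome) ** A


def choice_probs(ev, c):
    """Softmax over deck expectancies with theta = 3**c - 1 (assumed)."""
    theta = 3.0 ** c - 1.0
    top = max(theta * v for v in ev)  # subtract the maximum for stability
    weights = [math.exp(theta * v - top) for v in ev]
    total = sum(weights)
    return [x / total for x in weights]


def delta_update(ev, deck, net_outcome, A, w, a):
    """Delta rule: only the expectancy of the chosen deck is updated."""
    ev = list(ev)
    ev[deck] += a * (utility(net_outcome, A, w) - ev[deck])
    return ev


def post_hoc_fit(choices, net_outcomes, A, w, a, c, n_decks=4):
    """One-step-ahead choice probabilities given the observed history."""
    ev = [0.0] * n_decks
    probs = [choice_probs(ev, c)]  # prediction for the first trial
    for deck, x in zip(choices[:-1], net_outcomes[:-1]):
        ev = delta_update(ev, deck, x, A, w, a)  # learn from observed data
        probs.append(choice_probs(ev, c))        # predict the next trial
    return probs


def simulate(payoff_fn, n_trials, A, w, a, c, n_decks=4, rng=None):
    """Generate choices from the model alone; payoff_fn(deck, trial)
    stands in for the payoff schedule of the experiment."""
    rng = rng or random.Random()
    ev = [0.0] * n_decks
    choices = []
    for t in range(n_trials):
        deck = rng.choices(range(n_decks), weights=choice_probs(ev, c))[0]
        ev = delta_update(ev, deck, payoff_fn(deck, t), A, w, a)
        choices.append(deck)
    return choices
```

In `post_hoc_fit` the expectancies are always updated with the participant's own choices and payoffs, so prediction errors cannot accumulate. In `simulate` every update depends on the model's own earlier choices, which is why the simulation method is the stricter test.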

To investigate the effect of the experimental manipulations, we visually compared the posterior distributions of the group-level parameters of all four conditions.

## *4.1.1. Bayesian hierarchical estimation procedure*

To fit the PVL-Delta model to the data of the four experimental conditions, we used a Bayesian hierarchical estimation procedure (see Wetzels et al. (2010) for the same model specification in the case of the EV model). The Bayesian graphical PVL-Delta model for a hierarchical analysis is shown in **Figure 8**. This figure shows that the graphical model consists of two plates: The inner plate expresses the replications of the choices on $t = 1, \ldots, T$ trials of the IGT, and the outer plate expresses the replications for $i = 1, \ldots, N$ participants. For the sake of clarity, we omitted the notation that indexes the deck number $k$. The quantities $W_{i,t}$ (rewards of participant $i$ on trial $t$), $L_{i,t}$ (losses of participant $i$ on trial $t$), and $Ch_{i,t+1}$ (choice of participant $i$ on trial $t + 1$) can directly be obtained from the data. The quantities $u_{i,t}$, $Ev_{i,t+1}$, and $\theta_i$ are deterministic because they can be calculated from Equations 1, 2, and 4. All individual-level parameters $z_i$, that is, $\{A_i, w_i, a_i, c_i\}$, are also deterministic because instead of modeling the individual-level parameters directly, we modeled their respective probit transformations $z'_i$, that is, $\{A'_i, w'_i, a'_i, c'_i\}$. This means that the parameters $z'_i$ lie on the probit scale covering the entire real line. The probit transformation is the inverse of the cumulative standard normal distribution function. The parameters $z'_i$ are assumed to be drawn from group-level normal distributions with mean $\mu_{z'}$ and standard deviation $\sigma_{z'}$. Only after the analysis was complete did we transform the parameters $\mu_{z'}$ and $z'_i$ back to the original scale.

The model specification requires a definition of priors for the group-level means and standard deviations. We assigned a normal prior to the group-level means, $\mu_{z'} \sim \mathcal{N}(0, 1)$, and a uniform prior to the group-level standard deviations, $\sigma_{z'} \sim \mathcal{U}(0, 1.5)$.
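The generative structure described above can be sketched by drawing from the priors: group-level means and standard deviations are sampled first, individual probit-scale parameters are then drawn from the resulting group-level normal distributions, and the standard normal distribution function maps them back to the original scale. This is a sketch of the prior structure only, not of the Stan program. The upper bounds used for rescaling (1 for the shape and updating parameters, 5 for the loss aversion and consistency parameters) are assumptions that are not stated in this section, and all names are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

# Assumed upper bounds of the parameters on the original scale.
UPPER = {"A": 1.0, "w": 5.0, "a": 1.0, "c": 5.0}


def to_original_scale(z_prime, upper=1.0):
    """Map a probit-scale value to (0, upper) via the normal CDF."""
    return upper * STD_NORMAL.cdf(z_prime)


def to_probit_scale(value, upper=1.0):
    """Inverse mapping: original scale to the probit scale."""
    return STD_NORMAL.inv_cdf(value / upper)


def sample_group(rng):
    """Draw group-level means and standard deviations from the priors."""
    return {name: (rng.gauss(0.0, 1.0), rng.uniform(0.0, 1.5))
            for name in UPPER}


def sample_individual(group, rng):
    """Draw one participant's parameters given the group-level values."""
    params = {}
    for name, (mu, sigma) in group.items():
        z_prime = rng.gauss(mu, sigma)  # individual value on probit scale
        params[name] = to_original_scale(z_prime, UPPER[name])
    return params
```

Working on the probit scale lets the group-level distributions be unbounded normals while every individual-level parameter still falls inside its admissible range after the back-transformation.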

We implemented the PVL-Delta model in Stan (Hoffman and Gelman, 2011; Stan Development Team, 2013a,b). The code to fit the PVL-Delta model in Stan is available on http://www.helensteingroever.com. To confirm that we correctly implemented the PVL-Delta model, we ran several parameter-recovery studies. The results of two such studies are presented in the Appendix.

<sup>6</sup>Note that we use the data sets that Wetzels et al. (2010) obtained after having eliminated two sources of contamination. Specifically, Wetzels et al. (2010) removed participants for whom one or more of the maximum likelihood point estimates were located on the boundary of the parameter space, and participants for whom the Bernoulli baseline model outperformed the EV model.

<sup>7</sup>Note that we used the same payoff schedule as in the corresponding experiment.

<sup>8</sup>For completeness, we also produced predicted choices based on the joint posterior of the group-level parameters (hereafter group-level joint posterior); that is, we generated data with 1000 parameter values that were randomly drawn from the group-level joint posterior. There are slight differences between the two types of posterior predictives, but the general conclusions are the same (see Appendix for further details).

For each parameter, we ran three HMC chains simultaneously. The fitting procedure consisted of two steps: First, we initialized all chains with randomly generated starting values. We collected 1000 samples of each chain after having discarded the first 9000 samples of each chain as burn-in. However, this procedure did not result in successful convergence of the HMC chains of all parameters: for instance, for some parameters, two chains appeared to have converged to their stationary distributions and looked like "hairy caterpillars" that are randomly intermixed, whereas the third chain behaved differently and produced an inferior goodness of fit (GOF). Therefore, in a second step, we again ran three HMC chains for each parameter, but this time, we initialized all chains with parameter values close to the mean of the HMC chain that produced the best GOF in the first step. However, even this procedure resulted in convergence problems for a few participants (e.g., bimodal posterior distributions). We therefore excluded participants with such convergence issues and repeated the first and second step. This explains why the sample sizes presented in **Table 5** are slightly smaller than those reported by Wetzels et al. (2010).

**Table 5** also presents, for each data set separately, the number of burn-in samples and posterior samples that we collected for each chain. These specifications differ across data sets to ensure that all chains reached convergence. We based our inferences on these posterior samples.



#### **4.2. RESULTS**

In this section, we discuss the results of the test of selective influence. We first focus on the behavioral level by describing the choice patterns observed in the four experimental conditions. Second, we focus on the level of the cognitive modeling analyses; we describe tests confirming that the posterior distributions converged successfully from their starting values to their stationary distributions. In addition, we show that the PVL-Delta model results in a satisfactory fit performance and simulation performance for the four conditions. Finally, we visually compare the posterior distributions of the group-level parameters of all four conditions to draw inferences about the effect of the experimental manipulations.

#### *4.2.1. Behavioral data*

The mean proportions of choices from each deck within 15 blocks, each containing 10 trials, as observed in the four experimental conditions reported by Wetzels et al. (2010) are presented in the first column of **Figure 9**. In the standard condition, participants learned to prefer good deck C over all remaining decks; however, participants failed to learn that deck D is also a good deck.

In the rewards condition (i.e., participants were instructed to pay more attention to rewards and to consider losses as less important), participants learned to prefer bad deck B over all remaining decks. Note that even though bad decks A and B both yield high immediate rewards on every trial, participants did not learn to select deck A more often than good decks C and D. This may suggest that the experimental manipulation was only partly successful.

In the updating condition (i.e., each choice was followed by an on-screen presentation of five numbers that participants had to remember because, after the next choice, they were asked about the relative position of one of the numbers), participants show a very weak learning curve; they only learned to avoid deck A.

In the consistency condition (i.e., participants were told that after every 10 trials the payoff schemes for the decks could have changed), participants, in contrast to the intention of the experimental manipulation, did not evenly explore all decks across the entire 150 trials. Instead, participants learned to prefer decks B and C over the remaining decks. It seems that participants prefer bad deck B because it yields high immediate rewards on the majority of the trials; however, participants prefer good deck C because it never yields a net loss and is therefore a safe option.
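The block-wise choice proportions described in this subsection can be computed along the following lines. The sketch assumes that each participant's choices are stored as a sequence of deck labels of equal length and that proportions are first computed per participant and then averaged; the function name and data layout are hypothetical.

```python
def block_proportions(choices_by_participant, block_size=10, decks="ABCD"):
    """Mean proportion of choices from each deck within consecutive
    blocks of `block_size` trials, averaged over participants."""
    n_trials = len(choices_by_participant[0])
    n_blocks = n_trials // block_size
    n_participants = len(choices_by_participant)
    result = {deck: [] for deck in decks}
    for b in range(n_blocks):
        lo, hi = b * block_size, (b + 1) * block_size
        for deck in decks:
            # Proportion of this deck within the block, per participant.
            per_participant = [list(seq[lo:hi]).count(deck) / block_size
                               for seq in choices_by_participant]
            result[deck].append(sum(per_participant) / n_participants)
    return result
```

For a 150-trial IGT with the default block size, each deck is summarized by 15 values, which correspond to the points of one learning curve.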

## *4.2.2. Convergence checks*

Visual inspection of the HMC chains and consideration of the $\hat{R}$ statistics for all parameters (all parameters had $\hat{R}$ values below 1.045) suggest that all chains have converged successfully.


To illustrate how we visually assessed convergence, we show the chains of one individual-level parameter in the Appendix. From the figure it is evident that the chains have converged successfully from their starting values to their stationary distribution, looking like "hairy caterpillars" that are randomly intermixed.

#### *4.2.3. Absolute model performance*

To assess the absolute model performance of the PVL-Delta model with respect to the four experimental conditions, we present the fit performance and simulation performance in the second and third columns of **Figure 9**, respectively. Fit performance and simulation performance are based on random draws from the individual-level joint posterior. From the second column of the figure it is evident that the PVL-Delta model provides a good fit to the data of all four conditions (i.e., the model makes accurate one-step-ahead predictions when provided with access to the observed sequence of choices and payoffs). In addition, the third column of **Figure 9** illustrates that the PVL-Delta model adequately generates the choice patterns shown by the standard and updating conditions. In the case of the rewards and consistency conditions, the simulation performance of the PVL-Delta model is acceptable; the model correctly predicts the most preferred deck, but fails to account for the rank order of the remaining three decks: in the rewards condition, the model predicts that deck D is preferred over decks A and C even though the participants chose these three decks about equally often. In the consistency condition, the model predicts that deck D is preferred over deck C even though the participants showed the reverse pattern. To sum up, the PVL-Delta model captures the global patterns in the data, providing an acceptable fit and simulation performance with respect to the four data sets at hand; this allows us to meaningfully compare the group-level parameters of the four conditions.

### *4.2.4. Test of selective influence*

**Figure 10** presents the posterior distributions of the group-level parameters of all four conditions. It is evident that the experimental manipulation is successfully reflected in the loss aversion parameter and the consistency parameter: first, compared to participants who received the standard instruction, participants who were instructed to focus on rewards (i.e., the rewards condition) had lower values for the loss aversion parameter, indicating that they were indeed more reward-seeking. Second, fitting the PVL-Delta model to data of participants who were told that after every 10 trials the payoff schemes for the decks could have changed (i.e., the consistency condition) resulted in a smaller consistency parameter (i.e., a more random choice behavior) than fitting the PVL-Delta model to data of participants who received the standard instructions. However, in the updating condition there is no clear effect on the updating parameter. Yet, it is evident that the consistency parameter in the updating condition is noticeably lower than in the standard condition (i.e., a more random choice behavior); this is consistent with the choice pattern shown by the updating condition: participants only learned to avoid deck A, but show no distinguishable preference among the remaining three decks.

## **5. DISCUSSION**

In this article, we conducted two tests to validate the PVL-Delta model: a parameter space partitioning study and a test of selective influence. Applying PSP to the PVL-Delta model, we have obtained a deeper understanding of the model's behavior. We used two different definitions of choice patterns; the broad definition allowed us to get an indication of how central each of the choice patterns is to the model's overall performance, and the restricted definition allowed us to assess the model's data-fitting potential when confronted with data featuring pronounced deck preferences.

Using the broad definition of choice patterns, the PSP study revealed that the PVL-Delta model can generate all typical empirical choice patterns. However, the PVL-Delta model generates the choice pattern featuring a preference for the bad decks only over a minor part of its parameter space suggesting that this choice pattern is virtually irrelevant to the model's overall performance.

Using the restricted definition of choice patterns, the PSP study revealed that the PVL-Delta model can still generate all pronounced empirical choice patterns over a minor part of its parameter space. But for these pronounced choice patterns, the PVL-Delta model generally underestimates the empirical switch proportions during the last 50 trials. In particular, given pronounced preferences for the bad decks, the PVL-Delta model fails to account for the empirical switch behavior. This failure seems to be caused by the Prospect Utility function of the PVL-Delta model: in a previous PSP study, Steingroever et al. (2013b) showed that this failure is also present in the PVL and EV-PU models (i.e., models with the Prospect Utility function), but not in the EV model (i.e., a model without the Prospect Utility function). However, in the case of the other two pronounced choice patterns (the choice patterns favoring decks with high expected value or low loss frequency), we showed that the PVL-Delta model provides a good account of the empirical switch behavior for some parameter combinations.

The results of the PSP study for the PVL-Delta model and the earlier PSP studies for the EV and PVL models (Steingroever et al., 2013b) suggest that the PVL-Delta model outperforms its two competitors. The EV model fails to generate a pronounced preference for the decks with infrequent losses; the PVL model is able to generate pronounced deck preferences, but underestimates the switch proportions even more strongly than the PVL-Delta model. This superiority of the PVL-Delta model is in line with the posterior predictive checks reported by Steingroever et al. (in press).

An important advantage of PSP is that it is a global analysis technique augmenting local methods that have previously been used to compare RL models (Pitt et al., 2006, 2008). Whereas local methods, such as the *post hoc* fit criterion or the generalization criterion, evaluate a model's performance at a single point of a model's parameter space, global methods such as PSP help us to determine the full range of choice patterns that a model can generate by varying its parameter values (see also Vanpaemel, 2009). This means that we can obtain a global perspective on the data-fitting potential of the PVL-Delta model. Thus, if researchers wish to apply the PVL-Delta model to IGT data, they can decide based on the behavioral results whether it is appropriate to apply the PVL-Delta model or not.

The PSP results of this paper should be interpreted with care. PSP gives an indication of how central choice patterns are to the overall performance of the model. However, it is premature to conclude that the PVL-Delta model cannot generate the choice pattern "bad decks over good decks" at all, solely because the model generates this choice pattern over a small part of the parameter space. Instead, we can only conclude that this choice pattern is not central to the model's overall performance.

It should also be noted that the inferences drawn from the PSP study strongly depend on our definitions of choice patterns. The restricted definition of choice patterns was based on IGT performance of healthy participants (Steingroever et al., 2013b). We could thus detect inconsistencies between the empirical popularity of each pronounced choice pattern in the Steingroever et al. (2013a) data pool and the frequency predicted by the PVL-Delta model. It is troubling that the PVL-Delta model fails to generate a pronounced preference for the bad decks with many switches. But it should be acknowledged that this choice pattern is not central in healthy participants' IGT performance: in the Steingroever et al. (2013a) data pool, only 5% (*N* = 18) of the healthy participants showed this choice pattern (**Table 2**). Still, this choice pattern is assumed to be characteristic of patients with decision-making deficits (Bechara et al., 1994, 1997), but a better empirical foundation (e.g., a literature review on the IGT performance of clinical groups) is required to accurately judge the gravity of the PVL-Delta model's failure to generate a pronounced preference for the bad decks with many switches.

The test of selective influence revealed that the experimental manipulations had a noticeable effect on the loss aversion parameter and consistency parameter, but not on the updating parameter. However, it is premature to conclude that the updating parameter does not correspond to memory processes. It may be that the experimental manipulation did not work out properly. In addition, one should bear in mind that every data set is characterized by its own idiosyncrasies. IGT data generally are highly idiosyncratic—possibly because the IGT is a very complex task (Steingroever et al., 2013a). In order to be able to draw more accurate conclusions on whether the parameters represent distinct psychological processes, independent repetitions of the test of selective influence and even different experimental manipulations are necessary.

The results of this article confirm that the PVL-Delta model is an attractive alternative to the popular EV and PVL models. However, the PVL-Delta model is also characterized by a few shortcomings because it underrepresents the choice pattern featuring a preference for the bad decks. Nevertheless, we recommend that researchers use the PVL-Delta model to disentangle psychological processes underlying IGT performance, provided that they rigorously assess absolute model performance before interpreting the model parameters (Steingroever et al., in press).

## **ACKNOWLEDGMENTS**

This publication was supported by the Dutch national program COMMIT and by an Open Access grant from NWO.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 June 2013; accepted: 14 November 2013; published online: 03 December 2013.*

*Citation: Steingroever H, Wetzels R and Wagenmakers E-J (2013) Validating the PVL-Delta model for the Iowa gambling task. Front. Psychol. 4:898. doi: 10.3389/fpsyg. 2013.00898*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Steingroever, Wetzels and Wagenmakers. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **APPENDIX**

In this appendix, we present how we visually assessed convergence, additional absolute model performance checks, and the results of two parameter-recovery studies that confirm that we correctly implemented the PVL-Delta model. For the parameter-recovery studies, we used two synthetic data sets that were generated with the PVL-Delta model. The data-generating parameters correspond to the medians of the individual-level joint posteriors that were obtained by fitting the PVL-Delta model to two real data sets.

**Figure A1** shows the HMC chains of one individual-level parameter. From the figure it is evident that the chains have converged successfully from their starting values to their stationary distribution, looking like "hairy caterpillars" that are randomly intermixed. We inspected this type of plot for every parameter to visually assess convergence in addition to the formal diagnostic measure of convergence $\hat{R}$.

**Figure A2** presents the fit performance and simulation performance of the PVL-Delta model that was obtained with random draws from the joint posterior distributions over the group-level parameters (hereafter group-level joint posteriors). Note that **Figure 9** presents the fit performance and simulation performance based on the individual-level joint posteriors. A comparison of both figures reveals that the fit performance based on the group-level joint posteriors (**Figure A2**) closely matches the fit performance based on the individual-level joint posteriors (**Figure 9**). However, there are a few discrepancies in the case of the simulation performance: from **Figures 9**, **A2** it is evident that the simulation performance based on the group-level joint posteriors is more extreme, that is, the most preferred deck is preferred even more strongly, whereas the least preferred deck is avoided even more strongly. However, it is evident that in general **Figure A2** mirrors the conclusion drawn from **Figure 9**.

**Figure A3** presents the results of the first recovery study. This data set contains 18 synthetic participants. The figure contains four panels; each panel illustrates the recovery of one of the four model parameters. In each panel, the mode of the group-level posterior is represented by the dotted line, whereas the solid line represents the true group-level parameter. In addition, the panels can also be used to assess the individual-level recovery: the unfilled dots represent the modes of the individual-level posteriors, whereas the filled dots represent the true individual-level parameters.

Note that the individual-level posterior distributions are not sorted by the subject ID; in order to visualize the degree of individual differences in each model parameter, we sorted the individual-level posterior distributions by the true individual-level parameters.

From **Figure A3** it is evident that the group-level updating parameter is slightly underestimated, but the remaining group-level parameters are recovered very accurately. However, the recovery of the individual-level parameters is less accurate. Especially in the case of the shape parameter, most of the individual-level modes differ from the true individual-level parameters by regressing to the mode of the group-level parameter (i.e., shrinkage); small deviations are noticeable in the case of the individual-level loss aversion parameters and the individual-level updating parameter. Yet, in the case of the consistency parameter, most individual-level parameters are recovered very accurately.

**Figure A4** presents the results of the second recovery study. This data set contains 30 synthetic participants. It is evident that all group-level parameters are recovered very accurately. However, the recovery of the individual-level parameters is less accurate. Especially in the case of the individual-level shape parameters and the individual-level loss aversion parameters, it is evident that the individual-level modes differ from the true individual-level parameters. Yet, the recovery of the individual-level updating parameters and the individual-level consistency parameters is adequate.

**absolute model performance.** The first column shows the mean proportion of choices from each deck within 15 blocks as observed in the four experimental conditions reported by Wetzels et al. (2010). Each block contains 10 trials. The second and third columns show the fit performance and simulation performance, respectively, for each of the four conditions. Fit performance and simulation performance are based on random draws from the joint posteriors.

## An improved cognitive model of the Iowa and Soochow Gambling Tasks with regard to model fitting performance and tests of parameter consistency

*Junyi Dai<sup>1,2</sup>\*, Rebecca Kerestes<sup>3</sup>, Daniel J. Upton<sup>4</sup>, Jerome R. Busemeyer<sup>1</sup> and Julie C. Stout<sup>4</sup>*

*<sup>1</sup> Decision Research Laboratory, Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, USA, <sup>2</sup> Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany, <sup>3</sup> Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA, <sup>4</sup> Department of Psychological Sciences, Monash University, Clayton, VIC, Australia*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*William Hedgcock, University of Iowa, USA; Darrell A. Worthy, Texas A&M University, USA*

#### *\*Correspondence:*

*Junyi Dai, Center for Adaptive Rationality, Max Planck Institute for Human Development, Lentzeallee 94, Berlin 14195, Germany dai@mpib-berlin.mpg.de*

#### *Specialty section:*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology*

> *Received: 14 November 2014 Accepted: 14 February 2015 Published: 12 March 2015*

#### *Citation:*

*Dai J, Kerestes R, Upton DJ, Busemeyer JR and Stout JC (2015) An improved cognitive model of the Iowa and Soochow Gambling Tasks with regard to model fitting performance and tests of parameter consistency. Front. Psychol. 6:229. doi: 10.3389/fpsyg.2015.00229*

The Iowa Gambling Task (IGT) and the Soochow Gambling Task (SGT) are two experience-based risky decision-making tasks for examining decision-making deficits in clinical populations. Several cognitive models, including the expectancy-valence learning (EVL) model and the prospect valence learning (PVL) model, have been developed to disentangle the motivational, cognitive, and response processes underlying the explicit choices in these tasks. The purpose of the current study was to develop an improved model that can fit empirical data better than the EVL and PVL models and, in addition, produce more consistent parameter estimates across the IGT and SGT. Twenty-six opiate users (mean age 34.23; SD 8.79) and 27 control participants (mean age 35; SD 10.44) completed both tasks. Eighteen cognitive models varying in evaluation, updating, and choice rules were fit to individual data and their performances were compared to that of a statistical baseline model to find a best fitting model. The results showed that the model combining the prospect utility function treating gains and losses separately, the decay-reinforcement updating rule, and the trial-independent choice rule performed the best in both tasks. Furthermore, the winning model produced more consistent individual parameter estimates across the two tasks than any of the other models.

Keywords: Iowa Gambling Task, Soochow Gambling Task, cognitive modeling, parameter consistency, opiate users

## Introduction

The Iowa Gambling Task (IGT; Bechara et al., 1994) and the Soochow Gambling Task (SGT; Chiu et al., 2008) are experience-based risky decision-making tasks. The IGT has been used in numerous studies to examine decision-making deficits in various clinical populations, such as people with brain damage (e.g., Bechara et al., 1994, 1999), neurodegenerative diseases (e.g., Stout et al., 2001), or drug abuse problems (Grant et al., 2000; Bechara et al., 2001; Bolla et al., 2003; Bechara and Martin, 2004; Stout et al., 2004; Gonzalez et al., 2007; Vassileva et al., 2007). The SGT was developed more recently to further distinguish influential factors for decisions using a scenario similar to the IGT (Chiu et al., 2008). While IGT studies produced ambivalent results in terms of the relevant impacts of gain-loss frequency and expected value (e.g., Dunn et al., 2006), the choice pattern of healthy participants in the SGT suggested that gain-loss frequency is more influential than expected value in determining preference in such tasks.

An important feature of the IGT and SGT is the complex interplay among motivational, cognitive, and response processes underlying the explicit choice behavior revealed in these tasks. Therefore, decision-making deficits in particular participant groups may be produced by deficiencies in different component processes. Various cognitive models have been examined to disentangle this interplay of psychological processes underlying decision task performance, and successful ones are then applied to clinical populations to identify reasons for disadvantageous choice patterns. Among them are the expectancy-valence learning model (EVL; Busemeyer and Stout, 2002) and the prospect valence learning model (PVL; Ahn et al., 2008), which have been successfully fitted to empirical data from a variety of healthy and clinical groups (Busemeyer and Stout, 2002; Stout et al., 2004; Yechiam et al., 2005; Lane et al., 2006; Fridberg et al., 2010).

In this study, we aimed to improve the cognitive models for the IGT and SGT in two major aspects. Compared with previous models, the improved model should not only provide better fits to individual data, but also demonstrate a better consistency in parameter estimates across the two tasks. The former is what we expect from a better model in general, while the latter is desired for a model which presumably captures the common decision processes underlying the two tasks.

The remainder of this article is organized as follows. First, we briefly describe and compare the IGT and SGT. Second, we describe modifications to the EVL and PVL models that might yield new improvements for quantifying the component decision processes. Third, we report a previously published behavioral study on IGT and SGT with both non-opiate user controls and clinical participants (i.e., opiate users), and compare the performances of various models in fitting individual data from the empirical study. We also report results from parameter consistency tests across tasks on the various models. The article concludes with a discussion on the implications of the new model and future research orientations.

## The IGT and SGT

The IGT was initially developed by Bechara et al. (1994) as a tool to simulate real-world risky decision-making and detect decision-making deficits of patients with brain damage. The task requires participants to choose a card from one of four decks (labeled decks A, B, C, and D respectively) on each trial and the total number of trials is unknown to participants. When a card is chosen, the payoff of that card is revealed<sup>1</sup>. The goal of the task is to maximize the total payoff. Some of the cards produce a pure gain (e.g., winning \$0.50), while others lead to a mixture of gain and loss (e.g., winning \$1 but at the same time losing \$3). The cards within each deck yield the same amounts of gain but different amounts of possible loss (see **Table 1**). Specifically, each card in decks A and B yields a gain of \$1 when turned over, while each card in decks C and D yields \$0.50. For decks A and C, five out of each set of 10 trials produce a loss in addition to a gain. For decks B and D, only one out of each set of 10 trials produces a simultaneous loss. The amounts of potential loss are manipulated so that the expected values of decks A and B are negative (i.e., losing \$2.5 in each set of 10 trials) while those of decks C and D are positive (i.e., gaining \$2.5 in each set of 10 trials). The positions of trials yielding a loss within each set of 10 trials are randomized. In summary, decks C and D are better than decks A and B in terms of long-term net gain, and therefore the former are typically called the advantageous or good decks while the latter are disadvantageous or bad ones. On the other hand, decks B and D produce net gains more frequently than decks A and C.

#### TABLE 1 | The payoff distribution of the IGT.

A typical finding in the initial application of the IGT to clinical populations was that healthy people tended to choose the good decks (i.e., decks C and D) more frequently than the bad ones (i.e., decks A and B) after gaining experience with the task, but participants with brain damage to the ventromedial prefrontal cortex (vmPFC) kept choosing the bad decks throughout the whole experiment. Bechara and colleagues (Damasio, 1994; Bechara et al., 1996) interpreted this pattern as a demonstration that people with damage to vmPFC cannot accumulate information from previous experience to foresee the long-term value of a specific deck and attributed this deficit to the incapability of producing a somatic marker to guide future decisions. However, Lin et al. (2007) and Chiu et al. (2008) questioned this interpretation as well as the design of IGT, arguing that there is a severe confounding between long-term outcome (i.e., expected value) and gain-loss frequency variables in the IGT (see also Dunn et al., 2006). Consequently, the preference for the good decks among healthy people may be partly caused by the fact that deck D produces a positive expected value as well as more net gains. This argument was supported by the phenomenon of "prominent deck B" (Toplak et al., 2005; Lin et al., 2007; see Dunn et al., 2006, for a review), which suggested that healthy people also tend to choose deck B more frequently than the somatic marker hypothesis predicts. As a result, Chiu et al. (2008) designed the SGT to eliminate the confounding between long-term outcome and gain-loss frequency.

<sup>1</sup>The payoffs of the IGT used in this study were 1/100th of the hypothetical payoffs in the original design of Bechara et al. (1994). In this way, the participants could be paid what they actually encountered in the study. The same is true for the payoffs of the SGT used in this study.

The SGT has the same surface characteristics and goal as the IGT, as well as the same expected value of each deck. However, the payoff distribution of the SGT was modified to redress the confounding in the IGT. Specifically, each card in the SGT always produces either a gain or a loss (see **Table 2**). For decks A and B, four out of each set of five trials produce a gain while the remaining one produces a large amount of loss to make the expected values of these two decks negative. In contrast, for decks C and D, four out of each set of five trials produce a loss but the remaining one produces a large amount of gain so that the expected values are positive. In this way, decks with a positive expected value also produce (net) losses more frequently. Finally, the order of payoffs is randomized within each set of five instead of 10 trials.

Chiu et al. (2008) found that, just like clinical participants, healthy people tended to choose the bad decks more often than the good ones in the SGT. Therefore, they argued that gain-loss frequency is more important than long-term outcome in predicting choice behavior in the IGT and SGT. This was echoed by a recent review of the IGT in healthy participants by Steingroever et al. (2013), which questioned the assumption that healthy people learn to prefer the options with positive expected values. Despite the apparent similarities between the IGT and SGT, the different payoff distributions warrant the use of both tasks in a single study to better understand risky choices from experience.

## Cognitive Models of the IGT and SGT

### The EVL and PVL Models

Various cognitive models of the IGT and SGT have been evaluated and compared previously in terms of their descriptive accuracy for the empirical choice pattern of healthy and clinical participants. Among them, the EVL model (Busemeyer and Stout, 2002) and the PVL model (Ahn et al., 2008) appeared to be the most successful and widely used ones. Both models are built upon three general assumptions. First, participants evaluate the positive and/or negative payoffs produced by their choice on each trial with a unidimensional utility function. Second, based on the utility of experienced payoff(s) on each trial, expectation about the utility of each deck is updated with a specific learning rule. Third, the expected utility associated with each deck then serves as an input to a probabilistic function which determines the choice probability of each deck on the next trial. In other words, the explicit behavior in the IGT and SGT is determined by the interplay of three processes, i.e., the motivational process (utility evaluation), the cognitive process (expectation updating), and the response process (deck choosing).

According to the EVL model, the evaluation process is governed by the following weighted utility function

$$u(t) = (1 - W) \cdot \text{win}(t) - W \cdot \text{loss}(t) \tag{1}$$

in which *win(t)* and *loss(t)* represent the amounts of gain and loss on trial *t* respectively, and *W* is an attention weight parameter which denotes the weight participants place on losses as opposed to gains. It is constrained between 0 and 1; the higher it is, the more attention one puts on losses than gains. We can also interpret Equation 1 as a piecewise linear utility function with an implication that participants evaluate gains and losses separately.

The expectation updating rule involved in the EVL model is the following delta-learning rule (Rescorla and Wagner, 1972),

$$E_j(t) = E_j(t-1) + A \cdot \delta_j(t) \cdot [u(t) - E_j(t-1)] \tag{2}$$

in which *Ej(t)* represents the expectancy or expected utility for deck *j* on trial *t* and *A* is an updating parameter denoting the influential power of the current outcome on the expectancy for a deck. The value of *A* should be constrained between 0 and 1 so that the new expectancy after updating is bounded between the old expectancy and the utility of the immediate outcome. The variable δ*j(t)* in Equation 2 is a dummy variable indicating the deck chosen on trial *t*. For example, if deck C is chosen on trial t, then δ*C(t)* = 1, and δ*j(t)* = 0 for *j* = A, B, D. When *A* equals 0, the expectancy for each deck will not change (i.e., *Ej*(*t*) = *Ej*(*t* − 1)); when *A* equals 1, the expectancy for the chosen deck will be identical to the utility of the immediate outcome and those for the other decks will remain unchanged.
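As a rough illustration of Equations 1 and 2, the following Python sketch evaluates a single card and updates the deck expectancies. The function names, the dictionary representation of the decks, the zero starting expectancies, and the parameter values are illustrative assumptions and are not taken from the original model specification; losses are entered as non-negative magnitudes.

```python
def weighted_utility(win, loss, W):
    """Equation 1: attention-weighted utility (loss treated as a magnitude)."""
    return (1.0 - W) * win - W * abs(loss)


def delta_update(expectancies, chosen, u, A):
    """Equation 2: move only the chosen deck's expectancy toward the utility u."""
    updated = dict(expectancies)  # unchosen decks keep their old expectancy
    updated[chosen] = expectancies[chosen] + A * (u - expectancies[chosen])
    return updated


# Hypothetical starting point: all expectancies equal to zero.
E = {"A": 0.0, "B": 0.0, "C": 0.0, "D": 0.0}

# A mixed card from deck A: win $1 and lose $3, with equal attention to both.
u = weighted_utility(win=1.0, loss=3.0, W=0.5)
E = delta_update(E, chosen="A", u=u, A=0.2)
print(u, E)
```

With *A* = 0 the call leaves every expectancy unchanged, and with *A* = 1 the chosen deck's expectancy becomes exactly the utility of the current outcome, matching the two limiting cases described above.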

Finally, the choice rule assumed by the EVL model is a trial-dependent ratio-of-strength rule (Luce, 1959). Specifically, the choice probability of each deck on trial *t* + 1 is

$$\Pr[D(t+1) = j] = \frac{e^{\theta(t) \cdot E_j(t)}}{\sum_{k} e^{\theta(t) \cdot E_k(t)}} \tag{3}$$

in which *D*(*t* + 1) = *j* represents choosing deck *j* on trial *t* + 1 and θ(*t*) is a sensitivity parameter which determines the sensitivity of choice probabilities to expectancies on trial *t*. Equation 3 suggests that the higher the expectancy of a deck is, the more likely it will be chosen on the next trial. The trial-dependent choice (TDC) rule further assumes that

$$
\theta(t) = (t/10)^{c} \tag{4}
$$

in which *c* is a consistency parameter. This type of choice rule implies a changing sensitivity parameter over trials.
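The choice rule in Equations 3 and 4 can be sketched as follows. This is a minimal illustration only: the function names and the example expectancies are hypothetical, and subtracting the largest exponent before exponentiating is a standard numerical safeguard that leaves the probabilities unchanged.

```python
import math


def tdc_sensitivity(t, c):
    """Equation 4: sensitivity on trial t under the trial-dependent choice rule."""
    return (t / 10.0) ** c


def choice_probabilities(expectancies, theta):
    """Equation 3: ratio-of-strength rule over the deck expectancies."""
    # Subtract the largest exponent for numerical stability (same probabilities).
    peak = max(theta * e for e in expectancies.values())
    strengths = {deck: math.exp(theta * e - peak) for deck, e in expectancies.items()}
    total = sum(strengths.values())
    return {deck: s / total for deck, s in strengths.items()}


# Hypothetical expectancies; decks C and D are currently the most attractive.
E = {"A": -0.2, "B": 0.1, "C": 0.3, "D": 0.3}
early = choice_probabilities(E, tdc_sensitivity(t=5, c=1.0))
late = choice_probabilities(E, tdc_sensitivity(t=100, c=1.0))
print(early)
print(late)
```

With a positive *c*, sensitivity grows over trials, so the same expectancies produce more deterministic choices late in the task; a negative *c* would produce the opposite trend, and *c* = 0 fixes θ(*t*) at 1.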

Ahn et al. (2008) extended the literature by further exploring different formalizations of each of the three processes in an attempt to find a better model in terms of both descriptive and predictive accuracy for the IGT and SGT. Specifically, they tried a utility function based on the prospect theory (Kahneman and Tversky, 1979) in addition to the original piecewise linear utility function in the EVL, a decay-reinforcement learning (DRL) rule (Erev and Roth, 1998) in addition to the original delta-learning rule, and a trial-independent choice (TIC) rule as well as the original TDC rule.

According to the prospect utility function in Ahn et al. (2008),

$$u(t) = \begin{cases} x(t)^{\alpha} & \text{if } x(t) \ge 0 \\ -\lambda |x(t)|^{\alpha} & \text{if } x(t) < 0 \end{cases} \tag{5}$$

in which *x*(*t*) is the net payoff (i.e., *win(t)* - |*loss(t)*|) on trial *t*, α is the exponent of the power function which prescribes the shape of the utility function, and λ is a loss aversion parameter. The value of α is typically constrained between 0 and 1. When α equals 0, the prospect utility function reduces to a step function with all net gains producing the same positive utility (i.e., 1) and all net losses producing the same negative utility (i.e., –λ). When α equals 1, the utility of a net gain is equal to its objective value (i.e., *x(t)*), and the utility of a net loss is proportional to its objective value with λ serving as the proportional constant. When α is strictly between 0 and 1, the utility function is curved with diminishing marginal utility. The loss aversion parameter λ denotes how much an individual participant is averse to losses relative to his/her degree of preference toward gains of the same magnitude. A value of λ greater than 1 indicates that the negative utility of a loss would more than counterbalance the positive utility of a gain of the same magnitude, while a value of λ smaller than 1 suggests the opposite. The specific form of prospect utility function explored by Ahn et al. (2008) suggests that only the net payoffs are evaluated, whereas the utility function of the EVL model implies that gains and losses are evaluated separately before being combined.
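The behavior of Equation 5 at the boundary values of α can be checked with a short sketch. The function name and the example payoffs are illustrative assumptions; `lam` stands for λ, and losses are passed as non-negative magnitudes.

```python
def prospect_utility(win, loss, alpha, lam):
    """Equation 5: prospect utility of the net payoff x = win - |loss|."""
    x = win - abs(loss)
    if x >= 0:
        return x ** alpha
    return -lam * abs(x) ** alpha


# alpha = 1: utilities are linear in the net payoff, losses scaled by lam.
print(prospect_utility(win=1.0, loss=3.0, alpha=1.0, lam=2.0))   # net loss of 2

# alpha = 0: a step function, +1 for any net gain and -lam for any net loss.
print(prospect_utility(win=0.5, loss=0.0, alpha=0.0, lam=2.0))
print(prospect_utility(win=0.0, loss=0.5, alpha=0.0, lam=2.0))

# 0 < alpha < 1: diminishing marginal utility.
print(prospect_utility(win=0.25, loss=0.0, alpha=0.5, lam=2.0))
```

Because only the net payoff enters the function, a card with equal gain and loss always receives a utility of zero whenever α is positive.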

The DRL rule in the PVL model suggests that in the short period between two successive trials, the expectancy resulting from the first trial decays, and the updated expectancy after the second trial is the sum of the decayed expectancy and the utility of the current outcome. Specifically,

$$E_j(t) = A \cdot E_j(t-1) + \delta_j(t) \cdot u(t) \tag{6}$$

in which *A* is a recency parameter and δ*j(t)* is a dummy variable as in the delta-learning rule. Unlike the delta learning rule, the DRL rule implies that the expectancies of the unchosen decks would decrease on each trial.
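A minimal sketch of the DRL rule in Equation 6 is given below; the function name and the numerical values are hypothetical and serve only to show how chosen and unchosen decks are treated.

```python
def drl_update(expectancies, chosen, u, A):
    """Equation 6: decay every expectancy by A, then add u to the chosen deck."""
    updated = {deck: A * e for deck, e in expectancies.items()}
    updated[chosen] += u
    return updated


# Hypothetical expectancies before the trial; deck A is chosen and yields u = 5.
E = {"A": 10.0, "B": 4.0, "C": -2.0, "D": 0.0}
E_next = drl_update(E, chosen="A", u=5.0, A=0.9)
print(E_next)
```

In this example the expectancies of the unchosen decks shrink toward zero by the factor *A*, whereas the chosen deck ends up at 0.9 × 10 + 5 = 14, above both its previous expectancy and the utility just experienced.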

Finally, the TIC rule explored by Ahn et al. (2008) assumes that θ(*t*) is invariant across trials. Specifically,

$$
\theta(t) = \theta = 3^{c} - 1 \tag{7}
$$

in which *c* is a consistency parameter as in the EVL model. The higher its value is, the more consistent one's choice will be with the expectancies of the four decks. When *c* equals 0, θ = 0, suggesting that choice among the four decks will be totally random no matter how different their expectancies are. When *c* is relatively large, θ will be quite large, suggesting that people will almost surely choose the deck with the highest expectancy. The results of Ahn et al. (2008) favored a model combining the prospect utility function, DRL rule, and TIC rule. The resultant model is usually called the PVL model<sup>2</sup>.
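The two limiting cases of the TIC rule can be illustrated with a brief sketch that combines Equation 7 with the ratio-of-strength rule of Equation 3. The function names and the expectancies are hypothetical.

```python
import math


def tic_sensitivity(c):
    """Equation 7: trial-independent sensitivity, theta = 3**c - 1."""
    return 3.0 ** c - 1.0


def choice_probabilities(expectancies, theta):
    """Equation 3 with a constant sensitivity theta."""
    peak = max(theta * e for e in expectancies.values())
    strengths = {deck: math.exp(theta * e - peak) for deck, e in expectancies.items()}
    total = sum(strengths.values())
    return {deck: s / total for deck, s in strengths.items()}


E = {"A": 0.1, "B": 0.2, "C": 0.5, "D": 0.4}   # hypothetical expectancies

random_choice = choice_probabilities(E, tic_sensitivity(0.0))   # theta = 0
near_greedy = choice_probabilities(E, tic_sensitivity(5.0))     # theta = 242
print(random_choice)
print(near_greedy)
```

With *c* = 0 every deck receives a probability of 0.25 regardless of its expectancy, while with *c* = 5 nearly all of the probability mass falls on deck C, the deck with the highest expectancy.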

### Alternative Cognitive Models of the IGT and SGT

Although the EVL and PVL models have been successfully applied to various populations to disentangle the interplay between component processes underlying the IGT and SGT, there are other ways to model these two tasks. Indeed, the common structure of these two models suggests using alternative utility functions, updating rules, and/or choice rules to generate potentially better models. Consequently, in this article we propose a new utility function and a new updating rule, which will be combined with complementary components in the EVL and PVL models to create new cognitive models of the IGT and SGT.

According to the prospect utility function in the PVL model, the utility of an outcome with the same amounts of gain and loss is always zero. This is due to the fact that the net outcome is zero under this condition and only net outcome is evaluated according to the PVL model. In contrast, the same property holds for the weighted utility function of the EVL model only when the attention weight parameter, W, equals 0.5 and thus gains and losses attract the same amounts of attention. When selecting a card leads to both gain and loss of the same magnitude, a participant's overall feeling may not be neutral because, for example, the sadness associated with the loss may not be completely offset by the gain. Here, we propose an alternative form of prospect utility function that combines features of utility functions in both the EVL and PVL models,

$$u(t) = [\text{win}(t)]^{\alpha} - \lambda \cdot |\text{loss}(t)|^{\alpha} \tag{8}$$

in which *win(t)* and *loss(t)* represent the amounts of gain and loss on trial *t*, and α and λ have the same meanings as in the PVL model. On the one hand, like the weighted utility function, this utility function evaluates gains and losses separately before aggregating the results to generate a comprehensive utility. On the other hand, the new utility function retains the assumptions of prospect theory concerning non-linearity (i.e., α) and loss aversion (i.e., λ).
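The difference between the net-payoff utility of Equation 5 and the alternative utility of Equation 8 can be made concrete with a short comparison. The function names and parameter values below are illustrative assumptions; `lam` denotes the loss aversion parameter, losses are entered as magnitudes, and α is assumed to be strictly positive (Python evaluates `0.0 ** 0.0` as 1, so the α = 0 edge case behaves differently).

```python
def utility_net(win, loss, alpha, lam):
    """Equation 5: prospect utility applied to the net payoff."""
    x = win - abs(loss)
    return x ** alpha if x >= 0 else -lam * abs(x) ** alpha


def utility_separate(win, loss, alpha, lam):
    """Equation 8: gain and loss transformed separately, then combined."""
    return win ** alpha - lam * abs(loss) ** alpha


alpha, lam = 0.5, 2.0

# A card with equal gain and loss: neutral under Equation 5, negative under Equation 8.
print(utility_net(1.0, 1.0, alpha, lam), utility_separate(1.0, 1.0, alpha, lam))

# A card with a single outcome: both functions agree.
print(utility_net(0.5, 0.0, alpha, lam), utility_separate(0.5, 0.0, alpha, lam))
print(utility_net(0.0, 1.5, alpha, lam), utility_separate(0.0, 1.5, alpha, lam))
```

The two functions therefore differ only on trials that deliver a gain and a loss at the same time, which is where the proposed utility function is intended to capture a non-neutral overall feeling.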

<sup>2</sup>The definitive feature of the PVL model is the use of the specific form of prospect utility function explored in Ahn et al. (2008). See Fridberg et al. (2010) for more information.

Other modifications incorporating features of both EVL and PVL models can be applied to the updating rule. According to the updating rule of the EVL model (i.e., the delta learning rule), after a card from a specific deck is turned over, the updated expectancy of the selected deck should lie between its previous expectancy and the utility of the current outcome. In contrast, the updating rule of the PVL model (i.e., the DRL rule) suggests that participants will add the utility of the current outcome to the (decayed) expectancy of the selected deck to update its expectancy. One potential problem with the DRL rule is that the updated expectancy of the selected deck can be larger in absolute magnitude than both its previous expectancy and the utility of the current outcome. In other words, the updated expectancy is not reasonably bounded. For example, suppose *A* = 0.9, *Ej(t–1)* = 10, and *u(t)* = 5 in Equation 6 for a chosen deck. Then *Ej(t)* = 0.9 × 10 + 5 = 14, which is larger than both *Ej(t–1)* and *u(t)*. To get around this potential problem, we explore a new learning rule that incorporates the features of both delta and DRL rules,

$$E_j(t) = (1 - D) \cdot E_j(t-1) + A \cdot \delta_j(t) \cdot [u(t) - (1 - D) \cdot E_j(t-1)] \tag{9}$$

in which *D* is a decay parameter, *A* is an updating parameter, and δ*j(t)* is a dummy variable indicating the deck chosen on trial *t*<sup>3</sup>. This updating rule assumes mechanisms of both memory decay and delta-learning and thus might account for empirical data more accurately. We will hereafter call it the mixed updating rule.
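Equation 9 can be sketched in a few lines. The function name and the numbers are hypothetical; the example reuses an expectancy of 10 and a utility of 5 to show that the updated value for the chosen deck now stays between the decayed expectancy and the current utility.

```python
def mixed_update(expectancies, chosen, u, A, D):
    """Equation 9: decay all decks by (1 - D), then take a delta step for the chosen deck."""
    updated = {}
    for deck, e in expectancies.items():
        decayed = (1.0 - D) * e
        if deck == chosen:
            decayed += A * (u - decayed)   # delta step from the decayed expectancy
        updated[deck] = decayed
    return updated


E = {"A": 10.0, "B": 4.0, "C": -2.0, "D": 0.0}   # hypothetical expectancies
E_mixed = mixed_update(E, chosen="A", u=5.0, A=0.5, D=0.1)
E_delta = mixed_update(E, chosen="A", u=5.0, A=0.5, D=0.0)   # no decay
print(E_mixed)
print(E_delta)
```

With *D* = 0 the rule coincides with the delta learning rule of Equation 2, so the delta rule is a special case of the mixed updating rule. For 0 ≤ *A* ≤ 1, the chosen deck's new expectancy is a weighted average of its decayed expectancy and *u(t)*.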

In summary, with the new utility function and updating rule, we have a collection of three utility functions (i.e., the weighted utility function of the EVL model, the prospect utility function of the PVL model, and the alternative prospect utility function described above), three updating rules (i.e., the delta learning rule of the EVL model, the DRL rule of the PVL model, and the mixed updating rule), and two choice rules (i.e., the TDC rule of the EVL model and the TIC rule of the PVL model) to generate cognitive models of the two tasks. Forming all combinations, we evaluated and compared 18 cognitive models to find new models even better than the EVL and PVL models.
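The full factorial set of models can be enumerated directly. The sketch below uses the abbreviations from the note to Table 4 (EU, PU, PU2; DEL, DRL, ML; TDC, TIC); joining them with a plus sign is merely a labeling convention adopted here for illustration.

```python
from itertools import product

UTILITY_FUNCTIONS = ("EU", "PU", "PU2")   # weighted, prospect, alternative prospect
UPDATING_RULES = ("DEL", "DRL", "ML")     # delta, decay-reinforcement, mixed
CHOICE_RULES = ("TDC", "TIC")             # trial-dependent, trial-independent

models = ["+".join(parts)
          for parts in product(UTILITY_FUNCTIONS, UPDATING_RULES, CHOICE_RULES)]

print(len(models))
for name in models:
    print(name)
```

In this labeling, `EU+DEL+TDC` corresponds to the EVL model and `PU+DRL+TIC` to the PVL model; the remaining 16 combinations are the new candidates.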

## Materials and Methods

## Participants

A total of 26 opiate users (mean age = 34.23 years, SD = 8.79) and 27 age- and gender-matched control participants (mean age = 35 years, SD = 10.44) were involved in this study (see **Table 3**). Opiate users were recruited from Turning Point Alcohol and Drug Centre, a community outpatient service located in Melbourne (Australia). Opiate users were treatment seeking, either currently abstinent or taking prescribed opiate substitution medication (methadone, buprenorphine). Participants were asked to abstain from illicit drugs and alcohol for 12 h prior to the testing session (excluding opiate substitution medication). If participants reported using alcohol or drugs less than 12 h before the test session, or had a blood alcohol level reading above 0.05 mg/kg on arrival, their test session was postponed for at least 1 day. Fourteen of the opiate users (54%) and five of the controls (19%) reported having a current mood disorder [among opiate users, two had major depressive disorder (MDD), two had an anxiety disorder, eight had MDD and an anxiety disorder, and two had a bipolar disorder; among controls, two had MDD, two had an anxiety disorder and one had MDD and an anxiety disorder]. Exclusion criteria for control participants were: use of illicit drugs in the previous 6 months, history of drug and/or alcohol dependence or abuse, blood alcohol level >0.05 mg/kg confirmed on arrival to the test session. In addition, any participants from either group who had a history of psychosis were excluded. All participants provided written informed consent, and the Monash University Human Ethics Committee approved all study procedures. See **Table 3** for more information concerning the sample<sup>4</sup>.

## Procedure

The participants completed computerized versions of the IGT and SGT. The order of tasks was counterbalanced across participants. Each task had 120 trials with an unlimited number of cards in each deck. Participants were given a starting balance of \$20.00 and received any money earned above this balance at the end of the task. They could not lose any money. The total balance was updated on-screen after every selection and participants were also provided with feedback about the net change in balance after every 20 trials. Each trial was participant-initiated, and there were no time restrictions. Decks were positioned on the computer screen, from left to right, randomly across participants.

## Model Comparison Analyses

### Maximum Likelihood Estimation

A total of 18 cognitive models (3 utility functions × 3 updating rules × 2 choice rules) were fit to the choice data of each individual in each task. We used these models to predict choice probabilities of the four decks on each trial given the outcomes experienced on previous trials. The one-step-ahead predictions were then employed to evaluate the performance of each model. Specifically, we defined the likelihood of the observed choice sequence of each participant as the product of the predicted choice probabilities of the decks actually chosen across trials<sup>5</sup>, and we used maximum-likelihood estimation to find the best parameter values for each model. The log likelihood of the observed sequence is defined as

$$LL_M = \sum_{t=1}^{n-1} \sum_{j} \ln(\Pr[D_j(t+1)]) \cdot \delta_j(t+1) \tag{10}$$

In the above equation, *n* denotes the number of trials, Pr[*Dj*(*t* + 1)] represents the predicted choice probability of deck *j* on trial *t* + 1 given the sequence of choices and outcomes up to and including trial *t*, δ*j*(*t* + 1) is a dummy variable with a value of 1 if deck *j* is chosen on trial *t* + 1 and 0 otherwise, and the second summation is across the four decks. A combination of grid-search with 50 different starting positions and simplex search method (Nelder and Mead, 1965) was utilized to find the best parameter values.
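To make the one-step-ahead log likelihood in Equation 10 concrete, the sketch below computes it for a single combination of components: the alternative prospect utility function (Equation 8), the DRL rule written with a decay parameter as (1 − *D*), and the TIC rule (Equation 7). The function names, the toy choice sequence, the zero starting expectancies, and the parameter values are all assumptions made for illustration; the parameter search itself (multiple starting points followed by a simplex search) is not reproduced here.

```python
import math

DECKS = ("A", "B", "C", "D")


def pu2_utility(win, loss, alpha, lam):
    """Equation 8: gains and losses are transformed separately (loss as a magnitude)."""
    return win ** alpha - lam * abs(loss) ** alpha


def softmax(expectancies, theta):
    """Equation 3 with a trial-independent sensitivity theta."""
    peak = max(theta * e for e in expectancies.values())
    strengths = {d: math.exp(theta * e - peak) for d, e in expectancies.items()}
    total = sum(strengths.values())
    return {d: s / total for d, s in strengths.items()}


def log_likelihood(trials, alpha, lam, decay, c):
    """Equation 10 for one participant; trials is a list of (deck, win, loss)."""
    theta = 3.0 ** c - 1.0                      # Equation 7
    E = {d: 0.0 for d in DECKS}                 # assumed equal starting expectancies
    ll = 0.0
    for t, (deck, win, loss) in enumerate(trials):
        if t > 0:                               # the first trial is not scored
            ll += math.log(softmax(E, theta)[deck])
        u = pu2_utility(win, loss, alpha, lam)
        E = {d: (1.0 - decay) * e for d, e in E.items()}   # decay all decks
        E[deck] += u                                        # reinforce the chosen deck
    return ll


# Hypothetical four-trial sequence of (deck, win, loss).
toy_trials = [("A", 1.0, 0.0), ("A", 1.0, 3.0), ("C", 0.5, 0.0), ("C", 0.5, 0.25)]
print(log_likelihood(toy_trials, alpha=0.5, lam=1.5, decay=0.2, c=1.0))
```

Maximum-likelihood estimates would then be obtained by minimizing the negative of this function over the four parameters, for example with a Nelder-Mead simplex routine such as `scipy.optimize.minimize(..., method="Nelder-Mead")` started from several initial points.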

<sup>3</sup>Note that we replace the recency parameter *A* in Equation 6 [i.e., Equation 4 in Ahn et al. (2008)] with (1 – *D*) and reserve symbol *A* for the updating parameter to formulate the mixed updating rule. This leads to a clearer symbol system in this article, and higher values of *D*, the decay parameter, actually indicate more rapid memory decay. The same is true for our formulation of the decay-reinforcement learning rule.

<sup>4</sup>Behavioral data from this sample were reported and analyzed in Upton et al. (2012). In the current study we focus on the performance of various models with regard to the data.

<sup>5</sup>The first trial was actually skipped in calculating likelihood of the choice sequence since all models predict equal choice probabilities (i.e., 0.25) across decks for the first trial.

#### TABLE 3 | Summary of demographic, mood, personality, and substance use variables.


*\*p* < *0.05.*

### Model Comparisons Using the Bayesian Information Criterion

Since models explored in this study differed in number of parameters, the Bayesian information criterion (BIC; Schwartz, 1978) was used as the main performance index for model comparison, because it considers both descriptive accuracy and model complexity. We also explored a statistical baseline model as in Busemeyer and Stout (2002) and Ahn et al. (2008). This model assumes independent choices with constant probabilities across trials and served as the reference point in our model comparison. Three free parameters are involved in this model, representing the choice probabilities of the first three decks on each trial. By definition, the choice probability of the last deck equals one minus the sum of those of the first three decks. This model suggests that people choose among the four decks without considering previous choices and outcomes and thus the choice probability of each deck remains the same throughout the whole task. Consequently, a cognitive model performs better than the baseline model only if it can account for the dependency of choices on previous choices and outcomes. The BIC difference score of a specific model compared with the baseline model<sup>6</sup> is

$$BIC = 2(\widehat{LL}_M - \widehat{LL}_{\text{Baseline}}) - k \cdot \ln(n) \tag{11}$$

in which $\widehat{LL}$ denotes the maximum log-likelihood produced by a model, *k* denotes the difference in the number of parameters and *n* is the number of data points used in fitting the models. Positive values of the BIC difference score indicate that a cognitive model outperforms the baseline model and the higher the better. A more complex model tends to produce a higher maximum log-likelihood but is also associated with a larger value of *k*. Therefore, models with more parameters do not necessarily lead to higher BIC scores.
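The BIC difference score of Equation 11 can be sketched as follows. The baseline log-likelihood uses the standard result that, for independent choices with constant probabilities, the maximum-likelihood estimates are the observed choice proportions. The function names, the example log-likelihood values, and the assumption that the baseline is fit to the same scored trials as the cognitive model are illustrative and not taken from the original analysis.

```python
import math
from collections import Counter


def baseline_log_likelihood(choices):
    """Maximum log-likelihood of the baseline model (constant choice probabilities)."""
    counts = Counter(choices)
    n = len(choices)
    # The ML estimate of each probability is the observed proportion of that deck.
    return sum(k * math.log(k / n) for k in counts.values())


def bic_difference(ll_model, ll_baseline, k_diff, n):
    """Equation 11: positive values favor the cognitive model over the baseline."""
    return 2.0 * (ll_model - ll_baseline) - k_diff * math.log(n)


# Hypothetical values: a four-parameter model against the three-parameter baseline,
# scored on 119 trials (120 trials minus the unscored first trial).
score = bic_difference(ll_model=-140.0, ll_baseline=-150.0, k_diff=1, n=119)
print(score)

# Baseline fit for a toy sequence with two decks chosen equally often.
print(baseline_log_likelihood(["A", "A", "B", "B"]))
```

In this made-up case the cognitive model gains 10 log-likelihood units and pays a penalty of ln(119) for its one extra parameter, so the difference score is positive.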

## Parameter Consistency Test

The implicit assumption underlying our effort to fit data from both tasks with the same model is that participants' choices in these similar tasks are at least partly governed by the same mechanisms. Consequently, model parameters estimated from the two tasks should be positively associated since they are supposed to measure relatively stable psychological characteristics of the same individual across tasks (Yechiam and Busemeyer, 2008). To test this hypothesis, we conducted a correlational analysis between the two tasks for every parameter of each model. A good model in this respect should produce a positive correlation coefficient for each parameter involved.
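The consistency test amounts to correlating each parameter's estimates from the IGT with those from the SGT across participants. Since the results are reported as Spearman's rho, the sketch below computes a rank correlation in plain Python as the Pearson correlation of average ranks. The function names and the parameter estimates are invented for illustration and do not come from the study's data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical estimates of one parameter for four participants in each task.
igt_estimates = [0.1, 0.2, 0.3, 0.4]
sgt_estimates = [0.2, 0.1, 0.4, 0.3]
print(spearman_rho(igt_estimates, sgt_estimates))
```

A positive coefficient for a parameter would indicate that participants who score higher on it in one task also tend to score higher in the other.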

## Results

## Model Comparison

By comparing the various models (see **Table 4**), we obtained five key findings. First, in all cases cognitive models performed better on average than the baseline statistical model when making one-step-ahead predictions. This is not unexpected given that the baseline model assumes independent choices across trials, which seems unlikely since these tasks promote learning from feedback throughout. Evidence that the cognitive models performed better than the baseline model came from the positive mean BIC difference score of each cognitive model across both the IGT and SGT. Second, most of the cognitive models fit IGT data better than SGT data. This was true no matter whether mean BIC difference score, median BIC difference score, or percentage of positive BIC difference scores was used as a criterion. Third, in both tasks the delta learning rule was always inferior to the DRL rule and the new mixed learning rule no matter what utility function and choice rule were involved.



*EU, expectancy utility function; PU, prospect utility function; PU2, alternative prospect utility function; DEL, delta learning rule; DRL, decay-reinforcement learning rule; ML, mixed learning rule; TDC, trial-dependent choice rule; TIC, trial-independent choice rule; IGT, Iowa Gambling Task; SGT, Soochow Gambling Task; M, mean; Mdn, median; SD, standard deviation.*

<sup>6</sup>More precisely, Equation 11 represents the change in BIC value between the cognitive model and the baseline model.

Furthermore, when combined with the same updating and choice rules, the new utility function based on prospect theory performed better than the previous utility functions used with the EVL and PVL models. In the IGT, regardless of what updating and choice rules were utilized, the alternative prospect utility function always produced a higher average BIC difference score than the other two utility functions. In the SGT, the BIC difference scores generated from the two prospect utility functions were always identical because the task involves only a single outcome on each trial. On the other hand, the prospect utility functions were better than the piecewise weighted utility function in the EVL model in terms of average BIC difference score no matter what updating and choice rules were in force. Similar patterns could be found when median BIC difference score or percentage of positive BIC difference scores was used as the criterion. In general, the new prospect utility function performed better than the other two utility functions.

Finally, with the alternative prospect utility function, the model with DRL rule and TIC rule appeared to be the best: it produced a higher average BIC difference score and more positive BIC difference scores than any of the other five models in both tasks. It was only inferior to the model with DRL rule and TDC rule, and the model with mixed learning rule and TIC rule, when median BIC difference score was used as the criterion and IGT data were fit. To choose among these three competing models, we further conducted pairwise comparisons in terms of the number of participants whose BIC difference scores were higher on one model than another (Broomell et al., 2011). The model with DRL and TIC again worked better in this comparison. Specifically, when comparing the DRL+TIC model with the DRL+TDC model, the former produced higher BIC difference scores on 34 participants in the IGT, whereas the latter produced higher BIC difference scores on only 19 participants. The corresponding numbers in the SGT were 42 and 11 respectively. Similarly, when comparing the DRL+TIC model with the model assuming mixed updating rule and TIC rule, the former produced higher BIC difference scores on 42 participants in the IGT and 49 participants in the SGT. Given the general advantage of the new model with the alternative prospect utility function, DRL rule, and TIC rule, we will hereafter treat it as the winning model and call it the PVL2 model. This model has four parameters, with higher values indicating more rapid memory decay, more consistent choices with regard to deck expectancies, higher levels of sensitivity to outcome differences, and more loss aversion respectively. Note that almost the same results from model comparison occurred when participants were divided into separate groups of opiate users and healthy controls (see **Tables 5** and **6**).

## Parameter Consistency

**Table 7** shows the correlation coefficient for each parameter of the PVL2 model between individual estimates from the two tasks. We used Spearman's rho coefficient since both the Kolmogorov–Smirnov test and the Shapiro–Wilk test of normality led to significant results on each parameter in both tasks (all *p*s < 0.05), and


TABLE 5 | Summary of BIC difference scores of the 18 cognitive models relative to the baseline statistical model in the IGT and SGT among controls.

*EU, expectancy utility function; PU, prospect utility function; PU2, alternative prospect utility function; DEL, delta learning rule; DRL, decay-reinforcement learning rule; ML, mixed learning rule; TDC, trial-dependent choice rule; TIC, trial-independent choice rule; IGT, Iowa Gambling Task; SGT, Soochow Gambling Task; M, mean; Mdn, median; SD, standard deviation.*

TABLE 6 | Summary of BIC difference scores of the 18 cognitive models relative to the baseline statistical model in the IGT and SGT among opiate users.


*EU, expectancy utility function; PU, prospect utility function; PU2, alternative prospect utility function; DEL, delta learning rule; DRL, decay-reinforcement learning rule; ML, mixed learning rule; TDC, trial-dependent choice rule; TIC, trial-independent choice rule; IGT, Iowa Gambling Task; SGT, Soochow Gambling Task; M, mean; Mdn, median; SD, standard deviation.*



TABLE 8 | Correlations for parameters estimated from the IGT and SGT of the model with expectancy utility function, decay-reinforcement learning rule, and trial-independent choice rule.


one-tailed tests because, according to the hypothesis on parameter consistency, we expected positive coefficients. As expected, there was a positive association between the estimates from the two tasks for each parameter of the winning model. The only other model that also produced significant correlations on all parameters was the model with expectancy utility function (i.e., the weighted utility function in the EVL model), DRL rule, and TIC rule. However, the strength of association produced by this model was lower than that of the PVL2 model (see **Table 8**)<sup>7</sup>. In summary, the PVL2 model outperformed all the other models with regard to parameter consistency across the IGT and SGT. Furthermore, the same pattern of associations occurred when participants were divided into groups of opiate users and healthy controls, although certain correlation coefficients might not be statistically significant due to the small sample sizes.
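The parameter-consistency check can be sketched as a rank correlation between each participant's estimate in the two tasks, tested in the positive direction only. The sketch below is an illustration under stated assumptions, not the analysis actually run: it computes Spearman's rho from average ranks and substitutes a permutation estimate for the one-tailed *p* value (the text does not say how the *p* values were obtained). All data values and function names are hypothetical.

```python
import random

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def one_tailed_p(x, y, n_perm=2000, seed=0):
    """Permutation estimate of P(rho >= observed rho) under independence."""
    rng = random.Random(seed)
    observed = spearman_rho(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if spearman_rho(x, y_perm) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical estimates of one parameter for six participants in each task
igt = [0.10, 0.35, 0.20, 0.80, 0.55, 0.60]
sgt = [0.15, 0.30, 0.25, 0.70, 0.65, 0.50]
rho = spearman_rho(igt, sgt)
p_value = one_tailed_p(igt, sgt)
```

A one-tailed test counts only permutations at least as positive as the observed coefficient, which matches the directional hypothesis that estimates from the two tasks should be positively related.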

## Discussion

In this article, we made a systematic comparison of various models for the IGT and SGT, including the EVL and PVL models which have been widely adopted in the literature. Specifically, with the alternative prospect utility function and mixed updating rule, we generated 18 cognitive models of the IGT and SGT by factorially combining different utility functions, updating rules, and choice rules. The winning model, i.e., the PVL2 model, is similar to the PVL model but with a different implementation of the utility function of prospect theory. The BIC scores suggested that, for both healthy controls and opiate users, the PVL2 model outperformed the EVL model in both tasks and the PVL model in the IGT. These results implied that the alternative prospect utility function might provide a better approximation to the actual evaluation process than the other two utility functions. In other words, people evaluate positive and negative outcomes separately before combining the results into a comprehensive measure (as in the EVL model) and they become less sensitive to outcome difference when the absolute magnitudes of outcomes increase (as in the PVL model).

<sup>7</sup>See Appendix for a table with the correlation coefficients for parameter estimates from the IGT and SGT for all of the models.

Although on average all the cognitive models performed better than the baseline model in both tasks, most of the cognitive models fit IGT data better than SGT data. Close scrutiny of the differences in payoff distribution between the two tasks revealed that the SGT not only couples low-frequency losses with negative expected values but also introduces more subtle structural properties that might induce participants to respond differently in the SGT than in the IGT. For example, it might be actually desirable in the SGT to choose the bad decks one more time after they produce a negative outcome because it is very unlikely that the next outcome will be a loss again. This is not true in the IGT, at least for deck A which produces a net loss half of the time. Similarly, people might avoid choosing the same good decks in the SGT when they just yield a positive outcome because there is a high probability that the same deck will produce a loss on the next trial. Such a tendency is clearly inconsistent with the current class of models which assume implicitly that a positive outcome would increase the choice probability of the selected deck and vice versa. This might be the major reason for the poorer performance of the models on SGT data. Future research is necessary to develop more sophisticated models incorporating this tendency.

Given that the DRL rule might produce an unbounded updated expectancy, one may wonder why this updating rule is still selected by the model comparison. Two possible explanations exist for the current result. First, although the DRL rule might produce an unbounded expectancy, this is not always true. The presumably undesirable situation occurs only when Ej(t–1) and u(t) have the same sign and the former is no larger than the latter in absolute magnitude, or when the two terms have the same sign, the former is larger than the latter in absolute magnitude, and the decay parameter is relatively small. More important, according to the winning model, the choice probabilities of the four decks are related to the relative magnitudes of deck expectancies rather than the absolute magnitudes. Therefore, models allowing for unbounded expectancies across the four decks might lead to the same predictions on choice probabilities as those only producing bounded expectancies.
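The second explanation can be illustrated with a short sketch. It assumes a softmax (ratio-of-strengths) choice rule in which the probability of choosing deck j is proportional to exp(θ·Ej); the function name and all numerical values are hypothetical. Under that assumption, adding the same constant to every expectancy leaves the choice probabilities unchanged, so only the differences among expectancies matter.

```python
import math

def softmax_choice_probs(expectancies, theta):
    """Choice probabilities proportional to exp(theta * E_j)."""
    m = max(expectancies)  # subtracting the maximum avoids overflow
    weights = [math.exp(theta * (e - m)) for e in expectancies]
    total = sum(weights)
    return [w / total for w in weights]

bounded = [0.5, -0.2, 0.1, 0.3]
# Same differences among decks, but much larger absolute expectancies
shifted = [e + 50.0 for e in bounded]

p_bounded = softmax_choice_probs(bounded, theta=1.5)
p_shifted = softmax_choice_probs(shifted, theta=1.5)
```

The sketch covers only the case of a common shift across all four decks; expectancies that grow at different rates for different decks would still change the predicted probabilities.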

After establishing the PVL2 model as the best model with regard to model fitting performance, we examined the issue of parameter consistency for each model. Within-subject data on both tasks made it possible to investigate whether parameter estimates from the two tasks were associated with each other. The results indicated that individual estimates of each parameter in the PVL2 model were positively associated across the two tasks. This suggested that choice responses in these two tasks were at least partly governed by the same mechanisms reflected by the PVL2 model. Although a similar model with the same updating and choice rules but a different utility function (i.e., the weighted utility function) also led to significant correlation coefficient on each parameter involved, the strength of association between its parameter estimates was lower than that of the PVL2 model. Therefore, the PVL2 model still outperformed all the other models in terms of parameter consistency.

One natural question arises from the results advocating the new model: how does the model account for the differences in behavioral data between opiate users and healthy controls? For example, does the winning model suggest that the differences in behavioral data are the consequence of differential degrees of choice variability, outcome sensitivity and/or loss aversion between the two groups? For any cognitive model of the IGT and SGT, this is no doubt a critical issue to address. However, it seems premature to answer the question right now for the following reasons: (1) the complex pattern of abnormality in the current samples, and (2) the relatively small sample sizes. Future studies with larger and more homogeneous samples of opiate users and controls are necessary to provide a convincing answer to this question.

Besides modeling the IGT and SGT from a reinforcement learning perspective, previous research has also investigated the role of perseveration in these tasks (Worthy and Maddox, 2012; Worthy et al., 2013a). Recently, Worthy et al. (2013b) further explored the benefit of combining reinforcement learning with perseveration in describing observed data from the IGT. It turned out that a model with the delta learning rule and a separate term for perseveration outperformed the PVL model with the DRL rule. Furthermore, it was proposed that the DRL rule may perform better than the delta learning rule not because memory decay plays a critical role in the tasks but because the former accounts for participants' tendency to perseverate whereas the latter does not. Whether the same holds for the current modeling attempt is an open question. On the one hand, with the alternative prospect utility function, treating perseveration separately may again improve the fitting performance of a model with the delta learning rule relative to a model with the decay-reinforcement rule, at least for the IGT data. On the other hand, the resultant more complicated models may perform poorly in the consistency test due to the extra parameters and more subtle interactions among all the parameters. Future studies should test models with the alternative prospect utility function and a separate term for perseveration across the IGT and SGT to advance our understanding of the mechanisms underlying these tasks.

In conclusion, our analyses on the empirical data from both healthy and clinical participants suggested that the PVL2 model with the alternative prospect utility function, DRL rule, and TIC rule performed even better than the previous best model, i.e., the PVL model, in describing individual data. In addition, the PVL2 model also produced more consistent parameter estimates across the IGT and SGT than various other models examined in this study. Consequently, we recommend the PVL2 model as a candidate model of both the IGT and SGT in future studies on these two tasks.

## Acknowledgments

We wish to acknowledge support from the National Institute on Drug Abuse Grant R01 DA030551 to JB and JS and partial support from the Australian Research Council Discovery Project DP110100696 for this project.

## References


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Dai, Kerestes, Upton, Busemeyer and Stout. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Appendix


TABLE A1 | Correlations for parameters estimated from the IGT and SGT for all of the models.

*EU, expectancy utility function; PU, prospect utility function; PU2, alternative prospect utility function; DEL, delta learning rule; DRL, decay-reinforcement learning rule; ML, mixed learning rule; TDC, trial-dependent choice rule; TIC, trial-independent choice rule.*

*\*p* < *0.05; \*\*p* < *0.01.*

# A Simplified Model of Choice Behavior under Uncertainty

Ching-Hung Lin<sup>1,2,3,4</sup>, Yu-Kai Lin<sup>1</sup>, Tzu-Jiun Song<sup>1</sup>, Jong-Tsun Huang<sup>5</sup>\* and Yao-Chu Chiu<sup>1</sup>\*

*<sup>1</sup> Department of Psychology, Soochow University, Taipei, Taiwan, <sup>2</sup> Department of Psychology, Kaohsiung Medical University, Kaohsiung, Taiwan, <sup>3</sup> Research Center for Nonlinear Analysis and Optimization, Kaohsiung Medical University, Kaohsiung, Taiwan, <sup>4</sup> Biomedical Engineering Research and Development Center, China Medical University Hospital, Taichung, Taiwan, <sup>5</sup> Graduate Institute of Neural and Cognitive Sciences, China Medical University, Taichung, Taiwan*

The Iowa Gambling Task (IGT) has been standardized as a clinical assessment tool (Bechara, 2007). Nonetheless, numerous research groups have attempted to modify IGT models to optimize parameters for predicting the choice behavior of normal controls and patients. A decade ago, most researchers considered the expected utility (EU) model (Busemeyer and Stout, 2002) to be the optimal model for predicting choice behavior under uncertainty. However, in recent years, studies have demonstrated that models with the prospect utility (PU) function are more effective than the EU models in the IGT (Ahn et al., 2008). Nevertheless, after some preliminary tests based on our behavioral dataset and modeling, it was determined that the Ahn et al. (2008) PU model is not optimal due to some incompatible results. This study aimed to simplify the Ahn et al. (2008) PU model and used the IGT performance of 145 subjects as the benchmark data for comparison. In our simplified PU model, the best goodness-of-fit was found mostly as the value of α approached zero. More specifically, we retested the key parameters α, λ, and A in the PU model. Notably, the influence of the parameters α, λ, and A has a hierarchical power structure in terms of manipulating the goodness-of-fit in the PU model. Additionally, we found that the parameters λ and A may be ineffective when the parameter α is close to zero in the PU model. The present simplified model demonstrated that decision makers mostly adopted the strategy of gain-stay loss-shift rather than foreseeing the long-term outcome. However, there are other behavioral variables that are not well revealed under these dynamic-uncertainty situations. Therefore, the optimal behavioral models may not have been found yet. In short, the best model for predicting choice behavior under dynamic-uncertainty situations should be further evaluated.

Keywords: Iowa Gambling Task, expected utility model, prospect utility model, dynamic-uncertainty situations, gain-loss frequency, loss aversion, delta learning rule, prominent deck B phenomenon

## INTRODUCTION

The Iowa Gambling Task (IGT) is an experience-based decision-making task used extensively as a diagnostic tool for neuropsychiatric disorders (Bechara et al., 1994, 1997). It can identify various psychological disorders, such as schizophrenia and substance addiction (Bechara, 2007). Following the logic of IGT development, researchers have attempted to discover the predictors and explored the mechanisms of emotional systems and decision-making functioning under situations of uncertainty. In the IGT, decks A and B have a negative final outcome (that is, a long-term

#### Edited by:

*Hauke R. Heekeren, Free University of Berlin, Germany*

#### Reviewed by:

*Peter N. C. Mohr, Free University of Berlin, Germany Veit Stuphorn, Johns Hopkins University, USA*

#### \*Correspondence:

*Jong-Tsun Huang jongtsun@mail.cmu.edu.tw Yao-Chu Chiu yaochu@mail2000.com.tw*

#### Specialty section:

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology*

Received: *07 December 2014* Accepted: *28 July 2016* Published: *17 August 2016*

## Citation:

*Lin C-H, Lin Y-K, Song T-J, Huang J-T and Chiu Y-C (2016) A Simplified Model of Choice Behavior under Uncertainty. Front. Psychol. 7:1201. doi: 10.3389/fpsyg.2016.01201*


outcome of -\$250) over an average of 10 trials. Conversely, decks C and D have a positive final outcome (+\$250) over an average of 10 trials. According to these standard final outcomes, decks A and B are termed "bad decks" and decks C and D are termed "good decks." At the same time, decks B and D contain infrequent losses (10 gains and 1 loss over 10 trials) while decks A and C contain relatively frequent losses (10 gains and 5 losses over 10 trials). IGT-related neuropsychological studies have mostly demonstrated that patients (e.g., individuals with ventromedial prefrontal lesions) prefer to choose the bad decks to a greater degree than do healthy controls. However, in recent years, some IGT studies have demonstrated that healthy controls also prefer to choose the bad deck B due to its frequent gains (Wilder et al., 1998; Steingroever et al., 2013), a finding which has been called the "prominent deck B phenomenon" (Lin et al., 2007; Chiu et al., 2012).
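The deck structure described above can be checked with simple arithmetic. The sketch below assumes the per-card gains and per-10-card loss totals commonly reported for the original Bechara et al. (1994) schedule (\$100 per card and \$1,250 in losses for decks A and B; \$50 per card and \$250 in losses for decks C and D); these figures and the variable names are assumptions added for illustration and are not stated in this article.

```python
# Per-10-card summary of each deck (assumed from the original IGT schedule)
DECKS = {
    "A": {"gain_per_card": 100, "n_losses": 5, "total_loss": 1250},
    "B": {"gain_per_card": 100, "n_losses": 1, "total_loss": 1250},
    "C": {"gain_per_card": 50, "n_losses": 5, "total_loss": 250},
    "D": {"gain_per_card": 50, "n_losses": 1, "total_loss": 250},
}

def net_outcome_per_10(deck):
    """Net outcome over 10 cards: ten gains minus the summed losses."""
    return 10 * deck["gain_per_card"] - deck["total_loss"]

net = {name: net_outcome_per_10(d) for name, d in DECKS.items()}
loss_frequency = {name: d["n_losses"] / 10 for name, d in DECKS.items()}
bad_decks = sorted(name for name, value in net.items() if value < 0)
good_decks = sorted(name for name, value in net.items() if value > 0)
```

Under these assumptions deck B pairs the negative long-term outcome of a bad deck with the low loss frequency of deck D, which is the combination at issue in the prominent deck B phenomenon.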

The expected utility (EU) theory (von Neumann and Morgenstern, 1947) has been used most frequently over the years to predict choice behavior. The original assumption and design of the IGT was based mainly on extensions of the EU theory (Bechara et al., 1994, 1997; Bechara and Damasio, 2005). However, studies on behavioral decision-making over the past five decades (Edwards, 1954; Tversky and Kahneman, 1986; Kahneman, 2003) have indicated that a decision maker's choice is not guided by the EU, but mainly by information regarding gain and loss under risk, as suggested by Prospect Theory (PT) and the Framing Effect. Prospect Theory demonstrated that most decision makers prefer to take risks in situations with negative descriptions (e.g., loss and death) and become risk-averse in situations with positive descriptions (e.g., gain and life) (Kahneman and Tversky, 1979; Tversky and Kahneman, 1981). Experiments in PT showed that normal decision makers ignored the EU and that their attitudes toward risks varied according to the depiction of the situation. This finding goes completely against the traditional viewpoints of economics and rationales in terms of invariant axioms (Kahneman and Tversky, 1979; Tversky and Kahneman, 1981, 1986). However, these studies were based mostly on description-based rather than dynamic-consecutive (experience-based) games such as the IGT (Barron and Erev, 2003; Hertwig et al., 2004; Hau et al., 2008; Fantino and Navarro, 2012).

Behavioral modeling is efficient for interpreting behavioral results and predicting differences in choice patterns between normal decision makers and neuropsychiatric patients. Many IGT modeling studies have indicated that modeling based on the EU theory is sufficient for distinguishing between neuropsychiatric patients (or criminals) and healthy subjects (Busemeyer and Stout, 2002; Garavan and Stout, 2005; Stout et al., 2005; Yechiam et al., 2005; Luo et al., 2011). However, some alternative theories, such as the viewpoint based on gain-loss frequency, have also been used to interpret the choice behavior in such dynamic-uncertain games (Wilder et al., 1998; Lin et al., 2007; Chiu et al., 2008; Upton et al., 2012). Furthermore, based on the profound finding of Ahn et al. (2008), more and more behavioral-modeling studies have demonstrated that the PT-related models (which consider the effects of both gains and losses, or PU function, in their modeling) are more predictive than EU models (Ahn et al., 2008; Fridberg et al., 2010; Horstmann et al., 2012; Worthy et al., 2013a,b; Worthy and Maddox, 2014; Dai et al., 2015). In short, these modeling studies have consistently indicated that the prospect of an immediate gain-loss is an important guiding factor in the choice of behavior in the IGT (Lin et al., 2004, 2007; Chiu et al., 2005, 2008):

"Subjects may apply an implicit strategy to cope with the uncertain game, therefore they favored high-frequency gains over high-frequency losses in the experiment. This "gain-stay, lose-randomize" strategy (**Figure 3**) [42] has been observed in human and animal appetitive and avoidance experiments in which human or animal encounter reward or punishment [42–48]." (Chiu et al., 2008, p. 5).

To explain their results, most of these IGT-PU models (e.g., Yechiam and Busemeyer, 2005, 2008; Ahn et al., 2008; Fridberg et al., 2010; Horstmann et al., 2012; Steingroever et al., 2014; Worthy and Maddox, 2014; Dai et al., 2015) were modified from the EU models or were hybrid models combining the PU function with general behavioral learning models such as the Prospect Valence Learning (PVL) model (Ahn et al., 2014), and learning rules such as the delta learning rule, DEL (see Ahn et al., 2008, p. 1384, Equation 3; Rescorla and Wagner, 1972) and the decay reinforcement learning rule, DRI (see Ahn et al., 2008, p. 1385, Equation 4; Erev and Roth, 1998).

For instance, to determine an optimized decision model under dynamic-uncertainty, Ahn et al. (2008) compared eight decision-learning models with regard to their generalizability. Each decision maker took part in two dynamic-uncertain games, namely the IGT and the Soochow Gambling Task (SGT) (Chiu et al., 2008). Data from the first game was used to estimate the parameters of each model and to make a prediction for the second game. Furthermore, Ahn et al. (2008) adopted three methods to evaluate the goodness-of-fit of each model for each participant: a post-hoc fit criterion, a generalization criterion for short-term estimations, and a generalization criterion for long-term estimations. Consequently, they suggested that the PU function provides the optimized predictions for new conditions, but different learning models are needed to make short- vs. long-term learning predictions.

However, these refined PU models were relatively complex, compared to the original PU function (Kahneman and Tversky, 1979; Tversky and Kahneman, 1992). Recent studies in behavioral modeling have suggested that a simplified model based on the gain-stay loss-shift (or win-stay lose-shift) principle may provide a sufficient explanation of choice behavior under uncertainty (Lin et al., 2007; Chiu et al., 2008, 2012; Worthy et al., 2013a,b). Moreover, some research has suggested that these simplified models could be consistent with a large number of behavioral studies:

"These pioneer behavior studies with the concurrent schedules of reinforcement have displayed the frequency effect for choice pattern [45–49]. Additionally, these concepts have also been applied to examine the behavioral model of neuropsychological deficit [50,51]." (Chiu et al., 2008, p. 5).

**Abbreviations:** DEL, delta learning rule; DRI, decay reinforcement learning rule; EU, expected utility; IGT, Iowa Gambling Task; MSD, mean square deviation; PU, prospect utility; PT, prospect theory.

The purpose of this study is to simplify the structure of the PU model and to provide the behavioral modeling to test the effects of some parameters. Additionally, this study identifies which type of parameter modulation has an optimal goodness-of-fit for the behavioral data. Based on the original assumption of the PU function and the findings of recent gain-loss frequency studies, we hypothesized that if the choice behavior of normal decision makers is based mostly on the gain-stay loss-shift (win-stay lose-switch) strategy, the optimized behavioral model should be relatively simpler than that which Ahn et al. (2008) had proposed. Namely, the weighting power of immediate gain-loss should be larger than the learning effect of gaining long-term outcome. (Note: For the original form of the equations and the method of simulation adopted in the present study, please refer to Ahn et al. (2008).)

## MATERIALS AND METHODS

## Participants

We recruited 145 participants who were all college students (102 males and 43 females, mean age: 18.6, SD: 0.97). Most of the subjects were first-year students. All statistical data was analyzed at a group level and presented anonymously. In this study, the participants were welcome and totally free to participate in the psychological experiment in the university, and the procedure was consistent with publicly available literature. After completing the whole game, the authors provided a 2-h lecture on human decision-making behaviors that also explained the testing purpose and the psychological mechanism of the IGT for all the subjects. The behavioral data was collected in October 2010, at which time the Institutional Review Board approval system was still in the process of being implemented at our university. The study was conducted in accordance with the unwritten rules of the Taiwan Psychological Association. Further, the IGT was conducted to simulate real-life decisions, so it looked like a common computerized card game that someone might play on the Internet. Many research websites provide online versions of the IGT to recruit general participants via the Internet (such as http://www.millisecond.com/download/samples/v3/IowaGamblingTask/IowaGamblingTask.web and http://pebl.sourceforge.net/battery.html). Simply put, anyone can play the online version of IGT totally free.

## Materials

The gain-loss structure of the IGT in this study followed the original table outlined by Bechara et al. (1994), and the computerized version of the IGT was programmed with Matlab 2010a (MathWorks, Natick, MA, USA). **Figure 1** shows the appearance of this computerized version.

## Procedures for Collecting Behavioral Data

The original instruction of the IGT was adopted in this study. At the beginning of the game, the instructor provided instructions to ensure that the participants knew how to play this game. Participants used a mouse to pick a desired deck; the screen displayed "win money" or "lose money" immediately, and the outcomes were summarized in the top bars (**Figure 1**). The participants did not know when the game would terminate. They were asked to do their best to earn money or avoid losing money in the IGT, but the instructor provided no hints for success in this task. Each participant played a game consisting of 200 trials, but only the dataset for the first 100 trials was used as the comparison data. This use of the data from the first 100 trials is comparable to the standard approach used for administering the IGT in past IGT-related studies. The dataset for the last 100 trials was not analyzed in this study; it could be valuable, however, in future studies aimed at exploring the extended learning effect.

## Procedures for Producing Simulation Data

A simulation method (see Ahn et al., 2008, p. 1401, Appendix B) was used to estimate the parameters (Yechiam and Busemeyer, 2005; Ahn et al., 2008), with a few initial steps modified. First, the behavioral datasets for all participants were averaged and inserted into the model. Here we averaged the behavioral data across subjects to reduce the variance in the individual results; specifically, we used the mean probability of each deck chosen as the initial index during simulation. Second, according to the results of the eight models in Ahn et al. (2008), the models with the PU function (see Ahn et al., 2008, p. 1384, Equation 2) were proven to be better than those with the EU model. In summary, Ahn et al. (2008) applied the PU function to decision-learning models (DEL and DRI) and showed that the PU models are more powerful than EU models for achieving optimized simulation results. They also showed that the mean square deviation (MSD) of the DEL model was relatively small, in comparison with the DRI model (see Ahn et al., 2008, p. 1392, Table 6). The original formula suggested by Ahn et al. (2008) is as follows [see p. 1384, Equations (2) and (3)].

PU model (PU-DEL learning model):

$$\begin{aligned} E_j(t) &= E_j(t-1) + A \cdot \delta_j(t) \cdot \left[\,|x(t)|^{\alpha} - E_j(t-1)\right], \quad \text{if } x(t) \ge 0;\\ E_j(t) &= E_j(t-1) + A \cdot \delta_j(t) \cdot \left[-\lambda\,|x(t)|^{\alpha} - E_j(t-1)\right], \quad \text{if } x(t) < 0.\end{aligned}$$

In the above formula, Ej(t) refers to the expectancy for deck j on trial t, A is the updating parameter, and δj(t) is a dummy variable that is 1 if deck j is chosen and 0 otherwise. In addition, x(t) symbolizes the net gain on trial t, λ represents a loss-aversion parameter, and α is defined as a shape parameter of the utility function (Ahn et al., 2008).
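The PU-DEL update can be written out as a short sketch. It follows the two-case formula above term by term; the function names, the zero initial expectancies, and the parameter values in the usage lines are illustrative assumptions rather than values from the study.

```python
def prospect_utility(x, alpha, lam):
    """u = |x|^alpha for gains (x >= 0) and -lam * |x|^alpha for losses."""
    magnitude = abs(x) ** alpha
    return magnitude if x >= 0 else -lam * magnitude

def update_expectancies(expectancies, chosen, x, A, alpha, lam):
    """Delta-rule update: only the chosen deck (delta_j = 1) moves toward u(x)."""
    u = prospect_utility(x, alpha, lam)
    return [
        e + A * (u - e) if j == chosen else e
        for j, e in enumerate(expectancies)
    ]

# Decks A-D indexed 0-3; initial expectancies of zero are an assumption.
E = [0.0, 0.0, 0.0, 0.0]
# Net gain of 100 on deck B, then a net loss of 1150 on deck B
E_after_gain = update_expectancies(E, chosen=1, x=100, A=0.2, alpha=0.5, lam=2.0)
E_after_loss = update_expectancies(E_after_gain, chosen=1, x=-1150, A=0.2, alpha=0.5, lam=2.0)
```

With α = 0 the utility of any nonzero outcome reduces to 1 for a gain and −λ for a loss, so the update responds only to whether the trial was a gain or a loss and not to its monetary size. This is one way to read the connection between small values of α and a gain-stay loss-shift pattern.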

In this study, therefore, we used only the best learning models in Ahn et al. (2008). In the first step of the preliminary test, we used two general approaches: the general simulation method and the one-step-ahead method (see Ahn et al., 2008, p. 1400, Appendix A). In the general simulation method, the chance-level probability of each deck being chosen is used for the first trial (25% in the case of four decks). Therefore, the first trial is randomly produced. Given the result of the first trial, the selection probability of the following trials can be determined using the default initial values (see Ahn et al., 2008, p. 1385, Equation 5). For instance, if j represents one of the four decks, and j = 1 corresponds to deck A, then Pr1(2) marks the probability of deck A in the second trial, whilst E1(1) marks the expectancy of deck A in the first trial. Hence, the probability of each deck can be determined. Conversely, the one-step-ahead approach was totally dependent on the empirical data. Specifically, feeding real data from each trial into the model generated the probability of each following trial. These two approaches integrated the DEL model (Rescorla and Wagner, 1972; Yechiam and Busemeyer, 2005, 2008; Ahn et al., 2008) and the DRI model (Erev and Roth, 1998; Yechiam and Busemeyer, 2005, 2008; Ahn et al., 2008, 2014; Luo et al., 2011) for the final simulation, and the optimal parameters (α, λ, and A) were evaluated by MSD (see Ahn et al., 2008, p. 1391, Equation 11). Consequently, we found that the best result was obtained by using the model of general simulation combined with the DEL model. This result is mostly consistent with the observation by Ahn et al. (2008, see p. 1392, Table 6). The result of the preliminary test is listed in **Table 1**.
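How MSD scores a set of predictions can be shown with a generic sketch. The exact form of Equation 11 in Ahn et al. (2008) is not reproduced in this article, so the function below simply assumes an unweighted average of squared differences between observed and predicted choice proportions over all trials and decks; the function name and the data are hypothetical.

```python
def mean_square_deviation(observed, predicted):
    """Average squared gap between observed and predicted choice proportions.

    Both arguments are lists of trials; each trial holds one proportion
    (or probability) per deck.
    """
    total, count = 0.0, 0
    for obs_trial, pred_trial in zip(observed, predicted):
        for o, p in zip(obs_trial, pred_trial):
            total += (o - p) ** 2
            count += 1
    return total / count

# Hypothetical group-mean choice proportions for decks A-D on three trials
observed = [
    [0.25, 0.25, 0.25, 0.25],
    [0.20, 0.40, 0.15, 0.25],
    [0.15, 0.45, 0.15, 0.25],
]
# Chance-level prediction (25% per deck) on every trial
chance = [[0.25] * 4 for _ in observed]

msd_chance = mean_square_deviation(observed, chance)
msd_perfect = mean_square_deviation(observed, observed)
```

A parameter search in this spirit would evaluate candidate values of α, λ, and A by generating predicted probabilities for each and keeping the combination with the smallest MSD.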

## Why is Parameter C Removed First?

The c parameter (see Ahn et al., 2008, p. 1386, Equation 6) is defined as the consistency between choices and expectancies and is known as the response-sensitivity parameter (Yechiam et al., 2005). However, this parameter c was designed for EU-based models and to modulate the differences between the behavioral data and EU model predictions. Therefore, we consider it can be ruled out in our model because the present study was mostly based on the PU models of Ahn et al. (2008) in which the response-sensitivity parameter is removed, whilst virtual decision-making responds directly to parameters α, λ, and A. Therefore, the parameters were defined as the three modulators α, λ, and A. Otherwise, all procedures followed Appendix B in the study by Ahn et al. (2008; see **Figure 2**).

TABLE 1 | Comparison of the optimized parameter values of DEL and DRI models.

To explain further the removal of c, it is worth noting that the response-sensitivity parameter was introduced into behavioral models to resolve a problem of inconsistency. An explanation for this is as follows. Yechiam et al. (2005) suggested that:

"the decision maker's choice on each trial is based not only on the expectancies produced by the decks, but also on the consistency with which the decision maker applies those expectancies when making choices." (Yechiam et al., 2005, p. 975).

The final outcome defined the good and bad decks in the IGT; therefore, most behavioral models have been based largely on this basic assumption. Notably, while performing the IGT, participants typically do not realize the internal rules of the game during the initial stage. However, some participants will continue to select a single deck even as they are gaining insight regarding the good decks (Maia and McClelland, 2004) or will misinterpret the internal rule of the IGT, which may stand against the basic final outcome assumption (Lin et al., 2007; Chiu et al., 2008). Therefore, some research groups added the new parameter c in these models in order to solve this problem of inconsistency.

However, we considered that the incongruence between the basic final outcome assumption and modeling result can be solved by modulating the original parameters α, λ, and A. In fact, the value of parameters α and λ can directly modulate the effect of the monetary value in each gain-loss. Moreover, the value of parameter A can modulate the choice probability of consecutive trials through the influence of past experience. The modulations of these original parameters (α, λ, and A) can be used to observe the subjects' sensitivity to monetary value, the degree of skew for loss aversion, and the influence of past experience. In other words, if the simplified model does not increase the error rate (e.g., MSD) and decreases the calculation time, then this model may be considered much better than the original one (Busemeyer and Stout, 2002).

## Why Adopt the MSD, Not G<sup>2</sup> Scores?

Based on the statement by Ahn et al. (2008, p. 1387, Equation 9) for using the MSD and G<sup>2</sup> scores as the criteria for evaluating these behavioral models, we decided to adopt the MSD but not the G<sup>2</sup> scores as the evaluation criterion in this study. On the use of G<sup>2</sup> scores, Ahn et al. (2008) state:

"It is incorrect to simply use the product of the probabilities for choices across trials because independence does not hold." (Ahn et al., 2008, p. 1399).

And on MSD scores, Ahn et al. (2008) state:

"MSD scores are more intuitive for examining how good a model is in explaining overall choice patterns." (Ahn et al., 2008, p. 1399).

Ahn et al. (2008) also pointed out the characteristics of the two evaluation indexes. Bearing all of this in mind therefore, we adopted the MSD scores as the index of parameter estimation to discover the optimal parameter sets.
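As a rough illustration of the criterion, the following Python sketch computes a mean square deviation between observed and simulated choice proportions. The function name and the input layout (one row per trial, one column per deck) are assumptions; the exact normalization should be checked against Ahn et al. (2008, Equation 11).

```python
def msd(observed, predicted):
    """Mean square deviation between two trial-by-deck proportion tables.

    Each table is a list of rows (one per trial); each row holds the
    proportion of choices from decks A, B, C, and D on that trial.
    """
    if len(observed) != len(predicted):
        raise ValueError("tables must cover the same number of trials")
    total, count = 0.0, 0
    for obs_row, pred_row in zip(observed, predicted):
        for o, p in zip(obs_row, pred_row):
            total += (o - p) ** 2
            count += 1
    return total / count
```

A smaller value indicates that the simulated choice pattern lies closer to the observed one, which is how the parameter sets are ranked here.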

## RESULTS

In this study, we found a set of parameters and produced a simplified PU model to predict the choice behavior under uncertainty. The behavioral datasets were collected to serve as the benchmark data for comparisons with the modeling data. In addition, the key parameters α, λ, and A were systematically modulated to produce the simulation data based on the PU models of Ahn et al. (2008). Specifically, the parameters were tested to screen out the best-fitted models as well as to determine the optimized range of parameters via MSD indexing. Based on the PU models, we found that some best-fitted models are formed when some parameters are fixed. Notably, for the best-fitted models that we found, the three parameters consistently took nearly the same values: α ≈ 0; λ ≈ 1.3; and A ≈ 0.1. Obviously, the PU model in the present study was simpler than previous ones. However, the present PU model can produce optimized predictions for choice behavior under uncertainty, which is mostly consistent with the viewpoint of gain-loss frequency.

## Behavioral Data

The average card selection indicated that subjects preferred the good decks (C + D) nearly equally to bad decks (A + B; see **Figure 3**), which is inconsistent with the original finding from the IGT (Bechara et al., 1994). The two-factor repeated measurement ANOVA (final outcome vs. gain-loss frequency) was launched here to process the statistical testing. The testing result indicated no significant difference between the bad (A + B) and good (C + D) decks [F(1, 144) = 0.23, p = 0.88], but the results showed a difference between the high-frequency (B + D) and low-frequency (A + C) gain decks [F(1, 144) = 65.89, p < 0.001]. Furthermore, the interaction between the two factors (final outcome vs. gain-loss frequency) was also significant [F(1, 144) = 66.28, p < 0.001]. However, detailed analysis of each of the two decks showed that the subject preferred to choose the bad deck B rather than the other three decks [tA−B(144) = −12.59, p < 0.001; tB−C(144) = 4.80, p < 0.001; tB−D(144) = 4.93, p < 0.001] and that deck A was avoided compared to the other three decks [tA−C(144) = −6.28, p < 0.001; tA−D(144) = −7.50, p < 0.001]. Nevertheless, there are no significant differences between decks C and D [tC−D(144) = −0.48, p = 0.63]. The present behavioral evidence confirmed the "prominent deck B phenomenon," in which most normal decision makers were influenced by the frequent gain of the deck and the preference for the bad deck was difficult to inhibit by a few unexpected losses in the standard administration of the IGT (Wilder et al., 1998; MacPherson et al., 2002; Toplak et al., 2005; Fernie and Tunney, 2006; Chiu and Lin, 2007; Fernie, 2007; Lin et al., 2007; Martino et al., 2007; Takano et al., 2010; Upton et al., 2012; Steingroever et al., 2013; Worthy et al., 2013a).

The one-way ANOVA was applied to test the learning effect in each block of 20 trials (**Figure 4**). In detail, subjects' selections from the bad decks A and B descend over time, whereas selections from the good decks C and D ascend. The learning-tendency analysis based on long-term outcome used the difference between the numbers of good-deck and bad-deck selections [(C + D) − (A + B)] in each block. The result indicated that the learning effect based on long-term outcome can be observed in this analysis [F(4, 720) = 9.80, p < 0.001].

in the behavioral data. The behavioral result showed that most subjects avoided the bad deck A, but preferred the bad deck B. The chosen number of bad deck B was nearly double that of bad deck A. However, participants preferred the good decks C and D only about the chance level (100/4 = 25).

FIGURE 4 | Mean number of card selections in each block of 20 trials in the behavioral data. Subjects preferred the bad deck B over the other three decks throughout most blocks. However, most subjects gradually avoided selecting the bad deck A from the beginning to the game end. Additionally, a slight ascending tendency for the good decks was observed from the first block to the end block, although statistical testing for the blocks in each deck was not significant.

The learning-tendency analysis based on gain-loss frequency used the difference between the numbers of selections from the frequent-gain (B + D) and frequent-loss (A + C) decks [(B + D) − (A + C)] in each block. The result indicated that the learning effect based on gain-loss frequency cannot be observed in this analysis [F(4, 720) = 0.60, p = 0.66].
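The two block-wise indices can be sketched in Python as follows. This is a minimal illustration for a single subject, not the analysis script of this study; the function name and the input format (a sequence of deck labels, one per trial) are assumptions.

```python
def block_scores(choices, block_size=20):
    """Return per-block (outcome score, frequency score) for one subject.

    choices: sequence of deck labels 'A'..'D', one per trial.
    outcome score   = (C + D) - (A + B)
    frequency score = (B + D) - (A + C)
    """
    scores = []
    for start in range(0, len(choices), block_size):
        block = choices[start:start + block_size]
        n = {deck: block.count(deck) for deck in "ABCD"}
        outcome = (n["C"] + n["D"]) - (n["A"] + n["B"])
        frequency = (n["B"] + n["D"]) - (n["A"] + n["C"])
        scores.append((outcome, frequency))
    return scores
```

For a 100-trial session this yields five pairs of scores, one per block, which would then enter the repeated-measures tests reported above.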

However, detailed analysis of each deck in the blocks indicated that only three decks showed a significant learning tendency [FA(4, 720) = 5.96, p < 0.001; FB(4, 720) = 3.96, p < 0.01; FC(4, 720) = 1.05, p = 0.38; FD(4, 720) = 6.85, p < 0.001]. Furthermore, the post hoc analysis of each two-block in each deck demonstrated that the significant difference between each paired block existed mostly in deck A; in decks B and D there were only one and two significant differences between each paired block, respectively. The statistics are listed in detail in **Table 2**.

The choice probability of each deck in each trial showed that decks B, C, and D were preferred by the subjects throughout the game (**Figure 5**). The results confirmed the learning tendency for each deck (**Figure 4**).

TABLE 2 | Summarized statistics after post hoc analysis of each two-block set for each deck.


*The values* \* <*0.05;* \*\* <*0.01;* \*\*\* <*0.001 (Bonferroni Correction).*

## Simulation Data

In the simulation data, the mean number of card selections showed that the number of cards chosen from the good decks (C + D) was nearly equal to the number chosen from the bad decks (A + B; **Figures 6**, **7**). The two-factor repeated measurement ANOVA (final outcome vs. gain-loss frequency) was used to further demonstrate the statistical result under the simulation level. The results showed significant differences between the bad (A + B) and good (C + D) decks [F(1, 144) = 135.85, p < 0.001]. On the other hand, a significant effect was also observed between the high-frequency (B + D) and low-frequency (A + C) gain decks [F(1, 144) = 312.47, p < 0.001]. Additionally, the interaction of the final outcome and gain-loss frequency was significant [F(1, 144) = 34.32, p < 0.001]. However, a paired-t analysis showed that differences between each two decks were all significant [tA−B(144) = −19.37, p < 0.001; tA−C(144) = −15.86, p < 0.001; tA−D(144) = −25.38, p < 0.001; tB−C(144)=4.63, p < 0.001; tB−D(144) = −3.44, p < 0.001; tC−D(144) = −8.26, p < 0.001]. The present data confirmed that the "prominent deck B phenomenon" is reproduced under the simulation environment. According to learning curve analysis (**Figure 7**), the choice patterns for the bad deck B and good deck D seem to rise over time, whereas the choice pattern of the good deck C stays consistent while that of bad deck A decreases.

## Comparison between Behavioral and Simulation Data

A comparison of the behavioral and simulation data (the group data with the smallest MSD) shows that no significant difference was observed [F(1, 288) = 1.06, p = 0.30]. In short, the simulation data were similar to the actual choice pattern of these participants.

## Simulation Result Evaluation

FIGURE 6 | Mean number of card selections in the simulation data. Using optimized parameters simulated into the PU DEL model, the chosen number of decks B–D were larger than that of deck A. This simulation result is similar to the results of the participants in this study. The good deck D is widely selected, and the bad deck B was chosen slightly more frequently than the good deck C in the present simulation data.

There were 1936 optimal MSD parameter sets after the parametric estimation (α: 11 values in steps of 0.1 over the range [0–1]; λ: 16 values in steps of 0.1 over the range [1–2.5]; A: 11 values in steps of 0.1 over the range [0–1]; 11 × 16 × 11 = 1936). First, we presented the data using an ascending sequence to show the situations of 1, 5, and 10% MSD distribution. **Figure 8** shows the first 10% MSD error distribution. The result demonstrates the number on the horizontal axis to be positively correlated with the MSD error on the vertical axis. Therefore, this observation confirms the high reliability of these parameter sets (α, λ, A).
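The size of the parameter grid can be checked with a short Python sketch. The variable names and the helper `best_fraction` are hypothetical; the helper only illustrates how the smallest 1, 5, or 10% of MSD values might be selected once every parameter set has been scored.

```python
from itertools import product

alphas = [round(0.1 * i, 1) for i in range(11)]          # 0.0 ... 1.0
lambdas = [round(1.0 + 0.1 * i, 1) for i in range(16)]   # 1.0 ... 2.5
As = [round(0.1 * i, 1) for i in range(11)]              # 0.0 ... 1.0

# Every combination of (alpha, lambda, A): 11 * 16 * 11 parameter sets.
grid = list(product(alphas, lambdas, As))
print(len(grid))  # 1936

def best_fraction(scored, fraction):
    """Return the (parameters, msd) pairs with the smallest MSD, ascending."""
    ranked = sorted(scored, key=lambda item: item[1])
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]
```

Here `scored` is assumed to be a list of `(parameters, msd)` pairs, one per grid point.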

We overlaid the MSDs of the DRI and DEL models in **Figure 8**. The result showed that the DEL model was more accurate in making prediction than the DRI model. This finding is consistent with the previous observation of Ahn et al. (2008, p. 1392; Table 6) which showed that the MSD of the DEL model was relatively small in comparison with the MSD of the DRI model. The following figures demonstrate the number of value distributions (0 < α < 1; 1 < λ < 2.5; 0 < A < 1) under three MSD conditions (1, 5, 10%) for each parameter (α, λ, A) in the DEL model (**Figures 9**–**11**).

**Figure 9** shows that the 1% MSD is clearly allocated mostly in the low α-value section (e.g., 0 and 0.1). The impact of gain-loss value is therefore relatively restricted or vanishing for decision makers. When α is close to zero, |x(t)|<sup>α</sup> is close to 1. This indicates the influence of the gain-loss frequency and the impact of λ and A. Based on the three hierarchies of MSD (the error rates from 1 to 10%), the small value of α possessed a relatively high reliability.

As shown in **Figure 10**, the simulation test demonstrates that when the λ value was in the present range (1 < λ < 2.5), the MSD distribution patterns (MSD of 1–10%) did not change significantly. When the value of α was close to zero (MSD of 1%), the λ value influenced the fluctuation of MSD value to a lesser degree. Furthermore, when the α value was equal to 0, the function of x(t) was equal to 1, and the weight effect of λ disappeared. In fact, the probability of loss trial in the IGT was only 20%. As the probability of choosing loss trial is relatively small, the appearance frequency of the λ value has an averaged distribution globally.

Additionally, the value of A influenced the consecutive trials; namely, the acquisition of strategy learning in an abstract manner. For instance, if the A value is small, the effect of influencing the consecutive trial by the previous gain-loss experiences is relatively small. In **Figure 11**, it can be observed that the A value is located in a relatively small range of the MSD.
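The role of A can be illustrated with a small Python sketch of the delta rule for a single deck that is chosen on every trial. The function name and the example sequence (three gains followed by one loss, with utilities 1 and −1.3) are assumptions chosen for illustration, not data from this study.

```python
def expectancy_trace(utilities, A, start=0.0):
    """Delta-rule trace for one deck that is chosen on every trial."""
    E, trace = start, []
    for u in utilities:
        E += A * (u - E)   # move the expectancy toward the latest utility
        trace.append(E)
    return trace

# Three gains, then one loss (utilities for alpha = 0 and lambda = 1.3).
outcomes = [1.0, 1.0, 1.0, -1.3]
slow = expectancy_trace(outcomes, A=0.1)
fast = expectancy_trace(outcomes, A=0.9)
```

With A = 0.1 the expectancy remains positive after the single loss (about 0.11), whereas with A = 0.9 it drops to about −1.07, so a small A limits how strongly any one outcome shifts the following choice.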

## DISCUSSION

The empirical results of this study replicated the "prominent deck B phenomenon" in the IGT and demonstrated that most subjects preferred the bad deck B and good decks C and D rather than the bad deck A in the standard administration of IGT (see **Figures 3**–**5**). However, various research groups have made this observation on the behavioral level over the past decades (Wilder et al., 1998; Takano et al., 2010; Upton et al., 2012; Steingroever et al., 2013; Worthy et al., 2013a,b). The present modeling study indicated that some parameters in the PU model may be ineffectual in predicting the choice behavior in IGT. Therefore, we suggest that the Ahn et al. (2008) PU model is not the optimal one and that there should be some room for modification.

## The Simulation Based on the Mean Number of Card Selections

According to the simulation result of the choice pattern in each deck, the mean number of selections from deck A is lower than that of the other three decks (**Figure 12**). Decks B, C, and D have a similar mean number of card selections. This choice pattern (A < B, C, D) existed not only in the empirical data, but also in the simulation data. The simulation result is similar to the empirical observation of the IGT choice behavior. We found that in the gain-loss structure of the IGT, two main factors, monetary value and gain-loss frequency, correlated highly with the present choice pattern. For instance, in a cycle of 10 trials, decks B and D have relatively high frequency gains; for example, nine gains and one loss (Wilder et al., 1998; Worthy et al., 2013a,b; Seeley et al., 2014). If the monetary value is controlled between the two decks, the two decks will have the same gain-loss structure. The choice pattern of simulation data shows that decks B and D have a similar number of choices when the α value is close to zero. Monetary value has less influence in this condition (Lee et al., 2014).

## Observation of the Learning Processing

The empirical and simulation data consistently demonstrated the learning curve of deck A to be gradually descending (**Figures 4**, **7**). On the other hand, both behavioral and simulation findings showed similar ascending choice patterns for decks C and D. However, some differences between the behavioral and simulation data for deck B were observed, which may have arisen from some limitations in the present models. In fact, Ahn et al. (2008) mentioned that the best model (DEL) in their IGT and SGT simulation study could make enhanced predictions for global choice patterns (long-term predictions) but not for learning processing (Ahn et al., 2008, 2014).

Additionally, based on the viewpoint of gain-loss frequency, the ascending curves of decks B and D may be due to the decreasing influence of monetary value. Moreover, the location of the learning curve of deck C in the middle of the four curves may be due to the deck's occasional draws (for example, "+50, −50" in some trials) and small gains from the viewpoint of net-value calculation (Chiu and Lin, 2007; Chiu et al., 2012).

The model in the present study combined the PU function and delta learning model and undoubtedly created a hierarchical influence. The order of influence could be α to A (positive net value) or α, λ to A (negative net value). Based on this observation, α is a powerful parameter for modulating the model and predicting the participant's behavior. Conversely, when α is fixed, λ and A have less influence in mediating the model. Therefore, the value of α obviously determines the effect significantly. On the other hand, the simulation result of α (**Figure 9**) demonstrated the model to be insensitive to value change, but it correlated increasingly to the gain-loss frequency effect (**Figure 13**). Nevertheless, based on the behavioral result (**Figure 4**), the selection of bad deck B gradually and unsteadily decreased. This may imply that the largest loss of deck B truly does influence choice behavior; thus, the small α value (0) of the simulation and the λ value (1.3) may not totally reflect all situations.

FIGURE 11 | Counts of smallest MSD value when modulating the A value based on DEL model. The present test modulated the value of A from 0 to 1 and processed 1936 simulations based on the DEL model. The simulation results were listed with regard to the MSD value. Here we demonstrated that in the three collections (1, 5, 10%) of smallest MSD values, the A value close to zero has the largest number of smallest MSD. This indicates that in the best-fit model (e.g., DEL), the parameter A may be fixed to a constant (close to zero) rather than a variable, which represents the ineffective influence of past experience.

Over the past decade, studies of IGT modeling have evolved from the linear EU model (Busemeyer and Stout, 2002) to the non-linear PU model (Ahn et al., 2008). These models aim to quantify behavioral impact by monetary value. Notably, the PU model possessed unequal valence and value function between gain and loss; namely, unbalanced marginal effect in gain and loss conditions. However, many components from the input (perception) to the output (decision making) may influence the behavioral results. For example, visual fields, figure and character distinction, the ability to integrate information, memory encoding and retrieval, comprehension, logical reasoning, and decision drivers may be latent causes that also influence choice behavior. The present PU model considered only a partial set of relevant variables when predicting the decision behavior under uncertainty. There may be better and more simplified models using dynamic-change parameters.

FIGURE 12 | Comparison of two datasets for average card selection (for the first 100 trials). Comparing the data of the subjects and the simulation data, we observe that the number of chosen cards from decks B and D was higher than that of the other two decks. The high selection of these two decks suggests high frequent gain to be the critical factor in choice behavior under uncertainty. Furthermore, the bad deck A was consistently the least chosen in the behavioral and simulation data. This observation is congruent with those of most previous IGT studies.

## CONCLUSION

Based on PT theory and the study by Ahn et al. (2008), we found a simpler model of IGT behavior in the present study. Over the years, many IGT modeling studies have suggested that the PU model (Ahn et al., 2008) is better than the EU model (Busemeyer and Stout, 2002) for predicting choice behavior under uncertainty because the PU model considers the distinct influences of gain and loss. However, we considered that some parameters in the PU model may be ineffective and render this model suboptimal. In this study, we provided a method of model testing by modulating some key parameters (α, λ, and A) in the PU model. The findings from the model testing demonstrated that these parameters (α, λ, and A) possessed hierarchical influences and specific optimized ranges in the PU model. By setting α ≈ 0; λ ≈ 1.3; and A ≈ 0.1 as the optimized parameters of the simulation, the modified PU function (u(t)) can be calculated as follows:

$$u(t) = \begin{cases} |x(t)|^{\alpha}, & \text{if } x(t) \ge 0 \\ -\lambda |x(t)|^{\alpha}, & \text{if } x(t) < 0 \end{cases} = \begin{cases} 1, & \text{if } x(t) \ge 0 \\ -1.3, & \text{if } x(t) < 0 \end{cases}$$

As α is approaching zero, the shape of this function is similar to a Heaviside (step) function (see **Figure 13**).
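The approach to a step function can be illustrated numerically with a short Python sketch. The function name and the payoffs (+100 and −1250) are illustrative assumptions; the sketch simply evaluates the PU function for decreasing values of α with λ = 1.3.

```python
def prospect_utility(x, alpha, lam):
    """u = |x|**alpha for gains, -lam * |x|**alpha for losses."""
    magnitude = abs(x) ** alpha
    return magnitude if x >= 0 else -lam * magnitude

# As alpha shrinks, the payoff magnitude matters less and less;
# at alpha = 0 only the sign of the payoff remains.
for alpha in (0.5, 0.1, 0.01, 0.0):
    gain = prospect_utility(100, alpha, 1.3)
    loss = prospect_utility(-1250, alpha, 1.3)
    print(f"alpha={alpha}: u(+100)={gain:.3f}, u(-1250)={loss:.3f}")
```

At α = 0 the two utilities are exactly 1 and −1.3 regardless of the payoff magnitudes, which corresponds to the right-hand side of the equation above.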

Combined with this result, we suggest a simplified model as follows:

$$\begin{array}{rcl} E_{j}(t) &=& E_{j}(t-1) + 0.1 \cdot \delta_{j}(t) \cdot [1 - E_{j}(t-1)], \quad \text{if } x(t) \ge 0; \\ E_{j}(t) &=& E_{j}(t-1) + 0.1 \cdot \delta_{j}(t) \cdot [-1.3 - E_{j}(t-1)], \quad \text{if } x(t) < 0. \end{array}$$
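A minimal Python sketch of this simplified update is given below. The function name, the list representation of the four expectancies, and the example payoffs are assumptions for illustration; δ<sub>j</sub>(t) is implemented by updating only the chosen deck, and the payoff magnitude is ignored because only its sign enters the rule.

```python
def simplified_update(E, chosen, x, A=0.1, lam=1.3):
    """One trial of the simplified model: only the sign of payoff x matters."""
    target = 1.0 if x >= 0 else -lam
    updated = list(E)
    # delta_j(t) = 1 for the chosen deck and 0 otherwise,
    # so only the chosen deck's expectancy changes.
    updated[chosen] += A * (target - updated[chosen])
    return updated

# A deck with nine gains and one loss per ten trials (as in decks B and D).
# The payoff magnitudes are placeholders, since only their sign is used.
E = [0.0, 0.0, 0.0, 0.0]
for x in [100] * 9 + [-1150]:
    E = simplified_update(E, chosen=1, x=x)
```

In this example the expectancy of the chosen deck is still positive after the single loss (about 0.42), which is in line with the idea that a frequent-gain deck remains attractive despite an occasional large loss.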

Further, we conclude that the change in some parameters (e.g., λ and A) may be powerless in influencing the models when α approaches zero. This model testing shows that the PU model may need further simplification for it to be optimized. The simulation of this simplified model implied that decision makers were sensitive to gain-loss frequency rather than the long-term outcome. The modified model may possess better predictors for clinical categorization and distinguishing between normal subjects and neuropsychiatric patients. However, the present study determined a set of the three fixed values for α, λ, and A only by analyzing a specific dataset of IGT experimental data. To make this dataset of estimated values applicable to a wider range of IGT and SGT experiments, more data from different experimental sets would be needed. Supposing the fitting values for α, λ, and A could be converged to an acceptable range across a sufficient number of experiments, this simplified model may turn out to be a better explanation of choice behavior under uncertainty.

FIGURE 13 | Prospect utility function and simulated gain-loss frequency effect (λ = 1 (left panel); λ = 1.3 (right panel)). According to the definition provided by Ahn et al. (2008), α is defined as the shape of the utility function and λ as the response to the effect of loss aversion. When α or λ is adjusted in the PU function, we observed little loss-aversion effect. Notably, the present study indicated that both α and λ might not be the variables that shape the selection pattern of decks A, B, C and D. Particularly, when λ = 1, the loss-aversion effect is no longer present in this figure. According to our simulation result, the optimized α value is nearly equal to zero, the PU function only represents the effect of gain-loss frequency and the effect of value is greatly diminished. More specifically, when the optimized λ value is nearly equal to 1 and α close to zero in this simulation, the value function is similar to the Heaviside (step) function. This observation implies that the effect of insensitivity to value is actually the same as the effect of sensitivity to gain-loss frequency (Lin et al., 2007; Chiu et al., 2008). The present simplified PU model mostly represents the adoption of a gain-stay loss-shift strategy under uncertainty. This finding may explain the "prominent deck B phenomenon" for healthy groups in the growing number of recent IGT studies.

## AUTHOR CONTRIBUTIONS

CL, YL, and YC contributed to the conceptual innovation and literature review and the three authors contributed equally to this study. JH provided some valuable concepts. YL and TS were responsible for subject recruitment and behavioral data collection. YL contributed to the programming of models and simulation data processing. YL, TS, and CL provided the statistical analysis. YC, YL, and CL worked on the data interpretation and developed the manuscript. Also, JH provided some discussion on refining the manuscript. YC, YL, and CL set up all experimental conditions and arranged all behavioral and simulation circumstances for this study. YC, JH, and CL finalized all revisions with YL and TS.

## ACKNOWLEDGMENTS

The authors would like to thank the National Science Council of Taiwan for financially supporting this study under Contract No. NSC 99-2410-H-031-025, NSC 102-2320-B-039-001, and MOST 104-2410-H-031-014. The first author's work was also supported in part by the Kaohsiung Medical University, Taiwan (KMU-Q103022; KMUTP103F00- 03). Special thanks to Profs. Da-Bai Shen, Ming-Chin Hung, and Yung-Fong Hsu for engaging in helpful discussions, as well as to Mr. We-Kang Lee, who provided great help in the first and second rounds of proofreading and by engaging in detailed discussions regarding potential future studies. We are grateful to Kyle Scheihagen, Li-Hsin Lin, and Charlesworth group for their valuable help in English editing and proofreading for this manuscript. Also, we appreciate the kind and helpful suggestions from the reviewers, one of whom in particular suggested a concise title which could better represent the main finding of this article.

## REFERENCES


Iowa Gambling Task?" in Paper Presented at the Society for Neuroeconomics 3rd Annual Meeting (Kiawah Island, SC).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer PM and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Lin, Lin, Song, Huang and Chiu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Decision-making in stimulant and opiate addicts in protracted abstinence: evidence from computational modeling with pure users

*Woo-Young Ahn<sup>1</sup>, Georgi Vasilev<sup>2</sup>, Sung-Ha Lee<sup>3</sup>, Jerome R. Busemeyer<sup>3</sup>, John K. Kruschke<sup>3</sup>, Antoine Bechara<sup>4,5</sup> and Jasmin Vassileva<sup>6</sup>\**

*<sup>1</sup> Virginia Tech Carilion Research Institute, Virginia Tech, Roanoke, VA, USA*

*<sup>2</sup> Bulgarian Addictions Institute, Sofia, Bulgaria*

*<sup>3</sup> Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, USA*

*<sup>4</sup> Department of Psychology, University of Southern California, Los Angeles, CA, USA*

*<sup>5</sup> Brain and Creativity Institute, University of Southern California, Los Angeles, CA, USA*

*<sup>6</sup> Department of Psychiatry, Virginia Commonwealth University School of Medicine, Richmond, VA, USA*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*Eric-Jan Wagenmakers, University of Amsterdam, Netherlands*

*Darrell A. Worthy, Texas A&M University, USA*

#### *\*Correspondence:*

*Jasmin Vassileva, Department of Psychiatry, Institute for Drug and Alcohol Studies, Virginia Commonwealth University, 203 E. Cary Street, Richmond, VA 23219, USA*

*e-mail: jlvassileva@vcu.edu*

Substance dependent individuals (SDI) often exhibit decision-making deficits; however, it remains unclear whether the nature of the underlying decision-making processes is the same in users of different classes of drugs and whether these deficits persist after discontinuation of drug use. We used computational modeling to address these questions in a unique sample of relatively "pure" amphetamine-dependent (*N* = 38) and heroin-dependent individuals (*N* = 43) who were currently in protracted abstinence, and in 48 healthy controls (HC). A Bayesian model comparison technique, a simulation method, and parameter recovery tests were used to compare three cognitive models: (1) Prospect Valence Learning with decay reinforcement learning rule (PVL-DecayRI), (2) PVL with delta learning rule (PVL-Delta), and (3) Value-Plus-Perseverance (VPP) model based on Win-Stay-Lose-Switch (WSLS) strategy. The model comparison results indicated that the VPP model, a hybrid model of reinforcement learning (RL) and a heuristic strategy of perseverance had the best *post-hoc* model fit, but the two PVL models showed better simulation and parameter recovery performance. Computational modeling results suggested that overall all three groups relied more on RL than on a WSLS strategy. Heroin users displayed reduced loss aversion relative to HC across all three models, which suggests that their decision-making deficits are longstanding (or pre-existing) and may be driven by reduced sensitivity to loss. In contrast, amphetamine users showed comparable cognitive functions to HC with the VPP model, whereas the second best-fitting model with relatively good simulation performance (PVL-DecayRI) revealed increased reward sensitivity relative to HC. These results suggest that some decision-making deficits persist in protracted abstinence and may be mediated by different mechanisms in opiate and stimulant users.

**Keywords: addiction, decision-making, computational modeling, heroin, amphetamine, protracted abstinence, Bayesian data analysis, Widely Applicable Information Criterion (WAIC)**

### **INTRODUCTION**

Drug addiction is a chronic relapsing brain disease, characterized by compulsive drug seeking and use despite negative consequences in major life domains (Goldstein and Volkow, 2011). Substance dependent individuals (SDI) are commonly characterized by decision-making deficits, both on laboratory tasks and in real life, manifested by lack of judgment and reduced concern for the consequences of their actions. What remains unknown, however, is whether these decision-making deficits are equally represented across addictions to different classes of drugs.

Current theories consider addiction to different classes of drugs as a unitary phenomenon, in part based on evidence that most drugs of abuse act on the mesocortico/mesolimbic dopamine (DA) system (Wise, 1978; Di Chiara and Imperato, 1988; Robinson and Berridge, 1993). More recently, however, animal and human studies have begun to reveal important cognitive and neurobiological differences between addictions to different classes of drugs, such as stimulants and opiates (Pettit et al., 1984; Rogers et al., 1999; Ersche et al., 2005b; Badiani et al., 2011). It is now well known that these two classes of drugs act on different mechanisms of DA modulation (Kreek et al., 2002, 2012). DA transmission mediates self-administration of stimulants, but not of opiates; in contrast, the μ-opiate receptor plays an important role for opiate, but not for stimulant self-administration (Badiani et al., 2011). Further, genetic studies reveal minimal overlap of genes associated with stimulant and opiate addiction (Yuferov et al., 2010).

Preclinical studies reveal notable differences between stimulants and opiates, which exert fundamentally different behavioral effects, such that stimulants produce arousing and activating effects, whereas opiates produce mixed inhibitory and excitatory effects (Stewart et al., 1984). Of note, the rewarding effects of stimulant self-administrations are greater in new and arousing environments than in familiar and safe environments, whereas the opposite is observed with the sedative effects of opiates (Caprioli et al., 2008). Further, the neural pathway activated by aversive stimuli from lateral habenula to rostromedial tegmental nucleus (RMTg) is affected by opiates, but not by stimulants (Lecca et al., 2011).

In contrast, studies comparing neurocognitive performance of human stimulant and opiate users have shown mixed results. Some studies reveal distinct performance patterns in stimulant vs. opiate users. Rogers et al. (1999) report that amphetamine users perform worse than healthy individuals on the Cambridge Gambling Task, whereas opiate users display intact performance on this decision task. In addition, duration of drug abuse was associated with suboptimal decision-making in stimulant users, but not in opiate users. In another study (Ornstein et al., 2000), amphetamine and heroin abusers were characterized by different attentional shifting deficits, with amphetamine users being impaired on the extra-dimensional (ED) and heroin users on the intra-dimensional (ID) shift component of the task. Also, cocaine users, but not heroin users show deficits in response inhibition (Verdejo-Garcia et al., 2007b). In contrast, other studies reveal comparable neurocognitive profiles between users of these two classes of drugs. Both cocaine and heroin users show higher discounting of delayed rewards compared to alcohol users and healthy individuals (Kirby and Petry, 2004). Further, on a task measuring reflection impulsivity, both amphetamine- and opiate-dependent individuals sample less information and perform worse than healthy individuals (Clark et al., 2006).

Decision-making is one of the neurocognitive domains on which SDI are commonly impaired. It is typically indexed in the laboratory with tasks that simulate real-life decision-making such as the Iowa Gambling Task (IGT) (Bechara et al., 1994), on which SDI often select choices that yield high immediate gains but have higher future losses (Grant et al., 2000; Bechara et al., 2001; Bolla et al., 2003; Bechara and Martin, 2004; Gonzalez et al., 2007; Vassileva et al., 2007a; Verdejo-Garcia et al., 2007a). Decision-making deficits among SDI are of immediate practical concern, in light of their associations with HIV risk behaviors (Gonzalez et al., 2005) and clinical outcomes such as abstinence (Passetti et al., 2008). The IGT is a complex task and poor behavioral performance could be the result of deficits in various distinct neurocognitive processes, such as hypersensitivity to reward and/or hyposensitivity to losses, failure to learn from past outcomes and losses, and/or erratic and impulsive response style. In a series of studies, Busemeyer et al. (Busemeyer and Stout, 2002; Stout et al., 2004; Yechiam et al., 2005; Ahn et al., 2008) have developed mathematical models of the task that capture the complex interplay of cognitive and motivational processes involved in decision-making. The use of such models allows one to decompose behavioral performance on the task into distinct cognitive, motivational, and response processes, thereby providing a fine-grained analysis of the underlying decision-making processes and characterizing more precisely the decision-making deficits of different clinical groups.
This approach yields quantifiable parameter estimates of such processes, which have been successfully mapped in various clinical populations including cocaine users, cannabis users, alcohol users, individuals with Asperger's disease, Huntington's disease, schizophrenia, and bipolar disorder (for a review, see Yechiam et al., 2005), as well as in eating disorders (Chan et al., 2014) and patients with HIV (Vassileva et al., 2013). Studies applying this approach show that although behavioral performance may be similar across different clinical groups, the cognitive processes that underlie these behavioral profiles may vary across groups in clinically meaningful ways.

The widespread polysubstance-dependence among SDI significantly complicates attempts to dissociate pre-existing biological or personality characteristics from the effects of chronic use of different classes of drugs on neurocognitive functioning (Fernández-Serrano et al., 2011; Gorodetzky et al., 2011; Baldacchino et al., 2012). Further, we still know very little about the reversibility of the observed neurocognitive deficits with abstinence, given that with few exceptions (Ersche et al., 2005a,b; Clark et al., 2006) most studies to date have focused on current drug users or on SDI who have been abstinent for rather brief periods of time. The chronic relapsing nature of addiction suggests that some of the neurocognitive deficits, particularly those in decision-making, may persist with abstinence and may be critically implicated in increased susceptibility to relapse. In order to better understand the brain's recovery of function with protracted abstinence and to refine treatment interventions at different stages of the addiction cycle, it is crucial to get a better understanding of the specificity and the persistence of the neurocognitive deficits observed in drug users.

To address these challenges, we conducted the current study in Bulgaria, where polysubstance dependence is still relatively uncommon and where we have access to a unique population of fairly "pure" (monosubstance-dependent) amphetamine and heroin users who meet lifetime DSM-IV criteria for amphetamine or heroin dependence. The heroin epidemic in Bulgaria started in the early 1990s after the end of communism, when Bulgaria became a key transit country for heroin trafficking due to its strategic geographical position on the "Balkan Drug Route," one of the main routes for international drug traffic from South-West Asia to Western Europe. Estimates show that at times up to 80% of heroin used in Western Europe passes through this route (European Monitoring Center for Drugs and Drug Addiction, 2011). The heroin epidemic reached its peak in 1997–1998, after which it plateaued. In the early 2000s, there were an estimated 20,000–30,000 regular heroin addicts in Bulgaria (population of ∼7,476,000 people), a number that has remained steady over the last decade, with a recent trend toward a slight decline. Typically, heroin addicts belong to a cohort of somewhat aging addicts, ∼30 years of age. In contrast, the amphetamine epidemic in Bulgaria started more recently, in the new millennium, when Bulgaria became a major center for production of synthetic amphetamine-type stimulants; the country is currently one of the top five highest-prevalence countries in Europe (European Monitoring Center for Drugs and Drug Addiction, 2011). Hence, amphetamine users are typically younger, normally in their late teens or early 20s. Notably, few SDI use the two types of drugs concurrently.

We compared the decision-making performance of heroin and amphetamine users to that of healthy controls (HC) without any history of substance dependence. We followed these behavioral analyses by applying a computational modeling approach, in order to better characterize their decision-making styles and to disentangle the distinct neurocognitive processes underlying the decision-making performance of heroin and amphetamine users. The modeling results and their interpretations depend on which model we use. Therefore, we first identified the best-fitting model by comparing three existing computational models using a Bayesian model comparison technique, a simulation method, and parameter recovery tests (see Materials and Methods below for more details). Then, we compared groups in a Bayesian way using the best-fitting model, but also tested whether we would observe similar group differences with the other models. Based on previous animal and human studies, we hypothesized that amphetamine and heroin users would show distinct decision-making profiles. Specifically, we expected that amphetamine users would show increased reward sensitivity and heroin users would show reduced loss aversion compared to HC (Spotts and Shontz, 1980; Stewart et al., 1984; Kreek et al., 2002).

In light of the growing evidence for the relationship of externalizing and internalizing personality traits and disorders with decision-making and drug addiction, in exploratory analyses we considered the relationship between impulsivity and psychopathy (externalizing spectrum) and depression and anxiety (internalizing spectrum) with decision-making. We hypothesized that externalizing but not internalizing traits and states would be associated with compromised decision-making.

## **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Study participants included 129 individuals, enrolled in a larger study of impulsivity in heroin and amphetamine users in Sofia, Bulgaria. Potential participants were recruited via flyers placed at substance abuse clinics, cafes, bars, and night clubs in Sofia and screened via telephone and in-person on their medical and substance use histories. SDI had lifetime DSM-IV histories of opiate or stimulant dependence. The current study included primarily monosubstance-dependent users with no history of dependence on alcohol or any drug other than opiates or stimulants (with the exception of nicotine, caffeine, and/or past cannabis dependence). Demographically similar individuals with no history of substance dependence were included as controls. Study participants included 38 amphetamine users, 43 heroin users, and 48 HC. Most of the heroin and amphetamine users were in protracted abstinence at the time of testing (∼2.9 years on average since they last met DSM-IV criteria for substance dependence, minimum 3 months post discontinuation of drug use). Among the 38 amphetamine users, 11 were in early (<12 months of abstinence) full (*n* = 9; 24%) or partial (*n* = 2; 5%) remission and 27 were in sustained (>12 months of abstinence) full (*n* = 25; 66%) or partial (*n* = 2; 5%) remission. Among the 43 heroin users, 12 (28%) were in early full remission, 30 (70%) were in sustained full and one (2%) was in sustained partial remission.

Inclusion criteria consisted of age between 18 and 50 years, minimum of 8 years of formal education, ability to speak and read Bulgarian, estimated IQ greater than 80, negative breathalyzer test for alcohol and negative rapid urine toxicology screen for opiates, cannabis, amphetamines, methamphetamines, benzodiazepines, barbiturates, cocaine, MDMA, and methadone. Exclusion criteria included history of neurologic illness or injury, history of psychotic disorders, and current opioid substitution therapy (OST). All participants were HIV-seronegative, as verified by rapid HIV test. All participants provided written informed consent. Study procedures were approved by the Institutional Review Boards of the University of Illinois at Chicago and the Medical University in Sofia on behalf of the Bulgarian Addictions Institute.

#### **ASSESSMENT**

History of substance abuse and dependence was determined using the Structured Clinical Interview for DSM-IV Substance Abuse Module (SCID-SAM; First et al., 1996). The Raven's Progressive Matrices was administered to index estimated IQ. For the exploratory analyses, the Barratt Impulsiveness Scale, 11th revision (BIS-11; Patton and Stanford, 1995) indexed the personality trait of impulsivity. Psychopathy was assessed with the Psychopathy Checklist: Screening Version (PCL:SV; Hart et al., 1995). Current depression was assessed with the Beck Depression Inventory-II (BDI-II; Beck et al., 1996) and anxiety with the State-Trait Anxiety Inventory (STAI; Spielberger and Gorsuch, 1983). For the exploratory analyses, we also tabulated several substance use characteristics including number of years of drug use, length of abstinence from the primary drug of dependence, number of DSM-IV criteria met for the primary drug of dependence, severity of nicotine dependence, and history of past cannabis dependence.

#### **IOWA GAMBLING TASK**

Decision-making was measured with the computerized IGT (Bechara et al., 1994, 2001), arguably the most popular decision task in the addiction literature. The task requires participants to select cards from one of four decks with the goal of maximizing profits. Unbeknownst to participants, two of the decks (decks C and D) are *advantageous ("good")* and two (decks A and B) are *disadvantageous ("bad")* in terms of their long-term payoffs. The frequencies of punishment also vary across decks such that punishment is more frequent in decks A and C (50%) than in decks B and D (10%). In the modified version of the IGT (Bechara et al., 2001) used in the current study, each deck has up to 60 cards and the amounts of net gains or losses increase incrementally in every block of 10 cards. For example, the net loss of decks A and B in the first block of 10 cards is \$250, and it increases by \$150 in each subsequent block until it reaches \$1000 in the sixth block. Similarly, the net gain of decks C and D goes up from \$250 in the first block to \$375 in the sixth block, with an increment of \$25 in each block of 10 cards. The frequencies of punishment are identical to those in the original IGT version. Participants have to learn the task contingencies by trial-and-error. Healthy participants typically learn to select cards from the advantageous decks as the task progresses, thereby achieving a higher cumulative reward value. Behavioral performance analyses were based on the total net score, calculated by subtracting the number of disadvantageous deck selections from the number of advantageous deck selections. Trial-by-trial choice data of the HC, amphetamine, and heroin groups are available at http://figshare.com/articles/IGT\_raw\_data\_Ahn\_et\_al\_2014\_Frontiers\_in\_Psychology/1101324.
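The block-wise net payoffs and the total net score described above can be sketched in a few lines. This is only an illustration: the function names `net_outcome_per_block` and `total_net_score` are hypothetical, and the sketch assumes that net outcomes change linearly across the six blocks exactly as stated in the text. It does not reproduce the card-by-card payoff schedule of the task.

```python
def net_outcome_per_block(deck, block):
    """Net outcome (in dollars) of 10 cards from `deck` in block 1..6.

    Assumption: linear increments per block, as described in the text
    (decks A/B lose 250 + 150 per later block; decks C/D gain 250 + 25).
    """
    if not 1 <= block <= 6:
        raise ValueError("block must be between 1 and 6")
    if deck in ("A", "B"):  # disadvantageous decks: net loss
        return -(250 + 150 * (block - 1))
    if deck in ("C", "D"):  # advantageous decks: net gain
        return 250 + 25 * (block - 1)
    raise ValueError(f"unknown deck: {deck!r}")


def total_net_score(choices):
    """Advantageous (C, D) selections minus disadvantageous (A, B) selections."""
    good = sum(1 for c in choices if c in ("C", "D"))
    bad = sum(1 for c in choices if c in ("A", "B"))
    return good - bad
```

A positive net score indicates that a participant chose the advantageous decks more often than the disadvantageous ones.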

#### **COMPUTATIONAL MODELING OF DECISION-MAKING**

From a statistical perspective, the IGT is a four-armed bandit problem (Berry and Fristedt, 1985), a special case of reinforcement learning (RL) problems in which an agent needs to learn an environment by choosing actions and experiencing the outcomes of those actions. Poor performance on the IGT can be due to a number of distinct underlying neurocognitive processes such as poor learning/memory, hypersensitivity to reward, hyposensitivity to loss, or response inconsistency. In order to better characterize behavioral performance on the IGT and to disentangle the distinct neurocognitive processes underlying the performance of pure heroin and amphetamine users on the task, we next used the *computational modeling approach* (Busemeyer and Stout, 2002; Yechiam et al., 2005; Ahn et al., 2008).

We compared three of the most promising models of the IGT according to the literature (e.g., Ahn et al., 2008, 2011; Steingroever et al., 2013, 2014; Worthy et al., 2013b): the Prospect Valence Learning (PVL) model with delta learning rule (PVL-Delta) (Ahn et al., 2008), the PVL model with decay reinforcement learning rule (PVL-DecayRI) (Ahn et al., 2008, 2011), and the Value-Plus-Perseverance model (VPP) (Worthy et al., 2013b). We used the Watanabe-Akaike Information Criterion (also called Widely Applicable Information Criterion; WAIC) (Watanabe, 2010) to compare the *post-hoc* fits of models. We also used a simulation method to examine whether a model with estimated parameters can generate the observed choice pattern (Ahn et al., 2008; Steingroever et al., 2014). Below, we describe the mathematical details of all models (also available in Worthy et al., 2013b), as well as WAIC and the simulation method.

#### *Prospect valence learning (PVL) models (PVL-Delta and PVL-DecayRI)*

The PVL models have three components. The PVL-Delta and PVL-DecayRI models are identical except that they use different learning rules. First, the outcome evaluation follows the Prospect utility function that has diminishing sensitivity to increases in magnitude and different outcome sensitivity to losses vs. gains (i.e., loss aversion). The utility, *u(t)* on trial *t* of each net outcome *x(t)* is expressed as:

$$u(t) = \begin{cases} x(t)^{\alpha} & \text{if } x(t) \ge 0 \\ -\lambda |x(t)|^{\alpha} & \text{if } x(t) < 0 \end{cases} \tag{1}$$

Here α (shape parameter, 0 < α < 2) governs the shape of the utility function and λ (loss aversion parameter, 0 < λ < 10) determines the sensitivity to losses compared to gains. Net outcomes were scaled (all payoff outcomes were divided by a fixed number) for cognitive modeling so that the median highest net gain across subjects in the first block of 10 trials becomes 1 and the largest net loss becomes −11.5 (Busemeyer and Stout, 2002). If an individual has a high value of α, it indicates that he/she has greater sensitivity to feedback outcomes than an individual with a low value of α. Here, we extended the upper bound of α to be greater than 1 as some individuals may have very high values of α (e.g., Fridberg et al., 2010). A value of λ less than 1 indicates that the individual is more sensitive to gains than to losses while a value of λ greater than 1 indicates that he/she is more sensitive to losses than to gains.
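As a minimal sketch of Equation 1, the Prospect utility function can be written as follows. The function name `prospect_utility` and the argument name `lam` (for λ) are illustrative choices, not names taken from the authors' code, and the input `x` is assumed to be the already scaled net outcome.

```python
def prospect_utility(x, alpha, lam):
    """Prospect utility of a (scaled) net outcome x, following Equation 1.

    alpha: shape parameter (0 < alpha < 2)
    lam:   loss aversion parameter (0 < lam < 10)
    """
    if x >= 0:
        return x ** alpha
    # Losses are weighted by lam and use the magnitude of the outcome.
    return -lam * abs(x) ** alpha
```

For instance, with α = 0.5 and λ = 2, a net outcome of −4 is evaluated as −2 · 4<sup>0.5</sup> = −4, whereas a net gain of 4 is evaluated as 2, which illustrates loss aversion for λ > 1.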

Based on the outcome of the chosen option, the expectancies of the decks were computed using a learning rule. Previous studies consistently show that the decay-reinforcement learning (decayRI; Erev and Roth, 1998) has better *post-hoc* model-fits than the delta (Rescorla-Wagner; Rescorla and Wagner, 1972) rule on the IGT (Yechiam and Busemeyer, 2005, 2008; Ahn et al., 2008) but the delta rule outperforms the decayRI learning rule in simulation tests (Ahn et al., 2008; Steingroever et al., 2014). In the decayRI learning rule, the expectancies of all decks are discounted on each trial and then the expectancy of the chosen deck is updated by the current outcome utility:

$$E_j(t+1) = A \cdot E_j(t) + \delta_j(t) \cdot u(t) \tag{2}$$

*A* (*recency parameter/learning rate*, 0 < *A* < 1) determines how much the past expectancy is discounted. δ<sub>*j*</sub>(*t*) is a dummy variable which is 1 if deck *j* is chosen and 0 otherwise. On the other hand, in the delta rule, the expectancy of only the selected deck is updated and the expectancies of the other decks remain unchanged:

$$E_j(t+1) = E_j(t) + A \cdot \delta_j(t) \cdot (u(t) - E_j(t)) \tag{3}$$

*A* determines how much weight is placed on past experiences of the chosen deck vs. the most recent selection from the deck. A low learning rate indicates that the most recent outcome has a small influence on the expectancy and forgetting is more gradual. A high learning rate indicates that the recent outcome has a large influence on the expectancy of the chosen deck and forgetting is more rapid. Note that we used the same symbol (*A*) for the learning models in the two PVL models, but A has different meaning in each learning model (i.e., recency for the DecayRI and learning rate for the Delta model).
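The difference between the two learning rules in Equations 2 and 3 can be sketched as below. The function names `decay_ri_update` and `delta_update` are hypothetical, decks are indexed 0 to 3, and `utility` stands for *u*(*t*) of the chosen deck.

```python
def decay_ri_update(expectancies, chosen, utility, A):
    """Decay-reinforcement rule (Equation 2).

    Every deck's expectancy is discounted by A; the chosen deck
    additionally receives the utility of the current outcome.
    """
    return [A * e + (utility if j == chosen else 0.0)
            for j, e in enumerate(expectancies)]


def delta_update(expectancies, chosen, utility, A):
    """Delta (Rescorla-Wagner) rule (Equation 3).

    Only the chosen deck moves toward the observed utility;
    unchosen decks keep their previous expectancies.
    """
    updated = list(expectancies)
    updated[chosen] += A * (utility - updated[chosen])
    return updated
```

Applying both functions to the same starting expectancies makes the contrast visible: under the decay rule the unchosen decks shrink toward zero on every trial, whereas under the delta rule they are left untouched.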

The softmax choice rule (Luce, 1959) was then used to compute the probability of choosing each deck *j.* θ (sensitivity) governs the degree of exploitation vs. exploration:

$$\Pr[D(t+1) = j] = \frac{e^{\theta \cdot E_j(t+1)}}{\sum_{k=1}^{4} e^{\theta \cdot E_k(t+1)}} \tag{4}$$

θ is assumed to be trial-independent and was set to 3<sup>*c*</sup> − 1 (Yechiam and Ert, 2007; Ahn et al., 2008). *c* is a *consistency parameter* (choice sensitivity), which was limited from 0 to 5 so that the sensitivity ranges from 0 (random) to 242 (almost deterministic).
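A minimal sketch of the softmax choice rule in Equation 4 follows, taking the sensitivity as θ = 3<sup>*c*</sup> − 1, the form that matches the stated range of 0 (for *c* = 0) to 242 (for *c* = 5). The function name `choice_probabilities` is illustrative. Subtracting the largest expectancy before exponentiating is a standard numerical safeguard and does not change the resulting probabilities.

```python
import math


def choice_probabilities(expectancies, c):
    """Softmax choice probabilities over decks (Equation 4).

    c is the consistency parameter (0 <= c <= 5); theta = 3**c - 1.
    """
    theta = 3.0 ** c - 1.0
    # Shift by the maximum so that exp() never overflows for large theta.
    m = max(expectancies)
    weights = [math.exp(theta * (e - m)) for e in expectancies]
    total = sum(weights)
    return [w / total for w in weights]
```

With *c* = 0 every deck is chosen with probability 0.25 regardless of the expectancies, and with *c* = 5 the deck with the highest expectancy is chosen almost deterministically.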

#### *Value-plus-perseverance model*

Recent work suggests that participants often use a simple win-stay-lose-switch (WSLS) or perseverative strategy on the IGT, which uses only the information from the very last trial to make a decision on the current trial (Worthy et al., 2013a). Worthy et al. (2013a) compared the PVL-DecayRI and the WSLS models of the IGT using model-comparison methods. They showed that the PVL-DecayRI had the best model fits for about half of the subjects, whereas the WSLS model was the best-fitting model for the other half. Based on these findings, Worthy et al. (2013b) developed a VPP model, which is a hybrid model (e.g., Daw et al., 2011) of the PVL-Delta and a heuristic strategy of perseverance. Worthy et al. (2013b) showed that the VPP model had the best *post-hoc* model-fits and simulation performance compared to other models for the IGT in healthy individuals.

The VPP model assumes that a participant keeps track of deck expectancies (*E*<sub>*j*</sub>(*t*)) and perseverance strengths (*P*<sub>*j*</sub>(*t*)). The expectancies are computed by the learning rule of the PVL-Delta model (Equation 3). For the perseverance strengths of unchosen decks on the current trial *t*, *P*<sub>*j*</sub>(*t* + 1) = *k* · *P*<sub>*j*</sub>(*t*). For the chosen deck:

$$P_j(t+1) = \begin{cases} k \cdot P_j(t) + \varepsilon_p & \text{if } x(t) \ge 0 \\ k \cdot P_j(t) + \varepsilon_n & \text{if } x(t) < 0 \end{cases} \tag{5}$$

Here, three additional free parameters related to perseverance are introduced: *k* (0 < *k* < 1) is a decay parameter similar to *A* in the PVL-DecayRI model, which determines how much the perseverance strengths of all decks (including unselected decks) are decayed on each trial. ε<sub>*p*</sub> and ε<sub>*n*</sub> indicate the impact of gain and loss on perseverance behavior, respectively. A positive value would indicate that the feedback reinforces a tendency to persevere on the same deck on the next trial whereas a negative value would indicate that the feedback reinforces a tendency to switch from the chosen deck.

The overall value, *V*<sub>*j*</sub>(*t* + 1), is the weighted sum of *E*<sub>*j*</sub>(*t* + 1) and *P*<sub>*j*</sub>(*t* + 1):

$$V_j(t+1) = \omega \cdot E_j(t+1) + (1-\omega) \cdot P_j(t+1) \tag{6}$$

Here ω is the RL weight (0 < ω < 1). A low value of ω would indicate that the subject would rely less on RL but more on the perseverance heuristic. A high value of ω would indicate that the subject would rely more on RL and less on the perseverance heuristic. In the VPP model, the choice probability was again computed using the softmax rule but with *V*<sub>*j*</sub>(*t* + 1):

$$\Pr[D(t+1) = j] = \frac{e^{\theta \cdot V_j(t+1)}}{\sum_{k=1}^{4} e^{\theta \cdot V_k(t+1)}}. \tag{7}$$
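One trial of the VPP model (Equations 3 and 5 to 7) can be sketched as a single update step. The function name `vpp_step` and its argument names are hypothetical; `x` is the net outcome of the chosen deck, `u` its utility from Equation 1, and `theta` the softmax sensitivity. This is a sketch of the equations as written here, not the authors' implementation.

```python
import math


def vpp_step(E, P, chosen, x, u, A, k, eps_p, eps_n, omega, theta):
    """Update expectancies and perseverance strengths after one trial and
    return (new_E, new_P, choice probabilities for the next trial)."""
    # Equation 3: delta rule applied to the chosen deck only.
    new_E = list(E)
    new_E[chosen] += A * (u - new_E[chosen])

    # Equation 5: all perseverance strengths decay by k; the chosen deck
    # also receives eps_p after a net gain or eps_n after a net loss.
    new_P = [k * p for p in P]
    new_P[chosen] += eps_p if x >= 0 else eps_n

    # Equation 6: weighted sum of expectancy and perseverance.
    V = [omega * e + (1.0 - omega) * p for e, p in zip(new_E, new_P)]

    # Equation 7: softmax over V (shifted by the maximum for stability).
    m = max(theta * v for v in V)
    weights = [math.exp(theta * v - m) for v in V]
    total = sum(weights)
    return new_E, new_P, [w / total for w in weights]
```

Setting ω = 1 removes the perseverance term from the overall value, so the step reduces to the PVL-Delta model.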

#### **STATISTICAL ANALYSES**

All data analyses were conducted using Bayesian data analysis, which has several advantages over null hypothesis significance testing (NHST) (Wagenmakers, 2007; Kruschke, 2010, 2011b, 2013): In Bayesian analysis, decisions are based on posterior probabilities of parameters (which could be model indices), not on frequentist *p* values. Unlike posterior distributions, frequentist *p* values depend on the sampling and testing intentions of the analyst. Bayesian methods also seamlessly provide posterior distributions for the type of complex hierarchical models we use here, more flexibly than deriving *p* values. For clarity and to accommodate readers more familiar with NHST, we report NHST results in parallel whenever appropriate and when compatible NHST approaches are available. We used the posterior means of individual parameters for NHST and regression analyses. For Bayesian multiple regression and correlation analyses, we used robust regression methods, so that outliers do not critically affect the inferred regression coefficients, together with hierarchical models, which reduce the risk of "false alarms."

Posterior distributions on parameters are summarized by their central tendency (i.e., mean or mode) and by their highest density interval (HDI), which is the range of parameter values that spans 95% of the distribution and has higher probability inside the interval than outside. The HDI can also be used to make decisions in conjunction with a region of practical equivalence (ROPE) around parameter values of interest such as zero (Kruschke, 2011a,b). If the ROPE excludes the HDI, then the ROPE'd value is said to be not credible. If the ROPE includes the HDI, then the ROPE'd value is said to be accepted for practical purposes. We leave the ROPE tacit in our analyses, as its exact size is not critical for our main conclusions. However, when the HDI excludes the value of interest (such as zero) but has an end not far from the value of interest, then a moderately large ROPE would overlap with the HDI and render the result indecisive.
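The HDI and the ROPE decision rule can be illustrated with a short sketch that works on posterior samples. It assumes a unimodal posterior and approximates the HDI by the narrowest window that contains the requested share of the sorted samples. The function names `hdi` and `rope_decision` are hypothetical, and the ROPE limits are chosen by the analyst.

```python
import math


def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples
    (an approximation to the HDI for unimodal posteriors)."""
    s = sorted(samples)
    n = len(s)
    # Number of samples inside the interval; the small offset guards
    # against floating-point error in mass * n.
    window = max(1, min(n, math.ceil(mass * n - 1e-9)))
    start = min(range(n - window + 1),
                key=lambda i: s[i + window - 1] - s[i])
    return s[start], s[start + window - 1]


def rope_decision(interval, rope):
    """Compare an HDI with a ROPE, following the rule in the text."""
    lo, hi = interval
    rope_lo, rope_hi = rope
    if hi < rope_lo or lo > rope_hi:
        return "not credible"   # ROPE excludes the HDI
    if rope_lo <= lo and hi <= rope_hi:
        return "accepted"       # ROPE includes the HDI
    return "undecided"          # partial overlap
```

The "undecided" outcome corresponds to the indecisive case described above, in which one end of the HDI lies close enough to the value of interest that the ROPE overlaps it.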

#### *Hierarchical Bayesian parameter estimation*

The free parameters of each model were estimated using hierarchical Bayesian analysis (HBA), an emerging method in cognitive science (Lee, 2011). HBA allows for individual differences, while pooling information across individuals in a coherent way. Unlike the conventional way of parameter estimation (maximum likelihood estimation; MLE), Bayesian methods estimate full posterior distributions of parameter values rather than only point estimates. In addition, commonalities across individuals are captured by letting group tendencies inform each individual's parameter values. A recent simulation study also revealed that HBA yields much more accurate parameter estimates of the PVL-DecayRI model than non-hierarchical MLE methods. Specifically, a simulation study by Ahn et al. (2011) showed that non-hierarchical MLE estimates were often at the parameters' boundary limits (e.g., learning rate = 1) whereas parameter estimates with HBA showed much less discrepancy with actual parameter values. These results suggest that HBA would be a better method to capture individual differences in model parameters.

To perform HBA, we used a recently developed package called Stan 2.1.0 (Stan Development Team, 2014), which uses a Markov chain Monte Carlo (MCMC) sampling algorithm called Hamiltonian Monte Carlo (HMC). HMC allows efficient sampling even for complex models with multilevel structures and those with highly correlated parameters. Individual parameters were assumed to be drawn from group-level normal distributions. Normal and uniform distributions were used for the priors of normal means (μ(.)) and standard deviations (σ(.)), respectively (Wetzels et al., 2010; Steingroever et al., 2013). For parameters (say ξ for a general parameter for illustration purposes) that are bounded between 0 and 1 (e.g., *A*, *k*, ω):

$$
\begin{gathered}
\mu_{\xi'} \sim \text{Normal}(0, 1), \quad \sigma_{\xi'} \sim \text{Uniform}(0, 1.5), \\
\xi' \sim \text{Normal}(\mu_{\xi'}, \sigma_{\xi'}), \quad \xi = \text{Prob}(\xi')
\end{gathered} \tag{8}
$$

While Worthy et al. (2013b) set the boundary limits of ε<sub>*p*</sub> and ε<sub>*n*</sub> at [−1, 1], we set no bound constraints on ε<sub>*p*</sub> and ε<sub>*n*</sub>. We believe such boundary limits are useful for practical purposes in MLE but not in HBA methods. For those parameters with no bound constraints:

$$
\xi \sim \text{Normal}(\mu_{\xi}, \sigma_{\xi}), \quad \mu_{\xi} \sim \text{Normal}(0, 5), \quad \sigma_{\xi} \sim \text{Uniform}(0, 1.5) \tag{9}
$$

For parameters that are constrained to be greater than zero but with an upper limit (=*U*) (e.g., *U* = 2 for α, *U* = 10 for λ, *U* = 5 for *c*), we used the following transformations to allow a flat prior distribution over a full range:

$$
\begin{gathered}
\mu_{\xi'} \sim \text{Normal}(0, 1), \quad \sigma_{\xi'} \sim \text{Uniform}(0, 1.5), \\
\xi' \sim \text{Normal}(\mu_{\xi'}, \sigma_{\xi'}), \quad \xi = U \cdot \text{Prob}(\xi')
\end{gathered} \tag{10}
$$

We also reparameterized parameters (i.e., parameters are sampled as independent unit normals and then transformed accordingly for each parameter), which can be effective for complex hierarchical models, as suggested by Stan developers (see Chapter 19 "Optimizing Stan Code" of the Stan 2.1.0 Manual; https://github.com/stan-dev/stan/releases/download/v2.1.0/stan-reference-2.1.0.pdf).
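The reparameterization and the bounded transformations of Equations 8 and 10 can be illustrated outside of Stan with a short sketch. It rests on an assumption that is not stated explicitly in the text: that Prob(·) denotes the standard normal cumulative distribution function. The function name `to_bounded` and its arguments are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def to_bounded(raw, mu, sigma, upper=1.0):
    """Map a unit-normal draw `raw` to a parameter bounded in (0, upper).

    Step 1 (reparameterization): xi' = mu + sigma * raw, so that the
    sampler works with independent unit normals.
    Step 2 (transformation): xi = upper * Phi(xi'), where Phi is assumed
    to be the standard normal CDF.
    """
    xi_prime = mu + sigma * raw
    return upper * _STD_NORMAL.cdf(xi_prime)
```

For example, `to_bounded(0.0, 0.0, 1.0, upper=10)` gives 5.0, the midpoint of the range used for λ, and any finite draw stays strictly inside the bounds.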

A total of 2000 samples were drawn after 1000 burn-in samples for each of 3 chains (2000 samples × 3 chains = a total of 6000 samples). We estimated individual and group parameters separately for each population (HC, amphetamine, and heroin groups). For each parameter, the Gelman-Rubin test (Gelman and Rubin, 1992), also known as the *R̂* statistic, was used to check the convergence of the chains. *R̂* values close to 1.00 would indicate that the MCMC chains have converged to the target distributions. In our data, all model parameters of all models had *R̂* values of 1.00. MCMC chains were also visually inspected, which confirmed excellent mixing of MCMC samples. Effective sample sizes (ESS) of model parameters, which are related to autocorrelation and mixing of MCMC chains (i.e., a smaller ESS is related to higher autocorrelation), were typically greater than 1000 (out of 6000 total samples). The minimum ESS of hyper-parameters was 561 in the two PVL models, and 372 in the VPP model. Visual inspection of the parameters with smaller ESSs confirmed their convergence to target distributions.
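For readers unfamiliar with the convergence diagnostic, the following sketch computes the classic Gelman-Rubin statistic from several chains of equal length. It is the textbook (non-split) form and may differ in detail from the diagnostic that Stan reports. The function name `gelman_rubin` is illustrative.

```python
import math
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic Gelman-Rubin R-hat for one parameter.

    chains: list of equally long lists of post-burn-in samples.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    # W: average within-chain variance.
    W = mean([variance(c) for c in chains])
    # B: between-chain variance, scaled by the chain length.
    B = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * W + B / n
    return math.sqrt(var_hat / W)
```

Chains that explore the same distribution give values near 1, whereas chains stuck in different regions inflate the between-chain term and push the statistic well above 1.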

#### *Model comparisons using WAIC*

WAIC is a way to estimate a model's predictive accuracy with bias correction from over-fitting like Akaike Information Criterion (AIC; Akaike et al., 1973) and Deviance Information Criterion (DIC; Spiegelhalter et al., 2002). As a measure of predictive accuracy, the log predictive density or log-likelihood, log *p*(*y*|θ), is commonly used where *y* and θ indicate data and model parameters, respectively. WAIC is "a more fully Bayesian approach" that uses log pointwise posterior predictive density (*lppd*) and a correction (or penalty) term, each of which can be computed from MCMC samples made available from (hierarchical) Bayesian parameter estimation (for reviews and more details, see Gelman et al., 2013a,b).

Computed lppd (for each participant *i*; subscript *i* is omitted for convenience) is defined as:

$$\sum_{t=1}^{T} \log \left( \frac{1}{S} \sum_{s=1}^{S} p \left( y_t | \theta^s \right) \right) \tag{11}$$

Here θ*<sup>s</sup>* are posterior MCMC samples (*s* = 1, 2,..., S) and *T* is the number of trials (data points). Note that the likelihood dominates the posterior under standard conditions where a posterior distribution approaches a normal distribution (Degroot, 1970; Gelman et al., 2013a,b).

There is a correction term that adjusts for the effective number of parameters and overfitting. There are two types of adjustments (*p*<sub>WAIC1</sub> and *p*<sub>WAIC2</sub>) (Gelman et al., 2013a,b). Gelman et al. (2013a,b) recommended *p*<sub>WAIC2</sub> because of its closer relationship with leave-one-out cross validation than *p*<sub>WAIC1</sub>. We report results using *p*<sub>WAIC2</sub> but both adjustments yielded very similar values. Computed *p*<sub>WAIC2</sub> (for each participant *i*; subscript *i* is omitted for convenience here) is defined as:

$$\sum_{t=1}^{T} V_{s=1}^{S} \left( \log p \left( y_t | \theta^s \right) \right) \tag{12}$$

where *V*<sup>*S*</sup><sub>*s*=1</sub> indicates the sample variance (i.e., the variance of log *p*(*y*<sub>*t*</sub>|θ<sup>*s*</sup>) over *S* samples). WAIC<sub>*i*</sub> for each participant *i* is defined as follows, so that its value is on the deviance scale like AIC, DIC, and BIC (Schwartz, 1978):

$$\text{WAIC}_{i} = -2 \cdot (\text{lppd} - p_{\text{WAIC2}}) \tag{13}$$

We computed lppd and *p*<sub>WAIC2</sub> by rewriting the separate likelihood function in R (R Development Core Team, 2009), but it is also possible to implement WAIC directly in Stan code (Vehtari and Gelman, under review). Specifically, we first randomly sampled 1,000 (*S* = 1,000 in Equations 11 and 12) posterior samples from each subject's individual posterior distributions. We used posterior individual distributions (instead of group distributions) for the calculation because our goal was to replicate new data and evaluate predictive accuracy in existing groups. Then we prepared, for each subject, a matrix of trial-by-trial predictive densities (*p*(*y*<sub>*t*</sub>|θ<sup>*s*</sup>); matrix size = number of samples × number of trials = 1000 × 100). Trial-by-trial predictive density was computed for each subject using each posterior sample separately. Then, using Equations (11–13), we computed lppd, *p*<sub>WAIC2</sub>, and WAIC<sub>*i*</sub> for each participant, and then summed WAIC<sub>*i*</sub> over all participants for each model (**Table 3**). The R code for performing HBA and computing WAIC is available by request to the first author (Woo-Young Ahn; wooyoung.ahn@gmail.com).
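The per-participant computation in Equations 11 to 13 can be sketched as follows. This is an illustration in Python rather than the authors' R code. The function name `waic_for_participant` is hypothetical, and the input is assumed to be the samples × trials matrix of predictive densities described above, with every entry strictly positive.

```python
import math
from statistics import variance


def waic_for_participant(pred_density):
    """WAIC for one participant.

    pred_density[s][t] = p(y_t | theta^s): S posterior samples (rows)
    by T trials (columns).
    """
    S = len(pred_density)
    T = len(pred_density[0])
    lppd = 0.0
    p_waic2 = 0.0
    for t in range(T):
        column = [pred_density[s][t] for s in range(S)]
        # Equation 11: log of the posterior-averaged predictive density.
        lppd += math.log(sum(column) / S)
        # Equation 12: sample variance of the log density over samples.
        p_waic2 += variance([math.log(p) for p in column])
    # Equation 13: deviance scale.
    return -2.0 * (lppd - p_waic2)
```

Summing the returned values over all participants gives the model-level WAIC, with lower values indicating better estimated predictive accuracy.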

## *Simulation method*

We also used a simulation method to evaluate how accurately a model can generate the observed choice patterns in new and unobserved payoff sequences based on parameter values alone (Ahn et al., 2008; Fridberg et al., 2010; Steingroever et al., 2013, 2014). Using the procedure in Appendix B of Ahn et al. (2008) and individual posterior means as each subject's best-fitting parameters, we tested the simulation performance of each model. We set the maximum number of trials to 100 and used the payoff schedule of the modified IGT. We report only the results using individual posterior means, but we note that running simulations using random draws from individual posteriors (Steingroever et al., 2013, 2014) yielded very similar results (not reported for brevity).
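To make the idea of simulating choices from parameter values alone concrete, the following Python sketch generates a 100-trial choice sequence from the PVL-Delta model. It is a simplified illustration and not the Appendix B procedure of Ahn et al. (2008). The update equations follow the PVL-Delta model as it is commonly specified (prospect utility, delta learning rule, softmax choice with sensitivity θ = 3^c − 1) and should be checked against the model definitions earlier in the article. The payoff values in `TOY_DECKS` are placeholders and are not the modified IGT payoff schedule. All function and variable names are hypothetical.

```python
import math
import random

# Placeholder decks (NOT the modified IGT schedule): gain, loss, loss probability.
# Decks 0-3 stand for A-D; A and B lose on average, C and D win on average.
TOY_DECKS = {
    0: (1.0, 2.5, 0.5),
    1: (1.0, 12.5, 0.1),
    2: (0.5, 0.5, 0.5),
    3: (0.5, 2.5, 0.1),
}


def toy_payoff(deck, rng):
    """Net outcome of one card drawn from a placeholder deck."""
    gain, loss, p_loss = TOY_DECKS[deck]
    return gain - loss if rng.random() < p_loss else gain


def simulate_pvl_delta(A, alpha, c, lam, payoff, n_trials=100, rng=None):
    """Simulate deck choices (0-3) for one agent with fixed parameters.

    A: learning rate, alpha: reward sensitivity, c: response consistency,
    lam: loss aversion. payoff(deck, rng) returns the net outcome of a card.
    """
    if rng is None:
        rng = random.Random()
    theta = 3.0 ** c - 1.0          # choice sensitivity
    ev = [0.0, 0.0, 0.0, 0.0]       # expectancy of each deck
    choices = []
    for _ in range(n_trials):
        # Softmax over theta * expectancy (shifted by the max for stability).
        m = max(theta * e for e in ev)
        weights = [math.exp(theta * e - m) for e in ev]
        deck = rng.choices(range(4), weights=weights)[0]
        x = payoff(deck, rng)
        # Prospect utility: losses are scaled by the loss-aversion parameter.
        u = x ** alpha if x >= 0 else -lam * (abs(x) ** alpha)
        # Delta rule: only the chosen deck's expectancy is updated.
        ev[deck] += A * (u - ev[deck])
        choices.append(deck)
    return choices
```

In this sketch, simulation performance would be judged by running the function once per participant with that participant's posterior-mean parameters and comparing the simulated deck proportions over trials with the observed ones.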

## *Parameter recovery tests*

Using parameter recovery tests, we tested the adequacy of each model, specifically how well each model can recover true parameter values that were used to simulate synthetic data (Ahn et al., 2011; Steingroever et al., 2013). We simulated HC participants' performance on the modified IGT assuming that they behaved according to each model. We generated true parameter values based on the individual posterior means of the HC group. Then we simulated synthetic behavioral data based on the parameters, and then recovered their parameter values using the HBA described in Section Hierarchical Bayesian Parameter Estimation. See Appendix for the details.

#### *Hierarchical Bayesian multiple regression analyses*

For multiple regression analyses, often many candidate predictors are included in the model, which increases the risk of erroneously deciding that a regression coefficient is non-zero. In many cases, regression coefficients are distributed like a *t* distribution, such that the predicted variable has non-significant correlations with most candidate predictors, but a sizable relationship with only a few predictors. Also, some predictors are substantially correlated with each other, which suggests that estimating regression coefficients separately for each predictor can possibly be misleading.

We assigned a higher-level distribution across the regression coefficients of the various predictors. Specifically, regression coefficients came from a *t* distribution with parameters (mean, scale, and df) estimated from the data. Because of this hierarchical structure, estimated regression coefficients experience shrinkage and are less likely to produce false alarms. We used the program "MultiLinRegressHyperJAGS.R" from Kruschke (2011b), available at http://www.indiana.edu/~kruschke/DoingBayesianDataAnalysis/Programs/.

We used Just Another Gibbs Sampler (JAGS) for MCMC sampling and for posterior inference of regression analyses. For each analysis, a total of 50,000 samples per chain were drawn after 1,000 adaptive and 1,000 burn-in samples with three chains. For each parameter, the Gelman-Rubin test was run to confirm the convergence of the chains. Mean *R̂* values were 1.00 for all parameters.
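For readers unfamiliar with the Gelman-Rubin statistic, the sketch below computes the basic potential scale reduction factor from several chains of equal length. It is an illustration only: statistical packages typically apply additional corrections, so their reported values can differ slightly. The function name `gelman_rubin` is hypothetical.

```python
import math
import statistics


def gelman_rubin(chains):
    """Basic R-hat for one parameter.

    chains: list of m >= 2 chains, each a list of n >= 2 posterior draws.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(ch) for ch in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(ch) for ch in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)
```

Values close to 1 indicate that the chains agree with one another; chains centered on different values inflate the between-chain term and push the statistic above 1.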

#### *Bayesian estimation for group comparisons*

To estimate group differences (e.g., on behavioral performance, **Figure 1**), we used the Bayesian estimation (BEST) code available at http://www.indiana.edu/~kruschke/BEST/. The analysis is implemented in JAGS, and we drew a total of 50,000 samples after 1,000 adaptive and 1,000 burn-in samples. Mean *R̂* values were 1.00 for all parameters. For more details about BEST, see Kruschke (2013).
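Group comparisons in this article are summarized by 95% highest density intervals (HDIs) of posterior distributions. As a hedged illustration of how such an interval can be obtained from MCMC draws, the sketch below returns the narrowest window of sorted samples that contains the requested share of draws, together with the share of draws above zero. It is not the BEST code, and the function names are hypothetical.

```python
import math


def hdi(samples, cred_mass=0.95):
    """Narrowest interval containing about cred_mass of the samples."""
    s = sorted(samples)
    n = len(s)
    # Number of draws inside the interval; the small tolerance guards
    # against floating-point overshoot in cred_mass * n.
    k = max(1, math.ceil(cred_mass * n - 1e-9))
    # Slide a window of k consecutive sorted draws and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[best], s[best + k - 1]


def share_above_zero(samples):
    """Proportion of posterior draws greater than zero."""
    return sum(1 for v in samples if v > 0) / len(samples)
```

Applied to the posterior draws of a group difference, an interval that excludes zero corresponds to what the article calls a credible difference.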

## **RESULTS**

## **PARTICIPANTS' CHARACTERISTICS**

**Table 1** shows demographic and substance use characteristics of participants. The groups differed on age, such that HC individuals were younger than heroin users [95% HDI from 3.5 to 6.8, mean of HDI = 5.1; *t*(89) = 4.81, *p* = 6.11E-06] and older than amphetamine users [95% HDI from 0.1 to 3.4, mean of HDI = 1.8; *t*(84) = 2.11, *p* = 0.037], reflecting the timeline of heroin and amphetamine influx in Bulgaria. HC individuals had higher IQ than both amphetamine [95% HDI from 0.4 to 11.1, mean of HDI = 6.0; *t*(84) = 2.28, *p* = 0.025] and heroin users [95% HDI from 2.9 to 12.8, mean of HDI = 7.8; *t*(89) = 3.13, *p* = 0.002], but there was no difference between the two drug-using groups [95% HDI from −7.8 to 3.6; mean of HDI = −2.0; *t*(79) = 0.66, *p* = 0.510].

As reported in **Table 2**, the two drug using groups scored higher on trait impulsivity (BIS-11) [HC vs. Amphetamine: 95% HDI from 5.5 to 14.9, mean of HDI = 10.2; *t*(83) = 4.66, *p* = 1.19E-05; HC vs. Heroin: 95% HDI from 5.6 to 13.7, mean of HDI = 9.7; *t*(88) = 4.87, *p* = 4.90E-06] and psychopathy (PCL:SV) [HC vs. Amphetamine: 95% HDI from 4.0 to 7.7, mean of HDI = 5.8; *t*(84) = 6.49, *p* = 5.72E-09; HC vs. Heroin: 95% HDI from 7.4 to 11.1, mean of HDI = 9.3; *t*(89) = 10.62, *p* = 2.20E-16] than HC individuals. Comparisons between the two drug using groups revealed that heroin users had higher levels of psychopathy than amphetamine users [HDI from 0.8 to 5.1, mean of HDI = 3.0; *t*(79) = 2.73, *p* = 0.008]. Both amphetamine and heroin users scored higher on depression (BDI-II) [HC vs. Amphetamine: 95% HDI from −4.4 to −0.5, mean of HDI = −2.3; *t*(82) = 2.26, *p* = 0.026; HC vs. Heroin: 95% HDI from −5.8 to −1.7, mean of HDI = −3.8; *t*(88) = 3.59, *p* = 5.40E-04], state anxiety (STAI-S) [HC vs. Amphetamine: 95% HDI from −7.7 to −1.6, mean of HDI = −4.5; *t*(84) = 2.90, *p* = 4.7E-04; HC vs. Heroin: 95% HDI from −9.7 to −2.5, mean of HDI = −6.4; *t*(89) = 3.90, *p* = 1.80E-04], and trait anxiety (STAI-T) [HC vs. Amphetamine: 95% HDI from −8.5 to −0.3, mean of HDI = −4.4; *t*(84) = 2.18, *p* = 0.032; HC vs. Heroin: 95% HDI from −10.0 to −1.3, mean of HDI = −5.6; *t*(89) = 2.86, *p* = 0.005] than HC individuals. There were no differences between the two drug using groups on these measures.

#### **Table 1 | Demographic and substance use characteristics of participants.**

*<sup>a</sup>H > HC > A (Bayesian and NHST t-tests yielded the same conclusions).*

*<sup>b</sup>HC > A, H (Bayesian and NHST t-tests yielded the same conclusions).*

*<sup>c</sup>HC > A (Bayesian and NHST t-tests yielded the same conclusions).*

*<sup>d</sup>H > A > HC (Bayesian and NHST t-tests yielded the same conclusions).*

*<sup>e</sup>A > H > HC (Bayesian and NHST χ-square tests yielded the same conclusions).*

*<sup>f</sup>Sig. results are based on omnibus NHST ANOVA tests.*

#### **Table 2 | Personality and psychopathology characteristics of participants.**

*All group comparison results are based on Bayesian tests. HC, healthy controls; A, amphetamine; H, heroin; BIS, Barratt Impulsiveness Scale; PCL:SV, Psychopathy Checklist: Screening Version; BDI-II, Beck Depression Inventory-II; STAI, State Trait Anxiety Inventory.*

#### **BEHAVIORAL RESULTS**

Behavioral results revealed that the HC group made more advantageous choices than the heroin group [difference of mean net score (advantageous minus disadvantageous choices, per five blocks of 20 trials) = 2.77, 95% HDI from 0.7 to 4.8, mean of HDI = 2.8; *t*(90) = 2.80, *p* < 0.010] and marginally more than the amphetamine group [difference of mean net score = 1.14, 95% HDI from −0.1 to 2.3, mean of HDI = 1.9, with 95.3% of the posterior samples greater than 0; *t*(84) = 2.02, *p* = 0.047]. There were no behavioral differences between the two drug using groups in terms of net scores (see **Figure 1**). Further, the choice patterns of these two groups were qualitatively different from those of the HC group. As shown in Figures S1–S3 (left), whereas the HC group favored one of the advantageous decks (Deck D) as the task progressed, both amphetamine and heroin users consistently favored the disadvantageous Deck B throughout the task. Decks B and D carry low-frequency losses and are usually chosen more often than decks with high-frequency losses such as A and C, yet one is disadvantageous (Deck B) whereas the other is advantageous (Deck D). Our results demonstrate that past drug users who are currently in protracted abstinence continue to show a preference for disadvantageous decks similar to that of currently dependent drug users (Bechara et al., 2001; Yechiam et al., 2005).
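The net score used in these comparisons can be sketched as follows. The example assumes the standard IGT convention that Decks C and D are advantageous and Decks A and B are disadvantageous (the text above confirms this for Decks B and D), and it encodes choices as the letters A to D. The function name `net_scores` is hypothetical.

```python
ADVANTAGEOUS = {"C", "D"}      # assumed standard IGT convention
DISADVANTAGEOUS = {"A", "B"}


def net_scores(choices, block_size=20):
    """Advantageous minus disadvantageous choices in each block of trials.

    choices: sequence of deck labels ("A", "B", "C", or "D") in trial order.
    """
    scores = []
    for start in range(0, len(choices), block_size):
        block = choices[start:start + block_size]
        good = sum(1 for deck in block if deck in ADVANTAGEOUS)
        bad = sum(1 for deck in block if deck in DISADVANTAGEOUS)
        scores.append(good - bad)
    return scores
```

With 100 trials and the default block size, the function returns five block scores per participant, which can then be averaged within each group.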

#### **MODEL COMPARISONS RESULTS**

We first checked which model provided the best predictive accuracy, as measured by WAIC. **Table 3** presents WAIC scores for each model, summarized for each group. Note that smaller WAIC scores indicate better model-fits. As noted in **Table 3**, the VPP model provided the best model-fits relative to the other models in all groups, followed by the PVL-DecayRI. These results are consistent with previous reports from Worthy et al. (2013b).

**Table 3 | WAIC scores of each model, computed separately for each group.**

*The best-fitting model in each group is underlined.*

*HC, healthy controls; A, amphetamine; H, heroin.*

The simulation method and parameter recovery tests yielded somewhat different findings (Figures S1–S3). Consistent with previous reports (Ahn et al., 2008; Fridberg et al., 2010; Steingroever et al., 2013, 2014), the PVL-Delta model showed good simulation performance in all three groups, adequately predicting the rank order of the four decks, and good parameter recovery (Figure A3). The PVL-DecayRI model also captured the global pattern of deck preference in all groups even if it failed to fully capture the preference reversal of certain decks over trials (e.g., decks A and C in the heroin group, Figure S3). Parameter recovery tests yielded somewhat mixed results (Figure A2): *A* (decay rate) and *c* (response consistency) were recovered well, but recovery of α (reward sensitivity) and λ (loss aversion) was not as good as with the PVL-Delta. The VPP model, on the other hand, showed the worst simulation and parameter recovery performance: the model over-estimated the preference for deck C in the HC and amphetamine groups and failed to predict the preference for deck C over deck A in the heroin group. These results are inconsistent with the simulation results of Worthy et al. (2013b), in which the VPP model showed the best simulation performance. However, HC participants in Worthy et al. (2013b) continued to prefer the disadvantageous deck (Deck B) throughout the task, unlike our HC participants who preferred the advantageous Deck D. Worthy et al. (2013b) reported simulation performance by averaging choice probabilities across all trials in each deck (Figure 2A in Worthy et al., 2013b). If we use the same criterion, the VPP model performs quite well for the heroin group, in which deck B is most strongly preferred and preferences for decks A and C are similar on average. Another major difference between our study and Worthy et al. (2013b) is the parameters used for the simulation method: Worthy et al. (2013b) used MLE estimates whereas we used HBA estimates, which may lead to somewhat different simulation performance. With respect to parameter recovery (Figure A1) with the VPP model, posterior distributions of several parameters were very broad (e.g., ω) and some parameters were not well estimated (e.g., *k*), which might be attributed to its higher number of parameters compared to the two PVL models (8 vs. 4).

Next, we used the best-fitting (VPP) model to compare the three groups (**Figure 2** and **Table 4**). Heroin users displayed reduced loss aversion (λ) compared to HC [95% HDI from −1.2 to −0.2, mean of HDI = −0.7; *t*(89) = 8.33, *p* = 9.024E-13] and amphetamine users [95% HDI from 0.1 to 1.1, mean of HDI = 0.6; *t*(79) = 6.82, *p* = 1.63E-09] (see **Figure 3** for the 95% HDI of group differences between heroin and HC groups and Figures S4, S5 for the 95% HDI of group differences between amphetamine and other groups). In contrast, our hypothesis that reward sensitivity (α) would be higher in amphetamine users compared to HC was not supported. The learning rate (*A*) was marginally different between the heroin and the HC groups [95% HDI from −0.0 to 0.2, mean of HDI = 0.1; *t*(89) = 4.91, *p* = 4.08E-06, **Figure 3**].

We further checked whether the group differences we found using the best-fitting (VPP) model are consistent when tested with other models (PVL-DecayRI and PVL-Delta). **Tables 5, 6** summarize the mean group parameter estimates of the PVL-DecayRI (see Figures S6–S8 for the 95% HDI of group differences) and PVL-Delta (see Figures S9–S11 for the 95% HDI of group differences), respectively. As seen in **Figures 3**, S6, and S9, we consistently found reduced loss aversion in heroin users compared to HC, whichever model we used. The PVL-DecayRI model showed increased reward sensitivity (α parameter) in amphetamine users compared to HC [Figure S7, 95% HDI from 0.0 to 0.5, mean of HDI = 0.3; *t*(84) = 6.26, *p* = 1.53E-08], which was not replicated with other models.

Given that the groups differed on age, IQ, and education, we conducted NHST Analysis of Covariance (ANCOVA) tests to examine whether group differences on model parameters remain significant after controlling for these factors. Dependent variables were model parameter values (individual posterior means), group membership (e.g., HC vs. amphetamine groups) was the categorical independent variable, and covariates were age, IQ, and education. With any model (VPP, PVL-DecayRI, or PVL-Delta), group difference on loss aversion between heroin and HC groups remained significant [e.g., with the VPP model, *F*(1, 86) = 26.06, *p* = 1.16E-13]. The group difference on reward sensitivity between amphetamine and HC groups with the PVL-DecayRI model also remained significant [*F*(1, 81) = 46.28, *p* = 1.61E-09].

#### **EXPLORATORY ANALYSES: ASSOCIATIONS OF MODEL PARAMETERS WITH SUBSTANCE USE AND PERSONALITY CHARACTERISTICS**

Next, we examined associations of model parameters of the impaired neurocognitive processes (loss aversion for heroin users using the VPP model) with substance use characteristics (number of years of drug use, length of abstinence from primary drug, number of DSM-IV criteria met for primary drug of dependence, nicotine dependence, and past cannabis dependence), impulsive personality traits (BIS-11) and impulse-related personality disorders (PCL:SV). As noted earlier, we used hierarchical robust Bayesian multiple linear regression, which has a hyperdistribution on regression coefficients across predictors and large-tail distributions to accommodate outliers. The results showed that loss aversion in heroin users was not predicted by any variable (see Figure S12 for the robust Bayesian multiple linear regression results). With NHST, none of the regressors was significant at the *p* < 0.05 level either.

In contrast to our null findings with the VPP model, we found two associations when we used the affected parameters from the PVL-DecayRI model (loss aversion for heroin users and reward sensitivity for amphetamine users). In heroin users, loss aversion (λ) was predicted by impulsive personality traits (BIS-11 total score; mean coefficient = −0.027, 95% HDI from −0.05 to −0.00, mean of HDI = −0.03) (Figure S13). In contrast, in amphetamine users, reward sensitivity was predicted by number of years of drug use (mean coefficient = 0.042, 95% HDI of group differences from 0.01 to 0.07, mean of HDI = 0.04, see Figure S14). Other variables were not associated with model parameters. Correlational analyses with internalizing characteristics (depression and anxiety) revealed no associations with model parameters.

**Table 4 | Means and standard deviations (in parentheses) of group mean parameters with the VPP model.**

*HC, healthy controls; A, amphetamine; H, heroin.*

*<sup>a</sup>HC, A > H.*

### **DISCUSSION**

This is the first human study that uses a computational modeling approach to investigate neurocognitive functioning in relatively pure amphetamine and heroin users. Our behavioral results reveal that heroin users show more disadvantageous decision-making performance than HC; however, their performance was not different from that of amphetamine users. These results are in line with the persistent nature of decision-making deficits observed among opiate addicts in particular (Vassileva et al., 2007b; Fernández-Serrano et al., 2011; Li et al., 2013). Critically, our computational modeling findings suggest that amphetamine and heroin users may be characterized by dissociable decision-making biases even within the context of no overt behavioral differences in performance. When we compared groups using the best-fitting (VPP) model, heroin users showed reduced loss aversion relative to amphetamine users and HC. Notably, the reduced loss aversion among heroin users compared to healthy individuals was robust across all models we tested. With regard to amphetamine users, we did not find any distinct decision-making profile using the best-fitting VPP model. However, when using the PVL-DecayRI model, which had the second best model-fits in our data, amphetamine users showed greater reward sensitivity than HC. These group differences were at the outcome evaluation stage according to a recent framework of value-based decision-making (Rangel et al., 2008) and putatively reflect an emotional and activation type of self-regulation (Bickel et al., 2012).

**Table 5 | Means and standard deviations (in parentheses) of group mean parameters with the PVL-DecayRI model.**

*HC, healthy controls; A, amphetamine; H, heroin.*

*<sup>a</sup>HC < A.*

*<sup>b</sup>HC > H.*

We tested three existing cognitive models to compare the two drug user groups with HC. Consistent with previous reports (Worthy et al., 2013b), we found that the VPP model was the best-fitting model when measured by WAIC, followed by the PVL-DecayRI and the PVL-Delta. However, it should be noted that the VPP model has twice as many parameters as the other models (8 vs. 4) and showed the worst simulation and parameter recovery performance compared to the two PVL models. In contrast, Worthy et al. (2013b) reported good simulation performance for the VPP model in their dataset; however, there are two major differences between their study and ours. First, in Worthy et al. (2013b), control participants preferred the disadvantageous deck (Deck B) throughout the task, similar to the amphetamine and heroin groups in our study. Indeed, the simulation performance of the VPP model is quite good for the heroin group if we collapse trial-by-trial simulation performance over trials on each deck. Second, Worthy et al. (2013b) used MLE estimates instead of HBA estimates. Thus, it remains to be determined whether the poor simulation performance of the VPP model in our datasets is due to its over-complexity, the limited generalizability of specific behavioral patterns, or differences in the parameter estimation methods. It would also be helpful to perform external validation tests (e.g., Wallsten et al., 2005) because the parameters of a model with good model-fits do not necessarily reflect underlying psychological constructs (Riefer et al., 2002). In this study, each participant performed only up to 100 trials: even if hierarchical modeling allowed us to pool information across individuals, 100 trials might not contain enough information to reliably estimate 8 free parameters and capture true underlying psychological constructs. This limitation might be related to the fact that behaviorally the amphetamine group showed different choice patterns from the HC group but none of their model parameter values is credibly different from those of the HC group.
As seen in **Figure 2**, several parameters of the amphetamine group are "sub-optimal" compared to the HC group (e.g., ε<sub>*n*</sub>, *k*, and ω) but the group differences did not reach the threshold of credible group difference. It is possible that deficits in the amphetamine group were decomposed into several parameters, instead of into one or two parameters in the VPP model. It may be necessary and helpful to develop new models with fewer model parameters based on the psychological and neuroscience literature by using model comparison methods and performing external validation.

**Table 6 | Means and standard deviations (in parentheses) of group mean parameters with the PVL-Delta model.**

*HC, healthy controls; A, amphetamine; H, heroin.*

*<sup>a</sup>HC > H.*

There are a few previous studies using the PVL-DecayRI (Vassileva et al., 2013) or the PVL-Delta (Fridberg et al., 2010) model to study decision-making processes in drug users. Consistent with our results, both chronic (current) marijuana users (Fridberg et al., 2010) and polysubstance (former) users (Vassileva et al., 2013) showed reduced loss aversion compared to HC. On the other hand, chronic marijuana users also exhibited higher reward sensitivity, impaired learning/memory, and reduced response consistency compared to HC when tested with the PVL-Delta model (Fridberg et al., 2010). Polysubstance use was also associated with impaired learning/memory when tested with the PVL-DecayRI model (Vassileva et al., 2013). Stout et al. (2004) used the EVL model and MLE method for parameter estimation, and reported reduced attention weight to loss among current cocaine users compared to HC. In the EVL model, the *w* parameter (attention weight to loss vs. gain) incorporates both reward sensitivity and loss aversion; therefore, it is difficult to directly compare the findings from Stout et al. (2004) with our results. However, it is likely that one or both of the two processes was impaired in current cocaine users in the Stout et al. (2004) study.

It should also be noted that the mean ω parameter (RL weight) value was greater than 0.5 in all groups (**Figure 2**), suggesting that overall RL was a primary strategy in all groups. Worthy et al. (2013b) reported that the mean ω parameter of healthy individuals was 0.49, which is the mean value of MLE individual estimates. In addition to the difference in parameter estimation methods, we also found some differences in the choice patterns of the three groups. As seen in Figure S1, healthy control individuals in our study eventually preferred the advantageous deck (Deck D) as the task progressed. On the other hand, healthy individuals in Worthy et al. (2013b) continued to prefer the disadvantageous deck (Deck B) throughout the task, which was the deck preferred by both heroin and amphetamine users in our study. It remains unclear why the two drug user groups, which showed similar behavioral patterns to participants in Worthy et al. (2013b), showed ω values greater than 0.5 on average. A future study will be necessary to replicate the findings.

This is one of the very few studies that investigate amphetamine and heroin users in protracted abstinence (Ersche et al., 2005a,b; Clark et al., 2006). Our results indicate that decision-making deficits previously reported with current drug users (Bechara et al., 2001; Yechiam et al., 2005) may persist long after discontinuation of drug use and appear particularly pronounced in heroin users. These deficits and decision-making biases may have existed prior to onset of drug use and thereby could have contributed to an increased susceptibility to develop addiction, in line with longitudinal studies with adolescents, which show that poor response inhibition and behavioral dysfunction often precede onset of drug use and contribute to the development of addiction (Nigg et al., 2006; Wong et al., 2006). Alternatively, these deficits and biases may reflect residual, enduring and possibly irreversible effects of chronic drug use; or an interaction between pre-existing predispositions and residual effects of drugs of abuse. Although our study revealed some dissociable decision-making biases in amphetamine and heroin users, our design does not allow us to determine whether they precede onset of drug use or whether they are consequences of chronic drug use. This crucial question should be investigated by future carefully designed prospective studies.

Using the second best-fitting PVL-DecayRI model, we found that the distinct decision-making style of heroin users characterized by reduced sensitivity to loss is associated with elevated trait impulsivity, as hypothesized. These findings are in line with reports that personality variables are related to decision-making performance on the IGT among heroin users on OST (Lemenager et al., 2011). Our results indicate that similar associations are observable among heroin users in protracted abstinence who are not on OST. Speculatively, given the persistent nature of personality traits such as impulsivity, which develop early and typically prior to onset of substance dependence, the reduced loss aversion in heroin users may have predated the development of addiction and may be of etiological significance for addiction to opiates in particular. In contrast, the decision-making bias displayed by stimulant users (reward sensitivity) was not associated with personality traits but was instead related to duration of stimulant use, which suggests that such biases may potentially reflect cumulative residual effects of chronic stimulant use. It is important to emphasize that we should exercise caution when interpreting these associations, as they were not replicated with the best-fitting (VPP) model.

A question arises as to the clinical significance of the observed decision-making biases and deficits within the context of our participants' history of protracted abstinence, which is the standard metric of success of most addiction treatment programs. Specifically, despite the observed decision-making deficits and biases among the two drug user groups, the majority of our participants have been remarkably successful in maintaining abstinence for long periods of time and without the help of any substitution therapy. In essence, the ability of our participants to abstain for such protracted periods of time suggests that this group could comprise some of the least impulsive SDI, expected to display more adaptive decision-making abilities than SDI who are unable to remain abstinent for long. Future studies should determine the real-life significance of such decision-making deficits and biases and the role they play in the protracted abstinence stage. For example, we recently reported that some decision-making biases may have functional significance for HIV-infected women with a history of illicit drug use, among whom they may be related to risky sexual behaviors and reduced adherence to HIV medication dosing schedules (Vassileva et al., 2013). Similarly, we recently found that a composite neurocognitive index of reward-based decision-making (which includes the IGT) predicts recent (past 30-days) sexual HIV risk behaviors in heroin and amphetamine users in protracted abstinence (Wilson et al., under review). Overall, our results suggest that decision-making processes other than the ones we examined may be more relevant for the successful and prolonged maintenance of a state of abstinence. Further, our findings may be specific to decision-making under uncertainty and ambiguity, as measured by the IGT.
It is possible that SDI in protracted abstinence may display intact functioning in other aspects of decision-making (e.g., decisions under risk) that may have more direct relevance to the successful maintenance of abstinence. On the other hand, the fact that such decision-making deficits and biases were observed in participants who have successfully maintained prolonged abstinence raises the question of whether users who are unable to maintain long-term abstinence are characterized by even more aberrant decision-making profiles. It would be crucial for future studies to determine how "successful" long-term abstainers such as our participants compare to currently active SDI or to SDI who are unable to abstain from drug use. Future studies should also determine whether similar substance-specific biases are observable in opiate and stimulant users at other stages of the addiction cycle and ideally employ longitudinal designs to determine whether they are precursors or consequences of chronic substance use.

While clearly of theoretical significance, the extent to which our findings have implications for prevention and intervention remains to be determined. If replicated by future studies, such decision-making deficits and biases may inform treatment and recovery programs for opiate and stimulant dependent individuals. Within this context, pre-treatment decision-making assessments may represent a useful adjunct to help formulate personalized treatment plans (Baldacchino et al., 2012), which could potentially include cognitive enhancement or training that have shown some promising results (Nutt et al., 2007; Bickel et al., 2011). Our results from the PVL-DecayRI model suggest that interventions that target reduced loss aversion (punishment sensitivity) may be more suitable for heroin users, whereas others addressing increased reward sensitivity may hold promise with amphetamine users, though we should exercise caution with the latter, which failed to replicate with the best-fitting model.

There are a number of limitations that need to be considered when evaluating the current findings. First, the fact that our participants were predominantly male should be taken into account when considering the generalizability of our findings to females. Second, our findings could have been influenced by group differences in age, IQ, and education, though the reduced loss aversion in heroin users and the increased reward sensitivity in the amphetamine group (with the PVL-DecayRI model) relative to HC remained robust even after controlling for those factors. Third, computational modeling parameter estimates, like many conceptual or quantitative interpretive tools, are useful heuristics in the evaluation of observed behavior patterns, not explanatory mechanisms of the phenomena at hand. Interpretations should be rendered accordingly, though the reduced loss aversion in heroin users was robust across all models we tested.

In sum, by recruiting relatively pure amphetamine and heroin users in protracted abstinence and by parcellating their decision-making performance into distinct neurocognitive processes by using computational modeling and Bayesian tools, we revealed that heroin users displayed reduced loss aversion relative to HC while being in protracted abstinence. Future studies utilizing other experimental paradigms probing different aspects of decision-making and computational models will be necessary to examine which mechanisms may be at play in the decision-making performance of heroin and amphetamine users at different stages of the addiction cycle.

## **ACKNOWLEDGMENTS**

The authors thank all volunteers for their participation in this study; Kiril Bozgunov, Rada Naslednikova, and Ivaylo Raynov for testing study participants; Warren K. Bickel and reviewers of earlier drafts for helpful comments on the manuscript; and Helen Steingroever for her feedback on our implementation of WAIC. This publication was made possible by R01DA021421 grant from the National Institute on Drug Abuse (NIDA) and the Fogarty International Center (FIC) to Jasmin Vassileva. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health (NIH).

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpsyg.2014.00849/abstract

## **REFERENCES**


Degroot, M. (1970). *Optimal Statistical Decisions*. New York, NY: McGraw-Hill.


Li, X., Zhang, F., Zhou, Y., Zhang, M., Wang, X., and Shen, M. (2013). Decision-making deficits are still present in heroin abusers after short- to long-term abstinence. *Drug Alcohol Depend.* 130, 61–67. doi: 10.1016/j.drugalcdep.2012.10.012

Luce, R. (1959). *Individual Choice Behavior.* New York, NY: Wiley.

Nigg, J. T., Wong, M. M., Martel, M. M., Jester, J. M., Puttler, L. I., Glass, J. M., et al. (2006). Poor response inhibition as a predictor of problem drinking and illicit drug use in adolescents at risk for alcoholism and other substance use disorders. *J. Am. Acad. Child Adolesc. Psychiatry* 45, 468–475. doi: 10.1097/01.chi.0000199028.76452.a9

Nutt, D., Robbins, T., and Stimson, G. (2007). *Drugs futures 2025? Drugs and the Future: Brain Science, Addiction and Society*. London: Academic Press.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 11 April 2014; accepted: 17 July 2014; published online: 12 August 2014. Citation: Ahn W-Y, Vasilev G, Lee S-H, Busemeyer JR, Kruschke JK, Bechara A and Vassileva J (2014) Decision-making in stimulant and opiate addicts in protracted abstinence: evidence from computational modeling with pure users. Front. Psychol. 5:849. doi: 10.3389/fpsyg.2014.00849*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Ahn, Vasilev, Lee, Busemeyer, Kruschke, Bechara and Vassileva. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# It's All in How You Think About It: Construal Level and the Iowa Gambling Task

## Bradley M. Okdie\*, Melissa T. Buelow and Kurstie Bevelhymer-Rangel

Psychology, The Ohio State University at Newark, Newark, OH, USA

Recent research has identified a number of factors that can influence performance on the Iowa Gambling Task (IGT) when it is used in clinical or research settings. The current studies examine the effects of construal level theory (CLT) on the IGT. Study 1 suggests that when primed with a high construal mindset (i.e., thinking abstractly vs. concretely), individuals learned to avoid Deck A more than those primed with a low construal mindset. Study 2 suggests that when construal level is manipulated through psychological distance (i.e., selecting for a close vs. distant friend), individuals in a high construal mindset instead showed a preference for Deck A compared to individuals in a low construal mindset or a control group. Taken together, these studies suggest that IGT performance is impacted by the manner in which one construes the task. Implications for decision making research and use of the IGT as a clinical and research instrument are discussed.

#### Edited by:

Ching-Hung Lin, Kaohsiung Medical University, Taiwan

#### Reviewed by:

Varsha Singh, Indian Institute of Technology Delhi, India; Mathieu Cassotti, Université Paris Descartes-Sorbonne Paris Cité, France

> \*Correspondence: Bradley M. Okdie okdie.2@osu.edu

#### Specialty section:

This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Neuroscience

Received: 08 July 2015 Accepted: 07 January 2016 Published: 22 January 2016

#### Citation:

Okdie BM, Buelow MT and Bevelhymer-Rangel K (2016) It's All in How You Think About It: Construal Level and the Iowa Gambling Task. Front. Neurosci. 10:2. doi: 10.3389/fnins.2016.00002

Keywords: decision making, construal level theory, Iowa Gambling Task, learning

## INTRODUCTION

People make decisions daily, from seemingly mundane choices like what to wear to major life decisions like who to marry or what career path to take. Whether mundane or life-changing, the decisions we make define who we were, who we are, and who we will be. Individuals who choose advantageously reap the benefits of those decisions, while those who choose disadvantageously are often left wondering how they arrived at their current state. Many measures exist that assess different types of decision making, some of which also purport to predict who will decide advantageously and who will decide disadvantageously. The Iowa Gambling Task (IGT; Bechara et al., 1994) is widely used by clinicians and researchers alike, and examines both advantageous and disadvantageous selections under ambiguity and risk.

Recent research has called into question the utility of using the IGT in isolation as a clinical measure of decision making, as significant fluctuations in performance occur in healthy control populations and the precise decision making measured by the task is still debated (for review, see Steingroever et al., 2013). Prior research has highlighted how contextual factors such as negative mood and extra learning trials can improve performance on the IGT (Buelow et al., 2013, 2014), while the anticipation of a public speaking task can decrease performance on the IGT (Preston et al., 2007), suggesting that contextual factors could mask or otherwise interfere with the assessment of the individual's actual (baseline) decision making ability. Additionally, the IGT scoring criteria can affect interpretation of selections on the task (see below for additional detail). Taken together, these contextual and scoring factors can affect performance on behavioral measures of decision making. Moreover, multiple studies have shown few significant correlations between performance on the IGT and performance on other behavioral decision making and problem solving tasks (Lejuez et al., 2003; Overman et al., 2004; Aklin et al., 2005; Skeel et al., 2007; Buelow and Blaine, 2015), leading to questions regarding whether the task is sensitive only to decision making impairments. The IGT (Bechara, 2007) is used by clinicians to assess risk-taking behavior and decision making. Understanding the different contextual factors that could influence performance on this task will help guide the improvement of the clinical assessment of decision making. Given this, the present set of studies sought to examine whether the manner in which the task is construed (abstractly or concretely) might affect outcomes on the IGT.

## THE IOWA GAMBLING TASK AS A CLINICAL MEASURE OF DECISION MAKING

The IGT is one of the most widely cited behavioral decision making tasks in the literature and has been adapted into a clinical assessment instrument (Bechara, 2007) based on the Bechara et al. (2001) test revision. The task was initially designed to assess decision making impairments in individuals with ventromedial prefrontal cortex damage who showed real-world decision making failures but scored within the normal range on formal measures of executive functioning (Bechara et al., 1994). Although originally designed for individuals with focal lesions, research has shown the IGT is sensitive to impairments due to head injury, amygdala damage, bipolar disorder, obsessive compulsive disorder, pathological gambling, substance abuse and dependence, and attention-deficit/hyperactivity disorder (see Buelow and Suhr, 2009, for discussion). Typically, poor performance on the IGT (i.e., choosing disadvantageously) is associated with the presence of these and other neurological or psychological diagnoses.

On the IGT, participants are given a loan of \$2000 and told to maximize profit over 100 trials by selecting from one of four decks of cards: A, B, C, or D. On each trial, participants always experience a win but sometimes also experience a loss. With some decks, those losses can outweigh the benefits of the immediate reward. Decks A and B yield an average immediate reward of \$100 per selection and Decks C and D yield an average immediate reward of \$50 per selection (Bechara, 2007). After 10 selections from Decks A or B, individuals have incurred a net loss of \$250; however, after 10 selections from Decks C or D, individuals instead have earned a net gain of \$250 (Bechara, 2007). From these differences in long-term outcomes, Decks A and B have been considered "disadvantageous" and Decks C and D "advantageous" (Bechara, 2007). Within each pair, the decks also differ in the frequency of losses. Selections from Decks A and C experience losses on 50% of trials, whereas selections from Decks B and D experience losses on only 10% of trials (Bechara, 2007). These differences in frequency of losses may explain why a significant subset of "healthy control" participants exhibit a preference for Deck B, with high immediate rewards, a low frequency of losses, but long-term negative outcomes (Toplak et al., 2005; Caroselli et al., 2006; Fernie and Tunney, 2006; Buelow et al., 2013).
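The payoff structure described above can be made concrete with a little bookkeeping. The following is a minimal sketch, not the task's actual card schedule: it assumes that the \$100 and \$50 figures are the win received on every card, that the ±\$250 figures are net outcomes per 10 cards, and that each run of 10 cards contains exactly the stated proportion of losses. The names `DECKS` and `deck_summary` are illustrative only.

```python
# Per-10-card figures as quoted in the text (Bechara, 2007).
# Assumption: "win" is received on every card; "net_per_10" is wins minus losses.
DECKS = {
    "A": {"win": 100, "net_per_10": -250, "loss_freq": 0.5},
    "B": {"win": 100, "net_per_10": -250, "loss_freq": 0.1},
    "C": {"win": 50, "net_per_10": 250, "loss_freq": 0.5},
    "D": {"win": 50, "net_per_10": 250, "loss_freq": 0.1},
}


def deck_summary(deck):
    """Derive long-run value and typical loss size for one deck."""
    d = DECKS[deck]
    total_wins = 10 * d["win"]
    # Losses are whatever turns the total wins into the stated net outcome.
    total_losses = total_wins - d["net_per_10"]
    # Assumption: losses fall on exactly this many of every 10 cards.
    n_losses = round(10 * d["loss_freq"])
    return {
        "expected_value_per_card": d["net_per_10"] / 10,
        "total_losses_per_10": total_losses,
        "mean_loss_when_it_occurs": total_losses / n_losses,
    }


if __name__ == "__main__":
    for name in "ABCD":
        print(name, deck_summary(name))
```

Under these assumptions, Decks A and B share the same negative long-run value, but Deck B concentrates its losses into one large penalty per 10 cards, which is consistent with the observation that infrequent losses can make Deck B attractive.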

The IGT creators put forth that the task was sensitive to affective decision making (i.e., gut feelings and intuition; Bechara et al., 1994; Damasio, 1994; Seguin et al., 2007). Research supports this idea, indicating that individuals develop somatic markers in response to disadvantageous selections on the task that help guide performance (Bechara et al., 1996, 1997; Crone et al., 2004). However, recent research on the somatic marker hypothesis is mixed (Dunn et al., 2006), with some research suggesting that the IGT assesses both affective and deliberative decision making at different points in the task (Maia and McClelland, 2004; Brand et al., 2007; Guillaume et al., 2009; Schiebener et al., 2011). Although the precise decision making processes involved on the task are still being understood, most researchers agree early trials assess affective decision making while later trials assess deliberative decision making (Dunn et al., 2006; Wood and Bechara, 2014; Buelow and Blaine, 2015). To help further differentiate these two sets of trials, Brand et al. (2007) found that decision making during the early trials ("decision making under ambiguity") is guided by gut feelings and instincts, while during the later trials ("decision making under risk") participants have learned enough about the decks to estimate the relative risks and benefits of each. Supporting this distinction is the failure of the IGT to consistently correlate with other affective decision making tasks such as the Columbia Card Task-hot (CCT-hot; Figner and Voelki, 2004) and the Balloon Analog Risk Task (BART; Lejuez et al., 2002), suggesting that it measures a unique type of decision making not assessed in other decision making tasks. When factor analyzed, the IGT held as a unique factor in a model with the BART and CCT (Buelow and Blaine, 2015).

Despite the IGT's wide use in research and clinical practice, recent research has called into question its use as a stand-alone tool for investigating clinical decision making. Some have argued that the IGT can be influenced by different factors, including age (Blair et al., 2001; Crone and van der Molen, 2004; Kerr and Zelazo, 2004; Denburg et al., 2006; Fein et al., 2007; Garon and Moore, 2007), gender (Reavis and Overman, 2001; Bolla et al., 2005; Davis et al., 2007; Goudriaan et al., 2007; Businelle et al., 2008; van den Bos et al., 2013), personality (Addison and Schmidt, 1999; van Honk et al., 2002; Crone et al., 2003; Franken and Muris, 2005; Suhr and Tsanadis, 2007; Buelow and Suhr, 2013), extra learning trials (Buelow et al., 2013, 2014; Lin et al., 2013), and mood (Must et al., 2006; Suhr and Tsanadis, 2007; Buelow et al., 2013). It is important to acknowledge that contextual factors likely affect many clinical measures. However, the effects of contextual factors become paramount when a lack of agreement exists on what a specific instrument, such as the IGT, is truly measuring. Understanding what and how contextual factors affect performance is an important way to gain knowledge of the test's ability to measure what it was designed to measure. To fully understand an individual's decision making processes, assessment of the construct should be minimally sensitive to contextual factors. Although some of these factors may be more consistent across time (i.e., gender, personality) than others (i.e., age, mood), some factors (i.e., extra trials) are products of the testing situation. It is possible that the individual's mindset—not just current mood—may affect testing performance on the IGT. Despite these known limitations of the IGT and inconsistencies in how it is scored, it has been put forth as a clinical assessment instrument; however, no other behavioral decision making tasks have been adapted for clinical use alongside the IGT. 
It is important, then, to understand the different factors that influence performance on this task.

## IGT AND CONSTRUAL LEVEL

As previously stated, the IGT can be broken down into disadvantageous (Decks A and B) and advantageous (Decks C and D) deck choices (Bechara et al., 1994), with advantageous decision making dependent on consideration of long-term rather than short-term outcomes (as the authors originally intended). However, this masks differences in the frequency of wins and losses between Decks A and B and between Decks C and D. Although Decks A and B have similar long-term outcomes, selecting from Deck A results in losses on 50% of trials whereas Deck B experiences losses on only 10% of trials (Bechara, 2007). The magnitude of losses is therefore greater with Deck B than Deck A. A similar pattern emerges for Decks C (50% losses) and D (10% losses, greater magnitude of losses). Thus, when selections from each deck are analyzed independently, the IGT can also be conceptualized as a choice between high and low frequency of wins and losses (Caroselli et al., 2006; Lin et al., 2007, 2009; Chiu et al., 2008). Due to these differing foci of attention (long-term outcomes vs. frequency of losses), the IGT manual now refers to selections from Deck B as a non-optimal decision making strategy but continued selection from Deck A as indicative of pathological decision making (Bechara, 2007). Not examining the individual deck selections separately may result in someone who chooses from Deck B, to minimize frequency of losses, labeled as just as disadvantageous a decision maker as someone who continually selects from Deck A despite the negative outcomes. One contextual factor that has not been investigated with the IGT, but that may affect outcomes on individual deck selections, is the manner in which individuals construe the task.

Individuals often imagine future situations in both the near and distant future. For example, an individual may imagine what their life might be like next week or in 5 years. Considerable research suggests that the process by which an individual mentally imagines near and distant future events differs, leading to variable outcomes in such domains as category breadth (Liberman et al., 2002, study 1), dimensionality of future representations (Liberman et al., 2002, study 2), and optimism (Taylor and Brown, 1988). Construal level theory (CLT; Trope and Liberman, 2003, 2010) suggests that when individuals imagine future events, they create abstract mental representations that vary to the extent that the imagined future is near or distant. Thus, CLT suggests that individuals can construe future events abstractly (high level construal) or concretely (low level construal). Research indicates that individuals will construe near future events using a low level construal and distant future events using a higher-level construal (Trope and Liberman, 2010). Moreover, CLT suggests that the use of a high level construal increases with psychological distance (Semin and Fiedler, 1991; Fujita et al., 2006a). Construal level differences have also been shown using other types of psychological distance, such as temporal and social distance (Liberman et al., 2007). For example, Fujita et al. (2006a) had participants watch a video of two students interacting and informed them that the students were either physically near to or physically distant from the participant. Participants then provided a written description of the activity in the video. Results indicated that descriptions from those in the physically distant condition contained more abstract language than descriptions from those in the physically near condition, suggesting that participants construed the activity in the video with greater abstraction when they believed the location was distal rather than proximal.

Applied to the IGT, imagining that one is earning money for a distant acquaintance (high level construal) should lead to advantageous decision making while imagining earning money for a close friend (low level construal) should lead to disadvantageous decision making. Although the task directions do not indicate the individual should imagine earning money for a close friend, it is possible that recent knowledge of a friend's financial hardships could weigh on the individual's mind, in turn affecting performance on the IGT. Alternatively, if one actively engaged in abstract thought prior to taking the IGT, that construal process could transfer to the IGT affecting outcomes (see Smith and Branscombe, 1988; Gollwitzer and Kinney, 1989; Förster et al., 2004; for examples using non-IGT tasks).

CLT also posits that when individuals are in a high construal mindset they are more likely to rely on their internal values compared to when they are in a low construal mindset (Sagristano et al., unpublished manuscript). For example, when students are informed that a potential class is temporally distant (e.g., will take place in 1 year) they are more likely to focus on whether or not the professor treats students with respect. However, when students believe that a potential class is temporally close (e.g., will start in a few days) they are more likely to focus on things such as the professor's typical grade distribution (Kivetz and Tyler, 2007; Torelli and Kaikati, 2009). Moreover, individuals are more likely to endorse attitude-consistent behavior when they are imagining the behaviors in the distant rather than near future (Fujita et al., 2006b; Sagristano et al., unpublished manuscript). Thus, decisions made in the future are more likely to be guided by and reflect one's internal values and desires while decisions made in the present are more likely to reflect specific features of the decision (e.g., contextual and situational factors).

Given the extent to which different construal mindsets can affect the processing of information, it stands to reason that these psychological states may affect performance on decision making tasks in which differences in long-term vs. short-term outcomes are present, such as the IGT (e.g., Buelow et al., 2013), the Delay Discounting Task (e.g., Kirby and Herrnstein, 1995; Bickel et al., 1999; Kirby et al., 1999, 2005), and the BART (e.g., Acheson et al., 2007; Benjamin and Robbins, 2007; Lejuez et al., 2002). That is, individuals in a high construal mindset should be more likely to act in accordance with their internal preferences and values, leading to more advantageous choices on decision making tasks when the goal of the task is congruent with a focus on long-term outcomes (e.g., winning more money on the IGT). More importantly, those in a high construal state of mind are more likely to focus on abstractions (rather than specifics) leading to more advantageous decision making on the IGT and other tasks in which short- and long-term outcomes are available. That is, individuals in a high construal mindset are likely to ignore concrete and specific aspects of the decision making task and focus instead on core and stable elements of the task. In relation to the IGT, individuals in a high construal mindset may ignore the immediate consequence of their actions (e.g., large win vs. large loss) and instead focus on the more abstract goal-congruent long-term consequences (e.g., earn more money). One could also consider that decision making on tasks such as the IGT requires that one parse several pieces of information to learn to choose advantageously on the task. During the IGT, individuals are presented with win vs. loss information, but also large vs. small immediate outcomes. 
The ability of individuals to choose advantageously on the task requires that they ignore some information (large immediate rewards that lead to less money overall—favorable short-term outcomes) in favor of other information (small immediate rewards that lead to more money overall—favorable long-term outcomes). The ability to ignore irrelevant information to engage in goal-directed behavior should be enhanced when one is in a high (vs. low) construal mindset. Previous research has suggested that individuals in a high construal mindset focus more on attaining goals, whereas individuals in a low construal mindset focus more on avoiding losses (Pennington and Roese, 2003; Förster and Higgins, 2005; Lee et al., 2010). In addition, high construal level is associated with a focus on the "pros" or the positives of a given decision (Eyal et al., 2004), indicating participants in this mindset may begin to show a preference for the advantageous decks as information is learned and the task progresses. In a recent study, Lermer et al. (2015) found greater risk-taking on the BART among individuals in a high-construal mindset; however, no learning of risks and benefits associated with decisions occurs on the BART compared to the IGT.

## THE PRESENT STUDIES

The present studies examined whether the manner in which one construes the task affects performance on the IGT. Previous research has shown that some contextual variables can affect performance, and it is possible that the mindset one is in during testing—high vs. low construal—can also affect performance. We believe that construal mindset serves as an attribute of the decision making process. Understanding all of the mechanisms negatively affecting IGT performance is important as the IGT is frequently used by researchers and clinicians alike. Across two studies, participants were primed with a high- or low-level construal procedural task, or received no prime, and then completed the IGT. We predicted that those primed with a high construal mindset would choose more advantageously on the IGT compared to those primed with a low level construal mindset.

## STUDY 1 METHOD

## Participants

Participants were 90 undergraduate students (58 females) at a regional campus of a large Midwestern university, ages 18–33 (M = 19.00, SD = 2.30), who were enrolled in General Psychology courses. Most (67.4%) self-identified as Caucasian.

## Measures

## Iowa Gambling Task (IGT)

In the present study, the IGT version available through PAR, Inc. was utilized (Bechara, 2007). This version of the IGT is based on the revised task described in Bechara et al. (2001). The task has been validated in various patient and non-patient samples (see Buelow and Suhr, 2009, for a review). Based on the scoring issues outlined previously and the confounding of long-term outcomes and frequency of losses on the decks, the percent selections from each individual deck (A, B, C, D) during decision making under ambiguity (Trials 1–40) and decision making under risk (Trials 41–100; Brand et al., 2007) were calculated.
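The scoring just described can be sketched in a few lines. This is an illustration rather than the scoring routine of the published software: it assumes each participant's record is a list of 100 deck labels in trial order, and the names `BLOCKS` and `percent_selections` are hypothetical.

```python
from collections import Counter

# Trial ranges taken from the text: Trials 1-40 and Trials 41-100.
BLOCKS = {"ambiguity": range(1, 41), "risk": range(41, 101)}


def percent_selections(choices):
    """Return percent selections from each deck within each block.

    choices: sequence of 100 deck labels ("A", "B", "C", "D") in trial order.
    """
    if len(choices) != 100:
        raise ValueError("expected exactly 100 trials")
    result = {}
    for block_name, trials in BLOCKS.items():
        # Trials are 1-indexed in the text, lists are 0-indexed in Python.
        picks = [choices[t - 1] for t in trials]
        counts = Counter(picks)
        result[block_name] = {
            deck: 100 * counts[deck] / len(picks) for deck in "ABCD"
        }
    return result


if __name__ == "__main__":
    example = ["A"] * 20 + ["B"] * 20 + ["D"] * 60
    print(percent_selections(example))
```

Because the two blocks differ in length (40 vs. 60 trials), working with percentages rather than raw counts keeps the block scores comparable.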

## Procedure

The present study was approved by the University's Institutional Review Board. Following informed consent, participants were randomly assigned to complete a construal level manipulation utilizing procedural mindset priming—the categories vs. exemplars task (Fujita et al., 2006b). In this task, participants are provided with 40 stimulus words and asked to repeatedly engage in high- or low-level construal processing by generating a superordinate category label (high construal; n = 30 participants) or subordinate exemplar (low construal; n = 30 participants) for each such that the induced mindset transfers to a subsequent task (see Smith and Branscombe, 1988; Gollwitzer and Kinney, 1989; Förster et al., 2004; for examples). For example, those induced into a low construal mindset might be given the word "soda" and provide "Coke" as their response, while those induced to a high construal mindset may provide "beverage" as their response. Immediately following this manipulation, participants completed the IGT.

An additional 30 participants made up a control group who did not complete a construal manipulation. These participants were drawn from a separate study conducted concurrently with the present study, in which participants did not receive a manipulation prior to the IGT (Bevelhymer-Rangel and Buelow, 2014).

## Data Analysis

Data were first examined for between-groups differences in demographic variables. Repeated-measures ANOVAs were conducted on the IGT deck selections, with study condition (high construal, low construal, control) as the between-subjects variable and block (Block 1: Trials 1–40; Block 2: Trials 41–100) as the repeated-measures variable.
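For a design with one between-subjects factor and a two-level repeated factor, the three F ratios of the repeated-measures ANOVA can be obtained from simple one-way computations. The sketch below is illustrative only and is not the authors' analysis code: it assumes equal group sizes (as in the 30/30/30 design), computes F statistics and degrees of freedom but not p-values, and uses hypothetical function names.

```python
from statistics import mean


def one_way_f(groups):
    """One-way ANOVA on a list of score lists: returns (F, df_between, df_within)."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


def mixed_anova_two_blocks(data):
    """data maps group name -> list of (block1, block2) scores per participant.

    Assumes equal group sizes and exactly two blocks.
    """
    groups = list(data.values())
    # Group main effect: one-way ANOVA on each participant's mean across blocks.
    f_group = one_way_f([[(b1 + b2) / 2 for b1, b2 in g] for g in groups])
    # Block x Group interaction: one-way ANOVA on Block 2 minus Block 1 differences.
    diffs = [[b2 - b1 for b1, b2 in g] for g in groups]
    f_interaction = one_way_f(diffs)
    # Block main effect: grand mean difference tested against the pooled
    # within-group variance of the difference scores.
    all_diffs = [d for g in diffs for d in g]
    df_error = len(all_diffs) - len(groups)
    ms_error = sum((d - mean(g)) ** 2 for g in diffs for d in g) / df_error
    f_block = (len(all_diffs) * mean(all_diffs) ** 2 / ms_error, 1, df_error)
    return {"block": f_block, "group": f_group, "interaction": f_interaction}


if __name__ == "__main__":
    toy = {  # made-up scores for illustration, not study data
        "high": [(30, 10), (20, 10)],
        "low": [(40, 30), (30, 30)],
        "control": [(30, 20), (40, 20)],
    }
    print(mixed_anova_two_blocks(toy))
```

With three groups of 30 participants, this layout gives degrees of freedom of (1, 87) for block and (2, 87) for group and for the interaction, matching the form of the statistics reported in the results. A statistics package would normally be used in practice, since it also supplies p-values and effect sizes.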

## STUDY 1 RESULTS AND CONCLUSIONS

There was not a significant difference between the three groups in terms of age, F(2, 86) = 2.67, p = 0.08; or gender, χ²(2, N = 90) = 3.01, p = 0.22. For Deck A, there was a significant main effect of block, F(1, 87) = 47.96, p < 0.001, in that participants selected more from Deck A in Block 1 than Block 2 (see **Table 1** for means, standard deviations, and effect sizes). Thus, learning occurred on the IGT, as participants shifted their decisions away from the most disadvantageous deck. There was also a significant main effect of group, F(2, 87) = 3.63, p = 0.03. Participants in the Low Construal group selected significantly more from Deck A than participants in the High Construal group, p = 0.01. In addition, the control group selected marginally more from Deck A than participants in the High Construal group, p = 0.067. The interaction was not significant for Deck A, F(2, 87) = 0.13, p = 0.88.

For Deck B, none of the main [Block: F(1, 87) = 0.82, p = 0.37; Group: F(2, 87) = 0.99, p = 0.38] or interaction, F(2, 87) = 0.02, p = 0.98, effects were significant. Similarly, none of the main [Block: F(1, 87) = 1.21, p = 0.28; Group: F(2, 87) = 0.04, p = 0.96] or interaction, F(2, 87) = 0.16, p = 0.85, effects were significant for Deck C.

Finally, for Deck D, the main effect of Block was significant, F(1, 87) = 12.24, p = 0.001. Participants selected significantly more from Deck D during Block 2 than Block 1, p = 0.001. Neither the main effect of group, F(2, 87) = 0.06, p = 0.94, nor the interaction effect, F(2, 87) = 0.04, p = 0.96, was significant.


B, Block; G, Group; B×G, Block × Group Interaction; Deck, percent selections from Decks A, B, C, and D on Block 1 (Trials 1–40) and Block 2 (Trials 41–100).

Taken together, the present results indicate no significant interaction between group and IGT block; however, significant main effects emerged on Decks A and D. Specifically, learning occurred on the task: participants, independent of group, learned to avoid Deck A and select from Deck D as the task progressed. These results are consistent with previous research showing the IGT does not solely rely on affective decision making processes, and instead learning can occur (Maia and McClelland, 2004; Dunn et al., 2006; Brand et al., 2007; Guillaume et al., 2009; Schiebener et al., 2011; Wood and Bechara, 2014; Buelow and Blaine, 2015). A significant main effect of construal level group emerged on Deck A, in that participants primed with a low construal mindset selected significantly more from Deck A (independent of block) than participants in a high construal mindset. In addition, the high construal group selected marginally less from Deck A than the control group. The results of Study 1 provide initial evidence that a high-level construal mindset can lead to a more advantageous decision making strategy on the IGT, in that participants learned to avoid Deck A—a deck associated with more pathological decision making (Bechara, 2007). However, to ensure that a high-level construal mindset led to changes in decision making strategy, we decided to replicate Study 1 using a different type of construal manipulation in which participants were asked to earn money on the IGT for a mere acquaintance (high-level construal) or a close friend (low-level construal). In a recent study, Kim et al. (2013) found that the greater the perceived psychological distance from a person, the more advantageous decisions were made. In the present study, it was hypothesized that participants in the high construal condition would again select more advantageously than those in the low construal condition.

## STUDY 2 METHOD

## Participants

Participants were 90 undergraduate students (44 females) at a regional campus of a large Midwestern University who were 18–34 years old (M = 18.92, SD = 1.99) and were enrolled in General Psychology courses. Most (72.4%) self-identified as Caucasian.

## Measures

## Iowa Gambling Task (IGT)

The standard computerized IGT was again utilized (Bechara, 2007). The percent selections from each individual deck during early (Trials 1–40) and later (Trials 41–100) trials were calculated.

## Procedure

The present study was approved by the University's Institutional Review Board. Following informed consent, participants were randomly assigned to complete a different construal level manipulation than in Study 1. This construal manipulation involved manipulating psychological distance rather than procedural mindset (Kim et al., 2013). Participants were either asked to think of someone very close to them—a close friend or family member (low construal; n = 30 participants)—or to think of someone not very close to them—a mere acquaintance (high construal; n = 30 participants). Participants in both groups were then asked to make decisions on the IGT as if they were earning money for that individual instead of themselves.

An additional 30 participants made up a control group who did not complete a construal manipulation. These participants were drawn from a separate study conducted concurrently with the present study, in which participants did not receive a manipulation prior to the IGT, and were distinct from the control group utilized in Study 1 (Bevelhymer-Rangel and Buelow, 2014).

## Data Analysis

Data were first examined for between-groups differences in demographic variables. Repeated-measures ANOVAs were conducted on the IGT deck selections, with study condition (high construal, low construal, control) as the between-subjects variable and block (Block 1: Trials 1–40; Block 2: Trials 41–100) as the repeated-measures variable.

## STUDY 2 RESULTS AND CONCLUSIONS

There was not a difference between groups in terms of gender, χ²(2, N = 89) = 5.05, p = 0.08. There was, however, a significant difference in age, F(2, 85) = 3.33, p = 0.04. The control group (M = 18.18, SD = 0.48) was significantly younger than the high construal group (M = 19.47, SD = 2.93), p = 0.04. Due to this age difference, correlations were calculated between age and IGT performance. Age was significantly correlated with Block 1 selections for Deck C, r(86) = 0.26, p = 0.02, and Deck D, r(86) = −0.23, p = 0.03, only. Given this minimal relationship between age and performance on the IGT and our small sample size, we elected not to include age as a covariate in the remaining analyses.

For Deck A, there was a significant main effect of Block, F(1, 87) = 36.15, p < 0.001, in that participants selected more from Deck A in Block 1 than Block 2, p < 0.001, independent of group (see **Table 2** for means, standard deviations, and effect sizes). The group effect was marginal, F(2, 87) = 2.91, p = 0.06, in that participants in the high construal group selected more from Deck A than participants in the control (p = 0.025) or low construal (p = 0.073) groups. There was also a significant block by group interaction, F(2, 87) = 6.94, p = 0.002. Participants in the control group selected significantly more from Deck A in Block 1 than Block 2, p < 0.001. In addition, participants in the low construal group selected significantly more from Deck A in Block 1 than Block 2, p < 0.001. There was no difference in selections across blocks for the high construal group, p = 0.59.

For Deck B, neither of the main effects was significant [Block: F(1, 87) = 0.38, p = 0.54; Group: F(2, 87) = 0.70, p = 0.50]. In addition, the interaction effect was not significant, F(2, 87) = 2.51, p = 0.09. For Deck C, again neither of the main effects was significant [Block: F(1, 87) = 0.02, p = 0.90; Group: F(2, 87) = 0.77, p = 0.47]. The block by group interaction was also not significant for Deck C, F(2, 87) = 1.27, p = 0.29.

For Deck D, the main effect of block was significant, F(1, 87) = 17.30, p < 0.001, in that participants selected more from Deck D in Block 2 than Block 1. The main effect of group was not significant, F(2, 87) = 0.54, p = 0.58. The block by group interaction was also not significant, F(2, 87) = 0.46, p = 0.63.

TABLE 2 | Study 2 variables presented as mean (standard deviation).

B, Block; G, Group; B×G, Block × Group Interaction; Deck, percent selections from Decks A, B, C, and D on Block 1 (Trials 1–40) and Block 2 (Trials 41–100).
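The block-wise deck percentages defined in the Table 2 note can be computed directly from a participant's trial-by-trial choices. The following is a minimal sketch, assuming choices are recorded as a list of deck labels in trial order; the function names and the example sequence are hypothetical and are not taken from the study materials.

```python
from collections import Counter

DECKS = ("A", "B", "C", "D")


def deck_percentages(choices, start, end):
    """Percent of selections from each deck on trials start..end (1-indexed, inclusive)."""
    block = choices[start - 1:end]
    if not block:
        raise ValueError("block contains no trials")
    counts = Counter(block)  # decks never chosen count as 0
    return {deck: 100.0 * counts[deck] / len(block) for deck in DECKS}


def block_summary(choices):
    """Split a 100-trial session into Block 1 (Trials 1-40) and Block 2 (Trials 41-100)."""
    return {
        "block1": deck_percentages(choices, 1, 40),
        "block2": deck_percentages(choices, 41, 100),
    }


# Hypothetical participant: samples all decks evenly early on,
# then settles on Decks C and D for the remaining 60 trials.
example_choices = ["A", "B", "C", "D"] * 10 + ["C", "D", "D"] * 20
summary = block_summary(example_choices)
```

Percentages rather than raw counts are used here because the two blocks differ in length (40 versus 60 trials), so raw counts would not be comparable across blocks.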

Counter to our hypothesis, individuals who were told to imagine earning money for a psychologically close other (low construal) learned to avoid Deck A on the IGT compared to those that imagined earning money for a psychologically distant other (high construal). Thus, participants in a high construal mindset selected from the riskiest deck, even more so than those who did not complete a construal priming task. Both the control group and the low construal group significantly decreased their selections from Deck A as the task progressed. In a series of three studies, Beisswanger et al. (2003) found that participants made riskier decisions for their friends than for themselves; however, this difference disappeared when the likelihood of significant negative outcomes increased. As applied to the IGT, it is possible that our participants felt that the potential for loss when making decisions for a close friend outweighed the potential for loss when making decisions for a distant friend, leading to improved decision making on the task.

## GENERAL DISCUSSION

While research is still being conducted on the task, clinicians are increasingly using the IGT to clinically evaluate decision making. Although the IGT manual (Bechara, 2007) indicates that the task can be used in this manner, recent research suggests that the task may not be an accurate measure of decision making ability when used as the sole determinant (for review, see Steingroever et al., 2013). The purpose of this set of studies was to investigate whether the manner in which individuals construe the task affects IGT performance.

Across two studies, manipulating the manner in which individuals construed the task differentially affected outcomes on the IGT. When using procedural mindset priming to manipulate the manner in which individuals construed the IGT (Study 1), individuals in a high-level construal mindset learned to avoid Deck A, the deck associated with pathological decision making (Bechara, 2007). Conversely, when manipulating psychological distance to manipulate construal level (Study 2), individuals in a high construal mindset failed to learn to avoid Deck A. Instead, individuals in a low construal mindset and control participants learned to avoid Deck A as the task progressed. Thus, the manipulations produced results in opposition to one another, suggesting the two may affect scores on the IGT using separate psychological mechanisms.

There are several reasons why procedural mindset priming might operate differently on the IGT compared to a manipulation involving psychological distance. Both manipulations attempt to elicit an abstract or concrete mindset, but do so in fundamentally different ways. Procedural mindset priming hinges on the assumption that repeatedly engaging in a particular type of processing style (e.g., high construal) primes that mindset, making it more likely for that processing style to transfer to an immediately subsequent task. Like most priming effects, its effectiveness is likely short-lived and dependent upon the contiguity of the first and second tasks (see Van den Bussche et al., 2009, for discussion of priming effects). To this end, construal level procedural mindset priming effects should be strongest early in the subsequent task and should fade as that task progresses. It is possible that with administration of additional trials, which has been shown to improve performance on the IGT (Buelow et al., 2013; Lin et al., 2013), the effects of construal level priming may dissipate and exert less of an influence on task performance.

In both studies, significant effects emerged on Decks A and D only, which may be attributed to individual deck-level differences. Deck D is widely regarded as the "best" or most advantageous deck, as continued selections from it result in long-term gains and a lower frequency (10%) of losses (Bechara, 2007). Deck C, although considered an advantageous deck, produces losses on 50% of trials. Although in the long term the wins outweigh the losses from this deck, individuals who are attuned to loss frequency may avoid this deck and deem it a disadvantageous deck. Previous research has shown a prominent Deck B preference among healthy control participants (e.g., Toplak et al., 2005; Caroselli et al., 2006; Fernie and Tunney, 2006), and it appears that this effect is driven by a preference for high immediate gains (in comparison to the lower immediate gains of Deck D) and the low frequency of immediate losses. Thus, for a significant subset of healthy controls, Deck B can be seen as an advantageous deck. It is possible that our lack of findings with Decks B and C is due in part to these confounds in how the decks are appraised by participants. Deck A selections are indicative of pathological decision making (Bechara, 2007), and continued selections from this deck are not typically seen in control participants. It is then possible the outcomes for Deck A in the present studies are the result of faster detection of the disadvantages of Deck A due to the manner in which participants construed the task. It is also possible that characteristics of our sample led to these deck-specific results. College student participants may exhibit the prominent Deck B phenomenon, as has been shown in some of the previous research (e.g., Caroselli et al., 2006). In addition, students may have multiple friends or family members experiencing financial strain (or may be experiencing this strain themselves), thus leading to a greater emotional investment in decisions made for close friends in Study 2. We also had more females than males in Study 1, and gender may have played a role in deck selections (e.g., Bolla et al., 2005; Davis et al., 2007; Businelle et al., 2008).

Across two studies, we have provided evidence that construal level can impact performance on the IGT. Specifically, increasing psychological distance may lead to continued selections from a disadvantageous deck, while priming individuals with a high construal mindset may lead to decreased selections from this same disadvantageous deck. It is possible that during a clinical evaluation, individuals engaging in high construal level thinking (planning for their retirement) prior to taking the IGT may make more advantageous decisions than those engaging in low construal level thinking prompted by psychological distance (imagining a close other), or even compared to their own decisions when not engaging in high construal level thinking. Additionally, during a clinical evaluation with the IGT, a participant may be thinking about their own recent financial difficulties, or those of a close friend or family member. Each of these processes is likely to impact performance on the IGT and, more importantly, the clinical evaluation of decision making impairment. In addition, these same contextual factors can negatively affect assessment of decision making in a lab-based setting. The IGT is still frequently used as a behavioral measure of risky decision making, and without taking contextual factors such as construal level into account, ensuring understanding of what the IGT is assessing is difficult. However, it is important to note that construal level is not the only factor affecting IGT performance, as it is likely that other factors (such as age, gender, personality, and other to-date unknown factors) can also affect performance on this task in the clinic and in the lab.

The results between the present studies were contradictory. Thus, while these differences may be accounted for by the type of manipulation used, the results should be interpreted with caution until further research can identify the exact psychological mechanism responsible for those differences or appropriate moderators can be identified. Previous research has shown that emotional state can affect decision making (Forgas, 1995), with both positive (Nygren et al., 1996; Roiser et al., 2009) and negative (Heilman et al., 2010; Buelow et al., 2013) mood affecting performance on the IGT. Within the moral decision making field, personal dilemmas elicit greater emotional response than impersonal or non-moral dilemmas (Greene et al., 2001; Skoe et al., 2002; Myyry and Helkama, 2007), which in turn affects decision making. It is possible that in Study 2, participants who were asked to decide for a close friend increased their emotional involvement and engagement on the IGT. As the IGT is thought to be based, at least in part, on emotion-based decision making, it is possible that this activation of an emotional experience improved decision making, whereas making decisions for a distant acquaintance could have resulted in limited emotional input during the task. Additionally, our data were culturally homogenous and cannot account for the impact that culture may have on our findings. Individuals from different cultural backgrounds may interpret the construal tasks differently, in turn changing their assumed effects on cognitive processes.

Taken together, our findings suggest that priming participants with a high or low construal mindset affects advantageous and disadvantageous decision making on the IGT. Across multiple studies, personality, mood, and other contextual factors have been shown to affect performance on the IGT, a task designed to assess real-world decision making impairments among individuals with damage to the prefrontal cortex. Disadvantageous decision making is not specific to such damage, nor to a specific neurological or psychological diagnosis. It is important for clinicians to account for the presence of such contextual and other factors before determining a patient's decision making ability when using the IGT in isolation. Using the IGT as part of a multi-faceted approach to assessing decision making that includes the use of multiple decision making tasks should help increase the validity of the assessment, as contextual effects on any one task should be minimized when results are congruent across measures. However, to truly conduct a valid assessment of decision making utilizing multiple measures, additional measures validated for use in clinical populations are needed. This multi-faceted approach would then allow for a broader, and more accurate, understanding of decision making that is less susceptible to contextual and other factors. Researchers investigating what the IGT is truly measuring should also be cognizant of contextual factors as they may affect outcomes on the IGT, leading to null relationships between it and other decision making tasks. Ultimately, clinical research informs clinical practice. Thus, a thorough understanding of the task and the factors that affect performance from lab-based settings is warranted prior to using the IGT as a stand-alone measure of decision making in clinical evaluations.

## REFERENCES


with psychopathic tendencies? J. Abnorm. Child Psychol. 29, 499–511. doi: 10.1023/A:1012277125119


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Okdie, Buelow and Bevelhymer-Rangel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Decision making in healthy participants on the Iowa Gambling Task: new insights from an operant approach

## Peter N. Bull, Lynette J. Tippett and Donna Rose Addis \*

*School of Psychology, The University of Auckland, Auckland, New Zealand*

The Iowa Gambling Task (IGT) has contributed greatly to the study of affective decision making. However, researchers have observed high inter-study and inter-individual variability in IGT performance in healthy participants, and many are classified as impaired using standard criteria. Additionally, while decision-making deficits are often attributed to atypical sensitivity to reward and/or punishment, the IGT lacks an integrated sensitivity measure. Adopting an operant perspective, two experiments were conducted to explore these issues. In Experiment 1, 50 healthy participants completed a 200-trial version of the IGT which otherwise closely emulated Bechara et al.'s (1999) original computer task. Group data for Trials 1–100 closely replicated Bechara et al.'s original findings of high net scores and preferences for advantageous decks, suggesting that implementations that depart significantly from Bechara's standard IGT contribute to inter-study variability. During Trials 101–200, mean net scores improved significantly and the percentage of participants meeting the "impaired" criterion was halved. An operant-style stability criterion applied to individual data revealed this was likely related to individual differences in learning rate. Experiment 2 used a novel operant card task—the Auckland Card Task (ACT)—to derive quantitative estimates of sensitivity using the generalized matching law. Relative to individuals who mastered the IGT, persistent poor performers on the IGT exhibited significantly lower sensitivity to magnitudes (but not frequencies) of rewards and punishers on the ACT. Overall, our findings demonstrate the utility of operant-style analysis of IGT data and the potential of applying operant concurrent-schedule procedures to the study of human decision making.

Keywords: decision making, Iowa Gambling Task, operant psychology, sensitivity to reward and punishment, learning rate

#### Edited by:

*Yao-Chu Chiu, Soochow University, Taiwan*

#### Reviewed by:

*Youngbin Kwak, University of Massachusetts Amherst, USA*

*Lasana Harris, Leiden University, USA*

#### \*Correspondence:

*Donna Rose Addis, School of Psychology, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand d.addis@auckland.ac.nz*

#### Specialty section:

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology*

> Received: *23 December 2014* Accepted: *19 March 2015* Published: *07 April 2015*

#### Citation:

*Bull PN, Tippett LJ and Addis DR (2015) Decision making in healthy participants on the Iowa Gambling Task: new insights from an operant approach. Front. Psychol. 6:391. doi: 10.3389/fpsyg.2015.00391*

## Introduction

Life is like a game of cards. The hand that is dealt you represents determinism; the way you play it is free will.

(Jawaharlal Nehru).

Poor decision making, particularly in situations involving complexity (where choice alternatives have multiple reward and punishment dimensions which may conflict) or uncertainty (where rewards and punishers occur unpredictably), is associated with brain injury to ventromedial prefrontal cortex (VMPFC). The Iowa Gambling Task (IGT; Bechara et al., 1994) was designed to assess decision-making abilities in VMPFC patients under such conditions of complexity and uncertainty. Participants are instructed to maximize winnings while choosing repeatedly from four decks of playing cards that unpredictably yield wins and losses. Importantly, the contingencies of reward and punishment are counter-intuitively arranged so that the decks with higher wins (\$100) result in a long-term net loss, while the decks with smaller wins (\$50) yield a net gain. Participants who do not learn to prefer one or both of the \$50 decks over the course of 100 trials are considered to exhibit a decision-making impairment. Over the last 20 years the IGT has become a de-facto standard for decision-making research (Dunn et al., 2006) and has been marketed as a tool for clinical assessment (Bechara, 2007). Indeed, the IGT has not only contributed to understanding decision-making deficits in patients with VMPFC damage, but has also been successfully applied to a variety of disorders arising from poor impulse control (e.g., pathological gambling; Brand et al., 2005).
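The counter-intuitive arrangement of contingencies described above can be made concrete with a small calculation. The sketch below is illustrative only: the \$100 and \$50 win amounts come from the text, but the loss totals per 10 cards and the loss frequencies for Decks A and B are assumptions based on the commonly described original schedule and are not stated in this article. All names are hypothetical.

```python
# Illustrative payoff summary per 10 cards drawn from each deck.
# Win amounts come from the text; loss totals and the A/B loss
# frequencies are ASSUMED from the commonly described original schedule.
DECKS = {
    "A": {"win": 100, "loss_total_per_10": 1250, "losses_per_10": 5},
    "B": {"win": 100, "loss_total_per_10": 1250, "losses_per_10": 1},
    "C": {"win": 50, "loss_total_per_10": 250, "losses_per_10": 5},
    "D": {"win": 50, "loss_total_per_10": 250, "losses_per_10": 1},
}


def net_per_10_cards(deck):
    """Total wins minus total losses over 10 cards from one deck."""
    d = DECKS[deck]
    return 10 * d["win"] - d["loss_total_per_10"]


def loss_frequency(deck):
    """Proportion of cards in the deck that carry a loss."""
    return DECKS[deck]["losses_per_10"] / 10


net = {deck: net_per_10_cards(deck) for deck in DECKS}
frequency = {deck: loss_frequency(deck) for deck in DECKS}
```

Under these assumed values, the two \$100 decks lose money over every 10 cards while the two \$50 decks gain, and loss frequency varies independently of long-term outcome, which is why net outcome and loss frequency can pull preferences in different directions.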

Theorists from disparate disciplines assume that in tasks such as the IGT, where participants make repeated choices between two or more alternatives with differing outcomes, healthy individuals attempt to maximize net rewards over time (Samuelson, 1937; MacArthur and Pianka, 1966; Charnov, 1976; Rachlin et al., 1976; Damasio, 1994)<sup>1</sup>. When presented with the IGT (or any other novel choice task in which the contingencies of reward and punishment for each alternative are initially unknown), an individual must first learn the contingencies via trial and error in order to maximize net rewards. In a simple two-alternative choice task, learning is rapid and an exclusive preference may quickly develop, while in a more complex choice task, learning rate may be reduced and uncertainty increased. Preference at any given time depends on the individual's level of certainty of the relative contingencies. Behavioral economists traditionally distinguish three discrete categories of certainty: ambiguity, risk, and certainty (Knight, 1921; Ellsberg, 1961; Levy et al., 2010). However, in a choice task that requires learning, the boundaries between categories are not clear-cut; thus we argue that it is more helpful to conceptualize these classes as regions lying along a continuum of certainty (**Figure 1**). Initially an individual is completely uncertain, and frequently switches between alternatives to learn the contingencies, so preference can appear random or indifferent. But as the individual learns the approximate frequency and magnitude of rewards and punishers, preference will typically become biased toward alternatives with higher net rewards, and the individual may come exclusively to prefer the better alternative. Thus, an individual's location on the certainty continuum at the time preference is measured can critically impact the apparent "goodness" of their decision-making abilities.

Bechara et al. (1994) found that on the IGT, healthy participants appeared to exhibit this pattern of gradual learning over 100 trials and attained high net scores (an index of relative preference for good decks), while patients with VMPFC lesions generally failed to learn the contingencies, preferring alternatives with long-term net losses (Bechara et al., 1994; Damasio, 1994). This failure to maximize net rewards by VMPFC patients (and other clinical populations) was attributed by Bechara et al. (2000) and Bechara and Damasio (2002) to atypical sensitivity to reward and/or punishment in these patients. More specifically, Bechara et al. (2000) hypothesized that poor IGT performance may result from three distinct types of decision-making deficit: hypersensitivity to reward; hyposensitivity to punishment; or myopia for the future, that is, insensitivity to delayed or infrequent events, whether they be rewards or punishers (Bechara et al., 2000, 2002; Bechara and Damasio, 2002).
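The net score mentioned above can be sketched in a few lines. This is a minimal illustration that assumes the conventional formula (selections from the advantageous Decks C and D minus selections from the disadvantageous Decks A and B); the 20-trial block size, the function names, and the example sequence are assumptions for illustration rather than details given in this article.

```python
def net_score(choices):
    """Net score: (selections from C and D) minus (selections from A and B)."""
    good = sum(1 for c in choices if c in ("C", "D"))
    bad = sum(1 for c in choices if c in ("A", "B"))
    return good - bad


def net_scores_by_block(choices, block_size=20):
    """Net score for each consecutive block of trials (block_size is an assumption)."""
    return [
        net_score(choices[i:i + block_size])
        for i in range(0, len(choices), block_size)
    ]


# Hypothetical participant: even sampling for 40 trials, then only C and D.
example_choices = ["A", "B", "C", "D"] * 10 + ["C", "D"] * 30
overall = net_score(example_choices)
per_block = net_scores_by_block(example_choices)
```

Reporting the score per block as well as overall makes the learning pattern visible: a participant who explores early and then settles on good decks shows block scores that rise from zero, which a single overall score would hide.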

Sensitivity to reward and sensitivity to punishment are terms used across multiple literatures but rarely formally defined. According to Davis and Fox (2008), individuals with high reward sensitivity ". . . are more prone to detect signals of reward in their environment, to approach with greater alacrity potentially rewarding stimuli, and to experience more positive affect (pleasure/reinforcement) when they are in situations with cues of reward." (p. 43). Thus, sensitivity may be likened to an individual's subjective perception of, and "reactivity" to, a reward or punisher (e.g., a student may be more excited when given a \$50 bill than a billionaire would be). This conception of sensitivity originates in reinforcement sensitivity theory (RST; Gray, 1970, 1991; Gray and McNaughton, 2000), in which reward sensitivity and punishment sensitivity are considered stable personality characteristics, associated with distinct neural substrates<sup>2</sup>.

In the IGT, sensitivity cannot be measured by analyzing behavioral metrics such as the number of responses to each deck, because each deck yields both rewards and punishers. For example, a high proportion of responses to Deck B (large, frequent rewards and large, infrequent punishers) may indicate either high sensitivity to reward or low sensitivity to punishment. Therefore, supplemental measures have been used to measure sensitivity. For instance, Bechara et al. (2000, 2002) and Bechara and Damasio (2002) used a physiological measure (skin conductance response; SCR), inferring that, for example, a low-magnitude SCR (relative to control participants) in response to losing money during a trial was indicative of a low sensitivity to punishment. Other IGT studies (e.g., Suhr and Tsanadis, 2007; Buelow and Suhr, 2013) have utilized self-report measures of sensitivity (e.g., Carver and White, 1994; Torrubia et al., 2001).

<sup>1</sup>There is evidence that organisms do not invariably maximize (reviewed in Herrnstein, 1997); however, it is outside the scope of this paper to examine the validity of the maximization assumption.

<sup>2</sup>Personality has been likened to "behavioral state," defined as a distillation of an individual's previous history of reinforcement and punishment (Davison, 1998).

A limitation of physiological and self-report approaches is that they do not measure the specific dimensions of sensitivity (such as sensitivity to the frequency and magnitude of rewards and punishers) that influence preferences in the IGT. To this end, it may be worthwhile adopting an alternative approach used in other literatures that also investigate decision making. In behavioral economics the perceived, or subjective, value of a reward or punisher (referred to as its utility; Herrnstein, 1990; Glimcher and Rustichini, 2004) is assumed to differ from its physical, or objective, value. This idea converges with the concept of sensitivity from RST: Individuals may differ in the extent to which they scale the physical properties of rewards and punishers to subjective perceptions. A variety of methods and models have been developed to quantify this scaling (e.g., Wearden, 1980; Herrnstein, 1990; Glimcher and Rustichini, 2004; Takahashi, 2005; McKerchar et al., 2010; Doyle, 2012). Some of these models were derived from the generalized matching law, an equation developed by operant psychologists (Baum, 1974; for an introduction see Poling et al., 2011). Advantages of the generalized matching law are that it mathematically formalizes sensitivity and offers a well-validated procedure to measure the individual dimensions of reward and punishment (e.g., frequency, magnitude, or delay). The generalized matching law has not previously been applied to investigating decision-making deficits in human participants; its use may potentially complement the physiological and self-report measures of sensitivity previously applied to IGT research by providing a more nuanced picture of decision-making processes.
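For readers unfamiliar with it, the generalized matching law referred to above is usually written in the following form for two alternatives. The notation is the standard one from the operant literature and is an assumption here, since the equation is not given in this part of the text: $B_1$ and $B_2$ are the numbers of responses to each alternative, $R_1$ and $R_2$ the reinforcers obtained from each, $a$ the sensitivity parameter, and $c$ a bias parameter. The second line is a commonly used concatenated extension with separate sensitivities to frequency ($F$) and magnitude ($M$); it may not match the exact parameterization the authors adopt.

```latex
% Generalized matching law (Baum, 1974), two alternatives
\log\left(\frac{B_1}{B_2}\right) = a \,\log\left(\frac{R_1}{R_2}\right) + \log c

% Concatenated form with separate sensitivities to frequency and magnitude
\log\left(\frac{B_1}{B_2}\right) = a_f \,\log\left(\frac{F_1}{F_2}\right)
  + a_m \,\log\left(\frac{M_1}{M_2}\right) + \log c
```

With $a = 1$ and $c = 1$ the response ratio equals the reinforcer ratio (strict matching); values of $a$ below 1 indicate that choice is less extreme than the reinforcer ratio would predict, which is the sense in which $a$ quantifies sensitivity.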

In order to derive sensitivity estimates using the generalized matching law, operant psychologists utilize concurrent-schedule procedures (Bouton, 2007). In a concurrent-schedule task, participants choose between two or more responses (e.g., key presses), each of which yields a reward with a different probability. That is, two or more schedules of reward are presented to participants concurrently. The IGT can thus be considered a type of concurrent-schedule task. Nevertheless, despite the similarities, the decision-making literature has not yet drawn on the operant study of concurrent schedules to enhance understanding of how sensitivity contributes to performance on the IGT; the present study will be the first to do so.
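A sensitivity estimate of the kind described above is typically obtained by regressing log response ratios on log reinforcer ratios across several concurrent-schedule conditions. The sketch below is a minimal illustration under that assumption, using ordinary least squares in plain Python; it is not the authors' analysis code, and the function name and example data are hypothetical.

```python
import math


def fit_matching_law(response_pairs, reinforcer_pairs):
    """Least-squares fit of log(B1/B2) = a * log(R1/R2) + log(c).

    Each pair holds the counts for alternatives 1 and 2 in one condition.
    All counts must be positive, because a zero count has no logarithm.
    Returns (a, log_c): the sensitivity estimate and the log bias term.
    """
    xs = [math.log10(r1 / r2) for r1, r2 in reinforcer_pairs]
    ys = [math.log10(b1 / b2) for b1, b2 in response_pairs]
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two conditions of matching length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("reinforcer ratios must vary across conditions")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    log_c = mean_y - a * mean_x
    return a, log_c


# Hypothetical conditions with reinforcer ratios from 9:1 to 1:9.
reinforcers = [(9, 1), (3, 1), (1, 1), (1, 3), (1, 9)]
# Hypothetical participant constructed to undermatch with a = 0.8 and no bias.
responses = [((r1 / r2) ** 0.8, 1.0) for r1, r2 in reinforcers]
a, log_c = fit_matching_law(responses, reinforcers)
```

Several conditions with different reinforcer ratios are needed because the slope is only identifiable when the ratios vary; a single condition cannot separate sensitivity from bias.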

In addition to lacking a direct measure of sensitivity, the IGT also experiences high inter-study variability. A recent review by Steingroever et al. (2013) revealed that the strength of the learning pattern in Bechara et al.'s (1994) healthy control group has rarely been matched by authors outside Bechara's laboratory. Rather, high inter-study variability was apparent, including a number of studies reporting very low net scores for healthy individuals. Such findings raise concerns about the interpretation of low scores as indicative of a decision-making deficit (Dunn et al., 2006). Further, deck-by-deck analyses of group data suggest that what was assumed to be a preference for the decks yielding higher long-term net gains may in fact reflect a preference for decks with a low frequency of losses (e.g., Wilder et al., 1998; Dunn et al., 2006; Lin et al., 2007, 2013; Steingroever et al., 2013). This tendency to avoid frequent losses (i.e., the frequency-of-losses effect) throws into question the assumption that healthy participants learn to maximize net rewards (Bechara et al., 1994).

An issue that may contribute to high inter-study variability, but which has rarely been discussed, is that the IGT procedure itself has been inconsistently implemented across studies (Areias et al., 2013). The majority of IGT researchers have devised proprietary implementations in which fundamental parameters (including task complexity, task instructions, and task length) often vary. Task complexity (equivalent to contingency discriminability in operant psychology; Davison and Jenkins, 1985) is determined by basic design parameters including the number of choice alternatives, the number of dimensions (e.g., valence, magnitude, frequency, or delay), the predictability of rewarding and/or punishing events, and the variability of reward/punisher magnitudes. Operant studies (e.g., Takahashi and Iwamoto, 1986; Hanna et al., 1992) suggest that participants' choice behavior may be affected even by subtle task variations (e.g., appearance, color, and spatial positioning of stimuli on the screen; labeling of decks; randomization of deck position and card order; changes in card appearance or color, printed feedback, and audio associated with wins or losses). Moreover, task instructions often include information about the reward/punishment contingencies, which can influence participants' initial level of certainty. Both operant (e.g., Horne and Lowe, 1993) and IGT (e.g., Balodis et al., 2006; Fernie and Tunney, 2006; Glicksohn and Zilberman, 2010) research has demonstrated that instructions can profoundly affect the ability of participants to learn the contingencies. In particular, a "hint" in Bechara's instructions (stating that some decks are better than others and that participants should avoid bad decks) has been shown to be critical to good IGT performance (Balodis et al., 2006; Fernie and Tunney, 2006; Glicksohn and Zilberman, 2010).

In addition to high inter-study variability in the IGT, Steingroever et al. (2013) found high inter-individual variability when net scores and deck preferences for individual participants (as opposed to group data) were examined in detail. Not only did individual net scores vary widely, but up to a third of healthy participants in some studies obtained scores low enough to be classified alongside VMPFC patients. Thus, it appears that the typical practice of aggregating individual participants, decks, and trials when analyzing IGT data may obscure important information, creating confusion in interpretation and perhaps leading one to believe in the fictitious average healthy participant.

Following Bechara et al.'s (2000) explanation of poor IGT performance in terms of atypical sensitivity to reward or punishment, and consistent with the view of sensitivity as a critical personality trait (Gray and McNaughton, 2000), it might be hypothesized that the high inter-individual variability in IGT performance is due to individual differences in reward sensitivity or punishment sensitivity in healthy participants. However, Steingroever et al. (2013) also found that individual learning trajectories did not typically resemble the gradual learning curve suggested by group data. While many participants established a stable preference for one or more decks during the allowed 100 trials, they did so at varying times, and final deck preferences often differed. Still other participants never established stable preferences, exhibiting high switching rates between decks and low net scores throughout the task.

This variability in learning rate may be an important contributor to high inter-individual variability: If some healthy individuals learn the task very slowly, then their net scores (based on all 100 trials) will be considerably lower than those who learn the task very quickly, resulting in a wide range of individual net scores. Thus, as some authors have noted (Dunn et al., 2006; Wetzels et al., 2010; Buelow et al., 2013; Ryterska et al., 2013), one cannot infer whether a poor net score in the IGT is due to a decision-making deficit (i.e., atypical sensitivity), or a low rate of learning, or both.

In operant research, to control for individual differences in learning rate, a task will typically continue until participants have developed a stable preference according to some predefined stability criterion (see Sidman, 1960; Killeen, 1978; Baron and Perone, 1998). Data from earlier, learning trials are then discarded and analysis is only carried out on data from later, stable trials. In contrast, the IGT is typically fixed at 100 trials for all participants, and all data are analyzed, including data from early trials when participants were learning by trial and error (refer to **Figure 1**). This is somewhat akin to evaluating the balance and coordination of a group of novice snowboarders by allowing them 10 attempts at negotiating an intermediate-level trail, and measuring the total number of times they fall over. This is effectively an assessment of learning rate; a better approach would be to first allow each participant to reach a predefined competency level (which will inevitably take a varying amount of time for each individual), before evaluating their performance on the intermediate trail. Indeed, several IGT studies have shown that when allowed more than 100 trials, many individuals who perform poorly during the first 100 trials are able to achieve good performance by Trial 200 (e.g., Fernie and Tunney, 2006; Buelow et al., 2013) or Trial 300 (Lin et al., 2013).
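The idea of a predefined stability criterion can be illustrated with a simple rule. The sketch below is a hypothetical criterion chosen only for illustration and is not the criterion used in this study: preference is treated as stable once the proportion of advantageous (Deck C or D) choices varies by no more than a tolerance across a window of consecutive blocks. The block size, window, tolerance, and function names are all assumptions.

```python
def proportion_good(block):
    """Proportion of choices in a block that are from Decks C or D."""
    return sum(1 for c in block if c in ("C", "D")) / len(block)


def first_stable_trial(choices, block_size=20, window=3, tolerance=0.1):
    """Return the 1-indexed trial at which the first stable window ends, or None.

    A window of `window` consecutive blocks counts as stable when the
    highest and lowest block proportions differ by at most `tolerance`.
    """
    n_blocks = len(choices) // block_size
    props = [
        proportion_good(choices[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    for end in range(window, n_blocks + 1):
        recent = props[end - window:end]
        if max(recent) - min(recent) <= tolerance:
            return end * block_size
    return None


# Hypothetical fast learner: explores for 20 trials, then chooses only C and D.
fast_learner = ["A", "B", "C", "D"] * 5 + ["C", "D"] * 90
# Hypothetical unstable participant: preference swings from block to block.
unstable = (["C"] * 4 + ["A"] * 16 + ["C"] * 16 + ["A"] * 4) * 5

fast_stable_at = first_stable_trial(fast_learner)
unstable_stable_at = first_stable_trial(unstable)
```

Note that a rule of this kind detects stability, not quality: a participant who settles into a consistent preference for bad decks would also be classed as stable, so the criterion is best paired with a performance measure computed over the stable trials only.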

The present study comprised two experiments. In Experiment 1 we investigated factors other than sensitivity (i.e., poor task standardization and individual differences in learning rate) that may contribute to the high variability in IGT performance in healthy participants. In Experiment 2 we developed a novel card task based on human operant experimental procedures and applied the generalized matching law to derive behavioral estimates of sensitivity for the participants in Experiment 1 to investigate whether sensitivity differed between individuals categorized as good or poor decision makers on the IGT.

## Experiment 1

In Experiment 1 a 200-trial version of the IGT was administered to 50 participants, systematically replicating two recent studies that examined the effect of trial length on IGT performance (Buelow et al., 2013; Lin et al., 2013). In addition to increasing task length, we controlled task complexity and task instructions by implementing the IGT as closely as practicable to Bechara's original computer-based IGT (first described in Bechara et al., 1999)<sup>3</sup>. Operant procedures guided the analysis: we did not limit our analyses to group data, but also examined individual learning trajectories, allowing us to better capture individual differences in deck preferences. Moreover, similar to operant analyses, we established a stability criterion that allowed us to limit the analysis of net scores to stable data.

We hypothesized that group data in the first 100 trials would replicate Bechara et al.'s (1994, 1998, 1999) findings with healthy controls more closely than other studies that have introduced variations in instructions, procedure, and stimuli. Nevertheless, we predicted that individual net scores during the first 100 trials would be highly variable, and up to a third of healthy participants would perform as poorly as VMPFC patients. When given another 100 trials in which to learn the contingencies, however, we expected that mean net scores would improve and the majority of participants would develop stable preferences for one or both good decks. Thus, we hypothesized that individual differences in learning rate and deck preferences across 200 trials would contribute to high inter-individual variability and poor mean net scores.

## Method

## Participants

In line with previous sample sizes reported by Bechara and colleagues (N ranging from 13 to 49; Bechara et al., 1994, 1998, 1999, 2000, 2002; Bechara and Damasio, 2002), we enrolled 50 young adults (20 males) aged from 17 to 32 years (M = 21.44; SD = 3.79) as participants in the current study. Participants were recruited in response to advertisements at the University of Auckland and were informed they would receive NZ\$10 for completing the study, and could earn up to an additional NZ\$20 depending on their performance in the "card games." This was intended to encourage participant engagement (particularly in the more onerous tasks in Experiment 2); in actuality the design ensured all participants received the full NZ\$30.

## Experimental Task

The IGT was based on the implementation in Version 0.12 of the Psychology Experiment Building Language test battery (PEBL; Mueller, 2009; Mueller and Piper, 2014). Mueller's version was modified to more closely replicate Bechara's computer-based IGT (Bechara et al., 1999)<sup>4</sup>. Instructions (see Supplemental Materials) were based on those provided by Bechara to Davison (2009) and similar to Bechara et al. (1999); notably, they included the following hint, previously shown to be critical to good IGT performance (Balodis et al., 2006; Fernie and Tunney, 2006; Glicksohn and Zilberman, 2010):

The only hint I can give you, and the most important thing to note is this: Out of these four decks of cards, there are some that are worse than others, and to win you should try to stay away from bad decks.

<sup>3</sup>We consider only the widely-used ABCD version of the Iowa Gambling Task and do not address the A′B′C′D′ and E′F′G′H′ variants introduced by Bechara et al. (2000).

<sup>4</sup>The modified IGT used in this study has been adopted as the default IGT implementation in PEBL Version 0.14 and can be downloaded from http://pebl.sourceforge.net/

As argued in the Introduction, subtle variations in the experimental task may have an important influence on performance; therefore the IGT is described here in detail. A screen shot of the task is shown in **Figure 2**. Three differences distinguished this implementation from Bechara et al.'s (1999) version. First, the task was run for 200 trials instead of the standard 100 trials. Bechara's schedule only defined outcomes for 40 cards in each deck, so to ensure that a given deck in the current study would not run out of cards, the original schedule was repeated if a participant chose more than 40 times from a single deck. Second, instructions were presented on the computer screen rather than verbally to mitigate potential experimenter effects. Third, to promote task engagement, participants were told that their real-money winnings would be proportional to their play-money winnings in the IGT<sup>5</sup>.
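
The repetition rule for exhausted decks amounts to indexing each deck's 40-card schedule modulo its length. The following is a minimal sketch under that reading; the helper name `outcome_for_draw` and the placeholder schedule are assumptions made for illustration, and the values are not Bechara's actual payoffs.

```python
def outcome_for_draw(deck_schedule, draws_so_far):
    """Return the scheduled outcome for the next card drawn from one deck.

    deck_schedule: the deck's fixed list of outcomes (40 entries in the IGT).
    draws_so_far: number of cards already taken from this deck.
    The schedule restarts from its first entry once it is exhausted.
    """
    if not deck_schedule:
        raise ValueError("deck_schedule must not be empty")
    return deck_schedule[draws_so_far % len(deck_schedule)]


# Placeholder schedule (hypothetical): each entry is just its own position.
placeholder_schedule = list(range(40))
first_card = outcome_for_draw(placeholder_schedule, 0)    # first entry
forty_first = outcome_for_draw(placeholder_schedule, 40)  # wraps to first entry
```

With this indexing, the 41st selection from a deck receives the same outcome as the 1st, the 42nd the same as the 2nd, and so on.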

Participants used the mouse to select a card from a deck. Following card selection, the card changed to either black or red while the outcome was displayed. Note that black and red cards did not necessarily correspond to wins or losses, but were arranged according to the schedule designed by Bechara et al. (1994, 2000). The message "You have won \$" followed by the amount of the reward was then displayed alongside a smiley face, and a winning sound was played. If a penalty was also scheduled, it was displayed immediately after the winning sound; the phrase "But you also have lost \$" was displayed alongside a sad face, and a losing sound was played. Sounds were identical to those in the implementation provided to our laboratory by Bechara (Davison, 2009). The inter-trial interval was approximately 2.5 s for reward-only trials, or 5 s when a reward was also followed by a penalty. Two bars were displayed at the top of the screen. The upper (green) bar displayed the amount of play money won during the game, and was updated appropriately after each card selection. The lower (red) bar displayed the amount of play money borrowed to play the game. If the participant's winnings fell below zero, a further loan of \$2000 was automatically added to the red bar, and the green bar was reset appropriately. After participants had made 200 selections, they were informed of their net winnings (total winnings less the total amount borrowed) and the task ended.

## Procedure

Informed consent was obtained in a manner approved by the University of Auckland Human Participants Ethics Committee. Participants performed all tasks alone in a quiet testing room. Experimental tasks were run on a Dell PC running Microsoft Windows XP. Stimuli were presented in full-screen mode on a Dell 22-inch LCD display at the native screen resolution of 1680×1050 pixels.

## Results

## Group Data

In line with standard IGT analytical approaches, we first analyzed data at the group level. In their seminal study, Bechara et al. (1994) introduced two summary statistics: mean number of selections from each deck over 100 trials, and mean net score (number of choices from good decks C and D minus number of choices from bad decks A and B) over 100 trials. Later analyses (beginning with Bechara et al., 2000) presented mean net score as a function of each 20-trial block (highlighting the learning curve in the IGT). To facilitate comparison with previous studies, we present similar analyses; however, as the present study used 200 trials, net scores are expressed as proportions (i.e., net score divided by the number of trials) rather than absolute numbers, and in some cases data are shown separately for Trials 1–100, Trials 101–200, and Trials 1–200. For inferential analyses we utilized ANOVAs as per the standard approach to IGT data analysis since Bechara et al. (1999).
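
The proportional net score described above can be sketched as follows. This is an illustrative implementation only: the function names and the example choice sequence are hypothetical, and the code assumes choices are recorded as the deck letters A to D.

```python
from collections import Counter

GOOD_DECKS = ("C", "D")
BAD_DECKS = ("A", "B")


def net_score_proportion(choices):
    """(good-deck choices - bad-deck choices) / number of trials."""
    if not choices:
        raise ValueError("choices must not be empty")
    counts = Counter(choices)
    good = sum(counts[deck] for deck in GOOD_DECKS)
    bad = sum(counts[deck] for deck in BAD_DECKS)
    return (good - bad) / len(choices)


def net_scores_by_block(choices, block_size=20):
    """Proportional net score for each consecutive block of trials."""
    return [
        net_score_proportion(choices[start:start + block_size])
        for start in range(0, len(choices), block_size)
    ]


# Hypothetical participant: 20 trials of even sampling, then 20 good-deck trials.
example_choices = list("ABCD" * 5) + list("CCDC" * 5)
block_scores = net_scores_by_block(example_choices)   # one score per 20 trials
overall_score = net_score_proportion(example_choices)
```

Multiplying a proportional score by 100 gives the value comparable with net scores from 100-trial studies.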

**Table 1** shows that mean net scores in the first 100 trials were similar to those found previously by Bechara and colleagues—particularly studies that utilized the computer-based IGT (Bechara et al., 1999) rather than the original IGT, which employed physical cards and facsimile money (Bechara et al., 1994). Mean net scores for Trials 1–100 in the present study were in the top 25% of the studies reviewed in Steingroever et al. (2013)<sup>6</sup>. Notably, the top-scoring study in Steingroever et al.'s review (North and O'Carroll, 2001) used the original computerized IGT supplied by Bechara. Thus, our results support the hypothesis that closer adherence to Bechara's experimental task and instructions would facilitate a closer replication of their results.

<sup>5</sup>The weight of evidence suggests that healthy participants who receive real money in the IGT do not differ significantly in performance from those who do not (Bowman and Turnbull, 2003; Carter and Pasqualini, 2004; Fernie and Tunney, 2006; but see Vadhan et al., 2009).

TABLE 1 | Mean net scores (expressed as proportions of number of trials) and aggregated deck preferences in Bechara et al.'s studies and in the present study.

*<sup>a,b</sup>Proportions were extrapolated from figures in the original papers; however, the estimated proportions for the 1994 and 1999 studies differ slightly from those reported by Steingroever et al. (2013).*

*<sup>c</sup>Mean net score is mean proportion of choices from good decks (C and D) minus mean proportion of choices from bad decks (A and B). Proportions rather than whole numbers are used to allow comparison across different numbers of trials (multiply by 100 to compare with net scores from 100-trial IGT studies).*

In **Figures 3A,B**, standard graphical depictions of IGT data are presented. **Figure 3A** shows mean net score as a function of 20-trial blocks. To determine whether, on average, performance continued to improve after 100 trials, a repeated-measures ANOVA (Greenhouse-Geisser corrected) was carried out on net scores, which confirmed a significant effect of block [F(5.86, 287.21) = 47.75, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.49]. Significant linear [F(1, 49) = 126.00, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.72] and quadratic trends [F(1, 49) = 69.64, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.59] were found, supporting the visual impression from **Figure 3A** that performance improved over time and had a curvilinear shape, leveling out somewhat in later blocks in an asymptotic pattern. Planned comparisons indicated that net scores in Blocks 1–5 were significantly lower than net scores in Blocks 7, 9, and 10 (all p < 0.05), supporting the hypothesis that average IGT performance would improve if participants were given more trials in which to learn the task.

**Figure 3B** shows the total proportion of selections from each deck for all participants. Results are shown separately for Trials 1–100 and Trials 101–200 (referred to here as epochs). A repeated-measures ANOVA (epoch × deck; Greenhouse-Geisser adjusted) revealed a main effect of deck [F(1.55, 76.15) = 19.20, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.28] and an interaction between epoch and deck [F(1.49, 72.81) = 20.55, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.30]. Post-hoc pairwise comparisons showed that in each epoch, significantly more choices were made from the good decks C and D than the bad decks A and B (all p < 0.01). That is, on average, participants chose advantageously. The interaction between epoch and deck was reflected in different patterns of choice in the first and second epochs: Choice of Deck A (p < 0.001) and Deck B (p < 0.001) decreased significantly between the first and second epoch, while choice of Deck C increased significantly (p < 0.001). Preference for Deck D did not change across the two epochs (p = 0.39). Thus, the group data suggest that the improvement in performance in later trials was characterized by a shift in preference away from Decks A and B toward Deck C.

As several authors (e.g., Fernie, 2007; Buelow et al., 2013; Steingroever et al., 2013) have pointed out, the above analyses have limitations. The aggregation of data by block (**Figure 3A**) obscures preferences for individual decks, while aggregating by deck (**Figure 3B**) obscures the effects of time. **Figure 3C** is a more informative representation of the data, and is increasingly favored by IGT researchers. **Figure 3C** illustrates the trend toward good decks and away from bad decks over time (similar to **Figure 3A**), whilst also breaking down individual deck preferences (as in **Figure 3B**). **Figure 3C** indicates that, on average, participants learned to prefer the good decks (C and D) rather than the decks with lower frequencies of losses (B and D; cf. Steingroever et al., 2013). An examination of individual data further revealed that only 8% of the sample (Participants 9, 10, 45, and 46) exhibited a preference for Decks B and D over other decks in the first 100 trials, and only one (Participant 9) maintained this pattern of preference in the second 100 trials.

Steingroever et al. (2013) introduced a new descriptive analysis plotting the mean proportion of switches from one deck to another made during each block of trials. **Figure 3D** presents the corresponding data for the present study (though here each block is 20 trials in length whereas in Steingroever et al. each block was 10 trials). In contrast to most of the studies reviewed by Steingroever et al., mean switching appeared to decrease over blocks, suggesting that, on average, preferences stabilized as the task progressed.
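
One way to compute the proportion of switches per block is sketched below. The function name is hypothetical, and the handling of block boundaries is an assumption: a trial counts as a switch when its deck differs from the deck chosen on the immediately preceding trial, so the very first trial of the task has no comparison and is excluded from the first block's denominator.

```python
def switch_proportions(choices, block_size=20):
    """Proportion of trials in each block on which the deck changed.

    A switch on trial t means choices[t] != choices[t - 1]. The first
    trial of the task has no predecessor and is left out of the count.
    """
    # switches[i] compares trial i + 1 with trial i (0-indexed).
    switches = [int(cur != prev) for prev, cur in zip(choices, choices[1:])]
    proportions = []
    for start in range(0, len(choices), block_size):
        # Keep the comparisons whose second trial falls inside this block.
        lo = max(start - 1, 0)
        hi = min(start + block_size, len(choices)) - 1
        in_block = switches[lo:hi]
        proportions.append(sum(in_block) / len(in_block) if in_block else 0.0)
    return proportions


# Hypothetical participant: constant switching, then perseveration on Deck C.
example_choices = list("ABCD" * 5) + list("C" * 20)
example_switching = switch_proportions(example_choices)
```

A decline in these values across blocks, as in the hypothetical sequence above, is the pattern that would be read as preferences stabilizing.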

<sup>6</sup>Note that one of the studies in Steingroever et al.'s review (Overman et al., 2004) administered 200 trials instead of the standard 100, so it is questionable whether it should have been included. Excluding Overman et al., the mean net scores in Trials 1–100 of the present study are the seventh highest (alongside Bechara et al., 1998) of the 31 studies reviewed.

## Individual Data

Group analyses may conceal important individual differences in behavior; therefore individual data were examined in more detail. **Table 2** indicates that, as hypothesized, net scores showed high inter-individual variability, reflected in high standard deviations (relative to means) and ranges (when averaged across 100-trial epochs). The standard deviation was higher in Trials 101–200 than in Trials 1–100 because 44 participants improved their scores (with six obtaining perfect scores) but six participants actually obtained lower scores in the second epoch. In the first epoch, 30% of the sample scored as low as VMPFC patients according to Bechara et al.'s (2001) criterion, consistent with previous studies (Steingroever et al., 2013). However, in the second epoch, only 16% remained in this category. Thus, as predicted, running the IGT for 200 trials evidently provided participants with more opportunity to learn the contingencies and thus reduced the number of participants classified as poor decision makers. Increasing the number of trials to 200 did not require an unreasonable amount of time; participants took an average of 13.72 min (SD = 2.68) to complete the task.
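
The classification step can be illustrated with a short sketch. The thresholds are those given in the note to Table 2 (proportional net score below 0.10 for Bechara et al.'s, 2001, criterion, and below 0.00 for the stricter criterion of Steingroever et al., 2013); the function name and the example scores are hypothetical and are not data from the present study.

```python
def proportion_impaired(net_scores, threshold=0.10):
    """Share of participants whose proportional net score is below threshold."""
    if not net_scores:
        raise ValueError("net_scores must not be empty")
    return sum(score < threshold for score in net_scores) / len(net_scores)


# Hypothetical proportional net scores for four participants.
example_scores = [-0.20, 0.05, 0.10, 0.40]
impaired_bechara = proportion_impaired(example_scores, threshold=0.10)
impaired_strict = proportion_impaired(example_scores, threshold=0.00)
```

Note that a score exactly at the threshold is not counted as impaired, because both criteria are stated as strict inequalities.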

We examined individual learning trajectories to establish whether individual differences in learning rate and deck preference contributed to the inter-individual variability evident when the data are aggregated into epochs of 100 trials (first two rows of **Table 2**). Initial visual inspection of individual participant profiles (see Supplemental Materials) suggested that, consistent with Steingroever et al.'s (2013) analysis, participants took varying amounts of time to develop strong preferences, with many failing to do so even by 200 trials. Moreover, different participants developed stable preferences for different decks, or pairs of decks.

TABLE 2 | Mean net scores (expressed as proportions of number of trials) in the present study, variability statistics, and proportions of participants satisfying criteria for impaired performance.

*<sup>a</sup>The criterion used to classify a participant as "impaired" was originally defined by Bechara et al. (2001, p. 384) as an overall net score < 10 (i.e., net score < 0.10 in proportional terms), based on norms calculated from VMPFC patients. Most subsequent studies have also adopted this criterion; however, Steingroever et al. (2013) used a stricter criterion of net score < 0.00 (in proportional terms), also shown here for comparison.*

To quantify these visual observations, a stability criterion (see Baron and Perone, 1998) was devised for the IGT. While no specific operant criterion was suitable for the IGT, an analogous approach was taken. Specifically, a participant was considered to show a strong preference for a single deck when the proportion of choices from that deck during a block was (a) at least 0.50 and (b) at least 0.25 greater than the proportion of choices from any other deck. For participants who did not prefer one particular deck, a strong preference for a pair of decks was assumed when (a) the sum of the proportion of choices from the two decks during a block was at least 0.75 and (b) the preference for each of the two decks differed by less than 0.25. Behavior was considered stable when the same preference was maintained for three consecutive blocks (i.e., 60 trials).
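
The stability criterion can be sketched in code as follows. This is an illustrative reading of the rules above, not the analysis script used in the study: the function names are hypothetical, blocks are assumed to be 20 trials long, and reporting the block in which the third consecutive matching block is completed is a convention chosen here. The comparisons are done on integer counts rather than proportions to avoid floating-point rounding at the 0.25, 0.50, and 0.75 cut-offs.

```python
from collections import Counter

DECKS = ("A", "B", "C", "D")


def block_preference(block):
    """Return the strongly preferred deck(s) in one block, or None."""
    n = len(block)
    counts = Counter(block)
    ranked = sorted(DECKS, key=lambda deck: counts[deck], reverse=True)
    top, second = ranked[0], ranked[1]
    gap = counts[top] - counts[second]
    # Single deck: proportion >= 0.50 and >= 0.25 above every other deck.
    if 2 * counts[top] >= n and 4 * gap >= n:
        return (top,)
    # Pair of decks: summed proportion >= 0.75 and the two differ by < 0.25.
    if 4 * (counts[top] + counts[second]) >= 3 * n and 4 * gap < n:
        return tuple(sorted((top, second)))
    return None


def first_stable_block(choices, block_size=20, run_length=3):
    """Return (block number, preference) for the first stable run, or None.

    The block number is 1-indexed and refers to the last block of the
    first run of `run_length` consecutive blocks with the same preference.
    """
    starts = range(0, len(choices) - block_size + 1, block_size)
    prefs = [block_preference(choices[s:s + block_size]) for s in starts]
    for end in range(run_length - 1, len(prefs)):
        window = prefs[end - run_length + 1:end + 1]
        if window[0] is not None and all(p == window[0] for p in window):
            return end + 1, window[0]
    return None


# Hypothetical participants.
learner = list("ABCD" * 5) + list("C" * 60)   # samples evenly, then settles on C
pair_chooser = list("CD" * 30)                # alternates between C and D
stable_learner = first_stable_block(learner)
stable_pair = first_stable_block(pair_chooser)
```

Under these assumptions a participant who never holds the same preference for three consecutive blocks yields `None`, corresponding to the participants who did not meet the criterion.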

**Figure 4A** indicates that individual participants achieved stable preferences at varying times between the second and eighth block of trials. Only 54% of participants met the stability criterion by 100 trials, while 72% of participants had done so by 160 trials (however, note that 4% of those who met the stability criterion preferred bad decks). The results are consistent with the view that healthy individuals differ widely in learning rate, and that many require more than 100 trials to develop a strong preference. Indeed, 28% did not meet the stability criterion even after completing 200 trials. Note that the new stability criterion is stricter than Bechara et al.'s (2001) criterion (**Table 2**), which only classified 16% of participants as impaired.

**Figure 4B** summarizes preferences for each deck and each pair-wise combination of decks according to the stability criterion. Of those who developed strong preferences, almost all preferred the good decks: 34% of the sample preferred Deck C, 24% preferred Deck D, and 10% preferred both C and D approximately equally. There were two exceptions: One participant strongly preferred Deck A, whilst another preferred Decks B and D.

The individual differences in learning rate and deck preferences apparent in **Figure 4** contributed to high inter-individual variability (**Table 2**) due to the influence of the low net scores achieved by many participants. For example, in Trials 1–100, half the sample failed to master the task: 46% (23 participants) did not learn to prefer the good decks, while 4% (2 participants) developed preferences for bad decks. To examine the effect on performance in Trials 1–100 of controlling these two factors, analysis was restricted to the 50% of participants who mastered the IGT within 100 trials (see first row of **Table 3**). Compared to the entire sample (first row of **Table 2**), the mean net score was considerably higher and variability lower, supporting the hypothesis that individual differences in learning rate and deck preference contribute to poor performance and high inter-individual variability in the standard 100-trial IGT.

In operant research, differences in learning rate are controlled by discarding early learning data from analysis and focusing on preferences after subjects have satisfied the stability criterion. Similarly, to obtain a more accurate impression of how well healthy participants perform on the IGT, early learning trials should be excluded. In the present study, 68% of participants had achieved stable preferences for good decks by Trial 160; therefore data for these participants in the final 40 trials of the 200-trial IGT were examined (second row of **Table 3**). The mean net score of participants at stability was 0.87, and 13 of the 34 participants achieved scores of 1.0, suggesting certainty. We conclude that, given sufficient time to learn the task, the majority of healthy participants are able to perform extremely well on the IGT. Nevertheless, 28% failed to develop a strong preference even after 200 trials—this was investigated further in Experiment 2.

## Discussion


Historically, IGT studies have not placed high importance on procedural details, evidenced by the wide variety of procedures used (Areias et al., 2013), and the tendency not to report details of the experimental task in method sections. In the present study we report evidence supporting the hypothesis that the variations in IGT task complexity and instructions found in the literature may contribute to inter-study variability. Specifically, our close replication of Bechara et al.'s (1999) computerized experimental task and instructions resulted in mean net scores in Trials 1–100 that were comparable to Bechara et al. (1998, 1999), in contrast to the relatively low scores reported in the majority of IGT studies reviewed by Steingroever et al. (2013).

Also like Bechara et al. (1994), but in contrast to many of the IGT studies reviewed by Steingroever et al. (2013), our group data for the first 100 trials showed no evidence of the tendency to avoid frequent punishers (i.e., the frequency-of-losses effect; Dunn et al., 2006; Lin et al., 2013). Rather, the descriptive data for our first 100 trials (see **Figure 3B**) resembled the pattern of strong preferences for Decks C and D in Bechara et al.'s (1994) control group, and only four individuals clearly preferred Decks B and D over other decks in the first 100 trials. It is unclear why the frequency-of-losses effect was not observed in the present study. We can only speculate that payment of participants (admittedly a departure from Bechara et al.'s original implementation) might have led them to be more averse to risky Deck B, although previous studies found that payment of real money in the IGT had little effect on performance (e.g., Bowman and Turnbull, 2003; Carter and Pasqualini, 2004; Fernie and Tunney, 2006). Thus, while it seems unlikely that payment affected performance in this case, future work that systematically varies task factors is needed to determine precisely which aspects of the procedure are critical to achieving reliably high net scores<sup>7</sup>.

TABLE 3 | Mean net scores (expressed as proportions of number of trials) and variability statistics in participants who developed stable preferences for the good decks (C and/or D) in the present study.

As hypothesized, mean net scores exhibited the high variability typically found in IGT studies, along with the common finding that many healthy participants (30% in the present study) perform as poorly as VMPFC patients in the standard 100-trial IGT (Steingroever et al., 2013). Also as predicted, increasing the number of trials greatly improved performance—in Trials 101–200 the number of participants who performed in the range of VMPFC patients was nearly halved to 16% of the sample. The improvements in performance after 100 trials replicated the findings of the few papers that have examined the effect of extending the number of trials (e.g., Fernie and Tunney, 2006; Buelow et al., 2013; Lin et al., 2013). Thus, our study adds to a growing chorus that the existing standard 100-trial IGT may be inappropriate for clinical assessment, as it classifies a disproportionate number of healthy participants as impaired.

Our novel analysis of individual data supported the hypothesis that the inter-individual variability in IGT net scores is likely attributable to individual differences in learning rate and, to some extent, to differing preferences for individual decks. When learning rate was controlled by adopting an operant approach—that is, restricting analysis to stable data from participants who met the stability criterion—inter-individual variability in net scores was reduced relative to the group as a whole. This observation suggests that the misclassification of many healthy individuals as impaired may reflect wide differences in individual learning rates. Indeed, while some participants required only 40 trials to develop stable preferences for the good decks, others required 160 trials. Moreover, contrary to Brand et al.'s (2007) assertion that most participants have a good idea of the contingencies by 40–50 trials, only 16% of our participants developed stable preferences by 40 trials. We suggest that preferences should be analyzed only when they have stabilized in the majority of participants, which appeared to take at least 160 trials in the present sample. Nevertheless, 160 trials was close to the limit of 200 trials, and it is possible that given more trials, a larger majority of participants would have mastered the task. Further work is required to clarify the appropriate absolute trial limit in the IGT that allows the majority of healthy individuals to develop stable preferences.

Note that due to the novelty of this analytic approach to the IGT, the basic stability criterion used herein was somewhat exploratory, and therefore our conclusions are tentative. For instance, the stability criterion employed here will not detect a late change in preference once an earlier preference has stabilized (examining the individual data, approximately seven participants showed late changes in preference after 60 stable trials). Given the wide range of learning rates across individuals, a higher absolute trial limit (e.g., 300 or 400 trials) would ideally be combined with a dynamic stability criterion—that is, where the task would halt once stability was reached in each participant. This would help prevent loss of engagement in the task, which we speculate may have led some of our participants to begin experimenting with other decks again after they had reached stability.

One might argue that applying a stability criterion and discarding learning trials from analysis defeats the purpose of the IGT—the designers of the IGT may have deliberately limited the task to only 100 trials because they intended the IGT to be an implicit measure of learning rate. The underlying assumption is that atypical sensitivity is likely to be associated with slower learning. However, slower learning is not necessarily due to atypical sensitivity—in the present study, 18% of healthy participants did not develop strong preferences for good decks until the second 100 trials. In Buelow et al.'s (2013) study, the equivalent figure was 26.5% (though a different calculation was used). Buelow et al. speculated that these slow learners may exhibit a different type of decision-making deficit, albeit less severe than those who never develop strong preferences. However, given that the slow learners represent approximately 20–25% of healthy participants, arguably the more parsimonious explanation is that this simply represents normal individual variability in learning rate (i.e., the upper tail of the distribution of learning rates). By disregarding learning data, we don't wish to imply that learning rate is not of interest in its own right. However, when examining learning rates, due to individual variability it is advisable to (1) analyze individual learning curves rather than the group learning curve, and (2) focus on behavior prior to attaining stable preference (given that stable preference is assumed to be a function of sensitivity, not learning per se).

<sup>7</sup>Unfortunately, few clues can be found in previous published IGT studies, which generally provide little to no detail about the experimental task. However, we note that in some studies (e.g., Chiu and Lin, 2007; Fernie, 2007) the spatial positions of the card decks were randomized to control location bias, so an interesting initial avenue of research may be to investigate the effects of spatial position.

The individual analyses further extend prior IGT work by classifying participants into sub-groups according to stable deck preference. Previous studies aggregated all participants and made conclusions based on group data. For example, both Buelow et al. (2013) and Lin et al. (2013) reported an overall preference for Deck D in Trials 101–200 (though by the third 100 trials in Lin et al.'s study, mean preference for C and D was almost equal), while our group data showed an overall preference for Deck C in the second 100 trials. However, focusing on group data can lead to an inaccurate perception of the preferences of healthy participants. Examination of our individual data revealed that participants fell clearly into three main sub-groups: about a third preferred Deck C, a quarter preferred Deck D, and 10% showed an approximately equal preference for Decks C and D. The small sample size in the present study restricts further statistical investigation of the characteristics and sensitivity profiles of these sub-groups. However, future work with larger samples might use these sub-groups to investigate several hypotheses. For example, high sensitivity to penalty magnitude may be associated with a preference for Deck C over Deck D (which carries larger penalties); conversely, individuals with a high sensitivity to penalty frequency might prefer Deck D to Deck C (which carries more frequent penalties).

The individual analyses raise a further question that cannot be addressed by the IGT data alone. Although in the second 100 trials only 16% of participants were classified as impaired by Bechara et al.'s (2001) criterion, by our stricter stability criterion 28% of participants failed to exhibit a stable preference for any particular deck or decks, even after completing 200 trials. These participants may have been particularly slow learners—perhaps they would have developed preferences for the good decks if allowed more than 200 trials (e.g., Lin et al., 2013). Alternatively, their weak preferences may be explained by atypical sensitivity to reward and/or punishment (Bechara et al., 2000). Lacking additional physiological (e.g., SCR) or self-report measures, the behavioral data in the IGT (i.e., preference for each deck) cannot be used to compute sensitivity measures. Therefore, Experiment 2 employed a novel operant card task to derive behavioral estimates of sensitivity in participants, facilitating a comparison of poor decision makers with good decision makers.

## Experiment 2

Operant researchers have traditionally investigated choice using concurrent-schedule procedures, in which subjects are presented with choices between two alternatives, one of which may provide rewards with a higher probability (responses are typically rewarded at variable intervals of time). Research has established that organisms ranging from fruit flies (e.g., Zars and Zars, 2009) to human beings (e.g., Takahashi and Shimakura, 1998) approximately match their preference for an alternative to the proportion of rewards received (once they have learned the contingencies, and behavior has stabilized). For example, if one alternative provides 75% of the rewards then approximately 70–75% of responses will be made to that alternative. This phenomenon, first quantified by Herrnstein (1961) and subsequently dubbed the matching law, allows researchers to derive an estimate of sensitivity to reward based on choice behavior.
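
In symbols, strict matching can be written as below. The notation is introduced here for illustration only: $B_1$ and $B_2$ denote the numbers of responses made to the two alternatives, and $R_1$ and $R_2$ the numbers of rewards obtained from them.

$$
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
$$

In the example above, $R_1 / (R_1 + R_2) = 0.75$, so strict matching predicts a response proportion of 0.75; an observed proportion of 0.70 would be slightly less extreme than the reward proportion.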

To measure sensitivity, subjects typically complete a series of conditions, each with a different ratio of rewards arranged on the two alternatives. A linear function is then fitted to the log-transformed response ratios and reward ratios, with the slope of the line yielding a quantitative measure of sensitivity. Thus, sensitivity is defined as the degree to which the subject's relative choices change when the ratio of rewards changes. The supplemental materials contain an overview of this approach and provide the generalized matching law equation and its derivation (see Supplement C). For an introduction to the matching law, please see Poling et al. (2011). For more complete coverage, refer to Davison and McCarthy (1988).
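
The fitting step can be illustrated with a short sketch. It assumes the standard form of the generalized matching law (Baum, 1974), $\log(B_1/B_2) = a \log(R_1/R_2) + \log c$, where $a$ is sensitivity and $\log c$ is bias; the notation in Supplement C may differ. The function name, the reward ratios, and the simulated response ratios are hypothetical and are not the conditions or data of the present study.

```python
import math


def fit_generalized_matching(response_ratios, reward_ratios):
    """Least-squares fit of log10(B1/B2) = a * log10(R1/R2) + log10(c).

    Returns (a, log_c): the slope a is the sensitivity estimate and the
    intercept log_c is the bias toward alternative 1.
    """
    if len(response_ratios) != len(reward_ratios) or len(reward_ratios) < 2:
        raise ValueError("need at least two paired ratios")
    xs = [math.log10(r) for r in reward_ratios]
    ys = [math.log10(b) for b in response_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("reward ratios must vary across conditions")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical conditions: reward ratios from 1:9 to 9:1.
example_reward_ratios = [1 / 9, 1 / 3, 1.0, 3.0, 9.0]
# Simulated undermatching: response ratios generated with a = 0.8 and no bias.
example_response_ratios = [r ** 0.8 for r in example_reward_ratios]
sensitivity, bias = fit_generalized_matching(
    example_response_ratios, example_reward_ratios
)
```

A slope of 1 corresponds to strict matching, a slope below 1 (as in the simulated data) to undermatching, and a slope near 0 to choice that is insensitive to the reward ratio.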

A limitation of applying operant procedures to human participants is the large number of sessions normally required for each condition, which may lead to loss of engagement (e.g., Buskist et al., 1991). Davison and Baum (2000) introduced a new procedure in which a range of reward ratios is presented to participants as a series of components, or mini-conditions, within a single session. This approach was adopted by two recent human operant studies (Lie et al., 2009; Krägeloh et al., 2010) as an efficient way to measure sensitivity while keeping the task relatively short and thus maintaining participant engagement. While in Krägeloh et al. and Lie et al. only a single dimension was measured (sensitivity to reward frequency), the present study extends measurement to three additional dimensions: sensitivity to reward magnitude, sensitivity to punishment frequency, and sensitivity to punishment magnitude.

In Experiment 2, a sub-sample of participants completed our novel operant concurrent-schedule task—the Auckland Card Task (ACT). The generalized matching law (Baum, 1974; see Supplement C)<sup>8</sup> was fitted to data, yielding behavioral estimates of sensitivity to reward magnitude, reward frequency, punishment magnitude, and punishment frequency. We hypothesized that sensitivity estimates would increase systematically as participants learned the contingencies; nevertheless, based on previous human operant research (see Kollins et al., 1997), we predicted that there would be considerable individual variability in sensitivity.

Utilizing participants' IGT performance from Experiment 1, we investigated whether individuals who persistently performed poorly in the IGT (i.e., those who did not prefer one or both good decks even after 200 trials) would exhibit measurable differences in sensitivity in the ACT (e.g., hypersensitivity to reward or hyposensitivity to punishment; Bechara et al., 2000) relative to those who performed well on the IGT.

## Method

## Participants

Thirty of the 50 participants from Experiment 1 also completed the ACT (Participants 21–50; 15 males)<sup>9</sup>, a more-than-adequate sample size for the individual-level analyses employed in human operant research (Lie et al., 2009; Krägeloh et al., 2010). The mean age of this sub-sample was 21.07 years (SD = 3.89).

<sup>8</sup>The generalized matching law is an empirical model of behavior which, in contrast to cognitive models such as the expectancy-valence model (Busemeyer and Stout, 2002), makes no assumptions regarding underlying cognitive processes.

<sup>9</sup> Participants 1–20 completed a pilot version of the ACT, during which the contingencies and instructions were adjusted to optimize reliability and validity. Because of the variation in task factors during piloting, data from Participants 1–20 were excluded from ACT analyses.

## Experimental Task

The ACT was developed in Presentation (Version 16.1, Neurobehavioral Systems Inc., Albany, CA, USA). Like the IGT, the ACT required participants to choose between decks of cards containing both rewards and punishers. Aside from this superficial similarity, the ACT differed considerably from the IGT. Only two decks of cards were presented (**Figure 5**) and rewards/penalties were delivered probabilistically (see **Table 4**) rather than according to fixed schedules (e.g., the pre-ordered decks in the IGT). Participants were told that each deck contained hundreds of playing cards, and that some winning and losing cards had been shuffled into both decks. Like the IGT, screen instructions (see Supplemental Materials) included a hint as to strategy, which differed depending on the condition. For example, in Condition 1, the hint was:

Winning cards can be found in both decks, but one deck has MORE winning cards than the other. Both decks also contain an equal number of losing cards. In each round, to maximize your score in the time given, you'll first need to figure out which deck has more winning cards in it, then choose more often from that deck.

Participants made deck selections with their dominant hand by pressing the left or right control keys (the spacing of the control keys discouraged rapid alternation between decks). Most of the time, choosing a deck resulted in the brief display (200 ms) of a random playing card (from a standard 52-card deck) on top of the deck to simulate the top card being flipped over. However, occasionally (when determined by the dynamic scheduling algorithm; see Supplement E) a winning or losing card was displayed for 1000 ms and the bar graph was updated proportionately. If the score dropped below zero, the bar turned from green to red and extended to the left instead of to the right. A smiley face was displayed on winning cards above the amount won, and a "ding" sound was played. Losing cards featured a sad face and a "buzzer" sound. Both card decks were the same color; however, the deck color varied between conditions.

Although the instructions gave the impression that flipping cards faster would help participants win more money, in reality winning and losing cards were scheduled according to two separate concurrent variable-interval (VI) schedules. That is, a key press yielded a reward (or penalty) only when a variable interval of time had elapsed since the previous reward or penalty.
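To make the variable-interval logic concrete, here is a minimal Python sketch of a single VI schedule. It is not the authors' scheduling algorithm (which is described in Supplement E and also involves dependent scheduling and changeover delays); the class name, the exponential interval distribution, and the fixed response pacing are all assumptions made for illustration. The 8-s mean interval and the 8-min session length are borrowed from the task description.

```python
import random


class VariableIntervalSchedule:
    """Arms an outcome after a random interval; the next response collects it."""

    def __init__(self, mean_interval_s, rng):
        self.mean_interval_s = mean_interval_s
        self.rng = rng
        # Time (in seconds) at which the next outcome becomes available.
        self.armed_at = self._draw_interval()

    def _draw_interval(self):
        # Assumption: intervals are exponentially distributed around the mean.
        return self.rng.expovariate(1.0 / self.mean_interval_s)

    def respond(self, now_s):
        """Return True if a response made at time now_s collects an outcome."""
        if now_s >= self.armed_at:
            # The next interval is timed from the outcome just delivered.
            self.armed_at = now_s + self._draw_interval()
            return True
        return False


def count_outcomes(response_interval_s, session_s=480.0, mean_interval_s=8.0, seed=0):
    """Count outcomes collected when responding at a fixed pace for one session."""
    schedule = VariableIntervalSchedule(mean_interval_s, random.Random(seed))
    collected = 0
    t = 0.0
    while t <= session_s:
        if schedule.respond(t):
            collected += 1
        t += response_interval_s
    return collected


# Compare four responses per second with one response per second.
print(count_outcomes(0.25), count_outcomes(1.0))
```

Because outcomes are gated by elapsed time rather than by the number of responses, responding faster in this sketch changes the number of outcomes collected far less than it changes the number of responses made.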

## Experimental Conditions

Each participant completed four different conditions (**Table 4**). Each condition allowed for behavioral estimates of sensitivity to one of four independent variables (reward frequency, reward magnitude, penalty frequency, and penalty magnitude) to be obtained. In each condition three dimensions were held constant and equal across the two decks, whilst the independent variable of interest was systematically varied across four components to provide data points for generalized matching law linear regressions. For example, in Condition 1 sensitivity to reward frequency was measured by varying the reward frequency ratio over four components, whilst reward magnitude remained constant and equal on both decks (\$50), as did penalty frequency (0.5) and penalty magnitude (\$30). In the first component of Condition 1, the reward frequency ratio was 1:3; that is, there were three times more rewards available on Deck 2 (15 rewards) than on Deck 1 (5 rewards). In the second component the ratio was approximately 1:2 (13 rewards on Deck 2 vs. 7 rewards on Deck 1). In the third and fourth components these ratios were reversed.

Within each condition, the four components were presented in random order, with a rest break between each component, during which the hint was repeated. Participants pressed the space bar when they wished to continue. Each component continued until the participant had received all scheduled rewards and penalties, or until 8 min had elapsed since the beginning of the component, whichever occurred first. The net reward for each component was identical in every condition; thus each participant was scheduled to win exactly the same amount in each condition (provided the component did not time out before they had received all the scheduled rewards and penalties).

In each condition, the reward and penalty schedules ran independently of one another (following Critchfield et al., 2003). To ensure that participants received the proportions of rewards and penalties that were arranged for each deck, dependent scheduling (Stubbs and Pliskoff, 1969) was used. Further details of the scheduling algorithms are provided in the supplemental materials (Supplement E). At the end of each condition participants were informed of their total play money winnings. A random amount was added to or subtracted from this total so participants wouldn't realize that, despite their efforts, they were winning exactly the same amount in each game (this was because identical net rewards were arranged for each condition).


TABLE 4 | Summary of conditions in the Auckland Card Task.

*Comp., Component (presented in random order). All conditions were dependent concurrent VI VI schedules with variable-interval timing and 2-s changeover delays. Conditions 1 and 2 were VI 8-s rewards; VI 20-s penalties. Conditions 3 and 4 were VI 20-s rewards; VI 8-s penalties. Reward and penalty schedules ran conjointly and independently of one another (see Critchfield et al., 2003). Conditions 2 and 4 imposed variable-magnitude rewards or penalties (indicated by "v").*

## Procedure

The ACT was presented after the IGT, and the four conditions (**Table 4**) were presented in counterbalanced order across participants<sup>10</sup>.

## Results

#### Group Data

**Figure 6** follows the standard approach to analysing learning in Davison and Baum's (2000) experimental paradigm, showing mean sensitivity as a function of blocks of successive rewards or penalties. The graphs in **Figure 6** can be considered analogous to **Figure 3A** for the IGT, but here the dependent variable is mean sensitivity rather than mean net score. The sensitivity estimates in **Figure 6** are the averages of individual sensitivity estimates, derived by fitting the generalized matching law (see Supplement C) to the log response ratios and log reward/penalty ratios for each block across all four components. The approach taken follows that of Lie et al. (2009). Specifically, the log response ratio for each block was calculated based on all responses made during the block (e.g., from the start of the component to the fourth reward/penalty; from the fourth reward/penalty to the eighth reward/penalty, etc.). The log reward/penalty ratio for each block was based on all rewards/penalties received from the start of the component to the end of the block (e.g., from the start of the component to the fourth reward/penalty; from the start of the component to the eighth reward/penalty, etc.).

Unfortunately, participants did not always receive all the rewards and penalties scheduled by the task. At times, participants developed a strong or exclusive preference for one deck, which caused the dependent-scheduling algorithm to suspend further rewards or penalties on that deck, and eventually the component terminated after reaching its maximum time limit (8 min). Typically, it was not until the fourth or fifth block that participants began to exhibit a strong or exclusive preference. As a consequence, sensitivity estimates for some individuals in later blocks were very high and in some cases could not be calculated (i.e., where zero responses on one deck resulted in a zero denominator in the generalized matching law), resulting in some missing data (particularly in Condition 4). The missing data are evidenced in **Figure 6** by larger standard errors in some later blocks.
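The block-wise calculation and the missing-data problem described above can be sketched as follows. This is a hedged illustration rather than the authors' analysis code: the function names and dictionary keys are hypothetical, deck 1 is arbitrarily placed in the numerator, and an undefined ratio is represented by `None`.

```python
import math


def safe_log_ratio(n1, n2):
    """Return log10(n1 / n2), or None when either count is zero."""
    if n1 <= 0 or n2 <= 0:
        return None  # ratio undefined, treated as missing data
    return math.log10(n1 / n2)


def block_log_ratios(blocks):
    """Compute per-block log ratios for one component.

    blocks: list of dicts with the counts observed within each block:
      'resp1', 'resp2' -> responses on deck 1 and deck 2
      'out1', 'out2'   -> rewards (or penalties) received on each deck
    Returns a list of (log_response_ratio, log_outcome_ratio) tuples.
    The response ratio uses only the block's own responses; the outcome
    ratio is cumulative from the start of the component.
    """
    results = []
    cum1 = cum2 = 0
    for block in blocks:
        cum1 += block["out1"]
        cum2 += block["out2"]
        results.append(
            (safe_log_ratio(block["resp1"], block["resp2"]),
             safe_log_ratio(cum1, cum2))
        )
    return results


# Hypothetical component: exclusive preference for deck 2 in the second block.
example = [
    {"resp1": 10, "resp2": 20, "out1": 1, "out2": 3},
    {"resp1": 0, "resp2": 30, "out1": 1, "out2": 3},
]
print(block_log_ratios(example))
```

In the second block the response ratio is returned as `None`, which mirrors the situation in which a sensitivity estimate cannot be computed because one deck received no responses.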

**Figure 6** shows that, as hypothesized, sensitivity estimates tended to be very low at the beginning of a component, but increased rapidly across blocks of rewards or penalties as participants learned the contingencies, approximately leveling out toward the last two blocks at the end of the component. To determine whether the increases in sensitivity apparent in **Figure 6** were statistically significant, for each condition a repeated-measures ANOVA was carried out, with block as the within-subjects factor. In each condition, a significant main effect of block was found (Condition 1, F = 9.54, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.26; Condition 2, F = 7.35, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.25; Condition 3, F > 0.590, p < 0.001, η<sub>p</sub><sup>2</sup> > 0.25; Condition 4, F = 2.76, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.15). In Condition 1, a significant linear trend [F<sub>linear</sub>(1, 27) = 20.81, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.44] was apparent, while Condition 2 showed significant linear [F<sub>linear</sub>(1, 22) = 13.79, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.39] and quadratic trends [F<sub>quadratic</sub>(1, 22) = 13.04, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.37], consistent with the concave-downward shape of **Figure 6B**. Conditions 3 and 4 were affected by missing data; nevertheless, significant linear trends were found for both Condition 3 [F<sub>linear</sub>(1, 17) = 19.83, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.54] and Condition 4 [F<sub>linear</sub>(1, 16) = 5.48, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.26].

<sup>10</sup>Following Conditions 1–4 all participants also completed two additional conditions (Conditions 5 and 6) in which reward and penalty magnitudes were further manipulated; however, these conditions were irrelevant to the hypotheses examined in the current study and were therefore excluded from analysis.

## Individual Data

As expected, individual participants exhibited considerable variability in sensitivity, particularly in conditions in which reward or penalty magnitude was varied: in the final block of each condition, the lowest sensitivity estimate for an individual was -1.16 (obtained in Condition 2), whilst the highest sensitivity was 5.19 (in Condition 4)<sup>11</sup>. In order to provide reliable individual sensitivity values for the cross-task analysis (below), it was necessary to compute a measure of sensitivity for each individual that represented stable behavior in the ACT (i.e., deck preference after the contingencies have been learned). Piloting on 20 participants<sup>9</sup> had already established that most participants developed a strong preference for the good deck after receiving around 8–12 rewards or penalties; therefore, no formal stability criterion was defined. Rather, **Figure 6** was visually inspected to determine approximately when sensitivity reached stable levels in the group data (i.e., when minimal bounce and trend were apparent from block to block; see Baron and Perone, 1998). Based on these observations, we averaged the sensitivity estimates for each individual across the last two blocks in Condition 1, and across the last three blocks in Conditions 2–4.

### Cross-Task Analysis

To investigate whether non-learners in the IGT showed differences in sensitivity relative to those who mastered the IGT, we compared ACT sensitivities of persistent poor decision makers (participants who failed to meet the stability criterion in Experiment 1 after 200 trials; n = 9) with those of good decision makers (participants who learned to prefer Deck C and/or Deck D within 200 trials in the IGT; n = 19). **Figure 7** shows mean ACT sensitivity for the two IGT sub-groups plotted as a function of each of the four ACT conditions. In consideration of the small and unequal sub-group sizes, two-tailed non-parametric Mann-Whitney U-tests were run. Good decision makers in the IGT exhibited significantly higher sensitivities to reward magnitude (U = 40.00, p = 0.035, r = −0.41) and penalty magnitude (U = 19.00, p = 0.025, r = −0.49) than poor decision makers. Thus, it appears that participants who developed no strong preferences on the IGT (and hence achieved low net scores) also exhibited lower sensitivity to the magnitude of both rewards and penalties in the ACT. There were no significant differences between the good and poor decision makers in their sensitivity to the frequency of rewards (U = 72.00, p = 0.643, r = −0.09) or penalties (U = 66.00, p = 0.595, r = −0.10).
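For readers unfamiliar with the statistics reported above, the following Python sketch computes a Mann-Whitney U statistic and converts it to an effect size using r = Z/√N under the normal approximation. The source does not state how r was computed, so this convention (with no tie or continuity correction) is an assumption. The function names and the data are hypothetical, and no p-value is computed.

```python
import math


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: count of (a, b) pairs with a > b; ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def effect_size_r(u, n_a, n_b):
    """Effect size r = Z / sqrt(N), with Z from the normal approximation to U.

    Assumption: no tie correction and no continuity correction are applied.
    """
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return z / math.sqrt(n_a + n_b)


# Hypothetical sensitivity estimates, for illustration only.
poor = [0.1, 0.3, 0.2, 0.5]
good = [0.6, 0.9, 0.4, 1.1, 0.8]
u_poor = mann_whitney_u(poor, good)
r = effect_size_r(u_poor, len(poor), len(good))
print(u_poor, round(r, 2))
```

A negative r in this sketch indicates that the first group tends to have lower values than the second, which matches the direction of the sign convention used in the text.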

#### Discussion

Group data showed that, as expected, participants generally exhibited strong increases in preference (indexed by sensitivity) for the deck yielding higher net rewards as each component progressed, consistent with Krägeloh et al. (2010) and Lie et al. (2009). On average, preference stabilized after participants received approximately 8–12 rewards or penalties in most conditions.

<sup>11</sup>In comparison, Lie et al. (2009) obtained individual sensitivity values ranging from -0.02 to 2.17. Traditional animal operant studies of choice generally yield sensitivity estimates between about 0.80 and 1.00 (Baum, 1979; Taylor and Davison, 1983).

The hypothesis that poor decision makers in the IGT would exhibit differences in sensitivity relative to good decision makers was also supported: Poor decision makers (i.e., those who failed to develop strong preferences for the good decks in the IGT after 200 trials) exhibited significantly lower sensitivity to the magnitudes (but not the frequencies) of rewards and penalties in the ACT. That is, while poor decision makers had little difficulty determining whether rewards or penalties occurred more often on one deck than the other in the ACT, they were poor at discriminating the average dollar amounts of rewards and penalties on each deck.

It is unlikely that low sensitivity to reward magnitude would have influenced deck choice in the IGT, as it was presumably trivial to discriminate which decks provided higher rewards (rewards in each deck were invariant in both frequency and magnitude). However, Dunn et al. (2006) noted that performance in the IGT depends primarily on participants' ability to avoid decks which impose higher penalties on average; therefore a low sensitivity to punishment magnitude may have influenced IGT performance. That is, participants who had difficulty determining which deck imposed larger penalties on average in the ACT would likely have had similar difficulties in the IGT, in which penalties also occurred relatively infrequently, irregularly, and (on two of the decks) with variable magnitudes.

Previous researchers offer four different hypotheses to explain poor performance on the IGT: Bechara et al. attribute poor IGT performance to hypersensitivity to reward, hyposensitivity to punishment, or myopia for the future (Bechara et al., 2000, 2002; Bechara and Damasio, 2002). Other authors have suggested poor performance is due to a preference for the decks with lower frequencies of losses (Decks B and D; Steingroever et al., 2013). Reconciling these explanations of poor IGT performance with the current findings presents a challenge—while IGT studies (e.g., Bechara et al., 2000, 2002) have sometimes used physiological instruments such as SCR to measure overall sensitivity to reward (and punishment), the operant procedure used here has allowed us to decompose sensitivity into the finer-grained levels of frequency and magnitude.

In the ACT, the frequency-of-losses effect would presumably be reflected in a higher sensitivity to the frequency of punishers in poor IGT performers. However, we found no such pattern in the ACT, and little evidence of a frequency-of-losses effect in the IGT. Hypersensitivity to reward would likely manifest in the ACT as higher sensitivity to reward frequency or magnitude in poor IGT performers, which was not observed. We found some evidence of hyposensitivity to punishment, as evidenced by lower sensitivity to punishment magnitude in poor IGT performers, but notably this did not extend to frequency. Our pattern of results is most compatible with Bechara et al.'s (1994) myopia for the future, which can arguably be formalized as a low sensitivity to the magnitudes of events (whether they are rewards or punishers) that occur infrequently. In the ACT, poor IGT performers exhibited lower sensitivity to the magnitudes of both rewards (Condition 2) and punishers (Condition 4). Both occurred infrequently due to the variable-interval scheduling.

This interpretation is tentative given the small and unequal sub-samples in the ACT-IGT cross-task analysis and the considerable individual differences in sensitivity. The high variability precludes employing the ACT in its current form as a clinical diagnostic tool to identify poor decision makers. To compete with the IGT, any new tool would have to be more effective than the IGT at dissociating impaired and non-impaired participants, and would require norming using large samples of healthy and clinical participants. Nevertheless, with further development the ACT may be useful in the experimental domain as it has enabled us to disambiguate sensitivity to magnitude and frequency, and demonstrate that lowered sensitivity to the magnitude of events, be they rewarding or punishing, is associated with poor IGT performance.

## General Discussion

A coherent picture emerges from the two complementary experiments presented in this study. When a replica of Bechara et al.'s (1999) standard computer-based IGT (including the original instructions) was administered to healthy participants for an additional 100 trials, the majority (84%) achieved scores high enough to distinguish them from VMPFC patients, based on Bechara et al.'s (2001) criterion. When our stricter stability criterion was applied, it became apparent that most participants (68%) developed strong, stable preferences for good decks by Trial 160. Nevertheless, nearly a third (28%) failed to meet the stability criterion, and the choice behavior of these participants was characterized by frequent switching between decks throughout the task (rather than the strong preference for bad decks often exhibited by clinical participants; e.g., Bechara et al., 1994). Sensitivity measurements derived using the ACT suggested that the frequent switching shown by these participants may be related to a difficulty determining which decks impose penalties that are larger (on average), when those penalties vary in size and occur at unpredictable intervals.

While the atypical sensitivities found in these participants may represent a genuine decision-making deficit, it is also possible that the low sensitivity estimates might be a result either of very slow learning or some other confound; for example, loss of engagement (i.e., poor attention and motivation; Dunn et al., 2006) or inappropriate conscious strategy (e.g., risk appetite/aversion or superstitious behaviors; Skinner, 1948; Dunn et al., 2006). Without normed data on how many trials are required for IGT mastery (see Experiment 1 Discussion), we cannot rule out the possibility that some or all of these participants simply required more trials to reach stability.

Alternatively, the low sensitivity may have been due to loss of engagement in the task. Poor attention to the contingencies of reward and punishment is a documented problem in human operant research (Kollins et al., 1997), perhaps because operant tasks typically require large numbers of responses in order to derive reliable estimates of sensitivity. In the present study, participants may have found the ACT particularly tedious in comparison to the IGT, which provided rewards for every response. Thus, poor attention may have led to the near-zero (reflecting equal preference for both alternatives) sensitivity estimates in some individuals in the ACT. Similarly in the IGT, poor decision makers may not have attended properly to the reward and punishment contingencies. Nevertheless, from anecdotal observations, poor decision makers expressed apparently genuine disappointment and frustration at their low (or negative) winnings in the IGT, suggesting they were not inattentive. Additionally, in the ACT the mean sensitivity to reward and punishment frequencies in poor decision makers was not significantly different from that of good decision makers, suggesting attention to task at least in Conditions 1 and 3.

A second potential confound in concurrent-choice tasks such as the ACT and IGT is the development of inappropriate strategies or superstitious behavior. In the ACT this may be reflected in negative sensitivity (reflecting a preference for the poorer alternative), while in the IGT it may manifest in a preference for "risky" decks (Dunn et al., 2006). This confound may be mitigated in human participants with careful use of instructions; however, it can be difficult to strike the appropriate balance between instructions that give away too much information about the contingencies (leading to rapid learning and exclusive preference for the better alternative) and instructions that give too little (leading to no strong preference). The powerful influence of instructions on performance highlights the importance of standardizing instructions in the IGT, in which even healthy participants perform poorly unless the instructions specifically urge them to ". . . stay away from bad decks" (e.g., Balodis et al., 2006).

Notwithstanding the potential impact of confounding factors, the present study narrows the focus for future investigation of poor IGT performance in healthy participants: What type of deficit could give rise to difficulties in tracking the average size of punishers on each deck? Brand et al. (2007) found that performance in later trials in the standard 100-trial IGT was correlated with measures of executive performance; could low sensitivity to reward and punisher magnitudes reflect an executive deficit, rather than an affective decision-making deficit? Future researchers may wish to consider screening participants for issues such as dyscalculia (see Butterworth et al., 2011), and administering additional post-study measures of numeracy to poor decision makers.

In conclusion, while the IGT has firmly established itself as the standard for studying decision making, and is widely used in both experimental and clinical settings, we offer three recommendations for its future application. First, to mitigate inter-study variability it is important that the task is properly standardized; that is, researchers should only use Bechara et al.'s (1999) original experimental task and instructions, or a close replica thereof. Second, to control for individual differences in learning rate that contribute to inter-individual variability, the task should be continued for a minimum of 200 trials (though more work is needed to determine the optimal limit), and only stable data should be analyzed (early trials reflecting trial-and-error learning are unreliable). Third, IGT analyses should not be limited to aggregated data (i.e., participants, decks, and trials); important insights may potentially be gained from analysis at more detailed levels. Finally, while the ACT task used here has limitations, a cross-disciplinary approach, in which methods and models from behavioral economics and operant psychology are leveraged, may have potential in advancing the study of human decision-making deficits, particularly in its ability to quantify sensitivity and break it down into dimensions such as frequency and magnitude.

## Funding

This study was supported by The University of Auckland Performance-Based Research Fund.

## Acknowledgments

PB was supported by a University of Auckland Masters Scholarship; DRA was supported by a Rutherford Discovery Fellowship (RDF-UOA1001).

## Supplementary Material

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpsyg.2015.00391/abstract

## References


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Bull, Tippett and Addis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## A potential role of reward and punishment in the facilitation of the emotion-cognition dichotomy in the Iowa Gambling Task

## *Varsha Singh\**

*Humanities and Social Science, Indian Institute of Technology, Delhi, India*

#### *Edited by:*

*Yao-Chu Chiu, Soochow University, Taiwan*

#### *Reviewed by:*

*Uma R. Karmarkar, Harvard Business School, USA; Cendri Hutcherson, California Institute of Technology, USA*

#### *\*Correspondence:*

*Varsha Singh, Humanities and Social Science, Indian Institute of Technology-Delhi, Hauz Khas, New Delhi – 110016, India e-mail: vsingh.iitb@gmail.com*

The Iowa Gambling Task (IGT) is based on the assumption that a decision maker is equally motivated to seek reward and avoid punishment, and that decision making is governed solely by the intertemporal attribute (i.e., preference for an option that produces an immediate outcome instead of one that yields a delayed outcome is believed to reflect risky decision making and is considered a deficit). It was assumed in the present study that the emotion- and cognition-based processing dichotomy manifests in the IGT as reward and punishment frequency and the intertemporal attribute. It was further proposed that the delineation of emotion- and cognition-based processing is contingent upon reward and punishment as manifested in the frame of the task (variant type) and task motivation (instruction type). The effects of IGT variant type (reward vs. punishment) and instruction type (task motivation induced by instruction types: reward, punishment, reward and punishment, or no hint) on the intertemporal and frequency attributes of IGT decision-making were analyzed. Decision making in the reward variant was equally governed by both attributes, and significantly affected by instruction type, while decision making in the punishment variant was differentially affected by the two attributes and not significantly impacted by instruction type. These results suggest that reward and punishment manifested via task frame as well as the task motivation may facilitate the differentiation of emotion- and cognition-based processing in the IGT.

**Keywords: Iowa Gambling Task, instructions, decision making, intertemporality, reward-punishment**

## **INTRODUCTION**

The Iowa Gambling Task (IGT; Bechara et al., 1994) is widely used to examine the interaction of emotion and cognition in foresighted decision making under conditions of risk and uncertainty. The task tests long-term decision making, and it is believed that inputs from emotion-based processing are beneficial rather than an impediment to long-term decision making (which is otherwise believed to be a purely cognition-based process). The IGT offers a choice among four decks of cards, labeled A′, B′, C′, and D′. The four decks differ in two ways: (a) the net outcome across time (i.e., intertemporal attribute), whereby decks A and B are poor long-term choices and decks C and D are safe long-term choices; and (b) the frequency of immediate reward and punishment irrespective of net or long-term outcomes (i.e., frequency attribute), whereby decks A and C could be perceived as poor choices due to frequent punishments/infrequent rewards and decks B and D could be perceived as safe choices due to infrequent punishments/frequent rewards.

Task performance was originally believed to depend entirely on the intertemporal attribute (i.e., the choice of immediate outcomes over delayed outcomes is considered disadvantageous), and to disregard frequency of reward and punishment. The reward and punishment schedule of the IGT was assumed to be cognitively impenetrable (i.e., neither the frequency nor the long-term payoff/outcome was believed to be cognitively processed), which implied the following: (1) that reward and punishment are indistinguishable from each other and weigh equally, and (2) that decision making is solely driven by the intertemporality of the task choices (i.e., irrespective of reward/punishment, the choice of delayed outcomes over immediate outcomes is considered advantageous). To rule out the sensitivity to reward and punishment as an alternate explanation for myopic decision making in the IGT, Bechara et al. (2000a,b) tested the first implication by comparing intertemporal decision making in two types of IGT variants: the original reward variant (A′B′C′D′) that has "rewards" as a prominent outcome and a punishment variant (E′F′G′H′) that has "loss/punishment" (see Appendix B for variant details) as a prominent outcome. It was demonstrated that decision making was governed by the intertemporal attribute irrespective of the frame or type of IGT variant; in other words, the reward and punishment frame of the IGT variant did not affect intertemporal decision making. However, one study (Maia and McClelland, 2004) of the IGT reward variant showed that participants exhibited knowledge of the reward and punishment schedules (specifically of long-term outcome), which indicates these schedules are cognitively penetrable in the IGT reward variant. Moreover, in another study of the IGT reward variant, the frequency of reward and punishment, rather than intertemporality, was found to control decision making (Lin et al., 2007). This evidence negates the assertion that intertemporality is the sole factor governing decision making in the IGT reward variant, and supports a role of reward and punishment in IGT decision making. This influence of reward and punishment on IGT decision making, however, is still largely unclear.

It is assumed in this paper that IGT decisions are based on both frequency of reward and punishment, and intertemporality, and that these two attributes reflect emotion- and cognition-based processing, respectively. It is contended that the role of reward-punishment in the form of IGT variant type and task motivation toward reward and punishment is to differentiate emotion-cognition processing in the IGT. Decision making based on the intertemporal attribute might require the recollection of previous outcomes to determine which decks produced net gains over the trial periods, and therefore might require cognitive resources and involve working memory. On the other hand, decision making based on the frequency attribute imposes no such demand on cognitive resources. Therefore, decision making based on the intertemporal attribute might require cognitive activity, whereas decision making based on the frequency attribute may reflect activity in the emotion-based system. Indeed, Stocco et al. (2009) found a double dissociation in decision making based on the two attributes, suggesting that intertemporal decision making demands cognitive resources and that the two attributes reflect an emotion-cognition dichotomy.

Others have observed that intertemporal decision-making reflects explicit learning (Maia and McClelland, 2004); is dependent on hippocampus-mediated memory systems, such as the declarative memory system (Gupta et al., 2009); and engages working memory (Hinson et al., 2002). Conversely, decision making based on the frequency attribute may reflect automatic processing (Wilder et al., 1998; Stocco et al., 2009), which is indicative of emotion-based processing. Support for this dichotomy comes from the dual-process theory of reasoning, which suggests the existence of two systems that process information differently. One system is automatic, emotion-based, and concerned with the present, whereas the second is reflective, cognition-based, and concerned with the future (Tversky and Kahneman, 1971). Therefore, it was assumed in the present study that IGT decision making based on the frequency of reward and punishment reflects automatic emotion-based processing, and decision making based on intertemporality reflects cognitive processing: thus, the two attributes, respectively, reflect emotion- and cognition-based processing. However, it is not yet known which factor determines the dichotomization of emotion-cognition-based processing in the IGT.

In the present study, it is proposed that the frame of the IGT variant and the task motivation toward reward and punishment might influence the differentiation of emotion-cognition-based processing. Contrary to the assumption that intertemporal decision making is not influenced by the frames of the IGT variant (Bechara et al., 2000a), it has been observed that intertemporal decision making is more strengthened in the punishment variant than in the reward variant (e.g., Bechara et al., 2000b, 2002; Must et al., 2006, 2007; Verdejo-Garcia et al., 2006). In one such study, it was observed that the punishment variant, which produces a "loss" outcome for every choice (whereas the reward variant produces a "gain" outcome for every choice), was more conducive to the intertemporal attribute [i.e., cognition-based processing; (Singh and Khan, 2012)]. It was suggested that because the punishment/loss variant triggers risk-seeking while the "reward" variant induces risk-aversion, the punishment variant might require greater cognitive processing than the reward variant. Greater activity in the cognition-based system suggests greater differentiation of emotion-cognition processing in the punishment variant. Therefore, it was expected that the IGT variant type would affect the dichotomization of emotion-cognition-based processing in IGT decision making.

Similar to the assumption that the frame of the IGT variant does not affect intertemporal decision making (Bechara et al., 2000a), the task instructions are also based on an assumption that IGT decision making has equal reward- and punishment-related motivation; the instructions are bidirectional in nature, prompting the decision maker to seek reward as well as avoid punishment. However, contrary to the assumed bi-directionality of task motivation, it has been observed that intertemporal decision making in the IGT is dependent on avoiding punishment rather than seeking reward. For instance, Fernie and Tunney (2006) found that a portion of the instructions that advised the avoidance of "bad" cards was necessary for intertemporal decision making in the reward variant because omission of that portion resulted in poor intertemporal decision making. The omitted part was as follows: "All I can say is that some decks are worse than others. You may find all of them bad, but some are worse than others are. No matter how much you find yourself losing, you can still win if you stay away from the worst decks." Similarly, Balodis et al. (2006) simplified the instructions by excluding a part that advised subjects to avoid "bad" cards. The simplified instructions were as follows: "In this card game there are four decks of cards. You can draw cards from any of the decks. Every time you click on [*sic*] card, you will win some play-money. With some card draws you will lose money as well. The object of the game is to win as much play-money as possible, or avoid losing as little of the money as possible. You will begin the game with \$2000." These simplified instructions resulted in poor intertemporal decision making, but the reinstatement of the warning resulted in improvement (Balodis et al., 2006). In a previous study by the present author, it was observed that intertemporal decision making in the IGT reward variant is differentially affected by task motivation toward reward and task motivation toward punishment because a unidirectional version of the standard bidirectional instructions enhanced intertemporal decision making (Singh and Khan, 2012). It has been suggested that the unidirectional instructions (i.e., only to seek reward or to avoid punishment) are less taxing on working memory; this results in more efficient cognition-based processing, and consequently increases intertemporal decision making (Singh and Khan, 2012). According to dual-process theories, efficient cognition-based processing inhibits emotion-based processing (Tversky and Kahneman, 1971; Evans, 2003), and this inhibition may result in more differentiated emotion-cognition-based processing. Therefore, in addition to variant type (reward and punishment variant), it was expected that task motivation toward reward and punishment might also affect the differentiation of emotion-cognition processing in the IGT.

Thus, in the present study, it is explored whether varying the reward and punishment frame via variant and/or instruction type affects the emotion-cognition dichotomy, as tested via the two attributes in IGT decision making. It was hypothesized that IGT variant type and task instruction type would influence which attribute governed IGT decision-making.

## **MATERIALS AND METHODS**

#### **SAMPLE**

Three hundred and twenty healthy undergraduate and graduate students volunteered for the study (mean age = 23.82 years; *SD* = 3.25; male = 160). All participants had more than 18 years of education. Most of the participants were right-handed (86.1%) and non-smokers (93.6%).

#### **DESIGN**

This study used a 2 (reward variant: intertemporal and frequency attributes) × 2 (punishment variant: intertemporal and frequency attributes) × 4 (instruction type: avoid punishment, seek reward, standard, and no hint) design. The two net scores obtained via the two attributes (attribute type) in the two variants (variant type) were the within-subjects variables, and instruction type was the between-subjects variable. The order of variant type presentation was counterbalanced and the sample was gender-balanced; neither presentation order nor gender affected the results (*p* > 0.5).

IGT decision making was analyzed according to the "net score" method (Bechara et al., 1994), in which one total net score was calculated for the five blocks. It is customary to analyze IGT performance using five block-wise net scores rather than one total net score of the five blocks because this method allows for the comparison of participants' learning rate across blocks of trials. However, the focus of the present research at this stage was to differentiate intertemporal decision making (believed to reflect cognition-based processing) from the frequency attribute (believed to reflect emotion-based processing) and to test if the variant type and instruction type affected the differentiation of the two attributes.

To calculate an index of the intertemporal attribute in the reward variant, the numbers of cards drawn from decks A′ and B′ were added, and their sum was subtracted from the number of cards drawn from decks C′ and D′: (C′ + D′) − (A′ + B′). This was done for each block of 20 trials, and the scores for the five blocks were added to obtain a total net score for the reward variant. The formula used to calculate the intertemporal attribute index in the punishment variant was (E′ + G′) − (F′ + H′). The formula used to calculate the frequency attribute index for the reward variant was (B′ + D′) − (A′ + C′); for the punishment variant, it was (F′ + G′) − (E′ + H′).
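The four indices described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the software used in the study: the function name `net_score`, the `INDEX_DECKS` table, and the representation of a participant's choices as a list of deck letters (primes omitted) are all assumptions made for the example.

```python
from collections import Counter

# Deck groupings taken from the formulas in the text: for each
# (variant, attribute) pair, cards from the first group are counted
# positively and cards from the second group are subtracted.
# Deck letters are written without primes (hypothetical encoding).
INDEX_DECKS = {
    ("reward", "intertemporal"): (("C", "D"), ("A", "B")),
    ("reward", "frequency"): (("B", "D"), ("A", "C")),
    ("punishment", "intertemporal"): (("E", "G"), ("F", "H")),
    ("punishment", "frequency"): (("F", "G"), ("E", "H")),
}


def net_score(choices, variant, attribute, block_size=20):
    """Return (block-wise net scores, total net score) for one participant.

    `choices` is the ordered list of deck letters selected on each trial.
    """
    plus, minus = INDEX_DECKS[(variant, attribute)]
    block_scores = []
    for start in range(0, len(choices), block_size):
        counts = Counter(choices[start:start + block_size])
        score = sum(counts[d] for d in plus) - sum(counts[d] for d in minus)
        block_scores.append(score)
    return block_scores, sum(block_scores)


# Hypothetical participant: 60 picks from deck C, then 40 picks from deck A.
example = ["C"] * 60 + ["A"] * 40
print(net_score(example, "reward", "intertemporal"))
print(net_score(example, "reward", "frequency"))
```

Because the same sequence of choices feeds both indices, a participant can score positively on one attribute and negatively on the other, which is what allows the two attributes to be compared within a variant.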

#### **MATERIALS**

The computerized IGT progressive reward (A′, B′, C′, D′) and progressive punishment (E′, F′, G′, H′) variants were used. The progressive variant is slightly different from the original IGT in that it exaggerates the future outcome; that is, it increases the magnitude of long-term rewards in the advantageous decks and long-term punishments in the risky decks (Bechara et al., 2000a). Four sets of IGT instructions were used: (1) instructions that prompted the decision maker to seek reward (Reward), (2) instructions that prompted the decision maker to avoid punishment (Punishment), (3) the routinely used bidirectional instructions that prompt the decision maker to seek reward and avoid punishment (Standard), and (4) instructions that contained no prompts toward either reward or punishment (No hint; see Appendix A).

### **PROCEDURE**

Demographic information was first obtained via questionnaire from each participant. Participants were told that they would be taking part in a decision making experiment where they would be playing/gambling with play-points after which they gave their informed consent. The study was approved by a thesis committee (Research Progress Committee), a departmental committee, and an institute-level committee in charge of overseeing the postgraduate research program. Participants were tested individually in a laboratory and were assigned to one of the experimental conditions. Two IGT variants were presented in a counter-balanced design (i.e., reward variant followed by punishment variant, or vice versa) with one of the four types of instructions (Reward, Punishment, Standard, and No hint). Thus, each participant performed both IGT variants under one type of instruction. Instructions were read before the first variant was presented. After finishing the first variant, a small break was given (5 min). Following this, the same instructions were read for the second variant, and the second variant was presented. When participants had completed both variants, they were debriefed and thanked for their participation in the study.

#### **DATA ANALYSIS**

Data were analyzed using Statistical Product and Service Solutions (SPSS) version 16 (Chicago, IL, USA). The threshold for statistical significance was set to *p* < 0.05.

## **RESULTS**

Mean decision making net scores based on the two attributes (intertemporal and frequency) in the two variants (reward and punishment) across the four types of instructions (reward, punishment, standard, and no-hint) are presented in **Table 1**.

The results of a repeated-measures analysis of variance using the four net scores (obtained on the basis of the two attributes in the two variants) showed a non-significant main effect of attribute type and a significant interaction of instruction and attribute type for the reward variant [*F*(3, 312) = 4.52, η<sub>p</sub><sup>2</sup> = 0.04, *p* < 0.01] (see **Figure 1**). Multiple comparisons for the reward variant using Tukey's Honestly Significant Difference test showed that only the unidirectional instructions for seeking reward differed from the standard and no hint instructions; however, the significance levels of these comparisons were *p* = 0.08 and *p* = 0.09, respectively, indicating marginal significance. A significant main effect of attribute type [*F*(1, 312) = 9.36, η<sub>p</sub><sup>2</sup> = 0.03, *p* < 0.01], but no interaction effect of instruction and attribute type, was observed for the punishment variant (see **Figure 2**).
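As a consistency check on the reported effect sizes, partial eta squared can be recovered from an *F* statistic and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The short sketch below applies that identity to the two *F* values given above; the function name is hypothetical, and the check says nothing about how the original analysis software computed the values.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Instruction x attribute interaction, reward variant: F(3, 312) = 4.52
print(round(partial_eta_squared(4.52, 3, 312), 2))  # 0.04

# Main effect of attribute type, punishment variant: F(1, 312) = 9.36
print(round(partial_eta_squared(9.36, 1, 312), 2))  # 0.03
```

Both recovered values match the reported effect sizes to two decimal places.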


**Table 1 | Descriptive statistics for instruction type, variant type, and attribute type (***n* **= 320).**

*Values shown are means and standard deviations (in parentheses).*

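**Figure 1 | Reward variant (figure not reproduced; caption truncated in source): intertemporal attribute (C′ + D′) − (A′ + B′) and frequency attribute (B′ + D′) − (A′ + C′).** Error bars represent standard errors.

**Figure 2 | Punishment variant (figure not reproduced; caption truncated in source): intertemporal attribute (E′ + G′) − (F′ + H′).** Error bars represent standard errors.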

## **DISCUSSION**

The study examined the effects of task motivation and IGT variant framing on the two attributes of decision making in the IGT. The results indicated that, in the reward variant, decision making was governed equally by both attributes and task instructions interacted with attribute type. In the punishment variant, decision making was differentially governed by the two attributes, and the task instructions did not affect the attributes.

These results are consistent with previous studies that showed that decision making in the reward variant is not solely based on the intertemporal attribute (e.g., Chiu and Lin, 2007; Lin et al., 2007; Chiu et al., 2008), which suggests the influence of more than one attribute on decision making in the reward variant. The present results showed an interaction between task instructions and attribute type in the reward variant, which is consistent with the observation that task instructions—specifically those that advise subjects to avoid "bad" cards—are critical for intertemporal decision making in the reward variant (e.g., Blair and Cipolotti, 2000; Balodis et al., 2006; Fernie and Tunney, 2006). The results suggest that the bifurcation of task motivation toward reward and punishment might be differentially conducive to the two attributes (i.e., it facilitates cognitive or emotional processing), and that it might facilitate dichotomization of the emotion-cognition processing when the IGT is framed in a reward variant.

The differential governing of decision making by the two attributes in the punishment variant suggests a dominance of one attribute. This observation is consistent with previous claims that intertemporal decision making dominates in the punishment variant (e.g., Bechara et al., 2000b, 2002; Must et al., 2006, 2007; Verdejo-Garcia et al., 2006). Therefore, the punishment variant may be more effective than the reward variant at differentiating between emotion- and cognition-based decision making. When the two attributes are well-differentiated, task instructions do not seem to play a critical role. The results further corroborate the assertion that instruction-induced task motivation toward reward and punishment plays a role in the dichotomization of emotion-cognition processing. Results additionally show that task motivation differentially affects decision making in the reward and punishment variants of the IGT. Instructions play an important role in the reward variant, where there is equivocal attribute preference (i.e., undifferentiated emotion-cognition-based processing), but not in the punishment variant, where there is unequal attribute preference (i.e., differentiated emotion-cognition-based processing).

Furthermore, the present results support the earlier observation that bifurcating task instructions into reward-seeking and punishment-avoidance might reduce working memory demands (Singh and Khan, 2012), resulting in more efficient cognition-based processing and inhibition of emotion-based processing (i.e., facilitation of the differentiation between the two attributes); in other words, well-differentiated emotion-cognition-based processing. This explanation (that working memory improves cognition-based processing via well-differentiated emotion-cognition-based processing) is consistent with earlier findings that intertemporal decision making in the IGT is dependent on working memory. For instance, studies have reported that performing a secondary task interfered with working memory and negatively affected intertemporal decision making in the IGT reward variant (Turnbull et al., 2005; Stocco et al., 2009). This implies that one of the ways to rectify intertemporal decision-making impairments, which are synonymous with decision making deficits in a clinical sample (e.g., substance abuse), might be to try to dissociate reward-seeking motivation from punishment-avoidance motivation through the use of unidirectional instructions. Impaired intertemporal decision making is believed to be due to a failure to integrate emotion- and cognition-based processing (e.g., Killgore et al., 2007). An interesting but preliminary theoretical implication of the present results in this regard, which requires further investigation, is the possibility that dissociating rather than integrating emotion-cognition processing might result in better intertemporal decision making in the IGT.

Future studies that examine why the punishment frame of the IGT engages cognition-based processing and consequently facilitates the differentiation of emotion- and cognition-based processing to a greater degree than does the reward frame would be informative. The speculation that the reward and punishment frames of the IGT differentially rely on emotion- and cognition-based processing is consistent with the results of at least one study. In this experiment, the Task of Cups in a reward and punishment frame was used to analyze decision making in patients with a lesion in the amygdala, a brain region that mediates emotional responsivity (Weller et al., 2007). It was observed that participants' decision making was impaired in the reward frame and intact in the punishment frame, suggesting that decision making in the punishment frame might not rely as much on emotion-based processing as does decision making in the reward frame. This supports the present assertion that the loss frame in the IGT might engage cognition-based processing to a greater extent than the reward frame, thus resulting in a more pronounced dichotomy of emotion-cognition-based processing in the loss frame compared with the reward frame.

One limitation of the present study is the lack of accounting for differences in personality (Franken and Muris, 2005) and mood (Suhr and Tsanadis, 2007), which may have affected IGT decision-making. The absence of a real-money reward or a material incentive for participation might be another limitation; however, at least one study has shown that there is no difference in IGT decision making based on whether incentives are real (monetary) or facsimiles (Bowman and Turnbull, 2003). Nevertheless, these limitations should be taken into account when interpreting the findings of this study. The findings of the present study suggest that reward and punishment manipulated via IGT task frame and task motivation play a critical role in IGT decision making, and that role might include the delineation of emotion- and cognition-based processing.

## **ACKNOWLEDGMENTS**

The study was completed in partial fulfillment of the requirements for a doctoral degree at the Indian Institute of Technology-Bombay.

## **REFERENCES**


integrative framework. *Behav. Brain Funct.* 5, 1. doi: 10.1186/1744-9081-5-1


for intact functioning. *Schizophr. Res.* 30, 169–174. doi: 10.1016/S0920-9964(97)00135-7

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 June 2013; accepted: 29 November 2013; published online: 17 December 2013.*

*Citation: Singh V (2013) A potential role of reward and punishment in the facilitation of the emotion-cognition dichotomy in the Iowa Gambling Task. Front. Psychol. 4:944. doi: 10.3389/fpsyg.2013.00944*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Singh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## **APPENDIX A**

Four types of instructions were used in the study: standard (1a and 1b), seek reward (2), avoid punishment (3), and no hint (4a and 4b) instructions.


you choose from the good decks. Moreover, the computer does not change the position of the decks once the game begins. It does not make you win or lose at random, or make you win or lose money based on the last card you picked."


## **APPENDIX B**

1. Deck information in the two IGT variants: The IGT reward variant offers a choice among four decks of cards labeled A′, B′, C′, and D′. Unlike the original paper-and-pencil based task (ABCD), the computerized task (A′B′C′D′) has increased delayed punishment and therefore amplifies the effect of disadvantageous choices (see Bechara et al., 2000a, for differences between the original and the computerized version). Unbeknownst to the decision maker, decks A′ and B′ have high immediate rewards (100 points per card-pick), with 50% of cards drawn from deck A′ giving a loss of 35–100 points and 10% of cards drawn from deck B′ giving a loss of 250 points, such that 10 cards drawn from decks A′ and B′ result in a net loss of 250 points. Decks C′ and D′ have small immediate rewards (50 points per card-pick), with 50% of cards drawn from deck C′ giving a loss of 25–75 points and 10% of cards drawn from deck D′ giving a loss of 250 points, such that 10 cards drawn from decks C′ and D′ result in a net gain of 250 points. The punishment variant offers a choice among four decks of cards labeled E′, F′, G′, and H′. After a card is picked, the "loss" is announced, which at times is followed by a "gain." Decks F′ and H′ give immediate low losses and a low net gain, while decks E′ and G′ give immediate high losses and a high net gain. Long-term advantageous decision making is reflected in choosing high-immediate-loss decks (decks E′ and G′) and avoiding low-immediate-punishment decks. Although both variants offer both rewards and punishments, the prominent outcome in the reward variant is a "win," while that in the punishment variant is a "loss," which underlies the assertion that a positive frame (i.e., "gain") is triggered in the reward variant and a negative frame (i.e., "loss") is triggered in the punishment variant.

2. The graph shows the effects of variant, instruction, and attribute type in IGT decision-making (**Figure A1**).

# Sex-Differences, Handedness, and Lateralization in the Iowa Gambling Task

Varsha Singh\*

*Humanities and Social Science, Indian Institute of Technology Delhi, New Delhi, India*

In a widely used decision-making task, the Iowa Gambling Task (IGT), male performance is observed to be superior to that of females, and is attributed to right lateralization (i.e., right hemispheric dominance). It is as yet unknown whether sex-differences in affect and motor lateralization have implications for sex-specific lateralization in the IGT, and specifically, whether sex-difference in performance in the IGT changes with right-handedness or with affect lateralization (decision valence, and valence-directed motivation). The present study (*N* = 320; 160 males) examined the effects of right-handedness (right-handedness vs. non-right-handedness) as a measure of motor lateralization, decision valence (reward vs. punishment IGT), and valence-directedness of task motivation (valence-directed vs. non-directed instructions), as measures of affective lateralization on IGT decision making. Analyses of variance revealed that both male and female participants showed valence-induced inconsistencies in advantageous decision-making; however, right-handed females made more disadvantageous decisions in a reward IGT. These results suggest that IGT decision-making may be largely right-lateralized in right-handed males, and show that sex and lateralized differences (motor and affect) have implications for sex-differences in IGT decision-making. Implications of the results are discussed with reference to lateralization and sex-differences in cognition.

#### Edited by:

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### Reviewed by:

*Ignacio Saez, University of California, Berkeley, USA William H. Overman, University of North Carolina, Wilmington, USA*

> \*Correspondence: *Varsha Singh vsingh.iitb@gmail.com*

#### Specialty section:

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology*

Received: *22 December 2015* Accepted: *27 April 2016* Published: *31 May 2016*

#### Citation:

*Singh V (2016) Sex-Differences, Handedness, and Lateralization in the Iowa Gambling Task. Front. Psychol. 7:708. doi: 10.3389/fpsyg.2016.00708*

Keywords: decision-making, handedness, Iowa Gambling Task, laterality, reward punishment, task motivation

## INTRODUCTION

The Iowa Gambling Task (IGT; Bechara et al., 1994) is a widely used neuropsychological decision-making task that offers a choice between immediate and long-term gains. The task has been useful in addressing important theoretical issues pertaining to decision neuroscience, for example, the role of working memory and executive function (Bechara et al., 1998; Turnbull et al., 2005), and the nature of insight (implicit or explicit) into the reinforcement (Maia and McClelland, 2004; Bechara et al., 2005). The task has also been instrumental in understanding the role of the prefrontal cortex (PFC) and its sub-regions (e.g., dorsal vs. ventral regions of the PFC; Fellows and Farah, 2005). It has been observed that more males than females make advantageous decisions in this task (Reavis and Overman, 2001; Bolla et al., 2004; van den Bos et al., 2007; see review by van den Bos et al., 2013), and that the right hemisphere seems to be more involved than the left in advantageous decision-making (e.g., Manes et al., 2002; Tranel et al., 2002; Clark et al., 2003; Buelow and Suhr, 2009). Even though sex differences may emerge in the IGT because it seems to be a primarily right-hemisphere task (Bolla et al., 2004), it remains unclear whether sex and lateralization contribute to IGT decision-making (where lateralization is defined as asymmetrical engagement of the two hemispheres of the brain).

Of the three commonly used neuropsychological decision-making tasks (i.e., the IGT, the Cambridge risk task, and the Risk task), the IGT alone shows lateralization (Clark et al., 2003). Decision-making in the IGT has primarily been associated with the right hemisphere (Naccache et al., 2005; Christman et al., 2007). For instance, research on unilateral lesions suggests that functioning of the right hemisphere is largely crucial to decision-making in the IGT (Tranel et al., 2002) because IGT decision-making seems to show greater impairment for right- vs. left-lateralized lesions in the PFC (e.g., Manes et al., 2002; Bark et al., 2005), with lesion size correlating with disadvantageous IGT choices (Clark et al., 2003). However, lesion studies are fraught with problems, such as the absence of strictly lateralized damage, lack of specificity in the lateralization of the damage, and the small numbers of patients with appropriate lesions (Fellows and Farah, 2005). Additionally, brain activation studies rarely indicate that functions are governed by hemispheres on an absolute "all or none" basis; rather, a functional lateralization approach suggests that a function evokes asymmetrical or graded activation across the two hemispheres (Knecht et al., 2000). Furthermore, the limitations of applying the lateralization approach to explain complex tasks, i.e., tasks in which multiple constructs rather than a unitary construct drive performance, should be acknowledged; thus, lateralization might partially (rather than completely) explain sex-differences in decision-making (Rilea et al., 2004). Nevertheless, discounting sex differences in neuroscience, including in studies involving diagnostic tools, such as the IGT, could result in an incomplete understanding of brain and behavior, and psychological disorders (Cahill, 2006). Researchers have observed that consistent sex differences in widely used tasks should be re-examined to understand social issues, such as the link between sex differences in cognitive processing and the under-representation of females in science and engineering (Miller and Halpern, 2014). The present paper aims to understand sex-differences in the IGT using a functional lateralization approach.

It has been observed that there are sex-differences in the extent to which a function asymmetrically implicates a hemisphere. For example, males tend to show greater lateralization of functions compared to females (Inglis and Lawson, 1981; Azari et al., 1995; Bolla et al., 2004). Specifically, language seems to be more strongly left-lateralized in males than in females (Shaywitz et al., 1995), and performance on emotional-face processing tasks is more strongly right-lateralized in males than in females (Bourne, 2005). A recent review addressing sex-differences in the IGT noted that IGT decision-making may be predominantly right-lateralized in males and left-lateralized in females (van den Bos et al., 2013). In fact, advantageous decision-making in the IGT reflects cognitive control wherein a reflective system overrides the impulse to choose immediate rewards, and guides long-term advantageous decision-making (Bechara, 2005), and some studies have observed that cognitive control is largely right-lateralized (Garavan et al., 1999; Aron et al., 2003, 2004; Knoch et al., 2006). Further, right-lateralization of cognitive control seems to differ between sexes; for instance, due to the distinct organization of inter-hemispheric interactions (specifically the morphology of the corpus callosum), males tend to show greater functional lateralization of cognitive control than females (Huster et al., 2011). Compared to other cognitive control tasks (e.g., the Stroop task), the IGT shows the most prominent sex-differences in lateralization, whereby males primarily show right hemispheric activation whereas females show activation predominantly in the left hemisphere (Bolla et al., 2004). Observations from lesion studies also suggest that there are sex and laterality differences in the IGT. For instance, the originators of the IGT (Tranel et al., 2005) compared four males and four females, each with a unilateral lesion on either the left or the right side, and found that right-hemisphere damage leads to decision-making deficits in male patients, whereas damage to the left hemisphere is detrimental in this respect in female patients. Therefore, lateralized activation observed via brain imaging studies as well as IGT deficits observed in unilateral lesion studies suggest that the lateralization of IGT-related decision-making is sex-specific.

Furthermore, cognitive control in IGT decision making seems to be sensitive to punishment, suggesting that affect lateralization, that is, right lateralization of negative emotion and avoidance motivation, and left lateralization of positive emotion and approach motivation (Davidson, 1992, 1995, 2004) might influence IGT decision making. A punishment variant of the IGT was introduced by the originators of the IGT, in which participants are required to choose between high losses and high gains vs. low losses and low gains; the choice of high immediate losses/high long-term gains reflects decisionmaking that is advantageous in the long-term (see **Table 1**). It was expected that healthy normal participants would make advantageous decisions in both decision frames, irrespective of the frame of the decisions, i.e., whether the decision presented was in a "gain" frame of foregoing immediate reward in the reward IGT, or in a "loss" frame of bearing immediate losses in the punishment IGT (Bechara et al., 2000). However, more advantageous IGT decision-making is observed in the punishment IGT than in the reward IGT (e.g., Must et al., 2006, 2007; Verdejo-Garcia et al., 2007), which suggests that the punishment IGT may be conducive to cognitive control. Since few studies use both reward and punishment IGT tasks, it is unclear whether sex-differences in affect lateralization will influence difference in advantageous decision-making across reward and punishment IGT.

Previously, it was observed that the instruction to seek reward rather than the bi-directional instruction to seek reward and avoid punishment contributed to a difference in advantageous decision-making in the two IGTs (Singh and Khan, 2012), and facilitated differentiation between long-term and frequency-based decision-making in the reward IGT (Singh, 2013). This suggested that valence-directedness of task motivation, that is, motivation directed toward either reward or punishment, rather than directed toward both reward and punishment, improves advantageous decision-making in the IGT, possibly due to reduced cognitive processing demands (Singh and Khan, 2012; Singh, 2013). It is also possible that valence-directed motivation triggers much more lateralized activity than (a) motivation directed toward both reward and punishment (which might trigger more bilateral activity), and (b) motivation that is directed toward neither reward nor punishment (which might trigger less lateralized activity). Thus, the valence-directedness of task motivation could generate strong lateralization and might reveal sex-differences. The male advantage observed in the reward IGT has been attributed to greater punishment sensitivity. However, the female disadvantage has been attributed to either greater focus on rewards (Bolla et al., 2004; Evans and Hampson, 2015) or undifferentiated attention toward both reward and punishment (Stout et al., 2005). It is possible that valence-directedness in males triggers lateralized activity, which is conducive to advantageous IGT decision-making, whereas undifferentiated focus on rewards and punishments in females might trigger bilateral activity, which is not conducive to cognitive control.

TABLE 1 | Characteristics of reward IGT (decks A′, B′, C′, and D′) and punishment IGT (decks E′, F′, G′, and H′) (Bechara et al., 2000).

Furthermore, motor laterality, specifically individual differences in right-handedness, could have implications for sex-specific lateralization in the IGT, because sex differences in handedness reflect sex-dependent differences in cerebral organization, which has implications for cognitive functions. For instance, language seems to be strongly left-lateralized in right-handers (Carey and Johnstone, 2014), whereas females seem to show less language lateralization, irrespective of handedness (Hagmann et al., 2006). Similarly, affect lateralization seems sex- and handedness-specific: affect lateralization is observed for right-handedness, but not for left-handedness (Brookshire and Casasanto, 2012), and is more pronounced in males than in females (Wager et al., 2003), reversing with the direction of handedness. Processing of facial emotion is strongly right-lateralized in right-handed males, whereas the relationship between right-handedness and lateralization of facial emotion processing for females is weak or non-existent (Bourne, 2008). Right-handedness influences sex-differences particularly for right-lateralized tasks (Crucian and Berenbaum, 1998). It has been reported that strongly right-handed individuals have less interhemispheric interaction and restricted access to the right hemisphere compared to mixed or non-right-handed individuals (Christman et al., 2004; Propper et al., 2005). It is possible that restricted right-hemispheric access among strong right-handers will influence their IGT-related decision-making and that the effect of restricted right-hemisphere access will differ between the sexes. In other words, strong right-handedness will have implications for sex-specific lateralization of IGT decision-making, particularly if the right hemisphere is critical for IGT-related decision-making.

The present study examined the relationship between sex and motor and affective lateralization, and its effect on cognitive control in IGT decision-making. It was expected that sex and lateralized differences in motor, affect, and cognitive control would have implications for sex-differences in advantageous IGT decision-making. Cognitive control reflected in advantageous decision-making was expected to differ across the reward and punishment IGT variants, and this change was expected to differ according to sex (male vs. female), right-handedness (right-handed vs. non-right-handed), and the valence-directedness of task instructions (directed vs. non-directed). It was expected that there would be an interaction between cognitive control and sex, as well as motor and affective lateralization. That is, lateralization of motor, affect, and cognitive control was expected to drive sex-differences in IGT-related decision-making.

## MATERIALS AND METHODS

## Participants

Three hundred and twenty healthy and medication-free students (mean age = 23.81 years, SD = 3.24; 160 male) volunteered to participate in this study. The experiment was conducted in accordance with the Declaration of Helsinki; all participants provided informed consent, and the study was approved by the Research Committee of the Indian Institute of Technology–Bombay, where the research was conducted.

## Design and Variables

The design consisted of 2 levels of total net scores as a within-participant factor (reward/punishment IGT) × 2 types of task motivation (valence-directed/non-directed) × 2 levels of right-handedness (right-handed/non-right-handed) as between-participant factors. All participants performed both variants of the task (reward and punishment variants, in a counterbalanced order). Half of the sample (n = 160) received valence-directed instructions (i.e., reward-directed [n = 80] or punishment-directed [n = 80]). The other half received valence non-directed instructions (n = 160; i.e., both reward and punishment [n = 80] or no suggestions regarding reward or punishment [n = 80]). Each participant received only one instruction type, and the IGT scores on the two IGT variants served as the within-participant factor.

The number of cards drawn from each of the 4 decks of the reward IGT and from each of the 4 decks of the punishment IGT served as the variable for deck-wise analysis. For block-wise analysis, the number of times each deck was chosen was calculated within each of 5 blocks of 20 trials (i.e., for decks A′ through D′ in the reward variant) to produce block-wise net scores, and a total net score for the reward variant, using the following formula: (C′ + D′) − (A′ + B′). Similarly, in the punishment variant, the total net score was calculated as (E′ + G′) − (F′ + H′).
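The net-score formulas above can be illustrated with a minimal sketch. It assumes that each participant's choices are available as a list of deck letters in trial order; the function and variable names are hypothetical and are not taken from the task software used in the study.

```python
from collections import Counter

# Advantageous ("good") and disadvantageous ("bad") decks per IGT variant,
# following the net-score formulas in the text (primes omitted from labels).
GOOD_DECKS = {"reward": ("C", "D"), "punishment": ("E", "G")}
BAD_DECKS = {"reward": ("A", "B"), "punishment": ("F", "H")}


def net_score(choices, variant):
    """Cards drawn from the good decks minus cards drawn from the bad decks."""
    counts = Counter(choices)
    good = sum(counts[deck] for deck in GOOD_DECKS[variant])
    bad = sum(counts[deck] for deck in BAD_DECKS[variant])
    return good - bad


def blockwise_net_scores(choices, variant, block_size=20):
    """Net score for each consecutive block of trials (5 blocks for 100 trials)."""
    return [
        net_score(choices[start:start + block_size], variant)
        for start in range(0, len(choices), block_size)
    ]


# Illustrative (made-up) sequence of 100 reward-IGT choices.
example = ["A"] * 10 + ["C"] * 10 + ["D"] * 20 + ["B"] * 5 + ["C"] * 15 + ["D"] * 40
block_scores = blockwise_net_scores(example, "reward")
total_score = net_score(example, "reward")
```

By construction, the total net score equals the sum of the block-wise net scores, so either quantity can be derived from the same trial-level record.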

## Procedure

The Edinburgh Handedness Inventory (Oldfield, 1971) was used to determine right-handedness, wherein the inventory score ranges from −100 (left-handedness) to 100 (right-handedness) and a score less than 0 is considered to indicate left-handedness. The current sample had a median score of 80, and scores above 80 reflected strong right-handedness (n = 144); 80 is also the population median score obtained in the original study using a large database (Oldfield, 1971). Handedness is considered a continuous variable; however, studies that test differences between groups use the population median score of 80 (e.g., Christman et al., 2007; Westfall et al., 2010; Lyle and Orsborn, 2011; Westfall et al., 2014). In fact, the inclusion criterion for right-handed participants in lateralization studies is a cutoff of 30; thus, anyone scoring above 30 is considered a right-hander (e.g., Knecht et al., 2000). Grouping on the basis of the median combines mixed- and left-handers into a group of non-right-handers, which allows a comparison between right-handedness and non-right-handedness, rather than simply comparing right- vs. left-handers; the latter is no longer considered the only robust classification of handedness (Prichard et al., 2013).
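The median-split grouping described above can be sketched as follows. The laterality quotient uses the standard scoring of the inventory, 100 × (R − L) / (R + L); the function names are hypothetical, and assigning a score of exactly 80 to the non-right-handed group is an assumption based on the wording "scores above 80".

```python
def laterality_quotient(right, left):
    """Edinburgh-style laterality quotient: 100 * (R - L) / (R + L).

    `right` and `left` are the summed preference marks for each hand.
    """
    total = right + left
    if total == 0:
        raise ValueError("at least one hand preference mark is required")
    return 100.0 * (right - left) / total


def classify_handedness(score, cutoff=80):
    """Median split: scores above the cutoff are right-handed (RH),
    all other scores (mixed- and left-handers) are non-right-handed (NRH).

    Assumption: a score exactly equal to the cutoff is assigned to NRH.
    """
    if not -100 <= score <= 100:
        raise ValueError("inventory scores range from -100 to 100")
    return "RH" if score > cutoff else "NRH"


# Illustrative values only.
groups = [classify_handedness(s) for s in (100, 85, 80, 30, -50)]
```

With the lower cutoff of 30 mentioned in the text, `classify_handedness(score, cutoff=30)` would label a much larger share of participants as right-handed, which is why the choice of cutoff matters for group comparisons.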

Computerized versions of the reward IGT and punishment IGT were used (Bechara et al., 1994). The IGTs were presented in a counterbalanced design with a 5-min break between the 2 variants. Task instructions were administered before presenting the task. Task motivation was manipulated via the task instructions, such that valence-directed instructions urged the participant toward either seeking rewards or avoiding punishments (n = 160). In contrast, non-directed instructions lacked valence-directedness (n = 160). The instructions are provided in the Appendix.

## Data Analysis

All analyses were completed using the Statistical Package for the Social Sciences (SPSS 16, India), with the level of significance set to 0.05, and the data were split by sex. Decision-making in the IGT is commonly analyzed using the "net score" method, wherein the deck choices are aggregated; that is, the total cards drawn from the 2 bad decks are deducted from the total cards drawn from the 2 good decks. The net score method has been criticized (Lin et al., 2007); however, to enable a comparison between the present results and previous findings on sex-differences in the IGT (e.g., Bolla et al., 2004), particularly when the comparison is cross-cultural (e.g., American-Brazilian comparison: Bakos et al., 2010), it was crucial to retain the net score method for analyzing IGT decision-making. Additionally, IGT decision-making was analyzed using individual decks and blocks of trials. Correlations were used to determine whether handedness as a continuous variable is associated with IGT net scores in the reward and the punishment IGT, after partialling out the effects of sex and task instructions. Mixed ANOVAs were used on decks (number of cards drawn from individual decks) and blocks (net scores on a block of 20 trials). Decision-making in the IGT was analyzed separately for the reward and the punishment IGT, followed by a comparison of advantageous decision-making across the 2 IGTs, wherein net scores on the 2 IGTs were considered the within-participants factor (scores consisted of [decks C′ + D′] − [decks A′ + B′] and [decks E′ + G′] − [decks F′ + H′]), and handedness (right-handed vs. non-right-handed) and instruction type (valence-directed vs. non-directed) were considered the between-participants factors. A Huynh–Feldt correction for epsilon values greater than 0.75 was used. Box's test was used to show that the data did not violate the assumption of equality of covariance matrices.
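The idea of partialling out sex and task instructions from the handedness and net-score correlation can be illustrated with the standard recursive formula for partial correlations. This is a sketch only and not the SPSS procedure used by the authors; the data below are made up, and all names are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, controls):
    """Correlation of x and y after partialling out each variable in `controls`.

    Uses the recursion
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    applied once per control variable.
    """
    if not controls:
        return pearson(x, y)
    *rest, z = controls
    r_xy = partial_correlation(x, y, rest)
    r_xz = partial_correlation(x, z, rest)
    r_yz = partial_correlation(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up data: both variables share one component once sex and instruction
# type (coded 0/1) are removed, so the partial correlation is close to 1.
sex = [0, 0, 0, 0, 1, 1, 1, 1]
instruction = [0, 1, 0, 1, 0, 1, 0, 1]
shared = [1, 2, 3, 4, 5, 6, 7, 8]
handedness = [s + 2 * a + b for s, a, b in zip(shared, sex, instruction)]
net_scores = [s - 3 * a + 2 * b for s, a, b in zip(shared, sex, instruction)]
r_partial = partial_correlation(handedness, net_scores, [sex, instruction])
```

The example shows why a simple correlation and a partial correlation can disagree: covariates such as sex and instruction type can mask or inflate an association that appears only once their contribution has been removed.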

## Results

**Table 2** shows descriptive statistics of the sample set. There was no significant correlation between handedness as a continuous variable and net scores on the reward and punishment IGT. Furthermore, partial correlations between handedness and both reward IGT and punishment IGT net scores, wherein the effects of sex were controlled for, failed to reach statistical significance. However, partial correlation controlling for sex and valence-directed task instructions showed that right-handedness was positively correlated with advantageous decision-making in the punishment IGT (r = 0.10; p < 0.05; see **Table 3**). As expected, right-handedness was associated with advantageous decision-making in the punishment IGT variant, and sex and task motivation appeared to be critical to the right-handed advantage in the punishment IGT variant.

Handedness, sex, and task instruction were correlated with advantageous decision-making in the punishment IGT, but the male advantage in IGT decision-making remains to be addressed. To address the male advantage observed in the IGT reward variant in the literature, and to ascertain whether laterality contributes to sex-differences in the IGT, the sample was split on the basis of sex (male vs. female), motor laterality or right-handedness (right-handed vs. non-right-handed), and affect laterality or valence-directedness of task motivation (valence-directed vs. non-directed), and advantageous decision-making in the reward and punishment IGT was compared using 6 mixed ANOVAs (2 ANOVAs for deck analysis and 4 ANOVAs for net score analysis).

*"RH" denotes right-handedness and "NRH" denotes non-right-handedness.*

TABLE 3 | Correlation between handedness and net scores on the reward and punishment IGT, taking into account the effect of sex and valence-directed instructions.

*The results of 3 correlations (simple and partial) between scores on the Edinburgh Handedness Inventory and net scores on the reward IGT and the punishment IGT attained significance when the effects of sex and instructions were partialled out ("*\**", significance level of 0.05). Values denote correlations, with the one-tailed level of significance in brackets.*

ANOVA performed on the 4 decks of the reward IGT for males showed a main effect of deck type [F(2.75, 429.73) = 16.41, p < 0.01; means: deck A′ = 19.84, deck B′ = 28.63, deck C′ = 23.78, deck D′ = 27.76; **Figure 1**]; all interactions were non-significant. Males showed differentiation between the 4 decks, but neither right-handedness nor valence-directed instructions made any contribution to deck preferences in male participants. Females showed a main effect of deck type [F(2.62, 408.68) = 38.43, p < 0.01; means: deck A′ = 18.63, deck B′ = 31.71, deck C′ = 23.28, deck D′ = 26.64], and the 2-way interaction of instruction and deck type was significant [F(2.62, 408.38) = 4.78, p < 0.01]. Valence-directed instruction helped females choose more cards from the good deck D′ (mean = 29.45) and fewer cards from the risky deck B′ (mean = 30.06) as compared to females who had not received valence-directed instructions, who picked fewer cards from deck D′ (mean = 23.84) and more cards from deck B′ (mean = 33.36; **Figure 2**).

ANOVA for the decks in the punishment IGT for males (i.e., decks E′, F′, G′, and H′) showed a main effect of deck type [F(2.68, 418.72) = 10.96, p < 0.01; means: deck E′ = 28.41, deck F′ = 23.36, deck G′ = 27.66, deck H′ = 20.58; see **Figure 3**], and the 2-way interaction of valence-directed instructions and deck type was significant [F(2.68, 418.72) = 3.73, p < 0.05]. In males, the 4 decks were differentiated, and males who had received valence-directed instructions drew more cards from the advantageous deck G′ (mean = 30.85) than males who had received non-directed instructions (mean = 24.57; **Figure 4**). On the other hand, females showed a main effect of deck type [F(2.69, 420.42) = 14.50, p < 0.01; means: deck E′ = 24.83, deck F′ = 27.15, deck G′ = 28.62, deck H′ = 19.40], but all interactions were non-significant. In females, the punishment IGT decks could be differentiated, but the deck choices of females remained independent of right-handedness or valence-directed task motivation (see **Table 4**).

In a previous study utilizing a sample of right-handed males and females, it was observed that sex-differences emerged in the earlier blocks of trials in the reward IGT, such that men's decision-making improved much earlier (i.e., in blocks 1 and 2) than women's (Bolla et al., 2004). Therefore, we analyzed learning across the first 3 blocks of the IGT. Scores on the first 3 blocks of trials (trials 1–20, 21–40, and 41–60) were analyzed by ANOVA to understand whether sex-differences in right-handedness and in valence-directed task motivation contribute to sex-differences in learning advantageous decision-making in the IGT. There was a main effect of blocks in males [F(1.84, 287.03) = 21.15, p < 0.01; means: block 1 = −2.69, block 2 = 0.65, block 3 = 1.55; **Figure 5**]; none of the interactions were significant. Males learned across the 3 blocks of trials, independent of right-handedness and valence-directed instructions. On the other hand, females' decision-making improved across the 3 blocks of trials [F(1.82, 283.80) = 40.04, p < 0.01; means: block 1 = −3.41, block 2 = 1.51, block 3 = 1.24], and the 2-way interaction of blocks and right-handedness was significant [F(1.82, 283.80) = 3.54, p < 0.05; **Figure 6**], as was the interaction of blocks and valence-directedness of task instruction [F(1.82, 283.80) = 5.56, p < 0.01; see **Figure 7**]. In the early blocks of the reward IGT, non-right-handed females (means: block 1 = −3.40, block 2 = 1.77, block 3 = 2.86) made more advantageous decisions than right-handed females (means: block 1 = −3.41, block 2 = 1.28, block 3 = −0.27), and females receiving valence-directed instructions (means: block 1 = −3.49, block 2 = 2.53, block 3 = 3.23) made more advantageous decisions than females receiving non-directed instructions (means: block 1 = −3.33, block 2 = 0.50, block 3 = −0.75).

advantageous decks) in reward IGT decks for females. Error bars represent the standard error of the mean.

ANOVA of blocks 1, 2, and 3 of the punishment IGT data in male participants showed a main effect of IGT blocks [F(2, 312) = 6.58, p < 0.01; means: block 1 = −2.69, block 2 = 0.65, block 3 = 1.55; see **Figure 8**], but none of the interactions were significant. Male participants showed an increase in advantageous decision-making in the early blocks of the punishment IGT, independent of right-handedness and the valence-directedness of instructions. In females, advantageous decision-making differed across the 3 blocks of trials [F(2, 312) = 8.88, p < 0.01], suggesting an increase in advantageous choices from block 1 to block 2 (means: block 1 = −3.41, block 2 = 1.51, block 3 = 1.24). The 2-way interaction of instruction and blocks was significant [F(2, 312) = 6.85, p < 0.01], suggesting that females who had received valence-directed task instructions (means: block 1 = 0.13, block 2 = 4.38, block 3 = 3.25) made more advantageous decisions in the punishment IGT than females who had received non-directed instructions (means: block 1 = 0.23, block 2 = 0.43, block 3 = 0.60; see **Figure 9**; refer to **Table 5** for results of ANOVAs).

When scores of blocks 1, 2, and 3 were separately totaled for the reward and for the punishment IGTs and evaluated by ANOVA, a main effect of IGT type was significant for males, suggesting that males showed a different rate of learning across the 2 IGTs [F(1, 156) = 6.46, p < 0.05; means: reward IGT = −0.49, punishment IGT = 5.08]. None of the interactions were significant, suggesting that neither right-handedness nor task instructions contributed to differences in learning across the 2 types of IGT. On the other hand, females showed a main effect of IGT type, suggesting that learning in the early blocks differed across the 2 IGTs [F(1, 156) = 8.55, p < 0.01; means: reward IGT = −0.66, punishment IGT = 4.50]. The 2-way interaction of IGT type and right-handedness was significant [F(1, 156) = 5.93, p < 0.05]; non-right-handed females made more advantageous decisions in the punishment IGT (mean = 2.05) than in the reward IGT (mean = 1.22); however, right-handed females performed poorly in the reward IGT (mean = −2.40), but made more advantageous decisions in the punishment IGT (mean = 6.77; **Figure 10**).

*"RH type" denotes right-handedness.*

Lastly, the total net scores across the 5 blocks were calculated separately for the reward and for the punishment IGT and were then investigated by ANOVA. A main effect of IGT type [F(1, 156) = 7.44, p < 0.01] was significant for males (means: reward IGT = 3.06, punishment IGT = 12.24); all interactions were non-significant. This suggests that advantageous decision-making by males differed across the reward and punishment IGTs, independent of right-handedness and valence-directed instructions. On the other hand, there was a main effect of IGT type for females [F(1, 156) = 5.82, p < 0.05; means: reward IGT = 0.89, punishment IGT = 8.20], and the interaction of the total net scores and right-handedness was significant [F(1, 156) = 4.14, p < 0.05], wherein non-right-handed females made more advantageous decisions in the punishment IGT (mean = 5.09) than in the reward IGT (mean = 4.03). In contrast, right-handed females performed poorly in the reward IGT (mean = −2.01), but performed very well in the punishment IGT (mean = 11.08; see **Figure 11**; refer to **Table 6** for results of ANOVAs).

## DISCUSSION

The study was aimed at understanding the relationship between sex, motor, and affective lateralization in IGT decision-making, specifically whether sex-differences in advantageous decision-making in the IGT are influenced by sex-differences in motor laterality (i.e., right-handedness) and affect laterality (i.e., valence-directed instruction and IGT type). It has been believed that advantageous IGT decision-making reflects lateralized cognitive control, and that motor and affect lateralization would benefit advantageous IGT decision-making. It was hypothesized that strong right-handedness (motor lateralization), affect or valence-directed task instructions (lateralized motivation), and a punishment frame of the IGT (affect lateralization) would facilitate advantageous IGT decision-making, such that the overall more-lateralized male sex would benefit more from lateralized constructs. Advantageous IGT decision-making was analyzed on the basis of deck choices, as well as on learning across blocks of trials. To the best of our knowledge, this is the first study to consider the role of lateralized constructs, such as right-handedness and affect, in sex-differences in IGT decision-making in both the reward and punishment IGTs. We questioned whether there is a right-handed male advantage in the IGT, and whether it differs across the reward and punishment IGTs.

Correlation analysis suggested that, as the degree of right-handedness increases, there is an increase in advantageous decision-making in the punishment IGT, once the effects of sex and task instruction are accounted for. In line with the contention that IGT decision-making reflects right-lateralized cognitive control, and with the negative valence of the punishment IGT being largely right-lateralized, laterality was expected to benefit from a punishment frame and to result in more advantageous decision-making in the punishment IGT. Next, a series of ANOVAs were performed on data that were split by sex, using either IGT decks or blocks of IGT trials, to understand advantageous decision-making in the reward and punishment IGTs and to test whether right-handedness and valence-directedness of task motivation, as measures of motor and affect laterality, respectively, contribute to advantageous decision-making. Both males and females differentiated between the 4 deck choices of the reward IGT; however, only females benefited from valence-directed task instructions by choosing more from the advantageous decks. The poor performance of females in the reward IGT has been attributed to female preference for the disadvantageous deck B (Overman and Pierce, 2013), and the present results suggest that females who received valence-directed instructions chose less from deck B than did females who received non-directed instructions. There are two explanations for why valence-directed females succeeded in avoiding deck B. (a) It is possible that, for females who received non-directed instructions, deck B, which carries large immediate rewards and infrequent but large losses, seemed ideal for pursuing the twin goals of seeking rewards as well as avoiding punishments; this pursuit of twin goals might have triggered non-lateralized or bilateral activity. (b) It is possible that females who received valence-directed instructions were relieved of some of those demands by pursuing either rewards or avoiding punishments, thereby triggering lateralized activity (either right-lateralized activity in avoiding punishment or left-lateralized activity in seeking rewards). Such lateralized activity is thought to be conducive to cognitive control, resulting in more advantageous decision-making.

TABLE 5 | Result of ANOVAs for the early blocks of the reward and punishment IGT (blocks 1, 2, and 3 as within-subject variable).

*"RH type" denotes right-handedness.*

In the punishment decks, both males and females differentiated between the 4 decks; however, only males seemed to benefit from valence-directed instructions. These results highlight sex-differences in the IGT, and show that females relied on valence-directed task instructions for choosing good decks in the reward IGT, whereas males relied on task instructions for choosing advantageous decks in the punishment IGT. To an extent, this sex difference in reliance on valence-directed instructions might reflect sex-differences in reward and punishment sensitivity, as females tend to be reward-focused, while males tend to be sensitive to losses (Bolla et al., 2004; Evans and Hampson, 2015). Therefore, receiving valence-directed instructions may have benefited females in the reward IGT, which is focused on rewards, whereas valence-directed instructions would have benefited males in the punishment IGT, which is focused on punishments. Unlike the reward IGT, where sex-differences in deck choices have been analyzed in detail, deck choices in the punishment IGT have rarely been discussed. Future studies aimed at attributing sex-differences in IGT-related decision-making to reward-punishment sensitivity should compare decision-making in both the IGTs.

TABLE 6 | Result of ANOVAs for the total of early blocks, and total of 5 blocks of reward and punishment IGT.

*"RH type" denotes right-handedness.*

To understand how IGT decision-making evolves with time and practice across trials, and particularly to test whether sex-differences emerge in the early trials of the IGT, and whether right-handedness and valence-directedness contribute to them, analysis of the first 3 blocks of IGT trials was undertaken separately for the reward and for the punishment IGT. In the reward IGT, advantageous decision-making changed across the first 3 blocks in both the sexes. These findings contradict those of a previous study in which males showed an increase in advantageous decisions in blocks 1 and 2, whereas females failed to show similar learning across the blocks (Bolla et al., 2004). However, these contradictory results might be due to differences in sample size and characteristics; the sample recruited by Bolla et al. (2004) was a smaller sample of right-handed and slightly older males (n = 10; mean age: 32.6 years) and females (n = 10; mean age: 27.5 years), admitted to an inpatient facility for studying neurological differences by means of brain imaging (PET). Although both the sexes learned to make advantageous decisions in the early blocks of the reward IGT in the present study, there were sex-differences in the factors that influenced the rate of learning. Specifically, valence-directed instructions and right-handedness influenced the rate of learning in the reward IGT in females. Interestingly, non-right-handed females made more advantageous decisions in the first 3 blocks of trials than right-handed females, whereas males' performance increased across the early blocks of the reward IGT, irrespective of right-handedness or of task instructions.
This sex-difference may be due to right-handedness being associated with restricted access to the right hemisphere (Propper et al., 2012), and since cognitive control underlying advantageous decision-making in the IGT is believed to be right-lateralized (Garavan et al., 1999; Aron et al., 2003, 2004; Knoch et al., 2006), it is possible that this restricted right-hemispheric access due to right-handedness is more detrimental to females than to males. Furthermore, valence-directed instructions influenced the rate of learning in the early blocks of the reward IGT in females. As mentioned earlier, valence-directed instructions might trigger affect-related motivation, which is lateralized, more so than non-directed instructions, which might lack affect-directedness and hence trigger bilateral or non-lateralized activity, thereby being detrimental for cognitive control and advantageous decision-making. It is also possible that there were no sex differences in the early blocks of the reward IGT, because half of the female sample received valence-directed instructions. Future studies should utilize affect-directed instructions to determine whether the results can be replicated, and whether improvement occurs in female advantageous decision-making in the reward IGT.

Advantageous decision-making changed across the 3 blocks of punishment IGT in both the sexes. Even though punishment IGT is rarely used, the rate of learning seemed to have shown improvement across the early trials in other studies (e.g., Bechara et al., 2000; Must et al., 2006, 2007; Verdejo-Garcia et al., 2007). Females who received valence-directed instructions showed greater learning than females who received non-directed instructions. It is interesting that, even though males benefited from valence-directed instructions while choosing advantageous decks in punishment IGT, valence-directed instructions did not facilitate advantageous decision-making in the early trials of the punishment IGT for males. On the other hand, valence-directed instructions facilitated block-wise advantageous decision-making in the early blocks of both the reward and the punishment IGTs in females.

On comparing learning in the early blocks of the reward and the punishment IGTs, results suggested that both males and females showed different rates of learning across the 2 IGTs; however, interesting sex-differences emerged, as right-handedness contributed to differences in the learning observed in the 2 IGTs. More specifically, right-handed females performed poorly on the reward IGT and performed well in the punishment IGT, thereby showing prominent inconsistencies in advantageous decision-making across the reward and the punishment IGTs. On comparing the total advantageous decisions made across the 5 blocks of the reward IGT with the total advantageous decisions made in the punishment IGT, we found that right-handed females performed disadvantageously in the reward IGT, but performed advantageously in the punishment IGT. Advantageous decisions in the early blocks made by males differed across the reward and punishment IGTs; however, this difference was independent of right-handedness or the instructions given. Assuming that reward and punishment are lateralized, it was expected that valence-directed task instructions that are solely directed toward reward or punishment would benefit advantageous decision-making by triggering much more lateralized activity than non-directed instructions. Accordingly, valence-directed instructions, as a measure of affect laterality, facilitated advantageous decision-making in females, irrespective of whether the reward or punishment form of the IGT was used, and therefore did not contribute to frame-induced inconsistency in advantageous decision-making across the 2 IGTs. In contrast, right-handedness in females resulted in a selective disadvantage in the widely used reward IGT, and thereby contributed to inconsistent advantageous decision-making across the reward and punishment IGTs.
These results add to our understanding of the role of valence-directed motivation in IGT decision-making; it was previously observed that the instruction to seek reward benefited advantageous decision-making selectively in the punishment IGT (Singh and Khan, 2012) and facilitated separating long-term decision-making from frequency-based decision-making selectively in the reward IGT (Singh, 2013), and in the present study it was observed that both types of valence-directed instructions (i.e., only seeking reward, or only avoiding punishment) facilitated advantageous decision-making, irrespective of the IGT frame, as compared to the 2 non-valence-directed instructions (seeking reward as well as avoiding punishment, or no specific direction). Moreover, females seemed to benefit from valence-directed instructions in the early trials of both types of IGTs, probably due to markedly more lateralized activity under valence-directed motivation.

The results of this study highlight interesting similarities and dissimilarities between the sexes. The number of advantageous decisions made by both males and females differed across the reward and punishment IGTs, suggesting that both the sexes showed frame-induced inconsistencies in advantageous decision-making in the IGT, which were not triggered by the type of task motivation in either sex. However, interesting sex-differences emerged, as right-handed females performed poorly in the widely used reward IGT and performed well in the punishment IGT, whereas right-handedness did not confer such a disadvantage in males. This suggests that, since most IGT studies compared right-handed males with right-handed females, and excluded mixed-handed or left-handed participants (e.g., Bolla et al., 2004; Fukui et al., 2005; Knoch et al., 2006; Verdejo-Garcia et al., 2007; Lawrence et al., 2009), it is possible that the right-handed female sample performed poorly in the IGT reward variant compared to the right-handed male sample. No other study has shown a right-handed disadvantage for females in the IGT context; however, a right-handed disadvantage for female participants has been observed in another task of inhibitory control, viz., the Stroop task (Beratis et al., 2010). In other words, right-handedness might contribute to the widely observed sex-differences in IGT decision-making. Since right-handedness influences sex-differences, especially in right-lateralized tasks (Crucian and Berenbaum, 1998), future studies should explore whether the IGT is a right-lateralized task, and specifically whether the IGT has a "right-handed male advantage."

Why would right-handedness matter for sex-differences in IGT decision-making? It appears that sex- and hemispheric differences influence decision-making across species. A male advantage in IGT decision-making is not restricted to human IGT performance, but is also observed in rodent IGT performance (e.g., van den Bos et al., 2012). The observed sex-differences in the rodent IGT have been at least partly ascribed to sex-differences in processing, namely, males show global processing, whereas females are more detail-oriented and show local processing (van den Bos et al., 2012). Since global processing is more right-lateralized (Fink et al., 1997), it implies that decision-making in males might be right-lateralized, which is in line with the observation that right-lateralized behavior (i.e., behavior that is preferentially governed by the right hemisphere) is likely to show prominent sex-differences (Sullivan et al., 2014). In humans, a slight increase in the stress hormone cortisol in females seems to enhance right hemispheric activation and to result in more advantageous decisions in the reward IGT (van den Bos et al., 2009). Interestingly, a temporary decrease in dopamine in healthy males impairs advantageous decision-making in the reward IGT (Sevy et al., 2006), and dopamine asymmetries in humans seem to alter with right-handedness, such that the right hemisphere produces relatively more dopamine (Mohr et al., 2003). The present results suggest that hemispheric differences, represented by handedness, may contribute to sex-differences in IGT decision-making.

Sex-specific lateralization in IGT decision-making is receiving increasing attention in research (e.g., Sutterer et al., 2015). In line with the observation that hemispheric lateralization in the IGT is modulated by sex (Tranel et al., 2005), the results of the present study suggested that right-handedness contributes to sex-differences in IGT-related decision-making. Thus, sex-specific lateralization of cognitive control in the IGT may further be influenced by motor and affective lateralization. These results have to be interpreted in the light of limitations of the study, such as not accounting for disposition (Franken and Muris, 2005) and mood (Suhr and Tsanadis, 2007), which are measures associated with affect lateralization (Davidson, 1995), and are known to influence IGT decision-making. Another limitation of the study is that motor lateralization in terms of degree of handedness was not balanced in terms of sex, as noted in the literature: the left-handed population is largely male (Oldfield, 1971). Additionally, whether assessing sex-differences in the IGT, or in right-handedness, the conclusions drawn are limited by the characteristics of the tool; for example, it has been observed that certain items on the handedness inventory measure the ability to imagine carrying out an action, and hence tap into the ability to form mental images, apart from hand preference (White and Ashton, 1976). It would be interesting to test whether there are sex-differences in the ability to imagine (specifically, motor imagery), and if these differences influence sex-differences in long-term decision-making. Furthermore, response patterns on the Edinburgh Inventory have produced interesting sex-differences per se: it has been observed that, unlike females, males hesitate to use the extreme response (males use "usually," rather than "always") in the rating scale of handedness and hence are more likely to be labeled as mixed-handers, even though their usage of the non-dominant hand may not be that typical (Bryden, 1977). Nevertheless, the inventory is widely used to ascertain right-handedness in most of the IGT studies. Future studies should specifically aim at ascertaining right-lateralization of IGT decision-making in healthy adults, and should ensure inclusion of a gender-balanced left- and mixed-handed sample.

Apart from showing that right-handedness accounted for sex-differences in cognitive control, the present study has added to the growing literature on the inter-relationship between different lateralized constructs; for instance, it has recently been observed that the link between affect, motor, and language lateralization is clear only for right-handers (Costanzo et al., 2015). Therefore, the study is also a response to a recent call to include non-right-handed subjects in investigations targeted at understanding decision-making and lateralized constructs, such as risk, reward, and punishment (Willems et al., 2014). The results of this study also add to the body of knowledge on task-specific characteristics and their implications for our understanding of sex-differences in cognitive processing (Miller and Halpern, 2014).

## AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

## REFERENCES

Iowa Gambling Task? Proc. Natl. Acad. Sci. U.S.A. 101, 16075–16080. doi: 10.1073/pnas.0406666101


decision-making under risk and ambiguity. Neuropsychologia 75, 265–273. doi: 10.1016/j.neuropsychologia.2015.06.015


and marijuana use on decision-making performance over repeat testing with the Iowa Gambling Task. Drug Alcohol Depend. 90, 2–11. doi: 10.1016/j.drugalcdep.2007.02.004


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Singh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## APPENDIX

Two types of instructions were used in the study: (1) valence-non-directed, and (2) valence-directed instructions, for reward and punishment variants of the IGT.


only hint I can give you, which is the most important thing to note, is this: out of these 4 decks of cards, some are better than others. To win, you should try to choose from the good decks. No matter how much you find yourself losing, you can still win the game if you choose from the good decks. Moreover, the computer does not change the position of the decks once the game begins. It does not make you win or lose at random, or make you win or lose money based on the last card you picked."


## Altered dynamics between neural systems sub-serving decisions for unhealthy food

#### *Qinghua He<sup>1,2</sup>, Lin Xiao<sup>2</sup>\*, Gui Xue<sup>3</sup>, Savio Wong<sup>4</sup>, Susan L. Ames<sup>5</sup>, Bin Xie<sup>5</sup> and Antoine Bechara<sup>2</sup>*

*<sup>1</sup> Faculty of Psychology, Southwest University, Chongqing, China*

*<sup>2</sup> Department of Psychology and Brain and Creativity Institute, University of Southern California, Los Angeles, CA, USA*

*<sup>3</sup> National Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China*

*<sup>4</sup> Department of Special Education and Counselling, The Hong Kong Institute of Education, Hong Kong, China*

*<sup>5</sup> School of Community and Global Health, Claremont Graduate University, Claremont, CA, USA*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*V. S. Chandrasekhar Pammi, University of Allahabad, India*

*Sebastien Guillaume, Centre Hospitalier Régional Universitaire de Montpellier, France*

#### *\*Correspondence:*

*Lin Xiao, Brain and Creativity Institute, University of Southern California, 3641 Watt Way HNB B26, Los Angeles, CA 90089, USA e-mail: linxiao@usc.edu*

Using BOLD functional magnetic resonance imaging (fMRI) techniques, we examined the relationships between activities in the neural systems elicited by the decision stage of the Iowa Gambling Task (IGT), and food choices of either vegetables or snacks high in fat and sugar. Twenty-three healthy normal weight adolescents and young adults, ranging in age from 14 to 21, were studied. Neural systems implicated in decision-making and inhibitory control were engaged by having participants perform the IGT during fMRI scanning. The Youth/Adolescent Questionnaire, a food frequency questionnaire, was used to obtain daily food choices. Higher consumption of vegetables correlated with higher activity in prefrontal cortical regions, namely the left superior frontal gyrus (SFG), and lower activity in sub-cortical regions, namely the right insular cortex. In contrast, higher consumption of fatty and sugary snacks correlated with lower activity in the prefrontal regions, combined with higher activity in the sub-cortical, insular cortex. These results provide preliminary support for our hypotheses that unhealthy food choices in real life are reflected by neuronal changes in key neural systems involved in habits, decision-making and self-control processes. These findings have implications for the creation of decision-making based intervention strategies that promote healthier eating.

**Keywords: Iowa Gambling Task (IGT), food choice, Self-Control, Eating, insula**

## **INTRODUCTION**

With the increasing abundance of easily accessible high-calorie foods, an important characteristic of human food choices is the unhealthy consumption of these foods. Such choices can have long-term negative consequences, such as medical problems associated with overweight and obesity. The question is: why do some individuals become insensitive to the future consequences of their unhealthy eating habits and have difficulty making more healthful choices? While some research has found that poorer decision-making capacity may be associated with abnormal eating behaviors, most of these studies have focused on patients with differing forms of eating pathology (Pignatti et al., 2006; Brogan et al., 2010; Danner et al., 2012; Fagundo et al., 2012). In the current study, we evaluate normal individuals who are not medically diagnosed with an eating disorder. We examine the activity of neural systems hypothesized to subserve decision-making, using the Iowa Gambling Task (IGT), as well as the relationship between this neural activity and real-life eating behavior.

Recent work has hypothesized that at least three neural systems influence behaviors involving complex decision-making, especially choices that include conflicts between immediate and long-term consequences (Naqvi and Bechara, 2009; Noel et al., 2013; He et al., 2014a,b). One neural system is thought to mediate habitual behaviors that are elicited spontaneously or automatically. This neural system has been referred to as the "Impulsive System," and key neural regions in this (impulsive) system include the amygdala and ventral striatum (and its mesolimbic dopamine link), which have been found to play a key role in the incentive motivational effects of a variety of non-natural rewards (e.g., psychoactive drugs) and natural rewards (e.g., food) (Stewart et al., 1984; Robbins et al., 1989; Wise and Rompre, 1989; Robinson and Berridge, 1993; Di Chiara et al., 1999; Everitt et al., 1999; Balleine and Dickinson, 2000; Koob and Le Moal, 2001; Dagher, 2009; Dagher and Robbins, 2009). Another neural system relates to executive and inhibitory control, referred to as the "Reflective System," and a critical neural region in the reflective system is the ventromedial prefrontal cortex (VMPFC) region, as well as the medial orbitofrontal cortex (Bechara et al., 2000). However, other neural components, including the dorsolateral prefrontal cortex (implicated in working memory capacity) and the cingulate cortex, are also parts of this neural circuitry, and are essential for the normal operation of the VMPFC (Bechara, 2004; Boorman et al., 2013).

More recent evidence suggests that there is a third neural system mediated through the insular cortex. This pathway plays a key role in translating interoceptive signals into what one subjectively experiences as a feeling of desire, anticipation, or urge (Naqvi et al., 2007; Naqvi and Bechara, 2009). There is evidence demonstrating that the insular cortex is implicated in drug craving (Garavan, 2010). For example, strokes that damage this region eliminate the urge to smoke in individuals previously addicted to cigarette smoking (Naqvi et al., 2007). Additionally, an increasing number of studies suggest that the insula shows exaggerated responsiveness to drug cues in individuals addicted to drugs, and is hyper-reactive to visual food cues in obese individuals (Killgore et al., 2003; DelParigi et al., 2006; Geliebter et al., 2006; Grill et al., 2007; Rothemund et al., 2007; Stoeckel et al., 2008; Brooks et al., 2013; García-García et al., 2013; Tomasi and Volkow, 2013). Finally, a behavioral measure of urgency, defined as an individual's tendency to give in to strong impulses, specifically when accompanied by negative emotions such as depression, anxiety, or anger (Whiteside and Lynam, 2001), has also been shown to positively correlate with insula activity in recent fMRI studies (Joseph et al., 2009; Xue et al., 2010).

Emerging evidence suggests that overweight and obesity represent a special case of addictive behavior, which involves underlying neural mechanisms similar to other addictions (Kelley and Berridge, 2002; Rolls, 2007; Trinko et al., 2007; Volkow et al., 2008; Johnson and Kenny, 2010). Specifically, a hyper-functioning impulsive system, a hypo-functioning reflective system, and/or an altered insula system were suggested by previous empirical studies as potential candidate mechanisms for over-eating behavior (He et al., 2014a,b), which is consistent with proposed theories on behavioral addiction to substances in general (Bechara and Damasio, 2005; Naqvi and Bechara, 2009; Noel et al., 2013). Based on these findings, we hypothesized that a loss of self-control or inability to resist tempting/rewarding foods, and the development of less healthful eating habits (e.g., greater intake of high-calorie foods), may be explained by some alteration in any of these three neural systems.

The aim of this study was to utilize a laboratory-based task that taps into the functions of the different neural systems involved in affective decision-making, and to use functional imaging to evaluate the activities of these neural systems in relation to food choices in real-life. The most frequently used paradigm to assess affective decision-making is the Iowa Gambling Task (IGT) (Bechara et al., 1994; Bechara and Damasio, 2002; Waters-Wood et al., 2012), which was initially developed to investigate decision-making defects of patients with focal brain lesions. The IGT has been shown to tap into aspects of decision-making that are influenced by affect and emotion (Bechara and Damasio, 2005). Many studies have demonstrated that in comparison to normal controls, a wide range of patients (e.g., substance users, schizophrenia, pathological gamblers, and adolescents with externalizing behavior) show poor behavioral decisions as measured by the IGT (Bechara and Damasio, 2002; Cavedini et al., 2002; Whitney et al., 2004; Sevy et al., 2007; Xiao et al., 2009). The same set of brain regions (i.e., ventral striatum, prefrontal cortex, and insula) linked to decision-making impairments in brain lesion studies have also been shown to be engaged during functional neuroimaging studies in healthy individuals during performance of the IGT (Li et al., 2010; Xiao et al., 2013).

The present study used functional magnetic resonance imaging (fMRI) techniques to investigate the relationship between the brain activity underlying decision-making (as elicited by the IGT) and real-life food choices in a group of normal young adults. Specifically, we tested the hypothesis that decision-making during the IGT will activate a neural circuitry that includes the mesial orbitofrontal and VMPFC region, the dorsolateral prefrontal cortex, and the anterior cingulate/SMA (supplementary motor area), which are components of the so-called "reflective system." The degree of activity in these neural regions was hypothesized to inversely correlate with the degree of self-reported consumption of snacks high in fat and sugar, i.e., higher snack consumption would be associated with lower neural activity. Further, the degree of activity in these neural regions was hypothesized to positively correlate with the degree of self-reported consumption of vegetables, i.e., higher consumption would be associated with higher neural activity. We also tested the hypothesis that decision-making during the IGT would activate a subcortical neural circuitry that includes neural components of the so-called impulsive and urge systems, namely the amygdala, the ventral striatum, and the insular cortex. The degree of activity in these neural regions was hypothesized to positively correlate with the degree of self-reported consumption of snacks high in fat and sugar but negatively correlate with the degree of self-reported consumption of vegetables.

## **METHODS**

## **PARTICIPANTS**

Twenty-three (12 female) healthy adolescents and young adults aged 18.01 ± 2.61 years were recruited from the University of Southern California (USC) and recreation centers in Los Angeles, California. None of the participants were currently diagnosed with an eating disorder or receiving clinical treatment for obesity. All participants had normal or corrected-to-normal vision. Based on the Structured Clinical Interview for DSM-IV (SCID), all participants were free of neurological or psychiatric history. Adolescents who were under 18 were accompanied to the university by a parent or designated family member. Written informed consent was obtained from the participants and their parents/legal guardians (for participants under 18) prior to participation. Research protocols and instruments were approved by the USC Institutional Review Boards.

## **PROCEDURES**

Participants came to the lab for two sessions. During the first session, participants and their parent (for participants under 18) completed and signed the consent form(s) and completed behavioral tasks. During the second session, participants returned for the fMRI scan. We asked participants to have their usual meal before they arrived for the fMRI session and to eat normally. Therefore, the last meal was roughly equivalent across all the participants. We measured the height and weight of participants using standard procedures. We also assessed the hunger level on a scale ranging from 1 (not hungry at all) to 10 (very hungry) to ensure that the participants were not in a deprived state prior to the fMRI scan.

## **BEHAVIORAL TESTS**

Wechsler Abbreviated Scale of Intelligence [WASI, (Wechsler, 1999)]. The WASI was used to measure a participant's Intelligence Quotient (IQ) and basic aspects of cognitive functioning. The WASI is designed for use with a broad age range (from 6 to 89 years of age), is nationally standardized, and is similar to other Wechsler scales. It consists of four subtests (Vocabulary, Similarities, Block Design and Matrix Reasoning) chosen based on their high loadings on general intellectual ability (g) and the cognitive skills tapped by each. A combination of the four subtests yields a Full Scale IQ score.

Youth/Adolescent Eating Questionnaire (YAQ) (Rockett et al., 1995). We used the YAQ to assess eating behavior in real life. The YAQ is a self-report food frequency questionnaire with acceptable validity and reliability (Rockett et al., 1995, 1997). It asks about intake of 132 food items over the past year, and food items can be grouped for analysis (Xie et al., 2003; Field et al., 2004). In the present study, we were mainly interested in snack and vegetable consumption. The YAQ includes 25 questions assessing intake of snack foods. Snack items included those high in sugar (e.g., fruit rollups, Pop-tarts) and those high in fat/high in salt (e.g., potato chips, crackers). Reported consumption of these items was summed to calculate daily servings according to previous studies (Xie et al., 2003; Field et al., 2004). The same calculation was done for vegetable items (e.g., celery, carrot).

#### **fMRI TASKS**

Participants were scanned while performing an event-related IGT. As described in previous studies (Bechara et al., 1994, 1999), the IGT is a computerized version of a gambling task with an automated and computerized method for collecting data. In the IGT, four decks of cards labeled A, B, C, or D are displayed on the computer screen. The subject is required to select one card at a time from one of the four decks. When the subject selects a card, a message is displayed on the screen indicating the amount of money the subject has won or lost. Choosing a card can result in an immediate reward (the immediate reward is higher in decks A and B relative to decks C and D). As the game progresses, there are also unpredictable losses associated with each deck. Total losses are on average higher in decks A and B relative to decks C and D, thus creating a conflict in each choice, i.e., decks A and B are disadvantageous in the long-term (even though they bring higher immediate reward), whereas decks C and D are advantageous in the long-term (i.e., the long-term losses are smaller than the short-term gains, thus yielding a net profit). Net decision-making scores are obtained by subtracting the total number of selections from the disadvantageous decks (A and B) from the total number of selections from the advantageous decks (C and D). Thus, positive numbers reflect good decisions, while negative numbers reflect bad decisions.
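The net score described above can be sketched in a few lines of Python. This is only an illustration: the function name `net_score` and the representation of choices as a list of deck letters are assumptions made here, not part of the original task software.

```python
from collections import Counter

# Deck groupings as described in the text: C and D are advantageous,
# A and B are disadvantageous in the long term.
ADVANTAGEOUS = ("C", "D")
DISADVANTAGEOUS = ("A", "B")


def net_score(choices):
    """Return (C + D) - (A + B) for a sequence of deck letters.

    `choices` is a hypothetical representation of one participant's
    card selections, e.g. ["A", "C", "D", ...].
    """
    counts = Counter(choices)
    good = sum(counts[deck] for deck in ADVANTAGEOUS)
    bad = sum(counts[deck] for deck in DISADVANTAGEOUS)
    return good - bad


# Example: 5 advantageous and 3 disadvantageous picks give a score of +2.
example = net_score(list("AABCDDCC"))
```

A positive value indicates that the advantageous decks were chosen more often than the disadvantageous ones.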

#### **fMRI PROTOCOL**

Participants lay supine on a scanner bed and viewed visual stimuli back-projected onto a screen through a mirror built into the head coil. The IGT was written in Matlab (Mathworks) based on Psychtoolbox (www.psychtoolbox.org). Participants were given instructions on the IGT. Details of these instructions have been published previously (Bechara et al., 2000). We used an event-related design of the IGT, which was described in a recent paper (Koritzky et al., 2013). Each trial of the IGT includes two phases: a decision phase and a feedback phase. In the decision phase, participants were requested to select a card from four decks (A, B, C, or D) by pressing the corresponding button when a message ("Pick a Card") was displayed at the center of the screen. The time allowed for the response in the decision phase varied randomly between trials from 3 s to 7 s, with a mean of 4 s. In the feedback phase, a message was shown to inform the participants how much money they won or lost based on their choice of card. The feedback phase lasted for 3 s. If the trial was a win-only trial (i.e., no loss), the feedback ("you win \$X") was displayed for 1.5 s, followed by a 1.5 s blank screen. If the trial was a win-but-loss trial, the win feedback ("you win \$X") was displayed for 1.5 s, followed by a 1.5 s display of the loss feedback ("but you also lose \$X"). The mean length of the inter-trial interval was 2 s, with variation from 1.1 s to 3.2 s. The design was optimized with an in-house program to maximize efficiency. There were 100 trials in total, and the task lasted for 15 min.
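As a quick consistency check on the reported timing, the mean phase durations can be added up per trial and multiplied by the number of trials. The sketch below uses only the mean values stated in the text; the constant and function names are assumptions introduced for illustration.

```python
# Mean phase durations (seconds) as reported in the text.
MEAN_DECISION_S = 4.0   # decision phase, jittered between 3 s and 7 s
FEEDBACK_S = 3.0        # feedback phase, fixed
MEAN_ITI_S = 2.0        # inter-trial interval, jittered between 1.1 s and 3.2 s
N_TRIALS = 100


def expected_task_minutes(n_trials=N_TRIALS):
    """Expected task length in minutes from the mean phase durations."""
    mean_trial_s = MEAN_DECISION_S + FEEDBACK_S + MEAN_ITI_S  # 9 s per trial
    return n_trials * mean_trial_s / 60.0


total_minutes = expected_task_minutes()  # 100 trials x 9 s = 900 s = 15 min
```

The result agrees with the reported task length of 15 min.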

fMRI data were acquired in the Dana and David Dornsife Cognitive Neuroscience Imaging Center at USC with a 3T Siemens MAGNETOM Tim/Trio scanner. A Z-SAGA sequence with PACE (Prospective Acquisition Correction) was used for the functional scans to collect blood oxygen level-dependent (BOLD) signals. This specific sequence is designed to reduce signal loss in the prefrontal and orbitofrontal areas, with the following scanning parameters: TR/TE = 2000/25 ms; flip angle = 90°; 64 × 64 matrix size with resolution 3 × 3 mm². Thirty-one 3.5-mm axial slices were used to cover the whole cerebral cortex and most of the cerebellum with no gap. The anatomical T1-weighted structural scan was done using an MPRAGE sequence (TR/TE/TI = 2530/3.1/800 ms; flip angle = 10°; 208 sagittal slices; 256 × 256 matrix size with spatial resolution of 1 × 1 × 1 mm³).

#### **fMRI ANALYSIS**

FEAT (fMRI Expert Analysis Tool, part of the FSL package, www.fmrib.ox.ac.uk/fsl) was used for image preprocessing and statistical analysis. Standard preprocessing procedures were performed, including brain extraction, image realignment, spatial smoothing (5 mm FWHM Gaussian kernel), and temporal filtering (100 s cut-off). A two-step registration procedure was used whereby EPI images were first registered to the MPRAGE structural image, and then into standard MNI space, using affine transformations (Jenkinson and Smith, 2001). Registration from the MPRAGE structural image to standard space was further refined using FNIRT non-linear registration. Statistical analyses were performed in the native image space, with the statistical maps normalized to the standard space prior to higher-level analysis.

The data were modeled at the first level using a general linear model within FSL's FILM module. To examine brain activations related to decision making, two types of events were modeled: (1) decision-making stage, and (2) feedback stage. In this paper, we were particularly interested in the BOLD responses related to the decision-making phase (i.e., the deck selection of the IGT). The event onsets were convolved with a canonical hemodynamic response function (HRF, double-gamma) to generate the regressors used in the GLM. Temporal derivatives were included as covariates of no interest to improve statistical sensitivity. Null events were not explicitly modeled, and therefore constituted an implicit baseline. Missing trials were modeled separately as a nuisance variable. The six movement parameters were also included as covariates in the first-level general linear model.
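To make the regressor construction concrete, the following is a minimal sketch of convolving event onsets with a double-gamma HRF and sampling the result at the TR of 2 s. It is not FSL's implementation: the shape parameters (6 and 16) and the undershoot ratio (1/6) are commonly used illustrative values assumed here, events are treated as zero-duration sticks, and all function names are hypothetical.

```python
import math


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=1.0 / 6.0):
    """Double-gamma HRF at time t (s). Parameter values are assumptions."""
    if t < 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - ratio * under


def build_regressor(onsets, n_scans, tr=2.0, dt=0.1, hrf_length=32.0):
    """Convolve stick events at `onsets` (s) with the HRF; sample every TR."""
    n_fine = int(round(n_scans * tr / dt))
    sticks = [0.0] * n_fine
    for onset in onsets:
        idx = int(round(onset / dt))
        if 0 <= idx < n_fine:
            sticks[idx] += 1.0
    kernel = [double_gamma_hrf(k * dt) for k in range(int(round(hrf_length / dt)))]
    # Direct discrete convolution on the fine time grid.
    conv = [0.0] * n_fine
    for i, s in enumerate(sticks):
        if s == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k < n_fine:
                conv[i + k] += s * h
    step = int(round(tr / dt))
    return [conv[j * step] for j in range(n_scans)]


# A single event at t = 0 s, modeled over 20 scans (40 s).
regressor = build_regressor([0.0], n_scans=20)
```

With these assumed parameters the sampled response rises to a peak a few seconds after the event and then dips below zero, which is the characteristic post-stimulus undershoot of the double-gamma form.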

A higher-level random-effects model was used to test for group activation during the decision-making stage (i.e., decision-making stage vs. baseline), using FMRIB's Local Analysis of Mixed Effects (FLAME) stage 1 only (Beckmann et al., 2003; Woolrich et al., 2004) with automatic outlier detection (Woolrich, 2008). Unless otherwise noted, group images were thresholded using cluster detection statistics, with a height threshold of *Z* > 2.3 and a cluster probability of *p* < 0.05, corrected for whole-brain multiple comparisons based on Gaussian Random Field Theory (GRFT).

To test the correlation between brain activation in the decision-making phase of the IGT and dietary intake, regions of interest (ROIs) were created from clusters of voxels with significant activation in the voxelwise analyses. Brain activation (% signal change) in these regions when making decisions was extracted using a method suggested by Mumford (http://mumford.fmripower.org/perchange\_guide.pdf). Robust regression was used to minimize the impact of outliers in the behavioral data, using iteratively reweighted least squares implemented in the robustfit command in the MATLAB Statistics Toolbox (Tom et al., 2007). Reported *r*-values reflect (non-robust) Pearson product-moment correlation values, whereas the reported *p*-values and regression lines are based on the robust regression results (Tom et al., 2007).
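The idea behind iteratively reweighted least squares can be illustrated with a simplified Python sketch for a single predictor. It uses bisquare weights with a tuning constant of 4.685 and a scale estimate based on the median absolute residual, but it omits the leverage adjustment and other details of MATLAB's robustfit, so it should be read as an approximation of the technique rather than a reimplementation. The function names and the toy data are assumptions.

```python
import math
from statistics import mean, median, pstdev


def pearson_r(x, y):
    """Ordinary (non-robust) Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ym - slope * xm, slope


def robust_line(x, y, tune=4.685, max_iter=50, tol=1e-10):
    """Simplified IRLS with bisquare weights (no leverage adjustment)."""
    intercept, slope = weighted_line(x, y, [1.0] * len(x))  # start from OLS
    floor = 1e-6 * (pstdev(y) or 1.0)  # guards against a zero scale estimate
    for _ in range(max_iter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        scale = max(median([abs(r) for r in resid]) / 0.6745, floor)
        weights = []
        for r in resid:
            u = r / (tune * scale)
            weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
        new_intercept, new_slope = weighted_line(x, y, weights)
        done = abs(new_intercept - intercept) < tol and abs(new_slope - slope) < tol
        intercept, slope = new_intercept, new_slope
        if done:
            break
    return intercept, slope


# Toy data: y = 2x + 1 with one gross outlier at the last point.
xs = [float(i) for i in range(10)]
ys = [2.0 * xi + 1.0 for xi in xs]
ys[9] = 60.0
ols_fit = weighted_line(xs, ys, [1.0] * len(xs))
robust_fit = robust_line(xs, ys)
```

On the toy data the ordinary least-squares slope is pulled upward by the outlier, whereas the reweighted fit recovers the line followed by the remaining points. This is why the robust fit can be reported alongside a conventional Pearson *r*.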

## **RESULTS**

## **BEHAVIOR RESULTS**

#### *Demographic variables*

Participants in the study fell within the normal range of the body mass index (BMI). Average BMI was 21.88 ± 1.62, with a range of 19.1–25. IQ scores were all within a normal range (118.29 ± 8.6, range = 103–132). Participants reported 2.57 ± 1.88 on the hunger rating scale (1 = not at all hungry; 10 = extremely hungry), reflecting the fact that they were being evaluated in a non-food-deprived state. With regard to dietary intake, participants reported consuming 2.95 ± 2.15 servings/day of vegetables and 1.0 ± 0.84 servings/day of fatty and sugary snacks. Participants reported consuming significantly more vegetables than snacks in their daily life [*T*(23) = 3.52, *p* < 0.01]. No age or gender differences were observed on consumption of vegetables or snacks, BMI, IQ, IGT net scores, or hunger ratings.

#### *Partial correlations*

**Table 1** shows partial correlations among the following measures: vegetables, snacks, BMI, IQ, the IGT net scores, and hunger ratings, after controlling for age and gender. Vegetable consumption did not correlate with consumption of snacks (*r* = −0.01, *p* > 0.05). Vegetable and snack consumption were negatively and positively correlated with BMI, respectively (*r* = −0.19 and *r* = 0.21), although these relationships were not statistically significant. Moreover, none of the variables were significantly correlated with the IGT net scores. Finally, the more vegetables the participants consumed in their daily life, the higher their self-reported hunger rating prior to the fMRI session (*r* = 0.43, *p* < 0.05, corrected for multiple comparisons).
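For readers unfamiliar with partial correlation, the sketch below shows one standard way to compute it: the recursive formula that removes one covariate at a time. The paper does not state which software or method was used, so this is an illustration only; the function names and the toy data are assumptions.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, covariates=()):
    """Correlation of x and y after controlling for each covariate.

    Uses the recursion
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    applied one covariate at a time.
    """
    covariates = list(covariates)
    if not covariates:
        return pearson_r(x, y)
    z, rest = covariates[-1], covariates[:-1]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y share only the common factor z.
z = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
a = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
b = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
x = [zi + ai for zi, ai in zip(z, a)]
y = [zi + bi for zi, bi in zip(z, b)]
raw_r = pearson_r(x, y)                # driven entirely by z
adjusted_r = partial_corr(x, y, [z])   # association left after removing z
```

In the toy data the raw correlation is produced by the shared factor alone, so it vanishes once that factor is controlled for. In the analysis described above, the covariates would be age and gender.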

#### *IGT performance*

The fMRI-optimized version of the IGT involved 100 trials (or 100 card selections). The trials are divided into five blocks of 20 trials each. In each block, the number of selections from decks A and B (the disadvantageous decks) and the number of selections from decks C and D (the advantageous decks) are counted and a net score for each block ((C + D) − (A + B)) is obtained. A net score above zero implies that participants are selecting cards advantageously, and a net score below zero implies disadvantageous selection. The behavioral results revealed a significant effect of block after the Greenhouse-Geisser adjustment [*F*(3.6, 81.7) = 5.98; *P* < 0.001]. As shown in **Figure 1**, the participants in this study showed normal learning as the task progressed. They gradually switched their preferences toward the advantageous decks (C and D) and away from the disadvantageous decks (A and B), as reflected by increasingly positive net scores.
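The block-wise scoring can be sketched as follows. The function name `block_net_scores` and the list-of-letters input format are assumptions for illustration; the example sequence is invented to show a learning pattern and is not data from the study.

```python
def block_net_scores(choices, block_size=20):
    """Return (C + D) - (A + B) for each consecutive block of trials."""
    scores = []
    for start in range(0, len(choices), block_size):
        block = choices[start:start + block_size]
        good = sum(1 for deck in block if deck in ("C", "D"))
        bad = sum(1 for deck in block if deck in ("A", "B"))
        scores.append(good - bad)
    return scores


# Invented 100-trial sequence that drifts toward the advantageous decks.
sequence = (
    ["A"] * 20                  # block 1: only disadvantageous picks
    + ["A"] * 10 + ["C"] * 10   # block 2: evenly split
    + ["B"] * 5 + ["D"] * 15    # block 3: mostly advantageous
    + ["C"] * 20                # block 4: only advantageous picks
    + ["D"] * 20                # block 5: only advantageous picks
)
scores_by_block = block_net_scores(sequence)
```

A rising sequence of block scores corresponds to the learning curve summarized in Figure 1.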

#### **NEUROIMAGING RESULTS**

#### *IGT activity during the decision stage*

As shown in **Figure 2** and **Table 2**, during the decision stage, the IGT activated brain regions belonging to both the impulsive system (namely the right amygdala and ventral striatum) and the reflective system (namely the VMPFC, dorsolateral prefrontal cortex (DLPFC), and anterior cingulate cortex (ACC)). The IGT also elicited activity in the "urge/craving" system, namely the insular cortex. Activity was also observed in additional neural regions (e.g., temporal cortex, post-central cortex and visual cortex), but there were no *a priori* hypotheses regarding the roles of these brain regions in the behaviors under the current study.

**Table 1 | Partial correlations among vegetables, snacks, BMI, IQ, SOPT and the IGT net scores after controlling for age and gender.**

*Results of two-tailed significance tests are denoted by superscripts. \*P < 0.05. IGT = Iowa Gambling Task.*

**FIGURE 1 | The Iowa Gambling Task net scores ((C + D) − (A + B)) across five blocks of 20 cards expressed as mean ± SE.** Positive net scores reflect advantageous (non-impaired) performance, while negative net scores reflect disadvantageous (impaired) performance.

**FIGURE 2 | fMRI results of the Iowa Gambling Task (IGT) during the decision stage.** Both the impulsive system, including the bilateral putamen/caudate, and the reflective system, including the bilateral dorsolateral prefrontal cortex (DLPFC), ventromedial prefrontal cortex (VMPFC), and anterior cingulate cortex (ACC), are involved in the decision stage of the IGT. Activation in the IGT also includes the insula and visual cortex.

**Table 2 | Brain activity of the Iowa Gambling Task during the decision stage.**

*VMPFC: Ventromedial Prefrontal Cortex; DLPFC: Dorsolateral Prefrontal Cortex; SPL: Superior Parietal Lobe; SMG: Supramarginal Cortex; ACC: Anterior Cingulate Cortex; PCC: Posterior Cingulate Cortex.*

#### *Correlations between brain activity and eating behaviors*

We performed correlation analyses between the consumption of vegetables or snacks and the BOLD response elicited by IGT performance in the decision stage. The results shown in **Figure 3** reveal that higher consumption of vegetables correlates with higher activity in the left superior frontal gyrus (SFG) (*r* = 0.55, *P* < 0.01) and with lower activity in the right insula (*r* = −0.66, *P* < 0.001). **Figure 4** reveals that higher snack consumption correlates with lower activity in the left frontal pole (*r* = −0.63, *P* < 0.001) and with higher activity in the right ventral striatum (*r* = 0.60, *P* < 0.01) and right insular cortex (*r* = 0.56, *P* < 0.01).

## **DISCUSSION**

In the current study, IGT performance elicited activity in neural systems hypothesized to play key roles in complex decision-making: (1) neural regions belonging to the so-called "reflective system" concerned with impulse control and self-control, namely the VMPFC, the DLPFC, and the ACC in both hemispheres; (2) neural regions belonging to the so-called "impulsive system" concerned with reward and habit behaviors, namely the striatum in both hemispheres; and (3) neural systems implicated in the processing of interoceptive signals and their translation into what may subjectively become experienced as an urge, namely the insula in both hemispheres. Moreover, higher consumption of vegetables correlated positively with activity in the left superior frontal gyrus (SFG) (i.e., a component of the reflective system), but negatively with activity in the right insular cortex. In contrast, high consumption of snacks correlated negatively with activity in the left frontal pole (a part of the reflective system), but positively with activity in the right ventral striatum and right insular cortex.

These results are consistent with several behavioral studies showing poor decision-making scores on the IGT in obese individuals, patients with binge eating disorders, and overweight adolescents (Pignatti et al., 2006; Brogan et al., 2010; Verdejo-Garcia et al., 2010; Danner et al., 2012; Fagundo et al., 2012). They are also consistent with previous reports that performance on the IGT was related to the magnitude of weight loss in a diet-induced weight loss intervention in overweight women (Witbracht et al., 2012). The brain regions implicated in this study are also consistent with several previous studies on food (high vs. low calorie), weight (obese vs. average weight), and activity in neural regions (Killgore et al., 2003; Pelchat et al., 2004; DelParigi et al., 2005, 2006; Killgore and Yurgelun-Todd, 2005; Davis et al., 2007, 2010; Stice et al., 2008; Small, 2009; Batterink et al., 2010; Ng et al., 2011; He et al., 2014b). The unique contribution of our current study is the use of a neural framework that assigns multiple neural regions to functionally specialized neural systems involved in behavioral decisions (Naqvi and Bechara, 2010; Noel et al., 2013). More importantly, our current study examines the dynamics among these neural systems (i.e., hyperactivity in one system, but hypoactivity in another). The examination of these dynamics is especially significant for devising therapeutic strategies.

High consumption of high-calorie snacks in real life correlated with higher activity in the ventral striatum. The ventral striatum has long been known for its role in various types of reward, including food reward (Demos et al., 2012; Mehta et al., 2012). Animal studies indicate that direct pharmacological activation of the ventral striatum preferentially increases the intake of foods high in fat and sugar, even in animals fed beyond apparent satiety (Petrovich et al., 2002; Kelley, 2004). In humans, several lines of evidence suggest that high calorie food may carry greater incentive value for obese individuals than for normal controls (Volkow et al., 2012; Tomasi and Volkow, 2013). Behavioral studies also show that, compared with their normal controls, overweight children find high calorie food (pizza and snack food) more reinforcing (Temple et al., 2008). Thus, our current findings are consistent with this long line of studies in both animals and humans.

A unique aspect of the current study is that we used a monetary reward, instead of a food reward, to engage the neural systems sub-serving decision-making. The results indicate that the observed changes in the dynamics between these neural systems apply not only to food, but to reward in general, including monetary reward. This is consistent with the concept of a common currency for reward that relates to dopamine, especially that associated with the ventral striatum (McClure et al., 2004b). Many studies have shown that this region is similarly engaged by food and by monetary cues. For instance, increased ventral striatal activity (reflecting increased dopamine) potentiated the rewarding effects of food as well as the association between food cues and the feeling of pleasure associated with food consumption (Smith and Robbins, 2013). The anticipation of food (as opposed to the experience of food) is also rewarding, and it is associated with increased ventral striatal activity (which presumably reflects increased dopamine release) (Smith and Robbins, 2013). Even the numerous behavioral studies in humans suggesting that obese individuals are hyper-responsive to food cues in a wide range of assessments (Braet and Crombez, 2003; Halford et al., 2004), and the behavioral studies in both healthy and overweight populations suggesting that personality traits of reward drive predict food craving, overeating, and relative body weight (Davis and Woodside, 2002; Bulik et al., 2003; Davis et al., 2004), are all considered consistent with the construct that increased reward sensitivity is linked to a biologically based personality trait regulated by mesocorticolimbic dopamine systems (Cohen et al., 2005; Evans et al., 2006). Indeed, the increased neuronal activity elicited by fatty food cues in the ventral striatum predicted the macronutrient choice at an *ad libitum* buffet, i.e., greater ventral striatum activity predicted the choice of food items with higher fat content (Mehta et al., 2012). This ventral striatal activity also predicted weight gain 6 months later (Demos et al., 2012). In parallel, these same striatal regions responsive to food cues have also been shown to respond in a similar manner to monetary reward (Breiter and Rosen, 1999; Breiter et al., 2001), thus supporting the notion that altered dynamics between these neural systems may be general, and not specific to food reward.

Higher right insular activity correlated with more snack, but less vegetable, consumption in real life. Given the hypothesized role of the insular cortex in translating interoceptive signals into what one may subjectively experience as a feeling of desire, anticipation or urge (Naqvi and Bechara, 2009; Noel et al., 2013), we suggest that engaging the insula system increases the urge or craving for high calorie food by (1) exacerbating activity within the striatal (impulsive) system, and (2) weakening activity of the prefrontal (reflective) system (e.g., see Noel et al., 2013). This suggestion is consistent with studies showing that activity within the insular cortex is associated with food craving (Pelchat et al., 2004). Finally, our study revealed a role for prefrontal regions (parts of the reflective system) in the inhibitory control of some high calorie food items, consistent with several previous studies suggesting a role for the SFG in introspection, self-judgments, and the subjective rating of self-awareness (Goldberg et al., 2006). Goldberg et al. proposed that the left SFG is involved in allowing the individual to reflect upon sensory experiences, to judge their possible significance to the self, and to report the occurrence of these sensory experiences to the outside world (Goldberg et al., 2006). Others implicated the frontal pole area (Brodmann area 10) in insight into one's future and the planning of future actions (McClure et al., 2004a; Fellows and Farah, 2005; D'Argembeau et al., 2008; Koritzky et al., 2013). These studies are consistent with our early conceptualization of the role of these regions in what we called a "reflective" system in the context of other rewards, namely drugs (e.g., Bechara, 2005). However, the novel contribution of the current study is the examination of the dynamics between multiple neural systems (e.g., hypoactivity in the reflective system combined with hyperactivity in the striatal and insula systems in response to high calorie food).


Although our early conceptualization of an imbalance between an impulsive and a reflective system was initially discussed in the context of drug reward (Bechara, 2005), a similar conceptualization argued that eating disorders and obesity may be associated with a mismatch between the impulsive and reflective systems (Gearhardt et al., 2011; Brooks et al., 2013; García-García et al., 2013). Our study is consistent with these earlier reports, except that we now show that this imbalance also applies to normal people who are not necessarily diagnosed with obesity or an eating disorder. Since our study was cross-sectional, we are not able to make inferences about whether the differences in the neural substrates of decision-making reflect the cause or the effect of real-life food consumption. It is likely that activity in these brain systems mediates the development of our eating behaviors. This is pertinent to the argument made by some researchers that we should focus on high-risk food substances (and their potential to alter specific brain systems) rather than on high-risk people, who have tended to be the focus of most research to date (Gearhardt and Brownell, 2013). An emphasis on such future research could provide insight into the neural basis and into related cognitive and behavioral interventions that aid weight management and prevent obesity and other eating disorders (Paolini et al., 2012; Gearhardt and Brownell, 2013).


Finally, we note that the IGT is a task that taps into the brain mechanisms sub-serving decision-making, but it only involves abstract money/points as a reward, as opposed to food reward. Thus, the task itself does not ask subjects to consume real food, nor to view images of food while in the scanner. As such, the current study using the IGT could potentially be deemed not ecologically valid, which would limit the generalization of our results. However, we argue the opposite: the use of the IGT had several important advantages. First, the use of food-related executive function tasks (e.g., go/no go tasks with food stimuli) has been reported multiple times in the literature and has yielded consistent results (He et al., 2014b). Second, even structural volumetric measures of ROIs within the so-called "reflective system" showed consistent negative correlations with BMI, independent of any task involving food images (He et al., 2014a). Hence, the current results using the IGT, which is a complex task that taxes the functions of all three neural systems hypothesized to be engaged in addiction (Li et al., 2010; Xiao et al., 2013), suggest that the relatively poor ability to delay gratification from high calorie food reward is not specific to food reward, but generalizes to other rewards (in this case, monetary reward). These findings are significant as they support the notion that the process leading to overweight and obesity is one that is reflected by a relative imbalance in neural systems implicated in addictive behaviors, and also in decision-making in general.

## **ACKNOWLEDGMENTS**

This research was supported by research grants from National Institute on Drug Abuse (NIDA) R01DA023051, National Cancer Institute (NCI) R01CA152062, the National Heart, Lung, and Blood Institute and the National Institute of Child Health and Human Development (U01HL097839), Open Research Fund of the National Key Laboratory of Cognitive Neuroscience and Learning (CNLZD1306), and Fundamental Research Funds for the Central Universities (SWU1409349). We would also like to thank Alexandra Hollihan and Stephanie Castillo who helped with the data collection.

## **REFERENCES**


ventromedial prefrontal cortex lesions. *J. Int. Neuropsychol. Soc.* 18, 927–930. doi: 10.1017/S135561771200063X


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 12 July 2014; accepted: 14 October 2014; published online: 04 November 2014.*

*Citation: He Q, Xiao L, Xue G, Wong S, Ames SL, Xie B and Bechara A (2014) Altered dynamics between neural systems sub-serving decisions for unhealthy food. Front. Neurosci. 8:350. doi: 10.3389/fnins.2014.00350*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 He, Xiao, Xue, Wong, Ames, Xie and Bechara. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Decision and dopaminergic system: an ERPs study of Iowa gambling task in Parkinson's disease

## *Daniela Mapelli<sup>1,2</sup>\*, Elisa Di Rosa<sup>1</sup>, Matteo Cavalletti<sup>1</sup>, Sami Schiff<sup>3</sup> and Stefano Tamburin<sup>4</sup>*

<sup>1</sup> Department of General Psychology, University of Padova, Padova, Italy

<sup>2</sup> Human Inspired Technologies Research Center, University of Padova, Padova, Italy

<sup>3</sup> Department of Medicine, University of Padova, Padova, Italy

<sup>4</sup> Department of Neurological and Movement Sciences, Neurology Section, University of Verona, Verona, Italy

#### *Edited by:*

Jeng-Ren Duann, China Medical University, Taiwan

#### *Reviewed by:*

Shunsuke Kobayashi, Fukushima Medical University, Japan

Wael Asaad, Brown University, USA

#### *\*Correspondence:*

Daniela Mapelli, Department of General Psychology, University of Padova, Via Venezia 8, 35100 Padova, Italy

e-mail: daniela.mapelli@unipd.it

Recent research has reported behavioral and emotional impairment in Parkinson's disease (PD), even in the earliest stages. This impairment also affects decision-making and learning processes. The Iowa gambling task (IGT) is commonly used to examine decision-making capacity. The purpose of the present study was to investigate the neural correlates of feedback evaluation in the decision-making process in a learning context, using the IGT and event-related potentials (ERPs) in a group of non-demented medicated PD patients. Fifteen PD patients and 15 healthy controls were recruited for the study. PD patients were administered a basic neuropsychological assessment intended to exclude cognitive impairments. Both groups underwent the computerized IGT during electroencephalography (EEG) recording. To analyse ERPs, continuous EEG data were epoched within a time window starting 1000 ms before and ending 1000 ms after feedback presentation and averaged separately for positive (i.e., win condition) and negative (i.e., loss condition) feedback. Behavioral data revealed significantly lower performance in PD patients (p < 0.05) compared with the controls. While controls demonstrated correct feedback evaluation, PD patients did not show any learning, selecting more disadvantageous decks even in the last part of the task. Furthermore, ERP results revealed that controls showed a significant difference (p < 0.05) in ERP morphology recorded after the win and the loss conditions, suggesting that positive and negative feedback were evaluated and processed differently. PD patients showed a different pattern: their ERP morphology was the same for positive and negative feedback. Interestingly, our ERP results suggest that in PD patients an incorrect evaluation of context-relevant outcomes could be the reason for poor performance in decision-making tasks, and could explain cognitive and behavioral problems related to impulse control disorder.

**Keywords: Iowa gambling task (IGT), dopaminergic system, frontal lobe, decision making, Parkinson's disease (PD)**

## **INTRODUCTION**

Defining the term *decision-making* is not easy, because it denotes one of the highest and most complex human abilities, classically included among the executive functions. According to Rogers (2011), decision-making is a complex process that *encompasses a range of functions through which motivational processes make contact with action selection mechanisms to express one behavioral output rather than any of the available alternatives*. This definition implicitly assumes that the decision process is based on the functions of selection and inhibition, working memory, planning, emotion, estimation, and every process included in the term *executive control*.

Research on decision-making has greatly increased within cognitive neuroscience over the last 20 years, from the study of patients with frontal lobe damage (Bechara et al., 1994, 1996; Damasio, 1994) to the emergence of new disciplines, such as neuroeconomics (Glimcher et al., 2008). Even though this increasing interest has been accompanied by the development of divergent models, a consensus has been reached concerning some of the fundamental aspects of decision-making. From a cognitive psychology perspective, decision-making can be considered as the integration of three complementary abilities: choice evaluation, response selection, and feedback processing (Fang et al., 2009). Feedback processing plays a central role in decision-making, because assigning a positive or negative valence to an option on the basis of previous experience is the prerequisite for evaluating and anticipating the outcomes of our actions and for efficient response selection. The anatomical network underlying decision-making processes includes the prefrontal cortex (PFC), the anterior cingulate cortex (ACC), the fronto-striatal and limbic loops, and some subcortical structures (basal ganglia, amygdala; for a comprehensive review, see Gleichgerrcht et al., 2010).

Decision-making impairment has been documented in many different clinical conditions involving this network, mainly when the PFC is damaged, including in patients with frontal lobe damage (Bechara et al., 1996, 1997; Fellows and Farah, 2005) or with frontotemporal dementia (Rahman et al., 1999, 2005; Torralva et al., 2007, 2009). Healthy aging may also affect decision-making (Yates and Patalano, 1999; Finucane et al., 2002; MacPherson et al., 2002; Kovalchik et al., 2005; Cauffman et al., 2010; Eppinger et al., 2011), probably through slight changes in the functioning of this network.

Within the complex neuropharmacology of this anatomical network, dopamine (DA) is the main neuromodulator of the fronto-striatal loop and plays a key role (Assadi et al., 2009; Rogers, 2011), in particular in reward processing during reinforcement learning (Schultz, 2002; Frank et al., 2004) and in learning and outcome monitoring (Hämmerer and Eppinger, 2012).

There is considerable evidence that a decline in dopaminergic pathways may impair decision-making abilities (Hämmerer and Eppinger, 2012).

Parkinson's disease (PD) is a clinical condition of particular interest in this research field, because both the neuron loss and the pharmacological treatment affect dopaminergic transmission and influence the function of the fronto-striatal loop. A growing body of recent literature has documented the presence of feedback processing deficits in PD patients (Frank et al., 2004, 2007; Bódi et al., 2009; Kobayakawa et al., 2010; Kapogiannis et al., 2011), concurrently with the development of cognitive and behavioral deficits linked to the impulse control disorder spectrum (Poletti and Bonuccelli, 2012). One of the most common decision-making tasks is the Iowa Gambling Task (IGT; Bechara et al., 1994), which gives no information about the probabilities of particular outcomes and thus simulates the uncertainty of decision-making in real-life settings. Its application in PD without dementia has given divergent results (Poletti et al., 2011; Dirnberger and Jahanshahi, 2013). Some studies reported no impairment (Thiel et al., 2003; Euteneuer et al., 2009; Poletti et al., 2010), but most showed worse performance in PD patients than in healthy controls (Czernecki et al., 2002; Perretta et al., 2005; Mimura et al., 2006; Pagonabarraga et al., 2007; Kobayakawa et al., 2008, 2010; Gescheidt et al., 2012).

The role of dopaminergic drugs is also not completely clear: some studies documented no effect of treatment on IGT performance (Czernecki et al., 2002; Perretta et al., 2005; Kobayakawa et al., 2010), others showed that patients were more impaired when treated (Cools et al., 2003; Euteneuer et al., 2009), and another report using a different gambling task found worse scores in patients without medication (Brand et al., 2004). These findings appear to be in contrast with the view that the use of dopaminergic medication, in particular DA, rather than neuronal loss, is responsible for impulse control disorders in PD (Weintraub and Nirenberg, 2013).

Finally, no significant difference was found in IGT results when comparing PD patients with and without dementia (Delazer et al., 2009); this finding is in keeping with the notion that executive dysfunction occurs early in the natural history of PD (Dirnberger and Jahanshahi, 2013).

The present study aims to shed some light on this field and to overcome some limits and discrepancies of previous studies. To this end, we explored one of the crucial aspects of decision-making ability, i.e., outcome evaluation, using the IGT in medicated PD patients.

In addition to assessing behavioral responses, this is the first study to explore the brain correlates of feedback processing in PD patients using electroencephalogram (EEG) and event-related potential (ERP) recordings.

Monitoring the outcome of a decision evokes a large cortical response, mainly localized over central electrodes, that can be separated into a feedback-related negativity (FRN) and a P300. The former represents an early appraisal of feedback based on a binary classification of good vs. bad outcomes, and the latter reflects a later, top–down controlled evaluation process that is related to both the valence and the magnitude of the feedback (Gehring and Willoughby, 2002; Yeung and Sanfey, 2004; Toyomaki and Murohashi, 2005; Hajcak et al., 2006; Holroyd et al., 2006; Wu and Zhou, 2009; Cui et al., 2013; Ferdinand and Kray, 2013).

## **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Thirty participants were recruited: 15 (11 male) healthy subjects (age range 43–77 years; mean: 60.7, SD: 9.8) and 15 (10 male) PD patients (age range 41–73 years; mean: 61.4 years, SD: 9.6). The patients fulfilled diagnostic criteria for PD according to the PD Society Brain Bank Criteria (Hughes et al., 1992). PD patients had a mean disease duration of 4.8 years (range of onset 1–14 years, SD: 3.4) and a mean estimated motor subscore of 8.9 (range 3–16, SD: 4) on the UPDRS part III (Fahn et al., 1987; Goetz et al., 2003). Patients were asked to continue taking their medication at the required time on the day of testing. Six patients received dopamine precursors (levodopa), three received dopamine agonists, four received a monoamine oxidase inhibitor (MAOI), and two were taking a combination of levodopa and dopamine agonists. The average levodopa equivalent was 457 ± 122.7 mg. Healthy subjects and PD patients were matched for age, gender, education, and MMSE score (see **Table 1**), and the healthy subjects are therefore treated as the control group. All participants gave signed informed consent after the purpose of the study and the protocol had been explained to them. The study was approved by the local ethics committee of the Department of General Psychology of the University of Padua.

#### **EXCLUSION/INCLUSION CRITERIA**

The inclusion criterion for this study was normal or corrected-to-normal vision. Exclusion criteria applied in the recruitment of the control group were the presence of neurological disease (any medical conditions associated with a head injury, epilepsy, stroke), a reported history of psychiatric disorder or neurological disease, and the use of psychiatric or neurological medications.

**Table 1 | Means and standard deviations of matched demographical characteristics and MMSE score in PD patients and control group.**

MMSE: Mini Mental State Examination (Folstein et al., 1975).

Finally, for both the patient and control groups, exclusion criteria were a Mini Mental State Examination score (MMSE; Folstein et al., 1975) < 24 and a Beck Depression Inventory score (BDI; Beck et al., 1961) < 14.

## **MEASURES**

## *Iowa gambling task*

Decision-making was assessed using the Iowa gambling task (IGT; Bechara et al., 1994). This test was developed at the University of Iowa to assess decision-making capacity in a laboratory environment. Although it was originally designed in an analog format, in our study the IGT was implemented in a computerized version. The experiment was run with the E-Prime 2 software (Psychology Software Tools, Pittsburgh, PA, USA) installed on a personal computer equipped with a 17" monitor.

The task consisted of the presentation, on a computer screen, of four decks named A, B, C, and D. Each card in these decks can bring a win or a loss: participants were asked to gain as much as possible by choosing one card at a time from any of the four decks, until the task shut off automatically after 100 cards. The backs of the decks look the same, but the decks differ in composition. Decks A and B are considered disadvantageous because they yield large wins but also large losses, producing a net loss of 250€ in every 10 cards. Decks C and D are considered advantageous because they yield small wins but even smaller losses, producing a net gain of 250€ in every block of 10 cards. The instructions given to the participants were the following: " *in this screen you can see four decks, two are advantageous and two are disadvantageous. Each card of these decks can bring a win or a loss: the goal of this task is to win as much money as possible, and avoid losing money as much as possible, starting from a virtual budget of 2000*€." Participants did not know the number of choices, nor which decks were advantageous or disadvantageous. Participants saw on the screen the amount of money that they had won or lost; this amount was updated after each choice.

## *EEG recording*

While participants performed the IGT, the EEG was acquired from an array of 32 Ag/AgCl electrodes through a Micromed electrode system. Electrodes were identified by brain hemisphere (odd numbers = left, even numbers = right) and general cortical zone (F = frontal, C = central, T = temporal, P = parietal, and O = occipital). Thirty electrodes were mounted on an elastic cap, according to the International 10–20 system (Oostenveld and Praamstra, 2001). Left and right mastoids served as reference, while vertical and horizontal eye movements were recorded with two electro-oculogram (EOG) electrodes, placed below and at the outer canthus of the left eye. The ground electrode was located at the POz channel. The sampling rate was 512 Hz, electrode impedances were <5 kΩ, and a digital band-pass filter from 0.1 Hz to 30 Hz was applied off-line.
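The electrode naming convention described above can be made concrete with a short Python sketch. The helper `describe_electrode` is hypothetical. The treatment of the suffix "z" as a midline site follows the standard 10–20 convention (it is the convention behind the labels Fz, Cz, Pz, and POz), and the joining of two-letter prefixes such as "PO" into a combined region name is a simplifying assumption made here for illustration.

```python
import re

# Region letters as listed in the text.
REGIONS = {"F": "frontal", "C": "central", "T": "temporal",
           "P": "parietal", "O": "occipital"}


def describe_electrode(label):
    """Return (region, side) for a 10-20 style label such as 'C3' or 'Fz'."""
    match = re.fullmatch(r"([A-Za-z]+?)(z|\d+)", label)
    if match is None:
        raise ValueError(f"Unrecognized electrode label: {label!r}")
    prefix, suffix = match.groups()
    # Assumption: multi-letter prefixes name a site between two regions.
    region = "-".join(REGIONS[ch] for ch in prefix if ch in REGIONS)
    if suffix == "z":
        side = "midline"
    elif int(suffix) % 2 == 1:
        side = "left"   # odd numbers = left hemisphere
    else:
        side = "right"  # even numbers = right hemisphere
    return region, side


for name in ("Fz", "C3", "P4", "POz"):
    print(name, describe_electrode(name))
```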

## *Behavioral variables*

IGT performance was evaluated using more than one parameter. The first analysis explored the modal value of deck choices. The preferred deck of each subject in the two groups was determined, and the values were submitted to a chi-square frequency analysis to evaluate whether the distribution of choice frequencies was the same in the two groups. To obtain the *learning* IGT scores, in line with previous reports (Bechara et al., 1994; Fukui et al., 2005; Pagonabarraga et al., 2007; Kobayakawa et al., 2010), we subdivided the 100 selections into five blocks of 20 cards. For each block, the number of cards selected from the advantageous decks (C and D) minus the number chosen from the disadvantageous decks (A and B) was calculated. In this way, five IGT scores were obtained for each participant, and the comparison between these values was taken as an index of the learning trend. Increasing IGT scores from the first to the last block indicate a preference for advantageous decks and the learning of the correct response strategy.

A *total* IGT score was finally calculated as the difference between overall advantageous choices and overall disadvantageous choices. Pearson's coefficient was calculated to correlate the *total* IGT score with clinical parameters, namely the disease duration, the motor UPDRS score, and the age of onset. Group differences were investigated by submitting the *learning* IGT scores to a mixed-model repeated-measures ANOVA, with the factors group (patients and controls) and time (from the first to the fifth block). Bonferroni correction for multiple comparisons was applied.
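The total score and its correlation with a clinical parameter can be sketched in a few lines of Python. This is only an illustration under stated assumptions: decks C and D are taken as advantageous, as in the task description; the function names are hypothetical; and the numbers in the usage example are invented placeholders, not data from the study.

```python
import math


def total_igt_score(choices):
    """Overall advantageous (C, D) minus overall disadvantageous (A, B) picks."""
    good = sum(1 for deck in choices if deck in ("C", "D"))
    bad = sum(1 for deck in choices if deck in ("A", "B"))
    return good - bad


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented placeholder values for five hypothetical patients.
total_scores = [-20, 4, 10, -8, 16]
disease_duration_years = [1, 3, 5, 8, 14]
print(pearson_r(total_scores, disease_duration_years))
```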

## *ERPs data*

EEG data were processed using EEGLAB (Delorme and Makeig, 2004). Epochs were locked to feedback onset and were 2000 ms long, between 1000 ms before and 1000 ms after feedback onset; the averaging procedure was performed separately for positive and negative feedbacks.
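To make the epoching step concrete, the following Python sketch cuts feedback-locked epochs from a single-channel signal and averages them separately by feedback type. It is a simplified stand-in for the EEGLAB processing, not a reproduction of it. It assumes the 512 Hz sampling rate given in the EEG recording section; the event format (a list of onset sample and label pairs) and all function names are hypothetical.

```python
SAMPLING_RATE_HZ = 512  # from the EEG recording section


def ms_to_samples(ms, rate_hz=SAMPLING_RATE_HZ):
    """Convert a duration in milliseconds to a number of samples."""
    return round(ms * rate_hz / 1000)


def extract_epoch(signal, onset, pre_ms=1000, post_ms=1000):
    """Return samples from pre_ms before to post_ms after onset, or None."""
    start = onset - ms_to_samples(pre_ms)
    stop = onset + ms_to_samples(post_ms)
    if start < 0 or stop > len(signal):
        return None  # epoch would run past the edges of the recording
    return signal[start:stop]


def average_by_feedback(signal, events):
    """Average epochs separately for 'win' and 'loss' feedback.

    events is a list of (onset_sample, label) pairs (hypothetical format).
    """
    epochs = {"win": [], "loss": []}
    for onset, label in events:
        epoch = extract_epoch(signal, onset)
        if epoch is not None:
            epochs[label].append(epoch)
    return {
        label: [sum(column) / len(group) for column in zip(*group)]
        for label, group in epochs.items()
        if group
    }


# Toy recording: 8 s of flat signal with three feedback events.
signal = [0.0] * (8 * SAMPLING_RATE_HZ)
events = [(1024, "win"), (2048, "loss"), (3072, "win")]
averages = average_by_feedback(signal, events)
print({label: len(erp) for label, erp in averages.items()})
```

With a 2000 ms epoch at 512 Hz, each averaged waveform contains 1024 samples, with feedback onset at sample index 512.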

No significant differences were found between the two groups in the number of epochs corresponding to positive and negative feedback.

Artifact correction was performed using independent component analysis (ICA; Makeig et al., 1996). The mean amplitude in three time windows was calculated at the midline electrodes Fz, Cz, and Pz, to measure the P200 (150–250 ms), FRN (250–350 ms), and P300 (350–450 ms). These values were submitted to a mixed-model repeated-measures ANOVA, with the factors Interval (150–250 ms, 250–350 ms, and 350–450 ms), Site (Fz, Cz, Pz), Feedback type (win vs. loss), and Group (PD patients vs. control group). Bonferroni correction for multiple comparisons was applied.
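The window-based amplitude measure can be illustrated with a minimal Python sketch. It assumes an averaged waveform sampled at 512 Hz that starts 1000 ms before feedback onset, as described for the epochs, and it treats each window as half-open (start included, end excluded), which is a choice made here and not stated in the text. The names `window_mean` and `component_amplitudes` are hypothetical.

```python
SAMPLING_RATE_HZ = 512
PRE_STIMULUS_MS = 1000  # each epoch starts 1000 ms before feedback onset
WINDOWS_MS = {"P200": (150, 250), "FRN": (250, 350), "P300": (350, 450)}


def window_mean(erp, start_ms, stop_ms,
                rate_hz=SAMPLING_RATE_HZ, pre_ms=PRE_STIMULUS_MS):
    """Mean amplitude of erp between start_ms and stop_ms after feedback."""
    first = round((pre_ms + start_ms) * rate_hz / 1000)
    last = round((pre_ms + stop_ms) * rate_hz / 1000)
    segment = erp[first:last]  # half-open window: first included, last excluded
    return sum(segment) / len(segment)


def component_amplitudes(erp):
    """Return the mean amplitude in each of the three analysis windows."""
    return {name: window_mean(erp, start, stop)
            for name, (start, stop) in WINDOWS_MS.items()}


# Toy waveform: flat at 0 except for a 2.0 plateau covering the FRN window
# (samples 640 to 690 correspond to 250-350 ms after feedback onset).
erp = [0.0] * 1024
for i in range(640, 691):
    erp[i] = 2.0
print(component_amplitudes(erp))
```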

## **RESULTS**

## **BEHAVIORAL RESULTS**

The modal values of deck choices, calculated for each subject in the two groups, showed that ten of our 15 patients (67%) preferred disadvantageous decks; only five patients (33%) preferred advantageous decks. The control group showed the opposite pattern: of 15 participants, 80% preferred advantageous decks, while only three subjects (20%) chose a disadvantageous deck as their preferred deck. The pattern of these choices was significantly different between patients and controls (χ<sup>2</sup>(1) = 6.65; *p* < 0.05).
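The reported chi-square value can be checked by hand from the counts implied by these percentages (10 vs. 5 patients and 3 vs. 12 controls preferring disadvantageous vs. advantageous decks). The Python sketch below computes the uncorrected Pearson chi-square statistic for a 2 × 2 table; the function name is hypothetical, and no continuity correction is applied, which is an assumption about how the reported value was obtained.

```python
def chi_square_2x2(table):
    """Uncorrected Pearson chi-square statistic for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: PD patients, controls. Columns: disadvantageous, advantageous.
observed = [[10, 5],
            [3, 12]]
print(round(chi_square_2x2(observed), 2))  # 6.65
```

The uncorrected statistic on 1 degree of freedom matches the value reported in the text.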

The correlational analysis results on the *total* IGT score showed no significant correlations between the performance on the IGT and the disease's duration, the age of onset and the motor UPDRS score (*p* > 0.05).

Regarding the learning trend over time during the task, the ANOVA on *learning* IGT scores showed a main effect of Time [*F*(4,112) = 14.27, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.338], indicating that overall participants chose the advantageous decks more frequently in the last block than in the first (*p* < 0.05). A significant Time × Group interaction [*F*(4,112) = 3.75, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.118] showed that, despite a better performance of PD patients in the first block (*p* < 0.05), PD patients had a significantly lower *learning* IGT score than the control group in the fifth block (*p* < 0.05; see **Figure 1**).
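The partial eta squared values follow directly from the *F* statistics and their degrees of freedom through the standard identity η<sub>p</sub><sup>2</sup> = (*F* · df<sub>effect</sub>) / (*F* · df<sub>effect</sub> + df<sub>error</sub>). The short Python sketch below (with a hypothetical function name) applies this identity to the two effects reported above as a consistency check.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of Time: F(4,112) = 14.27
print(round(partial_eta_squared(14.27, 4, 112), 3))  # 0.338
# Time x Group interaction: F(4,112) = 3.75
print(round(partial_eta_squared(3.75, 4, 112), 3))   # 0.118
```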

#### **ERPs RESULTS**

The feedback-locked ERPs of both groups are displayed in **Figure 2**.

The analysis of the mean amplitude recorded in the three time intervals after feedback onset (150–250 ms, 250–350 ms, and 350–450 ms) and at the midline electrodes Fz, Cz, and Pz showed main effects of Site [*F*(2,56) = 4.46, *p* < 0.05, η<sup>2</sup><sub>p</sub> = 0.137] and Feedback type [*F*(1,28) = 7.07, *p* < 0.05, η<sup>2</sup><sub>p</sub> = 0.202]: mean activity between 150 and 450 ms after feedback onset had a higher amplitude at Cz (3.00 μV) than at Fz (2.57 μV) and Pz (1.89 μV). In addition, the ERPs amplitude was greater after positive feedback (2.96 μV) than after negative feedback (2.02 μV). The difference between positive and negative feedback was significant between 250 and 450 ms, as indicated by the Feedback∗Time interaction [*F*(2,56) = 3.16, *p* < 0.05, η<sup>2</sup><sub>p</sub> = 0.102]. The Site∗Group interaction [*F*(2,56) = 4.53, *p* < 0.05, η<sup>2</sup><sub>p</sub> = 0.139] and subsequent *post hoc* comparisons indicated that PD patients had a lower (*p* < 0.005) amplitude at the frontal site (Fz) compared with the central site (Cz), and a comparable amplitude at the central (Cz) and parietal (Pz) sites. In contrast, the control group showed significantly lower activity (*p* < 0.05) at the parietal site (Pz) than at the central (Cz) and frontal (Fz) ones (see **Figure 3**).

The Site∗Feedback type interaction was also significant [*F*(2,56) = 4.0, *p* < 0.05, η<sup>2</sup><sub>p</sub> = 0.126], indicating significant differences between positive and negative feedback-evoked responses at Fz and Pz.

Finally, a significant Feedback∗Time∗Group interaction [*F*(2,56) = 5.21, *p* < 0.01, η<sup>2</sup><sub>p</sub> = 0.157] indicated that PD patients and the control group presented different feedback-evoked responses. *Post hoc* comparisons specified that, in the control group, the mean amplitude in both the 250–350 ms and 350–450 ms time windows was significantly different after positive and negative feedback (*p* < 0.05). In PD patients, by contrast, no significant differences between feedback-evoked responses were found (see **Figure 4**). Furthermore, *post hoc* comparisons also revealed that PD patients and the control group showed different ERPs responses at the Pz channel after negative feedback, specifically in the time window between 250 and 350 ms (*p* < 0.05).

## **DISCUSSION**

In the current study we examined behavioral responses and their neural correlates during the IGT (Bechara et al., 1994), a task that simulates an uncertain decision-making situation, in a sample of non-demented and non-depressed PD patients on therapy. Our aims were to add evidence on this topic, given the discordant findings from previous reports, and to focus on the cortical responses during feedback processing using ERPs (Fang et al., 2009). To the best of our knowledge, this is the first study to explore ERPs during the IGT in PD.

The present results indicate that medicated PD patients performed worse on the IGT than a control group of healthy subjects. While controls showed a learning process during the task (i.e., they progressively chose the advantageous decks more frequently across the experimental blocks), PD patients preferred disadvantageous decks and, more interestingly, did not improve across the task. ERPs findings suggest that the problem with learning a strategy during the task is secondary to abnormal feedback processing in PD patients. In normal controls, ERPs behaved differently according to the feedback valence: they did not differ in voltage amplitude in the early window (150–250 ms), but they were significantly larger for wins than for losses in the windows that correspond to the FRN (250–350 ms) and P300 (350–450 ms) components, respectively. In contrast, no difference according to the valence of the feedback was found for any time window in patients. Furthermore, the scalp topography of ERPs was shifted posteriorly in PD patients when compared to controls.

In accordance with previous studies (Czernecki et al., 2002; Perretta et al., 2005; Mimura et al., 2006; Pagonabarraga et al., 2007; Kobayakawa et al., 2008, 2010; Gescheidt et al., 2012), our behavioral data indicate a difficulty to learn and follow a successful strategy to improve their performance and a preference for disadvantageous decks in PD patients.

In keeping with previous reports (Pagonabarraga et al., 2007), IGT performance was not correlated with age of onset, PD duration or motor severity, indicating that impairment in decision-making and motor performance are unrelated to each other. This finding is in agreement with the clinical evidence of executive dysfunction despite very good motor performance in some PD patients. Any possible effect of dementia was ruled out by the inclusion of non-demented patients in our study.

The analysis of feedback-related ERPs offered some insight into the brain mechanisms underlying the abnormal IGT performance in our PD patients. To better explore the different stages of feedback processing, we analyzed ERPs across three windows.

The first window (150–250 ms) comprised the very early component, which is named P200 and is more marked in the frontal regions (Polezzi et al., 2008; Schuermann et al., 2012). The second window (250–350 ms) was focused on the FRN, which reflects the early feedback appraisal on a binary good vs. bad classification according to the subject's expectation (Schuermann et al., 2011) and whose source is located in the medial frontal cortex (Gehring and Willoughby, 2002). The third window (350–450 ms) explored the P300, which is related to a more complex feedback evaluation reflecting the allocation of motivational and attentional resources and shows its largest amplitude in the central and parietal regions (Schuermann et al., 2011; Cui et al., 2013).

We found that, while ERPs amplitudes were significantly larger for positive vs. negative feedback in normal controls, this difference was absent in PD patients both for the FRN and the P300 time windows. At variance the behavior of the P200 window was the same in the two groups. These findings suggest that PD patients are not able to separate feedbacks according to their valence and that these abnormalities occur across different stages of feedback evaluation.

The FRN is an index of the violation of the expectations of the subject rather than of the absolute valence of the feedback and is generated in the ACC (Holroyd et al., 2006; Oliveira et al., 2007; Jessup et al., 2010; Alexander and Brown, 2011).

At variance, the P300 is a more complex phenomenon that reflects the valence of the feedback, contributes to performance monitoring and behavioral adaptation (Ferdinand and Kray, 2013) and is influenced by attention and working memory updating (Donchin and Coles, 1988; Polich, 2004, 2007). The P300 typically shows the *positivity effect* (i.e., a larger amplitude to positive than negative feedback), which is supposed to reflect a positive feedback as more task relevant, because it signals that the intended goal has been achieved (Bellebaum and Daum, 2008; Ferdinand and Kray, 2013).

It is not surprising that the P200 component did not change in relation to positive vs. negative feedback in patients and controls, as this ERP component has been found to be related to the unpredictability of outcomes, rather than their valence (Polezzi et al., 2008).

Behavioral and ERPs abnormalities in PD patients might be explained in light of current knowledge of the functional anatomy underlying IGT performance. A brain network including the amygdala, the orbital PFC (oPFC), the ACC, the dorsolateral PFC (dlPFC) as well as ventral and dorsal striatum is critically involved in decision-making (Delazer et al., 2009). FRN changes might be ascribed to abnormal activity in the ACC (Gehring and Willoughby, 2002), which plays a major role in this network.

Two hypotheses may be set forth to explain in more detail the mechanisms underlying our findings. Sensitivity to negative stimuli has been associated with the integrity of the amygdala (Bechara et al., 1999), which might be involved in the presymptomatic stage of PD according to Braak's neuropathological staging (Braak et al., 2006). This first hypothesis is in keeping with previous reports of abnormal electrodermal responses during the IGT in PD, similar to those of amygdala-damaged patients (Kobayakawa et al., 2008), especially to negative feedback (Euteneuer et al., 2009).

The second hypothesis stems from a neurobiologically based computational model, which indicates that negative feedback triggers dopamine dips in the basal ganglia indirect pathway leading to No-Go-learning in decision-making (Frank et al., 2004). Dopaminergic drugs might impair learning from negative feedback, because they block the physiological effect of dopamine dips (Frank et al., 2004; Euteneuer et al., 2007). This model would fit well with the P300 abnormalities in PD patients along with the difficulties in learning a strategy during the IGT.

The relative dopamine sparing of the circuit linking the ventral striatum to the oPFC in comparison to that connecting the dorsal striatum and the dlPFC would lead to normalization of the function of the latter with a relative dopaminergic overdose in the former resulting in an impairment of decision-making tasks such as the IGT (Perretta et al., 2005).

This view is supported by functional neuroimaging studies documenting a dysfunction of the non-motor loop linking the oPFC, and the ACC to the ventral striatum (Thiel et al., 2003), more evident after negative feedback, despite a preservation of the dlPFC and the amygdala (Gescheidt et al., 2013) and by similar findings of abnormal IGT learning in patients with lesions restricted to the oPFC (Bechara et al., 2000).

Our findings seem to support the second hypothesis: the significant difference found in the ERPs response evoked by negative feedback supports the assumption that dopaminergic medication specifically affects the processing of negative stimuli. The role of dopaminergic drugs in impairing the response to negative feedback is further supported by previous studies on IGT (Perretta et al., 2005; Pagonabarraga et al., 2007; Kobayakawa et al., 2008; Delazer et al., 2009) and reward-learning (Bódi et al., 2009) in medicated PD, as well as by the normal IGT performance in *de novo* non-medicated PD patients (Poletti et al., 2010).

In accordance with previous reports, we found no difference between levodopa and dopamine agonists (Pagonabarraga et al., 2007; Kapogiannis et al., 2011), but the small number of subjects and the use of multiple medications in some patients may have contributed to this negative finding.

The absence of any effect of drug class appears to be in contrast with the notion that pathological gambling is more frequent in patients on dopamine agonists (Weintraub and Nirenberg, 2013).

Despite the similarity between the present IGT abnormalities in PD and those found in pathological gamblers (Goudriaan et al., 2005), none of our patients presented symptoms of gambling.

However, PD patients frequently show impulsive behavior and the present ERP abnormalities are similar to those found in a wider spectrum of neuropsychiatric conditions that share the presence of impulse control disorder and include borderline personality disorder (Schuermann et al., 2011), attention-deficit/hyperactivity disorder and bipolar disorder (Ibanez et al., 2012), and problem gambling (Oberg et al., 2011).

The exclusion of depressed patients ruled out a possible contribution of the dysfunction of serotoninergic pathways, which contribute to learning (Gleichgerrcht et al., 2010), in our patients.

While ERPs were larger at more anterior sites (Fz, Cz) in comparison to Pz in normal controls, they appeared to be significantly smaller at Fz in patients, indicating a posterior shift in PD. The anterior-to-posterior gradient of ERPs is a well-known finding in older adults and has been interpreted in terms of compensatory resource allocation (Reuter-Lorenz and Lustig, 2005; Adrover-Roig and Barceló, 2010; Daffner et al., 2006, 2011; Ferdinand and Kray, 2013). At variance, the posterior shift of ERPs in PD patients suggested a different recruitment of neural resources and could be interpreted as a failure of PD patients in this compensatory mechanism, probably due to PFC dysfunction. A similar shift toward posterior electrodes was found with early somatosensory evoked potentials (SEPs), which documented a marked reduction of the frontal N30 component and indicated a frontal dysfunction in PD patients (Garcia et al., 1995; Bostantjopoulou et al., 2000).

In conclusion, our behavioral results confirmed worse IGT performance in medicated PD patients, and ERPs data offered some insight into the underlying mechanisms, pointing to PFC dysfunction related to dopaminergic treatment. These abnormalities are in line with the growing literature about changes in feedback processing in this condition (Frank et al., 2004, 2007; Bódi et al., 2009; Kobayakawa et al., 2010; Kapogiannis et al., 2011) and may contribute to cognitive and behavioral problems related to impulse control disorder and to the impairment in every-day decisions that is common in PD patients.

## **REFERENCES**


Delazer, M., Sinz, H., Zamarian, L., Stockner, H., Seppi, K., Wenning, G. K., et al. (2009). Decision making under risk and under ambiguity in Parkinson's disease. *Neuropsychologia* 47, 1901–1908. doi: 10.1016/j.neuropsychologia.2009.02.034

Delorme, A., and Makeig, S. (2004). EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. *J. Neurosci. Methods* 134, 9–21. doi: 10.1016/j.jneumeth.2003.10.009


alcohol dependents, persons with Tourette syndrome, and normal controls. *Cogn. Brain Res.* 23, 137–151. doi: 10.1016/j.cogbrainres.2005.01.017


mind in the frontal variant of fronto-temporal dementia. *Neuropsychologia* 45, 342–349. doi: 10.1016/j.neuropsychologia.2006.05.031


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 April 2014; accepted: 13 June 2014; published online: 03 July 2014. Citation: Mapelli D, Di Rosa E, Cavalletti M, Schiff S and Tamburin S (2014) Decision and dopaminergic system: an ERPs study of Iowa gambling task in Parkinson's disease. Front. Psychol. 5:684. doi: 10.3389/fpsyg.2014.00684*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Mapelli, Di Rosa, Cavalletti, Schiff and Tamburin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Cognition and emotional decision-making in chronic low back pain: an ERPs study during Iowa gambling task

## *Stefano Tamburin<sup>1</sup>\*, Alice Maier<sup>1</sup>, Sami Schiff<sup>2</sup>, Matteo F. Lauriola<sup>3</sup>, Elisa Di Rosa<sup>4</sup>, Giampietro Zanette<sup>3</sup> and Daniela Mapelli<sup>4,5</sup>*

<sup>1</sup> Section of Neurology, Department of Neurological and Movement Sciences, University of Verona, Verona, Italy

<sup>2</sup> Department of Medicine, University of Padova, Padova, Italy

<sup>3</sup> Section of Neurology, Pederzoli Hospital, Peschiera del Garda, Verona, Italy

<sup>4</sup> Department of General Psychology, University of Padova, Padova, Italy

<sup>5</sup> Human Inspired Technologies Research Center, University of Padova, Padova, Italy

#### *Edited by:*

Jeng-Ren Duann, China Medical University, Taiwan

#### *Reviewed by:*

Vasco Galhardo, Universidade do Porto, Portugal

Ya Wang, Chinese Academy of Sciences, China

Kai Wang, Anhui Medical University, China

Michiko Kano, Tohoku University, Japan

#### *\*Correspondence:*

Stefano Tamburin, Section of Neurology, Department of Neurological and Movement Sciences, University of Verona, Piazzale Scuro 10, 37134 Verona, Italy e-mail: stefano.tamburin@univr.it

Previous reports documented abnormalities in cognitive functions and decision-making (DM) in patients with chronic pain, but these changes are not consistent across studies. Reasons for these discordant findings might include the presence of confounders, variability in chronic pain conditions, and the use of different cognitive tests. The present study aimed to add evidence in this field by exploring the cognitive profile of a specific type of chronic pain, i.e., chronic low back pain (cLBP). Twenty-four cLBP patients and 24 healthy controls underwent a neuropsychological battery and we focused on emotional DM abilities by means of the Iowa gambling task (IGT). During the IGT, behavioral responses and the electroencephalogram (EEG) were recorded in 12 patients and 12 controls. Event-related potentials (ERPs) were averaged offline from EEG epochs locked to the feedback presentation (4000 ms duration, from 2000 ms before to 2000 ms after the feedback onset) separately for wins and losses, and the feedback-related negativity (FRN) and P300 peak-to-peak amplitudes were calculated. Among cognitive measures, cLBP patients scored lower than controls in the modified card sorting test (MCST), and the score in this test was significantly influenced by pain duration and intensity. Behavioral IGT results documented worse performance and the absence of a learning process during the test in cLBP patients compared to controls, with no effect of pain characteristics. ERPs findings documented abnormal feedback processing in patients during the IGT. cLBP patients showed poor performance in the MCST and the IGT. Abnormal feedback processing may be secondary to impingement of chronic pain on brain areas involved in DM or suggest the presence of a predisposing factor related to pain chronification. These abnormalities might contribute to the impairment in the work and family settings that cLBP patients often report.

**Keywords: chronic pain, Iowa gambling task (IGT), decision-making, event-related potentials (ERPs), low back pain**

## **INTRODUCTION**

Cognition indicates the brain's acquisition, processing, storage and retrieval of information, but is also used to describe integrative neuropsychological processes such as mental imaging, problem solving and perception, and is pertinent to emotion and affect (Moriarty et al., 2011).

Among cognitive processes, decision making (DM) is a *complex process that encompasses a range of functions through which motivational processes make contact with action selection mechanisms to express one behavioral output rather than any of the available alternatives* (Rogers, 2011). DM depends on a number of control functions, including selection and inhibition, working memory, planning, emotion, estimation, and other processes included in the domain of the executive functions (EFs). Among these functions, choice evaluation, response selection, and feedback processing play a major role (Fang et al., 2009). Feedback processing is pivotal, in that assigning a positive or negative valence to an option on the basis of previous experience is the prerequisite for the evaluation and anticipation of action outcomes and for an efficient response selection (Mapelli et al., 2014).

The anatomical substrate of DM is a complex network including the prefrontal cortex (PFC), the anterior cingulate cortex (ACC), the fronto-striatal and limbic loops, and some subcortical structures and DM abnormalities are common in patients with lesions or diseases affecting these areas (Gleichgerrcht et al., 2010).

In an attempt to mimic real-life DM scenarios, Bechara et al. (1994) developed the Iowa gambling task (IGT), which simulates, in a laboratory environment, DM strategy by factoring the uncertainty of premises and outcomes, as well as reward and punishment. Performance on the IGT is negatively affected by neurological and psychiatric disorders (Brand et al., 2006; Dunn et al., 2006; Mapelli et al., 2014), neurodegenerative changes affecting the PFC (Ernst et al., 2002; Manes et al., 2002; Clark and Manes, 2004; Fellows and Farah, 2005), and deficits in working memory (Manes et al., 2002) and fluid intelligence (Roca et al., 2009).

Longstanding evidence indicates that chronic pain, i.e., pain persisting for 3 months or longer (Merskey and Bogduk, 1994), may have a negative impact on cognition (Moriarty et al., 2011), including working memory, long-term memory and recognition (Grace et al., 1999; Luerding et al., 2008), attention (Grace et al., 1999), EFs, and DM (Weiner et al., 2006; Verdejo-Garcia et al., 2009). Due to its biological salience, pain is an attention-demanding sensory process, but cognitive changes cannot be simply attributed to the attentional demand of ongoing pain.

Morphometric magnetic resonance imaging (MRI) demonstrated gray matter atrophy in the dorsolateral PFC (Apkarian et al., 2004a). Functional MRI showed that, in chronic pain patients, experimental noxious stimuli cause decreased activity in brain regions identified for acute pain (Peyron et al., 2000; Apkarian et al., 2005) and increased activity in regions that are not part of the spinothalamic pathway, mainly the PFC and related subcortical structures (Apkarian et al., 2005). These findings indicate that chronic pain is associated with reduced gain in brain regions involved in acute pain and increased gain in areas outside the classical *pain matrix*. They also suggest that chronic pain may impinge on the PFC and the related network and could be considered a cognitive state that may compete with other cognitive abilities, especially those utilizing the PFC, such as DM (Damasio, 1996; Fuster, 2001).

It is important to exercise caution in interpreting these neuropsychological data, because the majority of cognitive abnormalities have been documented in patients with fibromyalgia (Grace et al., 1999; Luerding et al., 2008; Verdejo-Garcia et al., 2009) and cannot be generalized to other chronic pain conditions. Studies in patients with chronic low back pain (cLBP) yielded discordant findings, in that some of them documented reduced attention, visuospatial skills, and cognitive flexibility (Weiner et al., 2006), but the cognitive profile was nearly normal, except slight DM changes, in another report (Apkarian et al., 2004b).

The goal of the present study was to add evidence in this field by exploring the cognitive profile of a specific type of chronic pain, i.e., cLBP. cLBP patients underwent a neuropsychological battery to explore different cognitive functions and we focused on emotional DM abilities by means of the IGT. Abnormalities in different tests would indicate reduced cognitive abilities secondary to the affective and attentional load of pain. At variance, changes in single cognitive functions would favor the hypothesis of specific mechanisms associated with chronic pain. Moreover, focusing on emotional DM might help to understand whether the PFC changes documented in neuroimaging studies translate into cognitive changes.

To explore the cortical correlates of DM, we measured behavioral responses and recorded their neurophysiological cortical correlates with electroencephalogram (EEG) and event-related potentials (ERPs) during the IGT in a subgroup of cLBP patients and controls. The monitoring of feedback during a DM task evokes a large cortical response mainly localized over central electrodes, which can be separated into a feedback-related negativity (FRN) and a P300, with the former representing an early appraisal of feedback on a binary classification of good vs. bad outcome, and the latter reflecting a later top–down controlled evaluation process that is related to both the valence and the magnitude of the feedback (Gehring and Willoughby, 2002; Yeung and Sanfey, 2004; Hajcak et al., 2006; Holroyd et al., 2006; Wu and Zhou, 2009; Cui et al., 2013; Ferdinand and Kray, 2013; Mapelli et al., 2014).

## **MATERIALS AND METHODS**

#### **SUBJECTS**

We recruited 24 normal subjects, who volunteered as controls, and 24 patients with cLBP (Merskey and Bogduk, 1994) and pain duration >6 months (**Table 1**), for a total of 48 participants. Baseline demographic conditions (sex, age, education) were not significantly different between patients and controls. All participants gave signed informed consent prior to participation in the study, after the protocol had been explained to them in detail. The study was approved by the local ethics committee of the Department of Neurological and Movement Sciences, University of Verona.

The inclusion/exclusion criteria for patients and controls were: age 18–70, normal or corrected to normal vision, absence of neurological or psychiatric disease, no drugs with psychotropic or neurological effects, mini mental state examination score (MMSE; Folstein et al., 1975) >24.

Chronic low back pain patients had a mean pain duration of 72.9 ± 55.8 months (range: 12–180; median: 24). Average pain intensity was rated before the neuropsychological and IGT evaluation and was 5.1 ± 2.7/10 (range: 2–10; median: 5) on a 0–10 numerical rating scale (NRS). At the time of the evaluation, none of the patients was on chronic treatment, except non-steroidal anti-inflammatory drugs when needed, but none of them took any painkiller on the day of testing. The mean score on the Beck Depression Inventory (BDI) was 5.0 ± 3.5/39 (range: 1–14; median: 4), which indicated minimal depression, and the anxiety score on the State Trait Anxiety Inventory (STAI) Y2 was 45.1 ± 4.9/80 (range: 31–54; median: 46), which indicated mild anxiety.

Continuous variables are expressed as mean ± SD, range. †P value from unpaired t-test (continuous variables). ‡P value from the Fisher's exact test (dichotomous variable). cLBP, chronic low back pain.

#### **COGNITIVE MEASURES**

Neuropsychological status was assessed individually by experienced neuropsychologists with a well-validated battery of five tests. The assessment lasted 1 h, with each of the five tests being given to the patients and controls one after the other in the same order. The tests included:

#### *Digit span*

The digit span test, a subtest of the Wechsler memory scale (Wechsler, 1945), is the format used most often for measuring span of immediate verbal recall and working memory. The test consists of seven (from 2 digits to 8 digits) pairs of random number sequences that the examiner reads aloud at the rate of one a second. The patient's task is to repeat each sequence exactly as it is given.

#### *Modified card sorting test (MCST)*

This test is a shorter version (Caffarra et al., 2004) of the Wisconsin card sorting test (Heaton et al., 1993) and assesses the ability to solve problems in response to changing stimuli, the ability to shift and maintain set, and to utilize feedback.

#### *Stroop test*

This test measures sustained attention and some aspects of EFs, such as the ability to elaborate relevant and irrelevant dimensions in parallel and to inhibit an automatic response while performing a task based on conflicting stimuli (Stroop, 1935; Caffarra et al., 2002).

#### *Trail making test (TMT)*

This test is divided in parts A and B and evaluates attention, motor speed and EFs (Reitan, 1992).

#### *Interference memory task (10 and 30 s)*

This test is based on the Brown–Peterson paradigm (Brown, 1958; Peterson and Peterson, 1959) and is a subtest of the neuropsychological battery *esame neuropsicologico breve 2* (short neuropsychological examination version 2; Mondini et al., 2011). This test quantifies the objects that can be held in working memory while preventing participants from using mnemonics or other memory techniques separate from the working memory to increase recall capacity.

#### **IOWA GAMBLING TASK**

Decision-making was assessed with the IGT (Bechara et al., 1994). Although the task was originally designed in analog (non-computerized) form, in our study the IGT was implemented in a computerized version (Mapelli et al., 2014). The experiment ran with the E-Prime 2 software (Psychology Software Tools, Pittsburgh, PA, USA) installed on a personal computer equipped with a 17-inch monitor.

The task consisted of the presentation, on a computer screen, of four decks named A, B, C, and D. Each card in these decks can bring a win or a loss: participants were requested to gain as much as possible, choosing consecutively one card from any of the four decks, until the task shut off automatically after 100 cards. The back of each deck looks the same, but decks differ in composition. Decks A and B are considered *disadvantageous*, because they bring big wins but also expensive losses, producing a net loss of 250€ every 10 cards. Decks C and D are considered *advantageous* ones because they bring small wins, but smaller losses, causing a net gain of 250€ every 10 cards. The instructions given to the participants were the following: *in this screen you can see four decks, two of them are advantageous and two are disadvantageous. Each card of these decks can bring a win or a loss: the goal of this task is to win as much money as possible, and avoid losing money as much as possible, starting from a virtual budget of 2000*€*.* Participants knew neither the number of choices nor which decks were advantageous or disadvantageous. Participants saw on the screen the amount of money that they won or lost; this amount was updated after each choice. The experimental flow of the IGT task is shown in **Figure 1**.

The performance in the IGT test was measured using different parameters. The *total amount of money* was the money at the end of the test. The *modal value of deck choices* was explored by calculating the mode of the distribution of the deck choices for each subject of the two groups. The *learning IGT score* was calculated according to previous reports (Bechara et al., 1994; Fukui et al., 2005; Mapelli et al., 2014). To this aim, the 100 picks were divided into five blocks of 20 cards. For each block, the difference between the number of cards picked from *advantageous* decks (C and D) minus those picked from *disadvantageous* ones (A and B) was calculated. In this way, five *learning IGT scores*, one for each block, were obtained for each subject, and the comparison between these scores was considered as an index of learning. An increasing value of the *learning IGT score* from the first to the last block indicates a preference for *advantageous* decks and the learning of the right pick strategy. Finally, the *total IGT score* was calculated by means of the difference between overall *advantageous* choices minus overall *disadvantageous* ones.
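The behavioral scores described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names and the sequence of picks are invented for the example, and every pick is assumed to come from one of the four decks A–D.

```python
from collections import Counter

ADVANTAGEOUS = {"C", "D"}  # decks A and B are the disadvantageous ones


def learning_scores(choices, block_size=20):
    """Per-block score: (picks from C and D) minus (picks from A and B)."""
    scores = []
    for start in range(0, len(choices), block_size):
        block = choices[start:start + block_size]
        good = sum(1 for deck in block if deck in ADVANTAGEOUS)
        scores.append(good - (len(block) - good))
    return scores


def total_score(choices):
    """Overall advantageous minus disadvantageous picks (sum of block scores)."""
    return sum(learning_scores(choices))


def modal_deck(choices):
    """Most frequently chosen deck for one participant."""
    return Counter(choices).most_common(1)[0][0]


# Made-up sequence of 100 picks drifting toward the advantageous decks.
choices = ["A"] * 15 + ["C"] * 5 + ["B"] * 10 + ["D"] * 10 + ["C"] * 60

print(learning_scores(choices), total_score(choices), modal_deck(choices))
```

An increasing sequence of block scores, as in this invented example, corresponds to the learning pattern described in the text.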

#### **EEG RECORDING**

Electroencephalogram and ERPs were recorded in a subgroup of 12 controls and 12 cLBP patients. During the IGT, the EEG was acquired from an array of 32 Ag/AgCl electrodes through a Micromed electrode system. Electrodes were identified by brain hemisphere (odd numbers = left, even numbers = right) and general cortical zone (F = frontal, C = central, T = temporal, P = parietal, and O = occipital) and they were mounted on an elastic cap, according to the International 10–20 system (Oostenveld and Praamstra, 2001). The left and right mastoids served as reference, while the vertical and horizontal eye movements were recorded with two electro-oculogram (EOG) electrodes, placed below and at the outer canthus of the left eye. The ground electrode was located at the POz channel. The sampling rate was 512 Hz and electrode impedances were <5 kΩ; a digital band-pass filter (0.1–30 Hz) and a notch filter (50 Hz) were applied off-line.

#### **EVENT-RELATED POTENTIALS**

Electroencephalogram data were processed offline using the EEGLAB software (Delorme and Makeig, 2004). Epochs were locked to the feedback presentation (4000 ms duration, from 2000 ms before to 2000 ms after the feedback onset), and the averaging procedure was performed separately for positive and negative feedbacks. Artifact correction was performed using baseline correction in the −500–0 ms time window and independent components analysis technique (Makeig et al., 1996; Delorme and Makeig, 2004).
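To make the epoching and baseline steps concrete, the following plain-Python sketch cuts a feedback-locked epoch from a single-channel signal and subtracts the mean of the −500–0 ms window. It is not EEGLAB code and does not reproduce the authors' pipeline: the function names are hypothetical, the signal is assumed to be a simple list of samples recorded at the 512 Hz rate stated above, and independent components analysis is left out entirely.

```python
SAMPLING_RATE_HZ = 512  # sampling rate reported in the EEG recording section


def ms_to_samples(ms, rate_hz=SAMPLING_RATE_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(ms * rate_hz / 1000)


def extract_epoch(signal, onset_index, pre_ms=2000, post_ms=2000):
    """Return the samples from pre_ms before to post_ms after feedback onset."""
    start = onset_index - ms_to_samples(pre_ms)
    stop = onset_index + ms_to_samples(post_ms)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    return signal[start:stop]


def baseline_correct(epoch, pre_ms=2000, baseline_ms=500):
    """Subtract the mean of the baseline_ms window that precedes onset."""
    onset = ms_to_samples(pre_ms)        # index of feedback onset in the epoch
    n_base = ms_to_samples(baseline_ms)  # number of baseline samples
    baseline = epoch[onset - n_base:onset]
    mean = sum(baseline) / len(baseline)
    return [value - mean for value in epoch]


# Hypothetical signal: a constant offset of 3.0 that steps to 5.0 at feedback.
signal = [3.0] * 2500 + [5.0] * 2500
epoch = extract_epoch(signal, onset_index=2500)
corrected = baseline_correct(epoch)
print(len(epoch), corrected[0], corrected[-1])
```

After correction the pre-feedback samples sit at zero, so post-feedback amplitudes are expressed relative to the baseline window rather than to the arbitrary recording offset.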

The FRN amplitude was calculated as the peak-to-peak amplitude difference between the maximal positivity in the 150–250 ms time window and the minimal negativity in the 250–310 ms time window after feedback presentation in the Fz channel because FRN is maximal in the fronto-central midline (Yeung et al., 2005; Hewig et al., 2007; Li et al., 2009).

The P300 amplitude was calculated as the peak-to-peak amplitude difference between the minimal negativity in the 250–310 ms time window and the maximal positivity in the 310–450 ms time window after feedback presentation, from Pz channel because P300 is maximal at the parietal midline (Gehring and Willoughby, 2002; Cui et al., 2013; Mapelli et al., 2014).
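The two peak-to-peak measures defined above can be written out as follows. This is a minimal sketch under stated assumptions, not the authors' analysis code: the averaged ERP is assumed to be a list of amplitudes with a matching list of latencies in milliseconds relative to feedback onset, window boundaries are treated as inclusive (so the 250 ms sample can fall in two windows), and all names and example values are hypothetical.

```python
def window_values(erp, times_ms, start_ms, end_ms):
    """Amplitudes whose latency lies in [start_ms, end_ms] (inclusive)."""
    return [v for v, t in zip(erp, times_ms) if start_ms <= t <= end_ms]


def frn_amplitude(erp_fz, times_ms):
    """Peak-to-peak FRN at Fz: max positivity in 150-250 ms minus
    min negativity in 250-310 ms."""
    positive_peak = max(window_values(erp_fz, times_ms, 150, 250))
    negative_peak = min(window_values(erp_fz, times_ms, 250, 310))
    return positive_peak - negative_peak


def p300_amplitude(erp_pz, times_ms):
    """Peak-to-peak P300 at Pz: max positivity in 310-450 ms minus
    min negativity in 250-310 ms."""
    negative_peak = min(window_values(erp_pz, times_ms, 250, 310))
    positive_peak = max(window_values(erp_pz, times_ms, 310, 450))
    return positive_peak - negative_peak


# Hypothetical averaged waveform sampled every 10 ms from 0 to 490 ms.
times_ms = list(range(0, 500, 10))
erp = [0.0] * len(times_ms)
erp[times_ms.index(200)] = 4.0    # early positivity
erp[times_ms.index(280)] = -3.0   # negativity shared by both measures
erp[times_ms.index(400)] = 6.0    # late positivity

print(frn_amplitude(erp, times_ms))
print(p300_amplitude(erp, times_ms))
```

Both measures share the same negative peak in the 250–310 ms window, which is why the FRN and P300 amplitudes defined this way are not independent of each other.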

#### **STATISTICAL ANALYSIS**

All tests were carried out with the IBM SPSS version 20.0 statistical package. For the comparison of baseline demographic conditions (patients vs. controls), the unpaired *t*-test was used for continuous variables and the Fisher's exact test for dichotomous ones. For continuous cognitive and IGT outcomes, we used the unpaired *t*-test in case of normal distribution; otherwise the nonparametric Mann-Whitney *U* test was applied. The dichotomous cognitive variables and the modal distribution of deck choices were explored with the Fisher's exact test. The correlation between cognitive and IGT measures and clinical variables (depression and anxiety scores, chronic pain intensity, and duration) was analyzed with the Pearson's coefficient. Learning strategy in the IGT was analyzed with a mixed model repeated-measures ANOVA (within-subjects factor: block, 1 to 5; between-subjects factor: group, controls vs. patients) and *post hoc t*-test with Bonferroni's correction. Homogeneity of variance was analyzed with the Levene's test. The data were transformed (logarithmic transformation) before submitting them to ANOVA in case of an inequality in the variances. The FRN and P300 amplitudes were submitted to a mixed model repeated-measures ANOVA (within-subjects factor: condition, win vs. loss; between-subjects factor: group, controls vs. patients) and *post hoc t*-test with Bonferroni's correction. Results are reported as mean ± SD except when otherwise specified. *P* < 0.05 (two-tailed) was taken as the significance threshold for all the tests.
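As an illustration of the Bonferroni step applied to the *post hoc* comparisons, the sketch below multiplies each raw *p* value by the number of comparisons and checks it against the 0.05 threshold. It is a generic textbook formulation, not a reproduction of the SPSS procedure used in the study; the function name is hypothetical and the five *p* values are invented purely for the example.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p, significant?) for each raw p value.

    Each p value is multiplied by the number of comparisons and capped at 1.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Invented raw p values for five block-wise group comparisons.
raw_p = [0.40, 0.20, 0.008, 0.004, 0.001]
for block, (p_adj, significant) in enumerate(bonferroni(raw_p), start=1):
    print(f"block {block}: adjusted p = {p_adj:.3f}, significant = {significant}")
```

With five comparisons, a raw *p* value must fall below 0.01 to remain significant at the 0.05 level after correction.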

## **RESULTS**

#### **COGNITIVE MEASURES**

The number of modified card sorting test (MCST) right categories was significantly lower (*p* = 0.02) and the number of MCST perseverative errors was significantly higher in patients vs. controls (*p* = 0.03), while the other cognitive scores did not significantly differ between the two groups (**Table 2**). The number of MCST right categories was negatively and significantly correlated with the intensity of pain (Pearson's coefficient = −0.76, *p* = 0.009). The number of perseverative errors was significantly correlated with pain duration (Pearson's coefficient = 0.79, *p* = 0.007).

## **IGT BEHAVIORAL RESULTS**

The *total amount of money* at the end of the IGT was lower in cLBP patients (1492 ± 603€) vs. controls (2069 ± 893€; *p* = 0.014). Depression score (BDI), anxiety score (STAI Y2), duration and intensity of pain were not significantly correlated with the *total amount of money*. The *modal value of deck choices* significantly differed between patients and controls, in that 54% of cLBP patients and 83% of controls preferred *advantageous* decks (Fisher's exact test: *p* = 0.012; **Table 3**).

When analyzing the distribution of the picks across the experimental blocks, normal controls showed an exploratory strategy, in that at the beginning of the test they explored single decks and continued picking cards from the same deck until they learned whether the deck was *advantageous* or not and, once learned, they preferred the *advantageous* decks. In contrast, the picks of the cLBP patients did not follow a clear strategy, but seemed to fluctuate randomly across *advantageous* and *disadvantageous* decks. Normal controls showed a learning process during the task, in that the *learning IGT score* progressively improved throughout the five blocks of the test. In contrast, no clear learning strategy was found in cLBP patients, whose *learning IGT score* did not improve across different blocks and fluctuated close to 0 (**Figure 2**). Repeated-measures ANOVA showed a main effect of the factors block [*F*(4,184) = 13.01; *p* < 0.001], group [*F*(1,46) = 6.11; *p* = 0.036] and a significant block × group interaction [*F*(4,184) = 2.84; *p* = 0.04] on the *learning IGT score*. *Post hoc* analysis with Bonferroni's correction showed that the *learning IGT score* was significantly higher in controls vs. patients in blocks 3, 4, and 5 (**Figure 2**). To rule out any possible effect of concomitant depression, patients were divided into those with and without depression according to BDI (cut-off = 5/39) and the between-subjects factor depression was submitted to repeated-measures ANOVA, which documented that neither the factor depression [*F*(1,22) = 0.8; n.s.] nor the block × depression interaction [*F*(1,22) = 1.9; n.s.] significantly influenced the *learning IGT score*.

**Table 2 | Cognitive measures in patients and controls.**

Continuous variables are expressed as mean ± SD, range. \*flags significant p values when comparing cLBP patients vs. controls. cLBP, chronic low back pain; MCST, modified card sorting test; TMT, trail making test.

Here is reported the type of deck that was the preferred one in cLBP patients and controls (i.e., the mode of the distribution of deck choices). There was a significant difference between the two groups (Fisher's exact test: p = 0.012). cLBP, chronic low back pain.

Depression score (BDI), anxiety score (STAI Y2), duration and intensity of pain were not significantly correlated with the *total IGT score*.

**FIGURE 2 | Learning strategy in the IGT.** Here are shown the learning IGT scores across the five different blocks of the IGT in cLBP patients and controls. A learning process was present in controls, in that the learning IGT score progressively ameliorated throughout the five blocks. No clear learning strategy was found in cLBP patients, whose learning IGT score did not improve across different blocks and fluctuated close to 0. Vertical error bars equal 1 SEM. \*p < 0.05 (after Bonferroni's correction) for cLBP patients vs. controls comparison. cLBP, chronic low back pain; IGT, Iowa gambling task.

#### **ERPs RESULTS**

The subgroups of cLBP patients (*n* = 12) and controls (*n* = 12) did not significantly differ for age, sex and education. Among cognitive measures, the MCST right categories were significantly lower (cLBP patients: 4.0 ± 2.0, controls: 5.6 ± 2.7; *p* = 0.02) and MCST perseverative errors were significantly higher (cLBP patients: 4.6 ± 4.5, controls: 1.4 ± 1.0; *p* = 0.04) in patients vs. controls, while the other outcomes did not significantly differ between the two groups. For IGT, the *total amount of money* was lower in cLBP patients (1460 ± 692€) vs. controls (2027 ± 571€; *p* = 0.04). Repeated-measures ANOVA showed a main effect of the factors block [*F*(4,88) = 7.32; *p* < 0.001], group [*F*(1,22) = 4.45; *p* = 0.047] and a significant block × group interaction [*F*(4,88) = 2.63; *p* = 0.04] on the *learning IGT score*.

The grand-average ERPs in patients and controls are displayed in **Figure 3**. The number of trials was larger for wins (controls: 73.8 ± 3.8, cLBP patients: 76.1 ± 3.8, n.s.) than for losses (controls: 17.9 ± 2.3, cLBP patients: 17.0 ± 2.7, n.s.), but this imbalance was similar in the two groups.

The FRN amplitude in the Fz channel was higher to wins than losses in controls, while the opposite happened in patients (**Figure 4**). Repeated-measures ANOVA showed a significant condition × group interaction [*F*(1,22) = 4.8; *p* = 0.04], while the factors condition [*F*(1,22) = 0.05; n.s.] and group [*F*(1,22) = 1.0; n.s.] did not significantly affect FRN amplitude. *Post hoc* analysis with Bonferroni's correction showed that the FRN amplitude was significantly higher to losses than wins in patients. The FRN amplitude difference for the two types of feedback (i.e., FRN amplitude to wins – FRN amplitude to losses) was significantly different between the two groups (controls: 1.1 ± 3.2; patients: −1.3 ± 1.9; unpaired *t*-test, *p* = 0.04).

The P300 amplitude in the Pz channel was higher to wins than losses in controls, while this difference was absent in patients, in whom the P300 amplitude was similarly high for both types of feedback (**Figure 5**). Repeated-measures ANOVA showed a significant effect of the factor condition [*F*(1,22) = 9.6; *p* = 0.005] and a significant condition × group interaction [*F*(1,22) = 4.7; *p* = 0.04], while the factor group [*F*(1,22) = 0.5; n.s.] did not significantly affect P300 amplitude. *Post hoc* analysis with Bonferroni's correction showed that the P300 amplitude was significantly higher to positive than negative feedback in controls, while no difference between the two types of feedback was found in patients.

The P300 amplitude difference for the two types of feedback (i.e., P300 amplitude to wins – P300 amplitude to losses) was significantly different between the two groups (controls: 1.3 ± 1.5; patients: 0.2 ± 1.0; unpaired *t*-test, *p* = 0.04).

Feedback-related negativity and P300 amplitude were not influenced by depression score (BDI), anxiety score (STAI Y2), duration and intensity of pain.

## **DISCUSSION**

In the present study, we explored cognitive functions and DM in cLBP patients and focused on emotional DM abilities by exploring behavioral responses and their neurophysiological correlates during IGT (Bechara et al., 1994). Our data documented that, among cognitive measures, cLBP patients scored lower than controls only in the MCST and that pain duration and intensity were significantly correlated with the degree of impairment in this test. Behavioral IGT results documented worse performance and the absence of a learning process in cLBP patients compared to controls, with no effect of pain characteristics. ERPs findings suggested abnormal feedback processing in patients during IGT.

**FIGURE 3 | Grand average ERPs in the Fz, Cz, and Pz channels to wins (green lines) and losses (red lines) in controls and cLBP patients.** cLBP, chronic low back pain; ERPs, event related potentials; FRN, feedback-related negativity.

Previous reports on cognitive functions in chronic pain reported conflicting results, in that abnormalities were not consistent and the tasks explored differed across studies (Moriarty et al., 2011). Moreover, robust cognitive changes were mainly documented in patients with fibromyalgia, a chronic pain condition that is nearly always associated with depression, which may have biased the interpretation of the results. Our findings are in keeping with this bulk of literature, as we found that, out of the large battery of tests, only MCST scores were abnormal in cLBP patients. A previous study documented normal scores in the Wisconsin card sorting test in cLBP patients, but the very small sample (six patients) might have reduced the power of the statistical analysis (Apkarian et al., 2004b). MCST explores verbal feedback (right, wrong) processing and set shifting. Set shifting appeared to be preserved in our patients because of the normal score in trail making test (TMT) part B. We may thus speculate that the abnormalities with MCST resulted from a difficulty in feedback elaboration in the dorsolateral PFC.

We found that the intensity and duration of pain were significantly correlated with MCST scores. Pain duration and intensity were quite variable among our patients and this may represent a bias. However, based on our findings, we may hypothesize that pain might represent a competing task leading to worse and slower functioning of the dorsolateral PFC, which is involved in MCST performance. This view is in keeping with morphological MRI studies, which showed reduced size of the dorsolateral PFC in chronic pain patients (Apkarian et al., 2004a), and that the dorsolateral PFC shrinkage can be reversed by pain treatment, suggesting abnormal plasticity to continuous nociceptive afferents (Rodriguez-Raecke et al., 2009, 2013; Seminowicz et al., 2011). It may thus be speculated that intense chronic pain might engage the dorsolateral PFC and cause the abnormalities in MCST, while long pain duration could trigger pathological plastic changes that may be more difficult to reverse in patients with long-lasting pain.

Depression and anxiety did not correlate with the MCST performance in our patients, excluding a possible role of these factors. A limitation of the present study is that we did not explore the role of other factors, such as deprivation of social contacts, agility, physical training and lifestyle changes, which together might have also contributed to the MCST abnormalities (Rodriguez-Raecke et al., 2009, 2013).

Iowa gambling task data showed impairment of both the total amount of money and the learning strategy. cLBP patients won significantly less money than controls and their IGT score did not change throughout the blocks, indicating the absence of a learning curve during the test. The IGT is a relatively difficult task, but normal controls succeeded in keeping the initial amount of money, while patients lost on average a quarter of the sum. The different outcome in the two groups depended on the presence of a learning strategy in controls, who explored the four decks in the first two blocks of the test, then chose preferentially the advantageous ones. In contrast, patients' choices appeared largely random, and there was a higher number of disadvantageous picks in this group. Depression, anxiety and pain characteristics (i.e., pain intensity and duration) did not influence IGT performance.

To the best of our knowledge, only two studies explored IGT in patients with chronic pain, namely in cLBP and complex regional pain syndrome (Apkarian et al., 2004b) and in chronic migraine (Biagianti et al., 2012). Both these previous reports found that IGT performance was worse in chronic pain patients and that this outcome was not or minimally influenced by depression, anxiety and pain characteristics. Our data differ from those of Apkarian et al. (2004b), in that they found a learning strategy, which was delayed in comparison to controls, in cLBP patients. This difference might be ascribed to our IGT protocol, which was slightly different from the majority of previous studies, in that we told the participants that two of the decks were advantageous and two were disadvantageous (Bechara et al., 2000).

The analysis of feedback-related ERPs offered some insight into the brain mechanisms underlying the poor IGT performance in our patients. To better explore the different stages of feedback processing, we analyzed two ERPs components, namely the FRN and P300.

While FRN was slightly larger for positive vs. negative feedback in normal controls, the opposite happened in our patients, who showed a significantly higher amplitude of this component to losses than wins. The FRN reflects early feedback appraisal on a binary good vs. bad classification, is an index of the violation of the expectations of the subject rather than of the absolute valence of the feedback and is generated in the ACC (Gehring and Willoughby, 2002; Holroyd et al., 2006; Oliveira et al., 2007; Jessup et al., 2010; Alexander and Brown, 2011; Schuermann et al., 2011). Our data suggest that cLBP patients seem to invert the correct placement of feedback according to the basic good vs. bad outcome classification. However, this finding should be interpreted with caution because of the absence of the FRN effect in controls. The reasons for the absence of the FRN effect in our normal subjects might include the relatively old age of some of the controls (Hämmerer et al., 2011; West et al., 2014), the personality profiles and/or genetic variables (Mueller et al., 2014), which were not measured in the present study, or the experimental protocol that differed from some previous studies, in that the subjects were told that two decks were advantageous and two were disadvantageous.

Controls had a significantly larger P300 to wins than losses, while this component was similarly large to both types of feedback and not significantly different between the two conditions in our patients. The P300 is a more complex phenomenon that reflects the valence of the feedback, contributes to performance monitoring and behavioral adaptation (Schuermann et al., 2011; Cui et al., 2013; Ferdinand and Kray, 2013) and is influenced by attention and working memory updating (Donchin and Coles, 1988; Polich, 2007). The P300 typically shows the *positivity effect* (i.e., a larger amplitude to positive than negative feedback), which is supposed to reflect a positive feedback as more task relevant, because it signals that the intended goal has been achieved (Ferdinand and Kray, 2013). Similar P300 amplitude to both types of feedback in cLBP suggests that patients are unable to differentiate positive and negative outcomes even at this higher-order stage of outcome processing and that they cannot use the information from previous trials and errors for planning future decisions. The abnormally high amplitude of P300 in both conditions might be interpreted as some sort of ceiling effect due to difficulties in tuning the amplitude of this ERPs component in relation to feedback valence.

Behavioral and ERPs abnormalities in cLBP patients might be explained in light of current knowledge of the functional anatomy of DM, which involves a brain network including the amygdala, the ventromedial and the dorsolateral PFC, the ACC, as well as ventral and dorsal striatum (Delazer et al., 2009). IGT and MCST impairment has been documented in many different clinical conditions involving this network (Bechara et al., 1996, 2000; Rahman et al., 1999, 2006; Fellows and Farah, 2005; Torralva et al., 2009). Healthy aging may also affect the performance in these two tests (Finucane et al., 2002; MacPherson et al., 2002; Kovalchik et al., 2005; Cauffman et al., 2010; Eppinger and Kray, 2011).

Two anatomo-functional hypotheses may be set forth to explain the mechanisms underlying our ERPs findings. Activity in the ventromedial PFC was found to be associated with the fluctuations of pain intensity in cLBP (Baliki et al., 2006; Foss et al., 2006). It may be hypothesized that pain-related activity in the ventromedial PFC might have resulted in an imbalance between ventromedial and dorsolateral PFC leading to the present ERPs abnormalities.

Sensitivity to negative stimuli has been associated with the function of the amygdala (Bechara et al., 1999), which is involved in processing the affective dimension of pain (Giesecke et al., 2005) and influences descending inhibitory pain control through the periaqueductal gray matter (Neugebauer et al., 2004). Based on MRI findings of decreased gray matter bordering the amygdala in patients with cLBP (Ung et al., 2014), we may speculate that continuous nociceptive barrage to the amygdala in patients might cause a dysfunction of this brain structure leading to alteration in feedback processing.

The neuropharmacology of the anatomical network subserving DM points to dopamine (DA) and serotonin. DA is the main neuromodulator of the fronto-striatal loop, and plays a key role (Assadi et al., 2009; Rogers, 2011) in reward processing during reinforcement learning (Schultz, 2002; Frank et al., 2004) and in learning and outcome monitoring (Hämmerer and Eppinger, 2012). Patients with Parkinson's disease, which is characterized by brain DA reduction and DA manipulation by treatment, show an impairment in DM abilities (Hämmerer and Eppinger, 2012; Mapelli et al., 2014). It may be speculated that changes in DA levels might have blocked the physiological dopaminergic bursts and dips (Frank et al., 2004), which together shape the behavioral responses to positive and negative feedbacks. This view is in keeping with a rodent model, which explored an IGT-like task in rats with pain, and documented that rats performed similarly to our patients and that DA levels were reduced in their ventromedial PFC and amygdala (Pais-Vieira et al., 2009). This model would fit well with the ERPs abnormalities in cLBP patients along with the difficulties in learning a strategy during IGT. Serotonin also plays a relevant role in DM (Gleichgerrcht et al., 2010). Some of our patients showed mild levels of depression, but the absence of any significant effect of depression on IGT findings seems to rule out a possible contribution of serotoninergic dysfunction.

In contrast to MCST results, IGT abnormalities were not related to any pain variable. We hypothesize that they may represent a predisposing factor for pain chronification and may help to predict which patients are at risk of developing chronic pain after minor peripheral tissue damage. Studies on pain chronification have recently shifted from peripheral nerve and spinal cord mechanisms to cortical and limbic phenomena (Baliki et al., 2012). Future prospective studies assessing cognitive functions, including IGT, in patients with acute pain and correlating eventual chronification to their impairment should better explore this hypothesis.

The present IGT abnormalities are similar to those found in pathological gamblers (Goudriaan et al., 2005), as well as in a wider spectrum of neuropsychiatric conditions that share the presence of impulse control disorder and include borderline personality disorder (Schuermann et al., 2011), attention-deficit/hyperactivity disorder and bipolar disorder (Ibanez et al., 2012), and problem gambling (Oberg et al., 2011). Chronic pain patients often have to decide whether to take an analgesic or to change their habits to manage pain. Pain killers have an advantage in the short term (high reward) but, in the long term, they might result in adverse consequences such as side effects or addiction (higher punishment). Conversely, alternative choices, such as physical activity, cognitive-behavioral therapies or combined treatment (low reward), might prove more advantageous in the long term (lower punishment). The IGT impairment in cLBP patients might have an important influence on the selection between various therapeutic options. None of our patients presented symptoms of medication overuse or dependency-like behavior, but exploring IGT changes in patients with drug abuse might be interesting, and assessing whether IGT may predict the excessive use of pain killers would have an important role in avoiding this frequent complication of chronic pain.

In conclusion, we documented that cLBP patients show poor performance in DM, as assessed with MCST and IGT. These abnormalities might contribute to the impairment in the work and family settings that cLBP patients often report. Future studies should explore whether these changes may predict the functioning in everyday life.

### **REFERENCES**


a dopamine antagonist on the feedback-related negativity. *Psychophysiology* 51, 805–809. doi: 10.1111/psyp.12226


Wechsler, D. (1945). A standardized memory scale for clinical use. *J. Psychol.* 19, 87–95. doi: 10.1080/00223980.1945.9917223

Weiner, D. K., Rudy, T. E., Morrow, L., Slaboda, J., and Lieber, S. (2006). The relationship between pain, neuropsychological performance, and physical function in community-dwelling older adults with chronic low back pain. *Pain Med.* 7, 60–70. doi: 10.1111/j.1526-4637.2006.00091.x

West, R., Tiernan, B. N., Kieffaber, P. D., Bailey, K., and Anderson, S. (2014). The effects of age on the neural correlates of feedback processing in a naturalistic gambling game. *Psychophysiology* 51, 734–745. doi: 10.1111/psyp.12225

Wu, Y., and Zhou, X. (2009). The P300 and reward valence, magnitude, and expectancy in outcome evaluation. *Brain Res.* 1286, 114–122. doi: 10.1016/j.brainres.2009.06.032

Yeung, N., Holroyd, C. B., and Cohen, J. D. (2005). ERP correlates of feedback and reward processing in the presence and absence of response choice. *Cereb. Cortex* 15, 535–544. doi: 10.1093/cercor/bhh153

Yeung, N., and Sanfey, A. G. (2004). Independent coding of reward magnitude and valence in the human brain. *J. Neurosci.* 24, 6258–6264. doi: 10.1523/JNEUROSCI.4537-03.2004

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 August 2014; accepted: 06 November 2014; published online: 25 November 2014.*

*Citation: Tamburin S, Maier A, Schiff S, Lauriola MF, Di Rosa E, Zanette G and Mapelli D (2014) Cognition and emotional decision-making in chronic low back pain: an ERPs study during Iowa gambling task. Front. Psychol. 5:1350. doi: 10.3389/fpsyg.2014.01350*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Tamburin, Maier, Schiff, Lauriola, Di Rosa, Zanette and Mapelli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Learning on the IGT follows emergence of knowledge but not differential somatic activity

#### *Gordon Fernie<sup>1</sup>\* and Richard J. Tunney<sup>2</sup>*

*<sup>1</sup> Division of Applied Medicine (Psychiatry), University of Aberdeen, Aberdeen, UK <sup>2</sup> School of Psychology, University of Nottingham, Nottingham, UK*

#### *Edited by:*

*Ching-Hung Lin, Kaohsiung Medical University, Taiwan*

#### *Reviewed by:*

*Masataka Watanabe, Tokyo Metropolitan Organization for Medical Research, Japan*

*Navindra Persaud, St. Michael's Hospital, Canada*

#### *\*Correspondence:*

*Gordon Fernie, Division of Applied Medicine (Psychiatry), Clinical Research Centre, University of Aberdeen, Royal Cornhill Hospital Grounds, Aberdeen, AB25 2ZH, UK e-mail: g.fernie@abdn.ac.uk*

The importance of unconscious autonomic activity vs. knowledge in influencing behavior on the Iowa Gambling Task (IGT) has been the subject of debate. The task's developers, Bechara and colleagues, have claimed that behavior on the IGT is influenced by somatic activity and that this activity precedes the emergence of knowledge about the task contingencies sufficient to guide behavior. Since then others have claimed that this knowledge emerges much earlier on the task. However, it has yet to be established whether somatic activity which differentiates between advantageous and disadvantageous choices on the IGT is found before this point. This study describes an experiment to determine whether knowledge sufficient to guide behavior precedes differential autonomic activity or vice versa. This experiment used a computerized version of the IGT, knowledge probes after every 10 trials and skin conductance recording to measure somatic activity. Whereas in previous reports the majority of participants end the task with full conceptual knowledge of the IGT contingencies we found little evidence in support of this conclusion. However, full conceptual knowledge was not critical for advantageous deck selection to occur and most participants had knowledge sufficient to guide behavior after approximately 40 trials. We did not find anticipatory physiological activity sufficient to differentiate between deck types in the period prior to acquiring this knowledge. However, post-punishment physiological activity was found to be larger for the disadvantageous decks in the pre-knowledge period, but only for participants who displayed knowledge. Post-reward physiological activity distinguished between the advantageous and disadvantageous decks across the whole experiment but, again, only in participants who displayed knowledge and then only in later trials following their display of knowledge.

**Keywords: Iowa Gambling Task, somatic marker hypothesis, somatic marker, implicit learning, conscious knowledge, reward learning**

## **INTRODUCTION**

The Iowa Gambling Task (IGT, Bechara et al., 1994) was developed to model complex and uncertain choice environments in a laboratory setting. In it participants make a series of selections from four decks of cards in order to make as much, or lose as little, money as possible. Each deck pays money but all decks also contain losses. The critical aspect of the IGT is that the decks are set up so that those with the highest immediate payoffs have the highest cumulative losses such that their repeated selection will result in an overall loss. Participants must learn to avoid selecting from these decks.
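The trade-off between immediate payoff and cumulative loss can be made concrete with a little arithmetic. The sketch below uses the payoff figures commonly reported for the original task (a win of $100 per card with $1250 of losses per 10 cards for decks A and B, and a win of $50 per card with $250 of losses per 10 cards for decks C and D). These figures are not stated in this article, so they should be treated as an assumption to check against Bechara et al. (1994); the dictionary layout and function name are hypothetical.

```python
# Assumed payoff schedule (per-card win, total loss per 10 cards), in dollars.
# Not given in this article; values as commonly reported for the original IGT.
DECKS = {
    "A": {"win_per_card": 100, "loss_per_10": 1250},
    "B": {"win_per_card": 100, "loss_per_10": 1250},
    "C": {"win_per_card": 50, "loss_per_10": 250},
    "D": {"win_per_card": 50, "loss_per_10": 250},
}


def net_per_10_cards(deck):
    """Net outcome of taking 10 consecutive cards from one deck."""
    schedule = DECKS[deck]
    return 10 * schedule["win_per_card"] - schedule["loss_per_10"]


for deck in sorted(DECKS):
    print(deck, DECKS[deck]["win_per_card"], net_per_10_cards(deck))
```

Under these assumed values the decks with the larger per-card win end up with a negative net outcome over 10 cards, while the decks with the smaller per-card win end up positive, which is the contingency participants must learn.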

Bechara et al. (1996) suggested a role for emotional processing in learning on the IGT. They reported that autonomic activity which preceded deck selections (anticipatory Skin Conductance Responses or aSCRs) differentiated between advantageous and disadvantageous decks as healthy participants learned to select advantageously on the IGT. In an influential paper Bechara et al. (1997) suggested that this differential autonomic activity preceded participants' ability to report any idea about a successful strategy to pursue on the task. Participants were defined as having a "hunch" if they could express the idea that decks A and B were riskier (or C and D were safer) but not articulate explicitly why. If they could detail why A and B were riskier (or C and D were safer) they had "conceptual" knowledge. Bechara et al. (1997) found that, on average, healthy participants entered the "hunch" period by the fourth questioning (after trial 50, although the range was between trials 30 and 80) and the "conceptual" period by the seventh questioning (following trial 80 with a range of 60–90). Bechara et al. reported that anticipatory SCRs for the disadvantageous decks were larger relative to the advantageous decks and claimed that this difference emerged in normal participants approximately between trials 10 and 50, before participants could articulate any knowledge of differences between deck types. However, although significant differences in choices from deck types developed, the difference in aSCR between deck types was never statistically significant. Patients with ventromedial prefrontal cortex damage did not show this differential aSCR activity and preferred the disadvantageous decks leading Bechara et al. (1997) to conclude that the autonomic activity was necessary to choose advantageously on the IGT and, further, as the difference in it preceded any consciously available knowledge, that the autonomic activity acted as an unconscious bias that guided behavior.

Subsequent studies have suggested autonomic activity and IGT performance are related (Bechara et al., 1999, 2000, 2002; Carter and Smith Pasqualini, 2004; Crone et al., 2004) while others have failed to find a link (Tomb et al., 2002; Campbell et al., 2004). But the interpretation of Bechara et al.'s (1997) results has not been without challenge. The main criticism rests on when participants have knowledge about the task contingencies sufficient to guide behavior. Maia and McClelland (2004) replicated Bechara et al.'s (1997) study and asked a separate group of participants more specific questions than used by Bechara et al. (1997). This group had consciously available knowledge sufficient to guide their choices much earlier than reported by Bechara et al. (1997). Crucially, this knowledge was present prior to the point at which Bechara et al. reported that differential aSCR activity emerged. This suggested that participants' behavior could be based on explicit knowledge of the likely contingencies and, therefore, did not require an explanation dependent on unconscious somatic activity. However, Maia and McClelland did not themselves record autonomic activity and so their data cannot rule out the possibility that differential autonomic activity preceded knowledge about the task contingencies.

The relative importance of knowledge about the IGT contingencies vs. autonomic activity has been examined in numerous studies. However, none have directly replicated Maia and McClelland's (2004) methods to examine the changes in participants' knowledge and autonomic activity as they complete the IGT. Gutbrod et al. (2006) measured autonomic activity and knowledge using Bechara et al.'s (1997) general questions every twenty trials in amnesic patients and healthy controls. While their controls learned to select advantageously and achieved hunch knowledge about the IGT, their patients did not. This advantageous selection occurred well before differential aSCRs emerged. Gutbrod et al. (2006) argued that their results demonstrated that knowledge about the task contingencies was the key to success on the IGT as the amnesic patients did not acquire knowledge, select advantageously or generate differential anticipatory autonomic activity but post-punishment SCRs did differentiate between deck types. However, Gutbrod et al.'s method introduced a delay between selection and feedback that may have made the task extremely difficult for amnesic patients. Without such long delays amnesic patients can learn to select advantageously on the IGT (Turnbull and Evans, 2006). Unfortunately, Gutbrod et al. (2006) did not detail when controls' knowledge emerged. But, like Maia and McClelland (2004), Evans et al. (2005) found healthy participants differentiated between deck types at above chance levels after only 20 trials.

Persaud et al. (2007) explored knowledge of deck contingencies on the IGT using post-decision wagering (PDW) as a novel measure of awareness. Their results suggest that the difference in the questions used by Bechara et al. (1997) and Maia and McClelland (2004) results in earlier awareness of the contingencies when Maia and McClelland's specific questions are used. Interestingly, in Persaud et al. (2007) the emergence of advantageous PDW closely corresponds to when Bechara et al. suggest their participants possessed conceptual, rather than hunch, knowledge of the deck contingencies when general questions are used, whereas with more specific questioning advantageous PDW is closer to when Maia and McClelland found hunch level knowledge. However, neither question style affected the time at which behavioral preference for the advantageous decks emerged nor did it appear to affect overall performance on the IGT. These results raise the possibility that IGT selection behavior does not simply follow acquisition of knowledge of deck contingencies, as suggested by Maia and McClelland's results, and so open the possibility that autonomic activity separately influences behavior.

Guillaume et al. (2009) recorded skin conductance responses and heart rate during the IGT and explored knowledge using methods similar to Maia and McClelland's specific questions. However, knowledge was only examined at the end of the task rather than concurrently. Thus, Guillaume et al. (2009) were unable to determine when knowledge of the task contingencies emerged and if it influenced autonomic activity. They did report that participants with more accurate knowledge of the contingencies selected more advantageously than those with less accurate knowledge; that participants generated larger anticipatory SCRs before selecting from the disadvantageous vs. the advantageous decks; and IGT performance was positively correlated with the difference in this autonomic response and with degree of knowledge but the latter measures were uncorrelated.

Other researchers have examined the relationship between autonomic activity and explicit contingency knowledge using post-task questionnaires. Suzuki et al. (2003) found differential aSCR activity in the first 40 trials, replicating Bechara et al. (1996, 1997), but no differences in ratings of deck riskiness between groups split *post-hoc* on their post-selection SCR levels, implying no relationship between knowledge and SCR levels. Kleeberg et al. (2004) found aSCR and post-punishment SCRs started at a higher level and increased faster in their healthy comparison group compared to patients with MS. The healthy controls learned faster but there was no correlation with autonomic activity. Patients were generally correct when asked which decks it was best to avoid but less neurologically impaired patients made fewer disadvantageous selections and their aSCRs increased across the task leading the authors to conclude that since knowledge equated between patient groups, but somatic activity did not, cognitive appraisal was not sufficient to account for advantageous IGT behavior. But to reiterate, *post-hoc* questioning cannot inform on when awareness develops. Instead, an examination of contingency knowledge and autonomic activity is required to determine whether the two are dissociable. To this end we report an experiment using the method of assessing awareness described by Maia and McClelland (2004) along with a measure of autonomic activity derived from skin conductance recording. Our aim is to determine whether knowledge sufficient to guide behavior precedes differential autonomic activity or vice versa.

## **MATERIALS AND METHODS**

#### **DESIGN**

The experiment was a replication of Maia and McClelland's (2004) study with the addition that skin conductance responses were measured. A mixed design was used, with Question Group (General or Specific) as a between-subjects factor and Block of trials as a within-subjects factor. Three dependent measures were obtained: participants' deck selections on the IGT, participants' knowledge of the task contingencies, and the change in participants' physiological arousal prior to card selection (aSCRs) and following card selection (rSCRs or pSCRs: reward or punishment SCRs).

#### **PROCEDURE**

On arrival for testing participants were given a brief description of the task, an account of what was involved in the recording of electrodermal activity, and in the General Question Group, information about the recording of their answers using a tape recorder. These participants were told that questions would appear on the computer screen periodically throughout the task and they must speak their answers into the tape recorder. It was emphasized to all participants that the experimenter would not interact with them nor answer any questions about the task after the opportunity to ask questions about instructions had ended (following their acknowledgement that they understood the task instructions). Informed consent was obtained from all participants.

The index and middle fingers of participants' left hands were cleaned using an alcohol free wet-wipe. Once dry an isotonic (0.5% saline) gel (Biopac Gel 101) was rubbed into the skin of the medial phalanges of the index and middle fingers of participants' left hand before the MP30 electrodes were attached. Participants were instructed that it was important to stay as still as possible throughout the experiment and to make themselves comfortable so that they only moved their right hand when controlling the mouse, and in the Specific Question Group, when they entered answers using the keyboard.

Participants then read the task instructions. These were exactly the same as those used by Bechara et al. (1999, 2000) and Fernie and Tunney (2006), with the addition of information about the periodic interruptions in which questions would be asked. A period of at least 5 minutes was allowed to elapse from electrode attachment to task commencement to allow the electrode gel time to be absorbed into each participant's skin. During this time participants were informed that the experimenter would be present in the room but would not be monitoring their performance. Participants were told that the purpose of the experimenter's presence was to monitor the SCR record and, in the General Question Group, to operate the tape recorder when required. They were told that there would be no interaction with the experimenter except if, in the Specific Question Group, clarification was needed on the terms used in the questionnaire. Participants were then reminded that the most important thing was to earn as much money as possible, or to avoid losing as much as possible.

SCRs were recorded without interference until the task ended. The experiment began once a visual inspection indicated that the apparatus was reliably recording electrodermal activity. An on-screen message instructed the participants to consider which deck they would like to choose. No decks could be selected while this message was on-screen. After 5 seconds another message appeared telling participants to "Please select a card." The mouse pointer re-appeared and the decks became active. The 5 seconds prior to deck choice constituted the period during which SCRs were considered to be anticipatory. Following the selection of a card the computer displayed the amount won accompanied by the sound of a man shouting "Yippee!" This sound was marked on an analog channel of the SCR record and allowed the accurate pinpointing of SCR events in relation to deck choices. One second after the reward, the amount lost was displayed accompanied by the sound of a man shouting "Doh!" The reward and loss information remained on-screen for 5 seconds. The instruction to "Consider your next choice" was then displayed for 5 seconds before participants were again instructed to choose a card. SCRs in the 5 seconds following deck selection were considered to be post-selection SCRs. Therefore, the inter-trial interval was at least 12 seconds but varied depending on how long participants took to choose their next card following the instruction to do so.

The experiment concluded following 100 trials on the IGT and when participants' task knowledge had been probed nine times. The length of time that the experiment took differed between participants and was dependent on the speed with which they selected cards and answered the questions. As there were more questions in the specific question group these participants tended to take longer. The experiment took around 1 h and although participants were told the prospective length of the task this information could provide no hint about when it would end.

On completion of the task all electrodes were removed and participants were fully debriefed. Each participant received the amount they had earned on the task plus an additional £2.

#### **PARTICIPANTS**

Thirty-two predominantly post-graduate students were recruited from the University of Nottingham community via posters, online advertisements, and direct email to members of a participant pool. The volunteers were told that they would be participating in a cognitive task and have the opportunity to earn up to £12. They were told that some physiological measures would be recorded and that the experiment took approximately 1 h. Sixteen participants were randomly assigned to each question group [the General questions of Bechara et al. (1997); or the Specific questions of Maia and McClelland (2004)]. The mean age was 25.68 (σ<sub>M</sub> = 1.22) in the Specific question group and 24.63 (σ<sub>M</sub> = 0.92) in the General question group. There were nine male participants in the Specific and seven in the General question group.

#### **APPARATUS—BEHAVIORAL TASK**

A computerized version of the IGT with the hint instructions and real money incentives was used (Fernie and Tunney, 2006). Breaks in the behavioral task occurred after the first twenty trials and from then on after each ten-trial block so that participants' knowledge could be probed using the condition-specific questions. More detail on these is provided below. The addition of questionnaires and skin conductance recording resulted in the task taking around 1 h to complete. As this experiment took on average four times longer than the purely behavioral studies reported in Fernie and Tunney (2006), the value of the payoffs was increased to four times the amount. Therefore, wins increased from 10p to 40p in decks A and B, and from 5p to 20p in decks C and D. All values for losses increased similarly.

#### **APPARATUS—KNOWLEDGE PROBES**

The administration and structure of the questionnaires followed the procedure of Maia and McClelland (2004). Briefly, the task was interrupted after twenty trials and thereafter after every ten trials when instructions on the computer screen informed participants that they would be asked some questions about the task. In the Specific Question group participants were given the detailed questionnaire as used in Maia and McClelland (2004). The questionnaire was computer-based and required selection of options using the mouse or entry of answers using the numerical keypad. Three measures of knowledge were obtained for each deck at each question period: a deck rating from −10 to 10 (Deck Rating), an estimate of the average net amount won or lost on the deck (Estimated Net) and a calculated net amount based on participants' estimates of how much they would win, how often they lost, and how much that average loss was (Calculated Net). The participants were also asked which deck they would choose if they only had one choice (One Deck).
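The Calculated Net measure described above can be illustrated with a minimal sketch. It assumes that the loss frequency is expressed as a proportion of cards and that the three estimates combine as average win minus expected loss per card; the function name and the example values are hypothetical and are not taken from the questionnaire itself.

```python
def calculated_net(avg_win, loss_frequency, avg_loss):
    """Net amount per card implied by a participant's three estimates.

    avg_win        -- estimated amount won on each card (pence)
    loss_frequency -- estimated proportion of cards carrying a loss (0 to 1);
                      treating this as a proportion is an assumption
    avg_loss       -- estimated average size of a loss when one occurs (pence)
    """
    if not 0.0 <= loss_frequency <= 1.0:
        raise ValueError("loss_frequency must be a proportion between 0 and 1")
    return avg_win - loss_frequency * avg_loss


# Hypothetical estimates: wins of 40p, a loss on half the cards, 100p per loss.
example = calculated_net(avg_win=40, loss_frequency=0.5, avg_loss=100)
print(example)  # a negative value marks the deck as disadvantageous
```

A negative Calculated Net indicates that the participant's own estimates imply a losing deck, even if the participant does not rate the deck negatively.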

In the General Question group participants were presented with the two questions used by Bechara et al. (1997) on subsequent screens: "Tell me all that you know about what is going on in this game" and "Tell me how you feel about this game." Participants' responses were recorded using a tape recorder operated by the experimenter who sat behind a large dividing screen in the same room as the participant. The questions were presented on-screen to minimize any potential experimenter influence and to equate the two question conditions. Interaction with the experimenter was kept to a minimum and was initially restricted to prompting participants to answer the question before them. However, some participants' answers were so minimal that some additional prompting was occasionally required. In the main this took the form of directing participants' answers to their knowledge of the decks.

The presentation and cessation of the questions in both conditions was accompanied by a computer beep to mark the beginning and end of the question period on the skin conductance record, and to inform the experimenter when to start and end the tape recorder in the General question condition.

#### **APPARATUS—ANALYSIS OF GENERAL GROUP TRANSCRIPTS**

Verbal responses to questions were transcribed from the tape recording. Three post-graduate students, naïve to the experimental hypothesis, were recruited and paid to assess the transcripts and classify the knowledge displayed at each question period using Maia and McClelland's (2004) decision tree. The assessors first undertook training on the decision tree using sample answers created to cover all possible outcomes from the tree. One hundred percent accuracy was required before the actual transcripts were assigned. When the sample transcripts were not correctly rated the assessor was told and asked to try again. Most raters accurately rated each transcript on their first attempt. Rarely were three attempts required, but following correct answers the assessor had to convince the experimenter (GF) of why they had reached the assessment they had.

Once the actual transcripts had been assessed the assessors met to compare results. Any disagreements on any of the participants' answers were debated until a unanimous decision among the assessors was reached. If this was not possible a majority decision for that answer was used. These assessments of participants' answers were used to determine when knowledge was displayed in the General Question group.

#### **CLASSIFICATION OF KNOWLEDGE**

Maia and McClelland's (2004) attempt to replicate Bechara et al.'s (1997) study was hampered by the lack of detail about how Bechara et al. assessed knowledge and categorized it into two (hunch and conceptual) of their four knowledge periods. Maia and McClelland (2004) developed a detailed solution to resolve this that resulted in a decision tree to categorize each participant's knowledge at each question period into one of the six knowledge categories possible on the IGT. These are: no professed knowledge, incorrect or incomplete hunch/knowledge, partial hunch, hunch, partial conceptual, and conceptual. Even with this decision tree there were still several ways knowledge could be assessed in order to integrate it into Bechara et al.'s knowledge periods. This integration is effectively along two axes. The first concerns whether knowledge expressed about only one of the good decks is included as conceptual knowledge (partial conceptual). In a strict interpretation of Bechara et al.'s criteria partial conceptual knowledge would not count as conceptual knowledge because it is not full understanding of both good decks; Maia and McClelland (2004) called this grouping "both." In the "partial" grouping partial conceptual knowledge is included in the conceptual period.

The second axis in integration of the two knowledge assessment systems concerns when participants first show any level of knowledge. A conservative approach would only count knowledge expressed consistently throughout all question periods from the one where it was first expressed through each subsequent questioning i.e., if upon reaching one level of knowledge the participant never returned to a lower state of knowledge. An aggressive interpretation would allow an earlier expression of knowledge to be counted even if later questioning revealed that this level of knowledge was no longer being expressed at a later question period. Maia and McClelland's aggressive, "partial" grouping best fit Bechara et al.'s (1997) results. However, Maia and McClelland focused on the "both" grouping as it more reflected Bechara et al.'s (1997) classification of conceptual knowledge.

These terms are detailed here because they give different answers to the question of when knowledge emerges: an aggressive approach using the partial grouping is likely to provide an earlier point than a conservative approach using the both grouping. Each participant's knowledge at each measurement was independently assessed but mean results within groups will be compared with the results of Bechara et al.'s (1997) and Maia and McClelland's (2004) criteria and the closest matching group averages used.
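The difference between the conservative and aggressive approaches can be sketched in code. This is an illustrative reading of the definitions above, not the assessors' actual procedure; the function names are hypothetical, and the mapping from question period to trial number follows the probe schedule described in the Apparatus sections (first probe after trial 20, then every ten trials).

```python
def knowledge_onset(has_knowledge, approach="conservative"):
    """Return the index of the question period at which knowledge is counted
    as emerging, or None if it never is.

    has_knowledge -- list of booleans, one per question period in order,
                     True when the criterion level of knowledge was expressed
    approach      -- "aggressive": the first period showing knowledge counts;
                     "conservative": knowledge must be expressed at that
                     period and at every later period
    """
    if approach == "aggressive":
        for period, shown in enumerate(has_knowledge):
            if shown:
                return period
        return None
    if approach == "conservative":
        onset = None
        for period, shown in enumerate(has_knowledge):
            if shown and onset is None:
                onset = period      # candidate start of a consistent run
            elif not shown:
                onset = None        # a lapse cancels the earlier candidate
        return onset
    raise ValueError("approach must be 'aggressive' or 'conservative'")


def period_to_trial(period):
    """Trial after which a given question period (0-based) was presented."""
    return 20 + 10 * period


# Hypothetical participant: knowledge shown at period 1, lost at period 2,
# then shown consistently from period 3 onward.
responses = [False, True, False, True, True, True, True, True, True]
print(period_to_trial(knowledge_onset(responses, "aggressive")))
print(period_to_trial(knowledge_onset(responses, "conservative")))
```

For this hypothetical participant the aggressive approach dates knowledge to the second probe, whereas the conservative approach dates it two probes later, which is why the two approaches can yield different mean onset trials for the same data.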

#### **APPARATUS—ELECTRODERMAL ACTIVITY**

A BIOPAC Systems MP30 system running on a Macintosh computer was used to record electrodermal activity. Skin conductance was recorded at 10 Hz using two Ag/AgCl electrodes connected to the volar surfaces of the medial phalanges on participants' index and middle fingers of the left hand (all participants were right handed). Because the MP30 system does not have the facility for a direct link between the recording computer and the task presentation computer, marking the occurrence of events was achieved by recording the sounds produced on the task presentation computer during the task. These sounds were recorded by the MP30 via an analog input. During the IGT gains and losses were accompanied by concurrent auditory stimuli. These served as markers for events in this experiment. Additionally, the experimenter marked the skin conductance record when an event occurred. As this measure is less reliable and not as temporally accurate it was only referred to if any ambiguity about when an event occurred existed in the auditory record.

#### **SKIN CONDUCTANCE ACTIVITY ANALYSIS**

Skin conductance responses were analyzed using the Student Lab Pro software for the MP30 system. The first step in the analysis was the removal of the downward drift in the SCR record. A mathematical transformation provided by the Student Lab software was used to remove it prior to analysis. This "difference" transformation measures the difference in amplitude between two data samples separated by a particular number of points (in this case it was 10). The difference is then divided by the time interval between the two samples.
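The "difference" transformation can be sketched as follows. This is a minimal illustration of the description above, not the Student Lab software's implementation; in particular, how the software aligns the output with the original samples is not stated, so the trailing alignment used here is an assumption, as are the function and parameter names.

```python
def difference_transform(samples, n_points=10, sample_rate_hz=10.0):
    """Difference between samples n_points apart, divided by the time
    separating them. With 10 points at 10 Hz that interval is 1 second.

    Returns a list that is n_points shorter than the input, because the
    first n_points samples have no earlier partner (alignment is assumed).
    """
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    interval_s = n_points / sample_rate_hz
    return [
        (samples[i] - samples[i - n_points]) / interval_s
        for i in range(n_points, len(samples))
    ]


# A purely linear drift (hypothetical values, in microsiemens) becomes a
# constant after the transformation, so it no longer dominates the trace.
drifting = [5.0 - 0.01 * i for i in range(30)]
flattened = difference_transform(drifting)
print(len(flattened), round(flattened[0], 6), round(flattened[-1], 6))
```

Because a slow linear drift maps to a constant, phasic responses superimposed on the drift stand out as departures from that constant.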

The SCRs were analyzed using the area-under-the-curve measurement. This measurement calculates the total area between a waveform and a baseline value within the endpoints of a selected area. In effect a line is drawn between the user defined start and end points of the waveform. For anticipatory SCRs this was the 5 seconds prior to deck choice as determined by the auditory signal's mark on the analog channel. For post-selection SCRs the start point was 1 second after this marker and the end point was again 5 seconds later. These area-under-the-curve measurements were then divided by the time interval to give a value in amplitude units per second (μS/s).
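The windowed area measurement can be sketched as below. The sketch assumes that the baseline is the straight line joining the first and last samples of the window, that the area is taken as an absolute (unsigned) quantity, and that trapezoidal integration is adequate; none of these details is confirmed for the software used, and all function names are hypothetical.

```python
def mean_area(samples, start_idx, end_idx, sample_rate_hz=10.0):
    """Area between the waveform and a straight line joining the window's
    endpoints, divided by the window duration in seconds."""
    window = samples[start_idx:end_idx + 1]
    n = len(window)
    if n < 2:
        raise ValueError("window must contain at least two samples")
    dt = 1.0 / sample_rate_hz
    first, last = window[0], window[-1]
    # Unsigned deviation of each sample from the endpoint-to-endpoint line.
    deviations = [
        abs(window[i] - (first + (last - first) * i / (n - 1)))
        for i in range(n)
    ]
    # Trapezoidal rule over the deviations.
    area = sum((deviations[i] + deviations[i + 1]) / 2.0 * dt for i in range(n - 1))
    return area / ((n - 1) * dt)


def anticipatory_window(marker_idx, sample_rate_hz=10.0):
    """Sample indices covering the 5 seconds before the selection marker."""
    return marker_idx - int(round(5 * sample_rate_hz)), marker_idx


def post_selection_window(marker_idx, sample_rate_hz=10.0):
    """Sample indices from 1 second after the marker to 5 seconds later."""
    start = marker_idx + int(round(1 * sample_rate_hz))
    return start, start + int(round(5 * sample_rate_hz))


# Hypothetical trace sampled at 1 Hz: a symmetric rise and fall.
print(mean_area([0.0, 1.0, 2.0, 1.0, 0.0], 0, 4, sample_rate_hz=1.0))
```

At the 10 Hz sampling rate reported above, each 5-second window spans 50 sample intervals, so a marker at sample 100 gives an anticipatory window of samples 50 to 100 and a post-selection window of samples 110 to 160.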

## **RESULTS**

#### **BEHAVIORAL DATA**

The principal behavioral measure of interest is the mean net score, which was calculated for each ten-trial block by subtracting the number of cards selected from the disadvantageous decks A and B from the number selected from the advantageous decks C and D. Positive scores indicate a preference for advantageous decks and an increase in the mean net score across blocks indicates that participants learn to choose from the advantageous decks during the course of the experiment.
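As a minimal sketch of this measure, the net score for each block can be computed as the count of selections from the advantageous decks (C and D) minus the count from the disadvantageous decks (A and B). The function name and the example selection sequence are hypothetical.

```python
ADVANTAGEOUS = {"C", "D"}
DISADVANTAGEOUS = {"A", "B"}


def net_scores_by_block(selections, block_size=10):
    """Net score per block: (C + D selections) minus (A + B selections).

    selections -- sequence of deck labels ("A" to "D") in trial order
    """
    scores = []
    for start in range(0, len(selections), block_size):
        block = selections[start:start + block_size]
        good = sum(1 for deck in block if deck in ADVANTAGEOUS)
        bad = sum(1 for deck in block if deck in DISADVANTAGEOUS)
        scores.append(good - bad)
    return scores


# Hypothetical participant: only A and B in the first block, mostly C and D
# in the second.
choices = list("ABABABABAB") + list("CCDDCCDDAB")
print(net_scores_by_block(choices))  # [-10, 6]
```

Summing the block scores over all ten blocks gives the overall net score for a participant, which can range from -100 to 100 across 100 trials.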

Mean net score for the General Question group was 20.44 (*SD* = 22.06). A one-sample *t*-test found that this was significantly greater than zero, *t*(15) = 3.93, *SD* = 22.06, *p* < 0.01, indicating that participants in this condition showed an overall preference for the advantageous decks. The same was true of participants in the Specific Question group. Their mean net score was 28.56 (*SD* = 29.04) and this was significantly greater than zero, *t*(15) = 3.71, *SD* = 29.04, *p* < 0.01.

Mean net score was calculated for each block of ten trials and compared between Question Group and across Block. **Figure 1** displays this comparison. A mixed-design ANOVA revealed no main effect of Question Group, *F*(1, 29) < 1. There was a main effect of Block, *F*(4.31, 124.89) = 15.43, *MSE* = 44.26, *p* < 0.01, that reflects the increase in mean net score with more trials, but no interaction, *F*(4.31, 124.89) = 1.53, *MSE* = 29.0, *p* > 0.05, indicating that learning proceeded at a similar pace in both question groups. This replicates Maia and McClelland's (2004) result demonstrating that the nature of the questions participants received did not differentially affect their behavior.

#### **KNOWLEDGE OF THE TASK: GENERAL QUESTION GROUP**

The independent ratings suggested that at least half the participants reached what Bechara et al. described as the Conceptual Period, but this depended on the method of classifying conceptual knowledge (**Table 1**). As in Maia and McClelland (2004), the aggressive approach provided the best fit to Bechara et al.'s (1997) data and the discussion that follows will refer to this approach only. However, unlike Maia and McClelland, we found that the "partial" rather than the "both" grouping of conceptual knowledge best matched Bechara et al.'s data. When knowledge was classified aggressively, all but one participant displayed Hunch (or in Maia and McClelland's terms level-1) knowledge and this occurred on average after 43 trials (Bechara et al.: all participants by trial 50; Maia and McClelland: 88% of participants by trial 43).

Classification of conceptual knowledge using the "partial" grouping fit Bechara et al.'s data better than using the "both" grouping. In this case only around 30% of participants (vs. 62.5% using the conservative approach) failed to exhibit conceptual (or level-2) knowledge. Bechara et al.'s figure was also 30% and there conceptual knowledge was achieved on average by trial 80. Using either grouping method and an aggressive approach, conceptual knowledge was achieved substantially earlier on average in our study (by 53 or 55 trials for the "partial" and "both" groupings, respectively). Maia and McClelland (2004) also found that the "partial" grouping resulted in the majority of participants (∼75%) being classified as having conceptual knowledge and on average this occurred by trial 62. However, they used the "both" grouping when comparing their results to Bechara et al.'s. With the current data, the "both" grouping would decrease the proportion of participants with conceptual knowledge to 50%.

**Table 1 | Summary of participants' knowledge expression in Bechara et al. (1997); Maia and McClelland's (2004) replication condition and the General question condition of this study.**


*Figures in parentheses are the range of observations for Bechara et al. (1997) and the standard error of the mean otherwise. The aggressive/conservative axis determines when knowledge exists. The conservative approach requires consistent knowledge expression from the first measurement through subsequent measurements. The aggressive approach does not. The both grouping requires knowledge expression that both good decks are good. The partial grouping only requires that one good deck is identified.*

#### **KNOWLEDGE OF THE TASK: SPECIFIC QUESTION GROUP**

**Figure 2** shows the change in ratings for each deck across block. The ratings are mostly negative for all decks. It is clear that most participants do not believe any of the decks are good. However, it is equally clear that decks C and D are accurately perceived as being better than decks A and B. Although this indicates that participants have not fully understood the patterns of gains and losses of the decks, and thus of the task, such knowledge would be sufficient to guide behavior advantageously. This knowledge is present in most participants at the second question period. Participants also correctly rated deck A as one of the disadvantageous decks from the first opportunity they were given.

**Figure 3** shows the number of times each deck was identified as the one deck participants would choose if they could only choose one for the remainder of the task. Aside from the first question period, when deck B is often advantageous, most participants would choose deck C or deck D. Indeed the number of participants who would choose deck C increases with experience of the task, mirroring the behavioral data in previous results (Fernie and Tunney, 2006).

Participants' quantitative knowledge of the task as assessed using the Estimated Net and Calculated Net measures was not good. The Estimated Net was an estimate of the average amount won or lost on the deck while the Calculated Net was calculated from participants' estimates of how much they would win, how often they lost, and how much that average loss was when selecting from each deck. **Figure 4** displays the Calculated Net measure for each deck from every participant in the final question period. The dashed line shows that the mean received value for each deck is close to its pre-test expected value (decks A and B are negative; decks C and D are positive). Pearson correlations were calculated between the actual received values and each participant's Calculated Net measure from the final question period. Calculated Net measures correlate with the values actually received only for deck A (*r* = 0.92, *p* < 0.01) and not for decks B, C, or D (*r* = 0.46, 0.43, and 0.34, respectively, all *p* > 0.05). Actual received values do not correlate with the Estimated Net measure on any deck (*r* = −0.20, 0.13, 0.19, 0.05 for decks A, B, C, and D, respectively) as illustrated in **Figure 5**. Together these results suggest that most participants' quantitative knowledge of the deck contingencies is not accurate. Indeed for many participants the Estimated or Calculated Nets are positive for decks A and B, and negative for decks C and D. This may indicate that participants are unable to retain quantitative knowledge about the decks or that they did not comprehend what was required in the answer for the measures themselves.

**selected each deck as the One Deck they would choose if forced to only pick from one.**

**Table 2** displays a breakdown of when and what proportion of participants displayed knowledge of the task contingencies when actual received values are used. The One Deck and Deck Ratings questions were used to assess hunch, or level-1 knowledge, while the Estimated and Calculated Net questions were used to assess conceptual or level-2 knowledge. As conceptual knowledge of the task was so poor (**Figures 4**, **5**) and the focus of this paper is when knowledge sufficient to guide behavior emerges, only the breakdown of Deck Ratings and One Deck measures will be discussed here.

**trials for each participant (except for participant 3 for whom figures are following 80 trials).** The calculated expected value was calculated from a

**Table 2 | Knowledge assessment for Specific question group using "partial" grouping (either deck with the highest net value at the time of questioning received the best score on each measure) or "both" grouping (both decks with the highest net value at the time of questioning received the best scores on each measure).**


*Average trial values are rounded to the nearest trial. Values in parentheses are the standard error of the mean. When actual received values are used with a partial grouping and aggressive approach all participants reach the hunch level by around trial 20 and conceptual level by trial 26.*

An aggressive approach using a "partial" grouping was used in the General Question group. Applied to the Specific Question group, this strategy suggests that all participants have hunch level knowledge by trial 22 using the Deck Ratings or by trial 21 using One Deck. Results more similar to those of the General group are obtained by using a conservative approach and a "partial" grouping: 80% of participants have hunch level knowledge by trial 39 using the Deck Ratings or by trial 47 (93.75% of participants) using One Deck. In the analyses that follow where differences pre- and post-knowledge are considered we will use this latter strategy and the figures obtained from the Deck Ratings measure because Deck Ratings required more information from participants. Although the strategies used to determine when knowledge was present are different in each group, we believe this is appropriate because participants showed no differences in behavior and so it can be assumed that their experience of the task was similar. We can further assume that their pre-task knowledge was similar and, as their behavior did not differ, that their knowledge remained similar throughout the task (though see Persaud et al., 2007). All that differed between the groups then was the specificity of the knowledge probe. If this is the case then an aggressive approach is appropriate for the General group because their knowledge was not probed as effectively as that of the Specific group participants. Ideally, a conservative partial approach would have been used throughout but this would not have been sensitive enough in the General condition to indicate when knowledge sufficient to guide behavior appeared. The use of these two approaches results in figures for knowledge emergence that are consistent between groups and with the previous literature using the General questions. They are also consistent with the behavior shown in **Figure 1**. Mean net score first moves above chance in both groups in block 4, the block during which the above measures suggest participants can determine C and D to be the best decks.

Further support is provided by an analysis of the proportion of selections from each deck in the pre- and post-knowledge periods across all participants who were categorized as having displayed knowledge (displayed in **Figure 6A**). The proportion of selections from decks A and B declines from the pre- to post-knowledge period, whereas the proportion increases for decks C and D. This supports the supposition that participants' choices are guided by knowledge of the decks. A 4 × 2 (Deck by Time) repeated measures ANOVA examined these data. A significant interaction between Deck and Time was revealed, *F*(2.28, 59.35) = 17.41, *MSE* = 0.03, *p* < 0.01; as was a main effect of Deck, *F*(3, 78) = 7.48, *MSE* = 0.03, *p* < 0.01. There was no effect of Time, *F*(1, 26) < 1. A complex interaction comparison examined the interaction between Deck Type and Time by collapsing data across advantageous and disadvantageous decks in each knowledge period. This 2 × 2 repeated measures ANOVA found a significant interaction between Deck Type and Time, *F*(1, 26) = 35.60, *MSE* = 0.03, *p* < 0.001; a main effect of Deck Type, *F*(1, 26) = 15.38, *MSE* = 0.03, *p* < 0.001; but no main effect of Time, *F*(1, 26) = 2.09, *MSE* < 0.01, *p* > 0.05. Subsequent simple comparisons found that the proportion of advantageous choices in the pre-knowledge period was not significantly greater than the proportion of disadvantageous choices, *F*(1, 26) = 2.41, *MSE* = 0.03, *p* > 0.05; whereas it was in the post-knowledge period, *F*(1, 26) = 31.84, *MSE* < 0.01, *p* < 0.001. **Figure 6A** shows that, consistent with previous experiments, this difference appears to be due to changes in selections from decks B and C. In the post-knowledge period the proportion of selections from deck B has decreased below chance and the proportion of selections from deck C has increased above chance. Similar patterns are found in decks A and D, but the major changes lie in decks B and C.
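The deck-proportion measure described above can be illustrated with a minimal sketch. The function name `deck_proportions`, the treatment of the knowledge trial itself as pre-knowledge, and the example choice sequence are all assumptions made for illustration; they are not taken from the authors' analysis scripts.

```python
from collections import Counter

DECKS = ("A", "B", "C", "D")
CHANCE = 1 / len(DECKS)  # 0.25 when all four decks are equally likely


def deck_proportions(choices, knowledge_trial):
    """Return (pre, post) dicts of per-deck selection proportions.

    `choices` lists deck labels in trial order. Trials 1..knowledge_trial
    are treated as pre-knowledge (an assumption of this sketch); the
    remaining trials are post-knowledge.
    """
    pre = choices[:knowledge_trial]
    post = choices[knowledge_trial:]

    def proportions(segment):
        counts = Counter(segment)
        total = len(segment)
        # NaN marks a period in which no trials were available.
        return {d: (counts[d] / total if total else float("nan")) for d in DECKS}

    return proportions(pre), proportions(post)


# Hypothetical participant who displays knowledge at trial 8.
example_choices = list("ABABCDCD") + list("CCDDCDCB")
pre, post = deck_proportions(example_choices, knowledge_trial=8)
```

In this made-up sequence every deck sits at chance before the knowledge trial, while afterwards deck C rises above chance and deck B falls below it, which is the qualitative pattern the text describes for the knowledge group.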

A similar pattern is shown in **Figure 6B** for the participants who displayed no knowledge. The early period shown in the Figure represents the proportion of choices from each deck up until the mean trial at which participants in the knowledge group displayed knowledge. The late period is the period from this mean trial until the end of the task. While behavior in this group looks similar to the knowledge group, there are several differences. The proportion of selections from each deck is much closer to chance in both time periods. In the late period, unlike the participants with knowledge, selections from deck B are not below chance nor are selections from deck C above chance. These observations were tested in a 4 × 2 (Deck by Time) repeated measures ANOVA. It found no interaction, *F*(1, 26) = 2.44, *MSE* = 0.01, *p* > 0.05; no main effect of Deck, *F*(1, 26) = 1.29, *MSE* < 0.01, *p* > 0.05; and no main effect of Time, *F*(1, 26) < 1. These results suggest that only with knowledge sufficient to guide behavior do participants select advantageously on the IGT, replicating Maia and McClelland (2004) but contradicting Bechara et al. (1997). The next section will examine whether differences in physiological responses exist prior to knowledge being displayed and so leave an opportunity for an explanation of IGT behavior incorporating somatic markers.

#### **PHYSIOLOGICAL MEASURES—aSCR**

Anticipatory SCRs were the mean area under the curve of the SCR in the 5 seconds prior to selecting a card. Mean aSCRs for each deck were obtained by taking the average aSCR for that deck for each participant and averaging across participants. These mean aSCRs are displayed by Group in **Figure 7A**. **Figure 7A** shows that mean aSCRs are generally very low and that they are similar in each Group. To determine if any differences existed, a 2 × 4 (Group by Deck) mixed-factor ANOVA was run. Although mean aSCR was higher in the Specific Question Group than in the General Question Group, no main effect of Group was found, *F*(1, 30) < 1. There was also no main effect of Deck, *F*(1, 30) < 1. Despite the higher mean aSCR for deck B in the Specific Question Group, there was no interaction between Question Group and Deck, *F*(3, 90) = 2.02, *MSE* < 0.01, *p* = 0.12. As in the behavioral analysis, no differences in aSCR were found between groups, nor were any differences observed between decks. The first result supports the conclusion that the different questioning did not differentially affect participants, whereas the second contrasts with the data reported by Bechara et al. (1997).
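As a rough illustration of the aSCR definition, the sketch below integrates a sampled skin conductance trace over the 5 seconds before a selection using the trapezoidal rule. The sampling rate, the absence of any baseline correction, and the function names are assumptions of this sketch; the source does not specify how the area was computed.

```python
def area_under_curve(samples, sample_rate_hz):
    """Trapezoidal area under evenly spaced samples (signal units x seconds)."""
    dt = 1.0 / sample_rate_hz
    return sum((a + b) / 2.0 * dt for a, b in zip(samples, samples[1:]))


def anticipatory_scr(signal, sample_rate_hz, selection_time_s, window_s=5.0):
    """Area under `signal` in the `window_s` seconds before a card selection.

    `signal` is a list of skin conductance samples starting at time 0.
    No baseline subtraction is applied (an assumption of this sketch).
    """
    end = int(round(selection_time_s * sample_rate_hz))
    start = max(0, end - int(round(window_s * sample_rate_hz)))
    # Include the sample at the selection time so the window spans window_s.
    return area_under_curve(signal[start:end + 1], sample_rate_hz)


# Hypothetical flat trace of 0.2 units sampled at 10 Hz for 20 s.
flat_trace = [0.2] * 201
ascr = anticipatory_scr(flat_trace, sample_rate_hz=10, selection_time_s=10.0)
```

For the flat trace the area is simply the amplitude multiplied by the window length, which makes the example easy to check by hand. The same helper could be pointed at the 5 seconds after a selection to obtain a post-selection measure.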

In the previous section it was determined that most participants in each group display at least hunch level knowledge of the task between trials 40 and 50. In order to determine whether aSCR differences existed between decks prior to this period, average aSCRs before and after each participant's expression of knowledge were calculated for each deck for those participants who displayed knowledge (80% in the Specific group, 93.75% in the General group). As there were no differences in aSCR between groups in the previous analysis, this factor was not included in the subsequent analyses. Some participants did not select cards from some of the decks in the period following their expression of knowledge. As a result there were no SCRs on some decks for seven participants who either chose only one deck in the period after they displayed knowledge (deck C in one participant in the Specific question group), or no longer chose from either deck A or deck B (two participants in both groups), or did not select from deck B (two participants in the Specific question group and one in the General question group). In the analyses that follow, missing values were imputed using the automatic multiple imputation method in SPSS 20.0 and the results pooled across five imputations. The resulting 4 × 2 (Deck by Time) repeated measures ANOVA found no significant effects: Deck by Time, *F*(1.54, 40.08) = 2.0, *MSE* < 0.01, *p* > 0.05; Deck, *F*(1.74, 45.13) = 1.50, *MSE* < 0.01; Time, *F*(1, 26) < 1. The same outcome was found when participants with missing data were excluded.

**in the Specific Group and trial 43 in the General group).** Error bars are the standard error of the mean.

As automatic SCR recording was employed, it is possible that interference from SCRs following rewards or punishments affected subsequent aSCRs. If so, then larger aSCRs would be expected following a loss than following a gain. But an examination of aSCRs in each deck following a gain and a loss revealed no such difference. These data were calculated for each participant and entered into a 4 × 2 (Deck by Reinforcer Type) repeated measures ANOVA. No main effect of Reinforcer Type was found, *F*(1, 27) < 1; nor was there a main effect of Deck, *F*(1.98, 53.33) < 1; nor an interaction, *F*(1.74, 46.88)<sup>1</sup> < 1. This suggests that automatic gathering of SCRs did not impact on the clarity of the physiological record.

The main purpose of this experiment was to determine if any physiological responses distinguish between decks prior to participants' expression of knowledge; that is, SCR changes in the pre-hunch period of Bechara et al. (1997). No significant differences in aSCR were found between decks before participants had knowledge of the task contingencies. This does not replicate Bechara et al.'s result, although, like their data, the mean values found in the present study within this period, displayed in **Figure 7B**, suggested that a difference between decks A and B and decks C and D may exist, even though there was no significant interaction. Therefore, no evidence was found to support the hypothesis that differences in aSCRs precede knowledge expression in participants who express hunch level knowledge. **Figure 7C** shows that in participants who did not display any knowledge mean aSCRs across the same time periods were at a similar level.

#### **PHYSIOLOGICAL MEASURES—POST-SELECTION SCRs**

Post-selection SCRs were the mean area under the curve of the SCR in the 5 seconds after a card was selected. These SCRs were split into those following a reward with no punishment (reward SCRs or rSCRs) and those following trials on which punishment occurred (punishment SCRs or pSCRs). Mean rSCR and pSCRs for each deck were calculated for each individual. The mean of these values provided the mean post-selection SCRs displayed by Group in **Figures 8A**, **9A** for reward and punishment SCRs, respectively.
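The split into reward and punishment SCRs can be sketched for a single participant as follows. The trial tuple layout `(deck, loss, scr)` and the function name are hypothetical; the only rule taken from the text is that a trial with any punishment contributes to pSCRs and a reward-only trial contributes to rSCRs.

```python
from collections import defaultdict
from statistics import mean


def mean_post_selection_scrs(trials):
    """Return {(deck, kind): mean SCR} with kind in {"rSCR", "pSCR"}.

    `trials` is an iterable of (deck, loss, scr) tuples, where `loss` is the
    punishment amount on that trial (0 for reward-only trials). Deck/kind
    combinations that never occurred are simply absent from the result.
    """
    buckets = defaultdict(list)
    for deck, loss, scr in trials:
        kind = "pSCR" if loss > 0 else "rSCR"
        buckets[(deck, kind)].append(scr)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical trials for one participant.
example_trials = [
    ("A", 0, 0.25),    # reward only
    ("A", 150, 0.50),  # reward plus punishment
    ("A", 0, 0.75),    # reward only
    ("B", 0, 0.40),    # reward only; deck B never punished in this sample
]
scr_means = mean_post_selection_scrs(example_trials)
```

Leaving unobserved combinations out of the result mirrors the missing-data situation described later in the section, where some participants received no punishment on some decks.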

**FIGURE 9 | Mean pSCRs for each deck in each group. (A)** Across all selections. **(B)** Mean pSCRs for the advantageous and disadvantageous decks in selections pre- and post-knowledge expression in those participants who displayed knowledge. **(C)** The equivalent graph to b for the participants who did not demonstrate knowledge (*n* = 5)—pSCRs in the period before and after the mean trial at which knowledge was expressed in the majority of each group. Error bars are the standard error of the mean.

**Figure 8A** shows that mean rSCRs are similar in each Group but that there is a trend for rSCRs to be higher in decks A and B. A 2 × 4 (Group by Deck) mixed-factor ANOVA was run to examine rSCRs across all selections. There was no interaction, *F*(1, 30) < 1; no main effect of Group, *F*(1, 30) < 1; but a main effect of Deck was found, *F*(1, 30) = 5.97, *MSE* < 0.01, *p* < 0.01. A planned complex main comparison was performed to investigate whether rSCRs differentiated between the advantageous and disadvantageous decks. It found that rSCRs were higher for the disadvantageous decks, *F*(1, 30) = 10.12, *MSE* < 0.01, *p* < 0.01. These results are consistent with previous research (e.g., Tomb et al., 2002) in showing that choices that result in larger rewards also result in larger SCRs.

To investigate whether rSCRs distinguished between selections prior to or following the display of knowledge a 4 × 2 (Deck by Time) repeated-measures design ANOVA was conducted. As no group differences were discovered in the initial analysis Group was removed as a factor in subsequent analyses. Missing values were imputed as in the aSCR analysis. The same results were found when participants with missing data were excluded.

An interaction between Deck and Time was found, *F*(2.19, 56.97)<sup>1</sup> = 3.99, *MSE* < 0.01, *p* < 0.05. As with the overall analysis a main effect of Deck was found, *F*(2.13, 55.46) = 3.77, *MSE* < 0.01, *p* < 0.05, but there was no effect of Time, *F*(1, 26) < 1.0, *p* > 0.05. **Figure 8B** displays the mean rSCRs pre- and post-knowledge in each deck. The interaction between Deck and Time appears to arise because rSCRs in the post-knowledge period are lower for the advantageous decks than for the disadvantageous decks. In order to examine this further, the data were collapsed across Deck to provide values for the advantageous and disadvantageous decks in each time period and an interaction contrast was performed. This is effectively a 2 × 2 (Deck Type by Time) repeated-measures ANOVA, and revealed a significant interaction between Deck Type and Time, *F*(1, 26) = 9.01, *MSE* < 0.01, *p* < 0.01; a main effect of Deck Type, *F*(1, 26) = 11.96, *MSE* < 0.01, *p* < 0.01; but no effect of Time, *F*(1, 26) < 1. Subsequent simple comparisons found a difference between Deck Types in the post-knowledge period, *F*(1, 26) = 14.29, *MSE* < 0.01, *p* < 0.01, and not in the pre-knowledge period, *F*(1, 26) < 1. In the selections after knowledge is displayed, participants' physiological reactions following reward distinguish between the good and bad decks.
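One way to see why the interaction contrast is "effectively" a 2 × 2 repeated-measures ANOVA: in a fully within-participant 2 × 2 design, the interaction *F* equals the squared one-sample *t* statistic computed on each participant's difference of differences, with degrees of freedom (1, *n* − 1). The sketch below uses that identity; the data layout and function name are assumptions for illustration, and this is not a reproduction of the authors' SPSS analysis.

```python
from math import sqrt
from statistics import mean, stdev


def interaction_contrast_f(cells):
    """Return (F, (df1, df2)) for the Deck Type x Time interaction.

    `cells` holds one tuple per participant:
    (adv_pre, adv_post, disadv_pre, disadv_post), each a mean SCR
    collapsed across the two decks of that type.
    """
    # Difference of differences: the change over time in the
    # disadvantageous-minus-advantageous gap, per participant.
    contrasts = [
        (disadv_post - adv_post) - (disadv_pre - adv_pre)
        for adv_pre, adv_post, disadv_pre, disadv_post in cells
    ]
    n = len(contrasts)
    t = mean(contrasts) / (stdev(contrasts) / sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical data for three participants.
example_cells = [
    (1.0, 1.0, 1.0, 2.0),
    (1.0, 1.0, 1.0, 3.0),
    (1.0, 1.0, 1.0, 4.0),
]
f_value, dfs = interaction_contrast_f(example_cells)
```

With 27 participants who displayed knowledge, this approach yields degrees of freedom of (1, 26), matching those reported for the Deck Type by Time interaction.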

**Figure 8C** presents rSCRs for the participants who did not display knowledge. Here the pre- and post-knowledge periods are based on the mean values from the participants who did display knowledge. The early period includes the trials up to trial 39 and 43 for participants in the Specific and General groups, respectively. The late period includes all the subsequent trials. The mean values depicted in this Figure are much lower than those for participants with knowledge, suggesting that knowledge and physiological activity may be linked. A similar pattern of reduced physiological activity in the post-knowledge period in decks C and D is also found in this group as in the participants with knowledge, but here it is also found for deck B. A 4 × 2 (Deck by Time) repeated-measures ANOVA was also conducted on these data. There was no interaction between Deck and Time, *F*(3, 12) = 1.31, *MSE* < 0.01, *p* > 0.05; no main effect of Deck, *F*(3, 12) = 1.54, *MSE* < 0.01, *p* > 0.05; and no main effect of Time, *F*(1, 4) < 1. This result supports the conclusion from the analysis of the with-knowledge group that knowledge influences physiological activity. However, this conclusion is qualified by the low number of participants included in this analysis.

**Figure 9A** shows pSCRs over all selections and all participants. Mean pSCRs are higher in the decks with low frequency of punishment (B and D). Mean pSCRs are also higher than mean rSCRs. A 4 × 2 (Deck by Group) mixed-factor ANOVA revealed no interaction, *F*(3, 90) < 1, and no main effect of Group, *F*(1, 30) < 1, thus replicating the other SCR data that found no group differences in SCRs. A main effect of Deck was found, *F*(2.12, 63.66)<sup>1</sup> = 4.40, *MSE* < 0.01, *p* < 0.05. Subsequent simple comparisons revealed that pSCRs following selections from deck A were significantly lower than those from deck B, *F*(1, 30) = 6.73, *MSE* < 0.01, *p* < 0.05; as were those following selections from deck C, *F*(1, 30) = 10.02, *MSE* < 0.01, *p* < 0.05; while pSCRs for deck D were also significantly higher than those from deck C, *F*(1, 30) = 5.73, *MSE* < 0.01, *p* < 0.05. There was no difference in pSCRs following selections from decks B and D, *F*(1, 30) = 2.96, *MSE* < 0.01, *p* > 0.05, nor between decks A and D, *F*(1, 30) = 2.96, *MSE* < 0.01, *p* = 0.10, which replicates Crone et al. (2004) and supports their conclusion that it is the frequency of punishment and not the magnitude that is influential for pSCRs.

Due to the infrequent nature of punishment relative to reward in all of the decks (far greater in decks B and D), many participants received no punishment in the post-knowledge period on some decks, either as a result of not choosing them or because no punishment resulted from their choices. As this applied across so many participants, a 4 × 2 (Deck by Time) analysis became impractical, with the addition of missing values reaching unacceptable levels. However, the question of interest was whether physiological activity distinguished between the decks prior to a display of knowledge. As such, pSCRs were averaged within participants in two ways. First, the mean pSCR for the advantageous and disadvantageous decks in the pre- and post-knowledge period were calculated for each participant. **Figure 9B** displays these means for those participants who displayed knowledge. A 2 × 2 (Deck Type by Time) repeated measures ANOVA, equivalent to that performed on the rSCR data, revealed a significant interaction between Deck Type and Time, *F*(1, 26) = 4.44, *MSE* = 0.02, *p* < 0.05; but no main effect of Deck Type, *F*(1, 26) < 1; nor a main effect of Time, *F*(1, 26) = 1.96, *MSE* = 0.02, *p* > 0.05. Subsequent simple comparisons revealed that pSCRs were higher for the disadvantageous decks prior to knowledge being displayed than in the period afterward, *F*(1, 26) = 6.04, *MSE* = 0.01, *p* < 0.05.

Second, the mean pSCRs for the decks with frequent and infrequent punishments were also calculated in each knowledge period. A 2 × 2 (Punishment Frequency × Time) repeated measures ANOVA found no interaction, *F*(1, 26) < 1; no main effect of Punishment Frequency, *F*(1, 26) < 1; and no main effect of Time, *F*(1, 26) = 1.96, *MSE* = 0.02, *p* > 0.05. This result contrasts with Crone et al. (2004), who found higher pSCRs following choices from decks B and D.

Similar analyses were carried out for the participants who showed no knowledge. **Figure 9C** displays the mean values of pSCRs collapsed across the advantageous and disadvantageous decks up to and after the mean trial at which participants with knowledge displayed that knowledge. The 2 × 2 (Deck Type by Time) ANOVA revealed no interaction, *F*(1, 26) = 1.42, *MSE* < 0.01, *p* > 0.05; no main effect of Deck Type, *F*(1, 26) < 1; and no main effect of Time, *F*(1, 26) = 1.11, *MSE* < 0.01, *p* > 0.05. The Punishment Frequency × Time ANOVA also revealed no interaction, *F*(1, 26) = 1.43, *MSE* < 0.01, *p* > 0.05; no main effect of Punishment Frequency, *F*(1, 26) < 1; and no main effect of Time, *F*(1, 26) < 1.

## **SUMMARY**

Overall, we found that participants have knowledge about IGT contingencies sufficient to guide advantageous deck selection before the task's halfway point. We found no evidence of anticipatory autonomic activity that differentiated between deck types prior to this knowledge emerging. Differences in post-selection SCRs between deck types were found. Reward SCRs distinguished between the advantageous and disadvantageous decks across the whole experiment but only in participants who displayed knowledge and then only in later trials following their display of knowledge. Punishment SCRs were found to be larger for the disadvantageous decks in the pre-knowledge period but, again, only for participants who displayed knowledge.

## **DISCUSSION**

We report an experiment in which we examined the claim that differential autonomic activity between deck types precedes the emergence of knowledge sufficient to guide behavior on the IGT. In contrast to previous research (Bechara et al., 1997) we found no evidence of differential pre-selection autonomic activity. These results replicate previous findings that differential aSCR activity is not necessary to succeed on the IGT (Gutbrod et al., 2006). In the absence of differential aSCR activity healthy participants learned to select advantageously on the IGT and developed knowledge of the task contingencies sufficient to guide behavior after approximately 40 trials. Our results suggest that aSCRs are not an unconscious measure of knowledge that predicts the choices people make.

Although we found that aSCRs do not differentiate between deck types prior to knowledge being displayed, a difference between deck types found over all rSCRs was localized within participants who displayed knowledge in the period following that knowledge being displayed. This result provides qualified support for the role of knowledge, rather than autonomic activity, in influencing behavior on the IGT. The absence of any difference in aSCRs is problematic as a null effect can never be evidence for any hypothesis, and the results from the pSCRs suggest physiological responses occur for larger primary punishers but only in the initial period of the task. One possibility is that pSCRs did not distinguish between decks in the post-knowledge period because participants were aware that those decks had the worst losses. Alternatively, the pre-knowledge pSCRs might influence subsequent decisions and constitute the first stage in a process toward somatic markers. This position is supported by the absence of these effects in participants who displayed no knowledge. So the physiological results are ambiguous, showing that differences in post-selection SCRs emerge following knowledge for rewards but prior to knowledge for punishments. It could be argued that the post-knowledge difference in rSCRs indicates relief at escaping from a choice on a disadvantageous deck without a punishment. This would reflect the influence of knowledge. After all, these decks are more risky than the advantageous decks. Differential SCR activity, including aSCRs, may just reflect this awareness of risk.

Both Campbell et al. (2004) and Kleeberg et al. (2004) have reported failures to replicate the aSCR difference between deck types reported by Bechara et al. (1997). We also found that aSCRs did not increase over time, replicating earlier results using a computerized version of the task (Suzuki et al., 2003; Carter and Smith Pasqualini, 2004). A possible explanation for the absence of differences in the aSCRs is the automated way in which they were gathered. The experimenter controlled the length of the inter-trial interval between SCR acquisitions in Bechara et al. (1997). This was to ensure that participants' physiological activity had returned to baseline following the previous choice. We did not employ exactly the same methods as Bechara et al. (1997) and so it is possible that, as the inter-trial interval was fixed to a greater extent in the current experiment, physiological activity following the previous choice interfered with anticipatory physiological activity on the next choice. However, Crone et al. (2004) employed a similarly automatic methodology ensuring that the inter-trial interval was as long as reported by Bechara et al. (1997) and found similar results to theirs. The inter-trial interval in the experiment reported here was as long as the average reported by Bechara et al. (12 seconds). Moreover, we found no differences in aSCRs following rewards or punishments. The results reported here show that the emergence of knowledge occurred at a similar point in the IGT as claimed by Bechara et al. (1997), but found no evidence for their claim that this was preceded by differential somatic activity. This has implications for Damasio's somatic marker hypothesis (SMH, Damasio, 1994, 1996). The SMH integrates emotional processing with rational decision-making, positing a critical input from an embodied emotional system (somatic markers) in making decisions in complex and uncertain situations. As such, the IGT has been used extensively as a test of SMH.
If accepted at face value our results are problematic for the SMH. Participants in this experiment improved on the IGT and displayed knowledge of which decks were worst in the long-run, yet the results suggest aSCRs played no part in this process. It may be that participants in this experiment did not have the same physiological reaction as those in other experiments but if this is the case it suggests that like other, clinical studies (North and O'Carroll, 2001; Heims et al., 2004) the absence of autonomic activity does not preclude learning on the IGT. Additionally, several studies (Hinson et al., 2003; Turnbull et al., 2003; Jameson et al., 2004) have shown that impairments in executive components of working memory detrimentally impact on IGT performance, suggesting that differences in aSCRs are driven by cognitive processes (implying knowledge) rather than vice versa. Alternatively, differential autonomic activity may have occurred in our sample, yet remained undetected because we used the relatively crude SCR measure. That we did not employ other measures of autonomic activity such as heart rate or respiratory response is a limitation of our study.

The results of this experiment are not only problematic for Bechara et al.'s (1997) account of IGT behavior. Knowledge sufficient to guide long-term advantageous selection emerged in the majority of participants at around the same time as Bechara et al. (1997) claimed. Participants were able to identify one of the best decks when initially questioned. As Maia and McClelland (2004) pointed out, unless losses have been experienced this will initially be deck A or B. But when losses begin to be encountered on these decks, they become disadvantageous, and it is then that participants have a problem keeping up. This was reflected in the assessment of participants' knowledge using either an aggressive or a conservative approach. For knowledge to be revealed using a conservative approach requires that knowledge to be present consistently across questioning and as losses are experienced on decks A and B, participants struggle to identify C and D as the new best decks. This time overlaps with when Bechara et al. (1997) claimed the aSCR difference emerged (trials 10–50). Kleeberg et al. (2004) reported that although they found no difference in aSCRs between deck types the increase in aSCR they observed averaged across all decks emerged between trials 20 and 40. These aSCR differences may be related to the shift in polarity of deck received values. The results from our study mean that Maia and McClelland's (2004) assertion that participants have knowledge sufficient to guide their behavior from the first questioning is supported, but unlike Maia and McClelland, our examination of participants' knowledge when their first losses on what become the disadvantageous decks are experienced, does not support the claim that this knowledge reflects the received deck contingencies. This also provides some support for the claim that failure to learn a successful strategy on the IGT may be linked to deficits in reversal learning (Rolls, 1999, 2005; Dunn et al., 2006).

As Maia and McClelland (2004) found, the assessments of participants' knowledge here sometimes indicated that their behavior did not reflect the knowledge that they possessed. Participants often did not select one of the best available choices despite the knowledge probes indicating that they were able to make this distinction. One explanation for this behavior is that their knowledge is not complete and few possess accurate knowledge of the deck contingencies. This makes non-optimal deck selection a reasonable option as participants attempt to explore the decks to learn more about their contingencies (Maia and McClelland, 2005). However, as **Figures 4**, **5** show, few participants come close to achieving this understanding. Indeed, most participants gave all the decks a negative rating, suggesting that they were unaware that either deck C or deck D was profitable with repeated selection. This also suggests that for participants in this experiment the times when they lost money were most influential when they made their ratings. Certainly the pattern of changing selection from decks B and C driving learning observed in previous studies (Fernie and Tunney, 2006; Lin et al., 2007) was replicated here and was reflected in the question responses of participants given the Specific questions.

Persaud et al.'s (2007) claim that question style influenced awareness of deck contingencies is interesting in the context of our finding that participants continued to select sub-optimally despite the presence of knowledge sufficient to guide behavior. There was no difference in when participants began to select advantageously between Persaud et al.'s groups, demonstrating, surprisingly, that awareness, as measured with PDW, did not affect behavior. Regardless of whether PDW is an accurate measurement of awareness (Overgaard et al., 2010; Mealor and Dienes, 2012), Persaud et al.'s results seem to show that participants may have increased understanding of the task contingencies, or at least decreased uncertainty, following more specific questioning. However, Persaud et al. do not report on what degree of knowledge their participants possessed despite asking them the same questions we did. It may be that this increased knowledge, or decreased uncertainty, acted to reduce risk, or loss, aversion (Schurger and Sher, 2008) when wagering, but was not sufficient to reduce the exploratory behavior necessary to learn more about the task contingencies.

Our results suggest that participants do not generate anticipatory physiological activity sufficient to differentiate between deck types in the period prior to acquiring knowledge sufficient to guide their behavior. Knowledge required to profit on the IGT emerged later than claimed by Maia and McClelland (2004) but was not a complete understanding of the nature of the IGT. Indeed our results differed from those reported by both Maia and McClelland (2004) and Bechara et al. (1997). Both groups suggested that the majority of their participants end the experiment with conceptual knowledge of the IGT. We found little evidence in support of this conclusion, but conceptual knowledge was not critical for advantageous deck selection to occur.

## **REFERENCES**

amygdala and ventromedial prefrontal cortex to decision-making. *J. Neurosci.* 19, 5473–5481.

*Science* 275, 1293–1295. doi: 10.1126/science.275.5304.1293

*Human Brain.* New York, NY: Avon.

letters). *Nat. Neurosci.* 5, 1103–1104. doi: 10.1038/nn1102-1103


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 July 2013; accepted: 11 September 2013; published online: 04 October 2013.*

*Citation: Fernie G and Tunney RJ (2013) Learning on the IGT follows emergence of knowledge but not differential somatic activity. Front. Psychol. 4:687. doi: 10.3389/fpsyg.2013.00687*

*This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Fernie and Tunney. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*