# A model-based analysis of impulsivity using a slot-machine gambling paradigm

^{1}Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich and Swiss Federal Institute of Technology (ETH Zurich), Zurich, Switzerland

^{2}Max Planck Institute for Neurological Research, Cologne, Germany

^{3}Laboratory for Social and Neural Systems Research (SNS), University of Zurich, Zurich, Switzerland

^{4}Wellcome Trust Centre for Neuroimaging, University College London, London, UK

Impulsivity plays a key role in decision-making under uncertainty. It is a significant contributor to problem and pathological gambling (PG). Standard assessments of impulsivity by questionnaires, however, have various limitations, partly because impulsivity is a broad, multi-faceted concept. What remains unclear is which of these facets contribute to shaping gambling behavior. In the present study, we investigated impulsivity as expressed in a gambling setting by applying computational modeling to data from 47 healthy male volunteers who played a realistic, virtual slot-machine gambling task. Behaviorally, we found that impulsivity, as measured independently by the 11th revision of the Barratt Impulsiveness Scale (BIS-11), correlated significantly with an aggregate read-out of the following gambling responses: bet increases (BIs), machine switches (MS), casino switches (CS), and double-ups (DUs). Using model comparison, we compared a set of hierarchical Bayesian belief-updating models, i.e., the Hierarchical Gaussian Filter (HGF) and Rescorla–Wagner reinforcement learning (RL) models, with regard to how well they explained different aspects of the behavioral data. We then examined the construct validity of our winning models with multiple regression, relating subject-specific model parameter estimates to the individual BIS-11 total scores. In the most predictive model (a three-level HGF), the two free parameters encoded uncertainty-dependent mechanisms of belief updates and significantly explained BIS-11 variance across subjects. Furthermore, in this model, decision noise was a function of trial-wise uncertainty about winning probability. Collectively, our results provide a proof of concept that hierarchical Bayesian models can characterize the decision-making mechanisms linked to the impulsive traits of an individual. These novel indices of gambling mechanisms unmasked during actual play may be useful for online prevention measures for at-risk players and future assessments of PG.

## Introduction

Uncertainty is a fundamental aspect of human decision-making (Bland and Schaefer, 2012). One general framework for assessing decision-making under uncertainty is to view humans as Bayesian learners. From this perspective, humans employ a generative model of sensory inputs to update beliefs about the state of the world and choose actions in order to minimize prediction errors (Knill and Pouget, 2004; Daunizeau et al., 2010; Friston et al., 2010). When this predictive machinery breaks (due to disease or drugs), maladaptive behavior can arise. This aberrant behavior can be formally examined and understood mechanistically using different computational models (e.g., McGuire and Kable, 2013). One interesting and clinically relevant case of potentially harmful aberrant behavior that arises is impulsivity, i.e., actions without deliberation or forethought, particularly in the face of uncertainty (Dickman, 1993; Sharma et al., 2014).

Impulsive responses under uncertainty play a crucial role in disordered gambling, where players continue to bet money even in the face of large losses and potentially catastrophic long-term consequences. It has been found that standard measures of impulsivity and gambling severity scores are significantly correlated (Alessi and Petry, 2003; Krueger et al., 2005). Pathological gambling (PG) was therefore originally categorized as an “Impulse Control Disorder Not Elsewhere Classified” in the Diagnostic and Statistical Manual (DSM) Fourth Edition. It has recently been relabeled “gambling disorder” and reclassified as an addictive disorder in the 5th edition of the DSM, due to the large number of characteristics it shares with other addictions. This, however, does not question the relationship between impulsivity and disordered gambling, since impulsivity is a central theme in addiction as well (Holden, 2010; APA, 2013).

Impulsivity has been shown to have predictive power in assessing a subject's susceptibility to addiction (de Wit, 2009; Leeman et al., 2014). In the specific context of gambling, correlations have been reported between gambling severity and more traditional questionnaire-based measures of impulsivity, such as Eysenck's Impulsivity Inventory, the Barratt Impulsiveness Scale (11th version; BIS-11), the Urgency, Premeditation, Perseverance and Sensation-Seeking (UPPS) scale, and the Dickman Impulsiveness scale (Monterosso and Ainslie, 1999; Rodriguez-Jimenez et al., 2006; Whiteside and Lynam, 2009). More specifically, changes in gambling severity were related to changes in self-reported impulsivity scores (Blanco et al., 2009). Given this evidence, impulsivity has been proposed as a potential predisposing factor for PG (Vitaro et al., 1997; Brewer and Potenza, 2008; Guerrieri et al., 2008; Stein, 2008).

However, impulsivity is a broad concept with many facets, including spontaneous acts without planning or deliberation (“acting without thinking”), excessive risk-taking, and a lack of orientation to future outcomes (Patton et al., 1995; Robbins et al., 2012). It is therefore conceivable that PG behavior involves only a subset of these elements (Nower and Blaszczynski, 2006). For example, response impulsivity (also referred to as “stopping impulsivity”; Robbins et al., 2012), measured by deficits in inhibitory control, showed mixed results when tested for in pathological and problem gamblers. Lawrence et al. (2009) found deficits in a Stop Signal Reaction Time Task (SSRT) with relation to alcohol dependence but not in relation to gambling behaviors. Similarly, Rodriguez-Jimenez et al. (2006) found decreased SSRT accuracy only in gamblers also diagnosed with Attention Deficit Hyperactivity Disorder (ADHD). Other studies, however, report that pathological gamblers commit more commission errors in a Go/NoGo task, indicative of impulse control problems (Fuentes et al., 2006).

By contrast, measures of choice impulsivity (or “waiting impulsivity”; Robbins et al., 2012) show a more consistent relation to gambling behavior. For example, higher discount rates in delay discounting tasks have been associated with problem gambling and PG in a number of studies (Petry, 2001; Alessi and Petry, 2003; Peters and Büchel, 2011; Miedl et al., 2012). These deficits correlate mainly with cognitive distortions, suggesting that differences in the underlying belief structure of a gambler might contribute to the types of impulsivity we see in disordered gambling (Michalczuk et al., 2011). These findings are in line with reported decision-making deficits of gamblers across a variety of tasks (Goudriaan et al., 2005). They do not, however, explain how different cognitive mechanisms related to impulsivity translate into different gambling behaviors, from the recreational to the pathological.

Classical analyses of impulsivity, in the context of gambling, rest primarily on questionnaires (Eysenck and Eysenck, 1977; Barratt, 1985; Monterosso and Ainslie, 1999; Whiteside and Lynam, 2009). For many complex traits or psychological constructs (including impulsivity), self-report measures from questionnaires represent a gold standard. However, as they provide a descriptive summary of processes that may be controlled by factors not accessible through conscious introspection, they can be subject to various confounds (Wilson and Dunn, 2004). A promising alternative approach is to directly engage the subject in a paradigm that unmasks pathological behavior and apply a model that infers on the latent mechanisms underlying this behavior. This notion is rapidly gaining attention, particularly in the application to psychiatry (cf. “computational psychiatry”; Moutoussis et al., 2011; Montague et al., 2012; Stephan and Mathys, 2014), and represents the approach pursued in this paper. To gain acceptance in the field, however, any model-based approach of this sort will have to show construct validity with respect to an established standard, i.e., a commonly used questionnaire (for a similar rationale, see Huys et al., 2012). In our case, the BIS-11 represents one such widely accepted standard way of assessing impulsivity, and we thus used this questionnaire as a reference point for demonstrating the plausibility of our model-based characterization of impulsivity.

Formal modeling of the time series of responses during gambling (whether pathological or not) has received surprisingly little attention (one significant exception being Ligneul et al., 2012). However, several recent publications have urged the community toward cognitive models of problem gambling (e.g., Gobet and Schiller, 2011). Some analyses have been motivated conceptually by reference to reinforcement learning (RL) (Shao et al., 2013), but we are not aware of studies that have directly applied a reinforcement-learning model to slot machine gambling data. This may be because classical RL does not directly relate to probabilistic inference on hidden states of the world *per se* (e.g., states of slot machines) but assumes states and actions to be given and accessible (Gershman and Niv, 2010). This lack of an intrinsic concept of uncertainty (with respect to states of the world) is not ideal for studying gambling behavior (Averbeck et al., 2013; McGuire and Kable, 2013). This suggests the application of Bayesian approaches, for which uncertainty is a core quantity. Wetzels et al. (2010), for instance, use an Expectancy Valence (EV) model to parameterize how subjects perceive wins and losses when engaging in the Iowa Gambling Task (IGT), and argue for the use of Bayesian cognitive models to explain gambling behaviors. Similarly, a recent call for increasing the role of mathematics in the psychological intervention in problem gambling highlights the need for further modeling approaches (Barboianu, 2013).

To yield mechanistic insights into gambling, we need to infer, from measured behavior, the principles that govern an individual's belief-updating processes. This can be achieved using a Bayesian model of cognitive processes, i.e., one that describes how sequences of latent states and their respective uncertainties are transformed into observable responses. Bayesian models thus allow for “triple inference,” with respect to perception (inference on states of the world), learning (estimating the parameters that govern perceptual updates) and decision-making (the transformation of beliefs into actions). These quantitative estimates provide a more complete and mechanistically interpretable explanation of behavior in an individual, reflecting perceptual and decision-related nuances that simple summary statistics, such as average accuracy or reaction time, may have hidden from the experimenter (Mathys et al., 2011).

In the present work, we treat the player as an (approximate) Bayes-optimal learner who invokes a hierarchical generative model of trial outcomes in order to infer on the probabilistic structure of the game, allowing for optimal decisions under uncertainty (cf. Daunizeau et al., 2010). Having seen a trial outcome, the player updates his beliefs about trial-wise probabilities of winning and how these change in time (i.e., whether the slot machine is stable or volatile). Critically, these updates exhibit individual approximate Bayes-optimality (Mathys et al., 2011), governed by subject-specific parameters that couple the hierarchical levels of inference in the model. On any given trial, the ensuing beliefs then provide a basis for a response model that prescribes a probabilistic mapping from beliefs to responses.

A likely reason why there have been few attempts at formal modeling of slot machine gambling is that it is not immediately obvious which of the many data features afforded by a naturalistic slot machine paradigm (both in terms of sensory inputs and motor responses) should be used to formulate a model for optimally predicting impulsivity. Notably, this cannot be decided by standard statistical model comparison techniques, since these require the data to be identical across models. Here, we address this problem by examining construct validity. That is, for different combinations of sensory and motor data features, we assess the predictive power of the resulting model parameter estimates in relation to an external and independent variable.

To summarize, in this proof of concept study we evaluated the potential utility of a model-based approach to characterizing gambling behavior, combining a naturalistic gambling paradigm with generative (Bayesian) modeling to quantify gambling-relevant aspects of impulsivity. For this, we sought to establish construct validity in relation to standard questionnaire measures of impulsivity. Specifically, we first tested 48 male participants using a naturalistic slot-machine gambling paradigm in which a variety of different gambling behaviors could be expressed. We assessed how gambling behavior related to the individuals' impulsivity, as measured by the BIS-11 (Patton et al., 1995), and independently modeled participants' belief-updating mechanisms within a hierarchical Bayesian framework (Hierarchical Gaussian Filter, HGF). Finally, we examined whether the model parameter estimates would predict the individuals' impulsive traits (BIS-11 scores).

## Materials and Methods

### Experimental Procedure

#### Participants

Participants included 48 healthy male subjects (Table 1). All volunteers gave written informed consent. The study was approved by the ethics committee of the Faculty of Medicine at the University of Cologne, Germany (study number 10-226). One subject left the task early and was excluded from the analyses, leaving 47 subjects.

#### The Barratt Impulsiveness Scale

The Barratt Impulsiveness Scale (version 11; BIS-11) was used as an independent measure of impulsivity. It has a 50-year track record in psychiatric diagnosis and was validated as a measure of impulsivity in a series of studies (Moeller et al., 2001; Stanford et al., 2009). Here, we used the German version of BIS-11 (Patton et al., 1995). The BIS-11 is a 30-item self-report questionnaire divided into 6 first-order and 3 second-order sub-scales (First-order sub-scales: Attention, Motor, Self-control, Perseverance, Cognitive Instability, Cognitive Complexity; Second-order subscales: Attentional, Motor, Non-planning). We used the total score as the external impulsivity measure in our analyses of construct validity of computational models.

#### The Sensitivity to Punishment and Sensitivity to Reward Questionnaire

The Sensitivity to Punishment and Sensitivity to Reward Questionnaire (SPSRQ) assesses the Behavioral Inhibition System and the Behavioral Activation System in the two subscales Sensitivity to Punishment (SP) and Sensitivity to Reward (SR), respectively. The SR has been found to relate positively to Eysenck's Impulsivity Inventory and also has a significant positive correlation with the Sensation-Seeking Scale (SSS) (Torrubia et al., 2001). Here, we use the SPSRQ as a complementary scale, in addition to the BIS-11, to examine gambling; in contrast to other impulsivity questionnaires, such as the UPPS Impulsive Behavior Scale (Whiteside and Lynam, 2009), the BIS-11 lacks an isolated sensation-seeking subscale, which we account for by using the SPSRQ. In the context of gambling behavior, the more relevant measure is the Sensitivity to Reward subscale.

#### The South Oaks Gambling Screen

The South Oaks Gambling Screen (SOGS) is a self-administered 20-item questionnaire to screen for clinical populations with problem gambling and PG, based on criteria stated in the third edition of the Diagnostic and Statistical Manual (DSM-III). We administered the SOGS to account for potential confounds of PG behavior in our analysis of impulsive gambling. The clinical cut-off of the SOGS proposed by Lesieur and Blume (1987) is 5, while the cut-off proposed by Tolchard and Battersby (1996) is 10. The mean score of our healthy subjects was 1.12. Three of the 47 subjects in our study exceeded a SOGS score of 5; none exceeded a score of 10. As the SOGS has been reported to be overly sensitive for assessments of the general population, with a false positive rate of 50% (Stinchfield, 2002), we decided to include all subjects in the main analysis. We did not find a significant correlation between the BIS-11 and the SOGS (*r* = 0.18, *p* = 0.2).

#### Slot-machine paradigm

We designed a naturalistic behavioral paradigm to approximate the experience of true casino gambling by simulating a simple Electronic Gambling Machine (EGM). In addition to flexibility of design and ease of play, the literature suggests that EGMs have a higher addiction potential than other gambling alternatives, and increased access to these machines may lead to an increase in gambling problems across the world, independent of cultural context (Dowling et al., 2005; Lund, 2009). For these reasons, a slot-machine experimental paradigm proved particularly appealing in eliciting impulsive behavior from our subjects.

The features of the game, the design of the slot machines themselves, and the probability trace and pay-out percentage were inspired by real slot machines in Swiss casinos, and allowed players significant freedom to express different types of gambling behavior (Figure 1). To increase engagement in the task, participants gambled with real money (20 Euros) that they symbolically received (in addition to their show-up fee) before the start of the slot-machine game. The actual payout (the sum of wins, losses, and the show-up fee) took place after the game was completed. With a view to future studies with PG patients, we designed the virtual slot machine to resemble a realistic one, in the hope that this would facilitate the emergence of underlying risk tendencies and allow us to measure a broad spectrum of potentially relevant behaviors. We used industry-typical color and sound effects to increase the subject's engagement in the task, making the experiment as entertaining and realistic as possible while keeping the response options sufficiently simple that subjects without previous gambling experience could immediately understand the options of the game. From a modeling perspective, a casino-like game allows the experimenter to observe a rich set of behaviors that go beyond trial-wise bets. For example, the behavior and self-reports of many gamblers indicate that they are actively trying to estimate whether a given machine is “hot” (i.e., whether it is likely to produce wins) and switch to a new machine if it gets “cold” (Parke and Griffiths, 2006). To capture such behavior, our paradigm offered four different slot machines, and the subject was free to switch casinos and switch machines at any point in time.

**Figure 1. Structure of slot machine game.** After introductory instructions and 5 training trials, the player “enters the casino.” He or she is then asked to choose a machine, place a bet and pull the lever. As in a standard slot machine, the player watches the wheels spin and is then shown the trial outcome. On 50% of the win trials, the player is allowed to engage in a double-up option. He or she is given 3 seconds to decide whether or not to gamble. On any given trial, the player can also choose to switch the machine or switch between casino visits (not shown). The trial outcomes can be classified as follows: “true wins” (small and big) were wins in which the monetary amount won was larger than the original bet. “Fake wins” were trials in which the monetary amount received was smaller than the original bet placed. “Near-misses” are trials in which the outcome of the trial was a loss, but only the last wheel was different (e.g., AAB), and true losses are trials in which the amount bet was greater than the amount won.

Generally, at any point during the game, the player was able to:

• switch between four different slot machines,

• switch casinos and re-enter the casino on a new virtual day,

• add money from wallet to machine,

• place small or large bets on each trial,

• check the scores of different fruit combinations,

• accept the double-up (DU) option after a subset of wins.

#### Behavioral readout

Participants played 200 trials of the slot machine gambling task, and could decide between 4 machines (see Figure 1) that differed in their visual appearance but not in choice options. Players had the option to switch between these machines at any point during the game (machine switch, MS). Similarly, they could also leave the virtual casino, which would result in a cash-out of the money that was currently on the machine. After a casino switch (CS), participants had to re-enter the casino on a “new day” in the virtual world (while the underlying probability structure of the trials, which was unknown to the subjects, continued) until they had played through a minimum of 200 trials.

At the beginning of the game, participants could decide how much of the money on their account they wanted to load into the slot machine and which of the four machines they wanted to play on. In each trial, they had to place one of two bets (low bet = 20 Cents, high bet = 60 Cents) before starting the spinning of all three wheels of the machine, one after another. After a delay of 1 s, the wheels stopped spinning and displayed a combination out of 9 possible stimuli, all of which were pre-programmed in the game. The magnitude of the win was determined by the combination depicted; possible wins ranged from 5 Cents to 60 Euro (the Jackpot win for a large bet). Participants could look up the win table at any point in time by pressing an extra button on the machine. After 50% of the wins, the player was offered a secondary gamble option, with a 50–50 chance of doubling or losing the win amount (Figure 1). The decision to accept the “DU” option had to be made within 3 s of the screen appearing, enforcing rapid decision-making.

#### Perceptual input

To differentiate between different types of learning, we used four different trial outcomes in the game: true wins, fake wins, near-misses, and true losses (see Figure 1). “True wins” were wins in which the monetary amount won was larger than the original bet, whereas in “fake wins,” the monetary amount received was smaller than the original bet placed. Fake wins have been found to reinforce the sense of winning in slot machine games, and were included here to identify whether these events play a role in characterizing impulsivity (Jensen et al., 2013). “Near-misses” refer to cases where the bet was lost, but only the last wheel was different (e.g., AAB). This type of trial outcome has been shown to enhance gambling motivation, to lead to physiological arousal and to activate reward-related brain areas (Clark et al., 2009, 2012), all of which are related to subjective skill-oriented cognitive gambling traits, such as illusion of control (Billieux et al., 2012). Finally, “true losses” were cases in which all symbols depicted were different. Upon winning, the player experienced one of two different win banners (Figure 1), depending on the size of the win, where the distinction of “mega win” was reserved for the three largest win amounts. In the case of fake wins, the same win banner was shown to the player as for true “non-mega” wins.

#### Underlying game structure

To ensure comparable inference trajectories, each participant played the same sequence of 200 trials (with pre-determined win/loss outcomes), but was given the option to continue gambling past the end of this period if he desired. For comparison across subjects and for modeling purposes, we analyzed performance over these 200 trials only. Because the sequences of probabilities and reward levels were fixed across subjects, variability in performance could only result from the subject's own betting behavior and choices to engage in the DU option. Through systematic simulations, we chose a trace of probabilities and reward levels such that the mode of the return-to-player (RTP) for all types of potential bet combinations was around 90%, which is higher than the minimum required RTP of 70%, as stipulated by norms for actual casino RTPs (Gaming Laboratories International, 2011). The trace accounts for numerous relevant variables that may determine gambling behavior:

• 40% of the trials were wins,

• 50% of the wins were fake wins,

• 50% of the winning trials were followed by a DU option,

• 18% of trials were near misses,

• pre-determined winning and losing streaks.
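As a worked illustration of the proportions listed above, the following Python sketch converts them into trial counts for the 200-trial sequence. It assumes that the four outcome types are mutually exclusive and exhaustive, so that true losses make up the remainder; the function name and dictionary keys are hypothetical and not part of the original task code.

```python
N_TRIALS = 200  # length of the fixed trial sequence


def trial_composition(n_trials=N_TRIALS):
    """Translate the stated proportions into trial counts."""
    wins = round(0.40 * n_trials)         # 40% of trials were wins
    fake_wins = round(0.50 * wins)        # 50% of the wins were fake wins
    true_wins = wins - fake_wins
    double_up_offers = round(0.50 * wins) # 50% of wins were followed by a DU option
    near_misses = round(0.18 * n_trials)  # 18% of trials were near misses
    # Assumption: every remaining trial is a true loss.
    true_losses = n_trials - wins - near_misses
    return {
        "wins": wins,
        "true_wins": true_wins,
        "fake_wins": fake_wins,
        "double_up_offers": double_up_offers,
        "near_misses": near_misses,
        "true_losses": true_losses,
    }


composition = trial_composition()
```

Under these assumptions, the sequence would contain 80 wins (40 true and 40 fake), 40 DU offers, 36 near-misses and 84 true losses.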

Figure 2 depicts two exemplary performance traces for two subjects with BIS-11 score below (55) and above average (73). Behavioral readouts are overlaid in different colors. Notably, the more impulsive subject showed more behavioral activation and risk-seeking behavior throughout the game, in particular during the more volatile phase.

**Figure 2. Two exemplary performance traces for subjects with different BIS-11 scores. (A)** Subject with BIS-11 score below average (55). **(B)** Subject with BIS-11 score above average (73). Gray trace, performance over the course of the game in EUR. Colored dots are overlaid on the performance trace and reflect events of interest in a particular trial. MS (orange), machine switch; CS (green), casino switch; DU (blue), double-up; BI (red), bet increase. Notably, the more impulsive a subject (i.e., the higher a subject's BIS score), the more the subject exhibited behavioral activation throughout the game, and in particular, in the more volatile phases of the paradigm.

### Data Analysis

In addition to computational modeling described below, we used classical multiple linear regression for analyzing the behavioral data and for evaluating the construct validity of computational models (i.e., testing relations between model parameter estimates and BIS-11 scores). These analyses were performed using the regstats function in the MATLAB Statistics Toolbox. To determine how much the variance of an estimated regression coefficient increased due to collinearity, we estimated the variance inflation factor (VIF) for each regressor. In all analyses, a *p*-value of < 0.05 was considered significant; multiple tests were corrected for by Bonferroni correction. For model comparison we used the Bayesian Information Criterion (BIC).
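The collinearity check and multiple-comparison correction described above can be sketched in Python as follows. This is not the MATLAB `regstats` code used in the study; it is a minimal NumPy illustration of the standard definition VIF_j = 1 / (1 − R_j²), where R_j² is obtained by regressing regressor *j* on all other regressors. The function names and the simulated regressors are hypothetical.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (observations x regressors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # add an intercept
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Simulated regressors for 47 subjects (illustrative data only).
rng = np.random.default_rng(0)
a = rng.normal(size=47)
b = rng.normal(size=47)
c = a + 0.1 * rng.normal(size=47)  # deliberately almost collinear with a
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
```

In this toy example, the VIFs of the two nearly collinear regressors are large, whereas the VIF of the independent regressor stays close to 1.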

### Computational Modeling

#### Generic considerations

This paper is concerned with a proof of concept demonstration that generative modeling of gambling behavior can yield mechanistic descriptions of impulsivity in terms of individual beliefs and belief-to-response mappings. A generative model is a model which provides a joint probability distribution over all random variables involved (e.g., observations and parameters). It specifies a forward mapping from hidden parameters and states to measurable observations. Here, we will consider generative models which are formulated under the “observing the observer” framework (Daunizeau et al., 2010). Such models allow the experimenter to infer upon the hidden states and parameters of an agent or subject engaged in a task.

Critically, this class of models generates two things: sensory inputs and motor responses. Therefore, when specifying a generative model of gambling, one must consider which aspects of the sensory input administered and the behavioral responses observed are to be predicted by the model. First, the player's internal belief updating (mediated by the “perceptual model”) could be informed by different aspects of trial outcomes to which he has sensory access (“perceptual variables”). For example, does he treat near-misses similarly to wins of any sort, and does he distinguish between true wins and fake wins? Secondly, how is a given belief transformed into a behavioral response or choice? A particular belief-to-response mapping constitutes what we refer to as a “response model.” Importantly, in a naturalistic paradigm, many different aspects of behavior can be observed (e.g., bets, DUs, MS, etc.), and, as for the perceptual variables above, a choice has to be made regarding which data features are most relevant and should be predicted by the generative model. In other words, generative models could be constructed for different combinations of perceptual and response variables.

In principle, finding the optimal model can be accomplished by means of Bayesian model selection (BMS), which evaluates the relative plausibility of competing models in terms of the log evidence (MacKay, 2003) and represents a principled trade-off between model fit and model complexity. However, a condition for BMS is that the competing models predict identical data. This means that BMS can only proceed if both perceptual and response variables are identical.

To deal with this issue, we implement a two-stage model selection in this paper. As described in the next section, we consider five different “core models;” each of these represents a particular combination of a perceptual and a response model. We then consider three different perceptual variables and four response variables; this results in 12 sensory-motor datasets. For any of these datasets, we can invert all five core models and select an optimal model using BMS. In a second step, we can evaluate the relative goodness of these 12 selected models by assessing their construct validity against an external measure of impulsivity. To this end, we use an independent questionnaire-based measure of impulsivity (the BIS-11) and perform multiple regression analyses of individual parameter estimates on individual questionnaire scores, as described below.
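The first stage of this procedure can be sketched in Python as below. The perceptual variables are labelled by the paper's abbreviations (WLG, WLN, OL); the response-variable labels `R1` to `R4`, the core-model labels `M1` to `M5`, the function name and the toy log-evidence values are placeholders introduced purely for illustration, since the actual combinations and scores are given in the tables and results rather than here.

```python
from itertools import product

PERCEPTUAL = ["WLG", "WLN", "OL"]            # three perceptual variables
RESPONSE = ["R1", "R2", "R3", "R4"]          # placeholder labels for the four response variables
CORE_MODELS = ["M1", "M2", "M3", "M4", "M5"] # placeholder labels for the five core models

# Each (perceptual, response) pair defines one sensory-motor dataset.
datasets = list(product(PERCEPTUAL, RESPONSE))


def select_best_models(log_evidence):
    """Stage 1: for every dataset, pick the core model with the highest
    (approximate) log evidence. Models are only compared within a dataset,
    because model comparison requires identical data."""
    return {ds: max(scores, key=scores.get) for ds, scores in log_evidence.items()}


# Toy log-evidence values (not results from the study).
toy_log_evidence = {
    ds: {m: -100.0 - i for i, m in enumerate(CORE_MODELS)} for ds in datasets
}
toy_log_evidence[("OL", "R4")]["M3"] = -90.0
winners = select_best_models(toy_log_evidence)
```

The second stage would then compare the 12 winning models with each other, not by their evidence, but by how well their parameter estimates predict the external BIS-11 scores.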

Following this general overview of our modeling strategy, the following paragraphs will unpack these ideas and specify both the perceptual and response variables considered as well as the form of the generative models employed.

#### Perceptual variables

We considered the following three perceptual variables, which refer to binary trial outcomes and are summarized in Table 2 (where win is coded as 1 and loss as 0): (i) Win/Loss Gross (WLG); this case treats real wins and fake wins as wins and near-misses as losses; (ii) Win/Loss Net (WLN); this option only considers real wins as wins and treats fake wins and near-misses as losses; (iii) Overlearn (OL), where real wins, fake wins and near-misses are all considered as wins.
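The three coding schemes can be written down as a simple lookup, sketched here in Python. The outcome labels (`true_win`, `fake_win`, `near_miss`, `true_loss`) and the function name are hypothetical; true losses are coded as 0 under every scheme.

```python
# Binary coding of the four trial outcomes under each perceptual variable
# (1 = win, 0 = loss).
OUTCOME_CODING = {
    "WLG": {"true_win": 1, "fake_win": 1, "near_miss": 0, "true_loss": 0},
    "WLN": {"true_win": 1, "fake_win": 0, "near_miss": 0, "true_loss": 0},
    "OL":  {"true_win": 1, "fake_win": 1, "near_miss": 1, "true_loss": 0},
}


def encode_outcomes(outcomes, scheme):
    """Map a sequence of outcome labels to the binary input of one scheme."""
    coding = OUTCOME_CODING[scheme]
    return [coding[outcome] for outcome in outcomes]


example = ["true_win", "fake_win", "near_miss", "true_loss"]
wlg = encode_outcomes(example, "WLG")
wln = encode_outcomes(example, "WLN")
ol = encode_outcomes(example, "OL")
```

The same sequence of trial outcomes therefore yields three different binary input series, one per perceptual variable.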

#### Response variables

A naturalistic paradigm like ours allows for numerous readouts of behavior, and thus, many possible response variables. Here, we consider several combinations of readouts as candidate response variables. As our intention is to explain how impulsivity is manifested in a gambling paradigm, we use the BIS-11 to inform the choice of response variables. Across the factor analyses of the BIS-11 conducted by Barratt (1985) and Patton et al. (1995), the two second-order subscales that were found consistently are Motor Impulsiveness (the inclination to act spontaneously and aimlessly) and Non-planning Impulsiveness (the lack of future orientation and consideration of risks). Guided by these two BIS-11 subscales, we focus on four candidate responses: bet increase (BI), double-up (DU), casino switches (CS), and machine switches (MS). Specifically, we considered nested combinations of these actions as response variables (see Table 2 and Figure 5). BI and DU reflect Non-planning Impulsiveness, whereas CS and MS are best characterized by Motor Impulsiveness. Quantitatively, we represent the players' trial-by-trial responses for each of these variables in a binary fashion and apply a boolean OR operator to the respective response set (Table 2).
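The OR-combination of binary response series can be sketched in Python as follows. The function name and the five-trial example series are hypothetical; which actions enter each response variable is specified in Table 2 and is not reproduced here.

```python
import numpy as np


def combine_responses(*binary_series):
    """Trial-wise boolean OR across any number of binary response series."""
    stacked = np.vstack([np.asarray(s, dtype=bool) for s in binary_series])
    return np.any(stacked, axis=0).astype(int)


# Hypothetical responses over five trials (1 = action taken on that trial).
bet_increase = [0, 1, 0, 0, 1]
double_up = [0, 0, 0, 1, 1]
machine_switch = [0, 0, 0, 0, 0]

combined = combine_responses(bet_increase, double_up, machine_switch)
```

A trial of the combined variable is coded as 1 whenever at least one of the contributing actions occurred on that trial.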

The reinforcement-learning and Bayesian models we consider below link the expression of the above responses to the agent's internal beliefs and their uncertainty. Simply speaking, we are modeling an agent for whom stronger beliefs of winning lead to increasingly risk-seeking behavior and, at the same time, result in an increasing frequency of erratic and sensation-seeking behavior in terms of CS and MS. Critically, this probabilistic link between beliefs and actions is governed by a subject-specific parameter which, in some of the models described below, becomes a function of the agent's trial-wise uncertainty.

### Core Models

#### Hierarchical Gaussian Filtering

As motivated in the Introduction, this paper adopts a Bayesian perspective on gambling. Specifically, we use a hierarchical Bayesian belief-updating model, the HGF shown in Figure 3, to infer upon the underlying belief structure guiding individual gambling behavior and its relation to an individual's level of impulsivity (Mathys et al., 2011, unpublished work). The HGF represents a generic generative model of the sensory inputs an agent receives. It consists of hierarchically coupled Gaussian random walks, where this coupling is specified by subject-specific parameters.

**Figure 3. Schematic of Hierarchical Gaussian Filter (HGF).** Different levels of the hierarchy encode a subject's estimates of different characteristics of environmental uncertainty. The first level, *x*_{1}, follows the trajectory of the perceived variable in the environment, in the absence of perceptual noise. The second level, *x*_{2}, tracks the probability of trial outcomes over the course of the paradigm. The step-size of the random walk by *x*_{2} depends on the highest level, *x*_{3}, that tracks the global volatility of the environment. This three-level system underlies the player's belief-updating process during the game.

To illustrate its general structure, let us assume that we track a quantity *x*_{1} in our environment which evolves as a Gaussian random walk. Let us now characterize the variance of this random walk as a function of a higher level, *x*_{2}, which is itself a Gaussian random walk. *x*_{2} now controls the step size of the random walk performed at the first level via some transformation function *f*, and thus determines our uncertainty about *x*_{1}. We can continue this hierarchical coupling up to some *n*-th level:

where a parameter ϑ determines the step-size on the highest level *n*:
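Written out in the notation of Mathys et al. (2011), with *k* indexing trials, these two relations can be sketched as follows. The coupling functions $f_i$ are left generic, and the positivity constraint on ϑ is the usual one for a variance:

$$x_i^{(k)} \sim \mathcal{N}\!\left(x_i^{(k-1)},\; f_i\!\left(x_{i+1}\right)\right), \qquad i = 1, \ldots, n-1$$

$$x_n^{(k)} \sim \mathcal{N}\!\left(x_n^{(k-1)},\; \vartheta\right), \qquad \vartheta > 0$$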

#### Perceptual model of the HGF

Figure 3 shows the graphical model and the equations of a standard three-level HGF which we are using for the present analyses. In this model, the lowest level, *x*_{1}, corresponds to the perceived variable (e.g., a win), barring sensory noise (which is negligible in our case). The second level represents the evolution of the probability of trial outcomes over time. Critically, its variance depends on the third level which, in turn, represents the stability of the environment (log-volatility). In our context, this model describes how the player updates his beliefs about trial outcome probabilities under the influence of a higher belief of how these probabilities change in time (i.e., whether the slot machine is stable or volatile).

By linking beliefs to choices via a response model, we can invert this model given measured responses; for details, see Mathys et al. (2011, unpublished work). This model inversion allows for inference on subject-specific model parameters, and thus, on an individual's hierarchical belief trajectories and their associated uncertainties. Notably, the posterior estimates of subject-specific model parameters describe an individual's approximation to Bayes-optimal behavior.

In the HGF, model inversion rests on a variational approximation to full Bayesian learning which results in simple analytical belief update equations (for a detailed derivation, see Mathys et al., 2011). Intuitively, one would imagine that a belief update occurs when an agent compares the predicted to the actual sensory input, calculates an error term, and then back-propagates this error up the hierarchy to adjust beliefs at all levels. In the HGF, this occurs by passing back a prediction error that is weighted by precision (inverse uncertainty). This precision term is (proportional to) the inverse step size of the Gaussian random walk on different levels.

The belief update equations generalize to the following form: at any level *i* of the hierarchy, the belief on trial *k* (posterior mean μ^{(k)}_{i} of the state *x*_{i}) is updated in proportion to a precision-weighted prediction error ε^{(k)}_{i}; this is the product of the prediction error δ^{(k)}_{i − 1} from the level below and a precision ratio Ψ^{(k)}_{i}:

where

Here, $\hat{\pi}_{i-1}^{(k)}$ represents the precision of the prediction about input from the level below, and π^{(k)}_{i} represents the precision of the belief at the current level. Finally, the prediction error δ^{(k)}_{i − 1} is simply the difference between the actual value of a state (e.g., stimulus at the lowest level of the hierarchy) and our expectation of its value:
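Collecting the definitions above, the update can be sketched as follows. This sketch follows the verbal description in this section and the general form in Mathys et al. (2011); the last line gives only the lowest-level case, where the state is the observed binary outcome and $\hat{x}_1^{(k)}$ is its predicted probability:

$$\mu_i^{(k)} - \mu_i^{(k-1)} \;\propto\; \Psi_i^{(k)}\,\delta_{i-1}^{(k)} = \varepsilon_i^{(k)}, \qquad \Psi_i^{(k)} = \frac{\hat{\pi}_{i-1}^{(k)}}{\pi_i^{(k)}}$$

$$\delta_1^{(k)} = x_1^{(k)} - \hat{x}_1^{(k)}$$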

The equations above show that belief updating in the HGF resembles RL, in which the model is trained on the error signal between model predictions and observed data. A key difference, however, is that the HGF provides estimates not only of states but also of their uncertainty (posterior variance or precision), enabling a precision-weighting of prediction errors. This precision weighting means that a prediction error leads to a larger update the more precise (less uncertain) the prediction about the input from the level below is, relative to the precision of the current belief. The HGF thus takes into account estimates of uncertainty about the hidden, hierarchically related processes which generate sensory inputs. The detailed update equations for precisions (or uncertainties) can be found in Mathys et al. (2011).
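To make the role of uncertainty concrete, the following minimal Python sketch implements a single second-level update for binary outcomes, following the update equations of the binary HGF in Mathys et al. (2011). For brevity the third level is held fixed (it is passed in as `mu3` and not updated), and all function and variable names are illustrative; this is not the HGF Toolbox code.

```python
import math


def sigmoid(x):
    """Logistic sigmoid mapping the second-level state to a win probability."""
    return 1.0 / (1.0 + math.exp(-x))


def update_second_level(mu2, sigma2, outcome, kappa, omega, mu3):
    """One trial of the second-level belief update for a binary outcome.

    mu2, sigma2 : posterior mean and variance of x2 after the previous trial
    outcome     : binary input (1 = win, 0 = loss)
    kappa, omega, mu3 : set the step variance exp(kappa * mu3 + omega);
                        mu3 is held fixed here (simplifying assumption)
    """
    x1_hat = sigmoid(mu2)                      # predicted win probability
    delta1 = outcome - x1_hat                  # prediction error at the first level
    step_var = math.exp(kappa * mu3 + omega)   # step variance of the random walk on x2
    pi2_hat = 1.0 / (sigma2 + step_var)        # precision of the prediction at level 2
    pi2 = pi2_hat + x1_hat * (1.0 - x1_hat)    # posterior precision at level 2
    sigma2_new = 1.0 / pi2                     # posterior variance at level 2
    mu2_new = mu2 + sigma2_new * delta1        # uncertainty-weighted update of the mean
    return mu2_new, sigma2_new, delta1


# The same win moves an uncertain belief more than a confident one.
uncertain = update_second_level(0.0, 1.0, 1, kappa=1.0, omega=-4.0, mu3=0.0)
confident = update_second_level(0.0, 0.1, 1, kappa=1.0, omega=-4.0, mu3=0.0)
after_loss = update_second_level(0.0, 1.0, 0, kappa=1.0, omega=-4.0, mu3=0.0)
```

In this sketch the effective learning rate is the posterior variance at the second level, so the belief about winning probability changes more when the agent is less certain about it.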

#### What do the parameters mean?

The HGF described above represents a Bayesian player who updates his beliefs about trial outcome probabilities (at the 2nd level) under the influence of a higher belief (at the 3rd level) of how these probabilities change in time, i.e., whether the slot machine is currently stable or volatile. This model has four parameters of interest: three perceptual parameters (κ, ω, ϑ), and one response parameter (β) which is described below.

κ and ω determine the step size of the random walk on the second level of our hierarchy. Both of them contribute different aspects of volatility: while ω is a fixed component of step size variance at the second level, κ scales the influence of the third level on the step size variance of the second level and can thus be seen as a mediator for the dynamic component of volatility. The analyses presented in this paper fix κ to unity, because of identifiability problems that arise under some of the response models chosen here. ϑ determines the step-size of the random walk on the third level; in a sense, it represents an agent's *a priori* belief on the precision of his own inference at the second level. Collectively, these subject-specific model parameters describe the coupling of belief updates across levels and thus an individual approximation to Bayes-optimal behavior.

#### Response model of the HGF

To model the players' responses we use a softmax function, in which we vary the nature of the decision temperature, β, in Equation 6. This function describes a sigmoidal mapping from the gambler's beliefs to his chosen action:

The free parameter β encodes the curvature of the softmax, and thus decision noise; $\hat{x}_1$ is the present prediction of trial outcome probability; and *y* is the binary response variable. An intuitive interpretation of β is that it specifies how deterministically a subject's actions follow from his/her beliefs. The larger β, the steeper the softmax curve, which increasingly resembles a step function, and thus the more deterministic the relation between beliefs and actions. Conversely, as β gets smaller, decisions become less determined by beliefs, i.e., choices become more stochastic or exploratory.
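The effect of β can be illustrated with a short Python sketch. It assumes a unit-square sigmoid, a form often used for binary responses in HGF analyses; whether this matches Equation 6 term by term is an assumption, and the function name is illustrative.

```python
def unit_square_sigmoid(x_hat, beta):
    """Probability of a response (y = 1) given a predicted probability x_hat in [0, 1].

    Assumed form: x_hat**beta / (x_hat**beta + (1 - x_hat)**beta), with beta > 0.
    """
    a = x_hat ** beta
    b = (1.0 - x_hat) ** beta
    return a / (a + b)


# Response probability for the same belief under low, unit, and high beta.
p_noisy = unit_square_sigmoid(0.7, 0.1)
p_unit = unit_square_sigmoid(0.7, 1.0)
p_sharp = unit_square_sigmoid(0.7, 10.0)
```

With β = 1 the response probability equals the belief itself; a large β pushes it toward 0 or 1 (near-deterministic choices), and a small β pulls it toward 0.5 (near-random choices).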

We consider four classes of response models that will be tested on each of the four aggregate response variables (see section Response Variables). Here, we vary the nature of β (Models 1–3) or the argument of our softmax response function (Model 4) (see Figure 4):

• Model 1: β^{(k)} = constant, the decision noise is a subject-specific, static feature which is independent of any higher level beliefs and which is estimated as a free parameter. The response model is then *p*(*y* = 1) = softmax(*x*_{1}, β).

• Model 2: β^{(k)} = 1/σ^{(k)}_{2}, where σ^{(k)}_{2} is the trial-wise uncertainty (about winning probability) on the second level of the perceptual model. The response model is then *p*(*y* = 1) = softmax(*x*_{1}, 1/σ_{2}).

• Model 3: β^{(k)} = 1/exp(μ^{(k)}_{3}), where μ_{3} is the log-volatility. The response model is then *p*(*y* = 1) = softmax(*x*_{1}, 1/exp(μ_{3})).

• Model 4: *p*(*y* = 1) = softmax(4σ_{1}, β), where σ_{1} is the trial-wise uncertainty about winning probability on the first level of the perceptual model, and β is a free parameter as in Model 1.
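The four response models listed above can be sketched in a few lines of Python. The sketch assumes a unit-square sigmoid as the softmax (an assumption about its exact form) and uses illustrative function and argument names; it is not the HGF Toolbox implementation.

```python
import math


def softmax(x, beta):
    """Assumed unit-square sigmoid: x**beta / (x**beta + (1 - x)**beta)."""
    a = x ** beta
    b = (1.0 - x) ** beta
    return a / (a + b)


def response_probability(model, x1_hat, beta=None, sigma2=None, mu3=None):
    """p(y = 1) on one trial under response Models 1 to 4.

    x1_hat : predicted win probability
    beta   : free decision-noise parameter (Models 1 and 4)
    sigma2 : second-level uncertainty on this trial (Model 2)
    mu3    : log-volatility estimate on this trial (Model 3)
    """
    if model == 1:
        return softmax(x1_hat, beta)
    if model == 2:
        return softmax(x1_hat, 1.0 / sigma2)
    if model == 3:
        return softmax(x1_hat, 1.0 / math.exp(mu3))
    if model == 4:
        sigma1 = x1_hat * (1.0 - x1_hat)   # Bernoulli variance at the first level
        return softmax(4.0 * sigma1, beta)
    raise ValueError("model must be 1, 2, 3, or 4")


# Model 2: lower second-level uncertainty gives a more deterministic response.
p_certain = response_probability(2, 0.7, sigma2=0.1)
p_uncertain = response_probability(2, 0.7, sigma2=2.0)
# Model 4: maximal first-level uncertainty (x1_hat = 0.5) gives the highest response probability.
p_model4_max = response_probability(4, 0.5, beta=2.0)
```

Models 1 to 3 share the same argument and differ only in where β comes from, whereas Model 4 keeps β free and changes the argument of the softmax.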

**Figure 4. Overview of the 5 core models tested.** Models 1–4 are different types of the HGF that differ in the response model used, whereas Model 5 is a classical Rescorla–Wagner model with a standard softmax response function.

Using a fixed value for β as in Model 1 is the standard formulation by which computational models employ the softmax function. By contrast, in Models 2 and 3, β turns into a dynamic quantity, which depends on trial-wise estimates of uncertainty (about the estimated winning probability in Model 2, and log-volatility in Model 3). This model space is motivated by the following rationale: both σ_{2} and *exp*(μ_{3}) encode aspects of uncertainty in the perceptual model, and higher uncertainty about one's estimates should lead to less reliance on one's beliefs when making a decision, i.e., more exploratory behavior. This would be expressed by a gentler slope of the softmax curve and a lower β-value. Conversely, low uncertainty about one's estimates should map onto a steeper, more deterministic softmax curve, which corresponds to a higher β-value. For this reason, σ_{2} and *exp*(μ_{3}) both enter the decision model as inverses. Finally, Model 4 was inspired by a response model in Vossel et al. (2014) but formulated slightly differently. Here, we imagine an agent who is more sensation-seeking, erratic, and risk-taking the higher his trial-wise uncertainty about winning probability. This uncertainty corresponds to the variance of a Bernoulli distribution at the first level of the perceptual model and takes a maximum value of 0.25 for $\hat{x}_1$ = 0.5. Using a scaling factor of 4 ensures that this argument enters the softmax appropriately, such that the maximum leads to the greatest probability of eliciting a response.
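The scaling in Model 4 can be checked in one step, using only the Bernoulli variance mentioned above:

$$\sigma_1 = \hat{x}_1\left(1 - \hat{x}_1\right) \le \tfrac{1}{4}, \quad \text{with equality at } \hat{x}_1 = \tfrac{1}{2}, \qquad \text{so that} \quad 0 \le 4\sigma_1 \le 1 .$$

The scaled uncertainty therefore covers the same unit interval as a probability, which is the range over which the softmax argument is defined in Models 1–3.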

All HGF analyses were done in the context of the HGF Toolbox Version 2.1 which is freely available as part of the open source software package TAPAS (http://www.translationalneuromodeling.org/tapas/).

#### Reinforcement learning model

While hierarchical Bayesian learning is an appealing model to describe belief updating during gambling, we need to evaluate its suitability in comparison to simpler (non-hierarchical) models (Model 5, Figure 4). In particular, this includes RL models which have found application in some analyses of gambling tasks (e.g., Oya et al., 2005; Kafidindi and Bowman, 2007). Having said this, we are not aware of any RL analyses of trial-wise data from casino slot machine gambling. Here, we focus on one of the most generic and widely used RL models, the Rescorla–Wagner (RW) learning model (Rescorla and Wagner, 1972).

The RW model is a trial-wise learning model, originally developed for estimating associative learning mechanisms in conditioning. It is also frequently used in a reduced form, for example, for estimating on-line the probability of a trial-wise outcome; this is the form we use here. Updates are governed by prediction errors, scaled by a fixed learning rate:

where *V* is the estimate of probability (of a specific outcome in trial *k*), α is known as the learning rate, and λ is the actual outcome.
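In its reduced form, the update described above reads V^{(k+1)} = V^{(k)} + α(λ^{(k)} − V^{(k)}). A minimal Python sketch is given below; the initial value of 0.5 and the function name are assumptions made for illustration.

```python
def rescorla_wagner(outcomes, alpha, v0=0.5):
    """Trial-wise Rescorla-Wagner estimate of outcome probability.

    outcomes : sequence of binary outcomes (lambda; 1 = win, 0 = loss)
    alpha    : fixed learning rate between 0 and 1
    v0       : initial estimate (assumed to be 0.5 here)
    Returns the prediction held before each trial and the final estimate.
    """
    v = v0
    predictions = []
    for lam in outcomes:
        predictions.append(v)          # prediction before the outcome is seen
        v = v + alpha * (lam - v)      # update by the scaled prediction error
    return predictions, v


predictions, final_v = rescorla_wagner([1, 1, 0], alpha=0.5)
```

Unlike the HGF, the learning rate α stays constant across trials, so the size of an update depends only on the prediction error and not on the agent's current uncertainty.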

Here, we use the RW model as a perceptual model and combine it with a standard softmax function with a free parameter β encoding decision noise (see Model 1 above).

### Model Selection

In this paper, we adopt a two-stage model selection procedure that evaluates different models with regard to two things: how well a model explains a given set of perceptual and response data features (step 1—BMS), and for which of these different data features the parameters of an optimal model best predict an external measure of impulsivity (step 2—construct validity).

#### Model selection stage 1—Bayesian model comparison

As described above, first, we consider five different “core models” (Figure 4), each of which combines a particular perceptual and a particular response model. These five core models are inverted using 12 different sets of data features, which result from combining three alternative perceptual variables with four alternative response variables (Table 4 and Figure 5). The best of the five core models for a given dataset is selected via Bayesian model comparison. This rests on the log evidence, a principled index of a model's trade-off between fit and complexity (MacKay, 2003). Critically, BMS implementations exist which can deal with heterogeneity across subjects and enable proper random effects group-level inference (Stephan et al., 2009).
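As an illustration of random effects BMS, the following Python sketch implements the variational scheme described by Stephan et al. (2009), which treats model frequencies in the population as Dirichlet-distributed. The log evidence values are hypothetical, the function names are illustrative, and the digamma function is approximated in plain Python so that the sketch has no external dependencies; it is not the implementation used for the present analyses.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence followed by an asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def random_effects_bms(log_evidence, alpha0=1.0, tol=1e-6, max_iter=1000):
    """Variational random-effects BMS over a subjects-by-models table of log evidences.

    Returns the Dirichlet parameters and the expected model probabilities.
    """
    n_models = len(log_evidence[0])
    alpha = [alpha0] * n_models
    for _ in range(max_iter):
        psi_sum = digamma(sum(alpha))
        counts = [0.0] * n_models
        for row in log_evidence:
            # Unnormalized log posterior that this subject's data came from each model.
            log_u = [lme + digamma(a) - psi_sum for lme, a in zip(row, alpha)]
            shift = max(log_u)                       # for numerical stability
            u = [math.exp(v - shift) for v in log_u]
            total = sum(u)
            for k in range(n_models):
                counts[k] += u[k] / total
        new_alpha = [alpha0 + c for c in counts]
        change = max(abs(a - b) for a, b in zip(alpha, new_alpha))
        alpha = new_alpha
        if change < tol:
            break
    total_alpha = sum(alpha)
    return alpha, [a / total_alpha for a in alpha]


# Hypothetical log evidences (e.g., negative free energies): 4 subjects x 2 models.
lme = [[-100.0, -105.0], [-98.0, -104.0], [-110.0, -109.0], [-95.0, -101.0]]
alpha, expected_r = random_effects_bms(lme)
```

Because each subject contributes a posterior probability over models rather than a raw log evidence, a single subject with an extreme log evidence difference cannot dominate the group result, which is the motivation for random effects inference under heterogeneity.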

**Figure 5. Model selection stage 1.** For each pairing of a perceptual variable with a response variable, Bayesian model comparison was performed, yielding an optimal model (penumbra) for this combination of perceptual/response data features. This optimal model then entered stage 2 (construct validation). **Model selection stage 2**. To determine the best model with respect to an external measure of impulsivity, we regressed individual BIS-11 scores on model parameter estimates from the 12 models (one for each pair of perceptual variable and response variable) provided by model selection stage 1. The winning model is picked using a BIC comparison across regression models, to account for differing model complexities. BIS-11, Barratt Impulsiveness Scale; WLN, Net Win/Loss; BI, bet increase; DU, double-up.

The approach we employ in the present analyses is that of approximating the log evidence by negative free energy. The free energy is an upper bound approximation to the agent's surprise about seeing the data and, in contrast to the log-evidence which is analytically intractable for all but the simplest models, can be computed as part of model inversion by means of variational Bayesian (VB) optimization (see Mathys et al., 2011, for the procedure used by the HGF). For further details on model comparison using free energy, please see Penny (2012) and Stephan et al. (2009).
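The bound can be stated explicitly. Writing *F* for the free energy, *q*(θ) for the approximate posterior over parameters, and KL for the Kullback–Leibler divergence (standard variational Bayes notation, introduced here for illustration):

$$\ln p(y \mid m) = -F + \mathrm{KL}\!\left[\,q(\theta)\,\|\,p(\theta \mid y, m)\,\right] \;\ge\; -F$$

Since the divergence is non-negative, the negative free energy is a lower bound on the log evidence, and the bound becomes tight as the approximate posterior approaches the true posterior.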

#### Model selection stage 2—construct validation against external criteria

Having selected an optimal model for each of the 12 sets of data features, we can evaluate the models' construct validity, i.e., how well they predict an external measure of impulsivity. For this purpose, we use the independent questionnaire scores of impulsivity (BIS-11) and perform multiple regression analyses on the model parameter estimates. In this case of competing predictions based on multiple regression models, potential differences in model complexity (due to differences in the number of generative model parameters and thus number of resulting regressors) can be corrected using the BIC. The significance of the ensuing best prediction is adjusted for multiple tests using Bonferroni correction.
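The second stage can be sketched in plain Python as an ordinary least-squares regression followed by a BIC comparison. The data below are hypothetical, the function names are illustrative, and the BIC is computed in the common Gaussian-error form n·ln(RSS/n) + k·ln(n) with constants dropped, which is an assumption about the convention rather than a statement of the exact formula used here.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols_bic(predictors, y):
    """Regress y on the predictors plus an intercept; return (coefficients, R^2, BIC)."""
    n = len(y)
    X = [[1.0] + list(row) for row in predictors]
    k = len(X[0])                      # number of coefficients, including the intercept
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, row)) for row in X]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)
    bic = n * math.log(rss / n) + k * math.log(n)
    return coefs, 1.0 - rss / tss, bic


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical parameter estimates and questionnaire scores for six subjects.
omega = [-5.0, -4.5, -4.0, -3.5, -3.0, -2.5]
theta = [0.04, 0.06, 0.05, 0.05, 0.06, 0.04]
bis = [55.0, 60.0, 61.0, 66.0, 68.0, 74.0]

_, r2_full, bic_full = ols_bic(list(zip(omega, theta)), bis)
_, r2_theta, bic_theta = ols_bic([[t] for t in theta], bis)
```

The regression model with the lower BIC is preferred; the ln(n) penalty per coefficient is what allows regression models with different numbers of regressors to be compared on an equal footing.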

## Results

### Impulsivity Expressed in Slot Machine Gambling Behavior

We used a multiple regression analysis to predict impulsivity from the behavioral read-outs of the slot machine game (BI%, DU%, MS%, CS%). Together, these behavioral measures explained 32% of the variance in the individuals' BIS-11 scores [*F*_{(3, 46)} = 4.72, *p* < 0.001]. Individually, only BI percentage was significant (see Table 3); this also survived Bonferroni correction.

### Computational Modeling

#### Model selection stage 1—Bayesian model comparison

In order to determine which of the five core models (Figure 4) best explained each of the 12 different data feature sets, resulting from combining three perceptual variables (WLG—treating real wins and fake wins as wins; WLN—treating only real wins as wins; OL—treating real wins, fake wins, and near-misses as wins; see Table 2) with four response variables ({BI}, {BI, DU}, {BI, DU, CS}, {BI, DU, CS, MS}), we used BMS. This enabled us to identify, for each of the perceptual-response feature sets, the model with the highest posterior probability (Figure 6). These selected 12 models then entered a second stage of model comparison, where we examined the construct validity of these models by testing how well their parameter estimates predicted the independent BIS-11 scores.

**Figure 6. Summary of model comparison results across all 12 classes of 5 models each (i.e., 5 optimal core models for each perceptual variable/response variable pairing).** The posterior expectation of model probability, obtained from a random effects Bayesian model selection procedure, is plotted on the y-axis. The perceptual variables span the x-axis; the response variables span the y-axis. WLG, Win/Loss Gross; WLN, Win/Loss Net; OL, Overlearn; BI, bet increase; DU, double-up; CS, casino switch; MS, machine switch; M1–M4 are the HGF core models listed in the Response Model section. RW, Rescorla–Wagner Model. The models tested in Table 4 are indicated with an asterisk. From the analysis presented, core model 2 (an HGF with second-level uncertainty instructing the decision noise in the response model) trained on perceptual variable WLN and response variable {BI, DU, CS, MS} best explains the BIS-11 scores and is the winning generative model of impulsivity.

#### Model selection stage 2—external validation

To determine which of these 12 selected computational models best predicts impulsivity, we used multiple regression to test how well their parameter estimates predicted the BIS-11 scores (Figure 5, Table 4). As the computational models vary in the number of free parameters (e.g., some do not include a free parameter for decision temperature β) and thus the regression models differ in the number of regressors, we use the BIC to compare the regression models. We find that the core model 2 (an HGF with second-level uncertainty governing decision noise in the response model; Figure 4) with perceptual variable WLN and response variable {BI, DU, CS, MS} best explains the BIS-11 scores (highlighted in dark gray in Table 4). Two other variants of core model 2 had a similar but slightly smaller BIC value (light gray in Table 4).

Together, the model parameters of the winning model explained 28% of the variance in the individuals' BIS-11 scores [*F*_{(1, 46)} = 8.66, *p* = 0.0007, *R*^{2} = 0.28]. Note that this model-based prediction of BIS-11 scores is significant, even after Bonferroni correction.

#### HGF parameter estimates

In a next step, we used the parameter estimates of the winning model (core model 2, perceptual variable: WLN, response variable: {BI, DU, MS, CS}) and examined their relation to the subscales of the BIS-11, the SPSRQ, and the behavioral readouts of the game using linear regression (see Table 5 for details). Across subjects, the average posterior means of the parameters (±SD) were: ω: −4.08 ± 1.36; ϑ: 0.05 ± 0.01. Together, the model parameter estimates explained 29% of the variance in Non-planning subscale scores [*F*_{(1, 46)} = 8.61, *p* < 0.006] and 15% of the variance in Motor Impulsiveness subscale scores [*F*_{(1, 46)} = 3.86, *p* < 0.029]; the latter, however, did not survive Bonferroni correction. *Post-hoc t*-tests showed that both ω [*t*_{(47)} = 2.98, *p* < 0.004] and ϑ [*t*_{(47)} = 2.10, *p* < 0.04] contribute to predicting Non-planning Impulsiveness. The model parameter estimates also predict sensitivity to reward, as measured by the SPSRQ subscale [*F*_{(1, 46)} = 3.72, *p* < 0.032]; however, again this did not survive Bonferroni correction. We did not find a relation between the model parameters and Sensitivity to Punishment (SP) (Table 5).

Finally, the model parameters (ω and ϑ) predicted all main behavioral readouts from our paradigm (BI, MS, CS, and DU percentage), which survived Bonferroni correction for CS, MS, and DU (Table 5). *Post-hoc t*-tests showed that there is a significant linear relationship between CS and ω [*t*_{(47)} = 7.22, *p* < 0.001], but not between CS and ϑ [*t*_{(47)} = 0.58, *p* < 0.56].

## Discussion

This study aimed to evaluate the utility of computational modeling in characterizing slot-machine gambling behavior under realistic conditions and establish construct validity in relation to standard questionnaire measures of impulsivity. To this end, we created a naturalistic slot-machine paradigm to accrue realistic behavioral readouts from a group of healthy subjects and used a hierarchical Bayesian model of individual learning and decision-making to model the paradigm outputs.

The task builds upon previous research using slot machine tasks to explore gambling (e.g., Shao et al., 2013), but adds various degrees of freedom and realism, such as increasing the amount of the bet placed, machine switching, casino visits, and the DU option. Overall, we find that impulsivity as measured by the BIS-11 score was significantly related to an exploration of these game features. Impulsive subjects showed a stronger tendency to increase their bet size, switch between machines and casinos, and engage in the double-up option; an example of such a player is shown in Figure 2. The predictive importance of BIs for impulsivity (Table 3) is in line with studies on online gambling showing that gamblers with the highest levels of gambling severity exhibit the largest variance in their bet behavior (Adami et al., 2013). While the other gambling options were not correlated with BIS-11 scores when considering the “raw” behavioral data, our modeling results suggest that they jointly predict BIS-11 scores better than BIs alone (Table 4).

The mechanistic model aims at formalizing how humans solve the task at hand on a computational level. It relates potential beliefs and their evolution over time to behavioral choices. Variability across individuals within this process is captured by subject specific parameter estimates that can then be related to traits of the individual like impulsivity. To unearth this hidden information, the models have to consider (i) what information of the game players are using in order to infer their chances of winning on a trial by trial basis (perceptual variables), (ii) how they update their beliefs over time and express these beliefs through actions (core models, each representing a particular combination of perceptual and response models), and (iii) which aspect of the observed responses should be used for estimating model parameters (response variables). As described above, we use a two-step procedure that combines initial BMS (of the core model best explaining a given data set of perceptual and response variables) with subsequent construct validation through multiple regression (of parameter estimates from this selected core model on BIS-11 scores).

Concerning the first step of our procedure (BMS), Figure 6 shows that the choice of the perceptual variable has a much stronger impact on the posterior probabilities of the models than the choice of the response variable. The reason for this is simply that the perceptual variables differ much more from one another than the response variables do. With regard to the latter, the four response variables are nested in each other and are dominated by the frequent occurrences of BIs. By comparison, CS and MS are less frequent, and their addition to BI does not change the resulting response variable dramatically. In contrast, the perceptual variable changes substantially depending on whether “fake wins” (which constitute 50% of all wins) are considered as wins or losses.

Following this procedure, the model with the highest construct validity is one which assumes that players (i) learn the Net Win/Loss probability of the game, that is, they consider only true monetary wins, (ii) update their beliefs based on their respective uncertainty on a trial by trial basis, and (iii) perform any action in the game (BI, MS, etc.) based on the belief about winning and losing, and its respective uncertainty. It is notable that not all of our optimal core models significantly predict individual BIS-11 scores (Table 4); that is, the selection of the most informative trial-wise perceptual and response variables is crucial for predicting impulsivity by computational models.

The parameter estimates from the optimal model (ω and ϑ) significantly explain individuals' total BIS-11 score (Table 5). However, it should not be overlooked that these are in-sample predictions and the effect size estimates (i.e., *R*^{2}) presented here are thus likely to be optimistic. We will address this issue in future studies with larger samples which enable out-of-sample predictions. Furthermore, when interpreting the present results, one should keep in mind that these analyses were performed in healthy volunteers who show only moderate variability with respect to BIS-11 scores (see Table 1). While this limited variance in a healthy population poses an even harder problem for statistical predictions than dealing with a highly variable population, there is no guarantee that the mechanisms highlighted by our model-based analyses will extrapolate to pathological gamblers. Instead, it is possible that qualitatively different mechanisms operate during pathological as compared to recreational gambling. This would be signaled by a different outcome of our model comparisons and will be examined in future studies with patients. Notwithstanding these caveats, the present study is important because it suggests a novel two-step modeling procedure for slot machine gambling data, and it provides concrete suggestions of which data features in slot machine gambling may be most useful for future studies.

Several of the competing models also successfully predict the BIS-11. The three leading models (highlighted in gray in Table 4), however, all share the same core model structure, in which noisy decision making is a function of perceptual uncertainty (Model 2) as well as the same perceptual input (Net Win/Loss). The models differ solely in the response variables they predict. One potential cause for that could be that some of the behavioral readouts (like CS) were relatively sparse and contributed less to the individuals' variance in gambling behavior. Finally, a variant of model 1 (which contains an additional free parameter compared to model 2) also significantly predicts BIS-11, but with a worse BIC score. Interestingly, this is the only predictive model which rests on WLG (learning from fake and true wins) as perceptual variable and a constant decision noise that is independent of the current uncertainty. Thus, this model variant might capture a general bias toward reward-related processing and behavior.

We found that subjects based their decisions on the Net Win/Loss probability, considering only real wins as wins and treating fake wins and near-misses as losses. This is not in line with an earlier study (Jensen et al., 2013) which found that subjects' estimate of winning probability increased in games with a higher number of fake wins. However, this previous study compared two slot-machines which not only differed in the number of fake wins, but also in the number of wheels (3 vs. 6), altering both game difficulty as well as the visual appearance of wins and fake wins. In our slot machine paradigm, wins and fake wins were of constant appearance; in addition subjects were informed about win magnitude after each trial, thus facilitating the distinction between real and fake wins.

Altogether, our modeling results emphasize that uncertainty plays two important roles in gambling. First, the model parameters which jointly predict BIS-11 scores (ω and ϑ) both encode aspects of uncertainty. ω represents a fixed, subject-specific tendency to change beliefs about winning probability (i.e., variance at the second level of the HGF), while ϑ determines the fluctuations of log-volatility (at the third level of the HGF) and thus the dynamic component of volatility on belief updating about winning probability. Second, the optimal response model captures a direct influence of this belief uncertainty on the individual's decision process in that decision noise is modulated by trial-wise uncertainty about winning probability. That is, the more uncertain a subject is whether he will win on the next trial the less his actions will be informed by his a priori beliefs, leading to seemingly more random behavior.

Encoding of uncertainty has previously been linked to an individual's impulsivity (Averbeck et al., 2013). While our study finds that both parameters described above jointly predict the BIS-11 score (Table 5), the different aspects of uncertainty captured by ω and ϑ have been shown to influence different parts of the brain (Iglesias et al., 2013) and may thus have differential effects on the expression of impulsivity. Indeed, our behavioral modeling analyses find a closer relation between individuals' impulsivity and the ω parameter of the model. That is, for our particular paradigm and healthy volunteers, uncertainty about winning probability appears to be more strongly related to impulsivity than the prior belief about volatility. Intriguingly, this link is particularly strong for the Non-planning subscale of the BIS-11 (Table 5), suggesting that uncertainty about favorable outcomes might be a key factor in producing the lack of forethought measured by the BIS-11 (Barratt, 1985).

We used the BIS and the SPSRQ as established measures of the impulsive traits to establish construct validity of our approach (Patton et al., 1995; Torrubia et al., 2001). Having said this, the accuracy of these questionnaire-based assessments suffers from a number of limitations. PG has a high co-morbidity with mood disorders and depression, both of which tend to overshadow gambling habits and their subsequent symptoms, and may thereby cause distorted self-reports (Allcock and Grace, 1988; Black and Moyer, 1998). Further bias stems from patients lacking the requisite capacity for self-reflection (Wilson and Dunn, 2004). It has thus been suggested that interactive, computer-based neuropsychological tests provide more reliable measures of impulsivity (Kertzman et al., 2006; Chamberlain and Sahakian, 2007). Combining such tasks with a computational model of impulsivity in a naturalistic gambling setting may allow us to go even further.

Four advantages of a computational approach to such problems are particularly worth mentioning. First, computational models (i) can provide interpretations of traits like impulsivity by replacing the more descriptive nature of questionnaires with more mechanistic descriptions of how players update their beliefs during gambling and transform these into choices. In our case, this is done by establishing a link between the individuals' uncertainty about winning and losing and the resulting increase in more erratic and riskier responses. Furthermore, computational models can (ii) assess the degree of impulsiveness during actual gambling, without any need for potentially distorted self-reports, and (iii) they allow us to generate not only the response traces observed in our subjects, but also candidate response traces that reflect extreme cases of impulsive behavior. Such traces could help to identify patterns in gambling data that earmark potential problem gamblers. This approach is therefore particularly interesting for prevention with respect to online gambling. After having established a clear link between impulsivity and problem gambling, models of naturalistic play could assess the individual's impulsiveness “on the fly” and identify potential at-risk players without the need for a self-report outside the actual gambling situation. Finally, (iv) the trial-wise traces of beliefs and uncertainties, inferred by a model, can serve to inform analyses of neurophysiological or fMRI data (for examples using the HGF, see Iglesias et al., 2013; Vossel et al., 2014), opening new avenues for neuroimaging research on gambling.

## Summary and Outlook

The hierarchical Bayesian modeling approach presented here is capable of revealing cognitive mechanisms in gambling that are linked to traditionally defined impulsive traits of the individual. In particular, the gambling behavior of more impulsive subjects is best described by models that encode greater uncertainty at various levels of their hierarchy and that show uncertainty-dependent coupling between beliefs about winning and subsequent decisions.

Our analyses provide a proof of concept that individual heterogeneity in gambling behavior can be quantified by computational models, enabling a mechanistic interpretation of individual gambling. Future research will have to assess the generalizability and practical utility of this approach in predicting disordered gambling behavior in various gambling settings such as online gambling.

## Conflict of Interest Statement

The Review Editor Dr. Harriet Brown declares that, despite being affiliated to the same institution as author Prof. Klaas E. Stephan, the review process was handled objectively and no conflict of interest exists. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Acknowledgments

We would like to thank Carolin Wolters for assisting in data collection and the volunteers who tested and provided much-valued feedback about the paradigm. We also sincerely thank Christoph Mathys for very helpful guidance and feedback. This study was supported by the René and Susanne Braginsky Foundation (Klaas E. Stephan), by the Max Planck Society (Marc Tittgemeyer), and by the German Research Foundation [Clinical Research Group 219 (Marc Tittgemeyer, Anna Katharina Schmitz) and the Transregional Collaborative Research Center 134 (Marc Tittgemeyer, Klaas E. Stephan)].

## References

Adami, N., Benini, S., Boschetti, A., Canini, L., Maione, F., and Temporin, M. (2013). Markers of unsustainable gambling for early detection of at-risk online gamblers. *Int. Gambl. Stud*. 13, 188–204. doi: 10.1080/14459795.2012.754919

Alessi, S., and Petry, N. (2003). Pathological gambling severity is associated with impulsivity in a delay discounting procedure. *Behav. Processes* 64, 345–354. doi: 10.1016/S0376-6357(03)00150-5

Allcock, C. C., and Grace, D. M. (1988). Pathological gamblers are neither impulsive nor sensation-seekers. *Aust. N. Z. J. Psychiatry* 22, 307–311. doi: 10.3109/00048678809161212

APA. (2013). *Diagnostic and Statistical Manual of Mental Disorders, 5th Edn*. Arlington, VA: American Psychiatric Publishing.

Averbeck, B. B., Djamshidian, A., O'Sullivan, S. S., Housden, C. R., Roiser, J. P., and Lees, A. J. (2013). Uncertainty about mapping future actions into rewards may underlie performance on multiple measures of impulsivity in behavioral addiction: evidence from Parkinson's disease. *Behav. Neurosci*. 127, 245–255. doi: 10.1037/a0032079

Barboianu, C. (2013). Mathematician's call for interdisciplinary research effort. *Int. Gambl. Stud*. 13, 1–4. doi: 10.1080/14459795.2013.837087

Barratt, E. S. (1985). “Impulsiveness subtraits: arousal and information processing,” in *Motivation, Emotion and Personality*, eds J. T. Spence and C. E. Izard (North Holland: Elsevier Science Publishers), 137–146.

Billieux, J., Van der Linden, M., Khazaal, Y., Zullino, D., and Clark, L. (2012). Trait gambling cognitions predict near-miss experiences and persistence in laboratory slot machine gambling. *Br. J. Psychol*. 103, 412–427. doi: 10.1111/j.2044-8295.2011.02083.x

Black, D. W., and Moyer, T. (1998). Clinical features and psychiatric comorbidity of subjects with pathological gambling behavior. *Psychiatr. Serv*. 49, 1434–1439.

Blanco, C., Potenza, M. N., Kim, S. W., Ibáñez, A., Zaninelli, R., Saiz-Ruiz, J., et al. (2009). A pilot study of impulsivity and compulsivity in pathological gambling. *Psychiatry Res*. 167, 161–168. doi: 10.1016/j.psychres.2008.04.023

Bland, A. R., and Schaefer, A. (2012). Different varieties of uncertainty in human decision-making. *Front. Neurosci*. 6:85. doi: 10.3389/fnins.2012.00085

Brewer, J. A., and Potenza, M. N. (2008). The neurobiology and genetics of impulse control disorders: relationships to drug addictions. *Biochem. Pharmacol*. 75, 63–75. doi: 10.1016/j.bcp.2007.06.043

Chamberlain, S. R., and Sahakian, B. J. (2007). The neuropsychiatry of impulsivity. *Curr. Opin. Psychiatry* 20, 255–261. doi: 10.1097/YCO.0b013e3280ba4989

Clark, L., Crooks, B., Clarke, R., Aitken, M. R., and Dunn, B. D. (2012). Physiological responses to near-miss outcomes and personal control during simulated gambling. *J. Gambl. Stud*. 28, 123–137. doi: 10.1007/s10899-011-9247-z

Clark, L., Lawrence, A. J., Astley-Jones, F., and Gray, N. (2009). Gambling near-misses enhance motivation to gamble and recruit win-related brain circuitry. *Neuron* 61, 481–490. doi: 10.1016/j.neuron.2008.12.031

Daunizeau, J., Den Ouden, H. E., Pessiglione, M., Kiebel, S. J., Stephan, K. E., and Friston, K. J. (2010). Observing the observer (I): meta-Bayesian models of learning and decision-making. *PLoS ONE* 5:e15554. doi: 10.1371/journal.pone.0015554

de Wit, H. (2009). Impulsivity as a determinant and consequence of drug use: a review of underlying processes. *Addict. Biol*. 14, 22–31. doi: 10.1111/j.1369-1600.2008.00129.x

Dickman, S. J. (1993). *Impulsivity and Information Processing*. Washington, DC: American Psychological Association.

Dowling, N., Smith, D., and Thomas, T. (2005). Electronic gaming machines: are they the “crack-cocaine” of gambling? *Addiction* 100, 33–45. doi: 10.1111/j.1360-0443.2005.00962.x

Eysenck, S. B., and Eysenck, H. J. (1977). The place of impulsiveness in a dimensional system of personality description. *Br. J. Soc. Clin. Psychol*. 16, 57–68. doi: 10.1111/j.2044-8260.1977.tb01003.x

Friston, K. J., Daunizeau, J., Kilner, J., and Kiebel, S. J. (2010). Action and behavior: a free-energy formulation. *Biol. Cybern*. 102, 227–260. doi: 10.1007/s00422-010-0364-z

Fuentes, D., Tavares, H., Artes, R., and Gorenstein, C. (2006). Self-reported and neuropsychological measures of impulsivity in pathological gambling. *J. Int. Neuropsychol. Soc*. 12, 907–912. doi: 10.1017/S1355617706061091

Gaming Laboratories International. (2011). GLI-11: gaming devices in casinos. Available online at: http://www.gaminglabs.com/downloads/GLI%20Standards/Bill%20E%202011/GLI-11%20v2.1.pdf

Gershman, S. J., and Niv, Y. (2010). Learning latent structure: carving nature at its joints. *Curr. Opin. Neurobiol*. 20, 251–256. doi: 10.1016/j.conb.2010.02.008

Gobet, F., and Schiller, M. (2011). “A manifesto for cognitive models of problem gambling,” in *European Perspectives on Cognitive Sciences–Proceedings of the European Conference on Cognitive Science* (Sofia: New Bulgarian University Press).

Goudriaan, A. E., Oosterlaan, J., de Beurs, E., and van den Brink, W. (2005). Decision making in pathological gambling: a comparison between pathological gamblers, alcohol dependents, persons with Tourette syndrome, and normal controls. *Cogn. Brain Res*. 23, 137–151. doi: 10.1016/j.cogbrainres.2005.01.017

Guerrieri, R., Nederkoorn, C., and Jansen, A. (2008). The effect of an impulsive personality on overeating and obesity: current state of affairs. *Psihologijske Teme* 17, 265–286.

Holden, C. (2010). Behavioral addictions debut in proposed DSM-V. *Science* 327, 935–935. doi: 10.1126/science.327.5968.935

Huys, Q. J., Eshel, N., O'Nions, E., Sheridan, L., Dayan, P., and Roiser, J. P. (2012). Bonsai trees in your head: how the Pavlovian system sculpts goal-directed choices by pruning decision trees. *PLoS Comput. Biol*. 8:e1002410. doi: 10.1371/journal.pcbi.1002410

Iglesias, S., Mathys, C., Brodersen, K. H., Kasper, L., Piccirelli, M., den Ouden, H. E. M., et al. (2013). Hierarchical prediction errors in midbrain and basal forebrain during sensory learning. *Neuron* 80, 519–530. doi: 10.1016/j.neuron.2013.09.009

Jensen, C., Dixon, M. J., Harrigan, K. A., Sheepy, E., Fugelsang, J. A., and Jarick, M. (2013). Misinterpreting ‘winning’ in multiline slot machine games. *Int. Gambl. Stud*. 13, 112–126. doi: 10.1080/14459795.2012.717635

Kalidindi, K., and Bowman, H. (2007). Using epsilon-greedy reinforcement learning methods to further understand ventromedial prefrontal patients' deficits on the Iowa gambling task. *Neural Netw*. 20, 676–689. doi: 10.1016/j.neunet.2007.04.026

Kertzman, S., Grinspan, H., Birger, M., and Kotler, M. (2006). Computerized neuropsychological examination of impulsiveness: a selective review. *Isr. J. Psychiatry Relat. Sci*. 43, 74–80. doi: 10.1590/S1516-44462009000100003

Knill, D. C., and Pouget, A. (2004). The Bayesian brain: the role of uncertainty in neural coding and computation. *Trends Neurosci*. 27, 712–719. doi: 10.1016/j.tins.2004.10.007

Krueger, T. H., Schedlowski, M., and Meyer, G. (2005). Cortisol and heart rate measures during casino gambling in relation to impulsivity. *Neuropsychobiology* 52, 206–211. doi: 10.1159/000089004

Lawrence, A. J., Luty, J., Bogdan, N. A., Sahakian, B. J., and Clark, L. (2009). Impulsivity and response inhibition in alcohol dependence and problem gambling. *Psychopharmacology* 207, 163–172. doi: 10.1007/s00213-009-1645-x

Leeman, R. F., Hoff, R. A., Krishnan-Sarin, S., Patock-Peckham, J. A., and Potenza, M. N. (2014). Impulsivity, sensation-seeking, and part-time job status in relation to substance use and gambling in adolescents. *J. Adolesc. Health* 54, 460–466. doi: 10.1016/j.jadohealth.2013.09.014

Lesieur, H. R., and Blume, S. B. (1987). The South Oaks Gambling Screen (SOGS): a new instrument for the identification of pathological gamblers. *Am. J. Psychiatry* 144, 1184–1188.

Ligneul, R., Sescousse, G., Barbalat, G., Domenech, P., and Dreher, J. C. (2012). Shifted risk preferences in pathological gambling. *Psychol. Med*. 43, 1059–1068. doi: 10.1017/S0033291712001900

Lund, I. (2009). Gambling behavior and the prevalence of gambling problems in adult EGM gamblers when EGMs are banned: a natural experiment. *J. Gambl. Stud*. 25, 215–225. doi: 10.1007/s10899-009-9127-y

MacKay, D. J. (2003). *Information Theory, Inference and Learning Algorithms*. Cambridge, UK: Cambridge University Press.

Mathys, C., Daunizeau, J., Friston, K. J., and Stephan, K. E. (2011). A Bayesian foundation for individual learning under uncertainty. *Front. Hum. Neurosci*. 5:39. doi: 10.3389/fnhum.2011.00039

McGuire, J. T., and Kable, J. W. (2013). Rational temporal predictions can underlie apparent failures to delay gratification. *Psychol. Rev*. 120, 395. doi: 10.1037/a0031910

Michalczuk, R., Bowden-Jones, H., Verdejo-Garcia, A., and Clark, L. (2011). Impulsivity and cognitive distortions in pathological gamblers attending the UK national problem gambling clinic: a preliminary report. *Psychol. Med*. 41, 2625–2635. doi: 10.1017/S003329171100095X

Miedl, S. F., Peters, J., and Büchel, C. (2012). Altered neural reward representations in pathological gamblers revealed by delay and probability discounting. *Arch. Gen. Psychiatry* 69, 177. doi: 10.1001/archgenpsychiatry.2011.1552

Moeller, F. G., Barratt, E. S., Dougherty, D. M., Schmitz, J. M., and Swann, A. C. (2001). Psychiatric aspects of impulsivity. *Am. J. Psychiatry* 158, 1783–1793. doi: 10.1176/appi.ajp.158.11.1783

Montague, P. R., Dolan, R. J., Friston, K. J., and Dayan, P. (2012). Computational psychiatry. *Trends Cogn. Sci*. 16, 72–80. doi: 10.1016/j.tics.2011.11.018

Monterosso, J., and Ainslie, G. (1999). Beyond discounting: possible experimental models of impulse control. *Psychopharmacology* 146, 339–347. doi: 10.1007/PL00005480

Moutoussis, M., Bentall, R. P., El-Deredy, W., and Dayan, P. (2011). Bayesian modelling of jumping-to-conclusions bias in delusional patients. *Cogn. Neuropsychiatry* 16, 422–447. doi: 10.1080/13546805.2010.548678

Nower, L., and Blaszczynski, A. (2006). Impulsivity and pathological gambling: A descriptive model. *Int. Gambl. Stud*. 6, 61–75. doi: 10.1080/14459790600644192

Oya, H., Adolphs, R., Kawasaki, H., Bechara, A., Damasio, A., and Howard, M. A. (2005). Electrophysiological correlates of reward prediction error recorded in the human prefrontal cortex. *Proc. Natl. Acad. Sci. U.S.A*. 102, 8351–8356. doi: 10.1073/pnas.0500899102

Parke, J., and Griffiths, M. (2006). The psychology of the fruit machine: the role of structural characteristics (revisited). *Int. J. Ment. Health Addict*. 4, 151–179. doi: 10.1007/s11469-006-9014-z

Patton, J. H., Stanford, M. S., and Barratt, E. S. (1995). Factor structure of the Barratt impulsiveness scale. *J. Clin. Psychol*. 51, 768–774.

Penny, W. (2012). Comparing dynamic causal models using AIC, BIC and free energy. *Neuroimage* 59, 319–330. doi: 10.1016/j.neuroimage.2011.07.039

Peters, J., and Büchel, C. (2011). The neural mechanisms of inter-temporal decision-making: understanding variability. *Trends Cogn. Sci*. 15, 227–239. doi: 10.1016/j.tics.2011.03.002

Petry, N. M. (2001). Substance abuse, pathological gambling, and impulsiveness. *Drug Alcohol Depend*. 63, 29–38. doi: 10.1016/S0376-8716(00)00188-5

Rescorla, R. A., and Wagner, A. (1972). “A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement,” in *Classical Conditioning II: Current Research and Theory*, eds A. H. Black and W. F. Prokasy (New York, NY: Appleton-Century-Crofts), 64–99.

Robbins, T. W., Gillan, C. M., Smith, D. G., de Wit, S., and Ersche, K. D. (2012). Neurocognitive endophenotypes of impulsivity and compulsivity: towards dimensional psychiatry. *Trends Cogn. Sci*. 16, 81–91. doi: 10.1016/j.tics.2011.11.009

Rodriguez-Jimenez, R., Avila, C., Jimenez-Arriero, M., Ponce, G., Monasor, R., Jimenez, M., et al. (2006). Impulsivity and sustained attention in pathological gamblers: influence of childhood ADHD history. *J. Gambl. Stud*. 22, 451–461. doi: 10.1007/s10899-006-9028-2

Shao, R., Read, J., Behrens, T., and Rogers, R. (2013). Shifts in reinforcement signalling while playing slot-machines as a function of prior experience and impulsivity. *Transl. Psychiatry* 3, e213. doi: 10.1038/tp.2012.134

Sharma, L., Markon, K. E., and Clark, L. A. (2014). Toward a theory of distinct types of “impulsive” behaviors: a meta-analysis of self-report and behavioral measures. *Psychol. Bull*. 140, 374–408. doi: 10.1037/a0034418

Stanford, M. S., Mathias, C. W., Dougherty, D. M., Lake, S. L., Anderson, N. E., and Patton, J. H. (2009). Fifty years of the Barratt impulsiveness scale: an update and review. *Pers. Individ. Dif*. 47, 385–395. doi: 10.1016/j.paid.2009.04.008

Stein, D. J. (2008). Classifying hypersexual disorders: compulsive, impulsive, and addictive models. *Psychiatr. Clin. North Am*. 31, 587–591. doi: 10.1016/j.psc.2008.06.007

Stephan, K. E., and Mathys, C. (2014). Computational approaches to psychiatry. *Curr. Opin. Neurobiol*. 25, 85–92. doi: 10.1016/j.conb.2013.12.007

Stephan, K. E., Penny, W. D., Daunizeau, J., Moran, R. J., and Friston, K. J. (2009). Bayesian model selection for group studies. *Neuroimage* 46, 1004–1017. doi: 10.1016/j.neuroimage.2009.03.025

Stinchfield, R. (2002). Reliability, validity, and classification accuracy of the South Oaks Gambling Screen (SOGS). *Addict. Behav*. 27, 1–19. doi: 10.1016/S0306-4603(00)00158-1

Tolchard, B., and Battersby, M. W. (1996). “The effect of treatment of pathological gamblers referred to a behavioural psychotherapy unit: II outcome of three kinds of behavioural intervention,” in *7th Annual Conference of the National Association for Gambling Studies* (Adelaide).

Torrubia, R., Ávila, C., Moltó, J., and Caseras, X. (2001). The Sensitivity to Punishment and Sensitivity to Reward Questionnaire (SPSRQ) as a measure of Gray's anxiety and impulsivity dimensions. *Pers. Individ. Dif*. 31, 837–862. doi: 10.1016/S0191-8869(00)00183-5

Vitaro, F., Arseneault, L., and Tremblay, R. E. (1997). Dispositional predictors of problem gambling in male adolescents. *Am. J. Psychiatry* 154, 1769–1770.

Vossel, S., Mathys, C., Daunizeau, J., Bauer, M., Driver, J., Friston, K. J., et al. (2014). Spatial attention, precision and Bayesian inference: a study of saccadic response speed. *Cereb. Cortex* 24, 1436–1450. doi: 10.1093/cercor/bhs418

Wetzels, R., Vandekerckhove, J., Tuerlinckx, F., and Wagenmakers, E.-J. (2010). Bayesian parameter estimation in the expectancy valence model of the Iowa gambling task. *J. Math. Psychol*. 54, 14–27. doi: 10.1016/j.jmp.2008.12.001

Whiteside, S. P., and Lynam, D. R. (2009). Understanding the role of impulsivity and externalizing psychopathology in alcohol abuse. *Pers. Disord. Theory Res. Treat*. 11, 69–79. doi: 10.1037/1949-2715.S.1.69

Keywords: Hierarchical Gaussian Filter, Hierarchical Bayesian Model, Barratt Impulsiveness Scale, impulsivity, pathological gambling

Citation: Paliwal S, Petzschner FH, Schmitz AK, Tittgemeyer M and Stephan KE (2014) A model-based analysis of impulsivity using a slot-machine gambling paradigm. *Front. Hum. Neurosci*. **8**:428. doi: 10.3389/fnhum.2014.00428

Received: 20 February 2014; Accepted: 28 May 2014;

Published online: 03 July 2014.

Edited by:

Harriet Brown, University of Oxford, UK

Reviewed by:

Luke Clark, University of Cambridge, UK

Bojana Kuzmanovic, Research Center Juelich, Germany

Copyright © 2014 Paliwal, Petzschner, Schmitz, Tittgemeyer and Stephan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Saee Paliwal and Frederike H. Petzschner, Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Wilfriedstrasse 6, CH-8032 Zurich, Switzerland e-mail: paliwal@biomed.ee.ethz.ch; petzschner@biomed.ee.ethz.ch

^{†}These authors have contributed equally to this work.