Effects of different feedback types on information integration in repeated monetary gambles

Haffke, Peter; Hübner, Ronald

doi:10.3389/fpsyg.2014.01597

ORIGINAL RESEARCH article

Front. Psychol., 23 January 2015

Sec. Decision Neuroscience

Volume 5 - 2014 | https://doi.org/10.3389/fpsyg.2014.01597

Peter Haffke*

Ronald Hübner

Graduate School of Decision Sciences, Department of Psychology, Universität Konstanz, Konstanz, Germany

Most models of risky decision making assume that all relevant information is taken into account (e.g., von Neumann and Morgenstern, 1944; Kahneman and Tversky, 1979). However, some models suppose that only part of the information is considered (e.g., Brandstätter et al., 2006; Gigerenzer and Gaissmaier, 2011). To further investigate how much information is typically used for decision making, and how this use depends on feedback, we conducted a series of three experiments in which participants chose between two lotteries and in which no feedback, outcome feedback, or error feedback was provided, respectively. The results show that without feedback participants mostly chose the lottery with the higher winning probability and largely ignored the potential gains. The same results occurred when the outcome of each decision was fed back. Only when error feedback was presented (i.e., a signal indicating whether a choice was optimal or not) did participants consider probabilities as well as gains, resulting in more optimal choices. We propose that outcome feedback was ineffective because of its probabilistic and ambiguous nature. Participants improve information integration only if provided with a consistent and deterministic signal such as error feedback.

Introduction

Risky choice behavior is often investigated by analyzing how persons choose between different options or lotteries of a monetary gamble. For these structures, which are defined by the possible outcomes (gains or losses) and associated probabilities, the optimal decision can be obtained by calculating the expected value (EV) for each lottery. Consequently, a decision can be considered optimal if the option with the largest EV is chosen.
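The EV criterion can be illustrated with a minimal sketch. The two lotteries below are hypothetical and are not taken from the stimulus set of the study; the function name `expected_value` is likewise only illustrative.

```python
def expected_value(outcomes):
    """Return the expected value of a lottery.

    `outcomes` is a list of (probability, payoff) pairs whose
    probabilities sum to 1.
    """
    return sum(p * x for p, x in outcomes)


# Hypothetical lotteries (payoffs in Cent); the values are illustrative only.
lottery_a = [(0.8, 100), (0.2, 0)]
lottery_b = [(0.2, 300), (0.8, 0)]

ev_a = expected_value(lottery_a)
ev_b = expected_value(lottery_b)

# The EV-optimal choice is the lottery with the larger expected value.
ev_optimal = "A" if ev_a > ev_b else "B"
print(ev_a, ev_b, ev_optimal)
```

In this example the lottery with the smaller gain is EV-optimal because its higher winning probability more than compensates for the lower payoff.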

Soon after the introduction of such choice problems it became clear that people often do not decide “rationally” in this sense of optimality. Therefore, researchers proposed alternative models of human choice behavior. A first idea was that the monetary gains of a gamble do not necessarily represent their subjective value for the decision maker. Accordingly, utility functions were introduced (Bernoulli, 1954) that transform monetary gains into utilities, i.e., subjective values reflecting the amount of satisfaction the gains will eventually produce. By replacing the monetary values of a gamble with their utilities, as assumed in Expected Utility (EU) theory (von Neumann and Morgenstern, 1944), the expected utility can be computed for each option, and a decision is considered optimal if the option with the highest expected utility is chosen. However, even EU theory could not satisfactorily account for some aspects of human choice behavior. Therefore, in their Prospect Theory (PT), Kahneman and Tversky (1979) assumed, among other things, that the probabilities within a gamble also have to be transformed to represent systematic subjective distortions (e.g., underestimation of small probabilities).

Obviously, these models assume that all relevant information for finding an optimal choice is available. Accordingly, they were mostly tested in so-called description-based decision studies, where fully described gambles are presented once (e.g., Hertwig et al., 2004). For many decisions, however, one rarely knows all relevant facts (e.g., Simon, 1955). Therefore, experience-based decision studies have also been conducted, where gambles, like the Iowa Gambling Task (IGT; Bechara et al., 1994), are described only partially but administered repeatedly. Obviously, in these studies part of the participants' task is to learn and/or infer the defining probabilities and values of a gamble from the feedback of gains and losses.

However, even if the information for an optimal decision is available, as in description-based studies, participants do not necessarily process all the relevant data. Gigerenzer and colleagues, for instance, have shown that decisions are often based on heuristics that take only a fraction of the available information into account (e.g., Brandstätter et al., 2006; Gigerenzer and Gaissmaier, 2011). Nevertheless, it is conceivable that, if fully described gambles are processed repeatedly, information processing and the applied heuristics change with experience. Unfortunately, little is known in this respect, as the focus has been on either description-based gambles (e.g., Brandstätter et al., 2006; Glöckner and Betsch, 2008; Rieskamp, 2008; Fiedler and Glöckner, 2012), experience-based gambles (e.g., Lejuez et al., 2002; Barron and Erev, 2003; Hertwig et al., 2004), or their comparison (e.g., Hertwig et al., 2004; Camilleri and Newell, 2011; Glöckner et al., 2012). Therefore, the aim of the present study was to investigate gambles that combine both characteristics. One question was which decision strategies are applied. In most studies, participants were not informed about the outcome of their single choices, which presumably prevented learning. However, if this information is provided, participants may be encouraged to test different strategies to figure out their effectiveness and eventually maintain the most successful one instead of continuously applying their initially preferred strategy. Because strategy evaluation largely depends on feedback, a further question was to what extent strategies and choices in repeated gambles depend on the type of the provided information. Yechiam and Busemeyer (2006), for example, found that choices were generally less risky when participants were informed about the outcome after each choice. However, different feedback types vary with respect to their information content, so that they presumably also affect the learning of choice behavior differently. In the present study, we tested how outcome feedback and normative error feedback influence choices compared to when feedback is absent.

For investigating choice performance in the present study, we applied a specific version of the Wheel of Fortune task (WOF; Ernst et al., 2004; Smith et al., 2009). In this computerized gamble, participants had to choose one of two lotteries, A and B. In each lottery they could win a certain amount of money x with probability p or nothing (x = 0) with probability 1 – p. Moreover, the winning probabilities of the two competing lotteries added up to 1 (i.e., p_A = 1 – p_B). The probabilities of each lottery were presented as pie charts, and the gains as numbers above the corresponding pies (see Figure 1). Thus, because all relevant information was available, according to the PT, EU, and EV theories, the lottery with the larger attractiveness, expected utility, or expected value should be chosen, respectively. However, there are also heuristics that use only partial information. For instance, the Most-Likely (ML) heuristic prescribes selecting the lottery with the larger winning probability. With respect to the gambles used in our experiments, this rule is equivalent to the Priority Heuristic (PH; Brandstätter et al., 2006). Alternatively, one can choose the lottery with the larger gain, as stated by the Maximax (MM) heuristic.

Figure 1. Experimental procedure of the Wheel of Fortune task with outcome feedback as used in Experiment 2. Participants had to decide whether they wanted to play the lottery with the higher winning probability (left) or the one with the higher potential gain (right).

The aim of the present study was to examine how well these different theories and heuristics account for human choice performance. For this objective, two types of gambles were constructed and presented equally often. In pro-win gambles, the EV-optimal choice (optimal according to EV theory) was associated with the higher winning probability, whereas in pro-gain gambles it corresponded to the higher amount of money (see Table 1). Consequently, a person who applies the ML heuristic (i.e., chooses the lottery with the higher winning probability) performs EV-optimally in pro-win gambles, but not in pro-gain gambles. In contrast, for persons applying the MM heuristic, the opposite would be the case. Thus, choices can be optimal in more than 50% of the trials only if probabilities and gains are combined in some beneficial way.
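The distinction between the two gamble types can be sketched as follows. The function `gamble_type` and the example gains are hypothetical illustrations; they assume, as in the task, that the lottery with the higher winning probability offers the smaller gain and that the two winning probabilities sum to 1.

```python
def gamble_type(p_high, gain_high_p, gain_low_p):
    """Classify a two-lottery gamble as 'pro-win' or 'pro-gain'.

    p_high      -- winning probability of the more likely lottery (> 0.5)
    gain_high_p -- gain of the lottery with the higher winning probability
    gain_low_p  -- gain of the lottery with the lower winning probability
    The lower winning probability is 1 - p_high, as in the task.
    """
    ev_high_p = p_high * gain_high_p
    ev_low_p = (1 - p_high) * gain_low_p
    if ev_high_p > ev_low_p:
        return "pro-win"    # EV favors the higher winning probability
    if ev_high_p < ev_low_p:
        return "pro-gain"   # EV favors the higher gain
    return "equal EV"


# Hypothetical 60:40 gambles with gain differences of 50 and 200 Cent.
type_small_diff = gamble_type(0.6, 200, 250)
type_large_diff = gamble_type(0.6, 100, 300)
print(type_small_diff, type_large_diff)
```

Under these assumptions, a person who always follows ML is EV-optimal exactly on the gambles classified as pro-win, and a person who always follows MM is EV-optimal exactly on those classified as pro-gain.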

Table 1. Overview of the composition of lotteries regarding the presented lottery information as indicator for optimal and non-optimal choices (gamble type).

The two types of gambles also served our goal to examine the dynamics of choice processes. Obviously, the strategies differ in complexity, which suggests that their application differs in mental effort. ML and MM are relatively easy to perform and, therefore, might be executed relatively automatically, whereas EV and PT are based on calculations that require controlled and effortful mental processes. Thus, first of all, we expected that choices based on ML and MM are faster than those relying on EV and PT. Moreover, we hypothesized that individuals apply different strategies across trials. For instance, it is conceivable that automatic and controlled processes compete in a race, as assumed by dual-process models (e.g., Hübner et al., 2010; Mukherjee, 2010). Thus, even if persons intend to integrate the provided information, on some trials, the required slow computational processes might be superseded by a fast automatic process that chooses, for instance, the lottery with the higher probability of winning. A gain-domain specific overweighting of higher probabilities over higher outcomes for lotteries with the same EV has already been observed and called risk-aversion (Tversky and Kahneman, 1981), p-dominance, or probability-dominance (Fiedler and Unkelbach, 2011). Because this dominance is presumably most influential for fast responses, it needs to be suppressed by controlled processes to allow for an integration of values. We therefore expected that choice performance improves with an increasing response time, at least in pro-gain gambles in which information integration is beneficial. It has already been proposed that decision makers adaptively choose strategies from a toolbox (Payne et al., 1993). A race between strategies could also explain why studies in which individuals are classified according to their applied strategy often identify more than one strategy for a single person (e.g., Glöckner, 2009; Davis-Stober and Brown, 2011).

To see whether the observed response times (RTs) indicate a mixture of automatic and controlled processes, we considered conditional choice functions (CCFs). These functions are adapted versions of conditional accuracy functions, which have been applied to analyze perceptual decisions (e.g., Gratton et al., 1992; Hübner and Töbel, 2012). CCFs provide choice proportions as function of RT, and can, as we will demonstrate, give useful insights in the domain of risky choices. For example, if the same strategy is used throughout an experiment, and the speed of the corresponding processes merely varies randomly across trials, then the CCFs should be flat. However, if fast choices are caused by automatic heuristics and slow ones by complex but favorable computations, then the proportion of optimal choices should systematically increase with RT.

Our ideas were tested in three experiments, which differed with respect to the type of feedback that was provided. After observing performance without any feedback (Experiment 1), we provided outcome feedback (Experiment 2) and error feedback (Experiment 3). Additionally, to analyze mean performance and changes in choice behavior within the experiments, we also fitted different decision models to the data and compared their overall performance. Finally, CCFs were computed and analyzed.

Experiment 1

Our first experiment served to collect baseline choice data in a condition without any feedback. The absence of feedback is common in risky-choice experiments, especially in those with a one-shot paper-and-pencil procedure. Here, a choice between lotteries was required repeatedly. Apart from fitting and comparing the performance of different choice strategies, the central focus was on gathering information about the time course with which the presented lottery information is taken into account.

Methods

Participants

A total of 17 participants (12 female), aged between 19 and 58 years (M = 24.5, SD = 8.9), from the Universität Konstanz, participated in the experiment. Participants, recruited via our laboratory's participant database, received either course credit or money at the end of the experiment. They were told that, in addition to a base payment of €5, they could win a certain proportion of €5, depending on their decisions (i.e., the money actually won across all trials as a proportion of the maximum possible amount)¹.

Material and procedure

The task was a specific version of the Wheel of Fortune task (WOF; Ernst et al., 2004; Smith et al., 2009). Each gamble had one of two combinations of winning probabilities: 60:40 or 80:20. The first combination, for example, means that one lottery had a 60% chance of winning a certain amount of money and a 40% chance of winning nothing. The competing lottery had a 40% chance of winning a certain amount of money and a 60% chance of winning nothing. The two probabilities were represented by two pie charts. As shown in Figure 1, the colored (blue and orange) portions of the pie reflected the winning probabilities, where blue always indicated the higher probability. The white areas represented the probabilities of winning nothing.

The gains for each lottery ranged from 1 to 600 Cent (Eurocent). They were randomly selected, but restricted in two ways: First, for each gamble, the difference in gain between the two lotteries could be either 50 or 200 Cent, with a ±10 Cent jitter. Jittering the values increased variability in order to minimize learning effects through recognition of specific pairs, and also allowed us to test the influence of the magnitude of the value difference. Second, probability-value pairs had to be in line with our manipulation of gamble type, as explained below.

In the original version of the WOF (Ernst et al., 2004; Smith et al., 2009), the option with the highest winning probability had an overall advantage with respect to choosing EV-optimally. The gambles in our experiments, however, can be categorized into two types. For pro-win gambles, the lottery with the higher winning probability represents the optimal choice according to EV theory, whereas for pro-gain gambles the lottery with the higher gain is optimal. An overview of these configurations can be found in Table 1. It is noteworthy that we omitted lotteries where both the higher probability and the higher gain indicated EV-optimality.

Participants had to choose the left or right lottery as quickly as possible by pressing the left or right mouse button, respectively. The left/right position of the lotteries was randomized across trials. Each trial started with the presentation of all information (see Figure 1). After each choice, another gamble was presented. Participants completed one training block to familiarize themselves with the mode of presentation.

Altogether, the experiment comprised a 2 (80:20 vs. 60:40) × 2 (50 vs. 200 Cent) × 2 (pro-win vs. pro-gain) within-participant design. For each of the 8 conditions there were 120 trials, resulting in 960 trials (divided into 40 blocks of 24 randomized trials). Participants were not informed about the number of trials, to avoid riskier choices at the end of the experiment. However, they were informed about the length of the study.
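The trial counts follow directly from the factorial design, as the short check below illustrates. The variable names are illustrative, and how the conditions were distributed within a block is not specified here, so the sketch only verifies the totals.

```python
from itertools import product

probabilities = ["80:20", "60:40"]
gain_differences = [50, 200]            # in Cent
gamble_types = ["pro-win", "pro-gain"]

# Full crossing of the three two-level factors.
conditions = list(product(probabilities, gain_differences, gamble_types))
trials_per_condition = 120
total_trials = len(conditions) * trials_per_condition

n_blocks, trials_per_block = 40, 24
print(len(conditions), total_trials, n_blocks * trials_per_block)
```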

Analysis of strategy fit

To assess whether the observed choice proportions are in line with a specific strategy or heuristic, we compared five prominent choice strategies (of which two yielded the same predictions). The strategies differ with respect to the extent to which the available information is used.

- Most-Likely / Priority Heuristic (ML/PH): For both strategies, the lottery with the higher winning probability has to be chosen. According to the Most-Likely heuristic, only the highest winning probabilities are compared. The Priority Heuristic assumes that choices are made by sequentially comparing minimum gains, minimum probabilities and maximum gains. The examination of a gamble is stopped if, for example, the minimum gains differ by at least 1/10 of the maximum gain; otherwise, the next aspect of the gambles is examined. For the gambles in this experiment ML and PH predict the same choices².

- Maximax (MM): The lottery with the larger gain has to be chosen, irrespective of the winning probabilities.

- Expected Value theory (EV): For each lottery, the expected value (probability × gain) is computed and the lottery with the higher EV is chosen.

- Cumulative Prospect Theory (CPT): The lottery with the higher attractiveness according to CPT is chosen. How attractiveness is calculated and how the required parameters were estimated for every subject using a probabilistic choice rule is described in the next section. The averaged parameter estimates for all experiments can be found in Table 2. For each of the 960 lotteries, predictions were computed.
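The three deterministic strategies in the list above can be sketched in a few lines, together with the proportion of observed choices each one predicts correctly. This is only an illustration: the function names, the trial dictionaries, and the example choices are hypothetical, CPT is left out because it requires fitted parameters, and ties (which do not occur with the probability and gain combinations described above) are arbitrarily resolved in favor of Lottery B.

```python
def predict(strategy, p_a, gain_a, p_b, gain_b):
    """Return 'A' or 'B', the lottery a deterministic strategy selects."""
    if strategy == "ML/PH":      # higher winning probability
        return "A" if p_a > p_b else "B"
    if strategy == "MM":         # larger gain, probabilities ignored
        return "A" if gain_a > gain_b else "B"
    if strategy == "EV":         # higher probability x gain
        return "A" if p_a * gain_a > p_b * gain_b else "B"
    raise ValueError(f"unknown strategy: {strategy}")


def proportion_correct(strategy, trials):
    """Share of observed choices that the strategy predicts."""
    hits = sum(
        predict(strategy, t["p_a"], t["gain_a"], t["p_b"], t["gain_b"]) == t["choice"]
        for t in trials
    )
    return hits / len(trials)


# Hypothetical trials; 'choice' is the lottery the participant selected.
example_trials = [
    {"p_a": 0.8, "gain_a": 100, "p_b": 0.2, "gain_b": 300, "choice": "A"},
    {"p_a": 0.6, "gain_a": 100, "p_b": 0.4, "gain_b": 300, "choice": "A"},
    {"p_a": 0.4, "gain_a": 250, "p_b": 0.6, "gain_b": 200, "choice": "B"},
    {"p_a": 0.2, "gain_a": 500, "p_b": 0.8, "gain_b": 300, "choice": "A"},
]

fits = {s: proportion_correct(s, example_trials) for s in ("ML/PH", "MM", "EV")}
print(fits)
```

Because the lottery with the higher winning probability always offers the smaller gain in these gambles, ML/PH and MM make opposite predictions on every trial, so their proportions of correct predictions sum to 1.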

Table 2. Averaged parameters for Cumulative Prospect Theory (CPT).

Parameter estimation of cumulative prospect theory

According to Cumulative Prospect Theory (CPT; Tversky and Kahneman, 1992), the subjective value V of a lottery A is defined as,

\begin{matrix} V (A) = v (x) \cdot w (p), & (1) \end{matrix}

where the value function v characterizes the subjective value of a single lottery's gain x, and the probability weighting function w denotes the transformation of the corresponding probabilities p. In the present study, a lottery consisted of two probability-gain pairs. The potential gain of one pair was always zero. Thus, only one probability-gain pair was needed for the calculation of V(A).

As value function, we used the function proposed by Tversky and Kahneman (1992),

\begin{matrix} v (x) = x^{α} \quad \text{if } x \geq 0, & (2) \end{matrix}

where α determines the curvature of the value function. A value of α = 1 would indicate that subjective and objective values are identical, whereas α < 1 indicates that the subjective value grows more and more slowly as the objective value increases.

We furthermore implemented a two-parameter probability weighting function proposed by Gonzalez and Wu (1999),

\begin{matrix} w (p) = \frac{δ p^{γ}}{δ p^{γ} + {(1 - p)}^{γ}} \quad \text{if } x \geq 0, & (3) \end{matrix}

where γ controls the curvature, with γ < 1 indicating an overweighting of small probabilities. The parameter δ denotes the elevation of the function, and is interpreted as characterizing the attractiveness of a lottery (Glöckner and Pachur, 2012).
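Equations 1 to 3 can be combined into a short sketch. The function names and the parameter values in the example are hypothetical and are not the estimates reported in Table 2.

```python
def value(x, alpha):
    """Value function v(x) = x**alpha for gains x >= 0 (Equation 2)."""
    return x ** alpha


def weight(p, gamma, delta):
    """Two-parameter probability weighting function (Equation 3)."""
    num = delta * p ** gamma
    return num / (num + (1 - p) ** gamma)


def subjective_value(x, p, alpha, gamma, delta):
    """Subjective value V = v(x) * w(p) of a lottery (Equation 1)."""
    return value(x, alpha) * weight(p, gamma, delta)


# With alpha = gamma = delta = 1 the model reduces to the expected value.
v_neutral = subjective_value(300, 0.2, alpha=1.0, gamma=1.0, delta=1.0)

# Hypothetical parameter values, chosen only for illustration.
v_example = subjective_value(300, 0.2, alpha=0.5, gamma=0.8, delta=1.2)
print(v_neutral, v_example)
```

The neutral case is a useful sanity check: any deviation of the fitted parameters from 1 describes how a participant's valuation departs from EV theory.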

To determine the probability of choosing Lottery A over Lottery B, we used the exponential version of Luce's choice rule³,

\begin{matrix} p (A, B) = \frac{e^{φ V (A)}}{e^{φ V (A)} + e^{φ V (B)}}, & (4) \end{matrix}

where V denotes the subjective value of the entire Lottery A or B, and φ describes how sensitively the model reacts to differences between the subjective values of the two lotteries. A large φ indicates that choice probabilities are determined mainly by the difference between the lotteries' subjective values, rather than by random (probabilistic) choice (e.g., Rieskamp, 2008).
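The choice rule in Equation 4 can be sketched as follows. The function name is illustrative, and rewriting the rule as a logistic function of the value difference is an implementation choice made here for numerical stability, not something described in the text.

```python
import math


def choice_probability(v_a, v_b, phi):
    """Probability of choosing Lottery A over B (Equation 4).

    Algebraically equal to exp(phi*v_a) / (exp(phi*v_a) + exp(phi*v_b)),
    but written as a logistic function of the difference so that large
    phi * V values do not overflow.
    """
    d = phi * (v_a - v_b)
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


# Equal subjective values give indifference; a larger phi makes the same
# value difference translate into a more extreme choice probability.
p_equal = choice_probability(5.0, 5.0, phi=2.0)
p_low_phi = choice_probability(6.0, 5.0, phi=0.1)
p_high_phi = choice_probability(6.0, 5.0, phi=5.0)
print(p_equal, p_low_phi, p_high_phi)
```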

As goodness-of-fit measure we used the G² statistic (e.g., Sokal and Rohlf, 1994),

\begin{matrix} G^{2} = - 2 \sum_{i = 1}^{n} \ln [f_{i} (y | θ)], & (5) \end{matrix}

where n denotes the total number of lottery choices, and f_i(y|θ) represents the probability of choosing a lottery y given parameter set θ. If Lottery A or B was chosen, then f_i(y|θ) = p_i(A, B), or f_i(y|θ) = 1 − p_i(A, B), respectively. Following suggestions of Rieskamp (2008), choice probabilities were truncated to a minimum of 0.01 and a maximum of 0.99.
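A minimal sketch of Equation 5, including the truncation of choice probabilities, might look like this. The function name and the example data are hypothetical.

```python
import math


def g_squared(choices, p_a, lower=0.01, upper=0.99):
    """G^2 = -2 * sum(ln f_i) over all choices (Equation 5).

    choices -- sequence of 'A' / 'B' (the observed choice per trial)
    p_a     -- sequence of model probabilities of choosing A per trial
    Probabilities are truncated to [lower, upper] before taking logs.
    """
    total = 0.0
    for choice, p in zip(choices, p_a):
        p = min(max(p, lower), upper)
        f = p if choice == "A" else 1.0 - p
        total += math.log(f)
    return -2.0 * total


# Hypothetical data: a model that predicts the observed choices well
# yields a smaller G^2 than one that is indifferent on every trial.
observed = ["A", "A", "B"]
g2_good = g_squared(observed, [0.9, 0.8, 0.2])
g2_chance = g_squared(observed, [0.5, 0.5, 0.5])
print(g2_good, g2_chance)
```

The truncation keeps a single choice that the model considers impossible from contributing an infinite penalty.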

CPT parameters were restricted to 0 < α ≤ 1, 0 < γ ≤ 1.5, 0 < δ ≤ 4, and 0 < φ ≤ 10. Usually, γ is not allowed to be larger than 1. However, experience-based decisions are typically characterized by an underestimation of small probabilities, which can be reflected by γ > 1.

To derive the set of best fitting parameters, we used the statistical software R (R Development Core Team, 2010). We first applied a grid search within the entire parameter space in steps of 0.1 to derive appropriate starting values for each participant. We then used the optimization function optim with the L-BFGS-B method⁴ to obtain a set of best fitting parameter values for each participant by minimizing G².
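The grid-search step can be sketched as below. This is a Python illustration rather than the R code used in the study; it repeats compact versions of Equations 1 to 5 so that it runs on its own, and the function names, the trial format, and the example trials are all hypothetical. With a step of 0.1 the grid has 10 × 15 × 40 × 100 = 600,000 points, so the example uses a coarser step.

```python
import itertools
import math


def neg2_log_lik(params, trials):
    """G^2 of the CPT model for one participant (Equations 1-5)."""
    alpha, gamma, delta, phi = params

    def v_of(x, p):
        num = delta * p ** gamma
        return (x ** alpha) * num / (num + (1 - p) ** gamma)

    total = 0.0
    for t in trials:
        d = phi * (v_of(t["gain_a"], t["p_a"]) - v_of(t["gain_b"], t["p_b"]))
        # Logistic form of Equation 4; clamp d to avoid overflow in exp().
        p_a = 1.0 / (1.0 + math.exp(-max(min(d, 700.0), -700.0)))
        p_a = min(max(p_a, 0.01), 0.99)          # truncation as in the text
        total += math.log(p_a if t["choice"] == "A" else 1.0 - p_a)
    return -2.0 * total


def grid(upper, step):
    """Grid points step, 2*step, ..., upper (the bounds exclude 0)."""
    return [round(step * i, 10) for i in range(1, int(round(upper / step)) + 1)]


def grid_search(trials, step=0.1):
    """Return (best_params, best_g2) over the bounded parameter grid."""
    axes = [grid(1.0, step), grid(1.5, step), grid(4.0, step), grid(10.0, step)]
    return min(
        ((params, neg2_log_lik(params, trials)) for params in itertools.product(*axes)),
        key=lambda pair: pair[1],
    )


# Hypothetical trials; 'choice' is the lottery the participant selected.
example_trials = [
    {"p_a": 0.8, "gain_a": 100, "p_b": 0.2, "gain_b": 300, "choice": "A"},
    {"p_a": 0.6, "gain_a": 100, "p_b": 0.4, "gain_b": 300, "choice": "A"},
    {"p_a": 0.4, "gain_a": 250, "p_b": 0.6, "gain_b": 200, "choice": "B"},
    {"p_a": 0.2, "gain_a": 500, "p_b": 0.8, "gain_b": 300, "choice": "A"},
]

best_params, best_g2 = grid_search(example_trials, step=0.5)
print(best_params, best_g2)
```

The best grid point would then serve as the starting value for a bounded optimizer. In Python, `scipy.optimize.minimize` with `method="L-BFGS-B"` and matching `bounds` would be one possible analogue of the second step; that substitution is an assumption of this sketch and not part of the original analysis.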

Conditional choice functions

To examine how participants' choice behavior varies with response time, Conditional Choice Functions (CCFs) were constructed for each experimental condition and participant by sorting the corresponding data by RT into five 20% bins. We then computed the proportion of optimal choices (according to EV theory) and the mean RT for each bin. The resulting values were then averaged across participants in order to obtain a group distribution (for an evaluation of this method, see Rouder and Speckman, 2004).
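For a single participant and condition, the construction of a CCF can be sketched as follows. The function name, the input format, and the example data are hypothetical; averaging the resulting points across participants is not shown.

```python
def conditional_choice_function(trials, n_bins=5):
    """Return a list of (mean_rt, proportion_optimal) per RT bin.

    trials -- list of (rt_ms, is_optimal) pairs for one participant and
              condition; is_optimal is True for an EV-optimal choice.
    Trials are sorted by RT and split into n_bins bins of (nearly) equal
    size, i.e., 20% bins for n_bins=5.
    """
    ordered = sorted(trials, key=lambda t: t[0])
    n = len(ordered)
    points = []
    for i in range(n_bins):
        chunk = ordered[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            continue  # fewer trials than bins
        mean_rt = sum(rt for rt, _ in chunk) / len(chunk)
        prop_opt = sum(1 for _, ok in chunk if ok) / len(chunk)
        points.append((mean_rt, prop_opt))
    return points


# Hypothetical data: ten trials with RTs in ms and optimality flags.
example = [
    (900, True), (300, False), (1200, True), (500, True), (700, True),
    (400, False), (1100, True), (600, False), (1000, True), (800, False),
]
ccf = conditional_choice_function(example)
print(ccf)
```

A flat sequence of proportions across bins would be expected under a single strategy, whereas a systematic rise or fall with mean RT points to a mixture of fast and slow processes.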

Results

All analyses were conducted with R (R Development Core Team, 2010) and visualized using the package ggplot2 (Wickham, 2009).

Choices with RTs smaller than 200 ms or larger than 3000 ms were considered outliers and excluded from the analysis (<0.6% of all data).

Choice behavior

The mean choice proportions show that the lottery with the higher winning probability was chosen on 70.3% (SD = 45.8%) of the trials. For pro-win gambles this lottery was chosen more frequently (M = 82.5%, SD = 38.0%) than for pro-gain gambles (M = 57.5%, SD = 49.4%). A repeated-measures ANOVA revealed that the factor gamble type was significant, F(1, 16) = 38.60, p < 0.001, η_G² = 0.707.

Block by block learning

To test whether choice proportions changed across experimental blocks, we subjected the data to a 2 (gamble type) × 40 (block number) repeated-measures ANOVA. No significant effects involving block number were present, suggesting stability over the time course of the experiment (see Figure 2B).

Figure 2. Overview of the results for the modified Wheel of Fortune task without feedback. (A) Proportions of correctly predicted choices for pro-win and pro-gain gambles across different strategies, (B) changes of choice proportions across 40 experiment blocks, and (C) conditional choice functions (CCFs).

Strategy fits

Using the parameter estimates obtained from model fitting, CPT predicted that, on average, lotteries with higher winning probabilities are chosen in 73.4% of the trials. Other strategies predicted overall proportions of either 0% (MM), 50% (EV), or 100% (ML/PH) choices for the same lotteries. As a goodness-of-fit measure, we computed the proportion of correct predictions based on single decisions for each strategy. The strategy with the overall highest fit to the observed data was ML/PH (M = 70.0%, SD = 45.8%), followed by CPT (M = 68.3%, SD = 46.5%), EV (M = 62.5%, SD = 48.4%), and MM (M = 30.0%, SD = 45.8%). However, paired t-tests revealed no significant differences between the fits of EV, CPT, and ML/PH (see Table 3).

Table 3. Paired comparisons of the overall proportion of correctly predicted choices within the three experiments, with t-statistic and effect size Cohen's d.

As Figure 2A shows, the fit was not equally good for the two gamble types. Except for MM, the strategies fared better in predicting choices in pro-win gambles than in pro-gain gambles. The differences between gamble types were significant for every strategy, paired |ts| > 4.51, ps < 0.001, and Cohen's |ds| > 1.55⁵.

Conditional choices

Figure 2C shows the CCFs for the two gamble types in the different conditions. As can be seen, for fast responses more EV-optimal choices were made in pro-win gambles (blue line), compared to pro-gain gambles (red line). This indicates that spontaneously the lottery with the higher winning probability was chosen. With an increasing RT, however, the proportions changed. Whereas the proportion of EV-optimal choices decreased for pro-win gambles, it increased for pro-gain gambles.

The black lines in Figure 2C represent the average CCFs, which indicate whether the overall performance increased with RT. Linear regression coefficients were computed for each participant and average CCF. The slopes indicate whether the proportion of EV-optimal choices increased with RT in steps of 1 ms (positive slopes), decreased with RT (negative slopes) or remained constant (slopes close to zero). They were subjected to a 2 (probabilities: 80:20, 60:40) × 2 (gain differences: 200, 50) repeated-measures ANOVA.
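The slope of a CCF can be obtained with an ordinary least-squares fit through its five points, as in the following sketch. The function name and the example points are hypothetical, and expressing the choice proportion on a 0 to 1 scale is an assumption of this illustration.

```python
def ols_slope_intercept(points):
    """Least-squares slope and intercept for (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical CCF: (mean RT in ms, proportion of EV-optimal choices).
example_ccf = [(400, 0.50), (600, 0.55), (800, 0.60), (1000, 0.65), (1200, 0.70)]
slope, intercept = ols_slope_intercept(example_ccf)
print(slope, intercept)
```

A positive slope means that the proportion of EV-optimal choices rises with RT, and the intercept is the value of the regression line at an RT of 0 ms.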

The analysis revealed an overall increase of optimal responses as the intercept term indicated a significant deviation of the slopes' grand mean (M = 0.008, SD = 0.037) from zero, with F(1, 16) = 9.41, p < 0.01, η_G² = 0.232. The base level of optimal responses, that is the averaged point where CCF regression lines intersect with the y-axis at 0 ms, was at 56.9%. In addition, we found a main effect of probability with F(1, 16) = 11.03, p < 0.01, η_G² = 0.114 (b_80:20 = 0.013 and b_60:40 = 0.003). As can also be seen in Figure 2C, the average proportion of optimal choices increased substantially with RT in 80:20 gambles, but only weakly improved in 60:40 gambles. There were no further significant results.

Discussion

The results clearly show that participants performed nearly optimally in pro-win gambles, but chose suboptimally in pro-gain conditions. This indicates that participants based their decisions mostly on partial information. More specifically, they largely neglected the monetary outcomes and preferred the lottery with a higher winning probability. This conclusion is also supported by results obtained from comparing different decision strategies with the observed choice behavior. The ML/PH strategy, according to which the lotteries with the higher chance of winning should always be chosen, explained our results better than the other strategies. The match was even better for pro-win gambles, which suggests that these gambles further encouraged the use of such a strategy. CPT did equally well in predicting the choices. In this model probability-dominance was reflected by a rather flat value function (see parameter estimations in Table 2), indicating that choices were mainly driven by probabilities.

By comparing decision strategies with choice proportions, one assumes that participants always use the same strategy throughout the experiment. However, it is reasonable to assume that several strategies compete for execution and that, therefore, the observed performance reflects a mixture of applied strategies. Furthermore, if simple but fast strategies compete with more optimal but slow ones, then this should be reflected by the CCFs. Indeed, as can be seen in Figure 2C, the proportion of optimal choices changed substantially with RT. It decreased for pro-win gambles but increased for pro-gain ones. This indicates that fast choices relied more on simple strategies that take only partial information into account, whereas slow choices were based on more information. Specifically, the CCFs suggest that fast responses resulted from the application of ML/PH, i.e., from simply choosing the lottery with the higher winning probability. For slower responses more or other information was taken into account. The fact that the overall performance increased with RT indicates that the participants did not merely switch from the ML/PH to the MM strategy, because in this case overall performance would have remained constant with RT. Rather, the increase in performance signals that slower responses were indeed based on some integration of probability and gain information, which was particularly the case for 80:20 gambles. That lotteries with a higher chance of winning are generally preferred is already known (e.g., Kahneman and Tversky, 1984). However, up to now, it has not been shown that this preference declines with the duration of processing (but see Dambacher et al., unpublished manuscript).

Although repeated choice performance improved for slow responses, it was still far from optimal. One reason could have been that no feedback was provided. Without this information the participants were apparently not able to adjust their behavior toward optimal performance and simply stuck to their initial choice preference. To see whether feedback helps to improve performance, we conducted a further experiment.

Experiment 2

This experiment was similar to our first one, except that feedback was provided. Specifically, after each choice, the chosen lottery was played by the computer and the respective outcome (x Cent or nothing) was presented on the display. If participants can use this information to improve their choice strategy, then performance should be better than in our first experiment. Moreover, due to learning, performance should now improve during the experimental session. To see whether this is the case, and if so, how quickly learning takes place, we again examined how the proportion of optimal choices varied across the experimental blocks. Finally, learning could produce a generally increased mean RT, because more time is spent on information integration. In the CCFs this should be reflected by a shift to longer RTs and/or by flat curves, if a single strategy has been adopted.