Abstract
A sequential sampling model for multiattribute binary choice options, called multiattribute attention switching (MAAS) model, assumes a separate sampling process for each attribute. During the deliberation process attention switches from one attribute consideration to the next. The order in which attributes are considered as well for how long each attribute is considered—the attention time—influences the predicted choice probabilities and choice response times. Several probability distributions for the attention time with different variances are investigated. Depending on the time and order schedule the model predicts a rich choice probability/choice response time pattern including preference reversals and fast errors. Furthermore, the difference between finite and infinite decision horizons for the attribute considered last is investigated. For the former case the model predicts a probability p0 > 0 of not deciding within the available time. The underlying stochastic process for each attribute is an Ornstein-Uhlenbeck process approximated by a discrete birth-death process. All predictions are also true for the widely applied Wiener process.
1. Introduction
Sequential sampling models are powerful models to account simultaneously for choice probabilities and choice response times. They have become the dominant approach to modeling decision processes in cognitive science. Their application includes a variety of psychological tasks from basic perceptual decision to complex preferential choice tasks. Early on they have been applied to identification and discrimination tasks (e.g., Edwards, 1965; Laming, 1968; Pike, 1973; Link and Heath, 1975; Heath, 1981; Ashby, 1983); memory retrieval (e.g., Stone, 1960; Ratcliff, 1978; Van Zandt et al., 2000); and classification (e.g., general recognition theory, Ashby, 2000; exemplar–based random walk models of classification, Nosofsky and Palmeri, 1997) to account for speed-accuracy data. They have also been used for preferential decision tasks (e.g., decision field theory (DFT), Busemeyer and Townsend, 1993; multiattribute dynamic decision model, Diederich, 1997; Diederich and Busemeyer, 1999) to account for choice response times and choice probabilities interpreted as preference strength; judgment and confidence ratings (Pleskac and Busemeyer, 2010); to account for selling prices, certainty equivalents, and preference reversal phenomena (Busemeyer and Goldstein, 1992; Johnson and Busemeyer, 2005). More recently, they have been applied to combining perceptional decision making and payoffs (Diederich and Busemeyer, 2006; Diederich, 2008; Rorie et al., 2010; Gao et al., 2011). Furthermore, these models have been closely linked to measures from neuroscience like multi-cell electrode recordings (e.g., Ditterich, 2006; Gold and Shadlen, 2007; Churchland et al., 2008).
Sequential sampling models assume that (1) stimulus and choice alternative characteristics can be mapped onto a hypothetical numerical value representing the instantaneous level of evidence (activation, information, or preference—the wording often depends on the context), (2) some random fluctuation of this value over time occurs, (3) this evidence is accumulated over time, and (4) a final choice is made as soon as the evidence reaches a threshold. Therefore, sequential sampling can be described as a stochastic process. Two quantities are of foremost interest: (1) the probability that the process eventually reaches one of the thresholds or boundaries for the first time (the criterion to initiate a response), i.e., first passage probability; (2) the time it takes for the process to reach one of the boundaries for the first time, i.e., first passage time. The former quantity is related to the observed relative frequencies, the latter usually to the observed mean choice response times or the observed choice response time distribution.
Two classes of sequential sampling models have been predominantly used in psychology: Random walk/diffusion models and accumulator/counter models. The former are typically applied to a binary choice task, so that evidence for one choice alternative is at the same time evidence against the other. A decision is made as soon as the process reaches one of two preset criteria. In the latter, an accumulator/counter is established for each choice alternative separately, and evidence is accumulated in parallel. A decision is made as soon as one counter wins the race to reach one preset criterion. The accumulators/counters may or may not be independent. In the following we focus on random walk/diffusion models. For a review of both diffusion models and counter models see (Ratcliff and Smith, 2004).
To be more precise and to introduce notation, let X(t) denote the accumulation process. For a binary choice, say between choice options A and B (Figure 1), the models assume that the decision process begins with an initial state of evidence X(0). This initial state may either favor option A (X(0) > 0) or option B (X(0) < 0) or may be neutral with respect to A or B (X(0) = 0). Upon presentation of the choice options, the decision maker sequentially samples information from the stimulus display over time, retrieves information from memory, or forms preferences, depending on the context. The small increments of evidence sampled at any moment in time are such that they either favor option A (dX(t) > 0) or option B (dX(t) < 0). The evidence is accumulated from one moment in time to the next by summing the current state with the new increment: X(t + h) ≈ X(t) + μ(X(t), t) h + σ (X(t), t) (W(t + h) − W(t)). Here, μ(x, t) is called the drift rate and describes the expected value of increments per unit time. The factor σ(x, t) in front of the increments W(t + h) − W(t) of a standard Wiener process W(t) is called the diffusion rate, and relates to the variance of the increments. This process continues until the magnitude of the cumulative evidence exceeds a threshold criterion, θ. The process stops and option A is chosen as soon as the accumulated evidence reaches a criterion value for choosing A (here, X(t) = θA > 0) or it stops and chooses option B as soon as the accumulated evidence reaches a criterion value for choosing B (here X(t) = θB < 0). The probability of choosing A over B is determined by the accumulation process reaching the threshold for A before reaching the threshold for B. The criterion is assumed to be set by the decision maker prior to the decision task.
Figure 1
The Wiener process with drift, lately called drift-diffusion model in the psychological literature (Bogacz et al., 2006), is the most widely applied model. Different versions reflect additional assumptions for specific psychological domains. Ratcliff (1978) proposed a diffusion model for memory retrieval that is used for various psychological decision tasks. It is based on the work by Laming (1968) and Link and Heath (1975) and assumes variability in the starting point (i.e., X(0) follows a uniform distribution), and the drift rate μ = μ(t) of the Wiener process is normally distributed (cf. Laming). The residual time, i.e., the time other than the decision time, such as stimulus encoding and motor response, is assumed to be uniformly distributed and added to the decision time, i.e., response time equals the decision time plus a residual (non-decision) time. For a recent overview with applications see Voss et al. (2013). Other approaches include the Ornstein-Uhlenbeck model that linearly accumulates evidence with decay (Busemeyer and Townsend, 1993; Diederich, 1997), and the leaky competing accumulator model (Usher and McClelland, 2001) that non-linearly accumulates evidence with decay.
Common to almost all of these approaches is the assumption that a single integrated source of evidence generates the evidence during the deliberation process leading to a decision. In particular, the integrated source may be based on multiple features or attributes, but all of these features or attributes are assumed to be combined and integrated into a single source of evidence, and this single source is used throughout the decision process until a final decision is reached. Diederich (e.g., Diederich, 1995, 1997, 2003, 2008), however, assumed a separate process for each attribute1. The decision maker switches attention from one attribute to the next during the time course of one trial. For instance, in a crossmodal task (visual, auditory, tactile), Diederich (1995) assumed a serial processing controlled by stimulus input at given stimulus onset asynchronies (SOA). That is, the order of attributes, here a light, followed by a tone, followed by a tactile vibration, as well as the point in time when a new attribute was added, here the tone presented at t1 (t1 ms after the light onset) and the tactile vibration at t2 (t2 ms after the light onset) was determined externally by the experimental setup. In the following we will call an attention switch at predetermined, fixed times, and predefined order attributes, a deterministic time and order schedule. Often, however, neither the processing order of attributes nor the point in time when the decision maker switches attention from one attribute to the next are known or can be inferred from the experimental setup. For those cases, Diederich (1997) proposed a specific model in which attention switches from one attribute to the next with some probability. This is an instance of a random time and order schedule which will be investigated more systematically in the present study.
The purpose of this paper is to present a unified treatment of sequential sampling models for both deterministic and random time and order schedules. To do so we start with deriving expressions for mean choice response times and choice probabilities for a deterministic time and order schedule before we show how they extend to random time and order schedules, including Poisson, binomial, geometric, and uniform distributions for the attention time devoted to each attribute in the sequence before attention switches to the next randomly or deterministically chosen attribute. We will provide first numerical evidence on the influence of various properties of a schedule on the predictions for mean choice response times and choice probabilities.
2. Preliminaries
The model applies to any finite number of attributes that the decision maker may consider, i.e., k = 1, …, K. For convenience we first describe the process for one attribute. As underlying information process for each attribute we assume an Ornstein-Uhlenbeck process X(t) defined by
where W(t) is a standard Wiener process. The parameters δk, γk, and σk are characteristics of the k-th attribute. The attribute characteristics may affect the quality of the extracted evidence for choosing A over B and this quality of evidence determines the drift rate δk. That is, the better an attribute discriminates between A and B, the larger is δk. The parameter γk which induces a change of the drift rate depending on the current state in the state space is often connected to memory processes (e.g., primacy and recency effects), conflict situations (e.g., approach-avoidance), or similarities between choice alternatives. Thus, together the effective drift δk − γkX(t) determines the direction and the velocity of the process when considering the k-th attribute at time t. Note that by setting γk to 0 results in a Wiener process with drift. That is, all the analysis we perform in the following is also valid for the Wiener process with drift. The diffusion coefficient σk indicates the variance of the increments of the process, for simplicity, we will set σk = σ for all k.
2.1. Matrix approach
Stochastic processes such as the above X(t) can be approximated by a discrete time, finite state space Markov chain. We use the matrix approach since it is simple to implement, sufficient in determining the entities of interest, i.e., choice probabilities and choice response times, and flexible to account for non-stationary and non-linear properties one wishes to include for the decision making process in the future. The continuous state space [θB, θA] of the piecewise Ornstein-Uhlenbeck process X(t) is replaced by a finite state space S = {−mB, …, mA} with m = mA + mB + 1 states. The diffusion process {X(t), t ≥ 0} is approximated by a discrete random walk {(n), n ≥ 0} with values in S such that X(nτ) ≈ Δ · (n) and θA ≈ mAΔ and θB ≈ −mBΔ, where Δ is the step size of change in evidence. To achieve convergence in the limit, the discretization parameters (Δ for state space, and τ for time) are tied to each other by the relation Δ = σ .
The attribute-related matrices Pk, k = 1, …, K, are given in their canonical form by

where
for i = 2, …, m − 1 (here, the index i corresponds to the state i − 1 − mB). As Δ → 0 (or, equivalently, τ → 0), the decision probabilities and mean choice response times obtained from the Markov chain model converge to the values obtained from the underlying continuous process X(t). The identity matrix I corresponds to the two absorbing states (−mB and mA) associated with the two decision thresholds, one for each choice alternative; the matrix Qk contains the transient probabilities, corresponding to the updating evidence process, and the matrix Rk contains the one-step transition probabilities from the transient to the absorbing states. In particular, the first column vector of the matrix Rk (denoted by RB,k) contains the transient probabilities for reaching alternative B, while the second RA,k contains the ones for alternative A. For details and derivations see Diederich (1997) and Diederich and Busemeyer (2003).
2.2. Time and order schedule
For K attributes, each one to be considered for some specific time in some specific order it is convenient to introduce a formal schedule of both time and order. A finite time and order schedule consists of a set of L consecutive time intervals {[tl − 1, tl]}l= 1, …,L and the attribute sequence {kl ∈ {1, …, K}}l= 1, …,L which specifies that during the time interval [tl − 1, tl] the kl-th attribute is considered. At switching time tl, l = 1,…, L − 1, attention switches from attribute kl to attribute kl + 1. Depending on the situation, the final time tl may be set finite (then the decision process may also finish without deciding for one of the alternatives) or infinite. Consequently, the process X(t) determined by such a schedule is a piecewise Ornstein-Uhlenbeck process, defined over a finite partition t0 = 0 < t1 < … < tL − 1 < tL ≤ + ∞ of the time interval [0, tL], where for t ∈ [tl − 1, tl] the process is determined by (1) with k = kl. Figure 2 shows an example with three different attributes (K = 3) and a deterministic time and order schedule of length L = 4 with switching times tl independent of the trajectories, and attribute order (1, 2, 1, 3), i.e., k1 = 1, k2 = 2, k3 = 1, k4 = 3 (note that the first attribute is reconsidered once).
Figure 2
For fixed Δ resp. τ, the m × m transition probability matrix n containing the transition probabilities ii′: = P(n + 1 = i′|n = i) for the n-th step of the discrete-time random walk depends on the currently considered attribute defined by the time and order schedule, i.e., we set n = Pkl if n = nl − 1, …, nl − 1, where n0 = 0, τ nl ≈ tl for l = 1, …, L (if tL = ∞, we formally set nL = ∞).
3. Choice probabilities and mean choice response times
In this section we derive the choice probabilities and mean choice response times for various time and order schedules. For simplicity we assume an unbiased process, i.e., with X(0) = 0 and symmetric decision thresholds, i.e., θA = −θB. Since the diffusion coefficient is a scaling parameter it will be set to σ = 1 for all attributes throughout. We start with the deterministic time and order schedule.
3.1. Deterministic time and order schedule
The evidence accumulation process for attribute k1, which is considered first, evolves until time t1 when the second attribute k2 comes into consideration, triggering a change in the accumulation process. This attribute in turn is considered until time t2 when a third attribute k3 is considered and so forth until a decision is initiated (or tl is reached). Let the random variables TA and TB denote the finite time when the process reaches a decision threshold θA or −θB, stops, and a decision response for A or B is initiated. With the switching times tl replaced by integers nl ≈ tl/τ, the choice probability Pr[choose A] = Pr(TA < ∞) is then approximated by the value pA obtained from the discrete random walk model as
where Z is the probability distribution for the initial state X(0). For instance, for an unbiased process, Z would be a coordinate vector with probability 1 at state 0 halfway between the decision thresholds. The remaining vectors and matrices are those defined in (2). The evidence accumulation process for a successive attribute starts with the final evidence state of the previous attribute. Note that Z′Qn1k1 to Z′Qn1k1…QnL − 1−nL − 2kL − 1 are defective distributions, i.e., the entries of these vectors do not sum up to 1, for the states of the random walk at discrete times n1,…,nL − 1. Further note that the stochastic process is time homogeneous within each time interval [0, t1) to [tL − 1, tl] but non-homogeneous across [0, tL] (see Diederich, 1992, 1995).
Similarly, the mean response time for choosing alternative A is approximated as
The probability and the mean response time for choosing alternative B can be determined similarly. Note that p0: = 1 − (pA + pB), the probability of not making a decision until the final time tL, is strictly positive if tL < ∞. As shown in Diederich (1997), these formulas can be further compactified. We will do this below for the general case of deterministic and random schedules by deriving an efficient recursion for their evaluation.
3.2. Random time and order schedule
The above derivation of formulas for choice probabilities and mean response times for a deterministic time and order schedule have counterparts for random schedules which we describe next in three steps.
3.2.1. Random order schedule
For generating the attribute order {kl}l = 1,…,L, we consider stochastic K × K matrices D(l) such that d(l)k′k ≥ 0 describes the probability with which attention switches from the k′-th attribute to the k-th attribute at switching time tl ≈ τ nl, l = 1,…,L − 1. Normally, d(l)kk = 0 would be assumed, to avoid a no switching situation. For two attributes K = 2, we must then have d(l)11 = d(l)22 = 0, d(l)12 = d(l)21 = 1, and the attribute sequence is either (1, 2, 1, 2, …) or (2, 1, 2, 1, …), depending on whether k1 = 1 or k1 = 2. For three attributes and L = 3, choosing
would for k1 = 1 result in order sequences (1, 2, 1), (1, 3, 1), (1, 3, 2) with probability 1/2, 3/8, 1/8, respectively. The above matrix D(1) models the situation when no preference or bias for considering attributes can be asserted.
3.2.2. Random time schedule
We assume that the number of discrete time steps during which attention is paid to the k-th attribute is a discrete random variable denoted by Tat with given distribution. In principle, this distribution may change its type and may have different parameters, such as expected value, depending on the attribute and the attribute order {kl}l = 1, …, L. This can be used to model time pressure and other temporal effects. However, often we assume one and the same distribution type for attention times across all attributes, and allow for different parameters only.
For instance, the geometric distribution (as implicitly considered in Diederich, 1997) is given by
and characterized by a single parameter r > 0, with expectation 1/r and variance (1 − r)/r2, and the uniform distribution is defined as
with parameters N and M = 0, 1, …, N − 1 and expectation N and variance M(M + 1)/3. Details for other tested distributions (Poisson with parameter λ > 0, and binomial distributions with parameters n and p) are omitted. For comparable expectation values E(Tat) (i.e., for parameter choices 1/r ≈ N ≈ λ ≈ np), the geometric distribution has much larger variance than the Poisson, binomial and uniform distribution with M ≈ (the latter are very close to each other). Figure 3 shows the pdf and cdf for different Tat distributions with fixed mean value E(Tat) = 300. The two uniform distributions are with M = 150 = N/2 and M = 299 = N − 1. Varying the parameter M of the uniform distribution allows us to produce intermediate results between the deterministic and geometric distribution cases as shown in the following.
Figure 3
3.2.3. Constructing random time and order schedules
We create a random time and order schedule of length L in two steps: First, given an initial distribution of k1 ∈ {1, …, K}, we create the attribute sequence {kl}l = 2, …,L using a non-stationary Markov chain model with transition probability matrices D(l), l = 1,…, L − 1. In a second step, for each l = 1,…,L, the attention time T(l)at = nl − nl − 1 is created by the discrete random variable responsible for the attention time paid to the kl-th attribute, choices are independent for the different l. Consequently, tl − tl − 1 ≈ τ T(l)at is the real attention time paid to the kl-th attribute. We note that semi-random schedules, where the sequence {kl} is given deterministically, and only the T(l)at are determined as in the second step outlined above, are covered if we choose the D(l) such that d(l)kl,kl + 1 = 1.
To understand the recursive computation of choice probabilities and mean response times in this more general case, we first consider the special cases L = 1, 2, and illustrate the derivation on some distribution types of the random variable Tat generating attention times by providing concrete formulas. In general, the distribution for Tat is given by its probability mass distribution (pdf) and cumulative distribution function (cdf)
We start with L = 1, and will drop the index l from the notation introduced in the previous subsection. Since the probability of choosing alternative A at the i-th step is given by Z′Qki−1RA,k, i = 1, …, Tat, and Tat is a random variable distributed according to (5) we get
A similar formula holds for pB,k. To avoid repetition, introduce the row vector pAB,k: = [pB,k, pA,k], then
The 2 × (m − 2) matrix Vk depends on the attribute and its parameters via Qk, Rk, and on the chosen attention time distribution and the cdf (fn,k). For the discussed concrete attention time distributions these matrices may be precomputed, in some cases closed-form expressions can be found, e.g., for the geometric distribution with parameter r = rk we have
Next we discuss choice probabilities for the case L = 2, assuming for simplicity that the attention time distribution is the same for all attributes. To save on indices, denote k1 ≡ k′, k2 ≡ k, and D(1) ≡ D (this matrix is responsible for the random choice of k given any k′). Then the decision probability vector pAB,k′, k for reaching alternatives B or A in with attribute order (k′,k) has two parts: the probabilities of having decided on while still considering the k′-th attribute (i.e., TA/τ ≤ T′at, where T′at is the randomly generated attention time for the first attribute k′) plus the probabilities that τ T′at < TA/τ ≤ T′at + Tat, where Tat is the randomly (and independently) generated attention time for the second attribute k. On top of this, k itself is randomly chosen according to the entries in the k′-th row of D. Thus, for each fixed k1 = k′ and n1 = T′at according to (6) probabilities for reaching a decision after n1 are given by
Thus, for L = 2, the choice probabilities (under the assumption that k1 = k′ is fixed) can be obtained as
where
are (m − 2)× (m − 2) matrices depending on the attribute and attention time distribution type. For example, for the geometric distribution this simplifies to Bk = rkQk(I − (1 − rk)Qk)−1, closed form expressions are available for Poisson, binomial, and uniform distributions as well.
For arbitrary L, it is more convenient to write the resulting recursion in terms of block-matrix-vector operations. Denote by
| Z | the K × 1 array with each entry equal to the initial distribution Z (and think of Z′ as its transpose, a 1× K array with entries Z′), |
| B | the K × K diagonal array with the Bk on the diagonal (similarly for C defined later), |
| I | the K × K diagonal array, with identity matrices I of the appropriate size on the diagonal, |
| V | the K × 1 array with the Vk as entries (similarly for W defined later), and |
| pAB | the K × 2 matrix, whose rows are the choice probabilities [pA, pB]|k1 = k defined before in the case L = 2. |
Then the above result for L = 2 can be compactly written as
Note that the product BD of the array B with the matrix D is interpreted as the K × K array with dk′kBk′ as entry in row k′ and column k. Moreover, by iterating (8), one arrives at the formula for arbitrary L:
Formulas for mean response times can be derived similarly. Indeed, for L = 1, denote by ETA,k the mean response time for reaching alternative A when considering the k-th attribute for a random time Tat distributed according to (5). Then ETA,k ≈ τ etA,k/PA,k, where
Similarly for ETB,k and etB,k. Thus, similar to (6), we can write
The matrices Wk can be precomputed to any accuracy at essentially the same cost as the Vk. For particular distributions, the formulas can be turned into closed form expressions.
Next, let us look at L = 2. By using similar notation and arguments as for choice probabilities, the quantities etA,k′,k, etB,k′,k have a part before and after T′at. This, together with (10), (11), gives
where
Thus, the counterpart of (8) is
From here, combining with (8), a joint recursion for computing pAB and etAB results:
We conclude this section with a few remarks. In Diederich (1997), under the name MADD/pp, a slightly different presentation of random schedules is given for the special case of geometrically distributed attention times. It is not hard to see, that (with the notation rij used in the K = 3 example presented in Section 4.2 Diederich, 1997) our model is equivalent to MADD/pp as L → ∞, if we set rk = 1 − rkk for the parameters r of the geometrically distributed Tat, k = 1, 2, 3, and dkk = 0, dkk′ = rkk′/(1 − rkk), k′≠ k, for the entries of the matrix D = D(l), l ≥ 1. The advantage of the MADD/pp model is that it provides closed form formulas for the case L = ∞, a possibility that we did not pursue here for other types of attention time distributions.
In previous sequential decision models with finite L (Diederich, 1997), the last attribute was always considered infinitely long (infinite decision horizon) to avoid the situation of no decision, i. e., p0 > 0. This can be incorporated into the current model by modifying the definition of the matrices Vk, Wk corresponding to the last interval [tL − 1,∞) to
and modifying the recursion (14) slightly. Alternatively, one can artificially change the parameters of the attention time distribution for l = L such that its expected value is sufficiently large, and make p0 practically negligible. Since infinite decision horizons do not seem to adequately reflect the situation of a real decision process or laboratory experiment, it might be interesting to work under scenarios where tl is fixed and finite that we described in this paper.
4. Simulations
We present some simulations that demonstrate the predictive power of the proposed model. We focus on features that have not been considered in Diederich (1997) for the deterministic case. Throughout this section we fix certain parameters, such as σ = 1, θA = −θB = 10, (this implies a state space size of m = 81), and always start at the neutral position X(0) = 0 between choice alternatives A and B.
4.1. Impact of attention time distributions
First, we show how different assumptions on the randomness of the attention time Tat (i.e., the time spent on considering a certain attribute) influences choice probabilities and mean response times. In the first example, we assume just two attributes with parameters δ1 = 0.2, γ1 = 0.03, δ2 = 0.04, γ2 = 0.003, both attributes favor alternative A, the first one more strongly than the second one2. The attributes are considered only once (L = 2), with order k1 = 1, k2 = 2. The first attribute is considered for time t1 = τ n1, where n1 is a random variable Tat described above with given expectation N. For the second attribute we compare two situations: (1) We assume an infinitely long decision horizon t2 = ∞, and (2) we determine a finite time horizon t2 = τ n2 by choosing n2 = n1 + Tat which is also Tat distributed with the same expected value N. These two situations are depicted in Figures 4, 5. The graphs show choice probabilities and mean response times as functions of the expectation τ E(Tat) of the real attention times. Lines of different color represent different distributions. Distributions with a small variance, such as the Poisson distribution, the binomial distribution, and the uniform distribution with M ≈ produce results indistinguishable from the deterministic case. This holds for all tested situations shown below. This means, small uncertainties in attention time spans do not influence the observable choice frequencies and mean response times. However, as the variance of the attention times grows, we see quantitative and qualitative changes. Compared to the deterministic attention time situation, the geometric distribution differs most, and the uniform distributions with M = N/2 = 150 (Unif.1) and M = N − 1 = 299 (Unif.2) are intermediate. Moreover, there is expectedly a big difference for small mean attention times between finite and infinite decision horizons. Most importantly, for the former case it predicts a probability p0 > 0 of not deciding within the available time t2. We claim that for many situations, where an infinite time horizon does not represent reality well enough, our finite schedule model might be more appealing. This aspect will be pursued in further research.
Figure 4
Figure 5
Figures 6, 7 show similar simulation results for the situation of considering first an attribute favoring B (δ1 = −0.1, γ1 = 0) followed by an attribute more strongly favoring A (δ2 = 0.2, γ2 = 0.03). As expected, the results look now different, however, the main conclusions from the previous example concerning the influence of the randomness type for attention times and the differences for finite vs. infinite time horizons remain the same. Most importantly here, the model predicts a preference reversal (i.e., choice probabilities from below 0.5 to above 0.5) as a function of attention time when one attribute is in favor of choosing alternative A and the other in favor of choosing alternative B. Parameter studies, as in Diederich (1997), will be pursued further elsewhere.
Figure 6
Figure 7
To complete the picture, we show a three-attribute example (K = 3) in Figure 8. The chosen attribute parameters are now δ1 = 0.04, γ1 = 0.003, δ2 = −0.1, γ2 = 0, δ3 = 0.2, γ3 = 0.03, i.e., a weakly in favor of A, in favor of B, and strongly in favor of A sequence of attributes. Attention times for the first two attributes are chosen independently from each other but with the same distribution with fixed mean value; the last attribute is considered indefinitely.
Figure 8
4.2. Dependence on attribute order
The proposed sequential decision model is sensitive to the order in which the attributes are consider. If we consider in the aforementioned second two-attribute example the attribute in favor of A first, and then the attribute in favor of B we get very different patterns as shown in Figure 9 compared to Figure 6. A similar effect is true for the above K = 3 example. In Figure 10, the attribute in favor of B is now the last one; the graphs need to be compared with Figure 8. One interesting pattern can be observed. If the evidence for choosing one alternative decreases in the sequence of attribute consideration then the model predicts faster choice response times for the more frequently chosen alternative—a typical pattern observed in response time analysis. However, if the evidence increases in the sequence of attribute consideration then the model predicts faster choice response times for the less frequently chosen alternative which has been called fast error, as shown in Figure 11 compared to Figure 4. Simply by changing the order of attribute processing the model predicts a complex pattern of choice response times and choice probabilities.
Figure 9
Figure 10
Figure 11
So far, all examples shown are with a fixed, deterministic attribute order with no repetitions (semi-random schedule, L = K). The evaluation of fully random time and order schedules requires larger L, and will be presented elsewhere.
5. Concluding remarks
The proposed multiattribute attention switching (MAAS) model can predict a very complex choice probability/(mean) choice response time pattern. It may appear too flexible to be testable. However, this is not the case. If two attributes both favor alternative, A say, and the first attribute that is considered provides more evidence for choosing A than the second (δ1 > δ2), then the model predicts always shorter response times for the more frequently chosen alternative, here A, regardless of the assumed underlying attention time distribution. If the order of processing these attributes is reversed, i.e., the attribute that favors alternative A less is considered first (δ2 > δ1), then the model always predicts faster responses for the less frequently chosen alternative, here B, again regardless of the assumed underlying attention time distribution. A single stage process can only account for this pattern by assuming variability in starting positions and variability in drift rates, i.e., a statistical means where the drift rate itself is a random variable. It is difficult experimentally to disentangle the variability stemming from the stochastic process itself and the variability from the distribution of different drift rates. As Jones and Dzhafarov (2013) pointed out, the predictions of various sequential sampling models rest upon the assumptions made about the assumed probability distributions. This is not the case here. The model is falsifiable without assuming specific distributions. Rather than relying on statistical mechanisms to ensure an observed response patterns we rely on assumptions about cognitive processes such as attention switching and salience. The specific attention time distribution used for an application may be related to the experimental paradigm. For instance, when tracking eye movements, the sequence of attribute consideration and the switching times are directly observable, and a deterministic or a uniform distribution with a small variance is advisable. When all attributes are shown simultaneously, like in complex objects, and attention may shift at any moment in time a geometric distribution or a uniform distribution with a large variance may describe the situation better. Testing the model rigorously will be pursued in the future.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Statements
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Footnotes
1.^The notion of attributes is defined here in a broad sense. For example, it includes dimensions such as color and size of visual target; amplitude and frequency of a tone; different modalities in a crossmodal task; payoff information and perceptual information; attitudinal evidence and perceptual evidence; prize and quality of a consumer product and more.
2.^Note that when looking only at the numerical values of the drift parameter δ1 = 0.2 and the decision criterion θA = 10 and assuming that the attention times t1 to the first attribute are large enough it would suggest mean response times in the range TA ≈ 50 (and very small pB). However, since γ1 = 0.03 it leads to a negative effective drift δ1 − γ1X(t) if X(t) comes close θA, and the mean response times become much longer. This also demonstrates the effect of the parameter γk, and a difference between Ornstein-Uhlenbeck process and Wiener process based models.
References
1
AshbyF. (1983). A biased random walk model for two choice reaction times. J. Math. Psychol. 27, 277–297. 10.1016/0022-2496(83)90011-1
2
AshbyF. (2000). A stochastic version of general recognition theory. J. Math. Psychol. 44, 310–329. 10.1006/jmps.1998.1249
3
BogaczR.BrownE.MoehlisJ.HolmesP.CohenJ. (2006). The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychol. Rev. 113, 700–765. 10.1037/0033-295X.113.4.700
4
BusemeyerJ.GoldsteinW. (1992). Linking together different measures of preference: a dynamic model of matching derived from decision field theory. Organ. Behav. Hum. Decis. Process. 52, 370–396. 10.1016/0749-5978(92)90026-4
5
BusemeyerJ.TownsendJ. (1993). Decision field theory: a dynamic-cognitive approach to decision-making in an uncertain environment. Psychol. Rev. 100, 432–459. 10.1037/0033-295X.100.3.432
6
ChurchlandA.KianiR.ShadlenM. (2008). Survey of decision field theory. Nat. Neurosci. 11, 693–702. 10.1038/nn.2123
7
DiederichA. (1992). Intersensory Facilitation: Race, Superposition, and Diffusion Models for Reaction Time to Multiple Stimuli. Frankfurt am Main: Verlag Peter Lang.
8
DiederichA. (1995). Intersensory facilitation of reaction time: evaluation of counter and diffusion coactivation models. J. Math. Psychol. 39, 197–215. 10.1006/jmps.1995.1020
9
DiederichA. (1997). Dynamic stochastic models for decision making with time constraints. J. Math. Psychol. 41, 260–274. 10.1006/jmps.1997.1167
10
DiederichA. (2003). Decision making under conflict: decision time as a measure of conflict strength. Psychon. Bull. Rev. 10, 167–176. 10.3758/BF03196481
11
DiederichA. (2008). A further test on sequential sampling models accounting for payoff effects on response bias in perceptual decision tasks. Percept. Psychophys. 70, 229–256. 10.3758/PP.70.2.229
12
DiederichA.BusemeyerJ. (1999). Conflict and the stochastic dominance principle of decision making. Psychol. Sci. 10, 353–359. 10.1111/1467-9280.00167
13
DiederichA.BusemeyerJ. (2003). Simple matrix methods for analyzing diffusion models of choice probability, choice response time and simple response time. J. Math. Psychol. 47, 304–322. 10.1016/S0022-2496(03)00003-8
14
DiederichA.BusemeyerJ. (2006). Modeling the effects of payoff on response bias in a perceptual discrimination task: threshold-bound, drift rate-change, or two-stage-processing hypothesis. Percept. Psychophys. 68, 194–207. 10.3758/BF03193669
15
DitterichJ. (2006). Stochastic models of decisions about motion direction: behavior and physiology. Neural Netw. 19, 981–1012. 10.1016/j.neunet.2006.05.042
16
EdwardsW. (1965). Optimal strategies for seeking information: models for statistics, choice reaction times, and human information processing. J. Math. Psychol. 2, 312–329. 10.1016/0022-2496(65)90007-6
17
GaoJ.TortellR.McClellandJ. L. (2011). Dynamic integration of reward and stimulus information in perceptual decision-making. PLoS ONE6:e16749. 10.1371/journal.pone.0016749
18
GoldJ.ShadlenM. (2007). The neural basis of decision making. Ann. Rev. Neurosci. 30, 535–574. 10.1146/annurev.neuro.29.051605.113038
19
HeathR. (1981). A tandem random walk model for psychological discrimination. Br. J. Math. Stat. Psychol. 34, 76–92. 10.1111/j.2044-8317.1981.tb00619.x
20
JohnsonJ.BusemeyerJ. (2005). A dynamic, stochastic, computational model of preference reversal phenomena. Psychol. Rev. 112, 841–861. 10.1037/0033-295X.112.4.841
21
JonesM.DzhafarovE. N. (2013). Unfalsifiability and mutual translatability of major modeling schemes for choice reaction time. Psychol. Rev. 121, 1–32. 10.1037/a0034190
22
LamingD. (1968). Information Theory of Choice Reaction Times. Oxford: Academic Press.
23
LinkS.HeathR. (1975). A sequential theory of psychological discrimination. Psychometrika40, 77–105. 10.1007/BF02291481
24
NosofskyR.PalmeriT. (1997). An exemplar based random walk model of speeded classification. Psychol. Rev. 104, 266–300. 10.1037/0033-295X.104.2.266
25
PikeA. (1973). Response latency models for signal detection. Psychol. Rev. 80, 53–68. 10.1037/h0033871
26
PleskacT.BusemeyerJ. (2010). Two-stage dynamic signal detection: a theory of choice, decision time, and confidence. Acta Neurobiol. Exp. 117, 864–901. 10.1037/a0019737
27
RatcliffR. (1978). A theory of memory retrieval. Psychol. Rev. 85, 59–108. 10.1037/0033-295X.85.2.59
28
RatcliffR.SmithP. (2004). A comparison of sequential sampling models for two-choice reaction time. Psychol. Rev. 111, 333–367. 10.1037/0033-295X.111.2.333
29
RorieA.GaoJ.McClellandJ.NewsomeW. (2010). Integration of sensory and reward information during perceptual decision-making in lateral intraparietal cortex (lip) of the macaque monkey. PLoS ONE5:e9308. 10.1371/journal.pone.0009308
30
StoneM. (1960). Models for choice-reaction time. Psychometrika25, 251–260. 10.1007/BF02289729
31
UsherM.McClellandJ. (2001). The time course of perceptual choice: the leaky, competing accumulator model. Psychol. Rev. 108, 550–592. 10.1037/0033-295X.108.3.550
32
Van ZandtT.ColoniusH.ProctorR. (2000). A comparison of two reaction time models applied to perceptual matching. Psychon. Bull. Rev. 7, 208–256. 10.3758/BF03212980
33
VossA.NaglerM.LercheV. (2013). Diffusion models in experimental psychology. Exp. Psychol. 60, 385–402. 10.1027/1618-3169/a000218
Summary
Keywords
sequential sampling, multiattribute, attention time, time schedule, order schedule, finite time horizon, Ornstein-Uhlenbeck, Wiener
Citation
Diederich A and Oswald P (2014) Sequential sampling model for multiattribute choice alternatives with random attention time and processing order. Front. Hum. Neurosci. 8:697. doi: 10.3389/fnhum.2014.00697
Received
07 April 2014
Accepted
19 August 2014
Published
09 September 2014
Volume
8 - 2014
Edited by
José Antonio Díaz, Universidad de Granada, Spain
Reviewed by
Chris Donkin, University of New South Wales, Australia; José Antonio Díaz, Universidad de Granada, Spain
Copyright
© 2014 Diederich and Oswald.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Adele Diederich, Cognitive Psychology, School of Humanities and Social Sciences, Jacobs University, Campus Ring 1, Bremen 28759, Germany e-mail: a.diederich@jacobs-university.de
This article was submitted to the journal Frontiers in Human Neuroscience.
Disclaimer
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.