An Accumulation-of-Evidence Task Using Visual Pulses for Mice Navigating in Virtual Reality

The gradual accumulation of sensory evidence is a crucial component of perceptual decision making, but its neural mechanisms are still poorly understood. Given the wide availability of genetic and optical tools for mice, they can be useful model organisms for the study of these phenomena; however, behavioral tools are largely lacking. Here, we describe a new evidence-accumulation task for head-fixed mice navigating in a virtual reality (VR) environment. As they navigate down the stem of a virtual T-maze, they see brief pulses of visual evidence on either side, and retrieve a reward on the arm with the highest number of pulses. The pulses occur randomly with Poisson statistics, yielding a diverse yet well-controlled stimulus set, making the data conducive to a variety of computational approaches. A large number of mice of different genotypes were able to learn and consistently perform the task, at levels similar to rats in analogous tasks. They are sensitive to side differences of a single pulse, and their memory of the cues is stable over time. Moreover, using non-parametric as well as modeling approaches, we show that the mice indeed accumulate evidence: they use multiple pulses of evidence from throughout the cue region of the maze to make their decision, albeit with a small overweighting of earlier cues, and their performance is affected by the magnitude but not the duration of evidence. Additionally, analysis of the mice's running patterns revealed that trajectories are fairly stereotyped yet modulated by the amount of sensory evidence, suggesting that the navigational component of this task may provide a continuous readout correlated to the underlying cognitive variables. Our task, which can be readily integrated with state-of-the-art techniques, is thus a valuable tool to study the circuit mechanisms and dynamics underlying perceptual decision making, particularly under more complex behavioral contexts.

Distribution of average running speed across trials and sessions for animals with at least 1000 trials (n = 25). Arrowhead indicates population mean. (C) Distribution of standard deviations of average running speed across session-wide averages (gray, mean ± SEM: 6.7 ± 0.6 cm/s) and across trials within a session (green, mean ± SEM: 7.5 ± 0.6 cm/s). Arrowheads indicate population mean, and follow the same color code. (D) Distribution of standard deviations of average view angle across sessions and across trials within a session, calculated separately for right- and left-choice trials and then averaged (mean ± SEM: 4.9 ± 0.4˚ vs. 10.4 ± 0.9˚, respectively). Conventions as in C. (E) Correlation between average running speed and average overall performance across all sessions for each mouse (n = 25, r = 0.48, P = 0.02, Pearson's correlation). (F) Distribution of session-wise correlations between average running speed and average overall performance, showing that although there is an overall correlation between the two indicators, for any given mouse there is little correlation of speed and performance on individual sessions: only 4/25 mice had significant correlations between running speed and performance across sessions, and the sign of the correlation was negative for one of these mice (r = 0.06 ± 0.05, mean ± SEM). (G) Average frequency of different types of putative motor errors, belonging to five categories: trials with large-magnitude view angles during the cue period (> 60˚), trials with early turns (i.e. a turn immediately before the arm, resulting in a wall collision), trials in which the mouse first entered the opposite arm to its final choice, trials with speeds below the 10th percentile (defined separately for each mouse), and trials with traveled distance in excess of 110% of nominal maze length. Frequency was calculated separately for correct and error trials. Error bars, ± SEM. *** P < 0.001, n.s.: not significant.

Supplementary Figure 8 | Cue-triggered change in view angles for individual mice.
Each panel corresponds to data from a single mouse in this study; otherwise conventions are as in Figure 8D: cue-triggered change in the view angle θ relative to the average trajectory 〈θ〉 for trials of the same choice. The bands indicate the 1 standard deviation spread across trials, with the lines being the mean across trials.

[Table residue: total maze length (cm) for training stages T1–T11: 60, 170, 270, 330, 330, 330, 330, 330, 330, 330.]

[Movie legend fragment] Luminance has been increased for convenience. Right: equivalent top-down view of the virtual maze. The mouse avatar turns according to its recorded virtual view angle, and towers become gray outlines when they disappear from the maze. Movie has been slowed down by 2x.

Spatial Poisson distribution of tower locations
We used the following algorithm to randomly generate tower placement locations according to a Poisson process, i.e. with exponentially distributed inter-tower spacings subject to a minimum interval between towers:

Algorithm 1: Spatial Poisson process with refractory interval
Inputs: L, the maximum possible location for towers
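The sampling procedure can be sketched in Python as follows. This is a minimal illustration, not the published code: the names `rate` and `refractory` are ours, and offsetting the first tower by the refractory interval from the start of the cue region is an assumption.

```python
import random

def poisson_towers(max_pos, rate, refractory, seed=None):
    """Sample tower positions (in cm) from a spatial Poisson process with
    a refractory interval: each inter-tower spacing is the minimum allowed
    interval plus an exponential draw with the given rate (towers per cm)."""
    rng = random.Random(seed)
    positions = []
    pos = 0.0
    while True:
        # exponentially distributed spacing, shifted by the refractory interval
        pos += refractory + rng.expovariate(rate)
        if pos > max_pos:
            break
        positions.append(pos)
    return positions
```

Note that because each spacing is the refractory interval plus an exponential draw, the realized tower density is slightly below the nominal rate, so the rate would need to be adjusted upward to hit a target mean tower count.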

Exponential gain for translating treadmill movements into changes in virtual view angle
Learning to use a spherical treadmill to execute navigational movements in virtual reality constitutes a substantial portion of the training time for this task. One of the optimizations we performed to ease this process was to select treadmill-to-virtual-movement transformations such that mice can execute smooth motions without spending aversive amounts of time during turns into the arms of the T-maze. Historically, we had first utilized a constant gain (Harvey et al., 2012) for this transformation, but when this gain was low mice required a large amount of time to turn into the arms, encouraging them to initiate turns early (at the expense of accumulating later cues).
Conversely, when this gain was high, small postural shifts in the stem of the T-maze caused the virtual scene to wobble, which was undesirable in a task involving visual cues. These observations motivated the use of a nonlinear gain function that deemphasizes small, uncontrollable movements of the treadmill during running down the stem, but facilitates sharper turns at the end of the T-maze to encourage straighter view angle trajectories.
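A minimal sketch of such a nonlinear mapping is given below. The exact functional form and constants used in the task are not reproduced here; `g0` and `x0` are illustrative shape parameters. The idea is that a gain growing exponentially with displacement magnitude attenuates small treadmill jitter while amplifying deliberate, large turning movements.

```python
import math

def view_angle_delta(dx, g0=0.05, x0=0.5):
    """Map a lateral treadmill displacement dx into a change in virtual
    view angle. Near dx = 0 the response is shallow, suppressing postural
    jitter in the maze stem; for large |dx| it grows super-linearly,
    permitting sharp turns into the maze arms."""
    return math.copysign(g0 * math.expm1(abs(dx) / x0), dx)
```

The function is odd-symmetric, so leftward and rightward displacements are treated identically up to sign.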

Heuristic models: optimization technique
Here we defined several models where the choice of the mouse in a series of trials is assumed to be a Bernoulli process parameterized by a probability of making a choice to the right, $p_R$, that depends on a set of trial-specific quantities (see Materials and Methods).
We obtained best-fit parameters for each model by maximizing the log likelihood of the model for a given dataset comprising $n$ trials. Let the mouse's choice on the $i$th trial be $y_i$, which is 1 (0) if the mouse chose right (left); the likelihood of observing this choice is then given by the Bernoulli distribution $P(y_i) = p_{R,i}^{y_i} (1 - p_{R,i})^{1 - y_i}$. Taking the product of individual-trial likelihoods and then the logarithm, we obtain the log likelihood $\mathcal{L} = \sum_{i=1}^{n} \left[ y_i \log p_{R,i} + (1 - y_i) \log(1 - p_{R,i}) \right]$. Additionally, we subtracted L1 penalty terms for all free parameters of the model. For a model that includes all factors, with coefficient vector $\beta$ and regularization strength $\lambda$, the quantity that is maximized is therefore $\mathcal{L} - \lambda \lVert \beta \rVert_1$, where $\lVert \cdot \rVert_1$ is the L1 norm. This regularization is used as a method for selecting the most parsimonious model, in the sense of driving coefficients to zero when they do not result in a significantly better fit (Schmidt, 2010). It was also crucial for some models, particularly those that contain history-dependent lapse terms, because of the presence of multiple local maxima that otherwise made the problem ill-posed.
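As a sketch of this objective, assuming for concreteness a logistic link between the trial-specific factors and $p_R$ (the actual models use the forms given in Materials and Methods):

```python
import math

def penalized_loglik(beta, X, y, lam):
    """L1-penalized Bernoulli log likelihood. X is a list of per-trial
    factor vectors, y the list of choices (1 = right, 0 = left), and lam
    the regularization strength. Assumes a logistic link:
    p_R = 1 / (1 + exp(-beta . x))."""
    ll = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * x for b, x in zip(beta, xi))
        p = 1.0 / (1.0 + math.exp(-z))
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll - lam * sum(abs(b) for b in beta)
```

In practice this quantity would be maximized numerically with a generic optimizer; the L1 term is non-differentiable at zero, which is what drives uninformative coefficients exactly to zero.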
The regularization strength hyperparameter $\lambda$ was determined using a 3-fold cross-validation (CV) procedure to find the optimal model in terms of predictive power. A given dataset was first divided into thirds, and each third was used exactly once as a test set, with the remaining two thirds as its complementary training set. To equalize the highly different scales of the evidence factors compared to the rest of the factors, which are bounded within $[-1, 1]$, for each evidence coordinate $j$ the standard deviation $\sigma_j$ was computed using the trials in the training set and used to scale the evidence factors, $x_j \to x_j / \sigma_j$. In other words, the only change was that the corresponding coefficients were expressed in units of $\sigma_j$, where the $\sigma_j$ are constants derived using the training set (the same $\sigma_j$ are used for the test set, as it would be an unfair use of information if they were re-derived from the test set).
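The per-fold scaling step can be sketched as follows; this is a hypothetical helper of our own naming, while the real pipeline operates on the specific evidence factors of each model.

```python
import statistics

def standardize_fold(train_rows, test_rows):
    """Compute the standard deviation of each evidence factor on the
    training set only, then divide both training and test rows by those
    constants. Re-deriving the scales on the test set would leak
    test-set information into the fit."""
    sigma = [statistics.pstdev(col) or 1.0 for col in zip(*train_rows)]

    def scale(rows):
        return [[x / s for x, s in zip(row, sigma)] for row in rows]

    return scale(train_rows), scale(test_rows), sigma
```

The `or 1.0` guard leaves a factor unscaled if it happens to be constant in the training fold.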

Alternative strategy models: one-random-tower analysis details
For the analysis in Figure 4C, for each mouse we selected the top one-third of performance blocks, and only analyzed mice that had at least 200 trials in these blocks; we pooled together all trials from these blocks and mice. To test the 1-random-tower hypothesis, we reasoned that we should obtain a linear psychometric curve when the sum of towers (#R + #L) is fixed across trials. This is because the probability of going right under the 1-random-tower strategy is given by #R/(#R + #L) = 1/2 + (#R − #L)/(2(#R + #L)); if the denominator is fixed, this is linear in the difference of towers #R − #L, which is the standard x-axis of the psychometric curve. However, we have empirically observed sigmoid shapes for the psychometric curves of the mice's choices. Thus, we proceeded to quantify whether the psychometric curves of the mice's choices were different from that of the 1-random-tower model (as described in Materials and Methods). To obtain a dataset with fixed #R + #L, we next selected only trials where #R + #L = 12. This number was chosen because it was the largest value of #R + #L for which there were at least 4000 trials. We then found the psychometric curve for the actual data and for the 1-random-tower model. As expected, the 1-random-tower model results in a linear psychometric curve, whereas the actual data appear more sigmoidal. To determine whether these curves are significantly different from each other, we performed a shuffling test as follows: we generated 5000 bootstrapped pairs of curves by pooling, for all trials with a given #R − #L, the number of times the mice (or the model) chose right, and then randomly assigning the same number of right choices between the two curves, while keeping the total number of trials as in the original data. The sum of absolute differences between the two curves was used as the test statistic.
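The linearity argument can be checked directly. The sketch below reproduces only the model prediction (function names are ours), not the published analysis code:

```python
def one_random_tower_pright(n_r, n_l):
    """P(choose right) under the 1-random-tower strategy: the animal is
    assumed to pick one tower uniformly at random and turn toward its side."""
    return n_r / (n_r + n_l)

# With the total fixed at #R + #L = 12, the model's psychometric curve
# p(Delta) = 1/2 + Delta / 24 is exactly linear in Delta = #R - #L.
total = 12
curve = {2 * n_r - total: one_random_tower_pright(n_r, total - n_r)
         for n_r in range(0, total + 1)}
```

Constant slope across all values of #R − #L is what distinguishes this model's prediction from the sigmoidal curves observed in the data.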