Physiological Signal Variability in hMT+ Reflects Performance on a Direction Discrimination Task

This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.

Our ability to perceive visual motion is critically dependent on the human motion complex (hMT+) in the dorsal visual stream. Extensive electrophysiological research in the monkey equivalent of this region has demonstrated how neuronal populations code for properties such as speed and direction, and that neurometric functions relate to psychometric functions within the individual monkey. In humans, the physiological correlates of inter-individual perceptual differences are still largely unknown. To address this question, we used functional magnetic resonance imaging (fMRI) while participants viewed translational motion in different directions, and we measured thresholds for direction discrimination of moving stimuli in a separate psychophysics experiment. After determining hMT+ in each participant with a functional localizer, we were able to decode the different directions of visual motion from it using pattern classification (PC). We also characterized the variability of fMRI signal in hMT+ during stimulus and rest periods with a generative model. Relating perceptual performance to physiology, individual direction discrimination thresholds were significantly correlated with the variability measure in hMT+, but not with PC accuracies. Individual differences in PC accuracy were driven by non-physiological sources of noise, such as head-movement, which makes this method a poor tool to investigate inter-individual differences. In contrast, variability analysis of the fMRI signal was robust to non-physiological noise, and variability characteristics in hMT+ correlated with psychophysical thresholds in the individual participants. Higher levels of fMRI signal variability compared to rest correlated with lower discrimination thresholds. This result is in line with theories on stochastic resonance in the context of neuronal populations, which suggest that endogenous or exogenous noise can increase the sensitivity of neuronal populations to incoming signals.

Introduction
Accurate perception of visual motion is a key function of the human brain, enabling us to interpret the world around us, to predict trajectories of moving objects and to steer vehicles and control locomotion. While many psychophysical and neurophysiological studies have revealed common processing of visual motion information across participants, perceptual capabilities can differ substantially between individuals (Halpern et al., 1999). Though classical behavioral experiments average out these differences to focus on the mean tendency, heterogeneity in visual motion perception can provide information on perceptual functioning. Describing performance profiles of motion perception might, for example, help to distinguish subgroups in phenomena like dyslexia or describe aging processes in the visual system (Talcott et al., 2000; Slaghuis and Ryan, 2006; Bennett et al., 2007; Billino et al., 2008). Exploring the relation between differences in performance on motion tasks and physiological signals in the visual dorsal stream can shed light on the relationships between cortical processing and perception.
Motion perception in humans critically depends on area hMT+ (also known as V5; for a review see Born and Bradley, 2005). Extensive research on its equivalent in monkeys (MT) has shown that neurons in this region are selective for the direction and speed of moving stimuli. Direction-sensitive neurons show columnar organization, with columns of smoothly changing preferred directions abutting columns of the opposite preferred direction (Born and Bradley, 2005). Relating neuronal characteristics to behavior, neurometric functions of single neurons were shown to correlate with psychometric functions in a direction discrimination task (Britten et al., 1992). More evidence for a direct link between MT neuronal properties and perception comes from studies which show that microstimulation can considerably bias performance (Cohen and Newsome, 2004) and that deteriorated neuronal speed and direction selectivity accompanies aging (Yang et al., 2009; Liang et al., 2010).
In humans, hMT+ lies in an anatomically variable region and shows variation in histological and functional anatomy across individuals (Dumoulin et al., 2000; Huk et al., 2002; Malikovic et al., 2007). Studies exploring neurophysiological properties of hMT+ have worked with exogenous variation of the stimulus (e.g., coherence of movement) to describe related modulations of the blood oxygen level dependent (BOLD) signal. Other studies have considered endogenous signal changes in hMT+ during the presentation of ambiguous stimuli, reflecting switches between percepts (for example Castelo-Branco et al., 2002; Muckli et al., 2002). The latter line of research shows the informative value of looking at endogenous fluctuations in hMT+, an approach we took in the current study to describe inter-individual physiological differences. While structural differences in the visual stream have been shown to correlate with individual psychophysical thresholds (Kanai and Rees, 2011), the connection between individual physiological properties of hMT+ and inter-individual differences in psychophysical tasks is less explored.
On a neuronal level, a possible reason for different perceptual sensitivity for direction could be the relative width of directional tuning curves. Sharper tuning curves lead to an unambiguous population signal in hMT+, which could be reflected in more distinct patterns for different directions of motion. On the behavioral level, this might translate into lower psychophysical thresholds when an individual has to make fine discriminations between different directions of motion (Purushothaman and Bradley, 2005; Liang et al., 2010). A potential candidate for revealing such physiological differences in fMRI is multi-voxel pattern analysis (MVPA), which is able to resolve fine-grained patterns of hMT+ organization invisible to univariate techniques (Kamitani and Tong, 2006). Individual differences in decoding accuracy might indicate the distinctiveness of the hMT+ population pattern and correlate with perceptual performance.
Another method which has been recently suggested as a good gauge for inter-individual comparisons is variability analysis of the BOLD signal (Garrett et al., 2010; Mohr and Nagel, 2010; Samanez-Larkin et al., 2010; Mennes et al., 2011). Measurements of variability aim to describe endogenous background fluctuations in the signal, which appear independent of the timecourse of the experimental manipulation. An important confound for accurately measuring such endogenous variability is that the relationship between the stimulus and the BOLD signal has to be described as precisely as possible. Only if this is achieved can one investigate whether the observed physiological variability has functional significance. A growing body of studies suggests that neurophysiological variability patterns can be understood as (functionally relevant) "signal" rather than (function-disturbing) "noise" (Faisal et al., 2008; McDonnell and Abbott, 2009; Garrett et al., 2010). Population signal variability in hMT+ could have different effects on performance accuracy: higher overall variability levels in hMT+ could be detrimental for discrimination performance if they had a destabilizing effect on the hMT+ population signal, as some authors suggest for the dopamine system (Winterer et al., 2006; Samanez-Larkin et al., 2010). Alternatively, a certain level of variability has been described to improve the sensitivity of systems, e.g., by stabilizing synchronized oscillating populations (Ermentrout et al., 2008), an observation described as stochastic resonance (Emberson et al., 2007; McIntosh et al., 2008; McDonnell and Abbott, 2009; Garrett et al., 2010).
In the present study, we set out to characterize brain activity that correlates with inter-individual variability in the accuracy of visual motion perception. We used multivariate pattern classification (PC) to describe hMT+ population patterns and we characterized the variability of the hMT+ BOLD signal during perception of motion in different directions. We investigated if these measures can serve as sensitive indicators for inter-individual performance differences on a motion direction discrimination task.

Participants
Fifteen healthy subjects gave written informed consent to participate in this study. The study was performed in accordance with the Declaration of Helsinki and approved by the ethics committee of the medical faculty of the Ludwig-Maximilians University Munich. Handedness was determined according to a 10-item excerpt of the "Handedness Inventory," coding the degree of handedness (+100: exclusively right handed, −100: exclusively left handed; Oldfield, 1971). It resulted in +100 in 13 subjects, one with +64 and one with +81. All subjects had normal or corrected-to-normal visual acuity as determined binocularly with a Snellen chart (0.8 at 6 m or better). None of the subjects were taking medication or had any history of neurological disease. All subjects understood the instructions without difficulty. One subject was excluded from the MR analysis due to excessive motion, resulting in a final cohort of 14 subjects (age range: 21-27, 6 female). These 14 subjects subsequently also took part in the psychophysical task on direction discrimination. Three subjects were excluded from psychophysical data analysis, as their measurements did not fulfill stability criteria as described below.

Apparatus
Stimuli were generated by a Fujitsu Siemens Pentium(R) 4 CPU at a frame rate of 85 Hz and displayed on a 40-cm × 30-cm Conrac Elektron CRT monitor driven by an NVIDIA Quadro Pro2 graphics card. The monitor resolution was set to 1280 × 1024. White and black pixels had a luminance of 25.3 and 0.1 cd/m², respectively, resulting in a maximum Michelson contrast of 99%. Experiments were conducted in a darkened room and subjects were seated 60 cm from the monitor.
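As a quick check of the reported contrast, the Michelson definition can be applied to the two luminance values given above. The symbols $L_{\max}$ and $L_{\min}$ are introduced here for illustration only and do not appear in the original text.

```latex
C_{\mathrm{Michelson}}
  = \frac{L_{\max} - L_{\min}}{L_{\max} + L_{\min}}
  = \frac{25.3 - 0.1}{25.3 + 0.1}
  = \frac{25.2}{25.4}
  \approx 0.992
```

This rounds to the stated 99%.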

Stimulus
Stimuli were programmed in Matlab 7.3 using the Psychophysics Toolbox extensions (Brainard, 1997). Coherent translational flow fields were presented in a circular aperture (11.4° × 11.4°), containing 300 white dots (diameter: 0.1°) at a time on a black background. All dots of one stimulus moved in an upward direction, either vertically or at a small tilt from the vertical, with a speed of 8°/s. Dots moving out of the aperture reappeared at new random positions (at the bottom of the aperture). Stimulus intensity was defined as the degree of tilt of the match stimulus (clockwise or anticlockwise) with respect to the upward (0°) reference stimulus.

Procedure
A two-alternative forced-choice task was used to determine individual thresholds and psychometric functions of direction sensitivity. Reference stimulus and match stimulus were presented consecutively.
To ensure data reliability, those subjects whose thresholds exceeded the fourth quartile were excluded from further analysis (2 of 14). Subjects were furthermore excluded if the fit of their psychometric function did not meet goodness-of-fit criteria in the sensitivity analysis. Summary statistics yielded good fits between the psychometric function and the data for 11 of the 12 remaining subjects. Ninety-five percent confidence intervals (CI) were calculated for the thresholds of each subject using the bootstrapping method (sampling with replacement, 1999 repetitions).
A one-way Kruskal-Wallis ANOVA tested for inter-individual differences in the behavioral thresholds, using the bootstrapped results.
Averaged reaction times (RT) were calculated as the arithmetic mean over the whole constant stimuli experiment. RT consistency was calculated as the SD over the experiment.

Experimental stimulus and procedure
Visual stimuli were projected with an LCD projector on a screen placed behind participants in the MR-scanner, which they viewed through a mirror placed above them at 45°. Vizard 3.0 (WorldViz) was used to produce coherent translational flow fields presented in a circular aperture (300 dots per display, aperture size 11.4° × 11.4°). Participants watched flow fields in one of four possible directions (0°, 90°, 180°, and 270°), shown in a randomized order, while fixating on a cross in the middle. Using a block design, 18 s task periods were interleaved with 10 s rest periods, during which subjects continued fixating. One block consisted of four trials, in which direction of motion was kept constant. Subjects performed a two-alternative forced-choice speed discrimination task, to keep their attention directly related to the movement of the stimulus while incidentally coding stimulus direction. In each trial, two consecutive
(stimulus duration: 1.5 s, inter-stimulus interval: 0.25 s, intertrial interval: 1.25 s). While fixating on the center of the aperture, subjects indicated with a button press whether the second stimulus (match) was tilted clockwise or anticlockwise with respect to the first, upward moving reference stimulus (compare Figure 1A). After initial training with feedback (60 trials), preliminary thresholds were determined by two repetitions of a 3-down-1-up adaptive double-staircase method (140 trials). The staircase measure was defined as stable if the slope of the linear fit from the last 12 reversals was less than 0.02. All but one subject achieved stable staircase measurements (this subject was also among the outliers in the measurement of constant stimuli, defined as subjects whose threshold exceeded the fourth quartile, see 2.2.4). Subsequently, the method of constant stimuli was used to sample the psychometric function; the sampling range was set around the threshold determined by the staircase measurements.
Tilt was varied between seven different intensities and each intensity was presented in 30 trials, resulting in a total of 210 trials. Subjects answered following the second stimulus and both speed and accuracy of the response were emphasized. Response times were measured from the moment the second stimulus ended until the moment of response. No feedback was given in staircase or constant stimulus measurements.
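The 3-down-1-up rule and the stability criterion on the last 12 reversals can be sketched as follows. This is a minimal illustration rather than the authors' Matlab code: the fixed step size, the use of the reversal index as the x-axis of the linear fit, and the comparison on the absolute slope are all assumptions, and the function names are hypothetical.

```python
def update_staircase(level, correct_in_row, was_correct, step):
    """One update of a 3-down-1-up rule.

    Returns the new tilt level and the new count of consecutive correct
    responses. `step` is a hypothetical fixed step size (degrees of tilt).
    """
    if not was_correct:
        return level + step, 0            # one error: make the task easier
    correct_in_row += 1
    if correct_in_row == 3:
        return max(level - step, 0.0), 0  # three correct in a row: harder
    return level, correct_in_row


def reversal_slope(reversal_levels, n_last=12):
    """Least-squares slope of the last `n_last` reversal levels.

    Assumption: the fit uses the reversal index (0, 1, 2, ...) as x.
    """
    y = list(reversal_levels)[-n_last:]
    n = len(y)
    if n < 2:
        raise ValueError("need at least two reversals to fit a line")
    mean_x = (n - 1) / 2.0
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in enumerate(y))
    return sxy / sxx


def is_stable(reversal_levels, n_last=12, max_abs_slope=0.02):
    """Stability criterion: slope magnitude below 0.02 (assumed absolute)."""
    return abs(reversal_slope(reversal_levels, n_last)) < max_abs_slope
```

A staircase that keeps oscillating around one level yields a slope near zero and passes the check, whereas one that is still drifting downward does not.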

Data analysis
Data were analyzed using the psignifit toolbox (Wichmann and Hill, 2001a,b) in Matlab 7.3. Final thresholds were obtained by fitting the percentage of correct responses determined by the method of constant stimuli with a cumulative Weibull distribution using a maximum likelihood procedure. Free parameters were threshold, slope, and lapse rate, which was kept variable between 0 and 0.5 (Wichmann and Hill, 2001b). Thresholds were taken as the 0.5 cut-off from the fitted function, corresponding roughly to a performance level of 75% correct (see Figure 1B).
A separate fMRI experiment was conducted to functionally localize hMT+ in each subject, according to previously established procedures (Morrone et al., 2000; Huk et al., 2002). Briefly, a stimulus of alternating moving and stationary dot patterns was presented in a circular aperture with interleaved rest periods. Moving dots (velocity: 17.1°/s) traveled toward and away from the fixation cross for 16 s, followed by a 16-s stationary dot field, and a 20-s blank screen. Subjects fixated at all times.
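To make the relation between the 0.5 cut-off and roughly 75% correct in the psychometric fit concrete, the sketch below evaluates a two-alternative forced-choice Weibull psychometric function. It assumes the common form ψ(x) = γ + (1 − γ − λ)·F(x) with guess rate γ = 0.5 and F(x) = 1 − exp(−(x/α)^β); the parameter names `alpha`, `beta`, and `lapse` are illustrative, and the code does not reproduce psignifit's fitting procedure.

```python
import math

GUESS_RATE = 0.5  # chance level in a two-alternative forced-choice task


def weibull_cdf(x, alpha, beta):
    """Cumulative Weibull F(x) = 1 - exp(-(x / alpha) ** beta), for x >= 0."""
    return 1.0 - math.exp(-((x / alpha) ** beta))


def proportion_correct(x, alpha, beta, lapse=0.0):
    """Predicted proportion correct at stimulus intensity x (tilt in degrees)."""
    return GUESS_RATE + (1.0 - GUESS_RATE - lapse) * weibull_cdf(x, alpha, beta)


def threshold_at_cut(alpha, beta, cut=0.5):
    """Intensity at which the underlying Weibull F reaches `cut`."""
    return alpha * (-math.log(1.0 - cut)) ** (1.0 / beta)
```

At the 0.5 cut the predicted performance is 0.5 + (0.5 − λ)·0.5 = 0.75 − λ/2, which is why the threshold corresponds only roughly to 75% correct: the exact value depends on the fitted lapse rate λ.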

fMRI acquisition
Imaging data were acquired on a 3T MR-Scanner (GE Signa HDx) with a standard 8-channel head coil using an echo-planar imaging sequence (TR: 2 s, echo time: 40 ms, flip angle: 70°) to acquire 25 slice volumes (interleaved acquisition, no gap), centered on the area of interest (medial temporal lobe). Voxel size was 1.75 mm × 1.75 mm × 2.4 mm. In total, 8 runs of 225 volumes for
stimuli were shown, a reference speed of 8°/s and a match stimulus of faster speed randomly distributed to the first or second presentation (stimulus duration: 1.5 s, inter-stimulus interval: 0.25 s, intertrial interval: 1.25 s, as for the psychophysical stimulus). Subjects reported the order-position of the faster stimulus with a button press (see Figure 2A). To keep task difficulty constant, individual speed discrimination thresholds were kept at a task performance of about 80% correct with an adaptive staircase procedure (QUEST, Watson and Pelli, 1983). Subjects performed 8 runs for a total of 32 repetitions per direction. Participants practiced the task outside the MR-scanner until they reached a satisfactory performance level (2 runs in which participants had to be error-free for 12 trials (fixed velocity difference) after which a staircase procedure started, on which subjects had to demonstrate a stable 80% correct threshold for at least 12 trials). They also practiced inside the bore of the MR-scanner, until they were comfortable conducting the task in a supine position.

Generating a stability index to quantify head motion
An index was designed to assess data stability for individual subjects. Head-movement causes image shifts between classifier training and test periods, which are detrimental for MVPA. Specifically, a movement in the middle of the acquisition is more detrimental than a movement at its start or end because there will be more cross-validation iterations in which the training set contains volumes misaligned with the test set's volumes. Our stability index (SI) roughly represents the longest stable stretch of head orientation during data acquisition. For each volume, the location of the center of hMT+ is estimated from the realignment parameters generated during image preprocessing. Each volume is compared with all others. At each comparison (e.g., between volumes i and j), the distance $d_{ij}$ between the estimated locations of hMT+ is calculated and a number, $A_{ij}$, is assigned describing how well aligned the pair of volumes is. This alignment score is
[equation not preserved]
The similarity $S_i$ of each volume with all the other volumes is summarized by summing over all of its alignment scores:
$$S_i = \sum_{j \neq i} A_{ij}$$
Finally, the whole recording session is given an SI, which is the score for the volume with the highest similarity score:
$$\mathrm{SI} = \max_i S_i$$
An example illustrates how the similarity value varies for different time points over a fictitious series of volumes. The example shows little head motion during the longest part of data acquisition and a single large head motion toward the end. Similarity values for volumes in the long stable period are higher than for those after the movement, because the volumes in the former are similar to many more timepoints than a volume taken after the movement.
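The procedure just described can be sketched in a few lines. Because the formula for the alignment score is not available in this text, the sketch substitutes a Gaussian kernel of the distance, exp(−d²/(2w²)), with a hypothetical width `width_mm`; this choice is an assumption made purely for illustration and is not the authors' definition. Only the summation into a per-volume similarity and the maximum over volumes follow the description above.

```python
import math


def stability_index(positions, width_mm=1.0):
    """Return (SI, per-volume similarity scores).

    positions : list of (x, y, z) estimates of the hMT+ center, one per volume
    width_mm  : hypothetical kernel width for the ASSUMED alignment score
    """
    n = len(positions)
    if n == 0:
        raise ValueError("no volumes given")
    similarity = []
    for i in range(n):
        s_i = 0.0
        for j in range(n):
            if i == j:
                continue  # each volume is compared with all others
            d_ij = math.dist(positions[i], positions[j])
            # Assumed alignment score: near 1 when aligned, near 0 when not.
            s_i += math.exp(-(d_ij ** 2) / (2.0 * width_mm ** 2))
        similarity.append(s_i)
    # The session score is the similarity of the best-aligned volume.
    return max(similarity), similarity
```

With eight volumes at one position followed by two volumes displaced by 5 mm, the volumes in the long stable stretch score close to 7 and those after the movement close to 1, mirroring the fictitious example in the text.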

Estimating BOLD signal statistics using a generative model
Variability of the timecourses of the 160 voxels from the above described hMT+ and V1 masks was assessed with a generative model for stimulus (SDstim) and rest periods (SDrest) (see Figure 4B for an illustration of the model). In addition, variability was estimated in a white matter region to quantify the contribution of non-physiological variability to noise, as those regions show little change in local metabolism (Rostrup et al., 2000). Spheres of 80 voxels in each hemisphere were selected from the anterior portion of the corona radiata (CR), as determined by the Harvard-Oxford structural atlas (see Figure 3C for an example). Timecourses were high-pass filtered before model analysis. The temporal properties of the BOLD signal were described by modeling all eight events within a stimulus block as box-cars (1.25 s duration), which is similar to modeling them as delta functions as used in event-related designs. Box-cars were then convolved with the canonical hemodynamic response function (HRF), to account for the latency of the BOLD signal. A mixing parameter $a_i$ was generated by this function and assigned to each volume i, describing the proportion of the signal recorded at that timepoint that was provided by the stimulus periods.
the experimental condition and 1 run of 132 volumes for the functional hMT+ localizer were acquired in each subject. In addition, a T1-weighted anatomical volume was acquired.

Defining hMT+ and V1 masks
To define functional regions of interest, fMRI data from the functional localizer were realigned to the first volume of the timeseries and smoothed with a kernel of 4 mm FWHM as implemented in SPM8 (Wellcome Department of Imaging Neuroscience, London, UK). Data were processed in individual space. A general linear model analysis comprising regressors for motion and stationary conditions was performed. Contrasting motion and stationary regressors identified clearly delineated clusters for hMT+ (FWE, p < 0.05 in all but two subjects, who showed hMT+ clusters only at p < 0.001 uncorrected). See Figure 2B for an example. The clusters from the two hemispheres were combined to make an hMT+ mask of voxels for further analysis.
The V1 mask was created using anatomical and functional constraints. V1 was determined anatomically using FreeSurfer's cortical parcellation algorithms in every subject, based on anatomical constraints described by Hinds et al. (2008). The final mask consisted of voxels within this anatomically defined V1 which showed significant activation in the functional localizer, using the motion-stationary contrast.

Multivariate pattern classification and preprocessing
We used the Princeton Multi-Voxel Pattern Analysis Toolbox (MVPA) to test whether voxels within hMT+ or V1 contained information about the direction of the stimulus. Data were prepared by unwarping, realigning (SPM8), and detrending (MVPA) the timeseries to remove linear trends and high-pass filtering (cutoff: 128 s) to remove low frequency noise. Z-scoring of response amplitudes for stimulus periods of individual voxels was applied to minimize baseline differences across runs and to reduce the impact of outliers. To account for the latency of the hemodynamic response, all stimulus onset times were shifted forward in time by 4 s as described previously (Kamitani and Tong, 2006). Data were neither smoothed nor spatially normalized, to avoid signal degradation and preserve inter-individual differences. The nine image volumes from each block of four trials were combined to generate a single average volume for each block.
The 160 voxels with the highest t-values in the functional localizer experiment were selected from the hMT+ or V1 masks respectively for decoding analysis. We tested different mask sizes, but found no improvement in classification accuracy beyond 160 voxels (see Figure 2C).
The lSVM (linear support vector machine) classifier was chosen as it provided stable results across participants without overfitting. It was used with a fixed cost, c = 1. Classification used standard leave-one-out cross-validation, in which the data set was divided, with seven runs in the training set and one run in the testing set. The test was repeated eight times, with each different run being the test set (Pereira et al., 2009). The accuracy scores reported represent the proportion of blocks in which the classifier correctly decoded directions.
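The leave-one-run-out scheme described above can be sketched as follows. The fold logic matches the description (each run serves once as the test set, the remaining runs form the training set, and accuracy is the proportion of correctly decoded blocks). The classifier in the sketch is a nearest-centroid stand-in chosen to keep the example dependency-free; it is not the linear SVM used in the study, and all function names are hypothetical.

```python
def leave_one_run_out(run_labels):
    """Yield (test_run, train_indices, test_indices), one fold per run."""
    for test_run in sorted(set(run_labels)):
        train_idx = [i for i, r in enumerate(run_labels) if r != test_run]
        test_idx = [i for i, r in enumerate(run_labels) if r == test_run]
        yield test_run, train_idx, test_idx


def nearest_centroid(train_x, train_y, test_x):
    """Stand-in classifier (NOT the linear SVM used in the study)."""
    groups = {}
    for x, y in zip(train_x, train_y):
        groups.setdefault(y, []).append(x)
    centroids = {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in groups.items()
    }

    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return [min(centroids, key=lambda y: sqdist(x, centroids[y])) for x in test_x]


def cross_validated_accuracy(patterns, labels, run_labels,
                             fit_predict=nearest_centroid):
    """Proportion of block patterns decoded correctly across all folds."""
    n_correct = 0
    for _, train_idx, test_idx in leave_one_run_out(run_labels):
        predicted = fit_predict(
            [patterns[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [patterns[i] for i in test_idx],
        )
        n_correct += sum(p == labels[i] for p, i in zip(predicted, test_idx))
    return n_correct / len(labels)
```

Any function with the same `(train_x, train_y, test_x)` signature can be passed as `fit_predict`, so a linear SVM from a machine-learning library could be substituted without changing the fold logic.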
For the generative model, both stimulus and rest periods were modeled as Gaussian distributions (stimulus: mean $m_s$, variance $s_s^2$; rest: mean $m_r$, variance $s_r^2$). The proportion of each of these distributions included in the final signal was estimated using maximum likelihood estimation (MLE). The estimate of the signal for a particular time point was calculated by finding the weighted sum of the two distributions. The mean and variance of the sum of two independently distributed Gaussian random variables were found by adding the means and variances of the two distributions. So the mean and variance of the new distribution for time point i could be written as: [equation not preserved]
Assuming independent and identically distributed sampling, the likelihood of the whole timeseries is: [equation not preserved] The log likelihood therefore is: [equation not preserved] The four parameters were estimated by maximizing this function with respect to each of them.
Slopes of the individual psychometric functions were heterogeneous as well and showed a negative correlation with threshold (the higher the slope, the lower the threshold). The width of subjects' 95% CI also differed between subjects. Average RT and RT consistency varied between subjects (max: 460 ms, min: 176 ms, SD: 67 ms, and SD max: 149 ms, SD min: 57 ms, respectively). RT means or variability did not correlate with individual direction discrimination thresholds.

Pattern classification is confounded by residual head motion and cannot explain perceptual differences
Replicating previous results (Kamitani and Tong, 2006), the linear SVM was able to discriminate between the four motion directions in hMT+ with above chance accuracy (m = 53 ± 13%, p < 0.002 using permutation testing) in all but one participant (see Figure 2D). Also consistent with previous results, classification accuracy was still higher in V1 (m = 65 ± 12%, p < 0.001).
To test if individual classification scores in hMT+ or V1 were related to performance on the direction discrimination tasks, a correlation analysis between scores and psychophysical thresholds (t0.5) was performed which showed no significant effect (hMT+: r = 0.15, p = 0.64; V1: r = 0.16, p = 0.64).
To investigate possible reasons for inter-individual differences in classification scores, we looked at their correlation with non-physiological noise of the MR signal. Classification accuracy correlated significantly with variability (SDstim) in the white matter region CR (r = −0.59, p < 0.03, Figure 3D), from which we concluded that the level of global noise, rather than local hMT+ noise, determined the differences in decoding success.
To test this, we looked at one of the largest methodological contributors to variability in MR signal: head-movement (Friston et al., 1996; Lund et al., 2005). A strong correlation was observed between the SI reflecting stability of the signal and classification accuracy (r = 0.62, p < 0.02, Figure 3B).
This implies that noise induced by subject movement is the predominant cause of differences in classification accuracy between subjects. Because PC is this sensitive to head-movement artifacts, PC differences between subjects are unlikely to be a viable method to investigate physiological differences between subjects.

A generative model for assessing BOLD signal variability
We used the arithmetic difference between SD of block and rest periods (SDdiff) to look at variability of the MR signal in hMT+ and V1 in individual participants. Being a relative measure, it was assumed to be largely resistant to movement-induced artifacts and background scanner noise, as those would influence both periods to the same extent.
Considerably more variability was found in the hMT+ region than in a white matter region (CR), both within stimulus blocks, and rest periods (SD was 30% higher in hMT+ and V1 than in CR). The SDdiff was also found to be larger in hMT+ and V1 than in CR (36%).
Importantly, subjects with a larger noise difference in hMT+ between rest and blocks did not have larger SI scores (r = −0.4810, p = 0.0695) which demonstrates that SDdiff is less affected by head motion.
Finally, the arithmetic difference between the SD within stimulus periods ($s_s$) and within rest periods ($s_r$) was calculated for each participant (SDdiff).
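The generative model described in the text can be written out as a short derivation. This is a reconstruction from the verbal description, not a copy of the published equations: it assumes that the signal $y_i$ at volume $i$ is the weighted sum $a_i x_s + (1 - a_i) x_r$ of two independent Gaussian variables, with the mixing parameter $a_i$ fixed by the HRF-convolved design, and it keeps the letters $m$ and $s$ used in the text for means and standard deviations.

```latex
% Assumed signal model at volume i
y_i = a_i x_s + (1 - a_i)\, x_r, \qquad
x_s \sim \mathcal{N}(m_s, s_s^2), \quad x_r \sim \mathcal{N}(m_r, s_r^2)

% Mean and variance of the weighted sum of independent Gaussians
m_i = a_i m_s + (1 - a_i)\, m_r, \qquad
s_i^2 = a_i^2 s_s^2 + (1 - a_i)^2 s_r^2

% Likelihood of the whole timeseries (independent samples)
L(m_s, s_s, m_r, s_r) = \prod_i \frac{1}{\sqrt{2\pi s_i^2}}
  \exp\!\left( -\frac{(y_i - m_i)^2}{2 s_i^2} \right)

% Log likelihood, maximized over the four parameters
\log L = -\frac{1}{2} \sum_i \left[ \log\!\left( 2\pi s_i^2 \right)
  + \frac{(y_i - m_i)^2}{s_i^2} \right]

% Variability measure used in the correlation analyses
\mathrm{SD_{diff}} = s_s - s_r
```

The squared weights in $s_i^2$ follow from the variance of a scaled random variable. If the original model weighted the variances linearly instead, only that line would change and the likelihood expressions would stay the same.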
Adaptation model. The above model is not the only conceivable description of the signal timecourse. An alternative model was tested to assess the stability of the results obtained with the first approach. In this alternative model, possible signal adaptation in hMT+ over a block was accounted for by introducing an exponential decay term with a time constant of 5 s. This reduced the (pre-HRF convolved) box-car signal exponentially while the stimulus was applied, and allowed it to recover using the same exponential function during the stimulus-off periods. A maximum reduction of 14% in the BOLD response due to the adaptation was assumed, based on electrophysiological studies (Petersen et al., 1985; Krekelberg et al., 2006). The model was tested at four values of the time constant: 5, 10, 20, and 40 s. All other parameters of the model were kept constant.
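One way to picture the adaptation term is as a first-order process that pulls the box-car amplitude toward a floor of 86% while the stimulus is on and lets it recover with the same time constant while the stimulus is off. The sketch below implements that reading on a regular time grid before any HRF convolution. The discrete update, the sampling step `dt`, and the function name are assumptions made for illustration, since the text does not give the exact implementation.

```python
import math


def adapted_boxcar(stimulus_on, dt, tau, max_reduction=0.14):
    """Box-car amplitude with exponential adaptation (before HRF convolution).

    stimulus_on   : sequence of booleans, one per time step
    dt            : time step in seconds (assumed regular sampling)
    tau           : time constant in seconds (5, 10, 20, or 40 in the text)
    max_reduction : asymptotic fractional reduction of the response (14%)
    """
    decay = math.exp(-dt / tau)
    reduction = 0.0  # current fractional reduction, between 0 and max_reduction
    signal = []
    for on in stimulus_on:
        signal.append(1.0 - reduction if on else 0.0)
        if on:
            # Adapt: reduction approaches max_reduction exponentially.
            reduction = max_reduction - (max_reduction - reduction) * decay
        else:
            # Recover: reduction decays back toward zero with the same tau.
            reduction *= decay
    return signal
```

For a stimulus that stays on much longer than `tau`, the amplitude settles near 0.86; after a long stimulus-off period it starts again near 1.0.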

Assessing eye movements from fMRI data
Although subjects were instructed to fixate, we were concerned that systematic eye movements might have occurred. It has been shown previously that eye movements can be estimated from fMRI data by analyzing the timecourse of fMRI signal in the vitreous of the eye (Beauchamp, 2003). We took this retrospective approach in those subjects in which the eyeball was partially contained in the field of view (FOV; in 3 of 11 participants, 33, 40, and 46% of the eyeball was contained in the FOV, see Figure 5). We defined a region of interest for the available section of the eyeball using FreeSurfer. The mean timecourse was extracted using MarsBaR in SPM8. To estimate the dependency between eyeball signal and the rest of the brain, we used the eyeball timecourse as a regressor in a GLM, as has been described previously (see Muckli et al., 2009, supplementary material).

Correlation of behavioral data with MR measurements
A Pearson correlation was calculated between individual thresholds from the behavioral experiment (t0.5) and the individual noise difference between block and rest periods as determined by the generative model (SDdiff). Additionally, a Spearman correlation was performed which also showed a significant correlation. The robustness of the significant Pearson correlation was estimated using bootstrapping, sampling with replacement with 2000 iterations, to produce 95% CI for the r distributions.
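The bootstrap estimate of the confidence interval for r can be sketched as below. The text specifies resampling with replacement and 2000 iterations; the sketch additionally assumes that subjects are resampled as (threshold, SDdiff) pairs and that the interval is taken from the percentiles of the bootstrap distribution, neither of which is stated explicitly. Function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_r_ci(x, y, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for Pearson's r (assumed percentile method)."""
    rng = random.Random(seed)
    n = len(x)
    rs = []
    while len(rs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample subject pairs
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # r is undefined when a resample is constant
        rs.append(pearson_r(xs, ys))
    rs.sort()
    lower = rs[round((1.0 - level) / 2.0 * n_boot)]
    upper = rs[round((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper
```

With only 11 subjects, individual resamples can contain many repeated pairs, so the bootstrap distribution of r tends to be wide; a fixed `seed` keeps the sketch reproducible.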

Inter-individual variability in direction discrimination
On average, direction discrimination thresholds were found to be similar to previous results (Westheimer and Wehrhahn, 1994). We observed significant differences in discrimination thresholds between subjects (Kruskal-Wallis ANOVA, p < 0.001). Post hoc analysis also revealed similarities in subgroups of subjects, in three subject pairs (see Figure 1C: there was no significant difference between subjects 1 and 4, between subjects 3 and 11, and between subjects 6 and 8). Note that data stem from 11 subjects, as three subjects did not reach reliability criteria explained in Materials and Methods.

Variability patterns in hMT+, but not V1, correlate with direction sensitivity
In the final analysis, we tested whether inter-individual variability of perceptual performance was correlated with variability characteristics of the hMT+ signal. As can be seen in Figure 4A, we observed a significant correlation between psychophysical threshold and SDdiff: participants with a greater SDdiff showed better behavioral performance (smaller thresholds) compared to participants with a smaller SDdiff (r = −0.61, p < 0.046, bootstrap CI 95% for r: −0.87 to −0.23). In other words, the larger the difference in variability (stimulus block minus rest), the lower the threshold the respective subject achieved. Similar correlation results were found for estimating SDdiff with an alternative model taking into account adaptation effects within blocks (r = −0.59, p < 0.058, bootstrap CI 95% for r: −0.84 to −0.20).
To investigate the specificity of this effect, we also correlated SDdiff in the CR with the psychophysical thresholds; this correlation was not significant (r = −0.35, p = 0.29, Figure 4C, lower panel). To test another region involved in direction coding, we correlated SDdiff of V1 with psychophysical thresholds. We did not observe a significant correlation in V1 either (r = −0.44, p = 0.181, Figure 4C, lower panel).
When the MR blocks were split according to stimulus direction, the effect remained significant for vertical but not horizontal motion (see Figure 4C, upper panel). Given that the stimuli in the psychophysics experiment were visual flow fields moving vertically upward, this might indicate that we are observing a phenomenon specific to vertical motion. Alternatively, one could interpret this observation as showing a general bias for vertical versus horizontal motion in hMT+. Further studies are necessary to clarify this point.

Eye movement analysis
The hMT+ is known to be influenced by eye movements (Dukelow et al., 2001; Acs and Greenlee, 2008). For this reason we instructed subjects to fixate, and they reported no difficulties in doing so. We cannot exclude, however, that eye movements occurred. To investigate this, we used a retrospective approach to assess whether the signal timecourse of the eyeballs taken from the EPI images correlated with fluctuations in hMT+. In the three subjects analyzed, we did not observe significant correlations of the eyeball signal timecourse with fluctuations in area hMT+ (see Figure 5).

Discussion
We demonstrate in the current study that inter-individual differences in performance on a visual motion direction discrimination task are correlated with signal variability characteristics of hMT+ but not V1. We furthermore show that PC, although able to decode direction from hMT+ within subjects, is a poor tool to describe inter-individual differences. Assessing the individual difference in BOLD signal variability between stimulus and rest periods is shown to be a better measure for such comparisons, as it is less influenced by non-physiological noise.
Differences in psychophysical thresholds between subjects show that perceptual sensitivity for motion direction is variable even within a homogeneous sample. Worse or better perception of motion stimuli in subjects with normal visual acuity has been suggested to reflect changes in higher level visual cortical areas rather than in the peripheral apparatus (Halpern et al., 1999).
Relatively little is known about the contribution of hMT+ to the worsening of direction perception (Bennett et al., 2007; Billino et al., 2008), although concepts like the "magnocellular theory" behind learning disorders such as dyslexia attribute a partial cause of the phenomenon to perceptual malfunctioning in the dorsal visual stream (Stein, 2001). Other authors have already suggested that BOLD signal variability over the whole brain (Garrett et al., 2010) or in specific regions like the nucleus accumbens (Samanez-Larkin et al., 2010) might have predictive value for degradation of function during aging. Our method of characterizing signal variability in hMT+ could help the clinical understanding of degraded motion perception in aging or in disorders like dyslexia.
Better performance in the psychophysical task suggests higher perceptual sensitivity in that particular participant and thereby most likely more effective processing in the brain. Our results show that variability characteristics in hMT+ but not V1 correlate with psychophysical thresholds. This might indicate that we observe individual differences not at the initial encoding of the visual information in V1, but rather during a more complex motion processing step in hMT+, an area thought to drive perceptual decisions in higher cortical areas.
We find that lower thresholds correlate with larger variability differences between stimulus and rest periods, which means higher variability levels in stimulus periods (but see the discussion below on model bias as a limitation to this claim). How could increased random physiological signal be beneficial for the sensitivity of a system? An influential theory based on the phenomenon of stochastic resonance proposes "[...]randomness that makes a non-linearity less detrimental to a signal" (McDonnell and Abbott, 2009). The theory asserts that a certain level of noise can actually be beneficial for signal transmission. Studies have shown that a certain level of endogenous noise can make synchronized oscillating populations more stable (Ermentrout et al., 2008; Ghosh et al., 2008) and benefits the emergence of fast oscillations in local field potentials (Brunel and Wang, 2003). For us this means that detecting higher levels of endogenous variability in the hMT+ population signal might actually reflect a more robust signal.
Other fMRI and EEG studies have described lower levels of cortical noise in senior subjects (Garrett et al., 2010) and children compared to young adults. This has been discussed as neurophysiological noise being inversely related to the well described U-shaped function of performance during the lifespan (MacDonald et al., 2006; McIntosh et al., 2008).
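The general idea of stochastic resonance can be illustrated with a toy threshold detector. The sketch below is a generic textbook-style demonstration under assumed parameters (a sinusoid of amplitude 0.8, a threshold of 1.0, Gaussian noise). It is not a model of hMT+ and not part of the analysis in this study. Without noise the signal never crosses the threshold, so the detector output carries no information. With moderate noise the crossings cluster around the signal peaks, and with very strong noise they become nearly independent of the signal.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if one input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def detector_fidelity(noise_sd, n_samples=4000, amplitude=0.8,
                      threshold=1.0, period=100, seed=1):
    """Correlation between a subthreshold sinusoid and a noisy threshold detector.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    signal = [amplitude * math.sin(2 * math.pi * t / period)
              for t in range(n_samples)]
    # The detector fires (1.0) only when signal plus noise exceeds the threshold.
    output = [1.0 if s + rng.gauss(0.0, noise_sd) > threshold else 0.0
              for s in signal]
    return pearson_r(signal, output)


# Low, intermediate, and high noise levels (arbitrary units).
fidelity = {sd: detector_fidelity(sd) for sd in (0.05, 0.5, 5.0)}
for sd, r in fidelity.items():
    print(f"noise SD {sd}: signal/output correlation {r:.3f}")
```

In this toy setting, the correlation between input and output is highest at the intermediate noise level, which is the signature usually associated with stochastic resonance.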
More specifically for our case of signal variability in the visual system, Bair et al. (2001), recording single neurons in macaque MT, describe that neuron pairs with high signal correlations also showed an increase in noise correlation. Clearly, given the coarse MR resolution, correlated noise would be more detectable at the fMRI level than uncorrelated noise. Our results suggest that greater variability differences between stimulus and rest periods might be beneficial for perceptual sensitivity in hMT+. Signal variability could have its basis in individual neurophysiological characteristics of hMT+.
Figure 4 | [...] Each volume's alpha value is used to estimate what proportion of the signal is from the stimulus and what proportion is from the rest period. These two distributions are sampled and their weighted sum is found. This is used to generate the distribution. The log likelihood of the real distribution being generated in this way is calculated. The parameters of the block and rest distributions are then altered to maximize this log likelihood. (C) Top graph: Comparing correlations for different stimulus directions. Splitting the block and rest periods in the four directions shown during the MR experiment, we observed small differences in correlation strength. Bottom graph: Comparing correlations over different brain regions. The correlation between noise difference and psychophysical threshold was smaller and not significant in the white matter region CR and V1. CR, corona radiata.
A confound that must be considered before interpreting our variability signal is signal fluctuations in hMT+ caused by eye movements. Participants were instructed to fixate, but as we used translation stimuli, automatic smooth pursuit had to be suppressed, which individual subjects might have achieved with more or less success over the timecourse of stimulation. However, we did not find that the signal timecourse from the eyeball ROIs, as measured in a subgroup of subjects, correlated with hMT+ signal fluctuations. Participants furthermore did not report difficulties fixating. Although we cannot exclude an influence of eye movements on the hMT+ signal, we believe it is not the strongest component causing the observed inter-individual differences in fluctuation of the hMT+ signal.
Figure 5 | Retrospective eye movement analysis. The mean timecourse was extracted from the eyeball ROIs (green) defined for three subjects. The timecourses were used as regressors in general linear models to assess correlated activity in the rest of the brain (blue). No overlap was found with the hMT+ masks (red).
Non-perceptual phenomena like individual motivation or attentional levels could also explain our results, influencing both physiological and perceptual measurements. It has been shown that the hMT+ BOLD signal is modulated by attention (Berman and Colby, 2002; Liu et al., 2011; Stoppel et al., 2011), and those participants able to apply attention most accurately to the stimulus are not only likely to do well in the psychophysics direction discrimination task, but may also show the BOLD signal variability we observe. Top-down control by areas described for internally evoked attention processes, like the intraparietal cortex and superior frontal cortex, could play a role in inducing the individual hMT+ signal variability we observe (Corbetta and Shulman, 2002).
From the methodological point of view, we demonstrate that PC is a poor method to determine between-subject differences. Although it could decode directional information from hMT+ activity in individual subjects, its ability to describe the relative difference between subjects was confounded by individual differences in head movement and scanner artifacts. Filtering out movement artifacts has been a challenge in the field of MR, as movement contributes the greatest amount of non-physiological noise (Friston et al., 1996; Lund et al., 2005). Although successful methods have been established for reducing the effect of head movement in univariate analysis based on the general linear model (Friston et al., 1996; Andersson et al., 2001), the specific influence of residual artifacts on new methods like PC is less well documented. Beyond this methodological confound, other evidence exists that classification accuracy may not be an appropriate metric to compare experimental conditions, brain regions, or participants. Smith et al. (2011), for example, suggest that classifier performance is influenced by factors other than neural specificity, such as response amplitude. Using MVPA for between-subject comparison might therefore require further corrections to guarantee comparability.
Head-movement artifacts can also confound measures of signal variability. Garrett et al. (2010) show that the predictability of a noise measure was greatly improved by extensive preprocessing of the data, beyond the conventional steps of realignment and normalization. Their methods included artifact correction via independent component analysis (Beckmann and Smith, 2004) and regressing out motion parameters. For future analysis of both PC and BOLD signal variability, this seems to be a fruitful approach. In the current study we used the relative value of noise differences between stimulus and rest periods, which minimizes the movement confound, as both periods should be equally affected by movement.
Critically, all assumptions on signal variability characteristics depend on the validity of our method to estimate the variability in the hMT+ signal. We used a generative model to estimate variability in the fMRI signal, modeling all eight events within a stimulus block separately as box-cars convolved with the HRF. The model furthermore accounted for the HRF-induced overlap of stimulus blocks and rest periods by assigning mixing values to each individual volume, based on the estimation of the relative contribution of stimulus and rest periods to the signal in that particular volume. Compared to other methods to assess variability in the BOLD signal, our method is quite complex. Garrett et al. (2010), for example, directly calculated the SD over blocks. Considering that the physiological response in hMT+ to our stimulus periods probably consisted of a sustained elevation in BOLD signal, overlaid with single spikes evoked by the eight single events, simply calculating the SD would not have allowed us to separate the endogenous from the stimulus induced variability. The current model is designed to account for the stimulus induced modulation of the BOLD signal, leaving us with the endogenous variability. Certain stimulus induced modulation of the BOLD signal might still not have been accounted for, such as repetition suppression, which might occur due to repetitive stimulus display during a block. An alternative model taking this adaptation effect of the signal into account yielded results similar to those of our initial model. Extending our model to include an adaptation effect therefore seems to have little consequence for our measure of variability. The model, in its current form, also has important limitations. By using non-uniformly distributed mixing parameters (e.g., the stimulation and rest periods), a bias is introduced, as the maximum likelihood estimator assigns more of the variance in the data to the more frequent parameter (the stimulus period). Critically, though, this does not affect the use of the estimates to compare subjects, as the bias will influence all subjects equally.
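The estimation principle of the generative model (per-volume mixing values, a weighted sum of a block and a rest distribution, and maximization of the log likelihood) can be sketched as follows. This is a simplified illustration under explicit assumptions, not the implementation used in this study: both distributions are assumed to be Gaussian, the optimizer is a basic compass search chosen for brevity, and the mixing pattern and simulated signal are hypothetical. Under the Gaussian assumption, a volume with mixing value alpha has mean alpha·mu_b + (1 − alpha)·mu_r and variance alpha²·sd_b² + (1 − alpha)²·sd_r².

```python
import math
import random


def log_likelihood(y, alpha, mu_b, sd_b, mu_r, sd_r):
    """Log likelihood of signal y given per-volume mixing values alpha.

    Assumption: each volume is a weighted sum of one draw from a Gaussian
    block distribution and one draw from a Gaussian rest distribution.
    """
    ll = 0.0
    for yt, a in zip(y, alpha):
        mean = a * mu_b + (1.0 - a) * mu_r
        var = (a * sd_b) ** 2 + ((1.0 - a) * sd_r) ** 2
        ll -= 0.5 * (math.log(2.0 * math.pi * var) + (yt - mean) ** 2 / var)
    return ll


def fit_block_rest(y, alpha, min_step=1e-4):
    """Maximize the log likelihood with a simple compass (pattern) search.

    The optimizer is a stand-in; the source does not specify which one was used.
    """
    n = len(y)
    mean_y = sum(y) / n
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    params = [mean_y, sd_y, mean_y, sd_y]  # mu_b, sd_b, mu_r, sd_r
    best = log_likelihood(y, alpha, *params)
    step = sd_y / 2.0
    while step > min_step:
        improved = False
        for i in range(4):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                if trial[1] <= 0.0 or trial[3] <= 0.0:
                    continue  # standard deviations must stay positive
                ll = log_likelihood(y, alpha, *trial)
                if ll > best:
                    params, best, improved = trial, ll, True
        if not improved:
            step /= 2.0  # refine the search once no neighbor improves the fit
    return params, best


# Hypothetical mixing pattern: one cycle of 20 volumes (rest, ramp, block, ramp).
cycle = [0.0] * 5 + [0.2, 0.6, 0.9] + [1.0] * 7 + [0.9, 0.6, 0.2] + [0.0] * 2
alpha = cycle * 30

# Simulated signal with known parameters (block SD 1.5, rest SD 0.5).
rng = random.Random(0)
y = [a * rng.gauss(2.0, 1.5) + (1.0 - a) * rng.gauss(0.0, 0.5) for a in alpha]

(mu_b, sd_b, mu_r, sd_r), ll = fit_block_rest(y, alpha)
sd_diff = sd_b - sd_r  # analogue of SDdiff: block minus rest variability
print("estimated SD block:", sd_b, "SD rest:", sd_r, "SDdiff:", sd_diff)
```

In this sketch the quantity of interest is the difference between the two fitted standard deviations, which plays the role of SDdiff.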
Another point to be considered is that the mixing parameter was calculated by convolving the stimulus events with the HRF, while the remaining signal was assumed to stem from the rest periods. Other ways to model the data are conceivable, e.g., convolving both rest and block-periods with the HRF. The ratio or sum of the two could then be used to model the data. Different modeling schemes remain to be explored systematically, to find which best estimates the contributions of the two distributions.
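How mixing values of this kind can be derived from the stimulus timing is sketched below. The example assumes a canonical double-gamma HRF (peak at about 6 s, undershoot at about 16 s, undershoot ratio 1/6), normalization by the maximum of the predicted response, and clipping to the range 0 to 1. The eight events per block are taken from the text, whereas the TR, event duration, onsets, and number of volumes are hypothetical. None of these choices is confirmed as the procedure used in this study.

```python
import math


def hrf(t, peak=6.0, undershoot=16.0, ratio=1.0 / 6.0):
    """Double-gamma HRF at time t in seconds (assumed canonical shape)."""
    if t <= 0:
        return 0.0
    g1 = t ** (peak - 1) * math.exp(-t) / math.gamma(peak)
    g2 = t ** (undershoot - 1) * math.exp(-t) / math.gamma(undershoot)
    return g1 - ratio * g2


def mixing_values(n_vols, tr, event_onsets, event_dur, dt=0.1):
    """Per-volume stimulus weight: box-car events convolved with the HRF.

    The response is scaled by its maximum and clipped to [0, 1]; this
    normalization is an assumption made for illustration.
    """
    n_fine = int(round(n_vols * tr / dt))
    boxcar = [0.0] * n_fine
    for onset in event_onsets:
        start = int(round(onset / dt))
        stop = int(round((onset + event_dur) / dt))
        for i in range(start, min(stop, n_fine)):
            boxcar[i] = 1.0
    kernel = [hrf(k * dt) * dt for k in range(int(32.0 / dt))]
    step = int(round(tr / dt))
    response = []
    for v in range(n_vols):
        i = v * step  # fine-grid index of this volume's acquisition time
        acc = 0.0
        for k, h in enumerate(kernel):
            if i - k < 0:
                break
            acc += boxcar[i - k] * h
        response.append(acc)
    peak_value = max(response)
    if peak_value <= 0:
        return [0.0] * n_vols
    return [min(1.0, max(0.0, r / peak_value)) for r in response]


# Hypothetical timing: eight 2 s events starting at 20 s, TR = 2 s, 40 volumes.
onsets = [20.0 + 2.0 * k for k in range(8)]
alpha = mixing_values(n_vols=40, tr=2.0, event_onsets=onsets, event_dur=2.0)
rest_weight = [1.0 - a for a in alpha]  # remainder attributed to rest
print([round(a, 2) for a in alpha])
```

Here the rest contribution is simply the remainder 1 − alpha, which mirrors the assumption described above that the remaining signal stems from the rest periods. Convolving the rest periods with the HRF as well would be one of the alternative schemes mentioned.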
Because we take the relative difference rather than the absolute variance as our measure, we are nevertheless confident that we observed physiological differences in hMT+ correlating with perceptual sensitivity. We conclude by suggesting that modeling the variability difference between rest and stimulus cycles is a promising method to investigate physiological differences between subjects. We furthermore suggest that perceptual sensitivity in direction discrimination might be associated with noise characteristics in hMT+. This could ultimately help to understand normal and pathological changes in visual motion perception.

Acknowledgment
This work was supported by the Deutsche Forschungsgemeinschaft (GRK1091, JA1087/1-1) and the Neuroinformatics and Computational Neuroscience Doctoral Training Centre, School of Informatics, University of Edinburgh. The authors thank Stefan Glasauer for helpful input, all the participants, and the scanning assistants for their support.