ORIGINAL RESEARCH article
Neural basis of feature-based contextual effects on visual search behavior
- 1Centre for Neuroscience Studies, Queen's University, Kingston, ON, Canada
- 2Departments of Biomedical and Molecular Sciences and Psychology, Queen's University, Kingston, ON, Canada
Searching for a visual object is known to be adaptable to context, and it is thought to result from the selection of neural representations distributed on a visual salience map, wherein stimulus-driven and goal-directed signals are combined. Here we investigated the neural basis of this adaptability by recording superior colliculus (SC) neurons while three female rhesus monkeys (Macaca mulatta) searched with saccadic eye movements for a target presented in an array of visual stimuli whose feature composition varied from trial to trial. We found that sensory-motor activity associated with distracters was enhanced or suppressed depending on the search array composition and that it corresponded to the monkey's search strategy, as assessed by the distribution of the occasional errant saccades. This feature-related modulation occurred independently from the saccade goal and facilitated the process of saccade target selection. We also observed feature-related enhancement in the activity associated with distracters that had been the search target during the previous session. Consistent with recurrent processing, both feature-related neuronal modulations occurred more than 60 ms after the onset of the visually evoked responses, and their near coincidence with the time of saccade target selection suggests that they are integral to this process. These results suggest that SC neuronal activity is shaped by the visual context as dictated by both stimulus-driven and goal-directed signals. Given the close proximity of the SC to the motor circuit, our findings suggest a direct link between perception and action and no need for distinct salience and motor maps.
Our ability to select a visual object from amongst numerous alternatives is thought to be guided by a visual salience map (Cave and Wolfe, 1990; Findlay and Walker, 1999; Itti and Koch, 2000). Visual representations on the salience map are the result of both stimulus-driven (bottom-up) inputs as well as goal-directed (top-down) signals. The magnitude of each representation is related to the probability of selecting that object for further processing and, in the case of overt visual search, as a target for the next saccadic eye movement. Stimulus discriminability is crucial to determining visual behavior when searching for a target stimulus amongst distracter stimuli (Treisman, 1988; Duncan and Humphreys, 1989; Wolfe et al., 1989): if discriminability is high, the representation of the search target on the visual salience map is significantly greater than those of distracter stimuli, resulting in a target that seems to “pop-out.” If target discriminability is low, as when the target is defined by a conjunction of features, the representations are more similar and subjects are less likely to select the search target. Visual behavior is also influenced by prior information such as knowledge of target features and recent past experience. Visual search studies have shown that both humans and monkeys are more likely to make saccades to stimuli that share features with the target of the current search session (Findlay, 1997; Motter and Belky, 1998) or of the previous one (Bichot and Schall, 1999b).
Neurophysiological studies of visual search in monkeys have demonstrated how the visual salience map is distributed across a network of sensory-motor structures that include the lateral intraparietal (LIP) area (Ipata et al., 2006; Thomas and Paré, 2007), the frontal eye field (FEF; Thompson et al., 1996; Thompson and Bichot, 2005), as well as the superior colliculus (SC; McPeek and Keller, 2002; Shen and Paré, 2007; Shen et al., 2011). These neurons give visually evoked responses to stimuli falling in their receptive fields and subsequently signal the location of the saccade target before saccades are made. Neurons in area LIP and FEF are also influenced by prior information, showing feature-based modulations that are task-relevant (Bichot and Schall, 1999a; Toth and Assad, 2002) as well as memory-related (Bichot and Schall, 1999a), i.e., priming of distracter stimuli whose features were those of the target in the previous search session.
Stimulus discriminability is not only determined by individual stimulus features but also by the spatial organization of the search display. Perceptual grouping of objects, for instance, can facilitate visual search (Verghese and Nakayama, 1994; Hegde and Felleman, 1999; also see Duncan, 1995) as can the global composition of the search display as shown in studies using the variable distracter-ratio search task. In this task, the ratio of distracters that share a feature with the search target is manipulated (Shen et al., 2000; Shen and Paré, 2006). Responses are fastest when there are few of one distracter type, with incorrect saccades being biased toward those distracters. Humans and monkeys naturally adapt their visual search strategies to visual context on a trial-by-trial basis (also see Egeth et al., 1984; Zohary and Hochstein, 1989; Poisson and Wilkinson, 1992; Kaptein et al., 1995; Bacon and Egeth, 1997; Sobel and Cave, 2002). This flexible allocation of resources can facilitate visual search by limiting the selection process to stimuli that are most relevant.
How the neural process of saccade target selection is influenced by visual context is not known. Studies of contextual influences on visual processing using figure-ground segregation (Zipser et al., 1996; Lamme et al., 1999; Supèr et al., 2001) and curve tracing (Khayat et al., 2006, 2009) tasks have indicated that contextual modulation in visual cortex is most likely mediated by (top-down) cortico-cortical recurrent processing. This interpretation is supported by the late onset of the neuronal modulation (∼70–90 ms following the visually evoked responses), which is in stark contrast with the early modulation (∼20 ms) associated with (bottom-up) feed-forward processing that has been observed in studies using pop-out stimuli (Knierim and van Essen, 1992; Nothdurft et al., 1999, 2000; Burrows and Moore, 2009). In this study, we investigated how the influence of visual context is exerted on the visual salience map by recording the activity of SC sensory-motor neurons while monkeys performed a variable distracter-ratio search task. We chose to record neurons within the SC because this structure has direct motor outputs to the saccade generating system (Rodgers et al., 2006) and its neuronal activity can, therefore, more closely impact saccade target selection. We tested the hypothesis that the influence of visual context has a goal-directed component by measuring the timing of the corresponding neuronal modulation and comparing it to those associated with saccade target selection as well as with the priming of stimulus features that occurs between experimental sessions.
Materials and Methods
Data were collected from three female rhesus monkeys (Macaca mulatta, 4.5–6.0 kg, 8–10 years) cared for under experimental protocols approved by the Queen's University Animal Care Committee and in accordance with the Canadian Council on Animal Care guidelines. The surgical procedure, stimulus presentation, and data acquisition have been described previously (Shen and Paré, 2006, 2007). Monkeys were housed in large enclosures (Clarence et al., 2006) and received both antibiotics and analgesic medications during the post-surgery recovery period, after which they were trained with operant conditioning and positive reinforcement to perform fixation and saccade tasks for a liquid reward until satiation. The extra-cellular activity of single SC neurons was recorded using previously described methods (Paré and Wurtz, 2001), and spike occurrences were sampled at 1 kHz.
Monkeys first performed a visual delayed saccade task to characterize the discharge properties of the neurons and determine their receptive fields (Paré and Wurtz, 2001). This task temporally dissociated visual stimulation from saccade execution by introducing a delay of 500–1000 ms between the presentation of a stimulus in a neuron's receptive field and the disappearance of the fixation stimulus, which acted as the signal for the monkeys to make a saccade to that stimulus. Neurons were included in our sample if they exhibited both visually evoked responses (≥10 sp/s) and saccade-related activity (≥100 sp/s) in this task. We, therefore, focused on sensory-motor neurons whose activity may reflect the selection of the saccade target or the programming of the targeting saccade, or both. These neurons are candidate units in sensory-motor processing within the visual salience map for eye movements.
The main data of this report were collected while monkeys performed a visual conjunction search task in which the stimuli were conjunctions of a color (red or green) and a shape (circle or square). On each trial, monkeys initially fixated a central stimulus that acted as a cue for the search target. This stimulus disappeared with the simultaneous appearance of a concentric array of one target and 11 distracter stimuli. On each trial, either the target or a distracter stimulus appeared randomly in the center of the neuron's receptive field, and all other stimuli were randomly positioned equidistant from the central stimulus position and from each neighboring stimulus. The ratio of same-color/same-shape distracters on each trial was varied randomly between 2/9, 6/5, and 9/2 (Figure 1A). Monkeys were rewarded maximally for fixating the location of the pseudo-randomly positioned target stimulus within 500 ms of the display presentation, and were partially rewarded (<0.33 of the maximum amount along with the reinforcement tone) for locating it with multiple saccades within 2000 ms of the initial eye movement. Trials were deemed correct if the monkey successfully foveated the target after a single saccade.
Figure 1. Task and behavior. (A) The variable distracter-ratio visual search task. Displays consisted of three possible same-color/same-shape distracter-ratios: few same-color (2/9, left), balanced same-color (6/5, middle), and many same-color (9/2, right). All example displays show a correct single saccade (arrow) made to the target. (B) Average index of behavioral salience (observed errors to same-color distracters—errors expected by chance) for three monkeys performing the variable distracter-ratio task. Red: few same-color; blue: balanced; green: many. All values are mean ± SE.
A single block of at least 600 search trials was performed on a given day. As in our previous study, the target remained the same within a single day's session but the target of each new session shared one feature with the target from the previous session (Shen and Paré, 2006; also see Bichot and Schall, 1999a). In addition, to familiarize monkeys with the target in the conjunction search task and quantitatively delimit each neuron's receptive field, each block of search trials was preceded by 120 trials of a simple detection task. In this task, the search target first appeared as the fixation stimulus and then stepped to one of the 12 positions used in the conjunction search task.
Only trials with stimulus-directed saccades were included in the data analysis. For each experimental session, we quantified the animals' eye movement choices (i.e., behavioral strategy) in each display type with a behavioral salience index, where behavioral salience = (observed proportion of errors to same-color distracters) − (proportion of errors to same-color distracters expected by chance) (see Shen and Paré, 2006). Chance was equal to 0.182 (2/11), 0.545 (6/11), and 0.818 (9/11) in few, balanced, and many same-color displays, respectively.
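The index defined above can be sketched in a few lines. This is only an illustration of the arithmetic: the function name and its arguments are hypothetical and are not taken from the original analysis code.

```python
def behavioral_salience(errors_to_same_color, total_errors,
                        n_same_color, n_distracters=11):
    """Observed proportion of errant saccades landing on same-color
    distracters minus the proportion expected by chance."""
    if total_errors <= 0:
        raise ValueError("at least one error trial is required")
    observed = errors_to_same_color / total_errors
    chance = n_same_color / n_distracters  # e.g. 2/11 in few same-color displays
    return observed - chance


# Chance levels for the three display types (2, 6, or 9 of 11 distracters).
for n_same_color in (2, 6, 9):
    print(n_same_color, round(n_same_color / 11, 3))
```

The loop reproduces the chance levels stated in the text (0.182, 0.545, and 0.818).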
Details of the neuronal data analyses have been described previously (Thompson et al., 1996; Shen and Paré, 2007; Thomas and Paré, 2007). Neuronal activity was quantified as continuously varying spike density functions aligned on the onset of either the visual stimulus presentation (stimulus aligned) or the first saccade (saccade aligned). Spike density functions were constructed by convolving spike trains with a combination of growth (1 ms time constant) and decay (20 ms time constant) exponential functions that resembled a postsynaptic potential (Thompson et al., 1996).
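As a minimal sketch of this step, the kernel below is assumed to take the form (1 − exp(−t/τg)) · exp(−t/τd) for t ≥ 0, normalized to unit area. The text gives only the two time constants, so this functional form, the 200-ms kernel length, and the function names are illustrative assumptions.

```python
import math


def psp_kernel(tau_growth_ms=1.0, tau_decay_ms=20.0, length_ms=200):
    """Causal, PSP-like kernel sampled at 1 ms and normalized to unit area."""
    kernel = [(1.0 - math.exp(-t / tau_growth_ms)) * math.exp(-t / tau_decay_ms)
              for t in range(length_ms)]
    area = sum(kernel)
    return [value / area for value in kernel]


def spike_density(spike_times_ms, duration_ms, kernel=None):
    """Convolve integer-millisecond spike times with the kernel.
    Returns one value per millisecond, in spikes per second."""
    kernel = psp_kernel() if kernel is None else kernel
    sdf = [0.0] * duration_ms
    for spike in spike_times_ms:
        for lag, weight in enumerate(kernel):
            t = spike + lag
            if 0 <= t < duration_ms:
                sdf[t] += weight * 1000.0  # spikes/ms -> spikes/s
    return sdf
```

Because the kernel is zero at and before the time of each spike, a spike influences the density function only after it occurs, which matters when estimating the timing of neuronal events.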
We used the data collected in the delayed saccade task to calculate a visuo-movement index (VMI; Shen and Paré, 2007), which quantified the relative magnitude of visually evoked and saccade-related activity of each neuron: VMI = (vis − mov)/(vis + mov), where vis is the mean discharge rate over the first 100 ms following stimulus presentation (mean = 61 sp/s; range = 12–231 sp/s), and mov is the peak discharge rate within ± 40 ms of saccade onset (mean = 334 sp/s; range = 100–706 sp/s). The peak saccade-related activity occurred on average 2.4 ± 0.7 ms before saccade onset. Neurons with stronger visually evoked activity have VMIs closer to 1.0 and those with stronger saccade-related activity have VMIs closer to −1.0. The mean VMI value of the neuronal sample was −0.66 ± 0.03 (range = −0.19 to −0.97). Several neurons also exhibited sustained activity during the delay period of this task, and that activity was quantified as the mean discharge rate over the last 300 ms of the delay period (Paré and Wurtz, 2001).
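The index can be computed directly from the two rates. The sketch below uses a hypothetical function name and, for illustration, the sample means reported above as inputs.

```python
def visuo_movement_index(vis, mov):
    """VMI = (vis - mov) / (vis + mov), with both rates in sp/s."""
    if vis + mov <= 0:
        raise ValueError("the summed discharge rate must be positive")
    return (vis - mov) / (vis + mov)


# Illustrative inputs: the sample means reported in the text (61 and 334 sp/s).
print(round(visuo_movement_index(61.0, 334.0), 2))
```

This prints -0.69. The value differs slightly from the reported sample mean of −0.66, as would be expected if that mean is the average of per-neuron indices rather than the index of the averaged rates.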
We used the now common method (Thompson et al., 1996; Shen and Paré, 2006, 2007) derived from Signal Detection Theory to quantify the separation between a neuron's activity associated with the target and that associated with a distracter (trials in which the target was located in one of the seven opposing locations). Receiver operating characteristic (ROC) curves were built for successive 5-ms intervals by plotting the probability that the rate of target-related activity is greater than a criterion rate as a function of the probability that the rate of distracter-related activity is greater than that same criterion. The area under each of these curves (auROC) was plotted as a function of time, and the time course of neuronal discrimination was captured by the Weibull function that best fit the data (mean R² = 0.96, range: 0.82–1). More than 15 target and 75 distracter trials were used to construct the ROC curves for each neuron. Best-fit functions were calculated only with activity occurring before the initiation of saccades landing correctly on target, and they were terminated when there were fewer than five target or distracter (correct) trials. The ranges of response latencies in target and distracter trials were matched across all conditions. The discrimination magnitude of each neuron was defined as the upper limit of the best-fit function, and the point at which this function reached a criterion value of 0.75 was taken as the neuron's discrimination time. Discrimination time occurred at least 13 ms before saccade initiation, and it was used to center a 25-ms analysis epoch to contrast the distracter-related modulation in neuronal activity: Modulation Index (MI) = (color − shape)/(color + shape) (Motter, 1994a; Treue and Martinez-Trujillo, 1999). To be included in any analysis, neurons had to contribute a minimum of five trials in each of the conditions considered.
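The core of this analysis can be sketched as follows. The area under the ROC curve is computed here through its equivalence with the probability that a randomly drawn target rate exceeds a randomly drawn distracter rate (ties counted as one half), rather than by sweeping a criterion. The Weibull parameterization W(t) = γ − (γ − δ) · exp(−(t/α)^β) is an assumption, since the text does not give the exact form, and the fitting step itself is omitted. All names are hypothetical.

```python
import math


def au_roc(target_rates, distracter_rates):
    """Area under the ROC curve for one time bin."""
    wins = 0.0
    for t in target_rates:
        for d in distracter_rates:
            if t > d:
                wins += 1.0
            elif t == d:
                wins += 0.5
    return wins / (len(target_rates) * len(distracter_rates))


def weibull(t, alpha, beta, gamma, delta):
    """Assumed form: delta is the lower limit, gamma the upper limit
    (the discrimination magnitude)."""
    return gamma - (gamma - delta) * math.exp(-((t / alpha) ** beta))


def discrimination_time(alpha, beta, gamma, delta, criterion=0.75):
    """Time at which the fitted function crosses the criterion,
    or None if the criterion is never reached."""
    if not (delta < criterion < gamma):
        return None
    ratio = (gamma - criterion) / (gamma - delta)
    return alpha * (-math.log(ratio)) ** (1.0 / beta)
```

Under this parameterization, a neuron whose upper limit γ stays below 0.75 has no discrimination time, consistent with the requirement that the best-fit function reach the criterion.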
To estimate the time at which distracter-related activities became significantly different from each other we conducted successive rank-sum tests on the same 5-ms intervals used for the ROC analysis (Thomas and Paré, 2007).
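A simplified sketch of such a sliding test is given below. It uses a two-sided rank-sum test with a normal approximation and no tie correction, and it reports the first bin in which the salient distracter activity is significantly greater. Whether the original analysis required several consecutive significant bins is not stated, so the single-bin rule is an assumption, as are the function names.

```python
import math


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation)."""
    pooled = sorted((value, group)
                    for group, sample in enumerate((x, y)) for value in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_x += average_rank * sum(1 for k in range(i, j)
                                         if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))


def first_significant_bin(salient, non_salient, bin_ms=5, alpha=0.05):
    """salient / non_salient: per-trial lists of rates, one rate per bin.
    Returns the start time (ms) of the first significant bin, or None."""
    n_bins = min(len(trial) for trial in salient + non_salient)
    for b in range(n_bins):
        s = [trial[b] for trial in salient]
        n = [trial[b] for trial in non_salient]
        greater = sum(s) / len(s) > sum(n) / len(n)
        if greater and rank_sum_p(s, n) < alpha:
            return b * bin_ms
    return None
```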
To examine correlations between neuronal activity variables as well as between neuronal activity and behavior, we conducted linear regressions, and the significance of the regression slopes was determined with a permutation test using sampling with replacement and 10,000 iterations (Efron and Tibshirani, 1993).
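One reading of this procedure is sketched below: the null distribution of the slope is built by resampling the dependent variable with replacement, which breaks its pairing with the predictor. This is an interpretation of the brief description above, not the original code. A strict permutation test would instead shuffle the values without replacement. The function names and the seed are hypothetical.

```python
import random


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def slope_p_value(x, y, n_iter=10000, seed=0):
    """Two-sided p-value for the slope under a resampling null."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    hits = 0
    for _ in range(n_iter):
        y_null = rng.choices(y, k=len(y))  # sampling with replacement
        if abs(slope(x, y_null)) >= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)  # never returns exactly zero
```

With 10,000 iterations, the smallest p-value this estimator can return is about 0.0001, which corresponds to the p < 0.0001 resolution limit of such a test.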
Visual Search Strategy
We recorded the activity of 42 sensory-motor neurons within the intermediate layers of SC while three rhesus monkeys (monkey F: 19; G: 10; H: 13) reported with a saccadic eye movement which of the stimuli in a visual search display had a unique conjunction of features. The distracter stimuli in this task could share either the color or the shape of the search target, and the ratio of same-color/same-shape distracters was randomly varied between 2/9, 6/5, and 9/2 from trial to trial (Figure 1A). Monkeys correctly foveated the target with a single saccade with a mean (± SE) probability of 0.71 ± 0.02 and the latencies of their correct saccades averaged 192 ± 5 ms. We quantified the effects of display composition on the animals' behavioral strategy (i.e., eye movement choices) in each experimental session with a relative index of behavioral salience (see Materials and Methods; also see Shen and Paré, 2006). A positive index indicates that same-color distracters attracted more errant saccades than expected by chance (i.e., they were the more salient distracters), whereas a negative index indicates that they attracted fewer (i.e., same-shape distracters were the more salient). Monkeys biased their search to the color dimension when same-color distracters were few (on average, behavioral salience: 0.42 ± 0.04; Figure 1B) and away from it when there were many (−0.226 ± 0.040). These behavioral biases were significantly different from chance (t-test, p < 0.05); no preference for either feature dimension was observed when distracter types were balanced (0.090 ± 0.035; p = 0.08). Note that these biases were determined only from the small proportion of errors that monkeys made (<30%). The overall proportions of first saccades to each distracter type were similar and relatively small across displays (on average: few same-color displays, same-color: 0.147 ± 0.013 vs. same-shape: 0.108 ± 0.015; balanced displays, 0.186 ± 0.016 vs. 0.115 ± 0.014; many same-color displays, 0.198 ± 0.018 vs. 0.161 ± 0.020).
Those distracters that were most unique in the unbalanced displays, therefore, did not appear to automatically capture attention because of their physical salience (Yantis and Jonides, 1984; Theeuwes, 1990; Yantis and Jonides, 1990; Folk et al., 1992).
These results suggest that our monkeys, like humans (e.g., Egeth et al., 1984; Shen et al., 2000), limited their search to the features most relevant to the task on a trial-by-trial basis, as dictated by the visual context.
Behavioral Salience and Saccade Target Selection
The activity of all SC neurons in the variable distracter-ratio task was initially indiscriminate, but it evolved to signal the target location: the activity associated with the target became enhanced and that associated with a distracter became suppressed. Consistent with previous studies (McPeek and Keller, 2002; Shen and Paré, 2007), visually evoked responses were not selective for either the type or feature of the stimulus presented in their receptive field across search displays (see Table 1). This was also the case within the few same-color and the many same-color displays, when one distracter type was estimated to be more behaviorally salient. When and how well sensory-motor neurons selected the saccade target in each of the distracter-ratio displays was quantified with a discriminant analysis based on Signal Detection Theory, which computed the probability that an ideal observer could discriminate neuronal activity associated with target and distracter stimuli (Thompson et al., 1996; Shen and Paré, 2007). On average, neurons discriminated the target from any distracter 125 ± 3 ms following the onset of the search display and 61 ± 4 ms before the onset of the initial saccade that correctly landed on the search target. At these times, all 42 neurons had statistically greater activity associated with the target than that associated with distracters (rank-sum test, p < 0.01). The discrimination magnitude of these neurons averaged 0.94 ± 0.01. Because this analysis only considers correct trials, such a high discrimination is expected from neurons whose activity is thought to reflect the process of selecting the search target and play a critical role in guiding behavioral choice (Schall, 2003).
Consistent with our previous reports (Shen and Paré, 2007; Shen et al., 2011) as well as with observations in FEF (Thompson et al., 2005a; Trageser et al., 2008), the discrimination of SC sensory-motor neurons generally exceeded the overall accuracy of the search computed from each corresponding session, which averaged only 0.71 across all sessions. Moreover, discrimination magnitude was not correlated with the overall probability of correctly foveating the target with a single saccade in each corresponding session (Spearman rank correlation, p = 0.83). This lack of correlation argues against task difficulty as a factor influencing target discrimination by SC neurons in correct trials.
Crucially, the activity associated with the more salient same-color distracter in the few same-color trials evolved to be greater than that related to the less salient same-shape distracter (Figure 2A, top). While the initial discrimination probability was the same when comparing target activity to each distracter type, the discrimination process was faster and more efficient for the non-salient distracter (Figure 2A, bottom). When the distracter types were more balanced, the activation associated with each distracter type was nearly identical (Figure 2B). Finally, when same-shape distracters were more salient, activity associated with these distracters was enhanced and the discrimination of the same-color distracter was faster and more efficient (Figure 2C). This modulation was quantified by taking the difference between each neuron's discrimination magnitude and discrimination time of the target from the same-color and same-shape distracter (i.e., target/same-color − target/same-shape) in each display type. On average, discrimination magnitude was significantly higher when discriminating the target from the less salient distracters (Figure 2D; ANOVA, p < 0.001). In addition, discrimination time was significantly shorter when discriminating the target from the less salient distracters (Figure 2E; p < 0.001). Similar results were obtained when data were aligned to saccade onset (Figure 2F; p < 0.001). In other words, the more common a distracter was, the more its feature was filtered out.
Figure 2. Effects of feature sensitivity on saccade target selection. (A–C) For a sample neuron in few same-color (A), balanced (B), and many same-color (C) displays. (D) Effects of feature sensitivity on discrimination magnitude for the neuronal sample. (E–F) Effects of feature sensitivity on discrimination time, with data aligned on either stimulus-onset (E) or saccade-onset (F). Only neurons with significant discrimination reaching the criterion of 0.75 in all conditions were included in (E) and (F).
Any differences in search accuracy or saccade latency across conditions could not account for these observed changes in neural discriminability. The changes in discrimination ability (see Figure 2D) were not related to the overall accuracy or saccade latency computed from each corresponding search display (Spearman rank correlation, p = 0.23 and 0.58 for accuracy and saccade latency, respectively). Similarly, the changes in discrimination timing (see Figures 2E,F) were not related to accuracy or saccade latency within each condition when considering differences in DT aligned to stimulus-onset (p = 0.91 and 0.27) as well as saccade-onset (p = 0.34 and 0.86). The small changes in search accuracy (5%) and saccade latency (10 ms), compared to the change in behavioral bias, may account for the above lack of correlation with changes in neuronal discrimination ability. Moreover, the time at which neurons discriminated the target from a distracter with the same-color or same-shape did not increase with increasing number of each type of distracter (ANOVA, p = 0.75 and 0.99 for color and shape distracters, respectively). For the experimental sessions included in the above analysis, the behavioral salience observed in the corresponding sessions did not differ from that reported in Figure 1B (rank-sum test, p = 0.52).
In summary, the process of selecting the saccade target, as reflected in SC sensory-motor neurons, depended on the varying enhancement of a stimulus feature dictated by visual context.
Magnitude and Timing of Stimulus Feature Sensitivity
To quantify the stimulus feature sensitivity of SC neurons, we compared the activity associated with the two types of distracter stimuli. Because these stimuli were not the goal of an eye movement, this analysis controls for the possibility that the modulation was related to saccade programming. We computed a Modulation Index (MI) that contrasts the activity associated with each distracter type across conditions around the time that individual neurons discriminated a target from any distracter (Figure 3). Positive indices indicate greater activity for same-color distracters, and negative indices indicate greater activity for same-shape distracters. Neuronal activity associated with same-color distracters was significantly greater than that associated with same-shape distracters when the former were fewer and presumed to be more salient (Figure 3A; MI: 0.10; t-test, p < 0.01), whereas it was significantly lower when they were more numerous and presumed to be less salient (Figure 3C; MI: −0.16; p < 0.001). The activity associated with these distracters was not different when their proportions were balanced (Figure 3B; MI: 0; p = 0.43). Moreover, the activity associated with the same-color relative to the same-shape distracter was 24% greater when the distracter-ratio was low and 28% less when the ratio was high (Figure 3; t-test, p < 0.001).
Figure 3. Feature-based contextual modulation of SC neuronal activity. (A) few same-color, (B) balanced, and (C) many same-color displays. The modulation index (MI) was calculated for each neuron using distracter-related activity during a 25-ms epoch centered on the neuron's discrimination time, where MI = (color − shape)/(color + shape).
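The relation between the MI and a percent difference in activity can be illustrated with a short sketch. Since MI = (color − shape)/(color + shape), the ratio color/shape equals (1 + MI)/(1 − MI). The function names and the example rates below are hypothetical.

```python
def modulation_index(color_rate, shape_rate):
    """MI = (color - shape) / (color + shape), rates in sp/s."""
    total = color_rate + shape_rate
    if total <= 0:
        raise ValueError("the summed discharge rate must be positive")
    return (color_rate - shape_rate) / total


def percent_difference(mi):
    """Percent by which same-color activity exceeds same-shape activity
    for a single MI value, using color/shape = (1 + MI) / (1 - MI)."""
    if mi >= 1.0:
        raise ValueError("MI must be below 1")
    return 100.0 * ((1.0 + mi) / (1.0 - mi) - 1.0)


# Hypothetical rates: 60 sp/s for same-color, 40 sp/s for same-shape.
mi = modulation_index(60.0, 40.0)
print(mi, round(percent_difference(mi), 1))
```

Because this conversion is nonlinear, the mean MI of a neuronal sample need not convert exactly to the mean percent difference of that sample.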
This modulation was indeed both an enhancement of the salient and a suppression of the non-salient distracter representations: when the activity associated with each distracter type in the few and many displays was compared to the corresponding activity in the balanced displays, the representations of salient distracters were enhanced by 19.5% and those of non-salient distracters suppressed by 9.2% (Figure 4); these changes in activity were significantly different from zero (t-test, p < 0.05), and they were not restricted to a subset of the neuronal sample. The majority of the neurons showed both enhancement and suppression in their activity, but those showing the most activity enhancement were not necessarily those showing the most suppression (permutation test, p = 0.57). Moreover, enhancement and suppression were not limited to a single distracter type, as these effects were observed in both distracter types (enhancement: 11.7% and 27.8%, suppression: 9.1% and 9.1%, same-color and same-shape distracters, respectively; t-test, all p < 0.05).
Figure 4. Enhancement in neuronal activity associated with salient distracters and suppression of that associated with non-salient distracters. Percent changes (modulation) were calculated relative to the (baseline) activity in the balanced displays. Each neuron's activity was computed during a 25 ms interval centered on the neuron's discrimination time. Salient distracters were both the same-color distracter in the few same-color display and the same-shape distracter in the many same-color display. Non-salient distracters were both the same-shape distracter in the few same-color display and the same-color distracter in the many same-color display. Bold axis labels: mean enhancement and suppression. Solid symbols: individual neurons with both activity enhancement and suppression that were not statistically significant (rank-sum test, p > 0.05).
To determine when this stimulus feature sensitivity arose, we calculated the time at which the salient distracter activity became significantly greater than the non-salient distracter activity (successive rank-sum tests, p < 0.05). We found that feature sensitivity occurred, on average, 103 and 121 ms after the onset of the many and few same-color displays, respectively. This time corresponded to 59 and 76 ms after the onset of the visually evoked responses in these trials, and it did not occur significantly earlier than when neurons discriminated the target from any distracter (Figure 5; many same-color displays: 103 vs. 114 ms, paired t-test, p = 0.25, n = 15; few same-color displays: 121 vs. 90 ms, p < 0.001, n = 22). Across display types, feature sensitivity arose 69 ± 5 ms after the onset of the visually evoked responses and not before the time that neurons discriminated the target from any distracter (113 vs. 99 ms; p < 0.05, n = 37).
Figure 5. Time course of feature-based modulation. Mean ± SE stimulus onset, visual response onset, and discrimination time with respect to time at which feature-based modulation was significant (i.e., 0 ms; p < 0.05, rank-sum test). Feature sensitivity in many (top) and few (middle) same-color context displays, as well as between-session feature priming (bottom).
We also examined whether the variability of the feature-based modulation of SC activity can be explained by an individual neuron's pattern of activity. First, we tested whether the discrimination ability of a neuron predicted this modulation. Neither the enhancement of salient distracter representations nor the suppression of non-salient ones was related to a neuron's discrimination magnitude (permutation test, enhancement: p = 0.41; suppression: p = 0.29). Next, we tested whether a neuron's modulation was related to its basic discharge properties observed in the delayed saccade task: more “visual” neurons might show greater modulation than more “saccade” ones. A neuron's position along the visuo-movement axis (determined by its VMI) did not predict its modulation abilities (p = 0.91 and p = 0.11 for enhancement and suppression, respectively), nor did the strength of its saccade-related activity (p = 0.37 and p = 0.58). In addition, the enhanced activity of a neuron observed in the variable distracter-ratio task was related to neither the strength of its visually evoked response (p = 0.64) nor its sustained activity during the delay period (p = 0.86) of the delayed saccade task. The only significant correlation that we found was between the suppression of a neuron's activity and its visually evoked response (p < 0.01), though this relationship was relatively weak (R² = 0.19) and appeared to be driven by two outlier neurons with strong visually evoked responses; it was also not corroborated by the correlation with the neuron's delay activity (p = 0.10). In summary, the feature-based contextual modulation appears to be neither a product of the discrimination ability of an individual neuron nor its position on the visuo-movement axis. These results argue against a neuron-specific contribution to this process, in contrast to what has been shown in studies of spatial attention in both SC (Ignashchenkova et al., 2004) and FEF (Thompson et al., 2005b).
While we cannot rule out different contributions by extreme classes of neurons, it is more likely that SC neuronal ensembles cooperate to enhance and suppress salient and non-salient features, which together perform a more efficient selection process (Shen and Paré, 2007; see also Bichot et al., 2001). Consistent with this population coding is the lack of correlation at the neuronal level between the enhancement of the salient distracter representation and the suppression of the non-salient distracter representation.
Stimulus Feature Sensitivity Predicts Visual Search Strategy
To what extent can the feature sensitivity of sensory-motor neurons account for the observed biases in behavior? To answer this question and to determine the time course of this contextual modulation, we examined the relationship between the index of behavioral salience and the neuronal MI during three analysis epochs related to events within the discrimination process that were common across recording sessions despite the different response latencies: (1) when the neuron first responded to the visual stimulation, (2) when the neuron discriminated the target from any distracter, and (3) just prior to saccade initiation (Figure 6). A value in the upper-right quadrant of this graph would indicate that a behavioral attraction toward the presumably salient same-color distracter is associated with an increased neuronal sensitivity for that same distracter type. A value in the lower-left quadrant refers to a behavioral aversion to the same-color distracter associated with a reversed neuronal sensitivity to that distracter. Early in the discrimination process, the visual responses to each distracter could not predict the behavioral biases (Figure 6A; permutation test, p = 0.35). This result was anticipated because the display composition changed randomly from trial to trial. By the time neurons discriminated the target from any distracter, neuronal activity was predictive of behavior (Figure 6B; R² = 0.35, p < 0.0001), and it further improved just before saccades (Figure 6C; R² = 0.48, p < 0.0001).
Figure 6. Time course of feature-based contextual modulation across visual search displays. Relationship between the neuronal modulation index and the corresponding index of behavioral salience during (A) the visual response (25 ms from visual response latency), (B) the discrimination (25 ms centered on the time neurons discriminated target from any distracter), and (C) the pre-saccade (25 ms before saccades) epochs. Diamonds: few same-color displays; squares: balanced same-color; triangles: many same-color. Statistical values are from permutation tests (10,000 iterations) of the slopes of the linear regression.
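The permutation test on regression slopes reported for Figure 6 can be illustrated with a short sketch. This is not the authors' analysis code: the function names, the two-sided criterion, and the add-one p-value correction are assumptions made for illustration. Only the use of a permutation test on the slope of a linear regression, with 10,000 iterations, comes from the text.

```python
import random
from statistics import mean


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_test_slope(x, y, n_iter=10_000, seed=0):
    """Return (observed slope, permutation p-value).

    The y values are shuffled relative to x to build the null
    distribution of slopes. Assumptions (not from the article):
    a two-sided test and the (count + 1) / (n_iter + 1) estimator.
    """
    rng = random.Random(seed)
    observed = slope(x, y)
    y_perm = list(y)
    count = 0
    for _ in range(n_iter):
        rng.shuffle(y_perm)
        # Count shuffles whose slope is at least as extreme as observed.
        if abs(slope(x, y_perm)) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_iter + 1)
```

Here `x` would hold one index per session (for example, the behavioral salience index) and `y` the matching neuronal MI values; which variable is treated as the predictor is a choice the article does not specify in this section.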
The results of the above correlation analyses capture the match between the bias in MI and that in the behavior expressed primarily in the two unbalanced search displays. A stronger link between neuronal activity and behavior could be made if the variability within these biases were also correlated within the balanced condition, i.e., when stimulus context was controlled. Although the distribution of modulation indices in the balanced displays was centered on zero, some neurons displayed activity modulation that was either significantly positive or negative (Figure 3B). We, therefore, examined whether the variability between neurons in balanced displays could account for some of the variability in search strategy between sessions. We found that this was the case just before saccades were initiated away from the neurons' receptive fields (permutation test, R² = 0.15, p < 0.0001) as well as when the neurons discriminated the target from any distracter (R² = 0.21, p < 0.01), but not during their initial visually evoked responses (p = 0.69). Even within a single trial condition, i.e., when both the task difficulty is unchanging and the distracter-ratio is balanced and constant, the variability in activity between neurons accounts for some of the variability in visual search behavior. The activity related to a distracter stimulus during correct trials (when a saccade is made away from that distracter) can, therefore, predict behavioral salience estimated from error trials. In other words, a neuron's activity associated with a type of distracter (not a particular spatial location) predicts the general probability of saccades made erroneously to that type of distracter (not a particular saccade goal).
Priming of Stimulus Feature Sensitivity
One possible contribution to the feature-related signals observed in the balanced condition could be between-session priming of stimulus features. Priming is known to affect FEF activity when the current session's target shares a feature with the target of the previous day (Bichot and Schall, 1999a). We examined behavior in balanced displays in a subset of experimental sessions in which the previous session's target had either the same color (color-primed) or the same shape (shape-primed) as the current session's target. Similar to previous behavioral reports of conjunction search (Bichot and Schall, 1999a, b), the monkeys' preference for one target feature over the other was influenced by the previous session's target identity: in balanced display trials, the index of behavioral salience was greater (i.e., monkeys were more biased toward same-color distracters) when the previous session's target had the same color than when it had the same shape (Figure 7A; 0.15 ± 0.03 vs. −0.10 ± 0.08, rank-sum test, p < 0.05). The behavioral salience index during color-primed sessions was also significantly greater than 0 (t-test, p < 0.01). Figure 7B illustrates, using two sample neurons, how distracter-related activity was affected by feature priming: initial visual responses were the same, but the activity associated with the distracter having the primed feature was greater by the time the neurons discriminated the target from any distracter. The neuronal MI was quantified for the subset of neurons recorded during either color- or shape-primed sessions. Calculated around the time of discrimination, the MI during color-primed sessions was significantly greater than 0 (Figure 7C; t-test, p < 0.05) and was also significantly different from that during shape-primed sessions (rank-sum test, p < 0.05). In the absence of any manipulation of stimulus context, between-session priming also induced behavioral biases that were reflected in distracter representations.
Figure 7. Effects of between-session priming in balanced displays. (A) Average (± SE) index of behavioral salience for two monkeys (G and H) in either color- (n = 8) or shape-primed (n = 11) sessions. (B) Two sample neurons' activities in balanced display trials during sessions that were either color- (left) or shape-primed (right). Green: same-color distracter, pink: same-shape distracter, black: target. Black bars indicate the occurrence of saccades made to the target. (C) MI at DT in balanced displays during sessions that were either color- (black) or shape-primed (gray).
We tested whether there was still a relationship between the observed neuronal modulation and behavior in the absence of feature-based priming by considering the balanced display condition for only the sessions in which the previous session's target did not share any features with the current target (i.e., no feature priming). Even in these sessions, the behavior varied from session to session and could be predicted by the variability in activity between neurons at the time of discrimination (permutation test, p < 0.05, n = 12).
To determine when the effect of between-session feature priming arose on neuronal activity, we calculated the time at which the primed distracter activity became significantly greater than the non-primed distracter activity (successive rank-sum tests, p < 0.05). We found that the onset of feature-based priming occurred, on average, 79 ± 8 ms after the onset of the visually evoked responses (Figure 5, bottom) and around the time that neurons discriminated the target from any distracter (115 vs. 101 ms; paired t-test, p = 0.06, n = 13). The timing of this between-session feature priming is comparable to that of the stimulus feature sensitivity.
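The onset estimate based on successive rank-sum tests can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure. The text specifies only successive rank-sum tests at p < 0.05; the one-sided normal approximation without a tie correction to the variance, the per-bin data layout, and the requirement of `n_consecutive` significant bins are hypothetical choices, as are all function and parameter names.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_p_greater(a, b):
    """One-sided Wilcoxon rank-sum p-value for 'a tends to exceed b'.

    Uses midranks for ties and the normal approximation; the variance
    is not tie-corrected (a simplification assumed for this sketch).
    """
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return 1 - NormalDist().cdf((w - mu) / sigma)


def modulation_onset(primed, non_primed, times, alpha=0.05, n_consecutive=3):
    """First time at which primed activity exceeds non-primed activity.

    primed[i] and non_primed[i] are lists of per-trial activity values
    in the bin at times[i]. Returns the start of the first run of
    n_consecutive significant bins, or None if there is no such run.
    """
    run = 0
    for i, (a, b) in enumerate(zip(primed, non_primed)):
        if rank_sum_p_greater(a, b) < alpha:
            run += 1
            if run == n_consecutive:
                return times[i - n_consecutive + 1]
        else:
            run = 0
    return None
```

Requiring several consecutive significant bins is one common way to guard against isolated false positives when many successive tests are run; the article does not state whether such a criterion was applied.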
We observed modulation in the sensory-motor activity of SC neurons associated with distracter stimuli in a visual search display whose feature composition varied from trial-to-trial as well as when a feature was primed by a previous experimental session. This feature-based contextual modulation was associated with stimuli that were not the goal of an eye movement, thereby controlling for the spatial confound that it was related to saccade programming. Its magnitude did, however, correlate with the visual search strategy of the monkeys, as assessed by the distribution of their occasional errant saccades, and facilitated the process of saccade target selection. The late onset of this feature-based modulation with respect to visually evoked responses suggests that it is not purely stimulus-driven but that it also has a goal-directed component. Its near coincidence with saccade target selection suggests that it is integral to this process. These findings reveal that non-spatial information about stimulus features is integrated into oculomotor programs spatially represented in the SC, whose close proximity to the pre-motor circuit eliminates the need for distinct visual salience and motor maps for regulating visual behavior.
The evidence presented in this paper for feature-based contextual modulation in the activity of SC sensory-motor neurons adds further support to the hypothesis that the SC instantiates the visual salience map postulated by models of visual search and selective attention (Cave and Wolfe, 1990; Findlay and Walker, 1999; Itti and Koch, 2000; Glimcher et al., 2005; Hamker, 2006). According to these models, the stimulus-driven outputs from individual feature maps, which can be instantiated by extra-striate cortical areas, are integrated with goal-directed signals into the visual salience map. Previous studies have shown how SC neurons have stimulus-driven visual responses to targets and distracters in their receptive fields and that their activity evolves to reflect the selection of saccade goals (McPeek and Keller, 2002). SC also has stimulus representations whose magnitudes are predictive of which stimulus will be selected as a saccade target (Shen and Paré, 2007). The present study extends these findings further by demonstrating how SC stimulus representations are also modulated by recent history as well as by visual context.
Although the activity on the visual salience map is thought to be non-selective to visual features, it can be seen as sensitive to features when reflecting the salience of stimuli in a search display. Indeed, feature-based contextual modulation has been demonstrated in the sensory-motor activity of neurons within cortical areas FEF and LIP. These previous findings are consistent with the between-session priming effects that we observed, because they primarily involved static goal-directed signals. In these studies, either the monkeys were highly experienced with the target feature of a feature search task over many sessions (FEF: Bichot et al., 1996) or the stimulus features retained the same relevance over an individual visual search session (FEF: Bichot and Schall, 1999a; see also Bichot et al., 2001; LIP: Toth and Assad, 2002; Sereno and Amador, 2006). Such neuronal activity modulation, however, reflects past trial history rather than the current trial demands that underlie the feature-based contextual modulation we observed in the variable distracter-ratio search task. Moreover, when we took into account the effects of feature-based priming in the balanced displays, we still found a relationship between behavior and neuronal modulation. What remained could be due to additional goal-directed signals, such as the individual monkeys' preferences in a session. The observation most closely related to ours is the feature-based modulation in sensory responses of neurons in visual cortex reported by Bichot and colleagues (2005). In this study, monkeys were cued with the target's feature during the initial fixation period of each trial and were free to move their eyes to search a conjunction search display. During the ensuing search fixations, the activity of V4 neurons was enhanced whenever a preferred stimulus in their receptive fields matched the target's feature.
Similar neuronal activity modulation has also been observed in V4 (Motter, 1994a, b) and MT (Treue and Martinez-Trujillo, 1999) when monkeys were instructed to allocate their attention to different features in tasks precluding eye movements. Such enhanced neuronal activity has been presented as a neural correlate of feature-based attention—the ability to preferentially process certain features of visual objects irrespective of their locations.
The feature-based contextual modulation observed in our study might be analogous to that observed in attention studies. This is a reasonable suggestion given that the expression of feature-based attentional modulation has been estimated to be within the latency range of the saccades made by our animals: ∼100 ms after the beginning of a new fixation (Bichot et al., 2005) and ∼150 ms after a new instruction (Motter, 1994a). In the variable distracter-ratio search task, it is the search display itself that would provide the information about the feature for which to allocate attentional resources. Is there enough time to process this information before the SC activity modulation? This is quite possible given that our monkeys had considerable experience with these search displays. In addition, there is evidence that color can be discriminated within 30 ms (Verghese and Nakayama, 1994; Bodelon et al., 2007; Stanford et al., 2010) and that the feed-forward sweep of visual processing is sufficient for extracting the gist of complex visual scenes (Schyns and Oliva, 1994; Castelhano and Henderson, 2008). If the non-spatial contextual modulation in SC neuronal activity were taken to reflect feature-based attention, our results would provide unequivocal evidence that attention is at play within this sub-cortical structure, thus complementing existing evidence obtained in studies of spatial attention (Kojima et al., 1996; Ignashchenkova et al., 2004; also see Kustov and Robinson, 1996; Cavanaugh and Wurtz, 2004; Müller et al., 2005).
Possible mechanisms for the modulation of stimulus representations by visual context are incorporated in existing models via interactions within feature maps (Cave and Wolfe, 1990; Itti and Koch, 2000; Hamker, 2006). According to these models, the unbalanced distracter composition in a visual search display would suffice to enhance the representations of the fewer distracters, thereby making them and the search target the primary representations competing for selection at the level of the salience map. Such low-level mechanisms, however, would entail changes in activity occurring much earlier than those we observed, perhaps as early as ∼20 ms following the onset of the visually evoked responses as when visual cortex neurons are activated by pop-out stimuli (Knierim and van Essen, 1992; Nothdurft et al., 1999, 2000; Burrows and Moore, 2009; also see Tomita et al., 1999; Arcizet et al., 2011). Furthermore, these early changes in activity may be relatively small and perhaps only manifest in individual feature maps, i.e., not at the level of the visual salience map. Our observations that the stimulus feature sensitivity of SC neurons arose around 70 ms after the onset of the visually evoked responses and that the magnitude of those responses did not vary with behavioral salience argue against a purely stimulus-driven account. The timing of this modulation compares well with the timing of contextual effects in figure-ground segregation and curve tracing tasks (∼70–90 ms), which have been argued to result from cortico-cortical recurrent processing (Zipser et al., 1996; Lamme et al., 1999; Supèr et al., 2001; Khayat et al., 2006, 2009). In addition, our behavioral observations are consistent with studies that have shown how top-down feature-based control can be exerted such that salient distracters do not always capture attention (Bacon and Egeth, 1994; Leber and Egeth, 2006; Schubo and Müller, 2009; but see Theeuwes, 2004). 
Finally, the observed modulations nearly coincided with the timing of saccade target selection, which is thought to involve sensory-motor neurons instantiating the visual salience map and similar enhancement/suppression mechanisms. It is important to note that the selection of a saccade target is considered to be a process that occurs in advance of, and possibly discretely from, movement preparation (Purcell et al., 2010). The enhancement of salient representations, then, may serve to aid visual selection and not necessarily indicate saccade programming. Whichever mechanism underlies the SC stimulus feature sensitivity we observed, our findings suggest that stimuli sharing a feature with the search target can be processed preferentially when they appear to belong to a small perceptual group (Duncan, 1995), and it may be that stimulus-driven inputs to the salience map must be enhanced by goal-directed signals about the target's identity for perceptual grouping to happen.
Early descriptions of the visual salience map considered it to be the last processing stage whose output feeds perception itself, but more recent work on visual search has its output specifying the next saccade target. How then is the salience map connected to the eye movement system if the SC is a node within the distributed network that forms the visual salience map? The common view in both conceptual (Glimcher et al., 2005) and formal computational models (Beck et al., 2008; also see Hamker, 2006) is that the SC is only a motor map reading out the end product of the selection process within a cortical salience map. Our findings, however, strongly suggest that there is no need for a motor map for saccades distinct from the salience map (also see Findlay and Walker, 1999). Visual and cognitive processing has long been observed in the activity of SC neurons (Goldberg and Wurtz, 1972), and neurons with visually evoked responses and saccade-related activity—like those recorded in this study—have been shown to project to the brainstem saccade generator circuit (Rodgers et al., 2006). With sufficient activation, potential saccade target representations in this structure can, therefore, become saccade programs via its direct access to the pre-motor circuits. This ability may also be conferred by the characteristic saccade-related, high-frequency burst of activity produced by SC neurons, which is significantly distinct from that observed in cortical neurons and the main candidate trigger signal for saccade initiation (Paré and Hanes, 2003).
Despite many similarities between the neuronal activity observed in FEF, area LIP, and SC, it is unlikely that the visual salience map is simply replicated across these brain regions. With respect to saccade target selection, we posit that this process results from the progressive filtering of distracter representations and amplifying of target representations from area LIP to FEF and onto SC. This is consistent with a recent comparative analysis showing that the reliability of the target/distracter discrimination improves from cortex to SC (Thomas and Paré, 2007). Contrary to the discrete target-selection/saccade-programming models mentioned above, this continuous flow of information between these brain regions could account for the discriminating activity observed simultaneously in cortex (Thompson et al., 1996; Thomas and Paré, 2007) and SC during visual search (McPeek and Keller, 2002; Shen and Paré, 2007) as well as the feature sensitivity in SC neuronal activity that we observed in this study.
The direct link between perceptual and motor processing suggested by our findings could be viewed as a substrate for the visual grasp reflex, the inflexible capture of overt attention by a salient visual stimulus (Hess et al., 1946; Theeuwes et al., 1998). The SC is indeed a phylogenetically old brain structure crucial to visual processing and orienting responses, but this does not imply that its visuo-motor function is inflexible. For instance, the anuran behaviors of prey catching and predator avoidance, which rest on the integrity of the optic tectum (the SC homologue in non-mammalian vertebrates), are modifiable (Ewert et al., 2001). Furthermore, some have argued that automatic sensory-motor activation is integral to, and not distinct from, voluntary behavior, which is regulated but not exclusively dictated by cortical circuits (Sumner and Husain, 2008). In conclusion, the SC may represent the primordial visual salience map for regulating gaze orienting behavior, and it is unlikely that this function was entirely replaced by cortical areas in mammals (Shen et al., 2011). Instead, the cortical salience maps may confer additional flexibility in regulating gaze orienting and perhaps more direct links to other behavior, e.g., visual perception and visuo-manual activities.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We thank W. Clarence for expert assistance with the training and preparation of the animals. We thank Dr. A. Winterborn and his team for veterinary care. This work was supported by grants from the Canadian Institutes of Health Research (CIHR MOP 38089) and by an Early Researcher Award from the Ontario Ministry of Research and Innovation to Dr. Martin Paré. Kelly Shen held a postgraduate scholarship from the Natural Sciences and Engineering Research Council of Canada.
Beck, J. M., Ma, W. J., Kiani, R., Hanks, T., Churchland, A. K., Roitman, J., Shadlen, M. N., Latham, P. E., and Pouget, A. (2008). Probabilistic population codes for Bayesian decision making. Neuron 60, 1142–1152.
Clarence, W. M., Scott, J. P., Dorris, M. C., and Paré, M. (2006). Use of enclosures with functional vertical space by captive rhesus monkeys (Macaca mulatta) involved in biomedical research. J. Am. Assoc. Lab. Anim. Sci. 45, 31–34.
Ewert, J. P., Buxbaum-Conradi, H., Dreisvogt, F., Glagow, M., Merkel-Harff, C., Rottgen, A., Schurg-Pfeiffer, E., and Schwippert, W. W. (2001). Neural modulation of visuomotor functions underlying prey-catching behaviour in anurans: perception, attention, motor performance, learning. Comp. Biochem. Physiol. A Mol. Integr. Physiol. 128, 417–461.
Ipata, A. E., Gee, A. L., Goldberg, M. E., and Bisley, J. W. (2006). Activity in the lateral intraparietal area predicts the goal and latency of saccades in a free-viewing visual search task. J. Neurosci. 26, 3656–3661.
Kaptein, N. A., Theeuwes, J., and van der Heijden, A. H. C. (1995). Search for a conjunctively defined target can be selectively limited to a color-defined subset of elements. J. Exp. Psychol. Hum. Percept. Perform. 21, 1053–1069.
Lamme, V. A., Rodriguez-Rodriguez, V., and Spekreijse, H. (1999). Separate processing dynamics for texture elements, boundaries and surfaces in primary visual cortex of the macaque monkey. Cereb. Cortex 9, 406–413.
Sereno, A. B., and Amador, S. C. (2006). Attention and memory-related responses of neurons in the lateral intraparietal area during spatial and shape-delayed match-to-sample tasks. J. Neurophysiol. 95, 1078–1098.
Thompson, K. G., Bichot, N. P., and Sato, T. R. (2005a). Frontal eye field activity before visual search errors reveals the integration of bottom-up and top-down salience. J. Neurophysiol. 93, 337–351.
Thompson, K. G., Hanes, D. P., Bichot, N. P., and Schall, J. D. (1996). Perceptual and motor processing stages identified in the activity of macaque frontal eye field neurons during visual search. J. Neurophysiol. 76, 4040–4055.
Trageser, J. C., Monosov, I. E., Zhou, Y., and Thompson, K. G. (2008). A perceptual representation in the frontal eye field during covert visual search that is more reliable than the behavioral report. Eur. J. Neurosci. 28, 2542–2549.
Keywords: visual context, salience map, monkey, superior colliculus, feature priming
Citation: Shen K and Paré M (2012) Neural basis of feature-based contextual effects on visual search behavior. Front. Behav. Neurosci. 5:91. doi: 10.3389/fnbeh.2011.00091
Received: 18 November 2011; Accepted: 24 December 2011;
Published online: 11 January 2012.
Edited by: Agnes Gruart, University Pablo de Olavide, Seville, Spain
Reviewed by: Jorge A. Bergado, Department of Experimental Neurophysiology, Cuba
Miou Zhou, University of California, Los Angeles, USA
Copyright: © 2012 Shen and Paré. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.
*Correspondence: Martin Paré, Department of Biomedical and Molecular Sciences, Botterell Hall Room 438, Queen's University, Kingston, ON K7L 3N6, Canada. e-mail: firstname.lastname@example.org
†Present address: Kelly Shen, Rotman Research Institute, Baycrest, Toronto, ON M6A2E1, Canada.