Finding Flicker: Critical Differences in Temporal Frequency Capture Attention

Rapid visual flicker is known to capture attention. Here we show slow flicker can also capture attention under reciprocal temporal conditions. Observers searched for a target line (vertical or horizontal) among tilted distractors. Distractor lines were surrounded by luminance-modulating annuli, all flickering sinusoidally at 1.3 or 12.1 Hz, while the target’s annulus flickered at frequencies within this range. Search times improved with increasing target/distractor frequency differences. For target–distractor frequency separations >5 Hz, reaction times were minimal, with high-frequency targets correctly identified more rapidly than low frequency targets (by ~400 ms). Critically, however, at these optimal frequency separations search times for low and high-frequency targets were unaffected by set size (slow flicker popped out from fast flicker, and vice versa), indicating parallel and symmetric search performance when searching for high or low frequency targets. In a “cost” experiment using 1.3 and 12.1 Hz flicker, the unique flickering annulus sometimes surrounded a distractor and, on other trials, surrounded the target. When centered on a distractor, the unique frequency produced a clear and symmetrical search cost. Together, these symmetric pop-out and search costs demonstrate that temporal frequency is a pre-attentive visual feature capable of capturing attention, and that it is relative rather than absolute frequencies that are critical. The shape of the search functions strongly suggests that early visual temporal frequency filters underlie these effects.


INTRODUCTION
One of the primary goals of sensation and perception is to locate and identify objects of interest in the surrounding environment. In the visual domain, an object can be easily found if it differs from competing visual clutter along a primary sensory dimension such as color, luminance, orientation, or spatial frequency (Treisman and Gelade, 1980; Yantis and Jonides, 1984; Theeuwes, 1992; Theeuwes and Van der Burg, 2007; see Wolfe and Horowitz, 2004 for a review). Spatio-temporal differences such as direction of movement may also help isolate a target (Horowitz et al., 2007), and recently it was found that purely temporal differences can help in finding a target. Specifically, an abrupt temporal change in luminance or motion of a target object presented among competing distractors enhanced search efficiency, and an abrupt change in a distractor impaired search (Franconeri and Simons, 2003).
The idea that abrupt changes can capture visual attention has intuitive ecological appeal. Gestures such as hand-waving are used by human infants and adults alike to capture the attention of others. Conversely, many animals exhibit freezing behavior when trying to maintain camouflage and to avoid detection during predator-prey interactions (Eilam, 2005; Ioannou and Krause, 2009). While transient visual events certainly can break camouflage and capture attention (James, 1950; Regan, 2000), little is known about what sensory information guides this saliency.
One way to understand the salience of abrupt temporal changes is to consider temporal frequency. Any movement in the retinal image, whether due to object motion or self-motion (i.e., locomotion, head, or eye movements) produces local dynamic signals that can be decomposed, via Fourier analysis, into an array of simpler sinusoidal temporal waveforms, each with a particular temporal frequency, phase, and amplitude (Rucci et al., 2007). Change saliency may be related to the temporal frequency spectrum elicited by an object undergoing abrupt change and the neural response it engenders. Neurons selective for temporal frequency and retinotopic location are common in the early stages of the primate visual system, according to neurophysiological studies (Solomon et al., 2004; Moore et al., 2005). At the psychophysical level, two kinds of temporal frequency channel are evident: one broad and low-pass (or "sustained") and one band-pass and high-frequency selective (or "transient") (Anderson and Burr, 1985; Hess and Snowden, 1992; Cass and Alais, 2006; Cass et al., 2009a,b).
In this study we investigate whether temporal frequency (flicker rate) is an effective cue for visual search. A previous study examined this and found an intriguing asymmetry where search was more efficient for high temporal frequency targets among low frequency distractors than vice versa (Ivry and Cohen, 1992). Although this asymmetry agrees with our experience that movement breaks camouflage and captures attention, the experiment tested a limited span of temporal frequencies and did not control the temporal frequency content of the stimuli. Moreover, the asymmetry is inconsistent with findings showing that static objects capture attention when embedded amongst flashing objects (Pinto et al., 2006). The present study examines a wide and carefully controlled range of frequencies to reveal the precise role of temporal frequency as a cue for guiding visual search. The range is large enough to encompass the peak sensitivities of the underlying temporal channels (1.3-12.1 Hz), and frequency content was tightly controlled by using sinusoidal modulations.

Participants
Four participants (two naïve) took part in all experiments (one female, mean age 34.5 years; range 23-44). All were right-hand dominant.

Stimuli and apparatus
Experiments were run in a dimly lit room. Participants sat ∼80 cm from a cathode ray monitor with an 85 Hz refresh rate and 8-bit linearized luminance output. Stimulus displays consisted of four, seven, or 11 white line segments (116.5 cd m−2, length 0.7˚ visual angle) on a gray mean-luminance background (38.85 cd m−2). Lines were equally spaced on an imaginary circle (4.9˚ radius) centered on a white fixation dot. Line orientations were randomly plus or minus 10˚ from horizontal or vertical, except for the target line, which was horizontal or vertical (see Figure 1). The orientation of the target line (known as an orientation singleton) was randomly assigned by the stimulus presentation software on a trial-by-trial basis. Each line was surrounded by an annulus (1.1˚ radius, 0.4˚ width) whose luminance varied sinusoidally over time around mean luminance. The modulation depth of each annulus, (Lmax − Lmin)/(Lmax + Lmin), was randomly jittered ±20% around the average modulation depth of 77% to obviate luminance cues, which are known to occur as a function of stimulus frequency (deLange, 1958; Bex and Langley, 2007) and to affect visual search (e.g., Theeuwes and Van der Burg, 2007). The phase of each temporal modulation was randomized to exclude predictable phase changes (Spalek et al., 2009).

FIGURE 1 | Illustration of an example search display. In Experiment 1 all distractor annuli modulated with an identical frequency within a given trial (1.3 or 12.1 Hz) whilst target annuli modulated at either 1.3, 2.7, 4.0, 6.7, 8.1, or 12.1 Hz.
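The annulus modulation described above can be sketched in a few lines of Python. This is an illustration only, not the stimulus code used in the study: the function names are hypothetical, the ±20% jitter is assumed to be relative to the 77% mean and uniformly distributed, and the waveform L(t) = Lmean × (1 + m × sin(2πft + φ)) is assumed because it yields a Michelson contrast equal to the modulation depth m.

```python
import math
import random

REFRESH_HZ = 85.0   # monitor refresh rate reported in the text
MEAN_LUM = 38.85    # mean background luminance, cd/m^2
MEAN_DEPTH = 0.77   # average modulation depth (Michelson contrast)


def jittered_depth(rng, mean_depth=MEAN_DEPTH, jitter=0.20):
    """Draw a depth within +/-20% of the mean (assumed relative and uniform)."""
    return mean_depth * (1.0 + rng.uniform(-jitter, jitter))


def annulus_luminance(freq_hz, depth, phase, n_frames, refresh_hz=REFRESH_HZ):
    """Per-frame luminance of one annulus, sampled once per video frame."""
    return [
        MEAN_LUM * (1.0 + depth * math.sin(2.0 * math.pi * freq_hz * (k / refresh_hz) + phase))
        for k in range(n_frames)
    ]


def michelson(l_max, l_min):
    """Michelson contrast: (Lmax - Lmin) / (Lmax + Lmin)."""
    return (l_max - l_min) / (l_max + l_min)


rng = random.Random(0)                      # fixed seed so the sketch is repeatable
depth = jittered_depth(rng)
phase = rng.uniform(0.0, 2.0 * math.pi)     # random starting phase
frames = annulus_luminance(12.1, depth, phase, n_frames=85)  # one second at 85 Hz
```

Because the waveform is sampled at the frame rate, the delivered frequency is tied to the display refresh rate, which is why the depth and phase are drawn per annulus while the frequency is fixed per condition.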

Design Experiment 1
Within each trial all distractors modulated at 1.3 or 12.1 Hz while the single target annulus modulated at either 1.3, 2.7, 4.0, 6.5, 8.5, or 12.1 Hz (see Movies S1 A,B in Supplementary Material). The total set size (target plus distractors) was 4, 7, or 11 items. Each of the 12 possible target × distractor frequency pairings was presented 20 times at each of the three set sizes, in random sequence, for a total of 720 trials per subject.
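The trial count can be checked by enumerating the design. The sketch below assumes that the 20 repetitions apply to each frequency pairing at each set size, which is the reading that reproduces the reported total of 720; the variable names are illustrative.

```python
from itertools import product

DISTRACTOR_HZ = (1.3, 12.1)
TARGET_HZ = (1.3, 2.7, 4.0, 6.5, 8.5, 12.1)
SET_SIZES = (4, 7, 11)
REPEATS = 20  # assumed to be per pairing and per set size

# Every target x distractor frequency pairing.
pairings = list(product(DISTRACTOR_HZ, TARGET_HZ))

# Full factorial design: pairing x set size.
conditions = list(product(DISTRACTOR_HZ, TARGET_HZ, SET_SIZES))
n_trials = len(conditions) * REPEATS

# Largest target/distractor frequency separation in the design.
max_separation = max(abs(t - d) for d, t in pairings)

print(len(pairings), n_trials, round(max_separation, 1))
```

The largest separation, 12.1 − 1.3 = 10.8 Hz, matches the range of frequency differences tested in Experiment 1.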

Design Experiment 2
In the second experiment, set size was fixed at seven items. Half of the trials (560/1120) involved target and distractor annuli modulating at the same frequency (either 1.3 or 12.1 Hz). We refer to these trials as the deviant absent or temporally neutral condition (see Movie S2A in Supplementary Material). Other trials consisted of an annulus singleton modulating at a unique frequency (1.3 or 12.1 Hz) with respect to the other six elements in the display (all 12.1 or 1.3 Hz). Most frequently (probability of 6/7, total 480 trials), this unique frequency was centered on a distractor item (deviant distractor condition; see Movie S2B in Supplementary Material). On the remaining 80 trials, the annulus surrounding the target had the unique frequency.
The participant's task was to search for the target and to indicate as quickly and accurately as possible via a keypress whether the target was a vertical or horizontal line. Correct reaction times (RT) and target orientation accuracy rates were measured. Each trial began with a central fixation dot presented for 1 s and the search display was presented until a response was made. Participants were instructed to maintain fixation on the dot and respond as fast and accurately as possible. There was one practice block of 72 trials, followed by 10 experimental blocks of 72 trials. Participants received feedback about mean accuracy and mean RT following each block.
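The Experiment 2 trial counts follow from the 6/7 probability. The short check below assumes that the unique frequency was equally likely to fall on any of the seven items; the variable names are illustrative.

```python
from fractions import Fraction

SET_SIZE = 7
TOTAL_TRIALS = 1120

neutral = TOTAL_TRIALS // 2        # deviant absent: all items share one frequency
deviant = TOTAL_TRIALS - neutral   # trials containing one unique frequency

# Six of the seven items are distractors, so a randomly placed deviant
# lands on a distractor with probability 6/7.
p_on_distractor = Fraction(SET_SIZE - 1, SET_SIZE)
deviant_distractor = int(deviant * p_on_distractor)
deviant_target = deviant - deviant_distractor

print(neutral, deviant_distractor, deviant_target)
```

Under this assumption the deviant frequency carries no information about target location, since it marks the target no more often than chance.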

EXPERIMENT 1
Our first experiment was designed to examine whether uniqueness in the temporal frequency domain is sufficient to drive efficient visual selection, and to determine whether high-frequency targets (among low frequency distractors) are more salient than low frequency targets (among high-frequency distractors). On any given trial all distractor stimuli modulated at an identical temporal frequency of either 1.3 or 12.1 Hz (at a random temporal phase), whilst the temporal frequency of a target singleton could take on frequencies within this range (differing from the distractors by 0-10.8 Hz). Three set sizes were employed (4, 7, and 11 elements). Error rates were less than 4% for all subjects. Figure 2 depicts RT for trials in which target singletons were correctly identified as either horizontal or vertical, expressed as a function of target frequency. Left and right columns show the two levels of distractor frequency, 1.3 and 12.1 Hz respectively. The data exhibit several distinct trends. Firstly, RT improve monotonically with increasing frequency difference. The magnitude of this effect depends on the number of distractors present (set size), with longer RT associated with more items. Interestingly, the improvement in performance due to increasing temporal frequency differences asymptotes, with little further improvement once a critical target/distractor separation is reached.
Inspecting Figure 2, it is clear that minimum mean RT occur when target and distractor frequencies are very dissimilar (>5 Hz), whereas maximum RT tend to occur when the target and distractor frequencies are nearly identical. We conducted a three-way within-subjects ANOVA on the full set of reaction time data, which revealed a significant three-way interaction between target frequency, distractor frequency, and set size, F(10, 3) = 17.1, p < 0.001. To better understand this pattern of results we then conducted separate two-way ANOVAs at each distractor frequency, both of which yielded a significant two-way interaction between target frequency and set size: F(10, 3) = 11.1, p < 0.001 (low frequency distractors); F(10, 3) = 9.4, p < 0.001 (high frequency distractors). Visual inspection of the data suggests that these interactions both result from significantly larger set size effects when target and distractors are similar in frequency, compared to when they are dissimilar (see Figure 2).
To confirm the findings of these ANOVAs we compared the effects of set size for each of the four binary combinations (low frequency = 1.3 Hz; high frequency = 12.1 Hz) of distractor and target frequencies (respectively: low, low; low, high; high, low; and high, high) using four separate within-subjects ANOVAs. These analyses confirm that set size has an effect only when target and distractors modulate at similar frequencies (low/low: F(2, 3) = 20.7, p = 0.002; 21.3; high/high: F(2, 3) = 16.9, p = 0.003), with no effects of set size evident when target and distractors are dissimilar (low target/high distractor: F(2, 3) = 2.2, p = 0.19; high target/low distractor: F(2, 3) = 2.1, p = 0.20). These results indicate that search times are unaffected by the number of distractors present once a critical frequency difference is reached (>5 Hz). Importantly, the search efficiencies (i.e., the absence of set size effects) arising from this frequency difference are largely symmetrical, as they occur whether the target frequency is low (1.3 Hz) or high (12.1 Hz). Although temporal frequency search is efficient and symmetric beyond this critical frequency difference, a paired t-test on the data collapsed across set size reveals that even within this "efficient" range, low frequency targets are in fact identified more slowly overall (by ∼440 ms) than high-frequency targets (t(3) = −7.5, p = 0.005). That is to say, we observe a search asymmetry whereby high-frequency singletons are detected more rapidly (but not more efficiently) than low frequency singletons.

FIGURE 2 | Results from Experiment 1. Data points show reaction times for correct target identification plotted as a function of target frequency, for three different set sizes. Continuous lines show the best-fitting Gaussian functions (see Eq. 1). Error bars for individual subjects depict SE of all correct trials. Average data points represent between-subject means. Error bars for averaged data represent SE of individual means. Gaussian fits of averaged data were calculated independently of fits for individual subjects.
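The distinction between search speed and search efficiency drawn above is conventionally expressed as a search slope, the change in RT per added display item. The sketch below computes a least-squares slope across the three set sizes. The reaction times in it are made-up values chosen to show a flat and a steep function; they are not data from this study, and the function name is hypothetical.

```python
def search_slope(set_sizes, mean_rts_ms):
    """Least-squares slope of RT against set size, in ms per item."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts_ms) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts_ms))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx


SET_SIZES = [4, 7, 11]

# Made-up reaction times (ms), for illustration only.
flat_rts = [620.0, 620.0, 620.0]    # no cost per added item
steep_rts = [700.0, 790.0, 910.0]   # 30 ms added per item

flat_slope = search_slope(SET_SIZES, flat_rts)
steep_slope = search_slope(SET_SIZES, steep_rts)
```

Two conditions can share a slope near zero, and so be equally efficient, while still differing in overall RT, which is the pattern reported for low and high-frequency targets.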
This asymptotic effect of target-distractor frequency differences on RT can be well-approximated using a Gaussian model (see Eq. 1, plotted as continuous lines in Figure 2), fitted to each subject's data (and to the averaged data), with one fit for each set size and distractor frequency. Note that the Gaussian is the standard model used to describe human temporal frequency filters in the psychophysical literature. The applicability of this model for search performance is discussed below.
Equation 1. Offset Gaussian function, where x = temporal frequency dimension; X target = the particular target frequency; A = peak amplitude (fixed to be negative), σ = Gaussian standard deviation; and α = baseline offset (of the asymptotic portion of the curve). This function was used to fit search performance separately for each distractor frequency at each set size in Figure 2.
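For reference, a generic offset Gaussian written with the symbols defined above takes the following form. This is a sketch of the standard function, and the assignment of each symbol to a term is an assumption; it is not necessarily the exact parameterization fitted in Figure 2.

```latex
\[
  \mathrm{RT}(x) \;=\; A \, \exp\!\left( -\frac{\left(x - X_{\mathrm{target}}\right)^{2}}{2\sigma^{2}} \right) + \alpha
\]
```

In this form a negative A produces a dip of depth |A| centered on X_target, with RT returning to the baseline α as x moves away from X_target by several σ.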

Temporal channels
An intriguing feature of the Gaussian fits in Figure 2 is their strong resemblance to the established tunings of temporal frequency channels (Anderson and Burr, 1985; Hess and Snowden, 1992; Waugh and Hess, 1994; Metha and Mullen, 1996; Cass and Alais, 2006; Cass et al., 2009a,b). It is thought that there are only two (or possibly three) temporal channels, and these are well characterized by partially overlapping Gaussian functions. One channel is low-pass (or sustained), the other is/are band-pass (or transient) peaking between 10 and 20 Hz, as shown by the dashed curves in Figure 3. This resemblance is certainly striking, but why would a visual search function resemble the tuning profiles of early visual filters? We propose that when the target and distractors are similar in frequency they activate the same temporal channel, making it difficult to determine activity related to the target from activity generated by competing distractors. In other words, there is a low signal-to-noise ratio, making the task difficult and slow. In contrast, when frequencies differ more widely, the target and distractors are likely to activate separate channels. This would induce a local bias in the relative output of localized temporal channels, the sign of which would uniquely signify the location of the target, thereby driving efficient visual selection. The plots in the left column of Figure 2 illustrate this.
Here distractors are low frequency and the curves decline from long RT when target frequency is low to a minimum when target frequency is high. The minimum arises at around 10 Hz because at this frequency the target optimally activates the high-frequency temporal channel at its peak sensitivity. Simply inverting the function reveals its agreement with temporal channels derived using other methods such as masking or adaptation.
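The channel account above can be made concrete with a toy model. The sketch below is not the authors' model: it assumes two Gaussian channels with hypothetical peaks (0 and 10 Hz) and a hypothetical common bandwidth (4 Hz), and it defines a signed "bias" as the normalized difference between the two channel outputs. All names and parameter values are illustrative.

```python
import math

# Hypothetical channel parameters, chosen only for illustration.
SUSTAINED = {"peak_hz": 0.0, "sigma_hz": 4.0}    # low-pass-like channel
TRANSIENT = {"peak_hz": 10.0, "sigma_hz": 4.0}   # band-pass channel near 10 Hz


def channel_response(freq_hz, channel):
    """Gaussian sensitivity of one channel at the given temporal frequency."""
    d = freq_hz - channel["peak_hz"]
    return math.exp(-(d * d) / (2.0 * channel["sigma_hz"] ** 2))


def channel_bias(freq_hz):
    """Signed balance of the two channels: > 0 transient-dominated, < 0 sustained-dominated."""
    s = channel_response(freq_hz, SUSTAINED)
    t = channel_response(freq_hz, TRANSIENT)
    return (t - s) / (t + s)


def target_distinctiveness(target_hz, distractor_hz):
    """How far the target's channel bias sits from the distractors' bias."""
    return abs(channel_bias(target_hz) - channel_bias(distractor_hz))


near = target_distinctiveness(2.7, 1.3)    # similar frequencies, same channel
far = target_distinctiveness(12.1, 1.3)    # dissimilar frequencies, separate channels
```

In this toy version the distinctiveness is the same whichever item is the target, and it grows and then saturates as the frequencies separate, which is qualitatively the symmetric, asymptotic pattern described in the text.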

EXPERIMENT 2
In Experiment 1 we found that increasing the number of distractor items had no effect on search performance once a critical difference between target and distractor frequencies had been reached. This was so regardless of whether the unique frequency was higher or lower than the distractor frequency. In other words search efficiency was symmetric with respect to temporal frequency. To determine whether a unique temporal frequency automatically captures attention in a stimulus-driven fashion, we tested if there was a search "cost" when the unique temporal frequency (1.3 or 12.1 Hz) indicated a distractor location rather than a target. These "cost" trials were interleaved randomly with trials similar to those in Experiment 1 where the unique frequency correctly cued the target location and with neutral trials where all elements modulated at the same frequency. If unique temporal frequencies guide visual selection in an exogenous manner, then we predict a search cost (relative to the neutral condition) when the unique frequency is paired with a distractor, as well as a search benefit as in Experiment 1 when the unique frequency is paired with the target. Figure 4 shows the results of this experiment (RT averaged across observers). The abscissa represents trials in which a unique temporal frequency was co-incident with either a target (deviant target), a distractor (deviant distractor), or was absent (deviant absent or neutral condition). Green bars indicate trials where the deviant object modulated at 1.3 Hz embedded within a 12.1 Hz object array, and orange bars, 12.1 Hz deviants within a 1.3 Hz array. In the neutral condition, all objects modulated at either 1.3 or 12.1 Hz (green and orange bars respectively).
A within-subjects ANOVA revealed a significant interaction between the type of deviant (target vs. distractor vs. absent) and deviant frequency (low vs. high; F(2, 3) = 6.0, p = 0.037). We also found a significant interaction between the type of deviant object (target vs. distractor) and deviant frequency (low vs. high; F(1, 3) = 8.7, p = 0.042), and a reliable main effect of deviant object type (F(1, 3) = 13.5, p < 0.021). The main effect of deviant object type was further examined by separate paired t-tests. The first of these confirmed the search benefit result observed in Experiment 1 whereby pairing a unique temporal frequency with a target object facilitates search performance compared to conditions in which a deviant frequency is not present (t(3) = 6.6, p = 0.007). As in Experiment 1, we observed a symmetrical search benefit in that unique targets, whether lower than or higher than the distractor frequency, produced significantly faster search times than the neutral condition. Again, too, unique high-frequency targets were identified more quickly than unique low frequency targets, as indicated by the two-way interaction (F(1, 3) = 14.4, p = 0.019). As noted above, however, this apparent asymmetry in the search performance for low and high-frequency target singletons is not related to search efficiency as there was no effect of set size under either of the target × distractor frequency conditions (Experiment 1).

FIGURE 4 | Mean reaction times for correctly identifying the orientation of the target element (horizontal vs. vertical) when presented under the various temporal contexts in Experiment 2 (set size = 7 items). Error bars represent SE of inter-subject means (n = 4).
Critically, and consistent with our prediction, we find significant costs when the unique frequency is paired with a distractor object compared to neutral conditions [t(3) = 3.8, p = 0.015 (one-tailed)]. The combination of performance benefits when the unique frequency is paired with a target, and performance costs when a unique frequency is paired with a distractor, strongly suggests that unique temporal frequencies capture attention in an automatic, stimulus-driven fashion. Interestingly, no significant interaction was observed between deviant frequency and deviant type (cost vs. neutral; F(1, 3) = 0.317, p = 0.603). The fact that we can produce both search benefits and search costs, and can do so using a low or a high-frequency deviant, confirms three important points. First, it demonstrates that temporal frequency can guide visual selection in an automatic manner (as seen in the "cost" result). Second, it shows that temporal-frequency guided search is symmetrical (low or high frequencies are both effective). Third, it is the temporal frequency difference between target and distractors that drives efficient search, not the absolute frequency.

DISCUSSION
Our results clearly demonstrate that temporal frequency differences alone can support very efficient visual search. More specifically, four important findings are revealed. First, for efficient search to occur, the target and distractor frequencies must differ enough (∼5 Hz) to drive separate temporal frequency channels (Experiment 1). Second, once that critical temporal frequency difference is reached, search efficiencies are remarkably symmetrical: high frequencies are easily found among low frequencies for all set sizes, and, in a novel finding, low frequencies are easily found among high frequencies (Experiments 1 and 2). Importantly, we can be sure this is not a consequence of spurious stimulus transients as all modulations were sinusoidal. Third, even though search is equally efficient for low and high-frequency singletons alike (as evidenced by the equivalent search slopes beyond a critical frequency), high-frequency singletons are nonetheless found more rapidly than low frequency singletons (Experiment 1). Finally, deviant frequencies capture attention in an automatic, stimulus-driven manner (Experiment 2).
An early paper that examined the role of modulation rate on search performance (Ivry and Cohen, 1992) reported a strong asymmetry whereby fast apparent motion singletons were detected more efficiently when embedded within slower distractors than the converse (slow targets among rapid distractors). We suggest this asymmetrical transient-bias resulted from the rather narrow range of temporal frequencies employed. In their study the comparison frequencies on a given trial ranged from 1-2 Hz ("low frequency") to 2.3-4.6 Hz ("high-frequency"), a much narrower range than used in our study (1.3-12.1 Hz). Not only was the range narrow, even their high frequencies were too low to optimally drive the high temporal frequency ("transient") channel, which peaks at around 8-12 Hz. For this reason, it is unlikely that their results can be explained in terms of target and distractors driving separate populations of neural temporal filters.
More recently, Spalek et al. (2009) reported that critical differences in flicker rate are capable of driving efficient search. Several factors make their results difficult to interpret, however. One is that they used temporally broadband stimuli (i.e., square-wave modulations). Fourier analysis of their modulation rates reveals substantial overlap between the temporal spectra of their target and distractor stimuli, raising the question of exactly what information drove their reported effects. Second, in their study all distractors modulated in synchrony. This would produce a strong grouping cue that would facilitate identification of the target, as well as a periodic luminance signal that is informative of target location. One could, in principle, find the target by taking a "snapshot" of the display and searching for the deviant luminance value. Because of this, any evidence of pop-out cannot necessarily be attributed to modulation frequency.
Our study overcomes the limitations of earlier studies to show conclusively that temporal frequency differences can support efficient visual search. It adds to related studies showing that abrupt luminance onsets and offsets can capture attention (Pinto et al., 2006), whether the flash is consistent with the presentation of a new object or not (Franconeri et al., 2005; Hollingworth et al., 2010). What is not clear from these and other studies of temporal salience (Ivry and Cohen, 1992; Spalek et al., 2009), however, is what temporal frequency constraints underlie their results. This is complicated by the fact that all previous studies have employed abrupt luminance onsets and offsets, which have a very broad temporal frequency spectrum (Van der Burg et al., 2010), making it impossible to conclude which particular frequencies underlie the effect. Our study circumvents this problem by using sinusoidal modulations, which by definition contain a single frequency. Using this approach, we have confirmed that temporal salience does not require a temporally complex stimulus, and is not a consequence of a broadband artifact. Rather, a temporal frequency difference alone is sufficient to enable efficient visual search, provided this difference exceeds about 5 Hz so the target and distractors differentially activate the underlying neural temporal filters.
Even though temporal frequency search shows a symmetrical efficiency once a critical target/distractor frequency difference is reached, observers nonetheless responded more quickly to high than to low frequency targets (by ∼440 ms). This, we believe, may simply reflect the longer temporal period of low frequencies (769 ms per cycle at 1.3 Hz) compared to high frequencies (83 ms per cycle at 12.1 Hz). Alternatively, and consistent with the notion that the underlying temporal filters make efficient search possible, faster RT to high frequencies may reflect the greater overall responsiveness of the transient compared with the more sustained channel (lower frequency; Langley and Bex, 2007). A more intriguing notion is that this asymmetry may reflect an interaction between temporal frequency-selective filters. Such an asymmetry has been reported previously using an overlay masking paradigm, in which high-frequency luminance modulation stimuli interfered with the detection of spatio-temporally superimposed low frequency luminance modulation, but not vice versa (Cass and Alais, 2006).
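The cycle durations quoted above follow directly from period = 1/frequency. A minimal check, with an illustrative function name:

```python
def period_ms(freq_hz):
    """Duration of one modulation cycle in milliseconds."""
    return 1000.0 / freq_hz


low_period = period_ms(1.3)     # about 769 ms per cycle
high_period = period_ms(12.1)   # about 83 ms per cycle

print(round(low_period), round(high_period))
```

A full cycle at 1.3 Hz therefore lasts roughly nine times longer than a cycle at 12.1 Hz, which is the basis for the suggestion that low frequency targets simply take longer to register.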
To be sure the observed search efficiencies really do reflect stimulus-driven attention based on temporal frequency differences, we verified in a second experiment that a symmetric pattern of search costs occurs when the unique temporal frequency is paired with a distractor rather than the target. The absence of a set size effect in Experiment 1 and the costs observed in Experiment 2 confirm our hypothesis that differences in the rate of flicker across the visual field are capable of pre-attentively guiding visual selection and that this effect is symmetric with respect to temporal frequency. This is to be contrasted with earlier studies, which find evidence for search asymmetries in the temporal frequency domain, characterized by efficient search for high, but not low frequency targets.
Highlighting the link between efficient search and activity in underlying temporal filters, we also show that plotting search RT for low and for high-frequency targets as a function of distractor temporal frequency produces a pattern strongly resembling the known spectral profiles of human temporal frequency channels (see Figures 2 and 3). Overall, our results demonstrate that attentional capture based on temporal frequency differences is intimately linked to the tunings and relative outputs of these channels. This suggests that similar links would be found for visual search in other basic stimulus dimensions for which underlying visual filters are well characterized, such as orientation, spatial frequency, and color (Campbell and Robson, 1968; Flanagan et al., 1990; Cass et al., 2009c).
Given that both low and high temporal frequencies are capable of capturing attention, why then does visual movement (transient information) so potently capture attention under natural viewing conditions? We propose this arises because the temporal content of natural scenes is typically dominated by low temporal frequencies.
It is well established for dynamic natural images that the amplitude spectrum for temporal frequency declines with increasing frequency according to a 1/f profile (Dong and Atick, 1995;van Hateren and van der Schaaf, 1996;Cass et al., 2009a). For this reason, any local transient information is likely to perceptually "pop-out" from its low frequency-biased background. This probably explains why waving, for example, is an effective gesture for attracting someone's attention.
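To give a sense of scale for the 1/f argument, the sketch below compares the amplitude expected at the two flicker rates used here under an idealized, pure 1/f spectrum. Applying the profile to these two specific frequencies is an illustration rather than a measurement, and the function name is hypothetical.

```python
def relative_amplitude(freq_hz, reference_hz):
    """Amplitude at freq_hz relative to reference_hz under a pure 1/f spectrum."""
    return reference_hz / freq_hz


# Amplitude at 12.1 Hz as a fraction of the amplitude at 1.3 Hz.
ratio = relative_amplitude(12.1, 1.3)
print(round(ratio, 3))
```

Under this idealization, energy at 12.1 Hz is roughly a tenth of that at 1.3 Hz, so a localized high-frequency signal is rare relative to its surroundings in a natural scene.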

SUPPLEMENTARY MATERIAL
The Movies S1 and S2 for this article can be found online at http://www.frontiersin.org/perception_science/10.3389/fpsyg.2011.00320/abstract

Movies S1 | Example trials in Experiment 1. In these examples, the target- and distractor-centered annuli are sinusoidally modulated at different frequencies. (A) Target = 12.1 Hz, distractors = 1.3 Hz; (B) Target = 1.3 Hz, distractors = 12.1 Hz. Note that the frequency of the modulation is dependent on the refresh-rate of your computer monitor.

Movies S2 | Example trials in Experiment 2. (A) Example deviant absent (neutral) condition: target- and distractor-centered annuli modulate at 1.3 Hz. (B) Example deviant distractor condition: a single distractor-centered element modulates at a unique frequency (12.1 Hz) relative to other elements in the display (1.3 Hz). For examples of deviant target conditions see Movies S1A,B. Note that the frequency of the modulation is dependent on the refresh-rate of your computer monitor.