Why is Binocular Rivalry Uncommon? Discrepant Monocular Images in the Real World

When different images project to corresponding points in the two eyes they can instigate a phenomenon called binocular rivalry (BR), wherein each image seems to intermittently disappear such that only one of the two images is seen at a time. Cautious readers may have noted an important caveat in the opening sentence – this situation can instigate BR, but usually it doesn’t. Unmatched monocular images are frequently encountered in daily life due to either differential occlusions of the two eyes or because of selective obstructions of just one eye, but this does not tend to induce BR. Here I will explore the reasons for this and discuss implications for BR in general. It will be argued that BR is resolved in favor of the instantaneously stronger neural signal, and that this process is driven by an adaptation that enhances the visibility of distant fixated objects over that of more proximate obstructions of an eye. Accordingly, BR would reflect the dynamics of an inherently visual operation that usually deals with real-world constraints.

Binocular rivalry (BR) papers usually begin with a fib. The near ubiquitous phrase is something like "when different images are shown to the two eyes they rival for perceptual dominance, such that only one image is seen at a time while the other is suppressed from awareness." Statements like this are greatly misleading, and the reasons for this misconception speak both to the function of binocular suppression, and consequently to processes that are fundamental to BR.

DISCREPANT MONOCULAR IMAGES IN THE REAL WORLD
There are at least two reasons why humans frequently encounter completely different monocular images at corresponding points on the two retinas: differential occlusions of the two eyes and selective obstructions of just one eye. Neither situation typically results in BR.
As depicted in Figures 1A,B, when an object can be seen in both eyes it will occlude more distant parts of the visual scene. Importantly, different sections of the distant scene can be selectively visible to either eye. Figure 1A depicts an example where a disembodied head is floating in space behind a pillar. Obviously this graphic is not going to win any artistic accolades, and there is more than a touch of irony in trying to depict a real-world constraint using a disembodied head, but hopefully this will serve to illustrate a point. The disembodied head is peering down at a point beyond the pillar. The bold lines depict the nearest points visible to either eye just to the left of the pillar. As can be seen, a region beyond the pillar is selectively visible to the left eye (as images of the pillar will reside at corresponding points on the right retina). This zone, shaded gray, is called a monocular occlusion zone (Gillam and Borsting, 1988; Nakayama and Shimojo, 1990; Ono et al., 2003; see also Harris and Wilcox, 2009 for a recent review). Figure 1B attempts to depict the same type of scenario viewed from above. Here the right eye is colored black and the left eye gray. Both eyes are converged to fixate a distant point (the dash) beyond the pillar (which is now depicted as a gray circle). Black dotted lines depict the limits of the monocular occlusion zone caused by the image of the pillar in the right eye, and dotted gray lines depict the limits of the monocular occlusion zone for the left eye. Note that both of these zones are visible to the other eye. The important point to take from these illustrations is that we frequently encounter monocular occlusion zones, but these very rarely, if ever, induce BR in daily life.
A selective obstruction of just one eye is depicted in Figure 1C. Here someone has crept up to a doorway and is peeking around the doorframe. This attempt to see without being seen results in the exposed eye having an unobstructed view of the distant scene, whereas the occluded eye can only see the back of the doorframe. This type of scenario does not just occur when people are sneaky. An analogous situation can occur if you try to look down past your nose at an acute angle, or if you lie down with the side of your face in a pillow while looking across a room, or if you stick a finger directly in front of one eye while reading this text. As the reader can demonstrate for themselves, none of these situations typically result in BR despite the presence of completely different images at corresponding points in the two eyes. Why?

WHY DIFFERENTIAL OCCLUSIONS DON'T CAUSE BR
There is a geometric cue available to the brain when unmatched monocular images result from differential occlusion. As can be inferred from Figures 1A,B, a monocular occlusion zone will project to the temporal side of the retina (the side closest to the ear) relative to the image of the occluder (in this case the pillar). Stimuli that obey this constraint are resistant to suppression during BR. In contrast, unmatched monocular images that project to the nasal side of the retina, relative to an image seen in both eyes, are susceptible to BR (Nakayama, 1990, 1994). Thus at least one reason BR is uncommon is that the processes responsible are sensitive to differential occlusion cues. So BR is not instigated by the unmatched monocular retinal images that are encountered in daily life as a consequence of differential occlusion.

WHY SELECTIVE OBSTRUCTIONS DON'T CAUSE BR
When an occluder is so close that it only obstructs one eye there is no geometric cue to signal which of the unmatched retinal images is of an occlusion, and which is of a more distant point of regard. Evidently the visual system does, however, differentiate between these images as perception tends to be dominated by images relating to more distant objects. We are usually only faintly aware of images of selective obstructions because they are persistently suppressed from awareness. Hence you can still read this text if you place a finger directly in front of one eye. One can also easily demonstrate that this involves an active suppression, as the image of the selective obstruction jumps into awareness if one shuts the unobstructed eye. So what cues, or properties, of an image of a selective obstruction does the visual system tap to ensure it is suppressed from awareness?

SIGNAL STRENGTH
The concept of signal strength (Levelt, 1968) will be familiar to most readers with a passing interest in BR. Historically signal strength seems to have been a somewhat circular concept. Whenever a stimulus property was found to influence the probability of perceptual dominance during BR it was added to a grab bag of characteristics collectively termed signal strength. However, close inspection of this grab bag reveals that many features within it could be used to differentiate images of selective obstructions from images of more distant objects (see Fahle, 1982a,b; Arnold et al., 2007; Changizi and Shimojo, 2008).
The reader should bear in mind that young adults cannot accommodate to focus on an object that is much closer than a viewing distance of ∼10 cm, so images of selective obstructions are necessarily blurred, as they have to be very close to an eye in order to obstruct it selectively. Tellingly, image blur was one of the first characteristics placed under the term "signal strength" (Levelt, 1968). Blurring an image selectively reduces its higher spatial frequency content, and this too contributes to signal strength (Fahle, 1982a,b; Wolfe, 1983). Similarly, blurring an image reduces image contrast (Fahle, 1982a,b), and both luminance and chromatic contrasts contribute to signal strength (Levelt, 1968; Mueller and Blake, 1989; Kovacs et al., 1996; Pearson and Clifford, 2004). Clearly signal strength, or at least a number of characteristics grouped under this term, would be useful for a process that strives to suppress awareness of selective obstructions in order to enhance the visibility of more distant objects (Fahle, 1982a,b; Arnold et al., 2007; Changizi and Shimojo, 2008). So another reason BR is uncommon is that the images of selective obstructions have a very low signal strength.

SIGNAL STRENGTH AND NATURAL IMAGES
Natural images are complicated, so historically vision scientists have focused on simplified stimuli that are more easily controlled. However, some brave souls have investigated the properties of natural images and how the visual system responds to them (Maloney, 1986; Field, 1987; Zetzsche et al., 1993; Geisler et al., 2001; Simoncelli and Olshausen, 2001; Mante et al., 2005; Geisler, 2008). Pertinently, it has been established that the mechanisms responsible for BR are sensitive to the characteristics of a natural image (Baker and Graf, 2009).
Natural images contain luminance changes that can be detected at different spatial scales. Imagine you have taken a picture, and you want to know how luminance changes are distributed in terms of spatial scale. Figuratively, you could move a very small circle around each part of the image and work out how often that circle contains a difference in luminance, and how large that variance is. You now have an estimate of how much variance in luminance occurs within the image at a fine spatial scale. You can then repeat the exercise with progressively larger circles to determine estimates for progressively coarser scales. When this type of analysis was applied to images of natural scenes most of the variance was found at coarse spatial scales and progressively less variance was found at finer spatial scales. Importantly the drop off was linear when plotted on log-log axes, so it is said to obey a 1/f amplitude spectrum, where f is spatial frequency, the inverse of spatial scale (Maloney, 1986; Field, 1987; Geisler et al., 2001; Simoncelli and Olshausen, 2001; Mante et al., 2005; Geisler, 2008). The relevance of this for BR is that you can generate random patterns that obey this constraint and compare them to patterns that don't, and the former tend to dominate perception during BR (Baker and Graf, 2009).
We could add a 1/f amplitude spectrum to the grab bag of properties that contribute to image signal strength, or we could perhaps simplify things further. The images analyzed to determine the properties of natural scenes tend to be taken by proficient photographers. Omitted are the numerous defocused images taken by less gifted practitioners. If blurry photos were analyzed one would find that their amplitude spectrum does not conform to a 1/f spectrum, as there would be no content at a fine spatial scale and so the drop off in content with increasingly fine spatial scale would be too rapid. So we can take this type of finding as yet further evidence that focused images tend to dominate perception during BR, contributing to BR being uncommon as distant focused images tend to suppress awareness of the blurred images of selective obstructions.
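The argument about blurry photographs can be checked numerically. The sketch below is only an illustration under stated assumptions: it builds a random pattern with a 1/f amplitude spectrum, blurs it with a Gaussian (a stand-in for optical defocus, which is an assumption, not a model of the eye), and fits the slope of the orientation-averaged amplitude spectrum on log-log axes. The function names, the 128-pixel image size, and the blur width of 2 pixels are all hypothetical choices made for this example.

```python
import numpy as np

def one_over_f_noise(n, rng):
    """Random n x n pattern with random phases and amplitude proportional to 1/f."""
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0                       # placeholder so the DC term does not divide by zero
    spectrum = np.exp(1j * rng.uniform(0, 2 * np.pi, (n, n))) / f
    spectrum[0, 0] = 0.0                # zero mean luminance
    return np.real(np.fft.ifft2(spectrum))

def gaussian_blur(image, sigma):
    """Blur by multiplying the spectrum with a Gaussian transfer function (sigma in pixels)."""
    n = image.shape[0]
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    transfer = np.exp(-2 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def amplitude_slope(image):
    """Slope of log(mean amplitude) against log(spatial frequency) for a square image."""
    n = image.shape[0]
    amplitude = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    y, x = np.indices(image.shape) - n // 2
    ring = np.hypot(x, y).round().astype(int)    # integer distance from the DC term
    total = np.bincount(ring.ravel(), weights=amplitude.ravel())
    count = np.bincount(ring.ravel())
    freqs = np.arange(1, n // 2)                 # skip DC, stop below the Nyquist limit
    mean_amp = total[1:n // 2] / count[1:n // 2]
    slope, _ = np.polyfit(np.log(freqs), np.log(mean_amp), 1)
    return slope

rng = np.random.default_rng(0)
sharp = one_over_f_noise(128, rng)
blurred = gaussian_blur(sharp, sigma=2.0)
print("sharp slope:  ", amplitude_slope(sharp))
print("blurred slope:", amplitude_slope(blurred))
```

Under these assumptions the sharp pattern should give a slope close to -1, whereas the blurred copy should give a markedly steeper slope, which is the sense in which a defocused image departs from a 1/f spectrum.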

SOME LAWS ARE MADE TO BE BROKEN
Ultimately BR is uncommon as unmatched monocular images in real life are often persistently suppressed. So you can place a finger immediately in front of one eye while reading this text, and wait, and wait, and wait, and for the vast majority it will never dominate perception by suppressing awareness of the text. This might prompt the question, are these situations relevant to BR, which after all is characterized by changes in perceptual dominance?
One of the oft quoted characteristics of BR is that increasing the relative signal strength of an unmatched monocular image will increase the frequency at which it becomes dominant, but will not extend its individual periods of dominance (Levelt, 1968). There is a great deal of evidence consistent with this premise (Levelt, 1968; Fox and Rasche, 1969; Mueller and Blake, 1989; Bossink et al., 1993), but clearly this second law of BR (Levelt, 1968) must be broken if the inherently weak signal strength of images relating to selective obstructions contributes to BR being absent in daily life.
More recently it has been established that the second law of BR breaks down if you further increment the signal strength of an image that already has a greater relative signal strength (Brascamp et al., 2006). This, and similar findings (Mueller and Blake, 1989; Bossink et al., 1993), has prompted a more nuanced guideline: that changes to relative signal strength will predominantly impact the dominance durations for the stimulus with a higher signal strength (Brascamp et al., 2006). Critically the impact is to lengthen its dominance durations. So, if we take this to a logical extreme the inherently weak signal strength of the blurred images of selective obstructions could result in their being reliably and persistently suppressed via the focused images of more distant objects.

SIGNAL STRENGTH, EYES, AND PATTERNS – EVERYONE'S WRONG
One of the longest running debates concerning BR regarded whether suppression targets the input from a given eye (Blake and Fox, 1974; Blake et al., 1979, 1992; Dutour, 1760, translated by O'Shea, 1999; Tong and Engel, 2001), or if it targets one of the two conflicting images (Dorrenhaus, 1975; Logothetis et al., 1996). There is good evidence that supports both propositions (Dorrenhaus, 1975; Blake et al., 1979; Logothetis et al., 1996), so contemporary consensus holds that both views were right all along, which is a popular sort of resolution, but not one that is necessarily correct. In the interests of being deliberately provocative one could suggest an alternative: that both camps were fundamentally wrong.
One possibility, that has perhaps not attracted the attention it deserves, is that during BR perception simply tracks the unmatched monocular signal with the instantaneously higher signal strength. Sometimes this might be tied to a particular monocular channel whereas at others it might switch rapidly between monocular channels. Why would an adaptation that has evolved to deal with a real-world constraint allow for a signal to switch rapidly between monocular channels? For illustrative purposes, refer to the picture of a cute kitten that is Figure 2. As happens so often, this kitten has found itself in a tree. As a consequence one of its two eyes could easily become obstructed by a leaf while it looks into the distance, searching for a kind hearted soul with a ladder. If the wind were to start moving the branches a leaf could rapidly switch between selectively obstructing one or another eye, both eyes or neither eye. To maximize the kitten's chances of spotting a distant rescuer it would be optimal if the image of the proximate obstruction could instantaneously be suppressed no matter which eye it projects to, even if it rapidly switches between being encoded in different monocular channels.
Adult humans perhaps spend less time in trees than they should, and presumably much less time than our monkey-like forebears, but a conceptually similar scenario with which the reader might be better acquainted can happen when walking past a picket fence. If one looks through a proximate picket fence while walking, distant points of interest can rapidly switch between projecting to either eye, to both eyes, or to neither eye. Thus again, in order to maximize the visibility of interesting distant objects, it would be beneficial to instantaneously suppress signals relating to proximate obstructions regardless of which eye they project to.
In a conceptual emulation of these real-world scenarios, recent studies have shown that if conflicting images that differ in signal strength alternate between the eyes, the stronger signal can reliably and persistently suppress awareness of the weaker signal (Arnold et al., 2007). Crucially the participants in these studies were very bored. While this is common in psychophysical tasks, in this context their boredom had scientific merit. In a majority of trials participants felt they were simply watching a static picture of a girl or a house (Arnold et al., 2007) or of even more tedious static white noise. They were unaware that these images were switching between projecting to either eye in counterphase with a weaker signal. Note that there was no flicker to mask these alternations, as is necessary for persistent perceptual dominance when conflicting images have approximately equal signal strength (Logothetis et al., 1996; Lee and Blake, 1999). Thus these studies were akin to our kitten being able to persistently see fixated objects in the distance as a swaying leaf rapidly switches between obstructing either eye. While it is pleasing that this could be demonstrated in BR experiments (Arnold et al., 2007), to continue the real-world emphasis of this discourse an uninhibited reader can demonstrate this principle by wiggling fingers in front of their eyes, such that each eye is alternately obstructed. You should find that you have no difficulty reading, and that this text is persistently visible despite switching between being encoded in different monocular channels. If you are not secluded you may also find that people are looking at you. The fact that perceptual dominance can seamlessly track an image as it is switched between the eyes (Arnold et al., 2007) implies that during BR perceptual dominance is resolved in favor of the instantaneously higher strength signal, as is required of a process that enhances the visibility of distant fixated objects over that of selective obstructions of an eye (see also Changizi and Shimojo, 2008). Consequently, from a functional perspective, BR is not resolved in favor of a signal from a specific eye (Blake and Fox, 1974; Blake et al., 1979, 1992; Dutour, 1760, translated by O'Shea, 1999; Tong and Engel, 2001), or in favor of a particular perceptual interpretation (Dorrenhaus, 1975; Logothetis et al., 1996); it is simply resolved in favor of the instantaneously higher strength signal.

Frontiers in Human Neuroscience
www.frontiersin.org

WHY DOES PERCEPTUAL DOMINANCE CHANGE IN BR EXPERIMENTS?
Because relative signal strength changes.
A common assumption is that an image associated with a higher signal strength will begin to dominate perception, but its signal strength disproportionately wanes over time, resulting in a relative neural signal strength change, and a consequent switch in perceptual dominance (Lehky, 1988; Blake, 1989). The fine details of this standard account are a matter of debate, but the waning of the dominant signal seems to be at least partially driven by neural adaptation (Blake et al., 1990, 2003; Carter and Cavanagh, 2007; Alais et al., 2010). An additional common assumption is that some source of noise is necessary to explain the stochastic dynamics of BR (Brascamp et al., 2006; Kim et al., 2006). Note that a commonly overlooked source of noise would involve an interaction between involuntary stochastic eye movements (Yarbus, 1967; Murakami and Cavanagh, 1998; van Dam and van Ee, 2005; Martinez-Conde et al., 2006) and neural adaptation (see Sabrin and Kertesz, 1983; Georgeson, 1984). While the fine details of the standard account will doubtless continue to be debated, many are comfortable with the basic assumption that a dominance change is driven by a change in relative signal strength. Surprisingly, behavioral evidence for this standard account was lacking until recently. But it has now been established that there is a gradual switch in the depth of suppression for content in either eye leading up to a dominance change. Crucially, content in the suppressed eye becomes relatively less suppressed in the moments leading up to a dominance change (Alais et al., 2010). These observations are perfectly consistent with BR being resolved in favor of the instantaneously stronger signal.
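The logic of the standard account can be made concrete with a toy simulation. The sketch below is a deliberately minimal illustration, not a published model: the dominant signal adapts toward a ceiling, the suppressed signal recovers, and dominance switches whenever the suppressed signal momentarily exceeds the dominant one. All parameter names and values are hypothetical, the optional noise term is a crude per-step perturbation, and the fixed inhibition term is an added assumption (it is not described in the text) that stops the toy from flickering at the moment the two signals are equal. It is not intended to reproduce Levelt's laws quantitatively.

```python
import random

def simulate_rivalry(strength_a, strength_b, steps=20000, dt=0.001,
                     tau=1.0, gain=1.0, inhibition=0.3, noise_sd=0.0, seed=0):
    """Toy adaptation model; returns (number of switches, [time A dominant, time B dominant])."""
    rng = random.Random(seed)
    strengths = [strength_a, strength_b]
    adapt = [0.0, 0.0]                       # adaptation state of each signal
    dominant = 0 if strength_a >= strength_b else 1
    switches = 0
    time_dominant = [0.0, 0.0]
    for _ in range(steps):
        # The dominant signal adapts toward `gain`; the suppressed one recovers toward 0.
        for i in (0, 1):
            target = gain if i == dominant else 0.0
            adapt[i] += dt * (target - adapt[i]) / tau
        other = 1 - dominant
        noise_dom = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        noise_sup = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        eff_dom = strengths[dominant] - adapt[dominant] + noise_dom
        # Assumed inhibition: the suppressed signal must beat the dominant one by a margin.
        eff_sup = strengths[other] - adapt[other] - inhibition + noise_sup
        if eff_sup > eff_dom:                # perception tracks the momentarily stronger signal
            dominant = other
            switches += 1
        time_dominant[dominant] += dt
    return switches, time_dominant

for a, b in [(1.0, 1.0), (1.0, 0.8), (1.0, 0.2)]:
    n_switches, durations = simulate_rivalry(a, b)
    print(a, b, n_switches, [round(d, 2) for d in durations])
```

With these hypothetical settings, matched strengths should alternate regularly, a modest mismatch should bias dominance toward the stronger signal, and a large mismatch should leave the weaker signal suppressed throughout, which parallels the argument that weak images of selective obstructions need never rival their counterpart.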

IS BR RELATED TO OTHER MULTI-STABLE PHENOMENA?
If perceptual suppressions during BR are driven by an adaptation that enhances the visibility of focused retinal images, instead of the blurred images of selective obstructions, BR would be unlikely to be directly related to a range of other multi-stable phenomena.
A popular assumption is that BR and other multi-stable phenomena are driven by a common process that deals with situations wherein perceptual input is ambiguous (Andrew and Purves, 1997; Leopold and Logothetis, 1999; Sterzer et al., 2009). For instance, an impression of a rotating cylinder or globe can be created by using a field of dots that translate back and forth. Crucially the direction of rotation is ambiguous, and seems to intermittently reverse (Miles, 1931; Howard, 1961; Blake et al., 2003). Motion-induced blindness is another example, wherein static dots can seem to intermittently disappear when surrounded by movement (Bonneh et al., 2001; Graf et al., 2002; Hsu et al., 2006; Wallis and Arnold, 2009) or flicker (Kawabe and Miura, 2007; Wallis and Arnold, 2008). Another classic example, depicted in Figure 3, is the Necker cube (Necker, 1832). Here lines mark the edges of a three dimensional cube. One of the sides of the cube is gray, whereas others are white. At times the gray side may seem to be located in front and at others behind, and as one watches this relationship will seem to intermittently reverse.
Other than their subjective similarity, with perception flipping between different states in the presence of unchanging input, is there any evidence that links various instances of multi-stable perception? In short, yes there is, but the evidence is inconclusive and it does not dictate that the diverse phenomena are driven by a common process.
One piece of evidence linking diverse multi-stable phenomena is that distributions of periods for which percepts seem to persist tend to conform to a gamma distribution (Kovacs et al., 1996; Logothetis et al., 1996; Andrew and Purves, 1997; Carter and Pettigrew, 2003; Murata et al., 2003). This is a complicated way of saying that a few percepts will persist for a very brief period and a few will persist for variable longer periods, but most will persist for a medium duration, in sum producing a distribution with a marked right skew. This constitutes weak evidence for a link for at least two reasons. First, distributions of obviously unrelated phenomena also conform to a gamma distribution, such as the distribution of rainfall over time (Barger and Thorn, 1949). Second, if one asks a person to press a button randomly, the distribution of times for which they depress the button might also conform to a gamma distribution (see Edwards and Li, 2002). Stronger evidence for a link can be found in the fact that people who report slow perceptual switches during one type of multi-stable perception also tend to report slow switching in other forms (Carter and Pettigrew, 2003). This evidence is inconclusive, however, as perceptual dominance changes are seldom sharply defined. During BR, for instance, a switch in perceptual dominance can begin, with the dominant image seeming to fade or blur, then pause, reverse, then begin all over again. Consequently the criterion adopted for reporting a change in perceptual dominance can have a profound impact on the dynamics of the phenomenon as recorded by the experimenter. The correlation between the dynamics of diverse multi-stable phenomena might therefore speak to a tendency to adopt tight or relaxed criteria when reporting changes, rather than to the diverse phenomena being driven by a common process.
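For readers unfamiliar with the gamma distribution mentioned above, its right skew is easy to see in simulated data. The following sketch simply draws hypothetical "dominance durations" from a gamma distribution with Python's standard library and summarizes them; the shape and scale values are arbitrary assumptions chosen for illustration, not estimates from any BR experiment.

```python
import random
import statistics

def sample_durations(n, shape=3.0, scale=1.0, seed=0):
    """Draw n hypothetical percept durations (arbitrary units) from a gamma distribution."""
    rng = random.Random(seed)
    return [rng.gammavariate(shape, scale) for _ in range(n)]

def skewness(values):
    """Population skewness: the mean cubed z-score (positive means a long right tail)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

durations = sample_durations(10000)
print("mean:    ", statistics.fmean(durations))
print("median:  ", statistics.median(durations))
print("skewness:", skewness(durations))
```

A mean that exceeds the median, together with a positive skewness, is the numerical signature of the long right tail described in the text. Because many unrelated generative processes produce this shape, observing it in two phenomena says little about whether they share a mechanism.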

NEURAL SUBSTRATE – SOME OUTRAGEOUS SPECULATION
One of the reasons BR research has enjoyed a resurgence in prominence is the tantalizing prospect that it might shed light on the neural substrates of consciousness. Thus far this discussion has focused on the plausible function of binocular suppression, the proposal being that it is to facilitate the visibility of distant focused objects over that of more proximate obstructions. If this is the goal of perceptual suppressions during BR what, if anything, does this say about the neural substrates of BR?
At the risk of stating the obvious, this goal would necessitate that the substrate has access to each of the conflicting signals, so that it can determine which of the two signals most likely relates to an obstruction. Seemingly this would place the critical substrate in cortex, the first site in the human visual system where there is robust evidence of cross talk between inputs from the two eyes (Barlow et al., 1967; Poggio and Fischer, 1977). This goal also implies that the substrate is unlikely to be found at sites where activity maximally correlates with perception during BR. At such sites there is little evidence of a signal relating to suppressed input (Tong et al., 1998; Moutoussis et al., 2005; Jiang and He, 2006). If there is no activity relating to a suppressed input there would be no need to suppress that signal, and no prospect of that signal subsequently overcoming its counterpart. Such sites likely reflect the consequence of a process at an earlier critical substrate.
To have any hope of identifying a critical neural substrate for BR one probably needs a targeted measure of brain activity, not a gross measure. Targeted measures can simultaneously track signals relating to different inputs from within a single brain structure, and can therefore track slight fluctuations in relative signal intensity (see Brown and Norcia, 1997). A gross measure of activity, on the other hand, can only provide information about the aggregate response of a neural substrate, and so one should probably not expect these to be sensitive to the critical signal strength fluctuations that seem to drive dominance changes during BR (see Alais et al., 2010). Gross measures of brain activity can, however, provide very pretty pictures of the brain, although the images are very expensive, and at least on occasion they are more colorful than computationally informative.
At this point popular consensus holds that there is no single critical site at which one or another signal is selected for suppression. This contention is encouraged by behavioral data showing that dominance can sometimes track the content of an eye (Blake et al., 1979), whereas at others it can track a particular image (Dorrenhaus, 1975; Logothetis et al., 1996). It is also encouraged by neuroimaging showing that signals at multiple sites can correlate with perception during BR (Tong et al., 1998; Lee and Blake, 1999; Polonsky et al., 2000; Tong and Engel, 2001; Moutoussis et al., 2005; Wunderlich et al., 2005; Jiang and He, 2006). However, the interconnectivity of different brain regions dictates that neither observation rules out the possibility of there being a single critical substrate where activity is modulated via interactions with other brain regions (Watson et al., 2004; van Boxtel et al., 2008a,b; Arnold et al., 2009; Kang et al., 2009; Quinn and Arnold, 2010). For instance, recent behavioral data has strongly implicated monocular mechanisms within the spread of perceptual dominance across complex images (human faces) that are usually linked to coding in higher-level brain structures. The implication is that, due to feedback, activity in higher-level brain structures could shape analyses at a single critical monocular substrate. Thus at this point there is no convincing evidence to discount the possibility that there is a single critical neural substrate for BR.

SO WHY IS BR UNCOMMON?
Discrepant monocular images are frequently encountered in daily life, but BR is seldom, if ever, experienced. So why do unmatched monocular images in the laboratory induce BR while those encountered outside it don't? Binocular rivalry does not occur in daily life as the images of either differential occlusions of the two eyes or of selective obstructions of one eye are persistently suppressed. If one accepts that the mechanisms responsible for this are responsible for binocular suppressions during BR, it follows that BR is uncommon as images of obstructions almost never rival their counterpart, presumably largely because of signal strength differences. By implication, perceptual dominance during BR would simply track the instantaneously stronger signal, and is therefore unlikely to reflect the dynamics of a more abstract process that deals with ambiguity.
Alternatively, one could presume that the mechanisms responsible for the perceptual suppression of obstructions are unrelated to BR – that unmatched monocular images excite completely different processes in and outside of the laboratory. One could adopt this position, but it doesn't seem sensible.