ORIGINAL RESEARCH article

Front. Comput. Neurosci., 20 March 2013

Multi-stable perception balances stability and sensitivity

  • 1Center for Behavioral Brain Sciences, Magdeburg, Germany
  • 2Department of Cognitive Biology, Otto-von-Guericke Universität, Magdeburg, Germany
  • 3Centre de Recerca Matemàtica, UAB Science Faculty, Barcelona, Spain
  • 4Department de Matemàtica Aplicada I, Universitat Politècnica de Catalunya, Barcelona, Spain
  • 5Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain

We report that multi-stable perception operates in a consistent, dynamical regime, balancing the conflicting goals of stability and sensitivity. When a multi-stable visual display is viewed continuously, its phenomenal appearance reverses spontaneously at irregular intervals. We characterized the perceptual dynamics of individual observers in terms of four statistical measures: the distribution of dominance times (mean and variance) and the novel, subtle dependence on prior history (correlation and time-constant). The dynamics of multi-stable perception is known to reflect several stabilizing and destabilizing factors. Phenomenologically, its main aspects are captured by a simplistic computational model with competition, adaptation, and noise. We identified small parameter volumes (~3% of the possible volume) in which the model reproduced both dominance distribution and history-dependence of each observer. For 21 of 24 data sets, the identified volumes clustered tightly (~15% of the possible volume), revealing a consistent “operating regime” of multi-stable perception. The “operating regime” turned out to be marginally stable or, equivalently, near the brink of an oscillatory instability. The chance probability of the observed clustering was <0.02. To understand the functional significance of this empirical “operating regime,” we compared it to the theoretical “sweet spot” of the model. We computed this “sweet spot” as the intersection of the parameter volumes in which the model produced stable perceptual outcomes and in which it was sensitive to input modulations. Remarkably, the empirical “operating regime” proved to be largely coextensive with the theoretical “sweet spot.” This demonstrated that perceptual dynamics was not merely consistent but also functionally optimized (in that it balances stability with sensitivity). 
Our results imply that multi-stable perception is not a laboratory curiosity, but reflects a functional optimization of perceptual dynamics for visual inference.

Introduction

The visual system extrapolates beyond the retinal evidence on the basis of prior experience of the visual world (Kersten et al., 2004; Hohwy et al., 2008; Friston et al., 2012). The inferential nature of vision becomes evident when prior experience shapes visual appearance (Weiss et al., 2002; Yang and Purves, 2003; Gerardin et al., 2010), in visual illusions (von Helmholtz, 1866; Bach and Poloschek, 2006; Gregory, 2009), and in visual hallucinations of certain patient populations (Ffytche et al., 2009).

The temporal dynamics of visual inferences is revealed in the phenomenon of multi-stable visual perception (von Helmholtz, 1866; Leopold and Logothetis, 1999; Blake and Logothetis, 2002; Sterzer et al., 2009). When certain ambiguous visual displays are viewed continuously, their appearance changes spontaneously from time to time. For example, some planar motion flows induce an illusory appearance of a volume moving in depth, which occasionally reverses its direction (“kinetic depth”) (Wallach and O'Connell, 1953; Sperling and Dosher, 1994). Implausible visual patterns not encountered in the natural environment induce particularly striking, multi-stable illusions. To reconcile such patterns with prior experience, even strong retinal inputs are intermittently removed from awareness, resulting in “monocular” or “binocular rivalry” (Campbell and Howell, 1972; Leopold and Logothetis, 1999; Bonneh et al., 2001; Blake and Logothetis, 2002).

Multi-stable visual perception engages a distributed network of occipital, parietal, and frontal cortical areas (Tong et al., 2006; Sterzer et al., 2009). The collective dynamics of this network reflects several stabilizing and destabilizing factors (Kohler and Wallach, 1944; Lehky, 1988; Blake et al., 2003; Lee et al., 2007). Firstly, competition between alternative appearances stabilizes whichever appearance dominates at the time (Blake et al., 1990; Alais et al., 2010). This competition seems to be mediated by inhibitory interactions operating locally within visual representations (Lee et al., 2007; Donner et al., 2008; Maier et al., 2008). Secondly, neural adaptation of visual representations progressively weakens the dominant appearance, limiting its temporal persistence (Wolfe, 1984; Nawrot and Blake, 1989; Petersik, 2002; Blake et al., 2003; Kang and Blake, 2010). Thirdly, neural noise initiates transitions between alternative appearances at irregular intervals (Hollins, 1980; Brascamp et al., 2006; Kim et al., 2006; Hesselmann et al., 2008; Sterzer and Rees, 2008; Sadaghiani et al., 2010; Pastukhov and Braun, 2011). Finally, volitional processes, such as attention shifts and eye movements, may also destabilize multi-stable appearance (Leopold et al., 2002; Mitchell et al., 2004; van Dam and van Ee, 2006; Zhang et al., 2011).

The interplay of stabilizing and destabilizing factors in multi-stable perception can be captured by simplistic computational models (Laing and Chow, 2002; Moldakarimov et al., 2005; Moreno-Bote et al., 2007; Noest et al., 2007; Shpiro et al., 2007; Curtu et al., 2008; Shpiro et al., 2009), at least under certain stimulus conditions (viz. symmetric inputs). More elaborate models are needed to reproduce multi-stable dynamics under more general conditions (Moreno-Bote et al., 2007; Wilson, 2007; Gigante et al., 2009; Seely and Chow, 2011). Here, we show that experimental observations by individual observers in particular displays tightly constrain the dynamical balance of stabilizing and destabilizing factors in multi-stable perception. Because perceptual dynamics is notoriously diverse across observers and displays (Fox and Herrmann, 1967; Borsellino et al., 1972; Walker, 1975), we expected to obtain widely disparate results. Astonishingly, we found that almost all observers operated in a narrow dynamical regime (i.e., with a particular balance of stabilizing and destabilizing factors). In addition, this “operating regime” turned out to be functionally optimal in that it balances perceptual stability and sensitivity. Our observations imply that the temporal dynamics of visual inference is functionally optimized.

Materials and Methods

Observers

Fifteen observers (nine female, six male, including author Alexander Pastukhov) with normal or corrected-to-normal vision participated in three experiments [kinetic-depth (KD), binocular rivalry (BR) and Necker cube (NC)]. Because some observers performed multiple experiments, we obtained 24 data sets in total. The data sets from KD and BR displays were used previously to introduce the “cumulative history” measure (Pastukhov and Braun, 2011). Apart from Alexander Pastukhov, all observers were naïve to the purpose of the experiment and were paid to participate. Procedures were approved by the medical ethics board of the Otto-von-Guericke Universität, Magdeburg: “Ethikkomission der Otto-von-Guericke-Universität an der Medizinischen Fakultät.”

Apparatus

Stimuli were generated online and displayed on a 19” CRT screen (Vision Master Pro 454, Iiyama, Nagano, Japan), with a spatial resolution of 1600 × 1200 pixels and a refresh rate of 100 Hz. The viewing distance was 95 cm, so that each pixel subtended approximately 0.011°. Background luminance was 26 cd/m2. Anaglyph glasses (red/cyan) were used for the dichoptic presentation.

Multi-Stable Displays

The KD effect stimulus (Figure 1A) consisted of an orthographic projection of 300 dots distributed on a sphere surface (radius 3°). Each dot was a circular patch with a Gaussian luminance profile (σ = 0.057°) and a maximal luminance of 63 cd/m2. The sphere was centered at fixation and rotated around the vertical axis with a period of 4 s. As front and rear surface are not distinguished, the orthographic projection was perfectly ambiguous and consistent with either a clockwise or a counter-clockwise rotation around the axis. Observers perceive a three-dimensional sphere, which reverses its direction of rotation from time to time.

Figure 1. Experimental displays and statistical measures of multi-stable dynamics. (A) Kinetic depth (KD) display—viewing planar motion, observers perceive a volumetric rotation in either of two directions. (B) Binocular rivalry (BR) display—viewing different patterns with each eye (through red-green glasses), observers typically perceive either pattern. (C) Necker Cube (NC) display—viewing a line drawing, observers perceive one of two solid cubes. (D) Spontaneous perceptual dynamics varies widely between observers. Four statistical measures (mean and standard errors)—dominance duration Tdom, coefficient of variation cV of dominance duration, coefficient of correlation cH with dominance history, time-constant τH of dominance history (green: 8 observers KD; red: 11 observers BR; blue: 5 observers NC). Different symbols are used for the three exceptional observers jn, lf, and np (pale symbols, see text).

The BR stimulus (Figure 1B) consisted of two gratings presented dichoptically at fixation (radius, 0.9°; spatial frequency 2 cycles/degree). One grating was tilted leftward by 45° and the other rightward by 45°. The right-eye grating (green, visible only through the green filter) was kept at 50% contrast, while the contrast of the left-eye grating (red, visible only through the red filter) was adjusted for each subject to balance perceptual strengths. BR gives rise to several alternative perceptual states: two uniform percepts of either the left- or right-eye grating as well as different kinds of transitional percepts. Transitional percepts may be “fused” (i.e., both gratings are perceived) and/or “fragmented” (i.e., parts of both gratings are perceived in different image regions).

The NC stimulus (Figure 1C) consisted of a line drawing of a 3D cube (size 3°). Observers perceive a 3D cube, which reverses its depth from time to time.

Experimental Procedure

Observers viewed the display continuously and reported the presence and identity of a clear and uniform percept. Observers pressed either the (←) key [for left rotation, left-eye (red) grating, up-and-left looking cube], or the (→) key [for right rotation, right-eye (green) grating or down-and-right looking cube], or (↓) key (for mixed or patchy percepts). Each presentation lasted for 5 min, separated by a compulsory break of (at least) 1 min. Consistent with previous reports (Lehky, 1995; Mamassian and Goutcher, 2005) reversal rates slowed during the initial part of the block, so that only the last 4 min (minus the final, incomplete dominance period) of each presentation were analyzed. Total observation time was 60 min (12 blocks) per observer for KD, 90 min (18 blocks) per observer for BR stimulus and 50 min (10 blocks) per observer for NC. Average number of clear percepts per block was 36 for KD, 110 for BR, and 45 for NC.

Observables

The perceptual dynamics was characterized in terms of four statistical measures (see Figure 1D and Table 1), each of which varied widely between observers and displays. In addition, the distribution of dominance times was established in the form of a histogram.

Table 1. Observables.

Dominance distribution

From a sequence of dominance periods Ti (i = 1,…, N), we computed the mean dominance time Tdom and the coefficient of variation Cv as

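$$T_{dom} = \frac{1}{N}\sum_{i=1}^{N} T_i \tag{1}$$

$$C_v = \frac{1}{T_{dom}}\sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(T_i - T_{dom}\right)^2} \tag{2}$$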

As is typical for multi-stable percepts (Fox and Herrmann, 1967; Borsellino et al., 1972; Walker, 1975), average dominance periods varied greatly between observers and stimuli (Tdom in Table 1). In addition, dominance periods were highly variable (Cv in Table 1). However, the two alternative percepts dominated for comparable amounts of time (see Table 1). Patchy appearances of the BR display lasted for 1.05 ± 0.42 s.
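As a minimal sketch of Equations 1 and 2, the following Python function computes the mean dominance time and the coefficient of variation from a list of dominance periods. The function name `dominance_stats` and the example values are illustrative assumptions, not part of the original analysis code.

```python
import math

def dominance_stats(periods):
    """Return (T_dom, C_v) for a list of dominance periods in seconds."""
    n = len(periods)
    if n < 2:
        raise ValueError("need at least two dominance periods")
    t_dom = sum(periods) / n                                  # Eq. 1
    var = sum((t - t_dom) ** 2 for t in periods) / (n - 1)    # 1/(N-1) normalization
    c_v = math.sqrt(var) / t_dom                              # Eq. 2
    return t_dom, c_v

# Hypothetical example: three dominance periods of 2, 4, and 6 s.
t_dom, c_v = dominance_stats([2.0, 4.0, 6.0])
```

For these made-up periods the mean is 4 s and the standard deviation is 2 s, so the coefficient of variation is 0.5.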

To characterize the shape of the observed distributions of dominance times (either from human observers or from model simulations), we fitted each empirical distribution with a Gamma distribution with free parameters α (shape) and λ (rate)

$$G(t) = \frac{1}{\Gamma(\alpha)}\, t^{\alpha-1} \lambda^{\alpha} e^{-\lambda t} \tag{3}$$

an exponential distribution with free parameter λ (rate)

$$E(t) = \lambda e^{-\lambda t} \tag{4}$$

and a Gaussian distribution with free parameters μ (mean) and σ (variance)

$$N(t) = \frac{1}{\sqrt{2\pi\sigma}} \exp\left(-\frac{(t-\mu)^2}{2\sigma}\right) \tag{5}$$

Goodness of fit was assessed by means of KS tests. Human dominance distributions were fitted well by Gamma distributions (shape parameter α = 3.7 ± 0.7), but not by either exponential or normal distributions (Table 1), as expected from previous work (Levelt, 1967; Blake et al., 1971; Walker, 1975; Murata et al., 2003).
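The link between the Gamma shape parameter and the coefficient of variation can be made explicit. A Gamma distribution with shape α and rate λ has mean α/λ and variance α/λ², so that Cv = 1/√α. The sketch below uses this relation to obtain moment-based estimates of α and λ. This is an illustrative shortcut and an assumption on our part; the fits reported here were free-parameter fits assessed by KS tests, and the function name `gamma_from_moments` is hypothetical.

```python
def gamma_from_moments(t_dom, c_v):
    """Moment-based estimates of Gamma shape (alpha) and rate (lam).

    Uses mean = alpha / lam and C_v = 1 / sqrt(alpha). This is not the
    fitting procedure of the study, only a quick consistency check.
    """
    if t_dom <= 0 or c_v <= 0:
        raise ValueError("t_dom and c_v must be positive")
    alpha = 1.0 / c_v ** 2
    lam = alpha / t_dom
    return alpha, lam

# Hypothetical example: mean dominance time 4 s, C_v = 0.5.
alpha, lam = gamma_from_moments(4.0, 0.5)
```

Under this relation, a shape parameter of α = 3.7 corresponds to a coefficient of variation of about 0.52.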

History-dependence

It is well known that successive dominance periods of the same percept tend to exhibit a marginally significant, negative correlation (van Ee, 2009; Kang and Blake, 2010), which is presumably due to neural adaptation. Recently, we have introduced a novel and more sensitive measure for this history-dependence, termed “cumulative history” (Pastukhov and Braun, 2011), which involves both a correlation coefficient, cH, and a characteristic time-constant, τH (Table 1).

The analysis of “cumulative history” in reversal sequences is described in detail by Pastukhov and Braun (2011). Briefly, the observed record of dominance reports Sx(t) is convolved with a leaky integrator (Tuckwell, 2006) to compute hypothetical states Hx(t) of selective neural adaptation of percept x:

$$\tau_H \frac{dH_x}{dt} = -H_x(t) + S_x(t), \qquad H_x(t) = \frac{1}{\tau_H}\int_0^t S_x(t')\,\exp\left(-\frac{t-t'}{\tau_H}\right) dt' \tag{6}$$

where x denotes a uniform percept, τH is a time-constant, and Hx(0) = 0. Sx(t) takes values of 1 for dominance, 0.5 for patchy dominance (BR only), and 0 for non-dominance. The cumulative history Hx(t) reflects both how long and how recently a given percept has dominated in the past. In the absence of “patchy” appearances, the cumulative histories of two competing percepts x and y sum to unity (Hx + Hy = 1).

For suitable values of τH, the cumulative history H(t) at a reversal time t correlates significantly with the subsequent dominance period Ti. Specifically, if ti marks the beginning of dominance period Tix, we computed linear correlations between Hx(ti) and ln(Tix) for all four possible combinations of history and percept (Hx × Tx, Hx × Ty, Hy × Ty, and Hy × Tx). The average absolute correlation was obtained for values of τH ranging from 0.01 to 60 s, in order to determine the maximal correlation coefficient cH and its associated value of τH (Figure 2A).
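The procedure described above can be sketched in a few lines of Python. Because Sx(t) is constant within each dominance period, Equation 6 can be advanced exactly from one reversal to the next. The function names, the percept labels "x", "y", and "patchy", and the example data are assumptions made for illustration; the original analysis is described in Pastukhov and Braun (2011).

```python
import math

def cumulative_history(durations, signal, tau_h):
    """H_x at the onset of each period, with H_x(0) = 0 (Eq. 6).

    durations[i]: length of period i in seconds.
    signal[i]: S_x during period i (1 dominant, 0.5 patchy, 0 otherwise).
    """
    h, out = 0.0, []
    for dur, s in zip(durations, signal):
        out.append(h)
        # Exact solution of tau_H dH/dt = -H + S for constant S.
        h = s + (h - s) * math.exp(-dur / tau_h)
    return out

def pearson(xs, ys):
    """Linear correlation; returns 0.0 when it is undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def best_history_correlation(durations, percepts, taus):
    """Scan tau_H and return (c_H, tau_H) with the largest mean |r|."""
    log_t = [math.log(t) for t in durations]
    best_c, best_tau = 0.0, None
    for tau in taus:
        corrs = []
        for hist in ("x", "y"):
            sig = [1.0 if p == hist else 0.5 if p == "patchy" else 0.0
                   for p in percepts]
            h = cumulative_history(durations, sig, tau)
            for dom in ("x", "y"):
                idx = [i for i, p in enumerate(percepts) if p == dom]
                corrs.append(abs(pearson([h[i] for i in idx],
                                         [log_t[i] for i in idx])))
        mean_corr = sum(corrs) / len(corrs)
        if best_tau is None or mean_corr > best_c:
            best_c, best_tau = mean_corr, tau
    return best_c, best_tau

# Hypothetical reversal sequence alternating between percepts x and y.
durs = [2.0, 3.0, 1.5, 4.0, 2.5, 3.5]
labels = ["x", "y", "x", "y", "x", "y"]
c_h, tau_h = best_history_correlation(durs, labels, [0.5, 1.0, 2.0])
```

The four correlations averaged inside the loop correspond to the combinations Hx × Tx, Hx × Ty, Hy × Ty, and Hy × Tx.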

Figure 2. Analysis of cumulative history in terms of cH and τH. As described in “Materials and Methods,” correlations between cumulative history values H(ti) at reversal times ti and subsequent dominance periods Ti were computed for different values of τH, in order to determine the maximal value of cH and its associated value of τH. (A) Correlation results for all displays and observers, cH as a function of τH, where τH is normalized to the average dominance period Tdom of each observer (γH = τH/Tdom). All data sets exhibit a significant maximum, which quantifies the subtle but significant history-dependence of dominance periods in terms of cH and τH. (B) Analysis of shuffled reversal sequences: all dominance periods were drawn randomly and with replacement from the observed distribution of dominance periods. No significant correlations (indications of history-dependence) remain after shuffling. Panel (A) is modified from Figure 3 of Pastukhov and Braun (2011).

To verify that the values of cH and τH represented a true history-dependence (and not just the spectral characteristics of the data), we repeated the analysis with shuffled reversal sequences (dominance times drawn randomly with replacement from the observed distribution). No significant correlations cH were observed in the shuffled data sets (Figure 2B).
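The shuffling control amounts to resampling dominance periods with replacement, which preserves the distribution of dominance times but removes any sequential structure. A minimal sketch follows; the function name and the fixed seed are assumptions for illustration.

```python
import random

def shuffled_sequence(durations, rng=None):
    """Draw len(durations) periods with replacement from the observed ones."""
    rng = rng or random.Random()
    return [rng.choice(durations) for _ in durations]

# Hypothetical example with a fixed seed for reproducibility.
observed = [2.0, 3.0, 1.5, 4.0, 2.5, 3.5]
surrogate = shuffled_sequence(observed, random.Random(1))
```

The surrogate sequence can then be passed through the same history analysis as the observed sequence.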

Computational Modeling

To generate a wide variety of dynamical regimes, we simplified the rate model of Laing and Chow (see Laing and Chow, 2002), which has been analyzed and extended by several other groups (Moreno-Bote et al., 2007; Shpiro et al., 2007; Curtu et al., 2008; Shpiro et al., 2009). Two neural populations represent competing percepts. Each population excites itself and inhibits the other population. In addition, each population is subject to adaptation in the form of a threshold elevation and to stochastic effects in the form of additive noise:

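$$\tau_r \dot{r}_{1,2} = -r_{1,2} + F\left(\alpha r_{1,2} - \beta r_{2,1} - \phi_a a_{1,2} + I_{1,2} + n_{1,2}\right) \tag{7}$$

$$\tau_a \dot{a}_{1,2} = -a_{1,2} + r_{1,2} \tag{8}$$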

where r1, 2 is population activity, a1, 2 is adaptive state, I1, 2 = I0 is the strength of the (common) input to both populations, and n1, 2 is colored noise. The sigmoidal function F(x) is defined as

$$F(x) = \frac{1}{1 + \exp\left(-x/k\right)} \tag{9}$$

The parameters α and β control, respectively, the self-excitation and mutual inhibition of the two populations. In a sense, they represent the influence of prior experience. We set α = 0 because we were not interested in the regime of self-sustaining activity. The parameter ϕa sets the strength of neural adaptation and I1, 2 represents current retinal input. We typically set I1 = I2 = I0. The parameters τr and τa are the characteristic time-constants of activity and adaptive state, respectively. Finally, additive noise n1, 2 is provided by two independent Ornstein–Uhlenbeck processes with variance σn and time-constant τn:

$$\dot{n}_i = -\frac{n_i}{\tau_n} + \sqrt{\frac{2\sigma_n^2}{\tau_n}}\,\xi_i \tag{10}$$

from two independent sources of Gaussian noise ξ1, 2 with

$$\langle \xi_i(t)\,\xi_i(t+\epsilon) \rangle = \delta(\epsilon), \qquad \langle \xi_i \rangle = 0 \tag{11}$$

Thus, the signal-to-noise ratio of the retinal input is given by I1, 2/σn. To predict perceptual dominance Sx(t), we assume a reversal to percept x whenever the associated activity rx is 25% larger than the activity associated with the other percept.
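For readers who wish to experiment with the model, the following Python sketch integrates the rate, adaptation, and noise equations with a simple Euler–Maruyama scheme and applies the 25% dominance criterion. The integration scheme, the step size, the initial condition, and all function names are assumptions for illustration; the simulations reported here were run in Matlab and C++ and may differ in these details.

```python
import math
import random

def sigmoid(x, k=0.1):
    """Sigmoidal activation F(x); the argument is clamped to avoid overflow."""
    z = max(-60.0, min(60.0, x / k))
    return 1.0 / (1.0 + math.exp(-z))

def simulate_lc(beta, phi_a, i0, sigma_n, tau_a, t_max=20.0, dt=1e-3,
                tau_r=0.01, tau_n=0.1, alpha=0.0, seed=0):
    """Return a list of (percept, duration) for completed dominance periods."""
    rng = random.Random(seed)
    r = [0.0, 1.0]          # asymmetric start (assumed)
    a = [0.0, 1.0]
    n = [0.0, 0.0]
    noise_scale = math.sqrt(2.0 * sigma_n ** 2 / tau_n) * math.sqrt(dt)
    percept, t_onset, periods = None, 0.0, []
    for step in range(1, int(round(t_max / dt)) + 1):
        t = step * dt
        # Net input to each population: self-excitation, cross-inhibition,
        # adaptation (threshold elevation), stimulus, and noise.
        x = [alpha * r[i] - beta * r[1 - i] - phi_a * a[i] + i0 + n[i]
             for i in (0, 1)]
        dr = [(-r[i] + sigmoid(x[i])) * dt / tau_r for i in (0, 1)]
        da = [(-a[i] + r[i]) * dt / tau_a for i in (0, 1)]
        for i in (0, 1):
            # Ornstein-Uhlenbeck noise, Euler-Maruyama update.
            n[i] += -n[i] / tau_n * dt + noise_scale * rng.gauss(0.0, 1.0)
            r[i] += dr[i]
            a[i] += da[i]
        # Reversal to percept i when r_i exceeds the other activity by 25%.
        for i in (0, 1):
            if r[i] > 1.25 * r[1 - i] and percept != i:
                if percept is not None:
                    periods.append((percept, t - t_onset))
                percept, t_onset = i, t
    return periods  # the final, incomplete period is discarded

# Hypothetical parameter choice inside the explored ranges.
periods = simulate_lc(beta=1.75, phi_a=0.25, i0=0.5, sigma_n=0.15,
                      tau_a=1.0, t_max=5.0)
```

The returned durations can be fed directly into the statistics of Equations 1 and 2. Pure Python is slow for the long runs used in the study, so this sketch is only suited to short exploratory simulations.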

Model Parameters

The parameters τr, τn, and k remained fixed at τr = 10 ms, τn = 100 ms, and k = 0.1. The dynamical regime (stationary, oscillatory, or bistable) depends largely on three parameters, with I0 setting the general activity and overall stability of percepts, β the strength of mutual inhibition, and ϕa the strength of adaptation. This three-dimensional parameter space was explored in the limits of I0 ∈ [0, 2], β ∈ [0, 2], and ϕa ∈ [0, 1]. For every given triplet of I0, β, and ϕa values, we additionally simulated all combinations of τa ∈ [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 3.0, 4.0, 5.0, 6.5, 8.0] s and σn ∈ [0.01, 0.03, 0.05, …, 0.35]. The latter two parameters influence Tdom and Cv, but are inconsequential for the dynamical regime.

For convenience, all model parameters and associated value ranges are listed here: α = 0, β ∈ [0, 2], ϕa ∈ [0, 1], I1, 2 = I0 ∈ [0, 2], σn ∈ [0.01, 0.35], τa ∈ [1, 8] s, τr = 10 ms, τn = 100 ms, k = 0.1.

Simulations

To generate multi-stable dynamics and to predict psychophysical observables, three simulations of 500 s each were performed for every combination of model parameters. If the value of any predicted observable varied too much (Cv > 0.5), five simulations of 3000 s were performed. The values of predicted observables were then compared with the empirical values of Tdom, Cv, τH, and cH for each observer and display. If all four predictions fell within 25% of the empirical values, the corresponding combination of model parameters I0, β, and ϕa was marked as a “match.” Typically, a match was obtained for σn ≈ 0.15.
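The matching criterion is a simple relative tolerance on all four observables. A minimal sketch, in which the dictionary keys and the example values are illustrative assumptions:

```python
def is_match(predicted, empirical, tol=0.25):
    """True if every predicted observable lies within tol of the empirical one."""
    keys = ("T_dom", "C_v", "c_H", "tau_H")
    return all(abs(predicted[k] - empirical[k]) <= tol * abs(empirical[k])
               for k in keys)

# Hypothetical observer and two hypothetical model predictions.
observer = {"T_dom": 4.0, "C_v": 0.5, "c_H": 0.3, "tau_H": 2.0}
close = {"T_dom": 4.5, "C_v": 0.55, "c_H": 0.25, "tau_H": 2.2}
far = {"T_dom": 8.0, "C_v": 0.55, "c_H": 0.25, "tau_H": 2.2}
```

Here `is_match(close, observer)` holds, whereas `far` fails because its mean dominance time deviates by 100%.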

Frequency Resonance Simulations

To investigate frequency resonance, the two inputs were modulated in anti-phase with different periods Ts

$$I_{1,2} = I_0 \pm \Delta I \cos\left(\frac{2\pi t}{T_s}\right) \tag{12}$$

and the distribution of dominance periods Pres(T) was determined for different values of Ts (ΔI = 0.2 I0). As shown in Figure 12A, this distribution exhibits resonance peaks at odd multiples of the half-period of modulation Ts/2. The most pronounced resonance typically occurs for HP = Ts/2 = Tdom.

To compare frequency resonance at different points in the three-dimensional parameter space I0 ∈ [0, 2], β ∈ [0, 2], and ϕa ∈ [0, 1], two simulations of 4000 s were performed at each point with medium noise σn = 0.15 and τa = 1 s. One simulation established the unperturbed distribution of dominance periods Pref(T) and the mean dominance time 〈Tdom〉. In the other simulation, inputs I1, 2 were modulated in anti-phase at the best resonance frequency Ts = 2〈Tdom〉 and the distorted distribution of dominance periods Pres(T) was established.

The resonance coefficient P1 was then computed as

$$P_1 = \left[\int_{HP/2}^{3HP/2} P_{res}(T)\,dT\right]\left[\int_{HP/2}^{3HP/2} P_{ref}(T)\,dT\right]^{-1} \tag{13}$$

where HP = Ts/2.
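When the distributions are represented by samples of dominance periods, each integral over the window from HP/2 to 3HP/2 reduces to the fraction of periods falling inside that window. The sketch below takes P1 to be the ratio of these two fractions, which is our reading of Equation 13 and should be treated as an assumption; function names and example data are hypothetical.

```python
def window_fraction(periods, lo, hi):
    """Fraction of dominance periods T with lo <= T <= hi."""
    return sum(1 for t in periods if lo <= t <= hi) / len(periods)

def resonance_coefficient(res_periods, ref_periods, t_s):
    """Ratio of probability mass near HP = t_s / 2, modulated vs. reference."""
    hp = t_s / 2.0
    ref = window_fraction(ref_periods, hp / 2.0, 1.5 * hp)
    if ref == 0.0:
        raise ValueError("no reference periods in the resonance window")
    return window_fraction(res_periods, hp / 2.0, 1.5 * hp) / ref

# Hypothetical samples: modulation with T_s = 4 s concentrates periods near 2 s.
p1 = resonance_coefficient([2.0, 2.0, 2.0, 2.0], [1.0, 2.0, 3.0, 4.0], 4.0)
```

With these made-up samples, all modulated periods but only three of four reference periods fall within 1 to 3 s, so the coefficient is 4/3.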

Finally, to localize the bifurcation surfaces, simulations of 600 s were performed throughout the three-dimensional parameter space in the absence of noise (σn = 0, τa = 1 s). Starting from an asymmetric initial condition (r1,2 = a1,2 = [0, 1]), we determined whether activities migrated to identical steady-state values r1 = r2 = a (stationary regime), periodically reversed in rank order to exhibit values with r1 < r2 (oscillatory regime), or migrated to steady-state values with the same rank order r1 > r2 (bistable regime).

Simulation Equipment

Simulations were performed on a Linux cluster (Suse Linux Enterprise Server 10, Matlab R2007a, C++ compiler gcc 20070115) with five nodes (each with four processors Intel(R) Xeon(R) CPU E5430 @ 2.66 GHz and 8 GB RAM).

Results

We studied three canonical multi-stable displays (Figures 1A–C and Video S1): KD in a two-dimensional projection of a rotating cloud of dots (Wallach and O'Connell, 1953), BR between two gratings of different color and orientation (Wheatstone, 1838; Meng and Tong, 2004), and the NC (Necker, 1832). Observers viewed each display continuously for 5 min and reported its appearance either as rotating in depth “front left” or “front right” (KD), or as “uniformly red,” “uniformly green,” or “patchy” (BR), or as the marked corner pointing to “front” or “back” (NC display).

Dominance Distribution and History-Dependence

For each observer and display, we characterized perceptual dynamics in terms of several statistical measures (Figure 1D and Table 1). The distribution of dominance times was binned into a histogram and summarized in terms of mean dominance duration, Tdom, and coefficients of variation, Cv. Both dominance durations (1–22 s) and coefficients of variation (0.2–1.1) varied widely between observers and displays, as is typical for multi-stable percepts (Fox and Herrmann, 1967; Borsellino et al., 1972; Walker, 1975). Also as expected (Levelt, 1967; Blake et al., 1971; Walker, 1975; Murata et al., 2003), the distributions of dominance times resembled Gamma functions with a comparatively narrow range of shape parameters α (3.7 ± 0.6). Specifically, the empirical distributions were consistently fit better by a Gamma distribution (KS-test p = 0.7 ± 0.06), than by either an exponential distribution (p = 0.03 ± 0.02) or a Gaussian distribution (p = 0.09 ± 0.03).

In addition, we captured the subtle history-dependence of dominance times in terms of a correlation coefficient, cH, and a characteristic time-constant, τH (Figures 1D, 2). Due to the destabilizing effect of neural adaptation, successive periods dominated by the same appearance often exhibit a marginally significant, negative correlation (van Ee, 2009; Kang and Blake, 2010; Pastukhov and Braun, 2011). Recently, we have introduced a more sensitive, integral measure, dubbed “cumulative history,” of how long and how recently a given percept has dominated in the past (Hudak et al., 2011; Pastukhov and Braun, 2011). This measure reveals that individual dominance periods are consistently and significantly influenced by prior perceptual history (see “Materials and Methods” and Figure 2). For different observers and displays, the values of cH ranged from 0.1 to 0.4 and the values of τH from 0.6 to 10 s, quantifying the history-dependence in each case (Table 1). Our use of this “cumulative history” measure constitutes an important difference to earlier work (Shpiro et al., 2009).

Dynamical Regimes of LC-Model

Next, we compared our perceptual observations to a class of generative models for multi-stable dynamics. We chose the model formulated by Laing and Chow (2002) and investigated by several other groups (Moldakarimov et al., 2005; Moreno-Bote et al., 2007; Noest et al., 2007; Shpiro et al., 2007; Curtu et al., 2008; Shpiro et al., 2009), which strikes a dynamical balance between competition β, adaptation ϕa, and input strength I0 (Figure 3). Depending on this balance, the “LC-model” is able to generate sequences of perceptual reversals with a wide range of dominance distributions and history-dependencies. Note that all models incorporating adaptation, such as (Laing and Chow, 2002; Moldakarimov et al., 2005; Moreno-Bote et al., 2007; Noest et al., 2007; Shpiro et al., 2007; Curtu et al., 2008; Shpiro et al., 2009), necessarily predict a degree of history-dependence.

Figure 3. Bifurcation analysis of a class of generative models. (A) Generative models (schematic) for multi-stable dynamics with two neural populations (after Laing and Chow, 2002). Population activities r1,2, strength of cross-inhibition β, visual input I1,2 = I0, strength of neural adaptation ϕa, time-constant τa of neural adaptation, independent neural noise ξn. Dynamical regimes depend largely on only three parameters: β, ϕa, and I0. (B) Bistable regime (red volume and red lines on bifurcation diagrams E–G), see also Figure 4A. Without neural noise, activities r1,2 approach one of two steady-states with disparate activity levels (one high, one low). With noise, transitions between the two steady-states occur at irregular intervals. (C) Oscillatory regime (blue volume and blue lines on bifurcation diagrams E–G), see also Figure 4B. Without noise, activities r1,2 oscillate in counter-phase between low and high levels. Neural noise renders the alternation more irregular. (D) Stationary regime (green volume and green lines on bifurcation diagrams E–G). Activities r1,2 approach a single steady-state, with or without noise. (E–G) Bifurcation analysis of parameters ϕa, I0, and β. (E) Dependence on ϕa, revealing bistable, oscillatory, and stationary regimes (β = 1.75, I0 = 0.5). Hopf bifurcations are marked ϕhb and ϕHB. (F) Dependence on I0, showing a central bistable regime flanked by oscillatory and stationary regimes on either side (β = 1.75, ϕa = 0.25). (G) Dependence on β, showing bistable, oscillatory, and stationary regimes (ϕa = 0.25, I0 = 0.5).

Whereas the LC-model generates a continuum of possible dynamics, one may technically distinguish two regimes: a bistable or fluctuation-driven regime in which adaptation ϕa is weak [ϕa < ϕa^hb(β, I0)] and dominance periods are terminated by noise (Figure 3B), and an oscillatory or limit-cycle regime in which adaptation ϕa is strong enough [ϕa > ϕa^hb(β, I0)] to terminate each dominance period on its own (Figure 3C). The stationary regime of the model does not generate reversals and is not relevant here (Figure 3D).

Both the bistable and the oscillatory regimes of this model generate multi-stable dynamics, but with important differences in detail (Figure 4). A typical bistable dynamics is dominated by noise, resulting in irregular trajectories through state space, aperiodic dominance reversals, and an approximately exponential distribution of dominance times (Figure 4A). In marked contrast, a typical oscillatory dynamics is dominated by adaptation, with state-space trajectories describing a stereotypical limit-cycle, periodic dominance reversals, and an approximately Gaussian distribution of dominance times (Figure 4B).

Figure 4. Bistable, oscillatory, and intermediate dynamics. (A) Bistable dynamics obtained deeply within the bistable regime (far left, cf. Figure 3B). Driven largely by noise, it is characterized by irregular trajectories in state space (middle left), aperiodic dominance reversals (middle right), and an approximately exponential distribution of dominance times (far right). (B) Oscillatory dynamics obtained deeply within the oscillatory regime (far left, cf. Figure 3C). Driven largely by adaptation, it is characterized by regular trajectories in state space (middle left), periodic dominance reversals (middle right), and an approximately Gaussian distribution of dominance times (far right). (C) The multi-stable dynamics of human observers falls between these two extremes: it exhibits irregular trajectories (middle left), aperiodic reversals (middle right), and a Gamma-like distribution of dominance times (far right). With suitable levels of noise, a large parameter volume (far left) can result in realistic (human-like) distributions of dominance times (see text for details).

The perceptual dynamics of human observers tends to fall between these two extremes. Typically, human dominance periods exhibit a Gamma distribution with shape factor α between 3 and 4 (Murata et al., 2003), a distribution shape that is intermediate between exponential and Gaussian distributions (Figure 4C). On this basis, it has been suggested that the operating regime of human multi-stable perception may lie near the boundary between bistable and oscillatory regimes (Shpiro et al., 2009).

Realistic Dominance Distribution

We will now show that the distribution shape of dominance periods does not usefully constrain the dynamical regime of multi-stable perception. In essence, this is because the LC-model is highly redundant in the sense that many combinations of parameters generate equally realistic (Gamma-like) distribution shapes. To establish this point, we carried out extensive simulations, independently varying competition β ∈ [0, 2], adaptation ϕa ∈ [0, 1], input strength I0 ∈ [0, 2], noise amplitude σn ∈ [0.01, 0.35], and adaptation time-scale τa ∈ [1 s, 8 s]. For each parameter combination (β, ϕa, I0, σn, τa), we generated reversal sequences and established the best-fitting Gamma, exponential, and Gaussian functions for the resulting distribution of dominance times.

The dominance distribution generated by a parameter combination (β, ϕa, I0, σn, τa) was classified as realistic or human-like if it was well fit by a Gamma distribution with shape parameter α ∈ [3.1, 4.3] (KS-test p > 0.7) and less well by either exponential or Gaussian distributions. The parameter volume in which the LC-model generated human-like distributions of dominance times is shown in Figure 4C (far left). Note that the illustration shows only three of the five parameters. Only some, not all, choices of the two hidden parameters σn, τa resulted in realistic distributions. The depicted volume encompassed approximately 57% of the possible volume and was not restricted to the boundary between bistable and oscillatory regimes.

Accordingly, the distribution shape of dominance periods, taken by itself, does not usefully constrain the dynamical regime of multi-stable perception, as has been claimed (Shpiro et al., 2009). The reason for this discrepancy is that we explored a larger range of hidden parameters σn, τa than Shpiro et al. (2009). Essentially, a realistic distribution shape can almost always be obtained if a suitable noise level σn and adaptation time-constant τa are chosen.

Realistic Dominance Distribution and History-Dependence

Fortunately, a far more informative set of constraints becomes available when both the dominance distribution and the history-dependence of human observers are taken into account. Comparing simulated and human perceptual dynamics, parameter combinations (β, ϕa, I0, σn, τa) were considered a “match” if their statistics (Tdom, Cv, cH, τH) fell within 25% of the statistics of a particular observer/display combination. In this case, we refrained from comparing distribution shapes explicitly, as this would have complicated the interpretation of the results, but would not have further constrained the parameter volumes.

Astonishingly, the parameter combinations that matched almost all observers/displays clustered in a consistent “operating regime” of approximately 15% of the possible volume (Figure 5B): 8/8 observers of the KD display were matched by 10%, 8/11 observers of the BR display by 13%, and 5/5 observers of the NC display by 7% of the possible parameter volume. The individual results for all observers are presented in Figures 6–8. In most cases, a comparatively small and well-defined parameter volume reproduced all four statistical measures (Tdom, Cv, cH, τH) (see Figure 5A for representative examples). On average, the matching volumes comprised 2.4 ± 1.1% (KD display), 4.5 ± 0.7% (BR display), and 2.9 ± 1.0% (NC display) of the possible parameter spaces (bistable and oscillatory regimes).

FIGURE 5
Figure 5. Operating regime of multi-stable perception. KD display (left), BR display (middle), and NC display (right). (A) Parameter volumes (green, red, blue) matching the perceptual dynamics of three representative human observers (lp, kt, and ia, respectively) in terms of both the distribution (Tdom, Cv) and the subtle history-dependence (cH, τH) of dominance times. The depicted volumes fill approximately 6% of the possible volume and are here compared to the union of observers (transparent gray volumes). (B) Union of the matching volumes (green, red, blue) from 8, 8, and 5 observers, respectively. The matching volumes lie entirely within the bistable regime (transparent gray volumes) and fill approximately 15% of the possible volume.

FIGURE 6
Figure 6. Parameter volumes matching the perceptual dynamics of individual observers for KD displays. For each parameter triplet I0, ϕa, and β, different combinations of noise level and adaptation time-constant were explored in the ranges σn ∈ [0.01, 0.35] and τa ∈ [1 s, 8 s]. A “match” was declared when the statistics of synthetic reversal sequences fell within 25% of the mean values of each of the four observables 〈Tdom〉, Cv, cH, and τH. The color coding indicates the value of τa at which each parameter triplet I0, ϕa, and β best matched observer dynamics. For each matching volume, three orthogonal projections on different planes are shown in gray. The green volume shown on the left of Figure 5B represents the union of the volumes illustrated here.

At this juncture, the reader may well wonder how these results depend on the 25% criterion used to define a “match” between simulated and human reversal statistics. In fact, the “envelope” of the matching volumes described above is largely independent of this criterion choice. If the parameter space (β, ϕa, I0, σn, τa) is sampled at sufficiently densely spaced points, any set of observed statistical measures (Tdom, Cv, cH, τH) can be reproduced with arbitrary precision. In other words, the density of parameter sampling determines the precision with which observed statistical measures can be reproduced. The 25% criterion was chosen to obtain cohesive “matching” volumes, given the sampling grid of our simulations. For this criterion value, an observed statistic was typically reproduced by several adjacent grid locations. When a stricter criterion was used, an observed statistic tended to be reproduced only by isolated grid locations, resulting in non-cohesive or “patchy” matching volumes. In sum, the criterion choice merely affected the internal cohesiveness, but not the “envelope,” of the parameter volumes reproducing human reversal statistics.

Why should the four statistical measures (Tdom, Cv, cH, τH) offer a more informative set of constraints than the shape of the dominance distribution alone? In the LC-model, distribution shape (Tdom, Cv, and higher moments) is determined by the relative strength of adaptation and noise. Accordingly, many parameter combinations produce realistic distribution shapes, provided a suitable level of noise is chosen in each case. History-dependence (cH, τH), on the other hand, is less sensitive to the level of noise and therefore more informative about the absolute strength of adaptation. Thus, distribution shape and history-dependence provide largely independent constraints. That this is indeed the case was evident from the disparate parameter volumes which reproduce different sets of constraints: whereas comparatively small volumes (3.3 ± 1.6% of the possible volume) reproduced both dominance distribution (Tdom, Cv) and history-dependence (cH, τH) of individual observers/displays, far larger volumes reproduced either one of these constraints (29 ± 15% for Tdom, Cv and 44 ± 7% for cH, τH).

A Consistent Human “Operating Regime”

Overall, the multi-stable dynamics of 21/24 data sets was matched by a consistent “operating regime,” lying entirely within the bistable domain of the model and comprising approximately 15% of the possible volume (Figure 5B). The results from individual observers are detailed in Figure 6 (KD displays), Figure 7 (BR displays), and Figure 8 (NC displays). Only three observers of the BR display (jn, lf, np) exhibited exceptional dynamics, in that their brief dominance times Tdom and strong history-dependence cH were matched not only in the bistable but also in the oscillatory regime of the LC-model (Figure 7).

FIGURE 7
Figure 7. Parameter volumes matching the perceptual dynamics of individual observers for BR displays (see Figure 6 for details). The color coding indicates the value of τa at which each parameter triplet I0, ϕa, and β best matched observer dynamics. For exceptional observers (jn, lf, and np) parameter volumes lie partially outside the stable and sensitive volume. For each matching volume, three orthogonal projections on different planes are shown in gray. The red volume shown in the middle of Figure 5B represents the union of the volumes illustrated here.

FIGURE 8
Figure 8. Parameter volumes matching the perceptual dynamics of individual observers for NC displays (see Figure 6 for details). The color coding indicates the value of τa at which each parameter triplet I0, ϕa, and β best matched observer dynamics. For each matching volume, three orthogonal projections on different planes are shown in gray. The blue volume shown on the right of Figure 5B represents the union of the volumes illustrated here.

We were astonished by this clustering, especially in view of the superficial diversity in the perceptual dynamics exhibited by different observers/displays (Figure 1D). To assess the likelihood of an accidental clustering, we shuffled the pairs of statistical measures (Tdom, Cv) and (cH, τH), drawing observables randomly from the value pairs produced by real observers and recombining them to form “virtual” observers. In general, the matching volumes of these “virtual” observers were far more widely scattered (51% of the possible volume) than those of “real” observers. To quantify this further, we computed the centers of all matching volumes (mean parameter vectors) and the norms of the distances between all volume pairs. Whereas the average pair-distance was comparable for real and for “virtual” observers (2.0 ± 1.2 and 3.4 ± 3.8, respectively, Figure 9A), the group-mean for real observers was much smaller than the group-mean for equal numbers of “virtual” observers (Figure 9B), demonstrating that real observers clustered tightly in a consistent “operating regime.” The likelihood of obtaining by chance the clustering exhibited by real observers was low (p < 0.02).
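The logic of this comparison can be sketched as follows. The Python sketch assumes that the centers of the matching volumes are already available as three-dimensional vectors, that distances are Euclidean, and that the chance probability is estimated as the fraction of randomly drawn virtual groups that cluster at least as tightly as the real group. The function names and the toy centers are hypothetical; obtaining actual centers for virtual observers requires the full model simulations and is not shown.

```python
import itertools
import math
import random


def mean_pair_distance(centers):
    """Mean Euclidean distance over all pairs of volume centers."""
    pairs = list(itertools.combinations(centers, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def clustering_p_value(real_centers, virtual_pool, n_sets=1000, seed=0):
    """Fraction of random virtual groups clustering at least as tightly as the real group."""
    rng = random.Random(seed)
    observed = mean_pair_distance(real_centers)
    size = len(real_centers)
    hits = 0
    for _ in range(n_sets):
        group = rng.sample(virtual_pool, size)  # group of equal size, without replacement
        if mean_pair_distance(group) <= observed:
            hits += 1
    return hits / n_sets


# Toy centers in (I0, phi_a, beta)-space; hypothetical, not the measured data.
rng = random.Random(42)
real = [(rng.gauss(1.0, 0.1), rng.gauss(0.3, 0.1), rng.gauss(1.5, 0.1)) for _ in range(21)]
virtual = [(rng.uniform(0, 2), rng.uniform(0, 1), rng.uniform(0, 2)) for _ in range(200)]

p = clustering_p_value(real, virtual)
print(p)
```

The default of 1000 random sets keeps the sketch fast; the distribution in Figure 9B was computed over 10,000 sets.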

FIGURE 9
Figure 9. Clustering of matching regions in (I0, ϕa, β)-space. (A) Distribution of center-to-center distances between the matching volumes of observer pairs (real and virtual). Vertical lines mark the distribution means. (B) Distribution of the mean of all center-to-center distances among groups of 21 virtual observers (computed over 10,000 randomly chosen sets). The vertical line (red) marks the value obtained for the 21 real observers/data sets. The likelihood that equal numbers of virtual observers cluster as tightly as real observers was <0.02.

Shape and Location of “Operating Regime”

To examine the “operating regime” of human observers in more detail, we carried out additional simulations in several two-dimensional subspaces, three of which are shown in Figure 10 (ϕa = 0.25, I0 = 0.5, and β = 1.75). These detailed simulations revealed that, depending on the assumed level of noise, human observers operate in different shell-like volumes of the bistable regime, each of which follows the bifurcation surface at some distance. As the assumed noise level increased from low (σn ∈ [0.01, 0.11]) to middle (σn ∈ [0.13, 0.19]) to high (σn ∈ [0.21, 0.35]), the distance to the bifurcation surface increased. Thus, the perceptual dynamics of most observers was matched by a shell-shaped volume at the margins of the bistable regime or, equivalently, near but not at the brink of the oscillatory regime (see also Figure 11).

FIGURE 10
Figure 10. Operating regimes of multi-stable perception for different levels of noise (planar subspaces). The left inset relates the selected subspaces to the three-dimensional volumes of Figure 5. Several regions matching human observer dynamics with different displays and under different noise assumptions are illustrated. Specifically, the union of the matching regions of individual observers is outlined in a different color for each display (KD, BR, NC, see inset). Also marked are the bifurcation surface (black contour) and the functional “sweet spot” for medium noise (dotted black outline, see Figure 12C). Matching regions occupy different shell-like volumes, depending on the assumed level of noise (low, medium, or high). Distance to the bifurcation increases with noise. (A) Planar subspace ϕa = 0.25. (B) Planar subspace I0 = 0.55. (C) Planar subspace β = 1.71.

FIGURE 11
Figure 11. Matching volumes depend on the assumed level of noise. Union of matching volumes for all data sets from KD displays (top row), BR displays (middle row), and NC displays (bottom row). Assuming low noise (σn ∈ [0.01, 0.11]) displaced matching volumes to the margins of the bistable regime (left column), whereas an assumption of high noise (σn ∈ [0.21, 0.35]) shifted matching volumes to the center of that regime (right column). Medium levels of noise (σn ∈ [0.13, 0.19]) produced the matching volumes shown in the middle column. The dependence of matching volumes on the assumed level of noise is also shown by the dashed contours in Figure 10.

Shape and Location of Functional “Sweet Spot”

Is there a functional reason as to why multi-stable perception should operate in this particular regime? On the one hand, deep inside the bistable regime (strong β and weak ϕa), perception is particularly stable (dominance times are particularly long). On the other hand, at the bifurcation boundary between the oscillatory and bistable regimes (β and ϕa proportional), perception is particularly sensitive to differential input (small imbalances between I1 and I2). Accordingly, any regime combining perceptual stability with perceptual sensitivity would constitute a functional “sweet spot.”

To locate this “sweet spot” in terms of the LC-model, we computed the parameter volume providing exceptional stability (dominance periods >1 s, Figure 12B) and intersected it with the volume providing exceptional sensitivity (Figure 12C). To quantify sensitivity, we established frequency resonance under the assumption of medium noise (σn = 0.15). Frequency resonance is a sensitive method for probing the “operating point” of a dynamical system and is well established for the multi-stable perception of human observers (Kim et al., 2006).

FIGURE 12
Figure 12. Functional “sweet spot” combining perceptual stability and sensitivity. (A) Frequency resonance driven by input modulation. Distribution of dominance times without modulation (far left) and for different modulations (red lines mark half-periods, from 0.25 to 2 Hz). A resonance peak is evident when the modulation half-period coincides with the peak of the unmodulated distribution. (B) Volume of maximal stability (orange, Tdom ≥ 1 s), compared to bistable regime (transparent gray). (C) Functional “sweet spot” combining maximal stability with maximal sensitivity to input fluctuations (cyan, frequency resonance measure P1 ≥ 1.2), compared to bistable regime (transparent gray). (D–F) Comparison of functional “sweet spot” (cyan) with regions matching perceptual dynamics of human observers for KD, BR, and NC displays (D–F, respectively).

Specifically, a periodic, anti-phase modulation of input strengths I1, 2 induces frequency resonance in the form of periodic reversals of dominance (Figure 12A). The input modulation moves the bifurcation boundary back and forth (with the movement range depending on modulation amplitude). Periodic reversals are triggered as soon as the boundary displacement reaches the “operating point” (i.e., the operative parameter combination) of the system under investigation. The system's sensitivity to input modulation may therefore be measured either in terms of modulation amplitude or, equivalently, in terms of the multiplicative increase of reversal probabilities around the resonance frequency (P1 measure, see “Materials and Methods”). The larger the P1-measure, the less modulation amplitude is needed to trigger a perceptual reversal.

The functional “sweet spot” of the LC-model, which combines maximal stability and sensitivity (Tdom > 1 s and P1 > 1.2), is illustrated in Figure 12C. It formed a shell-shaped volume which followed the bifurcation surface at a distance and was restricted to small values of adaptation. Remarkably, the volumes matching observer dynamics were largely coextensive with this “sweet spot” (Figures 12D–F). A more detailed comparison was possible in the planar subspaces of Figure 10, which juxtaposed the regions matching observer dynamics for low, medium and high noise (colored contours) and the functional “sweet spot” for medium noise (dotted contours). Note that it was the perceptual operating regime for medium noise (not for low or high noise) which best matched the functional “sweet spot” for medium noise.
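On a sampled parameter grid, the “sweet spot” amounts to the intersection of two thresholded volumes, and its agreement with a matching volume can be summarized as an overlap fraction. The Python sketch below assumes that Tdom and P1 have already been computed at each grid point and are stored in dictionaries keyed by (I0, ϕa, β); the function names, grid points, and values are hypothetical, and the overlap fraction is an illustrative summary rather than a measure reported here.

```python
STABILITY_MIN = 1.0    # Tdom threshold in seconds
SENSITIVITY_MIN = 1.2  # P1 threshold


def sweet_spot(tdom, p1):
    """Grid points that are both stable (Tdom) and sensitive (P1)."""
    return {
        point
        for point in tdom
        if tdom[point] > STABILITY_MIN and p1[point] > SENSITIVITY_MIN
    }


def overlap_fraction(matching, spot):
    """Share of a matching volume that lies inside the sweet spot."""
    return len(matching & spot) / len(matching) if matching else 0.0


# Hypothetical values on four grid points (I0, phi_a, beta).
tdom = {(1.0, 0.2, 1.5): 3.0, (1.0, 0.2, 1.0): 0.6, (1.0, 0.1, 2.0): 9.0, (0.5, 0.2, 1.5): 2.0}
p1 = {(1.0, 0.2, 1.5): 1.5, (1.0, 0.2, 1.0): 2.0, (1.0, 0.1, 2.0): 1.0, (0.5, 0.2, 1.5): 1.3}

spot = sweet_spot(tdom, p1)
matching = {(1.0, 0.2, 1.5), (0.5, 0.2, 1.5), (1.0, 0.1, 2.0)}
print(sorted(spot))
print(overlap_fraction(matching, spot))
```

In this toy grid, the point with the longest dominance times is excluded because its sensitivity is too low, and the most sensitive point is excluded because its dominance times are too short, which mirrors the trade-off described above.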

Discussion

We have compared the dynamics of multi-stable perception with a class of generative models in order to assess the effective contributions of competition, neural adaptation, and neural noise. Astonishingly, we find that highly heterogeneous measurements from different observers and displays consistently constrain these models to the same narrow operating regime (21 of 24 data sets). Moreover, this operating regime falls in a particularly interesting region from the point of view of perceptual performance. Specifically, it falls in a shell-shaped volume at some distance from the bifurcation boundary, which uniquely combines stability of perceptual outcome with sensitivity to input modulations. This constitutes compelling evidence that the temporal dynamics of perceptual inference is functionally optimized.

A Simplistic Hypothesis

We have tested the hypothesis that different multi-stable phenomena reflect a common mechanism, namely, tectonic shifts of neural activity arising spontaneously within an attractor neural network that may well be distributed across distant cortical areas (Braun and Mattia, 2010). Presumably, a multi-stable display stimulates recurrent neural networks with several distinct steady states of neural activity (“attractor states”), which embody the cumulative residue of prior visual experience. These steady states are not absolutely stable, but are continually destabilized by neural adaptation and by neural noise. The result is an irregular, saltatory dynamics in which stable episodes are punctuated by rapid transitions.

The essential part of this hypothesis is the existence of a balance between competition, neural adaptation, and neural noise. Its precise mathematical formulation [here, the Laing and Chow model (Laing and Chow, 2002)] is only of secondary importance. Accordingly, we would expect that quantitatively different formulations of the same stabilizing and destabilizing factors should lead to qualitatively similar results. Consistent with this expectation, Shpiro et al. (2009) have shown that the broad “operating regimes” defined by the dominance distribution generalize over different models. It remains to be seen whether the same is true for the narrower “operating regimes” reported here (defined by both dominance distribution and history-dependence of multi-stable perception).

The hypothesis advanced here is admittedly simplistic in that it neglects many important aspects of multi-stable perception, such as its dependence on input strength (Moreno-Bote et al., 2007; Wilson, 2007; Seely and Chow, 2011) or its persistence across gaps in stimulation (Leopold et al., 2002; Maier et al., 2003; Brascamp et al., 2008; Pastukhov and Braun, 2008). Moreover, in treating multi-stable perception as a stochastic dynamical system, it ignores volitional processes such as attention shifts or eye movements.

There are two ways to justify this omission. Firstly, there is compelling evidence that reversals in the appearance of multi-stable displays do occur spontaneously, requiring neither attention nor eye movements (Lee et al., 2007; Pastukhov and Braun, 2007), except perhaps in some special situations (Zhang et al., 2011). Secondly, it seems likely that attention shifts and eye movements are part and parcel of the spontaneous dynamics we are postulating here. Recent evidence that reversals engage attentional mechanisms in a feedforward manner (Knapen et al., 2011) is consistent with the latter possibility.

In the end, we feel that the astonishing success of this simplistic hypothesis speaks for itself, especially as it extends to multi-stable displays (NC) known to be particularly susceptible to voluntary control (Meng and Tong, 2004).

A Hidden Consistency

Our main finding is that the seemingly heterogeneous perceptual dynamics, which different observers exhibit with different multi-stable displays, conceals a hidden consistency. It has often been noted that the variability of dominance times is stereotypical, whereas mean dominance times are not (Murata et al., 2003; Brascamp et al., 2005; van Ee, 2005). On this basis, previous studies have concluded that human observers exhibit a bistable dynamics (Moreno-Bote et al., 2007), or that they operate in the vicinity (on either side) of the bifurcation separating bistable and oscillatory regimes (Shpiro et al., 2009). In contrast to these earlier studies, we also took into consideration the weak (but significant) dependence of dominance times on prior perceptual history (Pastukhov and Braun, 2011). These additional constraints revealed a consistent and narrow operating regime of human observers.

If multi-stable dynamics is so consistent, why do mean dominance times vary so widely between displays and observers? Our findings suggest at least a partial answer: when a dynamical system operates near a bifurcation, its evolution over time is not dominated by a single mechanism and parameter, but by a mixture of mechanisms and a combination of parameters. Indeed, for any given value of the time-constant τa of adaptation, small perturbations in the other parameters of the Laing and Chow model (Laing and Chow, 2002) generate considerable variance in the dominance time Tdom and, independently, in the time-constant τH of cumulative history. As a consequence, the pair-wise correlations between τa, Tdom and τH are quite poor (Pastukhov and Braun, 2011).

Near, Not at, the Brink

If our mechanistic hypothesis captures the essence of the situation, then visual perception operates in a marginally stable regime, near the brink of an oscillatory instability. According to the theory of dynamical systems, the Hopf bifurcation at the brink of an oscillatory instability constitutes a state of criticality (Camalet et al., 2000), in which signal processing is often found to be optimal in terms of sensitivity, dynamic range, or response latency. Several recent studies have shown that the dynamic range of the system response is enlarged (Kinouchi and Copelli, 2006), and the amount of information transferred increases (Beggs and Plenz, 2003; Plenz and Thiagarajan, 2007; Shew et al., 2009), at the point of criticality. Indeed, operating at or near criticality may be a general principle of brain function (Bak, 1996).

The operating regime we have identified lies at some distance from the bifurcation boundary: it falls near, but not directly at, the brink of the oscillatory instability and is restricted to moderate strengths of adaptation. The functional advantage of such a marginally stable regime—in terms of relative stability of perceptual outcome and high sensitivity to input modulations (Figure 10)—may be understood as follows: Both dominance and response times are short at the bifurcation, but grow longer as the system enters more deeply into the bistable regime. A compromise—relatively long dominance and short response times—is reached at some distance to the bifurcation. When the input changes from being balanced (I1 = I2) to being biased (I1 < I2), the bifurcation border moves toward the bistable region. Accordingly, a system previously situated near the border may now find itself at the border and hence able to respond with a rapid reversal. In short, being near, but not directly at, the bifurcation affords both stability when the input is constant and sensitivity when the input changes.

Stability vs. Sensitivity

If visual inference is based on attractor dynamics (Braun and Mattia, 2010; Rolls and Deco, 2010), a goal conflict between stability and sensitivity seems unavoidable. Presumably, a stable and compelling appearance of a visual scene recruits numerous associations at all levels of visual processing—edges, surfaces, objects, generic context, episodic context. In terms of attractor dynamics, reciprocal excitation between visual and memory activity would be expected to stabilize a particular pattern of activity (and, thus, a particular appearance). The downside to this stabilization would be reduced sensitivity to incremental changes in the visual input, for attractor dynamics would tend to counteract any change and to restore the activity pattern that conforms to the activated memories. Accordingly, if the system is to remain sensitive to incremental input changes, associative stabilization by memory traces must not go too far. A combination of neural noise and neural adaptation would seem to offer an appropriate strategy for balancing stability and sensitivity, as this would also ensure that alternative interpretations are exhaustively explored.

Exploitation-Exploration Dilemma

The present findings have important implications for theories of perceptual inference (Kersten et al., 2004). Given an exhaustive store of prior information, the outcome of Bayesian inference is deterministic. However, if the store of prior knowledge must be acquired by reinforcement learning (i.e., by trial and error), an inferential system faces the “exploitation-exploration dilemma” (Sutton and Barto, 1998). On the one hand, it must exploit what it knows already by following successful precedents from the past. On the other hand, if it is to expand its knowledge, it must explore alternative possibilities that may prove more successful in the future. The dilemma is that neither strategy can be pursued to the exclusion of the other. At the mechanistic level, such an inferential system must balance prior experience against current input. Favoring the former foregoes exploring novel inferences and compromises the sensitivity of inference (as input details are ignored). Favoring the latter foregoes the exploitation of prior knowledge and impairs the stability of inference (as input details are unduly amplified). Several authors have formulated similar thoughts in connection with perceptual inference (Hoyer and Hyvärinen, 2003; Hohwy et al., 2008; Sundareswara and Schrater, 2008; Moreno-Bote et al., 2010, 2011).

Exception or Rule?

Does marginal stability characterize only perfectly ambiguous, laboratory situations—such as the multi-stable displays investigated here—or does it apply also to real-world visual scenes? The answer hinges on whether the phenomenal appearance of real-world scenes is entirely stable, or whether it fluctuates in some way. Indeed, real-world objects evoke “contextual associations” such as, for example, episodic memories of prior personal experience, or generic knowledge about prototypical uses and locations (Bar, 2004, 2009b). The activation of such contextual associations is temporary and new associative possibilities are continuously being explored (Bar, 2009a). Contextual associations strongly color phenomenal appearance, presumably by activating perceptual representations in the manner of mental imagery (Moulton and Kosslyn, 2009). In certain impoverished visual displays—such as two-tone faces or Rorschach ink blots (Mooney et al., 1957)—this influence is particularly evident. Accordingly, we speculate that multi-stable phenomena form a continuum, ranging from perfectly ambiguous situations (such as the canonical multi-stable displays studied here), to partially ambiguous images with multiple readings of different plausibility (such as two-tone faces), to real-world images with a large number of subtly different associations.

Final Thoughts

We propose a functional hypothesis as to why visual perception is marginally stable in general, and marginally multi-stable in ambiguous situations. Specifically, we propose that vision operates in a dynamical regime that uniquely combines stability and sensitivity, thus optimizing performance. At the mechanistic level, we speculate that this balance may be struck by attractor dynamics encompassing both visual and memory representations.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

Alexander Pastukhov, Joachim Haenicke, and Jochen Braun: BMBF Bernstein Network, EU FP7-269459. Gustavo Deco: BFU2007-61710, Consolider Ingenio 2010, FP7 Brainsync, ITN Codde. Antoni Guillamon: MICINN/FEDER MTM2009-06973 and CUR-DIUE 2009SGR-859. Pedro E. García-Rodríguez: BFU2007-61710.

Supplementary Material

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/Computational_Neuroscience/10.3389/fncom.2013.00017/abstract

References

Alais, D., Cass, J., O'Shea, R. P., and Blake, R. (2010). Visual sensitivity underlying changes in visual consciousness. Curr. Biol. 20, 1362–1367.

Bach, M., and Poloschek, C. (2006). Visual neuroscience: optical illusions. Adv. Clin. Neurosci. Rehabil. 6, 20–21.

Bak, P. (1996). How Nature Works: The Science of Self-Organized Criticality. New York, NY: Copernicus Press.

Bar, M. (2004). Visual objects in context. Nat. Rev. Neurosci. 5, 617–629.

Bar, M. (2009a). Predictions: a universal principle in the operation of the human brain. Introduction. Philos. Trans. R. Soc. Lond. B Biol. Sci. 364, 1181–1182.

Bar, M. (2009b). The proactive brain: memory for predictions. Philos. Trans. R. Soc. Lond. B Biol. Sci. 364, 1235–1243.

Beggs, J. M., and Plenz, D. (2003). Neuronal avalanches in neocortical circuits. J. Neurosci. 23, 11167–11177.

Blake, R., Fox, R., and McIntyre, C. (1971). Stochastic properties of stabilized-image binocular rivalry alternations. J. Exp. Psychol. 88, 327–332.

Blake, R., and Logothetis, N. K. (2002). Visual competition. Nat. Rev. Neurosci. 3, 13–21.

Blake, R., Sobel, K. V., and Gilroy, L. A. (2003). Visual motion retards alternations between conflicting perceptual interpretations. Neuron 39, 869–878.

Blake, R., Westendorf, D., and Fox, R. (1990). Temporal perturbations of binocular rivalry. Percept. Psychophys. 48, 593–602.

Bonneh, Y., Sagi, D., and Karni, A. (2001). A transition between eye and object rivalry determined by stimulus coherence. Vis. Res. 41, 981–989.

Borsellino, A., De Marco, A., Allazetta, A., Rinesi, S., and Bartolini, B. (1972). Reversal time distribution in the perception of visual ambiguous stimuli. Kybernetik 10, 139–144.

Brascamp, J. W., Knapen, T. H. J., Kanai, R., Noest, A. J., van Ee, R., and van den Berg, A. V. (2008). Multi-timescale perceptual history resolves visual ambiguity. PLoS ONE 3:e1497. doi: 10.1371/journal.pone.0001497

Brascamp, J. W., van Ee, R., Noest, A. J., Jacobs, R. H., and van Den Berg, A. V. (2006). The time course of binocular rivalry reveals a fundamental role of noise. J. Vis. 6, 1244–1256.

Brascamp, J. W., van Ee, R., Pestman, W. R., and van Den Berg, A. V. (2005). Distributions of alternation rates in various forms of bistable perception. J. Vis. 5, 287–298.

Braun, J., and Mattia, M. (2010). Attractors and noise: twin drivers of decisions and multistability. Neuroimage 52, 740–751.

Camalet, S., Duke, T., Jülicher, F., and Prost, J. (2000). Auditory sensitivity provided by self-tuned critical oscillations of hair cells. Proc. Natl. Acad. Sci. U.S.A. 97, 3183–3188.

Campbell, F. W., and Howell, E. R. (1972). Monocular alternation: a method for the investigation of pattern vision. J. Physiol. 225, 19P–21P.

Curtu, R., Shpiro, A., Rubin, N., and Rinzel, J. (2008). Mechanisms for frequency control in neuronal competition models. SIAM J. Appl. Dyn. Sys. 7, 609–649.

Donner, T. H., Sagi, D., Bonneh, Y. S., and Heeger, D. J. (2008). Opposite neural signatures of motion-induced blindness in human dorsal and ventral visual cortex. J. Neurosci. 28, 10298–10310.

Ffytche, D. H. (2009). Visual hallucinations in eye disease. Curr. Opin. Neurol. 22, 28–35.

Fox, R., and Herrmann, J. (1967). Stochastic properties of binocular rivalry alternations. Percept. Psychophys. 2, 432–446.

Friston, K., Breakspear, M., and Deco, G. (2012). Perception and self-organized instability. Front. Comput. Neurosci. 6:44. doi: 10.3389/fncom.2012.00044

Gerardin, P., Kourtzi, Z., and Mamassian, P. (2010). Prior knowledge of illumination for 3D perception in the human brain. Proc. Natl. Acad. Sci. U.S.A. 107, 16309–16314.

Gigante, G., Mattia, M., Braun, J., and Del Giudice, P. (2009). Bistable perception modeled as competing stochastic integrations at two levels. PLoS Comput. Biol. 5:e1000430. doi: 10.1371/journal.pcbi.1000430

Gregory, R. (2009). Seeing Through Illusions. New York, NY: Oxford University Press.

Hesselmann, G., Kell, C. A., Eger, E., and Kleinschmidt, A. (2008). Spontaneous local variations in ongoing neural activity bias perceptual decisions. Proc. Natl. Acad. Sci. U.S.A. 105, 10984–10989.

Hohwy, J., Roepstorff, A., and Friston, K. (2008). Predictive coding explains binocular rivalry: an epistemological review. Cognition 108, 687–701.

Hollins, M. (1980). The effect of contrast on the completeness of binocular rivalry suppression. Percept. Psychophys. 27, 550–556.

Hoyer, P. O., and Hyvärinen, A. (2003). “Interpreting neural response variability as Monte Carlo sampling of the posterior,” in Advances in Neural Information Processing Systems, eds S. Becker, S. Thrun, and K. Obermayer (Cambridge, MA: MIT Press), 293–300.

Hudak, M., Gervan, P., Friedrich, B., Pastukhov, A., Braun, J., and Kovacs, I. (2011). Increased readiness for adaptation and faster alternation rates under binocular rivalry in children. Front. Hum. Neurosci. 5:128. doi: 10.3389/fnhum.2011.00128

Kang, M.-S., and Blake, R. (2010). What causes alternations in dominance during binocular rivalry? Atten. Percept. Psychophys. 72, 179–186.

Kersten, D., Mamassian, P., and Yuille, A. (2004). Object perception as Bayesian inference. Annu. Rev. Psychol. 55, 271–304.

Kim, Y.-J., Grabowecky, M., and Suzuki, S. (2006). Stochastic resonance in binocular rivalry. Vis. Res. 46, 392–406.

Kinouchi, O., and Copelli, M. (2006). Optimal dynamical range of excitable networks at criticality. Nat. Phys. 2, 348–351.

Knapen, T. H. J., Brascamp, J. W., Pearson, J., van Ee, R., and Blake, R. (2011). The role of frontal and parietal brain areas in bistable perception. J. Neurosci. 31, 10293–10301.

Kohler, W., and Wallach, H. (1944). Figural after-effects. An investigation of visual processes. Proc. Am. Philos. Soc. 88, 269–357.

Laing, C. R., and Chow, C. C. (2002). A spiking neuron model for binocular rivalry. J. Comput. Neurosci. 12, 39–53.

Lee, S.-H., Blake, R., and Heeger, D. J. (2007). Hierarchy of cortical responses underlying binocular rivalry. Nat. Neurosci. 10, 1048–1054.

Lehky, S. R. (1988). An astable multivibrator model of binocular rivalry. Perception 17, 215–228.

Lehky, S. R. (1995). Binocular rivalry is not chaotic. Proc. R. Soc. Lond. B Biol. Sci. 259, 71–76.

Leopold, D. A., and Logothetis, N. K. (1999). Multistable phenomena: changing views in perception. Trends Cogn. Sci. 3, 254–264.

Leopold, D. A., Wilke, M., Maier, A., and Logothetis, N. K. (2002). Stable perception of visually ambiguous patterns. Nat. Neurosci. 5, 605–609.

Levelt, W. J. (1967). Note on the distribution of dominance times in binocular rivalry. Br. J. Psychol. 58, 143–145.

Maier, A., Wilke, M., Aura, C., Zhu, C., Ye, F. Q., and Leopold, D. A. (2008). Divergence of fMRI and neural signals in V1 during perceptual suppression in the awake monkey. Nat. Neurosci. 11, 1193–1200.

Maier, A., Wilke, M., Logothetis, N. K., and Leopold, D. A. (2003). Perception of temporally interleaved ambiguous patterns. Curr. Biol. 13, 1076–1085.

Mamassian, P., and Goutcher, R. (2005). Temporal dynamics in bistable perception. J. Vis. 5, 361–375.

Meng, M., and Tong, F. (2004). Can attention selectively bias bistable perception? Differences between binocular rivalry and ambiguous figures. J. Vis. 4, 539–551.

Mitchell, J. F., Stoner, G. R., and Reynolds, J. H. (2004). Object-based attention determines dominance in binocular rivalry. Nature 429, 410–413.

Moldakarimov, S., Rollenhagen, J. E., Olson, C. R., and Chow, C. C. (2005). Competitive dynamics in cortical responses to visual stimuli. J. Neurophysiol. 94, 3388–3396.

Mooney, C. M. (1957). Age in the development of closure ability in children. Can. J. Psychol. 11, 219–226.

Moreno-Bote, R., Knill, D. C., and Pouget, A. (2011). Bayesian sampling in visual perception. Proc. Natl. Acad. Sci. U.S.A. 108, 12491–12496.

Moreno-Bote, R., Rinzel, J., and Rubin, N. (2007). Noise-induced alternations in an attractor network model of perceptual bistability. J. Neurophysiol. 98, 1125–1139.

Moreno-Bote, R., Shpiro, A., Rinzel, J., and Rubin, N. (2010). Alternation rate in perceptual bistability is maximal at and symmetric around equi-dominance. J. Vis. 10:1. doi: 10.1167/10.11.1

Moulton, S. T., and Kosslyn, S. M. (2009). Imagining predictions: mental imagery as mental emulation. Philos. Trans. R. Soc. Lond. B Biol. Sci. 364, 1273–1280.

Murata, T., Matsui, N., Miyauchi, S., Kakita, Y., and Yanagida, T. (2003). Discrete stochastic process underlying perceptual rivalry. Neuroreport 14, 1347–1352.

Nawrot, M., and Blake, R. (1989). Neural integration of information specifying structure from stereopsis and motion. Science 244, 716–718.

Necker, L. A. (1832). Observations on some remarkable phenomena seen in Switzerland; and an optical phenomenon which occurs on viewing of a crystal or geometrical solid. Philos. Mag. 1, 329–337.

Noest, A. J., van Ee, R., Nijs, M. M., and van Wezel, R. J. A. (2007). Percept-choice sequences driven by interrupted ambiguous stimuli: a low-level neural model. J. Vis. 7, 10.

Pastukhov, A., and Braun, J. (2007). Perceptual reversals need no prompting by attention. J. Vis. 7, 5.1–5.17.

Pastukhov, A., and Braun, J. (2008). A short-term memory of multi-stable perception. J. Vis. 8, 7.1–7.14.

Pastukhov, A., and Braun, J. (2011). Cumulative history quantifies the role of neural adaptation in multistable perception. J. Vis. 11:12. doi: 10.1167/11.10.12

Petersik, T. J. (2002). Buildup and decay of a three-dimensional rotational aftereffect obtained with a three-dimensional figure. Perception 31, 825–836.

Plenz, D., and Thiagarajan, T. C. (2007). The organizing principles of neuronal avalanches: cell assemblies in the cortex? Trends Neurosci. 30, 101–110.

Rolls, E. T., and Deco, G. (2010). The Noisy Brain: Stochastic Dynamics as a Principle of Brain Function. New York, NY: Oxford University Press.

Sadaghiani, S., Hesselmann, G., Friston, K. J., and Kleinschmidt, A. (2010). The relation of ongoing brain activity, evoked neural responses, and cognition. Front. Syst. Neurosci. 4:20. doi: 10.3389/fnsys.2010.00020

Seely, J., and Chow, C. C. (2011). Role of mutual inhibition in binocular rivalry. J. Neurophysiol. 106, 2136–2150.

Shew, W. L., Yang, H., Petermann, T., Roy, R., and Plenz, D. (2009). Neuronal avalanches imply maximum dynamic range in cortical networks at criticality. J. Neurosci. 29, 15595–15600.

Shpiro, A., Curtu, R., Rinzel, J., and Rubin, N. (2007). Dynamical characteristics common to neuronal competition models. J. Neurophysiol. 97, 462–473.

Shpiro, A., Moreno-Bote, R., Rubin, N., and Rinzel, J. (2009). Balance between noise and adaptation in competition models of perceptual bistability. J. Comput. Neurosci. 27, 37–54.

Sperling, G., and Dosher, B. A. (1994). “Depth from motion,” in Early Vision and Beyond, eds T. V. Papathomas, C. Chubb, A. Gorea, and E. Kowler (Cambridge, MA: MIT Press), 133–142.

Sterzer, P., Kleinschmidt, A., and Rees, G. (2009). The neural bases of multistable perception. Trends Cogn. Sci. 13, 310–318.

Sterzer, P., and Rees, G. (2008). A neural basis for percept stabilization in binocular rivalry. J. Cogn. Neurosci. 20, 389–399.

Sundareswara, R., and Schrater, P. R. (2008). Perceptual multistability predicted by search model for Bayesian decisions. J. Vis. 8, 12.1–12.19.

Sutton, R. S., and Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.

Tong, F., Meng, M., and Blake, R. (2006). Neural bases of binocular rivalry. Trends Cogn. Sci. 10, 502–511.

Tuckwell, H. C. (2006). Introduction to Theoretical Neurobiology: Volume 1, Linear Cable Theory and Dendritic Structure. Cambridge: Cambridge University Press.

van Dam, L. C. J., and van Ee, R. (2006). Retinal image shifts, but not eye movements per se, cause alternations in awareness during binocular rivalry. J. Vis. 6, 1172–1179.

van Ee, R. (2005). Dynamics of perceptual bi-stability for stereoscopic slant rivalry and a comparison with grating, house-face, and Necker cube rivalry. Vis. Res. 45, 29–40.

van Ee, R. (2009). Stochastic variations in sensory awareness are driven by noisy neuronal adaptation: evidence from serial correlations in perceptual bistability. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 26, 2612–2622.

von Helmholtz, H. (1866). Treatise on Physiological Optics. Vol 3. Birmingham, AL: The Optical Society of America.

Walker, P. (1975). Stochastic properties of binocular-rivalry alternations. Percept. Psychophys. 18, 467–473.

Wallach, H., and O'Connell, D. N. (1953). The kinetic depth effect. J. Exp. Psychol. 45, 205–217.

Weiss, Y., Simoncelli, E. P., and Adelson, E. H. (2002). Motion illusions as optimal percepts. Nat. Neurosci. 5, 598–604.

Wheatstone, C. (1838). Contributions to the physiology of vision. Part the first. On some remarkable, and hitherto unobserved, phenomena of binocular vision. Philos. Trans. R. Soc. Lond. 128, 371–394.

Wilson, H. R. (2007). Minimal physiological conditions for binocular rivalry and rivalry memory. Vis. Res. 47, 2741–2750.

Wolfe, J. M. (1984). Reversing ocular dominance and suppression in a single flash. Vis. Res. 24, 471–478.

Yang, Z., and Purves, D. (2003). A statistical explanation of visual space. Nat. Neurosci. 6, 632–640.

Zhang, P., Jamison, K., Engel, S., He, B., and He, S. (2011). Binocular rivalry requires visual attention. Neuron 71, 362–369.

Keywords: multi-stability, binocular rivalry, adaptation, model, exploitation-exploration dilemma

Citation: Pastukhov A, García-Rodríguez PE, Haenicke J, Guillamon A, Deco G and Braun J (2013) Multi-stable perception balances stability and sensitivity. Front. Comput. Neurosci. 7:17. doi: 10.3389/fncom.2013.00017

Received: 12 December 2012; Accepted: 04 March 2013;
Published online: 20 March 2013.

Edited by:

Klaus R. Pawelzik, University of Bremen, Germany

Reviewed by:

Udo Ernst, University of Bremen, Germany
Ruben Moreno-Bote, Foundation Sant Joan de Deu, Spain

Copyright © 2013 Pastukhov, García-Rodríguez, Haenicke, Guillamon, Deco and Braun. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.

*Correspondence: Alexander Pastukhov and Jochen Braun, Department of Cognitive Biology, Otto-von-Guericke Universität, Leipziger Straße 44/Haus 91, Magdeburg 39120, Germany. e-mail: pastukhov.alexander@gmail.com; jochen.braun@ovgu.de

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.