The Independent and Shared Mechanisms of Intrinsic Brain Dynamics: Insights From Bistable Perception

In bistable perception, constant input leads to alternating perception. The dynamics of the changing percept reflect the intrinsic dynamic properties of the “unconscious inferential” process in the brain. Under the same conditions, individuals differ in how fast they experience perceptual alternation. In this study, testing many forms of bistable perception in a large number of observers, we investigated the key question of whether a general, common mechanism or multiple independent mechanisms control the dynamics of the inferential brain. The bistable phenomena tested include binocular rivalry, vase-face, Necker cube, moving plaid, motion induced blindness, biological motion, spinning dancer, rotating cylinder, Lissajous-figure, rolling wheel, and translating diamond. Switching dynamics for each bistable percept were measured in 100 observers. Results show that the switching rates of subsets of bistable percepts are highly correlated. The clustering of dynamic properties of some bistable phenomena, without an overall general control of switching dynamics, implies that the brain’s inferential processes are both shared and independent – being faster in constructing 3D structure from motion does not mean being faster in integrating components into an object.


INTRODUCTION
Our visual brain constantly engages in the inferential process of constructing a meaningful and coherent interpretation of the visual world based on retinal images, a process mostly unconscious to the observer (Helmholtz, 1962). However, some people may be faster than others in making these "inferences." Is there a general and common mechanism, or are there multiple independent mechanisms, that control the dynamics of the inferential brain? Bistable perception, where viewing a constant and stable stimulus leads to dynamic alternation between two interpretations (Sterzer et al., 2009), provides an opportunity to test the common vs. independent nature of the brain's dynamics.
There are many forms of bistable perception. Well-known examples include binocular rivalry (competition between different images from the two eyes), alternating face or vase perception, and switching direction of motion with structure-from-motion displays (Leopold and Logothetis, 1999; Blake and Logothetis, 2002; Pearson and Brascamp, 2008). A key property of bistable perception is the spontaneous nature of the perceptual alternations, the dynamics of which presumably reflect the dynamics of our brain's inferential process. The many different forms of bistable perception share similar dynamic properties (Brascamp et al., 2005; Sheppard and Pettigrew, 2006). Perceptual alternation is typically stochastic, with dominance times following a gamma distribution (Lehky, 1995). In the case of binocular rivalry, the dominance durations are only minimally affected by voluntary control (Meng and Tong, 2004; Chong et al., 2005; Klink et al., 2008), yet attention is critical for its manifestation (Chong and Blake, 2006; Kohler et al., 2008; Dieter and Tadin, 2011; Zhang et al., 2011; Brascamp and Blake, 2012).
Though the alternation dynamics of different forms of bistable perception share common statistical properties, the specific switching rates can differ greatly between stimuli. More importantly, there are large individual differences in switching rates for bistable phenomena (Carter and Pettigrew, 2003; Patel et al., 2015). These individual differences present an opportunity to investigate the critical question of whether a general mechanism or multiple independent mechanisms support the inferential processes that underlie the dynamics of the inferential brain.
One scenario is that there is a common temporal mechanism, such as a master clock, that is responsible for the timing of all dynamic switching processes. One example is the idea that, in bistable perception, switching occurs between representations in the two hemispheres (Miller et al., 2000). However, there is also evidence that different cortical areas are involved in switching dynamics (Kanai et al., 2005), such as the interocular competition between monocular neurons in V1 for binocular rivalry (Blake and Logothetis, 2002); the activity of prefrontal cortex (Windmann et al., 2006), inferior frontal cortex (Sterzer and Kleinschmidt, 2007), and frontoparietal cortex (Lumer et al., 1998; Knapen et al., 2011; Weilnhammer et al., 2013) might also influence the rivalry rate. Morphological evidence further suggests that local gray-matter density in the parietal cortex might influence dominance time (Kanai et al., 2010, 2011). However, even if different forms of bistable perception transpire at different cortical sites, the neuronal dynamics underlying the switching could still have shared properties. An important alternative scenario is that different bistable phenomena are controlled by their respective cortical mechanisms with relatively independent dynamic properties. Indeed, many factors could potentially influence the switching rates, including the local level of neural noise, the dynamic properties of adaptation, etc. (van Ee, 2009; Pastukhov and Braun, 2011; Scocchia et al., 2014). Even relatively global factors, such as the mood of the observer at the time, could play a role (Sheppard and Pettigrew, 2006), although different processes may be differentially influenced by such global factors.
The following general factors might influence the dynamics of alternation: gender (Schouten et al., 2010), age (Beer et al., 1989), visual acuity, color vision, basic stereovision, handedness (Christman et al., 2009), simple reaction time (Schouten et al., 2013), and anxiety (Nagamine et al., 2007). If some of these factors are related to the alternation rate, they would help in understanding the basis of perceptual switching.
In the present study, we use an individual difference approach to investigate whether there is a common mechanism that either controls or influences the dynamics of the visual brain, by examining the correlations among the switching rates of different forms of bistable perception across individuals. In other words, if an observer experiences faster switching in binocular rivalry, will the same observer also experience faster switching in (some) other forms of bistable perception?

MATERIALS AND METHODS
Stimuli and Apparatus
A total of 11 bistable stimuli were used in the main experiment, including 10 types of monocular bistable stimuli, and binocular rivalry (Figure 1).
For binocular rivalry, horizontal red/black and vertical green/black gratings were dichoptically presented to observers through a stereoscope. The gratings were 0.775 cpd sinewave gratings extending 3.1° × 3.1°, centered at the fixation point.
The other stimuli were observed binocularly without the stereoscope, and all were similar in size (∼3°). The vase-face and motion induced blindness stimuli were also presented at the fixation point. The other 8 types of stimuli, however, were presented 2.2° below the fixation point, to minimize the potential for stimulus-induced or voluntary eye movements that might influence the switch dynamics. The fixation point was a 0.13° red dot.
Subjects were seated at a distance of 57 cm from a 19-inch CRT monitor, with a resolution of 1280 × 960 and a refresh rate of 90 Hz, and their head was stabilized using a chin and forehead rest.
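The stimulus sizes above are given in degrees of visual angle at a 57 cm viewing distance. As a minimal sketch of how such sizes map onto the screen, the Python below converts a visual angle to centimetres and counts the grating cycles across the rivalry patch. The physical screen width is not stated in the text, so `screen_width_cm` is a hypothetical parameter that must be measured on the actual monitor.

```python
import math

VIEWING_DISTANCE_CM = 57.0  # viewing distance stated in the text


def visual_angle_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent on the screen that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def cm_to_pixels(size_cm, screen_width_cm, horizontal_resolution=1280):
    """Convert centimetres to pixels.

    screen_width_cm is an assumption (not given in the text); 1280 is the
    horizontal resolution reported for the main experiment.
    """
    return size_cm * horizontal_resolution / screen_width_cm


# Extent of the 3.1 deg rivalry grating and the number of cycles it contains
grating_cm = visual_angle_to_cm(3.1)
cycles_across_grating = 0.775 * 3.1  # cycles/deg times deg
```

At 57 cm, one degree corresponds to very nearly 1 cm on the screen, which is why this viewing distance is a common choice.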
For static stimuli (vase-face, VF; Necker cube, NC; binocular rivalry, BR), the figures were continuously presented (Figure 1). Below, we provide a brief description of each type of dynamic stimulus.
(1) Spinning dancer (SD): Stimuli were obtained from Liu et al. (2012); the figure was manipulated in length and width, the color was reversed, and each gait cycle was 1.13 s and contained 102 frames. (2) Lissajous-figure (LF): The figure was generated by the intersection of two sinusoids with perpendicular axes [x(t) = sin(2t); y(t) = sin(t + δ), with δ increasing from 0 to 2π], and rotated at 0.25 cycles per second (Weilnhammer et al., 2014). (3) Rotating cylinder (RC): Cylinder projections were 3.1° wide and 3.1° tall, contained 450 randomly placed dots, and rotated at 0.25 cycles per second. (4) Moving plaid (MP): The gratings were 0.75 cpd rectangular wave gratings extending 3° of visual angle in diameter. The gratings moved at a speed of 1.5°/s. (5) Translating diamond (TD): The stimulus was a line drawing of a diamond whose four corners were occluded by three vertical bars of the same color as the background; the diamond moved at a constant horizontal speed of 3°/s and reversed direction every 1 s (Fang et al., 2008).
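The Lissajous-figure definition above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' stimulus code: it assumes that the phase δ advances linearly and completes one 0 to 2π sweep per rotation cycle (0.25 cycles per second), and that frames are drawn at the 90 Hz refresh rate of the main experiment. The function names are hypothetical.

```python
import math


def lissajous_points(delta, n_points=360):
    """Sample x(t) = sin(2t), y(t) = sin(t + delta) for t in [0, 2*pi)."""
    points = []
    for i in range(n_points):
        t = 2.0 * math.pi * i / n_points
        points.append((math.sin(2.0 * t), math.sin(t + delta)))
    return points


def phase_at(time_s, rotation_hz=0.25):
    """Phase delta at time_s, assuming one full 0..2*pi sweep per rotation cycle."""
    return (2.0 * math.pi * rotation_hz * time_s) % (2.0 * math.pi)


def frame_phases(duration_s, refresh_hz=90.0, rotation_hz=0.25):
    """Phase delta for every displayed frame (assumed one update per refresh)."""
    n_frames = int(round(duration_s * refresh_hz))
    return [phase_at(k / refresh_hz, rotation_hz) for k in range(n_frames)]
```

Calling `lissajous_points(phase_at(t))` for successive frame times yields the outline to draw on each frame; one full rotation takes 4 s, that is 360 frames at 90 Hz.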

Procedure and Task
At the beginning of the experimental session, each subject went through a training process in which the subject was first shown the two possible percepts of each stimulus and then experienced the alternating perception 16 times (with button presses). After training, the task session began with 2 binocular rivalry trials, followed by the other 10 bistable stimuli in random order with each type repeated three times; binocular rivalry was then tested twice again. Each trial started when the subject pressed the space key, and the stimulus was presented for 60 s. During that period, subjects pressed one of two keys to indicate their dominant percept. Subjects were instructed to maintain fixation throughout each trial. All subjects had a 60-s rest between trials.

Participants
A total of 100 subjects (53 females) with normal or corrected-to-normal vision participated. Their ages ranged from 18 to 33, with the majority between 20 and 25. Prior to participating in the bistable experiment, they were tested for visual acuity, stereo acuity, color vision, simple reaction time, perception of coherent motion, and handedness.

Data Analysis
For each subject and each stimulus type, we obtained the average dominance time from the three 60-s trials (four trials for binocular rivalry); the average switch rates were then calculated. For each stimulus type, switching-rate data that deviated by more than 3 standard deviations from the mean of the 100 observers were excluded. In total, 9 of the 1100 data points were excluded from the correlation analysis.
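A minimal Python sketch of these steps is given below. Several details are assumptions, since the text does not specify them: key presses are represented as time-ordered `(time_s, percept)` pairs, repeated presses of the same key are ignored, the last (truncated) percept of a trial is not counted as a dominance period, and the switch rate is expressed as switches per second. All function names are hypothetical.

```python
import statistics

TRIAL_DURATION_S = 60.0  # trial length stated in the text


def percept_onsets(events):
    """Collapse time-ordered (time_s, percept) key presses into percept onsets.

    Repeated presses of the same key are dropped, so each entry marks a change.
    """
    onsets = []
    for time_s, percept in events:
        if not onsets or onsets[-1][1] != percept:
            onsets.append((time_s, percept))
    return onsets


def dominance_durations(events):
    """Durations between successive percept onsets (final truncated percept omitted)."""
    onsets = percept_onsets(events)
    return [t1 - t0 for (t0, _), (t1, _) in zip(onsets, onsets[1:])]


def switch_rate(events, trial_duration_s=TRIAL_DURATION_S):
    """Number of perceptual switches per second within one trial."""
    n_switches = max(len(percept_onsets(events)) - 1, 0)
    return n_switches / trial_duration_s


def exclude_outliers(rates, n_sd=3.0):
    """Replace rates farther than n_sd sample SDs from the mean with None."""
    mean = statistics.mean(rates)
    sd = statistics.stdev(rates)
    return [r if abs(r - mean) <= n_sd * sd else None for r in rates]
```

Here `exclude_outliers` would be applied once per stimulus type to the rates of all observers; marking excluded values as `None` keeps each observer aligned across stimulus types for the later pairwise correlations.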
MATLAB and SPSS were used to analyze the data. To assess the similarity among the bistable stimuli, a correlation coefficient was calculated for each pair of stimulus types. We applied factor analysis to test whether latent factors could account for the similarity pattern. To visualize how dissimilarities structure the relationships among the bistable stimuli, multi-dimensional scaling was performed in 3D, because the initial eigenvalues of the first three components exceeded 1. To obtain better-behaved distributions across individuals, we used standardized logarithmic switch rates.
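The standardized logarithmic switch rates mentioned above can be sketched as follows. This assumes that "standardized" means z-scoring across observers within one stimulus type and that excluded observations are carried as `None`; neither detail is spelled out in the text, and the function name is hypothetical.

```python
import math
import statistics


def standardized_log_rates(rates):
    """Z-score the log switch rates of one stimulus type across observers.

    Entries equal to None (excluded data points) are passed through unchanged.
    Rates must be positive for the logarithm to be defined.
    """
    logs = [math.log(r) for r in rates if r is not None]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)
    return [None if r is None else (math.log(r) - mean) / sd for r in rates]
```

Because z-scoring removes any multiplicative constant, the choice of logarithm base does not affect the result.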

Control Experiments
We performed two control experiments in a smaller number of subjects to investigate (1) whether the variation of switching dynamics across individuals was largely invariant to the retinal location tested; and (2) the potential contribution of eye blinks and eye movement patterns to the switch dynamics.
In the first control experiment, with 13 subjects (8 females), the procedures were the same as described above, except that only the binocular rivalry and moving plaid stimuli were used, and they were presented at different locations relative to fixation, with the pattern changes shown in Figure 2. The fixation dot was gray and had a diameter of 0.2°, and the hole in the center was 0.4°. The physical presentation conditions were the same as in the main experiment, and all stimuli were viewed through the stereoscope.
The second control experiment (n = 14, 9 females) was aimed at investigating the contribution of eye movements. Visual stimuli (rolling wheel, vase-face, binocular rivalry, Necker cube, rotating cylinder, moving plaid, and translating diamond) were presented at 5° × 5° on a 21-inch monitor, with a resolution of 1920 × 1440 and a refresh rate of 75 Hz.
The procedure and task were essentially the same as in the main experiment, except for the addition of eye tracking. Eye movements and blinks were measured binocularly with a noninvasive infrared "Eyelink-1000" eye tracker (SR Research, Osgoode, Ottawa, ON, Canada) at a sampling rate of 500 Hz.

RESULTS
The main experiment investigated the relationship between the temporal dynamics of different bistable stimuli, using an individual difference approach. A simple and direct question is whether some of the switching rates are correlated across subjects. The basic results are in the form of cross-individual correlations between different stimulus types.

Correlation Matrix
First, we computed the Pearson's correlation coefficient between the switching rates of each pair of the 11 bistable types across our sample of 100 subjects. The pairwise correlations were then assembled into a correlation matrix, shown in Figure 3 as a heatmap. To correct for multiple comparisons, significance is asserted only for correlation coefficients with a corrected p-value below that corresponding to an expected false discovery rate (FDR) of 0.05. The significant pairwise correlations are also summarized in Table 1.
Perhaps not surprisingly, the results show that the three stimuli related to structure from motion (spinning dancer, Lissajous-figure, and rotating cylinder) were highly correlated with each other (Figure 4). The correlation between SD and LF was r = 0.463 (p < 0.001), between SD and RC was r = 0.393 (p < 0.004), and between LF and RC was r = 0.371 (p < 0.009).
In addition, binocular rivalry was correlated with a number of other stimuli (Figure 5); the strongest correlation was with vase-face, r = 0.369 (p < 0.009). Moving plaid had relatively strong correlations with Necker cube, r = 0.462 (p < 0.001), and with rolling wheel, r = 0.403 (p < 0.002), as well as with binocular rivalry, r = 0.344 (p < 0.027).
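As a rough illustration of this step, the sketch below computes a Pearson correlation and applies a false discovery rate threshold. The text does not name the FDR procedure; the Benjamini-Hochberg step-up rule is used here as an assumption. The p-values for each correlation are taken as given inputs (they would normally come from a statistics package), and the function names are hypothetical.

```python
import math

N_PAIRS = 11 * 10 // 2  # number of distinct stimulus pairs among 11 types


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(p_values, q=0.05):
    """Flag each p-value as significant under the Benjamini-Hochberg rule.

    Finds the largest rank k with p_(k) <= k * q / m and declares the k
    smallest p-values significant (step-up procedure).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_k = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_k:
            significant[idx] = True
    return significant
```

With 11 stimulus types there are 55 pairwise correlations, so `benjamini_hochberg` would receive a list of 55 p-values, one per pair.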

Structures (Clustering) in the Correlation Matrix: Factor Analysis and Multi-Dimensional Scaling
In order to see whether latent components can explain the pattern in the correlation matrix, we performed factor analysis on the 11 stimulus types across individuals. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.662; a minimum Kaiser-Meyer-Olkin score of 0.50 is considered necessary to reliably use factor analysis. Similarly, the Bartlett test of sphericity statistic (the higher the better) was 163.7, with a significance level of p < 0.001. Inspection of the scree plot and eigenvalues showed a departure from linearity coinciding with a three-factor result. Therefore, the scree test indicated that the data should be analyzed for three factors. To facilitate interpretation of the results, an orthogonal Varimax rotation was applied. Table 2 shows that different stimuli contribute to the factors differently: binocular rivalry, vase-face, moving plaid, Necker cube, and rolling wheel contribute most to the first factor; rotating cylinder, spinning dancer, and Lissajous-figure load highly on the second factor; and the other three contribute to the third.
The correlation results show to what extent the stimuli were linearly alike, and multi-dimensional scaling provides a more intuitive view of how these different stimuli are grouped (Figure 6). The best solution for the dissimilarity coefficient matrix of the 11 stimulus types was computed by ALSCAL as a 3D Euclidean space (stress value 0.105, RSQ = 0.910). The 11 stimuli clustered into configurations consistent with the results from the factor analysis. As indicated in the figure, rotating cylinder, spinning dancer, and Lissajous-figure are close to each other, and vase-face, Necker cube, rolling wheel, binocular rivalry, and moving plaid seem to form another group.
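For readers unfamiliar with the two adequacy statistics quoted above, their standard textbook definitions are given below. This assumes the statistics were computed in the usual way by the software; the symbols are introduced here for illustration. R is the correlation matrix of the p = 11 stimulus types, r_ij are its off-diagonal entries, a_ij are the corresponding partial correlations, and n is the number of observers entering the analysis.

```latex
% Kaiser-Meyer-Olkin measure of sampling adequacy
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}

% Bartlett's test of sphericity (approximately chi-square distributed)
\chi^{2} = -\left(n - 1 - \frac{2p + 5}{6}\right) \ln \lvert R \rvert ,
\qquad \mathrm{df} = \frac{p(p-1)}{2}
```

With p = 11 the test has 55 degrees of freedom. KMO approaches 1 when the partial correlations are small relative to the raw correlations, which is the situation in which common factors are plausible.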

Influence of Gender and Age to Switching Dynamics
We also examined the contribution of gender and age to switching rates. Gender strongly influenced the switching dynamics of the three types of structure-from-motion stimuli (SD, RC, and LF; see Figure 7): males had significantly slower switching rates than females for spinning dancer (F = 14.25, p < 0.001), rotating cylinder (F = 6.80, p < 0.011), and Lissajous-figure (F = 15.78, p < 0.001). In addition, switching rates of translating diamond were negatively correlated with age, with slower switching at increasing age (Figure 8).
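The F values above compare switching rates between two groups. The text does not state which test produced them; the sketch below assumes a one-way ANOVA with gender as the single factor, in which case F equals the square of the two-sample t statistic. The function name is hypothetical.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic with its degrees of freedom.

    groups is a list of lists, one list of switching rates per group.
    Returns (F, df_between, df_within).
    """
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

For a comparison of two groups, `df_between` is 1 and `df_within` is the number of observers minus 2.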

Control Experiments
The two control experiments were performed to investigate how sensitive the alternation dynamics were to retinal location and eye movements. First, we looked at the effects of retinal location. With the binocular rivalry and moving plaid stimuli presented at different retinal locations (see the section "Materials and Methods"), their relative switching rates seem to be highly invariant. We plotted the Pearson's correlations between the tested conditions in a correlation matrix (Figure 9). The results clearly showed that, within the same stimulus type, the correlation across observers was high and essentially invariant to stimulus location (diagonal cells in the matrix, all r > 0.7 and p < 0.001), and that the reduced level of correlation between the two stimulus types was also largely invariant to stimulus location (the off-diagonal cells, r∼0.4).
Then, we examined the effect of eye movements and blinks in a small group of subjects. Eye movements and blinks of both eyes in 14 observers were recorded at 500 Hz during exposure to seven different types of rivalry stimuli. The number of blinks for each subject was calculated from the periods of missing pupil signal during the constant presentation of each stimulus. Results show that there was no correlation between the number of blinks and the perceptual switching rates across individuals (Table 3, top row). The mean absolute Pearson's correlation coefficient was 0.140, and all p > 0.25. The data indicated that blink rates were relatively constant when viewing different stimuli, suggesting that variation in eye blinks is very unlikely to be the cause of the observed clustering among the 11 stimuli in the main results.
FIGURE 6 | Multi-dimensional scaling of tested bistable types. Dissimilarity measured by Euclidean distance model. Closer distance indicates higher similarity.
We also found no correlation between eye movements (measured as the standard deviation of the distance to center) and the switching rates of the seven types of stimuli (Table 3, bottom row). The mean absolute value of r was 0.084, and all p > 0.45. These results indicate that the pattern of eye movements is also very unlikely to be the reason for the observed grouping in switching dynamics in the main experiment.
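The two eye-tracking measures described above (blinks counted from missing pupil signal, and eye movements summarized as the standard deviation of the distance to center) can be sketched as follows. The data layout is an assumption: missing samples are represented as `None`, and `min_missing` is a hypothetical threshold, in samples at 500 Hz, for a gap to count as a blink. The text does not report such a threshold.

```python
import math
import statistics


def count_blinks(pupil_samples, min_missing=1):
    """Count runs of at least min_missing consecutive missing (None) samples."""
    blinks = 0
    run = 0
    for sample in pupil_samples:
        if sample is None:
            run += 1
        else:
            if run >= min_missing:
                blinks += 1
            run = 0
    if run >= min_missing:  # recording ended during a gap
        blinks += 1
    return blinks


def gaze_dispersion(gaze_xy, center=(0.0, 0.0)):
    """Sample SD of the gaze distance from center, ignoring missing samples."""
    distances = [math.hypot(p[0] - center[0], p[1] - center[1])
                 for p in gaze_xy if p is not None]
    return statistics.stdev(distances)
```

Each measure yields one value per observer and stimulus type, which can then be correlated with the corresponding switching rates across observers.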

DISCUSSION
Our results show that the dynamics of different types of bistable phenomena are unlikely to be controlled by a single mechanism, i.e., there does not seem to be a master mental clock against which all perceptual switches are pegged: a person experiencing faster switching in one phenomenon would not necessarily perceive faster switching in all other bistable stimuli. That said, it is also very apparent from the data that the dynamics of some subsets of bistable stimuli are clearly correlated, forming a number of related groups.
The most noticeable relationships are the correlations among spinning dancer, Lissajous-figure, and rotating cylinder, which showed considerably high correlations with each other. This is not surprising, since they are all examples of structure-from-motion stimuli; the bistability of these stimuli arises from assigning one surface or the other to the front vs. back.
The second group of phenomena includes binocular rivalry, vase-face, moving plaid, Necker cube, and rolling wheel. The factors contributing to this cluster are more complicated. Some of the stimuli in this group could be considered to have bistable figure-ground assignments in the general sense, although they differ in which features determine the figure-ground relationship. In the case of binocular rivalry, when one eye's input becomes the figure, the other recedes into the back and indeed could be considered occluded by the figure. For the Necker cube, the switching between front and back surfaces is more apparent, and is determined by the viewer-cube viewpoint relationship. The vase-face is a clear example of bistable figure-ground assignment, with the figure-ground relationship determined by border-ownership. Moving plaid and rolling wheel may be correlated with each other due to a shared motion mechanism, but it is less clear how these two are related to the other phenomena in this group; our results also indicated that RW and MP seem to be somewhat removed from the other three (Figure 6).
The remaining three bistable phenomena (MB, TD, and BM) seem to be more unique in their underlying neural mechanisms and in the key factors determining their bistability. This is particularly true for the point-light biological motion (BM) display. Although one could think that BM is also an example of structure-from-motion, numerous studies have suggested that there are dedicated neural mechanisms for the processing of BM, involving cortical regions such as the STS (Grossman et al., 2000; Beauchamp et al., 2003; Grossman et al., 2005). A number of studies have suggested predictive coding as a possible candidate mechanism for explaining bistable rivalry (Hohwy et al., 2008; Megumi et al., 2015; Weilnhammer et al., 2017; Brascamp et al., 2018). In hierarchical Bayesian inference, the prediction-error signal generated at each level drives the perceptual switch. The relative independence of some of the switching dynamics in our study may indicate that the prediction-error signals could be generated at different cortical areas for different types of stimuli (e.g., SFM may involve MT+/V5, biological motion may involve STS, and TD may involve LOC).
Gender seems to play an important role in the switching dynamics of the structure-from-motion stimuli (Shechter et al., 1991; Scocchia et al., 2014). Our results show that the switching rates in males are significantly slower than those in females for the three structure-from-motion stimuli. However, it is difficult to pinpoint which components of the neural processing are responsible for the observed gender difference.
It has also been reported that some forms of bistable perception slow down with increasing age (Beer et al., 1989; Ukai et al., 2003; Hudak et al., 2011; Aydin et al., 2013). Subjects in our study had a rather limited age range, and the current sample is not the best for observing age-related effects. Still, we did see that the switching rates of translating diamond slowed down with age.
Apparently, the ability to group (and ungroup) individual items to form (and to dissolve) a coherent shape is more flexible in younger adults and deteriorates quite rapidly with increasing age, considering the narrow age range of our observers.
There were earlier reports that the switching dynamics of binocular rivalry were correlated with those of the Necker cube (Shannon et al., 2011), motion induced blindness (Carter and Pettigrew, 2003), and stimulus rivalry (Patel et al., 2015). Results from the current study provide further support that binocular rivalry and Necker cube share some common factors in determining their dynamics. However, our data show a much lower correlation between binocular rivalry and motion induced blindness than was reported by Carter and Pettigrew (2003). It is not clear whether differences in the stimuli used in the two studies are responsible for the discrepancy. Subjects in Carter and Pettigrew's (2003) study had a wider age range, making it more likely that age was a common contributing factor behind the correlation between binocular rivalry and motion induced blindness. There is also the possibility that a very small number of subjects with much faster switching dynamics than the others accounted for a large portion of the correlation (see Figure 1 of their article).
Eye movements and eye blinks could potentially play a role in the switching dynamics of bistable perception. One study found a positive correlation between eye movements or blinks and binocular rivalry, but found that they were not essential to perceptual switches in other forms of perceptual rivalry (van Dam and van Ee, 2006b). Their data show that saccades were not associated with perceptual transitions, though blink rate increased around the time of a switch. In the current study, we did observe that subjects had different blink patterns during binocular rivalry sessions compared with other stimuli, potentially due to the use of the stereoscope. A study on the Necker cube indicated that changing eye position itself might provide a negative feedback signal that suppresses the percept (Einhäuser et al., 2004). In addition, some studies reported that eye movements have no relationship with rivalry rates, which is consistent with our results (van Dam and van Ee, 2006a; Law et al., 2015). In our data, each individual's blink rate was approximately constant across stimuli, which we take to reflect an individual blink characteristic. Given the clustered correlations among bistable switching rates, we think there is little relationship between blinking and switching dynamics. We also quantified the variability of eye position and found no notable correlation between eye movements and switching dynamics. Overall, the clustering of dynamics among different bistable phenomena is more likely due to the intrinsic mechanisms of those phenomena than to different patterns of eye movements or blinks. In addition, the control experiment on retinal location supports the view that the correlational structure among the different stimuli is stable across retinal locations, even though the absolute switching rate could change with retinal location.
The dynamics of bistable phenomena provide a window into the intrinsic properties of temporal operations in the brain, since the perceptual switches are not dictated by changes in the external stimulus but are initiated by internal processes in the brain. Understanding the nature of bistable dynamics also has important clinical implications. For example, people who suffer from bipolar disorder have slower switching rates in binocular rivalry and structure-from-motion displays relative to normal controls (Pettigrew and Miller, 1998; Miller et al., 2003, 2012; Krug et al., 2008; Nagamine et al., 2009; Ngo et al., 2011; Vierck et al., 2013; Law et al., 2015). Our data show that there is not a single "master clock" type mechanism that governs the temporal dynamics of perceptual switches for all bistable stimuli; instead, there seems to be a set of different mechanisms responsible for different groups of bistable phenomena. Thus, it is unlikely that patients with mental disorders (e.g., bipolar disorder) would have slower switching for all bistable phenomena, contrary to what was suggested by Miller et al. (2003). Indeed, differences in the relative change in temporal dynamics among patient groups suffering from different types of mental disorders could potentially serve as an objective and effective multidimensional endophenotype index in the research and diagnosis of those disorders.

CONCLUSION
We measured the temporal dynamics of 11 bistable phenomena in 100 observers. Results show that the switching rates of subsets of bistable percepts are highly correlated, yet different groups of bistable stimuli show relatively independent perceptual switching dynamics. The clustering of dynamic properties of some bistable phenomena, without an overall general control of switching dynamics, implies that the temporal pace of bistable perception is not controlled by a "master clock" type mechanism. Instead, these results suggest the existence of both shared and independent inferential processes in the brain responsible for the dynamics of different types of stimuli: being faster in constructing 3D structure from motion does not mean being quicker in integrating components into an object.

ETHICS STATEMENT
All subjects provided written informed consent before the experiments, and protocols were approved by the institutional review board of the Institute of Biophysics, Chinese Academy of Sciences.

AUTHOR CONTRIBUTIONS
SH, TC, SE, and LW contributed to the conception and the design of the work. TC and ZS acquired the data. TC analyzed the data, and wrote the initial draft. SH and SE revised the manuscript.