No Evidence of Narrowly Defined Cognitive Penetrability in Unambiguous Vision

The classical notion of cognitive impenetrability suggests that perceptual processing is an automatic modular system and not under conscious control. Near consensus is now emerging that this classical notion is untenable. However, as recently pointed out by Firestone and Scholl, this consensus is built on quicksand. In most studies claiming perception is cognitively penetrable, it remains unclear which actual process has been affected (perception, memory, imagery, input selection or judgment). In fact, the only available “proofs” for cognitive penetrability are proxies for perception, such as behavioral responses and neural correlates. We suggest that one can interpret cognitive penetrability in two different ways, a broad sense and a narrow sense. In the broad sense, attention and memory are not considered as “just” pre- and post-perceptual systems but as part of the mechanisms by which top-down processes influence the actual percept. Although many studies have proven top-down influences in this broader sense, it is still debatable whether cognitive penetrability remains tenable in a narrow sense. The narrow sense states that cognitive penetrability only occurs when top-down factors are flexible and cause a clear illusion from a first-person perspective. So far, there is no strong evidence from a first-person perspective that visual illusions can indeed be driven by high-level flexible factors. One cannot be cognitively trained to see and unsee visual illusions. We argue that this lack of convincing proof for cognitive penetrability in the narrow sense can be explained by the fact that most research focuses on foveal vision only. This type of perception may be too unambiguous for transient high-level factors to control perception. Therefore, illusions in more ambiguous perception, such as peripheral vision, can offer a unique insight into the matter. They produce a clear subjective percept based on unclear, degraded visual input: the optimal basis to study narrowly defined cognitive penetrability.

the fovea (Westheimer, 1982; Anderson et al., 1991). Yet we perceive the world as rich in color and detail (Lamme, 2006; Block, 2007, 2011; Rahnev et al., 2011). So how can human perceptual experience be so clear, when it is often based on unclear input?
There are currently two major, but conflicting, answers to the question of why we see things the way we do. The first answer is the classical bottom-up view. The classical view states that our visual experience is purely based on a sensory/bottom-up signal, translated according to fixed rules (that may involve world knowledge). A highly influential psychologist in this regard is J. J. Gibson (1904–1979). Gibson states that vision is purely based on information from the environment and that it is not affected by cognitive construction or processing. Gibson's view is also known as ecological psychology (Gibson, 1966). This bottom-up processing is often considered to be cognitively impenetrable. Cognitive impenetrability can be defined as the inability to consciously and purposefully modulate the processing of a mental operation that is thought to be carried out in an automated unsupervised manner, such as basic sensory perception. This modular system is domain specific and its operation is mandatory (Fodor, 1983). Although some theories about the visual system are based on this concept, the classical view cannot clearly explain how noisy input is often experienced as a rich visual percept and how object recognition is influenced by contextual information [see, e.g., Bar (2004) for a review on object perception]. It seems that theories based on purely bottom-up processing (without any influence of top-down processes) do not hold, and have become outdated.
The second answer to the question of why we see things the way we do is the alternative top-down view. In contrast to the classical view, the alternative view states that our perception is affected by transient internal states, such as wishes, expectations and beliefs. This latter view, also known as cognitive penetrability (CP), claims that (intentions of) actions can change our perception through flexible priors. Higher-level cognitive states routinely penetrate our perception, such that what we see is an alloy of bottom-up factors and beliefs, desires and motivations. The brain continually updates its model of the world based on a Bayesian weighting of sensory input (bottom-up) and prior expectations (top-down) (Knill and Pouget, 2004; Clark, 2013; Summerfield and de Lange, 2014; Pinto et al., 2015). Our perception is cognitively modulated in many ways, for instance, in brightness illusions (Adelson, 1993), Ramachandran's scotoma (Ramachandran and Gregory, 1991), or motion-induced blindness (Bonneh et al., 2001). Other examples of the modulation of perception are illusions based on general cognitive rules, such as the Ames window (Ittelson, 1952) and hollow faces (Gregory, 1970; Hill and Bruce, 1993). In these illusions, unusual objects or shapes give rise to systematic errors, as they are in conflict with fixed rules or general knowledge.
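The Bayesian combination of sensory input and prior expectations mentioned above can be made concrete in its simplest textbook form. The following is an illustrative sketch only, assuming a Gaussian prior and Gaussian sensory noise; the symbols ($x$ for the sensory measurement, $\mu_p$ and $\sigma_p$ for the prior, $\sigma_s$ for the sensory noise) are introduced here for illustration and do not come from the cited studies.

```latex
% Assumed setup: prior N(\mu_p, \sigma_p^2), sensory measurement x with noise variance \sigma_s^2
\mu_{\text{post}} = w\,x + (1 - w)\,\mu_p,
\qquad
w = \frac{\sigma_p^{2}}{\sigma_p^{2} + \sigma_s^{2}},
\qquad
\sigma_{\text{post}}^{2} = \frac{\sigma_p^{2}\,\sigma_s^{2}}{\sigma_p^{2} + \sigma_s^{2}}
```

Under these assumptions, the weight $w$ on the sensory input approaches 1 when the input is reliable (small $\sigma_s$) and approaches 0 when it is noisy (large $\sigma_s$), in which case the estimate is dominated by the prior.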
Over the last few years, many studies have claimed to prove CP without the use of these illusions based on fixed rules or general knowledge. For example, studies show that a bottle of water looks closer when we are thirsty (Balcetis and Dunning, 2010), that social expectations affect basic perceptual experiences, i.e., faces with African American features look darker (Levin and Banaji, 2006; Zhong and Leonardelli, 2008), and that words are easier to detect when they are morally relevant (Gantman and Van Bavel, 2014). Most researchers consider these results such pervasive evidence of CP that the classical notion of cognitive impenetrability is often considered to be untenable.
Although many studies claim to have proven CP, Firestone and Scholl pointed out some significant problems in most of these experiments (Firestone and Scholl, 2015). They state that perceptual top-down research "falls prey to a set of pitfalls." Roughly speaking, there are two major problems. The first problem within this field is that most results reflect top-down processes in early visual selection through attention shifts. Research has shown that selection of input can be under top-down control, for instance through eye movements or attention shifts. However, this does not prove that, after selection, the translation to a percept is under top-down control. For example, inattentional blindness (Mack and Rock, 1998; Most et al., 2005; Ward and Scholl, 2015) might be a failure to see or to memorize (Wolfe, 1999; Lamme, 2003) what we do not attend to. According to Lamme (2003), attention and conscious perception might be two separate systems, in which attention is needed to store our actual perception in working memory and to be able to report it afterward. According to this theory, it remains unclear whether inattentional blindness is a result of insufficient attention, insufficient perception or insufficient conscious memory.
The second problem is that experimental results are often not direct proof of a change in perception per se, but are possibly a reflection of, for instance, our judgment. We may directly see that a bottle of water is closer when we are thirsty, or we may merely assume or conclude that it is closer. Another example is the study of Wesp and Gasper (2012). In an earlier experiment they found that less accurate throwing of darts led to estimation of a smaller target size, as if one's performance perceptually resized the target (Wesp et al., 2004). However, when they replicated this experiment in 2012, subjects were told that the darts were defective. This additional instruction eliminated all correlation between performance and reported size of the target. This result indicates that if an experiment shifts perceptual reports, the shift may reflect changes in judgment rather than changes in perception. Other examples of studies that possibly do not reflect a change in perception, although claiming to do so, are experiments using neuroimaging and electrophysiology. Although feedback connectivity in descending neural pathways is often interpreted as a top-down effect, in which higher brain regions are assumed to modulate lower brain regions through descending neural pathways (Bar et al., 2006; Gilbert and Li, 2013), such imaging studies are by definition correlational. Specific neuronal interactions and feedback connectivity might be a reflection of our visual percept, but could also be a reflection of, for example, recall (Le Bihan et al., 1993) or imagery (Kosslyn, 2005). Thus, activation that is registered via an electrode or MRI scanner might not always be necessary for, or even directly related to, perception. Even when neuroimaging data do reflect a direct effect of feedback processing on perception, for example in unconscious inferences, this process is not under conscious control. Neural or behavioral data can be very useful in supporting claims of perceptual changes caused by controlled top-down processes; however, such data are not conclusive by themselves.
The experimental pitfalls pointed out by Firestone and Scholl make it arguable whether perception is indeed cognitively penetrable or whether most of these studies are methodologically insufficient. The pitfalls listed by Firestone and Scholl mostly rest on the assumption that attention is pre-perceptual and memory is post-perceptual, and that it is often not clear which actual process has been affected. However, it is debatable whether attention and memory should be considered as purely pre- and post-perceptual systems, or as part of the mechanisms by which top-down processes influence the actual percept and thereby as part of the visual system (Lupyan, 2016).
We suggest that one can interpret CP in two different ways, in a broad and a narrow sense. The broader sense of CP suggests that attention and memory are part of the visual system, and that top-down processes can influence the perceptual system. In this definition, perception is penetrable when top-down processes change attention, perception or memory. If CP is interpreted in the broad sense, many studies have provided fairly strong evidence of CP. For example, scene knowledge affects perception of edge orientations (Neri, 2014), knowledge of the real-world size of, e.g., a basketball affects apparent speed of motion (by altering perception of distance) (Andrés et al., 2015), knowledge of usual object colors shades our color perception (Hansen et al., 2006; Olkkonen et al., 2008; Witzel et al., 2011; Kimura et al., 2013) and influences the intensity of color afterimages (Lupyan, 2015a,b), and hearing the right word can make something visible that is otherwise invisible (Lupyan and Ward, 2013).
In the narrow sense of CP, however, the notion of CP is less obvious. We define narrow CP as follows. Narrow CP occurs when flexible factors (that can be learned and unlearned) affect perception, after the effects of attention, selection and memory are dismissed (see Vance and Stokes, 2016 for a similar definition). According to this narrow definition of CP, the pivotal question is whether selected malleable top-down factors can still affect perceptual experiences after sensory input (attention) and before reporting (memory). Two requirements need to be met before narrow CP is established. First, perception itself has to be unambiguously influenced by top-down processes. Second, these top-down processes must be flexible, in the sense that a healthy adult is able to turn these processes on and off, through training or voluntary decisions. Thus, fixed top-down processes (such as brightness perception being affected by surrounding information) do not count as examples of narrow CP.
In many of the previously mentioned studies (Hansen et al., 2006; Olkkonen et al., 2008; Witzel et al., 2011; Kimura et al., 2013; Neri, 2014; Martín et al., 2015; Witzel, 2016) purely attentional or post-perceptual processes (Lupyan and Ward, 2013) may have caused the observed effects. For instance, it could be argued that scene knowledge primarily affects orientation judgments, rather than causing perceptual distortions. Similarly, real-world knowledge may affect speed judgments more than it creates actual illusions in speed perception. Furthermore, studies of binocular rivalry and continuous flash suppression have shown that attention/selection can determine the dominance of a stimulus (Chong et al., 2005). In other words, selection through attention may cause the effects of top-down processes on binocular rivalry and continuous flash suppression.
We acknowledge that it is very difficult to separate out attentional and perceptual effects. Some attention researchers may therefore not share our notion of narrow CP, since it could be argued that attention cannot be separated from perception. However, it is crucial to stress that in our definition of narrow CP, selection or amplifying effects of attention do not constitute narrow CP. These effects of attention on perception clearly occur and are consistently found in both behavior and neural activity. However, in our definition of narrow CP, flexible top-down factors should affect perception after selection has taken place, and in such a way that the contents of perception are altered (not merely the level of awareness). We assert that although it might be difficult, it is not impossible to prove narrow CP after the effects of attention, selection and memory are dismissed. For example, some illusions, such as the McGurk effect (McGurk and Macdonald, 1976), are clearly distortions of perception (from a first-person perspective). In experiments without such a clear subjective distortion, it is hard to prove whether perception, or pre- or post-perceptual processes, are affected. We therefore assert that in order to prove CP according to the narrow definition, we need to focus on perception from a first-person perspective instead of (or in addition to) using proxies for perception, for example by using clear visual illusions. Only when top-down, malleable factors cause a clear illusion from a first-person perspective can strong claims about the narrow definition of CP be made.
Importantly, although visual illusions may be considered as proof for CP in the broader sense, awareness and understanding of an illusion cannot make it unseen, and therefore most visual illusions cannot (yet) directly provide evidence for narrow CP. These illusions seem to be caused by fixed rules, which are hardwired into the visual system.
In conclusion, we claim that there is currently decisive evidence for CP when defined broadly, but not (yet) for CP in the narrower sense.

PERIPHERAL ILLUSIONS
Here we take a critical position toward the existence of narrow CP, i.e., the occurrence of flexible, learnable top-down factors affecting the contents of perception (as shown through clear illusions) while dismissing the effects of attention and memory. We want to point out, however, that the current lack of evidence for narrow CP does not necessarily imply that it does not exist. An alternative explanation for the absence of proof might be the fact that in nearly all illusion studies, stimuli are presented foveally, while they are attended. Since the signals from the fovea are often high fidelity, bottom-up input requires little or no direct top-down influence. In contrast to these clear foveal signals, the resolution in our peripheral vision is roughly equivalent to "looking through a frosted shower door" (Eagleman, 2001). We suggest that with noisy sensory input, like this peripheral frosted shower door, we have a much better chance of finding evidence that noisy bottom-up signals might be influenced by first-person factors, such as personal traits, experiences and beliefs. Even though sensory signals can also be ambiguous in the foveal part of the retina, as they confuse information from surfaces and illuminants, and because of the 3D to 2D projection, the essential difference between a noisy foveal image and an image in the periphery is that in one case the external input is noisy, while in the other case the input is clear but the processing is noisy. The difference can be understood as follows. Imagine two reporters; one is a very reliable reporter while the other one is extremely chaotic and unreliable. When the very good reporter (i.e., the fovea) reports to the control room that the situation is disorderly, and there are riots everywhere, the control room will simply conclude that that is the current state of affairs. However, when the chaotic reporter (i.e., the peripheral signal) delivers an incoherent report, the control room will try to use best guesses to really understand what is going on. In other words, when the fovea reports to the brain that the external stimulus is noisy, the brain has no reason to override this report, and thus no flexible illusion will be created (only illusions based on fixed rules). However, when the periphery transmits a low-fidelity report, the brain may augment this report, and thus possibly create a visual illusion based on transient cognition.
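As a rough illustration of the reporter analogy, the minimal sketch below assumes a Gaussian prior combined with a Gaussian sensory measurement, so that each source is weighted by its precision (inverse variance). It only models unreliable processing (the chaotic reporter), not a faithfully reported noisy stimulus. The function name and all numbers are hypothetical and are not taken from any of the cited studies.

```python
def posterior_mean(measurement, sensory_sd, prior_mean, prior_sd):
    """Combine a Gaussian prior with a Gaussian sensory measurement.

    Returns the posterior mean and the weight given to the sensory input.
    All quantities are in arbitrary units (e.g., degrees of orientation).
    """
    sensory_precision = 1.0 / sensory_sd ** 2
    prior_precision = 1.0 / prior_sd ** 2
    # The more reliable the sensory signal, the larger its share of the estimate.
    weight_on_input = sensory_precision / (sensory_precision + prior_precision)
    estimate = weight_on_input * measurement + (1.0 - weight_on_input) * prior_mean
    return estimate, weight_on_input


# Hypothetical values: the stimulus is measured as 10, the prior expects 0.
foveal_estimate, foveal_weight = posterior_mean(10.0, 1.0, 0.0, 5.0)
peripheral_estimate, peripheral_weight = posterior_mean(10.0, 10.0, 0.0, 5.0)

print(f"fovea:     weight on input {foveal_weight:.2f}, estimate {foveal_estimate:.2f}")
print(f"periphery: weight on input {peripheral_weight:.2f}, estimate {peripheral_estimate:.2f}")
```

With these assumed values, the reliable (foveal) signal dominates the estimate, whereas the unreliable (peripheral) signal is largely overridden by the prior. This is the regime in which flexible, learnable priors would have the most room to shape the percept, if they do so at all.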
Peripheral vision becomes especially noisy during long fixations (Clarke, 1961; Martinez-Conde et al., 2006), in which (parts of) perception flexibly adopt a new identity based on global visual information and possibly high-level factors. Perhaps, peripheral illusions based on such long fixations could prove an effect of cognitive contents. They might be more sensitive to learnable priors and less driven by automatic algorithms, as their bottom-up signal is noisy. One striking visual illusion in peripheral vision is the uniformity illusion (see Figure 1). This illusion suggests that the detailed peripheral visual experience is partially based on a reconstruction of reality. In a visual display where central stimuli differ from peripheral stimuli on specific properties, central stimuli appear to overflow into the periphery for extended periods of time. Observers thus perceive the stimuli in the periphery to take on the properties of the central stimuli, resulting in a uniform field encompassing the center and the periphery of the display (Otten et al., 2016). This uniformity illusion has been demonstrated for a wide range of visual features, such as luminance, orientation, motion and texture. Importantly, unlike most other visual illusions, this is an illusion based on weak sensory processing. Although it seems likely that the illusion is (at least partly) driven by fixed rules and automatic algorithms just as other visual illusions are, more research is required in order to answer the question whether learnable priors can affect this illusion. Its ambiguous nature, its global effect on perception and the wide range of visual features in which this illusion occurs, provide the ideal circumstances to study how the brain constructs visual illusions and to what extent such illusions are cognitively penetrable.
To summarize, there are still some disagreements concerning the role of cognitive penetrability in visual perception. We do not debate the existence of CP in the broader sense. However, in our definition of narrow CP, attention and memory are considered to be pre-perceptual and post-perceptual processes. Moreover, narrow CP only occurs when flexible, learnable factors affect the contents of perception, after the effects of attention and memory are dismissed. We argue that narrow CP has not (yet) been proven, since most evidence for cognitive penetration is based on methods that employ proxies for perception. The one data point that could really prove narrow CP is a clear illusion from a first-person perspective. However, so far, clear illusions do not support narrow CP, as these illusions cannot be unseen (i.e., they are driven by unchangeable rules).
To provide more insight into the matter, future research should focus on cognitively induced perceptual illusions when the sensory signal is noisy, such as during the uniformity illusion. Interesting research questions would be: which functional manipulations affect the uniformity illusion? Or, how can prior expectations influence this illusion? For example, subjects could be divided into two groups, in which subjects in the first group are given no priors and subjects in the second group are first given correct priors about the stimuli in the periphery, followed by false priors. Can these correct/false priors strengthen/weaken the uniformity illusion? And does changing the priors within subjects change the perception of the same stimuli? If future research indeed verifies that illusions can be affected through learnable cognitive priors when sensory input is unreliable, then the notion of cognitive penetrability receives clear proof, even when it is defined narrowly. However, if even under these circumstances narrow CP does not occur, then it becomes doubtful whether narrow CP exists at all. In that case, the effects of cognition are probably either purely based on post-perceptual processes (e.g., memory, judgment) or pre-perceptual processes (input selection), or driven by fixed, unlearnable factors.

AUTHOR CONTRIBUTIONS
All authors shared the opinion of a critical position toward cognitive penetrability in visual perception, and the significance of cognitively induced perceptual illusions in peripheral vision. NL drafted the manuscript, which was adjusted based on feedback from YP and EdH. All authors approve the manuscript.

ACKNOWLEDGMENT
None of the authors have financial or other conflicts of interest to report. This work was supported by ERC grant FAB4V (#339374).