Improving Perception to Make Distant Connections Closer

Goldstone, Robert; Landy, David; Brunel, Lionel Cédric

doi:10.3389/fpsyg.2011.00385

HYPOTHESIS AND THEORY article

Front. Psychol., 27 December 2011

Sec. Perception Science

volume 2 - 2011 | https://doi.org/10.3389/fpsyg.2011.00385

This article is part of the Research Topic: Linking Perception and Cognition

Robert L. Goldstone¹*, David Landy² and Lionel C. Brunel³

¹ Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, USA
² Department of Psychology, University of Richmond, Richmond, VA, USA
³ Department of Psychology, Université Paul-Valéry Montpellier III, Montpellier, France

One of the challenges for perceptually grounded accounts of high-level cognition is to explain how people make connections and draw inferences between situations that superficially have little in common. Evidence suggests that people draw these connections even without having explicit, verbalizable knowledge of their bases. Instead, the connections are based on sub-symbolic representations that are grounded in perception, action, and space. One reason why people are able to spontaneously see relations between situations that initially appear to be unrelated is that their eventual perceptions are not restricted to initial appearances. Training and strategic deployment allow our perceptual processes to deliver outputs that would have otherwise required abstract or formal reasoning. Even without people having any privileged access to the internal operations of perceptual modules, these modules can be systematically altered so as to better serve our high-level reasoning needs. Moreover, perceptually based processes can be altered in a number of ways to closely approximate formally sanctioned computations. To be concrete about mechanisms of perceptual change, we present 21 illustrations of ways in which we alter, adjust, and augment our perceptual systems with the intention of having them better satisfy our needs.

Improving Perception to Make Distant Connections Closer

One of the prime indicators of sophisticated cognition is that it does not rely on superficial resemblances to make connections between situations. Whereas a novice physicist may group scenarios based on surface properties such as whether springs or inclined planes are involved, the expert instead groups problems on the basis of the deep law of physics required for solution, such as Newton’s second law or conservation of energy (Chi et al., 1981). Whereas a child typically connects clouds to sponges via surface features such as “round and fluffy,” a more experienced adult may refer to more sophisticated relations such as “stores, and then releases water” (Gentner, 1988), allowing the adult to see connections among clouds, sponges, cisterns, and reservoirs. Scientists armed with the notion of a negative feedback system can see a resemblance between toilets, heat regulation, and predator–prey dynamics – namely, that each has two variables that are related such that increases to x cause increases to y which, in turn, cause decreases to x (Goldstone and Wilensky, 2008). Even though these scenarios have little in common with one another at first sight, sophisticated cognitive processes unite these situations because they share deep properties that crucially govern their behavior.

One moral that could be drawn from these examples is that perceptual resemblances must be cast aside if one is to procure the sophisticated categories and inferences of a scientist, mathematician, or domain expert. This is precisely the moral drawn by Quine (1977) when he wrote, “I shall suggest that it is a mark of maturity of a branch of science that the notion of similarity or kind finally dissolves, so far as it is relevant to that branch of science. That is, it ultimately submits to analysis in the special terms of that branch of science and logic” (p. 160). The sort of example that Quine has in mind is a natural kind such as gold. Prior to the discovery of atomic elements, observers presumably noticed that several geological samples resembled each other, and used the term “gold” to refer to the collection of similar objects. However, once the elemental composition of gold was identified, surface features like “yellow,” “malleable,” and “shiny” were no longer necessary for identifying an object as gold. Advantages of supplanting these surface features with the chemical feature “atomic number 79” are that the chemical feature offers the promise of a scientific causal account for why gold has the surface features that it does, and it provides a way of excluding objects like pyrite (“fool’s gold”) from the category of gold despite its possession of some of gold’s surface features. Perceptual resemblances can be misleading, and a sophisticated cognizer learns when to disregard these resemblances.

Another possibility is that perceptual resemblances are not fixed, and that we may adapt our perceptions so as to better support the requirements of categories and inferences that are important for us. Another way, then, of becoming a sophisticated cognizer is to modify one’s perceptual processes to generate categories and inferences that are consonant with those that are formally sanctioned. In what follows, we first describe empirical evidence that people can and do change their perceptual processes in this way. We then describe mechanisms for this perceptual plasticity, with a particular eye toward exploring the cognitive penetrability of these perceptual adaptations.

Making Distant Connections

The examples of high-level cognition described above have a commonality – they all involve making connections between apparently distant scenarios, and/or splitting apart apparently similar scenarios. For example, an informed chemist connects a gold nugget to liquid gold dissolved in an alkaline solution, and differentiates it from pyrite. One way to draw inferentially productive yet distant connections is to equip oneself with an appropriate theory. This is the approach pursued by Quine (1977; see also, Goodman, 1972). A recently growing body of psychological evidence indicates a second way that is grounded in perception and action. Researchers in language, transfer, analogy, and cognition have found cases of people drawing connections between situations that do not seem to be superficially related. Much of this research has been associated with embodied and grounded cognition, an approach that argues that cognition is grounded in perception and action processes, rather than being associated with purely formal, amodal processing (Barsalou, 2008). This is an intriguing connection because of the prima facie tension between grounded accounts of cognition and connections being drawn that are not supported by perception. If cognition is inherently grounded in perception, then how are these superficially distant connections being made?

Implicit Analogical Transfer via Perceptual Priming

One possible answer is provided by an experiment on transfer of learning by Day and Goldstone (2011). Their participants interacted with two systems that were superficially dissimilar, but both required participants to apply forces that either reinforced or opposed the system’s natural resonance. The first scenario (see Figure 1) featured an oscillating ball suspended between two vertical poles by rubber bands. If the ball is displaced to the right of center, then the red rubber band on the left will pull the ball back to the left. If the ball is displaced to the left, then the blue rubber band on the right will pull the ball back to the right. Given the absence of friction in the system, any perturbation of the ball’s horizontal position leads to an undamped oscillation. The participants are able to apply a rightward force to the ball via a fan positioned on the left side of the apparatus and facing to the right. By timing when the fan is turned on, the participants’ task is either to stabilize the ball at the apparatus’ midpoint without movement, or to get the ball to reach the extreme right side of the apparatus, as indicated by the checkered triangle in Figure 1. A Flash implementation of the simulation can be accessed at http://cognitrn.psych.indiana.edu/complexsims/Oscillatingball.html. To solve the stabilize task, the participants should turn on the fan whenever the ball is moving to the left, regardless of the ball’s horizontal position. To solve the extremitize task, the participants should turn on the fan whenever the ball is moving to the right, so as to reinforce the ball’s own movement.

Figure 1. Two superficially dissimilar scenarios instantiating the same principle of reinforcing forces in a resonating system, as studied by Day and Goldstone (2011).

After exploring this first simulation for several minutes, participants are given a second task without any indication of its relation to the first task. In this task, participants assume the role of mayor of a city. Whenever the population of the city is higher than 500,000, there is an intrinsic tendency for the population to decrease because of overcrowding, traffic jams, and expensive housing. Whenever the population is less than 500,000, there is a tendency for the population to increase because of living ease and inexpensive housing. Participants are given one of two goals as mayor: to stabilize the population at 500,000 citizens without fluctuation, or to make the population reach 1,000,000. To achieve these goals, participants can strategically deploy “media campaigns.” At the beginning of each discrete year of the simulation, participants decide whether they will initiate a media campaign that adds a positive constant to the natural annual change (velocity) of the population.

The two tasks are isomorphic systems, governed by the same equation: velocity_{t+1} = velocity_t + C × (midpoint − position) + F, where C is a constant, and F is the force that the participant strategically adds. There is a rigorous analogy in which the ball’s position corresponds to the size of the population, the velocity of the ball corresponds to the year-to-year change in population, and turning on the fan corresponds to initiating a media campaign. Participants demonstrated sensitivity to these correspondences because they solved population problems more quickly when they were preceded by a congruent version of the ball task. That is, when both tasks involved stabilize goals, or when both tasks involved extremitize goals, solutions were found more quickly than when one task required stabilization while the other required extremitization.
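The shared update rule and the two strategies described above can be illustrated with a minimal sketch. It is not the simulation used by Day and Goldstone (2011): the parameter values (`c`, `force`, `midpoint`, `start`), the function name `simulate`, and the position update (position advances by the new velocity on each step) are assumptions chosen only for illustration.

```python
def simulate(goal, steps=1000, c=0.01, force=0.0002, midpoint=0.5, start=0.3):
    """Iterate velocity_{t+1} = velocity_t + C * (midpoint - position) + F.

    goal: "stabilize"   -> fan on whenever the ball is moving left
          "extremitize" -> fan on whenever the ball is moving right
          anything else -> fan never on (free oscillation)
    All numeric defaults are illustrative assumptions, not values from the study.
    """
    position, velocity = start, 0.0
    history = [position]
    for _ in range(steps):
        if goal == "stabilize":
            fan_on = velocity < 0      # oppose leftward motion
        elif goal == "extremitize":
            fan_on = velocity > 0      # reinforce rightward motion
        else:
            fan_on = False
        f = force if fan_on else 0.0   # the fan can only push rightward
        velocity = velocity + c * (midpoint - position) + f
        position = position + velocity  # assumed position update
        history.append(position)
    return history


stable = simulate("stabilize")
extreme = simulate("extremitize")
free = simulate("none")

# Largest late-run distance from the midpoint under each policy.
late_spread = {
    name: max(abs(p - 0.5) for p in run[-100:])
    for name, run in [("stabilize", stable), ("free", free)]
}
```

Read as the population scenario, `position` would stand for population size (rescaled), `velocity` for the annual change, and `fan_on` for the decision to run a media campaign; only the labels change, not the code.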

Interestingly, positive transfer between congruent simulations was found even when participants did not see any connection between the simulations, and could not correctly draw the correspondences listed in the previous paragraph. Similarly, when the correspondences were explicitly pointed out to participants, this did not increase the difference between congruent and incongruent conditions when performing the transfer task. In fact, the advantage of congruent over incongruent simulations was equally large when participants did versus did not demonstrate an understanding of the valid correspondences between scenarios. The observed transfer seems to be mediated by implicit priming, rather than strategic application of explicit schemas. The transfer also appears to be perceptually grounded because swapping the side of the fan from the left side (facing right) to the right side (facing left) eliminated transfer. Our interpretation of this effect is that people naturally understand population as a variable that goes from small values on the left to large values on the right, recruiting space to understand the numeric variable of population. Transfer is found only when the spatial relations in the ball scenario naturally align with the spatial interpretation of population.

The observed successes and failures of transfer across the ball and population scenarios point to both the power and fragility of perceptually grounded representations. These representations have the power to bridge across scenarios from different domains and with different interfaces, graphical elements, and timings. However, they are also fragile in that they depend upon the preservation of spatial relations that are not intrinsic to the underlying formal equations. The answer provided by these experiments to the question “If cognition is inherently grounded in perception, then how can connections be made between superficially dissimilar domains?” is that people naturally and automatically translate scenarios that are not directly spatial into spatial representations, and perceptual priming can occur between these transformed representations. In fact, perceptual priming can provide a vehicle for transfer even when more explicit, strategic avenues to transfer, such as abstract schemas (Gick and Holyoak, 1983; Detterman, 1993) or mathematical formulae (Ross, 1987) fail. Perceptual priming is effective for linking superficially dissimilar situations because people are habitually reinterpreting situations and translating them into (recently) familiar, frequently spatial, representations.

While the observed transfer apparently derives from spatial and dynamic representations, transfer is not always maximized by presenting a situation with its most intuitive embodiment. In fact, Byrge and Goldstone (2011) provide evidence that transfer from the ball to the population situation is fostered by decoupling one’s manual interaction with the ball simulation from its underlying resonance dynamic. The relatively unintuitive act of moving a switch to the left to make the fan blow rightwards results in better transfer to the population task than when one’s manual direction of motion is congruent with the fan’s direction of force. The problem with incorporating highly intuitive perceptions and actions into a simulation is that people’s knowledge of the simulations may become too closely tied to these groundings. If the subsequent situation does not share these groundings, then an opportunity for transfer may be missed. This result is consistent with earlier results showing that idealized, but still spatial, representations can produce particularly transferable knowledge by loosening the dependency between one’s understanding of the principle and one’s appreciation of the particular training domain (Goldstone and Sakamoto, 2003; Son et al., 2008; Son and Goldstone, 2009). Together with results suggesting that some action congruity effects are mediated by subjective construals rather than low-level bodily actions (Markman and Brendl, 2005), these results speak against naïvely assuming that more intuitive embodied representations will always yield superior transfer.

Other Cases of Grounded but Superficially Distant Connections Being Made

The above case study of cross-situational transfer that is grounded but nonetheless distant is not altogether unique. Other researchers have found examples of implicit transfer between structurally related situations despite a lack of conscious appreciation of the connection between the situations. People can solve a problem involving an “inhibition” strategy more quickly when another superficially dissimilar problem requiring inhibition was seen the previous day, even when they do not report noticing the relationship between the tasks (Schunn and Dunbar, 1996). Likewise, Gross and Greene (2007) have reported that the global structural relationships within a set of items (e.g., transitive or transverse relationships) may be transferred to a new set without participants’ awareness. As a final example, structural relations involving relative clauses and scoping have been shown to transfer from mathematics equations to written sentences (Scheepers et al., 2011). Transfer across these kinds of situations has been modeled by relational priming using automatic spreading activation in neural networks (Kokinov and Petrov, 2001; Leech et al., 2008). Some results suggest that relational priming is not always automatic, but rather requires that people engage in cognitive processing that is sensitive to relations (Spellman et al., 2001). In any case, these situations provide examples of transfer across apparently dissimilar entities that reveal natural ways for people to construe their world. As with the earlier ball–population example, a cross-situation connection is forged because it does not require the cognizer to explicitly put the connection into words or equations, but rather only requires the same, grounded system to be recruited in different situations.

Another example of this generalization-by-conservation-of-systems mechanism is Hills et al.’s (2008, 2010) study of exploration and exploitation actions. They hypothesized that many situations fundamentally feature a decision about how much to explore new options versus exploit the options previously explored, and that there could be transfer across tasks that involve similar choice points along this tradeoff. To test this, they gave participants an initial task requiring them to forage for spatially distributed resources that were either clumped in discrete clusters or scattered. In a second task, participants came up with as many words as possible by rearranging sets of letters, exchanging old sets for new when they believed that they had effectively exhausted the potential words from their current set. Participants who foraged for scattered resources tended to exchange letter sets more often than participants who foraged for clustered resources, consistent with the idea that training in a task that promotes exploration leads people to more exploratory behavior in a second task. Hills et al. speculate that this cross-task transfer may be mediated by dopamine. When clustered resources are present in the foraging task, dopamine may be released as regions of highly concentrated resources are found. Dopamine is associated with an increased tendency to exploit currently known options. If this is the case, then the observed cross-task priming may be due to increased levels of dopamine in the clustered resources condition that simply remain active during the word formation task, leading to greater perseveration with a given letter set (exploiting known options). By this account, even without participants consciously appreciating that both tasks involve decisions to explore or exploit (post-experimental interviews indicated that participants did not explicitly make this connection), neural underpinnings are sensitive to the amount of exploration and exploitation required for a task, and transfer is simply a form of priming via shared task requirements.

Regardless of whether the dopaminergic hypothesis is correct, this form of explanation provides a general template for how grounded and embodied accounts of cognition can nonetheless produce surprisingly far transfer. Transfer can seem far to us because we do not have privileged access to the primitive components and parameters underlying our cognitive processes. Our conscious reflection prominently features words and justifications. However, the actual mechanisms that allow us to solve problems presented in computer simulations, recognize that a problem can be solved by inhibition, and decide whether to gamble on a new set of letters may feature other cognitive components. In particular, these components may be more perceptual, spatial, embodied, and diffuse than our reflections suggest. In these cases, transfer only seems far because we are biased to measure distance in terms of verbally expressible schemas. Perception and action provide us with unexpected connections that seem to depend on complex rationales, but this is only because our expectations are based on our consciously available justifications rather than our actual cognitive mechanisms.

Attention to the Visual Objects of Mathematics

Mathematical reasoning is a good place to look for connections between perception and high-level cognition. Mathematics is perhaps the pinnacle of cognitive abstraction. Mathematicians, even more so than physicists and computer scientists, strive to develop theories for increasingly general domains, covering more superordinate categories, and for more universal cases. Any particular mathematical tool, say combinatorics, can be applied to countless domains, ranging from bathroom tiling to lotteries. Much of this generality comes from the application of symbol systems, such as variables, equations, set theory, and predicate logic. These symbol systems confer on their user an ability to transcend the details of a particular domain. Given the critical role that symbol systems play in granting a cognizer distance from a domain, it is understandable that researchers have contrasted symbolic cognition with embodied cognition (Lakoff and Núñez, 2000).

Yet, it is also worth remembering that symbol systems are physical themselves (Newell and Simon, 1976). This is especially true for external symbol systems such as mathematical notation. Rather than pitting symbolic processing against perceptually grounded processes, we have found it productive to understand symbolic processing via perceptually grounded processes. Mathematical notation has changed over the millennia to be easily processed by humans (Cajori, 1928), but in addition, people change over the course of their lifetimes to more effectively manipulate and process mathematical notations. This latter, human, adaptation provides an excellent example of bridging perception and cognition by adapting perception to fit the needs of cognition that is engaged in symbolic processing.

In one line of experiments, we have studied how attentional processes are trained to facilitate algebraic reasoning (Goldstone et al., 2010). In particular, in algebra, there is an established convention of order of operations such that 3 + 4 × 5 equals 23 [3 + (4 × 5)] rather than 35 [(3 + 4) × 5]. The mnemonic PEMDAS provides some of this order, with parentheses – exponentiation – multiplication and division – addition and subtraction operations ordered from highest to lowest precedence. This formal system of operation precedence can be memorized and explicitly invoked when doing mathematics. However, applying explicit rules like this makes strong demands on memory and executive control. A cognitively less strenuous alternative is simply to train our visual attention in a manner that honors order of precedence without explicitly following a rule that specifies the order. In fact, people train their visual attention processes to give higher priority to notational operators that have higher precedence. The operator for multiplication, “×,” attracts attention more so than does the notational symbol for the lower precedence addition operator, “+.” People who know algebra show earlier and longer eye fixations to “×”s than “+”s in the context of math problems (Landy et al., 2008). Even when participants do not have to solve mathematical problems, their attention is automatically drawn toward the “×”s. When simply asked to determine what the center operator is for expressions like “4 × 3 + 5 × 2,” participants’ attention is diverted to the peripheral “×”s, as indicated by their inaccurate responses compared to “4 + 3 + 5 + 2” trials (Goldstone et al., 2010). The distracting influence of the peripheral operators is asymmetric because responding “×” to “4 + 3 × 5 + 2” is significantly easier than responding “+” in “4 × 3 + 5 × 2.” That is, the operator for multiplication wins over the operator for addition in the competition for attention. This is not simply due to specific perceptual properties of “×” and “+” because similar asymmetries are found when participants are trained with novel operators with orders of precedence that are counterbalanced. The results suggest that a person’s attention becomes automatically deployed to where it should be deployed to get them to act in accordance with the formal order of precedence in mathematics.
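To make the formal rule concrete, the following sketch contrasts evaluation that honors operator precedence with naive left-to-right evaluation. It illustrates only the explicit rule, not the attentional mechanism; the function names and the flat token-list format are assumptions introduced for this illustration, and "*" stands in for “×.”

```python
import operator

# Precedence level and function for each operator (higher level binds first).
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def evaluate(tokens):
    """Evaluate a flat expression such as [3, '+', 4, '*', 5] by precedence."""
    tokens = list(tokens)
    for level in (2, 1):              # handle high-precedence operators first
        i = 1                         # operators sit at the odd positions
        while i < len(tokens):
            prec, fn = OPS[tokens[i]]
            if prec == level:
                # Collapse "left op right" into a single value in place.
                tokens[i - 1:i + 2] = [fn(tokens[i - 1], tokens[i + 1])]
            else:
                i += 2
    return tokens[0]


def left_to_right(tokens):
    """Evaluate the same expression while ignoring precedence."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]][1](result, tokens[i + 1])
    return result


print(evaluate([3, "+", 4, "*", 5]))       # 23
print(left_to_right([3, "+", 4, "*", 5]))  # 35
```

The outer loop of `evaluate` visits the multiplication and division operators before any addition or subtraction, which is loosely analogous to attention being drawn first to the “×” signs.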

Blind and Myopic Flailing

Thus far, our argument has been that cognitive processes grounded in perception and action can still lead to surprisingly distant connections being made, because our sense of surprise is disproportionately based on our explicit rationales. Furthermore, we train our perceptual processes so that they better serve the needs of high-level cognition. The ability of our perceptual system to support far or “smart” transfer is further enhanced because of this training.

At this point, we must dispel a certain tension between the two planks of this argument. On the one hand, we are arguing that we do not have privileged access to the perceptual and grounded processes that underlie our own cognition, and so we do not realize that seemingly dissimilar ball and population simulations intrinsically involve similar force- and space-based representations, or that foraging for spatial resources and finding words involve similar processes that mediate the explore–exploit tradeoff. On the other hand, we are also arguing that we train our perceptual processes to achieve apparently more sophisticated outcomes. A critic might well press us to say, “Which way is it? Do you think we have access to the perceptual processes that underlie our cognition? If not, how can we adjust them?”

In defending our simultaneous assertion of both claims, we begin by distinguishing two senses of “so that” in our argument that “we selectively improve our perceptual abilities so that the tasks that we need to perform are performed better.” By one interpretation, “so that” means “with the intention that,” implying that we strategically alter our perceptual abilities. By the second interpretation, “so that” means “with the end result that,” implying that our perceptual abilities are altered naturally through an automatic, non-conscious process. Our primary claim is meant in the spirit of this latter interpretation, although we shall later return to the first interpretation.

Blind Flailing

There is strong evidence from the field of perceptual learning that points to the importance of a learner’s goal on perceptual adaptation. Granted, goals are not everything. Even perceptual information that is irrelevant to a task can become sensitized (Watanabe et al., 2001), even if this information is in the visual periphery and below the threshold for conscious detection (Seitz and Watanabe, 2005). However, there is at least as strong evidence that what is learned and how efficiently it is learned depends on the observer’s task and goal. Even when sensitivity to a line orientation appears to have a relatively early locus of change, in that it does not transfer strongly across eyes or visual regions, it nonetheless depends on the observer’s goals (Shiu and Pashler, 1992). Perceptual sensitization to the orientation of a line is much more robust when it is relevant for the task than when it is irrelevant. When observers are given the same stimuli in two conditions, but are required to make fine, subordinate-level categorizations in one condition and coarser, basic-level categorizations in the other, then greater selectivity of cortical regions implicated in object processing is found in the former condition (Gillebert et al., 2008). As a final example, perceptual discriminations are easier to make at boundaries between important categories for an observer, such as between a /p/ and a /b/ phoneme that would be important for distinguishing “pats” from “bats.” Evidence for this “categorical perception” effect from training studies and cross-linguistic comparisons indicates that it is not just perceptual sensitivities that are driving the categories, but rather the acquired categories are also driving perceptual sensitivities (Goldstone and Hendrickson, 2010). All of these studies show that we get better at making exactly the perceptual discriminations that help us do what we want to do.

A conservative interpretation of these results is that perceptions are changing with the end result that performance improves. Strategic changes need not be implicated to account for the improvements. A mechanism that involves only random variation plus selection suffices. The effective strengths of neuronal connections are constantly varying. If a random change causes important discriminations to be made with increasing efficiency, then the changes can be preserved and extended. If not, the changes will not be made permanent. There may be other more goal-directed processes of neuronal change, but simple random variation with reinforcement that may be internally generated is all that is needed to systematically improve perceptual systems. Although apparently inefficient and “stupid,” the “blind flailing” of random variation plus selection is surprisingly powerful. It features prominently in the theory of neuronal selection (Edelman, 1987), and the development of perception and action systems. In a literal application of flailing, infants often flail their arms around while learning to control them (Smith and Thelen, 1993). The flails that are relatively effective in moving the arms where desired are reinforced, allowing an infant to gradually fine-tune their motor control.

The blind flailing observed during perceptual learning can be fruitfully compared to the role of randomness in genetic algorithms. Genetic algorithms mimic some aspects of natural evolution to solve high-dimensional and difficult search problems by employing random variation and selection (Holland, 1975; Mitchell, 1996). A pool of random candidate solutions is initialized by encoding solutions in the chromosomes of individuals. The fitness of each individual’s solution is assessed, and then a new generation of solutions is formed by recombining, and adding random mutations to, the previous generation’s solutions. Over several generations, genetic algorithms are often able to produce close-to-optimal solutions to difficult search problems. We are not arguing that genetic algorithms provide, in general, an accurate account of human cognition. Rather, we introduce genetic algorithms as a strong example of what blind flailing can achieve by way of macroscopically systematic progress.

For our current purposes, the important feature of genetic algorithms is that manipulations at one level, the chromosome of an individual solution, are then tested at a higher level that is effectively blind to the specific chromosomal changes that have been made. Selections of individuals are made on the basis of the results of these higher level tests. For example, a genetic algorithm might be applied to solving the traveling salesperson problem (TSP) for a given arrangement of destinations, such that the destinations are visited in a sequence that minimizes the total distance of the journey. Solutions could be encoded in an individual’s chromosome as the sequence of destinations, such as “1 2 3 4 5 6” or “6 4 5 2 3 1.” Mutations could involve swapping pairs of destinations, transforming “6 4 5 2 3 1” into “4 6 5 2 3 1.” Each solution can be assigned a fitness based on its total path distance (shorter paths being fitter), assuming that the distances between every pair of destinations are known.
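A minimal sketch of this scheme is given below. It is a simplification rather than a full genetic algorithm: it uses only swap mutation and keeps the fitter half of each generation, omitting recombination. The population size, number of generations, destination coordinates, and the choice to score a closed tour that returns to its starting point are all assumptions made for illustration.

```python
import math
import random


def tour_length(tour, coords):
    """Total distance of a closed tour (assumed to return to its start)."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def swap_mutation(tour, rng):
    """Blind variation: swap two randomly chosen destinations."""
    i, j = rng.sample(range(len(tour)), 2)
    child = list(tour)
    child[i], child[j] = child[j], child[i]
    return child


def evolve(coords, pop_size=30, generations=200, seed=0):
    """Random variation plus selection over destination sequences."""
    rng = random.Random(seed)
    n = len(coords)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    initial_best = min(tour_length(t, coords) for t in population)
    for _ in range(generations):
        # Selection sees only the fitness score, never what a mutation changed.
        population.sort(key=lambda t: tour_length(t, coords))
        survivors = population[: pop_size // 2]
        children = [
            swap_mutation(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    best = min(population, key=lambda t: tour_length(t, coords))
    return best, initial_best


# Hypothetical destinations, indexed 0..5 rather than 1..6.
coords = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 2), (2, 1)]
best, initial_best = evolve(coords)
print(best, tour_length(best, coords), initial_best)
```

Because the survivors are carried forward unchanged, the best tour in the population can never get longer from one generation to the next, even though no step in the loop knows why a particular swap helped.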

Importantly, a system like this evolves fitter low-level chromosomal representations based on some other system’s (e.g., the evaluator of fitness) feedback, without ever having an explicit mapping of how those low-level representations produce a good high-level result. For the TSP, it is easy to point to exactly such a mapping – namely the function that takes a sequence of destinations and produces a total distance. However, in this case, the mapping is possessed by only one system – the high-level evaluation. More generally, the mapping between low-level chromosomes and high-level evaluations may be opaque or non-existent. When a male peacock is selected for its ability to attract females, then the mapping between the chromosome’s coding of the male’s body and its environmental fitness is not possessed by any single system, and is highly non-linear if it exists at all. Despite the potentially unknown/unknowable status of the mapping, it is nonetheless possible to evolve increasingly fit peacocks and TSP solutions. Similarly, blind flailing in the form of random changes to perceptual systems, combined with feedback on the changes’ outcomes that is used to shape selection, can lead to systematic improvement to our perceptual processes. Both natural and artificial evolution give us strong precedents for the idea that short-term, blind flailing can lead to systematic improvement over a longer time course. Even if we completely lacked the ability to strategically refine our perception, our perceptual systems could still reliably adapt to become more congruent with the needs of high-level cognition.

Myopic Flailing

A conspicuous disanalogy between perceptual learning and evolutionary algorithms is that perceptual learning occurs within sentient agents. There is no strategic agent that looks down upon evolutionary processes with the aim of increasing their efficiency by directing evolution in particular directions¹. However, in the case of people, we may be interested in tweaking our perceptual system so that the tasks we need to perform are performed better. Now, “so that” is being used in its other sense of “with the intention that.” Merely desiring that our perceptual systems perform better provides no guarantee that they will do so. In fact, there are arguments suggesting that they will not cooperate with our intentions. Within cognitive science it is common to argue for the cognitive impenetrability of perception – the notion that what we perceive is not influenced by our beliefs, goals, or experiences (Pylyshyn, 2006). A classic example is that the two lines of the Müller-Lyer illusion continue to look unequal even after we have just measured them, and hence know that they are the same length. The claim for the cognitive impenetrability of perception is consistent with the notion of perceptual modules – that perceptual processes are generally structured such that we have access to their outputs, but no ability to adjust their internal workings (Fodor, 1983).

However, humans are impressively resourceful, and we have found a number of ways of affecting our perception at many different levels of processing. People purposefully “hack” their perceptual systems in order to facilitate performance. Oftentimes, these hacks are still flailing, but they are not completely blind, merely myopic. To better appreciate the resourcefulness with which people manage to change their perceptual systems in order to accomplish tasks that they would otherwise have difficulty accomplishing, we outline a variety of pertinent cases. These cases illustrate mechanisms by which we alter, adjust, or augment our perceptual abilities through our intentional actions.

(A) Changing our perceptual equipment

(1) Cupping one’s hands behind one’s ear to allow us to hear better in a particular direction.

(2) Pushing the skin around one’s eyes to deform the eye’s shape to make an image sharper.

(3) Clamping one’s jaws tight to make one’s ears less sensitive to noise.

(4) Arranging our fingers so as to create a small aperture in front of our eye with the intention of creating a sharper image of an object.

(B) Strategically employing perceptual equipment

(5) When wine tasting, sloshing the wine around one’s mouth so that it covers more taste buds, also sucking in a bit of air to make more molecules airborne, thus intensifying olfactory response.

(6) In a Stroop interference task, purposefully squinting one’s eyes to facilitate ignoring the word that the colored ink forms.

(7) Explicitly reminding oneself to assess the characteristics of clarity, cut, carat, and color when judging the quality of diamonds.

(8) Looking at a dim star not directly but in the periphery of one’s eyes, where the concentration of rods is greater, and hence one’s ability to detect faint light is greater.

(9) When trying to see a pass-through rather than bounce event in the ambiguous apparent motion sequence shown in Figure 2, track with one’s eyes a ball moving persistently from left to right.

(C) Perceptual learning

(10) Self-exposure to important stimuli. For instance, the communal collection, publication, and distribution of sets of “interesting” and “non-interesting” results from cloud chamber experiments, in order to train new observers (Galison, 1997).

(11) When trying to learn the distinction between monarch and viceroy butterflies, explicitly juxtaposing pairs of the butterflies to exploit the benefit of simultaneous comparison and contrast.

(12) Giving oneself spaced, rather than massed, practice when trying to learn the difference between two species of mushrooms, so as to increase the impact on learning of each presentation.

(13) Purposefully exposing oneself to different speakers and syllables when trying to learn a difficult speech sound discrimination such as high-rising versus low-dipping tones in Mandarin for native English speakers or /r/ versus /l/ for native Japanese speakers.

(14) Training baseball batters to read numbers painted on baseballs to improve their ball tracking ability.

(15) Placing paintings on the walls of a baby’s room if one wishes for the baby to later have an easier time identifying and distinguishing the paintings.

(D) Creating new perceptual objects to emphasize important properties

(16) Using Venn diagrams to determine the different possible combinations for three binary variables.

(17) Rewriting a math equation, spacing notational elements farther apart if they have a relatively low order of precedence, to promote solving it correctly.

(18) Drawing a graph to better understand the nature of a three-way interaction from a psychology experiment.

(E) Creating physical tools to allow us to perceive better

(19) Creating a telescope to view other planets.

(20) Putting ink on a ball before rolling it, so as to better inspect its trajectory.

(21) Creating a cloud chamber to view the trajectories of sub-atomic particles.

(22) Installing a cochlear implant to restore hearing to a deaf individual.

Figure 2. Five frames of an ambiguous apparent motion sequence. The two balls can be seen either as passing through each other or as bouncing off one another.

To be sure, not all of these examples are violations of cognitive impenetrability. Examples 11–13 are cases of an observer’s goals influencing their perceptual categorizations. It could be argued that they are not relevant, though, to cognitive penetration because the goals are long-term rather than acting on-line during the processing of a single stimulus. We would argue, however, that these kinds of perceptual changes are more influential exactly because they are long-term and chronic, and the perceptual change becomes automatic once acquired (Shiffrin and Lightfoot, 1997). If we restrict the influence of goals to only interactive and on-line influences, then we systematically ignore the large class of situations in which we change the feed-forward characteristics of a perceptual system to make it more efficient for meeting our goals.

Examples like 9 are interesting because motion perception has been singled out as one of the strongest cases for a modularized perceptual system, with well-defined computational accounts (Ullman, 1979) and localized brain regions (e.g., area MT). The fact that one’s goals can change the motion that is subjectively perceived is compatible with motion perception being highly modular. Either people can systematically adjust the inputs to their perceptual apparatus to alter the computation of motion, or the parameters governing the computation of the object correspondences underlying motion perception (Dawson, 1991) can themselves be tuned by goals. There are numerous examples of such tuning being necessary to account for the influences of knowledge and context on motion perception (Palmer, 1999). Just because something is highly modular does not mean that it performs its function without variation or context-sensitivity. In the same way that a function or subroutine can take arguments that affect the computations performed within it, the computations within even a completely opaque perceptual black-box can be modulated, and if the perceptual module is to be responsive and robust, it must be.

The mechanisms described above for changing perceptions have been organized into five categories. The intention is not so much to draw sharp distinctions between these categories as to draw parallels across the categories. For example, we suspect that few people would naturally consider the mechanisms of (E) (except 21) to be perceptual changes at all. However, we see these mechanisms as comparable to some of the mechanisms of (A). Cupping one’s hands behind one’s ears seems importantly similar to building a telescope. They both extend the normal range of one’s sensory organ. It seems less important that one extension is achieved by natural, bodily means, while the other is achieved by an inorganic tool. Likewise, there are strong parallels between the mechanisms of (D) and (E). We believe that creating perceptual tools like Venn and Feynman diagrams can be understood as deeply related to creating physical tools that extend our sensory organs (Landy and Goldstone, 2005). A powerful new spatial representation changes how things look just as surely as a microscope does. Compelling examples have been empirically described for how diagrams help thinking by promoting new ways of perceiving. Providing a static diagram may help people see what two seemingly dissimilar instantiations of a “convergence schema” share (Gick and Holyoak, 1983), and if a dynamic animation showing convergence is provided, then even greater transfer is achievable (Pedone et al., 2001). Cheng’s (2002) analysis of diagrams points to a suite of desirable properties of diagrams that allow them to serve as effective “cognitive prostheses”: (1) they combine globally homogeneous with locally heterogeneous representations of concepts, (2) they integrate alternative perspectives, (3) they allow for expressions to be easily manipulated, and (4) they support compact and uniform procedures.

A new spatial representation does not always need to be physically instantiated to prove effective, once it has been internalized. The benefits of Venn diagrams, once understood, can be secured even when they are only internally generated. As useful as it is to offload cognitive tasks onto the environment (Clark, 2009), it is often equally useful to internalize physical transformations. For example, one of the striking effects of learning the formalisms and diagrams for Signal Detection Theory is that they can become so well internalized that their possessor spontaneously sees connections between doctors diagnosing cancers and farmers determining which melons to ship, even when the learner does not prepare any external representation (Son and Goldstone, 2009). More generally, one of the best hopes for schooling is that students will learn new, habitual ways of seeing their world as a result of their formal education. Students will learn to see their world through the tools they have acquired.

The term “myopic flailing” is meant to be contrasted with the “blind flailing” of genetic algorithms and natural evolution. Myopic flailing conveys that people can educate their perceptions more efficiently than would be expected from pure random variation, even though their manipulations are less direct and straightforward than they would be if they could access and manipulate all aspects of the perceptual module. The classic chick-sexing expertise study by Biederman and Shiffrar (1987) provides a good context for appreciating myopic flailing. In this study, novice participants were given a single page of instructions on how to categorize day-old baby chickens, which elevated their performance at sexing chicks from photographs from slightly above chance to approximately that of experts with 24 years of experience sexing chicks. The novices’ improvement with less than an hour of training is striking. It is highly unlikely that this improvement is mostly due to perceptual learning. Most cases of perceptual learning are characterized by slow and protracted learning over the course of weeks or years (Shiffrin and Lightfoot, 1997; Goldstone, 1998). Perceptual learning is an example par excellence of the adage that “wisdom cannot be taught.” One cannot simply read a text-based book that has no illustrations if one wishes to become an expert dog show judge, gymnastics coach, wine taster, or umpire. One needs experiences to change one’s perceptual system.

However, it would also be a mistake to completely ignore the beneficial influence of instructional words and verbal justifications. In the case of the chick sexing study, the rapidity of learning suggests that the novice participants already had firmly in place the perceptual building blocks needed to understand and follow the instructions, which featured phrases such as “look for two large cylindrical side lobes near the bottom of each picture” and “Male chicken genitals tend to look round and fullish like a ball or watermelon.” This is a case of adaptation that is so clear-sighted that it does not qualify as “flailing” at all. In many cases of perceptual training, the accompanying words are not so directly actionable, but neither are they completely irrelevant. These are the cases where perceptual adaptation is best understood as operating via myopic flailing. Consider, for example, a radiologist instructing her students on how to distinguish between sarcoidosis and pulmonary alveolar proteinosis by looking for fissural beading versus a diffuse mosaic ground glass paving pattern without fibrosis. These perceptual features require months or years of training to develop. It is unlikely that a simple page of imageless instructions will ever suffice for their instruction, and medical schools have converged on training disease identification through a combination of describing bodily appearances and explaining causal bodily mechanisms. Features like “ground glass,” “fibrotic,” “paving pattern,” “mossy,” “ulcerated,” and “pustulated” are not immediately understandable, and developing an operational understanding of them practically necessitates undergoing perceptual training by witnessing cases. However, the words are nonetheless useful for focusing one’s attention on different aspects of a disease, such as its spatial distribution, color, arrangement, tactile feel, and texture. The words do not directly alter the internal workings of perceptual modules, but they do lead to more effective learning than pure random selection. They provide myopic support for tuning perceptions.

A characteristic of many forms of expertise is that the expert has both a highly precise verbal vocabulary and an ability to perceptually parse objects from their domain in a coherent and expressive manner. These two characteristics are correlated because, we believe, each informs the development of the other. In most cases, words cannot replace experience for teaching perceptual skills, but they can facilitate perceptual skill learning, as anybody who has tried to learn to distinguish poisonous from edible mushrooms in a completely word-less, instruction-less, and inductive fashion would attest (in the unlikely event that they lived long enough to do so).

Conclusion

There is little, if any, gap between perception and high-level cognition because perceptual systems adapt to fit the needs of high-level cognition. These adaptations may be either the result of random variation or more directed tuning. A person gaining experience with the world also acquires more knowledge about how low-level, physical transformations affect high-level cognitive outcomes. For this reason, blind flailing generally gives way to varying degrees of guided tuning through learning. Babies have difficulty even tracing the edges of a high-contrast object with their eyes. A psychophysicist studying color can separately isolate the saturation and brightness levels of an object. Most adults fall somewhere in between these two points, having intermediate-level access to visual properties. Once a visual property has been isolated, it can then be strategically tuned. Before a person has learned to isolate saturation from brightness, it is difficult or impossible for them to selectively attend to just one of these dimensions (Goldstone and Steyvers, 2001). Afterward, they have strategic control over which dimensions they will use for a particular purpose. Thus, people not only learn to attend to perceptual dimensions to address their needs; they also learn how to learn to attend to dimensions. This meta-learning represents the transition from a relatively uncontrolled, random search for a method to improve perceptual processing to a relatively controlled and guided one.

Perceptual learning, and perceptual learning learning, serve to increase the sophistication of our perceptual processes. The result is that people’s perceptual processes can support what appear to be long-distance connections requiring formal abstractions. The primary advantage of long-distance connections that are based on perceptual rather than formal symbolic processes is that they are more likely to exist! Formalisms provided by mathematics and logic are typically cognitively inert unless they are grounded in perceptual processes. They are inert in the sense that people are unlikely to realize that two situations are governed by the same formalism, unless they are given a hint to connect the situations. As such, these connections are not likely to be made through application of formalisms.

The promise of making connections based on learned perceptual properties is that the connections can be automatically forged because they are perceptual, but they can nonetheless be sophisticated because they are learned. Strategies and goals shape perceptual learning via “myopic flailing” (see methods 10–15), but importantly, once the learning has transpired, it is automatically deployed during perception. Even when perceptions cannot be semi-permanently changed via learning, the other methods described above offer ways of manipulating perceptions so as to overcome some of their limitations.

This perspective on achieving sophisticated reasoning through perceptual manipulation can be contrasted with the Quinean approach of trumping perception by higher level reasoning, rules, and the application of definitions. In practice, both kinds of processes must occur. Determining the causes and consequences of each process would constitute a fertile research program, with perhaps even neural correlates. For example, for cases in which perceptual processes are trumped by rules, we might expect frontal cortex to exhibit heightened activity, and to actively inhibit more posterior perceptual regions. In contrast, when perceptual processes are adapted to subserve formal thought, then posterior cortical regions may assume particular importance. This decomposition into modules is roughly compatible with empirically observed neural supersessions – cases in which controlled, initial performance is governed by different neural populations than subsequent automatic processing (Procyk et al., 2000). For example, when a monkey first learns to associate a novel stimulus with a response, some cells within the supplementary eye fields (SEF) of the dorsal–medial surface of the frontal lobe are highly active, but become decreasingly active with repetition of the stimulus. Other cells show the opposite tendency, becoming increasingly active as the response to a novel stimulus is learned (Chen and Wise, 1995). This pattern of complementary controlled and automatic processes fits the account developed above, in that we have argued that controlled processes operate to make themselves obsolescent by modifying perception over a protracted course of training. While our account is similar in some ways to theories positing a split between rule-based versus automatic, association-based reasoning (Sloman, 1996), our account focuses on the development of new perceptual processes rather than simply association learning, and points to ways in which our rule-based system guides and informs the construction of perceptual processes.

One advantage of training over trumping perception is that the opportunities provided by rich and nuanced interpretations available from a highly evolved and trained perceptual system are not relinquished. As a consequence of the automatic and strategic changes to perception, people can perceive connections between balls and city growth (they can both be resonance systems), toilets and hare–lynx populations (they are both negative feedback systems), and a general surrounding a fortress and removing a tumor by concentrating multiple lasers at the tumor (they are both examples of converging forces to overcome an entity). Once bridges are built between these prima facie distant but deeply related situations, knowledge and inferences can freely move from one to the other. Connections can be made between situations that, at first, may not appear related at all, because trained appearances can go far beyond first appearances.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

This research was supported in part by National Science Foundation REESE grant 0910218 and Department of Education IES grant R305A1100060. The authors would like to thank Josh Brown, Lisa Byrge, Paulo Carvalho, Samuel Day, Keith Holyoak, Melanie Mitchell, Jessie Peissig, Luis Rocha, Michela Tacca, and Larry Yaeger for helpful comments.

Footnote

¹ However, in the case of genetic algorithms, there is current research interest in systems that guide evolution by creating new heuristics that will then constrain future fitness evaluation (Burke et al., 2009).

References

Barsalou, L. W. (2008). Grounded cognition. Annu. Rev. Psychol. 59, 617–645.

Biederman, I., and Shiffrar, M. M. (1987). Sexing day-old chicks: a case study and expert systems analysis of a difficult perceptual-learning task. J. Exp. Psychol. Learn. Mem. Cogn. 13, 640–645.