Apparent Motion Can Impair and Enhance Target Visibility: The Role of Shape in Predicting and Postdicting Object Continuity

Lenkic, Peter  J; Enns, James  T

doi:10.3389/fpsyg.2013.00035

ORIGINAL RESEARCH article

Front. Psychol., 01 February 2013

Sec. Consciousness Research

Volume 4 - 2013 | https://doi.org/10.3389/fpsyg.2013.00035

This article is part of the Research Topic: Awareness shaping or shaped by prediction and postdiction

Department of Psychology, University of British Columbia, Vancouver, BC, Canada

Some previous studies have reported that the visibility of a target in the path of an apparent motion sequence is impaired; other studies have reported that it is facilitated. Here we test whether the relation of shape similarity between the inducing and target stimuli has an influence on visibility. Reasoning from a theoretical framework in which there are both predictive and postdictive influences on shape perception, we report experiments involving three-frame apparent motion sequences. In these experiments, we systematically varied the congruence between target shapes and contextual shapes (preceding and following). Experiment 1 established the baseline visibility of the target, when it was presented in isolation and when it was preceded or followed by a single contextual shape. This set the stage for Experiment 2, where the shape congruence between the target and both contextual shapes was varied orthogonally. The results showed a remarkable degree of synergy between predictive and postdictive influences, allowing a backward-masked shape that was almost invisible when presented in isolation to be discriminated with a d′ of 2 when either of the contextual shapes is congruent. In Experiment 3 participants performed a shape-feature detection task with the same stimuli, with the results indicating that the predictive and postdictive effects were now absent. This finding confirms that shape congruence effects on visibility are specific to shape perception and are due neither to general alerting effects for objects in the path of a motion signal nor to low-level perceptual filling-in.

Introduction

When two stimuli are presented in close spatio-temporal proximity we experience a single object in motion. Although such apparent motion is experienced without effort by the viewer, it is only achieved after a number of complex problems have been solved. These include problems of image correspondence (Ramachandran and Anstis, 1986), the relative spatial position of elements (Nijhawan, 1994; Eagleman and Sejnowski, 2000; Krekelberg and Lappe, 2000), and visual masking of one stimulus by the other (Breitmeyer and Ogmen, 2000, 2006; Enns and Di Lollo, 2000). One might reasonably predict from these challenges that a stimulus in motion would be seen less accurately than a static stimulus of similar duration and size. In the present paper, we demonstrate that visibility can sometimes be impaired and at other times enhanced by the relations between stimuli making up the perceptual object in an apparent motion sequence.

Evidence for Prediction and Postdiction in Perception

The role of prediction is emphasized in recent theories of spatio-temporal processing (Nijhawan, 1994; Enns and Lleras, 2008; Mathewson et al., 2010; Roach et al., 2011). As one example of a study of the effect of motion predictability on target visibility, Schwiedrzik et al. (2007) presented a target within various phases of the up-and-down motion path of a secondary stimulus and reported that target visibility was especially reduced when the target coincided with the middle portion of the motion path. In contrast, visibility was increased for targets at the end-points of the path, and when there was only a single preceding motion stimulus or a single following motion stimulus. Schwiedrzik et al. (2007) referred to this impairment as “motion masking,” in keeping with the earlier use of this term by Yantis and Nakama (1998). Similar results have also been reported by Hidaka et al. (2011, 2012), Khuu et al. (2010), and Souto and Johnston (2012).

In another study, Roach et al. (2011) presented pairs of inducer stimuli to the left and right of central fixation, oscillating up-and-down over several cycles. A target Gabor patch was presented in the path of one of these inducers, and its timing adjusted so that it appeared either at the end of the motion sequence or the beginning. The target was also presented either in or out of spatial phase with the inducer. The participant’s task was to report whether the target appeared to the left or right of fixation. The results indicated that target visibility was lowest when the inducing stimuli moved away from the target location and it was highest when it was predictable from both the temporal and spatial phase of the inducer. Thus, contrary to Schwiedrzik et al. (2007), motion predictability was a benefit to target visibility in this task, not an impairment.

Prediction, or forward-going expectation, is only part of what occurs in a motion sequence. Postdiction, or a revisionist history of what has just occurred, also influences the visibility of a target in motion (Di Lollo et al., 2000; Eagleman and Sejnowski, 2000; Lleras and Moore, 2003; see also Kolers and Pomerantz, 1971; Kolers and von Grunau, 1976). The theoretical mechanism for these influences is often referred to as object updating, because the visual system seems to give a revisionist interpretation specifically to perceptual objects, not to the image as a whole (see review by Enns et al., 2009). That is, there is a powerful bias to interpret changes to a scene as the consequence of a single object in motion, rather than as the sudden appearance of unexpected new objects, or as the consequence of a moving background in the context of a stationary single object. This bias offers heuristic benefits to a visual system faced with chaotic input, but at the same time it incurs a cost in certain conditions. The cost is that target features seen at point A in time may be overwritten and rendered less visible, or even invisible, by the target features presented at point B. This is the main idea behind what has come to be called object substitution masking (e.g., Di Lollo et al., 2000; Lleras and Moore, 2003; Moore and Lleras, 2005; Enns, 2008).

The Role of Shape

At what level of representation are the predictive and postdictive mechanisms at work when interpreting an object in motion? Extant theories of how motion relates to target visibility have been described as falling into three camps (Souto and Johnston, 2012). In one camp are researchers who give their participants a detection task (i.e., reporting whether a stimulus is present or absent along a motion path), thereby emphasizing image-level processes. For example, Hidaka et al. (2011) showed that motion path predictability led to a decrement in target detection, and they concluded that motion masking is the result of an early visual interaction between a physical stimulus (the target) and an illusory percept (the interpolated motion path between stimulus inducers). Souto and Johnston (2012) expanded on this idea, reporting that motion masking depended on the targets and inducers sharing the same isoluminant colors. In a second camp, researchers have demonstrated that object-level competition between inducers and target also plays a role in motion masking (Yantis and Nakama, 1998; Liu et al., 2004). These authors demonstrate that more than detection-level processes are involved by giving their participants shape-discrimination tasks. In a third camp, Schwiedrzik et al. (2007) and Roach et al. (2011) go a step further, by arguing that when masking is attenuated by motion path consistency, it demonstrates the role of predictive processes at play, over and above an object-level competition between stimuli.

Although Schwiedrzik et al. (2007) and Roach et al. (2011) show that predictable targets can attenuate masking (i.e., reduce the visibility impairment caused by motion), they do not examine the role of shape consistency between stimuli and inducers, focusing only on spatio-temporal consistency. To be fair, Schwiedrzik et al. (2007) discuss the possibility that the shape dissimilarity between the stimuli in motion and the target may have played a role in the impairments that they and Yantis and Nakama (1998) reported. This way of thinking also raises the possibility that the predictive benefits of Roach et al. (2011) may have occurred because of the greater similarity between inducing and target shapes in their study.

Here we focus on the role of shape continuity in the visibility of a target in an apparent motion sequence. Specifically, we compare the influences that arise from forward-acting (predictive) processes with those that derive from backward-acting (postdictive) processes (see also Hogendoorn et al., 2008). If we find that both processes are at work, we can then ask questions about their relative magnitude and whether they combine in an additive way (indicating independent processes) or interactively (pointing to synergistic processes).

It may also be important to distinguish between previous studies in which the target stimulus was unrelated to the motion inducing stimulus (e.g., Yantis and Nakama, 1998; Khuu et al., 2010), offering greater opportunity for masking, versus those in which the target stimulus was a component of the motion inducing stimulus (e.g., Hidaka et al., 2011). As such, we begin with a study in which the target to be perceived is itself part of the motion sequence.

To address these questions, we designed a target discrimination task in which the effects of a preceding shape and a following shape could be evaluated, first independently (Experiment 1), and then jointly (Experiment 2). We did this by varying the motion congruence between the central target shape and the contextual shapes (preceding, following). To anticipate the results, we report strong predictive and postdictive influences on target visibility, along with a great deal of synergy between these influences.

In a final experiment (Experiment 3) we replicated the essential stimulus conditions of Experiment 2, but asked participants to perform a shape-feature detection task (presence versus absence) rather than a shape-discrimination task. This serves as an important control for the idea that predictive and postdictive processes specific to shape perception are influencing target visibility, as opposed to more primitive alerting processes or image-level processes that boost the gain of all signals in the path. If the processes we are studying are shape specific, we anticipate that continuity in apparent motion will not have the same effect on a target detection task. And again, to anticipate the results, that is what we find.

Experiment 1: Baseline Visibility

To set the stage for a study of target visibility in the context of a three-frame motion sequence, we first compared the visibility of a target shape in isolation, with the visibility of a target either preceded or followed by a single shape. The spatial layout and temporal sequence are illustrated in Figure 1. We also varied the orientation of the preceding and following shapes, so that they were congruent or incongruent with the target. Three additional factors were varied to increase the generality of the findings and to minimize the possibility of strategic factors influencing the results. First, to ensure that target visibility would be measured at more than one level, we varied whether or not a pattern mask was presented immediately after the target and in the same spatial position (Breitmeyer and Ogmen, 2006). Second, we varied the spatial proximity between neighboring shapes at two levels, as this is often a critical factor in target visibility (Breitmeyer and Ogmen, 2006). Finally, the shapes were presented randomly to the right or left of fixation, and motion sequences were also either to the left or the right, so that observers were unable to predict where the shapes would appear and in what context (Enns and Di Lollo, 1997, 2000).

Figure 1. (A) Illustration of the four possible target shapes and the pattern mask in the experiments. Participants reported whether the target had a notch on the left or the right side, regardless of its slant. (B) Illustration of the sequence of events on each trial. (C) Illustration of the displays in Experiment 1. Gray arrows indicate the two possible motion directions on the right side of the screen; equivalent paths were possible on the left side (not shown).

Participants were asked to report the location of a notch in each target shape, which could be either on the right or left side. Note that this task is immune from any decision-based biases arising from the orientation of the preceding or following shapes, or from the relation between these shapes and the target (congruent versus incongruent), since the only shape with a notch was the target, and the notch was equally often on the right or the left of this shape, independent of all other factors.

Method

Participants

Fifteen university students participated in a 1-h session for extra course credit or a $10 payment. All participants had normal or corrected-to-normal vision and were treated according to APA ethical guidelines as administered by the University of British Columbia.

Stimuli and apparatus

Rectangular gray shapes (gray level = 62%) were presented on an LCD monitor with a refresh rate of 60 Hz. The shapes subtended 2.5° × 1° of visual angle, were slanted either 45° or 135° from vertical (i.e., they had a positive or negative slant, see Figure 1A), and were presented on a white background. The pattern masks consisted of six rectangular shapes, as illustrated in Figure 1A, each oriented to differ slightly from the cardinal directions of vertical, horizontal, and oblique. This pattern subtended 2.5° × 2.5° of visual angle. The target shape had a semicircular notch on one side. A fixation cross was centered horizontally on the screen, but positioned 5.5° below the vertical center, so that the shapes were presented above fixation.

The contextual shape that preceded or followed the target shape on some trials was identical to the target in size and luminance, but it did not have a notch, and it was spatially separated by a center-to-center distance of either 2.5° (near proximity condition) or 6.5° (far). The target was always presented 10.5° from the fixation point, but randomly to the left or right, with a positive or negative slant and with a notch randomly removed from its left or right side. The orientation of the preceding and following shapes was either congruent or incongruent with a linear motion trajectory.

The temporal sequence of events is illustrated in Figure 1B, with the target shape and preceding or following shape (when either was present) appearing 100 ms apart (stimulus onset asynchrony). The target had a duration of 33 ms, as did the mask, when present, and the target and mask were separated by an interval of 33 ms.
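As a rough illustration of how these durations relate to the 60 Hz display, the following Python sketch converts the reported times to whole refresh frames. The function names are hypothetical, and rounding to the nearest frame is an assumption; the presentation software is not described in the text.

```python
REFRESH_HZ = 60  # monitor refresh rate reported under "Stimuli and apparatus"


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Nearest whole number of refresh frames for a duration in ms (assumed rounding)."""
    return round(duration_ms * refresh_hz / 1000)


def frames_to_ms(n_frames, refresh_hz=REFRESH_HZ):
    """Duration in ms actually spanned by a whole number of frames."""
    return n_frames * 1000 / refresh_hz


# Durations reported in the text, in ms
reported = {
    "target": 33,
    "target-mask interval": 33,
    "mask": 33,
    "stimulus onset asynchrony": 100,
}

for name, ms in reported.items():
    frames = ms_to_frames(ms)
    print(f"{name}: {ms} ms is about {frames} frame(s), "
          f"or {frames_to_ms(frames):.1f} ms on the display")
```

Under these assumptions, each 33 ms event spans two frames and the 100 ms onset asynchrony spans six frames.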

Procedure

Participants were seated with their eyes 57 cm from the display screen. They were instructed to maintain gaze on the cross near the bottom of the screen, using their peripheral vision to view the shapes. They were introduced to the task with 10 practice trials that had much longer display durations, with feedback on each trial (the words “correct” or “incorrect” appeared at fixation). The experimenter monitored this feedback during the practice trials and provided further verbal instruction when necessary to ensure that participants understood the task.
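At a viewing distance of 57 cm, 1° of visual angle corresponds to approximately 1 cm on the screen. The Python sketch below makes that conversion explicit for the stimulus dimensions given under "Stimuli and apparatus". The function names are hypothetical, and a flat screen perpendicular to the line of sight is assumed; no pixel dimensions are computed because the monitor resolution is not reported.

```python
import math

VIEWING_DISTANCE_CM = 57.0  # viewing distance reported in the Procedure


def size_deg_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """On-screen extent of a stimulus centered on the line of sight."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


def offset_deg_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """On-screen distance of a point at a given angle from the line of sight."""
    return distance_cm * math.tan(math.radians(angle_deg))


# Shape dimensions reported in the text (2.5 deg x 1 deg)
print(f"shape length: {size_deg_to_cm(2.5):.2f} cm")
print(f"shape width:  {size_deg_to_cm(1.0):.2f} cm")
# Target distance from fixation reported in the text (10.5 deg)
print(f"target offset from fixation: {offset_deg_to_cm(10.5):.2f} cm")
```

The small-angle approximation (1° ≈ 1 cm) holds well for the shape sizes but slightly underestimates the on-screen distance at the larger eccentricities.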

Each trial began with a variable onset interval (1400–2200 ms, in 200 ms steps) that began after the participant’s previous response. Participants registered their responses with one of two keys (“w” or “o”) and visual feedback consisting of a green or red colored text message at fixation indicated whether their response was “correct” or “incorrect,” respectively. Trials were presented in a random order, with equal representation of the three conditions (alone, preceding, and following) × 2 notch locations × 2 target orientations × 2 mask conditions. Among the preceding and following conditions, trials were further divided among congruent and incongruent shape relations and close and far proximity conditions. Participants completed a total of 768 trials, divided into eight blocks of 96 trials, with self-paced breaks between blocks.
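One way to lay out a trial list consistent with these totals is sketched below in Python. The repetition counts per cell (32 for target-alone trials, 8 for contextual-shape trials) are not stated in the text; they are inferred here from 768 trials divided equally among three conditions. Shuffling across the whole session, and drawing the screen side independently on each trial, are also assumptions. All names are hypothetical.

```python
import itertools
import random


def build_trials(seed=None):
    """Return 8 blocks of 96 trials under the assumed design described above."""
    rng = random.Random(seed)
    # Factors crossed in every condition: notch side x target slant x mask
    base_cells = list(itertools.product(
        ("left", "right"), ("positive", "negative"), (True, False)))
    trials = []

    # Target alone: 8 cells x 32 repetitions (assumed) = 256 trials
    for notch, slant, mask in base_cells:
        for _ in range(32):
            trials.append({"context": "alone", "notch": notch, "slant": slant,
                           "mask": mask, "congruent": None, "proximity": None})

    # Preceding and following: 8 cells x 2 congruence x 2 proximity
    # x 8 repetitions (assumed) = 256 trials each
    for context in ("preceding", "following"):
        for notch, slant, mask in base_cells:
            for congruent, proximity in itertools.product(
                    (True, False), ("near", "far")):
                for _ in range(8):
                    trials.append({"context": context, "notch": notch,
                                   "slant": slant, "mask": mask,
                                   "congruent": congruent,
                                   "proximity": proximity})

    # Screen side drawn at random on each trial (assumed to be unbalanced)
    for trial in trials:
        trial["side"] = rng.choice(("left", "right"))

    rng.shuffle(trials)
    return [trials[i:i + 96] for i in range(0, len(trials), 96)]


blocks = build_trials(seed=1)
print(len(blocks), "blocks of", len(blocks[0]), "trials")
```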

Data analyses

In order to convert responses into hit and false-alarm rates that are amenable to a signal detection analysis, the proportion of left responses to left-notched targets was counted as the hit rate and the proportion of left responses to right-notched targets was counted as the false-alarm rate, for each participant. These rates were then used to calculate d′, a measure of sensitivity unaffected by response bias. Because proportions of 0 or 1 cause d′ to take on an infinite value, hit or false-alarm rates with these values were replaced with values of 0.01 and 0.99, respectively (Macmillan and Creelman, 1991), which placed a ceiling on d′ of 4.65.
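The sensitivity calculation can be sketched in Python as follows, using the standard definition d′ = z(hit rate) − z(false-alarm rate), where z is the inverse of the standard normal cumulative distribution. The function names are hypothetical and this is a minimal sketch, not the analysis code used in the study.

```python
from statistics import NormalDist


def replace_extreme(rate, floor=0.01, ceiling=0.99):
    """Replace proportions of exactly 0 or 1, which would give an infinite z-score."""
    if rate <= 0.0:
        return floor
    if rate >= 1.0:
        return ceiling
    return rate


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index: z(hit rate) minus z(false-alarm rate)."""
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    return z(replace_extreme(hit_rate)) - z(replace_extreme(false_alarm_rate))


# Illustrative rates only, not data from the study
print(round(d_prime(0.84, 0.16), 2))
# Largest value the replacement rule allows
print(round(d_prime(1.0, 0.0), 2))
```

Equal hit and false-alarm rates give a d′ of zero, which corresponds to chance-level discrimination of the notch location.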

Results

Figure 2 shows target visibility in Experiment 1. Masking was clearly effective in reducing overall visibility, as the mean d′ was 3 with no mask and less than 1 with the mask. Shape congruency also played a large role in target visibility: congruent shape sequences resulted in larger d′ values than incongruent sequences at both levels of masking. The temporal order of the contextual shape also played a large role, with a preceding shape having less of an influence on target visibility than a following shape. Most important, the influence of shape congruence on visibility was greater for following shapes than preceding shapes, with an incongruent-following shape reducing visibility in the no-mask condition (d′ = 1.01) to near the baseline level in the masking condition (d′ = 0.79), and in the mask condition reducing visibility to a d′ near zero (d′ = 0.21). Contextual shapes that were near in proximity to the target generally led to lower levels of visibility (d′ = 1.64) than contextual shapes that were farther away (d′ = 1.85). These observations were supported by the following statistical analyses.

Figure 2. Visibility of the target in Experiment 1, as indexed by d′. Error bars represent ±1 SEM. The asterisks indicate those conditions in which target visibility was significantly reduced relative to the target alone condition.

A repeated measures ANOVA examined the factors of temporal order (2) × congruency (2) × masking (2) × proximity (2). All main effects were significant: temporal order [F_(1,14) = 19.17, p = 0.00063], congruence [F_(1,14) = 105.26], mask [F_(1,14) = 369.07], and proximity [F_(1,14) = 6.52, p = 0.023], as were the two-way interactions of temporal order × congruence [F_(1,14) = 65.40], temporal order × proximity [F_(1,14) = 9.50, p = 0.0081], temporal order × mask [F_(1,14) = 17.28, p = 0.00097], mask × congruence [F_(1,14) = 59.57], and mask × proximity [F_(1,14) = 5.03, p = 0.042]. The only significant three-way interactions were temporal order × congruence × mask [F_(1,14) = 18.09, p = 0.00080] and congruence × mask × proximity [F_(1,14) = 5.37, p = 0.036]. All other effects were not significant (ps > 0.094).

Simple effect tests on the critical temporal order × congruence interaction indicated that, although the congruency effect was much greater in the following than preceding condition, congruent shapes were nonetheless more visible than incongruent shapes in both conditions: [F_(1,14) = 234.70] and [F_(1,14) = 15.08, p = 0.0017], respectively.

Additional comparisons tested whether target visibility in the preceding and following shape conditions was improved or impaired relative to the target presented alone. The asterisks in Figure 2 indicate which of these comparisons were significant, based on a Bonferroni-adjusted family-wise alpha of 0.05. With no mask, only the two incongruent conditions resulted in significant reductions in visibility, preceding [F_(1,14) = 16.53, p = 0.0012] and following [F_(1,14) = 190.86, p < 0.0001]. When the mask was present, only the following incongruent condition showed a significant visibility reduction [F_(1,14) = 21.15, p = 0.0004].

Discussion

These results establish an important baseline for us to explore how prediction and postdiction combine in their influence when a target is seen in the context of a larger motion sequence. In summary, the results show that shape congruence in a motion sequence plays a critical role in influencing the visibility of a target shape, such that when the shapes are congruent, visibility is similar to when the same target is presented briefly in isolation. However, when the shapes are incongruent there is a serious reduction in visibility, with this reduction being much greater for an incongruent shape that follows the target (postdiction based on the incongruent shape impairs visibility) than for an incongruent shape that precedes it (prediction based on an incongruent shape has little consequence).

These results are broadly consistent with previous reports of motion masking (Yantis and Nakama, 1998; Schwiedrzik et al., 2007; Hogendoorn et al., 2008), in that placing a target in a motion sequence can be detrimental to its visibility under some conditions (e.g., when following shapes are incongruent). These results are also consistent with previous reports that backward masking of shape is generally more detrimental to visibility than forward masking (Breitmeyer and Ogmen, 2006). Finally, they are consistent with object updating theory (Enns et al., 2009), which proposes that human vision is biased to process a spatio-temporal sequence of stimuli as the same object translating in space-time. To the extent that this bias is supported by a spatio-temporally consistent motion display (here the congruent condition), the visibility of a target shape in an apparent motion sequence is not impaired.

Experiment 2: Visibility in an Apparent Motion Sequence

In this experiment we measured the visibility of a target shape in a three-frame apparent motion sequence, while varying whether the preceding and following shapes were congruent or incongruent with the overall motion trajectory. By comparing these data with those in Experiment 1, we were able to gauge the extent to which congruency in the two contextual shapes made additive or synergistic contributions to target visibility.