Mentalizing Another's Visual World—A Novel Exploration via Motion Aftereffect

Yuan, Xuefei; Wang, Nanbo; Geng, Haiyan; Zhang, Shen

doi:10.3389/fpsyg.2017.01535

ORIGINAL RESEARCH article

Front. Psychol., 07 September 2017

Sec. Cognitive Science

Volume 8 - 2017 | https://doi.org/10.3389/fpsyg.2017.01535


Xuefei Yuan¹^†

Nanbo Wang¹^†

Haiyan Geng¹^*

Shen Zhang²

¹Beijing Key Laboratory of Behavior and Mental Health, School of Psychological and Cognitive Sciences, Peking University, Beijing, China
²Department of Psychology, University of Wisconsin-Whitewater, Whitewater, WI, United States

Past research on level 2 visual perspective-taking (VPT) has mostly focused on understanding the mental rotation involved when one adopts another's perspective; the mechanisms underlying how the visual world of others is mentally represented remain unclear. In three studies, we addressed this question by adopting a novel VPT task with motion stimuli and exploring the aftereffect on motion discrimination from the self-perspective. Overall, the results showed a facilitation aftereffect when participants were instructed to take the avatar's perspective. Meanwhile, participants' self-reported perspective-taking tendencies correlated with the aftereffect for both instructed and spontaneous VPT tasks when the “to-be-adopted” perspective required the participants to mentally transform their self-body clockwise. Specifically, while facilitation was induced for participants with low self-reported perspective-taking tendencies (e.g., viewing a leftward motion stimulus under another's perspective enhanced subsequent perception of leftward motion from the self-perspective), those with high self-reported perspective-taking tendencies showed an adaptation aftereffect (e.g., viewing a leftward motion stimulus under another's perspective weakened subsequent perception of leftward motion from the self-perspective). For these individuals, the adaptation effect indicated the engagement of direction-selective neurons in the processing of subsequent congruent-direction motion from the self-perspective. These findings suggest that motion perception from different perspectives (self vs. another) may share the same direction-selective neural circuitry, and this possibility depends on observers' general perspective-taking tendencies.

Introduction

Perspective-taking (PT) is the process by which an individual views a situation from another's point of view (Galinsky et al., 2008). It is widely adopted in our daily lives to ensure successful social interactions (Tversky and Hard, 2009). For instance, we use perspective-taking to infer how others feel (emotional PT; Ruby and Decety, 2004; Lamm et al., 2007), to represent what others know (cognitive PT; Ruby and Decety, 2003; Apperly et al., 2004) and to make sense of others' actions and intentions (PT of action; Ruby and Decety, 2001; Jackson et al., 2006). One basic and early-developing form of perspective-taking is understanding the visual experience of another agent, known as visual perspective-taking (VPT). The literature has distinguished two levels of VPT: the ability to infer whether an object is visible from another person's line of sight (level 1 VPT; Flavell et al., 1981), and, of particular interest to us, the ability to recognize that a simultaneously visible object can look different from the different perspectives of the self and another person (level 2 VPT; Michelon and Zacks, 2006).

A considerable amount of research has focused on understanding the nature of level 2 VPT (as opposed to level 1 VPT). These studies usually adopt a paradigm that asks participants to judge the spatial position of an object (e.g., a glove) in disparate scenes (May and Wendt, 2013; Pearson et al., 2013), or to report visual content (e.g., the number “6” or “9,” Surtees et al., 2013b) from the conflicting perspectives of the self and an avatar. An important feature of level 2 VPT is mentally adopting the spatial position of another person (Surtees et al., 2013a). For example, participants' reaction times increased with increasing angular disparity between their own viewpoint and the avatar's, suggesting that participants mentally transform themselves to the avatar's position when performing the VPT task (Michelon and Zacks, 2006; Kessler and Thomson, 2010). Although mentally switching into another's spatial point of view is essential, we believe forming a mental representation of the world from that visual perspective is also integral to level 2 VPT.

Nevertheless, most studies have only focused on “adopting another's position.” For example, it has been found that participants' handedness (Gardner and Potts, 2010), motor experience (Steggemann et al., 2011), and the position of the self within the world (Kessler and Thomson, 2010) modulate the difficulty of mental body transformation. Studies were also conducted to understand how level 2 VPT was modulated by participants' gender, socio-cultural background (Mohr et al., 2013; Kessler et al., 2014), emotion, and mental conditions (such as empathy, anxiety and schizotypy; Thakkar and Park, 2010; Gronholm et al., 2012; Todd et al., 2015). These studies as well as studies that explore neural mechanisms of VPT (e.g., David et al., 2006; Mazzarella et al., 2013), however, did not particularly examine the representation of the visual scenes once another's perspective is taken.

How does one visualize objects in that new perspective? Is it the same process as if s/he experiences the stimulus from the self-perspective? Does visual perception from different perspectives (e.g., another's and self's) share some common psychological or neural mechanisms? The present study aimed to answer these questions and understand the mental representation of level 2 VPT.

It is difficult to address these questions with past paradigms, because these paradigms either directly asked participants to report static visual content (the position of an object or the number “6” or “9”) under the conflicting perspectives of the self and someone else, or indirectly inferred the existence of VPT by demonstrating its interference with visual processing from the self-perspective (Elekes et al., 2016; Surtees et al., 2016). Regardless of approach, participants' reaction times and/or the accuracy of their reports in the VPT task were usually the dependent variables. These gross indexes reflect the entire processing episode but do not provide specific information about the mechanism of mental representation under another's perspective in level 2 VPT.

Instead of static stimuli, we used motion stimuli (i.e., the motion adaptors, see Figure 1A) to examine how VPT affects participants' performance from the perspective of the self in a subsequent motion-direction discrimination task (Figure 1B). This paradigm can address the above research questions because viewing the motion stimuli from the avatar's perspective for a certain time (5 s in our study) may generate different aftereffects on participants' performance, depending on the mechanisms of mental representation under another's perspective. It should be noted that the term “aftereffect” has a very general meaning in this article, referring to any visual effect of viewing motion stimuli. It does not necessarily mean “the Motion Aftereffect,” which usually refers to a motion adaptation effect (Anstis et al., 1998; Huk et al., 2001; Mather et al., 2008).


Figure 1. The illustration of a critical trial in the PT condition of Study 1a. After imagining seeing the moving dots from the avatar's perspective for 5 s (A), the subjects made judgments about the dominant moving direction of the test stimulus (B). The arrows in the figure indicate the moving direction of the dots and were not actually presented in the experiments. An example of a complete trial is illustrated in (C).

That said, the aftereffect could be a motion adaptation effect, an illusion in which, after prolonged viewing of motion in one direction, a stationary or ambiguous dynamic test stimulus appears to drift in the opposite direction (Mather et al., 1998; Winawer et al., 2010). It is caused by adaptation of the corresponding neural circuits, which reduces subsequent processing of motion in the adapted direction (Wark et al., 2007; Webster, 2011). In our study, this could be the case if motion perception from different perspectives (another's vs. the self's) recruits the same direction-selective neural circuitry: the neurons tuned to leftward movement would become less responsive after VPT of a motion that is leftward from the avatar's perspective, and this would weaken one's subsequent processing of leftward motion from the self-perspective. If, however, viewing the motion stimuli under the avatar's perspective does not simulate viewing them from the self-perspective, then such an adaptation aftereffect is unlikely to occur.

Thus in Study 1 we explicitly instructed participants to take the avatar's perspective to view leftward/rightward motion adaptors, and measured their performance, i.e., the possible aftereffect, in a subsequent leftward/rightward motion-direction discrimination task under the self-perspective. We also conducted Study 2 to try to replicate the main findings of Study 1.

We believe that the aftereffect is tied to participants' PT abilities. It has been found that when participants were asked to judge the relative direction of a static object (Thakkar and Park, 2010), their self-reported PT tendencies positively correlated with their efficiency in completing that VPT task. Such self-reported PT tendencies should also predict behavior in a VPT task with motion stimuli, since they indicate participants' general ability to adopt another's perspective, independent of the visual stimuli (Davis, 1980). Meanwhile, individual differences in visual motion perception have been found in the recent literature: following the presentation of the same stimulus, some participants demonstrated motion facilitation, while others demonstrated motion adaptation (Takeuchi et al., 2017). Thus, we investigated whether the aftereffect following VPT with motion stimuli would vary with participants' self-reported PT tendencies. We assume that people with high self-reported PT tendencies, given their stronger perspective-taking ability, are more immersed when mentalizing another's visual world, and we therefore predict that these people will exhibit an adaptation effect on subsequent motion perception. For people with low self-reported PT tendencies, the aftereffect might be weaker or of a different kind.

To measure participants' PT tendencies, we adopted the PT subscale of the Interpersonal Reactivity Index (IRI scale; Davis, 1980). The IRI scale consists of four seven-item subscales, and each measures an aspect of the global concept “empathy.” The PT subscale fits the aim of our research well, for it contains items assessing people's spontaneous tendencies to take other people's perspectives and see things from their points of view in everyday life (e.g., “I sometimes try to understand my friends better by imagining how things look from their perspective” and “I try to look at everybody's side of a disagreement before I make a decision”).

Recent literature showed that level 2 VPT can be spontaneously induced (Elekes et al., 2016; Surtees et al., 2016). For example, when participants had to report the visual content of a number shown on the table from their own view while another person was sitting across the table from them, their reaction time was longer when the number was “6” or “9” rather than “0” or “8,” which was thought to result from the interference of participants' spontaneous VPT on their own perspective processing (Elekes et al., 2016; Surtees et al., 2016).

Unlike Studies 1 and 2, which examined instructed perspective-taking by deliberately requiring participants to take another person's viewpoint, Study 3 explored spontaneous perspective-taking and its possible aftereffect. Since the literature comparing explicit vs. implicit processing has usually demonstrated a weaker effect for the latter (Jiang and He, 2006; Jiang et al., 2009), we expected a similar but weaker effect from spontaneous VPT with motion stimuli.

In summary, with a novel motion-adaptation paradigm, we conducted three studies to investigate the mechanisms of mental representation under others' perspective. We aim to examine whether the direction-selective neural circuits engaged in self-perspective processing are also involved in level 2 VPT, and whether the approaches people employ to adopt another's perspective are correlated with their self-reported PT tendencies.

Study 1 (a and b)

Study 1a

Study 1a was conducted as the main experiment to examine the mental representation of instructed level 2 VPT.

Participants were asked to take the avatar's perspective to view a motion stimulus comprised of a set of light-colored dots on a dark background, moving leftward or rightward from the avatar's perspective (but upward or downward from the participant's perspective), and subsequently complete a leftward/rightward motion-direction discrimination task from the self-perspective (Figure 1B).

If, after mentally transforming themselves to the avatar's position, participants use the same populations of neurons to process the moving dots during VPT as they do from their own perspective, prolonged viewing of a motion stimulus (e.g., dots moving upward on the screen, which is leftward from the avatar's perspective; Figure 1A) will lead to an adaptation aftereffect for motion in that direction (leftward). As a result, participants' discrimination of the subsequent motion stimulus in the adapted direction will weaken, manifested as a higher probability of reporting the opposite direction (rightward) in the subsequent discrimination task from the self-perspective.

We also explored whether the occurrence of the aftereffect relates to people's PT tendencies, such that higher PT tendencies predict a stronger adaptation effect.

Method

Participants

Seventeen students with normal or corrected-to-normal vision were recruited from Peking University for this study and received monetary compensation or course credits for their participation. Data from one participant were excluded because of extremely low accuracy in his performance (beyond three standard deviations from the group mean). Results from the remaining 16 participants (6 females and 10 males; M_age = 22.7 years, SD = 2.6) were included in the final analyses. All studies reported in this paper were approved by the Ethics Review Committee of Peking University.

Materials

The computer task. The program for the computer task was written in Matlab 2011 with Psychtoolbox 3. All stimuli, described below, were presented on a 19-in ViewSonic Professional Series P97f+ monitor (1,024 × 768 at 75 Hz) connected to a computer running Windows XP.

The fixation point was an outline of a light-gray square presented at the center of the screen, subtending 0.4° × 0.4° of visual angle, with a luminance of 6.65 cd/m².

Each motion stimulus consisted of three sequences of randomly distributed Gaussian, anti-aliased white dots with interleaving frames (60 cd/m² at maximum contrast; 0.06 deg at half-height, with a density of 5 dots/deg²; Shadlen and Newsome, 2001; Roitman and Shadlen, 2002). The dots drifted at a speed of 8 deg/s within a region subtending 4° × 4° of visual angle at the center of the computer screen, against a black background (0.05 cd/m²). The dots moved vertically when serving as a motion adaptor and horizontally when serving as a test stimulus (see Procedure). In a test stimulus, only a percentage of the dots moved coherently (“motion coherence,” Newsome and Pare, 1988). A total of nine coherence levels were used, randomly varied from trial to trial: 0, ±5, ±10, ±20, and ±40%, where negative and positive signs indicate leftward and rightward motion, respectively.
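To make the notion of motion coherence concrete, the following is a minimal sketch of a single frame update for a coherence-controlled dot field. It is a deliberate simplification and not the authors' stimulus code: it omits the three interleaved dot sequences, and the function name `step_dots`, the dot count of 80 (derived here from 5 dots/deg² over a 16 deg² field), and the rule of re-plotting noise dots at random positions are all assumptions made for illustration. Coherence is expressed as a signed proportion rather than a percentage.

```python
import random

def step_dots(dots, coherence, speed_deg_s=8.0, frame_rate_hz=75.0,
              field_deg=4.0, rng=random):
    """Advance a list of (x, y) dot positions (in degrees) by one frame.

    coherence is a signed proportion: negative = leftward, positive = rightward.
    Returns the new positions and the number of coherently moving dots.
    """
    n_signal = round(abs(coherence) * len(dots))
    direction = 1.0 if coherence >= 0 else -1.0
    step = speed_deg_s / frame_rate_hz  # displacement per frame, in degrees
    signal = set(rng.sample(range(len(dots)), n_signal))
    moved = []
    for i, (x, y) in enumerate(dots):
        if i in signal:
            # Signal dots shift horizontally and wrap at the field edge.
            x = (x + direction * step) % field_deg
        else:
            # Noise dots are re-plotted at a random location (assumed rule).
            x, y = rng.uniform(0.0, field_deg), rng.uniform(0.0, field_deg)
        moved.append((x, y))
    return moved, n_signal

rng = random.Random(0)
n_dots = 80  # assumption: 5 dots/deg^2 x (4 deg x 4 deg)
dots = [(rng.uniform(0.0, 4.0), rng.uniform(0.0, 4.0)) for _ in range(n_dots)]
dots, n_signal = step_dots(dots, coherence=0.40, rng=rng)
```

Under these assumptions, a 40% rightward test stimulus moves 32 of the 80 dots coherently on each frame, whereas a 0% stimulus contains no coherent motion at all.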

The avatar was an average East Asian face with a neutral emotional expression and gender-neutral features, generated by FaceGen 3.4.1 (Copyright 2009, Singular Inversions Inc.). Facing the motion stimulus, the avatar was located 6° to the left of the center of the screen, subtending 7° × 7° of visual angle. The participants had a top view of the avatar's head (Figure 1).

The PT measure. Participants' PT tendencies were measured with the Perspective-Taking (PT) subscale of the IRI (Davis, 1980).

Procedure

Participants individually completed the computer task with their heads supported by a chin rest, at a viewing distance of 57 cm from the computer screen. They also completed the PT measure at the end of the experiment.
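As a brief aside on the 57 cm viewing distance, the sketch below converts visual angles to physical sizes on the screen using the standard relation size = 2d·tan(θ/2). The function name `deg_to_cm` is ours; no conversion to pixels is attempted, because that would require the physical dimensions of the monitor's display area, which are not reported here.

```python
import math

def deg_to_cm(angle_deg, distance_cm=57.0):
    """Physical extent on the screen subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

one_deg_cm = deg_to_cm(1.0)   # roughly 1 cm per degree at 57 cm
field_cm = deg_to_cm(4.0)     # width of the 4 deg x 4 deg dot field
avatar_cm = deg_to_cm(7.0)    # width of the 7 deg x 7 deg avatar
```

At this distance one degree of visual angle corresponds to approximately 1 cm on the screen, so the 4° dot field spans just under 4 cm.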

There were three experimental conditions in the computer task: Perspective-Taking (PT), Adaptor Only (AO), and baseline. There was a practice block of trials within each condition. Both the PT and the AO conditions included 10 blocks of trials with a motion adaptor moving upward or downward (5 blocks for each direction), followed by the presentation of a test stimulus in each trial. The baseline condition, however, had only 5 blocks of trials, and each trial included only the presentation of a test stimulus. Each block had 45 critical trials in all three conditions, plus an additional 10 catch trials in the PT condition and 5 catch trials in the AO condition (see the description of the conditions below for details). No catch trials were presented in the baseline condition.

Participants went through a total of 25 blocks of trials in 3 days, to reduce fatigue from working on the computer task each day. Two PT blocks and two AO blocks (with either upward or downward adaptor in each block), as well as one baseline block, were run on the first day. The number of blocks was doubled for each condition on the following 2 days, with all blocks run in a random order on each day.
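The block schedule can be checked with a few lines of bookkeeping. The sketch below simply tallies the counts stated in the text (the variable names are ours) and derives the implied totals of critical and catch trials per participant.

```python
# Blocks run on day 1, as stated in the text.
day1 = {"PT": 2, "AO": 2, "baseline": 1}
# The number of blocks per condition was doubled on each of the next 2 days.
schedule = [day1] + [{cond: 2 * n for cond, n in day1.items()} for _ in range(2)]

blocks_per_condition = {cond: sum(day[cond] for day in schedule) for cond in day1}
total_blocks = sum(blocks_per_condition.values())

critical_trials = total_blocks * 45  # 45 critical trials per block
catch_trials = (blocks_per_condition["PT"] * 10   # 10 catch trials per PT block
                + blocks_per_condition["AO"] * 5)  # 5 catch trials per AO block
```

The tally reproduces the 10 PT, 10 AO, and 5 baseline blocks (25 in total) and implies 1,125 critical trials and 150 catch trials per participant, excluding practice and warm-up trials.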

PT condition. At the beginning of each block, there was a 20 s presentation of the motion adaptor (pre-adaptation, 100% motion coherence) and 3 warm-up trials (not included in the final analysis) to familiarize participants with the procedure. The participant then went through 45 critical trials (nine levels of motion coherence; five trials at each level).

A critical trial (Figure 1C) started with a fixation of random duration (0.8–1.3 s), followed by a 5 s presentation of a motion adaptor (top-up adaptation, 100% motion coherence) moving upward or downward from the self-perspective. At the same time, the avatar's head was presented on the left side of the adaptor (Figure 1A). Participants were instructed to continuously imagine themselves looking at the motion adaptor from the avatar's perspective. After a 0.2 s fixation-only interval, a horizontally moving (from the self-perspective) test stimulus was presented for 0.4 s at one of the nine coherence levels. Participants were asked to make a two-alternative forced-choice (2-AFC) judgment of the dominant direction of the test stimulus (either left or right) as accurately as possible (Figure 1B). Their response prompted the beginning of the next trial.

Each block also included two types of “catch” trials, randomly mixed with the critical trials. Specifically, five “motion acceleration” trials were used to ensure that participants were attending to the adaptor. In such a trial, the speed of the motion adaptor increased abruptly from 8°/s to 16°/s, a change that required the participants' immediate response by pressing the “N” key on the keyboard. Five “closed eyes” trials were used to ensure that the participants were paying attention to the avatar. In such a trial, the eyes of the avatar closed at a random time point between 1 and 3 s after the appearance of the avatar. Participants were asked to press the “V” key as soon as they detected this change. A failure to respond within 1 s prompted a warning message at the center of the screen for 0.6 s and the termination of the current trial. No test stimulus was presented in either “motion acceleration” or “closed eyes” trials. All participants in Study 1a, as well as in the rest of the studies reported in this paper, completed the catch trials with above 90% accuracy, suggesting that they paid sufficient attention to the motion adaptor and the avatar's face.
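The composition of one PT block described above can be sketched as a shuffled trial list. This is an illustrative reconstruction, not the authors' Matlab code: the dictionary keys, the function name `build_pt_block`, and the use of a uniform distribution for the random durations are assumptions, and the pre-adaptation period and warm-up trials are omitted.

```python
import random

COHERENCES = [-40, -20, -10, -5, 0, 5, 10, 20, 40]  # percent; sign = direction
ADAPTOR_S, GAP_S, TEST_S = 5.0, 0.2, 0.4            # fixed durations from the text

def build_pt_block(rng, reps=5, n_accel=5, n_closed_eyes=5):
    """Return a shuffled list of trial descriptions for one PT block."""
    trials = [{"type": "critical", "coherence": c,
               "fixation_s": rng.uniform(0.8, 1.3)}  # assumed uniform
              for c in COHERENCES for _ in range(reps)]
    # Catch trials: no test stimulus follows the adaptor.
    trials += [{"type": "motion_acceleration"} for _ in range(n_accel)]
    trials += [{"type": "closed_eyes", "close_at_s": rng.uniform(1.0, 3.0)}
               for _ in range(n_closed_eyes)]
    rng.shuffle(trials)
    return trials

block = build_pt_block(random.Random(1))
n_critical = sum(t["type"] == "critical" for t in block)
```

An AO block would follow the same pattern with `n_closed_eyes=0`, giving 45 critical and 5 catch trials.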

AO condition. The AO condition was the same as the PT condition, except that neither the avatar nor the “closed eyes” catch trials were included.

Baseline condition. The baseline condition included neither the avatar nor any adaptor; it presented only the test stimulus, 0.2 s after the fixation, for participants to judge its direction, as a measure of the participants' baseline motion discrimination sensitivity.

Participants completed the PT measure at the end of the experiment.

Data analysis

Participants' probabilities of “rightward” responses following upward vs. downward adaptors were estimated by logistic regression (see Appendix for model fits). The dependent variable was the threshold of the test stimuli, defined as the amount of motion coherence that yielded 50% “rightward” responses on the psychometric function (see a graphic illustration in Figure 2A); thresholds were analyzed with a repeated-measures ANOVA.
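The threshold estimation can be illustrated with a minimal sketch. It assumes a two-parameter logistic function, P(rightward) = 1/(1 + exp(−(b0 + b1·x))), fitted by maximum likelihood with Newton's method; the model actually used is given in the Appendix and may differ. The response counts below are hypothetical and noise-free, generated from assumed parameters, with 25 trials per coherence level (5 trials per level in each of the 5 blocks per adaptor direction).

```python
import math

def fit_logistic(xs, n_right, n_total, iters=50):
    """Maximum-likelihood fit of P(right) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(xs, n_right, n_total):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)        # binomial variance weight
            g0 += k - n * p              # gradient with respect to b0
            g1 += (k - n * p) * x        # gradient with respect to b1
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det  # Newton step: inverse(H) @ g
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1

def threshold(b0, b1):
    """Coherence at which P(right) = 0.5."""
    return -b0 / b1

coherences = [-40, -20, -10, -5, 0, 5, 10, 20, 40]
true_b0, true_b1 = -0.3, 0.1          # hypothetical generating parameters
n_total = [25] * len(coherences)
n_right = [n / (1.0 + math.exp(-(true_b0 + true_b1 * x)))
           for x, n in zip(coherences, n_total)]  # expected counts, no noise

b0, b1 = fit_logistic(coherences, n_right, n_total)
pse = threshold(b0, b1)
```

Because the illustrative counts are exact expectations, the fit recovers the generating parameters and a threshold of 3% coherence; with real responses the estimate would carry sampling error.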


Figure 2. The effects of viewing motion adaptors on the perceived direction of test stimuli under the PT condition (A) and the AO condition (B) in Study 1a, for all participants combined. The threshold of the test stimuli was quantified as the amount of motion coherence that yielded 50% “rightward” responses on the psychometric function curve. The separation value was quantified as the difference between the two thresholds in the upward and downward adaptor conditions. A positive separation value (i.e., the upward-adaptor curve lies to the left of the downward-adaptor curve) indicates an adaptation effect; a negative separation value indicates a facilitation effect. The abscissa refers to the motion coherence, with positive values for rightward motion and negative values for leftward motion. Error bars are ±1 SEM.

Meanwhile, the horizontal separation value, defined as the difference between the two thresholds in the upward and downward adaptor conditions, was used as the index of the aftereffect from perceiving motion stimuli under the avatar's perspective and was correlated with participants' self-reported PT scores.

If no aftereffect existed, the two curves would overlap. A positive separation value (the threshold of the downward-adaptor curve is larger than that of the upward-adaptor curve) indicates an adaptation effect: for instance, a smaller probability of a “rightward” response after viewing a downward rather than an upward adaptor. In contrast, a negative separation value (the threshold of the downward-adaptor curve is smaller than that of the upward-adaptor curve) indicates a facilitation effect, meaning that viewing the motion adaptor facilitated subsequent perception of motion in the congruent direction: for instance, a larger probability of a “rightward” response after viewing a downward rather than an upward adaptor.
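The sign convention can be stated compactly in code. In the sketch below the function names are ours and the threshold values are hypothetical; the separation value is taken as the downward-adaptor threshold minus the upward-adaptor threshold, following the definition above.

```python
def separation(threshold_up, threshold_down):
    """Downward-adaptor threshold minus upward-adaptor threshold."""
    return threshold_down - threshold_up

def classify_aftereffect(sep):
    """Map the sign of the separation value onto the type of aftereffect."""
    if sep > 0:
        return "adaptation"
    if sep < 0:
        return "facilitation"
    return "none"

# Hypothetical thresholds, in percent coherence.
sep_a = separation(threshold_up=-2.0, threshold_down=3.0)
sep_b = separation(threshold_up=3.0, threshold_down=-2.0)
labels = (classify_aftereffect(sep_a), classify_aftereffect(sep_b))
```

In practice a zero separation will rarely occur exactly, so whether a given value differs reliably from zero remains a statistical question rather than a matter of sign alone.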

Results

To examine whether response bias existed among participants, we conducted a one-sample t-test on the motion coherence of the test stimuli that yielded 50% “rightward” responses against 0. The result was not significant, t₍₁₅₎ = −0.349, p = 0.732, indicating participants' unbiased responses for discriminating the motion direction of the test stimuli. Although only reported here, the same test was conducted for all studies in this paper and none of the results was significant (Table 1).
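For readers who wish to check the reported statistic, the sketch below implements a one-sample t-test and obtains the two-sided p-value by numerically integrating the t density with Simpson's rule, which avoids any dependency beyond the standard library. The function names are ours, and the sample passed to `one_sample_t` is hypothetical; only the reported t value and degrees of freedom are taken from the text.

```python
import math
import statistics

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density from 0 to |t|."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # probability mass between 0 and |t|
    return 1.0 - 2.0 * area

def one_sample_t(values, mu=0.0):
    """Return the t statistic and degrees of freedom for H0: mean == mu."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu) / se, n - 1

t_demo, df_demo = one_sample_t([2.0, 4.0, 6.0])  # hypothetical data
p_reported = two_sided_p(-0.349, 15)             # t and df as reported
```

Applied to the reported t value with 15 degrees of freedom, the computation gives a p-value of approximately 0.73, in line with the value stated above.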


Table 1. Means (SE) of the motion coherence of the test stimuli that yielded 50% “rightward” responses in the baseline condition of each study.

Taking a general overview of the data from the experimental conditions, we found two categories of aftereffects. Some participants showed a positive separation value between the two psychometric function curves, indicating an adaptation effect, whereas some others showed a negative separation value, which indicates a facilitation effect.

Participants' separation value between the two psychometric function curves was found to significantly correlate with their self-reported PT tendency scores, for both the PT condition (r = 0.76, p < 0.001) and the AO condition (r = 0.53, p = 0.033). Specifically, participants with lower PT scores demonstrated a facilitation effect, whereas participants with higher PT scores showed an adaptation effect (Figure 3).
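The correlation analysis can be sketched as follows. The code computes a Pearson coefficient and converts it to a t statistic with n − 2 degrees of freedom, the usual basis for its significance test; the function names and the toy data are ours, and we assume the reported coefficients are Pearson correlations over the 16 participants.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic and degrees of freedom for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r)), n - 2

r_demo = pearson_r([1, 2, 3, 4], [2, 1, 4, 3])  # hypothetical toy data
t_pt, df_pt = r_to_t(0.76, 16)   # PT condition, as reported
t_ao, df_ao = r_to_t(0.53, 16)   # AO condition, as reported
```

With 16 participants, r = 0.76 corresponds to t(14) ≈ 4.38 and r = 0.53 to t(14) ≈ 2.34, which is consistent with the significance levels reported above.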


Figure 3. The correlation between the separation values of psychometric function curves and participants' self-reported PT scores in Study 1a. Negative values of abscissa indicate facilitation effects, whereas positive values indicate adaptation effects.

Across all participants, we conducted a 2 × 2 repeated-measures ANOVA on the thresholds of the test stimuli, with Condition (PT vs. AO) and Motion Adaptor (upward vs. downward) as within-participant factors. Neither the main effect of Condition nor the interaction was significant (both Fs(1, 15) < 1; η² = 0.001 and η² = 0.060, respectively). The only significant effect was the main effect of Motion Adaptor, F(1, 15) = 6.998, p = 0.018, η² = 0.318. These results indicate a facilitation effect in both the PT and the AO conditions (Figures 2A,B), and the two facilitation effects did not differ from each other.
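The reported effect sizes can be related to the F statistics through the standard identity for partial eta squared, η²p = F·df_effect / (F·df_effect + df_error). The sketch below assumes that the η² values above are partial eta squared, which the text does not state explicitly; the function names are ours.

```python
def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return f * df_effect / (f * df_effect + df_error)

def f_from_partial_eta_squared(eta2, df_effect, df_error):
    """Inverse relation: the F statistic implied by a partial eta squared."""
    return eta2 * df_error / ((1.0 - eta2) * df_effect)

eta_adaptor = partial_eta_squared(6.998, 1, 15)           # Motion Adaptor effect
f_condition = f_from_partial_eta_squared(0.001, 1, 15)    # Condition main effect
f_interaction = f_from_partial_eta_squared(0.060, 1, 15)  # interaction
```

Under this assumption the Motion Adaptor effect yields a value of about 0.318, matching the report, and the two nonsignificant effects imply F values of roughly 0.02 and 0.96, both below 1 as stated.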

Study 1b

We speculated that the facilitation effect found in the AO condition in Study 1a was a carryover effect. Since blocks of different conditions were interleaved and the avatar always appeared on the left side of the visual scene, participants' perspective-taking tendencies induced by the PT blocks might have transferred to the AO blocks, which did not include the avatar. The literature also shows that intensive practice of mental rotation activates memory mechanisms and possibly leads to the extraction of the rotated representation of the stimuli directly from memory, without actual mental rotation (Tarr and Pinker, 1989).

Alternatively, one could argue that, instead of perspective-taking, the upward/downward adaptors per se affected the discrimination sensitivity for the horizontal motion stimuli in both the PT and the AO blocks. To test and rule out this explanation, we conducted Study 1b, which included only the AO condition and the baseline condition. Without the PT condition, we expected no aftereffect and therefore no separation of the psychometric curves between the upward and downward adaptors in the AO condition.