The Frozen Face Effect: Why Static Photographs May Not Do You Justice

Post, Robert; Haberman, Jason; Iwaki, Lica; Whitney, David

doi:10.3389/fpsyg.2012.00022

ORIGINAL RESEARCH article

Front. Psychol., 20 February 2012

Sec. Cognition

Volume 3 - 2012 | https://doi.org/10.3389/fpsyg.2012.00022

Robert B. Post¹

Jason Haberman¹,²,³

Lica Iwaki¹,²

David Whitney²,⁴*

¹ Department of Psychology, University of California at Davis, Davis, CA, USA
² Center for Mind and Brain, University of California at Davis, Davis, CA, USA
³ Department of Psychology, Harvard University, Cambridge, MA, USA
⁴ Department of Psychology, University of California at Berkeley, Berkeley, CA, USA

When a video of someone speaking is paused, the stationary image of the speaker typically appears less flattering than the video, which contained motion. We call this the frozen face effect (FFE). Here we report six experiments intended to quantify this effect and determine its cause. In Experiment 1, video clips of people speaking in naturalistic settings as well as all of the static frames that composed each video were presented, and subjects rated how flattering each stimulus was. The videos were rated to be significantly more flattering than the static images, confirming the FFE. In Experiment 2, videos and static images were inverted, and the videos were again rated as more flattering than the static images. In Experiment 3, a discrimination task measured recognition of the static images that composed each video. Recognition did not correlate with flattery ratings, suggesting that the FFE is not due to better memory for particularly distinct images. In Experiment 4, flattery ratings for groups of static images were compared with those for videos and static images. Ratings for the video stimuli were higher than those for either the group or individual static stimuli, suggesting that the amount of information available is not what produces the FFE. In Experiment 5, videos were presented under four conditions: forward motion, inverted forward motion, reversed motion, and scrambled frame sequence. Flattery ratings for the scrambled videos were significantly lower than those for the other three conditions. In Experiment 6, as in Experiment 2, inverted videos and static images were compared with upright ones, and the response measure was changed to perceived attractiveness. Videos were rated as more attractive than the static images for both upright and inverted stimuli. Overall, the results suggest that the FFE requires continuous, natural motion of faces, is not sensitive to inversion, and is not due to a memory effect.

Introduction

We have often observed that when a video of someone speaking is paused, the stationary image of the speaker typically appears less flattering than the preceding video. We refer to this phenomenon as the frozen face effect (FFE).

The FFE may be related to prior observations that facial motion can influence ratings of attractiveness. For example, using computer-generated animations of faces, Knappmeyer et al. (2002) report that the addition of motion to computerized faces positively influenced attractiveness judgments. Morrison et al. (2007) report that animating androgynous line drawings of faces can influence attractiveness ratings. Specifically, attractiveness was found to correlate with the total amount of movement in female, but not male, faces. This study did not compare the animations to static faces, however. In contrast to the studies using computer-generated or cartoon-like faces, Rubenstein (2005) compared the attractiveness of video clips of a person speaking with that of a stationary image selected from the video clip (on the basis of neutral appearance). In this study, there was no difference in attractiveness ratings between the stationary and dynamic facial stimuli.

In this paper we sought to demonstrate the FFE empirically and to examine its possible mechanisms. In contrast to past work, we presented more ecologically valid stimuli, including videos of individuals speaking in naturalistic settings such as news programs, talk shows, and interviews. The speakers included both famous and non-famous individuals, who spoke a variety of languages.

Experiment 1

Experiment 1 was designed to quantify the FFE and measure differences in how subjects rated flattery of the video and static frame stimuli.

Method

Ethics statement – Written informed consent was obtained from all participants, and UC Davis’ Institutional Review Board granted approval for all research.

Subjects

Seven undergraduate students (two males, five females), aged 21–23, participated. All but one of the observers were naïve to the hypotheses of the research.

Stimuli

The stimuli were 40 video clips of 2 s each (20 unique individuals, with 4 s of video for each individual divided into two equal 2 s clips), together with all static frames contained within each video. Video clips of individuals speaking in naturalistic settings were sampled from the internet, divided into 2 s clips, and saved in Quicktime format at 15 frames per second (fps). Although familiarity (i.e., with celebrities) was not controlled, participants anecdotally reported recognizing fewer than one-third of the individuals. Static stimuli were created by extracting all frames from each 2 s clip, which were standardized at 528 × 431 pixels. Video clips were then recreated from the static frames at 15 fps and muted to eliminate auditory cues. Thus a total of 40 videos of 2 s each and 1200 static frames were created and presented as stimuli. Examples of static frames used in the study are shown in Figure 1.
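The stimulus counts above follow directly from the clip duration and frame rate. As a quick check of that arithmetic, here is a minimal sketch; the function name `frame_counts` is introduced only for illustration and is not part of the original study materials.

```python
def frame_counts(n_clips, clip_seconds, fps):
    """Return (frames per clip, total frames across all clips)."""
    frames_per_clip = round(clip_seconds * fps)
    return frames_per_clip, n_clips * frames_per_clip


# 40 clips, 2 s each, 15 frames per second (values taken from the text)
per_clip, total = frame_counts(40, 2.0, 15)
print(per_clip, total)  # 30 1200
```

Each clip therefore contributes 30 static frames, which is the number of images averaged per movie in the reliability analysis reported later.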

Figure 1. Examples of static images derived from video clips.

Procedure

Subjects were seated in a dark, sound-dampened room. Stimuli (all 11.6° × 9.50°) were presented on a Sony CRT (Sony Multiscan G520, 21″, 1600 × 1200, 85 Hz refresh) at a viewing distance of 65 cm. All videos and static frames were presented in random order within the same session. Video stimuli were presented for the 2-s duration of each video, after which observers saw only a fixation cross until a response was made. Static stimuli remained on screen until the subject responded. A random dot mask was presented between trials. After each stimulus was presented, subjects gave a flattery rating for the person shown, using a seven-point scale (1 = least flattering, 7 = most flattering). The next stimulus was presented immediately after the flattery rating was made. Attractiveness ratings have been widely used in previous research (e.g., Knappmeyer et al., 2002; Rubenstein, 2005; Morrison et al., 2007); however, attractiveness is not the optimal dependent measure for our task. Whereas attractiveness is a perceptual property invariant to context, we were interested in relative differences as a function of stimulus type, hence our use of flattery as the dependent measure. To illustrate, one can imagine a scenario in which person A is known and believed to be attractive, but appears in a photograph that is not particularly flattering. By using flattery ratings, we offset differences in absolute attractiveness across our stimuli and reduced subject confusion about what counts as “attractive.” Nevertheless, we conducted a control experiment (Experiment 6) that used attractiveness ratings, to confirm that the same results hold with both flattery and attractiveness ratings.
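The stimulus size in the Procedure is given in degrees of visual angle at a fixed viewing distance. The sketch below applies the standard relation size = 2 · d · tan(θ/2) to convert between visual angle and physical size; the function names are illustrative, and the on-screen size in centimetres is derived here rather than reported in the text.

```python
import math


def size_from_visual_angle(angle_deg, distance_cm):
    """Physical extent (cm) subtending angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Values from the text: 11.6 deg x 9.50 deg at 65 cm
width_cm = size_from_visual_angle(11.6, 65.0)
height_cm = size_from_visual_angle(9.50, 65.0)
print(round(width_cm, 1), round(height_cm, 1))
```

Under these values the stimuli would have measured roughly 13.2 cm by 10.8 cm on screen, an aspect ratio close to that of the 528 × 431 pixel frames.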

Results and Discussion

Overall, flattery ratings of the videos were significantly higher than average flattery ratings of static images derived from the videos [t(6) = 6.40, p < 0.001, η² = 0.87]. As seen in Figure 2, this pattern was obtained for each of the seven subjects. We assessed inter-rater reliability of the static image ratings by using Kendall’s W. For each observer, we averaged the ratings of the 30 images composing a given movie. Thus, we compared rating consistency for 40 stimuli across the seven subjects. The inter-rater reliability was significant [W = 0.638, χ²(6) = 153, p < 0.001], justifying the use of a t-test and other comparable group assessments. It additionally demonstrates that subjects were not lapsing or repeatedly making the same response as they rated the static images.
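For readers who want to relate these statistics to one another, the following sketch shows two standard relations: the effect size obtainable from a t statistic, t² / (t² + df), and Kendall's W with its chi-square approximation, χ² = m(n − 1)W on n − 1 degrees of freedom, for m sets of rankings over n ranked objects. This is an illustrative implementation, not the authors' analysis code. Ties receive average ranks but no tie correction is applied to W, which is a simplifying assumption, and the small rating matrix at the end is hypothetical.

```python
def eta_squared_from_t(t, df):
    """Proportion of variance accounted for, computed from a t statistic."""
    return t * t / (t * t + df)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """ratings[r][i] is the rating in set r for object i.

    Returns (W, chi_square, df). No tie correction is applied (assumption).
    """
    m = len(ratings)
    n = len(ratings[0])
    ranked = [average_ranks(row) for row in ratings]
    rank_sums = [sum(ranked[r][i] for r in range(m)) for i in range(n)]
    mean_sum = m * (n + 1) / 2.0
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    w = 12.0 * s / (m ** 2 * (n ** 3 - n))
    return w, m * (n - 1) * w, n - 1


# Effect size implied by the reported t(6) = 6.40
print(round(eta_squared_from_t(6.40, 6), 2))  # 0.87

# Hypothetical ratings: three sets of rankings in perfect agreement
demo = [[1, 2, 3], [2, 4, 6], [1, 5, 7]]
print(kendalls_w(demo))
```

The first relation reproduces the reported effect size of 0.87 from t(6) = 6.40. Numerically, the reported χ²(6) = 153 also agrees with the second relation for W = 0.638 when m = 40 and n − 1 = 6 (40 × 6 × 0.638 ≈ 153.1), although the text does not spell out how the statistic was computed.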

Figure 2. Flattery ratings in Experiment 1 for the person in each video or static frame using a seven-point scale (1 = least flattering, 7 = most flattering). Error bars denote ± SEM.

The results demonstrate a very strong effect for each of the seven subjects. On average, subjects rated 91% of the static frames as less flattering than the movies they composed (excluding images rated the same as videos; across subjects this translated to 6254 out of 6844 static images being rated as less flattering than the corresponding video). This pattern held for each subject; the subject with the weakest effect still rated 73% of the static images as less flattering than the corresponding movies (652 out of 882), a highly significant effect [χ²(1) = 201, p < 0.0001]. The consistency of the FFE for the vast majority of the static images and for each of the 40 movie stimuli shows that familiarity with certain faces (which occurred for fewer than one-third of the stimuli) was not necessary for the FFE.
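The proportions and the chi-square value above can be checked with a short calculation. The sketch below assumes the chi-square test compared the observed split against an even 50/50 split between "less flattering" and "more flattering"; that null hypothesis is an assumption here, since the text does not state it. The function name is illustrative.

```python
def chi_square_even_split(count_a, total):
    """Goodness-of-fit chi-square (df = 1) against an assumed 50/50 split."""
    expected = total / 2.0
    count_b = total - count_a
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


# Counts taken from the text
pooled_proportion = 6254 / 6844   # all subjects combined
weakest_proportion = 652 / 882    # subject with the weakest effect
chi2 = chi_square_even_split(652, 882)

print(round(pooled_proportion, 3), round(weakest_proportion, 3), round(chi2, 1))
```

Under that assumption the statistic comes out at about 201.9, in line with the reported χ²(1) = 201.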

This finding is consistent with the results of Knappmeyer et al. (2002), but contrary to the results of Rubenstein (2005), who found no influence of facial motion on ratings of attractiveness.

A regression analysis of the mean flattery ratings for the static images against flattery ratings for the corresponding video clips indicated a statistically significant relationship [average r = 0.58 (converted from averaged Fisher z-scores), least significant subject: r = 0.35, p = 0.025]. This contrasts with Rubenstein (2005), who reported no correlation between attractiveness ratings of static and dynamic formats of the same face (r = 0.19, p = 0.26).
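Averaging correlations through Fisher z-scores, as mentioned above, means transforming each r with z = atanh(r), averaging the z values, and converting the mean back with tanh. The sketch below illustrates the procedure. The per-subject correlations in it are hypothetical placeholders, since the text reports only the average and the weakest value.

```python
import math


def average_correlation(rs):
    """Average correlations via the Fisher z transform (atanh / tanh)."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical per-subject correlations, for illustration only
example_rs = [0.35, 0.80]
fisher_mean = average_correlation(example_rs)
arithmetic_mean = sum(example_rs) / len(example_rs)
print(round(fisher_mean, 3), round(arithmetic_mean, 3))
```

For unequal correlations of the same sign, the back-transformed value lies further from zero than the arithmetic mean, because the transform stretches values near ±1. This is one reason the two averages should not be used interchangeably.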

A number of possibilities exist for this discrepancy, including differences in the stimuli. Rubenstein (2005) was careful to use neutral static images, in which the depicted individual was staring straight ahead. Here, we used natural movies of speaking individuals where the expression, gaze, and head orientation could vary.

The results of Experiment 1 provide strong empirical support for the existence of the FFE. The rest of this paper explores the properties, limitations, and possible mechanisms of the FFE.

Experiment 2: The Inverted FFE

Prior research has shown that face recognition is strongly disrupted by facial inversion (e.g., Yin, 1969; Farah et al., 1997). Additionally, artificial manipulations of faces that are easily detected when the faces are presented upright become much less perceptible when the faces are inverted (Thompson, 1980). These studies might predict that the FFE is either strongly decreased or eliminated with facial inversion. Experiment 2 was therefore designed to test this hypothesis.