- 1Department of Child Studies, Shiraume Gakuen University, Kodaira City, Tokyo, Japan
- 2Center for Baby Science, Doshisha University, Kizugawa City, Kyoto, Japan
- 3Department of Psychology, Faculty of Humanities, Kyoto University of Advanced Science, Kyoto City, Kyoto, Japan
Understanding how infants allocate attention to emotional facial expressions offers crucial insights into the developmental origins of social cognition. This study adopted a dimensional approach focusing on valence and arousal to investigate age-related changes in infants' preferential attention to dynamic facial expressions. Using an eye-tracking paradigm, we examined 234 participants aged 3–36 months, as well as a group of adults, who viewed bilateral presentations of dynamic emotional faces differing in valence (happiness–fear), arousal (anger–disgust), and combined attributes (surprise–sadness). Preferential-looking time at each face and the distribution of gazes toward the eyes or mouth were quantified, and generalized linear models were developed with age, sex, and their interaction as predictors while controlling for individual eye–mouth preferences. In addition, motion-energy differences between paired stimuli were quantified and included as covariates in analyses. Results revealed distinct developmental trajectories across emotional dimensions. Preferences along the valence dimension increased with age; adults attended more to positive (happy) expressions than younger infants. No clear age-related modulation was observed for the arousal dimension under the present stimulus contrast. On the other hand, the combined valence–arousal dimension (surprise–sadness) exhibited a robust inverted-U developmental pattern, peaking between 8 and 12 months. Infants' eye–mouth preferences also followed a U-shaped developmental trajectory, with enhanced mouth focus between 10 and 18 months. Motion-energy analyses demonstrated that perceptual motion salience significantly influenced preferential looking but did not fully account for the observed developmental effects. These findings suggest that sensitivity to valence precedes the differentiation of arousal and that integration of both dimensions undergoes a transient amplification during late infancy. The results support a developmental model in which dimensional sensitivities scaffold later categorical emotion recognition, refining our understanding of early socioemotional specialization.
1 Introduction
The perception of facial expressions is a foundational component of early social cognition, providing infants with critical information about others' affective states, intentions, and communicative signals. From the first months of life, infants attend preferentially to faces, and a growing body of research demonstrates sensitivity to emotional expressions well before the emergence of language (Farroni et al., 2007; Field et al., 1982; Grossmann, 2010; LaBarbera et al., 1976; Leppänen and Nelson, 2009; Serrano et al., 1992; Walker-Andrews, 1997). Nevertheless, the developmental mechanisms underlying early emotion perception remain contested, particularly with respect to whether infants primarily process facial expressions as discrete categories or along broader affective dimensions (Ekman, 1992; Kuppens et al., 2013; Posner et al., 2005; Ruba and Repacholi, 2019; Russell, 1980; White et al., 2018, 2019).
Contemporary theoretical frameworks increasingly emphasize dimensional models of emotion, proposing that affective processing is organized along continuous dimensions such as valence (positive–negative) and arousal (low–high activation), rather than as a set of innate discrete categories (Russell, 1980). Within this framework, developmental change is often characterized as a progression from coarse sensitivity to affective dimensions toward more refined categorical differentiation (Russell and Bullock, 1985; Vesker et al., 2018; Widen and Russell, 2008, 2013). Empirical support for this view comes from behavioral, physiological, and neuroimaging studies indicating that infants respond differently to positive vs. negative expressions early in life, whereas finer distinctions among negative expressions emerge only later in development (Leppänen, 2011; Peltola et al., 2013).
Valence-related preferences have been particularly well-documented. Several studies report that older infants preferentially attend to happy faces, whereas younger infants show weaker or inconsistent valence biases (Farroni et al., 2007; LaBarbera et al., 1976). In contrast, findings concerning threat-related biases, such as preferential attention to fearful faces, have proven less consistent and appear to depend on stimulus format, task demands, and the availability of motion cues (Kauschke et al., 2019; LoBue and DeLoache, 2010; Peltola et al., 2013). These inconsistencies raise the possibility that early valence effects are modulated by low-level perceptual features rather than reflecting fully developed categorical fear processing (Frank et al., 2009; Oakes and Ellis, 2013).
Compared to valence, arousal-related processing has received less direct empirical attention in infancy. Yet arousal may constitute a particularly salient dimension early in development, as high-arousal signals are closely linked to orienting responses and physiological regulation (Bradley and Lang, 2007). Expressions such as anger and disgust, which share negative valence but differ in arousal, offer a theoretically useful contrast for probing this dimension. However, existing evidence suggests that arousal discrimination may be relatively stable across development, potentially reflecting early-maturing perceptual or subcortical mechanisms (Leppänen and Nelson, 2009; Vaish et al., 2008).
Beyond isolated dimensions, some expressions differ simultaneously in both valence and arousal, requiring integration across affective dimensions. For example, surprise and sadness differ in both arousal and valence, and their discrimination may depend on the coordinated processing of multiple affective cues. Developmental theories propose that such multidimensional integration is particularly prominent during late infancy, when social referencing, joint attention, and communicative signaling rapidly expand (Grossmann, 2010; Hoehl et al., 2017). Consequently, preferences for expressions differing along both dimensions may exhibit non-linear developmental trajectories rather than monotonic change.
A critical yet often underappreciated factor in emotion perception is the dynamic nature of facial expressions. Dynamic stimuli are processed differently from static images and tend to elicit stronger attentional engagement and neural responses, even in early infancy (Addabbo et al., 2018; Hunnius and Geuze, 2004; Wilcox et al., 2013; Xiao et al., 2015). At the same time, dynamic expressions necessarily vary in the magnitude and distribution of motion across facial regions, raising the possibility that motion salience influences looking behavior independently of emotional meaning (Addabbo et al., 2018; Bassili, 1978; Frank et al., 2009; Sato and Yoshikawa, 2004; Segal and Moulson, 2020; Smith et al., 2005; Xiao et al., 2015).
Similarly, infants' scanning patterns undergo marked developmental change. Attention to the mouth increases during periods of speech learning, whereas attention to the eyes becomes more dominant later in development (Lewkowicz and Hansen-Tift, 2012; Tenenbaum et al., 2013). These shifts interact with expression-specific motion cues, as some emotions (e.g., surprise) involve larger mouth movements than others (Eisenbarth and Alpers, 2011). Thus, both motion energy and eye–mouth allocation represent important sources of variance that must be considered when interpreting preferential-looking data obtained with dynamic faces.
Recent methodological discussions have emphasized that low-level stimulus properties, including motion energy, can systematically bias preferential-looking measures when dynamic stimuli are used (Frank et al., 2009). Consistent with these concerns, supplementary analyses in the present study using linear mixed models revealed that lateralized differences in motion energy between paired dynamic facial expressions significantly predicted preferential-looking behavior, indicating that infants' gaze was partially driven by relative motion salience. Importantly, however, the magnitude of motion-energy differences varied substantially across expression pairs. While alternative pairings of emotional expressions could in principle reduce motion-energy disparities, such pairings would also alter the affective dimensional structure of the contrasts, potentially compromising their theoretical specificity. This highlights an inherent tradeoff between optimizing affective dimensional contrast and minimizing perceptual differences when selecting dynamic facial stimuli, underscoring the need to explicitly model motion-related factors rather than eliminating them through stimulus selection alone.
This study investigates developmental changes in infants' preferential looking to dynamic facial expressions across three theoretically motivated contrasts: valence (happiness vs. fear), arousal (anger vs. disgust), and integrated valence–arousal (surprise vs. sadness). By examining these contrasts within a single experimental framework and explicitly modeling eye–mouth scanning and motion-energy differences, the study aims to clarify how affective dimensions and perceptual factors jointly shape early emotion perception.
Based on prior dimensional models, empirical findings, and the considerations outlined above (Bastianello et al., 2022; Leppänen and Nelson, 2009), we formulated the following hypotheses.
First, we hypothesized that valence-based preference (happiness over fear) would increase with age, reflecting the gradual strengthening of positive–negative discrimination.
Second, we predicted that arousal-based preference (anger vs. disgust) would show relatively limited age-related change, consistent with early-emerging sensitivity to arousal.
Third, for expressions differing along both valence and arousal (surprise vs. sadness), we expected a non-linear developmental trajectory, with enhanced preference during late infancy when multidimensional integration is particularly prominent.
Fourth, we hypothesized that motion energy would systematically influence preferential-looking behavior, such that greater relative motion would bias gaze toward the more dynamic stimulus. Accordingly, motion-energy differences were included as a covariate in all statistical models.
Finally, we predicted that motion energy would not fully account for developmental effects, indicating that affective dimensions contribute to looking preferences above and beyond low-level perceptual salience.
Together, this approach provides a rigorous test of dimensional and integrative accounts of emotion perception.
2 Methods
2.1 Participants
Two hundred sixty-two participants were recruited from the Kyoto city area via local prenatal clinics, community events, and social media. Twenty-eight infants were excluded from the final data set due to fussiness (n = 8), failure to calibrate (n = 9), or insufficient data on the minimum number of trials (n = 11). The final sample (N = 234) included a group of 3-month-old infants (n = 41, Mage 91.12 days, SD 6.89 days, 17 females), a group of 6-month-old infants (n = 23, Mage 179.13 days, SD 7.13 days, 10 females), a group of 8-month-old infants (n = 36, Mage 241.35 days, SD 6.12 days, 16 females), a group of 10-month-old infants (n = 32, Mage 302.38 days, SD 7.26 days, 16 females), a group of 12-month-old infants (n = 27, Mage 368.15 days, SD 6.12 days, 13 females), a group of 18-month-old infants (n = 23, Mage 538.47 days, SD 4.12 days, 11 females), a group of 36-month-old children (n = 28, Mage 1,083.53 days, SD 7.22 days, 11 females), and adults (n = 24, Mage 34.27 years, SD 10.22 years, 12 females). All infants were born full-term (at least 37 weeks of gestation) with no known visual or hearing difficulties. All infants were monolingual, learning Japanese as their native language. Their guardians received a small cash payment for participation. Only participants who provided usable data for a minimum of one presentation of each condition (see Section 2.4, "Measurement") were included in the analysis.
2.2 Facial stimuli
Emotional faces with direct gaze were posed by a Japanese female model selected from the ATR facial expression picture database (https://www.atr-p.com/products/face-db.html). From these images we created movie stimuli of dynamic facial expressions, or dynamic faces (Sato and Yoshikawa, 2004), as follows. Between the neutral and each target expression, 24 intermediate images in 4% steps were created using computer-morphing techniques (Sqirlz Morph version 2.1: Xiberpix, UK, https://www.xiberpix.net/SqirlzMorph.html). To create a moving clip, a total of 26 images (one neutral image, 24 intermediate images, and the final expression image) were presented in succession. Each image was presented for 40 ms, and the first and last images were presented for an additional 340 ms; thus, each animation clip lasted 1,720 ms. Each clip was repeated seven times (i.e., 12-s duration). This presentation speed has been shown to reflect natural changes in dynamic facial expressions for adult observers (Sato and Yoshikawa, 2004). To maximize experimental control over low-level facial structure and motion trajectories, we used a single Japanese female actor as the source for all dynamic morph sequences. Using one model minimizes between-actor variance in facial morphology and motion kinematics, which is particularly important in infant preferential-looking paradigms, where stimulus variability can inflate noise and obscure dimension-specific effects. Nonetheless, we recognize that this choice reduces generalizability across identities and therefore note it explicitly as a study limitation (see Section 4.6, "Limitations and future directions").
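As an illustration of the clip structure described above, the following minimal sketch assembles the frame timeline; the frame file names are hypothetical, and the sketch only reproduces the timing arithmetic (26 frames at 40 ms each, with 340 ms added to the first and last frames), not the morphing itself, which was performed in Sqirlz Morph.

```python
# Minimal sketch of the dynamic-clip timeline (hypothetical file names).
FRAME_MS = 40            # duration of every frame
EXTRA_ENDPOINT_MS = 340  # extra time added to the first (neutral) and last (target) frames
N_REPEATS = 7            # each clip was repeated seven times

frames = ["neutral.png"] + [f"morph_{i:02d}.png" for i in range(1, 25)] + ["target.png"]
durations_ms = [FRAME_MS] * len(frames)
durations_ms[0] += EXTRA_ENDPOINT_MS
durations_ms[-1] += EXTRA_ENDPOINT_MS

clip_ms = sum(durations_ms)        # 26 * 40 + 2 * 340 = 1,720 ms per clip
total_ms = clip_ms * N_REPEATS     # 12,040 ms, i.e., ~12 s per trial
print(len(frames), clip_ms, total_ms)
```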
Stimuli were validated by 20 adults (11 females, mean age = 28.3 years, SD = 6.12 years) who did not participate in the main experiment; they were asked to rate the emotional valence and arousal of the stimuli using the Affect Grid assessment (Russell et al., 1989). The 9 × 9 Affect Grid assesses affect along the dimensions of valence and arousal. Participants were asked to rate the emotion expressed by a dynamic face using a computer mouse to select the appropriate location on a two-dimensional square representing emotional space. Each facial stimulus was presented twice in random order. The results of the Affect Grid assessment are shown in Figure 1. The valence and arousal scores (M ± SD) for each expression were as follows: happiness (7.8 ± 0.6; 7.0 ± 0.8), fear (3.0 ± 0.9; 7.9 ± 1.0), anger (2.2 ± 0.8; 7.5 ± 0.9), disgust (2.5 ± 0.7; 4.5 ± 0.8), surprise (4.9 ± 1.0; 8.5 ± 0.4), and sadness (2.5 ± 0.7; 3.0 ± 0.6).
Figure 1. Results of the Affect Grid assessment. Each dynamic facial expression stimulus was validated using the 9 × 9 Affect Grid. Mean scores (M) ± standard deviations (SD) on the valence and arousal dimensions were as follows: happiness (7.8 ± 0.6; 7.0 ± 0.8), fear (3.0 ± 0.9; 7.9 ± 1.0), anger (2.2 ± 0.8; 7.5 ± 0.9), disgust (2.5 ± 0.7; 4.5 ± 0.8), surprise (4.9 ± 1.0; 8.5 ± 0.4), and sadness (2.5 ± 0.7; 3.0 ± 0.6). Happiness and fear (double arrow) were positioned almost parallel to the valence axis. Anger and disgust (double arrow) were positioned almost parallel to the arousal axis. Surprise and sadness (double arrow) were positioned as combinations of both valence and arousal attributes.
The selection of emotional expression pairs was guided by both theoretical considerations derived from dimensional models of affect and methodological constraints inherent in the use of dynamic facial stimuli. Specifically, we aimed to construct contrasts that selectively probed valence, arousal, and their integration, while maintaining consistency across stimulus generation and enabling subsequent control for low-level perceptual factors such as motion energy.
Facial stimuli of happiness and fear were positioned almost parallel to the valence dimension in the Affect Grid space; thus, the difference between these two expressions lies primarily in valence. For the arousal dimension, we selected anger and disgust expressions. Both expressions share negative valence but differ in arousal, making them suitable candidates for isolating arousal-related processing within a dimensional framework. We acknowledge that anger and disgust may be perceived as conceptually and perceptually similar (Widen and Russell, 2008, 2013; but see Ruba et al., 2017), and that alternative pairings, most notably anger vs. sadness, could potentially yield a larger separation in affective space, as sadness is characterized by comparatively low arousal while maintaining a similar negative valence. Such a pairing would indeed increase the Euclidean distance between expressions within the Affect Grid and might, in principle, enhance sensitivity to arousal-based contrasts.
However, stimulus selection in this study required balancing affective dimensional contrast against perceptual and methodological considerations. In particular, the study included an additional contrast between surprise and sadness to examine expressions differing simultaneously in both valence and arousal. Preserving this integrated contrast necessitated the inclusion of sadness in only one pairing to avoid reuse of the same expression across multiple dimensions, which could complicate interpretation and introduce dependency across contrasts. Consequently, anger vs. disgust was selected as the arousal contrast to maintain a clear separation between the arousal-only and the integrated valence–arousal conditions.
Moreover, because dynamic facial expressions inherently differ in the magnitude and distribution of facial motion, we conducted analyses quantifying motion energy for each expression (see Section 2.5, “Motion-energy quantification and control”). These analyses revealed substantial variation in motion energy across expressions, with surprise and disgust showing particularly high values relative to other emotions. Alternative pairings, such as anger vs. sadness or surprise vs. disgust, would have reduced motion-energy differences between paired stimuli but would also have altered the dimensional structure of the contrasts. Given these tradeoffs, we opted to retain the theoretically motivated contrasts and to statistically model motion-energy differences rather than eliminate them solely through stimulus selection.
Taken together, the final stimulus set reflects a principled compromise between maximizing affective dimensional clarity, preserving theoretically distinct contrasts, and ensuring methodological transparency. This approach allows us to test arousal-related processing while explicitly accounting for perceptual factors that may influence preferential-looking behavior.
Thus, in the main experiments, happiness and fear faces were presented bilaterally on the monitor to test participants' preferences along the valence dimension (pleasant vs. unpleasant), while anger and disgust faces were presented for the arousal dimension (high vs. low arousal). Surprise and sadness represent distinct combinations of both valence and arousal attributes (Fujimura et al., 2012; Matsuda et al., 2013). Supplementary Videos 1–3 show samples of the dynamic expressions.
2.3 Procedure
Written informed consent was obtained from all legal guardians of participating infants and children and from all adult participants prior to participation. Infants were seated on their guardian's lap, and older children and adults sat on a chair, with eyes positioned centrally, both vertically and horizontally, relative to the display monitor (17″ Dell E1715S color LCD), which was placed approximately 60 cm from the participant. Two speakers, concealed behind the monitor, delivered the soundtrack accompanying the video stimuli. All testing was conducted in a sound-attenuated room enclosed in black curtains to minimize external visual and auditory distractions. A remote eye-tracking system (Tobii Pro X3-120, Tobii AB), mounted beneath the monitor and directed toward the participant, recorded gaze data at a sampling frequency of 120 Hz. The researcher performed calibration (a two-point model with animated targets) to establish the corneal reflection detection threshold. Following successful calibration, the researcher started the trial.
Each face pair (i.e., happiness–fear, anger–disgust, and surprise–sadness) was presented twice, with the left/right positions reversed across presentations to prevent one-sided orientation biases. Each face stimulus subtended a visual angle of 11.13° × 12.50° at a viewing distance of 60 cm. Each trial was preceded by an attention-getting stimulus intended to attract the infants' visual attention. The order of the six test trials was randomized across participants.
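For reference, the physical stimulus size implied by the reported visual angles can be recovered from the standard relation between angle, size, and viewing distance; the centimeter values below are approximate and derived solely from the reported 11.13° × 12.50° angles at 60 cm.

```latex
\theta = 2\arctan\!\left(\frac{s}{2d}\right)
\quad\Longrightarrow\quad
s = 2d\,\tan\!\left(\frac{\theta}{2}\right)
% horizontal: s = 2 \times 60\,\mathrm{cm} \times \tan(11.13^\circ/2) \approx 11.7\ \mathrm{cm}
% vertical:   s = 2 \times 60\,\mathrm{cm} \times \tan(12.50^\circ/2) \approx 13.1\ \mathrm{cm}
```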
2.4 Measurement
Areas of interest (AOIs) were created using ellipses and rectangles in Tobii Pro Studio (Tobii AB). The face AOI for each model expression was created with an ellipse covering the whole face except the hair (11.10° × 11.20° at 60 cm). Looks outside these AOIs were discarded during data processing. We analyzed total dwell time on a face as an index of level of attention. Total dwell time was defined as the amount of time in milliseconds (ms) the participant spent looking anywhere within the face region. Because each expression was presented twice, the dwell times from the two presentations were summed to obtain the total dwell time. We analyzed participants' preferences for facial expressions (happiness vs. fear, anger vs. disgust, and surprise vs. sadness) by calculating the relative difference in total looking time at each face AOI. The relative difference was defined as the dwell time on one face AOI minus the dwell time on the co-presented (paired) face AOI. This approach quantifies a relative preference in gaze allocation, with a positive difference indicating greater looking time at one face and a negative difference indicating greater looking time at the other face.
We also defined subregions within the face AOIs. The bottom half of the model's face was defined as the mouth AOI, and the top half as the eye AOI. These AOIs were created with rectangles that split the face equally at its vertical midpoint, with the upper half including the eyes and the lower half including the mouth. We then analyzed the relative difference in total looking time at each of these AOIs. The relative difference (eye–mouth) was defined as the dwell time on the eye AOI minus the dwell time on the mouth AOI. A positive difference indicates greater looking time at the eyes, and a negative difference indicates greater looking time at the mouth. Any participant who failed to contribute looking data (i.e., at least one fixation for each face pair) was excluded from the analysis.
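To make the two relative-difference indices concrete, the sketch below computes them from per-AOI dwell times for a single hypothetical participant; the column names and values are illustrative only and do not reflect the actual processing pipeline.

```python
import pandas as pd

# Hypothetical dwell times (ms), already summed over the two presentations of each pair.
trials = pd.DataFrame({
    "pair":        ["happiness-fear", "anger-disgust", "surprise-sadness"],
    "dwell_a":     [5200, 4100, 6300],   # first-named face of each pair
    "dwell_b":     [4600, 4500, 3900],   # second-named face of each pair
    "dwell_eye":   [3800, 4200, 3100],   # eye AOI, summed over both faces
    "dwell_mouth": [4300, 3500, 5600],   # mouth AOI, summed over both faces
})

# Expression preference: positive = more looking at the first-named face.
trials["pref_expression"] = trials["dwell_a"] - trials["dwell_b"]

# Eye-mouth preference: positive = eye preference, negative = mouth preference.
trials["pref_eye_mouth"] = trials["dwell_eye"] - trials["dwell_mouth"]

print(trials[["pair", "pref_expression", "pref_eye_mouth"]])
```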
2.5 Motion-energy quantification and control
To quantify the motion salience of each dynamic clip, we computed a frame-difference-based motion-energy index (MEI). For each animation, consecutive frames were converted to grayscale and the absolute pixelwise difference between successive frames was computed; these per-frame differences were summed across the clip and normalized by the face AOI to yield a single MEI value for each clip: happiness 767; fear 729; anger 1,016; disgust 1,369; surprise 1,414; sadness 739. For each bilateral trial we computed the motion difference between the left- and right-stimulus MEIs. Motion difference was entered as a continuous covariate in subsequent statistical models to test whether lateralized motion differences predicted preferential looking. The MEI computation followed standard frame-differencing procedures (Paxton and Dale, 2013; Ramseyer, 2020).
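A minimal sketch of a frame-differencing MEI computation is shown below. It assumes each clip is available as a video file readable by OpenCV and reads "normalized by the face AOI" as division by the number of pixels in the AOI; the function name, file names, and mask interface are hypothetical, and the original computation may have used different tooling.

```python
import cv2
import numpy as np

def motion_energy_index(video_path: str, aoi_mask: np.ndarray) -> float:
    """Sum of absolute pixelwise frame differences within the face AOI, per AOI pixel."""
    cap = cv2.VideoCapture(video_path)
    prev_gray, total = None, 0.0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float64)
        if prev_gray is not None:
            diff = np.abs(gray - prev_gray)   # absolute difference between successive frames
            total += diff[aoi_mask].sum()     # accumulate motion within the face AOI
        prev_gray = gray
    cap.release()
    return total / aoi_mask.sum()             # normalize by AOI area (number of pixels)

# Lateralized motion difference for one bilateral trial (hypothetical file names):
# motion_diff = motion_energy_index("left_clip.mp4", face_mask) \
#             - motion_energy_index("right_clip.mp4", face_mask)
```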
2.6 Statistical analysis plan
All statistical procedures were prespecified. For each of the three primary contrasts (happiness vs. fear; anger vs. disgust; surprise vs. sadness) we computed a relative preference score defined as dwell_time (faceA) – dwell_time (faceB), as mentioned earlier. We first inspected whether group means departed from chance (zero) using one-sample t-tests or Wilcoxon signed-rank tests as appropriate for distributional properties; these tests evaluate whether preferences are present at the group level. We then modeled developmental effects using generalized linear models (GLMs) with age group (factor), participant sex, and participant-level mean eye–mouth index as covariates. Models used Gaussian identity links; continuous covariates were mean-centered. Post hoc between-group comparisons used Games-Howell tests (Welch adjustment) where heteroscedasticity was evident. Effect sizes (Cohen's d or partial η2) and 95% confidence intervals are reported throughout. For motion-energy concerns, we computed an objective MEI (see the preceding Section 2.5) for each clip (frame differencing summed across frames) and examined whether motion energy covaried with preference scores; where relevant, motion energy was added as an additional covariate. Statistical analyses were performed in the Jamovi statistical package (version 2.6.45.0, Jamovi project 2025, https://www.jamovi.org/), which is an open-source graphical user interface for the R programming language.
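Although the analyses were run in Jamovi, the prespecified GLM structure can be expressed equivalently in other environments. The sketch below, written in Python's statsmodels on a simulated data frame, is a hypothetical illustration of the model specification only (Gaussian family, identity link, age group × sex plus mean-centered covariates); it is not the analysis code, and the motion-energy difference can be added as a further covariate in the same way.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulated, purely illustrative data: one row per participant for one contrast.
rng = np.random.default_rng(0)
n = 234
df = pd.DataFrame({
    "pref_score":  rng.normal(0, 2, n),   # looking-time difference for the pair
    "age_group":   rng.choice(["3m", "6m", "8m", "10m", "12m", "18m", "36m", "adult"], n),
    "sex":         rng.choice(["F", "M"], n),
    "eye_mouth_c": rng.normal(0, 1, n),   # mean-centered eye-mouth preference
})

# Gaussian GLM with identity link, mirroring the prespecified model structure.
glm = smf.glm("pref_score ~ C(age_group) * sex + eye_mouth_c",
              data=df, family=sm.families.Gaussian()).fit()
print(glm.summary())
```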
3 Results
3.1 Developmental changes of preferential attention toward facial expressions
Preferential attention to expression faces in the valence dimension (pleasantness vs. unpleasantness) was determined from bilateral presentations of happiness–fear stimuli. Participants' preference was defined quantitatively as the relative difference in total looking time at the happiness over the fear face. Figure 2A shows developmental differences in preferential attention. First, one-sample t-tests were conducted to determine whether each developmental-group mean departed from chance (zero, i.e., equal preference for the two faces). A significant fear-face preference was observed at 3 months, t(40) = −2.44, p = 0.019, whereas a significant happiness-face preference was observed at 8 months, t(35) = 2.03, p = 0.050, and in adults, t(24) = 2.72, p = 0.012. A one-way analysis of variance (ANOVA; Welch's test) was also conducted to examine whether expression preferences differed among age groups. The effect of age was significant, F(7, 91.0) = 2.809, p = 0.011. The Games-Howell post hoc test showed that adults differed significantly from 3-month-old infants, t(36.9) = −3.60, p = 0.021, indicating that adults preferred positive-valence (happiness) faces more than 3-month-old infants did (Supplementary Table 1A).
Figure 2. Developmental changes of preferential attention toward facial expressions. (A) The relative difference of total looking time to happiness over fear faces. Positive value means happiness preference, and negative value means fear preference. (B) The relative difference of total looking time to anger over disgust faces. Positive value means anger preference, and negative value means disgust preference. (C) The relative difference of total looking time to surprise over sadness faces. Positive value means surprise preference, and negative value means sadness preference. Filled circle represents mean and error bar represents standard error of mean (SEM) in each age group.
Preferential attention to expression faces in the arousal dimension (high vs. low arousal) was determined from bilateral presentations of anger–disgust stimuli. Figure 2B shows developmental differences in preferential attention. One-sample t-tests showed a significant difference only at 12 months, t(26) = −2.14, p = 0.042 (disgust-face preference). A one-way ANOVA (Welch's test) was conducted to examine whether expression preferences differed among age groups. The effect of age was not significant, F(7, 92.7) = 0.644, p = 0.718.
Preferential attention to expression faces combining both valence and arousal attributes was determined from bilateral presentations of surprise–sadness stimuli. Figure 2C shows developmental differences in preferential attention. One-sample t-tests showed a significant surprise-face preference in all age groups (ps < 0.005) except adults. A one-way ANOVA (Welch's test) was conducted to examine whether expression preferences differed among age groups. The effect of age was significant, F(7, 91.7) = 12.768, p < 0.001. The Games-Howell post hoc test showed that the surprise face was preferred over the sadness face in an inverted U-shaped manner across developmental age (Supplementary Table 1B).
3.2 Effect of motion energy
Infants might have preferred the surprise face because of its high motion saliency. Indeed, in our stimulus set of dynamic faces, the eyes and mouth of the surprise expression widened markedly, producing substantial movement. We computed a frame-difference-based MEI for each dynamic clip (see Section 2.5). Because dynamic stimuli differed in total movement across clips, we tested whether lateralized motion differences predicted preferential looking by fitting a linear mixed model (LMM) with looking-time difference as the dependent variable and motion difference, eye–mouth preference (see Section 3.3), sex, age, and the sex × age interaction as fixed effects, with a random intercept for participant (ID). All continuous covariates were mean-centered. The LMM converged and revealed a robust effect of motion difference: motion difference was a highly significant predictor of looking preference, F(1, 684) = 204.83, p < 0.001 (b = 0.0482, SE = 0.000337, t = 14.31). Eye–mouth preference also showed a significant omnibus effect, F(1, 684) = 4.39, p = 0.037. Furthermore, age showed a significant omnibus effect, F(7, 684) = 3.71, p < 0.001; several age contrasts (6, 8, 10, and 12 vs. 3 months) remained significant (Supplementary Table 2). Both the conditional and marginal R2 were 0.267, indicating that the model accounted for ~27% of the total variance. These results indicate that lateralized motion differences between simultaneously presented clips significantly influenced preferential looking and should be included as a covariate when interpreting developmental effects.
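The reported LMM was fitted in Jamovi; as a hypothetical illustration of its structure (random intercept per participant, motion difference and eye–mouth preference as mean-centered covariates, plus sex, age, and their interaction), a comparable model can be sketched with statsmodels' MixedLM on simulated trial-level data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated, purely illustrative trial-level data: one row per participant x pair.
rng = np.random.default_rng(1)
ids = np.repeat(np.arange(234), 3)
df_trials = pd.DataFrame({
    "id":            ids,
    "look_diff":     rng.normal(0, 2, ids.size),            # looking-time difference
    "motion_diff_c": rng.normal(0, 1, ids.size),            # centered motion-energy difference
    "eye_mouth_c":   np.repeat(rng.normal(0, 1, 234), 3),   # participant-level covariate
    "age_group":     np.repeat(rng.choice(["3m", "6m", "8m", "10m",
                                           "12m", "18m", "36m", "adult"], 234), 3),
    "sex":           np.repeat(rng.choice(["F", "M"], 234), 3),
})

# Random intercept for participant; fixed effects mirror the reported model structure.
lmm = smf.mixedlm("look_diff ~ motion_diff_c + eye_mouth_c + sex * C(age_group)",
                  data=df_trials, groups=df_trials["id"]).fit()
print(lmm.summary())
```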
3.3 Preferential attention toward eyes or mouth
Next, we investigated whether preferential attention toward eyes or mouth depended on facial expression, developmental age, and/or the individual. Preference toward eyes or mouth was quantified as the relative difference (see Section 2.4, “Measurement”).
In the valence dimension (happiness vs. fear face), the mouth area of the happiness face was preferred over that of the fear face, while the eye area of the fear face was preferred over that of the happiness face across age groups (Shapiro-Wilk normality test, W = 0.950, p < 0.001; Wilcoxon signed-rank test, W = 9,581, N = 234, p < 0.001, rank-biserial correlation r = −0.28). In the arousal dimension (anger vs. disgust), there was no significant difference between anger and disgust faces across age groups (Shapiro-Wilk normality test, W = 0.969, p < 0.001; Wilcoxon signed-rank test, W = 12,156, N = 234, p = 0.265, r = −0.08). In the combinations of both valence and arousal attributes (surprise vs. sadness), the mouth area of the surprise face was preferred over that of the sadness face, while the eye area of the sadness face was preferred over that of the surprise face across age groups (Shapiro-Wilk normality test, W = 0.969, p < 0.001; Wilcoxon signed-rank test, W = 5,455, N = 234, p < 0.001, r = −0.59).
We then analyzed age differences in preferential attention toward the eyes or mouth for each facial expression (Figures 3A–F). One-way ANOVAs (Welch's tests) revealed significant age-group differences for all expressions: happiness, F(7, 91.1) = 11.680, p < 0.001; fear, F(7, 91.2) = 5.450, p < 0.001; anger, F(7, 91.1) = 8.52, p < 0.001; disgust, F(7, 89.5) = 7.98, p < 0.001; surprise, F(7, 89.6) = 10.62, p < 0.001; sadness, F(7, 91.1) = 5.01, p < 0.001. Games-Howell post hoc tests were also conducted (Supplementary Tables 3A–F). For every expression, preferential attention toward the eyes or mouth changed developmentally in a U-shaped manner, with 10- to 18-month-old infants showing a greater mouth preference than participants younger or older than this age range.
Figure 3. Developmental changes of preferential attention toward eyes or mouth. The relative differences of total looking time to eyes over mouth in different expression faces. Positive value means eye preference, and negative value means mouth preference. Filled circle represents mean and error bar represents standard error of mean (SEM) in each age group. (A) happiness face, (B) fear face, (C) anger face, (D) disgust face, (E) surprise face, and (F) sadness face.
We also analyzed individual differences in preferential attention toward the eyes or mouth (Viktorsson et al., 2023). Pearson's correlation analyses revealed strong individual consistency in this preference across facial stimuli: participants showed high correlation coefficients not only between paired stimuli but also between non-paired stimuli (Figure 4).
Figure 4. Correlations of eye-mouth preference between different expression faces. Positive value means eye preference, and negative value means mouth preference in both horizontal and vertical axes. Filled circle represents each participant. Corr and *** represent correlation coefficient and p < 0.001, respectively.
3.4 Prediction models of preferential attention toward facial expression
A series of GLMs was conducted to examine the effects of age, sex, and their interaction on three dependent variables, the preference scores for happiness vs. fear (valence dimension), anger vs. disgust (arousal dimension), and surprise vs. sadness (combination of both valence and arousal attributes), while controlling for eye–mouth preference as a covariate. Because individual consistency in eye–mouth preference was robust across facial expressions, we averaged this preference across expressions for each participant to keep the statistical model simple. Each model specified a Gaussian distribution with an identity link function, and all continuous covariates were mean-centered prior to analysis. Parameter estimation converged successfully in all models.
The model predicting happiness–fear preference scores was significant, χ2(16) = 410, p = 0.005, explaining approximately 14% of the variance (R2 = 0.14, adjusted R2 = 0.13). Omnibus tests revealed a significant main effect of age, χ2(7) = 22.20, p = 0.002, indicating developmental changes in happiness–fear preference across age groups. Neither the main effect of sex, χ2(1) = 0.02, p = 0.883, nor the covariate eye–mouth preference, χ2(1) = 2.13, p = 0.144, reached significance. The overall age × sex interaction was not significant, χ2(7) = 11.31, p = 0.126. Parameter estimates indicated that, relative to the 3-month reference group, happiness–fear preference scores were significantly higher at 8 months (b = 1.88, SE = 0.82, z = 2.29, p = 0.022) and in adults (b = 3.73, SE = 0.92, z = 4.07, p < 0.001). Other age comparisons did not reach significance. Notably, significant age × sex interaction terms were observed at 12 months (b = −3.83, SE = 1.73, z = −2.21, p = 0.027) and 18 months (b = −4.26, SE = 1.82, z = −2.34, p = 0.019), suggesting that sex differences emerged transiently during these developmental stages, with males showing higher happiness–fear preference scores than females.
The model for anger–disgust preferences did not reach significance, χ2(16) = 177, p = 0.671, accounting for 5.7% of the variance (R2 = 0.06, adjusted R2 = 0.05). None of the omnibus tests revealed significant effects of age, χ2(7) = 4.52, p = 0.718, or age-by-sex interaction, χ2(7) = 5.44, p = 0.606. The main effect of sex was marginal, χ2(1) = 3.33, p = 0.068, indicating a non-significant trend toward lower scores of anger–disgust preferences in females. The covariate eye–mouth preferences had no significant effect, χ2(1) = 0.08, p = 0.778. Examination of the parameter estimates revealed that none of the age contrasts significantly differed from the 3-month baseline, and no significant interaction terms emerged (ps >0.30). The intercept was significantly below zero (b = −0.51, SE = 0.25, z = −2.04, p = 0.041), suggesting that the overall level of anger–disgust preferences across participants tended to be negative, though this pattern was stable across age and sex.
The model predicting surprise–sadness preference scores was highly significant, χ2(16) = 1,308, p < 0.001, explaining 31% of the variance (R2 = 0.31, adjusted R2 = 0.31). Omnibus tests indicated a robust main effect of age, χ2(7) = 50.37, p < 0.001, and a significant negative effect of eye–mouth preferences, χ2(1) = 4.34, p = 0.037. Neither the main effect of sex, χ2(1) = 0.05, p = 0.821, nor the age × sex interaction, χ2(7) = 2.92, p = 0.892, reached significance. Parameter estimates showed that, relative to the 3-month group, surprise–sadness preference scores significantly increased at 8 months (b = 4.58, SE = 0.87, z = 5.26, p < 0.001), 10 months (b = 3.72, SE = 0.91, z = 4.08, p < 0.001), 12 months (b = 5.01, SE = 0.95, z = 5.30, p < 0.001), and 18 months (b = 3.72, SE = 1.01, z = 3.70, p < 0.001), indicating a pronounced peak during late infancy. No significant differences were observed at 36 months or adulthood (ps >0.10). The covariate eye–mouth preferences had a small but significant negative effect (b = −0.16, SE = 0.08, z = −2.08, p = 0.037), suggesting that higher eye–mouth preference scores were associated with slightly lower values of surprise–sadness preference scores.
Across the three dependent measures, surprise–sadness preferences exhibited the strongest developmental modulation, characterized by a significant increase between 8 and 12 months, followed by a decline toward adulthood. Happiness–fear preferences also showed age-related changes, with pronounced increases in late infancy and adulthood and transient sex differences around 12–18 months. In contrast, anger–disgust preferences displayed no significant age- or sex-related variation. Collectively, these results indicate that developmental differentiation in the examined emotional dimensions was minimal for the arousal dimension, moderately evident for the valence dimension, and most prominent for the combination of both valence and arousal attributes.
4 Discussion
This study investigated developmental changes in infants' preferential looking at dynamic facial expressions differing along theoretically motivated affective dimensions: valence, arousal, and integrated valence–arousal. By combining a dimensional framework with dynamic stimuli and explicitly modeling perceptual factors such as motion energy and facial scanning patterns (Bastianello et al., 2022; Calvo and Nummenmaa, 2016; Sato and Yoshikawa, 2004), the study aimed to clarify how affective and perceptual cues jointly shape early emotion perception. The results yielded three main insights: (a) valence-based preferences showed age-related modulation, (b) arousal-based contrasts did not yield clear developmental change, and (c) motion energy exerted a robust influence on preferential looking across ages.
4.1 Developmental modulation of valence-based preferences
Consistent with prior work, the contrast between happiness and fear revealed age-related differences in preferential looking. Older infants showed a stronger bias toward positive expression, whereas younger infants displayed weaker or more variable preferences (Vaish et al., 2008; Vesker et al., 2018; Widen and Russell, 2008, 2013). This pattern aligns with dimensional accounts proposing that sensitivity to valence emerges gradually and becomes increasingly robust with development (Leppänen and Nelson, 2009; Peltola et al., 2013). Importantly, this developmental modulation persisted even when motion-energy differences were statistically controlled, suggesting that valence-related effects could not be reduced to low-level perceptual salience alone (Bassili, 1978; Wilcox et al., 2013; Xiao et al., 2015).
At the same time, the relatively small motion-energy difference between happiness and fear underscores that valence-based effects in this contrast were not driven by large disparities in facial movement. This finding strengthens the interpretation that developmental changes in this condition reflect affective processing rather than purely perceptual bias, supporting the view that valence constitutes a core organizing dimension of early emotion perception (Leppänen and Nelson, 2009; Vaish et al., 2008).
4.2 Absence of developmental change in arousal contrast
In contrast to valence, the arousal-based comparison between anger and disgust did not reveal systematic developmental change. One interpretation of this finding is that sensitivity to arousal-related cues emerges early and remains relatively stable across infancy, consistent with proposals that arousal processing relies on early-maturing mechanisms linked to orienting and vigilance (Bradley and Lang, 2007; Leppänen, 2011). However, an alternative and equally plausible explanation concerns the affective proximity of anger and disgust.
Although anger and disgust differ in arousal, they share similar negative valence and are perceptually and conceptually related (Ruba et al., 2017; Widen and Russell, 2008, 2013). Within the Affect Grid, the distance between these expressions is smaller than that between anger and sadness, which combines similar negative valence with substantially lower arousal. Consequently, the absence of developmental effects in the present arousal contrast may reflect limited discriminability rather than a genuine lack of arousal sensitivity (Kauschke et al., 2019; Ruba et al., 2017; Russell and Bullock, 1985; Widen and Russell, 2008). This interpretation is consistent with previous findings showing that infants' discrimination performance depends critically on the magnitude of affective contrast rather than on categorical labels per se (Caron et al., 1985; Flom and Bahrick, 2007; Russell and Bullock, 1985; Soken and Pick, 1992).
Nevertheless, the choice of anger vs. disgust was motivated by the need to preserve a distinct contrast for integrated valence–arousal processing (surprise vs. sadness) while avoiding reuse of the same expression across dimensions. Thus, the present findings should be interpreted not as evidence against arousal-based processing but as reflecting the constraints imposed by stimulus selection in multidimensional designs.
4.3 Integrated valence–arousal processing and non-linear development
The contrast between surprise and sadness, which differs simultaneously in valence and arousal, yielded a pattern distinct from both the valence-only and arousal-only conditions. Preferences in this condition did not change monotonically with age, suggesting a more complex developmental trajectory (Grossmann, 2010; Ruba and Pollak, 2020; Scherer, 2009; White et al., 2019). This finding is consistent with theoretical accounts proposing that the integration of multiple affective dimensions becomes particularly salient during late infancy, when social communication, joint attention, and affective signaling rapidly expand (Grossmann, 2010; Hoehl et al., 2017).
Notably, the surprise–sadness contrast also exhibited the largest motion-energy difference among the tested pairs. While motion energy significantly predicted preferential looking, the persistence of age-related variation beyond motion effects suggests that multidimensional affective integration contributes uniquely to infants' responses. These results support the notion that expressions differing along multiple affective dimensions may engage broader processing mechanisms than those differing along a single dimension.
4.4 Motion energy as a determinant of preferential looking
A central contribution of this study is the explicit demonstration that lateralized differences in motion energy robustly bias preferential-looking behavior (Paxton and Dale, 2013; Ramseyer, 2020). The LMM analyses revealed that greater relative motion reliably shifted gaze toward the more dynamic stimulus, accounting for a substantial proportion of variance in looking preferences. This finding accords with methodological critiques emphasizing that dynamic stimuli introduce systematic perceptual variability that can influence infant attention independently of emotional meaning (Frank et al., 2009).
Importantly, motion-energy differences varied considerably across expression pairs. While alternative stimulus pairings could have reduced motion-energy disparities, they would also have altered the affective dimensional structure of the contrasts. This tradeoff highlights a fundamental methodological challenge in developmental research using dynamic facial expressions: optimizing affective contrast and perceptual equivalence simultaneously is often not feasible. Rather than attempting to eliminate motion differences through stimulus selection alone, this study demonstrates the value of explicitly quantifying and modeling motion energy as a covariate.
4.5 Implications for dimensional models of emotion perception
Taken together, the findings support a partially dissociable developmental trajectory across affective dimensions. Valence-related preferences exhibit age-dependent strengthening, arousal-based contrasts appear comparatively stable or difficult to detect with closely related expressions, and integrated valence–arousal processing follows a more complex pattern. These results are compatible with dimensional models in which affective dimensions are differentially weighted across development and interact with perceptual features of stimuli (Barrett and Kensinger, 2010; Fujimura et al., 2012; Matsuda et al., 2013; Posner et al., 2005; Russell, 1980; Scherer, 2009).
Crucially, the data presented here caution against interpreting preferential-looking results without considering low-level stimulus properties. Motion energy is not merely a nuisance variable but a meaningful contributor to infants' attentional allocation. Accounting for such factors is essential for refining theoretical models of early emotion perception and for improving the interpretability of developmental findings.
4.6 Limitations and future directions
Several limitations of this study should be noted. First, the use of a single actor, while allowing tight control over identity and motion kinematics, limits generalizability across facial identities (Addabbo et al., 2018; Kim et al., 2025). Second, the arousal contrast may have been constrained by the affective proximity of anger and disgust, potentially obscuring developmental effects. Future studies could systematically manipulate affective distance while orthogonally controlling motion energy or employ multi-actor stimulus sets with matched motion profiles. Finally, combining preferential looking with complementary measures such as physiological responses or neural indices may help disentangle perceptual salience from affective evaluation.
4.7 Conclusion
This study demonstrates that infants' responses to dynamic facial expressions are shaped by both affective dimensions and perceptual motion cues. By integrating dimensional theory with explicit modeling of motion energy, the findings offer a more nuanced account of how emotion perception develops in early life. These results underscore the importance of methodological transparency and provide a framework for future research seeking to disentangle affective and perceptual contributions to social cognition.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by Institutional Review Board of Doshisha University. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants' legal guardians/next of kin.
Author contributions
Y-TM: Investigation, Writing – review & editing, Validation, Funding acquisition, Data curation, Conceptualization, Formal analysis, Writing – original draft, Visualization. KT: Writing – review & editing, Investigation, Writing – original draft, Formal analysis.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the Japan Society for the Promotion of Science (JSPS KAKENHI Grant Numbers JP23K25654 and JP17H02195, awarded to Y-TM).
Acknowledgments
We thank all our research participants and their families for their time and participation. Finally, the authors are deeply grateful to the late Professor Yukuo Konishi for his invaluable guidance, mentorship, and unwavering support throughout this research.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fdpys.2025.1742510/full#supplementary-material
Supplementary Video 1 | A sample of dynamic expressions (Fear vs. Happiness).
Supplementary Video 2 | A sample of dynamic expressions (Disgust vs. Anger).
Supplementary Video 3 | A sample of dynamic expressions (Sadness vs. Surprise).
References
Addabbo, M., Longhi, E., Marchis, I. C., Tagliabue, P., and Turati, C. (2018). Dynamic facial expressions of emotions are discriminated at birth. PLoS One 13:e0193868. doi: 10.1371/journal.pone.0193868
Barrett, L. F., and Kensinger, E. A. (2010). Context is routinely encoded during emotion perception. Psychol. Sci. 21, 595–599. doi: 10.1177/0956797610363547
Bassili, J. N. (1978). Facial motion in the perception of faces and of emotional expression. J. Exp. Psychol. Hum. Percept. Perform. 4:373. doi: 10.1037//0096-1523.4.3.373
Bastianello, T., Keren-Portnoy, T., Majorano, M., and Vihman, M. (2022). Infant looking preferences towards dynamic faces: a systematic review. Infant Behav. Dev. 67:101709. doi: 10.1016/j.infbeh.2022.101709
Bradley, M. M., and Lang, P. J. (2007). “Emotion and motivation,” in Handbook of Psychophysiology, 3rd ed., eds. J. T. Cacioppo, L. G. Tassinary, and G. G. Berntson (Cambridge University Press), 581–607.
Calvo, M. G., and Nummenmaa, L. (2016). Perceptual and affective mechanisms in facial expression recognition: an integrative review. Cogn. Emot. 30, 1081–1106. doi: 10.1080/02699931.2015.1049124
Caron, R. F., Caron, A. J., and Myers, R. S. (1985). Do infants see emotional expressions in static faces? Child Dev. 56, 1552–1560. doi: 10.2307/1130474
Eisenbarth, H., and Alpers, G. W. (2011). Happy mouth and sad eyes: scanning emotional facial expressions. Emotion 11:860. doi: 10.1037/a0022758
Ekman, P. (1992). An argument for basic emotions. Cogn. Emot. 6, 169–200. doi: 10.1080/02699939208411068
Farroni, T., Menon, E., Rigato, S., and Johnson, M. H. (2007). The perception of facial expressions in newborns. Euro. J. Dev. Psychol. 4, 2–13. doi: 10.1080/17405620601046832
Field, T. M., Woodson, R., Greenberg, R., and Cohen, D. (1982). Discrimination and imitation of facial expression by neonates. Science 218, 179–181. doi: 10.1126/science.7123230
Flom, R., and Bahrick, L. E. (2007). The development of infant discrimination of affect in multimodal and unimodal stimulation: the role of intersensory redundancy. Dev. Psychol. 43:238. doi: 10.1037/0012-1649.43.1.238
Frank, M. C., Vul, E., and Johnson, S. P. (2009). Development of infants' attention to faces during the first year. Cognition 110, 160–170. doi: 10.1016/j.cognition.2008.11.010
Fujimura, T., Matsuda, Y.-T., Katahira, K., Okada, M., and Okanoya, K. (2012). Categorical and dimensional perceptions in decoding emotional facial expressions. Cogn. Emot. 26, 587–601. doi: 10.1080/02699931.2011.595391
Grossmann, T. (2010). The development of emotion perception in face and voice during infancy. Restor. Neurol. Neurosci. 28, 219–236. doi: 10.3233/RNN-2010-0499
Hoehl, S., Hellmer, K., Johansson, M., and Gredebäck, G. (2017). Itsy bitsy spider…: infants react with increased arousal to spiders and snakes. Front. Psychol. 8:1710. doi: 10.3389/fpsyg.2017.01710
Hunnius, S., and Geuze, R. H. (2004). Gaze shifting in infancy: a longitudinal study using dynamic faces and abstract stimuli. Infant Behav. Dev. 27, 397–416. doi: 10.1016/j.infbeh.2004.02.003
Kauschke, C., Bahn, D., Vesker, M., and Schwarzer, G. (2019). The role of emotional valence for the processing of facial and verbal stimuli—positivity or negativity bias? Front. Psychol. 10:1654. doi: 10.3389/fpsyg.2019.01654
Kim, H., Bian, Y., and Krumhuber, E. G. (2025). A review of 25 spontaneous and dynamic facial expression databases of basic emotions. Affect. Sci. 6, 380–394. doi: 10.1007/s42761-024-00289-3
Kuppens, P., Tuerlinckx, F., Russell, J. A., and Barrett, L. F. (2013). The relation between valence and arousal in subjective experience. Psychol. Bull. 139:917. doi: 10.1037/a0030811
LaBarbera, J. D., Izard, C. E., Vietze, P., and Parisi, S. A. (1976). Four-and six-month-old infants' visual responses to joy, anger, and neutral expressions. Child Dev. 47, 535–538. doi: 10.2307/1128816
Leppänen, J. M. (2011). Neural and developmental bases of the ability to recognize social signals of emotions. Emot. Rev. 3, 179–188. doi: 10.1177/1754073910387942
Leppänen, J. M., and Nelson, C. A. (2009). Tuning the developing brain to social signals of emotions. Nat. Rev. Neurosci. 10, 37–47. doi: 10.1038/nrn2554
Lewkowicz, D. J., and Hansen-Tift, A. M. (2012). Infants deploy selective attention to the mouth of a talking face when learning speech. Proc. Natl. Acad. Sci. U.S.A. 109, 1431–1436. doi: 10.1073/pnas.1114783109
LoBue, V., and DeLoache, J. S. (2010). Superior detection of threat-relevant stimuli in infancy. Dev. Sci. 13, 221–228. doi: 10.1111/j.1467-7687.2009.00872.x
Matsuda, Y.-T., Fujimura, T., Katahira, K., Okada, M., Ueno, K., Cheng, K., et al. (2013). The implicit processing of categorical and dimensional strategies: an fMRI study of facial emotion perception. Front. Hum. Neurosci. 7:551. doi: 10.3389/fnhum.2013.00551
Oakes, L. M., and Ellis, A. E. (2013). An eye-tracking investigation of developmental changes in infants' exploration of upright and inverted human faces. Infancy 18, 134–148. doi: 10.1111/j.1532-7078.2011.00107.x
Paxton, A., and Dale, R. (2013). Frame-differencing methods for measuring bodily synchrony in conversation. Behav. Res. Methods 45, 329–343. doi: 10.3758/s13428-012-0249-2
Peltola, M. J., Hietanen, J. K., Forssman, L., and Leppänen, J. M. (2013). The emergence and stability of the attentional bias to fearful faces in infancy. Infancy 18, 905–926. doi: 10.1111/infa.12013
Posner, J., Russell, J. A., and Peterson, B. S. (2005). The circumplex model of affect: an integrative approach to affective neuroscience, cognitive development, and psychopathology. Dev. Psychopathol. 17, 715–734. doi: 10.1017/S0954579405050340
Ramseyer, F. T. (2020). Motion energy analysis (MEA): a primer on the assessment of motion from video. J. Counsel. Psychol. 67:536. doi: 10.1037/cou0000407
Ruba, A. L., Johnson, K. M., Harris, L. T., and Wilbourn, M. P. (2017). Developmental changes in infants' categorization of anger and disgust facial expressions. Dev. Psychol. 53:1826. doi: 10.1037/dev0000381
Ruba, A. L., and Pollak, S. D. (2020). The development of emotion reasoning in infancy and early childhood. Annu. Rev. Dev. Psychol. 2, 503–531. doi: 10.1146/annurev-devpsych-060320-102556
Ruba, A. L., and Repacholi, B. M. (2019). Do preverbal infants understand discrete facial expressions of emotion? Emot. Rev. 12, 235–250. doi: 10.1177/1754073919871098
Russell, J. A. (1980). A circumplex model of affect. J. Pers. Soc. Psychol. 39, 1161–1178. doi: 10.1037/h0077714
Russell, J. A., and Bullock, M. (1985). Multidimensional scaling of emotional facial expressions: similarity from preschoolers to adults. J. Pers. Soc. Psychol. 48:1290. doi: 10.1037/0022-3514.48.5.1290
Russell, J. A., Weiss, A., and Mendelsohn, G. A. (1989). Affect grid: a single-item scale of pleasure and arousal. J. Pers. Soc. Psychol. 57:493. doi: 10.1037/0022-3514.57.3.493
Sato, W., and Yoshikawa, S. (2004). The dynamic aspects of emotional facial expressions. Cogn. Emot. 18, 701–710. doi: 10.1080/02699930341000176
Scherer, K. R. (2009). The dynamic architecture of emotion: evidence for the component process model. Cogn. Emot. 23, 1307–1351. doi: 10.1080/02699930902928969
Segal, S. C., and Moulson, M. C. (2020). Dynamic advances in emotion processing: differential attention towards the critical features of dynamic emotional expressions in 7-month-old infants. Brain Sci. 10:585. doi: 10.3390/brainsci10090585
Serrano, J. M., Iglesias, J., and Loeches, A. (1992). Visual discrimination and recognition of facial expressions of anger, fear, and surprise in 4- to 6-month-old infants. Dev. Psychobiol. 25, 411–425. doi: 10.1002/dev.420250603
Smith, M. L., Cottrell, G. W., Gosselin, F., and Schyns, P. G. (2005). Transmitting and decoding facial expressions. Psychol. Sci. 16, 184–189. doi: 10.1111/j.0956-7976.2005.00801.x
Soken, N. H., and Pick, A. D. (1992). Intermodal perception of happy and angry expressive behaviors by seven-month-old infants. Child Dev. 63, 787–795. doi: 10.2307/1131233
Tenenbaum, E. J., Shah, R. J., Sobel, D. M., Malle, B. F., and Morgan, J. L. (2013). Increased focus on the mouth among infants in the first year of life: a longitudinal eye-tracking study. Infancy 18, 534–553. doi: 10.1111/j.1532-7078.2012.00135.x
Vaish, A., Grossmann, T., and Woodward, A. (2008). Not all emotions are created equal: the negativity bias in social-emotional development. Psychol. Bull. 134:383. doi: 10.1037/0033-2909.134.3.383
Vesker, M., Bahn, D., Degé, F., Kauschke, C., and Schwarzer, G. (2018). Perceiving arousal and valence in facial expressions: differences between children and adults. Euro. J. Dev. Psychol. 15, 411–425. doi: 10.1080/17405629.2017.1287073
Viktorsson, C., Portugal, A. M., Li, D., Rudling, M., Siqueiros Sanchez, M., Tammimies, K., et al. (2023). Preferential looking to eyes versus mouth in early infancy: heritability and link to concurrent and later development. J. Child Psychol. Psychiatry 64, 311–319. doi: 10.1111/jcpp.13724
Walker-Andrews, A. S. (1997). Infants' perception of expressive behaviors: differentiation of multimodal information. Psychol. Bull. 121:437. doi: 10.1037/0033-2909.121.3.437
White, H., Chroust, A., Heck, A., Jubran, R., Galati, A., and Bhatt, R. S. (2018). Categorical perception of facial emotions in infancy. Infancy 24, 139–161. doi: 10.1111/infa.12275
White, H., Chroust, A., Heck, A., Jubran, R., Galati, A., and Bhatt, R. S. (2019). Categorical perception of facial emotions in infancy. Infancy 24, 139–161.
Widen, S. C., and Russell, J. A. (2008). Children acquire emotion categories gradually. Cogn. Dev. 23, 291–312. doi: 10.1016/j.cogdev.2008.01.002
Widen, S. C., and Russell, J. A. (2013). Children's recognition of disgust in others. Psychol. Bull. 139:271. doi: 10.1037/a0031640
Wilcox, T., Stubbs, J. A., Wheeler, L., and Alexander, G. M. (2013). Infants' scanning of dynamic faces during the first year. Infant Behav. Dev. 36, 513–516. doi: 10.1016/j.infbeh.2013.05.001
Keywords: arousal, developmental trajectory, emotion perception, eye tracking, infancy, valence
Citation: Matsuda Y-T and Taniguchi K (2026) A dimensional approach to emotion in infancy: developmental shifts in preferential attention to valence and arousal. Front. Dev. Psychol. 3:1742510. doi: 10.3389/fdpys.2025.1742510
Received: 09 November 2025; Revised: 14 December 2025;
Accepted: 24 December 2025; Published: 27 January 2026.
Edited by: Marco Lunghi, University of Padova, Italy
Reviewed by: Elisa Di Giorgio, University of Padua, Italy; Maggie Guy, Loyola University Chicago, United States
Copyright © 2026 Matsuda and Taniguchi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Yoshi-Taka Matsuda, matsuda@shiraume.ac.jp