Look Who's Talking: Pre-Verbal Infants’ Perception of Face-to-Face and Back-to-Back Social Interactions

Four-, 6-, and 11-month-old infants were presented with movies in which two adult actors conversed about everyday events, either by facing each other or looking in opposite directions. Infants from 6 months of age made more gaze shifts between the actors, in accordance with the flow of conversation, when the actors were facing each other. A second experiment demonstrated that gaze following alone did not cause this difference. Instead, the results are consistent with a social cognitive interpretation, suggesting that infants perceive the difference between face-to-face and back-to-back conversations and that they prefer to attend to a typical pattern of social interaction from 6 months of age.

At the same time, infants also demonstrate sensitivity to interpersonal relationships while being an active social partner, for example while interacting with an adult. In fact, studies on children's sensitivity to the timing of turn-taking (Beebe et al., 1988; Jaffe et al., 2001; Crown et al., 2002) demonstrate that 4-month-olds are able to coordinate the timing of their own interactive behavior with the timing of their interactive partner. Mayer and Tronick (1985) also demonstrated that even younger children, 2- to 5-month-olds, responded to maternal turn-taking cues by timing their own responses appropriately.
The current study aims to further our knowledge of how young infants (age 4-11 months) perceive complex social interactions that they are not actively engaged in by integrating the paradigms used by von Hofsten et al. (2009) and Gredebäck and Melinder (2010). The present study looks at the understanding of social interactions, but with a focus on young infants. Little is known about when infants form expectations about the manner in which others' conversations are carried out. Thus, infants in the current study are presented with two individuals having a conversation while facing each other or looking away from each other. We know that young infants, while actively engaged in a social interaction, are sensitive to cues that invite the other party to respond (Mayer and Tronick, 1985). However, nothing is known about the degree to which infants can use such information to make sense of perceived social interactions that they are not actively a part of. Therefore, the aim of this study is threefold: (1) to further map infants' emerging sensitivity to social interactions in general, (2) to explore whether (and when) infants' interest in others' conversations is modulated by direction of gaze, and (3) to discuss possible mechanisms that cause infants to visually follow the temporal flow of conversations.

Introduction
Young infants hold an amazing set of social cognitive abilities that help them interpret the goals and intentions of others' actions. For example, 4-month-old infants follow gaze direction (D'Entremont et al., 1997; in press) whereas 6-month-old infants encode the goal of manual reaching actions (Woodward, 1998) based on hand aperture (Daum et al., 2009), and anticipate the goal of manual feeding actions (Kochukhova and Gredebäck, in press). As infants grow older they expand their social cognitive repertoire further. Above 12 months of age infants follow pointing gestures to external events (Liszkowski et al., 2004; von Hofsten et al., 2005). They also anticipate the goal of manual displacement actions (Falck-Ytter et al., 2006), imitate rational action goals (Gergely et al., 2002), and demonstrate early precursors of theory of mind (Onishi and Baillargeon, 2005).
Despite the large number of studies devoted to unraveling how infants perceive others' actions, until recently virtually nothing was known about how infants perceive social interactions conducted independently of the observing infant. A few recent exceptions have attempted to fill this gap by investigating infants' understanding of social interactions performed in realistic social contexts. These studies demonstrate that 12-month-old infants selectively attend to the speaker of social conversations (von Hofsten et al., 2009) and that 6-month-olds react with surprise when observing irrational feeding (Gredebäck and Melinder, 2010). Both studies rely on eye tracking technology to analyze infants' scanning patterns during observation of two adult actors interacting in a face-to-face manner. A third study, using habituation, demonstrates that 6-month-old infants expect different behaviors from people when they interact with an inanimate object and with a person (Lagerstee et al., 2000). Furthermore, older children, at around 3 years of age, demonstrate an understanding of affiliation between two people based on the direction of gaze (Abramowitch and Daly, 1978).
a normal face-to-face manner. During the back-to-back condition the two actors held an identical conversation (same movies rotated in Adobe Premiere), but with their backs turned toward each other (see Figures 1A,B). The entire movie lasted 49 s and the actors spoke for 32 s (each utterance lasted on average 1.5 s, SD = 1.2 s). Thus, the whole experiment lasted less than 2 min, not including calibration and attention grabbers.

Procedure
The study was approved by the Regional Ethic Committee according to the 1964 Declaration of Helsinki. Participants were recruited by mail. As each family entered the lab, parents were informed about the procedure and signed a consent form. Infants were then seated in a safety car seat on the parent's lap in front of the eye tracker. Following calibration, an attention-grabbing sequence (a colorful toy bumping and making noise) was presented until the child looked at the computer screen. Infants were then presented with two movies featuring either the face-to-face or the back-to-back conversation. The two movies were identical, with the actors positioned in the same spot for all children and in both conditions. The other condition was presented on a separate day with order counterbalanced across participants. The number of days between visits did not differ significantly between the three age groups included in the study.

Data reduction
Two measures were used to estimate how infants perceived the conversation in the two conditions. Both analyses rely on gaze shifts performed between two areas of interest (AOI), each covering one

Participants
The final sample consisted of 12 infants at 4 months of age (M = 132 days, SD = 10 days, 6 girls), 12 infants at 6 months of age (M = 193 days, SD = 9 days, 6 girls), and 12 infants at 11 months of age (M = 343 days, SD = 23 days, 5 girls), who visited the lab twice (M = 7.6 days apart). An additional one 4-month-old, three 6-month-old, and two 11-month-old infants participated but were excluded due to lack of attention to the stimuli (i.e., no recorded gaze data).

Stimuli and apparatus
Gaze was recorded with a Tobii 1750 corneal-reflection near-infrared eye tracker (precision 1°; accuracy 0.5°; 50 Hz) using a standard 9-point calibration (Falck-Ytter et al., 2006). During the session infants were presented with videos of two women talking about their pets (Figure 1). They initially faced forward saying "hello" while concurrently waving their hands. Following this greeting both actors turned 90° and started a conversation. During this conversation, each actor, one at a time, said nine utterances with a small break in between. Only one actor spoke at any given time. The conversation had a natural vocal and turn-taking flow that imitated, as closely as possible, typical conversational patterns between two people. The actors kept looking at each other throughout the conversation. Following the completion of this conversation the actors turned back 90° to their original forward-facing orientation while simultaneously saying "bye bye" and waving their hands (Figure 2). During the face-to-face condition the actors talked to each other in
continuous gaze data directed to the same agent) were included in the analysis. Note that only one gaze shift per turn-taking event that meets the criteria specified above is counted and added to the final gaze shift score. As such, gaze shifts provide a measure of how many turn-taking events infants attend to. Data reduction was performed by a frame-by-frame analysis (www.virtualdub.org) of gaze replay movies including both gaze and the stimuli (time-locked at 50 Hz). The inclusion criterion for a gaze shift is similar to what is used in most eye tracking studies that investigate action understanding (for example Falck-Ytter et al., 2006). In these studies participants have to fixate the agent performing an action, make a gaze shift to the goal of the agent's action (for example a reach), and remain on the goal until the goal is accomplished.
The reason for this restriction is to ensure that overly fast gaze shifts, or quick scanning patterns that sweep the scene without specific attention to the goal, are not included in the analysis. On a similar note, the current criterion ensures that a gaze shift included in the analysis is related to the turn taking (since gaze has to remain on the actor until she starts to speak).
of the actors (Figure 1C) during the time when the actors were not facing forward (black bar in Figure 2). Fixation duration measures how much time infants spend fixating the speaker and the non-speaker (changing between each utterance of the conversation). For each utterance, gaze data are aggregated within the two AOIs, starting with the first sound of the utterance and ending the frame before the next actor starts to speak. Data reduction was performed with custom analysis tools (Matlab, Mathworks, Natick, MA, USA).
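The two data reduction measures can be sketched in code. The following is a minimal Python illustration, not the authors' Matlab tools or their frame-by-frame procedure. The data layout (one AOI label per 50 Hz gaze sample, utterances given as sample indices), all function names, and the choice to require strictly uninterrupted samples for a fixation are assumptions made for the example.

```python
from typing import List, Optional, Tuple

SAMPLE_RATE_HZ = 50      # sampling rate reported for the eye tracker
FIXATION_SAMPLES = 10    # 200 ms of continuous gaze at 50 Hz

# Hypothetical layout: one AOI label per gaze sample ("A", "B", or None
# when gaze is missing or outside both AOIs).
Labels = List[Optional[str]]
# Hypothetical layout: (speaker, onset_sample, offset_sample), offset exclusive.
Utterance = Tuple[str, int, int]


def fixation_seconds(labels: Labels, utterances: List[Utterance],
                     end_sample: int) -> Tuple[float, float]:
    """Return (seconds on speaker, seconds on non-speaker)."""
    on_speaker = on_other = 0
    for k, (speaker, onset, _offset) in enumerate(utterances):
        # Window: first sound of the utterance up to the sample before
        # the next actor starts to speak.
        stop = utterances[k + 1][1] if k + 1 < len(utterances) else end_sample
        for label in labels[onset:stop]:
            if label == speaker:
                on_speaker += 1
            elif label is not None:
                on_other += 1
    return on_speaker / SAMPLE_RATE_HZ, on_other / SAMPLE_RATE_HZ


def turn_followed(labels: Labels, prev: Utterance, nxt: Utterance) -> bool:
    """True if a qualifying gaze shift accompanies one turn-taking event."""
    prev_speaker, prev_onset, _ = prev
    next_speaker, next_onset, next_offset = nxt
    last_aoi = None
    # Shifts after the next speaker has stopped talking are never considered.
    for i in range(prev_onset, min(next_offset, len(labels))):
        label = labels[i]
        if label is None:
            continue
        if label == next_speaker and last_aoi == prev_speaker:
            # Gaze must stay on the new speaker for a full fixation and,
            # if the shift came early, until she has started to speak.
            end = max(i + FIXATION_SAMPLES, next_onset + 1)
            window = labels[i:end]
            if len(window) == end - i and all(l == next_speaker for l in window):
                return True
        last_aoi = label
    return False


def count_gaze_shifts(labels: Labels, utterances: List[Utterance]) -> int:
    """Count turn-taking events followed by gaze (at most one per event)."""
    return sum(
        turn_followed(labels, prev, nxt)
        for prev, nxt in zip(utterances, utterances[1:])
        if prev[0] != nxt[0]
    )
```

Because `turn_followed` returns at the first qualifying shift, each turn-taking event contributes at most one count, so the score reflects the number of turn-taking events attended to rather than the total number of gaze shifts.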
The second measure, gaze shifts, measures the degree to which infants visually attend to the flow of the conversation. Gaze shifts count the number of turn-taking events (when speaker n−1 stops talking and speaker n starts talking) that were accompanied by a gaze shift from speaker n−1 to speaker n. If a gaze shift was performed before the turn-taking, while speaker n−1 was still talking, then gaze had to remain on the next speaker (speaker n) until she started talking, ensuring that the gaze shift was related to the turn-taking. If a gaze shift was performed later, while speaker n talked, then the first gaze shift from speaker n−1 to speaker n was counted. Later gaze shifts that occurred after speaker n had stopped talking were not included in the analysis. In addition, only gaze shifts that terminate in a fixation (200 ms of
infants follow gaze without regard for the social context, possibly without attending to the conversation at all. These alternatives are not exclusive. In fact, detection of gaze direction is essential both for gaze following and for the ability to differentiate between the two conversations contrasted in Experiment 1. The distinction is rather the degree to which infants are able to use gaze direction to decipher the perceived social interaction and selectively attend to the flow of conversation during face-to-face interactions. That is, if infants were more interested in social interactions than in similar situations that lack explicit social components (as expressed by mutual gaze), infants should follow the flow of the conversation to a higher degree, and make more gaze shifts between the two actors when the actors look into each other's eyes.
According to the alternative hypothesis, infants will make more gaze shifts between the actors in the condition where gaze following leads to the other individual engaging in mutual gaze, as illustrated in the face-to-face condition in the current study.
Experiment 2 is designed to address the issue of whether it is only the actors' gaze per se that infants are following. Infants are presented with a conversation in which two actors face the same direction (both actors either looking to the right or to the left), allowing a direct comparison between gaze shifts made from an actor looking toward her interaction partner and an actor looking out in the periphery. According to the social cognitive explanation, no differences should be found in the number of gaze shifts that follow the flow of conversation performed from either of the two actors. However, according to the gaze following explanation, which postulates that it is only the direction of gaze that infants attend to, without regard for the social context, infants would make fewer gaze shifts from the outward facing actor (infants would look to the periphery instead), than the number of
For both dependent variables (fixation durations and gaze shifts) statistical analysis was conducted using general linear models, with age (4, 6, and 11 months) as a between-subject variable and condition (face-to-face and back-to-back) as a within-subject repeated measure. For the fixation duration analysis the additional within-subject variable "speaker" was added, comparing fixation durations to the agent currently speaking and the agent currently being silent. Analysis of gaze shifts was followed by age-specific planned comparison repeated-measures t-tests. Preliminary analyses demonstrated no effects of presentation order across or within sessions; thus, the above-mentioned analysis was aggregated over order.

Results and discussion
The fixation duration analysis indicates that infants fixated either of the two AOIs (covering the two speakers) during 44.8% of the conversation in the face-to-face condition and during 37.8% of the conversation in the back-to-back condition. This suggests that the participating infants were, in general, more interested in the face-to-face interaction, although the difference between conditions was not significant, F(1,33) = 2.6, p = 0.12, $\eta_p^2 = 0.07$. Furthermore, there were no significant differences across age, F(2,33) = 2.76, p = 0.08, $\eta_p^2 = 0.14$. The marginal effect of age reflects a larger proportion of gaze data recorded inside either AOI at younger ages, ranging from 48.4% at 4 months to 36.9% at 11 months. At the same time infants, in both conditions, fixated the speaker (58.5%) to a higher degree than the non-speaker, F(1,1) = 53.24, p < 0.00001, $\eta_p^2 = 0.62$. No other main or interaction effects were significant.
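For readers who want to relate the reported F ratios to the partial eta squared values, the standard identity $\eta_p^2 = (F \cdot df_{effect}) / (F \cdot df_{effect} + df_{error})$ can be applied directly. The short Python sketch below uses a hypothetical helper name and checks the identity against the condition and age effects reported above; it is an illustration of the relationship, not a reconstruction of the authors' analysis pipeline.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Condition effect on fixation duration, F(1,33) = 2.6
print(round(partial_eta_squared(2.6, 1, 33), 2))   # 0.07
# Age effect on fixation duration, F(2,33) = 2.76
print(round(partial_eta_squared(2.76, 2, 33), 2))  # 0.14
```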
The analysis of gaze shifts demonstrates that infants performed more gaze shifts between the two actors, in accordance with the flow of the conversation, during face-to-face (n = 6.2) relative to back-to-back (n = 3.3) conditions, F(1,34) = 18.49, p < 0.00001, $\eta_p^2 = 0.37$. LSD post hoc tests demonstrate a significant difference between 4- and 11-month-old infants (p < 0.05). No significant interaction between age and condition was observed. Planned comparison t-tests demonstrate significant differences between conditions at 6 months, t(11) = 2.49, p = 0.03, d = 1.5, and at 11 months of age, t(11) = 4.09, p = 0.002, d = 2.5, but not at 4 months of age (see Figure 3).
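The effect sizes attached to the planned comparisons are numerically consistent with the common conversion d = 2t/√df. That the authors used this particular conversion is an assumption; the text does not state how d was obtained, and other definitions of d for repeated measures would give different values. A minimal Python sketch with a hypothetical helper name:

```python
import math


def cohens_d_from_t(t_value: float, df: int) -> float:
    """Convert a t statistic to Cohen's d using d = 2t / sqrt(df).

    Assumption: this conversion is used for illustration only; the paper
    does not say which formula produced the reported d values.
    """
    return 2.0 * t_value / math.sqrt(df)


print(round(cohens_d_from_t(2.49, 11), 1))  # 6-month-olds: 1.5
print(round(cohens_d_from_t(4.09, 11), 1))  # 11-month-olds: 2.5
```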
Experiment 1 suggests that infants discriminate between face-to-face and back-to-back conversations from approximately 6 months of age. Infants do not differ in the amount of time spent fixating these events, only in the degree to which they follow the flow of the conversation. Two possible interpretations are available at this point. Infants might develop expectations about how social interactions are performed between 4 and 6 months of age. In this respect, infants attend to the transitions of the face-to-face interactions to a higher degree than to those of back-to-back conversations. This alternative is referred to as the social cognitive explanation.
Alternatively, infants follow the actors' gaze direction without paying attention to the conversation (recent findings demonstrate that infants are able to follow others' gaze at 6 months of age; Gredebäck et al., 2008; Senju and Csibra, 2008; Gredebäck et al., in press). According to this suggestion infants might produce ample gaze shifts between the actors in the face-to-face condition and several gaze shifts from each speaker to the periphery of the screen (or off-screen) in the back-to-back condition. According to this interpretation, referred to as the gaze following explanation,

Figure 3 | Number of gaze shifts performed in accordance with the flow of conversation in face-to-face (closed circles) and back-to-back (open squares) conditions. Error bars represent SE.
also attended to the actor facing outward (60%) more than the actor facing inward, F(1,13) = 8.91, p < 0.01, $\eta_p^2 = 0.41$. No interaction effect was observed.
In addition, no differences were observed between left and right facing conversations with respect to looking time. However, infants performed more gaze shifts in accordance with the flow of conversation when both actors faced right, relative to left, t(13) = −2.21, p < 0.05.
These findings illustrate that infants' visual attention to social interaction is not primarily guided by the actors' gaze direction without regard for the social context in which the conversation occurs. Instead, infants are equally likely to make gaze shifts from both actors, irrespective of the individual's gaze direction. This finding therefore does not support the alternative gaze following hypothesis suggested in Experiment 1, leaving the social cognitive hypothesis as one possible explanation. If gaze following alone were the reason for the ability to follow the flow of the conversation in the face-to-face condition of Experiment 1, one would expect the infants in the current experiment to make more gaze shifts from the inward facing actor (who looks toward the other actor) than from the outward facing actor. This was not the case. On the contrary, the infants made an equal number of gaze shifts between the two actors, regardless of the direction the actors were facing. This finding supports the social cognitive interpretation.

General discussion
In everyday life, communicative acts more often occur between people who are looking at each other than between people who are looking away. Experiment 1 indicates that infants to some extent discriminate between face-to-face and back-to-back interactions at 6 months of age by paying attention to the transitions of the conversation primarily when the two interaction partners look at each other. The same difference in gaze pattern between face-to-face and back-to-back conversations is suggested at 11 months, but not at 4 months of age. From 6 months of age, infants seem to prefer to follow the flow of conversation when observing social interactions that reflect more familiar and common ways of communicating than the back-to-back situation.
It has previously been argued that young infants' scanning patterns, as measured with eye tracking, are influenced by social cognitive motives, more specifically an interest in others' preferences (Senju and Csibra, 2008; Gredebäck et al., in press). We argue that similar processes might be operational during passive observation of everyday social interactions performed by others, as suggested by the social cognitive explanation. According to this perspective young infants might use others' gaze direction to decipher the intentions and goals of others, relying on social context to detect communicative acts. Being interested in social interactions and the preferences of others motivates infants to devote attention to face-to-face interactions and the social turn taking that is a natural component of many everyday social contexts.
This early sensitivity to interacting individuals might be influenced by infants' experience with social interactions within their environment. Not only do adults engage children in conversations, infants also observe their caretakers engage in
gaze shifts made from the actor facing her conversation partner. Confirmation of the gaze following hypothesis could then possibly explain the difference between the two conditions in Experiment 1. Contrary to this, if equal numbers of infant gaze shifts are made from both actors in the conversation (regardless of their direction of gaze), gaze following alone cannot explain the results in Experiment 2. To test this hypothesis, Experiment 2 focuses on 6-month-olds, as this was the earliest age at which infants differentiated between conditions.

Participants
Fourteen 6-month-olds (M = 200 days, SD = 25 days, 6 girls) participated in Experiment 2. Recruitment procedures were identical to Experiment 1. All participating infants were included in the final sample.

Stimuli and apparatus
The same apparatus and stimuli were used as in Experiment 1, with one exception. In the movie presented to participants of Experiment 2, one of the actors was rotated horizontally, relative to the face-to-face conversation, so that she turned outward following the initial "hello." Through this manipulation one actor looked outward (away from her interaction partner) whereas the other actor looked inward, toward her interaction partner's back.

Procedure
Infants were presented with movies in which both actors either turned left or right. The stimulus was presented twice in succession. The direction of the two actors (left or right) and the identity of the inward and outward turning actors were counterbalanced and each participant observed one movie with both actors turning to the right and another with both actors turning left.

Data reduction
Analysis of variance (GLM) compares fixation durations between the two actors (inward and outward facing) and between the speaker and non-speaker. Gaze shifts between the two actors (comparing gaze shifts from the inward to the outward facing actor with gaze shifts from the outward to the inward facing actor) were analyzed with a paired sample t-test. In other words, the same analysis was used for both Experiment 1 and 2, with the addition of the two individuals (inward- and outward-facing) as a variable in the statistical analysis in the current experiment. No order effects were observed for either looking time or gaze shifts, and the following analysis was aggregated over this variable.
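The paired comparison of gaze shifts can be illustrated with a small Python sketch. The function name is hypothetical, and the counts below are made-up values for four imaginary infants chosen only to show the computation; they are not data from the study.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return (t, df) for a paired-sample t-test on the differences x - y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with n >= 2")
    diffs = [a - b for a, b in zip(x, y)]
    # Standard error of the mean difference (sample SD with n - 1).
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se, len(diffs) - 1


# Made-up gaze shift counts per infant, NOT data from the study.
from_inward = [2, 3, 1, 2]    # shifts starting at the inward facing actor
from_outward = [1, 3, 2, 1]   # shifts starting at the outward facing actor
t, df = paired_t(from_inward, from_outward)
print(f"t({df}) = {t:.2f}")
```

Each infant contributes one difference score, so the test has n − 1 degrees of freedom, which matches the t(13) reported for the 14 infants in this experiment.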

Results and discussion
Infants made an equal number of gaze shifts between the two actors, in accordance with the flow of conversation, when comparing gaze shifts performed from the speaker turning inward to the outward turning actor (n = 2.2) and the other way around (from outward to inward facing speaker, n = 2.0). Furthermore, infants fixated the AOIs covering the two speakers for 55.9% of the time. This time was spent fixating the speaker (59%) to a higher degree than the non-speaker, F(1,13) = 11.79, p = 0.004, $\eta_p^2 = 0.48$. They
In line with the above argument about prediction of turn taking, it is important to account for the fact that infants in the current study might not understand all the facets of the conversational patterns per se. Instead, the present study demonstrates an early emerging sensitivity to important components of social interactions between third parties. That is, infants might not yet understand the essence of a conversation between two people, but rather be aware of some of the prerequisites that give meaning to social interactions (e.g., facing the person one talks to). This understanding might be based on any of a series of cues that are confounded in the current study, and in most aspects of social life outside the lab. In the current study gaze direction, face orientation, and body orientation all point in the same direction. At the same time, subtle social cues such as nodding toward another might also play an important part in creating the perception of two individuals facing each other or facing opposite directions. Another point of interest is whether children showed more gaze shifts in general for the face-to-face condition, regardless of turn taking. Although the fixation data do not suggest a mere preference for the face-to-face condition, it could be that the act of conversing combined with the face-to-face orientation elicits more interest in general, irrespective of turn taking.
Future research is needed to disentangle these cues in order to gain a more complete understanding of the individual variables that help define the current context.
In summary, the current paper represents one of few papers that attempt to enhance our understanding of how young infants perceive complex everyday social interactions involving two people who converse independently of the infant. The current study suggests that gaze following directed toward social interactions surfaces between 4 and 6 months of age, an age where gaze following abilities also emerge (Gredebäck et al., in press). At the same time, a supplementary experiment demonstrates that gaze following alone cannot explain the preference for direct face-to-face conversations. Instead, we argue that detection of gaze direction is used to aid the infants' understanding of meaningful, conventional social interactions, arguing for a social cognitive interpretation of infants' sensitivity to social interaction from 6 months of age.

Acknowledgments
This research was supported by grants from the Norwegian Directorate for Children, Youth and Family Affairs (06/34707) and Marie-Curie ITN RobotDoc (2010-2013).
conversations and social interactions with other adults. Such a hypothesis lends itself to a social cognitive explanation which is experience based, in harmony with prior studies demonstrating a clear experience dependency in other, but related, action understanding abilities (Sommerville et al., 2005, 2008; Falck-Ytter et al., 2006). This finding has interesting parallels to a recent study of prosocial behavior in 18-month-olds (Over and Carpenter, 2009). They demonstrate that infants have a higher tendency to act in a prosocial manner if they previously (in a different context) have been presented with two dolls that look at each other than if they have been presented with dolls standing next to each other but looking away from each other. The present paper suggests that the sensitivity to the direction of interacting others develops much earlier, about 2 months after infants' first tendencies to follow others' gaze (at 4 months; Gredebäck et al., in press) and at the same time as infants react to irrational social interactions with enhanced pupil dilation (at 6 months; Gredebäck and Melinder, 2010).
At the same time little evidence supports the alternative, context independent, gaze following explanation. Infants spend an equal amount of time looking at the two actors in Experiment 2, and they make an equal number of gaze shifts from an actor facing the back of her interaction partner and an actor facing away from her interaction partner.
Most likely the selective attention to face-to-face conversations observed in the current study represents only the first steps in the development of conversation understanding, an ability that might be influenced by the development of language and a comprehension of formal grammatical rules. One important step involves the ability to anticipate the flow of conversations by fixating the next speaker just before (s)he starts to speak. According to von Hofsten et al. (2009), 3-year-olds are able to do this. None of the age groups tested in the current paper demonstrate consistent predictive behavior (according to pilot analysis of timing). Most likely, infants start by developing a sensitivity, and selective attention, to face-to-face conversations (in accordance with the current findings), and the ability to predict the transitions of a conversation (von Hofsten et al., 2009) develops later. The actual onset of this ability is not currently known.