Unintentional synchronization with self-avatar for upper- and lower-body movements

The subjective experience of embodying an avatar when immersed in virtual reality (VR) is known to support the sense of presence and to help with the interaction in a virtual environment. Virtual embodiment is often thought of as the consequence of replacement of the physical body by a virtual one, with a sense of agency for the avatar obtained by making the avatar’s body follow the user’s movements. This unidirectional motor link was, however, challenged by studies observing the opposite effect under different circumstances, for example, in a slow-motion context or when an arm movement was snapped on a predefined axis. These reports are, however, still rare or anecdotal. With the idea of a generalized bidirectional relationship between the user and the avatar in mind, we established a methodology to systematically provoke and study the circumstances under which participants follow the movements of their avatar during long repetitive movements without having been instructed to do so. A preliminary study confirmed that our virtual experimental setup, using full-body motion capture, avatar animation, and virtual mirrors, supports a strong sense of agency and body ownership for the avatar while enabling the experimental manipulation of the avatar’s movement. In the main experimental study, where participants performed repetitive upper- and lower-body movements while their avatar animations were either congruent or out-of-phase, we observed that almost all participants synchronized with their avatar at least once, for ∼ 47 % of trials for lower limb movements and ∼ 38 % for upper limb movements. Participants still reported low agency and ownership for the avatar under the incongruent condition, but, most interestingly, some of them also reported that their movements were not influenced by the avatar despite the behavioral effect. 
Our methodological approach and results contribute to the characterization of the conditions of occurrence of the self-avatar follower effect and, thereby, to identifying an enriched interaction design for VR, involving complex avatar–user mutual interdependencies.

embodiment is of particular interest as it leads not only to a better understanding of the specifics of the relationship between one and one's own body but also to an understanding of how this transposes to virtual bodies and avatars. Maybe even more importantly, this also allows identifying the manipulations that can be applied to virtual avatars without breaking the experience of embodiment and offers a methodological and scientific basis for the design and improvement of interaction in VR.
In a virtual environment, when users are provided with a virtual avatar that they can control and see from a first-person perspective, they usually experience a sense of embodiment towards the displayed virtual body. While small disruptions of the congruency between the user's and avatar's movements can be tolerated (Porssut et al., 2019), large disruptions can strongly affect the user's sense of agency, which may in turn cause a break in embodiment (Kokkinara and Slater, 2014; Porssut et al., 2019; Porssut et al., 2022a; Porssut et al., 2022b). However, recent findings show that, in the context of movement distortion, and when the system allows them to do so, users can unintentionally follow the movements of their avatar in order to reduce potential multi-sensory discrepancies between the (real) body cues and the visual feedback (Debarba et al., 2015; Debarba et al., 2017; Rietzler et al., 2017; Burin et al., 2019; Cohn et al., 2020; Gonzalez-Franco et al., 2020; Zhao et al., 2021). This phenomenon, named the "self-avatar follower effect", shows that despite the avatar-user motor relationship being commonly considered as unidirectional (i.e., users control their avatar), this motor relationship can become bidirectional under specific conditions, thereby shedding light on the mechanisms underlying the production of unintentional movements.
The present work thus aims at deepening our knowledge about this bidirectional motor relationship by identifying the critical factors allowing for the emergence of this phenomenon. More specifically, we focus here on the user-avatar synchronization and on the disruptions leading users to synchronize with their avatar as opposed to those leading to a break in embodiment. To that end, we established experimental paradigms supporting avatar embodiment with full-body agency and self-view in virtual mirrors and conducted studies to validate our experimental environment and study the impact of different movement conditions.

Related work
The sense of embodiment (SoE) for an avatar body in virtual reality refers to the subjective feeling arising from the sense of body ownership, the sense of agency, and the sense of self-location inside that virtual body (Kilteni et al., 2012; Herbelin et al., 2016). First, the sense of body ownership (SoO) refers to the subjective feeling that a virtual body part or an entire virtual body is one's own. Second, the sense of agency (SoA) designates the subjective experience of being the agent causing the actions made by an avatar (Tsakiris et al., 2006; Haggard and Chambon, 2012). High SoA can be achieved by animating the virtual body in synchrony with one's own movements. Of note, as the body tracking fidelity increases, the embodiment level increases too (Eubanks et al., 2020). Finally, the sense of self-location refers to the experience of being located in space, which usually coincides with the location of the virtual body when experienced from a first-person perspective (1PP) (Maselli, 2015; Debarba et al., 2017). Although some distortions can be tolerated without breaking the sense of embodiment, when one of its components is strongly disrupted, a break in embodiment occurs, which should be avoided for an optimal experience in VR (Kokkinara and Slater, 2014; Porssut et al., 2019; Porssut et al., 2022a; Porssut et al., 2022b). For example, the avatar's movements can be distorted without breaking the sense of agency (Farrer et al., 2008; Kokkinara et al., 2015; Kasahara et al., 2017; Rietzler et al., 2017; Galvan Debarba et al., 2018; Porssut et al., 2019; Porssut et al., 2021) and can, therefore, be used to improve, for example, interactions with 3D objects in VR (Burns et al., 2005; Azmandian et al., 2016) or navigation (Nguyen et al., 2020).
To strengthen the embodiment of full-body avatars, numerous studies in immersive virtual reality use virtual mirrors, as they allow participants to get used to the avatar provided to them by seeing their virtual body animated by their own motions (Jun et al., 2018; Fribourg et al., 2020; Gorisse et al., 2020). Gonzalez-Franco et al. (2010) even showed that participants can experience SoE for their avatars when presented only in a virtual mirror facing them, even in the absence of a virtual body in a first-person perspective. However, it was conversely observed that the feedback provided by a virtual mirror can have a significant influence on the experienced agency (Fribourg et al., 2018) and on the perception of animation inaccuracies and artifacts (Ito et al., 2019; Koilias et al., 2019). Indeed, the additional information on the virtual body movement provided by mirrors can influence the SoA either positively or negatively depending on the experimental conditions and setup. Concerning its possible negative impact, it has been hypothesized that virtual mirrors might distract participants, highlight animation imperfections and the differences between the avatar and the user's appearance, or even emphasize a possible uncanny valley effect (Mori et al., 2012).
In addition, for embodiment purposes, mirrors were also used to induce discrepancies between seen and actual movements in order to study the range of the resulting sensory disturbances. The visual mismatch achieved by this type of setup was found to be a potential cause of subjective discomfort in healthy adults, ranging from tingling sensations to mild pain (McCabe et al., 2005). On the other hand, in a VR context, combining a slow passive movement of one arm and visual feedback showing both arms moving can induce an illusion of movement of both arms, namely, a kinesthetic illusion (Giroux et al., 2019; Giroux et al., 2021). Thus, incongruencies between the seen and actual movement can induce subjective discomfort and distort one's perception of one's movements. Interestingly, in setups in which users can act to reduce the experienced visuomotor discrepancy, a tendency was observed towards an active resolution of the conflict. The first hints of this phenomenon were anecdotally reported by Wegner et al. (2004) in a physical setup in which participants could see themselves in a full-length mirror placed in front of them, while their arms were hidden under a smock and "replaced" by the arms of a confederate standing behind them. While the confederate performed hand movements, the authors observed some participants making small movements under the smock.
In a VR context, this type of motor behavior was named the self-avatar follower effect (FE). Specifically, it was proposed to designate the tendency of VR users to follow the movements of their avatars in order to reduce potential multi-sensory discrepancies between the (real) body cues and the visual feedback. In the context of goal-directed arm movement in virtual reality, the follower effect has been observed for visuo-proprioceptive discrepancies (Gonzalez-Franco et al., 2020).

Frontiers in Virtual Reality
frontiersin.org
Participants were asked to reach a static target located in front of them, while a distortion was applied in order to horizontally shift the final position of the hand. In this context, it was observed that participants adjusted their pose in order to reduce the spatial offset between their real arm and the avatar's arm. Similar observations were made in the context of a drawing task in VR, showing that a strong ownership associated with a mismatch between the intended and the seen movement could cause an "attraction" of one's motor actions towards the seen movement (Burin et al., 2019). Because participants were not aware of their actual motor performance, this suggests a conscious detection of the distortion but an unintentional motor side-effect. Extending it to a full-body interaction context, Rietzler et al. (2017) studied the influence of time distortion in VR and showed that when alternating between real-time animation and slow-motion, participants adapt their movement speed to match the slow-motion restriction and maintain visuo-motor coherence between themselves and their avatar. The same kind of result was anecdotally observed by Debarba et al. (2015); Debarba et al. (2017) when providing participants with a pre-animated full-body avatar viewed from a first-person perspective. Some participants would follow their avatar either in a very explicit way, by fully extending their arms when the avatar was reaching for the targets, or in more subtle ways, by entering a similar phase or rhythm while the avatar was walking. Likewise, Zhao et al. (2021) anecdotally reported that participants would sometimes adapt their walking-in-place movement to follow the gait of the pre-animated avatar provided to them.
As previously discussed by Gonzalez-Franco et al. (2020), the follower effect could be considered as related to the motor contagion phenomenon, which was found to be accentuated by the feeling of ownership toward the seen body (Burin et al., 2019). It could also be related to mimicry, which is, however, more of a social phenomenon (Raafat et al., 2009) and designates the influence of others on individual behavior. In this respect, mimicry differs from the self-avatar follower effect, as the latter term specifically describes the influence of the virtual self-body movements on one's actual actions.
From a cognitive point of view, the follower effect has been proposed to arise from the need to minimize multisensory conflicts induced by the embodiment of an avatar altering the self-body perception (Maselli et al., 2022), which would emphasize the self-specificity of the follower effect, compared to mimicry or motor contagion. This hypothesis is grounded in the active inference theory, in which actions are understood as participating in the global goal of prediction error minimization (Friston et al., 2010; Sajid et al., 2021). Indeed, in the follower effect context, as one cannot influence the visual feedback in a way that decreases the multisensory conflict, one can do so by moving to narrow the gap between the seen body and the real one. Congruently with this interpretation, a recently proposed computational model of movement control shows fundamental differences between the types of prediction errors driving intentional and "conflict-resolution" movements, thereby suggesting a novel explanation for the unintentional aspect of the self-avatar follower effect (Maselli et al., 2022).
Put together, those observations provide heterogeneous but compelling evidence for considering a bilateral motor relationship and mutual influences between users and their embodied avatars through the diverse visual feedback that can be provided in VR. Aiming to systematically study the conditions of occurrence of this phenomenon, we conducted an experiment to investigate whether switching to prerecorded avatar movements can influence users to synchronize with their avatars during repetitive upper- and lower-body movements. Because of the potentially uncontrolled influence of the virtual mirror view on our manipulation, we first conducted a preliminary study to validate our experimental setup with full-length virtual mirrors.

Virtual scene
The experiments were fully conducted in VR by a virtual experimenter, presenting themself as Dr. Wood, who was animated using pre-recorded movements. The virtual experimenter communicated with the participants using pre-recorded voice samples, chosen in real-time by the real experimenter, following a Wizard-of-Oz setup. The virtual scene included a full-length mirror with a two-part frame (Figure 1). The left part of the frame corresponded to the participant's mirror side, while the right part was dedicated to Dr. Wood. Both frames could be turned on to glow blue in order to indicate the current state of the task (cf. subsection 4.1). In order to avoid an uncanny valley effect, a gender-neutral nonrealistic wooden mannequin was chosen, both for the virtual experimenter and for the participant's avatar (Figure 1). Indeed, relatively abstract avatars, such as our wooden mannequin or a robotic equivalent, were shown to efficiently compete with more realistic avatars in terms of a sense of embodiment (Latoschik et al., 2017; Fribourg et al., 2020). They also offer the advantage of making the potential motion artifacts less noticeable as one does not expect highly sophisticated motions from a simple humanoid model (Argelaguet et al., 2016).

Avatar animation
Participants were equipped with a set of seven HTC Vive Trackers: one per hand, elbow, and foot, and a last one tracking the pelvis. The acquired tracking data were used to compute the pose of the avatar's skeleton using the VRIK solver from the Final IK Unity package. Two types of corrections were applied to the raw tracking data in order to compensate for the inaccuracy of the avatar's skeleton structure and correct the resulting avatar poses. First, the head and pelvis target heights were clamped to the height initialized when the participant performed the calibration T-pose. Second, the leg and arm lengths were automatically modified online when necessary. This correction prevented interpenetration between the avatar's feet and the virtual floor and ensured the virtual arms of the avatar were fully extended when the participant stood with arms along their body.
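The first correction described above can be illustrated with a minimal Python sketch. It assumes that "clamped" means the calibrated T-pose height acts as an upper bound on the tracked target height; the function name and the example values are hypothetical, and the actual setup relied on the VRIK solver in Unity rather than on this code.

```python
def clamp_target_height(raw_height: float, calibrated_height: float) -> float:
    """Return the tracked height, limited to the height measured during the T-pose.

    Assumption: the calibrated height is used as an upper bound only.
    """
    return min(raw_height, calibrated_height)


# Hypothetical example: a head target drifting 5 cm above the calibrated height.
corrected_head = clamp_target_height(raw_height=1.80, calibrated_height=1.75)
```

Under this assumption, heights below the calibrated value (for example, when a participant bends their knees) pass through unchanged.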

Mirror paradigm and questionnaires
For each movement, Dr. Wood first performed it in front of the mirror. During this phase, the corresponding mirror's frame was glowing, thereby indicating to the participants where to look (Figure 1). Then, Dr. Wood's frame turned off and the frame of the mirror facing the participant turned on. A fixation cross appeared (at knee reflection height) to remind participants where to look when performing the movement. When the cross disappeared (after 1 s), the participants had to repeat the movement they were just shown.
To avoid breaking the VR experience, the questionnaires were presented in VR. Participants answered using virtual sliders and buttons while still being embodied in their avatars (Figure 1).

Preliminary study

4.1 Task and experimental conditions
The basis of our experimental manipulation was to ask participants to repeat a movement previously demonstrated to them by a virtual character (the virtual experimenter called Dr. Wood) facing them through a mirror. To validate our experimental setup and to evaluate the influence of mirrors on the sense of embodiment in our experiment, we investigated two factors in a within-subject design.
• Mirror: with or without. In the condition with a mirror, participants were facing a mirror and could see the reflection of the avatar's body facing them.
• Congruency: avatar movement congruent with the participant's movement or incongruent (pre-recorded participant's movements).
Each condition was evaluated through two blocks, leading to eight blocks presented in pseudo-random order. Each block was composed of five simple movements, first shown by Dr. Wood, and then repeated by the participant. The five movements, randomly ordered inside each block, were the following: lift the right arm, lift the left arm, lift both arms, lift the right knee, and finally, lift the left knee.
To ensure the participants would see a first-person view of the movement performed by the avatar in the incongruent condition even without a mirror, the mapping between the movement to perform and the pre-recorded movement executed by the avatar followed a predetermined pattern: when the movement to perform was a leg movement (resp. an arm movement), the avatar performed an arm movement (resp. a leg movement). In the particular case of the "lift both arms" movement, the avatar would only lift one arm. After each block, participants filled out a questionnaire assessing their sense of ownership of the avatar and their sense of agency (same items as in Gonzalez-Franco and Peck (2018), each ranging from −3 to +3, see Figure 1). Self-location was not evaluated to shorten the questionnaire because no change is expected as the first-person view is maintained throughout the experiment. Details about the materials, methods, and procedure are in section 3.
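The predetermined mapping described above can be summarized in a short Python sketch. The movement labels and the function name are hypothetical, and the sketch only returns the category of the replayed movement, since the text does not specify which limb was replayed in each case.

```python
def incongruent_replay_category(instructed: str) -> str:
    """Category of pre-recorded movement replayed by the avatar (incongruent condition)."""
    if instructed == "lift both arms":
        # Special case stated in the text: the avatar lifts only one arm.
        return "single-arm movement"
    if "knee" in instructed:
        # Leg movement instructed, so the avatar performs an arm movement.
        return "arm movement"
    if "arm" in instructed:
        # Arm movement instructed, so the avatar performs a leg movement.
        return "leg movement"
    raise ValueError(f"unknown movement: {instructed}")


movements = [
    "lift the right arm",
    "lift the left arm",
    "lift both arms",
    "lift the right knee",
    "lift the left knee",
]
mapping = {m: incongruent_replay_category(m) for m in movements}
```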

Procedure
The preliminary experiment proceeded as follows. The participant arrived in the lab, performed a quick balance test, signed the consent form, and filled out a demographic questionnaire. They were equipped with the Vive Trackers and the head-mounted display (HMD) and given two tennis balls to hold in their hands to prevent finger movements and visually match their virtual wooden mannequin avatar (Figure 1). After the task was explained, the experimenter started the VR experiment and performed the calibration phase: the avatar was globally rescaled to match the participant's height, and its arm and leg lengths were adjusted. Once the avatar was calibrated, a small acclimatization phase was conducted by Dr. Wood. Participants were instructed to look at Dr. Wood doing simple movements in front of the mirror and to mimic them. The training phase followed, in which the participants were trained to perform a block, including filling out the questionnaire. During this phase, their movements were recorded in order to replay them later in the main blocks (incongruent condition). The eight main blocks were then performed. Dr. Wood then thanked the participants for participating and told them they could remove the headset. Once the HMD and Vive Trackers were removed, a small debriefing was conducted by the real experimenter.

Hypotheses
We expected the movement congruency to act as a controlling factor and to have a main effect on both ownership and agency scores, with congruent movements leading to a better sense of embodiment (H0a). We presumed the mirror factor to interact with the movement factor: we hypothesized the mirror could have an impact only in the

Results and synthesis
A total of 31 healthy participants (12 women) were recruited (mean age = 23, sd = 4). Agency and ownership scores were computed using Gonzalez-Franco and Peck (2018)'s formulas, ranging from −12 to 12 and from −15 to 15, respectively. Both were analyzed separately using the same procedure. Normality of the samples was assumed given the sample size and checked using Q-Q plots of the data. In case of multiple comparisons, p-values were corrected with the conservative Bonferroni method and reported using the following notation: p_corr.
Concerning the sense of agency, a 2 × 2 repeated-measures ANOVA revealed a main effect of the movement factor (congruent vs. incongruent; F(1, 30) ≃ 232; p < 0.001). Neither the main effect of mirror nor the interaction was significant (resp. p ≃ 0.53 and p ≃ 0.86). Pairwise t-tests confirmed a significant effect of movement congruency for both mirror conditions (both p, p_corr < 0.001) and did not reach significance when tested for a mirror effect for both movement conditions (congruent movement: p ≃ 0.77; incongruent movement: p ≃ 0.42) (Figure 2).
Similarly, for the sense of ownership, a 2 × 2 repeated-measures ANOVA revealed a main effect of the movement factor (F(1, 30) ≃ 183; p < 0.001) as well as a significant interaction with the mirror factor (F(1, 30) ≃ 5.73; p = 0.023). No main effect of the mirror factor was found (p ≃ 0.99). Pairwise t-tests confirmed a significant effect of movement congruency for both mirror conditions (both p, p_corr < 0.001, see Figure 2). However, the interaction was not confirmed as the pairwise t-tests did not reach significance when testing for a mirror effect for each movement condition (congruent: p ≃ 0.061; incongruent: p ≃ 0.175).
To sum up, the analysis confirmed (H0a) by revealing a main effect of movement congruency on both the agency and ownership scores, with congruent movements leading to a higher sense of embodiment. Conversely, (H0b) was rejected as the mirror did not have a strong impact on ownership or on agency, and the interaction between movement congruency and the mirror factor did not withstand the post hoc analysis. We can thus conclude that our setup is efficient in terms of eliciting embodiment towards the avatar when its movements are congruent with those of the participant. This confirms that our selection of avatar shape and motion-tracking algorithms can be used for further experimentation on embodiment. Moreover, the fact that the presence of mirrors did not affect participants' judgement of embodiment indicates that mirrors can only be supportive for our experimental study, more specifically to ensure that movements of the avatar are clearly seen when the user is in first-person view.
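As a reminder of what the Bonferroni correction does, the following minimal Python sketch multiplies each p-value by the number of comparisons and caps the result at 1. The function name and the example p-values are hypothetical and are not taken from the study's data.

```python
def bonferroni(p_values):
    """Bonferroni-corrected p-values: p * m, capped at 1, for m comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical uncorrected p-values for a family of two pairwise tests.
corrected = bonferroni([0.01, 0.6])
```

The cap at 1 keeps the corrected values interpretable as probabilities when the raw p-value is large.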

Main study
The goal of the main experiment was to investigate the possibility of inducing a follower effect for both upper- and lower-body movements. The experimental setup, virtual scene, and avatar animation were the same as in the preliminary study.

Task and experimental conditions
We investigated whether some specific conditions and instructions can lead users to spontaneously synchronize with their avatars during the execution of repetitive upper- and lower-body movements, despite the avatar executing pre-recorded movements that are not congruent with those of the users. For that purpose, we asked participants to repeat, after the demonstration by Dr. Wood, the following movements:
• Right/left arm: First, lifting the right arm until it is stretched out to shoulder height, then lowering it back down to the resting position and, second, doing the same with the left arm.
• Left/right arm: Opposite movement to the "right/left arm" one: same movement, but in reversed order.
• Right/left step: First, raising the right foot off the ground and returning to the resting position and then doing the same with the left foot. This movement mimics a walking-in-place gesture.
• Left/right step: Opposite movement to the "right/left step" one: same movement, but in reversed order.
Then, as we expected to trigger the follower effect during repetitive movements, the experiment considered two movement durations: movements had to be executed either once (short condition) or repeated over 20 s (long condition). For long trials, Dr. Wood first showed the movement to be performed during the 20 s. Participants were instructed to avoid counting the exact number of repetitions executed by the virtual experimenter but instead to go at their own pace until instructed to stop (see Section 5.2). Finally, in order to study the possible occurrence of a follower effect, the avatar's movements could either follow the participant's movements (congruent condition) or replay the "opposite" movement as previously recorded (incongruent condition). Participants were specifically instructed to pay attention to the order of the limb movement demonstrated by Dr. Wood (e.g., first the left arm, then the right arm). The incongruent condition was therefore designed to have the replayed movement done by the avatar start with the opposite limb to the one instructed to the participant, thus leading to out-of-phase movements between the avatar and the user. Of note, in the incongruent condition, the head was kept tracked to avoid motion sickness.
Two blocks of 16 trials each were generated (two upper-body movements × two lower-body movements × two avatar movement conditions × two duration conditions). In each block, the trial order was randomized. As incongruent motion is known to induce disembodiment and displeasure (Jun et al., 2018), embodiment and pleasure/displeasure measurements were collected after each trial through four in-VR questions using continuous scales ranging from 0 to 1. The ownership and agency questions were taken from Kalckert and Ehrsson (2014) and Fribourg et al. (2021): "I felt as if the virtual body was my body." (ownership); "The virtual body moved just like I wanted to, as if it was obeying my will." (agency); "I felt as if I had no longer a body, as if my body had disappeared." (negative ownership). To assess the participants' subjective feeling of pleasure/displeasure, we used the affective slider from Betella and Verschure (2016): participants were asked to rate how they felt during the trial using a simple continuous scale between a frowning emoji face and a smiling emoji face. The three embodiment questions were randomized and presented before the affective slider was used. In order to measure the follower effect, both the tracking data of the participants and the movements of their avatars were recorded.
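The block structure described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the software used in the study: the labels, the `make_block` helper, and the seeds are hypothetical. It treats the four movements (two upper-body, two lower-body) as one factor crossed with the two avatar movement conditions and the two durations, which yields the 16 trials per block.

```python
import itertools
import random

MOVEMENTS = ["right/left arm", "left/right arm", "right/left step", "left/right step"]
AVATAR_CONDITIONS = ["congruent", "incongruent"]
DURATIONS = ["short", "long"]


def make_block(rng: random.Random):
    """Return one block: every movement x avatar condition x duration, shuffled."""
    trials = list(itertools.product(MOVEMENTS, AVATAR_CONDITIONS, DURATIONS))
    rng.shuffle(trials)  # randomize the trial order inside the block
    return trials


# Two blocks per participant; the seeds are arbitrary and only make the sketch reproducible.
blocks = [make_block(random.Random(seed)) for seed in (0, 1)]
total_trials = sum(len(block) for block in blocks)
```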

Procedure
Upon their arrival, participants were asked to demonstrate their ability to stand on one foot while lifting the other for a few seconds to ensure good balance for the walking-in-place movements. They then underwent a simple VR stereoscopic test, signed a consent form, and filled out a demographic questionnaire. Similarly to the preliminary experiment, they were equipped with Vive Trackers and HMD, and given two tennis balls to hold in their hands. After the calibration and acclimatization phases, the training started, serving two purposes. First, it showed participants the course of a trial and the movements they were asked to reproduce throughout the experiment and made sure they performed them correctly. Second, it was used to record the participant's movements that would then be replayed in the incongruent condition of the experiment. Thus, during this phase, the avatar's movements were congruent with those of the participant. Participants were asked to perform every experimental movement shown by the virtual experimenter, each one in both the short and long version, thus leading to eight trials (see Section 5.1).
After the training, each participant went through two main blocks, thus performing 32 trials in total. Then, they removed the HMD and Vive Trackers and went through the post-experiment interview (see Supplementary Material).

Hypotheses
We first hypothesized that, as for the preliminary study, congruent motions would induce higher embodiment scores than incongruent motions (H1). Second, we similarly hypothesized that congruent motions would induce higher affective slider scores than incongruent motions (H2) (Jun et al., 2018).
Third, we made the additional hypothesis that the follower effect would be different for upper and lower limb movements. This was first based on some observations made in VR on the importance of foot tracking for the experience of controlling a full-body virtual avatar (Galvan Debarba et al., 2020) and the influence this has on leg movements (Pan and Steed, 2019). Concerning the direction of this hypothesis, we expected that the follower effect could occur more often during lower-body movements than during upper-body ones (H3); based on the observation that the walking-in-place movement performed by participants is close to locomotion, it could indeed be expected that well-known phenomena of automatic walking synchronization would occur spontaneously (e.g., people walking side by side, see Cheng et al. (2020)). This would be in line with studies investigating visuomotor conflicts during locomotion (Kannape and Blanke, 2012; Kannape and Blanke, 2013), which observed lower limb movement adjustments (attempts at synchronizing) when asynchronous avatar movements were presented to people while walking.
Finally, because following the avatar's movement would avoid breaking the virtuous cycle of interaction between perception of the virtual body and control of the real body, we expected the sense of embodiment to be higher in trials in which a follower effect occurs than in trials in which participants would not synchronize with their avatars (H4). Indeed, embodiment of the avatar is a necessary condition for leading to a potential follower effect. Conversely, following the avatar's movement would minimize the experienced multisensory discrepancy, which could help in avoiding a break in the embodiment.

Results
A total of 26 healthy volunteers were recruited for this study. Four were excluded for either explicitly challenging the limits of the tracking system during the incongruent trial or not following the instructions. The statistical analysis was thus conducted on 22 subjects (11 of who were women), aged from 18 to 27 (mean = 21.5, sd ≃ 2.2). All participants gave informed consent and the study was approved by our local ethic committee. The study and methods were carried out in accordance with the guidelines of the Declaration of Helsinki. The experiment lasted for about an hour and participants were compensated with 20 Swiss francs.
Movement data pre-processing was used in order to study upperand lower body trials in the same way, and because both types of Frontiers in Virtual Reality frontiersin.org movement consist of a raising motion of hands or feet, we focused on the y-axis (vertical) coordinate of the corresponding movements. Incongruent trial movement data plots were examined manually to spot invalid trials. In our case, a trial was declared invalid and thus excluded from the dataset if the participant did not follow the instructions or if a technical issue had resulted in the participant continuing the movement for a few seconds after the incongruent replay had ended. This latter issue would lead to congruent avatar movements being included in the trial data, therefore, making the following questionnaire unreliable. According to this methodology, 17 short incongruent trials and 8 long incongruent trials were removed from the dataset, letting 335 short and 344 long trials for the analysis. Follower effect data pre-processing In our context, a follower effect is reflected in the synchrony between the participant's actual movement and the one replayed by the avatar. Here we present the pre-processing of the data corresponding to the follower effect analysis, namely the long incongruent trials.
First, for each trial, we focused on the y-axis (vertical) coordinate of each movement. As both types of movement consist of a motion raising the hands or feet, this allows us to compute a follower effect metric in the same way for both upper- and lower-body trials. To isolate the region of interest in each trial and prevent noise from blurring the data, Python SciPy's peak detection algorithm was used to identify the highest point of each raising motion. The region of interest extended from one second before the first peak to one second after the last peak of the replayed movement. Examples of the delimitation of this region of interest can be seen in Figure 3, illustrated by vertical black lines. Due to tracking jitter, some samples had noisy areas at their beginning and/or end, which interfered with the peak detection algorithm and were thus cut manually.
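As an illustration of the region-of-interest step, the following minimal Python sketch delimits the window from one second before the first peak to one second after the last peak of a replayed movement. It is not the authors' code: the use of `scipy.signal.find_peaks`, the 50 Hz frame rate (inferred from 200 frames corresponding to approximately 4 s), the `prominence` value, the function name `region_of_interest`, and the synthetic signal are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import find_peaks

FPS = 50  # assumed frame rate (200 frames are reported to last about 4 s)


def region_of_interest(replay_y, fps=FPS, prominence=0.05):
    """Return (start, end) frame indices of the region of interest.

    The region runs from one second before the first detected peak to one
    second after the last detected peak, clipped to the signal bounds.
    """
    replay_y = np.asarray(replay_y, dtype=float)
    # prominence is a hypothetical setting used to ignore small jitter peaks
    peaks, _ = find_peaks(replay_y, prominence=prominence)
    if peaks.size == 0:
        raise ValueError("no raising motion detected in the replay")
    start = max(int(peaks[0]) - fps, 0)
    end = min(int(peaks[-1]) + fps, len(replay_y) - 1)
    return start, end


# Synthetic replay: 3 s of rest, five 2 s raising motions of 0.3 m, then rest.
t = np.arange(20 * FPS) / FPS
moving = (t >= 3) & (t < 13)
replay_y = np.where(moving, 0.15 * (1 - np.cos(np.pi * (t - 3))), 0.0)

start, end = region_of_interest(replay_y)
print(start, end)
```

On real tracking data, a prominence (or minimum height) setting of this kind would probably need tuning per movement type, which may be one reason noisy trial edges had to be trimmed by hand.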
Second, as the synchrony between the movements of the avatar and the participant is the targeted feature to measure, our follower effect score is computed based on the Pearson correlation between the replayed movement and the participant's actual movement. Using this measure, if the two y-coordinate curves are alternating, it leads to a negative correlation (see Figure 3A for an example). Conversely, if the two curves are overlapping, the resulting correlation is positive (e.g. Figure 3B).
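The sign convention can be checked on synthetic trajectories. The sketch below is only an illustration under assumed values (50 Hz sampling, a 2 s movement period, made-up amplitudes and variable names), not the study's data: one curve nearly overlaps the replay, the other alternates with it.

```python
import numpy as np

FPS = 50  # assumed frame rate
t = np.arange(10 * FPS) / FPS

# Hypothetical vertical trajectories of one hand (period 2 s, height in m).
replay = 0.15 * (1 - np.cos(np.pi * t))
overlapping = 0.15 * (1 - np.cos(np.pi * (t - 0.1)))  # lags the replay by 0.1 s
alternating = 0.15 * (1 - np.cos(np.pi * (t - 1.0)))  # half a period out of phase

# Pearson correlation between the replay and each simulated participant curve.
r_overlapping = np.corrcoef(replay, overlapping)[0, 1]
r_alternating = np.corrcoef(replay, alternating)[0, 1]
print(f"overlapping: {r_overlapping:+.2f}, alternating: {r_alternating:+.2f}")
```

The overlapping curve yields a correlation close to +1 and the alternating one a correlation close to -1, which is why the sign of the correlation can serve as a synchrony indicator.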
Following this idea, a sliding Pearson correlation (centered window of 200 frames, corresponding to approximately 4 s) was computed on each trial's region of interest. Figure 4A shows the corresponding mean and variance over all long incongruent trials. The total duration of 700 frames corresponds to the common region of interest of the long incongruent trials. We can see that during the first half of the trials, the correlation mean rises, which can be interpreted as an adaptation phase during which the participants' movements become, on average, increasingly correlated with the avatar replay over time. The mean correlation then stabilizes until the end of the trial. Consequently, we computed a follower effect score for each trial over the second half of the trial, delimited in Figure 3 by dashed lines. As each movement involved both hands (resp. both feet) moving, two scores were computed for each trial, one for each hand (resp. foot). If the scores of both hands (resp. feet) were greater than the threshold ϵ = 0.01, the trial was classified as a follower effect trial for the analysis (Figure 3A), otherwise as a no follower effect trial (Figure 3B). Each trial classification as well as tracking data plots are available in the Supplementary Material of this project.

Using this classification to distinguish between FE trials and no FE trials, the sliding correlation profiles of the FE trials were isolated and grouped depending on the movement type (Figure 4B). The corresponding profiles show a clear trend in the dynamics of the FE trials, with an increase in the correlation between the actual movements and the replayed movements in the beginning followed by stabilization at the end of the trials. This confirms the existence of a transition phase during which the participants' movements are, on average, becoming increasingly synchronous with their avatar's movements.

Effects of movement synchrony on embodiment and affective slider scores: Following the preliminary study results, as both agency and ownership were impacted in the same way by the avatar's incongruent movements, both scores were averaged to compute a sense of embodiment score for each trial, thus ranging from 0 to 1. To compare the sense of embodiment and affective slider scores across the movement synchrony condition, the trials were aggregated independent of their length. As the resulting samples were unbalanced due to some long incongruent trials being excluded from the dataset, one-sided permutation tests were used. Concerning the sense of embodiment, the test revealed a significant difference (p < 0.001), thus indicating a significantly higher sense of embodiment for congruent trials than for incongruent trials, thereby validating our (H1) hypothesis (Figure 5). Similarly, concerning the affective slider, the test indicated significantly higher scores for congruent trials than for incongruent trials (p < 0.001), thereby validating our (H2) hypothesis (Figure 5).
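Returning to the follower effect score defined earlier in this section, the sliding correlation and the threshold-based classification can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the text does not state exactly how the score over the second half is computed, so a plain Pearson correlation over that half is assumed here, and the function names, the 50 Hz frame rate, and the synthetic trial are hypothetical.

```python
import numpy as np

EPSILON = 0.01  # classification threshold reported in the text
WINDOW = 200    # centered window length in frames (about 4 s)


def sliding_pearson(a, b, window=WINDOW):
    """Pearson correlation of a and b in a window centered on each frame.

    Frames too close to the edges for a full window are left as NaN.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    half = window // 2
    out = np.full(len(a), np.nan)
    for i in range(half, len(a) - half):
        out[i] = np.corrcoef(a[i - half:i + half], b[i - half:i + half])[0, 1]
    return out


def follower_score(replay_y, actual_y):
    """Assumed score: Pearson correlation over the second half of the region."""
    mid = len(replay_y) // 2
    return np.corrcoef(replay_y[mid:], actual_y[mid:])[0, 1]


def is_follower_trial(replay_limbs, actual_limbs, eps=EPSILON):
    """A trial counts as a follower effect trial only if both limbs exceed eps."""
    return all(follower_score(r, a) > eps
               for r, a in zip(replay_limbs, actual_limbs))


# Synthetic 700-frame trial at an assumed 50 Hz: the participant starts in
# anti-phase and drifts into phase with the replay during the first half.
t = np.arange(700) / 50
phase = np.pi * np.clip(1 - t / 7, 0, 1)
replay_left, replay_right = np.sin(np.pi * t), -np.sin(np.pi * t)
actual_left, actual_right = np.sin(np.pi * t + phase), -np.sin(np.pi * t + phase)

profile = sliding_pearson(replay_left, actual_left)
followed = is_follower_trial([replay_left, replay_right],
                             [actual_left, actual_right])
print(followed, np.nanmin(profile), np.nanmax(profile))
```

Requiring both limbs to exceed the threshold keeps a trial in which only one hand or foot happens to line up with the replay from being counted as a follower effect trial.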
While examining the negative ownership question scores depending on the avatar's movement congruency, we noticed higher scores in the congruent condition (m ≃ 0.6) than in the incongruent condition (m ≃ 0.3). The difference was significant (two-sided permutation test, p < 0.001), thus indicating a behavior similar to that of the ownership scores, which differs from Fribourg et al. (2021). This questionnaire item, originally used for hand disownership, is thus not suited to evaluating full-body ownership.
FIGURE 4
Sliding correlation mean and variance with a centered window of 200 frames of (A) long incongruent trials; (B) long incongruent trials classified as FE depending on the movement type.

FIGURE 5
Sense of embodiment scores (left) and affective slider results (right) depending on the avatar's movement congruency.

Follower effect classification and comparison of the upper/lower body: Using our classification method, around 47% of leg movements and around 38% of arm movements were classified as follower effect trials. A Pearson's Chi-squared test revealed no significant difference between these proportions (p ≃ 0.22), thus failing to validate our (H3) hypothesis.

Link between follower-effect occurrence and embodiment: To compare the sense of embodiment scores depending on the follower-effect occurrence, only long incongruent trials were considered. As the samples used to compare were unbalanced, a one-sided permutation test was applied. The test revealed a significant difference (p < 0.001) between the samples, indicating a significantly higher sense of embodiment in trials in which a follower effect occurred, thereby validating our (H4) hypothesis (Figure 6).
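For readers unfamiliar with the one-sided permutation tests used in this section, a minimal sketch is given below. The paper does not specify the test statistic or the number of resamples, so the difference in means, the 10,000 resamples, the function name `permutation_test_greater`, and the score arrays are assumptions chosen purely for illustration.

```python
import numpy as np


def permutation_test_greater(x, y, n_resamples=10_000, seed=0):
    """One-sided permutation test for mean(x) > mean(y).

    Returns the observed difference in means and the permutation p-value.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    hits = 0
    for _ in range(n_resamples):
        shuffled = rng.permutation(pooled)
        diff = shuffled[:len(x)].mean() - shuffled[len(x):].mean()
        hits += diff >= observed
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_resamples + 1)


# Hypothetical, unbalanced embodiment scores in [0, 1] (not the study's data).
fe_scores = np.linspace(0.4, 0.8, 30)      # trials with a follower effect
no_fe_scores = np.linspace(0.1, 0.5, 50)   # trials without
diff, p_value = permutation_test_greater(fe_scores, no_fe_scores)
print(f"difference in means = {diff:.2f}, p = {p_value:.4f}")
```

Because the group labels are shuffled over the pooled sample, the test makes no assumption of equal group sizes, which suits the unbalanced samples described above. Recent SciPy versions also offer `scipy.stats.permutation_test` with `alternative="greater"` as a ready-made alternative.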

Discussion
First, as expected, our paradigm successfully shows that, for cyclic and repetitive movements, an asynchrony between the avatar's and the user's movements leads to a significantly lower sense of embodiment than when both are congruent (Figure 5). This result is consistent across both our studies and with previous studies about embodiment, thereby confirming that congruent movements are essential to experience agency and ownership toward a virtual avatar. Similarly, we replicated previous results according to which congruent avatar movements lead to a more pleasant user experience than incongruent movements (Jun et al., 2018). This is in line with previous results showing that visuo-motor mismatches induce subjective discomfort (McCabe et al., 2005), and suggests that, in a VR context, the resulting disembodiment could also be linked to displeasure.
Second, at the core of the present study, our main experiment extends the previous study that observed users following distorted movements of their avatars (Gonzalez-Franco et al., 2020) by further showing that a self-avatar follower effect can be triggered when the avatar's movements are entirely replaced by a recording and are not influenced by the real body movements. More precisely, when performing upper- and lower-body repetitive movements, participants synchronize with the movement of the self-avatar image that shows an out-of-sync recording of the same movement. Indeed, almost half (≃ 47%) of long incongruent lower-body trials and more than a third (≃ 38%) of the corresponding upper-body trials exhibited a follower effect. It is worth noting that 20 of the 22 participants followed the avatar at least once. Of particular interest, during the post-experiment interview, five of them answered "No" to both "Did you feel like your movements were sometimes influenced by something out of your will?" and "Did you ever follow the avatar's movement although you wanted to perform another one?", thus suggesting that in some cases, the self-avatar follower effect occurred in our setup without the awareness of the participants. Of note, one of them even elaborated by commenting that the avatars had sometimes adapted to their own movements. It is also worth noting that 15 of the 20 participants who followed the avatar at least once mentioned that the movements of the avatar may have influenced their own movements, with 12 of them describing more precisely a possible influence of the rhythm of the avatar's movements. Here, we could speculate that participants became aware of the phenomenon because of the duration of the trials (20 s), which left them enough time to notice a synchronization if it occurred. Additionally, six participants declared having followed the avatar's movement at least once although they wanted to perform another one.
Among the associated comments, one participant pointed out that it happened inadvertently, while another mentioned that they tried to stick to the avatar's rhythm when it was close to their own. Even more interestingly, one elaborated, saying that they sometimes started with the same limb as the avatar or followed its rhythm, but that it was "against their will". Furthermore, echoing the previous study about visuomotor incongruencies (McCabe et al., 2005), some participants pointed out that they had to focus more during asynchronous trials than during synchronous trials. Similarly, reflecting our result about the affective slider, one participant reported that "it was weird" when their movements were not synchronous with the avatar's movements and pointed out that they synchronized with the avatar because "it was more pleasant".
Third, we observed that the sense of embodiment scores for the avatars were significantly higher for trials exhibiting a follower effect than for the others (Figure 6). However, although significant, this difference is small, with the embodiment scores remaining low overall. Thus, even in trials in which a follower effect occurs, a break in embodiment can occur, highlighting the limitations of the manipulation we could achieve.
Finally, a detailed analysis of the participants' movement data in long incongruent trials shows some recurrent and noticeable behaviors. Figure 3 shows an example of the most striking of these behaviors, in which we can clearly see the dynamics of a participant adapting their movements to follow the avatar. The trial begins with the avatar moving the left hand (shown by the blue curves in Figure 3) and the participant lifting the opposite hand as instructed. Then, as the avatar continues by lifting the right hand, the participant catches up "on the fly" with the avatar's movement by lifting the right hand instead of bringing it to the resting position, thus giving rise to a characteristic "M-shape" movement. Other typical follower-effect behaviors include trials in which participants followed the avatar from the very first movement (starting the task with the wrong limb or waiting for the avatar to start) or in which participants were possibly confused by the avatar's incongruent movements and started moving after a slight pause by following the avatar's movement. We also observed trials in which participants performed the movements at a slightly different pace than their avatars, thereby eventually catching up with the avatar's movements and staying synchronized from this point.

FIGURE 6
Sense of embodiment scores depending on the occurrence of the follower effect.

Several limitations of the present study are worth noting for future improvements and extensions. Our questionnaire results lack a proper control item that would not be influenced by our manipulation, in contrast to the observed effects on agency and body ownership. The experimental task itself could also integrate an orthogonal manipulation (e.g. of color or appearance) that would highlight the specificity of the observed influence of asynchrony on the avatar-follower effect. Our observations are limited to cyclic and repetitive movements, and our conclusions do not extend to more ecological cases of non-repetitive and/or goal-oriented movements. Future studies could explore to what extent users let themselves be guided by their avatars when they are uncertain about the movement to perform, or when there is no precise instruction about what movement to perform. Finally, we chose a mannequin-like avatar to avoid any potential uncanny valley effect, but it is possible that a more realistic design would lead to different results in terms of occurrence and/or magnitude of the self-avatar follower effect. Ultimately, the self-avatar follower effect could potentially be stronger with a hyper-realistic personalized avatar, as the visuomotor incongruency may be considered at a different relevance level. The opposite effect could also be true, with a higher expectation for realism on all aspects of the avatar, thus providing an interesting direction for future study.
Taken together, our results and the different strategies spontaneously exhibited by users, which are in contradiction with the instructions and often not intentional, shed some light on the possible conditions favorable to the spontaneous and involuntary tendency to adapt the performed movements to the feedback provided by the avatar. The presented experimental paradigm can thus be used for further studies aiming at interpreting the behavior of participants, potentially drawing conclusions about the influence of motor contagion and mimicry, and eventually leading to identifying the specificity of the cognitive mechanisms behind the FE. Grounded in the active inference framework (Maselli et al., 2022), the complex avatar-user mutual interdependencies highlighted here need further investigation to be fully understood, but already point towards interesting possibilities for enriched interaction design in recreational, therapeutic, or educational VR applications.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found here: https://zenodo.org/record/7614495.

Ethics statement
The studies involving human participants were reviewed and approved by Commission cantonale (VD) d'éthique de la recherche sur l'être humain (CER-VD). The patients/participants provided their written informed consent to participate in this study.

Author contributions
LB, LS, BH, and RB conceived and designed the experiments. LS and HD performed the experiments. LB, LS, and HD analyzed the data and contributed reagents/materials/analysis tools. LB, LS, HD, BH, and RB wrote the manuscript.

Funding
This work was supported by the SNSF project "Immersive Embodied Interactions" grant 200020_207424.