Authentic Fear Responses in Virtual Reality: A Mobile EEG Study on Affective, Behavioral and Electrophysiological Correlates of Fear

Fear is an evolutionary adaption to a hazardous environment, linked to numerous complex behavioral responses, e.g., the fight-or-flight response, suiting their respective environment. However, for the sake of experimental control, fear is mainly investigated under rather artificial laboratory conditions. The latter transform these evolutionary adaptions into artificial responses, like keystrokes. The immersive, multidimensional character of virtual reality (VR) enables realistic behavioral responses, overcoming aforementioned limitations. To investigate authentic fear responses from a holistic perspective, participants explored either a negative or a neutral VR cave. To promote real-life behavior, we built a physical replica of the cave, providing haptic sensations. Electrophysiological correlates of fear-related approach and avoidance tendencies, i.e., frontal alpha asymmetries (FAA) were evaluated. To our knowledge, this is the first study to simultaneously capture complex behavior and associated electrophysiological correlates under highly immersive conditions. Participants in the negative condition exhibited a broad spectrum of realistic fear behavior and reported intense negative affect as opposed to participants in the neutral condition. Despite these affective and behavioral differences, the groups could not be distinguished based on the FAAs for the greater part of the cave exploration. Taking the specific behavioral responses into account, the obtained FAAs could not be reconciled with well-known FAA models. Consequently, putting laboratory-based models to the test under realistic conditions shows that they may not unrestrictedly predict realistic behavior. As the VR environment facilitated non-mediated and realistic emotional and behavioral responses, our results demonstrate VR’s high potential to increase the ecological validity of scientific findings (video abstract: https://www.youtube.com/watch?v=qROsPOp87l4&feature=youtu.be).


INTRODUCTION
The most salient stimuli that instantly draw attention are biologically relevant stimuli ensuring survival: nutrition, reproduction, and physical dangers (Carretié et al., 2012; Carboni et al., 2017). Among these, threats to physical integrity most inevitably jeopardize survival and immediately trigger complex responses, like the fight-or-flight response (Cannon, 1929). Hence, fear has been extensively investigated ever since (e.g., Fanselow, 1994; LeDoux, 1998, 2014; Debiec and LeDoux, 2004; Blanchard and Blanchard, 1969). Several setups have been used over time to induce fear-related responses under laboratory conditions. One of the most prominent and efficient procedures for fear induction is classical conditioning (e.g., LeDoux, 1998; Jarius and Wildemann, 2015). This method has proven to be successful innumerable times in generating fear of a stimulus that was previously not frightful, assessed by typical fear responses to the conditioned stimulus, such as the startle reflex (e.g., Brown et al., 1951; Grillon and Ameli, 2001). However, conditioning paradigms require laboratory fear acquisition in order to examine fear responses (e.g., LeDoux, 1998), and mostly take only single components of the reaction detached from the overall reaction into account, e.g., the startle reflex, to guarantee high internal validity. More naturalistic assessments are based upon pre-existing fear, for example in behavioral avoidance tasks (BAT). BATs are conventionally used in exposure therapies to estimate the severity of phobias and the treatment's efficacy (see e.g., Bernstein and Nietzel, 1973; Rinck and Becker, 2007). In clinical assessments, BATs are regularly carried out in vivo, and therefore allow for holistic responses to the frightful stimulus (e.g., Bernstein and Nietzel, 1973; Koch et al., 2002; Deacon and Olatunji, 2007).
However, clinical assessments are indicative of deficient or altered emotional regulation, rather than natural fear reactions (e.g., Hermann et al., 2009; Cisler et al., 2010; Lanius et al., 2010). In contrast, non-clinical applications of BATs broadly rely on finite response options and stimuli, such as pressing a key or pulling a joystick to indicate the urge to avoid or approach an aversive stimulus (e.g., Heuer et al., 2007; Hofmann et al., 2009; Krieglmeyer and Deutsch, 2010). These rather artificial setups neglect that fear is a multidimensional response to a holistic environment and associated with complex behavioral programs, such as the fight-or-flight response to immediate threat (e.g., Cannon, 1929; Lynch and Martins, 2015; Teatero and Penney, 2015).
The complexity and multidimensionality characteristic of real-world experiences can be simulated by sophisticated virtual reality (VR) setups (Slater and Sanchez-Vives, 2016; Parsons, 2019; Pan and Hamilton, 2018; Schöne et al., 2020). In particular, VR offers high levels of sensory cues and fidelity of the virtual environment (VE; Dan and Reiner, 2017; Riva et al., 2019), resembling a multisensory 3D-environment (Cabeza and Jacques, 2007; Pan and Hamilton, 2018; Parsons, 2019; Schöne et al., 2020). Consequently, users feel actually present and involved in the VE: Being able to manipulate their surroundings, but also being subject to the virtual events and actions, significantly increases the VE's personal and emotional relevance (Slater and Wilbur, 1997; Kisker et al., 2020; Schöne et al., 2019; Schöne et al., 2020). Over the last couple of years, it has repeatedly been demonstrated that well-designed VEs are capable of eliciting strong emotional responses (e.g., Diemer et al., 2015; Felnhofer et al., 2015; for review see Bernardo et al., 2020) that even keep up with their real-life counterparts (Higuera-Trujillo et al., 2017; Chirico and Gaggioli, 2019). For example, the exposure to great virtual heights evokes fear responses consistently across various setups as assessed by self-reports, psychophysiological and behavioral responses (Kisker et al., 2019a; Biedermann et al., 2017; Gromer et al., 2018, 2019; Wolf et al., 2020; Asjat et al., 2018). Accordingly, VR has gained great interest as an instrument for fear paradigms. For instance, being immersed in a virtual park at night and seeing distant shadowy silhouettes effectively elicited unease and anxiety in participants (Felnhofer et al., 2015). Thus, VR setups are markedly superior to the use of conventional stimuli, e.g., static pictures, regarding emotion induction and emotional involvement (Gorini et al., 2010).
But even more, a strong sensation of presence and a high degree of immersion increase the chances that participants behave as they would in real-life situations (Blascovich et al., 2002; Slater, 2009; Kisker et al., 2019a). For example, participants effectively adapt their behavior to the environmental conditions by making smaller, slower steps when crossing a beam at a considerable height (e.g., Biedermann et al., 2017; Kisker et al., 2019a). In a similar vein, VR exposure therapies effectively trigger fear responses and modify phobia-related reactions permanently, e.g., concerning acrophobia (e.g., Coelho et al., 2009), arachnophobia (e.g., Bouchard et al., 2006), agoraphobia, and social phobia (e.g., Wechsler et al., 2019). Hence, VR bears the potential not only to elicit real-life processes within a simulation but beyond that, to transfer virtual experiences to everyday life.
Consequently, when exposed to highly emotional and interactive VR scenarios, participants' responses go far beyond self-reports or pressing keys. The use of VR setups enables participants to respond within a much wider behavioral spectrum and, most importantly, to react naturally and instantly to stimuli within a fully controllable setup (e.g., Slater, 2009; Bohil et al., 2011; Kisker et al., 2019a). Initial studies elicited fear using highly interactive setups and distinct fear cues. For example, VR horror games such as "The Brookhaven Experiment" (Phosphor Games, 2016) trigger anxiety by contextual features, such as darkness (e.g., Felnhofer et al., 2015), but beyond that, elicit fear responses to specific stimuli, e.g., zombies approaching the protagonist (e.g., Lin, 2017). Being virtually present and involved in dangerous situations positively correlates with increases in psychophysiological measures of stress, like heart rate (e.g., Higuera-Trujillo et al., 2017; Parsons et al., 2013; Gorini et al., 2010; Kisker et al., 2019a), verbal expressions of fear like screaming, and behavioral coping reactions like dodging or closing the eyes (Lin, 2017). A correspondingly high degree of interactivity allows for the impression of actively manipulating the events, as well as being directly affected by them, and thus facilitates authentic, multidimensional fear responses (Slater, 2009; Lynch and Martins, 2015; Lin, 2017). Whereas conventional laboratory setups have to rely on rather limited or substitutional response options, highly interactive VEs allow for physical movements and full-body responses. Consequently, participants might even fight or flee from fear cues, thus physically approaching or avoiding dangers in order to cope with them.
Markers of those behaviors are electrophysiological correlates of approach and avoidance. While event-related potentials associated with approach and avoidance, like modulations of the late positive potential (e.g., Bamford et al., 2015), reflect fine-grained but only specific parts of the electrophysiological response, oscillatory neuronal dynamics allow for an ongoing assessment of cognitive processes (Bastiaansen et al., 2011). In particular, frontal alpha asymmetries (FAA) have been regarded as a canonical oscillatory correlate of emotional and motivational directions (e.g., Davidson et al., 1990; Coan et al., 2006; Rodrigues et al., 2018; Lacey et al., 2020). According to the valence model of FAA, relatively greater left frontal cortical activity relates to positive emotions and approach, whereas relatively greater right frontal cortical activity relates to negative emotion and withdrawal (Davidson et al., 1990; Davidson, 1998). Later models suggest that the corresponding FAAs are indicative rather of the motivational direction, i.e., approach motivation and withdrawal motivation, independent of emotional valence (e.g., Gable and Harmon-Jones, 2008; Harmon-Jones et al., 2010; Harmon-Jones and Gable, 2018). For example, anger, obviously of negative valence, is related to relatively greater left frontal activity (e.g., Gable and Harmon-Jones, 2008). Notably, so far none of these models has emerged as being universally valid. An increasing number of studies offer divergent results and interpretations, adding to the debate about FAAs as indicators of either emotional or motivational directions (for review see e.g., Harmon-Jones and Gable, 2018). Recent models even suggest that FAAs indicate effortful control of emotions rather than emotional directions (Lacey et al., 2020; see also Schöne et al., 2015).
However, the vast majority of studies relating approach and avoidance to FAAs are based upon highly controlled laboratory setups, resembling real-life situations only to a very limited degree. Initial approaches to enhance FAA's generalizability to realistic conditions employed somewhat more immersive, so-called desktop-VR setups (Brouwer et al., 2011; Rodrigues et al., 2018). In particular, Rodrigues et al. (2018) associated active behavior with FAAs as indicated by the motivational direction model: Participants moved via joystick through a virtual maze depicted on a conventional desktop, encountering either a sheep, a monster, or a neutral person. Greater left frontal activation was associated with approach behavior and greater right frontal activation with withdrawal behavior, respectively (Rodrigues et al., 2018). However, desktop-VR cannot offer as many degrees of freedom as highly immersive VR systems (e.g., HMDs, CAVE), inter alia, stereoscopic 360° view, and physical movements within a VE (e.g., Smith, 2019). Highly immersive VR further enables mobile and multi-modal brain/body imaging utilizing head-tracking, motion capture or analysis via video, opening up possibilities for less restricted behavioral reactions to be explicitly recorded, analyzed, and integrated into the research design (Makeig et al., 2009).
Our previous study on FAA in virtual environments has demonstrated the general technical feasibility of combined VR EEG-FAA measurements (Schöne et al., 2021; see also Lange and Osinsky, 2020 for mobile EEG). Most importantly, the study provided the first evidence that the same stimulus material presented in VR compared to a 2D condition yields different motivational patterns reflected in the FAA data. Although the immersive nature of VR provides a more realistic environment compared to a conventional laboratory setting, a key element of the everyday experience is not yet part of the equation: Motivational tendencies, as reflected by FAA, are accompanied by a corresponding behavior adapted to the situation in which it occurs. Whereas in laboratory settings, approach or withdrawal motivation is indicated by keystroke (e.g., Gable and Harmon-Jones, 2008), the advantage of VR as a tool is the creation of controlled environments in which the participant can roam and respond freely. Consequently, the question remains whether FAAs would follow the same trend as proposed by Rodrigues et al. (2018) under highly immersive conditions that allow for physical, realistic approach and avoidance behavior.
Going beyond previous VR studies on fear, the aim is not only to capture affective fear responses by means of subjective reports elicited by the VR environment, but to examine holistic fear responses, comprising full-body behavioral expressions of fear, and to put to the test whether corresponding electrophysiological correlates of approach and avoidance behavior obtained under conventional laboratory conditions apply to highly immersive VR setups. To this end, we set up an EEG-VR study in which participants explored either a neutral or a negative, i.e., frightful cave. We aimed to situate participants in an immersive environment triggering a strong, authentic fear response. As a neutral control, a second group of participants explored a non-emotional cave. To enhance the feeling of being present in the VE, and thereby the impression of being personally and physically affected by the environment and events, we built an exact, spatially aligned, physical replica of the cave; touching the cave's stone wall in the virtual world thus led to a corresponding physical sensation (see Kisker et al., 2019a; Biedermann et al., 2017). As interactivity is a major factor enhancing fear in VR setups (Lynch and Martins, 2015; Madsen, 2016; Lin et al., 2018), participants physically walked through the cave holding a controller appearing as a flashlight in VR. Thus, their virtual movements corresponded to their physical movements. Above all, they gained the impression of being able to touch their surroundings and, more importantly, of being touched by them in return.
When participants encountered the werewolf, we expected them to exhibit one of two behaviors: either to advance toward the werewolf, risking a physical encounter to get past it, or to retreat to a safe distance and wait to see how the situation develops in order to plot a safe escape route. As fearful, cautious behavior is associated with slower walking compared to harmless situations (Biedermann et al., 2017; Kisker et al., 2019a), participants in the negative condition might exhibit longer exploration times compared to the neutral condition.

Psychophysiological Response
In line with the expected affective and behavioral responses, we assumed corresponding psychophysiological responses, i.e., decreases in heart rate variability (HRV; see e.g., Castaldo et al., 2015) to indicate increased stress levels in the negative condition. In contrast, we assumed that the neutral group would not exhibit any fear-related behavioral responses and stay unaffected in respective psychophysiological responses.
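As a minimal sketch of how a decrease in HRV could be quantified, the following Python snippet computes the root mean square of successive differences (RMSSD) between RR intervals. RMSSD is chosen here purely for illustration as one common time-domain HRV index; the HRV measure actually used is not specified in this passage, and the RR values below are invented.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (in ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR series in milliseconds (illustrative values, not study data)
baseline_rr = [800, 850, 790, 860, 805]
stress_rr = [600, 605, 598, 603, 600]

hrv_baseline = rmssd(baseline_rr)
hrv_stress = rmssd(stress_rr)
# Under the hypothesis, HRV during the negative cave would be lower than at baseline
hrv_decreased = hrv_stress < hrv_baseline
```

Under this illustrative index, the hypothesis for the negative condition would correspond to a lower RMSSD during the cave exploration than during baseline.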

Electrophysiological Response
Derived from the aforementioned theoretical models on frontal alpha asymmetry, we hypothesized that the FAAs would significantly differ between conditions as a function of the exhibited behavioral responses. In particular, we expected avoidance behavior to be linked to relatively greater right cortical activity, and approach behavior to relatively greater left cortical activity.

Participants
The study was approved by the local ethics committee of Osnabrück University. Ninety-six participants were recruited from the local student population, gave their informed written consent, but were blind to the research question and experimental conditions. They were screened for psychological and neurological disorders using a standard screening for mental disorders and distress (anamnesis). All had normal or corrected-to-normal vision. When vision correction was necessary, only participants wearing contact lenses could participate, not those wearing glasses. The participants were randomly assigned to one of two conditions (negative vs. neutral; see below) and blind to which condition they would participate in. As stated in the hypothesis, the cave was designed in such a way that we expected two behavioral patterns to emerge within the negative condition. Based on this assumption, twice as many participants were assigned to the negative condition as to the neutral condition.
The sample size was determined based on previous studies that conducted EEG measurements in a VR condition (Kisker et al., 2020; Lange and Osinsky, 2020; Schöne et al., 2021). Based on these studies, we aimed for a sample of about 25 participants per subgroup (see Exploration Time and Behavior). Although data acquisition was stopped due to the COVID-19 pandemic, we are optimistic that we obtained an adequate number of data sets corresponding to group sizes implemented in previous VR studies (see Schöne et al., 2019; Kisker et al., 2020; Lange and Osinsky, 2020; Schöne et al., 2021). The participants received either partial course credits or 15€ for participation.
One participant was excluded during anamnesis and five participants of the negative condition terminated the experiment during the virtual simulation. Nine participants were excluded from analysis due to insufficient EEG data quality (n = 1) or technical problems during the virtual experience (n = 8). Hence, a final sample size of N = 81 participants was obtained for analysis (negative: n = 54, M_age = 21.67, SD_age = 3.57; 81.5% female, none diverse, 13% left-handed; neutral: n = 27, M_age = 23.15, SD_age = 2.98; 59.3% female, none diverse, none left-handed). The high proportion of female participants results from a random sample with the majority of local psychology students being female. Although women are more likely to suffer from anxiety disorders and experience fear more frequently in their lives than men (e.g., McLean and Anderson, 2009), we found no significant differences between groups concerning general anxiety and current state of mind before the cave exploration. Hence, we assume that the gender imbalance did not affect the results obtained from group comparisons (see results).
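The participant flow reported above can be re-traced with simple arithmetic. The short Python sketch below only re-computes the numbers given in the text; the variable names are illustrative.

```python
# Counts taken from the text; variable names are illustrative
recruited = 96
excluded_during_anamnesis = 1
terminated_during_simulation = 5
excluded_eeg_quality = 1
excluded_technical_problems = 8

final_n = (
    recruited
    - excluded_during_anamnesis
    - terminated_during_simulation
    - excluded_eeg_quality
    - excluded_technical_problems
)

# Final group sizes as reported
group_n = {"negative": 54, "neutral": 27}

# The exclusions account for the final sample, and the groups sum to it
sample_is_consistent = final_n == sum(group_n.values())

# Ratio of negative to neutral participants in the analyzed sample
final_ratio = group_n["negative"] / group_n["neutral"]
```

The exclusions add up to the reported N = 81, and the analyzed groups retain the intended 2:1 ratio of negative to neutral participants.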

Experimental Conditions and Setup
The experiment comprised two experimental conditions (negative vs. neutral). For both conditions, a mixed-reality design was implemented. A VR cave was designed in Unity 5 (version 2018.3.0f2, Unity Technologies, San Francisco, United States) and a physical replica of the cave was set up in the laboratory. The physical setup resembled the virtual layout and walls, allowing for haptic sensations when touching the virtual surroundings. Relevant objects within the cave were physically represented: Ivy vines at the cave's exit were mimicked by jute ropes, a corpse was mimicked by a life-size puppet, tree trunks and rocks by paper-mâché replicas (Figure 1). The cave's layout and the path running through it were identical for both conditions. There was only one possible path through the cave. The virtual environment was presented with a wireless version of the HTC Vive Pro (HTC, Taoyuan, Taiwan) head-mounted display (HMD). Movement within the cave was implemented through active, physical walking. All participants held a Vive controller in their dominant hand, serving as a flashlight.
The difference between the caves was achieved by atmospheric elements alone, as outlined in detail below. Events related to the atmospheric elements, e.g., the onset of wind howling, were automatically triggered depending on the position of the participant within the virtual cave. Each event was triggered only once per participant. Exemplary videos of the scenery and a video abstract are provided (see availability of data, material, and code).
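The position-dependent, once-only triggering of events can be illustrated with a short sketch. The study implemented this logic in Unity; the Python code below is not that implementation but a simplified illustration, and the class names, zone coordinates, and event names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TriggerZone:
    """Axis-aligned floor region that starts an event (hypothetical layout)."""
    name: str
    x_min: float
    x_max: float
    z_min: float
    z_max: float

    def contains(self, x, z):
        return self.x_min <= x <= self.x_max and self.z_min <= z <= self.z_max


@dataclass
class EventDispatcher:
    """Fires each zone's event at most once per participant."""
    zones: list
    fired: set = field(default_factory=set)

    def update(self, x, z):
        """Return the names of events newly triggered at position (x, z)."""
        newly_triggered = []
        for zone in self.zones:
            if zone.name not in self.fired and zone.contains(x, z):
                self.fired.add(zone.name)
                newly_triggered.append(zone.name)
        return newly_triggered


# Illustrative zones; coordinates are invented and do not reflect the real cave
dispatcher = EventDispatcher(
    zones=[
        TriggerZone("wind_howling", 0.0, 1.0, 2.0, 3.0),
        TriggerZone("sheep_bleating", 3.0, 4.0, 5.0, 6.0),
    ]
)
first_visit = dispatcher.update(0.5, 2.5)   # participant enters the first zone
second_visit = dispatcher.update(0.5, 2.5)  # re-entering does not re-trigger
```

Keeping a set of already fired events is one simple way to guarantee that a participant who walks back and forth does not restart a sound or animation.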

Exploration of the Negative Cave
The negative condition was designed as a gloomy environment. The cave was only dimly illuminated. A mutilated corpse, the sound of crying, and a werewolf were used as fear-triggering stimuli (Figure 1B1,D1). In the cave's entrance area, it was obvious that a frightening environment was to be expected, with weapons and corpses lying on the floor at a distance (Figure 1C1). This area was intended to allow the participants to immediately terminate the experiment if they did not dare to explore the negative, i.e., frightful cave. To navigate through the cave, participants had to turn around 180°. At a distance of about 2 m lay a mutilated corpse at the first turn-off of the path (Figure 1B1). Shortly before reaching the corpse, crying could be heard. The participants had to step around the corpse to follow the path any further. Shortly before they reached the next turn-off, a monstrous roar and footsteps could be heard. Once they had passed this turn-off, a 2 m high werewolf was visible, walking towards the participants from the other end of the cave up to a fixed point at the third turn-off of the cave (Figure 1D1). Participants did not know that the werewolf would not approach them any further than this fixed point. The werewolf stopped at the junction, leaving room to pass it, still roaring and striking towards the participants. Participants had to walk towards the werewolf and turn off directly in front of it to reach the cave's exit (Figure 2).

Exploration of the Neutral Cave
The neutral condition was designed as a non-emotional environment. The cave was also only dimly illuminated, but brighter than the negative cave. All stimuli of the negative condition were replaced by neutral stimuli. In detail, the corpse was replaced by a tree trunk, the werewolf by a sheep (Figure 1B2,D2), and wind howling replaced the sound of crying (Figure 2). The entrance area of the cave was designed plainly. Wooden barrels and buckets lay in the places where the negative condition contained weapons and corpses (Figure 1C2). To navigate through the cave, participants had to turn around 180°. At the first turn-off of the path lay a tree trunk (Figure 1B2). Shortly before reaching the tree trunk, wind howling could be heard. Shortly before the second turn-off, a bleating sheep and its footsteps could be heard. Stepping around this turn-off, a sheep became visible, walking towards the participant from the other end of the cave up to a fixed point at the third turn-off of the cave (Figure 1D2). Participants did not know that the sheep would not approach them any further than this fixed point. The sheep stopped at the junction, leaving room to pass it, still bleating and eating grass. Participants had to walk towards the sheep and turn off directly in front of it to reach the cave's exit (Figure 2).

Procedure
Participants were blind to the experimental conditions and design but were informed that the cave might be perceived as unpleasant. During experiment preliminaries, it was checked whether participants had gained any previous information about the experiment's research objective, content, or design. If any of this was true, they were excluded from the experiment. Participants were screened for psychological and neurological disorders using a standard screening for mental disorders. Special attention was paid to anxiety disorders, subclinical fears, and current emotional strain. If participants were currently experiencing neurological or psychological disorders or were currently undergoing psychological, psychiatric, or neurological treatment, they were excluded from participation in the study.
Participants were asked to fill out a set of questionnaires, including the German versions of the State-Trait-Anxiety-Inventory, trait scale (STAI-T; Laux et al., 1981), the Sensation Seeking Scale Form-V (SSS-V; Zuckerman, 1996), the E-Scale (Leibetseder et al., 2007), the BIS/BAS scale (Carver and White, 1994; Strobel et al., 2001), the reinforcement sensitivity theory personality questionnaire (RST-PQ; Corr and Cooper, 2016) and the revised paranormal belief scale (RPBS; Tobacyk, 2004). Afterward, they were equipped with a wireless mobile EEG system and ECG electrodes (see electrophysiological recordings). For the assessment of their current mood, participants filled out the German version of the Positive and Negative Affect Schedule (PANAS; Krohne et al., 1996) immediately before instructions. Participants were instructed that their task would be to explore a cave and find its exit, leading into a village. They got no information concerning the cave's layout or size in advance. They received no prior information about the cave's affective design and stimuli, like sheep, werewolf, or corpse. If they were unable to find the exit or did not want to proceed with exploring the cave, they were free to return to their starting position or terminate the experiment. They were instructed how to use the controller as a flashlight and to move physically through the cave. All participants were instructed to immediately terminate the experiment if they felt too uncomfortable (either physically or mentally).
Participants were equipped with a wireless version of the Vive Pro HMD before entering the VR laboratory and did not see the physical setup of the cave at any time before the virtual experience started. To increase the participants' immersion and maintain it during the experiment, any communication with the investigators was stopped completely from the moment they entered the experimental room. Participants were informed that the investigators would not communicate with them or respond to any speech as long as they were in the cave unless they gave a predetermined command to terminate the experiment.
An ECG baseline measurement was recorded in a plain default VR room with the HMD turned on. Afterward the cave simulation was launched. Participants were free to start exploring the cave as soon as they felt comfortable doing so. When they left the cave through the exit, they entered a safe, pleasant-looking fishing village. Once participants reached the village, they stood still for 30 s, allowing for another baseline measurement. Afterward they were distinctly addressed by the investigator and informed that the equipment would be removed from them. They immediately left the VR laboratory. If participants terminated the experiment at an early stage, the environment was immediately switched to the safe fishing village to release the participants from the unpleasant environment as quickly as possible.
To assess mood and the sense of presence, participants were asked to fill out the PANAS, the Igroup Presence Questionnaire (IPQ; Schubert et al., 2001), and an in-house post-questionnaire asking about the emotional and motivational experiences in the cave. The latter included a visual analog scale (VAS) to determine the physical distance to either the werewolf or the sheep which participants preferred (zero up to 10 m). Before participants left the laboratory, they were rewarded with either partial course credits or 15€. The principal psychological investigators ensured that the participants felt safe and sound after the experiment.

Electrophysiological Recording and Pre-processing
For EEG-data acquisition, the mobile EEG-system LiveAmp32 by Brain Products (Gilching, Germany) was used. The electrodes were applied in accordance with the international 10-20 system. An online reference (FCz) and ground electrode (AFz) were included. The impedance of all electrodes was kept below 15 kΩ. The data was recorded with a sampling rate of 500 Hz and online band-pass filtered at 0.016-250 Hz. Triggers marking the position of the participant within the cave and the onset of virtual events (e.g., wind howling, monstrous roar, etc.; see Figure 2) were transmitted from Unity to Lab Streaming Layer (LSL by SCCN, https://github.com/sccn/labstreaminglayer), which was used to synchronize the EEG data stream and Unity triggers.
All pre-processing steps serve the function of ensuring robust data quality and comparability. In particular, the aim is to reduce the amount of variance caused by common EEG artifacts (e.g., due to eye blinks). The EEG data was analyzed using MATLAB (version R2020b, MathWorks Inc) and EEGLAB (Delorme and Makeig, 2004). The continuous EEG data was band-pass filtered between 1 Hz, to reduce slow drifts, and 30 Hz, to remove high-frequency artifacts like electrical line noise (see Cohen, 2014). The average reference was used for further offline analysis as recommended for large sets of electrodes (see Cohen, 2014). Artifact correction was performed using "Fully Automated Statistical Thresholding for EEG artifact Reduction" (FASTER; Nolan, Whelan and Reilly, 2010). In brief, this procedure automatically detects and removes artifacts, like blinks and white noise, based upon statistical estimates for various aspects of the data, e.g., channel variance. FASTER has high sensitivity and specificity for the detection of various artifacts and is described in more detail elsewhere (e.g., Nolan et al., 2010). Following recommendations for the use of FASTER with 32-channel setups, independent component analysis (ICA) and channel interpolation were applied, whereas channel rejection and epoch interpolation were not applied. Each electrode was detrended separately to ensure the same statistical properties for the time series (Cohen, 2014) before segmenting the data into epochs based upon the position triggers. The segmentation of the continuous EEG data into epochs matching the cave sections enabled a more differentiated analysis of the cave exploration. Per epoch, a windowed fast Fourier transform (FFT) was calculated to isolate alpha-band-specific activity (8-13 Hz; Berger, 1929). To this end, a Hamming window with a length of one second and 50% overlap was applied. The mean FFT score was logarithmized to calculate alpha-band power.
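Two of the steps above, average re-referencing and per-electrode detrending, can be sketched in a few lines. The analysis itself was run in MATLAB and EEGLAB; the Python code below is not that pipeline but a plain illustration of the two operations. It assumes that detrending means removing a least-squares straight line, which the text does not state, and the function names and toy data are hypothetical.

```python
def average_reference(data):
    """Subtract the across-channel mean from every channel at each sample.

    data: list of channels, each a list of samples of equal length.
    """
    n_channels = len(data)
    n_samples = len(data[0])
    referenced = [list(channel) for channel in data]
    for t in range(n_samples):
        mean_t = sum(channel[t] for channel in data) / n_channels
        for c in range(n_channels):
            referenced[c][t] -= mean_t
    return referenced


def linear_detrend(signal):
    """Remove the least-squares straight line from one channel (assumed linear detrend)."""
    n = len(signal)
    if n < 2:
        return [0.0] * n
    t_mean = (n - 1) / 2
    y_mean = sum(signal) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(signal))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(signal)]


# Toy data: two channels, four samples each (illustrative values only)
toy = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
referenced = average_reference(toy)
detrended = [linear_detrend(channel) for channel in toy]
```

In the toy example, both channels are pure linear ramps, so detrending reduces them to zero, while the average reference leaves a constant offset of -1 and +1 between the two channels.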
For the calculation of the FAA score, the logarithmized alpha power at electrode F4 was subtracted from the logarithmized alpha power at electrode F3 [logarithmized left alpha power minus logarithmized right alpha power; ln(µV²)]. These steps for calculating FAAs follow the standard procedure recommended by Smith et al. (2017).
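As a small worked sketch of this subtraction (the function name `faa_score` and the input values are illustrative assumptions, not taken from the study):

```python
import math


def faa_score(alpha_power_f3, alpha_power_f4):
    """ln(alpha power at F3) minus ln(alpha power at F4), in ln(µV²)."""
    if alpha_power_f3 <= 0 or alpha_power_f4 <= 0:
        raise ValueError("alpha power must be positive")
    return math.log(alpha_power_f3) - math.log(alpha_power_f4)


# Made-up alpha power values in µV²: twice as much alpha at F3 as at F4
print(round(faa_score(2.0, 1.0), 3))  # 0.693
```

Because alpha power is commonly treated as inversely related to cortical activity, a positive score under this left-minus-right convention corresponds to relatively greater right frontal activity, which matches the reading given in the caption of Figure 3.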

Exploration Time and Behavior
Exploration time was measured in seconds from the initial entrance into the cave (marker 11, Figure 2) to exiting the cave (marker 19, Figure 2) and for the path section along which participants headed directly towards the werewolf/the sheep (epoch 57).
As expected, the examination of the video recordings of the cave exploration revealed two different behavioral patterns within the negative condition, subdividing the negative group into two subgroups: When first encountering the werewolf, participants of the negative group either retreated, i.e., hesitated or hid behind a previous wall (subgroup labeled "hesitating"), or quickly advanced toward the werewolf to get past it, hastening around the cave's next turn-off (subgroup labeled "hastening"). Participants were assigned to the subgroups based on the assessment of three investigators. To cross-check the division into the three subgroups, we implemented a video rating of the participants' fear behavior by blind raters (see Box 1). Since the blind ratings supported the classification of the subgroups (see Box 1), the investigators' proposed subdivision was adopted (hesitating group: n = 33, M_age = 21.70, SD_age = 3.85, 87.9% female, 87.9% right-handed; hastening group: n = 21, M_age = 21.62, SD_age = 3.17, 71.4% female, 85.7% right-handed; neutral group: n = 27, M_age = 23.15, SD_age = 2.98, 59.3% female, all right-handed). An analysis of both conditions (negative versus neutral) without subdivision into the hastening group and the hesitating group is provided as supplementary material (see Supplementary Material S1).

Cardiovascular Measurements and Pre-processing
A three-channel ECG (Brain Products, Gilching, Germany) was applied and transmitted to the mobile EEG system. Electrodes were placed at the left collarbone, the right collarbone, and at the lowest left costal arch. The ECG was recorded synchronously with the EEG data.
The ECG data was segmented into the baseline measurements before the start of the cave exploration (60 s) and directly after leaving the cave, i.e., standing in the village (30 s). ECG measures during cave exploration were not further analyzed due to insufficient data quality. Participants who were excluded due to technical problems or insufficient EEG data quality and those who terminated the experiment early were excluded from ECG analysis. The datasets were further preprocessed using BrainVision Analyzer 2.2.0 (Brain Products, Gilching, Germany). Datasets were filtered between 5 and 45 Hz to remove low- and high-frequency artifacts. Additionally, a notch filter (50 Hz) was applied. An automatic R-peak detection was applied and visually cross-checked. Fourteen datasets were excluded due to insufficient ECG quality during at least one of the two baselines. For the remaining 67 datasets (n_hesitating = 29; n_hastening = 18; n_neutral = 20), the classical HRV parameter, i.e., the root mean square of successive differences (rmSSD), was calculated per baseline using MATLAB. The parameter rmSSD was chosen for analysis as it is recommended for ultra-short-time measurements (10-60 s; Shaffer and Ginsberg, 2017). The individual change in rmSSD between both baselines was calculated per participant and averaged per group for comparisons (delta = baseline 2 − baseline 1; see Figure 2).
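The rmSSD and the per-participant change score can be illustrated with a short sketch. The function names and the R-peak timestamps below are hypothetical; the authors computed the parameter in MATLAB, and this Python version only shows the standard definition.

```python
import math


def rmssd(r_peaks_ms):
    """Root mean square of successive RR-interval differences, in ms.

    r_peaks_ms: R-peak timestamps in milliseconds, in ascending order.
    """
    rr = [b - a for a, b in zip(r_peaks_ms, r_peaks_ms[1:])]  # RR intervals
    diffs = [b - a for a, b in zip(rr, rr[1:])]               # successive differences
    if not diffs:
        raise ValueError("at least three R-peaks are required")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def delta_rmssd(baseline1_peaks_ms, baseline2_peaks_ms):
    """Change score: rmSSD of baseline 2 minus rmSSD of baseline 1."""
    return rmssd(baseline2_peaks_ms) - rmssd(baseline1_peaks_ms)


# Made-up R-peaks: RR intervals of 800, 800, 900, 800 ms
print(round(rmssd([0, 800, 1600, 2500, 3300]), 2))  # 81.65
```

A positive change score means that rmSSD was higher in the second baseline than in the first.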

Statistical Analysis
All statistical analyses were carried out using SPSS 26 (IBM). All variables were tested for normal distribution within each group separately using the Shapiro-Wilk test, and all further statistical tests were chosen accordingly (see Supplementary Material S2 for a detailed report of the Shapiro-Wilk test). If at least one subgroup per variable, or at least one subscale or subvariable of a measure, was not normally distributed (p < 0.1), a non-parametric test (Kruskal-Wallis test, Mann-Whitney U-test) was used for the analysis of that measure, as parametric tests, i.e., ANOVA and t-test, are less robust to violations of normality when group sizes are unequal.

Subjective Measures
The scales of the questionnaires were calculated as the sum of the corresponding item values (sum scale). Concerning the PANAS, in addition to the scores for positive and negative affect, the change in affect was calculated as the difference between pre-measurement and post-measurement (change = post − pre). For the in-house post-questionnaire, the subscales affect and motivation were calculated as mean values of the corresponding items. The preferred physical distance to either the werewolf or the sheep (via VAS) was transformed into the distance in percent (relative distance = preferred distance / total distance possible). All questionnaires were analyzed using the Kruskal-Wallis test and complemented by post-hoc Mann-Whitney U-tests, with the exception of the SSS-V, which was analyzed using a one-way ANOVA, complemented by post-hoc t-tests. Due to the directional wording of the hypotheses concerning acute fear and presence, negative affect, motivation, and presence were tested one-tailed. All other self-reports were tested two-tailed. Cronbach's alpha was calculated per scale and reached at least acceptable levels for most scales (α ≥ 0.70), with the exception of the following subscales: BIS/BAS: goal drive, fun seeking, reward responsiveness; RST-PQ: reward interest, impulsivity; IPQ: spatial presence, realness (0.45 < α < 0.70, see Supplementary Material S3 for details).
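For readers unfamiliar with the reliability coefficient, the following sketch shows the textbook formula for Cronbach's alpha, α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The function name and the toy item scores are assumptions for illustration; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of item columns; each column holds one score per respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("a scale needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]    # sum score per respondent
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Toy data (made up): two items answered identically by four respondents
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
print(round(alpha, 2))  # 1.0
```

Items that covary perfectly yield α = 1, whereas weakly related items push the coefficient toward zero, which is why short or heterogeneous subscales often fall below the 0.70 convention.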

Dependent Measures
Exploration Time
Exploration time was compared between groups using the Kruskal-Wallis test, followed by post-hoc Mann-Whitney U-tests. Due to the directed hypothesis concerning exploration time, the total exploration time and the exploration time during epoch 57 (see Figure 2) were tested one-tailed.
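The omnibus test used throughout can be sketched as follows. This is a simplified illustration with hypothetical function names: the H statistic is computed without a tie correction, and the p-value uses the chi-square approximation, whose survival function for two degrees of freedom (three groups) reduces to exp(−H/2).

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H without tie correction (simplifying assumption)."""
    pooled = [x for group in groups for x in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, pos = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[pos:pos + len(group)])
        weighted += rank_sum ** 2 / len(group)
        pos += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


def p_value_df2(h):
    """Chi-square survival function for 2 degrees of freedom."""
    return math.exp(-h / 2)


print(round(p_value_df2(6.23), 3))      # 0.044 (two-tailed)
print(round(p_value_df2(2.00) / 2, 3))  # 0.184 (halved for a one-tailed test)
```

Plugging in statistics reported in the Results, such as H(2) = 6.23 or H(2) = 2.00, gives values in line with the reported p = 0.044 and the one-tailed p = .184, which suggests that the one-tailed values correspond to halved two-tailed p-values.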

Electroencephalic Measures
For statistical analysis of the FAAs, individual outliers were determined per epoch. Scores deviating from the group mean by more than 1.5 times the interquartile range were excluded from the analysis of the individual epoch. The FAA scores were analyzed based upon the subdivision of the negative condition into the subgroups "hesitating" and "hastening" and the neutral condition. The latter was not further subdivided (see results). The Kruskal-Wallis test was used for analysis and complemented by post-hoc Mann-Whitney U-tests. The parameter r (√η²) was calculated as an estimate of effect size.
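A minimal sketch of the outlier rule and the effect size follows. Both functions are illustrative assumptions: the outlier criterion is read here as "more than 1.5 interquartile ranges away from the group mean", the quartiles use the default method of Python's `statistics.quantiles` (SPSS may compute them differently), and r is computed as |z| / √N, which equals √η² when η² is defined as z² / N.

```python
import math
from statistics import mean, quantiles


def exclude_outliers(scores, factor=1.5):
    """Keep scores within factor * IQR of the group mean (assumed reading)."""
    q1, _, q3 = quantiles(scores, n=4)
    limit = factor * (q3 - q1)
    center = mean(scores)
    return [s for s in scores if abs(s - center) <= limit]


def effect_size_r(z, n_total):
    """Effect size r = |z| / sqrt(N) for a Mann-Whitney U-test."""
    return abs(z) / math.sqrt(n_total)


# Made-up FAA scores for one group and epoch; 1.5 is an obvious outlier
faa = [-0.2, -0.1, 0.0, 0.0, 0.1, 0.2, 1.5]
print(exclude_outliers(faa))     # [-0.2, -0.1, 0.0, 0.0, 0.1, 0.2]
print(effect_size_r(-3.0, 36))   # 0.5
```

Because the exclusion is applied per epoch, the number of participants entering each test, and hence N in the effect size, can differ from epoch to epoch.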

Cardiovascular Measures
The average change in rmSSD as a measure of HRV (delta = baseline 2 − baseline 1) was compared between groups using the Kruskal-Wallis test and post-hoc Mann-Whitney U-tests. The parameter r (√η²) was calculated as an estimate of effect size. Due to the directed hypothesis concerning HRV, the measure was tested one-tailed.

BOX 1 | Cross-check of group subdivision by blind rating
Procedure: A blind video rating was conducted to check the subdivision into the subgroups hesitating, hastening, and neutral based on three investigators' assessment. To this end, recordings of the participants exploring the cave were used. The recordings revealed neither the participants' identity, nor their experimental condition, nor what they saw in the virtual environment. Only their behavior within the physical replica was visible. Videos of participants who terminated the experiment (n = 5) or did not agree to the use and publication of the recordings (n = 4) were not included in the rating. The naive raters' task was to evaluate to what extent the person in the video showed fear in their behavior. To do so, they were asked to rate the person's fear on a scale from zero (no fear at all) to six (very strong fear). Each rater evaluated each of the videos (n = 77) in randomized order. They were allowed to take breaks if needed.
Blind raters: Twenty-seven blind raters completed the video rating. It was ensured that none of the raters had prior knowledge of the original study, that none participated in the original study, and that none suffered from any psychological or neurological conditions. Four raters were excluded based on the exclusion criteria assessed in the anamnesis, resulting in n = 23 valid ratings (M_age = 21.74, SD_age = 2.54, 20 female, 3 male, none diverse). To ensure the raters' aptitude, their empathic ability and emotional competence were assessed using the German versions of the e-scale (Leibetseder et al., 2007) and the self-assessment of emotional intelligence (SEK-27; Berking and Znoj, 2008). They were blind to the content, experimental conditions, and objectives of the original study.
Statistical analysis: Per rater, mean fear scores were calculated. For this purpose, the individual video ratings were averaged based on conditions (negative vs. neutral), as well as based on the previous division of subjects into subgroups (hesitating vs. hastening vs. neutral). These mean fear scores were tested for normal distribution using the Shapiro-Wilk test and further analyzed using separate t-tests for dependent samples.
Results Conclusion: The blind ratings are in line with the subdivision into the subgroups hesitating, hastening, and neutral, as proposed based on the investigators' assessment. All subgroups differed significantly in the fear levels as assessed by naive raters based on the participants' behavior. Consequently, participants' fear levels were explicitly and distinctly expressed in their behavior, even observable by blind, naive raters, indicating a high level of realistic fear behavior.
Frontiers in Virtual Reality | www.frontiersin.org August 2021 | Volume 2 | Article 716318
Note. The detailed statistics for the Kruskal-Wallis test are provided in Supplementary Material S2. The respective descriptive statistics are given per condition. The parameter r (√η²) was provided as an estimate of effect size (a: small effect, b: medium effect, c: large effect). Significant differences between groups were marked accordingly (*p < 0.05; **p ≤ 0.01; ***p ≤ 0.001). One-tailed tests are marked accordingly (¹).

Subjective Measures
No group differed significantly from the others in personality traits that might otherwise have an impact on the perception of and reactions to the specific VR scenario, such as anxiety, empathy, paranormal belief, behavioral activation system, and behavioral inhibition system (all Hs(2) < 5.10, p > 0.05; see Supplementary Material S2 for details), as well as sensation seeking (F(2, 74) = 3.05, p = 0.053). However, groups differed in impulsivity (H(2) = 6.23, p = 0.044) and in the fight-flight-freeze system (FFF-S; H(2) = 6.46, p = 0.040) as assessed by the RST-PQ. In particular, the hesitating group scored lower in impulsivity compared to the hastening and neutral groups. Moreover, the hesitating group scored significantly higher in the FFF-S compared to the neutral group. The difference in the FFF-S between the hesitating and hastening groups followed the same trend but did not reach significance. The hastening and neutral groups did not differ significantly in either trait (see Table 1).
Before the cave exploration, groups did not differ concerning their mood (all Hs(2) < 1.1, all ps > 0.50). However, after the cave exploration, groups reported different levels of negative affect, as well as differing changes in negative and positive affect. In detail, the hesitating and hastening groups experienced comparable negative affect as well as similar increases in negative affect, but significantly stronger increases compared to the neutral group (see Table 1). Surprisingly, the hesitating group also reported significantly higher increases in positive affect compared to the neutral group. The hastening group followed the same trend but did not reach significance. The hastening and the hesitating groups experienced similar increases in positive affect (see Table 1). All groups reported similar levels of presence (all Hs(2) < 5.20, all ps > 0.07), with the exception of the sensation of spatial presence (H(2) = 5.17, p = 0.038¹). In particular, the hesitating group felt more spatially present compared to both other groups, whereas the hastening and the neutral groups exhibited similar levels of spatial presence (see Table 1).
As assessed by the in-house post-questionnaire, the hastening and the hesitating groups preferred a significantly greater distance to the werewolf than the neutral group preferred to the sheep (22% of the possible distance). Descriptively, the hesitating group (74% of the possible distance) preferred a slightly greater distance to the werewolf compared to the hastening group (69% of the possible distance), but the groups did not differ significantly (see Table 1). Furthermore, both the hesitating group and the hastening group perceived the cave as strongly negative and reported a significantly greater motivation to leave the cave at an early stage compared to the neutral group. Moreover, the hesitating group exhibited a significantly stronger motivation to leave the cave early compared to the hastening group. The hastening group also perceived the cave as significantly more negative compared to the neutral group and reported a high motivation to leave the cave early, whereas participants of the neutral group tended to perceive the cave as rather comfortable and were only slightly motivated to leave it at an early stage (see Table 1).

Exploration Time and Behavior
The hesitating group took approximately 1.5 to 1.7 times as long as the neutral group and the hastening group, respectively, to reach the cave's exit and thus to end the exploration (Md_hesitating = 49.7 s; Md_hastening = 29.10 s; Md_neutral = 33.15 s). In contrast, the hastening and the neutral groups took approximately the same time to end the exploration (see Tables 2, 3). This pattern was also evident for epoch 57, when participants headed towards the werewolf/sheep (see Table 2, exploration time epoch 57). The hesitating group walked significantly slower towards the werewolf compared to the hastening group (U = 81.00, z = −4.42, p < .001¹, r = 0.62) and compared to the neutral group walking towards the sheep (U = 121.00, z = −4.52, p < .001¹, r = 0.60), whereas the hastening group and the neutral group walked at a comparable pace (U = 255.00, z = −0.11, p = .456¹, r = 0.02). In detail, the hesitating group took more than three times as long as both other groups for this path section (Md_hesitating = 21.57; Md_hastening = 6.60; Md_neutral = 6.09; Md in seconds).
The significant difference in exploration time was reflected in the directly observable behavior within the cave: The neutral group explored the cave rather casually, maintaining a constant walking pace and showing no particular signs or verbalizations of unease. In contrast, both negative groups walked rather cautiously, looking around turn-offs before continuing the exploration. Both subgroups explicitly expressed fear by verbalizations and body language. For example, participants gasped, looked around nervously, or wrapped their arms protectively around themselves. Five participants terminated the experiment either at first sight of the cave's entrance area (n = 1) or at first sight of the werewolf (n = 4). Beyond that, the hesitating group either stopped or even hid behind the previous wall when detecting the werewolf, whereas the hastening group did not hesitate at all, but advanced toward the werewolf to get past it (see data repository for exemplary video recordings). Based on these bodily fear cues, even naive raters were able to classify participants' fear levels adequately, indicating a high consistency with real-life fear behavior (see Box 1).
Note. Significant differences between groups (*p < 0.05; **p ≤ 0.01; ***p ≤ 0.001) and one-tailed tests (¹) were marked accordingly.

Cardiovascular Measures
The groups did not differ significantly in the change in rmSSD between both baseline measurements (H(2) = 2.00, p = .184¹; see Table 2). Descriptively, all groups exhibited an increased rmSSD in the second baseline compared to the first baseline (Md_hesitating = 13.58; Md_hastening = 17.74; Md_neutral = 17.81), indicating higher stress levels during the first baseline.

Electroencephalographic Measures
The Kruskal-Wallis test revealed differences in the FAA scores between the three subgroups for epoch 34 (H(2) = 6.13, p = 0.047) and epoch 57 (H(2) = 6.59, p = 0.037). However, the subgroups did not differ during baseline or the remaining epochs (all Hs(2) < 4.6; all ps > 0.10; see Supplementary Material S2 for details). Hence, only epoch 34 and epoch 57 were further analyzed by post-hoc Mann-Whitney U-tests. In epoch 34, significantly stronger left frontal cortical activity was observed in the hastening group compared to the neutral group, which exhibited stronger right frontal cortical activity (see Table 4 and Figure 3). However, the hesitating group did not differ from the hastening group or from the neutral group with respect to FAA scores. In contrast, the hastening and the hesitating groups differed significantly during epoch 57, with the hesitating group showing greater left frontal cortical activity and the hastening group showing greater right frontal cortical activity. Neither differed significantly from the neutral group during this epoch (all Us > 240.00, all ps > 0.05; see Table 4 and Figure 3).

DISCUSSION
The present study aimed to examine authentic fear responses, especially complex behavioral expressions of fear, and the electrophysiological correlates of approach and avoidance, i.e., frontal alpha asymmetries (FAA), in an immersive virtual reality setup. The incremental value of the study lies particularly in the simultaneous examination of realistic behavior and the associated electrophysiological responses. To this end, participants explored either a negative, i.e., frightful cave, containing corpses and a monstrous werewolf, or a neutral cave, containing tree trunks and a sheep. As expected, the negative cave elicited significantly stronger negative affect, fear, and motivation to withdraw from the scenario earlier than the neutral condition. Going beyond previous findings, these affective responses were very pronounced and identifiably reflected in the participants' behavior. While the neutral condition's participants explored the cave rather casually, the negative condition's participants walked rather cautiously, adapting their pace to the threatening atmosphere. Moreover, participants in the negative condition exhibited two different behavior patterns, which subdivided them into a hesitating and a hastening group. Surprisingly, and even though self-reports and behavior indicated great differences in emotional experiences, the different groups could be distinguished in only two out of seven cave sections based on the FAAs.

Affective Responses to the Virtual Cave
In line with previous research, the respective design of the cave was sufficient to trigger distinct emotional reactions as intended. Indicative of successful fear elicitation, both negative subgroups reported higher levels of acute fear compared to the neutral condition. Specifically, both negative groups reported highly negative affect, a strong fear of the respective stimuli, and great motivation to leave the cave early, while the neutral group did not. With all negative stimuli removed, a still dimly lit cave elicited no particular reports of fear in the neutral condition. Hence, while context and distinct cues determine which specific emotion is induced, i.e., fear of an approaching werewolf (Felnhofer et al., 2015; Lin, 2017), the level of interactivity adds to the plausibility and realness of the VE (plausibility illusion; Slater, 2009), thereby increasing emotional involvement (Gorini et al., 2010; Diemer et al., 2015) and behavioral realism (Blascovich et al., 2002; Slater, 2009; Kisker et al., 2019a). In particular, the possibility to interact with and within the VE, and to be personally affected by occurring events, overcomes the remoteness of conventional screen experiences (Slater, 2009; Lin, 2017; Lin et al., 2018; Lin, 2020; Kisker et al., 2020; Schöne et al., 2019). More than that, the experienced self-efficacy may reinforce the feeling of personal vulnerability to the occurring events (see Lin, 2017; Lin et al., 2018).
Note. The respective descriptive statistics are given per group and the effect size r (√η²; a: small effect, b: medium effect) was determined. Significant differences between groups (*p < 0.05; **p ≤ 0.01; ***p ≤ 0.001) and one-tailed tests (¹) were marked accordingly.
Note. Kruskal-Wallis test revealed no group differences concerning further epochs. Significant differences between groups as indicated by Mann-Whitney U-test are marked (*p < 0.05; **p ≤ 0.01; ***p ≤ 0.001). The respective descriptive statistics are given per group and the effect size r (√η²; a: small effect, b: medium effect) was determined. Positive FAA scores indicate withdrawal motivation, whereas negative FAA scores indicate approach motivation.
FIGURE 3 | FAA scores [ln(µV²)] per subgroup and epoch of the cave exploration. Significant differences between groups are marked (*p < 0.05; **p ≤ 0.01; ***p ≤ 0.001). Positive FAA scores indicate relatively greater right frontal cortical activity, whereas negative FAA scores indicate relatively greater left frontal cortical activity. The standard deviations from the group mean are depicted as error bars in separate panels for increased visibility. The area between both dotted grey lines indicates the epochs during which the creature (werewolf/sheep) was visible.
In a similar vein, participants of all groups felt generally present within the VE. However, the hesitating group felt more spatially present compared to both other groups. Numerous previous studies indicate a positive correlation between emotion and the feeling of presence, although the direction of the effect remains unclear (e.g., Riva et al., 2007; Diemer et al., 2015; Felnhofer et al., 2015; Kisker et al., 2019a). As both conditions allowed for equal levels of interactivity, spatial presence might be enhanced by emotional arousal, as the hesitating group felt the most frightened being in the cave. A previous mixed reality study also only modified the visual impression, i.e., threatening vs. non-threatening, and concluded that the threatening condition corresponded with higher sensations of presence (Kisker et al., 2019a). However, although these findings may seem intuitive, our data does not allow a causal conclusion and could also be the result of interdependence of emotional experience and presence (Kisker et al., 2019a). Moreover, all groups exhibited equal levels of general presence, involvement, and realness, opposing the idea of modulation of presence by the emotional experience alone. Accordingly, factors other than emotion and immersion may have varying effects on the dimensions of presence, which should be the subject of further research.
Surprisingly, both negative groups experienced a stronger increase in positive affect compared to the neutral group. This might, at first sight, seem counterintuitive. However, previous studies also found an increase in positive affect after unpleasant situations, ascribing this finding to relief, or even pride about having mastered an unpleasant, or in our case threatening, situation (Williams and DeSteno, 2008; Kisker et al., 2019a).
Contrary to our expectations, the groups did not differ in the extent to which their HRVs changed. This is surprising given that the HRV parameter rmSSD tends to decrease during stressful situations (Castaldo et al., 2015). Therefore, we had expected that both negative groups would show a significant reduction in rmSSD compared to the neutral group. Instead, all groups showed a slight, not significantly different increase in rmSSD after exploring the cave compared to the pre-exploration measurement.
The increase in rmSSD, classically interpreted as reduced stress experience (Castaldo et al., 2015), might nevertheless reflect the changes in positive affect: The uncertainty about the cave's content and size before its exploration might have led to anticipatory stress, while the completion of the exploration might be experienced as a relief. However, the recording of the pre- and post-exploration phases might not have been sufficient to validly determine HRV parameters. Although rmSSD can be determined based on short-time measurements, it is usually preceded by resting phases (Castaldo et al., 2015). In our experiment, participants moved physically between measurements, which may have distorted the HRV assessment and limits its interpretability.
As all groups were comparable in prior VR-experience, pre-experimental mood, trait anxiety, and further personality traits right before the VR exposure, differences between groups during or after cave exploration cannot be traced back to pre-existing differences, but rather to the cave exploration. As the only, but highly interesting, exception, the hesitating group reported significantly lower impulsivity than both other groups and scored higher on the FFF-S, which indicates that their behavior is more likely to be determined by avoidance tendencies (Corr and Cooper, 2016; Pugnaghi et al., 2018) and corresponds to their initial reaction when detecting the werewolf.

Authentic Fear Behavior in Immersive Virtual Reality
Most importantly, and going beyond matching self-reports, participants adapted their behavior immensely to their virtual surroundings. While the neutral group explored the cave rather casually, both negative groups exhibited distinct signs of acute and strong fear expressed via body language: They slowed down their pace, glimpsed around corners before taking the turn-off, or held their arms tight to their bodies in addition to verbal expressions of fear (see Adolphs, 2013). More than that, when confronted with the werewolf, participants tended either to advance toward the werewolf to get past it or to retreat, subdividing participants of the negative condition into two distinct subgroups: The hesitating group hesitated or even hid behind the previous turn-off when detecting the werewolf, which corresponds to their lower levels of impulsiveness and more pronounced action control by avoidance. In contrast, the hastening group advanced toward the next turn-off before the werewolf approached any closer. These behavioral patterns were reflected in significantly higher exploration times in the hesitating group compared to both other groups. Slowing down their pace allows for greater vigilance and thus for potential hazards to be identified more quickly (Rinck et al., 2010). By hesitating to pass by the werewolf, the hesitating group stayed in the cave longer, whereas the hastening group, in comparison, shortened their stay by fleeing towards the cave's exit, thereby matching the neutral group's exploration time.
These behavioral adaptions point towards a crucial characteristic of VR setups: Since participants are the subject of the virtual events and are personally involved in them (Slater, 2009; Slater and Wilbur, 1997; Kisker et al., 2020; Schöne et al., 2019), a behavior change is inevitable to deal with the threats within the cave (see Kisker et al., 2019a). The impression of realness must therefore have been so intense that the knowledge of being in a VR simulation was not sufficient to suppress the feeling of personal threat and a corresponding coping reaction (place illusion; Slater, 2009). As the mixed reality design allows for realistic sensorimotor actions, participants are enabled to react naturally and promptly when confronted with fear cues. In particular, realistic responses are enhanced by the impression of being directly and personally affected by the events within the VE (Slater and Wilbur, 1997; Nilsson et al., 2016), for example, the impression that the werewolf can actually reach and harm them.
In standard laboratory settings, participants are supposed to indicate their natural response via substitutional responses: They are required to cognitively evaluate their initial response, determine the correct substitutional response, and then carry it out. In a real-life, threatening situation this chaining of cognitive evaluation and reaction might be dysfunctional. Rather, people would instinctively back off, freeze, or defend themselves physically as an initial impulse. Following LeDoux's theory on the fear circuit (e.g., LeDoux, 1995; LeDoux, 1996; LeDoux, 1998; Debiec and LeDoux, 2004), VR setups would allow access to the immediate, emotional processing of stimuli. Conversely, capturing fear via substitutional responses would involve the slower cognitive path, as participants process their initial reaction and match it to an abstract, pre-set action to indicate how they feel. However, reactions triggered by VR events can only be accepted as equivalent to real reactions if virtual and real environments actually elicit identical reactions (Slater, 2009). More and more studies indicate that VR settings not only lead to stronger emotional reactions compared to classical PC setups but that these reactions triggered by virtual events correspond to their real-life counterparts (Gorini et al., 2010; Higuera-Trujillo et al., 2017; Chirico and Gaggioli, 2019). Consequently, VR setups allow for a more naturalistic and non-mediated assessment of fear, offer an immense spectrum of response options, and involve the full body, mimicking the natural fear reaction to events in the real world (Bohil et al., 2011).

Alpha-Asymmetry Models and Complex Behavioral Responses
Remarkably, the electrophysiological response distinguished between subgroups in only two of the seven exploration sections based on the FAA. Based on existing models, and equivalent to Rodrigues et al. (2018), we expected relatively stronger left-frontal activity for approach-related behavior, i.e., negative FAA-scores when the hastening group approached the werewolf, and a relatively stronger right-frontal activity for avoidance-related behavior, i.e., positive FAA-scores for the hesitating group. Neutral behavior was not proposed to be linked to a distinct asymmetry. The three subgroups were distinguishable directly after passing the corpse/tree trunk and hearing the werewolf/sheep (epoch 34), and when walking towards the werewolf/sheep (epoch 57). In particular, the hastening group differed significantly from the neutral group in epoch 34, exhibiting relatively greater left frontal activity, indicating approach motivation, while the neutral group exhibited relatively greater right frontal activity, indicating avoidance motivation (e.g., Harmon-Jones and Gable, 2018; Rodrigues et al., 2018). The hesitating group, descriptively exhibiting a slight approach motivation, did not differ significantly from any of the other groups. One might argue that both negative groups exhibited approach motivation towards the exit. The neutral group had no incentive to leave the cave early and was thus possibly motivated to avoid the exit to explore the situation longer.
Moreover, during epoch 57, the hastening and hesitating groups differed significantly, with the hastening group exhibiting avoidance motivation and the hesitating group exhibiting approach motivation. On the one hand, the hastening group's avoidance motivation might be linked to their instant initiation of an escape from the current cave section towards the exit before the werewolf comes any closer. The hesitating group's approach motivation, on the other hand, might reflect the emotional self-control to pass the werewolf to reach the exit after initially hiding or hesitating. The latter interpretation supports recent models that associated FAAs with inhibitory top-down regulation of emotion (Lacey et al., 2020; Schöne et al., 2015). The neutral group exhibited equal levels of avoidance motivation compared to the hastening group, which might indicate avoidance of leaving the cave early. However, during this epoch, none of the groups knew that the exit was behind the next turn. Therefore, the previous interpretation seems rather speculative.
Building on the revised reinforcement sensitivity theory (Gray and McNaughton, 2000), Wacker and colleagues (Wacker et al., 2003; Wacker et al., 2008) introduced the behavioral activation-behavioral inhibition model of anterior asymmetry (BBMAA). The BBMAA relates relatively greater left frontal activity, as in the hesitating group, to the activation of the fight-flight-freeze-system (FFF-S), responding to negative stimuli and threat, whereas relatively greater right frontal activity, as in the hastening group, might relate to the behavioral inhibition system (r-BIS), allowing for superordinate emotion-regulation and behavioral control (Gray and McNaughton, 2000). According to the groups' behavioral responses, hesitating and hiding from the werewolf would fit with the FFF-S and might accordingly be interpreted as active behavior to avoid the fear cue. Conversely, accelerating their pace to instantly pass the werewolf would fit with the r-BIS. But notably, the respective asymmetry is proposed to indicate passive behavior (Wacker et al., 2003; Rodrigues et al., 2018), standing in stark contrast to instantly approaching the werewolf and passing it. Hastening past the werewolf is undoubtedly effective for escaping the threatening situation and thus may be interpreted as avoidance rather than approach. However, the hastening group seems to be primarily driven by emotion regulation, as they do not hesitate, but instantly move towards the werewolf. Hence, the synthesis of this behavior and FAA might rather relate to effortful control of emotion (Lacey et al., 2020; Schöne et al., 2015), allowing participants to escape from the threatening situation instead of freezing.
None of the aforementioned explanatory approaches accounts for the fact that the groups' FAAs did not differ significantly for the greater part of the cave exploration. Despite showing such varied and strongly pronounced behavioral responses, participants of all groups could only be distinguished in two of the seven cave sections based on the FAA data. This was particularly surprising as the negative condition triggered intense negative affect, as well as a high motivation to leave the cave early, which was significantly reflected in self-reports and behavior. Walking towards a corpse and sounds of crying compared to a tree trunk and wind sounds was not accompanied by significantly different FAA scores between groups. What is more, the hesitating group descriptively exhibited relatively greater left frontal activity throughout the cave exploration, which is, according to the most well-known models, associated with approach motivation or positive affect (e.g., Davidson et al., 1990; Gable and Harmon-Jones, 2008; Harmon-Jones and Gable, 2018). For obvious reasons, the valence model (e.g., Davidson et al., 1990; Davidson, 1998) does not correspond to the observed behavioral responses, while one might speculate whether the observed approach motivation might reflect the urge to reach the exit.
Hence, the FAA data collected in our immersive VR setup could be aligned with previous models only to a very limited, inchoate degree. Although initial desktop-VR studies provided evidence that the behavioral patterns in a video game trigger FAAs corresponding to the motivational model (Rodrigues et al., 2018), we were unable to replicate these findings in a highly immersive VR setup.

The Special Role of Immersive Virtual Reality Setups
Even though we could not fully reconcile the self-reports and behavioral data with the obtained FAA data, we would like to consider the following points as potential, but not incontrovertible, explanations for the observed discrepancies: As previously speculated, the hesitating group's approach motivation throughout the cave exploration might be attributed to a strong motivation to approach the cave's exit. This assumption presupposes that FAAs do not reflect an emotional or motivational response to distinct fear cues, but a higher goal, supporting the idea that the FAA dynamics might reflect top-down inhibitory executive processes, rather than motivational tendencies per se (Schöne et al., 2015). In line with this, the neutral cave might elicit FAAs since the aim of finding the exit is pursued, although the neutral environment would not in itself provide a specific incentive to do so. However, as leaving the cave seems much more urgent in the negative condition, it would still have been expected that the FAAs elicited by the neutral and the negative conditions would be significantly different.
Apart from that, the best-known FAA models are not entirely consistent with each other: each model has been repeatedly supported by evidence, revised, or even overruled (for review see e.g., Harmon-Jones and Gable, 2018; Lacey et al., 2020). Considering this limited consistency, it is less surprising that the FAA data obtained from a very different setup compared to the conventional assessments do not match previous models one-to-one. In terms of emotion induction methods, the discrepancy might arise from the multidimensional nature of VR setups: a major advantage of classical laboratory experiments is the possibility to isolate relevant processes (Kvavilashvili and Ellis, 2004; Parsons, 2015). In contrast, VR setups, like real experiences, are multidimensional (Bohil et al., 2011; de la Rosa and Breidt, 2018; Pan and Hamilton, 2018) and, as we argued, facilitate realistic reactions (e.g., Blascovich et al., 2002; Slater, 2009; Kisker et al., 2019a). Accordingly, rather weak signals like the FAA may act in concert with further cognitive and emotional processes in complex, realistic situations (see Bohil et al., 2011). Thus, classical measurements as applied in conventional setups might not adequately capture FAAs under more naturalistic conditions and might need adaptation before they can be applied adequately.
In a similar vein, the assignment of FAAs to certain emotional or motivational states might not be unrestrictedly generalizable to complex behavioral reactions going beyond abstract responses: Models concerning the FAA are based on highly standardized laboratory setups, which strongly limit the behavioral response options to rather abstract stimuli presented on a screen (e.g., Wacker et al., 2008; Parsons, 2015). So-called desktop-VR settings, being somewhat more immersive, still reduce the behavioral response options, e.g., to movements of a joystick (e.g., Rodrigues et al., 2018; Brouwer et al., 2011). In contrast, immersive VR setups, such as the physical exploration of a cave, allow for multisensory, realistic sensations and significantly broader and non-mediated behavioral reactions (e.g., Rinck et al., 2010; Lin, 2017). Accordingly, the reduction of complex reactions to a single electrophysiological marker seems too abstract for realistic conditions (e.g., Lange and Osinsky, 2020; Bohil et al., 2011).
One might argue that movement-induced artifacts or wearing an HMD might overshadow significant differences between groups. However, recent methodological examinations demonstrated that mobile EEG obtains good data quality during walking even in single-trial setups (Debener et al., 2012; Nathan and Contreras-Vidal, 2016), and wearing common HMDs does not impact the EEG's signal quality regarding frequency bands below 50 Hz (Hertweck et al., 2019). Accordingly, it seems unlikely that differences between groups would have been overshadowed.
Summing up, the source of the discrepancy between behavioral responses and canonical FAA models is not yet conclusively understood. The differences found between groups seem to be mainly attributable to top-down emotion regulation (Lacey et al., 2020; Schöne et al., 2015). However, based on the aforementioned considerations, we assume that the canonical FAA and respective models cannot be applied to complex, holistic behavior without restriction or adaptation, as FAAs have so far been investigated by means of abstract responses. Rather, the complexity of realistic behavioral responses may not be fully predicted by a single, very specific electrophysiological marker (Bohil et al., 2011; Lange and Osinsky, 2020). Accordingly, contemporary FAA models offer an avenue to explore approach and avoidance behavior, but under realistic conditions, FAAs may not be as predominant as previous models suggest.

Ethical Challenges of Using VR as an Experimental Tool
With the high level of realism that VR offers, the ethical and moral responsibility in the implementation of experimental studies increases considerably. Many objectives could potentially be investigated more naturally and efficiently when implemented via realistic experimental setups. Nevertheless, the participants' safety must always come first, and it must be carefully considered whether the gain from extended knowledge justifies the participants' potential strain.
Despite ethical approval, exploring an unknown cave without being warned that negative stimuli awaited them, or which ones, was a significant strain on the participants. Five of the 59 participants exploring the negative cave terminated the experiment at the first sight of either the cave's entrance (n = 1) or the werewolf (n = 4). Although this is anecdotal evidence only, some participants whimpered heavily, while others engaged in self-calming strategies, like telling themselves repeatedly that it was only a game in order to break immersion. One participant even started crying when detecting the werewolf, and three participants reported having nightmares the night after the experiment. To put it bluntly, we were rather surprised that so many participants completed the cave exploration while experiencing intense fear and distress, although they were distinctly and repeatedly instructed that they could stop the experiment immediately if they felt uncomfortable.
Some VR horror games even explicitly warn that the experience in VR is more intense compared to conventional games and might cause significant psychological strain. Attending and staying in such simulations could be attributed to the general appeal of mediated horror content (Lin et al., 2018). Considering that VR setups are assumed to evoke real-life behavior (e.g., Slater, 2009; Kisker et al., 2019a) and emotions (e.g., Higuera-Trujillo et al., 2017; Chirico and Gaggioli, 2019), and to transfer such experiences to real life in terms of learning (e.g., Ragan et al., 2010) and mnemonic processes (e.g., Kisker et al., 2019b; Schöne et al., 2019; Kisker et al., 2020), VR is an effective tool for, e.g., exposure therapy (for review see e.g., Botella et al., 2017). But on the flip side of this coin, VR has not only the potential to treat but also to cause psychological dysfunction, such as PTSD-related symptoms (e.g., Dibbets and Schulte-Ostermann, 2015). The blurring of the mental border between virtual and real, and the resulting costs and benefits for all parties involved, must therefore be weighed very carefully on a case-by-case basis (for an in-depth discussion of ethics of virtual reality see e.g., Parsons, 2019; Slater et al., 2020).

CONCLUSION
Our results demonstrate that the employed VR setup facilitates realistic fear responses beyond affective responses: Beyond reporting intense fear, participants in both negative subgroups adapted their behavior to the encountered situation. While conventional setups can only operationalize the participants' substitutional response, e.g., in the form of a keystroke, VR setups allow for an immediate expression and assessment of the comprehensive fear response, e.g., by physically backing away from a stimulus. To the best of our knowledge, this study is the first one to investigate complex behavioral fear responses employing a mixed VR setup and thus complements previous findings. Participants exploring the negative cave either quickly advanced toward the werewolf to get past it or retreated when spotting the werewolf. In stark contrast, participants exploring the neutral cave behaved casually and showed no particular signs of fear or discomfort. Overall, the behavioral responses exhibited in the cave resemble lifelike responses on an affective but foremost on the behavioral level, extending scientific evidence for VR-based research's feasibility and effectiveness.
Moreover, no previous study has collected electrophysiological correlates of approach and avoidance under similarly immersive conditions. The different behavioral patterns were reflected in the electrophysiological responses. Specifically, the FAA discriminated between the advancing (hastening group) and retreating (hesitating group) behavior as they walked towards the werewolf in the negative condition, indicative of differences in emotion regulation. Furthermore, differences between the hastening and the neutral groups were obtained only on rare occasions. Especially the absence of effects is remarkable, and despite their ability to discriminate between different motivational or affective states, the remaining FAAs could not be reconciled with contemporary FAA models. This discrepancy could be attributed to the FAA models being based on data obtained under abstract laboratory conditions. The study at hand further incorporates the participants' complex behavioral responses, possibly affecting motivational tendencies.
Hence, putting laboratory-based models to the test under realistic conditions shows that they may not unrestrictedly predict real-life behavior. Yet, they provide a baseline for further refinement of experimental findings, which can be complemented by VR-based research. Accordingly, our findings demonstrate the high potential of implementing VR technology in experimental settings to increase the ecological validity of scientific findings. VR allows for non-mediated and life-like affective and behavioral responses and scientific measurements of real-world processes.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://osf.io/jwt7d/?view_only=92bda76c430247bca3a93eacf4567813

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the local ethics committee of Osnabrueck University, Germany. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
The study is based on a concept drafted by BS. All authors contributed to the study design. KF, MK, FT, PO, and NL developed the Unity VR environment and the physical replica under supervision of BS and JK. JK and LL integrated EEG and ECG applications into the Unity environment. Testing and data collection were performed by JK, LL, and KF. CG developed the application "Cagliostro" for the video rating of the participants' behavior. Data analyses were performed by JK under the supervision of BS and TG. JK and LL performed the data interpretation under the supervision of BS, TG, and RO. JK drafted the manuscript, and LL revised it. BS, TG, and RO provided critical revisions. All authors approved the final version of the manuscript for submission.

FUNDING
We acknowledge support by Deutsche Forschungsgemeinschaft (DFG) and Open Access Publishing Fund of Osnabrück University.