Virtual Reality as a Context for Adaptation

The COVID-19 pandemic has accelerated interest in virtual reality (VR) for education, entertainment, telerehabilitation, and skills training. As the frequency and duration of VR engagement increase (the number of people in the United States using VR at least once per month is forecasted to exceed 95 million), it is critical to understand how VR engagement influences brain and behavior. Here, we evaluate neurophysiological effects of sensory conflicts induced by VR engagement and posit an intriguing hypothesis: the brain processes VR as a unique “context” leading to the formation and maintenance of independent sensorimotor representations. We discuss known VR-induced sensorimotor adaptations to illustrate how VR might manifest as a context for learning and how technological and human factors might mediate the context-dependency of sensorimotor representations learned in VR.


INTRODUCTION
The COVID-19 pandemic has accelerated interest in virtual reality (VR) for education (Affouneh et al., 2020; Pears et al., 2020; Pregowska et al., 2021), entertainment (Sigala, 2020), telerehabilitation (Mantovani et al., 2020; Singh et al., 2020; Wang et al., 2020), and skills training (De Ponti et al., 2020; Ehrlich et al., 2020). As the frequency and duration of VR engagement increase (the number of people in the United States using VR at least once per month is forecasted to exceed 95 million; Vailshery, 2021), it is important to understand how VR engagement influences the brain and behavior. Here, we evaluate the known behavioral and neurophysiological effects of sensory conflicts, such as visual-vestibular mismatch, induced by VR engagement and posit an alluring hypothesis: the brain interprets VR as a unique "context", leading to the formation and maintenance of specific sensorimotor representations for VR engagement. We provide a working definition of VR as a context and offer examples of how context-specificity of VR may influence the brain and behavior at different levels of sensorimotor functioning: vestibulo-ocular reflex (VOR) gains, visuomotor adaptation of voluntary movements, and spatial navigation. We next review several technological and human factors that may influence the extent to which the brain might interpret VR as a unique context for learning and performance. Finally, we identify the implications of this hypothesis and avenues for additional scientific exploration.
What is a context for learning? Two definitions of "context" are relevant to VR. In experimental psychology (associative learning (Aiba et al., 1994; Wasserman and Miller, 1997; Bouton, 2010; Rosas et al., 2013; Urcelay and Miller, 2014), fear conditioning (Antoniadis and McDonald, 2000; Marschner et al., 2008; Maren et al., 2013), semantic memory (Kutas and Federmeier, 2000; Federmeier et al., 2002; Jones et al., 2015)), context-specificity implies that a behavior is more likely to be displayed in the state, place, or circumstance in which it was learned (e.g., taking an examination in the same location as the place of study results in better retrieval of the subject matter). In sensorimotor neuroscience, context-dependent adaptation refers to learning multiple motor programs depending on specific sensory conditions and efficiently retrieving the learned motor programs later upon recognition of the same sensory conditions (Glover and Dixon, 2001; Richter et al., 2004; Burguiere et al., 2005; Welch and Ting, 2014; Neszmélyi and Horváth, 2019). Here, we define a "VR context" as a set of sensory cues associated with engagement with immersive head-mounted display-based virtual reality (HMD-VR), and "context-dependent learning" as the memory of learned adaptations that previously yielded reduced sensory conflict and hence more accurate behavior in VR. In this perspective, we focus our attention on the context-dependencies of the VR experience agnostic to the virtual scene, task, or paradigm. We, therefore, focus more directly on the sensorimotor aspects of the "VR context." However, we do not exclude the possibility of context-dependent behavioral patterns associated with the content of the virtual scene.

How Might VR Manifest as a Context?
Repeated experiences within a context can enhance retrieval of specific adaptation strategies required for successful actions. For example, with repeated exposures to VR, a user may over time form a prediction about a sensorimotor error experienced in VR. Donning a head-mounted display (HMD) may cue recall of a previously learned adaptation strategy to overcome the error, establishing VR as the context for retrieval of previous learning. This context-specific learning may involve simple reflex adaptation, visuomotor adaptation of voluntary movements, and navigation-based adaptations.

VR as a Context for Reflex Adaptation
Relatively low-tech experiences, such as wearing corrective magnifying lenses or scuba goggles, provide clues about context-dependent learning during engagement with HMD-VR. These accessories alter the perceived distance, position, and size of objects, creating a vestibular-ocular conflict akin to that experienced in VR and requiring recalibration of the vestibulo-ocular reflex (VOR) to stabilize gaze. VOR is a low-latency (10-12 ms) reflex that rotates the eyes by an amount equal and opposite to the head rotation to maintain gaze fixation (Gauthier and Robinson, 1975; Gonshor and Jones, 1976a; Gonshor and Jones, 1976b; Paige and Sargent, 1991). Atypical viewing conditions can result in the loss of fixation due to insufficient ocular compensation for head rotation. Image blur due to this "retinal slip" serves as an error signal, encouraging the adaptation of the VOR gain to minimize the blur (Ito, 1998). Multiple VOR gains can be toggled as appropriate contexts arise. For instance, donning a familiar pair of magnifying eyeglasses induces rapid changes in VOR gain to accommodate the magnification (Collewijn et al., 1983; Demer et al., 1987). Even the tactile feedback of putting on or merely touching scuba goggles suffices to toggle VOR adaptation in experienced divers (Virre, 1996; Sharoni et al., 2001). Does VR also constitute a context for which a VOR gain is learned and retrieved under specific sensory conditions? Visuovestibular conflict induced by dynamic head-tracking errors and delays in virtual environment projection results in a velocity-dependent phase lag between the vestibular feedback of head rotation and visual feedback of scene rotation (DiZio and Lackner, 1992). As with corrective magnifying lenses, visuovestibular conflict in VR also induces VOR adaptation (Draper, 1996, 1998; Virre, 1996).
For example, reduced VOR gain was found following 20 min of gameplay when head rotation was used to direct the character's movement and returned to normal 30 min following cessation of VR engagement (Di Girolamo et al., 2001). In a cohort of patients with unilateral vestibular hypofunction, VOR gain increased following 1 month of vestibular training using a VR racing game (Micarelli et al., 2017). This cohort also showed better retention of increased VOR gain at a 12-month follow-up than a comparable cohort that received conventional vestibular training alone. More investigations of VOR adaptation in healthy individuals using modern HMD-VR systems with repeated engagements are needed to comprehensively probe these phenomena. Given these initial studies, we hypothesize that VR may constitute a context for which VOR gain can be learned and retrieved whenever that context is recalled based on sensory cues (Figure 1). Similar to how putting on goggles can retrieve VOR adaptation specific to the lenses' magnification (Herdman, 1998; Gimmon et al., 2018), donning an HMD may also drive retrieval of a learned VOR adaptation. Contextual cues typically associated with VOR adaptation such as vergence angle (Lewis et al., 2003), head position (Yakushin et al., 2003), and eye position are common attributes of HMD-VR headsets (Kramida, 2015), and hence, VOR may adapt specifically to HMD-VR. If true, context-dependent retrieval of VOR adaptation should depend on the duration, frequency, and consistency of VR engagement. For instance, in one study, VOR adaptation paired with a unique head orientation was retained for a much longer time than the training duration, and some retention existed outside of the training context (Yakushin et al., 2003; Schubert et al., 2008). To understand VR as a context for reflex adaptation, we need to address whether these gains are truly remembered or learned de novo each time, albeit at a faster rate with the help of familiar sensory cues.
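The hypothesis that a separate VOR gain is stored and retrieved per context can be made concrete with a toy simulation. The sketch below is an illustrative assumption and not a model taken from any of the cited studies: it treats retinal slip as the head velocity left uncompensated by the current gain, nudges the gain toward zero slip on every trial, and keeps one gain per context label. The class name `ContextualVOR`, the parameters `required_gain` and `learning_rate`, and all numeric values are hypothetical.

```python
class ContextualVOR:
    """Toy model: one VOR gain stored per context (all values hypothetical)."""

    def __init__(self, learning_rate=0.1, default_gain=1.0):
        self.learning_rate = learning_rate
        self.default_gain = default_gain
        self.gains = {}  # context label -> learned gain

    def retinal_slip(self, context, required_gain, head_velocity):
        # Slip (deg/s) is the image motion left uncompensated by the eye.
        gain = self.gains.get(context, self.default_gain)
        return (required_gain - gain) * head_velocity

    def expose(self, context, required_gain, head_velocity=100.0, n_trials=50):
        """Adapt the gain for `context`; return slip magnitude on each trial."""
        slips = []
        for _ in range(n_trials):
            slip = self.retinal_slip(context, required_gain, head_velocity)
            slips.append(abs(slip))
            gain = self.gains.get(context, self.default_gain)
            # Error-driven update: move the gain so that slip shrinks.
            self.gains[context] = gain + self.learning_rate * slip / head_velocity
        return slips


vor = ContextualVOR()
# Assumed scenario: the HMD demands a gain of 0.8 instead of the usual 1.0.
first_session = vor.expose("hmd", required_gain=0.8)
second_session = vor.expose("hmd", required_gain=0.8)
# The real-world gain was never overwritten, so no aftereffect appears there.
real_world_slip = abs(vor.retinal_slip("real_world", 1.0, 100.0))
```

In this idealized case the second session starts with almost no slip because the stored "hmd" gain is retrieved, and the real-world gain is untouched. Whether actual VOR gains are retrieved in this way or relearned quickly each time is exactly the open question raised above.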

VR as a Context for Adaptation of Voluntary Movements
Adaptation of voluntary movements refers to the integration of proprioceptive and visual information of movement outcomes to reduce sensory prediction error by updating an internal model. It is typically studied by examining changes in movement patterns in response to visuoproprioceptive discordance, such as in the prismatic adaptation paradigm (Redding et al., 2005; Luauté et al., 2009;  2013). Individuals can learn to toggle between learned adaptations and multiple environments by rapidly retrieving the appropriate internal model or strategy based on specific sensory cues (Mistry and Contreras-Vidal, 2004; Hegele and Heuer, 2010; Huberdeau et al., 2015; Schween et al., 2018). Errors in co-registration between the head and virtual scene can cause displacement and rotation of the virtual display with respect to the real world, inducing visual-proprioceptive discordance (Draper, 1996). Visual-proprioceptive discordance may also arise from body tracking errors resulting in displacements or gains between real-world movements and those of virtual avatars (Draper, 1996).
Accumulating evidence suggests that VR might encourage reliance on explicit learning strategies based on explicit knowledge about the task and target error, in contrast to implicit adaptation, or "error-based learning," which improves performance continuously and involves updating an internal model based on sensory prediction errors. Researchers evaluated differences in motor learning mechanisms between a 2D screen-based visuomotor adaptation task and HMD-VR presentation of the same task (Anglin et al., 2017). Participants were more likely to use explicit strategies in HMD-VR, although in both conditions, they required the same time to adapt to the perturbation and reduce errors. In another study, participants showed larger aftereffects in a prismatic adaptation task in HMD-VR compared to prism goggles (Ramos et al., 2019). Evaluating aftereffects is critical to learning in VR, but few studies have investigated this variable. Findings from the real world indicate that explicit strategies provide rapid performance improvements during adaptation and are particularly beneficial for tasks requiring rapid and precise mastery of a visuomotor transformation. However, explicit strategies may be detrimental in tasks requiring consistent performance during learning (e.g., performing an endoscopic surgery using a robotic device).
[Figure 1 caption, partial] [...] An aftereffect is experienced in the real world, causing retinal slip. VOR deadapts to reduce the error. (B-E) may repeat several times prior to F, resulting in the learning of the adaptation. (F, G) Even sight or touch of the HMD triggers a "preparatory" change in VOR gain upon or even prior to entering the VR. (H) Retinal slip is minimal or absent in the HMD-VR due to preparatory VOR adaptation. (I) Removal of the HMD is accompanied by preparatory deadaptation of VOR gain. (J) Aftereffects are greatly diminished due to preparatory deadaptation. (K) The HMD-VR has now become a "context" for the retrieval of a previously successful strategy for reducing retinal slip, and the sight or touch of the HMD becomes the sensory cues triggering this retrieval.
Frontiers in Virtual Reality | www.frontiersin.org November 2021 | Volume 2 | Article 733076
Explicit strategies are predominantly used early in the learning process, while later learning relies on the adoption and use of an internal model (Taylor and Ivry, 2011; McDougle et al., 2015). Studies that include a focus on duration of VR engagement are especially important to understanding if, and at what point, individuals adopt an implicit strategy in VR. Additionally, the nature of the learning process is a critical factor for understanding the transfer of skills from VR to the real world. VR-based learning often shows little transfer to the real world (Levac et al., 2019). For instance, amplification of errors in VR negatively impacted transfer due to the use of different coordination strategies (Marchal-Crespo et al., 2017). Even in older adults and healthy controls, practice in VR does not transfer to the real world (de Mello Monteiro et al., 2014). The typical approach to enhancing skill transfer is to increase the similarity between virtual and real tasks (Levac et al., 2019). However, this approach would be of little use if the brain perceives VR as a distinct context.
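The distinction between a fast explicit strategy and a slower implicit recalibration can be illustrated with a generic two-component simulation. This is a minimal sketch under stated assumptions, not the model used in any study cited here: the explicit aim is driven by target error, the implicit component is driven by sensory prediction error and partially decays, and the function name `simulate_adaptation` and every parameter value are hypothetical.

```python
def simulate_adaptation(rotation_deg=30.0, n_trials=80,
                        explicit_rate=0.5, implicit_rate=0.05,
                        implicit_retention=0.98):
    """Return per-trial (explicit, implicit, target_error) in degrees."""
    explicit, implicit = 0.0, 0.0
    history = []
    for _ in range(n_trials):
        # Target error: how far the cursor lands from the target.
        target_error = rotation_deg - (explicit + implicit)
        history.append((explicit, implicit, target_error))
        # Sensory prediction error: cursor position relative to the aim.
        prediction_error = rotation_deg - implicit
        # Explicit re-aiming corrects target error quickly.
        explicit += explicit_rate * target_error
        # Implicit recalibration changes slowly and partially decays.
        implicit = implicit_retention * implicit + implicit_rate * prediction_error
    return history


history = simulate_adaptation()
first_error = history[0][2]
final_explicit, final_implicit, final_error = history[-1]
peak_explicit = max(trial[0] for trial in history)
# If the rotation is removed and the strategy is dropped, only the
# implicit component carries over as an aftereffect in this sketch.
aftereffect = final_implicit
```

Under these assumed parameters the explicit component dominates early and shrinks as the implicit component grows, which mirrors the time course described above. In this sketch the aftereffect equals the implicit component alone, which is one reason the balance between strategies matters for transfer.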

VR as a Context for Spatial Navigation
Navigation involves the use of 1) idiothetic or "self-motion" cues (e.g., vestibular, proprioceptive, efference) generated by the body and head movements for multisensory path integration and 2) allothetic or "landmark" cues (e.g., visual, auditory, tactile) for processing landmark information. Integration of the two is necessary to specify an individual's spatial orientation in allocentric coordinates. The entailed visual, proprioceptive, and vestibular multisensory integration might differ between VR and the real world.
The importance of self-motion in spatial navigation is well demonstrated. The accuracy of scene recognition is reduced when an array of objects is rotated relative to a stationary observer but not when the observer moves relative to a stationary display (Simons and Wang, 1998). Self-motion, but not the passive motion of objects, facilitates scene recognition from novel viewpoints (Witmer and Kline, 1998; Wang and Simons, 1999), and self-motion is critical for orientation (Klatzky et al., 1998). Not surprisingly, given their susceptibility to disorientation after visual rotations, people face difficulty in learning spatial layouts in VR (Richardson et al., 1999). Context-specific learning in VR does not necessarily involve bodily self-movement in the visual scene (Riecke et al., 2010), but spatial navigation within VR may entail intrinsic conflicts due to a false sense of motion induced by optic flow (Park et al., 2018).
The distinct relationship between self-motion and optic flow in VR likely leads to distinct ways in which spatial information is encoded (Aghajan et al., 2015).
In summary, navigation in VR likely does not engage the idiothetic component of "self-motion" comparable to that in the real world. How this difference shapes the brain's interpretation of VR as a unique context remains immensely challenging to identify in full.
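Path integration from self-motion cues can be sketched as simple dead reckoning, which also shows how a small mismatch between bodily and visually specified rotation accumulates into a position discrepancy. The example below is a hedged illustration only: the function `integrate_path`, the `rotation_gain` and `translation_gain` parameters, and the 0.9 gain value are assumptions introduced here, not quantities reported in the cited work.

```python
import math


def integrate_path(steps, rotation_gain=1.0, translation_gain=1.0):
    """Dead-reckon (x, y, heading) from (turn_deg, distance) self-motion steps."""
    x, y, heading = 0.0, 0.0, 0.0
    for turn_deg, distance in steps:
        # Update heading first, then move along the new heading.
        heading += math.radians(turn_deg) * rotation_gain
        x += distance * translation_gain * math.cos(heading)
        y += distance * translation_gain * math.sin(heading)
    return x, y, heading


# Walk a unit square: forward, then three left turns with a step after each.
square = [(0, 1.0), (90, 1.0), (90, 1.0), (90, 1.0)]
body_estimate = integrate_path(square)
# Hypothetical case: the visual scene rotates 10% less than the head does.
visual_estimate = integrate_path(square, rotation_gain=0.9)
mismatch = math.hypot(body_estimate[0] - visual_estimate[0],
                      body_estimate[1] - visual_estimate[1])
```

With matched gains the estimate returns to the starting point, whereas the assumed 10% rotation mismatch leaves the two estimates roughly half a step apart after one lap. This is one simple way to picture why the relationship between self-motion and optic flow might lead to spatial information being encoded differently in VR.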

KEY FACTORS THAT INFLUENCE THE INTERPRETATION OF VR AS A SENSORIMOTOR CONTEXT

Technological Factors
Sensory conflict in HMD-VR arises from 1) head motion tracking errors, 2) body motion tracking errors, and 3) delays, lags, and errors in optic flow (Figure 2) (Holloway, 1995). The previous sections have described the processes by which tracking errors lead to sensory adaptations with a focus on dynamic head-tracking errors and VOR adaptation, static head or body tracking errors and adaptation of voluntary arm movements, and the influence of optic flow errors on spatial navigation. Currently, little is known about how the type, magnitude, and variability of VR-system errors affect the adaptation and recall of sensorimotor representations. Information about these factors is critical to engineering innovation in VR to further decrease the gap between the real and virtual world. In this regard, "presence" becomes a critical lens through which to view these factors that determine the extent to which the brain interprets VR as a distinct context. Presence most broadly refers to "the perceived realness of a mediated or virtual experience" (Skarbez et al., 2017). However, "presence" as a universal construct for evaluating VR remains amorphous (see Skarbez et al., 2017 for an in-depth discussion of the definitions of presence). Several definitions of presence concentrate on sensorimotor coupling in the virtual world (Slater and Wilbur, 1997; Zahorik and Jenison, 1998; Slater, 2009; Skarbez et al., 2017), with perhaps the most well established being Slater's "response-as-if-real (RAIR)" formulation (Slater, 2009). RAIR states that if a VR user experiences Place Illusion (the sense of being in the virtual environment) and Plausibility Illusion (the sense that the virtual experience is really happening), then they should react to virtual stimuli as if they were real. Place Illusion is described to be a function of the sensorimotor contingencies, referred to as immersion, afforded by the virtual reality system.
In contrast, Plausibility Illusion is described to be a function of the internal logical and behavioral consistency, referred to as coherence, of the virtual experience. Importantly, this formulation of presence can be assessed objectively through measurements of participant behavior and is, therefore, most relevant to the notion that VR may represent a context for adaptation. We note that this is indeed distinct from definitions of presence that describe "feeling" present, which is a subjective response most often measured by self-report. Whether the sense of

Human Factors
Age is a critical factor influencing the extent to which VR is interpreted as a distinct context. VOR gain changes in early development (<10 years) have been linked to the development of inhibitory control of the reticular formation in the brainstem (Ornitz et al., 1985). VOR gain also decreases in aging individuals (>75 years) (Baloh et al., 1993), indicating a reduction in reflex adaptation. Due to poorly calibrated VOR in these populations, head movements can cause image motion on the retina, leading to deficits in motor learning in VR. Whether VR is associated with greater sensitivity to retinal slip and whether VOR adaptation, retention, and consolidation in VR proceed the same way over the lifespan remain open questions. Young children may experience VR as real to a greater extent than adults do (Flavell et al., 1990; Sharar et al., 2007; Richert et al., 2011; Bailey and Bailenson, 2017) and even respond to non-immersive virtual environments in a way that is cognitively and behaviorally distinct from adults (Baumgartner et al., 2006, 2008). In two studies, adolescents (13-17 years of age) (Baumgartner et al., 2006) and adults (21-43 years of age) (Baumgartner et al., 2006) were found to recruit the prefrontal cortex during virtual engagement more than children (8-11 years old and 6-11 years old, respectively). It may be that young children, who have a less mature prefrontal cortex and feel more presence in virtual environments, show increased reliance on implicit learning strategies and consequently experience a greater degree of interference between real-world tasks and VR. Indeed, evidence indicates that VR might interfere with the normal development of sensorimotor coordination (Miehlbradt et al., 2020) due to an increased reliance on the information obtained from the modality with the highest context-dependent reliability (Gori et al., 2008; Nardini et al., 2014).
However, we are unaware of systematic investigations about the sensorimotor consequences of prolonged VR engagement in pediatric populations.
In contrast to the younger populations, aging increases reliance on sensorimotor predictions about the consequences of self-generated actions due to the structural and functional changes in frontostriatal circuits (Wolpe et al., 2016). Older populations may therefore be more likely than young adults to interpret VR as a distinct context. These age-related changes are important to consider since they may make VR-based training less likely to transfer to the real world in these geriatric populations, as has been reported (Levac et al., 2019). Beyond age, sensorimotor deficits due to various health conditions might also affect the scope of VR-based interventions in clinical populations, though conclusive evidence remains sparse. Finally, other human factors to consider include sex-related differences. In fact, sex-related differences in postural stability in VR have been noted in the literature; women are more likely than men to experience cybersickness in VR (Koslucher et al., 2016; Munafo et al., 2017). Additional studies should examine whether these differences percolate to reflex adaptation, the adaptation of voluntary movements, and spatial navigation and if the female brain interprets VR as a distinct context more readily than the male brain does.
Overall, an emerging theme is that the developmental status of the prefrontal cortex (young children) and the ability to integrate multisensory information quickly and veridically (aging adults) influence the extent to which the brain interprets VR as a distinct context, and the sense of presence may be the critical component mediating its influence on cognition and behavior.

Duration of VR Engagement
Most investigations of sensory conflict in VR involve a single session with less than 2 hours of VR engagement. Even these studies have been limited to subjective reports of cybersickness caused by visuovestibular conflict (Gallagher and Ferrè, 2018; Weech et al., 2018; Kim et al., 2020). Evidence that increased duration of single-session VR engagement increases self-reported cybersickness (Kennedy et al., 2000; Kourtesis et al., 2019) and that repeated exposure to VR reduces self-reported cybersickness (Kennedy et al., 2000; Risi and Palmisano, 2019) offers insights into how the duration and frequency of VR engagement might be related to context-dependent learning. In particular, reduced cybersickness with repeated VR engagement might indicate a strategy, learned during previous VR engagements, to overcome sensory conflict errors. This hypothesis is also in line with recent reports that faster readaptation to a learned sensory conflict relies more on retrieving explicit learning than on faster implicit learning (Avraham et al., 2021).
When sensory conflict resolution in VR is viewed as a form of context-dependent learning, exciting questions emerge about how the schedule of VR engagement affects known properties of context-dependent learning. What schedule of engagement is required for VR to constitute a contextual cue for retrieval of learned adaptations? Certain types of context-dependent learning, such as fear conditioning, form strong context-dependent memories upon a single exposure to the context (Maren et al., 2013; Lonsdorf et al., 2017), whereas other types of learning require repeated context-dependent learning to form strong context-dependent memories (Ingram et al., 2011; Ruitenberg et al., 2012; Lee and Fisher, 2019). It is important to understand the interaction between the strength of context-dependent memories of learned adaptations and variability in the magnitude of sensory conflict upon repeated VR engagement. If tracking errors or visual display lags vary even slightly, retrieving a learned adaptation may interfere with the recalibration of sensory adaptations (Fu and Santello, 2012). Probing the effects of different forms of interference on context encoding, conditioning, retrieval, and extinction would provide valuable information about how VR-induced sensory conflict is resolved (Bouton, 1994, 2010).

DISCUSSION
Understanding the extent to which the brain interprets VR as a unique context is a prerequisite for sustained and successful adoption of VR technology. Context-dependent learning may either be an asset or a hindrance to VR engagements. When VR is used for entertainment, teleconferencing, or work, it may be preferable to minimize carryover of sensorimotor adaptations from VR to the real world. Because short-lasting or absent aftereffects are a hallmark of context-dependent learning, it may be desirable to enhance context dependency of learned adaptations for these use cases. In contrast, when VR is used for skills training or rehabilitation, it may be desirable to reduce the context-dependency of learning to enhance aftereffects and ultimately generalization of learning from VR to the real world. This transfer might be accomplished by reducing the repeatability of the environment or increasing the presence of the experience.
A complete absence of visuomotor discrepancies, or full immersion, has been previously hypothesized to give rise to a strong sense of Place Illusion (Skarbez et al., 2017; Slater, 2009). Given the definition of context presented here, the complete absence of visuomotor discrepancies would theoretically remove the need for interpreting VR as a context for sensorimotor adaptation. However, whether this is true or not remains to be tested. Perhaps the more pertinent question is, how veridical does a VR system need to be to remove context? Furthermore, it is likely that the magnitude and type of sensorimotor discordance may affect context dependencies of reflex conditioning, voluntary motor adaptation, and spatial navigation differently. We hypothesize that VR as a context for spatial navigation is likely distinct from VR as a context for reflex adaptation or for adaptation of voluntary movements. Low coherence within the virtual world, yielding poor Plausibility Illusion, is more likely to influence how spatial information is encoded and may constitute a context distinct from the real world that persists even when sensorimotor discordance is low. Studies that address the technological and human factors that may influence whether the brain interprets VR as a context distinct from the real world are few and far between. The following questions remain open for both the engineers developing the systems and the perceptual scientists interested in the neurological effects of VR:
• How do the duration, frequency, and schedule of engagement influence whether the brain perceives VR as a context distinct from the real world?
• What are the aftereffects of VR engagement and how do they change with repeated exposure?
• What are the thresholds for sensorimotor adaptation in VR? How close to the human perceptual threshold can sensory conflicts occur without causing the individual to invoke a learning or adaptation strategy? Does it matter if sensory conflicts occur suddenly or gradually?
• How do sensory conflicts arising from multiple sources of error (e.g., head and body tracking errors combined) affect adaptation? Is there a different threshold for each source of error?
• Are there interference or reinforcement effects when training performed in VR is transferred to the real world?
Future work on context-dependent learning based on numerous well-validated designs previously used for testing retrieval, interference, and savings (Krakauer et al., 1999, 2005; Zarahn et al., 2008; Huang et al., 2011) can provide a greater understanding of the extent to which the brain interprets VR as a unique context, providing invaluable information to VR applications across multiple domains.

DATA AVAILABILITY STATEMENT
No original data was included in the article; further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
MY, MM, SN, and ET conceived the idea for the manuscript; MY, MM, and SN drafted the manuscript; MY, MM, SN, and ET edited and revised the manuscript; MY, MM, SN, and ET approved the final version of the manuscript.

FUNDING
The project was primarily funded by a research contract under Facebook's Sponsored Academic Research Agreement. The project was additionally supported in part by NIH-2R01NS085122 (ET), NIH-2R01HD058301 (ET), NSF-CBET-1804550 (ET), and NSF-CMMI-M3X-1935337 (ET and MY).