The Use of Virtual Reality to Influence Motivation, Affect, Enjoyment, and Engagement During Exercise: A Scoping Review

Many adults are physically inactive. While the reasons are complex, inactivity is, in part, influenced by the presence of negative feelings and low enjoyment during exercise. While virtual reality (VR) has been proposed as a way to improve engagement with exercise (e.g., choosing to undertake exercise), how VR is currently used to influence experiences during exercise is largely unknown. Here we aimed to summarize the existing literature evaluating the use of VR to influence motivation, affect, enjoyment, and engagement during exercise. A Population (clinical, and healthy), Concept (the extent and nature of research about VR in exercise, including underpinning theories), and Context (any setting, demographic, social context) framework was used. A systematic search of Medline, Scopus, Embase, PsycINFO, and Google Scholar was completed by two independent reviewers. Of 970 studies identified, 25 unique studies were included (n = 994 participants), with most (68%) evaluating VR influences on motivation, affect, enjoyment, and engagement during exercise in healthy populations (n = 8 studies evaluating clinical populations). Two VR strategies were prominent – the use of immersion and the use of virtual avatars and agents/trainers. All studies but one used virtual agents/trainers, suggesting that we know little about the influence of virtual avatars on experiences during exercise. Generally, highly immersive VR had more beneficial effects than low immersive VR or exercise without VR. The interaction between VR strategy and the specific exercise outcome appeared important (e.g., virtual avatars/agents were more influential in positively changing motivation and engagement during exercise, whereas immersion more positively influenced enjoyment during exercise). Presently, the knowledge base is insufficient to provide definitive recommendations for use of specific VR strategies to target specific exercise outcomes, particularly given the numerous null findings. 
Regardless, these preliminary findings support the idea that VR may influence experiences during exercise via multiple mechanistic pathways. Understanding these underlying mechanisms may be important to heighten effects targeted to specific exercise outcomes during exercise. Future research requires purposeful integration of exercise-relevant theories into VR investigation, and careful consideration of VR definitions (including delineation between virtual avatars and virtual agents), software possibilities, and nuanced extension to clinical populations.

INTRODUCTION
Regular exercise is well-established as a key strategy to improve overall health, decrease the risk of musculoskeletal, metabolic, cardiovascular, and neurological conditions, and reduce all-cause mortality (Warburton et al., 2006; Blair, 2009). Despite the clear and well-known benefits of exercise, nearly one-third of adults over the age of 15 do not meet these exercise guidelines (World Health Organization, 2017).
Physical inactivity and exercise avoidance is a complex issue influenced by environmental, sociocultural, and individual psychological and physical factors (Kendzierski et al., 1998; Booth et al., 2000; Giles-Corti and Donovan, 2002; Ekkekakis et al., 2005). In part, the way people feel during exercise, the enjoyment they experience, their previous exercise experience, and their beliefs about exercise may be strong influencing factors that result in exercise avoidance (Williams et al., 2008; Ekkekakis et al., 2011). Importantly, how people feel during exercise predicts their future exercise engagement (Williams et al., 2008), raising the possibility that enhancing exercise experiences within an individual may have important influences on their future exercise behavior.
Virtual Reality (VR) has been proposed as one way to improve exercise experiences. Indeed, there is ample literature suggesting that certain technological features can increase the likelihood of an individual choosing to engage with that technology and to undertake exercise (Yim and Graham, 2007; Knaving et al., 2015; Rogers, 2017). However, VR may also be used to change the experiences that occur during the exercise session itself. For example, during exercise, the experience could be augmented in various ways that may include the implementation of competing virtual agents, distraction through change of context or narrative, provision of motivational feedback and more. There is growing evidence that VR use during exercise might improve exercise experiences within the exercise session itself. Recent work (Bird et al., 2019) has shown that through using VR during exercise, factors such as enjoyment and affect (how pleasant/unpleasant and energized/lethargic one feels) can be positively influenced.
To date, the literature describing how VR is currently being used within the context of exercise, specifically to influence experiences occurring while exercising, has not been formally summarized, making it difficult to judge the overall usefulness of VR in this field. Additionally, it is unclear whether VR-enhanced exercise is being used in clinical populations to improve the exercise experience, such as those with neurological conditions, or whether its use is limited to lab-based assessment of healthy populations. Such knowledge is important for determining the present scope of VR application. Further, in studies that use VR to enhance an individual's experiences during exercise, the theoretical underpinnings that are being used to justify VR-based exercise are also unknown. It is likely that there are various mechanisms which influence the effect of VR on exercise experiences and these mechanisms may depend both upon the type of population and/or the type of exercise experience being targeted. For example, altering affect vs. altering motivation during exercise may require different design strategies such as the use of natural scenery, or distracting features during the experience. Lastly, it is also unknown whether certain VR features have positive (or negative) influences on exercise experiences and/or are more (or less) potent in altering a person's experiences during exercise. Such knowledge is relevant to guide the prescription of VR, in order to attain maximal benefits.
Recent, and significant, technological advances in VR now mean that it is possible to use VR outside of research settings, taking it into real-world environments. Given this increasing accessibility of VR as a testing or training technology, it is critical to more fully understand how VR is being used for exercise prescription, and in what context. Therefore, a scoping review with narrative synthesis was undertaken to capture research using VR to alter experiences during exercise. While understanding how people engage with VR and are motivated to use VR-based exercise is important, here the focus was to explore and understand VR-induced malleability of the exercise experience itself, given the link between experiences during exercise and future exercise behavior. Thus, this scoping review aimed to answer the question: what is the current state and nature of the literature investigating the use of VR in healthy and clinical populations to alter motivation, affect, enjoyment, and/or engagement during exercise? The specific aims were to explore the types of VR technology used (high vs. low immersive), the types of VR strategies being implemented (e.g., avatars or agents), and the populations (healthy, clinical) that VR-enhanced exercise is being used in. This review aimed to interpret the effects of VR on exercise experience outcomes as a function of these features. Given that different features of a VR experience may differentially influence healthy and clinical populations and that findings in healthy populations do not always translate to clinical populations (e.g., even in practical set-up) (Garrett et al., 2018), here we examined the literature for these populations separately. This allows nuanced suggestions for future research as well as VR design considerations. 
Last, we aimed to summarize the proposed theoretical underpinnings (or lack thereof) for VR use within an exercise context by amalgamating the theories provided by current studies evaluating VR-enhanced exercise.

METHODS
A Population, Concept, and Context framework was used as per PRISMA scoping review guidelines (Peters et al., 2015; Tricco et al., 2018). The protocol was registered with Open Science Framework (OSF) prior to the synthesis of the literature and can be viewed at https://bit.ly/39GtQlB.

Data Sources
A systematic search of Medline, Scopus, Embase, PsycINFO, and Google Scholar databases was performed from inception to August 5, 2019. Keywords and relevant subject headings for VR, exercise, and enjoyment, motivation, engagement, affect, physical exertion, or work rate were used. Subject headings were adapted to each database with the assistance of an academic librarian (see Table 1 for the Medline search strategy). The search was limited to studies that had been published in indexed journals.

Eligibility Criteria
The Population was open to clinical and healthy populations, of any age, gender, or cultural background. The Concept included the extent and nature of research evaluating VR use during exercise, specifically focussing on the types of VR interventions/strategies being used (and in what populations), their theoretical underpinnings, and their effects on motivation, affect, enjoyment and/or engagement during exercise. To be included, studies were required to identify their study as using VR technology (i.e., self-identified; no judgment was made based on what we thought VR technology was), use VR in an exercise context, and assess one of the key outcomes (exercise motivation, affect, enjoyment, and/or engagement) during exercise, with or without use of a comparator treatment/experimental arm or condition.
A priori definitions for exercise outcomes (motivation, affect, enjoyment, engagement) were used to determine study inclusion. Motivation was defined as the psychological underpinning that drives intensity, trend and persistence in behavior (Iso-Ahola and Clair, 2000). Studies that quantitatively measured perceived motivation [i.e., used outcome measures such as the intrinsic motivation inventory (Tsigilis and Theodosiou, 2003) or Likert scales (Joshi et al., 2015)] were included. Affect was defined as the pleasure or displeasure, tension, or relaxation, energy or lethargy one feels (Ekkekakis et al., 2011). Studies that used an established affect scale [e.g., feeling scale (Hardy and Rejeski, 1989), felt arousal scale (Svebak and Murgatroyd, 1985), or physical activity affect scale (Lox et al., 2000)] were included within the affect outcome category. Enjoyment was defined as a positive emotion or a positive affective state (Wankel, 1993). Last, engagement was defined generally as the participants' participation during exercise, measured by voluntary changes in workload (e.g., power output, distance traveled), psychophysiological indications of workload (e.g., heart rate), changes in physiological response (e.g., electromyography; EMG, and range of motion; ROM), self-report measures of engagement (e.g., Paffenbarger Physical Activity Questionnaire; PPAQ), or time spent exercising by choice. Importantly, engagement did not refer to the further use or engagement with VR beyond that of the study exercise session (e.g., did not include the motivation to engage with the VR-based technology).
The overall Context of this review was open (any level of education, income, patient demographics) and included any geographical/sociocultural context. The specific context was to consider the provision of VR during any exercise session, including experimental and clinical settings, as well as acute experimental and intervention studies.

Study Selection
The search results were uploaded to Covidence (Veritas Health Innovation, 2017) with titles and abstracts screened by two independent reviewers (BM, MM) to remove clearly irrelevant papers. The full text of potentially eligible studies was then retrieved, and the same two reviewers formally evaluated eligibility using the above criteria. The two reviewers addressed any discrepancies, and if needed, consulted with a third, independent reviewer. The reference lists of full text studies were manually searched by both reviewers for additional potentially relevant studies.

Data Extraction
A custom-designed, piloted data extraction spreadsheet was used. The following data were independently extracted by the same two reviewers and cross-checked for accuracy: study design; population demographics [age, sex, number, healthy or clinical (e.g., autism, spinal cord injury, obesity)]; types of VR interventions (systems and whether they were high immersion or low immersion); types of VR strategies used and any control comparisons; type, intensity and volume of exercise; underpinning theories of the interventions; aims of the intervention; main findings (means, standard deviations, and statistical results); and report of adverse events.

Data Handling
Data from included studies were summarized in tables, allowing for descriptive narrative analysis. Data were grouped based on the following factors: (1) type of VR strategy (e.g., avatars/virtual trainers, immersion); (2) type of outcome assessed (motivation, affect, enjoyment, or engagement); (3) population (clinical or healthy); and (4) type of VR system (high vs. low immersion). For the VR strategy used, immersion was considered to be evaluated by a study when various levels of VR immersion were compared (e.g., high vs. low), when additional sensory features were added to a VR experience, or when a VR condition was compared to a no-VR condition during exercise. Additionally, avatars were defined as first-person perspective, human-controlled and virtual agents were defined as third-person perspective, computer-controlled (Bailenson and Blascovich, 2004). Competitive agents/virtual trainers were typically considered those that provided input/feedback exceeding participant effort (e.g., faster speed), with cooperative agents/virtual trainers considered those providing input/feedback to maximize performance (e.g., to achieve ideal heart rate) (Marker and Staiano, 2015). Ghost agents were typically considered those that provided input/feedback of an individual's previous performance (Farrow et al., 2019). For the type of VR system used (e.g., high vs. low immersion), immersion levels were here defined as high if the study used a head mounted display (HMD) with real-time tracking of movement. Low immersion was defined as any other intervention using projection, television, computer, audio input only, or an HMD that did not have real-time motion tracking.
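The immersion rule described above can be written as a small decision function. The following Python sketch is illustrative only: the names `DisplayType` and `classify_immersion` are assumptions made for this example and do not describe any tooling used in the review.

```python
from enum import Enum


class DisplayType(Enum):
    """Display categories named in the immersion definition (labels are illustrative)."""
    HMD = "head mounted display"
    PROJECTION = "projection"
    TELEVISION = "television"
    COMPUTER = "computer"
    AUDIO_ONLY = "audio input only"


def classify_immersion(display: DisplayType, realtime_tracking: bool) -> str:
    """Return 'high' only for an HMD with real-time movement tracking, else 'low'."""
    if display is DisplayType.HMD and realtime_tracking:
        return "high"
    # Everything else, including an HMD without real-time tracking, counts as low.
    return "low"


if __name__ == "__main__":
    print(classify_immersion(DisplayType.HMD, True))          # HMD with tracking
    print(classify_immersion(DisplayType.HMD, False))         # HMD without tracking
    print(classify_immersion(DisplayType.AUDIO_ONLY, False))  # audio-only app
```

Under this rule, an HMD without congruent motion tracking falls into the low immersion category, as does an audio-only intervention.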
When data were available from included studies, mean differences and the 95% confidence intervals were calculated for between group/condition comparisons. If data were not available within the study results, authors were emailed a maximum of 3 times requesting missing data. If a study used an outcome measure that overlapped with other exercise outcomes, only the findings relating to the original purpose were used. For example, the Intrinsic Motivation Inventory (Tsigilis and Theodosiou, 2003) includes one subscale evaluating interest/enjoyment. However, given that the interest/enjoyment subscale is part of a greater motivation construct, the subscale findings were not separately discussed within the enjoyment sections.
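As an illustration of the between-group calculation, the sketch below computes a mean difference and its 95% confidence interval from group means, standard deviations, and sample sizes. It is a minimal example under stated assumptions: independent groups, an unpooled standard error, and a normal approximation. The review does not specify the exact formula used, so this should not be read as the authors' method; within-participant comparisons would additionally require the correlation between conditions. The function name and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Mean difference (A minus B) with a normal-approximation confidence interval.

    Assumes two independent groups and an unpooled standard error.
    """
    md = mean_a - mean_b
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return md, md - z * se, md + z * se


if __name__ == "__main__":
    # Hypothetical enjoyment scores: VR condition vs. no-VR condition.
    md, lower, upper = mean_difference_ci(10.0, 2.0, 25, 8.0, 2.0, 25)
    print(f"MD = {md:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With small samples, a t-distribution critical value would give a somewhat wider interval than the normal approximation used here.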

RESULTS
The systematic search generated a total of 970 potentially eligible studies (see Figure 1 for PRISMA flow diagram). Of these, 83 studies were retrieved for full-text screening, with 58 not meeting eligibility criteria. A total of 25 published studies, consisting of 28 individual experiments, were included.

Study Characteristics
Studies spanned 22 years from 1997 to 2019 (Figure 2). The research was carried out in 11 countries: nine studies were undertaken in the United States; three each from the United Kingdom and Australia; two each from New Zealand and France; and one each from Canada, Norway, Saudi Arabia, Spain, Sweden, and Switzerland. The total number of participants across all studies was 994. Most studies (17/25) evaluated the use of VR in a healthy population (n = 883, 88.8%) with eight investigating VR in clinical populations (n = 111, 11.2% of total participants). Similar numbers of each sex were recruited (498 female; 50.1%). Three studies did not report the sex of recruited participants (n = 19, 3% of total participants).
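The participant and study proportions reported above follow from simple arithmetic on the stated counts. The short Python check below reproduces them; the helper name `pct` is an assumption made for this example.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_participants = 994
healthy, clinical = 883, 111
female = 498

# The healthy and clinical subgroups should account for every participant.
assert healthy + clinical == total_participants

print(pct(healthy, total_participants))   # share of participants in healthy-population studies
print(pct(clinical, total_participants))  # share of participants in clinical-population studies
print(pct(female, total_participants))    # share of female participants
print(pct(17, 25))                        # share of studies in healthy populations
```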
Seventy-two percent of studies did not provide an explicit definition for VR, resulting in a combination of both immersive and non-immersive technology included in this review. Specifically, seven studies used highly immersive head-mounted display (HMD) VR such as the Oculus Rift, PlayStation VR, or Samsung Gear VR, which enables motion tracking of the participant's head position and allows 360-degree exploration of the environment. One study used an HMD without congruent tracking of motion (Calogiuri et al., 2018). Twelve studies used low-immersive VR, consisting of a computer, television, or projector to provide the visual stimuli. Other interactive technology was also used, including the Interactive Rehabilitative System (IREX) VR (Bryanton et al., 2006), Super Pop VR™ (García-Vergara et al., 2015), and Nintendo Wii with additional tracking sensors (Hossain et al., 2013). Such systems track the participant's motion, allowing the participant to interact with the virtual environment. Last, Gillman and Bryan (2016) utilized app-based technology where audio was the primary sensory driver of the virtual experience.
Only two studies explicitly reported adverse effects during the use of VR. Calogiuri et al. (2018) reported that 19 out of 26 participants had experienced "cyber-sickness." Shaw et al. (2016) had three participants withdraw due to "discomfort" from the Oculus Rift HMD.
Of the studies that evaluated virtual avatars or agents, 71% did not refer to the intervention using appropriate terminology, i.e., terminology aligned with the established definitions for virtual avatars or virtual agents provided by Bailenson and Blascovich (2004).

Type of Exercise
Various exercise modalities were paired with VR. Ten studies used stationary cycling, five used treadmill/outdoor walking, five used bodyweight-based exercises (e.g., doing squats, flexing joints), two used stationary rowing, two used treadmill running, and one study used specific gait orthosis Lokomat treadmill walking.

Type of VR Intervention Strategies and Extent of VR Use for Exercise Outcomes
Two VR strategies were prominent, regardless of exercise outcome: the use of immersion and the use of virtual avatars and agents. See Table 4 for an overview of study counts, participant numbers, and participant sex, organized by population (healthy and clinical) and by exercise outcome (motivation, affect, enjoyment, and engagement). Table 5 provides an overall summary of study findings across exercise outcome, VR strategy used, and population evaluated.

Influence of Immersion on Motivation During Exercise

Healthy Populations
Three studies evaluated the influence of VR immersion on motivation. Of these, only one study by Shaw et al. (2017) used high immersive VR, and found that adding immersive features such as wind, sound, and resistance (provided in addition to the visual VR) during virtual cycling resulted in greater motivation, and had an additive effect when all elements were combined.
The remaining studies used low immersive VR and found no effect of VR on motivation. García-Vergara et al. (2015) compared two versions of an interactive VR game called Super Pop VR, finding no significant differences in motivation during game play between a simple version and a version providing greater immersive customization features (e.g., customized sound effects). Gillman and Bryan (2016) compared two auditory-only phone apps while having participants run on a treadmill. There were no differences in metamotivational state between the group receiving a narrative-based gamification experience (auditory feedback of Zombies chasing them) and the group receiving running performance feedback via an app.

Clinical Populations
Only low immersive VR was used in clinical populations (n = 2 studies). Meyer (2008), in a small sample of people (n = 3) with obesity, evaluated the effect of 21 weeks of VR treadmill walking on motivation. The VR walking training did not change exercise motivation, although due to lack of a control group, it remains unclear whether motivation might have decreased over time in a group not receiving VR. Finkelstein et al. (2013) investigated motivation using a projected VR game on either one or three walls. The study recruited ten children with autism and had them play 15 min of an interactive game titled "Astrojumper." Despite participants reporting they would play the game more if they could use it whenever they wanted, and that it was unlikely that they would get bored of the game, there was no difference in motivation between VR conditions.

(Continued)
HMD-nVisor SX. 3 sets of 20 exercises were completed, followed by 2 min of standing, followed by an option to continue to exercise or end the experiment.
A. VR with a third-person avatar of themselves that would lose weight with activity (n = 14)
B. VR with a third-person avatar of themselves that would gain weight with inactivity (n = 12)
C. VR with a third-person avatar of someone else that would lose weight with activity (n = 14)
D. VR with a third-person avatar of someone else that would gain weight with inactivity (n = 13)
Engagement: Self-avatar groups = ↑ exercise repetitions than other-avatar groups. Positive vs. negative reinforcement groups = (NS). Fox and Bailenson (2009)

Initially participants were asked to row (on rowing machine) at 75% of their perceived max for 2 min with the distance recorded. After a 7-min break, participants did a maximum 9-min row.
Affect: No significant differences between groups in affect. Positive affect subscale ↑ in VR with competitor than no VR.

Frontiers in Virtual Reality | www.frontiersin.org

Influence of Immersion on Affect During Exercise

Healthy Populations
The influence of VR immersion on affect during exercise was evaluated in five studies. Findings consistently showed no influence on affective valence, regardless of VR immersion level, although varying findings were seen for perceived activation (arousal). Using high immersive VR cycling, Bird et al. (2019) compared various levels of audio-visual input. Perceived activation (via felt arousal scale) was higher when participants experienced exercise with 360-degree video (i.e., high immersive VR) compared to the control (no music or video) or the music only condition, but no differences were seen for affective valence (via feeling scale).
Low immersive VR did not influence affective valence and this finding was consistent across repeated and single exercise sessions. Annesi and Mazas (1997) evaluated the influence of a 14-week intervention comparing a VR bike with a no-VR recumbent bike, and a no-VR upright stationary bike (with arm component). There were no differences in any of the exercise-induced feeling inventory subscales for the VR condition compared with the no-VR conditions, except for the revitalisation subscale which favored the VR condition over the no-VR upright bike. Evaluating the effect of auditory-based VR apps over one treadmill exercise session, Gillman and Bryan (2016) found that the narrative-based gamification experience app (Zombies, Run!) resulted in significantly higher activation levels (via felt arousal scale) but no difference in affective valence (feeling scale) compared to running with an app providing only running performance feedback.
Two low immersive VR studies specifically investigated the influence of outdoor nature environments on affect, comparing the difference between VR (visual input of outdoor nature) and actual experience. Actual outdoor walking had consistently superior influences on affect. Specifically, Plante et al. (2006) found that a VR treadmill condition (projected VR of outdoor walking) resulted in significantly higher energy subscale ratings (via activation-deactivation adjective checklist) than a VR seated (no exercise) condition, but that actual outdoor walking resulted in higher ratings than both VR conditions. Similarly, Calogiuri et al. (2018) found that actual outdoor walking resulted in greater positive affect, as measured by the PAAS, than both VR treadmill walking and the same VR footage without exercising. Further, a negative association between participants experiencing cybersickness (sickness induced from the use of VR) and positive affect was found.

(Continued)
An audio cue was also used to draw attention to this information.
D. Race: a virtual competitor that was programmed from the current speed of the participant was used to create competition.
Engagement: In SCI group, speed and sprint condition = ↑ HR than steady condition. In healthy controls, Race = ↑ HR than steady, and sprint = ↑ HR than speed condition.
In SCI, speed, sprint, and race conditions = ↑ biceps femoris EMG in swing phase of gait than steady condition. Sprint = ↑ EMG than speed condition. In healthy controls, Race condition = ↑ EMG than steady.
In healthy controls, Race = ↑ gastrocnemius EMG during both phases than steady condition.
In SCI, speed and sprint conditions = ↑ rectus femoris EMG during stance phase than steady condition. Speed, race, and sprint conditions = ↑ EMG during swing phase than steady condition. In healthy controls, Speed, sprint, and race = ↑ EMG than steady condition during swing phase. Speed condition = ↑ EMG than steady during stance phase.

Clinical Populations
Two studies evaluated the influence of VR on affect in overweight populations, with positive results shown only when VR was highly immersive. Jones and Ekkekakis (2019) found that in an inactive, overweight population, affective valence during recumbent cycling was significantly more positive in the high immersive VR condition than the no-VR control at 5 min, and higher than both the control and low immersive condition at 10 and 15 min. Baños et al. (2016) used low immersive VR vs. no-VR during 6 min of treadmill walking, comparing healthy children to overweight children, and found no difference in affective valence between groups and conditions.

Influence of Immersion on Enjoyment During Exercise

Healthy Populations
The effect of immersion on enjoyment in healthy populations was explored in seven studies. Both high and low immersive VR typically increased enjoyment when compared to no-VR conditions, although some conflicting results were seen. Using high immersive VR, Zeng et al. (2017) found a VR biking condition to be significantly more enjoyable than a no-VR biking control group. Similarly, Shaw et al. (2017) found that incorporation of more immersive features of VR (such as sound and wind) into the high immersive VR biking experience did result in significantly higher enjoyment than the vision or resistance alone conditions. However, Bird et al. (2019) found that highly immersive VR cycling (360-degree video, with or without music) did not significantly increase enjoyment compared with the control (no video/music) condition or low immersive VR condition (music and video only).
Using low immersive VR, Plante et al. (2003) found a VR cycling condition was more enjoyable than both a no-VR cycling condition and a VR no-cycling condition (only played a VR game). Similarly, Bird et al. (2019) found higher enjoyment during low immersive VR (music and video) than during a control condition without video/music. In contrast, Mestre et al. (2011b) found no difference in enjoyment during stationary cycling between a low immersive VR group and a group where music was added to the VR, but made no comparisons to no-VR.
Finally, studies comparing low immersive VR (vision of walking in nature) with actual nature walking consistently showed enjoyment benefits in favor of the latter. Plante et al. (2006) found actual nature walking to be more enjoyable (assessed via the Physical Activity Enjoyment Scale; PACES) than VR of nature paired with treadmill walking and VR while seated, with the latter rated as least enjoyable. Similarly, Calogiuri et al. (2018) found that actual outdoor walking was significantly more enjoyable than both VR treadmill walking (non-responsive to participant movement) and VR no-walking control. This study also found that cybersickness occurring during both VR conditions was negatively associated with enjoyment.

Clinical Populations
Six studies evaluated the effect of immersion on exercise enjoyment in clinical populations. The only study using high immersive VR (Jones and Ekkekakis, 2019) evaluated enjoyment (via PACES) after 15 min of stationary recumbent cycling at ventilatory threshold in an overweight and inactive population. There were no differences in enjoyment between the high and low immersion VR exercise conditions, but both VR conditions were more enjoyable than the no-VR exercise control condition. Three studies using low immersive VR qualitatively assessed exercise enjoyment in clinical populations of overweight children (Baños et al., 2016), overweight women (Meyer, 2008), and adults with stroke, traumatic brain injury or encephalitis (Törnbom and Danielsson, 2018). Each study reported the qualitative data to be indicative of participants enjoying the VR conditions. Interestingly, despite increased enjoyment, participants with neurological conditions generally found the VR conditions to be more tiring and challenging due to the increased stimulation (Törnbom and Danielsson, 2018). Similarly, using quantitative measures of enjoyment, Finkelstein et al. (2013) investigated enjoyment of exercise (playing a 15 min game) in children with autism, where they compared single wall projection with three walls of projection. Regardless of the condition (one vs. three wall VR projection), the children reported high enjoyment in general for the game but indicated that having three screens was more fun than one. Last, while Hossain et al. (2013) reported assessing enjoyment during a VR exercise game in a population of people with obesity, the findings were not reported.

Table 5 | Summary of the effects of virtual reality on exercise outcomes, specific to type of outcome, type of virtual reality manipulation, and type of population.

Exercise outcome: Motivation
Influence of immersion
• High immersive VR seems to positively influence motivation when compared with no-VR conditions.
• Low immersive VR does not influence motivation when it solely uses principles of immersion (e.g., adding interactive features or presence-inducing auditory feedback to an app).
Influence of avatars and agents (and high vs. low-immersive VR)
• High immersive VR appears to positively influence motivation, particularly when the agent is competitive (Shaw et al., 2016).
• Use of a ghost (or cooperative/trainer) agent does not increase motivation.
• Use of a competitive agent in low immersive VR is no better than low immersive VR alone (no agent).
Clinical application
• Very little work has been done, precluding conclusions.

Exercise outcome: Affect
Influence of immersion
• Higher levels of immersion (in either high or low immersive VR) appear to result in higher levels of activation but do not alter affective valence.
• Low immersive VR does not influence affect to the same degree as outdoor exercise (highly immersive).
Influence of avatars and agents (and high vs. low-immersive VR)
• Low immersive VR using virtual competitive agents may have a more positive influence on affect than no VR exercise conditions but is no different than other conditions that also involve VR.
Clinical application
• High immersive VR cycling appears promising at improving affective valence in clinical participants that are inactive and overweight.
• Such findings are not seen for low immersive VR evaluated in children who are overweight.

Exercise outcome: Enjoyment
Influence of immersion
• Adding sensory features relevant to an immersive VR experience (e.g., wind/sound to VR biking) appears to increase enjoyment during exercise.
• Positive effects appear dependent on the context of the sensory addition: merely adding music to highly immersive VR does not influence enjoyment and has varying effects when added to low immersive VR.
• Low immersive VR walking does not heighten enjoyment to the same degree as outdoor walking (highly immersive).
Influence of avatars and agents (and high vs. low-immersive VR)
• The addition of a competitive or a coach agent does not typically result in greater exercise enjoyment (although general use of VR vs. no-VR does).
Clinical application
• Studies generally supported increased enjoyment during high/low immersive VR conditions vs. no-VR conditions in adults who are inactive and overweight, children who are overweight, and adults with stroke, traumatic brain injury or encephalitis.
• Increased fatigue occurred in people with neurological conditions due to the added sensory stimulation of VR.

Exercise outcome: Engagement
Influence of immersion
• Low immersive VR does not influence exercise engagement when it solely uses principles of immersion (vs. no-VR).
• The addition of music to an intervention appears to be beneficial to engagement.
• Low immersive VR treadmill walking does not heighten engagement to the same degree as outdoor walking (highly immersive): i.e., perceived exertion is higher for the same physiological response (HR).
Influence of avatars and agents (and high vs. low-immersive VR)
• Results for high immersive VR with competitive agents are conflicting: increased engagement during maximal intensity exercise, but no benefit to engagement when using lower intensity exercise.
• When an avatar resembles the participant, and/or there is positive or negative reinforcement, an increase in exercise engagement occurs.
Clinical application
• Low-immersive VR appears to be beneficial for engagement in children with cerebral palsy when doing specific joint exercises and in people with a spinal cord injury when treadmill walking.

Influence of Immersion on Engagement During Exercise

Healthy Populations
All studies evaluating the effect of immersion on exercise engagement in healthy populations used low immersion VR, with no evidence of positive effects. The only study to evaluate engagement via attendance rates, Annesi and Mazas (1997), found that a low immersive VR stationary bike group had significantly greater attendance rates over a 14 week intervention than a no-VR recumbent bike group, but was no better than a no-VR upright bike group. This supports an effect of the type of bike on engagement, but not the presence of VR. Mestre et al. (2011b) evaluated the effect of low immersion VR cycling with and without music on engagement, as measured by a 0-10 commitment check scale. Findings showed that a video only VR condition had a greater decrease in commitment over time compared to a VR plus music condition, the latter suggesting a maintenance of commitment, but neither improved engagement.
Comparisons to actual outdoor walking showed no benefit of VR on engagement. Calogiuri et al. (2018) found no difference in mean HR between low-immersive VR during treadmill walking (showing outdoor walking) and actual outdoor walking. However, perceived effort (ratings of perceived exertion; RPE) was actually higher during VR treadmill exercise than outdoor exercise.

Clinical Populations
All five studies evaluating engagement during exercise in clinical populations used low immersion VR. Three studies evaluated VR in neurological populations, with two finding beneficial effects. Bryanton et al. (2006) investigated children with cerebral palsy and their ability to dorsiflex their ankle and hold contractions, comparing no-VR to a motion-tracking low-immersion VR system. VR was more engaging than no-VR, evidenced by greater dorsiflexion hold times and increased ROM during the task. Zimmerli et al. (2013) had people with spinal cord injury (SCI) and a healthy control group complete a 4-min VR treadmill walking task in one of four conditions: non-responsive VR (a steady walking speed with the virtual world not responsive to the treadmill speed); congruent VR (walking speed was congruent with the treadmill speed); metric VR (congruent virtual speed with treadmill with additional metric feedback such as average speed and an auditory cue of performance information); and competitive agent (a race condition). The SCI group had increased electromyography (EMG) activity of the biceps femoris, the rectus femoris, and the gastrocnemius muscle during the swing phase of their gait and higher HR during the congruent VR, competitive agent, and the metric feedback conditions compared to the non-responsive VR condition. Last, Finkelstein et al. (2013) evaluated the influence of one wall vs. three walls of projection during game play in children with autism, and found no differences in metabolic equivalent (METs) expenditure between conditions.
Two studies evaluated the effect of VR in obese populations, with no evidence of positive benefit in engagement. Meyer (2008) investigated the use of VR treadmill walking over 21 weeks in women with obesity and showed no improvement in step count during the exercise sessions or in daily physical activity compared to their baseline results. Furthermore, Hossain et al. (2013) investigated children and adults with obesity using an interactive exergame to influence engagement; however, the study failed to report their measured results.
Table 5 provides an overall summary of study findings across exercise outcome, VR strategy used, and population evaluated. Notably, only one study evaluated virtual avatars (embodied), with the remaining evaluating the influence of virtual agents.

Influence of Virtual Avatars and Agents on Motivation During Exercise
Five studies (six experiments) evaluated the effect of virtual competitors or coaches during exercise on motivation in healthy populations. No studies investigated the use of virtual avatars in healthy populations. Further, no studies evaluated use of virtual avatars or agents on motivation during exercise in clinical populations.
In healthy populations, studies using high immersive VR showed benefits of competitive virtual trainers on motivation, but no benefit of cooperative (coaches providing positive feedback) or ghost (mimicking past performance) agents/trainers. In particular, Shaw et al. (2016) showed no difference in motivation between solitary VR cycling (no virtual agent), a ghost condition (replay of themselves from the first condition), or a virtual cooperative trainer condition (recommendations were provided based on current speed and HR to maintain ideal heart rate). However, in experiment two, a virtual competitive trainer condition (via increased virtual agent speed) resulted in significantly higher motivation than both the virtual cooperative trainer condition and the solitary VR control condition. Similarly, Farrow et al. (2019) found no benefit on motivation of VR ghost agent conditions (based on past performance and paired with either "normal" cycling resistance or 7% greater resistance) compared with a no-VR condition and a VR-exercise alone (no virtual agent) condition. However, the VR exercise alone condition resulted in significantly higher scores on the interest/enjoyment and subjective vitality subscales of the Intrinsic Motivation Inventory than the no-VR condition.
Three studies using low immersion VR found conflicting results of competition on motivational constructs. Murray et al. (2016) found that both VR rowing with a competitor and VR rowing alone (no competitor) resulted in higher motivation (interest/enjoyment motivation subscale) than the no-VR rowing group, but that the VR groups did not differ (supporting a general effect of VR immersion). In contrast, Parton and Neumann (2019) found that a competitive VR rowing agent programmed to be 5% faster than the participants' baseline resulted in significant improvements in the intrinsic motivation competence subscale compared with a rower programmed to be 20% faster. No other motivation subscale differences were seen between groups. Last, Neumann and Moffitt (2018) found no differences in motivation during treadmill running between a condition involving third-person computer-controlled agents ("other runners" going both faster and slower than the participant) and presentation of static images (low arousal and neutral valence).

Influence of Virtual Avatars and Agents on Affect During Exercise
Only low immersive VR was used to evaluate the influence of virtual agents on affect in healthy populations. No studies investigated the use of virtual avatars in healthy populations and no studies evaluated virtual avatars or agents on affect during exercise in clinical populations.
In a healthy population, Neumann and Moffitt (2018) found no difference in activation or valence (via the feeling scale and felt arousal scale) during treadmill running between groups receiving third-person, computer-controlled agents ("other runners") or those receiving static, neutral images. Moreover, the VR "other runners" group actually reported greater negative affect subscale scores (via PAAS) than the neutral images group. Murray et al. (2016) also found no influence of virtual competitors on affective valence or activation during rowing, but did find that a VR competitor group had significantly greater positive affect subscale scores (via PAAS) than the no-VR group, although it was no different than the VR rowing alone (no competitor) group.

Influence of Virtual Avatars and Agents on Enjoyment During Exercise
The majority of the four studies (five experiments) investigating the influence of virtual competitors or coaches on exercise enjoyment in healthy participants showed no benefit. Overall, level of immersion (high vs. low immersive VR) did not appear to play a role. No studies investigated the use of virtual avatars in healthy populations. Additionally, no studies evaluated the effect of virtual avatars or agents on enjoyment during exercise in clinical populations.
Using high immersive VR, Shaw et al. (2016) found no difference in enjoyment between a virtual cooperative trainer (coach), solitary VR cycling, and a virtual ghost condition (replay of themselves from the first condition). Additionally, their second experiment comparing solitary cycling in VR with competitive and cooperative virtual trainer conditions also found no difference in enjoyment. Similarly, Mestre et al. (2011a) used low immersive VR and found that VR with a virtual coach for encouragement during a 10 km ride did not result in greater enjoyment than VR as a solo rider. However, both VR conditions resulted in greater enjoyment than the no-VR cycling condition. Neumann and Moffitt (2018) found that during treadmill running participants allocated to receiving static, neutrally valenced images actually had greater enjoyment than the group receiving third person, computer-controlled agents ("other runners"). The only study showing positive effects paired a competitive agent (via low immersive VR) with maximal intensity rowing. Specifically, Murray et al. (2016) found that the competitive agent VR group had significantly higher enjoyment (as measured using PACES) than the VR rowing alone (no competitor) and No-VR rowing group.

Influence of Virtual Avatars and Agents on Engagement During Exercise
Six studies (nine individual experiments) used virtual avatars or agents to investigate exercise engagement in healthy populations. Half of these studies used high immersive VR, with most finding positive benefits for some virtual avatar/agent conditions. No studies investigated the use of virtual avatars or agents on engagement during exercise in clinical populations.
Of the high immersive VR studies tested in healthy populations, Shaw et al. (2016) found no differences in energy expenditure or distance traveled between solitary VR cycling (no virtual agent), a ghost condition (replay of their first condition performance), or a cooperative virtual trainer condition (providing pacing based on their current speed and heart rate). However, in their second experiment, a competitive virtual trainer condition resulted in significantly larger distances traveled as compared with a cooperative virtual trainer condition. The second high immersion VR study (Farrow et al., 2019) found no differences in mean power, mean HR, or future intentions to exercise between conditions of VR cycling alone, VR with a ghost agent, and a "hard" VR ghost agent condition (7% greater cycling resistance). However, the "hard" VR condition resulted in a significantly greater caloric expenditure than the no-VR condition. Finally, the only study using virtual avatars in this review, Fox and Bailenson (2009), evaluated the use of avatars (self- and other-identification), and vicarious reinforcement of exercise (via change in the avatar size to give positive or negative reinforcement of the consequences of exercise). In experiments one and two, engagement was measured as the number of repetitions that participants voluntarily performed following completion of a standardized set of exercises. In experiment one, the positive reinforcement VR group (avatar would lose weight during the exercise task) had significantly greater engagement (via increased exercise repetitions) than the VR normal (no change in avatar size) and VR only (no avatar) groups. Experiment two showed that engagement appeared most dependent on self-identification: participants assigned to a self-avatar group (avatar resembled themselves) had greater engagement than the non-self-avatar groups, but there were no differences between reward groups (loses weight) and punishment groups (gains weight), and no interaction between the factors. In experiment three, participants randomized to watching an avatar that resembled themselves running on a treadmill reported undertaking a significantly higher number of minutes of exercise in the following 24 h (via PPAQ) than those randomized to watching a "self-avatar" loitering or those watching a "non-self-avatar" (resembling someone else) running on the treadmill.
Three studies used low immersion VR to evaluate the influence of competitive or cooperative virtual agents on exercise engagement, with none showing benefits specific to virtual agents. Murray et al. (2016) found that participants in the VR competitive rowing agent group had significantly greater power output and covered a greater distance than the no-VR group, but there was no difference between the competitive group and the VR only group (no competitor). Lack of difference between competitive agent and VR only groups may reflect low power or only a general effect of VR. Additionally, no differences were seen between groups for HR. Similarly, Parton and Neumann (2019) and Mestre et al. (2011a) found no benefit of virtual competitors or coaches on engagement. Parton and Neumann (2019) showed no difference in distance rowed, power output, stroke rate, or HR between groups experiencing competitive VR rowing agents at 5% faster than their previous performance vs. 20% faster than previous performance. Mestre et al. (2011a) found that virtual speed during VR cycling was no different between VR virtual coach (providing pacing), VR only, and no-VR conditions.

Theoretical Underpinnings of VR
Over one-third of the included studies (9 of 25) did not provide a theoretical underpinning for using VR to influence the exercise experience (Table 6).
Eight studies evaluated the influence of immersion in clinical populations and used three underpinning theories. Baños et al. (2016) used the theory of attentional focus (Wininger and Gieske, 2010) when testing children with and without obesity. Jones and Ekkekakis (2019), while not explicitly stating so, applied the dual-mode theory of affect in addition to attentional focus theory when testing overweight, inactive adults. Zimmerli et al. (2013) used the theory of flow (Csikszentmihalyi and Csikszentmihalyi, 1992) when evaluating people with spinal cord injuries. The remaining studies did not provide an underlying theory (Bryanton et al., 2006; Meyer, 2008; Finkelstein et al., 2013; Hossain et al., 2013; Törnbom and Danielsson, 2018).
In studies of healthy volunteers that assessed the effect of competitive, cooperative, or neutral virtual agents/trainers (n = 7 studies), four main theories were used with some studies drawing from multiple theories. Self-determination theory (Ryan and Deci, 2000) was used by Farrow et al. (2019) and Shaw et al. (2016); social comparison theory (Festinger, 1954) was used by Parton and Neumann (2019), with similar theories being used by Shaw et al. (2016) via social facilitation theory (Zajonc and Sales, 1966) and Murray et al. (2016) via the Köhler effect (Stroebe et al., 1996). Attentional focus theory (Wininger and Gieske, 2010) was used by Neumann and Moffitt (2018), Murray et al. (2016), and Mestre et al. (2011a), and social-cognitive theory (Bandura, 1989) was used by Fox and Bailenson (2009).

DISCUSSION
This scoping review found that the current evidence for the use of VR to influence motivation, affect, enjoyment and/or engagement during exercise is limited, heterogeneous, and largely confined to evaluation in healthy populations. While preliminary evidence suggests that differing types of VR strategies may influence specific exercise outcomes (i.e., immersion tends to positively influence exercise enjoyment and virtual competitive agents tend to positively influence motivation and engagement), many studies found null results. Despite the presence of numerous null findings, high immersive VR showed promise across most exercise outcomes. The lack of exercise-relevant theories to inform study design, combined with inconsistent use of definitions for VR technology, makes interpretation of null or differing results challenging. Our findings highlight two key features requiring a concerted effort to move the field forward: standardization of VR terminology and integration of theory-based research.

Types of VR and Influence on Exercise Outcomes in Healthy Populations
A key finding of this review was that highly immersive VR typically had more beneficial effects on exercise experiences than low immersive VR or exercise without VR, and many low immersive VR studies found null results (particularly in healthy populations). This finding is important given the context of the current VR literature: less than half of included studies used highly immersive motion tracking HMDs; rather, the majority used screen or projected technology. It is clearly important to consider that null results may indeed reflect a lack of efficacy of VR to influence experiences during exercise. However, until highly immersive VR versions are evaluated, it is potentially premature to rule out the possible effectiveness of VR applications to shape experiences within an exercise context.
The interaction between VR strategy used (immersion vs. virtual avatars or agents) and the specific exercise outcome (e.g., motivation vs. affect) appeared potentially important. In particular, virtual avatars and agents were more influential in positively changing motivation and engagement outcomes during exercise, whereas studies using immersion were more successful at influencing enjoyment during exercise. Influences of VR strategy on affect during exercise were less clear, with typically negative results regardless of VR strategy. These differences highlight the importance of considering the mechanisms by which VR is thought to influence the specific exercise outcome, so that any effect can be heightened. Such findings also support the idea that VR may influence outcomes during exercise via multiple and varied mechanistic pathways, and thus, the specific programming and type of VR are critically important to better understand. Regardless, this review clearly identified that the current knowledge base is insufficient to provide definitive recommendations for use of specific VR strategies to target specific exercise outcomes.

The Influence of VR Immersion
A common strategy used in current research was to directly vary the level of immersion when testing a VR condition, or to alter immersion by adding in extrasensory features to the VR program. Generally, high immersive VR improved motivation during exercise, but this did not hold true for low-immersion technology. For affect, VR immersion, regardless of the technology used or level of immersion, resulted in increased perceived activation (arousal) but did not influence affective valence. Of the studies that found immersion to positively influence affect or activation during exercise (Plante et al., 2006; Bird et al., 2019), a concurrent improvement in enjoyment was often observed, suggesting that these outcomes may be influenced by similar factors. Indeed, previous research has shown a positive association between positive affect and enjoyment during exercise, but not with negative affective states (Raedeke, 2007).

Underpinning theory: Attentional focus (Broadbent, 1957; Nideffer, 1976; Wininger and Gieske, 2010)
Theory description: Varying attentional theories exist and cover consistent constructs of awareness. For example, Nideffer's theory of attentional and personal style posits that one's attentional focus is modifiable and has two relevant dimensions: firstly, a dimension ranging from internal bodily focus to external environmental focus and secondly, a dimension of width, i.e., narrow to broad attentional focus. Broadbent's model for human attention suggests that individuals cannot attend to all incoming sensory information and so signals with certain characteristics, such as intensity, earliness, absence of recent similar inputs, and the hierarchy of needs of the channel, are prioritized in what is then sampled. The papers included within this review as attentional focus have alluded to being underpinned by such theories.

Underpinning theory: Dual-mode theory of affect (Ekkekakis, 2003)
Theory description: Dual-mode theory of affect posits that the interplay between two factors influences affective responses during exercise; the salience of: (1) afferent interoceptive information; and (2) cognitive processes including the frontal cortex's cognitive appraisal of the exercise goals, meaning of the exercise, self-perceptions, and context. The theory suggests that at intensities around the ventilatory threshold responses are variable, with some individuals experiencing a decrease in affective valence and others an increase due to their cognitive appraisal of the interoceptive cues. When exercise intensities reach respiratory compensation point, the theory suggests interoceptive cues become highly salient and dominant, resulting in most people experiencing displeasure.

Underpinning theory: Attention-restoration theory (Kaplan and Kaplan, 1989)
Theory description: Posits that experiences such as mental fatigue can be positively influenced by exposure to specific environments that promote fascination, with several factors of the natural world promoting this restoration.
Studies using theory:
• Calogiuri et al., 2018*

Underpinning theory: Self-determination theory (Ryan and Deci, 2000)
Theory description: Self-determination theory is comprised of four interrelated sub-theories. The first, cognitive evaluation theory (CET), focuses on the influence of external factors and events on intrinsic motivation via their impact on the autonomy and competence of an individual, i.e., external events have informational or controlling elements which can undermine or facilitate intrinsic motivation dependent on the social and environmental context. The second, basic needs theory (BNT), suggests that people are motivated to develop by three components: 1. Competence: mastering experiences; 2. Relatedness: willingness to interact, be connected to, and care for others; 3. Autonomy: to experience causal agency of experiences. The third, organismic integration theory (OIT), describes a continuum from extrinsic motivation through to intrinsic motivation, with behavior becoming less extrinsically motivating the more internalized it becomes. The fourth, causality orientation theory (COT), suggests that people orient and adapt to an environment and regulate their behavior in different ways: autonomous (autonomy, relatedness and competence are satisfied), controlled (relatedness and competence are appeased but autonomy is not), or impersonal orientations (none of the three basic needs are met).

Underpinning theory: Social cognitive theory (Bandura, 1989)
Theory description: The social cognitive theory posits that observational learning and modeling of behaviors by others are influential in one's own cognitions, behaviors, and environment.
Studies using theory:
• Fox and Bailenson, 2009*

Underpinning theory: Psychological reversal (Svebak and Murgatroyd, 1985)
Theory description: Reversal theory posits two psychological states. The first, the telic state, theorizes that behaviors are executed in order to achieve an overarching goal and the individual attempts to minimize felt arousal in order to achieve the objective. It is the achievement of the goal that brings the feeling of pleasure to the experience. The second, the paratelic state, postulates that the experience of the behavior has a greater focus on presence and one infers enjoyment from the behavior itself, rather than the outcome that may eventuate.
Studies using theory:
• Gillman and Bryan, 2016*

Underpinning theory: Social comparison theory (SCT) (Festinger, 1954); Social facilitation theory (SFT) (Zajonc and Sales, 1966); The Köhler effect (Stroebe et al., 1996)
Theory description: Social comparison theory suggests that individuals compare their performance against other people's performances to establish their own self-evaluation, further influencing cognitions, affect, and behaviors. Similar theories such as social facilitation theory posit that an individual's performance is influenced by the presence of onlookers or also by competition. The Köhler effect is an observation where an individual works harder as a member of a group than by themselves; generally, it is weaker members of a group who are motivated to work harder to keep up with more apt members.

Underpinning theory: Flow (Csikszentmihalyi and Csikszentmihalyi, 1992)
Theory description: Postulates that there are several components that lead to enjoyment during an experience. In this sense an experience of high flow would be analogous with a highly engaging and rewarding experience that would motivate an individual to continue that activity. The factors that are proposed to influence this include that the task is matched to the skill level of the participant, the activity has a clear goal, and feedback of progress is present. There is a strong focus toward the task, a movement of consciousness toward dissociation and away from bodily signals, a sense of control, and an altered sense of time.

A prominent design feature of many VR applications was the use of natural, outdoor settings in the viewed VR environment. When compared to outdoor exercise, such VR was consistently found to be less engaging, less enjoyable, and less affectively pleasant. However, in outdoor exercise, a participant is surrounded by a natural environment that provides distraction and a sense of immersion, both of which are proposed to be underpinning mechanisms of VR (Slater and Sanchez-Vives, 2016). Therefore, such findings when comparing VR natural settings to outdoor exercise in natural settings are perhaps unsurprising, particularly given that green-exercise (outside-based exercise) has been shown to result in higher affective valence and enjoyment compared to indoor exercise (Lahart et al., 2019). When putting green-exercise literature in context with the VR exercise literature, a common theme emerges in that the greater the immersion, the more positive the affective valence and greater the enjoyment, i.e., outdoor exercise has greater immersion and distraction than VR exercise, which has greater distraction than stationary, indoor based exercise.
Generally, enjoyment during exercise was improved by higher immersive experiences, but this was dependent upon the context of the immersive features added. For example, only adding music to VR exercise did not influence enjoyment but adding sensory features such as wind and environmental/performance-based sounds did. Findings that low immersive VR also improved enjoyment compared with no-VR conditions raise the possibility that enjoyment may be influenced by more general mechanisms initiated by the presence of new/exciting visual input. Last, it is likely premature to draw conclusions about the effect of immersion on exercise engagement given that no studies used high immersive VR. In general, low immersive VR was not beneficial in improving engagement during exercise; however, contrary to the findings for enjoyment, the addition of music may have positive effects.

The Use of Virtual Avatars and Agents
Virtual avatars and agents were most commonly evaluated as a strategy to improve motivation and engagement during exercise. Findings from this review suggest that both the nature of the avatar/agent and the level of immersion may be important considerations when adopting these strategies to improve motivation during exercise. Generally, competitive agents/virtual trainers in high immersive VR positively impacted motivation, but cooperative agents did not. Additionally, when low-immersion technologies were used, competitive agents no longer provided more benefit than conditions with no agent. This review also highlighted that a nuanced understanding of the type of competitive agent (relative to the participant's own performance) may be important to the effect on exercise motivation. While competitive agents typically had beneficial effects on motivation during exercise, e.g., Parton and Neumann (2019), when the competitive agent's performance exceeded the capacity of the participant, detrimental effects on motivation occurred, potentially driven by a decrease in perceived competence. Specifically, when a competitive agent was 5% faster than participants during VR rowing, there were greater increases in motivation and in perceived competence than during a 20% faster agent condition (Parton and Neumann, 2019). Such findings are largely consistent with Bandura's self-efficacy theory (Bandura, 1997) and are supported by previous work (Aral and Nicolaides, 2017) that has shown that an individual's motivation during running is enhanced when competitors are slightly better than the individual, but not when competitors are much better (i.e., motivation decreases).
Despite issues with the construct of engagement (discussed below), VR tended to lead to an increase in engagement during exercise. This benefit may be in part through the challenges that were set by competitive agents (Murray et al., 2016; Shaw et al., 2016; Farrow et al., 2019), performance feedback (Zimmerli et al., 2013), or distraction from noxious internal physiological information (Annesi and Mazas, 1997). Additionally, one study evaluating engagement during exercise (Fox and Bailenson, 2009) explored the use of avatars within a health behavior context, investigating the influence of positive/negative reinforcement (avatar manipulated to decrease/increase in size in response to activity) and self-modeling (avatar looks like the participant). That positive effects on engagement were observed both during exercise and in the 24 h following the intervention suggests that avatar design, positive or negative reinforcement strategies, and the incorporation of observing a self-modeled avatar may be important features to consider in future VR exercise interventions. Of interest, studies included here primarily investigated the influence of interactions with virtual agents rather than the influence of embodying a virtual avatar (i.e., only one study evaluated the latter). Given previous findings that suggest self-modeled agents improve engagement during exercise (Fox and Bailenson, 2009; Barathi et al., 2018), the ability to embody an avatar may have unique influences on exercise, warranting future investigation.
Virtual competitive/cooperative agents were evaluated less frequently for outcomes of affect and enjoyment during exercise, and typically showed negative findings regardless of immersion level. Some inconsistencies seen for the effect of virtual avatars and agents on affect may be due to differences in the type of virtual avatar used (computer-controlled vs. human-controlled). Typically, competitive or coaching agents do not improve enjoyment during exercise apart from one study reporting benefit when a competitive rowing agent was used in low-immersive VR (Murray et al., 2016).
Despite clear definitions within the field delineating a virtual avatar from a virtual agent (Bailenson and Blascovich, 2004), this review found that several studies interchanged avatar and agent terminology. For example, Mestre et al. (2011a) describe a virtual coach on a bike as an "avatar," but because it is computer controlled, technically it meets the Bailenson and Blascovich definition of a virtual "agent." Such discrepancies in the use of terminology may have negative implications when comparing studies and outcomes. Future research would benefit from accurate and consistent use of terminology to differentiate between virtual avatars and virtual agents. Additionally, it is relevant to consider that the specific influence of virtual avatars (this avatar is "me") on outcomes during exercise cannot be disentangled from effects due to VR immersion itself, because without immersion, virtual avatar embodiment cannot occur. To better understand whether embodiment of another person/body might have unique influences on exercise outcomes (i.e., additional to that of immersion), exploring literature that evaluates generalized embodiment illusions [e.g., body ownership illusions (Kalckert and Ehrsson, 2012)] or mediated reality [changing of real-time video of your own body (Nishigami et al., 2019)] may be warranted.

The Use of VR in Clinical Populations
There has been minimal work investigating the use of VR to influence outcomes during exercise in clinical populations. Only one study evaluated the use of a competitive agent in a clinical population (Zimmerli et al., 2013) and no studies have evaluated the use of cooperative agents. Given evidence showing beneficial effects of therapeutic encouragement on self-efficacy (Rajati et al., 2014) and exercise engagement (Casey et al., 2010) in clinical populations (e.g., where exercise may be harder or pain producing), investigation of cooperative, encouraging agents seems warranted. Indeed, this is a clear evidence gap given the promising effects seen with high immersion virtual avatars and agents in healthy populations for motivation and engagement during exercise. Relevant ways forward may include purposeful collaboration with research areas that are outside of the context of exercise, but at the forefront of VR use. For example, insightful considerations for virtual agents or immersion relevant to exercise might come from work evaluating VR use in complex patient populations (e.g., post-traumatic stress disorder) and/or with nuanced, complex VR interventions, such as motor rehabilitation (e.g., Rizzo et al., 2004; Rizzo and Shilling, 2017). The use of virtual agents and immersion in contexts outside of exercise may help guide relevant strategies to use in exercise.
To best investigate the potential for VR, it is important to carefully consider the choice of VR technology and the software programming for specific clinical populations, as there may be unintended consequences. For example, Törnbom and Danielsson (2018) found that several participants with acquired brain injuries experienced the (low-immersive) treadmill VR as more tiring and challenging than the no-VR condition due to the increased sensory input. Studies that specifically investigate varying levels of sensory input for these populations are likely relevant.
Despite the potential for unintended consequences in some clinical populations with highly immersive VR, this review found preliminary evidence that only high immersive VR positively influences affective valence during cycling in inactive and overweight participants (Jones and Ekkekakis, 2019). Further, while only Jones and Ekkekakis (2019) evaluated the influence of high-immersive VR on enjoyment during exercise in clinical populations, there was evidence from other studies supporting benefits from low immersive VR when compared to no-VR. Regardless, limited evaluation of high immersive VR for enjoyment during exercise is a particularly important finding for two reasons: first, previous research has suggested exercise enjoyment to be predictive of future exercise engagement (Lewis et al., 2016), so increasing enjoyment has potential to have meaningful real-world effects; and second, high immersive environments are likely to provide greater enjoyment (Shaw et al., 2017; Zeng et al., 2017). Thus, for clinical populations that gain both general and condition-specific benefit from ongoing exercise, research evaluating highly immersive VR-exercise should be a priority.

Definitions of Virtual Reality
The primary implication of this review's findings is that no definitive conclusions about the influence of VR on exercise can be made until consensus definitions of VR technology and levels of immersion are created and consistently applied. Our findings highlighted the presence of inconsistent (and/or lacking) definitions to delineate what constitutes VR. Of the included publications, only 28% provided a definition of VR. Part of this challenge relates to inconsistency surrounding VR definitions in the general literature. Some definitions in the literature are very broad; for example, Pan and Hamilton (2018) define VR merely as "a computer-generated world." Other literature (Cipresso et al., 2018) has divided VR into different levels of immersion: non-immersive VR, such as TV screens or computer monitors; immersive VR, such as HMDs; and semi-immersive VR, such as fish tank VR. Indeed, the importance of immersion was promoted much earlier (Sutherland, 1965). Research from the perceptual field has suggested that two key elements of VR determine whether or not a participant will respond realistically to an environment (a likely pre-requisite for a compelling effect on exercise affect/enjoyment): first, place illusion, a sense of being there or "presence"; and second, plausibility illusion, a sense that the situation is actually taking place (Slater, 2009). While all studies included here reported using VR, many utilized technologies that arguably do not provide sufficient place or plausibility illusion. If place and plausibility illusion characteristics were required for a technology to be considered VR, only seven studies in healthy populations (of 17) and one study in clinical populations (of eight) would have been included.
The lack of a standardized definition of VR has implications for summarizing relevant VR literature. Here, studies that did not explicitly define their technology as VR were excluded despite using technology similar to that of studies that were included. For example, Glen et al. (2017) used a screen-based environment to evaluate the influence of cycling while playing an exergame where participants were chasing dragons (vs. a control cycling condition and a control blank screen condition). Their study found that despite working harder in the exergame condition, participants perceived greater enjoyment and pleasantness. However, as the study did not define the intervention as VR, it was not included in this review. Importantly, the results of such studies that used visual/projection technology (but did not call it VR) are similar to those of studies included in the present review. For example, Monedero et al. (2015) found a cycling exergame with competitive agents led to greater energy expenditure and greater enjoyment, despite not reporting a higher RPE compared to a non-VR condition, consistent with the findings from Murray et al. (2016). Research evaluating pacing cyclists during virtual stationary biking also found improvements in motivation (Corbett et al., 2012), consistent with Shaw et al. (2016). Further, Russell and Newton (2008) used an interactive video game during 30 min of cycle ergometry to assess post-exercise affect, finding that the video game condition added no benefit beyond that of the no-video-game cycle ergometry; this was consistent with the findings by Plante et al. (2003). The presence of similar findings suggests that the inclusion of these studies would not have significantly altered the conclusions of this review.
To meaningfully add to literature in this area, future research must clearly specify the level of immersion and types of technology being used and do so in a standardized manner. Previous work by Milgram and Kishino (1994) depicted a continuum spanning from reality to an experience within a complete virtual environment. We propose that an extension of Milgram and Kishino's reality-virtuality continuum may be warranted, one that focuses and expands upon levels of immersion in virtual reality [e.g., the continuum beginning with technologies where motion tracking is present, but VR presence is reduced (e.g., the Kinect), and ending with HMDs with motion tracking and haptic feedback]. This expansion of the continuum may be a useful tool to allow standardized reporting of a VR intervention. At minimum, further research should routinely measure and report the level of perceived immersion that participants experience, given that this may play an important role in the outcomes of the intervention. We propose that other lower-immersive technology such as television screens, projection, or technology with no proprioceptive integration may be best referred to as low-immersion audio-visual technology, and that where these technologies have interactive components, the modifier "interactive" is used (i.e., "interactive audiovisual technology"). However, such suggestions clearly require field buy-in.
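As an illustration of what standardized reporting along such a continuum could look like, the sketch below encodes a few technology attributes and maps them to a reporting label. This is a hypothetical scheme, not an established taxonomy: the class name, field names, decision rules, and all label strings other than "low-immersion audio-visual technology" and "interactive audiovisual technology" are assumptions introduced here for illustration.

```python
from dataclasses import dataclass

# Display categories assumed for this sketch only.
VALID_DISPLAYS = {"screen", "projection", "hmd"}


@dataclass(frozen=True)
class TechnologyReport:
    """Hypothetical minimal record of the technology used in a study."""
    display: str            # "screen", "projection", or "hmd"
    motion_tracking: bool   # participant movement is tracked
    haptic_feedback: bool   # haptic feedback is provided
    interactive: bool       # participant input changes what is shown


def reporting_label(report: TechnologyReport) -> str:
    """Map a technology record to an illustrative reporting label."""
    if report.display not in VALID_DISPLAYS:
        raise ValueError(f"unknown display type: {report.display!r}")
    if report.display == "hmd":
        if report.motion_tracking and report.haptic_feedback:
            return "high-immersion VR (HMD, motion tracking, haptic feedback)"
        if report.motion_tracking:
            return "high-immersion VR (HMD, motion tracking)"
        return "immersive VR (HMD)"
    # Screen or projection with motion tracking (e.g., Kinect-style systems).
    if report.motion_tracking:
        return "reduced-presence VR (motion tracking)"
    if report.interactive:
        return "interactive audiovisual technology"
    return "low-immersion audio-visual technology"
```

For example, `reporting_label(TechnologyReport("screen", False, False, True))` yields "interactive audiovisual technology". Any such scheme would need to be paired with a participant-reported measure of perceived immersion, since the hardware category alone does not establish the experience.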

Lack of Theory Informed VR Application
One of the key findings, that the current evidence base is underpinned by varying theories or, in many cases, is not driven by theory-guided study design (33% of studies), has important implications. The lack of theory-based research prevents mechanistic understanding of the VR strategy; that is, what did the VR aim to target during exercise and was this successful? Such information is key to any ability to personalize VR and exercise prescription. Additionally, interpretation of study findings becomes difficult because it is unclear what aspect of the VR process did or did not work. For example, Zeng et al. (2017) found cycling with VR to be more enjoyable than cycling without VR. With no explicit underlying theory provided, it is uncertain what aspects of VR were important and what was impacted in participants to alter their exercise experience. One theory that could have been used is attentional focus (Table 6). Theory-driven study design would ensure that proposed variables of interest (i.e., internal vs. external attentional focus) were assessed, increasing confidence in the results and in the proposed mechanism. For example, during VR cycling did participants have a greater external focus vs. a greater internal bodily focus for the non-VR conditions? Understanding what underlies improvements to enjoyment during exercise can then help guide future research aiming to maximize those improvements and inform nuanced extension to clinical populations (e.g., targeting people with high levels of internal bodily focus).
Theoretical perspectives within VR research are also critical to assist in understanding apparent inconsistencies in findings. For example, both Murray et al. (2016) and Shaw et al. (2016) investigated using virtual competitive agents and cooperative or ghost agents/trainers to influence enjoyment during exercise. While underpinned by similar theories (Shaw et al., 2016 by social facilitation theory and Murray et al., 2016 by the Köhler motivational gain effect), testing at similar exercise intensities, and using comparable competitive agents/trainer settings (individualized to the participant's effort), their findings differed. Specifically, Murray et al. (2016) found a benefit in using virtual agents (vs. no VR) during maximal intensity rowing, whereas Shaw et al. (2016) found no benefit of using a competitive virtual trainer during moderate-vigorous intensity cycling. Because both studies used similar underlying theories, the differential findings can be discussed meaningfully with differences in methodology considered. For example, the competitive manipulation that Murray et al. (2016) used was extensive and included instructions to participants that they would row against a real teammate, enhanced by a deceptive phone call with a mock teammate. This manipulation likely influenced the feeling of having "real" competition, potentially enhancing participant motivation and consequent enjoyment. This is in contrast to Shaw et al. (2016), where participants were aware that they were playing against a computer programmed agent/trainer. These findings may indicate that when exercising at a high intensity, participants may require realistic competition; that is, if they know that their competition is a real person, with the ability to judge their performance, motivation could be more positively impacted. If the studies had not used similar underlying theories (or did not report them), consideration of methodological differences would be largely speculative.
Theoretical frameworks underpinning the perceptual experiences during exercise are essential to guide rigorous evaluation, as they have a clear impact on the type of exercise chosen, the outcome measures used (including process or mechanistic outcomes), and overall study design. For example, Jones and Ekkekakis (2019) investigated different levels of immersive technology in overweight adults, underpinned by attentional focus (Broadbent, 1957; Nideffer, 1976; Wininger and Gieske, 2010) and the dual-mode theory (DMT) of affect (Ekkekakis, 2003). In line with DMT, the exercise intensity was prescribed at ventilatory threshold, the intensity at which interoceptive cues start to become more salient and dominate cognitive appraisal (Ekkekakis, 2003). As this intensity of exercise has the most affect-related inter-individuality, it is of interest to investigate if higher levels of immersion compete with interoceptive cues, preserving a more positive affect for longer. Further, use of an underpinning theory directed the study methodology toward measures specific to the constructs of the theory. By adopting measures of affect (feeling scale and felt arousal scale), as well as objective measures of the hemodynamic response of the right dorsolateral prefrontal cortex (Radel et al., 2018), an area of the brain associated with cognitive control and selective attention (Sarter et al., 2001), the study findings were contextualized within the theoretical model, providing a clear and concise test of the theory and application of VR in the exercising context. That is, using HMD VR results in greater dissociation from interoceptive cues and an improved affective response at ventilatory threshold prescribed exercise.
Taken together, use of theoretical frameworks to underpin well-structured, rigorous studies is needed within VR-enhanced exercise research. These studies are essential to enable a more robust understanding of what does and does not work to build motivation, affect, enjoyment, and engagement during exercise.

Limitations in Outcome Measures for Exercise Engagement During Exercise
Our results highlighted that exercise engagement during exercise, as used in VR studies, was a largely ambiguous construct with numerous (and varied) proxy measures. Measures of engagement ranged from short-term attendance rates to physiological measures (HR, energy expenditure, joint ROM) and objective output measures such as power output and cadence. Engagement is typically defined as "appraisals and coping that promote effortful striving directed toward task goals" (Matthews et al., 2002, p. 335), which can correspond to a proxy measure that reflects a motive-driven change in action. However, use of proxy measures can be problematic, particularly when multiple variables influence an outcome. For example, using HR as a proxy for engagement during exercise is difficult because it can be influenced not only by increased engagement (working harder) but also by factors such as anxiety. To reduce these challenges, we propose that engagement measures during exercise should be underpinned by a relevant theory such that the putative mediators of engagement can be quantified. For example, when considering adherence or attendance rates, it is important to understand psychosocial factors that may affect the individual, such as socioeconomic status or distance to the gym, as these can influence the engagement measure.

Limitations of the Review
It is possible that some relevant studies were missed and that this could have impacted the review. All efforts were made to minimize this risk, including the use of two independent reviewers, searching of several pre-determined relevant databases, and a consultation with an academic librarian when designing search strategies. This review is also limited to those studies evaluating various outcomes during VR exercise. It does not include the substantial literature related to the choice to engage with (and what underpins the motivation to engage with) certain VR technologies for exercise (Yim and Graham, 2007; Knaving et al., 2015), which has clear relevance for future research directions (see below). It is important to note that VR strategies that may positively influence experiences during exercise may not necessarily positively influence the motivation to engage with VR technology. Indeed, there is extensive work exploring the importance of virtual avatars and their features in promoting engagement with human computer interfaces (Seinfeld et al., 2020).
Due to heterogeneity in the field, clear recommendations of benefit specific to the type of VR and exercise outcome are premature; however, this is an important finding in and of itself, as it can also highlight key future research directions. Additionally, although we grouped studies according to immersion and virtual avatars and agents, we acknowledge that several other strategies could have been used to highlight the current state of the evidence. Last, being a scoping review, we included all available literature from peer-reviewed indexed journals regardless of the study quality or potential risk of bias. While this is beneficial in providing a broader exploration of the current literature, it does not allow for weighting based on the quality of the included studies.

Future Directions
Our findings provide insight into factors worthy of consideration in commercial software design. While the VR systems used by included studies were often commercially available technology (such as the Oculus Rift HMD, PlayStation VR or the Samsung Gear VR), most software programs were custom-built for each study. Thus, given that our findings largely support that high immersive VR holds most promise in enhancing experiences during exercise, commercial software design would be well-placed to incorporate possibilities for high immersion sensory features and competitive agents. Additionally, the consideration of using commercially available apps for research may hold considerable benefits for researchers and consequent end-users.
Commercially available apps are becoming more commonplace, and many provide the ability to manipulate or alter aspects of the VR experience. For example, Zwift (Zwift Inc. 2014; California, USA) is a screen-based program that responds in real-time to a stationary bike trainer, and the Zwift software incorporates several features that were investigated in our included studies (such as reward systems, performance feedback, and competitive agents). Consideration of available apps for VR research may reduce the time and cost required to create bespoke VR, as well as increasing scalability of research findings.
It is important to acknowledge that changes in physical activity behavior are influenced not only by factors and experiences that occur during the exercise experience, but also by factors that influence the motivation to undertake such an exercise experience with VR in the first place. Past work in games (Yim and Graham, 2007), gamification, human computer interface [including personalized exercise data, such as fitness trackers and activity recognition, aka the quantified self (Meyer et al., 2014; van Berkel et al., 2015)], and collaborative technology (such as Wii) has shown clear benefits of these technologies to motivate individuals to engage with that technology exercise modality. While beyond the scope of this review to fully discuss, such information about the willingness to engage with a particular technology is critically important to consider when designing VR exercise interventions aimed at behavioral change (Peters et al., 2018), because evidence of better outcomes while exercising does not automatically mean that individuals will choose to undertake that VR exercise. Previous research looking at the use of other (non-VR) human computer interfaces (HCIs) in recreational running athletes (Knaving et al., 2015) and general computer gaming interaction (Rogers, 2017) has also provided important insight into motivation and enjoyment that may be beneficial in future VR investigation. For example, Knaving et al. (2015) used data from recreational runners who were using motivational HCIs to derive nine specific guidelines for future technology development. These guidelines included allowing the user to set personal and social goals, considering factors that impact user dissociation and association from the exercise, and adopting strategies that foster a user's internal motivations. Such inclusion from this field may provide novel design strategies that are worthy of investigation in VR.
In addition to consideration of findings from related fields, nuanced testing of VR designs to comprehensively capture the experience of relevant test populations seems warranted to guide VR enhancement. Recent work evaluating the use of VR in gait training (Hamzeheinejad et al., 2019) used several novel assessments that may have utility for future research consideration, including: measurements of temporal demands and performance; user experience measures, such as perspicuity (clearness of VR), novelty, stimulation, efficiency, and dependability [via the User Experience Questionnaire (Laugwitz et al., 2008)]; and perceived value and pressure. In this study, such evaluation allowed determination that the virtual trainer was not beneficial to user experience and helped guide necessary future changes (e.g., making the virtual trainer more complex to capture interactivity, communication, and socio-motivational aspects inherent to a "real-life" trainer) (Hamzeheinejad et al., 2019). Last, relevant to evaluation of virtual avatars/agents, clear delineation between virtual avatars that are embodied and virtual agents that are interacted with is critical for future research, given that the mechanisms by which these might be hypothesized to influence experiences during exercise are unlikely to be the same. Future research should not use these terms interchangeably.

CONCLUSION
Current evidence for the use of VR to influence motivation, affect, enjoyment, and engagement during exercise is limited, heterogeneous, and has primarily been undertaken in healthy populations. Further, although there were often positive findings when high immersive VR was used, the majority of studies used low-immersive VR, suggesting a significant gap in the current literature. The current evidence base is insufficient to provide definitive recommendations for specific VR use to shape experiences during exercise. Despite many studies reporting null results, specificity of effect of VR on exercise outcome may be present, with use of immersion most promising for enjoyment during exercise, and use of virtual competitive agents most promising for improving motivation and engagement during exercise. Evidence is conflicting for influencing affect during exercise. Taken together, future research is warranted that includes purposeful integration of exercise-relevant theories into VR investigation, and careful consideration and standardization of VR definitions (including high- vs. low-immersive), software possibilities, and nuanced extension to clinical populations.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.