ASSESSING THE THERAPEUTIC USES AND EFFECTIVENESS OF VIRTUAL REALITY, AUGMENTED REALITY AND VIDEO GAMES FOR EMOTION REGULATION AND STRESS MANAGEMENT

EDITED BY: Federica Pallavicini and Stéphane Bouchard

PUBLISHED IN: Frontiers in Psychology

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-413-2 DOI 10.3389/978-2-88963-413-2

# ASSESSING THE THERAPEUTIC USES AND EFFECTIVENESS OF VIRTUAL REALITY, AUGMENTED REALITY AND VIDEO GAMES FOR EMOTION REGULATION AND STRESS MANAGEMENT

Topic Editors: Federica Pallavicini, University of Milano, Italy; Stéphane Bouchard, Université du Québec en Outaouais, Canada

Citation: Pallavicini, F., Bouchard, S., eds. (2020). Assessing the Therapeutic Uses and Effectiveness of Virtual Reality, Augmented Reality and Video Games for Emotion Regulation and Stress Management. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-413-2

# Table of Contents

*Editorial: Assessing the Therapeutic Uses and Effectiveness of Virtual Reality, Augmented Reality and Video Games for Emotion Regulation and Stress Management*

Federica Pallavicini and Stéphane Bouchard

*Virtual Reality for Anxiety Reduction Demonstrated by Quantitative EEG: A Pilot Study*

Jeff Tarrant, Jeremy Viczko and Hannah Cope

*Video Games for Well-Being: A Systematic Review on the Application of Computer Games for Cognitive and Emotional Training in the Adult Population*

Federica Pallavicini, Ambra Ferrari and Fabrizia Mantovani


Barbara Atzori, Hunter G. Hoffman, Laura Vagnoli, David R. Patterson, Wadee Alhalabi, Andrea Messeri and Rosapia Lauro Grotto

*Causal Interactive Links Between Presence and Fear in Virtual Reality Height Exposure*

Daniel Gromer, Max Reinke, Isabel Christner and Paul Pauli


Silvia Francesca Maria Pizzoli, Ketti Mazzocco, Stefano Triberti, Dario Monzani, Mariano Luis Alcañiz Raya and Gabriella Pravettoni


Theresa F. Wechsler, Franziska Kümpers and Andreas Mühlberger

# Editorial: Assessing the Therapeutic Uses and Effectiveness of Virtual Reality, Augmented Reality and Video Games for Emotion Regulation and Stress Management

#### Federica Pallavicini <sup>1</sup> \* and Stéphane Bouchard<sup>2</sup>

<sup>1</sup> Department of Human Sciences for Education "Riccardo Massa", University of Milano Bicocca, Milan, Italy, <sup>2</sup> Département de Psychoéducation et de Psychologie, Université du Québec en Outaouais, Gatineau, QC, Canada

Keywords: virtual reality, emotion, emotion regulation, video game, stress, stress management

#### **Editorial on the Research Topic**

#### **Assessing the Therapeutic Uses and Effectiveness of Virtual Reality, Augmented Reality and Video Games for Emotion Regulation and Stress Management**

Virtual reality (VR), augmented reality (AR), and video games (VGs), because of their reasonable cost and increasing diffusion among the public, are becoming very interesting and promising therapeutic approaches for improving individuals' health and well-being (e.g., Granic et al., 2014; Giglioli et al., 2015; Riva et al., 2016; Hemenover and Bowman, 2018).

However, although numerous scientific studies have demonstrated the therapeutic benefits of these technologies for diverse cognitive functions and individual patients (e.g., Parsons et al., 2017; Bediou et al., 2018; Bouchard and Rizzo, 2019; Riva et al., 2019), less attention has been devoted to the therapeutic use of such tools for the assessment and training of emotion regulation and stress management skills.

Therefore, we brought together within this Research Topic contributions from researchers investigating theoretical, empirical, experimental, and case studies of VR, AR, and VGs for emotion regulation and stress management assessment and training. This Editorial will provide an overview of the articles accepted for publication in the Research Topic.

### VR AND VGs CONTENT AND APPLICATION FOR EMOTION REGULATION AND STRESS MANAGEMENT

Pizzoli et al. address the important question of how to build effective VR content to promote relaxation and decrease stress. The authors explore a new theoretical approach based on VR with personalized content, grounded in user research to identify important life events and in the rendering of such events with symbols, activities, or other virtual environment content. According to the authors, such an approach may produce more sophisticated and long-lasting relaxation in users.

Lindner et al. explore the potential of consumer-targeted VR relaxation applications for widespread dissemination. In their study, they analyze "real-world" aggregated uptake, usage, and application performance statistics from a first-generation consumer-targeted VR relaxation application (i.e., the Happy Place) which has been publicly available for almost 2 years. According to their findings, user engagement in particular needs to be addressed in the early stages of development by including features that promote prolonged and recurrent use (e.g., gamification elements).

Edited and reviewed by: Anton Nijholt, University of Twente, Netherlands

> \*Correspondence: Federica Pallavicini federica.pallavicini@unimib.it

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 30 October 2019 Accepted: 25 November 2019 Published: 11 December 2019

#### Citation:

Pallavicini F and Bouchard S (2019) Editorial: Assessing the Therapeutic Uses and Effectiveness of Virtual Reality, Augmented Reality and Video Games for Emotion Regulation and Stress Management. Front. Psychol. 10:2763. doi: 10.3389/fpsyg.2019.02763

Pallavicini et al. describe results of a systematic review on the impact of video game training on emotional regulation and stress management skills, as well as on cognitive abilities (i.e., processing and reaction times, memory, task-switching/multitasking, and mental spatial rotation), in the healthy adult population. According to the results, non-commercial video games as well as commercial video games can be effective in inducing positive emotions and in reducing individual levels of stress in healthy adults. Furthermore, results showed that the number of studies conducted on video game training for emotional regulation and stress management skills (i.e., 5) is still smaller than the number of studies related to cognitive training (i.e., 30).

### APPLICATIONS OF VR FOR EMOTION REGULATION AND STRESS MANAGEMENT ASSESSMENT AND TRAINING IN MENTAL HEALTH CONTEXTS

Wechsler et al. report results of a systematic review comparing VR and in vivo exposure for the treatment of phobic anxiety disorders, restricted to studies that applied an equivalent amount of exposure in both conditions. According to the 9 studies included, VR exposure shows a higher potential and is not less effective than in vivo exposure in specific phobia and agoraphobia.

Tarrant et al. carried out a pilot study aimed to examine changes in brain patterns associated with the use of VR for anxiety management in people with generalized anxiety disorder (GAD). The study involved 14 patients suffering from GAD. Results showed that both a quiet rest control condition and the VR meditation significantly reduced subjective reports of anxiety. However, the VR intervention uniquely resulted in physiological reduction of anxiety.

Atzori et al. describe results of a randomized controlled trial that tested the effectiveness of VR as a distraction technique to help control pain in children and adolescents undergoing venipuncture. Fifteen patients suffering from oncological or hematological diseases were randomly assigned to the VR (i.e., SnowBall) or the no-VR control condition (i.e., standard of care). Results showed that VR was more effective in distracting patients during venipuncture and that it elicited more positive emotions than the traditional distraction technique.

The same authors report results of a pilot study aimed to evaluate the feasibility and effectiveness of immersive VR as an attention distraction analgesia technique for pain management in children and adolescents undergoing painful dental procedures. Five patients received tethered immersive interactive VR distraction during one dental procedure. On a different visit to the same dentist (e.g., 1 week later), each patient also received a comparable dental procedure during the control condition "treatment as usual." Findings showed that patients reported significantly lower "worst pain" and "pain unpleasantness," and had significantly more fun in the VR condition, compared to a comparable dental procedure without VR.

Kip et al. describe results of a study aimed to explore why and in what way VR can be of added value for the treatment of forensic psychiatric patients. Based on the results of semi-structured interviews conducted with 8 therapists and 4 patients, 6 scenarios about possibilities for using VR in treatment were created and presented to 89 therapists and 19 patients in an online questionnaire. According to the analysis of the resulting qualitative data, VR offers a broad range of possibilities for forensic mental health, including the training of emotion regulation skills (i.e., coping skills that support the patient in not giving in to impulses when confronted with difficult, emotion-eliciting situations or stimuli).

Cebolla et al. report results of a randomized controlled trial aimed to investigate the efficacy of a self-compassion meditation procedure based on The Machine to Be Another (TMTBA) system, which uses multi-sensory stimulation to induce a body swap illusion, in increasing positive affect states, mindful self-care, and adherence to cognitive behavioral interventions (CBIs). Sixteen participants were randomly assigned to two conditions: TMTBA-VR and usual meditation procedure (CAU). Results showed that after 2 weeks, both conditions showed a similar frequency of meditation practice and increases in specific types of self-care behaviors, with the frequency of clinical self-care behaviors being significantly higher in TMTBA. According to the authors, embodied VR could be an interesting tool to increase the efficacy of CBIs by facilitating the construction of positive and powerful mental images.

### THEORETICAL PERSPECTIVES OF UNDERSTANDING VR IN EMOTION REGULATION AND STRESS MANAGEMENT ASSESSMENT AND TRAINING PROGRAMS

Gromer et al. report results of a study in which they experimentally manipulated presence and fear to unravel the causal link between these responses in VR environments. A sample of 49 fearful participants was immersed in a virtual height situation and a neutral control situation (fear manipulation) with either high or low sensory realism (presence manipulation). Ratings of presence and verbal and physiological (skin conductance, heart rate) fear responses were recorded. Results showed that experiencing emotional responses in a virtual environment leads to a stronger feeling of being there, i.e., increased presence. In contrast, the effects of presence on fear seem to be more complex: on the one hand, increased presence due to the quality of the virtual environment did not influence fear; on the other hand, presence variability that likely stemmed from differences in user characteristics did predict later fear responses.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## REFERENCES


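Riva, G., Baños, R. M., Botella, C., Mantovani, F., and Gaggioli, A. (2016). Transforming experience: the potential of augmented reality and virtual reality for enhancing personal and clinical change. Front. Psychiatry 7:164. doi: 10.3389/fpsyt.2016.00164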

Riva, G., Wiederhold, B. K., and Mantovani, F. (2019). Neuroscience of virtual reality: from virtual exposure to embodied medicine. Cyberpsychol. Behav. Soc. Netw. 22, 82–96. doi: 10.1089/cyber.2017.29099.gri

**Conflict of Interest:** SB is a consultant to and owns equity in Cliniques et Développement In Virtuo, a spin-off from the university that uses virtual reality as part of its clinical services and distributes virtual environments. The terms of these arrangements have been reviewed and approved by the Université du Québec en Outaouais in accordance with its conflict of interest policies.

The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Pallavicini and Bouchard. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Virtual Reality for Anxiety Reduction Demonstrated by Quantitative EEG: A Pilot Study

#### Jeff Tarrant<sup>1</sup> \*, Jeremy Viczko<sup>2</sup> and Hannah Cope<sup>1</sup>

<sup>1</sup> NeuroMeditation Institute, Corvallis, OR, United States, <sup>2</sup> Department of Psychology, University of Victoria, Victoria, BC, Canada

While previous research has established that virtual reality (VR) can be successfully used in the treatment of anxiety disorders, including phobias and PTSD, no research has examined changes in brain patterns associated with the use of VR for generalized anxiety management. In the current study, we compared a brief nature-based mindfulness VR experience to a resting control condition on anxious participants. Self-reported anxiety symptoms and resting-state EEG were recorded across intervals containing quiet rest or the VR intervention. EEG activity was analyzed as a function of global power shifts in Alpha and Beta activity, and with sLORETA current source density estimates of cingulate cortex regions of interest. Results demonstrated that both a quiet rest control condition and the VR meditation significantly reduced subjective reports of anxiety and increased Alpha power. However, the VR intervention uniquely resulted in shifting proportional power from higher Beta frequencies into lower Beta frequencies, and significantly reduced broadband Beta activity in the anterior cingulate cortex. These effects are consistent with a physiological reduction of anxiety. This pilot study provides preliminary evidence supporting the therapeutic potential of VR for anxiety management and stress reduction programs.

#### Edited by:

Stéphane Bouchard, Université du Québec en Outaouais, Canada

#### Reviewed by:

Xavier Bornas, Universitat de les Illes Balears, Spain; Bruno Herbelin, École Polytechnique Fédérale de Lausanne, Switzerland

> \*Correspondence: Jeff Tarrant Dr.tarrant@hotmail.com

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 27 February 2018 Accepted: 04 July 2018 Published: 24 July 2018

#### Citation:

Tarrant J, Viczko J and Cope H (2018) Virtual Reality for Anxiety Reduction Demonstrated by Quantitative EEG: A Pilot Study. Front. Psychol. 9:1280. doi: 10.3389/fpsyg.2018.01280

Keywords: virtual reality, VR, QEEG, sLORETA, mindfulness, anxiety, GAD, nature

## INTRODUCTION

Anxiety disorders are the most common mental health disorder in the United States. Prevalence estimates from population-based surveys indicate that as much as 1/3 of the population has experienced an anxiety disorder during their lifetime (Bandelow and Michaelis, 2015), resulting in significant loss of productivity, health consequences, and emotional distress (Kessler et al., 2005; Stein et al., 2005; Sareen et al., 2006; Saarni et al., 2007; Bandelow and Michaelis, 2015). Standard treatment options include pharmacotherapy and cognitive-behavioral therapy. While there is good evidence for both of these interventions (Roy-Byrne and Cowley, 2002; Butler et al., 2006; Norton and Price, 2007), it is also clear that only a fraction of those identified as having a diagnosable anxiety disorder receive sufficient treatment (Stein et al., 2011). In addition, there are large numbers of people suffering with undiagnosed anxiety. For example, in a review of epidemiological studies with a total of 48,214 participants, it was found that the prevalence for subthreshold generalized anxiety disorder (GAD) was twice that for the full syndrome (Haller et al., 2014). Studies also indicate that subthreshold GAD tends to be persistent and results in significantly more functional impairment, psychotropic medication use, and accessing of health care services than in non-anxious individuals (Haller et al., 2014).

While cognitive-behavioral therapies (CBT) have been identified as the treatment of choice for GAD, many successful CBT protocols include relaxation training as a central component (Tyrer and Baldwin, 2006). In fact, applied relaxation techniques have been shown to have comparable effects to CBT in short term studies (Siev and Chambless, 2007; Cuijpers et al., 2014). Therapeutic approaches that incorporate mindfulness training also appear to show promise as a treatment for GAD (Wetherell et al., 2011; Hoge et al., 2013).

Given the number of people affected, it seems a logical step to explore the potential of accessible, user-friendly, engaging technologies to assist in this treatment process. This seems particularly relevant for the teaching and provision of applied relaxation and/or mindfulness interventions. Recent evidence suggests that immersive technologies, such as virtual reality (VR), when applied in a specific therapeutic context, may be an appropriate candidate for this exploration (Maples-Keller et al., 2017).

A review of VR research shows that this modality has been successfully used in the treatment of a variety of anxiety disorders, including phobias and PTSD (Motraghi et al., 2014; Morina et al., 2015). These therapeutic applications have been conducted in the context of an ongoing therapeutic relationship using scenes created in a VR environment, thus becoming sophisticated additions to the traditional model of exposure therapy. In part, the success of these programs appears to be based on the understanding that immersive environments, such as those provided in VR, can generate strong feelings of "presence" (Waterworth et al., 2010; Riva et al., 2011; Riva and Waterworth, 2014; Waterworth and Riva, 2014). "Presence," in this context, is defined as the subjective feeling of being in another place and is a crucial element in exposure-based therapies. Because VR is immersive, it should not be surprising that this format can provide more presence than 2-dimensional scenes. For example, one recent study found that 360-degree video generated more intense feelings of awe than standard 2-dimensional videos (Chirico et al., 2017).

While our understanding of immersive technologies in the treatment of some anxiety disorders is advancing, only one study to date has explored the use of VR in the treatment of Generalized Anxiety Disorders (GAD). Gorini et al. (2010) compared the impact of using VR with biofeedback, VR without biofeedback, and a wait list control group as an intervention for persons diagnosed with GAD. The VR experiences used in this study were combined with additional therapeutic techniques over the course of 8 weeks. The VR experience itself involved the client exploring a tropical island leading to either a campfire, beach, or waterfall. The experience included an audio narrative of a progressive muscle relaxation technique and/or autogenic techniques. The biofeedback group was able to use heart rate measurements to influence fire intensity, or movement of water. Clients were provided with a simplified version of the experiences using a mobile phone for home practice (Gorini et al., 2010). Pre–post analyses indicated that both experimental groups demonstrated improved clinical outcomes at the end of the treatment period. Physiological measures indicated a tendency toward decreased heart rate and galvanic skin response between the pre- and post-session measurements with the biofeedback group showing slightly larger improvements.

While this study suggests that VR may be a useful tool in the treatment of GAD, it had significant limitations. The sample size of this study was quite small, with one of the treatment groups only having four participants, thus limiting confidence in the results. Additionally, this study included numerous uncontrolled variables, making it difficult to know which aspects of the VR or VR plus biofeedback experience were related to the results. For example, participants focused on various aspects of the VR experience every 2 sessions, making it difficult to know which elements were most effective. In short, much more research is needed in this area.

To examine the potential use of VR as an accessible intervention for generalized anxiety, we thought it was important to examine the impact of a single exposure on anxiety levels. In this study, we examined the impact of a VR experience designed by StoryUp VR which includes several components designed to decrease anxiety. Specifically, the design elements were created based on previous research indicating that both exposure to nature, and mindfulness practices can aid in relaxation and anxiety reduction.

Gorini et al. (2010) used a nature-based VR experience in their study of GAD and obtained positive results. This is consistent with research showing that exposure to nature reliably reduces the stress response. This finding has been reported across multiple studies and was even present when nature was presented in the form of plants, posters, slides, videos, etc. These changes in the stress response have been quantified through a variety of physiological monitoring techniques including muscle tension, skin conductance, pulse transit time, cardiac response, and hormone levels (Berto, 2014). Studies examining EEG changes in response to nature have demonstrated increases in cortical Alpha amplitude (associated with a relaxation response) when viewing slides of natural landscapes versus urban scenes (Ulrich, 1981), when viewing plants with flowers versus pots without flowers (Nakamura and Fujii, 1990), and when watching a green space versus a concrete block fence (Nakamura and Fujii, 1992).

There is also growing evidence that mindfulness-based practices can result in reduced stress and anxiety. In a review and meta-analysis of meditation programs, Goyal et al. (2014) found that mindfulness meditation programs demonstrated moderate evidence as an intervention for anxiety. Perhaps because of this understanding, a recent review on the use of VR technology in the treatment of anxiety specifically noted that incorporating mindfulness exercises into a VR experience could be a potentially helpful intervention in the treatment of GAD (Maples-Keller et al., 2017).

The EEG patterns most associated with stress and anxiety are increased fast wave activity (Beta) and decreased slow wave activity (e.g., Alpha; Price and Budzynski, 2009; Olbrich et al., 2011). Thompson and Thompson (2007) noted that anxiety is generally associated with an increase of 19–22 Hz activity found in conjunction with a decrease of 15–18 Hz activity. Obsessive worry is connected to excessive Beta activity along the midline and at electrode site Cz (Hammond, 2005b). Using LORETA (Low Resolution Electromagnetic Tomography) analyses, Sherlin (2009) indicated that the most common pattern associated with anxiety is excessive Beta in the anterior cingulate or midline cortex.
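To make the Beta sub-band pattern described above concrete, the following is a minimal sketch of how the ratio of 19–22 Hz power to 15–18 Hz power could be computed for a single channel. It is an illustration only, not the analysis pipeline used in this study: it assumes a plain FFT periodogram, a synthetic two-sine signal, and hypothetical helper names (`band_power`, `beta_ratio`).

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    """Sum of periodogram power for frequencies in [lo, hi] Hz."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2
    mask = (freqs >= lo) & (freqs <= hi)
    return power[mask].sum()

def beta_ratio(signal, fs):
    """Ratio of higher-Beta (19-22 Hz) to lower-Beta (15-18 Hz) power."""
    high = band_power(signal, fs, 19.0, 22.0)
    low = band_power(signal, fs, 15.0, 18.0)
    return high / low

# Synthetic single-channel signals (assumed values, for illustration only):
# 4 s sampled at 256 Hz, so 16 Hz and 20 Hz fall exactly on FFT bins.
fs = 256
t = np.arange(0, 4, 1 / fs)
calm = 2.0 * np.sin(2 * np.pi * 16 * t) + 1.0 * np.sin(2 * np.pi * 20 * t)
tense = 1.0 * np.sin(2 * np.pi * 16 * t) + 2.0 * np.sin(2 * np.pi * 20 * t)

print(beta_ratio(calm, fs), beta_ratio(tense, fs))
```

Because power scales with the square of amplitude, the "calm" signal yields a ratio near 0.25 and the "tense" signal a ratio near 4. Real EEG would additionally require artifact rejection, windowing, and averaging across epochs, none of which is shown here.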

As noted by Sherlin and Wyckoff (2010), it is logical to assume that the same brain regions aroused by anxiety would show quieting patterns during relaxation and meditation. In fact, one of the most common forms of neurofeedback for anxiety treatment involves Alpha training (Moore, 2000; Hammond, 2005a,b). Increasing Alpha tends to result in subduing higher frequency activity and cortical overexcitability across the cortex, much the way training to increase Beta results in increases in higher frequency cortical activation. These patterns are seen in other research relevant to the current study. For example, increases in Alpha power are associated with lower levels of anxiety, increased calmness, positive affect, and a range of other autonomic changes associated with decreased sympathetic arousal (Cahn and Polich, 2006).

Multiple studies have identified specific regions of the brain associated with stress and anxiety. Most notably, the cingulate gyrus is thought to play a significant role in the regulation of nervous system arousal (Critchley et al., 2003). The cingulate gyrus runs down the midline of the brain, immediately superior to the corpus callosum. This region has been shown to become activated during the experience of pain (Vogt et al., 1996), negative emotions and memories (Maddock et al., 2003a,b), and anxiety (Fredrikson et al., 1997; Simpson et al., 2001; Lanius et al., 2003). To examine this region, we used sLORETA (Pascual-Marqui, 2002) analyses to examine current source density (CSD) estimates at the anterior and posterior cingulate as these specific areas contribute important and distinct elements to the experience of anxiety.

The current study investigated the impact of a mindfulness-in-nature VR intervention on persons screened as demonstrating moderate to high levels of generalized anxiety. Changes in anxiety were assessed through self-report questionnaires (STAI-state) as well as EEG patterns associated with anxiety and/or relaxation. We employed a mixed model repeated-measure approach, whereby EEG recordings and state anxiety ratings were made across three time points spanning two 5-min intervals, either containing the meditative VR experience or 5 min of awake, eyes-open rest. For the intervention group, the measurement and interval sequence consisted of an initial EEG baseline, followed by rest, followed by post-rest EEG and state anxiety measurements, followed by the interval of the VR meditation experience, followed by a final EEG recording and state anxiety measurement. The control group followed the same recording procedures and intervals, with the exception that they participated in a further interval of quiet rest, rather than the VR meditation. We hypothesized that the meditative VR experience, more than rest alone, would result in a significant reduction in reported state levels of anxiety. Furthermore, we hypothesized that such reductions in state anxiety would be accompanied by equally significant changes in overall EEG activity. More specifically, we anticipated a significant drop in anxiety to be associated with a reduction in the amount of high frequency activity in favor of power in lower frequency ranges.

### MATERIALS AND METHODS

### Participants

Participants were recruited in two separate rounds through flyers and marketing on Facebook®. All participants in the initial subject pool were part of the intervention group. Following preliminary analyses, it was determined that a control group was necessary to better interpret the results. Consequently, the same procedures were repeated for a second group of participants that served as the control group.

Interested participants called the researcher and completed a phone screening consisting of exclusion/inclusion criteria as well as a Generalized Anxiety Disorder screening (GAD-7). Exclusion criteria included a history of head injury, seizure activity, or major mental health concerns (schizophrenia and bipolar disorder). To be included in the study, participants had to be at least 18 years old with a moderate level of generalized anxiety (score of 8 or higher on the GAD-7; intervention M = 12.6, SD = 3.7; control M = 12.0, SD = 4.14). A total of 47 respondents were phone screened for the intervention group; 26 were eligible to participate and 21 completed the study. Of the five who did not complete the study, three canceled and two did not show for their appointments. The data sets of seven participants were removed due to excessive EEG artifact or incomplete data resulting from researcher error, leaving 14 participants for analysis (N = 14). Eighteen respondents were screened for the control group; 5 did not qualify due to GAD-7 scores lower than 8, and one of those five also had a traumatic brain injury. Of the 13 control subjects who completed the study, one was removed due to excessive artifact (N = 12). For demographic information and participants' previous experience with meditative practices, see **Table 1**. The study was performed at the NeuroMeditation Institute, LLC in Corvallis, OR, United States. It was approved by the Quorum Institutional Review Board, Seattle, WA, United States.
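The participant counts reported above can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid; the `Flow` class and its field names are hypothetical and are not part of the study materials.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    """Participant flow for one group (field names are illustrative)."""
    screened: int
    completed: int
    removed: int  # data sets dropped after completion (artifact or incomplete data)

    @property
    def analyzed(self) -> int:
        """Number of participants whose data entered the analysis."""
        return self.completed - self.removed

# Intervention group: 47 screened, 26 eligible, 5 dropouts (3 canceled, 2 no-shows).
eligible_intervention = 26
dropouts_intervention = 3 + 2
intervention = Flow(
    screened=47,
    completed=eligible_intervention - dropouts_intervention,
    removed=7,
)

# Control group: 18 screened, 5 did not qualify, 1 removed for artifact.
control = Flow(screened=18, completed=18 - 5, removed=1)

print(intervention.completed, intervention.analyzed)  # completed and analyzed counts
print(control.completed, control.analyzed)
```

The computed values match the figures given in the text: 21 completers and 14 analyzed in the intervention group, 13 completers and 12 analyzed in the control group.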

#### Measures

#### Demographic Questionnaire

This questionnaire asked subjects to identify information related to their sex, age, race, education level, experience with meditative practices and history of mental illness.

#### Generalized Anxiety Disorder-7 (GAD-7)

This 7-item self-report scale asks subjects to identify how much they were bothered by each of 7 symptoms during the previous 2 weeks. Response options include "not at all," "several days," "more than half the days," and "nearly every day," scored 0, 1, 2, and 3, respectively. Total score ranges from 0 to 21, with higher scores indicating higher anxiety and a higher likelihood of meeting criteria for GAD. A score of 8 has been shown to have a 92% sensitivity and a 76% specificity in relation to a diagnosis of GAD (Spitzer et al., 2006). This was the cutoff utilized in the current study. In addition, the GAD-7 has previously demonstrated an internal consistency of 0.92 and test-retest reliability of 0.83 (Spitzer et al., 2006).

<sup>a</sup>Examples of meditative practices in questionnaire included: seated meditation, yoga, qigong, chanting, prayer or other practices. EX, Participants in VR meditation experience group; CN, Resting control group.
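The GAD-7 scoring rule and inclusion cutoff described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example respondent are hypothetical and not part of the study materials.

```python
# Response options and their scores, as described in the text.
GAD7_RESPONSE_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}

INCLUSION_CUTOFF = 8  # cutoff used for inclusion in the study


def score_gad7(responses):
    """Sum the seven item scores; valid totals range from 0 to 21."""
    if len(responses) != 7:
        raise ValueError("GAD-7 requires exactly 7 item responses")
    return sum(GAD7_RESPONSE_SCORES[r] for r in responses)


def meets_inclusion(total, cutoff=INCLUSION_CUTOFF):
    """True when the total score is at or above the cutoff."""
    return total >= cutoff


# Hypothetical respondent (illustrative data only).
example = ["several days"] * 4 + ["more than half the days"] * 3
total = score_gad7(example)
eligible = meets_inclusion(total)
```

Here the hypothetical respondent scores 4 × 1 + 3 × 2 = 10, which is above the cutoff of 8.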

#### State-Trait Anxiety Inventory-Y (STAI)

The STAI is a commonly used measure of trait and state anxiety. Form Y has 20 items for assessing trait anxiety and 20 for state anxiety. Only the state portion of the survey was used in this study as this segment was designed to measure more immediate symptoms of anxiety. State items include: "I am tense;" "I am worried;" and "I feel calm." All items are rated on a 4-point scale (e.g., from "not at all" to "very much so"). Higher scores indicate greater anxiety. Internal consistency coefficients for the scale have ranged from 0.86 to 0.95; test–retest reliability estimates have ranged from 0.65 to 0.75 over a 2-month interval (Spielberger et al., 1983).

#### EEG Data Collection

The EEG data in this study were sampled with 19 electrodes in the standard International 10–20 placement referenced to linked ears. Electrode sites corresponded to Fp1, Fp2, F3, F4, F7, F8, Fz, C3, C4, Cz, P3, P4, Pz, T3, T4, T5, T6, O1, and O2. **Figure 1** illustrates the experimental design and EEG recording intervals. While all the rest intervals consisted of quiet, eyes-open rest, the EEG recordings were conducted during eyes-closed rest. Five minutes of eyes-closed resting data were collected at three recording time points for each subject: Time 1 (Baseline), Time 2 (2nd Baseline), Time 3 (Post Experimental Condition). Between the Time 1 and Time 2 recordings, all subjects were instructed to sit in a state of quiet, natural, eyes-open rest (blinking allowed) for 5 min. This served as a within-subjects control condition for the intervention group. Each raw EEG file was uploaded to Qeeg Pro (QEEG Professionals, The Netherlands) and processed through a Standardized Artifact Rejection Algorithm (S.A.R.A). This process removes segments from an EEG recording that are likely due to sources other than brain activity, such as eye blinks and muscle tension. Using an automated process such as this ensures that each file is handled in the same manner and reduces the possibility of bias in the artifact removal process. Raw files were then manually inspected. In total, 11.5% of the experimental group and 7.7% of the control group EEG recordings were eliminated due to excessive artifact.

Artifact-free files were then processed through BrainAvatar software (BrainMaster Technologies, Inc.) to obtain power estimates for Alpha (8–12 Hz) and its Alpha1 (8–10 Hz) and Alpha2 (10–12 Hz) sub-bands, Beta (12–30 Hz) and its Low Beta (12–18 Hz) and High Beta (18–30 Hz) sub-bands, and broadband activity (Sum: 1–30 Hz) at all 19 electrode sites. In addition, current source density (CSD) estimates for the same primary EEG bands were obtained for specific regions of interest (ROIs) using the sLORETA algorithm (Pascual-Marqui, 2002) in the BrainAvatar software. sLORETA is the standardized version of LORETA (Low Resolution Electromagnetic Tomographic Analysis). Both LORETA and sLORETA generate solutions to the "inverse" problem of EEG electrophysiology by using a set of scalp EEG measurements to produce estimates of activity measured in CSD. While the accuracy of CSD estimates increases with the number of electrodes (Song et al., 2015), it has been shown that with as few as 16 electrodes (Cohen et al., 1990), and using the approximate three-shell head model provided by the original LORETA equations, "human in vivo localization accuracy of EEG is 10 mm at worst" (Pascual-Marqui, 1999, p. 11). Given that sLORETA provides higher resolution than the previous solution (LORETA), accuracy of localization should be improved (Pascual-Marqui, 2002) and sufficient for the current pilot examination. The specific application utilized in this study uses 5-mm voxels to compute 6,239 voxels which can be combined to correspond to 88 specific brain regions (Collura, 2012).
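To make the band definitions above concrete, the following Python sketch sums a power spectrum into the named bands. It does not reproduce the BrainAvatar computation, which is not described in the text. The half-open edge rule (lower edge included, upper edge excluded), the function name, and the flat test spectrum are all assumptions made for illustration.

```python
# Band edges in Hz, taken from the text.
BANDS = {
    "Alpha": (8, 12),
    "Alpha1": (8, 10),
    "Alpha2": (10, 12),
    "Beta": (12, 30),
    "Low Beta": (12, 18),
    "High Beta": (18, 30),
    "Sum": (1, 30),
}


def band_power(freqs, psd, band):
    """Sum spectral power over bins with lo <= f < hi (assumed edge rule)."""
    lo, hi = BANDS[band]
    return sum(p for f, p in zip(freqs, psd) if lo <= f < hi)


# Hypothetical flat spectrum: 1 unit of power per 1-Hz bin from 1 to 30 Hz.
freqs = list(range(1, 31))
psd = [1.0] * len(freqs)
powers = {name: band_power(freqs, psd, name) for name in BANDS}
```

With half-open edges, each sub-band pair partitions its parent band exactly, so Alpha1 + Alpha2 equals Alpha and Low Beta + High Beta equals Beta. A different edge convention would assign the boundary bins (10, 12, and 18 Hz) differently.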

### Procedures

Importantly, for testing the VR intervention, our initial study approach was a within-subject, rest-then-intervention, experimental design. After more resources became available, the participants for the control group were then recruited, screened, and run, allowing for within- and between-group comparisons. However, the sequential addition of this more rigorous control method resulted in non-random assignment to groups. Although participants in both groups matched closely on all demographic characteristics (see **Table 1**), this is still a notable procedural limitation.

Participating subjects were scheduled for a 75-min office visit. After a verbal description of the study process and completing IRB information/consent forms, subjects completed a demographic questionnaire and the STAI-state questionnaire. Subjects were then provided with a non-therapeutic VR experience to orient them to the experience of VR. During the VR orientation, subjects were seated in a swivel chair in the center of the research room, given basic instruction on the VR headgear (Gear VR powered by a Samsung Android s7 phone), and then given the opportunity to experience a 2 min, 30 s VR event that involved being on the field just prior to the beginning of a college football game. Following this orientation, subjects were fitted with a 19-channel EEG electrocap (Electrocap International, United States). Each electrode was prepped using electrogel conductance paste (Electro-Cap International, Inc., United States). Impedances for all sites were assessed prior to each recording and kept below 10 kΩ. Subjects completed a 5-min, eyes-closed EEG baseline, recorded using a BrainMaster Discovery amplifier (BrainMaster Technologies, Inc., United States). Following the initial baseline recording, subjects were asked to remain seated with eyes open and no verbal interaction for 5 min. After this resting period, a second eyes-closed baseline was recorded for 5 min and subjects were asked to complete a second STAI-state questionnaire. This process provided a within-subject control, allowing us to investigate whether the VR experience results in significantly more change than might be observed simply with the passage of time. Control group subjects then completed a second 5-min, eyes-open resting period, while experimental group subjects returned to the swivel chair and the VR headgear was placed over the electrocap. The subject was then instructed to simply enjoy the mindfulness-in-nature VR experience and follow along with the guided meditation.

The mindfulness-in-nature experience was 5 min, 41 s in length and produced by StoryUp VR (Columbia, MO, United States) using 360° video photography. As the scene opens, there are mountains in the distance and large rocks all around on the landscape (see **Figure 2**). The sky is blue and speckled with clouds and there is a mist rolling in front of the mountains. There is soft piano and violin music playing in the background. Approximately 20 s into the experience, a woman's voice begins guiding the viewer through a mindfulness meditation, directing the attention to elements of the environment and asking them to connect with what they are seeing by imagining that they embody the same qualities as the rocks and sky. Near the end of the scene, an inspirational quote attributed to Lao Tzu (i.e., "Nature does not hurry, yet everything is accomplished") is displayed and the screen slowly dims.

At the conclusion of the second rest period (control group) or the mindfulness-in-nature VR experience (intervention group), the headgear was removed, impedance levels were rechecked, and a final 5-min EEG was recorded using the same instructions as the previous recordings. Following the final EEG recording, subjects completed a final STAI-state questionnaire. All subjects completing the study received a \$25 gift card for their participation.

#### Data Analysis

Both mean power and power band ratios from the average of all 19 electrodes were used to test hypotheses about global spectral state changes across time points. As a pilot study our primary aim was to capture the spectral changes broadly and not tightly coupled to topographical restrictions. We anticipated the whole scalp average approach would be able to capture the predicted changes in the Alpha and Beta bands, and also limit the number of statistical comparisons employed to test our hypotheses. As part of our analyses we also aimed to analyze relative spectral power shifts in participant EEG profiles between power bands, so simple power ratios were also calculated from the total electrode power means. Ratios of interest included each band and sub-band power over Sum power (1–30 Hz) as well as Alpha/Beta, Alpha1/Alpha2, and Low-Beta/High-Beta.
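The ratio calculations described above can be sketched as follows: band power is first averaged across electrodes, and ratios are then formed from those whole-scalp means. The function names and all numeric values are hypothetical, chosen only to illustrate the arithmetic.

```python
def mean_across_electrodes(power_by_electrode):
    """Average one band's power over all electrodes (whole-scalp mean)."""
    values = list(power_by_electrode.values())
    return sum(values) / len(values)


def power_ratios(mean_power):
    """Form the ratios named in the text from whole-scalp band means."""
    total = mean_power["Sum"]
    # Each band and sub-band over broadband (Sum, 1-30 Hz) power.
    ratios = {f"{band}/Sum": p / total
              for band, p in mean_power.items() if band != "Sum"}
    # Between-band and within-band ratios.
    ratios["Alpha/Beta"] = mean_power["Alpha"] / mean_power["Beta"]
    ratios["Alpha1/Alpha2"] = mean_power["Alpha1"] / mean_power["Alpha2"]
    ratios["Low-Beta/High-Beta"] = (
        mean_power["Low-Beta"] / mean_power["High-Beta"]
    )
    return ratios


# Hypothetical whole-scalp means (illustrative values only).
mean_power = {"Alpha": 4.0, "Alpha1": 2.5, "Alpha2": 1.5,
              "Beta": 2.0, "Low-Beta": 1.2, "High-Beta": 0.8, "Sum": 10.0}
ratios = power_ratios(mean_power)
```

Because each ratio is formed from averaged power rather than averaged per-electrode ratios, a single high-power electrode influences the result in proportion to its power. This follows the description in the text of ratios "calculated from the total electrode power means."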

For increased specificity, we divided Alpha (8–12 Hz) and Beta (12–30 Hz) into two sub-components. Alpha1 (8–10 Hz), sometimes referred to as "Low Alpha," is associated with a calm and relaxing state in which we are not attending to the external world (Thompson and Thompson, 2003). In contrast, Alpha2 (10–12 Hz), sometimes referred to as "High Alpha," is related to a state of relaxed alertness, such as might be observed just prior to engaging in an action (Thompson and Thompson, 2003). Activity at the lower end of Beta (12–18 Hz) has been implicated in, and emerges from, the undertaking of challenging cognitive tasks (Sherlin, 2009). It has also been associated with the maintenance of cognitive, psychological, and behavioral processes, with pathological elevations in Beta activity potentially linked to states of cognitive inflexibility (Engel and Fries, 2010). High Beta (18–30 Hz) is most often associated with higher levels of concentration, but is also observed at increased levels during anxiety or periods of emotional intensity (Thompson and Thompson, 2003; Franken et al., 2004; Sherlin, 2009).

Our inclusion of sLORETA ROIs for analysis was also restricted to keep the number of variables to a minimum. We limited the analyses to two ROIs: the Posterior Cingulate Cortex (PCC), a major hub of the default mode network (Shulman et al., 1997; Raichle et al., 2001), and the Anterior Cingulate Cortex (ACC), an area associated with cognitive and emotional processing (Bush et al., 2000; Holroyd and Coles, 2002; Mansouri et al., 2009). Both regions have also been associated with state changes during, and as a result of, meditative experiences (Holzel et al., 2011; van Lutterveld et al., 2016).

Statistical analyses were performed with SPSS software (SPSS v21; IBM Inc., Armonk, NY, United States). A series of 2 × 3 (Group × Time) mixed-model repeated-measures ANOVAs were computed for each EEG frequency band of interest, for both mean power and band power ratios. EEG activity was computed across all 19 electrode sites to incorporate as much EEG data as possible and evaluate changes at the broadest scale. Similarly, 2 × 3 (Group × Time) mixed-model repeated-measures ANOVAs were also used to analyze the sLORETA-estimated current source densities of ACC and PCC activity in the Theta, Alpha, and Beta ranges. However, here we limited analyses to the broad band level (i.e., Theta, Alpha, Beta), and did not include sub-band computations. Huynh–Feldt corrections to degrees of freedom were applied when violations of sphericity across time points were detected. The threshold for evaluating significance was set to α = 0.05. Follow-up one-way ANOVAs were conducted when group or interaction differences were detected, followed by pairwise comparisons, Bonferroni corrected for multiple comparisons.
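As an illustration of the Bonferroni step mentioned above, one common formulation multiplies each pairwise p-value by the number of comparisons and caps the result at 1. The sketch below is not the SPSS procedure itself. The function name and the raw p-values are hypothetical.

```python
ALPHA = 0.05  # significance threshold stated in the text


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three pairwise comparisons
# (e.g., T1 vs T2, T2 vs T3, T1 vs T3); illustrative only.
raw = [0.010, 0.020, 0.400]
adjusted = bonferroni(raw)
significant = [p < ALPHA for p in adjusted]
```

With three comparisons, a raw p of 0.020 becomes 0.060 after adjustment and no longer falls below the 0.05 threshold, whereas a raw p of 0.010 (adjusted 0.030) still does.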

Of the 34 total participants, the data of two participants (one from each group) were removed because of high levels of EEG artifact contamination across all time points. In addition, six subjects were missing at least some data in a specific recording condition. For the final analyses, only the subjects with full data sets were used. Thus, the final sample (N = 26) comprised 14 participants in the VR group and 12 participants in the control group.

## RESULTS

### Demographic

Control group data were collected after intervention group data due to previously noted constraints. Because these circumstances precluded random assignment between groups, we attempted to match the subsequent control group to the screening and demographic characteristics of the previously collected intervention group. Independent-samples t-tests revealed no statistical differences in terms of age, years of education, GAD, or meditation experience. **Table 1** shows the comparability between groups across all demographic variables.

### State-Trait Anxiety Inventory

To evaluate subject reports of state stress levels across time points, mean STAI scores were analyzed using a repeated-measures 2 × 3 (Group × Time) ANOVA. A significant main effect for Time was observed, $F_{1.4,32.7} = 35.54$, p < 0.001, $\eta_p^2 = 0.60$, with no between Group or interaction effects (Group: $F_{1,24} = 1.72$, p = 0.202, $\eta_p^2 = 0.07$; Group × Time: $F_{1.4,32.7} = 0.54$, p = 0.522, $\eta_p^2 = 0.02$). State stress ratings linearly declined across each interval for both groups, regardless of whether intervals contained rest or the intervention. Means and standard errors across time points for STAI respective to group, and all other dependent variables are contained in **Table 2**.
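The effect sizes reported alongside each F statistic can be recovered from the F value and its degrees of freedom, because partial eta squared is the effect sum of squares divided by the sum of the effect and error sums of squares. The short Python sketch below applies that identity to the STAI values reported above. The function name is an illustrative assumption.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom.

    Follows from F = (SS_effect / df_effect) / (SS_error / df_error),
    so SS_effect / (SS_effect + SS_error)
       = F * df_effect / (F * df_effect + df_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# STAI effects as reported in the text.
eta_time = partial_eta_squared(35.54, 1.4, 32.7)
eta_group = partial_eta_squared(1.72, 1, 24)
eta_interaction = partial_eta_squared(0.54, 1.4, 32.7)
```

Rounded to two decimals, these reproduce the reported values of 0.60, 0.07, and 0.02. The same check can be applied to any of the F statistics reported in the sections that follow.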

#### Mean Power


To evaluate the electrophysiological markers of cognitive state change across intervals, we first analyzed the mean power from all electrode sites for each frequency band of interest. Alpha ($F_{2,48} = 7.06$, p = 0.002, $\eta_p^2 = 0.23$) and Beta ($F_{2,48} = 5.84$, p = 0.005, $\eta_p^2 = 0.20$) demonstrated a significant main effect for time, with both groups demonstrating slight linear increases in power across intervals. When Alpha and Beta were broken down into their sub-bands for increased resolution, both Alpha1 ($F_{2,48} = 6.11$, p = 0.004, $\eta_p^2 = 0.20$) and Alpha2 ($F_{2,48} = 5.75$, p = 0.006, $\eta_p^2 = 0.004$), as well as Low Beta ($F_{2,48} = 8.92$, p = 0.001, $\eta_p^2 = 0.26$), but not High Beta ($F_{2,48} = 2.59$, p = 0.085, $\eta_p^2 = 0.098$), revealed a significant main effect for time. Across intervals, Alpha1 power increased for both groups, but more so for the intervention group after VR. In contrast, Alpha2 power showed a small increase across rest for both groups, which continued after the second period of rest for the control group but gave way to a slight decrease after the VR experience for the intervention group. Group means and standard errors of mean power can be seen in **Table 2**, and reveal comparable linear increases across all intervals for both groups.

No significant time by group interactions were observed for Alpha, Beta, or the respective sub-band activity of each. However, a significant main effect of group was observed for both Alpha ($F_{1,24} = 4.76$, p = 0.039, $\eta_p^2 = 0.17$) and Low Beta ($F_{1,24} = 5.21$, p = 0.032, $\eta_p^2 = 0.18$), to the effect that on average the intervention group had higher power than the control group. Closer investigation of these differences with follow-up one-way ANOVAs at each time point revealed that, for Alpha, the intervention and control groups were initially statistically comparable (T1: $F_{1,25} = 3.97$, p = 0.058), before becoming increasingly different across subsequent intervals (T2: $F_{1,25} = 4.94$, p = 0.036; T3: $F_{1,25} = 5.14$, p = 0.033). Low Beta was found to be initially significantly higher for the intervention group (T1: $F_{1,25} = 4.96$, p = 0.036; T2: $F_{1,25} = 4.72$, p = 0.040), and similarly to Alpha, was most different between groups at the last recording time (T3: $F_{1,25} = 5.74$, p = 0.025). In both cases, the intervention group started with higher average Alpha and Low-Beta spectral power, with the pattern of results showing further power increases particularly for the intervention group after the VR meditation experience (Alpha: $T_{1-2}$ M ± SE = −0.027 ± 0.125, p = 0.998, $T_{2-3}$ M ± SE = −0.264 ± 0.147, p = 0.302; Low Beta: $T_{1-2}$ M ± SE = −0.114 ± 0.083, p = 0.582, $T_{2-3}$ M ± SE = −0.232 ± 0.073, p = 0.021) compared to the control group after their second period of quiet rest (Control Alpha: $T_{1-2}$ M ± SE = −0.341 ± 0.207, p = 0.369, $T_{2-3}$ M ± SE = −0.240 ± 0.111, p = 0.149; Low Beta: $T_{1-2}$ M ± SE = −0.070 ± 0.035, p = 0.205, $T_{2-3}$ M ± SE = −0.047 ± 0.052, p = 0.137). This difference likely contributed to the overall group difference effect found in Alpha and Low Beta revealed by the initial omnibus ANOVA.

Together these results reveal significant increases in Alpha and Beta power over the study duration, with the power in the lower, but not higher, Alpha and Beta sub-bands demonstrating slightly higher power increases on average specifically after the VR intervention as opposed to rest.

### Power Ratios

To further investigate spectral power shifts from the perspective of relative power changes, we converted the mean power into ratio form for comparison. We looked at the specific power dynamic changes against the backdrop of global power (i.e., Sum; 1–30 Hz), as well as between specific band and sub-band frequency ranges. A significant main effect for time was observed for Alpha/Sum ($F_{2,48} = 4.57$, p = 0.015, $\eta_p^2 = 0.16$) and Alpha/Beta ($F_{2,48} = 5.26$, p = 0.009, $\eta_p^2 = 0.18$), but not Beta/Sum ($F_{2,48} = 0.67$, p = 0.523, $\eta_p^2 = 0.03$). Overall, Alpha increased in proportional power over Sum and Beta activity over the course of the experiment for both groups, whereas Beta/Sum tended toward a slight, non-significant decrease across the experimental intervals. Ratio means for band and sub-band dynamics can be viewed in **Table 3**. No significant group differences or interaction effects were observed for Alpha/Sum, Alpha/Beta, or Beta/Sum.

At sub-band resolution, Alpha1/Sum ($F_{2,48} = 4.00$, p = 0.025, $\eta_p^2 = 0.14$), High-Beta/Sum ($F_{2,48} = 5.81$, p = 0.009, $\eta_p^2 = 0.18$), High-Beta/Alpha ($F_{2,48} = 9.24$, p < 0.001, $\eta_p^2 = 0.28$), Low-Beta/Alpha ($F_{2,48} = 5.34$, p = 0.008, $\eta_p^2 = 0.18$), and Low-Beta/High-Beta ($F_{2,48} = 13.20$, p < 0.001, $\eta_p^2 = 0.36$) all revealed significant main effects across time, with no significant differences between groups. Overall, particular to the second interval, and irrespective of group, Alpha1/Sum significantly increased ($T_{2-3}$ M ± SE = −0.025 ± 0.009, p = 0.031), whereas Low-Beta/Alpha ($T_{2-3}$ M ± SE = 0.031 ± 0.011, p = 0.028) and High-Beta/Alpha ($T_{2-3}$ M ± SE = 0.053 ± 0.014, p = 0.002) decreased. High-Beta/Sum ($T_{1-3}$ M ± SE = 0.011 ± 0.003, p = 0.009) demonstrated a slight linear decrease across the experiment, more pronounced for the intervention group after the VR experience. Alpha2/Sum ($F_{2,48} = 2.49$, p = 0.093, $\eta_p^2 = 0.09$), Low-Beta/Sum ($F_{2,48} = 0.42$, p = 0.660, $\eta_p^2 = 0.02$), and Alpha1/Alpha2 ($F_{2,48} = 1.00$, p = 0.375, $\eta_p^2 = 0.04$) did not change significantly across time, or differ significantly between groups. **Figure 3** shows the relative sub-band power dynamics across intervals, by group.

Notably, a significant interaction was observed for Low-Beta/High-Beta ($F_{2,48} = 3.71$, p = 0.032, $\eta_p^2 = 0.13$). Follow-up ANOVA tests revealed that across the first two recording time points, separated by a period of rest for both groups, each group's Low-Beta/High-Beta power was statistically comparable (T1: $F_{1,25} = 4.15$, p = 0.053; T2: $F_{1,25} = 3.14$, p = 0.089). However, at the last recording, a significant difference was observed (T3: $F_{1,25} = 4.15$, p = 0.040). Looking at the within-group patterns for the intervention group, there was no significant change in Low-Beta/High-Beta across the rest interval ($T_{1-2}$ M ± SE = 0.007 ± 0.010, p = 0.998), and the significant change only occurred after the VR meditation experience ($T_{2-3}$ M ± SE = −0.054 ± 0.014, p = 0.005). This pattern was not observed across either of the rest intervals for the control group ($T_{1-2}$ M ± SE = −0.011 ± 0.008, p = 0.578; $T_{2-3}$ M ± SE = −0.012 ± 0.008, p = 0.463). **Figure 3F** illustrates this interaction. These results indicate that the ratio of Low Beta to High Beta increased to favor Low Beta power, as a specific result of the VR experience.

TABLE 2 | Means and standard errors for anxiety ratings and EEG power.

Anxiety ratings correspond to the mean score across participants on the 20 items of the State-Trait Anxiety Inventory (STAI) specifically evaluating current state anxiety. Units are in microvolts-squared (µV²) for mean power. Standard error of mean values are italicized.

TABLE 3 | Means and standard errors for power ratios.

Standard error of mean values are italicized.

#### sLORETA

Next, we investigated specific anterior and posterior cortical regions of interest for changes in regional activity across spectral bands. Specifically, we examined current source densities in the ACC and PCC for Theta, Alpha, and Beta frequencies. In the ACC, the main effect for time approached, but did not reach, significance for Beta ($F_{2,48} = 3.01$, p = 0.054, $\eta_p^2 = 0.11$), and no time effect was found for Theta ($F_{2,48} = 0.41$, p = 0.664, $\eta_p^2 = 0.02$) or Alpha ($F_{2,48} = 0.47$, p = 0.649, $\eta_p^2 = 0.02$). There was no between-group main effect for ACC activity, and Theta and Alpha appeared to be quite stable across intervals for both the experimental and control groups. However, a significant Time by Group interaction emerged for ACC Beta activity ($F_{2,48} = 3.52$, p = 0.038, $\eta_p^2 = 0.13$). Follow-up repeated measures by group indicated no significant change across rest for the experimental group ($T_{1-2}$ M ± SE = −0.122 ± 0.105, p = 0.795), followed by a significant decrease in ACC Beta activity after the VR intervention ($T_{2-3}$ M ± SE = 0.341 ± 0.010, p = 0.048). The control group did not demonstrate a significant change across either rest interval ($T_{1-2}$ M ± SE = −0.090 ± 0.042, p = 0.171; $T_{2-3}$ M ± SE = 0.005 ± 0.098, p = 0.998). Alpha and Beta ACC activity between groups can be found in **Figure 4**.

For the PCC, significant main effects for time were found for Alpha ($F_{2,48} = 4.42$, p = 0.017, $\eta_p^2 = 0.16$) and Beta ($F_{2,48} = 5.61$, p = 0.006, $\eta_p^2 = 0.19$), but not Theta ($F_{1.7,41.8} = 2.44$, p = 0.106, $\eta_p^2 = 0.09$). Significant increases, irrespective of group and interval, were only observed when comparing the first and last time points (PCC Alpha: $T_{1-3}$ M ± SE = −0.228 ± 0.086, p = 0.042; PCC Beta: $T_{1-3}$ M ± SE = −0.122 ± 0.039, p = 0.015). The experimental group demonstrated consistently higher PCC current source densities than the control group (Alpha: $F_{1,24} = 4.47$, p = 0.045, $\eta_p^2 = 0.16$; Beta: $F_{1,24} = 4.32$, p = 0.049, $\eta_p^2 = 0.15$), but both groups demonstrated the same small pattern of linear CSD increase over time.

Together these results reveal a significant specific effect on ACC Beta activity occurring after the VR experience, but not after rest alone. Alpha activity was uniformly higher in PCC for the VR group across all intervals, demonstrating similar incremental increases across the VR interval as across rest. It is unclear what accounts for the group differences in PCC Alpha; however, it does not appear to be directly related to undergoing the VR meditation experience.

FIGURE 3 | Changes in sub-band power ratios across rest and virtual reality intervals. (A,B) Change in Alpha1 and Alpha2 activity relative to total broadband power ('Sum') across measurement time points. (C) Relative power changes between Alpha1 and Alpha2. (D,E) Change in Low- and High-Beta activity relative to total broadband power. (F) Relative power changes between Low-Beta and High-Beta. A significant interaction (p < 0.05) occurred for Low-Beta/High-Beta. <sup>∗</sup>Denotes p < 0.05 for follow up within-subject repeated measure ANOVAs by group, corrected for multiple pairwise comparisons. Alpha1 = 8–10 Hz, Alpha2 = 10–12 Hz, Low-Beta = 12–18 Hz, High-Beta = 18–30 Hz, Sum = 1–30 Hz.

FIGURE 4 | Current source density changes in the anterior cingulate cortex (ACC) across rest and virtual reality intervals. (A) Changes in Alpha across measurement time points. (B) Changes in Beta activity across measurement time points. A significant interaction (p < 0.05) occurred for ACC Beta activity. <sup>∗</sup>Denotes p < 0.05 for follow up within-subject repeated measure ANOVAs by group, corrected for multiple pairwise comparisons. Current source densities were estimated with sLORETA (Pascual-Marqui, 2002).

#### Correlations

To further quantify the relationship between self-reported anxiety and changes in electrophysiological profiles, we conducted correlational analyses examining both STAI and GAD scores in relation to our EEG power ratio and source localization data (N = 26 for correlational analyses). GAD scores were inversely correlated with Alpha/Sum, significantly at T1 and T3 and at trend level at T2 (T1, r = −0.47, p = 0.019; T2, r = −0.35, p = 0.080; T3, r = −0.43, p = 0.027). At the sub-band level, this relationship was carried by the proportion of slower Alpha activity (Alpha1/Sum: T1, r = −0.46, p = 0.018; T2, r = −0.35, p = 0.078; T3, r = −0.43, p = 0.029) but not higher Alpha sub-band activity (Alpha2/Sum: T1–3, r's = −0.12 to −0.13, p's > 0.2). Additionally, and perhaps more germane to our main findings, across all time points Low-Beta/High-Beta was similarly significantly inversely related to GAD scores (T1, r = −0.45, p = 0.023; T2, r = −0.43, p = 0.030; T3, r = −0.39, p = 0.048). Conversely, High-Beta/Sum activity was significantly positively associated with GAD across all time points (r = 0.48, p = 0.013; r = 0.45, p = 0.022; r = 0.44, p = 0.025) whereas Low-Beta/Sum was not (T1–3, r's = 0.10 to 0.12, p's > 0.5).

To analyze the STAI ratings, difference scores were calculated between T2-1, T3-2, and T3-1. Difference scores were also calculated for Alpha, Beta, and their sub-bands across a number of variables. The only positive correlation between STAI and EEG was between T3-2 for mean High-Beta power and higher T3-2 STAI rating (r = 0.456, p = 0.019). That is, higher High Beta predicted higher state stress ratings, a result which seems to align with the GAD anxiety rating results.
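The difference-score approach described above can be sketched as follows: change scores are computed between two time points for each measure, and the two sets of change scores are then correlated. The Pearson formula is standard, but the function names and all data values below are hypothetical and do not come from the study.

```python
import math


def difference_scores(later, earlier):
    """Per-subject change between two time points (later minus earlier)."""
    return [b - a for a, b in zip(earlier, later)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for four subjects (illustrative values only).
stai_t2 = [40, 38, 45, 50]
stai_t3 = [35, 36, 38, 49]
high_beta_t2 = [1.0, 1.2, 1.5, 0.9]
high_beta_t3 = [0.8, 1.1, 1.1, 0.9]

stai_change = difference_scores(stai_t3, stai_t2)        # T3 minus T2
high_beta_change = difference_scores(high_beta_t3, high_beta_t2)
r = pearson_r(stai_change, high_beta_change)
```

In this made-up example, subjects whose High-Beta power dropped most between T2 and T3 also showed the largest drops in STAI score, giving a strongly positive r. This is the direction of association the text reports for the real data.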

### DISCUSSION

To our knowledge, this is the first study to evaluate the potential anxiety-reducing effects of a brief VR intervention specifically designed to help individuals with trait elevated general anxiety. Although both our control and experimental VR intervention groups self-reported decreasing state anxiety across the experiment, significant objective electrophysiological markers associated with reduced anxiety states appeared only after the VR meditation experience, as opposed to normal periods of rest. The VR meditation resulted in both global and regional decreases in Beta activity. Both effects are consistent with electrophysiological indications of a state of reduced anxiousness. The results of this pilot study provide preliminary evidence that VR interventions may be a useful and effective tool for the treatment of elevated anxiety symptoms. However, the findings and interpretations of these results must be considered against the background of notable study limitations. Future research is needed to further develop and expand upon the current level of neural mechanistic and processual understanding with regard to how meditative VR simulations affect the brain and mind. In addition, more robust studies are needed to continue testing the therapeutic efficacy of similar VR meditations in clinical populations, and as compared to other varieties of treatment and control conditions. Despite a number of limitations, discussed in the subsequent sections, this pilot study provides a preliminary set of results to build upon as both science and technology continue to advance in this burgeoning area of applied research and intervention.

In this study, both the VR intervention and control participants demonstrated linear decreases in subjective anxiety across all time points. Contrary to our hypotheses, after the VR meditation experience we did not observe a particularly notable reduction in self-reported state anxiety, beyond what the interval of mere quiet rest afforded. Decreases occurred in similar magnitudes irrespective of group or whether the preceding interval contained rest or VR. The ubiquity of this pattern between groups, despite obvious experiential differences between VR and rest, could be due to a number of factors. First is the possibility that the VR experience was indeed completely comparable to the experience of resting. However, given that the experience of quiet rest is quite different from the immersive VR experience, it is doubtful that the exact same anxiety-reducing processes were underway across both intervals for both groups. Furthermore, the current study demonstrated a pattern of spectral changes in the intervention group which differed in significant and important ways from the control group, which indicates that there were indeed unique differences in neurophysiological state brought about by the VR meditative experience.

Alternatively, the observed pattern of STAI responses may have been the result of other potentially strong experimental influences such as demand characteristics (expectations to report reduced anxiety), habituation to the testing environment, psychometric limitations such as floor and ceiling effects, or any combination of these factors. We tend to believe these factors were at least partially influential on the pattern of anxiety rating responses across the testing sessions. While incorporating a control group allowed us to account for and mitigate the confounding effects of some of these considerations (e.g., habituation), we believe that incorporating a variety of anxiety measures in future investigations would help to better delineate the specific magnitude and nature of subjective psychological responses to VR interventions.

Importantly, while both groups showed comparable decreases in self-reported state anxiety, only the VR group, after the VR intervention, demonstrated additional physiological changes in alignment with reduced hyperarousal and/or anxiety. Our EEG analysis was focused on Alpha and Beta, and the activity of their respective sub-bands (Alpha1, Alpha2, Low-Beta, High-Beta). The amount of EEG activity occurring across these ranges has been linked with states of relaxation, stress, and anxiety, with increased Alpha being broadly associated with calm and relaxed states, and Beta, particularly higher Beta frequencies, associated with qualitatively anxious states (Thompson and Thompson, 2007; Price and Budzynski, 2009; Olbrich et al., 2011). Our correlational analyses seemed to support such a relationship. GAD scores were significantly inversely correlated with the proportion of Alpha power relative to the full spectrum power profile (i.e., Sum). Moreover, at the sub-band level it was found that lower Alpha frequencies (Alpha1), but not higher Alpha frequencies (Alpha2), supported this relationship. Conversely, High Beta activity significantly predicted higher anxiety scores on both the GAD and the STAI, whereas Low Beta activity did not.

We anticipated some degree of psychological and physiological relaxation response to emerge from a period of quiet rest, but hypothesized that the VR experience would result in unique significant psychophysiological effects beyond experiencing rest alone. Specifically, we hypothesized that we would observe power shifting from higher frequency ranges, such as High-Beta, into lower frequency ranges, such as Alpha. Interestingly, our results tended to indicate power downshifts occurring within conventional bandwidths (i.e., between sub-bands) but not between them (e.g., we did not observe overall significantly reduced broadband Beta activity coinciding with significantly increased Alpha activity).

Indeed, the pattern of results was much more nuanced. Looking at absolute mean power in Alpha and Beta bandwidths revealed similar power increases across groups, with minor, but not statistically significant, divergent patterns when mean power was evaluated within respective sub-bands across the experimental intervals. The observed increased Alpha power for both groups generally fits the narrative of reduced anxiety and increased relaxation, despite no significant extra or unique Alpha enhancement emerging from the VR intervention. Overall, broadband Beta activity demonstrated small but significant increases over time, irrespective of group. However, sub-band dynamics within Beta revealed unique VR effects which supported our general hypotheses.

A more detailed analysis comparing relative power between EEG bands revealed a significant change in the Low-Beta/High-Beta power ratio, specifically occurring in the VR intervention group after the VR experience, but not after rest. While the High Beta frequency range plateaued across the VR interval (and actually decreased relative to total broadband power), the lower Beta range power showed specific increases immediately following the VR exposure. When these shifts in sub-band Beta power were directly compared relative to one another, it was found that the proportion of Low-Beta to High-Beta activity shifted significantly in favor of Low-Beta activity, only for the VR group and only after the VR experience. Although on average the control group had a slightly lower Low-Beta/High-Beta ratio, similar to the VR group, intervals merely involving rest had no effect on the Low-Beta/High-Beta power dynamic.
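As a minimal sketch of the ratio discussed above, the following Python computes band power, relative power, and a Low-Beta/High-Beta ratio from a power spectrum. The band edges (13–20 Hz and 20–30 Hz), the function names, and the synthetic spectra are illustrative assumptions; they are not the band definitions or the processing pipeline used in the study.

```python
# Hypothetical band edges in Hz (assumed for illustration only).
BANDS = {"low_beta": (13.0, 20.0), "high_beta": (20.0, 30.0)}


def band_power(freqs, psd, lo, hi):
    """Sum spectral power over frequency bins with lo <= f < hi."""
    return sum(p for f, p in zip(freqs, psd) if lo <= f < hi)


def relative_power(freqs, psd, lo, hi):
    """Band power as a proportion of total power across the spectrum."""
    return band_power(freqs, psd, lo, hi) / sum(psd)


def low_high_beta_ratio(freqs, psd):
    """Ratio of Low-Beta power to High-Beta power."""
    low = band_power(freqs, psd, *BANDS["low_beta"])
    high = band_power(freqs, psd, *BANDS["high_beta"])
    return low / high


# Synthetic spectra in 1-Hz bins from 1 to 40 Hz (not real EEG data).
freqs = [float(f) for f in range(1, 41)]
pre = [1.0] * len(freqs)                                   # flat spectrum
post = [1.5 if 13.0 <= f < 20.0 else 1.0 for f in freqs]   # Low-Beta boosted

ratio_pre = low_high_beta_ratio(freqs, pre)
ratio_post = low_high_beta_ratio(freqs, post)
print(ratio_pre, ratio_post)
```

In this toy case the ratio rises because Low-Beta power increases while High-Beta power stays flat, which mirrors the direction of the shift reported for the VR group.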

This finding is significant, indicating a relative reduction of global High-Beta activity only after the VR intervention. This band is of particular interest as it is the frequency range notably associated with elevated stress and anxiety. In their research, Thompson and Thompson (2007) report that anxiety is generally associated with an increase of High-Beta found in conjunction with a decrease of Low Beta activity, which is the exact opposite of the pattern demonstrated by the VR group in this study. Indeed, our correlational analyses also revealed and supported the association between elevated anxiety and High Beta activity. Given our participants were specifically selected based on their elevated anxiety, these shifts in power away from High-Beta are consistent with our hypothesis of reduced anxiety as a result of the VR experience.

Beyond evaluating global electrophysiological pattern changes, we were also interested in select regional activity changes. Using sLORETA, a source localization technique that has been shown to be reasonably effective and accurate, even with relatively low electrode densities such as here (Cohen et al., 1990; Pascual-Marqui et al., 1994; Pascual-Marqui, 1999, 2002, 2007), we examined CSD changes in the anterior and posterior cingulate cortices.

The ACC is thought to serve as a primary mediator between the limbic system and the autonomic nervous system. It is engaged during a range of tasks including decision-making (Rushworth et al., 2007; Morecraft and Tanji, 2009; Shackman et al., 2011), reward processing, conflict monitoring, error detection, and the experience of pain (Vogt, 2005; Beckmann et al., 2009; Shackman et al., 2011). Due to its role in focusing attention, appraisal, and cognitive flexibility, it may come as no surprise that ACC overactivity has also been linked to stress, worry, and cognitive rigidity (Amen, 2001). Other research also suggests ACC overarousal is linked to symptoms of obsessive-compulsive disorder (Rauch et al., 1998; Graybiel and Rauch, 2000). Consistent with this understanding, studies designed to reduce anxiety using respiratory sinus arrhythmia breathing (Sherlin and Wyckoff, 2010) and meditation (Cahn and Polich, 2006; Fox et al., 2016) have demonstrated reduced activation of the ACC. Consistent with these roles and the intended goal of anxiety reduction, here we observed a significant decrease of Beta activity in the ACC in the VR but not the control group, which occurred only after the VR experience as opposed to an equivalent time of rest. This supports the notion that the VR intervention resulted in changes beyond those produced by time or quiet rest, and implicates the ACC as one specific region likely affected by the VR meditation, in addition to, or as part of, the global reduction in physiological arousal suggested by our other EEG results.

The other region of interest examined was the PCC. This region is a primary hub of the Default Mode Network (DMN), a group of neural structures strongly linked to self-referencing and other self-related processing (Shulman et al., 1997; Raichle et al., 2001). Recent investigations have identified that the processing of self-related information in the DMN is largely driven by the PCC (Davey et al., 2016). Due to its strong role in self-referencing, research has consistently demonstrated that overarousal of the PCC is connected to a variety of mental health concerns, including ADHD (Nakao et al., 2011), depression (Haznedar et al., 2004), and rumination (Berman et al., 2011). Decreased activity in the PCC is a common finding in meditation literature, consistent with the notion of minimizing self-related thinking (Baerentsen et al., 2010; Fox et al., 2016).

In the current study, we found significant group differences sustained across all intervals, to the effect that PCC activity was generally higher in the VR group compared to controls across Theta, Alpha, and Beta activity ranges. It is unclear why this difference existed. It may somehow be related to the sequential nature of our group data collection, a notable limitation of our study; the precise effects of which are unclear. However, both groups, irrespective of whether the intervals contained rest or the VR intervention, demonstrated similar increases in PCC activation across the duration of the study. Given the broad and non-specific nature of the PCC, it is not entirely clear what this pattern of activity means in terms of contributing to particular psychological or physiological effects. Tentatively, the lack of between group differences suggests that sitting quietly compared to engaging in this type of VR mindfulness meditation for brief periods may have similar impacts on this particular brain region. The PCC has been identified as a neural region involved in self-examination and self-referential processing (Cavanna and Trimble, 2006; Herwig et al., 2012), so it is possible that the changes in the PCC are associated with participants repeatedly self-reflecting on their state anxiety across the course of the study.

Unfortunately, for this pilot, we did not formally assess the content of the subjects' cognitive processes during the experimental or control conditions, instead relying on STAI ratings as indicators of cognitive state and stress levels. Despite this limitation, a number of informal observations were made suggesting the VR intervention was having the predicted effect. Some of these observations in experimental group subjects included relaxed shoulder postures and/or slowing of their breathing pattern during the VR experience, which tended to occur between the 2- and 3-min marks of the experience. In addition, several experimental group subjects made spontaneous comments after the VR experience, suggesting that they found it to be enjoyable and/or relaxing. For example, one subject noted "I liked the woman's voice on the guided meditation. I found it relaxing."

Summarizing to this point, both groups showed equivalent reductions in state anxiety across time based on subjective ratings. As well, both groups demonstrated increased Alpha power, a common concomitant associated with taking a period of rest. However, a unique effect associated with the VR experience was a global power shift from higher to lower Beta frequencies. Given that prominent high frequency Beta activity is a known marker of state hyperarousal and anxiety, this finding is significant. It tends to support the notion that the VR intervention successfully reduced an important psychophysiological aspect of anxiety, in a way that merely taking a resting break is not able to do.

#### Limitations and Future Directions

At the broadest experimental level, there are some rather conspicuous limitations. As a pilot evaluation of a newly developed VR meditation intervention, our sample size was small and lacked statistical power to do more robust analyses or detect smaller effects. Future studies evaluating similar interventions on elevated or clinical levels of general anxiety should include larger, possibly more representative samples. Regarding our sample size and composition, here we are somewhat limited in the generalizability of our results. Our sample was comprised of predominantly white women between the ages of 35 and 50. While this may be an important demographic for this type of intervention, the results tell us little about how other populations might respond. Similarly, roughly two-thirds of the sample in this study engaged in some kind of contemplative practice each week. It is unclear how this may have influenced participant interaction with the VR meditation experience. An area that remains to be examined is whether individuals with very different levels of meditative experience engage with or benefit from similar VR experiences comparably.

As previously pointed out, another notable limitation was that the two groups, the intervention and the control samples, were recruited and run separately. Time and resource limitations caused a delay in acquiring the control group sample by about 3 months. The intervention group was fully collected from September to November, while the control group was fully collected from February to May. It is unclear what, if any, confounding effects resulted from this; suffice it to say this was not experimentally optimal, and the exact nature and consequence of this serial-order, non-randomized data collection is generally unknown. However, tentatively, we did not observe any clear signs of this causing major effects or influences. Our samples were identically recruited, closely matched across demographic characteristics (**Table 1**), and demonstrated very comparable electrophysiological and anxiety reporting patterns across baseline comparisons.

Aside from these more broad-based limitations, we also acknowledge that the results and interpretations of this pilot are by no means definitive, and more research is needed to verify and better elucidate therapeutic mechanisms of this, or similar, VR interventions. Our primary findings were that the VR experience uniquely affected broad and regional Beta activity, with emerging EEG patterns consistent with reductions in anxiety, and physiological hyperarousal. Again, previous research supports our interpretation that this effect is likely indicative of an adaptive state change and a reduction of experiential anxiety; and our results hold early promise based on these connections. However, more research must be conducted to clearly elucidate the nature of high Beta activity in relation to state and trait anxiety manifestations and how interventions, such as we have presented here, can play a maximal therapeutic role. Here, we were restricted in our ability to address such concerns, limited by our reliance on the STAI-state as the sole dependent measure of anxiety. Future research would benefit from including multiple measures to specifically assess changes in cognitive content or somatic indicators of arousal or stress. Our results will need to be replicated and expanded on with a more robust variety of measures.

It will also be important to devise studies which isolate the elements of the VR experience to more clearly define the "active ingredients" of VR based interventions, such as the one tested here. Independently, viewing nature and practicing mindfulness can both reduce anxiety and promote state changes. This VR experiment has both elements, making it unclear whether one factor played more of a role than the other, or whether the combined effect of these parts in the VR experience is greater than the sum of each factor alone. Future studies could also expand on this pilot by including an active control group engaged in a standard relaxation exercise, or an audio-only guided meditation. Such a comparison between groups would provide additional clarity about the potential impact of VR in reducing stress/anxiety above and beyond already existing interventions. Because this is a relatively new approach, future studies with this type of experience should also include follow-up questionnaires and/or participant interviews to better ascertain the internal state and reaction to elements of the VR experience, making it easier to interpret specific EEG changes.

Finally, beyond examining the acute effects of a brief single VR session, as we have investigated here, it will also be important to study the effects of longer or repeated VR intervention programs, or to evaluate the potential benefits of VR based meditation experience as an adjunct to traditional psychotherapy. A number of important areas still require attention, such as optimal therapeutic VR "dosages," session frequencies, and/or the very nature of how best to integrate VR interventions into broader treatment plans.

#### CONCLUSION

Our results support the notion that intentionally crafted VR experiences can be therapeutically effective, and may result in immediate, adaptive psychophysiological outcomes. Although there are a number of limitations present in the current study, here we have provided early evidence that VR based meditation interventions have the potential to play an important role in anxiety management and stress reduction. As VR technology becomes more accessible and user-friendly, this type of intervention may find its way into a variety of environments and applications. Because the technology is relatively easy to use, it may serve as a wellness tool in work and school environments, as an intervention for persons with lack of access to nature, as a calming technique for persons receiving medical/dental procedures, as an adjunct to traditional therapeutic interventions, such as CBT programs for GAD, or any number of other applications to increase the personal psychological wellbeing of those in need.

#### REFERENCES


#### DATASETS ARE AVAILABLE ON REQUEST

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

#### ETHICS STATEMENT

The protocol was approved by Quorum Institutional Review Board, Seattle, WA, United States. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

#### AUTHOR CONTRIBUTIONS

JT contributed study conception, design, data collection, and served as the primary writer of the manuscript. JV performed statistical analysis, results reporting, and assisted in manuscript writing. HC organized the database, prepared tables, and assisted in manuscript organization. All authors contributed to manuscript revisions.

sources in the human brain. Ann. Neurol. 28, 811–817. doi: 10.1002/ana.410280613



EEG and sLORETA. Appl. Psychophysiol. Biofeedback 35, 219–228. doi: 10.1007/s10484-010-9132-z


**Conflict of Interest Statement:** JT is contracted by StoryUp VR to assist in product development and assessment. Coauthor JV, who has no affiliation with StoryUp, was recruited to conduct statistical analyses and assist in the interpretation of this data to reduce potential conflicts of interest.

The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Tarrant, Viczko and Cope. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Video Games for Well-Being: A Systematic Review on the Application of Computer Games for Cognitive and Emotional Training in the Adult Population

#### Federica Pallavicini\*, Ambra Ferrari and Fabrizia Mantovani

Riccardo Massa Department of Human Sciences for Education, University of Milan Bicocca, Milan, Italy

Background: Although several excellent reviews and meta-analyses have investigated the effect of video game training as a tool to enhance well-being, most of them specifically focused on the effects of digital games on brain plasticity or cognitive decline in children and seniors. By contrast, only one meta-analysis has focused on the adult population, and it is restricted to examining the effects of training with a particular genre of games (action video games) on cognitive skills of healthy adults.

#### Edited by:

M.-Carmen Juan, Universitat Politècnica de València, Spain

#### Reviewed by:

Stephen Fairclough, Liverpool John Moores University, United Kingdom Karmele López-de-Ipiña, Universidad del País Vasco, Spain Magdalena Mendez-Lopez, Universidad de Zaragoza, Spain

#### \*Correspondence:

Federica Pallavicini federica.pallavicini@gmail.com

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 13 June 2018 Accepted: 15 October 2018 Published: 07 November 2018

#### Citation:

Pallavicini F, Ferrari A and Mantovani F (2018) Video Games for Well-Being: A Systematic Review on the Application of Computer Games for Cognitive and Emotional Training in the Adult Population. Front. Psychol. 9:2127. doi: 10.3389/fpsyg.2018.02127

Objectives: This systematic review aimed to identify research evidence about the impact of video game training on cognitive [i.e., processing and reaction times (RTs), memory, task-switching/multitasking, and mental spatial rotation] and emotional skills in the healthy adult population.

Methods: A multi-component analysis of variables related to the study, the video games, and the outcomes of the training was made on the basis of important previous works. Databases used in the search were PsycINFO, Web of Science (Web of Knowledge), PubMed, and Scopus. The search string was: [("Video Games" OR "Computer Games" OR "Interactive Gaming")] AND [("Cognition") OR ("Cognitive") OR ("Emotion") OR ("Emotion Regulation")] AND ["Training"].

Results: Thirty-five studies met the inclusion criteria and were further classified according to the different analysis variables. The majority of the retrieved studies used commercial video games; action games were the most commonly used genre, closely followed by puzzle games. Effect sizes for training with video games on cognitive skills in general ranged from 0.06 to 3.43: from 0.141 to 3.43 for processing and RTs, 0.06 to 1.82 for memory, 0.54 to 1.91 for task switching/multitasking, and 0.3 to 3.2 for mental spatial rotation; regarding video games for the training of emotional skills, effect sizes ranged from 0.201 to 3.01.

Conclusion: Overall, findings provide evidence of benefits of video game training on cognitive and emotional skills in the healthy adult population, especially in young adults. Efficacy has been demonstrated not only for non-commercial video games or commercial brain-training programs, but for commercial video games as well.

Keywords: video games, computer games, cognitive training, emotional training, well-being

### INTRODUCTION

Over the last 40 years, video games have increasingly had a transformational impact on how people play and enjoy themselves, as well as on many more aspects of their lives (Yeh et al., 2001; Zyda, 2005; Boyle et al., 2012). Contrary to popular belief, which sees male children or teenagers as the main targets of the gaming industry, the average player is 30 years old, and the gaming population is roughly equally divided between male and female players; gaming is therefore a daily activity for a substantial percentage of the adult population (Entertainment Software Association, 2015). Thanks to their wide availability on the market, affordable cost, and massive popularity, video games already represent crucial tools as a source of entertainment, and are expected to soon become critical in other fields as well, including mental health (Granic et al., 2014; Jones et al., 2014).

While much of the early research on computer games focused on the negative impacts of playing digital games, particularly on the impact of playing violent entertainment games on aggression (e.g., Ferguson, 2007), and addiction (e.g., Gentile, 2009), gradually, scientific studies have also recognized the potential positive impact of video games on people's health (e.g., Anderson et al., 2010; Jones et al., 2014).

In recent decades, the field of computer gaming has increasingly developed toward serious purposes, and both commercial and non-commercial video games (i.e., developed ad hoc by researchers for the training of specific individuals' skills) have been tested by several studies. As early as 1987, it was first observed that famous commercial video games (i.e., Donkey Kong and Pac-Man) can have a positive effect on cognitive skills, improving the RTs of older adults (Clark et al., 1987). A few years later, in 1989, Space Fortress, the first non-commercial computer game designed by cognitive psychologists as a training and research tool (Donchin, 1989), was considered so successful that it was added to the training program of the Israeli Air Force. From that moment on, numerous video games have been developed with the specific purpose of changing patterns of behavior, and are often defined in the literature as "serious games" (Zyda, 2005) as they use gaming features as the primary medium for serious purposes (Fleming et al., 2016).

Since these pioneering studies, numerous studies have investigated the potential of various video games, both commercial and non-commercial, mainly in relation to the cognitive skills of seniors. For instance, it has been observed that the use of complex strategy video games can enhance cognitive flexibility, particularly in older adults (Stern et al., 2011). Furthermore, playing a commercial computer cognitive training program results in significant improvement in visuospatial working memory, visuospatial learning, and focused attention in healthy older adults (Peretz et al., 2011).

Besides being useful tools for the training of cognitive processes, video games have been shown in various studies to offer a variety of positive emotion-triggering situations (e.g., Ryan et al., 2006; Russoniello et al., 2009; McGonigal, 2011), which may be of benefit during training of emotional skills, including self-regulation habits (Gabbiadini and Greitemeyer, 2017). For instance, puzzle video games such as Tetris, characterized by low cognitive loads and generally short time demands, are capable of positive effects on the players' mood, generating positive emotions and relaxation (Russoniello et al., 2009). Furthermore, by continuously providing new challenges, whether by switching from one level to another (e.g., Portal 2) or between different avatars (e.g., World of Warcraft), video games require players to "unlearn" their previous strategies and flexibly adapt to new systems without experiencing frustration and anxiety (Granic et al., 2014).

Although several excellent reviews and meta-analyses have investigated the effect of video game training as a tool for enhancing individuals' well-being, in particular regarding cognitive and emotive enhancement (e.g., Boyle et al., 2016; Lumsden et al., 2016), most of them specifically focused on the effects of digital games on brain plasticity or cognitive decline in children and seniors (e.g., Lu et al., 2012; Lampit et al., 2014). Consonant findings regarding the positive relationship between video game training and benefits on various cognitive skills have been demonstrated by both behavioral studies (e.g., Baniqued et al., 2014) and meta-analytic studies (Toril et al., 2014) regarding both the aforementioned populations. By contrast, only one meta-analysis focused on the adult population, and it is restricted to examining the effects of training with a particular genre of games (action video games) on cognitive skills in healthy adults (Wang et al., 2016).

Despite this scarcity of focus on the adult population, the latter represents an extremely interesting and unique group, with very peculiar characteristics from a neurological and psychological point of view compared with children and elders. As stated by Finch, adulthood, including both young adults (18–35 years old) and middle-aged adults (35–55 years old), plays an important role in life-span development, and therefore well deserves to be studied thoroughly (Finch, 2009). On the one hand, the effects of the so-called inverted U curve of neuroplasticity and cognitive performance start to be evident during adulthood, especially in middle age (Cao et al., 2014; Zhao et al., 2015). On the other, it is well known that the level of psychological stress perceived by adults is rather high, and it can result in important mental and health disorders (Kudielkaa et al., 2004).

Moreover, as the literature states, baseline individual differences regarding age can determine variations in training effectiveness (Jaeggi et al., 2011; Valkanova et al., 2014), and if it is safe to say that video games can have beneficial effects when included in a training (e.g., Baniqued et al., 2014; Toril et al., 2014), such effects might indeed vary based on age-specific aspects which therefore cannot be overlooked (Wang, 2017).

Consequently, in the current review, we will describe experimental studies that were conducted between 2012 and 2017, with the aim of identifying research evidence about the impact of video game training on cognitive and emotional skills in the adult population. Specifically, a multi-component analysis of variables related to the study, video games, and outcomes of training was made on the basis of important previous works (Connolly et al., 2012; Kueider et al., 2012; Boyle et al., 2016), which provide a useful framework for organizing the research along key variables.

### METHODS

We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2009).

### Search Strategy

With the objective of providing an overview of the experimental studies that have been conducted to test the benefits of different categories of video games used as training tools of cognitive or emotional domains for the adult population, a computer-based search for relevant publications was performed in several databases. Databases used in the search were PsycINFO, Web of Science (Web of Knowledge), PubMed, and Scopus. The search string was: [("Video Games" OR "Computer Games" OR "Interactive Gaming")] AND [("Cognition") OR ("Cognitive") OR ("Emotion") OR ("Emotion Regulation")] AND ["Training"].
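The Boolean structure of the search string can be sketched in Python as three OR-groups joined by AND. The helper names below are hypothetical, and the output uses plain parentheses rather than the bracket notation printed above; each database has its own query syntax, so the generated string would likely need adapting before use.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(*groups):
    """Join several OR-groups with AND."""
    return " AND ".join(or_group(g) for g in groups)


# Term groups taken from the search string reported in the review.
media = ["Video Games", "Computer Games", "Interactive Gaming"]
domains = ["Cognition", "Cognitive", "Emotion", "Emotion Regulation"]
purpose = ["Training"]

query = build_query(media, domains, purpose)
print(query)
```

Keeping the term groups as separate lists makes it easy to add synonyms to one concept without disturbing the others.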

### Selection of Articles for Inclusion in the Review

To avoid the risk of bias, PRISMA recommendations for systematic literature analysis were strictly followed (Moher et al., 2009). Two authors (Federica Pallavicini, Ambra Ferrari) independently selected paper abstracts and titles, analyzed the full papers that met the inclusion criteria, and resolved any disagreements through consensus. Selected papers had to: (a) include empirical evidence on the impact and outcomes of video game based training; (b) have been published during the last 5 years (namely from January 2012 to August 2017), in analogy with several other relevant previous works (i.e., Connolly et al., 2012; Boyle et al., 2016); (c) include participants within an age range of 18–59 years old; (d) only include samples of healthy participants, i.e., not suffering from any neurological disorder (e.g., traumatic brain injury), or psychiatric disorders according to DSM-5 Axis I (American Psychiatric Association, 2013); (e) be published in peer-reviewed journals.
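Criteria (a) to (e) can be expressed as a simple screening predicate. The sketch below is only illustrative: the `Record` fields are hypothetical, the end of the publication window is assumed to be 31 August 2017, and criterion (c) is interpreted as requiring the whole sample age range to fall within 18–59 years. In the review itself, screening was performed by two authors reading the papers, not by code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    """Hypothetical screening record; field names are illustrative."""
    published: date
    min_age: int
    max_age: int
    healthy_sample: bool
    peer_reviewed: bool
    empirical_training_outcomes: bool


WINDOW_START = date(2012, 1, 1)   # January 2012
WINDOW_END = date(2017, 8, 31)    # August 2017 (end of month assumed)


def meets_criteria(r: Record) -> bool:
    """Return True only if the record satisfies criteria (a) to (e)."""
    return (
        r.empirical_training_outcomes                    # (a)
        and WINDOW_START <= r.published <= WINDOW_END    # (b)
        and r.min_age >= 18 and r.max_age <= 59          # (c), as assumed
        and r.healthy_sample                             # (d)
        and r.peer_reviewed                              # (e)
    )


included = Record(date(2014, 5, 1), 18, 35, True, True, True)
too_old = Record(date(2014, 5, 1), 60, 75, True, True, True)
too_early = Record(date(2011, 12, 31), 18, 35, True, True, True)

print(meets_criteria(included), meets_criteria(too_old), meets_criteria(too_early))
```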

### Coding of Selected Studies, Video Games, and Training Outcomes

The papers selected on the basis of the inclusion criteria were coded using the data extraction pro-forma developed by Connolly et al. (2012) and subsequently modified by Boyle et al. (2016), adapting it to the specific area of interest of this review. In particular, in this systematic review papers were coded with respect to:

• **Video Game Variables:** The game category (whether the game was commercial or non-commercial); the game genre (action games; driving-racing games; puzzle games; strategy games; simulation games; exergames; horror games; commercial brain training programs; arcade games; adventure games); the platform for the game (console, PC/laptop, or mobile gaming). First, the category of the game was included to explore the effectiveness of several commercial titles used "as-is" (without modifications), which proved effective for cognitive training in previous studies (e.g., Green and Bavelier, 2006; Dye et al., 2009). Furthermore, the categorization was included in order to analyze the efficacy of ad hoc developed games, whose effectiveness is still debated (e.g., Owen et al., 2010). Second, the classification of video game genres was considered because, in many respects, not all video games are equal, and their effects strongly depend on specific characteristics of the game itself (Achtman et al., 2008; van Muijden et al., 2012). In addition, it has been reported that combinations of the neurological stage of the participants and the precise features of each video game produce unique results in terms of benefits on mental skills (Ball et al., 2002; van Muijden et al., 2012). There is no standard accepted taxonomy of genre, although one of the most widely adopted is Herz's system (Herz, 1997), while other studies seem to simply separate action games from all other kinds, often defined collectively as casual games (e.g., Baniqued et al., 2013, 2014). Here, we propose the above categorization, which resembles the present commercial classification as much as possible, defining ten different genres of commercial video games. 
Third, new technologies such as mobile devices and online games have recently expanded the ways in which games have traditionally been played, their medium of delivery, and the different platforms available. Platforms of delivery represent important information about video game training, primarily because they are the way in which the training itself can be accessed (Aker et al., 2016).

• **Variables Related to the Study:** The sample included in the study (sample size, mean age, or age range); the research design used (categorized as a Randomized Controlled Trial or Quasi Experimental); the measures used for the assessment of outcomes (self-report questionnaires, cognitive tests, fMRI, physiological data, etc.); the duration of training (duration, intensity, and total number of sessions); and the effect size of each training outcome, reported as partial eta squared (η²), with values closer to 1.0 indicating a stronger effect, and as Cohen's d. The range and mean value of effect sizes for each training outcome were expressed as Cohen's d, applying a conversion formula when the study reported partial eta squared (η²) (Cohen, 1998). Where effect sizes were not reported in the study, standardized Cohen's d values were derived using one of the following computation formulas: the one described in Dunlap et al. (1996) to calculate d from dependent t-tests; the one by Thalheimer and Cook (2002) for ANOVAs with two distinct groups (df = 1); and the one by Rosenthal and DiMatteo (2001) for χ² (with one degree of freedom). When effect sizes could not be calculated, either because they were not reported or because the data needed to derive them were missing, the p-value was reported instead (e.g., Oei and Patterson, 2013; Wang et al., 2014; Chandra et al., 2016). The sample, study design, and measures of training outcomes were included as relevant variables, in analogy with previous reviews (Boyle et al., 2012; Connolly et al., 2012), so that studies can be easily classified and compared. An indication of mean age or age range was provided in order to identify studies conducted on young vs. middle-aged adults.
Training-related factors have also been considered, including the duration, intensity, and total amount of training sessions, as well as the effect sizes of the training outcomes, since they represent useful information about the characteristics and feasibility of the training itself (Hempel et al., 2004).

• **Video Game-Based Training Outcome Variables:** The selected papers have been divided into two macro-categories: cognition and emotion. Regarding cognition, we identified five domain-specific subcategories, following the classification proposed by Kueider et al. (2012), partially adapted to the results that emerged from the review: (1) multiple domain, namely trainings focused on more than one cognitive skill at the same time, such as reasoning, episodic memory, and perceptual speed; (2) processing speed and reaction times (RTs), i.e., respectively, the ability to quickly process information (Shanahan et al., 2006), and the amount of time needed to process and respond to a stimulus, which is critical for handling information (Garrett, 2009); (3) memory, defined as the ability to retain, store, and recall information (Baddeley and Hitch, 1974), including many different types of memory, such as episodic, short-term, visual and spatial working memory; (4) task-switching/multitasking, defined together as attributes of the control processes involved in switching from one task to another (Dove et al., 2000); (5) mental spatial rotation, that is, the ability to mentally rotate an object (Shepard and Metzler, 1971). This categorization was chosen among the many proposed in the literature (e.g., Sala and Gobet, 2016; Stanmore et al., 2017; Bediou et al., 2018) because it fits the search results at hand and defines precise sub-categories of cognitive skills.
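The effect-size conversions mentioned under Variables Related to the Study can be illustrated with a minimal sketch. The review does not spell out the exact formulas, so the expressions below are the forms commonly attributed to the cited sources and should be read as assumptions: Cohen's f for eta squared, the Dunlap et al. correction for dependent t-tests, the Thalheimer and Cook expression for a two-group F, and the r-based route for a one-degree-of-freedom χ². All function and parameter names are illustrative.

```python
import math


def d_from_partial_eta_squared(eta_sq: float) -> float:
    """Cohen's d from (partial) eta squared via Cohen's f: d = 2 * sqrt(eta^2 / (1 - eta^2))."""
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("eta squared must lie in [0, 1)")
    return 2.0 * math.sqrt(eta_sq / (1.0 - eta_sq))


def d_from_dependent_t(t: float, r: float, n: int) -> float:
    """d from a dependent (paired) t-test: d = t * sqrt(2 * (1 - r) / n).

    r is the correlation between the paired measures, n the number of pairs.
    """
    return t * math.sqrt(2.0 * (1.0 - r) / n)


def d_from_f_two_groups(f: float, n_t: int, n_c: int) -> float:
    """d from a two-group ANOVA F (df = 1) with group sizes n_t and n_c."""
    total = n_t + n_c
    return math.sqrt(f * (total / (n_t * n_c)) * (total / (total - 2)))


def d_from_chi_square_1df(chi_sq: float, n: int) -> float:
    """d from a chi-square with one degree of freedom, via r = sqrt(chi^2 / N)."""
    r = math.sqrt(chi_sq / n)
    if r >= 1.0:
        raise ValueError("chi-square must be smaller than the sample size")
    return 2.0 * r / math.sqrt(1.0 - r * r)
```

For example, a partial eta squared of 0.2 corresponds to d = 1.0 under the first conversion. Which values of r and n a given study supplies has to be checked case by case.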

### RESULTS

### Papers Identified by Search Terms

A large number of papers (1,423) published in the time period between January 2012 and August 2017 was identified. As discussed in section Papers Selected Using Our Inclusion Criteria, this set of papers was further screened, obtaining a set of 35 relevant papers (see **Figure 1**).

### Papers Selected Using Our Inclusion Criteria

Applying the four inclusion criteria to these papers, 35 papers were identified (see **Table 1**). The largest number of papers was found in Scopus, followed by PsycINFO, PubMed, and Web of Science (Web of Knowledge).

## Analysis of Game Variables

#### Video Games Category

Considering the entirety of the studies, 42 commercial video games and 7 non-commercial video games have been tested as training tools for cognitive or emotional skills. As for video games used for cognitive enhancement specifically, a total of 38 commercial video games and 6 non-commercial video games have been adopted; concerning emotional enhancement, instead, 4 commercial games and 1 non-commercial game have been used as training tools in the studies included in this review.

#### Video Games Genres

Among the studies included in this systematic review, the genres of commercial games varied widely, with action games (15) being the most used, followed by puzzle games (8), brain training games (5), simulation games, driving-racing games, and exergames (3 games each), adventure games (2 games), and, finally, strategy games, arcade games, and horror games (1 game for each genre). Regarding training of cognitive skills specifically, action games (14) and puzzle games (7) were the most used commercial genres, followed by brain training games (5), simulation and driving-racing games (3 games each), exergames and adventure games (2 games each), and, finally, strategy games and arcade games (1 game for each genre). As for emotional training, only 1 study adopted a non-commercial video game, while a variety of commercial video games were used (1 horror game, 1 action game, 1 puzzle game, and 1 exergame).
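As a quick consistency check, the per-genre counts can be tallied against the category totals reported above (42 commercial games overall, 38 for cognitive training, and 4 for emotional training). The sketch below reads simulation, driving-racing, and exergames as three games each in the overall tally, and simulation and driving-racing as three games each in the cognitive tally; the dictionary names are illustrative only.

```python
# Commercial games per genre, as read from the text (assumed reading, see lead-in).
overall = {
    "action": 15, "puzzle": 8, "brain training": 5, "simulation": 3,
    "driving-racing": 3, "exergame": 3, "adventure": 2,
    "strategy": 1, "arcade": 1, "horror": 1,
}
cognitive = {
    "action": 14, "puzzle": 7, "brain training": 5, "simulation": 3,
    "driving-racing": 3, "exergame": 2, "adventure": 2,
    "strategy": 1, "arcade": 1,
}
emotional = {"horror": 1, "action": 1, "puzzle": 1, "exergame": 1}

# Totals per training target.
totals = {
    "overall": sum(overall.values()),
    "cognitive": sum(cognitive.values()),
    "emotional": sum(emotional.values()),
}

# Genres whose overall count differs from cognitive + emotional (empty if consistent).
mismatches = {
    genre: (count, cognitive.get(genre, 0) + emotional.get(genre, 0))
    for genre, count in overall.items()
    if count != cognitive.get(genre, 0) + emotional.get(genre, 0)
}

print(totals)
print(mismatches)
```

Under this reading the genre counts add up to the reported category totals, and every overall genre count equals the sum of its cognitive and emotional counts.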

#### Platform/Delivery

Considering the retrieved studies, games delivered via PC or laptop were the most popular in all categories (20 studies), followed by mobile (8 studies) and console (7 studies). Regarding cognitive training, 18 video games delivered via PC were used, 6 via console, 6 via mobile. As for emotional training, 2 video games were delivered via PC, 2 via console, and 1 via mobile.

### Analysis of Variables Related to the Study Sample

The mean number of participants in the included studies was 54.4 (cognition: M = 56.1; emotion: M = 42.8), ranging between 5 (Chandra et al., 2016) and 209 (Baniqued et al., 2013). The mean age of the samples was 24.2 (cognition: M = 23.8; emotion: M = 27.7).

#### Study Design

Overall, 28 of the studies included in the review used a randomized controlled trial (RCT) design, while 7 used a quasi-experimental design. The RCT was the design of choice in 24 studies of cognitive training (e.g., Hutchinson et al., 2016; Looi et al., 2016). A quasi-experimental design was instead adopted in six studies evaluating video game-based cognitive training (Mathewson et al., 2012; Montani et al., 2014). As for emotional training, four studies followed an RCT design (e.g., Bouchard et al., 2012), while one used a quasi-experimental design (Naugle et al., 2014).

#### Duration of the Training

The length of the trainings proposed by the studies included in this systematic review was rather heterogeneous, both in the number of sessions and in the number of weeks. In particular, the mean number of sessions was 10.1, ranging from 1 to 60 sessions, while the mean number of hours played was 13.5, ranging between 10 min and 50 h. For cognitive training, the number of sessions ranged from a minimum of one (e.g., Colzato et al., 2013; Cherney et al., 2014) to a maximum of 60 (Kühn et al., 2014). The number of hours spent playing also differed from study to study: from several minutes (Stroud and Whitbourne, 2015) up to 50 h (Green et al., 2012; Chandra et al., 2016). As for emotional training, the minimum number of sessions was also 1, while the maximum was 10 (Bailey and West, 2013); the minimum time spent playing was 25 min (Dennis and O'Toole, 2014; Dennis-Tiwary et al., 2016), and the maximum was 10 h (Bailey and West, 2013).

#### Measures Used for the Assessment of Outcomes

The outcome measures adopted in the studies included in this systematic review were, predictably, largely cognitive tests, used in a total of 33 studies: 30 related to cognitive training (e.g., Baniqued et al., 2014) and 3 to emotional training (e.g., Bailey and West, 2013). Nonetheless, numerous studies (19) included self-administered psychological questionnaires: 14 aimed at cognitive training (e.g., Chandra et al., 2016) and 5 at emotional training (e.g., Dennis and O'Toole, 2014). Physiological measures were used in a total of 2 studies, both on emotional training (e.g., Bouchard et al., 2012). fMRI-based assessments were used to measure the outcomes of cognitive training in two studies (e.g., Nikolaidis et al., 2014), and EEG assessments were used in a total of three studies: one related to cognitive training (i.e., Mathewson et al., 2012) and two to emotional training (e.g., Bailey and West, 2013).

### Analysis of Video Game Training Outcomes

#### Cognition

Thirty studies used cognitive domain-specific training programs including memory, task-switching/multitasking and mental spatial rotation. Across all cognitive trainings, the effect sizes' (Cohen's d) range was 0.141–3.43 for processing and RTs (M = 1.18), 0.06–1.82 for memory (M = 0.667), 0.54–1.91 for task-switching/multitasking (M = 1.11), and 0.3–3.2 for mental spatial rotation (M = 1.5).

TABLE 1 | Information about the video games variables of the selected studies.


not report any benefit of commercial video games over these particular skills (van Ravenzwaaij et al., 2014).


#### Emotion

Five studies tested video games as tools for training emotional skills (**Table 2**). Across all these training programs, the effect sizes' range (Cohen's d) was 0.201–3.01 (M = 0.897). First of all, playing a commercial action game resulted in brain changes related to the emotion processing of facial expressions, with a reduction in the allocation of attention to happy faces, suggesting that caution should be exercised when using action video games to modify visual processing (Bailey and West, 2013). Moreover, playing exergames at a self-selected intensity has been reported to positively influence emotional responses (enjoyment, changes in positive and negative affects) (Naugle et al., 2014). Interestingly, commercial video games have also been tested as a tool to provide interactive Stress Management Training (SMT) programs, mainly used for decreasing levels of perceived stress and negative effects. In particular, training with a commercial horror video game combined with arousal reduction strategies (e.g., exposure to stressful scenarios, traditional biofeedback techniques) has shown efficacy in increasing resilience to stress in soldiers, as observed through analyses of salivary cortisol level conducted along the training (Bouchard et al., 2012).


TABLE 2 | Information about the selected studies on video games for emotional training.

Regarding non-commercial video games, training with an ad hoc non-commercial video game has been shown to help trait-anxious adults handle emotional and physiological responses to stressors (Dennis and O'Toole, 2014), as well as improve behavioral performance in an anxiety-related stress task among female participants (Dennis-Tiwary et al., 2016).

### DISCUSSION

In the present systematic review, we examined experimental studies conducted with the aim of identifying research evidence about the impact of video game training on cognitive and emotional skills in the healthy adult population. The large number of papers (1,423) identified using our search terms confirmed that there has been a surge of interest in the use of games for this population, following the trend already observed for older adults (e.g., Lampit et al., 2014) and young people (e.g., Gomes et al., 2015). After the application of the inclusion criteria, 35 papers were finally included and described on the basis of important previous works, which provide a useful framework for organizing the research along key variables (Connolly et al., 2012; Kueider et al., 2012; Boyle et al., 2016).

With respect to video game variables, starting from the games' category, efficacy was demonstrated not only for non-commercial video games or commercial brain-training programs, but for commercial off-the-shelf video games as well. Interesting cases include Tetris, which proved more effective than a commercial brain training program (i.e., Brain Age) in improving cognitive skills such as short-term memory and processing speed (Nouchi et al., 2013), and Portal 2, which improved skills such as problem solving more effectively than a brain training program specifically developed for this purpose (i.e., Lumosity) (Shute et al., 2015). The finding that commercial video games, and not only ad hoc non-commercial games, can be useful for training cognitive and emotional capacities, if confirmed, opens the possibility of using commercial titles to train cognitive and emotional abilities in the adult population. This could mean increasing adherence to training, keeping the trainee engaged with an effective feedback system (Cowley et al., 2008), and enhancing the accessibility of training programs in terms of costs and ease of access to treatment, since it would be sufficient to simply have a console or another gaming device.

As for the distribution of game genre, considering only commercial games, no genre prevails in emotional training, while in cognitive training action games are the most commonly used, followed by puzzle games and brain training games. This result is not surprising, as previous literature indicates that action games are the class of video games that has been scientifically assessed for the longest time (e.g., Adachi and Willoughby, 2011), similarly to puzzle games (e.g., Carvalho et al., 2012) and brain training games (e.g., Owen et al., 2010).

Results showed that the delivery platform of choice for more than half of the included studies was the PC, distantly followed by games delivered via consoles or via mobile. This distribution holds for both commercial and non-commercial games, and various reasons for this consistency can be hypothesized. Future studies should investigate mobile training in particular, which, because of its potential ubiquity, low cost, and potentially real-time use, could offer unique advantages over traditional tools such as PCs.

Regarding the variables related to the studies, namely the sample characteristics, the results of this systematic review showed that the majority of studies have been conducted on young adults (18–35 years) rather than middle-aged adults (35–55 years). A possible explanation is that many studies enlisted college students as participants, for ease of recruitment. However, the distinction between young and middle-aged adults can be particularly relevant. As reported in the scientific literature, the effects of the so-called inverted U curve of neuroplasticity and cognitive performance, and of perceived stress, start to become evident during middle age (Cao et al., 2014; Zhao et al., 2015). Moreover, these two age ranges differ strongly in their knowledge and use of video games. For these reasons, future studies should better investigate differences and similarities between young and middle-aged adults, for instance to identify at which point in the life span a game-based cognitive or emotional treatment would potentially be more effective.

Secondly, regarding the experimental design adopted in the studies, results show that the majority of studies were conducted using an RCT design. This seems to be linked to the need for evidence from well-controlled studies, in contrast to previous studies in which weaker methods (e.g., survey, correlational design) were used. It will be important for future studies to continue using this type of experimental design, which is considered the most reliable empirical design for proving a treatment's effectiveness while minimizing the impact of confounding variables (Levin, 2007).

The outcome measures adopted in the studies included in this systematic review were, predictably, largely cognitive tests (e.g., Blacker et al., 2014). Nonetheless, numerous studies included self-administered psychological questionnaires (e.g., Nouchi et al., 2013), physiological measures (e.g., Naugle et al., 2014), EEG-based assessment measures (e.g., Dennis-Tiwary et al., 2016), and fMRI-based assessment measures (e.g., Kable et al., 2017), which seem to be more reliable in assessing change over time. Openness to such forms of assessment is therefore desirable for strengthening the empirical evidence.

The length of the training programs proposed by the studies included in this systematic review was rather heterogeneous, both in the number of sessions and in the number of weeks: from a minimum of one session (e.g., Colzato et al., 2013; Cherney et al., 2014) to a maximum of 60 sessions (Kühn et al., 2014), and with gameplay time ranging from 10 min to 50 h (Green et al., 2012; Chandra et al., 2016). Since the duration and intensity of training have been reported to be a relevant variable, with an important impact on the accessibility and feasibility of the training itself (Hempel et al., 2004), future studies should address these aspects in detail, for instance by comparing the effectiveness of shorter and longer trainings in order to identify the minimum number of sessions needed for an effective program.

Finally, regarding the training outcome, based on this review, video games appear to hold promise for improving both cognitive and emotional skills in the healthy adult population. Empirical evidence was identified for all the training outcomes (i.e., cognition: multiple domain, processing speed and RTs, memory, task-switching/multitasking, mental spatial rotation; emotion).

Effect sizes (Cohen's d) for cognitive training, in general, ranged from 0.06 to 3.43: in particular from 0.141 to 3.43 for processing and RTs, 0.06 to 1.82 for memory, 0.54 to 1.91 for task-switching/multitasking, and 0.3 to 3.2 for mental spatial rotation (**Table S1**). Effect sizes reported in this systematic review are comparable to those reported for video game interventions aimed at enhancing cognitive skills of senior populations (Kueider et al., 2012; Lampit et al., 2014). For instance, a systematic review of computerized cognitive training with older adults reported standardized pre-post training gains ranging from 0.09 to 1.70 after video game interventions, which appears similar to the values that emerged from traditional (0.06–6.32) or computerized (0.19–7.14) trainings (Kueider et al., 2012).
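To put the mean effect sizes reported in the Results in context, they can be labeled with Cohen's conventional benchmarks (d of 0.2 as small, 0.5 as medium, 0.8 as large). These thresholds are a general convention and are not part of this review's method; the function name and the "negligible" label for values below 0.2 are illustrative assumptions.

```python
def cohen_label(d: float) -> str:
    """Label a Cohen's d value using the conventional 0.2 / 0.5 / 0.8 benchmarks."""
    magnitude = abs(d)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "negligible"  # illustrative label for values below the "small" benchmark


# Mean effect sizes (Cohen's d) per training outcome, as reported in the Results.
mean_d = {
    "processing speed and RTs": 1.18,
    "memory": 0.667,
    "task-switching/multitasking": 1.11,
    "mental spatial rotation": 1.5,
    "emotion": 0.897,
}

labels = {outcome: cohen_label(d) for outcome, d in mean_d.items()}
for outcome, label in labels.items():
    print(f"{outcome}: M = {mean_d[outcome]} ({label})")
```

By these benchmarks, the mean effect for memory falls in the medium band and the other means fall in the large band, although the wide ranges and small numbers of studies in some domains call for caution.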

Based on the studies reviewed, the largest impact of video game trainings for cognitive skills was found on processing speed and RTs, as these cognitive domains presented the larger effect sizes. In particular, training with action games (Green et al., 2012; Wang et al., 2014), FPS games (Colzato et al., 2013; Hutchinson et al., 2016), adventure games (Li et al., 2016), and puzzle games (Stroud and Whitbourne, 2015) has been observed to enhance these skills in healthy adults. In only one case were no benefits for these skills reported after training with commercial video games (van Ravenzwaaij et al., 2014). The possibility of training processing speed and RTs with video games, especially action video games, is one of the main interests of the video game and cognitive training literature, in spite of mixed results about its effectiveness (e.g., Dye et al., 2009; Wang et al., 2016); further investigation is therefore needed. For instance, action video game novices assigned to action video game training showed faster visual information processing according to one study (Castel et al., 2005), while no improvement was reported for seniors involved in a brief training (Seçer and Satyen, 2014).

Results were generally positive across studies on training of memory as well. In particular, improvements in visual and spatial working memory have been observed after training with an action game (e.g., Blacker et al., 2014), an adventure game (Clemenson and Stark, 2015), and a non-commercial game (Looi et al., 2016). Concerning other forms of memory, a positive effect of adventure game-based training on mnemonic discrimination was reported in one study (Clemenson and Stark, 2015), while improvements in short-term memory skills were observed after a brain training program (Nouchi et al., 2013). By contrast, no positive effects on episodic memory or on visual and spatial working memory were reported after training with puzzle games (Baniqued et al., 2014). The findings of the studies included in this review appear to be in line with previous evidence that video games can be used effectively to enhance the memory skills of young and older populations, in particular visual and spatial working memory (e.g., Wilms et al., 2013; Toril et al., 2014). It is nonetheless important to highlight that, in this systematic review and in previous literature, the efficacy (or ineffectiveness) of each training seems to differ on the basis of the specific game genre, as well as of the sample characteristics (e.g., Baniqued et al., 2013; Oei and Patterson, 2015; Chandra et al., 2016). Future studies are therefore necessary to better investigate the role of video games in memory training.

Regarding mental spatial rotation, even though the effect sizes are high on average, only two studies were included in this review; the results should therefore be interpreted in light of this small number. In this systematic review, an enhancement of mental spatial rotation abilities was reported after training with commercial exergames and driving-racing games, with a greater improvement for women (Cherney et al., 2014), while no improvement was observed after training with other commercial games (one exergame and several action games) (Dominiak and Wiemeyer, 2016). Since early findings in this research field reported evidence of enhanced performance in spatial relations after video game training in elders (e.g., Maillot et al., 2012) and children (Subrahmanyam and Greenfield, 1994), future studies should verify in depth the possible usefulness of video games for training this cognitive skill in adults specifically.

As for task-switching/multitasking, in spite of high effect sizes suggesting the effectiveness of video game trainings for these skills, it is once again important to underline the limited number of studies considered (three). According to the included studies, the cost of dual tasking and the cost of task-switching decreased after training with a commercial puzzle game (Oei and Patterson, 2014), as well as with a custom-made video game (Montani et al., 2014; Parong et al., 2017). Because video games by their nature require complex planning and strategizing, their use for this purpose appears to be rather significant, as it could potentially allow training or rehabilitation of these cognitive skills (e.g., Boot et al., 2008). The literature nonetheless still presents mixed results, not always positive (e.g., Green et al., 2012), and for this reason future studies providing an in-depth analysis are still necessary.

Finally, regarding video games for the training of emotional skills, effect sizes ranged from 0.201 to 3.01. Despite the generally high values, it is currently impossible to compare them with results from other systematic reviews or meta-analyses on the same topic, as the few works on the subject do not provide any information about effect sizes (e.g., Villani et al., 2018). The studies included in this review provide evidence suggesting that non-commercial video games (Dennis and O'Toole, 2014; Dennis-Tiwary et al., 2016) and commercial video games (exergames and horror games) can be effective in inducing positive emotions and in reducing individual levels of stress in healthy adults (Bouchard et al., 2012; Naugle et al., 2014). From this review, it appears that fewer studies have been conducted on this kind of training than on cognitive training. This is rather curious, because the intrinsic characteristics of video games, which are motivating, engaging, and easily accessible (Granic et al., 2014), make them potentially useful tools for improving individuals' emotion regulation. Future studies will be fundamental in order to explore the potential of video games as emotional training tools, and to identify the most effective game genres for this purpose, examining potentially interesting genres that have not been investigated yet (e.g., affective gaming, virtual reality-based gaming).

#### Limitations

As with all literature reviews, the current review does not claim to be comprehensive, but summarizes the current research on video games for cognitive and emotional training in the adult population based on the specific key words used in the search string, the databases included, and the time period of the review. Moreover, in this review we based our choice of categories on a specific model (Connolly et al., 2012; Kueider et al., 2012; Boyle et al., 2016); however, the level of specificity and distinctiveness of different categories is an ongoing discussion in the scientific world, both in relation to the outcomes of cognitive and emotional trainings and to the analysis of video games. Finally, the follow-up effect of video game training was not specifically addressed in this review, since a very limited number of studies provided follow-up tests.

#### Future Directions

The present systematic review provides several directions for future studies in this research field. First of all, further studies are needed to better examine the effects of video games on cognitive and emotional skills, especially in middle-aged adults, a population that has been investigated in a limited number of studies. Secondly, one of the biggest unresolved issues appears to be the generalizability of improvements: up to now, only short-term effects and specific improvements have been recorded in most studies (e.g., Hardy et al., 2015; Tárrega et al., 2015). In addition, video game characteristics (e.g., genre, platform) in relation to trained skills should be further investigated in the future, in order to create specific and effective training programs.

### CONCLUSION

To summarize, the present systematic review provides evidence of benefits of video game trainings on cognitive and emotional skills in the healthy adult population, especially in young adults. Efficacy has been demonstrated not only for non-commercial video games or commercial brain-training programs, but for commercial video games as well. As for the distribution of game genre, action games are the most commonly used, followed by puzzle games. Finally, in this review, empirical evidence was identified for all the training outcomes, showing the potential effectiveness of video games for the training of both cognitive (i.e., multiple domain, processing speed and RTs, memory, task-switching/multitasking, mental spatial rotation) and emotional skills.

#### AUTHOR CONTRIBUTIONS

FP, AF, and FM conceived the idea of this systematic review. FP and AF examined the included studies and wrote their description. FM supervised the scientific asset. FP and AF wrote the first draft of the paper. All the authors read and approved the final version of the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02127/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Pallavicini, Ferrari and Mantovani. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Virtual Reality Analgesia for Pediatric Dental Patients

Barbara Atzori<sup>1</sup>, Rosapia Lauro Grotto<sup>1,2</sup>, Andrea Giugni<sup>3</sup>, Massimo Calabrò<sup>3</sup>, Wadee Alhalabi<sup>4,5</sup> and Hunter G. Hoffman<sup>6</sup>\*

<sup>1</sup> Department of Health Sciences, Università degli Studi di Firenze, Florence, Italy, <sup>2</sup> Multidisciplinary Analysis of Relationships in Health Care (M.A.R.H.C.) Joint Laboratory, Uniser and Università degli Studi di Firenze, Florence, Italy, <sup>3</sup> Medical Practitioners and Dentists Board, Prato, Italy, <sup>4</sup> Department of Computer Science, King Abdulaziz University, Jeddah, Saudi Arabia, <sup>5</sup> Computer Science, Effat University, Jeddah, Saudi Arabia, <sup>6</sup> Mechanical Engineering, University of Washington, Seattle, WA, United States

Background: Dental procedures often elicit pain and fear in pediatric dental patients.

Aim: To evaluate the feasibility and effectiveness of immersive virtual reality as an attention distraction analgesia technique for pain management in children and adolescents undergoing painful dental procedures.

#### Edited by:

Federica Pallavicini, Università degli Studi di Milano Bicocca, Italy

#### Reviewed by:

Marco Fyfe Pietro Gillies, Goldsmiths, University of London, United Kingdom Yong Liu, Universität Hamburg, Germany

> \*Correspondence: Hunter G. Hoffman hoontair@gmail.com

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 21 July 2018 Accepted: 31 October 2018 Published: 23 November 2018

#### Citation:

Atzori B, Lauro Grotto R, Giugni A, Calabrò M, Alhalabi W and Hoffman HG (2018) Virtual Reality Analgesia for Pediatric Dental Patients. Front. Psychol. 9:2265. doi: 10.3389/fpsyg.2018.02265

Design: Using a within-subjects design, five patients (mean age 13.20 years old, SD 2.39) participated. Patients received tethered immersive interactive virtual reality distraction in an Oculus Rift VR helmet (experimental condition) during one dental procedure (a single dental filling or tooth extraction). On a different visit to the same dentist (e.g., 1 week later), each patient also received a comparable dental procedure during the control condition "treatment as usual" (treatment order randomized). After each procedure, children self-rated their "worst pain," "pain unpleasantness," "time spent thinking about pain," "presence in VR," "fun," and "nausea" levels during the dental procedures, using graphic rating scales.

Results: Patients reported significantly lower "worst pain" and "pain unpleasantness," and had significantly more fun during VR, compared to a comparable dental procedure with No VR. Using Oculus Rift VR goggles, patients reported a "strong sense of going inside the computer-generated world," without side effects. The dentist preferred having the patients in VR.

Conclusion: Results of this pilot study provide preliminary evidence of the feasibility of using immersive, interactive VR to distract pediatric dental patients and increase fun of children during dental procedures.

Keywords: virtual reality, pain, analgesia, attention, distraction, dental, dental caries, children

### INTRODUCTION

#### Traditional Analgesia

Pain during dental procedures is common, especially during invasive dental treatments such as tooth extractions or dental cavity fillings (Costa et al., 2012). Although local analgesics are routinely used to help control patients' pain during dental procedures, pediatric patients often experience pain and anxiety during dental procedures (Guelman, 2005). Experiencing pain and anxiety during dental procedures can result in several negative consequences, such as higher levels of dental fear, uncooperative behaviors and a general dissatisfaction of the patient with dental care (Guelman, 2005). Unpleasant early dental/medical experiences can affect patients' perception of healthcare, can increase pain and suffering during subsequent medical visits, and can reduce preventative healthcare, affecting lifelong health (El-Housseiny et al., 2014). El-Housseiny et al. (2014) recently conducted a study on dental fears in children. The children most prominently reported the following fears: 'fear of usual dental procedures and injections,' 'fear of strangers' (i.e., the dentist), 'fear of general medical aspects of treatment,' and 'fear of health care personnel.' Children with a fear of the dentist have more cavities/caries, and visit dentists less often, than children who do not have a fear of the dentist (Milsom et al., 2003). One study found that over half of children with dental fears became difficult to handle or exhibited problematic behaviors during the dental procedures (Goumans et al., 2004). It is recommended that children visit a dentist every 6 to 12 months, and for good reason. In one recent study of school children (aged 9–12 years) from a randomly selected sample of primary schools in the Sharfia area of Jeddah, Saudi Arabia, over 75% of the children had one or more carious first permanent molars (i.e., cavities/tooth decay). With regular visits to the dentist, cavities and more serious problems can often be prevented, but many children do not want to go to the dentist. Children's learned aversion to visiting the dentist could be prevented by making dental visits less painful and more fun.

Ironically, when patients avoid going to the dentist, what could have been treated early as a tooth filling (preventative medicine), left untreated, may lead to advanced tooth decay such that patients require tooth extraction and/or root canal. Inflammation of the gums surrounding the infected tooth makes the dental care more painful, and healing after surgery takes longer with more advanced tooth decay. In some cases, unpleasant dental experiences can generalize to avoidance of healthcare in general.

Several techniques, both pharmacological and psychological, can be used to reduce patients' pain and anxiety during dental procedures. Local anesthesia is the most frequent pharmacological technique to reduce dental pain. Ideally, local anesthesia results in complete absence of pain in the anesthetized area during dental procedures. However, local anesthesia requires an injection into the jaw with a long needle, and patients often refuse it because they consider the injection painful and/or because they fear/avoid needles (Kuscu and Akiuz, 2007). Among the psychological techniques for pain management, distraction is a simple non-drug pain control technique that can be used in addition to traditional pain medications to help control acute pain during medical procedures. According to the Attention Pain Theory by Eccleston and Crombez (1999), distraction can reduce the amount of attentional resources the patient's brain has available to process incoming neural signals from pain receptors, resulting in a reduced subjective pain experience. However, the effectiveness of traditional distractions, such as music, for reducing pain and fear is often limited (Aitken et al., 2002; Koller and Goldman, 2012; Bellieni et al., 2013).

## Virtual Reality Analgesia via Attention Distraction

Virtual Reality (VR) analgesia is showing promise as an effective pain distraction technique for helping reduce the suffering and increasing the amount of fun children experience during painful medical procedures (Hoffman, 1998, 2004; Hoffman et al., 2006, 2011; Atzori et al., 2017).

The essence of immersive virtual reality is the user's illusion of going inside the 3D computer generated world, as if the virtual world is a place the patient is visiting. Researchers propose the following explanation for why VR reduces pain (Hoffman, 1998; Hoffman et al., 2000, 2011). "Being there" in the virtual world floods the brain with information. The brain is so preoccupied with processing information presented via virtual reality that the patient has less attention available to process incoming pain signals. VR allows the user to be immersed in a computer-generated environment. Patients wear a Head Mounted Display (HMD) that blocks the patient's view of the real world, substituting computer generated visual images and sound effects. VR may also tap into a natural desire of patients to "escape" from painful situations. Among adult dental patients, preliminary studies have shown virtual reality was effective for helping reduce pain in patients undergoing periodontal scaling and root planing (Hoffman et al., 2001; Furman et al., 2009) and unspecified dental procedures (Wiederhold et al., 2014). A growing number of studies have shown the effectiveness of VR for reducing pain of severe burn patients, including children (see Hoffman et al., 2011 for a review), but the effectiveness of highly immersive, interactive Oculus Rift VR for reducing pediatric dental pain and increasing fun during dental procedures is currently unknown.

The current pilot study is the first to explore the feasibility, acceptability and the effectiveness of immersive VR to reduce pain during dental procedures such as dental fillings, and to explore the dentist's thoughts and opinions on the feasibility/applicability of this technique during dental procedures.

## MATERIALS AND METHODS

### Experimental Subjects

Over a period of 6 months, patients aged 7–17 years who needed dental fillings or a tooth extraction during two visits were recruited. Patients were selected in a private dental practice in the city of Prato (Italy) with the help of the staff assistant who schedules patients, according to the following criteria based on the existing literature (Atzori et al., 2017). To be included, children and adolescents had to be able to understand the Italian language and complete the tests, and had to be able to wear the helmet and interact with the VR environment, without any physical or psychological impairments. Patients were excluded from the study if they needed other kinds of procedures during the same visit, if they had a diagnosis of epilepsy, if they were not accompanied by parents, or if they were older than 17 years or younger than 7 years.

Five patients, three males (aged 11, 12, and 14 years old), and two females (aged 12 and 17 years) met the inclusion criteria and underwent tooth extraction or dental fillings on two dental visits separated by at least 1 week between visits.

### Procedure

The protocol used in the current study was approved by the IRB ethics committee at the University of Florence, Italy. The study was undertaken with the understanding, approval and written consent of each subject and their parent/guardian. The protocol was conducted under internationally accepted ethical standards and was approved by the dentists of the Dental Private Practice. Children and adolescents meeting the inclusion and exclusion criteria were referred to the psychologist researcher. Selected patients and their parents were approached in the waiting room of the Dental Practice to determine if they had any interest in participating. Interested families accompanied the psychologist researcher into a private room where they were informed about what would be involved, and if they were interested, they signed written informed assent/consent forms. Each patient received VR during one dental procedure, and received no VR during a second comparable procedure on a different day (e.g., 1 week later). Using a within-subjects crossover design, with treatment order randomized, each patient received Yes VR on one visit and No VR on the other (either Yes VR first and No VR second, or No VR first and Yes VR second). No reward was given to patients for participating.

#### Measures

Pain levels, the quality of the VR experience, nausea and fun were measured using the Italian translation of the 0–10 graphic rating scale (GRS; Tesler et al., 1991; Hoffman et al., 2014) questionnaire previously adopted to evaluate these outcomes (Hoffman et al., 2006). The cognitive, affective and sensory components of pain were evaluated by asking patients to respond to the following questions with a score between 0 and 10: (1) Rate your WORST PAIN during the most recent pain stimulus (pain intensity): 0 no pain at all; 1–4 mild pain; 5–6 moderate pain; 7–9 severe pain; 10 excruciating pain. (2) How much TIME did you spend thinking about your pain during this most recent pain stimulus? (10-cm line with numeric and word descriptors beneath it: 0 none of the time; 1–4 some of the time; 5 half of the time; 6–9 most of the time; and 10 all of the time). (3) How UNPLEASANT was the most recent pain stimulus? (10-cm line with numeric and word descriptors beneath it: 0 not unpleasant at all; 1–4 mildly unpleasant; 5–6 moderately unpleasant; 7–9 severely unpleasant; and 10 excruciatingly unpleasant). (4) How much FUN did you have during the most recent pain stimulus? (10-cm line with numeric and verbal descriptors: 0 no fun at all; 1–4 mildly fun; 5–6 moderately fun; 7–9 pretty fun; 10 extremely fun).

Patients were asked to respond to the following questions with a score between 0 and 10: While experiencing the virtual world, to what extent did you feel like you WENT INSIDE the virtual world? (10-cm line with numeric and verbal descriptors: 0 = I did not feel like I went inside at all; 1–4 mild sense of going inside; 5–6 moderate sense of going inside; 7–9 strong sense of going inside; 10 I went completely inside the virtual world). How REAL did the virtual objects seem to you during virtual reality? (0 = completely fake, 1–4 = somewhat real, 5 = moderately real, 6–9 = very real, 10 = indistinguishable from a real object). To what extent (if at all) did you feel nausea (sick to your stomach) as a result of experiencing the virtual world during the most recent VR session? (0 = no nausea at all, 1–4 = mild nausea, 5 = moderate nausea, 7–9 = severe nausea, 10 = vomit).
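The verbal anchors of the graphic rating scales described above can be written down as a simple lookup. The following Python sketch is illustrative only and is not part of the study materials: the names `WORST_PAIN_BANDS`, `FUN_BANDS`, `PRESENCE_BANDS` and `descriptor` are hypothetical, and rounding a non-integer mean to the nearest whole number before the lookup is an assumption made here so that group means can be labeled.

```python
from bisect import bisect_left

# Each band is (highest integer score in the band, verbal descriptor),
# copied from the wording of the 0-10 scales in the text.
WORST_PAIN_BANDS = [
    (0, "no pain at all"),
    (4, "mild pain"),
    (6, "moderate pain"),
    (9, "severe pain"),
    (10, "excruciating pain"),
]
FUN_BANDS = [
    (0, "no fun at all"),
    (4, "mildly fun"),
    (6, "moderately fun"),
    (9, "pretty fun"),
    (10, "extremely fun"),
]
PRESENCE_BANDS = [
    (0, "did not feel like I went inside at all"),
    (4, "mild sense of going inside"),
    (6, "moderate sense of going inside"),
    (9, "strong sense of going inside"),
    (10, "went completely inside the virtual world"),
]


def descriptor(score, bands):
    """Return the verbal descriptor whose band contains a 0-10 rating."""
    if not 0 <= score <= 10:
        raise ValueError("ratings must lie between 0 and 10")
    # Assumption (not stated in the paper): non-integer values such as
    # group means are rounded to the nearest integer before the lookup.
    rounded = round(score)
    upper_bounds = [upper for upper, _ in bands]
    return bands[bisect_left(upper_bounds, rounded)][1]
```

For instance, `descriptor(8, FUN_BANDS)` returns `"pretty fun"` and `descriptor(3, WORST_PAIN_BANDS)` returns `"mild pain"`.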

The dentist's experience during the procedure while the patients were using VR distraction, was investigated with a semistructured interview of 30 min conducted by a psychologist at the end of data collecting (July 2016). The dentist answered the following questions: (1) How did you feel when you performed the procedure and the patient was interacting with VR, compared to the standard routine? (2) What do you think about patients' experience during VR? (3) Did you find any impediment for the use of VR during dental procedures? and (4) Do you have any suggestion to improve VR distraction?

### Immersive Virtual Reality System

The current study used Oculus Rift DK2 and CV1 virtual reality goggles<sup>1</sup>, with two miniature computer screens, one screen per eye. The goggles received video and audio input from an MSI GT Series GT72 Dominator Pro G-1252 Gaming Laptop 6th Generation Intel Core i7 6700HQ (2.60 GHz) 16 GB Memory 1 TB HDD 512 GB SSD NVIDIA GeForce GTX 980M 4 GB GDDR5 17.3" with Windows 10 Home 64-Bit. Patients interacted with SnowWorld<sup>2</sup>, a virtual environment specifically designed for pain management of immobilized patients with severe burn injuries during painful procedures such as wound cleaning and range of motion exercises (Hoffman et al., 2001). In SnowWorld, patients have the illusion of going into an icy canyon where they throw snowballs at penguins, snowmen and other characters. The patient interacted with the virtual environment using a wireless mouse. SnowWorld VR software is specifically designed to be distracting, pleasant and non-nauseogenic, and to be used by patients who need to keep their heads and bodies still during the medical procedure (Hoffman et al., 2001). Traditional VR gaming software (which typically encourages head and body movements) could not be used by dental patients, who must remain very still during the dental procedures.

### Data Analysis

Within-subjects, paired t-tests were adopted to compare pain, nausea and fun levels between the "No VR" condition and the "Yes VR" condition. A researcher not involved in data collection carried out data analysis using the statistical Software SPSS 23. Results were considered significant when associated with p-values less than 0.05, using two tailed paired t-tests.
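As an illustration of the analysis described above, the following Python sketch computes a within-subjects paired t statistic using only the standard library. The study itself used SPSS 23; the function name `paired_t` and the ratings below are hypothetical and are not the study data. The critical value 2.776 is the standard two-tailed threshold for p < 0.05 with 4 degrees of freedom (five patients).

```python
import math
from statistics import mean, stdev

# Two-tailed critical t value for alpha = 0.05 with df = 4 (n = 5 patients).
T_CRIT_DF4 = 2.776


def paired_t(no_vr, yes_vr):
    """Return (t, df, sd_of_differences) for paired ratings."""
    if len(no_vr) != len(yes_vr):
        raise ValueError("each patient needs one rating per condition")
    # A paired test works on each patient's own difference score.
    diffs = [a - b for a, b in zip(no_vr, yes_vr)]
    n = len(diffs)
    sd_diff = stdev(diffs)  # sample SD (n - 1 in the denominator)
    t = mean(diffs) / (sd_diff / math.sqrt(n))
    return t, n - 1, sd_diff


# Hypothetical 0-10 ratings for five patients (NOT the study data).
no_vr_ratings = [5, 3, 6, 2, 4]
yes_vr_ratings = [3, 3, 2, 1, 2]

t, df, sd_diff = paired_t(no_vr_ratings, yes_vr_ratings)
significant = abs(t) > T_CRIT_DF4
```

Because each patient serves as his or her own control, between-patient variability drops out of the comparison, which is what makes the design usable with a very small sample.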

<sup>1</sup>www.oculusvr.com

<sup>2</sup>www.vrpain.com

### RESULTS


#### Pain

Mean pain ratings were significantly lower during VR compared to the control condition for the affective and sensory components of pain. The mean "pain unpleasantness" during No VR was 2.40 (SE = 1.52), and dropped to 0.60 (SD = 0.55) during virtual reality, t(4) = 3.67, p < 0.05, SD of the paired differences = 1.10. The mean "worst pain" was 3.80 (SD = 2.59) during No VR, and this dropped to 2.20 (SD = 1.79) during virtual reality, t(4) = 3.14, p < 0.05, SD of the paired differences = 1.14. One patient showed no reduction in pain during VR; the other four patients all reported reductions in pain during VR. Although the difference in "time spent thinking about pain" was not statistically significant, responses showed the predicted pattern of results. Patients spent more time thinking about their pain during No VR (mean = 2.60, SD = 1.95) vs. during Yes VR (mean = 1.00, SD = 1.00), t(4) = 2.36, p = 0.08 NS, SD of the paired differences = 1.52.
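The reported t values can be approximately recovered from the summary statistics above. The Python sketch below assumes that the final "SD" in each result is the standard deviation of the paired (No VR minus Yes VR) differences, an interpretation that is not stated explicitly in the text; the names `t_from_summary` and `REPORTED` are hypothetical.

```python
import math

N_PATIENTS = 5


def t_from_summary(mean_no_vr, mean_yes_vr, sd_diff, n=N_PATIENTS):
    """Paired t statistic from condition means and the SD of differences."""
    standard_error = sd_diff / math.sqrt(n)
    return (mean_no_vr - mean_yes_vr) / standard_error


# measure: (mean No VR, mean Yes VR, trailing SD, reported t), from the text.
REPORTED = {
    "pain unpleasantness": (2.40, 0.60, 1.10, 3.67),
    "worst pain": (3.80, 2.20, 1.14, 3.14),
    "time spent thinking about pain": (2.60, 1.00, 1.52, 2.36),
}

# Difference between recomputed and reported t for each measure.
discrepancies = {
    measure: abs(t_from_summary(m_no, m_yes, sd) - t_reported)
    for measure, (m_no, m_yes, sd, t_reported) in REPORTED.items()
}
```

Under this assumption the recomputed values agree with the reported ones to within rounding of the published means and SDs, which supports reading the trailing SD as that of the difference scores.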

### Quality of IVR Experience, Fun and Nausea

While interacting with VR during the painful dental procedure, patients reported mean presence ratings of 7.40 (SD = 2.70), corresponding to "a strong sense of going inside the computer generated world," and a mean of 7.40 (SD = 1.82) for the realism of VR objects, corresponding to "very real." Mean nausea ratings were negligible in both conditions (<1 on a 0–10 scale). When interacting with VR, patients reported significantly higher levels of fun during the painful procedure, compared to the control condition. Fun during No VR (mean = 3.20, SD = 4.32) was "mildly fun" vs. "pretty fun" during Yes VR (mean = 8.20, SD = 2.49), t(4) = 2.80, p < 0.05, SD of the paired differences = 4.00.

### Dentist's Experience

All patients had the same dentist. The dentist who performed all ten procedures (nine dental fillings and one tooth extraction) was also one of the authors of the current manuscript.

During a semi-structured interview after the study, the dentist made the following observations and comments. (1) The dentist felt more relaxed and was able to be more concentrated on his job when he performed the dental procedures while the patient was interacting with VR, compared to the routine standard care.

(2) The dentist considered patients less stressed during the interaction with VR and thought that they felt less pain than during the standard care. In the dentist's opinion, the VR system was suitable both for children and adolescents, because all patients reported fun during the interaction with the virtual world and many of them wanted to continue playing after the end of the procedure. The dentist considered VR to be an effective distraction technique especially for patients with high levels of anxiety.

(3) No impediment or contraindication emerged. In the dentist's opinion, the VR goggles/VR system didn't impede the dentist's ability to perform the dental procedures, and the dentist was able to easily communicate with the patient. The dentist found SnowWorld suitable for all patients.

(4) The dentist highlighted the need for new VR software with a "hot" scenario in the future, because he was concerned that the illusion of cold sensations could evoke pain during dental procedures in some patients (e.g., patients with cold-sensitive teeth). Moreover, he expressed the desire to extend the use of VR pain management to adult patients.

### DISCUSSION

The current pilot study was conducted as a proof of concept, to explore the feasibility of using a new generation of mass produced commercially available Virtual Reality to distract children during painful/fear inducing dental procedures. The current study tested the effects of immersive, interactive Oculus Rift virtual reality distraction as a psychological technique to control pain during tooth extraction and dental fillings/cavities in children and adolescents. Based on the Interruption of Attention Pain Model by Eccleston and Crombez, we predicted that patients focusing their attentional resources on the virtual environment would experience less pain, including the cognitive, affective and sensory components of pain, and we predicted patients would report having more fun during their dental procedure on the day they received VR compared to the day they received standard of care with No VR. During VR, patients reported a significant 42% reduction in their "worst pain" ratings and a significant 75% reduction in their ratings of "pain unpleasantness," and patients reported a significant 61% increase in their ratings of how much fun they experienced during the dental procedure. Despite having to keep their heads still, and using a computer mouse to look around and shoot snowballs at objects in the virtual world, patients reported a "strong sense of going inside the computer-generated world" during VR, without side effects.
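The percentages quoted above follow from the condition means given in the Results. The short Python sketch below reproduces that arithmetic; the function name `percent_of` is hypothetical. The two pain reductions are expressed relative to the No VR mean. The 61% figure for fun matches the difference expressed relative to the Yes VR mean, whereas relative to the No VR mean the same difference would be roughly 156%; which base the authors intended is not stated, so this is only an observation about the arithmetic.

```python
def percent_of(difference, base):
    """Express a difference between two means as a percentage of a base."""
    return 100.0 * difference / base


# Condition means taken from the Results (No VR, Yes VR).
worst_pain = (3.80, 2.20)
unpleasantness = (2.40, 0.60)
fun = (3.20, 8.20)

# Pain reductions relative to the No VR (control) mean.
worst_pain_reduction = percent_of(worst_pain[0] - worst_pain[1], worst_pain[0])
unpleasantness_reduction = percent_of(
    unpleasantness[0] - unpleasantness[1], unpleasantness[0]
)

# Fun difference expressed against each of the two possible bases.
fun_gain = fun[1] - fun[0]
fun_relative_to_yes_vr = percent_of(fun_gain, fun[1])
fun_relative_to_no_vr = percent_of(fun_gain, fun[0])
```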

Although promising, the current pilot study has several limitations. First, the dentist who reported the "Dentist's experience" is one of the authors, raising a high risk of confirmation bias. The small sample size is another limitation. Studies with small sample sizes are vulnerable to the possibility that results may be biased if a single patient reports an extreme value that overshadows other patients' responses. Fortunately, this was not a problem in the current study. One patient showed no reduction in pain during VR; the other four patients all reported reductions in pain during VR, with no outliers. However, small sample size is always an important concern. Because it is not possible to make broad scientific conclusions based on the results of studies that use small sample sizes (Campbell and Stanley, 1963), the current study must be followed up with larger, more carefully controlled studies. Another limitation is that the patients only received VR during one visit. Moreover, because all patients used VR for the first time during the current research, results could possibly be due in part to a novelty effect. Future research is needed to determine whether virtual reality continues to reduce pain when used repeatedly, and ideally to compare immersive VR to other distraction techniques using emerging technologies (e.g., augmented reality with see-through glasses).

The current study showed no problems with VR induced nausea. Having patients keep their heads still (crucial in the current study) greatly reduces the computational demands on the high-performance gaming VR computer, reducing lag that can lead to VR simulator sickness/motion sickness in some people (for tips on using VR with children, see<sup>3,4</sup>). The current results also provide preliminary evidence that VR can be an effective technique to promote good emotions and help patients cope with painful procedures in a non-stressful manner. Indeed, patients who interacted with VR during tooth extraction and dental fillings reported significantly higher levels of fun, compared to No VR, treatment as usual. During an interview after completing the study, the dentist who performed the procedures suggested that VR distraction could be especially effective for anxious patients. Future studies should evaluate VR effectiveness for pain management by comparing patients with dental fears vs. patients who do not have dental fears. For many children, experiences at the dentist give them their first impressions about visiting healthcare providers.

VR could also be used to distract patients during needle injections into their gums before dental procedures (Atzori et al., 2017). We predict more pediatric dental patients would be able to tolerate getting local anesthesia injections, if they are in virtual reality during the injection. In that case they could benefit from both local analgesia, and continue to use virtual reality during their dental procedure. The greatest total analgesia will likely be achieved when immersive interactive VR + traditional pain medications are used concurrently (Hoffman et al., 2007).

Future research is needed to determine whether immersive interactive VR has any long term benefits for improving children's attitudes toward dental visits, whether VR can improve children's attitudes toward healthcare in general, and whether more positive experiences using VR during dental care increase patients' future willingness to seek healthcare.

In conclusion, the present study supports the feasibility of VR as a distraction technique for pain management in children and adolescents. The results of this preliminary pilot study showed that this psychological technique can help reduce pain during tooth extraction and dental fillings without side effects, and made dental procedures more fun. Recent mass production of immersive VR goggles has increased the availability and affordability of Oculus Rift VR helmets, and there is growing interest in non-pharmacological techniques for pain management, making VR analgesia a promising direction for future research.

### ETHICS STATEMENT

This study was performed in accordance with the provisions of the Declaration of Helsinki. We obtained informed written consent from the patients and their caregivers. The patients' anonymity has been preserved. The protocol was conducted under internationally accepted ethical standards and it was approved by the Department of Health Sciences (University of Florence) and by the Dentists of the Dental Practice.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

This work was supported by NIH grants R01GM042725 and R01AR054115 to David Patterson, by Effat University Research and Consultancy Institute, Jeddah, Saudi Arabia, and by the Mayday Fund.

### ACKNOWLEDGMENTS

We thank all the families who participated in this study and the staff of the Dental Practice.


<sup>3</sup> www.commonsensemedia.org/research/virtual-reality-101

<sup>4</sup> https://www.commonsensemedia.org/about-us/news/press-releases/commonsense-report-highlights-potential-impact-of-virtual-reality-on


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Atzori, Lauro Grotto, Giugni, Calabrò, Alhalabi and Hoffman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Virtual Reality Analgesia During Venipuncture in Pediatric Patients With Onco-Hematological Diseases

Barbara Atzori<sup>1</sup>\*, Hunter G. Hoffman<sup>2</sup>\*, Laura Vagnoli<sup>3</sup>, David R. Patterson<sup>4</sup>, Wadee Alhalabi<sup>5,6</sup>, Andrea Messeri<sup>7</sup> and Rosapia Lauro Grotto<sup>1,8</sup>

<sup>1</sup> Department of Health Sciences, University of Florence, Florence, Italy, <sup>2</sup> Department of Mechanical Engineering, University of Washington, Seattle, WA, United States, <sup>3</sup> Pediatric Hospital's Psychology, Meyer Children's Hospital, Florence, Italy, <sup>4</sup> Department of Rehabilitation Medicine, University of Washington, Seattle, WA, United States, <sup>5</sup> Department of Computer Science, King Abdulaziz University, Jeddah, Saudi Arabia, <sup>6</sup> Department of Computer Science, Effat University, Jeddah, Saudi Arabia, <sup>7</sup> Pain Service and Palliative Care, Meyer Children's Hospital, Florence, Italy, <sup>8</sup> Multidisciplinary Analysis of Relationship in Health Care (MARHC) Lab, Pistoia, Italy

Background: Venipuncture is described by children as one of the most painful and frightening medical procedures.

Objective: To evaluate the effectiveness of Virtual Reality (VR) as a distraction technique to help control pain in children and adolescents undergoing venipuncture.

#### Edited by:

Federica Pallavicini, Università degli Studi di Milano Bicocca, Italy

#### Reviewed by:

Karel Allegaert, University Hospitals Leuven, Belgium Paula Goolkasian, University of North Carolina at Charlotte, United States

#### \*Correspondence:

Barbara Atzori psicob.atzori@gmail.com Hunter G. Hoffman hoontair@gmail.com

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 01 October 2018 Accepted: 26 November 2018 Published: 20 December 2018

#### Citation:

Atzori B, Hoffman HG, Vagnoli L, Patterson DR, Alhalabi W, Messeri A and Lauro Grotto R (2018) Virtual Reality Analgesia During Venipuncture in Pediatric Patients With Onco-Hematological Diseases. Front. Psychol. 9:2508. doi: 10.3389/fpsyg.2018.02508

Methods: Using a within-subjects design, fifteen patients (mean age 10.92, SD = 2.64) suffering from oncological or hematological diseases received one venipuncture with "No VR" and one venipuncture with "Yes VR" on two separate days (treatment order randomized). "Time spent thinking about pain," "Pain unpleasantness," "Worst pain," the quality of VR experience, fun during the venipuncture and nausea were measured.

Results: During VR, patients reported significant reductions in "Time spent thinking about pain," "Pain unpleasantness," and "Worst pain." Patients also reported significantly more fun during VR, and reported a "Strong sense of going inside the computer-generated world" during VR. No side effects were reported.

Conclusion: VR can be considered an effective distraction technique for children and adolescents' pain management during venipuncture. Moreover, VR may elicit positive emotions, more than traditional distraction techniques. This could help patients cope with venipuncture in a non-stressful manner. Additional research and development is needed.

Keywords: virtual reality, children, adolescents, pain, pediatric cancer, distraction

## INTRODUCTION

For many children, venipuncture is one of the most frightening aspects of visiting a hospital (Duff, 2003; Caprilli and Messeri, 2006). Experiencing pain and anxiety during medical procedures can result in several negative consequences, such as higher levels of fear. Unpleasant early medical experiences can affect patients' perception of healthcare, can increase pain and suffering during subsequent medical visits, and can reduce preventative healthcare, affecting lifelong health (El-Housseiny et al., 2014). Developing expectations of pain (e.g., via memories for previous painful medical procedure experiences, Noel et al., 2015) can increase how much pain patients experience during a procedure, via top-down amplification of neural signals coming into the brain from the pain receptors (Fields, 2018). For patients with chronic diseases, venipuncture can be particularly painful and stressful (Bisogni et al., 2014). Adequate pain management is especially important for patients who receive multiple venipunctures. Indeed, untreated pain can have damaging effects on future pain perceptions and can provoke negative psychological effects (Weisman et al., 1998), and a simple procedure, such as venipuncture, could represent an additional stressor in an already critical condition.

Traditional distraction techniques (e.g., reading books or listening to music) are some of the most common psychological strategies for the reduction of procedural pain and anticipatory anxiety during venipuncture (Birnie et al., 2018).

Birnie et al. (2018) conducted a Cochrane Review of psychological interventions for needle-related procedural pain and distress in children and adolescents (5550 participants). The studies evaluated in Birnie et al.'s (2018) review included venipuncture, intravenous insertion, and vaccine injections in patients aged 2 to 19 years. The most common psychological intervention was distraction (n = 32 studies), and only two VR distraction studies were included in the review.

Virtual Reality (VR) is showing promise as an innovative distraction technique for pain management among children undergoing medical procedures (Hoffman, 1998; Hoffman et al., 2000a; Bailey and Bailenson, 2017; Atzori et al., 2018b,c). VR reduces not only the cognitive component of pain (time spent thinking about pain), but also the affective component (pain unpleasantness) and the sensory component (worst pain), as consistently shown in studies with adult and pediatric participants with burn injuries (Hoffman, 1998; Hoffman et al., 2000a,b, 2011; Atzori et al., 2018a; Soltani et al., 2018). Unlike traditional distractions, VR allows the user to be immersed in a computer-generated environment, wearing a Head Mounted Display (HMD), or similar goggles, that occludes the patient's view of the hospital treatment room and blocks sounds of the real environment (Hoffman, 2004). The user can also interact with the VR environment, if the software allows it (Hoffman et al., 2006; Won et al., 2017). Although the mechanism(s) of how VR reduces pain are still under investigation, Hoffman and colleagues applied Eccleston and Crombez's (1999) Attention Pain Theory to explain how VR can reduce the perception of pain. Attention is required to feel pain, but the illusion of being in a virtual environment and the patients' interaction with the objects in the virtual world reduce the amount of attentional resources the patient's brain has available to attend to the painful stimulus, thus reducing conscious pain perception (Hoffman et al., 2004a). Much of VR's therapeutic power is derived from its ability to divert attention away from painful medical interventions. Moreover, in addition to reducing pain, patients report having fun during burn wound care when using VR (Hoffman et al., 2004b).
The immersiveness of the VR systems, such as the quality of the helmet, and the patients' ability to interact with objects in the virtual world, influence how much VR reduces pain in adults (Hoffman et al., 2004c, 2006; Wender et al., 2009). Moreover, as converging objective evidence, fMRI brain scan studies have shown reduced activity of the brain areas involved in pain perception in healthy adult volunteers distracted with VR during a brief painful thermal stimulus (Hoffman et al., 2004b, 2007).

There is a growing interest in using VR for distraction among children and adolescents; however, to date most clinical studies on VR analgesia have included burn patients during physical therapy (Carrougher et al., 2009) or during burn wound cleaning (Das et al., 2005; Maani et al., 2011; Hoffman et al., 2014; Dascal et al., 2017). VR has also emerged as a useful intervention for procedural pain in patients suffering from chronic diseases, such as cancer patients' support during medical treatments (Chirico et al., 2016) and pain management during invasive procedures in pediatric cancer patients (Wint et al., 2002; Gershon et al., 2003, 2004; Wolitzky et al., 2005). Results exploring the use of VR distraction during needle-related procedures have been encouraging, but mixed. Several studies have shown the predicted pattern, but non-significant reductions, in children's pain during painful cancer treatment procedures. For example, in a non-immersive VR study by Sanders Wint et al. (2002), patients watched a traditional movie via see-through glasses, and the authors found no significant reduction in cancer patients' ratings of pain during venipuncture. A small early study using relatively low-tech VR goggles did not find a significant reduction in patients' pain during IV placement (Gold et al., 2006). Similarly, using early low-tech VR technology, Gershon et al. (2004) found some encouraging patterns but no significant reduction in patients' pain during port placement. Due in part to recent dramatic increases in the availability of immersive VR equipment (e.g., Hoffman et al., 2014), there is growing interest in using VR as a nonpharmacologic analgesic. Gold and Mahrer (2017) recently published a large definitive clinical study exploring the use of immersive VR using untethered Oculus VR goggles, and found significant reductions in pain during blood draws. Using tethered Oculus VR DK2 goggles, Piskorz and Czub (2018) found significant reductions in pain during blood draws in children with kidney problems.

The aim of the present study was to investigate VR effectiveness as a distraction technique for pain management in children and adolescents with onco-hematological diseases undergoing venipuncture. Based on Eccleston and Crombez's (1999) interpretative model, we predicted that patients would focus their attentional resources on VR, and would have fewer attentional resources available to process incoming pain signals, resulting in reduced pain perception during venipuncture. We expected that patients interacting with VR during venipuncture would report less pain ("pain unpleasantness", "time spent thinking about pain" and "worst pain") compared to "treatment as usual". The current study is the first to measure how much fun patients experienced during venipuncture, during No VR vs. during Yes VR. We predicted that patients would report significantly higher levels of fun (a surrogate measure of positive affect) during VR, compared with the standard care (no VR), without side effects (Sharar et al., 2016).

#### MATERIALS AND METHODS


#### Participants

From February 2014 to July 2016, patients attending the Service of Pediatric Oncology and Hematological Diseases of an Italian Children's hospital participated. Children and adolescents who needed to undergo venipuncture twice in a year, for intravenous placement during chemotherapy, transfusions, magnetic resonance or blood analysis were recruited. Patients were selected according to the following criteria based on the existing literature (Won et al., 2017; Atzori et al., 2018a): children and adolescents who were able to understand the Italian language, complete the tests, wear the helmet and interact with the VR environment, without any physical or psychological impairments. Patients with a venous access already inserted, with a diagnosis of epilepsy, not accompanied by their legal guardians, older than 17 years or younger than 7 years were excluded. Moreover, patients who wanted their own distraction tool (e.g., a book, a videogame or mp3-player) during the venipuncture were excluded.

Seventeen patients met the inclusion criteria. However, one of them withdrew because he decided to use his own distraction technique and another patient withdrew because he did not want to use VR during the second venipuncture (the reason was not indicated). A total of 15 patients (66.7% males, 33.3% females; mean age 10.92, SD = 2.64, see **Table 1**) took part in the study. All patients had previously received at least one venipuncture by nurses of the Service of Pediatric Oncology and Hematological Diseases and none of the patients was at their first access. No patient reported pain before the beginning of the procedure. No patient had used VR before the study and all participants were familiar with the wireless mouse. All patients underwent two venipunctures on two different days: one venipuncture with No VR, and one venipuncture with Yes VR on a second visit (treatment order randomized). The mean time between the first and the second venipuncture was 26.6 days (± 24.5).

TABLE 1 | Demographic, clinical and procedural characteristics.


#### Procedure

This research was conducted in accordance with the Declaration of the World Medical Association<sup>1</sup>. The protocol was accepted by the Ethical Committee of the Hospital and the study was approved by the physicians and the nurses of the Service of Pediatric Oncology and Hematological Diseases and conducted in collaboration with the Pediatric Psychology Service and the Pain Therapy Service. Patients meeting the inclusion criteria were approached in the waiting room by a psychologist before the procedure in order to inform the families and obtain the signed written informed consent form from the patient's caregivers. Written informed consent was obtained from all the parents of the participants. All participants provided written informed consent/assent in accordance with the Declaration of Helsinki. The study was described to the parents/guardians. If the parent/guardian gave permission, the research team then explained the study to the child in age-appropriate language, to see if the child was willing to participate in the study. Both children and their parents were encouraged to ask questions. Before the procedure began, patients and their caregivers were escorted to the treatment room, and then the nurse arrived.

Using a within-subjects design, patients were assigned to the control condition ("No VR") or the experimental condition ("Yes VR") (treatment order randomized). Patients underwent the second venipuncture using the distraction technique not used the first time. The "No VR" control condition consisted of non-medical conversation by the nurse who performed the venipuncture (standard of care). In the "Yes VR" condition, patients interacted with VR during venipuncture. Before the nurse arrived, patients had 5 min to learn how to use the VR system. The helmet and the earphones included in the VR system were worn at the arrival of the nurse for the procedure and removed after the procedure. In both conditions, during the venipuncture, the patient, a nurse, the patient's parent/caregiver and the psychologist researcher were present. After the procedure, the nurse left the room and patients completed the self-report questionnaire (pain ratings). No reward was given to patients for participating.

#### Measures

At the end of the procedure, patients filled out a brief self-report questionnaire designed to evaluate the pain, the quality of the VR experience, nausea and fun. Patients responded by giving a 0–10 score on a horizontal Visual Analogue Scale (VAS; Price et al., 1994; Bailey et al., 2012). Pain, evaluated in its cognitive component (time spent thinking about pain), affective component (pain unpleasantness) and sensory component (worst pain) (Pagé et al., 2012), fun and nausea (Hoffman et al., 2004a) were evaluated in both conditions ("No VR" vs. "Yes VR", within subjects). The quality of VR experience was investigated only in the "Yes VR" condition by asking patients to what extent they felt like they went into the virtual world, and how real the VR objects seemed. The included questions were based on those used in previous studies (Hoffman et al., 2004a; Atzori et al., 2018b) and they were translated from English into Italian using the back-translation method, one of the most commonly used methods for cross-cultural translation (Maneesriwongul and Dixon, 2004). The total time for the procedure was comparable in both conditions. The time was measured from the positioning of the tourniquet to the needle extraction.

<sup>1</sup>www.wma.net

### Immersive VR System


The VR equipment consisted of a VR helmet, the Sony HMZ-T2 Personal 3D Viewer, supported by a laptop that allowed the interaction with the VR environment. The helmet had a 45° diagonal field of view, 1280 × 720 pixels per eye, and latex-free earphones to provide acoustic isolation, and it was suitable for both younger and older children. The VR software used was SnowWorld<sup>2</sup>, one of the most frequently employed VR environments, specifically designed to promote distraction from procedural pain (Hoffman et al., 2001). In SnowWorld, patients "go into" an icy canyon, where they throw snowballs at penguins, snowmen and other characters in VR, using a wireless mouse with the hand not employed in the venipuncture. SnowWorld was previously used in studies evaluating VR effectiveness for pain reduction in burn patients and during dental procedures. This is the first study in which this virtual environment is applied during venipuncture in patients with oncological and blood diseases.

### Data Analysis

A t-test for paired samples was adopted to compare pain, nausea and fun levels and the total time for the procedure between the "No VR" condition and the "Yes VR" condition. A researcher not involved in data collection carried out data analysis using the statistical Software SPSS. Based on the a priori assumption that the differences could only be in one direction, results were considered significant when associated with p-values less than 0.05, one tailed.
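The paired comparison described above can be sketched as follows. This is a minimal illustration using only Python's standard library, not the SPSS procedure the authors ran. The ratings are hypothetical, the function name `paired_t_test` is invented for this example, and the effect size shown (d_z, the mean difference divided by the SD of the differences) is an assumption, since the text does not state which Cohen's d formula was used.

```python
import math
import statistics

# One-tailed critical t values at alpha = 0.05 (standard t-table entries).
T_CRIT_ONE_TAILED_05 = {5: 2.015, 14: 1.761}


def paired_t_test(no_vr, yes_vr):
    """Return (t, df, d_z) for paired scores, using differences no_vr - yes_vr."""
    if len(no_vr) != len(yes_vr):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(no_vr, yes_vr)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff  # paired-samples effect size (assumed variant)
    return t, n - 1, d_z


# Hypothetical 0-10 VAS ratings from six patients, NOT the study's data.
no_vr = [4, 6, 2, 5, 3, 7]
yes_vr = [2, 3, 2, 1, 1, 4]

t, df, d_z = paired_t_test(no_vr, yes_vr)
# Directional hypothesis (lower pain during VR), so compare t with the
# one-tailed critical value for the available degrees of freedom.
significant = t > T_CRIT_ONE_TAILED_05[df]
print(f"t({df}) = {t:.2f}, d_z = {d_z:.2f}, significant = {significant}")
```

With 15 patients the same comparison would use 14 degrees of freedom, for which the one-tailed critical value at the 0.05 level is 1.761.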

### RESULTS

#### Pain

As reported in **Table 2**, when patients underwent venipuncture, their mean pain levels were significantly lower during VR, compared with pain levels during "No VR", for all three pain components: "Time spent thinking about pain" during "No VR" mean = 3.23 (SD = 2.98) vs. during "Yes VR" mean = 1.33 (SD = 1.05), p < 0.05, Cohen's d = 0.62, moderate effect size; "Pain unpleasantness" during "No VR" mean = 3.27 (SD = 3.43) vs. during "Yes VR" mean = 0.93 (SD = 1.16), p < 0.01, Cohen's d = 0.70, moderate effect size; "Worst pain" during "No VR" mean = 3.60 (SD = 3.00) vs. during "Yes VR" mean = 2.00 (SD = 1.20), p < 0.05, Cohen's d = 0.51, moderate effect size.

### Quality of VR Experience, Fun, Nausea and Total Time for the Procedure

Patients distracted by VR reported a mean presence score of 7.93 (SD = 1.79), corresponding to "strong sense of going inside the computer generated world", and a mean realism of VR objects score of 6.80 (SD = 2.37), corresponding to "Moderately real."

TABLE 2 | Means (Standard Deviation) in "No-VR" condition vs. "Yes-VR" condition.

Patients rated "fun" during the venipuncture as "mildly fun" during No VR, vs. "pretty fun" during VR. A significant difference for fun levels emerged between the two conditions: during "No VR" mean = 2.93 (SD = 3.58) vs. during "Yes VR" mean = 8.80 (SD = 1.42); t(14) = −6.60, p < 0.0001, Cohen's d = 1.71, large effect size. No significant differences emerged for nausea levels between the two conditions (p > 0.05, NS): no patient reported nausea during the interaction with VR. During the "Yes VR" condition the mean total time of the procedure was 3.09 min (SD = 1.81) vs. 4.45 min (SD = 3.50) during the "No VR" condition. The pattern of results showed the venipuncture took less time during VR vs. during No VR; however, the difference was not significant (p > 0.05, NS).
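As a rough consistency check on the fun result, and assuming (the text does not say so) that the effect size is the paired-samples variant $d_z$, the reported $t$ statistic and sample size give:

$$
d_z = \frac{|t|}{\sqrt{n}} = \frac{6.60}{\sqrt{15}} \approx 1.70
$$

This is close to the reported Cohen's d of 1.71; the small gap may reflect rounding of $t$ or a slightly different formula.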

### Gender Effects

As shown in **Figures 1**, **2** when males and females were analyzed separately, both males and females showed the predicted pattern of results (lower pain during VR compared to standard of care No VR).

### DISCUSSION

According to Eccleston and Crombez's (1999) Interruption of Attention and Pain model, pain requires attention, and humans have limited attentional resources. We therefore predicted that patients would focus their attention on VR, and would have fewer attentional resources available to focus on the painful stimulus, so patients would feel less pain. As predicted, child and adolescent patients reported significant reductions in pain unpleasantness, reported spending significantly less time thinking about their pain during venipuncture and reported significantly lower intensity of pain during VR. As predicted, child and adolescent patients reported significantly more fun when they used VR during their venipuncture. In the current study, patients experienced a strong illusion of presence and rated the virtual objects as "moderately real" looking. According to these results, the current VR system, including the Sony HMZ-T2 helmet and the VR software SnowWorld, could be considered suitable for clinical applications with children and adolescents, promoting a medium-high quality virtual experience, without side effects. A higher quality helmet could potentially promote even better analgesia, as suggested by the literature (Hoffman et al., 2006).

<sup>2</sup>http://www.vrpain.com

The current study has some important limitations. Firstly, the sample size is small. Future studies with a larger sample are needed, and should also evaluate how much VR distraction reduces anxiety. Another limitation of the current study is the use of standard of care as the control condition. Because of this, the difference between the conditions may simply be due to the use of a distraction technique rather than to the specific use of VR. Additional research comparing VR to a more conventional distraction technique, such as listening to music, is needed before any firm conclusions can be drawn about whether VR is unusually distracting.

According to Gold et al. (2007), not only attentional demand but also the elicitation of positive emotion may contribute to VR analgesia (e.g., Sharar et al., 2016). The isolation from the medical setting (helmet blocking the patient's view of the hospital room) and the possibility of being immersed in a pleasant activity make VR a strong distraction technique, in particular for younger patients. The current study compared VR distraction with treatment as usual (non-medical conversation).

Future studies should compare VR to other distraction techniques during venipuncture, and should further explore the role of emotional activation in VR analgesia. For example, in future studies the VR condition should be compared to another simpler form of distraction such as having the children listen to an audiotape while undergoing the venipuncture treatment. A study comparing immersive VR to augmented reality glasses could also be interesting. Moreover, patients interacted with SnowWorld, a virtual environment specifically designed for procedural pain management, in particular for burn patients. In future studies, environments designed for the specific kind of procedure and patient's characteristics (e.g., age, gender, cognitive abilities) are recommended.

Future studies may explore whether personality aspects of the child, such as catastrophizing, fear of pain, as well as parents' anxiety, and patients' memory for previous painful procedures (Noel et al., 2015) influence how much pain children experience during medical procedures (e.g., De Castro Morais Machado et al., 2018).

### CONCLUSION

This study contributes to a growing literature that supports the use of immersive VR distraction for pain control. The current study evaluated the effectiveness of VR to control pain (in its affective, sensory and cognitive components) and to promote fun during venipuncture in pediatric patients with cancer and blood diseases. Younger patients suffering from chronic diseases (i.e., cancer and blood diseases), who spend much time in hospital and need several painful and stressful medical procedures, could particularly benefit from this distraction. In the future, VR systems could also let patients have social interaction (Won et al., 2017) and the quality of VR experiences will become increasingly attention-demanding. VR distraction may also offer new opportunities for socialization and social support, especially for those patients in isolation or hospitalized for long periods.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

This study was supported by NIH grants R01GM042725 and R01AR054115 to DP, and by Effat University Research and Consultancy Institute, Jeddah, Saudi Arabia, and the Mayday Fund. Thanks to the support of Foundation Cassa di Risparmio di Firenze.

### ACKNOWLEDGMENTS

Thanks to Foundation Cassa di Risparmio di Firenze. Special thanks to A.T.C.R.U.P.

### REFERENCES

adults with burn injuries. J. Burn Care Res. 30, 785–791. doi: 10.1097/BCR.0b013e3181b485d3

of combat-related burn injuries using robot-like arm mounted VR goggles. J. Trauma 71(1 Suppl.), S125–S130. doi: 10.1097/TA.0b013e31822192e2


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Atzori, Hoffman, Vagnoli, Patterson, Alhalabi, Messeri and Lauro Grotto. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

fpsyg-10-00141 January 28, 2019 Time: 18:40 # 1

# Causal Interactive Links Between Presence and Fear in Virtual Reality Height Exposure

#### Daniel Gromer<sup>1</sup>, Max Reinke<sup>1</sup>, Isabel Christner<sup>1</sup> and Paul Pauli<sup>1,2</sup>\*

<sup>1</sup> Department of Psychology, Biological Psychology, Clinical Psychology and Psychotherapy, University of Würzburg, Würzburg, Germany, <sup>2</sup> Center of Mental Health, Medical Faculty, University of Würzburg, Würzburg, Germany

Virtual reality plays an increasingly important role in research and therapy of pathological fear. However, the mechanisms by which virtual environments elicit and modify fear responses are not yet fully understood. Presence, a psychological construct referring to the 'sense of being there' in a virtual environment, is widely assumed to crucially influence the strength of the elicited fear responses; however, causality is still under debate. The present study is the first that experimentally manipulated both variables to unravel the causal link between presence and fear responses. Height-fearful participants (N = 49) were immersed into a virtual height situation and a neutral control situation (fear manipulation) with either high or low sensory realism (presence manipulation). Ratings of presence and verbal and physiological (skin conductance, heart rate) fear responses were recorded. Results revealed an effect of the fear manipulation on presence, i.e., higher presence ratings in the height situation compared to the neutral control situation, but no effect of the presence manipulation on fear responses. However, the presence ratings during the first exposure to the high quality neutral environment were predictive of later fear responses in the height situation. Our findings support the hypothesis that experiencing emotional responses in a virtual environment leads to a stronger feeling of being there, i.e., increases presence. In contrast, the effects of presence on fear seem to be more complex: on the one hand, increased presence due to the quality of the virtual environment did not influence fear; on the other hand, presence variability that likely stemmed from differences in user characteristics did predict later fear responses. These findings underscore the importance of user characteristics in the emergence of presence.

Keywords: presence, fear, virtual reality, visual realism, acrophobia

#### Edited by:

Stéphane Bouchard, Université du Québec en Outaouais, Canada

#### Reviewed by:

Philip Lindner, Stockholm University, Sweden

Soledad Quero, Jaume I University, Spain

\*Correspondence: Paul Pauli pauli@mail.uni-wuerzburg.de

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 20 September 2018 Accepted: 15 January 2019 Published: 30 January 2019

#### Citation:

Gromer D, Reinke M, Christner I and Pauli P (2019) Causal Interactive Links Between Presence and Fear in Virtual Reality Height Exposure. Front. Psychol. 10:141. doi: 10.3389/fpsyg.2019.00141

### INTRODUCTION

Psychological treatments using virtual reality (VR) have shown promising results for different psychopathologies (Riva et al., 2016; Freeman et al., 2017), including specific phobia (Shiban et al., 2017; Meyerbröker et al., 2018), social phobia (Bouchard et al., 2017), PTSD (Rothbaum et al., 2014; Beidel et al., 2017), eating disorders (Manzoni et al., 2016; Ferrer-García et al., 2017), and schizophrenia (du Sert et al., 2018; Pot-Kolder et al., 2018), among others. To date, the most evidence for the efficacy of VR treatments has been shown in phobic disorders (Parsons and Rizzo, 2008; Powers and Emmelkamp, 2008; Opris et al., 2012; Turner and Casey, 2014; Morina et al., 2015), where pioneering studies established VR as a treatment medium as early as the late 1990s (e.g., for claustrophobia treatment, Botella et al., 1998). In virtual reality exposure therapy (VRET) for specific phobias, VR is used to simulate threatening environments and stimuli (e.g., virtual heights), allowing to expose patients to their fear (Riva et al., 2016). The ability of virtual environments (VE) to elicit symptoms of pathological fear has been shown in numerous studies (Diemer et al., 2014), yet the factors influencing how much fear is elicited (Diemer et al., 2015) and how phobic stimuli should be presented for optimal outcome of VRET (Freeman et al., 2017) are still not fully understood.

The VR-related psychological construct of presence, the user's sense of 'being there' in the VE, is widely assumed to be crucial for the fear responses in VR (Slater et al., 1994; Witmer and Singer, 1998; Cummings and Bailenson, 2016). Regarding the relationship between presence and therapy efficacy, Freeman et al. (2017, p. 2394) for example state that "VR has extraordinary potential to help people overcome mental health problems if high levels of presence are achieved [. . .]". However, the few studies on the effect of presence on VRET efficacy revealed mixed results (Schuemie et al., 2000; Krijn et al., 2004; Price and Anderson, 2007; Quero et al., 2008; Price et al., 2011). Likewise, a causal relationship between presence in VR and strength of fear responses when exposed to the feared stimulus or situation is assumed. For example, Price et al. (2011, p. 768) state that "[. . .] presence is the mechanism by which a virtual stimulus can elicit fear [. . .]". However, the assumed relationship between presence and fear is mainly confirmed by reports of positive correlations between both measures (see Ling et al., 2014, for a meta-analysis). A possible causal relationship has not been demonstrated unequivocally yet and therefore is still subject to debate (Diemer et al., 2015; Peperkorn et al., 2015; Riva et al., 2015).

Few experimental studies have tried to demonstrate the assumed causal relationship between presence and fear in VR. Bouchard et al. (2008) compared effects of an anxiety-inducing VE vs. a control VE on ratings of presence and anxiety. They found higher presence ratings in the anxiety-inducing VE and concluded that the increase in anxiety caused the increase in presence. Peperkorn et al. (2015) studied associations between presence and fear by exposing participants multiple times to a virtual spider. They concluded that presence predicted fear (and not the other way around) in early trials, whereas the relationship became bidirectional in later trials. Robillard et al. (2003) exposed phobic and non-phobic participants to phobic stimuli and environments and assessed both presence and anxiety ratings. The authors then conducted stepwise linear regression analyses on both presence and anxiety ratings to find the best predictors for each variable. Since both variables were important predictors of each other, the authors concluded that the results "indicate a synergistic relationship between presence and anxiety" (Robillard et al., 2003, p. 467). Shortcomings of these previous studies are that they were either correlational or manipulated only one of the two variables. To our knowledge, the present study is the first to experimentally manipulate both presence and fear and to assess presence ratings as well as verbal and physiological fear responses.

Fear is typically manipulated by presenting stimuli and environments relevant vs. irrelevant to a given phobia (Bouchard et al., 2008; Alsina-Jurnet and Gutiérrez-Maldonado, 2010) and/or by comparing fear responses of phobic vs. non-phobic participants (Robillard et al., 2003; Alsina-Jurnet and Gutiérrez-Maldonado, 2010). Both approaches are effective, and we followed the former approach by presenting a virtual height and a control environment to height-fearful participants.

Experimental manipulation of presence may be achieved by changing hardware characteristics of the VR system or the quality of the VE. Increased field of view, use of stereoscopy, and increased levels of user-tracking were found to show clear effects on presence (Cummings and Bailenson, 2016). In contrast, manipulations of sensory realism, i.e., quality of visual and auditory simulations, had mixed results (Cummings and Bailenson, 2016), although knowing the relevance of these two factors for the experience of presence would be of high interest especially for researchers who develop VEs. According to Christou and Parker (1995), visual realism "can be equated with how closely the artificial world resembles a corresponding possible real world" (Christou and Parker, 1995, p. 53). Elements of visual realism are geometry (e.g., vertex count), lighting (e.g., static vs. dynamic shadows, soft vs. hard shadows) and material properties (e.g., texture resolution, use of normal maps) (Slater et al., 2009; Reinhard et al., 2013). Some studies found increased presence with higher visual realism (Welch et al., 1996; Slater et al., 2009; Kwon et al., 2013), whereas other studies did not find such an effect (Dinh et al., 1999; Zimmons and Panter, 2003; Mania and Robinson, 2004; Lee et al., 2013; Lugrin et al., 2015). Similarly, some studies found a positive effect of auditory simulation (e.g., absence vs. presence of sound, stereo vs. spatial sound) on presence (Hendrix and Barfield, 1996; Dinh et al., 1999; Larsson et al., 2007; Brinkman et al., 2015), while other studies could not find such an effect (Nichols et al., 2000; Keshavarz and Hecht, 2012a,b). Please note that these studies used different manipulations of visual realism and/or auditory content, and also different measures for presence (Kober and Neuper, 2013), and therefore, conclusions about the best option to manipulate presence cannot be drawn. 
We decided to manipulate presence by changing the sensory realism of VEs because of its high relevance for researchers and because of the need for unequivocal findings.

The present study exposed height-fearful participants to a fear-eliciting VE versus two neutral control VEs (fear manipulation, within subjects). Half of the participants experienced high sensory realism VEs (visual content of high quality and with auditory simulation), and the other half experienced low sensory realism VEs (visual content of low quality and without auditory simulation) (presence manipulation, between subjects). Presence as well as verbal and physiological (skin conductance and heart rate) fear responses were recorded. Our hypotheses were: (1) higher quality of visual and auditory content of the VE increases presence, and (2) there is a causal relationship between presence and fear responses, i.e., either (2a) increased fear levels (comparing the height vs. the neutral situation) lead to a higher reported sense of presence (fear → presence), or (2b) increased presence (comparing high quality vs. low quality simulations) leads to stronger fear responses (presence → fear).

### MATERIALS AND METHODS


#### Sample

Potential participants were recruited via advertisement and the university subject pool, and were screened for fear of heights using a subset of the Acrophobia Questionnaire (AQ, Cohen, 1977) to predict AQ scores. Volunteers with estimated scores between 20 and 50 (targeting a height-fearful but non-clinical population) were invited to the study and 49 participants (age: M = 26.84, SD = 10.94; 37 female) were included. The experimental procedure was approved by the Ethics Committee of the Institute of Psychology at the University of Würzburg. All participants gave written informed consent in accordance with the Declaration of Helsinki. Participants received either 8 EUR or course credit for participation.

#### Apparatus

The virtual environment was rendered in Unreal Engine 4.12 (Epic Games, Cary, NC, United States) using assets from the Open World Demo Collection and was displayed on an HTC Vive (HTC, New Taipei City, Taiwan) with a resolution of 1080 pixels × 1200 pixels per eye at 90 Hz, and a 100° field of view. The experiment ran on a Windows 10 64-bit machine with an Intel Core i5-6600k, 16 GB RAM and a Nvidia GTX 970. A Sennheiser HD 439 (Sennheiser, Wedemark-Wennebostel, Germany) was used for audio presentation. Physiological signals (electrodermal activity, electrocardiogram) were recorded by a Brainproducts V-AMP 16 and the Vision Recorder 1.2 software (Brain Products, Munich, Germany).

### Experimental Design and Procedure

A 2 × 3 mixed design was used for the study. Experimental manipulations were presence manipulation by means of sensory realism (low vs. high, between factor) and fear manipulation with different situations (control 1 vs. height vs. control 2, within factor). Participants were randomly assigned to the between subject factor.
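The design above can be illustrated with a small sketch of how participants might be allocated to the between-subjects factor while the within-subjects sequence stays the same for everyone. This is not the authors' procedure: the function name `assign_realism`, the seed, and the choice to balance group sizes are assumptions made for illustration only.

```python
import random
from collections import Counter

REALISM_LEVELS = ("low", "high")  # between-subjects factor (sensory realism)
SITUATIONS = ("control 1", "height", "control 2")  # within-subjects factor


def assign_realism(participant_ids, seed=None):
    """Randomly assign participants to realism groups of near-equal size.

    Balancing the group sizes is an assumption of this sketch; the text
    only states that assignment was random.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Alternate through the shuffled list so group sizes differ by at most one.
    return {pid: REALISM_LEVELS[i % len(REALISM_LEVELS)] for i, pid in enumerate(ids)}


# 49 hypothetical participant IDs; the seed only makes the example repeatable.
assignment = assign_realism(range(1, 50), seed=42)
group_sizes = Counter(assignment.values())

# Every participant sees all three situations in their assigned realism level.
schedule = {pid: [(level, s) for s in SITUATIONS] for pid, level in assignment.items()}
print(dict(group_sizes))
```

Each participant therefore contributes three observations (one per situation) within a single realism level, which is what makes the design mixed.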

For the fear manipulation, two different environments were created: a control situation which exposed participants to a forest environment surrounded by rocks and trees, and a height situation which exposed participants to the edge of a 30 m deep canyon. These VEs were manipulated regarding sensory realism to induce different levels of presence. This was realized by modifying both the visual realism of the VE as well as the auditory content. The low sensory realism condition was derived from the high sensory realism condition by (1) simplifying polygon meshes by scaling down the vertex count of meshes to 5–10% using the Decimate modifier in Blender, (2) reducing texture quality by applying both a Mosaic filter and Surface Blur filter to the textures in Photoshop (see **Figure 1**), (3) replacing tree meshes with two-dimensional bitmaps (sprites), and (4) turning sound off (see **Figure 2** for demonstration of the different conditions).

FIGURE 1 | Example for the manipulation of visual realism. In the low and high sensory realism conditions, the rock was rendered with 152 vertices and simplified texture (left), and 2,342 vertices and fine-grained texture (right), respectively.

After arriving in the laboratory, participants read and signed the informed consent. Participants were then equipped with electrodes for heart rate and skin conductance measurement. During a baseline measure of 5 min, participants filled in questionnaires (demographics, Acrophobia Questionnaire, and State-Trait Anxiety Inventory) and read an information letter, which described the concept of presence in VR, and how it would be rated during the experiment. Subsequently, participants were placed in the center of the VR tracking area and helped to put on the head-mounted display and headphones. The actual experiment consisted of the fixed sequential exposure to three situations, which were presented in either their high or low sensory realism version: the control situation (control 1), the height situation (height condition), and again the control situation (control 2). Each trial consisted of a fade-in of the virtual scene, a 1-min exploration phase where participants could look around, and a rating phase where participants were asked to give their fear and presence ratings, followed by a fade-out of the virtual scene. After taking off the head-mounted display, participants filled in another set of questionnaires (State-Trait Anxiety Inventory: state anxiety subscale only, Simulator Sickness Questionnaire, and MEC Spatial Presence Questionnaire).

## Measures

#### Questionnaires

Acrophobia Questionnaire (AQ; Cohen, 1977) is a self-report questionnaire that assesses trait height anxiety on the subscales anxiety and avoidance. The subscale for anxiety comprises 20 situational items (α = 0.86), such as "standing next to an open window on the third floor." Each item is rated on a seven-point Likert Scale ranging from 0 ("not at all anxious") to 6 ("extremely anxious"), resulting in a sum score of 0–120. The avoidance subscale consists of the same 20 situational items (α = 0.73). Each item is rated on a three-point Likert Scale ("would not avoid doing it," "would try to avoid doing it," and "would not do it under any circumstances"), resulting in a sum score of 0–40.

State-Trait Anxiety Inventory (STAI; Laux et al., 1981) is a self-report questionnaire that measures state and trait anxiety. The state anxiety subscale consists of 20 items (e.g., "I am calm") that are rated on a four-point Likert Scale ranging from "not at all" to "very much so" (α = 0.85 at pre- and α = 0.93 at postmeasurement, respectively). Participants are asked to rate the statements according to their present feelings. The trait anxiety subscale also consists of 20 items (e.g., "I am content") which are rated on a four-point Likert Scale ranging from "almost never" to "almost always" (α = 0.92). Participants are asked to rate the statements according to how they feel generally. The range for both scales is from 20 to 80. The STAI was measured as a control variable.

Simulator Sickness Questionnaire (SSQ; Kennedy et al., 1993) is a self-report questionnaire that measures simulator sickness, that is, symptoms such as nausea, dizziness, headache, or eyestrain, resulting from immersions into VEs. The questionnaire comprises 16 items rated on a four-point Likert Scale ranging from "none" to "severe." The resulting sum scores are associated with the three factors nausea (e.g., stomach awareness) (α = 0.75), oculomotor problems (e.g., eyestrain) (α = 0.57), and disorientation (e.g., vertigo) (α = 0.78), as well as a total score (α = 0.85). The SSQ was measured as a control variable.

MEC Spatial Presence Questionnaire (MEC-SPQ; Vorderer et al., 2004) is a self-report questionnaire that measures different aspects of spatial presence. It builds upon the process model of spatial presence by Wirth et al. (2007) and consists of eight subscales measured by either 4, 6, or 8 items, respectively, rated on a five-point Likert Scale ranging from 1 ("I do not agree at all") to 5 ("I fully agree"). In the current study, five subscales were used in the 8-item version: Attention Allocation (e.g., "I devoted my whole attention to the virtual environment.") (α = 0.89), Spatial Situation Model (e.g., "I had a precise idea of the spatial surroundings presented in the virtual environment.") (α = 0.84), Spatial Presence: Self Location (e.g., "I felt as though I was physically present in the environment of the presentation.") (α = 0.92), Spatial Presence: Possible Actions (e.g., "I had the impression that I could be active in the environment of the presentation.") (α = 0.89), and Suspension of Disbelief (e.g., "I concentrated on whether there were any inconsistencies in the virtual environment") (α = 0.88). The three remaining subscales Higher Cognitive Involvement, Domain Specific Interest, and Visual Spatial Imagery were not measured because of the length of the full questionnaire and our focus on subscales that measure spatial presence in the narrower sense. The questionnaire therefore comprised 40 items.

#### Online Ratings

Fear ratings were assessed by means of Subjective Units of Discomfort Scales (SUDS) ranging from 0 to 100. Presence ratings were assessed using the question "To which extent did you feel present in the virtual environment, as if you were really there?" (Bouchard et al., 2004) with a range from 0 to 100.

### Physiological Measures

#### **Heart rate (HR)**

The electrocardiogram (ECG) was derived using three Ag/AgCl electrodes placed under the right collarbone, on the lower left costal arch (reference electrode), and on the lower left back (ground electrode), recorded at a sample rate of 500 Hz. The ECG was filtered offline with a 50 Hz notch filter and a 2.5 Hz high-pass filter. Detection of R waves and correction of interbeat interval artifacts was done in PeakMan 0.3.0<sup>1</sup>. The sequence of interbeat intervals was processed with the R package phyr6<sup>2</sup>. First, the sequence was segmented (control situation 1, height situation, and control situation 2) and subsequently baseline corrected, using the phase where participants filled in questionnaires as baseline. Second, the mean heart rate change from baseline (ΔHR in bpm) was calculated for each segment.
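
The segment-wise baseline correction can be illustrated with a minimal Python sketch. It is not the phyr6 implementation; the function names are hypothetical, the interbeat intervals are made-up illustration values rather than study data, and it assumes that mean heart rate is taken as 60,000 ms divided by the mean interbeat interval of a segment.

```python
from statistics import mean


def mean_heart_rate(ibis_ms):
    """Mean heart rate in bpm for a sequence of interbeat intervals (ms).

    Assumption: mean rate = 60,000 ms per minute / mean interval length.
    """
    return 60_000 / mean(ibis_ms)


def delta_hr(segment_ibis_ms, baseline_ibis_ms):
    """Mean heart rate change from baseline (delta HR, in bpm)."""
    return mean_heart_rate(segment_ibis_ms) - mean_heart_rate(baseline_ibis_ms)


# Hypothetical interbeat intervals, for illustration only (not study data)
baseline = [1000, 1000, 1000, 1000]  # corresponds to 60 bpm
height = [800, 800, 800, 800]        # corresponds to 75 bpm

print(delta_hr(height, baseline))  # 15.0
```

Averaging instantaneous rates (60,000 / each interval) instead of dividing by the mean interval would give slightly different values when intervals vary; which variant the original pipeline used is not stated in the text.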

#### **Skin conductance level (SCL)**

The electrodermal activity (EDA) was derived using two 13/7 mm Ag/AgCl electrodes filled with 0.5% NaCl gel. The electrodes were placed on the thenar and hypothenar of the right hand and the signal was recorded at a sample rate of 500 Hz. Segmentation was done analogously to the ECG signal. Before applying the baseline correction, a constant of 1 was added to the EDA signal, which was then log-transformed to control for skewness. The change in SCL from baseline [ΔSCL in log(µS + 1)] was then calculated by computing the mean of each segment.
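
As a minimal sketch of this transformation (not the authors' code), the following Python functions apply log(x + 1) to each sample and subtract the mean of the transformed baseline from the mean of the transformed segment. The function names are hypothetical, and the use of the natural logarithm is an assumption, since the base is not stated in the text.

```python
import math
from statistics import mean


def log_scl(samples_us):
    """Apply log(x + 1) to skin conductance samples given in microsiemens.

    Assumption: natural logarithm; the source does not state the base.
    """
    return [math.log(x + 1.0) for x in samples_us]


def delta_scl(segment_us, baseline_us):
    """Mean change in SCL from baseline, in log(microsiemens + 1) units."""
    return mean(log_scl(segment_us)) - mean(log_scl(baseline_us))


# Hypothetical samples, for illustration only (not study data)
baseline = [2.0, 2.1, 2.0, 1.9]
height = [3.5, 3.6, 3.4, 3.5]
print(round(delta_scl(height, baseline), 3))
```

Because the logarithm is applied before averaging, a given absolute increase in conductance counts for less when the baseline level is already high, which is the intended dampening of the right skew.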

#### Data Analysis

All statistical analyses were conducted with R 3.5.0 (R Core Team, 2016). The afex package (Singmann et al., 2016) was used for ANOVA with type 3 sum of squares, and the emmeans package (Lenth, 2018) was used for post hoc comparisons (using Tukey's method for alpha adjustment for multiple comparisons). In the ANOVA for presence ratings, one participant had to be excluded due to missing data and in the ANOVA for SCL another participant had to be excluded due to technical problems with the electrodes. The cross-lagged panel model was fitted using the lavaan package (Rosseel, 2012) and displayed with the semPlot package (Epskamp, 2017).

#### RESULTS

#### Group Characteristics

Participants in the two experimental conditions did not differ in sex, χ<sup>2</sup> = 0.06, p = 0.802, or in height-fearfulness, state and trait anxiety, and simulator sickness after the experiment (see **Table 1**).

#### Influence of Sensory Realism on Presence

In order to test whether the manipulation of presence by means of sensory realism was successful, a two sample t-test on the presence rating in the first control situation was conducted. The test showed a significant difference between sensory realism conditions, t(46.93) = 2.31, p = 0.026, d = 0.66. Participants in the high sensory realism condition (M = 60.0, SD = 23.9) reported significantly higher presence than participants in the low sensory realism condition (M = 43.6, SD = 25.9). For the MEC-SPQ scores, a one-way MANOVA with the five subscales as dependent variables and sensory realism as independent variable revealed no main effect of sensory realism, Wilks' λ = 0.94, F(5,41) = 0.56, p = 0.726 (see **Supplementary Material** for descriptive statistics).
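
The reported test statistics can be approximately recovered from the summary statistics alone. The sketch below is a hedged illustration, not the original analysis: it assumes a Welch (unequal variances) t-test, which the fractional degrees of freedom suggest; it assumes group sizes of 24 (high) and 25 (low), inferred from the degrees of freedom of the group-wise correlations reported further below; and it assumes that d was computed with the pooled standard deviation. The function name is hypothetical.

```python
import math


def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch t statistic, Welch-Satterthwaite df, and Cohen's d (pooled SD)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors per group
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    return t, df, d


# Means and SDs as reported; group sizes are an inference, not stated here
t, df, d = welch_from_summary(60.0, 23.9, 24, 43.6, 25.9, 25)
print(round(t, 2), round(df, 2), round(d, 2))
```

Under these assumptions the results land close to the reported t(46.93) = 2.31 and d = 0.66; small deviations in t are expected because the published means and standard deviations are rounded.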

### Causal Relationship Between Presence and Fear

Following the hypotheses of the study, ANOVAs were computed for both the presence and fear ratings with the presence manipulation (sensory realism) as between factor and the fear manipulation (situation) as within factor.

For a causal effect of fear → presence, we expected presence ratings to be higher in the height situation compared to the control situations. The ANOVA showed a significant main effect for the presence manipulation, F(1,46) = 5.70, p = 0.021, η<sub>p</sub><sup>2</sup> = 0.11, a significant main effect for the fear manipulation, F(1.73,79.40) = 13.01, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.22, and no interaction effect, F(1.73,79.40) = 0.07, p = 0.905, η<sub>p</sub><sup>2</sup> < 0.01 (see **Figure 3A** and **Supplementary Material**). For the significant main effect for the presence manipulation, means indicate higher presence ratings in the high sensory realism compared to the low sensory realism condition. For the significant main effect for the fear manipulation, post hoc pairwise comparisons (alpha adjustment with Tukey's method) between situations yield a significant difference between control situation 1 and the height situation, t(46) = −4.36, p < 0.001, a significant difference between the height situation and the control situation 2, t(46) = 3.43, p = 0.004, and no difference between control situation 1 and control situation 2, t(46) = −1.69, p = 0.220. Presence ratings in the height situation were higher than in both control situations.
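
Partial eta squared can be recovered from an F value and its degrees of freedom via the identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The short Python sketch below (the function name is hypothetical) applies this identity to the two main effects reported above as a consistency check.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of the presence manipulation, F(1,46) = 5.70
print(round(partial_eta_squared(5.70, 1, 46), 2))         # 0.11

# Main effect of the fear manipulation, F(1.73,79.40) = 13.01
print(round(partial_eta_squared(13.01, 1.73, 79.40), 2))  # 0.22
```

The identity holds for corrected (fractional) degrees of freedom as well, because the correction scales the effect and error degrees of freedom by the same factor.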

For a causal effect of presence → fear, we expected fear ratings specifically in the height situation to be higher in the high sensory realism condition. The ANOVA on fear ratings revealed no main effect for the presence manipulation, F(1,47) = 1.02, p = 0.317, η<sub>p</sub><sup>2</sup> = 0.02, a significant main effect for the fear manipulation, F(1.10,51.62) = 161.63, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.77, and no interaction effect, F(1.10,51.62) = 0.92, p = 0.350, η<sub>p</sub><sup>2</sup> = 0.02 (see **Figure 3B** and **Supplementary Material**). Post hoc pairwise comparisons (alpha adjustment with Tukey's method) between situations yield a significant difference between control situation 1 and the height situation, t(47) = −12.86, p < 0.001, a significant difference between the height situation and the control situation 2, t(47) = 12.98, p < 0.001, and no difference between control situation 1 and control situation 2, t(94) = −1.18, p = 0.469. Fear ratings in the height situation were higher than in both control situations.

#### Physiological Responses

Skin conductance and heart rate were analyzed analogously to the ratings. The ANOVA for SCL revealed no main effect for the presence manipulation, F(1,46) = 0.13, p = 0.717, η<sub>p</sub><sup>2</sup> < 0.01, a significant main effect for the fear manipulation, F(2,92) = 81.10, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.64, and no interaction effect, F(2,92) = 0.31, p = 0.731, η<sub>p</sub><sup>2</sup> < 0.01 (see **Figure 3C** and **Supplementary Material**). Post hoc pairwise comparisons (alpha adjustment with Tukey's method) between situations yield a significant difference between control situation 1 and the height situation, t(46) = −10.71, p < 0.001, a significant difference between the height situation and the control situation 2, t(46) = 11.06, p < 0.001, and no difference between control situation 1 and control situation 2, t(92) = −1.61, p = 0.251. SCL values in the height situation were higher than in both control situations.

<sup>1</sup>https://github.com/dgromer/PeakMan

<sup>2</sup>https://github.com/dgromer/phyr6

#### TABLE 1 | Questionnaire data.

AQ, Acrophobia Questionnaire; STAI, State-Trait Anxiety Inventory (t<sup>1</sup> = at the beginning and t<sup>2</sup> = in the end of the experiment); SSQ, Simulator Sickness Questionnaire.

For the heart rate, the ANOVA showed no main effect for the presence manipulation, F(1,47) = 0.92, p = 0.341, η<sub>p</sub><sup>2</sup> = 0.02, a significant main effect for the fear manipulation, F(1.32,61.99) = 7.97, p = 0.003, η<sub>p</sub><sup>2</sup> = 0.14, and no interaction, F(1.32,61.99) = 0.21, p = 0.713, η<sub>p</sub><sup>2</sup> < 0.01 (see **Figure 3D** and **Supplementary Material**). Post hoc pairwise comparisons (alpha adjustment with Tukey's method) between situations yield a significant difference between control situation 1 and the height situation, t(47) = −3.05, p = 0.010, no difference between the height situation and the control situation 2, t(47) = 1.35, p = 0.377, and a significant difference between control situation 1 and control situation 2, t(47) = −4.07, p < 0.001. Heart rate in the height situation and control situation 2 was higher than in control situation 1.

### Exploratory Correlations and Cross-Lagged Panel Models

Two post hoc exploratory analyses were conducted to take a more in-depth look into the associations between presence and fear ratings.

First, the bivariate correlation between presence and fear ratings in the height situation for the complete sample was r(47) = 0.62, p < 0.001. Within the two groups varying in sensory realism (presence manipulation), the correlations were r(22) = 0.80, p < 0.001 for the group with high sensory realism and r(23) = 0.42, p = 0.038 for the group with low sensory realism, with a significant difference between these correlation coefficients, z = 2.13, p = 0.033.
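
The comparison of the two correlation coefficients can be illustrated with a short Python sketch. The text does not name the test that was used; the sketch assumes Fisher's r-to-z transformation for two independent samples with a two-sided p value, and it takes the group sizes (24 and 25) from the reported degrees of freedom (n = df + 2). The function name is hypothetical.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference of two independent correlations.

    Returns the z statistic and the two-sided p value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal tail area
    return z, p


z, p = compare_independent_correlations(0.80, 24, 0.42, 25)
print(round(z, 2), round(p, 3))
```

Under these assumptions the result is consistent with the reported z = 2.13, p = 0.033.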

Second, we fitted the presence and fear ratings in cross-lagged panel models, again split by the between-subject factor, to test whether presence and fear ratings would predict ratings in successive trials (see Peperkorn et al., 2015, for a similar, but correlational approach). In the high sensory realism group (see **Figure 4A**), significant paths were (1) the autoregressive paths for presence: presence in the height situation was predicted by presence in the control situation 1, βstd = 0.82, p < 0.001, and presence in the control situation 2 was predicted by presence in the height situation, βstd = 0.91, p < 0.001; (2) the regression coefficient of presence in the control situation 1 predicting fear in the height situation, βstd = 0.55, p = 0.002; and (3) the correlation between presence and fear in the height situation, r = 0.72, p = 0.005. In the low sensory realism group, only the autoregressive paths for presence were significant (both p < 0.001). For further visualization of the regression of initial presence ratings predicting later fear ratings in the height situation, the correlation between presence ratings in the first control situation and fear ratings in the height situation was calculated and is displayed in **Figure 4B**.

#### DISCUSSION

The present study investigated two research questions: first, whether a manipulation of sensory realism of a VE, i.e., high versus low quality of visual and auditory content, has an influence on experienced presence, and second, whether there is a causal relationship between presence and fear in VR. For this purpose, both presence and fear were manipulated experimentally in VR. Height-fearful participants were immersed into a virtual height situation and a control situation (fear manipulation) with either high or low sensory realism (presence manipulation). During immersion, we assessed ratings of presence and verbal and physiological fear responses.

#### Effects of Sensory Realism on Presence

The use of highly detailed geometry, increased texture quality, and sound, compared to a low fidelity setup, led to increased presence ratings. This finding is in line with previous research (Hendrix and Barfield, 1996; Welch et al., 1996; Dinh et al., 1999; Larsson et al., 2007; Slater et al., 2009; Kwon et al., 2013; Brinkman et al., 2015), and the calculated effect size (small to medium) is in line with a recent meta-analysis (Cummings and Bailenson, 2016). Inconsistent with the verbal online ratings of presence assessed at the end of each VR experience, the presence questionnaire, measured after the experiment, revealed no difference between the two groups who experienced different sensory realism conditions. Earlier studies on the effects of the quality of visual and auditory content on presence have applied numerous presence measures (Cummings and Bailenson, 2016) and these diverse measures might differ in their sensitivity to detect an effect of manipulations of visual realism and auditory content on presence. Previous research suggests that such discrepancies between different measures of presence are not uncommon (Kober and Neuper, 2013). However, to our knowledge, there has not yet been an extensive comparison on such qualitative differences between multiple measures of presence. Another important point is that different presence measures also quantify different aspects of presence (e.g., spatial presence in the MEC-SPQ; spatial presence, involvement, and experienced realism in the Igroup Presence Questionnaire). The conviction of having been located in either the high or low sensory realism version of our VEs (spatial presence) might, in retrospect, not differ between groups, because both conditions allowed the same spatial perception of the environment. 
A manipulation of stereoscopy for example, which affects depth perception, might have had a stronger influence on retrospective reports of the experience of spatial presence (Cummings and Bailenson, 2016). In sum, we found some indication that our experimental manipulation of sensory realism modulated presence, although with a small to medium effect size only and restricted to one of two measures. Given this result, costly efforts to achieve a high sensory realism of VEs, e.g., by use of photogrammetry in the modeling process, might not be necessary for VEs to be plausible and to be able to induce presence. This argument is further supported by a comparison of presence responses to VEs across decades (e.g., Krijn et al., 2004; Gromer et al., 2018), where the 2004 study achieved even higher scores on the Igroup Presence Questionnaire<sup>3</sup> . However, it remains an open research question, whether VEs that achieved high presence in earlier studies, also induce high presence today (e.g., due to different standards of users). Further research is therefore needed to decide whether costly efforts to increase sensory realism are reasonable to increase presence.

#### Effects of Fear on Presence

Our study expands previous research on the relationship between presence and fear in VR as our experimental fear manipulation caused an increased sense of presence. Similar effects of fear on presence have been shown for snake phobia by Bouchard et al. (2008), test anxiety by Alsina-Jurnet and Gutiérrez-Maldonado (2010), and spider phobia by Peperkorn et al. (2016), with higher presence ratings in fear-relevant versus neutral VEs or in phobic versus non-phobic participants. Our results corroborate these reports by demonstrating that experiencing emotional responses in VR leads to stronger feelings of actually being there in the VE (see also Riva et al., 2007). To explain these findings in a theoretical model, Diemer et al. (2015) postulated an interoceptive attribution model of presence which proposes two main factors that lead to higher presence ratings: immersion (i.e., technological characteristics of the VR system) and arousal. Our results fully support this model as presence was increased by both the manipulation of sensory realism (i.e., immersion), as well as an arousal manipulation (i.e., the height situation compared to the control situation elicited higher arousal as indicated by skin conductance). At first glance, two comparable previous studies (Diemer et al., 2016; Gromer et al., 2018), which could not find differences in presence between high and low height-fearful participants after exposure to virtual height environments, seem to contradict this interpretation. However, Gromer et al. (2018) did not collect any physiological measure of arousal, and therefore these results allow no firm conclusions about the interoceptive attribution model. Diemer et al. (2016) measured skin conductance but found equal levels of physiological arousal in high and low height-fearful participants in the height situation. Consequently, these findings are still in line with the interoceptive attribution model, because it states presence as a function of arousal.

#### Effects of Presence on Fear

Our study revealed no support of a causal effect of presence on fear as our experimental manipulation of presence did not lead to increased levels of fear in the virtual height situation. Several explanations have to be discussed. First, the strength of the fear response might not be dependent upon presence. If this is the case, putting much effort in creating highly realistic VEs is not necessary for virtual exposure as simpler VEs might be sufficient. Second, effects of presence on fear might only be observable if the manipulation of presence is strong enough. This argument receives support by a comparison of our study's effect sizes for the manipulation of presence (η<sub>p</sub><sup>2</sup> = 0.11) and fear (η<sub>p</sub><sup>2</sup> = 0.77), suggesting that the effect of the presence manipulation was probably too small. Third, following the presence as a gateway hypothesis (Felnhofer et al., 2014), fear might not increase linearly with higher presence but rather a certain degree of presence is necessary to provoke fear responses (Bouchard et al., 2008). Once this threshold is reached, further increases in presence do not further affect fear responses. In our study, both the high and low sensory realism conditions might already have induced enough presence to pass this threshold, which then resulted in similar fear responses in both groups. However, contrary to the assumptions of the presence as a gateway hypothesis (with increasing presence, there is a plateau in fear responses), the correlation between presence and fear responses was much stronger in the high sensory realism condition. In order to experimentally test the different discussed explanations, future studies should use stronger presence manipulations and/or have experimental designs inducing multiple levels of presence to specifically test the predictions of the presence as a gateway hypothesis.

<sup>3</sup>The Krijn et al. (2004) study used sums instead of means to calculate the total score of the Igroup Presence Questionnaire (IPQ). When scoring the data from Gromer et al. (2018) analogously, there is a significant difference between IPQ total scores, t(108) = 2.94, p = 0.004.

Interestingly, and in concordance with findings by Peperkorn et al. (2015), in the high sensory realism group initial ratings of presence in the first control situation were predictive of fear ratings in the later height situation, indicating an effect of interpersonal variability in presence on fear. Referring back to the interoceptive attribution model of presence by Diemer et al. (2015), which postulated presence as a function of immersion and arousal, our results also highlight the importance of user characteristics in the emergence of presence (IJsselsteijn et al., 2000; Wirth et al., 2007). User characteristics that have been thought to have an influence on presence include immersive tendencies (Witmer and Singer, 1998; Robillard et al., 2003; Murray et al., 2007; Phillips et al., 2012; Kober and Neuper, 2013; Ling et al., 2013), absorption (Baños et al., 1999; Schuemie et al., 2005; Murray et al., 2007; Phillips et al., 2012; Wirth et al., 2012; Kober and Neuper, 2013; Ling et al., 2013), dissociation (Baños et al., 1999; Murray et al., 2007; Phillips et al., 2012; Williams, 2014), spatial abilities (Alsina-Jurnet and Gutiérrez-Maldonado, 2010; Coxon et al., 2016), and personality (Alsina-Jurnet and Gutiérrez-Maldonado, 2010; Kober and Neuper, 2013). According to Kober and Neuper (2013), who studied the relationship between user characteristics and multiple presence measures, the best predictor for presence was absorption, followed by immersive tendencies, perspective taking, and mental imagination. Of note is that the relationship between interpersonal variability in presence in the neutral situation and later fear ratings in the height situation was only significant for the group with high visual realism and auditory content. Furthermore, the correlation between presence and fear, both measured in the height situation, was higher in the high than low sensory realism condition. 
Based on these findings, we suggest that it might be beneficial to use VEs with high sensory realism in studies on the presence-fear relationship.

#### Limitations

Some limitations of the present study should be noted. First, as noted earlier, the effect of the presence manipulation, compared to the effect of the fear manipulation, was rather weak, possibly hindering a measurable influence of presence on fear. This was also reflected in a discrepancy between verbal presence ratings and presence measured via questionnaire. In future studies, stronger presence manipulations should be used to address this issue, e.g., by combining several sensory realism manipulations with manipulations of stereoscopy or user tracking (Cummings and Bailenson, 2016).

Second, the cross-lagged panel model was conducted post hoc in an exploratory manner. To corroborate our findings, a further study should be planned and conducted with a priori hypotheses about the relationships within the cross-lagged panel model.

Third, the present study investigated only participants with a subclinical fear of heights. A recent meta-analysis revealed differences in the magnitude of the correlation between presence and fear between different phobias and between clinical and nonclinical fearful participants (Ling et al., 2014). It is therefore crucial to replicate the findings in other phobias, as well as a clinical population, with regards to generalizability.

## CONCLUSION

The present study sheds light on the causal interaction between presence and fear responses in VR and indicates a bidirectional relationship between both variables. First, our results show that experiencing fear as indicated by verbal and physiological responses in virtual heights leads to higher presence, supporting the hypothesis that arousal is an important factor in the formation of presence. Second, although our experimental manipulation of presence did not affect fear responses, the link between interpersonal variability in presence on the one hand and fear responses on the other hand suggests that higher presence leads to stronger fear responses. Furthermore, this finding stresses the importance of taking user characteristics into account in the emergence of presence. Further studies are needed to test whether other experimental (e.g., different manipulations of immersion) or quasi-experimental manipulations (e.g., users with different characteristics) of presence have an influence on fear responses.

## DATA AVAILABILITY

The dataset for this study can be found at osf.io/8z6gt/.

## AUTHOR CONTRIBUTIONS

DG, MR, IC, and PP contributed to the study concept and design. MR and IC collected the data. DG performed the data analysis and interpretation under the supervision of PP. DG drafted the article and MR, IC, and PP provided critical revisions. All authors approved the final version of the article prior to submission.

### FUNDING

This publication was funded by the Volkswagen Foundation (AZ 94 102), German Research Foundation (DFG), and the University of Wuerzburg in the funding program Open Access Publishing.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00141/full#supplementary-material

### REFERENCES


obesity: a randomized controlled study with 1 year follow-up. Cyberpsychol. Behav. Soc. Network. 19, 134–140. doi: 10.1089/cyber.2015.0208


derived from computer games. Cyberpsychol. Behav. 6, 467–476. doi: 10.1089/ 109493103769710497


**Conflict of Interest Statement:** PP is shareholder of a commercial company that develops virtual environment research systems (VTplus GmbH) for empirical studies in the field of psychology, psychiatry, and psychotherapy.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Gromer, Reinke, Christner and Pauli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Potential of Consumer-Targeted Virtual Reality Relaxation Applications: Descriptive Usage, Uptake and Application Performance Statistics for a First-Generation Application

#### Philip Lindner<sup>1</sup> \*, Alexander Miloff<sup>1</sup>, William Hamilton<sup>2</sup> and Per Carlbring<sup>1,3</sup>

<sup>1</sup> Department of Psychology, Stockholm University, Stockholm, Sweden, <sup>2</sup> Mimerse, Stockholm, Sweden, <sup>3</sup> Department of Psychology, University of Southern Denmark, Odense, Denmark

#### Edited by:

Stéphane Bouchard, Université du Québec en Outaouais, Canada

#### Reviewed by:

Marc Wittmann, Institut für Grenzgebiete der Psychologie und Psychohygiene (IGPP), Germany Paul Pauli, Universität Würzburg, Germany

\*Correspondence: Philip Lindner philip.lindner@psychology.su.se

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 01 October 2018 Accepted: 15 January 2019 Published: 04 February 2019

#### Citation:

Lindner P, Miloff A, Hamilton W and Carlbring P (2019) The Potential of Consumer-Targeted Virtual Reality Relaxation Applications: Descriptive Usage, Uptake and Application Performance Statistics for a First-Generation Application. Front. Psychol. 10:132. doi: 10.3389/fpsyg.2019.00132

Virtual Reality (VR) technology can be used to create immersive environments that promote relaxation and distraction, yet it is only with the recent advent of consumer VR platforms that such applications have the potential for widespread dissemination, particularly in the form of consumer-targeted self-help applications available at regular digital marketplaces. If widely distributed and used as intended, such applications have the potential to make a much-needed impact on public mental health. In this study, we report real-world aggregated uptake, usage and application performance statistics from a first-generation consumer-targeted VR relaxation application which has been publicly available for almost 2 years. While a total of 40,000 unique users signals an impressive dissemination potential, average session duration was lower than expected, and the data suggest a low number of recurrent users. Usage of headphones and auxiliary input devices was relatively low, and some application performance issues were evident (e.g., lower than intended framerate and occurrence of overheating). These findings have important implications for the design of future VR relaxation applications, revealing primarily that user engagement needs to be addressed in the early stage of development by including features that promote prolonged and recurrent use (e.g., gamification elements).

Keywords: relaxation, virtual reality, stress, pain, consumer, application (app)

## INTRODUCTION

Virtual Reality (VR) refers to technology that simulates being present in a virtual, computer-generated world. This is most often achieved using a head-mounted display (HMD) that covers the user's eyes, uses dual-display stereoscopy to simulate depth perception, blocks out the actual surroundings, and makes the video and audio presentation responsive to head movements (Botella et al., 2017; Freeman et al., 2017). Since the early 2000s (Hoffman et al., 2000), VR has been used to create immersive experiences for relaxation and distraction, with meta-analyses of clinical trials revealing that VR is effective in treating anxiety (Carl et al., 2018) and pain (Malloy and Milling, 2010; Kenney and Milling, 2016), and several studies have shown that VR can induce relaxation specifically (Riva et al., 2007; Baños et al., 2009, 2013, 2014; Valtchanov et al., 2010; Annerstedt et al., 2013; Serrano et al., 2016; Anderson et al., 2017).

To date, however, VR mental health interventions have seen very limited dissemination outside specialized clinics and university laboratories (Freeman et al., 2017). Arguably, the inaccessibility, high financial costs and low user-friendliness of the previous generation of VR hardware have constituted substantial barriers to dissemination (Segal et al., 2011; Schwartzman et al., 2012). Since 2016, consumer VR hardware and software have seen impressive growth and now offer a mature ecosystem from development to distribution, in theory addressing past concerns and barriers (Lindner et al., 2018a). This makes consumer VR an attractive avenue to disseminate, on an unprecedented scale, immersive applications that reduce symptoms of stress, anxiety, and pain (Lindner et al., 2017), especially in the form of consumer-targeted applications running on consumer hardware and distributed through established digital marketplaces (Miloff et al., 2016; Donker et al., 2018; Freeman et al., 2018).

In the current study, we present real-world uptake data from a first-generation consumer-targeted VR relaxation and distraction application that has been publicly available on a digital marketplace for almost 2 years, providing a first glimpse of the dissemination potential of such applications, as well as valuable usage and application performance data that may guide the development of future applications.

FIGURE 1 | Screenshots of the Happy Place application. (Top) Day environment with interactive objects. (Bottom) Night environment. Screenshots published with permission from copyright holder Mimerse.

### MATERIALS AND METHODS

#### Application

The Happy Place application was developed by Mimerse using the Unity game engine. Relaxation and distraction are induced by situating the user in a relaxing nature environment (Valtchanov et al., 2010; Annerstedt et al., 2013; Berto, 2014), one which includes full day-night and weather cycles, scripted animal behaviors, and spatialized background sounds. This virtual nature environment is stylized in a low-polygon appearance that is both visually pleasing and computationally less expensive to generate than photorealistic equivalents, an important concern for computationally constrained mobile VR units. See **Figure 1** for screenshots. Using a gaze reticle interface, the user can explore the environment through head rotation: resting the reticle on one of 50 included objects will trigger minor environmental events. This as-requested interactivity aims to increase sense of presence and to provide users with the choice of a passive or active engagement style, making the application suitable for both relaxation and distraction. The user experience begins in an empty "void world" while the nature environment loads (approximately 30–50 s). The environment first appears through a saturated black and white filter before gradually coming to life and into full color during a 30-s transition, designed to rapidly increase immersion. Once the user has transitioned into the nature environment, the experience is open-ended and the user can remain as long as desired, with the option of listening to voice-over guided meditation.

#### Data

The application was made publicly available for mobile VR users at no cost on the Oculus Store in October 2016. Data from the date of public release (1st week discarded to exclude live developer testing) were extracted from the developer platform. Raw data were aggregated on group level, meaning that there is no individual user data available for correlational or subgroup analyses. For presentation purposes, the data were then further aggregated (sum, mean, or max) on a weekly level to smooth out random and non-random (e.g., weekday) variations of no interest. Extracted metrics covered application uptake, usage and application performance: accumulated installations and uninstallations, unique active users and sessions (the latter converted to a sessions per user ratio), average session duration, percentage usage of gamepad, controller and headphones, as well as average framerate (Frames Per Second, FPS; 60 intended), overheating rate (event percentage by session), available memory and battery burn rate. Oculus states that reported metrics have a maximum 5% error margin. Of note, some metrics were implemented during the data collection period.
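The weekly aggregation step described above can be illustrated with a minimal Python sketch. It is not the authors' analysis script: the field names (`active_users`, `sessions`, `session_min`, `installs_accumulated`), the use of ISO calendar weeks, and the mapping of each metric to sum, mean, or max are assumptions made for illustration, and the example rows are invented values rather than study data.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def aggregate_weekly(daily_rows):
    """Group daily rows by ISO (year, week) and aggregate each metric.

    Assumed mapping (not stated per metric in the paper):
    counts are summed, durations averaged, accumulated totals take the max.
    """
    weeks = defaultdict(list)
    for row in daily_rows:
        iso_year, iso_week, _ = row["day"].isocalendar()
        weeks[(iso_year, iso_week)].append(row)

    weekly = {}
    for key, rows in sorted(weeks.items()):
        users = sum(r["active_users"] for r in rows)
        sessions = sum(r["sessions"] for r in rows)
        weekly[key] = {
            "active_users": users,
            "sessions": sessions,
            # Sessions converted to a sessions-per-user ratio.
            "sessions_per_user": sessions / users if users else None,
            "mean_session_min": mean(r["session_min"] for r in rows),
            "installs_accumulated": max(r["installs_accumulated"] for r in rows),
        }
    return weekly


# Invented example rows (hypothetical values, for illustration only).
rows = [
    {"day": date(2017, 1, 2), "active_users": 10, "sessions": 13,
     "session_min": 5.0, "installs_accumulated": 100},
    {"day": date(2017, 1, 3), "active_users": 20, "sessions": 26,
     "session_min": 4.0, "installs_accumulated": 120},
    {"day": date(2017, 1, 9), "active_users": 15, "sessions": 18,
     "session_min": 6.0, "installs_accumulated": 150},
]

for week, metrics in aggregate_weekly(rows).items():
    print(week, metrics)
```

Note that summing daily active users within a week counts a returning user once per day, so a weekly total of this kind cannot distinguish recurrent from new users.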

By Swedish law (2003:460), ethical approval is not applicable to the handling or publishing of non-interventional human research data aggregated on group level (i.e., non-identifiable data). All users consented to having their uptake and basic usage data shared with Oculus, which in turn could be shared with third parties including researchers, as part of the terms of service of the Oculus Store. Oculus granted permission to extract and publish the raw data, which is now available at an online repository (Lindner et al., 2018b). Thus, the current study describes analyses performed on publicly available data. While research on publicly available data (other common examples being, e.g., self-disclosed social media content, search statistics, etc.) is typically exempt from standard research regulations, ethical concerns have been raised (Metcalf and Crawford, 2016). These concerns do not, however, apply to aggregated data such as those presented in this report.

### RESULTS

#### Uptake Statistics

From October 2016 to September 2018, the application saw n = 40,153 unique active users, with daily active users (raw data) ranging from 4 to 653, and on an aggregated weekly level ranging from 123 to 2677 (which could be recurrent or new users). As evidenced by the increasing accumulated installations yet relatively stable number of daily active users (with the exception of two prominent spikes), there was a low degree of recurrent use. The ratio between number of sessions and active users was consistently around 1.3. The peaks in installations and active users are likely associated with increased media attention or in-store coverage, while the larger peak in sessions-per-user is likely random. See **Figure 2** panel top rows.

#### Usage Statistics

Average session duration was approximately 5 min. Headphone usage was stable at around 25%. Gamepad usage averaged around 3% during the period, with an apparent delayed decrease occurring after the introduction of handheld controllers in early 2017, the use of which increased over time. See **Figure 2** panel middle rows.

#### Application Performance Statistics

Average framerate increased over time and approached the intended 60 FPS in the final months. Available memory also increased, while overheating event rate decreased over time. Burn rate, however, increased during the last 6 months. See **Figure 2** panel bottom rows.

### DISCUSSION

From a dissemination perspective, the Happy Place application can be considered a successful first attempt at distributing consumer VR relaxation applications at an unprecedented scale. Over 40,000 unique users over a 2-year period is well beyond even the largest stress reduction trials, both VR-based (Serrano et al., 2016) and internet- or smartphone-based (Shimazu et al., 2005; Heber et al., 2016). These uptake numbers provide a glimpse of how many individuals can be reached with this type of intervention – a number which can be expected to rise with the continued growth of consumer VR.

The dissemination potential notwithstanding, application usage, indexed both by average session duration and daily users (which did not grow despite a growing number of installations), was lower than expected. Without outcome data, individual data or known variances around the means, we can only speculate on whether the application was successful in inducing relaxation and pain distraction. Although the aim of the current study was not to examine efficacy or effectiveness, the low average session duration and low number of daily active users suggest that most users did not find the experience beneficial enough to use for prolonged durations or repeat regularly; however, this does not rule out the possibility of a subset of users having more frequent and longer sessions, indicative of effects. Further, our real-life data do not allow a differentiation between application engagement, acceptability and immersion, although they are likely highly inter-correlated; the measurement of each of these aspects, and modeling of their moderating and mediating effects, will need to be examined in a future study.

Usage issues can be addressed with application design: providing users with pre-set session duration options can help delivery of a minimum effective dose and will standardize usage, while repeated frequent use can be reinforced by outside-VR prompts (e.g., by email or a companion smartphone application), gamification elements in VR emphasizing progression (e.g., awarding points and badges for accomplishments), options to customize the virtual environment and session (sandbox, "More to explore," design), and other features already ubiquitous in games and smartphone applications (Domínguez et al., 2013). Of importance, even with suboptimal adherence and retention, VR interventions that show small effect sizes at group level may achieve a large public health impact if distributed at scale. In addition, the risk of negative effects in VR is generally low (Fernández-Álvarez et al., 2018) and there is to our knowledge no suggestion in the extant literature that potential failed attempts at self-help translate into a reluctance to seek professional help.

Findings on the three metrics covering usage of input and output devices (gamepad and controller, and headphones, respectively) provide insights that should guide the development of future applications. The relatively low usage of headphones reveals that most users either rely on the built-in speaker or mute the audio. Thus, including high-quality, detailed audio, or making audio an integral but implicit part of the user experience is not advisable at present. Prompts may increase headphone use and future HMDs with built-in high-quality headphones may alleviate the issue. As to the use of external input devices, use of both gamepads and controllers was low, which is not surprising given that the application was not designed for such input devices. Handheld controllers, the movements of which can be translated into virtual hands, have the potential to increase presence through increased interactivity. Since all modern VR platforms now feature handheld controllers, future research should explore how to best use this technology to increase relaxation and distraction effectiveness.

Finally, since the application was not continuously updated to increase application performance, the observed increases in FPS and available memory over time, and decrease of overheating events, likely reflect the release of more powerful devices during the data collection period. The slight increase in battery burn rate, which was moderate in relative terms but very small in absolute terms, likely mirrors the increased performance at the expense of burn rate, as well as a changing device pool. More powerful devices will allow for more stable and content-rich applications, yet the very presence of overheating events and suboptimal framerates stresses the importance of taking computational limitations of the intended device into account at an early stage in development.

### STRENGTHS AND LIMITATIONS

The primary limitation of the current study is that no individual data were available, meaning that between-subject variations in metrics could not be calculated and that associations between metrics could not be examined. Also, only a limited number of metrics automatically collected by the Oculus platform, rather than the application itself, were available. Oculus estimates a 5% maximum error margin, an independent verification of which is not possible since no individual data are available. The analgesic efficacy of the application is currently being evaluated in a pilot randomized controlled trial (NCT03762213).

These limitations notwithstanding, this study reports unique, real-world uptake, usage and application performance data that could only have been estimated using other methods, from a large number of users (forty thousand), and over a relatively long duration (almost 2 years).

### CONCLUSION

We conclude that consumer VR relaxation applications do indeed present an attractive opportunity for unprecedented dissemination that can achieve a much-needed impact on mental public health problems such as stress, anxiety and pain. However, application design must also be guided by both psychological science and real-world usage data, and there is a constant need for research on efficacy, effectiveness and mechanisms of action.


### AUTHOR CONTRIBUTIONS

PL extracted and analyzed data, and drafted the manuscript. WH developed the application. AM, WH, and PC made significant contributions to the interpretation of findings and writing.

### ACKNOWLEDGMENTS

The authors wish to thank Oculus for granting permission to publish the data.



**Conflict of Interest Statement:** PL consults for Mimerse, the application developer, but holds no financial stake in the company (a private limited company). WH is the founder, owner, and Chief Technology Officer of Mimerse. The Happy Place application described in this report does not generate revenue for Mimerse since it was released free of charge.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling Editor declared a past co-authorship with one of the authors PC.

Copyright © 2019 Lindner, Miloff, Hamilton and Carlbring. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Identifying the Added Value of Virtual Reality for Treatment in Forensic Mental Health: A Scenario-Based, Qualitative Approach

Hanneke Kip<sup>1,2</sup>\*, Saskia M. Kelders<sup>1,3</sup>, Kirby Weerink<sup>2</sup>, Ankie Kuiper<sup>1</sup>, Ines Brüninghoff<sup>1</sup>, Yvonne H. A. Bouman<sup>2</sup>, Dirk Dijkslag<sup>2</sup> and Lisette J. E. W. C. van Gemert-Pijnen<sup>1</sup>

<sup>1</sup> Centre for eHealth and Wellbeing Research, Department of Psychology, Health and Technology, University of Twente, Enschede, Netherlands, <sup>2</sup> Department of Research, Stichting Transfore, Deventer, Netherlands, <sup>3</sup> Optentia Research Focus Area, North-West University, Vanderbijlpark, South Africa

Background: Although literature and practice underline the potential of virtual reality (VR) for forensic mental healthcare, studies that explore why and in what way VR can be of added value for the treatment of forensic psychiatric patients are lacking.

#### Edited by:

Federica Pallavicini, Università degli Studi di Milano Bicocca, Italy

#### Reviewed by:

Remco Veltkamp, Utrecht University, Netherlands Luca Morganti, Università degli Studi di Milano Bicocca, Italy

> \*Correspondence: Hanneke Kip h.kip@utwente.nl

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 12 April 2018 Accepted: 11 February 2019 Published: 27 February 2019

#### Citation:

Kip H, Kelders SM, Weerink K, Kuiper A, Brüninghoff I, Bouman YHA, Dijkslag D and van Gemert-Pijnen LJEWC (2019) Identifying the Added Value of Virtual Reality for Treatment in Forensic Mental Health: A Scenario-Based, Qualitative Approach. Front. Psychol. 10:406. doi: 10.3389/fpsyg.2019.00406

Goals: This study aimed to identify (1) points of improvements in existing forensic mental health treatment of in- and outpatients, (2) possible ways of using VR that can improve current treatment, and (3) positive and negative aspects of the use of VR for the current treatment according to patients and therapists.

Methods: Two scenario-based methods were used. First, semi-structured interviews were conducted with eight therapists and three patients to elicit scenarios from them. Based on these results, six scenarios about possibilities for using VR in treatment were created and presented to 89 therapists and 19 patients in an online questionnaire. The qualitative data from both methods were coded independently by two researchers, using the method of constant comparison.

Results: In the interviews, six main codes with accompanying sub codes emerged. Ideas for improvement of treatment were grouped around the unique characteristics of the forensic setting, characteristics of the complex patient population, and characteristics of the type of treatment. For possibilities of VR, main codes were skills training with interaction, observation of situations or stimuli without interaction, and creating insight for others into the patient. The questionnaire resulted in a broad range of insights into potential positive and negative aspects of VR related to the current treatment, the patient, the content of a VR application, and practical matters.

Conclusion: VR offers a broad range of possibilities for forensic mental health. Examples are offering training of behavioral and cognitive skills in a realistic context to bridge the gap between a therapy room and the real world, increasing treatment motivation, being able to adapt a VR application to individual patients, and providing therapists with new insights into a patient. These findings can be used to ground the development of new VR applications. Nevertheless, we should remain critical of when in the treatment process and for whom VR could be of added value.

Keywords: virtual reality, forensic mental health, psychological treatment, delinquent behavior, contextual inquiry

### INTRODUCTION

Virtual reality (VR) has been rapidly gaining ground in mental health research and practice, and evidence so far has warmed many researchers and clinicians up to VR's potential in improving treatment. In VR, patients can enter computer-generated environments, which substitute real-world sensory visual and auditory perceptions with virtual ones (Freeman et al., 2017). Ideally, this will elicit a sense of presence, which is the illusion of actually being in a place, while one is physically situated in another (Witmer and Singer, 1998; Riva et al., 2003; Diemer et al., 2015). VR has several advantages for mental healthcare. Among other things, it can increase treatment motivation of patients because they enjoy using the technology; ensure that the content and form of interventions are tailored to the needs of individual patients; decrease treatment costs because of higher efficiency; and facilitate therapy within a specific environment that cannot be accessed from a therapist's office (Turner and Casey, 2014; Kim et al., 2016; Botella et al., 2017). These qualities make VR an especially appealing technology for psychological treatment, since mental health problems such as phobias, alcoholism or even extreme paranoia are closely intertwined with the perceived environment (World Health Organization, 2004; Freeman et al., 2017). Reviews have indeed shown positive effects of VR interventions for mental disorders such as specific phobias (Botella et al., 2017), PTSS (Botella et al., 2015), psychoses (Veling et al., 2014), and eating disorders (Ferrer-García and Gutiérrez-Maldonado, 2012). According to several reviews, however, few studies have focused on the use of VR with very complex and multifaceted disorders, patients and types of treatment (Turner and Casey, 2014; Freeman et al., 2017).
Examples are patients suffering from multiple disorders, mental retardation, chronic psychiatric problems, and (closed) mental health settings such as hospital wards or forensic units (Freeman et al., 2017). Nevertheless, our recent review pointed out that forensic psychiatric patients – often residing in secured settings and suffering from complex disorders – might especially benefit from the immersive qualities of VR (Kip et al., 2018). More research into the ways VR can be used for these intricate mental disorders, mental health settings and types of patients is required.

Forensic mental health is a subdomain of psychiatry which deals with the assessment and treatment of in- and outpatients whose behavior has led, or could lead, to offending (Mullen, 2000). Forensic mental health has several specific characteristics. Its main difference from regular mental healthcare is that preventing delinquent behavior is an important treatment goal, so treatment takes place at the intersection between law and psychiatry (Arboleda-Florez, 2006). Furthermore, forensic patients often have little treatment motivation, low literacy levels, and are heterogeneous in type of offense, psychopathology and risk factors, so different patients have different treatment goals (Drieschner and Boomsma, 2008; van der Veeken et al., 2018). Also, there are differences in security level: forensic outpatients live at home and receive treatment at an outpatient clinic, whereas inpatients reside in forensic hospitals while preparing for their return to society. All of this points out that forensic mental health is a setting with multiple unique characteristics. VR has been suggested by multiple authors as a potentially effective intervention strategy for forensic in- and outpatients (Renaud et al., 2010, 2014; Fromberger et al., 2014; Benbouriche et al., 2016; Kip et al., unpublished). VR can elicit emotional responses similar to those in real-life situations that are inaccessible for in- and outpatients because of security levels or ethical concerns, for example in the case of sexual offenders (Fromberger et al., 2014; Renaud et al., 2014). Furthermore, specific behavioral skills and coping strategies can be trained in controlled environments that are tailored to the individual patient's dynamic risk factors, without endangering others (Fromberger et al., 2014). Unfortunately, there is little empirical proof to support these claims: to our knowledge, no experimental studies on the use of VR in forensic mental health have been published yet.
Most published studies on VR in forensic mental health focus on the assessment of sexual offenders (e.g., Renaud et al., 2014), and even less is known about possibilities for other types of forensic patients. These gaps in knowledge cannot be filled by generalizing findings from studies on VR in other mental healthcare domains to forensic mental health because of its aforementioned unique characteristics. Especially when not much is known yet, VR applications for specific domains should be thoughtfully developed (Kim et al., 2016) to ensure that they fit the context, patients and therapists that will use it.

While a new wave of VR intervention development is approaching – or perhaps has already arrived – (Turner and Casey, 2014), very little attention has been paid to how these interventions should actually be developed and which development methods are suitable for complex domains and disorders (Dugas et al., 2017; Freeman et al., 2017). The importance of a good development process to guarantee a fit between technology, people and the context has been acknowledged by multiple studies (Coiera, 2004; Nielsen and Mathiassen, 2013; Feldman et al., 2014; Glasgow et al., 2014; Beerlage-de Jong, 2016). A sound development process should start with a thorough contextual inquiry in which stakeholders such as patients and therapists are actively involved (van Gemert-Pijnen et al., 2011). During a contextual inquiry, multiple methods are used to get a good grasp of areas of improvements of a specific context, to investigate how technology can contribute to resolving these issues, and to determine who might benefit from the technology in what way (Holtzblatt and Jones, 1993; Wentzel, 2015). Especially when there is little knowledge on the use of a technology in a specific domain, it is important to thoroughly analyze when and how a technology can be of added value. A way to do this is via participatory development, which promotes a structural cooperation with end-users and other important stakeholders via the use of multiple methods, mainly to ensure that the perspectives of these stakeholders are accounted for (van Gemert-Pijnen et al., 2011; Beerlage-de Jong, 2016). Scenario-based design is a method that fits well with participatory development. On the one hand it can be used to elicit concrete narratives from stakeholders about situations that illustrate which aspects of a current situation can be improved (Lim and Sato, 2006; Anggreeni and Voort, 2008). 
On the other hand, scenarios can be created by researchers to explicitly describe the hypothetical use of a to-be-developed product. These concrete scenarios can be used to support stakeholders in making their needs and preferences explicit (Beerlage-de Jong, 2016). Because methods from scenario-based design can be used to identify points of improvement and preferences regarding a technology according to stakeholders, this approach is a good first step in determining in which ways VR can be used to be of added value for forensic mental health.

In order to create a broad, multifaceted picture of the potential of VR for forensic mental health, the current study combined two scenario-based methods with stakeholders. In interviews, therapists and patients were asked to provide scenarios themselves to gain insight into the current treatment situation and broad possibilities for VR. Based on these scenarios, concrete examples of the application of VR in forensic mental health were created. Scenarios that illustrate the use of these examples in treatment were presented to patients and therapists in an online questionnaire, in order to gain a more detailed view of their opinions, preferences and ideas for the use of VR in treatment. Via the combination of these two scenario-based methods, a broad, multifaceted picture of the possibilities that VR offers treatment in forensic mental health can be painted. This information can serve as a foundation for the development of VR applications in forensic mental health, and possibly also settings that bear similarities to forensic mental health, such as closed psychiatric hospital wards. The main goal of this paper is to identify what the added value of VR for treatment in forensic mental health can be. The three accompanying research goals are to identify (1) points of improvements in existing forensic mental health treatment of a forensic hospital with in- and outpatients, (2) possible ways of using VR that can improve current treatment, and (3) positive and negative aspects of the use of VR for the current treatment according to patients and therapists.

## MATERIALS AND METHODS

In the current study, multiple methods have been used to answer the research questions. The results from two focus groups with patients and therapists were used to structure the interview scheme that was used to interview other patients and therapists. The results of these interviews served as the foundation for six scenarios on possible VR applications that were used in an online questionnaire. In this paper, the focus lies on the interview and questionnaire, of which the methods are described below.

### Study 1 – Interviews

#### Participants

Both therapists and patients were included in this study since they are important stakeholders and (potential) end-users of a future VR intervention. Therapists and patients were recruited at a forensic hospital with in- and outpatients in the east of the Netherlands. All therapists directly involved in any type of treatment were eligible to participate in this study. Convenience sampling via team leaders was used to include therapists. Eight different locations that were representative of the forensic hospital were selected, and one therapist was recruited from each location. Patients were recruited via two therapists who were part of a project team for the development of a VR application. Patients could not participate if they were diagnosed with a current psychosis or mental retardation, or if a therapist indicated that a dangerous situation might arise during the interview. Participation was only allowed when a therapist indicated that the interview would not be uncomfortable or damaging for the patient. Initially, the goal was to involve eight patients, but inclusion was found to be difficult due to unwillingness to participate, so three patients were interviewed.

#### Study Procedure and Interview Scheme

The eleven interviews were conducted in May and June 2017 by one researcher (KW) and took place at the location of the forensic hospital that was most convenient for the participant. The interviews took between 25 and 50 min, with an average of 32 min, excluding the introduction and signing of the informed consent. All interviews were audio-recorded and transcribed verbatim.

A semi-structured interview scheme was used to elicit scenarios of treatment situations that could be improved via VR, according to therapists and patients. Throughout the interview, probing questions were asked to gain more information about classic scenario elements such as actors, their goals, the setting, activities and possible events (Rosson and Carroll, 2002). The interview started with a brief introduction. The introduction began with demographic questions and an explanation of the nature of the interview, which focused on eliciting concrete examples of situations that could be improved with VR. During this introduction, it was explicitly pointed out that the goal of the interview was not to come up with concrete ideas for the content of a VR intervention. In the first part of the interview, the participant was asked to come up with areas of improvements in current treatment, without any input or structuring from the researcher. During the second part of the interview, three categories of possible applications of VR were presented and participants were asked to describe situations in treatment where these might be beneficial. The three categories of types of VR were based on the outcomes of two previously held focus groups with, respectively, 14 patients and 23 therapists from the same forensic hospital. During these focus groups, participants were asked to come up with ideas for VR-applications for forensic mental healthcare individually and in small groups. The generated ideas were coded and categorized by one researcher (HK). The coding process of the focus group data resulted in the three main categories below, which were briefly explained in the second part of the interview to provide the participants with inspiration for their answers.


using relaxing environments), and autism (e.g., training emotion recognition).

• **Insight**. The use of VR to create insight into criminal behavior by looking at behavior from another perspective. This can be achieved by observing one's own behavior from the perspective of another (e.g., seeing a fight between parents from the child's perspective); providing others with the patient's perspective (e.g., showing a significant other what a psychosis looks like); or observing how a patient reacts to a realistic situation with triggers (e.g., patient's response to an environment with alcohol).

Two researchers, a therapist and a patient all provided feedback on the content and structure of the interview scheme. A pilot test was conducted with a psychologist and a former patient, and minor adjustments were made accordingly. Ethical approval was given by the Ethics Committee of the University of Twente (Behavioral, Management and Social Sciences).

#### Data Analysis

Two coders independently coded all transcripts (HK and SK), using the method of constant comparison (Boeije, 2002). First, the coders read the transcripts to familiarize themselves with the content. Then, all fragments that were related to either one of the research questions were selected. We distinguished between fragments that focused on points of improvement in the current situation (research question 1) and possible applications of VR that could be of added value (research question 2). Based on these fragments, two preliminary coding schemes were created inductively: codes were based on the content of the fragments and not established beforehand. We identified several main codes with accompanying sub codes and used the sub codes to code the fragments. The coders used this first version of the coding scheme to code the first two interviews separately. The disagreements between the two coders were discussed, and adaptations were made to the coding scheme accordingly. Both coders then coded all interviews using this adapted coding scheme. After that, the outcomes of both coders were again compared. In total, 178 fragments were coded; the researchers agreed on 131 of these fragments and disagreed on 47. In case of disagreement, discussion took place until consensus was reached. Minor adaptations were made to the coding scheme throughout the process to ensure that the codes and their definitions optimally fit the data.
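As a quick arithmetic check on the counts reported above, the raw percent agreement between the two coders can be derived directly. The sketch below is illustrative only: the function name `percent_agreement` is hypothetical, and raw percent agreement is a simple descriptive measure that the study itself does not report.

```python
def percent_agreement(agreed: int, disagreed: int) -> float:
    """Return raw inter-coder agreement as a percentage of all coded fragments."""
    total = agreed + disagreed
    if total == 0:
        raise ValueError("no coded fragments")
    return 100.0 * agreed / total


# Counts reported in the text: 131 agreements and 47 disagreements (178 fragments).
agreement = percent_agreement(131, 47)
print(f"{agreement:.1f}% raw agreement")
```

Raw agreement does not correct for agreement expected by chance. A chance-corrected statistic such as Cohen's kappa would require the full code-by-code classification of both coders, which is not given here.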

#### Study 2 – Online Questionnaire

#### Participants

The target group of the online questionnaire consisted of (former) patients and therapists in forensic mental health in the Netherlands. We made use of convenience sampling and recruited participants in several ways. On a national level, the link to the questionnaire was posted on national websites, in newsletters and via a national conference on forensic mental health. Additional sampling activities were conducted in the forensic hospital in the east of the Netherlands where the interviews took place. Flyers were distributed amongst employees and the in- and outpatients, the link was posted on the website of the forensic hospital and e-mails were sent to the staff.

#### Study Procedure and Questionnaire

In the scenario-based questionnaire, six examples of the use of VR in forensic mental health were presented to the participants via six videos of on average 1.5 min. These scenarios were generated by a multidisciplinary project group consisting of researchers, patients and therapists, and were based on the outcomes of the interviews. A brief explanation of the content of the six ideas is presented in **Attachment 1**, and the videos can be watched here: https://bit.ly/2sYkbTM. The questionnaire started with a brief introduction, an informed consent and questions on demographics and other relevant background information. After that, the videos were presented to the participants in randomized order to ensure that all videos would receive a comparable number of responses. After watching each video, participants were asked to grade the idea and fill in the Personal Involvement Inventory (PII; Zaichkowsky, 1994). Since the goals of the current study are of a qualitative nature, these quantitative results are beyond the scope of this paper and will be discussed elsewhere (Kip et al., unpublished). After the grade and the PII, three open-ended questions were presented: one question on what participants found positive, interesting or exciting about the idea, one question on what they found negative, less appealing or unfavorable, and one on suggestions to improve the idea. In total, 108 participants, of whom 19 (18%) were patients and 89 (82%) therapists, participated in the questionnaire. On average, participants spent 21 min on the questionnaire, and 49% of all participants fully completed the questionnaire. Ethical approval for this study was given by the Ethics Committee of the University of Twente (Behavioral, Management and Social Sciences).
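The rounded sample composition reported above can be reproduced from the raw participant counts. The following is a minimal sketch for checking the arithmetic; the helper name `group_shares` is hypothetical and not part of the study's analysis.

```python
from typing import Dict


def group_shares(counts: Dict[str, int]) -> Dict[str, int]:
    """Return each group's share of the total sample, rounded to whole percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty sample")
    return {group: round(100 * n / total) for group, n in counts.items()}


# Participant counts reported in the text (108 respondents in total).
sample = {"patients": 19, "therapists": 89}
print(group_shares(sample))
```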

#### Data Analysis

For the analysis, the answers to the three open-ended questions of all six scenarios were analyzed together because we were interested in positive and negative aspects of VR in general, and not specifically for each idea. Two coders independently coded all answers (AK and IB), applying an inductive, iterative approach based on the method of constant comparison. Consequently, the questionnaire data were analyzed in the same way as the interview data. Two coding schemes were created: one on potential positive, and one on potential negative aspects of VR. Most answers to the question on suggestions could be categorized under either positive or negative aspects. The remaining suggestions were either too specific or focused on details of the ideas, so they could not be used to answer this study's research question. Consequently, we did not provide a separate table with suggestions. The initial two coding schemes were developed by the two coders based on the answers that were given for two ideas. After elaborate discussions with another researcher (HK), the coding schemes were adapted and used to code all data. Throughout this process, constant adaptations were made to make sure the codes fit the data as closely as possible. The same main codes were identified for the positive and negative aspects, but the two coding schemes contain different sub codes.

### RESULTS

#### Study 1 – Interviews

#### Demographics


A total of 11 participants were included in this study, of whom eight were therapists and three were patients. Three therapists were male and five were female, their average age was 46.88 (SD = 14.11), and their experience in forensic mental health ranged from 1 to 30 years, with an average of 14 years. Two therapists were psychologists, five were forensic nurses/socio-therapists, and one was an art therapist. Half of them worked in inpatient care, three in outpatient care, and one therapist worked in Forensic Flexible Assertive Community Treatment (ForFACT). All three included patients were male and were on average 48.67 years old (SD = 2.89). They had received an average of 2.83 years of forensic treatment, with a range of 1–4 years. One patient received inpatient care, the other two outpatient care.

### Points of Improvement in Treatment

Therapists and patients provided scenarios of situations in current treatment that could be improved, without coming up with concrete solutions. The identified main and sub codes and their accompanying definitions are provided in **Table 1**.

#### Characteristics of the Forensic Setting

This main code is related to the unique characteristics of the forensic setting that distinguish it from most types of regular mental healthcare and can be accompanied by fairly unique issues. Preventing delinquent behavior and successfully reintegrating in society are important treatment goals, especially for patients who are excluded from society for a longer period of time. According to the participants, the transfer from a closed setting to living independently again is often a big transition, and it can be hard to fully prepare patients for this in a therapy room. First of all, inpatients might not be **emotionally or cognitively prepared for their return to society**. Especially after several years of residing in a closed setting, changes will have occurred within society or in the patient himself. Consequently, some patients lack knowledge of or insight into activities that are required to function well in society, e.g., using public transport or the internet. Also, some inpatients feel anxious about their return.

Second, inpatients can **lack practical skills** that are required for successful societal participation. Daily living skills can be underdeveloped because patients didn't learn or practice these activities during their stay in a closed setting, as was explained by Participant 2:

"Someone who has been locked up for 10 years doesn't know 'outside' anymore, so also doesn't know the entire digital world. He'll still go to the bank and wants to fill in paper forms while that doesn't even exist anymore."

Third, multiple participants indicated that another important skill is **dealing with a new status as an offender**: patients can find it hard to explain their delinquent background in situations that are important for functioning well in society, such as job interviews or meeting new people. Finally, multiple participants indicated that an important issue related to forensic mental health specifically is that the **chances for recidivism are high** after return to society. Besides re-offending, patients can show other undesirable behavior such as drug abuse. Participant 7 explained some of these problems:

"We do make early recognition plans: how do you recognize signals in yourself, and what can someone else do in that, and what can you do yourself? But it remains a piece of paper, it remains: I should do this, or should do that. But we all know that we are driving through a red traffic light every once in a while, and that we actually shouldn't do that. Sometimes people do things they actually shouldn't have done."

#### Patient Characteristics

This main code refers to difficulties in treatment that arise because of specific characteristics of the forensic psychiatric patient population, which was seen as complex by multiple therapists.

Participants mentioned that an important element of this complexity is a **low motivation for treatment** and an accompanying resistance to actively participate in all parts of their therapy. Also, according to therapists, a large share of the **patients lack the cognitive skills** to grasp all elements of their treatment. This might be because treatment activities or assignments require a certain level of abstract reasoning and reflecting that is too difficult, or the reasoning of a therapist is hard to understand, which is illustrated by a quote of Participant 4:

"They often don't understand things and become angry and nervous because of that. If you explain things, it should be very brief, otherwise they don't get it. There is much to gain there."

Finally, therapists indicated that many forensic psychiatric patients have difficulties in **regulating their emotions**. On the one hand, patients might be too anxious or stressed before or during treatment, while on the other hand, it might be difficult to provoke specific emotions that are required for specific types of treatment, as is explained in the quote below of Participant 1 on EMDR:

"Well, with EMDR you are working with eliciting that trauma, you want the anxiety to be as high as possible, and only then you start decentration. And with some people that doesn't always work, it is advised against for people who don't feel emotions, for example."

#### Treatment Characteristics

TABLE 1 | Points of improvement of the current treatment according to therapists and patients (n = 11).

<sup>a</sup>The total number of times a code was mentioned in all interviews. <sup>b</sup>The number of different therapists that mentioned a code, and (#) the total number of times the code was found in all interviews with therapists. <sup>c</sup>The number of patients that mentioned a code, and (#) the total number of times the code was found in all interviews with patients.

This main code refers to issues that arise because of the nature of the therapy, which often takes place one-on-one, and in a closed setting or therapy room that does not resemble the real world. An often-mentioned sub code was **skills training in context**. Participants mentioned that patients have to develop, practice and improve behavioral or cognitive skills that are required for their functioning in society during therapy sessions, such as social, emotion regulation, or relaxation skills. However, these types of skills can often only be practiced in a therapy setting, which requires a lot of imagination, and not in a realistic context with realistic stimuli and environments. Therapists are often restricted to discussing situations instead of actually practicing them, as was illustrated by Participant 2:

"People who keep on finding it difficult, who have been incarcerated for a long time or don't have good social skills anyway. Then you'll say: hey, practice! Some things are already being done with eMental Health, but I think that you cannot learn social skills from a screen: you have to experience and do."

Another issue related to the limitations of treatment is that **therapists do not always have as much insight** into a patient's mental disorder, problematic behavior or triggers of delinquent behavior as they might require for optimal treatment. This can be caused by difficulties with self-reporting instruments, the inability to observe specific, offense-related behavior in context, or social desirability during conversations with a therapist, which was explained by Participant 2:

"I do have a patient, and I'm thinking: what is this, then? You always have - especially with sexual offenders - social desirability. And in the social desirability, me and other colleagues as well are wondering: is this the patient? Or is this the patient in the social desirability that has been admitted to the clinic, and that knows: 'I have to do this to progress in my treatment'? So in how far is someone calculating, and is someone controlling certain things?"

Furthermore, multiple therapists indicated that there are forensic psychiatric patients with anxiety disorders that require **exposure therapy**, but this can be difficult to arrange, either because of legal restrictions which prescribe that a patient is to remain in a closed setting, or because of practical constraints which make it difficult to present the fear-eliciting stimuli or situations to a patient, e.g., in case of a fear of flying. Also, several participants mentioned that currently, little attention is paid to the **physical activity of patients**, either during their day-to-day life in a closed setting, or during therapy sessions. Finally, another point of improvement was that **significant others** of patients could be more involved in their treatment, partly because of their (often) important role in the prevention of re-offending.

#### Possibilities of VR to Improve Treatment

Besides points of improvement, therapists and patients also provided multiple scenarios of possible ways of using VR to improve the current treatment situation. The identified main and sub codes that arose from the inductive analysis of the interviews and their accompanying definitions are provided in **Table 2**.

#### Skills Training With Interaction

TABLE 2 | Possibilities of VR to improve current treatment according to therapists and patients (n = 11).

<sup>a</sup>The total number of times a code was mentioned in all interviews. <sup>b</sup>The number of different therapists that mentioned a code, and (#) the total number of times the code was found in all interviews with therapists. <sup>c</sup>The number of patients that mentioned a code, and (#) the total number of times the code was found in all interviews with patients.

This main code refers to the possibility of VR to develop, practice and improve specific skills in a realistic context, in which interaction with virtual avatars is possible. This type of interaction can be seen as a more realistic form of roleplaying during treatment because an ecologically valid context can be added to the interaction between therapist and patient. VR can be used for different types of skills. Participants pointed out that VR provides many opportunities for patients to develop and improve **basic, practical skills that are required for daily living** and functioning well in modern-day society. Participant 3 provided some examples of these types of skills:

"I would really like it if the people here can bear some more responsibilities and will also take those. That we can also offer them these responsibilities. And that ranges from daily living activities, to working, to going to the city, to getting up on your feet again, to searching a girlfriend again: the entire range."

Second, participants suggested that **social skills** can be practiced in virtual environments. This refers to skills that are required for good and healthy social interactions that will not lead to any undesirable behavior. Third, **emotion regulation skills** can be trained in VR: coping skills that support the patient in not giving in to impulses when confronted with difficult, emotion-eliciting situations or stimuli. The following quote of Participant 7 explains this in the case of aggression:

"Yeah, and then for aggression, because I was talking about it a while ago with a patient who said: 'if someone's looking at me and that person doesn't look away, it doesn't even have to be an acquaintance. . .'. That patient really feels like: I am the boss and if the other one doesn't look away. . . Then you get macho behavior and it goes wrong. I would like to be able to practice that. So regulating emotions, regulating aggression, eliciting aggression. That really adds something."

#### Observing Without Interaction

This main code refers to using VR to facilitate the patient in the mere observation of virtual situations or environments, in which communication with another person does not play a role. One option that was mentioned was the use of VR to **expose forensic psychiatric patients to stimuli or situations** that elicit negative emotions and cognitions. These stimuli or situations can be associated with phobias or anxieties, but might also be related to the offense, for example children in case of a pedophile. An example on exposure to drugs was provided by Participant 9, a patient:

"How do you respond to being exposed to drugs? Yes, a coffee shop in VR, or, let's keep it simple, just a dealer on the street. And how does someone respond to it?"

Also, multiple participants suggested that forensic psychiatric inpatients can observe **regular, realistic daily life situations or environments** to get re-acquainted with society. Furthermore, VR might be used to **observe desirable or undesirable behavior**, in order to increase the patient's insight. Participants mentioned that patients can watch themselves from the perspective of an outsider but can also watch similar behavior displayed by another person. Patients can observe mental disorders such as schizophrenia, but also offense-related behavior, which was illustrated by Participant 1:

"There was domestic violence and he was then, he moved to another place. And then he heard the upstairs neighbor who was being, well, beat up by her partner. And then he said: 'only then I realized what that looks like from the outside, through a window, so to speak'. So it worked really well there, so I believe that it will definitely have added value."

#### Creating Insight for Others

This main code refers to the possibility of VR to give the patient's therapists and significant others new insights into problematic behavior and/or mental disorders of a patient, in order to increase their understanding and to better support the patient. According to participants, VR can be used to provide the therapist with an increased **insight into the patient's reactions to realistic triggers** when he or she is confronted with a stimulus or situation in an ecologically valid way. This can increase a therapist's knowledge of a patient, which was illustrated by Participant 7:

"The one person doesn't look away, the other one doesn't look away, and then things start stirring up inside. Now you can talk about it, but if it actually happens you are not there. And in VR you are actually there, and you can see how someone responds and what it does to someone physically."

TABLE 3 | Potential positive aspects of the use of VR in treatment according to therapists (n = 89) and patients (n = 19).

<sup>a</sup>The total number of times a code was mentioned in all answers together, and (#%) the percentage of the total number of all 466 codes. <sup>b</sup>The number of times a code was mentioned by a therapist, and (#%) the percentage of the total number of times the code was mentioned by therapists and patients. <sup>c</sup>The number of times a code was mentioned by a patient, and (#%) the percentage of the total number of times the code was mentioned by therapists and patients.

Furthermore, therapists and significant others can actually **see the point of view of a patient**. VR can be used to provide a realistic experience of how it is to suffer from a mental disorder, for example a psychosis. Another way to gain insight into the point of view of a patient is to view how the patient experienced a situation in which an offense took place, as was explained by a patient, Participant 10:

"Yes I think especially loved ones, for me that's the case. [. . .] And she doesn't see why I have become this way, so to say."

## Study 2 – Online Questionnaire

#### Demographics

In total, 19 forensic psychiatric patients (2 female; mean age 41.53; SD = 7.37) and 89 therapists (62 female; mean age 38.79; SD = 12.51) working in forensic mental health participated in the questionnaire. Six patients were inpatients; the remainder were outpatients. On average, patients had been treated in forensic mental healthcare for 7.44 years (SD = 7.89). Four patients were treated for aggressive delinquent behavior, six for sexual delinquent behavior, and nine patients did not indicate the main focus of their treatment. The 89 therapists worked at 22 different Dutch forensic mental health institutions. On average, they had 9.45 years of experience in forensic mental health (SD = 8.84). Several therapists worked in multiple settings: 49 participants worked with inpatients in closed settings, and 61 delivered outpatient care.

#### Potential Positive Aspects of VR

TABLE 4 | Potential negative aspects of the use of VR in treatment according to therapists (n = 89) and patients (n = 19).

<sup>a</sup>The total number of times a code was mentioned in all answers together, and (#%) the percentage of the total number of all 168 codes. <sup>b</sup>The number of times a code was mentioned by a therapist, and (#%) the percentage of the total number of times the code was mentioned by therapists and patients. <sup>c</sup>The number of times a code was mentioned by a patient, and (#%) the percentage of the total number of times the code was mentioned by therapists and patients.

Patients and therapists evaluated the scenarios provided in the questionnaire and wrote down aspects they found positive, interesting or exciting about the examples. The codes that resulted from these answers are provided in **Table 3**.

As is shown in the table, therapists accounted for 86% of the codes and patients for 14%. This ratio is comparable to the division between participants: 82% were therapists and 18% patients. The main code **Treatment characteristics** focuses on advantages of VR that were seen as potentially beneficial for the current treatment. The use of VR to learn new or improve specific types of offense-related behavior in a realistic way during treatment was mentioned most by the participants. Furthermore, only therapists and no patients mentioned the possibility of VR to practice behavior in a safe way, and the generation of new topics that can be discussed in treatment. Compared to the code Treatment characteristics, relatively more patients mentioned codes within the **Patient characteristics** category, which focuses on advantages of VR for patients. The most-mentioned advantage by all participants combined was the possibility to offer the patient new insights into his or her behavior. Also, many patients mentioned the use of VR to gain more understanding of the behavior of others, and the use of VR to increase their motivation for treatment. In the **Content** code, the focus lies on the composition of possible VR applications, for example the visual design and storylines. Three of the four codes addressed the possibility to adapt certain aspects of a VR application to an individual patient. It was also pointed out that realistic behavior of virtual avatars is important and beneficial. The last main code, **Practical**, addresses advantages that are not related to the content of VR or treatment, but to the characteristics of the technology and practical criteria for its use. A relatively high number of patients found the use of a new, innovative technology positive, while most therapists addressed the importance of a VR application that looks and feels realistic to the user.

#### Potential Negative Aspects of VR


Patients and therapists also wrote down points that they found negative, less appealing or unfavorable about the examples. The codes that resulted from these answers are provided in **Table 4**.

For the disadvantages, the percentage of codes mentioned by therapists and patients is again comparable to the ratio of patients and therapists as participants. For the main code **Treatment characteristics**, potential disadvantages or barriers for using VR in treatment were discussed. Most of these codes were mentioned by therapists. The issue that was mentioned most often was that a VR application might not fit or complement their current way of working. Again, relatively more patients mentioned codes belonging to the **Patient characteristics**. The code that was identified most often was that VR might not be suitable for specific types of patients, for example patients with a current psychosis. Relatively many patients pointed out that the use of VR could cause unintended, unnecessary negative feelings, for example an abundance of anxiety because of a specific stimulus. Relatively few codes were identified for the main code **Content**, in which possible disadvantages or pitfalls of the content of a VR application were discussed. Participants worried that skills learned via the VR application would not be relevant for real life and indicated that behaviors and conversations with virtual avatars should resemble the real world as closely as possible. A broad range of codes was identified for the last main code, **Practical**. The most-mentioned disadvantage was that the visual design of a virtual environment would not look realistic enough. This disadvantage was only mentioned by therapists, not by patients. Relatively many patients mentioned that VR might be difficult to use and that learning to use VR might take a lot of time and effort.

When looking at the tables, it becomes apparent that more positive (466) than negative (168) codes have been identified. While both tables provide a broad range of codes that differ from each other, some positive and negative codes seem to contradict each other. For example, visual realism is mentioned as an advantage, but also as a potentially negative aspect. Also, the fit with the current treatment was seen as a positive, but the lack of a good fit with the current way of working was mentioned as a disadvantage as well. Finally, a potentially positive aspect of VR was its suitability for specific groups of patients, but its non-suitability for specific types of patients was also seen as a barrier.

### DISCUSSION

This study aimed to identify points of improvement of current forensic mental health treatment, to find possible ways of using VR to improve the current situation and to identify potential positive and negative aspects of VR according to therapists and patients. Several points of improvement arose from the interviews. First, being isolated from society might cause difficulties for inpatients when preparing for or actually returning to society. Second, the complex and diverse nature of the often low-educated patient population was described as difficult. Third, treatment often does not take place in the setting where problematic behavior is displayed, and thus a realistic context for skills training or the observation of behavior by a therapist is lacking. During the interviews, multiple ways in which VR can address these points of improvement were identified. While the results of the questionnaire pointed out some new possibilities, most of the findings from the interviews were underlined and specified by these results. First, VR can overcome issues related to the (often closed) forensic setting by offering skills training in and observations of realistic scenarios, which can be used to overcome practical, legal and safety barriers. Via VR, new behavior can be learned or improved. Second, VR can fit the forensic psychiatric patient population via emphasis on doing instead of abstract talking. Also, virtual persons and scenarios can be adapted to the needs of individual patients, resulting in a personalized intervention, which might also improve treatment motivation. Third, VR addresses characteristics of treatment by providing the therapists with new ways of gaining insight into a patient's behavior and cognitions, and by letting significant others experience the point of view of a patient. With regard to potential negative aspects, therapists and patients feared that a VR application cannot be sufficiently adapted to an individual, might not be realistic enough, or that it does not have any added value for existing treatment. The answers of both methods show several contradicting codes, which partly points out the differences between the opinions and preferences of the different participants. Based on these results, it becomes apparent that there are numerous ways in which VR can be of added value for treatment in forensic mental health, as long as the opportunities technology offers are adapted to the characteristics of the patients, therapists and forensic context.

An important finding that arose from both the interviews and the questionnaire was that VR should serve as a bridge between a closed setting or therapy room, and real-life situations. The broad range of possibilities of VR to practice behavior in a realistic context and to equip offenders with cognitive and behavioral skills to prevent recidivism, without endangering others, has been pointed out by multiple authors (Benbouriche et al., 2014; Fromberger et al., 2014; Renaud et al., 2014). Many therapists indicated in the interviews that a large share of the forensic psychiatric patient population has difficulties with abstract reasoning and lacks a certain amount of imagination required for roleplaying. According to participants of the interviews and questionnaires, performing behavior in a realistic virtual scenario instead of discussing it can be of much added value for these patients. This possibility is underlined by a recent review, in which the authors stated that people can react and behave in a genuine, realistic way in VR environments (Gonzalez-Franco and Lanier, 2017). Experiencing emotions and behaving in VR as one would do in circumstances in reality is a behavioral correlate of a sense of 'presence' (Riva, 2011; Slater and Sanchez-Vives, 2016). This sense of presence can improve skill acquisition and knowledge transfer, partly because of a situated performance in VR (Martirosov and Kopecek, 2017). However, several participants of the questionnaire indicated that skills learned in VR might not be transferable to real life. More research is required on transfer of skills and the added value of VR compared to in-person role-playing. Nevertheless, the findings of our studies, combined with existing literature on presence in VR, endorse the potential of VR in overcoming current problems with practicing and observing behavior in context.

Another theme that often recurred in the interviews and questionnaires is related to the potential of VR to acquire new insights into patients. Current risk assessment instruments provide evidence-based ways of gaining a thorough understanding of a patient's static and dynamic risk factors, but the accuracy of such risk assessment tools is imperfect, for example with regard to predictive validity, that is, the extent to which the scores on these tools can predict recidivism (Douglas et al., 2017). Interviewed therapists indeed indicated that much remains unknown about a patient, for example about behavior outside of the therapy setting or reactions to realistic triggers such as drugs or aggressive others. Because of VR's aforementioned qualities, participants suggested that therapists can study patients' responses to a VR scenario in an ecologically valid way. Via this information, therapists can gain more insight into risk factors and make their treatment even more responsive to the individual patient. Several researchers have used VR in a comparable way: they studied the use of VR to assess reactions to virtual children in pedophilic men and combined this with Penile Plethysmography (PPG) to also assess physiological arousal (Renaud et al., 2005, 2014; Benbouriche et al., 2014). Eye tracking was also suggested as a means of improving assessment of sexual offenders in VR (Fromberger et al., 2015). Based on the ideas of the interviewed participants, it seems that this way of using VR can be applied to other types of patients and offenses in an ethically sound, ecologically valid and safe way as well, e.g., to identify triggers for reactive aggression. However, much research is still needed to answer questions about the reliability and validity of such approaches.
Furthermore, it is important to ensure that these types of VR applications fit existing assessment and treatment approaches, such as the Risk-Needs-Responsivity model (Bonta and Andrews, 2007).

A third important finding is related to the importance of personalization of VR. Both the conducted interviews and literature show the broad nature of the forensic psychiatric patient population (van der Veeken et al., 2018). Accordingly, participants provided a broad range of areas of application of VR, e.g., for aggression, autism, mental retardation, sexual offenders, and psychosis. This implies that a one-size-fits-all intervention is not suitable for forensic psychiatric patients (Kip et al., 2018; Kip et al., unpublished). Indeed, an important theme that came forward in the answers to the questionnaire was the importance of adapting virtual persons, environments and scenarios to the needs of individual patients. Consequently, it should be possible to personalize a VR intervention to the characteristics of individual patients, mainly to ensure that the technology is personally relevant. This might be especially important for patients that have difficulties with abstract reasoning and imagination. Studies on personalization of eHealth in general showed that adapting an intervention to a specific individual can lead to more effectiveness (Kaptein et al., 2015; Lentferink et al., 2017). Furthermore, personalization might also lead to more treatment motivation, which was an important topic in both interviews and questionnaires, especially for patients. Based on the results of the current study, we expect that personalization of VR interventions would be beneficial for treatment, but since not much is known yet about the working mechanisms and benefits of personalization in VR, more research on this topic is required.
Merely stating that personalization is important will not suffice in the long term: once a personalized VR intervention is developed, evaluation studies should pay attention to questions about the added value of personalization, which way of using VR works best for which type of patient, and whether VR can and should be used for all types of patients.

Finally, it is important to note that the results of this study do not serve as ready-to-use ideas for interventions: the identified avenues should be further developed in a systematic way, for example via the multidisciplinary CeHRes Roadmap (van Gemert-Pijnen et al., 2011).

### Strengths and Limitations

The main strength of this study is the combination of two scenario-based methods. The results of the interviews provided much insight into the current situation and could be used to create valid, realistic scenarios for the questionnaire. The findings of both methods complemented each other: the results of the questionnaires were used to further specify the results of the interviews and validate the most important conclusions. This combination of methods resulted in conclusions and recommendations with a solid foundation.

In the interviews, patients and therapists were asked about scenarios based on their own experiences and ideas in an open, explorative manner. Often, stakeholders are involved as mere informants who are asked to react to ideas of researchers or designers (Scaife et al., 1997), but by applying this bottom-up approach, we gained many valuable insights into the current situation and promising directions. When used with therapists, this method resulted in a broad range of points of improvement and possibilities. No new codes were identified in the last interviews. Nevertheless, this way of interviewing appeared to be difficult because it requires a specific way of asking probing questions to elicit complete scenarios. Because several scenarios were not as elaborate as they should have been, there was some disagreement between researchers during the coding process, but after discussion, consensus was reached on all codes. While this way of interviewing seemed to be suitable for therapists, it proved to be more difficult for patients. They especially struggled with the first, broad part of the interview in which no input was provided by the researcher. We found slightly more points of improvement in the second part, in which the categories from the focus groups were used. This is in line with another study on participatory eHealth development, which stated that merely asking about needs and wishes requires too much imagination, so concrete examples should be used to prompt reactions and ideas (Beerlage-de Jong et al., 2017). Since the interviews with patients did not provide us with enough results and including patients proved to be very difficult, we decided not to continue with the interviews and initiated a complementary method: a questionnaire that made use of concrete scenarios.

While including patients in the questionnaire proved to be more difficult than including therapists, the quality of the patients' responses proved to be much higher than in the interviews. This is underlined by the fact that the percentage of codes found in the answers of patients was in line with the proportion of patients that participated. Consequently, time-consuming methods that require a certain level of abstract thinking, such as the hour-long interviews, might not be suitable for most forensic psychiatric patients, while methods that are experienced as enjoyable, such as an online questionnaire with videos, can promote more involvement in vulnerable populations (Dugas et al., 2017). There might be several reasons why patients were nevertheless harder to include than therapists. Dugas et al. (2017) state that participating in research should be rewarding for vulnerable patients. It might be that forensic patients did not perceive any direct rewards or positive consequences in participating in the interviews and questionnaire. Also, forensic patient populations are known for their low treatment motivation (Drieschner and Boomsma, 2008), which might explain the low motivation to participate in research. Unfortunately, too little research on methods that are suitable for these types of populations is available. To support researchers in choosing methods that fit vulnerable target groups with specific characteristics, for example low educational level or severe psychiatric disorders, studies should pay attention to the development process of their eHealth intervention, publish about these processes in a replicable and transparent way, and critically reflect on the methods used.

An important limitation of both studies regards the representativeness of the population. The interviews were conducted in one forensic hospital in the Netherlands, which might raise questions about the generalizability of results to other forensic hospitals. However, half of the therapists that participated in the questionnaire worked in other forensic settings in the Netherlands, which enhances the generalizability of the results. Most patients that filled in the questionnaire received treatment from the same hospital in which the interviews were held, but on average they had spent 7.5 years in forensic care. This makes it very plausible that a large share of these patients has experience with different types of treatment with differing levels of security (Nederlandse Zorgautoriteit, 2017; Stichting IFZ, 2017), which increases the generalizability of their opinions. Finally, while the Dutch forensic mental healthcare system differs from that of other countries, most findings of the current study seem to be consistent with recommendations of studies from other countries (e.g., Benbouriche et al., 2014; Fromberger et al., 2014), so we believe that the identified possibilities of VR are valuable for different countries as well.

### CONCLUSION

The results of both qualitative, scenario-based studies provide insights into the added value of VR in treatment of complex populations such as forensic in- and outpatients, and can also serve as input for new, meaningful VR-interventions that have the potential to improve quality of care, if developed and implemented thoroughly. There is not one optimal way of using VR, but a broad range of possibilities that can improve treatment in forensic mental health, e.g., developing new skills in context, exposing patients to the outside world, or providing therapists with more insight into a patient. This study pointed out that personalization is essential to make the most out of all these possibilities: VR scenarios should fit the individual needs, characteristics and treatment goals of a patient. While there is much potential, we should remain critical of when VR has added value and for whom. A thorough and continuous development and evaluation approach, in which methods that are suitable for this complex setting are used, is key for sustainable use of VR in forensic mental health.

### AUTHOR CONTRIBUTIONS

HK, KW, DD, and YB contributed to the design and planning of the study. HK, DD, and YB collected the focus group data. KW collected the interview data. HK and DD collected the questionnaire data. HK and SK analyzed the interview data; AK, IB, and HK analyzed the questionnaire data. All authors contributed to the reporting and interpretation of the results and approved the manuscript.

### FUNDING

This study was part of a larger project on virtual reality, VooRuit met VR, funded by Stichting Vrienden van Oldenkotte.

### REFERENCES

Zaichkowsky, J. L. (1994). The personal involvement inventory: reduction, revision, and application to advertising. J. Adv. 23, 59–70. doi: 10.1080/00913367.1943.10673459

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Kip, Kelders, Weerink, Kuiper, Brüninghoff, Bouman, Dijkslag and van Gemert-Pijnen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## ATTACHMENT 1 – DESCRIPTION OF THE SCENARIOS USED IN THE QUESTIONNAIRE

All videos were between 1 and 2 min long, and each contained a brief explanation of the goal of the VR application, its embedding in the existing therapy, an example to illustrate the idea, and an explanation of the desired outcomes. Voice-overs were added to clarify the video and provide further explanation where necessary. The ideas used in the videos are described below. All videos (audio in Dutch, but with English subtitles) can be watched via the links after the descriptions of the video.

### Triggers and Helpers

This type of VR application focuses on dealing with specific triggers, which can elicit undesirable feelings, thoughts and behaviors in patients. Examples are alcohol, fans of another football club, drug dealers or, as shown in the video, women. In this type of VR application patient and therapist can look for helpers that support the patient in dealing with triggers (https://youtu.be/cKg6M1yoSa8).

### Observing and Interpreting Body Language

This VR application focuses on the reaction of patients to the body language of another person in daily life. The patient observes situations in which the non-verbal behavior of another is central. Think of a person that walks too close by, a cashier that seems to ignore you, or someone on a terrace who stares at you. The patient discusses the thoughts and feelings that arise in these situations with the therapist, and together they look for more appropriate reactions (https://youtu.be/iBOizRz0xG8).

### Body Language and the Effect on Others

This application focuses on the patient's insight into the influence of his or her body language on another person. The influence of the environment, e.g., a quiet or crowded room, is also accounted for. The patient gains more insight into the effect of his or her own body language, such as an intimidating posture or restless, agitated movements (https://youtu.be/7sIKte7Dmo0).

### Roleplaying in Context

This idea for a VR application focuses on practicing social skills via a roleplay in a virtual environment. The therapist can play the other person via a voice-distorting microphone. The physical appearance and environment can be adapted to optimally fit the patient. An example is a roleplay in a train compartment, a crowded bar or a football stadium. Via this application, the patient can develop and improve social skills in a realistic context (https://youtu.be/T5njasY9YBg).

### Moments of Choice

This VR application focuses on gaining insight into the consequences of one's own behavior. The patient can be placed in different types of virtual scenarios. During a scenario, multiple moments of choice are presented and the patient has to indicate what he or she would do and why. Based on the decision, the consequences are displayed in the virtual scenario, and this is discussed with the therapist (https://youtu.be/1wGGynUqTCM).

### Offense Script

In this VR application, patient and therapist work with a virtual construction box to create virtual environments. They can create an individualized crime scenario that the patient can enter via VR goggles. In this environment the patient can analyze behavior and antecedents of behavior with the therapist, and together they can look for alternative, better behavior (https://youtu.be/ZJCJMQEnfc4).

# User-Centered Virtual Reality for Promoting Relaxation: An Innovative Approach

Silvia Francesca Maria Pizzoli<sup>1,2</sup>\*, Ketti Mazzocco<sup>1,2</sup>, Stefano Triberti<sup>1,2</sup>, Dario Monzani<sup>1,2</sup>, Mariano Luis Alcañiz Raya<sup>3</sup> and Gabriella Pravettoni<sup>1,2</sup>

<sup>1</sup> Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy, <sup>2</sup> Applied Research Division for Cognitive and Psychological Science, European Institute of Oncology IRCCS, Milan, Italy, <sup>3</sup> Institute of Research and Innovation in Bioengineering, Universidad Politécnica de Valencia, Valencia, Spain

Virtual reality has been used effectively to promote relaxation and reduce stress. Two main approaches to achieving such aims can be found across the literature. The first one is focused on generic environments filled with relaxing "narratives" to induce control over one's own body and physiological response, while the second one engages the user in virtual reality-mediated activities to empower his/her own abilities to regulate emotion. The scope of the present contribution is to extend the discourse on VR use to promote relaxation by proposing a third approach. This would be based on VR with personalized content, drawing on user research to identify important life events. As a second step, distinctive features of such events may be rendered with symbols, activities or other virtual environment content. According to the literature, it is possible that such an approach would obtain more sophisticated and long-lasting relaxation in users. The present contribution explores this innovative theoretical proposal and its potential applications within future research and interventions.

Keywords: virtual reality, relaxation, emotion regulation, personalized virtual reality, user-centered virtual reality

#### Edited by:

Stéphane Bouchard, Université du Québec en Outaouais, Canada

#### Reviewed by:

Michaël Reicherts, Université de Fribourg, Switzerland; Massimo Mecella, Sapienza University of Rome, Italy

#### \*Correspondence:

Silvia Francesca Maria Pizzoli silviafrancescamaria.pizzoli@ieo.it

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 01 October 2018 Accepted: 18 February 2019 Published: 12 March 2019

#### Citation:

Pizzoli SFM, Mazzocco K, Triberti S, Monzani D, Alcañiz Raya ML and Pravettoni G (2019) User-Centered Virtual Reality for Promoting Relaxation: An Innovative Approach. Front. Psychol. 10:479. doi: 10.3389/fpsyg.2019.00479

## INTRODUCTION

Virtual reality (VR) has been successfully employed to promote relaxation and reduce stress, and it has notably matured over time, showing a relevant potential in improving and regulating emotional well-being.

Two main approaches aiming at achieving relaxation, stress reduction and emotion regulation can be found in the literature. The first one employs mainly "generic environments" in which users are exposed to relaxing narrative stimuli and try to gain control over bodily physiological activation, while the second one requires users to be active, implying an interaction with VR contents to train emotion regulation.

However, in the epoch of personalization a new approach is needed that provides the user with more adequate and person-adapted techniques. In the present contribution a third approach is therefore proposed, which integrates and extends those mentioned above, but is based on its own specific methodological assumptions.

This third perspective would propose a VR based on personalized contents, built on distinctive features drawn from users' memories of relevant life events, and on adaptive virtual reality. The present contribution presents and discusses this novel approach and its similarities and dissimilarities with the previous ones, highlighting how personalized content features can be identified, along with their possible applications and benefits in the field of relaxation and emotion regulation.

To trace similarities and differences between these three approaches, a list of features was compiled (**Table 1**) to describe their specific and common characteristics.

### THE FIRST APPROACH: RELAXING VR

The first approach, which we will name "relaxing VR" (rVR henceforth), presents contents inspired by or directly derived from classical relaxation techniques such as progressive muscle relaxation, autogenic training, yoga and meditation; typically, the user is shown environments that can help him/her feel safe. Often, virtual environments feature contents that are generically associated with pleasant, peaceful, non-arousing sceneries such as islands, parks, gardens and other open-space, generic nature-based environments. Indeed, as often used in imagery techniques too, these environments have proved to be a valuable means to reduce stress (Baños et al., 2005; León-Pizarro et al., 2007; Villani et al., 2007; Felix et al., 2017), both in healthy and pathological contexts (e.g., pain) (Hoffman et al., 2011). Within the rVR approach, natural scenarios and visual or auditory natural elements have been frequently employed, showing a fair efficacy in several contexts (Annerstedt et al., 2013; Anderson et al., 2017). For example, regarding sound features specifically, nature-based VR scenarios filled with natural sounds proved more effective for stress reduction than natural scenarios without them (Annerstedt et al., 2013). These rVR interventions usually present users with multimodal stimuli, involving visual, auditory and haptic modalities, to try to lower physiological activation and gain control over body reactions. Thus, not only visual natural elements, but also auditory natural elements display some intrinsic relaxing properties (Saadatmand et al., 2013). Apart from natural sounds, rVR scenarios present auditory stimuli that may include warm and calm voices giving instructions to relax muscles and relieve stress and negative thoughts. Controlling breathing frequency and amplitude is another technique usually delivered by rVR narratives, in that breathing exercises are widely employed in clinical psychology (Smith, 1999).

TABLE 1 | Characteristics of the three main approaches to virtual reality for promoting relaxation or emotion regulation.

Relaxing VR is a useful application of VR because relaxation states, by lowering general arousal, have been shown to have positive effects on cognitive and physical stress. The main aim of this first approach is to put users in a more or less passive state of relaxation, or to lower physiological arousal, inducing a positive state of well-being. This is the case not only for relaxation-focused interventions, but also for those interventions aiming to induce positive mood states through exposure to emotionally connoted scenarios (Baños et al., 2004, 2008; Riva et al., 2007).

Importantly, such interventions induce a transitory state of relaxation or positive emotion (Felnhofer et al., 2015). Users may even reach a deep relaxation state, but such a physiological state and its benefits are usually not maintained for long in everyday life, and the VR experience can hardly be generalized to other contexts. Indeed, the purpose of rVR is to reach a positive state of well-being in the very moment of the VR exposure, rather than to engage users in a learning process to gain new abilities. This is not to say that rVR cannot produce long-lasting positive effects, as relaxation or positive emotions can produce even medium- and long-term benefits on well-being, coping strategies and health (Folkman and Moskowitz, 2000; Fredrickson, 2001; Tugade et al., 2004; Lyubomirsky et al., 2005; Pressman and Cohen, 2005; Cohn et al., 2009).

### THE SECOND APPROACH: ENGAGING VR

Targeting a learning process to empower users is in fact one of the core differences between rVR and the second approach that can be found in the literature, which we will label engaging VR (eVR henceforth).

Under the term "eVR," we classify those interventions that try to build learning processes about one's own emotional and behavioral abilities, giving users a flexible and modifiable environment. This kind of approach does not merely imply a passive visualization of the virtual environment or exposure to relaxing stimuli; rather, it requires users to interact with virtual contents, permitting the acquisition of specific skills. This is the case for emotion regulation training in VR and for some therapeutic interventions in VR.

Broadly, emotion regulation refers to how individuals influence and express the emotions they experience and how they experience them, and it has been defined as "all of the conscious and nonconscious strategies we use to increase, maintain, or decrease one or more components of an emotional response" (Gross et al., 1998).

Virtual reality therapies have been employed for several psychopathological conditions: anxiety disorders (Baños et al., 2011), specific phobias (Parsons and Rizzo, 2008), eating disorders (Ferrer-García and Gutiérrez-Maldonado, 2012; Ferrer-García et al., 2013), trauma and stress-related disorders (Gonçalves et al., 2012), as well as other serious psychiatric conditions (Maples-Keller et al., 2017a,b). VR therapy interventions have some characteristics in common with eVR, as the virtual environments vary according to the symptoms to be addressed.

A customized environment is used also in "stress inoculation training," which shares some similarities with VR therapies, and in particular with Virtual Reality Exposure Therapy (VRET). Based on the assumption that gradual exposure to fear-inducing stimuli can increase "mental readiness," stress inoculation training can train subjects' stress response capacity (Bosse et al., 2014).

Training emotional reactivity in response to negative stimuli has proved to be a valuable primary prevention intervention for burn-out and psychological functioning for different work categories (Rizzo et al., 2008; Popović et al., 2009; Bosse et al., 2014) and it represents a case of eVR, as it aims to improve stress response capacity, targeting mental presence and emotional reactivity in stressful situations.

During the training, users are helped to gain skills to cope with negative stimuli and feelings, managing the emotions they experience in a more functional way. By increasing emotional management, such interventions have an important impact on well-being and mental health functioning (Serino et al., 2014).

For example, job interviews have been simulated in virtual environments to give participants the opportunity to exercise their ability to manage emotions before and after a 5-week-long emotional skills training (Villani et al., 2017). Also, VR applications have been used to train children with autism spectrum disorder to deal emotionally with everyday social situations (Ip et al., 2018).

For the sake of completeness, it is interesting to note that innovative resources to improve or exercise emotional skills come from the field of video games. Indeed, an association between gaming experiences and improvements in real-life abilities has been established (Lobel et al., 2014), with video games showing a capacity to modulate arousal and interoceptive awareness. As highlighted by a recent review, commercial video games are valuable tools to improve emotion regulation capacities, training emotional intelligence and emotional strategies (Villani et al., 2018). Many emotional experiences can be practiced when playing video games, ranging from primary emotions such as fear, pleasure and anger (Rodríguez et al., 2015b; Wrzesien et al., 2015; Vara et al., 2016; Hemenover and Bowman, 2018), to complex emotions such as feeling "enriched" (Oliver et al., 2016) or feeling socially engaged with other characters. The first kind of emotion is more contingent and easily induced through narratives and ad hoc scenarios, while the latter is usually evoked by complex narratives, requiring for example characters that go through a long and difficult journey (Oliver et al., 2012).

Apart from entertainment, which of course is an emotional outcome of playing video games, players have personally meaningful gaming experiences, encountering opportunities for introspection through identification with characters and avatars (Oliver et al., 2016).

Importantly, the rVR and eVR approaches are far from being mutually exclusive; on the contrary, they can be used conjointly. This is the case, for example, for VRET, which, as in the traditional psychotherapeutic settings from which it is derived, requires the patient to be systematically desensitized (Marks and Gelder, 1965) through a combination of relaxation and exposure. The great majority of the VRET studies in fact combined the two approaches (Maples-Keller et al., 2017a), with an initial phase of psychoeducational methods and breathing or relaxation exercises. The combination of both interventions is to be preferred, as in the traditional techniques, since relaxation training on its own for specific phobia has shown reduced efficacy compared with exposure intervention (Mühlberger et al., 2001).

### THE THIRD APPROACH: PERSONALIZED VR

The third approach, "personalized VR" (pVR henceforth), would be different from the previous ones, specifically with regard to the choice and the construction of VR contents and environments.

Personalized VR would be in fact a user-centered approach to the design and implementation of the VR setting itself.

A pVR would be defined by two main characteristics, one relating to the design of VR contents, and the other to the technology to implement. The design aspect regards a **preliminary investigation about users' relevant life events**, aiming to extract distinctive perceptive features of personal memories and experiences. If a relaxing environment has to be built, a preliminary inquiry about potential users' relaxing life events will be conducted, while for an emotional training VR, users would be asked for emotionally connoted memories. Such descriptions should be accurate, carefully recorded and should include multisensory details: visual, tactile, auditory and even olfactory elements.

These multisensory elements serve both to provide detail and to reveal through which perceptive modality the user re-experiences the autobiographical scene (sounds, temperature, and colors). Users would also be explicitly asked which stimuli they associate with this experience, in order to find significant perceptive cues that are effective in eliciting memories (Holland and Kensinger, 2010).

As a second step, distinctive features of such autobiographical life events and evoking cues would be rendered with symbols, activities or other virtual environment content, through a qualitative analysis of users' descriptions (see **Figure 1**).

The pVR approach would target autobiographical memories both because it is important to show the user's preferred places (Fisher, 1974; Kyle et al., 2004; Korpela et al., 2008; Korpela and Ylén, 2009), and because recalling personal experience can bring sensorial and vivid feelings (Rubin and Kozin, 1984; Tulving, 1985; Wright and Gaskell, 1992; Brewer, 1996; Rubin, 2005).

Not only user-centered approaches (e.g., based on users' feedback for content) have proved to be particularly effective in different applications of VR (Parsons et al., 2009; Rizzo et al., 2011), but knowing which perceptive and contextual elements connote emotional memories can also help in emotion elicitation or emotion modulation, and can inform experimenters and developers on which elements have to be modulated or have not to be included, depending on the target of the intervention. pVR environments and stimuli would be built focused on the specific users' characteristics and, more specifically, on the environments features users give importance and relevance to, eventually acting as a cue for recalling personal past memories and enhancing users involvement.

The vividness of emotional engagement grows when something personal, related to the subjective experience, is presented. Re-evoking personal contents in fact enhance affective psychological states (Picard et al., 2001; Witvliet et al., 2001), augmenting the capacity to elaborate memories and emotional experiences.

Previous studies have already shown that embedding autobiographical stimuli within the VR environment helps to give the experience a vivid emotional connotation and enhances the sense of presence (Baños et al., 2004, 2008, 2009; Rey et al., 2005; Riva et al., 2007).

In the above-mentioned studies, different emotional states (such as relaxation, joy, and sadness) were evoked through autobiographical stimuli presented in different sensory modalities (picture, audio, music, autobiographical recall), suggesting that autobiographical contents in VR may play a valuable role in subjective emotional experience. This might constitute a relevant difference from rVR and eVR, which typically lack contents tailored to their users.

pVR would address stress and emotion regulation, both of which rely on complex circuits involving cognitive, behavioral and psychophysiological responses (Lang, 1968; Borkovec and Costello, 1993), by (a) targeting a bottom-up modality with personalized stimuli and autobiographical cues and (b) addressing physiological correlates through adaptive methods. In a relaxing environment, elements recalling a previous safe place would enhance feelings of security and peace, while in emotion regulation training, providing affectively connoted stimuli could help emotion regulation and arousal modulation by giving users the opportunity to confront their own relevant life events (Holland and Kensinger, 2010).

Relaxation can be guided, for example, with verbal instructions directing attention to muscle activation and breathing (Jacobsen, 1929; Miller, 1987), or induced through cognitive representations of positive thoughts and stimuli (Tusek and Cwynar, 2000; Vempati and Telles, 2002). pVR can be modified according to the relaxation technique employed in the specific intervention; whatever the technique, it would emphasize a user-centered environment and autobiographical engagement (Holland and Kensinger, 2010) to enhance the sense of presence and affective involvement. Furthermore, pVR would provide an adaptive physiological mechanism by monitoring physiological parameters.

Unlike studies employing unique autobiographical contents and stimuli, in which contents have to be specifically selected for each subject, the pVR approach would allow for broader generalization, achieved through the construction of a stimulus library based on the relevant features of memories.

From a methodological point of view, user-centered design techniques would be implemented as the foundation for the VR design. These techniques come from the user experience field, but they are not focused on evaluating technologies (e.g., in terms of usability and functionality); rather, they are meant to be used before the tool or technology is created, in order to inform design in terms of users' needs, intentions and contexts of use (Garrett, 2010; Triberti and Barello, 2016). For example, contextual inquiry (a field research method consisting of a semi-ethnographic interview conducted within the place or context of interest) (Holtzblatt and Jones, 1993) could be a valuable tool to enrich users' subjective testimony with objective measures and properties to be included in a personalized virtual environment; similarly, empathy maps, which are visual diagrams featuring user personas' experiences and needs (Curedale, 2016), could be used to systematize information on users to guide pVR implementation.

User-centered research could allow pVR to develop trans-situational knowledge about the most common or recurrent multisensory features of relaxing experiences, in order to build a library of stimuli that can be used to approximate the personalized virtual scenario for each possible user.

The second main characteristic of the virtual tools created within a pVR approach, relating to the technology to be implemented, is that of **adaptive virtual reality** (Alcañiz et al., 2007, 2009; Parsons and Reinebold, 2012): in order to be truly personalized, a virtual environment should be able to adapt its contents to the user's state at the very moment of the interaction. In other words, pVR environments would feature integrated systems able to sense and analyze users' states by means of psychophysiological correlates, self-reported states, and observable behaviors, in order to modify the virtual experience itself. In this way, pVR would be "user centered" not only in the sense that users' pre-existing needs, experiences and memories are taken into consideration for design, but also in the sense that personalized changes would happen within the virtual instance according to users' current state.

As written above, relevant properties of personal memories would be organized into macro- and sub-categories (e.g., perceptual aspects such as the form and color of objects; content aspects such as the meaning to be communicated through discourses present in the VR; etc.), and the system would manipulate such properties dynamically during any VR instance in order to maintain a set level of immersion and emotional reaction in the user, continually monitored through psychophysiological indexes.
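The closed loop described above can be illustrated with a minimal sketch. It assumes a single normalized arousal index in [0, 1] derived from psychophysiological signals, and a single adjustable stimulus property whose intensity tends to raise arousal. The class `AdaptiveScene`, its parameters, and the proportional update rule are hypothetical illustrations and are not part of the pVR proposal itself.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveScene:
    """Hypothetical virtual scene with one adjustable stimulus property."""

    intensity: float = 0.5  # current stimulus level in [0, 1] (assumed scale)
    target: float = 0.4     # arousal level the session tries to maintain
    gain: float = 0.5       # strength of the correction (assumed value)

    def update(self, arousal: float) -> float:
        """Apply one proportional correction from a new arousal reading."""
        error = self.target - arousal        # negative when the user is over-aroused
        self.intensity += self.gain * error  # lower the stimulus if arousal is too high
        self.intensity = min(1.0, max(0.0, self.intensity))  # keep within [0, 1]
        return self.intensity


scene = AdaptiveScene()
for reading in (0.8, 0.6, 0.4):  # simulated arousal readings, not real data
    print(round(scene.update(reading), 2))
```

A real system would need validated mappings from signals to emotional states and from stimulus properties to their effects; the proportional rule here only shows the sense-analyze-modify cycle.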

In the future, from a technological point of view, this adaptive feature of virtual reality would be exploited by merging VR and Artificial Intelligence (AI). AI techniques such as machine learning, deep learning and natural language processing (NLP) provide computers with reasoning and analytical capabilities that, until quite recently, could be achieved only with standalone servers in specialized laboratories. Today, any field of knowledge can employ AI capabilities thanks to the development of cloud-based AI servers. Historically, AI has been associated with virtual reality technology mainly with regard to the implementation of virtual agents (Luck and Aylett, 2000). However, the implications of merging AI and VR for personalized VR experiences have been less studied. Research into VR focuses on computing techniques that provide humans with natural perceptual ways of interacting and new methods of thinking and learning in virtual and augmented environments. AI provides technologies that allow computers to mimic abilities that are exclusive to humans, such as intelligence and consciousness. These two fields share the goal of enhancing human abilities at the level of perception and knowledge generation. The integration of both research fields will permit the development of more natural and realistic virtual environments in which humans and computers will interact naturally. The synergy between mixed reality interfaces, AI and the high-speed ubiquitous communication networks of the future will generate radically innovative human-human communication channels, with the intelligent processing of signals such as body movements, facial expressions, eye tracking, physiological variables and brain signals, among others. Most reports on emerging technologies focus on three mega-trends: AI everywhere, transparently immersive experiences and digital platforms (Gartner, 2017).

In pVR, such technology will be employed to analyze multiple, integrated data about VR users' current state, to translate these into emotional information, and finally to modify the virtual stimuli. Recent studies (Rodríguez et al., 2015a; Bermudez i Badia et al., 2018; Marín-Morales et al., 2018) showed that it is possible to apply machine learning techniques to measure specific emotions in VR, extracting a set of psychophysiological and behavioral features to support autonomous emotion recognition.

Four main phases are imagined for development of pVR:


(b) then the with those instruments that have been already scientifically tested;

(4) Developing adaptive virtual environments, based on the ability to analyze and exploit users' current states to adapt virtual contents to one's own present state.

#### CONCLUSION

The approach proposed in the present contribution promises to be an innovative advancement in the use of VR to positively influence people's emotional experience. It is important to consider that the pVR approach is still in its infancy; or rather, it has been outlined here only in nuce. However, it is possible not only to identify methodological guidelines for developing it, but also to highlight its possible limitations. Specifically, it is still unclear how to identify the user-research participants who would give developers the most meaningful information for understanding personal experiences of relaxation and emotion. Another sampling issue is connected to the inherently "personalized" nature of the VR contents within the pVR approach: participants could report inadequate or unusable memories. For example, patients with chronic conditions tend to remember negative experiences because of the salience effect (Renzi et al., 2016), which may make it difficult for them to recall positive experiences from before the onset of chronic pain or pathological stress. Second, issues related to pVR implementation still need to be considered, such as its cost compared to other approaches. We speculate that pVR could bring efficacy advantages over non-personalized approaches despite its higher costs, particularly in clinical psychology interventions, where a more general personalization of contents (for example, in the case of specific phobias and VRET) is already applied to accommodate users' specificity. Overall, personalized approaches and personalized medicine bring economic advantages by focusing on personalized needs and on effective interventions for specific patients (Annemans et al., 2013). From an organizational point of view, being based on complex and possibly long-lasting user research, the pVR approach could be difficult to include within organizational practices (e.g., hospitals or other care facilities which employ VR for rehabilitation); indeed, its influence on the practices of health professionals is not easy to foresee and could generate risky courses of action (Fairbanks and Wears, 2008; Gilardi et al., 2014).

While future technological advances in VR and AI promise to enable richer user experiences, perhaps the greatest revolution that will be made possible by next-generation pVR involves the ability to collect, analyze and make use of unprecedented amounts of data, that is, bringing pVR into the age of "Big Data Psychology" (Mitroff et al., 2015).

When the use of pVR sites becomes a commodity, there will be a large potential user base for pVR applications, making it possible to analyze huge samples. The adaptation of A/B testing methods to pVR would give researchers the tools and the sample size to investigate the impact on patients of even minor changes in pVR. The combination of big data and machine learning algorithms, capable of automatically identifying behavioral patterns and statistically predicting outcomes, will also continuously increase pVR efficacy. This analysis need not only be at a coarse level, aggregating data from millions of users, but might also be done at an individual level, for example, by learning which specific types of stimuli evoke a better patient experience in a particular user, and tailoring future scenarios accordingly.
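As a rough illustration of what an A/B comparison between two variants of a pVR scenario might involve, the sketch below runs a two-sided permutation test on the difference in mean outcome between two groups. The function name, the outcome scores, and the choice of a permutation test are assumptions made for illustration; the text does not specify an analysis method.

```python
import random
from statistics import mean


def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled outcomes
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Made-up relaxation scores for two scenario variants (not real data).
variant_a = [6, 7, 5, 8, 7, 6]
variant_b = [5, 6, 5, 6, 4, 5]
print(permutation_test(variant_a, variant_b))
```

With very large user bases, the same comparison could be repeated for each small change to a scenario, which is the kind of use the paragraph above anticipates.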

Future research is needed both to explore user research as the main source of information guiding the design of pVR contents and to test the efficacy of the first prototypes.

#### AUTHOR CONTRIBUTIONS

SP conceived the work and wrote the first draft. KM contributed in conceiving the work and helped to refine the theoretical framework. ST and DM performed the literature search and contributed with writing to the final version. MAR contributed with important intellectual content and helped to refine the theoretical framework. GP supervised the whole process and contributed with important intellectual content. All authors contributed to manuscript revision, read and approved the submitted version.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Pizzoli, Mazzocco, Triberti, Monzani, Alcañiz Raya and Pravettoni. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Putting Oneself in the Body of Others: A Pilot Study on the Efficacy of an Embodied Virtual Reality System to Generate Self-Compassion

Ausiàs Cebolla<sup>1,2</sup>, Rocío Herrero<sup>2,3</sup>\*, Sara Ventura<sup>1</sup>, Marta Miragall<sup>1,2</sup>, Miguel Bellosta-Batalla<sup>1</sup>, Roberto Llorens<sup>4,5</sup> and Rosa M<sup>a</sup> Baños<sup>1,2</sup>

<sup>1</sup> Department of Personality, Assessment and Psychological Treatments, University of Valencia, Valencia, Spain, <sup>2</sup> CIBER of Physiopathology of Obesity and Nutrition (CIBEROBN), Madrid, Spain, <sup>3</sup> Department of Basic and Clinical Psychology and Psychobiology, Universitat Jaume I, Castelló, Spain, <sup>4</sup> Neurorehabilitation and Brain Research Group, Instituto de Investigación e Innovación en Bioingeniería, Universitat Politècnica de València, Valencia, Spain, <sup>5</sup> Servicio de Neurorrehabilitación y Daño Cerebral de los Hospitales Vithas-NISA, Fundación Hospitales NISA, Valencia, Spain

#### Edited by:

Stéphane Bouchard, Université du Québec en Outaouais, Canada

#### Reviewed by:

Joaquim Soler, Psychologist, Spain

Marcela Matos, University of Coimbra, Portugal

> \*Correspondence: Rocío Herrero rherrero@uji.es

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 19 December 2018 Accepted: 17 June 2019 Published: 02 July 2019

#### Citation:

Cebolla A, Herrero R, Ventura S, Miragall M, Bellosta-Batalla M, Llorens R and Baños RM<sup>a</sup> (2019) Putting Oneself in the Body of Others: A Pilot Study on the Efficacy of an Embodied Virtual Reality System to Generate Self-Compassion. Front. Psychol. 10:1521. doi: 10.3389/fpsyg.2019.01521

Compassion-based interventions (CBIs) have been shown to be effective for increasing empathy and compassion, and reducing stress, anxiety, and depression. CBIs are based on constructive meditations where imagery abilities are essential. One of the major difficulties that participants report during the training is the difficulty related to imagery abilities. Virtual reality (VR) can be a useful tool to overcome this limitation because it can facilitate the construction and sustainment of mental images. The machine to be another (TMTBA) uses multi-sensory stimulation to induce a body swap illusion. This system allows participants to see themselves from a third perspective and have the illusion of touching themselves from outside. The main objective of the present study was to analyze the efficacy of a self-compassion meditation procedure based on the TMTBA system versus the usual meditation procedure (CAU) in increasing positive affect states, mindful self-care, and adherence to the practice, and explore the influence of imagery abilities as moderators of the effects of the condition on adherence. A sample of 16 participants were randomly assigned to two conditions: TMTBA-VR and CAU. All participants had to listen to an audio meditation about self-compassion and answer questionnaires before and after the training. The TMTBA-VR condition also had a body swap experience at the end of the meditation while listening to self-compassionate messages. Afterward, they were invited to practice this meditation for 2 weeks and then measured again. After the compassion practice, both conditions significantly increased positive qualities toward self/others, decreased negative qualities toward self, and increased awareness and attention to mental events and bodily sensations, with no differences between the conditions. After 2 weeks, both conditions showed a similar frequency of meditation practice and increases in specific types of self-care behaviors, with the frequency of clinical self-care behaviors being significantly higher in TMTBA. Finally, lower imagery ability in the visual and cutaneous modality were moderators of the efficacy of the TMTBA (vs. CAU) condition in increasing adherence to the practice. Embodied VR could be an interesting tool to facilitate and increase the efficacy of CBIs by facilitating the construction of positive and powerful mental images.

Keywords: compassion, virtual reality, body swapping, full body illusion, self-compassion, mindfulness, meditation

### INTRODUCTION


Compassion-based interventions have been shown to be effective in increasing empathy and compassion (Brito et al., 2018), and reducing stress, anxiety, and depression (Kirby et al., 2017). They have been used in clinical settings, such as oncology (Gonzalez-Hernandez et al., 2018) or personality disorders (Feliu-Soler et al., 2017). Compassion refers to "the feeling that arises in witnessing another's suffering and that motivates a subsequent desire to help" (Goetz et al., 2010, p. 351). When this feeling is focused on oneself, it is called self-compassion, defined as individuals' ability to respond to their own suffering with warmth and the desire to alleviate their own pain (Neff and Dahm, 2015).

Compassion-based interventions use different techniques and meditations to increase self-compassion and compassion skills, such as focused attention meditations to calm the mind and, mainly, the family of constructive meditations (Dahl et al., 2015). In this family of meditation practices, the meditator purposefully strengthens his/her natural capacity for loving kindness and compassion by intentionally generating compassionate thoughts, feelings, and motivations toward different objects, including him/herself (Brito et al., 2018).

In order to induce and train these positive mental states, the family of constructive meditations requires the use of mental imagery abilities. Surprisingly, the impact of these imagery skills on CBIs has not been studied, even though one of the major difficulties that participants report during the training is related to these imagery abilities. According to Pearson et al. (2013), there are four different mental imagery skills related to different processes that could be interacting with these types of meditation: (a) creation, (b) sustainment, (c) inspection, and (d) transformation of mental images. In creation, the meditators have to select the type of images or elements that will be used in the meditation. In the second process, the sustainment of the mental image, research shows that after 250 ms (the time necessary for eye movement) (Kosslyn, 1994), the image starts to decay. Thus, participants usually have to deal with the frustration of not being able to sustain the image long enough, which could interact with their positive emotional state. The third aspect, inspection, refers to the interpretation of an object-based characteristic or spatial property of this generated image. For example, the lack of definition (blurred) and vividness of the mental image is also a significant factor. Finally, transformation includes the capacity for rotation and restructuring.

Although the effects of compassion and self-compassion training are well known, the factors that predict why the training works for some people and not for others have been understudied. In this regard, the absence of adequate training in the ability to create, sustain, inspect, or transform mental images may impede the expected positive effects of compassion training. This lack of ability can lead people to struggle with steps prior to the compassion itself, and this experience can discourage people from continuing the necessary training to develop the compassion skills, like self-care or positive qualities (compassion, equanimity, joy, or loving kindness).

Virtual reality can be a useful tool to overcome this limitation because it can help to construct, sustain, inspect, and transform mental images. VR can be considered an advanced imagery system and an experiential form of imagery that is as effective as reality in inducing cognitive, emotional, and behavioral responses (Day et al., 2004). VR has been used to train compassion and self-compassion. For instance, Slater's group studied how virtual bodies can promote compassion and self-compassion by analyzing the effects of self-identification with virtual bodies within immersive VR on self-compassion in individuals with high self-criticism and depression (Falconer et al., 2014), showing that it could be effective in reducing depression severity and self-criticism. The same group investigated how embodying a black avatar decreases racial prejudice and changes negative interpersonal attitudes (Peck et al., 2013). Bailenson's group also studied how an embodied avatar in VR can make people more altruistic. For example, participants embodied a Superman avatar, and the results showed that they felt more helpful after the experiment (Rosenberg et al., 2013).

All these studies use embodied VR systems, which draw on embodiment, a cognitive science approach that emphasizes, among other aspects, the subjective experience of using and "having" a body. This paradigm has been used to generate Full Body Illusions (Ehrsson, 2007) and body swapping experiments, which have become an increasingly popular method for investigating how illusory ownership of an entire fake or virtual body affects various aspects of bodily perception and experience. Thus, VR allows individuals to be present not only in the environment, but also in someone else's body. VR allows the person to be "inside" another body (e.g., another person or animal), creating a body swap experience that makes it possible to study the embodiment processes, as well as emotional states such as empathy or compassion (Peck et al., 2013; Rosenberg et al., 2013; Falconer et al., 2014).

**Abbreviations:** CBIs, compassion-based interventions; GAD-7, the generalized anxiety disorder questionnaire; MSCS, mindfulness self-care; PANAS, positive and negative affect schedule; PHQ-9, patient health questionnaire-9; QMI, Betts' questionnaire upon mental imagery; SMS, state mindfulness scale; SOFI, self-other four immeasurable scale; TMTBA, the machine to be another; VR, virtual reality.

The Machine to be another is a low-budget body swapping system designed to address the relationship between identity and empathy through the use of multi-sensory stimulation (visual, cutaneous, proprioceptive, and auditory) to induce a body swap illusion (Oliveira et al., 2016). It allows the user to have an immersive experience of seeing him/herself in the body of another person, a performer (Bertrand et al., 2018). The TMTBA is connected to the head-mounted display (an Oculus Rift), and the performer's first-person perspective is captured by a camera controlled by the user's head movements, showing the torso, legs, and arms of the performer's body. The user, through the Oculus Rift, sees the image captured by the camera, creating the illusion of being another person, and seeing him/herself from a third-person point of view.

As mentioned above, this system could overcome some limitations of imagery skills, and it can be combined with self-compassion meditations to generate a powerful emotional response of self-compassion and, therefore, increase adherence to the meditation practice and its effects. Thus, the main objectives of this study are to analyze the effects of a self-compassion meditation supported by the TMTBA-VR system, compared to usual practice (audio only), and to analyze whether imagery skills moderate the effect of the condition on adherence to meditation practice after 2 weeks. The main hypotheses are divided into two groups. On the one hand, effects are expected immediately after the meditation supported by TMTBA-VR, showing its effectiveness in (1) increasing positive qualities toward self/others, decreasing negative qualities toward self/others, and increasing awareness and attention to the present experience immediately after a compassion practice. On the other hand, an effect is expected after 2 weeks of practice, with participants who received the TMTBA-VR condition showing (2) increased adherence to meditation practice, frequency of self-care behaviors, and positive affect, and decreased negative affect, compared to usual practice. Furthermore, it is expected that imagery skills will moderate these results.

### MATERIALS AND METHODS

#### Participants

The study was approved by the Ethics Committee of the University of Valencia (Spain), with registration number H1513592028862. The sample size was determined with the G-Power program, taking an effect size of 0.8 (alpha = 0.05) from the results of Zeng et al.'s (2015) meta-analysis of CBIs, which yielded an estimated required sample of 16 participants. The sample was composed of 16 students from the University of Valencia; 75% were female, and the mean age was 30.56 (SD = 10.86), ranging from 21 to 59. Participants were allocated to one of two conditions: Guided meditation or Guided meditation supported by TMTBA. The inclusion criteria were: (a) being older than 18 years; and (b) having a good level of Spanish or Valencian. The exclusion criteria were: (a) having a current diagnosis of a psychological disorder; (b) currently undergoing psychological treatment; (c) substance use or abuse; and (d) being a regular practitioner of any meditation practice (more than 5–6 times per week). The screening was completed by 22 participants, but only 16 met the criteria.

#### Measures

#### Sociodemographic, Psychological, and Practice-Related Meditation Variables Questionnaire

An ad hoc questionnaire was made to collect information about age, gender, highest education level attained, history of mental or chronic illness, use or abuse of drugs, current psychological treatments, and experience with mindfulness and other meditation practices. This questionnaire was administered as part of the screening process for the study.

#### Patient Health Questionnaire-9 (Kroenke et al., 2001; Wulsin et al., 2002)

This is a 9-item questionnaire that measures the presence of depressive disorders, rated on a 0–3 scale. Total scores range from 0 to 27; higher scores indicate higher levels of depressive symptoms. The PHQ-9 has been shown to have good psychometric properties. This questionnaire was administered as part of the screening process for the study. A Spanish adaptation performed by the authors was used and showed adequate internal consistency for the total score (α = 0.66).
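The scoring rule and the internal consistency statistic reported for this and the following measures can be sketched as below. The sketch assumes the standard formula for Cronbach's alpha, α = k/(k − 1) · (1 − Σ item variances / variance of total scores); the function names are hypothetical, the example data are invented, and the software the authors actually used is not specified in the text.

```python
from statistics import variance


def phq9_total(item_scores):
    """Sum nine items rated 0-3; the total ranges from 0 to 27."""
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("expected nine item scores, each 0, 1, 2 or 3")
    return sum(item_scores)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance(col) for col in zip(*rows)]  # sample variance per item
    total_var = variance([sum(r) for r in rows])       # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented responses from four respondents to three items (not study data).
responses = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 2, 3]]
print(phq9_total([1, 0, 2, 1, 0, 0, 1, 3, 0]))
print(round(cronbach_alpha(responses), 2))
```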

#### The Generalized Anxiety Disorder Questionnaire (Spitzer et al., 2006; Ruiz et al., 2011)

This is a one-dimensional self-administered scale used to detect the presence of the symptoms of anxiety. It is an efficient, quickly applied, reliable, and valid instrument, with a scoring scale from 0 to 3 (0 = nothing, 3 = almost every day). The GAD-7 has demonstrated good internal consistency and test-retest reliability, as well as convergent, construct, criterion, procedural, and factorial validity for the GAD diagnosis (Spitzer et al., 2006; Löwe et al., 2008). This questionnaire was administered as part of the screening process for the study, and internal consistency was adequate for the total score (α = 0.82).

#### Betts' Questionnaire Upon Mental Imagery (Sheehan, 1967; Campos and Pérez-Fabello, 2005)

It consists of 35 items about imagery vividness, rated on a 7-point scale (1 = image perfectly clear and as vivid as the actual experience; 7 = no image present at all, you only "know" that you are thinking of the object). Higher scores indicate low imagery in the seven sensory modalities. Scores on the subscales were calculated according to the Spanish validation (Campos and Pérez-Fabello, 2005), which had adequate internal consistency. This questionnaire was administered in order to use the subscales as moderator variables for the study, and internal consistency was adequate for all the subscales: gustatory and auditory (α = 0.93), kinesthetic (α = 0.98), organic (α = 0.94), visual (α = 0.96), auditory (α = 0.87), and cutaneous (α = 0.92).

#### Positive and Negative Affect Schedule (Watson et al., 1988; Robles and Páez, 2003)

It includes 20 items that evaluate positive affect (10 items) and negative affect (10 items) on a 5-point scale. Respondents indicate how frequently they experience an emotion within a given time framework; in this study, we asked for the past 2 weeks. It has shown good properties of validity, convergence, and divergence, in addition to being a brief, reliable self-report measure. This questionnaire was administered at baseline and at the 2-week follow-up, and internal consistency was adequate for negative PANAS (α = 0.79–0.91) and positive PANAS (α = 0.94–0.95) across the administrations.

#### State Mindfulness Scale (Tanay and Bernstein, 2013)

It has been developed to measure the levels of awareness and attention to present experience during a specific period of time (in the present study, we used the duration of the intervention as a temporal framework) and context (in this case, the self-compassion intervention). The scale is composed of 21 items and includes two subscales: state mindfulness of bodily sensations (6 items) and state mindfulness of mental events (15 items). The scale has shown strong internal consistency reliability for both subscales, as well as the total scale. The scale has also shown adequate construct validity through positive correlations with a state mindfulness measure, and incremental sensitivity to change through demonstrated increases in SMS scores following a mindfulness meditation practice. This questionnaire was administered before and after the compassion intervention. A Spanish adaptation performed by the authors was used and showed adequate internal consistency for the bodily sensations subscale (α = 0.75–0.78) and the mental events subscale (α = 0.93–0.94) across administrations.

#### Self-Other Four Immeasurable Scale (Kraus and Sears, 2009)

It is a measure designed to assess the four main qualities of Buddhist teachings: loving kindness, compassion, joy, and acceptance toward both the self and others. It is a 16-item scale, rated on a five-point Likert scale. The scale has a structure composed of four subscales: positive qualities toward self, positive qualities toward others, negative qualities toward self, and negative qualities toward others. All the subscales have high internal consistency and have shown appropriate construct, convergent, and discriminant validity (Kraus and Sears, 2009). This questionnaire was administered before and after the compassion intervention. A Spanish adaptation performed by the authors was used and showed adequate internal consistency for positive qualities toward self (α = 0.78–0.86), positive qualities toward others (α = 0.70–0.87), and negative qualities toward self (α = 0.83–0.86) across administrations, but not for negative qualities toward others (α = 0.39–0.42).

#### Mindfulness Self-Care (Cook-Cottone and Guyker, 2018)

It is a 33-item scale that measures the self-reported frequency of self-care behaviors in 6 specific domains and on more global practices of self-care. The MSCS asks participants to rate the frequency of self-care behavior in the past 2 weeks. The items are rated on a 5-point Likert scale ranging from 1 (Never or 0 days) to 5 (Regularly or 6 to 7 days). Responses to all items are totaled, with higher scores representing increased frequency of self-care behaviors. The MSCS total and subscales have strong internal consistency reliability. This questionnaire was administered at baseline and at the 2-week follow-up. A Spanish adaptation performed by the authors was used and showed adequate internal consistency for supportive relationships (α = 0.60–0.69), mindful awareness (α = 0.88–0.90), self-compassion and purpose (α = 0.78–0.87), supportive structure (α = 0.59–0.71), and general self-care practices (α = 0.77–0.84) across administrations, but not for physical care (α = 0.45–0.68), mindful relaxation (α = 0.17–0.81), and clinical self-care practices (α = 0.19–0.66).

#### Adherence Question

Adherence to the self-compassion practice was measured at follow-up using a single question. Participants were asked to register the frequency of their meditation practice in the past 2 weeks. Participants rated the frequency on a 5-point Likert scale: 1 = never; 2 = 1–2 times per week; 3 = 3–4 times per week; 4 = 5–6 times per week; 5 = every day.

#### Embodiment in TMTBA Questionnaire

It is an adaptation of the questionnaire originally developed by Longo et al. (2008) to assess the Rubber Hand Illusion experience. In this study, it was used to assess the strength of embodiment elicited by the TMTBA by asking, for example, about the experience of being located in the performer's body. The scale is composed of 10 items rated on a Likert-type scale ranging from 1 (strongly disagree) to 7 (strongly agree). The scale contains 3 subscales: 5 items assess body-ownership, 3 items assess location, and the remaining 2 items assess agency.

#### Procedure

Students were invited to participate in a study aimed at increasing their compassionate skills. Potential participants contacted the researchers via email and received access to the screening assessment and the informed consent form (T1). Participants who fulfilled the inclusion and exclusion criteria were randomly assigned to one of the two study conditions, usual meditation (CAU) or meditation through The Machine to Be Another (TMTBA-VR), using the Random Allocation Software 2.0. Participants came to the laboratory and filled out the pre-assessment (T2). Once they had finished, they performed the meditation. Participants in the CAU condition were seated in a quiet room and listened to a recorded audio with a traditional meditation. Participants in the TMTBA-VR condition followed the steps explained in Section "The TMTBA-VR Condition" below. Both conditions used the same meditation audio. The meditation focused on generating a self-compassionate state by inviting participants to think about themselves as children; at the end, self-compassionate mantras were also used. Once participants had finished the meditation, they completed the post-assessment questionnaires (T3). Participants were instructed to practice the meditation for the following 2 weeks. For this purpose, all participants received an audio track with the meditation performed in the laboratory session. At the end of the 2 weeks, participants completed the follow-up assessment (T4).

### The TMTBA-VR Condition


For clarity, the TMTBA-VR condition in this study was divided into three phases (**Figure 1**). The first phase has the purpose of generating a body swap illusion, allowing the participant to take over the body of another person (the performer) (**Figure 1A**). To do so, an embodied induction is performed. In this phase, the participant and the performer are sitting aligned. The participant is wearing the head-mounted display (VR Oculus Rift), which allows him/her to see the torso, legs, and arms of the performer's body. The performer is wearing a camera controlled by the participant's head movements. A pre-recorded instruction to perform specific movements is played to each participant (e.g., "Put your right hand on your right knee, and then slowly move it up to your lap, as if you were caressing it"). All the movements selected followed two principles: (1) movements that require a combination of visual and haptic senses, in order to increase the embodied illusion; and (2) movements that ensure the synchronization between the participant's and performer's movements. This phase lasts 5 min. The second phase consists of the compassion meditation itself (**Figure 1B**). The participant is still wearing the VR Oculus Rift, but it is turned off. A self-compassion meditation is played to the participant for 15 min. At the end of the meditation, the third phase begins (**Figure 1C**). The participant faces him/herself while listening to self-compassionate messages. To do so, the performer sits facing the participant. After this, the VR Oculus Rift is turned on, allowing the participant to see him/herself from a third-person perspective. Participants are invited to hug themselves. The performer follows the participant's movements like a mirror. This phase lasts from 5 to 7 min.

In order to induce this experience, the TMTBA-VR has the support of a performer, a person who is trained to mimic the user's movements to induce the embodied illusion. Hence, in order to control the user's movements, an audio recording is played where precise and slow movements are requested (e.g., "move your hand from your lap to your knee"). The TMTBA-VR is connected to the head-mounted display (Oculus Rift), and the performer's first-person perspective is captured by a camera controlled by the user's head movements, revealing the torso, legs, and arms of the performer's body (**Figure 2**). Through the Oculus, the user sees the image captured by the camera, creating the illusion of being another person (**Figure 3**).

### Data Analyses

All statistical analyses were performed using SPSS v.24. First, descriptive statistics, independent-samples t-tests, and chi-square tests were conducted to test whether there were significant differences between conditions on sociodemographic, psychological, and practice-related meditation variables, as well as on baseline measures (PHQ-9 and GAD-7). Second, mixed 2 × 2 ANOVAs, with condition (TMTBA-VR and CAU) as between-subjects factor and time (T2 and T3) as within-subjects factor, were performed to analyze the effects of the condition on SOFI and SMS before and after the meditation practice. Third, one-sample t-tests were conducted to explore the effect of the TMTBA-VR on the embodiment scores (location, ownership, and agency). Fourth, an independent-samples t-test was performed to analyze the adherence to the meditation practice. In addition, mixed 2 × 2 ANOVAs were conducted, with condition (TMTBA-VR and CAU) as between-subjects factor and time (T1 and T4) as within-subjects factor, in order to analyze the effects of the condition on adherence, MSCS, and PANAS at baseline and after 2 weeks of meditation practice. Fifth, moderation analyses were carried out to test whether the imagery ability subscales (QMI) moderated the effect of the condition on adherence to meditation practice after 2 weeks. These analyses were performed using the procedure described by Hayes (2018) with the PROCESS macro (version 3.2), choosing model 1. In these analyses, the TMTBA-VR condition was coded as "1," and the CAU condition as "2." All the regression coefficients were reported in unstandardized form as b-values. Tests of significance (p < 0.05) or a confidence interval (not including zero) for the interaction between condition and the QMI subscales were used to determine whether the QMI moderated the effect of condition on adherence.
The conditional effects of condition on adherence at low (−1 SD), medium (the mean), and high (+1 SD) levels of the QMI subscales were examined using the "pick-a-point" approach (or analysis of simple slopes).
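To make the logic of the moderation analysis concrete, the following minimal sketch fits the regression that underlies a simple moderation model (outcome regressed on condition, moderator, and their product) and then evaluates the conditional effect of condition at the mean of the moderator and at one standard deviation below and above it. It is an illustration written with NumPy, not the PROCESS macro itself. The function names `fit_moderation` and `pick_a_point` and the noise-free example data are hypothetical, and inferential statistics (standard errors, confidence intervals) are omitted.

```python
import numpy as np

def fit_moderation(condition, moderator, outcome):
    """OLS fit of outcome = b0 + b1*condition + b2*moderator + b3*condition*moderator."""
    x = np.asarray(condition, dtype=float)
    w = np.asarray(moderator, dtype=float)
    y = np.asarray(outcome, dtype=float)
    design = np.column_stack([np.ones_like(x), x, w, x * w])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs  # b0, b1, b2, b3 (unstandardized)

def pick_a_point(coefs, moderator):
    """Conditional effect of condition, b1 + b3*W, at W = mean - 1 SD, mean, mean + 1 SD."""
    w = np.asarray(moderator, dtype=float)
    mean, sd = w.mean(), w.std(ddof=1)
    _, b1, _, b3 = coefs
    return {
        "low (-1 SD)": b1 + b3 * (mean - sd),
        "medium (mean)": b1 + b3 * mean,
        "high (+1 SD)": b1 + b3 * (mean + sd),
    }

# Hypothetical, noise-free data (NOT the study data). Condition is coded 1 or 2,
# and the outcome is generated from y = 1 - 2*X + 0.5*W + 0.3*X*W.
condition = np.array([1, 1, 1, 1, 2, 2, 2, 2])
moderator = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
adherence = 1 - 2 * condition + 0.5 * moderator + 0.3 * condition * moderator

coefs = fit_moderation(condition, moderator, adherence)
effects = pick_a_point(coefs, moderator)
for level, effect in effects.items():
    print(level, round(effect, 2))
```

With condition coded 1 and 2, a negative conditional effect indicates lower outcome scores in the condition coded 2. In this made-up example the effect is most negative at the low level of the moderator because the interaction coefficient is positive.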

### RESULTS


### Differences in Sociodemographic, Psychological, and Practice-Related Meditation Variables and Baseline Measures (PHQ-9 and GAD-7) Between Conditions

Descriptive statistics for sociodemographic, psychological, and practice-related meditation variables, as well as the PHQ-9 and GAD-7 questionnaires for each condition, are shown in **Table 1**. There were no significant differences between conditions for age, t(14) = −0.89, p = 0.387; gender, χ<sup>2</sup>(1, N = 16) = 1.33, p = 0.248; history of mental or chronic illness, χ<sup>2</sup>(1, N = 16) = 0.00, p = 1.00; experience with meditation, χ<sup>2</sup>(1, N = 16) = 1.07, p = 0.302; frequency of meditation, χ<sup>2</sup>(1, N = 16) = 1.07, p = 0.302; PHQ-9, t(14) = −0.82, p = 0.427; or GAD-7, t(14) = −0.18, p = 0.859.

### Effects of the TMTBA-VR (vs. CAU) Condition on SOFI and SMS Before (T2) and After (T3) the Compassion Practice

Descriptive statistics for SOFI and SMS at T2 and T3 are shown in **Table 2**. For the SOFI subscales, there were main effects of time for positive qualities toward self, F(1,14) = 21.30, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.60; positive qualities toward others, F(1,14) = 9.41, p = 0.008, η<sub>p</sub><sup>2</sup> = 0.40; and negative qualities toward self, F(1,14) = 5.40, p = 0.036, η<sub>p</sub><sup>2</sup> = 0.28, with higher scores at T3 than at T2 in the case of positive qualities, and lower scores at T3 than at T2 in the case of negative qualities. However, the main effect of time for negative qualities toward others was not statistically significant, F(1,14) = 1.68, p = 0.216, η<sub>p</sub><sup>2</sup> = 0.11. No significant interaction effects were found for any of the SOFI subscales: positive qualities toward self, F(1,14) = 0.66, p = 0.429, η<sub>p</sub><sup>2</sup> = 0.05; positive qualities toward others, F(1,14) = 0.03, p = 0.859, η<sub>p</sub><sup>2</sup> = 0.00; negative qualities toward self, F(1,14) = 1.57, p = 0.231, η<sub>p</sub><sup>2</sup> = 0.10; or negative qualities toward others, F(1,14) = 0.75, p = 0.402, η<sub>p</sub><sup>2</sup> = 0.05.

TABLE 1 | Descriptive statistics of sociodemographic and practice-related meditation variables, and baseline measures (PHQ-9 and GAD-7) in TMTBA-VR and CAU condition.

PHQ-9, patient health questionnaire-9; GAD-7, the generalized anxiety disorder questionnaire.

Regarding the SMS subscales, there were main effects of time on mental events, F(1,14) = 25.66, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.65, and bodily sensations, F(1,14) = 14.44, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.51. However, there were no interaction effects between condition and time on mental events, F(1,14) = 3.44, p = 0.085, η<sub>p</sub><sup>2</sup> = 0.20, or bodily sensations, F(1,14) = 0.16, p = 0.692, η<sub>p</sub><sup>2</sup> = 0.01.
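The partial eta squared values reported with each F test can be recovered from the F statistic and its degrees of freedom, since partial eta squared equals F × df<sub>effect</sub> / (F × df<sub>effect</sub> + df<sub>error</sub>). The short sketch below applies this identity to the main effects of time reported above. The function name `partial_eta_squared` is hypothetical; the F values and degrees of freedom are those given in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F values reported above for the main effects of time, all with df = (1, 14).
reported_f = {
    "SOFI positive qualities toward self": 21.30,
    "SOFI positive qualities toward others": 9.41,
    "SOFI negative qualities toward self": 5.40,
    "SMS mental events": 25.66,
    "SMS bodily sensations": 14.44,
}
for label, f_value in reported_f.items():
    print(label, round(partial_eta_squared(f_value, 1, 14), 2))
```

With only 14 error degrees of freedom, even moderate F values translate into large effect size estimates, which is one reason to read these values together with the small sample size.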

### Effect of the TMTBA-VR on Embodiment Scores

Embodiment scores for location, M = 5.42, SD = 1.86, t(7) = 6.72, p = 0.001; ownership, M = 4.53, SD = 2.20, t(7) = 4.53, p = 0.003; and agency of the performer's body, M = 4.69, SD = 1.94, t(7) = 5.36, p = 0.001, were significantly greater than 1, the score that corresponds to no experience of location, ownership, or agency of the performer's body.
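As a check on how these tests work, a one-sample t statistic can be computed directly from the summary statistics as the difference between the sample mean and the reference value divided by the standard error of the mean. The sketch below does this for the location subscale. The function name `one_sample_t` is hypothetical, and the group size of n = 8 is an assumption inferred from the reported 7 degrees of freedom.

```python
import math

def one_sample_t(mean, sd, n, reference):
    """t statistic and degrees of freedom for a one-sample t-test from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - reference) / standard_error, n - 1

# Location subscale as reported above: M = 5.42, SD = 1.86, tested against 1.
# n = 8 is an ASSUMPTION inferred from the reported df of 7.
t_location, df_location = one_sample_t(5.42, 1.86, 8, 1)
print(round(t_location, 2), df_location)
```

Under that assumption, the result agrees with the reported t(7) = 6.72 for location.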

### Effects of the TMTBA-VR (vs. CAU) Condition on Adherence, MSCS, and PANAS at Baseline (T1) and After Two Weeks of Meditation Practice (T4)

Descriptive statistics for adherence to meditation practice, MSCS, and PANAS are shown in **Table 2**. Regarding the effects of the condition on adherence, there were no significant differences in the days that participants had practiced meditation in the past 2 weeks, t(10.76) = 1.43, p = 0.182, d = 0.67, 95% CI (−0.33, 1.68).

Regarding the MSCS subscales, main effects of time were found for Self-compassion and Purpose, F(1,14) = 7.10, p = 0.018, η<sub>p</sub><sup>2</sup> = 0.34, Mindful Relaxation, F(1,14) = 6.63, p = 0.022, η<sub>p</sub><sup>2</sup> = 0.32, Clinical practices, F(1,14) = 39.51, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.74, and General practices, F(1,14) = 9.25, p = 0.009, η<sub>p</sub><sup>2</sup> = 0.40, with higher scores at T4 than at T1. However, there were no main effects of time for Physical Care, F(1,14) = 2.65, p = 0.126, η<sub>p</sub><sup>2</sup> = 0.16, Supportive Relationships, F(1,14) = 0.85, p = 0.373, η<sub>p</sub><sup>2</sup> = 0.06, Mindful Awareness, F(1,14) = 3.42, p = 0.086, η<sub>p</sub><sup>2</sup> = 0.20, or Supportive structure, F(1,14) = 3.04, p = 0.103, η<sub>p</sub><sup>2</sup> = 0.18. Significant interaction effects between condition and time were only found for Clinical practices, F(1,14) = 5.09, p = 0.041, η<sub>p</sub><sup>2</sup> = 0.27. Post hoc analysis using Bonferroni correction showed that both conditions increased from T1 to T4 (TMTBA-VR: p < 0.001; CAU: p = 0.013), and a trend toward statistically significant differences was found between conditions at T4 (p = 0.099), with higher scores in the TMTBA-VR than in the CAU condition. By contrast, no interaction effects were found for the rest of the subscales: Self-compassion and Purpose, F(1,14) = 1.34, p = 0.266, η<sub>p</sub><sup>2</sup> = 0.09, Mindful Relaxation, F(1,14) = 0.09, p = 0.771, η<sub>p</sub><sup>2</sup> = 0.01, General, F(1,14) = 1.28, p = 0.278, η<sub>p</sub><sup>2</sup> = 0.08, Physical Care, F(1,14) = 0.15, p = 0.704, η<sub>p</sub><sup>2</sup> = 0.01, Supportive Relationships, F(1,14) = 0.30, p = 0.590, η<sub>p</sub><sup>2</sup> = 0.02, Mindful Awareness, F(1,14) = 0.07, p = 0.796, η<sub>p</sub><sup>2</sup> = 0.01, and Supportive structure, F(1,14) = 0.03, p = 0.876, η<sub>p</sub><sup>2</sup> = 0.00.

TABLE 2 | Descriptive statistics of SOFI and SMS before (T2) and after (T3) the compassion practice, and adherence, MSCS, and PANAS at baseline (T1) and after 2 weeks of meditation practice (T4).

SOFI, self-other four immeasurable scale; SMS, mindfulness state scale; MSCS, mindfulness self-care; PANAS, positive and negative affect schedule.

Regarding PANAS, there were no main effects of time on positive PANAS, F(1,14) = 0.86, p = 0.370, η<sub>p</sub><sup>2</sup> = 0.02, or negative PANAS, F(1,14) = 0.01, p = 0.925, η<sub>p</sub><sup>2</sup> = 0.00. Moreover, there were no significant interaction effects between condition and time on positive PANAS, F(1,14) = 0.28, p = 0.605, η<sub>p</sub><sup>2</sup> = 0.02, or negative PANAS, F(1,14) = 0.01, p = 0.925, η<sub>p</sub><sup>2</sup> = 0.00.

### Imagery Ability (QMI) as a Moderator of the Effect of Condition on Adherence to Meditation Practice After Two Weeks

Moderation analyses showed that cutaneous imagery (QMI) moderated the effect of condition on adherence, F(1,12) = 5.95, p = 0.031, and that visual imagery showed a trend toward statistical significance, F(1,12) = 3.65, p = 0.080 (see **Figure 4**). However, the following imagery ability factors did not moderate this relationship: gustatory and olfactory, F(1,12) = 0.56, p = 0.468; kinesthetic, F(1,12) = 0.03, p = 0.866; organic, F(1,12) = 0.50, p = 0.495; and auditory, F(1,12) = 0.06, p = 0.808.

Regarding the first significant moderation model, with cutaneous imagery ability as moderator, the overall model explained 54.95% of the variance in adherence, F(3,12) = 6.11, p = 0.009. The interaction between condition and cutaneous imagery was significant, F(1,12) = 5.95, p = 0.031, which means that cutaneous imagery (QMI) was a moderator of the effect of the condition on adherence, accounting for 16.86% of the variance. Simple slopes analysis showed that there was a significant negative relationship between condition and adherence when cutaneous imagery was "medium," b = −1.97, 95% CI (−3.16, −0.77), t = −3.57, p = 0.004, and "low," b = −3.29, 95% CI (−5.28, −1.30), t = −3.61, p = 0.004. Participants in the TMTBA-VR (vs. CAU) condition had greater adherence to the meditation practice when cutaneous imagery was "medium" (TMTBA-VR = 4.47 ≈ 5–6 times; CAU = 2.51 ≈ 3–4 times) and "low" (TMTBA-VR = 4.09 ≈ 5–6 times; CAU = 0.80 ≈ never).

With regard to the second relevant moderation model, with visual imagery ability as moderator, the overall model explained 51.66% of the variance in adherence, F(3,12) = 7.50, p = 0.004. The interaction between condition and visual imagery showed a trend toward statistical significance, F(1,12) = 3.65, p = 0.080, which means that visual imagery (QMI) could be a moderator of the effect of the condition on adherence, accounting for 13.98% of the variance. Simple slopes analysis showed that there was a significant negative relationship between condition and adherence when visual imagery was "medium," b = −1.27, 95% CI (−2.18, −0.36), t = −3.05, p = 0.010, and "low," b = −2.36, 95% CI (−3.77, −0.94), t = −3.63, p = 0.004. Participants in the TMTBA-VR (vs. CAU) condition had greater adherence to meditation practice when visual imagery was "medium" (TMTBA-VR = 4.36 ≈ 5–6 times; CAU = 3.08 ≈ 3–4 times) and "low" (TMTBA-VR = 3.99 ≈ 5–6 times; CAU = 1.63 ≈ 1–2 times).

#### DISCUSSION

The general objective of the present study was to analyze the effects of a self-compassion meditation supported by an embodied VR system (TMTBA-VR), compared to usual practice (only audio).

Regarding the results before and after the meditation, an increase in positive qualities toward self/others and a decrease in negative qualities toward self/others were expected, as a measure of the compassion effects. Both conditions increased the positive qualities toward self and others and decreased the negative qualities toward self. There were no differences between conditions across time. Thus, our hypothesis that the VR system could facilitate access to the powerful emotional response of self-compassion was not fully supported. Different factors could explain this lack of evidence and need further study, such as the sample size or the characteristics of the participants (all were university students). Even so, a promising result was found in the difference between groups, as an exploration of the effect sizes showed that the effect of positive qualities toward self was larger in the TMTBA-VR than in the CAU condition. In this sense, we would like to highlight the experience of awe as a potential explanation of this difference. Most participants in the TMTBA-VR condition reported a sense of awe generated by the experience of embodying another's body and touching themselves, and this experience could be interacting with the self-compassion response. As was previously reported in other studies, VR is an effective way to induce awe in controlled experimental settings, due to its ability to provide participants with a sense of "presence," that is, the subjective feeling of being displaced in another physical or imaginary place (Chirico et al., 2018). Unfortunately, we did not use a questionnaire to measure awe in order to control for this experience. Furthermore, there is no previous research about the efficacy of a single meditation supported by VR in generating positive mental states. Thus, further studies are needed to verify the sense of awe triggered by the VR experience and its effects on the practice and on adherence to the practice. This could be controlled in future studies by allowing participants to habituate to the TMTBA before its efficacy is studied.

Both conditions increased awareness and attention to the present for mental events and bodily sensations after the compassion practice, with no differences between conditions across time. A mindfulness state refers to the mental ability to pay attention to physical or mental events that occur in the present moment (Tanay and Bernstein, 2013), which means that the TMTBA-VR experience does not distract from body sensations and the mind, an essential aspect of self-compassion and other meditations (Cebolla et al., 2017, 2018). However, explorations of the effect sizes showed that the effects for both factors were larger in the CAU condition than in the TMTBA-VR condition.

Regarding the second objective of this study, it was hypothesized that TMTBA-VR would increase adherence to meditation practice by supporting the construction of the self-compassionate image. Both conditions practiced on a similar number of days during the 2-week follow-up period, around 4 times per week. The characteristics of the sample could explain these unexpected results, and future research is needed to test these results in participants who score high on fear of compassion, self-criticism, or lack of imagination skills.

With regard to the frequency of self-care behaviors, the results showed that both conditions increased the scores on the frequency of self-care behaviors, such as self-compassion and purpose (e.g., acceptance of failure and challenge as part of the process), mindful relaxation (e.g., doing specific practices that can help individuals to relax), clinical self-care practices (e.g., resting when individuals needed to), and general self-care practices (e.g., engaging in a variety of self-care strategies). Explorations of the effect sizes showed that participants in the TMTBA-VR condition achieved larger effects than in the CAU condition, especially on the clinical self-care subscale. In fact, an interaction effect between condition and time was found for this subscale. Nevertheless, the difference between conditions at 2 weeks did not reach statistical significance, and the results derived from this subscale, as well as from the mindful relaxation subscale, should be viewed with caution because the internal consistencies of these subscales were not adequate. This result is interesting because in the experiment both conditions had similar pre-post scores, which means that the home practice is where participants generate different states that contribute to different mindful self-care responses. Further research should analyze the quality of the practice at home and the ability of this practice to produce positive mental states, as well as the impact on self-care responses.

The secondary objective of this study was to analyze whether the imagery ability was a moderator of the effect of the condition on adherence to meditation practice after 2 weeks. Analyses showed that adherence was significantly different between conditions when the cutaneous and visual imagery abilities were introduced as moderators (although the interaction was only marginally significant in the case of the visual imagery ability). Participants in the TMTBA-VR (vs. CAU) condition had greater adherence to the meditation practice when cutaneous and visual imagery were "medium" and "low" in this sample. Thus, when visual and cutaneous imagery were lower, participants in the TMTBA-VR condition practiced a minimum of 5–6 times per week, whereas participants in the CAU condition practiced a maximum of 3–4 times per week (and even "never" when cutaneous imagery was low). This result is consistent with the main hypothesis proposing that VR could facilitate a powerful self-compassion experience in participants, and that imagery could be used in daily practice, increasing adherence. Furthermore, this effect seems to be more powerful in people who show a lack of imagination skills, which means that VR could be especially relevant in clinical samples where imagery has been shown to play a key role (Pearson et al., 2013). This result is congruent with the importance of mental imagery in the area of clinical psychopathology (Hackmann et al., 2011) and psychological interventions (Gilbert and Irons, 2004). Embodied VR can be especially relevant when constructive meditations are used in CBIs, given the key role of imagery. Even so, it is important to highlight that imagery skills are not the only important factor in compassion practice. Other factors, such as body awareness or body representations, could play a key role in the practice. However, little research has been done in this area, and further studies should be conducted in the future.

Despite the limitations of the present pilot study (including the small sample size and the participants' characteristics, given that they were highly educated students with no mental disorder), it is important to highlight that this study is one of the first to propose the use of VR as a tool to improve compassion practice. Using VR to support compassion training may help to overcome several difficulties that trainees experience in the early stages of their training, especially those related to imagery skills. This study raises several questions that need to be studied to better understand the role of VR in the training and its effects. In this sense, more research is needed, for example, to test the effects of the TMTBA-VR in clinical populations and the role played by imagery skill levels, awe, and other relevant variables such as self-criticism.

### ETHICS STATEMENT

The study was approved by The Ethics Committee at the University of Valencia (Spain), with registration number: H1513592028862.

### AUTHOR CONTRIBUTIONS

AC and RH made substantial contributions to the conceptualization, formal analyses, and drafting of the manuscript. SV and MB-B made substantial contributions to the collection of the data. MM made substantial contributions to the formal analyses and drafting of the manuscript. RB made substantial contributions in revising the manuscript critically for important intellectual content. All authors provided final approval of the version to be published, and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

### FUNDING

This work was supported by CIBEROBN, an initiative of the ISCIII (ISC III CB06 03/0052) and Ministerio de Economía y Competitividad (Spain) under AN-BODYMENT (PSI2017-85063-R).

### REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Cebolla, Herrero, Ventura, Miragall, Bellosta-Batalla, Llorens and Baños. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Inferiority or Even Superiority of Virtual Reality Exposure Therapy in Phobias?—A Systematic Review and Quantitative Meta-Analysis on Randomized Controlled Trials Specifically Comparing the Efficacy of Virtual Reality Exposure to Gold Standard in vivo Exposure in Agoraphobia, Specific Phobia, and Social Phobia

#### Theresa F. Wechsler\*, Franziska Kümpers and Andreas Mühlberger

Department for Clinical Psychology and Psychotherapy, Institute of Psychology, University of Regensburg, Regensburg, Germany

Background: Convincing evidence on Virtual Reality (VR) exposure for phobic anxiety disorders has been reported; however, the benchmark and gold standard for phobia treatment is in vivo exposure. For direct treatment comparisons, the control of confounding variables is essential. Therefore, the comparison of VR and in vivo exposure in studies applying an equivalent amount of exposure in both treatments is necessary.

Methods: We conducted a systematic search of reports published until June 2019. Inclusion criteria covered the diagnosis of Specific Phobia, Social Phobia, or Agoraphobia, and a randomized-controlled design with an equivalent amount of exposure in VR and in vivo. We qualitatively reviewed participants' characteristics, materials, and the treatment procedures of all included studies. For quantitative synthesis, we calculated Hedges' g effect sizes for the treatment effects of VR exposure, in vivo exposure, and the comparison of VR to in vivo exposure in all studies and separately for studies on each diagnosis.

Results: Nine studies (n = 371) were included, four on Specific Phobia, three on Social Phobia, and two on Agoraphobia. VR and in vivo exposure both showed large, significant effect sizes. The comparison of VR to in vivo exposure revealed a small, but non-significant effect size favoring in vivo (g = −0.20). Specifically, effect sizes for Specific Phobia (g = −0.15) and Agoraphobia (g = −0.01) were non-significant; only for Social Phobia did we find a significant effect size favoring in vivo (g = −0.50).

#### Edited by:

Federica Pallavicini, University of Milano-Bicocca, Italy

#### Reviewed by:

Philip Lindner, Stockholm University, Sweden Soledad Quero, University of Jaume I, Spain

#### \*Correspondence:

Theresa F. Wechsler theresa.wechsler@ur.de

#### Specialty section:

This article was submitted to Human-Media Interaction, a section of the journal Frontiers in Psychology

Received: 23 December 2018 Accepted: 15 July 2019 Published: 10 September 2019

#### Citation:

Wechsler TF, Kümpers F and Mühlberger A (2019) Inferiority or Even Superiority of Virtual Reality Exposure Therapy in Phobias?—A Systematic Review and Quantitative Meta-Analysis on Randomized Controlled Trials Specifically Comparing the Efficacy of Virtual Reality Exposure to Gold Standard in vivo Exposure in Agoraphobia, Specific Phobia, and Social Phobia. Front. Psychol. 10:1758. doi: 10.3389/fpsyg.2019.01758


Except for Agoraphobia, effect sizes varied across studies from favoring VR to favoring in vivo exposure.

Conclusions: We found no evidence that VR exposure is significantly less efficacious than in vivo exposure in Specific Phobia and Agoraphobia. The wide range of study specific effect sizes, especially in Social Phobia, indicates a high potential of VR, but also points to the need for a deeper investigation and empirical examination of relevant working mechanisms. In Social Phobia, a combination of VR exposure with cognitive interventions and the realization of virtual social interactions targeting central fears might be advantageous. Considering the advantages of VR exposure, its dissemination should be emphasized. Improvements in technology and procedures might even yield superior effects in the future.

Keywords: anxiety disorder, agoraphobia, social anxiety, specific phobia, exposure therapy, virtual reality, meta-analysis, systematic review

### INTRODUCTION

### Rationale

#### Phobic Anxiety Disorders

Phobic anxiety disorders (ICD-10 F40) are listed as a subcategory of anxiety disorders in the ICD-10 (World Health Organization, 1992). They are characterized by anxiety in circumscribed situations, which currently pose little or no actual danger, and by an avoidance of those situations or an endurance with dread (World Health Organization, 1992). There are three subtypes of phobic anxiety disorders in the ICD-10 (World Health Organization, 1992): Agoraphobia (F40.0), Social Phobia (F40.1), and Specific Phobia (F40.2). Patients with Specific Phobia fear specific situations or objects such as animals, heights, thunder, darkness or closed spaces. Social Phobia patients report fear of scrutiny by other people, which leads to an avoidance of social situations. Agoraphobia is characterized by a fear of situations in which escape or help is not easily accessible, such as crowds in public spaces, leaving home, entering shops, or traveling alone in a train, bus or plane. It can be coded with (F40.01) or without (F40.00) Panic Disorder. Other anxiety disorders (ICD-10 F41) include Panic Disorder (F41.0) and Generalized Anxiety Disorder (GAD) (F41.1) (World Health Organization, 1992). Both of these disorders are related to internal stimuli, such as bodily sensations in Panic Disorder and worries in GAD.

Twelve-month prevalence rates for phobic anxiety disorders have been reported to range from 0.3 to 1.6% for Agoraphobia, 1.2 to 6.8% for Social Phobia, and 3.5 to 8.7% for Specific Phobia (Bijl et al., 1998; Alonso et al., 2004; Kessler et al., 2005; Stein et al., 2017; Wardenaar et al., 2017; Stagnaro et al., 2018). Lifetime prevalence rates have been reported to range from 0.9 to 3.4% for Agoraphobia, 2.4 to 7.8% for Social Phobia, and 7.7 to 10.1% for Specific Phobia (Bijl et al., 1998; Alonso et al., 2004; Kessler et al., 2005; Stein et al., 2017; Wardenaar et al., 2017). Prevalence rates for the subtypes of Specific Phobia have been reported to range from 3.3 to 5.7% for animal phobia, 4.9 to 11.6% for natural environment phobia (with 3.1 to 5.9% for height phobia), 5.2 to 8.4% for situational phobia (with 2.5 to 2.9% for flying phobia), and 3.2 to 4.5% for blood, injury and injection phobia (LeBeau et al., 2010).

Evidence from prospective studies indicates that anxiety disorders must be seen as chronic disorders that start in childhood, adolescence or early adulthood, peak in middle age, and decrease in older age (Bandelow and Michaelis, 2015). According to the Global Burden of Disease Study 2015, anxiety disorders are ranked as the ninth largest contributor to global disability, leading to a global total of 24.6 million years lived with disability (YLD) in 2015 (Vos et al., 2016). For Specific Phobia, 18.7% of people with a 12-month diagnosis reported severe role impairment in at least one of four domains (home, work, relationships and social life), and a mean of 12.2 days out of role in the past year was attributed to the disorder (Wardenaar et al., 2017). For Social Phobia, 37.6% of people with a 12-month diagnosis reported severe role impairment in at least one domain, and a mean of 24.7 days out of role per year was reported (Stein et al., 2017). For Panic Disorder with Agoraphobia, 84.7% of people with a 12-month diagnosis described severe role impairment, and for Agoraphobia without a history of Panic Disorder, but including panic attacks, 39.0% reported severe impairment (Kessler et al., 2006).

#### Exposure Therapy

The first-line treatment for anxiety disorders consists of exposure therapy (Chambless et al., 1998; Wolitzky-Taylor et al., 2008; Bandelow et al., 2014; Barlow et al., 2015; Steinman et al., 2015). During exposure therapy, patients repeatedly confront a feared external or internal stimulus over a long period of time until distress has decreased significantly. They are also advised not to use cognitive or behavioral avoidance strategies. During exposure therapy in phobic anxiety disorders, patients particularly confront external stimuli such as heights in fear of heights, crowds in Agoraphobia, or giving a speech in front of an audience in Social Phobia. This can be conducted in their imagination (exposure in sensu) or in real life (exposure in vivo). Exposure therapy in other anxiety disorders differs slightly from the procedure in phobic anxiety disorders. In Panic Disorder, interoceptive exposure to internal stimuli in the form of bodily sensations like heartbeat or dizziness is mainly applied (see e.g., Forsyth et al., 2008; Gerlach and Neudeck, 2012). In Agoraphobia, interoceptive exposure is used in addition to in vivo exposure to external stimuli. In GAD treatment, patients confront internal or external aspects of their anxiety (Overholser and Nasser, 2000; Hoyer and Beesdo-Baum, 2012). Through imaginal exposure, GAD patients are exposed to central worries (e.g., concerning physical injury or impaired health), and through in vivo exposure, patients expose themselves to daily-life situations that elicit worries while not using safety behaviors such as telephone calls. In PTSD, a stress-related disorder, imaginal exposure to traumatic memories is performed and can be combined with in vivo exposure to daily-life actions (e.g., a patient traumatized by a car accident drives a car) (Riggs et al., 2007; Friedman, 2015). Besides anxiety disorders and PTSD, exposure therapy is also conducted, with a modified procedure in each case, in other disorders like obsessive-compulsive disorder (Lewin et al., 2014), eating disorders (see e.g., Griffen et al., 2018; Waller and Raykos, 2019), or substance addiction (Marlatt, 1990; Drummond et al., 1995).

For phobic anxiety disorders, the focus of this systematic review and meta-analysis, there is robust empirical evidence for the efficacy of exposure therapy, even as the sole treatment method. According to numerous studies, in vivo exposure shows high effect sizes in the treatment of Agoraphobia (Ruhmland and Margraf, 2001), Social Phobia (Mayo-Wilson et al., 2014) and Specific Phobia (Wolitzky-Taylor et al., 2008). The most widely accepted mechanisms underlying exposure treatment are habituation, extinction, correction of negative beliefs, and emotional processing (Foa and Kozak, 1986; Clark, 1999). Beyond that, inhibitory learning was recognized to be central to extinction learning (Craske et al., 2008). The authors propose that fear toleration, the development of competing non-threatening associations, and the enhancement of the accessibility and retrievability of those associations across different contexts and times are more important for corrective learning than fear levels and fear reduction during exposure (Craske et al., 2008). Exposure is often performed in combination with further cognitive behavioral therapy (CBT) interventions such as psychoeducation, cognitive interventions, or relapse prevention strategies. While for Specific Phobia such additional interventions are minimal in many approaches, e.g., in the very effective One-Session Treatments (Davis et al., 2012), exposure in Social Phobia is typically integrated into further cognitive behavioral interventions and is particularly framed as experimental tasks, focusing on the verification and correction of dysfunctional beliefs in social situations (Clark, 2001).

Despite its convincing theoretical and empirical foundation, there seem to be barriers to the dissemination of exposure therapy in routine care. Neudeck and Einsle (2012) mention structural barriers (e.g., time, insurance, or logistics) and therapist-related barriers (e.g., negative attitudes toward exposure therapy or insufficient familiarity with the method). Both impede the (accurate) application of exposure techniques in clinical practice and thereby prevent patients from receiving a highly efficacious treatment.

#### Virtual Reality Exposure Therapy

The use of Virtual Reality (VR) technology represents an option with the potential to overcome such barriers. VR exposure therapy (VRET), also called exposure therapy in virtuo, is based on a rationale very similar to that of in vivo exposure therapy. In VR exposure, however, phobic stimuli are presented to the patient in VR. VR is a computer-generated presentation that provides input to the user's sensory system and interacts with the user (also see Diemer et al., 2015). Visual VR stimuli are presented via VR glasses (HMD: head mounted display) or via projection-based systems like a CAVE (cave automatic virtual environment), which is a room with up to six projection sides. Auditory input is applied via loudspeakers or earphones, and tactile, haptic or olfactory stimulation is possible but rarely provided. The aim is to replace sensory input from the real world and to create a sense of presence of the user in the virtual world. To interact with the user in real time, the VR system collects information about the user's position and (head) movements via sensors and input devices like a head tracking system or a joystick.

By bringing virtual phobic stimuli into the therapist's office, VR exposure has many structural advantages. It is less time consuming in its application (e.g., because driving to a high tower in heights phobia treatment is no longer necessary), cost-effective (e.g., in comparison to cost-intensive in vivo treatments for fear of flying), and requires less organization (e.g., regarding the acquisition of living spiders in spider phobia treatment, or of an audience for Social Phobia treatment). Furthermore, there are fewer difficulties concerning safety and insurance arrangements.

Beyond that, the VR technique might increase the use of exposure treatment through higher acceptance by patients, and thereby facilitate the delivery of an efficacious psychotherapeutic procedure. For in vivo exposure in Specific Phobia, high treatment responses but low treatment acceptance and high dropout rates have been reported in the past (Choy et al., 2007). In a direct comparison, García-Palacios et al. (2007) found evidence that patients with Specific Phobia prefer VR exposure to in vivo exposure and are significantly more willing to participate in VR treatment, mostly because they are too afraid of confronting the real feared stimuli. Quero et al. (2014) examined the opinions of patients with Panic Disorder and Agoraphobia toward VR and traditional interoceptive exposure before, directly after, and 3 months after treatment. Both treatments were well-accepted at all three time points, although VR exposure was considered slightly, but not significantly, less aversive. Before treatment, the VR exposure rationale was rated as significantly more logical and more useful for other problems. Interestingly, higher expectations before treatment predicted a better clinical improvement at the post-test and follow-up. After 3 months, participants in the traditional interoceptive exposure group reported a significantly higher satisfaction. Nevertheless, clinical improvements did not show significant differences between the two conditions at the post-test and follow-up. Concerning dropout from an ongoing exposure treatment, a meta-analysis of randomized-controlled trials (RCTs) conducted by Benbow and Anderson (2018) found no significant difference in the likelihood of discontinuation between VR and in vivo exposure, although the attrition rate for VR exposure was found to be slightly below estimates reported for in vivo exposure and CBT for anxiety disorders. Thus, offering VR exposure might, in particular, lead to a higher number of patients agreeing to exposure therapy. During and after exposure therapy, on the other hand, an application in VR might not have relevant advantages with regard to dropout rates (at least if patients were randomly assigned to either VR or in vivo therapy) or with regard to the patients' opinion toward the treatment.

Besides the patients' acceptance, VR provides the advantage that phobic objects and situations can be easily adapted according to therapeutic considerations. For example, the therapist can entirely control the type, intensity, duration and repetition of the exposed object or situation, and can implement specific stimuli (e.g., turbulence in the exposure of flight phobia) (Diemer et al., 2015). Furthermore, contextual shifts are less time consuming and costly in VR in comparison to in vivo exposure (Botella et al., 2017), which might be relevant because using multiple contexts in spider phobia has already been shown to improve the generalizability of exposure therapy (Shiban et al., 2015). Altogether, VR provides a high level of control and flexibility with the possibility to even surpass reality (Botella et al., 2017). One example is the use of virtual spiders, which can be constructed to be much bigger than living spiders (see e.g., Shiban et al., 2015). These possibilities might even enhance the efficacy of VR in comparison to in vivo exposure therapy, although empirical evidence from studies explicitly exhausting the technical possibilities of VR exposure is still rare.

Finally, the German Practice Guideline for anxiety disorders already recommends VR exposure on the basis of expert consensus for Specific Phobia if in vivo exposure is not available or possible (Bandelow et al., 2014). Moreover, the guideline preliminarily lists VR therapy as an effective treatment option for Agoraphobia/Panic Disorder.

#### Efficacy of Virtual Reality Exposure Therapy for Phobic Anxiety Disorders

To empirically examine the efficacy of VR exposure therapy in anxiety disorders, numerous original studies and meta-analyses have been published throughout the last decade. While some of the meta-analyses take a broad perspective on the use of VR in cognitive behavioral therapy, including VR-based interventions other than exposure (Fodor et al., 2018), or show effect sizes for symptom improvements through VR exposure while including primary studies without a control group (Parson and Rizzo, 2008), most meta-analyses focus on comparisons of VR exposure to inactive and active control conditions. According to the Cochrane Handbook for Systematic Reviews of Interventions, inactive control groups consist of, for example, a placebo, no treatment, standard care, or a waitlist control, while active control groups consist of a different kind of therapy (Higgins and Green, 2011). Results from previous meta-analyses on the efficacy of VR exposure therapy for anxiety disorders showed that VR exposure therapy yields large effects with regard to the reduction of anxiety symptoms (Parson and Rizzo, 2008) and greatly outperforms inactive control conditions (Powers and Emmelkamp, 2008; McCann et al., 2014; Fodor et al., 2018; Carl et al., 2019). Compared to active treatment conditions, results were more mixed. Two meta-analyses showed no significant difference in the efficacy of VR exposure for anxiety disorders in comparison to classical evidence-based treatments like CBT, imaginal exposure and in vivo exposure (Opriş et al., 2012), or when specifically compared to in vivo exposure therapy (Carl et al., 2019). In contrast, Fodor et al. (2018) found that non-VR interventions like CBT, imaginal exposure, and in vivo exposure were slightly more effective than VR exposure in anxiety disorders. Powers and Emmelkamp (2008) instead reported a small effect size favoring VR exposure over in vivo exposure for anxiety disorders. As further results, Opriş et al. (2012) showed that gains from VR exposure therapy could be transferred to real life situations, and that VR exposure showed a good stability of its outcomes over time, similar to that of classical evidence-based treatments. Moreover, there is evidence that deterioration rates of VR therapy for anxiety disorders did not differ significantly from those of other therapeutic approaches and were lower than in waitlist control groups (Fernández-Álvarez et al., 2018).

In addition to meta-analyses addressing different anxiety disorders, some meta-analyses focused on the efficacy of VR exposure in specific kinds of phobias. Morina et al. (2015) conducted a meta-analysis on fear of heights and fear of spiders in particular. The examination of behavior changes in real life situations and stability over time showed that VR exposure performed significantly better than waitlist as an inactive control condition, and that there were no significant differences between VR exposure therapy and in vivo exposure therapy as an active control condition. Cardoş et al. (2017) conducted a meta-analysis on flight phobia in particular. They reported an advantage of VR exposure therapy when compared to classical evidence-based interventions at the post-test and follow-up, and when compared particularly to imaginal or in vivo exposure at follow-up but not at post-test. In a meta-analysis on Social Phobia in particular, Chesham et al. (2018) showed no relevant difference between VR and imaginal or in vivo exposure.

Notably, the reported meta-analyses included different primary studies. Regarding the meta-analyses addressing different anxiety disorders, Parson and Rizzo (2008) examined the effects of VR exposure therapy for phobias and PTSD in studies without a control group, with waitlist, bibliotherapy, relaxation, or attention as inactive control groups, or with in vivo exposure as an active control group. The meta-analysis by Fodor et al. (2018) provides a broader perspective on the use of VR in cognitive behavioral therapy and in this regard examined RCTs on VR-enhanced exposure and also on VR-enhanced CBT interventions without exposure. However, in two particular subgroup analyses, VR-enhanced exposure only was compared to inactive control conditions including waitlist, placebo, treatment-as-usual, and relaxation, and to active control conditions including CBT, imaginal exposure, and in vivo exposure. Carl et al. (2019) synthesized trials on anxiety disorders and PTSD with random or matched allocation and compared VR exposure conditions, in which VR was not combined with another intervention, medication, or placebo, to mixed control conditions like waitlist, information, attention control, treatment as usual, relaxation, or present-centered therapy, and to in vivo exposure as an active control condition. McCann et al. (2014) synthesized RCTs on different anxiety disorders and compared VR exposure to waitlist or active placebo as inactive control groups, and to active control groups, which in this study consisted of interventions like treatment as usual, cognitive therapy, present-centered therapy, computer-aided exposure, CBT, imaginal exposure, or in vivo exposure. Powers and Emmelkamp (2008) examined RCTs on anxiety disorders and PTSD and compared VR conditions that do not combine VR with other interventions or medication to inactive control groups like waitlist, attention control, bibliotherapy, or relaxation, and to in vivo exposure as an active control group. Opriş et al. (2012) examined RCTs on anxiety disorders in a clinical population, comparing VR conditions to waitlist as an inactive control group, and to classical evidence-based treatments like CBT, imaginal exposure and in vivo exposure as active control groups.

Regarding the reported meta-analyses addressing particular phobias, Morina et al. (2015) synthesized studies on Specific Phobia and compared the efficacy on behavioral outcome measures in VR-based exposure interventions with inactive control conditions like waitlist or attention placebo, and with active control conditions like CBT, imaginal exposure or in vivo exposure. Cardoş et al. (2017) included RCTs on flight phobia and compared VR exposure treatments, with or without other interventions, to waitlist or attention control as inactive control groups, to classical evidence-based interventions like CBT, bibliotherapy, cognitive therapy, relaxation, CBT with standard exposure (in vivo), relaxation techniques with imaginal exposure, and computer-aided exposure as active controls, and particularly to exposure-based interventions including imaginal and in vivo exposure as active control groups. Chesham et al. (2018) included studies on Social Phobia with random, quasi-random or matched assignment, and compared VR exposure conditions to waitlist as an inactive control group, and to in vivo or imaginal exposure as active control conditions.

Overall, only two studies (Powers and Emmelkamp, 2008; Carl et al., 2019) conducted a quantitative meta-analysis on the efficacy of VR exposure therapy in comparison to in vivo exposure therapy as the gold standard treatment for phobic anxiety disorders. No previous meta-analysis considered the amount of exposure applied in VR and in vivo conditions. This reduces the internal validity of previous results, because differences in effect sizes between VR and in vivo exposure therapy cannot be clearly attributed to the application mode of exposure treatment but could be due to differences in the load of exposure treatment.

#### Objectives

As the first systematic review and meta-analysis, we aim at comparing the efficacy of VR and in vivo exposure therapy for phobic anxiety disorders, based on randomized controlled trials including an equivalent amount of exposure in VR and in vivo. We chose to focus on phobias, as they are a highly comparable group of anxiety disorders with a similar procedure of exposure treatment. In these disorders, in vivo exposure as an individual treatment component is considered the gold standard. Furthermore, there are concrete external phobic stimuli that are usually presented during in vivo exposure, and these stimuli can be directly transferred to VR.

In our quantitative meta-analysis, we evaluate pre to post effect sizes for VR exposure therapy, in vivo exposure therapy, and for the comparison of VR and in vivo exposure therapy. Furthermore, we report the individual effect sizes of all included studies and provide a systematic review of the participants' characteristics, the materials, and the treatment procedures used. On this basis, we aim at discussing potential mechanisms of more or less efficacious VR exposure therapy.

Unlike the recent meta-analysis by Carl et al. (2019), which provides a broad overview of the topic, our focus is the direct comparison of an equivalent amount of exposure in VR and in vivo. In this regard, we apply stricter inclusion criteria than Carl et al. (2019) to control for potential confounding variables. We exclude not only studies with a different amount of exposure in the VR and in vivo condition, but also studies with imaginal exposure but no in vivo exposure as the control group, with exposure treatment applied only to selected participants, with VR presentation without using immersive systems (e.g., HMD) and head tracking, and with dependent samples. Since Carl et al. (2019) do not provide a qualitative review of their included studies, we furthermore fill this gap. We therefore offer detailed descriptions and assessments of the individual studies' characteristics and of differences in their individual effect sizes. We summarize the patients' characteristics as well as the treatment materials and procedures, including information on the exposure strategy, the type of HMDs and their technical features, the virtual and in vivo environments, and additional interventions that were applied alongside the exposure in the VR and in vivo exposure condition. As VR exposure is a quickly expanding field, high-quality meta-analyses and fine-grained research are needed to contribute to theory building, the development of future research questions, and the improvement of VR exposure procedures.

### Research Question

We examine whether there is a relevant difference in the efficacy of VR exposure therapy in comparison to in vivo exposure therapy as the gold standard treatment for phobic anxiety disorders, when synthesizing RCTs with an equivalent amount of exposure in the VR and in vivo condition. Furthermore, we aim at a qualitative examination of the participants' characteristics, materials, and treatment procedures of all the included studies.

### METHODS

### Protocol

We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist and protocol provided by the PRISMA Group (Moher et al., 2009).

### Eligibility Criteria

Only original studies published until June 2019 were included. The language inclusion criterion was (1) a report written in English or German. The population inclusion criterion was (2) an ICD or DSM diagnosis for Agoraphobia, Specific Phobia, or Social Phobia. The intervention inclusion criteria were (3) a treatment for Agoraphobia, Specific Phobia, or Social Phobia, (4) exposure therapy in virtual reality using immersive systems (e.g., HMD) and head tracking (no augmented reality or 3D computer animation in front of a PC screen) in the experimental group, (5) exposure therapy in vivo in the control group, and (6) no combination of the VR or in vivo exposure therapy with a specific psychopharmacological treatment. The outcomes inclusion criterion was that (7) studies examined the reduction of phobic anxiety as the primary outcome. To ensure high internal and external validity, the inclusion criteria for study design were (8) a minimum number of 10 participants per group, (9) a randomized assignment of the participants to one of the two exposure conditions, (10) an equivalent amount of exposure in both conditions and equivalency concerning additionally applied interventions alongside exposure treatment, (11) a pre and post measurement of phobic anxiety (12) with a symptom specific, standardized questionnaire or interview, and (13) sufficient statistical values (means and standard deviations in outcome parameters for each group).
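The study-design criteria (8) to (13) above lend themselves to a simple checklist. The following is a minimal illustrative sketch, not the screening procedure actually used: the class `StudyDesign`, its field names, and the function `meets_design_criteria` are hypothetical, and criteria (1) to (7) are deliberately left out because they require reading the report rather than checking a number.

```python
from dataclasses import dataclass


@dataclass
class StudyDesign:
    """Hypothetical record of the design features of one screened report."""
    n_vr: int                        # participants in the VR exposure group
    n_in_vivo: int                   # participants in the in vivo exposure group
    randomized: bool                 # criterion (9)
    equivalent_exposure: bool        # criterion (10)
    pre_post_standardized: bool      # criteria (11) and (12)
    means_and_sds_reported: bool     # criterion (13)


def meets_design_criteria(study: StudyDesign, min_group_size: int = 10) -> bool:
    """Return True only if all design criteria (8)-(13) are fulfilled."""
    return (
        min(study.n_vr, study.n_in_vivo) >= min_group_size  # criterion (8)
        and study.randomized
        and study.equivalent_exposure
        and study.pre_post_standardized
        and study.means_and_sds_reported
    )


# Invented example: a trial with only 8 participants in one arm fails (8).
small_trial = StudyDesign(8, 12, True, True, True, True)
eligible_trial = StudyDesign(15, 14, True, True, True, True)
print(meets_design_criteria(small_trial), meets_design_criteria(eligible_trial))
```

The check is conjunctive on purpose: a report not fitting every criterion is excluded, so a single failed field is enough to reject it.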

#### Information Sources

A literature search in PubMed, PsycINFO and Web of Science was conducted in October and November 2017 and was updated in November 2018 and in June 2019. We also asked experts in the field of VR therapy to provide possibly eligible studies.

#### Search

We searched for the keywords "virtual" and "phobia" in the PubMed, PsycINFO and Web of Science databases. Moreover, we conducted a search on the term "social anxiety." We did not set a time limit for the period in which the studies were conducted. Depending on the different databases' search template structure, we used slightly different search strategies. In PubMed the connector 'AND' was used to search for "Virtual AND Phobia" as well as "Virtual AND anxiety AND social" in titles and abstracts. In PsycINFO we searched for "virtual" in title and "phobia" in abstracts, as well as for "virtual" in title and "anxiety" and "social" in abstracts. In Web of Science we searched for "virtual" in title and "phobia" in the topic, as well as for "virtual" in title and "anxiety" and "social" in the topic.

### Study Selection

A PRISMA flow diagram (Moher et al., 2009) illustrates the number of studies screened and excluded during the screening process (see **Figure 1**). For this purpose, the numbers from the first search in November 2017 and the updated searches in November 2018 and June 2019 were summed up. During the screening process of all records identified through database searching (n = 1,126) and other sources (n = 3), obvious duplicates (n = 143) were removed first. The titles and abstracts of the remaining reports (n = 986) were then screened against the eligibility criteria. If a title or abstract provided the information that at least one eligibility criterion was not fulfilled, the record was excluded (n = 944). All remaining records with no evidence for a violation of the eligibility criteria within the abstract were passed on for full-text screening (n = 42). During this process, all 13 eligibility criteria were assessed, and reports not fitting every eligibility criterion were excluded (n = 33). We contacted two authors to check on dependent samples in different records to avoid including data twice. One author provided the information that there was no overlap in the samples of two flight phobia studies (Rothbaum et al., 2000, 2006). The other author informed us that one eligible record (Robillard et al., 2010) includes preliminary data from a larger study (Bouchard et al., 2016), so we excluded the preliminary data from our meta-analysis.
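The counts reported in this paragraph can be checked against each other with simple arithmetic. The short sketch below uses only the numbers stated above; the variable names are illustrative. The last value is what the arithmetic leaves after full-text exclusion and is not a figure quoted from the text.

```python
# Counts as reported in the study-selection paragraph.
records_from_databases = 1126
records_from_other_sources = 3
duplicates_removed = 143
excluded_on_title_or_abstract = 944
excluded_on_full_text = 33

# Each stage of the flow is the previous stage minus its exclusions.
identified = records_from_databases + records_from_other_sources
screened = identified - duplicates_removed
full_text_assessed = screened - excluded_on_title_or_abstract
remaining_after_full_text = full_text_assessed - excluded_on_full_text

print("identified:", identified)
print("screened on title/abstract:", screened)
print("assessed in full text:", full_text_assessed)
print("remaining after full-text exclusion:", remaining_after_full_text)
```

The intermediate results reproduce the reported n = 986 and n = 42, which shows that the stages of the flow are internally consistent.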

Three researchers were involved in the screening process. One researcher screened the titles and abstracts of all studies, and then screened the full text of the remaining studies, providing suggestions for the selection of eligible reports. A second researcher additionally screened the abstracts and full texts and selected eligible reports. Disagreements between the two researchers concerning the inclusion of studies after the full-text screening were discussed with the third researcher. We performed the exclusion process based on the information provided in the published articles and to the best of our knowledge.

### Data Collection Process

Data were extracted from each report independently by two researchers. Means and standard deviations for the participants' age and the distribution of sexes were missing in one report, but we could not reach the respective author. One author was contacted concerning missing information on the type of HMD and provided the respective information. Technical data on the image resolution and the field of view of HMDs were collected from the reports and, if not available there, from HMD data sheets in internal databases and on the producers' websites. Because not all reports provided statistical data on both the intent-to-treat and the completer sample, we contacted the respective authors to ask for additional data. As we could not obtain both data sets for every included study, we used the intent-to-treat data if available; otherwise we used data from the completer sample. Disagreements between the two researchers concerning the collected data were discussed with the third author until a consensus was reached.

#### Data Items

The following data were extracted: (1) number of participants in total and in the VR and in vivo group, (2) age of the participants as means, standard deviations and range, (3) distribution between the sexes of the participants, (4) medication of the participants, (5) treated disorder (Agoraphobia, Social Phobia, or Specific Phobia), (6) number of total treatment sessions in the VR and in vivo condition (exposure sessions plus additional sessions applying other interventions), (7) amount of exposure in the VR and in vivo condition, assessed in the form of (7a) the number of exposure sessions and (7b) the duration of exposure sessions in minutes, (8) exposure strategy, (9) type of HMD with (9a) resolution and (9b) field of view, (10) information on movement mode and further stimulation of senses alongside the sense of sight in VR, (11) description of the VR exposure environment, (12) description of the in vivo exposure environment, (13) therapeutic interventions used for pre- and post-processing and to accompany exposure, (14) sample on which the calculation of means and standard deviations was carried out (intent-to-treat vs. completer sample), (15) type of standardized measurement for symptom specific anxiety, and (16) means and standard deviations of the pre- and post-symptom measurement for the VR and in vivo group. If more than one symptom measurement was applied, the measurement that assessed the anxiety symptoms of the treated phobia most specifically was collected.

### Risk of Bias in Individual Studies

Risk of bias in the individual studies was assessed using a tool for bias detection in randomized trials from the Cochrane Collaboration (Higgins et al., 2011). To assess the risk of selection bias, performance bias, detection bias, attrition bias, and reporting bias, we checked the criteria (1) random sequence generation, (2) allocation concealment, (3) blinding of participants and researchers, (4) blinding of outcome assessment, (5) incomplete outcome data, and (6) selective reporting. The risk of bias in each domain was rated as either low, unclear, or high following the explanations and examples provided by Higgins et al. (2011). We again performed this process based on the information provided in the published articles and to the best of our knowledge.

### Synthesis of Results

#### Qualitative Review

As qualitative synthesis of all included studies, we conducted a qualitative review. A qualitative review provides a structured presentation and assessment of central characteristics of the included studies. For this purpose, we examined and summarized the participants' characteristics, diagnostic measures, study methodology, and treatment materials and procedures to provide an overview and a basis for discussions on effect sizes and future research perspectives. In the examination of the treatment materials and procedures, we particularly considered information on the exposure strategy, visual VR devices (type of HMDs including technical data on resolution and field of view), movement mode in VR, devices for further stimulation of senses alongside the sense of sight in VR, VR and in vivo exposure environments, and additional interventions alongside the exposure applied in the VR and in vivo condition (see collected data items in section Data Items).

#### Quantitative Meta-Analysis

To provide a statistical summary of the results on the efficacy of VR and in vivo exposure therapy from the included studies on phobic anxiety disorders, we performed a quantitative meta-analysis. In this regard, we calculated pre- to post-effect sizes for VR exposure, in vivo exposure, and the comparison of VR to in vivo exposure for the individual studies and then synthesized them for all included studies. In addition, we separately calculated synthesized effect sizes of all studies on Specific Phobia, Social Phobia and Agoraphobia. We used Microsoft Excel 2016 as the software tool for the statistical analysis.

#### **Effect sizes for the individual studies**

VR exposure therapy and in vivo exposure therapy. As a first step, we calculated the pre- to post-effect sizes for (1) the VR exposure treatment and (2) the in vivo exposure treatment of the individual studies included in the meta-analysis. We therefore computed the standardized mean difference between pre- and post-measurement separately for the VR group and in vivo group of each study using Cohen's d for studies that use pre-post scores according to Borenstein et al. (2009). Because correlations between the outcome measures were not available, the correlation was set to zero, constituting a conservative calculation (Lenhard and Lenhard, 2014). As an indicator corrected for small sample bias, we computed Hedges' g (Hedges and Olkin, 1985). The Hedges' g coefficients were calculated by multiplying Cohen's d by a correction factor according to Borenstein et al. (2009). In addition, variance, standard error and 95% confidence interval for Hedges' g were calculated. Hedges' g may be interpreted as small (0.2), medium (0.5), and large (0.8) (Ellis, 2010).
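The pre-post calculation described above can be sketched as follows. This is a minimal illustration, assuming the standard formulas for pre-post designs given by Borenstein et al. (2009); the function name, argument names, and example values are hypothetical and are not taken from any included study. The sign convention (pre minus post, so that a reduction in fear yields a positive value) is also an assumption.

```python
import math

def prepost_hedges_g(m_pre, sd_pre, m_post, sd_post, n, r=0.0):
    """Pre-post effect size for one treatment group (hypothetical helper).

    r is the pre-post correlation; it defaults to zero because the
    correlations were not available in the published reports.
    """
    # Standard deviation of the difference scores for correlation r
    sd_diff = math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)
    # Rescale to the within-group standard deviation
    sd_within = sd_diff / math.sqrt(2 * (1 - r))
    # Cohen's d: a reduction in fear gives a positive value (assumed convention)
    d = (m_pre - m_post) / sd_within
    var_d = (1 / n + d**2 / (2 * n)) * 2 * (1 - r)
    # Small-sample correction factor with df = n - 1
    j = 1 - 3 / (4 * (n - 1) - 1)
    g = j * d
    var_g = j**2 * var_d
    se_g = math.sqrt(var_g)
    ci95 = (g - 1.96 * se_g, g + 1.96 * se_g)
    return {"g": g, "var": var_g, "se": se_g, "ci95": ci95}

# Illustrative values only
result = prepost_hedges_g(60.0, 10.0, 50.0, 10.0, n=20)
print(result)
```

With r = 0 the within-group standard deviation reduces to the root mean square of the pre and post standard deviations, so the illustrative call above standardizes a 10-point reduction by a standard deviation of 10.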

Comparison of VR and in vivo exposure therapy. As a second step, we calculated the pre- to post-effect sizes for (3) the comparison of VR to in vivo exposure therapy for the individual studies included in the meta-analysis. For each study, we calculated the standardized mean difference by subtracting the pre to post change in the in vivo group from the pre to post change in the VR group, and then divided the result by the pooled pre-test standard deviation (Morris, 2008). The standard deviations were pooled across pretest scores of both conditions as recommended by Morris (2008) as the best choice for pretest-posttest-control group designs. Hedges' g again was calculated for the standardized mean difference using a correction factor (Morris, 2008; Borenstein et al., 2009). The variance, standard error and confidence interval for g were also calculated. The variance of g was computed as the product of the squared correction factor and an approximation of the variance of the uncorrected standardized mean difference, using an equation for independent samples following Borenstein et al. (2009). In this calculation, a positive Hedges' g effect size reflects superiority of the VR exposure treatment, while negative coefficients indicate superiority of the in vivo exposure treatment.
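A minimal sketch of this between-group calculation is given below. It assumes the pooled pre-test standardizer and correction factor described by Morris (2008) and the independent-samples variance approximation of Borenstein et al. (2009). It further assumes that lower scores mean less fear and that change is defined as pre minus post, so that a positive g favors VR exposure. All names and numbers are hypothetical.

```python
import math

def between_group_hedges_g(vr, iv):
    """Compare pre-post change in the VR group with the in vivo group.

    vr and iv are dicts with keys m_pre, sd_pre, m_post, n
    (hypothetical structure chosen for this illustration).
    """
    n_vr, n_iv = vr["n"], iv["n"]
    df = n_vr + n_iv - 2
    # Pre-test standard deviation pooled across both conditions
    sd_pre_pooled = math.sqrt(
        ((n_vr - 1) * vr["sd_pre"] ** 2 + (n_iv - 1) * iv["sd_pre"] ** 2) / df
    )
    # Change is the reduction in fear from pre to post (assumed convention)
    change_vr = vr["m_pre"] - vr["m_post"]
    change_iv = iv["m_pre"] - iv["m_post"]
    d = (change_vr - change_iv) / sd_pre_pooled
    # Small-sample correction factor
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Variance approximation for two independent samples
    var_d = (n_vr + n_iv) / (n_vr * n_iv) + d**2 / (2 * (n_vr + n_iv))
    var_g = j**2 * var_d
    return g, var_g, math.sqrt(var_g)

# Illustrative values only: the VR group improves by 15 points, in vivo by 10
vr_group = {"m_pre": 60.0, "sd_pre": 10.0, "m_post": 45.0, "n": 20}
iv_group = {"m_pre": 60.0, "sd_pre": 10.0, "m_post": 50.0, "n": 20}
g, var_g, se_g = between_group_hedges_g(vr_group, iv_group)
print(g, var_g, se_g)
```

Swapping the two groups in the call flips the sign of g, which matches the interpretation that negative coefficients indicate superiority of in vivo exposure.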

#### **Synthesis of effect sizes for all studies on phobic anxiety disorders**

To synthesize the effect sizes for VR exposure, in vivo exposure, and the comparison of VR to in vivo exposure therapy from all included studies on phobic anxiety disorders, we estimated total mean effect sizes in a random-effect model following Borenstein et al. (2009). A random-effect model accounts for the variation across the studies and assumes that the true effects are normally distributed (Borenstein et al., 2009). It therefore considers the within-study variance and the variance between-studies.

Three random-effect models were calculated to synthesize the individual effect sizes for (1) VR exposure, (2) in vivo exposure, and (3) the comparison of VR and in vivo exposure from all included studies. During the calculation of each model according to Borenstein et al. (2009), an estimate for the between-studies variance was computed first, using the method of moments. Second, each study was weighted by the inverse of its variance plus the estimated between-studies variance. Third, we estimated the mean effect size. For this purpose, we calculated the weighted mean of the Hedges' g effect sizes of all studies, as the sum of the weighted effect sizes of the individual studies, divided by the sum of the weights. We also computed the variance, standard error, confidence interval, Z-value and two-tailed p-value for the estimated mean effect size.
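The three steps above can be sketched as follows. This is an illustration under the assumption that the method of moments refers to the DerSimonian and Laird estimator presented by Borenstein et al. (2009); the function name and the example inputs are hypothetical.

```python
import math

def random_effects_summary(effects, variances):
    """Random-effects synthesis of Hedges' g values (hypothetical helper)."""
    k = len(effects)
    # Fixed-effect weights are needed for the Q statistic
    w = [1 / v for v in variances]
    sum_w = sum(w)
    mean_fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sum_w
    q = sum(wi * (gi - mean_fixed) ** 2 for wi, gi in zip(w, effects))
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    # Step 1: method-of-moments estimate of the between-studies variance
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Step 2: weight each study by the inverse of (own variance + tau2)
    w_star = [1 / (v + tau2) for v in variances]
    # Step 3: weighted mean effect size and its precision
    mean = sum(wi * gi for wi, gi in zip(w_star, effects)) / sum(w_star)
    var = 1 / sum(w_star)
    se = math.sqrt(var)
    z = mean / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p-value
    ci95 = (mean - 1.96 * se, mean + 1.96 * se)
    return {"tau2": tau2, "mean": mean, "var": var, "se": se,
            "z": z, "p": p, "ci95": ci95}

# Illustrative values only
print(random_effects_summary([0.0, 1.0], [0.1, 0.1]))
```

When the estimated between-studies variance is zero, the weights reduce to the inverse within-study variances and the result coincides with a fixed-effect synthesis.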

#### **Synthesis of effect sizes for studies on Specific Phobia, Social Phobia, and Agoraphobia**

In addition, we calculated synthesized effect sizes for VR exposure, in vivo exposure, and for the comparison of VR to in vivo exposure therapy separately for studies on Specific Phobia, Social Phobia, and Agoraphobia. Because the estimate of the between-studies variance, which is necessary to calculate the random-effect model, has poor precision if the number of studies is very small (Borenstein et al., 2009), we calculated a fixed-effect model as an option for a small number of studies suggested by Borenstein et al. (2009). A fixed-effect model is already reasonable for a synthesis of as few as two studies, because a synthesis of two or more studies offers a more precise estimate of the true effect compared to one study alone (Borenstein et al., 2009). A fixed-effect model does not allow inferences on a wider population but provides a descriptive analysis of the included studies (Borenstein et al., 2009). It assumes that the true effect size is the same in all studies included in the meta-analysis (Borenstein et al., 2009). Although the fixed-effect model strictly requires functionally identical studies (Borenstein et al., 2009), which is basically implausible in studies performed by different researchers, it nevertheless seems applicable for the synthesis of studies on one kind of phobic anxiety disorder in this meta-analysis, particularly because the inclusion criteria created a relatively high homogeneity concerning the participants and the procedure used across the studies.

The fixed-effect models were computed according to Borenstein et al. (2009). Altogether, we calculated nine fixed-effect models synthesizing the pre to post effect sizes for (1) VR exposure therapy, (2) in vivo exposure therapy, and (3) the comparison of VR to in vivo exposure therapy separately, for all included studies on (1) Specific Phobia, (2) Social Phobia, and (3) Agoraphobia. During the calculation of each fixed-effect model, the effect size of each individual study was weighted by the inverse of its own variance. The weighted mean was then calculated as the sum of the weighted effect sizes, divided by the sum of the weights. In addition, we computed the variance, standard error, 95% confidence interval, Z-value and two-tailed p-value for the summary effect using equations according to Borenstein et al. (2009).
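The inverse-variance weighting described above can be sketched as follows. This is a minimal illustration of the fixed-effect equations as presented by Borenstein et al. (2009); the function name and the example values are hypothetical.

```python
import math

def fixed_effect_summary(effects, variances):
    """Fixed-effect synthesis of Hedges' g values (hypothetical helper)."""
    # Each study is weighted by the inverse of its own variance
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    # Weighted mean: sum of weighted effects divided by sum of weights
    mean = sum(w * g for w, g in zip(weights, effects)) / sum_w
    var = 1 / sum_w
    se = math.sqrt(var)
    z = mean / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p-value
    ci95 = (mean - 1.96 * se, mean + 1.96 * se)
    return {"mean": mean, "var": var, "se": se, "z": z, "p": p, "ci95": ci95}

# Illustrative values only: two studies with equal precision
print(fixed_effect_summary([0.2, 0.6], [0.04, 0.04]))
```

Because the summary variance is the reciprocal of the summed weights, adding a second study always narrows the confidence interval relative to either study alone.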

### Risk of Bias Across Studies

To assess the risk of bias across the included studies, we created a funnel plot with the standard errors for Hedges' g on the axis of ordinates and Hedges' g on the axis of abscissae. For this purpose, we used Hedges' g for the comparison of the VR and in vivo condition, as the main result of our analysis. A skewed or asymmetrical funnel in a visual examination can indicate a publication bias, as (smaller) studies that do not show statistically significant effects remain unpublished (Easterbrook et al., 1991; Egger et al., 1997).

### RESULTS

#### Study Selection

The PRISMA flow-chart diagram (**Figure 1**) shows the number of screened studies, excluded studies, and studies finally included in the meta-analysis. During full-text assessment, 33 studies were excluded because they did not fulfill the eligibility criteria for the following reasons: presentation of 3-D-stimuli on a PC screen instead of VR presentation using immersive systems (e.g., HMD) and head tracking (Klinger et al., 2005), comparison of two different VR exposure groups but no comparison to an in vivo exposure control group (Fraser et al., 2001), imaginal/in sensu exposure but no in vivo exposure as the control group (Wiederhold et al., 2001, 2002; Wallach et al., 2009; Rus-Calafell et al., 2013; Triscari et al., 2015), interoceptive exposure but no in vivo exposure as the control group (Quero et al., 2014), interoceptive and imaginal exposure but no in vivo exposure as the control group (Vincelli et al., 2003), computer-aided exposure as the control group (Tortella-Feliu et al., 2011), in vivo exposure only for patients with comorbid conditions in the control group (Krijn et al., 2007), relaxation training as the control group (Mühlberger et al., 2001), cognitive treatment as the control group (Mühlberger et al., 2003; Wallach et al., 2011), evaluation of VR exposure treatment effects on a graduation flight conducted accompanied or alone but no comparison between VR exposure and in vivo exposure as control group (Mühlberger et al., 2006), no control group (Baños et al., 2002; Anderson et al., 2003, 2005; Wald, 2004; Grillon et al., 2006; Piercey et al., 2012; Felnhofer et al., 2014), combination of exposure with paroxetine (Pitti et al., 2015), <10 participants per group (Botella et al., 2000), no equivalent amount of exposure in the VR and in vivo condition (Pelissolo et al., 2012), report of a study protocol without results (Miloff et al., 2016), and no equivalency concerning additional interventions applied alongside exposure in the VR and in vivo 
condition (Miloff et al., 2019). In the latter reports by Miloff et al. (2016, 2019), VR exposure was conceptualized as a fully automated VR serious game constructed to work independently from the presence of a human therapist. In contrast, in vivo exposure was conducted as a single session exposure approach according to Öst and was guided by a human therapist. Therefore, the VR condition was confounded with an automated exposure approach. As the in vivo exposure condition furthermore consisted of additional interventions conducted by the human therapist, that were not applied in the VR condition, like reflection on catastrophic beliefs, exploration of what occurs at each treatment stage, exploration of violations of expectancy and monitoring and discussion of safety behaviors, the study did not fulfill our inclusion criteria concerning equivalence in the additional interventions applied alongside exposure in the VR and in vivo condition and was thus excluded from the analysis. The following studies were also excluded for the exceptional reasons of a combination of VR exposure with in vivo exposure (Choi et al., 2005), and as pilot trials, follow-up studies, or studies that examined a new research question based on the data of another screened study (Rothbaum et al., 2002; Robillard et al., 2010; Safir et al., 2012; Anderson et al., 2016; Kampmann et al., 2019).

#### Study Characteristics

Nine studies fulfilled all eligibility criteria and were included in our meta-analysis (see **Tables 1**, **2** for study characteristics). The final sample consisted of two studies on Agoraphobia (Botella et al., 2007; Meyerbroeker et al., 2013), three on Social Phobia (Anderson et al., 2013; Bouchard et al., 2016; Kampmann et al., 2016), and four on Specific Phobia (Rothbaum et al., 2000, 2006; Emmelkamp et al., 2002; Michaliszyn et al., 2010). As different sub-types of Specific Phobia, two studies target fear of flying (Rothbaum et al., 2000, 2006), one study targets fear of heights (Emmelkamp et al., 2002), and one study targets fear of spiders (Michaliszyn et al., 2010).

As presented in **Table 1**, the nine studies were published between 2000 and 2016 and included data from 371 participants overall, with a mean sample size of 41.22 patients (SD = 14.39). All studies included participants with an ICD or DSM diagnosis of a phobic anxiety disorder. In the two studies on flight phobia as a Specific Phobia, patients with Agoraphobia with flying as the main feared stimulus were also included (Rothbaum et al., 2000, 2006). In one study on Agoraphobia (Botella et al., 2007), 17.1% of all participants, including a waitlist condition, were diagnosed with Panic Disorder without Agoraphobia. The study on spider phobia as a Specific Phobia included four participants with only a partial diagnosis of Specific Phobia who nevertheless scored within the phobic range on the questionnaire measures and behavioral avoidance task (Michaliszyn et al., 2010). Among the studies that reported this sample characteristic, the age of the included participants ranged from 18 to 72 (see **Table 1**). In all studies except one (Emmelkamp et al., 2002), more women than men were included, though one study did not give information on the distribution of sexes (see **Table 1**). Information on the percentage of medicated participants was only available in three studies (Botella et al., 2007; Michaliszyn et al., 2010; Bouchard et al., 2016) and showed a wide range of medication rates from zero to 66.6%. The number of total treatment sessions applied to each participant ranged from four (Emmelkamp et al., 2002) to 14 (Bouchard et al., 2016), with a mean total treatment session number of 8.78 (SD = 2.64). The number of total treatment sessions includes exposure sessions as well as additional sessions, for example with interventions like psychoeducation or relapse prevention (see **Table 2**). 
As required by the eligibility criteria of this meta-analysis (see section Eligibility Criteria), the amount of exposure was equal in the VR and in the in vivo exposure condition of all included studies. The amount of exposure was typically assessed by the number of exposure sessions, and the duration of one exposure session was also considered. In all studies, the number of exposure sessions performed in the VR and the in vivo group ranged from three sessions (Emmelkamp et al., 2002) to eight sessions (Bouchard et al., 2016) with a mean number of exposure sessions of 5.44 (SD = 1.59). The duration of one exposure session ranged from 20 to 90 min with a mean duration of 54.29 min (SD = 22.81), though two studies did not give information on this treatment procedure characteristic (see **Table 1**).

#### TABLE 1 | Participants' and treatment characteristics in RCTs included in the meta-analysis.

The left side of this table presents the number of patients included in the individual studies, their ICD or DSM diagnosis, their age, the distribution of the sexes, and the number of medicated patients. Age is reported as means and standard deviations either for the whole sample or separately for both treatment groups, VR exposure therapy (VRET) and in vivo exposure therapy (IVET). The range of age is stated for the whole sample. The distribution of sexes is presented as absolute numbers or as percentages of male and female participants and is reported either for the whole sample or separately for both treatment conditions. If information on medication was available, the absolute number or the percentage of medicated participants was reported either for the whole sample or separately for both treatment conditions. The right side of the table gives an overview of the treatment sessions applied to participants in the VR exposure and in the in vivo exposure condition. The total number of treatment sessions adds up the number of exposure sessions and the number of additional sessions for pre- and post-processing and to accompany exposure. The number of exposure sessions and the duration of one single exposure session in minutes are reported, too. A description of the concrete exposure procedures and interventions performed during additional sessions is summarized in Table 2. Studies are sorted by the type of phobia and date of publication. N (total participants) = 371. N/A: information was not available.

<sup>a</sup>Patients diagnosed with Agoraphobia with flying as main feared stimulus, n = 3.

<sup>b</sup>Patients diagnosed with Agoraphobia (with or without Panic Disorder) with flying as main feared stimulus, n = 10.

<sup>c</sup>Patients with partial diagnosis of Specific Phobia but scoring within the phobic range on questionnaire measures and BAT, n = 4.

<sup>d</sup>Participants with diagnosis of Panic Disorder without Agoraphobia in whole sample including waitlist, % = 17.1.

<sup>e</sup>Values for the whole sample including third condition (waitlist), N = 45.

<sup>f</sup>Percentages for whole sample including waitlist condition, N = 97.

<sup>g</sup>Participants included in VR and in vivo group by re-randomization from waitlist are not included in values for mean age and sex distribution, but in age range.

<sup>h</sup>Values for whole sample including third condition (waitlist).

<sup>i</sup>Inclusion criteria consist of a stable medication for 3 months.

<sup>j</sup>Tranquilizers excluded, stable dose of antidepressants required.

<sup>k</sup>Different number and duration of exposure sessions but with the same total duration of 120 min in the VR exposure condition (four times 30 min) and in the in vivo exposure condition (six times 20 min).

As presented in **Table 2**, the exposure strategy for both the VR and in vivo group was described as gradual in all studies except one. This study mentioned a special feature of its exposure strategy, where the focus was to develop new, non-threatening and adaptive interpretations, and habituation was not required (Bouchard et al., 2016). All studies applied a therapist-guided exposure approach in the VR and in vivo condition. They all used HMD devices for visual stimuli presentation in the VR exposure condition. Image resolution and field of view of HMDs differed across the individual devices. The image resolution determines how clean the picture quality is, while the field of view (FoV) refers to the view or the surroundings that a human eye can see without eye movements (Jerdan et al., 2018). One study additionally used a CAVE system for visual stimuli presentation but did not find significant differences compared with HMD presentation (Kampmann et al., 2016). For tactile and haptic stimulation, two studies on flight phobia mentioned the use of a specific seating construction in the VR condition (Rothbaum et al., 2000, 2006), and one study on fear of heights used a railing to hold on to (Emmelkamp et al., 2002). Some studies provided information on the movement mode in VR and mentioned either the use of a mouse (Botella et al., 2007; Michaliszyn et al., 2010; Bouchard et al., 2016), or that the participants could walk around freely in a demarcated space (Emmelkamp et al., 2002). Concerning the exposure environments for the VR and in vivo condition, some studies translated the in vivo environments directly into VR environments (Emmelkamp et al., 2002; Meyerbroeker et al., 2013), and others used in vivo environments that slightly differed from the VR environments. In one study on Social Phobia, in vivo group therapy was used to create a real-life audience for participants delivering a speech (Anderson et al., 2013). The study by Kampmann et al. (2016) provided standardized social scenarios in the VR condition but conducted exposure exercises on the participants' individual social situations in the in vivo condition. All included studies on Social Phobia furthermore mention the realization of social interactions with negative reactions of counterpart(s) in particular for the VR condition but not for the in vivo condition. Anderson et al. (2013) list bored, hostile and distracted as reactions of a virtual audience, Kampmann et al. (2016) mention dialogues with an unfriendly content, and, most pronounced, Bouchard et al. (2016) name acting under the scrutiny of strangers and facing criticism or insistence while meeting unfriendly neighbors or while refusing to buy from a persistent shop seller as virtual scenarios (see **Table 2**). In the two studies on fear of flying, VR and in vivo exposure differed in that no real flight took place in the in vivo condition; instead, imaginal exposure to take-off, flight and landing was conducted while sitting on a stationary plane (Rothbaum et al., 2000, 2006). In the study on spider phobia (Michaliszyn et al., 2010), in vivo exposure consisted of handling a living spider with the hands, while no tactile feedback was provided in VR. As additional interventions accompanying exposure, all studies conducted introduction interventions like psychoeducation. Most studies furthermore conducted cognitive or behavioral fear management strategies in addition to pure exposure treatment (Rothbaum et al., 2000, 2006; Botella et al., 2007; Michaliszyn et al., 2010; Meyerbroeker et al., 2013; Anderson et al., 2016; Bouchard et al., 2016). The two studies on Agoraphobia (Botella et al., 2007; Meyerbroeker et al., 2013) and the two studies on flight phobia (Rothbaum et al., 2000, 2006) applied interoceptive exposure in addition to VR and in vivo exposure. In all studies, additional interventions were conducted in the VR and the in vivo condition.

#### TABLE 2 | Continued

Column headers: References; Exposure strategy; Type of HMD; Movement mode in VR; Exposure environments; In vivo environments; Additional interventions.

This table provides detailed information on the exposure treatment materials and procedures and on additional interventions applied in participants of the included studies. It mentions the general exposure strategy that was similar in VR and in vivo. Moreover, it gives information on the type of HMD used for visual stimuli presentation, including data on image resolution and field of view (FoV). The image resolution is reported by the number of pixels arranged horizontally and vertically; the field of view is reported as diagonal FoV in degrees. If available, information on the movement mode in VR and on additional devices for tactile stimulation is provided. The table furthermore provides descriptions of the VR and in vivo exposure environments and mentions psychological interventions that were applied in addition to pure exposure treatment in the VR as well as for the in vivo exposure condition. Studies are sorted by the type of phobia and date of publication. N/A: no information available.

<sup>a</sup>Seat with woofer under it to create noise and vibrations.

<sup>b</sup>In this study, a CAVE system was used in addition to HMD as an alternative mode for VR presentation. No significant effects of HMD vs. CAVE were found on outcome-measures.

<sup>c</sup>Imaginal exposure was conducted during in vivo exposure on a stationary plane.

#### Risk of Bias Within the Studies

To reduce the risk of bias within the included studies, only randomized-controlled trials and only studies on participants with valid diagnoses were selected. The main goal of this meta-analysis was to compare studies that applied an equal amount of exposure in the VR and in vivo condition, which is an important contribution to reducing the risk of bias. However, it should be noted that all studies were published by authors who are researchers in the field of VR exposure, which may enhance a particular risk of bias.

To assess common sources of bias within randomized-controlled trials in detail, we used a bias detection tool from the Cochrane Collaboration (Higgins et al., 2011). Altogether, our assessment largely showed a low to unclear risk of bias in the included studies (see **Table 3**). Concerning the risk of selection bias in particular, all studies used random assignment, but not all studies described the concrete procedure of random sequence generation and allocation concealment. Specifically, there was an unclear risk of selection bias in six studies, as the method of random sequence generation and allocation concealment was not further specified. One study reported an assignment based on a computerized random number generator and the participation of a third study coordinator to ensure an unknown allocation before the participants' enrollment (Anderson et al., 2013), one study reported an assignment based on random number tables and a concealed assignment not further specified (Bouchard et al., 2016), and one study reported the use of a computerized random number generator and a concealed assignment using envelopes prepared by a third person and opened after enrollment of the participants (Kampmann et al., 2016). All three of these studies had a low selection bias risk. A performance bias must be suspected in all studies, as the blinding of participants and researchers during VR or, respectively, in vivo exposure was not possible due to the nature of the intervention. Generally, not all sources of bias can be avoided, due to the nature of the applied intervention. A certain performance bias therefore must be tolerated. Nevertheless, at least one study reported the blinding of the participant and researcher until the exposure component was applied, thereby enabling blind pre-processing interventions (Botella et al., 2007). This procedure was rated as low performance bias risk, considering the nature of the treatment. 
Other studies did not further specify the time point of de-blinding and therefore the risk of performance bias concerning pre-processing remained unclear. Blinding of outcome assessment was also not entirely realizable, as the meta-analysis was conducted on self-report measurements of phobic fear and participants could therefore not be blind to the applied condition at post measurement. However, two studies realized blinding during pre-assessment as an approximation (Botella et al., 2007; Kampmann et al., 2016), both rated as low detection bias risk under the described circumstances. Other studies did not further specify if pre-processing was performed blind and were rated as an unclear risk of detection bias. Risk of attrition bias was low in many studies; however, some studies did not provide sufficient information, and their risk of attrition bias therefore remained unclear. In studies with intent-to-treat data, attrition bias was rated as low if losses to post-test were disclosed with respective reasons, the intent-to-treat analysis method was described, and means and standard deviations were reported with information on the sample size in both groups (Meyerbroeker et al., 2013; Bouchard et al., 2016). If those descriptions were incomplete in studies with an intent-to-treat sample, attrition bias risk was rated as unclear. One study, where the completers and intent-to-treat sample were the same and outcome data therefore was complete (Botella et al., 2007), was rated as having a low risk of attrition bias. In studies with data for the completer sample, risk of attrition bias was rated as low if a precise description of attrition and exclusion of patients was provided (Rothbaum et al., 2000; Emmelkamp et al., 2002). In comparison, in studies with participants switching between conditions at an unspecified point of time (Michaliszyn et al., 2010), risk of attrition bias was rated as high. 
Risk of reporting bias was rated as low in all studies, as prespecified outcome measures were all reported.

#### TABLE 3 | Assessment of risk of bias within the studies.


Risk of bias was assessed using a tool from the Cochrane Collaboration (Higgins et al., 2011). Risk of bias was rated as low, unclear, or high.

#### Results of Individual Studies

**Table 4** shows means and standard deviations of the anxiety measures at pre and post assessment, as well as separate Hedges' g effect sizes for pre-post treatment effects for both the VR and in vivo group. Effect sizes for the VR exposure condition ranged from 0.35 (Rothbaum et al., 2006) to 2.76 (Michaliszyn et al., 2010), while effect sizes for the in vivo exposure condition ranged from 0.31 (Rothbaum et al., 2006) to 3.86 (Michaliszyn et al., 2010). Six studies conducted an intent-to-treat analysis (Rothbaum et al., 2006; Botella et al., 2007; Anderson et al., 2013; Meyerbroeker et al., 2013; Bouchard et al., 2016; Kampmann et al., 2016), one of them reporting the same sample size for participants included in the study and completers (Botella et al., 2007). Three studies reported on the completer sample (Rothbaum et al., 2000; Emmelkamp et al., 2002; Michaliszyn et al., 2010).

In the comparison of the treatment effects of the VR exposure and the in vivo exposure condition in the individual studies (see **Figure 2**), one study showed a large (g ≥ 0.80) (Anderson et al., 2013), two studies a medium (0.80 > g ≥ 0.50) (Rothbaum et al., 2000; Kampmann et al., 2016) and one study a small (0.50 > g ≥ 0.20) (Michaliszyn et al., 2010) negative effect size, indicating superiority of in vivo exposure. One study showed a small (0.50 > g ≥ 0.20) (Emmelkamp et al., 2002) and one study a medium (0.80 > g ≥ 0.50) (Bouchard et al., 2016) positive effect size in the direction of superiority of VR exposure over in vivo exposure therapy. Three studies showed an effect size around zero (−0.06 to 0.02) and thereby below a small effect (g < 0.20) (Rothbaum et al., 2006; Botella et al., 2007; Meyerbroeker et al., 2013), pointing to no relevant difference between VR and in vivo exposure therapy.

#### Synthesized Findings

Both Virtual Reality exposure therapy (g = 1.00) and in vivo exposure therapy (g = 1.07) showed a large, significant overall effect size when synthesizing the nine included studies (n = 371) on phobic anxiety disorders using a random-effect model (see **Table 5**). Calculated separately for each phobic anxiety disorder using fixed-effect models, VR exposure therapy showed a medium, significant effect size in Specific Phobia (g = 0.68), and a large, significant effect size in Social Phobia (g = 1.17) and Agoraphobia (g = 0.99). In vivo exposure therapy also yielded a medium, significant effect size in Specific Phobia (g = 0.72), and a large, significant effect size in Social Phobia (g = 1.19) and Agoraphobia (g = 0.90) (see **Table 5**).

For the comparison of the treatment effect of VR exposure therapy and in vivo exposure therapy in all nine included studies on phobic anxiety disorders (n = 371), using a Hedges' g random-effects model, we obtained a mean overall effect size estimate of Hedges' g = −0.20, SE = 0.18, p = 0.271, 95% CI [−0.55, 0.16] (see **Figure 2**). The negative effect size represents a difference of mean treatment changes in the direction of superiority of in vivo exposure, but the effect size was at the lower limit of a small effect (0.50 > g ≥ 0.20) and therefore very small, and not significantly different from zero. Accordingly, we found no evidence for a significant difference in the efficacy of VR and in vivo exposure therapy over all studies on phobic anxiety disorders.

To separately compare the treatment effects of VR exposure and in vivo exposure therapy for the three subtypes of phobic anxiety disorders, we applied fixed-effect models (see results in **Figure 2**), which are appropriate for the small number of studies included on each phobia. The pooled effect size for four studies on Specific Phobia (n = 153) showed a non-significant

#### TABLE 4 | Effect sizes for the pre-post treatment effects of VR exposure therapy and in vivo exposure therapy.


This table provides means and standard deviations of pre and post measurements on the stated anxiety measures, as well as pre to post effect sizes for VR exposure therapy (VRET) and in vivo exposure therapy (IVET) of all studies included in the meta-analysis. Effect sizes were reported as Hedges' g. The statistical values either refer to the completer sample (Completer) or to the intent-to-treat sample (ITT), as mentioned. Studies are sorted by the type of phobia and date of publication. AQ-Anxiety, Acrophobia Questionnaire, Anxiety-subscale; FSQ, Fear of Spiders Questionnaire; FFI, Fear of Flying Inventory; PRCS, Personal Report of Confidence as a Speaker; LSAS-SR, Liebowitz Social Anxiety Scale; FQ-Agoraphobia, Fear Questionnaire – Agoraphobia; ACQ, Agoraphobic Cognition Questionnaire; CI, confidence interval; LL, lower limit; UL, upper limit.

<sup>a</sup>The report did not present sample sizes and/or a declaration of ITT or completer sample in the table on means and standard deviations, information from the text was used for specification.

<sup>b</sup>Patients from waitlists were allocated to the VRET and IVET condition and included in the analysis.

<sup>c</sup>The authors reported the same sample size for the number of participants included in the study and the analysis sample.

result in favor of in vivo exposure that was below the level of a small effect (g < 0.20), g = −0.15, SE = 0.16, p = 0.333, 95% CI [−0.47, 0.16]. This means that we found no significant difference in the efficacy of VR exposure and in vivo exposure in Specific Phobia. The pooled effect size of three studies on Social Phobia (n = 148) showed a medium and significant effect size favoring in vivo exposure, g = −0.50, SE = 0.17, p = 0.003, 95% CI [−0.83, −0.16]. Accordingly, VR exposure therapy was found to be significantly less efficacious than in vivo exposure therapy in Social Phobia. The pooled effect size for two studies on Agoraphobia (n = 70) yielded a non-significant result close to zero, g = −0.01, SE = 0.23, p = 0.959, 95% CI [−0.47, 0.45], not favoring one of the treatment conditions. This indicates similar treatment effects of VR and in vivo exposure therapy in Agoraphobia.
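The fixed-effect pooling used for the three subgroups corresponds, in its standard form, to an inverse-variance weighted mean of the study effect sizes. A minimal Python sketch is given below. The function name `fixed_effect_pool` and the input values are hypothetical and are not taken from the included studies; the sketch only illustrates the general technique and is not the software used for this meta-analysis.

```python
import math


def fixed_effect_pool(effects, ses):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]       # more precise studies weigh more
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical Hedges' g values and standard errors (NOT data from the paper)
example_g = [-0.60, -0.40, 0.10]
example_se = [0.30, 0.30, 0.30]
pooled_g, pooled_se = fixed_effect_pool(example_g, example_se)
print(f"pooled g = {pooled_g:.2f}, SE = {pooled_se:.2f}")
```

Because the three hypothetical studies have equal standard errors, the pooled estimate here is simply their arithmetic mean; with unequal standard errors, the more precise studies dominate the result.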

#### Risk of Bias

Visual inspection of a funnel plot (see **Figure 3**) showed a sample of studies with relatively homogeneous standard errors and widespread effect sizes. No asymmetry indicating a (publication) bias was detected.

### DISCUSSION

#### Summary and Discussion of Main Findings

Applying strict inclusion criteria to focus exclusively on the comparison of VR and in vivo exposure, this meta-analysis synthesized nine randomized-controlled trials with altogether 371 participants, comparing the pre to post treatment effects of VR and in vivo exposure therapy in phobic anxiety disorders applied with an equivalent amount of exposure and with equivalent additional interventions alongside exposure in both conditions. VR and in vivo exposure both yielded large effect sizes concerning the reduction of phobic fear. For the comparison of VR and in vivo exposure therapy, we found a small, but nonsignificant effect size (g = −0.20, p = 0.271) favoring in vivo exposure over VR exposure (see **Figure 2**). Although a nonsignificant effect is not a final proof of equivalence, it shows that there is no evidence that VR exposure is significantly less efficacious than in vivo exposure therapy in phobic anxiety disorders. The 95% confidence interval of the synthesized effect size ranged from −0.55 to 0.16. This illustrates that the true effect may lie in this range and that VR exposure could be inferior to slightly superior in comparison to in vivo exposure. Regarding previous meta-analyses on the comparison of VR

FIGURE 2 | Forest plot with pre to post effect sizes for the comparison of VR exposure therapy to in vivo exposure therapy. All effect sizes are reported as Hedges' g, using a fixed-effect model or a random-effect model as stated. Negative effect sizes indicate superiority of in vivo exposure therapy, while positive effect sizes indicate superiority of virtual reality exposure therapy. Studies are sorted by the type of phobia and date of publication.

TABLE 5 | Pooled effect sizes for the pre-post-treatment effects of VR exposure therapy and in vivo exposure therapy.


This table presents pre to post effect sizes for VR exposure therapy and for in vivo exposure therapy pooled from studies on Specific Phobia, Social Phobia, Agoraphobia, and all studies. Pooled effect sizes are reported as Hedges' g using a fixed-effect model or random-effect model as stated. CI, confidence interval; LL, lower limit; UL, upper limit.

to in vivo exposure, a non-significant effect size is consistent with the finding of a recent meta-analysis on VR versus in vivo exposure in anxiety disorders by Carl et al. (2019), which showed a non-significant, negative effect size in favor of in vivo exposure (g = −0.07, p = 0.544). However, this effect size was even below the level of a small effect. An earlier meta-analysis by Powers and Emmelkamp (2008) even reported a small, positive effect size in favor of VR exposure over in vivo exposure therapy (g = 0.34). Those differences of our results to former meta-analyses comparing VR to in vivo exposure might be due to factors like the number and type of studies included, the selected outcome measures, or the data analysis strategy. Powers and Emmelkamp (2008) included only five studies published until 2007 and in this regard conducted their meta-analysis on a smaller sample of original studies. Carl et al. (2019) synthesized 14 studies, and included studies with a different amount of exposure between conditions (Pelissolo et al., 2012), with imaginal instead of in vivo exposure in the control group (Wallach et al., 2009), with in vivo exposure only for patients with comorbid conditions in the control group (Krijn et al., 2007), with VR presentation without using immersive systems (e.g., HMD) and head tracking (Klinger et al., 2005), as well as preliminary data (Robillard et al., 2010) on an already included study (Bouchard et al., 2016). These studies were excluded in our meta-analysis due to the stricter inclusion criteria. As one important point, we ensured that in vivo exposure was applied for all clients in the control condition

and in an equivalent amount to VR exposure. That might have made the control condition more powerful thereby shifting our overall effect size toward the superiority of in vivo exposure in comparison to the previous meta-analysis. Furthermore, we diminished potential sources of bias like dependent samples. For example, the study consisting of preliminary data showed a medium positive effect size in favor of VR exposure, whereby the inclusion of this study in the meta-analysis by Carl et al. (2019) might have moved their overall effect size toward VR exposure.

Considering the results for the individual studies included in our meta-analysis, the effect sizes for the pre to post treatment efficacy of VR in comparison to in vivo exposure therapy varied largely (−1.02 to 0.53) (see **Figure 2**), with some favoring VR exposure, some favoring in vivo exposure and some detecting no relevant differences. The wide range shows the high potential of VR, but also illustrates that VR exposure therapy could be less efficacious than in vivo exposure therapy. This raises a discussion on potential working mechanisms of a more or less efficacious VR exposure therapy if compared to in vivo exposure. On the one hand, variance in the effect sizes of the individual studies might be due to confounding variables, like differences in the distribution of participants' characteristics for the two conditions (e.g., age, comorbidities, or severity of phobic anxiety). On the other hand, the variance could result from differences in the specific materials and procedures of exposure therapy in the individual studies (e.g., technical features of VR devices, realization of VR and in vivo environments, or combination of exposure with accompanying treatment elements, for example cognitive interventions). Furthermore, it could be due to differences in the efficacy of VR exposure for different kinds of phobic anxiety disorders. We discuss these factors in the paragraphs that follow.

Looking separately at studies on Specific Phobia (n = 4), we found a negative effect size in favor of in vivo exposure (g = −0.15) (see **Figure 2**), which was, however, non-significant and below the level of a small effect. VR exposure consequently does not seem to be significantly less efficacious than in vivo exposure in this phobic anxiety disorder. This is consistent with the results of the latest meta-analysis on VR versus in vivo exposure in anxiety disorders by Carl et al. (2019), who synthesized five studies comparing VR and in vivo exposure in Specific Phobia. Unlike Carl et al. (2019), the meta-analysis conducted here excluded a study with in vivo exposure only for patients with comorbid conditions in the control group (Krijn et al., 2007), and could thereby confirm the result for a sample of studies free from understated in vivo conditions.

Behind the synthesized effect size of four studies on Specific Phobia, the effect sizes of the individual studies ranged from −0.65 to 0.27 (see **Figure 2**). It is important to mention that we pooled studies on three different Specific Phobias (fear of heights, fear of spiders and fear of flying) as there were not enough articles published to synthesize results separately for each Specific Phobia. Effect sizes indicating superiority of in vivo exposure were found for two studies, one on fear of spiders (Michaliszyn et al., 2010) and one on fear of flying (Rothbaum et al., 2000). Michaliszyn et al. (2010) assumed that inferiority of VR exposure might be based on an insufficient presence or problems with cybersickness. Another aspect to consider is that in vivo exposure was defined as successful in this study once patients could handle a living spider in their hand (Michaliszyn et al., 2010). In contrast, VR exposure did not include tactile stimulation (see **Table 2**), which could have diminished its efficacy when compared to in vivo exposure. Another possible explanation for the inferiority of VR in the earlier study (Rothbaum et al., 2000) might be the use of an older HMD technology. A comparison between the two studies on fear of flying conducted by the group around Barbara Rothbaum (Rothbaum et al., 2000, 2006), which used a relatively similar treatment procedure, speaks against this hypothesis. The study conducted in 2006 pointed more toward equivalency of VR and in vivo exposure than the earlier study conducted in 2000, although an HMD with an equal number of pixels and a lower field of view was used in the 2006 study (see **Table 2**). Therefore, further potential (confounding) variables, for example differences in the sample characteristics between both studies, must be considered as relevant for the difference in effect sizes. Another discussion point concerning those two studies is that Rothbaum et al. (2000) and Rothbaum et al. 
(2006) both conducted an in vivo exposure of a parked plane but only imaginal exposure of a takeoff, flying, and landing in the control group. In contrast, VR exposure consisted of a takeoff, landing, and flying in calm and stormy weather (see **Table 2**). Therefore, a flight exposure conducted entirely in vivo might still yield superior effects in comparison to VR exposure. Hence, the comparison of VR and in vivo flight exposure is an interesting question for future research. The study on fear of heights by Emmelkamp et al. (2002), showing a small effect size favoring VR therapy, indicates that VR exposure can be equally efficacious or even superior to in vivo exposure. In this study, participants in the VR group were exposed to exactly the same three situations as participants in the in vivo group, which were rebuilt as VR environments (see **Table 2**). Furthermore, participants could walk around freely in a demarcated space during VR exposure in this study, while other studies, for example Michaliszyn et al. (2010), used a mouse for movements in VR (see **Table 2**). This could potentially represent an advantage of the VR condition in this study when compared to the VR conditions of other studies.

The result for studies on Agoraphobia (n = 2) distinctly points to a similar efficacy of VR and in vivo exposure, as the synthesized effect size is close to zero and non-significant (g = −0.01) (see **Figure 2**). Here, the separate effect sizes for the individual studies were homogeneous, ranging from −0.07 to 0.02. In both included studies, participants were gradually exposed to situations like a subway, public buildings, or supermarkets in VR, and a similar in vivo exposure was applied in the control group (Botella et al., 2007; Meyerbroeker et al., 2013) (see **Table 2**). Unlike our result, the meta-analysis on VR versus in vivo exposure in anxiety disorders by Carl et al. (2019) synthesized three studies on Panic Disorder with Agoraphobia and found a small, negative effect size in favor of in vivo exposure (g = −0.26), which was, however, non-significant. Carl et al. (2019) additionally included one study that performed CBT methods classically recommended for panic disorder (among them in vivo exposure) in the control group, and VR exposure to different environments as the main intervention in the experimental group (Pelissolo et al., 2012). This study yielded a negative effect size in favor of in vivo exposure, which might be due to the application of interventions relevant for panic disorder patients such as relaxation training, cognitive interventions, and interoceptive exposure in the in vivo but not in the VR condition. The study was excluded from our meta-analysis due to the stricter eligibility criteria of an equivalent amount of exposure and the equivalency of additional interventions in the VR and in vivo condition. Its negative effect size probably explains the stronger trend in favor of in vivo exposure for Agoraphobia in the meta-analysis by Carl et al. (2019). 
The two studies included in our meta-analysis on Agoraphobia conducted cognitive interventions and interoceptive exposure as additional interventions in both the VR and in vivo exposure group (Botella et al., 2007; Meyerbroeker et al., 2013) (see **Table 2**). Considering this, we suspect that VR and in vivo exposure in Agoraphobia show a similar efficacy in studies with a highly comparable treatment procedure of exposure and additional interventions in VR and in vivo therapy.

Interestingly, we found a significant, medium effect size (g = −0.50) in favor of in vivo exposure in the synthesis of studies comparing VR and in vivo exposure in Social Phobia (n = 3) (see **Figure 2**). This indicates evidence for the superiority of in vivo exposure in Social Phobia and is therefore inconsistent with former meta-analyses. For instance, Chesham et al. (2018) found an extremely small, non-significant effect in favor of VR exposure when pooling five randomized trials comparing VR exposure with standard treatment (in vivo or imaginal exposure) for Social Anxiety. Also, Carl et al. (2019) found a small, non-significant effect in favor of VR exposure when comparing it to in vivo exposure in six studies on Social Anxiety and Performance Anxiety. Unlike Carl et al. (2019) and Chesham et al. (2018), we excluded studies with imaginal exposure or with a videotaped visualization procedure but with no in vivo exposure as the control group (Wallach et al., 2009; Heuett and Heuett, 2011), because the absence of in vivo exposure could lower the efficacy of the control condition in contrast to the VR condition. These two studies yielded either a positive effect size in favor of VR exposure (Heuett and Heuett, 2011) or an effect size close to zero, favoring neither VR nor in vivo exposure (Wallach et al., 2009). Moreover, unlike Carl et al. (2019), we excluded studies with VR presentation without using immersive systems (e.g., HMD) and head tracking (Klinger et al., 2005), and studies on preliminary data on already included trials (Robillard et al., 2010), both of which yielded positive effect sizes in favor of VR. The exclusion of those four studies in our meta-analysis can probably explain why our results differ from the meta-analysis by Carl et al. (2019) or Chesham et al. (2018).

We found a superiority of in vivo exposure over VR exposure only in Social Phobia but not in Agoraphobia and Specific Phobia, indicating that it might be more difficult to create virtual social environments for Social Phobia exposure than virtual spiders, heights and airplanes for Specific Phobia exposure, or virtual shopping malls, subways, or tunnels for Agoraphobia exposure. In addition, Social Phobia is considered to be a more complex disorder with high comorbidity, chronicity, and impairment (Wittchen and Fehm, 2003), and shows lower remission rates in CBT treatments than other anxiety disorders (Springer et al., 2018). In principle, these issues should affect VR as well as in vivo exposure therapy in Social Phobia. Nevertheless, it might be easier to activate specific dysfunctional beliefs of Social Phobia patients (e.g., concerning what others think of them) in vivo than in VR. In general, the creation of avatars, agents, and social interactions for VR settings is a challenging issue not only for psychological but also for computer science research. As an example, the degree of naturalism required for virtual agents is being intensively discussed, as it does not seem to be linearly related to the users' acceptance of the agent ("uncanny valley effect") (see e.g., Stein and Ohler, 2017; Schwind et al., 2018).

The effect sizes on the comparison of VR to in vivo exposure from the individual reports on Social Phobia ranged from −1.02 to 0.53 (see **Figure 2**). This shows that VR exposure was partially inferior and partially superior to in vivo exposure in the studies included in our meta-analysis. An effect size favoring in vivo exposure was found in the study on public speaking phobia by Anderson et al. (2013) in which participants were asked to deliver a speech in front of a virtual audience varying in size, if assigned to the VR condition, or respectively, in front of a real audience in a group therapy setting if allocated to the in vivo group (Anderson et al., 2013) (see **Table 2**). In VR, the audience members could be manipulated on their reactions (e.g., bored or interested; see **Table 2**) and could pose standardized questions (Anderson et al., 2013). As an important difference, participants in the in vivo group not only delivered their own speech but additionally listened to the speeches of other group therapy members and received positive feedback on their own videotaped speeches from the whole group instead of only from the therapist like realized in the VR condition (Anderson et al., 2013) (see **Table 2**). According to the authors of the study, the group setting might have been of a stronger interpersonal nature than the VR environment. Furthermore, they considered the feedback as a higher dose of exposure in the in vivo condition (Anderson et al., 2013). Above that, one might speculate that the in vivo condition provided a higher individualization in social interactions in contrast to the standardized reactions and questions of the virtual audience in the VR condition. Furthermore, positive feedback from numerous peers might have supported cognitive reinterpretations of the feared situation and observing speeches from other participants could have possibly worked as model learning. 
This might represent advantages of the in vivo in contrast to the VR condition. In the second study yielding an effect size in favor of in vivo exposure (Kampmann et al., 2016), a gradual exposure to different standardized social situations with standardized dialogues with different content and style from friendly to unfriendly and with varying personal relevance was conducted in the VR group, whereas individual social situations were translated to in vivo exposure exercises in the in vivo control group, and also exposure in the personal environment of the participants was realized in this condition (see **Table 2**). Both conditions were not combined with cognitive interventions to examine the pure effect of exposure therapy (Kampmann et al., 2016). It might be possible that a stronger individualization of the situations in the in vivo group resulted in a higher efficacy of in vivo in comparison to VR exposure. The authors of the study furthermore speculated that the results could be attributed to the fact that it was their first version of a Social Anxiety VR environment, and moreover mentioned that VR exposure for Social Phobia might need to be combined with cognitive elements to improve efficacy (Kampmann et al., 2016). Superiority of VR exposure in Social Phobia was actually achieved in a study which realized social situations in VR and in vivo while focusing on cognitive restructuring without requiring habituation (Bouchard et al., 2016) (see **Table 2**). The authors of the study pointed out that cognitive interventions might influence the way exposure is mentally processed by the patients. As a further difference to the study conducted by Kampmann et al. (2016), the therapist was present in the same room during VR exposure. Bouchard et al. (2016) in this regard considered a better therapeutic alliance in the VR condition as another possible explanation for the different results. As a difference to the studies conducted by Anderson et al. 
(2013) and Kampmann et al. (2016), Bouchard et al. (2016) partially used role-playing for exposure in the in vivo condition. One might argue that role-play causes less fear activation than social situations in real life, which could have lowered the efficacy of the in vivo in comparison to the VR condition of this study. Furthermore, it should be mentioned that the social situations realized in the in vivo condition of the study by Bouchard et al. (2016) seem to be less individualized in comparison to the study conducted by Kampmann et al. (2016). Finally, in the study by Bouchard et al. (2016) VR exposure scenarios included social interactions like acting under the scrutiny of strangers, being refused, or facing criticism or insistence that were not described for the in vivo condition (see **Table 2**). Although negative reactions of virtual counterparts were also realized in the studies by Anderson et al. (2013) and Kampmann et al. (2016) (see **Table 2**), they seem more pronounced in the study by Bouchard et al. (2016). Because negative reactions of a counterpart target central fears of Social Phobic patients, this might, especially when combined with cognitive interventions, facilitate, for example, expectancy violation concerning catastrophic beliefs. This could be a further aspect explaining why VR exposure was more efficacious than in vivo exposure in this study. Overall, real humans' reactions cannot be manipulated in the same systematic way as in VR, and social interactions comprising rejection are therefore not easily realized in vivo. This might generally represent an advantage of VR over in vivo exposure in Social Phobia.

To further discuss variables that might have influenced the efficacy of VR in comparison to in vivo exposure therapy, differences in the participants' characteristics, in the technical features of VR devices, and in the concrete kind of VR and in vivo exposure environments and procedures are considered.

As one sample characteristic, we looked at differences in the participants' age in all included studies. Age ranged from 18 to 72 years, with a mean age over both experimental groups ranging from 29.1 to 43.97 years (see **Table 1**). Although the variance was not strikingly high, differences in the participants' age between the individual studies might have influenced the efficacy of the VR exposure treatment. One might hypothesize that younger participants profit more from a VR treatment than older participants, as they are more familiar with this technique. Contrary to this hypothesis, VR exposure was inferior to in vivo exposure (g = −0.44) in the study which included the youngest participants (age M = 29.1, SD = 7.99) (Michaliszyn et al., 2010). Furthermore, the study which included the oldest participants (age M = 43.97, SD = 9.34) showed a result in favor of VR exposure therapy (g = 0.27) (Emmelkamp et al., 2002). Across all studies, too, visual inspection does not show a distinct relationship between younger mean age and a higher efficacy of VR in comparison to in vivo exposure. Further sample characteristics, for example disorder severity, comorbidities, or medication, could not be examined systematically, as they were not measured homogeneously in the different studies.

Another point to consider is the change VR technologies underwent from 2000 to 2016, the period in which the included studies were published. One might argue that the technical development of VR devices could have affected the therapy results in the VR exposure condition, while the in vivo exposure procedure stayed relatively unchanged. In this regard, better developed VR techniques might have led to higher efficacy of VR exposure treatments in comparison to in vivo exposure. As already mentioned in the discussion on effect sizes for Specific Phobia, the results for the studies by Rothbaum et al. (2000, 2006) do not point to this hypothesis. For a deeper examination, we provide a description of the resolution and field of view of the different types of HMDs used in the individual studies included in this meta-analysis (see **Table 2**). Over all studies, visual inspection does not show a positive relationship between the efficacy of VR in comparison to in vivo exposure and the technical development of the HMD devices. For example, VR exposure conducted with an nVisor SX, a VR device with a relatively high resolution and a wide field of view for the first decade of the century (1280 × 1024/60°), in one case shows a similar efficacy of VR to in vivo exposure (g = 0.02) (Meyerbroeker et al., 2013), and an inferior efficacy of VR in another case (g = −0.68) (Kampmann et al., 2016). Moreover, studies using a VFX, a type of HMD with a lower resolution and field of view (640 × 480/35°), also point to similarity between VR and in vivo exposure (Rothbaum et al., 2006), or show inferiority of VR exposure (Anderson et al., 2013). Studies conducted with a V6, another type of HMD (640 × 480/60°), point to similarity between VR and in vivo exposure (g = −0.07) (Botella et al., 2007), or show an effect size in favor of in vivo exposure (g = −0.65) (Rothbaum et al., 2000).

Finally, variables like age and technical development of HMDs alone cannot explain differences in the efficacy of VR in comparison to in vivo exposure therapy in all the studies included in this meta-analysis. Instead, different modes of movement in VR, individualization of VR scenarios, the therapeutic alliance in VR exposure, the combination of VR exposure with cognitive interventions, and the creation of virtual social interactions targeting central fears are interesting factors to be considered in future research on the effective factors of VR exposure therapy, especially in Social Phobia. As soon as more studies are available, systematic meta-regression analysis could statistically examine the influence of certain variables on the efficacy of VR in comparison to in vivo exposure therapy. Therefore, original studies should not only control for participants' characteristics such as disorder severity, comorbidity, or accompanying medication as potential confounding variables, but also systematically describe their materials and procedure with regard to the VR settings, such as movement mode in VR, stimulation of further senses alongside the sense of sight, design of virtual social interactions, exposure strategy, role of the therapist during exposure, as well as realization of theoretical and empirical approaches such as cognitive restructuring and inhibitory learning. Moreover, experimental studies could systematically vary potential effective factors, thereby providing findings on a causal influence on the efficacy of VR exposure. By doing this, future studies could reveal more about efficacious application procedures of VR exposure therapy.

#### Limitations

The main limitation of this meta-analysis is the relatively small number of included studies, as only nine published articles fulfilled the eligibility criteria. This limits the generalizability of the results, especially for the specific phobic anxiety disorders. On the other hand, a statistical summary, even of a small number of studies, can be meaningful since it prevents intuitive ad hoc summaries which are often highly misleading (Borenstein et al., 2009). However, the relatively strict inclusion criteria of this meta-analysis for the participants, procedures, materials, and study design resulted in pooled effect sizes from a sample of highly comparable studies with a study methodology that was of comparatively good quality, but which could have been improved on.

As one eligibility criterion, we only included studies of participants diagnosed with an ICD or DSM phobic anxiety disorder, resulting in a relatively homogeneous sample of all studies. However, the inclusion of studies on three different phobic anxiety disorders resulted in a certain variance between participants. Moreover, there were studies which included participants diagnosed with other (phobic) anxiety disorders than the target disorder of the particular study. Those were, in particular, patients with Agoraphobia in the two studies on fear of flying (Rothbaum et al., 2000, 2006) and participants with Panic Disorder without Agoraphobia in a study on Agoraphobia (Botella et al., 2007) (see **Table 1**), which could cause a certain bias if those participants were not equally distributed in the experimental conditions.

Another limitation lies in the use of different anxiety measurements across different studies, which could have biased the synthesized effect sizes. As we calculated effect sizes on symptom-specific anxiety measurements, it was necessary to pool different assessments for Specific Phobia, Social Phobia and Agoraphobia; in addition, studies on the same phobic anxiety disorder applied different symptom-specific measurements (see **Table 4**). Moreover, we only included self-report measurements, as there were not enough studies which conducted comparable objective measures, like behavioral avoidance tasks. However, as standardized tests were used, those should have generated sufficiently reliable and valid measurements.

Regarding VR equipment, only studies using VR devices with immersive systems (e.g., HMD) and head tracking were included, leading to a relatively homogeneous application of VR stimuli. Nevertheless, the studies used different types of hardware, such as different HMDs (see **Table 2**), and different types of software, which is a potential source of variance. One study partly used an HMD and partly a CAVE system as the VR presentation mode (see **Table 2**), but in this case no significant differences in the outcome measures were found (Meyerbroeker et al., 2013).

Concerning the treatment procedure, a similar amount of exposure in the VR and in vivo condition was required according to the eligibility criteria. Nevertheless, the exposure environment in VR and in vivo was not always equal (see **Table 2**). For example, two studies conducted a flight exposure in VR, but only exposed participants to sitting on a stationary plane with imaginal exposure of a flight in the in vivo condition (Rothbaum et al., 2000, 2006). Moreover, over different studies there were differences in the amount and kind of therapeutic techniques applied for pre- and post-proceeding and to accompany exposure treatment (see **Table 2**). For example, Social Phobia exposure was combined with cognitive techniques in the study by Bouchard et al. (2016), but not in the study by Kampmann et al. (2016).

A limitation of the included data is that the original studies differed in providing either data on an intent-to-treat sample, a completer sample, or both. If available, intent-to-treat data were used for synthesis, but if not provided, data on completer samples were included (see **Table 4**). This should be considered as a potential source of bias. Future original studies should thus provide data on both samples, so that separate meta-analyses can be conducted.

A limitation concerning the synthesis of data is the use of fixed-effect models for pooling a smaller number of studies on one specific kind of phobic anxiety disorder. A fixed-effect model requires functionally identical studies (Borenstein et al., 2009), which can approximately be reached by the strict inclusion criteria and focusing on only one phobic disorder for the synthesis. Nevertheless, the included studies on one phobic anxiety disorder were not entirely identical and therefore the results must be interpreted cautiously and should not be used for generalizations on a wider population (Borenstein et al., 2009). They do however provide a descriptive analysis of differences concerning the treatment effects of VR and in vivo exposure therapy in different phobias and allow a discussion on potential mechanisms behind differences in effect sizes for the comparison of VR and in vivo exposure therapy between different phobic anxiety disorders.
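To illustrate how far a set of studies departs from the "functionally identical" assumption, heterogeneity statistics such as Cochran's Q and I² can be computed, and a between-study variance can be added to the weights in a random-effects model. The sketch below uses the DerSimonian-Laird estimator, which is one common choice and is an assumption here, since the text does not state which estimator was used. The function names and the input values are hypothetical and are not data from the included studies.

```python
import math


def heterogeneity(effects, ses):
    """Return Cochran's Q, I^2 (in percent) and the DerSimonian-Laird tau^2."""
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * g for wi, g in zip(w, effects)) / sw
    q = sum(wi * (g - fixed) ** 2 for wi, g in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2, tau2


def random_effects_pool(effects, ses):
    """Random-effects pooled estimate and SE using DerSimonian-Laird tau^2."""
    _, _, tau2 = heterogeneity(effects, ses)
    w = [1.0 / (se ** 2 + tau2) for se in ses]    # between-study variance added
    pooled = sum(wi * g for wi, g in zip(w, effects)) / sum(w)
    return pooled, math.sqrt(1.0 / sum(w))


# Hypothetical, strongly heterogeneous inputs (NOT data from the paper)
q, i2, tau2 = heterogeneity([-1.0, 0.5], [0.2, 0.2])
re_g, re_se = random_effects_pool([-1.0, 0.5], [0.2, 0.2])
print(f"Q = {q:.2f}, I2 = {i2:.1f}%, tau2 = {tau2:.3f}, g = {re_g:.2f}, SE = {re_se:.2f}")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and both models give the same pooled estimate; with heterogeneous studies, the random-effects standard error becomes larger.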

Furthermore, the applied statistical tests are only valid for testing differences between groups, but not for proving equivalency. Therefore, the non-significant results have to be interpreted cautiously and the relevance of effect sizes has to be considered. Future meta-analyses based on a larger number of trials should also draw on non-inferiority or equivalence tests (Piaggio et al., 2006) to examine the equivalence of VR and in vivo exposure.
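As an illustration of how such an equivalence test differs from a conventional difference test, the sketch below applies two one-sided z-tests (TOST) to a pooled effect size under a normal approximation. The function name, the equivalence margin of 0.5, and the standard errors in the example are assumptions chosen for illustration; no margin was specified in this meta-analysis.

```python
from statistics import NormalDist


def tost_equivalence(effect, se, margin, alpha=0.05):
    """Two one-sided z-tests for equivalence within (-margin, +margin).

    Returns (p-value, True if equivalence is concluded at level alpha).
    """
    if se <= 0 or margin <= 0:
        raise ValueError("se and margin must be positive")
    standard_normal = NormalDist()
    # H0: effect <= -margin, rejected when the effect lies well above -margin
    p_lower = 1.0 - standard_normal.cdf((effect + margin) / se)
    # H0: effect >= +margin, rejected when the effect lies well below +margin
    p_upper = standard_normal.cdf((effect - margin) / se)
    # Equivalence requires rejecting both one-sided null hypotheses
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Same hypothetical pooled effect of zero, estimated with different precision
print(tost_equivalence(effect=0.0, se=0.1, margin=0.5))  # precise estimate
print(tost_equivalence(effect=0.0, se=0.5, margin=0.5))  # imprecise estimate
```

In both calls a conventional test of difference would be non-significant, but only the precisely estimated effect supports a claim of equivalence. A non-inferiority test would use only the one-sided test against the margin that defines inferiority.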

Finally, we could not conduct a statistical analysis of potential effective factors of VR exposure therapy due to the small number of available studies. As soon as more original studies have been published, this will also be an important research question for future meta-analyses.

In general, this meta-analysis and the original studies were conducted by researchers in the field of VR. This is a potential source of bias, especially as no pre-registration of this systematic review and meta-analysis protocol was conducted. However, we comprehensively disclose our methods and results. Furthermore, as there are no original studies available from field-independent researchers, we provide a comprehensive and objective description of the materials and procedures of all studies included in this meta-analysis. In this regard, we want to enable the reader to capture information for an independent assessment and interpretation of the results and thereby attenuate the bias caused by non-independent researchers.

### Conclusions

This meta-analysis provides results on a head-to-head comparison of VR exposure and in vivo exposure as the gold standard of treatment for phobias, synthesized from methodologically comparable studies with an equivalent amount of exposure in VR and in vivo, and with equivalent interventions applied alongside VR and in vivo exposure in phobic anxiety disorders, especially in Agoraphobia and Specific Phobia. For Social Phobia, the synthesized effect size points to a superiority of in vivo exposure, but the wide range of effect sizes for the individual studies also shows the high potential of VR exposure in this phobic anxiety disorder.

While the individual effect sizes for the studies on Agoraphobia both indicate equivalency between VR and in vivo exposure, the individual effect sizes for the studies on Specific Phobia and Social Phobia ranged from inferiority to equivalency and even superiority of VR exposure. Studies that yielded an equivalent or even superior effect of VR exposure did one of the following: they combined exposure to agoraphobic situations in VR and in vivo with cognitive interventions and interoceptive exposure (Botella et al., 2007; Meyerbroeker et al., 2013); they realized an environment for height exposure in VR equivalent to that in vivo and allowed patients to walk around freely in a demarcated space in VR (Emmelkamp et al., 2002); or they focused on reinterpretations without requiring habituation during VR and in vivo exposure in Social Phobia and applied social interactions realizing rejection experiences in the VR condition (Bouchard et al., 2016). Although a statistical analysis of potential effective factors was not possible, such observations can contribute to the implementation of maximally efficacious VR exposure therapy. There are hints that VR exposure in Social Phobia should be combined with cognitive interventions and should use the possibility of manipulating virtual agents to target the central fears of patients with Social Phobia, in order to reach equal or even better efficacy in comparison to in vivo exposure.

Considering the results of this meta-analysis, and because there are barriers to conducting in vivo exposure in clinical practice (Neudeck and Einsle, 2012), it would be strategically useful to promote the broad dissemination of VR. As VR therapy is time-efficient, requires less organizational effort (Diemer et al., 2015), and is more readily accepted by patients (García-Palacios et al., 2007), this could be a feasible way of achieving efficacious exposure treatment for a wider population of patients.

As only a few studies with sufficiently homogeneous materials and procedures have been published on the examined research question, this meta-analysis is still based on a small number of studies. A proportionally high level of internal validity can be expected for the results, but the generalizability should be verified in an updated meta-analysis as soon as more studies are published in the coming years. In addition, future research should focus on the effective factors of VR exposure therapy and further examine mechanisms enhancing the treatment effects that may be applicable using VR exposure. Examples are the impact of cognitive strategies (see the study conducted by Bouchard et al., 2016), the movement mode in VR (mouse/gamepad vs. walking freely in a demarcated space as achieved in the study conducted by Emmelkamp et al., 2002), or the repetition of exposure with different stimuli or contexts (Shiban et al., 2013, 2015). Further interesting research questions include multimodal exposure (Peperkorn et al., 2016), the eligible patient population, or the number of VR sessions applied. As VR materials and procedures continuously improve, superior effects of VR exposure in comparison to in vivo exposure therapy could be realized in the future. This is because VR makes it possible to create ideal environments for exposure, for example virtual rejection experiences targeting central fears of Social Phobia patients as achieved in the study conducted by Bouchard et al. (2016), and also to consider and test theoretical and practical concepts, for example inhibitory learning and inhibitory regulation (Craske et al., 2008; Craske, 2015). The creation of complex virtual social interactions is therefore a challenge that can be solved in future research and VR scenario development.

#### AUTHOR CONTRIBUTIONS

AM and TW developed the idea of the research question. FK performed the systematic literature search to identify published studies, and conducted the title, abstract and full-text screening. TW also conducted abstract and full-text screening, selecting eligible studies. AM was consulted to discuss disagreements during the screening process. TW and FK performed the data extraction. TW conducted the statistical analyses with contributions from FK and AM. TW and AM interpreted the findings and wrote the manuscript draft.

#### REFERENCES

of anxiety disorders: an evaluation of research quality. J. Anx. Disord. 28, 625–631. doi: 10.1016/j.janxdis.2014.05.010


Friedman (Oxford: Elsevier), 186–191. doi: 10.1016/B978-0-12-397045-9.00266-4


**Conflict of Interest Statement:** AM is a stakeholder of a commercial company that develops virtual environment research systems.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Wechsler, Kümpers and Mühlberger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

<sup>∗</sup> Studies included in the meta-analysis are denoted with an asterisk.