Enhancing Our Lives with Immersive Virtual Reality

Slater, Mel; Sanchez-Vives, Maria V.

doi:10.3389/frobt.2016.00074

OPINION article

Front. Robot. AI, 19 December 2016

Sec. Virtual Environments

Volume 3 - 2016 | https://doi.org/10.3389/frobt.2016.00074

This article is part of the Research Topic: The Impact of Virtual and Augmented Reality on Individuals and Society

Mel Slater¹,²,³*

Maria V. Sanchez-Vives¹,²,⁴

¹Event Lab, Department of Clinical Psychology and Psychobiology, University of Barcelona, Barcelona, Spain
²Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
³Department of Computer Science, University College London, London, UK
⁴Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain

Summary

Virtual reality (VR) started about 50 years ago in a form we would recognize today [stereo head-mounted display (HMD), head tracking, computer graphics generated images] – although the hardware was completely different. In the 1980s and 1990s, VR emerged again based on a different generation of hardware (e.g., CRT displays rather than vector refresh, electromagnetic tracking instead of mechanical). This reached the attention of the public, and VR was hailed by many engineers, scientists, celebrities, and business people as the beginning of a new era, when VR would soon change the world for the better. Then, VR disappeared from public view and was rumored to be “dead.” In the intervening 25 years a huge amount of research has nevertheless been carried out across a vast range of applications – from medicine to business, from psychotherapy to industry, from sports to travel. Scientists, engineers, and people working in industry carried on with their research and applications using and exploring different forms of VR, not knowing that actually the topic had already passed away.

The purpose of this article is to survey a range of VR applications where there is some evidence for, or at least debate about, its utility, mainly based on publications in peer-reviewed journals. Of course, not every type of application has been covered, nor every scientific paper (about 186,000 papers in Google Scholar): in particular, in this review we have not covered applications in psychological or medical rehabilitation. The objective is that the reader becomes aware of what has been accomplished in VR, where the evidence is weaker or stronger, and what can be done. We start in Section 1 with an outline of what VR is and the major conceptual framework used to understand what happens when people experience it – the concept of “presence.” In Section 2, we review some areas where VR has been used in science – mostly psychology and neuroscience, the area of scientific visualization, and some remarks about its use in education and surgical training. In Section 3, we discuss how VR has been used in sports and exercise. In Section 4, we survey applications in social psychology and related areas – how VR has been used to throw light on some social phenomena, and how it can be used to tackle experimentally areas that cannot be studied experimentally in real life. We conclude that section with how VR has been used in the preservation of and access to cultural heritage. In Section 5, we present the domain of moral behavior, including an example of how VR might be used to train professionals such as medical doctors when confronting serious dilemmas with patients. In Section 6, we consider how VR has been and might be used in various aspects of travel, collaboration, and industry. In Section 7, we consider mainly the use of VR in news presentation and also discuss different types of VR. In the concluding Section 8, we briefly consider new ideas that have recently emerged – an impossible task, since in the short time it has taken to write this page even newer ideas have emerged! We then conclude with some general considerations and speculations.

Throughout and wherever possible we have stressed novel applications and approaches and how the real power of VR is not necessarily to produce a faithful reproduction of “reality” but rather that it offers the possibility to step outside of the normal bounds of reality and realize goals in a totally new and unexpected way. We hope that our article will provoke readers to think as paradigm changers, and advance VR to realize different worlds that might have a positive impact on the lives of millions of people worldwide, and maybe even help a little in saving the planet.

1. Virtual Reality – Foundations

1.1. Introduction – Now Is the Time

“It’s a very interesting kind of reality. It’s absolutely as shared as the physical world. Some people say that, well, the physical world isn’t all that real. It’s a consensus world. But the thing is, however real the physical world is – which we never can really know – the virtual world is exactly as real, and achieves the same status. But at the same time it also has this infinity of possibility that you don’t have in the physical world: in the physical world, you can’t suddenly turn this building into a tulip; it’s just impossible. But in the virtual world you can …. [Virtual reality] gives us this sense of being able to be who we are without limitation; for our imagination to become objective and shared with other people.” Jaron Lanier, SIGGRAPH Panel 1989, Virtual Environments and Interactivity: Windows to the Future.

Although said more than 25 years ago by the person who coined the term “virtual reality” (VR), this statement about the excitement and potentiality that was apparently just around the corner in the late 1980s really does apply today. The dream at the time was a VR that would be available cheaply on a mass scale worldwide. The expectation and hope were very high. As Timothy Leary said in the following year’s SIGGRAPH Panel, imagining a time when the cost of an HMD and body-tracking equipment would be at low-end consumer level, “… suddenly the barriers of class and linguistics and education and nationality are gone. The kid in the inner city can slip on the telepresence hardware and talk to young people in China or Russia. And have flirtations with kids in Japan. In other words, to me there is something wonderfully democratic about cyberspace. If it’s virtual you can be anyone, you can be anything this time around. We are getting close to a place where that is feasible.” Unfortunately, the feasibility was not there, or at least not realizable at that time or anywhere near it. Now, though, the possibility is real, and for whatever reason now is the time.

During the past 25 years, when VR was supposed to have “died”¹, a great deal of research into both the development of the technology and its application in a vast array of areas has continued. Scott Fisher, one of the VR pioneers, in a 1989 essay reported in Packer and Jordan (2002), set out a number of applications: telepresence, where VR provides an interface through which the participant operates in a distant place embodied in a robot located there; data visualization; applications in architectural visualization; medicine including surgical simulation; education and entertainment; and remote collaboration. These were all applications that were being worked on at the time. In this article, we set out how VR has been used in these and in a variety of other applications, applications that have already shown results that may be of significant benefit for individuals and society. With VR available on a mass scale, the potential for these benefits to have significant impact is now all the greater. However, as Jaron Lanier also said in the 1990 panel “… there’s really a serious danger of expectations being raised too high.” This remains true today, but we can be slightly less cautious since research in the intervening quarter of a century has demonstrated results that stand on a reasonably solid scientific basis.

For an overview of a range of applications of VR (not all considered in this article), see the paper by one of the pioneers of VR, Frederick Brooks (1999), with an updated discussion by Slater (2014). What follows is not meant to be a survey of all possible results in all possible applications. We have selected areas that we believe are particularly important for demonstrating how VR has been and might be used to improve the lives of people, and to help overcome some societal problems, or at the very least help in scientific understanding of problems and contribute toward solutions. Readers might find that their favorite topic, research result, or paper has not been mentioned. This is because we have focused on illustrative results and developments rather than attempting to be comprehensive. Indeed, to write comprehensively about every section in this article would require something like the whole article length devoted to it. Even so without trying to be comprehensive, we have found it necessary to cite many references. We have concentrated on scientific papers in peer-reviewed journals. Immersive VR has shown an extremely impressive array of applications over the years, but what is important now, given the lesson of what happened in its first phase, is that we emphasize results that have some level of scientific support. The scope of this article is on the uses of VR; we are not presenting techniques, methods, interfaces, algorithms, or any of the technical side, except where this is relevant to explain a particular application or results.

Our thesis is similar to that presented in the quote from Jaron Lanier above: VR offers us a way to simulate reality. We do not say that it is “exactly as real” as physical reality but that VR best operates in the space that is just below what might be called the “reality horizon.” If a virtual knife stabs you, you are not going to be physically injured but nevertheless might feel stress, anxiety, and even pain. If a virtual human unexpectedly kisses you, you may blush with embarrassment, and your heart start pounding, but it will be a virtual kiss only. On the other hand, as Lanier said, the real power of VR is to go beyond what is real, it is more than simulation, it is also creation, allowing us to step out of the bounds of reality and experience paradigms that are otherwise impossible.

Virtual reality is “reality” that is “virtual.” This means that, in principle, anything that can happen in reality can be programed to happen “virtually.” We return to this point in Section 8 since, for example, it does not hold for touch and force feedback. Therefore, writing about the potentialities inherent in VR is a difficult task – since it encompasses what can be done in physical reality (for good or evil). But even more, since it is VR, we emphasize that we can break out of the bounds of reality and accomplish things that cannot be done in physical reality. Herein lies its real power. With VR we can, for example, simulate and improve traditional physiotherapy, making it more engaging for patients by changing their apparent location and activity. In reality, a machine might be helping someone to move their legs, but with VR they can be given the illusion that they are playing soccer in the World Cup. This type of approach augments current practices. But VR can go well beyond this and introduce radical paradigm shifts.

In VR we are currently still at the stage similar to that of the transition between theater and movies as pointed out by Pausch et al. (1996). Movies were originally just another way to show theater. It took a while before moviemakers developed a new grammar, ways of presenting a story unique to this medium. So, the same will be true of VR. Nowadays, a computer game in VR is just a traditional computer game – but displayed in a different medium. Eventually there will be a paradigm shift, one that we cannot know at the time of writing. Putting this another way, VR is revolutionary, even though it has taken 50 years to get from the initial idea in the lab to becoming a mass consumer product. How this product might develop and change the world in which we live remains unknown. In this article, we try to set out some of what has been done with VR and to some extent what might be done. We address positive uses of VR, while recognizing from the outset that there will be, like with any technology, uses that are morally repugnant. For example, vehicles can do serious damage when used improperly, even though their designed purpose is to transport people or facilitate commercial activity.

1.2. Essential Concepts

The idea of immersive VR in the form that we think of it today was foreshadowed by Ivan Sutherland in 1965 (Sutherland, 1965) and then realized with the “Sword of Damocles” HMD described in a paper published 3 years later (Sutherland, 1968).² This was not the first ever HMD – see, for example, a collection of pictures compiled by Stephen R. Ellis of NASA Ames, which includes one dating back to 1613.³ Nor was this the first ever virtual environment system – see the multisensory Sensorama system by Morton Heilig,⁴ or Myron Krueger’s pioneering work on Artificial Reality (Krueger et al., 1985; Krueger, 1991), or the years of work on flight simulators (Page, 2000). However, it was the first that, although using almost totally different technology from that available today, introduced (and implemented) the concepts that make up a VR system. An HMD delivers two computer-generated images, one for each eye. The 2D images are computed and rendered with appropriate perspective with respect to the position of each eye in the three-dimensionally described virtual scene. Together, the images therefore form a stereo pair. Each of the two small displays is placed in front of the corresponding eye, with some optics that enable the user to see the images. The displays are mounted in a frame, which additionally has a mechanism to continually capture the position and orientation of the user’s head, and therefore gaze direction (assuming that the eyes are looking straight ahead). Hence, as the head of the user moves, turns, or looks up and down, this information is transmitted to the computer, which recomputes the images and sends the resulting signals to the displays. From the point of view of the users, it is as if they are in an alternate life-sized environment, since wherever they look, in whichever direction, they see this surrounding computer-generated world in 3D stereo with movement and motion parallax. (The same can be done with specialized sound.) In fact, from this point on we drop the term “user” and refer to the “participant.” VR is different from other forms of human–computer interface since the human participates in the virtual world rather than uses it.
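
The head-tracked stereo-pair computation described above can be illustrated with a minimal Python sketch. It is not the rendering pipeline of any particular HMD: the function names, the 64 mm interpupillary distance, the restriction to head rotation about the vertical axis (yaw), and the pinhole projection with unit focal length are all assumptions made for illustration.

```python
import math

def eye_positions(head_pos, yaw_deg, ipd=0.064):
    """Return (left_eye, right_eye) world positions for a tracked head.

    Assumed convention: y is up, yaw 0 looks along -z, positive yaw turns left.
    """
    yaw = math.radians(yaw_deg)
    # Unit vector pointing toward the head's right-hand side.
    right = (math.cos(yaw), 0.0, -math.sin(yaw))
    half = ipd / 2.0
    left_eye = tuple(p - half * r for p, r in zip(head_pos, right))
    right_eye = tuple(p + half * r for p, r in zip(head_pos, right))
    return left_eye, right_eye

def project(point, eye, yaw_deg, focal=1.0):
    """Perspective-project a world point for one eye; None if behind the eye."""
    yaw = math.radians(yaw_deg)
    dx, dy, dz = (p - e for p, e in zip(point, eye))
    # Undo the head rotation to express the point in eye coordinates.
    x_e = dx * math.cos(yaw) - dz * math.sin(yaw)
    z_e = dx * math.sin(yaw) + dz * math.cos(yaw)
    depth = -z_e  # distance in front of the eye
    if depth <= 0:
        return None
    return (focal * x_e / depth, focal * dy / depth)

def stereo_pair(point, head_pos, yaw_deg):
    """Recompute both eye images of a point for the current head pose."""
    left_eye, right_eye = eye_positions(head_pos, yaw_deg)
    return project(point, left_eye, yaw_deg), project(point, right_eye, yaw_deg)

if __name__ == "__main__":
    head = (0.0, 1.7, 0.0)
    for yaw in (0.0, 30.0):  # the images change as the head turns
        print(yaw, stereo_pair((0.0, 1.7, -2.0), head, yaw))
```

In this sketch, a point straight ahead lands at slightly different horizontal positions in the two images, and that difference shrinks as the point moves farther away, which is the depth cue that a stereo pair provides.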

In the 1980s, NASA Ames developed the VIEW system (Virtual Interface Environment Workstation) described by Fisher et al. (1987).⁵ This was a full VR system with all components recognizable today: head-tracked wide field-of-view relatively light weight HMD, audio, tracking of the body, tracked gloves that allowed participants to interact with virtual objects, tactile and force feedback (haptics), and where the VR could be linked to a telerobotics system (Section 6.4).

Also in the 1980s a company VPL led by Jaron Lanier became a driving force of VR developments constructing the Eyephone HMD, tracked data gloves⁶ for interaction, whole body tracking, and reality built for two (Blanchard et al., 1990).⁷ They also developed a visual programming language that made it possible to build virtual environments with limited programming. It was a goal for people to be able to construct their virtual realities, while in VR, and immediately share these with multiple people. It was probably through the work of VPL that the idea of VR became widely publicized.

The degree of excitement, creativity, speculation, visions of a positive future, belief in the near-term mass availability of VR cannot be overemphasized. Indeed, the ideas and realizations that were around in the late 1980s and early 1990s can be read anew today and have a new freshness – and are especially important because what was hoped for then (VR for the mass of people at low cost) is now becoming a reality. Readers are urged to read the proceedings of two panels that occurred at the SIGGRAPH conference in 1989 (Conn et al., 1989) and 1990 (Barlow et al., 1990) to get an idea of the excitement and promise of the heady days of early VR.

Head-mounted display technology puts the displays close to the eyes. Another type of immersive VR system was developed by Cruz-Neira et al. (1992), referred to as a CAVE™ system (Cruz-Neira et al., 1993). Here, images are back-projected onto the walls of an approximately 3 m cubed room (front projected onto the floor by a projector mounted on the ceiling above the open topped cuboid). Typically, three walls and the floor are screens. The images are projected interlaced at, e.g., 90 frames per second, 45 showing left eye images and the others the right eye images. Lightweight shutter glasses alternately have one eye lens opaque and the other transparent, in sync with the projected images. The brain fuses the two into one overall 3D stereo scene. Through head tracking mounted on the glasses, the image is correctly perspective computed for the head position, direction, and orientation of the participant. More than one person can be in the Cave simultaneously, each wearing stereo shutter glasses, but the perspective is only correct for the one wearing the head-tracked glasses. Hence, such Cave-like systems, like HMDs, deliver a surrounding 3D world. Of course, such a system has been far more expensive than an HMD system, both in terms of the space required and the cost (high powered projectors, a multiprocessor computer system, complex software for lock-step stereo rendering across all the displays, equipment maintenance). Moreover, as the promise of HMD driven VR diminished in the 1990s through the failure to develop high quality displays at low enough cost, and with acceptable ergonomics (such as weight), Cave-like systems came to be used as an alternative. However, unlike HMDs, each Cave was typically tailor-made to order (it depended on available space apart from anything else) and never became a mass product. Caves became one of the mainstays of VR research and applications from the late 1990s and through the 2000s until recently. The applications we discuss below include both HMD and Cave systems.
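
The alternation between left-eye and right-eye images described above can be sketched in a few lines of Python. This is only an illustration of the timing idea, not the control software of any Cave installation: the function names and the choice that even-numbered frames go to the left eye are assumptions.

```python
def eye_for_frame(frame_index):
    """Even frames go to the left eye, odd frames to the right (an assumed ordering)."""
    return "left" if frame_index % 2 == 0 else "right"

def shutter_state(frame_index):
    """Which lens of the shutter glasses is transparent during this frame."""
    eye = eye_for_frame(frame_index)
    return {"left_open": eye == "left", "right_open": eye == "right"}

def frames_per_eye(projector_fps):
    """Count how many frames each eye receives in one second of projection."""
    counts = {"left": 0, "right": 0}
    for i in range(projector_fps):
        counts[eye_for_frame(i)] += 1
    return counts

if __name__ == "__main__":
    print(frames_per_eye(90))
    print(shutter_state(0), shutter_state(1))
```

With a 90 frames per second projector, each eye receives half of the frames, which matches the 45 per eye figure given in the text.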

Conceptually, a minimal VR system places a participant into a surrounding 3D world that is delivered to a display system by a computer. At the very least, the participant’s head is tracked so that image and auditory updates depend on head-position and orientation. The computer graphics of the system delivers perspective-projected images individually to each eye, and the resulting scenario should be seen with correct parallax. Ideally, there should be a means whereby participants can effect changes in the virtual world. This may be accomplished by 3D tracked data gloves, or a handheld device such as a Wand (which is like a mouse or joystick but tracked in 3D space). Note that this says nothing about how the world is rendered. Even with the wire frame (lines only) images portrayed in Sutherland (1968), Ivan Sutherland noted that “An observer fairly quickly accommodates to the idea of being inside the displayed room and can view whatever portion of the room he wishes by turning his head ….Observers capable of stereo vision uniformly remark on the realism of the resulting images.”

1.3. Immersion and Presence

Consciousness of our immediate surroundings necessarily depends on the data picked up by our sensory systems – vision, sound, touch, force, taste, and smell. This is not to say that we simply reproduce the sensory inputs in our brains – far from it, perception is an active process that combines bottom-up processing of the sensory inputs with top-down processing (including prior experience, expectations, and beliefs) based on our previously existing model of the world. After a few seconds of walking into a room we think that we “know” it. In reality, eye scanning data show that we have foveated on a very small number of key points in the room, and then our eye scan paths tend to follow repeated patterns between them (Noton and Stark, 1971). The key points are determined by our prior model of what a room is. We have “seen” a small proportion of what there is to see; yet, our perceptual system has inferred a full model of the room in which we are located. In fact it has been argued that our model of the scene around us tends to drive our eye movements rather than eye movements leading to our perceptual model of the scene (Chernyak and Stark, 2001). It was argued by Stark (1995) that this is the reason why VR works, even in spite of relatively simplistic or even poor rendering of the surroundings. VR offers enough cues for our perceptual system to hypothesize “this is a room” and then based on an existing internal model infer a model of this particular room using a perceptual fill-in mechanism. Recall the quote from Sutherland above how people accommodated to and remarked on the realism of the wire frame rendered scene displayed in the “Sword of Damocles” HMD.

The technical goal of VR is to replace real sense perceptions by the computer-generated ones derived from a mathematical database describing a 3D scene, animations of objects within the scene – represented as transformations over sets of mathematical objects – including changes caused by the intervention of the participant. If sensory perceptions are indeed effectively substituted then the brain has no alternative but to infer its perceptual model from its actual stream of sensory data – i.e., the VR. Hence, consciousness is transformed to consciousness of the virtual scenario rather than the real one – in spite of the participant’s sure knowledge that this is not real.

Effective substitution of real sensory data is an ideal. In practice, it depends on several factors, not least of which is – which sensory systems are included? Typically, vision, and often auditory, more rarely touch, more rarely force feedback, more rarely still smell, and almost unknown taste.⁸ If we consider the typical VR system, it is primarily centered around vision, may have sound, and may have some element of tactile feedback. However, even vision alone is often enough for numerous applications, since anyway for many people it is perceptually dominant. So, participants in a VR typically encounter a situation where their visual system places them on say a roller coaster, but all other sense perceptions are from the surrounding physical environment. Nevertheless, they may scream and react as if they are on the roller coaster even while talking to a friend in reality standing nearby.

Factors that are critical for effective sensory substitution have been known for several years (Heeter, 1992; Held and Durlach, 1992; Loomis, 1992; Sheridan, 1992, 1996; Steuer, 1992; Zeltzer, 1992; Barfield and Hendrix, 1995; Ellis, 1996; Slater and Wilbur, 1997). They include wide field-of-view vision, stereo, head tracking, low latency from head move to display, high-resolution displays, and of course the more sensory systems that are substituted the better. However, these types of technical factors (and there are others) are for one purpose – to allow the participant to perceive using natural sensorimotor contingencies (O’Regan and Noë, 2001a,b; Noë, 2004). What this means is that in order to perceive we use our bodies in a natural way. We turn our head, move our eyes, bend down, look under, look over, look around, reach out, touch, push, pull, and do all or some subset of these things simultaneously. Perception is a whole body action. Hence, the primary technological goal of VR is to realize perception through such natural sensorimotor contingencies to the best extent possible, and of course this continually comes up against limitations. For example, if while wearing an HMD or in a Cave we look very closely at an object, eventually we will see pixels. Or, in most existing VR systems, if we touch some arbitrary virtual object we will not feel it.

By an immersive VR system we mean one that delivers the ability to perceive through natural sensorimotor contingencies. This is entirely determined by the technology. Whether you can turn around 360°, all the while seeing a very low-latency continuous update of your visual field in correspondence with your gaze direction, is completely a function of the extent to which the system can do this. We can classify systems in this way as being more or less immersive. We say that system A is more immersive than system B if A can be used to simulate the perception afforded by B but not vice versa. Hence, in this sense an HMD is “more immersive” than a Cave, since there is something that can be represented in an HMD that cannot be represented in a Cave (even a six-sided Cave): the virtual representation of the participant’s body. In a Cave when you look down toward yourself you will see your real body. In an HMD with head tracking you can see a virtual body substituting your own (if this has been programed). Moreover, the virtual body can be designed to look like the real one, or not, and certainly with body tracking can be programed to move with real body movements and so on. So, in this way an HMD-based system can (in an ideal sense) be set up to simulate a Cave, but not vice versa.
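
The ordering defined above (A is more immersive than B if A can simulate B but not vice versa) can be made concrete with a small Python sketch. It rests on a strong simplifying assumption that is not in the text: each system is reduced to a set of capabilities, and "A can simulate B" is read as "A offers every capability that B offers." The capability names and the third system are hypothetical.

```python
# Hypothetical capability sets; real systems differ in many more respects.
HMD = frozenset({"surround_stereo", "head_tracking", "virtual_body"})
CAVE = frozenset({"surround_stereo", "head_tracking"})
# A made-up system used only to show that some pairs cannot be ranked.
OTHER = frozenset({"surround_stereo", "haptics"})

def can_simulate(a, b):
    """Assumed reading: a can simulate b if it has all of b's capabilities."""
    return b <= a

def more_immersive(a, b):
    """a is more immersive than b if a can simulate b but not vice versa."""
    return can_simulate(a, b) and not can_simulate(b, a)

if __name__ == "__main__":
    print("HMD > Cave:", more_immersive(HMD, CAVE))
    print("Cave > HMD:", more_immersive(CAVE, HMD))
    print("HMD > other:", more_immersive(HMD, OTHER))
    print("other > HMD:", more_immersive(OTHER, HMD))
```

Under this reading the relation is a strict partial order rather than a ranking: the HMD comes out above the Cave, as in the text, while two systems that each offer something the other lacks are simply not comparable.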

Immersion describes the technical capabilities of a system; it is the physics of the system. A subjective correlate of immersion is presence. If a participant in a VR perceives by using her body in a natural way, then the simplest inference for her brain’s perceptual system to make is that what is being perceived is the participant’s actual surroundings. This gives rise to the subjective illusion that is referred to in the literature as presence – the illusion of “being there” in the environment depicted by the VR displays – in spite of the fact that you know for sure that you are not actually there. This specific feeling of “being there” has also been referred to as “place illusion” (PI) (to distinguish it from the multiple alternative meanings that have been attributed to the term “presence”) (Slater, 2009). It was coined by Marvin Minsky (1980) to describe the similar feeling that can arise when embodying a remote robotic device in a teleoperator system.

Place illusion can occur in a static environment where nothing happens – just looking around a stereo-displayed scenario, for example, where nothing is changing. When there are events in the environment, events that respond to you, that correlate with your actions, and refer to you personally, then provided that the environment is sufficiently credible (i.e., meets the expectations of how objects and people are expected to behave in the type of setting depicted), this will give rise to a further and independent illusion that we refer to as “Plausibility” (Psi) that the events are really happening. Again, this is an illusion in spite of the sure knowledge that nothing real is happening. A virtual human approaches and smiles at you, and you find yourself smiling back, even though too late you may say to yourself – why did I smile back, there is no one there?

The real-time update of sensory perception as a result of movement (e.g., head turning) gives rise to the sense of “being there” – the illusory sensation of being in the computer-generated environment (Sanchez-Vives and Slater, 2005). The dynamic changes following events caused by or to the participants can give rise to the illusion that the events are really happening – “plausibility” (Slater, 2009). With a technically good VR system (wide field-of-view high-resolution stereo display, with low-latency head tracking at a minimum), the “being there” aspect is essentially determined for all but a few moments during an experience (Slater and Steed, 2000). Psi is much harder to attain, often requiring specific domain knowledge (e.g., the virtual representation of a doctor’s surgery for the purposes of training had better be according to their expectations if doctors are to accept it). In this article, we use PI to refer to the illusion of being there, whereas presence refers to both PI and Psi. Following Sanchez-Vives and Slater (2005), the behavioral correlate of “presence” is that participants behave in VR as they would do in similar circumstances in reality. For a more formal treatment of PI, Psi, and presence, including experimental results, see Slater et al. (2010a).⁹ These issues are taken up again in Section 8.

This fundamental ability of VR to deliver experience that gives rise to an illusory sense of place and an illusory sense of reality is what distinguishes it from all other types of media. It is true that in response to a fire in a movie scene, the viewers’ hearts might start racing, with feelings of fear and discomfort. But they will not run out of the cinema for fear of the fire. In VR, about 10% of participants did run out when confronted by a virtual fire even though the fire did not look realistic (Spanlang et al., 2007). In a movie that includes a fight between two strangers in a bar, audience members will not intervene to stop the fight. In VR, they do – under the right circumstances – specifically when the victim shares some social identity with the participant (Slater et al., 2013), which itself is remarkable because obviously there is no one real there with whom to share social identity.

So, VR is a powerful tool for the achievement of authentic experience – even if what is depicted might be wholly imaginary and fantastic. In a scenario with dinosaurs such as that shown in “Back to Dinosaur Island – Jurassic World with Oculus Rift,”¹⁰ of course participants know that the situation is not real. Nevertheless, they would typically have the illusion of being there and have the illusory sensation that the dinosaur’s actions are really happening.

Evidence over the past 25–30 years shows that PI and Psi can occur even in quite low-level systems. This is because VR relies on the brain “filling in” detail in response to the apparent situation, so that just like in physical reality people find themselves responding with physiological and reflex actions before they consciously reason out the situation – in this case that in fact nothing real is happening. That reasoning or high-level cognitive processing occurs more slowly, after the autonomic bodily responses have already occurred. For example, put someone next to a virtual precipice and their heart will start pounding (Meehan et al., 2002), even though eventually of course they can say to themselves that it is not really there. VR effectively relies on this duality – between very rapid brain activation that causes the body to respond (by the body responding, we include autonomous responses and thoughts that are generated in response to an apparent situation) and the slower cognitive process that reasons things out, which is of course a vital mechanism for survival, and occurs normally in physical reality.

Since VR evokes realistic responses in people, it is fundamentally a “reality simulator.” By this we mean that participants can be placed in a scenario that depicts potentially real events, with the likelihood that they would act and respond quite realistically. This can obviously be exploited for many applications including rehearsal for the actual events, planning, training, knowledge dissemination, and so on. However, VR is also an unreality simulator! The events that it depicts may be ones that are highly unlikely to happen or cannot happen because they violate fundamental laws of physics, such as defying the laws of gravity. In VR, the physical laws can be simulated to the limit that computational power supports, or they can be changed or violated. Similarly, social conventions can be violated. A person might one day participate in a world that has never existed, such as Pandora from James Cameron’s movie Avatar.¹¹ But still, provided some fundamental principles are adhered to, giving rise to the illusions of being in the virtual place where real events are taking place – participants can nevertheless demonstrate realistic responses. At the simplest level your heart is likely to race equally being faced with a realistic depiction of a precipice (something that could happen) or being chased by otherworld monsters. In this way, VR dramatically extends the range of human experiences way beyond anything that is likely to be encountered in physical reality. Hence, the amazing capability of VR not just as a reality simulator but as an unreality simulator that can paradoxically give rise to realistic behavior.

In this article, we will outline some of the applications that have been developed that show the positive use of VR for the potential benefit of society and individuals – how VR can be used to enhance well-being across a vast range of aspects of life. VR as a reality simulator has its uses in various forms of training, for education, for travel, some of which are discussed in the sections below. Moreover, VR as an unreality simulator can be used for many different types of entertainment – that extend from passive to active. It should also be noted that VR as an unreality simulator can also be used to solve “real” problems – as we will indicate later.

In each of the sections below, we will tackle a different domain of application. We will show in each section what has been done at the time of writing and give some indication of the degree to which it has been successful (i.e., its scientific validation). Additionally, where relevant, we will discuss ideas and proposals indicating what could be done in this domain.

2. Science, Education, and Training

2.1. Psychology and Neuroscience

2.1.1. The Virtual Body

In Franz Kafka’s Metamorphosis,¹² Gregor Samsa woke up one morning lying in bed and found himself transformed into a horrible insect-like creature. The body felt like his own, but he had to learn how to move himself in new ways, and of course it had an impact on his attitudes and behaviors and those of others who saw him. Using VR, it has been shown to be possible to actually experiment with these types of body transformations, though rather more pleasant ones, and in the early days at the VPL company, there was experimentation by Jaron Lanier with embodiment in a virtual lobster body.

The question of how the brain represents the body is fundamental in cognitive neuroscience. How does the brain distinguish that this object is “my” hand and part of my body, but that object, a cup, is not part of my body, or that other object is your hand and not part of me? Common sense would have us believe that our own internal body representation is stable, something that changes only slowly through time, but experiments have shown that it is quite easy to shift the illusion of body ownership to objects that are not part of the body at all, or to a radically transformed body, so that our body representation is highly malleable.

A classic and very simple experiment to show this is the rubber hand illusion (RHI), presented by Botvinick and Cohen (1998) in a one-page Nature paper that has had an enormous impact on the field (over 1800 citations – Google Scholar – at the time of writing). It has led to a vast literature that exploits these illusions to understand how the brain represents the body. Recent reviews are provided in Blanke (2012); Ehrsson (2012); and Blanke et al. (2015). In the RHI, the subject sits by a table onto which a rubber hand is placed in an anatomically plausible position, and approximately parallel to the subject’s corresponding real hand. The real hand is hidden behind a partition. The experimenter sitting opposite the subject taps and strokes the seen rubber hand and the hidden real hand synchronously in time and as far as possible at the same locations on the two hands. From the subject’s point of view, there is a rubber hand seen on the table in front, and arranged so that it could be the subject’s own hand, and this hand is seen to be tactilely stimulated. But, corresponding to the seen stimulation, there is actually felt stimulation on the real hand. The brain’s perceptual system resolves this conflict by integrating the two separate but synchronous inputs into one, resulting in the perceptual and proprioceptive illusion that the rubber hand is the subject’s hand.¹³,¹⁴ This feeling, just like PI or Psi, is impossible to describe – it has to be experienced. If the visual and tactile stimulation are asynchronous, then the illusion does not occur, or occurs to a much lesser extent. To elicit a behavioral measure of the illusion, the idea of “proprioceptive drift” was introduced in Botvinick and Cohen (1998). Before the stimulation, participants with eyes closed had to point to their hand under the table on which their arm was resting. After the stimulation, participants were asked to repeat the pointing procedure. The difference between the post- and pre-measures is called the proprioceptive drift, where greater values indicate that participants pointed more toward the rubber hand after than before. Indeed, it was found that the drift was on average positive for those in the synchronous condition and zero for those in the asynchronous condition.
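The drift measure described above amounts to a signed difference between two pointing estimates. The following is a minimal sketch of that arithmetic, not the procedure of any particular study: the function names, the centimeter units, the sign convention (positive means toward the rubber hand), and the sample values are all illustrative assumptions.

```python
from statistics import mean


def proprioceptive_drift(pre_cm, post_cm):
    """Drift for one participant: post-stimulation minus pre-stimulation
    pointing position. Positions are assumed to be measured along an axis
    whose positive direction points from the real hand toward the rubber hand."""
    return post_cm - pre_cm


def mean_drift(trials):
    """Average drift over a list of (pre_cm, post_cm) pairs."""
    return mean(proprioceptive_drift(pre, post) for pre, post in trials)


# Hypothetical pointing data (pre, post) in centimeters, invented for illustration.
synchronous = [(0.0, 2.5), (1.0, 2.0), (-0.5, 1.5)]
asynchronous = [(0.0, 0.5), (1.0, 0.5), (-0.5, -0.5)]

print("mean drift, synchronous:", mean_drift(synchronous))
print("mean drift, asynchronous:", mean_drift(asynchronous))
```

With these invented numbers the synchronous group shows a positive mean drift and the asynchronous group a mean of zero, which mirrors the qualitative pattern reported in the text.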

Armel and Ramachandran (2003) went on to show that subjects also respond physiologically to a threat to the rubber hand. They argued that our internal body representation is updated moment to moment based on the stimulus contingencies received. The brain takes on the hypothesis, suggested by synchronous multisensory perception, that the rubber hand might be our real hand, and very quickly generates the corresponding illusion as a way to resolve the contradiction between the seen and felt synchronous stimulation. There are limitations, such as the rubber hand needing to look like a human hand and its position needing to be plausible, but the fundamental result that we can have strong feelings of ownership over an object that we know for certain is not part of our body is clearly demonstrated by this illusion.

Lenggenhager et al. (2007)¹⁵ and Ehrsson (2007)¹⁶ went on to show how similar multisensory techniques could be used to induce out-of-body illusions. Each of these used an HMD via which subjects saw a distant body. The HMD received video signals from cameras pointing toward the body. In the case of Lenggenhager et al. (2007), the distant body was a manikin with its back to the subject. The manikin was seen to be stroked on the back, which was felt on the subject’s back through synchronous stimulation by the experimenter. Subjects then had the strange illusion of being located at or drawn toward the manikin body to their front. In the case of Ehrsson (2007), the video cameras were pointed to the back of the subject’s own seated body. So from the perspective of the subject, they saw their own body from behind themselves. The experimenter synchronously stroked the subject’s real chest (out-of-sight) and visibly made similar strokes under the cameras. From the point of view of the subjects, they saw and felt stroking toward themselves (since their viewpoint was that of the stereo cameras), but they were apparently located behind their real body. Here, the visual and tactile information cohered to generate the illusion of being behind their own body. When the space under the camera was attacked with a hammer, participants responded physiologically (since the hammer would seem to be coming toward the illusory location of their chest). When the visual and tactile stimulation was asynchronous neither the illusion nor the physiological response occurred to the same extent.

Following this, a form of VR to study body ownership with respect to the whole body (full body ownership) was achieved by Petkova and Ehrsson (2008) through the use of video cameras mounted on top of a manikin that fed a stereo HMD worn by the participant, so that when participants looked down toward their real body, they would see the manikin body instead of their own. This was accompanied by visuotactile synchrony, induced by applying tactile stimulation to the real body synchronized with a corresponding visual stimulation to the manikin body. The result was a subjective illusion of ownership over the manikin body, demonstrated also by a physiological response when a knife threatened that body. The illusion diminished when visuotactile asynchrony was applied.

The use of VR to transform the body was first realized by Jaron Lanier in the late 1980s. The importance of this work for cognitive neuroscience was not realized at the time, and it was never published scientifically, although see Lanier (2006), and it is referred to in Lanier (2010). Lanier used the term “homuncular flexibility” to refer to the finding that the brain can adapt to different body configurations and learn how to manipulate such an alien body – for example, manipulating end-effectors of a body representation as a lobster by learning to use muscles in the stomach, or through combinations of different muscle activations. The extreme flexibility of the body representation had been studied in the 1980s by Lackner (1988). It was found that applying vibrations of around 100 Hz to a muscle tendon on the biceps leads the forearm to move in flexion, but if the movement is resisted, then there will be an illusion of movement of the forearm in the opposite direction (extension). Now suppose that both hands are holding the waist and such muscle spindle vibrations are applied. There is an illusion that both arms are extending, but since the hands are attached to the waist this is impossible. The way that the brain resolves this is to give the illusion of an expanding waistline! By vibrating on the other side of the muscle tendons the arms can be given the illusion of flexing – which will result in a shrinking waist illusion. Ehrsson et al. (2005) used these illusions with brain imaging to capture brain activation changes associated with these radical changes in the body. Tidoni et al. (2015) used these vibratory techniques in conjunction with VR as part of a developing program for the rehabilitation of disabled patients. This followed earlier work by Leonardis et al. (2012), who used such vibrations to induce illusory movements but in conjunction with a brain–computer interface (BCI) motor-imagery paradigm, i.e., the participant imagines moving their arm, feels their arm moving through application of the vibration technique, and then sees the corresponding virtual arm move. This was part of an Embodiment Station (discussed in Section 6.5).

Regarding non-human body configurations Ehrsson (2009) and Guterstam et al. (2011) showed, for example, that using the multisensory techniques associated with the RHI, it is possible to give participants the illusion of owning additional arms. Regarding body shape, Kilteni et al. (2012)¹⁷ showed that it is possible to have an illusion of ownership over an asymmetric human body, where one arm is three times as long as another, and where the participant responds by automatically withdrawing the arm when there is a threat to the distant hand. This illusion had first been implemented and experienced at VPL in the 1980s, although not published. Steptoe et al. (2013) showed how humans could adapt to having a tail, through embodiment using a Cave-like system, but seeing the virtual body from behind. Participants learned how to use the tail in order to avoid harm to the body. More recently, Won et al. (2015a) have continued to study homuncular flexibility, showing that people can learn to control virtual bodies through mappings that are different from the usual ones. Some implications of this across a range of fields have been discussed in Won et al. (2015b).

Returning to the RHI, Ijsselsteijn et al. (2006) found that an illusion of ownership can be attained over a 2D projection of an arm on a table top when the visuotactile synchronous stimulation is applied as in the RHI. Although the subjective illusion was reported, the proprioceptive drift effect did not occur. Using VR, Slater et al. (2008) showed that a virtual arm could be felt as owned by participants when seen to be stroked synchronously with the corresponding hidden real arm. This was achieved by a virtual arm being displayed on a powerwall as projecting (in stereo) out of the real shoulders of participants. A tracked wand was used to tap and stroke the participant’s hidden real hand, which was shown on the display as a virtual ball tapping the virtual hand. This was done synchronously in which case the full illusion of ownership occurred including proprioceptive drift, or asynchronously, which typically did not result in the illusion.

In the full body illusion setup of Petkova and Ehrsson (2008), there was no head tracking so that participants had to be looking down in a fixed orientation toward their body, in order to see the manikin body as substituting their real body. Slater et al. (2010b) carried out the first study of full body ownership using VR where participants saw a virtual body that was spatially coincident with their own and which they saw through a wide field-of-view stereo and head-tracked Fakespace Wide5 HMD.¹⁸ Hence, when they looked down toward themselves they saw a virtual body that substituted their actual (hidden) body and from the viewpoint of the eyes of that virtual body (coincident with their own). We refer to this as first-person perspective (1PP). The experiment also included visuotactile synchrony (they felt their arm being stroked in synchrony with seeing their corresponding virtual arm stroked) or visuotactile asynchrony. There was also a condition where the virtual body was seen from a third-person perspective (3PP) (i.e., the virtual body was not spatially coincident with the real body, but to the left of the participant’s location). In this setup, it was found that 1PP was clearly the dominant factor, although visuotactile synchrony had some contribution. Remarkably, the illusion occurred in spite of the fact that all the participants were adult males but were embodied in a young female body.¹⁹ The difference between the results of Petkova and Ehrsson (2008) and Slater et al. (2010b) was taken up by Maselli and Slater (2013). The vital importance of 1PP for body ownership was also emphasized by Petkova et al. (2011) and considered further by Maselli and Slater (2014).

One of the major advantages of VR in this context compared to using rubber hands or manikin bodies is that virtual limbs or the whole virtual body can be moved. Sanchez-Vives et al. (2010) exploited this to show that the illusion of ownership over a virtual arm can be induced by synchrony between real and virtual hand movements (visuomotor synchrony). Participants wearing a data glove that tracked the movements of their hand and fingers saw a virtual hand (projected in stereo 3D on a powerwall) move in synchrony or asynchrony with their real hand movements. This resulted in an illusion of ownership just as with visuotactile stimulation.

The same can be done for the body as a whole. Through real-time motion capture, mapped onto the virtual body, when the person moves their real body they would see the virtual body move correspondingly. Participants can see their virtual body moving by directly looking toward themselves and in virtual mirror reflections (and shadows) (Slater et al., 2010a). Kokkinara and Slater (2014) showed in later work that when there is a 1PP view of the virtual body then visuomotor synchrony is the more powerful inducer of the body ownership illusion than visuotactile synchrony.

We use the term virtual embodiment (or just embodiment) to refer to the process of replacing a person’s body by a virtual one. At a minimum, this requires a stereo HMD with a wide field-of-view (so that the person can actually see their virtual body) and head tracking. Additional multisensory correlations such as visuotactile and visuomotor synchrony may be included. A technical setup to achieve this is described in Spanlang et al. (2014). Under the right multisensory conditions (such as 1PP, visuotactile, and/or visuomotor synchrony), virtual embodiment may give rise to the illusion of body ownership, which is a perceptual illusion that the virtual body feels as if it is the person’s own body (even though it may look nothing like their real body).

There has been a lot of work on building virtual embodiment technology (Spanlang et al., 2013, 2014), studying the conditions that can lead to such body ownership illusions (Slater et al., 2008, 2009, 2010b; Sanchez-Vives et al., 2010; Borland et al., 2013; González-Franco et al., 2013; Llobera et al., 2013; Maselli and Slater, 2013, 2014; Pomes and Slater, 2013; Blom et al., 2014; Kokkinara and Slater, 2014) and exploring the effects of distortions away from the normal form of a person’s actual body (Slater et al., 2010b; Normand et al., 2011; Kilteni et al., 2012; Steptoe et al., 2013). There have also been studies on how illusions of body ownership might result in various changes to the real body.

For example, it had previously been shown that the RHI leads to a cooling of the real hand (Moseley et al., 2008) – though see also Rohde et al. (2013) – as well as an increase in its histamine reactivity (Barnsley et al., 2011). Cooling of several points on the body has also been reported in a 3PP full body illusion (Salomon et al., 2013). There is also evidence using VR suggesting that the 1PP full body ownership illusion can result in changes in temperature sensitivity (Llobera et al., 2013). It has also been shown that when the virtual hand is attacked during the full body illusion, there is an electrical brain response (EEG) that corresponds to what would be expected to occur when a real hand is attacked (González-Franco et al., 2013). Banakou and Slater (2014) showed that embodiment in a virtual body that is perceived from 1PP and that moves synchronously with the real body can result in illusory agency over an act of speaking. The virtual body was seen directly and in a virtual mirror. Participants spent a few minutes simply moving, with the virtual body moving synchronously with their movements in the experimental condition or asynchronously in another. At some moment, the virtual body unexpectedly uttered some words (45 in total) with appropriate lip sync. Those in the visuomotor synchronous condition later reported a subjective illusion of agency over the speaking – as if they had been the ones who had been speaking rather than only the virtual body. Moreover, when participants were asked to speak after this exposure, the fundamental frequency of their own voice shifted toward that of the higher frequency voice of the virtual body. Thus embodiment resulted in the preparation of a new motor plan for speaking, which was exhibited by participants in the synchronous condition changing the way that they spoke after compared to before the experiment. This did not happen for those in the asynchronous condition.

Thus, VR offers a very powerful tool for the neuroscience of body representation. For a recent review of this field, see Blanke et al. (2015). It can be used to do effectively and relatively simply what is impossible by any other means – instantly produce an illusion of change to a person’s body. In the next section, we consider some of the consequences of changing representations of the self.

2.1.2. Changing the Body Can Change the Self

“… one of the fundamental differences between virtual reality and other forms of user-interface is that you’re really present in it, your body is represented and you can react with it as you, … And the fact that you’re in it, and that you define yourself is really fascinating. Oftentimes, being able to change your own definition is actually part of a practical application. Like in the world we did last year, where an architect was designing a day care center and could change himself into a child and use it with a child’s body and run faster and have different proportions and all that.” Jaron Lanier (Barlow et al., 1990).

This quote is another illustration that much of what is being discussed today was already thought of and even implemented in the heady days of early VR. If VR can endow someone with a different body, what consequences does this have? We have already mentioned above that ownership over a rubber hand can lead to physiological responses. There is also some evidence that the real arm in the experiment can experience (very small) drops in temperature, that the same can occur over different parts of the body in a virtual whole body illusion, and that there may be a change in temperature sensitivity in the virtual arm illusion. But are there higher-level changes to attitudes, behaviors, even cognition?

Yee and Bailenson (2007) introduced a paradigm called the “Proteus Effect,” where it was argued that the digital self-representation of a person could influence their attitudes and behaviors in online and virtual environments. Essentially, the personality or type of body or the actions associated with the digital representation would influence the actual real-time behaviors of participants, both in the VR and later outside it. In their 2007 paper, they showed that being embodied in an avatar that had a face that was judged as more attractive than their actual one led participants to move closer to someone else displayed in a collaborative virtual environment than those participants whose avatar face was judged less attractive. Similarly, being embodied in taller avatars led to more aggressive behaviors in a negotiation task than being embodied in shorter avatars. These results also carried over to representations in online communities (Yee et al., 2009). Groom et al. (2009) embodied White or Black people in a Black or White virtual body, in the context of a scenario in which they were in an interview applying for a job. The embodiment was through an HMD with head tracking, with the body seen in a mirror, and lasted for just over 1 min. Using a racial Implicit Association Test (IAT) (Greenwald et al., 1998), they found after the exposure there was greater bias in favor of White for those embodied in the Black virtual body. This difference did not occur when participants simply imagined being in a White or Black body. Hershfield et al. (2011) studied the effect of embodiment in aged versions of themselves on their savings behavior. They embodied people in a virtual body that either had a representation of their own faces, or their faces aged by about 20 years. The virtual body was shown in a virtual mirror. They found some modest evidence in favor of the hypothesis that being confronted with their future selves influenced their behavior toward greater savings for the future. See also the example concerned with fostering exercise (Fox and Bailenson, 2009) in Section 3.1.2.

The theoretical basis of the Proteus Effect (Yee and Bailenson, 2007) is Self-Perception Theory [e.g., Bem (1972)], which suggests that people infer their attitudes by observing their own behaviors and the context in which these occur, and almost all the examples above do put people into behavioral situations. It has also been argued that attitudinal and behavioral correlates of transformed body ownership can be explained as people behaving according to how others would expect someone with that type of body to behave (Yee and Bailenson, 2007). Essentially, this comes down to stereotyping. For example, in the case of the racial bias study of Groom et al. (2009), participants were put into precisely a situation that is known to be one where there is implicit bias against Black people compared to White.

In an experiment by Kilteni et al. (2013), people were embodied either in a dark-skinned casually dressed (Jimi Hendrix-like) body or in a light-skinned virtual body. The body moved with visuomotor synchrony, but also there was synchronous visuotactile feedback through a drumming task, so that participants saw their virtual hands hit a virtual hand drum that was coincident in space with a real hand drum. Hence, when they hit the virtual drum they would also feel it.²⁰ In this experiment, those embodied in the dark-skinned casual body expressed significantly greater body movement while drumming than those embodied in the light-skinned body that was wearing a formal suit. This result occurred, in the view of the stereotype theory, because there is greater expectation that people who look more like Jimi Hendrix would be more bodily expressive. However, self-perception theory and stereotyping cannot account for attitudinal changes that have been observed in experiments where only the body changes, and there are no particular behavioral demands within the study. These results are better explained within the multisensory perception framework based on the research that has stemmed from the RHI.

Peck et al. (2013) carried out a racial bias study where participants were embodied for 12 min in either a Black body, a White one, a purple one, or no body at all. The body moved synchronously with real body movements of the participants through real-time motion capture and was seen directly by looking toward the self with the head-tracked HMD and in a mirror.²¹ Those in the “no body” condition saw a mirror reflection of a Black body, but which moved asynchronously to their own movements. A racial IAT was applied some days before the experience and then immediately after. It was found that average implicit racial bias significantly decreased only for those who had the Black embodiment. During the 12 min of exposure, the participant did not have any task except to move and to look toward themselves and in the mirror while doing so. The only events that occurred were that 12 virtual characters walked by, 6 of them Black and the others White. It is likely that the results are different from Groom et al. (2009) because of the much longer exposure time, the full body synchronous movement, and the fact that there was no task, so that this was only based on body ownership through multisensory perception. Given the contrary earlier result of Groom et al. (2009), it was hard to believe that just 12 min of this experience could apparently reduce implicit racial bias. However, independently it was shown by Maister et al. (2013) that the RHI over a black rubber hand also leads to a reduction of implicit racial bias in light-skinned people. For a review of this area of research see Maister et al. (2015). Recent results demonstrate that the decrease in implicit bias lasts for at least 1 week after the exposure (Banakou et al., 2016).

van der Hoort et al. (2011) showed, using the multisensory techniques of Petkova and Ehrsson (2008), that when average-sized adults have an illusion of body ownership over smaller or larger manikin bodies, this results in changes in their perception of object sizes (objects seem larger in a small body, but smaller in a large body). Banakou et al. (2013) reproduced this result in immersive VR.²² They showed that the illusion of body ownership of adults over a small body leads to overestimation of object sizes. However, if the form of the body represented that of a (4-year-old) child then the size overestimation was approximately double that compared to when the form of the body was an adult body but shrunk down to the same size as the child. Moreover, in the child embodiment case, there were changes in implicit attitudes about the self toward being child-like substantially beyond changes induced by the illusion of ownership of the adult-shaped body of the same size. In other words, only the form of the body (child-like compared to adult-like) has this effect.

The child and racial bias studies relied on an IAT – e.g., Greenwald et al. (1998) – a reaction time measure where participants have to quickly associate between two target concepts (e.g., Black and White people) and an attribute (e.g., Positive and Negative). When the concept and attributes must be simultaneously selected (e.g., when deciding if a stimulus matches White or Black but where each is also associated with Positive or Negative), then a faster choice in pairing say Black and Negative and White and Positive, compared to Black and Positive with White and Negative, would indicate an implicit racial bias. Such implicit bias is found notwithstanding the explicit attitudes of people, which may not be discriminatory, there being a dissociation between implicit and explicit bias (Greenwald and Krieger, 2006). Indeed, in the explicit racial attitudes test in Peck et al. (2013) there was no evidence of explicit racial bias – although there was implicit racial bias shown in the preexperiment IAT. When it comes to discriminatory behavior, the IAT results have better predictive power in social interaction than explicit measures (Greenwald et al., 2009) – for example, with respect to eye contact, proxemics, and hiring practice (Ziegert and Hanges, 2005; Rooth, 2010). Even though the use and interpretation of the IAT may be controversial, there is evidence supporting its explanatory and predictive power (Jost et al., 2009).
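The reaction-time logic of the IAT can be illustrated with a simplified score. The sketch below is an assumption-laden illustration and not the scoring procedure used in the cited studies: it compares mean latencies between the two pairing blocks and divides by the standard deviation of all latencies pooled together. The block names, the millisecond values, and the function name are hypothetical.

```python
from statistics import mean, stdev


def iat_effect(congruent_ms, incongruent_ms):
    """Return (raw_difference_ms, standardized_difference).

    congruent_ms:   latencies when Black shares a key with Negative and
                    White shares a key with Positive.
    incongruent_ms: latencies when Black shares a key with Positive and
                    White shares a key with Negative.
    A positive value means faster responses in the congruent pairing,
    which the text describes as indicating an implicit racial bias.
    """
    raw = mean(incongruent_ms) - mean(congruent_ms)
    # Sample standard deviation over all trials from both blocks
    # (a simplification chosen for this sketch).
    pooled_sd = stdev(list(congruent_ms) + list(incongruent_ms))
    return raw, raw / pooled_sd


# Hypothetical latencies in milliseconds, invented for illustration.
congruent = [600, 650, 700]
incongruent = [750, 800, 850]

raw, standardized = iat_effect(congruent, incongruent)
print("raw difference (ms):", raw)
print("standardized difference:", round(standardized, 3))
```

In a pre/post design such as the one described for Peck et al. (2013), a score of this kind would be computed before and after the exposure and the two values compared.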

With respect to embodiment in a child body, it is known that perception from the perspective of a smaller body results in size overestimations (van der Hoort et al., 2011), and indeed this occurred for both the adult and child conditions in Banakou et al. (2013). However, this does not explain why the overestimation in the child condition was almost double that of the adult condition. Since we have all been children it is possible that the brain relies on autobiographical memory thus making the world appear larger, and more rapidly finds associations between the self and child-like categories. However, with respect to the racial bias study (Peck et al., 2013), none of the participants had ever had dark skin, and yet 12 min of exposure was enough to significantly change their IAT score away from indications of bias. How is this possible? Our answer suggests that the body ownership and agency over the virtual body is more than a superficial illusion, and that it goes beyond the perceptual to influence cognitive processing. It was argued in Banakou et al. (2013) and Llobera et al. (2013) that a fundamental mechanism may be through the postulated “cortical body matrix” (Moseley et al., 2012), which maintains a multisensory representation of the space immediately around the body in a body-centered reference frame. The system is responsible for homeostatic regulation of the body, and for dynamically reconstructing the body representation moment to moment based on current multisensory information. It was argued that if, as seems likely, such a system exists, it then operates globally in a hierarchical top-down fashion, so that attribution of the whole body to the self leads to attribution of the body parts to the self. Moreover, it was proposed that it also maintains an overall consistency between the multifaceted aspects of self (personality, attitudes, and behaviors) and the body representation. We can view IAT changes as direct evidence of this – changing the body apparently leads to changes in implicit attitudes. We can say that as well as body ownership over a different body leading to changes in implicit attitudes, the documented changes in implicit attitudes are a very strong signal that in fact there has been a change in body ownership. A further study also hints at the likelihood that a change in body ownership can also result in cognitive changes (Osimo et al., 2015), where it was shown that swapping bodies with (virtual) Sigmund Freud led to an improvement in mood after a self-counseling process.²³

The use of embodiment and the transformative power that it seems to have is a fundamental feature that separates immersive VR from other types of system, and recent scientific results do back up the statement by Jaron Lanier in the quote at the head of this section, made a quarter of a century ago.

2.1.3. Spatial Representation and Navigation

Virtual reality is especially suitable for the study of spatial representation and spatial navigation. This is at the core of the use of VR: to break down the walls of our room, to transport us to another space, a space that we can explore with or without moving (see Section 6). Spatial navigation is useful for a number of areas and purposes: for learning to navigate a certain model space such as a foreign city to be visited, for rehabilitation of spatial abilities after a neurological disorder or brain injury that affected this function, for neuroscience research (to understand the basis of spatial cognition, memory, and sensory processing), for city design, or to treat post-traumatic stress disorder (PTSD) associated with a space, among others.

We may want to move around the city of Paris and to become oriented before we travel to the real city. Or we do not plan to go, and we just want to visit virtual Paris. First of all, how do we move around the city? We can move with a joystick. This allows us to navigate easily from our couch, for example. However, this method may not be optimal if we are planning to internalize, to “learn” the spatial map of Paris, which is better achieved if we move our bodies, since this then enhances theta frequencies in the hippocampus (Kahana et al., 1999). We can also navigate by walking-in-place (Slater et al., 1995; Usoh et al., 1999). Another technique for moving through distances that are greater than the physical space in which the participant can move is called “redirected walking,” where, for example, the system takes advantage of participant head turns to rotate the environment more than the head turn – in this way giving people the impression that they had walked in a long straight line when in reality they had walked in a curve or vice versa (Razzaque et al., 2001, 2002), research that is ongoing, e.g., Suma et al. (2015). Or, we could eventually navigate by thought alone if the VR is connected to a BCI (Pfurtscheller et al., 2006). This is an excellent possibility for patients who are completely immobilized since they can feel the freedom of navigating by thought, an experience very positively evaluated by users (Friedman et al., 2007; Leeb et al., 2007) (see Section 6.5).
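
The gain-based idea behind redirected walking can be illustrated with a minimal sketch. The gain value, the function names, and the circular-arc model of the physical path are assumptions made here for illustration; they are not taken from Razzaque et al. (2001, 2002) or any specific system.

```python
import math

def virtual_yaw(real_yaw_deg, rotation_gain=1.2):
    """Virtual heading change displayed for a given real head turn.

    rotation_gain > 1 rotates the environment more than the head turn
    (the default of 1.2 is a hypothetical value, not a published threshold).
    """
    return real_yaw_deg * rotation_gain

def physical_endpoint(virtual_distance_m, arc_radius_m):
    """Physical (x, y) position after walking a virtually straight path
    while being steered along a circular arc of the given radius.

    Assumes the walked path length equals the virtual distance.
    """
    angle = virtual_distance_m / arc_radius_m  # radians swept along the arc
    x = arc_radius_m * math.sin(angle)
    y = arc_radius_m * (1.0 - math.cos(angle))
    return x, y

def physical_displacement(virtual_distance_m, arc_radius_m):
    """Straight-line distance between physical start and end points."""
    x, y = physical_endpoint(virtual_distance_m, arc_radius_m)
    return math.hypot(x, y)

if __name__ == "__main__":
    print(virtual_yaw(90.0))                  # virtual turn for a 90 degree head turn
    print(physical_displacement(20.0, 5.0))   # physical displacement for 20 m of virtual walking
```

Under these assumptions, a participant who experiences walking 20 m in a straight line ends up much closer to the physical starting point than 20 m, which is what allows a large virtual space to fit into a small tracked area.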

Understanding the brain mechanisms that underlie the generation of internal maps of the external world, the storage (or memory) of these maps, and their use in the form of navigation strategies is an important field in neuroscience (note that the Nobel Prize in Physiology or Medicine 2014 was shared, one-half awarded to John O’Keefe, the other half jointly to May-Britt Moser and Edvard I. Moser “for their discoveries of cells that constitute a positioning system in the brain,” known as “place cells” and “grid cells”). Many of the associated studies have been carried out in rodents that were navigating in laboratory mazes. But how can we study navigation in humans? VR navigation has been found to provide a consistent, sensitive method for the study of hippocampal function (Gould et al., 2007). The hippocampus is the main brain structure supporting spatial representation, a structure that is larger than average in London taxi drivers, who are famous for learning the map of London in great detail (Maguire et al., 2000). Virtual cities have been used to determine, for example, that we activate different parts of the brain when we do wayfinding versus route following (Hartley et al., 2003), and to identify spatial cognition deficits in disorders such as depression (Gould et al., 2007) or Alzheimer’s disease (Cushman et al., 2008).

Even though the brain processes underlying spatial navigation in rodents used to be studied in real mazes, in recent years VR for rodents has also become a valuable tool in basic research in neuroscience. This technique allows navigation of virtual spaces while the animals walk in place on a rotating ball, such that their head is stable and their brain can be visualized while they do spatial tasks (Harvey et al., 2009). Even more recent VR systems for rodents allow 2D navigation including head rotations, resulting in the activation of all the same brain mechanisms that had been identified for freely moving animals, while the animals remain static and walking-in-place (Aronov and Tank, 2014). This approach allows detailed observation of specific brain cells during navigation.

Since navigation in virtual space can activate the same brain mechanisms as navigation in the real world, spatial “presence” can be successfully generated (Brotons-Mas et al., 2006; Wirth et al., 2007). The illusory sensation of spatial presence allows the recreation of all the sensations associated with a particular place by using VR, which is useful in order to treat PTSD associated with a space. This has been widely used with soldiers who had been in Iraq and Afghanistan (Rizzo et al., 2010). Virtual spaces such as virtual Iraq, and in particular virtual navigation, have also been used for assessment and rehabilitation following traumatic brain injury, a lesion also frequent in soldiers (Reger et al., 2009). Assessment tasks and training tasks for rehabilitation often go hand in hand, and thus retraining in topographical orientation, wayfinding, and spatial navigation in VR is often used in cognitive rehabilitation following traumatic brain injury or neurological disorders (Bertella et al., 2001; Koenig et al., 2009; Kober et al., 2013). Furthermore, it has been proposed that sustained experiential demands on spatial ability carried out in VR protect hippocampal integrity against age-related decline (Lovden et al., 2012).

Virtual reality can be used to study the strategies that humans use for spatial navigation, which reveals the underlying geometry of cognitive maps. These maps could have a Euclidean structure preserving metrics and angles or a topological graph structure. To study this, experiments in the VENLab²⁴ (Rothman and Warren, 2006; Schnapp and Warren, 2007) included a large area that allowed tracked displacements while in VR. A virtual environment representing a virtual hedge maze allowed identification of the location of certain landmarks. By creating two “wormholes” that rotate and/or translate a walker between remote places in the virtual hedge maze, they made the space non-Euclidean, in order to explore the navigational strategies used by different subjects. This is a good example of how VR can be used in this domain to achieve things that are impossible in reality.
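
One way to picture such a wormhole is as a rigid transform applied to the walker's virtual pose when a portal is crossed. The following is a minimal sketch under that assumption; the portal coordinates, the rotation angle, and the function name are hypothetical and do not describe the VENLab implementation.

```python
import math

def through_wormhole(position, heading_deg, entry, exit_, rotation_deg):
    """Map a walker's 2D position and heading from an entry portal to an exit portal.

    The walker's offset from the entry portal is rotated by rotation_deg and
    re-applied at the exit portal; the heading is rotated by the same angle.
    """
    dx = position[0] - entry[0]
    dy = position[1] - entry[1]
    t = math.radians(rotation_deg)
    rx = dx * math.cos(t) - dy * math.sin(t)
    ry = dx * math.sin(t) + dy * math.cos(t)
    new_position = (exit_[0] + rx, exit_[1] + ry)
    new_heading = (heading_deg + rotation_deg) % 360.0
    return new_position, new_heading

if __name__ == "__main__":
    # A walker 1 m past a portal at the origin is moved to a remote portal
    # at (10, 5) and rotated by 90 degrees (all values are illustrative).
    print(through_wormhole((1.0, 0.0), 0.0, (0.0, 0.0), (10.0, 5.0), 90.0))
```

Because two remote places become adjacent for the walker, distances and angles measured along paths through the portal can no longer all be satisfied by a single Euclidean map, which is the property such experiments exploit.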

The study of navigation and wayfinding in VR has a long history. A good starting point for those interested in following this up is the special issue of the journal Presence – Teleoperators and Virtual Environments, edited by Darken et al. (1998). There is a difference between techniques for navigating effectively within a virtual environment, and the extent to which learning wayfinding through a space in a virtual environment transfers to real-world knowledge. Darken and Goerger (1999) pointed out that while the use of VR seems to produce the best results in terms of acquiring spatial knowledge of a terrain, when it comes to actual performance VR training often does not transfer, and can even make the situation worse. The authors, based on a number of studies, concluded that using specific VR techniques (e.g., a virtual compass) and relying on specific virtual imagery during the learning process does not transfer well to real-world wayfinding. However, those who use the VR to rehearse what they will later do in reality, to make a plan, without relying on detailed cues but rather transferring their experience into more abstract spatial knowledge do a lot better. Ruddle et al. (1999) carried out a direct comparison between navigation on a desktop system compared to a head-tracked HMD. They found that although there were no differences in task performance between the two systems in the sense of measuring the distance traveled, the HMD users stopped more frequently to look around the scene and were able to better estimate straight line paths between waypoints. On the other hand, those using the desktop system seemed to develop a kind of tunnel vision. 
This difference between the two illustrates that in immersive VR there is generation of the types of kinesthetic and proprioceptive cues, i.e., body-centered perception – contributing to what we referred to earlier as natural sensorimotor contingencies for perception – that improve the chance of transfer of knowledge to real-world task behavior. Ruddle and Lessels (2009) carried out a further study where they compared navigation task performance in a virtual environment under three different conditions: (1) a desktop interface, (2) an HMD that was tethered, so that although participants could look around, they could not walk, and (3) a wide area tracking system that allowed participants to really walk. They found in both of their reported experiments (which differed in the rendering style of the environment) that those who were able to really walk outperformed the other two groups. See also Ruddle et al. (2011b). In fact, it was later found that walking (in this case enabled through an omnidirectional treadmill) clearly resulted in improved cognitive maps of the space compared to other methods (Ruddle et al., 2011a, 2013) as predicted by Brotons-Mas et al. (2006). In this context, it is worth noting that when comparing presence in a virtual environment through a head-tracked HMD, using (1) point-and-click techniques, (2) walking-in-place where the body moves somewhat like walking but not actually walking, and (3) real walking using wide area tracking, Usoh et al. (1999) found that subjectively reported PI (the component of presence referring to the sense of “being there”) was greater for both types of walking compared to the point-and-click technique. On some presence measures, real walking was preferred to walking-in-place, and as would be expected, real walking was the most efficient form of navigation.

A recent study by Sauzéon et al. (2015) used a powerwall-based VR system to test the effect on episodic memory of a virtual apartment. Participants had two methods for navigation through the apartment, either passively watching or using a joystick to actively explore. It was found that episodic memory was superior in the active condition. A similar setup using a virtual model of the city of Tübingen was shown to be advantageous in helping stroke patients to recover some wayfinding ability (Claessen et al., 2015).

In a very famous experiment, Held and Hein (1963) took 10 pairs of neonatal kittens and arranged that one kitten of each pair navigated an environment by actively moving around it, while the second was carried along passively in a basket by the movements of the first. They found that the kittens that were passively moved around, although in principle subject to the same visual stimuli as the active ones, developed significant visual-motor deficits. The authors concluded that “self produced movement with its concurrent visual feedback is necessary for the development of visually-guided behavior.” A similar observation was obtained in rats while walking versus being driven in a toy car (Terrazas et al., 2005), while simultaneous brain recordings were obtained, and the spatial information carried per neuronal spike in place cells was found to be smaller in the passive navigation. This type of finding fits very well with findings in human studies in virtual environments. The conclusion from these studies is that simply putting someone in a VR in order to learn a particular environment can be effective provided that the form of locomotion includes active control by the participant. Concomitant with our views that the most important factor behind PI is the affordance by the system of perception through natural sensorimotor contingencies, the more that the whole body can be involved in the process of locomotion, the better the result in transfer to the real world, and the formation of cognitive maps.

This is a vitally important area of research, and above we have only scratched the surface. As VR becomes used on a mass scale, one of its most frequent uses will probably be for virtual travel. If people simply use VR to observe an environment then the form of interface for navigation does not matter much, other than adhering to good user interface principles suitable for VR; of greater interest are the sights and sounds encountered. However, if people want to use it for rehearsal, to learn about how to get from A to B, then they had better use a form of body-centered interface, at least equivalent to walking-in-place, but preferably one of the new generation of treadmill interfaces that are currently in development.

2.2. Scientific and Data Visualization

Immersive VR visualization and interaction with data is relevant for scientific evaluation and also in the fields of training and education. It also allows an active interaction with the representations, e.g., in drug design (see below). We can walk through brains²⁵,²⁶ or molecules, and we can fly through galaxies. The requirements and level of interaction will vary depending on whether this “walk” is for professional use, for students, or for the general public. Immersion in the data could take place alone or in a shared environment, where we explore and evaluate with others. The data could be static, or we could be immersed in dynamic processes. The data should be viewable in multiscale form.

Three-dimensional representation of real or modeled data is important for understanding data and for decision-making following this understanding, a relevant topic for a number of fields, especially at this time of exponentially growing datasets. Even when most of the analysis tools are computer-run algorithms, human vision is highly sensitive to patterns, trends, and anomalies (van Dam et al., 2002). There is a substantial difference between looking at 3D data representations on a screen and being immersed in the data, navigating through it, interacting with it with our own body, and exploring it from the outside and the inside. It is logical to expect that when VR commercial systems are pervasive, there will be a trend for currently used 3D data representations on a flat screen to be visualized in immersive media. This, along with the body-tracking systems, will allow a more natural interaction with the data. The extent to which this interaction with data goes further than the “cool” effect and adds real value to the comprehension, evaluation, and subsequent decisions taken as a result is an important issue to explore. It is also important to identify ways to maximally exploit the potential of this data immersion capability.

Specific examples of VR for data visualization include molecular visualization and chemical design. In a recently described system called the “Molecular Rift,” the immersive 3D visualization of molecules is combined with interaction with molecules based on gesture-recognition (Norrby et al., 2015). In this version, participants were immersed into protein–ligand complexes. The system was evaluated by groups with experience in medical chemistry and drug design, and the study was focused on the improvement of the user-interaction with the molecules based on gestures and not in the evaluation of improved performance of drug design or specific tasks. Out of 14 users, all of them found the system potentially useful for drug design, and they enjoyed using it, while none experienced motion sickness.

A more specific task in interaction with molecules was tested by Leinen et al. (2015). In this study, a task of manipulating nanometer-sized molecular compounds on surfaces was tested under usual scanning probe microscopy versus immersive visualization through an Oculus Rift HMD. The hand-controlled manipulation for extracting a molecule from a surface was improved by the visual feedback provided by immersive VR visualization: preestablished 3D trajectories were followed with higher precision, and deviations from them were better controlled in the immersive than in the non-immersive system (Leinen et al., 2015).

Moving from the nanoscale to the microscale, a specific task consisting of the evaluation of the spatial distribution of glycogen granules in astrocytes (glial cells, a type of brain cells) was evaluated in an immersive environment in a Cave-like system (Cali et al., 2015). A section of the hippocampus of 226 μm³ at a voxel resolution of 6 nm was 3D reconstructed based on electron microscopy image stacks. A set of procedures and software was developed to allow such immersive reconstruction. The distribution of glycogen granules initially appeared to have a random distribution, but they were discovered to be grouped into clusters of various sizes with particular spatial relationships to specific tissue features. The authors found the immersive evaluation of the 3D structure to be pivotal to identify such non-random distribution (Cali et al., 2015). The use of an interactive VR room also allowed multiple users to share and discuss the evaluation of the cellular details. In this study there were, however, no comparisons between task performance across different display media.

A comparison across three different media – 3D reconstructions rendered on (1) a monoscopic desktop display, (2) a stereoscopic visual display on a computer screen (fishtank), and (3) a Cave-like system – was carried out by Prabhat et al. (2008). In this study, confocal images of Drosophila data (the egg chamber, the brain, and the gut) were evaluated by subjects who had to describe or quantify specific features mostly related to spatial distribution or colocalization and geometrical relationships. A more immersive environment was preferred qualitatively by subjects, and task performance was also superior.

Immersive VR is of great value for surgery training, an aspect that is developed in Section 2.4 where specific examples are described. Visualization of the human body from an immersive perspective can provide medical students an unprecedented understanding of anatomy, being able to explore the organs from micro to macro scales. Furthermore, immersive dynamic models of body processes in physiological and pathological conditions would result in an experience of “immersive medicine.”

Large-scale coordinated efforts to understand the brain are under way in projects such as the European Human Brain Project²⁷,²⁸ and the BRAIN²⁹ Initiative of the United States. These projects are generating detailed multiscale and multidimensional information about the brain. Immersive VR will have a role in the visualization of these brain reconstructions or of the simulations built based on the experimental data. The Blue Brain Project (predecessor of the Human Brain Project) has already generated a full digital reconstruction of a slice of rat somatosensory cortex with 31,000 neurons based on real neurons, and 37 million synapses (Markram et al., 2015). This simulation generates patterns of neuronal activity that reproduce those generated in the brain and is amenable to immersive exploration of the structure and function of the brain.

Considering now a larger spatial scale, astronomical visualization in immersive VR has also been explored, both for professional and educational purposes (Schaaff et al., 2015). These authors represented high-resolution simulations of re-ionization of an Isolated Milky Way-M31 Galaxy Pair, with various different representations. It is interesting for education that information can be added to the immersive displays.

There is an exciting perspective in the scientific and data visualization area that will open new doors to our understanding. It will be important to evaluate the extent to which immersion and interaction with data results in a more thorough, intuitive, and profound understanding of structures and processes. But in any event, once this route is open, visualization of 3D models on a flat screen will feel like watching Star Wars on a small black and white TV (see Presentation S1 in Supplementary Material).

2.3. Education

Isaac Asimov’s novels Fantastic Voyage (1966),³⁰ based on the movie of the same name,³¹ and Fantastic Voyage II: Destination Brain (1987)³² portrayed a situation with humans shrunk to microscopic scale entering into the body of a patient. VR and the detailed human body scans that now exist make this possible (of course in virtual reality). McGhee et al. (2015) have used the “fantastic voyage” approach to support education of stroke patients about their condition by allowing them to move through a brain representation using the Oculus Rift HMD.

The area of application of VR in education is vast. For recent reviews, see Abulrub et al. (2011), Mikropoulos and Natsis (2011), Merchant et al. (2014), and Freina and Ott (2015). There are several reasons why VR is an excellent tool for education. First, it can change the abstract into the tangible. This could be especially powerful in the teaching of mathematics. For example, Hwang and Hu (2013) suggest that the use of a collaborative virtual environment has advantages for students learning geometrical concepts compared to traditional paper and pencil learning. However, it is not completely clear which type of VR system was used, although it appears to be of the desktop variety. Kaufmann et al. (2000) describe an HMD-based augmented reality system that provides a learning environment for spatial abilities including concepts from vector algebra. They provide anecdotal evidence for the effectiveness of the method. Roussou (2009) reviews the teaching of mathematics in VR using a “virtual playground”³³,³⁴ and in particular describes an experiment on learning how to compare fractions by 50 children of between 8 and 12 years in a Cave-like system (Roussou et al., 2006). In a between-groups experiment, there were three conditions – children who learned using active exploration of the scenario (n = 17), those who used the virtual playground but who learned by passively observing a friendly virtual robot (n = 14), and another group who did not use VR but rather a Lego-based method (n = 19). Quantitative analysis of the results found no advantage to any system. A detailed qualitative analysis, however, suggested that the passive VR condition tended to foster a reflective process among the children, and great enjoyment in interacting with the robot, associated with better understanding.

The second advantage of VR in education is, notwithstanding the results of the virtual playground experiment, that it supports “doing” rather than just observing. One example of this is surgical training (see Section 2.4): one review emphasizes how VR is increasingly used in neurosurgery training (Alaraj et al., 2011), ideally in conjunction with a haptic interface (Müns et al., 2014). Indeed, a European consensus program for endoscopic surgery VR training has been designed and agreed (van Dongen et al., 2011). For an example in engineering learning see Ewert et al. (2014).

The third advantage is that it can substitute for methods that are desirable but practically infeasible even if possible in reality. For example, if a class needs to learn about Niagara Falls one week, the Grand Canyon the next, and Stonehenge³⁵ the week after, it is infeasible for the class to visit all of those places. Yet, virtual visits are entirely possible, and such environments have been under construction (Lin et al., 2013) including the idea of virtual field trips (Çaliskan, 2011). It has certainly been suggested that immersive VR will change the nature of field trips,³⁶ and although there have been plenty of inventive demonstrations³⁷,³⁸,³⁹ it seems that as yet there have been no studies of the effectiveness of this, although perhaps it is so obviously advantageous that formal studies may be unnecessary.

The fourth advantage of VR in education involves breaking the bounds of reality as part of exploration. Examples include experiencing how activities such as juggling would change if there were a small change in gravity, how it would be to ride on a light beam, or what a universe would be like if the speed of light were different. These ideas were envisaged and implemented for VR by Dede et al. (1997); however, there has been no more recent follow-up, which could now occur given greater availability of VR equipment.

In this article, we have emphasized that the real power of VR is that it enables approaches that go beyond reality in a very fundamental way – more than just exploring strange physics. An example of this in the field of education was provided by Bailenson et al. (2008), concerned with the delivery of teaching rather than the content. In a collaborative virtual environment, it is possible to arrange the virtual classroom so that every student is at the center of attention of the teacher, and where the teacher has feedback about which students are not receiving enough eye gaze contact. Additionally, virtual colearners who could be either model students or distracting students can influence learning, and the results overall showed that these techniques do improve educational outcomes. Bailenson and Beall (2006) referred to this type of technique as “transformed social interaction.”

Overall, for the reasons we have given, and no doubt others, VR is an extremely promising tool for the enhancement of learning, education, and training. We have not mentioned other possibilities such as music or dance, or various dexterous skills, but for these areas VR has clearly great potential.

2.4. Surgical Training

Within the area of VR for training, surgical training has been a thoroughly investigated field (Alaraj et al., 2011). The use of simulations in surgical planning, training, and teaching is highly necessary. To give an illustrative example of why VR is necessary for surgery: interventional cardiology has currently no other satisfactory training strategy than learning on patients (Gallagher et al., 2005). It seems that acquiring such training on a virtual human body would be a better option.

In the training of medical students and in particular of surgeons, there is a relevant potential role for VR as a tool to learn anatomy through virtual 3D models. Even though there are studies trying to evaluate how useful VR can be to improve the learning of anatomy (Nicholson et al., 2006; Seixas-Mikelus et al., 2010; Codd and Choudhury, 2011) – including studies proposing that VR could replace the use of corpses in medical school – fully immersive and interactive systems have hardly been used up to now. Most of the 3D models used so far are for screen displays. Still, even the visualization of non-immersive 3D body models to study anatomy yields good results for learning, and therefore this is an area that should expand in the future, integrating fully immersive systems and different forms of manipulation and interaction of the trainees with the body models.

One of the first publications of VR in the field of surgery was on VR-hepatic surgery training, and the words “Surgical simulation and virtual reality: the coming revolution” were in the title of both the article (Marescaux et al., 1998) and the editorial (Krummel, 1998) in the Annals of Surgery nearly 20 years ago. However, the revolution has not happened yet, although the field is now ready for this possibility.

Surgical training in VR requires a combination of haptic devices and visual displays. Haptic devices transmit forces consisting of both the forces exerted by the surgeon and a simulation of the forces and resistances of the various body tissues. A critical question is whether the skills acquired in a virtual training are successfully transferred to the real world of surgery. Seymour et al. (2002), in a highly cited article, provide one of the first demonstrations that this is the case. The performance of laparoscopic cholecystectomy gallbladder dissection was found to be 29% faster for VR-trained versus classically trained surgeons, while errors were six times less likely to occur in the VR-trained group. The system used, though (Minimally Invasive Surgical Trainer-Virtual Reality – MIST VR system – Mentice AB, Gothenburg, Sweden), was a 2D representation on a screen of a haptic system used for simulated surgery. These results are likely to improve with a more immersive system. To illustrate the value given to surgical training in VR, an FDA panel voted in August 2004 to make VR simulation of carotid stent placement an important component of training. In the same month, the Society for Cardiovascular Angiography and Interventions, the Society for Vascular Medicine and Biology, and the Society for Vascular Surgery all publicly endorsed the use of VR simulation in carotid stent training (Gallagher and Cates, 2004).

The most common uses so far of VR for surgical training have been those of laparoscopic procedures (Seymour et al., 2002), carotid artery stenting (Gallagher and Cates, 2004; Dawson, 2006), and ophthalmology [Eyes Surgical, based on Jonas et al. (2003)]. In general terms, a large number of studies – out of which only a few seminal ones are cited here – coincide in finding positive results of VR training.

Most of the systems mentioned above concentrate on the local surgical procedure, e.g., how to place a stent or dissect the gallbladder. However, the reality in a surgery room is more complex, and the surgery may need to be performed in situations where the patient’s physiological variables are not stable, or there can be a hemorrhage, or even a fire in the surgical theater. The response of the surgical team to these situations will be critical for the well-being of the patient, and immersive VR should be an optimal frame for such training. VR can embed the specific surgical procedure, for example, the placement of the carotid stent, into various contexts and under a number of emergency situations. In this way, during training, not only the contents but also the skills and the experience of being in a surgery room for many years can be transmitted to the trainees, who can include not only surgeons but all the healthcare personnel, each in their specialized roles.

There is a huge explosion of research in the effectiveness of VR-based training for surgery including meta-analyses and reviews (Al-Kadi et al., 2012; Zendejas et al., 2013; Lorello et al., 2014), transfer of training (Buckley et al., 2014; Connolly et al., 2014), and many specialized applications (Arora et al., 2014; Jensen et al., 2014; Singh et al., 2014). This is likely to be a field that expands considerably.

3. Physical Training and Improvement

Here, we broadly address issues relating to physical training and improvement through sports and exercise, an area of growing interest to professional sports.

3.1. Sports

In the 1990 SIGGRAPH Panel (Barlow et al., 1990), Jaron Lanier mentioned the idea of being able to play table tennis (ping-pong) with a remote player using networked VR. Of course this is now possible⁴⁰ and is certain to be readily available in the near future. For example, a version has been implemented using two powerwall displays plus tracking for each player (Li et al., 2010). However, the opponent need not be a remote player in a shared VR but may be a virtual character. Immersive VR, at least with hand tracking if not full body tracking, has ideal characteristics for playing table tennis or other competitive sports, with the possible advantage of not having to spend time traveling to the gym.

There are several areas where VR can provide useful advantage for sport activities. First, for leisure and entertainment reasons – such as the table tennis example above. Second, for learning, training, and rehearsal. To the extent that VR supports natural sensorimotor contingencies at high enough precision, it could be used for these purposes. However, here it would be important to carry out rigorous studies to check whether small differences between the VR version and the real version might lead to poor skills transfer, or incorrect learning. For example, learning to spin or slam in table tennis requires very fine motor control depending on vision, proprioception, vestibular feedback, tactile feedback, force feedback, even the movement of air, and the sound of the ball hitting the table and the bat. Hence, building a virtual table tennis system that is useful for skill acquisition or improvement requires taking into account all of these factors, or the critical ones if these are known. On the other hand, virtual table tennis could be thought of as a game in its own right and nothing much to do with the real thing. In this case, virtual table tennis would fall under the first category – entertainment and leisure. Additionally, as we will see in Section 6.3 in the context of acting rehearsal, although VR misses fine detailed facial expression that is critical for successful acting, it is nevertheless useful for that aspect of rehearsal known as “blocking,” which is concerned more with overall spatial configuration of the actors in the scenario. Similarly, even without being able to reproduce all the fine detail necessary for the transfer of training skills to reality, VR may be useful in team sports to plan overall strategy and tactics. A third utility of VR in sports is for rehabilitation following injury. We will briefly consider some of these areas.

In a comprehensive review of VR for training in ball sports, Miles et al. (2012) analyze eight challenges: effective transfer of training; the types of skills best learned in VR; the technologies that result in the best quantifiable performance measures; the conditions under which stereoscopic displays should be used, given that they have both advantages and disadvantages (e.g., vision is not the same as in real life); the role of fidelity, and to what extent and under what conditions it is important; what kind of feedback should be delivered to the learner, and how and when it is appropriate; the effectiveness of teaching motor skills in the inevitable presence of latency and inaccuracies of representation; and finally, cost. The review points out several hurdles that must be overcome. For example, in training for field games such as American Football or soccer, the area of play is huge compared to the effective space in which someone in a VR system can typically move. A play on a field may involve running 25 m, whereas the effective area of tracking is, say, 2 m around a spot where the participant in VR must stand. Clearly, using a wand to navigate or even a treadmill may miss critical aspects of the play (see also Section 2.1.3 for a brief discussion of different methods of moving through a large virtual environment). The paper reports many such pitfalls that need to be overcome and points out that studies have been inconclusive and that there is therefore a need for more research.

Craig (2013) reviews how VR might be used to understand perception and action in sport. She argues that VR offers some clear advantages for this and gives a number of examples where it has been successful, as well as pointing out problems. However, she wonders why, if it is successful, it has not been widely used in training up to now, with reliance instead on alternatives such as video. She points out that one problem has been cost, though this is likely to be ameliorated in the near term. A second problem is to effectively and differentially meet the needs of players and coaches; she notes how VR action replays could be seen from many different viewpoints, including those of the player and of the coach, so that different relevant learning would be possible. Another advantage of VR would be to train players to notice deceptive movements in opponents, by directing attention to specific moves or body parts that signal such intentions. However, she points out, as mentioned above, how critical it is to provide appropriate cues to avoid mislearning.

Ruffaldi et al. (2011) examined the theoretical requirements for successful training transfer in the context of rowing and described a haptic-enabled VR system with a single large screen for visual feedback. Rauter et al. (2013) described a different VR simulator for rowing. This was a Cave-like system enhanced with auditory and haptic capabilities, an earlier version of which was described in von Zitzewitz et al. (2008). Their study, carried out with eight participants, compared skill acquisition between conventional training on water and training in the simulator. Examining the differences between the two, they concluded that, with respect to both questionnaire and biomechanical responses, the methods were similar enough for the simulator to be used as a complementary training tool, since there was sufficient and appropriate transfer of training using this method. Wellner et al. (2010b) described an experiment where 10 participants took part in simulated rowing. The novelty was that they added a virtual audience to test the idea that the presence of an audience would encourage the rowers in a competitive situation. They did not find a notable effect in this regard, reporting only the relatively high degree of presence felt by the participants. On similar lines, Wellner et al. (2010a) examined whether the presence of virtual competitors in a rowing competition would boost performance. No definite results were found, but according to the authors, the study had some flaws, and in any case the sample size was small (n = 10). In spite of the null results, it is important to note how VR affords the possibility of experimenting with factors that would be possible, but logistically very difficult, to manipulate in reality.

Another example of this use of VR that is logistically very difficult to do otherwise is for spectators to attend sports matches when they cannot physically attend (e.g., someone in the US who is a fan of English soccer). Instead, they can view them, as if they were there – and have the excitement of seeing the game life-sized, first hand, and among a crowd of enthusiasts. Kalivarapu et al. (2015) implemented a system to display American Football in a high-resolution, six-sided, Cave-like system and also in an Oculus DK2 HMD. They carried out a study with 60 participants who were divided into three conditions: Cave (n = 20), HMD (n = 20), and video (n = 20), where the game and associated events were shown on video. They concluded that the Cave and HMD experiences gave the participants greater opportunity to interact (i.e., view from different vantage points) compared to the video. Participants nevertheless experienced a greater degree of realism in the Cave, perhaps not surprising because of its greater resolution (and several orders of magnitude greater cost). On the whole, the HMD and Cave produced similar results across a number of aspects of presence. There is a growing interest in the use of VR for sports viewing and other events, mainly using 360° video. See also the “Wear the Rose” system that gives fans the chance to experience rugby games first hand,⁴¹,⁴²,⁴³,⁴⁴ and an example of its use in American Football.⁴⁵

There have been many other applications of VR in sports – impossible to cover all of them here – for example, a baseball simulator,⁴⁶ handball goalkeeping (Bideau et al., 2003; Vignais et al., 2009), skiing (Solina et al., 2008), detecting deceptive movements in rugby (Brault et al., 2009; Bideau et al., 2010), and pistol shooting⁴⁷ (Argelaguet Sanz et al., 2015), among others. A special issue of Presence – Teleoperators and Virtual Environments was devoted to VR and sports (Vignais et al., 2009; Multon et al., 2011), which would be a good starting point for readers wishing to follow up this topic in more detail (see Presentation S2 in Supplementary Material).

3.2. Exercise

It is well known that aerobic exercise is extremely good for us, especially as we age. A meta study of research relating to older adults carried out by Colcombe and Kramer (2003) showed that there is a clear benefit for certain cognitive functions. A more recent survey by Sommer and Kahn (2015) again showed the benefits of exercise for cognition for a variety of conditions. Yu et al. (2015) showed its utility for Alzheimer patients and Tiozzo et al. (2015) for stroke patients. However, repetitive exercise with aerobic benefits can be boring; indeed, Hagberg et al. (2009) found in a study that enjoyment is important in increasing physical exercise.

Virtual reality opens up the possibility of radically altering how we engage in exercise. Instead of just being on a stepping machine watching a simple 2D representation of a terrain, we can be walking up an incline on the Great Wall of China, or walking up the steps in a huge auditorium where we are excitedly going to watch a sports game, or even walking up steps to a fantasy castle in a science fiction scenario. Instead of just riding an exercise bike, we can be cycling through the landscape of Mars.⁴⁸,⁴⁹,⁵⁰

One use of VR for exercising would be an extension of approaches that have already been tested, normally referred to as “exergaming.” This involves, for example, connecting an exercise bike to a display, so that the actions of the rider affect what is displayed, e.g., faster pedaling leads to a corresponding increase in the optic flow depicted on the display. Moreover, other motivational factors can be introduced such as virtual competitors (as we saw in the rowing example above). Anderson-Hanley et al. (2011) carried out a study with n = 14 older adults using a cybercycle (an exercise bike with a screen in front) and competitive avatars as in a race.⁵¹ Their evidence suggested that this social factor tended to increase participants’ effort. Finkelstein and Suma (2011) used a three-walled stereoscopic display and upper body tracking of participants who had to dodge virtual planets flying toward them. Their experiment included n = 30 participants who played for 15 min. They found that the method produces increased heart rate (i.e., is aerobic) and motivates children and adults to exercise. Mestre et al. (2011) had n = 12 participants in an experiment that used an exerbike (with a large screen) where they compared video feedback with video and music feedback. They found that the addition of music was beneficial both psychologically (for motivation and pleasure) and behaviorally. Anderson-Hanley et al. (2012) carried out a formal clinical trial where they used “cybercycling,” as above, stationary cycling tied to a screen display, with older people (n = 102). They were interested in testing among other things whether such cycling would improve executive function. They found that cognitive function was improved among the cybercyclers, and that it was likely that it would help to prevent cognitive decline compared to traditional exercise. Overall, while there has been significant work in this area, a systematic review carried out by Bleakley et al. (2013) found that although these types of approach are safe and effective, there is limited high-quality evidence currently available.

It is one thing to be cycling or walking on a treadmill or exercise steps while looking at a screen, since this is anyway the case with most exercise machines even though the display may be very simplistic. Since the exerciser is not actually moving through space, looking at a screen should be harmless. However, it is not obvious that the same activities could be safely or successfully carried out while people are wearing an HMD, which not only obscures their vision of the real world but may also lead to a degree of nausea – which is all the more likely to occur while moving through virtual space. Shaw et al. (2015b) discussed five major design challenges in this field. First, to overcome the problem of possible sickness; second, to have reliable tracking of the body; third, to deal with health and safety aspects; fourth, the choice of player visual perspective; and fifth, the problem of latency. They described a system that was designed to overcome these problems, that used an Oculus DK2 HMD, and which was evaluated in an experimental study (Shaw et al., 2015a). This had n = 24 participants (2 females, ages between 20 and 24). They compared three setups: a standard exercise bike with no feedback, the exercise bike with an external display, and the bike with the HMD. The fundamental findings were that on several measures (calories burned, distance traveled) the two feedback systems outperformed the bike-only condition but did not differ from each other. The two systems with feedback were also evaluated as more enjoyable than the bike only, and the HMD was more enjoyable and was associated with greater motivation than the external display system. Only 4 out of 26 reported some minor symptoms of simulator sickness. As the authors pointed out, the study was limited, since the participants were almost all males, and with limited age range, and it is not known how well these results would generalize. Bolton et al. (2014) also described a system that combined an Oculus Rift HMD⁵² with an exercise bike that was designed to reduce the possibility of motion sickness; however, no experimental results were given. There are several other applications without associated papers such as RiftRun⁵³ where participants run on the spot to virtually run through an environment.

Overall, as in other fields, there are promising but far from conclusive results, but irrespective of scientific studies it is highly likely that immersive VR will be combined with personal exercise systems, since the relatively low cost now makes this possible, and some sports providers may decide that the “cool” factor makes such an enterprise worth the economic risk. Whether these are successful or not will obviously depend on consumer uptake.

Finally, as in other applications, we emphasize that VR allows us to go beyond what is possible in reality. Even cycling through Mars is just cycling. It is physically possible, if highly unlikely to be realized. Perhaps though there are fundamentally new paradigms that can really exploit the power of VR – the virtual unreality that we mentioned in the opening of this article. One approach is to use VR to implicitly motivate people toward greater exercise rather than as a means to carry out the exercise itself. Fox and Bailenson (2009) carried out a study where participants using a head-tracked HMD-based VR saw a virtual character from 3PP (i.e., across the room and looking toward them) with a face that was based on a photograph of their own face and that therefore had some likeness to themselves. Participants at various points were required to carry out physical exercises or not. While they did not carry out these exercises the body of their virtual doppelganger became fatter, and while they did the exercises the virtual body became thinner. There were n = 22 participants in this reinforcement condition, n = 22 in another condition where the virtual body did not change, and n = 19 in another condition where there was just an empty virtual room with no character. The dependent variable was the amount of voluntary exercise that participants carried out in a final phase of the experiment (during which there was also positive and negative reinforcement). It was found that the greatest exercise was carried out by the group that had the positive and negative reinforcement. In order to check that it was the facial likeness that accounted for this result, a second experiment introduced another condition, which was that the face of the virtual body was that of someone else. Here, the result only occurred for the condition of the virtual doppelganger. 
Finally, it could be argued that the participants in the voluntary exercise phase only exercised to avoid the unpleasant sensation of seeing their virtual doppelganger “gaining weight.” A third study examined participants’ level of exercise during a 24-h period after the conclusion of the study, through a questionnaire returned online. The setup was that they saw their doppelganger exercising on a treadmill, or a virtual character that did not look like themselves exercising, or a condition where their doppelganger was not doing any exercise but just standing around. The results suggested that those who saw their virtual look-alike exercising did carry out significantly more exercise in the real world in a period after the experiment than the other two conditions.

A second approach might be to use VR to provide a surrogate for exercising, rather than providing a motivation to exercise physically in reality. Kokkinara et al. (2016) illustrated what might be possible. Participants who were seated wearing an HMD and unmoving (except for their head) saw from 1PP their virtual body standing and carrying out walking movements across a field. They saw this when they looked down directly toward their legs that would be walking, and also in a shadow. In another condition they saw the body from a 3PP. After experiencing this virtual walking for a while they approached a hill, and the body walked up the hill. In the embodied 1PP condition participants had a high level of body ownership and agency over the walking, compared with the 3PP condition. More importantly, for this discussion, while walking up the hill participants had stronger skin conductance responses (more sweat) and greater mean heart rate in the embodied condition, compared to a period before the hill climbing, which did not occur for those in the 3PP. There were 28 participants each of whom experienced both conditions (there was another factor, but it is not relevant to this discussion).

Although there are caveats for both of these studies, the important aspect for our present purpose is that they illustrate how VR might be used to break out of the boundaries of physical reality and achieve useful results through quite novel paradigms. Of course it must always be better to carry out actual physical exercise rather than relying on your virtual body to do it for you. Yet sometimes, for example, on a long flight, virtual exercise might be the only possibility. Indeed, in this context, it has been found that participants who perceive their virtual body from 1PP in a comfortable posture are more likely to feel actual comfort than those who see their body in an uncomfortable posture (Bergström et al., 2016).⁵⁴ The point is that VR has the power to go beyond what we can do in physical reality, even in principle, and become a radically new medium with different ways of thinking and novel ways of accomplishing life-changing goals.

4. Social and Cultural Experiences

There are many areas of social interaction between people where it is important to have good scientific understanding. What factors are involved in aggression of one group against another, or in various forms of discrimination? Which factors might be varied in order to decrease conflict, improve social harmony? It is problematic to carry out experimental studies in this area for reasons discussed below. However, immersive VR provides a powerful tool for the simulation of social scenarios, and due to its presence-inducing properties can be effectively used for laboratory-based controlled studies. Similarly, away from the domain of experiments, there are many aspects of our cultural heritage that people cannot experience – how an ancient site might have looked in its day, the experience of being in a Roman amphitheater as it might have been at the time, and so on. Again, VR offers the possibility of direct experience of such historical and cultural sites and events. In this section, we consider some examples of the application of VR in these fields, starting first with social psychology.

Loomis et al. (1999) pointed out how VR would be a useful tool for research in psychology and Blascovich et al. (2002) in social psychology. Here, the potential benefits are enormous. First, studies that are impossible in reality for practical or ethical reasons are possible in VR. Second, VR allows exact repetition of experimental conditions across all trials of an experiment. Moreover, virtual human characters programed to perform actions in a social scenario can do so multiple times. This is not possible with confederates or actors, who can become tired and also have to be paid. Although it is costly to produce a VR scenario, once it is done, it can be used over and over again. Also, the scenarios can be arbitrary rather than restricted to laboratory settings. Rovira et al. (2009) pointed out how the use of VR in social science allows for both internal and ecological validity. The first refers to the possibility of valid experimental designs including issues such as repeatability across different trials and conditions, the precision with which outcomes can be measured, and so on. The second refers to generalizability. For example, in a study of the causes of violence, VR can place people in a situation of violence, which cannot be done in a real-life setting. This means that there is the possibility of generalization of results out of the laboratory to what may occur in reality. In particular, VR can be used to study extreme situations that are ethically and practically impossible in reality. This relies on presence – PI and Psi – leading to behavior in VR that is sufficiently similar to what would be expected in real life under approximately the same conditions. In the sections below, we briefly review examples of research in this area.

4.1. Proxemics

How do you feel when a stranger approaches you and stands very close? The answer may vary from culture to culture, but at least in the “Anglo-Saxon” world you are likely to back away. Proxemics is the study of interpersonal distances between people, discussed in depth by Hall (1969). He defined intimate, personal, social, and public distances that people maintain toward each other (and these distances may be culturally dependent). An interesting question is the extent to which these findings also occur in VR. If a virtual human character approaches and stands close to you, in principle this is irrelevant since nothing real is happening – there is no one there. Even if the character represents a physically remote actual person who is in the same shared virtual environment as you, they are not really in the same space as you, and therefore not close. We briefly consider proxemics behavior in VR because it is a straightforward but fundamental social behavior, and finding that the predictions of proxemics theory hold true for VR is a foundation for showing that VR could be useful for the study of social interaction.

There has not been a great deal of work on this topic that has exploited VR. Bailenson et al. (2001) showed that people tend to keep greater distances from virtual representations of people than from cylinders in an immersive VR. This work was continued in Bailenson et al. (2003) where it was shown that participants maintain greater distances from virtual people when approaching them from the front than from the back, and also greater distances when there is mutual eye gaze. Participants also moved away when virtual characters approached them. Readers might be wondering – so what? This is obvious. It has to be remembered, though, that these are virtual characters; no real social interaction is taking place at all. Further studies have shown that proxemics behavior tends to operate in virtual environments (Guye-Vuilleme et al., 1999; Wilcox et al., 2006; Friedman et al., 2007).

McCall et al. (2009) showed that proxemics behavior can be used as a predictor of aggression. Proxemics distances of n = 47 (mainly self-identified as White) participants were measured from two White or two Black virtual characters. Subsequently, participants engaged in a shooting game with those virtual characters. It was found that there was a positive correlation between the distance maintained from the characters in the first phase and the degree of aggression exhibited toward them in the second phase but only for the condition where both virtual characters were Black.

Llobera et al. (2010) examined proxemics in immersive VR by measuring how skin conductance response varied with the approach of one or multiple virtual characters toward the participant, to different interpersonal distances. This was to test the finding of McBride et al. (1965) of a relationship between proximity and heightened skin conductance. It was found that there was a greater skin conductance response as a function of the closeness to which the characters approached participants and the number of characters simultaneously approaching. However, it was found that there was no difference in these responses when cylinders were used instead of characters. It was suggested that skin conductance cannot differentiate between the arousal caused by characters breaking social distance norms and the arousal caused by fear of collision with a large object (the cylinder) moving close to the participants.

Kastanis and Slater (2012) showed how a reinforcement learning (RL) agent controlling the movements of a virtual character could essentially learn proxemics behavior in order to realize the goal of moving the participant to a specific location in the virtual environment. Participants in an immersive VR saw a male humanoid virtual character standing at a distance and facing them. Every so often the character would walk varying distances toward the participant, walk away from the participant, or wave for the participant to move closer to him.⁵⁵ The RL agent behind the character gained a positive reward every time the participant stepped backwards toward a target position. The long-run aim was to get the participant to move far back to this target, unknown to the participant herself. The RL agent eventually learned that if its character went very close to the participant, then the participant would step backwards. Moreover, if the character was far away then it sacrificed short-term reward by simply waving toward the participant to come closer to itself, because then its action of moving forwards would be effective in moving the participant backwards. Hence, the RL agent relied on presence (the participant moving back when approached too close – from the prediction of proxemics theory) and learned how to exploit this proxemics behavior to achieve its task. For all participants, the RL agent learned to get the participant back to the target within a short time. This method could not have worked unless proxemics occurred in the VR. Having shown that this is the case we move on to more complex social interaction.
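The logic of this kind of agent can be illustrated with a toy tabular Q-learning sketch. This is not the implementation used by Kastanis and Slater (2012): the two distance states, the three actions, the reward values, and the simulated participant in `step` are all hypothetical simplifications chosen only to show how an agent can learn to accept a short-term penalty (waving the participant closer) in order to gain later reward (making the participant step back). In particular, the sketch assumes that from far away the character's own approach never enters the participant's personal space, so that waving is the only way to close the distance.

```python
import random

# Toy model: every name, reward, and transition below is an illustrative assumption.
STATES = ("far", "near")                    # character's distance from the participant
ACTIONS = ("approach", "retreat", "wave")   # moves available to the character


def step(state, action):
    """Simulated participant following a crude proxemics rule.

    Returns (next_state, reward).
    """
    if state == "near" and action == "approach":
        return "near", 1.0    # participant steps back, toward the target
    if state == "far" and action == "wave":
        return "near", -1.0   # participant steps forward, away from the target
    if state == "near" and action == "retreat":
        return "far", 0.0     # character walks away, participant stays put
    return state, 0.0         # every other combination has no effect


def train(episodes=2000, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = "far"  # each episode starts with the character at a distance
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best-valued action for each state under the learned Q-table."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Under these assumptions the learned policy waves when the character is far and approaches when it is near, mirroring the qualitative behavior described above. A real system would need a much richer state (continuous distances, participant position relative to the target) and would learn from actual participants rather than a scripted rule.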

4.2. Discrimination

Research suggests that VR can provide insights into discrimination by affording the opportunity for people to have simulated experiences of the world through another group’s perspective even if only briefly. For example, we saw earlier how simply placing White people in a Black body in a situation known to be associated with race discrimination led to an increase in implicit racial bias (Groom et al., 2009). On the other hand, virtual body representation has been shown to be effective with respect to racial bias, where White people embodied in a Black-skinned body show a reduction in implicit racial bias (Peck et al., 2013)⁵⁶ in a neutral social situation as we saw in Section 2.1.2.

More generally, the method of virtual embodiment has also been used to give adults the experience of being a child (Banakou et al., 2013), has been shown to affect motor behavior while playing the drums (Kilteni et al., 2013), and has been used to give people the illusory sensation of having carried out an action that they had in fact not carried out (Banakou and Slater, 2014). Some of the work in the area of body representation applied to implicit bias is reviewed in Slater and Sanchez-Vives (2014) and Maister et al. (2015).

A further question is whether embodied experiences as an “outgroup” member will actually translate into different behavior toward members of the group. Although not in the context of discrimination there is some evidence from the work of Ahn et al. (2013) that this might be the case. They immersed people with normal vision into an HMD-delivered VR where they experienced certain types of color blindness. In three experiments (N = 44, N = 97, and N = 57), they compared the effects of perspective taking where participants simply imagined being color blind to a condition where the display actually made them color blind in the virtual environment. They found that indeed the VR experience did result in greater helping behavior of participants toward color blind people both within the experiment and in their behavior after the experiment (with a moderate effect size of the squared multiple correlation of around 10%). It illustrates how VR might be used to put people experientially in situations and how this may influence their behavior compared with only imaginal techniques.

4.3. Authoritarianism

Stanley Milgram carried out a number of experiments in the 1960s designed to address the question of how events such as the Holocaust could have occurred (Milgram, 1974). He was interested in finding explanations of how ordinary people can be persuaded to carry out horrific acts. The type of experiments that he conducted involved experimental subjects giving apparently lethal electric shocks to strangers. These are very famous experiments that are as topical today as in the 1960s, and barely a week goes by without some mention of them in news media,⁵⁷ or a report of further research relating to them.⁵⁸ There were several different variants of the experiment that Milgram designed. Typically, the experimental subject, normally recruited from the local town (near Yale University) rather than from among psychology students, was invited to the laboratory where he or she met another person, also supposedly recruited in the same way. The other person was in fact a confederate of the experimenter, an actor hired for the purpose, this being unknown to the subject. The experimenter invited the subject and the actor to draw lots to determine their respective roles in the experiment. It turned out that the subject was to play the role of Teacher, and the actor the role of Learner, but the outcome of this draw was fixed in advance. Then both the Teacher (subject) and Learner (actor) were taken to another room, where the Learner had electrodes placed on his body connected to an electric shock machine. It was explained that the idea was to examine how punishment might aid in learning. The Learner was to learn some word-pair associations, and whenever he gave a wrong answer he was to be shocked. The Learner, acting in a jovial manner, explained that he had a mild heart condition, and the experimenter assured both Learner and Teacher that “Although the shocks may be painful they are not dangerous.” There are online videos showing the original experiment.⁵⁹

The Learner was left in the room, and the experimenter took the Teacher back into the main laboratory, closing the door to that room. He explained to the Teacher that he had to read out cues for the word-pair tests and whenever the Learner gave the wrong answer the Teacher should increase the voltage on a dial and administer an electric shock at that voltage. The voltages were labeled from 15 V (slight shock) to 375 V (danger: severe shock) to 450 V (marked “XXX”). During the course of the experiment, a tape was played giving the responses of the Learner. With the low voltage shocks there was no response. After a while though the Learner could be heard saying “ouch!” and as the voltage increased further he complained more and more vociferously, eventually saying that he had the heart condition and that his heart was starting to bother him. He shouted that he wanted to be let out of the experiment, and finally with the strongest shocks he became completely silent. If at any point the Teachers said that they felt uncomfortable or that they wanted to stop, the experimenter would say one of “The experiment requires that you continue,” “It is absolutely essential that you continue,” or “You have no other choice, you must go on” in a prescribed sequence. Participants generally found that the experience was extremely stressful, and even if they continued through to lethal voltages they were clearly very upset.

Prior to the experiment, Milgram had asked a number of psychologists about how many people would go all the way and administer even lethal voltages to the Learner. The view was that only a tiny minority of people, those with psychopathic tendencies, would do so. In the version of the experiment described above, about 60% of subjects went all the way to administer the most lethal shocks. The results stunned the world since it apparently showed that ordinary people could be led to administer severe pain to another at the behest of an authority figure. There is a wealth of data and analysis and a description of many different versions of this experiment in Milgram (1974), but the basic conclusion was that people will tend to obey authority figures. Here, ordinary people were being asked to carry out actions in a lab in a prestigious institution (Yale University) and in the cause of science. They tended to obey even if they found that doing so was extremely uncomfortable. Although this is not the place for discussion of this interpretation, interested readers can find alternative explanations for the results in, for example, Burger (2009); Miller (2009); Haslam and Reicher (2012); and Reicher et al. (2012).

Participants in these experiments were deceived – they were led to believe that the Learner was really just another subject, a stranger, and that he was really receiving the electric shocks. The problem was not so much the stress, but the fact that participants were not informed about what might happen, were not aware that they might be faced with an extremely stressful situation, and were ordered to continue participating even after they had clearly expressed the desire to stop. These and other issues led to strong criticism from within the academic community that eventually led to a change in ethical standards – informed consent, the right to withdraw from an experiment at any moment without giving reasons, and care for the participants including debriefing. See also a discussion of these issues as they relate to VR in Madary and Metzinger (2016). Hence, these experiments on obedience cannot be carried out today for research purposes, no matter how valuable they might seem to be scientifically. Yet, the questions addressed are fundamental since it appears that humans may be too ready to obey the authority of others even to the extent of committing horrific acts.

In 2006, a virtual reprise of one version of the Milgram experiments was carried out (Slater et al., 2006), with full ethical approval. The approval was given because participants were warned in advance about possible stress, could leave the experiment whenever they wanted, and of course they knew for sure that no one in reality was being harmed because in this experiment the Learner was a (poorly rendered) virtual female character displayed in a Cave-like VR setting.⁶⁰ The participants (Teachers) sat in the Cave system by a desk on which there was an electric shock machine. They saw the virtual Learner on the other side of a (virtual) partition, projected in stereo on the front wall of the Cave. They went through the same routine with the virtual Learner as in Milgram’s experiment, reading out cue words, and administering “electric shocks” to the virtual Learner whenever she answered with an incorrect word-pair association. Just as in the original experiment, after a while she began to complain and demanded to be let out of the experiment, and eventually seemed to faint. However, if participants expressed a wish to stop, no argument against this was given, and they stopped immediately.

Even though carried out in VR, many of the same results as the original were obtained, though at a lower level of intensity of stress. There were n = 34 participants, 23 of whom saw and heard the virtual Learner throughout the experiment, and 11 who saw and spoke to her initially but then a curtain descended, and they only communicated with her through text once the question and answer session began. All those who communicated by text gave all of the shocks. However, 6 of the 23 who saw and heard the Learner withdrew from the experiment before giving all shocks. In other words, 17 of these 23 (74%) continued to the end, in spite of feeling uncomfortable, as was shown by their physiological responses (skin conductance and electrocardiogram responses).
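The completion rates quoted above follow directly from the reported counts. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only, and the rate for all participants pooled is derived here from the counts rather than quoted from the paper.

```python
# Counts taken from the description of Slater et al. (2006) above.
total_participants = 34
visible_group = 23     # saw and heard the virtual Learner throughout
text_group = 11        # communicated by text once the curtain had descended
withdrew_visible = 6   # withdrew before giving all shocks
withdrew_text = 0      # all of the text group gave all of the shocks


def completion_rate(group_size, withdrew):
    """Return the percentage of a group that gave all of the shocks."""
    return 100.0 * (group_size - withdrew) / group_size


visible_rate = completion_rate(visible_group, withdrew_visible)
text_rate = completion_rate(text_group, withdrew_text)
# Derived figure, not quoted in the text: both groups pooled.
overall_rate = completion_rate(total_participants,
                               withdrew_visible + withdrew_text)

print(f"saw and heard the Learner: {visible_rate:.0f}%")
print(f"communicated by text:      {text_rate:.0f}%")
print(f"all participants:          {overall_rate:.0f}%")
```

The 74% figure therefore refers only to the group that saw and heard the Learner, not to all 34 participants.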

In the paper, it was argued that the gap between reality and VR makes these types of experiments possible. Presence (PI and Psi) leads to participants tending to respond to virtual stimuli as if they were real. But, on the other hand, they know that it is not real, which can also dampen down their responses. In debriefing, when participants were asked why they did not stop even though they felt uncomfortable, a typical answer was “Since I kept reminding myself that it wasn’t real.” From the original experiments of Stanley Milgram we know (at least for the 1960s around Yale in the US) how people actually responded. In VR, we see that they responded similarly, though not with the very strong and visible stress that many of the original participants displayed. Using VR, we can study these types of events, and how people respond to them, and construct predictive theory that may help us understand how people might respond in reality. The predictions can then be tested against what happens in naturally occurring events and the theory examined for its viability. This type of approach can also be used to gather real-time data about brain activity of people when faced with such a situation (Cheetham et al., 2009).

4.4. Confronting Violence

You are in a bar or other public place and suddenly a violent argument breaks out between two other people there. It seems to be about something trivial. One man is clearly the perpetrator, and the victim is trying to calm down the situation, but his every attempt at conciliation is used by the perpetrator as a cue for greater belligerence. Eventually the perpetrator starts to physically assault the victim. What do you do? Suppose you are alone there? Suppose there are other people? Perhaps the victim shares some social identity with you, such as being a member of the same club or same ethnic group different to that of the aggressor. How do you respond? Do you try to intervene to stop the argument? Or walk away? How is your response influenced by these factors such as number of other bystanders or shared social identity with the victim or aggressor?

This area of research was initiated in the late 1960s, provoked by a specific incident when apparently 38 bystanders observed a woman being murdered and did nothing to help.⁶¹ Latane and Darley (1968) introduced the notion of the “bystander effect,” which postulates that the more bystanders there are at an emergency event such as this, the less likely it is that anyone would intervene, due to diffusion of responsibility; see also Darley and Latané (1968). However, other researchers have also suggested the importance of social identity as a factor, that is, the perceived relationships between the people involved; for example, see Reicher et al. (2006); Hopkins et al. (2007); Manning et al. (2007); and Levine and Crowther (2008). There is a meta-analysis and review of the field by Fischer et al. (2011).

As pointed out by Rovira et al. (2009), one of the problems in this area of research is that for ethical and practical reasons it is not possible to actually carry out controlled experimental studies that depict a violent incident such as that described in the opening paragraph of this section. This is very similar to the situation of the Obedience studies discussed above. Instead, researchers have to study surrogates such as the responses of people to someone falling (Latane and Rodin, 1969) or responses to an injured person lying on the ground (Levine et al., 2005). However, these are not violent emergencies, so it may not be valid to extrapolate results from such scenarios to what might happen in actual violent emergencies. In VR it is possible to set up simulated situations, where we know from presence research that people are likely to react realistically to the events portrayed. King et al. (2008) suggested the use of Second Life to provide a non-immersive simulation of the bystander situation and described a case study where a particular person was victimized to examine how the presence of bystanders mediated the level of helping offered. It was concluded that one reason that people did not intervene was that they thought that this should be the responsibility of the Second Life monitors rather than the ordinary “citizens.” In another video-game setting, Kozlov and Johansen (2010) found that participants were less prone to helping behavior in the presence of larger groups of virtual characters. A possible problem though with using video games is that they do not mobilize the body – there are no natural sensorimotor contingencies so that PI becomes something at best imaginal. In some applications this may not be important. However, when studying people’s responses to emergency situations it may be prudent to have whole body engagement, some illusion that the body itself is present and at risk. Garcia et al. (2002) showed that only imagining the presence of other bystanders results in a bystander effect to the extent that participants are less likely to help others after the end of the study if they had been primed to think about being in a group rather than being alone. Hence, it might be the case that video games are mainly aids to imagination and that results obtained from video games might be the same as those from imagination. Indeed, a result from Stenico and Greitemeyer (2014) suggests that this might be the case. This is not to say that such results are invalid but that by themselves they are not convincing enough, and some experimental evidence is needed that does place participants into the midst of a violent emergency so that various factors influencing their responses can be investigated. But, as we have said, this cannot be done for practical and, above all, ethical reasons.

Slater et al. (2013) used immersive VR (a Cave-like system) to study the social identity hypothesis: that participants who share social identity with the victim are more likely to intervene to help than if they do not share social identity. The method to foster social identity with a virtual human character was through the use of soccer club affiliation. All of the n = 40 participants were fans of the English soccer team Arsenal. They were in a virtual bar where they had an initial conversation with a life-sized male virtual character (V). This character was either an Arsenal supporter depicted through his shirt and his enthusiastic conversation about Arsenal (n = 20, “ingroup” condition), or a generic football fan, not a supporter of Arsenal (n = 20, “outgroup” condition). After a while of this conversation another character (P) – also wearing a generic soccer shirt but not Arsenal – butted in and started to attack V especially because of his support of Arsenal. This attack increased in ferocity until after about 2 min it became a physically violent attack.⁶² The main response variable was the number of times that the participant intervened on the side of V. It was found in accordance with social identity theory that those in the group where V was an enthusiastic Arsenal supporter intervened much more than those in the other group. There was a second factor, which was whether or not V occasionally looked toward the participant during the confrontation, but this had no effect. However, there was a positive correlation between the number of interventions and the extent to which participants believed that V was looking toward them for help – but only in the ingroup condition.

Since it is impossible to compare these results with any study in real life, of course their validity in the sense of how much they would generalize to real-life behavior cannot be known. However, experiments such as these generate data and concomitant theory, which can be compared in a predictive manner with what happens in real-life events. In fact, there is no other way to do this other than the use of actors – which as mentioned earlier can run into ethical and practical problems. Moreover, the knowledge gained from such experiments can be used also in the policy field, for example, providing advice to victims on how to maximize the chance that other people might intervene to help them, or of use to the emergency or security services on how to defuse such a situation.⁶³ It is a way to provide evidence-based policy, and if the evidence is not generalizable to real situations then with proper monitoring, the policy will ultimately be changed.

4.5. Cultural Heritage

“In today’s interconnected world, culture’s power to transform societies is clear. Its diverse manifestations – from our cherished historic monuments and museums to traditional practices and contemporary art forms – enrich our everyday lives in countless ways. Heritage constitutes a source of identity and cohesion for communities disrupted by bewildering change and economic instability.” (Protecting Our Heritage and Fostering Creativity, UNESCO).⁶⁴

The preservation of the cultural heritage of a society is considered as a fundamental human right, and there is a Hague Convention on the protection of cultural property in the event of armed conflict.⁶⁵ As we have seen tragically in recent years, there has been massive and deliberate destruction of cultural heritage, two well-known examples being the Buddhas of Bamiyan⁶⁶ and the partial destruction of Palmyra.⁶⁷ UNESCO maintains a country-by-country world heritage list.⁶⁸

The ideal way to preserve cultural heritage is physical protection, preservation, and restoration of the sites. There has also been significant work over many years concerned with digital capture and visualization of such sites, which of course can be displayed in VR (Ch’ng, 2009; Rua and Alvito, 2011). The first and obvious application of VR in this field is to allow people all over the world to virtually visit such sites and interactively explore them. This is no different from virtual travel or tourism, except for the nature of the site visited. This is also possible through museums that have VR installations. The second is digitization of sites for future generations, and especially those that are in danger of destruction either through factors such as environmental change or conflict. The third type of application is to show how these sites might have looked fully restored in the past and under different conditions such as lighting conditions. For example, it is quite different to see the interior of a building or a cave with electric lighting than under the original conditions that the inhabitants of that time would have seen them – by candlelight or fire. The fourth is to see how sites, both cultural heritage and non-cultural heritage sites, might look in the future, under different conditions such as under different global warming scenarios.

This is a massive field and mainly concerned with digitization, computer vision, reconstruction, and computer graphics techniques. Here, we give a few examples of some of the virtual constructions that have been done and that potentially could be experienced immersively in VR.

An example of one type of application is described by Gaitatzes et al. (2001) who show how museum visitors can walk through various ancient sites visualized in a Cave-like system, in particular through the ancient Greek city of Miletus.⁶⁹ Carrozzino and Bergamasco (2010) give various examples of museum installations.⁷⁰,⁷¹ Interestingly, they speculate on a number of reasons why the use of VR in museum settings may not have been taken up so much in recent years: (1) cost; (2) it requires a team to be able to do this; (3) lots of space is needed for the installation; (4) visitors do not want to wear VR equipment; (5) it is a single person experience; and (6) VR might be thought to be not serious enough to include in such august settings as museums. Apart possibly from the last issue, each of these problems is largely overcome with the advent of low-cost, high-quality HMDs with built-in head tracking. Of course it is still true that an interdisciplinary team is required to create the environments, although see Wojciechowski et al. (2004) and Dunn et al. (2012) for an example of how to do this. In particular, digital acquisition and rendering of cultural heritage sites requires a huge amount of data to be processed. An example of how this was handled for the site of the Monastery of Santa Maria de Ripoll in Catalonia, Spain, is presented in Besora et al. (2008) and Callieri et al. (2011) and an example of a user interface for virtually navigating this site in Andújar et al. (2012). A famous example of the virtual recreation of world heritage is the digitization and rendering of Michelangelo’s statue of David plus several statues and other artifacts of ancient Rome (Levoy et al., 2000). The David statue⁷² required 2 billion polygons for its representation, and the software is available as freeware from Stanford.⁷³

Sometimes a digital reconstruction is the only way to view a site. The ancient Egyptian temple of Kalabsha was physically moved from its original location to preserve it from rising flood waters. Sundstedt et al. (2004) digitally reconstructed it to show it in its original site, and also how it may have looked two millennia earlier, including illuminating it with simulations of the type of lighting that may have been used at that time. Gutierrez et al. (2008) describe highly accurate illumination methods for heritage sites. Happa et al. (2010) review various examples of illuminating the past, together with descriptions of the methodology used.

Many examples of virtual cultural heritage in the past have been implemented for desktop or projection systems – though of course they could always be displayed immersively in HMDs. However, this raises other issues such as appropriate tracking, interfaces, and so on. A joystick for navigation, for example, is not always appropriate for an HMD (especially bearing in mind that movement without body action can sometimes be a cause of simulator sickness). Also a screen display has the advantage that typically it can be much higher resolution than what is possible in an HMD, where all the detailed lighting and detail rendering might not even be perceivable. Webel et al. (2013) describe their experience with a number of the newer technologies for display and tracking in the virtual construction of four different sites for display in a museum. They point out how traditional systems, such as tracking that requires the wearing of devices and expensive Caves, are not always suitable for busy environments such as museums. However, low-cost, camera-based tracking systems do not require physical contact with visitors, and the use of the Oculus Rift HMD (in their application) allowed visitors to look around the virtual environment simply by turning their head rather than learning a joystick type of navigation method. In other words, these systems provide a natural means of interaction. As the authors wrote: “With the Oculus Rift as a display and head-tracking device, the user’s immersion can be extremely increased. The natural camera control just by turning the head, like one would do in the real world, lets users control this aspect without even thinking about it. The combination with natural interaction inputs with the Kinect or the Leap Motion enables the user to directly interact with the virtual world.”

Kateros et al. (2015) review the use of Oculus HMDs for cultural heritage and show how they were used in a number of applications and give insight into their ideas for preparing a user study. Casu et al. (2015) carried out such a study comparing the viewing of art masterpieces in the classroom through a non-immersive multimedia white board display and the Oculus Rift. Their experiment had n = 23 students in a between-groups design (12 saw the non-immersive display) and found that the HMD method was superior across a range of subjective questionnaire-based factors including motivation. Such studies, while useful, do not address the problem of the “wow factor,” i.e., using the HMD is novel, and it certainly provides a quite different experience than the multimedia white board. However, maybe once such systems become commonplace, the same results might not be obtained. There are no clear-cut answers, and it is not easy to establish criteria for the success or otherwise in comparing such systems (since there are many factors that vary between them). For example, Loizides et al. (2014) compared a powerwall with an Oculus Rift HMD for virtual visits to cultural heritage scenarios in Cyprus. They found that participants appreciated both types of display and especially the presence-inducing capabilities of the HMD. However, the HMD also led to greater nausea. As mentioned though, it is very difficult to make such comparisons because on the one hand the HMD had the natural interface for viewing (head tracking) but on the other hand much lower resolution. Moreover, the price ratio between powerwall and HMD was (at that time) 40 to 1, a factor not reflected in the difference in participant evaluation.

Finally, it should be noted that cultural heritage is not only buildings and statues. There are rich traditions in societies that are passed down the generations that are certainly no less important to preserve for the future – intangible heritage. An obvious example is folklore stories, but the medium for the ultimate representation of these for preservation through the generations is in written form. However, there are other examples, such as folk dances – which can be preserved through younger generations learning these from their elders – but this does not provide a form for others to experience. Aristidou et al. (2014) show how folk dancing can be digitally captured and represented.⁷⁴ They concentrate on the technical aspects, but clearly such efforts can be portrayed immersively (see Presentation S3 in Supplementary Material).

5. Moral Behavior

Sometimes in our professional and personal lives we are faced with problems that cannot be answered by any kind of evidence-based scientific reasoning. The science can provide information, but it cannot determine what should be done. Imagine that there is a nuclear reactor providing power for millions of people, and that the science determines that in the next 10 years there is a 5% chance that it will explode causing massive contamination. There are no resources to repair it and no alternatives. It can be decommissioned, and in the short to medium term this will cost many lives and great suffering. It can be left to run, with the corresponding risk. The science can determine the level of risk, but it cannot determine the action. In military or police action, there is the issue of “collateral damage.” Action to resolve one kind of threat that might save many lives may indeed cost many lives in its execution. The science can inform about relative risks and costs, but it cannot determine what is the right thing to do.

How people “should” and do make decisions under such conditions of moral uncertainty are subjects for study in moral philosophy and neuroscience. Normally, abstract situations are used for reasoning or gathering evidence about the responses of people. A famous example is the “trolley problem,”⁷⁵ where you have to make a choice between allowing a runaway trolley (or tram or train…) to run over and kill five unaware people in its path or diverting it to kill another person (Foot, 1967; Thomson, 1976). What do you do? Suppose the trolley were running toward the one person, but there were five others on another track. Would you divert the train to save the one but kill the five? According to survey evidence (Hauser et al., 2007), most people will choose the action that saves the greatest number – five rather than one.⁷⁶ Suppose to save the five, however, you have to push someone else onto the track to divert the train. In this case, few people will choose to take that action.

Philosophers distinguish between utilitarian and deontological principles. The first states that it is best to take the action that maximizes the greatest good, i.e., is concerned with consequences (the end justifies the means). The second emphasizes rather that an action in itself must be ethical, based on universal maxims. For example, if it is wrong to steal then it is wrong to steal in any circumstances, irrespective of possible beneficial outcomes. See Hauser (2006) for an exposition of these various principles in the context of psychology and neuroscience. Although sacrificing one person to save five is the utilitarian solution, people also do act out of deontological principles – which is why few support actively pushing someone onto the track even though the outcome is exactly the same in utilitarian terms. Moreover, choosing to take the action of diverting the train to save five rather than one has the same outcome as not choosing to divert the train when it is running toward one with five on another track (omission). However, omission could be argued to be both utilitarian (five are saved rather than one) and deontological (not personally taking an action that would kill).

These discussions have been going on for centuries. But, how can we know what people would actually do? As we saw in the example of the Stanley Milgram Obedience experiments (Section 4.3), what people say they would do and what they actually do when faced with a situation are not necessarily the same. Below we give some examples where VR has been used, relying on its presence-inducing capabilities, to face people with such dilemmas and where their behavior can be observed. Of course, this does not solve the moral problem of what the “right” behavior should be, but rather can inform about what people actually do, and ultimately the factors and brain activity behind this.

5.1. Virtual Representations of Moral Dilemmas

Transforming a short verbal description of a scenario such as the trolley problem into VR is non-trivial. There are “five people” – which people? Gender? Age? Ethnicity? Social class? How do they look? What are they doing? Why are they there? There is a trolley or train – exactly how does it look? How fast is it going? What is the surrounding scenery? The experimental subject can divert the train – exactly how? Which action needs to be taken? How can the designer be sure that the subject will even be looking in the necessary direction? How can it be set up so that the subject sees the five and also sees the one? Doing something in VR means making it concrete and specific, obviously changing the scenario – which in one case is dependent on the imagination of the subject in response to a statement in a questionnaire, but in the other is there to be seen and heard.

Navarrete et al. (2012) implemented a version of the trolley problem, making all of the above choices but staying true to the story line, and they carried out an experiment where participants were faced with the choice between saving five or one.⁷⁷ There were n = 293 participants who experienced the scenario in an HMD-based system (NVIS). This was a between-groups experiment where one group experienced the action condition (they could act to save five) and the other group the omission condition (if they did not act five would be saved). Just over 90% of subjects chose the utilitarian solution in line with questionnaire-based results. However, those who had to actively save the five showed greater arousal (skin conductance levels) than those who could save the five by doing nothing. Moreover, the greater level of arousal was associated with a lower propensity to take the utilitarian outcome. This could indicate that following the utilitarian path leads to greater internal conflict within participants, but following it without simultaneously violating deontological principles is a less stressful choice. Ideally, in order to rule out the effect on arousal simply of carrying out the action there should be a condition that equalizes the level of physical action across the conditions. However, the important point is that such studies can be carried out at all.

Pan and Slater (2011) portrayed a dilemma equivalent to the trolley problem. Participants were taught how to control a platform that operated as an elevator in an art gallery. The gallery consisted of two floors, ground and upper level. Virtual characters entered and could ask to be taken to the upper level to view the paintings there or remained on the ground floor. At one point – in the Action condition – there were five characters on the upper level and one on the ground level. A seventh person entered and asked to be taken to the upper level. While still on the elevator, that character raised a gun and started to shoot toward all those on the upper level. The participant could leave the shooter there (risking the five) or bring the elevator down (risking the one). The Omission condition was similar except that at the critical moment there was one character upstairs and five downstairs. To avoid the problem that the types of people represented by the virtual visitors might influence the results they were portrayed as stick figures, so that characteristics such as those mentioned above – age, gender, etc. – could not be inferred. This was a between-groups experiment with 36 participants and 2 factors. The first factor was the display: the situation was portrayed in a 4-screen Cave-like system or on a single PC screen. The second factor was the Action and Omission conditions. Running such an experiment in VR really illustrates how different it is from telling people a story and asking for their response. For those in the Cave, their fundamental reaction was confusion or panic, illustrated by the fact that 61% of them carried out multiple actions in response to the shooting compared to 33% of those in the desktop condition. However, taking into account the final resting point of the platform, 89% of those in the Action condition in the Cave brought the lift down, whereas 22% did so in the Omission condition. For those in the desktop condition the equivalent proportions are 67% and 22%. The differences between Cave and desktop were not significant, although, this being a pilot experiment, the sample sizes were small. This experiment was featured in a BBC Horizon documentary “Are You Born Good or Evil?” where people naïve to the experiment were filmed. More than the statistics, their reactions pointed to the fact that they did actually experience a genuine dilemma.⁷⁸,⁷⁹ A more sophisticated version of this setup was repeated in an HMD-based study (Friedman et al., 2014) concerned with embodiment and time travel, where realistic virtual characters were portrayed. In terms of responses to the dilemma they were similar to the other studies. In these studies it has been found that people become more utilitarian in VR compared to what they will say in response to a questionnaire – i.e., they are more likely to adopt a decision depending on the outcome (saving five rather than one). In another study that used desktop VR the same was found. Specifically, subjects were more likely to make utilitarian decisions in VR compared to the same scenario described textually. In other words, although participants judged it less acceptable to sacrifice one person to save five when this dilemma was presented verbally, when it came to their actual action in VR they were more likely to do so. There is therefore a division between what people will say they would do and what they would actually do faced with the situation. This illustrates what VR is useful for in these types of context.
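The percentages reported for the Pan and Slater (2011) pilot can be related back to head counts, which makes the small sample sizes concrete. The sketch below is only a consistency check: it assumes, which the text does not state, that the 36 participants were split evenly over the two displays and the two conditions (18 per display, 9 per cell). The function name and labels are illustrative.

```python
def counts_matching(percent, group_size):
    """Return every head count whose rounded percentage of group_size equals percent."""
    return [k for k in range(group_size + 1)
            if round(100 * k / group_size) == percent]


# ASSUMPTION (not stated in the text): an even split of the 36 participants
# over 2 displays x 2 conditions.
PARTICIPANTS = 36
PER_DISPLAY = PARTICIPANTS // 2   # 18 per display (Cave or desktop)
PER_CELL = PARTICIPANTS // 4      # 9 per display-by-condition cell

# (label, reported percentage, assumed group size)
reported = [
    ("Cave, multiple actions", 61, PER_DISPLAY),
    ("Desktop, multiple actions", 33, PER_DISPLAY),
    ("Cave, Action: lift brought down", 89, PER_CELL),
    ("Cave, Omission: lift brought down", 22, PER_CELL),
    ("Desktop, Action: lift brought down", 67, PER_CELL),
    ("Desktop, Omission: lift brought down", 22, PER_CELL),
]

for label, percent, size in reported:
    print(f"{label}: {percent}% of {size} -> {counts_matching(percent, size)}")
```

Under this assumed split each reported percentage corresponds to exactly one whole number of participants, for example 8 of 9 for the 89% figure, so a difference such as 89% versus 67% amounts to two people.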

Finally, Skulmowski et al. (2014) used a screen-based system to situate participants in a trolley that they could control to avoid colliding with people standing on branching tracks. They investigated 11 different hypotheses altogether, relating to specific types of potential victims (male, female), the numbers balanced against each other (e.g., 10 people rather than 5 against 1, or 1 against 1), and ethnicity. They found that there were different response times depending on gender of the potential virtual victims, with a greater tendency to sacrifice males. In this study, arousal was estimated by measuring pupil dilation (see Presentation S4 in Supplementary Material).

5.2. Doctor/Patient Interaction

One area in which VR is likely to flourish in the coming years, as its cost comes down and it becomes more ubiquitous, is for the training of professionals. In many professions, people make fundamental ethical decisions – not so dramatic as the trolley problem, but nevertheless often very important. How does a lawyer act knowing for certain that a client has committed a horrific crime? Does a health inspector close down a factory putting at risk hundreds of jobs or allow the factory to continue with unsanitary practices – when it is clear after several warnings that there will be no significant improvement? With limited resources should an agency responsible for deciding which medicinal drugs should be available on prescription go for the cheaper one that has been shown to have limited success, or the vastly superior one that is also vastly more expensive? Choosing the latter might disadvantage the greater number of people due to restrictions on other drugs, yet also save the lives of a few.

Sometimes, these issues are covered by law and sometimes not. We consider one example. How do medical professionals learn to interact with their patients in such circumstances? Of course they observe their supervisors and teachers, and they read and learn about this in medical school. However, there is no substitute for experience. But, experience requires that prior to interacting with patients the doctors have already learned to interact with patients. Hence, VR can provide training and many different scenarios that will help toward gaining experience (Cook et al., 2010).

The idea of using virtual patients has been very thoroughly studied for many years⁸⁰ (Cendan and Lok, 2012). For example, Kleinsmith et al. (2015) have investigated empathy training with virtual patients. Here, though, we consider only ethical problems in dealing with patients – where contrary to medical advice a patient demands a certain medicine; the first time that a doctor confronts this problem with a patient would typically be with a real patient. A case in point is the overprescription of antibiotics. This is a balance between the needs of society as a whole (to avoid enhanced bacterial resistance to antibiotics) and the needs of the individual. If a patient demands antibiotics but the medical evidence suggests that these would not be appropriate, does the doctor prescribe in order to have a quieter life, or perhaps avoid being sued should the decision ultimately have been a wrong one, or follow the higher principle that not prescribing unless clearly necessary may be the best thing to do for the greater good? Pan et al. (2016) carried out an experiment with n = 21 medical doctors (general practitioners; 9 being trainees with limited experience and the remainder with an average of about 6 years’ experience). The experiment was carried out using an Oculus DK2 through which each doctor had a consultation with a virtual mother and her daughter. The mother had a small cough, and the daughter demanded that the mother be given antibiotics because when faced with the same problem a year before, the antibiotics had cured the problem immediately. Since the medical indications were that this was probably a viral infection, the participants (GPs) resisted the demand for antibiotics, which unleashed a torrent of complaints and anger from the virtual daughter.⁸¹ Finally, 8 out of the 9 trainees prescribed the antibiotics, whereas 7 out of the 12 experienced doctors did so.
The results also suggested that for those in the experienced group, the greater their reported level of presence the lower the probability that they would administer the antibiotic. The use of this type of scenario, and many others, in the medical and other professions could be of great utility in training, and in preparing people for situations that they are almost bound to face eventually. Just as airline pilots first learn on simulators, so the same is likely to be true across a range of professions.
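As a rough arithmetic check on the prescribing counts reported above (8 of 9 trainees versus 7 of 12 experienced doctors), the following Python sketch computes the two proportions and a one-sided Fisher exact probability for the 2 × 2 table. It is an illustrative calculation only and is not the analysis reported by Pan et al. (2016); the function names and the choice of test are assumptions made here.

```python
from math import comb


def proportion(k, n):
    """Fraction of n doctors who prescribed."""
    return k / n


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact probability for the table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric distribution with the
    row and column totals held fixed.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total who prescribed
    n = a + b + c + d     # all participants
    upper = min(row1, col1)
    favourable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, row1)


# Counts taken from the text: rows are trainees and experienced doctors,
# columns are "prescribed" and "did not prescribe".
trainee_rate = proportion(8, 9)
experienced_rate = proportion(7, 12)
p_one_sided = fisher_one_sided(8, 1, 7, 5)

print(f"trainees: {trainee_rate:.2f}, experienced: {experienced_rate:.2f}")
print(f"one-sided Fisher exact p = {p_one_sided:.3f}")
```

Under these assumptions the prescribing rates are about 0.89 and 0.58, and the one-sided probability is about 0.148.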

6. Travel, Meetings, and Industry

6.1. Virtual Travel

Using VR, it is possible that you may not need to have physically gone to a place to say that you have visited it. Sitting in your home you can be navigating the streets and shopping in Hong Kong, ascending Mount Everest, visiting the Taj Mahal, exploring the Forbidden City in Beijing, or even the landscape of Mars. You can watch at first hand ceremonies and customs from Polynesia to Greenland.⁸² This is an obvious and long-discussed application. There are various possibilities: to visit a place virtually before going there, to visit the place instead of going there, to have a business meeting virtually with remote partners by meeting in a shared virtual environment, or to take a break on a beach in the middle of a winter day during your coffee break in the office; the possibilities are limited only by imagination and what technology can deliver at the time (which of course is always changing).

This is far from a new idea. Already two decades ago people in the travel industry were considering the “virtual threat to travel and tourism” (Cheong, 1995), arguing that “the perceived threat of virtual reality becoming a substitute for travel is not unfounded and should not be ignored. Virtual reality offers numerous distinct advantages over the actual visitation of a tourist site … that could result in the eventual replacement of travel and tourism by virtual reality.” The advantages of VR suggested were (1) technology could eventually support “the perfect virtual experience” where the sun never stops shining (for one kind of holiday), or the snow is perfect (for another kind), there are no unruly (real) people around, and so on. (2) It is convenient – there is not the stress of traveling, it is significantly cheaper, there are no inconveniences. (3) Places could be visited that are not easily accessible (Mars is an extreme example). One could even travel in the past or to fantasy worlds. (4) People who are unable to travel because of illness or disability would easily be able to do so. (5) There are no risks – tropical diseases, accidents, and food poisoning. (6) There is no damage to the places visited. (7) Business travel could be simplified. However, Cheong (1995) goes on to discuss the reasons why this might not really be a threat – virtual immersion is not the same as really being there; it would be difficult in VR to engage in exchanges with the locals (like discussions in a market, learning to dance the Hula); there is a level of complexity and randomness in the real world that cannot be reproduced in VR; people might confuse reality and VR; and there would be problems with countries whose revenues depend greatly on tourism.

On the one hand, of course since 1995 tourism has not been replaced by VR (on the contrary – see the next section), but on the other hand, none of the objections above seem insurmountable (even revenue from tourism could be protected by some kind of royalty system). Moreover, as global warming becomes an increasingly serious prospect and threat, VR could provide a way of lessening some of the negative impact of travel. An article by Guttentag (2010) suggested that VR could be useful for tourism for planning, management, marketing, entertainment, education, providing accessibility to inaccessible places such as archeological sites (see Section 4.5) with consequent heritage preservation. However, Guttentag wondered whether VR could ever provide an alternative to real travel, emphasizing a point made in Cheong (1995) that VR may never be able to substitute basic sensory experiences – “the smell of ocean spray” or make virtual surfing feel like the real thing. In other words, at the end of the day will VR ever be technically up to the mark in providing a genuine substitute for the real experience?

In this section, we do not attempt to answer this question, since the answer cannot be known. Rather, we describe what has already been accomplished in this realm across a variety of applications that require some kind of travel. Perhaps, VR is not meant to be a substitute for real travel but just another form of travel, no less valid in its own terms than all that physically boarding the real aeroplane entails.

6.2. Remote Collaboration

The contribution of travel to the world economy is colossal. According to the World Travel and Tourism Council (WTTC, 2015), travel and tourism generated $7.6 trillion in 2014, amounting to 10% of global GDP. It also accounted for 10% of all jobs (277 million), with the travel economy growing faster than other sectors such as health, financial services, and automotive. See also the extensive statistics produced by the World Tourism Organization (UNWTO).⁸³ On the other side, travel comes with significant costs (Reford and Leston, 2011). The first obvious one is the potentially disastrous impact on the planet’s environment (Zhou and Levy, 2007), including the negative impact of air pollutants on health (e.g., Curtis et al., 2006; Kampa and Castanas, 2008); see, for example, the meta-analysis by Mustafić et al. (2012), which reports a clear relationship between many of the associated pollutants and the near-term risk of heart disease. A second problem concerns business travel in particular. In the US alone, $283B was spent on business travel in 2014.⁸⁴ However, such travel can be disruptive both to the business and the personal life of the traveler (Gustafson, 2012), including contributing to family conflict and burnout (Jensen, 2014). Nevertheless, for business (let alone personal and family relationships) face-to-face contact is thought to be essential. Even if face-to-face meetings can be substituted by one of the various forms of teleconferencing systems available, it has been suggested that these types of virtual meetings may even generate greater physical travel (Gustafson, 2012).

In an analysis of the relationship between air travel and the possibilities offered by videoconferencing in the past four decades, Denstadli et al. (2013) did not find a clear picture, and certainly no evidence that videoconferencing might substitute for air travel. Based on the analysis by Jones (2007), it is argued that face-to-face meetings are important for completing projects across international sites, maintaining commitment to strategic plans and shared organizational culture, knowledge sharing, creativity, and new services. There are of course related issues such as trust, using business meetings to get away from the office from time to time, taking the opportunity to meet friends or relatives in remote locations, and so on. Hence, face-to-face meetings seem to be essential, and interestingly it is precisely those who travel the most who engage in the most videoconferencing meetings. Hence, there is a complex relationship between the two. Nevertheless, in the study of Denstadli et al. (2013) (n = 1413), of those who had access to videoconferencing tools one-third said that they believed that some air travel could be replaced by videoconferencing. For example, probably some readers of this article would have experienced the situation of several hours of travel to attend or speak at a 1-h meeting and then to travel home shortly afterward – sometimes wondering what the point of it all might have been. Can VR be of benefit in this domain?

In this section, we briefly review the possibilities offered by immersive VR as a means for enabling remote communication and collaboration. We consider a virtual environment that is shared between multiple participants. Each participant is represented by a virtual body (an “avatar”) and can see the representations of the others. Ideally participants’ movements are tracked, they can move through the virtual environment, and can talk to one another. Hence, they are in a 3D stereo surrounding space along with others. Of course, there are several technical issues involved in how to realize such a system (Steed and Oliveira, 2009), such as how and where to distribute the computation (one master machine broadcasting to all the others or a distributed network?), how to keep the various participant environments synchronized with one another so that they are all able to perceive the same consistent environment etc., but these issues are not considered here. In its ideal form, such a system must be superior to videoconferencing – since for example, the latter cannot display spatial relationships, eye contact, and so on. However, an ideal form of a shared VR would require real-time full facial capture, eye tracking, real-time rendering of subtle emotional changes such as blushing and sweating, subtle facial muscle movements such as almost imperceptible eyebrow raising, the possibility of physical contact such as the ability to shake hands, or embrace, or even push, and so on. Such a system does not exist today, though it is one to strive for. Some of these capabilities might be realized with the type of VR referred to as 360° surround, but we defer the discussion of this to Section 7.2. In the following section, we review some of what has been achieved and what the likely prospects are.

6.3. Shared Virtual Environments

Probably, the first published work where more than one person could simultaneously inhabit the same virtual environment was presented by Blanchard et al. (1990). This was the VPL system that allowed two people each with their own HMD (Eye Phone) and data glove to be simultaneously copresent in a virtual environment. Over the next few years, there were many systems that provided this capability, typically extending it to multiple participants rather than two (Greenhalgh and Benford, 1995; Frécon and Stenius, 1998; Frécon et al., 2001), and today it is a matter of course that VR systems support this capability (Bierbaum et al., 2001; Tecchia et al., 2010), and VR development platforms of recent choice such as Unreal Engine or Unity3D are also multi-participant systems.

So, the capability for virtual environments shared by multiple participants has been around for a long time, supported by many platforms, and realized in massive online systems such as Second Life, although typically non-immersively. The work by Apostolellis and Bowman (2014) is a good recent illustration of collaboration in a learning context that was realized with screen-based displays. The early days of research in this area, apart from the technical issues of how to build systems, concentrated on exploiting the capabilities of VR to improve remote collaboration beyond what might be possible even in face-to-face communications – for example, the type of work reported in Benford and Fahlén (1993) and Koleva et al. (2001). However, the primitive representations of people (very crude block-like characters) due to the relatively limited graphics and processing power at the time made this of interest only in a research context.

Later work concentrated on exploring social dynamics within shared virtual environments. For example, the research described in Tromp et al. (1998); Steed et al. (1999); Sadagic and Slater (2000) and Slater et al. (2000) had three-person groups carry out a task together although they were physically in different places (including even different countries). This also compared the group dynamics in VR to real encounters and found that the dynamics were greatly influenced by the computational power and type of immersion. For example, the group leader that would emerge in VR was the one with an HMD rather than those interacting with the others on screen, but this same person was less likely to be the leader when the group met for real. Also, people were quite respectful of each other’s avatars, notwithstanding their extreme simplicity – for example, avoiding collisions and apologizing when collisions invariably happened. Steed et al. (2003) carried this further by having pairs of people, one in London, UK, and the other in Gothenburg, Sweden, each in a Cave-like system, spend around 3.5 h working together. Some of the pairs were friends, and some were strangers. They found that the partners could collaborate well on spatial tasks, where the avatars representing their whole bodies played an important role. However, on other negotiation tasks, where facial expression would be quite important to gauge the intentions of the other, the friends did better together than the strangers. A review of this type of avatar-mediated communications can be found in Schroeder (2011).

Although during the 2000s the graphics power to display more realistic human avatars in real time and in large numbers became available, the type of “ideal” system mentioned earlier still was far from possible. Nevertheless, researchers began to address critical aspects of non-verbal communications that can make remote face-to-face interactions in virtual environments effective, such as shaking hands (Giannopoulos et al., 2011; Wang et al., 2011). Steptoe et al. (2008) introduced eye tracking as a way to determine the gaze of each individual avatar in virtual meetings between three remote participants (one in London, one in Salford, and the other in Reading, UK) each in a Cave-like system. Analysis suggested that participants automatically used gaze direction much as they would in a similar conversation in reality. This was followed up by Steptoe et al. (2010) who showed that eye tracking data that allowed avatars to be rendered showing gaze direction, blinking, and pupil size resulted in participants being able to better detect one another telling lies compared to a videoconferencing system. This was between two participants in different physical places, one using a Cave-like system and the other a power wall. Another recent idea for remote collaborative working is for each party to use a whiteboard, where they would see a silhouette of the remote person, like a shadow, on the whiteboard. It was found that participants tended to act as if they were in the presence of the remote person (Pizarro et al., 2015). Although a lot of work on such avatar-mediated communication during this period took place using projection systems such as Caves, Dodds et al. (2011) used HMDs to embody two remote people in the same environment. They found that body tracking, in particular showing arm gestures, played an important role in bidirectional communication between the partners.
When, for example, the gestures of the avatar of one of the partners were replaced by prerecorded animations then the communication was not as successful in task achievement.

A combination of HMD and Cave system was used for a case study of remote acting, where two actors rehearsed a short scene using a script from The Maltese Falcon movie⁸⁵ (Normand et al., 2012a). One actor was in Barcelona wearing a full motion capture suit and a wide field-of-view high-resolution HMD. The other actor was in a Cave in London and had some level of body tracking (arm gestures). The two were in the same virtual environment and could see and hear the avatars representing the other. A director was in a separate room in London. He could see and hear the scenario on screen, and video of the director’s face was streamed in real time to both actors. Therefore, the director could communicate to the actors and tell them where to stand, what to say, how to improve their performance – generally act like a director.⁸⁶ The professional actor involved in London concluded that such a system could be used for remote acting rehearsal especially for aspects such as blocking concerned with spatial locations and movements of actors, lines of sight, and so on. This work was followed up by Steptoe et al. (2012), who again had an actor in Barcelona in VR, where she saw a virtual representation of the remote London scenario; she was represented to the other actor as a wall-screen avatar with a spherical display for her head, and the director was in the Cave. See also Steed et al. (2012) for a description of the technology. Observers from the Royal Academy of Dramatic Art commented on the positive potential uses of such a system for rehearsal and blocking, that is, the arrangements and lines of sight of actors at the different stages of a play. Of course, again the lack of facial expression shown on the avatars is a drawback in these types of system.

Another drawback is the lack of touch – if one participant touches the avatar of another then typically nothing would be felt. Bourdin et al. (2013) set up an application where two remote people wearing an HMD and body-tracking suit interacted with a third person (an experimenter) who was in a Cave, so that all three saw representations of one another in a shared virtual environment. The experimenter had the task of persuading the other two to sing together. As part of the persuasion, she could touch the avatars of the two participants on the shoulder, upon which they could feel a vibration from a small actuator located on their shoulder. Thus touch was used as part of the persuasion.⁸⁷ Earlier Bailenson et al. (2007) carried out experiments using haptic only virtual environments where they showed that touch helped in the communication of emotions between people, both with respect to recognizing emotions recorded as haptics earlier by others, and with respect to simultaneous communications between remote partners. Their paper also contains a review of the field and a theoretical model. Basdogan et al. (2000) using a haptic only environment carried out a series of experiments, which also found that haptic feedback could impart critical information in remote communications. This work culminated in a “hands across the Atlantic” experiment where remote participants, one in London, UK, and the other in Cambridge, MA, USA, carried out joint tasks together such as lifting an object that they saw on screen and using haptics to help in the communication between them (Kim et al., 2004). Apart from describing the technological issues involved in setting up such a system, the results showed that the haptic feedback improved the sense of copresence, that is, that the remote participants felt that they were together.

6.4. Virtual Beaming

One obvious way to introduce haptics into remote VR-enabled communication is to actually use physical representations of people in the form of remotely controlled robots. This was envisaged and implemented in the very early days of VR. Fisher et al. (1987) described a telerobotic control system developed at NASA Ames (CA, USA), where the participant wearing a head-tracked HMD and other tracking, audio, and tactile feedback equipment received visual input from the cameras mounted on a remote robot. The robotic body essentially visually substituted the person’s own body, therefore appearing to be colocated somewhat like the discussion of embodiment in Section 2.1.1. Recently, this idea of the symbiosis between a person in VR being represented remotely as a humanoid robot has seen some new applications as a particularly exciting form of remote collaboration where the participants are given physical form in the remote place. Here, the participant uses VR to perceive the remote location in full stereo with head- and body-tracking but is represented as a humanoid robot in the remote location. The humanoid robot moves as a function of the real-time body tracking of the participant, who can speak (through the robot) to local people in the remote location. It is a further and up-to-date realization of what was presented in Fisher et al. (1987) except now for the purposes of remote collaboration.

An example was shown in a BBC interview.⁸⁸ The BBC interviewer in London (Technology Correspondent Rory Cellan-Jones) interviewed a scientist in Barcelona who was fitted with a wide field-of-view head-tracked HMD and a body-tracking suit. She was represented as a humanoid robot that was in the same room as the journalist in London. Her movements captured by the motion capture suit were transmitted across the Internet to the robot and applied to it so that it moved almost synchronously and in correspondence with her. A Skype connection allowed her to speak through the robot, whose mouth opened and closed in sync with her speech. Cameras fitted as the eyes of the robot transmitted video back to the HMD, so that she saw the surrounding London environment in stereo. Since the HMD head tracking data were transmitted and applied to the robot head, she could look around the room in London and converse with the BBC interviewer. The technology used was described in Spanlang et al. (2013). The same technology was used to beam journalist Nonny de la Peña from Los Angeles (CA, USA) to Barcelona. In Los Angeles, she wore the body-tracking suit and HMD. She was represented as the humanoid robot in Barcelona. Embodied as the robot, she conducted a debate between three students on the issue of Catalan independence from Spain and also interviewed a scientist about his research on HIV.⁸⁹
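The data flow described above, in which tracked head and body motion is sent to the remote robot and applied to its joints, can be sketched in a few lines of Python. This is a minimal illustration and not the implementation described in Spanlang et al. (2013): the joint names, the joint limits, and the `retarget` function are hypothetical, and a real system would also have to deal with networking, timing, and differences in body proportions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JointLimit:
    """Allowed range of a robot joint, in degrees (hypothetical values)."""
    lo: float
    hi: float


# Hypothetical subset of joints that the robot can actuate.
ROBOT_LIMITS = {
    "head_yaw": JointLimit(-90.0, 90.0),
    "head_pitch": JointLimit(-30.0, 30.0),
    "left_elbow": JointLimit(0.0, 120.0),
}


def clamp(value, lo, hi):
    """Restrict value to the closed interval [lo, hi]."""
    return max(lo, min(hi, value))


def retarget(tracked_pose):
    """Map one frame of tracked joint angles to a robot command.

    Joints the robot does not have are dropped, and angles outside the
    robot's range are clamped to the nearest limit.
    """
    command = {}
    for joint, limit in ROBOT_LIMITS.items():
        if joint in tracked_pose:
            command[joint] = clamp(tracked_pose[joint], limit.lo, limit.hi)
    return command


# One frame from the (assumed) tracking system: the head is turned
# further than the robot can follow, and the wrist has no robot counterpart.
frame = {"head_yaw": 110.0, "head_pitch": -10.0, "left_elbow": 45.0, "left_wrist": 5.0}
command = retarget(frame)
print(command)
```

In this sketch the same mapping serves both the HMD head tracking and the body-tracking suit, since each simply contributes joint angles to the frame that is sent to the robot.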

The idea is reminiscent of “beaming” in Star Trek. Instead of a person being physically decomposed, transmitted to a remote place, and then recomposed there, a person in VR has their movements and speech transmitted to the remote place and applied to a humanoid robot, and sensory data – vision, sound, and touch – are transmitted back from the robot’s sensory apparatus to the person, who perceives them in VR. The locals in the remote place interact with the robot that is embodied by the beamer. The beamer, however, through the VR becomes present in the remote place. This has also been used by journalist Nonny de la Peña to beam from London, UK, to Barcelona to interview neuroscientist Dr. Perla Kaliman about food for the brain.⁹⁰ This journalism resulted in a news article about the results of the interview itself, rather than about the system used to realize it⁹¹ (Kishore et al., 2016).

The same kind of beaming setup has been used to create a shared environment between a small animal and a human. Normand et al. (2012b) showed a human participant in VR interacting with a virtual human, which in fact was a tracked rat in a cage 12 km away. Simultaneously, the rat interacted with a rat-sized robot, whose movements were in fact determined by the tracked movements of the remote human. Hence, each interacted with an entity at its own scale (the rat with a small robot, the human with a human-sized avatar), leading to interspecies communication. This type of setup is of value in ethology. In an article on animal geography and related issues, Hodgetts and Lorimer (2015) wrote in reference to this work that “… it is claimed that the human and the rat were able to participate in a purportedly playful meeting of species that seems straight from the pages of science fiction. Such experiments in adjusting scale do little to shift power dynamics in interspecies communication. Nor does the lab maze create anything more than a novel environment for encounter. Yet the prospect of engaging with animal worlds in more embodied, interactive and exploratory ways opens new avenues for developing richer accounts of animal lifeworlds.”

The issue of non-verbal communications is critical for face-to-face communications, and as we have mentioned above there are attempts to overcome this problem, for example, using eye tracking to animate the eyes of avatars. Telerobotics enables physical presence and to some extent the conveyance of body language, depending on the extent of body tracking and the capabilities of the robot; however, facial expression remains a problem, even though some robots can do this. Nevertheless, the subtle cues of which we are not consciously even aware in communication are not rendered. One way out of this problem has been explored through the combination of animatronics and “shader lamp” technology. Shader lamps project computer-generated images onto neutral objects so that observers would see the simple object as animated. In particular, an animated human face can be projected onto, for example, a spherical or egg-shaped object, thus making it appear as if the physical object were an animated face. Moreover, the face could be one that is captured by face-tracking or video from a remote person. Lincoln et al. (2009) proposed and implemented shader lamps for the faces of remote people projected onto animatronic puppets. The participant could be far away seeing the real surroundings of the puppet through a VR, and his or her face back-projected onto a shell, so that an observer of the puppet would see video of the real face of the distant person, and be able to interact with that person.⁹² Some research has suggested that this type of technology, where faces are displayed on physical objects, in this case a spherical display, can improve the aspects of trust in remote communications (Pan et al., 2014) (see Presentation S5 in Supplementary Material).

6.5. Interacting by Thought

The descriptions above of embodiment in remote robots through which social interaction can take place with distant people are reminiscent of movies such as Avatar (see text footnote 11) and Surrogates.⁹³ The fundamental difference is that whereas in the systems above people move their remote robotic bodies through their own deliberate movement (realized through real-time motion capture), in the vision presented in these movies, the remote representation is moved through a brain interface. The participant only has to think or imagine moving the remote body, and it moves the corresponding cyborg or robot body (in the movies perfectly) just as if they were moving their own real body. To a limited extent, this has been achieved today. For example, Millan et al. (2004) were able to control a mobile robot through non-invasive brain recordings, that is, through a brain–computer interface (BCI). Leeb et al. (2006) described their research with a tetraplegic patient who was able to use a BCI to navigate through a virtual environment presented in a Cave. He triggered his movement entirely by the voluntary production or halting of a specified electrical brain signal (EEG pattern).⁹⁴ The same motor-imagery paradigm was used for the voluntary control of an arm belonging to the participant’s virtual body (Perez-Marcos et al., 2009), resulting in an illusion of ownership over the virtual arm. BCI was used in a telepresence application for disabled patients by Tonin et al. (2011), although the patients did not see the remote environment via VR but rather video on a PC display. Nevertheless, this demonstrated the possibility. A survey of the use of BCI in VR and games was presented by Lécuyer et al. (2008).
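The idea of starting and stopping movement by voluntarily producing or halting a brain signal can be illustrated with a simple two-threshold switch. The Python sketch below is only an assumption-laden illustration, not the method of Leeb et al. (2006): it supposes that some upstream signal processing has already reduced each time window of EEG to a single feature value between 0 and 1, and the threshold values and the function name `movement_states` are hypothetical.

```python
def movement_states(feature_values, on_threshold=0.7, off_threshold=0.4):
    """Turn a sequence of per-window feature values into move/stop states.

    Movement starts when the feature rises to on_threshold or above and
    stops when it falls to off_threshold or below. Using two thresholds
    (hysteresis) avoids rapid toggling when the feature hovers near a
    single cut-off.
    """
    moving = False
    states = []
    for value in feature_values:
        if not moving and value >= on_threshold:
            moving = True
        elif moving and value <= off_threshold:
            moving = False
        states.append(moving)
    return states


# Hypothetical feature values for seven consecutive time windows.
windows = [0.2, 0.5, 0.8, 0.6, 0.45, 0.3, 0.75]
states = movement_states(windows)
print(states)
# [False, False, True, True, True, False, True]
```

Such a switch yields only a coarse go/stop command, which is consistent with the limited degree of control that non-invasive recordings currently allow.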

Martens et al. (2012) demonstrated that a number of whole body tasks could be realized by a participant wearing an HMD embodied in a remote robot controlled through various BCI paradigms. Participants could pick and place objects, and engage in a game. This study also illustrated how the BCI could be used to recognize the intentions of the participant (for example, pick up a glass) and the robot would execute and complete the intention (since non-invasive BCI today simply does not permit the fine control necessary).

The lack of fine motor control results from the fact that most BCI systems use non-invasive scalp electrodes, which record brain signals of low spatial resolution. For patients who cannot otherwise move, acting in the world through the motor control of a robot is a possibility that may justify (invasive) brain implants. Small electrodes placed in the cortical tissue record the activity of groups of neurons with higher spatial resolution, allowing the control of finer movements. Wessberg et al. (2000) first showed that direct recording from the neurons in monkeys enables them to control quite sophisticated movements of a remote robot arm without using their own real arm. A similar approach has been used in people with tetraplegia, who could successfully control robotic arms through brain implants (Hochberg et al., 2006, 2012). Moreover, depending on what the actuators may encounter, feedback can be used to stimulate appropriate groups of neurons that cause different tactile sensations. This was realized in monkeys by O’Doherty et al. (2011) where they were able to move a virtual arm that touched virtual objects distinguished only by their texture. Such technology could be used to drive prostheses that replace missing limbs, or exoskeletons that move actual but paralyzed limbs, or virtual bodies experienced in immersive VR or remote physical robots or cyborgs.

The latter possibility is the vision of Avatar and Surrogates. In each case, people perceive through the senses of their remotely embodied cyborg or robot and act in the world through those bodies. In John Scalzi’s novel Lock In⁹⁵ people suffering from “locked in syndrome” are present in the world through such robot embodiment. Although these are works of science fiction they are beginning now to be technically feasible and almost surely are going to be realized with the advance of neuroscience, VR, and robotic technology. For example, Kishore et al. (2014) showed how BCI could be used to embody people in a remote robot through which they could gesture and maintain a conversation with the people there.⁹⁶,⁹⁷

The “Embodiment Station” reported by Leonardis et al. (2014) was inspired by the setup in Surrogates. The Embodiment Station is a large chair that is a mobile platform that can induce force feedback (see text footnote 97 from minute 2:50). The participant is fitted with an HMD and has a multitude of physiological responses recorded and various different types of stimulation applied to his or her body. The participant may be embodied in a virtual body or remote physical body.

People in Avatar are shut into a tubular structure that monitors their brain and provides feedback so that they become embodied into a remote genetically engineered cyborg body. Cohen et al. (2014b) [see also Cohen et al. (2012)] show how to use real-time fMRI to decode particular thoughts of participants so that they are able to embody a virtual character⁹⁸ and control a remote robot thousands of kilometers away (Cohen et al., 2014a).⁹⁹ Although of course the degree of control and the level of embodiment are generations away from what is depicted in Avatar, it is nevertheless a clear step along the road toward this vision (see Presentations S6 and S7 in Supplementary Material).

6.6. Industrial Applications and Design

During the 25 years when VR was supposedly dead, or at best confined to university laboratories, industry was busy using it to develop products and to invent new methods of manufacturing, assembly, training, maintenance, and shopping. We briefly review some work in this area.

In a major review of the use of VR in car manufacture, Lawson et al. (2016) pointed out that VR can be used for design, avoiding the complex and expensive procedure of building physical mockups. With a mockup, any small change can result in major new work. Of course, VR is far more flexible in this regard. VR is also used for virtual manufacturing, which is part of the preparation, planning, and risk assessment in the manufacturing process, and is clearly also invaluable for training. VR can be used for learning the assembly and disassembly of parts. Data from an in-depth survey revealed that VR was being used for a number of aspects of design, manufacture, and evaluation: examining the look of the vehicle (including product reviews with clients), motion capture of manufacturing procedures, and reviews relating to ergonomic use of the vehicle.

There has been significant work on industrial assembly, training for maintenance and remote maintenance – for example, Gavish et al. (2011, 2015) and Seth et al. (2011). This is also enhanced by the possibility of mixed reality where a participant in a VR can see their own hands incorporated into the virtual environment (Tecchia et al., 2014; Sportillo et al., 2015).¹⁰⁰ Immersive VR is also being used for automobile testing.¹⁰¹

In another context, Tiainen et al. (2014) found that customers were equally at home in evaluating furniture presented virtually as physically. Indeed, they made more suggestions for design improvements in evaluations of the virtual products. Customers designing aspects of the interior of automobiles is also being prototyped using HMD-based VR.¹⁰²

Virtual reality has also been used in the clothing industry, where powerful computer graphics-based cloth simulators allow customers to virtually try on clothes on virtual representations of their own bodies (Hauswiesner et al., 2011; Magnenat-Thalmann et al., 2011; Sun et al., 2015). Although not yet used in an immersive way, such systems are bound eventually to become a normal part of shopping. Once we have our own body representations, trying on clothing in the comfort of our homes, without the inconvenience of traveling, queues, and fitting rooms, could be a major application.

A final example is a highly innovative potential application in the food industry. Ruppert (2011) describes how VR is used to study the behavior of shoppers in response to different kinds of packaging and layout in supermarkets. It is suggested that, where consumers want to buy healthier products, experimentation with different types of presentation could result in knowledge about how best to present such products so that they stand out for these types of consumer.

As argued by Lawson et al. (2016), VR can improve the prototyping, production, and evaluation processes in manufacture; it can also be part of the design process and, ultimately, of marketing. It also offers the possibility of consumers being involved in design and even designing aspects of the products that they will buy. In fact, VR combined with 3D printing could totally revolutionize how products are designed, manufactured, and delivered, giving enormous new power and possibilities to consumers¹⁰³ (see Presentations S8 and S9 in Supplementary Material).

7. News and Entertainment

We have already mentioned the potential benefits of VR for travel, for visiting remote relatives, and so on. Moreover, the use of VR in games is obviously going to be a huge area of application and one of the driving forces of the industry.¹⁰⁴,¹⁰⁵ There is a clear role also for immersive movies, where the participant plays a role within the story, somewhere between a game and a movie. These are such obvious applications of VR that we are not going to discuss them further here. The chances are that any person first learning of VR in 2016 will do so because of a game or movie. In this section, we therefore concentrate on a quite novel field that VR opens up, which is the immersive presentation of news. This is usually called “immersive journalism.” However, it is important to note that it is not the journalism that is immersive but the presentation of its results through immersive media, leading to the creation of a genuinely new type of medium for news reporting. We will consider the issues involved, including ethical issues, and finally discuss the differences between computer graphics-based VR and 360° video.

7.1. News and Immersive Journalism

The idea of immersive journalism is “the production of news in a form in which people can gain first-person experiences of the events or situation described in news stories” (de la Peña et al., 2010). Let’s consider the main headlines (online) of the Los Angeles Times on January 23, 2016 and see what this might mean.

7.1.1. Los Angeles Times January 23, 2016

If we compare the report with the VR version we can see that they reflect quite different purposes. In each row, the left side is the reporting of “news” (“Newly received or noteworthy information, especially about recent events,” Oxford English Dictionary). There are masses of academic research studies and theories of what makes it into “The News” (as reported by newspapers, radio, TV, and of course now myriad online outlets). Interested readers could read, for example, a classic analysis by Galtung and Ruge (1965), who identify a number of factors that influence what events typically get into the news, and a follow-up study by Harcup and O’Neill (2001), who examined the earlier theory in the light of a content analysis of stories in three British newspapers. The theory includes factors such as that events involving elite nations or persons are more likely to be newsworthy than those involving non-elites. For example, news in Western media is more likely to report on events in the USA, Europe, China, and Russia than in the Seychelles, except, for example, when events in other places directly affect those countries (e.g., events in the Middle East). The divorce of a movie star is far more likely to make it into the news than the divorce of your next-door neighbor (unless you happen to live next to a movie star). However, who decides what is important? This reflects another aspect of news: events are not just “out there” floating around, happening and then being selected by journalists according to some criteria and reported factually. Rather, it is an active process where what is news is defined by journalists and the multifarious interests and ideologies that make up particular media cultures (O’Neill and Harcup, 2008). For example, a President attends an important international event. If the President is a man, the reporting may focus on the event and its background. If the President is a woman, a great deal of attention may be instead paid to her clothing.¹⁰⁶,¹⁰⁷ News values can differ enormously between different organizations. What makes it into the equivalent of the left side of each row in the table above, and how it is reported, are not simply matters of fact.

Now considering the possible immersive VR versions there is quite a difference – the goal is not so much the presentation of “what happened” but to give people experiential, non-analytic insight into the events, to give them the illusion of being present in them. That presence may lead to another understanding of the events, perhaps an understanding that cannot be well expressed verbally or even in pictures. It reflects the fundamental capability of what you can experience in VR – to be there and to experience a situation from different perspectives. This is no more or less “objective” than news in traditional forms – what is selected, and how it is presented, inevitably will reflect the interests, culture, and political views of the journalists involved, and perhaps even more importantly their news organizations. There is no way around that, since what might be “news” is infinite, and something has to be selected.

Moreover, how news in VR will be understood will also be actively shaped by the participant. Recall that in VR there are neither “users” nor “observers” but participants or consumer-participants. Even if you are just an observer without the actual ability to intervene, presence in VR is such that you will likely have the perception that ongoing events could affect you. Hence, the consumer of a news story in one medium becomes a participant in the virtual story in the other, the “immersive journalism” that creates a scenario to represent aspects of the news story in VR. However, there is a difference. Let’s go back to the woman President attending an event. A VR rendition of this puts you in the scene in the 1PP of someone who attended and who was greeted by the President. She moves over to you, smiles, and says some words of greeting: to you. Assuming that the journalist had made every effort in visual reconstruction to be faithful to the original event, whether the clothes that the President is wearing stand out or not depends wholly on you, the perceiver. You may pay attention to them or not, you may see them as remarkable or not. If the journalist wanted to really point out to participants the clothing worn by the President, this is of course entirely possible in VR – whether openly or surreptitiously. However, if the goal is to try to be objective, then how certain aspects of the events are interpreted will depend more on the perceiver than on the designer. We will come back to some of these points later.

The first immersive journalism piece was developed in 2010 in Barcelona, Spain, and directed by journalist Nonny de la Peña with the help of digital artist Peggy Weil. It followed on from the idea of their 2009 interactive Second Life piece that portrayed a virtual Guantánamo Bay prison.¹⁰⁸ The immersive news story was displayed in a Wide5 HMD by Fakespace (see text footnote 18) and incorporated body tracking. It established a pattern that was to be used by Nonny de la Peña in later productions, which was to use a mix of data from actual events combined with a computer graphics-based reconstruction. It relied on transcripts of the interrogation of Detainee 063, Mohammed Al Qahtani, at Guantánamo Bay Prison in 2002–2003. The scenario was in a single cell-like room, and the participant was embodied in a virtual character wearing an orange “jump-suit.” From a 1PP, the participant’s virtual body posture was shown in a stress position – one reportedly used for “harsh interrogations.” The participant could see the virtual body either directly, by looking toward his or her own body, or in a virtual mirror. However, in fact the participant was seated comfortably in a chair. The participant would hear an interrogation as if coming from a cell next door.¹⁰⁹ A case study (de la Peña et al., 2010) was carried out with three participants, who were interviewed after their experience. All reported that even though they were seated comfortably, they felt discomfort, even pain, from the posture of their virtual body. This result, that the posture of the virtual body can actually influence feelings of comfort or discomfort of participants, has recently received new support (Bergström et al., 2016) (see text footnote 54). The three participants felt a foreboding that the interrogation in the next cell would soon shift to them. Although the participants had not been given any forewarning of the meaning of the event that they were to experience, one of them said: “During the experience I was kind of reminded of the news that I heard about the Guantánamo prisoners and how they feel and I really felt like if I were a prisoner in Iraq or some… war place and I was being interrogated.” This illustrates the difference between the left column (traditional reporting of news) and right column (news in VR) in the Table above. The left column might be a written piece about harsh interrogation methods, or a TV news piece illustrating aspects of this. But, on the right hand side there is experience. Of course, this is not the real experience, but may give participants insight into how some aspects of the situations depicted might have been.

“Hunger in Los Angeles”¹¹⁰ was a subsequent piece by Nonny de la Peña. This puts participants in a food line in Los Angeles where one of the people in the queue faints due to diabetes, and the various characters around react. It was based on an actual event and blended real sound recordings with computer graphics. The virtual characters in the food line were animated through the motion capture of actors. It was experienced by hundreds of people at the Sundance Film Festival in 2012. The 2014 World Economic Forum featured “Project Syria” by de la Peña, which depicted a bomb explosion in a Syrian town and its aftermath (see text footnote 110). This followed the same pattern of being based on an actual event and starting from video and audio from the real scenario. Further pieces on the same lines are “One Dark Night”¹¹¹ about the shooting of teenager Trayvon Martin and “Kiya” about an incident of domestic violence and murder¹¹² (recall the fifth item in the table above).

An alternative to using computer graphics to reconstruct events is the use of 360° video. A scenario is captured by using a special camera and subsequent software to patch video together to form a completely surrounding scene that can be displayed in an HMD. Due to head tracking, the viewer can look all around the scene, and depending on how it has been captured, it can also be displayed in stereo. We will return to the technology in Section 7.2. This is therefore an alternative way of displaying events immersively.

“Waves of Grace”¹¹³ by Gabo Arora (Senior Advisor and Filmmaker, United Nations) and Chris Milk (Vrse.works) uses this technique to recreate the true story of a survivor of Ebola in Liberia. They also created “Clouds over Sidra,” a documentary about a child refugee in the Syrian war.¹¹⁴ Louis Jebb (founder) and Edward Miller (head of visuals) of Immersiv.ly use 360° video to create immersive news events. Some examples have been the coverage of unrest in Hong Kong¹¹⁵ and a 360° VR experience of the paintings of the artist Gretchen Andrew on a self-guided interactive tour of a computer-generated recreation of the De Re Gallery in Los Angeles.¹¹⁶ The Des Moines Register, working with Dan Pacheco, produced “Harvest of Change,”¹¹⁷ a documentary combining computer graphics-generated VR and 360° video that can be viewed in an Oculus HMD and provides an in-depth study of the situation of farmers in Iowa. The New York Times has started VR news based on 360° video, using Google Cardboard as the means of display, and has created a number of stories with this technology.¹¹⁸ The BBC is also experimenting with 360° HMD-based news,¹¹⁹ for example, providing experience of the refugee crisis.¹²⁰

At the same time as the great enthusiasm for VR in this domain,¹²¹ there are also warnings about its ethics. For example, in an excellent and comprehensive article on potential problems, Tom Kent (Standards Editor, Associated Press and Columbia University) urges “an ethical reality check for virtual reality journalism.”¹²² The first point concerns the depiction of reality. For example, “Hunger in Los Angeles” was a reconstruction using computer graphics for the display. It was not the real thing. It is important for consumer-participants to always be made aware of this, and it should form part of the ethics code being devised by digital journalists.¹²³ However, it is important to note that all journalistic reporting necessarily involves transformation and cannot possibly ever depict every aspect of reality. At the moment that the news camera focuses on the face of a politician, it of course misses everything else that is happening at the same time, some of which may change the meaning of the facial expression. Depicting any event with its infinite aspects and nuances in any medium whatsoever necessarily involves a transformation. As we argued above, starting from what is selected to how it is portrayed involves myriad choices. VR is no different in this regard. It can be argued that in VR a journalist could, for example, change the facial expression of a protagonist from a friendly smile (as it was in reality) to an arrogant grin. This could happen deliberately or by accident. However, how different is this from taking a small sentence in a speech of a politician out of context, thus distorting its meaning away from that intended? The use of VR requires ethical standards no more or less than conventional news reporting.

Another point relates to 360° video-based pieces, where there is an issue of image integrity. Since the Associated Press does not allow manipulation of images, should particularly disturbing parts of a scene on a battlefield or bomb site be left in or not? Again, this is nothing special for VR. Of course a 360° view is less selective than a single camera shot or normal video shot. There are conventions where images are “distorted” though – such as blurring the faces of vulnerable people in order to protect them. It is not clear why such conventions could not be applied in the same way. This is nothing really to do with VR. As we argued in Section 1.1, VR is a medium where conventional approaches will eventually be overtaken by a new paradigm. Today, shooting a 3D movie inevitably draws on the conventions of traditional movie making, so that problems of inclusion are paramount, since 360° in principle shows “everything.” New paradigms will eventually overcome this problem.

The third point is that there may be competing views of what happened in any event, so VR portraying one version may not reflect the diversity of views. This also has nothing to do with VR. In fact, VR may have an advantage that it is possible to relive a scenario from multiple points of view – from the viewpoints of different protagonists, which may sometimes even explain why they describe an event quite differently. The 1950 Japanese movie “Rashomon”¹²⁴ received international acclaim for doing this – depicting a story from the multiple points of view of the characters involved. Another version was released in 1964 called “The Outrage.”¹²⁵ VR could excel in such multi-viewpoint recreations.

Tom Kent argues that since VR is excellent for producing empathy and identification with characters who may be experienced as being physically close to consumer-participants, journalists have a special responsibility to make sure that their piece is balanced. For example, if they have the goal of producing sympathy toward particular people or situations they could emphasize aspects that provoke empathy or leave out balancing information that could be inconvenient to their story. This is of course true, but again it applies no less to conventional media. It could be argued though that VR is particularly adept at raising emotions and therefore unwitting consumer-participants might be more easily manipulated. This may be true. For example, we have seen in Section 2.1.2 how embodying White people in a Black body appears to reduce their implicit racial bias against Black people (Peck et al., 2013). However, we also saw in Section 4.4 that in a fight between two virtual characters about soccer teams, only participants who supported the same team as the victim tended to try to intervene to stop the fight (Slater et al., 2013). People did not change their behavior simply as a result of being near a virtual character that was attacked by another. In other words, people are not like sponges and do not just soak up whatever emotion is poured into them. In the racial bias example, participants were generally not explicitly biased, so in reducing their implicit (i.e., largely non-conscious) bias perhaps they were being helped toward realizing their own non-biased preferences. Imagine a VR scenario that placed a United States Democrat supporter into a Republican rally or an English vociferously anti-European voter into the heart of the Brussels decision-making community. Is either of these likely to change their views as a result? Of course, research is needed on this issue, but people should not be considered as empty vessels ready to be filled by whatever propaganda comes along. At the end of the day if a journalist wants to present a particular viewpoint they will do so with whatever means they have, so that the critical requirement is openness, information about potential distortions, and appropriate ethical standards.

The final main point made in the article by Tom Kent is that the virtual environment is a circumscribed world, and of course the scenario is embedded in a wider world in which other related events may be happening. On the one side, the VR gives the impression to participants that they can freely go wherever they want, but of course the specific virtual environment has boundaries outside of which nothing can be perceived. This is a problem of selection, applying no less to other news media. When you are reading a story in a newspaper is it the whole story? Of course not, and it never can be.

Arguments about the ethics of VR miss the point that it is not the only way or even the “best” way to deliver news (or indeed any story at all, whether supposedly real or fictional). Just as VR is not going to replace novels in the form of books, it is not going to replace traditional media. It is another medium, another method for the production and display of narrative, providing a different kind of “information,” providing a different kind of emotional engagement. These are not “better” or “worse” but just different. You can read about the refugee camp at Calais in France full of people wanting to enter the UK, or you can visit there virtually,¹²⁶ or really go there. Each of these will provide quite different information and responses. One may give facts and figures and talk about policy and implications for the future of the European Union, another may show the physical and emotional plight of particular people in that camp. Visiting the camp virtually might lead someone already so inclined to do something to try to help the individuals concerned, but not necessarily result in a change in their political convictions about immigration. What is important is that all types of journalism follow ethical standards, and this applies no matter what the medium (see Presentation S10 in Supplementary Material).

7.2. 360° and VR

There is some discussion about whether 360° video as has been used in some of the pieces described above is “really” VR. For example, Will Smith in an article in Wired¹²⁷ argued that systems such as 360° video as might be seen through Google Cardboard should not be called “VR,” the main argument being that the relationship between head moves and image changes is more likely to lead to simulator sickness in 360°. However, this battle has already been lost. Mainstream media are already referring to 360° video as VR, and that is not going to change.

In order to consider this question, we return to the concept of “immersion” discussed in Section 1.3. Immersion refers exclusively to the technical affordances of a system. Different types of immersion may give rise to different types of subjective experience, but this is a different issue. One system is “more immersive” than another if the first can be used to simulate the second. This arranges all systems into what mathematicians call a “partial order.” It is partial because not all pairs can be classified in this way – there may be two systems where neither can be used to simulate the other.¹²⁸ Now, if we consider 360° VR as video captured in a real setting and displayed in a head-tracked HMD then that can, in principle, be entirely simulated by a computer graphics rendering of the same scene, but not vice versa. By a graphics rendering of the scene we mean one based on a computer model (the model ultimately describes all the geometry, material properties, lighting, and dynamics of objects in the scene). Since there is a model, participants can change their point of view to anywhere within the scene. For example, they can move close to any object and then circle around it while observing it. If the viewpoint is restricted to only a few specific points, where from those points the viewer can turn around and look 360°, then this is equivalent to 360° VR. However, 360° VR cannot allow participants the full range of movement through the scene, to be able to observe any object arbitrarily from any angle.

In normal vision based on natural sensorimotor contingencies, when we see one object obscuring another, we can move our head and in principle see completely behind the obscuring object. This can be done with correct perspective and head movement parallax in graphics-based VR. In 360° video, it cannot be done, or only to a very limited extent. Graphics-based VR can be restricted to simulate 360° video, but not vice versa. Therefore, there is a fundamental technical difference that will always persist by definition between 360° and model-based VR, and technically model-based VR has greater immersion in this classification of systems.
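The ordering by simulation can be illustrated with a small sketch. It assumes, purely for illustration, that each system is summarized by a set of technical affordances and that one system can simulate another when its set contains the other's. This set-inclusion model, the system names, and the affordance labels are all hypothetical simplifications rather than definitions taken from the literature.

```python
# Illustrative sketch: immersion as a partial order over systems.
# ASSUMPTION: each system is reduced to a set of affordances, and
# "A can simulate B" is approximated by set inclusion. The system names
# and affordance labels are hypothetical, chosen only for this example.

SYSTEMS = {
    "model_based_hmd": frozenset(
        {"look_around_360", "head_parallax", "free_navigation"}
    ),
    "video_360_hmd": frozenset({"look_around_360"}),
    "desktop_walkthrough": frozenset({"free_navigation"}),
}


def can_simulate(a: str, b: str) -> bool:
    """Return True if system a offers every affordance of system b."""
    return SYSTEMS[b] <= SYSTEMS[a]


def compare(a: str, b: str) -> str:
    """Classify a pair of systems under the simulation ordering."""
    a_sim_b = can_simulate(a, b)
    b_sim_a = can_simulate(b, a)
    if a_sim_b and b_sim_a:
        return "equivalent"
    if a_sim_b:
        return f"{a} is more immersive"
    if b_sim_a:
        return f"{b} is more immersive"
    return "incomparable"


if __name__ == "__main__":
    print(compare("model_based_hmd", "video_360_hmd"))
    print(compare("video_360_hmd", "desktop_walkthrough"))
```

In this toy model the first pair is ordered, while the second pair is incomparable because neither system covers the other's affordances. Pairs of that kind are what make the order partial rather than total.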

Ultimately, this means that they are useful for different purposes. If the VR is meant to depict something up-close and personal, such as interaction with a virtual character where the participant and virtual character might be arbitrarily changing their positions in the space, then this cannot be accomplished by 360°, since this type of parallax effect (e.g., just moving the head to see behind the character) just is not possible, unless every possible move that the participant was going to make was determined in advance and camera data made available for these possibilities. On the other hand, for a large-scale scene such as witnessing street protests as in Immersiv.ly’s Hong Kong protests mentioned above, then 360° is sufficient. Provided that the designers did not intend the possibility for a participant to move up close to any arbitrary protestor for one-on-one unplanned interaction then this is fine.

Therefore, we would conclude that model or graphics-based VR and 360° VR are different possibilities in the domain that is referred to as “virtual reality,” and designers and application builders will use the type of system that fits best with their goals. For close-up interaction, 360° will quickly break the natural sensorimotor contingencies that are necessary for the generation of presence. On the other hand, for large-scale scenes looking at objects far enough away, 360° is not only the simpler form of construction and rendering, but it is good enough in terms of sensorimotor contingencies. It is not a matter of either one or the other; both have their role. A major worry of Will Smith is that one would be confused with the other, and that people with poor experiences in 360° will therefore label “virtual reality” as poor. Sensible and careful use of both types of technology where they are most appropriate would avoid this possibility.
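Why distance matters here can be seen from simple geometry. The sketch below assumes a viewer who shifts the head sideways by a small amount while looking at an object straight ahead. The direction to the object should then change by an angle of atan(shift / distance), a change that 360° video captured from a single fixed viewpoint cannot reproduce. The 10 cm head shift and the two object distances are illustrative assumptions, not values from any study.

```python
import math


def parallax_angle_deg(head_shift_m: float, distance_m: float) -> float:
    """Angular change (degrees) in the direction to an object straight ahead
    when the head moves sideways by head_shift_m.

    With 360° video captured from a single fixed point, the image does not
    respond to head translation, so this angle is the mismatch between what
    is shown and what natural vision would provide.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return math.degrees(math.atan(head_shift_m / distance_m))


if __name__ == "__main__":
    # Hypothetical values: a 10 cm sideways head movement.
    for distance in (0.5, 50.0):  # a nearby character vs. a distant crowd
        angle = parallax_angle_deg(0.10, distance)
        print(f"object at {distance:5.1f} m: missing parallax {angle:.2f} degrees")
```

Under these assumptions the missing parallax is about 11 degrees for an object half a meter away and about a tenth of a degree for an object 50 m away, which is consistent with 360° video being adequate for distant scenes but not for close-up interaction.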

It should be noted that it is not the model-based solution in itself that is important here, but what it offers in terms of natural sensorimotor contingencies for perception. There will eventually be other solutions that are not model-based but offer the same. One likely solution will be based on light fields (Levoy and Hanrahan, 1996; Ng et al., 2005), which attempt to fully simulate the propagation of light through an environment, and therefore allow a viewer to dynamically move anywhere within a scene. The problem is that dynamic changes to objects, and especially changing lights, cannot easily be supported. Some recent developments for HMDs based on light field displays are discussed in Lanman and Luebke (2013).

8. Conclusion

8.1. Recent Novel Ideas and Applications

In this article, we have mainly reviewed developments in VR that have taken place since its origins in the 1980s, focusing on applications, and especially those with outcomes that have some level of research support. The field is changing extremely rapidly, and the inventiveness of people is amazing, with new ideas and projects emerging daily. Here, we briefly list some recent ideas that have caught our attention (as of May 2016). Mostly, these are ideas in progress, with no results, or maybe not even any level of implementation. They are presented in random order.

Mark Zuckerberg: Virtual Reality Might Be Coming to Your Baby Photos

https://www.youtube.com/watch?v=rACZOac1w8w

The idea that VR may be used to share photos immersively.

Dreams of Dali

http://thedali.org/dreams-of-dali/

A VR experience based on Dali’s 1935 painting Archeological Reminiscence of Millet’s “Angelus.”

Visualizing Big Data

http://www.mastersofpie.com/project/winners-of-the-big-data-vr-challenge-set-by-epic-games-wellcome-trust/

How “big data” in particular a longitudinal social survey can be explored in HMD-based VR.

Topshop – London Fashion Week

https://www.inition.co.uk/case_study/virtual-reality-catwalk-show-topshop/

Attend the show using VR.

A History of Cuban Dance

http://with.in/watch/a-history-of-cuban-dance/

A 360° VR documentary.

Second Life in VR

http://www.bizjournals.com/sanfrancisco/blog/techflash/2016/01/second-life-second-act-virtual-reality-sansar.html

San Francisco Business Times reports “In virtual reality, Second Life prepares for its second act.”

Megadeth in VR

https://www.youtube.com/watch?v=PnQAz8jWAh0

A YouTube documentary about Megadeth bringing heavy metal to VR.

In the eyes of the Animal

http://www.sundance.org/projects/in-the-eyes-of-the-animal

A Sundance Festival winner showing views of how the world might look to various animals.

Virtual Reality in Court

http://www.popsci.com/jurors-may-one-day-visit-crime-scenes-using-forensic-holodecks

A Popular Science report “Scientists Want To Take Virtual Reality To Court – Jurors May One Day Visit Crime Scenes Using Forensic Holodecks.”

Project Nourished – A Gastronomical Virtual Reality Experience

http://www.projectnourished.com

“You can eat anything you want without regret.”

Curing Cataract Blindness

http://www.ndtv.com/world-news/virtual-reality-could-be-the-next-big-thing-in-curing-cataract-blindness-1269591

NDTV report “Virtual Reality Could Be The Next Big Thing In Curing Cataract Blindness.”

Oculus Quill

https://www.youtube.com/watch?v=kPHWHJNTlkg

Drawing in VR.

Producer of Acclaimed “First” Sets Sights on Anne Frank VR Experience

http://www.roadtovr.com/producer-of-acclaimed-first-sets-sights-on-anne-frank-vr-experience/

Plans for a historical VR reconstruction of aspects of the life of Anne Frank.

Step inside the Large Hadron Collider (360 video)—BBC News

https://www.youtube.com/watch?v=d_OeQxoKocU&index=1&list=PLS3XGZxi7cBXqnRTtKMU7Anm-R-kyhkyC

“A 360 tour of CERN that takes you deep inside the Large Hadron Collider—the world’s greatest physics experiment—with BBC Click’s Spencer Kelly.”

And so on…

8.2. General Considerations

We have reviewed numerous applications of VR, many of which were already envisioned or developed in its earlier forms in the 1980s and 1990s and have been more extensively developed and tested in the last 25 years. In most cases, the societal reach has been restricted, given that the VR systems used (in combination or not with robotics, tracking, etc.) were too costly to move out of the research laboratories and reach consumers. There has nevertheless been significant testing and validation of potential applications in many different areas.

This article has shown that the applications of VR are very extensive and range across numerous domains of knowledge. This means that even though most people will probably experience VR as a consumer product for games and entertainment, all advances and developments in VR will also have an impact in more specialized research and professional fields. More affordable systems will extend its reach not only to final consumers but also to more developers and research groups, resulting in a much wider range of applications and content for VR in the near future.

Even though applications in psychology, medicine, education, or research will reach many, there are some sectors of the population that may also benefit directly from VR: those with reduced mobility for any reason, lesions, neurological disorders, or aging. To such people VR may provide a new space to move freely, interact, or work. This could be achieved by acting in VR through various means including motor action, BCIs, eye tracking, or physiological responses.

Finally, we also point out that since the use of VR in these many application realms should be evidence-based, scientific papers should adhere to the highest standards of rigor and reporting. In the hundreds of papers we have reviewed in the preparation of this article, there are many that do not even say what type of equipment was being used. The term “virtual reality” has been overused: scientific papers are often simply talking about a PC display with a mouse, and the reader has to look very hard through the paper in order to discover that – if it is stated at all.

8.3. Speculations – “I’ve seen things …”

“I’ve seen things you people wouldn’t believe; attack ships on fire off the shoulder of Orion. I watched C-beams glitter in the dark near the Tannhäuser Gate. All those moments will be lost, in time, like tears in rain. Time to die.” (Replicant Roy Batty, near the closing scene of the movie Blade Runner).¹²⁹

In the introduction to this article, we defined our notion of “immersion” as the “physics” of a system – how well it can afford people real-world sensorimotor contingencies for perception and action. We pointed out that this also offers a way of ordering systems – where one system is “more immersive” than a second if the first can be used to simulate experiences on the second, but not vice versa. We used this classification, for example, to show that model-based VR is “more immersive” than 360° VR, so that these have different functionality and uses.
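The ordering just described is a partial order rather than a total one: some pairs of systems may simply be incomparable. As a minimal sketch only, using the coordinate-pair example given in the footnotes as a stand-in for systems (the function names `strictly_less` and `comparable` are illustrative assumptions, not part of any VR framework), the idea can be expressed as:

```python
def strictly_less(a, b):
    """Return True if pair a is strictly less than pair b in both components."""
    (x, y), (z, w) = a, b
    return x < z and y < w


def comparable(a, b):
    """Return True if the two pairs are ordered one way or the other."""
    return strictly_less(a, b) or strictly_less(b, a)


print(strictly_less((1, 2), (3, 4)))  # True: 1 < 3 and 2 < 4
print(comparable((1, 2), (0, 3)))     # False: neither pair dominates the other
```

By analogy, two VR systems are ordered only when one can simulate the other but not vice versa; otherwise, like (1, 2) and (0, 3), they are not ordered with respect to each other.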

Yet, this raises a paradox. Immersive VR simulates experiences of physical reality. Does that mean that VR is more “immersive” than reality? Like any paradox, this helps us to understand the underlying concepts. There must always be some aspect of the VR that does not conform with reality. This is certain. Why? Because were it not the case then what the participant experiences would be his or her reality! This is not word play but rather illustrates a fundamental aspect of VR. The reader may respond – “Yes, but it is only a matter of time before the graphics, sound, tracking, haptics, etc. become so advanced that people will not be able to distinguish a VR experience from a real one, just like nowadays it is becoming difficult to distinguish pictures or videos that are photographs of real world scenes from those that are wholly generated by graphics.” However, in order for the VR to be indistinguishable from reality, the participant would have to not remember that they had “gone into” a VR system. Even if the devices become almost completely transparent and just a part of normal clothing, still the participant has to not know, in other words, has to forget that this is VR, has to forget pressing the button, or having the right thought in a BCI that commands: “Now put me into VR.” If it goes so far that they do not remember getting into VR and they consider that they are directly perceiving physical reality, then they are perceiving their own physical reality.

When we think of VR, we are typically thinking about experiences in the visual and auditory domains, rather than haptics (touch and force feedback). The field of haptics has excellent solutions for specific types of interaction, such as pushing a needle through soft tissue (as in medical applications), or using an exoskeleton to apply force feedback to an arm. However, unlike the visual and auditory fields, there is no generalized solution. By a generalized solution we mean a single device whereby participants in a VR can feel anything (just as a display can be programmed to display anything), for example, feel something when their virtual body accidentally brushes against a virtual wall or fall backwards when hit by a tidal wave of virtual water. As argued by Slater (2014), solutions to such issues may well have to go down the route of direct brain interfaces to solve such fundamental problems in a general way that can never be solved with external devices, which in the haptics domain always provide very specific stimuli. VR would become an applied branch of neuroscience in this view. Since, as we and others have argued before, our notion of reality is a constructed one, by activating the appropriate brain areas, our perception in this type of VR based on direct neural intervention would be indistinguishable from perception of “reality.” As the philosopher Thomas Metzinger has pointed out,¹³⁰ we are about to embark on an enormous process of new learning through mass availability of VR: “The real news, however, may be that the general public will gradually acquire a new and intuitive understanding of what their very own conscious experience really is and what it always has been” – that our conscious experience is one possible model – an interpretation – of the world.

Now, let us imagine the perfect VR system with perfect immersion, so perfect that for most people it is completely indistinguishable from reality – it is their reality (recall that they must not remember that they “went into VR” and likewise they must not know when they “come out of VR”). Again seemingly paradoxically in such a situation the notion of presence vanishes. There is no sense of presence in physical reality. Presence is the feeling of being transported to another place. This is why our notion of “place illusion” as “being there” includes the rider “…in spite of the fact that you know for sure that you are not actually there.” It contains an element of surprise: “I know I am at home wearing a HMD, but I feel as if I am in the Himalayas.” In physical reality, there is no perceptual surprise, no feeling “Wow! Look at that, it is amazing that I am here!” (except, for example, as a way of expressing good fortune at being in a fabulous place). We are just “here.” We do not comment on it or think about it from the perceptual point of view – only sometimes at the content of our perception – the scenery or surprising events. There is no special or remarkable feeling associated with being in a place. It is how things always are. The only time we might feel something unusual is when some aspect of our perception breaks – for example, through mental illness, hallucinogens, the aftermath of an injury – where we find ourselves outside of the reference frame of our normal perception. In the movie The Matrix,¹³¹ almost everyone was living in perfect immersion, perfect VR. They only became aware of “presence” (i.e., that their world was illusory) at moments when the system failed.

Hence, the illusion of presence actually represents the non-perfection of immersion. On one side, as we improve immersion more and more through technical advances, what this means in terms of “presence” is that the “wow” factor, the sensation of the difference between where we know ourselves to be and where we feel ourselves to be, i.e., the level of illusion, will become stronger and stronger. The shock of putting on the HMD and seeing an alternate reality in high-resolution, all around, with fantastic vision, sound, haptics, smell, taste, and full body tracking will become overwhelming. But, on the other side, when immersion becomes perfect – to the point that we do not in any way distinguish between perception of reality and VR even to the extent of not knowing when we are perceiving from one rather than the other – then presence will disappear.

However, it is also possible that the surprise element of “presence” will disappear for another reason. Imagine the generation that grows up where VR is just as much part of their lives as cell phones are today. Although they will distinguish reality from VR, their illusion of presence may diminish because the surprise element will disappear through acclimatization. Older generations today still marvel at being able to have real-time video connections at virtually zero added cost with people halfway around the world, but a younger generation that is growing up with that finds it completely unremarkable. So, this new generation that grows up with VR will of course have the illusion of “being there” in VR, but it will be nothing special, and therefore there will be all the more reason that they will tend to behave the same in VR as they do in reality in similar circumstances. It will be like: Now I am at home. Now I am at school. Now I am in place X in VR. They will become equivalent perceptually, cognitively, and behaviorally. But, just as kids learn “Don’t run in the school corridor,” “Don’t shout in the classroom,” so they will learn different forms of behavior that apply to different places in different modes of reality. VR will have its own customs, norms of behavior, and politeness. Today all we can say is that however we imagine this might be – it won’t be like that, since it will be the result of an unpredictable and complex product of technological advance and social evolution.

We have used the term “presence” slightly loosely here. Recall that there are two components: PI (resting as a necessary condition on sensorimotor contingencies) and Psi (the illusion that events are real). The latter is just as critical and maybe more difficult to get right in many applications. For example, in a real street we might avoid parking our car because we see a police officer standing nearby. On closer inspection we realize that the police officer is actually a manikin dummy. So we park. This is a failure of Psi of the dummy. In VR, we are enjoying talking to a very nice virtual person. Eventually, we realize that the virtual person is going through some repetitive actions and is not actually aware of what we are doing. We move away. This is a failure of Psi, even though our illusion of being in the place is intact. Both PI and Psi are critical components of successful VR applications.

Virtual reality, however, can deliver forms of Psi that have never existed in reality and yet still lead to the illusion of these happening. In Slater et al. (1996), we put people in a VR where they could play 3D chess (like in Star Trek). Not one person was shocked or made any comment about the fact that when they touched the chess pieces these would float in the virtual space to their next location. When asked about this one participant said: “Oh that’s just how things behave in this reality.” So Psi is a difficult concept. In some circumstances, expectations cannot be broken. In others VR can create new expectations that seem completely natural even though they could never happen in physical reality. This is something really worth understanding, and it is connected to our final point.

Virtual Reality encompasses virtual unreality. Almost all the applications we have reviewed, and a lot of what we see, translate something from reality into VR. A fear of heights application puts people … on a height. A fear of public speaking application puts people … in front of an audience. These are fine. However, maybe there are completely new ways to think about these types of applications that make use of the amazing power to put people outside of the bounds of reality and have a positive effect. Even though VR has been around for half a century, still not enough is known about it. The goal is to shape it to create moments that enhance the lives of people and maybe help secure the future of the planet.

And those moments need not be lost.¹³²

Author Contributions

All the authors listed have made substantial, direct, and intellectual contribution to the work and approved it for publication.

Conflict of Interest Statement

The authors were approached by the company Facebook to write an article on potential applications of VR. After completion, the article was subject to a review by the Facebook legal team. There was neither implicit nor explicit encouragement to promote or favor any Facebook products or services. The authors were free to write about virtual reality as they wished. The work is a review of virtual reality in general and not related to any particular products, software, or services.

Acknowledgments

Thanks to James Hairston of Oculus for his support of this work. In addition, the authors thank the following people who have provided images or video that appear in the Supplementary Presentations: Abderrahmane Kheddar, Aitor Rovira, Albert ‘Skip’ Rizzo, Anatole Lécuyer, Angus Antley, Anthony Steed, Antonio Frisoli, Barbara Rothbaum, Christoph Guger, Daniel Freeman, Doron Friedman, Emmanuele Tidoni, Ferran Argelaguet, Franck Multon, Franco Tecchia, Greg Welch, Henry Fuchs, Henry Markram, Hunter Hoffman, Jeremy Bailenson, Jordi Moyes Ardiaca, Larry Hodges, Louis Jebb, Lucia Valmaggia, Mark Huckvale, Nonny de la Peña, Pablo Bermell, Pere Brunet, Rafi Malach, Robert Riener, Salvatore Aglioti, Stephen Ellis, Sylvie Delacroix, Will Steptoe, Xueni (Sylvia) Pan, Yiorgos Chrysanthou, and Zillah Watson.

Funding

This work was funded by Oculus VR, LLC, a Facebook Company.

Supplementary Material

The Supplementary Material for this article can be found online at http://journal.frontiersin.org/article/10.3389/frobt.2016.00074/full#supplementary-material.

Footnotes

^http://www.technologyreview.com/view/421293/whatever-happened-to-virtual-reality/ though see also http://science.nasa.gov/science-news/science-at-nasa/2004/21jun_vr/ from NASA Ames, 2004.
^https://www.youtube.com/watch?v=NtwZXGprxag&feature=youtu.be
^http://humansystems.arc.nasa.gov/groups/acd/projects/hmd_dev.php
^http://www.mortonheilig.com/InventorVR.html
^https://www.youtube.com/watch?v=3L0N7CKvOBA
^https://www.youtube.com/watch?v=fs3AhNr5o6o
^https://www.youtube.com/watch?v=ACeoMNux_AU
^Though see Project Nourished: http://www.projectnourished.com
^https://www.youtube.com/watch?v=QEKxyhSPiVg
^https://www.youtube.com/watch?v=lmHEQRVJzBI
^http://www.avatarmovie.com/index.html
^http://www.gutenberg.org/files/5200/5200-h/5200-h.htm
^https://youtu.be/x5-TPXIzKuI
^https://www.youtube.com/watch?v=TCQbygjG0RU
^https://www.youtube.com/watch?v=4PQAc_Z2OfQ
^https://www.youtube.com/watch?v=ee4-grU_6vs
^https://www.youtube.com/watch?v=EyujFtuFWvo
^http://www.fakespacelabs.com/Wide5.html
^https://www.youtube.com/watch?v=3wg14z5O9Ug
^https://www.youtube.com/watch?v=ydzSgLim5Y4
^https://www.youtube.com/watch?v=HliN3iOX090
^https://www.youtube.com/watch?v=8Oy83OVgbSM
^https://www.youtube.com/watch?v=sn-UNGcbi2Q
^http://www.cog.brown.edu/research/ven_lab/research.html
^http://www0.cs.ucl.ac.uk/research/equator/projects/escience/
^https://www.youtube.com/watch?v=tFtpmOBt7jY
^https://www.youtube.com/watch?v=_UFOSHZ22q4
^https://www.youtube.com/watch?v=ldXEuUVkDuw
^https://www.youtube.com/watch?v=_N-BAv3Hz8k
^http://www.goodreads.com/book/show/83539.Fantastic_Voyage
^http://www.imdb.com/title/tt0060397/
^http://www.goodreads.com/book/show/83545.Fantastic_Voyage_II
^https://www.youtube.com/watch?v=PLqlTaT3Bgk
^https://www.youtube.com/watch?v=UxUZIHAJ2H4
^https://www.youtube.com/watch?v=iiGzNGlnYJ4
^https://www.youtube.com/watch?v=sSRzeGkhUic
^https://www.youtube.com/watch?v=iK3GsAcwKaI
^https://www.youtube.com/watch?v=JEsV5rqbVNQ
^https://www.youtube.com/watch?v=mlYJdZeA9w4
^https://www.youtube.com/watch?v=m4Oeu4SLCgY
^http://www.o2.co.uk/sponsorship/rugby/wear-the-rose
^http://news.sky.com/story/1222817/oculus-rift-headset-may-help-sports-training
^http://www.telegraph.co.uk/technology/news/10621480/Virtual-reality-headset-recreates-England-rugby-squad-training-experience.html
^http://www.telegraph.co.uk/technology/technology-topics/10681570/Virtual-reality-training-session-with-England-rugby-squad.html
^http://bleacherreport.com/articles/2563010-stanfords-new-virtual-reality-system-is-changing-sports-forever
^https://www.youtube.com/watch?v=hXOQsXFcWnk
^https://www.youtube.com/watch?v=RM9IT_N6jFE
^https://archive.org/details/SciterianTechnologiesMars3D_CahokiaPanorama-VirtualReality
^https://www.youtube.com/watch?v=xDqYz5pKA_o
^An online search of “Oculus” and “Mars” will find many “prototype” examples of people experimenting with rendering and walking through a Mars terrain in VR.
^https://www.youtube.com/watch?v=sKz0FVIeEFI
^https://www.youtube.com/watch?v=Wy4Ku2iZjQM
^https://www.youtube.com/watch?v=cN7W0VBi0jo
^https://www.youtube.com/watch?v=P9OXRDc3flU
^https://youtu.be/D4KgWpta7YI
^https://www.youtube.com/watch?v=NrRRKZRGZbE (“Can virtual reality be used to tackle racism?” Report by Melissa Hogenboom, BBC Click).
^E.g., http://nymag.com/scienceofus/2015/10/theres-a-new-film-about-the-milgram-experiment.html
^In the period of January 1 to May 2, 2016 there were more than 100 articles published that reference the Milgram work.
^E.g., https://www.youtube.com/watch?v=fCVlI-_4GZQ
^https://youtu.be/RjUNg3pkEag
^https://en.wikipedia.org/wiki/Murder_of_Kitty_Genovese. See also a recent New York Times article following the death in prison of the murderer http://www.nytimes.com/2016/04/05/nyregion/winston-moseley-81-killer-of-kitty-genovese-dies-in-prison.html?_r=0
^https://www.youtube.com/watch?v=yspbUFhzGC0 (experiment scenario – bleeped out swearing).
^https://www.youtube.com/watch?v=11NH0K23nEM (BBC TV report about bystander experiment).
^http://en.unesco.org/themes/protecting-our-heritage-and-fostering-creativity
^http://portal.unesco.org/en/ev.php-URL_ID=13637&URL_DO=DO_TOPIC&URL_SECTION=201.html
^http://whc.unesco.org/en/list/208
^http://whc.unesco.org/en/list/23
^http://whc.unesco.org/en/list/
^http://www.tholos254.gr/projects/miletus/index-en.html. (This also links to a 360° virtual tour).
^https://www.youtube.com/watch?v=U00bmFyipNw
^https://www.youtube.com/watch?v=DZx8NqjIgF4
^https://www.youtube.com/watch?v=e-l2BMStRcg
^https://graphics.stanford.edu/software/scanview/
^https://www.youtube.com/watch?v=iiuZznpHyPs&feature=youtu.be
^https://www.youtube.com/watch?v=bOpf6KcWYyw (a cartoon exposition of the trolley problem).
^http://www.moralsensetest.com/experiment/originaldilemmas.html (a survey at Harvard University).
^https://www.youtube.com/watch?v=yk_hftGBHy4
^http://www.bbc.co.uk/programmes/p00k9drg
^https://www.youtube.com/watch?v=M2aorOAY8o8
^https://www.youtube.com/watch?v=05jSp63-W7c&list=PLjjzAm1HXwJOFD6aG9vCYHL4cFoYef6ya
^https://www.youtube.com/watch?v=KhcnvdKbHrM&feature=youtu.be
^See an example from Marriott https://travel-brilliantly.marriott.com/our-innovations/oculus-get-teleported
^http://www2.unwto.org
^https://www.ustravel.org/research/travel-industry-answer-sheet
^http://www.imdb.com/title/tt0033870/
^https://www.youtube.com/watch?v=c9bLWQhbJz0
^https://www.youtube.com/watch?v=gc8ySZHZLC0
^http://www.bbc.com/news/technology-18017745
^https://www.youtube.com/watch?v=FFaInCXi9Go (in Catalan and English).
^https://www.youtube.com/watch?v=I58wF9f3_a0
^The news article was published in Latino LA and focused solely on the substantive issue of food for the brain, rather than the system that was used for the interview: http://latinola.com/story.php?story=12654
^https://www.youtube.com/watch?v=eQLr83Co-GI
^https://www.youtube.com/watch?v=UGwQ74cH5O0
^https://www.youtube.com/watch?v=cu7ouYww1RA
^http://us.macmillan.com/lockin/johnscalzi
^https://www.youtube.com/watch?v=iGurLgspQxA
^https://www.youtube.com/watch?v=XUg990uZjEo
^https://www.youtube.com/watch?v=PeujbA6p3mU
^https://www.youtube.com/watch?v=pFzfHnzjdo4
^https://www.youtube.com/watch?v=3Q3ZC124Qbc
^https://www.youtube.com/watch?v=EP0olmaL4Xs
^https://www.youtube.com/watch?v=TOx4q711dY8
^https://www.youtube.com/watch?v=6nHw4RsNJ3Q
^http://www.cnet.com/news/virtual-reality-is-taking-over-the-video-game-industry/
^https://storystudio.oculus.com/en-us/henry/
^http://www.telegraph.co.uk/news/worldnews/europe/germany/9427863/Double-take-Angela-Merkel-steps-out-in-same-dress-she-wore-to-same-event-four-years-ago.html
^http://www.ft.com/intl/cms/s/2/10369810-aeaf-11e3-aaa6-00144feab7de.html#slide0
^http://www.immersivejournalism.com/gone-gitmo/
^https://www.youtube.com/watch?v=_z8pSTMfGSo
^https://www.youtube.com/watch?v=SSLG8auUZKc
^http://www.emblematicgroup.com/#/one-dark-night/
^http://www.emblematicgroup.com/#/kiya/
^http://vrse.works/creators/chris-milk/work/waves-of-grace/
^https://www.youtube.com/watch?v=FFnhMX6oR1Q
^http://www.hongkongunrest.com/vr-player.html
^http://virtualrealityderegallery.com
^http://www.desmoinesregister.com/pages/interactives/harvest-of-change/
^http://www.nytimes.com/newsgraphics/2015/nytvr/
^http://bbcnewslabs.co.uk/projects/360-video-and-vr/
^http://www.bbc.co.uk/taster/projects/we-wait
^http://www.nytimes.com/2016/01/21/opinion/sundance-new-frontiers-virtual-reality.html?hp&action=click&pgtype=Homepage&clickSource=story-heading&module=mini-moth&region=top-stories-below&WT.nav=top-stories-below&_r=0 (NYT Feature “Where Virtual Reality Takes Us”).
^https://medium.com/@tjrkent/an-ethical-reality-check-for-virtual-reality-journalism-8e5230673507#.ftgz6i1v3
^https://ethics.journalism.wisc.edu/resources/digital-media-ethics/
^http://www.imdb.com/title/tt0042876/
^http://www.imdb.com/title/tt0058437/
^http://www.fastcompany.com/3053219/fast-feed/virtual-reality-journalism-is-coming-to-the-associated-press
^http://www.wired.com/2015/11/360-video-isnt-virtual-reality
^For example, we can say that coordinate (x, y) is “less than” (z, w) if x < z and y < w. This defines a partial order over the set of all such coordinates. (1, 2) is less than (3, 4), but there is no order between (1, 2) and (0, 3).
^http://www.warnerbros.com/blade-runner
^http://edge.org/response-detail/26699 Edge “Virtual Reality Goes Mainstream: A Complex Convolution.”
^http://www.warnerbros.com/matrix
^https://www.youtube.com/watch?v=NoAzpa1x7jU&feature=youtu.be (“I’ve seen things …” Blade Runner).

References

Abulrub, A.-H. G., Attridge, A. N., and Williams, M. (2011). “Virtual reality in engineering education: the future of creative learning,” in Global Engineering Education Conference (EDUCON), 2011 IEEE (Amman: IEEE), 751–757.

Ahn, S. J., Le, A. M. T., and Bailenson, J. (2013). The effect of embodied experiences on self-other merging, attitude, and helping behavior. Media Psychol. 16, 7–38. doi: 10.1080/15213269.2012.755877