PERSPECTIVE article

Front. Virtual Real., 20 October 2021
Sec. Virtual Reality and Human Behaviour
Volume 2 - 2021 | https://doi.org/10.3389/frvir.2021.728461

Hand Tracking for Immersive Virtual Reality: Opportunities and Challenges

Gavin Buckingham*

Department of Sport and Health Sciences, University of Exeter, Exeter, United Kingdom

Hand tracking has become an integral feature of recent generations of immersive virtual reality head-mounted displays. With the widespread adoption of this feature, hardware engineers and software developers are faced with an exciting array of opportunities and a number of challenges, mostly in relation to the human user. In this article, I outline what I see as the main possibilities for hand tracking to add value to immersive virtual reality as well as some of the potential challenges in the context of the psychology and neuroscience of the human user. It is hoped that this paper serves as a roadmap for developing best practices in the field as subsequent generations of hand tracking and virtual reality technologies emerge.

Introduction

Immersive virtual reality (iVR) systems have recently seen huge growth due to reductions in hardware costs and a wealth of software use cases. In early consumer models of the Oculus Rift Head-Mounted Display (HMD), interactions with the environment (a key hallmark of iVR) were usually performed with hand-held controllers. Hands were visualized (infrequently) in games and applications in a limited array of poses based on finger positions inferred from contact with the triggers and buttons on these controllers. Although it was possible to visualize the positions of individual digits with external motion tracking and/or “dataglove” peripherals which measured finger joint angles and rotations, these technologies were prohibitively expensive and unreliable without careful calibration. A step change in hand tracking occurred with the Leap Motion Tracker, a small encapsulated infra-red emitter and optical camera developed with the goal of letting people interact with desktop machines by gesturing at the screen. This device was very small, required no external power source, and was able to track the movements of individual digits in three dimensions using a stereo camera system with reasonable precision (Guna et al., 2014). Significant improvements in software, presumably through a clever use of inverse kinematics, along with a free software-development kit and a strong user base in the Unity and Unreal game engine communities, led to a proliferation of accessible hand-tracking add-ons and experiences tailor-made for iVR. Since then, hand tracking has become embedded in the hardware of recent generations of iVR HMDs (e.g., the first and second iterations of the Oculus Quest) through so-called “inside-out” tracking, and looks set to continue to evolve with emerging technologies such as wrist-worn electromyography (Inside Facebook Reality Labs, 2021). This paper will briefly outline the main use cases of hand tracking in VR, and then discuss in some detail the outstanding issues and challenges which developers need to keep in mind when developing such experiences.
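For readers unfamiliar with the term, inverse kinematics simply means recovering joint angles from an observed endpoint position. The snippet below is a minimal illustrative sketch, assuming a toy two-segment planar finger with made-up segment lengths; commercial trackers fit a far richer articulated 3-D hand model to the camera images, with many additional constraints.

```python
import math

def two_segment_finger_ik(x, y, l1=0.04, l2=0.03):
    """Joint angles (radians) for a simplified two-segment planar finger
    whose tip should reach the point (x, y), given segment lengths l1 and l2
    in metres. Illustrative only: real trackers fit a full 3-D hand skeleton."""
    d_sq = x * x + y * y
    # Law of cosines gives the distal joint angle; clamp for unreachable targets.
    cos_q2 = (d_sq - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    q2 = math.acos(max(-1.0, min(1.0, cos_q2)))
    # Proximal joint: direction to the target minus the offset induced by q2.
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2

# Example: fingertip observed 5 cm forward of and 2 cm below the knuckle.
print(two_segment_finger_ik(0.05, -0.02))
```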

Opportunities–Why Hand Tracking?

Our hands, with the dexterity afforded by our opposable thumbs, are one of the canonical features which separate us from non-human primates. We use our hands to gesture, feel, and interact with our environment almost every minute of our waking lives. When we are prevented from, or limited in, using our hands, we are profoundly impaired, with a range of once-mundane tasks becoming frustratingly awkward. Below, I briefly outline three significant potential benefits of having tracked hands in a virtual environment.

Opportunity 1–Increased Immersion and Presence

The degree to which a user can perceive a virtual environment through the sensorimotor contingencies they would encounter in the physical environment is termed “immersion” (Slater and Sanchez-Vives, 2016). The subjective experience of being in a highly-immersive virtual environment is known as “presence”, and recent empirical evidence suggests that being able to see one’s tracked hands animated in real time in a virtual environment is an extremely compelling method of engagement (Voigt-Antons et al., 2020). Research has shown that we have an almost preternatural sense of our hands’ positions and shapes when they are obscured (Dieter et al., 2014), and when our hands are removed from our visual worlds it is a stark reminder of our disembodiment. Indeed, we spend the majority of our time during various mundane tasks foveating our hands (Land, 2009), so removing them from the visual scene presumably has a range of consequences for our visuomotor behaviour.

Opportunity 2–More Effective Interaction

The next point to raise is that of interaction. A key goal of virtual reality is to allow the user to interact with the computer-generated environment in a natural fashion. In its simplest form, this interaction can be achieved by the user moving their head to experience the wide visual world. More modern VR experiences, however, usually involve some form of manual interaction, from opening doors to wielding weapons. Accurate tracking of the hands potentially allows for far more precise interactions than would be possible with controllers, adding not only to the user’s immersion (Argelaguet et al., 2016; Pyasik et al., 2020), but also to the accuracy of their movements (Vosinakis and Koutsabasis, 2018), which seems particularly key in the context of training (Harris et al., 2020).

Opportunity 3–More Effective Communication

The final point to discuss is that of communication, and in particular manual gesticulation–the use of one’s hands to emphasize words and punctuate sentences through a series of gestures. “Gestures” in the context of HCI has come to mean the swipes and pinching motions used to perform commands. However, the involuntary movements of the hands during natural communication appear to play a significant role not just for the listener, but also for the communicator, to such an extent that conversations between two congenitally blind individuals contain as many gestures as conversations between sighted individuals (Iverson and Goldin-Meadow, 1998; Özçalışkan et al., 2016). Indeed, recent research has shown that individuals are impaired in recognizing a number of key emotions in images of bodies which have the hands removed (Ross and Flack, 2020), highlighting how important hand form information is in communicative experiences. The value of manual gestures for communication in virtual environments is compounded given that veridical real-time face tracking and visualization is technically very difficult due to the extremely high temporal and spatial resolution required to detect and track microexpressions. Furthermore, computer-generated faces are particularly prone to large uncanny-valley-like effects, whereby faces which fall just short of being realistic elicit a strong sense of unease (MacDorman et al., 2009; McDonnell and Breidt, 2010). Significant recent strides have been made in tracking and rendering photorealistic faces (Schwartz et al., 2020), but the hardware costs are likely to be prohibitive for the current generation of consumer-based VR technologies. Tracking and rendering of the hands, with their large and expressive kinematics, should thus be a strong focus for communicative avatars in the short term.

Challenge 1–Object Interaction

Our hands are one of our main ways to effect change in the environment around us. Thus, one of the main reasons to visualise hands in VR is to facilitate and encourage interactions with the virtual environment. From opening doors to wielding weapons, computer-generated hands are an integral part of many game experiences across many platforms. As outlined above, these manual interactions are typically generated by reverse-engineering interactions with a held controller. For example, on the Oculus Quest 2 controller, if the buttons underneath the index and middle fingers are lightly depressed, the hand appears to close slightly; if the buttons are fully depressed, the hand closes into a fist. Not only does this method of interacting with the world feel quite engaging, but it also elicits a greater sense of ownership over the seen hand than a visualization of the held controller itself (Lavoie and Chapman, 2021). Despite the compelling nature of this experience, however, hand tracking offers the promise of a real-time veridical representation of the hand’s true actions, requiring no mapping of physical to seen actions and untethered from any extraneous hardware. Anecdotally, interacting with virtual objects using hand tracking feels imprecise and difficult, an impression supported by recent findings showing that, during a block-moving task, hands tracked with a Leap Motion tracker scored lower on the System Usability Scale than hands tracked with a hand-held controller (Masurovsky et al., 2020). Furthermore, subjective Likert ratings on a number of descriptive metrics suggested that the controller-free interaction felt significantly less comfortable and less precise than the controller-based interactions. Even more worryingly, this same article noted that participants performed worse on a number of performance metrics when their hands were tracked with the Leap than with the controller.
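To make the controller-based approach concrete, a minimal sketch of this kind of mapping is given below, assuming a controller that reports analog trigger and grip values between 0 and 1; the function names, field names, and blend weights are illustrative assumptions rather than the API of any particular SDK.

```python
def controller_to_finger_curl(index_trigger: float, grip: float) -> dict:
    """Map analog controller inputs (0.0 = released, 1.0 = fully pressed)
    to per-finger curl values used to blend a hand animation.
    Hypothetical sketch -- not the interface of any particular SDK."""
    clamp = lambda v: max(0.0, min(1.0, v))
    index_trigger, grip = clamp(index_trigger), clamp(grip)
    return {
        "thumb":  0.2 + 0.6 * grip,   # thumb tucks in as the grip closes
        "index":  index_trigger,      # index finger follows the front trigger
        "middle": grip,               # remaining fingers follow the grip button
        "ring":   grip,
        "little": grip,
    }

# Lightly squeezing both buttons partially closes the hand;
# fully depressing them produces a fist-like pose.
print(controller_to_finger_curl(0.3, 0.3))
print(controller_to_finger_curl(1.0, 1.0))
```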

It is likely that the main reason that controller-free hand tracking is problematic during object interaction is the lack of tactile and haptic cues in this context. Tactile cues are a key part of successful manual actions, and their removal impairs the accuracy of manual localization (Rao and Gordon, 2001), alters grasping kinematics (Whitwell et al., 2015; Furmanek et al., 2019; Ozana et al., 2020; Mangalam et al., 2021), and affects the normal application of fingertip forces (Buckingham et al., 2016). While controller-based interactions with virtual objects do not deliver the same tactile and haptic sensations experienced when interacting with objects in the physical environment, the vibro-tactile pulses and the mass of the controllers do seem to help scaffold a compelling percept of touching something. A range of solutions to replace tactile feedback in the context of VR have been developed in recent years. From a hardware perspective, solutions range from glove-like devices which provide tactile and force feedback to the digits (Carlton, 2021), to fingertip devices which precisely deform the skin to create a sensation of the mechanics of interaction (Schorr and Okamura, 2017), to devices which deliver contactless ultrasonic pulses aimed at the hands to simulate tactile cues (Rakkolainen et al., 2019). Researchers have also used a lower-cost mixed reality solution known as “haptic retargeting”, where an individual interacts with a single physical peripheral and the apparent position and orientation of the hands are subtly manipulated to create the illusion of interacting with a range of different objects (Azmandian et al., 2016; Clarence et al., 2021). It is currently unclear which of these solutions (or one hitherto unforeseen) will solve this issue, but it is clearly a major challenge for the broad uptake of immersive virtual reality.
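A minimal sketch of one common “body warping” variant of haptic retargeting, in the spirit of Azmandian et al. (2016), is given below; the offset schedule and variable names are illustrative assumptions rather than the published implementation. The rendered hand is offset by an amount that grows with reach progress, so the warp is imperceptible at the start of the movement and exactly cancels the mismatch between the physical prop and the virtual object at the moment of contact.

```python
import numpy as np

def warped_virtual_hand(real_hand, reach_start, physical_prop, virtual_object):
    """Offset the rendered hand so it arrives at the virtual object exactly
    when the physical hand arrives at the single physical prop.
    All arguments are 3-D positions (numpy arrays, metres)."""
    full_reach = np.linalg.norm(physical_prop - reach_start)
    progress = 1.0 if full_reach < 1e-6 else np.clip(
        np.linalg.norm(real_hand - reach_start) / full_reach, 0.0, 1.0)
    # Warp grows with reach progress: zero at the start of the reach,
    # equal to the prop/object mismatch at the moment of contact.
    return real_hand + progress * (virtual_object - physical_prop)

# Example: one prop at (0.3, 0, 0.4) stands in for a virtual object at (0.35, 0, 0.45).
start = np.array([0.0, 0.0, 0.0])
prop = np.array([0.3, 0.0, 0.4])
obj = np.array([0.35, 0.0, 0.45])
print(warped_virtual_hand(np.array([0.15, 0.0, 0.2]), start, prop, obj))
```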

Challenge 2–Tracking Location

With “inside-out” cameras in current consumer models (e.g., the Oculus Quest 2), hand tracking is at its most reliable when the hands are roughly in front of the face, presumably to maximise the overlap of the fields of view of the individual cameras which track the hands. In these headsets, the orientation of these cameras is fixed, presumably on the assumption that participants will be looking at what they are doing in VR. This assumption is probably appropriate for discrete game-style “events”–it is well-established that individuals foveate the hands and the action endpoint during goal-directed tasks (Desmurget et al., 1998; Johansson et al., 2001; Lavoie et al., 2018). In more natural sequences of tasks (e.g., preparing food), however, the hands are likely to spend a significant proportion of time in the lower visual field due to their physical location below the head. This asymmetry in the common locations of the hand during many tasks was discussed in the context of a lower visual field specialization for manual action by Previc (1990) and has received support from a range of studies showing that humans are more efficient at utilizing visual feedback to guide effective reaching toward targets in their lower visual field than in their upper visual field (Danckert and Goodale, 2001; Khan and Lawrence, 2005; Krigolson and Heath, 2006). This behavioural work is supported by evidence from the visual system of a lower visual field specialization for factors related to action (Schmidtmann et al., 2015; Zhou et al., 2017), as well as neuroimaging evidence that grasping objects in the lower visual field preferentially activates a network of dorsal brain regions specialised for planning and controlling visually-guided actions (Rossit et al., 2013). As the range of tasks undertaken in VR widens to include more natural everyday experiences where the hands might be engaged in tasks in the lower visual field, limitations of tracking and visualization in this region of space will likely become more apparent. Indeed, this issue is not only one of tracking, but also of hardware field of view. Currently, the main focus regarding field of view is on increasing its lateral extent, with little consideration given to the fact that the “letterbox” shape of most VR HMDs reduces the vertical field of view in the lower visual field by more than 10% compared to that which the eye affords in the physical environment (Kreylos, 2016; Kreylos, 2019). Together, these issues of tracking limitations and physical occlusion are likely to result in unnatural head movements during manual tasks to keep the hands in view, which could limit the transfer of training from virtual to physical environments, or significantly disrupt immersion as the hands disappear from peripheral view at an unexpected or inconsistent point.

Challenge 3–Uncanny Phenomenon and Embodiment

The uncanny phenomenon (sometimes referred to as the uncanny valley) refers to the lack of affinity, yielding feelings of unease or disgust, when looking at, or interacting with, something artificial which falls just short of appearing natural (Mori, 1970; Wang et al., 2015). The cause of this effect is still undetermined, but recent studies have suggested that it might be driven by mismatches between the apparently-biological appearance of the offending stimuli and non-biological kinematics and/or inappropriate features such as temperature and surface textures (Saygin et al., 2012; Kätsyri et al., 2015). The main triggers for the uncanny valley seem to be in the realms of computer-generated avatars (MacDorman et al., 2009; McDonnell and Breidt, 2010) and interactive humanoid robots (Destephe et al., 2015; Strait et al., 2017) and, as such, much of the research into this topic has focussed on faces. Recent studies have suggested that this effect is amplified when experienced through an HMD (Hepperle et al., 2020), highlighting the importance of this factor in the context of tracked VR experiences.

Little work has, by contrast, examined such responses toward hands. In the context of prosthetic hands, Poliakoff et al. (2013, 2018) demonstrated that images of life-like prosthetic hands were rated as more eerie than anatomical or robotic hands in equivalent poses. This effect appears to be eliminated in some groups with extensive experience (e.g., in observers who themselves have a limb absence), but is still strongly experienced by prosthetists and non-amputees trained to use a prosthetic hand simulator (Buckingham et al., 2019). Given the strong possibility of inducing a presence-hindering effect if virtual hands are sufficiently disconcerting (Brenton et al., 2005), it seems prudent to recommend outline or cartoon hands as the norm for even strongly-embodied VR experiences. This suggestion is particularly important for “untethered” HMDs, because rendering photorealistic images of hands tracked at the high frequencies required to visualize the full range of dextrous actions will require significant computing power. A final point in this regard which also bears mention is that the uncanny valley is not a solely visual experience, but a multisensory one. For example, it has been shown that users’ experience of presence in VR rapidly declines when the visual cues in a VR scenario do not match the degree of haptic feedback (Berger et al., 2018). Furthermore, it has recently been shown that mismatches between the artificiality of tactile cues and visual cues can also generate a reduction in feelings of ownership (D’Alonzo et al., 2019). Thus, if tactile cues are to become a feature of hand tracking and visualization, care must be taken to avoid this so-called “haptic uncanny valley” (Berger et al., 2018).

A more general issue than hedonic perception which developers must grapple with is so-called “embodiment”–the sense of ownership one feels toward an effector that one is controlling. This term is usually discussed in the context of a body part or a tool, so it has clear implications for hand tracking in VR (Kilteni et al., 2012), and it is usually measured either through subjective questionnaires or through ostensibly objective measures of felt body position and physiological responses to threat. Anecdotally, the dynamic and precise experience of viewing computer-generated hands which are being tracked yields an extremely strong sense of embodiment which does not require a lengthy period of training or induction. In the context of virtual hands presented through an HMD, the literature suggests that embodiment happens naturally with realistic and veridical stimuli. Pyasik et al. (2020) have shown that participants feel stronger levels of ownership toward 3-D scans of their own hand than toward an artificially-smoothed and whitened hand. Furthermore, it has been shown that feelings of embodiment are enhanced when the virtual hands appear to be connected to the body rather than disembodied (Seinfeld and Müller, 2020). At the time of writing, however, much work remains to be done to build up a comprehensive picture of which visual factors are required to balance embodiment, enjoyment, and effective interaction with virtual environments.

Challenge 4–Inclusivity

Inclusivity is an increasingly important ethical issue in technology (Birhane, 2021), and the development of hand tracking and visualization in iVR throws up a series of unique challenges in this regard. A fundamental part of marker-free hand tracking is segmenting the skin from the surrounding background to build, and ultimately visualize, the dynamics of the hand. One potential issue which has not received explicit consideration is that of skin pigmentation. There are a number of recent anecdotal reports (Fussell, 2017), framed around hardware limitations, of items from automatic soap dispensers to heart-rate monitors failing to function as effectively for individuals with darker skin tones (which are less reflective) than for those with lighter skin tones (which are more reflective). It is critical that, as iVR is more widely adopted, the cameras which track the hands are able to adequately image all levels of skin pigmentation.

A related issue comes from the software which is used to turn the images captured by the cameras into dynamic models of the hands, using models of possible hand configurations (inverse kinematics). These models, assuming they are built from training sets, are likely to suffer from the same algorithmic bias which has been problematic in face classification research (Buolamwini and Gebru, 2018), where datasets largely derived from Caucasian males yielded startling disparities in levels of misclassification across skin type and gender. This issue becomes one not just of skin pigmentation, but of gender, age, disability, and skin texture, and it will presumably be exacerbated at the intersections of these characteristics. Any hardware and software which aims to cater for the “average user” risks leaving hand tracking functionally unavailable to large portions of society. One possible solution could be to have users generate their own personalised training sets, akin to the personalized “voice profiles” used in some speech recognition software and home assistant devices.

The final issue on this topic relates to the visualization of the hands, and links back to the discussion of embodiment in the section above. Although the current norm for hand visualization is outline or cartoon-style hands which lack distinguishing features, there will presumably be a drive toward the visualization of more realistic-looking hands. As is becoming standard for facial avatars in computer-generated environments, it is important that individuals are able to develop a virtual model which steps away from the “default” of an able-bodied Caucasian male or female toward one which accurately represents their own bodily characteristics (or, indeed, those of another). Mismatches can be jarring–for example, it has been shown that the appearance of opposite-gender hands reduces women’s experience of presence in virtual environments (Schwind et al., 2017). With hands, this is also likely to be particularly important from an embodiment perspective, with an emerging body of literature suggesting that individuals are less able to embody hands which appear to be of a visibly different skin tone than their own (Farmer et al., 2012; Lira et al., 2017).

Conclusion

In summary, hand tracking is probably here to stay as a cardinal (but probably still optional) feature of immersive virtual reality. The opportunities for facilitating effective and engaging interpersonal communication and more formal presentations in a remote context are particularly exciting for many aspects of our social, teaching, and learning worlds. Being cognisant of the challenges which come with these opportunities is a first step toward developing a clear series of best practices to aid the development of the next generation of VR hardware and immersive experiences.

Author Contributions

GB conceived and wrote the manuscript.

Conflict of Interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

The author would like to thank João Mineiro for his comments on an earlier draft of this manuscript.

References

Argelaguet, F., Hoyet, L., Trico, M., and Lecuyer, A. (2016). “The Role of Interaction in Virtual Embodiment: Effects of the Virtual Hand Representation,” in 2016 IEEE Virtual Reality (VR). Presented at the 2016 IEEE Virtual Reality (VR), 3–10. doi:10.1109/VR.2016.7504682

Azmandian, M., Hancock, M., Benko, H., Ofek, E., and Wilson, A. D. (2016). “Haptic Retargeting: Dynamic Repurposing of Passive Haptics for Enhanced Virtual Reality Experiences,” in Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, New York, NY, USA (Association for Computing Machinery), 1968–1979.

Berger, C. C., Gonzalez-Franco, M., Ofek, E., and Hinckley, K. (2018). The Uncanny valley of Haptics. Sci. Robot. 3, eaar7010. doi:10.1126/scirobotics.aar7010

Birhane, A. (2021). Algorithmic Injustice: A Relational Ethics Approach. Patterns 2, 100205. doi:10.1016/j.patter.2021.100205

Brenton, H., Gillies, M., Ballin, D., and Chatting, D. (2005). “The Uncanny Valley: Does it Exist?,” in 19th British HCI Group Annual Conference: Workshop on Human-Animated Character Interaction.

Buckingham, G., Michelakakis, E. E., and Cole, J. (2016). Perceiving and Acting upon Weight Illusions in the Absence of Somatosensory Information. J. Neurophysiol. 115, 1946–1953. doi:10.1152/jn.00587.2015

Buckingham, G., Parr, J., Wood, G., Day, S., Chadwell, A., Head, J., et al. (2019). Upper- and Lower-Limb Amputees Show Reduced Levels of Eeriness for Images of Prosthetic Hands. Psychon. Bull. Rev. 26, 1295–1302. doi:10.3758/s13423-019-01612-x

Buolamwini, J., and Gebru, T. (2018). “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification,” in Conference on Fairness, Accountability and Transparency. Presented at the Conference on Fairness, Accountability and Transparency (PMLR), 77–91.

Carlton, B. (2021). HaptX Launches True-Contact Haptic Gloves for VR and Robotics. VRScout. Available at: https://vrscout.com/news/haptx-true-contact-haptic-gloves-vr/ (accessed 10 3, 21).

Clarence, A., Knibbe, J., Cordeil, M., and Wybrow, M. (2021). “Unscripted Retargeting: Reach Prediction for Haptic Retargeting in Virtual Reality,” in 2021 IEEE Virtual Reality and 3D User Interfaces (VR). Presented at the 2021 IEEE Virtual Reality and 3D User Interfaces (VR), 150–159. doi:10.1109/VR50410.2021.00036

D’Alonzo, M., Mioli, A., Formica, D., Vollero, L., and Di Pino, G. (2019). Different Level of Virtualization of Sight and Touch Produces the Uncanny valley of Avatar's Hand Embodiment. Sci. Rep. 9, 19030. doi:10.1038/s41598-019-55478-z

Desmurget, M., Pélisson, D., Rossetti, Y., and Prablanc, C. (1998). From Eye to Hand: Planning Goal-Directed Movements. Neurosci. Biobehav. Rev. 22, 761–788. doi:10.1016/s0149-7634(98)00004-9

Destephe, M., Brandao, M., Kishi, T., Zecca, M., Hashimoto, K., and Takanishi, A. (2015). Walking in the Uncanny Valley: Importance of the Attractiveness on the Acceptance of a Robot as a Working Partner. Front. Psychol. 6, 204. doi:10.3389/fpsyg.2015.00204

Dieter, K. C., Hu, B., Knill, D. C., Blake, R., and Tadin, D. (2014). Kinesthesis Can Make an Invisible Hand Visible. Psychol. Sci. 25, 66–75. doi:10.1177/0956797613497968

Farmer, H., Tajadura-Jiménez, A., and Tsakiris, M. (2012). Beyond the Colour of My Skin: How Skin Colour Affects the Sense of Body-Ownership. Conscious. Cogn. 21, 1242–1256. doi:10.1016/j.concog.2012.04.011

Furmanek, M. P., Schettino, L. F., Yarossi, M., Kirkman, S., Adamovich, S. V., and Tunik, E. (2019). Coordination of Reach-To-Grasp in Physical and Haptic-Free Virtual Environments. J. Neuroengineering Rehabil. 16, 78. doi:10.1186/s12984-019-0525-9

Fussell, S. (2017). Why Can’t This Soap Dispenser Identify Dark Skin? [WWW Document]. Gizmodo. Available at: https://web.archive.org/web/20210213095326/https://gizmodo.com/why-cant-this-soap-dispenser-identify-dark-skin-1797931773 (accessed 9 3, 21).

Danckert, J., and Goodale, M. A. (2001). Superior Performance for Visually Guided Pointing in the Lower Visual Field. Exp. Brain Res. 137, 303–308. doi:10.1007/s002210000653

Guna, J., Jakus, G., Pogačnik, M., Tomažič, S., and Sodnik, J. (2014). An Analysis of the Precision and Reliability of the Leap Motion Sensor and its Suitability for Static and Dynamic Tracking. Sensors 14, 3702–3720. doi:10.3390/s140203702

Harris, D. J., Bird, J. M., Smart, P. A., Wilson, M. R., and Vine, S. J. (2020). A Framework for the Testing and Validation of Simulated Environments in Experimentation and Training. Front. Psychol. 11, 605. doi:10.3389/fpsyg.2020.00605

Hepperle, D., Ödell, H., and Wölfel, M. (2020). “Differences in the Uncanny Valley between Head-Mounted Displays and Monitors,” in 2020 International Conference on Cyberworlds (CW). Presented at the 2020 International Conference on Cyberworlds (CW), 41–48. doi:10.1109/CW49994.2020.00014

Inside Facebook Reality Labs (2021). Wrist-based Interaction for the Next Computing Platform [WWW Document]. Facebook Technol. Available at: https://tech.fb.com/inside-facebook-reality-labs-wrist-based-interaction-for-the-next-computing-platform/ (accessed 3 18, 21).

Iverson, J. M., and Goldin-Meadow, S. (1998). Why People Gesture when They Speak. Nature 396, 228. doi:10.1038/24300

Johansson, R. S., Westling, G., Bäckström, A., and Flanagan, J. R. (2001). Eye-Hand Coordination in Object Manipulation. J. Neurosci. 21, 6917–6932. doi:10.1523/jneurosci.21-17-06917.2001

Kätsyri, J., Förger, K., Mäkäräinen, M., and Takala, T. (2015). A Review of Empirical Evidence on Different Uncanny Valley Hypotheses: Support for Perceptual Mismatch as One Road to the valley of Eeriness. Front. Psychol. 6, 390. doi:10.3389/fpsyg.2015.00390

Khan, M. A., and Lawrence, G. P. (2005). Differences in Visuomotor Control between the Upper and Lower Visual fields. Exp. Brain Res. 164, 395–398. doi:10.1007/s00221-005-2325-7

Kilteni, K., Groten, R., and Slater, M. (2012). The Sense of Embodiment in Virtual Reality. Presence 21, 373–387. doi:10.1162/PRES_a_00124

Kreylos, O. (2016). Optical Properties of Current VR HMDs [WWW Document]. Doc-Okorg. Available at: https://web.archive.org/web/20210116152206/http://doc-ok.org/?p=1414 (accessed 9 3, 21).

Kreylos, O. (2019). Quantitative Comparison of VR Headset Fields of View [WWW Document]. Doc-Okorg. Available at: https://web.archive.org/web/20200328103226/http://doc-ok.org/?p=1955 (accessed 9 3, 21).

Krigolson, O., and Heath, M. (2006). A Lower Visual Field Advantage for Endpoint Stability but No Advantage for Online Movement Precision. Exp. Brain Res. 170, 127–135. doi:10.1007/s00221-006-0386-x

Land, M. F. (2009). Vision, Eye Movements, and Natural Behavior. Vis. Neurosci. 26, 51–62. doi:10.1017/S0952523808080899

Lavoie, E. B., Valevicius, A. M., Boser, Q. A., Kovic, O., Vette, A. H., Pilarski, P. M., et al. (2018). Using Synchronized Eye and Motion Tracking to Determine High-Precision Eye-Movement Patterns during Object-Interaction Tasks. J. Vis. 18, 18. doi:10.1167/18.6.18

Lavoie, E., and Chapman, C. S. (2021). What's Limbs Got to Do with it? Real-World Movement Correlates with Feelings of Ownership over Virtual Arms during Object Interactions in Virtual Reality. Neurosci. Conscious. 7 (1), niaa027. doi:10.1093/nc/niaa027

Lira, M., Egito, J. H., Dall’Agnol, P. A., Amodio, D. M., Gonçalves, Ó. F., and Boggio, P. S. (2017). The Influence of Skin Colour on the Experience of Ownership in the Rubber Hand Illusion. Sci. Rep. 7, 15745. doi:10.1038/s41598-017-16137-3

MacDorman, K. F., Green, R. D., Ho, C.-C., and Koch, C. T. (2009). Too Real for comfort? Uncanny Responses to Computer Generated Faces. Comput. Hum. Behav. 25, 695–710. doi:10.1016/j.chb.2008.12.026

Mangalam, M., Yarossi, M., Furmanek, M. P., and Tunik, E. (2021). Control of Aperture Closure during Reach-To-Grasp Movements in Immersive Haptic-Free Virtual Reality. Exp. Brain Res. 239 (5), 1651–1665. doi:10.1007/s00221-021-06079-8

Masurovsky, A., Chojecki, P., Runde, D., Lafci, M., Przewozny, D., and Gaebler, M. (2020). Controller-Free Hand Tracking for Grab-And-Place Tasks in Immersive Virtual Reality: Design Elements and Their Empirical Study. Multimodal Technol. Interact. 4, 91. doi:10.3390/mti4040091

McDonnell, R., and Breidt, M. (2010). “Face Reality: Investigating the Uncanny Valley for Virtual Faces,” in ACM SIGGRAPH ASIA 2010 Sketches, SA ’10, New York, NY, USA (Association for Computing Machinery), 1–2. doi:10.1145/1899950.1899991

Mori, M. (1970). Bukimi No Tani [The Uncanny valley]. Energy 7, 33–35.

Ozana, A., Berman, S., and Ganel, T. (2020). Grasping Weber's Law in a Virtual Environment: The Effect of Haptic Feedback. Front. Psychol. 11, 573352. doi:10.3389/fpsyg.2020.573352

Özçalışkan, Ş., Lucero, C., and Goldin-Meadow, S. (2016). Is Seeing Gesture Necessary to Gesture Like a Native Speaker. Psychol. Sci. 27, 737–747. doi:10.1177/0956797616629931

Poliakoff, E., Beach, N., Best, R., Howard, T., and Gowen, E. (2013). Can Looking at a Hand Make Your Skin Crawl? Peering into the Uncanny Valley for Hands. Perception 42, 998–1000. doi:10.1068/p7569

Poliakoff, E., O’Kane, S., Carefoot, O., Kyberd, P., and Gowen, E. (2018). Investigating the Uncanny valley for Prosthetic Hands. Prosthet. Orthot. Int. 42, 21–27. doi:10.1177/0309364617744083

Previc, F. H. (1990). Functional Specialization in the Lower and Upper Visual fields in Humans: Its Ecological Origins and Neurophysiological Implications. Behav. Brain Sci. 13, 519–542. doi:10.1017/S0140525X00080018

Pyasik, M., Tieri, G., and Pia, L. (2020). Visual Appearance of the Virtual Hand Affects Embodiment in the Virtual Hand Illusion. Sci. Rep. 10, 5412. doi:10.1038/s41598-020-62394-0

Rakkolainen, I., Sand, A., and Raisamo, R. (2019). “A Survey of Mid-air Ultrasonic Tactile Feedback,” in 2019 IEEE International Symposium on Multimedia (ISM). Presented at the 2019 IEEE International Symposium on Multimedia (ISM), 94–944. doi:10.1109/ISM46123.2019.00022

Rao, A., and Gordon, A. (2001). Contribution of Tactile Information to Accuracy in Pointing Movements. Exp. Brain Res. 138, 438–445. doi:10.1007/s002210100717

Ross, P., and Flack, T. (2020). Removing Hand Form Information Specifically Impairs Emotion Recognition for Fearful and Angry Body Stimuli. Perception 49, 98–112. doi:10.1177/0301006619893229

Rossit, S., McAdam, T., Mclean, D. A., Goodale, M. A., and Culham, J. C. (2013). fMRI Reveals a Lower Visual Field Preference for Hand Actions in Human superior Parieto-Occipital Cortex (SPOC) and Precuneus. Cortex 49, 2525–2541. doi:10.1016/j.cortex.2012.12.014

Saygin, A. P., Chaminade, T., Ishiguro, H., Driver, J., and Frith, C. (2012). The Thing that Should Not Be: Predictive Coding and the Uncanny valley in Perceiving Human and Humanoid Robot Actions. Soc. Cogn. Affect. Neurosci. 7, 413–422. doi:10.1093/scan/nsr025

Schmidtmann, G., Logan, A. J., Kennedy, G. J., Gordon, G. E., and Loffler, G. (2015). Distinct Lower Visual Field Preference for Object Shape. J. Vis. 15, 18. doi:10.1167/15.5.18

Schorr, S. B., and Okamura, A. M. (2017). “Fingertip Tactile Devices for Virtual Object Manipulation and Exploration,” in Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, CHI ’17, New York, NY, USA (Association for Computing Machinery), 3115–3119. doi:10.1145/3025453.3025744

Schwartz, G., Wei, S.-E., Wang, T.-L., Lombardi, S., Simon, T., Saragih, J., et al. (2020). The Eyes Have it. ACM Trans. Graph. 39, 91:1–91:15. doi:10.1145/3386569.3392493

Schwind, V., Knierim, P., Tasci, C., Franczak, P., Haas, N., and Henze, N. (2017). ““These Are Not My Hands!”,” in Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17), Denver, CO, USA (Association for Computing Machinery), 1577–1582. doi:10.1145/3025453.3025602

Seinfeld, S., and Müller, J. (2020). Impact of Visuomotor Feedback on the Embodiment of Virtual Hands Detached from the Body. Sci. Rep. 10, 22427. doi:10.1038/s41598-020-79255-5

Slater, M., and Sanchez-Vives, M. V. (2016). Enhancing Our Lives with Immersive Virtual Reality. Front. Robot. AI. 3, 74. doi:10.3389/frobt.2016.00074

Strait, M. K., Floerke, V. A., Ju, W., Maddox, K., Remedios, J. D., Jung, M. F., et al. (2017). Understanding the Uncanny: Both Atypical Features and Category Ambiguity Provoke Aversion toward Humanlike Robots. Front. Psychol. 8, 1366. doi:10.3389/fpsyg.2017.01366

Voigt-Antons, J.-N., Kojić, T., Ali, D., and Möller, S. (2020). Influence of Hand Tracking as a Way of Interaction in Virtual Reality on User Experience. arXiv:2004.12642 [cs].

Vosinakis, S., and Koutsabasis, P. (2018). Evaluation of Visual Feedback Techniques for Virtual Grasping with Bare Hands Using Leap Motion and Oculus Rift. Virtual Reality 22, 47–62. doi:10.1007/s10055-017-0313-4

Wang, S., Lilienfeld, S. O., and Rochat, P. (2015). The Uncanny Valley: Existence and Explanations. Rev. Gen. Psychol. 19, 393–407. doi:10.1037/gpr0000056

Whitwell, R. L., Ganel, T., Byrne, C. M., and Goodale, M. A. (2015). Real-Time Vision, Tactile Cues, and Visual Form Agnosia: Removing Haptic Feedback from a "Natural" Grasping Task Induces Pantomime-Like Grasps. Front. Hum. Neurosci. 9, 216. doi:10.3389/fnhum.2015.00216

Zhou, Y., Yu, G., Yu, X., Wu, S., and Zhang, M. (2017). Asymmetric Representations of Upper and Lower Visual fields in Egocentric and Allocentric References. J. Vis. 17, 9. doi:10.1167/17.1.9

Keywords: VR, embodiment, psychology, communication, inclusivity

Citation: Buckingham G (2021) Hand Tracking for Immersive Virtual Reality: Opportunities and Challenges. Front. Virtual Real. 2:728461. doi: 10.3389/frvir.2021.728461

Received: 21 June 2021; Accepted: 24 September 2021;
Published: 20 October 2021.

Edited by:

Nadia Magnenat Thalmann, Université de Genève, Switzerland

Reviewed by:

Antonella Maselli, Italian National Research Council, Italy
Richard Skarbez, La Trobe University, Australia

Copyright © 2021 Buckingham. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Gavin Buckingham, g.buckingham@exeter.ac.uk
