Impact of motion cues, color, and luminance on depth perception in optical see-through AR displays

Ashtiani, Omeed; Guo, Hung-Jui; Prabhakaran, Balakrishnan

doi:10.3389/frvir.2023.1243956

ORIGINAL RESEARCH article

Front. Virtual Real., 06 December 2023

Sec. Augmented Reality

Volume 4 - 2023 | https://doi.org/10.3389/frvir.2023.1243956

This article is part of the Research Topic: Augmenting Human Experience and Performance through Interaction Technologies

Omeed Ashtiani*

Hung-Jui Guo

Balakrishnan Prabhakaran

The Erik Jonsson School of Engineering and Computer Science, Department of Computer Science, The University of Texas at Dallas, Richardson, United States

Introduction: Augmented Reality (AR) systems are systems in which users view and interact with virtual objects overlaying the real world. AR systems are used across a variety of disciplines, e.g., games, medicine, and education. Optical See-Through (OST) AR displays allow users to perceive the real world directly, with computer-generated imagery overlaid on it. While perception of depth and visibility of objects is a widely studied field, we wanted to observe how the color, luminance, and movement of an object interact with each other, as well as with external luminance, in OST AR devices. Little research has been done on how virtual objects’ parameters and external lighting affect depth perception, or on how an object’s mobility affects that perception.

Methods: We aim to perform an analysis of the effects of motion cues, color, and luminance on depth estimation of AR objects overlaying the real world with OST displays. We perform two experiments, differing in environmental lighting conditions (287 lux and 156 lux), and analyze the effects and differences on depth and speed perceptions.

Results: We found that while depth perception of stationary objects follows previous research, motion, object luminance, and environmental luminance all play a role in this perception.

Discussion: These results will be useful for developers who need to account for depth estimation issues that may arise in AR environments. Awareness of the different effects of speed and environmental illuminance on depth perception can be applied when developing AR or MR applications where precision matters.

1 Introduction

The understanding of depth cues has been studied since the late 19th century. Of these cues, color and luminance in particular have been researched to find which colors appear in front of one another. This phenomenon was originally investigated by researchers in the psychology and vision fields, and artists use these cues in their illustrations. As technology advanced, researchers studied how color cues interact with motion cues in the brain (Self and Zeki, 2004). With emerging Augmented Reality (AR) and Mixed Reality (MR) technology, researchers should focus on understanding how depth cues work on not only real objects but virtual objects as well.

An MR system is defined as having the following properties: 1) combines real and virtual objects in a real environment, 2) runs interactively and in real-time, and 3) registers (aligns) real and virtual objects with each other (Cipresso et al., 2018). Inaccurate depth rendering or perceptions can cause issues for users across multiple domains, including construction, gaming, education, and the medical field, as users expect their input actions to reflect across virtual space.

Due to the importance of depth perception in AR across a variety of disciplines, understanding depth cues, such as color and luminance, can be a useful tool in ensuring accurate depth information is presented in AR systems. Here, we refer to luminance as the relative brightness of an object in contrast to its background and color as hue. Prior to the development of AR systems, much work has been conducted on understanding these depth cues in the real world. With the increase in AR technologies, some researchers have transitioned to focusing on perceptions in augmented space. Although color and luminance as depth cues have been thoroughly studied, most of the work has been investigated on stationary objects, both in the real world and AR. A lot of work in AR with relation to speed or motion involves automatic object tracking or trajectory prediction. However, in relation to human perception of speed and depth, not a lot of work has been evaluated based on color and luminance.

1.1 Need for perception studies in OST-AR devices

Video See-Through AR (VST-AR) displays capture the real world via a camera and render the captured video to the user, overlaying the virtual 3D graphics objects on the captured video. The renderings of the real world may have artifacts that are not present in the real world, due to the effects of foreground or background lighting conditions, resulting in emphasized shadows or reduced image quality. Although previous research has studied the effect of color and luminance of virtual objects on depth perception, these studies have been carried out using VST-AR displays. Because VST headsets render real-world environments with possible image processing artifacts, there could be some safety concerns when users move around in the real world with their headsets on. In contrast, Optical See-Through AR (OST-AR) displays project images onto see-through glass, allowing users to see AR objects directly overlaying the real world without any modification, and in real time. This direct observation of the real world allows users to feel more comfortable and to walk more similarly to how they would with no headset (Adam Jones et al., 2012). Therefore, it is useful to study how objects, both stationary and moving, are viewed in AR with relation to depth and speed before users move around in AR.

As articulated earlier, most of the research on depth perception based on the color and luminance of virtual objects and the effect of motion cues has been conducted for Video See-Through (VST) AR displays. Very little research seems to have been done to test the effects of not only color and luminance on depth perception in OST-AR displays, but also the effects of the environment, whether dim or bright, and the effects of AR object mobility. To the best of our knowledge, this is the first study to explore and analyze these distinctions and their interactions with one another in the context of OST-AR displays.

1.2 Proposed approach

Based on the limitations mentioned in Section 1.1, our work is designed to investigate the impact of factors such as object color, object luminance, environmental lighting, and motion of MR objects, and how they interact with one another. Due to the popularity of Microsoft’s HoloLens 2 (Microsoft, 2022), an OST-MR device, we conducted our study using the HoloLens 2 for all participants. The HoloLens 2 is a battery-powered, stereoscopic OST headset created by Microsoft that has a battery life of about 2 h. It projects light onto a lens, overlaying AR and MR information over the real world. However, these virtual objects are rendered at 500 nits and are recommended for use in environments between 500 and 1,000 lux. We have found that for our experiment of creating MR objects, we can work in a range of 100–1,500 lux. External illumination far exceeding this range, such as the brightness of the Sun, diminishes the visibility of these objects, as the average illuminance in direct sunlight is between 32,000 and 100,000 lux, whereas ambient daylight is 10,000 to 25,000 lux. In our geographical location, the outdoor illuminance was well above that recommended by the HoloLens 2 specification and objects were nearly invisible. Also, it would be difficult to control for external variables across all users, such as Sun positioning and lighting on a given day, increasing the variability of perception.

Due to this degradation, we excluded outdoor environments and conducted our study only indoors, using internal illumination within the range of 100–500 lux. We conducted two experiments to address the concerns in Section 1.1.

• In a brightly lit indoor environment, measured at 287 lux, with AR objects moving toward the user.

• In a dimly lit indoor environment, measured at 156 lux, with AR objects moving toward the user.

Our contributions for OST-AR/MR Headsets (HoloLens 2) are as follows.

• Evaluate the effects of depth cues’ color and luminance on depth perception in indoor environments.

• Evaluate the effects of motion cues in relation to those depth cues when objects are in motion.

• Recommendations for color and luminance of depth cues based on environment and motion of objects.

2 Related works

2.1 Color and luminance as depth cues

Depth cues have been studied comprehensively for over a century in the real world. Of these depth cues, color and luminance have been the main targets. As far back as the late 19th century, Ashley studied the intensity of light in visual estimates of depth (L Ashley, 1898). They found that in monocular experiments, the relationship between an increase in light and a decrease in perceived distance follows something similar to Weber’s Law, and that in both monocular and binocular experiments, brighter targets appear closer (Ekman, 1959). In another monocular experiment, Pillsbury and Schaefer investigated neon red and argon blue, noticing that red appears nearer to the eye, but only in specific conditions (Pillsbury and Schaefer, 1937). Johns and Sumner investigated the apparent ‘brightness’ of different colors, finding them to be in this order from brightest to darkest: White, Yellow, Green, Red, Blue, Black (Johns and Sumner, 1948). Troscianko investigated color gradients with monocular vision, finding that a saturation gradient of red-gray was particularly effective at affecting perceived depth, but not the red-green hue gradient (Tom et al., 1991). Multiple other authors have found that luminance is a major cue for perceived distance (Coules, 1955; Payne, 1964).

Farnè took it a step further and found that it was not brightness alone, but brightness in relation to the background, that influenced perceived distance. Farnè judged white and black on varying backgrounds, ranging from near white to near black, and found that the target with the higher contrast with the background is perceived as nearer, as opposed to merely the brighter target being nearer. Dengler and Nitschke also observed this phenomenon with orange and blue, orange being the brighter color. They noticed that orange appears before blue on darker backgrounds, but this apparent depth reverses as the background shifts to a brighter color (Dengler and Nitschke, 1993). O’Shea also found this trend, noting that even with differing sizes, the objects with larger contrast to the background appear closer (O’Shea et al., 1994). Bailey et al. took a more in-depth look at the effects of warm and cool colors (Bailey et al., 2006). Several studies have furthered these conclusions, finding that lower-contrast objects appear further away, that luminance is a major cue to distance, and that in terms of contrast, certain colors are seen as advancing (reds, warm colors) and others as retreating (blues, cool colors) (Guibal and Dresp, 2004; O’Shea et al., 1994; Payne, 1964; Pillsbury and Schaefer, 1937).

As technology advanced, researchers started analyzing depth cues in pictures, screens, and AR displays. For 2D displays, Kjelldahl found lighting influenced the accuracy of depth perception on 3D objects (Kjelldahl and Martin, 1995). Guibal and Dresp found that red, when supported by any spatial cue, wins over green on lighter backgrounds. Yet, on darker backgrounds, green or white wins over red for perceived nearness (Guibal and Dresp, 2004). Do et al. investigated color, luminance, and fidelity, finding that brighter colors win over darker on certain backgrounds, yet increased fidelity plays a role as well (Do et al., 2020). Other researchers investigated depth cues in 2D renderings (Berning et al., 2014; Fujimura and Morishita, 2011). Arefin et al. investigated context and focal distance switching on human performance with AR devices (Phillips et al., 2020).

A lot of work in AR with relation to speed or motion involves automatic object tracking or trajectory prediction (Chen and Meng, 2010; Gao and Spratling, 2022; Lee et al., 2020; Li et al., 2021; Morzy, 2006). Bedell et al. investigated which judgment is perceived first, color change or movement, and found that it depends on motion cycle (Bedell et al., 2003). However, in relation to human perception of speed and depth, not a lot of work has been evaluated based on color and luminance.

When evaluating depth cues in AR, work has been done on both the mobile and HUD front (Chatzopoulos et al., 2017; Dey and Sandor, 2014; Diaz et al., 2017; Kalia et al., 2016; Singh et al., 2020). Swan et al. found that judgments differ depending on how far an AR object is, noticing that object distance is underestimated before 23 m and overestimated afterward (Edward Swan et al., 2007). Singh et al. (2009) further studied depth judgments at near-field distances between 34 and 50 cm. Livingston et al. found that outdoor environments cause overestimation yet indoor environments lead to underestimation, though these can be closer to correct measurements with linear perspective cues (Livingston et al., 2009). Multiple other authors investigated the effects of environment, device, or occlusion on perspective in AR (Dey et al., 2012; Kruijff et al., 2010). Gabbard et al. noted the color blending phenomenon, where, as the background luminance increases, AR colors appear more washed out (Gabbard et al., 2013). Gombač et al. found that depth cues matter less when a VR or AR object is held by the user (Gombač et al., 2016). Rosales found that the position of the object mattered as well, as objects off of the ground appeared further away than those on the ground (Salas Rosales et al., 2019). Adams investigated the use of shadows in 3D AR space on depth perception (Adams, 2020). Weiskopf and Ertl (2002) evaluated brightness, saturation, and hue gradient with respect to depth-cueing, deriving parameters for their schema. Li et al. (2022) worked on mapping specific colors to depths to assist users with depth perception in VR while Du worked on 3D interaction with depth maps for Mobile AR (Du et al., 2020). Other authors evaluated the effects of perceived distance on depth perception, noticing that at some distances, AR objects are underestimated while at other distances, they are overestimated (Edward Swan et al., 2007).
Dey and Sandor presented insights from AR experiments with depth perception and occlusion in outdoor environments. They found that egocentric and exocentric distances are underestimated in handheld AR, where depth perception can improve if handheld AR systems dynamically adapt their geometric field of view to match the display field of view (Dey and Sandor, 2014). Many researchers focus on the effects of a real-world background on AR objects (Dey and Sandor, 2014; Gabbard et al., 2013; Kruijff et al., 2010).

Do et al. (2020) is the most relevant with regard to luminance and color cues. The authors researched the effects of luminance, color, shape, and fidelity, albeit in VST Mobile AR. They performed a paired comparison experiment, where they showed stationary VST-AR objects and asked the user to choose one and only one that was closer. They then verified these relationships by performing the coefficient of agreements between users and the coefficient of consistency for each user.

2.2 Color and motion cue integration

In previous literature, speed and motion cues are considered for object detection and not as factors affecting depth perception (Bedell et al., 2003; Cucchiara et al., 2001; Dubuisson and Jain, 1993; Møller and Hurlbert, 1997). When accounting for speed, these works focus on sequences of frames for automatic object tracking or trajectory prediction (Chen and Meng, 2010; Lee et al., 2020; Li et al., 2021; Morzy, 2006). Self and Zeki found that color and motion-defined shapes activate similar regions in the brain more strongly when used together than either of these cues separately (Self and Zeki, 2004). Some research works specifically include the color of either the object or environment to assist with prediction (Gao and Spratling, 2022; Wu et al., 2014). Oueslati et al. show the importance of being aware of ever-changing contexts of environments (Oueslati et al., 2021). However, Verghese et al. investigated locational and color cues for attentional bias (Verghese et al., 2013). They found that locational cues are important for prediction, although color could not be used to focus attention and integrate motion alone. Hong et al. demonstrated that the motion of an object affects both its own color appearance and the color appearance of a nearby object, suggesting a tight coupling between color and motion processing (Cappello et al., 2016). Based on these papers, we hope to provide insight into the effects of color and motion in relation to OST-AR and depth perception.

3 Experimental design

3.1 Design choices

3.1.1 Experimental methodology

In prior research such as (Do et al., 2020), paired comparison experiments have been employed to evaluate the quality of certain features. In this experimental methodology, a user sees a combination of two objects and picks between the two to decide which object has the better or expected quality. We follow this (paired comparison) approach in our research. Typically, in the paired comparison experiments done in the literature, the user is not allowed to respond that both objects are equal in the quality being evaluated. Instead, an analysis of consistency is done by checking for ternary relationships, such as circular triads. For instance, let us consider our research, in which users must choose which colored cue is closer. Here, following the ternary relationships, if color_a is closer than color_b and color_b is closer than color_c, then it must follow that color_a is closer than color_c. We analyze the user’s choices to check whether this ternary relationship holds. This check also helps us to understand whether the user is choosing at random. If users make random choices, this ordering may be violated, resulting in the formation of a circular triad. This metric, which indicates whether choices are random or follow a pattern, is further explained in Section 4.1. In our case, the quality being measured is the perceived depth of the objects.

In each experiment, we cycle through hue and luminance pairs. The hues and luminances tested can be seen in Figure 1. The experiments differ in terms of movement, i.e., objects moving towards the user at varying speeds, and environmental lighting. Based on past research, we expected warm colors to be perceived as nearer than cooler colors at the same luminance in dim environments, with the reverse true in brighter environments. However, we wanted to observe whether these results are maintained when using an OST-AR device, in dim and bright indoor environments, and when objects are moving. Although the HoloLens 2 is an OST-MR device, we use its AR capabilities for the purpose of these experiments.

FIGURE 1. The color conditions used for the paired comparisons. A color is displayed with its given abbreviation and hex color code (Based on data from Do et al. (2020)).

3.1.2 Experimental setup

The experiments were performed in an indoor environment. In this environment, users had at least 5 m of space in front of them with a width of at least 6 m to their side. A large, white background was placed before them at a 5-m distance so that the orbs could be compared against a static background, leaving the rest of the peripheral environment the same. We ensured that the orbs were not occluded behind any walls and that there was ample room between the AR objects and any real objects. Two spheres, each roughly 25 cm in diameter, were placed in front of the user. These spheres were 2.0 m away from the user on the x-axis, left and right respectively, and 4.5 m away from the user on the z-axis. The angular offset of each sphere from the z-axis is therefore $\arctan (\frac{2.0}{4.5})$, giving angles of −23.96° and 23.96° respectively. Both spheres are rendered at equal distances from the user. The prior research (Do et al., 2020) follows a similar approach by rendering the spheres at the same distance, though the rendering is static in the prior works. In our work, each sphere moves toward the user at the same speed, thus maintaining an equal distance from the user. This is further explained in Section 3.4. As the HoloLens 2 has a limited field of view, we used these measurements to ensure that the entirety of the spheres was visible at all times. Similarly, we ensured that there would be no overlapping of virtual objects, thus removing the issue of color blend and overlap. As participants began their experiment, their eye level was calculated and the spheres were created at eye level on the y-dimension.
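The geometry described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the only inputs are the 2.0 m lateral offset, the 4.5 m forward distance, and the roughly 25 cm sphere diameter stated in the text.

```python
import math


def angular_offset_deg(lateral_m: float, forward_m: float) -> float:
    """Horizontal angle between the forward (z) axis and the line of sight to a sphere."""
    return math.degrees(math.atan2(lateral_m, forward_m))


def line_of_sight_distance_m(lateral_m: float, forward_m: float) -> float:
    """Straight-line distance from the user to the sphere center in the x-z plane."""
    return math.hypot(lateral_m, forward_m)


def angular_size_deg(diameter_m: float, distance_m: float) -> float:
    """Visual angle subtended by a sphere of the given diameter at the given distance."""
    return math.degrees(2.0 * math.atan((diameter_m / 2.0) / distance_m))


# Values taken from the setup description; names are illustrative only.
offset = angular_offset_deg(2.0, 4.5)
distance = line_of_sight_distance_m(2.0, 4.5)
size = angular_size_deg(0.25, distance)
print(f"offset = ±{offset:.2f} deg, distance = {distance:.2f} m, size = {size:.2f} deg")
```

Because both spheres share the same lateral and forward offsets, their line-of-sight distances are identical, which is the property the paired comparison relies on.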

3.1.3 User study design

Users were tasked with selecting the orb that appeared either closer to them or would collide with them first if moving. After selection, the selected orb would have a marker, indicating its selection for 500 ms, in line with other similar paired comparison experiments. We conducted the same experiment across two lighting conditions: 1) a bright environment measured at 287 lux, and 2) a dimmer environment measured at 156 lux, as seen in Figure 2. Lastly, after every combination of colors was seen and a choice selected, the spheres would begin moving at a speed of 0.5 m/s. After every combination was seen again, the spheres would increase to 1.0 m/s.

FIGURE 2. Augmented Reality Color Experiment: (A) a bright environment measured at 287 lux, and (B) a dimmer environment measured at 156 lux.

In this manner, we could analyze the effects of color, luminance, and motion on depth perception in OST-AR headsets, in both lighting conditions. The variables recorded included color choice, speed, distance, experiment location, and time-until-object-chosen for each object pair.

3.2 Color hue and luminance

Six color hues were selected to represent the spectrum of colors. Three warm colors (red, magenta, and yellow) were chosen along with three cool colors (green, blue, and cyan). Both the bright and dark versions of each color were included, leading to a total of 12 color conditions. All colors were luminance-balanced. This methodology, based on the previous work (Do et al., 2020), allows us to compare and contrast our results with theirs. Each color was tested on a similar background for each of our two experiments, so that the differences in luminance conditions could be contrasted in both bright and darker environments, along with movement conditions.

3.3 Pre-experiment

Before participation in the paired comparison experiment, participants filled out a background survey to ensure that they had normal or corrected-to-normal vision and did not suffer from any form of color blindness. After confirmation, the users used the HoloLens 2 eye-calibration and color-calibration features to ensure that the device was calibrated to their eye specifications. Afterward, we brought the user to the starting position, where they took part in a small tutorial featuring gray orbs placed at different distances from the user. The user was tasked with selecting which orb appeared closer to them by using a game controller, where they used the left and right triggers to select the respective orb. As the tutorial progressed, the difference in distance between the gray orbs decreased until both orbs were equidistant from the user along both the x- and z-axes. To ensure that the users felt comfortable with the system, users went through 40 iterations of orb selection, where the first 20 iterations contained a difference in distance, the next 10 iterations were equidistant from the user and stationary, and the last 10 were equidistant and moved towards the user starting from 0.1 m/s and increasing to 1.0 m/s at a rate of 0.1 m/s after each selection. After successfully completing the tutorial, users moved on to the experiments.

3.4 Experiment

After the pre-experiment, all possible combinations of each color and luminance were generated. As there are 6 colors and 2 luminances, there are 12 possible color/luminance conditions, leading to n = 12. By choosing r = 2 conditions at a time, we generate each pair, randomizing the order in which the pairs are presented and which sphere appears on the left and right. For each speed, we use the combination formula to count the total possible pairs, $C (n, r) = \frac{n!}{r! (n - r)!}$, where n = 12 is the number of color and luminance conditions and r = 2, for a total of 66 pairs. After each speed trial is completed, the 66 color pairs are randomly generated again, with the left and right placement chosen at random. This continued for the entire experiment, leading to 198 selections for each of the bright and dark experiments and a total of 396 selections per participant. The bright environment was measured at 287 lux and the dimmer environment was measured at 156 lux. The participants were then tasked with selecting the object that appeared closer to them. As a control condition, the first third (66) of the pairs were stationary. Afterward, the objects began moving towards the user at speeds {0.5 m/s, 1.0 m/s}, with the speed increasing after the second third (66) of the pairs had been selected. Objects were placed at a distance of 4.5 m away from the user on the z-axis and 2.0 m away on the x-axis, left and right of the center line respectively. As the purpose of this experiment is the perception of depth and color changes, for all speeds, the orbs were equidistant from the user. Because the objects always started at the same distance, the time available for a selection differed with speed. Therefore, we allowed users 15 s to select one of the stationary objects, 9 s to select an object at 0.5 m/s, and 4.5 s to select an object at 1.0 m/s, as the object would collide with the user if time elapsed past these limits.
If an object collided with the user, that color pair was moved to the end of the current grouping of 66 objects and reshown to the user, recording that an object in a color-pair was not selected within the timeframe. This methodology was used under both lighting conditions.
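The trial structure described in this section can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the authors' code: the condition labels, function names, and seeding scheme are hypothetical, and the time limit for moving objects is assumed to be the 4.5 m starting distance divided by the speed, which reproduces the 9 s and 4.5 s limits given in the text. The re-queuing of timed-out pairs is not modeled.

```python
import itertools
import random

# Hypothetical labels; the paper's own abbreviations appear in its Figure 1.
HUES = ["red", "magenta", "yellow", "green", "blue", "cyan"]
LUMINANCES = ["bright", "dark"]
SPEEDS_M_PER_S = [0.0, 0.5, 1.0]
START_DISTANCE_M = 4.5       # z-axis distance stated in the text
STATIONARY_LIMIT_S = 15.0    # selection window for stationary pairs


def time_limit_s(speed_m_per_s: float) -> float:
    """Selection window: fixed for stationary pairs, time-to-collision otherwise."""
    if speed_m_per_s == 0.0:
        return STATIONARY_LIMIT_S
    return START_DISTANCE_M / speed_m_per_s


def build_block(rng: random.Random) -> list:
    """All C(12, 2) = 66 unordered pairs in random order with random left/right sides."""
    conditions = list(itertools.product(HUES, LUMINANCES))
    pairs = list(itertools.combinations(conditions, 2))
    rng.shuffle(pairs)
    block = []
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        block.append((first, second))  # (left sphere, right sphere)
    return block


def build_session(seed: int) -> list:
    """One lighting condition: a freshly randomized 66-pair block per speed."""
    rng = random.Random(seed)
    session = []
    for speed in SPEEDS_M_PER_S:
        for left, right in build_block(rng):
            session.append((speed, time_limit_s(speed), left, right))
    return session


session = build_session(seed=0)
print(len(session))  # 3 speeds x 66 pairs
```

Running two such sessions, one per lighting condition, gives the 396 selections per participant reported above.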

4 Evaluation methodology

Paired comparison experiments require participants to select between two objects based on a shared quality, as explained in Section 3. We presented the participants with two differently colored versions of the same virtual object via an OST-MR headset and asked them to select the object that appeared closer to them (stationary) or would collide with them first (moving). The participants were instructed to evaluate a set amount of comparison pairs from the set of color conditions.

Suppose that n is the number of color conditions that we wish to compare against one another. A participant will be presented with $\binom{n}{2} = \frac{n (n - 1)}{2}$ pairs. In our experiment, we have 12 color conditions and thus each participant compares 66 pairs at each speed for the respective experiment. The user’s observation is recorded for each selection. After these 66 selections, the next speed is chosen, for up to three speeds, speed = [0.0 m/s, 0.5 m/s, 1.0 m/s]. An example of this selection matrix can be seen in Table 1. We list the combinations in the table such that the row color was selected over the column color. We then add these matrices for each user together and display the total number of selections.

TABLE 1. Example preference matrix for a participant s_j when shown all color combinations. We use a 1 to signify that the row color was selected over the column color. As this is a paired-comparison experiment, users select one of the two objects.

Table 1 is representative of how the data was recorded and shows one user’s selections during the moving-object experiment, regardless of speed.
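The bookkeeping behind such a preference matrix can be sketched as below. This is a minimal illustration, assuming each recorded trial is a (chosen, rejected) pair of condition labels; the function names and the three placeholder labels are hypothetical and do not come from the paper.

```python
from typing import Iterable, List, Sequence, Tuple

Matrix = List[List[int]]


def preference_matrix(choices: Iterable[Tuple[str, str]],
                      conditions: Sequence[str]) -> Matrix:
    """Entry [i][j] counts how often condition i (row) was chosen over condition j (column)."""
    index = {name: i for i, name in enumerate(conditions)}
    n = len(conditions)
    matrix = [[0] * n for _ in range(n)]
    for chosen, rejected in choices:
        matrix[index[chosen]][index[rejected]] += 1
    return matrix


def add_matrices(matrices: Sequence[Matrix]) -> Matrix:
    """Element-wise sum of per-participant matrices into one combined matrix."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


def total_scores(matrix: Matrix) -> List[int]:
    """Row sums: how many times each condition was chosen over any other."""
    return [sum(row) for row in matrix]


# Toy example with three placeholder conditions and two participants.
conditions = ["A", "B", "C"]
participant_1 = preference_matrix([("A", "B"), ("A", "C"), ("B", "C")], conditions)
participant_2 = preference_matrix([("B", "A"), ("A", "C"), ("C", "B")], conditions)
combined = add_matrices([participant_1, participant_2])
print(combined, total_scores(combined))
```

The row sums of the combined matrix correspond to the per-color total scores used later when ranking colors.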

4.1 Statistical analysis

We employed the same methodology and analysis methods as Ledda et al. (2005) and Do et al. (2020), who showed support for using the following methods of analysis in a paired comparison experiment.

4.1.1 Kendall coefficient of agreement

If all participants vote the same way, then there is complete agreement. However, this is not usually the case and it is important to determine if there is actual agreement between participants. Kendall’s coefficient of agreement utilizes the number of agreements between pairs. In most paired experiments, we find the agreement as follows:

$\Sigma = \sum_{i \neq j} \binom{p_{i j}}{2}$ (1)

In Equation 1, p_ij is the number of times that color_i is chosen over color_j. The sum of the combination of matches is then taken.

Σ can then be used to calculate the coefficient of agreement; it is the sum of the number of agreements in pairs, extending over all pairs excluding the diagonal component.

We calculate the coefficient of agreement among participants. Kendall and Babington Smith (1940) define the coefficient of agreement, u, as:

$u = \frac{2 \Sigma}{\binom{s}{2} \binom{n}{2}} - 1$ (2)

where s is the number of participants and n is the number of items being compared.

Should all of the participants make identical choices, then u would be equal to 1. As participants disagree, u decreases to −1/(s − 1) if s is even and −1/s if s is odd.

This coefficient, u, acts as a metric of agreement between the participants. We can test the significance of this agreement to determine if participants agree with one another using a large sample approximation to the sampling distribution (Siegel and Castellan, 1988).

$\chi^{2} = \frac{n (n - 1) (1 + u (s - 1))}{2}$ (3)

χ² is asymptotically distributed with n(n − 1)/2 degrees of freedom. We can use a table of probability values for χ², found in Table C of Siegel and Castellan (1988). Using this statistic, we can test the null hypothesis that there is no agreement among participants, which implies that all colors are perceptually equivalent.
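Equations 1 to 3 translate directly into code. The sketch below is an illustration under the definitions given in this section, not the authors' analysis script: the function name is hypothetical, and the input is assumed to be a combined preference matrix whose entry [i][j] counts how many of the s participants chose item i over item j.

```python
from math import comb
from typing import List, Tuple


def kendall_agreement(p: List[List[int]], s: int) -> Tuple[float, float, int]:
    """Return (u, chi_square, degrees_of_freedom) for a combined preference matrix.

    p[i][j] is the number of participants who chose item i over item j,
    s is the number of participants, and n = len(p) is the number of items.
    """
    n = len(p)
    # Equation 1: agreements in pairs, summed over all off-diagonal cells.
    sigma = sum(comb(p[i][j], 2) for i in range(n) for j in range(n) if i != j)
    # Equation 2: coefficient of agreement.
    u = 2 * sigma / (comb(s, 2) * comb(n, 2)) - 1
    # Equation 3: large-sample chi-square statistic and its degrees of freedom.
    chi_square = n * (n - 1) * (1 + u * (s - 1)) / 2
    dof = n * (n - 1) // 2
    return u, chi_square, dof


# Toy check: 4 participants who all rank 3 items identically (complete agreement).
unanimous = [[0, 4, 4],
             [0, 0, 4],
             [0, 0, 0]]
print(kendall_agreement(unanimous, s=4))  # u is 1 under complete agreement
```

The resulting statistic would then be compared against a χ² table at the returned degrees of freedom, as described above; no p-value routine is included here to keep the sketch free of external dependencies.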

4.1.2 Coefficient of consistency

When using combinations, paired comparison experiments often measure the transitive property of participants’ choices to ensure consistency. As explained in Section 3, we check if the ternary relationship holds among the user’s choices and if a circular triad is getting formed due to randomness in the choices. Inconsistency can frequently occur when the items being compared are similar, making it difficult to judge. We calculated the coefficient of consistency, ζ, as defined by Kendall and Babington Smith (1940) and used in Do et al. (2020), for even n, where c denotes the number of circular triads:

$\zeta = 1 - \frac{24 c}{n^{3} - 4 n}$ (4)

We can determine the number of circular triads using the following formula (David, 1988):

$c = \frac{n}{24} (n^{2} - 1) - \frac{1}{2} \sum_{i = 1}^{n} {\left(p_{i} - \frac{n - 1}{2}\right)}^{2}$ (5)

where n is defined as the number of colors and p_i is the score of each color. It is important to note that participants can have a low coefficient of consistency, ζ, while having a high coefficient of agreement, u. This can occur when participants individually make inconsistent decisions, leading to a large number of circular triads, yet make the same inconsistent decisions as the rest of the population. ζ ∈ [0, 1], tending towards 0 as inconsistency increases. ζ will be lower if colors are perceptually similar.
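As a sanity check on Equations 4 and 5, the closed-form triad count can be compared against a brute-force count over one participant's choices. The sketch below is illustrative only: the function names are hypothetical, each participant's data is assumed to be a 0/1 matrix in which entry [i][j] = 1 means item i was chosen over item j, and p_i is taken to be the row sum of that matrix.

```python
from itertools import combinations
from typing import List


def circular_triads(scores: List[int]) -> float:
    """Equation 5: number of circular triads from one participant's item scores."""
    n = len(scores)
    mean_score = (n - 1) / 2
    return n * (n * n - 1) / 24 - 0.5 * sum((p - mean_score) ** 2 for p in scores)


def consistency(scores: List[int]) -> float:
    """Equation 4: coefficient of consistency, valid for an even number of items."""
    n = len(scores)
    if n % 2 != 0:
        raise ValueError("Equation 4 is stated for even n only")
    return 1 - 24 * circular_triads(scores) / (n ** 3 - 4 * n)


def count_triads_brute_force(m: List[List[int]]) -> int:
    """Directly count triples (i, j, k) whose choices form a cycle."""
    count = 0
    for i, j, k in combinations(range(len(m)), 3):
        forward = m[i][j] and m[j][k] and m[k][i]
        backward = m[j][i] and m[k][j] and m[i][k]
        if forward or backward:
            count += 1
    return count


# Toy example: a perfectly transitive participant over 4 items (0 beats 1 beats 2 beats 3).
transitive = [[0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]]
scores = [sum(row) for row in transitive]
print(circular_triads(scores), consistency(scores), count_triads_brute_force(transitive))
```

For the 12 color conditions used here, n is even, so Equation 4 applies directly to each participant's 12 scores.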

5 Results and discussion

5.1 Participants

5.1.1 Participants

For both experiments, we had 15 volunteers. Each participant took part in both the dark and bright experiments, although the order of the experiments was randomized. The age range of the volunteers was 18–30, with the majority of participants between 18 and 23. These participants also varied in technology use and were recruited from both campus and local organizations. Each participant was assigned a random preset to ensure there were not any effects on the population from similar ordering. The time taken by a user to complete each experiment ranged from 10 to 16 min, with an average of 14 min. Upon completion of each segment, users verbally reported any observations they had about the color and speed of the objects.

5.2 Results

For analysis, we created 3 combined preference matrices for each experiment, one per speed/distance condition, giving 6 matrices in total. Figures 3, 4 show the total score of each color in the two environments, that is, how many times a color was selected over another color at each speed in each experiment, ordered from the most often chosen color to the least often chosen.
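The aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes each participant's choices for one speed are stored as an n × n matrix in which entry [i][j] is 1 if color i was chosen over color j and 0 otherwise, and the function names are hypothetical.

```python
def combine(matrices):
    """Element-wise sum of per-participant n x n preference matrices."""
    n = len(matrices[0])
    combined = [[0] * n for _ in range(n)]
    for matrix in matrices:
        for i in range(n):
            for j in range(n):
                combined[i][j] += matrix[i][j]
    return combined


def total_scores(combined, labels):
    """Row sums (times each color was chosen), sorted from most to least chosen."""
    scores = [(label, sum(row)) for label, row in zip(labels, combined)]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy example with three colors and two participants (not data from the study).
labels = ["red", "green", "blue"]
participant_1 = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
participant_2 = [[0, 0, 1], [1, 0, 1], [0, 0, 0]]

combined = combine([participant_1, participant_2])
print(combined)
print(total_scores(combined, labels))
```

The sorted row sums correspond to the ordering used in the bar graphs, and the combined matrix is the input to the agreement statistics.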

In both experiments at all speeds, brighter colors were perceived as both closer and faster than darker colors. When objects are stationary, red and magenta appear closer to users at both external brightness levels. However, once objects begin moving, many participants select green more frequently as appearing closer or faster. In the dark environment, the selection of darker colors varied as speed increased, with no consistent pattern across speeds. The results can be seen in Figure 3. At speed = 0.0 m/s, the darker colors appear nearly indistinguishable.

FIGURE 3. Bar graphs of total scores of each color for each speed in the dark environment. Colors are ordered from greatest score to least score.

In the bright environment, at speeds of 0.5 and 1.0 m/s, all darker colors except dark green appear indistinguishable, as seen in Figure 4.

FIGURE 4. Bar graphs of total scores of each color for each speed in the bright environment. Colors are ordered from greatest score to least score.

After analyzing which colors were chosen, we evaluated the coefficient of agreement u, the χ² statistic, and the coefficient of consistency ζ, as described in Section 4.1.1, among participants’ choices for each speed and each experiment; the values can be seen in Table 2. The χ² score shows that the average user was consistent in their own choices. The users also appear to agree with one another, given that u in this case ranges from $-\frac{1}{15}$ to 1. However, the question remains: are these values significant?
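For reference, the range of the coefficient of agreement can be checked numerically. The sketch below uses Kendall's standard definition of u for m judges and n items, which is assumed here to match the definition in Section 4.1.1; the function name and the toy matrices are illustrative only.

```python
from math import comb


def agreement(combined, m):
    """Kendall's coefficient of agreement u.

    combined[i][j] is the number of the m participants who chose item i over item j.
    """
    n = len(combined)
    sigma = sum(
        comb(combined[i][j], 2)
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return 2 * sigma / (comb(m, 2) * comb(n, 2)) - 1


m = 15  # number of participants in the study

# Two items, all 15 participants agree: u reaches its maximum of 1.
print(agreement([[0, 15], [0, 0]], m))

# Two items, participants split as evenly as possible (8 vs 7):
# u reaches its minimum, -1/m for odd m, which is -1/15 here.
print(agreement([[0, 8], [7, 0]], m))
```

With an odd number of participants an exact even split is impossible, which is why the lower bound is $-\frac{1}{15}$ rather than $-\frac{1}{14}$.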

TABLE 2. Metrics for Statistical Analysis of each environment and speed.

We analyzed the significance of the coefficient of agreement u for all models using the approximation described in Section 4.1.1. If the coefficient of agreement is significant, groups of perceptual similarity can be formed, within which colors are not perceived as significantly different. At the α = 0.05 level with 66 degrees of freedom, we conclude that there is some agreement among participants for all experiments except speed = 1.0 m/s in the dark environment. For the conditions with significant agreement, we can therefore reject the hypothesis that there is no significant difference between the colors.
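The test statistic can be sketched as follows. This is an assumption rather than a restatement of Section 4.1.1: it uses a large-sample approximation found in nonparametric statistics texts, χ² = n(n − 1)/2 · [1 + u(m − 1)] with n(n − 1)/2 degrees of freedom, which yields the 66 degrees of freedom quoted above when n = 12 colors. The value n = 12 is itself inferred from that figure, and the function name is hypothetical.

```python
from math import comb


def agreement_chi_square(u, m, n):
    """Large-sample chi-square statistic for the coefficient of agreement u.

    m is the number of participants and n the number of items compared.
    Returns (statistic, degrees_of_freedom).
    """
    df = comb(n, 2)
    return df * (1 + u * (m - 1)), df


# With 15 participants and 12 colors, u = 0 gives a statistic equal to its
# degrees of freedom, and larger u moves it into the upper tail.
print(agreement_chi_square(0.0, 15, 12))
print(agreement_chi_square(0.1, 15, 12))
```

The statistic would then be compared against the upper-tail critical value of the χ² distribution with the returned degrees of freedom at the chosen α.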

5.3 Discussion and recommendations

Our preliminary results indicate that the speed of moving 3D virtual objects interacts with color and luminance to affect depth perception, to an extent, in OST-AR headsets. Color and luminance are well-studied depth cues, but the influence of these depth cues can vary depending on the speed of a 3D object and the luminance of the environment.

For both experiments, brighter colors appeared closer than darker colors. However, two observations stand out at all speeds. First, warmer colors were not favored over cooler colors of the same luminance. Second, blue and yellow were consistently chosen least often among the brighter colors. For all experiments except the bright environment with objects moving at 0.5 m/s, dark blue and dark yellow were also chosen least often among the dark colors.

In post-experiment discussions, many participants noted that darker colors were more translucent than brighter colors, making them appear faded and thus signaling that these objects were further away. This was especially noticeable in the brighter environment, where the increase in brightness caused greater translucency. Upon review of previous works, this issue appears to be specific to OST-AR devices: video AR overlays the virtual objects on a camera-rendered image, so the AR objects themselves have no translucency issues. This could also explain why agreement was higher in the brighter environment, since for any bright/dark pair the brighter color was nearly always chosen. Even for specific dark pair combinations, users were able to see one of the objects more clearly, as dark blue and dark yellow were rarely chosen.

In the dim environment, however, participants noted that the darker objects were more easily viewable. The reduced translucency made objects appear more opaque, leaving users unsure of which objects appeared closer, and this led to much less agreement among users. For both experiments, adding motion to these objects changes which color/luminance pair appears closer to the user. Green overtakes magenta in all cases, though sometimes barely. The background illumination also appears to play a role in the depth perception of AR objects, as we found it harder to see objects in very bright settings, such as outdoors in direct sunlight. AR developers should consider at least three factors when creating objects or adding color to already existing objects.

• The environmental luminance, whether dimly lit or brightly lit based on lux values

• The color and if depth or speed perception matters

• The AR object, whether stationary or moving

It is also well documented that the human visual system is more responsive to green detail than to red or blue (Bayer, 1975). This aspect of the acuity of the human visual system could explain the correlation between perceived object speed and the color green, as the experimental results show that green appeared closer in the motion conditions.

We recommend that application developers be aware of the type of environment their users will be in. If the application is designed with dim settings in mind, colors are less distinguishable and thus may not have the intended effect. Similarly, for objects moving at 1.0 m/s in the dim environment, agreement among participants was not significant. However, if the users are in brighter environments, brighter luminances will make the object more perceptible and make it appear closer. Safe choices for making stationary objects appear close are magenta and red, while cyan and green also become good choices when objects are moving. Dark colors especially should be avoided in bright environments, as they may not even be perceptible to the users.

Our findings aim to help improve applications where developers bring attention to moving objects using depth cues, such as an object tracking system. If developers understand the interactions of motion cues on depth cues, these effects can be properly utilized to the developer’s and user’s advantage.

5.4 Limitations

The results are primarily valid for the HoloLens 2 and may hold for other OST-HMDs with similar color perception. While these perceptions may not translate fully to other OST-HMD devices, the results and the experimental design can be adopted for similar experiments to understand the effects on depth and speed perception for those devices.

The total number of pairs that each participant saw was 396. Users reported impatience, and thus we used combinations of pairs instead of investigating permutations, which would have doubled the number of observations. However, by evaluating a coefficient of consistency, we can better understand whether there is a pattern of distinguishability for each participant.
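The pair count can be reproduced with a short calculation. The sketch below assumes 12 colors (the six hues named in the results at two luminance levels), three speeds, and two environments; the figure of 12 colors is inferred from the 66 pairs per condition rather than stated in this section.

```python
from itertools import combinations, permutations

HUES = ["red", "green", "blue", "yellow", "magenta", "cyan"]
COLORS = [f"{level} {hue}" for level in ("bright", "dark") for hue in HUES]
SPEEDS_M_PER_S = [0.0, 0.5, 1.0]
ENVIRONMENTS = ["dark", "bright"]

conditions = len(SPEEDS_M_PER_S) * len(ENVIRONMENTS)

# Unordered pairs: each color pair is shown once per condition.
pairs_per_condition = len(list(combinations(COLORS, 2)))
total_pairs = pairs_per_condition * conditions

# Ordered pairs: each color pair would be shown in both arrangements.
ordered_per_condition = len(list(permutations(COLORS, 2)))
total_ordered = ordered_per_condition * conditions

print(pairs_per_condition, total_pairs)
print(ordered_per_condition, total_ordered)
```

Under these assumptions each condition contains 66 pairs, for 396 in total, and presenting both orderings of every pair would raise the total to 792.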

There are a few limitations to consider when using an OST-AR device, such as the HoloLens 2. These devices struggle to render objects in brighter conditions and are quick to overheat in hotter environments. In this research, we worked with the limitations set forth by Microsoft. We have found that in our geographical location, the MR objects are nearly invisible when rendered in outdoor environments during the daytime with bright sunshine. Furthermore, due to the changing nature of outdoor lighting and lighting arrangements, such as the position of the Sun and the resulting orientation of lighting on a given day, it would be difficult to ensure uniform outdoor environments for all the users involved in the study, increasing the variability of perception. Hence, our research results are limited to indoor environments.

6 Conclusion

In this paper, we evaluated the effects of color, luminance, motion, and environment settings on the depth perception of 3D objects with an OST-AR headset. To determine these effects, we conducted two paired-comparison experiments. The results of our study indicate that motion cues on 3D AR objects interact with the color and luminance of not only the object but also the environment to affect the depth perception of objects in OST-AR devices. For indoor settings, speed changes which colors are perceived as closer, up to a point. We have also shown that for OST-AR devices, similar to prior research in the real world, colors with a brighter luminance appear closer than colors with a darker luminance.

In the future, we plan to investigate the effects of speed estimation of objects moving towards a third-party object in comparison to moving towards the user, i.e., time-until-collision. We hope that the results of this experiment can provide insight for object tracking systems with additional highlighting of objects for visibility, as specific color choices may be necessary based on environment and motion. Developers will find it advantageous to understand the effects of motion and environment, as well as previously studied depth cues on depth perception for 3D AR objects.

Data availability statement

The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by University of Texas at Dallas Institutional Review Board. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

OA conducted all experiments and wrote the majority of the paper outline. BP edited the paper and provided insight into experimentation, acting as advisor. H-JG assisted with editing the paper and conducting some experiments. All authors contributed to the article and approved the submitted version.

Funding

This research was sponsored by the DEVCOM U.S. Army Research Laboratory under Cooperative Agreement Number W911NF-21-2-0145 to BP.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Author disclaimer

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the DEVCOM Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation.

References

Adam Jones, J., Edward Swan, J., Singh, G., Reddy, S., Moser, K., Hua, C., et al. (2012). “Improvements in visually directed walking in virtual environments cannot be explained by changes in gait alone,” in Proceedings of the ACM symposium on applied perception (SAP ’12) (New York, NY, USA: Association for Computing Machinery), 11–16. doi:10.1145/2338676.2338679