The Impact of an Audience on the Appeal of Virtual Reality

Virtual reality in a public place is enticing for some yet daunting for others. Social Impact theory proposes that performing in front of larger (vs. smaller) audiences is typically seen as more anxiety provoking and less desirable. Having peers perform with you can offset this, however. Our goal was to test whether Social Impact theory extends to the context of trying virtual reality in a busy public setting, and whether any such effects are influenced by extroversion and trait anxiety. In Experiment 1, we ran an online study with 100 participants and found that images of people trying virtual reality in front of others were indeed rated as more anxiety provoking than images with no audiences. Images with (vs. without) audiences were also rated as scenarios in which people would be less willing to try virtual reality. There was no impact of extroversion levels on people’s reported Willingness to Try; however, extroverted individuals were less affected by audience size compared to introverts in terms of how anxiety-provoking they considered the scenario. Experiment 1 also found that the presence of a monitor showing one’s virtual reality “performance” made extroverts keener to try the experience, yet introverts less keen. Experiment 2 tested whether the main findings of the first study extended to a real-world scenario. Sixty-nine participants observed 0–3 individuals trying a virtual-reality experience in the foyer of a busy library and were then questioned on expected anxiety levels and Willingness to Try. Whilst anxiety levels were again influenced by the audience size (number of people in the foyer at the start of each test), there was no impact of audience size on Willingness to Try virtual reality. Note that the relative inattention of the audience to those trying VR in Experiment 2 (compared to Experiment 1), as well as a small sample size, may have made it hard to detect effects here. Extroverts were again less anxious about trying VR in front of others compared to introverts. These findings offer some ways to make public space virtual reality experiences more accessible, whilst suggesting future steps to properly assess some exploratory findings presented here.


INTRODUCTION
We are entering what seems to be a second renaissance for virtual reality (VR), with the appearance of affordable, high-quality equipment for home usage (McRoberts, 2018; Slater, 2018; Bennett et al., 2021), and an increasing number of location-based VR attractions (Fink, 2018) appearing in tourist hotspots. VR exhibitions are also popping up in places like museums, with the shift to interactive digital exhibits being thought to help offset declining visitor numbers (Geronikolakis, 2018). Sometimes such VR experiences take place in public in front of onlookers, raising an important question: does public viewing put people off trying VR? Fiennes (2017, web-blog) found that people "didn't want to look silly in front of friends and family" prior to trying VR for the first time. Allen, Kidd and Nieto McAvoy (2020) reported that VR users worry about potential incongruencies between public VR participation and their projected identity, specifically about "looking like a wally" in front of peers and strangers. These reports suggest that the placement and publicness of experiences are significant factors for organisers and developers to plan for. Here, we investigate the extent to which these concerns are present, and examine whether there might be ways of countering this potential barrier.
It is worth mentioning that the donning of the VR headset cuts viewers off from their physical and social surroundings and introduces some unique issues for VR relative to other experiences. For example, we recently found that people felt significantly more disconnected from their body and surroundings when taking part in a gallery-based VR experience, compared to those doing the same experience but in mixed-reality (Verhulst, Woods, Whittaker, Bennett, and Dalton, 2021). Others have found that users are worried about personal security (Fiennes, 2017, web-blog) or being affected by nausea. And although the audience is not visible, it can still be socially influential (Dashiell, 1930; for a discussion on this and social presence in VR, see Oh et al., 2018). One other unique issue for VR is that because the user wears the VR technology via a head-mounted display, which is itself a relatively rare technology with approximately 10-15% penetration rates in the United Kingdom, the user becomes part of the spectacle of the experience: putting the user on display as much as the technology. These differences may act so that the typical effects of an audience on an individual (discussed next) do not generalise to VR, indicating the relevance of research we present here.
Whilst we are unaware of research exploring this issue before in the context of VR, prior research on embarrassment and stage fright may offer useful insight. We draw upon theories of social impact (for an overview see Bond, 2005) for a conceptual basis, which broadly propose that the size of an audience determines the degree of impact one feels, for example, in embarrassing situations (for a review on embarrassment see Keltner and Buswell, 1997). For Social Impact theory, this follows a diminishing-returns power law (cf. Stevens' power law, 1957), such that each additional audience member has a progressively smaller level of impact; increasing the number of recipients of the audience's attention, however, dilutes this impact of audience size, whilst also following a power law. Note that other theories of social impact predict different albeit still positive relationships between audience size and social impact (e.g., the Social Influence Model, predicting an "S" shaped relationship; Tanford and Penrod, 1984). In this manuscript we do not focus on the shape of this relationship; rather, we test for its presence, if any, in virtual reality. We use the term "Social Impact theory" here to collectively refer to theories of social impact.
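The diminishing-returns relationship described above can be illustrated with a short sketch. It assumes the power-function form usually associated with Social Impact theory (Latané, 1981), namely impact = s * N^t with 0 < t < 1 for an audience of size N, and s * N^(-t) for impact divided among N co-performers. The function names, the scaling constant s = 1 and the exponent t = 0.5 are illustrative assumptions, not values estimated in this manuscript.

```python
def audience_impact(n_audience, s=1.0, t=0.5):
    """Impact felt by a lone performer facing n_audience onlookers: s * N**t.

    s and t are illustrative placeholders; t < 1 gives diminishing returns.
    """
    if n_audience < 0:
        raise ValueError("audience size cannot be negative")
    return s * n_audience ** t


def divided_impact(n_performers, s=1.0, t=0.5):
    """Impact felt by each of n_performers sharing the audience's attention."""
    if n_performers < 1:
        raise ValueError("there must be at least one performer")
    return s * n_performers ** (-t)


# Extra impact contributed by each additional audience member (2nd to 5th).
marginal_gains = [audience_impact(n + 1) - audience_impact(n) for n in range(1, 5)]
```

Under these assumptions each successive entry of `marginal_gains` is smaller than the last, and `divided_impact(3)` is lower than `divided_impact(1)`, mirroring the predicted offsetting effect of co-performers.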
Social Impact theory also predicts that the strength and immediacy of the audience are key in determining impact. Strength can be conceptualized in terms of, for example, the social status of the members of the audience, and past or possible future relationships members may hold with the recipient of attention. Immediacy on the other hand refers to the "closeness in space or time and absence of intervening barriers or filters" (Latané, 1981, p344). The authors note that Social Impact theory relates to (indeed, could help explain) several other psychological phenomena such as the bystander effect (on the diffusion of responsibility in the presence of others; for a recent meta-analysis see Fischer et al., 2011), and social loafing (on the reduced effort exerted when working in a group compared to working alone; see Latané et al., 1979).
Demonstrations that Social Impact theory can help to explain stage fright are particularly relevant for the current work. In a study by Jackson and Latané (1981), participants were shown a photo of an audience of 1, 3 or 9 individuals alongside a photo consisting of 0, 2, or 8 co-performers. The participants were asked to imagine that they were singing their (American) national anthem in the situation portrayed by each photo-photo combination; the participants then rated their anticipated nervousness and tension. The authors found that audience size did indeed predict stage fright, and that this followed the expected pattern of diminishing returns; also, as predicted, increasing the number of co-performers reduced this nervousness (again with diminishing returns). In their second (observational) study, participants who were taking part in a talent show in front of an audience of 2500 members were asked, an hour before their performance, how nervous and tense they thought that they would be on stage. Again, an increasing number of co-performers was linked with a decrease in tension and nervousness by means of a power function. Similar findings were reported by Diener et al. (1980), who, over the course of 3 studies, asked their participants to undertake a series of "self-conscious" tasks together, in front of audiences of various sizes ("The embarrassing tasks were acting like a chimp, making 'gross' sounds, 'finger-painting' with one's nose, and sucking on a baby bottle", p451). After each task, participants specified how self-conscious/embarrassed they felt. As before, the more peers doing the task, the less embarrassed they felt (experiment 1); conversely, as audience size increased, embarrassment increased (experiment 2), with both relationships again following a diminishing returns power law.
Beatty and Payne (1983) also provided support for the theory, demonstrating similar effects with a state-anxiety measure, as opposed to general questions of embarrassment, in a task where participants gave a speech in front of 5, 10, 15, 20 or 25 listeners. The authors also measured participants' Social Desirability (a person's need for approval from others; measured with a questionnaire designed by Crowne and Marlowe, 1964) a week before their speeches and found this score co-varied with anxiety, suggesting that specific personality factors are also likely to influence these types of judgement. For this reason, we also consider individual differences within the current research, focusing in particular on extroversion and trait anxiety.
We predict that more extroverted individuals will be less affected by issues of audience size compared to those who are more introverted, given extroverts' willingness to take social risks for "Sensation Seeking" (a personality trait strongly linked with extroversion, e.g., Eysenck, 1990). Accordingly, extroversion has been found to negatively correlate with stage fright (Steptoe et al., 1995). We also measure trait anxiety levels (day-to-day anxiety levels; Rowland and Van Lankveld, 2019) given that trait anxiety has also been implicated in determining levels of stage fright.
In Experiment 1 we conduct an online study to assess whether audience size, number of peers, trait anxiety and degree of extroversion do indeed impact upon anxiety levels and people's Willingness to Try VR in a public VR experience scenario. Several other issues are also explored here, including whether the presence of a monitor showing one's VR performance influences people's judgements on this task. Experiment 2 is observational in nature, extending the findings of Experiment 1 to a real-world scenario in which participants could opt to take part in a real VR experience in front of others and with peers. We assess anxiety levels and Willingness to Try VR before participation.

EXPERIMENT 1
We adopted a similar task to Experiment 1 of Jackson and Latané (1981) in that participants were shown images of situations of VR users trying VR sometimes in front of an audience and sometimes with peers. The participants were then queried as to their state anxiety levels (Beatty and Payne, 1983) and Willingness to Try, in terms of the imagined context where they were offered the opportunity to take part in a VR experience. Two additional questions, fleshed out below, assessed the effects of the presence (vs. absence) of a VR monitor screen and the presence of a lab-coated (vs. non-lab-coated) helper. Both of these investigations were exploratory in nature.
The idea of being in VR and having one's actions visible to others on a monitor is likely intimidating for some, reducing people's Willingness to Try VR. It stands to reason that social facilitation and inhibition effects may be at play here (Steinmetz and Pfattheicher, 2017; in augmented reality, see Miller et al., 2019), which act to speed up easy tasks but make harder tasks more error prone. Zajonc (1965) explains how arousal from the audience drives these different effects, whilst self-presentation theory posits they arise because of distraction and embarrassment (discussed by Bond and Titus, 1983). As most people have not tried VR before (e.g., one study surveying 2003 people found that only 16% had tried VR; ComRes, 2017), people may think their first attempts at VR will be evaluated even more poorly in front of an audience, decreasing their Willingness to Try VR. Note that introversion/extroversion is likely to modulate how concerned one is about performing poorly in front of others, so it may only be introverts' Willingness to Try scores that will be negatively impacted by the presence of a monitor. In addition, in real-world settings, a consequence of seeing a monitor showing someone's activity in VR may be to draw people's attention (Mai and Khamis, 2018), the thought of which could also impact people's Willingness to Try VR.
A further question of interest concerned the effect of the presence of a perceived authority figure. As made famous by Milgram's studies on authority (see Gibson, 2019 for a recent overview), lab-coated individuals exert a degree of influence over others, and are more trusted, compared to those who are casually dressed (for doctors at least, Brase and Richmond, 2004; and for casually versus smartly dressed academics, Lightstone, Francis, and Kocum, 2011). We looked here to see if, likewise, a lab-coated individual would be more likely to entice individuals to try VR relative to someone casually dressed.

Hypotheses
Based on Social Impact theory, our primary hypothesis, the "Audience" Hypothesis, was that the presence (vs. absence) of an audience would lead to greater state anxiety levels and reduced desire to try VR. Thus, our expectation was that anxiety would be a main contributor to Willingness to Try. We also predicted that having multiple people doing VR alongside the participant would help to offset the off-putting effects of the presence of an audience; we have termed this the "Peers" Hypothesis. In addition, we tested whether people are more willing to try VR when approached by lab-coated helpers ("Labcoat" Hypothesis), and when there is a monitor showing one's actions in VR ("Monitor" Hypothesis). Via exploratory analyses, we tested whether extroverted individuals would be less affected by audience size and number of peers than more introverted individuals ("Extroversion" Hypothesis). Similarly, we tested for the same pattern with low and high Trait Anxiety individuals ("Anxiety" Hypothesis).

Participants
One hundred participants (61 free-text entry reported as female, 35 male, 1 a-gender, 1 non-binary) were recruited from Prolific Academic to take part in the study in return for a payment of 0.85 United Kingdom pounds (£9.55/hour). Via a filtering feature of Prolific Academic, only participants aged between 25 and 34 and who reported being from the United Kingdom could be recruited. This group was selected because the results helped to steer a virtual reality experience that was being developed in conjunction with the National Gallery, United Kingdom, aimed primarily at this age group. The participants' ages indeed ranged from 25 to 34 years (M = 29.29 years, SD = 2.72); although one participant reported being from Bulgaria and one from Norway, the remainder did report being from the United Kingdom. The experiment was conducted on May 8, 2019, from 10:37 GMT onwards, over a period just shy of 2 hours. Data was collected in 3 batches one after the other (one batch of only 3 participants, followed by one of 10, after which the remaining participants were recruited), to offset any unforeseen issues relating to experimental software and server infrastructure (there were none; see Woods et al., 2015, for a recent methodological overview of internet-based psychological research). The participants took an average of 321 s to complete the study (SD = 136). All participants provided their informed consent prior to taking part. All testing protocols were approved by the Departmental Ethics Committee.

Stimuli
Image stimuli measuring 570 × 380 pixels were constructed in Photoshop by combining different independent layer visuals (via the Photoshop layer feature). A background gallery scene was shared by all images (which was a photo of an area in the National Gallery, altered by removing signs and gallery paintings via Photoshop). On top of this were placed various silhouettes of photos of people in various poses. Black (Audience members, hex colour #000000) and grey silhouettes (VR Peers, #AFAEAE) were constructed by hand (several were mirror flipped and manually adapted to appear different). Care was taken to remove stereotypical gender-identifying features (such as high heels and baldness). Photographs of Oculus Rift headsets were added to the VR Peer silhouettes. For the main part of the study there were four images that differed according to the number of audience members (black silhouettes; either no audience or 5 individuals; the latter was decided upon as it was the largest audience we could cleanly fit into the image stimuli, which were designed to replicate a real-world testing space being developed in the UK's National Gallery) and by the number of VR Peers (grey silhouettes; either 1 or 3; the latter was decided upon as two additional peers were the most who could be cleanly placed next to the original VR peer in the image stimuli; Figure 1). In another phase of the study, viewers saw an image in which a monitor was placed behind a solitary VR Peer (Figure 2, "Monitor"). This monitor was constructed by hand in Photoshop to resemble a virtual reality scene. In a further phase, participants saw either a lab-coated assistant or casually dressed assistant (Figure 2 "Lab-coat" or "Casual"; 508 × 1025 pixels) alongside the Solo-Audience image stimulus. The assistant again was constructed by hand in Photoshop. The only pixels to differ between these differently dressed individuals were the photoshopped addition of a lab-coat, and the removal of trouser flares.

Apparatus
Given that the experiment was conducted online, the apparatus varied by participant. The experiment took place within an 800 × 600-pixel box in the participant's web browser. It was conducted on the Internet using version 3 of the Xperiment research package (Woods et al., 2015). Via a feature of Prolific Academic, participants were required to do the study on a desktop computer.

Design
The first part of the experiment was based on a within-participants experimental design with all the participants providing ratings for all four conditions (Solo-Audience, Peers-Audience, Solo-NoAudience, Peers-NoAudience). The main dependent variables, each measured on a 100-point scale, were the extent to which participants thought that the scenario would provoke Anxiety (only collected for the main part of the study) and participants' Willingness to Try each VR condition (collected on all conditions). These were collected in a random order during the main part of the experiment. There were then two additional trials that were presented in random order. The lab-coat trial had a between-subject design with participants randomly doing one of two versions of this trial. All participants undertook the monitor trial.
Trait Anxiety and Introversion/Extroversion were measured via two randomly ordered questionnaires after all experimental trials. The former was collected via the anxiety questions of the DASS21 (Lovibond and Lovibond, 1995). The latter was measured by the Introversion/Extroversion questions of the BFI-10 (Rammstedt and John, 2007). Questions on these questionnaires not pertaining to anxiety or introversion/extroversion were not asked. Familiarity with VR was measured by asking participants if they use VR "never", "once or twice", "occasionally" or "regularly".

Procedure
Trial ordering is detailed in Figure 3. In the main part of the study, participants undertook two experimental trials, each of which involved making judgements about four images (shown in Figure 1) which represented all possible combinations of audience size (0 or 5) and number of VR Peers (1 or 3). Participants were not provided with any additional context about the scenes in the images. In the Anxiety trial, participants were asked "How anxious do you think you would feel trying out virtual reality in the scenarios?", and in the Willingness to Try trial, "How likely is it you would try out virtual reality in the scenario?". Ratings were undertaken by means of a tool similar in design to a linescale, which we call the boxscale (Figure 4; as first used in Van Doorn et al., 2017). Participants were presented with each image in turn in the top part of the screen (the horizontal location of each image was randomised per participant per trial to be centrally placed either at 20, 40, 60 or 80% of the width of the viewing window). Only one image was shown at a time. The participant was instructed to drag the image into the box in the bottom half of the screen, placing it so that its horizontal position reflected the desired score on the scale labelled beneath the far left and right of the bottom of the box (vertical position did not matter and was not recorded; scores were from 0 to 100 and recorded to two decimal places). Upon placement, the next stimulus was revealed, and the process repeated until all stimuli had been placed within the boxscale. Participants could re-arrange the images if they so wished. The next trial commenced when the participant clicked a 'next' button or pressed the spacebar (doing this before all images were placed instead led to the unplaced images being highlighted in red briefly).
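As a rough illustration of how a boxscale placement could be turned into a rating, the sketch below maps a horizontal drop position onto the 0 to 100 scale and rounds to two decimal places, as described above. The function name, the pixel-based arguments, the use of a single reference point for the image, and the clamping of positions outside the box are all assumptions made for illustration; the source does not describe how the Xperiment package implements this step.

```python
def boxscale_score(x_px, box_left_px, box_width_px):
    """Convert a horizontal drop position (pixels) into a 0-100 rating.

    Hypothetical helper: x_px is the reference point of the dropped image,
    box_left_px the left edge of the box, box_width_px the width of the box.
    """
    if box_width_px <= 0:
        raise ValueError("box width must be positive")
    fraction = (x_px - box_left_px) / box_width_px
    # Assumption: positions outside the box are clamped to the scale ends.
    fraction = min(1.0, max(0.0, fraction))
    # Scores are recorded to two decimal places; vertical position is ignored.
    return round(100.0 * fraction, 2)


# Example: an image dropped a quarter of the way along a 600-pixel-wide box.
example_score = boxscale_score(x_px=250, box_left_px=100, box_width_px=600)
```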
Next, participants undertook two additional randomly ordered trials using the same procedure specified above. In the lab-coat trial, participants saw an assistant who was either dressed casually or in a lab-coat (randomly determined per participant; 50 participants were shown each image), and who was placed to the left of the 'Peers-Audience' stimulus (see Figure 2). Participants were asked "You are approached by the above person on the left, asking if you would like to try the virtual reality experience in the scenario on the right. How likely is it you would try out virtual reality in the scenario?". In the monitor trial, participants were shown the "Monitor" stimulus and were asked "Note the monitor, which allows everyone else in the room to see what the person is experiencing in the virtual world. How likely is it you would try out virtual reality in the scenario?". Participants then filled in the extroversion and trait anxiety questionnaires (presented in a random order) and were debriefed as to the purpose of the experiment.

Preregistration and Data Analysis
The analysis was preregistered (https://aspredicted.org/44M_9M9) on February 7, 2019. The data analysis is based on methods specified in Wilcox (2012), focusing on trimming-based analysis (20%), and undertaken in R (version 3.5.2) via the WRS package (version 0.24). This methodology is robust to many of the issues that befall traditional measures, such as issues of normality, heteroscedasticity and outliers, whilst often being more sensitive to experimental effects (Wilcox, 2020). The Holm-Bonferroni method was used to control for multiple comparisons.
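For readers unfamiliar with these two building blocks, the following minimal sketch shows a 20% trimmed mean and the Holm-Bonferroni step-down adjustment. It is written in plain Python purely for illustration and is not the WRS code used for the reported analyses; the function names and example values are hypothetical.

```python
import math


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of observations."""
    data = sorted(values)
    g = math.floor(proportion * len(data))  # observations trimmed from each tail
    kept = data[g:len(data) - g]
    if not kept:
        raise ValueError("no observations left after trimming")
    return sum(kept) / len(kept)


def holm_bonferroni(p_values):
    """Holm-adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


# Illustrative values only: one extreme rating barely moves the trimmed mean.
example_mean = trimmed_mean([1, 2, 3, 4, 100])
example_adjusted = holm_bonferroni([0.01, 0.04, 0.03])
```

The trimmed mean discards the most extreme ratings in each tail before averaging, which is what makes it resistant to outliers, while the Holm procedure is never less powerful than a plain Bonferroni correction.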

Results
Five participants reported trying VR occasionally, and three regularly. These individuals were excluded from all analyses (46 reported trying VR once or twice, and 46 reported never having tried VR).

Anxiety
A robust trimmed-mean (20%) based alternative to a 2-way repeated measures ANOVA was performed on the Anxiety judgements data via the wwtrim function (Wilcox, 2012, p423).
There was support for both the Audience, Q = 187.93, p < 0.001, and the Peers hypotheses, Q = 7.61, p < 0.01, as indicated by main effects, but these were qualified by a significant interaction, Q = 6.50, p < 0.05, such that the effect of peers was only evident in the presence of an audience (see Figure 5). This pattern of results was confirmed in six post hoc tests using the Yuend method (WRS R package; Wilcox, 2012, p197; this step was omitted from our preregistration) to compare dependent trimmed means (20%). Solo + Audience (20% trimmed Mt = 77) was more anxiety provoking than Peers + Audience (Mt = 64; p < 0.01), Solo + No Audience (Mt = 38; p < 0.001) and Peers + No Audience (Mt = 33, p < 0.001). Peers + Audience was more anxiety provoking than Solo + No Audience and Peers + No Audience (p < 0.001). There was no influence of the number of VR Peers on anxiety when there was no audience.

Impact of a Monitor
In contrast to our hypothesis, the Yuend method (WRS r package; Wilcox, 2012, p197)

Impact of a Lab-Coated Assistant
Again in contrast to our hypothesis, the Yuend method revealed no evidence to suggest that a lab-coated assistant (Mt = 45) influenced participants' Willingness to Try VR any differently from a casually dressed assistant (Mt = 42; Ty = 0.45, p = 0.66, 95% CI: -10.89, 17.08). N.B. the 20% trimmed mean for the condition using the identical image (Peers + Audience) but where there was no assistant present is 42.

FIGURE 4 | An illustration of the boxscale procedure time-course, from left (start of trial) to right (after one image placement). Note how the image is magnified when it is dragged by the mouse (initially appearing at half scale, zooming to full size, 570 × 380 pixels), to be shrunk down when placed within the boxscale (10% size). Image size changes were "smooth", occurring at a linear rate over a 0.2 s period. Labels t0, t1 and t2 are added to help illustrate element ordering.

Exploratory Analyses
In order to examine whether Extroversion/Introversion and Trait Anxiety influenced the main factorial analyses above, we computed change scores from baseline by subtracting the rating for the "Solo-NoAudience" condition from each of the ratings given in the other three conditions. This was carried out for both Anxiety and Willingness to Try data, giving a set of six "change scores" per participant (three relating to the Anxiety data and three relating to the Willingness to Try data). These scores were then tested for correlation with Extroversion and Trait Anxiety scores. As Breusch-Pagan tests (function bptest from package lmtest) found that some of the pairs of data were affected by heteroscedasticity (p < 0.05), the corb function (Wilcox, 2012, p459) was used as a robust alternative to the Pearson measure of relatedness. The Holm-Bonferroni method was used to control for multiple comparisons. There was one significant correlation: extroversion negatively correlated with reported anxiety change for Solo + Audience (r = -0.28, CI -0.46, -0.08, p = 0.01), such that the more extroverted one was, the less anxiety provoking one found the Solo + Audience condition (over the baseline condition where there was no audience; see Figure 7). We also investigated whether Trait Anxiety and Extroversion impacted upon whether participants were more willing to try VR when the participant's experiences were visible on a monitor, and when there was a lab-coated helper present (as opposed to a casually dressed helper). Once again, difference scores were computed by subtracting the rating given for the Solo + NoAudience condition from the rating given to the same condition in the presence of the monitor, and lab-coat helper. Only in the former did extroversion correlate with this change measure (r = 0.23, CI 0.40, 0.04, p < 0.01). As seen in Figure 8A, extroverts were more likely to try VR when others could see their VR performance. Introverts on the other hand were less likely to try VR in the presence of the monitor. Indeed, if we compare the most extreme extroverts (scoring 7) with the most extreme introverts (1), there is a 23.12% change in Willingness to Try. Perhaps to be expected, Trait Anxiety showed the opposite, negative relationship (r = -0.24, CI -0.44, -0.04, p < 0.05; see Figure 8B), with anxious individuals less likely to try VR with a monitor. There was a 9.93% change in Willingness to Try between the most (scoring 16) and least anxious participants (0).
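The change-score step described above amounts to a simple subtraction per participant, sketched below. The condition labels follow the Design section; the function name and the example ratings are hypothetical and are not data from the study.

```python
BASELINE = "Solo-NoAudience"


def change_scores(ratings, baseline=BASELINE):
    """Subtract the baseline rating from the rating in every other condition.

    `ratings` maps condition name to one participant's 0-100 rating.
    """
    base = ratings[baseline]
    return {
        condition: score - base
        for condition, score in ratings.items()
        if condition != baseline
    }


# Hypothetical Anxiety ratings for a single participant.
example_ratings = {
    "Solo-NoAudience": 30.0,
    "Peers-NoAudience": 25.0,
    "Solo-Audience": 80.0,
    "Peers-Audience": 65.0,
}
example_changes = change_scores(example_ratings)
```

Applying the same function to a participant's Willingness to Try ratings yields the other three of the six change scores, which can then be correlated with the questionnaire scores.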

Discussion
There was support for the Audience hypothesis: people perceived the VR activity as being more anxiety-provoking and reported being less willing to try it in the presence (vs. absence) of an audience. The Peers hypothesis was also supported to an extent, with individuals reporting that the presence (vs. absence) of peers reduced their Anxiety about the VR experience. The presence of peers also increased their Willingness to Try the experience, but this effect was only seen in the presence of an audience. Overall, these findings are in line with Social Impact theory. Via exploratory analysis, we also found that one's level of extroversion influenced these judgements, with extroverts being less anxious than introverts about trying VR in the presence of an audience. There was no support for the Monitor and Labcoat hypotheses, with the presence of a monitor or a lab-coated individual having no impact on people's Willingness to Try VR. Curiously though, exploratory analyses found that extroversion and Trait Anxiety were impactful in the monitor condition. Introverts were less likely to try a VR experience when others could see their VR performance whilst extroverts were more likely to try VR under these circumstances. Increased Trait Anxiety reduced people's Willingness to Try VR when others could see their performance.
A limitation of this study was that no additional context was provided to the participants to explain the VR scenarios in the images. One participant may have interpreted the images as occurring in their living room (surrounded by friends), whilst another, at a public gallery (among strangers). A consequence of this would be a dilution of experimental effects. A further consideration raised by a reviewer was the dissimilarity of the scenes portrayed in the images with existing, real-world, VR events (and experiments), which may call into question the generalisability of these findings. A way to offset this in the future would be to use photo-realistic imagery depicting the scenes, as well as a text-based explanation of the context of the situation.

FIGURE 5 | 20% trimmed mean data for Anxiety data, split by Audience and Peer factors. 95% confidence intervals based on 20% trimmed data calculated via the trimci function of the R WRS package (Wilcox, 2012; n.b. confidence intervals did not take into account the repeated measures nature of the design). All group trimmed means statistically differed from each other, p < 0.05, unless otherwise stated.

FIGURE 6 | As specified in Figure 5, except with Willingness to Try data.

EXPERIMENT 2
The previous study was based on people's judgements about their likely behaviour in scenes that were depicted in simplified computer-presented images, and it is therefore possible that the results do not reflect the behaviour that is found in real-world settings. Here, we conduct an observational field study to see if individuals confronted with an actual busy situation would make similar judgements to the participants in Experiment 1. The study was conducted in the foyer of a library building in a University in the United Kingdom, between 9.30am and 12.30pm during term time (May 8 and 10, 2019), before the Covid pandemic. Sample size was based on the number of people able to be recruited during these times. A different study (not reported here) was being conducted over the same period and involved 0-3 people undertaking a virtual-reality or mixed-reality experience. Participants in the current study saw people undertaking these other headset-based experiences and were asked to judge the likelihood that they would take part if they had the time to do so, and the Anxiety they expected they would feel. The hypotheses were the same as those in Experiment 1.

Participants
People were approached and asked if they would like to take part in a short questionnaire-based survey in exchange for a chocolate bar. Informed consent was obtained from all 81 participants (M = 23.55 years, SD = 9.17, max = 72, min = 17) and all testing protocols were run under the jurisdiction of the departmental Ethics Committee.

Stimuli and Apparatus
This study was observational in nature. The "stimuli" were the number of people taking part in headset-based experiences in another study running during the same time period. The headsets used in that study were Oculus Go (https://www.oculus.com/go/) or Mira Prism augmented reality headsets (https://www.mirareality.com/; note that the headsets differed somewhat from that shown in Experiment 1). There were 3 headsets available to potential participants for that study at any one moment. This equipment was placed in a busy public location (to ensure the presence of "an audience"), on a table in a foyer outside the entrance to the library (Figure 9). The data gathered from participants were questionnaire-based and collected on iPads (6th generation). Busyness of the foyer (person throughflow) was recorded by means of a Raspberry Pi and webcam setup mounted one floor above the foyer, looking down on the table holding the headgear for the other study (Figure 9). The iPads and the Raspberry Pi were connected wirelessly to each other (over the college internet, via a remote server). After demographic data were entered, the iPads requested a tally of the current person count to be stored on a remote server (hosted on Amazon Web Services London servers and databases), alongside a photo of the scene taken at that moment (faces were automatically obfuscated via custom software based on open-source TensorFlow-based computer vision packages). Once the questionnaire was completed, the participant's responses were stored alongside these data.
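The obfuscation step described above can be illustrated with a minimal sketch. It assumes that face bounding boxes have already been produced by some external detector; the custom TensorFlow-based software itself is not described in the text, so no detector is called here. The function names `paint_over_faces` and `build_record`, the box format and the record fields are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]          # one RGB pixel
Image = List[List[Pixel]]             # rows of pixels
Box = Tuple[int, int, int, int]       # (left, top, right, bottom); right/bottom exclusive

WHITE: Pixel = (255, 255, 255)


def paint_over_faces(image: Image, boxes: List[Box]) -> Image:
    """Return a copy of `image` with every box filled white.

    Boxes are clipped to the image bounds, so a detection that runs
    past the edge of the frame is handled without error.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = [list(row) for row in image]  # copy, so the raw frame is not modified
    for left, top, right, bottom in boxes:
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                out[y][x] = WHITE
    return out


def build_record(image: Image, face_boxes: List[Box], person_count: int) -> Dict:
    """Bundle the person tally with the anonymised photo (hypothetical record layout)."""
    return {
        "person_count": person_count,
        "photo": paint_over_faces(image, face_boxes),
    }


if __name__ == "__main__":
    # Tiny 4x4 black frame with one assumed face detection in the middle.
    frame = [[(0, 0, 0) for _ in range(4)] for _ in range(4)]
    record = build_record(frame, [(1, 1, 3, 3)], person_count=12)
    print(record["person_count"], record["photo"][1][1])
```

Painting over a copy rather than the original frame means only the anonymised image would need to leave the device, which fits the description of faces being obfuscated before storage.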

Design
The study was observational and so the factors of 'busyness' and 'number of people taking part in headset-based experiences in the other study' varied between subjects according to natural variations in pedestrian traffic through the area (henceforth termed 'audience', for consistency with Experiment 1) and participation in the other study. Note that in our preregistration, we specified that we would experimentally vary the number of headsets that were visibly available for testing. Unfortunately this proved too technically challenging to achieve; however, there was good variation in the number of headsets in use at any one time (0, 1, 2 and 3 headsets were observed by 24.6, 44.9, 23.2 and 7.2% of the participants, respectively).
As in Experiment 1, the main dependent variables were Anxiety and Willingness to Try each VR condition as reported in the questionnaire. Introversion/Extroversion was also measured (at the end of the study) along with the familiarity with VR measure. We did not collect information here pertaining to participants' gender and trait-anxiety level.

Procedure
The procedure is graphically depicted in Figure 10. Upon agreeing to take part in the study, participants were first asked some demographic questions. They were then asked "There is a virtual reality demo in the corner of this room. How many headsets can you see?" and "How many headsets are currently being used?". In the same fashion as in Experiment 1, participants were next asked (in random order) "How anxious do you think you would feel trying-out those virtual-reality experiences?" and "If time were not an issue, how likely is it you would try out those virtual-reality experiences?". Note, however, that a regular line scale was used here, whereas in Experiment 1 we used a box scale (both scales scored responses along a 0-100 range; here this was done by sliding a pointer). Introversion/extroversion was then assessed. Participants were then debriefed as to the nature of the study.

Preregistration and Data Analysis
The analysis, design and procedure were preregistered (https://aspredicted.org/blind.php?x=p7i36p). However, there were some differences between what was preregistered and the procedure and analysis (detailed later) that we determined it was best to run. Because of this, whilst hypotheses remain unchanged, all analyses for this study are exploratory in nature. Three participants did not answer all our survey questions and so could not be included in all analyses. A further 9 participants had to be excluded from analyses as no foyer busyness images were gathered for them, due to some Wi-Fi issues on the days of testing.
FIGURE 9 | Composite of two photos. The photo on the left shows the position of the webcam (see white arrow) that recorded people traffic, which was mounted approximately 6 m above floor level on an alcove. The photo on the right, taken by the webcam, shows the position of the table (see white arrow) upon which the headsets available to participants in the other study were located. Detectable faces were automatically painted over with white boxes in this photo. The resolution of the photo in the figure has been reduced to bolster anonymity.
Frontiers in Virtual Reality | www.frontiersin.org February 2022 | Volume 2 | Article 807910
A power analysis was performed to check the suitability of our sample size, using G*Power (version 3.1; Faul et al., 2009), including seven predictor variables (including interactions), with alpha = 0.05 and power = 0.80. The recommended sample sizes for small, medium and large effects (f² = 0.02, 0.15 and 0.35) were respectively 712, 103 and 49 participants. This implies that we may only be able to reliably detect large effects.
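For readers who wish to check these figures, the calculation can be sketched as follows. The sketch assumes the G*Power test was the fixed-model linear multiple regression test of R² deviation from zero, which the text does not state. The symbols u, v, N and λ are introduced here for illustration only.

```latex
\begin{align}
  u &= 7, \qquad v = N - u - 1, \qquad \lambda = f^{2} N, \\
  \mathrm{power}(N) &= \Pr\!\left[ F'_{u,\,v}(\lambda) > F_{1-\alpha;\,u,\,v} \right],
  \qquad \alpha = 0.05 .
\end{align}
```

Here F' denotes a noncentral F variable with u and v degrees of freedom and noncentrality parameter λ, and F_{1-α; u, v} is the critical value of the corresponding central F distribution. The recommended sample size for a given f² is the smallest N for which power(N) reaches 0.80. Under the same notation, the F(7, 60) statistic reported for Willingness to Try would correspond to v = 60, that is, N = 68.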

Exploratory Results and Discussion
Multiple moderation linear regressions were run to predict Anxiety and Willingness to Try scores based on the factors of 'Number of VR Peers', 'Audience Size', 'Extroversion', and two interaction terms: 'Audience Size' x 'Extroversion'; and 'Audience Size' x 'Number of VR Peers'. 'Ever used VR' and 'Age' were entered into the model as covariates. We followed the multiple moderation regression process outlined by Hayes (2017), using PROCESS 3.5 with Model 2. The HC3 heteroscedasticity-robust standard error estimator was used to compensate for issues of heteroscedasticity (Hayes and Li, 2007); Johnson-Neyman significance regions were calculated to tease apart significant interactions (the interaction of interest entered into a separate Model 1 moderation analysis; Hayes, 2017). Visual inspection showed that the data were not normally distributed and were affected by heteroscedasticity, so parameters and their 95% confidence intervals were estimated via bootstrapping (5,000 samples).
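One way to write out this model is given below. It assumes the standard two-moderator form of PROCESS Model 2 with Audience Size as the focal predictor, which is a reading of the description above rather than something the text states explicitly. The symbols A (Audience Size), E (Extroversion), P (Number of VR Peers) and the coefficient labels are introduced here for illustration.

```latex
\begin{align}
  Y &= b_0 + b_1 A + b_2 E + b_3 P
       + b_4 (A \times E) + b_5 (A \times P)
       + b_6\,\mathrm{EverUsedVR} + b_7\,\mathrm{Age} + \varepsilon, \\
  \theta_{A \rightarrow Y}(E, P) &= b_1 + b_4 E + b_5 P .
\end{align}
```

Y stands for either Anxiety or Willingness to Try, and θ is the conditional effect of audience size at given levels of the two moderators. Written this way the model has seven predictor terms, which is consistent with the seven predictors entered into the power analysis.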
FIGURE 10 | Illustration of the study design and procedure. All participants were asked to answer all questionnaire-based questions. Participants had to count the number of headsets and the number of participants currently using those headsets (black-dashed boxes). The busyness of the foyer (grey box) was also recorded.

Anxiety
In terms of individual factors, as predicted by the Audience hypothesis, audience size positively impacted upon Anxiety levels, above that expected by chance (p < 0.05, Table 1). There was also an influence of Age and Extroversion (p < 0.05), with older individuals more likely to experience Anxiety, and extroverts also more likely to experience Anxiety at the idea of trying VR. But the latter was qualified by an interaction with audience size, which is key here for testing the Extroversion Hypothesis. As can be seen in Figure 11, as audience size increased, introverts found the idea of trying VR increasingly anxiety-provoking, whilst the opposite was observed for those more extroverted (p < 0.05, Table 1). When audiences were greater than 39 people (rounding down to the nearest person; via Johnson-Neyman significance regions, p = 0.05), there was a significant difference between extroverts and introverts. Contrary to expectation, however, extroverted individuals were more anxious to try than introverts with audiences smaller than 14 people. We explore this finding more in the main overall discussion.
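The logic behind Johnson-Neyman boundaries such as those reported above can be sketched in a few lines. The sketch assumes a simple moderation model in which the effect of the focal predictor at moderator value w is b_focal + b_inter * w, with variance var_focal + 2 * w * cov + w² * var_inter. The coefficient values, covariance entries and critical t value in the example are invented for illustration and are not the estimates from this study; the function names are likewise hypothetical.

```python
import math
from typing import List


def conditional_t(b_focal: float, b_inter: float, var_focal: float,
                  cov_focal_inter: float, var_inter: float, w: float) -> float:
    """t ratio of the focal predictor's conditional effect at moderator value w."""
    effect = b_focal + b_inter * w
    variance = var_focal + 2.0 * w * cov_focal_inter + w * w * var_inter
    return effect / math.sqrt(variance)


def johnson_neyman_bounds(b_focal: float, b_inter: float, var_focal: float,
                          cov_focal_inter: float, var_inter: float,
                          t_crit: float) -> List[float]:
    """Moderator values at which |conditional t| equals t_crit.

    Solves (b_focal + b_inter*w)^2 = t_crit^2 * Var(effect at w),
    which is a quadratic in w. Returns the real roots in ascending order
    (an empty list if the effect never crosses the threshold).
    """
    t2 = t_crit ** 2
    a = b_inter ** 2 - t2 * var_inter
    b = 2.0 * (b_focal * b_inter - t2 * cov_focal_inter)
    c = b_focal ** 2 - t2 * var_focal
    if a == 0.0:
        # Degenerate case: the equation is linear (or has no solution).
        return [] if b == 0.0 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


if __name__ == "__main__":
    # Hypothetical inputs only: NOT the coefficients estimated in Experiment 2.
    example = dict(b_focal=0.5, b_inter=-0.02, var_focal=0.04,
                   cov_focal_inter=-0.001, var_inter=0.00004)
    for w in johnson_neyman_bounds(t_crit=2.0, **example):
        print(round(w, 2), round(conditional_t(w=w, **example), 3))
```

With two real roots, the conditional effect is significant on one side of the lower boundary and on the other side of the upper boundary, which is the pattern described above for small and large audiences.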
There was no evidence for multiple people doing VR at a time offsetting anxiety, relative to fears when doing VR alone (the Peers hypothesis). Note that we did not manage to experimentally vary the number of headsets on display during the experiment, as planned in our preregistration. As participants could always see 3 headsets available for usage, it is possible this offset some fears of doing VR alone. This may have confounded the results for the Number of VR Peers and would need to be controlled for in future studies.

Willingness to Try
The identical analysis to the above was performed for the Willingness to Try data. There was no support for the hypothesis that an audience would reduce people's Willingness to Try VR: the moderator analysis was non-significant, R² = 0.1, F(7, 60; HC3) < 1. We observed, though, that most of the 'audience' where the testing took place paid no attention to individuals doing the virtual reality experiences. Consistent with this, Willingness to Try scores in Experiment 1 averaged 64.3% without an audience and 40.3% with an audience (SDs of 20.6 and 23.9, respectively), whereas scores in Experiment 2 (computed by means of a regression equation) were much higher: 83.8% with a small audience (10) and 79.9% with a large audience (40). Note how Anxiety scores are similarly less impactful in Experiment 2. We contend that if the audience were actively watching those doing virtual reality (as portrayed in the scenario in Experiment 1), effects on Willingness to Try may become more apparent. As one reviewer kindly noted, perhaps the term 'passer-by' better describes the onlookers in this study, rather than 'audience'. The small sample size here likely also made it hard to detect effects. A logical step would be to include a measure of 'audience attention' in future research on this topic and ensure larger sample sizes.
It is also important to consider that in Experiment 2, participants were actively approached to be asked questions, which can be off-putting; this could well affect introverts and extroverts differently, and may have had an impact on the results. This could be explored in the future by providing free-standing, unstaffed terminals where participants can decide for themselves whether to take part in the study.
We also thank a reviewer who pointed out another contrast between studies that could have affected results: in Experiment 1 our participants were aged 25-34 years, whilst in this study they were aged 17-72. Collecting gender and other more in-depth demographic information could help statistically control for such issues in future studies.

Time of Day
During the analysis we observed that for each testing session the average audience size grew as the day progressed. This can be seen in Figure 12A. It is possible this acted to confound the results presented here (for example, with introverts potentially more likely to come to the library early in the day to avoid the crowds). Note, however, that mean extroversion scores did not appear to vary as a function of time (Figure 12B).

OVERALL DISCUSSION
Over a series of two studies (Experiment 1 online; Experiment 2 a "real-life" observational study), we found consistent evidence for audience size influencing people's anxiety about trying VR in a public setting (in Experiment 1 at least this also affected people's willingness to try VR), and some more qualified evidence for an influence of the number of other VR users in the experience.

Audience Size
In line with Social Impact theory, there was broad support for the Audience Hypothesis: that the idea of being observed whilst doing VR would both promote anxiety and (in some cases) reduce the likelihood of people wanting to try VR. In terms of anxiety, in both studies, the presence of an audience led to greater anxiety levels associated with the VR experience relative to there being no audience (Experiment 1) or a reduced audience (Experiment 2). In terms of the presence of an audience reducing people's Willingness to Try VR, there was only supporting evidence from the first experiment. A possibility is that the inattention of the "audience" in Experiment 2 led participants to perceive it as less intimidating than the audience in Experiment 1. More specifically, the audience in Experiment 2 paid little attention to those in VR as they walked about the library lobby (tending instead to be chatting in their own groups, walking by, or focusing on their smartphones, as can be seen in Figure 9), in contrast to Experiment 1 where those in VR were depicted as being directly observed. This contrast is also apparent in relation to the audiences reported elsewhere in the literature, where participants performed in an actual talent show (Jackson and Latané, 1981) or made "gross" sounds (Diener et al., 1980) in front of an actively observing audience. It is likely that a more passive audience would dilute any effect of audience size, which may help explain why Experiment 2 only partially replicates the findings of Experiment 1. A fruitful next step would be to check how the findings reported here extend to a real-world situation with a more attentive audience. We would also like to point out the small sample size in Experiment 2, which likely also reduced the chances of detecting significant effects.
FIGURE 11 | Simple slopes equations plot (Hayes, 2017)
We hypothesised that extroversion would be associated with reduced anxiety levels in relation to the VR experience and increased Willingness to Try VR in front of an audience. Supporting evidence for this link between extroversion and the extent to which the VR experience was expected to be anxiety-provoking was observed in both studies, whereas the link between extroversion and Willingness to Try VR was only observed in Experiment 1. As before, we feel that the inattention of the audience in Experiment 2 may have made it hard to detect any effects of extroversion here.
There were, however, two caveats for the findings relating to anxiety. Firstly, in Experiment 2, extroverts were unexpectedly more anxious about trying VR than introverts in front of a small audience. Secondly, extroverts' anxiety levels unexpectedly dropped as the audience size increased (whilst, in line with the hypothesis, introverts became more anxious). Why was this so?
The study was run during exam season and next to a university library, so we speculate that many of the participants who took part were likely to have been affected by exam nerves. A reviewer kindly pointed out too that some theories of social impact predict there to be differently shaped functions linking audience size with impact (for an overview see Bond, 2005). Future research is needed to investigate these curious and unexpected exploratory findings.

Peers
Only in Experiment 1 did we find evidence to support our second hypothesis: that having peers take part in a VR experience would help offset anxiety issues (associated with there being an audience) and increase one's Willingness to Try VR. However, these findings may again have been influenced by the audience in Experiment 2 being not as daunting as that imagined in Experiment 1. Another consideration is that in Experiment 2, having three headsets on show all the time when participants were questioned about anxiety and Willingness to Try levels may have influenced results here. This potentially signified that if one were to take part in the experience, peers may well join at some point, helping to offset fears. This is an interesting question for future research.

Presence of the Monitor
Extroversion also appeared to play a role in whether the presence of a monitor showing one's performance in VR made it more likely that someone would try VR in the first place. Perhaps unsurprisingly, extroverts were more willing to try VR in this scenario, whilst introverts were less willing to try. Trait anxiety levels showed the opposite pattern to extroversion, such that more anxious individuals were less likely to try VR in the presence (vs. absence) of a monitor.

CONCLUSION
This research helps to highlight how the design aspects of VR experiences can act as barriers, or drivers, when people are deciding whether to try a public experience. A simulated online experiment agreed with a real-world observational task that the presence of an audience increased people's reported anxiety about taking part in a VR experience. There was some suggestion that the presence of an audience is more off-putting for introverts than for extroverts; indeed, there was tentative evidence to suggest that an audience might be a driver of participation for some extroverts (perhaps even cathartic to anxiety). These results are still preliminary, however, and should therefore be treated as speculative.
Although having multiple participants do the VR experience simultaneously was found to offset issues related to being observed by an audience in Experiment 1, this was not found in Experiment 2. Overall, this pattern of results suggests that the presence or absence of an audience might be a more important factor for experience designers to consider than the number of people taking part simultaneously. However, once again this conclusion can only be tentative, because of the lack of a convincing manipulation of VR peer numbers in Experiment 2. Finally, we found that extroverts were more willing to try a VR experience when others could see their performance on a big screen; introverts, on the other hand, were less willing in the same scenario.
In general, the variability in how people responded to the different scenarios here suggests that introducing some elements of flexibility into VR experiences could help encourage as many people as possible to take part. For example, one could imagine a location-based experience in which participants can indicate a preference to take part in a visible area or a screened area, or where each participant can decide whether or not they would like their experience to be shared via a monitor. We hope that the findings presented here can offer some ways to make public space virtual reality experiences more accessible to all.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in the article Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Royal Holloway ethics committee (https://intranet.royalholloway.ac.uk/staff/research/research-and-innovation/research-enterprise/ethics/home.aspx). The participants provided their written informed consent to participate in this study.