Pre-Exposure Cybersickness Assessment Within a Chronic Pain Population in Virtual Reality

Virtual Reality (VR) is being increasingly explored as an adjunctive therapy for distraction from symptoms of chronic pain. However, using VR often causes cybersickness, a condition with symptoms similar to those of motion and simulator sickness. Cybersickness is commonly assessed using self-report questionnaires, such as the Simulator Sickness Questionnaire (SSQ), and assessment is traditionally conducted post-exposure. It is usually safe to assume a zero baseline of cybersickness, as participants are not anticipated to be exhibiting any sickness symptoms pre-exposure. However, amongst populations such as chronic pain patients, it is not unusual to experience symptoms of their condition or medication which could have a confounding influence on cybersickness symptom reporting. Therefore, in population groups where illness and medication use are common, assuming a zero baseline is not necessarily desirable. This study aimed to investigate cybersickness baseline recordings amongst a chronic pain population, and highlights how deviations from an assumed baseline may lead to adverse effects being incorrectly attributed to VR exposure. A repeated measures study design was used, in which twelve participants were assessed pre and post VR exposure via SSQ. Significant differences were found between actual and assumed pre-exposure baseline scores. Furthermore, we found significant differences between actual and assumed increases in cybersickness scores from baseline to post-exposure. This study highlights that clinical sub-populations cannot be assumed to have a zero baseline SSQ score, and this should be taken into consideration when evaluating the usability of VR systems or interventions for participants from different demographics.


INTRODUCTION
Virtual Reality (VR) is being used more often in medical and scientific research, for a variety of applications (Riva, 2005;Malloy and Milling, 2010;Valmaggia et al., 2016;Vaughan et al., 2016), and has been demonstrated as a powerful and flexible technology which is also affordable and relatively easy to use.
However, in spite of the rich potential of this technology for use in healthcare, it is common for persons to prematurely exit a VR experience because of symptoms associated with cybersickness (McCauley and Sharkey, 1992;Garrett et al., 2017). Cybersickness is defined as the onset of nausea, oculomotor, and/or disorientation symptoms while experiencing virtual environments (Rebenitsch and Owen, 2016). This can cause problems for VR users, as the discomfort caused by cybersickness limits interaction longevity.
Symptoms of cybersickness can include nausea, headaches, dizziness, eyestrain, sweating, and disorientation (LaViola, 2000). It has been reported that as many as 80% of participants experience an increase in symptoms within 10 min of being exposed to VR (Kim et al., 2005;Cobb et al., 1999), and although these studies pre-date consumer VR, recent research indicates that this issue is still prevalent (Yildirim, 2020).
Although it's clear that side effects from VR exposure exist, there is a lack of consistency in the literature regarding the precise definitions. Rebenitsch and Owen (2016) describe the symptoms of cybersickness produced in users of VR systems to "mimic motion sickness, but due to the absence of actual physical motion this affliction is considered a distinct condition referred to as cybersickness." Cybersickness has been referred to also as visually induced motion sickness (VIMS), virtual simulation sickness, virtual reality-induced symptoms and effects, amongst other terms, as well as commonly being misinterpreted as simulator sickness. Cybersickness is distinctly separate from simulator sickness by the characteristics of its symptom profile, and the apparent disparity in symptom intensity (Stanney et al., 1997).
There is some discussion in the literature regarding other terms for these effects, for example 'virtual reality-induced symptoms and effects' (Cobb et al., 1999). However, for clarity, in this paper we will refer to the side effects of VR exposure as 'Cybersickness,' as this is the term most commonly used in the literature under discussion. We acknowledge that future work in this field should consider updated terminology in order to describe the symptoms.
The safety of a device or intervention should be paramount when determining whether it is suitable for its intended audience, especially when developing novel applications for the purpose of medical interventions, rehabilitation, or training.
For clinical VR research, it is common to evaluate whether the VR system causes cybersickness symptoms, and thus to determine whether it is safe to implement compared to an alternative intervention. For example, VR is being used more commonly within military environments where retention of information and task performance are vital, and thus information inhibition caused by cybersickness symptoms is an important consideration (Stanney et al., 2020).
Aside from the safety considerations, cybersickness may have implications for other factors in immersive systems. It has been suggested that individuals who report greater sickness symptoms in VR could be expected to report less presence (Witmer and Singer, 1998;Weech et al., 2019), which may have unwanted effects on desired outcomes. For example, when VR is used for the purposes of pain distraction, presence is considered a major contributor toward pain alleviation being achieved (Hoffman et al., 2004;Wiederhold et al., 2014). It is therefore important to test for factors, such as cybersickness, which could potentially affect treatment outcomes. It is recommended that applications should be tested for cybersickness, and evaluated at the feasibility stage (Lubetzky et al., 2018;Davis, Nesbitt, and Nalivaiko, 2015). Cybersickness is traditionally measured using self-report questionnaires, with the most commonly used being the Simulator Sickness Questionnaire (SSQ) (Kennedy et al., 1993). The SSQ was developed for use with simulators, and was adapted from Kennedy's work in developing the Pensacola Motion Sickness Questionnaire (Kennedy et al., 1965); however, it has been adopted widely for use with Virtual Environments (VE), as their symptom profiles and sickness characteristics are similar (Stanney et al., 1997).
A number of other self-assessment questionnaires have been devised for monitoring cybersickness, such as the Virtual Reality Symptom Questionnaire (Ames et al., 2005) and the Virtual Reality Sickness Questionnaire (Kim et al., 2018), an adaptation of Kennedy's SSQ, although these have been used sporadically. A common criticism of these self-report questionnaires has been that they take too long to administer; therefore, shorter single-question measures have also been used (Keshavarz and Hecht, 2011). Aside from questionnaires, assessment of cybersickness by means of postural instability has been used more recently (Risi and Palmisano, 2019), as postural instability has been suggested to be a cause of the experience of cybersickness (as it has similarly been hypothesised to be a contributing factor in the cause of simulator sickness) (Stoffregen et al., 2000), although conflicting opinions exist (Dennison and D'Zmura, 2017).
There is surprisingly limited discussion in the literature regarding what is considered a 'normal' score for cybersickness amongst healthy or non-healthy populations. In relation to the SSQ, amongst healthy participants, it is not usually necessary to perform sickness questionnaires pre-exposure, as a baseline SSQ score could reasonably be assumed to be 0 (indicating no symptoms). However, participants from clinical populations may exhibit symptoms similar to cybersickness pre-exposure, and thus for these populations, the assumed zero baseline may be incorrect. For example, Bouchard et al. (2009) reported non-zero pre-exposure scores amongst participants with selected anxieties. Kennedy et al. (1993) did suggest that pre-exposure screening of participants should be administered, but went on to recommend that individuals in a state other than their usual fitness (who score a non-zero pre-exposure score) should be eliminated from further participation, and thus only post-exposure assessment should be scored. However, if we only test on healthy participants, we can never test with clinical populations (such as people with chronic pain). Likewise, if we removed the participants who answered as anything other than 'well,' we would be removing the target population we are trying to study, which in turn would not facilitate clinical work being conducted. Amongst clinical populations such as chronic pain patients, it would certainly be counter-productive to eliminate individuals in this capacity, as it has been suggested that confounders between cybersickness and medication exist (McCauley and Sharkey, 1992). Furthermore, understanding the pre-exposure state is important, as any pre-exposure symptoms could influence the interpretation of post-exposure scoring (Kennedy et al., 1993). Thus we suggest that it would be more informative to assess cybersickness pre-exposure, and observe changes which may occur between pre and post-exposure assessment.
Without pre-exposure assessment, incorrect conclusions about the effect of a VR intervention could be formulated.
In lieu of pre-exposure assessment via SSQ, instruments such as the Motion Sickness Susceptibility Questionnaire (Golding, 1998) may be administered to assess susceptibility to symptoms. However, susceptibility questioning alone does not reflect the current state of patients, but rather previous experiences within motion sickness-inducing situations, which is not sufficient for determining the effect of a VR intervention.
To date, much of the pain research concerned with pain populations does not include any type of sickness assessment as part of their study protocols-including post-exposure sickness questionnaires. However, there are a few which do measure or discuss pre-exposure baseline or pre-exposure symptoms (e.g., Sarig Bahat et al., 2015;Wiederhold et al., 2014;Bouchard et al., 2009). Kennedy et al. (1993) suggested that sickness susceptibility questionnaires could be used as an alternative to pre-exposure baseline testing. However, in the majority of pain research in VR, neither this nor other pre-exposure baseline symptom testing measure is used, nor is the potential need for them discussed. Furthermore, susceptibility questionnaires do not elicit data regarding current symptoms, and therefore do not address the issue of a non-zero baseline score.
As the SSQ is currently the most widely used measure for cybersickness, we use this measure pre and post-exposure in order to facilitate comparisons with other work. The most common approach for assessing results of the SSQ is Kennedy et al.'s (1993) weighted scoring, although this has been criticised for scores being inflated by counting items multiple times in the total score calculation (Bouchard et al., 2007). Alternative scoring methodologies have been proposed, such as Bouchard et al.'s (2007) revised factor structure, which proposes assessment with raw scores rather than Kennedy's weighted score calculation.
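To make the weighted scoring concrete, the following is a minimal sketch of how a single completed SSQ might be scored. The item names, the item-to-subscale assignment, and the weights (9.54, 7.58, 13.92, and 3.74) are assumptions based on how Kennedy et al.'s (1993) procedure is commonly reported; they are not taken from this paper and should be checked against the original instrument before use. The function name `score_ssq` is hypothetical.

```python
# Item-to-subscale assignment as commonly reported for the SSQ (assumption).
NAUSEA = ("general_discomfort", "increased_salivation", "sweating", "nausea",
          "difficulty_concentrating", "stomach_awareness", "burping")
OCULOMOTOR = ("general_discomfort", "fatigue", "headache", "eyestrain",
              "difficulty_focusing", "difficulty_concentrating",
              "blurred_vision")
DISORIENTATION = ("difficulty_focusing", "nausea", "fullness_of_head",
                  "blurred_vision", "dizzy_eyes_open", "dizzy_eyes_closed",
                  "vertigo")
ALL_ITEMS = set(NAUSEA) | set(OCULOMOTOR) | set(DISORIENTATION)


def score_ssq(responses):
    """Score one questionnaire.

    `responses` maps item name -> rating from 0 (none) to 3 (severe).
    Items that are missing from the mapping are treated as 0.
    """
    for item, rating in responses.items():
        if item not in ALL_ITEMS:
            raise ValueError(f"unknown SSQ item: {item}")
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"rating for {item} must be 0, 1, 2 or 3")

    # Raw subscale sums. Several items appear in two subscales, which is
    # the double counting that the weighted total has been criticised for.
    n_raw = sum(responses.get(item, 0) for item in NAUSEA)
    o_raw = sum(responses.get(item, 0) for item in OCULOMOTOR)
    d_raw = sum(responses.get(item, 0) for item in DISORIENTATION)

    return {
        "nausea": n_raw * 9.54,
        "oculomotor": o_raw * 7.58,
        "disorientation": d_raw * 13.92,
        "total": (n_raw + o_raw + d_raw) * 3.74,
    }


if __name__ == "__main__":
    # A single "slight" nausea rating contributes to two subscales.
    print(score_ssq({"nausea": 1}))
```

In this sketch a single rating of 1 on the nausea item raises both the Nausea and Disorientation subscales and is counted twice in the total, which illustrates the inflation that raw-score approaches aim to avoid.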
We suggest that it is important to understand whether some clinical populations may present pre-existing symptoms similar to symptoms of cybersickness (H1). Furthermore, it has been observed previously that cybersickness symptom scores may decrease rather than increase as the result of a VR intervention (e.g., Bouchard et al., 2009). A decrease in cybersickness-like symptoms in such a population may still give a post-exposure score greater than the zero baseline (H2). It could be hypothesised that a direct comparison of post-exposure SSQ scores between healthy and pain populations may indicate that an intervention has made the pain population sicker than the healthy population. However, if pre-exposure (baseline) cybersickness scores were taken into account, then it may be that any difference is due to a baseline difference, and not caused by the intervention itself (H3).
H1: The pain population will have significantly higher pre-exposure SSQ scores than the normal population assumed baseline.
H2: The pain population will have significantly higher post-exposure SSQ scores than the normal population assumed baseline of zero.
H3: The difference in SSQ scores from pre-exposure to post-exposure will be significantly less than the difference between the assumed baseline score and the post-exposure score.
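The quantities compared in the three hypotheses can be written out explicitly. The notation below is an illustrative assumption and does not appear in the original work: for participant i, pre_i and post_i denote the measured pre-exposure and post-exposure SSQ scores, and the assumed baseline is 0.

```latex
\begin{align}
\Delta_i^{\text{actual}}  &= \text{post}_i - \text{pre}_i \\
\Delta_i^{\text{assumed}} &= \text{post}_i - 0 = \text{post}_i \\
\Delta_i^{\text{assumed}} - \Delta_i^{\text{actual}} &= \text{pre}_i \ge 0
\end{align}
```

In this notation, H1 concerns whether the pre_i exceed 0, H2 whether the post_i exceed 0, and H3 whether the actual differences are smaller than the assumed differences. Because SSQ scores cannot be negative, the last line shows that the assumed difference can never be smaller than the actual difference for any individual; the gap between the two is exactly the pre-exposure score, so H3 asks whether that gap is large enough to be statistically significant.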

METHODS
The participants for this study were drawn from a population of Chronic Pain patients, as this group has been identified as one which may present pre-existing symptoms (McCauley and Sharkey, 1992). In order to reduce the burden on the patient population, this SSQ study was conducted alongside a study observing the effect of VR on experimentally induced pain in Chronic Pain patients, which describes the study methods and procedure summarised here in more depth.

Participants
Twelve participants aged 39-70 (M = 56 ± 9.36) (Table 1) were recruited from a United Kingdom pain support group and networks. All participants had been experiencing chronic pain (defined as a period lasting 3 months or greater). Participants also completed pre-study screening questionnaires to exclude any factors which would prevent them from participating in a VR study. Factors for exclusion included health issues which could prevent someone from using a visual display for an extended period of time.

Design
A within-subjects, repeated measures study design was used. Pre-exposure SSQ was recorded before participants were randomised to receive either an active or passive VR distraction in a counterbalanced order. The passive intervention was part of the parallel study and is not considered further in this paper. In line with previous literature relating to SSQ scores, in this study we are only considering post-active SSQ results, referred to hereon as post-exposure unless explicitly stated otherwise.

Hardware and Software
Two software interventions were used: 1) Bananaland, an active intervention and proprietary VR experience in which the user traverses a jungle environment with ambient music accompanying the visuals; and 2) a passive intervention, which consisted of grey lines on the screen. In the passive condition users could look around; however, no dynamic visual feedback was present. This was designed to be neutral and non-engaging.
Both interventions were presented using an Oculus Rift CV1 Head Mounted Display (HMD).

Procedure
Participants were asked to sign a consent form, and were asked to confirm that no changes had occurred since registering that might be applicable to the study's medical exclusion criteria. Before any VR intervention, participants completed a pre-exposure (baseline) SSQ. As per the protocol for the accompanying study, participants were induced with experimental pain by means of a pressure cuff inflated to and sustained at 200 mmHg, and applied to their non-dominant arm. This procedure was conducted in accordance with the Submaximal Tourniquet Effort Test (SMET) (Moore et al., 1979).
Participants were then exposed to the VR intervention for a maximum of 5 min, or until the participant asked to exit, which they were able to do at any point. After the completion of each VR session, participants completed a post-exposure SSQ. After each session, the participant was given time to rest before continuing.
The study was approved by the University of Portsmouth institutional review board.

Data Analysis
Shapiro-Wilk tests of normality indicated that neither the pre-test sickness scoring (p = 0.001) nor the post-test sickness scoring (p = 0.014) was normally distributed.
As this work is concerned with being comparable with the relevant literature, we calculated our results using Kennedy's traditional approach, although we also applied Bouchard et al.'s (2007) revised factor structure and scoring where applicable.
A Wilcoxon signed-rank test was performed to assess whether our study sample had significantly higher pre-exposure SSQ scores than the normal population assumed baseline. A second Wilcoxon signed-rank test was performed to assess whether our study sample had significantly higher post-exposure SSQ scores than the normal population assumed baseline. A Mann-Whitney U test was performed to assess whether the difference in SSQ scores between pre and post-exposure in our study sample was significantly less than the difference between the assumed baseline score and the post-exposure score of our study sample.
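As an illustration of the first two comparisons, the sketch below implements a one-sample Wilcoxon signed-rank test against an assumed baseline of zero using only the Python standard library. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption and may not match the software or settings used for the analysis reported here. The function name `wilcoxon_vs_zero` and the example scores are hypothetical and are not the study data.

```python
import math


def wilcoxon_vs_zero(scores):
    """One-sample Wilcoxon signed-rank test of `scores` against zero.

    Returns (w_plus, z, p) where w_plus is the sum of ranks of the
    positive scores, z is the normal-approximation statistic and p is
    the two-sided p-value.
    """
    # Scores equal to the baseline carry no sign information; drop them.
    diffs = [s for s in scores if s != 0]
    n = len(diffs)
    if n == 0:
        raise ValueError("all scores equal the assumed baseline")

    # Rank the absolute values, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return w_plus, z, p


if __name__ == "__main__":
    # Hypothetical scores: two participants at zero, ten above zero.
    example = [0, 0, 3.74, 7.48, 11.22, 14.96, 18.7, 22.44,
               26.18, 29.92, 33.66, 37.4]
    print(wilcoxon_vs_zero(example))
```

When every non-zero score is positive, the rank sum is n(n + 1)/2, which is 55 for ten non-zero scores and 66 for eleven. Under the no-ties assumption used in this example, the approximation gives two-sided p-values of roughly 0.005 and 0.003 respectively.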

RESULTS
Descriptive statistics for the pre-exposure and both post-exposure SSQ scores are shown in Table 2, and Figures 1 and 2.
All scores referred to as weighted have been calculated using Kennedy et al.'s (1993) method. References to non-weighted scores apply Bouchard et al.'s (2007) scoring approach.
(1) Pre-exposure SSQ scores compared to the assumed baseline (zero) of healthy participants.
A Wilcoxon signed-rank test indicated that the pre-exposure scores were significantly greater than the assumed baseline (zero) (z = 55, p = 0.005).
(2) Post-exposure SSQ scores compared to the assumed baseline (zero) of healthy participants.
A Wilcoxon signed-rank test indicated that the post-exposure scores were significantly higher than the assumed baseline (zero) (z = 66, p = 0.003).
The mean post-exposure score obtained from our participants was 18.39, with high variability as the SD for the post-exposure was 16.19 (Table 1).
(3) Comparing post-exposure differences between actual and assumed baseline SSQ scores.
A Mann-Whitney U test indicated that the difference between the pre and post-exposure SSQ scores was significantly less than the difference between the assumed baseline and post-exposure SSQ score (z = −2.297, p = 0.020) (Figure 3).
A Mann-Whitney U test performed using Bouchard et al.'s (2007) non-weighted scoring approach indicated that the difference between the pre and post-exposure SSQ scores was significantly less than the difference between the assumed baseline and post-exposure SSQ score (z = −2.297, p = 0.020).
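The comparison in (3) can be sketched as follows: one set of difference scores is computed against the measured pre-exposure score, the other against the assumed baseline of zero, and the two sets are compared with a Mann-Whitney U test. The implementation below is a standard-library sketch using the normal approximation with a tie correction and no continuity correction; these choices, the function names, and the example data are assumptions for illustration, not the analysis pipeline or data of this study.

```python
import math


def mann_whitney_u(x, y):
    """Mann-Whitney U test for two samples.

    Returns (u_x, z, p): the U statistic for sample x, its
    normal-approximation z value, and the two-sided p-value.
    A negative z means x tends to be smaller than y.
    """
    values = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(x) + len(y)

    # Rank the pooled values, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        raise ValueError("all values are identical; test is undefined")
    z = (u_x - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return u_x, z, p


def compare_baselines(pre, post):
    """Compare change from the measured baseline with change from zero."""
    actual = [b - a for a, b in zip(pre, post)]  # post minus measured pre
    assumed = list(post)                         # post minus assumed zero
    return mann_whitney_u(actual, assumed)


if __name__ == "__main__":
    # Hypothetical weighted SSQ totals, not the study data.
    pre = [0, 3.74, 7.48, 11.22, 14.96, 18.7]
    post = [7.48, 11.22, 14.96, 22.44, 26.18, 33.66]
    print(compare_baselines(pre, post))
```

A negative z from `compare_baselines` indicates that the change measured from the actual baseline tends to be smaller than the change implied by assuming a zero baseline.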
Individual items of the SSQ are categorised into subscales which are distinct symptom clusters (Kennedy et al., 1993). We therefore also scored each symptom cluster independently (Table 3).
There was an approximately 10 point increase in mean weighted SSQ scores in each of the sub categories, with the greatest variability being observed in the disorientation sub category. Whilst increases in mean SSQ scores were noted in all three categories, the symptom profile remained the same pre and post exposure.

DISCUSSION
In this study, we theorised that using post-exposure SSQ scores alone in VR studies may lead to misinterpretation of results in some clinical populations, such as those with chronic pain. To investigate this, we formulated three hypotheses regarding how pain participants' cybersickness scores would compare to an assumed baseline scoring of the healthy population, and whether the pre-exposure state of pain participants differs from healthy participants' assumed baseline.
We first hypothesised that a chronic pain population would have a baseline SSQ score higher than the zero baseline assumed for the normal population. Our results indicate that a significant difference exists between the assumed (zero) baseline and the measured pre-exposure SSQ scores, supporting the hypothesis. Much of the existing literature relies on post-exposure SSQ scores only, and few studies to date are concerned with determining the pre-exposure condition (Sarig Bahat et al., 2015;Wiederhold et al., 2014). Whilst this may be of limited importance if we are only interested in the absolute SSQ score post-exposure, it becomes highly relevant when comparing the effect of VR exposure between different populations. We cannot assume that all population sub-groups will be starting studies at the same baseline, and this could affect the interpretation of post-exposure scores.
Secondly, we hypothesised that a chronic pain population would have greater post-exposure SSQ scores than the assumed (zero) baseline. Our results indicate that a significant difference does indeed exist. We expect with a VR intervention that participants will report some, potentially significant, cybersickness post-exposure, as it is unlikely to be zero. However, when the comparison is made against an assumed zero baseline, the result does not necessarily indicate that the VR intervention was the cause. This highlights potential disparities when using assumed baselines amongst groups such as chronic pain participants (who likely enter studies with elevated baselines). Furthermore, one participant's cybersickness symptoms actually decreased between pre and post-VR intervention (Figure 1), supporting similar observations made by Bouchard et al. (2009). This result also supports our first hypothesis that assuming baselines can affect the interpretation of post-exposure scoring.
Thirdly, we hypothesised that the difference between the pre and post-exposure SSQ scores would be significantly less than the difference between the assumed baseline and the post-exposure SSQ scores. Our results demonstrated that the difference in SSQ scores measured pre to post was significantly less than the difference in SSQ scores measured between the assumed baseline and post-exposure (Figure 3). If we aim to determine whether an intervention has caused cybersickness, by just collecting and observing post-exposure scores we may conclude that it does (when compared to zero). However, if we are able to look at our population's incoming SSQ score, it may be that the intervention has little or no negative effect by comparison. Furthermore, this result demonstrates that if these results had been analysed with only the assumed (zero) baseline rather than a known baseline, a false positive effect of the intervention on SSQ would have been reported. In the absence of pre-exposure or susceptibility testing (Golding, 1998), post-exposure SSQ scoring alone could indicate that the pain population would become sicker than the healthy population, and could ultimately lead to the conclusion that an application is unsuitable for the non-healthy population. However, if pre-exposure testing is conducted, it is possible that the rate of change in both populations is comparable, and therefore the application is equally suitable for both. This is of particular importance for pain populations, as the literature highlights how cybersickness may be negatively correlated with presence (Witmer and Singer, 1998;Weech et al., 2019), and presence is considered a key reason for VR's efficacy in providing pain alleviation via distraction, as pain alleviation is suggested to be positively correlated with presence (Hoffman et al., 2004;Wiederhold et al., 2014).
While difference scoring alone is not generally considered a valid measure due to potentially poor reliability (Cronbach and Furby, 1970;Young et al., 2006), our result nevertheless demonstrates the importance of understanding pre-exposure symptoms of a clinical population when examining the effect of cybersickness. When considering whether the symptoms of cybersickness become exacerbated in a VR application, basing our assumptions purely on the post-exposure SSQ scores would be misleading.
The SSQ categorises symptoms into three distinct clusters, which are Oculomotor, Nausea, and Disorientation (Kennedy et al., 1993). Spectral profiles of sickness exist to distinguish cybersickness from simulator sickness, with which it is often confused (Stanney et al., 1997), with simulator sickness following the symptom profile of Oculomotor, Nausea, and Disorientation, in order of perceived symptom severity. Cybersickness, however, is suggested to follow the symptom profile of Disorientation, Nausea, and Oculomotor (Stanney et al., 1997) (Table 4). Although examining the differences between the SSQ subcategories was not a primary aim of this study, and is thus purely observational, we noted that the reported SSQ symptoms did not follow the symptom profile which traditionally defines cybersickness uniquely (Figure 4). We observed that the symptom profile of participants was similar to that of simulator sickness (Table 4), rather than that of cybersickness. The literature has previously highlighted discrepancies in cybersickness/simulator sickness spectral profile definitions, suggesting that users affected with cybersickness have followed symptom profiles different to that reported by Stanney et al. (1997) and Gavgani et al. (2018). Although this cannot be generalised to the population, as this is a small sample, this effect should be looked into further with a larger sample.
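The two spectral profiles described above can be expressed as orderings of the three subscale scores. The sketch below ranks a set of subscale scores from most to least severe and labels the ordering using the two profiles attributed to Stanney et al. (1997) in the text. The function name, the single-letter labels, and the example values are hypothetical, and any ordering other than these two is simply reported as "other".

```python
# Orderings from most to least severe, as described in the text above.
PROFILES = {
    ("O", "N", "D"): "simulator sickness",
    ("D", "N", "O"): "cybersickness",
}


def symptom_profile(nausea, oculomotor, disorientation):
    """Return (ordering, label) for a set of subscale scores.

    Ties are broken in the fixed order N, O, D because Python's sort is
    stable, so tied scores should be interpreted with caution.
    """
    scores = {"N": nausea, "O": oculomotor, "D": disorientation}
    ordering = tuple(sorted(scores, key=scores.get, reverse=True))
    return ordering, PROFILES.get(ordering, "other")


if __name__ == "__main__":
    # Hypothetical subscale means, not the study data.
    print(symptom_profile(nausea=20.0, oculomotor=30.0, disorientation=10.0))
    print(symptom_profile(nausea=20.0, oculomotor=10.0, disorientation=30.0))
```

Applying the same function to the pre-exposure scores, the post-exposure scores, and the pre-to-post increases separately would allow the profile of incoming symptoms to be compared with the profile of the change attributable to exposure.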
This could be because, rather than following symptom profiles devised to categorise cybersickness, pain participants are exhibiting symptoms associated with chronic pain symptom profiles. For example, at pre and post-exposure our participants scored highest in Oculomotor and Nausea (Table 3). Oculomotor elements include fatigue, headache, eye strain, and difficulty focusing, similar to some somatic symptoms associated with chronic pain, which include fatigue, unrefreshing sleep, and dyscognition (difficulty concentrating and thinking) (Wolfe et al., 2010;Crofford, 2015). Nausea elements include feeling nauseous and dizziness, which are also common side effects of opioid-based medication, such as Hydrocodone and Fentanyl, commonly prescribed for the treatment of chronic pain (Benyamin et al., 2008). However, when observing the rate of increase between pre and post-exposure, the rate at which symptoms have increased is comparable to the traditionally proposed symptom severity of cybersickness (Figure 4). This would indicate that participants entered with symptoms of their illness, which are comparable to Simulator Sickness and pain symptom profiles; however, their increase is representative of cybersickness. Future work could further investigate the symptom profiles of clinical sub-populations pre and post-exposure, which may give some indication as to whether post-exposure SSQ scores are more attributable to the VR intervention or the pre-existing clinical condition. It is also possible that the experimentally induced painful stimuli experienced during the parallel study may have caused an increase in the pain-related symptoms, which overlap with the SSQ symptom list. Further studies using a VR exposure for this group without the experimentally induced pain would control for this possible confounding factor. It has been argued that pre-exposure assessment may influence post-exposure results (Kim et al., 2005;Young et al., 2006). These previous studies showed a significant increase in reported symptoms when pre-exposure questionnaires were given, an effect which we did not observe. Future work could further explore pre-exposure bias when administering the SSQ.
Two participants reported a pre-exposure SSQ score comparable to the normal population's assumed baseline (zero), with the mean pre-exposure SSQ for the pain participants being 10.29. A high SD of 11.44 was reported, indicating that reported pre-exposure SSQ scores were highly variable (Table 1), which is representative of the pain population's variability and individuality of symptom exhibition (Allen et al., 2009;Bartley et al., 2018). Of the 12 participants, 9 had a post-exposure score greater than 10, only 2 of these showed a substantial increase on pre-exposure scores, and all 9 had a non-zero pre-exposure score. However, 1 participant did show a decrease in symptoms as opposed to an increase (Figure 1).
Using Bouchard et al.'s (2007) scoring approach, a comparable Mann-Whitney U test was performed, which returned the same confidence interval as we observed when using Kennedy's scoring. Similar differences were also observed when scoring the Nausea and Oculomotor sub-scales respectively (Figure 5). Therefore, although in other contexts Bouchard's scoring may give rise to different conclusions, for the purposes of this study both Bouchard's and Kennedy's approaches lead to the same conclusions.
We would also like to acknowledge that, although the focus of this work was to highlight that certain populations may not be as adversely affected as SSQ scores might indicate, it is possible that starting with a non-zero pre-score might mean that when participants do become sick because of VR, the rate at which their sickness increases may be greater than that of someone who entered the study with a zero baseline score. Alternatively, entering with a non-zero baseline score could also provide greater resilience to symptoms when interacting within VR. Kruk (1992) suggests that medication increases the susceptibility to simulator sickness. Although little work exists as to whether this same interaction exists for cybersickness, McCauley and Sharkey (1992) suggest that problems with cybersickness symptoms may be exacerbated by medication. Further work could be warranted to explore these points.
This study has highlighted that some sub-populations cannot be assumed to have a zero baseline. Ten out of 12 participants enrolled in this study entered with a non-zero baseline, confirming that assumed baselines should not be relied upon in research concerned with measuring cybersickness amongst non-healthy populations, and that this should be taken into consideration when evaluating the usability of VR systems or interventions for participants from different demographics.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by University of Portsmouth CCi. The participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
The study was designed by PB and WP. The data acquisition was conducted by PB. Formal analysis was conducted by PB with statistical advice provided by WP. Supervision for this study was completed by WP. The author who took lead on writing the manuscript was PB, and main feedback and edits were provided by WP. All authors contributed feedback to the final version of the manuscript.

FUNDING
This work was funded by the Faculty of Creative and Cultural Industries at the University of Portsmouth.