ORIGINAL RESEARCH article
Estimating Cognitive Workload in an Interactive Virtual Reality Environment Using EEG
- 1Biomedical Engineering, Old Dominion University, Norfolk, VA, United States
- 2Department of Neurosurgery, School of Mental Health and Neurosciences, Maastricht University, Maastricht, Netherlands
- 3Department of Psychology, Old Dominion University, Norfolk, VA, United States
- 4Virginia Modeling, Analysis and Simulation Center (VMASC), Suffolk, VA, United States
- 5Department of Biomedical Engineering, Virginia Commonwealth University, Richmond, VA, United States
With the recent surge of affordable, high-performance virtual reality (VR) headsets, there is unlimited potential for applications ranging from education, to training, to entertainment, to fitness and beyond. As these interfaces continue to evolve, passive user-state monitoring can play a key role in expanding the immersive VR experience, and tracking activity for user well-being. By recording physiological signals such as the electroencephalogram (EEG) during use of a VR device, the user's interactions in the virtual environment could be adapted in real-time based on the user's cognitive state. Current VR headsets provide a logical, convenient, and unobtrusive framework for mounting EEG sensors. The present study evaluates the feasibility of passively monitoring cognitive workload via EEG while performing a classical n-back task in an interactive VR environment. Data were collected from 15 participants and the spatio-spectral EEG features were analyzed with respect to task performance. The results indicate that scalp measurements of electrical activity can effectively discriminate three workload levels, even after suppression of a co-varying high-frequency activity.
The integration of user-state biofeedback to future virtual reality (VR) and augmented reality (AR) systems is vital for providing more immersive, adaptive and functional VR experiences, as well as optimizing human performance for a wide variety of application domains (Bisson et al., 2007; Lobel et al., 2016; Cipresso et al., 2018). Current VR headsets provide a logical, convenient, and unobtrusive framework for mounting EEG sensors. Additionally, recent advances in dry/wireless EEG electrodes (Wang et al., 2016; Zander et al., 2017; de Camp et al., 2018; Lee et al., 2018; Kam et al., 2019) and motion artifact suppression (Gwin et al., 2010; Daly et al., 2015; Kline et al., 2015; Snyder et al., 2015; Arad et al., 2018) further increase the practicality of integrating EEG into VR headsets.
The vast majority of literature focuses on active or reactive modulation (Zander and Kothe, 2011) of EEG to directly control or interact in the virtual environment, such as decoding imagined movement signals from EEG to navigate through the virtual environment while the user remains stationary in the physical space (Leeb et al., 2007; Scherer et al., 2008; Royer et al., 2010; Velasco-Alvarez et al., 2010; Doud et al., 2011). However, these designs often require significant user training, rely on unnatural or obtrusive sensory stimuli, and exhibit performance issues that limit practical, long-term use (Lotte et al., 2008, 2010; Ron-Angevin and Diaz-Estrella, 2009).
In contrast, implicit or passive BCI control, where the user's cognitive or affective state is passively monitored and used to affect some auxiliary aspect of the interaction (Zander and Kothe, 2011; Brouwer et al., 2015; Unni et al., 2017; Horvat et al., 2018; Ihme et al., 2018) may be better-suited for practical integration into VR systems. Such passive feedback can be designed to be less sensitive to BCI decoding errors, with the potential of being less noticeable and distracting to the user compared to decoding errors in direct BCI control of the environment. Thus, such passive feedback holds promise for improving engagement and immersion in VR.
Prior studies have attempted to classify different cognitive tasks such as rest vs. mental imagery (e.g., mental math or object rotation) using brain activity. The number of tasks and task difficulty can be altered to produce detectable changes in cognitive state. This has been effectively demonstrated in both EEG (Ruchkin et al., 1991; Ryu and Myung, 2005; So et al., 2017) and fNIRS (Power et al., 2010, 2012; Herff et al., 2013b). Other studies have further investigated changes in brain activity with respect to changes in cognitive workload during task performance using EEG (Berka et al., 2007; Brouwer et al., 2012; Gerjets et al., 2014; Hogervorst et al., 2014; Mühl et al., 2014; Ewing et al., 2016; Schultze-Kraft et al., 2016; Grissmann et al., 2017a,b; Scharinger et al., 2017; Pergher et al., 2018) and fNIRS (Ayaz et al., 2012; Herff et al., 2013a; Unni et al., 2017). It has also been shown that cognitive workload models trained on one task condition can be effectively transferred to other conditions (Baldwin and Penaranda, 2012).
Studies have also used passive neurofeedback of EEG or fNIRS to modulate the controllability of the player's avatar in a video game (Muhl et al., 2010), the transformation of the avatar into another physical form (Bos et al., 2010), the adaptation of the game difficulty (Girouard et al., 2013), or to monitor items in the VR environment that were detected by the user (Zander et al., 2010). See Lécuyer et al. (2008), Lotte et al. (2013), and Kerous et al. (2018) for reviews of the application of brain-computer interfaces for VR and videogames.
The aforementioned EEG-based studies largely utilize various combinations of the traditional power spectral bands (θ, α, β, γ) over frontal, central, and parietal locations. Of particular relevance to estimating cognitive workload and working memory from EEG, numerous studies have indicated that the fronto-parietal network exhibits a decrease in α power with increasing task demands, while θ power is positively correlated with increasing task demands (Sauseng et al., 2005, 2010; Brouwer et al., 2012). Other studies suggest that β activity behaves similarly to α, but may be due to motor activity required by the tasks (Pesonen et al., 2007; Scharinger et al., 2017). Due to the limitations of scalp EEG, γ activity has been less frequently reported in relation to cognitive workload. Fitzgibbon et al. found widespread γ activations in a variety of cognitive tasks (Fitzgibbon et al., 2004). Tallon-Baudry et al. revealed a specific γ-band feature for a memory task that appeared decoupled from head and neck muscle activity (Tallon-Baudry et al., 1998). Additionally, magnetoencephalographic (MEG) and electrocorticographic (ECoG) studies indicate that α-γ and θ-γ coupling play a role in working memory (Roux and Uhlhaas, 2014).
The present study aims to build upon the prior work on passive EEG-neurofeedback using the n-back task (Brouwer et al., 2012) through the use of an interactive, head-mounted VR experience. This represents a deliberate attempt to move beyond controlled and sterile experimental environments toward a practical VR application, with a detailed and potentially distracting environment in which the user physically interacts with objects to perform a task. In order to verify that EEG measures of cognitive workload can be reliably attained using an interactive VR environment, we adapted the well-established n-back task (Kirchner, 1958) for modulating cognitive workload into an immersive virtual environment using an HTC VIVE VR headset. For the classical n-back task, participants are presented with a series of symbols and are asked to respond when the current symbol matches the symbol presented n symbols ago in the sequence. The cognitive workload increases as a function of increasing n. To adapt the task to a more immersive, game-like virtual environment, stimuli were a series of colored balls presented on a virtual podium via the VR headset as shown in Figure 1.
Figure 1. Configuration of the experimental equipment on a participant (excluding the protective plastic hair dressing cap).
The details of the environment were intentionally designed to be video game-like to increase the level of immersion for comparison of task performance to prior, less visually-distracting desktop-based studies. While the present study does not implement closed-loop BCI control, the intention is to inform the integration of EEG-based feedback into future interactive VR systems.
2. Materials and Methods
2.1. Participants and Experimental Setup
Fifteen participants [ages 18–35 (mean 24.73), 4 female, all right-handed] were recruited to participate in the experiment, which was approved by the Institutional Review Board of Old Dominion University. Participants first completed an informed consent form, a visual acuity test, the Motion Sickness Susceptibility Questionnaire short-form (MSSQ-short; Golding, 2006), and the Ishihara Color Blindness test (Clark, 1924). All participants tested satisfied the inclusion criteria: specifically, all participants read the 20/30 line on the visual acuity test, scored at least 19 on the MSSQ, and correctly determined all symbols on the color-blindness test.
The HTC VIVE hardware system primarily consists of a motion-tracked headset display, two motion-tracked hand controllers, and two “lighthouse” base stations that are capable of providing 6 Degree of Freedom (6DOF) tracking. After the screening process, the EEG cap was placed on the participant's head and the EEG electrodes were filled with electrolyte gel. The electrode cap was then covered with a protective plastic hair dressing cap to ensure that the gel did not seep onto the VR headset, and the VR headset was positioned over the EEG cap. The wireless EEG amplifier was placed in a shoulder strap on the participant's back. The configuration of the experimental equipment on a participant (excluding the protective plastic hair dressing cap) is shown in Figure 1.
After the EEG and VR equipment was positioned, participants grasped a VIVE hand controller in the dominant hand (i.e., the right hand for all participants). Participants were placed in a standing position approximately 1 m in front of the recording computer, within the VR workspace.
2.2. Experimental Task
Stimuli are a series of colored balls presented on a virtual podium via the VR headset. Following McMillan et al. (2007), each ball is colored red, blue, purple, green, or yellow. A ball receptacle is placed to the right and left of the participant in the virtual environment. The target receptacle was shaped as a treasure chest. For a particular run, the participant's task was to pick up a virtual ball from the podium directly in front of them using the hand controller and move it to the target receptacle if the current ball color matched the color of the ball presented n trials before, and to the opposite receptacle otherwise. Screen captures illustrating a single trial of the task are shown in Figure 2.
Figure 2. Screen captures of a single trial of the n-back task using colored balls in the interactive virtual environment. Each frame represents the binocular view as observed through the VR headset. (1) The podium and instruction display. (2) The trial begins when the colored ball appears. (3) Participant uses the trigger on the hand controller to grasp the ball. (4) Participant moves the ball to the right toward the non-target receptacle. (5) Participant releases the ball in the non-target receptacle and the trial ends. (6–8) A new trial begins for which the ball is placed in the target receptacle to the left.
Participants completed a 5 min practice block to familiarize themselves with the VR system and the n-back task. For the practice block, participants performed the 1-back task until one run consisting of a random sequence of 20 balls was completed without any errors. Following the practice block, participants performed a series of three experimental blocks in randomized order: 0-back, 1-back, and 2-back blocks consisting of 4 runs each. For the 0-back task, participants simply determine whether each ball is red or not.
For each block, participants received specific instructions regarding the task, followed by 4 experimental runs of the same n-back task. Each experimental run consisted of a random sequence of 20 balls, each of them remaining visible for 4 s, immediately followed by the onset of the next ball. Only a single ball is displayed at any given time and an auditory tone signaled the appearance of each new ball. The sequence of ball colors was generated randomly such that a minimum of 2 target trials were present in the run. The empirical maximum number of targets in a run was 7.
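The run-generation constraint described above can be illustrated with a minimal sketch. The study does not state how its sequences were produced, so the rejection-sampling approach, the function names, and the treatment of a red ball as the 0-back target are assumptions made for illustration only.

```python
import random

COLORS = ["red", "blue", "purple", "green", "yellow"]

def count_targets(sequence, n):
    """Count target trials in a run.

    For n = 0 a target is assumed to be a red ball; otherwise a target is a
    ball whose color matches the ball presented n trials earlier.
    """
    if n == 0:
        return sum(color == "red" for color in sequence)
    return sum(sequence[i] == sequence[i - n] for i in range(n, len(sequence)))

def generate_run(n, length=20, min_targets=2, rng=None):
    """Draw random color sequences until one holds at least `min_targets` targets."""
    rng = rng or random.Random()
    while True:
        sequence = [rng.choice(COLORS) for _ in range(length)]
        if count_targets(sequence, n) >= min_targets:
            return sequence

# Example: one 2-back run of 20 balls (hypothetical seed for repeatability)
run = generate_run(n=2, rng=random.Random(0))
```

Rejection sampling keeps every accepted sequence uniformly random among those satisfying the constraint, which is one simple way to respect a minimum target count without otherwise shaping the sequence.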
Participants were required to respond to all balls in each experimental run. Failure to respond (i.e., not placing the ball in a receptacle before the end of the trial) reset the run from the beginning and negated the erroneous run. While such run resets occurred for several participants during the training run, only a single reset for a single participant occurred during the actual experimental runs.
The order of the experimental blocks was counterbalanced across participants. For each participant, the target receptacle locations were counterbalanced to avoid biases that may be created by the lateral movements. To help engage participants, the performance (percent correct) was displayed after each trial. The total duration of the experiment was kept to 20 min to reduce the risk of simulator sickness, thus the time between successive runs and blocks was less than a minute.
2.3. Data Collection and Analysis
Each participant wore an 8-channel electrode cap (g.LADYBIRD, Guger Technologies) with active electrodes positioned based on the international 10-20 system (Sharbrough et al., 1991). Specifically, electrode positions F3, Fz, F4, C3, C4, P3, Pz, P4 were used (see Figure 3), based on neural activations from prior EEG and fMRI studies (Owen et al., 2005; Brouwer et al., 2012). EEG was collected using an 8-channel wireless biosignal amplifier (g.MOBIlab, Guger Technologies), grounded to the left earlobe, referenced to the right earlobe, and digitized at 256 Hz.
Figure 3. Electrode montage with bipolar channels indicated by the numbering between adjacent electrode pairs.
The position of the VR headset and the controller were also tracked and digitized at 32 Hz. Communication between the VR software (developed in Unity) and the BCI2000 EEG recording software was performed via UDP communication using the application connector in BCI2000 (Schalk et al., 2004).
A bipolar reference was applied because it empirically minimized the correlation of high frequency activity, presumed to be due primarily to scalp muscle tension (e.g., frontalis, temporalis, and/or occipitalis), with the task compared to an ear or common-average reference. Eight bipolar channels were created by subtracting the adjacent earlobe-referenced channels from right to left and anterior to posterior, as indicated by the numbered positions in Figure 3.
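As a rough illustration of the bipolar derivation, the sketch below subtracts adjacent earlobe-referenced channels. The pairs F3-Fz, Fz-F4, F3-C3, P3-Pz, and Pz-P4 are named elsewhere in this article; the remaining three pairs (F4-C4, C3-P3, C4-P4), the channel ordering, and the function name are assumptions, since the exact montage is given only in Figure 3.

```python
import numpy as np

ELECTRODES = ["F3", "Fz", "F4", "C3", "C4", "P3", "Pz", "P4"]

# Assumed adjacent pairs (first minus second); three of these are guesses.
BIPOLAR_PAIRS = [
    ("F3", "Fz"), ("Fz", "F4"),
    ("F3", "C3"), ("F4", "C4"),   # F4-C4 assumed
    ("C3", "P3"), ("C4", "P4"),   # both assumed
    ("P3", "Pz"), ("Pz", "P4"),
]

def to_bipolar(eeg):
    """Convert earlobe-referenced EEG of shape (n_samples, 8) to 8 bipolar channels."""
    eeg = np.asarray(eeg, dtype=float)
    index = {name: i for i, name in enumerate(ELECTRODES)}
    return np.column_stack(
        [eeg[:, index[a]] - eeg[:, index[b]] for a, b in BIPOLAR_PAIRS]
    )
```

Any signal common to both electrodes of a pair, including the shared earlobe reference, cancels in the difference, which is the property that motivates a bipolar derivation.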
A conservative Hampel outlier filter was applied to the EEG to reduce the occasional impulse-like artifacts due to the wireless transmission. The Hampel filter computes the median of a sliding 1-second window centered on the current sample. The median absolute deviation is computed over the window. If the current sample differs from the median by more than five standard deviations, it is replaced with the median. The processed EEG was visually inspected to verify the efficacy of the artifact removal.
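The Hampel filter described above can be sketched as follows. This is an illustrative implementation, not the code used in the study: the factor 1.4826 (which converts the median absolute deviation to a standard-deviation estimate for Gaussian data) and the truncation of the window at the signal edges are assumptions.

```python
import numpy as np

def hampel_filter(x, fs=256, window_s=1.0, n_sigmas=5.0):
    """Replace impulse-like outliers with the local median.

    x        : 1-D signal
    fs       : sampling rate in Hz
    window_s : length of the sliding window in seconds (centered on each sample)
    n_sigmas : rejection threshold in robust standard deviations
    """
    x = np.asarray(x, dtype=float)
    half = int(round(window_s * fs / 2))
    mad_to_sigma = 1.4826  # assumed Gaussian scaling of the MAD
    y = x.copy()
    for i in range(len(x)):
        # Window is truncated near the start and end of the signal
        window = x[max(0, i - half): i + half + 1]
        median = np.median(window)
        sigma = mad_to_sigma * np.median(np.abs(window - median))
        if abs(x[i] - median) > n_sigmas * sigma:
            y[i] = median
    return y
```

With a five-sigma threshold the filter is conservative: ordinary oscillatory activity stays well inside the bound, so only isolated, large-amplitude samples are replaced.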
The EEG data were segmented by 4-second ball-presentation intervals (i.e., trials), yielding 240 total trials (4 runs × 3 conditions × 20 balls per run) per participant. The last trial of each run was excluded from the analysis due to a software issue that prematurely terminated data collection, which resulted in 228 total trials per participant for analysis. Because task performance was satisfactory for all participants (see Results section), all trials (i.e., correct and incorrect ball placements) were included in the analysis.
The frequency spectrum of the EEG was computed for each 4 s trial using Welch's method with a 256-point FFT and 50% overlap. The resulting spectral amplitudes were log transformed and the spectral bins were averaged over the traditional EEG bands: θ (5–7 Hz), α (8–14 Hz), β (15–30 Hz), and γ (31–55 Hz). Frequencies below 5 Hz are prone to gross movement artifacts and were excluded from the analysis. Additionally, a higher frequency range termed HF (70–100 Hz) was analyzed. This band was shown to be correlated with the task and is outside the frequency range of typical scalp EEG, thus the task-dependent variations of this band are suspected to be modulated in part by subtle scalp muscle tension.
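The band-feature computation can be sketched for a single channel and trial as shown below. At 256 Hz, a 256-point segment gives 1 Hz resolution, and a 4 s trial yields seven half-overlapping segments. The Hann window, the per-segment mean removal, the small constant added before the logarithm, and the function names are assumptions; the study specifies only the FFT length and overlap.

```python
import numpy as np

BANDS = {"theta": (5, 7), "alpha": (8, 14), "beta": (15, 30),
         "gamma": (31, 55), "HF": (70, 100)}

def welch_psd(x, fs=256, nfft=256, overlap=0.5):
    """One-sided Welch power spectral density using a Hann window (assumed)."""
    x = np.asarray(x, dtype=float)
    step = int(nfft * (1 - overlap))
    window = np.hanning(nfft)
    scale = fs * np.sum(window ** 2)
    periodograms = []
    for start in range(0, len(x) - nfft + 1, step):
        segment = x[start:start + nfft]
        segment = segment - segment.mean()  # remove the segment offset
        periodograms.append(np.abs(np.fft.rfft(segment * window)) ** 2 / scale)
    psd = np.mean(periodograms, axis=0)
    psd[1:-1] *= 2  # fold negative frequencies (not DC or Nyquist)
    freqs = np.fft.rfftfreq(nfft, d=1 / fs)
    return freqs, psd

def band_log_amplitudes(x, fs=256):
    """Log-transform the spectral amplitudes, then average the bins in each band."""
    freqs, psd = welch_psd(x, fs)
    log_amp = np.log(np.sqrt(psd) + 1e-12)  # small constant guards against log(0)
    return {name: log_amp[(freqs >= lo) & (freqs <= hi)].mean()
            for name, (lo, hi) in BANDS.items()}
```

Applying `band_log_amplitudes` to each of the 8 bipolar channels of a trial would give the 5 × 8 spatio-spectral feature set analyzed in the remainder of this section.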
The data were parsed by n-back level. To explore the univariate characteristics of each spectral feature, Spearman's correlation was computed between the n-back level and the spectral amplitude for each frequency band and bipolar channel. Because the HF band exhibited large task-related correlations relative to the lower-frequency bands, it is suspected that this high-frequency band is due in-part to task-dependent EMG resulting from scalp muscle tension as suggested in Mühl et al. (2014). Since EMG activity is broadband and likely also pervades the low-frequency bands (Goncharova et al., 2003; Fu et al., 2006; Muthukumaraswamy, 2013; Yilmaz et al., 2014; Janani et al., 2017), a linear regression model was applied to reduce the correlation of this high-frequency activity in the lower frequency bands. Using the trial-wise spectral amplitude of each lower frequency band as the regressand (i.e., θ, α, β, and γ), a linear regression model was generated with the corresponding HF-band spectral amplitude as the regressor. The model was then used to remove the correlated activity by subtracting the model output from the respective regressand. This approach is referred to herein as HF suppression.
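A minimal sketch of the HF suppression step for one band and channel is given below. It assumes ordinary least squares with an intercept term and a hypothetical function name; the study states only that a linear regression model was fit with the HF amplitude as the regressor and that the model output was subtracted.

```python
import numpy as np

def hf_suppress(band_amp, hf_amp):
    """Remove the component of a band's trial-wise amplitude that is linearly
    predicted by the HF-band amplitude of the same channel.

    band_amp : trial-wise log amplitude of theta, alpha, beta, or gamma (regressand)
    hf_amp   : trial-wise log amplitude of the HF band (regressor)
    """
    band_amp = np.asarray(band_amp, dtype=float)
    hf_amp = np.asarray(hf_amp, dtype=float)
    slope, intercept = np.polyfit(hf_amp, band_amp, deg=1)  # least-squares line
    model_output = slope * hf_amp + intercept
    return band_amp - model_output  # residual, uncorrelated with hf_amp
```

By construction the residual has zero linear correlation with the HF band over the trials used for the fit, so any task information that remains cannot be attributed to a linear HF contribution.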
Spearman's correlation was used to quantify the relationship between the univariate spatio-spectral features and the cognitive workload level. To explore the multivariate discriminative power of the spatial and spectral features, various combinations of the features (5 spectral bands × 8 channels) were classified using regularized linear discriminant analysis (rLDA) with a four-fold cross-validation (due to the number of trials being perfectly divisible by 4). Specifically, the HF suppression (i.e., regression) approach was applied to the training data for each fold and the fitcdiscr function in MATLAB was used to perform the rLDA and optimize the regularization parameters. The HF suppression was performed separately on the test data for each fold.
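The fold-wise procedure can be sketched in Python as follows. This is not the MATLAB fitcdiscr pipeline used in the study: the shrinkage-toward-identity regularizer with a fixed weight of 0.1 (rather than optimized parameters), the unstratified random folds, and all function names are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def hf_suppress(X, HF):
    """Regress each feature column on the matching HF column; return residuals."""
    out = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        slope, intercept = np.polyfit(HF[:, j], X[:, j], deg=1)
        out[:, j] = X[:, j] - (slope * HF[:, j] + intercept)
    return out

def fit_rlda(X, y, shrinkage=0.1):
    """Linear discriminant with pooled covariance shrunk toward a scaled identity."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    centered = X - means[np.searchsorted(classes, y)]
    cov = centered.T @ centered / (len(X) - len(classes))
    d = X.shape[1]
    cov = (1 - shrinkage) * cov + shrinkage * np.trace(cov) / d * np.eye(d)
    weights = np.linalg.solve(cov, means.T)            # shape (d, n_classes)
    priors = np.array([(y == c).mean() for c in classes])
    bias = -0.5 * np.sum(means.T * weights, axis=0) + np.log(priors)
    return classes, weights, bias

def predict_rlda(model, X):
    classes, weights, bias = model
    return classes[np.argmax(X @ weights + bias, axis=1)]

def four_fold_accuracy(X, HF, y, n_folds=4, seed=0):
    """Cross-validated accuracy with HF suppression fit separately on the
    training and test trials of each fold, as described in the text."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        X_train = hf_suppress(X[train_idx], HF[train_idx])
        X_test = hf_suppress(X[test_idx], HF[test_idx])
        model = fit_rlda(X_train, y[train_idx])
        correct += np.sum(predict_rlda(model, X_test) == y[test_idx])
    return correct / len(y)
```

Here `X` holds one row per trial and one column per band-channel feature, and `HF` holds, in each column, the HF amplitude of the channel matching that feature. With 228 trials per participant, each of the four folds contains 57 test trials.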
The average task performance (correct bin placement) was 99.67±1.56% for n = 0; 98.17±4.51% for n = 1; and 95.83±7.49% for n = 2. Fourteen participants scored above 80% on all runs; the remaining participant scored above 70% on all runs. Twelve of the participants scored above 90% on all runs.
Figure 4 (left) shows the relative frequency of the normalized hand-controller position (computed as the signed resultant of the horizontal x-y hand-controller coordinates) across participants for the three workload conditions. It is observed that the distribution of controller positions is consistent across conditions. The rightmost peak is larger due to the fact that all participants were right-handed and the base position is right-of-center. Figure 4 (right) shows a boxplot of the r-squared correlation of the workload level with the normalized hand-controller position across participants, indicating that the movements exhibit minimal bias with respect to the workload level.
Figure 4. (Left) The relative frequency of the normalized hand-controller position (computed as the signed resultant of the horizontal x-y hand-controller coordinates) across participants for the three workload conditions. Positive and negative values indicate positions to the right and left of center, respectively. (Right) The r-squared correlation of the workload level with the normalized hand-controller position across participants.
The results of the Spearman correlation analysis for each band and bipolar channel are shown in Figure 5, where the HF suppression is indicated with asterisks (*). Note that the HF band exhibits correlations with workload level in the same general range as the traditional EEG bands for most channels, and that the magnitude of the correlations in each band across participants is somewhat inconsistent in the frontal channels and more consistent in the posterior channels. It can also be observed that the magnitude of the correlations generally drops by varying degrees in each frequency band after HF suppression.
Figure 5. Box plots across participants of the Spearman correlation between the spectral amplitude and workload level for each frequency band and bipolar channel, arranged topographically by channel. The title of each subplot indicates the polarity of the bipolar channel. The asterisks (*) indicate the result after HF suppression.
The differences in average spectral amplitude across conditions for selected participants and channels are shown in Figure 6. While there are clear broadband differences across workload levels for particular channels and some common activity across subsets of participants (i.e., participants H and L in Figure 6), it should be noted that neither the channels nor the relative spectral amplitude modulation across workload levels (i.e., participants A and H/L in Figure 6) appear consistent across participants.
Figure 6. Selected log-amplitude spectra from six different participants across workload levels. The top row represents frontal channels and the bottom row represents posterior channels.
To assess the most discriminable univariate features across participants, a two-sided Wilcoxon rank sum test was used to determine the percentage of participants with statistically-significant differences in spectral amplitude between the extreme workload levels of n = 0 and n = 2 for each feature. The results shown in Figure 7 were Bonferroni corrected to a significance level of 6.94 × 10−4 [0.05/(9 frequency bands × 8 channels)].
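The corrected significance level follows directly from the number of comparisons, as the short calculation below shows. Reading the "9 frequency bands" as the five raw bands plus the four HF-suppressed bands is an assumption based on the features shown in Figure 7.

```python
n_raw_bands = 5          # theta, alpha, beta, gamma, HF
n_suppressed_bands = 4   # theta*, alpha*, beta*, gamma* (assumed interpretation)
n_channels = 8           # bipolar channels

n_tests = (n_raw_bands + n_suppressed_bands) * n_channels
alpha_corrected = 0.05 / n_tests

print(n_tests, f"{alpha_corrected:.2e}")  # prints: 72 6.94e-04
```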
Figure 7. The percentage of participants with statistically significant differences (after Bonferroni correction) in spectral amplitude between n = 0 and n = 2 for each band and bipolar channel. The asterisks (*) indicate the result after HF suppression.
The HF band at F3-Fz was significant for 80% of the participants. Otherwise, the most consistent features for roughly 73% of the participants were β and γ at F3-Fz and γ at Pz-P4, all before HF suppression. After HF suppression, multiple features have roughly 50% prevalence including frontal/central θ, β at Fz-F4, and γ at Fz-F4/F3-C3.
Figure 8 shows the four-fold classification accuracy for each frequency band using all bipolar channels for n = 0 vs. 2 and n = 0 vs. 1 vs. 2. Similar to the correlation analysis, the HF band achieves a comparatively high classification accuracy, and the performance generally drops for each band after HF suppression. However, the inter-quartile range of each comparison is well above the random-chance level, even after HF suppression.
Figure 8. Box plots of the four-fold classification accuracy across participants for each combination of workload levels. For each frequency band, all bipolar channels were included as features for the classifier. The blue horizontal lines indicate the chance level of classification accuracy. The asterisks (*) indicate the result after HF suppression.
Figure 9 shows the four-fold classification accuracy using all bipolar channels and various frequency ranges as features for the classifier. To further indicate the significance of the classification results, permutation tests were performed by randomizing the class labels for each scenario, performing the classification procedure, and repeating 100 times. Since the results were nearly identical for each feature combination for a given workload-level comparison, the random permutation results for the θ:HF condition are included in the figure, labeled as “rand.” Similar to the results shown in Figure 8, HF suppression decreased performance for all conditions, and the inter-quartile range of each condition is above the inter-quartile range of the randomization test. For n = 0 vs. 1 vs. 2, all observations are above the inter-quartile range of the randomization test. Figure 10 shows the average accuracy trends across the workload level comparisons for different frequency-band combinations.
Figure 9. Box plots of the four-fold classification accuracy across participants for each combination of workload levels. The horizontal axis indicates the range of frequency bands included in the classifier. The blue horizontal lines indicate the random-chance level of classification accuracy. The “rand” label indicates the classification results from randomly permuting the labels for the θ:HF features, which produces nearly identical results for all feature combinations tested. The asterisks (*) indicate the result after HF suppression.
Figure 10. The mean accuracy of the various workload level comparisons for different frequency-band combinations. The asterisks (*) indicate the result after HF suppression.
To further examine the spatial contributions of the multi-band classification, Figure 11 shows the four-fold classification accuracy for the workload level extremes of n = 0 vs. n = 2, arranged by channel. For each channel, the traditional low-frequency EEG bands (θ to β), and HF for comparison, were included as features for the classifier. The inter-quartile ranges are generally higher in the frontal channels compared to the parietal channels. The effects of the HF band are most prominent on the left frontal channel (F3-Fz) and the posterior channels (P3-Pz and Pz-P4).
Figure 11. Box plots of the four-fold classification accuracy across participants for n = 0 vs. n = 2, arranged by channel. The horizontal axis indicates the range of frequency bands included in the classifier. The blue horizontal lines indicate the random-chance level of classification accuracy. The asterisks (*) indicate the result after HF suppression.
The results of this study demonstrate that it is possible to discriminate several mental workload levels using electrical activity recorded from the scalp during an interactive VR task. Figure 7 indicates that the most consistent features across participants are frontal β and γ and parietal γ prior to HF suppression. However, after HF suppression, frontal β and γ are the most consistent features across participants, though appreciably less consistent compared to the un-corrected features.
Further examining the univariate features in Figure 5, the parietal activity above the α band is generally positively correlated with workload level. Otherwise, there is high variability across participants, which is consistent with that reported in Brouwer et al. (2012). However, the classical fronto-parietal α/θ activations (Sauseng et al., 2005, 2010; Brouwer et al., 2012) were not consistently observed across participants in the present study. Additionally, β/γ were more prominent in the present study compared to prior findings. These differences may be due to the fact that the present study used an interactive, immersive VR design that incorporated stereotyped movements and rich visual input compared to prior related studies. For example, it has been reported that VR generated more β/γ activity compared to the real-world medium in hand illusion experiments (Škola and Liarokapis, 2016). Another study that analyzed EEG collected during an interactive VR stepping game found statistically-significant hemispherical differences in the α, β, γ bands (de Oliveira et al., 2018).
The HF band exhibits comparable correlations to the lower frequencies in nearly all channels. The positive central/parietal correlations are consistent with broadband EMG due to subtle scalp muscle tension (Goncharova et al., 2003; Fu et al., 2006; Janani et al., 2017). However, Figures 5, 6 suggest that the high-frequency activity in the frontal channels is not always positively correlated with task difficulty, which is not indicative of consistent task-related muscle tension and may be due to arbitrary, spontaneous muscle activity, possibly linked to the head-mounted display.
The single frequency-band classification accuracy in Figure 8 shows that most frequency bands exhibit roughly similar performance ranges before and after HF suppression, respectively. As expected, this can generally be extended to the combined-band results in Figure 9, with the combined-band results generally exhibiting higher overall accuracy ranges. This suggests that spectral bands contain some degree of complementary information for classification. As shown in Brouwer et al. (2012), Figure 10 reaffirms that the extreme workload levels (i.e., 0 vs. 2) are more clearly discriminable than the first-degree levels of 0 vs. 1 and 1 vs. 2, with 0 vs. 1 being the least discriminable.
Figure 11 shows that the frontal channels tend to produce higher classification accuracies. When assessing the spatial (Figure 11) and spectral (Figure 8) contributions in isolation it is noted that, in general, there are not drastic differences in the performances across bands or channels. However, by comparing to the combined-band results of Figure 9, information from multiple frequencies and channels is crucial for maximizing performance.
This result may be a function of unique and complementary information in the various channels and frequency bands, but it may also be an indication of individual differences as suggested in Figures 5, 6, and in Brouwer et al. (2012). While Figure 7 appears to indicate prevalent features across participants, in a 15-fold transfer learning protocol (training on all combinations of 14 participants and testing on the remaining participant), the training error was high and the cross-validation results were only slightly above random chance, further supporting the notion of individual differences across participants. The individual differences could be due to varying memory span, cognitive ability (Gonzalez, 2005), or arousal (Matthews and Davies, 2001); vigilance decrement (Mackworth, 1968; Warm et al., 2008); or fatigue effects over the duration of the task.
Further analysis indicated that a straightforward 5 Hz highpass filter is effective at suppressing low-frequency artifacts due to the deliberate, stereotyped gross movements required for the task. However, the elimination of EMG artifacts due to more subtle scalp tension remains a significant challenge (Goncharova et al., 2003; Fu et al., 2006; Muthukumaraswamy, 2013; Yilmaz et al., 2014, 2018; Janani et al., 2017). Because of the overlapping frequency ranges, it is effectively impossible to definitively isolate EMG and EEG activity without applying a neuromuscular blockade (Whitham et al., 2008). Furthermore, studies suggest that any degree of cognitive workload will create subtle, correlated head and neck muscle tension that further confounds such analysis (Laursen et al., 2002; Krantz et al., 2004; Leyman et al., 2004; Whitham et al., 2008; Roman-Liu et al., 2013).
The present approach uses linear regression to remove the correlated activity of a high-frequency band from the low-frequency EEG bands. While this may generate a reasonable approximation of EMG activity, there are several issues with this simplistic approach. Firstly, this approach assumes that there is a linear relationship between the dynamics of EMG contamination across frequency bands (Kim et al., 2017). If this relationship is not linear, then residual EMG artifact will be present in the EMG-suppressed signal. Secondly, assuming the high-frequency EMG activity is highly correlated with the task, this regression approach may be suppressing genuine EEG relationships with the task (Mühl et al., 2014). Overall, these two contrasting effects may cancel to a degree and result in a reasonable estimate of the lower-frequency EEG activity.
In summary, this analysis demonstrates that cognitive workload during an interactive VR task can be estimated via scalp recordings. Using the traditional low-frequency EEG bands (θ−β), average workload classification accuracies across participants reached 81.1% (chance 50%) for 0 vs. 2 and 63.9% (chance 33.3%) for 0 vs. 1 vs. 2. By comparison, classification accuracies of 73.6% and 60.6%, respectively, can be achieved using the same bands after HF suppression. The recordings appear robust to the head-mounted setup and gross movements. However, the results suggest that the cognitive workload task generates individual differences in brain activity, which likely require the development of subject-specific models. Furthermore, there are likely contributions of both EEG and scalp muscle tension-related EMG to cognitive workload classification. Ultimately, for practical cognitive workload discrimination, it may not be necessary to isolate EEG from EMG if the result is effective. However, in this case, care must be taken such that EMG does not become consciously or subconsciously conditioned to be predominant over EEG for manipulating the task outcome in closed-loop scenarios, as EEG (or other measures of brain activity) represents the intrinsic biomarker of cognitive workload.
Data Availability Statement
The experimental software used for this study is available on the Open Science Framework (https://osf.io/yhtz8/). The data generated for this study are available upon request to the corresponding author.
Ethics Statement
This study was carried out in accordance with the recommendations of the Institutional Review Board of Old Dominion University. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Institutional Review Board of Old Dominion University.
Author Contributions
CH, KR, YY, and DK developed the experimental design and protocol. CT and KR configured the experimental setup. CT and TS collected the experimental data. CT and DK performed the data analysis and wrote the original draft of the manuscript. All authors contributed to reviewing and editing the manuscript.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
The authors would like to acknowledge Hector Garcia, John Shull, Saikou Diallo, Fernando Sobreira, and Srdjan Lesaja for assisting with the development, configuration, and testing of the virtual reality environment used in this study. The authors would also like to acknowledge Michael Ambinder and Valve Inc. for providing VR equipment and valuable feedback on the task design.
3. Boxplots were generated using the ‘boxplot’ function in MATLAB, where the central mark indicates the median, and the bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points not considered outliers, and outliers are indicated with the “+” symbol.
References
Arad, E., Bartsch, R. P., Kantelhardt, J. W., and Plotnik, M. (2018). Performance-based approach for movement artifact removal from electroencephalographic data recorded during locomotion. PLoS ONE 13:e0197153. doi: 10.1371/journal.pone.0197153
Ayaz, H., Shewokis, P. A., Bunce, S., Izzetoglu, K., Willems, B., and Onaral, B. (2012). Optical brain monitoring for operator training and mental workload assessment. NeuroImage 59, 36–47. doi: 10.1016/j.neuroimage.2011.06.023
Baldwin, C. L., and Penaranda, B. N. (2012). Adaptive training using an artificial neural network and EEG metrics for within- and cross-task workload classification. NeuroImage 59, 48–56. doi: 10.1016/j.neuroimage.2011.07.047
Berka, C., Levendowski, D. J., Lumicao, M. N., Yau, A., Davis, G., Zivkovic, V. T., et al. (2007). EEG correlates of task engagement and mental workload in vigilance, learning, and memory tasks. Aviat. Space Environ. Med. 78, B231–B244. Available online at: https://www.ingentaconnect.com/content/asma/asem/2007/00000078/a00105s1/art00032
Bisson, E., Contant, B., Sveistrup, H., and Lajoie, Y. (2007). Functional balance and dual-task reaction times in older adults are improved by virtual reality and biofeedback training. Cyberpsychol. Behav. 10, 16–23. doi: 10.1089/cpb.2006.9997
Bos, D. P. O., Reuderink, B., van de Laar, B., Gürkök, H., Mühl, C., Poel, M., et al. (2010). “Brain-computer interfacing and games,” in Brain-Computer Interfaces. Human-Computer Interaction Series, eds D. S. Tan and A. Nijholt (London: Springer), 149–178. Available online at: https://cci.drexel.edu/faculty/esolovey/courses/CS680-S14/papers/Plass-Oude-BCIgames.pdf
Brouwer, A.-M., Hogervorst, M. A., van Erp, J. B. F., Heffelaar, T., Zimmerman, P. H., and Oostenveld, R. (2012). Estimating workload using EEG spectral power and ERPs in the n-back task. J. Neural Eng. 9:045008. doi: 10.1088/1741-2560/9/4/045008
Brouwer, A.-M., Zander, T. O., van Erp, J. B. F., Korteling, J. E., and Bronkhorst, A. W. (2015). Using neurophysiological signals that reflect cognitive or affective state: six recommendations to avoid common pitfalls. Front. Neurosci. 9:136. doi: 10.3389/978-2-88919-613-5
Cipresso, P., Giglioli, I. A. C., Raya, M. A., and Riva, G. (2018). The past, present, and future of virtual and augmented reality research: a network and cluster analysis of the literature. Front. Psychol. 9:2086. doi: 10.3389/fpsyg.2018.02086
Daly, I., Scherer, R., Billinger, M., and Müller-Putz, G. (2015). FORCe: fully online and automated artifact removal for brain-computer interfacing. IEEE Trans. Neural Syst. Rehabil. Eng. 23, 725–736. doi: 10.1109/TNSRE.2014.2346621
Doud, A. J., Lucas, J. P., Pisansky, M. T., and He, B. (2011). Continuous three-dimensional control of a virtual helicopter using a motor imagery based brain-computer interface. PLoS ONE 6:e26322. doi: 10.1371/journal.pone.0026322
Ewing, K. C., Fairclough, S. H., and Gilleade, K. (2016). Evaluation of an adaptive game that uses EEG measures validated during the design process as inputs to a biocybernetic loop. Front. Hum. Neurosci. 10:223. doi: 10.3389/fnhum.2016.00223
Fitzgibbon, S. P., Pope, K. J., Mackenzie, L., Clark, C. R., and Willoughby, J. O. (2004). Cognitive tasks augment gamma EEG power. Clin. Neurophysiol. 115, 1802–1809. doi: 10.1016/j.clinph.2004.03.009
Fu, M. J., Daly, J. J., and Çavuşoğlu, M. C. (2006). A detection scheme for frontalis and temporalis muscle EMG contamination of EEG data. Conf. Proc. Annual IEEE Eng. Med. Biol. Soc. 1, 4514–4518. doi: 10.1109/IEMBS.2006.4398455
Gerjets, P., Walter, C., Rosenstiel, W., Bogdan, M., and Zander, T. O. (2014). Cognitive state monitoring and the design of adaptive instruction in digital environments: lessons learned from cognitive workload assessment using a passive brain-computer interface approach. Front. Neurosci. 8:385. doi: 10.3389/fnins.2014.00385
Girouard, A., Solovey, E. T., and Jacob, R. J. (2013). Designing a passive brain computer interface using real time classification of functional near-infrared spectroscopy. Int. J. Auton. Adapt. Commun. Syst. 6, 26–44. doi: 10.1504/IJAACS.2013.050689
Goncharova, I. I., McFarland, D. J., Vaughan, T. M., and Wolpaw, J. R. (2003). EMG contamination of EEG: spectral and topographical characteristics. Clin. Neurophysiol. 114, 1580–1593. doi: 10.1016/S1388-2457(03)00093-2
Grissmann, S., Faller, J., Scharinger, C., Spüler, M., and Gerjets, P. (2017a). Electroencephalography based analysis of working memory load and affective valence in an n-back task with emotional stimuli. Front. Hum. Neurosci. 11:616. doi: 10.3389/fnhum.2017.00616
Grissmann, S., Spüler, M., Faller, J., Krumpe, T., Zander, T., Kelava, A., et al. (2017b). Context sensitivity of EEG-based workload classification under different affective valence. IEEE Trans. Affect. Comput. 1–1. doi: 10.1109/TAFFC.2017.2775616
Gwin, J. T., Gramann, K., Makeig, S., and Ferris, D. P. (2010). Removal of movement artifact from high-density EEG recorded during walking and running. J. Neurophysiol. 103, 3526–3534. doi: 10.1152/jn.00105.2010
Herff, C., Heger, D., Fortmann, O., Hennrich, J., Putze, F., and Schultz, T. (2013a). Mental workload during n-back task-quantified in the prefrontal cortex using fNIRS. Front. Hum. Neurosci. 7:935. doi: 10.3389/fnhum.2013.00935
Herff, C., Heger, D., Putze, F., Hennrich, J., Fortmann, O., and Schultz, T. (2013b). Classification of mental tasks in the prefrontal cortex using fNIRS. Conf. Proc. IEEE Eng. Med. Biol. Soc. 2013, 2160–2163. doi: 10.1109/EMBC.2013.6609962
Hogervorst, M. A., Brouwer, A.-M., and van Erp, J. B. (2014). Combining and comparing EEG, peripheral physiology and eye-related measures for the assessment of mental workload. Front. Neurosci. 8:322. doi: 10.3389/fnins.2014.00322
Horvat, M., Dobrinić, M., Novosel, M., and Jerčić, P. (2018). “Assessing emotional responses induced in virtual reality using a consumer EEG headset: a preliminary report,” in 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) (Opatija: IEEE). doi: 10.23919/mipro.2018.8400184
Ihme, K., Unni, A., Zhang, M., Rieger, J. W., and Jipp, M. (2018). Recognizing frustration of drivers from face video recordings and brain activation measurements with functional near-infrared spectroscopy. Front. Hum. Neurosci. 12:327. doi: 10.3389/fnhum.2018.00327
Janani, A. S., Grummett, T. S., Lewis, T. W., Fitzgibbon, S. P., Whitham, E. M., DelosAngeles, D., et al. (2017). Evaluation of a minimum-norm based beamforming technique, sLORETA, for reducing tonic muscle contamination of EEG at sensor level. J. Neurosci. Methods 288, 17–28. doi: 10.1016/j.jneumeth.2017.06.011
Kam, J. W. Y., Griffin, S., Shen, A., Patel, S., Hinrichs, H., Heinze, H.-J., et al. (2019). Systematic comparison between a wireless EEG system with dry electrodes and a wired EEG system with wet electrodes. NeuroImage 184, 119–129. doi: 10.1016/j.neuroimage.2018.09.012
Kim, B., Kim, L., Kim, Y.-H., and Yoo, S. K. (2017). Cross-association analysis of EEG and EMG signals according to movement intention state. Cogn. Syst. Res. 44, 1–9. doi: 10.1016/j.cogsys.2017.02.001
Kline, J. E., Huang, H. J., Snyder, K. L., and Ferris, D. P. (2015). Isolating gait-related movement artifacts in electroencephalography during human walking. J. Neural Eng. 12:046022. doi: 10.1088/1741-2560/12/4/046022
Krantz, G., Forsman, M., and Lundberg, U. (2004). Consistency in physiological stress responses and electromyographic activity during induced stress exposure in women and men. Integr. Physiol. Behav. Sci. 39, 105–118. doi: 10.1007/BF02734276
Laursen, B., Jensen, B. R., Garde, A., and Jørgensen, A. H. (2002). Effect of mental and physical demands on muscular activity during the use of a computer mouse and a keyboard. Scand. J. Work Environ. Health 28, 215–221. doi: 10.5271/sjweh.668
Lee, S., Shin, Y., Kumar, A., Kim, M., and Lee, H.-N. (2018). Dry electrode-based fully isolated EEG/fNIRS hybrid brain-monitoring system. IEEE Trans. Bio-med. Eng. 66, 1055–1068. doi: 10.1109/TBME.2018.2866550
Leeb, R., Friedman, D., Müller-Putz, G.-R., Scherer, R., Slater, M., and Pfurtscheller, G. (2007). Self-paced (asynchronous) BCI control of a wheelchair in virtual environments: a case study with a tetraplegic. Comput. Intell. Neurosci. 5, 117–128. doi: 10.1155/2007/79642
Lobel, A., Gotsis, M., Reynolds, E., Annetta, M., Engels, R., and Granic, I. (2016). “Designing and utilizing biofeedback games for emotion regulation: the case of Nevermind,” in CHI EA '16 Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems (San Jose, CA), 1945–1951. doi: 10.1145/2851581.2892521
Lotte, F., Faller, J., Guger, C., Renard, Y., Pfurtscheller, G., Lécuyer, A., et al. (2013). Combining BCI With Virtual Reality: Towards New Applications and Improved BCI. Berlin; Heidelberg: Springer.
Lotte, F., Renard, Y., and Lécuyer, A. (2008). “Self-paced brain-computer interaction with virtual worlds: a quantitative and qualitative study out of the lab,” in 4th International Brain Computer Interface Workshop and Training Course 2008 (Graz).
Lotte, F., Van Langhenhove, A., Lamarche, F., Ernest, T., Renard, Y., Arnaldi, B., et al. (2010). Exploring large virtual environments by thoughts using a brain-computer interface based on motor imagery and high-level commands. Presence 19, 54–70. doi: 10.1162/pres.19.1.54
McMillan, K. M., Laird, A. R., Witt, S. T., and Meyerand, M. E. (2007). Self-paced working memory: validation of verbal variations of the n-back paradigm. Brain Res. 1139, 133–142. doi: 10.1016/j.brainres.2006.12.058
Oliveira, S. M. S., Medeiros, C. S. P., Pacheco, T. B. F., Bessa, N. P. O. S., Silva, F. G. M., Tavares, N. S. A., et al. (2018). Electroencephalographic changes using virtual reality program: technical note. Neurol. Res. 40, 160–165. doi: 10.1080/01616412.2017.1420584
Owen, A., McMillan, K., Laird, A., and Bullmore, E. (2005). N-back working memory paradigm: a meta-analysis of normative functional neuroimaging studies. Hum. Brain Mapp. 25, 46–59. doi: 10.1002/hbm.20131
Pergher, V., Wittevrongel, B., Tournoy, J., Schoenmakers, B., and Van Hulle, M. M. (2018). N-back training and transfer effects revealed by behavioral responses and EEG. Brain Behav. 8:e01136. doi: 10.1002/brb3.1136
Pesonen, M., Hämäläinen, H., and Krause, C. M. (2007). Brain oscillatory 4-30 Hz responses during a visual n-back memory task with varying memory load. Brain Res. 1138, 171–177. doi: 10.1016/j.brainres.2006.12.076
Power, S. D., Falk, T. H., and Chau, T. (2010). Classification of prefrontal activity due to mental arithmetic and music imagery using hidden Markov models and frequency domain near-infrared spectroscopy. J. Neural Eng. 7:026002. doi: 10.1088/1741-2560/7/2/026002
Power, S. D., Kushki, A., and Chau, T. (2012). Intersession consistency of single-trial classification of the prefrontal response to mental arithmetic and the no-control state by NIRS. PLoS ONE 7:e37791. doi: 10.1371/journal.pone.0037791
Roux, F., and Uhlhaas, P. (2014). Working memory and neural oscillations: alpha-gamma versus theta-gamma codes for distinct WM information? Trends Cogn. Sci. 18, 16–25. doi: 10.1016/j.tics.2013.10.010
Royer, A. S., Doud, A. J., Rose, M. L., and He, B. (2010). EEG control of a virtual helicopter in 3-dimensional space using intelligent control strategies. IEEE Trans. Neural Syst. Rehabil. Eng. 18, 581–589. doi: 10.1109/TNSRE.2010.2077654
Ruchkin, D. S., Johnson, R., Canoune, H., and Ritter, W. (1991). Event-related potentials during arithmetic and mental rotation. Electroencephalogr. Clin. Neurophysiol. 79, 473–487. doi: 10.1016/0013-4694(91)90167-3
Ryu, K., and Myung, R. (2005). Evaluation of mental workload with a combined measure based on physiological indices during a dual task of tracking and mental arithmetic. Int. J. Indus. Ergon. 35, 991–1009. doi: 10.1016/j.ergon.2005.04.005
Sauseng, P., Griesmayr, B., Freunberger, R., and Klimesch, W. (2010). Control mechanisms in working memory: a possible function of EEG theta oscillations. Neurosci. Biobehav. Rev. 34, 1015–1022. doi: 10.1016/j.neubiorev.2009.12.006
Sauseng, P., Klimesch, W., Schabus, M., and Doppelmayr, M. (2005). Fronto-parietal EEG coherence in theta and upper alpha reflect central executive functions of working memory. Int. J. Psychophysiol. 57, 97–103. doi: 10.1016/j.ijpsycho.2005.03.018
Schalk, G., McFarland, D. J., Hinterberger, T., Birbaumer, N., and Wolpaw, J. (2004). BCI2000: a general-purpose brain-computer interface (BCI) system. IEEE Trans. Biomed. Eng. 51, 1034–1043. doi: 10.1109/TBME.2004.827072
Scharinger, C., Soutschek, A., Schubert, T., and Gerjets, P. (2017). Comparison of the working memory load in n-back and working memory span tasks by means of EEG frequency band power and P300 amplitude. Front. Hum. Neurosci. 11:6. doi: 10.3389/fnhum.2017.00006
Scherer, R., Lee, F., Schlögl, A., Leeb, R., Bischof, H., and Pfurtscheller, G. (2008). Toward self-paced brain-computer communication: navigation through virtual worlds. IEEE Trans. Biomed. Eng. 55:2. doi: 10.1109/TBME.2007.903709
Schultze-Kraft, M., Dähne, S., Gugler, M., Curio, G., and Blankertz, B. (2016). Unsupervised classification of operator workload from brain signals. J. Neural Eng. 13:036008. doi: 10.1088/1741-2560/13/3/036008
Sharbrough, F., Chatrain, C. E., Lesser, R. P., Luders, H., Nuwer, M., and Picton, T. W. (1991). American Electroencephalographic Society guidelines for standard electrode position nomenclature. J. Clin. Neurophysiol. 8, 200–202. doi: 10.1097/00004691-199104000-00007
Snyder, K. L., Kline, J. E., Huang, H. J., and Ferris, D. P. (2015). Independent component analysis of gait-related movement artifact recorded using EEG electrodes during treadmill walking. Front. Hum. Neurosci. 9:639. doi: 10.3389/fnhum.2015.00639
Tallon-Baudry, C., Bertrand, O., Peronnet, F., and Pernier, J. (1998). Induced γ-band activity during the delay of a visual short-term memory task in humans. J. Neurosci. 18, 4244–4254. doi: 10.1523/JNEUROSCI.18-11-04244.1998
Unni, A., Ihme, K., Jipp, M., and Rieger, J. W. (2017). Assessing the driver's current level of working memory load with high density functional near-infrared spectroscopy: a realistic driving simulator study. Front. Hum. Neurosci. 11:167. doi: 10.3389/fnhum.2017.00167
Velasco-Alvarez, F., Ron-Angevin, R., and Blanca-Mena, M. J. (2010). Free virtual navigation using motor imagery through an asynchronous brain-computer interface. Presence 19, 71–81. doi: 10.1162/pres.19.1.71
Wang, S., Gwizdka, J., and Chaovalitwongse, W. A. (2016). Using wireless EEG signals to assess memory workload in the n-back task. IEEE Trans. Hum. Mach. Syst. 46, 424–435. doi: 10.1109/THMS.2015.2476818
Whitham, E. M., Lewis, T., Pope, K. J., Fitzgibbon, S. P., Clark, C. R., Loveless, S., et al. (2008). Thinking activates EMG in scalp electrical recordings. Clin. Neurophysiol. 119, 1166–1175. doi: 10.1016/j.clinph.2008.01.024
Yilmaz, G., Ungan, P., Sebik, O., Ugincius, P., and Türker, K. S. (2014). Interference of tonic muscle activity on the EEG: a single motor unit study. Front. Hum. Neurosci. 8:504. doi: 10.3389/fnhum.2014.00504
Yilmaz, G., Ungan, P., and Türker, K. S. (2018). EEG-like signals can be synthesized from surface representations of single motor units of facial muscles. Exp. Brain Res. 236, 1007–1017. doi: 10.1007/s00221-018-5194-6
Zander, T. O., Andreessen, L. M., Berg, A., Bleuel, M., Pawlitzki, J., Zawallich, L., et al. (2017). Evaluation of a dry EEG system for application of passive brain-computer interfaces in autonomous driving. Front. Hum. Neurosci. 11:78. doi: 10.3389/fnhum.2017.00078
Zander, T. O., Gaertner, M., Kothe, C., and Vilimek, R. (2010). Combining eye gaze input with a brain–computer interface for touchless human–computer interaction. Int. J. Hum. Comput. Interact. 27, 38–51. doi: 10.1080/10447318.2011.535752
Keywords: cognitive workload, electroencephalogram (EEG), virtual reality, HTC VIVE, n-back task
Citation: Tremmel C, Herff C, Sato T, Rechowicz K, Yamani Y and Krusienski DJ (2019) Estimating Cognitive Workload in an Interactive Virtual Reality Environment Using EEG. Front. Hum. Neurosci. 13:401. doi: 10.3389/fnhum.2019.00401
Received: 13 March 2019; Accepted: 25 October 2019;
Published: 14 November 2019.
Edited by: Björn H. Schott, Leibniz Institute for Neurobiology (LG), Germany
Reviewed by: Camila Rosa De Oliveira, Faculdade Meridional (IMED), Brazil
Edmund Wascher, Leibniz Research Centre for Working Environment and Human Factors (IfADo), Germany
Jochem W. Rieger, University of Oldenburg, Germany
Copyright © 2019 Tremmel, Herff, Sato, Rechowicz, Yamani and Krusienski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Dean J. Krusienski, firstname.lastname@example.org