Optimizing the Face Paradigm of BCI System by Modified Mismatch Negative Paradigm

Many recent studies have focused on improving the performance of event-related potential (ERP) based brain-computer interfaces (BCIs). The use of a face pattern has been shown to obtain high classification accuracies and information transfer rates (ITRs) by evoking discriminative ERPs (N200 and N400) in addition to P300 potentials. Recently, it has been shown that the performance of traditional P300-based BCIs can be improved through a modification of the mismatch pattern. In this paper, a mismatch inverted face pattern (MIF-pattern) was presented to improve the performance of the inverted face pattern (IF-pattern), one of the state-of-the-art patterns used in visual-based BCI systems. Ten subjects participated in this experiment. The results showed that the mismatch inverted face pattern could evoke significantly larger vertex positive potentials (p < 0.05) and N400s (p < 0.05) compared to the inverted face pattern. The classification accuracy (mean accuracy 99.58%) and ITRs (mean bit rate 27.88 bit/min) of the mismatch inverted face pattern were significantly higher than those of the inverted face pattern (p < 0.05).


INTRODUCTION
Brain-computer interfaces (BCIs) are intended to help patients communicate with other people or control external devices through their brain activity (Wolpaw et al., 2000b; He et al., 2013). Patients who suffer from amyotrophic lateral sclerosis (ALS) could be helped by this technology. Scalp electroencephalography (EEG) is convenient in experimental setups and is therefore widely used and studied (Mak et al., 2011). Event-related potential (ERP) based BCI systems are amongst the most commonly used BCIs and are often used for designing speller systems (Hwang et al., 2013; Zhang D. et al., 2013).
The first P300-based speller system used a 6 × 6 stimulus matrix with 36 targets (Farwell and Donchin, 1988). That study showed the potential value of ERP-based BCIs for designing speller systems. However, to overcome the low signal-to-noise ratio of the ERP, it was necessary to base control upon averaged trials, which decreased the information transfer rates (ITRs) of the system (Wolpaw et al., 2002). Therefore, many studies have focused on improving the performance of ERP-based BCIs in practical applications (Sellers et al., 2006). Optimized classifiers were presented to improve the classification accuracy when only a few trials were used for constructing the average ERP. Generic models, using online training methods, were presented to decrease the offline calibration time (Lu et al., 2009; Jin et al., 2013; Tobias et al., 2013). In addition to the mathematical methods, the paradigms used to evoke ERPs and the stimulus patterns were also studied to enhance the difference between target and non-target trials. Martens et al. presented several P300-based BCI systems using different target-to-target intervals (TTIs) to show the refractory and overlap effects in ERPs (Martens et al., 2009). Some work optimized the sequence of stimuli to avoid double flashes, which decreased repetition blindness and increased the classification accuracy and ITRs (Jin et al., 2011b; Townsend et al., 2012). Hong et al. reported that motion onset potentials (the N200) evoked by moving targets could be used to improve the performance of ERP-based BCIs (Hong et al., 2009). A paradigm was designed to evoke both the N200 and the P300 to improve the discrimination of target and non-target trials. Finally, hybrid systems have also been proposed. For example, Long et al. designed a BCI system using P300 and motor imagery for multi-degree control of a wheelchair (Long et al., 2012), and Yin et al. combined P300 and steady-state visually evoked potential (SSVEP) brain signals for a high-performance BCI-based speller system (Yin et al., 2013, 2014, 2015).
Kaufmann et al. first used images of faces as stimuli, and showed that they could be used to obtain high classification accuracies and high ITRs with both healthy and disabled BCI users (Kaufmann et al., 2011, 2013). The facial expression change paradigm has also been demonstrated to evoke discriminative ERPs and to obtain classification accuracies as high as those of the face pattern (Jin et al., 2014b).
It was reported that the inverted face pattern could evoke a large N170 and a vertex positive potential (VPP), and yield better performance than the face pattern (Zhang et al., 2012). In that work, seven volunteers participated in an experiment in which eight targets were presented on screen, and nine patterns were compared: an upright face, inverted face, upright eyeless face, inverted eyeless face, upright eye, inverted eye, upright object, inverted object, and highlight icon. The online ITR of the inverted face pattern was the best among them, and this pattern showed significantly longer N170 and VPP latencies. The N170 and VPP are sensitive to configural processing (Gruss et al., 2012).
It was reported that traditional P300-based BCIs could be improved significantly through a modification of the mismatch paradigm (Jin et al., 2014a, 2015). In that work, ten subjects were paid to participate in the experiment. The character "D" was the deviant stimulus and the character "S" was the standard stimulus. The modified mismatch paradigm could evoke significantly larger N200 and N400 ERPs compared to the traditional P300 paradigm. Based on previous studies, inverted and upright face images with different expressions were used here to design the mismatch inverted face pattern (MIF-pattern). The visual stimulus modality elicits a visual mismatch negativity (vMMN). Consistent with the auditory MMN, the vMMN elicits an N200 (Kimura et al., 2008, 2010a). The vMMN with emotional faces can also elicit an N400 (Bobes et al., 2000).
Face stimuli contain more configural information than character stimuli. A natural question is whether the inverted face pattern could be used in the vMMN pattern. This study explores the possibility of improving the vMMN pattern by combining it with the inverted face pattern. Our hypotheses are that the mismatch inverted face pattern (MIF-pattern) will evoke significantly larger N200 and N400 amplitudes and that this could be used to obtain significantly higher classification accuracies and ITRs compared to the inverted face pattern (IF-pattern). The remainder of this paper is organized as follows. The second section, Methods and Materials, introduces the experiment and the methods used in this study. The third section, Results, shows the performance of the presented paradigm. The fourth section is the Discussion and the fifth section is the Conclusion.

METHODS AND MATERIALS
Participants
Ten healthy participants (7 male and 3 female, aged 23-25, mean 23.6) took part in the study. Four participants did not have any experience with BCIs. All participants were asked to remain relaxed during data acquisition. All subjects were informed about the experiment and signed a written consent form beforehand, and were paid 50 RMB for their participation in each session.
The participants were seated ∼85 cm in front of a computer monitor, which was 30 cm long by 48 cm wide. The display presented to the participants is shown in Figures 1, 2. Twelve items were presented in a 3 × 4 arrangement. The participants' task was to focus their attention on the desired item in the matrix and count the number of times the face appeared directly above the item. Before the experiment began, the participants were given brief training on how to perform the task.

Experiment Setup, Off-, and Online Protocols
EEG signals were recorded with a g.HIamp and a g.EEGcap (Guger Technologies, Graz, Austria), band pass filtered between 0.1 and 60 Hz, notch filtered at 50 Hz and sampled at 512 Hz. Fourteen active electrodes were used, as shown in Figure 3. The right mastoid electrode served as the reference, and the ground electrode was placed on the forehead (FPz).
A female face (actress Freema Agyeman) was selected for use in this study; the face images were obtained from a video used in emotion research. Faces with negative and positive valence were used to evoke ERPs. In the IF-pattern, the inverted face with negative valence (i.e., the deviant stimulus) was presented above one of the 12 items (Figure 1B). The MIF-pattern was similar to the IF-pattern, with one exception. When the inverted face appeared above one of the items, gray faces with positive emotions (the standard stimulus) appeared above the other 11 items (Figure 1C). In other words, the background of the stimulus was the flickering gray faces in the MIF-pattern. The gray face was replaced by an inverted woman's face, which was used as the stimulus in the MIF-pattern (see Figure 2). Several standard stimuli (flickering gray faces) appeared before the deviant stimulus (an inverted woman's face), thereby producing a "visual mismatch." The inter-stimulus interval (ISI) was 100 ms and the stimulus onset asynchrony (SOA) was 300 ms in both patterns; the same timing was used for the flickering background in the MIF-pattern.
It was reported that there should be more than four standard stimuli before the deviant stimulus to evoke clear mismatch negative potentials (Kimura et al., 2006). One of the stimuli is shown in Figure 2. Items 1-6 were in group 1, and the other six stimuli (items 7-12) were in group 2. The stimulus order of items 1-6 was randomized, and the order of items 7-12 was also randomized. The strategy used was to flash the items in group 1 in random order and then flash the items in group 2 in random order in each trial. All 12 items in groups 1 and 2 were flashed once in each trial. In this way, only some of the items in group 1 in the first trial were not preceded by the required number of standard stimuli before the deviant stimulus was presented.
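The grouping strategy above can be illustrated with a short sketch. It is not the authors' implementation: the function names and the random seed are hypothetical, and it only models the flash order, not the display. It shows that, when group 1 is always flashed before group 2, at least six other flashes separate two consecutive flashes of the same item.

```python
import random

def trial_order(rng):
    """Return one trial's flash order: group 1 shuffled, then group 2 shuffled."""
    group1 = list(range(1, 7))    # items 1-6
    group2 = list(range(7, 13))   # items 7-12
    rng.shuffle(group1)
    rng.shuffle(group2)
    return group1 + group2

def min_gap_between_flashes(orders):
    """Smallest number of other flashes between two flashes of the same item."""
    sequence = [item for order in orders for item in order]
    last_seen = {}
    gaps = []
    for position, item in enumerate(sequence):
        if item in last_seen:
            gaps.append(position - last_seen[item] - 1)
        last_seen[item] = position
    return min(gaps)

rng = random.Random(0)                            # hypothetical seed, for repeatability
orders = [trial_order(rng) for _ in range(16)]    # 16 trials, as in one offline trial-block
smallest_gap = min_gap_between_flashes(orders)    # never below 6 with this grouping
```

In the MIF-pattern, each flash of another item shows the standard gray face at the attended location, so under this reading the deviant is preceded by at least six standards from the second trial onward.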
Participants completed two offline phases, one for each pattern (three offline runs per phase), followed by two online runs. Each offline run contained 5 trial-blocks of 16 trials each, and each online run contained 24 trial-blocks (the targets were items 1-12). The number of trials per trial-block was variable in the online phase, as described in Section Adaptive System Settings. Before each trial-block, a green box was briefly displayed to indicate the target item of that trial-block. Participants were given a 2-min break between runs. In the online phase, the patterns were presented in the same order as in the offline phase, and the order was counterbalanced across participants. Participants completed all stages of the experiment within 2 h.

Feature Extraction Procedure
The first 800 ms of EEG data after each stimulus presentation were used to extract features from the 14 EEG channels (shown in Figure 3). The EEG data were filtered into the range 0.1-30 Hz using a third-order Butterworth band-pass filter. After filtering, the data were down-sampled from 512 to 73 Hz by selecting every seventh sample. Thus, the size of the feature vector for one stimulus was 14 × 58 (14 channels by 58 time points).
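The down-sampling step can be sketched as follows. This is an illustration only: the band-pass filtering is omitted (the input is assumed to be already filtered), the constant and function names are hypothetical, and truncating each channel to 58 points is an assumption made here to match the reported feature size, since the text does not say how the epoch boundary was handled.

```python
FS_HZ = 512        # sampling rate reported in the text
EPOCH_S = 0.8      # first 800 ms after stimulus onset
STEP = 7           # keep every seventh sample
N_CHANNELS = 14
N_POINTS = 58      # time points per channel reported in the text

def extract_features(epoch):
    """Flatten a filtered epoch (list of per-channel sample lists) into one feature vector."""
    features = []
    for channel in epoch:
        # Assumption: keep only the first 58 decimated samples of each channel.
        decimated = channel[::STEP][:N_POINTS]
        features.extend(decimated)
    return features

n_samples = round(FS_HZ * EPOCH_S)                 # about 410 samples per channel
# Synthetic epoch: each sample holds its own index so the decimation is easy to inspect.
epoch = [[float(i) for i in range(n_samples)] for _ in range(N_CHANNELS)]
features = extract_features(epoch)                 # 14 x 58 values
effective_rate_hz = FS_HZ / STEP                   # roughly 73 Hz
```

Selecting every seventh sample gives an effective rate of 512/7, which is about 73.1 Hz and is consistent with the 73 Hz quoted above.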

The Information Transfer Rates (ITRs)
The bit rate is the measure of ITR that we used. Both speed and accuracy affect the bit rate (Wolpaw et al., 2000a). The bit rate is defined as
$$B_r = \frac{60}{T \cdot \mathit{AVT}}\left[\log_2 N + P\log_2 P + (1-P)\log_2\frac{1-P}{N-1}\right]$$
where B_r denotes the bit rate (in bit/min), N denotes the number of targets, and P denotes the accuracy. T denotes the time (in seconds) for a trial to complete. AVT denotes the number of trials used to construct the average in each trial-block for each participant. In this study, N is set to 12 and T is set to 3.6 s (12 stimuli with an SOA of 0.3 s).
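As a worked illustration, the sketch below computes the bit rate under the assumption that the definition is the standard one from Wolpaw et al., scaled to bits per minute by the selection time T × AVT. The function names are hypothetical, and the example value of AVT is chosen for illustration rather than taken from Table 1.

```python
import math

def bits_per_selection(n_targets, accuracy):
    """Wolpaw bits per selection: log2 N + P log2 P + (1 - P) log2((1 - P) / (N - 1))."""
    n, p = n_targets, accuracy
    bits = math.log2(n)
    if 0.0 < p < 1.0:
        bits += p * math.log2(p) + (1.0 - p) * math.log2((1.0 - p) / (n - 1))
    elif p == 0.0:
        # The P log2 P term vanishes in the limit P -> 0.
        bits += math.log2(1.0 / (n - 1))
    # For P == 1 both extra terms vanish, leaving log2 N.
    return bits

def bit_rate(n_targets, accuracy, trial_time_s, avg_trials):
    """Bit rate in bit/min when one selection takes trial_time_s * avg_trials seconds."""
    selection_time_s = trial_time_s * avg_trials
    return bits_per_selection(n_targets, accuracy) * 60.0 / selection_time_s

# Illustration: 12 targets, perfect accuracy, T = 3.6 s, two trials per selection.
example_rate = bit_rate(12, 1.0, 3.6, 2)   # about 29.9 bit/min
```

Because the adaptive strategy needs at least two trials per selection, roughly 29.9 bit/min would be the ceiling for this configuration under these assumptions.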

Classification Scheme
The classifier used in this study was a Bayesian Linear Discriminant Analysis (BLDA) classifier, which may be seen as an extension of Fisher's Linear Discriminant Analysis (FLDA; Penney et al., 2001; Hoffmann et al., 2008). The classification rule is defined as
$$m = \beta\left(\beta X X^{T} + \alpha I\right)^{-1} X t, \qquad y = m^{T}\hat{x}$$
where m is the discriminant vector used for the classification, y is the output of the classifier, and \hat{x} is the feature vector to be classified. X denotes the matrix, each column of which contains a feature vector, I is the identity matrix, and t denotes the regression targets. The value of t is set to N/N_1 for class 1 and to −N/N_2 for class −1 (where N_1 and N_2 are the numbers of feature vectors from class 1 and class −1, and N = N_1 + N_2). The two hyperparameters α and β are the inverse variances of the prior distribution and of the noise, respectively. They can be determined with an iterative method. Features acquired from offline data were used to train the classifier and resulted in the classifier model. The data sets used to train the classifier model contained 240 target stimuli and 2640 non-target stimuli per participant.
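The training-set sizes and regression targets quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name is hypothetical, and it assumes that each trial contributes one target flash and eleven non-target flashes.

```python
def regression_targets(n_class1, n_class2):
    """Return (t for class 1, t for class -1) as N/N1 and -N/N2, with N = N1 + N2."""
    n_total = n_class1 + n_class2
    return n_total / n_class1, -n_total / n_class2

# Offline protocol per pattern: 3 runs x 5 trial-blocks x 16 trials, 12 items per trial.
runs, blocks_per_run, trials_per_block, items = 3, 5, 16, 12
n_target = runs * blocks_per_run * trials_per_block   # 240 target stimuli
n_nontarget = n_target * (items - 1)                  # 2640 non-target stimuli

t_target, t_nontarget = regression_targets(n_target, n_nontarget)
# With these class sizes the targets sum to zero over the training set.
target_sum = n_target * t_target + n_nontarget * t_nontarget
```

With these counts the targets are 12 for class 1 and about −1.09 for class −1, so the rarer target class receives the larger weight.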
In the online phase, single trials were classified immediately after the data were completely acquired (800 ms after the onset of the last stimulus in a trial). The 12 classifier outputs, one per stimulus, were summed over trials. The stimulus with the maximum summed output was considered to be the target.

Adaptive System Settings
The number of trials per trial-block was automatically selected during the online runs (Jin et al., 2011a). In the online experiment, the system used the classifier to determine a detected target after each trial. In the first trial, the stimulus whose features obtained the highest classifier score was regarded as the detected target. For subsequent trials, the classifier scores of the stimuli in that trial were averaged with the classifier scores of the corresponding stimuli in the previous trials. The highest averaged classifier score indicated the detected target of that trial. If the detected targets of two successive trials were the same, the corresponding item was regarded as the correct target and shown at the top of the screen. The participant could then move on to select the next character.
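The stopping rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, the scores are made up, and returning None when the supplied trials run out is an assumption, since the text does not state an upper limit on the number of trials.

```python
def adaptive_selection(trial_scores):
    """Pick a target once two successive trials agree on the detected item.

    trial_scores: list of trials, each a list with one classifier score per stimulus.
    Returns (item_index, trials_used); item_index is None if no agreement was reached.
    """
    n_items = len(trial_scores[0])
    running_sum = [0.0] * n_items
    previous = None
    for n_trials, scores in enumerate(trial_scores, start=1):
        running_sum = [total + score for total, score in zip(running_sum, scores)]
        averaged = [total / n_trials for total in running_sum]
        detected = max(range(n_items), key=averaged.__getitem__)
        if detected == previous:
            return detected, n_trials
        previous = detected
    return None, len(trial_scores)

# Three stimuli for brevity; the experiment used 12.
scores = [
    [0.1, 0.9, 0.2],   # trial 1: item 1 has the highest score
    [0.0, 0.1, 1.5],   # trial 2: the running average now favors item 2
    [0.0, 0.2, 0.9],   # trial 3: item 2 again, so the selection stops
]
selected, trials_used = adaptive_selection(scores)
```

Averaging and summing the scores select the same item, because dividing every running sum by the same trial count does not change which one is largest.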

RESULTS
The Online Performance
Table 1 shows the online classification accuracy, bit rate, and average number of trials used per participant.
These data were tested for normality (one-sample Kolmogorov-Smirnov test) and sphericity (Mauchly's test). Since the classification accuracy was not normally distributed, a nonparametric Kendall test was used to test the differences in classification accuracy between the IF- and MIF-patterns. The classification accuracy of the MIF-pattern was significantly higher than that of the IF-pattern (p < 0.05). Paired-samples t-tests were used to test the differences between the MIF- and IF-patterns in terms of bit rate. The ITRs (Wolpaw et al., 2002) of the MIF-pattern were significantly higher than those of the IF-pattern (t = −2.7, p < 0.05).
Figure 4 shows the amplitude of the grand averaged ERPs from target flashes across all participants. The baseline was taken from the 100 ms before each deviant stimulus. The N200 on channel P8 (Czigler et al., 2006; Folstein and Van Petten, 2008; Kimura et al., 2010a; Czigler, 2014), the VPP on channel Cz (Jeffreys, 1989), the P300 on channel Pz (He et al., 2013), and the N400 on channel Cz (Duncan et al., 2009) were selected to analyze the differences in amplitude between the two paradigms. The adaptive mean method (Clayson et al., 2013) was used to measure the peak amplitude of the ERPs. The mean amplitudes from 25 ms pre-peak to 25 ms post-peak were extracted around the most positive or negative peaks between 100 and 300 ms (peak negative, N200), 200 and 350 ms (peak positive, VPP), 301 and 500 ms (peak positive, P300), and 401 and 800 ms (peak negative, N400). The results showed that there were no significant differences between the IF- and MIF-patterns in VPP (t = 0.4269, p > 0.05) and P300 (t = 0.7578, p > 0.05) amplitudes. The amplitudes of the N200 (t = 0.0242, p < 0.05) and N400 (t = 0.0222, p < 0.05) of the MIF-pattern were significantly larger than those recorded during the IF-pattern (see Figure 5).
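The adaptive mean measure can be sketched as follows, assuming a baseline-corrected average ERP for a single channel stored as a list of samples with index 0 at stimulus onset. The function and dictionary names are hypothetical, the 1000 Hz sampling rate in the example is chosen only to make the arithmetic easy to follow, and the ERP is synthetic rather than recorded data.

```python
# Search windows (ms after stimulus onset) and peak polarity, as listed in the text.
WINDOWS = {
    "N200": ((100, 300), "negative"),
    "VPP": ((200, 350), "positive"),
    "P300": ((301, 500), "positive"),
    "N400": ((401, 800), "negative"),
}

def adaptive_mean(erp, fs_hz, window_ms, polarity, half_width_ms=25.0):
    """Mean amplitude from 25 ms before to 25 ms after the peak found in window_ms."""
    start = round(window_ms[0] * fs_hz / 1000.0)
    stop = round(window_ms[1] * fs_hz / 1000.0)
    candidates = range(start, stop + 1)
    if polarity == "negative":
        peak = min(candidates, key=erp.__getitem__)
    else:
        peak = max(candidates, key=erp.__getitem__)
    half = round(half_width_ms * fs_hz / 1000.0)
    low = max(0, peak - half)
    high = min(len(erp) - 1, peak + half)
    segment = erp[low:high + 1]
    return sum(segment) / len(segment)

# Synthetic ERP sampled at 1000 Hz: flat except for a single negative sample at 200 ms.
erp = [0.0] * 801
erp[200] = -10.0
window, polarity = WINDOWS["N200"]
n200_amplitude = adaptive_mean(erp, 1000, window, polarity)
```

Centering the averaging window on each participant's own peak, rather than on a fixed latency, is what makes the measure adaptive to latency differences between participants.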
Figure 6 shows the grand averaged r-squared values of the ERPs and the topographic maps of the N200, VPP, P300, and N400 components.

DISCUSSIONS
In this paper, the mismatch inverted face pattern (MIF-pattern) was presented to improve the performance of BCIs based on the inverted face pattern (IF-pattern). The online results showed that the classification accuracies and ITRs achieved with the MIF-pattern were significantly higher than those achieved with the IF-pattern (p < 0.05). Two of the participants obtained 100% classification accuracy with both patterns; the other participants obtained higher classification accuracies with the MIF-pattern than with the IF-pattern.
The face patterns evoked large N200 and N400 ERPs. A mismatch negativity (MMN) is elicited when a stimulus is incongruent with the sensory memory trace of a standard stimulus (Näätänen, 1992; Kimura et al., 2010b; Kimura, 2012). The visual MMN was elicited in response to a stimulus that was preceded by four or more repetitions of the other stimulus, while it was not elicited in response to a stimulus that was preceded by one, two, or three repetitions of the other stimulus (Kimura et al., 2006). In order to evoke a stable mismatch negative potential, the method used in this study reduced the number of trials with fewer than four standard stimuli before the deviant stimulus (see Section Experiment Setup, Off-, and Online Protocols). Table 1 shows that high classification accuracies could be obtained from these single trials with four or more standard stimuli before the deviant stimulus.
The ERP data showed that a visual MMN was elicited by a stimulus that was incongruent with the sensory memory trace of a standard stimulus, which is consistent with the MMN literature (Näätänen, 1992; Kimura et al., 2010b; Kimura, 2012). It was reported that the MMN will not be elicited without establishing a predictive model of the standard stimulus (Squires et al., 1975). In this paper, the presentations of the standard (upright face) stimulus at the target location established a predictive model of the standard stimulus, and the model was disrupted by presentations of the target (inverted face) stimulus. The N200 amplitude of the MIF-pattern was significantly larger than that of the IF-pattern (p < 0.05). In this experiment, we had no a priori reason to expect significantly larger N400s in the MIF-pattern. However, the N400 recorded during the MIF-pattern was significantly larger than that recorded during the IF-pattern (p < 0.05; see Figure 5). A possible explanation is that the MIF-pattern produced a coherent pattern that provided a clear mismatch of shape, which did not exist in the IF-pattern (Wang et al., 2004; Szűcs et al., 2007).
Zhang et al. designed a BCI based on inverted faces (Zhang et al., 2012). In their study, the averaged online ITR reached 38.7 bits/min, which was higher than that of the MIF-pattern (27.78 bits/min). However, the averaged accuracy achieved with the MIF-pattern (99.58%) was considerably higher than that reported in Zhang's work (88.7%). It is noteworthy that the conditions differed between Zhang's work and this work. For example, the SOA in our study was 300 ms, but it was 180 ms in Zhang's work. Furthermore, more targets were used in this study (12 targets) than in Zhang's work (8 targets). A direct comparison of the two studies is therefore not straightforward. The vMMN BCI of Jin et al. (2015) was similar to the present work. The online accuracy in that work was 97.5% and the ITR was 25.1 bit/min. Thus, both the accuracy and the ITR obtained in the present work were higher than those of that system.

CONCLUSIONS
In this paper, a mismatch inverted face (MIF) pattern was presented and evaluated to assess its effect on the performance of ERP-based BCIs. The results show that the MIF-pattern yielded better performance than the inverted face (IF) pattern in terms of classification accuracy and ITRs. This work also has potential applications. For example, when patients suffering from ALS reach an advanced stage, they lose the ability to move or speak. A BCI speller would help them "talk" to others. In future work, we will therefore verify our approach with patients and add more targets to build a speller system.
FIGURE 6 | The r-squared values averaged over participants using offline data.

AUTHOR CONTRIBUTIONS
SZ did most of the work on the experiment and the manuscript. JJ proposed the idea of this paper. ID helped to improve the paper. XW helped to improve the experiment. AC contributed to the discussion.

FUNDING
This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61573142, 61203127, 91420302, and 61305028. This work was also supported by the Fundamental Research Funds for the Central Universities (WG1414005, WH1314023, and WH1516018).