Classification for Single-Trial N170 During Responding to Facial Picture With Emotion

Whether the N170, an event-related potential (ERP) related to facial recognition, is modulated by emotion has long been a controversial issue. Some researchers considered the N170 to be independent of emotion, whereas a recent study has shown the opposite. In the current study, electroencephalogram (EEG) recordings acquired while subjects responded to facial pictures with emotion were utilized to investigate whether the N170 was modulated by emotion. We found a significant difference between ERP trials with positive and negative emotions at around 170 ms at the occipitotemporal electrodes (i.e., the N170). We then proposed the single-trial N170 as a feature for the classification of facial emotion, which avoids the loss of trial-to-trial variation that occurs when ERPs are obtained by averaging across trials. To find an optimal classifier for emotional classification with the single-trial N170 as a feature, three types of classifiers, namely, linear discriminant analysis (LDA), L1-regularized logistic regression (L1LR), and support vector machine with radial basis function kernel (RBF-SVM), were compared. The results showed that the single-trial N170 could be used as a classification feature to successfully distinguish positive emotion from negative emotion. L1LR classifiers showed good generalization, whereas LDA showed relatively poor generalization. Moreover, compared with L1LR, the RBF-SVM required more time to optimize its parameters during classification, which is an obstacle to applying it in online brain-computer interface (BCI) systems. The findings suggested that the face-related N170 could be affected by facial expression and that the single-trial N170 could serve as a biomarker for monitoring the emotional states of subjects in the BCI domain.


INTRODUCTION
Emotion is considered to be a subjective perception or feeling toward the internal and external environment, along with some kind of physiological response (Khushaba et al., 2013). Individuals might suffer from mental disorders, such as anxiety and depression, if they do not correctly handle the mood swings caused by psychological or physical damage over a long period of time (Lindquist and Barrett, 2012). Mental illness has become a serious and persistent problem for society; thus, identifying measures to prevent and treat such disorders is vital. Recently, the rapid development of neuroscience has facilitated improvements in brain-computer interfaces (BCIs), which enable direct communication with external devices, bypassing the usual peripheral neural pathways (Panicker et al., 2011; Mühl et al., 2014). Research in affective neuroscience and emotion classification in BCIs has become of great value to the fields of emotion monitoring and rehabilitation therapy. Moreover, the search for features representing individual emotions or emotional states based on physiological signals has opened prospects for applying BCIs to achieve harmonious interaction between the brain and computers.
Facial expressions, which might embody a person's emotional state, are often used in cognitive experiments to study emotion and to classify emotional expressions. A large amount of research has shown that, compared to other object categories, the human face elicits a larger negative waveform at a latency of about 170 ms, termed the N170 (Bentin et al., 1996). According to previous studies, the N170 was most likely generated at occipito-temporal areas, and it might be involved in face perceptual processing (Song et al., 2017). Interestingly, several studies found that the N170 component might be affected by facial emotion (Caharel et al., 2005; Liu et al., 2013; Rellecke et al., 2013). Therefore, using N170 components for emotional classification might be valuable for the field of emotional BCIs.
Previous studies showed that, compared to BCIs based on spontaneous brain activity such as motor imagery (Pfurtscheller and Neuper, 2001; Toshiro et al., 2012), BCIs based on evoked brain activity such as event-related potentials (ERPs) had a higher signal-to-noise ratio and faster mental state recognition (Meng et al., 2008; Jin et al., 2011). For example, the P300, an electrophysiological response to a novel internal or external stimulus, was a typical positive peak around 300 ms after the onset of an infrequent target stimulus in the oddball paradigm (Fazel-Rezai et al., 2012). Similarly, the motion-specific N200 ERPs produced a negative peak around 200 ms after stimulus onset, and they were also introduced to an asynchronous BCI speller (Hong et al., 2009; Zhang D. et al., 2012). Compared with the N200 and P300 ERPs usually used in the field of BCIs, the N170 ERP occurs earlier in the time course of processing, which might contribute to real-time BCI research. However, most previous studies used the N170 as a feature for facial classification (faces vs. non-faces) rather than for pattern classification in relation to emotional recognition (Zhang Y. et al., 2012; Cai et al., 2013).
Moreover, an increasing number of features extracted from electroencephalogram (EEG) signals, such as EEG time-frequency features (Chanel et al., 2009), boosting encoded dynamic features (Yang et al., 2009), recoursing energy efficiency (REE), and root mean square (RMS) (Murugappan et al., 2008), were used to classify emotion. However, the classification features used in these studies were generally based on trial-averaged ERPs rather than single-trial ERPs, so trial-to-trial variation might be neglected. Therefore, we proposed using single-trial N170 ERPs elicited by facial pictures with different emotions for emotional classification, and we compared linear and nonlinear classifiers in terms of generalization performance, including classification accuracy, sensitivity, and computational time. A comparative study was performed using three types of classifiers, namely, linear discriminant analysis (LDA), L1-regularized logistic regression (L1LR), and support vector machine with radial basis function kernel (RBF-SVM). LDA represented a classical linear classifier, L1LR represented a special case of a generalized linear model, and RBF-SVM represented a classical nonlinear classifier. Moreover, the three types of classifiers were easy to implement. Therefore, the three classifiers were utilized to discriminate between positive and negative emotions in the current study.
We assumed that the single-trial N170 ERP could be used as a feature to successfully classify positive and negative emotions, and we intended to find an optimal classifier for emotional classification. These findings might provide a meaningful reference for the development of emotional classification and emotion regulation in BCIs.

Ethics Statement
Subjects signed informed consent prior to the study and received monetary compensation after the experiments. All experiments were approved by the ethical committee of Chongqing University of Posts and Telecommunications. All experimental procedures were conducted in accordance with the ethical guidelines determined by the National Ministry of Health, Labor and Welfare and the Declaration of Helsinki (BMJ 1991;302:1194).

Subjects
Twenty healthy, right-handed subjects (male: 10; female: 10; mean age: 21 years) from the EEG Laboratory of Chongqing University of Posts and Telecommunications participated in the experiment. None of the subjects had cognitive impairments or mental or neurological disorders. All subjects had normal or corrected-to-normal vision. The grade point average (GPA) of the subjects was shown in Appendix 1 (Supplementary Material).
The experiment consisted of four blocks of 120 trials each (480 trials in total, with 160 trials × 3 emotions). A pseudorandom approach was adopted to prevent stimulus repetition. Subjects were required to maintain central fixation and minimize eye blinks and body motion throughout the recordings. Stimuli were presented and behavioral data were recorded using E-prime software (http://www.pstnet.com/eprime.cfm).
[...] and jealous, and neutral emotion was expressionless. Emotion valence and arousal of the facial stimuli were shown in Appendix 1 (Supplementary Material). Figure 1B illustrated an example of the stimulus sequence with emotional pictures. A green fixation cross (0.5° × 0.5°; at the center of the monitor) was displayed throughout the entire block of trials, and the visual angle of the photos was 4° × 4°. Each trial started with the fixation cross flashing for 500 ms. Following that, one of three expression pictures (i.e., positive, neutral, or negative) was presented for 500 ms, and subjects were asked to discriminate the expression with a button press using their right hand: key "1" with the index finger for a positive face, key "2" with the middle finger for a neutral face, and key "3" with the ring finger for a negative face.
If subjects did not respond within the maximum allowable response time (1,200 ms), the next stimulus sequence was presented. Response accuracy and speed were equally emphasized in the instructions. All 480 pictures were randomly presented in a mixed design, with four blocks of 120 stimuli each. For each subject, there were 160 trials for each emotion (i.e., positive: 160 trials, neutral: 160 trials, and negative: 160 trials).

Behavioral Analysis
Subjects' response accuracy (ACC) and reaction time (RT) were recorded and analyzed using a one-way analysis of variance (ANOVA), with emotion (positive, neutral, and negative) as the within-subjects factor. For each subject, incorrect responses and responses with RTs outside the mean ± 2 SD for each emotion were removed from the RT analysis (Liu et al., 2013).
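The RT trimming rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and it assumes that the mean and SD are computed over correct trials only and that the sample SD is used, neither of which is stated in the text.

```python
from statistics import mean, stdev

def trim_reaction_times(rts, correct, n_sd=2.0):
    """Keep RTs from correct trials that lie within mean ± n_sd * SD.

    Assumption: mean and (sample) SD are computed over correct trials only.
    """
    kept = [rt for rt, ok in zip(rts, correct) if ok]
    if len(kept) < 2:
        return kept
    m, sd = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]

# Toy RTs (ms) for one subject and one emotion; the last trial was answered incorrectly.
rts = [520, 540, 510, 530, 1150, 525, 515, 535, 505, 545]
correct = [True] * 9 + [False]
trimmed = trim_reaction_times(rts, correct)
```

In this toy example the 1,150 ms response is discarded as an outlier and the incorrect trial is dropped before trimming.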

EEG Recording and Processing
The EEG was recorded with a 64-channel NeuroScan system (Quik-Cap, band pass: 0.05-100 Hz, sampling rate: 250 Hz, impedances <5 kΩ) with a vertex reference. To monitor ocular movements and eye blinks, electrooculogram (EOG) signals were simultaneously recorded from four surface electrodes, with one pair placed over the upper and lower eyelid and the other pair placed 1 cm lateral to the outer corner of the left and right orbit.
The data were re-referenced using the reference electrode standardization technique (REST) (Yao, 2001; Tian and Yao, 2013), which approximates a reference at infinity (zero reference). The EEG was segmented from 200 ms before stimulus onset to 280 ms after stimulus onset. Electromyography (EMG) and EOG artifacts were removed by blind source separation (BSS) (Negro et al., 2016), and other noise was removed by automatic artifact rejection (±100 µV). The data were baseline corrected using the 200 ms before stimulus onset, and the EEG recordings were then filtered with a band-pass of 0.5-45 Hz.
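Two of the preprocessing steps above, baseline correction and amplitude-based rejection, can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the 50-sample baseline follows from the 200 ms pre-stimulus interval at the reported 250 Hz sampling rate, and REST re-referencing, BSS, and filtering are not shown.

```python
FS_HZ = 250                        # sampling rate reported for the recording
N_BASELINE = FS_HZ * 200 // 1000   # 200 ms pre-stimulus interval -> 50 samples

def baseline_correct(channel, n_baseline=N_BASELINE):
    """Subtract the mean of the pre-stimulus samples from one channel of one epoch."""
    base = sum(channel[:n_baseline]) / n_baseline
    return [v - base for v in channel]

def exceeds_threshold(epoch, threshold_uv=100.0):
    """True if any sample on any channel of the epoch lies outside ±threshold_uv."""
    return any(abs(v) > threshold_uv for channel in epoch for v in channel)

# Toy epochs with two channels (microvolts): 50 baseline + 70 post-stimulus samples.
clean = [[2.0] * 50 + [5.0] * 70, [1.0] * 120]
noisy = [[2.0] * 50 + [150.0] * 70, [1.0] * 120]
corrected = [baseline_correct(ch) for ch in clean]
kept = [ep for ep in (clean, noisy) if not exceeds_threshold(ep)]
```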

Feature Extraction
In the current study, single-trial N170 ERPs were chosen as classification features to classify positive and negative emotions. According to previous studies, the N170 was elicited at the occipitotemporal electrodes (Bentin et al., 1996; Cai et al., 2013; Song et al., 2017). Therefore, our focus was on twelve electrodes in occipitotemporal areas: P3, P4, P5, P6, P7, P8, PO3, PO4, PO5, PO6, PO7, and PO8 (Smith et al., 2012). Among these twelve electrodes, a paired t-test was performed to select the five channels showing the most pronounced differences between positive and negative emotional N170 trials for the subsequent classification, which also reduced the feature dimensionality. For each subject, the features were normalized by scaling between 0 and 1 to reduce individual differences (Lin et al., 2008; Chanel et al., 2009).
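The channel selection and normalization steps can be sketched as below. This is a hedged illustration rather than the authors' pipeline: the function names are hypothetical, and the text does not specify which observations are paired in the t-test, so the sketch simply assumes two equal-length lists of paired N170 amplitudes per channel.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t statistic for two equal-length lists of paired observations."""
    d = [x - y for x, y in zip(a, b)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

def top_channels(pos, neg, k=5):
    """Rank channels by |t| between positive and negative amplitudes; keep the top k.

    pos and neg map a channel name to a list of paired amplitudes (assumed pairing).
    """
    scores = {ch: abs(paired_t(pos[ch], neg[ch])) for ch in pos}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def min_max_scale(values):
    """Scale one subject's feature values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Toy amplitudes for two channels; only P7 shows a consistent difference.
pos = {"P7": [1.0, 2.0, 3.0, 4.0], "P8": [1.0, 2.0, 3.0, 4.0]}
neg = {"P7": [0.0, 0.0, 1.0, 1.0], "P8": [1.1, 1.9, 3.2, 3.9]}
best = top_channels(pos, neg, k=1)
scaled = min_max_scale([-4.0, -2.0, 0.0])
```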

Classification
Three types of classifiers were utilized to classify the positive-related and negative-related N170 trials: namely, LDA, L1LR, and RBF-SVM. Moreover, the positive-related N170 ERPs and the negative-related N170 ERPs were defined as positive and negative in the classification process, respectively.

Linear Discriminant Analysis (LDA)
A classical linear classifier (LDA) was proposed by Fisher for binary classification and was, therefore, called Fisher's linear discriminant. LDA was widely used in face recognition and machine learning to find a linear projection of features that characterized or separated two or more classes of objects or events (Sharma and Paliwal, 2015). The labeled training data were projected so as to maximize the distance between the two classes' means while minimizing the within-class variance (Müller et al., 2003).
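To make the projection idea concrete, the following is a minimal two-feature sketch of Fisher's discriminant, in which the projection direction is the inverse within-class scatter matrix applied to the difference of the class means. The function names, the restriction to two features, and the midpoint threshold are assumptions made for illustration; the study used five channel features.

```python
def mean_vec(X):
    """Mean of a list of 2-D points."""
    n = len(X)
    return [sum(x[j] for x in X) / n for j in range(2)]

def scatter(X, m):
    """2x2 scatter matrix of the points in X around the mean m."""
    S = [[0.0, 0.0], [0.0, 0.0]]
    for x in X:
        d = [x[0] - m[0], x[1] - m[1]]
        for i in range(2):
            for j in range(2):
                S[i][j] += d[i] * d[j]
    return S

def fisher_lda(X_pos, X_neg):
    """Return the projection vector w = Sw^-1 (m_pos - m_neg) and a midpoint threshold c."""
    m_p, m_n = mean_vec(X_pos), mean_vec(X_neg)
    S_p, S_n = scatter(X_pos, m_p), scatter(X_neg, m_n)
    Sw = [[S_p[i][j] + S_n[i][j] for j in range(2)] for i in range(2)]
    det = Sw[0][0] * Sw[1][1] - Sw[0][1] * Sw[1][0]
    inv = [[Sw[1][1] / det, -Sw[0][1] / det],
           [-Sw[1][0] / det, Sw[0][0] / det]]
    diff = [m_p[0] - m_n[0], m_p[1] - m_n[1]]
    w = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    c = 0.5 * sum(w[i] * (m_p[i] + m_n[i]) for i in range(2))
    return w, c

def predict(x, w, c):
    """Label +1 if the projection of x lies above the threshold, else -1."""
    return 1 if w[0] * x[0] + w[1] * x[1] > c else -1

# Toy, linearly separable features scaled to [0, 1].
X_pos = [[0.8, 0.7], [0.9, 0.6], [0.7, 0.9], [0.6, 0.8]]
X_neg = [[0.2, 0.3], [0.1, 0.4], [0.3, 0.1], [0.4, 0.2]]
w, c = fisher_lda(X_pos, X_neg)
```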

L1-Regularized Logistic Regression (L1LR)
Logistic regression (LR), as a special case of a generalized linear model (Cook and Weisberg, 2008), was considered to be an alternative to LDA. Logistic regression, a crucial method for statistical analysis, was widely used in various fields, and it had a good effect on practical applications (Keating and Cherry, 2004;Ayalew and Yamagishi, 2005), especially in pattern classification (Zhu and Hastie, 2004).
However, there were some obvious shortcomings in the traditional LR model, mainly in the following two aspects: the selection of variables and overfitting problems. Most of the model parameters fitted by the LR model were not zero, that is, the model was related to most of the variables; therefore, it was not sparse. In effect, in many practical problems, if the model was not sparse, the computational complexity would increase, meaning that the interpretation of practical problems was more difficult. With regard to overfitting problems (Kim et al., 2007), a LR model for the training data could often get a good fit accuracy, but for the test data, the classification accuracy was not ideal.
Therefore, some researchers proposed the L1LR model to overcome the above problems (Kim et al., 2007; Park and Hastie, 2007). The L1LR problem was

$$\min_{\nu,\, w} \; l_{\mathrm{avg}}(\nu, w) + \lambda \|w\|_{1}$$

where $l_{\mathrm{avg}}$ was the average logistic loss function; $\nu \in \mathbb{R}$ (the intercept) and $w \in \mathbb{R}^{n}$ (the weight vector, with one weight per feature) were the parameters of the logistic model; and $\lambda > 0$ was the regularization parameter, which controlled the trade-off between the average logistic loss and the size of the weight vector, as measured by the L1-norm [refer to Kim et al. (2007) for more detail].
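As an illustration of the quantity being minimized, the sketch below evaluates an L1-regularized logistic objective, that is, the average logistic loss plus λ times the L1-norm of the weights. It assumes labels coded as −1/+1 and the loss form log(1 + exp(−y(wᵀx + ν))); the function name and the toy data are hypothetical, and no solver is shown.

```python
from math import exp, log

def l1lr_objective(w, v, X, y, lam):
    """Average logistic loss plus lam * ||w||_1; labels in y are -1 or +1."""
    m = len(X)
    loss = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * (sum(wj * xj for wj, xj in zip(w, x_i)) + v)
        loss += log(1.0 + exp(-margin))
    penalty = lam * sum(abs(wj) for wj in w)
    return loss / m + penalty

# Toy data: two trials with two features each.
X = [[0.2, 0.9], [0.8, 0.1]]
y = [1, -1]
at_zero = l1lr_objective([0.0, 0.0], 0.0, X, y, lam=0.5)
with_penalty = l1lr_objective([1.0, -1.0], 0.0, X, y, lam=0.5)
without_penalty = l1lr_objective([1.0, -1.0], 0.0, X, y, lam=0.0)
```

With all weights at zero the objective equals log 2 regardless of λ. A larger λ makes nonzero weights more expensive, which is what drives the solution toward sparsity.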

Support Vector Machine (SVM)
Support vector machine was developed by Vapnik based on statistical learning theory (SLT) (Netherlands, 2008). As a result of its excellent generalization performance, SVM has been applied to a wide variety of problems, such as text classification, image classification, handwriting recognition, and gene classification. Furthermore, SVM had the feature of empirical risk minimization (ERM) and global optimum solution (Netherlands, 2008). Using a kernel function, SVM could efficiently perform linear and nonlinear classification by projecting the original features into a high-dimensional feature space, which made the two classes easier to distinguish. In the current study, the SVM classification framework was implemented by using the following equation:

$$f(x) = \mathrm{sign}\left( \sum_{i=1}^{n} \beta_{i}\, y_{i}\, K(x, x_{i}) + b \right)$$

where $f(x)$ was the decision function; $n$ was the number of trials; $\beta_{i} \in \mathbb{R}$ was the Lagrangian multiplier; $y_{i}$ denoted 1 or −1, which was the class label; $b$ was the bias; and $K(x, x_{i})$ denoted the kernel function. In the current study, we chose the radial basis function as the SVM kernel (Brew, 2016).
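The kernel decision function described above can be sketched as follows. This is a hand-rolled illustration, not a trained model: the multipliers, bias, and support vectors are made-up values, the function names are hypothetical, and the RBF kernel is assumed to take the common form exp(−γ‖x − x_i‖²), which the text does not spell out.

```python
from math import exp

def rbf_kernel(x, z, gamma):
    """RBF kernel, assumed here as exp(-gamma * squared Euclidean distance)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def svm_decision(x, support_vectors, labels, betas, b, gamma):
    """Sign of the kernel expansion sum_i beta_i * y_i * K(x, x_i) + b."""
    s = sum(beta * y * rbf_kernel(x, sv, gamma)
            for sv, y, beta in zip(support_vectors, labels, betas)) + b
    return 1 if s >= 0 else -1

# Hypothetical "trained" parameters: one support vector per class.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
labels = [-1, 1]
betas = [1.0, 1.0]
near_pos = svm_decision([0.9, 0.9], support_vectors, labels, betas, b=0.0, gamma=1.0)
near_neg = svm_decision([0.1, 0.1], support_vectors, labels, betas, b=0.0, gamma=1.0)
```

In practice both γ and the regularization constant have to be tuned, typically by a search over candidate values, and this tuning is a large part of the training cost of an RBF-SVM.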

Generalization of Classifier
If a classifier could predict the class of a new sample with good performance, it was considered to clearly reflect the relationship between the feature and the class label. Besides choosing a reliable feature to represent emotion, the selection of an appropriate classifier was also a critical problem in the field of BCIs. In most previous studies, the generalization of a classifier was measured only by classification accuracy (CA), which might not be able to effectively evaluate the generalization of a classifier (Jin and Ling, 2005). Therefore, in the current study, six types of metrics were combined with 10-fold cross-validation to fully evaluate the generalization of the three classifiers. The 10-fold cross-validation procedure was as follows. Firstly, the initial sample was randomly divided into ten subsamples. Secondly, nine of the ten subsamples were used as the training set to fit the classifier, and the remaining one was retained as the test set to evaluate its generalization. This was repeated until each subsample had served as the test set exactly once. Finally, the ten classification results were averaged to obtain the final classification results. The six metrics were defined as follows.
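The 10-fold procedure can be sketched as an index splitter. This is a minimal sketch under assumptions: the function name and the fixed seed are hypothetical, and the sample size of 320 (160 positive plus 160 negative trials) is only a nominal figure that ignores any trials rejected during preprocessing.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is in the test set exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)         # random division into k subsamples
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

splits = list(k_fold_indices(320, k=10))
# A metric would be computed on each test fold and the ten values averaged.
```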

Classification Accuracy (CA)
Classification accuracy was defined as the percentage of samples in the test set that were predicted correctly, and it was calculated by the following equation:

$$\mathrm{CA} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%$$

where true positive (TP) was the number of positive samples correctly predicted; true negative (TN) was the number of negative samples correctly predicted; false positive (FP) denoted the number of negative samples incorrectly predicted as positive; and false negative (FN) denoted the number of positive samples incorrectly predicted as negative.

Sensitivity (SE) and Specificity (SP)
Sensitivity and specificity were calculated by the following formulae, respectively:

$$\mathrm{SE} = \frac{TP}{TP + FN}, \qquad \mathrm{SP} = \frac{TN}{TN + FP}$$

Sensitivity referred to the ratio of correctly classified positive samples to the total population of positive samples, whereas SP was the ratio of correctly classified negative samples to the total population of negative samples.
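As a worked illustration of CA, SE, and SP, the sketch below computes them from confusion counts, using the standard definitions in which FP counts negative samples predicted as positive and FN counts positive samples predicted as negative. The function names and the toy labels are hypothetical; values are returned as fractions rather than percentages.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for labels coded as +1 (positive) and -1 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    return tp, tn, fp, fn

def ca_se_sp(y_true, y_pred):
    """Classification accuracy, sensitivity, and specificity as fractions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    ca = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return ca, se, sp

y_true = [1, 1, 1, 1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, -1, -1, 1, 1]
ca, se, sp = ca_se_sp(y_true, y_pred)   # 5/8, 3/4, 2/4
```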

Area Under the Curve (AUC)
The AUC was defined as the area under the receiver operating characteristic (ROC) curve, and it has been shown to be better than CA for evaluating the predictive performance of classification learning algorithms (Jin and Ling, 2005). Moreover, AUC was a statistically consistent and more discriminating measure than CA (Ling et al., 2003). Originally, only ROC curves were introduced to evaluate machine learning algorithms (Provost et al., 1997). In the ROC curve, the true positive rate was plotted on the Y axis and the false positive rate was plotted on the X axis. The curve described the classifier's performance across the entire range of decision thresholds, independent of class distributions (Provost et al., 1997; Jin and Ling, 2005). However, there was often no clear dominating relation between two ROC curves over the entire range. Therefore, AUC was introduced to provide a good "summary" of the performance of the learning algorithms based on ROC.
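One way to compute the AUC without drawing the ROC curve is through its rank interpretation: the AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. The sketch below uses this equivalence; the function name and scores are hypothetical, and this is not necessarily how the AUC was computed in the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Toy classifier scores (e.g., predicted probability of the positive class).
value = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])   # 8 of 9 pairs ranked correctly
chance = auc([0.5], [0.5])                      # tied scores give 0.5
```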

Kappa
The Kappa coefficient was calculated by the following equation:

$$\mathrm{Kappa} = \frac{P(O) - P(E)}{1 - P(E)}$$

where P(O) denoted the observed probability of agreement, and P(E) was the hypothetical probability of expected agreement by chance. The Kappa coefficient value ranged from −1 to 1. A Kappa value of 1 denoted a perfect classification, −1 meant a completely incorrect classification, and 0 denoted that the performance of a classifier was equal to a random guess (Landis and Koch, 1977; Eugenio and Glass, 2015).
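A short sketch of the Kappa computation from confusion counts follows. It assumes the usual Cohen's kappa convention, in which P(O) is the observed accuracy and P(E) is derived from the marginal frequencies of true and predicted labels; the function name and the counts are hypothetical.

```python
def cohen_kappa(tp, tn, fp, fn):
    """Kappa = (P(O) - P(E)) / (1 - P(E)) for a binary confusion matrix."""
    n = tp + tn + fp + fn
    p_o = (tp + tn) / n
    # Chance agreement from the marginals of the true and predicted labels.
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)

partial = cohen_kappa(tp=3, tn=2, fp=2, fn=1)   # P(O) = 0.625, P(E) = 0.5
perfect = cohen_kappa(tp=4, tn=4, fp=0, fn=0)
```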

Computational Time (CT)
For the application of BCIs, one of the challenges was the real-time online processing of signals, which required the classifier to operate quickly. Therefore, the training and testing time, called the CT in the current study, was recorded as a metric of classification generalization. All of the runtime experiments were conducted on a personal computer (PC) with an Intel(R) Core(TM) i7-3770 CPU @ 3.40 GHz and 8 GB RAM.
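Wall-clock time for a training or testing call can be recorded with a small helper such as the one below. This is only a sketch of one way to measure CT: the helper and the stand-in workload are hypothetical, and the timing tool actually used in the study is not reported.

```python
import time

def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

def toy_train(n):
    # Hypothetical stand-in for classifier training: sums the first n integers.
    return sum(range(n))

model, train_seconds = timed(toy_train, 100_000)
```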

Behavioral Performance
Mean RT was shown in Figure 2A with standard deviation (SD). The results displayed a significant main effect of emotion (F = 6.28, p = 0.003). Post hoc test showed that the RT for negative emotion was clearly faster than the RT for positive emotion (t = 6.06, p < 0.001) and the RT for neutral emotion (t = 6.23, p < 0.001). However, there was no significant difference between the RT for positive emotion and the RT for neutral emotion (t = 0.16, p > 0.05). The ACC was analyzed by using the same statistical model as that used for RT, as shown in Figure 2B.
There was a significant main effect of emotion on ACC (F = 6.28, p < 0.05). Moreover, the post hoc test showed that negative faces were identified more accurately than positive faces (t = 3.11, p < 0.05), whereas there was no significant difference in ACC between neutral and negative faces (t = 1.66, p > 0.05). Furthermore, there was no significant difference in ACC between neutral and positive faces (t = 1.46, p > 0.05).

ERP Analysis
To elaborate the feature extraction window, the ERP waveforms with different emotions were drawn to find the data segments where there were obvious differences between positive and negative emotional N170 ERPs. Positive and negative emotions were tested by the paired t-test. The feature extraction window was located on the data segments with significant differences between the positive and negative trials. As shown in Figure 3, there was a significant difference between positive emotion and negative emotion around 170 ms at all the twelve electrodes (P3, P4, P5, P6, P7, P8, PO3, PO4, PO5, PO6, PO7, and PO8). Moreover, the topographic map illustrated the corresponding p-values at the twelve electrodes after performing the paired t-test between positive N170 and negative N170 (Figure 3A).

Generalization for Classifier
The classification results of the three classifiers were shown in the tables (L1LR: Table 1; RBF-SVM: Table 2; LDA: Table 3). The generalizations of the three classifiers (L1LR, RBF-SVM, and LDA) were evaluated by six different aspects including CA, AUC, SE, SP, Kappa, and CT.
Based on the classification results above, paired t-tests were conducted to identify differences among the three classifiers on the six generalization metrics. For CA, the results showed that the L1LR classifier was significantly superior to the RBF-SVM (t = 2.966, p < 0.01) and the LDA classifiers (t = 3.860, p < 0.001). Moreover, the RBF-SVM classifier was significantly superior to the LDA classifier (t = 1.967, p < 0.05). For AUC, the results showed that the L1LR classifier was significantly superior to the RBF-SVM (t = 4.670, p < 0.001) and the LDA classifiers (t = 3.508, p < 0.01). However, there was no significant difference between RBF-SVM and LDA (t = 1.560, p > 0.05) on AUC. For SE, the results showed that there was no significant difference between any two of the three classifiers. For SP, the results showed that the L1LR classifier was significantly superior to the RBF-SVM (t = 2.081, p < 0.05) and the LDA classifiers (t = 1.940, p < 0.05), while there was no significant difference between RBF-SVM and LDA (t = 0.989, p > 0.05). For Kappa, the results showed that L1LR was significantly superior to LDA (t = 2.411, p < 0.05), while there was no significant difference between L1LR and RBF-SVM (t = 0.026, p > 0.05). Furthermore, RBF-SVM significantly outperformed LDA (t = 2.575, p < 0.01) on Kappa. For CT, the results showed that the L1LR classifier was significantly faster than the RBF-SVM (t = 35.173, p < 0.001) and the LDA classifiers (t = 4.595, p < 0.001). Moreover, LDA was significantly faster than RBF-SVM (t = 35.019, p < 0.001).

DISCUSSION
In the present study, a novel feature extracted from the single-trial N170, evoked by facial pictures with emotion, was proposed to classify positive and negative emotions. Combined with 10-fold cross-validation, six types of metrics (i.e., CA, AUC, SE, SP, Kappa, and CT) were used to evaluate the generalization of the three classifiers (i.e., L1LR, RBF-SVM, and LDA). We found that (1) the N170 at the occipitotemporal electrodes was modulated by facial emotion when REST re-referencing was applied; (2) the single-trial N170 could be used as a classification feature to differentiate positive emotion from negative emotion; and (3) compared with the other two classifiers (RBF-SVM and LDA), L1LR showed good generalization for emotional classification with the single-trial N170 as a feature. The findings could open up a new avenue for monitoring people's mood swings and developing effective BCIs for the regulation of individual emotion.

Emotion and Behavioral Response
The RT results showed that negative faces were clearly recognized more quickly than positive and neutral faces. Moreover, the ANOVA analysis showed that ACC for negative faces was significantly superior to ACC for positive faces, indicating that negative faces were identified more effectively than positive faces. Previous studies had also reached similar conclusions that the detection of negative facial emotion was faster and more efficient than the detection of positive emotion (Fox et al., 2000;Schupp et al., 2004).

N170 and Facial Emotion With Rest Reference
Whether the facial recognition-related N170 is modulated by facial emotion has always been a controversial topic. Initially, some researchers argued that the processing of faces, as indexed by the N170, was independent of and parallel to that of emotional expression (Caharel et al., 2005; Eimer and Holmes, 2007). However, recent findings suggested that the amplitude of the N170 could be affected by facial expressions (Liu et al., 2013; Song et al., 2017). For instance, neutral expressions elicited smaller N170 amplitudes than other emotional expressions (Blau et al., 2007), and happy faces elicited smaller amplitudes than other emotional expressions (Liu et al., 2013). This diversity of experimental findings might be related to differences in design and stimuli across experiments. Moreover, some researchers suggested that the effects of emotional modulation on the N170 were related to the reference electrodes (Hinojosa et al., 2015). For example, compared with the mastoid reference, the common average reference reinforced the emotional modulation effects at the occipitotemporal electrodes where the N170 ERPs typically occurred (Rellecke et al., 2013; Hinojosa et al., 2015). Moreover, our recent study also revealed that the reference technique might play a crucial role in the interpretation of ERP data, and we found that the REST reference would be a superior choice for precise evaluation of the scalp spatiotemporal changes connected to various cognitive events (Tian et al., 2018). In the current study, we found that the N170 could be modulated by emotion when ERPs were re-referenced by REST (Yao, 2001; Tian and Yao, 2013), which supported the views that the processing of facial recognition and expression were integrated rather than segregated mechanisms (Hinojosa et al., 2015; Song et al., 2017), and that the effects of emotional modulation on facial recognition might be associated with the reference technique (Rellecke et al., 2013; Hinojosa et al., 2015).
The details of REST reference were shown in Appendix 2 (Supplementary Material).

Emotional Classification in BCIs
Previously, some researchers were devoted to classifying different emotions with various features. For instance, three types of EEG time-frequency features, namely short time Fourier transform (STFT) features, mutual information (MI) features, and peripheral features, were utilized to distinguish different emotions elicited by imagination or recall of different emotional events (Chanel et al., 2009). Some researchers also attempted to use boosting encoded dynamic features for facial expression recognition (Yang et al., 2009). Moreover, fractal dimension values of the real-time EEG were also proposed as features in BCI-based emotional classification for music therapy (Sourina et al., 2012). However, the classification features used in these studies came from averaged ERPs rather than single-trial ERPs, which might neglect trial-to-trial differences. Moreover, in most previous studies the generalization of a classifier was evaluated only in terms of CA, which might not comprehensively evaluate the generalization of a classifier. Furthermore, in comparison with BCIs based on spontaneous brain activity such as motor imagery (Pfurtscheller and Neuper, 2001; Toshiro et al., 2012), BCIs based on evoked brain activity such as the P300 (Jin et al., 2011) had some clear strengths, such as a higher signal-to-noise ratio and faster mental state recognition (Meng et al., 2008). These types of BCIs, based on evoked brain activity, were dependent on external stimulation, such as the facial pictures with different emotions used in the current study.
Therefore, in the current study, evoked brain activity, namely single-trial N170 ERPs, was utilized for BCI-based emotion classification, which was of great significance for recognizing and steadily monitoring individual emotional states. Furthermore, six types of metrics were utilized to comprehensively evaluate the generalization of the three classifiers. Our results demonstrated that differences existed between positive N170 trials and negative N170 trials, which indicated that the single-trial N170 could be applied to emotional classification. Based on the paired t-tests from the "RESULTS" section, the three classifiers were ranked according to their classification generalization (Table 4). As shown in Tables 1-3, all three classifiers demonstrated a good classification effect, but L1LR performed best for pattern classification between positive and negative emotional data in accordance with the rankings (Table 4). Classification accuracy, a commonly used evaluation metric of a classifier, denoted the percentage of samples predicted correctly out of the total number of samples. For CA, the L1LR model showed good performance, but CA was only an overall metric of a classifier, and it might no longer be informative when the ratio of positives to negatives changed. Therefore, AUC was introduced to evaluate the generalization of the classifiers. Compared with CA, AUC was independent of changes in class distribution and made full use of the predicted probability values during the classification (Jin and Ling, 2005). As illustrated by Table 4, L1LR was clearly superior to RBF-SVM and LDA in terms of AUC, whereas there was no clear difference in SE among the three classifiers. Sensitivity denoted the probability that positive samples were predicted correctly, indicating that the three classifiers had a similar performance on the prediction of positive samples.
However, in terms of SP, which denoted the probability that negative samples were predicted correctly, the L1LR classifier was clearly superior to the RBF-SVM and the LDA classifiers. The Kappa statistic was the proportion of correctly classified samples after accounting for the probability of chance agreement. The larger the Kappa value, the better the performance of the classifier. For Kappa, the results showed that L1LR and RBF-SVM were significantly superior to LDA. In the field of BCIs, one of the most difficult challenges was the real-time online processing of signals, which required the classifier to operate quickly. Therefore, CT was used as one of the evaluation metrics for the classification generalization in the current study. With respect to CT, the computational time of L1LR was significantly lower than that of RBF-SVM and LDA. In comparison with L1LR, the most significant drawback of RBF-SVM was that it was time-consuming. RBF-SVM created complex nonlinear boundaries that depended on the RBF kernel function used in the current study; that is, more time was required to optimize the parameters during the classification, making it difficult to apply to BCIs in an online operating system.
Compared with the other two classifiers, the LDA classifier showed a relatively poor generalization, which might be because EEG is a nonstationary signal (Qin and Ji, 2004) and the differences between the two types of signals (namely positives and negatives) could not be simply distinguished by linear mapping. The basic idea of LDA was to linearly project the multidimensional data into a feature space where the two types of data could be best distinguished and to eventually create a linear boundary between the two classes. Therefore, LDA might be mainly suitable for situations in which the features of the two classes were linearly separable, but not for nonlinear ERP features (Liong and Foo, 2013). Moreover, LDA and RBF-SVM might also be subject to overfitting, that is, focusing too much on adjusting the boundary to give an optimal fit to the training set while failing to produce a good general boundary between the two classes (Dixon and Brereton, 2009). Therefore, cross-validation was utilized to mitigate this problem in the current study. Also, the L1LR model, whose sparsity was induced by the L1-norm, might avoid the overfitting problem to a certain degree (Kim et al., 2007; Park and Hastie, 2007). Moreover, the computational complexity decreased because of the sparsity of the L1LR model. In summary, the L1LR classifier showed a good generalization, whereas LDA showed a relatively poor generalization for emotional classification with the single-trial N170 as a feature in the present study.
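The sparsity attributed to the L1-norm can be illustrated with a minimal sketch. It fits an L1-regularized logistic regression by proximal gradient descent (ISTA with soft-thresholding), which is a swapped-in solver chosen for brevity and not the algorithm of the cited works or of the current study. The toy data, the penalty `lam`, the step size `lr`, and the omission of an intercept (the features are assumed to be centered) are all hypothetical choices.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam, lr=0.5, n_iter=200):
    """Minimize mean logistic loss + lam * ||w||_1 (no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        # Gradient of the mean logistic loss at the current weights.
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad)]
    return w


# Hypothetical toy data: column 0 is informative, column 1 is small noise.
X = [[2.0, 0.3], [1.5, -0.3], [2.5, 0.2], [1.0, -0.2],
     [-2.0, -0.3], [-1.5, 0.3], [-2.5, -0.2], [-1.0, 0.2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights = fit_l1_logistic(X, y, lam=0.5)
print(weights)
```

With this penalty, the weight of the noise column is shrunk to exactly zero while the informative column keeps a positive weight, which is the kind of sparsity that reduces the number of features the model must evaluate.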

LIMITATIONS
Previous studies demonstrated that the vertex positive potential (VPP) component recorded at the Fz electrode might be the positive counterpart of N170 (George et al., 1996; Itier and Taylor, 2004). Therefore, we did not discuss the VPP component in the current study. Based on the standard analysis of ERPs, the paired t-test showed no obvious difference between N170 trials with negative emotion and N170 trials with neutral emotion. Moreover, according to the behavioral analysis, there was no obvious difference in RT between positive face discrimination and neutral face discrimination. For ACC, there was no significant difference between the identification of neutral faces and the identification of negative faces, nor between the identification of neutral faces and the identification of positive faces. The reason might be that some subjects mistakenly regarded expressionless pictures (namely neutral facial expressions) as facial pictures with negative emotions during the experiment, while other subjects might have mistakenly regarded neutral facial expressions as positive expressions. Thus, we performed pattern classification only between positive samples and negative samples. Including both the amplitude and the latency of the N170 as features to classify emotions could be interesting and valuable and might increase the classification performance. Future studies could pursue this direction.

CONCLUSIONS
In the current study, we proposed using the single-trial N170 as a feature for emotional classification. The results showed that the three classifiers, namely L1LR, RBF-SVM, and LDA, successfully classified positive and negative samples; L1LR showed a relatively good generalization for pattern classification of different emotions, whereas LDA showed a relatively poor classification performance. The current study could provide beneficial information for researchers in emotion regulation; furthermore, the single-trial N170 could be a biomarker to monitor the emotional states of subjects in the BCI domain.

AUTHOR CONTRIBUTIONS
YT conceived and designed the experiments and wrote the manuscript. HZ performed the experiments, analyzed the data, and wrote the first draft. YP and JL contributed reagents, materials, and analysis tools.