Application of the Stockwell Transform to Electroencephalographic Signal Analysis during Gait Cycle

The analysis of electroencephalographic signals in frequency is usually not performed by transforms that can extract the instantaneous characteristics of the signal. However, the non-steady state nature of these low voltage electrical signals makes them suitable for this kind of analysis. In this paper a novel tool based on Stockwell transform is tested, and compared with techniques such as Hilbert-Huang transform and Fast Fourier Transform, for several healthy individuals and patients that suffer from lower limb disability. Methods are compared with the Weighted Discriminator, a recently developed comparison index. The tool developed can improve the rehabilitation process associated with lower limb exoskeletons with the help of a Brain-Machine Interface.


INTRODUCTION
Reduced mobility is a serious handicap for people who have suffered a cerebrovascular accident, brain trauma or encephalitis. Orthesis and prosthesis devices have been developed in last years in order to assist people with severe motor limitations. Although EMG-based interfaces have been used in several applications for controlling these devices (Villarejo Mayor et al., 2017), the use of Brain-Machine Interfaces (BMIs) can be a more suitable option to control a speller or a wheel chair (Li et al., 2014) and especially exoskeletons, since they can improve the neuroplasticity in rehabilitation therapies (Cramer, 2008;Gharabaghi, 2016;Barrios et al., 2017).
The basis of a BMI is to extract the brain waves, normally by electroencephalography (EEG), process and translate them into commands to control a device. The electrical waves obtained are categorized by their frequency components and the location where they are acquired (Rao, 2013). Usually, the following frequency bands are considered: delta (0.1-4 Hz) which is associated with deep sleep (Amzica and Steriade, 1998), theta (4-7 Hz) which is associated with Rapid Eye Movement (REM) sleep and transition from sleep to waking (Cantero et al., 2003), alpha (8-15 Hz) which is associated with relaxed, but awake state with eyes closed (Da Silva, 2010) and beta (15)(16)(17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30)(31)(32) and gamma (>25 Hz) which are associated with movement and attentive focus (Rao, 2013). Depending on the author, the bands can slightly differ and overlap. Thus, it is hard to establish a precise limit for them. In literature, bands have received another designation; for instance, mu band (8-12 Hz) which is usually related to the event-related synchronization phenomenon (Pfurtscheller and Neuper, 1994). In Cheron et al. (2012), it was demonstrated that some EEG frequency bands (alpha, beta and gamma) are involved in the control of the walking pattern, and that it is possible to extract EEG signals event-related desynchronization/synchronization (ERD/ERS) (Severens et al., 2012;Wagner et al., 2012) from the sensorimotor cortex controlling the contralateral foot placement. This confirmed the study of Gwin et al. (2011) which stated that electrocortical activity is coupled to gait cycle phase during treadmill walking. Cheron et al. (2012) also used the two most representative independent components of the sensorimotor cortex as input for a Dynamic Recurrent Neural Network (DRNN) learning identification toward the two principal components of the 3 elevation angles (foot, shank, and thigh) of one lower limb kinematics which can be easily interpreted by artificial actuators.
EEG signals are usually analyzed by Fourier transforms (FT). Due to the discrete nature of the data analyzed, signals are cut in several windows and processed (Fast Fourier Transform FFT or Short Time Fourier Transform STFT). Although the information extracted by each epoch can provide the evolution through time of the frequency components, other techniques could be more suitable for its time-frequency analysis due to the non-steady nature of EEG signals.
In literature, there are a few examples of these techniques, such as the wavelet transform (Subasi, 2005). However, it needs a good choice of the wavelet mother function which can make this process difficult. In our previous research (Ortiz et al., 2017), we introduced the application of a time-frequency analysis transform, the Hilbert Huang Transform (HHT) (Huang et al., 1998), in an offline scenario for lower limb detection of start and stop of gait cycle based on the ERD/ERS phenomenon. HHT combines a decomposition algorithm Empirical Mode Decomposition (EMD) and a mathematical transform Hilbert Transform (HT). This paper expands our previous research introducing a new transform, the Stockwell Transform (ST) (Stockwell et al., 1996), in order to compare its performance not only with HHT, but also with the FFT. Besides, the study is carried out in an offline and a pseudo-online scenario for better comparison of the techniques. In order to correctly measure the performance of the different proposals, the Weighted Discriminator index (WD) is used (Rodríguez-Ugarte et al., 2017).
The purpose of this work is to show how time-frequency techniques, such as the ST transform, improve the accuracy of the start and the stop detection of gait cycles through the EEG signal analysis, lowering the number of false detections. Sixteen different subjects (eight healthy and eight patients) participated in the research. Data was not only analyzed offline, but pseudoonline as this approach simulates the behavior of the BMI working with an external device in real time.

MATERIALS AND METHODS
This section provides information about the experimental setup, equipment used for EEG acquisition, motion capture system (MCS) and the data processing.

Experimental Setup
Data was collected on sixteen participants. Eight participants (labeled as H1-8) were healthy and did not have any known health issue. They were all right-handed, and were in the age range of 24-29 years (28.2 ± 3.0) at the time of the experiment. Additionally, there were eight patients (labeled as P1-8) of the National Hospital for Spinal Cord Injury in Toledo (Spain), and they were in the age range of 19-71 years (43.7 ± 18.4). This study was carried out in accordance with the recommendations of ethics committee of the Miguel Hernández University of Elche (Spain) with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the ethics committee of the Miguel Hernández University of Elche (Spain).
Healthy subjects performed ten different trials. However, certain patients due to limitations, tiredness and hardware detection problems completed less trials or some of the trials were not correctly accomplished. Table 1 shows the number of trials considered for analysis by each subject.
Each trial consisted of 4 complete gait cycles, each one with two events: start and stop as can be seen in Figure 1. Each cycle followed the next pattern: relax, start intention of gait, gait, stop intention of gait and stop (out of the model analysis, although it can be considered as a relaxed state).
The purpose of the paper is to measure the performance of a BMI that detects the starts or stops of the gait cycle. Hence, two different models were considered, one for the detection of start and other for the detection of stop. The analysis was carried out considering each event (start or stop intention) as the active part of the model (state 1) and the previous state (relax or gait) as the non-active (state 0).
Data processing was not performed in real time. Two different approaches were considered in order to measure the performance of the methods and the classification model: offline and pseudoonline. Pseudo-online analysis simulated the behavior of the tool in real-time conditions. State labels (0 and 1) were defined based on the Inertial Measurement Units (IMU) activation. Active windows had a 4 s duration starting 2 s previously to the IMU activation. Non-active windows were considered before the active windows with a time gap of 0.5 s. In the case of the offline model, active windows had a duration of 4 s, covering the whole previous time of analysis for the pseudo-online model. Differences between the data windows are shown in Figure 1 for both approaches. Nevertheless, data was sampled at 500 Hz and sent every 0.2 s in epochs of 1 s duration to the data processing tools. This allowed to treat the data in a similar way for all the methods tested.

EEG Data Acquisition
EEG signals were recorded using a commercial device developed by Brain Products GmbH (Germany). 31 electrodes were used Total number of trials 10 10 7 6 7 8 9 10 9 Trials used for pseudo-online model 1-6 1-6 1-4 1-4 1-4 1-5 1-6 1-6 1-6 Trials used for pseudo-online testing 7-10 7-10 5-7 5-6 5-7 6-8 7-9 7-10 7-9 Frontiers in Neuroscience | www.frontiersin.org with the help of an actiCAP for an easier placement. The system registered the EEG signals through the actiCHamp amplifier. They were wireless transmitted by a MOVE transmitter for offline and pseudo-online analyses at a sampling frequency rate of 500 Hz. Electrodes were positioned according to the International 10/10 system. Figure 2A shows healthy subject H4 walking during a trial. Although 31 electrodes were recorded, finally only nine electrodes were used. This is based on previous studies (Hortal et al., 2016b;Ortiz et al., 2017) and preliminary results of the methods tested on this paper for offline scenario. The electrodes chosen were those close to the electrode Cz (Fz, FC1, FC2, C3, Cz, C4, CP1, CP2, and Pz). The reference was positioned on the right ear lobe and the ground on AFz. Figure 3 shows the electrode configuration used.

Motion Capture System
For the MCS a Tech MCS manufactured by Technaid S.L. (Spain) was used. The system consisted of seven wireless Inertial Measurement Units (IMUs) located on the following positions: three at each limb (foot, thigh and leg) and one on the lumbar position. Figure 2 shows the exact position of the units. Each IMU had three different types of sensors: an accelerometer, a gyroscope and a magnetometer. Each IMU provided 19 parameters: 9 for rotation, 3 for acceleration, 3 for angular velocity, 3 for magnetic field and 1 for temperature. Rotation parameters were used to detect the initiation and stop of the movement. The data registered by each IMU were acquired through a HUB connected to the PC USB port at a sampling rate of 20 Hz. The MCS mission was to provide the feedback of gait state changes for accuracy calculation of the tests, i.e., the correct detection of the real initiation and stop of the gait.

Preprocessing
Reducing signal to noise ratio is important to improve the feature extraction, so pre-processing of data was needed for each epoch. Several filters were considered.

Hardware filter
As this depends on the equipment used for data acquisition, it was the same for all the measurements. The hardware used applied a low pass filter with a cut off frequency of 100 Hz and a notch filter at 50 Hz in order to mitigate the power line interference.

Spatial filter
The use of a spatial filter helps to minimize the contribution of the rest of the electrodes to each channel. This way the information of each sensor is better isolated (McFarland et al., 1997). In a previous study (Ortiz et al., 2017) a Laplacian (Lp) and a Common Average Reference (CAR) spatial filter were compared during offline analysis, obtaining Lp filter a better outcome in average than CAR for both event detection (7.8 % better comparison index for the FFT and 6.3 % better for the EMD). Therefore, Lp was the spatial filter used in this study. It aims to subtract the contribution of the rest of electrodes based on distance. Equation 1 shows how the filtered voltage is calculated for electrode i.
Where V Lp i represents the voltage after filtering, V j is the voltage without filtering, j = 1 : 31 and g ij : With d ij representing the distance between the electrodes i and j based on the three dimensional Euclidean method.

Frequency filter
The application of the frequency filter depends on the algorithm or mathematical transform used for data analysis. In this research it was only used before the FFT processing. Based on our previous study (Ortiz et al., 2017) a 4th order Butterworth high-pass filter with a cut off frequency of 0.2 Hz was used to extract the DC component of the signal. HHT and ST methods are not affected by the low pass filter, so it would only increase the computing time without an improvement of the results.

Processing
Three different processing tools were tested to obtain the feature vector of the signal: FFT, HHT, and ST. The three methods were used to extract the data characteristics from the same frequency bands: 8-13, 13-32, and 32-50 Hz. They are related to alpha, beta (ERD/ERS phenomenon) and gamma (attentive focus) bands. Therefore, for all processing methods, each electrode provides three features. Following paragraphs describe the nature of these methods.

Fast fourier transform
Fourier transform (Bracewell and Bracewell, 1986) is one of the most extended tools in signal processing. The non-steady state nature of EEG signals is minimized thanks to the analysis of the signal in epochs. FFT provides information of the evolution in time (different epochs moving every 0.2 s) of the harmonic content. For each epoch and electrode, the harmonic content in each band was computed.

Hilbert huang transform
HHT was developed by Huang et al. (1998) as an improvement to Hilbert Transform (HT) application. It consists of a sifting process based on the envelopes of data. This process, called Empirical Mode Decomposition (EMD) separates a signal in several Intrinsic Mode Functions (IMFs). The process can be described as follows: 1. Find the local extrema of x(t).
2. Find the maximum envelope e + (t) of x(t) by fitting a natural cubic spline through the local maxima. Then, repeat this step to find the minimum envelope, e − (t), by using the local minima. 3. Compute an approximation to the local average: a. The number of extrema and the number of zero crossings may differ by no more than one. b. The local average is zero. The thresholds chosen to set this condition are critical to avoid over or undertraining. In this research the stopping criteria thresholds of Rilling et al. (2003) were followed. c. To avoid the extraction of accidental IMFs, the conditions must be accomplished in at least two to three consecutive iterations (three in our case).
6. If p i (t) is not an IMF, repeat the EMD sifting process by setting: Every IMF i (t) is supposed to be monotonic if EMD is successfully applied. Therefore, IMF i (t) and its HT are orthogonal and instantaneous amplitude a(t) and pulsation ω i (t) can be computed through the analytical complex function z i (t) analysis (Huang et al., 1998) of every mode: Once all the modes are extracted (see Figure 4B for an example), Hilbert Spectrum H(ω, t) (Huang et al., 1998) is calculated based on a i (t) and ω i (t) for all the modes. H(ω, t) is computed as a function of energy (square amplitude) by frequency and time (right part of Figure 4A). As data volume of H(ω, t) can be too large, the Hilbert Marginal Spectrum h(ω) is computed as Equation (5). This is carried out for each epoch as: Being t 2 − t 1 = 1s. For each electrode and epoch the peak value of each frequency band is extracted as one feature of data. See Figure 4A for a clearer representation of h(ω) features. Notice that in Figure 4A H(ω, t) is represented in logarithmic square amplitude, instead of square amplitude, for a better visualization.
However, there are difficulties related to the algorithm nature of the EMD. The sifting process is sensitive to the thresholds defined in the algorithm (Rilling et al., 2003) and the sampling frequency (Rilling and Flandrin, 2006). Besides, it can be hard to extract components that present similar tones (Rilling et al., 2003;Rilling and Flandrin, 2008) which can affect to the quality of the H(ω, t) due to the lack of orthogonality of some of the modes.

Stockwell transform
Stockwell transform, also known as S-Transform (ST), was developed as a time-frequency decomposition tool (Stockwell et al., 1996). It overcomes some of the disadvantages of Short Time Fourier Transform (STFT) (better time-frequency resolution) based on a scalable localizing Gaussian window. It is defined as: One of the properties of ST is to define multiple frequency voices as one dimensional functions of time scale (τ ) and frequency (f i ): Due to the orthogonal nature of voice functions, local frequency and amplitude can be computed which allows to obtain H(ω, t).
Once it is created for each epoch, h(ω, t) is calculated in the same way that was explained for HHT in paragraph 2.2.4.2 and Figure 4A, obtaining the three features per electrode based on the h(ω) peaks per band. ST and HHT are similar in the way the features are extracted, but different in the way H(ω, t) is computed. The main advantage of ST is its analytical nature which makes it not dependable of any thresholds. However, although it improves the frequency resolution of FFT, it has still a worse frequency resolution the higher the frequency is. As the frequency bands related to the characteristics (8-50 Hz) are far from the Nyquist frequency (250 Hz), this is not a problem for our study.

Post-processing
Once the features were extracted per each electrode (9×3 = 27 data vector per 1 s epoch). Two different tests were done: offline and pseudo-online. First, it was necessary to create a model for the later test data identification. Each participant had this way four different models associated: one for type of event detection (start or stop) and one for approach (offline or pseudo-online). The model allowed to identify testing epochs as non-active (0 label for rest or gait) and active intention (1 label for start intention or stop intention).

Classifier
The classifier chosen was the Support Vector Machine (SVM) algorithm. The SVM is based on hyperplane separation by maximizing the margin between the nearest points of different classes (Steinwart and Christmann, 2008). SVM combined with non-linear kernels, such as the radial basis used for this research, results in a robust method (Hortal et al., 2016a;Sburlea et al., 2017). Other alternative classifiers such as Self-organizing maps (SOM) and Linear Discrimination Analysis (LDA) were also considered and tested for some of the healthy subjects at the first steps of the research. However, the higher time of processing required for the model creation in the case of the SOM classifier and the overall better results obtained by SVM were the reason to select it. In order to limit the volume of data presented, only SVM results are shown on this paper. The model creation and evaluation was carried out in a different way for offline and pseudo-online approaches.
In the case of offline analysis, subjects were evaluated by leaveone-out cross-validation. This means that for each participant, one trial was used for validation and the rest of the trials for modeling. For instance, in the case of a subject with ten trials registered, ten different models of nine trials were performed for start intention and another ten for stopping. The ten test trials were evaluated for each one of their models and finally the results were averaged.
For pseudo-online tests, the first trials were used to create the model and the last ones to test it as if they were processed in real time. Therefore, evaluation was carried out without leave-oneout cross-validation. In the case of healthy participants, the ratio was six tests for modeling and four for testing (6/4 ratio). As the number of trials for patients was inferior to ten in some cases, the ratio presented minor differences, e.g., P2 had a 4/3 ratio. Table 1 shows the trials used for the model creation and testing by user. The indices associated with the test trials of each subject were also averaged.
For evaluation, each epoch, formed by a 27 features vector, was tested over the classifier and a label 0 or 1 was returned. This label was compared with the true nature of the epoch, based on the MCS data, and a result of a true (T) or false detection (F) was registered. The process is shown in Figure 5. This true and false vector was used afterwards for the index evaluation.

Evaluation indices
The most common way to evaluate the results is using the following indices: True Positive Rate (TPR), False Positive per Minute (FP/min) and Accuracy (Acc).
TPR indicates the percentage of true start or stop intention events detected. This was evaluated only for the active windows (red or magenta in Figure 1). The evaluation of a true event detection was a bit different for the two analysis. Offline, a number of T > F was enough to consider a true event, while on pseudo-online, more than five consecutive T were needed to consider the whole event as true. A single trial TPR would be defined as:

TPR =
Number of true event detections Number of true events Accuracy appoints how many start or stop intentions detected were really a true detection. This means that it has to be evaluated for active and non-active windows. In the case of non-active windows, a false detection was achieved when F > T (offline) or more than five consecutive F were accounted (pseudo-online).
In active windows, the calculation of a detection was the same as TPR. The Acc of a trial would be: Acc = Number of true event detections Number of total event detections FP/min indicates the number of false detections per minute. It is an important index, because a high number would result in a disturbing operation of the mobility assistant device, in which the mechanism would be activated without the real subject desire. FP/min were computed for non-active windows when F > T in the case of offline analysis. However, pseudo-online analysis is a bit different. As it tries to simulate the real-time behavior of the tool and several FP can occur during non-active windows, it was necessary to compute all the false activations and not only one by event. This way, a FP was computed each time more than five consecutives F were detected. The rest or gait windows (nonactive) for the offline scenario have the same length than active windows: 4 s, i.e., 4/60 min, while for the pseudo-online scenario expand for the whole rest or gait time previous to the 0.5 gap between non-active and active windows per event. This can be seen in Figure 1B as cyan or green windows for the start and the stop models. FP/min can be expressed as: FP/min = Number of false activations Rest or gait time windows in minutes It is important to remark that the three indices are necessary in order to correctly analyze the results. For instance, a trial test with 100% Acc and 0 FP/min could seem a perfect one, but it may only indicate that one of the four events of a trial was detected, being in this case the TPR only 25%. Although the previous indices provide a good information of the results, it can be difficult to compare them based on three independent indices. Consequently, in a previous research, a unified index called Weighted Discriminator (WD) was developed (Rodríguez-Ugarte et al., 2017). WD takes into account TPR, Acc and the False Positive Ratio (FPR) which provides the ratio of false positives per event: with TPR and FPR in p.u. and: FPR = FP/min · Duration of a single FP in mins Being the duration for the offline analysis 4/60 min and 1/60 min for the pseudo-online one (equivalent to five consecutive detections represented by the gap of 0.2 s). WD can oscillate from a perfect performance value of 1 to the worst if value is −1. Therefore, WD acts as a comprehensive index to compare the performance of the different tests.

RESULTS
Results were obtained for the sixteen subjects and the offline and pseudo-online analysis. The comprehensive WD index was calculated from the determination of TPR, FP/min and Acc indices in order to evaluate the performance of the BMI in every case. WD index was statistically analyzed. A Shapiro-Wilks (S-W) test of normality was applied in order to detect outliers (Ghasemi and Zahediasl, 2012) and a factorial multivariate analysis of variance (MANOVA) was carried out to detect the significant differences between methods (ST/HHT/FFT), type of subject (healthy/patient) and type of event (start/stop) with the help of SPSS (Field, 2009).

Offline Analysis
As previously stated, offline analysis was carried out by leave-oneout cross-validation technique. WD acts as a measurement of the BMI performance. It was computed for each method, subject and model. This can be seen in Tables 2-4. The S-W test indicated that P2 was an outlier for the HHT stop model (p < 0.05) and, as a consequence, it was not considered for the stop model.
In order to carry out a MANOVA test, it is needed to assess if the variances between groups are equal (assumption of sphericity). This test is known as Mauchly's test. In this case, assumption of sphericity was not violated (p > 0.05).
Then, it was performed the MANOVA analysis. The interaction between methods and type of event in the test of within-subjects presented significant differences [F (2, 54) = 10.80, p < 0.001, η 2 = 0.29]. In order to see the cause of this, a pairwise comparison using Bonferroni confidence interval adjustment was The data represent the average value of the trials (up to ten by subject) obtained by leave-one-out cross-validation technique for the different evaluation indices. Outliers identified by SPSS are marked by *. Best WD results by subject are marked in bold for the start and stop models.
Frontiers in Neuroscience | www.frontiersin.org The data represent the average value ±, standard deviation by method for the subjects. P2 was not considered for the stop model as it was detected as outlier. Best results for start and stop models appear in bold. The data represent the average WD value ± standard deviation by method and type of subject: healthy or patient. P2 was not considered for the stop model as it was detected as outlier. Best results in bold.
performed. ST and FFT had no significant differences (p > 0.05), but HHT did (p < 0.01) for the start and stop models. On the other hand, the interaction, in the test of within-subjects, between methods and type of subject presented no significant differences (p > 0.05). The interaction between type of event and type of subjects in the test of between-subjects was also not significant (p > 0.05).
Regarding the WD value, ST obtained the best results for both start and stop models (bold text in Table 3). Although, there were no significant differences depending on the type of subject, WD results for healthy users were higher in average (bold text for average in Table 4). ST was also the method with the highest WD for both type of subjects as Table 4 shows, but with a lower difference with the other methods for patients than for healthy subjects. HHT performance was irregular with the lowest WD results for the start model and the highest standard deviation in Tables 3, 4.

Pseudo-Online Analysis
The actual application of the BMI is in a real-time situation where the patient is trying to activate the motion device with the BMI output. As the trials were acquired before the ST and HHT implementation by the authors, a pseudo-online approach was adopted to overcome this issue. In a pseudoonline scenario, it is simulated that epochs are processed as they are acquired. First trials were used for modeling, as stated in subsection 2.2.5.1, and the rest of them were reserved for testing ( Table 1). It is important to notice that, FP/min was calculated in a different way as several false activations can be registered in real-time non-active windows. Therefore, this approach bring us a more realistic outcome of the BMI behavior, while offline tests gives information of the global performance when applying the different methods. A bad trial performance of a subject had a more relevant influence over the results, because not only an inferior number of trials is considered, but false detections can be multiple for each event. Table 5 provides the TPR, FP/min, Acc, and WD results for the different methods and subjects whereas Tables 6, 7 show the average WD value for the three methods and type of subject. The results vary from the offline tests as there were differences in the way a detection was computed and the number of trials used for testing. This variation is more noticeable in the case of the number of FP/min, as a comparison between offline and pseudo-online tables shows. However, as the WD takes into account the FPR and the FP time length varies, WD still acts as a good index for the method comparison.
The S-W test of normality was passed for all the models (p > 0.05) and no outliers were detected. Hence, the sixteen subjects were considered for pseudo-online analysis.
Mauchly's test indicated that the assumption of sphericity was violated (p < 0.05). Therefore, it was needed to apply the corrector factor with the highest power, in this case Huynh-Feldt.
For the MANOVA analysis, the test of within-subjects effects presented no significant differences (p > 0.05) for the interaction between methods and type of event, and for the interaction between the methods and the type of subject, applying for both cases the corresponding corrector factor of Huynh-Feldt. The interaction between type of event and type of subject in the test of between-subjects effects presented significant differences [F (1, 28) = 6.34, p < 0.05, η 2 = 0.185]. The pairwise comparison using Bonferroni confidence interval adjustment did not detect  The data represent the average value ± standard deviation by method for the sixteen subjects. Best results for WD start and stop models appear in bold. The data represent the average WD value ± standard deviation by method and type of subject: healthy or patient. Best results in bold.
differences for healthy subjects (p > 0.05), but it did for patients (p < 0.05). This indicated that patients performed significantly different depending on the start or stop event detection. Regarding the WD value, ST obtained the best results for both start and stop models (bold text in Table 6), with lower FP/min and higher TPR and Acc. Looking at Table 7, WD results were similar in average for healthy subjects and patients, not showing the apparently superior performance that offline analysis attributed to healthy subjects. The same table also shows that ST presented the highest WD value for both healthy subjects and patients. HHT was again the method with the most irregular performance, as the lower WD value and higher standard deviation in Tables 6, 7 indicate. The HHT result was specially low in the case of the stop model of patients which was the reason of the previously detected difference in the pairwise comparison.

DISCUSSION
A new BMI based on ST has been compared to another signal analysis technique (HHT) and a traditional transform (FFT). The tests were done for sixteen different subjects: eight healthy and eight with lower limb disabilities. With the help of a recently developed comprehensive index (WD), the different processing methods were evaluated in an offline and a pseudo-online scenario.
From the point of view of the differences between start and stop event detection of gait, offline analysis seemed to perform better for the stop detection. However, the pseudoonline approach offered a similar performance in average, with the same WD value in the case of the ST method. In addition, statistical analysis showed no significant differences between the start and stop models. Therefore, as pseudo-online model is a more adequate way to represent the performance of the BMI in a real-time scenario, it can be concluded that both event detection models (start/stop) are similar.
Another conclusion is related to the individual performance of the sixteen participants. Results of Tables 2, 5 show that BMI performance was dependent on the subject, as the performance of each of them was substantially different. This means that the subjects need some time to get used to the BMI in order to improve their results. However, there were not significant differences between healthy subjects and patients in the MANOVA test. Therefore, it is not needed to personalize the BMI depending on the type of subject.
Regarding the different methods of analysis, indices showed that ST obtained the best results with better Acc and TPR, and even zero FP/min for certain subjects. All the models showed higher WD value in average for ST (bold text in the tables) which demonstrates the better performance of this transform. This is mainly due to the analytical nature of ST that makes it a more robust method than HHT. HHT had an irregular performance with the lowest WD value for the start offline model of both type of subjects and the stop pseudo-online model of patients (WD < 0.4), but with similar results to the other methods in the rest of the cases (WD > 0.55). HHT was also the method with the highest standard deviation for all the models. The cause of this irregular behavior is the EMD algorithm. EMD did not always achieve to extract the different components related to the bands of frequency considered in the paper. If the EMD of an epoch mixes several tones in a IMF, H(ω, t) is not computed correctly and the h(ω) does not provide the three features per electrode in a constant way, which affects the classifier and the event detection. FFT performed as the second best method, but as it is not based on instantaneous amplitude and frequency, provided a worse determination of the transition from a relax to a starting gait state, and from a gait to a stopping gait state than ST.
The comparison of this work with previous works is not trivial. First, there are not many studies about detection of intention of start and stop gait for lower limb that provide the three parameters: TPR, FP/min and Acc. In addition, the FP/min can be computed differently depending on the approach. For instance, a different number of consecutive detections could be specified, or a statistical mode threshold could be used. And finally, WD is hardly used as a comparison index because it was recently developed.
In Jiang et al. (2015) an offline approach for single-trial detection of gait initiation from movement related cortical potentials was presented. The study, carried out for nine subjects, provided the following averaged results: TPR = 76.80 ± 8.97% and FP/min = 2.93 ± 1.09. No Acc was indicated in the paper, being TPR a bit lower and FP/min a bit higher than the offline ST results shown in Table 3 for the start model (TPR = 83.45 ± 10.98% and FP/min = 2.88 ± 1.27). Looking at previous work of the authors based on FFT (Hortal et al., 2016b), Table 2 of the reference shows averaged indices of TPR = 54.8 ± 9.3% and FP/min = 2.66 ± 2.24 for the offline start and stop gait intention of six subjects (no Acc provided). This example allows to compare the results in a similar scenario with more subjects under analysis. In our research, the same averaged indices (start and stop) show also an improvement: TPR = 78.82 ± 12.39% and FP/min = 2.25 ± 1.28.
It has been demonstrated that the BMI developed allows to detect the start and stop of gait intention through the use of EEG signals improving the accuracy obtained. Future research will aim the online implementation of the BMI with a motion assistant device. This approach could be useful in the context of the lower limb rehabilitation for patients that have suffered stroke.

AUTHOR CONTRIBUTIONS
MO is responsible for the design, implementation and data analysis. MR-U and EI developed the classifier module and MO adapted it. Data acquisition was designed and performed by EI. MO, MR-U, EI and JA contributed to the revision process. JA actively contributed as director of the work.

FUNDING
Research supported by the project Associate-Decoding and stimulation of motor and sensory brain activity to support long term potentiation through Hebbian and paired associative stimulation during rehabilitation of gait (DPI2014-58431-C4-2-R), funded by the Spanish Ministry of Economy and Competitiveness and by the European Union through the European Regional Development Fund (ERDF) "A way to build Europe". The acquisition wireless system of EEG signals with 32 channels from Brain Products has been partially financed by funds from the European Union (P. O. FEDER 2007O. FEDER /2013, with the management of Generalitat Valenciana (Spain).