ORIGINAL RESEARCH article

Front. Phys., 03 January 2022

Sec. Optics and Photonics

Volume 9 - 2021 | https://doi.org/10.3389/fphy.2021.811681

Seizure Prediction With HIVE-CODAs: The Hierarchical Vote Collective of Domain Adaptation Methods

  • Key Laboratory of Measurement and Control of CSE, Ministry of Education, School of Automation, Southeast University, Nanjing, China


Abstract

Epileptic seizure prediction is one of the most used therapeutic adjuvant strategies for drug-resistant epilepsy. Conventional methods are usually trained and tested on the same patient because of interindividual variability. However, the challenging problem of the domain shift between different subjects remains unsolved, resulting in low clinical adoption. In this study, a generic model based on the domain adaptation (DA) technique is proposed to alleviate such problems. Ensemble learning is employed by developing a hierarchical vote collective of seven DA modules over multi-modality data, such that the predictive performance is improved by training multiple models. Moreover, to increase the feasibility of its implementation, this study mimics the data distribution of clinical sampling and tests the model under this simulated realistic condition. Based on the performance of the seven subnetworks, the applicability of each DA algorithm for seizure prediction is evaluated; this is the first study to provide such an assessment. Experimental results on both intracranial and scalp EEG databases demonstrate that this method reduces the domain gap effectively compared with previous studies.

1 Introduction

1.1 Epilepsy Background

Epilepsy is a cerebral anomaly marked by the transient occurrence of unexpected seizures caused by excessive or hypersynchronous neuronal activities [1]. It is the second most clinically significant neurological disorder and affects approximately 1.0% of the world’s population [2]. A reliable seizure prediction device, which anticipates an upcoming seizure from continuous electroencephalogram (EEG) signals, is an emerging and important demand for drug-resistant individuals, who account for about 30% of people with epilepsy [3, 4]. Such an early-warning device could help prevent seizure-related injury, or even death.

EEG is a commonly used type of physiological signal for measuring epileptic brain activity; it records rhythmic information induced by coordinated neuronal firing with characteristic periodicity. The first-in-man forecast study was reported in 2013 [5], offering convincing proof that seizures are predictable. Since then, many EEG-based algorithms adopting the data-driven technique have been presented.

1.2 Related Work

Current research on seizure prediction falls mainly into two streams. The first typically follows a binary classification scheme, which assumes that a measurable difference exists between the interictal and preictal stages. The ictal and postictal sequences are discarded during data processing because they do not contribute to forecasting. The second stream detects the fluctuation of a specific index during the preictal period, such as the spike rate [6–8], zero-crossing intervals [9], and phase/amplitude locking value [10]. If the observed indicator exceeds a preset threshold, an early warning is declared. Owing to the rich information in multichannel EEG recordings, the first stream is more widely recognized than the second. This study also adopts the strategy of distinguishing preictal states from interictal states, as depicted in Figure 1.

FIGURE 1

Approaches using the binary classification scheme commonly adopt machine learning techniques such as support vector machines [11–13], random forests [14], and k-nearest neighbors [15]. In recent years, many deep learning frameworks, including the convolutional neural network (CNN) [16–19], 3D CNN [20], long short-term memory (LSTM) network [21–23], and cascades of DNNs [24], have been exploited to analyze continuously acquired epileptic EEG signals. However, many promising algorithms remain to be developed and applied. Ensemble learning is considered a state-of-the-art solution for many challenging problems. For instance, several representative approaches, including HIVE-COTE [25], boosting, bagging, and stacking, have achieved high performance for time series classification. Such methods are appealing because they attain stronger generalization ability than a single model by training multiple subnetworks and combining their predictions. For this reason, we probe into the effectiveness of ensemble learning for seizure prediction.

Most recently, various machine learning–based studies have achieved high performance. However, these methods are not yet in widespread use, as most provide only patient-specific results; that is, both the training and testing sets are collected from the same subject. This strategy is adopted because large interindividual variability is ubiquitous among patients with epilepsy [26–28]. Therefore, an ensemble containing a number of domain adaptation modules is developed in this study to reduce the impact of epileptic individual variability.

1.3 Significance

Although conventional studies achieve encouraging successes in the seizure prediction task, their translation to application remains challenging, in part because of their limited domain adaptability across different subjects. EEG patterns vary significantly from patient to patient, as shown in Figure 2, and the issue of model generalization remains unsolved. In previous studies, the training and testing sets come from the same patients, which can yield very high average sensitivity. Although such trials are important for personalized medicine, they are inconsistent with the clinical scenario in most cases. In other words, conventional models may perform well on one patient but be less effective on another, since the domain gap between different subjects is largely ignored. In practice, the training set is mainly composed of previously collected patient data, and only a small amount of user samples can be used for training. The training set, consisting of various subjects, forms the source domain; the “unseen” user is the target domain. In the existing literature, few studies explore this domain shift issue. Therefore, a general seizure prediction model that matches the clinical situation remains to be explored and refined.

FIGURE 2

To improve the clinical applicability of seizure prediction and circumvent the impact of interindividual variability, domain adaptation (DA) is introduced. However, few studies have applied these techniques to epileptic EEG, although successful applications of DA have been reported in fields such as image recognition and emotion-related EEG [29, 30]. There are three main streams of DA algorithms. The first exploits adversarial learning to extract invariant information across source and target domains. The second extends the sample size with data augmentation to access the target domain pattern in advance. The third establishes general features based on specific prior knowledge.

Inspired by the success in other areas, we hope to extend DA to the field of seizure prediction. Since many DA techniques [31–33] have been provided, an ensemble learning–based model, the hierarchical vote collective of DA subnetworks (HIVE-CODAs), is proposed in this study. HIVE-CODAs combine the advantages of various DA methods. Besides, it can evaluate the applicability of each DA algorithm. In general, the main contributions of this study are summarized as follows:

  • A generic model, HIVE-CODA, is proposed to tackle the DA problem for seizure prediction. It is the first attempt to reduce the domain disparity between different patients and to test the model under simulated clinical sampling conditions.

  • Ensemble learning is introduced into this model by developing a hierarchical vote collective. Such a framework can improve the predictive performance and generalization ability due to the combination of multiple DA subnetworks.

  • This study is the first to evaluate the applicability of different DA algorithms for seizure prediction, which is crucial for follow-up studies.

Based on DA techniques and ensemble learning, the proposed model provides above-par disturbance rejection, making it more robust and practical for clinical application. Experiments on two public databases, the Freiburg Hospital EEG database and the CHB-MIT EEG database [34, 35], are conducted for model evaluation. Results indicate that HIVE-CODA achieves better domain adaptability than other state-of-the-art baselines.

2 Data Acquisition and Preprocessing

2.1 Patients

Two public EEG datasets, the Freiburg Hospital Intracranial EEG database [34] and the CHB-MIT scalp EEG database [35], are adopted to evaluate the generalization capability of HIVE-CODAs. The Freiburg Hospital EEG database includes time series of 87 seizures from 21 people with medically intractable focal epilepsy, ranging from 10 to 50 years old (8 male and 13 female patients). EEG signals are recorded invasively with six electrodes (3 near the epileptic focus and the other three distal to the epileptogenic zone). The sampling rate for all patients is 256 Hz (data of Patient No. 12 are sampled at 512 Hz but are down-sampled to 256 Hz).

The CHB-MIT database consists of scalp EEG sequences of 22 epileptic subjects, including five male patients ranging from 3 to 22 years and 17 female patients from 1.5 to 19 years. The EEG signals are recorded at a 256 Hz sampling rate with 16-bit analog-to-digital converters. Most samples are acquired from surface electrodes of 23 channels following the 10–20 standard system for electrodes placement. Each patient has a subfolder that contains 9 to 42 recordings.

2.2 Data Selection and Labeling

Power line noise removal is implemented to denoise the data. We discarded the frequency bands of 47–53 and 97–103 Hz in the intracranial EEG set and the bands of 57–63 and 117–123 Hz in the scalp EEG set, because power-line noise commonly appears at 50 Hz (and its harmonic) in the Freiburg database and at 60 Hz in the CHB-MIT database. Moreover, subject selection is performed: only patients with at least two seizures but fewer than 15 seizures per day are used for prediction, since fewer than two seizures are insufficient to support training, and more than 15 seizures per day make forecasting pointless. The chosen subjects are listed in Tables 1, 2.
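As a concrete illustration of the band-removal step, the sketch below zeroes FFT bins inside the stated power-line bands. The paper does not specify its filter design, so this spectral-masking approach, the function name, and the test signal are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def remove_bands(signal, fs, bands):
    """Zero out the FFT bins that fall inside the given frequency bands (Hz).

    A crude spectral-masking stand-in for the power-line denoising step;
    a real pipeline would more likely use a notch/band-stop filter.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    for lo, hi in bands:
        spectrum[(freqs >= lo) & (freqs <= hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

fs = 256                                   # sampling rate of both databases
t = np.arange(fs * 4) / fs                 # 4 s of synthetic signal
# 10 Hz EEG-like rhythm plus 50 Hz mains interference
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
clean = remove_bands(x, fs, bands=[(47, 53), (97, 103)])  # Freiburg-style bands
```

After masking, the 50 Hz component is removed while the 10 Hz rhythm is untouched, which is the behavior the text describes for the 47–53 Hz band.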

TABLE 1

Patient   Gender   Age (years)   Seizure type    No. of seizures
Pt 1      F        15            SP              4
Pt 2      M        38            SP, CP, GTC     3
Pt 3      M        14            SP, CP          5
Pt 4      F        26            SP, CP, GTC     5
Pt 5      F        16            SP, CP, GTC     5
Pt 6      F        31            CP, GTC         3
Pt 8      F        32            SP, CP          2
Pt 9      M        44            CP, GTC         4
Pt 10     M        47            SP, CP, GTC     5
Pt 11     F        10            SP, CP, GTC     4
Pt 12     F        42            SP, CP, GTC     3
Pt 13     F        22            SP, CP, GTC     2
Pt 14     F        41            CP, GTC         4
Pt 15     M        31            SP, CP, GTC     4
Pt 16     F        50            SP, CP, GTC     5
Pt 17     M        28            SP, CP, GTC     5
Pt 18     F        25            SP, CP          5
Pt 19     F        28            SP, CP, GTC     4
Pt 20     M        33            SP, CP, GTC     5
Pt 21     M        13            SP, CP          5

Details of the Freiburg Hospital test set.

F, female; M, male; SP, simple partial; CP, complex partial; and GTC, generalized tonic-clonic.

TABLE 2

Patient   Gender   Age (years)   Seizure type    No. of seizures
Pt 1      F        11            SP, CP          7
Pt 2      M        11            SP, CP, GTC     3
Pt 3      F        14            SP, CP          6
Pt 5      F        7             CP, GTC         5
Pt 6      F        2             CP, GTC         4
Pt 7      F        15            SP, CP, GTC     3
Pt 8      M        4             SP, CP, GTC     5
Pt 9      F        10            CP, GTC         4
Pt 10     M        3             SP, CP, GTC     6
Pt 13     F        3             SP, CP, GTC     5
Pt 14     F        9             CP, GTC         5
Pt 17     F        12            SP, CP, GTC     3
Pt 18     F        18            SP, CP          6
Pt 19     F        19            SP, CP, GTC     3
Pt 20     F        6             SP, CP, GTC     5
Pt 21     F        13            SP, CP          4

Details of the CHB-MIT test set.

F, female; M, male; SP, simple partial; CP, complex partial; and GTC, generalized tonic-clonic.

A prerequisite for seizure prediction is the reliable distinction between preictal and interictal samples. We set the 30 min before seizure onset as the seizure prediction horizon (SPH), following empirical comparisons over multiple preictal lengths, and the seizure occurrence period is set to 0. A seizure should thus occur within 30 min after the predictor returns a positive prediction. The raw EEG recordings are then divided into continuous, non-overlapping fragments with a 5-s time window. The sample number for each subject is sufficient (>7,200) to support training. Note that the number of interictal samples is much larger than that of preictal samples. To remedy this imbalance, the interictal signals are randomly subsampled so that the quantities of preictal and interictal training samples are equal.
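The windowing, SPH labeling, and interictal subsampling described above can be sketched as follows. The function names and the exact boundary handling are assumptions for illustration, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
fs, win_s, sph_min = 256, 5, 30  # sampling rate (Hz), window length (s), SPH (min)

def label_windows(n_samples, onset_idx):
    """Split a recording into non-overlapping 5-s windows and label each
    window preictal (1) if it ends within the 30-min SPH before seizure
    onset, interictal (0) otherwise; windows reaching past onset are
    dropped, mirroring the discarded ictal/postictal data."""
    win = fs * win_s
    labels = []
    for start in range(0, n_samples - win + 1, win):
        end = start + win
        if end > onset_idx:
            break  # ictal/postictal data are discarded
        labels.append(1 if end > onset_idx - sph_min * 60 * fs else 0)
    return np.array(labels)

def balance(labels):
    """Randomly subsample interictal windows to match the preictal count."""
    pre = np.flatnonzero(labels == 1)
    inter = np.flatnonzero(labels == 0)
    keep = rng.choice(inter, size=len(pre), replace=False)
    return np.sort(np.concatenate([pre, keep]))

onset = 2 * 3600 * fs                       # seizure 2 h into the recording
labels = label_windows(onset + 10 * fs, onset)
idx = balance(labels)                       # balanced training indices
```

For a 2-hour lead-up, this yields 1,440 windows of which 360 (the last 30 min) are preictal, and the balanced index set contains equal preictal and interictal counts.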

3 Methods

To learn a domain-invariant representation, we propose a generic seizure prediction model: the hierarchical vote collective of DA subnetworks (HIVE-CODAs). HIVE-CODA is an ensemble that combines seven DA modules over multi-modality data. Each subnetwork is assigned a weight via a probabilistic voting scheme to balance its contribution. By analyzing the most contributive DA component and its feature space, we provide a preliminary conclusion about which information generalizes across individuals during the preictal period.

3.1 Clinical Situation Simulation

Conventional approaches only provide patient-specific results. Such frameworks may obtain high precision but are inconsistent with how signals are recorded in real life. It is difficult to collect a large number of long-term EEG samples from one specific patient during clinical treatment, so the available sample size cannot support the training process. Therefore, we consider using DA technology to apply data from other subjects to predictor training for a particular subject.

The training and testing strategy is depicted in Figure 3. The training and validation sets consist of existing patient data plus one seizure of the target subject, while the remaining target seizures serve as the testing set. The selection of the training seizure follows the idea of leave-one-out cross-validation (LOOCV) [36]. Moreover, the combined data are partitioned into five folds: 80% of the samples are assigned to the training set, while the remaining 20% are reserved as the validation set to prevent overfitting.
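The split just described can be sketched as below: all source-patient seizures plus one held-out target seizure form the train/validation pool, and the remaining target seizures become the test set. The exact fold assignment is not specified in the paper, so the random 80/20 partition here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def clinical_split(source_seizures, target_seizures, held_out):
    """Mimic the simulated clinical sampling: source seizures plus ONE
    target seizure (index `held_out`) are pooled and split 80/20 into
    train/validation; the remaining target seizures are the test set."""
    pool = list(source_seizures) + [target_seizures[held_out]]
    test = [s for i, s in enumerate(target_seizures) if i != held_out]
    samples = [x for seiz in pool for x in seiz]   # flatten seizures to samples
    perm = rng.permutation(len(samples))
    cut = int(0.8 * len(samples))
    train = [samples[i] for i in perm[:cut]]
    val = [samples[i] for i in perm[cut:]]
    return train, val, test

# Toy data: 4 source seizures and 3 target seizures, 10 samples each
source = [[f"s{i}_{j}" for j in range(10)] for i in range(4)]
target = [[f"t{i}_{j}" for j in range(10)] for i in range(3)]
train, val, test = clinical_split(source, target, held_out=0)
```

Only one target seizure ever enters training; the two remaining target seizures stay untouched for testing, matching the LOOCV-style protocol.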

FIGURE 3

3.2 Modular Hierarchical Structure

HIVE-CODAs include seven constituent modules: subject-invariant domain adaption (SIDA) [37], conditional deep convolutional generative adversarial networks (C-DCGANs) [38], plug-and-play domain adaptation (PPDA) [39], maximum independence domain adaptation (MIDA) [40], maximum mean discrepancy–adversarial autoencoders (MMD-AAEs) [41], model-agnostic learning of semantic features (MASF) [42], and cone manifold domain adaptation (CMDA) [43]. The modular hierarchical structure is depicted in Figure 4.

FIGURE 4

Since few domain adaptation techniques for epileptic EEG have been reported, we applied seven state-of-the-art approaches from related fields to constitute the subnetworks of HIVE-CODAs. Several modules require images as inputs instead of time series, so spectrograms are generated from the EEG segments using the short-time Fourier transform (STFT) [44]. The raw EEG recordings are translated into two-dimensional matrices composed of frequency and time axes. The EEG fragments and their spectrograms are then sent forward to the corresponding modules depending on their modalities.
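The STFT spectrogram step can be sketched with plain numpy as follows; the window length and hop size are assumptions (the paper does not state them), and a production pipeline would more likely call a library routine such as `scipy.signal.stft`.

```python
import numpy as np

def stft_spectrogram(x, fs, win=256, hop=128):
    """Magnitude STFT of a 1-D EEG segment: Hann-windowed frames along
    the time axis, FFT bins along the frequency axis.

    Returns an array of shape (n_freq_bins, n_frames), i.e., the
    two-dimensional frequency-time matrix described in the text."""
    w = np.hanning(win)
    frames = [x[s:s + win] * w for s in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).T

fs = 256
t = np.arange(fs * 5) / fs               # one 5-s EEG window
x = np.sin(2 * np.pi * 12 * t)           # 12 Hz alpha-band-like tone
spec = stft_spectrogram(x, fs)           # (129 freq bins, 9 time frames)
```

With a 1-s window the frequency resolution is 1 Hz per bin, so the 12 Hz tone peaks at bin 12 in every frame.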

3.3 Modules Based on Adversarial Learning

1) MMD-AAE: We developed the MMD-AAE module following the study in reference [41], which aims at assessing the effectiveness of the maximum mean discrepancy (MMD) measure and adversarial autoencoders (AAEs). An MMD-based regularization term aligns the distributions among various subjects. The AAE architecture is applied to learn latent codes that are universal to all domains. The sharable information is captured by matching the aligned distribution to an arbitrary prior distribution. Thus, the MMD-AAE may circumvent overfitting to the source data.

2) SIDA: We also estimated the performance of SIDA on epileptic EEG, which combines power spectral density (PSD) features and adversarial learning [37]. SIDA focuses on extracting representations that are invariant across domains. The sharable information is jointly learned with the task loss $\mathcal{L}_c$ and the subject confusion loss $\mathcal{L}_s$. The training procedure adopts the adversarial strategy, implemented with a gradient reversal layer. Suppose that there are N source samples; the process can be written as follows:

$$\hat{\theta} = \theta - \eta \left( \frac{\partial \mathcal{L}_c}{\partial \theta} - \lambda \frac{\partial \mathcal{L}_s}{\partial \theta} \right), \quad \hat{\gamma} = \gamma - \eta \frac{\partial \mathcal{L}_c}{\partial \gamma}, \quad \hat{\phi} = \phi - \eta \frac{\partial \mathcal{L}_s}{\partial \phi},$$

$$\mathcal{L}_c = \frac{1}{N}\sum_{i=1}^{N} \ell(\hat{c}_i, c_i), \qquad \mathcal{L}_s = \frac{1}{N}\sum_{i=1}^{N} \ell(\hat{s}_i, s_i),$$

where θ, γ, and ϕ represent the network parameters and $\hat{\theta}$, $\hat{\gamma}$, $\hat{\phi}$ are their updated forms, η is the learning rate, and λ is the positive trade-off parameter. $\hat{c}_i$ and $\hat{s}_i$ are the classification task and subject discrimination outputs, and $c_i$, $s_i$ denote the corresponding labels. Note that a specific feature extraction component is assembled in HIVE-CODAs, since the inputs of SIDA are PSD features, in accordance with reference [37].
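The gradient-reversal mechanism behind this update can be illustrated with a deliberately tiny numeric sketch. The scalar parameters and quadratic losses below are toy assumptions, not the paper's network: the point is only that the shared extractor descends the task loss while ascending the subject-confusion loss through the reversed gradient.

```python
import numpy as np

def grl_backward(grad, lam):
    """Gradient reversal layer: identity in the forward pass, multiplies
    the incoming gradient by -lambda in the backward pass."""
    return -lam * grad

# Toy scalar stand-ins: theta = shared extractor, gamma = task head,
# phi = subject discriminator. Losses (arbitrary quadratics):
#   L_c = (theta*gamma - 1)^2,  L_s = (theta*phi - 1)^2
theta, gamma, phi, lam, eta = 1.0, 0.5, 0.5, 0.5, 0.1

dLc_dtheta = 2 * (theta * gamma - 1) * gamma   # hand-derived gradients
dLs_dtheta = 2 * (theta * phi - 1) * phi
dLc_dgamma = 2 * (theta * gamma - 1) * theta
dLs_dphi = 2 * (theta * phi - 1) * theta

# Extractor descends L_c but ASCENDS L_s via the reversed gradient;
# the discriminator descends L_s normally.
theta_new = theta - eta * (dLc_dtheta + grl_backward(dLs_dtheta, lam))
gamma_new = gamma - eta * dLc_dgamma
phi_new = phi - eta * dLs_dphi
```

The sign flip inside `grl_backward` is what lets a single backward pass train the discriminator while pushing the extractor toward subject-confusing features.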

3.4 Modules Based on Data Augmentation

1) C-DCGANs: By introducing C-DCGANs [38], we tested the feasibility of using data augmentation and convolutional neural networks (CNNs) to remedy the domain discrepancy. The main idea of C-DCGANs is to increase generalization capability via artificial EEG data generation. A generative adversarial network (GAN) is exploited to expand the training set, and an end-to-end CNN is employed as the classifier. We note that C-DCGANs also involve adversarial learning due to the application of the GAN. However, it is the generative function of the GAN that is highlighted in HIVE-CODAs, rather than the minimax optimization, so we place the emphasis on assessing data augmentation.

2) MIDA: The MIDA subnetwork is developed to measure the importance of background information and feature augmentation. In the MIDA framework, an inner product space is established in which the feature vectors are maximally independent of the background features in the sense of the Hilbert–Schmidt independence criterion (HSIC) [40]. Feature augmentation is performed by generating latent representations based on background knowledge such as the acquisition time. The original feature vectors are expanded by concatenation with the produced features. Following reference [40], we exploited the domain label (which domain a sample belongs to) as the background information, since neither device labels nor acquisition times are provided in the epileptic EEG databases.
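The two ingredients named above, the HSIC measure and domain-label feature augmentation, can be sketched in a few lines. This is a minimal illustration with linear kernels; MIDA itself solves an eigenproblem to find the HSIC-minimizing projection, which is omitted here.

```python
import numpy as np

def hsic(X, Y):
    """Empirical HSIC with linear kernels: trace(K H L H) / (n-1)^2.
    Values near 0 indicate (linear) independence between X and Y."""
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    K, L = X @ X.T, Y @ Y.T
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def augment(X, domain_ids):
    """MIDA-style feature augmentation (sketch): concatenate a one-hot
    domain indicator to each sample, using the domain label as the
    background information as described in the text."""
    D = np.eye(domain_ids.max() + 1)[domain_ids]
    return np.hstack([X, D])

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))         # toy feature vectors
d = rng.integers(0, 3, size=100)          # which source patient each came from
X_aug = augment(X, d)                     # 4 features + 3 one-hot domain dims
```

A sanity check: HSIC between `X` and one of its own columns is far larger than HSIC between `X` and independent noise, which is the dependence signal MIDA drives toward zero with respect to the background features.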

3.5 Modules Based on Specific Features

1) CMDA: The CMDA module is adopted to evaluate the applicability of manifold methods to epileptic EEG. Following the study in reference [43], the latent feature space shared among the domains is regularized by modeling sharable information on the Riemannian cone manifold. Specifically, covariance matrices $P$ of EEG segments are computed to constitute the manifold. The CMDA module leverages the global Riemannian mean $\bar{P}$ and the local Riemannian means $\bar{P}^{(k)}$ to describe the cross-domain center and the centroid of the set for the kth domain, respectively. By using the parallel transport approach, the projections of the $P^{(k)}$ on the tangent space can describe the invariant features among the source domains as follows:

$$\tilde{S}^{(k)} = \Gamma_{\bar{P}^{(k)} \to \bar{P}}\left(S^{(k)}\right), \qquad S^{(k)} = \mathrm{Log}_{\bar{P}^{(k)}}\left(P^{(k)}\right),$$

where $\tilde{S}^{(k)}$ denotes the generalized features, $\Gamma_{B \to A}$ represents the parallel transport from B to A, and $S^{(k)}$ represents the projection of $P^{(k)}$ on the tangent space via the logarithm map. In general, each domain feature is parallelly transported from $\bar{P}^{(k)}$ to the global centroid $\bar{P}$, and the transported point is embedded in an inner product space so that the generalized features become describable in Euclidean space.
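The log map and parallel transport on the SPD (covariance) manifold can be sketched with eigendecompositions, using the standard closed forms $\mathrm{Log}_B(P) = B^{1/2}\,\mathrm{logm}(B^{-1/2} P B^{-1/2})\,B^{1/2}$ and $\Gamma_{B\to A}(S) = E S E^{\top}$ with $E = (A B^{-1})^{1/2}$. The toy base points below are assumptions for illustration; they are not the paper's estimated Riemannian means.

```python
import numpy as np

def spd_pow(S, p):
    """Matrix power of a symmetric positive-definite matrix via eigh."""
    w, V = np.linalg.eigh(S)
    return (V * w**p) @ V.T

def spd_log(S):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def log_map(P, B):
    """Riemannian log map of SPD point P at base B (tangent projection)."""
    Bh, Bih = spd_pow(B, 0.5), spd_pow(B, -0.5)
    return Bh @ spd_log(Bih @ P @ Bih) @ Bh

def parallel_transport(S, B, A):
    """Transport tangent vector S from base B to base A, computing
    E = (A B^{-1})^{1/2} through the SPD-friendly identity
    E = A^{1/2} (A^{-1/2} B A^{-1/2})^{-1/2} A^{-1/2}."""
    Ah, Aih = spd_pow(A, 0.5), spd_pow(A, -0.5)
    E = Ah @ spd_pow(Aih @ B @ Aih, -0.5) @ Aih
    return E @ S @ E.T

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
P = X.T @ X / 50                 # a covariance matrix (point on the manifold)
B = np.eye(3)                    # toy per-domain Riemannian mean
A = 2 * np.eye(3)                # toy global Riemannian mean
S = log_map(P, B)                # tangent projection at the domain mean
S_t = parallel_transport(S, B, A)  # transported to the global centroid
```

Two properties verify the construction: transporting from a base to itself is the identity, and the log map of the base point at itself is zero.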

2) PPDA: The long short-term memory (LSTM) architecture and a particular learning strategy are evaluated by adding the PPDA module. PPDA divides the latent features into private portions specific to each subject and generalized components shared by all subjects. To leverage both the universal and private feature vectors, PPDA develops a learning procedure comprising a training phase, a calibration phase, and a test phase. Specifically, the LSTM layer is adopted for encoding and decoding.

3) MASF: To assess the applicability of meta-learning and semantic features, the MASF module is employed in HIVE-CODAs. Following the study in reference [42], a model-agnostic learning paradigm is exploited to minimize the domain gap using a global class alignment loss $\mathcal{L}_{global}$ and a local sample clustering loss $\mathcal{L}_{local}$. The knowledge about interclass relationships and about domain-independent class-specific cohesion/separation is captured by $\mathcal{L}_{global}$ and $\mathcal{L}_{local}$, respectively. A meta-step can be written as follows:

$$(\psi', \theta') = (\psi, \theta) - \eta \nabla_{(\psi, \theta)} \mathcal{L}_{task}, \qquad \mathcal{L}_{meta} = \beta_1 \mathcal{L}_{global} + \beta_2 \mathcal{L}_{local},$$

where ψ and θ are the network parameters, η is the learning rate, and β1, β2 denote the weighting coefficients. $\mathcal{L}_{task}$ represents the loss function of the predictive task, and the semantic losses are evaluated at the inner-updated parameters. By introducing both global and local information, the semantic structure of the EEG feature space is regularized explicitly.
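The meta-step structure of MASF, an inner task-loss update followed by evaluating the weighted semantic losses at the adapted parameters, can be illustrated with toy quadratic losses. Everything below (the losses, the coefficient values, the numerical-gradient helper) is an assumption chosen only to make the update concrete.

```python
import numpy as np

def num_grad(f, x, eps=1e-6):
    """Central-difference gradient, to keep the sketch dependency-free."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

# Toy stand-ins for the three losses (arbitrary quadratics):
task = lambda th: np.sum((th - 1.0) ** 2)           # L_task (meta-train)
global_align = lambda th: np.sum((th - 2.0) ** 2)   # L_global (class alignment)
local_cluster = lambda th: np.sum(th ** 2)          # L_local (sample clustering)

eta, beta1, beta2 = 0.1, 0.5, 0.5
theta = np.array([0.0, 0.0])

# MASF-style step: inner gradient update on the task loss, then the
# semantic losses are evaluated at the ADAPTED parameters and combined.
theta_inner = theta - eta * num_grad(task, theta)
meta_loss = beta1 * global_align(theta_inner) + beta2 * local_cluster(theta_inner)
```

In the full method the gradient of `meta_loss` with respect to the original parameters drives the outer update, so the model is explicitly rewarded for task updates that also preserve semantic structure.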

3.6 Weighted Voting Scheme

To evaluate the contribution of each subnetwork, a weighted voting structure is introduced at the end of the network. Assume there are G modules for a classification problem with C classes. For an arbitrary class y = i, we denote by $w_{ij}$ the weight assigned to the jth module, where $w_{ij} \ge 0$ and $\sum_{j=1}^{G} w_{ij} = 1$. Then the collective probability $p_i$ for the ith class is the normalized weighted sum over modules:

$$p_i = \frac{\sum_{j=1}^{G} w_{ij}\, p_{ij}}{\sum_{c=1}^{C} \sum_{j=1}^{G} w_{cj}\, p_{cj}},$$

where $p_{ij}$ is the probability that module j assigns to class i. The prediction result is then

$$\hat{y} = \arg\max_{i} p_i.$$

The applicability of each algorithm to epileptic EEG can be estimated by observing its weight. Besides, a more balanced and intuitive collective can be created as the subnetworks are trained adaptively.
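The voting scheme above can be sketched directly; the toy probabilities and weights below are illustrative (in HIVE-CODAs the weights are learned adaptively).

```python
import numpy as np

def collective_predict(module_probs, W):
    """Weighted probabilistic voting: module_probs has shape (G modules,
    C classes); W[i, j] is the weight class i assigns to module j, with
    each row non-negative and summing to 1. Returns the normalized
    collective class probabilities and the predicted class."""
    G, C = module_probs.shape
    p = np.array([W[i] @ module_probs[:, i] for i in range(C)])
    p = p / p.sum()                      # normalize the weighted sums
    return p, int(np.argmax(p))

# 3 DA modules, 2 classes (interictal vs. preictal), toy numbers
probs = np.array([[0.6, 0.4],
                  [0.2, 0.8],
                  [0.3, 0.7]])
W = np.array([[0.50, 0.25, 0.25],        # per-module weights for class 0
              [0.20, 0.50, 0.30]])       # per-module weights for class 1
p, y_hat = collective_predict(probs, W)  # here the vote favors class 1
```

Because two of the three modules lean preictal and carry most of class 1's weight, the collective prediction is class 1 (preictal).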

4 Results and Discussion

In this section, comparison results and the weight matrix are provided to verify the generalization ability and evaluate the DA algorithms. HIVE-CODA is assessed on both intracranial and scalp EEG. We adopted three common measures for evaluation: sensitivity, false prediction rate per hour (FPR), and area under the receiver operating characteristic curve (AUC).
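The three measures can be computed as follows; the function names are illustrative, and the AUC is obtained via the standard Mann-Whitney formulation rather than any specific library.

```python
import numpy as np

def sensitivity(n_predicted, n_seizures):
    """Fraction of test seizures preceded by at least one alarm in the SPH."""
    return n_predicted / n_seizures

def fpr_per_hour(n_false_alarms, interictal_hours):
    """False alarms raised during interictal time, per hour."""
    return n_false_alarms / interictal_hours

def auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney U statistic: the probability that a random
    preictal window scores higher than a random interictal one (ties
    count one-half)."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    n = pos.size * neg.size
    return float(((pos > neg).sum() + 0.5 * (pos == neg).sum()) / n)
```

For example, predicting 4 of 5 test seizures gives a sensitivity of 0.80, and 3 false alarms over 10 interictal hours give an FPR of 0.30/h, the same scale as the entries in Tables 3, 4.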

4.1 Generalization Ability Analysis

Comparison experiments are conducted to demonstrate the advantages of HIVE-CODAs over conventional methods. Many time/frequency domain–based approaches have been applied to predict upcoming seizures. Two classic deep neural networks, CNN and LSTM, which have achieved success in patient-specific forecasting, are selected as baselines for assessing the generalization ability of our method. We also attempted to find a generic cross-subject algorithm for comparison; however, little existing research considers similarity to the clinical data-acquisition situation, and most use plenty of the “unseen” patient’s samples for training. The implementation details of CNN and LSTM follow references [17] and [22], and the experimental results are listed in Tables 3, 4.

TABLE 3

Source   Target   CNN Sn   CNN FPR (/h)   LSTM Sn   LSTM FPR (/h)   HIVE-CODAs Sn   HIVE-CODAs FPR (/h)
S.C.     Pt 1     0.73     0.24           0.72      0.23            0.83            0.11
S.C.     Pt 2     0.58     0.31           0.66      0.30            0.85            0.09
S.C.     Pt 3     0.67     0.26           0.67      0.25            0.85            0.13
S.C.     Pt 4     0.67     0.27           0.71      0.16            0.84            0.12
S.C.     Pt 5     0.48     0.40           0.45      0.32            0.73            0.18
S.C.     Pt 6     0.75     0.27           0.58      0.38            0.86            0.11
S.C.*    Pt 8     0.55     0.34           0.53      0.29            0.66            0.32
S.C.     Pt 9     0.65     0.19           0.75      0.16            0.83            0.15
S.C.     Pt 10    0.51     0.37           0.58      0.26            0.79            0.24
S.C.     Pt 11    0.69     0.36           0.54      0.21            0.87            0.22
S.C.     Pt 12    0.66     0.18           0.65      0.24            0.84            0.17
S.C.*    Pt 13    0.56     0.31           0.52      0.23            0.68            0.27
S.C.     Pt 14    0.47     0.48           0.57      0.29            0.76            0.23
S.C.     Pt 15    0.66     0.19           0.70      0.17            0.88            0.12
S.C.     Pt 16    0.53     0.37           0.44      0.46            0.79            0.19
S.C.     Pt 17    0.63     0.36           0.42      0.41            0.63            0.28
S.C.     Pt 18    0.72     0.18           0.73      0.20            0.83            0.20
S.C.     Pt 19    0.44     0.29           0.47      0.33            0.75            0.14
S.C.     Pt 20    0.43     0.37           0.46      0.33            0.74            0.26
S.C.     Pt 21    0.64     0.32           0.50      0.28            0.84            0.12
Avg.              0.60     0.31           0.56      0.27            0.80            0.18

Results compared with conventional methods on the Freiburg Hospital database.

S.C., simulated clinical samples; Sn, sensitivity; FPR, false prediction rate; and Avg., average result. Note that S.C.* uses NO samples of the predictor user.

The bold values denote outliers.

TABLE 4

Source   Target   CNN Sn   CNN FPR (/h)   LSTM Sn   LSTM FPR (/h)   HIVE-CODAs Sn   HIVE-CODAs FPR (/h)
S.C.     Pt 1     0.57     0.36           0.55      0.28            0.72            0.16
S.C.     Pt 2     0.48     0.39           0.42      0.27            0.65            0.22
S.C.     Pt 3     0.64     0.30           0.58      0.32            0.75            0.27
S.C.     Pt 5     0.54     0.42           0.45      0.41            0.76            0.23
S.C.     Pt 6     0.66     0.31           0.62      0.32            0.72            0.15
S.C.     Pt 7     0.65     0.25           0.55      0.25            0.83            0.28
S.C.     Pt 8     0.64     0.28           0.57      0.20            0.74            0.20
S.C.     Pt 9     0.48     0.35           0.45      0.26            0.63            0.31
S.C.     Pt 10    0.47     0.32           0.51      0.24            0.62            0.34
S.C.     Pt 13    0.55     0.22           0.50      0.23            0.73            0.26
S.C.     Pt 14    0.55     0.42           0.44      0.32            0.72            0.35
S.C.     Pt 17    0.47     0.40           0.42      0.42            0.62            0.28
S.C.     Pt 18    0.58     0.32           0.45      0.34            0.73            0.29
S.C.     Pt 19    0.59     0.22           0.53      0.21            0.79            0.17
S.C.     Pt 20    0.60     0.25           0.57      0.27            0.77            0.22
S.C.     Pt 21    0.62     0.28           0.62      0.27            0.77            0.15
Avg.              0.53     0.32           0.51      0.29            0.72            0.24

Results compared with conventional methods on the CHB-MIT database.

S.C., simulated clinical samples; Sn, sensitivity; FPR, false prediction rate; and Avg., average result.

The bold values denote outliers.

The experiment on intracranial EEG is performed on the widely used Freiburg Hospital database. Table 3 illustrates that HIVE-CODAs achieve a sensitivity of 80% and an FPR of 0.18/h on average, outperforming the other forecast models. For the outlier-like Pt 17, HIVE-CODAs do not produce the desired effectiveness, which might be caused by a larger domain gap in the sample space.

Evidently, the performances of all these prediction approaches decline significantly compared with the patient-specific results in the literature. This is reasonable, since in prior studies the training and testing samples are collected from the same subject, with little consideration of generalization ability. Conversely, our method is trained on the existing database plus a small amount of the “unseen” patient’s data, which is more consistent with real clinical situations. Though the precision is not relatively high, the model performance is sufficient for the daily needs of patients, as it approximates the first-in-man trial [5].

For scalp EEG, the experiment is conducted on the public CHB-MIT database, produced by the Massachusetts Institute of Technology. As shown in Table 4, HIVE-CODAs achieve a sensitivity of 72% and an FPR of 0.24/h on average. Since the conventional algorithms consider little about the domain shift among patients, HIVE-CODAs exhibit obvious advantages over the other prediction models. Still, for several outliers such as Pt 2, Pt 9, Pt 10, and Pt 17, the sensitivity of our approach is only slightly higher than the lower bound of a random binary classifier. HIVE-CODA is a variant of deep learning models; as such, it carries the uncertainties associated with deep neural networks, in particular a lack of formal convergence guarantees.

Furthermore, experiments comparing against individual DA algorithms are conducted; the AUC results are listed in Tables 5, 6. For the Freiburg Hospital database, the results indicate that HIVE-CODAs achieve higher generalization ability than the conventional algorithms. This also testifies to the potential of integrating DA modules for processing epileptic EEG. Specifically, the interindividual variability can be alleviated, and existing forecast systems can be transferred to the clinic thanks to these emerging DA technologies.

TABLE 5

Source   Target   MIDA   MASF   PPDA   MMD-AAE   SAN    C-DCGANs   SIDA   CMDA   HIVE-CODAs
S.C.     Pt 1     0.59   0.63   0.74   0.78      0.78   0.80       0.81   0.80   0.86
S.C.     Pt 2     0.56   0.62   0.71   0.77      0.78   0.77       0.83   0.82   0.86
S.C.     Pt 3     0.55   0.62   0.67   0.74      0.75   0.80       0.77   0.83   0.84
S.C.     Pt 4     0.52   0.57   0.56   0.57      0.61   0.62       0.62   0.64   0.75
S.C.     Pt 5     0.60   0.63   0.72   0.79      0.79   0.80       0.80   0.82   0.85
S.C.     Pt 6     0.54   0.53   0.55   0.65      0.68   0.66       0.73   0.74   0.79
S.C.*    Pt 8     0.48   0.51   0.55   0.54      0.56   0.57       0.57   0.61   0.68
S.C.     Pt 9     0.50   0.55   0.61   0.69      0.68   0.70       0.70   0.73   0.80
S.C.     Pt 10    0.53   0.63   0.60   0.69      0.68   0.69       0.70   0.74   0.77
S.C.     Pt 11    0.62   0.65   0.70   0.77      0.81   0.73       0.83   0.82   0.87
S.C.     Pt 12    0.63   0.63   0.70   0.68      0.71   0.75       0.78   0.77   0.85
S.C.*    Pt 13    0.46   0.52   0.60   0.64      0.64   0.68       0.67   0.68   0.74
S.C.     Pt 14    0.48   0.55   0.63   0.65      0.66   0.69       0.68   0.71   0.78
S.C.     Pt 15    0.59   0.69   0.69   0.73      0.75   0.78       0.77   0.82   0.86
S.C.     Pt 16    0.45   0.48   0.58   0.62      0.67   0.69       0.67   0.70   0.76
S.C.     Pt 17    0.46   0.48   0.52   0.54      0.54   0.55       0.55   0.56   0.68
S.C.     Pt 18    0.61   0.64   0.68   0.75      0.77   0.79       0.78   0.81   0.86
S.C.     Pt 19    0.46   0.47   0.53   0.54      0.53   0.55       0.58   0.60   0.69
S.C.     Pt 20    0.50   0.55   0.60   0.67      0.69   0.68       0.71   0.70   0.78
S.C.     Pt 21    0.52   0.58   0.60   0.62      0.66   0.55       0.69   0.72   0.83
Avg.              0.53   0.58   0.63   0.64      0.65   0.69       0.71   0.73   0.80

Results compared with DA methods on the Freiburg Hospital database.

S.C., simulated clinical samples. Note that S.C.* uses NO samples of the predictor user.

TABLE 6

Source   Target   MIDA   MASF   PPDA   MMD-AAE   SAN    C-DCGANs   SIDA   CMDA   HIVE-CODAs
S.C.     Pt 1     0.60   0.64   0.65   0.73      0.75   0.77       0.77   0.80   0.86
S.C.     Pt 2     0.48   0.52   0.46   0.64      0.66   0.70       0.69   0.82   0.86
S.C.     Pt 3     0.54   0.53   0.59   0.65      0.67   0.68       0.71   0.83   0.84
S.C.     Pt 5     0.52   0.56   0.62   0.70      0.75   0.74       0.74   0.82   0.85
S.C.     Pt 6     0.52   0.55   0.61   0.66      0.74   0.75       0.73   0.74   0.79
S.C.     Pt 7     0.59   0.61   0.64   0.70      0.72   0.75       0.76   0.74   0.79
S.C.     Pt 8     0.51   0.58   0.61   0.67      0.70   0.71       0.71   0.61   0.68
S.C.     Pt 9     0.47   0.49   0.52   0.58      0.60   0.61       0.64   0.73   0.80
S.C.     Pt 10    0.46   0.50   0.49   0.51      0.55   0.54       0.60   0.74   0.77
S.C.     Pt 13    0.49   0.53   0.46   0.58      0.62   0.63       0.62   0.68   0.74
S.C.     Pt 14    0.47   0.53   0.55   0.63      0.66   0.68       0.70   0.71   0.78
S.C.     Pt 17    0.51   0.54   0.52   0.61      0.63   0.62       0.64   0.56   0.68
S.C.     Pt 18    0.50   0.51   0.53   0.58      0.61   0.62       0.66   0.81   0.86
S.C.     Pt 19    0.51   0.53   0.56   0.63      0.66   0.66       0.69   0.60   0.69
S.C.     Pt 20    0.55   0.56   0.59   0.65      0.68   0.72       0.74   0.70   0.78
S.C.     Pt 21    0.51   0.53   0.60   0.66      0.71   0.73       0.77   0.72   0.83
Avg.              0.51   0.54   0.56   0.64      0.66   0.67       0.69   0.70   0.74

Results compared with DA methods on the CHB-MIT database.

S.C., simulated clinical samples.

For the CHB-MIT database, the conventional studies show lower performance by a clear margin compared with their patient-specific results, which is consistent with the experiment on intracranial EEG. Moreover, all model performances drop to varying degrees compared with the precisions on the Freiburg test set. This might be explained by the nature of the recordings: intracranial EEG has high spatial resolution and SNR, whereas artifacts are typically seen in scalp EEG [46, 47]. The smaller drop of HIVE-CODAs also suggests an advantage of ensemble learning in analyzing low-spatial-resolution recordings, that is, the scalp EEG signals, owing to the diverse inner patterns of the collective structure. This result further illustrates that HIVE-CODAs are well suited to processing complex time series.

4.2 Module Performance Analysis

As few studies evaluate the applicability of different DA algorithms for seizure prediction, this study provides an analysis based on the adaptively trained weight matrices. HIVE-CODAs introduce several successful machine learning models from related fields and assess their performance. The subnetworks are evaluated via a statistical analysis of the weighted voting layer. The weight distributions are presented in Figure 5: the greater the normalized weight, the greater the contribution of the corresponding DA module, and DA methods with high contributions are considered to have larger potential. This study also tests the predictive precision of each module running alone (with the other modules’ weights set to 0). The AUC results are illustrated in Figure 6. A detailed discussion of these DA techniques follows.

FIGURE 5

FIGURE 6

1) CMDA: CMDA relies on Riemannian manifold–based features to capture the characteristic scale of the neuronal events, which was proposed for motor imagery. As shown in Figure 5, CMDA surpasses the other approaches on both intracranial and scalp EEG datasets. We conjectured that the inner pattern of EEG sequences may obey a compact distribution in the embedding space, such that the manifold-based methods that capture continuous subspace might be applicable to such task. The experimental result indicates that the analytic Riemannian manifold can potentially be used to develop a robust seizure predictor.

2) SIDA: The SIDA module is an adversarial neural network from the area of emotion recognition. It uses EEG spectra as input to learn a new representation by minimizing the recognition loss while maximizing subject confusion. SIDA makes a relatively large contribution compared with the other modules. This might be due to its combination of a CNN and a generative adversarial network (GAN), which have been successfully exploited to extract invariant latent features. The weight of the SIDA module may suggest a beneficial effect of adversarial learning on generalization ability, since SIDA exploits the GAN architecture. However, this conjecture needs further verification, as the SIDA module adopts power spectral density (PSD) features as inputs.

3) C-DCGANs: C-DCGANs use conditional GANs to generate EEG artificially; the model was developed for detecting a subject's movement intention (MI). The performance of this data augmentation–based module falls short of the specific-feature and adversarial learning–based subnetworks. The degradation might be caused by the limitations of EEG data augmentation: artificially generated data usually involve more artifacts [48] that may contaminate the EEG. Still, C-DCGANs provide decent accuracy, which suggests that data augmentation retains potential for developing a generic seizure forecast model.
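A full C-DCGAN is beyond a short sketch, but the role of augmentation (enlarging the training pool with surrogate windows) can be illustrated with simple jitter-and-scale surrogates; this is a stand-in for generative augmentation, and the noise and scale parameters are illustrative only:

```python
import numpy as np

def augment_eeg(windows, n_copies=2, noise_sd=0.05, seed=0):
    """Append surrogate copies of EEG windows: additive Gaussian jitter
    plus a random per-channel amplitude scaling.

    windows: (n_windows, n_channels, n_samples) array.
    Returns the original windows concatenated with n_copies surrogates.
    """
    rng = np.random.default_rng(seed)
    out = [windows]
    for _ in range(n_copies):
        noise = rng.normal(0.0, noise_sd, windows.shape)
        scale = rng.uniform(0.9, 1.1, (windows.shape[0], windows.shape[1], 1))
        out.append(windows * scale + noise)
    return np.concatenate(out, axis=0)

x = np.zeros((10, 4, 256))   # 10 windows, 4 channels, 256 samples
print(augment_eeg(x).shape)  # (30, 4, 256)
```

A generative model replaces the jitter step with samples drawn from a learned conditional distribution, which is also where the artifact risk noted above arises.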

4) MMD-AAE: By matching the aggregated posterior with a prior distribution, the MMD-AAE module extracts cross-domain features with adversarial learning. This scheme was originally used for image recognition. On both intracranial and scalp EEG datasets, MMD-AAE outperforms the MIDA, MASF, and PPDA modules and falls only slightly behind the C-DCGAN module. This above-par performance partly verifies the conjecture in 2) about the superiority of adversarial learning. The superiority may derive from the variational inference process of MMD-AAE, which alleviates overfitting to the source domains.
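The maximum mean discrepancy (MMD) that gives MMD-AAE its name can be estimated with a kernel two-sample statistic. A minimal numpy sketch follows (RBF kernel, biased estimator, bandwidth chosen arbitrarily):

```python
import numpy as np

def mmd_rbf(x, y, gamma=1.0):
    """Squared Maximum Mean Discrepancy with an RBF kernel: a kernel
    two-sample statistic measuring the distance between the feature
    distributions of x and y (biased estimator)."""
    def k(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

rng = np.random.default_rng(0)
src = rng.standard_normal((100, 3))        # "source-domain" features
tgt_same = rng.standard_normal((100, 3))   # same distribution
tgt_shift = rng.standard_normal((100, 3)) + 2.0  # shifted distribution
print(mmd_rbf(src, tgt_same) < mmd_rbf(src, tgt_shift))  # True
```

In MMD-AAE this statistic is minimized between the latent codes of different domains, driving the encoder toward a domain-invariant representation.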

5) PPDA: PPDA is a technique applied to EEG-based emotion recognition. It uses an LSTM-based encoder to decompose features into general characteristics applicable to all individuals and personalized characteristics. Dividing raw EEG data into subject-specific and generalized information is a commonly adopted strategy for domain adaptation. Unexpectedly, however, PPDA displays subpar performance for seizure prediction, even though feature decomposition and the adoption of LSTM seem reasonable for this task. Given the few reports of relevant models, it cannot be concluded definitively that decomposed features and LSTM are unsuitable for epileptic signals.

6) MASF: MASF exploits semantic features and gradient-based meta-learning to establish a model-agnostic learning paradigm; successful applications have been reported in image processing. Notably, the performance of this semantic feature–based method is unsatisfactory here. We conjecture that the discriminant hyperplane in the feature space may be too complex to be captured by explicit semantic features. Moreover, the limitation of the initial neural architecture for meta-learning might also constrain the search space.

7) MIDA: MIDA was originally applied in the emotion recognition field. It aims to reduce differences in domain distributions by learning a subspace with maximum independence from the domain features. Figure 5 indicates that all the other DA methods outperform the MIDA module. This result was expected, given the limited background information available for epileptic data; such background-specific features are evidently not valid characteristics here.

Based on these results, we observe that adversarial learning and manifold-based features may achieve good performance in epilepsy prediction. In addition, CNN architectures and PSD features may also have potential for processing epileptic signals. In the domain generalization field, the CNN has gradually become one of the most popular algorithms, which further echoes the conjecture about CNNs in this experiment. Note that module performance may vary in some special cases, since several outliers (in the Freiburg dataset, Pt 11 and 21 for the C-DCGAN module; in the CHB-MIT dataset, Pt 2 and 13 for the PPDA module, Pt 21 for the C-DCGAN module, and Pt 2 for the CMDA module) have been observed.

4.3 Model Applicability Analysis

Here, we summarize the universal characteristics and architectures based on the observations of the DA algorithms in Section 4.2. The weight vectors of the three types of DA methods (specific features, data augmentation, and adversarial learning) are quantified in a statistical analysis, depicted in Figure 7.

FIGURE 7

As shown in Figure 7, the adversarial learning–based approaches exhibit obvious advantages over the other DA methods on both intracranial and scalp EEG. Meanwhile, the weight distribution indicates that the performance of the specific-feature and data augmentation methods is volatile. Comparing adversarial learning with specific features, the number of above-par weights is about 64% higher; comparing with data augmentation, a further 61% benefit is obtained, for a total margin of about 125% over the data augmentation–based methods. These observations give us confidence in the efficacy of adversarial learning for processing epileptic signals, and we conjecture that data augmentation is relatively inferior for alleviating individual variability.

In particular, the manifold feature of CMDA surpasses all the other methods, so its effectiveness warrants further demonstration. The statistical significance of the manifold feature for discriminating preictal and interictal stages across different patients is therefore assessed. The two-sample Kolmogorov–Smirnov test [49] at a 5% significance level (p < 0.05) is used in the evaluation.
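Assuming SciPy is available, the two-sample KS test can be applied to per-window feature values as follows; the data here are synthetic stand-ins for the preictal and interictal feature distributions, not the study's measurements:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Hypothetical per-window manifold feature values for the two stages
interictal = rng.normal(0.0, 1.0, 300)
preictal = rng.normal(0.6, 1.0, 300)  # shifted mean for illustration

# Null hypothesis: both samples come from the same distribution
stat, p = ks_2samp(interictal, preictal)
print(p < 0.05)  # significant at the 5% level for this synthetic shift
```

A patient whose preictal and interictal feature samples yield p < 0.05 is counted as adequately discriminated, which is the criterion applied in Tables 7 and 8.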

The significance analysis for each patient is provided in Tables 7, 8, with unqualified values marked. For the manifold feature, 17 of 20 subjects in the Freiburg dataset and 13 of 16 subjects in the CHB-MIT dataset show adequate discriminative ability. According to this observation, manifold-based methods might be promising techniques for developing a robust seizure predictor.
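The reported counts follow directly from thresholding the tabulated p values at 0.05:

```python
# p values copied from Tables 7 and 8 (manifold feature, KS test)
freiburg = [0.032, 0.041, 0.034, 0.034, 0.052, 0.044, 0.037, 0.032,
            0.038, 0.028, 0.032, 0.032, 0.034, 0.014, 0.045, 0.062,
            0.025, 0.055, 0.047, 0.041]
chbmit = [0.046, 0.068, 0.038, 0.055, 0.028, 0.033, 0.032, 0.056,
          0.048, 0.040, 0.044, 0.038, 0.035, 0.025, 0.028, 0.028]

# Count patients whose preictal/interictal distributions differ at 5%
passed = lambda ps: sum(p < 0.05 for p in ps)
print(passed(freiburg), len(freiburg))  # 17 20
print(passed(chbmit), len(chbmit))      # 13 16
```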

TABLE 7

Patient    p value
Pt 1       0.032
Pt 2       0.041
Pt 3       0.034
Pt 4       0.034
Pt 5       0.052*
Pt 6       0.044
Pt 8       0.037
Pt 9       0.032
Pt 10      0.038
Pt 11      0.028
Pt 12      0.032
Pt 13      0.032
Pt 14      0.034
Pt 15      0.014
Pt 16      0.045
Pt 17      0.062*
Pt 18      0.025
Pt 19      0.055*
Pt 20      0.047
Pt 21      0.041

p values on the Freiburg Hospital dataset. Values marked with an asterisk denote outliers (p > 0.05).

TABLE 8

Patient    p value
Pt 1       0.046
Pt 2       0.068*
Pt 3       0.038
Pt 5       0.055*
Pt 6       0.028
Pt 7       0.033
Pt 8       0.032
Pt 9       0.056*
Pt 10      0.048
Pt 13      0.040
Pt 14      0.044
Pt 17      0.038
Pt 18      0.035
Pt 19      0.025
Pt 20      0.028
Pt 21      0.028

p values on the CHB-MIT dataset. Values marked with an asterisk denote outliers (p > 0.05).

5 Conclusion

This study proposes a universal approach to alleviate the problem of individual variability in epileptic seizure prediction. By combining DA and ensemble learning techniques, the proposed HIVE-CODAs model mitigates the effects of epileptic individual variance and increases generalization ability. In addition, a simulated clinical sampling scenario is adopted during the training and testing periods, the first attempt to adopt this evaluation strategy. Compared with the patient-specific scheme in conventional studies, such an assessment model is relatively demanding and challenging. Nonetheless, HIVE-CODAs achieve high domain-shift robustness and precision, which demonstrates their feasibility for real-world applications.

By analyzing the contribution of each module, the experimental results also demonstrate the effectiveness of adversarial learning and manifolds in epileptic seizure prediction. The underlying causes of this phenomenon remain unclear, as the existing literature offers no definitive explanation of the dynamics of epilepsy. However, the success of the manifold module in this experiment brings new inspiration: we speculate that the mapping of EEG in high-dimensional space may follow a compact distribution, so kernel-based methods for searching hyperplanes may have potential in this task. The search for more powerful DA algorithms, and for the underlying reasons, will be part of our future research toward higher performance.

Statements

Data availability statement

Publicly available datasets were analyzed in this study. These data can be found here: https://epilepsy.uni-freiburg.de/freiburg-seizure-prediction-project/eeg-database.

Ethics statement

Written informed consent was obtained from the individual(s) and minor(s)’ legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.

Author contributions

PP contributed to conception and design of the study, analysis and/or interpretation of data, and drafting the manuscript.

Conflict of interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Fisher RS, Boas WVE, Blume W, Elger C, Genton P, Lee P, et al. Epileptic Seizures and Epilepsy: Definitions Proposed by the International League against Epilepsy (ILAE) and the International Bureau for Epilepsy (IBE). Epilepsia (2005) 46(4):470–2. doi: 10.1111/j.0013-9580.2005.66104.x

2. Banerjee PN, Filippi D, Allen Hauser W. The Descriptive Epidemiology of Epilepsy-A Review. Epilepsy Res (2009) 85(1):31–45. doi: 10.1016/j.eplepsyres.2009.03.003

3. Kwan P, Schachter SC, Brodie MJ. Drug-resistant Epilepsy. N Engl J Med (2011) 365(10):919–26. doi: 10.1056/nejmra1004418

4. Lin L-C, Ouyang C-S, Chiang C-T, Yang R-C, Wu R-C, Wu H-C. Early Prediction of Medication Refractoriness in Children with Idiopathic Epilepsy Based on Scalp EEG Analysis. Int J Neur Syst (2014) 24(07):1450023. doi: 10.1142/s0129065714500233

5. Cook MJ, O'Brien TJ, Berkovic SF, Murphy M, Morokoff A, Fabinyi G, et al. Prediction of Seizure Likelihood with a Long-Term, Implanted Seizure Advisory System in Patients with Drug-Resistant Epilepsy: a First-In-Man Study. Lancet Neurol (2013) 12(6):563–71. doi: 10.1016/s1474-4422(13)70075-9

6. Li S, Zhou W, Yuan Q, Liu Y. Seizure Prediction Using Spike Rate of Intracranial EEG. IEEE Trans Neural Syst Rehabil Eng (2013) 21(6):880–6. doi: 10.1109/tnsre.2013.2282153

7. Karoly PJ, Freestone DR, Boston R, Grayden DB, Himes D, Leyde K, et al. Interictal Spikes and Epileptic Seizures: Their Relationship and Underlying Rhythmicity. Brain (2016) 139(4):1066–78. doi: 10.1093/brain/aww019

8. Guo L, Wang Z, Cabrerizo M, Adjouadi M. A Cross-Correlated Delay Shift Supervised Learning Method for Spiking Neurons with Application to Interictal Spike Detection in Epilepsy. Int J Neur Syst (2017) 27(03):1750002. doi: 10.1142/s0129065717500022

9. Shahidi Zandi A, Tafreshi R, Javidan M, Dumont GA. Predicting Epileptic Seizures in Scalp EEG Based on a Variational Bayesian Gaussian Mixture Model of Zero-Crossing Intervals. IEEE Trans Biomed Eng (2013) 60(5):1401–13. doi: 10.1109/tbme.2012.2237399

10. Myers MH, Padmanabha A, Hossain G, de Jongh Curry AL, Blaha CD. Seizure Prediction and Detection via Phase and Amplitude Lock Values. Front Hum Neurosci (2016) 10:80. doi: 10.3389/fnhum.2016.00080

11. Mirowski P, Madhavan D, LeCun Y, Kuzniecky R. Classification of Patterns of EEG Synchronization for Seizure Prediction. Clin Neurophysiol (2009) 120(11):1927–40. doi: 10.1016/j.clinph.2009.09.002

12. Direito B, Teixeira CA, Sales F, Castelo-Branco M, Dourado A. A Realistic Seizure Prediction Study Based on Multiclass SVM. Int J Neur Syst (2017) 27(03):1750006. doi: 10.1142/s012906571750006x

13. Sun C, Cui H, Zhou W, Nie W, Wang X, Yuan Q. Epileptic Seizure Detection with EEG Textural Features and Imbalanced Classification Based on EasyEnsemble Learning. Int J Neur Syst (2019) 29(10):1950021. doi: 10.1142/s0129065719500217

14. Brinkmann BH, Wagenaar J, Abbot D, Adkins P, Bosshard SC, Chen M, et al. Crowdsourcing Reproducible Seizure Forecasting in Human and Canine Epilepsy. Brain (2016) 139(6):1713–22. doi: 10.1093/brain/aww045

15. Zhang T, Chen W, Li M. Fuzzy Distribution Entropy and its Application in Automated Seizure Detection Technique. Biomed Signal Process Control (2018) 39:360–77. doi: 10.1016/j.bspc.2017.08.013

16. Peng P, Xie L, Wei H. A Deep Fourier Neural Network for Seizure Prediction Using Convolutional Neural Network and Ratios of Spectral Power. Int J Neur Syst (2021) 31:2150022. doi: 10.1142/s0129065721500222

17. Zhang Y, Guo Y, Yang P, Chen W, Lo B. Epilepsy Seizure Prediction on EEG Using Common Spatial Pattern and Convolutional Neural Network. IEEE J Biomed Health Inform (2020) 24(2):465–74. doi: 10.1109/JBHI.2019.2933046

18. Liu G, Zhou W, Geng M. Automatic Seizure Detection Based on S-Transform and Deep Convolutional Neural Network. Int J Neur Syst (2020) 30(04):1950024. doi: 10.1142/s0129065719500242

19. Lin L-C, Ouyang C-S, Wu R-C, Yang R-C, Chiang C-T. Alternative Diagnosis of Epilepsy in Children without Epileptiform Discharges Using Deep Convolutional Neural Networks. Int J Neur Syst (2020) 30(05):1850060. doi: 10.1142/s0129065718500600

20. Ozcan AR, Erturk S. Seizure Prediction in Scalp EEG Using 3D Convolutional Neural Networks with an Image-Based Approach. IEEE Trans Neural Syst Rehabil Eng (2019) 27(11):2284–93. doi: 10.1109/tnsre.2019.2943707

21. Daoud H, Bayoumi MA. Efficient Epileptic Seizure Prediction Based on Deep Learning. IEEE Trans Biomed Circuits Syst (2019) 13(5):804–13. doi: 10.1109/tbcas.2019.2929053

22. Tsiouris KM, Pezoulas VC, Zervakis M, Konitsiotis S, Koutsouris DD, Fotiadis DI. A Long Short-Term Memory Deep Learning Network for the Prediction of Epileptic Seizures Using EEG Signals. Comput Biol Med (2018) 99:24–37. doi: 10.1016/j.compbiomed.2018.05.019

23. Li Y, Yu Z, Chen Y, Yang C, Li Y, Allen Li X, et al. Automatic Seizure Detection Using Fully Convolutional Nested LSTM. Int J Neural Syst (2020) 30(4):2050019. doi: 10.1142/S0129065720500197

24. Özcan AR, Ertürk S. Epileptic Seizure Prediction with Recurrent Convolutional Neural Networks. In: Signal Processing and Communications Applications Conference (2017). p. 1–4.

25. Peng P, Wei H, Xie L, Song Y. Epileptic Seizure Prediction in Scalp EEG Using an Improved HIVE-COTE Model. In: Chinese Control Conference. IEEE (2020). p. 6450–7. doi: 10.23919/ccc50068.2020.9188930

26. Jirsa VK, Proix T, Perdikis D, Woodman MM, Wang H, Gonzalez-Martinez J, et al. The Virtual Epileptic Patient: Individualized Whole-Brain Models of Epilepsy Spread. Neuroimage (2017) 145:377–88. doi: 10.1016/j.neuroimage.2016.04.049

27. Kuhlmann L, Lehnertz K, Richardson MP, Schelter B, Zaveri HP. Seizure Prediction - Ready for a New Era. Nat Rev Neurol (2018) 14(10):618–30. doi: 10.1038/s41582-018-0055-2

28. Elger CE, Hoppe C. Diagnostic Challenges in Epilepsy: Seizure Under-reporting and Seizure Detection. Lancet Neurol (2018) 17(3):279–88. doi: 10.1016/s1474-4422(18)30038-3

29. Samek W, Meinecke FC, Muller K-R. Transferring Subspaces between Subjects in Brain-Computer Interfacing. IEEE Trans Biomed Eng (2013) 60(8):2289–98. doi: 10.1109/tbme.2013.2253608

30. Zhang L, Wang X, Yang D, Sanford T, Harmon S, Turkbey B, et al. Generalizing Deep Learning for Medical Image Segmentation to Unseen Domains via Deep Stacked Transformation. IEEE Trans Med Imaging (2020) 39(7):2531–40. doi: 10.1109/tmi.2020.2973595

31. Pan SJ, Yang Q. A Survey on Transfer Learning. IEEE Trans Knowledge Data Eng (2009) 22(10):1345–59.

32. Long M, Cao Z, Wang J, Jordan MI. Conditional Adversarial Domain Adaptation. In: Advances in Neural Information Processing Systems (2018).

33. Combes RTd., Zhao H, Wang Y-X, Gordon G. Domain Adaptation with Conditional Distribution Matching and Generalized Label Shift. In: Advances in Neural Information Processing Systems (2020).

34. Zhou M, Tian C, Cao R, Wang B, Niu Y, Hu T, et al. Epileptic Seizure Detection Based on EEG Signals and CNN. Front Neuroinform (2018) 12:95. doi: 10.3389/fninf.2018.00095

35. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, et al. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation (2000) 101(23):e215–20. doi: 10.1161/01.cir.101.23.e215

36. Peng L-H, Yin J, Zhou L, Liu M-X, Zhao Y. Human Microbe-Disease Association Prediction Based on Adaptive Boosting. Front Microbiol (2018) 9:2440. doi: 10.3389/fmicb.2018.02440

37. Rayatdoost S, Yin Y, Rudrauf D, Soleymani M. Subject-invariant EEG Representation Learning for Emotion Recognition. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2021). p. 3955–9. doi: 10.1109/icassp39728.2021.9414496

38. Zhang W, Yan F, Han F, He R, Li E, Wu Z, et al. Auto Recognition of Solar Radio Bursts Using the C-DCGAN Method. Front Phys (2021) 9:646556. doi: 10.3389/fphy.2021.646556

39. Zhao L-M, Yan X, Lyu B. Plug-and-play Domain Adaptation for Cross-Subject EEG-Based Emotion Recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence (2021).

40. Yan K, Kou L, Zhang D. Learning Domain-Invariant Subspace Using Domain Features and Independence Maximization. IEEE Trans Cybern (2018) 48(1):288–99. doi: 10.1109/TCYB.2016.2633306

41. Li H, Pan SJ, Wang S, Kot AC. Domain Generalization with Adversarial Feature Learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018). p. 5400–9. doi: 10.1109/cvpr.2018.00566

42. Dou Q, Castro DC, Kamnitsas K, Glocker B. Domain Generalization via Model-Agnostic Learning of Semantic Features. In: Advances in Neural Information Processing Systems (2019).

43. Yair O, Ben-Chen M, Talmon R. Parallel Transport on the Cone Manifold of SPD Matrices for Domain Adaptation. IEEE Trans Signal Process (2019) 67(7):1797–811. doi: 10.1109/tsp.2019.2894801

44. Gill V, Singh J, Singh Y. Analytical Solution of Generalized Space-Time Fractional Advection-Dispersion Equation via Coupling of Sumudu and Fourier Transforms. Front Phys (2019) 6:151. doi: 10.3389/fphy.2018.00151

46. Usman SM, Khalid S, Akhtar R, Bortolotto Z, Bashir Z, Qiu H. Using Scalp EEG and Intracranial EEG Signals for Predicting Epileptic Seizures: Review of Available Methodologies. Seizure (2019) 71:258–69. doi: 10.1016/j.seizure.2019.08.006

47. Ramantani G, Maillard L, Koessler L. Correlation of Invasive EEG and Scalp EEG. Seizure (2016) 41:196–200. doi: 10.1016/j.seizure.2016.05.018

48. Fahimi F, Dosen S, Ang KK, Mrachacz-Kersting N, Guan C. Generative Adversarial Networks-Based Data Augmentation for Brain-Computer Interface. IEEE Trans Neural Networks Learn Syst (2020).

49. Xiao Y. A Fast Algorithm for Two-Dimensional Kolmogorov-Smirnov Two Sample Tests. Comput Stat Data Anal (2017) 105:53–8. doi: 10.1016/j.csda.2016.07.014


Keywords

seizure prediction, domain adaptation, ensemble learning, EEG, time series classification

Citation

Peng P (2022) Seizure Prediction With HIVE-CODAs: The Hierarchical Vote Collective of Domain Adaptation Methods. Front. Phys. 9:811681. doi: 10.3389/fphy.2021.811681

Received

09 November 2021

Accepted

01 December 2021

Published

03 January 2022

Volume

9 - 2021

Edited by

Kai-Da Xu, Xi’an Jiaotong University, China

Reviewed by

Bin Chen, Yangzhou University, China

Liangyu Ma, North China Electric Power University, China


Copyright

*Correspondence: Peizhen Peng,

This article was submitted to Optics and Photonics, a section of the journal Frontiers in Physics

