Seizure Prediction With HIVE-CODAs: The Hierarchical Vote Collective of Domain Adaptation Methods

Epileptic seizure prediction is one of the most used therapeutic adjuvant strategies for drug-resistant epilepsy. Conventional methods are usually trained and tested on the same patient due to the interindividual variability. However, the challenging problem of the domain shift between different subjects remains unsolved, resulting in low prevalence of clinical application. In this study, a generic model based on the domain adaptation (DA) technique is proposed to alleviate such problems. Ensemble learning is employed by developing a hierarchical vote collective of seven DA modules over multi-modality data, such that the predictive performance is improved by training multiple models. Moreover, to increase the feasibility of its implementation, this study mimics the data distribution of clinical sampling and tests the model under this simulated realistic condition. Based on the performance of seven subnetworks, the applicability of each DA algorithm for seizure prediction is evaluated, which is the first study that provides the assessment. Experimental results on both intracranial and scalp EEG databases demonstrate that this method can reduce the domain gap effectively compared with previous studies.


Epilepsy Background
Epilepsy is a cerebral anomaly characterized by the transient occurrence of unexpected seizures caused by excessive or hypersynchronous neuronal activity [1]. It is the second most clinically significant neurological disorder and affects approximately 1.0% of the world's population [2]. Reliable seizure prediction, which refers to anticipating an upcoming seizure based on continuous electroencephalogram (EEG) signals, is an emerging and important need for drug-resistant individuals, who account for about 30% of people with epilepsy [3,4]. An early warning device could help prevent injury from epileptic coma, or even death.
EEG is a commonly used type of physiological signal that measures epileptic brain activity. It records rhythmic information induced by coordinated neuronal firing with characteristic periodicity. The first-in-man forecast study was reported in 2013 [5] and offered convincing proof that seizures are predictable. Since then, many EEG-based algorithms adopting data-driven techniques have been presented.

Related Work
Current research studies regarding seizure prediction can be mainly categorized into two streams. The first stream typically follows a binary classification scheme, which assumes that a difference exists between the interictal and preictal stages. The ictal and postictal sequences are discarded during data processing because they contribute little to forecasting. The second stream detects the fluctuation of a specific index during the preictal period, such as the spike rate [6][7][8], zero-crossing intervals [9], and phase/amplitude locking value [10]. If the observed indicator exceeds the previously set threshold, an early warning is declared. Owing to the multiplicity of multichannel EEG recordings, the first stream is more widely recognized than the second one. This study also adopts the strategy that distinguishes preictal states from interictal states, which is depicted in Figure 1.
Approaches using the binary classification scheme commonly adopt machine learning techniques like support vector machines [11][12][13], random forests [14], and k-nearest neighbor [15]. For the past few years, many deep learning frameworks, including convolutional neural network (CNN) [16][17][18][19], 3D CNN [20], long short-term memory (LSTM) network [21][22][23], and cascades of DNN [24], have been exploited to analyze continuously acquired epileptic EEG signals. However, there are still many promising algorithms to be developed and applied further. Ensemble learning is considered the state-of-the-art solution for many challenging problems. For instance, several representative approaches, including HIVE-COTE [25], boosting, bagging, and stacking, have achieved high performance for time series classification. Such methods are appealing because they have stronger generalization ability than a single model, which they obtain by training multiple subnetworks and combining their predictions. For this reason, we attempt to probe into their effectiveness for seizure prediction.
Most recently, various machine learning-based studies have achieved high performance. However, these methods are not yet in widespread use. Most of these research studies only provide patient-specific results, namely, both training and testing sets are collected from one subject. The reason for adopting this strategy is that large interindividual variability is ubiquitous among patients with epilepsy [26][27][28]. Therefore, an ensemble that contains a number of domain adaptation modules is developed in this study to reduce the impact of epileptic individual variability.

Significance
Although conventional studies achieve encouraging successes in the seizure prediction task, their translation to clinical application remains challenging, in part due to their limited domain adaptability across different subjects. EEG patterns vary significantly from patient to patient as shown in Figure 2, and the issue regarding the model generalization ability remains unsolved. In previous studies, the training and testing sets are from the same patients, which can yield a very high sensitivity (> 85% on average). Although such trials are important for personalized medicine, they are inconsistent with the clinical scenario in most cases. In other words, conventional models may perform well in one patient but be less effective in another, since the domain gap between different subjects is partly ignored. In practice, the training set is mainly composed of previous patient data, and only a small amount of user samples can be used for training. The training set, consisting of various subjects, is the source domain. The "unseen" user is the target domain. In the existing literature, few research studies explore the domain shift issue. Therefore, a general seizure prediction model that resembles the clinical situation remains to be explored and refined.
To alleviate the low popularization of clinical application and circumvent the impact of interindividual variability, domain adaptation (DA) is introduced for seizure prediction. However, few studies aim at using these techniques in epileptic EEG. In the fields such as image recognition and emotion-related EEG, successful applications of domain adaptation approaches have been reported [29,30]. There are three main streams of DA algorithms. The first stream exploits adversarial learning to extract invariant information among source and target domains. The second stream extends sample size with data augmentation to access the target domain pattern in advance. The third stream establishes general features based on specific prior knowledge.
Inspired by the success in other areas, we hope to extend DA to the field of seizure prediction. Since many DA techniques [31][32][33] have been provided, an ensemble learning-based model, the hierarchical vote collective of DA subnetworks (HIVE-CODAs), is proposed in this study. HIVE-CODAs combine the advantages of various DA methods. Besides, it can evaluate the applicability of each DA algorithm. In general, the main contributions of this study are summarized as follows:
• A generic model, HIVE-CODA, is proposed to tackle the DA problem for seizure prediction. It is the first attempt to reduce the domain disparity between different patients and to test the model under simulated clinical sampling conditions.
• Ensemble learning is introduced into this model by developing a hierarchical vote collective. Such a framework can improve the predictive performance and generalization ability due to the combination of multiple DA subnetworks.
• This study is the first study to evaluate the applicability of different DA algorithms for seizure prediction, which is crucial for follow-up studies.
Based on DA techniques and ensemble learning, the proposed model provides an above par disturbance rejection property, making the model more robust and practical for clinical application. Experiments on two public databases, the Freiburg Hospital EEG database and the CHB-MIT EEG database [34,35], are conducted for model evaluation. Results indicate that HIVE-CODA achieves better domain adaptability compared with other state-of-the-art baselines.

Patients
Two public EEG datasets, the Freiburg Hospital Intracranial EEG database [34] and the CHB-MIT scalp EEG database [35], are adopted to evaluate the generalization capability of HIVE-CODAs. The Freiburg Hospital EEG database includes time series of 87 seizures from 21 people with medically intractable focal epilepsy, ranging from 10 to 50 years old (8 male and 13 female patients). EEG signals are recorded invasively with six electrodes (3 near the epileptic focus and the other three distal to the epileptogenic zone). The sampling rate for all patients is 256 Hz (data of Patient No. 12 are sampled at 512 Hz but are down-sampled to 256 Hz).
The CHB-MIT database consists of scalp EEG sequences of 22 epileptic subjects, including five male patients ranging from 3 to 22 years and 17 female patients from 1.5 to 19 years. The EEG signals are recorded at a 256 Hz sampling rate with 16-bit analog-to-digital converters. Most samples are acquired from surface electrodes of 23 channels following the 10-20 standard system for electrode placement. Each patient has a subfolder that contains 9 to 42 recordings.

Data Selection and Labeling
Power line noise removal is implemented to denoise the data. We discarded the frequency bands of 47-53 and 97-103 Hz in the intracranial EEG set and the frequency bands of 57-63 and 117-123 Hz in the scalp EEG set, because power line noise commonly appears at 50 Hz (and its 100 Hz harmonic) for the Freiburg database and at 60 Hz (and its 120 Hz harmonic) for the CHB-MIT database. Moreover, a subject selection is performed. Only patients with at least two seizures but fewer than 15 seizures per day are used for prediction, since fewer than two seizures are not enough to support training and more than 15 seizures per day make forecasting pointless. The chosen subjects are listed in Tables 1, 2.
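The band removal described in this paragraph can be illustrated with a minimal sketch, assuming it is approximated by zeroing the discrete Fourier bins inside the rejected bands. The function names `remove_bands` and `amplitude_at`, the naive DFT, and the synthetic test signal are illustrative assumptions; the text does not state which filter implementation was used.

```python
import cmath
import math

def remove_bands(signal, fs, bands):
    """Zero the DFT bins inside the given (low, high) Hz bands and invert.

    Uses a naive O(n^2) DFT, which is only suitable for short segments.
    """
    n = len(signal)
    # Forward DFT
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Fold the bin index so the mirrored (negative-frequency) bins are covered
        freq = min(k, n - k) * fs / n
        if any(low <= freq <= high for low, high in bands):
            spectrum[k] = 0
    # Inverse DFT; the input is real, so only the real part is kept
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

def amplitude_at(signal, fs, freq):
    """Amplitude of the sinusoid at `freq` Hz (freq must fall on a DFT bin)."""
    n = len(signal)
    k = round(freq * n / fs)
    coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
    return 2 * abs(coeff) / n

FS = 256  # Hz, the sampling rate reported for both databases
# Hypothetical 1-s segment: a 10 Hz rhythm plus 50 Hz power line interference
segment = [
    math.sin(2 * math.pi * 10 * t / FS) + 0.5 * math.sin(2 * math.pi * 50 * t / FS)
    for t in range(FS)
]
# Bands used for the intracranial (Freiburg) set in the text
cleaned = remove_bands(segment, FS, [(47, 53), (97, 103)])
```

For the scalp EEG set, the same call would take the bands `[(57, 63), (117, 123)]`. A production pipeline would more likely use a dedicated notch or band-stop filter than hard spectral zeroing.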
A prerequisite for seizure prediction is the reliable distinction between preictal and interictal samples. We set the 30 min before seizure onset as the seizure prediction horizon (SPH), a choice that follows empirical evidence from comparison tests with multiple preictal lengths, and the seizure occurrence period is set to 0. A seizure should occur within 30 min after the predictor returns a positive. The raw EEG recordings are then divided into continuous, non-overlapping fragments by a 5-s time window. The sample number for each subject is sufficient (>7,200) to support training. Besides, we noted that the number of interictal samples is much larger than that of preictal samples. To remedy the sample imbalance, the interictal signals are randomly subsampled so that the training set contains equal numbers of preictal and interictal samples.
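The windowing and class balancing described above can be sketched as follows. This is a minimal illustration, assuming single-channel recordings stored as Python lists and at least as many interictal as preictal windows; the helper names `segment_recording` and `balance` and the dummy recordings are hypothetical.

```python
import random

FS = 256        # Hz, sampling rate reported in the text
WINDOW_SEC = 5  # window length reported in the text

def segment_recording(recording, fs=FS, window_sec=WINDOW_SEC):
    """Split a 1-D recording into continuous, non-overlapping windows.

    Trailing samples that do not fill a whole window are dropped.
    """
    size = fs * window_sec
    return [recording[i:i + size] for i in range(0, len(recording) - size + 1, size)]

def balance(preictal, interictal, seed=0):
    """Randomly subsample interictal windows down to the preictal count.

    Assumes len(interictal) >= len(preictal), as stated in the text.
    """
    rng = random.Random(seed)
    return list(preictal), rng.sample(interictal, len(preictal))

# Dummy data: 30 min of preictal signal and 60 min of interictal signal
preictal_rec = [0.0] * (30 * 60 * FS)
interictal_rec = [0.0] * (60 * 60 * FS)

pre_windows = segment_recording(preictal_rec)
inter_windows = segment_recording(interictal_rec)
pre_balanced, inter_balanced = balance(pre_windows, inter_windows)
```

With these settings, a 30-min preictal period yields 360 windows of 1,280 samples each, and the interictal set is reduced to the same count.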

METHODS
To learn the domain-invariant representation, we proposed a generic seizure prediction model: the hierarchical vote collective of DA subnetworks (HIVE-CODAs). HIVE-CODA is an ensemble that combines 7 DA modules over multi-modality data. Each subnetwork is assigned a weight via the probabilistic voting scheme to equilibrate its value. By analyzing the most contributive DA component and its feature space, we provided a preliminary conclusion about the generalized information during the preictal period among various individuals.

Clinical Situation Simulation
Conventional approaches only provide the patient-specific results. Such frameworks may obtain high precision but are not consistent with the signal recording situation in real life. It is difficult to collect a large number of long-term EEG samples from one specific patient during clinical treatment, such that the sample size is unable to support the training process. Therefore, we considered using DA technology to apply data from other subjects to predictor training for the particular subject.
The training and testing strategy is depicted in Figure 3. The training and validation sets consist of existing patient data and one seizure of the target subject, while the remaining target seizures serve as the testing set. The selection of the seizure for training follows the idea of the leave-one-out cross-validation (LOOCV) approach [36]. Moreover, the combined data are partitioned into five folds: 80% of the samples are assigned to the training set, while the remaining 20% form the validation set to prevent overfitting.
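A minimal sketch of this splitting strategy is given below. It assumes samples are opaque Python objects and uses a single random 80/20 split per held-in seizure rather than rotating through all five folds; the function name `clinical_splits` and the toy data are hypothetical.

```python
import random

def clinical_splits(source_samples, target_seizures, val_fraction=0.2, seed=0):
    """Yield one (train, val, test) split per target seizure.

    source_samples: samples from previously recorded patients.
    target_seizures: list in which each element holds the samples belonging
        to one seizure of the "unseen" target subject.
    """
    rng = random.Random(seed)
    for held_in in range(len(target_seizures)):
        # One target seizure joins the source data for training and validation
        pool = list(source_samples) + list(target_seizures[held_in])
        rng.shuffle(pool)
        n_val = int(len(pool) * val_fraction)
        val, train = pool[:n_val], pool[n_val:]
        # All remaining target seizures are reserved for testing
        test = [
            sample
            for idx, seizure in enumerate(target_seizures)
            if idx != held_in
            for sample in seizure
        ]
        yield train, val, test

# Toy data: 90 source samples and 3 target seizures with 10 samples each
source = [("src", i) for i in range(90)]
target = [[("tgt", s, i) for i in range(10)] for s in range(3)]
splits = list(clinical_splits(source, target))
```

Each split keeps the held-in seizure out of the test set, so the test samples always come from seizures the model has not seen.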
Since few domain adaptation techniques of epileptic EEG have been reported, we applied seven state-of-the-art approaches from the related fields to constitute the subnetworks of HIVE-CODAs. Several modules require images as inputs, instead of time series, such that we generate the spectrograms from EEG segments using the short-time Fourier transform (STFT) [44]. The raw EEG recordings are translated into two-dimensional matrices composed of frequency and time axes. Then the EEG fragments and their spectrograms will be sent forward to the corresponding modules depending on their modalities.
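The spectrogram generation can be illustrated with a minimal STFT sketch. The frame length, hop size, Hann window, and the function name `stft_magnitude` are assumptions chosen for illustration; the text does not report the STFT parameters.

```python
import cmath
import math

def stft_magnitude(signal, frame_len, hop):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames."""
    # A Hann window reduces spectral leakage at the frame edges
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies of a real signal
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ])
    # Transpose from [time][frequency] to [frequency][time]
    return [list(column) for column in zip(*frames)]

FS = 256  # Hz
# Hypothetical 5-s segment containing a 10 Hz rhythm
eeg_segment = [math.sin(2 * math.pi * 10 * t / FS) for t in range(5 * FS)]
spec = stft_magnitude(eeg_segment, frame_len=256, hop=128)
peak_bin = max(range(len(spec)), key=lambda k: spec[k][0])
```

With a 256-sample frame at 256 Hz, each frequency bin spans 1 Hz, so the 10 Hz rhythm peaks in bin 10. The resulting two-dimensional matrix is the kind of image-like input that the spectrogram-based modules would receive.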

1) MMD-AAE: We developed the MMD-AAE module referring to the study mentioned in reference [41], which aims at assessing the effectiveness of the maximum mean discrepancy (MMD) measure and adversarial autoencoders (AAEs). An MMD-based regularization term is proposed to align the distributions among various subjects. The AAE architecture is applied to learn the latent codes that are universal to all domains. The sharable information is captured by matching the aligned distribution to an arbitrary prior distribution. Thus, the MMD-AAE may circumvent overfitting to the source data.
2) SIDA: We also estimated the performance of SIDA on epileptic EEG, which combines power spectral density (PSD) features and adversarial learning [37]. SIDA focuses on the extraction of the invariant representations among different domains. The sharable information is jointly learned with the task loss $L_{task}$ and the subject confusion loss $L_{subj}$. The training procedure adopts the adversarial strategy, which is implemented with a gradient reversal layer. Suppose that there are $N$ source samples $\{x_i\}_{i=1}^{N}$. In the update rule, $\theta$, $c$, $\phi$ represent the network parameters and $\hat{\theta}$, $\hat{c}$, $\hat{\phi}$ are their updated forms; $\lambda$ is the positive trade-off parameter; $h(\cdot)$ and $g(\cdot)$ are the classification task and subject discrimination outputs; and $c_i$, $s_i$ denote the corresponding labels for $L_{task}$ and $L_{subj}$, respectively. Note that a specific feature extraction component is assembled in HIVE-CODAs, since the inputs of SIDA are PSD features in accordance with the study mentioned in reference [37].
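To make the MMD-based alignment used by the MMD-AAE module more concrete, the following is a minimal sketch of a biased estimate of the squared MMD between the feature sets of two subjects. The RBF kernel, the `gamma` value, and the synthetic Gaussian features are assumptions for illustration; the kernel used in reference [41] may differ.

```python
import math
import random

def rbf(x, y, gamma):
    """RBF kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

rng = random.Random(0)

def draw(mean, n=100):
    """Draw n hypothetical 2-D latent codes around `mean`."""
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]

subject_a = draw(0.0)
subject_b = draw(0.0)  # same distribution as subject_a
subject_c = draw(3.0)  # shifted distribution, mimicking a domain gap

mmd_same = mmd2(subject_a, subject_b)
mmd_shifted = mmd2(subject_a, subject_c)
```

A regularizer of this kind is small when two subjects share a latent distribution and grows with the domain gap, which is why minimizing it encourages subject-invariant codes.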

Modules Based on Data Augmentation
1) C-DCGANs: By introducing C-DCGANs [38], we tested the feasibility of using data augmentation and convolutional neural networks (CNN) to remedy the domain discrepancy. The main idea of C-DCGANs is to increase generalization capability via artificial EEG data generation. A generative adversarial network (GAN) is exploited to expand the training set, and an end-to-end CNN is employed as the classifier. We noted that C-DCGANs also involve adversarial learning due to the application of GAN. However, the generation function of GAN is highlighted in HIVE-CODAs, instead of the min-max optimization, such that we placed emphasis on the assessment of data augmentation.
2) MIDA: The MIDA subnetwork is developed to measure the importance of background information and feature augmentation. In the MIDA framework, an inner product space is established, where feature vectors are maximally independent of the domain features in the sense of the Hilbert-Schmidt independence criterion (HSIC) [40]. The feature augmentation is performed by generating latent representations based on background knowledge such as acquisition time. The original feature vectors are expanded by concatenating them with the produced features. According to the study mentioned in reference [40], we exploited the domain label (which domain a sample belongs to) as the background information, since neither device labels nor acquisition times are provided in the epileptic EEG databases.
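The two ingredients of MIDA mentioned above, feature augmentation with the domain label and the HSIC dependence measure, can be sketched as follows. This is an illustration only: the RBF kernel on features, the linear kernel on one-hot domain labels, the `gamma` value, and the toy features are assumptions, and the subspace learning step of MIDA is not shown.

```python
import math

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def augment(features, domain_ids, n_domains):
    """Concatenate each feature vector with its one-hot domain label."""
    return [list(x) + one_hot(d, n_domains) for x, d in zip(features, domain_ids)]

def hsic(features, domain_ids, n_domains, gamma=1.0):
    """Empirical HSIC, tr(K H L H) / (n - 1)^2, between features and domains."""
    n = len(features)
    # RBF kernel on the features
    K = [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y))) for y in features]
         for x in features]
    # Linear kernel on the one-hot domain labels
    labels = [one_hot(d, n_domains) for d in domain_ids]
    L = [[sum(a * b for a, b in zip(u, v)) for v in labels] for u in labels]
    # Centre K: (H K H)_ij = K_ij - row_mean_i - col_mean_j + total_mean
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    trace = sum((K[i][j] - row[i] - col[j] + total) * L[j][i]
                for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2

domains = [0, 0, 0, 0, 1, 1, 1, 1]
# Features that cluster by domain (strong domain dependence)
dependent = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3]]
# Features whose clusters are spread evenly over both domains
independent = [[0.0], [5.0], [0.1], [5.1], [0.2], [5.2], [0.3], [5.3]]

hsic_dep = hsic(dependent, domains, 2)
hsic_indep = hsic(independent, domains, 2)
augmented = augment(dependent, domains, 2)
```

A low HSIC value indicates features that carry little information about which subject they came from, which is the property MIDA seeks in its learned subspace.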

Figure caption: The $j$th DA module outputs the voting vector $p_i^{(j)}$ to claim its decision. Then the voting vectors will be selected adaptively with the weight matrices to produce the probability of class $i$. Meanwhile, the weight layer will be exploited for DA subnetwork assessment.

1) CMDA: The CMDA module is adopted to evaluate the applicability of manifolds to epileptic EEG. Referring to the study mentioned in reference [43], the latent feature space among various domains is regularized by modeling sharable information on the Riemannian cone manifold. Specifically, covariance matrices $P$ of EEG segments are computed to constitute the manifold $M$. The CMDA module leverages the global Riemannian mean $\bar{P}$ and the local Riemannian mean $\bar{P}^{(k)}$ of the $k$th domain:
$$\hat{S}^{(k)} = \Gamma_{\bar{P}^{(k)} \to \bar{P}}\left(S^{(k)}\right),$$
where $\hat{S}^{(k)}$ denotes the generalized features, $\Gamma_{B \to A}(\cdot)$ represents the parallel transport from $B$ to $A$, and $S^{(k)}$ represents the projection of $P^{(k)}$ on the tangent space $T_{\bar{P}^{(k)}} M$ with the logarithm map. In general, each domain feature is parallelly transported from $\bar{P}^{(k)}$ to the global centroid $\bar{P}$, and the transported point $\hat{S}^{(k)}$ is embedded in the $\langle \cdot, \cdot \rangle_{\bar{P}}$ inner product space to make the generalized features describable in the Euclidean space.
2) PPDA: The long short-term memory (LSTM) architecture and a particular learning strategy are evaluated by adding the PPDA module. PPDA divides the latent features into private portions specific to each subject and generalized components shared among all subjects. To leverage both the universal and private feature vectors, PPDA develops a learning procedure including the training phase, calibration phase, and test phase. Specifically, the LSTM layer is adopted for encoding and decoding.
3) MASF: To assess the applicability of meta-learning and semantic features, the MASF module is employed in HIVE-CODAs. According to the study mentioned in reference [42], a model-agnostic learning paradigm is exploited to minimize the domain gap by using a global class alignment loss $L_{global}$ and a local sample clustering loss $L_{local}$. The knowledge about interclass relationships and the domain-independent class-specific cohesion/separation is captured by $L_{global}$ and $L_{local}$, respectively. In the parameter update, $\psi$, $\theta$ are the network parameters, $\eta$ is the learning rate, and $\beta_1$, $\beta_2$ denote the weighting coefficients. $L_{task}$ represents the loss function of the predictive task. By introducing both global and local information, the semantic structure regarding the EEG feature space is regularized explicitly.
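As a hedged sketch of how the MASF symbols above typically fit together, the parameter update can be written as a gradient step on a weighted sum of the three losses. This form is an assumption based on the usual structure of gradient-based meta-learning objectives; it is not taken from this text, and the inner meta-train step that precedes it is omitted.

```latex
% Assumed form of the MASF meta-update (illustrative, not from the source text)
\begin{equation*}
(\psi, \theta) \leftarrow (\psi, \theta)
  - \eta \, \nabla_{\psi, \theta}
  \left( L_{task} + \beta_1 L_{global} + \beta_2 L_{local} \right)
\end{equation*}
```

Under this reading, $\beta_1$ and $\beta_2$ control how strongly class alignment and sample clustering shape the feature space relative to the predictive task.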

Weighted Voting Scheme
To evaluate the contribution of each subnetwork, a weighted voting structure is introduced at the end of the network. We assumed that there are $G$ modules for the classification with $C$ classes. For an arbitrary class $y_i$, we denote by $w_i^{(j)}$ the weight assigned to the $j$th module and by $p_i^{(j)}$ the vote of that module, where $i \in [1, \ldots, C]$ and $j \in [1, \ldots, G]$. Then the collective probability $p_i$ for the $i$th class is the normalized weighted sum over modules:
$$p_i = \frac{\sum_{j=1}^{G} w_i^{(j)} p_i^{(j)}}{\sum_{k=1}^{C} \sum_{j=1}^{G} w_k^{(j)} p_k^{(j)}}.$$
The prediction result $\hat{y}$ can be given as follows:
$$\hat{y} = \arg\max_{i \in [1, \ldots, C]} p_i.$$
The applicability of each algorithm on epileptic EEG could be estimated via observing its weight unit. Besides, a more balanced and intuitive collective could be created as the subnetworks are trained adaptively.
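The weighted voting step can be sketched in a few lines, assuming each module outputs one probability per class and the collective probability is the normalized weighted sum described above. The function name `collective_vote` and the example votes and weights are hypothetical.

```python
def collective_vote(votes, weights):
    """Combine module votes into class probabilities and a prediction.

    votes[j][i]: probability that module j assigns to class i.
    weights[j][i]: weight of module j for class i.
    Returns (normalized class probabilities, index of the predicted class).
    """
    n_classes = len(votes[0])
    # Weighted sum over modules for each class
    scores = [
        sum(w[i] * p[i] for w, p in zip(weights, votes))
        for i in range(n_classes)
    ]
    total = sum(scores)
    probs = [s / total for s in scores]
    prediction = max(range(n_classes), key=probs.__getitem__)
    return probs, prediction

# Three hypothetical modules voting on [interictal, preictal]
votes = [[0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
weights = [[0.5, 0.5], [0.2, 0.2], [0.3, 0.3]]
probs, prediction = collective_vote(votes, weights)

# Evaluating one module alone: reset the other modules' weights to 0
solo_weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
solo_probs, solo_prediction = collective_vote(votes, solo_weights)
```

In this toy case the collective predicts the preictal class, while the second module evaluated alone would predict interictal, which shows how the weights let stronger modules override weaker ones.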

RESULTS AND DISCUSSION
In this section, the comparison results and weight matrix are provided to verify the generalization ability and evaluate the DA algorithms. HIVE-CODA is assessed on both intracranial and scalp EEGs. We adopted three common measures for evaluation: sensitivity, false alarm rate per hour (FPR), and area under the receiver operating characteristic curve (AUC).
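The three measures can be computed as in the following minimal sketch. The event counts and scores are hypothetical and serve only to show the arithmetic; AUC is computed from its rank interpretation (the probability that a random preictal sample scores higher than a random interictal one).

```python
def sensitivity(n_predicted_seizures, n_seizures):
    """Fraction of seizures that were preceded by an alarm."""
    return n_predicted_seizures / n_seizures

def false_alarms_per_hour(n_false_alarms, interictal_hours):
    """False prediction rate (FPR) in alarms per hour of interictal recording."""
    return n_false_alarms / interictal_hours

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example: 4 of 5 seizures predicted, 9 false alarms in 50 h
sens = sensitivity(4, 5)
fpr = false_alarms_per_hour(9, 50)
area = auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0])
```

The pairwise AUC is quadratic in the number of samples, so a sort-based implementation would be preferable for full test sets.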

Generalization Ability Analysis
The comparison experiments are conducted to demonstrate the advantages of HIVE-CODAs over other conventional methods. Many time/frequency domain-based approaches have been applied to predict upcoming seizures. Two classic deep neural networks, CNN and LSTM, are selected to assess the generalization ability of our method, which have achieved success in patient-specific forecast. We also attempted to find a generic algorithm across different subjects for comparison. However, little existing research considers the similarity of data acquisition to clinical situation and uses plenty of "unseen" patient's samples for training. The implementation details of CNN and LSTM refer to references [17] and [22], and the experimental results are listed in Tables 3, 4.
The experiment regarding the intracranial EEG is performed based on the widely used Freiburg Hospital database. Table 3 illustrates that HIVE-CODAs achieve a sensitivity of 80% and an FPR of 0.18/h on average, which outperforms other forecast models. For the outlier-like Pt 17, HIVE-CODAs do not produce the desired effectiveness, which might be caused by a larger domain gap existing in the sample space.
Evidently, the performance of all these prediction approaches shows a significant decline compared with the patient-specific results in the literature. This phenomenon is reasonable since the training and testing samples are collected from the same subject in prior studies, which pay little attention to generalization ability. Conversely, our method is implemented based on the existing database and a small amount of the "unseen" patient's data, which is more consistent with real clinical situations. Though the precision is not particularly high, the model performance is sufficient for the daily needs of patients, as it approximates that of the first-in-man trial [5].
In terms of scalp EEG, the experiment is conducted based on the public CHB-MIT database, produced by the Massachusetts Institute of Technology. As shown in Table 4, HIVE-CODAs achieve a sensitivity of 72% and an FPR of 0.24/h on average. Since the conventional algorithms pay little attention to the domain shift among different patients, HIVE-CODAs exhibit obvious advantages over other prediction models. Still, for several outliers like Pt 2, Pt 9, Pt 10, and Pt 17, the sensitivity of our approach is only slightly higher than that of a random binary classifier. HIVE-CODA is a variation over deep learning models. As such, it carries with it the uncertainties associated with deep neural networks, in particular a lack of formal convergence guarantees.
Furthermore, experiments compared with DA algorithms are conducted. Results of AUC value are listed in Tables 5, 6. For the Freiburg Hospital database, results indicate that HIVE-CODAs achieve higher generalization ability than the conventional algorithms. It also testifies to the application potential of integrated DA modules on processing epileptic EEG. To be specific, the interindividual variability could be alleviated, and the existing forecast systems could be transferred to the clinic due to the emerging technologies in DA.
For the CHB-MIT database, the conventional studies show lower performance by a clear margin compared with their patient-specific results, which is consistent with the experiment conducted on intracranial EEG. Moreover, all these model performances drop to a varying degree compared with the precisions on the Freiburg test set. It might be caused by the advantages of ensemble learning in analyzing low spatial resolution recordings, namely, the scalp EEG signals. In other words, intracranial EEG recordings have high spatial resolution and SNR, whereas artifacts are typically seen in scalp EEG [46,47]. This result also illustrates that HIVE-CODAs have superiority for processing complex time series due to the diverse inner pattern of the collective structure.

Module Performance Analysis
As few studies evaluate the applicability of different DA algorithms for seizure prediction, this study provides an analysis based on the adaptively trained weight matrices. HIVE-CODAs introduce several successful machine learning models from related fields and assess their performance. The subnetworks are assessed via a statistical analysis of the weighted voting layer. The weight distributions are presented in Figure 5. The greater the normalized weight, the greater the contribution of the corresponding DA module. DA methods with high contributions are considered to have larger potential. This study also tests the predictive precision of each module running alone (with the other modules' weights reset to 0). The results of AUC values are illustrated in Figure 6. A detailed discussion regarding these DA techniques is provided as follows.
1) CMDA: CMDA relies on Riemannian manifold-based features to capture the characteristic scale of the neuronal events, and was originally proposed for motor imagery. As shown in Figure 5, CMDA surpasses the other approaches on both intracranial and scalp EEG datasets. We conjectured that the inner pattern of EEG sequences may obey a compact distribution in the embedding space, such that the manifold-based methods that capture a continuous subspace might be applicable to such a task. The experimental result indicates that the analytic Riemannian manifold can potentially be used to develop a robust seizure predictor.
2) SIDA: The SIDA module is an adversarial neural network from the area of emotion recognition. It uses EEG spectra as input to learn a new representation, minimizing loss of emotion recognition and subject confusion. As we can see, SIDA makes relatively larger contribution compared with other modules. It might be due to the combination of CNN and generative adversarial network (GAN), which have been exploited to extract invariant latent features successfully. The weight unit of SIDA module may suggest the potential effect of adversarial learning on generalization ability, since SIDA exploited the architecture of GAN. However, this conjecture needs to be further verified for the SIDA module as it adopts the power spectral density (PSD) features as inputs.
3) C-DCGANs: C-DCGANs use conditional GANs to generate EEG artificially, and were developed for the detection of a subject's movement intention (MI). We noticed that the performance of the data augmentation-based module falls short of the specific feature-based and adversarial learning-based subnetworks. The degradation might be caused by the limitation of EEG data augmentation. The fake data usually involve more artifacts [48] that may contaminate EEG data. Still, C-DCGANs provide a decent accuracy, which suggests that data augmentation still has potential in developing a generic seizure forecast model.
4) MMD-AAE: By matching the aggregated posterior with a prior distribution, the MMD-AAE module extracts the cross-domain features with adversarial learning. This scheme was originally used for image recognition. On both intracranial and scalp EEG datasets, MMD-AAE outperforms the MIDA, MASF, and PPDA modules and exhibits a slight decrease compared with the C-DCGAN module. Due to the above par performance of MMD-AAE, the conjecture in 2) about the superiority of adversarial learning is verified to some extent. This superiority may derive from the variational inference process of MMD-AAE, which alleviates the overfitting to the source domains effectively.
5) PPDA: PPDA is a technology applied to EEG-based emotion recognition. It uses an LSTM-based encoder to decompose the features into general characteristics applicable to all individuals and personalized characteristics. Dividing the raw EEG data into subject-specific information and generalized information is a commonly adopted strategy for domain adaptation. However, for seizure prediction, PPDA displays subpar performance, which we did not expect. Feature decomposition and the adoption of LSTM seem reasonable in these tasks. However, due to the few reports of relevant models, it cannot be absolutely determined that the decomposed features and LSTM are not suitable for epileptic signals.
6) MASF: MASF exploits semantic features and gradient-based meta-learning to establish a model-agnostic learning paradigm. In the field of image processing, successful application of MASF has been reported. Notably, the performance of the semantic feature-based method is unsatisfactory. We conjectured that the discriminant hyperplane in the feature space may be too complex to be adapted by the explicit semantic features. Moreover, the limitation of the initial neural architecture for meta-learning might also be a constraint of the search space.
7) MIDA: MIDA is originally applied in the emotion recognition field. The purpose of this model is to reduce differences in domain distributions by learning a subspace with maximum independence. Figure 5 indicates that all the DA methods can outperform the MIDA module. This result was expected, given the limited background information on epileptic data. Obviously, the background-specific features are not valid characteristics.
Based on these results, we observed that adversarial learning and manifolds may achieve good performance in epilepsy prediction. In addition, CNN and PSD features may also have the potential to process epileptic signals. In the domain generalization field, CNN has gradually become one of the most popular algorithms. This also further echoes the conjecture about CNN in this experiment. Note that the module performance may be variable on some special cases, since several outliers (in the Freiburg dataset, Pt 11, 21 for the C-DCGAN module; in the CHB-MIT dataset, Pt 2, 13 for the PPDA module, Pt 21 for C-DCGANs module, and Pt 2 for CMDA module) have been observed.

Model Applicability Analysis
Here, we attempt to summarize the universal characteristics and architectures based on the observations of DA algorithms in the Module Performance Analysis section. The weight vectors of three types of DA methods (specific features, data augmentation, and adversarial learning) are quantified in a statistical analysis, which is depicted in Figure 7.
As shown in Figure 7, the adversarial learning-based approaches exhibit obvious advantages over the other DA methods on both intracranial and scalp EEG. Meanwhile, the weight distribution indicates that model performance regarding the specific features and data augmentation is volatile. Comparing adversarial learning to specific features, we reckon that the amount of above par weights is about 64% up. Comparing adversarial learning to data augmentation, a further 61% benefit is obtained, for a total of about 125% margin over data augmentation-based methods. These observations give us confidence in the efficacy of adversarial learning for processing epileptic signals, and we conjecture that data augmentation is relatively inferior for alleviating individual variability.
In particular, the manifold feature of CMDA surpasses all the other methods, such that the effectiveness of the manifold feature requires a further demonstration. The statistical significance of the manifold feature is assessed for discriminating preictal and interictal stages across different patients. The two-sample Kolmogorov-Smirnov test [49] at a 5% significance level (p < 0.05) is implemented during the evaluation. The significance analysis for each patient is provided in Tables 7, 8. The unqualified performance index is marked in bold format. For the manifold feature, 17 of 20 subjects in the Freiburg dataset and 13 of 16 subjects in the CHB-MIT dataset present an adequately distinguished ability. According to this observation, manifold-based methods might be promising techniques for developing a robust seizure predictor.
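The significance test can be illustrated with a minimal two-sample Kolmogorov-Smirnov sketch. The p-value uses the asymptotic Kolmogorov series with a common small-sample correction, which is an assumption about the test variant; the function name `ks_two_sample` and the integer feature values are hypothetical stand-ins for a scalar manifold feature.

```python
import math

def ks_two_sample(a, b):
    """Two-sample KS statistic D and an asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between the ECDFs
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Asymptotic Kolmogorov distribution with a small-sample correction
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        # The series below does not converge for tiny lam; p is ~1 there
        return d, 1.0
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

# Hypothetical scalar feature values for the two classes of one patient
interictal_feature = list(range(50))
preictal_feature = list(range(25, 75))

d_stat, p_value = ks_two_sample(interictal_feature, preictal_feature)
significant = p_value < 0.05  # the 5% significance level used in the text
```

Under this criterion, a patient counts as adequately distinguished when the preictal and interictal feature distributions differ with p < 0.05.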

CONCLUSION
This study proposes a universal approach to alleviate the problem of individual variability in epileptic seizure prediction. By combining the DA and ensemble learning techniques, the proposed HIVE-CODA model mitigates the effects of epileptic individual variance and increases the generalization ability. Besides, a simulated clinical sampling scenario is adopted during the training and testing periods, which is the first attempt to adopt this evaluation strategy. Compared with the patient-specific scheme in conventional studies, such an assessment model is relatively demanding and challenging. Nonetheless, HIVE-CODAs achieve high domain shift robustness and precision, which demonstrates their feasibility for real-world applications.
By analyzing the contributions of each module, the experimental results also demonstrate the effectiveness of adversarial learning and manifolds in epileptic seizure prediction. The underlying causes of this phenomenon remain unclear because there is no definitive explanation of the dynamics of epilepsy in the existing literature. However, the success of the manifold module in this experiment brings new inspiration. We speculate that the mapping of EEG in the high-dimensional space may follow a compact distribution, so the kernel-based method for searching hyperplanes may have potential in this task. The search for more powerful DA algorithms and the underlying reasons will be considered as part of our future research extension to achieve higher performance.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data can be found here: https://epilepsy.uni-freiburg.de/freiburg-seizure-prediction-project/eeg-database.

ETHICS STATEMENT
Written informed consent was obtained from the individual(s) and minor(s)' legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.
The bold values denote outliers.