Abstract
Epilepsy is an electrophysiological disorder of the brain, the hallmark of which is recurrent and unprovoked seizures. Electroencephalogram (EEG) measures electrical activity of the brain that is commonly applied as a non-invasive technique for seizure detection. Although a vast number of publications have been published on intelligent algorithms to classify interictal and ictal EEG, it remains an open question whether they can be detected using short-length EEG recordings. In this study, we proposed three protocols to select 5 s EEG segment for classifying interictal and ictal EEG from normal. We used the publicly-accessible Bonn database, which consists of normal, interical, and ictal EEG signals with a length of 4097 sampling points (23.6 s) per record. In this study, we selected three segments of 868 points (5 s) length from each recordings and evaluated results for each of them separately. The well-studied irregularity measure—sample entropy (SampEn)—and a more recently proposed complexity measure—distribution entropy (DistEn)—were used as classification features. A total of 20 combinations of input parameters m and τ for the calculation of SampEn and DistEn were selected for compatibility. Results showed that SampEn was undefined for half of the used combinations of input parameters and indicated a large intra-class variance. Moreover, DistEn performed robustly for short-length EEG data indicating relative independence from input parameters and small intra-class fluctuations. In addition, it showed acceptable performance for all three classification problems (interictal EEG from normal, ictal EEG from normal, and ictal EEG from interictal) compared to SampEn, which showed better results only for distinguishing normal EEG from interictal and ictal. Both SampEn and DistEn showed good reproducibility and consistency, as evidenced by the independence of results on analysing protocol.
Introduction
Epilepsy is the fourth most common neurological disorder after migraine, stroke, and Alzheimer's disease (Sirven and Shafer, 2014) with an estimated 50 million people globally living with epilepsy (Media-Center, 2015). Epilepsy occurs in people of all ages and can affect them economically, socially, and even culturally. People with epilepsy often experience reduced educational opportunities, barriers to particular occupations, reduced access to health and life insurance, and other social stigma and discrimination (Sirven and Shafer, 2014; Media-Center, 2015). Recent studies show that up to 70% of people with epilepsy can be successfully treated. However, about three fourths in low- and middle-income countries may not receive the treatment they need. This is a considerable “treatment gap,” since nearly 80% of the epilepsy population live in those countries (Media-Center, 2015). Barriers to treatment for those people include the lack of trained healthcare providers and reliable low-cost diagnostic techniques (Media-Center, 2015).
Some common reasons of epilepsy are an abnormality in brain connections, an increased synchronization of neuronal activity in the brain (in which some brain cells either over-excite or over-inhibit other cells), a brain damage associated with conditions that disrupt normal brain activity, or some combination of these factors (NINDS, 2015). The hallmark of epilepsy is recurrent and unprovoked seizures. During the “epileptogenesis” process, the normal neuronal network abruptly turns into a hyper-excitable network, affecting mostly the cerebral cortex (Acharya et al., 2013). The most commonly used diagnostic tests for epilepsy is the measurement of electrical activity in the brain through monitoring electroencephalogram (EEG) or magnetoencephalogram (MEG) signals, and brain scans including computed tomography (CT), positron emission tomography (PET), and magnetic resonance imaging (MRI) (NINDS, 2015; Zhang et al., 2015).
EEG is a non-invasive, low-cost, yet effective technique for examining the electrical activity of the brain. Abnormal spike discharges can be identified in EEG recordings before and during a seizure attack (interictal and ictal states). Recognition of interical and ictal seizure phases through the analysis of EEG features has long been studied (Acharya et al., 2013). Those EEG features are selected from a wide spectrum, including time-domain (Meier et al., 2008; Minasyan et al., 2010), frequency-domain (Polat and Güneş, 2007; Chua et al., 2011), time-frequency analysis (Ocak, 2009; Tzallas et al., 2009; Guo et al., 2010; Alam and Bhuiyan, 2013; Kumar et al., 2014a,b), and features based on non-linear dynamics of the signal (Yuan et al., 2011). Non-linear methods have attracted increasing attention recently, since EEG signals are considered outputs of an intrinsically non-linear, complex system—the brain. Published studies have explored the availability of different non-linear methods, especially entropy features such as approximate entropy (ApEn) (Srinivasan et al., 2007; Ocak, 2009; Guo et al., 2010; Kumar et al., 2014a,b), sample entropy (SampEn) (Song et al., 2012), fuzzy entropy (FuzzyEn) (Kumar et al., 2014a; Xiang et al., 2015), and permutation entropy (PE) (Nicolaou and Georgiou, 2012; Li et al., 2014), or the combinations of two or more of these entropy features (Kannathal et al., 2005; Yuan et al., 2011; Acharya et al., 2012, 2015), all of which have shown good performance for distinguishing interictal, ictal EEG signals, and normal signals.
However, these studies are based on the entire EEG recordings (whose lengths is usually more than 20 s) from a specific database (Srinivasan et al., 2007; Ocak, 2009; Guo et al., 2010; Yuan et al., 2011; Acharya et al., 2012; Nicolaou and Georgiou, 2012; Song et al., 2012; Kumar et al., 2014a,b). The reason for using longer data length may partly be due to the fact that the traditional entropy methods are parameter dependent and typically can achieve stable estimations only for relatively long data recordings (e.g., 1000 sampling points or more; Richman and Moorman, 2000; Xie et al., 2008; Chen et al., 2009; Yentes et al., 2013). Therefore, most of the existing algorithms are only suitable for offline or post event detection of seizure rather than online or during event detection. This limits the caregivers to take prompt action during an event, which is important for better health outcome of epileptic patient. Patients may be exposed to life-threatening conditions if a seizure onset cannot be detected promptly. Additionally, online epilepsy and seizure detection based on short-length EEG recordings is set to become increasingly favored with the emergence of portable EEG amplifiers. To the best of our knowledge, there is currently no published study that has systematically attempted to achieve accurate detection using short-length EEG recordings.
In 2015, Li et al. (2015a) developed a new entropy method—distribution entropy (DistEn)—based on the distribution of inter-vector distances in the state space representation of time-series. DistEn has shown superiority for analysis of both benchmark data (Li et al., 2015a; Udhayakumar et al., 2015) and real world clinical data (Li et al., 2015) with extremely small number of samples compared with SampEn and FuzzyEn. In addition, DistEn precludes the dependence upon input parameters of the traditional methods (Li et al., 2015a; Udhayakumar et al., 2015). In our previous study, we applied this novel DistEn method to analyzing normal, interictal, and ictal EEGs, and found significant differences between the interical and ictal EEGs (Li et al., 2015b). Additionally, in that study we have explored how the length of EEGs influences DistEn performance. We tested the between group differences of DistEn using complete recording (4097 samples), average of each 5 s (868 samples) and 1 s (174 samples) epochs over complete recording, respectively. Intriguingly, we found that the performance of average DistEn of 5 s epochs was almost the same as that was found using the complete EEG recording. On the other hand, when using 1 s epochs, DistEn was not only able to detect the difference between interical and ictal EEGs, but also the difference between normal and interictal EEGs. Although the study used 5 and 1 s epoch, it is not true case of short length application, since it was averaged over the complete recording. Therefore, in the current study we will use only one epoch instead. We have decided to use epoch length of 5 s rather than 1 s in this study as we believe that 1 s epoch is too short and other algorithms (such as SampEn) mostly cannot give in valid results.
One important aspect of using a short-length segments from long recording is the selection process of the segment of interest from the recording. Most studies follow the random selection procedure, which presents difficulties with regard to evaluating the reproducibility and generalizability of the technique. Since the choice of data segment mostly affects the feature and hence the overall results, the reliability of the findings becomes questionable. To address this problem, we proposed three segmentation protocols and evaluated results for each of them separately (Li et al., 2015b).
In this study, we compared the performance of the DistEn and SampEn methods for classifying short-length epileptic EEG recordings with a data length of 5 s. Figure 1 shows a block diagram of this study. At first, we collected EEG signals from healthy and epileptic subjects from online database and proposed three protocols for the selection of 5 s EEG signal from complete recording. Then we used the DistEn and SampEn for feature extraction. Finally, we evaluated and compared the classification performance of extracted features among Normal, Interictal, and Ictal groups.
Figure 1

Overall block diagram summarizing steps of this study.
Algorithms of DistEn and SampEn are described in Section Algorithms of DistEn and SampEn. Section Description of EEG Data summarizes the EEG data used in this study. Statistical analysis methods are provided in Section Statistical Analysis. Results are provided in Section Results, followed by discussions in Section Discussion. Conclusions are presented in the last Section.
Materials and methods
Algorithms of DistEn and SampEn
DistEn
DistEn measures the complexity of a time-series through quantifying the amount of information contained in the inter-vector distances of the state space representation of the time-series. Through the evaluation of benchmark data, it has been demonstrated that a time-series with chaotic regime will result in dispersedly distributed inter-vector distances, leading to a larger amount of information, whereas the distribution becomes concentrative for random and periodic time-series, leading to relatively smaller amount of information (Li et al., 2015a). The algorithm for determining DistEn value of a time-series of N samples {u(i), 1 ≤ i ≤ N} can be summarized as follows:
State space reconstruction: Form (N − (m − 1)τ) vectors X(i) by X(i) = {u(i), u(i + τ), …, u(i + (m − 1)τ)}, 1 ≤ i ≤ N − (m − 1)τ. Here m indicates the embedding dimension and τ the time delay.
Distance matrix construction: Compute the inter-vector distances (distances between all possible combinations of X(i) and X(j)) by di, j = max(|u(i + k) − u(j + k)|, 0 ≤ k ≤ m − 1) for all 1 ≤ i, j ≤ N − m. The distance matrix is denoted as D = {di, j}.
Probability density estimation: Estimate the empirical probability density function of the distance matrix D by the histogram approach with a fixed bin number of B. The probability of each bin can be denoted as {pt, t = 1, 2, …, B}. Note here elements with i = j in D are excluded in the estimation.
Calculation: The DistEn value of the time-series {u(i)} can be calculated by the formula for Shannon entropy, that is
SampEn
SampEn provides an estimation of time-series complexity via the quantification of its self-similarity or regularity. By definition, SampEn is the negative natural logarithm of the conditional probability that two vectors (in the state space representation) similar for m points will remain similar at the next point (Richman and Moorman, 2000). For a random time-series, similar vectors of observations will not likely be followed by additional similar observations, thus yielding a higher SampEn. On the contrary, a periodic time-series will have a relatively small SampEn because it contains many repetitive patterns. The following algorithm can be used to determine the SampEn value of a time-series of N points {u(i), 1 ≤ i ≤ N}.
State space reconstruction: Form (N − mτ) vectors X(i) by X(i) = {u(i), u(i + τ), …, u(i + (m − 1)τ)}, 1 ≤ i ≤ N − mτ. Here m indicates the embedding dimension and τ the time delay.
Ranking similar vectors: Define the distance between X(i) and X(j) (1 ≤ i, j ≤ N − mτ, i ≠ j) by di, j = max(|u(i + k) − u(j + k)|, 0 ≤ k ≤ m − 1). Denote the average number of vectors X(j) within r of X(i) (that means di, j ≤ r) for all j = 1, 2, …, N − mτ and j ≠ i to exclude self-matches. Similarly, we define to rank the similarity between vectors with next point added in the comparison. Here r indicates the threshold parameter.
Calculation: The SampEn value of the time-series {u(i)} can be calculated by
Selection of input parameters
DistEn is a function of input parameters m, τ, and B, and SampEn a function of m, τ, and r, as described in the Sections DistEn and SampEn. According to Li et al. (2015a), DistEn is minimally affected by the assignment for B. In this study, we set B = 64 for all DistEn calculations. We used this value for B mainly due to the fact that we are analyzing short-length time-series and a small B-value is adequate for approximating the distribution of the inter-vector distances (Li et al., 2015a). On the other hand, the threshold parameter r is, indeed, a crucial factor for SampEn, since different r-values often lead to large variation in SampEn results (Richman and Moorman, 2000). This parameter-dependence may be further aggravated for a short data set (Yentes et al., 2013). In this study we are not exploring the effect of r on SampEn and therefore set r = 0.15SD (SD indicates the standard deviation of the time-series under consideration) empirically (Richman and Moorman, 2000; Yentes et al., 2013).
The embedding dimension m and time delay τ are important parameters since they together determine whether the state space reconstruction is appropriate or not. Several methods exist for determining the optimal values for m and τ either separately (Fraser and Swinney, 1986; Kennel et al., 1992) or jointly (Gautama et al., 2003). We used a differential entropy based method to determine the optimal m and τ jointly in order to avoid them falling into a non-optimal range (Gautama et al., 2003). Our analysis resulted in an optimal range of [2, 5] for m and [8, 12] for τ. In this study, we used all possible combinations of those m-and τ-values, yielding a total of 20 DistEn/SampEn values for each EEG segment. Details on the determination of m and τ are provided in the Supplementary Material.
Description of EEG data
The EEG data used in this study come from the Bonn database (Andrzejak et al., 2001b). The database is publicly available online (Andrzejak et al., 2001a) and has been widely used in epilepsy and seizure detection research. It is a collection of a total of 500 single-channel EEG recordings of 23.6 s duration each. They are categorized into five groups (sets Z, O, N, F, and S) of 100 recordings each. Sets Z and O consist of surface EEG recordings collected from five healthy volunteers in awake and relaxed state, with their eyes open and closed, respectively, using the standard 10–20 electrode placement scheme. Sets N, F, and S are recorded from five epileptic patients through intracranial electrodes for interictal and ictal activities. Signals in set F are recorded from the epileptogenic zone during seizure-free intervals (interictal activities). Set N also contains only interictal EEG signals, which are recorded from the hippocampal formation of the opposite hemisphere of the brain. Set S contains only signals corresponding to seizure attacks (ictal activities). In this study, we devided the data sets in three groups: (i) Normal–Z, O; (ii) Interictal–N, F; and (iii) Ictal–S.
All EEG recordings are digitized at 173.61 samples per second. Thus, the length of each recording is 173.61 × 23.6 ≈ 4097 samples, which is adequate for achieving robust estimations of entropy. However, in this study we concentrated on short-length EEG with duration of 5 s (173.61 × 5 ≈ 868). In order to select 5 s segment from each EEG recording (23.6 s), we proposed three protocols: (A) The 5 s segment was taken from the earlier stage of recording with its center aligned at the first quartile; (B) It was taken from the middle stage with its center aligned at the median; (C) The segment was taken from the later stage of recording with its center aligned at the third quartile. Figure 2 shows a schematic diagram of the three protocols.
Figure 2

Schematic diagram of the three 5 s segment selection protocols.
Statistical analysis
Statistical analysis include: (i) Mann-Whitney U−test to determine the differences of DistEn and SampEn, respectively, among normal (sets Z and O), interictal (sets N and F), and ictal EEGs as well as differences between each pair; (ii) ROC analysis to determine the classification performance of the two entropy methods for detecting interictal EEG from normal, ictal EEG from normal, and ictal from interictal EEG. All statistical analysis was performed individually for results obtained from each analysing protocol. Statistical significance was accepted at p < 0.01. Statistical analysis was performed using the Matlab R2014b (MathWorks, Natick, Massachusetts, USA).
The Mann-Whitney U-test was employed because the entropy results generally follow a non-normal distribution. p < 0.001 was accepted for statistical significance for parsimony in multiple comparisons. In ROC analysis, the classification performance was evaluated in term of area under the curve (AUC).
Results
DistEn and SampEn results for different EEG groups
Table 1 summarizes results of the 20 DistEn indices for normal, interictal, and ictal EEG groups and Table 2 the SampEn indices, each incorporating results corresponding to all the three analyzing protocols.
Table 1
| Feature Index | A | B | C | ||||||
|---|---|---|---|---|---|---|---|---|---|
| Normal | Interictal | Ictal | Normal | Interictal | Ictal | Normal | Interictal | Ictal | |
| 1†,†,† | 0.85 ± 0.03 | 0.86 ± 0.03c | 0.89 ± 0.05b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.03b | 0.85 ± 0.03 | 0.86 ± 0.04c | 0.89 ± 0.05b |
| 2†,†,† | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.05b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.03b | 0.85 ± 0.03 | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 3†,†,† | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.05b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.03b | 0.85 ± 0.03 | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 4†,†,† | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.05b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.03b | 0.85 ± 0.03 | 0.86 ± 0.03c | 0.88 ± 0.05b |
| 5†,†,† | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.05b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.03b | 0.85 ± 0.03 | 0.86 ± 0.03c | 0.88 ± 0.05b |
| 6†,†,† | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 7†,†,† | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 8†,†,† | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 9†,†,† | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 10†,†,† | 0.85 ± 0.02a | 0.87 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 11†,†,† | 0.85 ± 0.02a | 0.87 ± 0.03c | 0.90 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 12†,†,† | 0.85 ± 0.02a | 0.87 ± 0.03c | 0.90 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 13†,†,† | 0.85 ± 0.02a | 0.87 ± 0.03c | 0.90 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.02c | 0.90 ± 0.04b | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 14†,†,† | 0.85 ± 0.02a | 0.87 ± 0.03c | 0.90 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 15†,†,† | 0.85 ± 0.02a | 0.87 ± 0.03c | 0.90 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.90 ± 0.04b | 0.85 ± 0.03a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 16†,†,† | 0.84 ± 0.02a | 0.87 ± 0.03c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.02c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 17†,†,† | 0.84 ± 0.02a | 0.87 ± 0.03c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.02c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 18†,†,† | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.02c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.05b |
| 19†,†,† | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.02c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.88 ± 0.05b |
| 20†,†,† | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.02c | 0.89 ± 0.04b | 0.84 ± 0.02a | 0.86 ± 0.03c | 0.88 ± 0.05b |
DistEn of normal, interictal, and ictal EEG recordings.
Values are given in Median ± IQR (inter quartile range). Results corresponding to different analyzing protocols are listed in columns A, B, and C, respectively.
†: p < 0.001.
Interictal significantly different (p < 0.001) from Normal.
Ictal significantly different (p < 0.001) from Normal.
Ictal significantly different (p < 0.001) from Interictal. The p-values are indicated by Mann–Whitney U-test. Strings next to feature index indicate the significant levels for all three analyzing protocols, e.g., the string
indicates p < 0.001 for protocols A, B, and C. Feature index 1, 2, … , 20 indicate the [m, τ] combinations [2, 8], [2, 9], … .,[5, 12], respectively.
Table 2
| Feature Index | A | B | C | ||||||
|---|---|---|---|---|---|---|---|---|---|
| Normal | Interictal | Ictal | Normal | Interictal | Ictal | Normal | Interictal | Ictal | |
| 1†,†,† | 2.32 ± 0.18a | 1.92 ± 0.38 | 1.67 ± 0.49b | 2.29 ± 0.23a | 1.89 ± 0.39 | 1.76 ± 0.44b | 2.29 ± 0.21a | 1.89 ± 0.41 | 1.84 ± 0.58b |
| 2†,†,† | 2.34 ± 0.21a | 1.92 ± 0.38 | 1.72 ± 0.50b | 2.31 ± 0.18a | 1.90 ± 0.40 | 1.81 ± 0.48b | 2.30 ± 0.16a | 1.90 ± 0.47 | 1.83 ± 0.53b |
| 3†,†,† | 2.35 ± 0.17a | 1.94 ± 0.37 | 1.78 ± 0.47b | 2.31 ± 0.15a | 1.93 ± 0.38 | 1.82 ± 0.44b | 2.32 ± 0.15a | 1.94 ± 0.42 | 1.84 ± 0.46b |
| 4†,†,† | 2.34 ± 0.16a | 1.93 ± 0.37 | 1.81 ± 0.42b | 2.31 ± 0.16a | 1.91 ± 0.38 | 1.87 ± 0.41b | 2.32 ± 0.15a | 1.95 ± 0.44 | 1.89 ± 0.47b |
| 5†,†,† | 2.33 ± 0.17a | 1.92 ± 0.34 | 1.83 ± 0.41b | 2.32 ± 0.17a | 1.92 ± 0.37 | 1.88 ± 0.39b | 2.31 ± 0.17a | 1.93 ± 0.39 | 1.93 ± 0.48b |
| 6†,†,† | 2.16 ± 0.38a | 1.38 ± 0.50 | 1.43 ± 0.53b | 2.11 ± 0.38a | 1.33 ± 0.51 | 1.46 ± 0.42b | 2.10 ± 0.32a | 1.36 ± 0.54 | 1.50 ± 0.63b |
| 7†,†,† | 2.17 ± 0.38a | 1.35 ± 0.49 | 1.46 ± 0.53b | 2.11 ± 0.34a | 1.30 ± 0.48 | 1.45 ± 0.46b | 2.14 ± 0.36a | 1.38 ± 0.59 | 1.55 ± 0.58b |
| 8†,†,† | 2.15 ± 0.35a | 1.33 ± 0.48 | 1.50 ± 0.48b | 2.16 ± 0.34a | 1.32 ± 0.55 | 1.54 ± 0.43b | 2.13 ± 0.35a | 1.38 ± 0.52 | 1.52 ± 0.57b |
| 9†,†,† | 2.20 ± 0.34a | 1.29 ± 0.48 | 1.54 ± 0.51b | 2.12 ± 0.34a | 1.32 ± 0.45 | 1.56 ± 0.42b | 2.12 ± 0.34a | 1.34 ± 0.52 | 1.56 ± 0.61b |
| 10†,†,† | 2.18 ± 0.35a | 1.31 ± 0.52 | 1.55 ± 0.54b | 2.16 ± 0.34a | 1.29 ± 0.47 | 1.57 ± 0.35b | 2.18 ± 0.33a | 1.34 ± 0.57 | 1.60 ± 0.57b |
| 11~20 | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined | Undefined |
SampEn of normal, interictal, and ictal EEG recordings.
Values are given in Median ± IQR (inter quartile range). Results corresponding to different analyzing protocols are listed in columns A, B, and C, respectively.
† p < 0.001.
Interictal significantly different (p < 0.001) from Normal.
Ictal significantly different (p < 0.001) from Normal.
cIctal significantly different (p < 0.001) from Interictal. The p-values are indicated by Mann–Whitney U-test. Strings next to feature index indicate the significant levels for all three analyzing protocols, e.g., the string
indicates p < 0.001 for protocols A, B, and C. Feature index 1, 2, … , 20 indicate the [m, τ] combinations [2, 8], [2, 9], … ., [5, 12], respectively.
Significant differences among the three classes (normal, interictal, and ictal) are indicated by all the 20 DistEn indices (all p < 0.001). In contrast, only the first 10 SampEn indices (that means, SampEn with combinations of [m, τ] = [2, 8], [2, 9], [2, 10], [2, 11], [2, 12], [3, 8], [3, 9], [3, 10], [3, 11], and [3, 12]) show significant differences. The SampEn indices for the rest of the combinations of [m, τ] are undefined. More importantly, SampEn indices remained undefined for all three segmentation protocols (Table 2) and both DistEn and SampEn showed similar results over three protocols as indicated by Tables 1, 2.
Additionally, all the 20 DistEn indices show significant differences between each pair of classes except the first one DistEn index for protocol A and the first five indices for protocol C. Significant differences between normal and interictal EEGs, as well as between normal and ictal EEGs, are shown by the first 10 SampEn indices for all the three protocols. However, difference between interictal and ictal classes indicated by SampEn indices are statistically insignificant.
The Median values of DistEn are almost unchanged in all three groups (Normal, Interictal, and Ictal) over three 5 s segmentation protocols (A, B, and C) at all combinations of [m, τ] (Table 1). On the other hand, SampEn shows relative higher fluctuations in Median values across three protocols than DistEn (Tables 1, 2). Besides, the inter quartile range (IQR) values of DistEn are also obviously smaller than SampEn.
ROC analysis results
ROC curves for DistEn and SampEn indices that have significant difference between classes (see Tables 1, 2) under different protocols are shown in Supplementary Figures 1, 3, respectively. Figure 3 summarizes the AUC values for all those ROC curves.
Figure 3

Area under the ROC curves (AUC) of three different protocols (see Supplementary Figures 1–3 for detailed ROC results). Bars show AUC of ROC curves with feature indices from 1 to 20. Error bars show the average and standard deviation of the corresponding AUC sets (undefined values excluded).
DistEn indices achieve an average AUC of 0.71 and maximum of 0.78 for detecting interictal EEG from normal (protocol A). Although SampEn indices are not always defined, they achieve an average AUC of 0.95 and maximum of 0.97 when only those valid features are taken into consideration. For detecting ictal EEG from normal, DistEn indices achieve an average AUC of 0.90 and maximum of 0.92 whereas SampEn of 0.95 on average and 0.96 for maximum (only valid SampEn indices are taken into consideration). Finally, for detecting ictal EEG from interictal, DistEn indices achieve an average AUC of 0.80 and maximum of 0.82; SampEn fails in this classification task because all SampEn indices show no significant difference between these two classes (see Table 2). No remarkable difference is found when comparing results obtained under different protocols (see Table 3).
Table 3
| Protocol | Method | Normal vs. Interictal | Normal vs. Ictal | Interictal vs. Ictal | |||
|---|---|---|---|---|---|---|---|
| Average | Maximum | Average | Maximum | Average | Maximum | ||
| A | DistEn | 0.71 | 0.78 | 0.90 | 0.92 | 0.80 | 0.82 |
| SampEn | 0.95 | 0.97 | 0.95 | 0.96 | - | - | |
| B | DistEn | 0.71 | 0.76 | 0.90 | 0.91 | 0.82 | 0.85 |
| SampEn | 0.93 | 0.96 | 0.95 | 0.96 | - | - | |
| C | DistEn | 0.66 | 0.70 | 0.85 | 0.87 | 0.76 | 0.78 |
| SampEn | 0.93 | 0.95 | 0.93 | 0.95 | - | - | |
Average and maximum Area under the ROC curves (AUC) values of DistEn and SampEn for different classification problems under different analyzing protocols.
Bold indicates the maximum AUC for each of the three classification problems.
Discussion
In this work, we used the DistEn and SampEn methods to analyze short-length, specifically 5 s, EEG recordings with the aim of detecting interictal and ictal EEG timely and accurately. Although both methods showed the capability of differentiating one or more classes from others, differences in their performance were indicated. Besides, we found that results from one entropy method mostly complement the other:
DistEn showed acceptable performance for all the three classification problems with high AUC values (see Figure 3). SampEn failed to distinguish ictal EEG from interictal, but it has shown good performance on the other two tasks. Furthermore, for these two problems SampEn was superior to DistEn as evidenced by a demonstrable increase in AUC values.
SampEn was not always defined. For a half of our used combinations of input m and τ, SampEn frequently yielded invalid results (see Table 2). Whenever defined, SampEn has showed extremely high AUC values for detecting interictal EEG from normal and detecting ictal EEG for optimal combinations of [m, τ]. Thus, it is important to define the optimal range of m and τ in order to use SampEn as classification feature.
DistEn was shown to be independent of input parameters m and τ, as can be seen from the relatively unchanged median and considerably smaller IQR values with the variation of m and τ (see Table 1). This property of DistEn can greatly facilitate clinical practices since the selection of input parameters is usually highly intractable.
With optimal m and τ parameters, we can obtain an AUC of 0.97 for detecting interictal EEG from normal, 0.96 for detecting ictal EEG from normal, and 0.85 for detecting ictal EEG from interictal (see Table 3). Our results indicate that accurate detection of interictal and ictal seizure phases using short-length EEG recordings is possible by combining DistEn and SampEn analysis.
In this study we used three 5 s EEG segment selection protocols, which were non-overlapping. However, the performance of both DistEn and SampEn did not show remarkable differences with different protocols (see Tables 1, 2, Figure 3, and Table 3, respectively). This indicates a good reproducibility and consistency for these methods.
DistEn showed significant differences between normal and ictal EEGs as well as between interictal and ictal EEGs which is in accordance with our previous findings based on averaging 5 s epochs (Li et al., 2015b), suggesting that the use of a 5 s epoch is reasonable. Intriguingly, DistEn also demonstrated acceptable performance for detecting interictal EEG from normal, which cannot be achieved by the previous analysis (Li et al., 2015b). Since in that study we used the default selection of m and τ (m = 2, τ = 1), the results reported here could well underline the importance of searching optimal input parameters. This is also another noticeable difference of our study compared with some previous publications (Srinivasan et al., 2007; Ocak, 2009; Guo et al., 2010; Yuan et al., 2011; Acharya et al., 2012, 2015; Song et al., 2012; Yentes et al., 2013; Kumar et al., 2014a,b) additional to short-length analysis. Although DistEn has shown rather stable outputs with the variation of [m, τ], SampEn did change considerably and even with some combinations of [m, τ] SampEn could not yield valid results as we have mentioned above. Besides, Yuan et al. (2008) have demonstrated that the embedding dimension of EEG signals during seizure changes and becomes different from that of normal EEG signals; the embedding dimension varies intensively during seizure, whereas keeps stable for normal EEG signals. Thus, applying a constant embedding dimension m may not be proper. By searching for optimal values and using different [m, τ] combinations, our methods may also have a good generalization capability because the sampling frequency varies much for different EEG databases or clinical EEG data which will affect the reconstruction in state space and is important deterministic factor to define m and τ (Thuraisingham and Gottwald, 2006; Govindan et al., 2007). Methods of some previous publications based on SampEn (Acharya et al., 2012, 2015; Song et al., 2012; Shen et al., 2013; Xiang et al., 2015), although having indicated good performance, may deserve further discussions when transplanted to other EEG data.
The brain exhibits randomness in normal state and changes to deterministic dynamics during ictal state. Our results indicated decreased SampEn and increased DistEn during both interictal and ictal EEG data relative to normal state. Since SampEn increases with the randomness or irregularity of a time-series (Richman and Moorman, 2000; Costa et al., 2005), higher SampEn in normal EEG data can be attributed to the stochastic dynamics of EEG signals. On the other hand, DistEn is reported to increase in non-linear deterministic dynamics (Li et al., 2015a) and thus, the increased DistEn in epileptic EEG signals can be attributed to the shift to deterministic dynamics in seizure activity. Therefore, although DistEn and SampEn have shown different variation directions, they represent two different characteristics of the signal.
One limitation of our study is that we did not try different values for the threshold parameter r for SampEn analysis. There may be an r-value that can support the capability of SampEn to detect ictal EEG from interictal. However, in clinical practice, it is impossible to change the r-values since it makes the results incomparable. Another limitation is that we did not test the generalization capability of our method, although actually it is expected to be good as we have mentioned above, by applying on other EEG databases.
Conclusion
Interical and ictal phases of epileptic seizure can be detected using short-length EEG data of 5 s length. The SampEn method is more sensitive to the detection of epileptic EEG (including interical and ictal phases) from normal whereas the DistEn method is sensitive to not only the detection of epileptic EEG from normal but also the detection of ictal EEG from interictal. Through SampEn analysis a maximum AUC value of 0.97 was achieved for detecting interictal EEG from normal, and of 0.96 for detecting ictal EEG from normal. A maximum AUC of 0.85 was achieved by DistEn analysis for detecting ictal EEG from interictal. The results of this study have shown that real time detection of epileptic seizure is possible using portable EEG amplifiers.
Statements
Author contributions
PL, CK, MP, and CL contributed to the conceptualization of the study. PL, CK, and CY analyzed the data and interpreted the results. PL drafted the manuscript. CK and PL critically revised the significant intellectual content of the work. All authors approved the final version of the manuscript.
Acknowledgments
PL acknowledges financial support of Shandong Provincial Natural Science Foundation ZR2015FQ016, The International Postdoctoral Exchange Fellowship of China, and China Postdoctoral Science Foundation 2014M561933. CK acknowledges financial support of Deakin University (Alfred Deakin Postdoctoral Fellowship). CL acknowledges financial support of National Natural Science Foundation 61471223.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Supplementary material
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fphys.2016.00136
References
1
AcharyaU. R.FujitaH.SudarshanV. K.BhatS.KohJ. E. W. (2015). Application of entropies for automated diagnosis of epilepsy using EEG signals: a review. Knowl. Based Syst.88, 85–96. 10.1016/j.knosys.2015.08.004
2
AcharyaU. R.MolinariF.SreeS. V.ChattopadhyayS.NgK.-H.SuriJ. S. (2012). Automated diagnosis of epileptic EEG using entropies. Biomed. Signal Process. Control7, 401–408. 10.1016/j.bspc.2011.07.007
3
AcharyaU. R.Vinitha SreeS.SwapnaG.MartisR. J.SuriJ. S. (2013). Automated EEG analysis of epilepsy: a review. Knowl. Based Syst.45, 147–165. 10.1016/j.knosys.2013.02.014
4
AlamS. M. S.BhuiyanM. I. H. (2013). Detection of seizure and epilepsy using higher order statistics in the EMD domain. IEEE J. Biomed. Health Inform.17, 312–318. 10.1109/JBHI.2012.2237409
5
AndrzejakR. G.LehnertzK.MormannF.RiekeC.DavidP.ElgerC. E. (2001a). EEG Time Series Download Page [Online]. Department of Epileptology, Bonn University. Available online at: http://epileptologie-bonn.de/cms/front_content.php?idcat=193&lang=3&changelang=3 (Accessed: October 3, 2015).
6
AndrzejakR. G.LehnertzK.MormannF.RiekeC.DavidP.ElgerC. E. (2001b). Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys. Rev. E64:061907. 10.1103/PhysRevE.64.061907
7
ChenW.-T.ZhuangJ.YuW.-X.WangZ.-Z. (2009). Measuring complexity using FuzzyEn, ApEn, and SampEn. Med. Eng. Phys.31, 61–68. 10.1016/j.medengphy.2008.04.005
8
ChuaK.ChandranV.AcharyaU. R.LimC. M. (2011). Application of higher order spectra to identify epileptic EEG. J. Med. Syst.35, 1563–1571. 10.1007/s10916-010-9433-z
9
CostaM.GoldbergerA. L.PengC. K. (2005). Multiscale entropy analysis of biological signals. Phys. Rev. E71:021906. 10.1103/PhysRevE.71.021906
10
FraserA. M.SwinneyH. L. (1986). Independent coordinates for strange attractors from mutual information. Phys. Rev. A33, 1134–1140. 10.1103/PhysRevA.33.1134
11
GautamaT.MandicD. P.Van HulleM. M. (2003). A differential entropy based method for determining the optimal embedding parameters of a signal, in Proceedings, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (Hong Kong), 29–32.
12
GovindanR. B.WilsonJ. D.EswaranH.LoweryC. L.PreißlH. (2007). Revisiting sample entropy analysis. Phys. A Statis. Mech. Appl.376, 158–164. 10.1016/j.physa.2006.10.077
13
GuoL.RiveroD.PazosA. (2010). Epileptic seizure detection using multiwavelet transform based approximate entropy and artificial neural networks. J. Neurosci. Methods193, 156–163. 10.1016/j.jneumeth.2010.08.030
14
KannathalN.ChooM. L.AcharyaU. R.SadasivanP. K. (2005). Entropies for detection of epilepsy in EEG. Comput. Methods Programs Biomed.80, 187–194. 10.1016/j.cmpb.2005.06.012
15
KennelM. B.BrownR.AbarbanelH. D. (1992). Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys. Rev. A45, 3403–3411. 10.1103/PhysRevA.45.3403
16
KumarY.DewalM. L.AnandR. S. (2014a). Epileptic seizure detection using DWT based fuzzy approximate entropy and support vector machine. Neurocomputing133, 271–279. 10.1016/j.neucom.2013.11.009
17
KumarY.DewalM. L.AnandR. S. (2014b). Epileptic seizures detection in EEG using DWT-based ApEn and artificial neural network. Signal Image Video Process.8, 1323–1334. 10.1007/s11760-012-0362-9
18
LiJ.YanJ.LiuX.OuyangG. (2014). Using permutation entropy to measure the changes in EEG signals during absence seizures. Entropy16, 3049. 10.3390/e16063049
19
LiP.LiuC.LiK.ZhengD.LiuC.HouY. (2015a). Assessing the complexity of short-term heartbeat interval series by distribution entropy. Med. Biol. Eng. Comput.53, 77–87. 10.1007/s11517-014-1216-0
20
LiP.YanC.KarmakarC.LiuC. (2015b). Distribution entropy analysis of epileptic EEG signals, in The 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (Milan), 4170–4173.
21
LiY.LiP.KarmakarC.LiuC. (2015). Distribution entropy for short-term QT interval variability analysis: a comparison between the heart failure and normal control groups, in Computing in Cardiology, ed MurrayA. (Nice), 1153–1156. (In press).
22
Media-Center (2015). Epilepsy [Online]. World Health Organization. Available online at: http://www.who.int/mediacentre/factsheets/fs999/en/ (Accessed on September 27, 2015).
23
MeierR.DittrichH.Schulze-BonhageA.AertsenA. (2008). Detecting epileptic seizures in long-term human EEG: a new approach to automatic online and real-time detection and classification of polymorphic seizure patterns. J. Clin. Neurophysiol.25, 119–131. 10.1097/WNP.0b013e3181775993
24
MinasyanG. R.ChattenJ. B.ChattenM. J.HarnerR. N. (2010). Patient-specific early seizure detection from scalp EEG. J. Clin. Neurophysiol.27, 163–178. 10.1097/WNP.0b013e3181e0a9b6
25
NicolaouN.GeorgiouJ. (2012). Detection of epileptic electroencephalogram based on permutation entropy and support vector machines. Expert Syst. Appl.39, 202–209. 10.1016/j.eswa.2011.07.008
26
NINDS (2015). The Epilepsies and Seizures: Hope through Research [Online]. National Institute of Health. Available online at: http://www.ninds.nih.gov/disorders/epilepsy/detail_epilepsy.htm (Accessed on September 30, 2015).
27
OcakH. (2009). Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy. Expert Syst. Appl.36, 2027–2036. 10.1016/j.eswa.2007.12.065
28
PolatK.GüneşS. (2007). Classification of epileptiform EEG using a hybrid system based on decision tree classifier and fast Fourier transform. Appl. Math. Comput.187, 1017–1026. 10.1016/j.amc.2006.09.022
29
RichmanJ. S.MoormanJ. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol.278, H2039–H2049.
30
ShenC.-P.LiuS.-T.ZhouW.-Z.LinF.-S.LamA. Y.-Y.SungH.-Y.et al. (2013). A physiology-based seizure detection system for multichannel EEG. PLoS ONE8:e65862. 10.1371/journal.pone.0065862
31
SirvenJ. I.ShaferP. O. (2014). What is Epilepsy? [Online]. Epilepsy Foundation. Available online at: http://www.epilepsy.com/learn/epilepsy-101/what-epilepsy (Accessed on September 27, 2015).
32
SongY.CrowcroftJ.ZhangJ. (2012). Automatic epileptic seizure detection in EEGs based on optimized sample entropy and extreme learning machine. J. Neurosci. Methods210, 132–146. 10.1016/j.jneumeth.2012.07.003
33
SrinivasanV.EswaranC.SriraamN. (2007). Approximate entropy-based epileptic EEG detection using artificial neural networks. IEEE Trans. Inform. Technol. Biomed.11, 288–295. 10.1109/TITB.2006.884369
34
ThuraisinghamR. A.GottwaldG. A. (2006). On multiscale entropy analysis for physiological data. Phys. A Statis. Mech. Appl.366, 323–332. 10.1016/j.physa.2005.10.008
35
TzallasA. T.TsipourasM. G.FotiadisD. I. (2009). Epileptic seizure detection in EEGs using time-frequency analysis. IEEE Trans. Inform. Technol. Biomed.13, 703–710. 10.1109/TITB.2009.2017939
36
UdhayakumarR. K.KarmakarC.LiP.PalaniswamiM. (2015). Effect of data length and bin numbers on distribution entropy (DistEn) measurement in analyzing healthy aging, in The 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (Milan), 7877–7880.
37
XiangJ.LiC.LiH.CaoR.WangB.HanX.et al. (2015). The detection of epileptic seizure signals based on fuzzy entropy. J. Neurosci. Methods243, 18–25. 10.1016/j.jneumeth.2015.01.015
38
XieH.-B.HeW.-X.LiuH. (2008). Measuring time series regularity using nonlinear similarity-based sample entropy. Phys. Lett. A372, 7140–7146. 10.1016/j.physleta.2008.10.049
39
YentesJ.HuntN.SchmidK.KaipustJ.McGrathD.StergiouN. (2013). The appropriate use of approximate entropy and sample entropy with short data sets. Ann. Biomed. Eng.41, 349–365. 10.1007/s10439-012-0668-3
40
YuanQ.ZhouW.LiS.CaiD. (2011). Epileptic EEG classification based on extreme learning machine and nonlinear features. Epilepsy Res.96, 29–38. 10.1016/j.eplepsyres.2011.04.013
41
YuanY.LiY.MandicD. P. (2008). A comparison analysis of embedding dimensions between normal and epileptic EEG time series. J. Phys. Sci.58, 239–247. 10.2170/physiolsci.RP004708
42
ZhangY.DongZ.PhillipsP.WangS.JiG.YangJ.et al. (2015). Detection of subjects and brain regions related to Alzheimer's disease using 3D MRI scans based on eigenbrain and machine learning. Front. Comput. Neurosci.9:66. 10.3389/fncom.2015.00066
Summary
Keywords
electroencephalogram (EEG), epileptic seizure, distribution entropy (DistEn), sample entropy (SampEn), short-length EEG analysis
Citation
Li P, Karmakar C, Yan C, Palaniswami M and Liu C (2016) Classification of 5-S Epileptic EEG Recordings Using Distribution Entropy and Sample Entropy. Front. Physiol. 7:136. doi: 10.3389/fphys.2016.00136
Received
26 November 2015
Accepted
29 March 2016
Published
14 April 2016
Volume
7 - 2016
Edited by
Zbigniew R. Struzik, The University of Tokyo, Japan
Reviewed by
Arun V. Holden, University of Leeds, UK; Daniel Goldman, The University of Western Ontario, Canada; Piotr Suffczynski, University of Warsaw, Poland
Updates
Copyright
© 2016 Li, Karmakar, Yan, Palaniswami and Liu.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Peng Li pli@sdu.edu.cn;
This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology
Disclaimer
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.