Integrative EEG biomarkers predict progression to Alzheimer's disease at the MCI stage

Alzheimer's disease (AD) is a devastating disorder of increasing prevalence in modern society. Mild cognitive impairment (MCI) is considered a transitional stage between normal aging and AD; however, not all subjects with MCI progress to AD. Prediction of conversion to AD at an early stage would enable an earlier, and potentially more effective, treatment of AD. Electroencephalography (EEG) biomarkers would provide a non-invasive and relatively cheap screening tool to predict conversion to AD; however, traditional EEG biomarkers have not been considered accurate enough to be useful in clinical practice. Here, we aim to combine the information from multiple EEG biomarkers into a diagnostic classification index in order to improve the accuracy of predicting conversion from MCI to AD within a 2-year period. We followed 86 patients initially diagnosed with MCI for 2 years during which 25 patients converted to AD. We show that multiple EEG biomarkers mainly related to activity in the beta-frequency range (13–30 Hz) can predict conversion from MCI to AD. Importantly, by integrating six EEG biomarkers into a diagnostic index using logistic regression the prediction improved compared with the classification using the individual biomarkers, with a sensitivity of 88% and specificity of 82%, compared with a sensitivity of 64% and specificity of 62% of the best individual biomarker in this index. In order to identify this diagnostic index we developed a data mining approach implemented in the Neurophysiological Biomarker Toolbox (http://www.nbtwiki.net/). We suggest that this approach can be used to identify optimal combinations of biomarkers (integrative biomarkers) also in other modalities. Potentially, these integrative biomarkers could be more sensitive to disease progression and response to therapeutic intervention.


INTRODUCTION
With the increasing average age of the population in the developed world, dementia is becoming a major healthcare problem. Alzheimer's disease is the most common form of dementia, and the gold standard for diagnosis is the post-mortem identification of amyloid beta 42 depositions and tangles (Blennow et al., 2006; Herrup, 2010). It has been suggested that Alzheimer's disease begins years, maybe even decades, before actual cognitive symptoms appear (Sperling et al., 2011). However, normal ageing is also characterized by a slow decline of cognitive functions, which means it can be difficult to disentangle normal ageing from Alzheimer's disease at a very early stage.
Patients with mild cognitive impairment (MCI) are at high risk of developing Alzheimer's disease. The label MCI is given when there is a cognitive complaint (mostly memory) that can also be demonstrated on formal testing, while general cognitive functioning is relatively intact and the patient is still living independently (Flicker et al., 1991; Gauthier et al., 2006; Albert et al., 2011). Therapies that stop the conversion to Alzheimer's disease unfortunately remain to be developed, but it is likely that such drugs or therapies will appear in the future (Prins et al., 2010; Huang and Mucke, 2012). It is plausible that these therapies will be most effective before major brain damage has occurred, and it is therefore important to develop biomarkers sensitive to this very early stage (Sperling et al., 2011). Early-stage identification may also help the development of new treatments that are more effective at this stage, as it can facilitate monitoring of the response to the intervention.
We here focus on biomarkers obtained from electroencephalography (EEG) recordings in the eyes-closed resting state (ECR). EEG biomarkers are well suited for screening purposes because the EEG recording can be obtained using relatively cheap and non-invasive equipment, which is widely available and fast to use. Several previous EEG studies of conversion from mild cognitive impairment to Alzheimer's disease have been conducted (Jelic et al., 1996; Huang et al., 2000; Stam et al., 2003; Schoonenboom et al., 2004; Rombouts et al., 2005; Babiloni et al., 2006, 2011; Kwak, 2006; Rossini et al., 2006, 2008; Lehmann et al., 2007; Moretti et al., 2007a,b, 2008, 2011; Luckhaus et al., 2008), mainly using biomarkers such as spectral measures and synchronization between brain regions. Machine-learning techniques have been used to explore differences between MCI and AD with varying success (Huang et al., 2000; Bennys et al., 2001; Prichep et al., 2006; Buscema et al., 2007; Lehmann et al., 2007; Prichep, 2007; Rossini et al., 2008); however, only a few studies have tried to predict the conversion from MCI to AD (Prichep et al., 2006; Prichep, 2007; Antila et al., 2013). Many studies focus on a small number of biomarkers (on the order of 15 marker values), and some do not have adequate validation of their results on independent groups. We perform large-scale data mining of multiple biomarkers (Figure 1A) and validate our results on an independent group of subjects.
Our focus is on the EEG measured as part of the initial hospital intake test, combined with longitudinal recordings measured 1 year after the initial intake test. We have mapped several classical EEG biomarkers, such as frequency and power, but also nonclassical biomarkers such as detrended fluctuation analysis and oscillation burst analysis (Poil et al., 2008; Montez et al., 2009). By combining several biomarkers, it is often possible to find better separation boundaries between two groups (Figure 1C), because each biomarker gives additional information (Lehmann et al., 2007). In this longitudinal study we show that EEG biomarkers from the initial hospital intake test can be used retrospectively in a classifier algorithm to predict the diagnosis that the patient obtained within the subsequent 2 years.

FIGURE 1 | An integrative approach toward improved prediction of mild cognitive impairment to Alzheimer's disease conversion. (A) Diagram of processing flow. We calculated biomarkers on the second-year EEG recording and then mapped all potential differences between MCI and AD using Student's t-test (Difference map). Next, we performed logistic regression on each single biomarker. Biomarkers with the best single-classification power were seeded to a genetic search algorithm; this algorithm further optimized the combined biomarker set. This biomarker set was then used to predict MCI to AD conversion based on the first-year EEG recording. To evaluate the lower bound on the classification, half-split cross-validation was performed. Finally, the outcome performance was evaluated on the first EEG recording. (B) Overview of how the MCI patient cohort splits into AD, another diagnosis, or remaining MCI 1 or 2 years after the intake. (C) The integration of multiple biomarkers can reveal hidden separation boundaries. Here, we show two simulated biomarkers where the red and blue groups overlap if we only consider the single biomarkers. By combining the biomarkers, we see a clear separation boundary at the diagonal. Classification algorithms aim to identify this boundary, and use it to predict group association for new data.

SUBJECTS
The study involved 86 mild cognitive impairment (MCI) subjects who were referred to the Alzheimer Center at the VU University Medical Center in Amsterdam, the Netherlands (Figure 1B). Upon the first visit at the Alzheimer Center, all subjects underwent a thorough 1-day examination consisting of history taking, physical and neurological assessment, neuropsychological testing including the Mini Mental State Examination (MMSE) (Folstein et al., 1975), laboratory tests, structural magnetic resonance imaging (MRI), and a routine electroencephalogram (EEG). After reviewing the clinical and ancillary imaging data, a multidisciplinary team established a consensus-based final diagnosis for each patient. The initial diagnosis of MCI was based on the criteria of Petersen et al. (1999), consisting of (a) objective memory impairment as seen during neuropsychological evaluation, defined by performance ≥ 1.5 standard deviations below the mean value of education- and age-matched controls, (b) normal activities of daily living, and (c) a Clinical Dementia Rating score of 0.5 (Hughes et al., 1982).
All MCI subjects were followed up clinically for a median period of 709 days (95% confidence interval [537, 779]; 1.9 years). The clinical follow-up included medical history and functional status assessment re-examination in order to measure potential changes in the cognitive domain. MCI subjects who showed steady or enhanced cognitive functioning (but still fulfilled the criteria for MCI) during re-assessment were considered MCI-stable, while MCI subjects who showed worsened cognitive functioning and fulfilled the NINDS-ADRDA criteria (McKhann et al., 1984) for a diagnosis of Alzheimer's disease were considered to belong to the AD-converter group. Exclusion criteria were previous head trauma, history of neurological or psychiatric disease, or use of psychotropic medications. Patients progressing from MCI to disorders other than Alzheimer's disease (n = 22) were excluded from the analyses reported here. These patients progressed to: "Subjective complaints" (n = 9), possible Alzheimer's disease (n = 1), frontal lobe dementia (n = 1), vascular dementia (n = 3), Lewy body dementia (n = 1), dementia other (n = 2), psychiatric (n = 2), or another neurological disorder (n = 3). The measurements were approved by the Ethics Committee of the VU University Medical Center, and were in accordance with the Helsinki declaration. All subjects signed an informed consent.

BIOMARKERS AND PROCESSING FLOW
The Neurophysiological Biomarker Toolbox (NBT) (http://www.nbtwiki.net/) was used to organize, analyse, and calculate all biomarkers in this study. An EEG biomarker is a quantitative measure derived from the EEG, e.g., the dominant frequency of the beta frequency band (13-30 Hz), to be used as a diagnostic or prognostic predictor of disease (Figure 2). We extracted 177 biomarkers from each EEG trace. We decided to focus on biomarkers we have had good experiences with in other studies, and acknowledge that many more biomarkers could have been selected.
Based on the broadband signal, we computed 28 biomarkers, namely: Hjorth's activity, mobility, and complexity parameters (Hjorth, 1970); time domain parameters (Goncharova and Barlow, 1990); Wackermann's global field strength, global frequency, and spatial complexity (Wackermann, 1999); Barlow's amplitude, frequency, and spectral purity (Goncharova and Barlow, 1990); alpha peak frequency, peak width, and power corrected for 1/f baseline (Poil et al., 2011), with the same parameters computed for double alpha peaks when applicable; the alpha-theta transition point (Klimesch, 1999); and beta peak frequency (Figure 2), width, and power corrected for 1/f baseline (Van Aerde et al., 2009), with the same for a second beta peak if present. Frequency stability was evaluated using different methods: by the standard deviation and interquartile range of the central frequency and of the maximum wavelet frequency calculated in windows, and by the distribution parameters of the phase values above zero and of the number of oscillation cycle peaks per window.
FIGURE 2 | An EEG biomarker is a quantitative measure derived from the EEG, for example the beta peak frequency. (A) Time-frequency plot (TF-plot, wavelet) of 6 s of an eyes-closed rest EEG signal (from Pz). The color shows the power. Low-amplitude bursts in the beta-frequency band (13-30 Hz) not directly coupled with the strong alpha are observed. (B) The raw EEG signal used to calculate the TF-plot in (A). Clear and strong alpha (8-13 Hz) oscillations are observed. (C) Zooming in, we observe small peaks in the space between the strong alpha oscillation peaks, which correspond to the beta oscillations. (D) The power spectrum of the full-length EEG signal reveals a beta peak (left). To find the beta peak we first fit a 1/f baseline (right); next we fit a Gaussian to the small beta peak. We now have four biomarkers: beta peak frequency, beta peak width, beta peak corrected power (i.e., minus 1/f baseline), and peak uncorrected power.

For each of the classical frequency bands, delta (1-3 Hz), theta (4-7 Hz), alpha (8-13 Hz), beta (13-30 Hz), and gamma (30-45 Hz), we computed 13 biomarkers. The amplitude envelope was extracted using the Hilbert transform and characterized extensively. We calculated the Spearman correlations of amplitude envelopes in different channels. The distribution of amplitude values was characterized by kurtosis, skewness, interquartile range, median, range, and variance. Furthermore, detrended fluctuation analysis characterizing long-range temporal correlations (Linkenkaer-Hansen et al., 2001; Hardstone et al., 2012; Poil et al., 2012), multifractality spectral width (Kantelhardt et al., 2002; Ihlen, 2012), and the 95th percentile durations and sizes of oscillation bursts (Montez et al., 2009; Poil et al., 2011) were calculated on the amplitude envelope. The instantaneous phase was also extracted using the Hilbert transform, and the 95th percentile duration and size of the stable phase bursts (a phase burst is defined as the period between phase slips) were calculated. In addition, we computed 7 biomarkers for all frequency bands and for individualized frequency bands, the latter defined relative to the individually defined alpha peak frequency (APF) as Alpha1: (APF-4 to APF-2) Hz, Alpha2: (APF-2 to APF) Hz, Alpha3: (APF to APF+2) Hz, and Beta: (APF+2 to 30) Hz (Klimesch, 1999). These 7 biomarkers were absolute power, relative power, and power ratios, as well as the central frequency, power in the central frequency, bandwidth, and spectral edge (Vural and Yildiz, 2010; O'Gorman et al., 2013). In total, we extracted 177 biomarker values from each EEG trace (Table 1). Next, we performed data mining on these biomarkers based on the second EEG recording (Figure 1A), to identify biomarkers that reached a significance level of p < 0.05 (Student's t-test) for the comparison of stable MCI vs. AD-converters (based on the diagnosis after 2 years). We here use Student's t-test because this test has the best statistical power in most cases under the assumption of normally distributed biomarker values. The biomarkers were tested per channel, and a binomial multiple-comparison correction was performed (Poil et al., 2011). The binomial multiple-comparison correction tests whether a significant number of channels is found (i.e., 3 or more channels, p < 0.05).
The performance of two different classification algorithms (see below for details) in integrating significant biomarkers into a diagnostic index was then tested using their median values across significant channels.
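The individualized frequency bands described in this section are simple offsets from each subject's alpha peak frequency (APF). A minimal Python sketch of that bookkeeping follows; the function name `individualized_bands` and the dictionary layout are assumptions made for illustration and are not taken from NBT.

```python
def individualized_bands(apf, beta_upper=30.0):
    """Return (low, high) band edges in Hz, anchored to the alpha peak frequency.

    `apf` is the individually defined alpha peak frequency in Hz.
    Band names and the return format are hypothetical, chosen for this sketch.
    """
    return {
        "alpha1": (apf - 4.0, apf - 2.0),   # APF-4 to APF-2
        "alpha2": (apf - 2.0, apf),         # APF-2 to APF
        "alpha3": (apf, apf + 2.0),         # APF to APF+2
        "beta": (apf + 2.0, beta_upper),    # APF+2 to 30 Hz
    }


# Example: a subject with an alpha peak at 10 Hz.
bands = individualized_bands(10.0)
```

Anchoring the bands to the APF rather than to fixed edges means that a subject with a slowed alpha rhythm is compared band-for-band with a subject whose alpha peak is faster.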

DEVELOPMENT OF A DIAGNOSTICS INDEX
To move beyond single-biomarker classification we aimed to integrate several EEG biomarkers into a diagnostic index that would separate the AD-converter group from the MCI-stable group better than each individual biomarker. Using one dataset for both development and testing is not recommended, because it is theoretically possible to find a perfect separation of two groups if enough biomarkers are included (so-called over-fitting). To counteract this issue we built our classification model on the second EEG recording (which was obtained in 34 out of a total of 64 subjects that were either MCI-stable or AD-converters), and tested the classification accuracy retrospectively on the first EEG recording. Thirty subjects were not included in the training (22 MCI-stable, 8 AD-converters), because these subjects did not have a second-year recording. These subjects serve as our ultimate classification test. We also used half-split cross-validation to evaluate the stability and lower bound of the solution (see below).

STATISTICS: LOGISTIC REGRESSION WITH GENETIC SEARCH
Binary classification was performed using logistic regression. In logistic regression the binary outcome, either AD-converter (1) or MCI-stable (0), is regressed on a linear combination of biomarkers. More specifically, we fit the function
$$f(z) = \frac{1}{1 + e^{-z}}, \qquad z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k,$$
using maximum likelihood, where $x_i$ are the $k$ biomarkers included in the regression (included as medians across significant channels), and $\beta_i$ are the regression coefficients. The function f represents the probability of Alzheimer's disease. We use the 50% probability as our classification threshold, i.e., if f ≥ 0.5, the patient belongs to the AD-converter group; otherwise the patient belongs to the MCI-stable group. We used a genetic search method to identify biomarkers that, when combined using logistic regression, would give the best classification of the outcome MCI-stable vs. AD-converters. Genetic search is considered an efficient method for searching large data sets, instead of the computationally demanding alternative of testing all possible combinations (Koza and Poli, 2005; Zviling et al., 2005). The genetic approach is based on an evolutionary idea where the combined set of biomarkers is "mutated" by different mutation rules: addition of a random biomarker, removal of a biomarker, random selection of a new set of four biomarkers, and random substitution of a biomarker. Each rule was applied 5 times in each generation, leading to 20 new sets of biomarkers. The classifications of these new sets were then compared with the previous optimal set. Only the best biomarker set survived and was used as the base for the next generation of mutations. We did not set limits on the maximum or minimum number of biomarkers in each set. The genetic algorithm was seeded with an initial set of five biomarkers with the highest Matthews correlation coefficient (see outcome evaluation below). The genetic algorithm ran for 100 generations. At each generation the biomarker set with maximal positive likelihood ratio (see outcome evaluation below) survived. In all cases the logistic regression model was fitted using the second EEG recording, and the classification outcome was measured using the first EEG recording.
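The mutate-and-select loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the NBT implementation: the names `mutate` and `genetic_search` are hypothetical, biomarkers are represented by integer indices, and `score` is a placeholder for whatever outcome measure is maximized (in the text, the positive likelihood ratio of the fitted logistic regression). The guard against emptying the set is also an assumption of this sketch.

```python
import random


def mutate(current, n_biomarkers, rng):
    """Return 20 candidate sets: each of the four mutation rules applied 5 times."""
    pool = list(range(n_biomarkers))
    members = sorted(current)
    unused = [b for b in pool if b not in current]
    candidates = []
    for _ in range(5):
        # Rule 1: addition of a random biomarker.
        added = set(current)
        if unused:
            added.add(rng.choice(unused))
        candidates.append(frozenset(added))
        # Rule 2: removal of a biomarker (assumption: never empty the set).
        removed = set(current)
        if len(removed) > 1:
            removed.discard(rng.choice(members))
        candidates.append(frozenset(removed))
        # Rule 3: random selection of a new set of four biomarkers.
        candidates.append(frozenset(rng.sample(pool, 4)))
        # Rule 4: random substitution of one biomarker by an unused one.
        swapped = set(current)
        if unused:
            swapped.discard(rng.choice(members))
            swapped.add(rng.choice(unused))
        candidates.append(frozenset(swapped))
    return candidates


def genetic_search(score, seed_set, n_biomarkers, generations=100, rng_seed=0):
    """Keep only the best-scoring set in each generation as the next base."""
    rng = random.Random(rng_seed)
    best = frozenset(seed_set)
    best_score = score(best)
    for _ in range(generations):
        # All 20 candidates are derived from the base set of this generation.
        for candidate in mutate(best, n_biomarkers, rng):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
    return best, best_score
```

Because only a single set survives each generation, the search behaves like a stochastic hill climb, so different random seeds can end in different local maxima.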

STATISTICS: ELASTIC NET LOGISTIC REGRESSION
As an alternative to genetic optimization of the biomarkers included in the logistic regression, we employed an elastic net logistic regression algorithm (Zou and Hastie, 2005) as implemented in the GLMnet package for Matlab (http://www-stat.stanford.edu/~tibs/glmnet-matlab/) (Friedman et al., 2010). This algorithm promises a built-in selection of features that, in the best case, can perform much better than the less stable genetic optimization. The elastic net optimizes the number of biomarkers included in the diagnostic index by penalizing both the L1 and L2 norms of the regression coefficients, i.e., by minimizing
$$-\ell(\beta) + \lambda_1 \sum_{i=1}^{k} |\beta_i| + \lambda_2 \sum_{i=1}^{k} \beta_i^2,$$
where the first term, the negative log-likelihood $-\ell(\beta)$ of the logistic model, is similar to the logistic regression, and the second and third are the penalizing terms (the elastic net) (Zou and Hastie, 2005). The parameters $\lambda_1$ and $\lambda_2$ determine the influence of the L1 and L2 norm penalties, respectively. We define a new combined parameter $\alpha = \lambda_2 / (\lambda_1 + \lambda_2)$, which we optimized in 5-split cross-validation based on the best classification obtained by training on second-year data and testing on the left-out 1/5 of subjects using the first-year EEG (data not shown). Note that subjects who did not have a second-year EEG were not included and, therefore, serve as our ultimate test group (see Results). We found the best classification with α = 0.8.
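To make the penalized objective concrete, the following Python sketch evaluates it for a given coefficient vector. It is a minimal illustration assuming an intercept `beta0` that is not penalized; the function names are hypothetical and this is not the GLMnet code, which also handles the optimization itself.

```python
import math


def logistic(z):
    """Logistic function f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def elastic_net_objective(beta0, beta, X, y, lam1, lam2):
    """Negative log-likelihood plus L1 and L2 penalties on the coefficients.

    X is a list of biomarker vectors, y a list of 0/1 outcomes
    (0 = MCI-stable, 1 = AD-converter). The intercept beta0 is left
    unpenalized, which is an assumption of this sketch.
    """
    nll = 0.0
    for x_row, y_i in zip(X, y):
        z = beta0 + sum(b * x for b, x in zip(beta, x_row))
        p = logistic(z)
        nll -= y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return nll + lam1 * l1 + lam2 * l2


def alpha_from_lambdas(lam1, lam2):
    """Combined parameter alpha = lam2 / (lam1 + lam2), as defined in the text."""
    return lam2 / (lam1 + lam2)
```

The L1 term is what drives some coefficients exactly to zero, which is how the elastic net selects a subset of biomarkers, while the L2 term stabilizes the weights of correlated biomarkers.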

STATISTICS: CLASSIFICATION OUTCOME EVALUATION
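To evaluate the outcome of our classification we use five different measures:
• Sensitivity (SE): defined as the (number of correctly classified AD-converter patients)/(number of AD-converter patients).
• Specificity (SP): defined as the (number of correctly classified MCI-stable patients)/(number of MCI-stable patients).
• Positive predictive value (PPV): defined as the (number of correctly classified AD-converter patients)/(number of patients classified as AD-converters).
• Positive likelihood ratio (PLR): defined as Sensitivity/(1-Specificity).
• Matthews correlation coefficient (MCC): explains the correlation between the outcome and the expected outcome (Baldi et al., 2000).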
A Matthews correlation coefficient higher than 0.20, sensitivity higher than 65%, specificity higher than 65%, positive predictive value higher than 65%, and a positive likelihood ratio higher than 1.6 mean that the classification is significantly different from a random classification (Monte Carlo simulation, 5000 iterations, n = 65, p < 0.05; note that these thresholds depend on the sample size and are lower for larger samples). Perfect classification would give a Matthews correlation coefficient (MCC) of 1, sensitivity of 100%, specificity of 100%, positive predictive value of 100%, and an infinite positive likelihood ratio. An issue with these outcome measures is that they only tell how well the classification fits the given subgroup of subjects, but not how well the classification generalizes to other subject populations. We counteract this by three approaches: (1) classification was performed on the second EEG recording, whereas the prediction was tested on the first EEG recording, (2) as the ultimate test we evaluated the prediction on subjects not included for classifier training (because not all subjects had a second EEG recording), and (3) we performed a half-split cross-validation. In the half-split cross-validation the sample was divided randomly in half several times (1000 iterations); the classifier was then trained on the first half, and the outcome was evaluated on the second half. We report the median outcome measures over these splits. Cross-validation gives an estimate of the classification performance on an "unknown" sample (Witten et al., 2011). However, cross-validation also suffers from lower n numbers, which means its outcome should be viewed as a conservative estimate of the average outcome.
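The outcome measures used here all derive from the four cells of a 2 × 2 confusion table. The Python sketch below computes them with the standard formulas; the function name `outcome_measures` is hypothetical. The example counts (7 of 8 AD-converters and 18 of 22 MCI-stable subjects classified correctly) are an assumption inferred from the percentages reported for the independent test group in the Results, not numbers stated in the text.

```python
import math


def outcome_measures(tp, fp, tn, fn):
    """Compute SE, SP, PPV, PLR, and MCC from confusion-table counts.

    tp/fn: AD-converters classified correctly/incorrectly.
    tn/fp: MCI-stable subjects classified correctly/incorrectly.
    """
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    ppv = tp / (tp + fp)
    # The positive likelihood ratio is infinite for perfect specificity.
    plr = se / (1.0 - sp) if sp < 1.0 else math.inf
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (assumption): MCC is set to 0 when a marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SE": se, "SP": sp, "PPV": ppv, "PLR": plr, "MCC": mcc}


# Counts inferred (assumption) from the reported test-group percentages.
measures = outcome_measures(tp=7, fp=4, tn=18, fn=1)
```

With small groups each measure moves in coarse steps; for eight AD-converters, a single misclassified patient changes the sensitivity by 12.5 percentage points.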

STATISTICS: GROUP DIFFERENCES AND CORRELATIONS
We use non-parametric permutation tests based on the median (Box and Andersen, 1955; Ernst, 2004) to test for differences between groups. Non-parametric tests are more robust toward non-normal data, but also often have lower power than parametric tests such as Student's t-test. Confidence intervals (95%) were found using the non-parametric bias-corrected and accelerated bootstrap (n = 5000) (DiCiccio and Efron, 1996).
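A permutation test on the median can be sketched as follows. This is an illustrative two-sided version using only the Python standard library; the function name, the default number of permutations, and the (count + 1)/(n + 1) estimate of the p-value are choices made for this sketch and are not taken from the study.

```python
import random
import statistics


def permutation_test_median(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group medians."""
    rng = random.Random(seed)
    observed = abs(statistics.median(a) - statistics.median(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    count = 0
    for _ in range(n_perm):
        # Reassign group labels at random by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:]))
        if diff >= observed:
            count += 1
    # Add-one estimate so that the reported p-value is never exactly zero.
    return (count + 1) / (n_perm + 1)
```

The test makes no assumption about the shape of the biomarker distribution; it only asks how often a random relabeling of subjects produces a median difference at least as large as the observed one.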

STATISTICS: 2 × 2 TABLE INDEPENDENCE TESTS
To test for dependence of genotype, gender, and patient group we used Barnard's exact test, which is appropriate for low sample statistics compared with Chi-square test, and has better power compared with Fisher's exact test (Barnard, 1947).

STATISTICS: MULTIPLE COMPARISONS
Because we do large-scale mapping of biomarkers, we employ a lenient approach to multiple comparisons correction at the first level of analysis. This means that in the initial mapping of potential difference between the stable MCI and AD-converter groups, we only perform a binomial correction for the number of significant channels in each biomarker (Poil et al., 2011). We do not correct the p-values across different biomarkers. This approach is appropriate since this mapping of potential difference is only used to identify candidate biomarkers for the genetic search algorithm.
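One way to read the binomial correction is as a tail probability: if each channel were a false positive with probability 0.05, how many significant channels would be needed before the count itself is unlikely to arise by chance? The Python sketch below computes this under the simplifying assumption that channels are independent; the function names are hypothetical, the channel count in the example (16) is purely illustrative, and the exact procedure of Poil et al. (2011) may differ.

```python
import math


def binomial_tail(k, n, p=0.05):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1)
    )


def min_significant_channels(n_channels, p=0.05, alpha=0.05):
    """Smallest number of significant channels whose tail probability is < alpha."""
    for k in range(n_channels + 1):
        if binomial_tail(k, n_channels, p) < alpha:
            return k
    return None


# Illustrative channel count (assumption, not taken from the text).
threshold = min_significant_channels(16)
```

Neighbouring EEG channels are correlated, so the independence assumption makes this a rough guide rather than an exact error rate.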

PATIENT GROUPS-AGE AND GENDER
Initially 86 Table 1). We only focus on the patients diagnosed with AD, and subjects remaining stable MCI. In the following we use the last diagnosis of the subjects for the definition of the MCI-stable and AD-converters groups.

MMSE RESULTS
The MMSE score of the MCI-stable group (

APOE STATUS
We observed a significantly higher frequency of E4 allele vs. no E4 allele in the AD-converter group than in the stable MCI group (Barnard's test, p < 0.01). Only 38% of the MCI-stable group compared to 64% of the AD-converter group had more than one E4 allele (Table 2).

SINGLE-BIOMARKER LOGISTIC REGRESSION MODEL OF AD-CONVERTER vs. MCI-STABLE
To show the principle of logistic regression modeling on a single biomarker, we chose the beta peak frequency, because this biomarker showed  (Figure 3A). We fitted a logistic regression model to the second EEG measurement (n = 17 in both groups, Figure 3B). The model classified the second measurements with a sensitivity (SE) of 76%, a specificity (SP) of 76%, a positive predictive value (PPV) of 76%, a Matthews correlation coefficient (MCC) of 0.5, and a positive likelihood ratio (PLR) of 3.3. Next, we used this logistic model to retrospectively classify the first EEG measurement (Figure 3C). The classification had a SE of 72%, 59% SP, 53% PPV, 0.3 MCC, and a PLR of 1.8 (MCI n = 39, AD n = 25) (Figure 3D); thus, as expected, the classification power was worse.

MULTIPLE-BIOMARKER LOGISTIC REGRESSION MODEL OF AD-CONVERTER vs. MCI-STABLE
By combining several biomarkers it may be possible to obtain better classification power than with the individual biomarkers alone (Schoonenboom et al., 2004; Buscema et al., 2007; Lehmann et al., 2007). However, it is not trivial to determine which combinations of biomarkers are optimal, because of the high number of possible combinations. Here, we employ a genetic search approach and elastic net penalization to assist us in finding these optimal combinations (see Methods and Materials section). The best set identified by the genetic search comprised six biomarkers: amplitude correlations with Cz in Beta (13-30 Hz), bandwidth of subject-specific Beta frequency, peak width of dominant beta peak, range of amplitude values in Beta (13-30 Hz), ratio between theta and alpha power, and alpha relative power (normalized with 1-45 Hz broadband). The logistic regression training on this biomarker set using the second EEG data yielded a SE of 100%, 94% SP, 94% PPV, 0.94 MCC, and a PLR of 17 (n = 17 in both groups).

FIGURE 3 | Outcome evaluation as in (C), but for the "optimal" biomarker set found using genetic search. The first six columns are for classification of the individual biomarkers separately. The last column is the combined classification outcome. We clearly see that the combined outcome is better than the classification using the individual biomarkers. 1, Peak width of dominant beta peak; 2, range of amplitude values in Beta (13-30 Hz); 3, Bandwidth of subject-specific Beta frequency; 4, Ratio between theta and alpha power; 5, alpha relative power (normalized with 1-45 Hz broadband); 6, Amplitude correlations with Cz in Beta (13-30 Hz); (C) Combined logistic classification using the biomarkers 1, 2, 3, 4, 5, and 6. (F) Logistic curve for combined classification based on first-year EEG. (G) Separation plot of MCI vs. AD in first EEG recording using combined classification based on second-year logistic regression coefficients. Note that the recordings used for training in (F) are different from those used for testing in (G).

The retrospective testing on first-year data using the classifier model trained on the second-year data gave a SE of 92%, 85% SP, 79% PPV, 0.75 MCC, and a PLR of 6 (MCI-stable, n = 39; AD-convert, n = 25) (Figures 3F,G; Table 3), which indicates that even at this very early stage differences between AD-converters and MCI-stable subjects can be identified. However, since second-year and first-year data from the same subjects may be strongly correlated, we also performed a classification test using only subjects that were not used for training the model (i.e., the subjects without a second EEG recording). We obtained a good classification with a SE of 88%, 82% SP, 64% PPV, 0.64 MCC, and a PLR of 4.8 (MCI-stable, n = 22; AD-convert, n = 8), suggesting that the diagnostic index can generally be used for these patient groups. Furthermore, we performed a half-split cross-validation (1000 iterations), with a SE of 75%, 63% SP, 52% PPV, 0.37 MCC, and a PLR of 2, an indication of the average outcome. As expected, the classification powers decrease; however, this is at least partly explained by the lower n number. Nevertheless, the combined classification is still much better than the prediction obtained with the individual biomarkers in the set (Figures 3E, 4). The best single biomarker in the biomarker set (based on sensitivity and specificity) was the peak width of the dominant Beta peak, with a SE of 64%, 62% SP, 52% PPV, 0.24 MCC, and a PLR of 1.7 (MCI-stable, n = 22; AD-convert, n = 8) (Table 3). The logistic regression fitting coefficients for the combined solution were: −2.9 for amplitude correlations with Cz in Beta, 0.5 for bandwidth of subject-specific Beta, 3.4 for peak width of dominant beta peak, −0.6 for range of amplitude values in Beta, −2.3 for ratio between theta and alpha power, and −0.2 for alpha relative power. This means that the peak width of the dominant beta peak had the greatest influence on the outcome, followed by amplitude correlations with Cz. Taken together, our results show that it is possible to obtain a substantial synergistic effect from the integration of several biomarkers; however, they also show that it is not trivial to identify which combination of biomarkers is optimal. The major issue with our genetic search is that from run to run we do not obtain the same solution, because the algorithm finds local maxima. We, therefore, employed an elastic net penalized logistic regression algorithm. This algorithm uses a penalization of the weights to optimize the set of biomarkers used for classification. The classification outcome from this algorithm is worse than that of the genetic search optimized logistic regression, with a SE of 75%, 86% SP, 67% PPV, 0.59 MCC, and a PLR of 5.5 (MCI-stable, n = 22; AD-convert, n = 8) (Table 3), based on training on the second-year EEG and testing on the first-year recording of subjects (the test subjects were not used for training).
The elastic net logistic regression combined 12 biomarkers (non-zero weights), namely: the amplitude correlations from Cz in  and , the range of the generalized multifractal Hurst exponent of the Delta (1-3 Hz) amplitude envelope, the Beta frequency, the power ratios between Gamma (30-45 Hz) and Delta (1-4 Hz), between Alpha 1 (individual alpha frequency-4 to individual alpha frequency-2) and Alpha (8-13 Hz), and between Alpha 1 and Beta (13-30 Hz), the spectral edge of the individualized beta-frequency range, the peak width of the beta peak, the second beta peak frequency, the stability of the Delta (1-3 Hz) frequency measured in windows of 5 s, and the Hjorth mobility parameter. The outcome evaluation still shows room for improvement, e.g., by including biomarkers from other modalities.

FIGURE 4 | Most biomarkers in the diagnostic index have differences between the MCI-stable (MCI) and AD-converter (AD) groups, and only two have longitudinal changes. For each biomarker, a separation plot is shown for 1st EEG and 2nd EEG. The values are medians across channels with significant differences between the MCI-stable (MCI) and AD-converter (AD) groups (binomial corrected). Topographical plots are of the 2nd EEG median value across subjects and channels in 6 regions: frontal, left/right temporal, central, parietal, and occipital. Asterisks indicate significant differences (permutation test on median, binomial corrected, *p < 0.05, **p < 0.005).

DISCUSSION
We addressed the challenge of predicting whether an MCI subject would convert to AD within 2 years. To this end, we explored the added value of integrating multiple EEG biomarkers into a diagnostic index using logistic regression in combination with either a genetic search or elastic-net penalization for biomarker selection. From an initial cohort of 86 subjects with mild cognitive impairment, 25 converted to Alzheimer's disease within 2 years. We showed how data mining of 177 EEG biomarkers could be used to identify a set of biomarkers that form a diagnostic index. The analysis was performed using the Neurophysiological Biomarker Toolbox (NBT, http://www.nbtwiki.net/), which is specifically developed to support data mining and integration of large sets of biomarkers. We found that particularly biomarkers sensitive to changes in the beta frequency (13-30 Hz) band were optimal for classifying the very early EEG recordings of yet to be diagnosed AD patients.

CLASSIFICATION BASED DIAGNOSTICS
Previous studies have shown promise in using machine-learning algorithms to discriminate between MCI and AD based on EEG recordings (Huang et al., 2000; Bennys et al., 2001; Prichep et al., 2006; Buscema et al., 2007; Lehmann et al., 2007; Prichep, 2007; Rossini et al., 2008). For example, a sensitivity of 89% and specificity of 95% were found using the so-called IFAST model (Buscema et al., 2007; Rossini et al., 2008). However, these studies were based on training and testing on the same data, which makes it more difficult to judge the performance. Uniquely to the present study, we performed classification training on the second EEG recording, and retrospectively used this to perform prediction based on the first EEG recording from subjects not used for the training. We note, however, that a drawback of the present procedure is that the low number of patients in the smallest patient group (i.e., the eight patients converting to AD) gives a fairly high error margin on the classification estimates (12.5%, i.e., one patient out of eight).

OSCILLATIONS ARE INVOLVED IN COGNITION
Empirical and theoretical evidence suggests that oscillations provide important systems-level mechanisms for normal brain function (Engel and Singer, 2001;Buzsáki and Draguhn, 2004;Axmacher et al., 2006;Klimesch et al., 2007;Palva, 2007, 2012;Lisman, 2010). For example, oscillations are involved in memory encoding (Raghavachari et al., 2001;Jensen et al., 2002), and are thought to provide a timing mechanism for spike-timing-dependent plasticity (Engel and Fries, 2010). It therefore seems plausible that if oscillations are abnormal in disorders such as MCI and AD, then cognition is also affected. Apart from relative alpha power and the theta/alpha power ratio, which may reflect early changes toward the well-known slowing of the EEG in AD (Bennys et al., 2001;Rossini et al., 2006), our optimal set of biomarkers is derived from the beta-frequency band (13-30 Hz). Beta-band changes have previously been observed in Alzheimer's disease, e.g., as a more anterior distribution (Huang et al., 2000). The larger width of the beta peak and bandwidth could potentially be linked with a less stable beta frequency, and, therefore, also a less efficient working memory (Kopell et al., 2011). Beta oscillations are believed to maintain the current sensorimotor and cognitive state (Engel and Fries, 2010). Activity in the beta-frequency range has also traditionally been linked with motor function. Interestingly, it has been found that motor performance is impaired in early-stage Alzheimer's disease but not in mild cognitive impairment (Sheridan et al., 2003;Pettersson et al., 2005), which is a potential explanation of the prominent role of beta-frequency changes in our data. Motor function, e.g., gait control, is a higher cognitive function requiring integration of several cognitive functions, such as attention and planning (Hausdorff et al., 2005;Scherder et al., 2007), although it is unrelated to performance in memory tests (Hausdorff et al., 2005). Hyperexcitability of the motor cortex has also been observed in AD (Di Lazzaro et al., 2004), which is also suggested by our finding of a higher beta frequency.

EEG BIOMARKERS AS POTENTIAL INDICATORS OF INFLAMMATION
The standard hypothesis of Alzheimer's disease is the amyloid cascade hypothesis, which states that the cause of Alzheimer's disease lies in the build-up of amyloid and tangles (Hardy and Selkoe, 2002;Huang and Mucke, 2012). It has been hypothesized that Alzheimer's disease is initiated by a micro injury in the brain, presumably a vascular event, with subsequent activation of inflammatory responses that further leads to initiation of the amyloid deposition cycle (De la Torre, 2004;Herrup, 2010). The theta/(lower alpha) power ratio has previously been associated with vascular damage in AD (Moretti et al., 2007b), and delta (2-4 Hz) power has been associated with inflammation (Babiloni et al., 2009). EEG power and frequency in general have also been correlated with cerebral perfusion (O'Gorman et al., 2013), which is known to be reduced in Alzheimer's disease (De la Torre, 1999;Kogure et al., 2000;Murray et al., 2011). If we could detect early-stage changes using EEG, we would have a powerful tool that could detect Alzheimer's disease at a point where a possible therapy would be most efficient. For example, mouse models show that Aβ-42 modifying therapy has limited effect after neurodegeneration has begun (Dubois et al., 2007;Sperling et al., 2011). This means that diagnosing a patient based on neurodegeneration and cognitive decline may already be too late for a good treatment outcome, because the brain damage has already occurred.
It has also been shown that the build-up of Aβ42 influences synaptic transmission, and thus potentially also gives rise to further effects in the EEG (Palop and Mucke, 2009;Verret et al., 2012). Furthermore, hippocampal injections of amyloid β in rats have been shown to induce impaired memory performance combined with reduced hippocampal theta oscillations and less activity in GABAergic neurons (GABA, gamma-aminobutyric acid) (Villette et al., 2010). A recent suggestion for a potential improvement of Alzheimer's disease symptoms is transcranial direct current stimulation (tDCS) (Hansen, 2012). This method increased theta and alpha oscillations together with improved working memory performance (Zaehle et al., 2010). Interestingly, it has been suggested that these effects may be caused by altered GABA concentration within the stimulated cortex, and potentially by an adjustment of the excitatory/inhibitory balance, which is disturbed in Alzheimer's disease (Di Lazzaro et al., 2004;Rossini et al., 2007;Stagg et al., 2009). This balance may be directly linked to EEG biomarkers that have been shown to be sensitive to Alzheimer's disease (Montez et al., 2009;Poil et al., 2011, 2012). It thus seems that EEG biomarkers may be sensitive to the underlying pathophysiology of AD.

OUTLOOK
We have shown here that exploratory data mining and integration of multiple biomarkers may yield many exciting results when applied to the large databases of neuroscience data built up over the years. These studies may identify hidden structures (see schematic Figure 1C) and be beneficial for both pre-clinical and clinical research. With recent developments in automatic cleaning of EEG, this analysis may potentially be performed immediately after the recording (Nolan et al., 2010;Mognon et al., 2011). This, together with the non-invasive character of EEG, could make a diagnostic index using EEG biomarkers a powerful tool to support the early-stage clinical assessment. EEG biomarkers, apart from being non-invasive and relatively inexpensive, have the advantage of monitoring brain activity in real time, and are thus potentially able to identify tiny changes in ongoing cognition. However, we believe the best diagnostic/prognostic performance is achieved if EEG biomarkers are combined with information from other modalities. Future studies should specifically study how the synergistic information of integrative biomarkers can be improved further by the incorporation of different classes of biomarkers, which could range from cognitive markers (Tabert et al., 2006), functional connectivity markers (Stam et al., , 2007), coherence, synchronization, and topographical location markers (Huang et al., 2000;Stam et al., 2005;Rossini et al., 2006) to questionnaire data providing quantitative data on the mental state of the patients during the resting-state EEG recording (Diaz et al., 2013). Improvements in the algorithms used for pre-selecting biomarkers could, e.g., be based on measures of interrelatedness between biomarkers or on taking scalp topographies into account, as opposed to the averaged channel biomarker values used here. We believe the Neurophysiological Biomarker Toolbox provides a promising framework for these studies. This could give rise to a better integrative understanding of biomarkers involved with Alzheimer's disease and brain disorders in general (Searls, 2005;Dubois et al., 2007;Schneider, 2010).