Diagnosis of Alzheimer’s disease via resting-state EEG: integration of spectrum, complexity, and synchronization signal features

Background Alzheimer’s disease (AD) is the most common neurodegenerative disorder, accounting for approximately 70% of dementia cases; dementia currently affects more than 55 million people. Electroencephalogram (EEG) has become a suitable, accurate, and highly sensitive biomarker for the identification and diagnosis of AD. Methods In this study, a public database of eyes-closed resting-state EEG recordings containing 36 AD subjects and 29 normal subjects was used. Three types of signal features of resting-state EEG, i.e., spectrum, complexity, and synchronization, were extracted by applying various signal processing and statistical methods, yielding a total of 18 features for each signal epoch. Next, the supervised machine learning classification algorithms of decision trees, random forests, and support vector machine (SVM) were compared in classifying the processed EEG signal features of AD and normal cases with leave-one-person-out cross-validation. Results Compared to normal cases, the major changes in EEG characteristics in AD cases were EEG slowing, reduced complexity, and decreased synchrony. The proposed methodology achieved relatively high classification accuracies of 95.65, 95.86, and 88.54% between AD and normal cases for decision trees, random forests, and SVM, respectively, showing that the integration of spectrum, complexity, and synchronization features of EEG signals can enhance the performance of identifying AD and normal subjects. Conclusion This study recommends the integration of EEG features of spectrum, complexity, and synchronization for aiding the diagnosis of AD.


Introduction
According to the World Health Organization (WHO), more than 55 million individuals currently live with dementia, a number projected to increase to 78 million by 2030 and a staggering 139 million by 2050 (WHO, 2021). Alzheimer's disease (AD), a neurological disorder, constitutes the predominant form of dementia, accounting for approximately 70% of cases in the world (Blennow et al., 2006). AD mainly occurs in people aged 65 and older, with its incidence rate notably escalating as age advances (McKhann et al., 1984). Due to the high prevalence of AD and its economic cost, WHO has issued a call to prioritize dementia on global health agendas to heighten awareness, enhance early diagnosis, and offer improved care and support to individuals affected by dementia (Subedi and Sapkota, 2019).
Diagnosis of AD, and in particular early diagnosis, is essential for several reasons (Brookmeyer et al., 2007; Dauwels et al., 2010; Galimberti and Scarpini, 2011): (1) it gives patients an early warning; (2) symptom-delaying medications are most effective at an early stage of the disease; (3) effective management of psychiatric symptoms, such as depression or psychosis, holds the potential to alleviate the societal burden and associated costs; (4) preventive therapies may be developed to raise the chance of treating AD. Thus far, diagnosing AD typically involves a comprehensive approach that combines extensive testing and the systematic elimination of alternative potential causes. Psychological assessments, e.g., the mini-mental state examination (MMSE; Folstein et al., 1975) and the Montreal cognitive assessment (MoCA; Nasreddine et al., 2005), blood tests (Moretti, 2015), cerebrospinal fluid (CSF; Jack et al., 2011), and emerging imaging techniques are being employed to diagnose AD (Weiner, 2009).
In recent decades, neuroimaging tools, e.g., magnetic resonance imaging (MRI; Dickerson and Wolk, 2011), positron emission tomography (PET; Risacher et al., 2021), and computed tomography (CT; Imabayashi et al., 2013), have been extensively employed to investigate the underlying causes of AD and to enhance the precision of its diagnosis. However, given the present spatial resolution of these neuroimaging techniques, patients often receive a diagnosis only after notable neurodegeneration has occurred. Additionally, these advanced neuroimaging methods are expensive, time-consuming, and require experts to operate them properly.
Electroencephalogram (EEG), an alternative approach that offers greater ease and convenience, has been used as a biomarker in AD diagnosis, due to its low cost, wide availability, high temporal resolution, and high efficiency (Cassani et al., 2018). By measuring the brain's electrical activity, EEG can detect anomalies in brain waves associated with specific disorders (Noachtar and Rémi, 2009; Kemp et al., 2010; Zheng et al., 2019). Given that EEG signals can reflect functional alterations in the cerebral cortex, EEG-based biomarkers hold the potential to evaluate neuronal degeneration caused by AD progression even before the manifestation of behavioral symptoms (Miltiadous et al., 2021). EEG offers many perspectives on the recorded signals, including frequency, dynamic alterations, and source imaging. Previous studies have reported three typical effects of AD on resting-state EEG signals compared to normal subjects, i.e., diffuse slowing, reduced complexity, and decreased synchronization (Cassani et al., 2018). Firstly, diffuse slowing of brain activity refers to a phenomenon where the power of higher EEG frequency bands (e.g., alpha, beta, and gamma bands) decreases, while the power of lower EEG frequency bands (e.g., delta and theta bands) increases (Jeong, 2004; Garn et al., 2015). Secondly, reduced complexity means that the complexity of the brain's electrical activity decreases in AD patients when compared to healthy individuals (Schätz et al., 2013; Şeker et al., 2021). Thirdly, decreased synchronization manifests as a decline in connectivity between different cortical regions in many AD patients (Koenig et al., 2005; Wen et al., 2015).
After the EEG features are extracted by signal processing methods, machine learning techniques, e.g., decision trees, K-nearest neighbors (kNN), regularized linear discriminant analysis (RLDA), and support vector machine (SVM), can automatically analyze these features to classify normal and abnormal cases (Fiscon et al., 2018; Safi and Safi, 2021). However, the automatic identification of AD from EEG recordings with machine learning is still in its early stages, and little research has examined how integrating various types of EEG features affects diagnostic performance (Dauwels et al., 2010).
On this basis, this study aimed to explore the EEG characteristics of AD patients and then develop a new diagnostic approach for AD with various types of EEG signal features and supervised machine learning classification methods based on a large public database. First, according to previous studies, the EEG signal features of spectrum, complexity, and synchronization of AD and normal subjects were obtained. Then, combined with the machine learning algorithms of SVM, decision trees, and random forest, the classification results between AD and normal subjects were acquired by leave-one-person-out cross-validation.

Database description
The public database containing the resting-state EEG recordings from 36 AD patients (aged 66.4 ± 7.9 years, 24 females) and 29 healthy controls (CN; aged 67.9 ± 5.4 years, 11 females) was used in this study (Miltiadous et al., 2023). No other dementia-related comorbidities have been reported in AD patients. The cognitive and neuropsychological assessment was conducted using the MMSE (Creavin et al., 2016). The MMSE score ranges from 0 to 30, where a lower score indicates a more severe cognitive decline. The MMSE score was 17.75 ± 4.5 for the AD group and 30 for the CN group.

Signal preprocessing
Firstly, the signals were re-referenced to A1-A2. Secondly, a Butterworth band-pass filter within the frequency range of 0.5 to 45 Hz was employed to eliminate artifacts. Thirdly, the independent component analysis (ICA) method was performed to cancel irrelevant noise. Finally, the automatic artifact rejection technique, artifact subspace reconstruction (ASR), in the EEGLAB toolbox (Delorme and Makeig, 2004), was used to exclude segments of data exceeding the conservative 0.5-s window standard deviation threshold of 17, considered as the maximum acceptable limit.

Feature extraction
In this study, the EEG signals were first segmented into 4-s epochs with a 50% overlap, forming the dataset that was subsequently used for classification, with each epoch labeled as AD or CN. Then, three types of signal features of resting-state EEG, i.e., spectrum, complexity, and synchronization, were extracted for each epoch.
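The segmentation step can be illustrated with a minimal sketch. The function name `split_into_epochs` and the 500 Hz sampling rate are assumptions introduced here for illustration; the text above only specifies 4-s epochs with 50% overlap.

```python
def split_into_epochs(signal, fs, epoch_sec=4.0, overlap=0.5):
    """Cut a 1-D signal into fixed-length epochs with fractional overlap.

    Hypothetical helper: trailing samples that do not fill a whole epoch
    are dropped.
    """
    size = int(round(epoch_sec * fs))        # samples per epoch
    step = int(round(size * (1.0 - overlap)))  # hop between epoch starts
    if size <= 0 or step <= 0:
        raise ValueError("epoch length and step must be positive")
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, step)]


fs = 500                       # assumed sampling rate in Hz (not stated above)
signal = list(range(10 * fs))  # 10 s of dummy samples
epochs = split_into_epochs(signal, fs)
```

With a 50% overlap, each new epoch starts half an epoch after the previous one, so a 10-s recording yields four 4-s epochs in this sketch.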

Spectrum metrics
For time-domain metrics, the mean, variance, and interquartile range (IQR) were chosen as the features (Miltiadous et al., 2021). For a data segment $x_j$ ($j = 1, \dots, N$) of length $N$, the mean $\bar{x}$, which estimates the central tendency of the distribution of a variable, can be defined by:

$$\bar{x} = \frac{1}{N} \sum_{j=1}^{N} x_j$$

The variance $\mathrm{Var}$, which represents the spread of the data around its central value, can be defined by:

$$\mathrm{Var} = \frac{1}{N} \sum_{j=1}^{N} \left( x_j - \bar{x} \right)^2$$

The IQR, the difference between $Q_3$ and $Q_1$, i.e., the 75th (upper) and 25th (lower) percentiles, respectively, can be calculated by:

$$\mathrm{IQR} = Q_3 - Q_1$$

For the frequency-domain metrics, first, the power spectral density (PSD) was estimated for each 4-s epoch. Next, the PSD over the whole frequency range of 0.5-45 Hz was also calculated. Then, the five basic EEG rhythms (namely delta of 0.5-4 Hz, theta of 4-8 Hz, alpha of 8-13 Hz, beta of 13-25 Hz, and gamma of 25-45 Hz) were obtained. Finally, for normalization, the relative band power (RBP) of each EEG rhythm was obtained by (Miltiadous et al., 2023):

$$\mathrm{RBP}_{\mathrm{band}} = \frac{\mathrm{PSD}_{\mathrm{band}}}{\mathrm{PSD}_{0.5\text{-}45\,\mathrm{Hz}}}$$

where $\mathrm{PSD}_{\mathrm{band}}$ is the power within the given rhythm and $\mathrm{PSD}_{0.5\text{-}45\,\mathrm{Hz}}$ is the power over the whole frequency range.
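The spectrum metrics can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the study's implementation: the PSD is estimated with a simple periodogram (the estimator used in the study is not specified here), the variance uses the population form, the quartiles follow the default convention of `statistics.quantiles`, and the function names and the 128 Hz demo sampling rate are hypothetical.

```python
import cmath
import math
import statistics

# Band edges in Hz, taken from the text; intervals are treated as half-open.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0),
         "beta": (13.0, 25.0), "gamma": (25.0, 45.0)}


def time_domain_features(epoch):
    """Mean, population variance, and IQR of one epoch."""
    q1, _, q3 = statistics.quantiles(epoch, n=4)
    return {"mean": statistics.fmean(epoch),
            "variance": statistics.pvariance(epoch),
            "iqr": q3 - q1}


def periodogram(epoch, fs):
    """Naive O(N^2) periodogram; adequate for short illustrative epochs."""
    n = len(epoch)
    mean = sum(epoch) / n
    centered = [x - mean for x in epoch]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power


def relative_band_power(epoch, fs, bands=BANDS, total=(0.5, 45.0)):
    """Band power divided by total 0.5-45 Hz power (fails on a flat signal)."""
    freqs, power = periodogram(epoch, fs)
    total_power = sum(p for f, p in zip(freqs, power)
                      if total[0] <= f < total[1])
    return {name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
            / total_power
            for name, (lo, hi) in bands.items()}


fs = 128  # assumed sampling rate for the demo only
epoch = [math.sin(2 * math.pi * 10 * i / fs)
         + 0.5 * math.sin(2 * math.pi * 2 * i / fs)
         for i in range(4 * fs)]
rbp = relative_band_power(epoch, fs)
stats = time_domain_features([1, 2, 3, 4, 5, 6, 7, 8])
```

In the demo, a 10 Hz component with twice the amplitude of a 2 Hz component places most of the relative power in the alpha band, and the five relative powers sum to one because the bands tile the 0.5-45 Hz range.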

Complexity metrics
Entropy measures typically quantify the degree of complexity and predictability of a signal (Coifman and Wickerhauser, 1992).In this study, the approximate entropy (ApEn), permutation entropy (PermEn), multiscale entropy (MSE), and sample entropy (SamplEn) were used to describe the complexity of the entire frequency spectrum.
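ApEn is a non-linear method that can be utilized for quantifying the irregularity of a time series, which can be defined by:

$$\mathrm{ApEn}(m, r, N) = \Phi^{m}(r) - \Phi^{m+1}(r), \qquad \Phi^{m}(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^{m}(r)$$

where $C_i^{m}(r)$ is the fraction of length-$m$ vectors $u(l)$ whose distance from $u(i)$ does not exceed $r$, i.e., it measures the similarity between $u(i)$ and $u(l)$. In this study, the pattern length $m = 1$ and the similarity factor $r = 0.2$ times the standard deviation of the time series (Burioka et al., 2005; Abásolo et al., 2009).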
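A minimal sketch of approximate entropy in its commonly used form (self-matches included, Chebyshev distance between templates) is given below. The function names are hypothetical, and the demo signals are synthetic; only m = 1 and r = 0.2 times the standard deviation come from the text.

```python
import math
import random
import statistics


def approximate_entropy(x, m, r):
    """ApEn(m, r) with absolute tolerance r; self-matches are counted."""
    n = len(x)

    def phi(dim):
        vectors = [x[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for u in vectors:
            # Count templates within Chebyshev distance r of u.
            count = sum(1 for v in vectors
                        if max(abs(a - b) for a, b in zip(u, v)) <= r)
            total += math.log(count / len(vectors))
        return total / len(vectors)

    return phi(m) - phi(m + 1)


def apen_relative(x, m=1, k=0.2):
    """ApEn with tolerance set to k times the standard deviation of x."""
    return approximate_entropy(x, m, k * statistics.pstdev(x))


rng = random.Random(0)
regular = [0.0, 1.0] * 50                       # strictly periodic signal
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]  # irregular signal
apen_regular = apen_relative(regular)
apen_noise = apen_relative(noise)
```

The periodic signal gives a value close to zero, while the noise gives a clearly larger one, which matches the interpretation of ApEn as a measure of irregularity. For short series the estimate can be slightly negative because of finite-sample bias.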
PermEn is a complexity measure of ordinal patterns for arbitrary, noisy, and large signals, which can be defined by:

$$\mathrm{PermEn}(n) = -\sum_{\pi} p(\pi) \log p(\pi)$$

where $\pi$ ranges over all the permutations of order $n$, which corresponds to the number of embedding dimensions, and $p(\pi)$ represents the probability associated with ordinal pattern $\pi$, i.e., the relative frequency of that pattern (Bandt and Pompe, 2002). In this study, $n$ was set as 3 (Tzimourta et al., 2019).
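Permutation entropy can be sketched by counting ordinal patterns, as below. The function name, the natural logarithm, the unit time delay, and the tie-breaking by sample order are assumptions made for illustration; only the order n = 3 comes from the text.

```python
import math
import random
from collections import Counter


def permutation_entropy(x, order=3, normalize=False):
    """Shannon entropy of ordinal patterns of the given order (delay 1)."""
    patterns = Counter(
        # The ordinal pattern is the index order that sorts the window.
        tuple(sorted(range(order), key=lambda k: x[i + k]))
        for i in range(len(x) - order + 1)
    )
    total = sum(patterns.values())
    h = -sum((c / total) * math.log(c / total) for c in patterns.values())
    if normalize:
        h /= math.log(math.factorial(order))
    return h


rng = random.Random(1)
pe_ramp = permutation_entropy(list(range(50)))                 # one pattern
pe_noise = permutation_entropy([rng.random() for _ in range(200)])
```

A monotonic ramp contains a single ordinal pattern and therefore has zero entropy, whereas noise approaches the upper bound of ln(3!) for order 3.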
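SamplEn is similar to ApEn but it excludes the assessment of self-similar patterns, which can be described by:

$$\mathrm{SamplEn}(m, r, N) = -\ln \frac{A^{m}(r)}{B^{m}(r)}$$

where $B^{m}(r)$ and $A^{m}(r)$ are the numbers of pairs of vectors of length $m$ and $m + 1$, respectively, whose distance does not exceed $r$, i.e., they measure the similarity between $u(i)$ and $u(l)$ with $i \neq l$. Among them, $m = 2$ and $r = 0.15$ (Yang et al., 2013). As a modification of SamplEn for the scaled signal, MSE introduces a range of multiple time scales denoted as $\tau$, employed to create a coarse-grained version of the original time series, and each element of the coarse-grained signal can be calculated by:

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \qquad 1 \le j \le N/\tau$$

In our experiments, $m = 2$, $r = 0.15$, and $\tau = 5$, which was consistent with previous studies (Costa et al., 2005; Yang et al., 2013).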
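Sample entropy and the coarse-graining step of MSE can be sketched as follows. Several choices here are assumptions rather than details confirmed by the text: r is interpreted as 0.15 times the standard deviation of the original series and kept fixed across scales, the entropy is reported for every scale from 1 to 5, and the function names are hypothetical.

```python
import math
import random
import statistics


def sample_entropy(x, m=2, r=0.15):
    """SampEn with absolute tolerance r; self-matches are excluded."""
    n = len(x)

    def count_matches(dim):
        # Use the same n - m starting points for both template lengths.
        templates = [x[i:i + dim] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b)
                       for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


def coarse_grain(x, tau):
    """Average consecutive, non-overlapping windows of length tau."""
    return [sum(x[j * tau:(j + 1) * tau]) / tau for j in range(len(x) // tau)]


def multiscale_entropy(x, m=2, k=0.15, max_tau=5):
    """SampEn of the coarse-grained series for scales 1..max_tau."""
    r = k * statistics.pstdev(x)  # tolerance fixed from the original series
    return [sample_entropy(coarse_grain(x, tau), m, r)
            for tau in range(1, max_tau + 1)]


rng = random.Random(2)
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]
se_regular = sample_entropy([0.0, 1.0] * 50, m=2, r=0.075)
mse_noise = multiscale_entropy(noise)
```

A strictly periodic signal yields a sample entropy of zero because every length-2 match extends to a length-3 match, while white noise yields positive values at each scale.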

Synchronization metrics
Recent advances in the analysis of signal synchronization have been rapid and are based largely on graph theory (Liu et al., 2017). In this study, the four metrics of clustering coefficient, characteristic path length, efficiency, and small-worldness were used to describe the signal synchronization from complex brain network features (Bullmore and Sporns, 2009).
The clustering coefficient measures the number of connections among the immediate neighbors of a node, expressed as a proportion of the maximum number of possible connections (Demuru et al., 2020). The clustering coefficient $C_i$ of node $i$ can be defined by:

$$C_i = \frac{2 e_i}{k_i (k_i - 1)}$$

where $e_i$ represents the number of edges in the neighborhood of node $i$, and $k_i$, the degree of node $i$, is the number of connections that node $i$ makes to other nodes.
The characteristic path length $L$ is the average of the minimum number of edges required to traverse from one node to another, which can be defined by (Gaal et al., 2010):

$$L = \frac{1}{N(N-1)} \sum_{i \neq j} l_{ij}$$

where $N$ represents the number of all nodes, and $l_{ij}$ represents the minimum path length between nodes $i$ and $j$. Efficiency $E_{\mathrm{global}}$ exhibits an inverse relationship with path length, yet it is more straightforward to employ for estimating topological distances between elements of disconnected graphs, which can be defined by (Buchel et al., 2021):

$$E_{\mathrm{global}} = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{l_{ij}}$$

The 'small-world' property is characterized by a combination of elevated local clustering among nodes within a network and short paths that establish global connections across the network. Small-worldness $\sigma$ is thus determined by the ratio of the clustering coefficient to the path length (Liu et al., 2017):

$$\sigma = \frac{\gamma}{\delta}$$

where $\gamma$ represents the standardized clustering coefficient, defined by the ratio of the clustering coefficient to the random network's clustering coefficient, and $\delta$ represents the standardized characteristic path length, established as the ratio of the characteristic path length to the random network's characteristic path length.
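The four network metrics can be sketched on a small unweighted, undirected graph as below. The adjacency representation (a dict mapping each node to its set of neighbors), the function names, and the toy graph are assumptions for illustration; the reference values for the random network are passed in as inputs because how they were generated is not described here.

```python
from collections import deque


def clustering_coefficient(adj, i):
    """C_i = 2 e_i / (k_i (k_i - 1)); defined as 0 when k_i < 2."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0
    edges = sum(1 for u in neighbors for v in neighbors
                if u < v and v in adj[u])
    return 2.0 * edges / (k * (k - 1))


def shortest_path_lengths(adj, source):
    """Breadth-first search distances (in edges) from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_and_efficiency(adj):
    """Return (L, E_global); L is infinite for a disconnected graph."""
    nodes = list(adj)
    n = len(nodes)
    total_len, total_inv = 0.0, 0.0
    for i in nodes:
        dist = shortest_path_lengths(adj, i)
        for j in nodes:
            if j == i:
                continue
            if j in dist:
                total_len += dist[j]
                total_inv += 1.0 / dist[j]
            else:
                total_len = float("inf")  # unreachable pair contributes 0 to E
    pairs = n * (n - 1)
    return total_len / pairs, total_inv / pairs


def small_worldness(c, l, c_rand, l_rand):
    """sigma = (C / C_rand) / (L / L_rand)."""
    return (c / c_rand) / (l / l_rand)


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
mean_c = sum(clustering_coefficient(adj, i) for i in adj) / len(adj)
char_path, efficiency = path_length_and_efficiency(adj)
```

Because unreachable pairs add zero to the efficiency sum but make the path length infinite, the sketch also shows why efficiency is the more convenient quantity for disconnected graphs.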

Classification algorithm
According to previous studies (Fiscon et al., 2018; Miltiadous et al., 2021; Safi and Safi, 2021), the supervised learning classification methods of decision trees, random forests, and SVM were used as the classifiers. For each algorithm, leave-one-person-out cross-validation was used as the testing method (Miltiadous et al., 2021), where all epochs from a specific subject are designated as the test set, while the remaining epochs collectively form the training set. Then, the indexes of accuracy, sensitivity, and specificity were calculated according to the following equations (Baratloo et al., 2015):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$

where TP, FP, TN, and FN represent true positive, false positive, true negative, and false negative, respectively.
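The validation scheme and the three indexes can be sketched without any machine learning library. The function names, the subject identifiers, and the choice of "AD" as the positive class are assumptions made for illustration; the classifier itself is left out.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), one fold per subject.

    subject_ids[i] is the subject that epoch i belongs to, so all epochs
    of the held-out subject land in the test set together.
    """
    for held_out in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx


def binary_metrics(y_true, y_pred, positive="AD"):
    """Accuracy, sensitivity, and specificity from paired labels."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical epoch-to-subject assignment for six epochs.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))

acc, sens, spec = binary_metrics(
    ["AD", "AD", "AD", "CN", "CN"],
    ["AD", "AD", "CN", "CN", "AD"],
)
```

In practice, predictions from all folds would be pooled before computing the indexes, since a single fold contains only one class.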

Signal characteristics
To further analyze the spectrum characteristics of the signal, Figure 1 shows examples of the frequency-domain and time-frequency-domain analyses of resting-state EEG for CN and AD subjects. As shown in the frequency-domain spectrum and time-frequency-domain analysis of Figures 1A,B, there was some difference in the frequency spectrum of EEG signals between CN and AD subjects, e.g., an increase in the delta rhythms in AD subjects.
Subsequently, a brain network analysis of resting-state EEG was performed for CN and AD subjects. As shown in Figure 2A, the correlation matrix between all pairs of electrodes was generated, indicating a decreased correlation in AD subjects compared to CN subjects. As shown in Figure 2B, the analysis of the brain network gave a clearer view of the connectivity between all pairs of electrodes, showing that there was a decrease in brain network connectivity in AD subjects compared to CN subjects, indicating decreased EEG synchrony in AD patients under rest conditions.
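One way to move from a correlation matrix to a network is sketched below. The use of Pearson correlation, the absolute-value threshold of 0.8, and the function names are all assumptions for illustration; the text does not state how the network in Figure 2B was derived from the matrix in Figure 2A.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(channels):
    """Pairwise correlation between channels (one sequence per electrode)."""
    n = len(channels)
    return [[1.0 if i == j else pearson(channels[i], channels[j])
             for j in range(n)] for i in range(n)]


def threshold_to_adjacency(corr, threshold):
    """Connect electrodes whose absolute correlation reaches the threshold."""
    n = len(corr)
    return {i: {j for j in range(n)
                if j != i and abs(corr[i][j]) >= threshold}
            for i in range(n)}


# Four toy channels: 0 and 1 rise together, 2 falls, 3 alternates.
channels = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, -1, 1, -1]]
corr = correlation_matrix(channels)
network = threshold_to_adjacency(corr, threshold=0.8)
```

Lower correlations between electrodes leave fewer edges above the threshold, which is the sense in which a sparser network reflects reduced synchrony.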

Signal features
For further statistical analysis of EEG signals between CN and AD subjects, the preprocessed EEG data of each subject was first segmented into 4-s epochs with 50% overlap, generating 14,515 epochs labeled AD from 36 AD subjects and 12,011 epochs labeled CN from 29 CN subjects. According to the differences between signal characteristics described above, the signal features of time-domain, frequency-domain, complexity, and synchronization were obtained for each epoch. Moreover, the mean and SD of these signal features are shown in Figure 3, and their differences between AD and CN individuals were assessed by independent samples t-test.
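The group comparison can be illustrated with a pooled-variance independent samples t statistic. This sketch assumes equal variances (the text does not say whether a pooled or Welch test was used), the function names are hypothetical, and the p-value helper uses a normal approximation that is only reasonable for large samples such as the epoch counts reported above.

```python
import math
import statistics


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    t = (statistics.fmean(a) - statistics.fmean(b)) / se
    return t, na + nb - 2


def two_sided_p_large_sample(t):
    """Two-sided p-value from the normal approximation (large df only)."""
    return 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t)))


# Toy feature values for two groups (not data from the study).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_value, dof = independent_t(group_a, group_b)
p_approx = two_sided_p_large_sample(t_value)
```

For small samples, the p-value should instead be taken from the t distribution with the returned degrees of freedom.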
For time-domain metrics, the mean, variance, and IQR demonstrated a slight upward trend for AD subjects (p < 0.001 for each). For frequency-domain metrics, the low-frequency bands of delta and theta showed a slight increase (p < 0.05 for each), the high-frequency band of beta showed a slight decrease (p < 0.05), and the high-frequency bands of alpha and gamma showed a decreasing but insignificant trend, indicating that the major spectral changes in AD were attenuated power in higher frequency bands (alpha, beta, and gamma) and increased power in lower bands (delta and theta), that is, AD caused EEG signals to slow down. For complexity metrics, the entropies of PermEn, SamplEn, and MSE presented lower values in AD subjects (p < 0.001 for each), revealing that EEG signals of AD showed reduced complexity and appeared more regular. For synchronization metrics, the features of clustering coefficient and small-worldness demonstrated a decreasing

Classification results
Using these EEG signal features, the three classification algorithms of decision trees, random forests, and SVM were applied to distinguish the AD and CN groups with leave-one-person-out cross-validation. Table 1 presents the accuracy, sensitivity, and specificity results of the three classification algorithms, showing that the random forest achieved the highest classification performance with an accuracy of 95.86%, and SVM had the lowest accuracy of 88.54%.

Discussion
The present study underscores the potential of integrating signal features from the spectrum, complexity, and synchronization domains of resting-state EEG for enhancing the diagnosis of AD. This study achieved a classification accuracy of 95.86% for AD and CN subjects based on resting-state EEG, higher than the 77.01% reported by a previous study using the same dataset (Miltiadous et al., 2023), showing that the combination of these three types of EEG signal features can enhance the classification performance. Besides, our study also showed a better performance than other studies, e.g., classification accuracies of 78.50% (Miltiadous et al., 2021) and 83.30% (Fiscon et al., 2018).
By capturing diverse aspects of neural dysfunction, this integration of spectrum, complexity, and synchronization signal features may offer a more holistic understanding of the underlying pathology. Several key factors have been studied and explored in the pathological causes of AD, e.g., plaques composed of amyloid β, and tangles composed of hyperphosphorylated tau (Scheltens et al., 2021). According to the signal features shown in Figure 3, first, the power spectrum shifted from higher frequency components (alpha, beta, and gamma) toward lower frequency components (delta and theta), which may be related to loss of cholinergic innervations in AD patients (Cassani et al., 2018). Second, a decrease in the complexity of the brain's electrical activity has been noted in AD patients. This phenomenon is potentially attributed to extensive neuronal loss and diminished connectivity in cortical regions, resulting in simpler EEG dynamics (Czigler et al., 2008). Third, reduced synchrony was also present in AD patients, which can potentially be attributed to a functional disconnection within the neocortex, e.g., anatomical disconnections among different cortical regions in combination (Dauwels et al., 2010).
As for the validation method, this study adopted the leave-one-person-out cross-validation method. In contrast to k-fold cross-validation, which may place samples from the same participant in both the training and test sets, the leave-one-person-out cross-validation method offers a more realistic validation strategy since no epochs from the same subject appear in both the training and the test set at the same time (Häfner et al., 2012; Isler et al., 2015).
Some limitations of this study should also be noted. First of all, this study only focused on the classification of AD and CN subjects. However, the severity of AD may affect EEG performance, and the severity, e.g., mild, moderate, and severe (Cassani et al., 2018), may also be classified in future studies. Next, the signal processing and feature extraction methods can also be further expanded. For example, in addition to the methods used in this study, the synchronization metrics may also be obtained by Granger causality (Babiloni et al., 2016), phase coherence (McBride et al., 2013), and state space synchrony (Wang et al., 2016). Then, the features were obtained by averaging EEG signals across all recorded electrodes. Nevertheless, AD may arise from specific brain regions with variable effects on each channel's EEG signals, so the averaging approach may not be fully appropriate. Some techniques, e.g., EEG topographic map (Zheng et al., 2020), physiological cognition (Ranchet et al., 2017), and partial brain networks (Schöll, 2022), may be further carried out in future studies.
Another point the authors would like to mention is that the regional distribution over the brain of the features corresponding to AD was not always consistent for each EEG rhythm and each subject (Knyazeva et al., 2010; Tzimourta et al., 2019). Hence, future studies may focus on the detailed distribution of EEG to find the EEG source localization for AD pathogenesis, and then combine EEG signal manifestations with the causes of AD formation to achieve early detection of AD (Aghajani et al., 2013). Furthermore, deep learning methods based on large databases can also be explored in future work to realize end-to-end prediction (Khojaste-Sarakhsi et al., 2022).

Conclusion
The proposed integrated approach of three types of EEG signal features demonstrated promising results in differentiating AD patients from healthy controls. The fusion of spectrum, complexity, and synchronization features exhibited improved diagnostic accuracy compared to using individual features alone. This suggests that the combination of multi-domain features of EEG signals provides a more comprehensive representation of the neurophysiological changes associated with AD. This study recommends the integration of EEG features of spectrum, complexity, and synchronization for aiding the diagnosis of AD.

TP, FP, TN, and FN represent true positive, false positive, true negative, and false negative, respectively.

FIGURE 1 Examples of the frequency-domain and time-frequency-domain analyses of resting-state EEG for CN and AD subjects. (A) Frequency-domain spectrum. (B) Time-frequency-domain analysis.

FIGURE 2 Brain network analysis of resting-state EEG for CN and AD subjects. (A) Correlation matrix between each electrode. (B) Brain network connectivity.

TABLE 1
Accuracy, sensitivity, and specificity results of three classification algorithms with leave-one-person-out cross-validation.