Diagnosis of Amnesic Mild Cognitive Impairment Using MGS-WBC and VGBN-LM Algorithms

Computer-aided diagnosis (CAD) has undergone rapid development with the advent of advanced neuroimaging and machine learning methods. Nevertheless, extracting discriminative features from limited, high-dimensional data remains difficult, especially for amnesic mild cognitive impairment (aMCI) data based on resting-state functional magnetic resonance imaging (rs-fMRI). Furthermore, a robust and reliable system for aMCI detection is conducive to the timely detection and screening of subjects at high risk of Alzheimer’s disease (AD). In this scenario, we first develop a mask generation strategy based on the within-class and between-class criterion (MGS-WBC), which primarily aims at reducing data redundancy and excavating multiscale features of the brain. Concurrently, vector generation for brain networks based on the Laplacian matrix (VGBN-LM) is presented to obtain the global features of the functional network. Finally, all multiscale features are fused to further improve the diagnostic performance of aMCI. Typical classifiers for small data learning, such as naive Bayesian (NB), linear discriminant analysis (LDA), logistic regression (LR), and support vector machines (SVMs), are adopted to evaluate the diagnostic performance of aMCI. This study helps to reveal discriminative neuroimaging features and outperforms the state-of-the-art methods, providing new insights for the intelligent construction of a CAD system for aMCI.


INTRODUCTION
Alzheimer's disease (AD), which occurs frequently in elderly individuals, is a chronic and irreversible neurodegenerative disease accompanied by impairments in memory, communication, and reasoning (Dadar et al., 2017). A recent report indicates that more than 5.7 million individuals have been diagnosed with AD in the United States (Association, 2018). Unfortunately, there is as yet no agreed medication or treatment protocol to cure and rehabilitate patients with AD (Xi et al., 2022). Amnesic mild cognitive impairment (aMCI), which is generally characterized by prominent deficits in memory, is widely considered the early stage of AD due to its greater risk of conversion to AD (Bian et al., 2014; Barone et al., 2016). Accordingly, a robust and reliable system for aMCI detection is conducive to the timely detection and screening of subjects at high risk of AD, thus providing an optimal treatment period for patients.
Resting-state functional magnetic resonance imaging (rs-fMRI), which reflects the neural functional activity of the brain by measuring spontaneous blood oxygen level-dependent (BOLD) signal fluctuations in a non-invasive way, has been widely employed as one of the important techniques to diagnose brain-related diseases (Li R. et al., 2020; Sundaram et al., 2020). However, rs-fMRI data pose formidable challenges to existing pattern classification methods because of the limited number of samples and the high dimensionality, which is not favorable for aMCI detection. Consequently, existing methods for observing alterations of spontaneous neural activity, such as regional homogeneity (ReHo), the amplitude of low-frequency fluctuation (ALFF), and functional connectivity analysis, are employed to explore the biological mechanisms of brain function and reduce the redundant information of the brain (Harrison et al., 2019; Liao et al., 2019; Ting et al., 2020; Xiao et al., 2020). Furthermore, a few studies have reported that aMCI can be diagnosed using rs-fMRI data processed with the abovementioned methods (Zhou et al., 2014; Kim et al., 2020). Consequently, the diagnosis of aMCI based on these data can be employed for timely prevention and screening of potential patients with AD. In this study, each preprocessed data sample undergoes ReHo calculation, ALFF calculation, and functional connectivity construction, yielding two three-dimensional whole-brain structures and one brain functional network. Then, we develop an MGS-WBC model, which mainly consists of the intraclass volatility detection method using the variable coefficient and multiscale statistical thresholds, to excavate the multiscale features and reduce the redundant information in both the whole-brain structures and the functional network.
Notably, traditional approaches use only the local features of the functional network as the input of the classifier, while ignoring its global features. Therefore, we present a VGBN-LM model that extracts the global features and then concatenates them with the local features obtained from the MGS-WBC model to further excavate features of the functional network. Then, we fuse the features resulting from the MGS-WBC model and the VGBN-LM model to further improve the diagnostic performance of aMCI. Concurrently, to evaluate the validity of the obtained features and the diagnostic performance of aMCI, typical machine learning classifiers suited to a limited amount of data, such as naive Bayesian (NB), linear discriminant analysis (LDA), logistic regression (LR), and support vector machines (SVMs), are utilized. Note that leave-one-out cross-validation is employed to avert the overfitting problem in this study (Sangnawakij and Niwitpong, 2017). This study improves the diagnostic performance of aMCI compared with the traditional methods, providing new insights for the construction of computer-aided diagnosis (CAD) systems for neurodegenerative diseases, especially aMCI.
In summary, extracting discriminative features from limited, high-dimensional data remains difficult, especially for aMCI data based on rs-fMRI, and a robust and reliable system for aMCI detection is conducive to the timely detection and screening of subjects at high risk of AD. Hence, the novel aspects of this study are as follows. First, we develop an MGS-WBC model to extract discriminative features of whole-brain structures and functional networks using volatility detection and multiple significance thresholds. Second, we present a VGBN-LM model that excavates the global features of the functional network and fuses them with its local features, thus overcoming the low availability of the network features. Finally, all features generated from the MGS-WBC model and the VGBN-LM model are concatenated to further improve the diagnostic performance of aMCI.
The remainder of the study is structured as follows: We discuss the related studies on feature extraction and the diagnosis of aMCI in the "Related Work" Section. We present data source, data acquisition, and data preprocessing in the "Materials" Section. We describe the methodologies of aMCI's diagnostic system in the "Methods" Section. We provide the experiment results and analysis in the subsequent section and give the conclusion in the final section.

RELATED WORK
Advanced neuroimaging, machine learning techniques, and statistical algorithms provide an opportunity to understand how brain function differs between healthy controls (HCs) and patients with brain diseases, and to analyze the pathological mechanisms of the brain (Nadarajah and Kotz, 2006; Lei et al., 2019). Additionally, rs-fMRI, which reflects the functional condition of the brain in the default state and is convenient for patients with cognitive impairment who find it difficult to complete tasks, is widely employed in the study of diseases related to the central nervous system. Nevertheless, extracting discriminative features from limited, high-dimensional data remains difficult, especially for aMCI data based on rs-fMRI.
To address this issue, numerous previous studies have focused on excavating biomarkers associated with aMCI from the perspective of neuroimaging. For example, compared with the HC group, abnormal patterns of diverse-club and rich-club organizations in the functional network were revealed in the aMCI group using the two-sample t-test, indicating that the overlapping nodes might be potential biomarkers for the diagnosis of aMCI (Xue et al., 2020). Also, a previous study revealed abnormal alterations of ALFF in patients with aMCI in brain regions such as the right hippocampus, parahippocampal cortex, and left lateral temporal cortex, which may act as biomarkers of the disease (Xi et al., 2013). Another investigation confirmed, using the two-sample t-test, disrupted patterns of ReHo in the brain, which help to better comprehend the neural substrates of aMCI and can serve as biomarkers (Zhen et al., 2018). It should be noted that all the studies mentioned above rely on statistical methods and existing medical background knowledge to analyze the disrupted patterns of brain disease, without adopting machine learning techniques to assess the validity and reliability of diagnostic results.
Currently, machine learning techniques have been actively adopted by researchers to design CAD systems for aMCI, mainly aiming to improve classification performance and put it into clinical practice (Zhang et al., 2015; Yang et al., 2021). To address the limited amount of data and the information redundancy, feature selection is implemented first, and the selected features then serve as the input of machine learning classifiers suitable for small data learning, such as NB, LDA, LR, and SVM. A previous investigation on functional network classification of aMCI used the significant brain regions resulting from the two-sample t-test as the input of an SVM classifier, yielding an ACC of 69% (Lee et al., 2015). Moreover, it has been reported that the identification ACC of aMCI is 75.35% when significant regions of the ALFF are used as the input of an SVM classifier (Yang et al., 2018). Up to now, the existing literature on this topic is not only scarce but also focused on the single significant threshold method, ignoring the extraction of multiscale features in the brain.
In this study, we first develop an MGS-WBC model using volatility detection and multiple significance thresholds to extract multiscale features in both the whole-brain structures and the functional network. Concurrently, we present the VGBN-LM model to extract the global features of the functional network and fuse them with its local features. Finally, we fuse the extracted features generated from the MGS-WBC model and the VGBN-LM model to further improve the diagnostic performance of aMCI. Compared with the traditional method, the results show that the proposed methods provide an accuracy (ACC) of 89.55, 91.04, 92.54, and 94.03% with the same input data in four typical classifiers, namely NB, LDA, LR, and SVM, and the maximum improvements are 1.49, 1.49, 1.50, and 4.48%, respectively. Moreover, the area under the curve (AUC) values for the four classifiers mentioned above are 95.63, 92.69, 97.59, and 97.33%, increasing by 2.49, 1.96, 2.67, and 2.23% compared with the traditional method. It can be inferred that our aMCI diagnostic system is more suitable and reliable than the traditional method. In addition, this study can help reveal sensitive and discriminative neuroimaging features, providing new insights for the construction of a CAD system for aMCI.

Description of the Data Source
In this study, the raw data samples are composed of two groups: 39 patients with aMCI and 38 HCs. These raw data samples are obtained from the second stage of the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, which helps to explore neurological biomarkers of cognitive disease and assists in the diagnosis and treatment of the disease.
It deserves to be further mentioned that not all data samples are conformed to work demands. One data sample has been precluded due to undue head movement (cumulative translation

Data Acquisition and Preprocessing
All the data samples we focus on in this study are based on rs-fMRI obtained using a clinical 3.0-Tesla (T) scanner. Note that each raw rs-fMRI data sample consists of brain volumes at 140 time points. To maintain the reliability of the raw data samples, the subjects are instructed to lie flat, to close their eyes without thinking, and to keep their head in position throughout the imaging process. The scanner parameters are as follows (Li W. et al., 2020): echo time (TE) = 30 ms, repetition time (TR) = 3,000 ms, voxel size = 3.31 mm × 3.31 mm × 3.31 mm, flip angle (FA) = 80°. More detailed information about the scanner parameters for the rs-fMRI data can be found on the ADNI website.
We preprocess the raw data samples using the Resting-State fMRI Data Analysis Toolkit plus (RESTplus), which is based on MATLAB 2012a and the Statistical Parametric Mapping software (SPM12). The preprocessing steps are as follows. First, owing to machine and human factors, the imaging signal acquired at the beginning of scanning may be unstable; thus, the first 5 time points of each data sample are discarded. Second, we calibrate the brain volumes of the remaining time points to eliminate the effects induced by discrepancies between volumes or slices. Third, because the shape of each subject's brain is different, we normalize the brains of all data samples from the original coordinate system to the Montreal Neurological Institute (MNI) coordinate system using the echo-planar imaging (EPI) template. We then smooth out the noise in the brain using a Gaussian kernel of 6-mm full width at half maximum (FWHM) for the functional network and ALFF analysis (Yang et al., 2021). After that, we regress out nuisance covariates, including the cerebrospinal fluid signal, global mean signal, white matter signal, and 6 head motion parameters. Finally, the signals with important physiological meaning are obtained through a bandpass filter of 0.01-0.08 Hz (Vicente et al., 2018). The data samples preprocessed with the steps above are utilized in the following sections.
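As a rough illustration of the final preprocessing step, the sketch below applies an ideal FFT-based band-pass filter of 0.01-0.08 Hz to a single time series sampled at TR = 3 s. It is not the RESTplus implementation; the function name `bandpass_fft`, the ideal (brick-wall) filter shape, and the synthetic signal are assumptions made purely for illustration.

```python
import numpy as np

def bandpass_fft(signal, tr=3.0, low=0.01, high=0.08):
    """Ideal band-pass filter: zero every Fourier component outside [low, high] Hz."""
    signal = np.asarray(signal, dtype=float)
    n = signal.shape[-1]
    freqs = np.fft.rfftfreq(n, d=tr)       # frequency (Hz) of each rFFT bin
    spectrum = np.fft.rfft(signal, axis=-1)
    keep = (freqs >= low) & (freqs <= high)
    spectrum[..., ~keep] = 0.0             # discard components outside the pass band
    return np.fft.irfft(spectrum, n=n, axis=-1)

# Hypothetical signal: 135 time points remain after discarding the first 5 of 140.
t = np.arange(135) * 3.0
slow = np.sin(2 * np.pi * 0.04 * t)        # inside the pass band
fast = np.sin(2 * np.pi * 0.15 * t)        # outside the pass band
filtered = bandpass_fft(slow + fast)
```

With TR = 3 s the highest representable frequency is about 0.167 Hz, so the upper cut-off of 0.08 Hz lies well inside the usable range.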

Overview of the Amnesic Mild Cognitive Impairment Diagnosis System
In this study, we develop a diagnosis system for aMCI. At first, the raw data samples based on rs-fMRI are preprocessed as described in the "Materials" Section. In the next step, the preprocessed data samples are processed with typical approaches for observing alterations of neuronal activity, namely ReHo, ALFF, and functional connectivity, yielding two whole-brain structures and one functional network for each data sample. Then, the resulting data are used as the input of the proposed MGS-WBC and VGBN-LM models to extract biological features of the whole-brain structures and functional networks. To evaluate the validity of the obtained features, NB, LDA, LR, and SVM are utilized in this study. Furthermore, leave-one-out cross-validation is carried out to avert the overfitting problem.
FIGURE 1 | Overall framework of the amnesic mild cognitive impairment (aMCI) diagnosis system.
In the results, the features generated by the MGS-WBC model and VGBN-LM model yield better performance than the traditional method in the aMCI detection process. The framework of our aMCI diagnosis system is illustrated in Figure 1.

Mask Generation Strategy Based on Within-Class and Between-Class Criterion
In this section, we develop an MGS-WBC model to reduce the redundancy in both the whole-brain structures and functional networks and to further excavate the features corresponding to multiscale significance thresholds. Notably, the connectivity of the functional network is constructed using the Pearson correlation coefficient (PCC), whose inputs are the time series of the corresponding brain regions. Also, to better extract the time series of the brain, we employ the general anatomical automatic labeling (AAL) template to segment the brain into 90 regions (Ju et al., 2019). The structure of the MGS-WBC model, shown in Figure 2, mainly consists of within-class volatility detection using the variable coefficient and multiscale significance thresholds. Since the mask obtained with the threshold 0.001 < p < 0.05 is much larger than that obtained with p < 0.001, the traditional method only uses the mask with p < 0.001 for feature extraction and leaves the remaining features unused. Besides, for a limited amount of high-dimensional data, how to effectively extract the discriminative features corresponding to the threshold 0.001 < p < 0.05 is not clear.
First, the intraclass volatility detection mask of the whole-brain structures and the 90 × 90 functional network, defined as $\mathrm{Mask}_{\mathrm{volatility}}$, is generated by computing the variable coefficient within the group and retaining the smallest 5% of values (Xu et al., 2017). The variable coefficient (VC) is defined in (1), where $\bar{x}$ denotes the average value of the selected pixel or selected brain region, $n$ denotes the number of subjects within the group, and $x_j$ denotes the value of the selected pixel or selected brain region for the $j$th subject.
To obtain the features under multiple significance thresholds, we combine $\mathrm{Mask}_{\mathrm{volatility}}$ with a two-sample t-test to generate the mask for 0.001 < p < 0.05, defined as $\mathrm{Mask}_{0.001<p<0.05}$, which retains features whose class averages differ markedly while their within-class fluctuations remain stable. Next, we use the traditional method, which extracts the features obtained with p < 0.001, to generate a second mask, defined as $\mathrm{Mask}_{p<0.001}$. Finally, $\mathrm{Mask}_{0.001<p<0.05}$ and $\mathrm{Mask}_{p<0.001}$ are concatenated as the input mask of the whole-brain structures or functional networks to extract the final features. It is worth mentioning that the whole-brain structures employed in this study are calculated using the ALFF method and the ReHo method, respectively, which are commonly employed to analyze neural activity in the brain and reduce the dimensions of brain data.
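The mask construction described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that the VC is the within-group standard deviation divided by the within-group mean, that a feature must be among the 5% most stable in both groups to enter the volatility mask, and that the two-sample t-test p-values have already been computed. The function names and the toy data are hypothetical.

```python
import numpy as np

def variable_coefficient(group):
    """Within-group VC per feature; `group` has shape (n_subjects, n_features)."""
    group = np.asarray(group, dtype=float)
    mean = group.mean(axis=0)
    std = group.std(axis=0)                        # divides by n (assumption)
    return std / np.maximum(np.abs(mean), 1e-12)   # guard against a zero mean

def mgs_wbc_mask(group_a, group_b, p_values, keep_fraction=0.05):
    """Union of the stable mid-significance mask and the strict p < 0.001 mask."""
    p_values = np.asarray(p_values, dtype=float)
    stable = np.ones(p_values.shape, dtype=bool)
    for group in (group_a, group_b):
        vc = variable_coefficient(group)
        # Keep features whose VC lies in the lowest `keep_fraction` of this group.
        stable &= vc <= np.quantile(vc, keep_fraction)
    mask_mid = stable & (p_values > 0.001) & (p_values < 0.05)
    mask_strict = p_values < 0.001
    return mask_mid | mask_strict

# Toy data: 3 subjects per group, 4 features, hypothetical p-values.
group_a = np.array([[10, 1, 5, 2], [10, 2, 6, 4], [10, 3, 7, 9]])
group_b = np.array([[20, 4, 1, 3], [20, 6, 5, 4], [20, 8, 9, 5]])
p_values = np.array([0.01, 0.0005, 0.03, 0.2])
mask = mgs_wbc_mask(group_a, group_b, p_values, keep_fraction=0.25)
```

In the toy data, the third feature has p = 0.03 but fluctuates strongly within the groups, so it is excluded, whereas the first feature (p = 0.01, stable) and the second (p < 0.001) are kept.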

Vector Generation for Brain Networks Based on Laplacian Matrix
In this section, we present a VGBN-LM model for extracting the global features of the functional network. Laplacian Eigenmaps (LE) is a non-linear dimensionality reduction method, which constructs the relationship between samples from the local point of view, and can reflect the local relationship of samples to a certain extent. That is, the points related to each other are expected to be as close as possible to each other in the space after dimensionality reduction. Accordingly, we mainly use LE's conception to extract global features of the functional network.
Here is a brief introduction to the LE algorithm (Belkin and Niyogi, 2001):
Step 1: Construct the graph G(V, W). Given n samples $x_1, x_2, \ldots, x_n$ in $\mathbb{R}^{d}$, the edge connection $w_{ij}$ between sample i and sample j is established using the b nearest neighbors, and $w_{ij}$ is usually computed with the Gaussian kernel function shown in (2). Notably, the weight matrix W is symmetric, D represents the node degree matrix shown in (3), and $x_n$ represents the feature vector corresponding to sample n.
Step 2: To preserve the geometric properties as far as possible after reducing the dimension of the functional network, the embedding is obtained by minimizing $y^{T}Ly$. That is, the larger the connection value $w_{ij}$, the closer the values of $y_i$ and $y_j$, as shown in (4). The problem thus reduces to (5).
Step 3: Finally, applying the Lagrange multiplier method, the optimal result is obtained from (6).
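Equations (2)-(6) are not reproduced in this text. For orientation, the standard Laplacian Eigenmaps formulation of Belkin and Niyogi (2001) can be written as below. The kernel width $t$ and the line-by-line correspondence with the numbering (2)-(6) are assumptions.

```latex
\begin{align}
w_{ij} &= \exp\!\left(-\frac{\lVert x_i - x_j \rVert^{2}}{t}\right)
          \ \text{if } x_i \text{ and } x_j \text{ are neighbors, and } 0 \text{ otherwise}, \\
D_{ii} &= \sum_{j} w_{ij}, \qquad L = D - W, \\
\frac{1}{2}\sum_{i,j} (y_i - y_j)^{2}\, w_{ij} &= y^{T} L\, y, \\
y^{*} &= \arg\min_{y^{T} D y = 1} \; y^{T} L\, y, \\
L\, y &= \lambda D\, y .
\end{align}
```

The third line makes the intuition of Step 2 explicit: a large weight $w_{ij}$ penalizes a large gap between $y_i$ and $y_j$.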
However, the goal of the LE algorithm is to reduce the dimension of the sample features, and the dimension of the reduced features depends on the number of smallest nonzero eigenvalues, which does not meet the requirement of extracting the global features from the functional network in this study. Moreover, it is unable to transform the functional network from two dimensions to one dimension while maintaining the global features. Thus, considering the limited and high-dimensional data, the VGBN-LM model based on the Laplacian matrix is proposed to reduce the dimension of the 90 × 90 functional network and obtain its global features. Concurrently, we expect the functional network after dimension reduction to preserve and reflect the geometric properties of the original functional network. That is, we represent the global features of the 90 × 90 functional network as a reduced-dimension vector.
First, we construct the minimization problem based on the Laplacian matrix. More specifically, let
$$\arg\min_{y^{T}Dy=\sigma} \; y^{T}Ly, \tag{7}$$
where σ represents a specific real number whose value will be introduced later. We then use the Lagrange multiplier method to acquire the optimal result, introducing the multiplier λ and the Lagrangian $G(y, \lambda)$, and take the derivatives of $G(y, \lambda)$ with respect to y and λ, respectively.
Thus, the optimal result turns to determine the solution of (11).
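The intermediate equations of this step are not reproduced in this text. A derivation consistent with the description, for the problem of minimizing $y^{T}Ly$ subject to $y^{T}Dy = \sigma$, is sketched below. The sign convention of the Lagrangian is an assumption, and the symmetry of $L$ and $D$ is used when differentiating.

```latex
\begin{align}
G(y, \lambda) &= y^{T} L\, y - \lambda \left( y^{T} D\, y - \sigma \right), \\
\frac{\partial G}{\partial y} &= 2 L\, y - 2 \lambda D\, y = 0, \\
\frac{\partial G}{\partial \lambda} &= -\left( y^{T} D\, y - \sigma \right) = 0, \\
\Longrightarrow \quad L\, y &= \lambda D\, y, \qquad y^{T} D\, y = \sigma .
\end{align}
```

Left-multiplying $Ly = \lambda Dy$ by $y^{T}$ gives $y^{T}Ly = \lambda\sigma$ at every stationary point, so candidate solutions can be compared through their values of $\lambda$ and $\sigma$.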
Nevertheless, the LE algorithm employs the constraint $y^{T}Dy = 1$ to obtain the eigenvectors corresponding to all nonzero eigenvalues, ignoring that some of the optimal results may not satisfy the original constraint. Moreover, the generated eigenvectors of the 90 × 90 functional network cannot represent its global features because of the excessive redundant features and the limited amount of data.
Next, we rearrange $Ly = \lambda Dy$ to obtain the optimal result. If $\lambda \neq 1$, we obtain
$$(1-\lambda)\,y^{T}Dy = y^{T}Wy \;\Rightarrow\; y^{T}Dy = \frac{y^{T}Wy}{1-\lambda}.$$
If $\lambda = 1$, it can be inferred that
$$(1-\lambda)\,y^{T}Dy = y^{T}Wy.$$
Therefore, the optimal result of (11) can be further rewritten as (15), and the solution of (15) is reached by backward induction. Let $(\lambda_{\xi_i}, y_{\xi_i})$ denote the solutions of
$$\begin{cases} y^{T}Dy = \dfrac{y^{T}Wy}{1-\lambda} \;\Leftrightarrow\; Ly = \lambda Dy, \\ y^{T}Dy = \sigma, \end{cases} \qquad \lambda \neq 1;$$
then we can deduce that $\sigma_{\xi_i} = y_{\xi_i}^{T} D\, y_{\xi_i}$. Based on this, $\arg\min_{y^{T}Dy=\sigma} y^{T}Ly$ takes the form given in (16), where d represents the number of solutions of $Ly = \lambda Dy$. Concurrently, we assume that $Dy = Ly$ has solutions, denoted $y_{\beta_j}$, and set $\sigma_{\beta_j} = y_{\beta_j}^{T} D\, y_{\beta_j}$. From this, it can be deduced that the corresponding equation has a solution. Therefore, $\arg\min_{y^{T}Dy=\sigma} y^{T}Ly$ can be expressed as in (17), where h represents the number of solutions of $Dy = Ly$. To sum up, the optimal result of (7) can be written as (18), which takes the minimum over both solution sets, $\{(\lambda_{\xi_i}, y_{\xi_i})\}_{i=1}^{d}$ and $\{y_{\beta_j}\}_{j=1}^{h}$. Notably, the value of σ is obtained from the optimal solution in (18).
Here, the procedure of the VGBN-LM model can be summarized as follows:
Step 1: Given n brain regions extracted using the AAL template, we use the absolute value of the PCC shown in (19) to construct the edge connection $w_{ij}$ between brain region i and brain region j, instead of the Gaussian kernel function and the b nearest neighbors method.
where $x_i$ and $y_i$ denote the two signals from two different brain regions, $\bar{x}$ denotes the average value corresponding to $x_i$, $\bar{y}$ denotes the average value corresponding to $y_i$, and n denotes the total number of brain regions in the AAL template.
Step 2: We apply (7-19) to reduce the matrix $w_{ij}$ from two dimensions to one dimension while preserving, to a certain degree, the global structure information in the lower-dimensional space. That is, the closer the connections between brain regions before dimension reduction, the nearer the corresponding values after dimension reduction. In this way, the generated vector $y_{opt}$ resulting from (18) contains the global features of the functional network.
Step 3: Due to the limited amount of data, the two-sample t-test method is employed to select the global features with obvious differences (p < 0.05) in the generated vector y opt , and then, we concatenate the local features obtained from MGS-WBC model for the following analysis.
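Steps 1 and 2 can be illustrated with the following sketch, which builds $W$ from the absolute PCC, forms $L = D - W$, and solves $Ly = \lambda Dy$. Because the selection rule in (18) is not reproduced in this text, the sketch simply takes the eigenvector of the smallest non-trivial eigenvalue and rescales it so that $y^{T}Dy = \sigma$ for a user-chosen σ. That choice, the function name `global_vector`, and the random input data are all assumptions, and the t-test selection of Step 3 is not included.

```python
import numpy as np

def global_vector(time_series, sigma=1.0):
    """Map a (regions x time points) array to one value per region.

    Solves L y = lambda * D y with W = |Pearson correlation| and returns the
    smallest non-trivial eigenpair, scaled so that y^T D y = sigma.
    """
    w = np.abs(np.corrcoef(time_series))     # edge weights w_ij = |PCC|
    np.fill_diagonal(w, 0.0)                 # drop self-connections
    deg = w.sum(axis=1)                      # diagonal of the degree matrix D
    lap = np.diag(deg) - w                   # graph Laplacian L = D - W
    # D^{-1/2} L D^{-1/2} is symmetric and shares eigenvalues with L y = lambda D y.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    sym = lap * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(sym)   # eigenvalues in ascending order
    idx = 1                                  # index 0 is the trivial eigenvalue 0
    y = d_inv_sqrt * eigvecs[:, idx]         # back-transform to the generalized problem
    y *= np.sqrt(sigma / (y @ (deg * y)))    # enforce y^T D y = sigma
    return eigvals[idx], y

# Hypothetical input: 90 AAL regions, 135 time points of random data.
rng = np.random.default_rng(0)
ts = rng.standard_normal((90, 135))
lam, y = global_vector(ts)
```

The returned vector has one entry per brain region, which matches the idea of representing the 90 × 90 network by a single reduced-dimension vector.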

Multiscale Feature Fusion
We concatenate all the generated features obtained from the MGS-WBC model and the VGBN-LM model to further improve the diagnostic performance of aMCI. That is, if the dimensions of two input features $x_{input}$ and $y_{input}$ are p and q, then the dimension of the output feature z is equal to p + q.

Classification Using Naive Bayesian, Linear Discriminant Analysis, Logistic Regression, and Support Vector Machine
As described above, we have obtained multiscale features of whole-brain structures and functional networks using the MGS-WBC model and the VGBN-LM model. Considering the limited amount of data, the selection of an appropriate classifier is crucial to evaluate the validity of the obtained features and improve the diagnostic performance of aMCI. Fortunately, the SVM classifier and the LR classifier exhibit effective diagnostic performance for mild cognitive impairment (MCI) and have been exploited by the majority of researchers (Ciulli et al., 2016; Khatun et al., 2019). In addition, typical classifiers applied to small data learning, such as the NB classifier and the LDA classifier, are also employed to better assess the generalization ability of the features. The following is a brief introduction to the classifiers mentioned above.
The NB classifier mainly uses the probabilities of known data to determine the class of unknown data (Liu et al., 2016): (1) Each data sample x is composed of m-dimensional features, denoted by $a_1, a_2, a_3, \ldots, a_m$.
where $p(l_k \mid x)$ can be deduced by (20) and k is the class number of x.
The main idea of the LDA classifier is to ensure that, in the space after the projection, the intraclass variance of each class is small and the difference between class means is large (Ji and Ye, 2008). Given the input data set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ with $y_i \in \{0, 1\}$, let $N_j$ (j = 0, 1) denote the number of data samples in category j, x denote the input, and $X_j$ (j = 0, 1) denote the set of data samples in category j. The mean vector of the data samples in class j is given by (21), and the covariance matrix of the data samples in class j is denoted as $\Sigma_j$ (22).
Then, the within-class divergence matrix is defined as $S_w$ (23),
where μ denotes the mean value. Simultaneously, the between-class divergence matrix is described as $S_b$ (24).
In this way, the optimization objective is rewritten as (25), and, according to (21-25), the optimal projection direction w is obtained in (26). That is, w can be determined by calculating the mean and variance of the original two classes of samples, and the classification is then conducted on this basis.
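Equations (21)-(26) are not reproduced in this text. The standard two-class Fisher formulation that matches the definitions above reads as follows. The unnormalized form of $\Sigma_j$ and the correspondence with the numbering (21)-(26) are assumptions.

```latex
\begin{align}
\mu_j &= \frac{1}{N_j} \sum_{x \in X_j} x, \qquad j = 0, 1, \\
\Sigma_j &= \sum_{x \in X_j} (x - \mu_j)(x - \mu_j)^{T}, \\
S_w &= \Sigma_0 + \Sigma_1, \\
S_b &= (\mu_0 - \mu_1)(\mu_0 - \mu_1)^{T}, \\
J(w) &= \frac{w^{T} S_b\, w}{w^{T} S_w\, w}, \\
w^{*} &\propto S_w^{-1} (\mu_0 - \mu_1).
\end{align}
```

Maximizing $J(w)$ enlarges the projected distance between the class means while shrinking the projected within-class scatter, which is exactly the criterion stated at the start of the LDA description.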
The LR classifier is a probabilistic statistical classification model, which uses a probability score as the predicted value of the dependent variable to evaluate the relation between the dependent variable and the independent variables. Specifically, we utilize L2 regularization to address the overfitting problem in this study (Chen et al., 2011). The loss function with L2 regularization is given in (27), where $h_\theta(x)$ is defined in (28). Then, we take the derivative of (27),
where m represents the number of data samples and η represents the regularization coefficient. Finally, the parameters $\theta_j$ corresponding to the minimum loss function are obtained through continuous iteration.
The SVM classifier seeks the optimal separating hyperplane in the feature space that maximizes the margin between positive and negative data samples on the training set. It is also worth noting that the kernel function adopted in this study is the Gaussian radial basis function kernel (Zhou et al., 2014). The SVM classifier mainly solves the following optimization problem (Cristovao et al., 2022).
In the SVM optimization problem with the Gaussian kernel, C represents the penalty coefficient, $x_i$ represents the data sample, $y_i$ represents the class of the ith data sample, w and b represent the optimal parameters of the model, N represents the number of data samples, and $\psi_i$ represents the slack variable that corresponds to the ith data sample.
In this study, leave-one-out cross-validation is employed to avert the overfitting problem of the classifiers and to cope with the limited amount of data, which means that, in turn, one data sample is employed as the testing set and the remaining samples are employed as the training set. Besides, the classifier parameters are tuned according to the optimal principle.
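The leave-one-out procedure can be sketched as follows. To keep the example self-contained, a simple two-class Fisher discriminant stands in for the four classifiers used in this study. The function names, the small ridge term added to the scatter matrix, the midpoint decision threshold, and the synthetic data are assumptions.

```python
import numpy as np

def fit_fisher_lda(x, y):
    """Return projection w and threshold c for two classes labeled 0 and 1."""
    x0, x1 = x[y == 0], x[y == 1]
    mu0, mu1 = x0.mean(axis=0), x1.mean(axis=0)
    s_w = (x0 - mu0).T @ (x0 - mu0) + (x1 - mu1).T @ (x1 - mu1)
    s_w += 1e-6 * np.eye(x.shape[1])      # small ridge for numerical stability
    w = np.linalg.solve(s_w, mu1 - mu0)   # direction S_w^{-1} (mu1 - mu0)
    c = 0.5 * (w @ mu0 + w @ mu1)         # midpoint between projected class means
    return w, c

def loocv_accuracy(x, y):
    """Each sample is the test set once; all remaining samples form the training set."""
    n = len(y)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i
        w, c = fit_fisher_lda(x[train], y[train])
        pred = int(x[i] @ w > c)
        correct += int(pred == y[i])
    return correct / n

# Synthetic, well-separated data: 30 samples per class, 3 features.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 1.0, size=(30, 3)),
               rng.normal(3.0, 1.0, size=(30, 3))])
y = np.array([0] * 30 + [1] * 30)
acc = loocv_accuracy(x, y)
```

Any of the NB, LDA, LR, or SVM classifiers could replace `fit_fisher_lda` inside the loop without changing the cross-validation logic.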

Evaluation Criteria
In this study, we obtain features using the MGS-WBC model and VGBN-LM model. To evaluate the validity of obtained features, the most widely used measurements for binary classification problems, such as ACC, F1-Score, and AUC, are adopted based on the confusion matrix (Kam et al., 2020).
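For reference, the three measurements can be computed as in the sketch below. It assumes binary labels with 1 as the positive class, and it computes the AUC in its rank-based form from continuous classifier scores rather than from hard predictions. The function names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

def f1_score(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example with six samples.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
acc_value = accuracy(y_true, y_pred)
f1_value = f1_score(y_true, y_pred)
auc_value = auc(y_true, scores)
```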

EXPERIMENT RESULTS AND ANALYSIS
In this section, compared with the traditional method, we first analyze the diagnostic performance of aMCI by using the whole-brain structural features obtained from the MGS-WBC model as the input of four classifiers. Then, we analyze the diagnostic performance of aMCI by using the global features obtained from VGBN-LM model and local features obtained from MGS-WBC model as the input of four classifiers compared with the traditional method. In the end, we fuse all features, including whole-brain structural features, local features, and global features, to further improve the diagnostic performance of aMCI. It is notable that the order of the above four classifiers is as follows: NB, LDA, LR, and SVM.

The Performance Analysis of Whole-Brain Structural Features
In our experiments, the MGS-WBC model is applied to two whole-brain structures, ReHo and ALFF, to reduce the data redundancy and extract multiscale features. For the whole-brain structure calculated by ReHo, as shown in Table 1 and Figures 4, 5, we obtain 10 significant regions of the brain using the MGS-WBC model, including the right gyrus rectus (GR), left pallidum (PAL), left cingulate gyrus (CG), left supplementary motor area (SMA), left inferior cerebellum (IC), right IC, left middle temporal gyrus (MTG), left middle frontal gyrus (MFG), right MFG, and left middle occipital gyrus (MOG), while only six significant brain regions, namely, left IC, right IC, left MTG, left MFG, right MFG, and left MOG, are found using a single significant threshold (two-sample t-test, p < 0.001). Notably, each significant region is composed of the activated voxels in the brain. Then, the features extracted from the 10 significant regions and the six significant regions are separately employed as the input of the classifiers, including NB, LDA, LR, and SVM, to evaluate the diagnostic performance of aMCI. Table 2 presents the diagnostic performance of aMCI using different feature extraction methods of the whole-brain structure (ReHo). The results of MGS-WBC 2 all exceed 66.67% in the four classifiers. Besides, compared with the single significant threshold in the whole-brain structures (SSW 1) (Zhang et al., 2015; Yang et al., 2018), the maximum improvements are 4.30, 2.23, 9.49, and 3.95%.
Next, for the whole-brain structure after ALFF computation, as shown in Table 3 and Figure 6, only two significant regions of the brain corresponding to different locations are found, whereas no significant region exists using SSW 1. As shown in Table 4, the results of MGS-WBC 2 or MGS-WBC 3 provide the ACC of 65.67, 73.13, 71.64, and 67.17% in the four classifiers. In addition, we get the AUC of 65.67, 77.72, 77.18, and 75.22% in the four classifiers. For the F1-Score, we obtain 65.67, 72.73, 69.84, and 60.71% in the four classifiers.
Our goal is to make the average difference between the classes larger and the variance within the classes more stable. Consequently, we develop an MGS-WBC model using multiple significance thresholds and volatility detection to generate the fused features of whole-brain structures. Through ablation experiments, it can be inferred that the features of whole-brain structures extracted using MGS-WBC improve the diagnostic performance of aMCI in typical classifiers compared with SSW 1. This indicates that the MGS-WBC model we present is valid and that the model can also effectively remove redundant information from the brain.
Table notes: SSW 1 refers to the single significant threshold (two-sample t-test, p < 0.001) in the whole-brain structures (Zhang et al., 2015; Yang et al., 2018). MGS-WBC 2 refers to the combination of volatility detection and significant threshold (0.001 < p < 0.05) in the MGS-WBC model. MGS-WBC 3 refers to the fused features, that is, the final output of the MGS-WBC model. The x, y, and z coordinates are the primary peak locations in the MNI space. Cluster size ≥ 5 voxels in two-sample t-test. VER, vermis.

The Performance Analysis of Functional Network Features
In our experiments, to fully excavate the features of the functional network, we develop an MGS-WBC model to extract local features. Meanwhile, we present a VGBN-LM model to extract global features of the functional network and fuse its local features to improve the diagnostic performance of aMCI.
First, we select the local features of the functional network using the MGS-WBC model to improve the diagnostic performance of aMCI. Three pairs of connected brain regions with notable differences are found using the MGS-WBC model, namely (45, 46), (73, 76), and (74, 76), as can be seen from Table 5. In addition, two pairs of connected brain regions, (64, 58) and (63, 74), are found using SSW₁. The brain regions involved in the local feature selection are shown in Figure 7, where the light blue balls denote the selected brain regions and a red line indicates that two selected brain regions are connected. Notably, the numbers in Figure 7 and Table 6 denote the brain regions segmented using the AAL template.

Table 5 lists the diagnostic performance of aMCI using the local features obtained from the MGS-WBC model, compared with SSF₁. When the features generated from MGS-WBC₂ are used as the input of the four classifiers (NB, LDA, LR, and SVM), the classification metrics range from 55.38 to 69.61%. Except for LDA, each of the NB, LR, and SVM classifiers improves considerably using MGS-WBC₃ compared with SSF₁. Although the ACC and F1-Score of LDA decrease by 1.49 and 0.24%, respectively, its AUC increases by 5.35%.
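Because ACC, F1-Score, and AUC can move in different directions, as in the LDA case above, it may help to recall how the three metrics are computed. The following is a minimal sketch using the standard definitions on made-up labels and scores. The 0.5 decision threshold and the toy data are assumptions for illustration and do not come from this study.

```python
def accuracy(y_true, y_pred):
    """Fraction of subjects whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = patient, 0 = control; scores are classifier outputs.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores))
```

ACC and F1-Score depend on the decision threshold, whereas AUC depends only on how the scores rank the subjects. This is one way a classifier can lose slightly on ACC and F1-Score while gaining on AUC.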
We further develop a VGBN-LM model to extract global features of the functional network, including the left MFG (p = 0.035), left hippocampus (p = 0.030), right hippocampus (p = 0.038), left amygdala (p = 0.009), and right amygdala (p = 0.013) (all two-sample t-tests). The global features are then used as the input of classifiers to evaluate the validity of the extracted features. As summarized in Table 7, the ACC, F1-Score, and AUC in the four classifiers all exceed 61% using VGBN-LM. Moreover, VGBN-LM provides ACC of 64.18, 62.69, 65.67, and 61.19% with the 5 global features obtained using VGBN-LM. Then, we fuse the multiscale features of the functional network, including local and global features, to further improve the diagnostic performance of aMCI. The local features of the functional network are obtained using SSF₁ and MGS-WBC₂. The results using MGS-VGBN₁ in the classifiers, including NB and LDA, are summarized in Table 8.

Regarding limited and high-dimensional data, we present a VGBN-LM model that excavates the global features of the functional network and fuses them with the local features obtained by the MGS-WBC model. The ablation experiments show that the proposed multiscale features of the functional network improve the diagnostic performance of aMCI compared with SSF₁, indicating that the proposed global features can play a certain role in the intelligent diagnosis of aMCI. Besides, the extracted features can locate the significant regions in the brain, which is convenient for doctors to conduct further studies.
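For readers unfamiliar with Laplacian-based network descriptors, the sketch below shows how a graph Laplacian can be built from one subject's functional connectivity matrix and reduced to one value per brain region. It is a generic illustration under stated assumptions and not the VGBN-LM construction itself, which is not detailed in this section. Using absolute correlations as edge weights and taking the Laplacian diagonal as the per-region vector are both assumptions, and `laplacian_vector` is a hypothetical helper name.

```python
import numpy as np


def laplacian_vector(conn):
    """Map an ROI-by-ROI connectivity matrix to (Laplacian, per-ROI vector)."""
    # Assumption: absolute correlation is used as a non-negative edge weight.
    w = np.abs(np.asarray(conn, dtype=float))
    w = (w + w.T) / 2.0          # enforce symmetry
    np.fill_diagonal(w, 0.0)     # drop self-connections
    degree = w.sum(axis=1)
    lap = np.diag(degree) - w    # graph Laplacian L = D - W
    # Assumption: summarize each ROI by its Laplacian diagonal entry,
    # which equals the weighted degree of that ROI.
    return lap, np.diag(lap).copy()


# Synthetic example: 120 time points for 6 hypothetical ROIs.
rng = np.random.default_rng(1)
signals = rng.normal(size=(120, 6))
conn = np.corrcoef(signals, rowvar=False)   # 6 x 6 functional connectivity

lap, vec = laplacian_vector(conn)
print(vec.shape)
```

Stacking one such vector per subject gives a subject-by-region matrix, on which a region-wise two-sample t-test could then be run to pick out regions such as those reported above.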

The Performance Analysis of Fused Features
In our experiments, we mainly fuse the multiscale features, including whole-brain structural features and features of functional network, to improve the diagnostic performance of aMCI.
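One simple way to fuse feature sets of different scales is to standardize each block and concatenate them column-wise. The sketch below illustrates this under the assumption that fusion is a plain concatenation. The fusion rule is not spelled out in this section, and the block names, sizes, and the `zscore` and `fuse` helpers are hypothetical.

```python
import numpy as np


def zscore(block):
    """Standardize each feature column to zero mean and unit variance."""
    mu = block.mean(axis=0)
    sd = block.std(axis=0)
    sd[sd == 0] = 1.0            # leave constant columns unscaled
    return (block - mu) / sd


def fuse(*blocks):
    """Concatenate standardized subject-by-feature blocks column-wise."""
    return np.hstack([zscore(np.asarray(b, dtype=float)) for b in blocks])


# Synthetic example: 8 subjects, four hypothetical feature blocks.
rng = np.random.default_rng(2)
reho = rng.normal(5.0, 2.0, size=(8, 10))       # whole-brain ReHo features
alff = rng.normal(1.0, 0.1, size=(8, 2))        # whole-brain ALFF features
local_fc = rng.normal(0.2, 0.3, size=(8, 3))    # local network features
global_fc = rng.normal(10.0, 4.0, size=(8, 5))  # global network features

fused_features = fuse(reho, alff, local_fc, global_fc)
print(fused_features.shape)  # (8, 20)
```

Standardizing before concatenation keeps a block with large raw values from dominating the classifier. In a cross-validated setting, the means and standard deviations would normally be estimated on the training folds only.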
As summarized in Table 9, compared with the single significant threshold (p < 0.001) in the whole-brain structures and functional network (SSWF₁) (Zhang et al., 2015; Yang et al., 2018), this study provides about 76.47, 77.61, 76.92, and 78.79% in the four classifiers, increasing by 1.67, 1.05, 1.64, and 4.26%, respectively.

SSWF₁ refers to the single significant threshold (p < 0.001) in the whole-brain structures and functional network (Zhang et al., 2015; Yang et al., 2018). FUSE refers to the fused features obtained from the MGS-WBC and VGBN-LM models in the whole-brain structures and functional network.

Over the years, numerous studies have reported the extraction of biomarkers of aMCI based on rs-fMRI data (Yang et al., 2018, 2021). However, owing to limited data, very few studies have addressed multiscale feature extraction of whole-brain structures and the functional network, or how to use machine learning methods to verify the effectiveness of the features. Furthermore, a robust and reliable system for aMCI detection is conducive to timely detecting and screening patients at a high risk of AD. Based on this, we extract multiscale features according to the characteristics of the brain and develop the framework of the diagnostic system. Our proposed models, including the MGS-WBC model and the VGBN-LM model, outperform the traditional approaches (SSWF₁) (Zhang et al., 2015; Yang et al., 2018) in this study. The fused features obtained from the MGS-WBC and VGBN-LM models therefore prove more useful than the traditional single-scale features for the diagnosis of aMCI. Furthermore, our feature selection methods can make the diagnosis of aMCI more accurate and reliable, providing novel insights for the extraction of discriminative neuroimaging features.

CONCLUSION
In this study, we present a system for the diagnosis of aMCI. We first develop an MGS-WBC model to extract discriminative features of whole-brain structures and functional networks. Then, we propose a VGBN-LM model that excavates the global features of the functional network and fuses them with its local features, thus overcoming the low availability of the functional network features. Finally, we fuse all the features generated from the MGS-WBC and VGBN-LM models to further improve the diagnostic performance of aMCI. The results demonstrate that the proposed approach outperforms the traditional method. In conclusion, the proposed feature extraction methods could be applied to the detection of other similar neurological diseases of the brain, providing new insights for the intelligent construction of CAD systems. Future work includes applying our approaches to other brain diseases and verifying the robustness of the system. Besides, more data will be collected so that deep learning models can be used for accurate classification.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: http://adni.loni.usc.edu/.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements. Ethical review and approval was not required for the animal study because publicly available datasets were analyzed in this study.
Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
CC was responsible for writing the manuscript and doing the experiments. CY and EC instructed the experiments. JC was responsible for drawing pictures in the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by the Fujian Province Innovation Strategy Research Program (No. 2020R01020196) and the YongTai Institute of Artificial Intelligence.