Predicting MCI to AD Conversion Using Integrated sMRI and rs-fMRI: Machine Learning and Graph Theory Approach

Background Graph theory and machine learning have been shown to be effective ways of classifying different stages of Alzheimer's disease (AD). Most previous studies have focused only on inter-subject classification with single-mode neuroimaging data. However, whether this classification can truly reflect the changes in the structure and function of brain regions during disease progression remains unverified. In the current study, we aimed to evaluate a classification framework that combines structural magnetic resonance imaging (sMRI) and resting-state functional magnetic resonance imaging (rs-fMRI) metrics to distinguish mild cognitive impairment non-converters (MCInc)/AD from MCI converters (MCIc) by using graph theory and machine learning.

Methods With an inter-subject (MCInc vs. MCIc) and intra-subject (MCIc vs. AD) design, we employed cortical thickness features, structural brain network features, and sub-frequency (full-band, slow-4, slow-5) functional brain network features for classification. Three feature selection methods [random subset feature selection algorithm (RSFS), minimal redundancy maximal relevance (mRMR), and sparse linear regression feature selection algorithm based on stationary selection (SS-LR)] were each used to select discriminative features from the iterative combinations of MRI and network measures. A support vector machine (SVM) classifier with nested cross-validation was then employed for classification. We also compared the performance of multiple classifiers (Random Forest, K-nearest neighbor, AdaBoost, SVM) and verified the reliability of our results by upsampling.

Results In the classifications of MCIc vs. MCInc and MCIc vs. AD, the proposed RSFS algorithm achieved higher accuracies (84.71% and 89.80%, respectively) than the other algorithms. The high-sensitivity brain regions found for the two classification groups were inconsistent. Specifically, in MCIc vs. MCInc, the high-sensitivity brain regions associated with both structural and functional features included the frontal, temporal, caudate, entorhinal, and parahippocampal regions and the calcarine fissure and surrounding cortex, whereas in MCIc vs. AD, the high-sensitivity brain regions, associated only with functional features, included the frontal, temporal, thalamus, olfactory, and angular regions.

Conclusions These results suggest that our proposed method could effectively predict the conversion of MCI to AD, and the inconsistency of specific brain regions provides a novel insight for clinical AD diagnosis.


INTRODUCTION
Mild cognitive impairment (MCI) is considered a transitional state between normal aging and early Alzheimer's disease (AD) (Lee et al., 2012). Studies have shown that individuals with MCI tend to develop AD at a rate of about 10-15% per year (Allison et al., 2014), whereas the probability that a healthy elderly person will be diagnosed with AD is only 1-2% (Bischkopf et al., 2002). If MCI is diagnosed at an early stage, rehabilitation exercise and medication can reduce the incidence of AD by nearly one-third (Golob et al., 2007). Thus, early detection of MCI makes it possible to delay or prevent the transition from MCI to AD. Under the clinical conversion criteria, MCI patients can be divided into MCIc and MCInc, depending on whether they convert to AD within a certain period (for instance, 36 months, 48 months, etc.). Interestingly, the two types of patients have similar clinical manifestations in the early stage, and the morphological differences of their brain lesions are small. To allow earlier intervention in the diagnosis and treatment of AD, the diagnosis and prediction of MCI have been studied from multiple perspectives such as genetics, pathology, and medical imaging. Currently, there are different opinions on which biomarkers can accurately reflect the timeliness of preclinical disease progression, and no research has established the versatility of such markers using prediction/validation study designs. Furthermore, the diagnosis and classification of MCI progression remain difficult. Therefore, finding highly discriminative features and establishing a robust classification mechanism is of clinical significance for the diagnosis and timely treatment of MCI, especially for providing early warning signs for high-risk MCI patients. This may guide patients toward rational treatment decisions and may even prevent them from developing AD.
Neuroimaging studies of AD patients have found atrophy of structural tissues and abnormal structural and functional connections between brain regions (Liu et al., 2012; Dai et al., 2019). In particular, neuroanatomical abnormalities have been found to spread from one brain area to another along distinctive network patterns in neurodegenerative diseases (Yates, 2012; Pandya et al., 2017; Cauda et al., 2018). Eskildsen et al. (2013) classified MCI and AD using cortical thickness features from structural MRI and achieved accuracies ranging from 70 to 76% depending on the conversion time. Taking advantage of differences along the time dimension of the disease, Li et al. (2012) proposed a 4-D disease classification algorithm based on the thickness of the cerebral cortex, which achieved its highest accuracy (81.7%) in the classification of MCIc and MCInc. Since most studies have reported abnormal and inconsistent brain connections, many recent studies have built classification frameworks that combine brain networks and machine learning to classify MCI/AD. Raamana et al. (2015) constructed a brain network based on differences in cortical thickness, took the average clustering coefficient, boundary number, and node degree as features, and used a multi-kernel Bayes classifier to classify MCIc and MCInc with an accuracy of 64%. Our previous study (Wei et al., 2016) proposed a classification framework to distinguish MCIc from MCInc using MRI and network features and attained a best accuracy of 76.39%.
To improve classification performance, many studies have been dedicated to fusing different types of data, such as MRI, fMRI, positron emission tomography (PET), cerebrospinal fluid (CSF), and cognitive scoring scales. Liu et al. (2014) proposed a new multi-modal classification method combining PET and MRI with an accuracy of 67.83% for the classification of MCInc and MCIc. Wee et al. (2012b) used a multi-kernel SVM to integrate diffusion tensor imaging (DTI) and rs-fMRI network features to classify MCI patients and normal elderly people, and obtained a classification accuracy of 96.3%, which was 7.4% higher than that of single-mode data. In addition, appropriate feature selection (Zuo et al., 2010; Chu et al., 2012) and frequency division (Wee et al., 2012a; Mascali et al., 2015) have been shown to improve classification accuracy. One of our recent studies supports this view: it distinguished individuals with EMCI and LMCI using resting-state functional brain networks in three frequency bands and three feature selection algorithms, and obtained 83.87% accuracy using the mRMR algorithm in the slow-5 band. Although most previous studies have investigated the utility of structural MRI or rs-fMRI for classifying MCIc and MCInc, few studies have combined cortical and subcortical measurements extracted from DTI/MRI with graph measures extracted from rs-fMRI for this purpose (Mascali et al., 2015; Hojjati et al., 2018). Moreover, previous studies focused only on the classification of different groups of patients, and whether this kind of classification can truly reflect the changes in the structure and function of brain regions during disease progression remains unverified.
To address these issues, this study aims to: (i) incorporate multiple structural and functional metrics into a combined graph-theoretical and machine learning analysis, in order to evaluate the efficacy of a classification framework for distinguishing MCInc/AD from MCIc; and (ii) identify the brain regions that are highly sensitive to AD conversion, by comparing the regional differences between MCIc and MCInc with those between MCIc and AD. First, we extracted structural features, comprising MRI features computed with FreeSurfer and nodal parameters of the thickness network, and functional features derived from functional brain networks constructed from the resting-state time series of the brain regions in three frequency bands (full-band, slow-4, slow-5). Subsequently, we established a weighted network by using a kernel function, and then thresholded it to a binary network over a highly discriminative sparsity range of 8 to 44%. In the current study, the SS-LR and mRMR feature selection algorithms build upon our previous work (Wei et al., 2016). We employed a novel feature selection algorithm (RSFS) to find effective features, and then trained and tested an SVM classifier. We also tested the reliability and stability of the best classification results by applying multiple classifiers [Random Forest, K-nearest neighbor (KNN), AdaBoost, SVM] to upsampled data. Finally, we compared the top 10 features selected in the MCInc vs. MCIc classification with those selected in the MCIc vs. AD classification. We also investigated the contribution of each modality to the multi-modal classification to explore the conversion of MCI. We hypothesized that the proposed method would improve the accuracy and sensitivity of identifying prodromal AD, and that the high-sensitivity brain regions of the two classification groups might be inconsistent.
To the best of our knowledge, this is the first study that has used cortical thickness, structural brain network, and sub-frequency functional brain network for this classification (MCInc vs. MCIc, MCIc vs. AD). Besides, another innovation of this study is the employment of the intra-subject and inter-subject design to classify the two groups of patients.

Participants
Data used in this study were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI was to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer's disease (AD). The demographic data of the datasets are listed in Table 1. A total of 108 participants with full structural and resting-state functional data were collected, but 4 of them failed to pass the data quality control. In the ADNI project, the diagnostic criteria of MCI were as follows: (1)  The present study included 55 MCI non-converters (MCInc), 30 MCI converters (MCIc), and 19 AD patients. Following Wolz et al. (2011), we divided the MCI patients into MCIc, defined as patients whose diagnosis changed within 36 months, and MCInc, defined as the complementary group of patients whose MCI had not converted in the database up to the time of data screening. In addition, 19 of the 30 former MCIc patients developed AD within 36 months (the other 11 subjects were excluded because of missing data or data quality control). We first took the baseline scan of every MCI patient and then followed the subsequent scans until the first reported conversion to AD or up to a period of 36 months. As illustrated in Table 1, gender, age, education, and CDR did not differ significantly between the MCIc group and either the MCInc or the AD group.

Feature Extraction
As illustrated in Figure 1A, the nodal features (NL, ND, and BC) are defined in terms of the following quantities: $L_{ij}$ represents the minimum number of edges between nodes $i$ and $j$, and $b_{ij}$ is the connection between nodes $i$ and $j$; $S_{jm}$ represents the number of shortest paths between nodes $m$ and $j$, and $S_{jm}(i)$ represents the number of those shortest paths that pass through node $i$.
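The quantities above are the usual ingredients of three nodal graph measures. The following is a minimal sketch, assuming that NL, ND, and BC denote nodal shortest path length, nodal degree, and betweenness centrality, that NL is averaged over the reachable nodes, and that BC counts each unordered node pair once; the exact normalizations used in the study are not recoverable from the text, and the function name `nodal_metrics` is hypothetical.

```python
from collections import deque


def nodal_metrics(adj):
    """Return (degree, path_length, betweenness) for a binary undirected graph.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with a zero diagonal.
    """
    n = len(adj)
    neighbors = [[j for j in range(n) if adj[i][j]] for i in range(n)]
    degree = [len(nb) for nb in neighbors]  # ND_i = sum_j b_ij
    path_length = [0.0] * n
    betweenness = [0.0] * n

    for s in range(n):
        # Breadth-first search from s: distances L_sj and shortest-path counts.
        dist = [-1] * n
        sigma = [0] * n
        preds = [[] for _ in range(n)]
        dist[s], sigma[s] = 0, 1
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # NL_s: mean shortest path length from s to the nodes it can reach.
        reached = [d for d in dist if d > 0]
        path_length[s] = sum(reached) / len(reached) if reached else 0.0
        # Brandes back-propagation accumulates S_jm(i) / S_jm for source s.
        delta = [0.0] * n
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                betweenness[w] += delta[w]

    # Every unordered pair (j, m) was visited from both endpoints.
    betweenness = [b / 2.0 for b in betweenness]
    return degree, path_length, betweenness
```

For a four-node chain 0-1-2-3, the two inner nodes each lie on two shortest paths between other nodes, while the end nodes lie on none.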

MRI Features
As indicated in Figure 1A, the Desikan-Killiany atlas used here comprises 68 cortical regions. For each cortical region, cortical thickness (CT), cortical volume (CV), and cortical surface area (CS) were calculated as MRI features. CT at each vertex of the cortex was defined as the average shortest distance between the white and pial surfaces, and CV at each vertex was defined as the product of CS and CT at that vertex. CS was computed from the area of every triangle in a standardized spherical surface tessellation. This step yielded 204 MRI features (68 regions × 3 measures) for each participant.

Thickness Network Features
The thickness network matrix $w_{ij}$ ($i, j = 1, 2, \ldots, 68$) was defined from the difference in CT between each pair of regions through a kernel function, where $CT_k(i)$ represents the cortical thickness of the $i$-th ROI of the $k$-th participant and the kernel width $\alpha$ is 0.01. To eliminate the influence of false connections and noise, we thresholded the thickness network matrix of each participant into a binary matrix $B = [b_{ij}]$. The threshold represents the cost of network connection, defined as the ratio of supra-threshold connections to the total number of possible connections in the network (Sanz-Arigita et al., 2010). If the weight between two ROIs was greater than the given threshold, $b_{ij}$ was set to 1, and otherwise to 0. Notably, there is no gold standard for choosing a single sparsity threshold, and different sparsity levels lead to different results (He et al., 2009; Hojjati et al., 2018). Therefore, we analyzed costs ranging from 8 to 44% at 1% intervals. Finally, 136 nodal features were employed for subsequent analysis (Figure 1A).
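The construction of the binary thickness network can be sketched as follows. The exact kernel is not recoverable from the text, so the Gaussian form `exp(-(CT(i) - CT(j))^2 / (2 * alpha^2))` used here is an assumption, as are the function names; only the kernel width of 0.01 and the 8-44% cost range come from the description above.

```python
import math


def thickness_weights(ct, alpha=0.01):
    """Pairwise similarity of regional cortical thickness (assumed Gaussian kernel)."""
    n = len(ct)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = math.exp(-((ct[i] - ct[j]) ** 2) / (2.0 * alpha ** 2))
    return w


def binarize_at_cost(w, cost):
    """Keep the strongest edges so that about `cost` of all possible edges survive."""
    n = len(w)
    edges = sorted(
        ((w[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    n_keep = round(cost * len(edges))
    b = [[0] * n for _ in range(n)]
    for _, i, j in edges[:n_keep]:
        b[i][j] = b[j][i] = 1
    return b


# Costs from 8% to 44% in 1% steps, as described in the text.
costs = [c / 100 for c in range(8, 45)]
```

Thresholding by cost rather than by a fixed weight gives every participant a network with the same number of edges, which keeps nodal measures comparable across participants.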

Functional Network Features
The nodes of the functional brain network were defined by dividing the brain into 90 regions using the automatic anatomical labeling (AAL) template (Tzourio-Mazoyer et al., 2002). The brain network of each participant was a 90 × 90 connection matrix, each element of which was the Pearson correlation coefficient between the time series of two brain regions. Then, we applied Fisher's r-to-z transformation to the raw undirected connectivity matrix (Wee et al., 2012b). Because the connection of a brain region with itself is meaningless, the diagonal of the connection matrix was set to zero (Zhan et al., 2013). Consistent with the structural network, we set the cost threshold from 8 to 44% at 1% intervals. In this part, 810 nodal features (NL, ND, and BC) were obtained for subsequent feature selection (Figure 1B).
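A minimal sketch of the connectivity step is given below, assuming one time series per region stored as a plain list. The function names are hypothetical, and the clipping of |r| just below 1 is an added safeguard (not described in the text) that keeps the transform finite.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_connectivity(series):
    """Fisher z-transformed correlation matrix; series holds one list per region."""
    n = len(series)
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(series[i], series[j])
            r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
            z[i][j] = z[j][i] = math.atanh(r)      # Fisher r-to-z
    return z  # the diagonal is left at zero
```

The resulting matrix can then be thresholded over the same 8-44% cost range as the thickness network.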

Feature Selection
In the feature selection step, three feature selection algorithms were applied (Figure 1C).

Random Subset Feature Selection Algorithm (RSFS)
The RSFS is an algorithm that can find a set of features whose performance is better than the average feature performance of the available feature set (Pohjalainen et al., 2015). The RSFS process includes the main ideas of the random forest (Breiman, 2001) and random K-nearest neighbor (KNN). It repeatedly selects a random feature subset from the set of all possible features and then classifies the data with KNN using that subset.
In RSFS, $F$ represents the full feature set; each true feature $f_j \in F$ has an associated relevance value $r_j \in (-\infty, \infty)$. In addition, a set of dummy features $z_j \in Z$ with associated relevances $q_j$ is defined.
During each iteration $i$, the RSFS algorithm executes the following steps:
(1) Randomly select a subset $S_i$ of $n$ features ($|S_i| = n$) from the full set $F$ by sampling from a uniform distribution.
(2) Perform KNN classification on the given data set using $S_i$, and compute the value of the criterion function $c_i$, which measures classification performance.
(3) Update the relevance $r_j$ of every feature $f_j$ used in $S_i$ according to formula (5):
$$r_j' = r_j + c_i - E\{c\} \quad (5)$$
where $r_j$ is the current relevance value, $r_j'$ is the updated relevance value, $c_i$ is the current criterion function value, and $E\{c\}$ is the expectation of the criterion function value (the average of $c_i$ over all previous iterations). That is, relevance(feature indices) = relevance(feature indices) + performance criterion - expected criterion value.
(4) Repeat from step (1) with a new random subset.
In parallel with updating the feature relevances, the dummy features are processed similarly: a random subset of $m$ dummy features is selected in every iteration, and their relevance values are updated according to formula (5) using the criterion function value of the true features from the same iteration.
Finally, a statistical test was performed to find the feature set $S \subset F$ whose relevance values truly exceed those of the dummy features. The selection condition is as follows:
$$S = \{ f_j \in F \mid p(r_j > r_{rand}) \geq \delta \} \quad (6)$$
In formula (6), $r_{rand}$ is the baseline level and $\delta$ is the probability threshold. $r_{rand}$ is modeled as a normal distribution fitted to the dummy relevances $q_j$, and the probability that a feature is more relevant than a dummy feature is then obtained from the cumulative normal distribution.
Verification was performed in each repetition of RSFS. The procedure stopped either when no further features exceeding the random-feature classification performance were being selected, or after a fixed number of repetitions (Pohjalainen et al., 2015).
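The RSFS procedure described above can be sketched as follows. This is an illustrative simplification under several assumptions that are not taken from the study: leave-one-out accuracy of a 3-NN classifier serves as the criterion function, the expectation of the criterion is the running mean including the current iteration, the subset sizes are the square roots of the feature counts, and the loop runs for a fixed number of iterations. All function and parameter names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, pstdev


def knn_accuracy(X, y, feats, k=3):
    """Leave-one-out accuracy of a k-NN classifier restricted to `feats`."""
    correct = 0
    for i in range(len(X)):
        dists = sorted(
            (sum((X[i][f] - X[j][f]) ** 2 for f in feats), j)
            for j in range(len(X)) if j != i
        )
        votes = [y[j] for _, j in dists[:k]]
        pred = max(set(votes), key=votes.count)
        correct += pred == y[i]
    return correct / len(X)


def rsfs(X, y, n_iter=300, n_dummy=50, delta=0.99, seed=0):
    """Return indices of features whose relevance beats the dummy baseline."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    subset_size = max(1, round(math.sqrt(n_feat)))
    dummy_size = max(1, round(math.sqrt(n_dummy)))
    relevance = [0.0] * n_feat   # r_j of the true features
    dummy = [0.0] * n_dummy      # q_j of the dummy features
    scores = []
    for _ in range(n_iter):
        feats = rng.sample(range(n_feat), subset_size)      # step (1)
        c = knn_accuracy(X, y, feats)                        # step (2)
        scores.append(c)
        gain = c - mean(scores)                              # c_i - E{c}
        for f in feats:                                      # step (3)
            relevance[f] += gain
        for d in rng.sample(range(n_dummy), dummy_size):     # dummy update
            dummy[d] += gain
    # Normal model of the dummy relevances; keep features with
    # P(r_j > r_rand) >= delta.
    baseline = NormalDist(mean(dummy), pstdev(dummy) or 1e-12)
    return [f for f in range(n_feat) if baseline.cdf(relevance[f]) >= delta]
```

Because the dummy features receive the same gains as the true features but never influence the classifier, their accumulated relevances describe what a feature with no predictive value would score by chance.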

Minimal Redundancy Maximal Relevance Feature Selection Algorithm (mRMR)
We used mRMR proposed by Ding and Peng for feature selection (Peng et al., 2005). mRMR can use mutual information as a measure to solve the trade-off between feature redundancy and relevance (Morgado and Silveira, 2015).
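Max-Relevance is defined as:
$$\max D(S, c), \quad D = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c) \quad (8)$$
where $S$ represents a feature set with $m$ features $\{x_i\}$, $I(\cdot\,;\cdot)$ denotes mutual information, $D$ is the mutual information between the feature subset and the label, and $c$ is the class.
Min-Redundancy is defined as:
$$\min R(S), \quad R = \frac{1}{|S|^2} \sum_{x_i, x_j \in S} I(x_i; x_j) \quad (9)$$
where $R$ represents the mutual information among the features.
The combination of formula (8) and formula (9) is the criterion for selecting feature subsets with minimum redundancy and maximum relevance. Therefore, mRMR was defined as:
$$\max \Phi(D, R), \quad \Phi = D - R \quad (10)$$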
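A greedy mRMR selection can be sketched as below. The sketch assumes discrete (or already discretized) feature values and uses the difference form of the criterion, relevance minus mean redundancy with the already selected features; whether the study used the difference or the quotient form is not stated here. The function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (nab / n) * math.log(nab * n / (pa[x] * pb[y]))
        for (x, y), nab in pab.items()
    )


def mrmr(features, labels, k):
    """Greedy mRMR; `features` is a list of feature columns, returns k indices."""
    relevance = [mutual_information(f, labels) for f in features]
    selected = [max(range(len(features)), key=relevance.__getitem__)]
    while len(selected) < k:
        def score(j):
            redundancy = sum(
                mutual_information(features[j], features[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy
        candidates = [j for j in range(len(features)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected
```

In this scheme an exact copy of an already selected feature is penalized by its full entropy, so a less redundant feature of equal relevance is preferred.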

Sparse Linear Regression Feature Selection Algorithm Based on Stationary Selection (SS-LR)
The SLEP package (Liu et al., 2009) was used to solve the sparse linear regression. Given a data set $X \in \mathbb{R}^{n \times m}$ with true label vector $y \in \mathbb{R}^{n \times 1}$, where $n$ is the number of samples and $m$ is the number of features of each sample, the linear regression model can be defined as:
$$f(X) = Xw \quad (11)$$
where $w = (w_1, w_2, \ldots, w_m)^T \in \mathbb{R}^{m \times 1}$ is the coefficient vector of the linear regression and $f(X)$ is the predicted label vector obtained for the unknown samples. Let $L(w)$ be the loss function of the linear regression, defined as formula (12):
$$L(w) = \frac{1}{2} \| f(X) - y \|_2^2 \quad (12)$$
An $L_1$ regularization term is added to the loss function to control the complexity of the model:
$$\min_w \; L(w) + \lambda \|w\|_1 \quad (13)$$
where $\|w\|_1 = \sum_{i=1}^{m} |w_i|$ and $\lambda > 0$ is the regularization parameter. As $\lambda$ increases, the solution becomes sparser. Here $\lambda$ ranged over $0.05 < \lambda < 0.3$ with a step size of 0.005. Stability selection, based on sub-sampling or bootstrapping from the original sample, was used to address the problem of choosing a proper amount of regularization (Meinshausen and Bühlmann, 2010).
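The idea of stability selection with an L1-penalized regression can be illustrated with a small self-contained sketch. It does not use SLEP: a plain coordinate-descent solver stands in for it, the loss is scaled by 1/(2n) (so the numerical values of lambda are not directly comparable with those above), and half of the samples are drawn in every resample. These choices and all names are assumptions made for illustration.

```python
import random


def lasso_cd(X, y, lam, n_sweeps=50):
    """Coordinate descent for (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1."""
    n, m = len(X), len(X[0])
    w = [0.0] * m
    resid = list(y)  # residual y - Xw for w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(m)]
    for _ in range(n_sweeps):
        for j in range(m):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that ignores w_j.
            rho = sum(X[i][j] * (resid[i] + w[j] * X[i][j]) for i in range(n)) / n
            new = max(abs(rho) - lam, 0.0) / col_sq[j]  # soft threshold
            new = new if rho > 0 else -new
            if new != w[j]:
                for i in range(n):
                    resid[i] += (w[j] - new) * X[i][j]
                w[j] = new
    return w


def stability_selection(X, y, lambdas, n_resamples=50, frac=0.5, seed=0):
    """Fraction of (subsample, lambda) fits in which each feature is non-zero."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    counts = [0] * m
    for _ in range(n_resamples):
        idx = rng.sample(range(n), max(2, int(frac * n)))
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        for lam in lambdas:
            w = lasso_cd(Xs, ys, lam)
            for j in range(m):
                counts[j] += abs(w[j]) > 1e-10
    total = n_resamples * len(lambdas)
    return [c / total for c in counts]
```

Features can then be ranked by their selection frequency, which depends far less on any single value of the regularization parameter than one lasso fit does.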

SVM Classifier
The SVM classifier adopted here comes from the LIBSVM software package developed by Lin's team (Chang and Lin, 2011). The SVM used the radial basis function (RBF) kernel, and the penalty parameter $C$ and the kernel bandwidth $\sigma$ were searched over the range $[4^{-4}, 4^{4}]$. The RBF kernel was defined as follows:
$$K(X_1, X_2) = \exp\left(-\frac{\|X_1 - X_2\|^2}{2\sigma^2}\right) \quad (14)$$
where $X_1$ and $X_2$ are two feature vectors and $\sigma$ is the width parameter of the RBF kernel. Both internal and external cross-validation were used (Figure 1C): internal cross-validation was used to find the best classifier parameters, and external cross-validation was used to verify the performance of the classifier. This nested cross-validation was used to obtain unbiased estimates. After normalization and feature screening of the training data set, internal cross-validation (10-fold cross-validation with grid search) was performed on the training set (inner loop). In the outer loop, leave-one-out cross-validation (LOOCV) was repeated $N$ times ($N$ = 85 or 49), and in each repetition the held-out sample was used to evaluate the trained classifier. Classification performance was measured by accuracy (ACC), sensitivity (SEN), specificity (SPE), and balanced accuracy (BAC), defined as follows (Fawcett, 2006; Wee et al., 2012b):
$$ACC = \frac{TP + TN}{TP + TN + FP + FN}, \quad SEN = \frac{TP}{TP + FN}, \quad SPE = \frac{TN}{TN + FP}, \quad BAC = \frac{SEN + SPE}{2}$$
where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives, respectively. The area under the curve (AUC) was defined as the area enclosed by the ROC curve and the coordinate axis.
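The performance measures can be computed from the pooled leave-one-out predictions as in the following sketch. The function name and the dictionary layout are hypothetical; the formulas are the standard confusion-matrix definitions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity, and balanced accuracy from labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against a class that is absent from y_true.
    sen = tp / (tp + fn) if tp + fn else float("nan")
    spe = tn / (tn + fp) if tn + fp else float("nan")
    return {
        "ACC": (tp + tn) / len(pairs),
        "SEN": sen,
        "SPE": spe,
        "BAC": (sen + spe) / 2,
    }
```

With imbalanced groups, BAC is the more informative summary, because a classifier that always predicts the larger class reaches a high ACC but a BAC of only 0.5.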

Statistical Analysis
All statistical calculations were performed in MATLAB R2016b (MathWorks, Inc.). The exact Clopper-Pearson method was used to calculate the 95% confidence intervals (CIs) of sensitivity, specificity, and accuracy (Agresti and Coull, 1998). The CIs of the AUC were calculated by the DeLong method (DeLong et al., 1988; Mercaldo et al., 2007; Mei et al., 2020). McNemar's test (Bates and McNemar, 1964) was used to calculate the two-sided P-value for AUC between MCInc vs. MCIc and AD vs. MCIc.
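For readers without MATLAB, the exact Clopper-Pearson interval can be sketched in plain Python by inverting the binomial tail probabilities with bisection. This is an illustrative implementation, not the code used in the study, and the function names are hypothetical.

```python
from math import comb


def binom_tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def _solve(f, target, lo=0.0, hi=1.0, iters=100):
    """Bisection for an increasing function f on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    # Lower bound solves P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else _solve(
        lambda p: binom_tail_ge(k, n, p), alpha / 2)
    # Upper bound solves P(X <= k | p) = alpha / 2,
    # i.e. P(X >= k + 1 | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_tail_ge(k + 1, n, p), 1 - alpha / 2)
    return lower, upper
```

As a usage example, an accuracy of 84.71% over 85 leave-one-out predictions corresponds to 72 correct predictions, so its interval would be obtained with `clopper_pearson(72, 85)`.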

Classification Results
To reduce feature redundancy among the 1150 features available at each threshold, the features of the two classification groups (MCInc vs. MCIc, MCIc vs. AD) were selected by RSFS, SS-LR, and mRMR over the cost range of 8-44%. The AUC and ACC obtained by the RSFS algorithm were significantly higher than those of the other algorithms (Supplementary Figures 1A,B). The classification result of the MCInc vs. MCIc group was best and most stable at cost = 39%, and that of the MCIc vs. AD group was best and most stable at cost = 19%. Therefore, the subsequent results are analyzed and discussed at cost = 39% and cost = 19%, respectively. The receiver operating characteristic (ROC) curves and classification results are depicted in Figure 2 and

Comparing Classification Results Based on Different Feature Selection Methods
In Figure 3, the top K features (K = 1, 2, ..., 30) were used for classification to examine the effect of the number of selected features on classification performance. Beyond the top 8 features, the AUC curves were stable in both groups. In the MCIc vs. AD group, the AUC curves of the mRMR and SS-LR algorithms declined, and these algorithms could hardly classify the subjects correctly. We compared the classification performance of the three feature selection algorithms; the results are shown in Table 3 and Figure 3. As shown in Table 3, the performance of the RSFS algorithm differed significantly from that of the mRMR and SS-LR algorithms in both classification groups, whereas no significant difference was found between the mRMR and SS-LR algorithms.
As illustrated in Figure 3A, in the MCInc vs. MCIc group the AUC scores of the RSFS algorithm were significantly higher than those of the SS-LR algorithm (K = 1, 2, 9-14, 16-30) and the mRMR algorithm (K = 10-30). At K = 14, the AUC scores of the three algorithms showed significant differences. As shown in Figure 3B, in the MCIc vs. AD group the AUC scores of the RSFS algorithm were significantly higher than those of the SS-LR algorithm (K = 2-4, 8-30) and the mRMR algorithm (K = 2-30). At K = 5, 15-18, 21, and 24-30, the AUC scores obtained by the SS-LR algorithm were significantly higher than those obtained by mRMR, and the AUC scores of all three algorithms differed significantly at K = 15-18, 21, and 24-30. In summary, in the MCInc vs. MCIc group the RSFS algorithm gave the best classification results, followed by the SS-LR algorithm and then the mRMR algorithm. In the MCIc vs. AD group, the RSFS algorithm also gave the best results, while the results of the other algorithms were relatively poor. Hence, only the results obtained with the RSFS algorithm for the two classification groups are discussed below.

[Figure 3 legend: the markers indicate a significant difference in classification performance between the RSFS and SS-LR algorithms, between the RSFS and mRMR algorithms, and between the mRMR and SS-LR algorithms, respectively.]

Confirmatory Analyses -Further Resampling Results
With the higher AUC and ACC, the classification effect obtained by the RSFS algorithm outperformed the SS-LR algorithm and mRMR algorithm (Figure 2 and Table 2). In Table 2, it is observable that the imbalanced data caused a gap between sensitivities and specificities. Therefore, we compared the performance of multiple classifiers and verified the reliability of our results through upsampling. As shown in Table 4 and Supplementary Figure 2, the upsampled data were trained and tested by four classifiers (Random Forest (Breiman, 2001), KNN (Yang et al., 2007), AdaBoost (Hastie et al., 2009), SVM). The results showed that the classification accuracy obtained by SVM was the highest and equally matched the results before upsampling.
The reported results were based on only a limited number of iterations (determined by the number of subjects), which may be the main reason for the high classification performance. To address this issue, and considering the impact of a single sampling on classification performance, we upsampled and downsampled the data (Dubey et al., 2013; Hojjati et al., 2017). Specifically, we performed 500 iterations of the outer loop in the resampling part and applied the leave-one-out method in the inner loop (for upsampling, the number of samples was 60 in the MCIc vs. AD group and 110 in the MCInc vs. MCIc group), and reported the average performance over all (60 or 110) × 500 iterations as the classification result. As illustrated in Supplementary Figures 3, 4, the classification performance of the original, non-resampled (nosampling) data lay between that of upsampling and downsampling when the number of features was 1-30. We compared the classification performance of the resampled data based on the RSFS algorithm and the SVM classifier using the top 10 features; the results are shown in Table 5. In the MCInc vs. MCIc group, the nosampling performance was slightly higher than the downsampling performance (80.20% accuracy, 76.37% sensitivity, 84.03% specificity, 0.853 AUC), whereas the upsampling performance measures were all greater than 90%. In the MCIc vs. AD group, the nosampling performance was slightly higher than the downsampling performance (80.80% accuracy, 71.87% sensitivity, 89.73% specificity, 0.827 AUC), and the upsampling1 performance was higher than the nosampling performance, but the accuracy of upsampling2 was lower than that of nosampling. Based on these results, the following analyses used the nosampling data.
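The three resampling schemes can be sketched as follows, assuming that each class is held as a list of sample identifiers and that balancing means matching the two class sizes (which reproduces the totals of 60 and 110 quoted above). The reading of upsampling1 as purely random draws with replacement and of upsampling2 as keeping every original sample before filling up by random draws is an interpretation; the function names are hypothetical.

```python
import random


def downsample(majority, minority, rng):
    """Random subset of the majority class, same size as the minority class."""
    return rng.sample(majority, len(minority)), list(minority)


def upsample_random(majority, minority, rng):
    """'Upsampling1': draw the minority class with replacement up to majority size."""
    return list(majority), [rng.choice(minority) for _ in majority]


def upsample_keep_all(majority, minority, rng):
    """'Upsampling2': keep every original minority sample, then fill by random draws."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra
```

Repeating any of these draws many times with different random states and averaging the resulting performance reduces the dependence of the estimate on one particular resample.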

Highly Sensitive Characteristic
To investigate which features correspond to brain regions that are highly sensitive to MCI, we accumulated the number of times each feature was selected for classification and obtained the frequency of occurrence of all selected features. Tables 6, 7 and Figure 4 summarize the details of the top 10 features that can be used to distinguish MCInc from MCIc and MCIc from AD. As shown in Table 6, there were 30% structural features, 20% structural connectivity network features, and 50% functional connectivity network features. Consistent with previous studies, the brain regions selected by our method to identify MCInc subjects among MCI patients included the left banks of the superior temporal sulcus, left entorhinal cortex (Nickl-Jockschat et al., 2012; Suk et al., 2015; Rasero et al., 2017), right caudate nucleus (Khazaee et al., 2015; Suk et al., 2015), left calcarine fissure and surrounding cortex (Khazaee et al., 2015; Wang et al., 2015; Pusil et al., 2019), left frontal

[Table note: Downsampling and upsampling1 are defined as random resampling. Upsampling2 is defined as ensuring that each original sample is included, and then randomly resampling the remaining data.]

DISCUSSION
In the present study, we combined structural and functional MRI with graph theory and multiple machine learning methods to classify patients with MCIc and MCInc/AD. Our findings demonstrated that, by including cortical thickness features, structural brain network features, and sub-frequency (full-band, slow-4, slow-5) functional brain network features, the proposed method performed effectively in distinguishing MCIc subjects from MCInc/AD. In the classifications of MCIc vs. MCInc and MCIc vs. AD, the RSFS algorithm achieved higher accuracies (84.71% and 89.80%, respectively) than the other algorithms (Table 2 and Figure 3).
In Table 2, there is a gap between the specificities and sensitivities due to the imbalanced data. Nevertheless, our proposed method obtained the best BAC values of 80.61 and 87.81% with the RSFS algorithm. We also compared the performance of multiple classifiers and verified the reliability of our results through upsampling (Supplementary Figure 2). The results indicated that the SVM classifier obtained the best accuracy, consistent with the results before upsampling, and that the balance between sensitivity and specificity was also improved. In addition, the mRMR algorithm achieved a sensitivity of 5.26% in the MCIc vs. AD group, which was low compared with the other methods (Table 2). At their own best settings (Supplementary Figures 1A,B), the SS-LR algorithm and the mRMR algorithm achieved 84.71% ACC, 73.33% SEN, 90.91% SPE, 83.45% AUC at cost = 27%, K = 4 and 77.65% ACC, 53.33% SEN, 90.91% SPE, 74.45% AUC at cost = 8%, K = 20, respectively, in the MCInc vs. MCIc group. In the MCIc vs. AD group, the SS-LR algorithm and the mRMR algorithm achieved their best performance (71.43% ACC, 42.11% SEN, 90.00% SPE, 70.53% AUC at cost = 36%, K = 2 and 71.43% ACC, 52.63% SEN, 83.33% SPE, 70.35% AUC at cost = 33%, K = 12, respectively).
As illustrated in Tables 8, 9, the classification results obtained by combining sMRI and rs-fMRI in the present study are better than those of the unimodal (sMRI or rs-fMRI) approaches, including those of our previous studies (Wei et al., 2016). We also compared our classification performance with that of other studies. Most previous methods that constructed brain networks considered only structural or only functional features (Suk and Shen, 2014; Hu et al., 2015; Moradi et al., 2015; Raamana et al., 2015; Ardekani et al., 2016; Suk et al., 2016; Beheshti et al., 2017; Hojjati et al., 2017, 2018; Zheng et al., 2019; Gupta et al., 2020; Zhu et al., 2021) and obtained accuracies lower than that of the present study. Only the study of Hojjati et al. (2017), which applied graph theory and machine learning (mRMR, FS) to rs-fMRI, obtained a classification accuracy of 91.4%. However, its sample size was small (fewer than 20), so the result may not be widely representative. Among the studies in Table 8, Zhang and Shen (2012) used a multi-modal multi-task learning algorithm to fuse MRI, FDG-PET, and CSF data and regressed the MMSE and ADAS-Cog scores to classify MCInc and MCIc with a classification accuracy of 73.9%. Similarly, Cui et al. (2011) combined MRI, CSF, and cognitive scoring scale features to classify MCInc and MCIc with a classification accuracy of 67.13%. Ye et al. (2012) used sMRI, ApoE, and cognitive scores to classify MCIc and MCInc with a smooth selection method based on sparse logistic regression, and obtained an AUC of 0.859. Therefore, these results suggest that the method we have proposed could effectively help predict the conversion to Alzheimer's disease. Unlike previous studies, our research not only focused on the conversion sensitivity of brain regions between the two groups of patients (MCIc vs.
MCInc), but also studied the conversion sensitivity of the brain regions of the same group of patients (MCIc vs. AD). Tables 6, 7 and Figure 4 list the highly sensitive brain regions selected from the two groups. These results proved the inconsistency of the selected brain regions in the two classification groups. As shown in Table 6, there were 30% structural features, 20% structural connectivity network features, 50% functional connectivity network features. The proportion of functional connectivity network features in each frequency band is listed as follows: 1(full-band):1(slow-5):3(slow-4). In Table 7, all features came from the functional network and the proportion of the three frequency bands was 3(full-band):3(slow-5):4(slow-4). Moreover, it is worth noting that 70% of features came from betweenness centrality. Our results suggest that the betweenness centrality in a functional network carries more disease information and the top 10 selected features are more sensitive to more efficient classification for MCIc and AD. According to Tables 6, 7, it can be seen that the network parameter characteristics of all frequency bands from rs-fMRI have been selected. However, the cortical surface area (CS) was not selected for the top 10 features in two classification groups by three algorithms. More importantly, in Wei's work (Wei et al., 2016), the selected top 10 combined structure features did not include CS. Based on the above results, we consider that CS is not an effective marker for AD disease. In future work, we will assess whether it can be excluded from the feature set. Different from our previous work on EMCI and LMCI classification  the characteristics of the slow-5 band did not show high sensitivity in MCInc and MCIc classification. The reason may be that the former is mainly based on the degree of memory impairment of MCI disease, and the latter is based on the longitudinal time diagnosis status to classify whether MCI develops into AD. 
Therefore, we suggest that the difference in their brain activity may be reflected in different frequency bands.
Our findings converge with those of previous studies (see Results Section), and the selected brain regions have been shown to be related to MCI conversion. The important roles of several brain regions in MCI have been widely recognized. Braak and Braak (1991) studied AD patients and first described a large number of neurofibrillary tangles in the medial temporal lobe; the brain areas involved mainly included the olfactory cortex, hippocampus, parahippocampal gyrus, amygdala, and cingulate cortex. This is consistent with the conclusion that brain atrophy in AD or MCI patients is mainly located in the medial temporal lobe (Fan et al., 2008; Das et al., 2015). In line with previous studies (Khazaee et al., 2015; Wang et al., 2015; Pusil et al., 2019), we also found that the left calcarine fissure and surrounding cortex are associated with MCI conversion to AD. Damage to this brain area may cause central visual disorders (such as macular sparing and hallucinations). Studies have reported that visual impairment can affect patients' cognition, thereby increasing the risk of dementia (Uhlmann et al., 1991; Naël et al., 2019).
The top 10 highly sensitive features provided by the other two algorithms are listed in the Supplementary Material (Supplementary Tables 3.1-3.4). Although their sensitivity was lower than that of the RSFS algorithm, these features also correspond to brain areas related to AD. This shows that a classification framework combining graph theory and machine learning with structural and functional MRI provides a new perspective for improving the clinical prediction and diagnosis of MCI. Moreover, our findings suggest that the inconsistency of the selected brain regions between the two classification groups deserves more attention. The conversion of MCI may imply that the structure of brain areas changes in the early stage of AD, and that the function of these areas begins to degenerate later. The inconsistency of the brain regions obtained for the two classification groups indicates that the conversion-sensitive brain regions may differ between two groups of patients (MCInc vs. MCIc) and within the same group of patients (MCIc vs. AD), which further suggests that classification between different groups of patients provides limited information. Follow-up within a group may therefore be more meaningful for the study of the disease.
In the current study, the best performances were achieved at costs of 39 and 19% for the MCInc vs. MCIc group and the MCIc vs. AD group, respectively. The cost was defined as the ratio of the number of above-threshold edges to the total number of edges in a network. Cost can range from 0 to 1, but the upper limit is generally less than 50% (Tan et al., 2019). Because a larger cost retains more edges, a cost of 39% corresponds to a lower threshold than a cost of 19%. Thus, compared with the MCIc vs. AD group, the MCInc vs. MCIc group was best distinguished at a larger cost, when the network contained more edges. According to Jie et al. (2014), as the threshold increases, weak and unimportant connections are removed, and significant differences are found between different groups of patients. Therefore, we suggest that the two classification groups achieved their best classification performance at different costs because of the different topological properties of their brain networks. Specifically, the larger the cost, the higher the global and local efficiency, the higher the clustering coefficient, the lower the characteristic path length, and the lower the small-world attributes. When the differences between brain network parameters are significant, the topological characteristics of brain regions can be better distinguished. In the future, we will investigate the specific differences in the brain network characteristics of different groups of patients, and combine them with clinical scales for predictive analysis.
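The cost definition above can be illustrated with a short sketch. It assumes a symmetric, non-negative connectivity matrix and keeps the strongest edges until the target cost is reached, taking the total number of edges to be the number of possible undirected edges, N(N - 1)/2. The function names `threshold_by_cost` and `network_cost`, the 10-region random matrix, and the rounding rule are illustrative assumptions, not the pipeline used in this study.

```python
import numpy as np


def threshold_by_cost(weights, cost):
    """Binarize a symmetric connectivity matrix at a target cost.

    Keeps the round(cost * n_possible) strongest edges of the upper
    triangle (illustrative assumption: strongest weights are retained).
    """
    w = np.asarray(weights, dtype=float)
    n = w.shape[0]
    rows, cols = np.triu_indices(n, k=1)       # each undirected edge once
    edge_weights = w[rows, cols]
    n_keep = int(round(cost * edge_weights.size))
    adjacency = np.zeros((n, n), dtype=int)
    if n_keep > 0:
        strongest = np.argsort(edge_weights)[::-1][:n_keep]
        adjacency[rows[strongest], cols[strongest]] = 1
        adjacency[cols[strongest], rows[strongest]] = 1  # keep it symmetric
    return adjacency


def network_cost(adjacency):
    """Ratio of retained edges to all possible undirected edges."""
    n = adjacency.shape[0]
    n_edges = np.triu(adjacency, k=1).sum()
    return n_edges / (n * (n - 1) / 2)


# Hypothetical example: random symmetric matrix over 10 regions.
rng = np.random.default_rng(0)
raw = rng.random((10, 10))
connectivity = (raw + raw.T) / 2

for target in (0.19, 0.39):
    binary = threshold_by_cost(connectivity, target)
    edges = int(np.triu(binary, k=1).sum())
    print(f"target cost {target:.2f}: {edges} edges, "
          f"achieved cost {network_cost(binary):.3f}")
```

A larger target cost retains more edges, which is equivalent to applying a lower threshold to the connection strengths.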
However, this study has several limitations. One major limitation is the small sample size. Another is the imbalanced data. Despite the promising results of using the RSFS algorithm and the SVM to screen patients with MCIc, further data collection is required to test the generalizability of the method to other patient populations. In future studies, a larger sample should be collected and the number of subjects in each group balanced, as the scale of the ADNI database is expanding (Aisen et al., 2010). Furthermore, future studies should explore different classification methods for different stages of AD, including the interpretability of structural and functional brain abnormalities (Ibrahim et al., 2021). Validation on multiple data sets will be necessary to establish the robustness of the models. For the study of the topological properties of the brain, the Power-264 brain regions might be considered as a template for constructing brain networks. In addition, other well-known prognostic information (DTI, ApoE status, Tau/Amyloid/FDG-PET) will be considered for classification (Gupta et al., 2020; Fan et al., 2021). In terms of subject design, we believe that within-subject follow-up data can better reveal the brain areas in which the sensitive features of conversion biomarkers are located, although the sample of such data is currently too small. If follow-up data could be collected from subjects undergoing cognitive training (Hernes et al., 2021), with a baseline control set at the same time, more meaningful and reliable results might be obtained.

CONCLUSION
The present study investigated, for the first time, the predictive power of cortical thickness features and brain connectivity network features derived from sMRI and rs-fMRI for distinguishing individuals with MCIc from those with MCInc/AD. For the selection of subjects, we proposed a mixed-subject method with an inter-subject (horizontal) and an intra-subject (longitudinal, follow-up) design, which is rarely used in AD classification. In this classification framework, multiple modalities were integrated by using graph theory and machine learning algorithms. We found that this framework improves the classification performance for identifying precursor AD (MCIc), and that the high-sensitivity features derived from the two classification groups are inconsistent. These findings indicate that the conversion-sensitive brain regions may differ between two groups of patients (MCInc vs. MCIc) and within the same group of patients (MCIc vs. AD), which further indicates that the former approach of classifying two different groups of patients may provide limited information. Ultimately, such a classification framework integrating information from sMRI and rs-fMRI can effectively predict the conversion of MCI, and the different brain regions obtained in this framework from the inter-subject and intra-subject designs are probable diagnostic markers for AD.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data can be found here: adni.loni.usc.edu.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Alzheimer's Disease Neuroimaging Initiative (ADNI), from which ethics approval and informed consent were obtained. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
TZ: writing - original draft, conceptualization, investigation, methodology, and writing - review and editing. QL: resources. DZ: investigation. CZ: software. JY and RN: writing - review and editing. JZ and ZJ: funding acquisition and writing - review and editing. LL: conceptualization, supervision, funding acquisition, project administration, and writing - review and editing. All authors reviewed the manuscript.