Estimating Brain Functional Networks Based on Adaptively-Weighted fMRI Signals for MCI Identification

Chen, Huihui; Zhang, Yining; Zhang, Limei; Qiao, Lishan; Shen, Dinggang

doi:10.3389/fnagi.2020.595322

ORIGINAL RESEARCH article

Front. Aging Neurosci., 14 January 2021

Sec. Neurocognitive Aging and Behavior

Volume 12 - 2020 | https://doi.org/10.3389/fnagi.2020.595322

Estimating Brain Functional Networks Based on Adaptively-Weighted fMRI Signals for MCI Identification

Huihui Chen¹^†

Yining Zhang¹^†

Limei Zhang¹^*

Lishan Qiao¹

Dinggang Shen^2,3

¹School of Mathematics Science, Liaocheng University, Liaocheng, China
²Department of Research and Development, Shanghai United Imaging Intelligence Co. Ltd., Shanghai, China
³Department of Brain and Cognitive Engineering, Korea University, Seoul, South Korea

Brain functional network (BFN) analysis is becoming a crucial way to explore the inherent organized pattern of the brain and reveal potential biomarkers for diagnosing neurological or psychological disorders. In so doing, a well-estimated BFN is of great concern. In practice, however, noises or artifacts involved in the observed data (i.e., fMRI time series in this paper) generally lead to a poor estimation of BFN, and thus a complex preprocessing pipeline is often used to improve the quality of the data prior to BFN estimation. One of the popular preprocessing steps is data-scrubbing that aims at removing “bad” volumes from the fMRI time series according to the amplitude of the head motion. Despite its helpfulness in general, this traditional scrubbing scheme cannot guarantee that the removed volumes are necessarily unhelpful, since such a step is fully independent to the subsequent BFN estimation task. Moreover, the removal of volumes would reduce the statistical power, and different numbers of volumes are generally scrubbed for different subjects, resulting in an inconsistency or bias in the estimated BFNs. To address these issues, we develop a new learning framework that conducts BFN estimation and data-scrubbing simultaneously by an alternating optimization algorithm. The newly developed algorithm adaptively weights volumes (instead of removing them directly) for the task of BFN estimation. As a result, the proposed method can not only reduce the difficulty of threshold selection involved in the traditional scrubbing scheme, but also provide a more flexible framework that scrubs the data in the subsequent FBN estimation model. Finally, we validate the proposed method by identifying subjects with mild cognitive impairment (MCI) from normal controls based on the estimated BFNs, achieving an 80.22% classification accuracy, which significantly improves the baseline methods.

Introduction

Functional magnetic resonance imaging (fMRI), by detecting the change of cerebral blood oxygen saturation degree, achieves the purpose of non-invasive “observation” of the brain activities (Brunetti et al., 2006; Whittingstall et al., 2008). Based on the observed brain activities in fMRI, one can explore the brain in vivo from multiple aspects. For example, in 1995, Biswal et al. first showed activation maps of the human brain by fMRI data, under a state in which the subject did not carry out any specific tasks (Biswal et al., 1995), which opened a new area for studying spontaneous fluctuations of the brain at resting state.

While resting-state fMRI (rs-fMRI) is potentially useful in clinical practice, finding biomarkers that can identify patients from normal controls (NCs) has been a primary driver of resting state research over the last decades (Li et al., 2019b). Unfortunately, it is hard to reveal informative patterns by a direct comparison of the fMRI time series between different subjects, since the fMRI signals are arbitrarily scaled and have no unit (Jenkinson and Chappell, 2018). In contrast, brain functional network (BFN), as a measure of the relative relationship between the fMRI time series, can provide a more reliable way of exploring the inherent organization of the brain (Liu et al., 2015; Yu et al., 2019), and has been used in identifying neurological or psychiatric disorders (Stam, 2014), including autism spectrum disorder (Weikai et al., 2017), major depressive disorder (Greicius et al., 2007), obsessive compulsive disorder (Admon et al., 2012), Alzheimer's disease (AD) (Jin et al., 2010; Shi et al., 2017), and its early stage, namely mild cognitive impairment (MCI) (Yu et al., 2017; Li et al., 2019b), to name a few.

Due to its importance, researchers have proposed a series of BFN estimation methods in recent years. However, in general, estimating a “good” BFN is still an extremely challenging problem, because complex artifacts or noises are always involved in the observed fMRI data. In practice, a preprocessing pipeline, including motion correction, nuisance regression, spatial smoothing and temporal filtering (Poldrack et al., 2011), etc., is employed to improve the quality of data. For example, as a simple and popular preprocessing step, the scrubbing operation has been investigated to clean potentially “bad” volumes if the head motion measured by frame-wise displacement (FD) (Power et al., 2012a) or DVARS (Yan et al., 2013) is greater than a threshold. Despite its helpfulness in general, it is hard to guarantee that all the removed volumes are necessarily unhelpful, since the scrubbing operation is independent of the ensuing BFN estimation task.

Recently, Li et al. proposed a task-dependent scrubbing method by incorporating the scrubbing operation into the BFN estimation task (Li et al., 2019a). Although it can optimize the data scrubbing and BFN estimation jointly, Li's method removes the “bad” volumes by a hard regularized parameter that is similar to the thresholding scheme used in the traditional scrubbing strategy. As a result, different numbers of volumes are generally removed for different subjects, which leads to a bias or inconsistency in the estimated BFNs. Moreover, the hard removal of volumes always reduces the statistical power, which is a serious problem, especially when the number of volumes is limited in the fMRI time series.

To address these issues, in this paper, we develop a new learning approach for BFN estimation based on weighted fMRI time series, where the weight on each volume is adaptively estimated from the data. Consequently, the proposed method not only reduces the difficulty of threshold selection involved in the traditional scrubbing scheme, but also provides a framework for a more flexible data “scrubbing” operation that jointly works with the BFN estimation task. Finally, we validate the proposed method by using it to identify subjects with MCI from NCs based on a publicly available dataset. The experimental results illustrate that the proposed approach can achieve better classification accuracy than the baseline methods. For reproducing our results, we also share the source codes in https://github.com/Huihui-Chen/Adaptively-weighted.

The rest of this paper is organized as follows. In Section related methods, we review two baseline methods, PC and SR, for BFN estimation. In Section BFN estimation, we propose the new BFN learning framework, including its motivation, model, and algorithm. In Section experiments, we evaluate our proposed method with experiments. Finally, we conclude the paper with a brief discussion in Section conclusion.

Related Methods

Due to the potential applications in exploring the inherent organization and neurodegenerative diseases of the brain, many BFN estimation methods have been proposed in the past decades. In this section, we review two representatives of them, PC and SR, which are closely related to our study.

Pearson's Correlation

Suppose that the brain, according to a certain atlas, has been parcellated into N regions of interest (ROIs), and there are T measurements over time, each of the ROIs correspond to an extracted time series $x_{i} \in R^{T \times 1}$ , i = 1, ⋯ , N. Then, the edge weight matrix C=(_{C_ij)N × N} of PC-based BFN can be defined as follows:

\begin{array}{l} C_{i j} = \frac{{(x_{i} - {\bar{x}}_{i})}^{T} (x_{j} - {\bar{x}}_{j})}{\sqrt{{(x_{i} - {\bar{x}}_{i})}^{T} (x_{i} - {\bar{x}}_{i})} \sqrt{{(x_{j} - {\bar{x}}_{j})}^{T} (x_{j} - {\bar{x}}_{j})}} & (1) \end{array}

Where C_ij is the estimated functional connectivity between the ith and jth ROIs, and ${\bar{x}}_{i}$ is the mean vector corresponding to x_i. Without loss of generality, we define a new $x_{i} ≜ (x_{i} - {\bar{x}}_{i}) / \sqrt{{(x_{i} - {\bar{x}}_{i})}^{T} (x_{i} - {\bar{x}}_{i})}$ . As a result, Equation (1) can be simplified into $C_{i j} = {x_{i}}^{T} x_{j}$ . Further, we suppose $X = [x_{1}, x_{2}, \dots, x_{N}] \in R^{T \times N}$ , and thus we have C = X^TX. Equivalently, PC-based BFN can be expressed as the solution of the following optimization problem:

\begin{array}{l} m i n_{C} ∥ C - X^{T} X ∥_{F}^{2} & (2) \end{array}

Where denotes the F-norm of a matrix. Such a reformulation of PC makes it easy to be explained under a unified graph learning framework as described in a recent study (Qiao et al., 2018).

Sparse Representation

Different from PC that measures the full correlation between ROIs, SR is one of the statistical methods for modeling the partial correlation by regressing out the confounding effect from other ROIs (Peng et al., 2009). The model of SR is shown as follows:

\begin{array}{l} m i n_{C_{i j}} \sum_{i = 1}^{N} (∥ x_{i} - \sum_{j \neq i} C_{i j} x_{j} ∥^{2} + λ \sum_{j \neq i} | C_{i j} |), & (3) \end{array}

It can equivalently be expressed as the following matrix form:

\begin{array}{l} m i n_{C} ∥ X - X C ∥_{F}^{2} + λ {∥ C ∥}_{1} \\ s . t . C_{i i} = 0, \forall i = 1, \dots, N, & (4) \end{array}

Where∥·∥_F and ∥·∥₁ are F- and L₁-norm of a matrix, respectively, and C_ii = 0 is employed for avoiding the trivial solution. Traditionally, $X - {X C}_{F}^{2}$ in Equation (4) is called a data-fitting term, and ∥C∥₁ a matrix-regularized term. The balance between the two terms are controlled by a regularized parameter λ.

Note that the network matrix C obtained by solving Equation (4) is generally asymmetric. For facilitating comparison among different BFN estimation methods, we symmetrize it by $C_{i j} = s i g n (C_{i j}) \sqrt{C_{i j} C_{j i}}$ , which is established on the theoretical basis for supporting the relationship between SR and partial correlation (Peng et al., 2009).

BFN Estimation

Motivation

As mentioned earlier, the estimated BFNs rely heavily on the quality of the data, but, unfortunately, the observed data in practice usually contain strong noises caused by many factors, especially, the head motion that results in both first- and higher-order effects (Bijsterbosch et al., 2017). The first-order effect mainly refers to the spatial misalignment from one volume to the next, which can be corrected by spatially registering each volume to a selected reference (Freire and Mangin, 2001; Dijk et al., 2012). In contrast, the higher-order effects, due to introducing noises into blood oxygen level dependent (BOLD) signals, are much harder to be removed from data and impact subsequent analyses even after the spatial registration (Power et al., 2012b; Murphy et al., 2013). Currently, three popular methods, including nuisance regression, independent component analysis (ICA), and scrubbing (Biswal et al., 1995; Rodriguez et al., 2012), are often employed to relieve such higher-order effects. In this paper, we mainly focus on the scrubbing operation, since it is closely related to our proposed method.

The scrubbing operation cleans some potentially “bad” volumes if the head motion, in the fMRI series, is greater than a threshold. Despite its simplicity, popularity and empirical effectiveness for removing “high-order” noises, such a scheme has the following drawbacks: (1) the amount of volumes removed by scrubbing is often relatively high [20–60% of all volumes (Bijsterbosch et al., 2017)], which would significantly reduce the statistical power of the data; (2) with a hard threshold, different numbers of volumes are generally removed across subjects, which leads to a bias or inconsistency in the estimated BFNs; (3) the scrubbing operation is independent of the ensuing BFN estimation task, and thus it is hard to know whether the remaining volumes absolutely benefit from BFN estimation, while the removed are necessarily unhelpful.

To address these issues, and differently from our previous work (Chen, 2019), we develop a new learning approach for BFN estimation based on weighted fMRI time series (instead of direct removal of volumes according to a hard threshold used in the traditional scrubbing scheme), where the weight put on each volume is adaptively estimated from the data. More specifically, as shown in Figure 1, an initial weight is first set for each time point (or volume) in the fMRI signals. Then, the following two steps are conducted till convergence: (1) a BFN is estimated based on the weighted fMRI signals, and (2) the weight on each volume is updated according to the estimated BFN. It is worth emphasizing that the two steps are not artificially designed by hand, but automatically derived from the proposed model that will be described in the next section.

FIGURE 1

Figure 1. An example for intuitively illustrating how our proposed method works.

Proposed Model and Algorithm

In this paper, we develop a new learning framework that conducts BFN estimation and fMRI data “scrubbing” simultaneously in a single model as follows:

\begin{array}{l} m i n_{C_{i j}, W_{t}} \sum_{i, j = 1}^{N} \sum_{t = 1}^{T} T^{2} ∥ w_{t} x_{i}^{(t)} - \sum_{i \neq j} C_{i j} (w_{t} x_{j}^{(t)}) ∥^{2} \\ + λ \sum_{j \neq i} | C_{i j} | \\ s . t . 0 \leq w_{t} \leq 1, \sum_{t = 1}^{T} w_{t} = 1 & (5) \end{array}

where the w_t is the weight for the tth time point $x_{i}^{(t)}$ associated with the ith ROI, C_ij is the estimated functional connectivity between the ith and jth ROIs, and λ is a regularized parameter for balancing the two terms in the objective function. Compared with the traditional SR model given in Equation (3), the proposed model introduces a weight w_t into each time point in the fMRI signals, and the weight is optimized as a variable. Meanwhile, the proposed method implements new constraints of $0 \leq w_{t} \leq 1, \sum_{t = 1}^{T} w_{t} = 1$ , which not only avoids a trivial solution (i.e., w_t = 0, ∀t), but also gives a probability interpretation for the weight. Especially, Equation (5) would reduce to the traditional SR model if w_t = 1/T for allt = 1, 2, ⋯ , T, meaning that the proposed model is more flexible than SR.

Equivalently, Equation (5) can be expressed by the following matrix form:

\begin{array}{l} m i n_{C, W} {∥ W X - W X C ∥}_{F}^{2} + λ C_{1} \\ s . t . C_{i i} = 0, 0 \leq w_{t} \leq 1, \sum_{t = 1}^{T} w_{t} = 1, & (6) \end{array}

Where W = diag(w₁, w₂, ···, w_T) is a diagonal weight matrix, C is the adjacency matrix of BFN to be estimated, and X ∈ R^{T × N} is the fMRI data matrix (Without loss of generality, the data matrix X here has been re-defined as X = TX for simplifying the expression).

Comparison With Li et al.'s Method (Li et al., 2019a)

Our method actually follows Li et al.'s research line. Their model is listed as the following

\begin{array}{l} m i n_{C, V} {∥ V X - V X C ∥}^{2} + λ ∥ C_{1} ∥ - γ {∥ V ∥}_{1} \\ s . t . C_{i i} = 0, 0 \leq v_{t} \leq 1, \forall i, t . \end{array}

It is easy to see that these two approaches have something in common. Both of them construct the BFNs using a sparse representation technique and are task-dependent scrubbing methods by incorporating the scrubbing operation into the BFNs estimation task. It is also easy to see that there are some differences between them. In Li's model, they put a regularization term (W's l1 norm here) which can automatically find and remove the “bad” volumes. However, we put another new constraint as showed in (6) on the weights which leads to a more flexible and sufficient utilization of the volumes. The rationale behind this operation lies in that it is hard to ensure that the so called “bad” volumes are really unhelpful. Thus, Li's method may be viewed as a “hard” operation while ours a “soft” operation. Here we present an example as in Figure 2. These are the weights from the same sample under the same parameter, the green points are Li's and the red are ours. It seems interesting that they have certain coherence: the removed volumes also have smaller weights and the remaining have bigger weights. Further, In the last classification experiment, our performance is superior to Li's, which may verify our intuition that the “bad” volumes may indeed benefit our BFNs construction task.

FIGURE 2

Figure 2. A comparison of two methods.

Optimized Algorithm

In consideration of the fact that two variables, C and W, are involved in Equation (6), we employ the alternative optimization (AO) (Bazaraa et al., 2013) to solve this problem as given in the two following steps.

Step 1: with a fixed W, estimate BFN C.

Note that the objective function in Equation (6) is non-differentiable due to the L₁-norm regularizer. Therefore, we solve it via the proximal method (Combettes and Pesquet, 2011). Specifically, for the data-fitting term $f (X, C) = {∥ W X - W X C ∥}_{F}^{2}$ , we first calculate its gradient w.r.t. C and have $\nabla_{C} f (X, C) = 2 (X^{T} W^{T} W X C - X^{T} W^{T} W X)$ . As a result, we have the following updated formula for C according to the gradient descent criterion:

\begin{array}{l} C_{k} = C_{k - 1} - α_{k} \nabla_{C} f (X, C_{k - 1}), & (7) \end{array}

Where α_k denotes the step size of the gradient descent. Then, the current C is mapped into a “feasible region” by the following proximal operator (Combettes and Pesquet, 2011):

\begin{array}{l} pro x_{λ {∥ \cdot ∥}_{1}} (C) = {[s g n (C_{i j}) \times max (a b s (C_{i j}) - λ, 0)]}_{N \times N} & (8) \end{array}

Where sgn(C_ij) and abs(C_ij) return the sign and absolute value of C_ij, respectively.

Step 2: with a fixed C, update W.

When C is a constant matrix, Equation (6) reduces to the following optimization problem:

\begin{array}{l} m i n_{W} {∥ W X - W X C ∥}_{F}^{2} \\ s . t . 0 \leq w_{t} \leq 1, \sum_{t = 1}^{T} w_{t} = 1 . & (9) \end{array}

We can rewrite it as follows:

\begin{array}{l} m i n_{W_{t}} \sum_{t = 1}^{T} ∥ X^{(t)} - X^{(t)} C ∥^{2} w_{t}^{2} \\ s . t . 0 \leq w_{t} \leq 1, \sum_{t = 1}^{T} w_{t} = 1 & (10) \end{array}

By defining the Lagrange multipliers of Equation (10), we have

\begin{array}{l} L (W, α) = \sum_{t = 1}^{T} ∥ X^{(t)} - X^{(t)} C ∥^{2} w_{t}^{2} - α (\sum_{t = 1}^{T} w_{t} - 1) & (11) \end{array}

Let the derivative of Equation (11) w.r.t. W be zero, that is,

\begin{array}{l} 2 ∥ X^{(t)} - X^{(t)} {C ∥}^{2} w_{t} - α = 0, & (12) \end{array}

And then, we get

\begin{array}{l} w_{t} = \frac{1}{2} α ∥ X^{(t)} - X^{(t)} {C ∥}^{- 2} . & (13) \end{array}

Due to the constraint $\sum_{t = 1}^{T} w_{t} = 1$ , we have the following equation:

\begin{array}{l} \sum_{t = 1}^{T} α {∥ X^{(t)} - X^{(t)} C ∥}^{- 2} = 2, & (14) \end{array}

Which results in the following solution of α,

\begin{array}{l} α = 2 / \sum_{t = 1}^{T} ∥ X^{(t)} - X^{(t)} {C ∥}^{- 2} & (15) \end{array}

Finally, we substitute Equation (15) into Equation (13), and get:

\begin{array}{l} w_{t} = ∥ X^{(t)} - X^{(t)} C ∥^{- 2} / \sum_{t = 1}^{T} ∥ X^{(t)} - X^{(t)} {C ∥}^{- 2} . & (16) \end{array}

This gives a closed-form solution for w_t with a clear physical explanation that the weight of the tth time point depends on the data fitting error at current moment. In other words, the weights tend to be related to the quality of data.

Now, we simply summarize the above algorithm in the following Table 1.

TABLE 1

Table 1. Algorithm.

Experiments

We will evaluate the performance of our method by applying it to the early identification of AD. As a common form of dementia, AD occurs most frequently in the aged population. It not only influences the daily life of patients, but also causes heavy economic burden for the patient's family and society (Peng et al., 2019). As an intermediate between AD and normal aging, MCI is believed to be the earliest clinically detectable stage of progression toward AD (Li et al., 2020). Every year, subjects with MCI may evolve to AD with a rate of 10–15% (Petersen and Roberts, 2009), while healthy controls develop into dementia with a rate of 1–2% annually (Bischkopf et al., 2010). Therefore, the early detection of MCI is vital for delaying the transition from MCI to AD (Chaves et al., 2012; Zhang et al., 2019). In the next experiments, we estimate BFNs based on fMRI data and apply the estimated BFNs as features to classify the subjects with MCI from NCs.

Data Acquisition and Preprocessing

The data used in this study were shared by a recent study (Yan et al., 2013), and can be freely downloaded from Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC).¹ Concretely, 91 subjects (45 MCIs and 46 NCs) were participated in our experiment. The subjects were scanned by 3.0T Siemens scanners with the following scanning parameters: TR/TE is 3,000/30 ms, imaging matrix size is 74 × 74, 45 slices, and voxel size is 2.97 × 2.97 × 3 mm³.

The acquired fMRI images were processed by SPM8 toolbox² based on the well-accepted pipeline (Rubinov and Sporns, 2010). For each subject, the first 10 volumes were removed for signal stabilization. The remaining volumes were first corrected for different slice acquisition timing and head motion. Then, regression of ventricle and white matter signals as well as six head-motion profiles were conducted to reduce the influences of nuisance signals. The fMRI series were further band-pass filtered from 0.01 to 0.08 Hz. After that, the corrected images were registered to Montreal Neurological Institute (MNI) standard space, and, based on the automated anatomical labeling (AAL) atlas (Tzourio-Mazoyer et al., 2002), pre-processed fMRI signals were partitioned into 90 ROIs. Please refer to (Qiao et al., 2016) for the description of the preprocessing pipeline in detail. Finally, for each subject, the mean time series of each ROI (Michel et al., 2012) were extracted and put into the column of a data matrix X that is the only material for estimating BFNs.

Brain Functional Network Estimation

As mentioned above, now each subject corresponds to a data matrix, by which we can construct BFNs using different methods. For SR-based methods, they involve a regularized parameter λ that generally affects the subsequent classification performance significantly. Therefore, in our experiments, we select the optimal parametric values for them based on the training set from a candidate range of [2⁻⁵, 2⁻⁴, ⋯ , 2⁰, ⋯ , 2⁴, 2⁵]. Contrary to SR, the PC-based BFN estimation model is parameter-free. However, to improve its flexibility and conduct a fair comparison, we introduce a threshold parameter in PC to remove a proportion of weak edge weights (connections) in the estimated BFN. To be consistent with SR-based methods, we also use 11 candidate values ranging from [1%, 10%,⋯ , 90%, 100%], where the percentage means the proportion of weak connections that are discarded.

Feature Selection and Classification

With the constructed BFNs for all subjects, the next tasks are feature selection and classification. In this paper, we regard the pairwise functional connections between 90 ROIs as features for MCI identification. As a result, the symmetric BFN adjacent matrix contains 90 × (90 − 1)/2 = 4005 features, which would lead to the so-called curse of dimensionality and overfit the training data. Therefore, we reduce the data dimensions by selecting discriminative features based on the t-test with fixed p-values prior to the subsequent classification task. In particular, we set p-value = 0.005 for PC, and 0.01 for other methods, due to their empirically best performance. Based on the selected features, we use the popular linear SVM with default parameter (C = 1) for classification.

The detailed pipeline for feature selection and classification is shown in Figure 3. Due to the small sample size, we employ the leave-one-out (LOO) cross validation strategy for testing the performance of the involved methods. Specifically, for a total of S subjects, one is left out in each loop for testing, while the remaining S-1 subjects are used for training the model.

FIGURE 3

Figure 3. The MCI identification pipeline based on the estimated BFNs.

As described earlier, the free parameter involved in the BFN estimation methods may have a significant influence on the network topology and the final classification result. Therefore, for each method we first estimate BFNs based on different parametric values, and then in each loop we use an inner LOO to validate the classification accuracy on the training data for finding the optimal parametric values in the range of candidate set. Concretely, in each iteration, one of the S-1 training samples/subjects is left out to validate the performance, while the remaining S-2 subjects are for training the model. Then, with the optimal network parametric value, we re-run feature selection and SVM classifier. The final classification accuracy is calculated by averaging the results from all subjects.

Results

Brain Network Visualization

In our experiment, we construct BFNs using four different methods, including PC, SR, SR+SS (Li et al., 2019a) and the proposed adaptively-weighted scheme (namely SR+W). The SR-based models in this paper are solved via SLEP toolbox (Liu et al., 2011). The adjacency matrices of BFNs estimated by different methods are shown in Figure 4. It can be observed that the BFN estimated by PC (a) has a topology that is highly different from those estimated by SR-based methods, since they use different data fitting terms (or, equivalently, capture different statistical dependency between ROIs). In contrast, the SR-based methods lead to similar topological structures since using the same kind of data fitting term for capturing the partial correlation. Compared with the traditional SR, SR+SS (c) affect the network structure significantly due to the direct and stringent removal of volumes from fMRI time series, while the proposed SR+W method (d) can preserve the original network structure well by using a more flexible and lenient volume-weighted strategy.

FIGURE 4

Figure 4. The BFN adjacency matrices constructed by different methods. (A) PC, (B) SR, (C) SR+SS, (D) SR+W.

Classification Performance

In this paper, we use four quantitative measures, including accuracy (ACC), sensitivity (SEN), specificity (SPE), and false positive rate (FPR) to evaluate the classification performance of different methods. Their mathematical definitions are given as follows:

\begin{array}{l} ACC = \frac{T P + T N}{T P + F P + T N + F N} \\ SEN = \frac{T P}{T P + F N} \\ SPE = \frac{T N}{T N + F P} \\ FPR = \frac{F P}{F P + T N} \end{array}

Where TP, TN, FP, and FN indicate true positive, true negative, false positive, and false negative, respectively. In this paper, we treat the MCI samples as positive class and the NC samples as negative class.

The classification results corresponding to different methods are reported in Figure 5. Despite its popularity, PC-based BFNs tend to involve many indirect and redundant connections (as shown in Figure 4 earlier), which may affect its final classification performance. In contrast, SR can regress out the confounding effect from other ROIs, and thus achieve a better classification accuracy than PC. Different from the traditional PC and SR, SR+SS can detect and remove some potential “dirty” data, which may be one of the reasons for the improved classification accuracy. The proposed SR+W method further introduces a more flexible strategy into the traditional SR model for adaptively weighting the fMRI data, and achieves the best classification accuracy, illustrating that the adaptively weighted scheme can improve the quality (at least, the discriminative power) of the estimated BFNs to some extent.

FIGURE 5

Figure 5. Classification results corresponding to different methods.

Sensitivity to Network Model Parameters

In general, model parameters have a heavy influence on the network topology and then the final classification accuracy. To investigate the sensitivity of the involved methods to different values of parameters, we conduct classification experiments on the above data set via LOO cross validation. We report the classification accuracy corresponding to different parametric values in Figure 6 for different BFN estimation methods. It can be observed that the proposed method achieves the best performance for most of the parametric values. However, it is exceedingly sensitive to the model parameter.

FIGURE 6

Figure 6. Classification accuracy associated with different method under 11 different parametric values.

Top Discriminative Features (Networks Connections)

In this paper, we regard the connections between ROIs as features for identifying subjects with MCI from NCs in the dataset. Since the selected features (network connections) may be different in each folder of cross validation, we sort features according to their average of p-values on all folders. As a result, by recording the selected features during the training process, we select 69 discriminative connections and visualize them in Figure 7. Note that the thickness of an arc is inversely proportional to the p-values for indicating the discriminative power of a connection in the classification task. The colors of arcs are randomly generated only for a clearer visualization. From Figure 7, we can find that the brain regions associated with top discriminative features include the temporal gyrus, parietal lobules, parahippocampus, supramarginal gyrus and precuneus, etc. This result conforms to the previous neuroimaging biomarker reports and the pathology studies on MCI (Greicius et al., 2007; Albert et al., 2011). The existing studies (Wang et al., 2013; Gardini et al., 2015) have pointed out that patients with MCI and AD have the same regional network connectivity anomalies compared with healthy controls. Therefore, it has practical significance to realize the early diagnosis of MCI and inhibit its evolution into AD.

FIGURE 7

Figure 7. The most discriminative features (connections) for the 90 ROIs of AAL template involved in the classification tasks. The figure is created by circularGraph tool, shared by Paul Kassebaum. http://www.mathworks.com/matlabcentral/fileexchange/48576-circulargraph.

Performance Evaluation on Simulated Data

We also carried out experiments on a set of simulated Bold signals for evaluating the generalizability of the proposed algorithm and analyzing their ability to detect the true network. At first, we prepared a known network as Figures 8A,B beforehand which is consistent with the hypothesis of our model as the ground truth, that is, it has a clear sparse structure. Then, based on this ground truth network we generate the Bold signals with 80 time points using linear sparse combinations of two given real ROI time series x₁, x₂, associated with ROI1 and ROI2, that is

\begin{array}{l} d a t a = [x_{1}, x_{2}] [\begin{matrix} 1 & 0 & 3 & 0 & 1 \\ 0 & 1 & 0 & 2 & 0.5 \end{matrix}] & (17) \end{array}

Further, to be more in line with the actual scenario, we change several time points randomly in the generated data for simulating the possible artifacts, noises, or “abnormal” resting-state processes. Specifically, we add Gaussian noise to the data to simulate the inevitable noise in real settings and introduce five “dirty” points to simulate the possible functional “noises,” by setting them off the main direction. The eventually signals are illustrated as in Figure 8C. Finally, we estimate FBNs based on the simulated BOLD signals using different methods, and visualize the estimated results in Figures 8D–G.

FIGURE 8

Figure 8. (A,B) The given network as ground truth; (C) The simulated BOLD signals generated according to our motivation; (D–G) The estimated FBNs by running four different methods based on the simulated BOLD signals in (C).

We have the following observation from Figure 8: (1) PC leads to some false connections and produce a dense topology; (2) SR cannot recover the original graph yet as PC although it can remove some connections which makes it exhibit a little cleaner structure. (3) SR+SS can basically reflect the original graph except that an extra edge has been added. However, the order of the weight values have been significantly altered. (4) Different from the above three methods, the proposed algorithm can not only reveal the original topology structure but also preserve the order of the weight values even if it also has an additional edge. The similarity between the estimated brain networks and ground truth by Pearson's correlation coefficient is shown in Table 2. The results show that our method achieves the highest similarity to the ground truth.

TABLE 2

Table 2. The similarity between ground truth network and the generated FBNs.

Conclusion

It is known that the quality of estimated BFNs heavily depends on the observed data. However, in practice, the observed fMRI data are commonly influenced by many factors, especially head motion. Although complex preprocessing pipelines are employed to remove the effects of head motion as much as possible, they are fully independent to the ensuing BFN estimation. Therefore, in this paper we propose a new learning model to estimate BFN together with data “scrubbing” by adaptively weighting the volumes in fMRI time series. Then, we develop an efficient AO algorithm to solve the proposed model and get a closed-form weight update formula with a clear physical explanation. Experiments on a publicly available dataset show that our estimated BFN can result in the best classification accuracy for an MCI identification task. Finally, but interestingly, it is worth emphasizing that, despite its helpfulness in the final performance, the proposed BFN estimation method is unsupervised, meaning that both the weights on data and the network itself are learnt without using the label information. Regarding the fact that our final goal is to improve the classification accuracy, in the future we therefore plan to develop supervised BFN learning algorithms toward a better (at least more discriminative) human brain connectome.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: http://www.nitrc.org/projects/modularbrain/, Neuroimaging Informatics Tools and Resources Clearinghouse.

Ethics Statement

The studies involving human participants were reviewed and approved by Geneva University Hospital. The patients/participants provided their written informed consent to participate in this study.

Author Contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Funding

This work is partly supported by National Natural Science Foundation of China (61300154, 61402215) and Natural Science Foundation of Shandong Province (ZR2018MF020).

Conflict of Interest

DS was employed by company of Shanghai United Imaging Intelligence.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Footnote

1. ^http://www.nitrc.org/projects/modularbrain/

References

Admon, R., Bleich-Cohen, M., Weizmant, R., Poyurovsky, M., Faragian, S., and Hendler, T. (2012). Functional and structural neural indices of risk aversion in obsessive–compulsive disorder (OCD). Psychiatr. Res. Neuroimaging 203, 207–213. doi: 10.1016/j.pscychresns.2012.02.002

PubMed Abstract | CrossRef Full Text | Google Scholar

Albert, M. S., DeKosky, S. T., Dickson, D., Dubois, B., Feldman, H. H., Fox, N. C., et al. (2011). The diagnosis of mild cognitive impairment due to Alzheimer's disease: recommendations from the National Institute on Aging-Alzheimer's Association workgroups on diagnostic guidelines for Alzheimer's disease. Alzheimer Dementia 7, 270–279. doi: 10.1016/j.jalz.2011.03.008

PubMed Abstract | CrossRef Full Text | Google Scholar

Bazaraa, M. S., Sherali, H. D., and Shetty, C. M. (2013). Nonlinear Programming: Theory and Algorithms. John Wiley & Sons.

Google Scholar

Bijsterbosch, J., Smith, S. M., and Beckmann, C. F. (2017). Introduction to Resting State fMRI Functional Connectivity. Oxford University Press.

Google Scholar

Bischkopf, J., Busse, A., and Angermeyer, M. C. (2010). Mild cognitive impairment–a review of prevalence, incidence and outcome according to current approaches. Acta Psychiatr. Scand. 106, 403–414. doi: 10.1034/j.1600-0447.2002.01417.x

PubMed Abstract | CrossRef Full Text | Google Scholar

Biswal, B., Zerrin Yetkin, F., Haughton, V. M., and Hyde, J. S. (1995). Functional connectivity in the motor cortex of resting human brain using echo-planar MRI. Magnetic Resonance Med. 34, 537–541. doi: 10.1002/mrm.1910340409

PubMed Abstract | CrossRef Full Text | Google Scholar

Brunetti, M., Belardinelli, P., Gratta, C. D., Pizzella, V., Penna, S. D., Ferretti, A., et al. (2006). Human brain activation elicited by the localization of sounds delivering at attended or unattended positions: an fMRI/MEG study. Cogn. Process. 7, 116–117. doi: 10.1007/s10339-006-0093-3

CrossRef Full Text | Google Scholar

Chaves, R., Ramírez, J., Górriz, J. M., and Illán, I. (2012). Initiative AsDN. Functional brain image classification using association rules defined over discriminant regions. Pattern Recogn. Lett. 33, 1666–1672. doi: 10.1016/j.patrec.2012.04.011

CrossRef Full Text | Google Scholar

Chen, H. (2019). “Functional brain network estimation based on Weighted BOLD signals for MCI identification,” in Chinese Conference on Pattern Recognition and Computer Vision (PRCV) (Cham: Springer), 29–40. doi: 10.1007/978-3-030-31723-2_3

CrossRef Full Text | Google Scholar

Combettes, P. L., and Pesquet, J. C. (2011). Proximal splitting methods in signal processing. Heinz H Bauschke 49, 185–212. doi: 10.1007/978-1-4419-9569-8_10

CrossRef Full Text | Google Scholar

Dijk, K. R. A. V., Sabuncu, M. R., and Buckner, R. L. (2012). The influence of head motion on intrinsic functional connectivity MRI. Neuroimage 59, 431–438. doi: 10.1016/j.neuroimage.2011.07.044

PubMed Abstract | CrossRef Full Text | Google Scholar

Freire, L., and Mangin, J. F. (2001). Motion correction algorithms may create spurious brain activations in the absence of subject motion. Neuroimage 14, 709–722. doi: 10.1006/nimg.2001.0869

PubMed Abstract | CrossRef Full Text | Google Scholar

Gardini, S., Venneri, A., Sambataro, F., Cuetos, F., Fasano, F., Marchi, M., et al. (2015). Increased functional connectivity in the default mode network in mild cognitive impairment: a maladaptive compensatory mechanism associated with poor semantic memory performance. J. Alzheimer Dis. 45, 457–470. doi: 10.3233/JAD-142547

PubMed Abstract | CrossRef Full Text | Google Scholar

Greicius, M. D., Flores, B. H., Menon, V., Glover, G. H., Solvason, H. B., Kenna, H., et al. (2007). Resting-state functional connectivity in major depression: abnormally increased contributions from subgenual cingulate cortex and thalamus. Biol. Psychiatr. 62, 429–437. doi: 10.1016/j.biopsych.2006.09.020

PubMed Abstract | CrossRef Full Text | Google Scholar

Jenkinson, M., and Chappell, M. (2018). Introduction to Neuroimaging Analysis. Oxford: Oxford University Press. doi: 10.1093/oso/9780198793816.003.0001

PubMed Abstract | CrossRef Full Text | Google Scholar

Jin, H. L., Durand, R., Gradinaru, V., Zhang, F., Goshen, I., Kim, D. S., et al. (2010). Global and local fMRI signals driven by neurons defined optogenetically by type and wiring. Nature 465, 788. doi: 10.1038/nature09108

CrossRef Full Text | Google Scholar

Li, W., Qiao, L., Zhang, L., Wang, Z., and Shen, D. (2019a). Functional brain network estimation with time series self-scrubbing. IEEE J. Biomed. Health Inform. 23, 2494–2504. doi: 10.1109/JBHI.2019.2893880

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, W., Xu, X., Jiang, W., Wang, P., and Gao, X. (2020). Functional connectivity network estimation with an inter-similarity prior for mild cognitive impairment classification. Aging 12:1. doi: 10.18632/aging.103719

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, W., Zhang, L., Qiao, L., and Shen, D. (2019b). Toward a better estimation of functional brain network for mild cognitive impairment identification: a transfer learning view. IEEE J. Biomed. Health Inform. PP:1. doi: 10.1101/684779

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, J., Yuan, L., and Slep, J. Y. (2011). Sparse Learning With Efficient Projections. Techincal Report. Tempe, AZ: Computer Science Center, Arizona State University.

Google Scholar

Liu, X., Zhou, L., Wang, L., Zhang, J., Yin, J., and Shen, D. (2015). An efficient radius-incorporated MKL algorithm for Alzheimer's disease prediction. Pattern Recogn. 48, 2141–2150. doi: 10.1016/j.patcog.2014.12.007

CrossRef Full Text | Google Scholar

Michel, V., Gramfort, A., Varoquaux, G., Eger, E., Keribin, C., and Thirion, B. (2012). A supervised clustering approach for fMRI-based inference of brain states. Pattern Recogn. 45, 2041–2049 doi: 10.1016/j.patcog.2011.04.006

CrossRef Full Text | Google Scholar

Murphy, K., Birn, R. M., and Bandettini, P. A. (2013). Resting-state FMRI confounds and cleanup. Neuroimage 80, 349–359. doi: 10.1016/j.neuroimage.2013.04.001

PubMed Abstract | CrossRef Full Text | Google Scholar

Peng, J., Wang, P., Zhou, N., and Zhu, J. (2009). Partial correlation estimation by joint sparse regression models. J. Am. Statist. Assoc. 104, 735–746. doi: 10.1198/jasa.2009.0126

PubMed Abstract | CrossRef Full Text | Google Scholar

Peng, J., Zhu, X., Wang, Y., An, L., and Shen, D. (2019). Structured sparsity regularized multiple kernel learning for Alzheimer's disease diagnosis. Pattern Recogn. 88, 370–382. doi: 10.1016/j.patcog.2018.11.027

PubMed Abstract | CrossRef Full Text | Google Scholar

Petersen, R. C., and Roberts, R. D. (2009). Mild cognitive impairment: ten years later. Archiv. Neurol. 66, 1447–1455. doi: 10.1001/archneurol.2009.266

PubMed Abstract | CrossRef Full Text | Google Scholar

Poldrack, R. A., Mumford, J. A., and Nichols, T. E. (2011). “Statistical modeling: Single subject analysis,” in Handbook of Functional MRI Data Analysis, 70.

Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L., and Petersen, S. E. (2012a). Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. Neuroimage 59:2142. doi: 10.1016/j.neuroimage.2011.10.018

PubMed Abstract | CrossRef Full Text | Google Scholar

Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L., and Petersen, S. E. (2012b). Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. Neuroimage 63, 999–999. doi: 10.1016/j.neuroimage.2012.01.069

PubMed Abstract | CrossRef Full Text | Google Scholar

Qiao, L., Han, Z., Kim, M., Teng, S., Zhang, L., and Shen, D. (2016). Estimating functional brain networks by incorporating a modularity prior. Neuroimage 141, 399–407. doi: 10.1016/j.neuroimage.2016.07.058

PubMed Abstract | CrossRef Full Text | Google Scholar

Qiao, L., Zhang, L., Chen, S., and Shen, D. (2018). Data-driven graph construction and graph learning: a review. Neurocomputing 84:S0925231218306696. doi: 10.1016/j.neucom.2018.05.084

CrossRef Full Text | Google Scholar

Rodriguez, P. A., Calhoun, V. D., and Adalı, T. (2012). De-noising, phase ambiguity correction and visualization techniques for complex-valued ICA of group fMRI data. Pattern Recogn. 45, 2050–2063. doi: 10.1016/j.patcog.2011.04.033

PubMed Abstract | CrossRef Full Text | Google Scholar

Rubinov, M., and Sporns, O. (2010). Complex network measures of brain connectivity: uses and interpretations. Neuroimage 52, 1059–1069. doi: 10.1016/j.neuroimage.2009.10.003

PubMed Abstract | CrossRef Full Text | Google Scholar

Shi, B., Chen, Y., Zhang, P., Smith, C. D., Liu, J., Initiative, A., et al. (2017). Nonlinear feature transformation and deep fusion for Alzheimer's Disease staging analysis. Pattern Recogn. 63, 487–498. doi: 10.1016/j.patcog.2016.09.032

CrossRef Full Text | Google Scholar

Stam, C. J. (2014). Modern network science of neurological disorders. Nat. Rev. Neurosci. 15:683. doi: 10.1038/nrn3801

PubMed Abstract | CrossRef Full Text | Google Scholar

Tzourio-Mazoyer, N., Landeau, B., Papathanassiou, D., Crivello, F., Etard, O., Delcroix, N., et al. (2002). Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15, 273–289. doi: 10.1006/nimg.2001.0978

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, J., Zuo, X., Dai, Z., Xia, M., Zhao, Z., Zhao, X., et al. (2013). Disrupted functional brain connectome in individuals at risk for Alzheimer's disease. Biol. Psychiatry 73, 472–481. doi: 10.1016/j.biopsych.2012.03.026

PubMed Abstract | CrossRef Full Text | Google Scholar

Weikai, L., Zhengxia, W., Limei, Z., Lishan, Q., and Dinggang, S. (2017). Remodeling Pearson's correlation for functional brain network estimation and autism spectrum disorder identification. Front. Neuroinform. 11:55. doi: 10.3389/fninf.2017.00055

PubMed Abstract | CrossRef Full Text | Google Scholar

Whittingstall, K., Kevin, W., Wilson, D., Doug, W., Schmidt, M., Matthias, S., et al. (2008). Correspondence of visual evoked potentials with FMRI signals in human visual cortex. Brain Topogr. 21:86. doi: 10.1007/s10548-008-0073-2

PubMed Abstract | CrossRef Full Text | Google Scholar

Yan, C.-G., Cheung, B., Kelly, C., Colcombe, S., Craddock, R. C., Di Martino, A., et al. (2013). A comprehensive assessment of regional variation in the impact of head micromovements on functional connectomics. Neuroimage 76, 183–201. doi: 10.1016/j.neuroimage.2013.03.004

PubMed Abstract | CrossRef Full Text | Google Scholar

Yu, R., Qiao, L., Chen, M., Lee, S., Fei, X., and Shen, D. (2019). Weighted graph regularized sparse brain network construction for MCI identification. Pattern Recogn. 90, 220–231. doi: 10.1016/j.patcog.2019.01.015

PubMed Abstract | CrossRef Full Text | Google Scholar

Yu, R., Zhang, H., An, L., Chen, X., Wei, Z., and Shen, D. (2017). Connectivity strength-weighted sparse group representation-based brain network construction for MCI classification. Human Brain Map. 38, 2370–2383. doi: 10.1002/hbm.23524

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, Y., Zhang, H., Chen, X., Liu, M., Zhu, X., Lee, S.-W., et al. (2019). Strength and similarity guided group-level brain functional network construction for MCI diagnosis. Pattern Recogn. 88, 421–430. doi: 10.1016/j.patcog.2018.12.001

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: index terms-brain functional network, functional magnetic resonance imaging, scrubbing, Pearson's correlation, sparse re presentation, mild cognitive impairment

Citation: Chen H, Zhang Y, Zhang L, Qiao L and Shen D (2021) Estimating Brain Functional Networks Based on Adaptively-Weighted fMRI Signals for MCI Identification. Front. Aging Neurosci. 12:595322. doi: 10.3389/fnagi.2020.595322

Received: 16 August 2020; Accepted: 10 December 2020;
Published: 14 January 2021.

Edited by:

Mark Stecker, Independent Researcher, Fresno, United States

Reviewed by:

Qi Zhu, Nanjing University of Aeronautics and Astronautics, China
Xiang Li, Massachusetts General Hospital and Harvard Medical School, United States
Weikai Li, Nanjing University of Aeronautics and Astronautics, China

Copyright © 2021 Chen, Zhang, Zhang, Qiao and Shen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Limei Zhang, emhhbmdsaW1laUBsY3UuZWR1LmNu

^†These authors share first authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.