Clustering Coefficients for Correlation Networks

Graph theory is a useful tool for deciphering structural and functional networks of the brain on various spatial and temporal scales. The clustering coefficient quantifies the abundance of connected triangles in a network and is a major descriptive statistics of networks. For example, it finds an application in the assessment of small-worldness of brain networks, which is affected by attentional and cognitive conditions, age, psychiatric disorders and so forth. However, it remains unclear how the clustering coefficient should be measured in a correlation-based network, which is among major representations of brain networks. In the present article, we propose clustering coefficients tailored to correlation matrices. The key idea is to use three-way partial correlation or partial mutual information to measure the strength of the association between the two neighboring nodes of a focal node relative to the amount of pseudo-correlation expected from indirect paths between the nodes. Our method avoids the difficulties of previous applications of clustering coefficient (and other) measures in defining correlational networks, i.e., thresholding on the correlation value, discarding of negative correlation values, the pseudo-correlation problem and full partial correlation matrices whose estimation is computationally difficult. For proof of concept, we apply the proposed clustering coefficient measures to functional magnetic resonance imaging data obtained from healthy participants of various ages and compare them with conventional clustering coefficients. We show that the clustering coefficients decline with the age. The proposed clustering coefficients are more strongly correlated with age than the conventional ones are. We also show that the local variants of the proposed clustering coefficients (i.e., abundance of triangles around a focal node) are useful in characterizing individual nodes. In contrast, the conventional local clustering coefficients were strongly correlated with and therefore may be confounded by the node's connectivity. The proposed methods are expected to help us to understand clustering and lack thereof in correlational brain networks, such as those derived from functional time series and across-participant correlation in neuroanatomical properties.

compare them with conventional clustering coefficients.We show that the clustering coefficients decline with the age.The proposed clustering coefficients are more strongly correlated with age than the conventional ones are.We also show that the local variants of the proposed clustering coefficients (i.e., abundance of triangles around a focal node) are useful in characterising individual nodes.In contrast, the conventional local clustering coefficients were strongly correlated with and therefore may be confounded by the node's connectivity.The proposed methods are expected to help us to understand clustering and lack thereof in correlational brain networks, such as those derived from functional time series and across-participant correlation in neuroanatomical properties.

Introduction
Networks have been proven to be a useful language to understand structural and functional properties of the brain.The research field is collectively called network neuroscience [1].Initial studies in network neuroscience revealed that brain networks on various spatial scales have properties common to other biological and non-biological networks, such as the small-world property and community structure.More recent studies tend to depend on the availability of new tools to record data with, look at other properties of brain networks such as network hubs, rich clubs and economic efficiency, and endeavour into the analysis of impaired brains [1][2][3][4][5].
In this article, we focus on a measure which has often been applied to brain (and other) networks: clustering coefficient [6].The clustering coefficient quantifies the abundance of connected triangles in a network.In network neuroscience, the clustering coefficient has been shown to be a useful quantity for understanding function-structure associations in the brain for at least the following two reasons.First, it is one of the two building blocks with which to measure the small-worldness of a network; small-world networks are those having a large clustering coefficient and a small shortest path length between two nodes (such as regions of interest; ROIs) on average [2,6].Brain networks are usually small-world networks in this sense [7,8].Loss of small-worldness is a signature of, for example, Alzheimer disease [9,10] and schizophrenia [11].Second, the abundance of connected triangles around a given node, which is measured by local variants of the clustering coefficient, informs us of other structure and functions of networks, namely, community structure [12,13] and local efficiency [14].Both community structure and local efficiency are often measured for brain networks [2,[15][16][17]; for example, community structure of functional brain networks is less pronounced in childhood-onset schizophrenia than controls [18].
However, the current measurement of the clustering coefficient can be easily fooled when it is applied to correlational brain/neuronal networks, where the connectivity between two nodes is defined by Pearson correlation and potentially some other correlation indices.Such correlational brain networks are often built on the basis of a correlation measure between two ROIs such as the pairwise correlation between time-dependent blood oxygen level-dependent (BOLD) signals obtained from functional magnetic resonance imaging (fMRI) or neural signals obtained from electroencephalogram (EEG) and magnetoencephalogram (MEG) [1,2].Correlational networks are also employed to construct structural networks of the brain, where an edge between two ROIs is calculated as the across-participant correlation in the cortical thickness [19,20].A naive application of network analysis tools, including the clustering coefficient, to such correlation networks can go awry due to the following reasons.
First, a network derived from a correlation matrix tends to have many triangles owing to the so-called indirect paths, i.e., a correlation between nodes i and j and one between i and ℓ result in a correlation between j and ℓ even when there is no direct relationship between j and ℓ [21,22].This mathematical property raises the clustering coefficient values.The same pseudo-correlation effect also automatically produces an inflated correlation between the connectivity of node i and the local clustering coefficient (i.e., which refers to the abundance of triangles around a particular node i and has been used for characterising individual ROIs [7,18,[23][24][25][26][27][28][29][30][31][32]), as we will show (section 3.5).One remedy is to use appropriate null models [22], which respect the natural constraints imposed on correlation matrices including a large clustering coefficient value even in the case of networks generated at random.Nevertheless, this solution does not address the issue of the threshold value, which we will discuss below.The partial correlation matrix is a method of choice for removing pseudo-correlation between ROIs that is present in networks based on the Pearson correlation matrix.However, estimation of the partial correlation matrix is difficult, particularly when the number of image volumes is relatively small as compared to the number of ROIs, which is typical of fMRI experiments [33][34][35].
Second, to create a network, we conventionally threshold on the correlation value to dichotomise the presence or absence of an edge between each pair of ROIs.However, the choice of the threshold is arbitrary [16,17,36,37] and results of graph-theoretical analyses often depend on the choice of the threshold [22,37,38].Specifically, clustering coefficient values considerably depend on the threshold value [22,37].One can avoid thresholding by using weighted networks, i.e., networks with weighted edges [16,17].There are several definitions of clustering coefficient for weighted networks [16,17,[39][40][41][42][43][44].However, it is unclear how the weighted network approach should deal with negatively weighted edges; most network analysis tools including the clustering coefficient assume nonnegative edges [45].An interesting possibility is to separately analyse networks composed of positive edges and those composed of negative edges, and then to combine the measurements obtained from the two types of networks [17].However, there seems to be no consensus regarding the treatment of negatively signed edges [46].
In the present study, we develop two clustering coefficients tailored to correlation matrices.The first type of clustering coefficient is based on three-way partial correlation coefficient.The second type is based on partial mutual information.Partial mutual information is a nonlinear correlation measure, which is defined as the conventional mutual information between two random variables but conditioned on other variables [47].These clustering coefficients are ex-pected to overcome some of the aforementioned difficulties.First, they discount the effect of indirect paths to quantify association between two neighbours of a node given the activity of the focal node.In this manner, we avoid both the problem of pseudo-correlation in ordinary correlation matrices and computational issues in the calculation of partial correlation matrices.Second, as in the case of the clustering coefficients for weighted networks, our clustering coefficients do not use thresholding on the correlation value.Third, we measure how far the realised pairwise correlation value is (no matter positive or negative) from the correlation anticipated by the presence of indirect paths.Although this treatment does not solve the problem of negative edges, we intend to use the information contained in negative as well as positive edges in this manner.For a proof of concept, we apply the proposed clustering coefficient indices to fMRI data obtained from healthy subjects with a wide range of age.We show that the clustering coefficients are negatively correlated with the age.This observation is in general less pronounced with the conventional clustering coefficient measures, although decline in the clustering coefficient with ageing should not be regarded as a ground truth in light of the literature [10,32,[48][49][50][51][52][53].We also show that the local clustering coefficients at specific ROIs provide information orthogonal to the mere strength of connectivity and that their association with the participant's age is independent of brain systems.

Functional connectivity
We used N ROI = 30 regions of interest (ROIs) whose coordinates were determined in a previous study [54].Note that we excluded the four cerebellar ROIs out of the 34 ROIs.The system of the 30 ROIs contained the default mode network (DMN; 12 ROIs), cingulo-opercular network (CON; 7 ROIs) and fronto-parietal network (FPN; 11 ROIs).
Denote by ρ(i, j) the Pearson correlation coefficient between the BOLD signals at two ROIs i and j (1 ≤ i, j ≤ N ROI ).We primarily use ρ(i, j) as a measure of functional connectivity between ROIs.However, we will discount the effect of indirect paths, which is present when the edge between ROIs i and j is solely determined by ρ(i, j), by defining new clustering coefficients (section 2.5).
For comparison purposes, we will also examine conventional clustering coefficients for networks (sections 2.3 and 2.4), which are applied to the Pearson correlation matrix and the partial correlation matrix.The partial correlation matrix, which we use as a benchmark, is an alternative measure of functional connectivity [55,56], and its (i, j) element is estimated by ρ partial (i, j) = −cov −1 (i, j)/ cov −1 (i, i)cov −1 (j, j), where cov denotes the covariance matrix [57].It should be noted that ρ(i, j) = ρ(j, i) and ρ partial (i, j) = ρ partial (j, i).We interchangeably use node and ROI in the following.

Average functional connectivity
We used the following two indices of average functional connectivity: the pairwise Pearson correlation coefficient averaged over all pairs of ROIs, denoted by s, and the same average but only over the ROI pairs having the non-negative ρ(i, j) values, denoted by s + .The introduction of s + is motivated by the observation that the interpretation of negative correlation coefficients remains difficult [4,17,58,59].

Clustering coefficients for unweighted networks
In this section and the next, we explain the previously proposed clustering coefficients for unweighted and weighted networks based on the Pearson correlation coefficient, ρ(i, j).Those based on the partial correlation coefficient, ρ partial (i, j), are analogously calculated.
To construct an unweighted functional network, we lay an edge between nodes i and j (1 ≤ i = j ≤ N ) if and only if ρ(i, j) ≥ θ, where θ is a predetermined threshold.The generated network is undirected.We denote the adjacency matrix of the network by A = (a ij ), where 1 ≤ i, j ≤ N ROI .In other words, a ij = 1 if (i, j) is an edge and a ij = 0 otherwise.The clustering coefficient represents the abundance of connected triangles in a network [6].The local clustering coefficient of node i is defined by where a ji is the degree of node i, i.e., the number of edges to which node i is adjacent.The denominator on the right-hand side of Eq. (1) represents the largest possible number of triangles to which node i belongs.Note that 0 ≤ C unw i ≤ 1 (1 ≤ i ≤ N ROI ) and that C unw i is undefined if k i = 0 or 1.The global clustering coefficient for the entire network, denoted by C unw , is given by the average of C unw i over all nodes.We exclude the nodes with k i ≤ 1 from the calculation of C unw .Note that 0 ≤ C unw ≤ 1.Similar to other types of networks, most brain networks, anatomical or functional, have large values of C unw as compared to randomised networks [1,2].

Clustering coefficients for weighted networks
One can define a weighted functional network by regarding ρ(i, j) as the weight of edge (i, j).Because we do not have established methods to deal with negatively weighted edges (but see [17]) and it is common to discard edges with a negative ρ(i, j) value [16,60], the weighted adjacency matrix is given by w ij = ρ(i, j) if ρ(i, j) > 0 and w ij = 0 otherwise.As benchmarks, we consider three variants of weighted clustering coefficient commonly used in the literature [16,17,42,44].We denote by (a ij ) the adjacency matrix of the unweighted network obtained by ignoring the edge weight in the weighted network.In other words, we set a ij = 1 if w ij > 0 (equivalently, ρ(i, j) > 0) and a ij = 0 otherwise.
The local clustering coefficient of node i proposed by Barrat et al. [39] is given by where s i = NROI j=1 w ij is the node strength (i.e., weighted degree).It should be noted that a ij a iℓ a jℓ = 1 if and only if nodes i, j and ℓ form a triangle in the unweighted network; a ij a iℓ a jℓ = 0 otherwise.The average of C wei,B i over all nodes defines the global weighted clustering coefficient denoted by C wei,B .
The local clustering coefficient proposed by Onnela et al. [40], which is implemented in the Brain Connectivity Toolbox [16], is given by Factor max i ′ j ′ w i ′ j ′ normalises C wei,O i between 0 and 1 and prevents it from scaling when the scale of w ij is changed (i.e., when w ij for all 1 ≤ i, j ≤ N ROI is multiplied by the same constant).The corresponding global clustering coefficient, denoted by C wei,O , is given by the average of C wei,O i over all nodes.The local clustering coefficient proposed by Zhang and Horvath [41] is written as [42] The corresponding global clustering coefficient, denoted by C wei,Z , is given by the average of C wei,Z i over all nodes.

Our proposal: Clustering coefficients tailored to correlation matrices
We propose two clustering coefficient measures for correlation matrices (C cor,A and C cor,M ).Both of them discount correlation between ROIs j and ℓ that is expected from the correlation between ROIs i and j and that between i and ℓ, i.e., indirect path between j and ℓ through i (Fig. 1) [22].One measure uses the three-way partial correlation coefficient and the other measure uses the partial mutual information.
The three-way partial correlation coefficient between ROIs j and ℓ controlling for the influence of ROI i, denoted by ρ partial (j, ℓ | i), is defined by [57] Equation (5) indicates that ROIs i and j would be correlated with an amount ρ(i, j)ρ(i, ℓ) by default owing to the indirect path between j and ℓ through i (e.g., [22]).Deviations of ρ(j, ℓ) from ρ(i, j)ρ(i, ℓ) quantify the tendency that j and ℓ are more strongly or weakly connected than is expected from the presence of an indirect path between j and ℓ through i.Based on this observation, we define a first variant of the clustering coefficient as follows.
It is difficult to interpret negative correlation values in functional connectivity data [3,4,17,58,59,61].Therefore, we assume that any deviation of ρ(j, ℓ) from ρ(i, j)ρ(i, ℓ) caused by the effect of i, irrespective of whether it is positive or negative, contributes to the local clustering coefficient at i.We define the local clustering coefficient for ROI i, denoted by C cor,A i (superscript A standing for the absolute value), as In other words, C cor,A i is a weighted average of the absolute value of the partial correlation over pairs of j and ℓ.We have employed the weight |ρ(i, j)ρ(i, ℓ)| for averaging because a high clustering around ROI i should imply strong association between ROIs j and ℓ (in the sense of partial correlation) when i and j are strongly connected and i and ℓ are.We have used ρ partial (j, ℓ | i) instead of ρ partial (j, ℓ), i.e., the partial correlation between j and ℓ controlling for the effect of the other N ROI − 2 ROIs, to make C cor,A i a locally calculated quantity as is the case for the clustering coefficients for networks (e.g., ).The corresponding global clustering coefficient, denoted by C cor,A , is given by the average of C cor,A i over all nodes.Note that 0 We also use another definition of the clustering coefficient based on the partial mutual information, which is a nonlinear correlation measure [47].By definition, the mutual information is nonnegative and invariant under flipping of the sign of the random variable.We use the partial mutual information between ROIs j and ℓ conditioned on ROI i in place of ρ partial (j, ℓ | i) to define the second variant of the local clustering coefficient for correlation matrices, denoted by C cor,M i (superscript M standing for the mutual information).The partial mutual information is defined as where X i , X j and X ℓ are the random variables on ROIs i, j and ℓ, respectively, and h is the (joint) entropy.For example, h(X i ) = − x p(x) log 2 p(x), where p(x) is the probability that where p(x, x ′ ) is the probability that (X j , X i ) = (x, x ′ ).By assuming that the BOLD signals at ROIs i, j and ℓ obey a multivariate Gaussian distribution, one obtains the entropy values in Eq. ( 7) as follows [47,62,63]: where d is the number of random variables and cov represents the expectation.By substituting Eq. ( 8) in Eq. ( 7) and setting cov ′ ij = ρ(i, j), we obtain Using the partial mutual information, we define The denominator normalises the C cor,M i value to range between 0 and 1.The corresponding global clustering coefficient, denoted by C cor,M , is given by the average of C cor,M i over all nodes.As a robustness test, we also examined variants of these clustering coefficients constrained to only positive triangles or negative triangles.We define C cor,A,+ by restricting the enumeration of triangles in the calculation of C cor,A to the positive triangles.In other words, we restrict the summation on the numerator and denominator of Eq. ( 6) to j and ℓ satisfying ρ(i, j), ρ(i, ℓ), ρ(j, ℓ) > 0. We similarly define C cor,A,− , C cor,M,+ and C cor,M,− .We removed six participants from the calculation of C cor,A,− and C cor,M,− .This is because, for these participants, there was at least one ROI i at which there was no triangle with ρ(i, j), ρ(i, ℓ), ρ(j, ℓ) < 0, rendering C cor,A,− and C cor,M,− undefined.

H-Q-S algorithm
As a null model of the covariance matrix, we employed the Hirschberger-Qu-Steuer (H-Q-S) algorithm [64].As recent fMRI data analysis has demonstrated, the H-Q-S algorithm is a more suitable null model than conventional null models in which the topology is randomised [22,65].The H-Q-S algorithm preserves the mean of the diagonal elements, the mean of the off-diagonal elements and the variance of the off-diagonal elements of the given covariance matrix.From the fMRI data of each participant, we obtained the covariance matrix in the course of calculating the functional connectivity, which is the correlation matrix.Based on this covariance matrix, we generated random covariance matrices using H-Q-S algorithm.We then converted the generated random covariance matrices into correlation matrices, which were used as randomised functional connectivity matrices.We did not implement a fine-tuned heuristic variant proposed in [22].
Denote by µ on the average of the diagonal elements of the covariance matrix over the N ROI diagonal elements.Denote by µ off and σ 2 off the average and variance of the off-diagonal elements, respectively.We set x iℓ x jℓ (1 ≤ i, j ≤ N ROI ).In other words, the algorithm assumes that the signal at ROI i is a white-noise time series with a positive bias of length t max , which is independent across the time and ROIs.

White-noise signals
To generate another null model of the covariance matrix, we used white-noise signals.For each ROI, we generated a time series of length 200 in which the signal at each time step and ROI independently obeyed the normal distribution with mean 0 and standard deviation 1.Then, we calculated the covariance matrix using pairs of the N ROI time series and converted it into the correlation matrix.

Participants
One-hundred thirty eight (n = 138) healthy and right-handed participants (54 females and 84 males) were selected from the Nathan Kline Institute's (NKI) Rockland Sample [66].The NKI's data that we used are publicly available.The age of the participants ranged between 18 and 85 years (mean = 41.7,std = 18.4).
For four of our participants, the H-Q-S algorithm did not work because the average off-diagonal element for the empirical covariance matrix was negative, violating the precondition for the algorithm [64].Therefore, we removed the four participants in the analysis that used the H-Q-S algorithm.
Data preprocessing was performed using FMRIB's Software Library (FSL; www.fmrib.ox.ac.uk/fsl), including skull stripping of structural images with BET and registration with FLIRT; each functional image was registered to the participant's high-resolution brain-extracted structural image and the standard Montreal Neurological Institute (MNI) 2-mm brain.Functional data were then preprocessed with motion correction with MCFLIRT and smoothing with full-width half-maximum 5 mm.We also applied additional preprocessing steps to the functional data to remove spurious variance.First, we regressed out six head motion parameters, the global signal, cerebrospinal fluid (CSF) signal, and white matter (WM) signal with FSL FEAT.For each participant, CSF, gray matter (GM) and WM were segmented through FSL's FAST based on his/her T1.The signal averaged over all voxels in GM, WM and CSF was used as global signal.We then applied band-pass temporal filtering (0.01-0.1 Hz).

Linear mixed model
To estimate the linear mixed model with a fixed effect and random effects, we used the lmer function in lme4 package in R (v. 3.4.1).The dependent variable in the linear mixed model was the local clustering coefficient.The fixed and random effects were the node strength and the participant, respectively.To obtain the P value, we used the F -test with Kenward-Roger approximation implemented as the KRmodcomp function in pbkrtest package in R.

Results
We demonstrate the utility of the proposed clustering coefficients on fMRI data collected from participants of a wide range of the age.We looked for associations of the clustering coefficients with the age and its dependence on the ROIs.

Comparison with null models
Statistically larger values of conventional clustering coefficients have repeatedly been observed in empirical brain networks as compared to the null models [1,2].Motivated by these studies, we examined whether the amount of clustering was different between the empirical data and these null models after we controlled for the amount of correlation between two ROIs j and ℓ expected from an indirect path between j and ℓ through a third ROI i.For each participant, we compared the proposed clustering coefficients between the fMRI data obtained from all the participants, those calculated for the H-Q-S null model [22,64], and white-noise signals.
In contrast, the two types of clustering coefficient were smaller for the empirical data than for the randomised data generated by the H-Q-S algorithm (for C cor,A , H-Q-S: 0.281 ± 0.073, t 133 = −12.4,P < 10 −6 , d = −2.15;for C cor,M , H-Q-S: 0.056±0.039,t 133 = −8.59,P < 10 −6 , d = −1.49).This result has probably arisen because the H-Q-S algorithm generates a correlation matrix from short white-noise time series assumed at each ROI.Then, the partial correlation (Eq.( 5)) calculated for the H-Q-S algorithm is distributed relatively widely due to statistical fluctuations, whose distribution can be even wider than that for the empirical data.This fact makes C cor,A and C cor,M , which more or less depends on the absolute value of the partial correlation, large for the randomised data generated by the H-Q-S algorithm.
3.2.Age-related differences in the clustering coefficients tailored to correlation matrices Normal ageing was shown to adversely affect small-worldness of brain networks [15].Because the clustering coefficient is a major index which is used to assess the small-worldness of networks [6], we examined whether our clustering coefficients were able to detect such age-related changes in network structure.We found a negative relationship between each of the two types of clustering coefficients (i.e., C cor,A and C cor,M ) and the age (C cor,A : r 136 = −0.377,P < 10 −5 ; C cor,M : r 136 = −0.397,P < 10 −5 ; Fig. 2(a), (b), Table 1).To explore whether the age is correlated with an index that can be more easily calculated than the clustering coefficient, we examined the relationships between the age and two indices of average functional connectivity.We found that the age was uncorrelated with s (r 136 = 0.020, P = 0.82; Fig. 2(c), Table 1) but negatively correlated with s + (r 136 = −0.311,P = 0.0002; Fig. 2(d), Table 1).The two clustering coefficients were also strongly correlated with s + , whereas they were not correlated with s (Table 2).Therefore, we suspected that the negative correlation between the clustering coefficients and the age was caused by the combination of the negative correlation between s + and the age and the positive correlation between s + and the clustering coefficient.However, significant negative correlation persisted between the clustering coefficients and the age even after controlling for the effect of s + (C cor,A : r 136 = −0.224,P = 0.0076; C cor,M : r 136 = −0.259,P = 0.0019; see Fig. 2(e) and (f) for the scatter plot between the clustering coefficient and the age after the linear effect of s + has been regressed out from both variables; also see Table 1).This result indicates that the negative correlation between the clustering coefficients and age is not completely explained by s + .Therefore, C cor,A and C cor,M quantify effects of the age on fMRI signals beyond what is revealed by the average functional connectivity.
Positive edges and negative edges may have distinct meanings [17].Therefore, we examined variants of the proposed clustering coefficients calculated only from positive triangles (denoted by C cor,A,+ and C cor,M,+ ) or negative triangles (denoted by C cor,A,− and C cor,M,− ).These variants of clustering coefficients were negatively correlated with the age (C cor,A,+ : r 136 = −0.398,P < 10 −5 ; C cor,A,− : r 130 = −0.291,P = 0.0007; C cor,M,+ : r 136 = −0.431,P < 10 −5 ; C cor,M,− : r 130 = −0.304,P = 0.0004).This negative relationship was significant even after controlling for the effect of s + (C cor,A,+ : r 136 = −0.263,P = 0.0019; C cor,A,− : r 130 = −0.197,P = 0.024; C cor,M,+ : r 136 = −0.315,P = 0.0002; C cor,M,− : r 130 = −0.196,P = 0.024).The negative correlation was stronger for the clustering coefficients based on the positive triangles (i.e., C cor,A,+ and C cor,M,+ ) than those based on the negative triangles (i.e., C cor,A,− and C cor,M,− ).We conclude that the age-related differences in the clustering coefficients observed with C cor,A and C cor,M are robust against the restriction of the method to the positive or negative triangles.Note that the age-related decline of C cor,A,+ and C cor,M,+ was stronger than that of C cor,A and C cor,M , respectively.
The rationale behind our clustering coefficients is that the correlation between two neighbours of a focal ROI should be discounted due to the effect of the indirect path.The clustering coefficients C cor,A and C cor,M are not the only indices complying with this rationale.To examine the robustness of our results with respect to specific definitions of the clustering coefficient, we examined the relationship among two other variants of the clustering coefficient designed for correlation matrices and s, s + and the age.Although the correlation between the clustering coefficient and the age was somewhat weaker than in the case of C cor,A and C cor,M , the results with the other two variants of the clustering coefficient were qualitatively the same as those for C cor,A and C cor,M (Appendix A).

Age-related differences in the conventional clustering coefficients
We repeated the same analysis using the clustering coefficients previously proposed for unweighted networks (i.e., C unw ) and weighted networks (i.e., C wei,B , C wei,O and C wei,Z ).For unweighted networks, we used two edge density values, 0.1 and 0.2.Qualitatively, the clustering coefficients for unweighted and weighted networks behaved similarly to C cor,A and C cor,M did.In other words, the clustering coefficients were negatively correlated with the age (Table 1), positively and strongly correlated with s + and not with s with the exception of C wei,B (Table 2).However, the correlation with the age was weaker than in the case of C cor,A and C cor,M (Table 1; see Appendix B for the statistical results).In fact, the partial correlation between the conventional clustering coefficients (i.e., C unw , C wei,B , C wei,O and C wei,Z ) and the age was not significant when one controls the effect of s + (Table 1).These results suggest that these conventional clustering coefficients extract relatively similar information to that extracted by s + as compared to C cor,A and C cor,M do.

Age-related differences in the clustering coefficients for networks derived from partial correlation matrix
Functional networks are often defined in terms of the partial correlation matrix [55,56,61].Therefore, as a benchmark, we calculated the conventional clustering coefficients (for unweighted and weighted networks) for functional networks defined by the partial correlation matrix.The clustering coefficients were not correlated with s or s + (Table 2).These clustering coefficients were also uncorrelated with the age (Table 1).

Relationship between the local clustering coefficients and the node strength
(weighted degree of the node) Local clustering coefficients have been used for characterising individual ROIs [7,18,[23][24][25][26][27][28][29][30][31][32].In this section we show that, differently from the conventional clustering coefficients, the present clustering coefficients do not confound the strength of local clustering at an ROI and the magnitude of the ROI's connectivity.
The clustering coefficients C cor,A i and C cor,M i are plotted against si ≡ s i /(N ROI − 1), i.e., the node strength normalised between −1 and 1, in Fig. 3(a), where a symbol represents a combination of an ROI and an individual.Figure 3(a) suggests that s i and the local clustering coefficient are uncorrelated.To statistically prove this casual observation, we fitted a linear mixed-effects model for each type of local clustering coefficient.In the linear mixed-effects model, the local clustering coefficient value for the combination of a participant and an ROI was the dependent variable (n = 138 participants and N ROI = 30 ROIs).The independent variable was the equivalent of the node strength, i.e., NROI j=1;j =i ρ(i, j).We assumed random effects over participants influencing the slope and intercept.We found that C cor,A i and C cor,M i did not show strong positive correlation with NROI j=1;j =i ρ(i, j) (C cor,A i : t 4139 = −2.33,P = 0.023, Pearson correlation coefficient between C cor,A i and NROI j=1;j =i ρ(i, j), i = 1, . . ., N ROI for each participant, which is then averaged over all the participants, as a measure of effect size r 28 = −0.023,C cor,A i = −0.013si + 0.222; C cor,M i : t 4139 = −3.20,P = 0.0019, r 28 = −0.047,C cor,M i = −0.0050si + 0.031).Note that the effect size as measured by r 28 was small, although the effects were significant owing to a large sample size.
We investigated the same linear relationship for the correlation matrices generated by the randomization of the original correlation matrices using the H-Q-S algorithm.We generated one null model network per participant.For four participants, the algorithm did not work because the average off-diagonal element of the covariance matrix for the empirical covariance matrix was negative, violating the condition for the algorithm to be used [64].For the remaining n − 4 = 134 participants, the dependence of the local clustering coefficient of ROI i on NROI j=1;j =i ρ(i, j) remained small (C cor,A i : t 4019 = −1.93,P = 0.059,  3) and ( 4), respectively) should be correlated with the degree (i.e., the number of edges connected to a node), k i (in the case of unweighted networks) or node strength, i.e., weighted degree s i (in the case of weighted networks) when applied to correlation matrices.Let us explain this point for weighted networks for the sake of clarity.Because of indirect paths, if w ij and w iℓ are large, w jℓ tends to be large, which increases the value of the local clustering coefficient of ROI i.At the same time, s i is large if w ij and w iℓ are.Therefore, we expect systematic positive correlation between s i and any of C unw ) are plotted against si in Fig. 3(b).We did not examine the local clustering coefficient for unweighted networks (i.e., C unw i ) because it was undefined for many ROIs, whose nodal degree k i was either 0 or 1; our network is relatively small (i.e., N ROI = 30) and the edge density is not assumed to be too large.The three local weighted clustering coefficients and si were strongly correlated (C wei,B : t 4019 = 8.56, P = 3.7 × 10 −13 , r 28 = 0.17, C wei,Z i = 0.217s i + 0.355).These results suggest that these local clustering coefficients are confounded by the effect of node strength, which could arise from the pseudo-correlation due to indirect paths.

Dependence of the local clustering coefficients on the brain system
Previous studies found systematic regional differences (e.g., across different lobes) in the local clustering coefficient in functional brain networks [7,18,25,32].However, this effect may be confounded by the effect of the node strength.As a case study, in this section we show that we do not see the association between previously defined brain systems (i.e., subsets of the ROIs constituting the entire network) and age-related changes in conventional local clustering coefficients if the effect of the node strength is controlled.
We first calculated the Pearson correlation coefficient (r) between the age and a nodal index such as the local clustering coefficient at each ROI.Then, we examined whether r was different across three brain systems whose functions and structures have been examined [54,68]: the default mode network (DMN), cingulo-opercular network (CON) and fronto-parietal network (FPN).
The r values between various nodal indices and the age, averaged over the ROIs in each of the DMN, CON and FPN, are shown in Fig. 4. For the clustering coefficients for weighted networks (i.e., C wei,B , C wei,O and C wei,Z ), r was negative for most ROIs, confirming the results reported in section 3.2 that the (global) clustering coefficient was negatively correlated with the age of the participant.The r value was different between the three brain systems for each type of weighted clustering coefficient (C wei,B However, qualitatively the same association between the age and the brain system was also found when r was defined as the correlation between the node strength (i.e., s i ) and the age (F 2,27 = 8.01, P = 0.0019, η 2 = 0.37) and when r was defined as the correlation between s + i , which was defined as ) were positively correlated with the node strength and s + i , we take s i or s + i as a simpler signature of the system dependence of the age effect than the local clustering coefficient.
In contrast, the proposed local clustering coefficients, which were not correlated with s i or s + i (Fig. 3(a)), were not different across the brain systems (C cor,A i : F 2,27 = 0.13, P = 0.88, η 2 = 0.01; C cor,M i : F 2,27 = 0.04, P = 0.96, η 2 = 0.003; also see Fig. 4).These observations suggest that the apparent dependence of the clustering coefficient on the brain system when a conventional clustering coefficient is used is explained by the nodal measure, s i or s + i .We found similar results in sensory-motor regions in the brain (Appendix C).In other words, the association between the clustering coefficient and the age is more positive for the ROIs in a somatosensory-motor system than for the ROIs in an auditory system and a visual system when we used the clustering coefficients for weighted networks.Qualitatively the same dependence on the brain system was also found when we looked at the association between the node strength and the age.In contrast, with the proposed local clustering coefficients, the auditory system showed the strongest association between the clustering coefficient and the age.These results bear robustness to our suggestion that the proposed local clustering coefficients are not confounded by the node's strength, whereas the conventional clustering coefficients are.

Discussion
We proposed two clustering coefficients tailored to correlation matrices.They do not suffer from pseudo-correlation induced by indirect paths between two ROIs through a third ROI, do not require thresholding, do not discard negative pairwise correlation, and do not suffer from the difficulty in estimating partial correlation matrices.The proposed clustering coefficients were more strongly correlated with the participants' age than the conventional clustering coefficients, including those calculated for partial correlation matrices, were.In addition, our clustering coefficients can be used as a local measure to characterise nodes, whereas the counterparts for the conventional clustering coefficients were confounded with the (weighted) degree of the node.These results hold true for two alternative definitions of the clustering coefficient for correlation matrices that we additionally propose (Appendix A).
Previous research has produced incongruent results regarding the changes in the clustering coefficient along ageing.In an fMRI study, both at rest and during tasks, the clustering coefficient in functional networks decreased with ageing [53], which is consistent with the present results.This observation is also consistent with results of an EEG study at rest [52].In different studies, however, no difference was found in the clustering coefficient between younger and older individuals [10,48], or the clustering coefficient increased with ageing [32,[49][50][51].The diversity in these results may owe to participant's heterogeneity, inefficiency of the conventional clustering coefficients or other reasons.It should be noted that the decrease in the clustering coefficient found in a recent study [53] and the present study is consistent with the decline in small-worldness of brain networks, which have been documented by using different indices [15,69].However, we do not claim that the decline in the clustering coefficient along ageing is a ground truth.In fact, the coordinates of the ROIs in the current data set were determined from participants aged 7-31 [54] so that they may not reflect functional ROIs in older adults [70,71].This issue warrants further study.
We demonstrated the utility of the proposed correlation coefficients with fMRI data collected from individuals of different ages.They may also be useful in deciphering functional brain networks collected from different types of individuals such as those with psychiatric or other disorders, those under different task conditions and children under developments.Furthermore, the present method can be used to any correlation or covariance matrix, thus promising their applicability to other functional data of the brain, such as EEG, MEG, correlation in the cortical thickness between ROIs, where correlation is calculated across individuals (see Introduction for references), and even correlation data outside neuroscience.
The proposed clustering coefficients are expected to find immediate applications in the assessment of small-worldness.In the small-world analysis, a major method is to combine the clustering coefficient and the average path length between a pair of nodes, denoted by L. When L is small and the clustering coefficient is large, one says that the network is small-world [2,6] (but see [15,69] for different definitions based on the so-called network efficiency indices).In neuroscience, it is often the case to combine these two indices to examine a single small-worldness index [72] (also see [73] for a recent development).The motivation behind the present study is that the definition or measurement of clustering is nontrivial for correlation matrices, i.e., functional data.
The same caution applies to the path length.A common way to calculate the path length in correlation data is to threshold on the correlation matrix to generate an unweighted network and then measure the path length.However, this method suffers from arbitrariness of thresholding, as discussed in Introduc-tion.Another common way is to define a relationship between the edge weight, i.e., correlation coefficient value, and the cost of passing through the edge.Popular choices of the cost function are the reciprocal of the edge weight [16] and a constant subtracted by the edge weight [15,69].However, the theoretical basis of these decisions seems unclear.A more sensible definition of the distance between ROIs i and j may be 2(1 − ρ(i, j)), which qualifies as a Euclidean distance [74].
We used the three-way partial correlation coefficient controlling for a single ROI to define the clustering coefficients.In contrast, some previous studies derived functional networks from partial correlation matrices [55,56,61].Both types of methods intend to remove the spurious correlation induced by indirect paths between ROIs.While getting common, the methods based on partial correlation matrices face a technical challenge that the partial correlation matrix cannot be determined uniquely from data in general [33][34][35].In addition, its calculation for a single pair of nodes involves all the other N ROI − 2 nodes, contradicting the original premise of the clustering coefficient that it is a local quantity [6].Our clustering coefficients, which use the three-way partial correlation coefficient, do not suffer from the non-uniqueness problem and is a local quantity.Furthermore, we showed that the present clustering coefficients were associated with the age, whereas those calculated for the partial correlation matrices were not.Generalisation of this finding to different ROIs, data sets and types of participants, such as those with a particular brain-related disorder, warrants future work.

Appendix A: Two alternative clustering coefficients for correlation matrices
We assessed two alternative clustering coefficients for correlation matrices on the empirical and randomised data.
In the first variant, we restricted ourselves to the cases in which ρ(i, j) and ρ(i, ℓ) were positive when calculating the local clustering coefficient at ROI i, denoted by C cor,P i (superscript P standing for positive).We set In other words, C cor,P i is a weighted average of the partial correlation over pairs of j and ℓ (j, ℓ = i, j = ℓ) for which ρ(i, j), ρ(i, ℓ) > 0. The corresponding global clustering coefficient, denoted by C cor,P , is given by the average of C cor,P i over all nodes.Note that −1 ≤ C cor,P i ≤ 1 (1 ≤ i ≤ N ROI ) and that −1 ≤ C cor,P ≤ 1.The second variant of the clustering coefficient uses contributions of all ROIs regardless of the sign of ρ(i, j) and ρ(i, ℓ), but in a different manner from C cor,A i (and hence C cor,A ).If ρ(i, j), ρ(i, ℓ) < 0, Eq. ( 5) implies that ρ(j, ℓ) would be positive if there is no partial correlation between j and ℓ.Therefore, we regard that ρ partial (j, ℓ | i) measures the excess correlation between j and ℓ as usual.In contrast, if ρ(i, j)ρ(i, ℓ) < 0, Eq. ( 5) implies that ρ(j, ℓ) would be negative if there is no partial correlation between j and ℓ.This observation is consistent with Heider's balance theory, originating from social psychology and respected in various signed network data, which dictates that in signed unweighted networks (edge weight is either +1 or −1), triangles with one or three +1's are stable, whereas those with zero or two +1's are unstable [75][76][77][78].A related remark was previously made for correlation matrices [43].When ρ(i, j)ρ(i, ℓ) < 0, we regard that ρ(j, ℓ) being more negative (towards −1) than ρ(i, j)ρ(i, ℓ) is a signature of strong association between ROIs j and ℓ with the influence of ROI i controlled.
In other words, we take a negative large value of ρ partial (j, ℓ | i) as an indication of clustering composed of the three ROIs, i, j and ℓ, from the viewpoint of i.On the basis of this reasoning, we define the local clustering coefficient denoted by C cor,H i (superscript H standing for Heider) as follows: The denominator on the right-hand side of Eq. ( 12) is for normalisation to guarantee The corresponding global clustering coefficient, denoted by C cor,H , is given by the average of C cor,H i over all nodes.Note that −1 ≤ C cor,H ≤ 1.We also note that C cor,P i , C cor,P , C cor,H i and C cor,H can be negative, which is different from the clustering coefficients for unweighted and weighted networks and also from C cor,A i , C cor,A , C cor,M i and C cor,M .Both C cor,P and C cor,H were larger for the empirical data than for white-noise signals (for C cor,P , empirical: mean ± sd = 0.109 ± 0.038, white noise: 0.0003 ± 0.0044, t 137 = 33.4,P < 10 −6 , d = 5.71; for C cor,H , empirical: 0.090 ± 0.040, white noise: −0.0001 ± 0.0017, t 137 = 26.7,P < 10 −6 , d = 4.57).This result is consistent with that for C cor,A and C cor,M .In addition, the empirical C cor,P and C cor,H values were larger than those for randomised signals generated by the H-Q-S algorithm (for C cor,P , H-Q-S: mean ± sd = 0.035±0.032,t 133 = 22.0, P < 10 −6 , d = 3.81; for C cor,H , H-Q-S: 0.003 ± 0.017, t 133 = 25.8,P < 10 −6 , d = 4.47).This result is opposite to that for C cor,A and C cor,M .In sum, C cor,P and C cor,H were larger for the empirical data than for both types of randomised data.
To conclude, the results regarding the association of C cor,P and C cor,H with the age are consistent with but weaker than those for C cor,A and C cor,M .
Local clustering coefficients C cor,P Appendix B: Difference between the proposed and conventional clustering coefficients in terms of their association with the age Table 1 suggests that the correlation between the proposed clustering coefficients and the age is stronger than that between the conventional clustering coefficients and the age.To examine this point statistically, we ran the Williams' t-test for two non-independent correlation coefficients with a variable in common [79].The common variable was the age.We compared each of the two proposed clustering coefficients, C cor,A and C cor,M , with each of the five conventional clustering coefficients, C unw with edge density 0.1, C unw with edge density 0.2, C wei,B , C wei,O and C wei,Z .The results (C cor,A vs C unw with edge density 0.1: t 135 = −1.93,p = 0.028; C cor,A vs C unw with edge density 0.2: t 135 = −2.58,p = 0.0055; C cor,A vs C wei,B : t 135 = −1.86,p = 0.033; C cor,A vs C wei,O : t 135 = −3.09,p = 0.0012; C cor,A vs C wei,Z : t 135 = −3.03,p = 0.0015; C cor,M vs C unw with edge density 0.1: t 135 = −2.26,p = 0.013; C cor,M vs C unw with edge density 0.2: t 135 = −2.89,p = 0.0022; C cor,M vs C wei,B : t 135 = −2.14, p = 0.017; C cor,M vs C wei,O : t 135 = −3.10,p = 0.0012; C cor,M vs C wei,Z : t 135 = −3.08,p = 0.0013; all p values were not corrected for multiple comparison) indicate that seven out of the ten cases survived Bonferroni correction (α = 5%).Therefore, we conclude that the two proposed clustering coefficients are more strongly associated with the age than the conventional clustering coefficients are.

Appendix C: Dependence of the local clustering coefficients on sensorymotor brain systems
We repeated the same analysis as that in section 3.6 for sensory-motor brain systems.Because the brain atlas used in the main text only has the DMN, CON, FPN and cerebellum [54], we used the somatosensory-motor network (SMN; 34 ROIs), auditory network (13 ROIs) and visual network (31 ROIs), which are among several brain systems identified in a different study [68].
The Pearson correlation coefficient, r, between the different clustering coefficients and the age and that between the node strength and the age, averaged over the ROIs in each of the SMN, auditory network and visual network, is shown in Fig. 5.For the clustering coefficients for weighted networks (i.e., C wei,B , C wei,O and C wei,Z ), r was slightly positive on average in the SMN and moderately or considerably negative in the auditory and visual networks.The r value was different between the three brain systems for each type of weighted clustering coefficient (C wei,B ).However, as is observed in Fig. 5, qualitatively the same association between the age and the brain system was also found when r was defined as the correlation between s i and the age (F 2,75 = 11.6,P = 4.2 × 10 −5 , η 2 = 0.92) and when r was defined as the correlation between s + i and the age (F 2,75 = 25.5, P = 3.6 × 10 −9 , η 2 = 0.96).
The proposed local clustering coefficients were also different across the brain systems (C cor,A i : F 2,75 = 9.06, P = 0.00030, η 2 = 0.90; C cor,M i : F 2,75 = 11.1,P = 6.2 × 10 −5 , η 2 = 0.92).However, as suggested in Fig. 4, the brain system that showed the most positive correlation with the age was the auditory network (C wei,A In other words, the assoiation between the clustering coefficients for weighted networks and the age is confounded by that between the node strength and the age.In contrast, the proposed clustering coefficients measure the effect of local clustering on the age without being confounded by the node strength.

28 =,
−0.021, C cor,A i = −0.0051si + 0.28; C cor,M i : t 4019 = −1.21,P = 0.23, r 28 = −0.019,C cor,M i = −0.0016si + 0.055).Therefore, we conclude that C cor,A i and C cor,M i (and hence C cor,A and C cor,M ) are not affected by pseudo-correlation and provide measurements orthogonal to the node strength.In contrast, the previously provided local clustering coefficients for unweighted or weighted networks (i.e., C unw i of clustering coefficient for weighted networks (C wei,B i , C wei,O i and C wei,Z i

i
and C cor,H i did not show strong positive correlation with the equivalent of the node strength, i.e., NROI j=1;j =i ρ(i, j) (C cor,P i : t 4139 = −1.58,P = 0.12, r 28 = −0.057,C cor,P i = −0.043si + 0.110; C cor,H i : t 4139 = −2.86,P = 0.0053, r 28 = −0.075,C cor,H i = −0.033si + 0.091).Note that the Pearson correlation coefficient values were small, whereas the t value was significant due to a large sample size.Upon randomisation, the dependence of the local clustering coefficient of ROI i on NROI j=1;j =i ρ(i, j) remained small in terms of the r value (C cor,P i : t 4019 = −5.14,P = 1.0 × 10 −6 , r 28 = −0.19,C cor,P i = −0.110si + 0.038; C cor,H i : t 4019 = 0.746, P = 0.46, r 28 = 0.011, C cor,H i = 0.005s i + 0.000).Therefore, we conclude that C cor,P i and C cor,H i , and hence C cor,P and C cor,H also, are not affected by indirect correlation and provide measurements orthogonal to the node strength.

Figure 1 :Figure 3 :
Figure 1: Schematic of the indirect path between nodes j and ℓ through node i.

Figure 4 :Figure 5 :
Figure 4: Pearson correlation coefficient between a nodal index and the age, averaged over the ROIs in the DMN, CON or FPN.The circle represents the correlation coefficient value for a single node.
2 off ⌋ , where ⌊•⌋ is the largest integer smaller than or equal to the argument.Then, we generate N ROI × t max variables, denoted by x i,t (1 ≤ i ≤ N ROI , 1 ≤ t ≤ t max ) that independently obey the normal distribution with mean µ off /t max and variance −µ off /t max + µ 2 off /t

Table 1 :
Correlation between the clustering coefficient and age.The correlation coefficient is denoted by r.The degree of freedom is equal to n − 2 = 136.

Table 2 :
Correlation between the clustering coefficient and the node strength.The degree of freedom is equal to n − 2 = 136.