Revisiting non-linear functional brain co-activations: directed, dynamic and delayed

The center stage of neuro-imaging is currently occupied by studies of functional correlations between brain regions. These correlations define the brain functional networks, which are the most frequently used framework to represent and interpret a variety of experimental findings. In previous work we first demonstrated that the relatively stronger BOLD activations contain most of the information relevant to understand functional connectivity and subsequent work confirmed that a large compression of the original signals can be obtained without significant loss of information. In this work we revisit the correlation properties of these epochs to define a measure of nonlinear dynamic directed functional connectivity (nldFC) across regions of interest. We show that the proposed metric provides at once, without extensive numerical complications, directed information of the functional correlations, as well as a measure of temporal lags across regions, overall offering a different perspective in the analysis of brain co-activation patterns. In this paper we provide for a proof of concept, based on replicating and completing existing results on an Autism database, to discuss the main features and advantages of the proposed strategy for the study of brain functional correlations. These results show new interpretations of the correlations found on this sample.


I. INTRODUCTION
The large scale dynamics of the brain exhibits a plethora of spatio-temporal patterns. Since the first description of voxel-wise correlation networks [7] there has been a continuous interest in developing better ways to derive brain "networks" from fMRI time series data. Common to all is the identification of functional "nodes" ( i.e., fMRI time series extracted from region of interest (ROI)), functional edges (i.e., the relatively stronger cross-correlations which allows for the subsequent graph analysis. An important methodological challenge has been always to define an adequate coarse graining of the brain imaging data to compress thousands of the so-called BOLD ("blood oxygenated level dependent") time series. The usual analysis aims at the identification of bursts of correlated activity across certain regions, which requires extensive computations, complicated in part by the humongous size of the data sets.
In earlier work we proposed that the timing of the brief epochs of relatively stronger BOLD activations contain a great deal of functional connectivity (FC) information [26,27]. The results of subsequent work [1-3, 13-16, 23, 28, 31] seems to provide ample support to this idea, by confirming the functional relevance of such relatively large amplitude BOLD events under a variety of conditions. The present work goes beyond the analysis of correlations between BOLD time series to explore and define a set of measures of the nonlinear directed dynamic functional correlation across regions of interest (ROIs). The use of such measures, despite its simplicity, may help to expand at once the perspective of the usual FC paradigms, such as seed correlations maps and networks, into the realms of nonlinear time-dependent directed correlations.
The paper is organized as follows: in the next section we describe the essence of the method, starting with the basic procedure to define the BOLD triggered events followed by a description of the available correlation measures that allow a proper definition of the functional connectivity between the events, including definition of directionality and temporal lag of the events. Section 3 contains the analysis of a simple example as a proof of concept of a healthy subjects fMRI data set, followed by the replication and further analysis of a voxel-wize published data set from Autism Syndrome in order to show the method features. The paper closes with a discussion of the advantages and limitations of the method and potential implications of the results. Derivations and further technical details are condensed on the Appendix.

II. METHODS
The analysis to be discussed can use BOLD time series recorded indistinctly from either resting state conditions or during an experiment in which the subject is performing a given task (for space reasons we only show in the subsequent sections results on resting state conditions). The most common approach to determine functional connectivity is to compute Pearson linear correlation between BOLD time series [9,30]. In contrast, the objective of the present analysis is to determine the relation between relatively large amplitude BOLD activations from a given pair of signals. In this section it will be discussed: II A how large amplitude events are selected given series of fMRI data; II B correlations computed with the selected events; II C how directionality is understood when working with events; and II D how the dynamic connectivity, understood here as lags between time series, is computed.

A. Definition of BOLD triggered events
First, each BOLD time series is z-scored (its mean is subtracted and it is divided by its standard deviation). Next, a threshold for detecting strong activity is chosen, (typically the results remain unchanged when using a range of 1 − 2 standard deviations) and for each time series, the timing of each upward threshold crossing is determined (Fig. 1, panel  A). Note, that the number of threshold crossings depends on the auto-correlation of the BOLD signals (which stays in the range 0.6-0.85 [22]), and more generally on the exponent of the 1/f α frequency spectrum. Empirically, for the threshold of 1σ, in a BOLD signal we find on average 8.5 ± 2.8 upward crossings per 4 minutes of fMRI scan.
The timing is further used to define the seed or source events. For a given seed voxel or region of interest (ROI) they consist of segments of BOLD time series starting typically 4 − 5s before and ending 9 − 15s after the crossing (which translates to 2 − 3 TRs before and 4 − 7 TRs after, with T R = 2.3 in the data we are using as a proof of concept in this study). This time scale is chosen by the typical duration of these events, which in turn is dictated by the longest time scale of the haemodynamic response function (∼ 10 − 15 seconds).
Finally, for each seed event, the target events are extracted from all the other BOLD time series at the exact same times as the seed, see Fig. 1 B-C. The average time courses of the events follow typically a smooth pattern, although they do exhibit variability, for both the seed (see Fig. 1 D) and targets (see Fig. 1 E-F). If the interest of a given experiment is to define an average inter-relation measure between ROIs, then all the seed and target events might be averaged (as shown by red-and-black circles in Panels D-F of Fig. 1), for instance over the entire scan fMRI session.

B. Correlations
Once events are selected, a few options of computing correlations are possible: 1. r P (i, j) linear Pearson correlation between the whole time series i and the whole time series j (computed in section III were we perform a proof of concept). This option is not related to events, but in next section we will compute it to compare with next options, 2. r   The choices, 3-4, even we are not showing results on this paper, can be helpful in getting statistically less biased estimators of correlations. In the paper we use 1 and 5.

C. Directionality
Given two regions of interest i and j, the linear Pearson correlation between their BOLD time series by definition is symmetric, i.e., r P (i, j) = r P (j, i). It is not the case, if the correlations are computed using events. Then, the distinction between source and target becomes relevant, as shown in Fig. 2 A. The shaded areas in the plots mark the positions of source events of each of the two relatively strongly correlated ROIs. Visibly, the first two events are common for both time series, but for instance the BOLD activations around T R = 30 and T R = 40 are source events for ROI 2 but not for ROI 1.
Consequently, the set of events over which one computes correlations when ROI 1 is the source is different from when ROI 2 is the source, which is shown in Fig. 2 B. In that panel, the first row shows source events of ROI 1 and target events in ROI 1 in response to ROI 2 activations; the second row shows target events in ROI 2 in response to ROI 1 activations and source events of ROI 2. So even though the BOLD series of both regions are highly correlated, the source and target events are different, and hence the event correlation is not symmetric r E (i, j) = r E (j, i). The asymmetry in correlations might indicate directionality of co-activations between regions. We can compute and assess a global correlation asymmetry of the whole brain or similarly asymmetry of each ROI, or of each pair of time series i and j. Alternatively, in the spirit of analysis of point processes [4,26,28], we can use another proxy for directionality, based on the relative number of events occurring simultaneously in two regions. For instance, in Fig. 2, there are 2 out of 6 source events in ROI 1 that are also triggers (i.e., above threshold) in ROI 2, and 2 out of 5 in ROI 2 that are also triggers in ROI 1. This approach takes into account event amplitudes, which to a large extent could be also achieved by computing covariance instead of correlation of events. Below we call such ratio event directionality.

D. Temporal lags
Recent studies [17][18][19][20][21] have reported that the latency structure of the fluctuations in the fMRI BOLD signal shows spontaneous activity propagating through and across the cortex on a timescale of about 1s. This information was obtained by conventional lagged cross-covariance between pairs of BOLD time series x i (t) and x j (t) extracted from regions i and j: where τ is the lag (in TRs). The value of τ (i, j) at which C i,j (τ ) exhibits an extremum defines the delay between signals x i and x j . Performing a parabolic interpolation of the cross-covariance extremum allows to determine the temporal lags with a finer resolution [21]. By definition, for a long time series the time delay matrix τ (i, j) is anti-symmetric, i.e., τ (i, j) = −τ (j, i). Information on the cross-covariance value and the lags can be used to determine the structure of entire spatiotemporal processes.
Here, instead of computing cross-covariance between entire time series, we make use of the fact that the BOLD triggered events have a well defined timing. Given a source time series x i (t) and a target time series x j (t), we obtain a set of k i source events. For each source event in x i (t) we find the closest peak in x j (t) irrespective of its size and whether it occurred before or after the source event. We search for the peak within a window of [−6, 8] TRs from the source threshold crossing. As shown in Fig. 3, to obtain a finer timing of both the source and target peak we also use a parabolic fit. The lag τ (i, j) is then defined as a difference between the target and source parabola peaks. As a technical side note, when getting a peak value at the left or right edge of the time window we do not perform the parabola peak estimation, which could have unbounded values, but we set the lag to −6 or 6, respectively. Since the sets of source (threshold crossing) events of x i (t) and x j (t) can be (and usually are) different, the matrix τ (i, j) is in general non-symmetric irrespective of the length of the time series. Additionally, for each i, j pair of ROIs we can obtain a set of delays for each individual source event k: τ (k) (i, j), an average of these valuesτ (i, j), or alternatively a delay between average events τ (i, j) (like the ones in Fig. 1 D-F).

III. RESULTS:
For the sake of presentation we will proceed here to discuss the performance of the method on two settings. The first (3.1) corresponds to the analysis of BOLD time series (n=90) from the AAL parcelation [29] and the second (3.2) describes a voxel-wize functional connectivity analysis. In all cases we will compute classical Pearson correlation measures and compare them with ones obtained from the proposed method.

A. Functional connectivity on AAL parcelled time series data sets
The purpose of this subsection is to show how the computations explained previously perform on a single subject and a group of healthy participants, for this, we are using a group of 32 individuals from ABIDE preprocessed database [5]. The 90 AAL preprocessed (using Data Processing Assistant for Resting-State fMRI (DPARSF) pipeline) time series have been downloaded from the database and z-scored before lineal and nldFC computations.
The results of these computations can be seen in Figure 4, which shows a matrix for single subject, a mean matrix of the whole group and the distribution of each computation. First, Fig. 4 A shows typical results obtained from Pearson correlations between all the time series, note that the distribution is a usual Gaussian shape. The visibly different shape of the histogram of event correlations as compared to the distribution of Pearson correlations in Fig. 4 B is simply due to the shape of the sampling distribution of Pearson estimator for small length of time series. This issue can be adjusted as discussed in Supplementary Material. Fig. 4 C shows the distribution of asymmetry computed from events, here we are computing the proportion of shared events between regions, to see the first computation explained in methods section, subtracting the transposed matrix for each subject can be seen in Supplementary Material. Delay between time series are showed in Fig. 4 D, for shifted time series as in [18] and Fig. 4 E, for delay computed using events. Note that for Fig. 4 D, the apparent asymmetry is due to the TRs subtracted at the beginning and end of the signal, to allow the computation, while for Fig. 4 E, the event selection between target and source, so it is not an artifact of the computation.

B. Replication of voxel-wise functional connectivity findings
As a further validation of the computations explained above, we have replicated [32] findings on functional connectivity between insular sub-regions on Autism Syndrome patients, using data from the ABIDE preprocessed database [5]. This is an open database with thousands of scans already preprocessed of Autistic patients (AU) and age-matched Healthy subjects (HS) http://preprocessed-connectomes-project.org/abide/quality_assessment.html [6,25,33]. For this computations, we have used UCLA database from ABIDE, as in [32], only subjects with good quality of the scan and under 1mm of frame-wise displacement were included into the sample, resulting a sample of 47 AU and 32 HS. The MRI data acquisition as the preprocessing pipeline used for this database can be accessed here: http://preprocessed-connectomes-project.org/abide/Pipelines.html.

Pearson correlations
For each subject scan, mean time series from six insular subregions (using Brainnetome functional atlas, [8]) were extracted and correlated using Pearson correlation with all the rest of tha voxels of brain (grey matter masked) as in [32]. One-sample t-test was computed for each group of participants (AU and HS) to result the correlation pattern of each insular subregion, obtaining comparable results as in [ [32] (Fig. 1)]. For purposes of space we only show results from left ventral agranular Insula subregion. As in [32], HS resulted higher correlation of this ROI with bilateral precuneus cortex (see Figure 5).

non-linear functional co-activations
Following the method explained in [27], relevant events from mean time series of left ventral agranular insula subregion were extracted (triggering events where the amplitude is above a 1sd threshold, 2 TR previous to this trigger and 4 TR after). All time series from all the voxels in the brain (grey matter masking) corresponding to those events were extracted. Then correlations between the 'mean event' of insula and the 'mean event' of each voxel were computed. As it can be observed in Figure 5 panel B, similar results to Pearson correlation of the whole signal were obtained, it is important to note that here we are only taken into account the signal from events, not the whole time series. Computing a two-sample t-test (GFR corrected, p-voxel=.001, p-cluster=.05) the same cluster of higher correlation between insula and precuneus cortex in HS group can be observed. Results for zero-lag seed-voxel correlations using the left ventral agranular insula as seed. It is shown the Z-transformed Pearson correlation one-sample t-test for each group of participants (first-second columns for AU and third-fourth for HS) and two-sample t-test to assess differences between groups (fifth and sixth columns of brain surfaces)(GFR corrected voxel p¡.001, cluster p¡.05); Panel B) The same distribution of columns as A), for the results of correlating the the large events (Z-transformed, GFR corrected voxel p¡.001, cluster p¡.05);

Asymmetry
As it has been explained above, the correlation value between two signals (i,j) obtained when computing relevant events is not symmetric, the correlation of the events of a source with its target r(i, j) is not necessarily the same than the correlation of the events of that target, acting as source, with the source, acting as a target r(j, i). The difference between this r(i, j) and r(j, i) can be understood in terms of directionality of the correlations. To test this and reveal new results on the activity of left ventral agranular insula, we computed these differences across the whole brain, resulting higher asymmetry of HS group between the ROI and Right superior frontal gyrus, and higher asymmetry of AU group between the ROI and right fusiform gyrus ( Figure 6).
FIG. 6. Directionality computed from-Insula and to-Insula for all brain voxels. Panel A shows group averages for asymmetry from Insula subregion mean time series to the rest of the brain for Autistic patients (left columns), Healthy subjects (middle columns) and histrograms showing the distribution of these directionalities. Panel B shows the same computations but for directionality from all the brain to the Insula subregion mean time series.

Delay
Al previous computations can be understood as lag-0 results, but it can also be computed the delay from an event, in a source time series, and the closest event in a target time series. To test this delays from left ventral agranular insula to all the rest of brain voxels were computed. Comparing the delays between groups it is shown that while left postcentral gyrus and the above mentioned precuneus cortex, have a positive delay for AU, while they have a negative delay for HS (Figure 7). To better explain the differences in delay τ between AU and HS subjects, Fig.8 shows time series of postcentral gyrus and ventral agranula Insula for a single AU subject (Panel A) and a HS subject (Panel B).

IV. DISCUSSION: FEATURES, ADVANTAGES AND LIMITATIONS OF THE PROPOSED STRATEGY
Emphasizing the relevance of relatively high amplitude BOLD signal while compressing the data motivated the two paradigms we have proposed previously; namely the point process [26] and the rBeta technique [27]. Both attempt to capture the spatio-temporal dynamics with the smallest possible sampling with a trade off between temporal and spatial resolution. The point process compresses in the temporal domain, which implies that to smoothly represent spatio-temporal correlated patterns one needs to sample more voxels. On the other hand, the rBeta approach uses much fewer voxels, but at the expense of keeping additional temporal information around each threshold crossing. These two variants have demonstrated to main advantages comparing to the above mentioned functional connectivity measures, the fist one is that they imply a data size reduction and less computational resources to obtain comparable results to full time series analyses, and second, as these are only focusing on relevant high amplitude time-points, or events, less significant events occurring during the scan are not blurring the computations. The non-linear dynamic functional connectivity method we are proposing, offers a widely different perspective in the analysis of brain co-activation patterns without much numerical complications, since it implies no more than thresholding and linear correlations, facilitating a simple interpretation of the resulting functional connectivity paths. The fact that the correlations are computed from events identified either as seeds or targets leads to directed graphs (i.e., asymmetric correlation matrices). These seed-target relations can lead to new approaches to understand brain dynamics, for instance, as in the example of Autism Syndrome in which the computation of delays between events shows distinct information. Pearson correlation, computed between Left ventral agranular Insula and Postcentral gyrus, does not show any differences between Autistic patients and Healthy subjects, while it has been reported that Poscentral gyrus has a differential connectivity in Autism Syndrome when analyzing big samples [11]. When computing delays from Insula to other regions, differences between healthy subjects and Autism Syndrome in Postcentral gyrus appear, which leads us to think that this differential connectivity may not be just spatial but on a spatio-temporal domain. Another example is the additional information we have found on the comparison between Autism Syndrome and Healthy Subjects correlations which shows that Autism Syndrome patients present a lower connectivity between precuneus cortex and ventral agranular insula, this explanation can be found on the differences in delay between these two areas (Fig. 7 panel B shows there is a higher delay on Autism Syndrome from insula to precuneus, explaining the drop on correlation).
The main advantages provided by this new method of analysis are two: A) The results are highly reproducible on correlations asymmetry and delays, even changing the threshold used to extract the relevant events, that means that the method is not depending on specific cut-off point, as sometimes can happen when using r − thresholding from Pearson correlations. B) This method is applicable to task data, for instance defining the task timing (convoluted with HFR function) as the seed time series; or as an attempt to explain dynamic functional connectivity fluctuations, possibly due to on-going cognition, as suggested in [10].
Further testing of the method should be performed to identify more specifically the limitations of the method. A) For instance, we have not compared the method with results obtained from sliding-window Pearson correlation, a widely-used method to inspect dynamics in functional connectivity [12,24]. In further work we expect that there will be a relation between this window based connectivity and the information provided from the delays of our method. B) Another technical issue that we are aware is the the few points we are correlating when selecting relevant events, which is the same limitation of sliding-windows correlations for dynamic connectivity analyses. C) It is unclear the meaning of the delays distribution, something already intriguing from previous results obtained using Pearson correlation delays ( [19], Fig.5).
Finally we shall mention that while here we concentrate on the activation events, i.e., denoted by the BOLD signal upward crossing of a threshold, the same method can be applied without modification to de-activation events. In such a way graphs of regions of interest which are correlated with the deactivation of regions can be obtained, something that we are not aware was considered before.
In conclusion, we have analyzed undisclosed properties of the previously published rBeta method [27]. Overall these calculations provide different kind of information, than the usual Pearson correlation of the entire time series. As a prof of concept we have replicated a recently published study of Functional Connectivity in Autism Syndrome, adding the new features obtained from our method. More work has to be done to override the possible limitations of the method and to test it with task paradigms...