Localizing and Estimating Causal Relations of Interacting Brain Rhythms

Estimating brain connectivity and especially causality between different brain regions from EEG or MEG is limited by the fact that the data are a largely unknown superposition of the actual brain activities. Any method, which is not robust to mixing artifacts, is prone to yield false positive results. We here review a number of methods that allow for addressing this problem. They are all based on the insight that the imaginary part of the cross-spectra cannot be explained as a mixing artifact. First, a joined decomposition of these imaginary parts into pairwise activities separates subsystems containing different rhythmic activities. Second, assuming that the respective source estimates are least overlapping, yields a separation of the rhythmic interacting subsystem into the source topographies themselves. Finally, a causal relation between these sources can be estimated using the newly proposed measure Phase Slope Index (PSI). This work, for the first time, presents the above methods in combination; all illustrated using a single, simulated data set.

From the cross-spectra S(f ) one can construct coherency matrices C(f ), which are a normalized version of S(f ), as In contrast to the imaginary parts of the cross-spectra, (C(f )) also depends on independent sources through the denominator in Eq. 2. However, independent sources can only lead to a decrease of (C(f )) and hence also (C(f )) reflects true interaction even though the physiological interpretation is not trivial especially when interpreting differences of (C(f )), e.g., between different tasks.
Based on these observations we suggested a series of methods to identify and localize brain interactions (Meinecke et al., 2005;Nolte et al., 2006;Stam et al., 2007;Marzetti et al., 2008;Nolte et al., 2009). Additionally, we proposed a method to identify causal structures of the dynamical system under study . We here give a brief review of some of these methods (Nolte et al., 2006;Marzetti et al., 2008;Nolte et al., 2008) to identify interacting brain sources and to estimate causal relationships. All the methods will be demonstrated using simulated data whose characteristics are defined in the following section.

Simulated interacting neural data
We simulated a seminal case with four dipolar sources as shown in Figure 1, in which the dipoles have all a parallel orientation and are spatially well separated. The sources on the right (left) are interacting with each other but not with the sources on the left (right). We thus considered two interacting subsystems. For both subsystems the source in the back served as driver while the activity

introduction
Electroencephalography (EEG) can directly measure ongoing brain activity with very high temporal but low spatial resolution. In the past decades the main focus was the analysis of event related potentials, i.e., the average brain response to a given stimulus. More recently, the variability of brain activity and especially its interpretation as signatures from the brain as a dynamical network has attracted many researchers (Daglish et al., 2005;Womelsdorf and Fries, 2006;Buckner and Vincent, 2007;Damoiseaux and Greicius, 2009;Fries, 2009;Miller et al., 2009). Studying brain connectivity using noninvasive electrophysiological measurements like EEG or MEG faces the challenge that the data are largely unknown mixtures of activities of brain sources.
To address this issue, we suggest to construct estimates of brain connectivity from quantities that are unbiased by non-interacting sources. For zero mean data 1 the linear statistical signal properties can be determined by the cross-spectral matrices S(f ) defined as where x m (f) are the Fourier transforms at frequency f in channel m for a given segment or trial and 〈·〉 denotes the expectation value which is typically approximated by an average over the segments or trials. It is straight forward to show that noninteracting sources do not contribute systematically, i.e., apart from random fluctuations around zero to the imaginary part of the cross-spectra, (S(f )), regardless of the number of sources and details of the forward mapping . The reason is that the forward mapping is essentially instantaneous and does not induce phase delays to excellent approximation (Stinstra and Peters, 1998) which would be necessary to yield a nonvanishing imaginary part of S(f ).
Localizing and estimating causal relations of interacting brain rhythms of the more frontal sources appeared merely identical to the ones of the drivers but the activity was delayed by 20 ms. The activity of the right driver was given as where ξ 1 (t) is white Gaussian noise with standard deviation 1. Similarly, the activity of the driver on the left side was simulated via We defined a single time step to equal 10 ms, i.e., we considered a sampling rate of 100 Hz, by which the time series u 1 (t) and u 2 (t) displayed pronounced spectral peaks at around 8 and 12 Hz, respectively, and had roughly identical magnitudes. Both time series also have (weak) higher harmonics at 24 and 36 Hz, respectively.
The frontal sources, v 1 (t) and v 2 (t) for right and left side, respectively, are merely delayed versions of the drivers: corresponding to a delay of 20 ms. In total, we modeled 200 s of EEG data. The activities of the four dipolar sources were mapped into 118 EEG channels equally distributed on the scalp. As volume conductor we assumed a three-shell realistic model calculated from the MRI data containing brain, skull, and scalp with equal conductivities for brain and scalp and 50:1 conductivity ratio between scalp and skull. The Maxwell equations were solved using a semianalytic expansion of the electric lead fields (Nolte and Dassios, 2005). An accurate forward model is important but difficult. For the sake of simplicity we here assumed that the forward model is correct, i.e., for the inverse methods we used the same forward model as for the forward simulation.
To the activities of the sources of interest we superimpose spatially correlated and temporally white noise generated as the activity of a collection of dipoles placed on a 1 cm grid within the entire brain. All components of all dipoles were modeled as iid Gaussian noise leading least when focussing on the current discussion. These assumptions can be expressed for an even number of N channels as a model for the imaginary part of the cross-spectra: For each k the set of topographies (a k and b k ) and the "interaction spectrum" P k (f ) form a -what we call -PISA component. We note that this model is only unique up to linear mixing of the two topographies for each k. In other words, the model only identifies the 2D-subspace spanned by the two topographies and not the individual components.
The model is found by joined diagonalization (cf. Ziehe et al., 2004) of (S(f )) in the complex domain: we find a demixing matrix W such that W(S(f ))W † is diagonal. It can be shown that real and imaginary parts of the columns of the mixing matrix A = W −1 span the same subspaces as the pairs of topographies a k and b k . For technical details we refer to Nolte et al. (2006).
Results of the PISA decomposition for the simulated data set are shown in Figure 4, where we show the largest three components. Only the first and the second component revealed a significant interaction spectrum corresponding to the two interacting subsystems in the left and right hemisphere, respectively.

minimum overlap component analySiS (moca)
In order to uniquely decompose the 2D-subspaces found by the PISA method into contributions from individual sources we must introduce further spatial constraints on the nature of the sources.
to highly correlated noise in the EEG electrodes. The noise level was chosen such that the average of power over all channels and frequencies was 20 times higher than the respective average of the signal of interest. In "good" channels and at peak frequencies the power of the signal of interest was still around 10 times higher than the noise.
Power (imaginary part of coherency) over all channels (pairs of channels) are shown as function of frequency in Figure 2. The spatial distribution of the imaginary part of coherency at 10 Hz, i.e., between the peaks and with contributions from both interacting subsystems, is shown in Figure 3.

pairwiSe interacting component analySiS (piSa)
In general, EEG data are a superposition of many subsystems including (effectively) independent sources but also interacting rhythmic sources of various physiological content. To separate these systems we assumed that (a) all interactions are pairwise and that (b) there are not more interacting sources than channels. These two assumptions are a clear simplification of the true brain dynamics, but they yield a unique decomposition of the data and may capture the most relevant aspects of the interaction observed in EEG data, at

phaSe Slope index (pSi)
We finally want to estimate causal structures between the estimated sources. Since the combination of PISA and MOCA resulted in a complete basis of topographies we can find the source activities by applying the inverse of the respective matrix onto the data.
The "Phase Slope Index" (PSI) estimates the causal structure between any two source activities. It is defined as Nolte et al. (2008)  where C ij (f ) is the complex coherency between sources i and j, as given in Eq. 2, and δf is the frequency resolution of the coherency. F is the set of frequencies over which the slope is summed. Usually, F contains all frequencies, but it can also be restricted to a specified band for rhythmic activities. To see that the definition of  Ψ ij corresponds to a meaningful estimate of the average slope it is convenient to rewrite it as with C ij (f ) = α ij (f )exp(iΦ(f )) and α ij (f ) = |C ij (f )| being frequency dependent weights. For smooth phase spectra, sin(Φ(f + δf ) − Φ(f )) ≈ Φ(f + δf ) − Φ (f ) and hence  Ψ corresponds to a weighted average of the slope. We list the most important qualitative properties of  Ψ : 1. For an infinite amount of data and for arbitrary instantaneous mixtures of an arbitrary number of independent sources,  Ψ is exactly zero, because mixtures of independent sources do not induce an imaginary part of coherencies  which in turn is necessary to generate a non-vanishing  Ψ . For finite data,  Ψ will then fluctuate in this case around zero within error bounds. A special case of this are phase jumps from 0 to ±π which can arise also for mixtures of independent sources.
To this end we apply a linear inverse operator, e.g., a minimum norm solver G onto the topographies denoted here for any fixed k as x 1 = a k and x 2 = b k , such that the topographies are mapped into distributions s i of the source field where s i = s i (m,k) is a three dimensional vector field calculated in brain voxels m = 1,..,M and in directions k = 1,..,3. The distributions do not represent the sources of the brain, denoted as q i , but are, within the accuracy of the inverse method, a yet unknown superposition of them: 2. The sources have minimum overlap: This cost function first squares the scalar product of two dipole moments at each voxel and then sums these squares over all voxels. It vanishes if the two dipole distributions have disjoint support (i.e., disjoint regions of non-vanishing activity), thus measuring overlap. It also vanishes if the orientations at each voxel are orthogonal and therefore corresponds to a weaker form of overlap allowing in principle also activities at the same location as long as the orientations are sufficiently different. Thus, a strong bias toward remote interaction is removed.
The minimization in Eq. 10 can be realized analytically (Marzetti et al., 2008). If the concept is generalized to more than two topographies the minimization requires a numerical approach, which, however, is surprisingly fast and robust (Nolte et al., 2009). We note that the spatial constraints (Eqs 9 and 10) and the methods to solve the minimization are similar to those used in ICA in the context of fMRI data analysis (McKeown and Sejnowski, 1998;Matsuda and Yamaguchi, 2004) with the major difference that we here decompose vector fields rather than scalar ones. In particular, the orthogonality constraint in Eq. 9 corresponds, mutatis mutandis, to "sphering" as is used in most ICA methods also used for EEG/MEG data analysis: for simplicity, the data are transformed to be exactly uncorrelated while independence in higher statistical orders is only forced to be as good as possible.
For the present data set we further assumed the sources to be located on the cortex but allowed for arbitrary orientation. Source estimates of the first two PISA components for the simulated data set are shown in Figure 5. We observe that each of the topographies, decomposed from the PISA results using MOCA, corresponds to one of the simulated dipoles.