Revealing the Relevant Spatiotemporal Scale Underlying Whole-Brain Dynamics

The brain rapidly processes and adapts to new information by dynamically transitioning between whole-brain functional networks. In this whole-brain modeling study we investigate the relevance of spatiotemporal scale in whole-brain functional networks. This is achieved through estimating brain parcellations at different spatial scales (100–900 regions) and time series at different temporal scales (from milliseconds to seconds) generated by a whole-brain model fitted to fMRI data. We quantify the richness of the dynamic repertoire at each spatiotemporal scale by computing the entropy of transitions between whole-brain functional networks. The results show that the optimal relevant spatial scale is around 300 regions and a temporal scale of around 150 ms. Overall, this study provides much needed evidence for the relevant spatiotemporal scales and recommendations for analyses of brain dynamics.


INTRODUCTION
The brain is a dynamical system that can process and adapt to new information by rapidly transitioning between multiple states. Human functional neuroimaging studies have demonstrated these transitions of functional states, even in absence of an active task (Tang et al., 2012;Stitt et al., 2017;Liégeois et al., 2019). These functional states contain orchestrated activity of several networks of brain regions, which transition between each other in recurring patterns over time (Alexandrov, 1999;Ashourvan et al., 2017;van der Meer et al., 2020). These network transitions have been associated with cognition and (ab)normal behavior (Engel et al., 2001;Bassett et al., 2011;Thompson et al., 2013;Vidaurre et al., 2017;Liégeois et al., 2019;Yoo et al., 2020). While describing these transitions is important for our understanding of the neural basis of cognition and behavior, a fundamental question remains, namely on which spatiotemporal scale (i.e., spatiotemporal resolution) these transitions between whole-brain functional networks generally take place. This unclarity has led to arbitrary decisions in neuroimaging experiments regarding spatiotemporal scales, risking losing relevant information on network transitions.
The main reason for a lack of empirical evidence for the optimal spatiotemporal scales of brain transitions is caused by technical restrictions of human neuroimaging studies. Despite a coherent overlap between functional networks in EEG/MEG and fMRI, i.e., EEG microstates and fMRI resting state networks Musso et al., 2010;Yuan et al., 2012), these neuroimaging modalities differ regarding the available spatial and temporal scale. The spatial scale of MEG depends on the number of sensors, and beamforming can only separate up to around 70 regions across the whole brain with a significant drop in signal in deeper regions (Brookes et al., 2016;Deco et al., 2017a). Regarding the temporal scale, EEG and MEG provide a temporal scale of milliseconds, whereas fMRI is limited by the repetition time (TR) and the hemodynamics of the BOLD signal. In line with the respective strengths of the modalities, spatial scales and parcellations have been mainly explored with fMRI (Proix et al., 2016;Arslan et al., 2018), while temporal scales of brain dynamics have been rather investigated using EEG or MEG (Bassett et al., 2006;Britz et al., 2010). However, current technical restrictions make simultaneous comparisons of spatial and temporal scales insurmountable.
One way of overcoming the technical limitations of spatiotemporal scales of empirical neuroimaging data is to use generative dynamical network models that simulate wholebrain neural activity in a simplified manner, while retaining the main properties of empirical spatiotemporal dynamics. In a broad sense, generative models conceptualize the relevant underlying processes (e.g., synaptic connectivity and dynamics of neural populations) that generate the observed empirical neural signal, e.g., by using mean-field approximations (Wilson and Cowan, 1972;Stephan et al., 2007;Deco et al., 2013). In contrast to restricted spatiotemporal scales of empirical data, these models recover the underlying neural signal with a flexible adaptation of the signal's temporal scale up to a millisecond scale (Yuan et al., 2018;Deco et al., 2019;Cornblath et al., 2020), while simultaneously offering an exploration of various spatial scales (Gerstner et al., 2012).
In our previous research, we were able to compare the complexity of dynamic transitions of whole-brain networks using a whole-brain network model incorporating different temporal scales from milliseconds to seconds (Deco et al., 2019). In the current study, we extend this approach by adding a spatial scale to the analysis (i.e., the number of regions) and exploring the transitions of networks across different spatiotemporal scales (i.e., simultaneously considering spatial and temporal scales). The novelty of our study comes from the information about the optimal number of brain regions (i.e., spatial scale) in describing the spatiotemporal dynamics of the resting-state, providing an empirical basis for the choice of parcellations for analyses of brain dynamics. By doing so, we complement previous neuroimaging studies on the temporal scale of functional wholebrain networks' transitions (e.g., Bassett et al., 2006) and the spatial scale of static networks (e.g., using functional connectivity as in Arslan et al., 2018).
To achieve our goal, we explore transitions of whole-brain functional networks at spatial scales from 100 to 900 regions. To add a temporal dimension, these networks are extracted from empirical resting-state fMRI with fixed temporal scales (corresponding to the TR) as well as from simulated time series with various temporal scales from milliseconds to seconds. We determine the relevant spatiotemporal scale by comparing the entropy of functional network transitions. By focusing on the behavior of whole-brain networks instead of separate brain regions, we extract information that is the most relevant for brain dynamics analyses. From our results we derive recommendations for neuroimaging researchers, highlighting our finding that the optimal spatial scale for analyses of brain dynamics is at around 300 regions and at an optimal temporal scale of around 150 ms. Besides implications for a better understanding of functional network transitions, we provide data-based recommendations for choosing appropriate neuroimaging modalities and parcellation techniques for brain dynamics analyses.

MATERIALS AND METHODS
We adapted the existing method comparing different time scales (Deco et al., 2019) to incorporate different spatial scales. Images were created using Biorender, Inkscape, Connectome Workbench and the Matplotlib library within Python.

Data Acquisition and Preprocessing
We used the 100 unrelated subjects' subset (54 females, 46 males) from the Human Connectome Project (HCP) (Van Essen et al., 2013). From this data, we analyzed the left-right (LR) phaseencoding runs from the resting state fMRI data, which had about 15 min duration with a TR of 0.72 sec. The HCP study was approved by the local ethical committees and informed consent was obtained from all subjects. Data from six subjects were discarded as the resulting functional connectivity (FC) matrices consisted of at least one not available row at parcellations with more than 800 regions (due to the sparsity of the networks), resulting in a total of 94 subjects being used for the analysis. During fMRI acquisition, subjects were instructed to keep their eyes open while looking at a fixation cross. A full description of the imaging parameters and minimal preprocessing pipeline can be found in Glasser et al. (2013). In short, after correction for motion, gradient, and susceptibility distortions the fMRI data was aligned to an anatomical image. The aligned functional image was then corrected for intensity bias, demeaned, and projected to a common surface space, which resulted in a cifti-file.
All fMRI data were filtered between 0.1 and 0.01 Hz to retain the relevant frequency range for further analyses of the BOLD signal. We obtain structural and functional matrices in different spatial scales using the Schaefer parcellation, which optimizes local gradient and global similarity measures of the fMRI signal in various spatial scales ranging from 100 to 900 regions (Schaefer et al., 2018). These parcellations can be found on https://github.com/ThomasYeoLab/CBIG/tree/master/ stable_projects/brain_parcellation/Schaefer2018_LocalGlobal. In both fMRI datasets time series were extracted with the help Workbench Command provided by the HCP.
To create a structural connectome as a basis for the wholebrain model, we generated an average structural connectome depicting the number of fibers in the required spatial scales. We used an independent diffusion MRI dataset from the HCP database with a subset of 32 participants acquired at the Massachusetts General Hospital ("MGH HCP Adult Diffusion, " 16 females, 16 males). This dataset had been acquired using high-quality scanning protocols (e.g., using an optimized scanner to achieve a high b-value of 10,000 s/mm 2 and a high-angular resolution, combined with a multi-slice approach), resulting in above-average normative diffusion MRI data. The data had already been preprocessed and made available to the public within the Lead-DBS software package (Setsompop et al., 2013;Horn et al., 2017). In brief, the data were processed using a generalized q-sampling imaging algorithm as implemented in DSI studio. 1 The data were segmented and co-registered using SPM 12. Restricted by a coregistered white-matter mask, 200,000 fibers were sampled within each participant using a Gibbs' tracking approach (Kreher et al., 2008) and normalized into MNI space via DARTEL transforms (Ashburner, 2007;Horn and Blankenburg, 2016). We used the standardized methods from Lead-DBS toolbox version 2.0 (Horn et al., 2018) to obtain structural connectomes for the same parcellation schemes as for the functional data, selecting tracts that both started and ended within the specified parcellation scheme.

Whole-Brain Modeling Using the Dynamic Mean Field Model
The use of fMRI signals would normally limit our study in the temporal dimension. To overcome this shortcoming, we use a whole-brain dynamic mean field (DMF) model, which allows us to simulate data in varying timescales from milliseconds to seconds, while maintaining a comparable structure of the signal. The model expresses the activity of interconnected brain regions with excitatory and inhibitory influences as a reduced set of coupled stochastic differential equations (Deco et al., 2013), following the original derivation by Wong and Wang (2006). Thus, this model allows us to describe the activity of brain regions on a macroscale, while considering the dynamics of local neuronal pools (i.e., inhibitory and excitatory neural populations). Importantly, considering these local dynamics, the model gives a description of local dynamics at millisecond scales. This fine-grained temporal scale cannot be achieved with other whole-brain dynamical models, which usually simulate oscillatory activity in the same temporal scale as the empirical signal (e.g., bifurcation-based models as in Deco et al., 2017b).
A summary of the individual steps to create the model can be found in Figure 1. The model consists of a network of brain regions that emit spontaneous neuronal signals. The spatial scale defines the number of the brain regions. Each of these regions consists of excitatory (E) and inhibitory (I) neuronal pools, which reciprocally influence each other locally within each region n. We further assume that these regions interact via long-range connections, as given by the connection weights of the structural connectome (Deco et al., 2014).
We implement equations based on these assumptions in a modified DMF model, which is based on the original reduction first proposed by Wong and Wang (2006). In the model used in this study, NMDA receptors mediate excitatory currents I (E) and GABA-A receptors mediate inhibitory currents I (I) . Inhibitory sub-populations communicate reciprocally with excitatory subpopulations on a local level. Excitatory sub-populations are additionally linked to other excitatory sub-populations via longrange connections, representing the effect of NMDA receptors. These long-range connections C are based on the number of fiber tracts given by the structural connectome (see description above). The connections are then tuned by the global scaling factor G that linearly scales all synaptic strengths.
The following set of coupled differential equations are used to create the DMF model: For each inhibitory (I) and excitatory (E) neuronal pool in every brain region n, the vector I Here, the gain factors g E =310 nC −1 and g I =310 nC −1 are used to determine the slope of H. The shape of the curvature of H around I thr is defined by the constants d E = 0.16 and d i =0.087. When the threshold currents of I (E) thr =0.403 nA and I (I) thr =0.288 nA are reached, the firing rates increase linearly with the input currents. We add a standard Gaussian noise υ n with an amplitude of σ = 0.01 nA (Deco et al., 2014).
The average synaptic gating of the excitatory pools S Synaptic currents I n result from inputs from the local network, i.e., S (E,I) n , and inputs from other network nodes, i.e., S (E) p . Local inputs are governed by weights, i.e., w E , w NMDA , w n , w I and w. All excitatory synaptic couplings are weighted by w NMDA = 0.15 and the weight of the recurrent excitation w 1.4. Additionally, there is a constant input to each neural pool, denoted by I 0 = 0.382 nA with the weights w E = 1 and w I = 0.7. The local parameters are chosen so that the average firing rate is close to 3 Hz (Deco et al., 2014). Only the weight of feedback inhibition w n is adjusted before each simulation using a Frontiers in Neuroscience | www.frontiersin.org FIGURE 1 | Whole-brain modeling steps to create simulated functional time series fitted to empirical BOLD data. Using a whole-brain network model such as the dynamic mean field model allows us to accurately create time series data at different temporal scales. Local dynamics of each region given by a parcellation are generated by a dynamic mean field model and coupled through the structural connectome (as provided by the numbers of fiber tracts estimated from diffusion-weighted imaging). To fit the resulting neuronal time series to the empirical BOLD time series, we employ a Balloon-Windkessel hemodynamic model to create simulated BOLD time series. The simulated time series are fitted to the empirical time series using metrics of metastability and phase similarity matrix distributions.
regulatory mechanism called Feedback Inhibition Control, which mimics a resting state condition (Deco et al., 2014). This ensures that the network stays in its asynchronous state with firing rates between 3 and 10 Hz for all regions.
Inputs from other regions S (E) p are given by the excitatory populations and are weighted by the connection weights of the structural connectome C np . These connection weights are scaled by the global coupling parameter G, which is adjusted using model fitting (see below).
It is then possible to retrieve separate temporal scales from the simulated neuronal data by binning the time series. However, first the neuronal time series had to be fitted to the empirical BOLD time series (by adjusting G) to ensure a biologically plausible signal. Therefore, we transformed the neuronal signal from the model into a simulated BOLD signal and then compared the simulated and empirical signals (see below). We employed the Balloon-Windkessel hemodynamic model using all biophysical parameters as stated in Stephan et al. (2007). The model is described by the following equations: This model describes a vasodilatory signal s n which is altered by autoregulatory feedback. Depending on s n , the blood flow f n leads to changes of the deoxyhemoglobin content q n and blood volume v n . τ is the time constant, ρ is the resting oxygen fraction and a represents the venous resistance. For each region n the BOLD signal B n is a static non-linear function of q n and v n : To focus on the functionally relevant frequency range, we band-pass filtered the simulated BOLD signals using the same filter as for the empirical data with a bandpass between 0.1 and 0.01 Hz Glerean et al., 2012). A summary of the model parameters can be found in Supplementary Table 1.

Agreement Between Empirical and Simulated Data
To achieve biologically plausible signal statistics in the simulated time series at each scale, we performed the fitting to the empirical signals by adjusting G to have a maximal agreement in three different metrics: the metastability, phase consistency matrices and functional connectivity dynamics (see below). Each of these metrics represent different dynamical properties of the BOLD signal. Previous research has shown that adding these dynamical metrics such as metastability and phase consistency matrices to the fitting procedures improves constraining dynamical working points of dynamical whole-brain models than using only static metrics such as FC (Deco et al., 2017b(Deco et al., , 2019Saenger et al., 2017). These metrics were computed for each value of G (between 0 and 2.5 in steps of 0.025) in the simulated data and for the empirical data and compared as described below. Due to multiple spatial scales, the creation of the model was very compute-intensive, e.g., to replicate the time series of 10 subjects from the HCP dataset at a neuronal timescale using a parcellation of 400 regions with different G-values from 0 to 2.5 about 80-100 GB of RAM and 30 days of computation were required (mostly owing to the feedback inhibition control mechanism). Therefore, we restricted the simulations to 10 iterations, representing the time series of a group of 10 subjects. To ensure that our analyses were generalizable to different healthy subjects, we did 100 iterations of the model fitting to empirical time series with a random selection of 10 subjects at each iteration.

Dynamical Measures Used for the Fitting Metastability
The metastability represents the overall variability of oscillations (Wildie and Shanahan, 2012;Deco et al., 2017b). It is calculated as the standard deviation of the Kuramoto order parameter R(t) across time, which depicts the average phase ϕ k (t) in each region k across n regions.
The phases were derived from the data by detrending the filtered fMRI time series and then applying the Hilbert transform. When R = 1, all phases are fully synchronized, while R = 0 indicates a complete desynchronization of all phases. We calculated the differences between the empirical and simulated metastability. This has been previously proven to be suitable to define the dynamical working point of dynamical whole-brain models (Deco et al., 2017b;Saenger et al., 2017).

Phase Consistency Matrices
We calculated the phase coherence matrix by evaluating the instantaneous phase at each time point t of every region j and then computing the phase difference across all regions. We measured the similarity of these phase coherence matrices over t to create a phase consistency matrix. This matrix is a representation of spatiotemporal fluctuations of phases. To compare empirical and simulated data, we calculated the Kolmogorov-Smirnov distance between the empirical and simulated distribution of the phase consistency matrices. The Kolmogorov-Smirnov distance quantifies the maximal difference between two distribution functions of two samples and is minimized by the optimal value of G (Saenger et al., 2017).

Functional Connectivity Dynamics
We split the BOLD signal into sliding windows of 80 s, overlapping by 40 s. For each window, at time t, we calculated the FC, FC(t). The Functional connectivity dynamics (FCD) is defined by the Pearson correlation of the upper triangular parts of the two FC matrices FC(t1) and FC(t2). For comparison between empirical and simulated data, we calculated the Kolmogorov-Smirnov distance (see above).
Furthermore, we checked whether we retrieved comparable numbers of functional networks in the empirical and simulated data (see Supplementary Figure 2).

Extraction of Whole-Brain Functional Networks Using Independent Component Analysis and Calculation of Entropy
The summary of the analytical steps can be seen in Figure 2. The simulated and empirical time series were available in different spatial scales. In the case of the simulated signal, we retrieved the simulated neuronal time series at separate temporal scales in the range of milliseconds to seconds (see Figure 2A, middle panel). To do so, the simulated neuronal time series were binned by averaging the signals in windows of the width of the timescale, each time bin corresponding to a time point of the newly created time series. As this approach led to multiple fine-grained time series with a high computational cost of the analysis, we were only able to simulate the time series across all temporal scales up to a spatial scale of 400 regions. We created simulated time series at the group level by performing 10 iterations (representing 10 subjects).
In the case of the empirical time series, we extracted a group of 10 subjects from the data by randomly selecting 10 subjects. We concatenated their time series to retrieve functional networks on a group level (using the same group size as in the simulation to ensure comparability). To make the analysis robust to interindividual variability, we repeated this process 100 times. The temporal scale of the empirical data was determined by the TR (HCP: 720 ms). As only one temporal scale was provided in the empirical data, we were able to extract functional networks in a spatial scale from 100 to 900 regions.
In each temporal scale (given by the TR in the empirical data or the bin size in the simulated data), the time series were binarized using the point-process binarization algorithm for BOLD signals (Tagliazucchi et al., 2012). Here, the time series were normalized using a z-score transformation. Depending on the threshold, which was defined by one standard deviation, the time series were set to 0 or 1, resulting in an event matrix (see the right panel of Figure 2A), following the point-process procedure described in previous studies (Tagliazucchi et al., 2012;Deco and Kringelbach, 2017;Deco et al., 2019). Next, the event matrix was normalized using z-score transformation, so that the event matrix in each brain region would have null mean and unitary variance. This procedure has been shown to be robust for threshold choices and is a classical method to reduce dimensionality of dynamical data (Tagliazucchi et al., 2012). We then continued the analysis with the normalized event matrix E (with the dimensions: number of regions i x number of time points b).
To estimate the number of functional networks, we applied an adaptation of an eigenvalue analysis to assess the statistical significance of resulting networks (Peyrache et al., 2010;Deco et al., 2019), as introduced by Lopes-dos- Santos et al. (2013). This method finds the number of principal components within the event matrix with significantly larger eigenvalues than a normal random matrix that follows a probability function, as specified in Marèenko and Pastur (1967). As can be seen in Figure 2B, after determining the number of functional networks, we extracted these functional networks FIGURE 2 | Extraction and tracking of whole-brain functional networks at different spatial and temporal scales using the whole-brain model. (A) We simulate neuronal time series at different spatial scales (from 100 to 400 regions). We then create different bin sizes of the time series (using bins from 10 to 3,000 ms), the bin size corresponds to the temporal scale. The binned time series are binarized using a point process paradigm, resulting in an event matrix. (B) We extract whole-brain functional networks using independent component analysis, resulting in a network matrix (see ribbon plot). These networks are tracked over time by projecting the event matrix onto the networks, resulting in an activity matrix (not displayed). (C) The richness of the switching between functional networks is estimated by calculating the entropy of their switching probability. The entropy is compared across spatial and temporal scales.
by applying an independent component analysis to the event matrix E. This procedure resulted in the network matrix W (with dimensions: number of brain regions i x functional networks c).
Lastly, we tracked the activity of the functional networks over time (see Figure 2B). By projecting the binarized event matrix onto the network matrix, the similarity between each functional network c and the whole-brain activity at each time point b could be assessed. This projection resulted in an activity matrix A (with the dimensions: functional networks c x time points b): with the event matrix E and the projection matrix P. The projection matrix P is defined as: where is the outer product operator, w c is one of the extracted functional networks from the event matrix (the column of the matrix w c ) and − → E b is the b column of the event matrix E (events at time point b).
After retrieving the activity of each functional network over time, we calculated its probability of occurrence. We calculated the ratio of activity of each functional network in relation to overall activity (activity of all networks over time), resulting in the probability of each network c over time: where b corresponds to each time point. Using these probabilities, we computed the entropy of occurrence of each network c. The entropy represents the richness of transition activity between functional networks and it captures the diversity of states depending on their probabilities, adapted from the concept ofentropy by Shannon (1948): As the number of functional networks increased with higher spatial scales, we normalized the entropy. The normalization was done by dividing the entropy by the logarithm of the resulting number of networks for each spatial scale. By doing so, it was possible to compare across spatial scales. We then compared the entropy of network transitions across spatial and temporal scales (see Figure 2C). We made a pairwise comparison of entropy of spatial scales using Wilcoxon tests in the empirical data and the simulated data (at the optimal temporal scale and at the temporal scale = TR).

RESULTS
In our study we described the optimal spatiotemporal scale that captured the highest information content about the temporal evolution of functional networks (as evidenced by the transition activity). We extracted time series at different parcellations at different spatial scales (from 100 to 900 regions) in the empirical data. Furthermore, we created a dynamic mean-field model to create time series at various temporal scales from milliseconds to seconds (Figure 1) and a spatial scale between 100 and 400 regions. We extracted functional networks from both simulated and empirical time series using independent component analysis. We then explored the probability of occurrence of these functional networks over time. We calculated the entropy of these probabilities for each network, representing the diversity of transition activity between functional networks (Figure 2). By focusing our analysis on functional networks (as opposed to region-wise time series), we ensured that the information we gained on the temporal dynamics (as measured by transition activity) was relevant for whole-brain information processing.

Agreement Between Empirical and Simulated Data
The DMF model is a neuronal model that recreates inhibitory and excitatory synaptic dynamics (including AMPA, GABA, and NMDA receptors) following the structure given by the underlying anatomical connectivity. Using the steps detailed in Figure 1 and following the constraints of anatomical connectivity as provided by the structural connectome, we created realistic neuronal time series at the scale of milliseconds to seconds using the DMF model. To ensure the robustness of the model, we fitted the resulting simulated BOLD time series to the empirical BOLD time series based on the global coupling G, which describes the constraint of the dynamics to the anatomy (Deco et al., 2013). Here, we defined a good fitting where the differences in metastability and the Kolmogorov-Smirnov statistics of the phase consistency matrices and FCD reached a minimum (see Supplementary Figure 1). As shown in Supplementary Figure 1, the fitting resulted in an optimum at a global coupling value G between 1.55 and 1.85 (depending on the spatial scale used).
In both simulated and empirical data, some of the resulting networks resembled known classical resting state networks (see Figure 3). As our study focused on the dynamical alteration of functional networks, we ensured that the properties of the resulting functional networks from the simulation were comparable to the properties of the networks derived from the empirical time series. Therefore, we compared the number of functional networks derived from the simulated BOLD time series (see Supplementary Figure 2). Here, the number of functional networks increased when more regions (i.e., a higher spatial scale) were included. This finding is in agreement with the empirical data and with what was reported in the literature using similar approaches and numbers of regions (Yourganov et al., 2011;Amico and Goñi, 2018;Kumar et al., 2019). FIGURE 3 | Examples of group whole-brain functional networks rendered on the standard brain. The left column has been retrieved from the simulated time series (using a TR = 720 ms), the right column from the empirical time series. Some of these networks have a high overlap with classical resting state networks (Yeo et al., 2011) such as the Default Mode Network, Central Visual Network and Temporal Parietal Network.

Entropy of Whole-Brain Functional Network Transitions
The transitions of whole-brain functional networks over time and their probability of occurrence allowed us to estimate entropy H as a representation of the information content of the functional network activity at various spatiotemporal scales from a probabilistic perspective. We display the entropy of spatiotemporal networks as a function of the spatial and temporal scale using empirical ( Figure 4A) and simulated time series (Figure 4B).
We found an inverted U-shape form of the entropy H as a function of probability of spatiotemporal networks across time. Regarding the spatial scale, the H reached the highest value at a scale of 300 regions (mean simulated H = 0.957, mean empirical H = 0.951), but with only a small decrease at scales with 100 (mean simulated H = 0.949, mean empirical H = 0.946) or 400 regions (mean simulated H = 0.938, mean empirical H = 0.946). At spatial scales above 400 regions (analysis only present in FIGURE 4 | Entropy of the temporal probability of whole-brain functional networks in different spatial and temporal scales of the empirical (A) and simulated data (B-D). The entropy is calculated across spatial scales in the empirical data with a fixed temporal scale of 720 ms (corresponding to the TR). The simulated data gives the opportunity to explore different spatial scales at the temporal scale of the TR, 720 ms, (B) as well as at the optimal temporal scale of 150 ms (C). Beyond that it can be also used to explore various temporal scales and spatial scales simultaneously (C). Both the empirical and simulated data show that the highest entropy can be found at a spatial scale of 300 regions with only a minor decrease in entropy at a spatial scale of 200 regions (marked by a red box in A-C). The highest entropy can be found at a temporal scale of 150 ms across all spatial scales (vertical red arrow in D). Each datapoint depicts a random group of 10 subjects in the empirical data or a simulation trial simulating a group of 10 subjects. Statistical significance of comparisons between spatial scales is indicated with "ns" meaning a p > 0.05, * meaning < 0.05, *** meaning 0.001 (FDR-corrected). empirical data, see Figure 4A), we observed a further drop in entropy (down to mean empirical H = 0.916 at 900 regions).
Beside the comparison across spatial scales, the simulated time series allowed us to compare the temporal scales ( Figure 4D). Regarding the temporal scale, we found the highest entropy at an average scale of 150 ms (ranging from 140 to 160 ms, depending on the spatial scale used). Using finer or coarser temporal scales led a much greater drop in entropy (lowest value: mean simulated H = 0.5957) than a change of spatial scales.
Considering both spatial and temporal scales, the highest level of entropy could be found at a temporal scale of 150 ms and a spatial scale of 300 regions (see Figure 4D). The optimal temporal scale of 150 ms persisted at all simulated spatial scales. Also, the effect of temporal scale on entropy was greater than the effect of spatial scale.
Of note, H was always higher when using the empirical dataset in comparison to the simulated time series even when using the optimal temporal scale in the model (see Figure 4A vs. Figures 4B,C), reflecting the variability given by the empirical time series (and signals not accounted for in the DMF model).

DISCUSSION
In this study we investigated the most relevant spatiotemporal scale of fundamental macroscopic dynamical processes, i.e., scale of transitions between whole-brain functional networks. We followed the temporal behavior of functional whole-brain networks at different spatial scales and at fine-grained temporal scales from milliseconds to seconds (using a realistic whole-brain DMF model). Using a data-based and model-based approach, we generated evidence that the entropy of network transitions follows an inverted U-shaped curve with a maximum at a spatial scale of about 300 regions and a temporal scale of about 150 ms. Of note, the optimal temporal scale of about 150 ms persisted at all simulated spatial scales from 100 to 400 regions, indicating an absent interaction effect between spatial and temporal scales. Furthermore, the effect of the temporal scale on entropy was much greater than the effect of spatial scale, which underlines the importance of an appropriate temporal scale for analyses of brain dynamics. Given the close agreement between simulated and empirical data, the DMF model offers an excellent opportunity to bridge analyses of brain dynamics across different neuroimaging modalities and different spatiotemporal scales.
Previous studies have performed comparisons between spatial scales with various metrics, such as the reproducibility of resulting networks, agreement with anatomical connectivity, and prediction accuracy of neuropsychiatric conditions (Craddock et al., 2011;Arslan et al., 2018;Dadi et al., 2019;Messé, 2019). However, all these studies focused on the average functional connectivity, without considering network dynamics. Only Proix et al. (2016) investigated the effect of spatial scale on the information content of brain dynamics by decomposing the time series using a principal component analysis in a whole-brain network model and found the highest eigenvalue at around 140 regions. Higher spatial scales led to an oversampling with a relative reduction of connectome density, leading to more segregated regions and an overall reduction of transmission information content across regions. Although these results are promising, they focused on separate brain regions only. In contrast, we were explicitly interested in whole-brain networks, as functional networks seem to be closer related to cognitive processes than single brain regions (Dadi et al., 2019;Fan et al., 2021).
Our study is the first to examine spatial and temporal scales by simultaneously focusing on brain dynamics of whole-brain networks. Given the significant evidence that maximal entropy of brain dynamics is associated with maximal transmission of information (Lungarella and Sporns, 2006;Rämö et al., 2007;Shew et al., 2011;Wang et al., 2018), cognitive performance (Niu et al., 2018;Liu et al., 2020), and consciousness (Mashour and Hudetz, 2018), we chose to describe the richness of wholebrain network activity using the entropy of whole-brain network transitions. Selecting the most informative spatiotemporal scale during analyses of brain dynamics can help to focus the analysis on relevant information about the dynamical behavior of brain networks, while reducing the amount of noise (Fornito et al., 2010), avoiding oversampling (Proix et al., 2016), and optimizing the computational cost of the analysis, i.e., removing subnetworks that are barely active and contribute little to the overall network activity.
Our findings have several implications for future research of brain dynamics. Regarding the spatial scale, our study provides an empirical basis for choosing the number of brain regions for neuroimaging analyses that focus on brain dynamics of wholebrain functional networks. We provided evidence that a spatial scale of about 300 regions is sufficient to capture the most relevant information on brain dynamics of functional networks. While lower scales may be associated with a loss of information, higher spatial scales might introduce irrelevant and possibly more noisy functional networks.
Regarding the temporal scale, we were able to reproduce the findings of an optimal temporal scale of about 150 ms from our previous study (Deco et al., 2019), which was similar to previous simulations focusing on temporal scales of brain dynamics (Honey et al., 2007). Furthermore, our simulation results are in line with other neuroimaging findings, which showed a mean duration of EEG and MEG microstates between 100 and 200 ms Baker et al., 2014). Besides a close alignment with neuroimaging studies, our findings reflect experimental results of temporal dynamics of conscious processes that operate at similar temporal scales and typically involve a rapid temporal sequence of information stabilization and transfer (Koenig et al., 2002;Van De Ville et al., 2010;Wutz et al., 2014;Salti et al., 2015;Mai et al., 2019). On top of that, our study shows that the optimal temporal scale does not depend on the spatial scale, i.e., an optimal scale of about 150 ms persists across all spatial scales. For researchers focusing on temporal properties of brain dynamics, we therefore advise to either use neuroimaging modalities operating at this optimal temporal scale (e.g., MEG or EEG, Michel and Koenig, 2018) or augment their fMRI analyses with whole-brain modeling, which allows including more fine-grained temporal scales. Therefore, based on empirical data rather than arbitrary choices, our recommendations contribute to efforts directed at harmonizing analyses of brain dynamics across spatiotemporal scales. In addition, our results underline the utility of dynamical wholebrain models to overcome experimental limitations.

Limitations and Outlook
There are several limitations in our methodological approach. First, we used independent component analysis to derive whole-brain functional networks at different scales. As with any other higher-order statistical method, independent component analysis is not free of underlying assumptions and specifically assumes maximal spatial independence of the networks (Jutten and Herault, 1991). Future studies could consider additional analyses using other metrics such as network measures. However, as Arslan et al. (2018) and Hilger et al. (2020) demonstrated in their studies (Arslan et al., 2018;Hilger et al., 2020), many network measures are largely altered by the spatial scale and appropriate correction techniques should be used for such analyses across scales.
Second, our analysis was focused on spatial scales of the brain dynamics at the macroscale. For investigations of microscale brain dynamics, other spatial and temporal scales might be relevant. Therefore, future studies could consider exploring brain dynamics of cellular-level networks using microscale imaging tools such as optical imaging. Ideally analyses of brain dynamics should bridge macro-and microscales; with corresponding methods currently under investigation (Weiskopf et al., 2015;Larivière et al., 2019;Gao et al., 2020).
Third, both the estimation of whole-brain functional networks as well as the calculation of the entropy of the network transition activity was done on a group level and during rest. Future studies could compare the entropy of network transitions on an individual level and under consideration of behavior and cognition, relating individual cognition to dynamical behavior of brain networks. For such approaches, vector-based instead of a region-based parcellations might account for better individual variations (Liu et al., 2021).
Overall, our results suggest that whole-brain functional brain networks operate at an optimum of about 300 regions and a timescale of about 150 ms. We contribute to the understanding of the dynamical behavior of whole-brain networks, which could inspire future human neuroimaging studies to harmonize spatiotemporal scales and neuroimaging modalities and use dynamical models to create connections between micro-and macroscopic scales.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: db.humanconnectome.org.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethical Committee of the Human Connectome Project. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
XK, MK, and GD contributed to the conception, design, analysis, and data interpretation and drafted and critically revised the manuscript. AL-G contributed to the analysis, manuscript drafting, and revision of the manuscript. All authors contributed to the article and approved the submitted version.