Determining Excitatory and Inhibitory Neuronal Activity from Multimodal fMRI Data Using a Generative Hemodynamic Model

Hemodynamic responses, in general, and the blood oxygenation level-dependent (BOLD) fMRI signal, in particular, provide an indirect measure of neuronal activity. There is strong evidence that the BOLD response correlates well with post-synaptic changes, induced by changes in the excitatory and inhibitory (E-I) balance between active neuronal populations. Typical BOLD responses exhibit transients, such as the early-overshoot and post-stimulus undershoot, that can be linked to transients in neuronal activity, but they can also result from vascular uncoupling between cerebral blood flow (CBF) and venous cerebral blood volume (venous CBV). Recently, we have proposed a novel generative hemodynamic model of the BOLD signal within the dynamic causal modeling framework, inspired by physiological observations, called P-DCM (Havlicek et al., 2015). We demonstrated the generative model's ability to more accurately model commonly observed neuronal and vascular transients in single regions but also effective connectivity between multiple brain areas (Havlicek et al., 2017b). In this paper, we additionally demonstrate the versatility of the generative model to jointly explain dynamic relationships between neuronal and hemodynamic physiological variables underlying the BOLD signal using multi-modal data. For this purpose, we utilized three distinct data-sets of experimentally induced responses in the primary visual areas measured in human, cat, and monkey brain, respectively: (1) CBF and BOLD responses; (2) CBF, total CBV, and BOLD responses (Jin and Kim, 2008); and (3) positive and negative neuronal and BOLD responses (Shmuel et al., 2006). By fitting the generative model to the three multi-modal experimental data-sets, we showed that the presence or absence of dynamic features in the BOLD signal is not an unambiguous indication of presence or absence of those features on the neuronal level. Nevertheless, the generative model that takes into account the dynamics of the physiological mechanisms underlying the BOLD response allowed dissociating neuronal from vascular transients and deducing excitatory and inhibitory neuronal activity time-courses from BOLD data alone and from multi-modal data.


INTRODUCTION
Functional magnetic resonance imaging (fMRI) is a widely used non-invasive technique to assess brain function. The fMRI signal reflects neuronal activity only indirectly through the measurements of accompanying hemodynamic processes at temporal resolution typically on the order of seconds and spatial resolution typically on the order of tens of cubic millimeters. In general, neuronal activation causes a series of physiological events, including localized changes in cerebral blood flow (CBF), cerebral metabolic rate of oxygen (CMRO 2 ), cerebral blood volume (CBV), and deoxyhemoglobin content. These physiological variables form the basis of the blood oxygenation level-dependent (BOLD) signal (Ogawa et al., 1990), the most commonly used fMRI approach. However, CBF and CBV can also be directly measured with various fMRI techniques (e.g., see Wong et al., 1997;Lu et al., 2003;Liu and Brown, 2007;Huber et al., 2014b).
In many studies, it has been found that there is high correspondence between response properties measured in fMRI and other hemodynamic techniques and those measured from invasive electrical recordings, mostly acquired in non-human primates, cats and rodents (Logothetis et al., 2001(Logothetis et al., , 2010Kim et al., 2004;Niessing et al., 2005;Devor et al., 2007Devor et al., , 2013Logothetis, 2008;Muckli, 2010;Boynton, 2011;Hillman, 2014). In particular, CBF and BOLD responses show better correlation with post-synaptic local field potentials (LFPs) than with spiking activity (multi-unit activity, MUA), suggesting that the hemodynamic response reflects stronger the input to a neuronal population in a brain area and intrinsic processing (Lauritzen, 2005) rather than the output of that area Logothetis et al., 2010;Magri et al., 2012). Positive CBF and BOLD responses during stimulation are associated with an increase in neuronal activity and decrease in deoxyhemoglobin content, whereas negative CBF and BOLD responses are associated with a decrease in neuronal activity below baseline and increase in deoxyhemoglobin content (e.g., Shmuel et al., 2006 and references therein).
Typical positive BOLD responses to sustained stimulation display transients, such as response adaptation (also referred to as early-overshoot) and post-stimulus undershoot Krüger et al., 1996;Hoge et al., 1999). Similarly, neuronal responses to stimulation exhibit rapid rise followed by a decay (or adaptation) to a steady-state level and are often followed by a brief decrease below baseline after stimulus cessation (e.g., Logothetis et al., 2001). These neuronal transients (Boynton et al., 1996;Hoge et al., 1999;Bandettini and Ungerleider, 2001;Birn et al., 2001) are the result of changes in excitatory and inhibitory (E-I) balance between active neuronal populations, controlled by local micro-circuitry but also by long-range connections with other brain areas (Logothetis, 2002;Logothetis and Wandell, 2004;Shmuel et al., 2006;Hyder et al., 2010;Havlicek et al., 2017b). CBF reflects these neuronal transients in a temporally smoothed fashion (Hoge et al., 1999;Attwell and Iadecola, 2002;Uludag et al., 2004;Sadaghiani et al., 2009;Attwell et al., 2010;Cauli and Hamel, 2010;Mayhew et al., 2014). Because of the complex underlying physiological processes, the BOLD response can exhibit transients not only from neuronal sources, but also due to the properties of blood vessels: the BOLD response is dominated by signal contributions originating from the venous compartments, and venous CBV can be dynamically uncoupled from CBF (i.e., venous CBV lags behind CBF), influencing the amplitude of the early-overshoot and post-stimulus undershoot of the BOLD-response (Buxton et al., 1998b;Mandeville et al., 1999;Yacoub et al., 2006;Chen and Pike, 2009). Alternatively, dynamic uncoupling between CBF and CMRO 2 (i.e., CMRO 2 lags behind CBF) could result in the same BOLD transients as well (Lu et al., 2004;Frahm et al., 2008;Donahue et al., 2009;Hua et al., 2011;Poser et al., 2011;van Zijl et al., 2012). However, we have recently argued, supported by modeling of experimental data, that the contribution of CBF-CMRO 2 uncoupling to BOLD signal transients is much lower than that of CBF-venous CBV uncoupling (Havlicek et al., 2017a). Thus, neuronal and hemodynamic responses in different areas (or voxels) and subjects exhibit dynamic features, which can be both due to changes in E-I balance or due to biomechanical properties of the vasculature.
In standard analysis of fMRI data, linear convolution is applied between a stick or box-car functions (representing the stimulation paradigm) and assumed canonical hemodynamic response function (representing the combined transform from stimulus time-course to neuronal signal and the measured BOLD response) (Friston et al., 1995). However, as indicated above, excitatory and inhibitory neuronal responses may be nonlinearly related to stimulation (e.g., see Boynton et al., 1996;Bandettini and Ungerleider, 2001;Birn et al., 2001;Grill-Spector and Malach, 2001;Kida and Yamamoto, 2008;Mullinger et al., 2013Mullinger et al., , 2014Pérez-González and Malmierca, 2014;Havlicek et al., 2017a;Keller et al., 2017). Furthermore, vascular transients resulting from dynamic uncoupling induce additional nonlinearities between the input function and subsequent hemodynamic variables. Therefore, the linearity assumption in BOLD data analysis may not be sufficient in many experiments and, consequently, the inferred information about the neuronal activity changes obtained from hemodynamic signals using linear (de)convolution analysis might be confounded with vascular effects. In other words, to more accurately estimate neuronal responses from the BOLD signal, a nonlinear model that accounts for dynamic relationships between neuronal and hemodynamic physiological variables underlying the BOLD response is needed.
Recently, we have introduced a physiologically-informed generative model of the BOLD signal within the framework of dynamic causal modeling (Friston et al., 2003) (called P-DCM) 1 , linking excitatory and inhibitory neuronal activity to the BOLD response (Havlicek et al., 2015). In P-DCM, we employ: (i) an adaptive excitation-inhibition neuronal model that accounts for a wide range of neuronal time-course both during stimulation and post-stimulation periods; (ii) a neurovascular coupling (NVC) model that links neuronal activity to blood flow in a strictly 1 As we have developed our novel generative model within the framework of the dynamic causal model (DCM) and applied it both to single ROI and to a network of ROIs, in this paper to avoid confusion, we use DCM and the generative model embedded within P-DCM interchangeably. feedforward fashion; (iii) a balloon model (Buxton et al., 1998b) that can account for a vascular uncoupling between CBF and venous CBV; and (iv) field strength and sequence dependent parameterization of the BOLD signal equation. We compared P-DCM with other DCM models (Friston et al., 2003;Marreiros et al., 2008) and demonstrated significant improvements in the ability to model commonly observed neuronal and vascular response transients in single regions (Havlicek et al., 2015) and also within a network of several regions with task-driven activity changes (Havlicek et al., 2017b). In the latter case, we also showed a high fidelity of P-DCM to jointly explain CBF and BOLD responses simultaneously measured with the arterial spin labeling (ASL) fMRI technique, demonstrating the benefit of additional information provided by a CBF measurement for model inversion.
In general, multi-modal imaging is a powerful approach to study the relationship between neural activity and the BOLD fMRI signal. Measurements of different physiological variables can increase the ability to disambiguate neuronal and vascular effects present in the BOLD signal and potentially unravel limitations of the hemodynamic models. In the current work, we aim to explore the versatility of P-DCM to explain dynamic relationships between various combinations of measured physiological variables and to deduce the excitatory and inhibitory neuronal dynamics from hemodynamic data. This is done under the constraints of assumed physiological mechanisms and experimental manipulations. In particular, we use: (1) newly acquired CBF and BOLD responses to static and flickering stimuli in human subjects; (2) CBF, total CBV and BOLD response to square-wave grating stimulus acquired in the cat brain from the study of Jin and Kim (2008); and (3) positive and negative neuronal and BOLD responses induced by rotating visual stimuli measured in the monkey brain from the study of Shmuel et al. (2006). In our modeling, we emphasize stimulus-type-dependent modulation of response transients that can be linked to a dynamic interplay between excitatory and inhibitory activity. In addition, we allow for differences between stimulation and post-stimulation response periods and account for vascular-, magnetic field strength-, and MRI sequence-dependent properties. The current approach can also be generalized to other invasive and non-invasive multimodal data, such as EEG-fMRI, provided generative models exist for both modalities.

General Description of P-DCM
The generative model in P-DCM consists of four causally-linked components that define how the neuronal signal is transformed to the measured BOLD response (see Figure 1 for an illustration and the summary of model equations). For a more detailed description of the model and its comparison with previous DCM models, please see (Havlicek et al., 2015(Havlicek et al., , 2017b.

Neuronal Model of E-I Balance
In this model, an exogenous input u (t) (e.g., sensory stimulus) drives the change in excitatory activity, n e (t), which is directly coupled with a change in inhibitory activity, n i (t). The strength of this input, expressed with the parameter c, scales the amplitude of the neuronal response. The shape of the neuronal response FIGURE 1 | On the left, a schematic depicts the four main components of P-DCM generative model, representing the causal chain between the neuronal and the BOLD response. In the middle, shapes of the physiological responses generated at different stages of the generative models are shown. The shaded areas around the responses represent the amount of response transient variability that can be modeled by P-DCM. This illustration highlights the two main sources of BOLD response transients: caused either by (1) neuronal transients generated by the model of E-I balance; or by (2) vascular transients due to dynamic uncoupling between CBF and venous CBV; or mixture of these two. On the right, a summary of all equations underlying P-DCM is provided. is tuned by the transient imbalance between excitatory and inhibitory activity. In particular, the typical overshoot at the onset can be produced by gradually increasing inhibitory activity that modulates the excitatory activity via negative feedback. Next, persistence of the inhibitory activity following stimulus cessation can produce post-stimulus deactivation. This temporal evolution of excitatory and inhibitory activity, including their dynamic mismatch, is controlled by the parameters σ and λ, respectively, and the strength of the inhibitory activity modulating the excitatory activity is encoded by the parameter µ. Optimization of these neuronal parameters allows modeling a broad repertoire of neuronal response adaptation profiles and of possible poststimulus deactivations separately for the stimulation period (SP) and the post-stimulation period (PSP) 2 .

Neurovascular Coupling (NVC):
The output of the neuronal model, i.e., the excitatory activity modulated by inhibitory activity, is transformed to CBF, f (t), in a strictly feedforward fashion, via vasoactive signal a(t). Neuronal excitation/inhibition leads to arterial vasodilatation/vasoconstriction associated with increased/decreased CBF (Devor et al., 2007). Thus, the modeled neuronal response transients are conveyed to a CBF response, albeit in a smooth version. Decay and delay of the CBF response with respect to the neuronal response is regulated with three constants, ϕ, φ and χ, with only χ being optimized during model inversion.

Hemodynamic Model:
The hemodynamic model is represented by the balloon model (Buxton et al., 1998a,b). It models the mass balance of normalized changes in CBV, v(t), and deoxyhemoglobin content, q(t), as they pass through the venous compartment. Their changes are driven by changes in the inflowing CBF, f (t), and CMRO 2 , m(t), respectively. It is assumed that CBF and oxygen extraction faction, E(t), are dynamically coupled, thus m (t) = f (t) · E(t) E 0 , where E 0 is the value of oxygen extraction faction at rest (please see discussion of this assumption in (Havlicek et al., 2015), and the Discussion section for more details). Furthermore, during steady-state, the blood leaving the venous compartment (i.e., the outflow, f out (t)) and the venous CBV are coupled via a power law relationship (Grubb et al., 1974), with exponent α, whereas during the transient periods venous CBV and CBF can be uncoupled; i.e., venous CBV can evolve more slowly than CBF. This is due to the vessel's resistance to changes in venous CBV, described by the viscoelastic time constant, τ . Theoretically, the viscoelastic time constant can have different values during SP and PSP, τ SP and τ PSP . The time dimension of changes in v(t) and q(t) is scaled by the mean transit time of the blood through the venous compartment at rest, t 0 . Note that t 0 is linked to the resting venous blood volume fraction V 0 via the central volume principle, t 0 = V 0 F 0 , where F 0 is the blood flow at rest.
2 Please note that we use PSP instead of PSU, as the sign of the post-stimulus BOLD response depends on whether it is preceded by negative or positive changes in neuronal activity during stimulation (as in Shmuel et al., 2006; whose data are investigated in the current study).

BOLD Signal Equation:
The BOLD signal reflects changes in the deoxyhemoglobin content, q(t), together with changes in the deoxyhemoglobin concentration, q(t)/v(t), and venous CBV, v(t). Their relative contribution is weighted by parameters that are magnetic field strength-, TE-and MRI sequence-dependent Havlicek et al., 2015Havlicek et al., , 2017a.
In summary, as illustrated in Figure 1, P-DCM and its parameters allows the BOLD response to exhibit transients, such as response adaptation during stimulation and post-stimulus undershoot that can have both neuronal and vascular origins. As we show below, physiological origins of these transients can be tested under the constraints of concurrent multi-modal physiological data and experimental manipulations.

Data Description
To demonstrate the utility and versatility of P-DCM, below we describe three different data-sets acquired from human, cat, and monkey brains. Each data-set consists of a different combination of physiological measurements: I. CBF and BOLD response; II. CBF, total volume CBV, and BOLD responses, published in Jin and Kim (2008); and III. neuronal and BOLD responses, published in Shmuel et al. (2006). Any additional physiological data, next to the BOLD data (and/or experimental manipulation), provide physiologically-informed constraints on the underlying mechanisms of the BOLD response. This can result in a more accurate inference on the changes in E-I balance forming the neuronal responses (Havlicek et al., 2017b) and potentially inform about the limitations of model structure and parameter assumptions.
The subjects were instructed to fixate on a small dot at the center of the screen throughout the experiments. Each of the EPI functional runs began with a 55 s resting period and continued with alternation of two static and two flickering checkerboard conditions (each 55 s long), interspersed with 110 s resting periods. Flickering checkerboards were presented at 4 Hz (eight reversals per second). The order of static and flickering conditions within a run was pseudo-randomized. For the static condition, a full-field, black-and-white radial checkerboard was presented (Michelson contrast 1), whereas, for the flickering condition, reduced contrast (Michelson contrast 1/3) checkerboards were presented at 4 Hz (i.e., 8 reversal per second) (Sadaghiani et al., 2009). The resting periods consisted of a gray screen isoluminant with the mean luminance of the checkerboard. In order to maintain the subjects' attention, the color change of the fixation dot (altering between red and blue at three pseudo-random intervals within each stimulation block) was passively observed. For each subject, there were twelve trials per condition.
For each subject, the CBF time-series were derived from the ASL data acquired with TE = 8 ms using surround subtraction (Mumford et al., 2006). The BOLD time-series were obtained from the ASL data acquired with TE = 33 ms using surround averaging. The data were preprocessed using SPM12 (R6470) (http://www.fil.ion.ucl.ac.uk/spm). To correct for head motion, the realignment parameters with respect to the first volume were estimated using BOLD data and the same realignment parameters were applied to the corresponding volumes of CBF data. The mean BOLD image was coregistered to the anatomical image and the estimated spatial transformation matrix was applied to the functional BOLD and CBF data. CBF data were modeled voxel-wise using a general linear model (GLM). This model included main predictors representing the periods of static and flickering visual stimulation and additional predictors representing the stimulus onset and offset. All predictors (i.e., three per condition) were convolved with a gamma-variate hemodynamic response function ("spm_Gpdf.m") with shape and scale parameters 4 and 0.5, respectively. The additional predictors were introduced to explain deviations of the hemodynamic response shape between conditions during stimulation and poststimulation periods. The predictors were not orthogonalized in order to retain a direct interpretation of the model. Furthermore, data were high-pass filtered (cut-off = 1/256 s) to remove low frequency signal drifts and a first-order autoregressive model was used to remove serial correlations. Based on a conjunction analysis of the two main contrasts for static and flickering conditions, significant voxels within the gray matter of the left and right visual cortex (p < 0.05, corrected for family-wise errors) were selected from the CBF data. The same voxels were selected from the BOLD data. The statistically significant BOLD signal map (not reported) included the CBF ROI, but had a larger spatial spread. Voxel's time-courses from the CBF and BOLD data (∼40 voxels per subject) were extracted, high-passed filtered (cut-off = 256 s), and the average responses in percent signal change were calculated for the two experimental conditions.

CBF, Total CBV and BOLD Responses
Hemodynamic responses of CBF, total CBV, and BOLD signal were extracted from Figure 1 of the paper by Jin and Kim (2008) using Matlab (MathWorks, Inc.). These responses were measured using fMRI in the visual cortex of anesthetized cats at 9.4 T. In brief, they used 60 s visual stimulus of black and white square-wave gratings drifting with a temporal frequency of 2 cycles/s. This was always preceded by 20 s and followed by 60 s control condition represented by the same but stationary gratings. All data were acquired with GE-EPI readout. BOLD and total CBV signals were acquired simultaneously with TR = 2 s and ∼0.2 × 0.2 × 2 mm 3 voxel size and CBF signal was acquired during separate runs with TR = 3 s and ∼0.3 × 0.3 × 2 mm 3 voxel size. However, the reported responses from Figure 1 were upsampled (using linear interpolation) to TR = 1 s. BOLD responses were derived from two TEs of 10 ms and 20 ms (by calculating the slope ∆R * 2 , but displayed in average percent signal change). Furthermore, in Figure 1 by Jin and Kim, hemodynamic responses were reported for both the middle and top (superficial) part of the gray matter. We have taken CBF and total CBV responses from the middle gray matter and the BOLD response from the superficial area, where it exhibited the highest signal change. Note that CBF and total CBV are more localized to signal changes induced in the arterial blood compartment, while BOLD signal is mostly represented by the venous compartment, where the draining veins carry the signal from deeper gray matter structures toward the surface. Therefore, from the view of blood dynamics, the BOLD response measured at the surface mostly reflects the CBF and arterial CBV changes that occurred deeper in the gray matter. Finally, for all three hemodynamic responses, it is important to preserve the exact amounts of reported percent signal changes.

III. Neuronal and BOLD Responses
Neuronal and BOLD responses were extracted from Figures 2A, 1D of the paper by Shmuel et al. (2006), respectively. These responses were measured simultaneously using invasive electrophysiological recording and fMRI in the visual cortex of anesthetized monkey brain at 4.7 T. In brief, 20 s visual stimuli consisting of high-contrast radial checkers rotating 60 • per s were presented on gray background. The same background was used 5 s before and 25 s after the stimuli. Stimulation ring overlapping with the receptive field at V1 induced positive response in the vicinity of the electrode, while stimulus ring, which did not overlap with the receptive field, induced negative response in the same area. Neuronal responses were obtained by averaging the fractional change in power spectrum over the whole range of frequencies (4-3,000 Hz) with temporal resolution of 1 s. BOLD data were acquired with GE-EPI readout, TE = 20 ms, TR = 1 s and in-plane spatial resolution of 0.75 × 0.75 × 2 mm 3 . Positive and negative BOLD responses induced by two stimuli were sampled and averaged over the same voxels within the ROI around the electrode.

Model Specification
The multi-modal data from each study is used to identify the neuronal and vascular parameters of the generative model. Below, we specify the model assumptions for each study and form the observation equation to enable joint fitting to multiple physiological variables. In general, we aim to constrain the model estimation by at least two physiological measurements, experimental manipulations and properties of the venous blood compartment, the latter being independent of experimental manipulations, as it is given by the biomechanical properties of blood vessels. The assumptions about time-period-(i.e., stimulation or post-stimulation period) and experimental condition-specificity of certain model parameters were motivated by three criteria: (1) we favor the minimum number of parameters that can sufficiently explain the dynamic behavior of the multi-modal experimental data; (2) we have prior knowledge from previous results that some parameters have to be timeperiod-and/or experimental condition-specific, e.g., vascular parameters are condition independent for the same voxels; (3) the first and second criteria should be consistent for all three experiments described above, i.e., the same assumptions about neuronal and vascular parameters have to hold for all three experiments. Furthermore, if available, the model is also constrained by the measured percent signal changes of CBF and BOLD responses.

Experiment I: CBF and BOLD Responses:
To jointly model CBF and BOLD responses using P-DCM during both static and flickering conditions, we made the following specifications for the generative model in order to determine the transfer of condition-dependent neuronal changes to changes in the measured signals. Two independent inputs, u S and u F , in the form of box-car functions representing 55 s long static and flickering visual stimulation (whose strengths were controlled by two parameters c), were used to drive the neuronal activity. To accommodate the assumption that each type of visual stimulus can result in a different adaptation profile during the stimulation period (SP) but also exhibit differences in neuronal adaptation during the post-stimulus period (PSP), parameters of the neuronal model, σ and µ, were allowed to vary between the two phases but also between conditions. On the other hand, λ was allowed to vary only between the two conditions. This is because after the static stimulus the CBF response exhibits a slower return to the baseline without a poststimulus undershoot, which is effectively modeled by setting the parameter µ close to zero. Thus, during this PSP, parameter λ does not have an effect on the shape of the neuronal response and becomes unidentifiable. Further, the NVC parameter χ, was assumed to be the same for both conditions and SP and PSP. Within the hemodynamic model, the viscoelastic time constant, τ , controlling the expansion and deflation of the venous compartment was allowed to vary between SP and PSP but not during the two conditions. The mean transit time at rest, t 0 , was estimated together with the resting blood volume fraction, V 0 , by assuming a blood flow value at rest, F 0 = 0.01 s −1 (i.e., 60 ml/100 g/min, a typical value for human visual cortex; Donahue et al., 2006). All these free parameters and their usage during specific periods and conditions are summarized in Table 1. Next, the Grubb's exponent α and the oxygen extraction fraction at rest, E 0 , were fixed to 0.3 and 0.35, respectively. The BOLD signal equation was parameterized for 3 T magnetic field strength and the sequence parameters utilized in this study (see Table S1).
The modeled physiological variables were linked to measured (averaged) response at the level of CBF (i.e., the output of NVC) and BOLD signals. To enable their joint fitting, we considered a concatenated form of the observation equation: where y f and y b are measured CBF and BOLD responses to both static and flickering stimuli in percent signal changes, all concatenated to a single vector. The measured data were explained by modeled CBF and BOLD responses, f and b, respectively, with additive error terms ε f and ε b , constituting the "AR(1)+white noise" model (Friston et al., 2003).
Indicates parameter that is optimized within a certain time-period or condition. † ‡ Indicates optimized parameters that were consider the same between time-periods but different between conditions (e.g., † marks the static condition and ‡marks the flickering condition in the first experiment). Experiment II: CBF, Total CBV and BOLD Responses: To jointly model CBF, total CBV and BOLD responses to the same visual stimulus, the generative model was specified as follows: the excitatory activity of the neuronal model was driven by a single input in the form of a box-car function representing 60 s stimulation duration, scaled by the parameter c. As in the previous experiment, the neuronal parameters σ and µ were allowed to vary between SP and PSP periods, while a single λ was estimated for both periods. The NVC and hemodynamic model were also controlled in the same way as in the previous experiment (see Table 1). The BOLD signal equation was parameterized for 9.4 T magnetic field strength and specific sequence parameters (see Table S1). Since, we have additional measurements of the total CBV, the observation equation had the following form: Here, the total CBV data, yṽ, is modeled as a weighted sum of CBF (i.e., approximating the response shape of arterial CBV) and venous CBV,ṽ = w a · f − 1 + w v · (v − 1), with weights w a and w v scaling the contribution of arterial CBV (i.e., proportional to CBF) and venous CBV. In other words, the measured BOLD data constraints the relative contributions of arterial and venous CBV to the measured total CBV data, as in the BOLD signal model only the venous CBV contributes to its time-course.

Experiment III: Neuronal and BOLD Responses:
To jointly model positive and negative neuronal and BOLD responses to non-overlapping and overlapping visual stimuli, respectively, the generative model was specified as follows: Two independent inputs, u P and u N , scaled by the input strength parameters were used to induce positive and negative responses, respectively. The stimulus duration for the positive response was 25 s but 26 s for negative response as the measured neuronal response remained decreased ∼1 s after stimulus cessation. Furthermore, in contrast to the two experiments above, we assumed that NVC can differ for the positive and negative responses (i.e., a response-type-specific χ), as the mechanisms for NVC may differ for increases and decreases in CBF. On the other hand, similarly as before, the dynamic properties of venous blood compartment (i.e., the viscoelastic time constant τ ) were assumed the same for the two response types but possibly different between SP and PSP periods. Thus, also a single mean transit time, t 0 , and resting venous blood volume, V 0 , were assumed across both conditions (see Table 1). The BOLD signal equation was parameterized for 4.7 T magnetic field strength and specific sequence parameters (see Table S1). Since in this experiment we also have access to neuronal recordings, the observation equation had the following form: where y n and y b are the measured neuronal and BOLD responses to both types of stimuli (i.e., both positive and negative responses, concatenated to a single vector) in percent signal changes. Since the reported percent signal change of the measured neuronal responses in Shmuel et al. (2006) do not directly relate to the physiologically plausible range of CBF and BOLD signal changes in our model, the excitatory neuronal response, n e , in the observation equation was additionally scaled with the parameter w n .

Model Inversion
Modeled responses were calculated by a numerical integration of differential equations using a local linearization approach (Ozaki, 1992), with integration step t = 0.1 s and later downsampled to match the TR of the measured data. Responses defined by the above observation equations were fitted to the measured data using variational Laplace (VL) optimization algorithm (Friston et al., 2007) as implemented in the SPM12 toolbox ("spm_nlsi_GN.m"). This is a Bayesian estimation procedure designed for the estimation of nonlinear dynamic models, where the model parameters are specified in terms of priors. It calculates posterior parameter estimates by iteratively maximizing the free energy (i.e., the approximation to the model log-evidence).
Since the VL algorithm employs the Laplace assumption, all the parameters-prior and posterior-are defined using Gaussian distributions. As most of the physiological parameters included in the model can only have positive values, thus, their default values are scaled with a latent variable via the log-normal transformations; e.g., τ · exp (τ ) (see Table S2). The prior means and variances of the latent variables are listed in Table S1. Figure 2A shows the average CBF and BOLD responses to static and flickering stimuli in percent signal changes, respectively. With the onset of the static stimulus, the CBF response rises first and then the BOLD response follows slightly later. They both reach their maxima (i.e., 66% for CBF and 2.8% for BOLD) after ∼13 s and continue with a steady decrease toward the end of stimulation. This response adaptation has more pronounced character in the BOLD response. After stimulus offset, both responses rapidly decrease. The amplitude of the CBF response first drops quickly to ∼10% and then slowly recovers to the baseline. In contrast, the amplitude of the BOLD response drops below baseline, with a negative peak of −0.7%, at ∼11 s after stimulus offset. This post-stimulus undershoot then recovers to the baseline in the next ∼60 s. The CBF and BOLD responses to the flickering stimulus differ substantially from the responses to the static stimulus. The CBF response reaches its maximum (i.e., 77%) only by the end of the stimulation, exhibiting mostly a flat plateau from ∼22 to 55 s. In contrast, the BOLD response peaks to 3% during an earlier phase of the stimulation, i.e., after 15 s, which is slightly later compared to the BOLD response to the static stimulus. From this time point, the BOLD response slightly and slowly decreases toward the end of the stimulation. After stimulus cessation, both CBF and BOLD responses drop below baseline (reaching their negative peaks in ∼15 s after stimulus offset, at −15 and −1.8%, respectively) and then slowly recover to baseline. In general, the post-stimulus BOLD undershoot is much larger and peaks slightly later compared to the poststimulus undershoot in the CBF response (both relative to their respective positive responses). Additionally, the post-stimulus BOLD undershoot recovery to baseline is steeper compared to the CBF response.

CBF and BOLD Responses
The results of jointly fitting CBF and BOLD responses using the P-DCM model are overlaid on the measured data in the same Figure 2A. The estimated model parameters are listed in Table 2. One can see that the model was able to accurately explain the discrepancy in the response shape of the two hemodynamic variables, and also the response shape variation due to differences in the type of visual stimuli (see fitted CBF and BOLD responses depicted with thick red and purple lines, respectively).
The estimated excitatory and inhibitory neuronal responses are displayed in Figure 2B. The excitatory and not the inhibitory neuronal response mostly defines the shape of the CBF response. However, it evolves faster, with more pronounced transients,  such as response adaptation in the case of the static condition and post-stimulus deactivation in the case of the flickering condition. Response adaptation and post-stimulus deactivation were explained by an amplitude variation of the inhibitory neuronal response. This means that the more pronounced response adaptation during stimulation is caused by a larger, gradual increase of inhibitory activity above baseline and the post-stimulus deactivation solely reflects the sharp increase of inhibitory activity after stimulus cessation (followed by a slower return to baseline). Note that the inhibitory responses displayed in Figure 2B were modulated by the period and condition specific parameter µ (i.e., the inhibitory-to-excitatory coupling). These experimental data and modeling results demonstrate that the type of stimulus modulates both the positive response and the post-stimulus undershoot, but in a different manner. This means that the E-I balance changes with time and can be very different between stimulation and post-stimulation periods, which is reflected in estimated neuronal parameters (i.e., in σ and µ).
The discrepancy between the measured CBF and BOLD responses was explained with uncoupling between the CBF and venous CBV responses. In Figure 2C, we can see that the venous CBV time-course evolves in a much slower fashion than the CBF time-course (Figure 2A). For example, for the static stimulus, the venous CBV response slowly increases during stimulation while the CBF response starts declining already after ∼15 s of stimulation. Similarly, the CBF response returns much faster to baseline after stimulus cessation than the venous CBV response. This dynamic uncoupling during the transient periods results in a more pronounced response transients in the BOLD response, which approximately represents the inverted deoxyhemoglobin response (see Figures 2A,C). The CBF-venous CBV uncoupling parameterized by the viscoelastic time constant was estimated separately for the stimulation and post-stimulation periods, but yielded almost identical values τ SP ∼ = 68 s and τ PSP ∼ = 69 s, respectively. These large values reflect the fact that τ should scale with stimulus duration (see Uludag and Blinder, 2017 and references therein). Importantly, the discrepancy between CBF and BOLD responses was explained with the same viscoelastic time constants for both the static and flickering conditions. This demonstrates that the passive mechanism of CBF-venous CBV uncoupling is independent of stimulus type (but dependent on the stimulus duration). Figure 3A shows the averaged CBF and BOLD responses in percent signal change to 60 s visual stimulation as reported by Jin and Kim (2008). The CBF response reaches its maximum peak (at 46%) ∼12 s after the stimulus onset. The BOLD response reaches the peak ∼2 s earlier (at 3.5%), even though the CBF response is faster immediately after stimulus onset. The average total CBV response (displayed in Figure 3C in percent signal change) rises with the BOLD response but its peak (at 6.5%) is ∼3 s delayed with respect to the BOLD response peak. After reaching their maxima, all three responses decrease toward the end of stimulation. While the BOLD response exhibits the steepest decay, the decrease of the total CBV response is the slowest. The amplitudes of CBF, total CBV and BOLD at the end of stimulation are 21, 3.6, and 1% (i.e., in ratios of 0.46, 0.55, and 0.29 with respect to their maximum peaks), respectively. After stimulus cessation, all responses drop below baseline and exhibit considerable post-stimulus undershoots. The ratios of the poststimulus response undershoots with respect to the amplitudes at the end of the stimulation for CBF, total CBV, and BOLD are 0.57, 0.42, and 1.4, respectively. The BOLD response with the largest relative post-stimulus undershoot reaches the negative peak earliest (at −1.4%), i.e., ∼18 s after stimulus onset, then CBF with smaller relative undershoot follows (−12%, after 20 s), and the total CBV response has the smallest and most sluggish relative post-stimulus undershoot (−1.5%, after 27 s). All responses take almost 100 s to fully recover to baseline.

CBF, Total CBV and BOLD Responses
The results of jointly fitting CBF, total CBV and BOLD responses using the P-DCM model are overlaid on the measured data in the same Figures 3A,C. The estimated model parameters are listed in Table 2. All fitted CBF, total CBV and BOLD responses follow very closely the dynamic changes observed in the experimental data. The estimated excitatory neuronal response depicted in Figure 3B shows a strong and fast adaptation during stimulation and drops significantly below baseline immediately after stimulus cessation, followed by a slow recovery to baseline. As in the previous study, neuronal response adaptation during stimulation and post-stimulus deactivation are modeled by dynamic changes in the inhibitory neuronal response. Inhibitory response modulates the excitatory response by different amounts during stimulation and post-stimulation periods (see Table 2, for differences in optimized period-specific neuronal parameters).
The dynamic relationship between CBF, total CBV, and BOLD signal was explained with the CBF-venous CBV uncoupling. Estimated arterial and venous CBV (CBV a and CBV v ) responses are displayed in Figure 3C in percent signal changes as they contribute to the predicted total CBV response (weighted by parameters w a and w v ). One can see that the arterial CBV change is larger than the venous CBV change (e.g., by a factor of ∼2 at the end of the stimulation) and that venous CBV evolves slower during the transient periods. The large dynamic uncoupling between CBF (or arterial CBV) and venous CBV, which differed between stimulation and post-stimulations periods (τ SP ∼ = 60 s and τ PSP ∼ = 41 s), was estimated to significantly contribute to the size of BOLD response transients. These τ values provide a good explanation for the significantly more pronounced response adaptation and post-stimulus BOLD undershoot compared to the CBF and total CBV responses (as also reflected by the ratios mentioned above). As a consequence of CBF and arterial CBV responses exhibiting strong post-stimulus undershoots (due to the aforementioned decrease of excitatory activity below baseline during PSP), also venous CBV shows a post-stimulus undershoot, even though it is reduced and smoothed due to the viscoelastic properties of veins (i.e., large τ ) (see Figure 4C). The same mechanism applies for the slower increase and strongly reduced adaptation profile of venous CBV during SP.

Neuronal and BOLD Responses
Average positive and negative neuronal responses in percent signal changes to 20 s visual stimuli overlapping and nonoverlapping with its receptive field, respectively, as reported in Shmuel et al. (2006), are displayed in Figure 4B. The corresponding average positive and negative BOLD responses (also in percent signal changes) are displayed in Figure 4A. The positive neuronal response reaches its maximum peak immediately after stimulus onset (given downsampling of the neuronal signal to TR = 1 s). Then, within the next 5 s, it rapidly decreases to its lower plateau, where it remains till the end of the stimulation. In contrast, the positive BOLD response shows a rather slow increase, reaching its maximum ∼15 s after stimulus onset (at 1.75%) and it remains about this level until the end of the stimulation. After stimulus cessation, the neuronal response drops quickly below baseline, producing a strong post-stimulus deactivation followed by gradual return to baseline in the next ∼13 s. The BOLD response also decreases after stimulus cessation but in a much slower fashion, crossing the baseline ∼9 s after the stimulus offset. Then it continues with a stronger post-stimulus BOLD undershoot, which is significantly delayed (∼15 s) with respect to the neuronal post-stimulus deactivation. The limited post-stimulus period of 25 s did not allow for a full recovery of the BOLD undershoot to baseline. The negative neuronal response can be seen as an inverse of the positive response, but exhibiting significantly smaller signal change. After stimulus onset, there is an immediate decrease in neuronal activity below baseline followed by adaptation to the plateau of lower sustained amplitude during SP. After stimulus offset, it first slightly decreases 3 (within 1 s) and then quickly increases, showing a post-stimulus activation above baseline, which is mostly in phase with the post-stimulus deactivation observed in the positive neuronal response. The negative BOLD response follows the course of the neuronal response more closely compared to the positive BOLD response. It reaches the minimum peak (at ∼1%) ∼6 s after the stimulus onset and keeps increasing almost linearly toward the end of the stimulation (to −0.4%). Afterwards, it crosses baseline ∼4 s earlier than the positive BOLD and exhibits a post-stimulus BOLD overshoot (with maximum at 0.4%) that evolves significantly faster than the post-stimulus undershoot of the positive response.
The results of jointly fitting the positive and negative neuronal and BOLD responses with P-DCM are overlaid with the measured data in Figures 4B,A, respectively. The estimated model parameters are listed in Table 2. The fitted neuronal and BOLD responses follow very closely the dynamic changes observed in the experimental data. The estimated excitatory neuronal responses for both positive and negative responses provided an accurate representation of the transient features observed in the experimental data. As before, response adaption and post-stimulus deactivation profiles were modeled by a variable modulation of the excitatory activity by inhibitory activity (see Figure 4A). The dynamic features of the positive excitatory neuronal response are well comparable to the estimated neuronal response in the second experiment (or to the response features seen in the first experiment). The negative excitatory response was induced by the stimulus input function u N (t) scaled by the negative c parameter. Thus, in contrast to the positive neuronal response, the inhibitory response gradually decreases below baseline during SP, which then causes an increase of the excitatory activity after its initial drop. After stimulus cessation, the inhibitory activity quickly increases up to ∼¼ of the total decrease, and then slowly recovers to the baseline. This slow post-stimulus recovery of the inhibitory response below baseline causes an increase (i.e., overshoot) in the post-stimulus excitatory response.
Next, as the NVC could differ between the positive and negative responses, the estimated positive CBF response is significantly delayed with respect to the neuronal response, which smooths out the strong adaptation during SP and poststimulus deactivation observed in the neuronal response. This slow evolution of the CBF response was achieved by slowing down the feedforward mechanism of NVC (i.e., by lowering the decay constant χ). On the other hand, the estimated CBF response following the negative neuronal response is much faster, closely resembling dynamic features of the neuronal response. This is because the NVC acts faster (by employing a higher decay constant χ).
Furthermore, the fitted positive BOLD response is even more delayed with respect to the CBF response, with mean transit time at rest, t 0 ∼ = 3 s (see Figure 4B). The CBF-CBV uncoupling is smaller, with viscoelastic time constant, τ SP ∼ = 9 s, (see Figure 4C). Therefore, no response adaptation is present during SP. Although the estimated CBF response exhibits a minimal post-stimulus undershoot, the stronger post-stimulus BOLD undershoot is well explained by a larger CBF-venous CBV uncoupling (τ PSP ∼ = 29 s) during PSP. The venous CBV and deoxyhemoglobin responses are displayed in Figure 4C.
The negative BOLD and CBF responses show similar response transients even though the same viscoelastic time constants, regulating CBF-venous CBV uncoupling during SP and PSP, were used as in the case of the positive response. Significant poststimulus overshoot in the CBF response can account for a large fraction of the post-stimulus BOLD overshoot. This is because the actual effect of CBF-CBV uncoupling on the post-stimulus BOLD undershoot for the negative response is smaller due to a generally lower amplitude level of venous CBV during recovery (see Figure 4C). Therefore, in contrast to the positive response, the main origin of post-stimulus BOLD overshoot is neuronal.

DISCUSSION
The BOLD fMRI signal is an indirect reflection of neuronal activity. It has been suggested that it best correlates with the postsynaptic potentials, which-after mediation by metabolic and vascular processes-results in the characteristic hemodynamic delay and blurring relative to neuronal activity. Thus, the high complexity of tissue processes associated with brain activity, ranging from microscopic (i.e., molecular) to macroscopic (i.e., brain area) levels, is reduced to a spatially and temporally varying scalar number (i.e., the dynamic fMRI signal). That is, there is only reduced information about the excitatory and inhibitory neuronal activity available from the fMRI signal. As a result, temporal features of the BOLD signal, such as signal adaptation during stimulation or signal reduction after the stimulation, cannot be taken as a direct evidence of neuronal adaptation or post-stimulation deactivation, respectively. Recently, we have proposed, inspired by physiological observations, a novel generative hemodynamic model within the DCM framework, called P-DCM. We have demonstrated (using BOLD data and BOLD data combined with CBF) that P-DCM is superior in describing single ROI time-courses and also deducing the effective connectivity between brain areas (Havlicek et al., 2015(Havlicek et al., , 2017b compared to previous DCM models (Friston et al., 2003;Marreiros et al., 2008) and that the model inversion, in general, benefits from additional CBF data.
In this paper, we have additionally demonstrated the versatility of P-DCM to jointly explain dynamic relationships between neuronal, neurovascular and hemodynamic physiological variables underlying the BOLD signal using new and previously published multi-modal data. For this purpose, we utilized three data-sets of experimentally induced responses in primary visual areas measured in the brains of human, cat, and monkey, respectively: (1) CBF and BOLD responses to static and flickering stimuli acquired for this study; (2) CBF, total CBV and BOLD responses to square-wave grating stimulus (Jin and Kim, 2008); and (3) positive and negative neuronal and BOLD responses induced by overlapping and non-overlapping visual stimuli with the visual receptive field (Shmuel et al., 2006). The fitting of P-DCM to multi-modal data (i.e., the model inversion) was performed using a VB approach (Friston et al., 2007) under the constraint of assumed physiological mechanisms and experimental manipulations. Specifically, we assumed that the BOLD response transients, such as positive response adaptation and post-stimulus undershoot, can be due to two physiological mechanisms: (1) neuronal, due to changes in E-I balance caused by a dynamic interaction between excitatory and inhibitory neuronal populations (Hoge et al., 1999;Krekelberg et al., 2006;Shmuel et al., 2006;Logothetis, 2008;Sadaghiani et al., 2009;Mullinger et al., 2013); and (2) vascular, due to dynamic uncoupling between CBF and venous CBV responses (Mandeville et al., 1998;Chen and Pike, 2009;Kim and Ogawa, 2012;Huber et al., 2014a;Havlicek et al., 2017a). We also assumed that the experimental manipulation can modulate the neuronal response transients by changing the E-I balance and that this can differ between SP and PSP. Similarly, the vascular uncoupling was allowed to vary between SP and PSP but was invariant with respect to the experimental manipulations. P-DCM provided accurate fits to all measured multi-modal responses and was able to shed a light on the dynamic relationships between the physiological processes underlying the BOLD response. The limitations of P-DCM due to its assumptions are discussed below.
In the first experiment using both CBF and BOLD responses, we were able to show that a 55 s long static and flickering stimuli induced different modulations of the CBF response transients during SP and PSP and that there was a large discrepancy in the size and form of transients between the CBF and BOLD transients, as commonly observed (see e.g., Sadaghiani et al., 2009;Havlicek et al., 2017a and references therein). P-DCM could explain the experimentally induced modulation of the CBF response transient by optimizing the balance between excitatory and inhibitory activity. The accurate fit of both CBF responses to static and flickering stimuli was achieved by allowing some of the neuronal model parameters (σ , µ) to be timeperiod-and condition-specific, while others (λ) including the NVC parameter (χ) were considered condition-and periodinvariant (see Table 2). Next, the large discrepancy in the size of transients between the measured CBF and BOLD responses was explained with a strong uncoupling between CBF and venous CBV responses, which was identified to be similar for SP and PSP (τ SP ∼ = 68 s and τ PSP ∼ = 69 s). More importantly, both BOLD responses to static and flickering stimuli could be explained by assuming the same vascular uncoupling for the two conditions. Additionally, these estimates compared quite well with our previous results obtained by applying P-DCM to the single-subject BOLD responses of the same experimental ASL data (Havlicek et al., 2015). Nevertheless, additional information about the shape of CBF responses incorporated in the current study provided more accurate estimation of the neuronal and vascular component contribution to the BOLD response transients (see Figure 6C in Havlicek et al., 2015).
By using P-DCM to explain the dynamic discrepancy in the shape of the CBF, total CBV and BOLD responses to a 60 s long square-wave grating visual stimulus provided by the second experiment (Jin and Kim, 2008), we were able to accurately jointly fit all measured responses. First, even though the response transients (i.e., early-overshoot and post-stimulus undershoot) were strongly present in both CBF and total CBV responses, the BOLD response transients were even more significantly pronounced (see Figure 3A). Thus, some additional mechanism next to the neuronal contribution is necessary to fully explain the BOLD response transients: As in the first study, the shape of CBF response was well explained by optimizing the E-I balance for the SP and PSP. Importantly, by having measurements of both CBF and total CBV responses underlying the BOLD response, we were able to determine that arterial CBV has a larger contribution to the total CBV than the venous CBV, which is in good agreement with other experimental observations (Drew et al., 2011;Kim and Kim, 2011;Huber et al., 2014a;Gagnon et al., 2015), and that venous CBV evolves much slower compared to the CBF (or arterial CBV) due to a strong (but slightly different) CBFvenous CBV uncoupling between SP and PSP (τ SP ∼ = 60 s and τ PSP ∼ = 41 s) (Uludag and Blinder, 2017). The strong neuronal transients are well reflected in both the CBF and arterial CBV responses, and also the venous CBV response transients reflect this neuronal modulation, albeit largely smoothed out by the strong CBF-venous CBV uncoupling (see Figure 3C). Therefore, even though the venous CBV does not exhibit the more typical slow increase during SP and slow return to baseline during PSP (Kim and Kim, 2011;Huber et al., 2014a), the vascular uncoupling still significantly contributes, in addition to the CBF post-stimulus deactivation, to the post-stimulus BOLD undershoot (having ∼50% neuronal and 50% vascular origin). Thus, our modeling results agree with a suggestion by Jin and Kim (2008) that there is a significant contribution of CBF poststimulus deactivation to the post-stimulus BOLD undershoot, but disagree with their suggestion that these multi-modal data do not support the contribution of the vascular uncoupling between CBF and venous CBV. It is incorrect to assume that the venous CBV must exhibit slow return to baseline after stimulus cessation in order to effectively contribute to the post-stimulus BOLD undershoot if there is a post-stimulus undershoot in CBF and total CBV.
The third experiment offered positive and negative neuronal and BOLD responses to 20 s long visual stimuli overlapping and non-overlapping with the receptive field of the voxels in the ROI (Shmuel et al., 2006). The electrophysiological recordings of neuronal activity in V1 demonstrated that the positive neuronal responses can indeed exhibit a very pronounced response adaptation (but see below) and significant deactivation/activation during SP and PSP, respectively, similarly as estimated by P-DCM from the hemodynamic responses in the two experiments above. By modeling the dynamic changes in E-I balance during both SP and PSP, P-DCM was also able to explain the negative neuronal response, including the post-stimulus increase in neuronal activity. Note that this was achieved under the assumption that the input arriving to the excitatory population in V1 from LGN is negative (i.e., already LGN exhibits negative response, see e.g., Gouws et al., 2014). Furthermore, the relationship between the positive neuronal and BOLD responses in this experimental data is very interesting as there is a strong adaptation in the neuronal response during SP but no sign of adaptation in the BOLD response. This seems to appear as a typical observation for the BOLD responses to stimuli with a comparable stimulus duration measured in the V1 area of anesthetized macaque monkey brain (Logothetis et al., 2001;Logothetis, 2002;Pfeuffer et al., 2004;Goense and Logothetis, 2006) but less common in anesthetized cats or rats (as shown above and, e.g., Zhao et al., 2007;Kida and Yamamoto, 2008;Kim et al., 2010). Moreover, the stronger neuronal deactivation during PSP invites the hypothesis that the post-stimulus BOLD response undershoot could be related to this decrease in neuronal activity. However, the estimate of CBF response provided by P-DCM suggests that the neuronal adaptation and post-stimulus deactivations are almost entirely smoothed out by a slow rate of NVC (with χ = 0.19 Hz), which is also necessary to explain the smoothness of the observed BOLD response. This is in line with experimental observations from anesthetized macaque monkey brain reported in Pfeuffer et al. (2004), Zappe et al. (2008), andZaldivar et al. (2014), albeit not explicitly described by the authors (see also discussion below for possible effects of anesthesia). Thus, in contrary to suggestion made in Shmuel et al. (2006), our modeling results suggest that the post-stimulus BOLD undershoot is not caused by the neuronal deactivation but by the vascular uncoupling (with τ PSP ∼ = 29 s) with a slow return of venous CBV to baseline during PSP (but see also below). Note that this vascular uncoupling is smaller than in the first two experiments, which is in a good agreement with the fact that the size of the poststimulus undershoot and thus also size of the vascular uncoupling is proportional to stimulus duration (Uludag and Blinder, 2017).
On the other hand, the shape of the negative BOLD response follows very closely the shape of the negative neuronal response with a smaller delay of the negative peak with respect to the stimulus onset and the post-stimulus overshoot to stimulus offset compared to the positive BOLD response. The faster evolution of the negative BOLD response (with an earlier poststimulus undershoot) compared to the positive BOLD response was also observed in other studies using human subjects (Shmuel et al., 2002;Pfeuffer et al., 2004;Huber et al., 2014a). As we assumed that the passive hemodynamic properties of the venous compartment are the same for both positive and negative responses (including the size of vascular uncoupling), P-DCM explained the dynamic relationship between neuronal and BOLD response by a faster NVC (with χ = 0.85 Hz) for decreases in neuronal activity. In contrast to the positive BOLD response, the neuronal transients are reflected in the BOLD signal timecourse due to the fact that the relative change of neuronal amplitude between stimulus onset and offset is larger (close to baseline by the end of the stimulation) compared to the positive response. Note that theoretically, one could also explain the discrepancy between positive neuronal and BOLD responses by making the NVC faster, increasing the mean transit time (t 0 ) and minimizing the vascular uncoupling (τ ), which would result in the post-stimulus BOLD undershoot having mainly neuronal origin. However, this would lead to t 0 > 4.5 s, which would have to significantly differ from the negative BOLD response. That is, we would not be able to assume the same hemodynamic (i.e., vascular) model for both negative and positive BOLD response in the same voxels, which is not physiologically plausible. In general, it is more likely that the main differences between the positive and negative BOLD responses are due to different control of NVC (Lauritzen, 2005). Whether the rate difference in NVC for positive and negative responses is a general distinctive feature will need to be clarified in future experiments.
In summary, the three experimental data-sets provided physiological measurements at different stages of the dynamic cascade between neuronal and BOLD responses. First, we demonstrated that P-DCM was able to estimate (excitatory and inhibitory) neuronal responses with different amount of response adaptation during the stimulus period and post-stimulus deactivation/activation after the stimulation. We showed that the response adaptation during SP can vary from fast and strong (e.g., in the third experiment) to minimal (e.g., in the first experiment, response to flickering stimulus). Similarly, during PSP, the excitatory neuronal activity can slowly return to baseline (e.g., in the first experiment, after the response to the static stimulus) or it can decrease below baseline following the positive response (e.g., in the second experiment) or increase above baseline following the negative response. Second, the CBF response was shown to reflect the neuronal time-courses in a smoothed fashion via feedforward NVC, which can reduce or even completely eliminate neuronal transients (as in the third experiment for the positive CBF response). However, modeling of the NVC allows recovering excitatory and inhibitory neuronal transients from the CBF data. Additional transient phenomena in the BOLD response are induced by CBF-venous CBV uncoupling. That is, the discrepancy between the BOLD signal and CBF are due to venous CBV (or, alternatively, CMRO 2 , but see Havlicek et al., 2017a). That is, the presence or absence of dynamic features in the BOLD signal is not an unambiguous indication of the presence or absence of those features on the neuronal level. However, P-DCM applied to multi-modal data was able to dissociate between neuronal and vascular contributions to the BOLD response transients induced by different types and durations of stimuli. Furthermore, P-DCM accommodated the magnetic field strength and sequence parameters differences between experimental studies that also influence the size and nonlinearity of BOLD response transients (Havlicek et al., 2015(Havlicek et al., , 2017aUludag and Blinder, 2017). We think that P-DCM and its emphasis on response transients may be useful to also explain other combinations of multimodal data (e.g., neuronal activity recordings, CBF, CBV and BOLD) or to play an important role in combining EEG and fMRI data (Valdes Sosa et al., 2009;Riera and Sumiyoshi, 2010;Rosa et al., 2011;Butler et al., 2017;Friston et al., 2017). The neuronal mass activity of post-synaptic signals can be decomposed to non-overlapping frequency bands. The higher frequency band (gamma) is more associated with the main signal change of the hemodynamic response while lower frequency bands, such as alpha, and beta carry more information about changes in response transients (Magri et al., 2012;Mullinger et al., 2013Mullinger et al., , 2014Ding et al., 2016).
Finally, it is important to realize that due to complexity of physiological mechanisms transforming the neuronal response to the BOLD response, the standard linear analysis (e.g., Friston et al., 1995Friston et al., , 1998 or (linear) deconvolution applied to BOLD data (e.g., Gaudes et al., 2011;Ryali et al., 2011;Smith et al., 2012;Bush and Cisler, 2013) cannot provide a reliable estimate of the underlying changes in neuronal activity in any of the three experiments. Our results highlight the necessity of nonlinear models, such as P-DCM, which account for dynamic uncoupling at both the neuronal and vascular levels and that can benefit from multi-modal data. In addition, nonlinear generative models, such as P-DCM, have the potential also to improve novel data analysis approaches, e.g., single-voxel and multi-voxel-pattern or representational similarity analysis approaches (e.g., Kriegeskorte et al., 2008;Haxby, 2012).

Model Assumptions and Limitations
It is generally believed that modeling E-I balance-as the underlying source of the BOLD response (Logothetis, 2008)is crucial for relating neuronal and hemodynamic responses. The applied neuronal model of E-I balance within P-DCM represents a large simplification of the underlying complex neuronal processes that operate at very fine temporal and spatial scales. The aim of this model is to mainly characterize regional post-synaptic changes in excitatory and inhibitory activity using a simple mathematical function that can be related to changes in the hemodynamic signal. It is assumed that the two-state neuronal model is driven by an exogenous input entering the excitatory state and the change in excitatory activity is followed by a (smaller) change in the inhibitory activity that subsequently modulates the excitatory activity via negative feedback and the E-I (im)balance eventually settles into a new balance state. Despite its simplicity, this model is able to represent dynamic change in E-I balance resulting in wide repertoire of neuronal response adaptions profiles (Hoge et al., 1999;Bandettini and Ungerleider, 2001;Logothetis et al., 2001), including more abrupt changes of E-I balance with stimulus cessation followed persistence of inhibitory activity that creates the post-stimulus deactivation (Sadaghiani et al., 2009;Mullinger et al., 2013Mullinger et al., , 2014Mullinger et al., , 2017. Therefore, P-DCM offers a new way to assess the neuronal origin of hemodynamic response transients by means of proxies for excitatory and inhibitory responses that can have high neuroscientific relevance. Changes in the E-I balance were encoded in three neuronal parameters (σ , µ, λ). The temporal evolution of excitatory and inhibitory activity, including their dynamic mismatch, was controlled by the parameters σ and λ, respectively, and the strength of the inhibitory activity modulating the excitatory activity is encoded by the parameter µ. In this paper, we have mostly commented on the shape of estimated neuronal response and emphasized possible differences in excitatory and inhibitory dynamics. However, the actual values of estimated parameters controlling the neuronal model are informative and significantly differed between the three experiments (especially σ and µ). It is possible that these differences can be attributed to the neuronal or stimulus properties, as described above (Kida and Yamamoto, 2008), but also to the fact that responses of the two animal experiments were acquired under anesthesia (see below). In addition, testing for significant differences in estimated neuronal parameters due to experimental modulation or between healthy and diseased subjects is potentially an important area for future P-DCM utilization (Stephan et al., 2017). Note that in this paper we have made the specific assumption that (σ , µ) can vary between both SP and PSP but also between experimental conditions. On the other hand, λ was allowed to vary only between experimental conditions. This is because we favored a simpler model that was able to explain the observed neuronal responses in all three experiments, even with λ being only condition specific. Additionally, in the first experiment, the CBF response during PSP exhibits a slower return to the baseline without post-stimulus undershoot. This is effectively modeled by setting the parameter µ (i.e., the influence of inhibitory to excitatory activity) close to zero, which means that during this PSP, parameter λ does not have an effect on the shape of the neuronal response and becomes unidentifiable. Nevertheless, the inhibitory responses (as displayed in Figures 2-4) are modulated by the parameter µ, which makes them time-period specific as well.
Furthermore, the assumption about the input entering the excitatory state is appropriate for the majority of cortical regions as the vast majority of long-range connections between regions are mediated by excitatory neurons (Markram et al., 2004). For an additional description how P-DCM can model long-range connections, please see Supplementary Material 3 in Havlicek et al. (2015). We have utilized the same assumption also for the negative neuronal response but with a negative input entering the excitatory state, as it was shown earlier that negative responses in primary visual areas may be preceded by negative responses in LGN (e.g., Gouws et al., 2014).
NVC in P-DCM transforms neuronal to CBF response using a feedforward mechanism. While the motivation for utilizing feedforward NVC is fully discussed in Havlicek et al. (2015), we have also demonstrated in Havlicek et al. (2017b) that NVC based on negative feedback mechanism (Friston et al., 2000) is suboptimal and that the feedforward NVC in conjunction with the two-state neuronal model of E-I balance is preferred for modeling fMRI data. NVC is controlled by three parameters (ϕ, φ, χ), but optimizing only χ is sufficient to adjust the smoothness and delay of CBF response with respect to the neuronal response (i.e., we prefer a parsimonious NVC model with a minimum number of free parameters). The NVC parameter χ was assumed to vary between conditions. This is mainly because the third experiment involved a condition resulting in the negative response, and it was suggested earlier that the NVC of positive and negative hemodynamic response may differ (see e.g., Lauritzen, 2005;Huber et al., 2014a), which is supported here by our results obtained with constrained model inversion by multi-modal data. In fact, our fitting results of the first experiment showed that positive hemodynamic responses, although resulting from two different stimulus types and having different modulation of response transients, can have the same NVC parameter χ (see Table 2).
In the hemodynamic model, P-DCM assumes that CBF and CMRO 2 are tightly coupled. There are three reasons to do so: (1) It is a common assumption in DCM literature that CBF is tightly coupled with CMRO 2 (Friston et al., 2003), and other papers showed that CBF and venous CBV are uncoupled (e.g., Mandeville et al., 1999;Kim and Kim, 2011;Huber et al., 2014a), as for longer stimuli venous CBV response exhibits much slower increase and return to baseline compared to the CBF response; (2) If one considers both CMRO 2 and venous CBV responses uncoupled from CBF response, then both can have similar impact on the transients of the BOLD response and the generative model becomes unidentifiable. From modeling (and model inversion) perspective, this is seen as redundancy and therefore it is preferable to fix one of these two mechanisms. This is because under normal conditions even with multimodal data (consisting of CBF, total CBV and BOLD) one cannot effectively disentangle these two mechanisms from each other; (3) We have recently provided a comprehensive proof using multi-echo data (Havlicek et al., 2017a) that next to CBF (i.e., neuronal) contribution, the CBF-venous CBV uncoupling (and not the CBF-CMRO 2 uncoupling) is the mechanism behind the BOLD response transients. In this case, the specific echotime dependence of the BOLD response transients (related to contribution from both extravascular and intravascular signals) together with a variable CBF response allowed us to identify the underlying mechanism.
In this paper, we have selected three multi-modal datasets as illustrative examples to demonstrate the versatility of P-DCM to explain underlying causal relationships under the constraint of multiple physiological measurements. We aimed to include multi-modal data-sets acquired at different magnetic field strengths, containing different combinations of physiological variables next to the standard BOLD response, and possibly involving more than one level of experimental manipulation. Further, we favored averaged data with excellent signal-to-noise quality, which clearly manifest discrepancy between different physiological variables and between experimental conditions. There are certainly many more interesting published multimodal data-sets that P-DCM could be further tested on. For example, we have limited our demonstration to human, monkey and cat data, however, there are many excellent multi-modal data measured in rodents (e.g., Kida et al., 2007;Boorman et al., 2010;Füchtemeier et al., 2010;Hyder et al., 2010;Hirano et al., 2011). Since a good correspondence was shown earlier between hemodynamic responses measured in cats and rats (see e.g., Zong et al., 2012) and in monkeys and rats (see e.g., Huber et al., 2015), we expect that P-DCM could perform well also if applied to rodent data. Furthermore, as model parameters are specified in terms of priors, it is also possible to account for interspecies differences in physiological parameters. For example, a higher baseline blood flow and volume influences the main transit time, which could be adjusted for different species (even though it was not necessary in our case, as the mean transit time was one of the free parameters).
All three data-sets represent evoked responses to longer sustained stimuli measured in the primary visual cortex. This choice allowed us to consider model assumptions that could be shared between all three experiments (as mentioned above) and fitting results obtained from these experiments could be more directly compared with each other. Additionally, in our previous aforementioned study (Havlicek et al., 2017a), the same static and flickering stimuli, including identical stimulus durations, provided high evidence that the BOLD response transients are mainly of neuronal and vascular origin (with negligible or zero contribution of CBF-CMRO 2 uncoupling). Therefore, we are in a good position to extend these results to the model assumptions applied in this paper, especially in the case of the first experiment. Similarly, the authors of the second experiment data-set showed in the independent study that with the same type of stimulus, the venous CBV response in the primary visual cortex of the cat exhibits much slower dynamics compared to the arterial CBV (Kim and Kim, 2011). Furthermore, other multi-modal data acquired in the primary visual cortex of monkey brain (i.e., comparable to the third experiment) showed that despite deactivation of after stimulus cessation in the neuronal response, no or negligible post-stimulus undershoot was present in the CBF response but significantly present in the BOLD response (Pfeuffer et al., 2004;Zappe et al., 2008;Zaldivar et al., 2014). This supports our assumption and fitting result that CBF-venous CBV uncoupling can play important role in explaining the observed BOLD response transients also in third experiment. In general, the selected data may have revealed limitations of the structure of P-DCM and assumptions of specific parameters in the generative model. However, as the generative model was able to reproduce the experimental observations, P-DCM proves to be flexible enough to accommodate a wide range of multimodal experimental data. Nevertheless, more work on novel data (including short stimuli and other brain regions) must be performed to further evaluate and develop current generative model used in P-DCM.
Finally, the second and third experiments were performed on anesthetized animals using isoflurane (i.e., a common anesthetic agent used in animal research). In general, anesthesia is known to influence the amplitude and shape of both neuronal and hemodynamic responses (Krautwald and Angenstein, 2012;Uludag and Blinder, 2017). Under anesthesia using isoflurane, the neuronal baseline (i.e., the firing rate) is decreased and the neuronal responses exhibit smaller changes and can have more pronounced and faster adaptation compared to the awake state (Aksenov et al., 2015;Keller et al., 2017). This can potentially explain the large adaptation profiles of the neuronal responses in the second and third experiment, even though dynamic stimuli were applied (i.e., one would expect responses more comparable to responses to flickering rather than static stimuli in the first experiment). Having said this, larger differences in the shape of neuronal responses due to experimental modulations are expected during awake state (Haider et al., 2013;Bahmani et al., 2014;Keller et al., 2017), highlighting some benefits of human fMRI over animal studies under anesthesia. At the hemodynamic or NVC level, isoflurane based anesthesia leads to vasodilatation of mainly arteries and arterioles in the occipital areas, which results in increase of baseline CBF but even larger increase of baseline CBV; i.e., the mean transit time (in the microvasculature) is increased as well (Lorenz et al., 2001). Subsequently, the change in CBF due to activation is smaller and more delayed compared to awake state (Sicard et al., 2003;Pisauro et al., 2013). As a result, BOLD responses during anesthesia are also smaller. The anesthesia is expected to influence more the active mechanisms of the arterial compartment (in our model represented by CBF) than the passive mechanism of the venous compartment. Note that, theoretically, the same type of anesthetic agents could also have different effect on the neuronal responses and NVC between species, which could explain the large discrepancy between the second and third experiment (besides the obvious differences in stimulus type and stimulus duration). As P-DCM is able to optimize all stages of the physiological chain from the neuronal to the hemodynamic responses, it can also be useful in characterizing the differences in the physiological mechanisms during both anesthesia and the awake state.