Extracranial estimation of neural mass model parameters using the Unscented Kalman Filter

Data assimilation, defined as the fusion of data with preexisting knowledge, is particularly suited to elucidating underlying phenomena from noisy/insufficient observations. Although this approach has been widely used in diverse fields, only recently have efforts been directed to problems in neuroscience, using mainly intracranial data and thus limiting its applicability to invasive measurements involving electrode implants. Here we intend to apply data assimilation to non-invasive electroencephalography (EEG) measurements to infer brain states and their characteristics. For this purpose, we use Kalman filtering to combine synthetic EEG data with a coupled neural-mass model together with Ary’s model of the head, which projects intracranial signals onto the scalp. Our results show that using several extracranial electrodes allows to successfully estimate the state and parameters of the neural masses and their interactions, whereas one single electrode provides only a very partial and insufficient view of the system. The superiority of using multiple extracranial electrodes over using only one, be it intra- or extracranial, is shown over a wide variety of dynamical behaviours. Our results show potential towards future clinical applications of the method. Author Summary To completely understand brain function, we will need to integrate experimental information into a consistent theoretical framework. Invasive techniques as EcoG recordings, together with models that describe the brain at the mesoscale, provide valuable information about the brain state and its dynamical evolution when combined with techniques coming from control theory, such as the Kalman filter. This method, which is specifically designed to deal with systems with noisy or imperfect data, combines experimental data with theoretical models assuming Bayesian inference. So far, implementations of the Kalman filter have not been suited for non-invasive measures like EEG. Here we attempt to overcome this situation by introducing a model of the head that allows to transfer the intracranial signals produced by a mesoscopic model to the scalp in the form of EEG recordings. Our results show the advantages of using multichannel EEG recordings, which are extended in space and allow to discriminate signals produced by the interaction of coupled columns. The extension of the Kalman method presented here can be expected to expand the applicability of the technique to all situations where EEG recordings are used, including the routine monitoring of illnesses or rehabilitation tasks, brain-computer interface protocols, and transcranial stimulation.


Introduction
After several decades studying its morphology and dynamics [1], the basic mechanisms that describe the functioning of the brain are still far from being completely understood. There are different reasons that explain this arduous route towards understanding this organ. First, the neurons that form the brain are very diverse morphologically [2] and dynamically [3]. Second, these neurons are connected to each other in extremely large numbers and forming very complex networks [4], whose structural characteristics are still mostly unknown. And third, brain dynamics are very irregular and complex [5,6]. The opposed views of an essentially noisy brain and a deterministic brain exhibiting chaotic activity have been often contrasted. On the one hand there is multiple evidence, both theoretical and experimental, that justifies a stochastic view of the brain [7,8]. On the other hand, other studies reveal deterministic, or rather reproducible, dynamical behaviour [9,10] both at the microscopic scale [11] and at the mesoscale recorded by electroencephalograms (EEG) or magnetoencephalograms (MEG) [12]. The reality is probably a combination of the two views. The fact that the brain receives continuous external inputs from the sensory system also makes its dynamical and experimental interpretation more complex because, even though experiments are designed to minimize uncontrolled inputs, they cannot completely rule them out. Another important limitation for studying the brain is that experimental recordings (such as EEG or fRMI) are almost always indirect reflections of the underlying neural activity [13].
A way of facing the complexities described above is by systematically comparing the experimental observations of brain activity with mathematical models based on specific hypotheses, which can thereby be validated or disproven. Modelling cerebral activity has been attempted both with top-down and bottom-up approaches [14][15][16][17]. Many of these theoretical models are simplifications that capture the basic ingredients of brain dynamics, while others are detailed accounts of the dynamics of neurons that necessarily forgo the description of the whole brain. In that context, a more feasible scale of study is the mesoscopic scale [18][19][20][21][22][23][24]. Many of the modern experimental techniques record information coming from populations of neurons organized in so-called cortical columns. Neural mass models describe mathematically the activity of these populations using reasonably simple equations [25]. These models can describe both the intrinsic oscillatory behaviour recorded at the mesoscale or event-related responses [26,27] with morphologically plausible assumptions for their construction.
In all modeling strategies, however, identifying realistic values for the parameters of the model is a challenging task. One way to address this problem is by integrating experimental information into the models using Bayesian inference [28][29][30][31][32]. This strategy has started to be pursued by using Kalman filtering to integrate experimental data at both the microscopic scale of neuronal networks [33] and the mesoscopic scale of neural mass models [34]. This data assimilation approach is based on the fact that neuronal activity is highly noisy, and allows to estimate both the state and the parameters of the theoretical model using the experimental data available. The method has been used to estimate, for example, the effective connectivity that characterizes epileptic seizures on a patient-specific basis (see [35] and references therein). Kalman filtering has also been used to analyze the suppression of epileptic seizures in coupled neural mass 2/23 models [36], and the induction of the anaesthetised state by drugs [37,38]. But these studies use mainly invasive intracranial signals, and it would be desirable to extend them to non-invasive extracraneal measurements such as EEG. Intracranial signals can be translated into EEG signals in a forward manner [39,40], and in the opposite direction, solving the inverse problem allows to infer intracranial signals from EEG recordings [41][42][43]. In this paper we are going to adopt this approach to extend the Kalman filtering technique, by including a model of the head that transfers intracranial signals onto the scalp.

Methods
To obtain a reliable estimation of the state and the dynamics of the brain, we require a biologically inspired mathematical model of its dynamics, experimental data (as non-invasive as possible), and the means of fusing both sources of information together. In this paper, we use Jansen and Rit's model [25,44] to represent the dynamical evolution of the cortical structures. For our sets of experimental data we model EEG in silico using, again, Jansen and Rit's model together with a model of the head. Finally, we use the Unscented Kalman Filter as our data assimilation algorithm to estimate jointly the state and the parameters of the model [45][46][47].

Mesoscopic neural mass model
Jansen and Rit's model [25,44] describes the mesoscopic activity of a population of neurons [48,49], providing a good compromise between physiological realism and computational simplicity. This model simplifies the neuronal diversity of a cortical column in three interacting populations: pyramidal neurons, excitatory interneurons, and inhibitory interneurons. The larger pyramidal population excites both groups of interneurons, which in turn feed back into the pyramidal cells. In our approximation, the pyramidal population is also driven by excitatory noise from distant areas of the brain and by neighbouring columns.
The dynamics of each population rely on two different transformations. The first converts the average density of incoming action potentials into an average post-synaptic membrane potential (excitatory or inhibitory). It takes the form of a second-order differential equation for excitatory inputs, and for inhibitory inputs, where u e,i (t) and x e,i (t) are the input and output of the transformations, respectively, A and B are the amplitudes of the excitatory and inhibitory post-synaptic potentials, and a and b are the lumped representations of the sums of the reciprocal of the time constant of the passive membrane, and all other spatially distributed delays in the dendritic network. The second transformation converts the net average membrane potential of the population, v, into an average firing rate, and is described by the following sigmoid function: where e 0 is the maximum firing rate of the population, r controls the slope of the sigmoid, and v 0 is the post-synaptic potential for which a 50% firing rate is obtained. We model the brain as a system of N d coupled cortical columns (dipole sources) with the addition of noise. The following equations define our model for each cortical column i: where C 1 to C 4 are connectivity constants that govern the interactions between populations, p(t) is a noisy external input, and the summation term includes the delayed input from other coupled cortical columns. k modulates the strength of the coupling, K is the adjacency matrix, and τ i j is the delay with which column i receives the signal of column j. Table 1 provides the descriptions and values of these parameters. The electrical activity detected by the electrodes on the scalp is originated by the weighted sum of the averaged membrane potential of the pyramidal cells of all the cortical columns, [50].

Extracranial data generation
The main contribution of this paper is the use of multichannel extracranial data to obtain information about the neuronal populations inside the brain using data assimilation. To accomplish this, we use synthetic EEG data generated in silico using Jansen and Rit's model and Ary's head model. To that end, we transform the output x(t) of the neural masses to EEG signals z(t) in the electrodes (see Fig. 1). This transformation is mediated by a lead field matrix [40], which builds on the basic idea of calculating the electric potential caused by a dipole source [13] on a three-layer isotropic hemisphere [39,52]. The lead field matrix also contains information about the geometry of the problem (e.g., locations of cortical columns and electrodes) and about the electrophysiology of the head (e.g., conductivities of the different tissues). The following equations show the potential V e,i on an electrode e, located at r e e [53] , caused by the dipole q i (t) = x i (t)q i generated by the cortical column i, located at r i q and oriented asq i . In these equations, e = 1, . . . , N e , where N e is the total number of electrodes, and i = 1, . . . , N d , where N d is the total number of dipoles: where vectors are typeset in bold and In these expressions, , Γ(r e e , r i q ) = d e,i r e e d e,i + (r e e ) 2 − (r i q · r e e ) .

(11)
σ s is the tangential conductivity of each surface [52] and ρ s and µ s are the Berg parameters relative to it [54] (see Table 2). d e,i is the distance between the dipole i and the electrode e under consideration.  Table 3. Cartesian coordinates of the dipoles used throughout the study.  (7-11), r q is the distance from the origin to the cortical column under consideration; r e is the distance from the origin to the electrode; and d is the distance from the cortical column to the electrode. The placement of the arrows here is for illustration purposes only; in our study, the cortical columns are placed on the surface of the brain, close to the skull.

The Unscented Kalman Filter for data assimilation
The Unscented Kalman Filter (UKF) is our algorithm of choice to bring together the dynamical state of the model and the experimental data. It is a standard tool in the field of systems and control engineering, and has been shown to be both computationally efficient and robust even when dealing with stochastic nonlinear systems [55]. In order to simultaneously estimate the state and parameters of the model described by Eqs. (4)-(6), we regard it as a discrete-time state-space dynamical system of the following form: is the state vector (related to the variables and parameters of the model), with θ p being the parameters to estimate, which obey the equationsθ p = 0. z ∈ R n z is the measurement vector (our in silico EEG readings). v and w are uncertainty terms that account for process noise and measurement noise, respectively, with Gaussian distributions p(v) ∼ N(0, Q) and p(w) ∼ N(0, R), respectively. F is obtained with a numerical implementation of Eqs. (4)-(6), as described below. H relates the state to measurement space. Interestingly, this basic part of the Kalman filter is in our case implemented by the skull, the effect of which is represented by the lead field matrix, based on Ary's head model and introduced above.
The UKF is a recursive predictor-corrector-type algorithm that aims to minimise the mean square error of the estimated states and parameters over time. For each time step it calculates a prediction of the state and parameters of the system, which is corrected when the information from a measurement is incorporated. The amount of confidence given to the model and measurement is quantified by the Kalman gain K, which is calculated at each time step based on prediction covariances as well as model and measurement error covariances (Q and R, respectively). For more details on the implementation of the filter, the reader is referred to S1 Appendix and to Refs. [45][46][47]56].

Generation of in silico datasets
For this paper three different in silico datasets were generated. We consider both simulated electrocorticography (ECoG, intracortical) and electroencephalography (EEG, extracranial) readings (using Ary's model in the latter case). All datasets used the same locations [57] for cortical columns and electrodes, as shown in Figs. 6 to 8. The strength of the coupling was set at a medium value so that the cortical columns have an effect on one another without fully synchronizing behaviours, and the configurations of the couplings are as shown in Fig. 2. Table 1 shows representative values for the parameters used in all analyses unless otherwise specified. In this paper we focus on estimating the amplitudes A of the EPSPs of the different cortical columns, and therefore we choose values for these amplitudes that produce signals that reflect various dynamic regimes that we wish to explore. The numerical solver used to generate the in silico time series was the Heun algorithm [58] with a time step of ∆t = 1 ms; the length of the data is of 100 s in all cases.  Table 3.
To set the matrices Q and R-which reflect the quality of measurement and model, and which crucially affect the output of the filter-, we used our knowledge of the characteristics of the data to fix an initial guess [59], then adjusted it to meet performance criteria.

7/23
For each of the experiments we conducted 50 realizations of each estimation with different initial conditions; therefore, all the figures show averages of the 50 estimations, unless otherwise specified. The initial conditions for state and parameter estimations were randomly generated; the parameters, however, were constrained to deviate no more than 90% of their actual value as an initial assumption.

Results
In order to compare the performance of the extra-and intracranial approaches to Kalman filtering, we have analyzed three different cortical column configurations, each using one of the two motifs shown in Fig. 2. Where relevant, two different types of estimations have been used: intracranial and extracranial. Intracranial estimation uses experimental data that would have hypothetically been obtained from an electrocorticography, that is, using a single intracortical electrode, and is estimated with the data provided by a single location -in other words, the direct output of Jansen and Rit's model. Extracranial estimation, on the other hand, employs experimental data originated from EEG recordings, using several electrodes placed on the skull, and is implemented here with the projection on the head of the model output. We now discuss the three different dipole configurations that we have considered.

Three unidirectionally coupled cortical columns
The first study was performed with the cortical columns coupled unidirectionally (panel (a) of Fig. 2), as described in [59]. The parameters were set to standard values [25] for the three cortical columns (see Table 1), except for the first column, in which A 1 was set to 3.58 mV to make it hyperexcitable. The coupling constant was set to a medium value, large enough for the cortical columns to have a visible effect on each other but not so large that they will fully synchronize and lock their dynamics. In this case, information flows unidirectionally because of the way the cortical columns are coupled [59]. As can be seen in the lower panels of Fig. 3, the first cortical column has a random spiking activity, due to the increased value of A and the presence of noise [60]. Due to the architecture of the coupling, cortical column 1 causes cortical columns 2 and 3 to spike also, when otherwise they would have simply fluctuated around their resting level.
The upper panels of Fig. 3 show the intracortical and extracranial estimations of A for the three cortical columns. The estimation for A 1 of the first column converges to its correct value, with both the intra-and extracortical approaches. This was to be expected, since the first cortical column receives no inputs from other elements of the system. In contrast, the intracortical estimations for cortical columns 2 and 3 converge to values significantly higher than their actual value of 3.25 mV. We conjecture that this is caused by the spiking of these two cortical columns, which as mentioned above is due to the influence of cortical column 1. Multi-channel extracranial information, in contrast, allows to see the complete picture of the coupled cortical columns and treat them as a single composed system, contrary to the partial picture obtained from the information provided by the single intracranial recordings. Therefore, estimation is better when using extracranial information with several electrodes, as shown in the upper panels of the figure. The lower panels of Fig. 3 show the estimation of the state. The UKF shows great efficacy when the estimation is extracranial, but performs poorly in 8/23 the case of intracortical estimation (with the exception of cortical column 1, because it has no input from other cortical columns).   (Table 1). All three cortical columns received an external input, p, in the shape of Gaussian white noise with mean 90 s −1 and standard deviation 20 s −1 . The coupling constant was set to k = 10. The measurements were corrupted with noise of mean 0 and standard deviation 100 mV for extracranial measurements and 5 mV for intracortical measurements. Except for cortical column 1, with intracortical data the filter converges to a much higher value than the target, whereas with extracranial data the filter converges to a value which is accurate. In the lower panels it is shown that extracranial estimations of the state are also accurate, whereas intracortical estimations fail to reproduce the spikes correctly.

Three bidirectionally coupled cortical columns: coarse parameter estimation
The second experiment aims to explore the filter's possibilities in more extreme situations. The three cortical columns are located as in the previous section, but coupled bidirectionally (panel (b) of Fig. 2). Additionally, the maximum amplitudes of the excitatory PSPs are set to A 1 = 4.25 mV, A 2 = 10.00 mV, and A 3 = 3.25 mV. These values were chosen to force the three cortical columns to be in very different dynamical regimes: cortical column 1 operates in a spiking regime; cortical column 2 oscillates with alpha frequency but with an amplitude similar to that of the spikes; and cortical column 3 oscillates in a more standard regime, as described in [25].
Moderate intracortical measurement noise. Figure 4 shows again the performance obtained using the experimental data from a set of extracraneal electrodes compared to using individual intracortical electrodes for each cortical column. In this case we show the 50 realisations of each filtering, without showing the average. The extracranial data for this experiment were corrupted with a measurement noise of zero mean and standard deviation 100 s −1 ; the intracortical data were corrupted with a measurement noise of standard deviation 5 s −1 in order to maintain similar levels of signal-to-noise ratio. As shown in Fig. 4, the intracortical parameter estimations do not approximate very well the target value. In particular, the estimations of A for cortical column 2 converge to three different values depending on the initial conditions. The state estimation follow the actual state of the system closely only for cortical column 1. The situation is very different when with extracranial electrodes, where all 50 realisations of the estimations converge with much more precision to the correct values for both state and parameters (with the exception of A 2 , which still tends to lower values in a very small quantity of the realisations). Again, extracranial performance is better, in general, to intracortical.   Table 1). The external input p for each of the three cortical columns is of mean 200 s −1 and standard deviation 100 s −1 . The coupling constant was set to k = 5. The intracortical measurements were corrupted with noise of mean 0 and standard deviation 5 mV, while the noise in the extracranial measurements is of standard deviation 100 mV. Extracranial estimations of the parameters are both faster and more accurate than intracortical estimations; this applies also to the state, whose dynamics are more faithfully reproduced using multi-electrode extracranial estimation (as shown in the lower panels).
High intracortical measurement noise. The difference between intracranial and extracranial estimation is even larger for higher measurement noise (Fig. 5). In this case, the amount of noise in the intracortical data was set to the same value as the noise in the extracranial data. The value of R was tuned to reflect the increase in measurement noise, but the intracortical estimations failed to obtain the correct values for the parameters and reproduce the state.   Table 1). The external input p for each of the three cortical columns is of mean 200 s −1 and standard deviation 100 s −1 . The coupling constant was set to k = 5. The intracortical measurements were corrupted with noise of mean 0 and standard deviation 100 mV-about an order of magnitude higher than the noise in the previous graph-, while the noise in the extracranial measurements is of standard deviation 100 mV. Extracranial estimations of the parameters are also faster and more accurate than intracortical estimations, more markedly so in this case; as to the state, in this more extreme case, the intracortical estimation does not mimic the evolution of the system in any way.
Using one single extracranial electrode. Using the same dataset, we aimed to investigate the outcome of using each extracranial electrode individually [61], as opposed to using the complete subset as until now. Therefore we used each electrode separately to estimate the state and parameters of the complete system, with 50 realisations of the estimation for each electrode. By doing so, we show that the quality of the estimations is strongly dependent on the relative positions of sources and electrodes.
In  In Fig. 6 the distribution of the estimations of A 1 are shown. The distributions tend to be narrowest in the vicinities of the cortical column whose A value is being estimated. However, it is noteworthy that the histograms obtained from the observations in distant electrodes tend to group not around the actual value of A 1 = 4.25 mV (red vertical line), but of A 3 = 3.25 mV (blue vertical line). This result suggests that the algorithm is unable to distinguish the origin of the EEG activity when sources and electrodes are distant from each other. Figure 7 shows the results of the estimation of A 2 (actual value shown by vertical green lines), revealing wider distributions in general, which indicates a stronger dependence on initial conditions. Although it is true that the electrodes near cortical column 2 perform better in estimating A for that column, the difference with more distant electrodes is not as large as for the estimates of A for cortical columns 1 and 3. Figure 7. Distribution of 50 realizations of A estimations from a single electrode for cortical column 2 (solid green circle). The distributions here are wider than for A 1 and A 3 , although they still tend to be more accurate near the cortical column (solid green circle) and group around the target value of A 2 = 10.00 mV (vertical green line).
Finally, Fig. 8 shows the performance of each electrode when A 3 is being estimated (actual value shown by vertical blue lines in the figure). Interestingly, even the electrodes located at the far left of the figure lead to a good estimate of A, comparable to that coming from the electrodes in the far right, which are closer to column 3 and could therefore be expected to provide a much more accurate estimation.
However accurate some of the single electrodes' estimations are, using the complete set of 15 electrodes invariably yields better results. This is because, in Kalman filtering, combining many sources of information always improves the final estimation, even if some of the sources are inaccurate or incomplete [62].

Three bidirectionally coupled cortical columns: fine parameter estimation
In the previous section, the value of A of one of the cortical columns was much larger than the other two. We now consider the same coupling motif, but with values of the A parameter that are much closer together in value: A 1 = 3.58 mV, A 2 = 3.25 mV, and A 3 = 3.10 mV. The purpose of this test was to ascertain whether the filter could differentiate between parameters with smaller differences in value. This ability is very important if we expect to use the technique in clinical applications. Fig. 9 shows the extracranial estimation of the A parameters using the complete subset of 15 electrodes. The estimations converge to the actual values with enough accuracy as to give hopes of using the filter in a clinical setting.

Discussion
The most important limitation of current data assimilation processes in neuroscience is that the appropriate experimental recordings are usually intracranial. Using Kalman filtering to fit these data to neural mass models shows promise in several contexts and applications. Here we have modified this type of approach 14/23  Table 1). The external input p for each of the three cortical columns is of mean 200 s −1 and standard deviation 100 s −1 . The coupling constant was set to k = 5. The noise in the extracranial measurements is of standard deviation 100 mV. The estimation of the parameters is fairly accurate.
by extending the base neural mass model with a head model, with the aim of integrating non-invasive experimental recordings. We have explored the limitations and advantages of our model using in silico data in very well controlled conditions. Even though we keep the exploration of the technique using real EEG experimental data in mind, here we bring forth a proof of concept by performing several experiments that address different aspects of the method.
In this paper we have considered a system comprised of three cortical columns, modelled according to Jansen and Rit's equations and coupled following two different motifs. The cortical columns are all driven by a noisy input. The signal from the cortical columns is then transferred to the skull, after which it is corrupted by noise to simulate electrode readings from EEG. These are then used to estimate the amplitude of the excitatory post-synaptic potentials using the Unscented Kalman Filter.
The first study involves three columns that are coupled unidirectionally with no backflow. The first cortical column is made hyperexcitable by increasing the excitatory post-synaptic potential to A 1 = 3.58 mV; this cortical column causes the second cortical column and, indirectly, the third to modify their behavior as a result of the spiking of the first. For the intracranial estimations, single intracortical electrodes measured the evolution of the three cortical columns independently; for the extracranial estimations, 15 extracranial electrodes were used simultaneously. Applying the Kalman filter to the extracranial data provided a good estimation of the A parameters; the intracortical measurements, however, yielded mixed results. The estimation for cortical column 1 was accurate, whereas for cortical columns 2 and 3 the estimation was above the target value. We attribute this to the fact that columns 2 and 3 are excited by column 1, which spikes due to a higher value of A. As a consequence, when independently evaluated, the estimation is higher than the actual value. Therefore we suggest that one intracranial electrode provides only a partial view of the system, and thus cannot capture the behaviors of all three cortical columns and the interactions between them; the use of many electrodes provides a more complete view of the system.
Next we considered a situation in which the dipoles were coupled bidirectionally in an all-to-all configuration. The A parameters were chosen such as to cause different dynamic behaviors in the cortical columns: A 1 = 4.25 mV, A 2 = 10.00 mV, and A 3 = 3.25 mV. Three types of fitting via Kalman filtering were performed, using (i) independent intracortical recordings of single cortical columns were filtered, (ii) the complete subset of 15 extracranial electrodes, and (iii) single extracranial electrodes. The intracortical data were corrupted with two different levels (medium and high) of measurement noise. For both cases, the multi-electrode extracraneal estimation surpasses the intracortical results in both speed and quality; the difference, however, is more marked in the presence of higher measurement noise in the intracortical recordings. The results for the single electrodes show a significant influence of space on the quality of the estimations, in the sense that estimations of electrodes close to the source are relatively accurate, and electrodes further away from the source might not allow to discriminate the source of the information correctly, or might completely fail to represent the system.
Finally, we considered the situation of an identical cortical column configuration -in terms of situation and coupling-, except for the values of the EPSPs of the cortical columns: A 1 = 3.58 mV, A 2 = 3.25 mV, and A 3 = 3.10 mV. This dataset was filtered only extracranially, with the purpose of evaluating the filter's ability to discriminate parameter values within narrower ranges. The results were accurate, which is promising in views of applying the algorithm in a clinical setting.
Taken as a whole, our results show that, independently of the need to explore more realistic situations, extracranial EEG recordings constitute a good candidate to be used together with neural mass models and Kalman filters, provided the method is extended with a head model. Using non-invasive techniques in these processes widens the applications of Kalman-based data assimilation methods in neuroscience.
The adaptation of the method to a specific possible applications deserves its own exploration. The possibility of determining parameters of cerebral dynamics in a non-invasive manner would allow us to study, for instance, the origins of the variability in EEG recordings. It would also enable exploring automatic biometric-based user recognition systems [63] and, through single-patient characterization, tracking the changes in brain dynamics due to aging [64,65], and monitoring the evolution of diseases [66]. The possibility of tracking the evolution of brain states during motor imagery-control [67] or task-switching control [68] is also open. Besides, a good description of the brain state would allow the efficient control of epilepsy [69], a good performance in brain-machine interface tasks [70], and the detection and control of transcranial brain stimulation [71]. Rehabilitation tasks [72,73] may also benefit from the possibility of monitoring brain states reliably.
UKF is a predictor-corrector algorithm that estimates the state and parameters at a given time step k in two phases. The first one predicts the state based solely on the dynamical information of the system, i.e., the model. The second incorporates a measurement with which to correct the first estimation.
The first step of the algorithm involves computing the expectation of the state and of the state covariance at time instant k + 1, known as the a priori estimation. For this we use a numerical implementation (using Heun's solver) of Jansen and Rit's model of a cortical column [25,44], as described in the Methods section.
The nature of the nonlinearities of this model prevents us from using a simple linearization approach to propagating the statistics of the state variables across the transformation. Therefore, we incorporate the unscented transform (UT) in our formulation of the Kalman filter, which, instead of attempting to propagate a distribution through the nonlinearity, first propagates a series of deterministically chosen points through the nonlinearity and then recovers the statistical information of the distribution from these. Therefore, the a priori estimation of the state,x − k , is obtained as follows, beginning with the calculation and projection of the 2n + 1 (where n is the state size) sigma points, where P k−1 is the estimated state covariance matrix for the previous time step. This continues with the condensation of the projected sigma points into the a priori state estimate: where Q is the state error covariance and W m and W cov are the weight vectors, defined as In Eqs. 14 and 18, α, β and κ are scaling factors, and λ is calculated as λ = α 2 (n + κ) − n. α, the primary scaling factor, determines the spread of the sigma points around the mean and is usually set between 0.001 and 1. β contains prior information about the distribution of x; for Gaussian distributions, its optimal value is 2. κ, the tertiary scaling parameter, is usually set to 0 [55].
Finally, the sigma points are redrawn [45] and the estimation of the measurement,ŷ − k , is calculated: The use of a measurement to correct the state estimation implies the mapping of the a priori estimate onto the measurement space for comparison. It is worthwhile to note that, in our case, this transformation is a linear matrix H which relates the state of the cortical columns to an EEG reading. See Extracranial data generation section for details.
The second step of the algorithm corrects the a priori estimation of state and covariance by using the information available from the most recent measurement (in our case, an EEG reading). The impact of the measurement is determined by the Kalman gain K k , which essentially expresses the level of confidence on the accuracy of the model and the level of noise in the data.
K k = P x k y k P −1 y k y k (23) where P y k y k is the predicted measurement covariance, P x k y k is the state-measurement cross-covariance, R is the measurement error covariance, and z k is the measurement for the current time step.