Neural Field Theory of Corticothalamic Prediction With Control Systems Analysis

Neural field theory is used to model and analyze realistic corticothalamic responses to simple visual stimuli. This yields system transfer functions that embody key features in common with those of engineering control systems, which enables interpretation of brain dynamics in terms of data filters. In particular, these features assist in finding internal signals that represent input stimuli and their changes, which are exactly the types of quantities used in control systems to enable prediction of future input signals, and adjustment of gains which is argued to be the analog of attention in control theory. Corticothalamic dynamics are shown to be analogous to the classical proportional-integral-derivative (PID) filters that are widely used in engineering.


INTRODUCTION
The brain must carry out functions that implement attention to external stimuli, prediction of their future course, and decision with regard to actions to take in response, including interventions to control aspects of the environment. In the natural environment, a large fraction of visual processing deals with the prediction of spatiotemporal trajectories of objects and the taking of actions to either intercept or avoid them. For example, an organism may seek to capture and eat an object in its environment or to avoid being captured and eaten. These decisions thus differ markedly from the binary decision and discrete classification tasks widely studied in cognitive neuroscience.
With regard to attention to multiple sensory streams, there is extensive evidence that attention paid to individual streams is apportioned on an approximately Bayesian basis (Feldman and Friston, 2010;Friston, 2010). Notably, the weight placed on a particular stream is approximately inversely proportional to its variance (Feldman and Friston, 2010;Friston, 2010). These biological observations, along with advances in machine learning and neuroscience, have motivated a plethora of models of attention and prediction in the brain, each with Bayesian features, with most motivated by cortical neuroanatomy and neurophysiology. However, none of the proposed frameworks has yet been shown to be fully implementable in the brain's tissues.
One example of this point is the model of Rao and Ballard (1999), who used the Kalman filter to model visual information processing in the brain. A Kalman filter is a Bayesian approach that applies when all the distributions of internal states, external data, and their uncertainties, are Gaussian distributed and the underlying internal dynamics are linear. Rao and Ballard showed that a Kalman filter could accomplish some prediction outcomes, and demonstrated neuronal implementation of some of its steps, but did not explain how its all complex matrix operations would actually be implemented in the brain. Helmholtz (1867) suggested that the brain predicts its inputs and adjusts an internal model to minimize mismatches between these predictions and external subsequent inputs. Friston and coworkers elaborated on this idea to develop Bayesian estimation schemes with both bottom-up and top-down signaling, and proposed that the brain employs a hierarchical Bayesian approach for perception and, crucially, action (Lee and Mumford, 2003;Friston, 2005Friston, , 2010Hawkins and Blakeslee, 2007;Daunizeau et al., 2010b,a). The common theme of these approaches rests upon the minimization of the mismatch between top-down predictions and bottom-up expectations. Subsequent work Mathys et al. (2011Mathys et al. ( , 2014 expanded classic Bayesian inference to optimize the precision of prediction errors which enabled them to include reinforcement learning into the ensuing hierarchical Gaussian filter (HGF) for perception, attention, and action. This family of models has been shown to be able to carry out a number of sophisticated prediction tasks, but this raises a number of questions. Specifically, they rely on the brain being able to determine, store, and update multivariate probability distributions of Bayesian priors, to carry out multidimensional integrations over these distributions, and to compute large scale matrix operations, all in real time. Although these problems have been recognized, and suggestions have been made to simplify the evaluations by means of reduced moment-based representations, for example (Bastos et al., 2012;Pouget et al., 2013;Mathys et al., 2014;Friston et al., 2017), it has yet to be established that the brain can carry out the necessary calculations.
The current state of affairs is thus that each proposal in the literature is motivated by the real brain, but relies in places on mathematical steps which have no established implementation in neural tissue. Hence, although these schemes have many plausible features, and many interesting applications have been demonstrated, none has been shown to be fully realizable in the brain.
Motivated by the need for a formulation of brain dynamics that is physiologically realizable, analytically and numerically tractable, and experimentally testable, we take a different approach. Instead of deciding on a favored mathematical formulation and assuming that it works in the brain, the present work takes the physically motivated reverse approach of first analyzing realistic corticothalamic responses to simple visual stimuli using neural field theory (NFT). This enables us determine what filter properties they exhibit, rather than advancing a predetermined model. We thus analyze the corticothalamic system, focusing on the response of primary visual cortex V1 to spatially unstructured stimuli which has been widely used in visual flicker experiments to probe steady state visual evoked potentials (SSVEPs) (Spekreijse, 1966;Spekreijse et al., 1973;Herrmann, 2001;VanRullen and Macdonald, 2012). In doing this we postpone application to more complex stimuli in order to focus on establishing the first example of a fully neurally implementable scheme for prediction, followed in a subsequent paper by attention and decision. This involves treating the brain first and foremost as a physical system that is responding to its environment, rather than as a computer or abstract information processing engine. Some of the dynamical processes that we uncover may be able to be fruitfully interpreted as information processing or computation, but such interpretations are subject to the fact that the brain is a physical object.
We first employ NFT to model the corticothalamic system and determine the extent to which its dynamics can be interpreted within a control systems framework which has the potential to encompass prediction, gain tuning, and control. Most control systems and Bayesian schemes such as Kalman filtering are reduced or constraint form of Bayesian learning under optimal estimation theory (Chen, 2003). In carrying out our analysis, we determine which signals within the system represent input stimuli and their rates of change, because these are the classes of quantities used in control systems to enable prediction of future input signals, and adjustment of gains which is argued to be the analog of attention in control theory. The finding of analogous quantities in the corticothalamic system enables interpretation of its dynamics in terms of control systems, and assists in localizing the structures in which gain control is possible in principle. In a forthcoming paper, we will use these findings to propose physiologically-based mechanisms for attention and control via feedbacks. At each stage, we are explicit about the neural implementation of the mechanisms. This paper is structured as follows. Section 2 briefly outlines the corticothalamic model of the brain using the neural field theory and then we obtain transfer functions for the corticothalamic model. The transfer functions are analyzed and data filter interpretations of them are presented at Section 3. Section 4 concludes and discusses future work.

Corticothalamic Neural Field Theory
In this section we outline the essentials of NFT, then apply it to a generalized corticothalamic model. For more details see Robinson et al. (2002Robinson et al. ( , 2004.

Neural Field Theory
NFT averages over short spatial and temporal scales to obtain equations for the evolution of dynamical variables within each neural population a, which are the local mean cell-body potentials V a , the mean rate of firing at the cell body Q a , and the propagating axonal pulse rate fields φ a .
The mean firing rates Q a are related to the cell body potentials V a (r, t), relative to resting, by where S is a sigmoid function that increases smoothly from 0 to Q max as V a (r, t) increases from −∞ to ∞. An approximation of this function is (Wilson and Cowan, 1973;Freeman, 1975) where θ is the mean neural firing threshold and σ ′ π/ √ 3 is the standard deviation of the difference between the steady state depolarization of individual neurons and their thresholds.
Effectively, the threshold response of a single neuron is smeared out to yield a sigmoid when averaged over the population.
Signals arriving at neurons of type a stimulate neurotransmitter release at synapses. This is followed by propagation of voltage changes along dendrites and soma charging, with dynamics that spread the temporal profile of the signals. The total cell body potential can thus be written where the subscripts on V ab distinguish the different combinations of afferent neural type and synaptic receptor, and where the differential operator D ab that governs the temporal response of V ab to afferent pulse rate fields φ b is The operator D ab encapsulates the rates β ab and α ab of the rise and fall, respectively, of the response at the cell body. On the right of Equation (4), N ab is the mean number of synapses on neurons a from neurons of type b, s ab is the mean time-integrated strength of soma response per incoming spike, and φ b (r, t − τ ab ) is the mean spike arrival rate from neurons b, allowing for a time delay τ ab due to anatomic separations between discrete structures. The overall connection strength between two neural populations is ν ab = N ab s ab . Each part of the corticothalamic system gives rise to neural pulses, whose values averaged over short scales form a field φ a (r, t) in our model that propagates at a velocity v a . To a good approximation, φ a (r, t) obeys a damped wave equation whose source of pulses is Q a (r, t) (Jirsa and Haken, 1996;Robinson et al., 1997), with D a (r, t)φ a (r, t) = Q a (r, t), where the spatiotemporal differential operator D a (r, t) is Here the damping rate γ ab satisfies γ a = v a /r a , where r a and v a are the characteristic range and conduction velocity of axons of type a.

Corticothalamic Model
Corticothalamic NFT incorporates key anatomic connectivities as shown in Figure 1. The neural populations included are cortical excitatory (e) and inhibitory (i) neurons, the thalamic reticular nucleus (r), thalamic relay neurons (s) that project to the cortex, thalamic interneurons (j), and noncorticothalamic neurons responsible for external inputs (n). In the present case, external inputs are visual, the relevant relay nucleus is the lateral geniculate nucleus (LGN), and its projections are to the primary visual cortex (V1). The only nonzero values of τ ab in our model are the forward delays τ es = τ is ≈ 20 ms and the backward delays including τ se = τ je = τ re ≈ 60 ms, which correspond to thalamocortical and corticothalamic propagation times, respectively. The use of a single form of D ab corresponds to the approximation that the mean dendritic dynamics can be described by a single pair of time constants.
In our model, only the axons of excitatory cortical neurons a = e are long enough to yield significant propagation FIGURE 1 | Physiologically based corticothalamic model in which the arrows represent excitatory effects and the circles depict inhibitory ones. The populations are cortical excitatory (e) and inhibitory (i) neurons, the thalamic reticular nucleus (r), thalamic relay neurons (s) that project to the cortex, and non-corticothalamic neurons responsible for external inputs (n). Gray boxes depict the lateral geniculate nucleus (LGN) (left), the thalamic reticular nucleus (TRN) (middle), and primary visual cortex (right). effects in Equation (7); in the other cases, r a is so small that the solution of equations above can be approximated by In the cortex, the number of synapses is closely proportional to the numbers of source and target neurons (Robinson et al., 1997(Robinson et al., , 1998Braitenberg and Schüz, 1998), which implies that ν ee = ν ie , ν es = ν is , and ν ei = ν ii .

Steady States, Linearity, and Transfer Functions
The NFT equations are nonlinear in general, and highly nonlinear phenomena like epileptic seizures have been studied with them (Breakspear et al., 2006). Normal brain states have been shown to correspond to spatially uniform steady states of corticothalamic NFT, and are obtained by setting all time and space derivatives to zero (Robinson et al., 2002Abeysuriya et al., 2015). Stable steady state solutions are interpreted as representing the baseline of normal activity, which yields firing rates in accord with experiment (Robinson et al., 2002. Time dependent brain activity is then represented by linear perturbations from the steady states-an approximation that has reproduced a host of experimental phenomena, including evoked responses (Robinson et al., 1997(Robinson et al., , 1998(Robinson et al., , 2002(Robinson et al., , 2005; O'Connor and Rowe et al., 2004;Kerr et al., 2008;van Albada et al., 2010;Roberts and Robinson, 2012;Abeysuriya et al., 2015;Sanz-Leon and Robinson, 2017). In this section, we calculate the linear transfer functions of stimuli to corticothalamic populations.

Linear Dynamics
The low firing rate of steady-states have been identified with normal states of brain activity (Robinson et al., 1997) and nonlinear terms are only found to be significant in very strong stimulation conditions (Herrmann, 2001;Roberts and Robinson, 2012;Abeysuriya et al., 2015). The criterion for linear approximation to be valid is that voltage perturbations should be significantly less than σ ′ /3. Linear perturbations relative to uniform steady-state values φ (0) a , where a = e, i, r, s, and V (0) a approximately obey the damped wave equation where we henceforth use the symbols φ a and V a to denote the linear perturbations of these quantities relative to their steadystate values, unless otherwise indicated, and ρ a = dS (V a ) /dV a , evaluated at V 0 a . The external field φ n which drives the brain via the relay nuclei also comprises a steady-state component φ (0) n plus a time-varying signal that causes the response, denoted by φ n . The differences from threshold voltage variations of about ±σ ′ /3 are therefore needed before nonlinear terms become appreciable relative to the linear ones. This yields the variations of order 1 mV, or slightly larger, which corresponds to approximately a 2fold firing rate bound. Detailed analysis of the model with respect to these parameters can be found in Robinson et al. (2004).
Operation with D ab on both sides of Equation (8), plus use of Equation (4) with G ab = ρ a ν ab = ρ a N ab s ab .
The gain G ab is the response in neurons a due to unit input from neurons b; i.e., the number of additional pulses out for each additional pulse in. A transfer function is the ratio of the output of a system to its input in the linear regime. Either the Laplace or Fourier transform can be used to determine transfer functions, but the former is more widely used in engineering control theory, particularly to analyze responses to impulses. To derive transfer functions one may apply the Laplace transform to both sides of Equation (9) to transform it from time t to complex frequency s. The unilateral Laplace transform is defined by Ogata and Yang (1970) where s = −iω = Ŵ − i is the complex frequency that parametrizes the response e st . Here, the real quantities and Ŵ denote the oscillation frequency and growth rate of the response, respectively, so stable responses correspond to s lying in the left half of the complex plane, with neutrally stable responses having s on the imaginary axis. Alternatively, one may use the continuous Fourier transform F , which is equivalent to evaluating the bilateral Laplace operator with imaginary argument Before we calculate system transfer functions, we note that the operator in Equation (4) has the Laplace transform We define the corresponding filter function by The Laplace transform of Equation (7) is where we have also Fourier transformed the spatial Laplacian operator via ∇ 2 → −k 2 where k is the wave number.

Transfer Function to Thalamus From Retina
The firing rate (in the spatial-Fourier, temporal-Laplace domain henceforth) in the reticular nucleus is For relay neurons in the LGN, the firing rate φ s is By substituting φ r from Equation (17) into Equation (18), the transfer function to thalamus from retina is found to be where where G abc = G ab G bc , and L abc = L ab L bc .

Transfer Functions to Cortex From Retina
By setting a = e and a = i in Equation (9), the axonal fields of V1 cortical cells are found to obey Replacement of φ i in Equation (26) by means of Equation (27), yields the transfer function to the cortex from the thalamus, where M(s) and N(s) are as in Equations (21-25). Multiplying Equations (19) and (28) yields the overall transfer function to cortex from retina: Replacement of φ s and φ e in Equation (27) by means of Equations (19) and (31) yields the transfer function to the inhibitory population from the retina, . (33) where

Transfer Function to TRN From Retina
Replacement of φ s and φ e in Equation (17) by means of Equations (19) and (31), yields the transfer function to the TRN from retina, where W(s) = G re L re N(s) exp −s(τ es + τ se ) + G rs L rs M(s).
FIGURE 2 | Magnitude of the transfer functions T an to populations a = s, r, i, e vs. frequency, as labeled, for k = 0 and the parameters in Table 1.  .

Corticothalamic Transfer Function Characteristics
The frequency response of the transfer functions is the transfer function evaluated on the imaginary axis of the s-plane, where s = −i . Figure 2 shows the frequency responses of all populations for the nominal parameter values in Table 2, using the Control System Toolbox of Matlab 2017a to carry out the calculations. More detailed analysis of the model with respect to these parameters can be found in Robinson et al. (2002Robinson et al. ( , 2004 and Abeysuriya et al. (2015). Only the the spatiallyuniform effects of perturbations, i.e., k = 0, is explored in this study. Low frequencies are passed, while high frequencies are attenuated, and for input signals with small frequency, each transfer function represents an amplifier with constant gain. At higher frequencies, pronounced resonances at 9 and 18 Hz are present in all transfer functions, which can be associated with the alpha and beta peaks in the brain's wake state. The functions become less resonant as the signal gets further away from retina to the cortex: the thalamic functions T sn and T rn have higher amplitude and wider bandwidth than the cortical ones T en and T in . The transfer function fully describes the linear system properties and enables us to investigate its response to any external signal. Setting the denominator of the transfer function to zero yields the characteristic equation of the system, whose roots are its eigenvalues and mark the poles; these poles determine the basic modes into which the system response can be decomposed. Furthermore, all corticothalamic transfer functions calculated above, have some or all of their poles (basic modes) in common, which is a direct result of the interconnectedness of the system. Roots of the numerator of the transfer function are the zeros of the system; signals at these frequencies are not transfered through the system.

Control Systems Interpretation of Corticothalamic Transfer Functions
In this section, we decompose the transfer functions into elementary modes whose behaviors we associate with data filters whose control system properties are well understood. Only the spatially-uniform effects of perturbations, i.e.,he simpler the description of the system, alt k = 0, have been explored in this study.

Reduced Model
The corticothalamic transfer functions, are ratios of exponential polynomials of s. We approximate each transfer function T ab (s) by a rational function, whose properties can be interpreted in terms of data filters (Ogata and Yang, 1970;Kwakernaak and Sivan, 1972), with where q(s) and p(s) are polynomials of degree m and n, respectively, with m < n. Therefore, when the p j are all distinct (we do not consider degenerate roots here), one has the partial fraction decomposition where the residues r j are The smaller n is, the simpler the description of the system, although accuracy is lost if n is made too small. In subsequent sections, we seek the smallest n that retains the main dynamics. Generally, this leads to the most heavily damped modes (poles with the largest negative real parts) being discarded.

Filter Identification
Once we have a few-pole approximation of the system transfer function, we examine it from a control-systems perspective to determine its predictive properties. Using Equation (40), the response φ b of population b to an input signal φ a can be written as where Equation (42) defines φ j b (s). A general transfer function will have one or more pairs of complex conjugate poles in the Laplace domain, in addition to one or more real poles. Therefore, each pair of conjugate poles generates a real response mode. We also pair up real poles in the next part of the analysis to conveniently treat both cases together as second order filters whose functions are well known.
Hence, we consider the partial transfer function of the sum of two fractions associated with poles p j and p j+1 either both real or conjugate pair, which we denote by T J ab (s), with Equation (43) yields = H J ba (s)G J a (s)φ a (s).
By defining we can write = H J ba (s)G J a (s)φ a (s).
Equations (48) and (49) express how the input φ a (s) first passes through G J a (s) to generate φ I a (s), which is transfered via H J ba (s) to generate the response φ J b (s). The input φ a (t) passes through the filter G J a (s), Transforming Equation (51) to the time domain yields where hence this filter acts as an integrator in the form of a convolution; Figure 3A shows a schematic illustration of this convolution. This filter can be rewritten as a standard second-order filter, as where K is the low frequency gain, 0 is the natural frequency, ζ is the damping coefficient, and the filter's bandwidth termed as B and is 2ζ 0 . At s ≪ 0 the right of Equation (55) approaches K, while in the opposite limit it approaches K/s 2 . When ζ > 1, the FIGURE 3 | Schematic of stages in the predictor given by Equation (47). (A) The input signal φ a (t) first passes through G J a (s), a second-order low-pass convolution filter. (B) The filtered signal φ I a (t) then passes through H J ba (s), where, linear extrapolation over a prediction time τ p yields the predicted signal φ P (t).
filter is overdamped and exponentially decays to the steady state without oscillating, while larger values of ζ yield a slower return to equilibrium. A critically damped filter has ζ = 1 and returns to the steady state as quickly as possible without oscillating. When ζ < 1, the filter is underdamped and exhibits damped oscillations. The peak frequency response occurs at ± peak , with The peak magnitude of G J a (s) satisfies (57) Figure 4A shows the behavior of a nominal G J a (i ) when ζ changes, for fixed 0 and K. Figure 4B shows the response of this filter when the input φ a (t) is a Dirac delta function for a the same range of ζ .
The convolved signal φ I a is transferred through the filter H J ba (s). By transforming Equation (49) into the time domain one then finds where τ p is termed the prediction time and φ P a the predicted signal. Hence, the filter H J ba predicts its own input a time τ p in the future, which is illustrated in Figure 3B. Figure 4C shows the magnitude of the T J ba for the same nominal G J a (i ), ζ = 0.3, and H J ba (s) = s + τ −1 p when τ p varies. Figure 4D shows how T ba predicts the Dirac delta function when τ p varies. At larger τ p the response signal attains its final steady state value more slowly with a smaller resonant peak and less oscillation.
The above convolution plus prediction processes can be interpreted as a Proportional-Integral-Derivative (PID) control scheme used in engineering control systems (Ogata and Yang, 1970;Kwakernaak and Sivan, 1972). Specifically, in the time domain Equation (46) which is a continuous-time serial PID. In this equation the integral (I) smooths the input signal to reduce the effects of noise. The smoothed signal is then extrapolated a time τ p into the future by the combined effects of the proportional (P) and derivative (D) operators in the first parentheses on the right. Hence, each pair of poles in Equation (40) yields a partial transfer function that can be interpreted as a PID controller.

Control System Interpretation of Corticothalamic Dynamics
We now examine each corticothalamic transfer function from Section 2.2 to determine its control-systems characteristics.

Corticothalamic Filters
We find that a 16-pole approximation of T en is accurate to within a root-mean-square (rms) fractional error of 0.01 over the frequency range 0 to 150 Hz for the parameters in Table 2, while a 6-pole approximation suffices for most purposes, accurate to within an rms fractional error of 0.02 over the same range. Figure 5A shows the poles of the 16and 6-pole approximations of T en , while Figure 5B shows the magnitudes of both functions. These results show that the 6pole approximation retains the main features of the dynamics and is sufficient for analyzing its effects; in most cases, it exhibits slightly shifted versions of the least-damped poles of the 16-pole approximation. Similar observations apply to the other transfer functions T in , T rn , and T sn whose 6-pole approximations are accurate to within an rms fractional error of 0.02. The pole maps and the frequency responses for these approximations are plotted in Figures 5C-H. We used the Control System Toolbox of Matlab 2017a to carry out these approximations.
Turning to filter analysis, six poles can be summed in pairs that dominate at low (f 5 Hz), alpha (5 Hz f 15 Hz) and beta (15 Hz f ) frequency ranges, respectively. We thus write = T low bn (s) + T alpha bn (s) + T beta bn (s), where b = s, r, i, e and T low bn is the sum of the two real poles while T alpha bn and T beta bn are the sums over the complex conjugate pairs of poles that represent oscillatory responses in the alpha and beta frequency ranges, respectively; in each case p j = Ŵ j ± i j .

Filter Properties of T low bn
This filter has two poles on the negative real axis at p 1 = Ŵ − and p 2 = Ŵ + , with This corresponds to all ζ > 1 and determines that G low n is an overdamped (no oscillation) low-pass filter. The cut-off frequency of G low n is low The parameters of T low bn calculated for populations b = s, r, i, e are listed in the  Table 3. The impulse response takes the form Because the second exponential decays faster than the first, the response is approximately first-order integrator with a time constant Ŵ − for large ζ . Thus, poles closest to the positive half of the s-plane dominate the response.

Filter Properties of T alpha bn
and T beta bn Both filters share the same main features because each comprises a complex conjugate pair of poles. Therefore, we focus on the alpha filter's properties. The poles are complex conjugates lying in the left half of the s-plane with p j , p j+1 = Ŵ ± i c , where c is termed the cut-off frequency and 0 ≤ ζ < 1. The frequency response of G alpha n has a resonance with |G alpha n as the resonance frequency and magnitude, respectively. Note that peak = c and that the two frequencies are equal only for ζ = 0, which corresponds to a pure oscillator with Ŵ = 0, more generally, the resonance bandwidth is 2Ŵ.
Regarding the temporal response, we must consider the nature of the residues at the poles as well as the pole locations. Since p j = p * j+1 , these residues are conjugates, so the impulse response of G and the predicted signal passed through H alpha bn = 2|r| exp(− 0 ζ t) cos 0 1 − ζ 2 t − arg (r) .
The parameters calculated for this filter are listed in Table 3. The corresponding frequency responses for b = s, r, i, e are plotted in Figure 6. An alpha rhythm (7.5-12 Hz) is detectable in every population's alpha response, with peak frequences ranging from  ≈ 8.4 Hz for the specific nuclei to ≈ 9.3 Hz for inhibitory neurons in the cortex. Similar results for beta filters are also listed in Table 3 and plotted in Figure 6. The beta responses exhibit resonances in the beta band (12.5 − −30 Hz) where amplitudes are significantly smaller than the alpha resonances.
Both alpha and beta filters have higher peaks in thalamus than cortex. Furthermore, in every population except s, the damping rate of alpha filters are approximately half of the beta filters' , meaning the alpha waves live longer than beta waves in these populations. But, the result shows that alpha and beta waves would last for the same time in the response of population s where their damping rates are relatively close. This suggests that beta waves live longer in the LGN response to stimuli than in the rest of the system. Calculating τ p from the filter parameters yields and shows that τ p is governed by both the filters' poles and their corresponding residues. Alpha and beta filters of the corticothalamic functions have small damping rates, ζ ≪ 1, and therefore the the prediction time in Equation (72) for these filters can be approximated as The quantity τ p 0 represents what portion of its resonance cycle the filter predicts in advance. One significant observation is that alpha and beta filters of populations e, i and s present very similar patterns of having τ p of alpha filters longer than those of their beta filters. In contrast, the population r has the opposite relation, which means its beta filter predicts further in advance.

Simulation With Random Input Signal
Aside from the issue of how to interpret corticothalamic dynamics in terms of data filters per se, there is the central question of how well these filters enable the system to predict its complex input signals out to some horizon in the future.
The corticothalamic system can only respond significantly to signals out to approximately the flicker fusion frequency of around 20 Hz. Hence, as a particularly severe test of its prediction capabilities, we simulate the system response to white noise, bandwidth limited to 30 Hz, with total power P n . We are interested in time-varying, but spatially unstructured stimuli. The perturbation analysis then corresponds to presenting the entire filed of view with a stimulus that consists of a sequence of luminances that fluctuates according to a small amplitude (perturbation) around a base level of intensity (steady-state). Such stimuli are widely employed in visual flicker experiments to probe SSVEPs (Herrmann, 2001;VanRullen and Macdonald, 2012). We used Control System Toolbox of Matlab 2017a to carry out the calculations of time responses for transfer function T en using the relevant equations and parameters in the Tables 1, 2, and to generate a random seed for φ n to stimulate the system. Note that the random noise does not affect spectra or other conclusions; it only changes the specific realization of the time Optimize the total of sum of filters through a single gain, as in Equation (76) 3.6 3.6 3.6 0.46 51 Optimize the total of sum of filters through separate gains, as in Equation (77) 1.9 8.9 28 0.29 50 series. We estimate the time offset τ d between the measured output and the stimulus and denote the shifted signal by φ e (t − τ d ). We then define the difference between the stimulus and the output of each filter as which we term the residual signal. For each filter, we find the optimal K that minimizes the power of residual signal, termed as P R . The power of the residuals is minimized to reduce the error between the predicted values and the real coming information subject to temporal frequencies that would normally be encountered in the signal part of sensory input. The resulting parameters are listed in the Table 4. Figure 7A shows the optimized outputs of the slow filter contribution to T en , and its residual signal compared to the stimuli. This filter predicts the slow trends of the external stimulus, corresponding to its large prediction horizon. Figure 7B shows the power spectra of the corresponding signals, which show that the narrow bandwidth of the filter limits it to predicting slow changes because it suppresses P R at low frequencies. The filter has a relatively narrow bandwidth which results in an optimal k l that reduces P R /P n to 0.53. Figure 7C shows the optimally normalized outputs of the alpha filter contribution to T en , and its residual signal compared to the stimuli. The alpha filter requires larger gain k a than the low frequency filter and has a short delay time. Figure 7D shows the power spectra of the corresponding signals which show that this filter acts as an alpha band predictor with P R /P n = 0.39 that cannot forecast higher frequencies or slow trends. Similar results for the beta filter are plotted in Figures 7E,F. The results show a need for a significantly higher gain k b to achieve the closest fit to the stimuli; however, this filter can reduce P R /P n -0.33 because of its significant tail at lower frequencies.
Slow, alpha, and beta filters detect and predict different frequency bands and generate responses, each of which focuses on information in the relevant band. This explains the need for a parallel set of filters normalized so that information about slow trends, and alpha and beta-band oscillations can be retrieved; the best response is obtained by summing the filters. We thus sum the filters in forms in which the gains can be controlled so that the full response is adjusted to optimally track the stimuli. In the first case, we normalize the total sum of the filters through a single factor, k, witch might correspond to an overall neuromodulation, for example. This yields and we adjust k so that P R is minimized. The parameters listed in Table 4 then result in a minimum of P R /P n = 0.46. Figure 7G plots the normalized-shifted φ (I) en and the residual signal compared to the stimuli for these parameters. Figure 7H shows the corresponding power spectrum plots. The results are quite close to the results of individual low-frequency filter in terms of residual power and time delay, because T low en has the highest magnitude, and normalizing the sum of the filters by a single gain causes it to retain its dominance over the total response.
More accurate tracking of the stimuli could be achieved by tuning the parameters separately for each filter. If the gains k m could be separately adjusted for m = l, a, b, denoting the low-frequency, alpha, and beta filters, respectively, then φ (II) en ≈ k l φ low bn + k a φ alpha bn Such an outcome might well be achievable by the real brain because we know that the strengths of the slow, alpha and beta peaks do not vary in lockstep. Figure 7I shows the response φ ( II) en and the residual signal compared to the stimuli for the parameters listed in Table 4, yielding P R /P n = 29. Figure 7J confirms that this response contains information about slow trends of the stimulus, while alpha and beta band waves are predicted with more emphasis than in φ (I) en . Overall, these results show that the response of population e to the stimulus can be approximated by the sum of predicted values. To improve the performance, various normalization constants can potentially be allocated to different filters through gain adjustments. Although we focus on T en here, the same approach can be used to reveal the mechanisms behind other populations' dynamics and prediction capabilities.

DISCUSSION
Motivated by the need for a formulation of brain's global dynamics that starts from its physical characteristics (rather than a predetermined formalism or endpoint) and is analytically, numerically, and experimentally tractable, we have used a neural field corticothalamic model to evaluate the transfer functions for various population of neurons. Particular attention has been paid to the understanding of the transfer functions from a control systems perspective. The main results are: (i) NFT transfer functions for each corticothalamic population were derived and approximated as rational polynomials. These were explored to determine the linear response of each population, and its rate of change, to any stimulus, which are exactly the types of quantities that are used in control systems to enable prediction of future states, and adjustment of gains which is argued to be the analog of implementing attention.
(ii) We approximated transfer functions, while preserving accuracy over the dominant frequency range. These were then used to uncover basic modes by which each population in the corticothalamic system responds to external signals. All corticothalamic transfer functions were then shown to be dominantly governed by a few basic modes and to have similar frequency responses, with different gains.
(iii) Notably, it was shown that each pair of basic modes yields a sub-response that can be expressed in terms of a standard second order PID filter with a low-frequency, alpha, or beta resonance.
(iv) Slow, alpha, and beta filters operate in parallel, with each filter capturing and processing part of the information coming from the external world. The total response to stimuli is obtained by summing these independent responses. This corresponds to the common practice in control systems of using a set of controllers to improve the performance of a tracking system.
(v) Using a random white noise covering the bandwidth of human vision as the stimulus, the responses of corticothalamic filters were simulated. We explored the tracking performance of each filter as well as their sum for the excitatory population in the cortex. The results showed each filter successfully tracks part of the stimulus, while a better prediction is obtained by summing them after separate optimal gain adjustments. This showed how the filters' gains can be adjusted to improve prediction of the future input signal based on its recent time course (value and derives, implicitly) and internal corticothalamic dynamics. Attention, in the other words, may be implemented in the brain through control and adjustment of input gains. This suggested directions for modeling attention in our framework by using mismatches between internal models and external stimuli to drive gain changes. This study will lead to modeling decision and control, and generalization to encompass more general stimuli and sensory systems.
Overall, the results show and interpret the utility of control theory schemes in understanding brain's dynamics, yielding insights into dynamic processes that underlie cognition and action without imposing an outcome in advance.

AUTHOR CONTRIBUTIONS
TB and PR conceived of the presented idea. TB developed the theory and performed the computations. TB and PR verified the analytical methods. TB worked out the technical details, performed numerical calculations, produced figures and numerical results. TB and PR discussed all the results. TB drafted the first manuscript and both authors contributed to the final manuscript.