Generalized activity equations for spiking neural network dynamics

Much progress has been made in uncovering the computational capabilities of spiking neural networks. However, spiking neurons will always be more expensive to simulate compared to rate neurons because of the inherent disparity in time scales—the spike duration time is much shorter than the inter-spike time, which is much shorter than any learning time scale. In numerical analysis, this is a classic stiff problem. Spiking neurons are also much more difficult to study analytically. One possible approach to making spiking networks more tractable is to augment mean field activity models with some information about spiking correlations. For example, such a generalized activity model could carry information about spiking rates and correlations between spikes self-consistently. Here, we will show how this can be accomplished by constructing a complete formal probabilistic description of the network and then expanding around a small parameter such as the inverse of the number of neurons in the network. The mean field theory of the system gives a rate-like description. The first order terms in the perturbation expansion keep track of covariances.


Introduction
Even with the rapid increase in computing power due to Moore's law and proposals to simulate the entire human brain notwithstanding [1], a realistic simulation of a functioning human brain performing nontrivial tasks is remote.While it is plausible that a network the size of the human brain could be simulated in real time [2,3] there are no systematic ways to explore the parameter space.Technology to experimentally determine all the parameters in a single brain simultaneously does not exist and any attempt to infer parameters by fitting to data would require exponentially more computing power than a single simulation.We also have no idea how much detail is required.Is it sufficient to simulate a large number of single compartment neurons or do we need multiple-compartments?How much molecular detail is required?Do we even know all the important biochemical and biophysical mechanisms?
There are an exponential number of ways a simulation would not work and figuring out which remains computationally intractable.Hence, an alternative means to provide appropriate prior distributions for parameter values and model detail is desirable.Current theoretical explorations of the brain utilize either abstract mean field models or small numbers of more biophysical spiking models.The regime of large but finite numbers of spiking neurons remains largely unexplored.It is not fully known what role spike time correlations play in the brain.It would thus be very useful if mean field models could be augmented with some spike correlation information.This paper outlines a scheme to derive generalized activity equations for the mean and correlation dynamics of a fully deterministic system of coupled spiking neurons.It synthesizes methods we have developed to solve two different types of problems.The first problem was how to compute finite system size effects in a network of coupled oscillators.We adapted the methods of the kinetic theory of gases and plasmas [4,5] to solve this problem.The method exploits the exchange symmetry of the oscillators and characterizes the phases of all the oscillators in terms of a phase density function η(θ, t), where each oscillator is represented as a point mass in this density.We then write down a formal flux conservation equation of this density, called the Klimontovich equation, which completely characterizes the system.However, because the density is not differentiable, the Klimontovich equation only exists in the weak or distributional sense.Previously, e.g [6][7][8][9] the equations were made usable by taking the "mean field limit" of N → ∞ and assuming that the density is differentiable in that limit, resulting in what is called the Vlasov equation.Instead of immediately taking the mean field limit, we regularize the density by averaging over initial conditions and parameters and then expand in the inverse system size N −1 around the mean field limit.This results in a system of coupled moment equations known as the BBGKY moment hierarchy.
In [10], we solved the moment equations for the Kuramoto model perturbatively to compute the pair correlation function between oscillators.However, the procedure was somewhat ad hoc and complicated.We then subsequently showed in [11], that the BBGKY moment hierarchy could be recast in terms of a density functional of the phase density.This density functional could be written down explicitly as an integral over all possible phase histories, i.e. a Feynman-Kac path integral.The advantage of using this density functional formalism is that the moments to arbitrary order in 1/N could be computed as a steepest-descent expansion of the path integral, which can be expressed in terms of Feynman diagrams.This made the calculation more systematic and mechanical.We later applied the same formalism to synaptically coupled spiking models [12].
Concurrently with this line of research, we also explored the question of how to generalize population activity equations, such as the Wilson-Cowan equations, to include the effects of correlations.The motivation for this question is that the Wilson-Cowan equations are mean field equations and do not capture the effects of spike-time correlations.For example, the gain in the Wilson-Cowan equations is fixed, (which is a valid approximation when the neurons fire asynchronously), but correlations in the firing times can change the gain [13].
Thus, it would be useful to develop a systematic procedure to augment population activity equations to include spike correlation effects.The approach we took was to posit plausible microscopic stochastic dynamics, dubbed the spike model, that reduced to the Wilson-Cowan equations in the mean field limit and compute the self-consistent moment equations from that microscopic theory.Buice and Cowan [14] showed that the solution of the master equation of the spike model could be expressed formally in terms of a path integral over all possible spiking histories.The random variable in the path integral is a spike count whereas in the path integral for the deterministic phase model we described above, the random variable is a phase density.To generate a system of moment equations for the microscopic stochastic system, we transformed the random spike count variable in the path integral into moment variables [15].This is accomplished using the effective action approach of field theory, where the exponent of the cumulant generating functional, called the action, which is a function of the random variable is Legendre transformed into an effective action of the cumulants.
The desired generalized Wilson-Cowan activity equations are then the equations of motion of the effective action.This is analogous to the transformation from Lagrangian variables of position and velocity to Hamiltonian variables of position and momentum.Here, we show how to apply the effective action approach to a deterministic system of synaptically coupled spiking neurons to derive a set of moment equations.

Approach
Consider a network of single compartment conductance-based neurons The equations are remarkably stiff with time scales spanning orders of magnitude from milliseconds for ion channels, to seconds for adaptation, and from hours to years for changes in synaptic weights and connections.Parameter values must be assigned for 10 11 neurons with 10 4 connections each.Here, we present a formalism to derive a set of reduced activity equations directly from a network of deterministic spiking neurons that capture the spike rate and spike correlation dynamics.The formalism first constructs a density functional for the firing dynamics of all the neurons in a network.It then systematically marginalizes the unwanted degrees of freedom to isolate a set of self-consistent equations for the desired quantities.For heuristic reasons, we derive an example set of generalized activity equations for the first and second cumulants of the firing dynamics of a simple spiking model but the method can be applied to any spiking model.
A convenient form to express spiking dynamics is with a phase oscillator.Consider the quadratic integrate-and-fire neuron where I is an external current and u(t) are the synaptic currents with some weight α i .The spike is said to occur when V goes to infinity whereupon it is reset to minus infinity.The quadratic nonlinearity ensures that this transit will occur in a finite amount of time.The substitution V = tan(θ/2) yields the theta model [16]: which is the normal form of a Type I neuron near the bifurcation to firing [17].The phase neuron is an adequate approximation to spiking dynamics provided the inputs are not overly strong as to disturb the limit cycle.The phase neuron also includes realistic dynamics such as not firing when the input is below threshold.Coupled phase models arise naturally in weakly coupled neural networks [18][19][20].They include the Kuramoto model [21], which we have previously analyzed [10,11].
Here, we consider the phase dynamics of a set of N coupled phase neurons obeying where each neuron has a phase θ i that is indexed by i, u is a global synaptic drive, F (θ, γ, u) is the phase and synaptic drive dependent frequency, γ i represents all the parameters for neuron i drawn from a distribution with density g(γ), ν is the population firing rate of the network,t l j is the lth firing time of neuron j and a neuron fires when its phase crosses π.In the present paper, we consider all-to-all or global coupling through a synaptic drive variable u(t).However, our basic approach is not restricted to global coupling.
We can encapsulate the phase information of all the neurons into a neuron density function [10-12, 22, 23].
where δ(•) is the Dirac delta functional, and θ i (t) is a solution to system (3)-( 5).The neuron density gives a count of the number of neurons with phase θ and synaptic strength γ at time t.Using the fact that the Dirac delta functional in ( 5) can be expressed as l δ(t − t l j ) = θj δ(π − θ j ), the population firing rate can be rewritten as The neuron density formally obeys the conservation equation with initial condition η(θ, γ, t 0 ) = η 0 (θ, γ) and u(t 0 ) = u 0 .Equation ( 8) is known as the Klimontovich equation [4,24].The Klimontovich equation, the equation for the synaptic drive ( 4), and the firing rate expressed in terms of the neuron density (7), fully define the system.The system is still fully deterministic but is now in a form where various sets of reduced descriptions can be derived.Here, we will produce an example of a set of reduced equations or generalized activity equations that capture some aspects of the spiking dynamics.The path we take towards the end will require the introduction of some formal machinery that may obscure the intuition around the approximations.However, we feel that it is useful because it provides a systematic and controlled way of generating averaged quantities that can be easily generalized.
For finite N , ( 8) is only valid in the weak or distributional sense since η is not differentiable.In the N → ∞ limit, it has been argued that η will approach a smooth density ρ that evolves according to the Vlasov equation that has the same form as ( 8) but with η replaced by ρ [4][5][6][7]10].This has been proved rigorously in the case where noise is added using the theory of coupled diffusions [25][26][27][28].This N → ∞ limit is called mean field theory.In mean field theory, the original microscopic many body neuronal network is represented by a smooth macroscopic density function.In other words, the ensemble of networks prepared with different microscopic initial conditions is sharply peaked at the mean field solution.For large but finite N , there will be deviations away from mean field [10][11][12]23].These deviations can be characterized in terms of a distribution over an ensemble of coupled networks that are all prepared with different initial conditions and parameter values.Here, we show how a perturbation theory in N −1 can be developed to expand around the mean field solution.This requires the construction of the probability density functional over the ensemble of spiking neural networks.We adapt the tools of statistical field theory to perform such a construction.

Formalism
The complete description of the system given by equations ( 4), (7), and ( 8) can be written as The probability density functional governing the system specified by the synaptic drive and Klimontovich equations ( 9) and (10) given initial conditions (η 0 , u 0 ) can be written as where P [η, u|η 0 , u 0 ] is the conditional probability density functional of the functions (η, u), and P 0 [η 0 , u 0 ] is the density functional over initial conditions of the system.The integral is a Feynman-Kac path integral over all allowed initial condition functions.Formally we can write P [η, u|η 0 , u 0 ] as a point mass (Dirac delta) located at the solutions of ( 9) and (10) given the initial conditions: The probability density functional ( 11) is then Equation ( 12) can be made useful by noting that the Fourier representation of a Dirac delta is given by δ(x) ∝ dk e ikx .Using the infinite dimensional Fourier functional transform then gives The exponent S[η, u] in the probability density functional is called the action and has the form where represents the contribution of the transformed neuron density to the action, represents the global synaptic drive, S 0 [ φ0 (x 0 ), u 0 (t 0 )] represents the initial conditions, and x = (θ, γ, t).For the case where the neurons are considered to be independent in the initial state, we have where u 0 is the initial value of the coupling variable and ρ 0 (θ, γ, t) is the distribution from which the initial configuration is drawn for each neuron.The action includes two imaginary auxiliary response fields (indicated with a tilde), which are the infinite dimensional Fourier transform variables.The factor of 1/N appears to ensure correct scaling between the u and ϕ variables since u applies to a single neuron while ϕ applies to the entire population.The full derivation is given in [12] and a review of path integral methods applied to differential equations is given in [29].In the course of the derivation we have made a Doi-Peliti-Jannsen transformation [12,30], given by In deriving the action, we have explicitly chosen the Ito convention so that the auxiliary variables only depend on variables in the past.The action (13) contains all the information about the statistics of the network.
The moments for this distribution can be obtained by taking functional derivatives of a moment generating functional.Generally, the moment generating function for a random variable is given by the expectation value of the exponential of that variable with a single parameter.Because our goal is to transform to new variables for the first and second cumulants, we form a "two-field" moment generating functional, which includes a second parameter for pairs of random variables, where J and K are moment generating fields, ξ 1 (x) = u(t), ξ 2 (x) = ũ(t), ξ 3 (x) = ϕ(x), ξ 4 (x) = φ(x), and x = (θ, γ, t).Einstein summation convention is observed beween upper and lower indices.Unindexed variables represent vectors.The integration measure dx is assumed to be dt when involving indices 1 and 2. Covariances between an odd and even index corresponds to a covariance between a field and an auxiliary field.Based on the structure of the action S and (17) we see that this represents a linear propagator and by causality and the choice of the Ito convention is only nonzero if the time of the auxiliary field is evaluated at an earlier time than the field.Covariances between two even indices correspond to that between two auxiliary fields and are always zero because of the Ito convention.
The mean and covariances of ξ can be obtained by taking derivatives of the action W [J, K] in (17), with respect to J and K and setting J and K to zero: Expressions for these moments can be computed by expanding the path integral in (17) perturbatively around some mean field solution.However, this can be unwieldy if closed form expressions for the mean field equations do not exist.Alternatively, the moments at any order can be expressed as self-consistent dynamical equations that can be analyzed or simulated numerically.Such equations form a set of generalized activity equations for the means a i = ξ i , and covariances We derive the generalized activity equations by Legendre transforming the action W , which is a function of J and K, to an effective action Γ that is a function of a and C. Just as a Fourier transform expresses a function in terms of its frequencies, a Legendre transform expresses a convex function in terms of its derivatives.This is appropriate for our case because the moments are derivatives of the action.The Legendre transform of W which must obey the constraints The generalized activity equations are given by the equations of motion of the effective action, in direct analogy to the Euler-Lagrange equations of classical mechanics, and are obtained by setting J i = 0 and K ij = 0 in (19).
In essence, what the effective action does is to take a probabilistic (statistical mechanical) system in the variables ξ with action S and transform them to a deterministic (classical mechanical) system with an action Γ.Our approach here follows that used in [15] to construct generalized activity equations for the Wilson Cowan model.However, there are major differences between that system and this one.In [15], the microscopic equations were for the spike counts of an inherently probabilistic model so the effective action and ensuing generalized activity equations could be constructed directly from the Markovian spike count dynamics.Here, we start from deterministically firing individual neurons and get to a probabilistic description through the Klimontovich equation.It would be straightforward to include stochastic effects into the spiking dynamics.
Using (18) in (17) gives where J and K are constrainted by (19).We cannot compute the effective action explicitly but we can compute it perturbatively in N −1 .We first perform a shift where Our goal is to construct an expansion for Γ by collecting terms in successive orders of N −1 in the path integral of (21).Expanding Γ as Γ[a, C] = Γ 0 + N −1 Γ 1 + N −2 Γ 2 and equating coefficients of N in (21) immediately leads to the conclusion that Γ 0 = S[a], which gives where higher order terms in N −1 are not included.To lowest nonzero order Γ 0,ij = N −1 Γ 0,ij 1 since Γ 0 is only a function of a and not C.If we set we obtain to order N −1 .Q ij is an unknown function of a and C, which we will deduce using selfconsistency.The path integral in (24), which is an infinite dimensional Gaussian that can be explicitly integrated, is proportional to 1/ det Q ij = exp(−(1/2) ln det Q ij ) = exp(−(1/2) Tr ln Q ij ), using properties of matrices.Hence, (24) becomes Taking the derivative of Γ with respect to Self consistency with (23) then requires that Q ij = (C −1 ) ij which leads to the effective action where and we have dropped the irrelevant constant terms.
The equations of motion to order N −1 are obtained from (19) with J i and K ij set to zero: and ( 27) can be rewritten as Hence, given any network of spiking neurons, we can write down the a set of generalized activity equations for the mean and covariance functions by 1) constructing a neuron density function, 2) writing down the conservation law (Klimontovich equation), 3) constructing the action and 4) using formulas ( 26) and ( 28).We could have constructed these equations directly by multiplying the Klimontovich and synaptic drive equations by various factors of u and η and recombining.However, as we saw in [15] this is not a straightforward calculation.
The effective action approach makes this much more systematic and mechanical.

Phase Model Example
We now present a simple example to demonstrate the concepts and approximations involved in our expansion.Our goal is not to analyze the system per se but only to demonstrate the application of our method in a heuristic setting.We begin with a simple nonleaky integrate-and-fire neuron model, which responds to a global coupling variable.This is a special case of the dynamics given above, with F given by The action from ( 14) and ( 15) is and we ignore initial conditions for now.
In order to construct the generalized activity equations we need to compute the first and second derivatives of the action L i and L ij .Taking the first derivative of (30) gives The mean field equations are obtained by solving L i = 0 using (31).We immediately see that a 2 = a 4 = 0 are solutions, which leaves us with ȧ1 + βa 1 − β dγ(I + γa 1 )a 3 (π, γ, t) = 0 (32) The mean field equations should be compared to those of the spike response model [31,32].
We can also solve (33) directly to obtain where ρ 0 is the initial distribution.If the neurons are initially distributed uniformly in phase, then ρ 0 = g(γ)/2π and the mean field equations reduce to which has the form of the Wilson-Cowan equation, with (β/2π) (I + γa 1 ) acting as a gain function.Hence, the Wilson-Cowan equation is a full description of the infinitely large system limit of a network of globally coupled simple phase oscillators in the asynchronous state.For all other initial conditions, the one-neuron conservation equation (called the Vlasov equation in kinetic theory) must be included in mean field theory.
To go beyond mean field theory we need to compute L ij (x, x , x ) = δL i (x, x )/δa j (x ): The activity equations for the means to order N −1 are given by (26).The only nonzero contributions are given by L 13 and L 31 resulting in Adding ( 38) and (39) and taking the limit t 0 → t and setting θ 0 = θ, γ 0 = γ gives Initial conditions, which are specified in the action, are required for each of these equations.
The derivation of these equations using classical means require careful consideration for each particular model.Our method provides a blanket mechanistic algorithm.We propose that these equations represent a new scheme for studying neural networks.
Equations ( 41)-( 45) are the complete self-consistent generalized activity equations for the mean and correlations to order N −1 .It is a system of partial differential equations in t and θ.These equations can be directly analyzed or numerically simulated.Although the equations seem complicated, one must bear in mind that they represent the dynamics of the system averaged over initial conditions and unknown parameters.Hence, the solution of this PDE system replaces multiple simulations of the original system.In previous work, we required over a million simulations of the original system to obtained adequate statistics [12].
There is also a possibility that simplifying approximations can be applied to such systems.
The system has complete phase memory because the original system was fully deterministic.
However, the inclusion of stochastic effects will shorten the memory and possibly simplify the dynamics.It will pose no problem to include such stochastic effects.In fact, the formalism is actually more suited for stochastic systems [15].

Discussion
The main goal of this paper was to show how to systematically derive generalized activity equations for the ensemble averaged moments of a deterministically coupled network of spiking neurons.Our method utilizes a path integral formalism that makes the process algorithmic.The resulting equations could be derived using more conventional perturbative methods although possibly with more calculational difficulty as we found before [15].
For example, for the case of the stochastic spike model, Buice et al. [15] presumed that the Wilson-Cowan activity variable was the rate of a Poisson process and derived a system of generalized activity equations that corresponded to deviations around Poisson firing.
Bressloff [33], on the other hand, assumed that the Wilson-Cowan activity variable was a mean density and used a system-size expansion to derive an alternative set of generalized activity equations for the spike model.The classical derivations of these two interpretations look quite different and the differences and similarities between them are not readily apparent.However, the connections between the two types of expansions are very transparent using the path integral formalism.
Here, we derived equations for the rate and covariances (first and second cumulants) of a deterministic synaptically coupled spiking network as a system size expansion to first order.
However, our method is not restricted to these choices.What is particularly advantageous about the path integral formalism is that it is straightforward to generalize to include higher order cumulants, extend to higher orders in the inverse system size, or to expand in other small parameters such as the inverse of a slow time scale.The action fully specifies the system and all questions regarding the system can be addressed with it.
To give a concrete illustration of the method, we derived the self-consistent generalized activity equations for the rates and covariances to order N −1 for a simple phase model.
The resulting equations consist of ordinary and partial differential equations.This is to be expected since the original system was fully deterministic and memory cannot be lost.Even mean field theory requires the solution of an advective partial differential equation.The properties of these and similar equations remain to be explored computationally and ana-lytically.The system is possibly simpler near the asynchronous state, which is marginally stable in mean field theory like the Kuramoto model [7] and like the Kuramoto model, we conjecture that the finite size effects will stabilize the asynchronous state [10,11].The addition of noise will also stabilize the asynchronous state.Near asynchrony could be exploited to generate simplified versions of the asynchronous state.
We considered a globally connected network, which allowed us to assume that networks for different parameter values and initial conditions converge towards a "typical" system in the large N limit.However, this property may not hold for more realistic networks.While the formalism describing the ensemble average will hold regardless of this assumption, the utility of the equations as descriptions of a particular network behavior may suffer.For example, heterogeneity in the connectivity (as opposed to the global connectivity we consider here) may threaten this assumption.This is the case with so called "chaotic random networks" [34] in which there is a spin-glass transition owing to the variance of the connectivity crossing a critical threshold.This results in the loss of a "typical" system in the large N limit requiring an effective stochastic equation which incorporates the noise induced by the network heterogeneity.Whether the expansion we present here is useful without further consideration depends upon whether the network heterogeneity induces this sort of effect.
This is an area for future work.A simpler issue arises when there are a small discrete number of "typical" systems (such as with bistable solutions to the continuity equation).In this case, there are noise induced transitions between states.While the formalism has a means of computing this transition [35], we do not consider this case here.
An alternative means to incorporate heterogeneous connections is to consider a network of coupled systems.In such a network, a set of generalized activity equations, such as those derived here or simplified versions, would be derived for each local system, together with equations governing the covariances between the local systems.Correlation based learning dynamics could then be imposed on the connections between the local systems.Such a network could serve as a generalization of current rate based neural networks to include the effects of spike correlations with applications to both neuroscience and machine learning.