STDP in Oscillatory Recurrent Networks: Theoretical Conditions for Desynchronization and Applications to Deep Brain Stimulation

Highly synchronized neural networks can be the source of various pathologies such as Parkinson's disease or essential tremor. Therefore, it is crucial to better understand the dynamics of such networks and the conditions under which a high level of synchronization can be observed. One of the key factors that influences the level of synchronization is the type of learning rule that governs synaptic plasticity. Most of the existing work on synchronization in recurrent networks with synaptic plasticity are based on numerical simulations and there is a clear lack of a theoretical framework for studying the effects of various synaptic plasticity rules. In this paper we derive analytically the conditions for spike-timing dependent plasticity (STDP) to lead a network into a synchronized or a desynchronized state. We also show that under appropriate conditions bistability occurs in recurrent networks governed by STDP. Indeed, a pathological regime with strong connections and therefore strong synchronized activity, as well as a physiological regime with weaker connections and lower levels of synchronization are found to coexist. Furthermore, we show that with appropriate stimulation, the network dynamics can be pushed to the low synchronization stable state. This type of therapeutical stimulation is very different from the existing high-frequency stimulation for deep brain stimulation since once the stimulation is stopped the network stays in the low synchronization regime.

the network architecture in a way that long-lasting desynchronizing effects occur which outlast the offset of stimulation. This approach relies on two properties or requirements of the neural tissue. First, the network dynamics must elicit bistability such that both the synchronized and the desynchronized state are stable (see Figure 1A). Second, there must exist a stimulation that (momentarily) destroys the stability of the synchronous state such that the network is driven into the desynchronized stable state (see Figure 1B).
Unfortunately, most of the work that follows this idea of therapeutical stimulation is based on numerical simulations (Tass and Majtanik, 2006;Tass, 2007, 2009;Tass and Hauptmann, 2007) and only very little has been done analytically (Maistrenko et al., 2007) in order to understand what are the crucial parameters that lead to the two above requirements. Here we consider a recurrent network model that is sufficiently simple to be treated analytically but still captures the desired properties.
Concretely, we consider in this paper a network dynamics that acts on two different time scales. On a short-time scale, neurons integrate the inputs from neighboring neurons and fire periodically. On a longer time scale, synapses can change their strength and hence influence the overall level of synchrony. In principle the above mentioned requirement on bistability can be present in either the neuronal dynamics (short-time scale) or the synaptic dynamics (longer time scale). Since we are aiming for long-lasting effects, we focus on the bistability in the synaptic dynamics. More precisely, we consider a synaptic plasticity learning rule that depends on the precise timing of the action

IntroductIon
High level of synchrony in neural tissue can be the cause of several diseases. For example, Parkinson's disease is characterized by high level of neuronal synchrony in the thalamus and in the basal ganglia. In opposition, the same neural tissues in healthy conditions have been shown to fire in an asynchronous way (Nini et al., 1995). The Parkinsonian resting tremor (3-6 Hz) is not only related to the subcortical oscillations (Pare et al., 1990), but has been recently shown to be driven by those oscillations (Smirnov et al., 2008;Tass et al., 2010). In addition, the extent of akinesia and rigidity is closely related to synchronized neuronal activity in the beta band (8-30 Hz) (Kuhn et al., 2006).
A standard treatment for patients with Parkinson's disease is to chronically implant an electrode typically in the sub-thalamic nucleus of the pre-and postsynaptic neurons. This form of plasticity called spike-timing dependent plasticity (STDP) (Gerstner et al., 1996b;Markram et al., 1997;Bi and Poo, 1998) enhances synapses for which the presynaptic spike precedes the postsynaptic one and decreases the synaptic strength when the timing is reversed (see Figure 3A).
Regarding the stimulation itself, several candidates have been proposed. For example, it has been shown that a patterned, multisite and timed-delayed stimulation termed as coordinated reset (CR) stimulation (Tass, 2003) which forces sub-populations to fire in a precise sequence generates transiently a nearly uniform phase distribution. In fact, long-lasting desynchronizing effects of CR stimulation have been verified in rat hippocampal slice rendered epileptic by magnesium withdrawal ).
In the present study, we do not focus on a precise stimulation but rather assume that there exists a stimulation that desynchronizes transiently a given population. Furthermore a given stimulation rarely affects the whole population which fires in a pathological way but only part of it. So we consider two populations. An input population which is affected by the stimulation (see Figure 2A) and a recurrent population with plastic synapses which is driven by the input population.
In order to calculate the expected weight change in the recurrent network, we need to get an expression for the spiking covariance of this recurrent population. This technical step has been achieved by Hawkes (1971) and used by Gilson et al. (2009a,b) in the context of recurrent network with STDP. The present study extends the work of Gilson et al. (2009b) to the case of oscillatory inputs and highlights the conditions for which the network elicits bistability.

Materials and Methods neuronal dynaMics
Let us consider an input population consisting of M neurons and a recurrent population containing N neurons (see Figure 2A).
being the Dirac delta spike train of the j th input neuron. Let x t = {x(s), 0 ≤ s < t} describe the whole input spike pattern from 0 to t. Similarly, let y(t) = (y 1 (t),…,y N (t)) denote the spike trains of the recurrent (or output) population, i.e., y t t t δ denotes the Dirac delta spike train of the i th neuron in the recurrent population at time t. y t = {y(s), 0 ≤ s < t} represents the output spike train from 0 to t.
Let us assume that the input spike trains x(t) have instantaneous firing rates r(t) = (r 1 (t),…,r M (t)) = 〈x(t)〉 x(t) which are assumed to be periodic (i.e., r(t) = r(t + T), see Figure 2B) and a spatiotemporal covariance matrix ∆C(τ) given by The input spike train x k (t) (bottom) of neuron k has an instantaneous firing rate r k (t) (top) which is oscillating around an averaged firing rate ρ k .
where 1 denotes the N×N identity matrix, The tilde symbol ( )  denotes the Fourier transform, i.e., The recurrent population is said to be highly synchronous if ∆ ν( ) ω 0 is large where ω 0 = 2π/T. It is well known that coupled Kuramoto oscillators do synchronize if the coupling exceeds a certain threshold (Kuramoto, 1984) 2 . Here, because we do not consider phase oscillators (such as the Kuramoto oscillators) as neuronal models, this result does not apply directly. However, as we can see in Eq. 7 the stronger the recurrent weights w, the bigger the amplitude of the output oscillation ∆ ν( ).
ν As can be seen in "Spiking Covariance Matrices" in Appendix, it is easier to calculate this covariance matrix in the frequency domain, i.e., rather than in the time domain. So the Fourier transform of this output covariance matrix yields denotes the Fourier transform of the input covariance matrix.

synaptIc dynaMIcs
In the previous section, we assumed that all the weights were constant in order to calculate the output covariance matrix. Let us now allow the recurrent weights w to change very slowly (w.r.t. the neuronal dynamics) and keep the input weights h fixed. Because of this separation of time scales, we can still use the results derived so far.
In the same spirit as Kempter et al. (1999), Gerstner and Kistler (2002), Gilson et al. (2009b), we can express the weight change as a Volterra expansion of both the pre-and postsynaptic spike trains. If we keep terms up to second order we get for all i ≠ j:  w a w y t a w y t y t W s w y t s ds Since we do not allow for self-coupling, we will set  w ii = 0. a pre (w) (and a post (w)) denote the weight change induced by a single presynaptic (resp. postsynaptic) spike. W(s,w) denotes the STDP learning window (see Figure 3A) and W w  ( , ) ω its Fourier transform (see Figure 3B): where ρ ρ = ∫ − T t dt T 1 0 ( ) is the mean firing rate averaged over one period T. Note that the input covariance matrix ∆C(τ) has an atomic discontinuity at τ = 0, i.e., ∆C( ) ( ) ( ).
denote the membrane potential of all the neurons in the recurrent population (see the Spike-Response Model in Gerstner and Kistler, 2002): where the superscript  denotes a convolution 1 with the EPSP kernel (s), i.e., x t s x t s ds and y t s y t s ds Without loss of generality, we will assume that the EPSP kernel is such that is the Heaviside step function. The N × M matrix h denotes the weight matrix between the input and the recurrent population. w is a N × N matrix denoting the recurrent weights (see Figure 2A).
In this model, output spikes are generated stochastically, i.e., the higher the membrane potential the more likely a spike will be emitted. Formally a spike is generated in neuron i at time t with an instantaneous probability density g u t y t denoting the expected membrane potential of neuron i averaged over a period T, and u u where D′ = diag(g′(ū)). Note that from Eq. 2, the averaged expected membrane potential ū can be expressed as is the averaged output firing rate. Combining those two expressions, we get a self-consistent equation for the averaged output firing rate ν : − denote the amplitude of the oscillation in the recurrent population: denotes the amplitude of the input oscillation and ∆ρ ρ Since ∆ν(t) appears on the l.h.s of Eq. 6 and in a convolved form on the r.h.s, we can Fourier transform this equation in order to express explicitly the amplitude ∆ ν of the oscillation in the recurrent population as a function of the angular frequency ω: denotes the area under the STDP curve for a given weight w ij . Since it is easier to express this covariance matrix in the frequency domain we can rewrite Eq. 14 (in matrix form) as 3 : where ∆  Q( ) ω is given by Eq. 9 and the '' operator denotes the pointwise matrix multiplication (also known as the Hadamard product). The operator φ sets to zero the diagonal elements and keep the offdiagonal elements unchanged. This implements the fact that we are not allowing autapses. 1 = (1,…,1) is a vector containing N ones.

sInusoIdal Inputs
Let us now assume that the input population is exciting the recurrent population with independent Poisson spike trains of sinusoidal intensity ρ ρ ρ ( ) sin( ) t t = + ∆ 0 0 ω where ω 0 = 2π/T is the angular frequency. Because of the independence assumption, the input correlation matrix yields Consistently with the modeling literature on STDP (van Rossum et al., 2000;Gütig et al., 2003;Morrison et al., 2007Morrison et al., , 2008 both the potentiation factor A + and the depression factor A w − ( ) are assumed to depend on the weight. Here, the precise dependence on the weights corresponds to the one used by Gütig et al. (2003): where the parameter µ ∈[ , ] 0 1 controls the dependence upon the weight. If μ → 0, the rule is additive and therefore independent on the weight w. Conversely, if μ → 1, the rule becomes multiplicative. Unless specified otherwise, we set μ = 0.05. In addition to this learning rule, hard bounds for the weights (0 ≤ w ij ≤ w max , ∀i ≠ j) are required if the factors a pre (w) and a post (w) are independent of the weights (which we will assume here) or if they do not contain implicit soft bounds. Because of the separation of time scales assumption, we can replace the recurrent weights by their expected value averaged over one period T, i.e., where R(x) takes the real part of x. Here, we assumed that both a pre and a post are independent of the recurrent weights w. Furthermore, if g(u) = u, then the diagonal matrix D′ = 1 (which is present in  J and  K ) is independent of the recurrent weights w.

results
So far, we derived analytically the neuronal dynamics of a recurrent population that is stimulated by an oscillatory input. Furthermore we calculated the synaptic dynamics of such a recurrent network when synapses are governed by STDP. This rich synaptic dynamics has several interesting properties that are relevant for therapeutical stimulation and that we describe below.
Firstly, under some conditions that will be detailed below, the synaptic dynamics can elicit bistability (see Figures 4A,B). Concretely, if synapses are initialized above a critical value w*, they will grow and reach the upper bound w max . Conversely, if they are initialized below w*, they will decrease down to their minimal value. As a consequence, because the amplitudes of the oscillations and its Fourier transform yields By using this expression of the input correlation, we get an explicit expression for the output covariance matrix ∆  Q( ) ω (see Eq. 9) which can be approximated as: This approximation is valid in the limit of large networks (M >> 1, N >> 1) where the terms involving diagonal matrices can be neglected. Indeed the term which depends on diag( ) ρ scales as M −1 and the term which depends on D = diag( ) ν scales as N −1 if the output firing rate ν is kept constant. For a detailed discussion, see Kempter et al. (1999), Gilson et al. (2009b). By inserting Eq. 18 into Eq. 15, we get: The small discrepancy between the numerics and analytics can be attributed to two factors. First, Eq. 20 assumes the limit of large networks, i.e., N, M → ∞ whereas in the numerical simulations, we used N = M = 100. Secondly, Eq. 20 is valid in the limit of small recurrent weights, i.e., w av < < 1/N (here we used Nw max = 0.75).
As we can see on Figure 4B, the smaller averaged weights w av , the better the correspondence between the numerical simulation and the analytical calculations. Even, for large recurrent weights (w av  w max ), the discrepancy between numerics and analytics is remarkably small.

condItIons for bIstabIlIty
So far, we discussed a scenario in which the stimulation effectively acted as a therapeutical stimulation because it shifted the network state from the highly synchronous (and therefore pathological) state to the low synchronous state. In order to have this property, the network must satisfy two conditions: • C1. Bistability condition. In the absence of stimulation, the network must elicit bistability such that both the synchronized and the desynchronized state are stable. Formally this gives • C2. Desynchronizing stimulation condition. In the presence of a desynchronizing stimulation, the highly synchronous state looses its stability and the network is driven to the low synchronous state. This can be expressed as: In order to keep the discussion reasonably simple, let us make several assumptions. First, let us consider the near-additive learning rule i.e., μ  0. Indeed in the case of a multiplicative learning rule (i.e., μ > > 0), the potentiation factor A + (w) and the depressing factor A − (w) have a strong stabilization effect and therefore no bistability can be expected; see Figure 5A with μ = 0.5. This is consistent with the findings of Gütig et al. (2003) who showed that there is a symmetry breaking in feedforward networks with weight-dependent STDP for μ bigger than a critical value.
in the recurrent population depend on the strength of the recurrent weights (see Eqs 7 and 22), then this bistability at the level of the weights implies a similar bistability at the level of the oscillation amplitude of the recurrent population (see Figure 4C).
Secondly, the presence of the desynchronizing stimulation, which is modeled by setting the amplitude of the oscillatory input ∆r 0 to 0, removes the bistability and pushes all the weights to their minimal value (see Figures 4A,B). Once the stimulation is removed the recurrent weights stay at their minimum value because it is a fixed point of the dynamics.

hoMogeneous case
Because the analysis of the non-linear dynamical system given by Eq. 19 can be challenging, let us consider the dynamics of the averaged recurrent weight w N w If the transfer function is linear, i.e., g(u) = u, the averaged output firing rate ν ν av = ∑ − N j j 1 and the oscillation amplitude of the recurrent population ∆ ν av can be approximated as: The dynamics of the averaged recurrent weight w av given by Eq. 20 is remarkably consistent with numerical simulations performed with spiking neurons (see Figures 4A,B). Note that from the definition of the Fourier transform we have R (W  (-ω 0 , w av )) = R (W  (ω 0 , w av )).
In summary, in order to satisfy both conditions C1 and C2, the learning rule must be nearly additive (μ  0), the learning window must be overall positive ( ) W > 0 and the sum of the coefficients tuning the effect of single spikes (a pre and a post ) must be negative with upper and lower bounds set by Eq. 29.
Note that the bistability conditions derived above are valid for the averaged recurrent weight and do not apply for the individual weights of the recurrent population. As a consequence, this bistability condition does not necessarily imply a bimodal weight distribution. Indeed, if we consider the same set-up as in Figure 4, but draw the initial weights from a uniform distribution, between 0 and w max , then all the weights will eventually settle into a unimodal distribution, around w = 0 (data not shown).
A bimodal weight distribution has been shown to occur in recurrent networks with additive STDP learning rule (Gilson et al., 2009b) favoring neurons that receive correlated inputs. For the sake of simplicity we considered in the simulations identical (and uncorrelated) inputs but the recurrent weight dynamics described by Eq. 15 remains valid for any type of inputs.

dIscussIon
In this paper, we developed a model of a recurrent network with plastic synapses that is sufficiently simple to be treated analytically but that still yields the desired properties. In particular, we showed that when (a) the STDP learning rule is near additive and (b) the learning window is overall positive and (c) the terms in the learning rule involving single spikes (a pre + a post ) have a depressing effect (within some bounds), then a desynchronizing stimulation favors long-term depression in the recurrent synapses and therefore drives the network from a highly synchronous state to a desynchronous state. In this way, our study confirms previously performed simulation studies (Tass and Majtanik, 2006;Tass, 2007, 2009;Tass and Hauptmann, 2007) and, in particular, contributes to a theoretically sound foundation for the development of desynchronizing and, especially, antikindling brain stimulation techniques.
The concept of anti-kindling, i.e., of an unlearning of pathological synchrony and connectivity by desynchronizing stimulation, has been introduced by Tass and Majtanik (2006). For this, STDP was incorporated into a generic network of phase oscillators, and both kindling and anti-kindling processes were studied numerically. To approach a more microscopic level of description, a network of bursting neurons has been introduced as a simple model for an oscillatory population in the STN Hauptmann, 2006, 2007;Hauptmann and Tass, 2007). With this model different aspects of kindling and anti-kindling processes have been studied numerically, such as the effect of inhibition vs. excitation , the impact of weak and particularly short stimuli (Tass and Hauptmann, 2006), post-stimulus transients and cumulative stimulation effects ) as well as the differential effects of different sorts of desynchronizing stimulation protocols ). These questions are highly relevant for the clinical application of desynchronizing stimulation to the therapy of diseases with abnormal synchrony, e.g., Parkinson's disease. In forthcoming studies, more microscopic models will be studied. However, apart from this type of numerical simulation analysis, we aim at establishing a thorough, analytical framework With this additivity assumption and the linear transfer function assumption (g(u) = u), the averaged recurrent weight dynamics can be simply expressed from Eq. 20 combined with Eqs 21 and 22: ∆ R are constant and do not depend on the averaged recurrent weight w av .

Bistability condition C1
With these assumptions, the fixed point w av * expressed in the bistability condition C1 yields Because this fixed point has to be between 0 and w max , we have − < If we push the low-frequency assumption further such that ω 0 τ + << 1 and ω 0 τ − << 1, we have W W  ℜ( ( ))  ω 0 (see Figure 3C). In this case, Eq. 25 yields a simple necessary condition: a pre + a post and the integral of the learning window W must be of opposite sign: Finally, the fact that the fixed point w av * must be unstable (see condition C1), we have F w ′( ) av * > 0 which gives α 1 0 < . By using the expression of w av * given in Eq. 24 we have α 2 + α 3 > 0 and therefore with Eq. 26, we get: W a a > + < 0 0 and pre p ost (27) which is consistent with the parameters used in Figures 3 and 4. Note that in Figure 5B, we violated this condition and as a consequence the bistability property is lost.

Desynchronization stimulation condition C2
In the presence of the desynchronization stimulation (i.e., ∆r av = 0), the condition C2 implies that α 1 (1 − Nw av ) + α 2 < 0. Because α 1 < 0 (see Eq. 27), this condition is satisfied for all w av if α 1 (1 − Nw max ) + α 2 < 0. By introducing the values for α 1 and α 2 , we get: In order to express the output spiking covariance matrix ∆Q as a function of the input spiking covariance matrix ∆C, we need first to define the input-output covariance matrix ∆P: By rewriting the average 〈 〉 ⋅ − y t x t ( ), ( ) τ involved here and by using Eqs 1, 2, and 3, the input-output covariance matrix can be expressed in a self-consistent way as: Here the  superscript denotes as before the convolution with (s), i.e., In order to get an explicit expression of this input-output covariance matrix, we can Fourier transform Eq. 31 and therefore turn the convolution into a product in the angular frequency domain ω: As before the tilde symbol ( )  denotes the Fourier transform, i.e., We recall here for convenience the definition of the output covariance matrix (see Eq. 8): For τ ≥ 0, we have: with D = diag(g(ū)). Since the above identity is only valid for τ ≥ 0, we can not simply Fourier transform this expression and extract the output covariance matrix in frequency domain. We will use here the same method as proposed by Hawkes (1971). Let B(τ) be a supplementary matrix defined as: for the anti-kindling concept. As a first step into that direction, we presented here an analytical analysis of a simple network model which we validated by numerical analysis. Different types of neurons and, hence, different target areas in the brain may be associated with different types of STDP learning rules. Accordingly, brain stimulation approaches might be optimized with respect to a particular type of STDP learning rule in order to achieve superior therapeutic effects.
This work is not the first one to treat STDP in oscillatory recurrent network. Karbowski and Ermentrout (2002) consider neurons as formal phase oscillators and therefore do not consider spiking neurons as such. Furthermore Karbowski and Ermentrout (2002) consider a learning window with identical potentiation and depression time constants which limits the potential output of the study. Maistrenko et al. (2007) describe as well neurons as phase oscillators, but they allow potentiation and depression time constants to be different. Interestingly they find multistability depending on those time constants and on the weight upper bound. Morrison et al. (2007) consider a STDP learning rule with multiplicative depression and power-law dependence for potentiation and showed numerically that this STDP rule decouples synchronous inputs from the rest of the network.
Our present model can be seen as an extension of the work of Gilson et al. (2009a,b) in several aspects. First, and most importantly, we consider here oscillatory input whereas Gilson et al. (2009b) assume stationary inputs. In this way we are able to discuss the conditions on the learning parameters to get a bistable regime with high and low synchrony. Secondly, our approach does not require to calculate the inverse Fourier transform of the spiking covariance matrix. In this way we do not need to make the (unrealistic) assumption that the EPSP time constant is much smaller than potentiation time constant in the learning window. Finally, we consider a larger class of neuronal model since we do not restrict ourself to linear neurons, but consider instead (locally linearized) non-linear neurons.
There are several ways in which we can extend the present model. It is known that propagation delays play an important role in the neuronal synchrony (Gerstner et al., 1996a;Cassenaer and Laurent, 2007;Lubenov and Siapas, 2008). Our framework can be easily extended to incorporate those propagation delays. A systematic analysis of the influence of those delays is out of the scope of the current study but would be definitely relevant. According to the arguments of Lubenov and Siapas (2008), one could expect that the bigger the synaptic delays, the stronger the effective amount of depression. This can potentially change the stability of highly synchronous states.
Another way to extend the current model is to consider learning rules that depend on high order statistics of the pre-and postsynaptic neuron. For example, it is known that a triplet learning rule which considers 1 pre-and 2 postsynaptic spikes (Pfister and Gerstner, 2006a,b;Clopath et al., 2010) is more realistic in terms a reproducing experimental data than the pair-based STDP considered here. Another extension would be to consider second order terms in the Taylor expansion of the transfer function. Both of those extensions would require a substantial amount of work since the results of Hawkes (1971) could not be used anymore.