Integration of Continuous-Time Dynamics in a Spiking Neural Network Simulator

Contemporary modeling approaches to the dynamics of neural networks include two important classes of models: biologically grounded spiking neuron models and functionally inspired rate-based units. We present a unified simulation framework that supports the combination of the two for multi-scale modeling, enables the quantitative validation of mean-field approaches by spiking network simulations, and provides an increase in reliability by usage of the same simulation code and the same network model specifications for both model classes. While most spiking simulations rely on the communication of discrete events, rate models require time-continuous interactions between neurons. Exploiting the conceptual similarity to the inclusion of gap junctions in spiking network simulations, we arrive at a reference implementation of instantaneous and delayed interactions between rate-based models in a spiking network simulator. The separation of rate dynamics from the general connection and communication infrastructure ensures flexibility of the framework. In addition to the standard implementation we present an iterative approach based on waveform-relaxation techniques to reduce communication and increase performance for large-scale simulations of rate-based models with instantaneous interactions. Finally we demonstrate the broad applicability of the framework by considering various examples from the literature, ranging from random networks to neural-field models. The study provides the prerequisite for interactions between rate-based and spiking models in a joint simulation.


INTRODUCTION
Over the past decades, multiple strategies of neural network modeling have emerged in computational neuroscience. Functionally inspired top-down approaches that aim to understand computation in neural networks typically describe neurons or neuronal populations in terms of continuous variables, e.g., firing rates (Hertz et al., 1991; Schöner et al., 2015). Rate-based models originate from the seminal works by Wilson and Cowan (1972) and Amari (1977) and were introduced as a coarse-grained description of the overall activity of large-scale neuronal networks. Being amenable to mathematical analysis and exhibiting rich dynamics such as multistability, oscillations, traveling waves, and spatial patterns (see e.g., Roxin et al., 2005), rate-based models have fostered progress in the understanding of memory, sensory and motor processes including visuospatial working memory, decision making, perceptual rivalry, geometric visual hallucination patterns, ocular dominance and orientation selectivity, spatial navigation, and movement preparation (reviewed in Coombes, 2005; Bressloff, 2012; Kilpatrick, 2015). On the brain scale, rate models have been used to study resting-state activity (Deco et al., 2011) and hierarchies of time scales (Chaudhuri et al., 2015). Ideas from functional network models have further inspired the field of artificial neuronal networks in the domain of engineering (Haykin, 2009).
Simulation of rate-based models goes back to the works by Grossberg (1973), McClelland and Rumelhart (1981), Feldman and Ballard (1982), and the PDP group (Rumelhart et al., 1986). Various specialized tools have been developed since then (O'Reilly, 2014), such as PDP++ (McClelland and Rumelhart, 1989; O'Reilly et al., 2000), the Neural Simulation Language (Weitzenfeld et al., 2002), emergent (O'Reilly et al., 2012), the simulation platform DANA (Rougier and Fix, 2012), TheVirtualBrain (Sanz Leon et al., 2013), Topographica (Bednar, 2009) and the Neural Field Simulator (Nichols and Hutt, 2015). Similarly, efficient simulators for population-density approaches (MIIND: de Kamps et al., 2008; DiPDE: Cain et al., 2016) as well as for spiking neural networks (see Brette et al., 2007 for a review) have evolved. The foci of the latter range from detailed neuron morphology (NEURON: Hines, 2006; GENESIS: Bower and Beeman, 2007) to an abstraction of neurons without spatial extent (NEST: Bos et al., 2015; BRIAN: Goodman and Brette, 2013). Such open-source software, combined with interfaces and simulator-independent languages (Davison et al., 2008; Djurfeldt et al., 2010, 2014), supports maintainability, reproducibility, and exchangeability of models and code, as well as community-driven development. However, these tools are restricted to either rate-based or spike-based models only. The situation underlines that bottom-up and top-down strategies are still mostly disjoint and that a major challenge in neuroscience is to form a bridge between the spike- and rate-based models (Abbott et al., 2016), and, more generally, between the fields of computational neuroscience and cognitive science. From a practical point of view, a common simulation framework would allow the exchange and the combination of concepts and code between the two descriptions and trigger interaction between the corresponding communities.
This is in particular important since recent advances in simulation (Djurfeldt et al., 2008;Hines et al., 2008;Kumar et al., 2010;Hines et al., 2011;Helias et al., 2012;Kunkel et al., 2014) and computing technology (Jülich Supercomputing Centre, 2015;Miyazaki et al., 2012) enable full-density bottom-up models of complete circuits (Potjans and Diesmann, 2014;Markram et al., 2015). In particular, it has become feasible to build spiking models (Schmidt et al., 2016) that describe the same macroscopic system as rate-based descriptions (Chaudhuri et al., 2015).
The relation between the different model classes is one focus of theoretical neuroscience. Assuming homogeneity across neurons, population-density methods reformulate the spiking dynamics as a dynamical equation for the probability density that captures the time evolution of the population activity (Knight, 1972; Gerstner, 1995, 2000). Under certain assumptions allowing the neglect of fluctuations in the input to neurons, a set of coupled differential equations for the population-averaged firing rate and membrane potential can be derived (Montbrió et al., 2015). For asynchronous irregular activity, input fluctuations can be taken into account in a diffusion approximation which leads to Fokker-Planck mean-field theory that can be used to determine homogeneous stationary state activities of spiking networks (Siegert, 1951; Brunel, 2000). The Fokker-Planck ansatz is, however, not limited to the population level, but can yield a heterogeneous stationary state firing rate across individual neurons in the network (Sadeh and Rotter, 2015). The dynamics of rate fluctuations around the background activity can be obtained using linear response theory on the population level (Brunel and Hakim, 1999) or on the level of individual neurons (Lindner et al., 2005; Ostojic and Brunel, 2011; Trousdale et al., 2012; Grytskyy et al., 2013; Schuecker et al., 2015), yielding effective rate models on the population or single-neuron level. An alternative to linear response theory is given by moment expansions for mode decompositions of the Fokker-Planck operator (Mattia and Del Giudice, 2002, 2004; Deco et al., 2008).
An alternative derivation of rate-based dynamics aims at a closure of equations for synaptic currents of spiking networks in a coarse-graining limit by replacing spiking input with the instantaneous firing rate (Bressloff, 2012). Using fieldtheoretical methods (Buice and Chow, 2013) that were originally developed for Markovian network dynamics (Buice and Cowan, 2007;Buice et al., 2010) allows a generalization of this approach to fluctuations in the input (Bressloff, 2015).
In any case, the cascade of simplifications from the original spiking network to the rate-based model involves a combination of approximations which are routinely benchmarked in comparative simulations of the two models. A unified code base that features both models would greatly simplify these validations, rendering duplication of code obsolete.
In many cases rate models represent populations of spiking neurons. Thus, a hybrid model, employing both types of models in a multi-scale modeling approach, would contain a relatively large number of spiking neurons compared to the number of rate units. Despite the large size of the spiking network, the dynamics still features finite-size fluctuations (Ginzburg and Sompolinsky, 1994). As a platform we therefore choose the open source simulation code NEST (Gewaltig and Diesmann, 2007; Bos et al., 2015), which is a scalable software used on machines ranging from laptops to supercomputers. The software is used by a considerable user community, is equipped with a Python interface, supports the construction of complex networks, and shields the neuroscientist from the difficulties of handling a model description, potentially including stochastic components, in a distributed setting (Morrison et al., 2005; Plesser et al., 2015). Within this framework we introduce an iterative numerical solution scheme that reduces communication between compute nodes. The scheme builds on the waveform-relaxation technique (Lelarasmee, 1982) already employed for gap-junction interactions (Hahne et al., 2015).
Our study begins with a brief review of numerical solution schemes for ordinary and stochastic (delay) differential equations in Section 2 and their application to neural networks in Section 2.2. Subsequently, we develop the concepts for embedding rate-based network models into a simulation code for spiking networks, adapt the waveform-relaxation scheme, and detail an extendable implementation framework for rate models in terms of templates (Section 2.3). In Section 3, different numerical schemes are evaluated, as is the scalability of our reference implementation. We illustrate the applicability of the framework to a broad class of network models on the examples of a linear network model (Grytskyy et al., 2013), a nonlinear network model (Sompolinsky et al., 1988; Goedeke et al., 2016), a neural field model (Roxin et al., 2005), and a mean-field description (Wong and Wang, 2006) of the stationary activity in a model of the cortical microcircuit (Potjans and Diesmann, 2014; Schuecker et al., 2017). Straightforward generalizations are briefly mentioned at the end of the Results section, before the work concludes with the Discussion in Section 4. The technology described in the present article will be made available with one of the next major releases of the simulation software NEST as open source. The conceptual and algorithmic work is a module in our long-term collaborative project to provide the technology for neural systems simulations (Gewaltig and Diesmann, 2007).

MATERIALS AND METHODS
Rate-based single neuron and population models are described in terms of differential equations that often include delays and stochastic elements. Before we turn to the implementation of such models in computer code (Section 2.3) we review how such systems are solved mathematically and in particular how the stochastic elements are commonly interpreted, with the aim to avoid an ad-hoc design. A stochastic differential equation (SDE) is defined by the corresponding stochastic integral equation. Let W(t) denote a Wiener process, also called standard Brownian motion. For the initial condition X(t_0) = X_0 an Itô-SDE in its most general form satisfies

X(t) = X_0 + ∫_{t_0}^{t} a(s, X(s)) ds + ∫_{t_0}^{t} b(s, X(s)) dW(s),     (1)

where the second integral is an Itô integral

∫_{t_0}^{t} Y(s) dW(s) := lim_{n→∞} Σ_{k=0}^{n−1} Y(s_k) (W(s_{k+1}) − W(s_k)),  with s_k = t_0 + k (t − t_0)/n,

which approximates Y(s) with its left-sided value. Alternatively, the second integral can be chosen as a Stratonovich integral, indicated by the symbol ∘, which approximates Y(s) with the mid-point rule. In this case, the corresponding SDE is called a Stratonovich-SDE. We refer to Kloeden and Platen (1992) and Gardiner (2004) for a derivation and a deeper discussion of the differences between the two types of stochastic integrals. In the case of additive noise (b(t, X(t)) = b(t)) the Itô and Stratonovich integrals coincide. If furthermore the noise is constant (b(t, X(t)) = σ = const.) the integrals can be solved analytically. In the following, we focus on Itô-SDEs only. The differential notation corresponding to Equation (1) reads

dX(t) = a(t, X(t)) dt + b(t, X(t)) dW(t)     (2)

and denotes an informal way of expressing the integral equation. Another widely used differential notation, called the Langevin form of the SDE, is mostly employed in physics. It reads

dX(t)/dt = a(t, X(t)) + b(t, X(t)) ξ(t),     (3)

where ξ(t) is a Gaussian white noise with ⟨ξ(t)⟩ = 0 and ⟨ξ(t) ξ(t′)⟩ = δ(t − t′). Formally, ξ(t) plays the role of the derivative of the Wiener process, which is a paradox, as one can also show that W(t) is not differentiable (Gardiner, 2004, Chapter 4). Mathematically speaking this means that Equation (3) is not strictly well-defined. The corresponding stochastic integral equation, however, can be interpreted consistently with Equation (1) via dW(t) ≡ ξ(t) dt.

Approximate Numerical Solution of SDEs
Similar to ordinary differential equations, most stochastic differential equations cannot be solved analytically. Neuroscience therefore relies on approximate numerical schemes to obtain the solution of a given SDE. This section presents some basic numerical methods. Let Δt denote the fixed step size, t_k = t_0 + k Δt the grid points of the discretization for k = 0, . . . , n, and X_k the approximation for X(t_k) obtained by the numerical method, where X_0 is the given initial value. We consider systems of N stochastic differential equations

dX(t) = a(t, X(t)) dt + b(t, X(t)) dW(t)     (4)

with initial condition X(t_0) = X_0. Here, X(t) = (X_1(t), . . . , X_N(t)) and W(t) = (W_1(t), . . . , W_N(t)) denote N-dimensional vectors and a and b are N-dimensional functions mapping to R^N. W(t) is an N-dimensional Wiener process, i.e., the components W_i(t) are independent and identically distributed.

Euler-Maruyama
The Euler-Maruyama method is a generalization of the forward Euler method for ordinary differential equations (ODEs). Accordingly, it approximates the integrands in Equation (1) with their left-sided values. The update formula reads

X_{k+1} = X_k + a(t_k, X_k) Δt + b(t_k, X_k) ΔW_k,     (5)

where ΔW_k = W(t_{k+1}) − W(t_k) ∼ N(0, Δt) denotes the Wiener increment.
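To make the scheme concrete, here is a minimal NumPy sketch of the Euler-Maruyama update, applied to an Ornstein-Uhlenbeck process as a test case. All names are illustrative and not part of the NEST reference implementation.

```python
import numpy as np

def euler_maruyama(a, b, x0, t0, t_end, dt, rng):
    """Euler-Maruyama scheme: X_{k+1} = X_k + a(t_k, X_k)*dt + b(t_k, X_k)*dW_k,
    with independent Wiener increments dW_k ~ N(0, dt)."""
    n = int(round((t_end - t0) / dt))
    x = np.empty(n + 1)
    x[0] = x0
    t = t0
    for k in range(n):
        dw = rng.normal(0.0, np.sqrt(dt))  # Wiener increment for this step
        x[k + 1] = x[k] + a(t, x[k]) * dt + b(t, x[k]) * dw
        t += dt
    return x

# Ornstein-Uhlenbeck process dX = -X/tau dt + sigma dW as a test case
rng = np.random.default_rng(42)
tau, sigma = 10.0, 0.5
traj = euler_maruyama(lambda t, x: -x / tau, lambda t, x: sigma,
                      x0=1.0, t0=0.0, t_end=100.0, dt=0.1, rng=rng)
```

With b ≡ 0 the scheme reduces to the forward Euler method, which provides a simple correctness check.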

Semi-Implicit Euler
The (semi-)implicit Euler method is a generalization of the backward Euler method for ODEs. The update formula reads

X_{k+1} = X_k + a(t_{k+1}, X_{k+1}) Δt + b(t_k, X_k) ΔW_k.     (6)

The resulting scheme requires the solution of a system of nonlinear algebraic equations. Standard techniques for the solution of the system are Newton iteration and fixed-point iteration (Kelley, 1995). The method is sometimes called semi-implicit, because the function b is still evaluated at (t_k, X_k) instead of (t_{k+1}, X_{k+1}). However, a fully implicit Euler scheme for SDEs is not practicable (see Kloeden and Platen, 1992, Chapter 9.8) and thus the term implicit Euler usually refers to the semi-implicit method and is used in this sense in the following.
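A single semi-implicit step solved by fixed-point iteration can be sketched as follows for a scalar SDE. The function and parameter names are ours, chosen for illustration; the diffusion term stays explicit as in Equation (6).

```python
def implicit_euler_step(a, b, x, t, dt, dw, tol=1e-10, max_iter=100):
    """One semi-implicit Euler step: solve
    X_{k+1} = X_k + a(t_{k+1}, X_{k+1})*dt + b(t_k, X_k)*dW_k
    by fixed-point iteration, starting from the explicit value X_k."""
    noise = b(t, x) * dw          # diffusion term is evaluated left-sided
    y = x                         # initial guess X^{(0)}_{k+1} = X_k
    for _ in range(max_iter):
        y_new = x + a(t + dt, y) * dt + noise
        if abs(y_new - y) < tol:  # convergence criterion of the iteration
            return y_new
        y = y_new
    return y

# deterministic decay dX = -X dt: implicit Euler gives X_{k+1} = X_k / (1 + dt)
x1 = implicit_euler_step(lambda t, x: -x, lambda t, x: 0.0,
                         x=1.0, t=0.0, dt=0.1, dw=0.0)
```

The iteration converges here because the map has contraction factor dt < 1; for stiff or strongly coupled systems this condition restricts the admissible step size, as discussed in Section 2.3.1.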

Exponential Euler
The exponential Euler method relies on the assumption that a(t, X(t)) consists of a linear part and a nonlinear remainder, i.e., a(t, X(t)) = A · X(t) + f(t, X(t)) with A ∈ R^{N×N}. The idea is to solve the linear part exactly and to approximate the integral of the nonlinear remainder and the Itô integral with an Euler-like approach. Variation of constants for Equation (4) yields

X(t) = e^{A(t−t_0)} X_0 + ∫_{t_0}^{t} e^{A(t−s)} f(s, X(s)) ds + ∫_{t_0}^{t} e^{A(t−s)} b(s, X(s)) dW(s).

There are several versions of stochastic exponential Euler methods that differ in the approximation of the integrals. Unfortunately a standardized nomenclature to distinguish the methods is so far missing. The simplest approach, sometimes named the stochastic Lawson-Euler scheme (e.g., Komori and Burrage, 2014), approximates the integrands with their left-sided values:

X_{k+1} = e^{AΔt} X_k + e^{AΔt} f(t_k, X_k) Δt + e^{AΔt} b(t_k, X_k) ΔW_k.

More advanced schemes approximate the nonlinear part by keeping f(s, X(s)) constant on [t_k, t_{k+1}) and solving the remaining integral analytically:

∫_{t_k}^{t_{k+1}} e^{A(t_{k+1}−s)} ds f(t_k, X_k) = A^{−1} (e^{AΔt} − I) f(t_k, X_k).

Here I denotes the N × N identity matrix. The same technique can be used for the Itô integral. For a single SDE, Shoji (2011) proposed a method where the remaining integral ∫_{t_k}^{t_{k+1}} e^{a(t_{k+1}−s)} dW(s) with a ∈ R is approximated by ∫_{t_k}^{t_{k+1}} α dW(s), such that α ∈ R is chosen to minimize the mean-square error. This results in a similar approximation as for the nonlinear part. Komori and Burrage (2014) adapted this approach for systems of SDEs. The scheme reads

X_{k+1} = e^{AΔt} X_k + A^{−1} (e^{AΔt} − I) f(t_k, X_k) + (AΔt)^{−1} (e^{AΔt} − I) b(t_k, X_k) ΔW_k.     (7)

Alternatively, calculating the variance of X(t) within the approximation (7) amounts to (Adamu, 2011)

Var[X_{k+1}] = (2A)^{−1} (e^{2AΔt} − I) b(t_k, X_k) b(t_k, X_k)^T,

stated here for the case where A and b b^T commute, e.g., diagonal A. The corresponding scheme reads

X_{k+1} = e^{AΔt} X_k + A^{−1} (e^{AΔt} − I) f(t_k, X_k) + √((2A)^{−1} (e^{2AΔt} − I)) b(t_k, X_k) η_k     (8)

with η_k ∼ N(0, 1) and yields the exact solution of the system if a(t, X(t)) = A · X(t) and b(t, X(t)) = const., since X(t) has Gaussian statistics in this case (Risken, 1996). Therefore, in the following we exclusively employ Equation (8) and refer to it simply as the stochastic exponential Euler scheme. For more detailed reviews of the different stochastic exponential Euler methods we refer to Adamu (2011) and Komori and Burrage (2014).
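For the scalar case A = −1/τ with additive noise, which is the case relevant for the rate models below, the stochastic exponential Euler step reduces to elementary operations. The following sketch uses our own names and is not part of the reference implementation; it propagates the mean with the exact propagator and draws the noise with the exact variance of the Itô integral.

```python
import numpy as np

def exponential_euler_step(x, f_val, tau, sigma, dt, eta):
    """One scalar stochastic exponential Euler step for
    tau dX = (-X + f) dt + sqrt(tau) * sigma dW   (i.e., A = -1/tau).
    For constant f the step reproduces the exact Gaussian solution."""
    prop = np.exp(-dt / tau)                 # exact linear propagator
    mean = prop * x + (1.0 - prop) * f_val   # drift part, f held constant
    # exact standard deviation of the Ito integral over one step
    std = sigma * np.sqrt((1.0 - np.exp(-2.0 * dt / tau)) / 2.0)
    return mean + std * eta                  # eta ~ N(0, 1)

# deterministic check: pure decay over one time constant
x_det = exponential_euler_step(1.0, 0.0, tau=1.0, sigma=0.0, dt=1.0, eta=0.0)
```

For dt → ∞ the noise amplitude approaches the stationary standard deviation σ/√2, consistent with the variance σ²/2 of the rate models in Section 2.2.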

Network of Rate Models
We now consider networks of N rate-based units where each unit receives recurrent input from the network. The system fulfills the Itô-SDEs

τ_i dX_i(t) = [−X_i(t) + µ_i + φ(Σ_{j=1}^{N} w_ij ψ(X_j(t − d_ij)))] dt + √τ_i σ_i dW_i(t)     (9)

with possibly nonlinear input-functions φ(x) and ψ(x), connection weights w_ij, mean input µ_i, and optional delays d_ij ≥ 0. The corresponding Fokker-Planck equation shows that the parameter σ_i ≥ 0 controls the variance of X_i(t) and the time constant τ_i > 0 its temporal evolution. For readability, from here on we omit unit indices for σ, τ, µ, and d. The considered class of rate models only contains additive noise. Therefore, as noted above, the system (Equation 9) can be written as Stratonovich-SDEs without the need for a change in the employed numerical methods. For illustrative purposes we explicitly state the different explicit solution schemes for the network dynamics (Equation 9) with d = 0. The Euler-Maruyama update step reads

X_i^{k+1} = X_i^k + (Δt/τ) [−X_i^k + µ + φ(Σ_j w_ij ψ(X_j^k))] + σ √(Δt/τ) η_i^k.     (10)

The implicit Euler update formula evaluates X at k + 1 instead of k within the square brackets. This turns Equation (10) into a system of nonlinear algebraic equations, for which the fixed-point iteration

X_i^{[m+1], k+1} = X_i^k + (Δt/τ) [−X_i^{[m], k+1} + µ + φ(Σ_j w_ij ψ(X_j^{[m], k+1}))] + σ √(Δt/τ) η_i^k     (11)

with initial value X_i^{[0], k+1} = X_i^k yields the solution. For nonlinear φ(x) or ψ(x) the exponential Euler update step is

X_i^{k+1} = e^{−Δt/τ} X_i^k + (1 − e^{−Δt/τ}) [µ + φ(Σ_j w_ij ψ(X_j^k))] + σ √((1 − e^{−2Δt/τ})/2) η_i^k     (13)

with η_i^k ∼ N(0, 1). As A = −I is a diagonal matrix, the exponential Euler scheme does not rely on a matrix exponential, but decomposes into N equations with scalar exponential functions. Note that with a linear choice, φ(x) = ψ(x) = x, the system of SDEs can be written in matrix notation

τ dX(t) = [A X(t) + µ] dt + √τ σ dW(t)     (14)

with A = −I + W and W = (w_ij)_{N×N}. Here the stochastic exponential Euler scheme (Equation 8) yields the exact solution of the system. The numerical schemes presented in Section 2.1 are developed for SDEs (d = 0), but can analogously be used for stochastic delay differential equations (SDDEs) (d > 0), if the delay d is a multiple of the step size Δt.
For the calculation of the approximation X_i^{k+1} in time step k + 1, the recurrent input is then evaluated from d/Δt steps earlier, i.e., from X_j^{k − d/Δt} for the explicit methods.
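The exponential Euler update for the network can be sketched as follows for a small example, assuming ψ = id, no delays (d = 0), and our own function names; this is an illustration of the update formula, not the reference implementation.

```python
import numpy as np

def network_exp_euler_step(x, W, mu, tau, sigma, h, phi, rng):
    """One exponential Euler step for the rate network
    tau dX_i = (-X_i + mu + phi(sum_j w_ij X_j)) dt + sqrt(tau) sigma dW_i,
    with psi = identity and no delays."""
    prop = np.exp(-h / tau)
    drive = mu + phi(W @ x)                 # recurrent input, left-sided value
    noise = sigma * np.sqrt((1.0 - np.exp(-2.0 * h / tau)) / 2.0)
    return prop * x + (1.0 - prop) * drive + noise * rng.standard_normal(x.size)

# two mutually coupled units; with sigma = 0 the network relaxes to its
# deterministic fixed point x* = mu + tanh(0.2 * x*)
rng = np.random.default_rng(1)
W = np.array([[0.0, 0.2], [0.2, 0.0]])
x = np.zeros(2)
for _ in range(1000):
    x = network_exp_euler_step(x, W, mu=1.0, tau=10.0, sigma=0.0, h=0.1,
                               phi=np.tanh, rng=rng)
```

Because A = −I is diagonal, each unit only needs the scalar propagator e^{−h/τ}, which is what makes the scheme compatible with a distributed update (Section 2.3.1).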

Implementation in Spiking Network Simulation Code
This section describes the embedding of rate-based models (Section 2.2) in a simulation code for spiking neuronal networks. The Appendix (Section A.2) illustrates how to create, connect and record activity from rate models in our reference implementation.
The software architecture for rate models is based on existing concepts: Morrison et al. (2005) describe distributed buffers for the storage of delayed interactions and the technique to consistently generate random numbers in a distributed setting, and Hahne et al. (2015) introduce so-called secondary events, which allow the communication of continuous state variables, such as membrane potentials or rates, between neurons or rate units, respectively. Events provide an abstraction layer on top of the MPI communication which allows the implementation of neuron models without explicit reference to MPI calls. Unlike primary events, which are used to transmit the occurrence of spikes at discrete points in time, secondary events occur on a regular time grid. These concepts are designed to be compatible with the parallel and distributed operation of a simulation kernel for spiking neuronal networks, ensuring an efficient use of clusters and supercomputers. This allows researchers to easily scale up network sizes to more realistic numbers of neurons. The highly parallelizable structure of modern simulation codes for spiking neuronal networks, however, also poses restrictions on the utilizable numerical methods.

Parallelization and Restrictions
Parallelization for spiking neuronal networks is achieved by distributing neurons over compute nodes. Since the dynamics of spiking neurons (in the absence of gap junctions) is decoupled for the duration of the minimal synaptic delay d_min of the connections in the network, the states of the neurons can be propagated independently for this time interval. Thus, it is sufficient to specify solvers on the single-neuron level. The spike times, i.e., the mediators of interaction between neurons, are then communicated in steps of d_min.
As a result of this structure the global connectivity of the network is unknown to the single neuron. The neuron object sends and receives events handled by an object on the compute node harboring the neuron termed network manager. However, the network manager only knows the incoming connections of the neurons on the compute node.
This poses restrictions on the use of implicit schemes. It is impossible to employ the implicit Euler scheme (Equation 6) with Newton iteration, which would require the simultaneous solution of a system of nonlinear algebraic equations with information distributed over all compute nodes. The use of the implicit Euler scheme with fixed-point iteration is, however, compatible with this structure. To this end, the scheme (Equation 6) needs to be formulated as a fixed-point iteration on the single-unit level (see Section 2.2) and the updated rates need to be communicated to the connected units after every iteration until some convergence criterion is met. The convergence of the fixed-point iteration is only guaranteed if the scheme is contractive (see e.g., Kelley, 1995, Section 4.2), which poses restrictions on the employed step size Δt. Section 3.1 investigates whether the implementation can gain stability or accuracy from using the implicit Euler method with fixed-point iteration and whether the payoff is large enough to justify the additional effort of an iterative solution scheme.
The restricted knowledge of connectivity also limits the usage of the exponential Euler method. In the case of a linear rate model, we are unable to add the influence from all other rate units to the matrix A in Equation (14), because most of these connections are unknown at the single-unit level. Therefore, we use the exponential Euler method with A = −I resulting in the update formula (13). This also has the benefit of avoiding the need to numerically evaluate a general matrix exponential as A is a diagonal matrix (see Section 2.2 for details).

Implementation
This section describes the additional data structures required for the implementation of rate-based models. While the naming convention refers to our reference implementation in the simulation software NEST, the employed algorithms and concepts are portable to other parallel spiking network simulators. As a result of the previous section and our analysis of the numerical schemes below (see Section 3.1) we restrict the discussion and our reference implementation to the exponential Euler method, where we assume A = −I and identify Δt = h, with h denoting the global computation step size (Morrison et al., 2005). We have to distinguish the cases of connections with delay (d > 0) and connections without delay (d = 0). The former case is similar to spiking interaction: assuming a connection from unit i to unit j, the rate of unit i needs to be available at unit j after d/h additional time steps. This can be ensured if the delay of the connection is considered in the calculation of the minimal delay d_min that determines the communication interval. After communication the rate values are stored in a ring buffer of unit j until they are due (Morrison and Diesmann, 2008). In the case of an instantaneous connection, the rate of unit i at time t_0 needs to be known at time t_0 at the process which updates unit j from t_0 to t_0 + h. Therefore, communication in every step is required for instantaneous rate connections, i.e., setting d_min = h.
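The bookkeeping for delayed rate values can be illustrated with a minimal ring buffer sketch in the spirit of Morrison and Diesmann (2008). It is deliberately simplified relative to NEST's actual data structures; the class and method names are ours.

```python
class RateRingBuffer:
    """Per-unit ring buffer for delayed rate input: a rate arriving over a
    connection with delay of `delay_steps` time steps is read out exactly
    that many steps after it was written."""
    def __init__(self, max_delay_steps):
        self.buf = [0.0] * (max_delay_steps + 1)
        self.head = 0  # slot holding the input due in the current time step

    def add(self, delay_steps, rate):
        # accumulate input that becomes due delay_steps from now
        self.buf[(self.head + delay_steps) % len(self.buf)] += rate

    def read_and_advance(self):
        # consume the input due in the current step, then rotate the buffer
        value = self.buf[self.head]
        self.buf[self.head] = 0.0
        self.head = (self.head + 1) % len(self.buf)
        return value
```

A value written with `add(2, 5.0)` is returned by the third call to `read_and_advance`, i.e., two update steps later, matching the delayed-interaction semantics described above.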
Due to the conceptual differences between instantaneous and delayed interactions (for the conceptual difference in the case of spiking interaction see Morrison and Diesmann, 2008) we employ two different connection types (delay_rate_connection, rate_connection) and associated secondary events (DelayRateNeuronEvent, RateNeuronEvent). This subdivision simplifies the discrimination of connections on the single-unit level, while still allowing for simultaneous use of instantaneous and delayed connections in the same rate model.
The large diversity of rate models (Equation 9) imposes a challenge for code maintenance and efficiency: Each combination of nonlinearities φ(x) and ψ(x) constitutes its own model. All of these models can be implemented in the exact same way, except for the evaluation of the nonlinearities. A template class (rate_neuron_ipn) providing a base implementation for rate models of category (9) avoids code duplication. Nevertheless, we restrict the reference implementation to one nonlinearity per model. This keeps the zoo of rate models small and to our knowledge covers the majority of rate models.
The template rate model class is instantiated with an object that represents the nonlinearity. Being instantiated at compile time, this template solution does not incur additional overhead at run time compared to a solution using polymorphism (inheritance). A boolean class member linear_summation determines whether the nonlinearity should be interpreted as φ(x) (true, default value) or ψ(x) (false). The respective other function is assumed to be the identity function. The boolean parameter is evaluated in every update step of each unit. Deciding upon the type of nonlinearity at compile time would improve efficiency. In the present architecture this would, however, result in twice as many template instances for a given set of gain functions. With the future capabilities of code generation (Plotnikov et al., 2016) in mind it might be beneficial to elevate the constant boolean member object to a constant template parameter to allow compilers efficient preprocessing and at the same time profit from the code reliability achievable by modern C++ syntax. The present base implementation reduces the effort of creating a specific rate model of category (9) to the specification of an instance of the template class. Afterwards an actual rate model can be instantiated in a single line of code. Table 1 gives an overview of template-derived rate models of the reference implementation. These models serve as examples for customized rate models. Activity of rate units can be recorded using the multimeter and the recordable rate.
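The effect of the gain-function template parameter and the linear_summation flag can be mimicked in Python as follows. This is only a sketch mirroring the concept; the actual implementation is a C++ template class, and all names apart from linear_summation are ours.

```python
import numpy as np

class RateUnitIPN:
    """Python analogue of the rate_neuron_ipn concept: the gain function is
    supplied at construction, and linear_summation selects whether it acts
    as phi (on the summed input) or as psi (on each input before summation);
    the respective other function is the identity."""
    def __init__(self, gain, linear_summation=True, tau=10.0, mu=0.0, sigma=0.0):
        self.gain = gain
        self.linear_summation = linear_summation
        self.tau, self.mu, self.sigma = tau, mu, sigma
        self.rate = 0.0

    def recurrent_input(self, weights, rates):
        weights, rates = np.asarray(weights), np.asarray(rates)
        if self.linear_summation:              # phi(sum_j w_ij X_j)
            return self.gain(weights @ rates)
        return float(np.sum(weights * self.gain(rates)))  # sum_j w_ij psi(X_j)

    def update(self, h, weights, rates, eta=0.0):
        # scalar exponential Euler step with A = -1/tau (see Section 2.2)
        prop = np.exp(-h / self.tau)
        drive = self.mu + self.recurrent_input(weights, rates)
        noise = self.sigma * np.sqrt((1.0 - np.exp(-2.0 * h / self.tau)) / 2.0)
        self.rate = prop * self.rate + (1.0 - prop) * drive + noise * eta
        return self.rate
```

Evaluating the flag per update step, as here, corresponds to the run-time branch discussed above; resolving it at compile time would remove the branch at the cost of doubling the number of template instances.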
Table 1. Gain functions of the rate-based models available in the NEST reference implementation. The name of a particular rate model is formed by <gain model>_ipn. The ending ipn indicates input noise, as the noise directly enters the right-hand side of Equation (9). H denotes the Heaviside function.
In addition to these template-derived models of category (9) the reference implementation also contains a rate model called siegert_neuron. This model, described by Equation (30) in Section 3.3.4, is used for mean-field analysis of complex networks and constitutes a special case with respect to the recurrent input from the network. Firstly, it requires a numerically stable implementation of the Siegert formula (see Section A.1). Secondly, Equations (28) and (29) demonstrate that for this model the input rates are weighted by separate factors. Thus, for connections between instances of this model two different weights need to be specified and the rate model must be able to handle this anomaly. Therefore, the siegert_neuron does not derive from our base class rate_neuron_ipn, but constitutes an independent class. It comes with the connection type diffusion_connection providing the weight parameters. Section A.2 motivates the parameter names and shows the usage of the model in the reference implementation.

Reduction of Communication Using Waveform-Relaxation Techniques
Instantaneous connections between rate-based models require communication after every time step, thus in intervals of the global computation step size h. This requires setting d_min = h and impairs the performance and scalability, especially on supercomputers where communication is particularly expensive, because it is associated with a considerable latency. Therefore, for simulations with instantaneous connections we additionally study an iterative approach based on waveform-relaxation techniques that arrives, after convergence, at the same results as the standard approach, but allows us to use communication on a coarser time grid. Originally waveform-relaxation methods were developed (Lelarasmee, 1982) and investigated (see e.g., Miekkala and Nevanlinna, 1987) for ODEs. More recently they have also been analyzed for SDEs (Schurz and Schneider, 2005) and successfully applied to large systems of SDEs (Fan, 2013). With respect to simulations of rate-based models with instantaneous connections the basic idea is to solve the dynamics of the single units independently for the duration of T = αh with α ≥ 2 by treating the influence of the other units as known over this period of time. The solution of the original SDEs is determined by iteratively solving the decoupled SDEs

τ dX_i^{(m)}(t) = [−X_i^{(m)}(t) + µ + φ(Σ_j w_ij ψ(X_j^{(m−1)}(t)))] dt + √τ σ dW_i(t),

where the recurrent input of iteration m is based on the solution of the (m − 1)-th iteration and hence acts as a given input to the i-th system. The solutions improve step by step with each iteration. Schurz and Schneider (2005) demonstrate the convergence of the method for SDEs under mild assumptions on X, a and b.

Table 2. The different parameters of the waveform-relaxation algorithm together with their C++ data type, default value, and a brief description.

For our specific application, systems of rate-based models with b = const.,
the influence of the stochastic part on the convergence is even less critical. Based on our previous results in the neuroscience context (Hahne et al., 2015), the method converges within only a couple of iterations, due to the weak coupling of the SDEs. For more details on waveform-relaxation methods and their application in the neuroscience context we refer to Hahne et al. (2015). As outlined above, in a simulator for spiking neuronal networks the minimal delay d_min in the network defines the communication interval. By employing the waveform-relaxation method with T = d_min we retain this communication interval for simulations with instantaneous rate connections. To control the iteration interval T of the waveform-relaxation method, instantaneous connections contribute to the calculation of the minimal delay with an arbitrary user-specified value given by the parameter wfr_comm_interval (see Table 2). Consequently the actual communication interval for waveform relaxation is T = min(d_min, wfr_comm_interval). Figure 1B illustrates the concept of the iterative approach in contrast to the standard procedure in panel A. The iterative approach requires the repeated solution of all time steps in the communication interval and converges to the solution obtained with the standard approach (Figure 1A). The iteration terminates when a user-chosen convergence tolerance wfr_tol (see Table 2) is met. If the method needs fewer than T/h iterations, the approach reduces the overall number of communications required to obtain the solution. In conclusion, the avoidance of communication in every step comes at the price of additional computational load.
The coupling of neurons via gap junctions is instantaneous and continuous in time and thus constitutes a problem very similar to the rate dynamics. In order to combine gap junctions with spiking dynamics, Hahne et al. (2015) already devised an iterative scheme based on waveform-relaxation techniques and described a suitable framework. This framework can also be employed for the simulation of rate-based models with instantaneous connections. The dynamics of a neuron model supporting gap junctions is solved with an adaptive step-size ODE solver, routinely carrying out several steps of the employed numerical method within one global computation time step h. The communication of a cubic interpolation of the membrane potential provides the solver with additional information, resulting in a more accurate solution than the one obtained from the standard approach. For rate-based models this additional benefit cannot be gained: the combination of an iterative method with an adaptive step-size solver is not applicable to SDEs, where the noise in each time step constitutes a random number. However, an iterative approach with fixed step size Δt = h is applicable, as long as we ensure that the random noise applied to the units remains the same in every iteration. In Section 3.2 we investigate the performance of the iterative (Figure 1B) and the standard approach (Figure 1A) with a focus on large network simulations on supercomputers. In our reference implementation waveform relaxation can be enabled or disabled by the parameter use_wfr. Note that in the traditional communication scheme for spiking neuronal networks (Morrison et al., 2005) the first communication occurs earliest at the end of the first update step. Therefore, in the absence of waveform relaxation, the initial input to units from the network is omitted. Table 2 summarizes the parameters of our reference implementation of the waveform-relaxation technique.
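The waveform-relaxation iteration can be sketched for the deterministic part of the dynamics (σ = 0, ψ = id, no delays) as follows. The names are ours, and the tolerance and iteration-limit parameters only loosely mirror wfr_tol and wfr_max_iterations; within one communication interval, each sweep propagates every unit using the other units' trajectories from the previous sweep.

```python
import numpy as np

def waveform_relaxation(W, x0, mu, tau, h, steps, phi, tol=1e-4, max_iter=20):
    """Waveform relaxation for tau dx_i/dt = -x_i + mu + phi(sum_j w_ij x_j)
    over a communication interval of `steps` time steps of size h.
    Iterates until successive trajectories differ by less than tol."""
    traj = np.tile(x0, (steps + 1, 1))   # initial guess: constant trajectories
    for m in range(max_iter):
        new = np.empty_like(traj)
        new[0] = x0
        for k in range(steps):
            # recurrent input taken from the previous sweep's trajectory
            drive = mu + phi(W @ traj[k])
            new[k + 1] = new[k] + (h / tau) * (-new[k] + drive)
        if np.max(np.abs(new - traj)) < tol:
            return new, m + 1            # converged after m + 1 sweeps
        traj = new
    return traj, max_iter

sol, n_iter = waveform_relaxation(np.array([[0.0, 0.1], [0.1, 0.0]]),
                                  np.array([0.0, 1.0]), mu=1.0, tau=10.0,
                                  h=0.1, steps=20, phi=np.tanh, tol=1e-10)
```

At convergence the fixed point of the sweep coincides with the trajectory obtained by stepping the coupled system directly; for weakly coupled units only a few sweeps are needed, so communication after every step is replaced by one exchange per sweep.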
A subset (wfr_interpolation_order, wfr_max_iterations, wfr_tol) was previously introduced by Hahne et al. (2015), but we rename them here to arrive at more descriptive names. The remaining parameters (use_wfr, wfr_comm_interval) result from the generalization to rate-based models.
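To make the scheme concrete, the following minimal sketch contrasts per-step communication with a Jacobi-type waveform relaxation on a small linear rate network with frozen noise. This is an illustration only, not the NEST implementation; the function names and the simple Euler discretization are our own choices, and wfr_tol and wfr_max_iterations merely mirror the parameter names of Table 2.

```python
import numpy as np

def standard_solve(W, x0, noise, tau=1.0, h=0.1):
    """Euler scheme with 'communication' of all rates in every time step."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for eta in noise:
        x = x + (h / tau) * (-x + W @ x) + eta
        traj.append(x.copy())
    return np.array(traj)

def wfr_solve(W, x0, noise, tau=1.0, h=0.1, wfr_tol=1e-12, wfr_max_iterations=100):
    """Jacobi waveform relaxation over one communication interval of len(noise)
    steps: the coupling term W @ x is taken from the previous sweep, so units
    would exchange data only once per interval instead of once per step."""
    steps = len(noise)
    prev = np.tile(np.asarray(x0, dtype=float), (steps + 1, 1))
    for sweep in range(1, wfr_max_iterations + 1):
        cur = [np.asarray(x0, dtype=float)]
        for k in range(steps):
            x = cur[-1]
            # frozen noise: the same noise[k] is reused in every sweep
            cur.append(x + (h / tau) * (-x + W @ prev[k]) + noise[k])
        cur = np.array(cur)
        if np.max(np.abs(cur - prev)) < wfr_tol:
            return cur, sweep
        prev = cur
    return prev, wfr_max_iterations
```

Because the noise is held fixed across sweeps, the iteration converges to exactly the trajectory produced with communication in every step, as required in Section 2.3.3.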

RESULTS
In the following, we assess the accuracy and stability of the different numerical solution schemes and benchmark the performance of the reference implementation on large-scale machines, with special focus on scalability and the comparison between the standard solution and the iterative approach using waveform relaxation for simulations with instantaneous connections. The iterative approach is only discussed with respect to efficiency, as the iteration always converges to the results of the standard approach within only a couple of iterations (for details see Section 2.3.3). The remainder of the section illustrates the application of the simulation framework to a selection of prominent problems in the neuroscientific literature.

Stability and Accuracy of Integration Methods
In this section we investigate numerical methods (see Section 2.1) for the solution of SDEs that can be employed to solve the dynamics of rate-based units. We analyze the accuracy and stability of the different numerical methods to choose the best-suited method for application in the distributed simulation scheme of a spiking network simulator. The analysis only covers methods compatible with such a simulator, namely (i) the Euler-Maruyama method, (ii) the implicit Euler method solved with a parallelizable fixed-point iteration, and (iii) the exponential Euler method with the linear part restricted to −I, in the following called the scalar exponential Euler. The distributed representation of the global connectivity of the network rules out both the regular exponential Euler method with a non-diagonal matrix A and the implicit Euler method solved with Newton iteration (see Section 2.3.1 for details).
Consider an exactly solvable network of N linear rate units with µ = 0 (see also Section 2.2): the exact solution of this system of SDEs coincides with the regular exponential Euler scheme and involves a matrix exponential and a matrix square root (Equation 8). We analyze two test cases, i.e., two different choices of A, to demonstrate different stability constraints. First, an all-to-all connected network with inhibitory connections of weight w_ij = −1/√N and hence A = −I − (1/√N) · 1, with 1 denoting an N × N all-ones matrix (Cichocki et al., 2009). Second, a sparse balanced excitatory-inhibitory network in which the number of excitatory units is four times the number of inhibitory units. In this network, each unit receives input from a fixed number of 0.8 · p · N excitatory and 0.2 · p · N inhibitory randomly chosen source units, with connection probability p and connection weights 1/√N and −4/√N, respectively. In the following we refer to these test cases as the inhibitory all-to-all and the sparse balanced e/i test case.
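The two connectivity matrices can be constructed as follows; this is a sketch in NumPy with helper names of our own choosing, and the leak term −I is already included in A.

```python
import numpy as np

def inhibitory_all_to_all(N):
    """A = -I - (1/sqrt(N)) * 1, with 1 the N x N all-ones matrix."""
    return -np.eye(N) - np.ones((N, N)) / np.sqrt(N)

def sparse_balanced_ei(N, p=0.1, seed=0):
    """Fixed in-degree: each unit receives 0.8*p*N excitatory inputs of weight
    1/sqrt(N) and 0.2*p*N inhibitory inputs of weight -4/sqrt(N); the leak -I
    is added on the diagonal. Source units are drawn without replacement."""
    rng = np.random.default_rng(seed)
    n_exc, n_inh = int(0.8 * N), int(0.2 * N)
    k_exc, k_inh = int(0.8 * p * N), int(0.2 * p * N)
    W = np.zeros((N, N))
    for i in range(N):
        exc = rng.choice(n_exc, size=k_exc, replace=False)           # units 0..n_exc-1
        inh = n_exc + rng.choice(n_inh, size=k_inh, replace=False)   # remaining units
        W[i, exc] = 1.0 / np.sqrt(N)
        W[i, inh] = -4.0 / np.sqrt(N)
    return -np.eye(N) + W
```

The balance is visible in the row sums: the excitatory and inhibitory contributions cancel, so every row of A sums to −1, the contribution of the leak alone.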
First we turn to the accuracy analysis. Although the exact solution of Equation (16) cannot be obtained with a distributed representation of A, we can compute it using methods for numerical matrix computations implemented in MATLAB or Python (both provide an implementation of the same state-of-the-art algorithms, see Al-Mohy and Higham, 2009; Deadman et al., 2012). This way we obtain the exact solution within the accuracy of floating point numerics. This is the reference for computing the root mean square error (RMSE) of the different approximate methods. To employ the root mean square error in the context of stochastic differential equations, we determine the reference solution for every tested step size and use the same random numbers for both the reference solution and the approximate schemes. Figure 2 shows the root mean square error of the different numerical schemes for the two test cases with N = 400 units. In both test cases all investigated methods converge to the exact solution with convergence order 1 as the step size decreases, which is consistent with the established theory for SDEs with additive noise (Kloeden and Platen, 1992). Figure 2A shows the results for the inhibitory all-to-all test case. Here all three methods require a step size Δt ≤ 0.1 to deliver reliable results. The implicit Euler scheme solved with fixed-point iteration even requires a step size Δt ≤ 0.05. Within the stable region Δt ≤ 0.1 the scalar exponential Euler yields more accurate results than the two other methods. Figure 2B shows the results for the sparse balanced e/i test case. Here all three methods achieve almost identical accuracy for Δt ≤ 0.1, and stability problems occur only for the Euler-Maruyama method for step sizes Δt > 0.5. For step sizes Δt > 0.1 the implicit Euler method is more accurate than the other two methods.
To understand the stability issues shown in Figure 2 we turn to stability analysis. In the following we assume that A is diagonalizable, i.e., A = T^(−1) D T with T = (t_ij) ∈ C^(N×N) and D = diag(λ_1, ..., λ_N), and transform the system of SDEs with Z(t) = T X(t). It follows that τ dZ(t) = D Z(t) dt + √τ σ T dW(t) with Z(t_0) = T X_0. The transformed system consists of N equations of the form τ dZ_i(t) = λ_i Z_i(t) dt + √τ σ Σ_j t_ij dW_j(t) that depend on the eigenvalues λ_i of A and are pairwise independent except for the common contributions of the Wiener processes W_j(t). In the following we only consider networks with bounded activity, which requires eigenvalues λ_i ∈ C with negative real part Re(λ_i) < 0. For two different initial values Z_i^0 and Ẑ_i^0 the solution of the i-th transformed equation then satisfies |Z_i(t) − Ẑ_i(t)| = e^(Re(λ_i)(t−t_0)/τ) |Z_i^0 − Ẑ_i^0| < |Z_i^0 − Ẑ_i^0| for t > t_0, since the common noise contributions cancel (Equation 18). It is a desirable stability criterion that a numerical method applied to Equation (16) conserves this property. This requirement is closely related to the concept of A-stability for SDEs (see Kloeden and Platen, 1992, Chapter 9.8) and A- respectively B-stability for ODEs (Hairer and Wanner, 1991). To derive stability conditions for the numerical schemes, we apply one update step (t_0 to t_1 = t_0 + Δt) of the methods to Equation (17) and investigate under which circumstances the property |Z_i^1 − Ẑ_i^1| < |Z_i^0 − Ẑ_i^0| is conserved. Here Z_i^1 denotes the approximation to the exact solution Z_i(t_1) obtained by the numerical scheme. A straightforward calculation shows that the Euler-Maruyama method retains the property if |1 + λ_i · Δt/τ| < 1 holds. We conclude that the Euler-Maruyama scheme is stable for max_{λ_i} ζ_EM(λ_i) < 1 with ζ_EM(λ_i) = |1 + λ_i · Δt/τ|. The scalar exponential Euler method demands a splitting of A = −I + W into −I and W. Here the stability condition is derived from the modified transformed system τ dZ_i(t) = (−1 + λ̂_i) Z_i(t) dt + √τ σ Σ_j t̂_ij dW_j(t), where the λ̂_i are the eigenvalues of the matrix W. The scalar exponential Euler method conserves the property (Equation 18) if |e^(−Δt/τ) + λ̂_i · (1 − e^(−Δt/τ))| < 1.
We conclude that the scalar exponential Euler scheme is stable for max_{λ̂_i} ζ_EE(λ̂_i) < 1 with ζ_EE(λ̂_i) = |e^(−Δt/τ) + λ̂_i · (1 − e^(−Δt/τ))|. The implicit Euler method solved with fixed-point iteration is stable given the convergence of the fixed-point iteration. For the transformed system the employed fixed-point iteration on the single-unit level (Equation 12) converges if the scheme is contractive, i.e., if the iterates for two different starting values approach each other. It follows that the fixed-point iteration converges if |(Δt/τ)/(1 + Δt/τ) · λ̂_i| < 1. Thus, the implicit Euler method solved with fixed-point iteration is stable for max_{λ̂_i} ζ_IE(λ̂_i) < 1 with ζ_IE(λ̂_i) = |(Δt/τ)/(1 + Δt/τ) · λ̂_i|. Hence for all investigated methods the stability depends on the eigenvalues of the matrix A, the time constant τ, and the step size Δt. To derive restrictions on the step size Δt we analyze the eigenvalues of A for our examples.
For the inhibitory all-to-all test case we determine the eigenvalues λ_1 = −1 − √N and λ_2 = ... = λ_N = −1 of the matrix A = −I − (1/√N) · 1 analytically. It follows that the Euler-Maruyama scheme satisfies the stability criterion for Δt ≤ 2τ/(√N + 1), the scalar exponential Euler method demands Δt ≤ −τ ln((√N − 1)/(√N + 1)), and the implicit Euler method with fixed-point iteration requires Δt ≤ τ/(√N − 1). For the example of 400 units with τ = 1 in Figure 2A this yields step size restrictions of Δt ≤ 2/21 for the Euler-Maruyama method, Δt ≤ −ln(19/21) ≈ 0.1 for the scalar exponential Euler method, and Δt ≤ 1/19 for the implicit Euler method. This is consistent with the numerically obtained results (see vertical lines). For all methods the stability criterion implies that the step size Δt needs to be reduced with increasing network size N or decreasing time constant τ. This fully connected network, however, constitutes the worst case for the class of rate-based models (Equation 9), as the absolute value of the negative eigenvalue quickly increases with the number of units N. Our second test case, the sparse balanced e/i network, does not suffer from this problem (Rajan and Abbott, 2006), as it is a perfectly balanced network of excitatory and inhibitory units. With connection weights scaled as 1/√N, the spectral radius of A and therefore the subsequent stability analysis is independent of N. Here the stability analysis is more complicated, as most of the eigenvalues of A are complex and need to be computed numerically. Figure 3A shows the eigenvalues of A for a network of 2000 units. Figure 3B demonstrates that for this test case the scalar exponential Euler method and the implicit Euler method are stable regardless of the step size Δt. For the Euler-Maruyama method the step size is restricted to Δt < 1.2τ. This is again consistent with the results obtained in Figure 2B, where τ = 0.5 and therefore the stability criterion of the Euler-Maruyama method yields Δt < 0.6.
The spectral radius of A is reduced compared to the fully connected inhibitory network and is determined by the sparseness and the imbalance between excitation and inhibition. Moreover, the matrix A contains eigenvalues that constitute a cloud in the complex plane shaped by the randomness of the connectivity. For these random networks λ_max = argmax_{λ_i} ζ(λ_i) is a real eigenvalue. Figure 4 shows the step size restrictions of the different numerical methods with respect to the absolute value of λ_max. For |λ_max| < 2 the scalar exponential Euler method and the implicit Euler method are stable regardless of the step size Δt. Starting at |λ_max| ≥ 2.8 the scalar exponential Euler method is more stable than the implicit Euler method solved with fixed-point iteration. With increasing |λ_max| the step size restriction of the scalar exponential Euler method converges to that of the Euler-Maruyama method.
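The three stability functions ζ and the closed-form step-size bounds for the inhibitory all-to-all case can be verified numerically; the function names below are illustrative only.

```python
import numpy as np

def zeta_em(lam, dt, tau):
    """Euler-Maruyama stability function |1 + lam * dt/tau| (lam: eigenvalue of A)."""
    return abs(1.0 + lam * dt / tau)

def zeta_ee(lam_hat, dt, tau):
    """Scalar exponential Euler stability function (lam_hat: eigenvalue of W)."""
    e = np.exp(-dt / tau)
    return abs(e + lam_hat * (1.0 - e))

def zeta_ie(lam_hat, dt, tau):
    """Contraction factor of the implicit-Euler fixed-point iteration."""
    r = dt / tau
    return abs(r / (1.0 + r) * lam_hat)

# Inhibitory all-to-all network, N = 400, tau = 1: the critical eigenvalue of A
# is lam = -1 - sqrt(N) and the critical eigenvalue of W is lam_hat = -sqrt(N).
N, tau = 400, 1.0
lam, lam_hat = -1.0 - np.sqrt(N), -np.sqrt(N)

dt_em = 2.0 * tau / (np.sqrt(N) + 1.0)                          # = 2/21
dt_ee = -tau * np.log((np.sqrt(N) - 1.0) / (np.sqrt(N) + 1.0))  # = -ln(19/21)
dt_ie = tau / (np.sqrt(N) - 1.0)                                # = 1/19
```

Evaluating each ζ just below and just above its bound confirms that the stability criterion is tight in this test case.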
Based on the results in this section we employ the scalar exponential Euler to solve rate-based model dynamics (Equation 9) in our reference implementation. Figure 4 demonstrates that it is the most stable algorithm compatible with the constraints of the distributed simulation scheme for spiking neural networks.

FIGURE 4 | Step size restrictions of the numerical methods. Largest ratio Δt/τ for which the different numerical methods (blue curve: Euler-Maruyama method, black curve: implicit Euler method solved with fixed-point iteration, red curve: scalar exponential Euler method) are stable, shown against the absolute value of λ_max = argmax_{λ_i} ζ(λ_i) ∈ R. The gray area indicates the region where the scalar exponential Euler method and the implicit Euler method are stable without any restrictions on Δt/τ. The dotted vertical lines correspond to the examples presented in the panels of Figure 2.

Furthermore, the results in Figure 2 indicate that it is the most accurate method in the case of an all-to-all connected network with inhibitory connections. For the sparse balanced excitatory-inhibitory network the analysis exhibits an accuracy similar to the implicit Euler method. However, the solution of the implicit Euler method with fixed-point iteration requires an iterative scheme in each single time step, with communication between the units after every iteration. This algorithm is therefore more time consuming than the scalar exponential Euler scheme. Beyond the choice of method, the analysis in this section indicates that numerical stability is an issue for all tested methods, depending on step size Δt and time constant τ. Although the applications in Section 3.3 show that many practical examples do not suffer from stability issues when a commonly used simulation step size is employed, the inevitable restrictions on the step size Δt should be taken into account in simulations of rate-model networks. For simulations of linear rate models an appropriate step size can be determined by an analysis of the eigenvalues of A.
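Such an eigenvalue analysis can be automated. The helper below, a sketch under the assumption of a (roughly) monotone stability boundary, bisects on the scalar exponential Euler stability function over the spectrum of W; max_stable_dt is a hypothetical name, not part of the reference implementation.

```python
import numpy as np

def max_stable_dt(W, tau=1.0, dt_max=10.0, iters=60):
    """Largest step size (up to dt_max) for which the scalar exponential Euler
    scheme is stable for the linear network A = -I + W, found by bisection on
    max_i |exp(-dt/tau) + lam_i * (1 - exp(-dt/tau))| < 1 over the spectrum
    of W. If all |lam_i| < 1 the scheme is stable for every tested step size
    and dt_max is returned."""
    lam = np.linalg.eigvals(W)

    def stable(dt):
        e = np.exp(-dt / tau)
        return np.max(np.abs(e + lam * (1.0 - e))) < 1.0

    if stable(dt_max):
        return dt_max
    lo, hi = 0.0, dt_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if stable(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

For the inhibitory all-to-all example with N = 400 the helper recovers the analytical bound −τ ln(19/21), while for a matrix with spectral radius below one it reports unconditional stability within the tested range.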

Performance of the Reference Implementation
This section investigates the performance of the rate model reference implementation. We are interested in (i) the scalability of the rate model framework and (ii) the comparison between the standard implementation with communication in every computation time step and the iterative approach using waveform relaxation (see Section 2.3.3 for details). We perform the simulations on the JUQUEEN BlueGene/Q supercomputer (Jülich Supercomputing Centre, 2015) at the Jülich Research Centre in Germany. It comprises 28,672 compute nodes, each with a 16-core IBM PowerPC A2 processor running at 1.6 GHz. For our benchmarks we use 8 OpenMP threads per JUQUEEN compute node and denote by VP = 8 · #nodes the total number of virtual processes employed.
As a test case we employ the sparse balanced e/i test case of linear rate units (φ(x) = ψ(x) = x) introduced in Section 3.1, but with a fixed number of 2000 inputs independent of the number of units to allow for an unbiased weak scaling.
A weak scaling (Figure 5A) shows that the scalability of the standard implementation is impaired by the massive amount of communication. While for perfect scaling the simulation time should be constant over the number of virtual processes, the actual simulation time increases by 15-25% when the number of virtual processes is doubled for VP < 256, and even by up to 83% from 8,192 to 16,384 virtual processes. For the iterative method, the scaling behavior is close to constant up to 1,024 virtual processes. When more processes are employed, the simulation time increases. However, the iterative method shows a better scaling behavior, as the increase is weaker than for the standard computation due to the lower total number of communication steps. Due to the higher computational load of the iterative method (see Section 2.3.3), its simulation time exceeds that of the standard approach for small numbers of VP, where communication is less critical. For VP ≥ 1,024, the iterative approach is superior, with a speed-up factor close to three for 16,384 virtual processes (1,209 s vs. 3,231 s).
The strong scaling scenario with a fixed total number of N = 51,200 units in Figure 5B shows a similar picture. The iterative approach is beneficial for more than 1,024 virtual processes, and its scaling behavior outperforms that of the standard computation. Starting at 4,096 virtual processes the savings in computation time decrease, which is explained by the very low workload of each single compute node. Again, for a smaller number of virtual processes the amount of additional computation is too high to outperform the standard implementation.
Despite the overall good scaling behavior, the performance in terms of absolute compute time is inferior to a simulator specifically designed for rate-based models alone (not shown). In the latter case, performance is increased by collecting the states of all units in one vector. If, further, the connectivity is available in the form of a matrix and the delays are zero or homogeneous, the network can be efficiently updated with a single matrix-vector multiplication. Thus, the increased functionality and flexibility of having rate- and spiking models unified in one simulator comes at the price of a loss of performance for the rate-based models. However, as noted in the introduction, the number of units in rate-based network models is usually small and therefore performance is not as critical as for spiking network models.

Applications
First, we discuss a random inhibition-dominated network of linear rate units, then include nonlinear rate dynamics in a random network, and spatially structured connectivity in a functional neural-field model. In each case, simulation results are compared to analytical predictions. Furthermore, we simulate a mean-field version of a spiking cortical microcircuit model and discuss possible generalizations.

Linear Model
In the asynchronous irregular regime that resembles cortical activity, the dominant contribution to correlations in networks of nonlinear units is given by effective interactions between linear response modes (Pernice et al., 2011; Trousdale et al., 2012; Grytskyy et al., 2013; Dahmen et al., 2016). Networks of such noisy linear rate models have been investigated to explain features such as oscillations or the smallness of average correlations (Tetzlaff et al., 2012; Helias et al., 2013). Here we consider a prototypical network model of excitatory and inhibitory units following the linear dynamics given by Equation (9) with φ(x) = ψ(x) = x, µ = 0, and noise amplitude σ. Due to the linearity of the model, the cross-covariance between units i and j can be calculated analytically (Ginzburg and Sompolinsky, 1994; Risken, 1996; Gardiner, 2004; Dahmen et al., 2016); the resulting expression involves the Heaviside function H and the eigenvalues λ_i of the matrix 1 − W with the corresponding i-th left and right eigenvectors v_i and u_i, respectively. Nonzero delays yield more complex analytical expressions for cross-correlations.
In the population-averaged case, theoretical predictions are still analytically tractable (Equation 18 in Grytskyy et al., 2013). Figure 6 shows the cross-covariance functions for pairs of instantaneously coupled units in a large network, as well as population-averaged covariance functions in a network of excitatory and inhibitory units with delayed interactions. In both cases, simulations are in good agreement with the theoretical predictions.

Non-linear Model
So far we considered a network with linear couplings between the units. Qualitatively new features appear in the presence of nonlinearities. One of the most prominent examples is the emergence of chaotic dynamics (Sompolinsky et al., 1988) in a network of nonlinearly coupled rate units. The original model is deterministic and has recently been extended to stochastic dynamics. The model definition follows from Equation (9), where w_ij ∼ N(0, g²/N) are Gaussian random couplings. In the thermodynamic limit N → ∞, the population-averaged autocorrelation function c(t) can be determined within dynamic mean-field theory (Sompolinsky et al., 1988; Goedeke et al., 2016; Schuecker et al., 2016). Comparing c(t) obtained by simulation of a network (Equation 22) with the analytical result (their Equations 6 and 8) demonstrates excellent agreement (Figure 7). The simulated network is two orders of magnitude smaller than the cortical microcircuit, illustrating that in this context finite-size effects are already negligible at this scale.

FIGURE 6 | Linear rate model of a random excitatory-inhibitory network. (A) Cross-correlation functions of two pairs of excitatory units (black, red) and an excitatory-inhibitory unit pair (blue) in a network without delay. The variability across correlation functions arises from heterogeneity in network connections (difference between black and red curves) and from different combinations of cell types (e.g., difference between black and blue curves). (B) Population-averaged autocorrelation function for excitatory (black) and inhibitory units (red), and cross-correlation function between excitatory and inhibitory units (blue) in a network with delay d = 2 ms. Symbols denote simulation results, curves show theoretical predictions. Parameters: N_E = 80 excitatory and N_I = 20 inhibitory units, random connections with fixed out-degree, connection probability p = 0.1, excitatory weight w_E = 1/(N_E + N_I), inhibitory weight w_I = −6 w_E, τ = 1 ms, µ = 0, σ = 1. Step size h = 0.1 ms.
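A minimal sketch of the deterministic variant of the model and a population-averaged autocorrelation estimate is given below; the network size, coupling strength g, and all function names are illustrative choices of ours, not the parameters used for Figure 7.

```python
import numpy as np

def simulate_sompolinsky(N=200, g=1.5, tau=1.0, h=0.05, steps=4000, seed=1):
    """Euler integration of tau x' = -x + J tanh(x) with Gaussian couplings
    J_ij ~ N(0, g^2/N); for g > 1 the network is in the chaotic regime."""
    rng = np.random.default_rng(seed)
    J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))
    x = rng.normal(0.0, 1.0, size=N)
    xs = np.empty((steps, N))
    for k in range(steps):
        x = x + (h / tau) * (-x + J @ np.tanh(x))
        xs[k] = x
    return xs

def population_autocorrelation(xs, max_lag):
    """c at integer lags, averaged over units and time, discarding the first
    half of the simulation as transient."""
    y = xs[len(xs) // 2:]
    y = y - y.mean(axis=0)
    return np.array([(y[:len(y) - l] * y[l:]).mean() for l in range(max_lag)])
```

The decay of the estimated autocorrelation over a few time constants is the signature that dynamic mean-field theory predicts in the chaotic regime.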

Functional Model
Complex dynamics arises not only from nonlinear single-unit dynamics, but also from structured network connectivity (Yger et al., 2011). One important nonrandom feature of brain connectivity is the spatial organization of connections (Malach et al., 1993; Voges et al., 2010). In spatially structured networks, delays play an essential role in shaping the collective dynamics (Roxin et al., 2005; Voges and Perrinet, 2012). Patterns of activity in such networks are routinely investigated using neural-field models. In contrast to the models discussed above, field models require a discretization of space for numerical simulation. Such discretization can be done in real space, leading effectively to a network of units at discrete positions in space, or alternatively, for particular symmetries in the couplings, in k-space (Roxin et al., 2006). Here, we follow the more general approach of discretization in real space. A prototypical model of a spatial network is given by Roxin et al. (2005), where the authors consider a neural-field model with delayed (delay d) interactions, constant input I_ext, threshold-linear activation function φ(x) = x · H(x), and periodic Mexican-hat shaped connectivity w(ϕ) = w_0 + w_1 cos(ϕ). The spatial variable ϕ can also be interpreted as the preferred orientation of a set of units, thus rendering Equation (23) a model in feature space (Hansel and Sompolinsky, 1998). Discretizing space into N segments yields a set of coupled ODEs,

τ dx_i(t)/dt = −x_i(t) + φ( Σ_j w_ij x_j(t − d) + I_ext ),

with connectivity w_ij = (2π/N) w(|ϕ_i − ϕ_j|), ϕ_i = −π + (2π/N) · i for i ∈ [1, N], and discretization factor 2π/N that scales the space constants w_0 and w_1 with the neuron density. The spatial connectivity together with a delay in the interaction introduces various spatial activity patterns, depending on the shape of the Mexican-hat connectivity.
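The discretization can be sketched as follows, assuming the cosine Mexican-hat kernel w(ϕ) = w_0 + w_1 cos(ϕ). The default parameters below are deliberately chosen in the stable homogeneous regime (spatial-mode gain π·w_1 < 1), where the network relaxes to the uniform rate I_ext/(1 − 2π w_0); the pattern-forming regimes of Roxin et al. (2005) require stronger spatial modulation and are not reproduced by these defaults.

```python
import numpy as np

def mexican_hat_weights(N, w0, w1):
    """w_ij = (2*pi/N) * w(|phi_i - phi_j|) with w(phi) = w0 + w1*cos(phi)."""
    phi = -np.pi + 2.0 * np.pi * np.arange(1, N + 1) / N
    return (2.0 * np.pi / N) * (w0 + w1 * np.cos(phi[:, None] - phi[None, :]))

def simulate_ring(N=100, w0=-0.1, w1=0.2, I_ext=1.0, tau=1.0,
                  d_steps=10, h=0.1, steps=2000, seed=2):
    """Euler integration of tau x_i' = -x_i + phi(sum_j w_ij x_j(t-d) + I_ext)
    with threshold-linear phi(x) = x*H(x) and delay d = d_steps * h."""
    rng = np.random.default_rng(seed)
    W = mexican_hat_weights(N, w0, w1)
    # constant-history initial condition for the delayed interaction
    hist = [rng.uniform(0.0, 0.1, N) for _ in range(d_steps + 1)]
    for _ in range(steps):
        x_now, x_del = hist[-1], hist[-1 - d_steps]
        x_new = x_now + (h / tau) * (-x_now + np.maximum(W @ x_del + I_ext, 0.0))
        hist.append(x_new)
    return np.array(hist)
```

Exploring the pattern regimes of Figure 8 then amounts to increasing w_1 (and adjusting w_0 and the delay) while keeping the same integration loop.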
To illustrate the applicability of the simulation framework to neural-field models, we reproduce various patterns (Figure 8) observed by Roxin et al. (2005). Although the discrete and continuous networks strictly coincide only in the thermodynamic limit N → ∞, the numerically obtained patterns shown in Figure 8 agree well with the analytically derived phase diagram of the continuous model (Roxin et al., 2005) already for network sizes of only N = 100 units.

Mean-Field Analysis of Complex Networks
A network of spiking neurons constitutes a high dimensional and complex system. To investigate its stationary state, one can describe the activity in terms of averages across neurons and time, leading to population averaged stationary firing rates (Brunel, 2000). Here, the spatial average collapses a large number of neurons into a single population, which is interpreted as a single rate unit. The ability to represent spiking as well as rate dynamics by the same simulation framework allows a straight-forward analysis of the spiking network by replacing the spiking neuron populations by single rate-based units.
In more formal terms, we now consider networks of neurons structured into N interconnected populations. A neuron in population α receives K_αβ incoming connections from neurons in population β, each with synaptic efficacy w_αβ. Additionally, each neuron in population α is driven by K_α,ext Poisson sources with rate X_ext and synaptic efficacy w_ext. We assume leaky integrate-and-fire model neurons with exponentially decaying post-synaptic currents. The dynamics of the membrane potential V and the synaptic current I_s follow Fourcaud and Brunel (2002), where t_k^j denotes the k-th spike time of neuron j, and τ_m and τ_s are the time constants of membrane and synapse, respectively. The membrane resistance has been absorbed in the definition of the current. Whenever the membrane potential V crosses the threshold θ, the neuron emits a spike and V is reset to the potential V_r, where it is clamped for a period of length τ_r. Given that all neurons have identical parameters, a diffusion approximation, assuming asynchronous and Poissonian spiking statistics as well as small synaptic couplings, leads to the population-averaged firing rates X_α (Fourcaud and Brunel, 2002). Here, γ = |ζ(1/2)|/√2, with ζ denoting the Riemann zeta function (Abramowitz and Stegun, 1974). We find the fixed points of Equation (27) by solving the first-order differential equation (Wong and Wang, 2006; Schuecker et al., 2017), which constitutes a network of rate units with dimension equal to the number of populations N.
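The fixed-point computation follows the pattern of pseudo-time integration of τ dX/dt = −X + Φ(X). Since the Siegert formula of the diffusion approximation is lengthy, the gain function Phi below is a hypothetical saturating stand-in, and all names are our own; only the integration pattern is the point of the sketch.

```python
import numpy as np

def find_fixed_point(Phi, X0, tau=1.0, h=0.1, steps=5000):
    """Integrate tau dX/dt = -X + Phi(X) in pseudo-time until the rates
    settle; the fixed point satisfies the self-consistency X = Phi(X)."""
    X = np.array(X0, dtype=float)
    for _ in range(steps):
        X = X + (h / tau) * (-X + Phi(X))
    return X

# Hypothetical two-population stand-in for the Siegert gain function:
# rectified, saturating rates with effective connectivity W and constant drive.
W = np.array([[0.5, -1.0],
              [0.8, -0.5]])
Phi = lambda X: 10.0 * np.tanh(np.maximum(W @ X + 1.0, 0.0) / 10.0)

rates = find_fixed_point(Phi, X0=[0.0, 0.0])
```

In the microcircuit application below, Phi would be replaced by the population-averaged Siegert rates of Equation (27) and X would be the eight-dimensional vector of population rates.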
Next we apply this framework to a cortical microcircuit model (Potjans and Diesmann, 2014) comprising roughly 80,000 spiking neurons structured into 8 populations across the 4 layers L2/3, L4, L5, and L6, with one excitatory and one inhibitory cell type each (Figure 9). The model exhibits irregular and stationary spiking activity (Figure 9C). Replacing each population by a single rate unit (Figure 9B) results in an eight-dimensional rate network (Equation 30), which converges to a fixed point corresponding to the population-averaged firing rates obtained from direct simulation of the spiking model (Figure 9D).
The analysis only considers the stationary state of the microcircuit, which can also be determined using the population-density approach (Cain et al., 2016). While the mean-field approach presented here is strictly valid only in the thermodynamic limit, finite-size fluctuations around this state are accessible using the noisy linear-rate model (Section 3.3.1), as elaborated by Bos et al. (2016), or within the population-density approach (Schwalger et al., 2016).

Non-linear dynamics
A characteristic feature of the rate models considered so far is the leaky dynamics, i.e., the linear term −X_i(t) in Equation (9). However, the presented framework can be extended to nonlinear intrinsic dynamics, as used for example by Stern et al. (2014). In a more general form, Equation (9) reads

τ dX_i(t) = [ a(X_i(t)) + φ( Σ_j w_ij ψ(X_j(t)) ) ] dt + √τ σ dW_i(t),

where a characterizes the intrinsic rate dynamics. If a does not contain a linear part, the Euler-Maruyama scheme can be used for the update, i.e.,

X_i^(k+1) = X_i^k + (Δt/τ) [ a(X_i^k) + φ( Σ_j w_ij ψ(X_j^k) ) ] + √(Δt/τ) σ η_i^k.

If a also contains a linear part, so that a(X_i) = −X_i + f(X_i), one can use an exponential Euler update approximating the nonlinear part as constant during the update. This leads to

X_i^(k+1) = e^(−Δt/τ) X_i^k + (1 − e^(−Δt/τ)) [ f(X_i^k) + φ( Σ_j w_ij ψ(X_j^k) ) ] + σ √((1 − e^(−2Δt/τ))/2) η_i^k

with η_i^k ∼ N(0, 1).
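A single step of the latter scheme can be sketched as follows; the noise amplitude uses the standard exact-integration variance of the Ornstein-Uhlenbeck leak, and the function name is our own.

```python
import numpy as np

def exp_euler_step(x, drive, tau, h, sigma, eta):
    """One scalar exponential Euler step for tau dx = (-x + drive) dt + noise:
    'drive' (= f(x) plus the network input, possibly delayed) is held constant
    during the step, while the leak and the additive noise are integrated
    exactly over the interval h."""
    p = np.exp(-h / tau)
    q = sigma * np.sqrt((1.0 - np.exp(-2.0 * h / tau)) / 2.0)
    return p * x + (1.0 - p) * drive + q * eta
```

With sigma = 0 the step reduces to exact exponential relaxation toward the drive, which makes the scheme easy to validate against closed-form solutions.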

Multiplicative coupling
Another possible extension is a multiplicative coupling between units, as for example employed by Gancarz and Grossberg (1998) or in the original works of Wilson and Cowan (1972, 1973). In the most general form, the coupling term enters multiplicatively into the single-unit dynamics; again assuming this term to be constant during the update, the system can be solved using the exponential Euler update with η_i^k ∼ N(0, 1).

Multiplicative noise
So far, we have considered rate models subject to additive noise, corresponding to b(t, x(t)) = b(t) in Equation (4). The linear rate model considered in Section 3.3.1 describes the dynamics around a stationary state, and due to the stationary baseline the noise amplitude is constant. However, one might relax the stationarity assumption, which would render the noise amplitude proportional to the time-dependent rate, i.e., a multiplicative noise amplitude. The presented framework covers the latter, since the exponential Euler update is also valid for multiplicative noise (Equation 8).

Output noise
Grytskyy et al. (2013) show that there is a mapping between a network of leaky integrate-and-fire models and a network of linear rate models with so-called output noise. Here the noise is added to the output rate of the afferent units, and we cannot write the system as an SDE of type (2), as the nonlinearities φ(x) and ψ(x) are also applied to the white noise ξ_j. In addition to the implementation rate_neuron_ipn for the rate-based models (Equation 9) discussed in the present work, our reference implementation also contains a base implementation rate_neuron_opn for models with output noise. For these models, the stochastic exponential Euler method cannot be employed. Instead the solver assumes the noise ξ_j to be constant over the update interval, which leads to an update formula in which the communicated quantity X_j^k + √(τ/Δt) σ η_j^k with η_j^k ∼ N(0, 1) is calculated beforehand in the sending unit j; this results in the same amount of communicated data as in the case of models with input noise.
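A sketch of the corresponding update is given below; the exponential relaxation toward the noisy drive is our simplification of the update formula for a linear gain, and the function name is hypothetical.

```python
import numpy as np

def step_output_noise(X, W, tau, h, sigma, rng):
    """Update for output-noise models: each sending unit communicates its rate
    plus noise, X_j + sqrt(tau/h) * sigma * eta_j, which is treated as constant
    over the step; the receiving unit then relaxes exponentially toward the
    resulting recurrent drive."""
    noisy_out = X + np.sqrt(tau / h) * sigma * rng.standard_normal(X.shape)
    p = np.exp(-h / tau)
    return p * X + (1.0 - p) * (W @ noisy_out)
```

Note that the noise is drawn in the sending unit before communication, so the receiver handles exactly one number per afferent unit and step, just as for input noise.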

DISCUSSION
This work presents an efficient way to integrate rate-based models in a neuronal network simulator that is originally designed for models with delayed spike-based interactions. The advantage of the latter is a decoupling of neuron dynamics between spike events. This is used by current parallel simulators for large-scale networks of spiking neurons to reduce communication between simulation processes and significantly increase performance and scaling capabilities up to supercomputers (Morrison et al., 2005). In contrast, rate-based models interact in a continuous way. For delayed interactions, rate dynamics are still decoupled for the minimal delay of the network such that information can be exchanged on a coarse time-grid. For instantaneous coupling, communication in every time step is required. This is feasible for small networks that can be simulated on small machines and thus require only a small amount of communication. For improved efficiency of simulations of large networks on supercomputers, we implement an iterative numerical solution scheme (Lelarasmee, 1982). Furthermore, we investigate several standard methods for the solution of rate model equations and demonstrate that the scalar exponential Euler method is the best choice in the context of a neuronal network simulator that is originally designed for models with delayed spike-based interactions. Afterwards, we show the applicability of the numerical implementation to a variety of well-known and widely-used rate models and illustrate possible generalizations to other categories of rate-based models.
The current reference implementation uses an exponential Euler scheme (Adamu, 2011; Komori and Burrage, 2014) with a diagonal matrix A (scalar exponential Euler): the additive noise as well as the leaky dynamics of single neurons are integrated exactly, while the network input to the rate units is approximated as piecewise constant. The analysis in Section 3.1 demonstrates that the scalar exponential Euler is the most accurate, stable, and efficient standard method for SDEs that is applicable to a distributed spiking simulator. In particular for all-to-all connected networks of linear rate units the distributed design renders implicit methods less feasible, as the convergence of the involved fixed-point iteration requires small time steps. For all methods the computation step size needs to be compared against the time constant τ. Therefore, stable solutions for small values τ ≪ 1 may require decreasing the step size below the default value.
The reference implementation provides an optional iterative method, the waveform relaxation (Lelarasmee, 1982), for networks with instantaneous rate connections. This method obtains the same results as the standard approach, but improves scalability by reducing communication at the cost of additional computations. As a consequence, the optimal method (standard vs. iterative) depends on the number of compute nodes and virtual processes. In our test case the use of the waveform-relaxation technique is beneficial for 1,024 or more virtual processes. It is therefore recommended to employ the iterative scheme for large-scale simulations on supercomputers, but to disable it for smaller rate-model simulations on local workstations or laptops. This can easily be achieved by the parameter use_wfr (see Section 2.3.3 for details) of the algorithm. In general, the scalability of simulations of rate models is worse than that of spiking network simulations (Kunkel et al., 2014) and comparable to simulations with gap junctions. This is expected, since for rate connections as well as for gap junctions a large amount of data needs to be communicated compared to a spiking simulation. Future work should assess whether this bottleneck can be overcome by a further optimized communication scheme.
While our reference implementation uses the simulation software NEST as a platform, the employed algorithms can be ported to other parallel spiking network simulators. Furthermore, the implementation of the example rate models as templates allows customization to arbitrary gain functions. Researchers can create additional models without in-depth knowledge of simulator-specific data structures or numerical methods. In addition, the infrastructure is sufficiently general to allow for extensions to other categories of rate models, as shown explicitly for nonlinear dynamics, multiplicative coupling, and other types of noise. This design enables the usage of the framework for a large body of rate-based network models. Furthermore, the generality of the model equations supports applications beyond neuronal networks, for example in computational glioscience (Amiri et al., 2012) or artificial intelligence (Haykin, 2009). Some design decisions for the reference implementation come with both an upside and a downside and may, at the present state of knowledge and experience, constitute judgment calls: The choice to select the type of nonlinearity of the recurrent network input with a boolean parameter is based on the assumption that this implementation covers the majority of rate models used in neuroscience today. The solution has an advantage in maintainability, as it results in half as many template instances for a given set of gain functions as the alternative solution discussed above. It also avoids the introduction of potentially confusing rate-model names encoding the nature of the nonlinearity. On the downside, models that employ both nonlinearities at once cannot be expressed. Furthermore, a decision that could already be made when the model instance is created is deferred to the simulation phase. The decision to create a separate connection type for mean-field models of the Siegert type is led by the ambition to avoid memory overhead.
This comes at the price that units of this type cannot be connected to instances of rate models using the generic rate connection. Adapter elements like the parrot_neuron (see Kunkel et al., 2011, for a recent application) are one way to overcome this limitation. Only the experience of researchers with the present implementation will show whether its characteristics and user interface serve the purpose of the community or whether particular decisions need revision.
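The semantics of the boolean nonlinearity parameter discussed above can be summarized in a few lines (illustrative Python with tanh as an example gain function; all names are ours): the first setting applies the nonlinearity φ to the summed input, while the second sums the pointwise nonlinearity ψ of the presynaptic rates.

```python
import numpy as np

def recurrent_input(weights, rates, gain, linear_summation=True):
    # linear_summation=True : phi form, gain applied to the summed input
    # linear_summation=False: psi form, gain applied per presynaptic rate
    if linear_summation:
        return gain(np.dot(weights, rates))
    return np.dot(weights, gain(rates))

w = np.array([1.0, 1.0])
r = np.array([0.5, 0.5])
phi_form = recurrent_input(w, r, np.tanh, linear_summation=True)   # tanh(1.0)
psi_form = recurrent_input(w, r, np.tanh, linear_summation=False)  # 2 * tanh(0.5)
```

For a linear gain both settings coincide, which is why the distinction only matters for nonlinear rate models; a model that needs both φ and ψ simultaneously cannot be expressed with a single boolean switch.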
Mean-field theory has built a bridge between networks of spiking neurons and rate-based units that represent either single neurons or populations (Buice and Chow, 2007; Buice et al., 2010; Ostojic and Brunel, 2011; Bressloff, 2012; Grytskyy et al., 2013). In the latter case, the rate-based approach comes along with a considerable reduction of dimensionality (Section 3.3.4). Due to a possibly large number of populations, the fixed-point solution for the stationary activity can generally not be determined analytically, but it can still be found by evolving a pseudo-time dynamics. Within the presented framework, this approach is much faster than the spiking counterpart and thus facilitates the calibration of large-scale spiking network models (Schuecker et al., 2017).
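A minimal sketch of this pseudo-time approach (plain Python; tanh serves here as a stand-in for the Siegert gain function, and all names and parameter values are illustrative): the stationary population rates are obtained by evolving dν/ds = −ν + Φ(Wν + ν_ext) until the change per step falls below a tolerance, so that the result satisfies the self-consistency equation ν = Φ(Wν + ν_ext).

```python
import numpy as np

def pseudo_time_fixed_point(phi, W, ext, nu0, dt=0.1, tol=1e-12, max_steps=100000):
    # evolve d(nu)/ds = -nu + phi(W @ nu + ext) in pseudo-time s
    # until stationarity; the fixed point obeys nu = phi(W @ nu + ext)
    nu = nu0.copy()
    for _ in range(max_steps):
        delta = -nu + phi(W @ nu + ext)
        nu = nu + dt * delta
        if np.max(np.abs(delta)) < tol:
            return nu
    return nu

W = np.array([[0.2, -0.4], [0.3, -0.1]])   # effective population coupling
ext = np.array([0.5, 0.2])                 # external drive
nu = pseudo_time_fixed_point(np.tanh, W, ext, nu0=np.zeros(2))
residual = np.max(np.abs(nu - np.tanh(W @ nu + ext)))
```

Each pseudo-time step of such a population model is orders of magnitude cheaper than simulating the corresponding spiking network, which is what makes this route attractive for calibrating large-scale models.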
Our unifying framework allows researchers to easily switch between rate-based and spiking models in a particular network model, requiring only minimal changes to the simulation script. This facilitates an evaluation of the different model types against each other and increases reproducibility in the validation of reductions of spiking networks to rate-based models. Furthermore, it is instructive to study whether and how the network dynamics changes with the neuron model (Brette, 2015). In particular, functional networks able to perform a given task are typically designed with rate-based units. Their validity can now be evaluated by moving from a more abstract rate-based model to a biologically more realistic spiking neuron model. The present reference implementation does not allow for interactions between spiking and rate-based units. While this is technically trivial to implement, the proper conversion from spikes to rates and vice versa is a conceptual issue that has to be explored further by theoretical neuroscience.
The presented joint platform for spike-based and rate-based models will hopefully trigger new research questions by facilitating collaboration and the translation of ideas between scientists working in the two fields. This work thereby contributes to the unification of both modeling routes in multi-scale approaches that combine large-scale spiking networks with functionally inspired rate-based elements to decipher the dynamics of the brain.

AUTHOR CONTRIBUTIONS
Under the supervision of MH and MD, the authors JH, DD, and JS jointly worked on all parts of the above publication. DD and JS thereby focused on the neuroscientific applications and the implementation of neuron models. JH focused on the NEST infrastructure, numerical schemes and performance benchmarks. All authors contributed to the writing of the manuscript.

A.1. Numerical Evaluation of the Siegert Formula
We here describe how to numerically evaluate Equation (27), frequently called the Siegert formula in the literature (for a recent textbook account see Gerstner et al., 2014), which is not straightforward due to numerical instabilities in the integral. First we introduce the abbreviations $y_\theta = (\theta - \mu)/\sigma + \gamma\sqrt{\tau_s/\tau_m}$ and $y_r = (V_r - \mu)/\sigma + \gamma\sqrt{\tau_s/\tau_m}$ and rewrite the integral as

$$\sqrt{\pi}\int_{y_r}^{y_\theta} e^{u^2}\left(1 + \operatorname{erf}(u)\right)\,du. \tag{A1}$$

Here the numerical difficulty arises from the product of a divergent term ($e^{u^2}$) and a convergent term ($1+\operatorname{erf}(u)$) in the integrand. We therefore use $1+\operatorname{erf}(u) = \tfrac{2}{\sqrt{\pi}}\int_{-\infty}^{u} e^{-v^2}\,dv$ together with the variable transform $w = v - u$ and obtain

$$\sqrt{\pi}\int_{y_r}^{y_\theta} e^{u^2}\left(1 + \operatorname{erf}(u)\right) du
= 2\int_{y_r}^{y_\theta}\int_0^\infty e^{2uw - w^2}\,dw\,du
= \int_0^\infty \frac{e^{2y_\theta w - w^2} - e^{2y_r w - w^2}}{w}\,dw,$$

where we performed the integral over $u$ in the last step. For $y_r, y_\theta < 0$ the integrand can be integrated straightforwardly as

$$\int_0^\infty \frac{e^{2y_\theta w - w^2} - e^{2y_r w - w^2}}{w}\,dw,$$

where the two terms in the integrand converge separately and where, in approximation, the upper integration bound is chosen such that the integrand has dropped to a sufficiently small value (here chosen to be $10^{-12}$). For $w = 0$, the integrand has to be replaced by its limit

$$\lim_{w\to 0} \frac{e^{2y_\theta w} - e^{2y_r w}}{w} = 2\,(y_\theta - y_r).$$

For $y_\theta > 0$ and $y_r < 0$ only the combination of the two terms in Equation (A1) converges. We therefore rewrite

$$\int_0^\infty e^{-w^2}\,\frac{e^{2y_\theta w} - e^{2y_r w}}{w}\,dw
= \int_0^\infty e^{2y_\theta w - w^2}\,\frac{1 - e^{2(y_r - y_\theta)w}}{w}\,dw
= e^{y_\theta^2}\int_0^\infty e^{-(w - y_\theta)^2}\,\frac{1 - e^{2(y_r - y_\theta)w}}{w}\,dw. \tag{A2}$$
The integrand has a peak near $y_\theta$. Therefore, in approximation, the lower and upper boundaries can be chosen to the left and right of $y_\theta$, respectively, such that the integrand has fallen to a sufficiently low value (here chosen to be $10^{-12}$). For $w = 0$ we replace the integrand by its limit, which is

$$\lim_{w\to 0} e^{-(w - y_\theta)^2}\,\frac{1 - e^{2(y_r - y_\theta)w}}{w} = e^{-y_\theta^2}\,2\,(y_\theta - y_r).$$

We actually switch from Equation (A1) to Equation (A2) when $y_\theta > 0.05\,|\tilde{V}_\theta|/\sigma$ with $\tilde{V}_\theta = V_\theta + \gamma\sqrt{\tau_s/\tau_m}$. This provides a numerically stable solution with a continuous transition between the two expressions. Our reference implementation numerically evaluates the integrals using the adaptive GSL implementation gsl_integration_qags (Galassi et al., 2006) of the Gauss-Kronrod 21-point integration rule.
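The two integration branches can be prototyped as follows (Python with SciPy's `quad` in place of the GSL routine used in the reference implementation; the function names and fixed integration bounds are ours). For moderate arguments the naive form of Equation (A1) is still numerically stable, which allows a direct cross-check of the transformed expressions:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erf

def siegert_integral_naive(y_r, y_th):
    # Equation (A1) as written; stable only for moderate |y_r|, |y_th|
    f = lambda u: np.sqrt(np.pi) * np.exp(u * u) * (1.0 + erf(u))
    return quad(f, y_r, y_th)[0]

def siegert_integral_stable(y_r, y_th):
    if y_th < 0.0:
        # both terms converge separately; integrand < 1e-12 beyond w = 20
        def f(w):
            if w == 0.0:
                return 2.0 * (y_th - y_r)                     # limit w -> 0
            return (np.exp(2*y_th*w - w*w) - np.exp(2*y_r*w - w*w)) / w
        return quad(f, 0.0, 20.0)[0]
    # y_th > 0: only the combined form, Equation (A2), converges
    def f(w):
        if w == 0.0:
            return np.exp(-y_th * y_th) * 2.0 * (y_th - y_r)  # limit w -> 0
        return np.exp(-(w - y_th)**2) * (1.0 - np.exp(2*(y_r - y_th)*w)) / w
    return np.exp(y_th * y_th) * quad(f, 0.0, y_th + 20.0)[0]
```

The stable variant avoids multiplying a divergent and a convergent factor, so it remains usable for large thresholds where the naive form overflows.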

A.2. Usage of the NEST Reference Implementation
We here give a brief description of how to use our reference implementation of the continuous-time dynamics in the simulation code NEST with the syntax of the PyNEST interface (Eppler et al., 2009). Complete simulation scripts will be made available with one of the next major releases of NEST. The description focuses on rate-model specific aspects. A general introduction to PyNEST can be found on the simulator website (http://www.nest-simulator.org).
Script 1 shows the creation of an excitatory-inhibitory network of linear rate units as used in our example in Section 3.3.1. Researchers already familiar with the PyNEST interface will notice that there is no fundamental difference to scripts for the simulation of spiking neural networks. Line 4 illustrates how to disable the usage of the waveform-relaxation method. This is advisable for simulations on local workstations or clusters, where the waveform-relaxation method typically does not improve performance (see Section 3.2). The iterative method is enabled by default, as it is also employed for simulations with gap junctions, where it improves performance and even the accuracy of the simulation, regardless of network size and parallelization. Instances of the linear rate-based model are created by calling nest.Create in the usual way with model type lin_rate_ipn. The parameter linear_summation characterizes the type of nonlinearity (φ or ψ, see Section 2.3.2) of the rate model. In this particular example the explicit specification of the parameter is added for illustrative purposes only, as i) the employed model is a linear rate model, where φ(x) = ψ(x) = x holds regardless of the choice of the parameter, and ii) the default value is True anyway. Lines 13-14 and 31 demonstrate how to record the rate activity with the multimeter. The record_from parameter needs to be set to rate to pick up the corresponding state variable. As this particular network model includes delayed rate connections, the synapse model delay_rate_connection is chosen (lines 17-20). In order to create instantaneous rate connections instead, one changes the synapse model to rate_connection and removes the parameter delay from the synapse dictionary. For the simultaneous use of delayed and instantaneous connections, one duplicates lines 17-28 and adapts the synapse and connection dictionaries of the copy according to the needs of the additional instantaneous connections.
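A condensed sketch of such a script may look as follows (requires a NEST build with the rate-model infrastructure; the model and synapse names are those introduced above, but the weights, delays, and population sizes are hypothetical placeholders, and the line numbering of the actual Script 1 differs):

```python
import nest

nest.ResetKernel()
# disable the waveform-relaxation method (advisable on workstations)
nest.SetKernelStatus({'use_wfr': False})

# excitatory-inhibitory network of linear rate units
neuron_params = {'linear_summation': True}   # phi form; True is the default
e_pop = nest.Create('lin_rate_ipn', 80, neuron_params)
i_pop = nest.Create('lin_rate_ipn', 20, neuron_params)

# record the rate activity of all units
mm = nest.Create('multimeter', params={'record_from': ['rate']})

# delayed rate connections; use 'rate_connection' and drop 'delay'
# for instantaneous interactions instead
syn_exc = {'model': 'delay_rate_connection', 'weight': 0.1, 'delay': 1.0}
syn_inh = {'model': 'delay_rate_connection', 'weight': -0.4, 'delay': 1.0}
nest.Connect(e_pop, e_pop + i_pop, 'all_to_all', syn_exc)
nest.Connect(i_pop, e_pop + i_pop, 'all_to_all', syn_inh)
nest.Connect(mm, e_pop + i_pop)

nest.Simulate(100.0)
```

Switching this sketch to a spiking network would amount to exchanging the model name in nest.Create and the synapse models in the connection dictionaries, which is the minimal-change property emphasized above.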